
CHAPTER 8: EXTENDED TETRACHORD CLASSIFICATION

Chapter 7 introduced the notion of strange circles: using various circles of musical intervals as equivalence classes to which input pitch-classes are assigned. It illustrated this concept by analyzing the internal structure of a network that was trained to identify four different types of tetrachords. Chapter 8 provides a more complex example of this concept. It describes additional formulae that can be used to define twelve different types of tetrachords for each of the twelve major musical keys. It then reports the training of a multilayer perceptron that learned to classify an input tetrachord into these different tetrachord types. This is a more complex network, requiring seven hidden units to converge on a solution to this classification problem. However, this more complicated network can still have its internal structure interpreted. One reason for this is that it, like the Chapter 7 network, organizes input pitch-classes using strange circles. We provide an interpretation of this network, introducing an additional interpretative technique (examining bands in jittered density plots). We then illustrate how the structure of this extended tetrachord network provides an elegant example of coarse coding.

8.1 Extended Tetrachords
8.2 Classifying Extended Tetrachords
8.3 Interpreting the Extended Tetrachord Network
8.4 Bands and Coarse Coding
8.5 References

© Michael R. W. Dawson 2014 Chapter 8 Classifying Extended Tetrachords 2

8.1 Extended Tetrachords

Figure 8-1. Musical notation for twelve different tetrachord types, each using C as the root note.

8.1.1 Extended Chords

Chapter 7 described training a multilayer perceptron to classify four different types of tetrachords (major 7, minor 7, dominant 7, minor 7 flat 5), and presented a detailed analysis of the internal structure of this network. The point of that example was to use a fairly simple musical problem to illustrate how an artificial neural network can organize input pitch-classes in terms of their membership in various 'strange circles' (i.e. interval-based equivalence classes). In this chapter we turn to a more complicated musical problem, involving a larger set of different types of tetrachords. As this problem is more complex, the multilayer perceptron that solves it requires more hidden units. However, these hidden units also organize inputs into a variety of strange circles, which assists the interpretation of the network's internal structure.

The four tetrachords that we explored in Chapter 7 were all examples of added note tetrachords. That is, each tetrachord could be described as being constructed from a triad based upon different notes that belonged to a scale, with a fourth note added on top of this triad (see Figure 7-15). The fourth note also belonged to the scale.

Another general approach to building tetrachords produces a greater variety of chord types. One begins with a triad formula. For instance, if one takes the first, third, and fifth notes of the C major scale (C, E, G, see Figure 7-15) the result is the C major triad. Therefore the formula for the C major triad is 1-3-5. Adding the seventh note of the scale, B, produces the C major seventh tetrachord, which follows the formula 1-3-5-7. This was one of the added note tetrachords that we studied in Chapter 7.

More chords can be created by manipulating formulae like those provided in the previous paragraph. For instance, one could flatten the third and the seventh note in the formula 1-3-5-7. This produces the formula 1-♭3-5-♭7; if C is the root this is the set of notes [C, E♭, G, B♭], which defines the C minor seventh tetrachord. Note that the flattened third and seventh notes do not belong to the C major scale.

In jazz one often finds extended chords, which use formulae that add notes that fall beyond the range of a major scale. For example, if one adds the D that is an octave higher than the second note in the C major scale to the C major triad, then one produces the Cadd9 tetrachord (C, E, G, D). The formula for this chord is 1-3-5-9.

Figure 8-1 provides the musical notation, and the musical chord symbol, for twelve different types of tetrachords. Each of these example tetrachords uses C as the root note of the chord. Four of these tetrachord types were used to train the multilayer perceptron that was described in Chapter 7. The other eight are new; the formula for each is provided in Table 8-1.
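The way a chord formula maps onto notes can be made concrete with a short sketch. The code below is only an illustration under stated assumptions: it writes flats and sharps as the ASCII characters b and #, it spells the black keys with one arbitrary choice of names, and the function names are hypothetical rather than taken from any software used in this book.

```python
# Semitone offsets of major-scale degrees 1 through 7 above the root.
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]

# One arbitrary spelling of the twelve pitch-classes (an assumption).
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def degree_to_semitones(token):
    """Convert one formula token such as '3', 'b3', '#5', 'bb7' or '9'."""
    shift = token.count("#") - token.count("b")   # each flat lowers, each sharp raises
    degree = int(token.lstrip("#b"))
    octave, index = divmod(degree - 1, 7)         # degree 9 is degree 2, one octave up
    return MAJOR_SCALE[index] + 12 * octave + shift


def chord_pitch_classes(formula, root=0):
    """Pitch-classes (0 = C) for a formula like '1-b3-5-b7' built on a root."""
    return [(root + degree_to_semitones(t)) % 12 for t in formula.split("-")]


def chord_note_names(formula, root=0):
    """Note names for the chord, using the spelling in NOTE_NAMES."""
    return [NOTE_NAMES[pc] for pc in chord_pitch_classes(formula, root)]
```

For example, chord_note_names("1-b3-5-b7") gives the C minor seventh notes, and chord_note_names("1-3-5-9") gives C, E, G, D, with the ninth folded back into a single octave by the modulo-12 step.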


Tetrachord Type              Formula         Example     Forte Number
Diminished Seventh           1-♭3-♭5-♭♭7     Cº7         4-28(3)
Minor Seventh                1-♭3-5-♭7       Cm7         4-26(12)
Minor Sixth                  1-♭3-5-6        Cm6         4-27
Minor, Major Seventh         1-♭3-5-7        Cm(maj7)    4-19
Minor Added Ninth            1-♭3-5-9        Cm(add9)    4-14
Seventh, Flat Fifth          1-3-♭5-♭7       C7♭5        4-25(6)
Augmented Seventh            1-3-♯5-♭7       C+7         4-24(12)
Seventh                      1-3-5-♭7        C7          4-27
Sixth                        1-3-5-6         C6          4-26(12)
Major Seventh                1-3-5-7         Cmaj7       4-20(12)
Added Ninth                  1-3-5-9         Cadd9       4-22
Seventh, Suspended Fourth    1-4-5-♭7        C7sus4      4-23(12)

Table 8-1. The names and formulas for twelve different types of tetrachords. An example of each is provided in Figure 8-1.

The formulae that are provided in Table 8-1 are designed to work in the context of any major scale. The numbers in each formula refer to a note's position in a particular scale. That is, 1 is the first note in a particular scale, 3 is the third note in a particular scale, and so on. This means that there are 12 different versions of each of the chord types listed in Table 8-1: one for each of the twelve possible major scales.

When these formulae are used to create tetrachords in different keys, some interesting relationships between chords arise. Consider the 6 chord, whose formula is 1-3-5-6. In the context of the C major scale this produces the C6 chord whose notes are [C, E, G, A]. Now consider applying the formula for the minor seventh tetrachord (1-♭3-5-♭7) in the context of the A major scale. This produces the Am7 chord whose notes are [A, C, E, G]. Note that these notes are identical to those of C6; musically speaking Am7 is identical to an inversion of C6. Similarly, the dominant seventh chord is the inversion of a minor sixth tetrachord in a different key.

In other words, the same set of four pitch-classes can have two different chord names. If we train a network to identify tetrachord types, then it must be trained to generate both of these chord names to one set of four input pitch-classes. Table 8-1 also provides the Forte numbers of each of these chord types. Note that tetrachords that are related by inversion (or tetrachords that can represent different names for the same set of input pitch-classes) have the same Forte number.

When we train a multilayer perceptron to classify the twelve different types of tetrachords in Table 8-1, we will again be using pitch-class representation. Because of this, notes in extended chords like the added ninth chord will be moved back into the range of a single octave. As well, when we interpret the internal structure of the network, we will be exploiting the properties of some of the circles of intervals that were introduced in Chapter 7.

For these reasons it is useful to represent the various tetrachords in another visual format. In particular, we can illustrate a tetrachord in a circle of pitch-classes (specifically, a circle of minor seconds) by drawing in four spokes that represent which four notes are present in a particular chord. Drawing such a diagram will illustrate a particular chord in the context of a specific major key. However, this diagram represents the structure of a tetrachord type for any key: if one rigidly rotates the spokes to a different position in the circle, then it will provide the notes for the same type of tetrachord, but relative to some other musical key.

Pitch-class diagrams of the first six tetrachords provided in Figure 8-1 or in Table 8-1 are provided in Figure 8-2. Figure 8-3 provides similar diagrams for the other six tetrachord types.
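The observation that C6 and Am7 share the same four pitch-classes, and the idea that rigidly rotating the spokes moves a chord type to another key, can both be checked with a few lines of code. This is a minimal sketch: pitch-classes are numbered 0 to 11 with C as 0 (an assumed convention), and the helper names are hypothetical.

```python
# Interval patterns in semitones above the root, taken from the formulae
# 1-3-5-6 and 1-b3-5-b7 in Table 8-1.
SIXTH = (0, 4, 7, 9)
MINOR_SEVENTH = (0, 3, 7, 10)


def transpose(intervals, root):
    """Rotate a chord type to a new root; returns an unordered pitch-class set."""
    return frozenset((root + i) % 12 for i in intervals)


# C is pitch-class 0 and A is pitch-class 9.
c6 = transpose(SIXTH, 0)
am7 = transpose(MINOR_SEVENTH, 9)
same_notes = c6 == am7


def shared_names(type_a, type_b):
    """All (root_a, root_b) pairs for which the two chord types coincide."""
    return [
        (a, b)
        for a in range(12)
        for b in range(12)
        if transpose(type_a, a) == transpose(type_b, b)
    ]
```

Calling shared_names(SIXTH, MINOR_SEVENTH) lists one matching minor seventh chord for every sixth chord, which is why a network trained on these patterns needs two output units on for such inputs.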


Figure 8-2. Pitch-class diagrams of the first six tetrachords from the musical score in Figure 8-1.

Figure 8-3. Pitch-class diagrams of the second six tetrachords from the musical score in Figure 8-1.

All of the chords presented in Figures 8-2 and 8-3 are created in the context of the C major scale. The structure of the spokes in the diagrams provides an interesting perspective on the similarities and differences between various tetrachord types. For instance, it is immediately apparent that both the diminished tetrachord and the seventh flattened fifth tetrachord include two pairs of notes that belong to the same circle of tritones, because both diagrams include two long spokes that bisect the circle. Similarly, one can see the similarity in spoke structure between the minor seventh and the sixth tetrachords, as well as between the seventh and the minor sixth tetrachords.

In the next section we will describe training a multilayer perceptron to identify these twelve different types of tetrachords. Then we will interpret the internal structure of the network. At times during this interpretation it will be useful to come back to Figures 8-2 and 8-3 in order to achieve a quick visual understanding of the similarities between different types of tetrachords.


8.2 Classifying Extended Tetrachords

Figure 8-4. The architecture of the multilayer perceptron trained to identify twelve different types of tetrachords. See text for details.

8.2.1 Task

Our goal is to train an artificial neural network, when presented with four notes that define a tetrachord, to identify the type of tetrachord, ignoring the tetrachord's key. This is exactly the same task that faced the network that was described earlier in Section 7.3. The difference between the two networks is that the current network learns to classify input chords into twelve different categories, and not simply the four tetrachord types that were given to the earlier network.

At the end of training, the multilayer perceptron used for this task typically turned one output unit 'on' to identify tetrachord type, and turned the remaining eleven output units 'off', when presented with a tetrachord. The exception to this occurred for the situation in which two different tetrachord types (e.g. 6 and m7) could be applied to the same four input pitch-classes. In this situation, the network was trained to turn on both of the appropriate output units, and to turn the remaining ten output units off.

8.2.2 Network Architecture

The architecture of the current network is an elaboration of the earlier tetrachord network, and is illustrated in Figure 8-4. The current network uses twelve input units to represent input pitch-classes, which is identical to the earlier network. It differs from the earlier network in requiring twelve output units instead of four, because it identifies a greater variety of tetrachord types. Also, we discovered that this problem was more complicated than the previous one. As a result, the current network requires seven hidden units in order to discover a solution to the extended tetrachord problem. All of the output units and all of the hidden units in the current network were value units that employed the Gaussian activation function.

8.2.3 Training Set

The training set consisted of 144 stimuli: the twelve different tetrachords that could be created in the context of a particular major scale (see Figure 8-1). We created these tetrachords for each of the twelve different major scales. Each was encoded as an input pattern in which four input units were activated with a value of 1, and the remaining eight input units were all activated with a value of 0. Each input pattern was paired with an output pattern that indicated the tetrachord type that the input pattern belonged to. The network was trained to turn on the output unit(s) that represented the input pattern's type(s), and to turn all other output units off.

8.2.4 Training

The multilayer perceptron was trained with the generalized delta rule developed for networks of value units (Dawson & Schopflocher, 1992) using the Rumelhart software program (Dawson, 2005). During a single epoch of training each pattern was presented to the network once; the order of pattern presentation was randomized before each epoch.

All connection weights in the network were set to random values between -0.1 and 0.1 before training began. In the network to be described in detail below, each µ was initialized to 0, but was then modified by training. A learning rate of 0.01 was employed. Training proceeded until the network generated a 'hit' for every output unit for each of the 144 patterns in the training set. Once again a 'hit' was defined as activity of 0.9 or higher when the desired response was 1 or as activity of 0.1 or lower when the desired response was 0.

This problem was solved fairly readily by a network that contained seven hidden value units, typically converging after between 7000 and 10,000 epochs of training. The network described in more detail in the next section converged after 7236 epochs of training.
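The encoding of the training set and the 'hit' criterion described above can be sketched as follows. This is not the Rumelhart program's own data format. It is a hypothetical reconstruction in which the interval patterns are derived from the formulae in Table 8-1, pitch-class 0 stands for C, the chord-type labels are ASCII stand-ins for the chord symbols, and the order of the output units is arbitrary.

```python
# Semitone patterns above the root, derived from the formulae in Table 8-1.
CHORD_TYPES = {
    "dim7": (0, 3, 6, 9), "m7": (0, 3, 7, 10), "m6": (0, 3, 7, 9),
    "m(maj7)": (0, 3, 7, 11), "m(add9)": (0, 2, 3, 7), "7b5": (0, 4, 6, 10),
    "+7": (0, 4, 8, 10), "7": (0, 4, 7, 10), "6": (0, 4, 7, 9),
    "maj7": (0, 4, 7, 11), "add9": (0, 2, 4, 7), "7sus4": (0, 5, 7, 10),
}
NAMES = list(CHORD_TYPES)   # assumed (arbitrary) order of the twelve output units


def pitch_class_set(name, root):
    """The four pitch-classes of one chord type built on one root."""
    return frozenset((root + i) % 12 for i in CHORD_TYPES[name])


def encode_input(notes):
    """Twelve input units: 1 for each pitch-class present, 0 otherwise."""
    return [1 if pc in notes else 0 for pc in range(12)]


def encode_target(notes):
    """Twelve output units: 1 for every chord type that names these notes."""
    return [
        1 if any(pitch_class_set(name, r) == notes for r in range(12)) else 0
        for name in NAMES
    ]


def build_training_set():
    """One pattern per chord type per root: 12 x 12 = 144 patterns."""
    return [
        (name, root, encode_input(notes), encode_target(notes))
        for name in NAMES
        for root in range(12)
        for notes in [pitch_class_set(name, root)]
    ]


def is_hit(activity, desired):
    """Hit criterion from the text: >= 0.9 for a target of 1, <= 0.1 for 0."""
    return activity >= 0.9 if desired == 1 else activity <= 0.1
```

Under these assumptions, only the sixth and minor seventh patterns receive two active target units; every other pattern has exactly one.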


8.3 Interpreting the Extended Tetrachord Network

8.3.1 Jittered Density Plots

The extended tetrachord network is the most complicated one that we have yet encountered in this book. For instance, it has seven hidden units, making it very difficult to orient an interpretation by graphing the hidden unit space. For this reason, we will begin to interpret the network by examining two different characteristics of each hidden unit: the weights of the connections that feed into a hidden unit and the activity produced by the hidden unit when it is presented each of the 144 input patterns.

With respect to patterns of connectivity, we will see shortly that each of the hidden units organizes input pitch-classes into some of the strange circles that were introduced in Chapter 7. This is particularly helpful for interpreting this more complicated network. This is because instead of considering the effect of the twelve different pitch-classes on the hidden unit, we can consider smaller sets of pitch-classes that are treated as being equivalent. For example, we will see that an account of Hidden Unit 1's role in the network can be achieved fairly easily by considering input pitch-classes as belonging to one of the two circles of major seconds, or as belonging to one of the six circles of tritones.

With respect to hidden unit activity, we will take advantage of a property that we have not yet encountered, a characteristic that is frequently exhibited by value units (Berkeley, Dawson, Medler, Schopflocher, & Hornsby, 1995), although in some cases it may be found in other types of processors (Berkeley & Gunay, 2004). When the activities of a hidden value unit are graphed using a jittered density plot, this plot is often organized into different bands. Each band contains a subset of input patterns that share certain properties which, when identified, help us understand the features being detected by the hidden unit. Let us describe the general use of banded jittered density plots in more detail before using them to interpret the extended tetrachord network.

A jittered density plot is a type of graph that can be used to plot the distribution of a variable, and can be thought of as a one dimensional scatter plot. Consider producing a jittered density plot for the activities generated by one hidden unit to each of the patterns of a training set. Each pattern is represented by one dot in the plot. The position of the dot along the x-axis of the graph represents the activity produced in the hidden unit by that pattern. The position of the dot along the y-axis is a random number that has no meaning; this random 'jittering' is used to prevent different dots in the plot from overlapping as much as possible.

An example jittered density plot for Hidden Unit 1 of the current network is provided in Figure 8-5 below. Note that the x-axis ranges from 0 to 1, because this is the range of activity that can be generated by a value unit. There are 144 different dots in this plot, one for each of the 144 tetrachords in the training set.

Figure 8-5. The jittered density plot for Hidden Unit 1 in the extended tetrachord network. See text for details.

Berkeley et al. (1995) discovered that in many cases the jittered density plots of hidden value unit activities were organized into distinct bands. This is true of the jittered density plot in Figure 8-5. It is organized into three different bands: in Band A 24 of the input patterns generate 0 activity in this unit; in Band B 48 of the patterns generate activity that ranges between 0.11 and 0.20, and in Band C the remaining 72 patterns generate activity between 0.99 and 1.
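The construction of a jittered density plot, and the idea of reading bands off it, can be sketched in a few lines. The code below is an illustrative assumption rather than the procedure used by Berkeley et al. (1995): bands are found here simply by looking for empty gaps wider than a threshold, and the activities are synthetic stand-ins spread evenly over the three ranges reported for Hidden Unit 1, not the network's actual outputs.

```python
import random


def jitter_points(activities, seed=0):
    """Pair each activity (x) with a meaningless random height (y)."""
    rng = random.Random(seed)
    return [(a, rng.random()) for a in activities]


def find_bands(activities, min_gap=0.05):
    """Group sorted activities into bands separated by gaps wider than min_gap."""
    bands = []
    for a in sorted(activities):
        if bands and a - bands[-1][-1] <= min_gap:
            bands[-1].append(a)      # close to the previous dot: same band
        else:
            bands.append([a])        # a wide empty gap starts a new band
    return bands


def synthetic_activities():
    """Stand-in data: 24, 48 and 72 values spread over the reported ranges."""
    band_a = [0.0] * 24
    band_b = [0.11 + 0.09 * k / 47 for k in range(48)]
    band_c = [0.99 + 0.01 * k / 71 for k in range(72)]
    return band_a + band_b + band_c
```

Passing the output of jitter_points to any scatter-plot routine gives the plot itself, while find_bands(synthetic_activities()) recovers three groups of 24, 48 and 72 patterns. The value of min_gap is a free parameter that would need adjusting for units with more dispersed bands.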


Berkeley et al. (1995) discovered that patterns that belonged to the same band in a jittered density plot shared certain properties. By taking just the subset of patterns that fell into one band and examining their characteristics, one could interpret the features they shared and use these features to determine the unit's function in the network. This technique was used to successfully interpret the internal structure of a number of different networks of value units (Dawson, Medler, & Berkeley, 1997; Dawson, Medler, McCaughan, Willson, & Carbonaro, 2000; Dawson & Piercey, 2001).

Figure 8-5 demonstrates that distinct banding is present when the activities of one of the extended tetrachord network's hidden units are graphed in a jittered density plot. Fortunately for us, banding is present for almost all of the hidden units of this network. We will take advantage of this banding by taking just those input patterns that fall into a particular band, and determining what features these tetrachords have in common. Furthermore, this interpretation will be informed by our understanding of the strange circles found in the connection weights in each hidden unit. Together these two properties will lead to a fairly detailed understanding of the internal structure of the extended tetrachord network, in spite of its complexity.

Let us begin by considering the connection weights and the jittered density plot for each hidden unit in turn.

8.3.2 Hidden Unit 1

Figure 8-6 provides a graph of the connection weights that feed into Hidden Unit 1 from the twelve input pitch-class units, as well as the jittered density plot that was already presented in Figure 8-5. It is obvious from Figure 8-6 that this hidden unit organizes input signals in terms of strange circles. First, all of the positive weights come from pitch-classes that belong to one of the circles of major seconds (Figure 7-7), and all of the negative weights come from pitch-classes that belong to the other circle of major seconds. Second, if one examines the set of six positive weights, then it becomes apparent that there is some variation in strength. This variation is due to the fact that this hidden unit assigns the identical weight to pairs of pitch-classes that belong to the same circle of tritones (Figure 7-13). The variation in weights permits the hidden unit to distinguish one circle of tritones from another. The same is true for the six negative weights.

Figure 8-6. The connection weights and the jittered density plot for Hidden Unit 1. The three bands in the jittered density plot are labeled A, B, and C.

What does this hidden unit detect? To begin, let us note that at the end of training this unit's µ had a value of -0.01, indicating that in order to turn it on a near-zero net input is required. With this fact in mind, and recognizing that Hidden Unit 1 appears to use equivalence classes involving circles of major seconds and circles of tritones, let us consider the patterns that fall into each of the three bands of the jittered density plot.


Let us begin with the subset of patterns that belong to Band A in Figure 8-6. There are only two types of tetrachords in this subset: all of the +7 chords and all of the 7♭5 chords belong to this band. What do these tetrachords have in common? Each chord includes four pitch-classes that all belong to only one of the circles of major seconds. As a result, all four of the signals sent to Hidden Unit 1 by one of these chords pass through weights that all have the same sign. These signals cannot cancel one another out; the hidden unit will receive either an extreme positive or an extreme negative net input, which causes it to turn off because of its near zero µ.

Now let us turn to the opposite extreme by examining the subset of patterns that fall into Band C in Figure 8-6. These patterns consist of all the 6, 7sus4, º7, m(add9), m7 and maj7 tetrachords. What does this large collection of different types of chords have in common?

First, unlike the chords which belong to the band near zero, all of these tetrachords have two pitch-classes that belong to one circle of major seconds, and two others that belong to the other circle of major seconds. This permits the signals sent from this set of patterns to cancel each other out, producing a near zero net input, and turning Hidden Unit 1 on.

Second, the tetrachords which belong to this band (with the exception of the º7 chords, which are a special case) include pitch-classes that each belong to a different circle of tritones. In other words, four different circles of tritones are represented in each chord. Furthermore, the particular circles of tritones selected are important: two of the sampled circles have negative weights, while the other two have positive.

As a result, one finds in these tetrachords two specific patterns of tritone sampling. These are illustrated in Figure 8-7A and Figure 8-7B. In these figures each tritone circle is a line that bisects the pitch-class diagram. Tritone circles that are sampled by these tetrachords are represented as solid lines; dashed lines indicate tritone circles that are not sampled. In the first pattern of sampling exhibited by these chords (Figure 8-7A) four adjacent tritone circles are sampled. Note that because weights are organized by circles of major seconds, this pattern means that two negative and two positive weights are involved, producing zero net input. The same is true for the second pattern of sampling, which involves two adjacent tritone circles, not sampling from the next, and then sampling from the next two adjacent tritone circles. Only the diminished seventh (º7) tetrachords fail to exhibit this pattern, but this is because they represent a special case of Figure 8-7B: they sample two circles of tritones twice, and these two samples are from circles that are 90° apart in the diagram (see Figure 8-2).

Figure 8-7. Three patterns of tritone sampling for tetrachords. A and B are patterns that turn Hidden Unit 1 on; C is a pattern that generates weak activity in Hidden Unit 1. See text for details.

The importance of which circles of tritones are sampled by a tetrachord emerges when we consider the final band of patterns that produce weak activity in Hidden Unit 1 (Band B, Figure 8-6). This band includes all of the remaining types of tetrachords (7, add9, m(maj7), m6). Half of these chords fall into this band because they sample from three different circles of tritones, not four. In other words, they sample one pitch-class each from two different circles of tritones, and two pitch-classes from a third. As a result, the input signals do not cancel one another out.

However, the remaining tetrachords that belong to this band sample pitch-classes from four different tritone circles. Why do these chords not turn Hidden Unit 1 on? The answer to this question is that they sample these tritone circles following a different pattern than the two that were discussed above. As shown in Figure 8-7C, they sample pitch-classes from three adjacent tritone circles, skip the next, and then sample from the next. This pattern of sampling produces an unbalanced signal, generating weak activity in this hidden unit.
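The circle memberships that this account of Hidden Unit 1 relies on can be tabulated directly from the chord formulae. The sketch below is a hypothetical check, not part of the original analysis: it assumes pitch-classes numbered 0 to 11 with C as 0, so that the two circles of major seconds are the even and odd pitch-classes and the six circles of tritones are the residues modulo 6. The chord-type labels are ASCII stand-ins for the chord symbols.

```python
# Semitone patterns above the root, derived from the formulae in Table 8-1.
CHORD_TYPES = {
    "dim7": (0, 3, 6, 9), "m7": (0, 3, 7, 10), "m6": (0, 3, 7, 9),
    "m(maj7)": (0, 3, 7, 11), "m(add9)": (0, 2, 3, 7), "7b5": (0, 4, 6, 10),
    "+7": (0, 4, 8, 10), "7": (0, 4, 7, 10), "6": (0, 4, 7, 9),
    "maj7": (0, 4, 7, 11), "add9": (0, 2, 4, 7), "7sus4": (0, 5, 7, 10),
}


def major_second_split(pcs):
    """How the notes divide between the two circles of major seconds."""
    even = sum(1 for pc in pcs if pc % 2 == 0)
    return tuple(sorted((even, len(pcs) - even), reverse=True))


def tritone_circles(pcs):
    """Number of distinct circles of tritones that the notes sample."""
    return len({pc % 6 for pc in pcs})


def describe(name, root=0):
    """Both measures for one chord type in one key."""
    pcs = [(root + i) % 12 for i in CHORD_TYPES[name]]
    return major_second_split(pcs), tritone_circles(pcs)
```

Under these assumptions the +7 and 7♭5 chords come out as a (4, 0) split in every key, the six Band C types as (2, 2), and the four Band B types as (3, 1), with the 7 and m6 chords sampling three circles of tritones and the add9 and m(maj7) chords sampling four.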

8.3.3 Hidden Unit 2

Figure 8-8 provides the connection weights and the jittered density plot for Hidden Unit 2 of the extended tetrachord network. This hidden unit organizes input pitch-classes into circles of minor thirds (Figure 7-9), assigning a weight of 0.79 to those pitch-classes that belong to the first circle, a weight of -0.07 to those pitch-classes that belong to the second, and a weight of -0.50 to those pitch-classes that belong to the third. At the end of training, the value of µ for this unit was -0.13.

The jittered density plot is similar to the one for Hidden Unit 1, as it is organized into three distinct bands. The first is near zero, the second is between 0.2 and 0.4, and the third is between 0.8 and 1.0. The bands for Hidden Unit 2 are slightly more dispersed than those observed for Hidden Unit 1.

Figure 8-8. The connection weights and the jittered density plot for Hidden Unit 2. The three bands in the jittered density plot are labeled A, B, and C.

Let us first consider the patterns that belong to Band C in Figure 8-8. There are 52 such patterns, representing 7sus4, add9, +7, º7, m(add9), m(maj7) and maj7 tetrachords. Interestingly, the band does not capture all instances of each chord type: it captures 4 instances of the diminished seventh chord, and 8 instances of each of the other chord types. Whatever property belongs to the chords in this band does not characterize all 12 instances of each chord type.

What properties do the tetrachords that belong to this band share? All of these tetrachords (except the diminished sevenths, which are a special case) select pitch-classes from each of the three circles of minor thirds. That is, they select one pitch-class from each of two of these circles, and select two pitch-classes from the third circle.

Furthermore, 24 of the tetrachords in Band C include one pitch-class associated with a weight of 0.79, a second associated with a weight of -0.07, and two pitch-classes associated with a weight of -0.50. This results in a net input of about -0.30, which is close enough to µ to produce activity of about 0.90. Another 24 of the tetrachords include two pitch-classes associated with a weight of -0.07, and two others associated with each of the other two weights. This produces a net input of 0.14, resulting in activity of just over 0.80.

The diminished seventh chords that fall in this band are a special case, because they are composed of all four pitch-classes that are associated with a weight of -0.07, which all belong to the same circle of minor thirds. These four weights sum to -0.28, a net input that produces activity of 0.88 in Hidden Unit 2.

Why are only subsets of different tetrachord types found in this band? The structure of the four diminished seventh tetrachords provides an answer to this question. The other eight diminished seventh chords are composed of four pitch-classes that all belong to one of the other two circles of minor thirds. When these weights are summed together the resulting net input is too extreme to produce high activity in Hidden Unit 2. This removes them from this band.

A similar story can be told for the other types of tetrachords in this band. Recall that the band captures eight instances of each type, but four other instances do not belong to the band. This is because the specific set of weights for Hidden Unit 2 is such that these subsets of tetrachords generate an extreme net input that removes them from the band.

For example, Gmaj7, Cmaj7, Fmaj7, and G#maj7 are similar to all of the other major seventh chords in that they include two pitch-classes from one circle of minor thirds, and one from each of the other two. However, given the weights for Hidden Unit 2, their particular combination of notes produces a net input that removes them from the band.

In particular, each of these chords includes two pitch-classes from the circle of minor thirds assigned a weight of 0.79 by this unit, and one pitch-class from each of the other two circles. As a result these four major seventh chords generate a net input of 1, which turns Hidden Unit 2 off. This separates these four tetrachords from the other eight that fall in the high band. A similar account holds for all of the other chords that belong to a tetrachord type captured by the band, but which are not part of the band.

Let us next consider the band of patterns that produce weak activity (ranging between 0.2 and 0.4) in Hidden Unit 2 (Band B). There are 24 such patterns, representing m6, 6, m7, 7, and 7♭5 tetrachords. Again, the band does not capture all instances of each chord type. All of the chords that fall in this band share one property: they do not include a pitch-class from one of the three circles of minor thirds. They either include three pitch-classes from one circle and a fourth from one other, or they include two pitch-classes from one circle and two others from another. In either case, the weights associated with these sets of pitch-classes cannot cancel each other out; these chords produce net inputs of either -0.72 or 0.58.

From the discussion above it is clear that high activity in Hidden Unit 2 indicates that a tetrachord characterized by one of two different patterns has been detected. One pattern involves four pitch-classes associated with a particular combination of connection weights (one strong positive, one weak negative, two strong negatives). The second pattern involves four pitch-classes each of which is associated with a weak negative connection weight.

The patterns that belong to Band A in Figure 8-8 produce zero activity in Hidden Unit 2 because they fail to exhibit either of these combinations of weight signals. As a result the 68 patterns that belong to this band represent a diversity of tetrachord types. Indeed, all twelve different types of tetrachords have instances that belong to this band.

When banding in the jittered density plots of value units was first discovered (Berkeley et al., 1995), it was realized that patterns associated with a band involving near zero activity in a hidden unit were patterns that did not share any defining positive feature. Instead, they shared a negative feature: they all lacked the feature or features that the hidden unit detected, and which produced higher activity. As a result, in many cases a detailed interpretation of the features of patterns that belong to a 'zero band' is neither informative nor possible. Band A in Figure 8-8 is an example of this situation.
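The net input arithmetic for Hidden Unit 2 can be reproduced with a short sketch. Two assumptions should be flagged. First, the activation function used here, exp(-π(net - µ)²), is the Gaussian commonly given for value units; the text above does not state it, so it is included as an assumed form. Second, the three weights are the rounded values quoted above, so the computed nets and activities may differ slightly from the reported ones. The function names are hypothetical.

```python
import math

# Rounded weights quoted in the text for the first, second and third
# circle of minor thirds, and the unit's µ at the end of training.
WEIGHTS = (0.79, -0.07, -0.50)
MU = -0.13


def net_input(counts):
    """Net input when a chord places counts[i] notes in circle i."""
    return sum(n * w for n, w in zip(counts, WEIGHTS))


def value_unit_activity(net, mu=MU):
    """Assumed Gaussian activation: 1 when net equals mu, falling toward 0."""
    return math.exp(-math.pi * (net - mu) ** 2)


def summarize(counts):
    """Net input and resulting activity for one distribution of four notes."""
    net = net_input(counts)
    return round(net, 2), round(value_unit_activity(net), 2)
```

For instance, summarize((1, 1, 2)) and summarize((0, 4, 0)) both describe chords whose net input lies close to µ and whose activity is therefore high, while summarize((2, 1, 1)) corresponds to the major seventh chords described above whose net input of about 1 turns the unit off.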

8.3.4 Hidden Unit 4

Band C for the jittered density plot of Hidden Unit 2 (Figure 8-8) indicated that this unit generated high activity to a number of different types of tetrachords. However, for each of these different types, it generated this high activity to only eight of the twelve possible instances. What does the network do to the four instances of each chord type that are omitted from this band in Hidden Unit 2? We show below that they are the only chords that produce high activity in Hidden Unit 4.

Figure 8-9 provides the connection weights and the jittered density plot for Hidden Unit 4. An examination of the weights indicates that this hidden unit, like Hidden Unit 2, organizes input pitch-classes into circles of minor thirds (Figure 7-9), assigning a weak negative weight to those pitch-classes that belong to the first circle, a more extreme negative weight to those pitch-classes that belong to the second, and a strong positive weight to those pitch-classes that belong to the third. At the end of training, the value of µ for this unit was -0.06.

An examination of the weights also indicates that pitch-classes are organized into equivalence classes based upon circles of tritones: pitch-classes that are in the same circle of tritones are assigned identical weights. Indeed, this organization is cleaner than the organization in terms of circles of minor thirds, because there is some variation of weight values assigned to pitch-classes in the same circle of minor thirds.

Figure 8-9. The connection weights and the jittered density plot for Hidden Unit 4. The two bands in the jittered density plot are labeled A and B.

The lower part of Figure 8-9 indicates that the jittered density plot of Hidden Unit 4 can be viewed as being organized into two fairly broad bands: patterns that belong to Band A generate activity that ranges between 0.00 and 0.50, while patterns that belong to Band B generate activity that ranges between 0.80 and 1.00. These are considered to be different bands because there are no patterns in between them.
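The two groupings described for Hidden Unit 4 can be made concrete with a little modular arithmetic. The sketch below is an illustration only, not code used to train or analyze the network. It assumes that pitch-classes are numbered in semitone steps starting from A = 0, and the names `circle_index` and `circles` are hypothetical. The circle numbers it produces need not match the labelling of the first, second, and third circles in Figure 7-9.

```python
# Pitch-classes in semitone order, starting from A (an assumed convention).
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def circle_index(pitch_class, interval):
    """Index of the circle that holds pitch_class when circles are built by
    stepping `interval` semitones (interval must divide 12: 2, 3, 4 or 6)."""
    return PITCH_CLASSES.index(pitch_class) % interval


def circles(interval):
    """Group all twelve pitch-classes into their interval circles."""
    groups = {}
    for pitch_class in PITCH_CLASSES:
        groups.setdefault(circle_index(pitch_class, interval), []).append(pitch_class)
    return [groups[index] for index in sorted(groups)]


minor_third_circles = circles(3)  # three circles of four pitch-classes each
tritone_circles = circles(6)      # six circles of two pitch-classes each
```

Because 3 divides 6, both members of any circle of tritones fall inside a single circle of minor thirds. This is one way to see how a single set of weights can respect both organizations at once.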

Band B in Figure 8-9 consists of 24 patterns, representing four instances each of 7sus4, add9, aug7, m(add9), m(maj7) and maj7 tetrachords. Importantly, these are exactly the same types of tetrachords found in Band C of Hidden Unit 2, with one exception: Band B does not include any diminished seventh chords. More importantly, the four instances of each type of tetrachord found in Band B are precisely the four instances that are not found in Band C of Hidden Unit 2.

What do all of the tetrachords in Band B have in common? Each chord includes two pitch-classes associated with a small negative weight, one pitch-class associated with a strong negative weight, and one pitch-class associated with a strong positive weight. Variation in the weights (for instance, the small negative weight could be either -0.12 or -0.33) produces variation in net input, which is why Band B is wide. On average, a pattern that belongs to this band will generate a net input of -0.18, which is close enough to µ to produce strong activity in Hidden Unit 4.

Why does this band capture a different subset of tetrachord instances when Hidden Unit 4 and Hidden Unit 2 can be described as organizing input pitch-classes according to the same strange circles? The answer to this question comes from comparing the weights in Figure 8-9 to those in Figure 8-8. Note that different weight values are assigned to the same strange circles in the two hidden units. For instance, Hidden Unit 2 assigns a strong positive weight to pitch-classes that belong to the first circle of minor thirds, while Hidden Unit 4 assigns a weak negative weight to the same pitch-classes. These differences cause some instances of a tetrachord type to generate strong activity in one hidden unit, but weak activity in the other.

What about Band A in Figure 8-9? None of these patterns are defined with the same combination of pitch-classes (two small negative weights, one large negative weight, and one large positive weight) that defines membership in Band B. Of course, some other combinations of weights produce moderate Hidden Unit 4 activity, but none are as optimal as the Band B combination. High activity in Hidden Unit 4 represents the detection of this particular combination, which serves to capture 24 tetrachords that (musically) should have been in Band C of Hidden Unit 2, but were not.

8.3.5 Hidden Unit 7

Figure 8-10 presents the connection weights and the jittered density plot for Hidden Unit 7 of the extended tetrachord network. Importantly, at the end of training the value of µ for this hidden unit was -0.02. Thus in order for this unit to generate high activity, the four signals being sent to it from input units must cancel each other out to provide a near-zero net input.

Figure 8-10. The connection weights and the jittered density plot for Hidden Unit 7. The three bands in the jittered density plot are labeled A through C.

The connection weights for this unit indicate that it organizes input pitch-classes into equivalence classes based upon the four circles of major thirds (Figure 7-11). All pitch-classes that belong to the first circle of major thirds are assigned a strong negative weight; those that belong to the second are assigned a weak positive weight; those that belong to the third are assigned a strong

positive weight; and those that belong to the fourth are assigned a weak negative weight.

In addition to organizing pitch-classes in terms of circles of major thirds, the connection weight values of Hidden Unit 7 are such that an interesting balancing of pairs of pitch-classes emerges. Pairs of pitch-classes that are a major second (e.g. A, B), a tritone (e.g. A, D#), or a minor seventh (e.g. A, G) apart are balanced, because they are assigned weights that are equal in magnitude but opposite in sign. Pairs of pitch-classes separated by any other musical interval will not cancel each other’s signal out because of differences in magnitude or sign of their respective connection weights.

As was the case for Hidden Units 1 and 2, the jittered density plot for Hidden Unit 7 is organized into three distinct bands. Two of these bands (Band A and Band B in Figure 8-10) are associated with low activity in Hidden Unit 7, while patterns that belong to Band C turn Hidden Unit 7 on.

Band C in Hidden Unit 7’s jittered density plot contains 36 input patterns that comprise all twelve instances of just three different types of tetrachords: 7♭5, 7sus4, and dim7.

What do these three different types of chords have in common?

All three of these different types of tetrachords include four pitch-classes that are completely balanced in this network because pairs of these pitch-classes are separated by a major second, a tritone, or a minor seventh. For instance, a diminished seventh chord is composed of two pairs of pitch-classes which are both a tritone apart (Figure 8-2). The two pitch-classes in each pair cancel each other’s signal out, producing a net input of zero, which turns Hidden Unit 7 on.

Similarly, a 7sus4 chord can be described as two pairs of pitch-classes with each pair separated by a major second (Figure 8-3). As well, a 7♭5 can be described either as two pairs of pitch-classes with each pair separated by a major second, or as two pairs of pitch-classes with each pair separated by a tritone (Figure 8-2). These different descriptions amount to the same effect: two balanced signals from each pair of pitch-classes, generating near zero net input and turning Hidden Unit 7 on.

Obviously the balancing described for the three types of tetrachords above is not true of any of the patterns that belong to the other two bands in Figure 8-10. Band B consists of 24 different input patterns made up of six instances each of 6, aug7, m(add9), and m7 tetrachords. Each of these patterns generates small activity in Hidden Unit 7 (ranging between 0.09 and 0.19) because they are partially balanced in the sense described above. That is, each of these tetrachords contains one pair of pitch-classes that are balanced because they are separated by a major second, a tritone, or a minor seventh. However, the other pair of tones is not balanced. Interestingly, for each of these chords the balanced pair of pitch-classes always involves one weight that is an extreme negative and one that is an extreme positive. They produce some activity in Hidden Unit 7 because the unbalanced pitch-classes involve smaller weights, making net input slightly less extreme than is the case for the remaining tetrachords.

The remaining tetrachords all belong to Band A in Figure 8-10, and all fail to exhibit the kind of balancing that has been discussed above. There are 84 different patterns that belong to this band. 60 of these tetrachords are completely unbalanced: none of their pitch-classes can be described as being separated by a major second, a tritone, or a minor seventh. The remaining 24 are the ‘cousins’ of those that belong to Band B. That is, one of their pitch-class pairs is balanced, but the other is not. The difference between these 24 patterns and the 24 that belong to Band B is that they all involve balancing of a weakly negative and a weakly positive weight. As a result, their unbalanced weights are both either extremely positive or extremely negative. Consequently they generate an extreme net input, which turns Hidden Unit 7 off, and for this reason they belong to Band A.

8.3.6 Hidden Unit 6

Figure 8-11 provides the connection weights and the jittered density plot for the next hidden unit to be considered, Hidden


Unit 6. At the end of training this hidden unit had a value of µ equal to 0.07.

An examination of the weights presented in this figure indicates that this hidden unit groups pitch-class inputs into equivalence classes based upon the six different circles of tritones (Figure 7-13). That is, pairs of pitch-classes that are a tritone apart are assigned the same connection weight.

Figure 8-11. The connection weights and the jittered density plot for Hidden Unit 6. The five bands in the jittered density plot are labeled A through E.

It is also obvious from the weights illustrated in Figure 8-11 that this hidden unit appears to balance, or nearly balance, adjacent triplets of pitch-classes. For instance, consider the first three pitch-classes (A, A#, B). The pattern of weights assigned to these three inputs seems nearly identical in magnitude but opposite in sign to the pattern of weights assigned to the next three pitch-classes (C, C#, D) or to the last three pitch-classes (F#, G, G#).

Table 8-2 below provides a more accurate indication of which pairs of input pitch-classes cancel each other out given the particular connection weights in Figure 8-11. It was created by only turning on two of the input units that feed into Hidden Unit 6 at a time. The resulting net input was simply the sum of the weights associated with each of the activated input units.

Each net input was fed into a Gaussian activation function (with µ = 0.07) to determine the activity produced in Hidden Unit 6 by each possible pair of inputs. It is this activity that is reported in each cell in Table 8-2, where the column label indicates one of the input units that was turned on, and the row label indicates the other. (Pairs that correspond to the diagonal of the matrix were not presented, because in the multilayer perceptron it is not possible to simultaneously send two signals from one input unit.) If the Gaussian activity that resulted was 0.90 or higher, then this indicated that the signals from the two input units cancelled each other out, turning Hidden Unit 6 on. The input pairs that cancel each other out are indicated with grey cells in Table 8-2.

If tritone balancing was the only kind of balancing evident in Table 8-2, then only six different pairs of pitch-classes would cancel each other’s signal out. An inspection of Table 8-2 indicates that there are in fact 9 different pairs of inputs that cancel one another out (note that the table is symmetric, and that each pair is represented twice in the table). Each of these is highlighted in a dark grey cell in the table. In addition, there are four other pairs of pitch-classes that nearly cancel one another out, in the sense that they produce activity of 0.58. The weaker activity produced by these pairs of inputs is highlighted with lighter grey cells in the table.

The pattern of grey cells in Table 8-2 is very regular, consistent with the regular pattern of alternating connection weights in Figure 8-11. In general, pitch-class pairs that are separated by a minor third or by a major sixth cancel one another out. There are two caveats that need to be added to this general description. First, in some instances (e.g. A paired with C) the two weights are different enough in magnitude that they do not completely cancel one another out, but cancel each other out enough to produce moderate activity. Second, even though A and D# are a tritone apart, their connection weights are so close to zero that this pair produces high activity in Hidden Unit 6 too.

      A     A#    B     C     C#    D     D#    E     F     F#    G     G#
A     0.00  0.03  0.00  0.58  0.05  0.00  1.00  0.03  0.00  0.58  0.05  0.00
A#    0.03  0.00  0.00  0.28  0.98  0.27  0.03  0.00  0.00  0.29  0.98  0.27
B     0.00  0.00  0.00  0.01  0.26  0.99  0.00  0.00  0.00  0.01  0.26  0.99
C     0.58  0.28  0.01  0.00  0.00  0.00  0.58  0.28  0.01  0.10  0.00  0.00
C#    0.05  0.98  0.26  0.00  0.00  0.00  0.05  0.98  0.26  0.00  0.00  0.00
D     0.00  0.27  0.99  0.00  0.00  0.00  0.00  0.28  0.99  0.00  0.00  0.00
D#    1.00  0.03  0.00  0.58  0.05  0.00  0.00  0.03  0.00  0.58  0.05  0.00
E     0.03  0.00  0.00  0.28  0.98  0.28  0.03  0.00  0.00  0.28  0.98  0.28
F     0.00  0.00  0.00  0.01  0.26  0.99  0.00  0.00  0.00  0.01  0.26  0.99
F#    0.58  0.29  0.01  0.10  0.00  0.00  0.58  0.28  0.01  0.00  0.00  0.00
G     0.05  0.98  0.26  0.00  0.00  0.00  0.05  0.98  0.26  0.00  0.00  0.00
G#    0.00  0.27  0.99  0.00  0.00  0.00  0.00  0.28  0.99  0.00  0.00  0.00

Table 8-2. The activity produced in Hidden Unit 6 by all possible pairs of different input pitch-classes. Pairs that cancel each other’s signal out, producing high activity in the hidden unit, are indicated by the dark grey cells. Pairs that weakly cancel each other out, producing moderate activity, are indicated by the lighter grey cells. See text for details.
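The procedure used to build Table 8-2 can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the code that produced the table. First, the text specifies only a Gaussian activation function with a given µ; the particular form exp(-π(net - µ)²) used below is an assumption, although it is consistent with the activity values reported in this chapter. Second, the weights are hypothetical placeholders chosen to respect tritone equivalence; they are not the trained weights shown in Figure 8-11. The function names are also hypothetical.

```python
import math

PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def value_unit_activity(net_input, mu):
    """Gaussian activation; the exp(-pi * (net - mu)^2) form is an assumption."""
    return math.exp(-math.pi * (net_input - mu) ** 2)


def pairwise_activity(weights, mu):
    """Activity produced by turning on every pair of different input units."""
    table = {}
    for first in PITCH_CLASSES:
        for second in PITCH_CLASSES:
            if first == second:
                continue  # one input unit cannot send two signals at once
            net_input = weights[first] + weights[second]
            table[(first, second)] = value_unit_activity(net_input, mu)
    return table


def cancelling_pairs(table, threshold=0.90):
    """Unordered pairs whose activity reaches the threshold used in the text."""
    return {frozenset(pair) for pair, activity in table.items() if activity >= threshold}


# Hypothetical weights (NOT those of Figure 8-11); pitch-classes a tritone
# apart share a weight, as described for Hidden Unit 6.
HYPOTHETICAL_WEIGHTS = dict(
    zip(PITCH_CLASSES, [0.02, -0.90, 1.50, -0.40, 0.95, -1.45] * 2)
)

table = pairwise_activity(HYPOTHETICAL_WEIGHTS, mu=0.07)
balanced = cancelling_pairs(table)
```

Storing both orderings of each pair mirrors the symmetry of Table 8-2, while `cancelling_pairs` collapses them so that each pair is counted once.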

With this understanding of the connection weight structure in Figure 8-11, let us now consider the nature of the bands that are revealed in the jittered density plot for Hidden Unit 6.

The jittered density plot for Hidden Unit 6 is organized into five different bands. Excluding Band A (which again appears to be a ‘zero loading’ band with no interpretable structure), these bands share one interesting qualitative characteristic: all of the tetrachords that belong to the same band are missing a pair of pitch-classes. Patterns in Band E are missing both A and D#; patterns in Band D are missing both D and G#; in each of these pairs the two pitch-classes belong to the same tritone circle. Patterns in Band C are all missing both A# and G, which are separated by a minor third. The two patterns that belong to Band B (C#m6 and Gm6) are missing A and D#, B and F, and C and F#. In each of these three pairs the two pitch-classes belong to the same tritone circle.

Quantitatively all of the bands in the Figure 8-11 jittered density plot can be explained in terms of the balancing of different patterns in subsets of adjacent pitch-classes. Let us use the connection weights in Figure 8-11 to identify four different sets of three pitch-classes: let Subset 1 be [A, A#, B], let Subset 2 be [C, C#, D], let Subset 3 be [D#, E, F], and let Subset 4 be [F#, G, G#]. Our previous discussion of the connection weights for each of these subsets (see Figure 8-11 and Table 8-2) suggests that if the same pattern of input activity is present in two of these subsets, then their activities will cancel out, producing high activity.

For instance, imagine an input pattern that includes both A and B as pitch-classes. This corresponds to the pattern of activity [1, 0, 1] in Subset 1. If this same pattern of activity is present in Subset 2 or Subset 4, then the signals from the two different subsets will cancel out, producing high activity in Hidden Unit 6. However, if this same pattern of activity is present in Subset 3, the activities will not cancel out, because these two subsets of pitch-classes have the same connection weights.
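The subset description above can be expressed as a small data structure. The sketch below is illustrative only, and its function names are hypothetical. It splits the twelve input units into the four subsets of three adjacent pitch-classes, and reports which non-empty subsets share the same pattern of activity for a given tetrachord. Whether a shared pattern actually cancels depends on the weights of the hidden unit in question, which this sketch does not model.

```python
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

# Subset 1 = [A, A#, B], Subset 2 = [C, C#, D], Subset 3 = [D#, E, F],
# Subset 4 = [F#, G, G#].
SUBSETS = [PITCH_CLASSES[start:start + 3] for start in range(0, 12, 3)]


def subset_patterns(chord):
    """Pattern of activity (1 = present, 0 = absent) within each subset."""
    notes = set(chord)
    return [[1 if pc in notes else 0 for pc in subset] for subset in SUBSETS]


def matching_subsets(chord):
    """Pairs of non-empty subsets (numbered from 1) with identical patterns."""
    patterns = subset_patterns(chord)
    matches = []
    for i in range(4):
        for j in range(i + 1, 4):
            if any(patterns[i]) and patterns[i] == patterns[j]:
                matches.append((i + 1, j + 1))
    return matches


# A#m7 is used purely as an example input pattern.
example_chord = ["A#", "C#", "F", "G#"]
example_patterns = subset_patterns(example_chord)
example_matches = matching_subsets(example_chord)
```

For this example chord, Subsets 1 and 2 share one pattern and Subsets 3 and 4 share another, which is the kind of arrangement that the text describes as balanced for Hidden Unit 6.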


We analyzed each input pattern that belonged to a band in terms of the patterns of activity present in each of the four pitch-class subsets for that pattern. We performed this analysis both qualitatively (e.g. is the pattern of activity in Subset 1 the same as the pattern in Subset 2) and quantitatively (e.g. what is the contribution to net input from Subset 1 or from Subset 2). These analyses indicated that band membership could be explained by patterns of activity balancing (or not) across the four different subsets.

For example, consider Band E in Figure 8-11. It consists of 14 different tetrachords, including 7, aug7, m7, and 6 chords. All but two of these input patterns are completely balanced in the sense that they have the same pattern of activity in both Subsets 1 and 2, and also have the same pattern of activity in both Subsets 3 and 4. This produces net inputs near 0.07, producing high activity in Hidden Unit 6.

The only exceptions to this are the two augmented seventh chords (F#aug7 and Caug7) found in Band E. These two tetrachords have identical patterns of activity in Subsets 1 and 3, which do not balance, and which produce a net input of 1.09 from each subset. However, they also have patterns of activity that produce a net input of -0.39 from a third subset, and a net input of -1.66 from the fourth. When all four net input components are combined, the final net input for both chords is 0.13, producing Hidden Unit 6 activity of 0.99.

Band D in Figure 8-11 contains 20 different input patterns, representing a variety of different types of tetrachord (7, aug7, m7, 6, m6, add9, and m(maj7)). Of these 20 patterns, 12 are like those described for Band E: Subsets 1 and 2 have the same pattern, as do Subsets 3 and 4. However, because the tetrachords in this band include A and D#, these two pitch-classes do not completely cancel out corresponding pitch-classes in the other subsets (see Table 8-2). As a result, the net inputs for these patterns are slightly larger, producing slightly lower Hidden Unit 6 activities. This is true even when the patterns of activities in complementary subsets are identical.

The remaining 8 patterns in Band D have less balance between subsets, but still produce small enough net inputs to generate high Hidden Unit 6 activity. Two of these chords are augmented sevenths that include either an A or a D# (Aaug7, D#aug7). Their patterns of activity across subsets are similar to the two augmented seventh chords in Band E, but their net input is slightly more extreme (around -0.17) because A or D# are involved with weaker balance (Table 8-2). The remaining six input patterns in this band involve balance between two of the subsets, but the other two are not balanced. Again, the weights of the particular pitch-classes involved are such that net input is low enough to generate strong activity in Hidden Unit 6.

The remaining bands in the Figure 8-11 jittered density plot involve less balance between subsets and more extreme net inputs, decreasing Hidden Unit 6 activity even further. For instance, Band C consists of 8 tetrachords, half of which are sixths and half of which are minor sevenths. None of the subsets balance any of the others for any of these input patterns. However, each of these 8 tetrachords has one subset that has a zero net input; the net inputs from the remaining three subsets sum to either -0.34 or 0.34, producing activity of about 0.60.

Band B consists of only two tetrachords, C#m6 and Gm6. Both of these tetrachords have the same pattern of activity in Subsets 1 and 3, producing net input of 1.09 in each. This is the same situation observed for the two augmented seventh chords that belong to Band E. The difference emerges in terms of the net inputs produced for these two minor sixth chords for the other two subsets, which are -1.66 and -0.94 respectively. In sum these two chords generate a net input of -0.42, which results in only moderate Hidden Unit 6 activity.

The remaining 98 tetrachords belong to Band A. These are instances of nine of the twelve different types of tetrachord, including all of the 7♭5, 7sus4, m(add9), and maj7 chords. Only the 6, m7 and 7 tetrachords are not found in this band. In general, there is less and less balance amongst the four subsets of inputs as one inspects the chords that belong to this band. When balance does occur, it is typically between only two of the subsets; the remaining two subsets are so unbalanced that extreme net input is the result. The net inputs found for the patterns in this band range from -4.11 to 3.43. There is substantial variability in this range, and sometimes net input is fairly small (e.g. around -0.57). This explains why this band is moderately broad in Figure 8-11.

8.3.7 Hidden Unit 5

Figure 8-12 provides the connection weights and the jittered density plot for the next hidden processor to be considered, Hidden Unit 5. At the end of training its µ was equal to -0.03.

Unlike the previous hidden units that we have analyzed, Hidden Unit 5 does not appear to organize pitch-classes into equivalence classes based upon musical intervals. Instead, it exhibits tritone balance: pairs of pitch-classes that are a tritone apart are assigned weights that are equal in magnitude but opposite in sign. Tritone balance is a property that has been seen in several networks that were discussed in earlier chapters.

Although it is less evident than was the case in Figure 8-11, Figure 8-12 indicates that Hidden Unit 5 is also structured to produce balance between patterns of activity defined over subsets of three adjacent input pitch-classes. Again, let Subset 1 be [A, A#, B], let Subset 2 be [C, C#, D], let Subset 3 be [D#, E, F], and let Subset 4 be [F#, G, G#]. An inspection of Figure 8-12’s connection weights indicates that two pairs of these subsets appear to balance one another: Subset 1 is balanced by Subset 3, while Subset 2 is balanced by Subset 4.

A quantitative examination of this pattern of connection weights reveals a tremendous amount of balancing or near balancing within its structure. As was done with Hidden Unit 6, we presented every possible pair of input pitch-classes to this hidden unit. The net input for one such presentation is simply the sum of the weights of the two pitch-classes. We then computed the activity produced in Hidden Unit 5 by passing each of these net inputs through a Gaussian activation function (with µ = -0.03). The results are presented in Table 8-3.

Table 8-3 indicates that there is a great deal more balancing possible with the set of connection weights for Hidden Unit 5 than there was for Hidden Unit 6. There are 19 different pairs of pitch-classes that generate activity of 0.9 or higher, indicating near perfect balance. These cells are highlighted in grey in the table. (Again, this table is symmetric, so that each of these pairs is represented twice.) There are an additional 28 different pairs of pitch-classes that are nearly balanced, generating activity that ranges between 0.5 and 0.9.

Figure 8-12. The connection weights and the jittered density plot for Hidden Unit 5. The seven bands in the jittered density plot are labeled A through G.


      A     A#    B     C     C#    D     D#    E     F     F#    G     G#
A     ---   0.72  0.37  0.14  1.00  0.23  1.00  0.21  0.51  0.85  0.03  0.68
A#    0.72  ---   0.93  0.98  0.18  1.00  0.18  1.00  0.81  0.47  0.71  0.65
B     0.37  0.93  ---   0.73  0.45  0.88  0.46  0.86  1.00  0.83  0.36  0.95
C     0.14  0.98  0.73  ---   0.79  0.55  0.80  0.53  0.87  1.00  0.13  0.97
C#    1.00  0.18  0.45  0.79  ---   0.63  0.03  0.66  0.32  0.11  1.00  0.20
D     0.23  1.00  0.88  0.55  0.63  ---   0.64  0.69  0.97  0.95  0.22  1.00
D#    1.00  0.18  0.46  0.80  0.03  0.64  ---   0.67  0.33  0.12  1.00  0.21
E     0.21  1.00  0.86  0.53  0.66  0.69  0.67  ---   0.95  0.96  0.21  1.00
F     0.51  0.81  1.00  0.87  0.32  0.97  0.33  0.95  ---   0.69  0.50  0.85
F#    0.85  0.47  0.83  1.00  0.11  0.95  0.12  0.96  0.69  ---   0.83  0.51
G     0.03  0.71  0.36  0.13  1.00  0.22  1.00  0.21  0.50  0.83  ---   0.67
G#    0.68  0.65  0.95  0.97  0.20  1.00  0.21  1.00  0.85  0.51  0.67  ---

Table 8-3. The activity produced in Hidden Unit 5 by all possible pairs of different input pitch-classes. Pairs that cancel each other’s signal out, producing high activity in the hidden unit, are indicated by the grey cells. See text for details.

With this degree of balancing and near balancing between pairs of connection weights, and with the potential for balancing between pairs of subsets of input patterns, it is perhaps not surprising that the jittered density plot in Figure 8-12 exhibits a large number of fairly narrow bands. In order to begin to understand the nature of this banding, we examined Hidden Unit 5 in terms of the relationships between patterns of activity amongst the four different subsets of input pitch-classes. Again, this analysis was both qualitative (do the subsets have the same input pattern) and quantitative (what is the net input generated by each subset).

Perhaps not surprisingly the account of banding for Hidden Unit 5 is very similar to the account that was detailed for Hidden Unit 6 in the preceding section. For the 40 input patterns that belong to Band G, two different situations emerge. In one, the pattern for both Subsets 1 and 3 is identical, as is the pattern for both Subsets 2 and 4. As a result, near perfect balance is achieved and Hidden Unit 5 turns on. In the other, the patterns in the various subsets do not balance. However, specific pairs of pitch-classes, drawn from the large number available given Table 8-3, are combined to balance, again turning this hidden unit on.

As we proceed through the various bands associated with less activity in Hidden Unit 5, the general story that emerges is the same as that discovered for Hidden Unit 6: there is a growing imbalance between the various pitch-classes that are combined in the patterns that belong to a band, producing more extreme net inputs and lower activity in Hidden Unit 5.

There are some interesting parallels between the contents of some of the bands in Figure 8-12 and the contents of some of the bands in Figure 8-11. For example, Band E for Hidden Unit 6 contained only two augmented seventh chords; another two augmented seventh chords were the only members of Band B for that unit. For Hidden Unit 5, Band G contains only two m(maj7) chords; the two patterns that belong to Band E of the Figure 8-12 jittered density plot are also chords of this type.

Another similarity is that almost all of the bands for Hidden Unit 5 include a diversity of tetrachord types. Indeed, this property seems to be true of almost all of the bands for each of the hidden units for the extended tetrachord network. This property, as well as a detailed listing of the tetrachord types in each band, will be the subject of Section 8.4 later in this chapter.

One difference between the bands for Hidden Unit 5 and the bands for Hidden Unit 6 is that the current set of bands does not contain patterns that are defined by the absence of specific pairs of pitch-classes. This

property is clearly a consequence of the specific patterns of connection weights, and their possible balances, associated with each hidden unit.

8.3.8 Hidden Unit 3

Figure 8-13 provides the connection weights and the jittered density plot for the final processor to consider, Hidden Unit 3. An examination of the connection weights in Figure 8-13 indicates that, like Hidden Unit 5, it exhibits tritone balance. Furthermore, if we consider the weights in terms of the same four subsets that have been applied to the previous two hidden units, Subset 1 balances Subset 3, and Subset 2 balances Subset 4. This pattern was previously observed in Figure 8-12.

At the end of training, the value of µ for this unit was -0.01. We again computed the activity generated by every possible pair of input pitch-classes, using this value for µ. The results are shown below in Table 8-4.

An inspection of Table 8-4 indicates that there are in fact 15 different pairs of inputs that cancel one another out perfectly (note that the table is symmetric, and that each pair is represented twice in the table). These pairs produce activity of 0.99 or higher, and have their corresponding cells highlighted in grey in the table. In addition to these pairs, there are 30 pairs that when combined nearly balance each other’s signal, producing activity in Hidden Unit 3 that ranges between 0.5 and 0.9.

It is particularly interesting to compare the pattern of connection weights in Hidden Unit 3 (Figure 8-13) to those for Hidden Unit 5 (Figure 8-12). At first glance, the two patterns seem very similar. However, a closer inspection reveals differences between the two.

Consider the weights for Subset 1 (A, A#, B) in Figure 8-12, which has a strong negative followed by a moderate positive and a weak negative. This pattern is also evident in Figure 8-13, but for Subset 4 (F#, G, G#). Similarly the pattern for Subset 2 in Figure 8-12 is found instead for Subset 1 in Figure 8-13; the pattern for Subset 3 in Figure 8-12 is found for Subset 2 in Figure 8-13; and the pattern for Subset 4 in Figure 8-12 is found for Subset 3 in Figure 8-13. In short, it would appear that both Hidden Units 3 and 5 use the same patterns of connection weights (defined over the four subsets of input units), but assign these same patterns to different subsets of input pitch-classes.
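The reassignment of subset patterns just described amounts to rotating a list of four weight patterns by one position. The sketch below illustrates this under idealized assumptions. The weight values are hypothetical and only follow the qualitative description of Subset 1 (a strong negative, a moderate positive, and a weak negative); they are not the trained weights of Figures 8-12 and 8-13. The sketch also assumes that balanced subsets are exact negatives of one another, which is cleaner than the approximate balance found in the trained network.

```python
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def negate(pattern):
    """Weight pattern with every sign flipped (an idealized balancing pattern)."""
    return tuple(-weight for weight in pattern)


def flatten(subset_patterns):
    """Concatenate four 3-weight subset patterns into twelve input weights."""
    return [weight for pattern in subset_patterns for weight in pattern]


def is_tritone_balanced(weights, tolerance=1e-9):
    """True when weights a tritone (six semitones) apart sum to zero."""
    return all(abs(weights[i] + weights[i + 6]) <= tolerance for i in range(6))


def net_input(weights, chord):
    """Sum of the weights of the pitch-classes that make up a chord."""
    return sum(weights[PITCH_CLASSES.index(pc)] for pc in chord)


# Hypothetical subset patterns, not trained values.
P1 = (-1.2, 0.6, -0.2)   # strong negative, moderate positive, weak negative
P2 = (0.9, -0.3, 0.5)

# Idealized Hidden Unit 5: Subset 1 balanced by Subset 3, Subset 2 by Subset 4.
unit5_subsets = [P1, P2, negate(P1), negate(P2)]
# Idealized Hidden Unit 3: the same patterns moved to different subsets
# (the Subset 1 pattern moves to Subset 4, Subset 2 to Subset 1, and so on).
unit3_subsets = unit5_subsets[1:] + unit5_subsets[:1]

unit5_weights = flatten(unit5_subsets)
unit3_weights = flatten(unit3_subsets)

chord = ["A", "C", "E", "G"]  # example input pattern only
net5 = net_input(unit5_weights, chord)
net3 = net_input(unit3_weights, chord)
```

Both idealized units remain tritone balanced after the rotation, yet the same chord produces very different net inputs in the two units. This suggests why two units with similar weight patterns need not respond to the same tetrachords.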

This raises the question: what is the relationship between the responses of Hidden Units 3 and 5 to the set of input patterns, given that there are both interesting similarities and differences between their patterns of connection weights?

Figure 8-13. The connection weights and the jittered density plot for Hidden Unit 3. The five bands in the jittered density plot are labeled A through E.


      A     A#    B     C     C#    D     D#    E     F     F#    G     G#
A     ---   0.87  0.44  0.86  0.46  0.83  1.00  0.17  1.00  0.17  0.99  0.76
A#    0.87  ---   0.83  0.05  0.81  0.43  0.16  1.00  0.19  1.00  0.20  0.52
B     0.44  0.83  ---   0.82  0.51  0.88  0.99  0.21  1.00  0.20  1.00  0.81
C     0.86  0.05  0.82  ---   0.80  0.42  0.15  1.00  0.19  1.00  0.19  0.50
C#    0.46  0.81  0.51  0.80  ---   0.89  0.99  0.22  1.00  0.21  1.00  0.82
D     0.83  0.43  0.88  0.42  0.89  ---   0.73  0.55  0.79  0.53  0.80  1.00
D#    1.00  0.16  0.99  0.15  0.99  0.73  ---   0.89  0.42  0.88  0.44  0.81
E     0.17  1.00  0.21  1.00  0.22  0.55  0.89  ---   0.85  0.06  0.84  0.46
F     1.00  0.19  1.00  0.19  1.00  0.79  0.42  0.85  ---   0.83  0.49  0.86
F#    0.17  1.00  0.20  1.00  0.21  0.53  0.88  0.06  0.83  ---   0.82  0.45
G     0.99  0.20  1.00  0.19  1.00  0.80  0.44  0.84  0.49  0.82  ---   0.87
G#    0.76  0.52  0.81  0.50  0.82  1.00  0.81  0.46  0.86  0.45  0.87  ---

Table 8-4. The activity produced in Hidden Unit 3 by all possible pairs of different input pitch-classes. Pairs that cancel each other’s signal out, producing high activity in the hidden unit, are indicated by the grey cells. See text for details.

In order to answer this question, we computed the correlation between the activities of these two hidden units across the entire set of input patterns. We also correlated the activities of each of these units with the activities of Hidden Unit 6. We included this unit because, like the other two, it exhibits tritone balance, and it can be considered as grouping input signals into four different subsets. The resulting correlations are provided in Table 8-5. Importantly, this table reveals very low correlations between the activities of different hidden units. This means that while there are definite similarities amongst these units in terms of general patterns of connectivity, their connection weights are arranged in different orders. As a result, the tetrachords that cause high activity in one hidden unit do not do so for the other two. This will be important in our consideration of coarse coding in the next section of this chapter.

        HID3   HID5   HID6
HID3    1.00
HID5    0.11   1.00
HID6    0.00   0.02   1.00

Table 8-5. Correlations amongst activities of three hidden units to the 144 input patterns.

Given that Hidden Unit 3 uses similar patterns of connections, but assigns them to different subsets of inputs, it would be expected that an account of the various bands in Figure 8-13 would be very similar to the accounts provided earlier for both Hidden Unit 6 and Hidden Unit 5. This is indeed the case.

For the 38 input patterns that belong to Band G, two different situations emerge. For 24 of the patterns in this band the pattern of activity for both Subsets 1 and 3 is identical, as is the pattern for both Subsets 2 and 4. As a result, near perfect balance is achieved and Hidden Unit 3 turns on. In the other situation, the patterns in the various subsets do not balance. However, specific pairs of pitch-classes (for the potential pairs, see Table 8-4) are combined to balance, again turning this hidden unit on.

As we proceed through the various bands associated with less activity in Hidden Unit 3, the general story that emerges is the same as that discovered for Hidden Units 6 and 5: there is a growing imbalance between the various pitch-classes that are combined in the patterns that belong to a band, producing more extreme net inputs and lower activity in Hidden Unit 3.

Of course, each of the bands in Figure 8-13 picks out a variety of different tetrachords. A detailed list of those for the various bands in Hidden Unit 3’s jittered density plot is presented in the next section’s discussion of coarse coding in the extended tetrachord network.
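The correlation analysis summarized in Table 8-5 can be sketched as follows. This is a minimal illustration only: the activity values are random stand-ins (the real hidden unit activities are not listed in this chapter), and the names `pearson` and `activities` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of activities."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# ASSUMPTION: random placeholder activities for 144 input patterns.
# In practice each list would hold one hidden unit's recorded activity
# for every pattern in the training set.
random.seed(0)
activities = {
    name: [random.random() for _ in range(144)]
    for name in ("HID3", "HID5", "HID6")
}

# Correlate every pair of hidden units across all input patterns.
names = list(activities)
corr = {(a, b): pearson(activities[a], activities[b]) for a in names for b in names}

for a in names:
    print(a, " ".join(f"{corr[(a, b)]:5.2f}" for b in names))
```

Correlations near zero, as in Table 8-5, would indicate that the patterns that turn one unit on are largely unrelated to those that turn the others on.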


8.4 Bands and Coarse Coding

8.4.1 Hidden Unit Structure

Section 8.3 presented a detailed examination of the structure of each of the seven hidden units in the extended tetrachord network. This examination revealed a great many details about the connection weight structure of each hidden unit, as well as the types of tetrachords that produced varying degrees of activity in each hidden unit. Out of these details three general points emerge.

First, the connection weight structure of each hidden unit is highly regular, and this structure can be related to musical intervals. Four of the hidden units assigned input pitch-classes to equivalence classes based upon strange circles. Hidden Unit 1 groups pitch-classes using the two circles of major seconds and the six circles of tritones. Hidden Unit 2 assigns pitch-classes to equivalence classes based upon the three circles of major thirds. Hidden Unit 4 organizes pitch-classes using both the three circles of minor thirds and the six circles of tritones. Hidden Unit 7 assigns pitch-classes to equivalence classes based upon the four circles of major thirds. The remaining three hidden units (Hidden Units 3, 5, and 6) employ tritone balance, assigning pitch-classes that are separated by a tritone weights that are equal in magnitude but opposite in sign.

Second, the connection weight structure of each hidden unit produced distinct banding when hidden unit activities were plotted in a jittered density plot. The four hidden units that organize pitch-classes using strange circles exhibit either two or three distinct bands, while the three hidden units that exhibit tritone balance exhibit five or six distinct bands. For all hidden units these bands emerged because signals from different pairs of pitch-classes were assigned connection weights that produced balance or near balance for some pairs, but not others.

Third, almost all of the bands in each hidden unit’s jittered density plot were heterogeneous. That is, almost every band included instances of more than one type of tetrachord. This is apparent in Table 8-6 which lists each tetrachord type that can be found in each band of the seven jittered density plots.

Unit  Band  Tetrachords In Band
1     A     7b5, aug7
      B     7, add9, m(maj7), m6
      C     6, 7sus4, dim7, m(add9), m7, maj7
2     A     6, 7, 7b5, 7sus4, add9, aug7, dim7, m(add9), m(maj7), m6, m7, maj7
      B     6, 7, 7b5, m6, m7
      C     7sus4, add9, aug7, dim7, m(add9), m(maj7), maj7
3     A     add9, 6, maj7, m(add9), 7sus4, m7
      B     m6, 7, add9, m(maj7)
      C     m(add9), 7sus4, maj7, m7, 6, aug7
      D     m(maj7), add9, m6, 7
      E     m7, aug7, 7b5, dim7, maj7, 6
4     A     add9, 6, m6, m7, 7, aug7, 7b5, dim7, m(add9), maj7, m(maj7), 7sus4
      B     maj7, 7sus4, 6, m7, 7, m(add9), 7b5
      C     m(add9), m(maj7), add9, maj7, 7sus4, aug7
5     A     m(add9), m7, m(maj7), 7sus4, add9, 6, maj7, m6, 7
      B     m6, 7
      C     m(add9), aug7, 7sus4, add9
      D     aug7, m(add9), 6, maj7, m7, 7sus4
      E     m(maj7)
      F     add9, m6, m(maj7), 7
      G     6, maj7, 7b5, dim7, m7, aug7, m(maj7)
6     A     add9, m(add9), maj7, m(maj7), 7sus4, 7b5, aug7, m6, 7
      B     7, dim7, m6, aug7
      C     add9, 7, m(maj7)
      D     m6
      E     6, m7
      F     6, m6, m7, aug7, dim7, add9, m(maj7)
      G     6, m7, dim7, aug7
7     A     add9, 6, m6, maj7, m(maj7), 7, aug7, m(add9), m7
      B     m(add9), m7, 6, aug7
      C     7sus4, 7b5, dim7

Table 8-6. The types of tetrachords found in each band in each jittered density plot that was presented in Section 8.3.
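The tritone balance described in the first point above can be illustrated with a short sketch. Everything numeric here is an assumption made for illustration: the weights are invented (only their equal-magnitude, opposite-sign arrangement across each tritone follows the text), the bias is taken to be zero, and the activation is assumed to be a Gaussian of the form exp(-pi(net - mu)^2) with mu = 0, so that a net input near zero produces high activity. The names `value_unit_activity`, `weights`, and `net_input` are hypothetical.

```python
import math


def value_unit_activity(net, mu=0.0):
    """ASSUMED Gaussian activation: maximal (1.0) when net input equals mu."""
    return math.exp(-math.pi * (net - mu) ** 2)


# ASSUMPTION: invented weights for pitch-classes 0-5 (C to F).
# Pitch-classes 6-11 are a tritone away and receive the same magnitude
# with the opposite sign, which is the tritone balance property.
half = [0.9, -0.4, 0.6, 0.2, -0.7, 0.3]
weights = half + [-w for w in half]


def net_input(pitch_classes):
    """Sum the weights of the active pitch-classes (zero bias assumed)."""
    return sum(weights[pc] for pc in pitch_classes)


c_dim7 = [0, 3, 6, 9]    # C, Eb, Gb, A: two tritone pairs, signals cancel
c_maj7 = [0, 4, 7, 11]   # C, E, G, B: no tritone pair, signals do not cancel

for name, chord in (("Cdim7", c_dim7), ("Cmaj7", c_maj7)):
    net = net_input(chord)
    print(f"{name}: net = {net:.2f}, activity = {value_unit_activity(net):.2f}")
```

Under these assumptions any tetrachord built from two tritone pairs balances exactly, whatever the particular weight values are, while other tetrachords balance only if their specific weights happen to sum near zero.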


Of the 31 different bands listed in Table 8-6, only four are pure: Hidden Unit 5 Band B contains only m6 chords (which are identical to the 7 chords that it also contains); Hidden Unit 5 Band E contains only m(maj7) chords; Hidden Unit 6 Band D only contains m6 chords; and Hidden Unit 6 Band E contains only 6 chords (which are identical to the m7 chords that it also contains). Every other band contains at least two different types of tetrachords. Many of these bands contain six or more different types of tetrachords.

Table 8-6 indicates some additional properties concerning the similarities and differences between different bands and their contents. For instance, several different types of chords appear to be similar to one another because they are frequently found together in the same band. For instance, tetrachords that belong to the four types 7, add9, m(maj7), and m6 are found together in 8 of the 31 different bands in Table 8-6.

However, there are also substantial differences between individual hidden units in terms of their sensitivity to such groups of chords. For instance, bands that contain 7, add9, m(maj7), and m6 chords are associated with different levels of activity when bands from different hidden units are compared (compare Hidden Unit 1 Band B to Hidden Unit 5 Band F). As well, some units that have bands that contain these four chord types also contain other tetrachord types, and these typically differ from one another. For instance, Hidden Unit 7 Band A groups these four chord types along with instances of 6, maj7, aug7, m(add9), and m7 chords. In contrast, Hidden Unit 4 Band A includes these four types along with instances of 6, m7, aug7, 7♭5, dim7, m(add9), maj7, and 7sus4 tetrachords.

Furthermore, these four chord types are not always found in the same band. For example, Hidden Unit 6 Band C contains 7, add9, and m(maj7) chords, but does not contain any m6 tetrachords.

8.4.2 Bands and Coarse Coding

The general summary of network structure that was provided above indicates two key facts: the hidden units of the extended tetrachord network are highly structured, but individual hidden units do not appear to detect the presence or absence of particular tetrachord types. That is, knowing the activity of a particular hidden unit rarely indicates the presence or absence of a single type of tetrachord.

How then do the output units of the extended tetrachord network process hidden unit activity to correctly identify an input pattern’s chord type when it is presented to the multilayer perceptron?

The general (and largely uninformative) answer to this question is that the extended tetrachord network is another example of coarse coding, a concept that was introduced in Chapter 5. In coarse coding, individual hidden units serve as fairly inaccurate detectors of input pattern properties. However, particularly if each hidden unit views the inputs from a different perspective, when these different inaccurate representations are combined an accurate classification emerges.

The reason that this general nod to coarse coding is uninformative is that if it goes no further than this then we really have no idea about how coarse coding works in the current network. Fortunately, the summary of band contents that is presented in Table 8-6 places us in a position to obtain a stronger sense of coarse coding in this particular network.

Imagine that I present one input pattern to the trained extended tetrachord network and only observe the activity that it produces in each of the seven hidden units. The pattern that I use produces the following activity pattern, given in ascending order of hidden units: [0.14, 0.02, 0.01, 0.99, 0.32, 0.05, 0.00]. Given this activity pattern, what type of tetrachord was presented?

To answer this question, I could begin by relabeling each hidden unit activity value with the jittered density plot band to which that activity value corresponds. When this is done, the set of hidden unit bands to which the pattern belongs (in the same order as before) is: [B, A, A, C, C, A, A].


With this pattern of bands in hand, I turn to Table 8-6 and delete any bands that are not present. The result is presented as Table 8-7:

Unit  Band  Tetrachords In Band
1     B     7, add9, m(maj7), m6
2     A     6, 7, 7b5, 7sus4, add9, aug7, dim7, m(add9), m(maj7), m6, m7, maj7
3     A     add9, 6, maj7, m(add9), 7sus4, m7
4     C     m(add9), m(maj7), add9, maj7, 7sus4, aug7
5     C     add9, m6, m(maj7), 7
6     A     add9, m(add9), maj7, m(maj7), 7sus4, 7b5, aug7, m6, 7
7     A     add9, 6, m6, maj7, m(maj7), 7, aug7, m(add9), m7

Table 8-7. The types of tetrachords found in each band to which the first single pattern presented to the network belongs. See text for details.

Note that each band in Table 8-7 is inaccurate, in the sense that it contains four or more types of tetrachords. However, only one tetrachord type is present in all seven of these bands: add9. (The 7, m6 and the m(maj7) are all absent from Hidden Unit 3 Band A, while none of the other chords (apart from add9) are present in Hidden Unit 1 Band B.) This means that add9 is the only type of chord that generates this particular pattern of activity across the seven hidden units.

Consider a second example, a stimulus that produces the following pattern of hidden unit activity: [0.16, 0.24, 0.20, 0.00, 0.12, 0.14, 0.00]. In terms of band labels, this pattern is equivalent to: [B, B, B, A, A, B, A].

Unit  Band  Tetrachords In Band
1     B     7, add9, m(maj7), m6
2     B     6, 7, 7b5, m6, m7
3     B     m6, 7, add9, m(maj7)
4     A     add9, 6, m6, m7, 7, aug7, 7b5, dim7, m(add9), maj7, m(maj7), 7sus4
5     A     m(add9), m7, m(maj7), 7sus4, add9, 6, maj7, m6, 7
6     B     7, dim7, m6, aug7
7     A     add9, 6, m6, maj7, m(maj7), 7, aug7, m(add9), m7

Table 8-8. The types of tetrachords found in each band to which the second single pattern presented to the network belongs. See text for details.

Table 8-8 represents each of these bands in terms of the tetrachord types that are found in each. An examination of this table indicates that the only tetrachord type found in every band is the dominant seventh. Therefore the stimulus that was presented to the network was a 7 chord.

The two examples of coarse coding that are illustrated in Tables 8-7 and 8-8 were chosen deliberately. We noted earlier that in terms of band contents add9 and 7 chords were similar because they were often seen in the same band. However, the coarse coding examples show that there are differences between the two chord types; there are some bands where one is found, but not the other. The band intersection technique takes advantage of this property, which explains how the messy contents of the 31 hidden unit bands can be used to identify the type of chord presented to the extended tetrachord network.

The band intersection technique has been used previously to make sense of coarse coding in networks of value units (Dawson & Piercey, 2001). However, it is a technique that outsiders use to look inside the network. The output units of the extended tetrachord network do not themselves literally identify chord types by seeking intersections between sets of features that are captured by different hidden unit activities.

Instead, the output units operate geometrically: hidden unit activities provide coordinates that arrange particular types of input patterns along a plane, and the output units carve this plane out of the hidden unit space (e.g. Figures 7-25 and 7-26).

Functionally speaking, however, this geometric process of identifying tetrachord types is equivalent to the band intersection method described above. Tetrachord types that belong to the same band will have nearly the same coordinate along one of the dimensions of the hidden unit space. One type of tetrachord will be separated from the other types in this dimension by being located at a different coordinate from the others in one or more of the other dimensions.
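The band intersection technique described above amounts to a set intersection, and can be sketched as follows for the first example. The band contents are transcribed from Table 8-7; the variable names `bands` and `candidates` are hypothetical, and the sketch describes the analyst's procedure, not anything that the output units themselves compute.

```python
# Band contents for the first example pattern, transcribed from Table 8-7.
# Keys are hypothetical labels of the form "<hidden unit><band>".
bands = {
    "1B": {"7", "add9", "m(maj7)", "m6"},
    "2A": {"6", "7", "7b5", "7sus4", "add9", "aug7", "dim7",
           "m(add9)", "m(maj7)", "m6", "m7", "maj7"},
    "3A": {"add9", "6", "maj7", "m(add9)", "7sus4", "m7"},
    "4C": {"m(add9)", "m(maj7)", "add9", "maj7", "7sus4", "aug7"},
    "5C": {"add9", "m6", "m(maj7)", "7"},
    "6A": {"add9", "m(add9)", "maj7", "m(maj7)", "7sus4", "7b5",
           "aug7", "m6", "7"},
    "7A": {"add9", "6", "m6", "maj7", "m(maj7)", "7", "aug7",
           "m(add9)", "m7"},
}

# A tetrachord type remains a candidate only if it appears in every band
# that the input pattern falls into.
candidates = set.intersection(*bands.values())
print(sorted(candidates))  # ['add9']
```

Each band on its own is a coarse description, but intersecting the seven bands leaves a single chord type for this pattern.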


8.5 References

Berkeley, I. S. N., Dawson, M. R. W., Medler, D. A., Schopflocher, D. P., & Hornsby, L. (1995). Density plots of hidden value unit activations reveal interpretable bands. Connection Science, 7, 167-186.

Berkeley, I. S. N., & Gunay, C. (2004). Conducting banding analysis with trained networks of sigmoid units. Connection Science, 16(2), 119-128. doi: 10.1080/09540090412331282278

Dawson, M. R. W. (2005). Connectionism: A Hands-on Approach (1st ed.). Oxford, UK; Malden, MA: Blackwell Pub.

Dawson, M. R. W., Medler, D. A., & Berkeley, I. S. N. (1997). PDP networks can provide models that are not mere implementations of classical theories. Philosophical Psychology, 10, 25-40.

Dawson, M. R. W., Medler, D. A., McCaughan, D. B., Willson, L., & Carbonaro, M. (2000). Using extra output learning to insert a symbolic theory into a connectionist network. Minds and Machines, 10, 171-201.

Dawson, M. R. W., & Piercey, C. D. (2001). On the subsymbolic nature of a PDP architecture that uses a nonmonotonic activation function. Minds and Machines, 11, 197-218.

Dawson, M. R. W., & Schopflocher, D. P. (1992). Modifying the generalized delta rule to train networks of nonmonotonic processors for pattern classification. Connection Science, 4, 19-31.
