HOW DISSONANT IS THE AUGMENTED TRIAD?

By

JOSHUA CLEMENT BROYLES

B.A. Music, California State University, Hayward, 1992

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE

OF

MASTER OF ARTS

in

THE FACULTY OF GRADUATE STUDIES

(School of Music)

We accept this thesis as conforming

to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

April 1999

© Joshua C. Broyles, 1999

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of

The University of British Columbia Vancouver, Canada

Date


ABSTRACT:

HOW DISSONANT IS THE AUGMENTED TRIAD?

Throughout the centuries, music theorists have consistently designated the augmented triad as dissonant, but not for entirely consistent reasons. In one interpretation of this "dissonant nature," an interpretation with which this thesis is concerned, the augmented triad is less harmonically "stable" than the major and minor triads in root position or in first inversion and, at most, only as stable as the second inversions of the major and minor triads. The various arguments against the stability of the augmented triad have largely been of three basic types: acoustic/numerological, psychoacoustic/perceptual, and cognitive/tonal-syntactic.

A small number of theorists, from very early on, have not been entirely committed to the intrinsic instability of the augmented triad as compared to major and minor triads.

In recent decades research in music perception has drawn into question the absolute validity of this designation, but has stopped short of demonstrating specific conditions under which an augmented triad would actually be likely to sound more harmonically stable than a major or minor triad.

This thesis documents a perceptual experiment and its results which statistically support the claim that conditions exist under which listeners may perceive an augmented triad as more harmonically stable than a major triad. These conditions are specific but they are not abnormal in twentieth-century music, and they are not totally absent in earlier Western music.

TABLE OF CONTENTS

ABSTRACT

LIST OF ILLUSTRATIONS

ACKNOWLEDGEMENTS

Chapter

1. INTRODUCTION

2. SOME RATIONAL PERSPECTIVES OF MUSIC THEORISTS

3. SOME EMPIRICAL PERSPECTIVES

4. THE SPECIAL PERSPECTIVE OF HELMHOLTZ

5. SUMMARY OF EXPERIMENT

6. THE STIMULI

7. THE DATA

8. CONCLUDING REMARKS

Bibliography

Appendix

1. EXAMPLE RESPONSE FORM

2. TEST ENGINE

3. DATA TABLES

4. THE PARTICIPANTS

ILLUSTRATIONS

Figure

1. The Experimental Response Form, Side 1

2. The Experimental Response Form, Side 2

3. The Experimental Test/Treatment Engine

Table

1. Simplified Pre-test and Post-test Table for Groups 1 and 2 Combined

2. Data Table for Treatment Group 1

3. Data Table for Treatment Group 2 and Table for Incomplete Responses

Example

1. Notation of Audio Example 1

2. Pitch Content of Pretest/Post-test Examples

3. Pitch Content of Treatment Examples

ACKNOWLEDGEMENTS

I wish to thank Dr. John Roeder, Dr. William Benjamin and Dr. Richard Kurth for the excellence they have shown me as teachers, and for the special efforts each of them has made which have enabled me to get to this point in my education.

I wish to thank Dr. Eugene Narmour, with whom I have neither met nor spoken, but whose book, The Analysis and Cognition of Basic Melodic Structures, showed me that music theory need not continue to be so much like doing crossword puzzles, or worse, the reading of tea leaves. It was this new view of music theory which prevented me from entirely abandoning music theory upon receipt of my B.A. in music composition.

I wish to thank my previous employers, Christina and James Bennett, for giving me the job flexibility I needed in order to pursue graduate school, and for encouraging me to move forward even when it meant certain difficulties for them.

I wish to thank my grandparents, Robert and Arlene Dart, for their continued financial assistance, for their apparently blind faith in me as a scholar of music, and for their amazing patience.

Most of all, I wish to thank my partner Heather O'Connor-Shull for encouraging me to return to higher education, for every possible type of support imaginable, which she has given me most unselfishly throughout my years of graduate school, and most of all for believing in my scholastic ability even in the face of sometimes overwhelming counterevidence.

Chapter 1: Introduction

In this thesis, the word "normal" will be used primarily in the statistical sense: that is, as something like the opposite or complement of "abnormal" (as in the expression "psychologically normal"),1 plausibly representing a much larger population, the study of which should be of general rather than of special interest. In this case the population intended to be represented includes types of humans. Although certain sample biases must be acknowledged, neither was the study deliberately confined to participants who might be considered poorly representative (such as criminals, or the mentally ill), nor were any participants intentionally excluded in order to maintain an artificial standard of normality. It is not my intention, through the frequent use of the word "normal," to glorify normality. However, I will assume that normal listeners are the people whose perceptions are of greatest interest to music theory for three reasons:

1) because music theorists I have met and those whose works I have read so frequently invoke the authority of normal listeners (usually without identifying them, or verifying their normality),

2) because the perception of normal listeners has formed the core point of interest of so much scientific research already, and

3) because, while abnormal listening does interest me personally, I have my doubts about the immediate or foreseeable usefulness of abnormal listening behaviours to analysis or to composition.

1 Music theorists will not be considered to be normal listeners.

For the purposes of this thesis, the terms "dissonance" and "harmonic instability" will be used interchangeably, not because I believe they are exactly the same thing, but because this interchangeability is a widespread convention in the literature to which I will refer, and among the numerous theory teachers who have made a point over the years of defining for me the augmented triad as having such qualities. I suspect that if one considers these two terms not to be interchangeable, one will also see in my conclusion a partial clarification of one way in which the definitions of these terms should differ. Indeed, the lack of clarity between these definitions is perhaps part and parcel of the main problem I intend to address in the pages to follow. That is, while "roughness" and "lack of resolution" may correlate musically, this does not necessarily mean that they are one thing, or even that the correlation is completely dependable.

For the purposes of this thesis, the following terms will also be used somewhat interchangeably: augmented triad (or {C,E,G#}, etc.) and 3-12 (or {0,4,8}, etc.). Other pitch-class structures will receive similar treatment. It is not my intention to use this interchangeability to execute semantic sleights of hand, but only to emphasize, in each case, the immediate context in which the object is to be best understood at that point of the discussion. The reader may assume that any of these terms may apply to any spelling of the object to which it refers, and to any inversion or system of intonation unless otherwise specified. I understand that the standard classification system for pc sets assumes equal temperament, but both this system and more traditional tonality-oriented vocabularies have been widely used both in the analysis of equal-tempered music, and in the analysis of music not necessarily adhering to equal temperament, such as string music and vocal music. Therefore I ask the reader to accept that both the terms "augmented triad" and "3-12" currently imply equal temperament, but that the older one simply implies this less strongly.

Before the advent of equal temperament, verticalities that could be octave-transposed to two "stacked" major thirds were regarded as awkward-sounding at least for the reason that they contained an augmented fifth or diminished fourth, neither of which was an expression of the simple mathematical ratios defined as consonant by divisions of a string on the monochord, or arising from the partials of overblown wind instruments.2

2 Theorists referring to arithmetic ratios as representative of musical intervals, even merely in concept, will be referred to hereinafter as "monophonists", alluding to the original application of this way of thinking to the ratio of string lengths on a monochord. Theorists referring to aerophone partials will also be included with monophonists for present purposes, inasmuch as the partials are of interest here for the same reason: the partials, like monochord intervals, are arithmetic and refer to a single, lower pitch.

Later claims by tonal theorists to the effect that the harshness of the augmented triad may be softened by using it in first inversion seem to deduce the treatment of the augmented triad from treatment of the diminished triad, but may also derive in part from (or, at least, converge with) the grammars of much earlier music (not using equal temperament) in which such verticalities did not occur, such as free organum in three parts. That is, to the monophonist, the pitch collections {B,D,F} and {C,F,G} are similar in that they sound most consonant when the lowest voice is consonant with both upper voices, though these upper voices are dissonant with each other (the perfect fourth was considered a consonance until the Renaissance). It follows that if, in a piece of music, pitch structures preceding or following a specific instance of {0,4,8} enable us to hear 8 within {0,4,8} either as "G#" or as "Ab" (but not both), we must understand 8 as representing a nonsimple mathematical ratio with either 0 ("C") or with 4 ("E"), thereby determining through tonal context the preferred inversion. It also follows that, as long as we hear equally tempered thirds as representing just or Pythagorean thirds, we will easily accept that {0,4,8}, always containing an augmented or diminished interval, will always sound less stable than {0,3,7} in the same register, regardless of transposition or inversion.

But, unless we can recognize the dissonant interval purely by its vertical sonority, it is necessary, in order for us to understand the augmented triads we hear as containing a specific interval that makes them dissonant, that we assume the spelling of the interval to be audible by context and also that we assume that spelling to be indicative of the intended size of the interval. Therefore, this most obvious understanding of the augmented triad as a dissonance assumes reference to earlier systems of intonation. Moreover, it is ironic that a monophonist might regard the equal-tempered perfect fifth as even more unstable than the justly intoned augmented fifth, since the former does not even derive from a rational division of the string.
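The comparison between rational (monochord) intervals and their equal-tempered counterparts can be made concrete by measuring both in cents. The following is only an illustrative sketch and not part of the study: the ratios 5:4, 81:64, 3:2, 25:16 and 8:5 are assumed here as conventional just and Pythagorean values, and the names `cents`, `INTERVALS` and `deviation_table` are hypothetical.

```python
import math

def cents(numerator, denominator):
    """Size in cents of the interval whose frequency ratio is numerator:denominator."""
    return 1200 * math.log2(numerator / denominator)

# (label, ratio numerator, ratio denominator, equal-tempered size in semitones)
# The ratios are assumed conventional values, not figures taken from the thesis.
INTERVALS = [
    ("major third, just", 5, 4, 4),
    ("major third, Pythagorean", 81, 64, 4),
    ("perfect fifth, just", 3, 2, 7),
    ("augmented fifth, just (two stacked 5:4 thirds)", 25, 16, 8),
    ("minor sixth, just", 8, 5, 8),
]

def deviation_table():
    """Return (label, rational size, tempered size, tempered minus rational) in cents."""
    rows = []
    for label, n, d, semitones in INTERVALS:
        rational = cents(n, d)
        tempered = 100 * semitones  # one equal-tempered semitone is 100 cents
        rows.append((label, rational, tempered, tempered - rational))
    return rows

for label, rational, tempered, difference in deviation_table():
    print(f"{label}: {rational:.2f} vs {tempered} cents (difference {difference:+.2f})")
```

Under these assumed ratios, the single 800-cent interval of equal temperament falls between the just augmented fifth and the just minor sixth, which is one way of seeing why the spelling cannot be heard from the vertical sonority alone.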

What, then, happens to the dissonance of {0,4,8} if we fail to distinguish between G# and Ab? Does the presumedly intrinsic dissonance of the augmented triad invariably obtain without a defining tonal context? Perhaps not.

There is some research to suggest that complexity of the ratios of fundamental frequencies may not be as strongly indicative of perceived pitch, and thus also dissonance as it is understood here, as is the proximity of audible overtones (or audible "secondary frequencies," as the case may be for some instruments). That is to say that in sine waves, pure-octave timbres (such as Shepard's tones), or certain other timbres not audibly conforming to the natural overtone series (such as metallophone timbres), the augmented triad might be generally regarded as less dissonant (or more harmonically stable) than the major triad, because its fundamental tones are maximally distant from each other. Although an argument could be made that some musical timbres may cause even the most mathematically simple intervals to become acoustically dissonant, no one to my knowledge has made much of this point.3
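The remark that the fundamental tones of the augmented triad are "maximally distant from each other" can be checked by enumeration. The sketch below is an illustration under simplifying assumptions (twelve equal-tempered pitch classes, fundamentals only, register ignored); the function names are hypothetical.

```python
from itertools import combinations

def smallest_gap(pcs):
    """Smallest distance in semitones between neighbouring pitch classes on the 12-point circle."""
    ordered = sorted(pcs)
    n = len(ordered)
    return min((ordered[(i + 1) % n] - ordered[i]) % 12 for i in range(n))

def most_widely_spaced_trichords():
    """Return the largest possible smallest gap and the trichords that achieve it."""
    # Pitch class 0 is fixed, since transposition does not change the spacing.
    candidates = [(0, a, b) for a, b in combinations(range(1, 12), 2)]
    best = max(smallest_gap(c) for c in candidates)
    return best, [c for c in candidates if smallest_gap(c) == best]

best, chords = most_widely_spaced_trichords()
print(best, chords)
```

In this simplified model the only three-note collection whose closest members are four semitones apart is {0,4,8}; every other trichord, including {0,3,7}, contains at least one smaller interval.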

Clearly, tonal theory cannot refer primarily to such timbres as those found among the metallophones, or to other timbres resulting from nonstandard internal harmonic structures, as tonal theory is indisputably used most often to describe events in more harmonic timbres.

Moreover, we may understand monophonic theory to refer to the natural overtones because they are necessary to distinguish simple intervals from the more complex ones of equal temperament (intonation is difficult to hear in sine waves). An understanding of the augmented triad as dissonant simply because it contains the augmented fifth (ultimately a monophonist criterion) also depends greatly on the assumption of normal overtone structures, such as those found on the monochord, and upon the ability of this effect to obtain in less idealized systems of intonation, such as equal temperament.

3 Helmholtz seems to be aware of such properties, but provides no visible prescriptions for the composer or analyst. See: Hermann Helmholtz, On the Sensations of Tone (New York: Dover Publications, Inc., 1954), pp. 192-193.

But equal temperament, as applied to tonal theory, assumes that one does not hear these overtones clearly enough to order one's sense of consonance and dissonance from them. That is, equal-tempered tonal music seems to adhere to two contradictory assumptions in regard to the triadic dissonance: it assumes that one does not hear the overtones of a major triad well enough to appreciate the mistuning of its intervals, but that one does hear the overtones of an augmented triad this well, even if they are not explicitly present. Presumably, if one heard both sets of overtones, both chords would be dissonant in the monophonist model because of the misalignment of the overtones in equal temperament. Conversely, if one heard neither set of overtones, both chords would be consonant, at least from the standpoint of the dissonance of the augmented fifth, since we could not distinguish any particular interval as a dissonance; the augmented fifth could be heard as a minor sixth and a major third could be heard as a diminished fourth.
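The "misalignment of the overtones in equal temperament" can be illustrated numerically. The sketch below assumes ideal harmonic partials (whole-number multiples of the fundamental) and an arbitrary reference pitch of 261.63 Hz; both are assumptions made for illustration, and the function name is hypothetical.

```python
def partial_mismatch(f_low, semitones, partial_low, partial_high):
    """Difference in Hz between two nearly coinciding partials of an equal-tempered dyad.

    partial_low is the harmonic number taken from the lower tone and
    partial_high the harmonic number taken from the upper tone.
    """
    f_high = f_low * 2 ** (semitones / 12)
    return partial_high * f_high - partial_low * f_low

REFERENCE_HZ = 261.63  # assumed lower tone, roughly middle C

# Major third: 5th partial of the lower tone against 4th partial of the upper tone.
third = partial_mismatch(REFERENCE_HZ, 4, 5, 4)
# Perfect fifth: 3rd partial of the lower tone against 2nd partial of the upper tone.
fifth = partial_mismatch(REFERENCE_HZ, 7, 3, 2)

print(f"major third mismatch: {third:.2f} Hz")
print(f"perfect fifth mismatch: {fifth:.2f} Hz")
```

With these assumptions the partials of the tempered major third miss each other by roughly ten cycles per second, while those of the tempered fifth miss by less than one. A justly tuned interval would give zero in this model.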

What one may be tempted to infer from this is that tonal listeners may be hearing {0,4,8} as consistently more dissonant or less harmonically stable than various forms of {0,3,7} primarily because tonal music utilizes only timbres for which this is true, or because tonal music utilizes pitch collections in ways which always cause one member of the augmented triad to sound like an "outsider" to larger horizontal and/or diagonal pitch collections to which vertical presentations of {0,3,7} are heard as referring.4 In other words, the perceived instability of {0,4,8} as compared with {0,3,7} may be more a function of conditioning than of innate psychoacoustic preferences. This inference, though consistent with my earliest observations, goes very strongly against what I have been taught, and it is this contradiction which has now provoked me to conduct my own limited scientific inquiry.

4 This is consistent with the intimations of Rene Van Egmond and David Butler, "Diatonic Connotations of Pitch-Class Sets", Music Perception 15/1 (Fall 1997): 1-29, that listeners to tonal music tend to judge set classes included in 7-35 to be more tonally stable than sets not so included.

The last few paragraphs have explained the basic problem, but do not truly show the lengths to which I have been pushed in recent years, by reading the works of others, both to accept and to reject the inferences which I have been tempted to draw. The problem of the augmented fifth is but one facet of a larger problem. There are several arguments against the consonance of the augmented triad and there is a conditional but plausible counterargument for each. Therefore, the first three chapters of this thesis will illustrate my reasons for considering the perceived instability of the augmented triad to be a serious point of contention.

Keeping this purpose in mind, it should be clear why my survey of available literature is by no means exhaustive; it is intended to demonstrate that a problem exists which I intend to help solve, and to explain this problem concisely. It is absolutely not done in order to "take a vote" among theorists of record through the ages, or to convince the reader that I have left no stone unturned. There is potentially no limit to the number of opinions on the subject, and it is my own opinion that a visceral consensus of normal listeners is as valuable as any of them in modeling the visceral consensus of more normal listeners. Therefore, I have invested most of my effort into determining the basic nature of this consensus and not into overstating my familiarity with pertinent literature.

It is necessary to provide background and context, but the ultimate purpose of this thesis is to explain a test of the hypothesis that conditions exist under which short-term conditioning can cause significant changes to the perception of {0,4,8} or {0,3,7} as the more harmonically stable. In the end, it is my conclusion that at least one set of such conditions does indeed exist; that under such conditions even small amounts of conditioning may be sufficient to induce a slight tendency in normal listeners to hear specific forms of 3-12 as more stable than specific forms of 3-11. It should be added that these conditions were neither difficult to create, nor did they utilize musically abnormal timbres, nonstandard intonation, or complex pitch structures; each of the experimental stimuli could easily occur in tonal or in atonal music.

Chapter 2: Some Rational Perspectives of Music Theorists

Many manuals of theory that one would expect to treat the augmented triad either fail to mention it or describe its interval content without making any statement about its consonance or dissonance. In manuals that do treat it, there is some agreement that it is "dissonant." Although the nature of this dissonance is not always well defined, there is enough implied that one may conclude the criteria of dissonance are not consistent throughout the literature, or through the ages.

Music theorists do, however, often characterize the augmented triad using such terms as "unstable," "rough," or lacking a clear "root." Although these terms may not be interchangeable, it may be possible to resolve them somewhat. It is not clear that an "unstable" chord is either "rough" or "rootless." But if a chord is "rough" or "rootless," one might also have to consider it to be "unstable." For this reason, I will be using the terms "dissonant" and "unstable" somewhat interchangeably in my discussion of them in the literature, which suggests that the "instability" of the augmented triad is what causes it to be considered "dissonant."

Possible standpoints from which the augmented triad may be seen as unstable may be grouped into three basic types: acoustic/numerological, psychoacoustic/perceptual, and cognitive/tonal-syntactic. Clearly these perspectives are not completely independent, as the material of each of these phenomenological realms generally proceeds from the last. But it should be understood both that psychoacoustics demonstrates the limitations of the human senses in capturing acoustic events, and that psychoacoustic events therefore provide only a series of cues to cognition, whether the intention is to capture acoustic events or to capture that information which is intended to be transmitted through their production.

Moreover, the notation of an augmented triad represents only part of the acoustic signal which produces it, and provides only part of the information which is needed in order to understand it. Therefore, while it may not always be possible to distinguish a particular music theorist's view of the augmented triad primarily from one of the three standpoints mentioned here, I will make some effort to do so, in order to clarify what I believe is most significant in what the theorist says. Also, some liberty will be taken in the inclusion of some perspectives into these categories. For example, numerological perspectives will be grouped with acoustical perspectives, because they are concerned more with physics than with physiology. It is also not possible to completely differentiate simple physiological responses from complex cognitive responses, but it is necessary to accept and to utilize some distinction between these things in order to appreciate the role of learning in the construction and conception of dissonance.

THE ACOUSTIC/NUMEROLOGICAL STANDPOINT

Although some important music theorists such as Johann Mattheson (1681-1764) have considered some forms of the augmented triad to be consonant,5 the prevailing view of the mathematically-minded music theorists is that it is dissonant. Early Western music theorists such as Pythagoras considered the minor sixth a dissonance, and would undoubtedly have considered the augmented fifth to be at least comparably dissonant. When Franco of Cologne and his contemporaries began to regard the major and minor thirds as consonant, both the major and minor sixths were still regarded as dissonant.6 Eventually, the sixths also came to be regarded as consonant, but as one criterion of dissonance was relinquished, others were always acquired which would enable the musician to recognize as dissonant sonorities describable today as augmented triads.

5 Matthew Shirlaw, The Theory and Nature of Harmony (Sarasota, FL: Dr. Birchard Coar, 1970), p. 17. Note that while Mattheson uses terms such as "augmented" and "diminished" differently from most music theorists when referring to chords, the choice of terms is somewhat arbitrary and should not be understood as affecting the actual sounds or functions of the chords discussed. Moreover, he describes as consonant a chord which is spelled very much as one might spell an "augmented triad" (using current terminology). There is no indication that Mattheson has reversed the terms "consonant" and "dissonant" in the way he seems to have reversed the terms "augmented" and "diminished."

6 Ibid., p. 1

Zarlino considers harmony to be the result of the union not of like, but of unlike or diverse elements. Thus from the union of two intervals of the same species, whether perfect or imperfect, there result inharmonious, that is, dissonant combinations.7

Significantly, numerological objections to other types of duplication, such as the use of consecutive harmonic intervals of the same size, in the Sixteenth-Century style often appeal to practical utility. For example, the case could be made that objections to parallel fifths and octaves emerge from the objection to direct fifths and octaves, which is first motivated not by their perceived ugliness, but by the observation that vocalists tend to sing them incorrectly, or may lose the ability to sing their lines independently after a parallel gesture. Thus, we can only be less than certain that early apprehensiveness toward verticalities now recognizable as augmented triads does not result from similar observations. Duplication of intervals, no matter how smooth sounding when properly executed, may confuse performers and be difficult to execute properly.

An additional consideration of Sixteenth-Century style, in which the chord we presently discuss first becomes relevant, is intonation. Given that the proper arithmetic derivation of the interval was a point open to debate during this time,8 the deliberate superimposition of two debatable intervals would only serve to compound existing disagreements.

7 Ibid., p. 33.

8 For a view of intonation which clearly demonstrates the marked vacillation of musical authorities in regard to the size of the major third, see: Harry Partch, Genesis of a Music, second edition, enlarged (New York: Da Capo Press, 1974), pp. 361-397.

THE PSYCHOACOUSTIC/PERCEPTUAL STANDPOINT

The sound is rich, perhaps a bit oversweet, suggesting nostalgia or a poignant turn of feeling. The sound is also unstable, but not strongly dissonant.9

In this passage, Leonard G. Ratner uses two separate sentences to address the qualities of the augmented triad. Read separately, one sentence seems to address simple sensation and the other seems to address syntactic effect. But read in succession, it must be understood that, in using the word "sound" in successive sentences without explaining how the use of this word has changed in the second sentence, Ratner should have anticipated that the reader would likely consider both sentences to be using this word in the same way. The use of the word "also" further invites this interpretation. Therefore, because we cannot consider "oversweetness" very strongly indicative of grammatical function, we must conclude that the word "unstable" (as used here by Ratner) refers more to the sound of such a chord in isolation than to its referential value in larger tonal contexts.

Implicitly, the instability of the augmented triad (and thus its dissonance) is in some way related to those sentiments Ratner suggests it tends to elicit or induce in the listener. If this is true, then the instability of the chord is contingent upon our understanding of certain related emotional states (such as nostalgia) as also being particularly "unstable."

THE COGNITIVE/TONAL-SYNTACTIC STANDPOINT

Most music theorists appear to be more concerned with the tonal function of the augmented triad than with its acoustic content or its effect in isolation from other musical events. That is, they proceed from standpoints of how the augmented triad is cognized, rather than from those of psychoacoustics or mathematics. Let us consider a few examples:

9 Leonard G. Ratner, Music, the Listener's Art (New York: McGraw-Hill Book Co., Inc., 1957), p. 59

Much that has been said about the diminished triad applies equally well to the augmented triad. Like the diminished chord, it has no beginning and no end. It belongs to no key and wanders through space. 10

This passage is considered to be of cognitive focus, because it requires the listener to understand the augmented triad not simply in comparison to other individual sonorities, but also in reference to keys: points of reference which, although automatic to many listeners, must nonetheless be first learned and then mentally constructed while listening.

The point in this case may be well taken if we read it to refer to the fact that an augmented triad may be part of any of three minor-key collections, assuming audible equivalence. But what, then, of the major and minor triads, which may each be considered I, III, IV, V, VI, or VII and i, ii, iii, iv, or vi, respectively? Their common-practice key membership potentials should make them less stable than the augmented triad if key membership is the primary criterion for stability. Therefore key membership cannot be the primary criterion of instability. Implicitly, then, it is inclusion in key systems, rather than inclusion in pitch collections, which is of greater importance here.

The aural effect of this sonority [the augmented triad spelled in major thirds] does not identify the two major thirds.11 [and later] The augmented triad's root can be detected only by the classification of root movements.12

These statements are confusing when compared to the theories of Rameau and his followers, in that they imply the acceptance of an inverted ontology between perceptual acoustics and the practice of tasteful fundamental bass movement; they hint that any isolated voicing of the triad actually has no fundamental bass, about which more later. But if the root cannot be detected without reference to additional pitches, how are we to identify the augmented fifth relationship between chord members? If we cannot identify one of three aurally similar intervals as a dissonance, then where is the dissonance that we hear?

10 Lawrence Abbott, The Listener's Book on Harmony (London: George G. Harrap & Co. Ltd., 1943), pp. 46-47

11 Allen Irvine McHose, Basic Principles and Technique of 18th and 19th Century Composition (New York: Appleton-Century-Crofts, Inc., 1951), p. 236

12 Ibid., p. 240

While augmented and diminished triads are found less often in tonal music than are major and minor triads, they contribute a unique color and tension. Overuse, however, can weaken the tonal center of a piece. 13

It may be tempting to assume that the word "tension" in this passage refers to "dissonance," which one might further conceive as a vertical phenomenon. But the passage actually suggests a different meaning even more strongly: one in which tension is created through momentary destabilization of a tonal center. "Tension" in this sense entails progression, and could arguably be produced almost as easily through clumsy use of distantly related (but consonant) major triads as through injudicious use of augmented triads. The tension to which the passage refers may include vertical dissonance but is not dependent upon it, and so we must not conclude from this passage that a chord that produces such tension is intrinsically dissonant.

Most manuals that mention the augmented triad state that it is dissonant or "a discord", but, at most, explain that this is due to the presence of the augmented fifth or its inversion.14 None of these manuals clearly explains what, other than spelling, distinguishes the augmented fifth from the minor sixth. If we take them very literally, we may conclude that it is possible to add more dissonance to a purely diatonic piece of music by enharmonically respelling all of its minor sixths. It may be safe to assume that the authors treated here do not intend to give such an impression. On the other hand, they do little to specify what is contextually necessary for the listener to hear such differences, other than to provide examples of application, thus leaving ample room for error.

13 William Duckworth, A Creative Approach to Music Fundamentals, fifth edition (Belmont, CA: Wadsworth Publishing Company, 1995), p. 259

14 Joseph Brye, Basic Principles of Music Theory (The Ronald Press Company, 1965), p. 28; Frederick J. Horwood, The Basis of Harmony (Toronto, Canada: Gordon V. Thompson, Ltd., 1948), p. 4; Stewart MacPherson, Practical Harmony (London: Joseph Williams, Ltd., 1907), p. 20; Walter Piston, Harmony, third edition (New York: W.W. Norton & Company, Inc., 1962), p. 20; Yizhak Sadai, Harmony in its Systemic and Phenomenological Aspects (Jerusalem: Yanetz, Ltd., 1980), pp. 23, 145, 150; Felix Salzer and Carl Schachter, Counterpoint in Composition (the Study of Voice Leading) (New York: McGraw-Hill Book Company, 1969), p. 27; Elie Siegmeister, Harmony and Melody, Vol. 1: the Diatonic Style (Belmont, CA: Wadsworth Publishing Company, Inc., 1965), p. 294; D. E. Williams, A Music Course for Students Entering for Certificate and Others (London: Oxford University Press, 1937), p. 149

A few manuals explicitly address the treatment of the augmented triad as a dissonance, but are either noncommittal about this dissonance or imply that such treatment is more a matter of convention than an intrinsic characteristic, or show a certain inconsistency on the subject.

The augmented triad on III of minor is also considered dissonant because of the presence of a dissonant interval, the augmented fifth (+5), above the lowest voice. 15

This passage uses the word "considered" in a way which suggests either that a chord containing a dissonant interval may not be dissonant, that the augmented fifth may not actually be a dissonant interval, or both, but does not tell us which of these things the author intends to say. Possibly he intended a stronger word than "considered", or he may have meant that listeners may consider the specific interval differently under different conditions.

Outside of a diatonic context, therefore, the listener cannot determine which note of the chord is dissonant and cannot assign a specific direction to the chord, [and later] Such passages occur in both ascending and descending direction, but the former predominates, probably because the upper tone of the augmented 5th tends to move up. 16

15 Arnold Schoenberg, Preliminary Exercises in Counterpoint. Leonard Stein, ed. (London: Faber and Faber, Ltd., 1963), p. 95, emphasis mine

16 Edward Aldwell and Carl Schachter, Harmony and Voice Leading, second edition (Fort Worth: Harcourt Brace Jovanovich College Publishers, 1989), p.534, emphasis mine. The passages do not contradict each other. The first passage refers to the nature of the chord out of context; the second refers to its function in tonal context. It is not their inconsistency but their consistency of description across inconsistent contexts which is of interest here.

These two passages first state that the augmented triad has no inherent directional

tendency, then state that the augmented fifth, when identifiable, has a tendency to move

up. 17 What is not clear from the passage is whether this tendency is merely a convention of

tonal music, or if it is a natural preference of listeners. Notably, Aldwell and Schachter's

first sentence is concerned with cognition, but their second sentence is concerned with

convention. Moreover, their explanation of the dissonance of the augmented triad strongly

implies a defining tonal context, and in additionally treating the chord as an isolated

phenomenon, they acknowledge that it is not possible to determine through hearing it alone what dissonance it contains, and yet suggest that it must contain one.

A few other music theory manuals treat the augmented triad as a dissonance (e.g., through inversion), but do not actually identify it as such.

Many of the transformations are vagrant harmonies because of their constitution (diminished sevenths, augmented triads,...etc.), and also because of their multiple meaning. 18

Here Schoenberg seems to distinguish between the internal morphology of chords and

their valence in tonal contexts, but he fails to explain what, if not internal morphology,

causes the chords to be tonally multivalent. Possibly he is saying that the chords are not

only difficult to identify as representing specific subsets of a tonal scale because of their

shape, but also that they suggest numerous possible scales. If this is what he is saying, he

seems to be saying essentially the same thing twice. Perhaps he is merely trying to

separately emphasize vertical and horizontal aspects of a single phenomenon that relates to both musical dimensions. He does not state that the internal morphologies of the chords he is discussing are of a "dissonant" or "unstable" quality; to do so here without clear basis would be to further confound what is notated in the vertical plane with what is notated in the horizontal.

17 Tones do not move themselves, and in a real sense, don't actually "move" at all in most cases. By "tends to move up," the authors (Aldwell and Schachter) must mean "tends to be moved up." Regardless of what it actually says about the augmented triad, the passage shows some level of confounding between form and function. This may partially explain their stated assumption that the augmented triad contains a dissonant interval outside of a diatonic context, even if it can not be identified.

18 Arnold Schoenberg, Structural Functions of Harmony. Leonard Stein, ed. (New York: W.W. Norton & Company, Inc., 1969), p. 44, emphasis mine

Some manuals suggest special compositional treatment of the augmented triad, as compared with major and minor triads, but do not specifically constrain it more than them, as one might expect.

The upper parts may between them have the root and fifth of a diminished or augmented triad, but the diminished or augmented interval must not be between the lowest voice and either of the upper parts... Augmented and diminished triads are used in first inversion only. 19

The reference to triads as such may be helpful but is somewhat anachronistic; the passage refers to three-voice counterpoint in the sixteenth-century style, and thus to an era during which even the identity of consonances under registral rearrangement was not widely accepted.20 The author's instruction is indisputably an easy way to get the desired result from modern students, and makes up in practicality what it lacks in historical clarity. But if a student should look to the practice in question to reveal anything about the triadic essence of the verticalities to which the author is referring, the student will be slightly misled. What the author is likely referring to is not the practice of inverting these triads, as such, but simply the avoidance of stylistically rare intervals between the lowest voice and any other part, as she has also explained.

19 Charlotte Smith, A Manual of Sixteenth-Century Contrapuntal Style (Cranbury, NJ: Associated University Presses, Inc., 1989), p.71

20 Matthew Shirlaw op. cit. pp. 29-57. Shirlaw points out that Zarlino, for example, while recognizing the relatedness of what modern theorists recognize as various inversions and voicings of a single triad, did not classify harmonies from an essentially triadic basis. At least one guitar manual survives from this era which treats several voicings as equivalent. See: Thomas Christensen, "The Spanish Baroque Guitar and Seventeenth-Century Triadic Theory". Journal of Music Theory 36/1 (Spring 1992): 1-42. But in this guitar manual, the letter names which are used to classify chords are purely taxonomic and do not relate in any meaningful way to the letter names of the constituent tones. Also guitar manuals of this time must be understood as somewhat removed from the context with which Smith is concerned, and thus show many other differences of doctrine, such as openly allowing parallel movement of fifths and octaves.

As mentioned before, another perhaps equally important consideration in 16th century style is a primarily numerological objection to tautology in music of the church, as evidenced by certain arguments against the semidiapent (diminished fifth) as being constructed of consecutive minor thirds, and the avoidance of parallel movement between voices at the octave, fifth, and even the major third.21 Naturally, the highly symmetrical quality of two vertical major thirds would also be included among things to be avoided in this style, regardless of its smoothness or roughness, or "rootlessness" (an historically inappropriate concept which Smith does not use, but whose pertinence she invites through the use of the word "root").

At least one manual of music theory challenges the dissonance of the augmented fifth.

According to David A. Sheldon, the theorist F.W. Marpurg (1718-1795) considered the augmented fifth a "pseudo-dissonance,"22 and accepted that any of the constituent pitches of an augmented triad may be doubled in proper voice-leading as long as the doubled pitch is not treated as a dissonance 23 (that is, resolved, presumably after being spelled such that any resolved pitch is at an augmented or diminished interval). The fact that Marpurg allows an augmented fifth or diminished fourth to be resolved chromatically in either direction 24 is telling, in that it shows the dubious nature of any "natural" directional tendency of these intervals. Further, Sheldon asserts that such practices were often used in the galant style, citing Koch for support. 25

21 Ibid.

22 David A. Sheldon, Marpurg's Thoroughbass and Composition Handbook: a Narrative Translation and Critical Study (Stuyvesant, NY: Pendragon Press, 1989), p. 86

23 Ibid., p. 156

24 Ibid., p. 90

25 Ibid., p. 86

One manual also openly challenges the common assumption that the dissonance of a chord in tonal contexts is directly related to its harshness as an isolated vertical structure.

All other chords than these two (major and minor triads) are unrestful or incomplete; that is, they seem to require to be followed by a suitable consonant chord in order to complete the sense. They are called dissonant chords. Most of the intervals that enter into them are seen to be consonant; but in every chord at least one interval that is not consonant is found, and one or both of the tones of this interval create the unrestful feeling that characterizes the chord. It is not the harshness of a chord or interval that makes it dissonant. On the contrary, some of the dissonant chords are far more smooth than most arrangements of the major and minor triad; and they are frequently chosen by composers for the pure loveliness of their combination of tones. But they refuse to assert finality; they pass one on to the something which is to follow.26

This passage does not mention the augmented triad specifically, but one can little doubt that, if this is considered to be true of "several dissonances," the augmented triad must be among them, in light of its enharmonically ambiguous dissonant component.

The manuals discussed here span several centuries, and represent diverse practices within Western music, both sacred and secular, and so might be considered a representative

sample of Western thought regarding the augmented triad from the first overt use of

structures to the advent of . Throughout the manuals, taken collectively, several

reasons recur for the assessment of the augmented triad as harmonically unstable:

1) That it contains an arithmetically complex or acoustically rough interval: the augmented fifth.

2) That its internal morphology inhibits the perception of any internal hierarchy regardless of voicing; that it creates an uncertain-sounding equality of tones which is not easily mitigated through any registral positioning; that it has no "root" or "fundamental bass."

3) That it easily ambiguates scale form and/or key center; that it fails to clearly represent a normal tonal elaboration of any scale degree.

26 George Coleman Gow, "Elementary Theory and Notation", Essentials of Music from The American History and Encyclopedia of Music. Emil Liebling, ed. (New York: Squire, 1910), pp. 180-181

Re-examined in this way, these arguments supporting the dissonance of the augmented triad each seem to derive from one of the three types of perspective categorized here. Arguments of type 1 and 3 show no inconsistency with this interpretation. The question of "roots," though, is a complicated one.

If "roots" or "basses" were, in truth, essentially low-level perceptual phenomena, one would expect there to be either a great deal of agreement among music theorists of record on the roots of individual chords, or at least a trend of convergence over the years, in which it becomes easier to determine the "root" of a chord; perceptual evidence (even if not deliberately sought) would simply pile up over time, increasing our certainty. But there is considerable disagreement, and no apparent trend of convergence.

THE QUESTION OF ROOTLESSNESS

Since the time of Rameau, the idea of the "root" or "fundamental bass" of a vertical set of pitches has been generally accepted, but not without continued refinement and reinterpretation.27

While each definition of "bass" or "root" proceeds from some actual percept, some also sustain a certain level of reliability when scrutinized with greater objectivity, and are therefore more relevant to a discussion of empirical perspectives than of primarily rational perspectives. Music theorists including Helmholtz and Terhardt who have attempted to resolve the concept of "roots" or "fundamental basses" with the results of scientific research will be treated in later chapters.

27 Note: A "fundamental bass" differs from a "root" in that it is a specific pitch which may or may not occur in the chord which identifies it, but a "root" is more of a pitch class than a pitch, and is usually considered to be directly manifest in the chord which identifies it. The terms are treated together here, not because they are the same, but precisely because they illustrate the varying perspectives of the authors on what is probably a single "core" concept in their combined works.

Many music theorists, though perhaps appealing to the hearing of readers, have indicated, suggested or modified systematic ways of determining the root of a chord based primarily on their own observations of musical sound and/or the works of earlier theorists.

The most distinctive basic approaches to this problem are probably those of Rameau, Sorge,

Tartini and Hindemith.28

Jean Philippe Rameau (1683-1764) is often credited as the originator of the concept of

the "root" or "fundamental bass" of a chord. Being apparently influenced by Zarlino and

Descartes, he was well aware of the presence of partials in a musical tone, and reasoned that as partials were heard as referring to a fundamental, a group of musical tones could also be

heard as referring to a low single pitch. In Rameau's system, for chords not easily revoiced

as if a series of partials, pitches could usually be explained as referring to two

fundamentals, of which one was the stronger. The result of this tends to be (although it was

not Rameau's direct intention) that complex chords may often be viewed as a series of

stacked thirds, with the lowest member reflected in the fundamental bass, often one or more octaves below the lowest notated pitch.

In this system, the augmented triad seemingly has no single fundamental bass of its own, but must refer to another chord or key through which the meaning of the triad may be

identified. Although some hierarchical acoustic organization of chroma-equidistant pitches

may well arise as a function of actual pitch height, Rameau did not view triads in this way.

To Rameau, the spelling of an augmented triad would have been relevant to the definition of

its bass, because spelling was indicative of tonal function, and thus would also define the location of the augmented fifth and of the two stacked major thirds. Notably, many of Rameau's other dissonances are as clearly related to specific basses as are his consonances. Assuming that this point of Rameau's system accurately models some strongly influential but unobvious aspect of tonal music cognition, this would suggest a very fundamental difference between the augmented triad and the major and minor triads, if not in an acoustic sense, then in the way that they are likely to be understood by listeners.

28 The discussion of these theorists here is derived largely from: Shirlaw, The Theory and Nature of Harmony, with the exception of Hindemith, my understanding of whom arises from reading his treatises, from listening to his compositions, and from numerous classroom discussions of his compositional techniques.

Georg Andreas Sorge's Vorgemach der Musikalischen Composition (1745-1747)

effectively proposed that chords could be derived simply by taking each tone of a scale and

adding the two tones a third and a fifth above. Thus, according to Sorge, the root of any triad

would be that tone which was at the bottom of the chord when spelled in thirds. The

resulting system of roots is rather like Rameau's system of fundamental basses, except that

Sorge was less concerned with a chord's relation to one or more overtone series, and more

concerned with the internal grammar of scales as extending from the consonance of thirds.

This is reflected in the fact that Sorge considered certain dissonant chords to have roots not

actually included in their spelling; roots not sounded in the bass when the actual pitches

were heard (as somewhat abstractly in Rameau and more concretely in Tartini), but simply

understood as being mentally present in some way which helped aurally organize the actual

sounded pitches. For Sorge, the root was truly more a pitch-class than a pitch. Thus,

although Sorge may have regarded the augmented triad as having a root, even a specific root

of its own, he regarded all chords as having specific roots of their own (even if not

sounded), and this characteristic can not therefore be regarded as a criterion of consonance

in Sorge's system. In Sorge, rootlessness does not constitute dissonance because

rootlessness does not exist, and the dissonance of the augmented triad is heard in the

augmented fifth, which is audible because the triad is derived from a scale which provides

us with the context needed to recognize this interval as different from the minor sixth.

In his Trattato di Musica (1754), Tartini proposes a system of harmonic derivation of

chords based on the observed audibility of a third tone when two tones are played together which are related by ratios found in the lower harmonics (ratios using numbers between 1

and 6). This concept of harmony is strikingly like Rameau's in that it relates played

pitches to lower, unplayed pitches which serve as unifying elements. But Tartini's work can

be said to derive as much from Rameau's influences (Zarlino and Descartes) as it does from

Rameau. Unlike Rameau, Tartini did not consider all chords to have roots which could

explain and/or be explained by all of their constituent tones. Tartini did consider some

dissonant chords (such as the dominant seventh) to have roots, but was primarily interested

in the audible physical properties of consonant intervals.

Like Zarlino, Tartini considers to be dissonant any chord built of two consecutive

similar intervals.29 That is, a diminished triad is dissonant because it contains two minor

thirds, and the tritone is at most a secondary consideration, as would be the sevenths or

ninths produced by stacking perfect fourths or fifths. It is therefore incidental that the

"fifth" of the augmented triad may or may not sound consonant on its own; according to

Tartini, the dissonance of this chord does not depend on the dissonance of any one interval.

Since Tartini's system of harmony was an extension of his own observations on the

audible vibrations of strings and other instruments, he would not have been able to ignore

the similar physical properties of the augmented fifth and the minor sixth (particularly

under equal temperament, which, although not treated by Tartini, is a characteristic of

some music to which his theory is considered relevant), and any observations he may have

made on this point would no doubt have been consistent with his lack of interest in the

dissonance of the augmented fifth in isolation. Moreover, the augmented triad is effectively

rootless in Tartini's system not because of its augmented interval, but because it results

not only in two lower pitches (as with the diminished and quartal/quintal chords), but in three. For Tartini, an augmented triad would have been only one of many rootless (and therefore dissonant) sonorities, any of which may or may not have sounded rough or contained a specific dissonant interval.

29 Shirlaw, op. cit., p. 300
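The idea of a "third tone" produced by each pair of sounded tones can be illustrated numerically. The following is only a minimal sketch: it assumes the modern simple difference-tone model (the lower frequency subtracted from the higher), which is not necessarily how Tartini himself located the third tone, and the reference frequency `root` is a hypothetical value chosen for round numbers. It compares a just-intoned major triad (4:5:6) with an equal-tempered augmented triad and reports the pitch classes of the resulting lower tones in cents above the lowest sounded tone.

```python
import math
from itertools import combinations

def difference_tones(freqs):
    """Return the simple difference tone (higher minus lower) for every pair of tones."""
    return [abs(b - a) for a, b in combinations(freqs, 2)]

def pitch_classes(tones, reference):
    """Express each tone in whole cents above `reference`, folded into one octave."""
    return sorted({round(1200 * math.log2(t / reference)) % 1200 for t in tones})

root = 260.0  # hypothetical reference frequency in Hz (assumption of this sketch)

# Major triad in just intonation, ratios 4:5:6.
just_major = [root, root * 5 / 4, root * 3 / 2]

# Augmented triad in twelve-tone equal temperament (two stacked major thirds).
et_augmented = [root * 2 ** (n / 12) for n in (0, 4, 8)]

major_pcs = pitch_classes(difference_tones(just_major), root)
augmented_pcs = pitch_classes(difference_tones(et_augmented), root)

print("just major triad, difference-tone pitch classes (cents):", major_pcs)
print("equal-tempered augmented triad, difference-tone pitch classes (cents):", augmented_pcs)
```

Under these assumptions the three difference tones of the major triad fall on a single pitch class, while those of the augmented triad fall on three distinct pitch classes, none of which belongs to the chord.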

Numerous variants of harmonic theory in general have appeared from the mid-eighteenth

century onward, but the basic concept of the "root" or "fundamental bass" undergirded the

positions of those music theorists already discussed here. One of the more significant

refinements is the theory of Paul Hindemith, which attempted to accommodate harmonic theory to findings by acoustic scientists such as Helmholtz.

Hindemith, though interested in the physical properties of sound, was motivated more by the practical needs of musicians. While his system takes a large step from pure science in the direction of acoustic simplicity, its great usefulness to composers suggests that, whatever the extent of his bass's validity from the standpoint of psychoacoustics, that bass is very functional as a syntactic tool.

In Hindemith's system, all intervals and most chords have roots of varying strengths, and

the tension of chords is more a function of their interval content than of their relationship

to roots.30

According to Hindemith, the augmented triad has no root of its own in the sense that

none of its pitches readily emerges as stronger than the others. But he considers one of the

tones to be audible as a root in tonal contexts where the triad is preceded and/or followed

by chords with clearer roots, which give extra, uneven meaning to the tones of the triad.

While Hindemith does not necessarily consider the augmented triad to be a harsh

dissonance, he does characterize it by using such terms (translated) as "indeterminate,"

"wavering," "unsteadiness," and "uncertainty." Each of these terms relates easily to the

term "unstable" and refers to a group of chords he considers to have weak roots. Therefore, while Hindemith does not clearly define instability as a form of dissonance, he does treat weakness of roots as a criterion of instability. That is, Hindemith does not show that weakness of roots is a form of "dissonance" perhaps only because he does not treat "dissonance" as such, and prefers terms such as "tension" and "decrease in value".31

30 Paul Hindemith. Craft of Musical Composition, vol.1 (Melville, NY.: Belwin-Mills, 1942)

By examining points of ambiguity and disagreement between the views of these authors, which include the question of how one may perceive the "root" or "bass," it is possible to

see that while certain tendencies of root determination have emerged as predominating,

their predominance is as easily attributed to their simplicity as it is to their validity in

light of additional criteria. Clearly, though, the conception of the "root" or "fundamental

bass" has enough basis in perceptual reality to make its consistently weak associations with the augmented triad (under any definition of "root") a legitimate point of interest in assessing the harmonic stability of this chord, even if it may be a point of some disagreement among theorists as it pertains to other chords.

In summary, the general view of music theorists throughout history toward the augmented triad shows several basic points of agreement (with a few exceptions) and several significant points of lack of agreement.

Points of agreement:

1) that the augmented triad should be regarded as "dissonant."

2) that its dissonance is due to some type of "instability."

3) that this instability is manifest in the interval of the augmented fifth.

4) that the augmented fifth differs from enharmonically equivalent, consonant intervals.

Points of lack of agreement:

1) whether dissonance is primarily physical, physiological or psychological.

2) whether "instability" is primarily caused by roughness or rootlessness.

3) whether the augmented fifth is essentially an arithmetic or logarithmic interval.

4) whether the difference between the augmented fifth and the minor sixth is physical, physiological or psychological.

31 Ibid., p.119

What this shows is that while music theorists largely agree in their conclusions regarding the augmented triad, they do not agree on the premises from which one might like to think that the conclusions should proceed. Several further interpretations of this situation are possible.

One obvious interpretation is that the conclusions and premises are simply reversed;

that it was the observation that the augmented triad was dissonant which caused the various

explanations of its dissonance to arise. But elements of the various explanations were in operation before triads were recognized as such. Reversals of normal modes of reasoning are

not necessarily wrong, and probably occur with some frequency in the course of music theory history. But this does not provide a full and satisfactory explanation for the

inconsistencies mentioned here.

Another, more plausible interpretation is that sufficient correlation exists between things such as the physics of, physiological response to, and tonal function of the augmented triad that it is usually unnecessary to address these things as independent of each other. Possibly, "roughness" and "rootlessness" are also highly correlated or similar

in some way. Possibly, its sound in a triad or its application in music renders useless the distinction between the arithmetic augmented fifth and the logarithmic perfect fifth, and yet the effect of the arithmetic minor sixth is preserved under triadic inclusion and equal temperament.
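The sizes of the intervals in question can be compared directly in cents. This is a minimal sketch under stated assumptions: the just-intonation ratios chosen here (25:16 for the augmented fifth as two stacked 5:4 major thirds, 8:5 for the minor sixth) are one common reading of the "arithmetic" intervals and not the only possible one, and the variable names are illustrative.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

# Equal temperament: both spellings span eight semitones and are physically identical.
et_augmented_fifth = 2 ** (8 / 12)
et_minor_sixth = 2 ** (8 / 12)

# Assumed just-intonation readings of the two spellings.
just_augmented_fifth = (5 / 4) ** 2  # two stacked 5:4 major thirds, i.e. 25:16
just_minor_sixth = 8 / 5             # inversion of the 5:4 major third

for name, ratio in [
    ("equal-tempered augmented fifth", et_augmented_fifth),
    ("equal-tempered minor sixth", et_minor_sixth),
    ("just augmented fifth (25:16)", just_augmented_fifth),
    ("just minor sixth (8:5)", just_minor_sixth),
]:
    print(f"{name:32s} {cents(ratio):7.1f} cents")
```

On these assumptions the two just intervals differ by roughly 41 cents and lie on either side of the equal-tempered interval, which is the same for both spellings.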

If any of these correlations or propositions are valid, they should be verifiable with

modern technology and methods. In the next part of this thesis, much consideration will be given to this point.

At some point in the history of music theory any of the speculations of music theorists against the consonance of the augmented triad could have been considered completely sustainable considering the dearth of counterevidence. It could even be said that the means of directly evaluating these claims were not practically available until most of the theory

manuals examined here had been broadly accepted for many years. It must also be remembered that if styles existed which could have utilized the augmented triad

differently, such styles were probably not well known to many of the authors of manuals discussed here.

Moreover, as sources of basic guidance to the composer and the analyst, the manuals mentioned here and others like them are certainly useful. They provide the means of

obtaining a reliable result: functional pieces of music and insightful analyses. But this reliability is only possible because common-practice music theory deliberately simplifies

music cognition and does not concern itself with what may be regarded today as readily demonstrable exceptions to tendencies of human hearing as directly referent to cognizable

graphic structure. That is, although common-practice theorists may describe the sounds of

the objects they explain as if these were the sounds which normal listeners have been

demonstrated to hear, the sounds that they actually describe are the sounds that listeners would be expected to hear as extrapolated from a collection of assumptions, tested and

untested, about the auditory salience of certain mathematical relationships. These assumptions have been reinforced by a long history of compositional practice that has not

demanded alternate hearings of the superimposition of two major thirds.

Chapter 3: Some Empirical Perspectives

...empirical studies of interval perception have fallen short of confirming the phenomenal reality our concepts describe so confidently. 32

- William Thomson

Psychologists are apt to make prescriptions about the nature of music based on a narrow and often primitive understanding of the medium. Musicians, on the other hand, are in the habit of basing analysis on sweeping assumptions about the nature of perception for which there is little experimental evidence.33

- Stephen Walsh

Thomson's observation is not mere understatement. It forces us to confront our true motives for conducting (or for not conducting) empirical studies.

On the other hand, as Walsh points out, the isolated, constrained conclusions of psychologists are not necessarily extrapolable to larger, more complicated musical contexts, and music analysts can scarcely be faulted for refusing to repeatedly modify their methods in order to accommodate numerous new findings which are not necessarily of very general pertinence to real music. While experimental evidence may some day allow us to replace much of current analytic methodology (inasmuch as it depends upon appeals to sensation), it is only recently that any amount of experimental evidence has become broadly available in useful forms.

Work by such scientific minds as Descartes showed well before the twentieth century that musical sound was scientifically approachable (in the modern sense) even with what would now be considered very crude technology. But their results were used (and apparently intended to be used) as mere explanatory supplements to music theory as it existed in their time. Even in the late twentieth century, scientific approaches to music have focused on the elaboration of existing models of music perception and cognition implicit in music theory, rather than on determining the strength of these models.

32 William Thomson, "The Harmonic Root: A Fragile Marriage of Concept and Percept", Music Perception 10/4 (Summer 1993): 385

33 Stephen Walsh, "Musical Analysis: Hearing is Believing?", Music Perception 2/2 (Winter 1984): 237

As Thomson's remarks suggest, this approach to musical science, accepted without notable exception for centuries, is no longer without its critics in serious academic circles.

In the last twenty-five years great advances have been made in the publication of scientific approaches to the perception of various musical phenomena. Not all of these advances have been favourable to traditional music theory's appeals to perception. For example, they have demonstrated the quantifiable inaccuracy of such long-standing assumptions about musical perception as the linear correlation between frequency and pitch.34
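The nonlinearity in question can be illustrated with a mel-scale conversion. The sketch below uses a widely cited analytic approximation to the mel scale, not the tabulated data discussed by Handel or Scharf; the constants 2595 and 700 and the function name `hz_to_mel` are assumptions of this sketch.

```python
import math

def hz_to_mel(f_hz):
    """Approximate mel value for a frequency in Hz: 2595 * log10(1 + f / 700)."""
    return 2595 * math.log10(1 + f_hz / 700)

# Each step doubles the frequency; the mel values do not double with it.
for f in (250, 500, 1000, 2000, 4000):
    print(f"{f:5d} Hz  ->  {hz_to_mel(f):7.1f} mels")
```

Under this approximation 1000 Hz corresponds to about 1000 mels, but 2000 Hz corresponds to well under 2000 mels, so equal ratios of frequency are not mapped to equal ratios of estimated pitch.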

Some of these scientific investigations have some bearing on the behaviour of the augmented triad under variable conditions, and the perceptual properties of triads in general. They can be used not only to clarify several special aspects of the augmented triad, but also to assist the music theorist in reformulating questions one may have about the triad in ways which make these questions easier to answer accurately.

Some articles have dealt directly with the augmented triad from various perspectives, and attempted to answer questions about what may be perceived when one hears it.

A few articles have dealt with the nature of vertical dissonance from the standpoint of acoustic definitions of pitch and timbre, which can tell us something about the conditions of dissonance in general, and, therefore, provide some basis for assessing the comparative dissonance of various types of acoustic events, including those which would normally be notated as an augmented triad.

34 Stephen Handel, Listening (An Introduction to the Perception of Auditory Events) 1st paperback ed. (Cambridge, MA: The MIT Press, 1989). See his discussion of mels, p.69, adapted from: B. Scharf, "Audition", Experimental Sensory Psychology (Glenview IL: Scott, Foresman, 1975): 112-149

Some research has also dealt with the general perceptual validity of traditional

assumptions about chord progression, voice leading and harmonic closure, which can tell us something about the certainty that any given notated augmented triad will function in a

manner consistent with that described in theory manuals or elsewhere.

Several articles have dealt with the perceptual relevance of the harmonic root and/or

fundamental bass, which can tell us something about the transparency of such constructs in general, and specifically the cognizability of the root/bass of any given augmented triad as

compared to those of more commonly used sonorities.

Several articles have dealt with the influence of timbre on the perception of pitch and

intonation, and the influence of intonation on the perception of interval size, which can tell

us something about the conditionality of perceptual differences between objects not improperly notated as augmented triads and objects not improperly notated otherwise (i.e., as major or minor triads).

It is the purpose of this chapter to summarize this research, in order to position the experiment in the pertinent research context.

Again, as with the perspectives of theorists, the perspectives of researchers may be grouped as primarily acoustic/numerological, perceptual/psychoacoustic or cognitive/tonal-syntactic.

THE ACOUSTICAL PERSPECTIVE

Danner (1985) uses acoustic (not perceptual) principles to calculate the roughness of more than 200 three-part equal-tempered sonorities, assuming a very simple but musically normal harmonic spectrum in which the amplitudes of ten arithmetic harmonics vary as 1/N, such that the amplitude of the tenth harmonic is 1/10 that of the fundamental. In each sonority, the lowest member was middle C, and the other pitches were within two octaves above.
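The spectrum described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: middle C is taken as 261.63 Hz (A4 = 440 Hz), and the pairwise roughness curve is a stand-in of the Plomp-Levelt type with constants in the form popularised by Sethares. It is not Danner's own procedure, and the function names are hypothetical.

```python
import math
from itertools import combinations

MIDDLE_C_HZ = 261.63  # assumption: equal temperament with A4 = 440 Hz


def harmonic_spectrum(fundamental_hz, n_harmonics=10):
    """Return (frequency, amplitude) pairs, harmonic N having amplitude 1/N."""
    return [(fundamental_hz * n, 1.0 / n) for n in range(1, n_harmonics + 1)]


def equal_tempered(semitones_above_c):
    """Frequency of the pitch a given number of semitones above middle C."""
    return MIDDLE_C_HZ * 2 ** (semitones_above_c / 12)


def pair_roughness(f1, a1, f2, a2):
    """Stand-in roughness of two partials (assumed curve, NOT Danner's)."""
    f_low, f_high = min(f1, f2), max(f1, f2)
    # Scale the frequency difference by an approximation of critical bandwidth.
    s = 0.24 / (0.021 * f_low + 19.0)
    x = s * (f_high - f_low)
    return a1 * a2 * (math.exp(-3.5 * x) - math.exp(-5.75 * x))


def sonority_roughness(semitones):
    """Sum the stand-in roughness over every pair of partials in the sonority."""
    partials = [p for st in semitones
                for p in harmonic_spectrum(equal_tempered(st))]
    return sum(pair_roughness(f1, a1, f2, a2)
               for (f1, a1), (f2, a2) in combinations(partials, 2))
```

Calling `sonority_roughness([0, 4, 7])` and `sonority_roughness([0, 4, 8])`, and then repeating the comparison for wider voicings such as `[0, 7, 16]` and `[0, 8, 16]`, gives a rough sense of how strongly registration can move such estimates, although the absolute values depend entirely on the assumed curve.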

One thing which Danner's estimates show is that, given such a timbre and such intonation, classification of the sonorities by type was not a reliable predictor of roughness.

For example, the roughness of set type 3-11 varied between values which Danner defined as 0.0307 and 0.1989, fully including the range from 0.1161 to 0.1368, the range of roughness exhibited by 3-12 in comparable registration.35

What this means is that if we intend to accept that 3-12 is always acoustically rougher than 3-11 (or always smoother for that matter), we must reject the proposition that acoustic roughness correlates with psychoacoustic roughness, or reject Danner's choice of timbre as musically representative, or reject Danner's choice of temperament.

The first rejection is obviously much less problematic than the others. It is indisputable that, while purely acoustic measures of roughness are theoretically valid in any frequency range and at any amplitude, human hearing has limitations at either end of the frequency spectrum and also has limitations of amplitude beyond which atmospheric oscillations are undetectable. Further, it is known that the frequency response of the human ear is not accurately described by straight line segments, or even by simple, highly similar curves at various amplitudes.36 Thus it should be clear that, even if Danner's roughness estimates describe the comparative perceptual roughness of the sonorities in question at some amplitudes with some degree of accuracy, this accuracy is predictably unreliable under extreme pitch-transposition and/or significant shifts of amplitude.

Notably, Danner does not suggest that his estimated acoustic roughnesses will map onto perceptual roughness, but only shows how such estimates might be used in music analysis without defining the analytical result as accurately describing listener response.

35 Gregory Danner, "The Use of Acoustic Measures of Dissonance to Characterize Pitch-Class Sets", Music Perception 3/1 (Fall 1985): 103-122, see graph.

36 Stephen Handel, op. cit., p. 64; Juan G. Roederer, Introduction to the Physics and Psychophysics of Music (London: The English Universities Press, Ltd., 1973), pp. 76-82.

Therefore, it should be understood that Danner's roughness estimates of 3-11 and 3-12, viewed comparatively, do not necessarily show that 3-12 can actually sound less rough than 3-11. On the other hand, it is difficult to accept that the nonlinear response of the human ear is enough to explain away any perceptual validity implicit in the overlap of roughness ranges between 3-11 and 3-12. In other words, should a first-year music student fail to hear Danner's least rough form of 3-12 as more rough than Danner's most rough form of 3-11, her teacher would have no acoustic basis for telling her that her "ears are wrong"; her ears might even be abnormally accurate in conveying acoustic information to her brain.

Of course, most music is at least ostensibly composed and performed primarily to be heard (and perhaps understood) by people with less than absolute acoustic resolution; that is, by people with normal hearing. Therefore, we must consider to what extent, and in what ways, Danner's estimation of acoustic roughness is consistent with scientific measures of psychoacoustic roughness.

THE PERCEPTUAL PERSPECTIVE

Terhardt (1984) suggests several criteria which may contribute to the perception of consonance. His discussion and diagrams show several things which suggest that context may be as important as content in determining the functional consonance or dissonance of a particular sonority as it appears in a piece of music. As Terhardt defines it, the dissonance of intervals between pure tones is not consistent across registers. He asserts that in the 110 Hz range, intervals approaching pitch-interval 4 are the most dissonant intervals between pure tones, and the size of the most dissonant pitch-interval contracts rather gradually the higher one goes in frequency, to something slightly smaller than pitch-interval 1 at around 3520 Hz.37
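The registral contraction described here can be loosely illustrated with a short Python sketch. It does not reproduce Terhardt's graph. It assumes two things that are not taken from his article: the Zwicker and Terhardt analytic approximation of critical bandwidth, and the Plomp and Levelt rule of thumb that two pure tones are roughest when separated by about a quarter of a critical band. For simplicity the bandwidth is evaluated at the lower tone, and the function names are hypothetical.

```python
import math


def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth (assumed Zwicker-Terhardt formula)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69


def peak_roughness_interval_semitones(lower_hz, fraction=0.25):
    """Size, in semitones, of the interval a quarter critical band above lower_hz."""
    upper_hz = lower_hz + fraction * critical_bandwidth_hz(lower_hz)
    return 12.0 * math.log2(upper_hz / lower_hz)


for hz in (110, 220, 440, 880, 1760, 3520):
    print(hz, round(peak_roughness_interval_semitones(hz), 2))
```

Under these assumptions the roughest pure-tone interval is between three and four semitones wide at 110 Hz and less than one semitone wide at 3520 Hz, which agrees in direction with the trend described above, though the intermediate values should not be read as Terhardt's.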

What we may extrapolate from this is that pure-tone manifestations of a trichord type such as 3-1 might actually sound smoother at around 110 to 220 Hz than might a major or minor triad (3-11) in the same register. If we can accept this extremely counterintuitive view of musical sound, we might doubt that 3-12 is more rough than 3-11 in all other timbres and registers.

Further, Terhardt indicates that for pure tones between 440 Hz and 880 Hz (a very common range for musical pitch), smoothness of intervals consistently increases as interval size moves from the major second toward the octave.38 Under these conditions, the pure-tone tritone is more consonant than the perfect fourth, and even more so in comparison to the major and minor thirds. More to the point, all interval sizes approximated by the equally tempered minor sixth, including the augmented fifth, are even more consonant than the tritone. This means that even in reference to a specific musically normal register, pure tones do not interact with each other as tonal theory would imply that they should (if we should look to theory to describe physiological response as well as tonal cognition).

Additional tones (possibly harmonics) are needed in order for intervals such as the tritone and the minor seventh to be heard as dissonant, at least when no musical context exists which defines them as dissonant.

Ultimately, Terhardt makes a very clear distinction between what he calls "Sensory Consonance," and what he calls "Harmony." Sensory Consonance is considered to be innate, and Harmony is considered to be learned through exposure to music. Using this distinction, the ranking of Sensory Consonance of two sonorities may sometimes be the reverse ranking of their harmonic consonance, if our experience as listeners tells us that the sonorities in question represent two other sonorities for which Sensory Consonance and Harmonic Consonance would be directly correlated.

37 Ernst Terhardt, "The Concept of Musical Consonance: A Link Between Music and Psychoacoustics", Music Perception 3/3 (1984): 282, see graph adapted from Kameoka & Kuriyagawa (1969), Plomp & Levelt (1965) and Terhardt (1977).

38 Ibid., p. 281.

This is a useful model in that it allows us (among other things) to reconcile the consistent designation of the augmented triad as more dissonant than the major and minor triads in tonal music with any doubts we may have developed from the standpoints of acoustics or psychoacoustics. But there are at least three limitations to this reconciliation: firstly, that one must hear tonal music "tonally" (in a learned way) for this reconciliation to have any meaning to begin with; secondly, that it may still be possible to hear certain pieces of music in tonally meaningful ways without this reconciliation; thirdly, that even if we can hear tonally, the ability to hear "atonally" or "modally" may interfere with our ability to hear "tonally," should we be confronted with examples of complicated tonal syntax in which multiple, perhaps hierarchical inversions of sensory ranking by harmonic ranking are simultaneously operant.

For example, when "I 6/4 prolongs V+" in the lowest octave of a vibraphone, the timbral and registral atypicality of the acoustic events in question may make it difficult for even a well trained listener to recognize "I 6/4" and "V+", and thus to understand which chord "prolongs" which. In such situations, a listener may revert to what I will call "pregrammatical default modes of auditory cognition" which are probably as consistent with Terhardt's model of hearing as they are with tonal theory.39

39 I use the term "pregrammatical" rather than "premusical", in that one may, and indeed must, perceive music as music before one may come to understand its musical grammar and the functions of its various components, just as one must recognize speech as language before one may learn to interpret it as having an intended meaning. Needless to say, the types of effects I have typified in the vibraphone example are avoided in tonal music, precisely because they often fail to communicate the "musical meaning" which would be derived from their analysis. But the example should not be dismissed as reduction to the absurd, in that comparable events do occur with some regularity in twentieth century music, apparently without much concern as to whether we may recognize them as having grammatic functions derived primarily from their pitch structure (which may be difficult to prioritize as a determinant of dissonance in such cases). These types of considerations are not within the scope of Terhardt's article, which emphasizes general principles of, rather than specific problems with, consonance.

Franz Loosen, in a 1995 study of preferences of intonation, found that a performer's choice of primary instrument correlated to, but did not determine these preferences.40

Participants were not simply asked what they preferred, but were required to judge examples without knowing what system of intonation they utilized. In general, violinists tended to prefer Pythagorean intonation over equal temperament. Pianists tended to prefer equal temperament over the Pythagorean system. Nonmusicians showed no preference.

Although the study does not demonstrate any type of causality, and does not specifically consider the augmented triad at any point, it is tempting to draw pertinent inferences from the findings presented. For instance, if one demonstrates a preference for a specific intonation system by less frequently judging examples of it as being "out of tune" (as was done in Loosen's study), then it follows that the systems considered to be "out of tune" are heard as referring to a system which is "in tune"; either the preferred system, or something closer to it. It then follows that if one hears one of the two systems treated here as having higher referential value, then it could be said, for all intents and purposes, that individual trained musicians may or may not hear primarily as if in reference to a single harmonic structure like that demonstrated on and extrapolated from the monochord (preferring arithmetic intervals over logarithmic ones), and that the purely vertical meaning of the augmented fifth for an individual listener should therefore depend largely on the presence or absence of this "monophonistic" (arithmetic/naturally harmonic) style of hearing.

This is pertinent to a discussion of the augmented fifth. If one hears a logarithmic augmented fifth (equal-tempered; that is, the square of the third root of two), which does not differ mathematically from a logarithmic minor sixth, as referring to an arithmetic augmented fifth (25/16, the octave inversion of 32/25) which differs both audibly and mathematically from an arithmetic minor sixth, there may be, upon hearing the logarithmic one, a sense of being "out of tune," although there may also be a sense of reference to the intrinsically dissonant augmented fifth. If one hears strictly in this way, one of the interval-class 4's in 3-12 must always be heard as referring to the dissonant arithmetic ratio of 32/25 (or some multiple of it).

40 Franz Loosen, "The Effect of Musical Experience on the Conception of Accurate Tuning", Music Perception 12/3 (1995): 291-306.

Conversely, if one prefers equal temperament, complexity of ratios between constituent frequencies is not a primary criterion of dissonance. Moreover, for such listeners (from a purely vertical standpoint), the more dissonant interval between the arithmetic augmented fifth and the arithmetic minor sixth should be the one which least resembles the square of the third root of two.
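The interval sizes at issue can be checked with a brief Python sketch. It takes the arithmetic augmented fifth to be two stacked 5/4 major thirds (25/16), whose octave complement is 32/25, and the arithmetic minor sixth to be 8/5. The constant and function names are hypothetical and serve only this illustration.

```python
import math
from fractions import Fraction


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents to the octave)."""
    return 1200.0 * math.log2(float(ratio))


ET_AUG_FIFTH = 2 ** (8 / 12)         # square of the third root of two
JUST_AUG_FIFTH = Fraction(25, 16)    # two stacked 5/4 major thirds
JUST_MINOR_SIXTH = Fraction(8, 5)
JUST_DIM_FOURTH = Fraction(32, 25)   # octave complement of 25/16


def deviation_from_et(ratio):
    """Absolute distance in cents from the equal-tempered augmented fifth."""
    return abs(cents(ratio) - cents(ET_AUG_FIFTH))


for name, ratio in (("augmented fifth 25/16", JUST_AUG_FIFTH),
                    ("minor sixth 8/5", JUST_MINOR_SIXTH)):
    print(name, round(cents(ratio), 2), round(deviation_from_et(ratio), 2))
```

On these figures the arithmetic augmented fifth lies roughly twice as far from the equal-tempered interval as the arithmetic minor sixth does, so it is the augmented fifth which "least resembles the square of the third root of two."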

Ultimately, the fact that Loosen's nonmusicians collectively showed no preference at all draws into question whether the normal listener (in the demographic sense of the word normal) has any plausible capacity to distinguish between the augmented fifth and the minor sixth in any form when not provided with a larger tonal context. If we can't trust violinists and pianists to agree on what these intervals actually are, we are likely kidding ourselves if we expect people with any less training or innate ability to identify and to judge these already uncertain intervals. If there ever was a point in history when normal people heard these intervals more acutely (granted, there may have been), we have neither any way of knowing this for certain, nor any way of demonstrating what relevance such a conclusion would have for analysts or for composers today.

Cheryl L. Bruner found in her study of comparative perceptual similarity of the twelve basic trichord set-classes that her participants did not mentally group these sets consistently with purely mathematical estimates by theorists such as Robert Morris.41

41 Cheryl L. Bruner, "The Perception of Contemporary Pitch Structures", Music Perception 2/1 (1984): 25-39. Bruner specifically cites the work of Morris, although Bruner's grouping is also at odds with mathematical estimates of similarity by Isaacson, Rahn and Teitelbaum. Across all four mathematical estimates the greatest inconsistencies with Bruner were with estimates for trichord types 3-7 and 3-12. For further discussion of trichord similarity see:

Bruner's study required participants to estimate similarities for a broad variety of registrations and degrees of pitch commonality, and presented the trichords as successions of tones in some cases and as simultaneities in others. Bruner's article gives only a basic description of the stimuli, and does not address timbre. Therefore, Bruner's results may not reflect all possible comparative situations with equal accuracy, but in the sense that they tend to contradict some very general inferences about trichord similarity which one may be tempted to draw from the work of others, the results do warrant serious consideration here.

While the grouping of set types by Bruner is not particularly favourable to an interpretation of 3-12 as similar to 3-11 (major and minor triads), it should be noted that she found 3-11 to be most closely associated with 3-9, which is not only not a continuous tertian structure, but contains no thirds whatsoever. Interestingly, 3-12 is much more closely grouped with 3-8, a class that includes the incomplete dominant seventh, which is also not a continuous tertian structure.

One thing which might be inferred from this grouping is that graphic or functional similarities (i.e., tertian structure, tendency to resolve, etc.) are not always good indicators of perceptual similarity. It may be further noted that while 3-8 and 3-11 are considered to be the two most stable trichords in tonal theory and can be prolonged as harmonies in Schenkerian theory, they are grouped far apart by Bruner, and each in proximity to a set of functionally contrasting features. While 3-12 is inversionally invariant and tertian, 3-8 is neither of these. While 3-9 is not inversionally invariant, it is symmetrical and clearly non-tertian. 3-11 is tertian and asymmetrical.

Eric J. Isaacson, "Similarity of Interval-Class Content between Pitch Sets: The IcVSIM Relation", Journal of Music Theory 34/1 (1990): 1-28

Robert Morris, "A Similarity Index for Pitch Class Sets", Perspectives of New Music 18 (1979-1980): 445-460

John Rahn, "Relating Sets", Perspectives of New Music 18 (1980): 483-497

Richard Teitelbaum, "Intervallic Relations in Atonal Music", Journal of Music Theory 9/1 (1965): 72-128

From a purely mathematical standpoint, it is mysterious indeed that a tertian set like 3-10 (the diminished triad) should be grouped neither with other tertian-symmetrical (3-12) sets, nor with another set class containing the tonally all-important interval class 6 (the tritone), but with 3-7; a set which is similar only in one interval class (and in their mutual inclusion in the diatonic heptachord, 7-35).

To whatever extent such findings are replicable, we must question whether the tonal functions of these various trichords are separate from the specific manners and contexts in which they tend to be presented. For a rather simple example, the major and minor triads are undoubtedly highly similar in tonal function, but 3-8 and its inversion are not comparably similar to each other as functional dominants in normal tonal theory.

To put it bluntly, Bruner shows that perceptual similarity does not consistently correlate with mathematical similarity, much less with traditional tonal function.
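For readers who wish to see what "mathematical similarity" amounts to at its simplest, the following Python sketch computes interval-class vectors for the trichord types discussed above and compares them by a plain sum of absolute differences. This distance is only an illustrative stand-in. It is not claimed to reproduce the indices of Morris, Isaacson, Rahn or Teitelbaum, and the names used are hypothetical.

```python
from itertools import combinations

# Prime forms of the trichord set classes discussed in the text.
TRICHORDS = {
    "3-7": (0, 2, 5),
    "3-8": (0, 2, 6),
    "3-9": (0, 2, 7),
    "3-10": (0, 3, 6),
    "3-11": (0, 3, 7),
    "3-12": (0, 4, 8),
}


def interval_class_vector(pcs):
    """Count occurrences of interval classes 1 through 6 in a pitch-class set."""
    vector = [0] * 6
    for a, b in combinations(pcs, 2):
        diff = (b - a) % 12
        ic = min(diff, 12 - diff)
        vector[ic - 1] += 1
    return vector


def vector_distance(x, y):
    """Illustrative distance: summed absolute differences of the two vectors."""
    vx, vy = interval_class_vector(x), interval_class_vector(y)
    return sum(abs(p - q) for p, q in zip(vx, vy))


for name, pcs in TRICHORDS.items():
    print(name, interval_class_vector(pcs))
```

By this count 3-10 and 3-7 share only interval class 3, as noted above, which is what makes their perceptual grouping in Bruner's data difficult to anticipate from interval content alone.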

Trainor and Trehub's "What Mediates Infants' and Adults' Superior Processing of the Major Over the Augmented Triad?", to judge from the wording of the title, seems to demonstrate that augmented triad verticalities are more difficult to process than major triad verticalities.42 However, their experiments focus on the presentation of tones in succession, rather than as simultaneities. (In the experiment in which simultaneities were used, they were used only in conjunction with successions of tones.) Moreover, the experimenters not only assume that their results may pertain equally well to both situations, they also seem to assume that their subjects are somehow extracting specific implicit vertical sonorities from successions of tones containing other material; further, that this is the primary mode of processing.43

42 Laurel J. Trainor & Sandra E. Trehub, "What Mediates Infants' and Adults' Superior Processing of the Major Over the Augmented Triad?", Music Perception 11/2 (Winter 1993): 185-196.

A more constrained conclusion from Trainor and Trehub's work is that the perfect fifth is a perceptually advantaged melodic interval. This seems to be well supported by their data. We may be able to conclude from their data that melodies which can be easily analyzed as referring to augmented triads are usually more difficult to mentally process than are melodies which may be easily analyzed as referring to major triads. But we cannot conclude from this that sonorities containing the perfect fifth will always share perceptual advantage with melodies containing them, or that triads of any kind are always used to structure our mental processing of melodies which conspicuously contain their constituent pitches.

One additional concern about Trainor and Trehub's experiments is that it is not clear from their description exactly what timbre or timbres were used. We have no way of being certain that their choice of sine waves, voicelike timbres, piano-like timbres or whatever else they may have chosen did not affect their result. Understanding that not only perceived pitch, but also the qualities of intervals are greatly affected (if not determined) by the interaction of salient frequency and timbre, we are left to speculate as to what the infants were actually hearing, or to paint this part of the picture of psychomusical space with a very broad brush.

Ultimately, what Trainor and Trehub's article shows is that the strongest position taken in easily available scientific literature which asserts the naturally superior mental processing of the major triad over the augmented triad makes its case rather poorly. Moreover, their example is the strongest counterexample to the general pattern of research I have shown so far, and it is not particularly strong.

43 While this implication may seem reasonable in light of the number of Western melodies which contain tone sequences easily recognized as referring to abstract verticalities, the processing of successions and simultaneities must differ in some significant ways, or a major scale would simply be mentally reassembled into a close-voiced 7-part sonority, and thus a dissonance, regardless of what chords were used to harmonize it.

THE COGNITIVE PERSPECTIVE

Helen Brown, in a 1988 article, presents experimental data that

...indicate that perception of tonality is too complex a phenomenon to be explained in the time-independent terms of psychoacoustics or pitch-class collections, that perceived tonal relationships are too flexible to be forced into static structural representations...44

The article describes two different experiments yielding similar conclusions, but deals entirely with examples of tonal music, and does not address the augmented triad. But, at the risk of seeming to use Brown's findings in an opportunistic fashion, I would like to suggest that these findings may have some pertinence to the augmented triad as well. What her data show is that the tonal "meanings" of pitch-class sets are less contingent on their content than on their context and that, for trained listeners, the primary meaning of an isolated musical sonority is usually in its function in a musical event which also contains other sonorities.45

While some would say that this is intuitively obvious, it should be pointed out that many things which are scientifically true are counter to intuition, and that the fact that this is now shown not to be one of them is therefore significant; it should be all the less controversial, and we should not fail to recognize its situational pertinences in analysis or in composition.

44 Helen Brown, "Interplay of Set Content and Temporal Context in a Functional Theory of Tonal Perception", Music Perception 5/3 (1988)

45 The findings should not be automatically extrapolated to untrained listeners, although the basic cognitive mechanisms may also operate in this group to some degree. The point is made here deliberately in order to strongly characterize music theorists as trained listeners (as opposed to normal listeners) and to show one way in which their understanding of isolated musical sonorities may be colored by training.

None of this provides cause for conclusion on the nature of the dissonance of the augmented triad, but it is somewhat supportive of the idea that the perception of a referent tonality may be a large factor in a listener's recognition of the augmented triad as a dissonance. For example: if one hears as consonances in a minor key only verticalities which are present as consonances in a major key, this would exclude the augmented triad; not because it is more acoustically or psychoacoustically dissonant, but simply because one fails to recognize it as a subset of any major scale. Brown certainly does not attempt to simplify hearing to this extent, but the cognitive mechanism she describes might be expected to operate much in the manner described here when confronted with any vertical sonorities which may qualify acoustically and psychoacoustically as "nondiatonic consonances."

THE EMPIRICAL VERACITY OF ROOTS... AND OF ROOTLESSNESS

The empirical perspectives discussed to this point, although not all dealing with "dissonance" or "instability" solely from the standpoint of "roughness," have not directly addressed "rootlessness," which is also a consideration.

The first thing we must consider in a scientific discussion of "harmonic roots" is whether anything specific distinguishes the harmonic content of a single pitch from the harmonic content of a group of pitches. From an acoustic standpoint, there is essentially no distinction; acoustics can be used to describe the mathematical prominence of certain harmonic relationships over others within a particular pattern of atmospheric oscillation, but such relationships cannot be assumed to translate directly into sensations of pitch (or, for that matter, absence of pitch). Pitch, after all, is not an acoustic phenomenon. Pitch is a learned construct (though, perhaps a very easily learned construct) which is used to ordinally categorize sounds by their perceived "height." Depending on culture and musical context, acoustic pitch regions may overlap, much as the acoustic regions of phonemes may sometimes overlap depending on language and syntactic context. Moreover, the question of "root" cannot be an acoustic question because a "root" is a pitch (or pitch class); something we must understand as extracted aurally from an acoustic event, which is partially derived, but different from a simple physiological response to sound.46

The second thing we must consider is whether there is an absolute perceptual distinction between pitches and harmonic components of single pitches. There is not.47

While it may be normally unnecessary to recognize or acknowledge while listening to music, there is a musical "no man's land" of sorts where pitch and timbre intermingle; where organ pipes at an interval of a twelfth may fuse into a harmonically rich single pitch or may be heard as separate voices momentarily positioned at a twelfth through the interaction of two highly dissimilar melodies; where a "single pitch" played on a solo gong may either decay into a registrally disparate "out of tune" dyad (and perhaps even larger pitch collections) or may retain a special pitch identity in aural comparison to several other harmonically similar gongs which provide the context in which the gong in question is to be musically understood.

Presumably many of the neurological processes which are used to organize raw frequency information into pitch information can also be used to reorganize pitch information into higher-order pitch information, or may be reorganized with the object of identifying a different number of pitches. The third thing, then, which we must consider is whether (or perhaps to what extent) the ability of groups of frequencies to elicit pitch salience in a significant number of listeners extends beyond the "no man's land" of pitch/timbre ambiguity; whether (or to what extent) groups of cognizable pitches, rather than frequencies, can also be expected to elicit a higher order of pitch salience (that is, a main or referentially salient pitch; a "root"). If this were not the case to any significant extent, if "roots" did not have any useful empirical validity, then the question of "rootlessness" would also be irrelevant to a discussion of empirically verifiable harmonic instability and dissonance (at least as "dissonance" has been shown here to pertain to the augmented triad in traditional music theory).

46 Ian Taylor & Mike Greenhough, "Modelling Pitch Perception with Adaptive Resonance Theory Artificial Neural Networks", in Niall Griffith & Peter M. Todd, eds., Musical Networks: Parallel Distributed Perception and Performance (Cambridge, MA: Bradford Books, The MIT Press, 1999), p. 1. Reprinted from: Connection Science 6 (1994): 135-154.

47 Adrian J. M. Houtsma, "What Determines Musical Pitch?", Journal of Music Theory 15/1 (1971): 138; Gerald J. Balzano, "What Are Musical Pitch and Timbre?", Music Perception 3/3 (1986): 297-314.

But the "root" does have some basis in quantifiably normal perception. Roy D. Patterson (1990) explains a series of experiments in which pure tones of harmonically related frequencies were used in simultaneous combination to elicit aggregate pitch-identification responses in participants.48 Notably, Patterson defines these frequencies as harmonics, rather than individual pitches, in spite of the fact that their waveforms were in many cases out of phase. More notably, participants were found to report single-pitch responses regardless of the phase relation of the frequencies.

The result was that listeners preferred as the salient pitch class of the simultaneous tones the pitch-class of the mathematically implicit fundamental frequency of which the tones represented harmonics (when in phase). But participants showed no specific pattern in assigning a specific octave position of this pitch-class; while salient as a pitch-class, it was not reliably manifested as a specific pitch. This finding supports the hypothesis of the existence of "roots" of at least some types of chords under some conditions, but only to the extent that we ignore the distinction between chord tones and harmonic partials. But this finding neither supports a clear bass positioning of the "fundamental bass", nor is it necessarily pertinent to musically common waveforms, most of which diverge markedly from such simple harmonic profiles as those used in Patterson's experiments.
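The notion of a "mathematically implicit fundamental" can be made concrete with a small Python sketch. It assumes, purely for illustration, that the simultaneous pure tones have whole-number frequencies in Hz, so that their implied fundamental is their greatest common divisor. The example frequencies and function names are hypothetical and are not Patterson's stimuli.

```python
from functools import reduce
from math import gcd, log2

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]


def implied_fundamental_hz(partials_hz):
    """Greatest common divisor of integer partial frequencies, in Hz."""
    return reduce(gcd, partials_hz)


def pitch_class_name(freq_hz, a4_hz=440.0):
    """Nearest equal-tempered pitch class, ignoring octave position."""
    semitones_from_a4 = round(12 * log2(freq_hz / a4_hz))
    return NOTE_NAMES[(9 + semitones_from_a4) % 12]


tones = [660, 880, 1100]              # harmonics 3, 4 and 5 of 220 Hz
root_hz = implied_fundamental_hz(tones)
print(root_hz, pitch_class_name(root_hz))
```

Because `pitch_class_name` discards octave information, it returns the same answer for 110, 220 and 440 Hz, which parallels the finding that the implied fundamental was salient as a pitch-class but not as a specific pitch.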

48 Roy D. Patterson, "The Tone Height of Multiharmonic Sounds", Music Perception 8/2 (Winter 1990): 203-214.

The observation that perception of pitch height is not completely invariant across listeners and timbres is not confined to Patterson. Diana Deutsch has repeatedly shown that simple timbres exist in which listeners show idiosyncratic regional preferences for the ranking of pitch height (usually hearing one continuous half of the pitch-class spectrum as always higher than the other in these timbres).49 It may be inferred from Patterson and Deutsch that, in simple timbres at least, listeners cannot be expected to pick out any of an augmented triad's three competing "roots" as being the lowest, or the strongest, although the possible roots resulting from the major (and minor) triads do not compete in this way for listener attention, and give rise much more readily to a prevailing pitch of reference.

This inference is also highly consistent with Terhardt's estimates of root salience for the various triad types, and with Richard Parncutt's refinement of same.50

Thus, while it is understandable that one might tentatively project specific basses or roots for specific augmented triads systematically from the standpoint of perception, it should also be understood that gross error in making such projections on the behalf of other people is not an impossibility; that even normal listeners may have idiosyncratic biases which will interfere with understanding some of the triads as having the specific basses or roots which we project under any but the best understood timbral conditions.

What this means for the augmented triads, which lack many of the audible morphological clues to root which the major and minor triads have (such as chroma asymmetry, and the presence of fourths or fifths; intervals corresponding to low components of the natural harmonic series, and with a stronger hierarchical psychoacoustic inter-orientation), is that while individual listeners may be able to recognize specific roots for specific augmented triads under specific timbres, these roots are probably less easily derived, and definitely less consistent from chord to chord and from listener to listener than are the roots of major and minor triads. Moreover, whether or not we hear the augmented triad as having a root, it has no root to speak of, and any root we may assert it to have cannot be depended upon to carry meaning in musical communications.

49 Diana Deutsch, "A Musical Paradox", Music Perception 3/3 (1986): 275-280.

50 Richard Parncutt, "Revision of Terhardt's Psychoacoustical Model of the Root(s) of a Musical Chord", Music Perception 6/1 (1988): 65-95.
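The lack of "chroma asymmetry" can be shown with a short Python sketch. It treats chords as pitch-class sets, lists the transpositions that map a set onto itself, and lists the intervals found above each candidate root. The function names are hypothetical, and the sketch says nothing about timbre or register, which the discussion above shows to be important.

```python
def transpose(pcs, n):
    """Transpose a pitch-class set by n semitones, modulo the octave."""
    return frozenset((pc + n) % 12 for pc in pcs)


def self_mapping_transpositions(pcs):
    """Transposition levels (0 to 11) at which the set maps onto itself."""
    original = frozenset(pc % 12 for pc in pcs)
    return [n for n in range(12) if transpose(original, n) == original]


def intervals_above(pcs, candidate_root):
    """Pitch-class intervals of every chord member above a candidate root."""
    return sorted((pc - candidate_root) % 12 for pc in pcs)


AUGMENTED = (0, 4, 8)
MAJOR = (0, 4, 7)

for chord in (AUGMENTED, MAJOR):
    print(chord, self_mapping_transpositions(chord),
          [intervals_above(chord, root) for root in chord])
```

Every member of the augmented triad has the same intervals above it, so interval content alone gives no member priority as a root, whereas each member of the major triad has a different interval pattern above it.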

In summary, the questions of the "roughness" and "rootlessness" of the augmented triad have already been somewhat addressed by scientific research.

The relative acoustic or psychoacoustic "roughness" of the augmented triad, like that of any musical sound, cannot be properly assessed without complex considerations of registration, intonation and timbre; considerations which traditional music theory has treated as, at best, secondary to pitch content and pitch relations.

The question of "rootlessness," though also complex, allows a greater degree of resolution between traditional music theory and perceptual science. All else being equal, listeners will tend to have more trouble hierarchically organizing the pitches of an augmented triad than the pitches of a major or minor triad.

One thing which has not been addressed is the question of what effects the comparative perceptions of the roughness of augmented triads and major and minor triads have on their perceived stability ranking in contexts where "roots" have somehow been rendered a lesser criterion of stability.

While it may not be possible to consider all possible timbral, registral and pitch-set aggregation situations, or to construct scenarios in which root relevance is nullified with any certitude, the experiment which I have conducted was intended to touch on some of the considerations of instability in isolation from clear tonal contexts in which "roots" are generally considered to be most grammatically relevant.

Before this, though, I should like to present the case of Hermann Helmholtz as a further cautionary example, illustrating with particular poignancy the general state of poor understanding which continues to surround the augmented triad, even as some of the complicated questions of music perception and cognition have been well answered.

Chapter 4: The Special Perspective of Helmholtz

Perhaps the most telling treatment of the augmented triad by a well-recognized music theorist is that of Hermann Helmholtz. Many regard Helmholtz as more of a scientist than a music theorist, and not without reason.51 His contributions to several fields of scientific knowledge are admirable, and perhaps indispensable in modern life. Much of his work in music theory is greatly informed by, and extends from, the study of acoustics and of the human auditory mechanism.

In his treatise On the Sensations of Tone, Helmholtz explains the percept of dissonance largely as a function of acoustic "roughness," or the mathematical degree of mismatch between any two harmonically complex sound waves with different fundamental frequencies. Helmholtz's estimates of roughness refer to the natural overtone series, and his recognition of dissonant sounds usually considers their meaning in tonal vocabulary only secondarily to roughness. But his treatment of the augmented triad represents a departure from his more normal mode of discourse.

The triad C E Ab... is very instructive for the theory of music. It must be considered as a dissonance, because it contains the diminished Fourth E Ab, having the interval ratio of 32/25. Now this diminished Fourth E Ab is so nearly the same as a major Third E G# that on our keyed instruments, the organ and pianoforte, the two intervals are not distinguished... On the pianoforte it would seem as if the triad, which for practical purposes may be written either C E Ab or C E G#, must be consonant, since each one of its tones forms with each of the others an interval which is considered as consonant on the piano, and yet this chord is one of the harshest dissonances, as all musicians are agreed, and as anyone can convince himself immediately. On a justly intoned instrument [as the Harmonical] the interval E Ab is immediately recognised as dissonant. This chord is well adapted for shewing (sic) that the original meaning of the intervals asserts itself even with the imperfect tuning of the piano, and determines the judgement of the ear.52

51 Richard M. Warren, "Helmholtz and His Continuing Influence", Music Perception 1/3 (Spring 1984): 253-275

52 Helmholtz, op. cit., p. 213

Helmholtz's statement that the ratio of 32/25 is dissonant is apparently an appeal to a monophonist perspective, in that it implies that more complex ratios are generally more dissonant (a stance which is not entirely consistent throughout his treatise), but he abandons this approach almost as quickly as he takes it. He admits that the irrational ratios of the piano may also sound consonant, and that the diminished fourth's homophone, the major third (as he defines it), is among these intervals.
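The sizes of the intervals at issue can be made concrete with a short calculation. The following is a minimal sketch and not part of Helmholtz's text or of the experiment reported later; the function and variable names are illustrative assumptions. It shows that if C-E and Ab-C are both tuned as just major thirds (5/4), the interval left between E and Ab is 32/25, and it compares both with the equal-tempered third of the piano.

```python
from fractions import Fraction
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

just_third = Fraction(5, 4)                         # C-E and Ab-C in just intonation
diminished_fourth = Fraction(2) / just_third ** 2   # what remains of the octave: E-Ab
tempered_third = 2 ** (1 / 3)                       # equal-tempered major third

print(diminished_fourth)                            # 32/25
print(round(cents(just_third), 2))                  # 386.31
print(round(cents(tempered_third), 2))              # 400.0
print(round(cents(diminished_fourth), 2))           # 427.37
```

On these figures the piano's third lies about 14 cents above 5/4 and about 27 cents below 32/25, which is why the keyboard cannot distinguish the two spellings.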

Through the claim that all musicians are in agreement about the dissonance of the chord containing the equally tempered augmented fifth, and that anyone can immediately convince himself of this, Helmholtz appeals to the judgement of trained musicians, rather than to acoustic principles, or (more importantly) to the "sensations of tone"; the physiological phenomena that are the explicit subject of his book.

Helmholtz's statement that the "...original meaning of the intervals asserts itself..." would not necessarily be the only possible explanation for his observation that "all musicians agree." Even if this were true, Helmholtz at no point establishes that musicians of his time were in agreement as to which intervals were the original ones, or, therefore, what the original "meanings" of other intervals were.

As Loosen (1995) shows, the claim that musicians universally prefer the "original" intervals is simply not supported through scientific investigation.53 If pianists tend to hear the original intervals as less meaningful than the intervals that Helmholtz suggests refer to them, Helmholtz's claim, if applied today, would seem to imply that hearing the original intervals actually interferes with hearing the original meanings. Helmholtz did not have the benefits of Loosen's work to refer to as we do now, but we cannot conclude, simply from Helmholtz's assessment of the augmented triad, that the intonational preferences of pianists were notably different in his time. Without this conclusion, it is difficult to accept Helmholtz's appeal to "original meanings."

53 Loosen, op. cit., pp. 291-306

It should be noted that Helmholtz makes a point of deliberately treating triads from more of a perceptual/psychoacoustic than a cognitive/tonal-syntactic standpoint:

... we shall in this section deal exclusively with the isolated effect of the chord in question, quite independently of any musical connection, mode, key, modulation, and so on. 54

We cannot, therefore, dismiss Helmholtz's misleading statements as stemming from a misunderstanding of context, in which tonal grammar is actually intended to be considered relevant to dissonance.

Although considerable portions of Helmholtz's otherwise thorough treatment of tonal sensation do include careful discussion of the effects on perception of consonance and dissonance by different timbres and systems of intonation, it is noteworthy that the specific dissonance of the augmented triad receives no such qualifications. Moreover, while the dissonance of the very interval which Helmholtz uses to define the augmented triad as dissonant is at least sometimes subject to qualification by him, the chord which is defined by this interval is not. It follows that his assessment of the dissonance of the augmented triad, being stronger than his assessment of the dissonance of the interval which defines this chord as such, must depend upon something other than, or in addition to, the presence of the interval itself; something which, in addition to being (implicitly) immune to changes of timbre or intonation, Helmholtz has asserted is not a function of "musical connection, mode, key, and so on"; something which Helmholtz fails even to mention at any point, other than to state that practically no one can fail to hear it.

54 Helmholtz, op. cit., p. 211

Ironically, the dissonance which Helmholtz heard in the augmented triad of the piano may have resulted not from the original meaning of 32/25 asserting itself, but from the intermediate meaning of the cube root of two (the equal-tempered major third or diminished fourth) failing to assert itself. Although Helmholtz treats the mechanics of the piano and other instruments extensively in the same volume from which comes his view of the augmented triad, he has failed to recognize that what is true of consonance and dissonance on the piano cannot be assumed to be true of consonance and dissonance on other instruments. While his calculations of roughness assume timbres involving a natural harmonic series of arithmetic intervals, the harmonics of the piano perceptibly do not conform to this model.55 The piano therefore cannot be consistently depended upon in order to verify the perceptual reality of roughness assessments. Moreover, because the intonation of pianos accounts somewhat for string inharmonicity by favoring the smoothness of octaves and perfect fifths,56 the major thirds (and their inversions, enharmonic equivalents, etc.) are particularly rough and poorly representative of timbrally normal equivalents under equal temperament, much less timbrally normal equivalents at ratios of 64/25 (to say nothing of ratios like 8/5).
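The departure of piano partials from an arithmetic harmonic series can be illustrated with the standard stiff-string approximation, in which partial n lies at n times the nominal fundamental multiplied by the square root of (1 + B n^2). This is a sketch under stated assumptions: the formula is the usual textbook model rather than one quoted from the sources cited here, and both the inharmonicity coefficient `B` and the reference pitch `f0` below are hypothetical values chosen purely for illustration.

```python
from math import log2, sqrt

def partial_frequency(n, f0, B):
    """Frequency of partial n of a stiff string (textbook approximation)."""
    return n * f0 * sqrt(1 + B * n ** 2)

def sharpening_in_cents(n, B):
    """How far partial n lies above the ideal harmonic n * f0, in cents."""
    return 1200 * log2(sqrt(1 + B * n ** 2))

B = 0.0004      # hypothetical inharmonicity coefficient, for illustration only
f0 = 261.63     # approximate frequency of C4 in Hz (assumed reference pitch)

# Print partial number, frequency in Hz, and deviation from the ideal harmonic.
for n in range(1, 7):
    print(n, round(partial_frequency(n, f0, B), 1), round(sharpening_in_cents(n, B), 2))
```

Under any positive value of `B` the deviation grows with the partial number, so the upper partials that govern roughness between thirds are displaced more than those that govern octaves and fifths.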

If we do as Helmholtz suggests, and play an augmented triad on the pianoforte, we will be listening to intervals which are, at best, mere shadows of what they might be under more generalizable conditions of timbre and intonation. To compound the problem, when we play a major or minor triad on the piano, we hear only one of these manifestations of interval class 4, but when we play the augmented triad, we hear three of them.

If there is anything to be fairly concluded from observing the example which Helmholtz suggests we make, it is not that the augmented triad is a harsh dissonance, but rather that the augmented triad has a special potential to exacerbate the acoustical shortcomings of the piano in illustrating the harmonic properties of other instruments. Helmholtz's conclusion regarding the augmented triad may not be wrong, but his means of reaching this conclusion should draw some suspicion upon it, particularly in that these means must be seen as strangely divergent from his normal style.

55 E. Donnell Blackham, "The Physics of the Piano", The Physics of Music. Carleen Maley Hutchins, ed. (San Francisco: W.H. Freeman and Company, 1978), p. 24

56 Rudolf A. Rasch & Vincent Heetvelt, "String Inharmonicity and Piano Tuning", Music Perception 12/3 (Winter 1985): 171

In summary, unless we are willing to permanently append the timbre and standard intonation of the piano to music theory as the very definition of musical sound (as seems already to have been done in some circles), we must reject Helmholtz's evaluation of the augmented triad as flawed from several standpoints:

• Helmholtz appeals to the judgements of musicians to support the dissonance of the augmented triad, although he uses acoustic principles to support the consonance of other triads.

• Helmholtz does not explain the mechanism by which the ratio 32/25 is heard when an equal-tempered major third is played on the pianoforte (or, for that matter, what would prevent one from mentally tempering 32/25 to function as 5/4 in the first place).

• Helmholtz does not address the fact that the equal-tempered major third is more acoustically rough than 32/25.

One thing that Helmholtz's treatment of the augmented triad shows is that even the more meticulously scientific among us may occasionally make logical oversights which allow a resolution between scientific observation and ingrained cultural belief; that even with the benefits of technology, mathematics, and logic, there may be a temptation to believe that it is natural or normal to hear what we have learned to hear through the music of our specific cultures. Moreover, Helmholtz shows that musical practice has given the augmented triad cultural disadvantages which may hinder our clear understanding of it from acoustic and psychoacoustic standpoints.

Chapter 5: Summary of Experiment

In the preceding pages, it has been my intention to demonstrate that although music theorists have considered the augmented triad dissonant primarily because of its function in the musical idioms they have treated, they have also made appeals to acoustics and to perception; appeals which may not be indefinitely sustainable through scientific investigation. I have also attempted to show that the current state of scientific inquiry has already given rise to some degree of doubt as to the general validity of these appeals.

What I have not done to this point is to demonstrate that one may have specific cause to reject the claim that the greater level of dissonance of augmented triads (as compared with major and minor triads) is as much a fact of acoustics or psychoacoustics as it is a fact of tonal grammar. I shall now attempt to demonstrate that claims to the psychoacoustic instability of the augmented triad are weak (in comparison to claims of cognitive instability), by explaining the design, implementation and results of my own scientific inquiry.

The experiment attempted to provide an empirical perspective on the perceptual harmonic stability ranking of some simple musical pc-set classes (3-11 and 3-12) under several related types of registration.57

The treatment variable consisted of external definitions of stability ranking.

The data provided a partial numerical assessment of whether and to what extent such external definitions may affect later perceived stability ranking of these pc-set classes; also, whether one particular set class is preferred by two groups of participants as more stable in any post-test ranking which may be found.

57 The term "stability" is used here because nonmusicians (including many of my experimental participants) are frequently unaware of the term "consonance," and, as I have explained, the "dissonance" of the augmented triad may be attributed to its "instability," whether this instability may be further attributed to "roughness" or "rootlessness."

Participants were told, and given a form describing, the maximum length of time (fifteen minutes) during which their presence would be required, what type of task they would be asked to perform, the way in which they would be compensated, and the reason for such extremely limited compensation ($1 Cdn).58

Also included in this part of the form was a statement of confidentiality.

Participants were divided into two groups by being given response forms from a stack alternating between two types (about which more later). They were asked to provide their sex, age, primary language, and a combined estimate of their musical skills (i.e., below average, about average, above average). They were instructed not to look at the back of their forms until asked to do so. They were told that, because the experiment was primarily intended to test only one assumption about normal music perception, and used limited and specialized stimuli, the perceived correctness or incorrectness of their responses should not be considered any measure of musical ability, which is a complex and varied aspect of the human mind; that they should proceed in the experiment with the utmost confidence. They were also told that attempts to undermine the experiment by giving dishonest responses would probably not work, as the frequency of such attempts could be assumed to be the same in both treatment groups, and that this would effectively cause them to cancel each other out over a reasonable number of participants.

In each section of the experiment (pretest, treatment, post-test), participants were told that they would hear six audio examples of approximately fifteen seconds in length, at least five seconds of which would mask most, if not all, background noise in intensity, and none of which would damage their hearing. Experiments were conducted where the test stimuli clearly exceeded mean background noise as reported by participants, and considerable effort was made to provide a quiet listening environment. Participants confirmed signal to noise estimates with the understanding that they would be compensated for their effort, even if the listening environment was not sufficiently quiet to continue with the experiment. Optimum listening conditions were not always available, but a balanced level of noise between treatment conditions was attained by simultaneously testing one participant from each group.59

58 The explanation for limited compensation was "lack of special funding."

Each example consisted of two repeating pitches and two alternating pitches, such that four pitches were used in each example, with three pitches being present at any given point.

The following example notates the audio signal. The upper staff states the alternating pitches A3 and Bb3, while the lower staff states the repeating pitches D4 and F#4. Three pitches, either {A3, D4, F#4} or {Bb3, D4, F#4}, are always present.

An electronic tone was used which is both easily replicable and notably similar to the tone of a familiar group of instruments, the clarinet family.60

59 Although there was no reason to be concerned that the test stimuli would induce seizures of any kind, my personal contact with urban folklore suggested that the stimuli were of a type which some people may believe are likely to induce seizures. Therefore, in order to minimize perceived connection between the test and any subsequent seizures, anyone who would be exposed to the stimuli was first required to sign a statement that they had never had an epileptic or other sensory-induced seizure. Those who signed were assured that no one had yet suffered a seizure subsequent to exposure to the test stimuli, and the motive for collecting their signatures was stated as clearly and openly to them as it has been here.

60 The clarinet family, lacking in many normal overtones, must not be understood as accurately representing the entire universe of musical timbre, but has nonetheless been subject to the same rules of harmony as have the other instruments, and is therefore relevant to a discussion of these rules as demonstrating or failing to demonstrate the rudimentary interaction of tones. For this reason, although the clarinet sound may not be useful in establishing general rules, it is a valid choice if used (as it is here) only to show exceptions to general rules. A more complete discussion of timbral choice will be provided in the pages to come.

Examples were ordered such that no common pitches occurred between test examples, and such that no specific sequential property could be easily concluded from the order of pitch-sets between the complete set of examples.

Examples were of a continuous nature, and were faded in and out manually by the experimenter, creating no specific beginning or ending point. The rhythmic content of the examples was also designed to minimize any sense of beginning or ending; repeating and alternating pitches occurred at metrically dissimilar time intervals (211 milliseconds and 241 milliseconds), thus minimizing the potential for perceptual grouping through real or apparent temporal co-articulation.

Participants were told that, in each example, one of the alternating pitches represented a more stable pitch, and that the other alternating pitch was a less stable pitch; that the more stable pitch could be either the higher or lower of the two, depending upon the specific example.

During the pretest segment, participants heard six examples and were asked to decide, after each example, whether the higher or lower of the alternating pitches was the more stable pitch of the two, and to mark their decisions on their response forms. The participants were not asked to compare the sonorities resulting from the alternation of tones, as this type of explanation was considered to refer too strongly to approaches to hearing in which many of them may have been cognitively preconditioned. But, because any difference in stability of sonorities would have to be attributable to the difference in the stability of the alternating pitches as they interacted harmonically with the repeating pitches, the participants were also concomitantly judging the stability of the resultant sonorities.61

61 One concern about this approach to the problem was that contour may be a factor in these types of stability judgements. While some individuals may have a bias for hearing a higher/lower tone, or an inner/outer pitch set member as more stable, it was decided to address this problem only if the data showed this to be significant for a larger group. They did not.

During the treatment segment, participants were referred to the alternate side of their response forms. Participants heard another set of six similar examples, and were asked to read and copy unchanged (that is, to re-mark), in each case, a statement on side two of their forms which defined either the higher or lower pitch of the example as the more stable pitch. Both groups of participants received this treatment, but between the two preassigned groups, the definitions of more stable pitch and less stable pitch were completely opposite. That is, treatment group 1 was always required to mark the word choice ("upper" or "lower") which corresponded in that example to the pitch class contained in 3-11, and treatment group 2 was always required to mark the choice which corresponded to the pitch class contained in 3-12. This activity was intended to assure that they had properly read the statement which was provided for each example, and to condition them (at least somewhat) to be more accepting of the accuracy of this statement.

During the posttest segment, participants again heard six examples, again were asked to decide whether the higher or lower pitch was the more stable pitch, and again were asked to mark their decisions on the form.

Forms were collected, and participants were debriefed, thanked, and each offered a $1 Canadian coin in compensation.

Improperly completed response forms were removed from the sample pool. Improper responses included failure to provide participant data, circling of more than one response per example, and failing to circle any response for an example. One response form was removed from the sample pool because the participant indicated that his primary language was Neanderthal. Remaining data were tabulated and examined in reference to the treatment variable and to the other variables provided.

Total number of viable response forms: 73

Chapter 6: The Stimuli

In order to produce relatively consistent, expressively neutral stimuli, an Apple Macintosh PowerBook 520 was used in conjunction with the Opcode flowcharting program MAX,62 to produce MIDI (Musical Instrument Digital Interface) signals lacking in any appreciable musical nuances, such as variances of intonation or loudness.63 The MIDI signals were fed, via an Opcode MIDI translator, to a Roland SC-50 sound module, which produced an audio signal that was then split to two identical sets of Optimus 417 stereo headphones. The same two earphone units were used in the collection of all data not excluded from primary analysis. Before use, all the audio test stimuli were fed at maximum output through these earphones, and all four speakers were measured with two types of decibel meters to assure that the signal could not exceed 85 dB at any point. Although some effort was made to provide and to verify consistently quiet listening environments, room acoustics, background noise, and related possible confounding variables were considered not to be significant in terms of creating differences in stimuli between treatment groups, because experimental participants were always tested two at a time; one from each treatment group.

All audio examples consisted of artificial clarinet tones generated by the SC-50. Signals were center-panned and received no extraneous processing such as the SC-50's inboard reverb or chorus effects. The volume display of the SC-50 was kept at 127, the maximum. The clarinet sound was chosen from all tones available on the SC-50 for three reasons:

1) It notably resembles a common acoustic timbre.

2) It is a relatively simple waveform, close to a square wave, making it somewhat replicable with different equipment.

3) Through examination and discussion of various timbres with several people of varying musical tastes, I found that the clarinet sound was the least disagreeable of those available overall, and might therefore facilitate a more uniform level of listener attention across participants and over time.

62 A concise, nonpromotional explanation of MAX's basic features is provided by Robert Rowe in Interactive Music Systems (Machine Listening and Composing). (Cambridge, MA: The MIT Press, 1993).

63 A graphic model of the test engine used in this experiment is provided in the appendices of this thesis.

There are three notable disadvantages to the clarinet sound:

1) Because it is neither a perfect squarewave, nor an entirely convincing clarinet sound, it is not 100% replicable by using different electronic equipment, or by using actual clarinets.

2) In its resemblance to both acoustic and electronic timbres, it is deficient in much of what are often said to be normal timbral components (i.e., harmonics corresponding to the natural overtone series); conversely, it is also much more complex than a simple sine wave, and therefore fails to represent pitch relations in isolation from the distractions of timbral content.

3) Each tone begins with a soft attack transient, providing the listener with additional unnecessary information of dubious representative value.

Something can be said, however, in defense of the use of the clarinet sound in this experiment:

1) No two acoustic instruments would likely share acoustic signatures as similar as those of two SC-50's, or be able to repeatedly produce such signatures without any detectable variation. Traditional definitions of consonance and dissonance were well established long before any electronic timbres existed which would be likely to complicate such definitions from the standpoint of perception, and there is no reason to assume that what is perceptually true of acoustic consonance and dissonance will automatically carry over into electronic timbres. Therefore some type of compromise between the acoustically simple (electronic squarewave) and the musically normative (an actual woodwind sound) seems reasonable.

2) Traditional theories of pitch structures do not take timbre into account, and apply the same rules of consonance and dissonance both implicitly and in practice to a broad variety of timbres, many of which contain harmonics audibly diverging from the natural overtone series. Conversely, simpler timbres such as pure sine waves are often not particularly rough from an acoustic standpoint, even when they are presented at an interval of a tritone,64 the dissonance of which is often considered to be at the very core of tonal perception.65 Therefore, the experiment shows higher timbral constraint than traditional theory, and yet considers at least some of its salient timbral considerations (i.e., the function of the overtone-derived perfect twelfth in determining dissonance between two tones, such as in tritone-relations) pertinent to common musical contexts.

3) An aesthetically suitable timbre was not found on the SC-50 which lacked some type of complex attack transient. At some point in the equipment selection process, time limitations had to be accepted as possible determinants of final selection.
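The timbral trade-off described in these lists can be sketched numerically. An ideal square wave contains only odd-numbered harmonics, with amplitudes falling off as 1/n, so it omits the even members of the natural overtone series while retaining the third harmonic, a perfect twelfth above the fundamental. The code below illustrates that idealized case only; it is an assumption that the SC-50 clarinet tone approximates it, and the function names are hypothetical.

```python
from math import log2

def square_wave_partials(count):
    """First `count` partials of an ideal square wave as (harmonic number, relative amplitude)."""
    return [(n, 1 / n) for n in range(1, 2 * count, 2)]

def interval_above_fundamental(n):
    """Interval of harmonic n above the fundamental, in cents."""
    return 1200 * log2(n)

# Print harmonic number, relative amplitude, and height above the fundamental.
for n, amplitude in square_wave_partials(4):
    print(n, round(amplitude, 3), round(interval_above_fundamental(n), 1))
# Harmonic 3 lies about 1902 cents up: an octave plus a just perfect fifth.
```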

In all 18 examples, participants heard two alternating pitches and two repeating pitches (as shown before).

The repeating dyad in each case recurred at a time interval of 241 milliseconds, and the alternating pitches occurred at equal time intervals of 211 milliseconds (both prime numbers). The time intervals were selected to be between 0.20 and 0.25 seconds for three reasons: durations between 0.20 and 1.00 seconds are considered to represent a normal rate of human speech;66 these are also considered to be normal musical durations;67 and this range was considered by myself and others consulted to allow the alternating pitches to be heard as separate, clearly defined events of dissimilar harmonic stability, while still allowing a sufficient number of alternations per example that a listener might be expected to make a conclusive stability comparison.

64 Gregory Danner, op. cit., pp. 103-122, see graph

65 Gordon D. McQuere, "The Theories of Boleslav Yavorsky", Russian Theoretical Thought in Music. Gordon D. McQuere, ed. No. 10 from Russian Music Studies. Malcolm Hamrick Brown, series ed. (Ann Arbor, MI: UMI Research Press, 1983), p. 113, referring to Yavorsky, "Konstruktsiia melodicheskogo protsessa", in Struktura melodii (Moscow: Gos. akad. khud. nauk [sic], 1929), pp. 7-36

The idea of using a fixed, non-rearticulating dyad was rejected to eliminate the possibility that the quietly humming "back-end" of the clarinet sound would be partially or completely masked at some point by background noise. Moreover, the recurrent attacks in the dyad were intended to match the dyad and alternating pitches not only in volume, but also in timbre, which would be expected to make all three tones more "musically pertinent" to each other than they might otherwise seem.

66 Stephen Handel, op. cit., p. 439, referring to:

B. Hayes, "The Phonology of Rhythm in English", Linguistic Inquiry 15/1 (1984): 33-74

E. O. Selkirk, Phonology and Syntax: The Relation Between Sound and Structure (Cambridge MA: MIT Press, 1984)

67 Ibid.

The number of milliseconds selected for the duration of each event was based on the intention to make the repetition and alternation so asynchronous that real or perceived coarticulation of one or both alternating pitches with the repeating dyad would not result in specific extraneous musical associations, and/or any type of on-beat/off-beat biases in regard to perception of consonance and dissonance, be they innate or acquired. Toward this perhaps not fully attainable goal, durations inside the 0.20 to 0.25 second range were selected which could both be represented with prime numbers (of milliseconds), so that actual coarticulation would not occur more than once per example (if at all), and that the simple ratio approximated between repetition and alternation (4:5) would not easily allow either alternating pitch to be heard as closer to the beat for more than one second at a time.
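The effect of choosing two prime durations can be checked with a few lines of arithmetic. This sketch assumes, purely for illustration, that both streams are ideal pulse trains that coincide at time zero; the names are hypothetical, and the test engine itself was built in MAX rather than in Python. Because 211 and 241 share no common factor, simultaneous attacks recur only every 211 x 241 milliseconds, which is longer than any single fifteen-second example.

```python
from math import gcd

ALTERNATION_MS = 211     # interval between attacks of the alternating pitches
REPETITION_MS = 241      # interval between attacks of the repeating dyad
EXAMPLE_MS = 15_000      # approximate length of one example

def coincidence_period(a, b):
    """Time between simultaneous attacks of two pulse trains (least common multiple)."""
    return a * b // gcd(a, b)

def coincidences_within(window_ms, a, b):
    """Times in [0, window_ms) at which both trains attack together, assuming both start at 0."""
    period = coincidence_period(a, b)
    return list(range(0, window_ms, period))

print(coincidence_period(ALTERNATION_MS, REPETITION_MS))                # 50851
print(coincidences_within(EXAMPLE_MS, ALTERNATION_MS, REPETITION_MS))   # [0]
```

Since the period (about 50.9 seconds) exceeds the length of an example, any fifteen-second window can contain at most one exact coincidence, wherever that window happens to fall.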

Because the examples each faded in and out from and to a 0 dB level ("silence"), and because of such variables as background noise, the unsteadiness of the experimenter's hand, and the ongoing prime-number relations between events continuing unheard between examples (inside the computer), a situation was created in which neither alternating pitch would be privileged as a perceptual "starting point" (or ending point). The resulting "beginning," even if consistent across an unlimited number of participants within one presentation of the testing procedure (which is doubtful), would not necessarily be consistent across presentations. However, the careful use of alternately assigned even numbers of participants in each presentation allowed equal numbers of both treatment groups to be represented as having experienced any such effect of "clear beginning."

The choice to use the shorter of the two durations for the alternating tones was made for three reasons: in order to facilitate the recognition of alternating tones as relatively "foreground" objects, to allow the greater number of alternations per example and therefore a larger number of comparison opportunities, and to allow attacks of the repeating dyad to be a bit more closely matched by attacks of the alternating pitches; the reverse durational ratio creates a greater duration of metrical association by creating something closer to a steady repetition of "up-down-up-down" against the relatively short five dyad attacks, as compared to "up-down-up-down-up-down-up-down-up-down" against the relatively longer eight dyad attacks. In other words, when the alternating tones are the quicker, their metrical placement is more likely to be heard as reversed twice every two seconds, rather than reinforced every second until affected by the inexactitude of the 4:5 ratio.

During the pretest segment of the experiment, participants were asked to choose in six examples between the upper and lower of two alternating tones as sounding more harmonically stable. In order to assure confidence of response among participants lacking a clear understanding of the expression "harmonic stability," numerous synonyms and antonyms were verbally provided before participants were subjected to the stimuli. The synonyms included "relaxedness," "resolution," and "smoothness." The antonyms included "tension," "conflict," and "roughness."

Each pretest example consisted of a member of the pitch-class set type 4-19 (0,1,4,8),

such that pc's represented by 0 and 1 were the alternating pitches. The resultant two

verticalities in each case were, therefore, manifestations of the trichord types 3-12 and 3-11: respectively an augmented triad and either a major or minor triad (depending on inversion), in either "5/3," "6/3," or "6/4" voicing, all triads being "close-voiced." The exact pitch collections used in each example are notated in the example below such that the

alternating and repeating tones appear as simultaneities. This is done only in order to

avoid giving the impression of a specific order between alternating pitches, and not to

create the impression that alternating tones occurred simultaneously at any point.

[Notated example: pretest examples 1 to 6, each showing the alternating tones and the repeating tones as a simultaneity.]
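The derivation of the two verticalities from set type 4-19 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, pitch classes are treated as integers mod 12, and inversion is taken as x -> (n - x) mod 12, which is a common convention but an assumption here.

```python
# Prime form of set type 4-19 as given in the text; pcs 0 and 1 alternate.
PRIME_4_19 = (0, 1, 4, 8)
ALTERNATING_POSITIONS = (0, 1)  # positions of the alternating pcs in the prime form


def invert(pcs, n=0):
    """TnI under the assumed convention x -> (n - x) mod 12."""
    return tuple((n - pc) % 12 for pc in pcs)


def triad_quality(pcs):
    """Name a three-note pc collection by trying each member as the root."""
    members = {pc % 12 for pc in pcs}
    names = {(0, 4, 8): "augmented", (0, 4, 7): "major", (0, 3, 7): "minor"}
    for root in sorted(members):
        shape = tuple(sorted((pc - root) % 12 for pc in members))
        if shape in names:
            return names[shape]
    return "other"


def verticalities(tetrachord):
    """Pair each alternating pc with the two repeating pcs and name the result."""
    alternating = [tetrachord[i] for i in ALTERNATING_POSITIONS]
    repeating = [pc for i, pc in enumerate(tetrachord)
                 if i not in ALTERNATING_POSITIONS]
    return [triad_quality(repeating + [alt]) for alt in alternating]


print(verticalities(PRIME_4_19))          # ['augmented', 'minor']
print(verticalities(invert(PRIME_4_19)))  # ['augmented', 'major']
```

In this sketch the non-inverted form pairs the augmented triad with a minor triad, and the inverted form pairs it with a major triad, which is one way to read "depending on inversion" above.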

The specific relations of the tetrachords used to the prime form, in the order in which they were presented, were T10I, T4, T2I, T8I, T2, T0I. There were therefore no common pitches between test examples (and between groups of 3 consecutive examples), no readily predictable pattern between inverted and non-inverted forms, and no readily predictable order of transpositions. Verticalities were also registrated so that the registral position of the alternating pitches could not be readily predicted in terms of being above or below the repeating pitches; the order of vertical positions of the alternating pitches is as follows: lowest, highest, middle, highest, middle, lowest.
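The absence of shared pitch classes can be checked mechanically. The sketch below applies the six listed operations to the prime form and intersects each example with the next. It assumes the convention TnI: x -> (n - x) mod 12 and reads "between test examples" as "between adjacent examples"; both are assumptions, and the names are hypothetical.

```python
PRIME_4_19 = (0, 1, 4, 8)

# (n, inverted) pairs for T10I, T4, T2I, T8I, T2, T0I, in presentation order.
PRETEST_OPS = [(10, True), (4, False), (2, True), (8, True), (2, False), (0, True)]


def apply_op(pcs, n, inverted):
    """Return Tn(pcs) or TnI(pcs) as a frozenset of pitch classes mod 12."""
    if inverted:
        return frozenset((n - pc) % 12 for pc in pcs)  # assumed TnI convention
    return frozenset((n + pc) % 12 for pc in pcs)


pretest_sets = [apply_op(PRIME_4_19, n, inv) for n, inv in PRETEST_OPS]

# Pitch classes shared by each example and the one that follows it.
shared = [sorted(a & b) for a, b in zip(pretest_sets, pretest_sets[1:])]

for (n, inv), pcs in zip(PRETEST_OPS, pretest_sets):
    label = f"T{n}{'I' if inv else ''}"
    print(f"{label:>5}: {sorted(pcs)}")
print(shared)  # [[], [], [], [], []]
```

Under these assumptions every adjacent pair of pretest examples is disjoint in pitch-class content. Register is not modelled here.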

In my zeal to avert audible sequential relationships, I made a design error which was not

noticed until after data collection. The result of this error is that one example in each section of the experiment is duplicated by transposition, and one of the six basic

registrations of 3-11 is not used.68 While this error makes the results of the experiment

less generalizable, it does not weaken the internal validity of the data.

A case might be made that a particularly astute listener would be able to predict the

last example (of each set of six) through deductive reasoning, but if such skills were

common enough to be statistically relevant, there would be no need for this experiment in

the first place. Even if normal listeners were capable of such deduction under optimum

conditions, the present manner of presentation is undoubtedly less than optimum in terms

of brevity of examples, their quick succession, and the competing mental tasks of stability judgement, and reading and marking the forms. (Notably, several participants volunteered

that there was no apparent pattern other than an increasing sense of familiarity.)

No experimental design is perfect, but some effort must be made to minimize the effects

of test stimuli as treatment variables. It is possible that some type of learning occurred during the pretest segment, and that this learning may have affected post-test performance.

But even if this was the case, the two groups in this experiment did not differ in this

respect. Therefore, while the experiment may not have been intended to provide completely accurate information on the way people spontaneously evaluate the "isolated" events (the individual examples; they are only "isolated" in terms of minimizing possible sequential

interpretation), the experiment should have provided accurate information on the way the act of superficially accepting the evaluations of others results in later changes to one's own evaluations, at least pertaining to the types of stimuli used, as well as showing possible changes to perception from repeated exposure to the specific type of stimuli.

68 The fact that no specific pattern emerged for the transpositionally repeating events does not mean that the results are random. One tentative explanation for this phenomenon is the tendency of listeners to attribute certain properties unevenly to the various pitch-classes, as shown by Deutsch (1986), among others.

During the treatment segment, participants were not asked for their personal responses to the treatment stimuli, but were asked to confirm that they had understood which response had been identified for them as the correct one in each case. They were told that they were welcome to agree, disagree, or be indifferent to the response provided for them, but that they were not to verbally articulate or otherwise share such opinions with the experimenter, or with the other participant.

The treatment stimuli were similar to the test stimuli, except that they were reversed for order, inverted and transposed, in order to minimize any readily audible sequentiality between the test sets and the treatment sets. Fortunately, some additional time also passed between pretest and treatment stimuli, due to the act of turning over the response form and reading the treatment instructions.

Some pitch commonality exists between consecutive treatment sets, but this was of little or no consequence because treatment consisted of repeating the responses provided, regardless of any personal evaluations, be they sequentially informed or otherwise.

Nonetheless, it should be understandable that participants must be able to draw a clear mental connection between the treatment and post-test stimuli as collective entities; that they sound collectively like a larger, uniform group, each consisting of six individual objects.

The order and voicing position of the treatment pc sets were as follows: T5, lowest; T7I, highest; T1, middle; T7, middle; T9I, lowest; T3, highest. The sequential register-contour profile was disrupted (through octave shifts) from that directly predictable from pretest examples, to reduce sequential continuity and to improve overall registral continuity with pretest stimuli. The general area around and above middle C was selected for all examples, owing to its frequent musical use for close chord-voicings, an arbitrary but necessary choice. As before, the tetrachords in this section are notated here as simultaneities.

[Notated example: the six treatment tetrachords, showing the alternating tones and the repeating tones as simultaneities.]
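The relation between the two lists of operations can be examined at the level of their labels. The sketch below reverses the pretest list, flips each inverted/non-inverted flag, and adds 5 to each index. The shift of 5 is an observation about the listed labels and is not stated in the text. The sketch also lists the pitch classes shared by adjacent treatment sets, assuming TnI: x -> (n - x) mod 12. All names are hypothetical.

```python
PRIME_4_19 = (0, 1, 4, 8)

# (n, inverted) pairs in presentation order.
PRETEST_OPS = [(10, True), (4, False), (2, True), (8, True), (2, False), (0, True)]
TREATMENT_OPS = [(5, False), (7, True), (1, False), (7, False), (9, True), (3, False)]

# Reverse the order, flip the inversion flag, shift the index by 5 (observed, not stated).
derived = [((n + 5) % 12, not inv) for n, inv in reversed(PRETEST_OPS)]
print(derived == TREATMENT_OPS)  # True


def apply_op(pcs, n, inverted):
    """Return Tn(pcs) or TnI(pcs) as a frozenset of pitch classes mod 12."""
    if inverted:
        return frozenset((n - pc) % 12 for pc in pcs)  # assumed TnI convention
    return frozenset((n + pc) % 12 for pc in pcs)


treatment_sets = [apply_op(PRIME_4_19, n, inv) for n, inv in TREATMENT_OPS]
shared = [sorted(a & b) for a, b in zip(treatment_sets, treatment_sets[1:])]
print(shared)  # [[6], [], [], [8], []]
```

Under these assumptions two adjacent pairs of treatment sets share a single pitch class each, which agrees with the remark that some pitch commonality exists between consecutive treatment sets.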

In preliminary runs of the experiment using more or fewer treatment examples than six,

it was found that such divergences interfered somewhat with listener concentration. Later,

some participants exposed to six treatment examples volunteered that they believed the

treatment examples to be identical to the pretest examples, save for reordering. While this

way of understanding the combined stimuli is clearly wrong, the misunderstanding seems

to have provided the participants with enough apparent meaning to avoid becoming

distracted by contemplating such mysteries as "why a different number of examples ?"

Also, the observation that such misunderstanding was reported frequently during

debriefing is a potentially interesting piece of information in and of itself.

During the posttest segment, participants were asked to repeat their actions from the

pretest segment, and were actually told that the posttest stimuli might be essentially the

same as the pretest stimuli, except that the order may have been changed.

The posttest stimuli were identical to the pretest stimuli. Naturally, some differences in

response may have occurred between pretest and posttest, due to learning, normal cognitive

variance and variances of extraneous stimuli. The experiment was not primarily intended to

measure these effects, but was also designed to determine their significance, should they

become evident. The purpose of the experiment was to determine whether learning between

pretest and posttest can be affected by different types of conditioning. However, due to significant, unanticipated differences in pretest responses between treatment groups, the data obtained will be discussed largely from a non-experimental standpoint, and will be used with the intention of answering questions of general conditioning effects on both treatment groups.

Chapter 7: The Data

The main purpose of an experiment is to collect data which will either support or fail to

support a specific hypothesis. In this case, the hypothesis was that by requiring listeners

to accept arbitrary definitions of harmonic stability between closely related manifestations

of pitch class sets 3-11 and 3-12, it is possible to induce changes to their preferences, at

least in the short term. Neither treatment group was considered to be a control group, in that both received treatment. The use of a control group would be necessary to demonstrate that treatment has an effect as compared to absence of treatment, but a control group in place of a second treatment group could not be used to show that complementary differences in treatment produce complementary effects.

This represents a divergence from more common experimental designs, but the experiment is a true experiment nonetheless; it is simply designed to test whether a specific difference of treatment produces any difference in effects, rather than to test whether a specific treatment has a specific effect. To the extent that the various examples are related, this point draws into question anything that might be said of the other examples as well.

In a discussion of the data proper, I shall be referring to the following table, which shall be explained in greater detail below. The table shows, for each example, what types of chords were used, and for each treatment group in each example, the trichord type that was judged to be more stable, the proportion by which it was judged more stable, and the resultant t-score, which was then used to determine the statistical significance of this proportion (about which more later).

                                Treatment Group 1             Treatment Group 2

Pre-test
 1) Aug. triad v. Maj 6/4       3-11: 18/40, t=4.0202 *       3-12: 17/33, t=1.0005
 2) Aug. triad v. min 6/4       3-12: 23/40, t=6.0687 *       3-11: 16/33, t=1.0005
 3) Aug. triad v. Maj 6/3       3-12: 23/40, t=6.0687 *       3-12: 17/33, t=1.0005
 4) Aug. triad v. Maj 5/3       3-12: 23/40, t=6.0687 *       3-11: 16/33, t=1.0005
 5) Aug. triad v. min 6/4       3-11: 18/40, t=4.0202 *       3-12: 18/33, t=3.0125 *
 6) Aug. triad v. Maj 6/4       3-12: 23/40, t=6.0687 *       3-11: 16/33, t=1.0005

Post-test
13) Aug. triad v. Maj 6/4       3-11: 17/40, t=6.0687 *       3-12: 18/33, t=3.0125 *
14) Aug. triad v. min 6/4       no pref.: 20/40, t=0.0000     3-11: 15/33, t=3.0125 *
15) Aug. triad v. Maj 6/3       3-12: 25/40, t=10.2378 *      3-12: 20/33, t=7.1360 *
16) Aug. triad v. Maj 5/3       3-11: 18/40, t=4.0202 *       3-12: 17/33, t=1.0005
17) Aug. triad v. min 6/4       3-12: 22/40, t=4.0202 *       3-12: 17/33, t=1.0005
18) Aug. triad v. Maj 6/4       3-11: 17/40, t=6.0687 *       3-12: 17/33, t=1.0005

* = underlined entry (t-score above the reference value of 2.327)

The result of each pretest and posttest response for each treatment group was compared

to chance by using a two-tailed Student t-test at alpha= 0.01. Simply put, a t-score (at least as it is used here) is a measure of the likelihood that a sample correctly indicates that the

population from which it is drawn would differ from an essentially even number of choices of each type available. The larger the t-score is, the more likely it is that the population

from which the sample is drawn would respond on a nonrandom basis to the corresponding example. For any sample size and number of responses to choose from (2 in this case), there

is a specific t-score that corresponds to any specific level of significance, such that a data t-score falling above this reference t-score indicates a probable difference from random

response at that level of significance. For each example, the level of significance has been set at 0.01, indicating a maximum possibility of sampling error, for our purposes, at 1% per example/group data set. This gives different t-score thresholds of significance for sample sizes 40 and 33, both of which are at or above t=2.327. 69
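The logic of comparing a two-choice response count to chance can be illustrated with a short sketch. It uses a normal approximation to a test of one proportion against 0.5, which is a swapped-in technique: it is not the Student t procedure named above, and it is not a reconstruction of how the t-scores in the table were calculated, since the text does not give that formula. The function name and the counts in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_vs_chance(successes, n, p0=0.5, alpha=0.01):
    """Two-tailed normal-approximation test of a proportion against p0.

    Returns the z statistic, the two-tailed p-value, and whether the
    chance hypothesis is rejected at the given alpha.
    """
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # standard error under the chance hypothesis
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Hypothetical counts, not taken from the table: 30 of 40 listeners choose one option.
z, p_value, reject = proportion_vs_chance(30, 40)
print(f"z = {z:.4f}, p = {p_value:.4f}, reject chance: {reject}")

# An exactly even split gives z = 0 and no evidence against chance.
print(proportion_vs_chance(20, 40))
```

The larger the absolute value of the statistic, the smaller the p-value, which is the relationship the paragraph above describes for t-scores.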

For example, treatment group 2's t-score for audio example 4 is below 2.327, thus indicating chance levels of response with what might be called "99% certitude," whereas their t-score for audio example 5 is above 2.327, thus indicating a pattern of response inconsistent with chance with what might be called "99% certitude."

69 The inexactitude of the significance threshold in this experiment is a result of using t-tests for sample sizes larger than 30, and is considered to be of no consequence here in that all data t-scores are markedly above or below the range of inexactitude. For more information on t-tests, see any standard statistics textbook. The one used for this experiment was: Mario F. Triola. Elementary Statistics. 4th ed. (Redwood City, CA: Benjamin/Cummings, 1989)

As the word maximum suggests, the exact thresholds of significance for many examples are well below 0.01, and the aggregate probability that at least one of the 24 sets of statistical values inaccurately represents the population from which it is drawn for our purposes is well below the sum of 0.24. Moreover, I am assured by statisticians and psychologists that it is "very well" below 0.20, a standard threshold of significance for data treated in the present manner by the psychological community; the interpretation of the data is therefore valid by this standard.
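The figure of 0.24 is the simple sum of 24 per-set error levels of 0.01. A minimal sketch of that arithmetic follows, together with the corresponding figure if the 24 sets were independent. Independence is an assumption made only for illustration, since the same participants contribute to several sets. Both figures are ceilings that would be reached only if every set sat exactly at the 0.01 level.

```python
ALPHA = 0.01   # maximum error level per example/group data set (from the text)
N_SETS = 24    # 12 examples x 2 treatment groups (from the text)

# Sum of the per-set levels: an upper bound that needs no independence assumption.
sum_bound = N_SETS * ALPHA

# Probability of at least one error if the sets were independent (illustrative assumption).
independent_bound = 1 - (1 - ALPHA) ** N_SETS

print(f"sum of levels:       {sum_bound:.2f}")          # 0.24
print(f"independent ceiling: {independent_bound:.4f}")  # 0.2143
```

Because many of the exact levels are stated to be well below 0.01, the actual aggregate probability would fall below both ceilings.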

The underlined entries indicate responses for which uniform guessing by participants may be ruled out as a cause at alpha=0.01. Arguably, if even two entries were "wrong" for our immediate purposes, the main points I will make here would still have a good degree of validity given the general shape of the data, as I shall explain. Please note that the table indicates both t-scores for each treatment group in each example and the proportion of each group which preferred 3-12 as more harmonically stable than 3-11 in each example, from which the t-score is derived.

Although we may not assume so, we must consider that the most obvious difference between treatment groups as indicated by the above table may be spurious; the greater number of underlined entries on the left may be the result of using different sample sizes.

We cannot conclude that the two groups are identical, but neither may we conclude the two groups differ in any specific way, at least on the basis of underlined entries in the table.

However, the response of treatment group 2 to audio example 5 indicates that at least one similarity exists between groups; neither group shows a pattern consistent with uniform guessing. Unfortunately, the directional preferences of these two groups are in opposite directions in regard to example 5. We must therefore understand that the difference between groups in example 17, the corresponding posttest example, may have existed in some form before the treatment section of the test.

For this reason, I will not be able to treat the data as experimental data.

What I shall do here, instead, is to discuss the data from a lower constraint quasi-experimental standpoint typical of psychomusicological literature.70 I shall focus on similarities, rather than differences between treatment groups. I shall discuss what these similarities suggest rather than trying to prove or disprove any particular hypothesis regarding the comparative effects of the two treatments. The data may thereby, at the very least, serve to suggest new hypotheses and procedural modifications to additional experiments.

70 Even a brief perusal of the empirical studies which I have discussed to this point will show that the psychomusicological community accepts a very loose definition of the term "experimental." While I do not personally disagree with such use of the term "experimental" and have no general complaints about the methods so described, I should like to point out that my own experiment is also an experiment by narrower definitions.

However, before I diverge in this way, I would first like to point out that the data do, in

fact, prove something. The responses of both treatment groups show that people who are

capable of doing more than to guess at musical examples still fail to consistently recognize

3-11 as more harmonically stable than 3-12; despite the fact that they are capable of some

level of stability preference between these two pc sets, these preferences are not consistent across examples. Moreover, people who can hear a difference do not show any broad, overriding tendency to prefer 3-11 as more harmonically stable than

3-12.

To whatever extent we are willing to accept the complete sample as normal, we may conclude that in synthetic clarinet timbres and without reference to specific musical contexts, normal listeners will not infrequently fail to hear the augmented triad as harmonically unstable in comparison to major and minor triads with which they share two tones. In other words, there are simple exceptions to the relative harmonic instability of the augmented triad.

If the present experiment is reasonably generalizable, such exceptions may not be difficult to come across.

GENERAL CHANGES BETWEEN PRETEST AND POSTTEST

Despite the general lack of a pattern in the overall data in the pretest section

(conforming neither to chance, nor to any specific stability preferences), the posttest section shows one surprisingly defined preference in example 15, in which a vertical group of pitches comprising an augmented triad alternated with a pitch group comprising a first inversion major triad. Both groups responded with the highest degree of preference in this example, and this was the only example on which the two groups largely agreed in pretest.

Notably, this was also the only example for which participants verbally volunteered that

they considered it to be a "trick question"; frequently enough, in fact, that I eventually began checking my equipment after data collection in order to determine if there was some

kind of problem. Several informed friends and I are still unable to locate any such problem, although the example does seem to have a subjective "weirdness" in comparison to the others. This effect (treated very subjectively) seems to be preserved across several timbres, registers, tone durations and amplification equipment.

The direction of stability preferences of the treatment groups for example 15 is also the same; both groups preferred 3-12 as more harmonically stable in this example. Possibly, then, the "weirdness" which the participants were hearing consists of a high degree of cognitive dissonance between definitions stemming from their innate response to acoustic sensation and definitions stemming from their lifelong conditioning as musical listeners; a conflict which would not be inconsistent with certain models of harmonic tension discussed earlier.

Another possibility is, as the changes between pretest and posttest suggest, that although participants had been conditioned through the testing procedure to hear 3-12 as more stable in example 15, they were either unaccustomed to hearing in this way, or at least unaccustomed to being so acutely aware of it. Comparing responses in example 15 to those in example 3, we see that preference has become more prominent, but cannot be said to have changed in its essential nature. Given that the exact same procedure was followed in the administration of examples 3 and 15, it might be inferred that the change in response, which is comparable across treatment conditions, is due either to having heard the example a second time or to repeated exposure to constructively similar but non-identical stimuli:

exposure to the various test and/or treatment examples. Assuming that this change in

response is of further interest, it may be possible to isolate the comparative influences of

repetition and variation in another experiment.

Regardless of the exact causes of change between example 3 and example 15, we may

draw numerically from responses to example 15 a tentative conclusion consistent with what has been stated above: that simple exceptions to the comparative dissonance of 3-12 exist, and that listeners may arrive at perceiving them in spite of deliberate instruction to the contrary, as is the case with treatment group 1.

Interestingly, treatment group 1 preferred 3-12 in example 15 even more strongly than did group 2, in exactly the opposite direction from which one would expect from their treatment. Contrary to what this may imply about the sample pool's attitude toward being

"treated," I believe it is unlikely that any of the results obtained in the experiment can be explained as subversion by participants. In preparation for the test, I explained to them that being deliberately contrary would not be likely to hurt my results, because I was looking primarily for any posttest differences resulting from treatment, and I expected a roughly equal number of participants in each group would want to "mess up my data" by giving dishonest responses, thereby cancelling each other out.

FOLLOW-UP USING INCOMPLETE RESPONSES

Although the statistical strength of my observations would be greatly weakened by attempting to re-include data from the incomplete response forms, I considered that the behaviour of this group might be of some interest. If the response pattern exhibited by both treatment groups (as described above) is to be dismissed as mere sampling error or coincidence, particularly regarding responses to examples 3 and 15, one would expect that

the available data from the incomplete response forms would either be at the chance level,

or would show the opposite tendency.

Interestingly, 7 of 16 participants in the incomplete category preferred 3-12 as more

stable in example 3, yet 10 out of 15 showed this preference when confronted with the same

stimuli in example 15. That is, while their responses were in the chance range for this

presentation of 3-11 and 3-12 during pretest, they were well outside the chance range

(taking the examples in isolation) during posttest at t=5.3033. One could say that the

"coincidence" appears to be reasonably reliable. It should be emphasized that this point is made here regarding the data from the incomplete responses not to "prop up" what had been

said above, but only to further demonstrate the strength of those findings, and the

conclusion which must be drawn from them.

This conclusion is not broadly disruptive of analytical or compositional practices by any means. If it should be supported by further research and by the conclusions of others, however, it could have some consequences for analysts and composers in the long term.

What this conclusion could mean for music analysts is either that any specific augmented triad in a piece of music cannot be automatically ruled out as "stable," or that any specific major triad in first inversion cannot be automatically assumed to be "stable"; further, that systems of analytical reference which depend on consonance and dissonance may, under some conditions, fall short of correctly explaining which of these chord types refers to which from a cognitive-hierarchical standpoint.

What this conclusion could mean for composers who depend on harmonic stability rankings as predictors of chords' locations in perceptual hierarchies is that the augmented triad and first inversion major triad which share two pitch classes, under some conditions

(such as in clarinet timbres near middle C), may ambiguate intended grammars of stability.

More than this, the conclusion draws further into question the perceptual reality of

other long-standing assumptions that music theory makes about such things as harmonic

stability. When considered in light of the already numerous works of research which cannot

be easily resolved with implications that current music theory makes about music

perception, the results of this experiment reinforce the position that much is not yet known

about how people mentally process music, and that what is known is not entirely consistent with current music theory.

Chapter 8: Concluding Remarks

I understand that such findings may be difficult for some people to accept, and these people will have my full moral support should they choose to replicate my experiment.

Moreover, everyone is more than welcome at any time to correct any errors I may have made.

My goal in this experiment has been to get some partial answers to some tricky questions that have been plaguing me for almost half my life, not simply to make people shake their heads in disbelief. I would like to think that I have used my thesis as an opportunity to show how I believe we might best deal with any doubts or disagreements which may exist or which may arise about the effectiveness of music theory in modeling normal listener response.

I do not mean to imply that I think music theorists have been "wrong" about the augmented triad for the last four hundred years, only that the extent to which they have been right is less than absolute. As I have explained, to state that the augmented triad is always less stable than the major and minor triads makes assumptions about timbre, intonation, and reference to larger pitch collections; assumptions which were probably somewhat safer to make before about 1910.

Now at the end of the twentieth century, it can be little doubted that as producers and consumers of an increasingly global culture, normal listeners to tonal music are also completely capable of, if not actively pursuing, coherent listenings of musics utilizing timbres, intonation systems, and pitch collections notably divergent from those which tonal theory (or, for that matter, serial theory) was intended to consider. However, formal theory instruction and even the most scholarly analyses of twentieth-century musical scores fail to address the undeniably expanding number of points at which primarily graphic models of music perception and cognition fall far short of providing accurate assessments of localized listener response.

I believe that my findings underscore the need for methods of analysis which address

not only what the composer may have intended one to hear, but which also address, to an increasing extent, what an attentive but otherwise normal listener would actually be likely

to hear.

As yet, there is little empirical basis provided by myself or by others for any

refinement to analysis, but there is some. It may be decades before any kind of analysis

emerges which bears even a moderately higher degree of empirical validity, and it is therefore understandable that many music theorists are resistant to the idea of conducting

perceptual experiments. Indeed, neither has music theory ever been defined as scientific fact, nor has it apparently needed to be in order to find support, and it may continue to depend greatly on rationalism.

But should theory be considered immune to careful observation? I think not. The point at which music theory is considered immune to scientific criticism is the point at which the

non-empiricist becomes the anti-empiricist; the point at which education and real learning by all parties involved must take a back seat to indoctrination.

"Probabilistic models are to be preferred when available because they are, by definition,

the least likely to be dead wrong."

-Jason H. Stover, PhD. (Statistics)

Bibliography

Books:

Abbott, Lawrence. The Listener's Book on Harmony London: George G. Harrap & Co. Ltd., 1943

Aldwell, Edward and Carl Schachter. Harmony and Voice Leading, second edition Fort Worth: Harcourt Brace Jovanovich College Publishers, 1989

Brye, Joseph. Basic Principles of Music Theory The Ronald Press Company, 1965

Duckworth, William. A Creative Approach to Music Fundamentals, Fifth edition Belmont, CA: Wadsworth Publishing Company, 1995

Griffith, Niall and Peter M. Todd eds. Musical Networks: Parallel Distributed Perception and Performance Cambridge, MA: Bradford Books, The MIT Press, 1999

Forte, Allen. The Structure of Atonal Music New Haven, CT: Yale University Press, 1973

Handel, Stephen. Listening (An Introduction to the Perception of Auditory Events) Cambridge, MA: The MIT Press, 1989

Helmholtz, Hermann. On the Sensations of Tone New York: Dover Publications, Inc., 1954

Hindemith, Paul. The Craft of Musical Composition, vol. 1 Melville, NY: Belwin-Mills, 1942

Horwood, Frederick J. The Basis of Harmony Toronto, Canada: Gordon V. Thompson Limited, 1948

Hutchins, Carleen Maley ed. The Physics of Music San Francisco: W.H. Freeman and Company, 1978

Liebling, Emil, ed. (The American History and Encyclopedia) Essentials of Music Vol.11 New York: Irving Squire, 1910

MacPherson, Stewart. Practical Harmony (further revised edition, etc.) London: Joseph Williams, Ltd., 1907

McHose, Allen Irvine. Basic Principles and Technique of 18th and 19th Century Composition New York: Appleton-Century-Crofts, Inc., 1951

McQuere, Gordon D. ed. Russian Theoretical Thought in Music Ann Arbor, MI: UMI Research Press, 1983

Partch, Harry. Genesis of a Music, second edition, enlarged New York: Da Capo Press, 1974

Piston, Walter. Harmony, third edition New York: W. W. Norton & Company, Inc., 1962

Ratner, Leonard G. Music, the Listener's Art New York: McGraw Hill Book Co., Inc., 1957

Roederer, Juan G. Introduction to the Physics and Psychophysics of Music London: The English Universities Press, Ltd., 1973

Rowe, Robert. Interactive Music Systems (Machine Listening and Composing) Cambridge, MA: The MIT Press, 1993

Sadai, Yizhak. Harmony in its Systemic and Phenomenological Aspects J. Davis and M. Shlesinger, translators Jerusalem: Yanetz Ltd., 1980

Salzer, Felix and Carl Schachter. Counterpoint in Composition (the Study of Voice Leading) New York: McGraw-Hill Book Company, 1969

Selkirk, E.O. Phonology and Syntax: The Relation Between Sound and Structure Cambridge, MA: The MIT Press, 1984

Schoenberg, Arnold. Preliminary Exercises in Counterpoint, edited by Leonard Stein London: Faber and Faber Limited, 1963

Schoenberg, Arnold. Structural Functions of Harmony, edited by Leonard Stein New York: W. W. Norton & Company, Inc., 1969

Sheldon, David A. Marpurg's Thoroughbass and Composition Handbook (a Narrative Translation and Critical Study, Harmonologia Series No. 2) Stuyvesant, NY: Pendragon Press, 1989

Shirlaw, Matthew. The Theory and Nature of Harmony Sarasota, FL: Dr. Birchard Coar, 1970

Smith, Charlotte. A Manual of Sixteenth-Century Contrapuntal Style Cranbury, NJ: Associated University Presses, Inc., 1989

Triola, Mario F. Elementary Statistics, fourth edition Redwood City, CA: Benjamin/Cummings, 1989

Williams, D. E. A Music Course for Students Entering for Certificate and Others London: Oxford University Press, 1937

Articles:

Balzano, Gerald J. "What Are Musical Pitch and Timbre?" Music Perception 3/3 (Spring 1986)

Brown, Helen. "The Interplay of Set Content and Temporal Context in a Functional Theory of Tonal Perception" Music Perception 5/1 (Spring 1988)

Bruner, Cheryl L. "The Perception of Contemporary Pitch Structures" Music Perception 2/1 (Fall 1984)

Christensen, Thomas. "The Spanish Baroque Guitar and Seventeenth-Century Triadic Theory" Journal of Music Theory 36/1 (Spring 1992)

Danner, Gregory. "The Use of Acoustic Measures of Dissonance to Characterize Pitch-Class Sets" Music Perception 3/1 (Fall 1985)

Deutsch, Diana. "A Musical Paradox" Music Perception 3/3 (Spring 1986)

Egmond, Rene Van and David Butler. "Diatonic Connotations of Pitch-Class Sets" Music Perception 15/1 (Fall 1997)

Houtsma, Adrianus J.M. "What Determines Musical Pitch?" Journal of Music Theory 15/1 (1971)

Houtsma, Adrianus J. M. "Pitch Salience of Various Complex Sounds" Music Perception 1/3 (Spring 1984)

Isaacson, Eric J. "Similarity of Interval-Class Content between Pitch Sets: The IcVSIM Relation" Journal of Music Theory 34/1 (1990)

Loosen, Franz. "The Effect of Musical Experience on the Conception of Accurate Tuning" Music Perception 12/3 (Spring 1995)

Morris, Robert. "A Similarity Index for Pitch Class Sets" Perspectives of New Music 18 (1980)

Parncutt, Richard. "Revision of Terhardt's Psychoacoustic Model of the Root(s) of a Musical Chord" Music Perception 6/1 (Fall 1988)

Patterson, Roy D. "The Tone Height of Multiharmonic Sounds" Music Perception 8/2 (Winter 1990)

Rahn, John. "Relating Sets" Perspectives of New Music 18 (1980)

Rasch, Rudolf A. and Vincent Heetvelt. "String Inharmonicity and Piano Tuning" Music Perception 12/3 (Winter 1985)

Teitelbaum, Richard. "Intervallic Relations in Atonal Music" Journal of Music Theory 9/1 (1965)

Terhardt, Ernst. "The Concept of Musical Consonance: A Link between Music and Psychoacoustics" Music Perception 1/3 (Spring 1984)

Thomson, William. "The Harmonic Root: A Fragile Marriage of Concept and Percept" Music Perception 10/4 (Summer 1993)

Trainor, Laurel J. and Sandra E. Trehub. "What Mediates Infants' and Adults' Superior Processing of the Major Over the Augmented Triad ?" Music Perception 11/2 (Winter 1993)

Walsh, Stephen. "Musical Analysis: Hearing is Believing?" Music Perception 2/2 (Winter 1984)

Warren, Richard M. "Helmholtz and His Continuing Influence" Music Perception 1/3 (Spring 1984)

Appendix 1: Experimental Response Form

Age:    Sex:    Primary language:    Date:

Estimated musical skills (circle one): a) below average b) average c) above average

Please DO NOT look at the back of this form until asked to do so. This experiment should take less than 15 minutes, during which you will hear 18 audio examples broken into 3 groups of 6 each. The volume of the audio examples will be carefully controlled, and will be low enough to assure that there is no possibility of damage to your hearing.

During the first part, you will be required to make 6 subjective evaluations of harmonic stability, by choosing between one of two tones as sounding more stable. During the second part, you will be required to correctly indicate that you have read and understood separate evaluations for 6 examples. During the third part, you will again be required to make 6 subjective evaluations of harmonic stability. The purpose of the experiment will be provided to you after you have returned your response form.

Your responses in this experiment should in no way be construed as a reflection of your musical ability, or other abilities, and you are encouraged to listen and to respond with the utmost confidence. Your name will not be used in any report of experimental results, and is requested only to indicate that you have read and understood the statements on this side of the form. Just as your anonymity will be protected, you will be expected not to divulge either the purpose of the experiment or your responses to any other person until the complete results have been published.

While there is no reason to believe that the experimental stimuli may induce epileptic or other seizures, contact with urban folklore suggests that the types of stimuli used might be perceived as capable of inducing seizures, and that possible legal complications may arise for the experimenter if a participant should experience a seizure during or at any point after the experiment. Therefore, anyone who has ever experienced a seizure of any kind will not be allowed to take part in the experiment.

I, the undersigned, have read and understand the above statements, and agree to the terms of the experiment; I hereby acknowledge my responsibility not to divulge information to others, and assert that I have never experienced any epileptic or other seizure.

signature

Part I.

During, or immediately after each audio example, decide whether the upper or lower of the alternating tones sounds more stable and circle the corresponding word.

1) In this example the upper/lower tone is more stable.

2) In this example the upper/lower tone is more stable.

3) In this example the upper/lower tone is more stable.

4) In this example the upper/lower tone is more stable.

5) In this example the upper/lower tone is more stable.

6) In this example the upper/lower tone is more stable.

Part II.

Before hearing each example, silently read the corresponding statement. Each statement appears to be a pre-marked response. After hearing each example, indicate that you have understood the corresponding statement by circling the response which is identical to that already provided. DO NOT circle any response until after you have had a chance to compare the audio example with the corresponding pre-marked response. DO circle the same response as that which has been provided, regardless of the accuracy of the statement.

7) In this example the upper/lower tone is more stable. In this example the upper/lower tone is more stable.

8) In this example the upper/lower tone is more stable. In this example the upper/lower tone is more stable.

9) In this example the upper/lower tone is more stable. In this example the upper/lower tone is more stable.

10) In this example the upper/lower tone is more stable. In this example the upper/lower tone is more stable.

11) In this example the upper/lower tone is more stable. In this example the upper/lower tone is more stable.

12) In this example the upper/lower tone is more stable. In this example the upper/lower tone is more stable.

Part III.

During, or immediately after hearing each audio example, decide whether the upper or lower of the alternating tones sounds more stable and circle the corresponding word.

13) In this example the upper/lower tone is more stable.

14) In this example the upper/lower tone is more stable.

15) In this example the upper/lower tone is more stable.

16) In this example the upper/lower tone is more stable.

17) In this example the upper/lower tone is more stable.

18) In this example the upper/lower tone is more stable.

Appendix 2: The Test Engine

Appendix 3: Data Tables

Columns:

1: Participant Identity Code
2: Age
3: Sex
4: Primary Language
5: Test Date
6: Estimated Musical Skills (a = above average, b = about average, c = below average)
7-12: Responses to Examples 1-6 (l = lower, u = upper)
13: Treatment Group Assignment (1 indicates 3-11 as more stable, 2 indicates 3-12 as more stable)
14-19: Responses to Examples 13-18

Responses have been sorted only by viability and by treatment group.

[Table of viable responses, apparently for treatment group 1, printed sideways in the source; the cell values are not recoverable from the extracted text.]

id   age  sex  lng  dat  ski  1  2  3  4  5  6  trt  13 14 15 16 17 18
PC   19   m    f    30   c    u  u  l  l  l  l  2    u  u  u  u  l  l
LA   20   f    e    30   c    u  l  l  l  u  l  2    l  u  u  l  u  u
DA   20   m    e    17   c    l  u  l  u  u  l  2    u  u  l  u  l  l
JF   21   f    e    17   b    u  u  l  l  u  u  2    l  l  u  u  l  u
JS   19   f    e    16   b    l  u  u  u  l  u  2    u  u  l  l  u  u
BK   25   m    e    11   c    l  l  l  l  l  l  2    l  l  l  l  l  l
TM   22   f    e    10   b    l  u  u  u  u  l  2    l  l  u  u  l  l
RK   21   m    e    10   b    u  u  u  l  l  l  2    u  u  u  u  u  u
SS   23   m    e    9    a    l  l  l  l  l  l  2    l  l  l  l  l  l
CV   20   m    e    6    c    l  u  l  u  l  u  2    l  u  u  l  u  l
TC   19   m    e    5    b    l  u  u  u  u  u  2    l  u  l  u  l  u
GR   20   m    e    5    c    u  l  l  u  u  u  2    u  u  l  u  u  u
SM   21   f    e    4    c    l  l  l  l  l  l  2    l  l  l  u  l  l
CS   20   f    e    4    c    u  l  u  u  l  l  2    u  l  u  l  u  l
J    24   m    e    1    c    u  l  u  u  l  u  2    u  l  u  u  l  u
CP   28   m    e    1    b    l  u  l  l  l  l  2    u  l  u  l  u  u
DT   21   m    e    4    c    u  l  u  l  u  u  2    u  l  u  l  u  u
KJ   18   f    e    3    b    u  l  u  l  u  u  2    u  l  u  l  u  u
MF   18   m    e    5    a    l  u  u  l  l  l  2    l  u  l  l  l  l
MV   22   f    e    3    a    l  l  l  u  l  l  2    l  u  l  u  u  l
AN   20   m    e    4    a    u  l  l  l  u  l  2    u  l  u  l  l  l
CL   25   m    e    4    c    l  u  u  u  u  u  2    l  u  u  u  u  u
LB   21   f    e    12   b    l  l  u  u  l  l  2    l  u  u  u  l  l
PM   20   m    e    17   c    u  u  u  u  l  l  2    u  u  l  l  l  l
AL   22   f    e    27   c    u  l  u  l  u  u  2    u  l  u  l  u  u
DL   21   f    e    26   c    u  u  u  l  u  u  2    l  u  u  l  l  l
HT   18   f    e    26   c    l  u  l  u  l  u  2    u  l  u  u  l  u
JU   24   m    s    24   a    u  l  l  u  l  u  2    u  u  l  l  u  l
FM   29   m    p    25   b    l  u  l  l  l  u  2    l  u  u  l  u  l
DD   18   m    d    24   c    u  u  u  l  u  u  2    u  l  u  l  u  u
AL   19   m    e    20   a    l  l  u  l  l  u  2    l  u  l  u  u  u
CM   21   m    e    20   b    u  l  l  u  u  l  2    u  l  u  u  l  u
JG   26   m    e    18   c    u  u  u  u  u  l  2    u  u  l  u  u  u

Incomplete Responses

id   age  sex  lng  dat  ski  1  2  3  4  5  6  trt  13 14 15 16 17 18
PL   26   m    e    x    a    u  u  l  l  u  u  n/a  l  l  u  u  u  l
AS   23   m    e    4    a    l  u  u  u  u  u  n/a  l  x  u  u  l  l
LR   30   m    e    1    b    u  u  u  l  l  u  n/a  u  l  l  l  l  u
JB   26   f    s    x    c    u  u  u  l  u  u  n/a  u  u  u  l  u  u
LF   19   f    e    3    a    l  u  l  u  u  l  n/a  l  u  l  u  l  l
FJH  27   m    e    1    b    x  l  u  l  u  u  n/a  u  l  u  l  l  u
RJ   34   x    e    x    c    x  u  l  u  u  l  n/a  x  u  u  l  l  u
HP   23   m    e    x    a    l  l  u  u  x  l  n/a  l  l  u  u  u  l
R    28   m    f    1    b    u  l  l  l  u  u  n/a  x  x  x  x  x  x
KD   21   f    e    10   x    u  u  u  l  l  l  n/a  l  l  u  l  u  l
A    22   f    e    5    x    u  u  l  l  u  l  n/a  u  u  u  u  l  l
ML   26   f    e    4    x    u  l  l  l  l  u  n/a  l  l  l  l  l  u
PO   x    x    e    x    x    u  u  l  l  l  l  n/a  u  u  l  l  l  l
RW   x    m    e    20   b    l  u  l  u  l  u  n/a  l  u  u  l  l  l
AM   x    m    e    20   b    l  u  u  l  u  u  n/a  u  l  u  l  u  u
LL   x    f    e    9    c    l  u  l  l  u  l  n/a  u  l  l  u  l  l

Appendix 4: The Participants

In total, 90 response forms were collected. 73 of them were deemed viable.

Sixteen response forms were excluded from primary analysis, either because the response instructions were not followed, or because control data (such as primary language) were not provided. One form was deliberately destroyed when it was learned that the participant had not been of legally appropriate age at the time of the experiment.

Of the 73 response forms which were not excluded from primary analysis, 40 were found to be in treatment condition 1 and 33 in treatment condition 2. While an even split between the two groups would obviously be the optimum distribution, both sample sizes were considered adequate for the purpose of the experiment.

Collectively, the two treatment groups had a mean stated age of 20.8 years.

Although the mean stated ages of the two treatment groups differed by 1.1 years, the more obvious age bias was that all 73 participants were under thirty years of age.

37 (51%) of the participants were male, 36 (49%) were female.

Of the 40 participants in treatment group 1, 16 were male and 24 were female. Of the 33 participants in treatment group 2, 21 were male and 12 were female. Again, a distribution of 50% male, 50% female in each group would be ideal, but, as will be explained, there were more directly relevant differences between treatment groups.
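As a check on the proportions implied by these counts, the following minimal Python sketch tallies the stated numbers of male and female participants per treatment group. The variable and function names are illustrative assumptions and are not part of the original analysis; only the counts come from the text.

```python
# Counts taken from the text: (male, female) per treatment group.
group_counts = {1: (16, 24), 2: (21, 12)}


def sex_percentages(counts):
    """Return (percent male, percent female) for a (male, female) pair."""
    male, female = counts
    total = male + female
    return 100 * male / total, 100 * female / total


# Pool both groups to obtain the split for the whole viable sample.
total_male = sum(male for male, _ in group_counts.values())
total_female = sum(female for _, female in group_counts.values())
overall = sex_percentages((total_male, total_female))

for group, counts in group_counts.items():
    pct_male, pct_female = sex_percentages(counts)
    print(f"group {group}: {pct_male:.1f}% male, {pct_female:.1f}% female")
print(f"overall: {overall[0]:.1f}% male, {overall[1]:.1f}% female")
```

Run as a script, this prints the share of male and female participants in each treatment group and in the pooled sample of 73.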

67 (92%) of the participants were native speakers of English.

Of the 40 participants in treatment group 1, 38 were native speakers of English, 1 was a native speaker of French, and 1 was a native speaker of German. Of the 33 participants in treatment group 2, 29 were native speakers of English, 1 was a native speaker of French, 1 was a native speaker of Dutch, 1 was a native speaker of Persian, and 1 was a native speaker of Spanish. Clearly, the experimental responses show a bias toward the English language. But the nonnative speakers of English showed no distinctive differences from the native English speakers in any of their responses.

All data were collected during the month of November, 1998.

The 40 participants in treatment group 1 had a mean participation date of 12.675. The 33 participants in treatment group 2 had a mean participation date of 10.805. The difference of means was less than 48 hours. Considering that each response date spans 24 hours, it should be clear that the distribution of response dates is highly similar across treatment groups. Time of day was not considered a plausible confounding variable.
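The difference of means can be converted from days to hours with a short calculation. The sketch below assumes that the participation dates are expressed as days of November 1998, as in the data tables; the variable names are hypothetical.

```python
# Mean participation dates (day of November 1998) as stated in the text.
mean_date_group_1 = 12.675
mean_date_group_2 = 10.805

HOURS_PER_DAY = 24

# Difference between the group means, first in days and then in hours.
difference_days = mean_date_group_1 - mean_date_group_2
difference_hours = difference_days * HOURS_PER_DAY

print(f"difference: {difference_days:.3f} days = {difference_hours:.1f} hours")
```

The result, about 44.9 hours, falls under the 48-hour bound stated above.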

17 participants estimated themselves to have above-average musical skills, 24 estimated their musical skills to be about average, and 32 estimated their musical skills to be below average. Estimates were provided before any audio examples were heard.

Of the 40 participants in treatment group 1, 11 estimated themselves to possess above average musical skills, 14 estimated approximately average musical skills, and 15 estimated below average musical skills. Of the 33 participants in treatment group 2, 6 estimated themselves to possess above average musical skills, 10 estimated approximately average musical skills, and 17 estimated below average musical skills. While treatment group 2 was clearly of lower estimated musical skills than treatment group 1, both distributions are apparently normal, and their dissimilarity in this aspect, while not dismissible, is not established. While no firm conclusion can be drawn on this point, differences in response between groups may be partially attributable to differences in level of musical skill.
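One way to gauge whether the two groups' self-estimates differ by more than chance would allow is a chi-square test of homogeneity on the counts given above. The sketch below is offered only as an illustration and is not the analysis performed in this study. It assumes that the first group 1 count (11) refers to above average skills, as the overall total of 17 implies, and it uses the standard tabulated critical value of 5.991 for two degrees of freedom at the 0.05 level. All names are hypothetical.

```python
# Self-estimated skill counts per treatment group, in the order
# above average, about average, below average (counts from the text).
observed = [
    [11, 14, 15],  # treatment group 1 (assumes 11 = above average)
    [6, 10, 17],   # treatment group 2
]


def chi_square_statistic(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if skill level were independent of group.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


statistic = chi_square_statistic(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Tabulated critical value for df = 2 at the 0.05 level (an assumption
# about the significance level, not a value taken from the thesis).
CRITICAL_VALUE = 5.991
groups_differ = statistic > CRITICAL_VALUE

print(f"chi-square = {statistic:.2f}, df = {degrees_of_freedom}, "
      f"exceeds critical value: {groups_differ}")
```

Under these assumptions the statistic is about 1.61, well below the critical value, which is consistent with the statement that a dissimilarity between the groups is not established.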

Most of the data were collected in practice rooms at the UBC School of Music, in closed rooms of UBC's Totem Park student residences, or in closed rooms at UBC's radio station, CITR. Some of the recruitment, particularly that leading to collection of nonviable responses, was done at locations where alcohol consumption is a common activity. Although some degree of intoxication may have been present in viable responses, this was not evaluated as a possible confounding variable. Moreover, no means were available for ascertaining the complete sobriety of participants at any test location, nor may it be assumed that absolute sobriety is a normal characteristic of a listening audience that includes 90 individuals.