Hahn, June Irene

A MODEL OF OCTAVE STRETCH WITH IMPLICATIONS FOR THE SUBJECTIVE REPRESENTATION OF PITCH

The Ohio State University Ph.D. 1980

A MODEL OF OCTAVE STRETCH WITH IMPLICATIONS FOR THE SUBJECTIVE

REPRESENTATION OF PITCH

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the

Degree Doctor of Philosophy in the Graduate

School of The Ohio State University

By

June Irene Hahn, B.S., M.A.

The Ohio State University

1980

Reading Committee: Approved By

Mari R. Jones

Paul D. Isaac

Thomas E. Nygren

James R. Leitzel

Department of Psychology

To my mother and the memory of my father

ACKNOWLEDGMENTS

The author is indebted to Dr. Mari R. Jones for the guidance and encouragement she provided both during the writing of this dissertation and during the years of graduate study that preceded it. As an adviser and teacher her knowledge, enthusiasm, and insight were always freely given and greatly appreciated.

The author also wishes to thank the members of her reading committee, Drs. Paul Isaac, Thomas Nygren, and James Leitzel, for their time and their helpful comments and suggestions.

Thanks go to Robin Wetzel for her help in scheduling musicians for the two experiments, and to the musicians themselves for their participation.

Finally, the author wishes to thank her friends and fellow graduate students, June Baird and Lorraine Normore, who understood better than anyone else what the writing of this dissertation entailed. They provided support, meals, and entertainment at all the right times.

VITA

July 7, 1951 Born - St. Louis, Missouri

1973 B.S., Applied Mathematics, University of Missouri, Rolla, Missouri

1975-1978 Teaching Associate, Department of Psychology, The Ohio State University, Columbus, Ohio.

1977 M.A., Psychology, The Ohio State University, Columbus, Ohio.

1978-1980 Statistical Consultant, Department of Psychology, The Ohio State University, Columbus, Ohio.

PRESENTATIONS

"Robustness of INDSCAL and ALSCAL with Respect to Violations of Metric Assumptions". Presented at Psychometric Society Meeting, August 27, 1978, McMaster University.

PUBLICATIONS

Jones, M.R., Kidd, G., and Hahn, J. "Space-Time Expectancies in Auditory Pattern Memory". OSURF Technical Report No. 2, Columbus, Ohio: Ohio State University, 1978.

FIELDS OF STUDY

Major Field: Quantitative Psychology

Studies in Mathematical Models and Perception. Professor Mari R. Jones

Studies in Statistics and Methodology. Professors Paul Isaac, Thomas Nygren, and Robert MacCallum.

TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGMENTS
VITA
LIST OF TABLES
LIST OF FIGURES

CHAPTER

I INTRODUCTION

II LITERATURE REVIEW
    Early Representations of Pitch
    Experimental Studies of Pitch
    Scaling Studies of Pitch
    Mathematical Representations
    Summary

III A MODEL OF OCTAVE STRETCH
    Introduction
    Octave Stretch
    A Threshold Model of Octave Stretch
    A Subjective Representation Based upon the Logarithmic Spiral: A Psychophysical Model with Implications for Multidimensional Scaling
    Summary

IV EXPERIMENT ONE
    Method
    Results
    Discussion

V EXPERIMENT TWO
    Method
    Results
    Discussion

VI GENERAL DISCUSSION
    Introduction
    Experiment One
    Experiment Two
    Implications for Further Research
    Summary and Conclusions

APPENDIXES

A Observed Frequency Adjustments and Values Predicted by Threshold Model
B Tone Pairs Used in Scaling Study
C Instructions for Scaling Study

REFERENCE NOTES

REFERENCES

LIST OF TABLES

Table

1. Harmonic Ratios Generated by a Spiral Analysis
2. Ward's 1954 Data on Octave Stretch
3. Number of Times that the Variable Tone Was Set Higher or Lower Than an Octave
4. Means and Standard Errors for Octave Adjustment Task
5. Root Mean Square Deviations and Average Absolute Deviations of Observed Settings from Values Predicted by Galilean Model
6. Predictions and Fit of Terhardt Model
7. Root Mean Square Deviations, Average Absolute Deviations, and Estimated k Values for Threshold Model
8. Cumulative Proportions of Ratings Differing by 0, 1, or 2 Points Across the Two Sessions
9. Mean Ratings for Musical Intervals
10. Subject Weights for the 2 and 3-dimensional Scaling Solutions
11. Rank Ordering of Dissimilarity Ratings by Octave

LIST OF FIGURES

Figure

1. Two Pictorial Representations of Pitch
2. Two Psychophysical Scales
3. Scaling Solution of Levelt, Van de Geer, and Plomp
4. Krumhansl's Chroma Cone Configuration
5. Pitch Spiral
6. Two-dimensional Scaling Solution
7. Three-dimensional Scaling Solution Dimension 1 (horizontal) vs Dimension 2 (vertical)
8. Three-dimensional Scaling Solution Dimension 1 (horizontal) vs Dimension 3 (vertical)

CHAPTER ONE

INTRODUCTION

A question that has recurrently fascinated psychologists for many years concerns the manner in which individuals perceive events or sequences of events. At the heart of this question is the fact that an individual's perception of an event, while depending on aspects of the event that can be physically defined, does not necessarily simply mirror these physical aspects. Subjectively, the event may be somewhat distorted. What psychologists have sought is a way to quantify the relationship between these physically defined events and their subjective representation. One area in which there have been attempts to develop such quantitative relationships is auditory perception, in particular with musical stimuli comprising the events.

Music is generally composed of long sequences of complex sounds. An ultimate goal of any psychological study of musical events is to understand more about how people interact with or learn these complex sequences. But to arrive at this goal it is necessary to begin very simply. First the physical stimuli must be adequately described.

Traditionally the description of musical stimuli has relied upon a one-dimensional frequency continuum. However, this linear representation, though reflecting tone height, may fail to reflect other important relations which are also inherent in the tones themselves. In light of this, alternative representations may have some appeal. One that is especially useful has been proposed by Hahn and Jones (Note 1). This representation is a bi-dimensional description of frequency in which both pitch height and tone chroma are represented.

Given that the stimuli can be objectively described, a related question concerns how these physical stimuli are subjectively represented. Many attempts at a pictorial representation have been made, from Drobisch's spiral shaped configuration in 1846 to Shepard's five dimensional double helix in 1979 (cf. Krumhansl, 1979). A necessary next step is to relate these two representations, the physical and the subjective. In this area many questions remain unanswered.

One of these questions concerns the relationship between frequency and pitch. Musical tones can be specified as definite physical quantities and measured using physical instruments. For example, middle C on a piano can be identified as that sound having a fundamental frequency of 261.63 Hz. Pitch, on the other hand, is the subjective evaluation of the frequency of a sound (Backus, 1969). This subjective property is what allows a tone to be compared to other tones, e.g., is it higher or lower than another tone. Furthermore, frequency does not appear to be the sole determinant of pitch, as the pitch of a pure tone, or tone with no harmonics, is a function not only of the frequency of the tone but also of its intensity (Stevens, 1935).

Traditionally psychologists have been most concerned with describing the relationship between frequency and subjective pitch of a single tone. This tradition is best represented by classical psychophysics. However, this approach often ignores the fact that musical relations are defined by the relation between the frequencies of two tones. That is, an interval refers not to the difference in frequency between two notes, but rather to the ratio of the frequencies of the two notes involved. For example, an octave interval is defined as two tones whose ratio of frequencies is 2 to 1. Thus the notes with frequencies of 200 and 400 Hz are an octave apart, as are the notes with frequencies of 850 and 1700 Hz. Though other intervals can be likewise expressed as different frequency ratios, the octave interval seems to be particularly important. It is instructive to consider the definition of an octave as given in the Harvard Dictionary of Music, Second Edition:

"...Acoustically the tone with twice the frequency of a home tone (ratio 1:2; e.g., a' = 440; a'' = 880). The octave is the most perfect consonance, so perfect that it gives the impression of duplicating the original tone, a phenomenon for which no convincing explanation has been found. ... The fundamental importance of the octave appears also from the fact that it is the only interval common to practically all scales ever evolved, regardless of the number or pitch of the intermediate steps."

Of importance in this definition are the facts that (1) the octave is defined as a 2 to 1 ratio; (2) this particular interval is common to many scales; and most importantly (3) there is currently no explanation as to why the octave is such a salient interval. It has been hypothesized that the consonance or pleasing sound of this interval is due to the absence of beats. This refers to the fact that when two notes separated by an octave are sounded, the second, fourth, sixth and so on harmonics of the lower note will coincide with the first, second, third and so on harmonics of the higher note. However, this explanation does not account for the fact that the octave relationship remains salient even when the tones involved are pure tones (i.e., tones without harmonics). The bi-dimensional frequency model proposed by Hahn and Jones quantitatively addresses the objective salience of the octave interval and it also provides a means for incorporating this salience into a subjective representation. These aspects of the model, among others, will be developed subsequently.
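The coincidence of harmonics described above can be checked with a short calculation. The following is only an illustrative sketch: it assumes ideal harmonics at exact integer multiples of the fundamental, and the function names are hypothetical, not part of any model discussed here.

```python
def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics (integer multiples) of a fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


def shared_partials(low_hz, high_hz, count=6):
    """List (harmonic number of low tone, harmonic number of high tone, frequency)
    for every frequency that appears in both harmonic series."""
    low = harmonics(low_hz, count)
    high = harmonics(high_hz, count)
    return [(low.index(f) + 1, high.index(f) + 1, f) for f in low if f in high]


# Octave pair taken from the text: 200 Hz and 400 Hz.
octave_pairs = shared_partials(200, 400)
# Harmonics 2, 4, 6 of the lower tone line up with harmonics 1, 2, 3 of the upper tone.
print(octave_pairs)  # [(2, 1, 400), (4, 2, 800), (6, 3, 1200)]
```

The same pattern holds for the 850 and 1700 Hz pair, since only the 2 to 1 ratio matters and not the absolute frequencies.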

In a larger framework, individual tones and the intervals they form in context comprise musical scales. From the large number of frequencies that can be heard by the human ear a discrete subset is chosen to form a scale. Individual elements of the scale are termed notes. Many scales are possible, but most agree that an octave is a 2 to 1 frequency ratio. The equal temperament scale is basically a tuning system which divides each octave into twelve equal logarithmic steps. In contrast, other tuning systems have slightly irregular spacing. These spacings yield scales, such as the Pythagorean scale and the Just Intonation scale, which preserve simple ratios between certain tones. Hahn and Jones note that the equal temperament scale is consistent with their representation of frequency. Subjectively, it appears from a number of studies involving musicians that the internal scale used by musicians corresponds fairly closely to the equal temperament scale, but with a slight stretching of the intervals (Ward, 1970).
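The division of the octave into twelve equal logarithmic steps can be made concrete with a brief sketch. It assumes the reference pitch a' = 440 Hz from the definition quoted earlier; the function names are hypothetical, and the 3 to 2 fifth is used only as a familiar example of a simple ratio preserved by just tuning.

```python
import math

A4_HZ = 440.0  # reference pitch a' = 440 Hz (assumed tuning standard)


def equal_tempered_hz(semitones_from_a4):
    """Frequency of the note a given number of equal-tempered semitones from a'.
    Each semitone multiplies frequency by the twelfth root of 2."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per 2:1 octave)."""
    return 1200 * math.log2(ratio)


middle_c_hz = equal_tempered_hz(-9)   # nine semitones below a'; close to 261.63 Hz
tempered_fifth_cents = cents(equal_tempered_hz(7) / A4_HZ)  # seven equal steps
just_fifth_cents = cents(3 / 2)       # simple-ratio fifth, slightly wider

print(round(middle_c_hz, 2), round(tempered_fifth_cents, 2), round(just_fifth_cents, 2))
```

Twelve such steps return exactly to the 2 to 1 octave, whereas the tempered fifth falls about two cents short of the 3 to 2 ratio.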

This subjective stretching of intervals has been extensively studied with respect to one interval in particular, namely the octave. When listeners judge octave relationships, they are consistently and reliably wrong in that their subjective judgment of an octave yields frequency ratios which are somewhat larger than 2 to 1. Though the lengthening of the subjective octave relative to the physical octave is more pronounced for frequencies above 1,000 Hz, this stretching holds over the entire frequency range. Several theorists have attempted to address this issue. Ward (1970) suggested that the stretched scale of the piano might be responsible for the stretching of an internal musical scale. Piano strings, not being infinitely thin, vibrate in such a way that the partials are not exactly harmonically related, so the piano scale is stretched. If a subjective musical scale is built based upon the piano scale, the internal scale would also be stretched.

Another theorist, Terhardt (cf. Sundberg and Lindqvist, 1973) also believes the stretching is due to learning acquired early in life as a result of exposure to complex sounds. According to this view, the pitches of neighboring partials move away from each other slightly due to a mutual masking effect. This leads to a subjectively stretched scale. On the other hand, Dowling (1973) suggests that stretching is due to the structure of the auditory system itself. A final possibility which will be explored is concerned with the relationship between the organism and the particular tones being considered. The model underlying this approach is dynamic in nature and assumes that people interact with structure that is inherent in the physical stimuli.
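The amount of octave stretch is conveniently expressed as the excess, in cents, of an adjusted interval over the physical 2 to 1 octave. The sketch below illustrates only that arithmetic. The 1000 Hz standard and 2010 Hz setting are hypothetical values chosen for illustration and are not data from any study cited here.

```python
import math

CENTS_PER_OCTAVE = 1200


def octave_stretch_cents(standard_hz, adjusted_hz):
    """Excess of the adjusted interval over a physical 2:1 octave, in cents.
    Positive values mean the subjective octave is wider than 2:1."""
    return CENTS_PER_OCTAVE * math.log2(adjusted_hz / standard_hz) - CENTS_PER_OCTAVE


# Hypothetical setting: a listener adjusts the variable tone to 2010 Hz
# when asked for the octave above a 1000 Hz standard.
stretch = octave_stretch_cents(1000.0, 2010.0)
print(round(stretch, 2))  # a few cents above zero
```

A setting at exactly twice the standard frequency gives a stretch of zero, so the measure isolates the departure from the physical octave.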

To summarize, in order to describe the subjective representation of frequency relations it is first necessary to consider an objective representation. Musical sounds themselves incorporate not only a tone height component, but also contain information about tone chroma or octave relations. A subjective representation should reflect both of these dimensions. Further, pitch is not equivalent to frequency and subjective intervals are not identical to physical intervals. Based upon a dynamic model, a possible explanation for this subjective stretching of intervals will be proposed.

In the next chapter the history of attempts at relating frequency and pitch will be traced. This review includes early representations, which were primarily pictorial in nature; experimental studies which suggested the importance of more than one dimension; and two approaches that have currently been used to arrive at a description of the subjective representation of pitch. A model for predicting the stretching of musical intervals will be developed in chapter 3. Implications of this model as well as of a particular physical representation of frequency will be explored. The validity of the stretch model and the proposed physical representation of frequency will be evaluated experimentally through two studies. Results of these studies will be reported in chapters 4 and 5. Issues will be summarized and implications of the experimental results will be discussed in a final chapter.

CHAPTER TWO

LITERATURE REVIEW

Early Representations of Pitch

The earliest attempts in psychology at representing pitch were primarily pictorial. These representations were only loosely tied to physical dimensions and no attempt was made to quantitatively describe either physical or subjective relationships. However, even in this early research, it was commonly accepted that pitch representations must have more than one dimension. Besides a tone height dimension, tone chroma was also frequently included. Tone height is directly related to the magnitude of frequency — the note D in the fourth octave having a frequency of 293.66 is higher than the note C in the same octave having a frequency of 261.63, and people readily report this sort of rise in pitch. Tone chroma refers to the position of a tone within an octave. It reflects the commonality of all C's or all D's in spite of frequency differences. Despite some acrimonious debate over the issue, it turns out that people also respond to chromatic relationships between tones.
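The distinction between tone height and tone chroma can be illustrated by splitting the logarithm of frequency into a whole number of octaves and a fractional remainder. This is a generic sketch and not the bi-dimensional model of Hahn and Jones; the choice of middle C (261.63 Hz) as the reference and the function name are assumptions made for illustration.

```python
import math

C4_HZ = 261.63  # middle C, used here as an assumed reference frequency


def height_and_chroma(freq_hz, ref_hz=C4_HZ):
    """Split log2(frequency / reference) into an octave index (tone height)
    and a fractional position within the octave (tone chroma, 0 to 1)."""
    octaves = math.log2(freq_hz / ref_hz)
    octave_index = math.floor(octaves)
    chroma = octaves - octave_index
    return octave_index, chroma


d4 = height_and_chroma(293.66)       # D in the same octave as middle C
d5 = height_and_chroma(2 * 293.66)   # D one octave higher
c5 = height_and_chroma(2 * C4_HZ)    # C one octave above the reference

# The two D's differ in height (0 vs 1) but share nearly the same chroma,
# about 2/12 of the way around the octave.
print(d4, d5, c5)
```

On this decomposition all C's share a chroma of zero and all D's share a chroma near one sixth, while the octave index carries the rise in pitch.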

In 1846 Drobisch attempted a subjective representation of pitch that would incorporate both these aspects. The configuration he drew was crudely spiral shaped. The advantage a spiral shape has over a straight line representation is that notes separated by an octave are brought into close proximity through successive sweeps of the spiral's whirls. This attempt was criticized by Ebbinghaus (1902) on the grounds that other important musical relations such as the fifth were not represented by close spatial proximity, and further that notes an octave apart could only be reached by approaching through intermediate and totally dissimilar notes. He proposed instead a one-dimensional subjective representation and added that a spatial representation which could also exhibit important musical relationships among tones was impossible.

The next efforts took somewhat different shapes. In 1909 Titchener proposed a two-dimensional pencil shaped configuration which he termed the tonal pencil. The first dimension was tone height. The second dimension, which he called diffusion or volume, was relatively large in the bass notes, moderate and without much variation through the middle notes, and then rapidly disappeared in the high register, giving the pencil its point. Ogden (1920) took a different view from those expressed before. He states "... it is now possible to regard the octave quality of a tone as perceptual in origin ... in neither case does a quality of consonance or an octave character attach itself directly to the simple element of tonal experience, as does, by contrast, its pitch, its intensity, or its volume." He was concerned only with characteristic aspects of single tones or elemental sounds and did not believe relations or intervals between tones were a characteristic aspect of pitch. He recognized four attributes of pitch: pitch-brightness, volume, intensity, and duration. The particular degree of each of these dimensions which is attached to any given sound is determined by the psycho-physical conditions under which the sound exists. Based upon these dimensions he constructed a psychological tonal manifold for pure tones. This is shown in Figure 1. Each tone was represented by a spread along a baseline indicating its volume, and from this baseline it rose to a peak, the sharpness of which indicated its pitch. The height of this peak above the baseline measured the sound's inherent intensity.

In 1929 Ruckmick objected to Ogden's tonal manifold on the grounds that it included quantitative distinctions such as thresholds of intensity in what was an otherwise qualitative scheme. He proposed his own representation called the "tonal bell" adopting ideas from both Drobisch and Titchener. In this bell pitch is represented by a continuous gradually ascending spiral line. Ruckmick's configuration is shown in Figure 1. This spiral begins somewhere below the auditory limit for tones and continues beyond the upper tonal limit. At each successive octave, the ascending line brings notes into close proximity or tonality, thus, once again octave relationships are represented. Volume or breadth is given as a broad base to the figure, which gradually diminishes to a relatively constant amount through the midrange and rapidly decreases in the upper range, leading to the bell shape. Furthermore, the spirals are closer together in the high and low registers indicating that a tone may go around an octave in these regions of the scale without rising much in pitch. This figure was intended primarily as a qualitative representation which could also account for certain known phenomena. Increasing pitch was exhibited by the ascension of the spiral; the importance of octave relationships was manifested by close spatial proximity of notes separated by an octave; pitch discrimination is most acute in the middle registers and spiral lines are correspondingly further apart in this range. Though these phenomena were noted and incorporated into the representation, it was not until later that an attempt was made to tie physical dimensions to these subjective representations.

Figure 1: Two pictorial representations of pitch.

In general these early representations regarded pitch as multidimensional. However, there was great variability in the particular dimensions incorporated as well as clear ambiguity with respect to the relationship between the subjective and physical dimensions. Also in evidence even at this time was a controversy over the relative salience of octave and other intervals and how if at all they should be represented.

Experimental Studies of Pitch

Early attempts at representing the subjective structure inherent in pitch were primarily qualitative in nature. They were pictorial representations that tried to capture known phenomena. Though dimensions in these representations were at times clearly related to physical dimensions, the quantitative function relating the two was not determined. Furthermore, no experimental data was gathered as proof of the validity of the representation. For example Ruckmick did not try to tie the dimensions of his tonal bell to any objective dimensions.

Beginning in 1937 a study by Stevens and Volkmann raised important questions about the validity of this kind of representation. They asked the following questions: Is there more information available in the stimulus itself than just tone height? Are there systematic deviations of the subjective representation from the known physical one? And what is the psychophysical function relating an objective stimulus, defined as a single frequency, to a subjective attribute such as pitch? Evidence was collected on each of these questions.

The last question was addressed by Stevens and Volkmann in 1937. They attempted to specify the psychophysical function relating pitch to the physical dimension of frequency. Their goal was a scale of pitch which could be expressed in numbers whose values were directly proportional to the magnitude of the observed frequency. They accomplished this experimentally through the method of fractionation. This method requires subjects to adjust the frequency of a variable tone until it sounds half as high as a standard tone. To the extent that listeners underestimate or overestimate the half frequencies, the subjective scale will differ from the physical dimension. Using results of the fractionation task, numbers were assigned to the tones to derive a numerical scale of perceived pitch. The 1000 Hz tone was assigned a value 1000 on this scale. Then the tone which was heard as half as high was assigned the value 500 and so on. The resulting scale, termed the mel scale (see Figure 2), related the perceived pitch of a tone (in units called mels) to its frequency. Unlike the early pictorial subjective representations, this scale was quantitatively tied to a physical dimension. Given a tone of a specific frequency, its subjective value, in terms of mel units, could be determined.

Figure 2: Two psychophysical scales. [Panels: pitch in mels as a function of frequency; difference in cents as a function of frequency.]
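The assignment of scale values from half-pitch judgments, as described above, amounts to repeated halving from an anchor point. The sketch below shows that bookkeeping only. The half-pitch settings (420 Hz and 190 Hz) are hypothetical numbers invented for illustration, not values reported by Stevens and Volkmann, and the function name is likewise an assumption.

```python
def build_fractionation_scale(anchor_hz, anchor_value, half_pitch_settings):
    """Chain half-pitch judgments into a scale.

    half_pitch_settings maps a standard frequency to the frequency that
    listeners judged to sound half as high. Each link halves the scale value.
    """
    scale = {anchor_hz: anchor_value}
    freq, value = anchor_hz, anchor_value
    while freq in half_pitch_settings:
        freq = half_pitch_settings[freq]
        value = value / 2
        scale[freq] = value
    return scale


# Hypothetical judgments: 420 Hz heard as half as high as 1000 Hz,
# and 190 Hz heard as half as high as 420 Hz.
hypothetical_settings = {1000: 420, 420: 190}

# The 1000 Hz tone is anchored at 1000 scale units, as in the text.
scale = build_fractionation_scale(1000, 1000, hypothetical_settings)
print(scale)  # {1000: 1000, 420: 500.0, 190: 250.0}
```

Whenever the judged half-pitch frequency differs from half the standard frequency, the resulting scale values depart from proportionality with frequency.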

The mel scale, though quantifying the relation between frequency and perceived pitch, did not have any basis for expressing the importance of musical intervals such as the octave. Ward (1954) later questioned the usefulness of the mel scale on these grounds. He felt that subjective musical intervals should correspond at least roughly to physical musical intervals as musical relations are derived from experience with physical frequency ratios. Evidence seemed to suggest that notes separated by an octave, in which the physical frequency ratio is 2.00, have something in common. For example, Bachem (1937), in a study of persons with absolute pitch, found that in identifying tones these people would often name the correct note, but would place it in the wrong octave. Other studies using nonmusical subjects reported similar results. Humphreys (1938), using psycho-galvanic response as a dependent measure, found that people trained to a specific frequency displayed greater generalization to tones an octave apart than to other tones. Further in 1943 Blackwell and Schlosberg reported evidence that rats trained to respond to a certain tone generalized to tones an octave from the standard. And significantly, this occurred both when the stimuli were tones with some harmonics and when the stimuli were pure tones. The importance of this experiment is not only that it supports the salience of the octave interval, but also that it occurs in animals, thus ruling out musical training as an explanation. Additional support for the importance of the octave interval was given by Bachem (1954) in an experiment investigating the effects of time on absolute pitch discriminations. Subjects were required to judge whether a standard and a comparison tone were the same. The time between the presentation of the standard and the comparison tone varied from 1 second to 1 week. For some subjects, octave errors were common. They would incorrectly identify the comparison tone as the standard when they were an octave multiple of each other.

Ward (1954) made use of the importance of the octave interval in an attempt to devise a subjective scale of musical pitch that differed from the mel scale of Stevens and Volkmann. Having objected to the mel scale on the grounds that it was constructed based upon judgments of nonmusical relations, Ward derived a scale based upon a specific musical relation. He hoped to find that subjective musical intervals would correspond at least roughly to physical musical intervals. Accordingly he used the subjective octave as the unit for his scale.

In several experiments (using pure tones) he presented musically trained listeners with a fixed tone and then asked them to adjust the frequency of a variable tone until it was an octave above the standard. In one experiment he utilized frequencies of 425, 757, 1180, 1660, and 2225 Hz. In a second experiment he used frequencies sampled from the continuum above 250 Hz in order to obtain a fairly continuous function relating the subjective octave to the physical octave. The scale he constructed from these data relates frequency to subjective musical pitch. As shown in Figure 2 this curve exhibits a general upward trend, indicating that musical octaves change at a slightly slower rate than frequency. For example, a tone with frequency 500 is not heard as an octave above the tone with frequency 250. A slightly higher frequency is required to subjectively sound like an octave. He concluded that "The consistent difference between the average MP (musical pitch) scale and the ordinary systems of temperament is probably the most puzzling finding of the present study.... However, the fact remains that, with pure tones, the scales for all observers show this gradual stretching and therefore it may be considered normal."

Other evidence for both the ability of humans to make octave judgments and for the stretching of subjective octaves is summarized by Dowling (1973). He reviews cross-cultural research in octave judgments and concludes that both westerners and non-westerners are extremely precise in making octave judgments with pure tones. Furthermore, these judgments deviate systematically from the 2 to 1 frequency ratio in the direction of larger or stretched subjective octaves, and the amount of deviation from the physical octave increases in the higher frequency range. Burns (1974) reported a study in which nine Indian musicians performed a task similar to Ward's. His results were also essentially the same as Ward's: a small but statistically significant frequency dependent stretching of the subjective octave. This finding offers evidence against the theory that octave stretch is learned from the stretched tuning of the piano which western musicians are exposed to. This stretched tuning is not found in most instruments which Indian musicians use.

Studies thus far cited were based upon judgments using pure tones. Sundberg and Lindqvist (1973) essentially replicated Ward's study, but utilized complex tones as the stimuli. Musically trained subjects adjusted the frequency of a variable tone to be an octave above a reference tone. As in Ward's study, they found generally small intrasubject variability with the most variable settings being made at the extreme ends of the frequency range; somewhat larger intersubject variability, but with similarly shaped curves relating the physical octave to the subjective octave across subjects; and the frequency ratio required for subjects to perceive an octave slightly exceeding a 2 to 1 ratio. In fact, the subjective values obtained with sinusoids tended to be larger for higher frequencies and smaller in the lower frequency region when compared with values obtained using complex tones.

Other support was offered by BfcLfner (1963) who found that sleep deprivation increased the frequency required for octave determination, while treatment with d-amphetamine reduced this frequency.

These studies provide evidence for a discrepancy between a particular musical interval, the octave, and its subjective value. The results were presented as psychophysical functions or uni-dimensional scales relating frequency to subjective octaves. Note, however, that unlike the early subjective representations in which tone height and tone chroma were often pictorially displayed as separate dimensions, no attempt was made to separate the two components in these experiments.

An important study by Shepard (1964) provided convincing experimental evidence that the two in fact could be separated.

For this study, he constructed a special type of complex tone, each tone consisting of ten simultaneously sounded sinusoids spaced at octave intervals. Amplitudes of the components were large for the components of intermediate frequency, and tapered off gradually at the high and low end of the frequency region. Given an initial complex tone, a second tone was constructed by shifting up all components the same fraction of the way toward the next higher octave. This upward shift in frequency is offset to some degree because the contributions of the lower components are increased while the contributions of the upper components are decreased. In fact, if the tone is shifted up an entire octave it becomes identical to the first tone. Thus this scheme for constructing tones was a cyclic one, and the twelve notes so generated can be represented as regularly spaced points around a circle. These tones then maintain a chroma dimension while suppressing the pitch height dimension. When these tones were played successively, it appeared to listeners as if each tone was higher than the tone which preceded it in spite of the fact that the same twelve tones were repeated over and over again as the sequence recycled. He further had subjects make pairwise judgments of the tones. Having heard one tone they had to judge whether a second tone was up or down with respect to the first. Again results supported the circularity of the tones. For example, when the two tones were diametrically opposite on the circle, the second tone was judged to be higher than the first about as often as it was judged to be lower. This study clearly supports an attribute of pitch other than tone height.
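The cyclic construction described above can be sketched numerically. The example assumes a raised-cosine amplitude envelope over log frequency that falls to zero at both ends, and a lowest envelope frequency of 5 Hz; both are illustrative assumptions rather than Shepard's published parameters, and all names are hypothetical. Only the ten octave-spaced components and the twelve steps come from the text.

```python
import math

N_COMPONENTS = 10  # ten sinusoids spaced at octave intervals (from the text)
N_STEPS = 12       # twelve tones around the chroma circle (from the text)
F_MIN_HZ = 5.0     # assumed lower edge of the amplitude envelope (hypothetical)


def envelope(freq_hz):
    """Assumed raised-cosine amplitude over log frequency:
    zero at both ends of the ten-octave range, largest in the middle."""
    position = math.log2(freq_hz / F_MIN_HZ) / N_COMPONENTS  # runs from 0 to 1
    return 0.5 * (1 - math.cos(2 * math.pi * position))


def shepard_components(step):
    """(frequency, amplitude) pairs for the tone shifted up by `step` twelfths of an octave."""
    components = []
    for k in range(N_COMPONENTS):
        freq = F_MIN_HZ * 2 ** (k + step / N_STEPS)
        components.append((freq, envelope(freq)))
    return components


def audible(components, floor=1e-9):
    """Keep only components with non-negligible amplitude, rounded for comparison."""
    return [(round(f, 6), round(a, 6)) for f, a in components if a > floor]


# Shifting every component up a full octave reproduces the original tone:
# the top component fades to zero as the bottom one fades in.
print(audible(shepard_components(0)) == audible(shepard_components(12)))  # True
```

Because the envelope is fixed while the components slide beneath it, a half-octave shift (step 6) gives a different spectrum, but a full-octave shift returns to the starting tone, which is what places the twelve tones on a circle.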

In another study Risset (1969) constructed complex tones in a similar way and found evidence that tonality and tone height can be varied independently.

Two other studies found differing degrees of support for the chroma dimension. Attneave and Olson (1971) investigated whether listeners transpose musical patterns according to log frequency values, which would be in accordance with musical scales, or according to the mel scale. When subjects were asked to transpose a novel pattern into a different frequency region there was wide individual variation.

Subjects who had musical training transposed the pattern according to a log frequency scale while subjects with no musical training differed greatly from the musical group and also from one another. However, on a second task, which involved transposing a very well learned pattern, the nonmusical subjects also transposed along a log frequency scale.

This study raised doubts about the usefulness of the mel scale in representing subjective pitch and offered further support for the chroma dimension. Somewhat conflicting results were reported by Thurlow and Erchul (1977). They tested subjects with musical training on their ability to identify two-note musical intervals in which the upper note is shifted by one or more octaves. Only some of the subjects were very accurate at this task. Similar results were found in a second task. Subjects heard three tones and were asked to judge which of the last two notes was more similar to the first. One of these two tones was an octave from the first. Most of the subjects did not perceive greater similarity for octaves than for nonoctaves.

These experimental studies sought answers to old problems and raised some new ones. Attempts were made to specify the relationship between the physical dimension of frequency and several subjective ones. In all cases, however, the linear representation of frequency was the objective representation. Although it became apparent through experiments with people and rats that there might also be information in the stimulus about octave relationships, only single-dimensional functions were studied. Furthermore, through these experiments the phenomenon of the stretching of subjective octaves was noted, but no theory was able to adequately explain this finding.

Scaling Studies of Pitch

The mel scale and the subjective octave scale were psychophysical models of subjective pitch. Using fractionation and interval adjustment tasks, one-dimensional subjective representations were arrived at. These representations related the objective linear frequency continuum to a subjective dimension. With the advent of multi-dimensional scaling (MDS) techniques (Kruskal, 1964), procedures became available for arriving at subjective representations in more than one dimension.

Besides allowing for multi-dimensional representations, MDS techniques have another advantage. The task subjects perform in MDS studies is simpler than fractionation tasks. More importantly, it does not entail specifying to subjects the basis on which they are to make their judgments. Psychophysical methods involved asking subjects to make judgments based on height, as in fractionation tasks, or on interval relations, as in octave judgments. Scaling techniques, on the other hand, attempt to uncover the underlying subjective dimensions relating a set of stimuli, whatever these dimensions may be. Data for MDS programs are proximity measures, or measures indicating the degree of similarity between pairs of stimuli within the set. For example, a subject may be asked to rate how similar two tones seem to be on a scale of 1 to 7, 1 being very similar and 7 very dissimilar. Based upon average similarity measures between pairs of stimuli, MDS recovers whatever structure the subjects use to make these judgments. The final result is a spatial arrangement of the stimuli in one or more dimensions. This configuration is derived by the minimization of a function which measures how poorly distances between points in the space are related to the similarity measures between the same points. Thus in the solution, pairs of stimuli which were judged to be very similar should be close together, while pairs which were judged to be dissimilar should be far apart. Dimensions upon which the configuration is based may be linked to specific physical dimensions, as is done in psychophysical models. However, it is more common that the dimensions are viewed as having been constructed by the subjects themselves, and not necessarily related to structure inherent in the stimuli.
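The idea of deriving a configuration by minimizing a badness-of-fit function can be sketched as follows. This is a simplified metric version that minimizes raw squared error by gradient descent; it is not Kruskal's nonmetric procedure, and the function names, step size, starting configuration, and toy dissimilarities are all illustrative assumptions.

```python
import math

def stress(points, dissim):
    """Sum of squared differences between map distances and dissimilarities."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += (math.dist(points[i], points[j]) - dissim[i][j]) ** 2
    return total

def fit_mds(dissim, start, steps=2000, rate=0.01):
    """Move points downhill on the stress surface, starting from `start`."""
    pts = [list(p) for p in start]
    n, dims = len(pts), len(pts[0])
    for _ in range(steps):
        grads = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(pts[i], pts[j]) or 1e-12   # guard against coincident points
                scale = 2.0 * (d - dissim[i][j]) / d
                for k in range(dims):
                    grads[i][k] += scale * (pts[i][k] - pts[j][k])
        for i in range(n):
            for k in range(dims):
                pts[i][k] -= rate * grads[i][k]
    return pts

# Toy data: four stimuli whose dissimilarities are the distances of a unit square.
r2 = math.sqrt(2.0)
square = [[0, 1, r2, 1], [1, 0, 1, r2], [r2, 1, 0, 1], [1, r2, 1, 0]]
start = [[0.1, 0.0], [0.9, 0.2], [0.8, 1.1], [-0.1, 0.7]]   # arbitrary initial guess
fitted = fit_mds(square, start)
```

In this toy case the fitted points form a configuration whose interpoint distances closely reproduce the input dissimilarities, so stimuli judged similar end up close together and dissimilar ones far apart.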

One of the first applications of MDS to representing subjective pitch was reported by Levelt, Van de Geer, and Plomp (1966). They used the method of triadic comparisons to arrive at similarity measures among tones. The stimuli they used were pairs of tones sounded simultaneously, the mean value of the frequency of each pair being 500 Hz. This mean value across all pairs was chosen to suppress pitch height as a dimension. Subjects were presented with three tone pairs and asked to decide which two pairs were the most similar and which two pairs were the least similar. Two sets of stimuli were used, one being pairs composed of pure tones and the other being pairs composed of complex tones. MDSCAL was used to analyze the similarity matrices. For both complex and pure tones, three-dimensional solutions were accepted. A dimension common to both solutions was a horseshoe-shaped dimension in which musical intervals were ordered according to the width of the interval (see Figure 3). For the complex tones they identified a third dimension in which the intervals were ordered according to the simplicity of their frequency ratios. On the other hand, the third dimension for the pure tones was interpreted as indicating the existence of ideal reference intervals, such as a fifth, third and octave.

Shepard (1974) later criticized their solution on the ground that the horseshoe-shaped figure could merely be the consequence of extracting more dimensions than the nonmetric analysis could support.

Levelt et al.'s solution, in recovering dimensions relating to musical intervals and ratios, was interpreted from a constructionist viewpoint. That is, in spite of the fact that specific frequencies do correspond to these special musical ratios, they interpreted the solutions as revealing the subjects' construction of subjective dimensions. Thus, through exposure to music, people are thought to build subjective dimensions based upon learned relationships. Implicit in this argument, then, is the assumption that individuals with musical training will have different subjective dimensions than individuals with less training. Levelt, Van de Geer, and Plomp did not indicate what, if any, musical training the subjects in their study had. Balzano (1977) also used multi-dimensional scaling to find a subjective representation of musical intervals, but specifically restricted his subjects to musicians. These subjects heard a musical interval, while at the same time a musical name was visually presented. Subjects responded same or different and their reaction times were analyzed with MDS. A two-dimensional configuration resulted, in which the intervals fell in a nearly circular arrangement corresponding to tone chroma.

Figure 3: Scaling solution of Levelt, Van de Geer and Plomp. Points represent musical intervals, composed of two notes having the specified frequency ratio.

A more direct way to get at individual variation in representations of music is through scaling models which incorporate individual differences. In these procedures a common spatial configuration of the stimuli is derived for all of the subjects. However, individuals can differentially weight the dimensions. Thus musically trained subjects may have higher weights on some dimensions than subjects who have no such training. Miller and Carterette (1975), in a study of the attributes of timbre, investigated this possibility. They used INDSCAL, an individual differences scaling program, to analyze proximity measures between tones. This analysis resulted in a three-dimensional solution. One of these dimensions was interpreted as tone height, and it seemed to be overwhelmingly salient in terms of the subjects' weights. Furthermore, no differences were found between musical and nonmusical subjects either in the size of the weights or in their distribution.

These results seem to indicate that subjects are responding to structure inherent in the stimuli. Other studies of pitch representation were also conducted based upon the assumption that musical training leads to the construction of subjective dimensions. Two studies by Krumhansl (1979; Krumhansl and Shepard, 1979) investigated how listeners respond to tones in organized musical sequences. Rather than have subjects respond to isolated tones, she first established a tonal context by playing the seven notes of an ascending or descending scale. In one experiment (Krumhansl and Shepard, 1979) this sequence was followed by a single tone from somewhere in the octave range following the sequence. Subjects with varying amounts of musical training rated how well the tone completed the sequence. They found three factors which appeared to account for the ratings: pitch height, octave relationship, and a hierarchy of tonal functions, the best fit being the tonic note, then other tones of the major triad chord, other notes of the diatonic scale, and finally nondiatonic tones. Differences were found due to the musical background of the subjects, with the least musical subjects basing their ratings almost exclusively on pitch height, while more musical subjects appeared to use the other two factors as well.

In the second study, the sequence was followed by a pair of tones. Subjects were asked to rate how similar the first tone was to the second with respect to the tonal system suggested by the musical sequence. As in the earlier psychophysical studies, subjects were given some guidance in how they were to base their judgments. Responses were averaged across subjects, as they had more or less equivalent musical backgrounds. These data were then analyzed using KYST. A three-dimensional solution which approximated a right circular cone was accepted. A slightly idealized solution is shown in Figure 4. As shown, notes were ordered around the cone according to pitch height, with the two notes separated by an octave coming into close spatial proximity. Furthermore, the close relationship among diatonic tones was observed in that the diatonic tones fell on a plane close to the vertex of the cone, while the nondiatonic tones fell along a plane farther from the vertex. Subjects again appeared to be responding based upon a learned scale. The scale corresponded to the scale established by the tonal context.

Figure 4: Krumhansl's chroma cone configuration.

Scaling approaches have led to the return of pictorial subjective representations. They did, however, illustrate that people respond to tones in a multi-dimensional way. Further, both the shape and number of relevant dimensions vary as a function of the musical background of the listeners. However, as with early representations, how these subjective dimensions quantitatively relate to objective physical dimensions was not usually an issue. The emphasis was on the purely subjective structure, with little attention given to how it comes into being and how it relates to objective structure.

Mathematical Representations

Both the results of experimental studies and scaling solutions indicated several dimensions are of importance in pitch perception.

Questions remain as to whether these dimensions reflect structure that exists in the stimuli, or are a result of musical training. Theories of music perception do not usually address this issue. However, several current models exemplify different theoretical approaches to the interaction of the listener with tones.

Two different theoretical approaches are illustrated by a neuropsychological model and mathematical representations. Deutsch (1969) proposed a neuropsychological model. In this model, tones are associated with particular neural units. Neural units corresponding to tones an octave apart converge onto the same higher order neural unit. Other theorists, such as Rothenberg and Jones, emphasize certain mathematical relationships between tones. Rothenberg argues that subjective representations are based upon musical scales, while Jones maintains that subjective representation stems from the dynamic pick-up of group symmetries in the physical stimulus. The particular symmetries relevant to a subjective representation of pitch are specified by a bi-dimensional representation of the subjective dimension frequency.

In a series of papers, Rothenberg (1978a, b, c) presented a mathematical formalization of the structure in musical scales. A function of scales is to provide for measurement of intervals and also for rapid identification of the scale elements. Listeners use these musical scales as reference frames. He assumes, as Krumhansl did in her scaling study, that listeners extract these reference frames from a musical context. Based upon this context, they partition musical stimuli into classes, each of whose elements are equivalent in musical function. This classification or coding maps a continuous space of possible intervals into a set of discrete points. In this scheme, the perception of music is in large part learned. Musical scales, having specific mathematical properties, are learned, and later used as reference frames. The particular reference frame a listener chooses will determine to some extent how the intervals he hears are coded. Structure, in this model, resides in the musical scales.

Jones (1976, 1980) also discusses the structure in musical scales, but with a different emphasis. She employs group symmetry rules to describe both objective and subjective structure. Structure, or patterning, in objective stimuli can be described by what remains invariant during change. For example, in music octave relations are defined as 2 to 1 frequency ratios. This is true regardless of the particular frequencies involved. Further, she assumes that humans detect these invariances, although not necessarily veridically, and use them to build corresponding subjective dimensions. As a representation of pitch she hypothesized a particular geometric configuration, the logarithmic spiral. This pattern is constructed by a rotation around a center point of a circle coupled with a regular dilation of the circle's radius. Elements of this configuration can be succinctly described by two generative rules, a dilation ratio and a rotation. Mathematically it can be expressed by the formula

$$f = f_0\,c^{\theta} \qquad (1)$$

where $c$ is the dilation ratio and $\theta$ represents angular rotation. This configuration can capture two dimensions of pitch: tone chroma, represented by each whorl of the spiral, and tone height, represented by movement away from the center point. Thus, in this model frequencies are represented as motions. As shown in equation (1), a frequency, $f$, can be related to a reference frequency, $f_0$, by a rotation and dilation.

Hahn and Jones (Note 1) expanded upon this idea. They presented the logarithmic spiral, not as a subjective representation, but rather as a system for describing the physical attribute frequency. This representation is shown in Figure 5. Though frequency is traditionally represented as a one-dimensional linear continuum, they argue that tones contain sufficient information physically to support the development of more than one subjective dimension. This model provides a basis for two dimensions as well as explaining important relations that exist between musical tones. For example, the octave interval, which occurs in many scales, has a special representation in the spiral. Notes separated by an octave lie on a common radius, but different whorls.

Figure 5: The pitch spiral.

Furthermore, as arc lengths of the spiral are associated with frequencies and frequency differences, the logarithmic relationship between frequencies is explained as a property of the model.

More specifically, to describe frequency by this configuration the dilation ratio in equation (1) must be $c = 1.1166$. Other dilation ratios would yield different logarithmic spirals which would not describe musical relations so neatly. For example, with a dilation ratio of 1.1166, rotation through 360° (6.28319 radians) yields a doubling of frequency, as:

$$1.1166^{6.28319} = 2.0$$

With other generators, rotation through 360° would give different expansions. If $c$ were 1.2, then:

$$1.2^{6.28319} = 3.14$$

Musical intervals other than the octave can also be represented. In general, any frequency ratio can be rewritten in terms of the spiral's generators:

$$f / f_0 = c^{\theta} \qquad (2)$$

The resulting ratios are shown in Table 1. Column 1 gives the musical interval; column 2 gives the interval in terms of its position along a circle divided into 12 equal segments; column 3 gives the angle, in degrees, necessary to specify the interval; column 4 gives the angle in radians. Predicted harmonic ratios based upon the logarithmic spiral equation are shown in column 5, while harmonic ratios for the equally tempered scale as reported in standard musical texts are given in column 6.

Table 1

Harmonic Ratios Generated by a Spiral Analysis
General expression: $r_j = r_0\,c^{\theta_j}$

                                 Polar angle            Equal-tempered harmonic ratios
Musical interval    Position    deg.     rad.        Predicted (c = 1.1166)   Observed (from Backus)
(1)                 (2)         (3)      (4)         (5)                      (6)

Unison              N0            0°     0           1.0000                   1.000
Octave              N12         360°     6.28319     2.0000                   2.000
Semitone            N1           30°     0.52359     1.0594                   1.059
Whole tone          N2           60°     1.04719     1.1224                   1.122
Perfect fifth       N7          210°     3.66519     1.4981                   1.498
Diminished fifth    N6          180°     3.14159     1.4140                   1.414
Perfect fourth      N5          150°     2.61799     1.3347                   1.335
Major third         N4          120°     2.09439     1.2598                   1.260
Minor third         N3           90°     1.57079     1.1891                   1.189
Major sixth         N9          270°     4.71238     1.6815                   1.682
Minor sixth         N8          240°     4.18879     1.5872                   1.587
Minor seventh       N10         300°     5.23598     1.7815                   1.782
Major seventh       N11         330°     5.75958     1.8875                   1.888

Thus this model seems to offer much in the way of objectively describing frequency. Furthermore, it can then provide a basis from which subjective dimensions can be built.
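The spiral arithmetic behind these predicted ratios can be checked with a short sketch. It assumes only equation (2) and the 30° spacing of the twelve positions on the circle; the names `DILATION` and `spiral_ratio` are illustrative, and the dilation ratio is computed exactly as the value that doubles frequency in one full turn rather than taken as the rounded 1.1166.

```python
import math

# Dilation ratio that yields a doubling of frequency per 360 degrees of rotation.
DILATION = 2.0 ** (1.0 / (2.0 * math.pi))     # approximately 1.1166

def spiral_ratio(position, c=DILATION):
    """Frequency ratio f / f0 = c ** theta for a position on the 12-step circle."""
    theta = math.radians(30.0 * position)     # each position spans 30 degrees
    return c ** theta

intervals = {"unison": 0, "semitone": 1, "perfect fourth": 5,
             "perfect fifth": 7, "octave": 12}
predicted = {name: round(spiral_ratio(n), 4) for name, n in intervals.items()}

# A different generator no longer closes on the octave after one full turn.
other_generator = 1.2 ** (2.0 * math.pi)      # about 3.14 rather than 2
```

With the exact dilation ratio the predictions coincide with the equal-tempered ratios $2^{j/12}$; small differences in the last digit relative to a tabulated value can arise from rounding the dilation ratio to 1.1166.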

These models of pitch perception attempt to relate structure in tones to subjective representations. For Rothenberg, this structure is imposed by musical scales and is learned. These learned reference frames guide perception. On the other hand, Jones (1976), and Hahn and Jones (Note 1) conceive of the structure as being inherent in a curvilinear frequency dimension. Learning consists of refining ways to pick up on the structure which exists in tone patterns. This implies that scaling solutions should reflect this structure. The configuration found by Krumhansl (see Figure 4) shows an indication of the curvilinearity in that the two tones an octave apart are drawn close together. In general, however, interpretations of multidimensional scaling solutions have not been based upon a particular physical representation of frequency. The logarithmic spiral representation gives a precise description of objective frequency relations which could be used in such a way.

An approach to subjectively representing musical pitch using the logarithmic spiral analysis has several advantages over other approaches. It differs from the approach taken to devise the mel scale in that it does not rely on a linear frequency dimension. Therefore solutions based upon it may be curvilinear. Furthermore, as will be developed in the following chapter, it provides a basis for understanding why subjective intervals are stretched relative to physical intervals.

Summary

Early pictorial representations of pitch usually incorporated several dimensions. However, these representations were subjective in nature and no attempt was made to tie them to specific physical dimensions. With the advent of the psychophysical scaling approach the goal became finding a function relating a particular physical dimension, usually frequency, to a subjective scale (e.g., the mel scale). In this approach frequency was treated as a one-dimensional linear continuum. In contrast to these one-dimensional representations, experimental evidence suggested that both tone height and tone chroma were relevant dimensions to judgments made about tones. Furthermore, consistent findings indicated that the subjective representation of octaves is somewhat stretched relative to the physical 2 to 1 frequency ratio. These findings were and are still difficult for many theories to explain. One model that addresses the issue of multiple dimensions was offered by Jones and Hahn. They suggest that objective frequency be represented as a logarithmic spiral, as many musical invariants stem directly from the spiral configuration. Subjective dimensions of pitch should then reflect this underlying physical structure. Furthermore, in conjunction with a dynamic model of perception this representation can be used to predict octave stretch. This model will be developed in the following chapter.

CHAPTER THREE

A MODEL OF OCTAVE STRETCH

Introduction

Psychophysical methods (e.g., Stevens and Volkmann, 1937; Ward, 1954), whether they have used musical intervals or not, have not relied upon a rigorous multidimensional representation of tonal frequencies. Instead, there has been an underlying assumption that physically, frequency is a linear dimension. This assumption has also motivated multidimensional scaling research to a large extent, thus encouraging the interpretation that dimensions retrieved from a scaling analysis are created by the perceiver and do not represent the subject's response to physical stimulus relationships.

Such approaches have several shortcomings. Specifically, they lack the ability to relate multiple subjective dimensions to a physical representation. Furthermore, they provide no basis for representing, other than descriptively, the subjective stretching of musical intervals.

The goal of the present chapter is the development of a model that will (1) reflect listeners' sensitivity to musical intervals and in particular to various dimensions represented physically by the logarithmic spiral; and (2) precisely predict the degree to which subjective representations of musical intervals deviate from physically defined intervals.

Octave Stretch

Octave stretch has previously been investigated in the context of developing a scale of subjective pitch. Ward (1954) devised a scale relating frequency to the subjective octave for pure tones; Sundberg and Lindqvist (1973) developed a similar scale for complex tones. Most explanations for why this enlarging of subjective octaves occurs have been primarily descriptive. For example, it has been hypothesized by some researchers (e.g., Dowling, 1973) that the stretching is due to innate characteristics of the ear, and by others (e.g., Ward, 1954) that it is based upon the stretched scale to which pianos are tuned.

Theories attributing stretch to the piano scale do not appear adequate in the face of cross-cultural studies which have demonstrated that this phenomenon exists even among non-Western musicians whose backgrounds do not include familiarity with the piano scale. Furthermore, the theories presented have not been able to explain some of the results which have been consistently found across studies. Besides the finding that subjective octaves are slightly larger than physical octaves, studies have consistently found that this stretching increases as frequency increases.

Another important feature of these studies is that the variability associated with a single subject's adjustments for a particular standard frequency is small, while variability of the adjustments across subjects is large. A portion of Ward's 1954 data displaying these trends is shown in Table 2. Corresponding to each standard tone used in the study is given its frequency, the group mean of the differences between subjective and physical octaves and the standard error of this group mean, the mean of the standard errors of individual means, and the standard error of this mean. (Mean differences between the subjective and the physical octave and the standard errors are presented in cents, which are the smallest unit into which musical intervals are commonly broken up. There are 1200 cents in an octave, with each semitone being composed of 100 cents.) For example, for a standard tone of 425 Hz, the mean setting of the 9 observers was 11 cents higher than the physical octave of 850 Hz. The range of the individual means was such that the standard error was 5.0 cents. Each observer made 4 settings for each standard, and the mean standard error of these 4 settings was 4.1 cents. This mean standard error had a standard error of 0.6 cents. Data for standard frequencies ranging from 425 to 2250 Hz are shown. Other octave stretch studies have reported similar results. What is desirable is a model which can explain this pattern of findings: (1) the stretching of subjective octaves relative to physical octaves; (2) the increased stretching as frequency increases; and (3) the large variation between individuals.

TABLE 2: Ward’s 1954 data on octave stretch. Presented are group mean differences (n=9) between subjective and physical octaves (Column 2), the standard errors of the group mean (Column 3), the mean of the standard errors of the individual settings (Column 4), and its standard error (Column 5).

Frequency of       SO - PO*     Std. error    Mean std. error    Std. error of (4)
standard (Hz)      (cents)      (cents)       (cents)            (cents)
(1)                (2)          (3)           (4)                (5)

 425               11.0          5.0          4.1                0.6
 757               34.5          7.5          4.3                0.8
1180               16.0          9.0          4.9                0.8
1660               53.5         16.5          6.2                1.0
2250               48.5         16.0          9.8                3.2

* SO = subjective octave; PO = physical octave
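The cents unit used for Ward's data can be made concrete with a small sketch. It relies only on the definition given in the text (1200 cents per octave); the function names are illustrative.

```python
import math

def cents_to_ratio(cents):
    """Frequency ratio corresponding to an interval given in cents."""
    return 2.0 ** (cents / 1200.0)

def ratio_to_cents(ratio):
    """Size in cents of the interval with the given frequency ratio."""
    return 1200.0 * math.log2(ratio)

standard = 425.0                          # standard tone in Hz
physical_octave = 2.0 * standard          # 850 Hz
# Mean setting reported as 11 cents above the physical octave.
subjective_octave = physical_octave * cents_to_ratio(11.0)
stretch = ratio_to_cents(subjective_octave / physical_octave)
```

Here `subjective_octave` comes out at roughly 855.4 Hz, so a stretch of 11 cents at this standard corresponds to a setting about 5.4 Hz above the physical octave.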

One theory has attempted to provide a quantitative description of music perception. In particular, this theory, proposed by Terhardt (1974), has a basis for predicting octave stretch. An important concept of this model is the distinction between what is called spectral pitch, which refers to the pitch of a pure tone, and virtual pitch, which is the pitch of a complex tone. Both kinds of pitch are derived from spectral cues, but the way in which they are derived is different. Terhardt proposed two modes of pitch perception, an analytic mode resulting in spectral pitch and a synthetic mode resulting in virtual pitch. In the analytic mode, the pitch of a tone will depend not only upon its frequency but also upon its sound pressure level and the presence or absence of adjacent partials. These factors influence the perception of a tone by causing pitch shifts. Mathematically, the pitch of a pure tone with frequency $f_i$ is given by:

$$x_i = (f_i / 1\ \mathrm{Hz})\,(1 + v_i) \qquad (3)$$

where $v_i$ represents the pitch shift. This shift may have a positive or negative value of a few percent, which can be measured in psychoacoustic experiments. Thus the pitch of a given tone may differ from its frequency, the amount of difference depending upon $v_i$. For example, Terhardt reports that the pitch shift for a tone with frequency 200 Hz is negative one percent. Therefore the pitch of the tone will be perceived as 200(1 + (-.01)) = 198. By definition Terhardt assigns $v_i$ the value of zero for a pure tone with 40 dB sound pressure level presented in isolation. The pitch of a single pure tone with 40 dB SPL, then, will have the same numerical value as the tone's frequency.

In contrast to spectral pitch, virtual pitch can be generated only if a learning process has been previously performed. Terhardt considers this learning process to take place during the learning of speech.

"This learning process is assumed to be a part of the learning process which is essential for acquiring the ability to identify speech sounds. In that process the correlations between the spectral-pitch cues of voiced speech sounds (i.e., of harmonic complex tones) are recognized and stored. The knowledge about harmonic pitch relations that is acquired in this way is employed by the system to generate virtual pitch."

The synthetic mode which generates virtual pitch consists of two phases, a learning phase, mentioned above, and a recognition phase. After repeated stimulation by speech sounds in the learning phase, the organism is able to operate in the post-learning or recognition phase. In recognition, spectral cues are generated based upon the harmonics of which a complex tone is composed. For a complex tone with a full set of harmonics (Terhardt considers this to be 8), spectral cues are formed corresponding to the following formula:

$$y = \frac{m}{n}\,(1 + \bar{v}_1 - \bar{v}_n + v_m)\,(f_b / 1\ \mathrm{Hz}), \qquad m, n = 1, 2, \ldots, 8 \qquad (4)$$

In this formula $y$ is the value of the spectral cue, $v_m$ is the pitch shift of the harmonics actually present, $f_b$ is the fundamental frequency of the tone, and $\bar{v}_n$ is the mean pitch shift of the nth harmonic. Each harmonic actually present in the tone will generate a set of spectral cues. These cues are compared to cues which have been stored during the learning process, and a pitch value is assigned to the tone corresponding to the stored pitch whose cues best match the generated cues.

Furthermore, this learning process by which complex tones are identified is also considered to provide a basis for explaining octave stretch:

"According to this explanation the octave enlargement is a consequence of the pitch shifts which are introduced by the peripheral auditory system, i.e., by the process 'extraction of spectral-cues'. An essential part of this explanation is the described learning process. Thus, one may say that the successful explanation of octave enlargement supports the hypothesis that musical intervals are acquired in a learning process."

In a complex tone containing 8 harmonics, some of the harmonics are related by a frequency ratio of 2:1. Terhardt assumes that the frequency ratio which subjects use to make judgments about octave intervals is based on the learned frequency ratios of partials which are an octave apart. Using equation (4), he derived a formula giving the frequency ratio of partials an octave apart:

$$\text{octave ratio} = 2\left[1 + \tfrac{1}{4}\left(-\bar{v}_1 - \bar{v}_3 + \bar{v}_6 + \bar{v}_8\right)\right]$$

This equation, together with pitch shift values given in Terhardt's 1974 article, predicts that the frequency ratio between two partials an octave apart will be slightly greater than 2. This is because $(-\bar{v}_1 - \bar{v}_3)$ is always less than $(\bar{v}_6 + \bar{v}_8)$. Using these pitch shift values, the ratio for a tone with a fundamental frequency of 200 Hz is calculated to be:

$$2\left[1 + \tfrac{1}{4}\left(-(-.01) - .015 + .03 + .04\right)\right] = 2.0325$$

Therefore, for a pure tone of 200 Hz presented at 40 dB, a tone an octave higher would be:

$$200\,(2.0325) = 406.5\ \mathrm{Hz}$$

If the tone presented was not a pure tone, or if it were presented at other than 40 dB SPL, pitch shifts would have to be incorporated in the calculation of the predicted octave interval.
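The worked example above can be reproduced with a brief sketch. It assumes that the four pitch-shift terms in the octave formula are those of harmonics 1, 3, 6 and 8, and it uses only the shift values quoted in the text for a 200 Hz fundamental; the function name and the conversion to cents are illustrative additions.

```python
import math

def octave_ratio(v1, v3, v6, v8):
    """Predicted subjective octave ratio from four mean pitch shifts."""
    return 2.0 * (1.0 + 0.25 * (-v1 - v3 + v6 + v8))

# Pitch shifts quoted in the text for a 200 Hz fundamental.
ratio = octave_ratio(v1=-0.01, v3=0.015, v6=0.03, v8=0.04)

standard = 200.0                                    # Hz, pure tone at 40 dB SPL
predicted_octave = standard * ratio                 # Hz
stretch_cents = 1200.0 * math.log2(ratio / 2.0)     # enlargement over a 2:1 octave
```

The computed `ratio` is 2.0325 and `predicted_octave` is 406.5 Hz, matching the figures in the text; expressing the enlargement in cents allows comparison with adjustment data reported in that unit.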

Similarly, the frequency ratios for octaves based on other fundamental frequencies could be computed and the frequency of a tone an octave above could be predicted. There are, however, several limitations to this approach. Terhardt proposed this learning process only for frequencies within the range of the fundamental frequencies of speech, which involves frequencies ranging from 50 to 500 Hz, approximately. For frequencies above 500 Hz, presumably the learning process will not have developed cues which can be matched. As mentioned earlier, octave stretch studies have consistently found that stretching increases as frequency increases. In fact, stretching is particularly evident for frequencies above 500 Hz. Also, this model does not quantitatively account for individual differences, as the pitch shift values are assumed to be constant across observers.

Terhardt's model, though restricted in the way mentioned above, is the only quantitative model developed to explain the discrepancy between objective and subjective octaves. His model is concerned specifically with the auditory system and distortions involved in the perception of auditory events. Other models exist which have been developed to explain differences between subjective and objective events in other modalities. One model, proposed by Caelli, Hoffman, and Lindman (1978), was developed to explain distortions which occur within the visual system. Its adaptation has implications for auditory distortions such as octave stretch. A more general theory, independently proposed by Jones (1975), also addresses perceptual distortions in both visual and auditory systems, and it will be shown that specific models within this framework result in a clearer set of predictions for an octave adjustment task.

Both Caelli, Hoffman, and Lindman, and Jones consider time to be an important component in perception. Specifically, the relationship between time and spatial dimensions is central to both theories. Rather than considering time as an absolute quantity in which spatial events occur, time is represented via velocities or motions. This emphasis on timing and motions distinguishes these models from Terhardt's model. Furthermore, this emphasis on motions has led both researchers to incorporate concepts common to theories in the physical sciences into their models. Reference frames and Lorentz transformations are integral parts of both these models. However, how they are applied is somewhat different.

Caelli, Hoffman, and Lindman were interested in explaining why the perceived length and temporal duration of a moving visual object are affected by its physical velocity. Since their approach has theoretical and methodological similarities to approaches which address predicting estimates of tone frequencies, it is worthwhile to examine it in more detail. In their model it is necessary to distinguish between what they term physical and perceptual frames of reference. As an example, consider a dot of light translating horizontally at velocity v with respect to a fixed observer. Here there are two physical frames: one corresponding to the fixed observer and one to the moving light source. These physical frames are governed by a limiting value, which in this case is the speed of light. Because of this limiting value, velocities in these reference frames are related by Lorentz rules. There are standard formulas for the addition and subtraction of velocities according to these rules. For example, consider there to be two frames of reference, one moving with velocity v relative to the other, which is considered to be stationary. If an event occurs with velocity u in the stationary frame, Galilean rules would indicate that its velocity with respect to the moving frame, namely u*, is u* = u - v. However, the existence of a threshold velocity, denoted by c, changes this equation as follows (Arzelies, 1966) according to a Lorentz formulation:

$$u^* = \frac{u - v}{1 - uv/c^2} \tag{6}$$

In the limiting case, where there is no threshold (i.e., c = ∞), this reduces to the Galilean formulation.
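As a hedged illustration of the two subtraction rules, the sketch below compares the Galilean difference u - v with the Lorentz form (u - v)/(1 - uv/c^2) and checks that the latter reduces to the former when the threshold is infinite. The function names and the sample velocities are assumptions chosen for illustration; they are not taken from Caelli et al. or Arzelies.

```python
import math


def galilean_subtract(u, v):
    """Velocity of the event relative to the moving frame, with no threshold."""
    return u - v


def lorentz_subtract(u, v, c):
    """Velocity of the event relative to the moving frame, with threshold c."""
    return (u - v) / (1.0 - (u * v) / c ** 2)


# Hypothetical velocities in arbitrary units, with a finite threshold of 50.
u, v = 30.0, 10.0
with_threshold = lorentz_subtract(u, v, c=50.0)

# With an infinite threshold the correction term vanishes.
no_threshold = lorentz_subtract(u, v, c=math.inf)

print(galilean_subtract(u, v), no_threshold, with_threshold)
```

With a finite threshold the relative velocity comes out larger than the simple difference, which is the source of the distortions discussed next.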

Corresponding to the physical frames of reference, Caelli et al. consider there to be stationary and moving perceptual frames. These frames are limited by the maximum velocity of perceived movement, which is denoted here as c* (using Caelli et al.'s notation). This threshold value is considerably lower than the physical limit, and they define it as a measure of the finite propagation rate of signals in the human visual system. They applied these concepts of reference frames and limiting signals in several experiments involving moving dots of light.

In one experiment, they used a fractionation task not unlike that used by Stevens and Volkmann in developing the mel scale for auditory stimuli. They presented subjects with a standard moving dot of light and asked them to adjust the velocity of a second moving dot until it was half the velocity of the standard. As a model for predicting the half-velocities as set by several observers, they used a version of the Lorentz subtraction formula. Following equation 6, they considered the velocity of the standard dot of light to be u with respect to the observer's fixed frame. The velocity of the variable moving dot was considered to be v, also with respect to the fixed frame. By the Galilean formulas, then, v = u - v, or v = 1/2 u. However, considering the limits on the system led them to use the following form of equation 6 to predict the half-velocities:

$$u^* = v = \frac{u - v}{1 - uv/c^{*2}} \tag{7}$$

Solving this equation for v results in:

$$v = \frac{c^{*2}\left[1 - \left(1 - (u/c^*)^2\right)^{1/2}\right]}{u} \tag{8}$$

This formula indicates that half-velocities will be overestimated. It was used to predict estimated half-velocities for a series of standard velocities presented to subjects, and was found to be generally consistent with settings made by these subjects.

To recapitulate briefly, in their formulation Caelli et al. specified the following: (1) a stationary frame which is associated with the observer; (2) an event, the standard dot, moving with velocity u with respect to the fixed observer; and (3) a moving frame with relative velocity v, associated with the variable dot of light. The half-velocity which their model estimates is that velocity which, when doubled, yields the velocity of the standard with respect to a fixed frame of reference. For example, if the standard dot has a velocity of u = 10 deg/sec, and the limiting value is c* = 50 deg/sec, then the predicted half-velocity would be:

$$v = \frac{50^2\left[1 - \left(1 - (10/50)^2\right)^{1/2}\right]}{10} = 5.05$$

This value, 5.05, is that velocity which, when doubled, has a value of 10 with respect to the fixed frame in a system with a limiting velocity of 50. Their model, however, does not specify how frames of reference would be assigned in a task which involved doubling velocities rather than halving them. There are several ways the frames could be specified for such a task. The model proposed by Jones has a basis for determining a way to define these frames.
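The worked example can be reproduced with a minimal sketch. It assumes the half-velocity is the smaller root of v = (u - v)/(1 - uv/c*^2), that is, v = c*^2[1 - (1 - (u/c*)^2)^(1/2)]/u. The function names are illustrative and are not taken from Caelli et al.

```python
import math


def half_velocity(u, c_star):
    """Velocity v that, doubled under the threshold c_star, yields u."""
    return c_star ** 2 * (1.0 - math.sqrt(1.0 - (u / c_star) ** 2)) / u


def lorentz_double(v, c_star):
    """Relativistic sum of v with itself under the threshold c_star."""
    return (v + v) / (1.0 + (v * v) / c_star ** 2)


v_half = half_velocity(u=10.0, c_star=50.0)

# Doubling the predicted half-velocity should recover the standard velocity.
recovered = lorentz_double(v_half, c_star=50.0)

print(round(v_half, 2), round(recovered, 6))
```

The predicted half-velocity lies slightly above the Galilean value of 5, in line with the overestimation described in the text.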

Like Caelli et al., Jones also considers there to be subjective reference frames as well as physical ones. Perceptual reference frames are associated with subjective motions and could be considered as points of view. Events which occur in the world are evaluated with respect to a particular subjective frame which was activated by another world event. Thus, many different reference frames could be considered, and a given event may be perceived differently depending upon the frame being used. For example, depending upon the frame of reference, the duration or length of an event could be either overestimated or underestimated.

These reference frames can be generated by both visual and auditory events. In both cases, perception of other events will be affected by the threshold for that system. Jones denotes this threshold as k.

To use this theory to predict an individual's response in a given task, a model must be developed that clearly specifies stationary and moving frames of reference and allows estimation of an individual's threshold or k value. Such a model can then be used to make predictions about perceptual distortion in both visual and auditory tasks. In visual tasks, continuously moving objects are assumed to excite corresponding subjective reference frames, and in auditory tasks, the harmonic motion of a pure tone frequency determines a moving frame. A goal of this chapter is the development of such a model.

The two relativistic models differ from Terhardt's in that they incorporate a threshold and do not concern a learning process. Furthermore, the model to be developed from Jones' theory differs from Caelli et al.'s model in the type of task it can address. While the task which Terhardt's model addressed was that of adjusting a variable tone to be an octave above a standard tone, Caelli et al. addressed the task of adjusting the velocity of a moving dot of light to be half the velocity of a standard dot of light. Jones' theory was developed to address results of both these tasks. In the next section a model will be developed, based upon Jones' model, to predict the results of octave adjustment tasks.

A Threshold Model of Octave Stretch

In this section, a relativistic model developed from Jones' general theory will be outlined for explaining octave stretch. In Jones' theory, frequency is represented as a logarithmic spiral. In the current model, tones are conceived of as motions along this spiral and so can serve as reference frames. The manner in which these frames are determined for a specific task, such as an octave adjustment task, will be explored.

The task of adjusting a variable tone to be an octave above a standard has similarities with a task of adjusting the velocity of a moving object to be twice the velocity of another moving object. That is, both tasks require subjects to double events which are time based.

As discussed in the last section, tasks of this type can be described in terms of a relativistic model when there is a finite motion threshold value associated with the occurrence of events. In the human auditory system, audible frequencies are limited by an upper threshold of approximately 20,000 Hz (Thompson, 1967). Furthermore, there is evidence that short of these levels the auditory system loses its temporal resolving power with respect to sinusoidal periodicities (Shouten, 1970). Therefore, Lorentz equations may be useful for relating auditory events.

A major problem rests with determining how to specify frames of reference. When an observer must double the frequency of a standard tone as in the octave stretch task, we must first consider the adaptation of the Galilean formula for addition of velocities (u* = v + u). In general terms, the relativistic formulation is given by:

$$u^* = \frac{u + v}{1 + uv/c^2} \tag{9}$$

where as before v is the relative velocity of the moving frame with respect to a fixed frame. In this case u is the velocity of an event which occurs in the moving frame, and u* is the velocity of that same event with respect to the stationary frame.

For an octave stretch task frames are assigned in the following way. Assume again that there is a stationary reference frame associated with the observer. In general this frame can assume any frequency value as it identifies a subject's modal level of attention. For example, it may be associated with a very low frequency if subjects are musically naive and rely roughly on tone height, or it may be associated with a salient key (e.g., middle C) in the case of musically sophisticated subjects. In this model it is also assumed that the standard tone itself creates a moving frame of reference. This moving frame, in the simplest case where the stationary frame is known to be associated with 0 Hz, becomes simply the frequency of the standard tone. This velocity is denoted for auditory frames as $f_{\mathrm{std}}$. The question then becomes: "What must the frequency of the adjusted tone be relative to this moving frame of reference, such that the resulting frequency is perceived as twice the standard with respect to the observer's fixed frame?" Adapting equation 9 to this auditory mode, this is expressed mathematically as:

$$2f_{\mathrm{std}} = \frac{f_{\mathrm{std}} + f_{\mathrm{rel}}}{1 + f_{\mathrm{std}} f_{\mathrm{rel}}/k^2} \tag{10}$$

where $f_{\mathrm{rel}}$ is the frequency of the adjusted tone relative to the moving frame.

For example, assume that the standard frequency is 1000 Hz. Assume also that the attention of a stationary observer is associated with a very low frequency which, for simplicity, is taken as zero (i.e., 0 Hz). Thus the moving frame, normally the standard tone frequency, is given by 1000. Next assume, arbitrarily, that the threshold frequency is k = 16,000 Hz. Then a subject whose fixed frame is at 0 Hz will adjust the frequency of the variable tone such that $f_{\mathrm{rel}}$ is 1007.86. This allows the variable tone to be perceived as twice the value of the standard tone with respect to the stationary frame, as shown below:

$$2000 = \frac{1000 + 1007.86}{1 + (1000)(1007.86)/16000^2}$$

This formalization then also predicts octave stretch; for if the observer set the variable tone to have a frequency of 1007.86 relative to the 1000 Hz tone, the variable tone will be 2007.86 in the physical frame. Equation 10 automatically reveals the magnitude of octave stretch as a function of the threshold value, k. The physical octave represents the target frequency, $2f_{\mathrm{std}}$, whereas the subjective octave is merely the setting represented by the sum, $f_{\mathrm{std}} + f_{\mathrm{rel}}$, for a given value of k. As k decreases for an observer (representing a lower frequency threshold), larger values of $f_{\mathrm{rel}}$ will be required to match the target frequency.

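Thus the subjective octave is defined as $SO = f_{\mathrm{std}} + f_{\mathrm{rel}}$. Solving equation 10 for $f_{\mathrm{rel}}$ we have:

$$f_{\mathrm{rel}} = \frac{f_{\mathrm{std}}}{1 - 2\left(f_{\mathrm{std}}/k\right)^2} \tag{11}$$

Then the subjective octave can be expressed in terms of the standard frequency and the threshold value as:

$$SO = f_{\mathrm{std}} + \frac{f_{\mathrm{std}}}{1 - 2\left(f_{\mathrm{std}}/k\right)^2} \tag{12}$$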

Equation 12 establishes the specific relativistic model that will be used to predict octave stretch in forthcoming studies. It is a special case of the more general model outlined by Jones in that it assumes, for simplicity, that the stationary observer is associated with a very low frequency value. Accordingly, the model requires estimation of only a single parameter, k.
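A minimal sketch of the single-parameter model follows. It assumes the relative frequency obtained by solving the octave condition 2 f_std = (f_std + f_rel)/(1 + f_std f_rel/k^2), namely f_rel = f_std/(1 - 2(f_std/k)^2), with the stationary frame at 0 Hz. The function names and the validity check are illustrative additions, not part of the original model statement.

```python
def relative_frequency(f_std, k):
    """Setting of the variable tone relative to the moving frame (Hz)."""
    denominator = 1.0 - 2.0 * (f_std / k) ** 2
    if denominator <= 0.0:
        # The closed form is only meaningful while f_std < k / sqrt(2).
        raise ValueError("f_std is too close to the threshold k")
    return f_std / denominator


def subjective_octave(f_std, k):
    """Predicted physical setting heard as an octave above f_std (Hz)."""
    return f_std + relative_frequency(f_std, k)


# Worked example from the text: 1000 Hz standard, threshold of 16,000 Hz.
f_rel = relative_frequency(1000.0, 16000.0)
so = subjective_octave(1000.0, 16000.0)

# Substituting back into the relativistic sum should return the target 2000 Hz.
target = (1000.0 + f_rel) / (1.0 + 1000.0 * f_rel / 16000.0 ** 2)

print(f_rel, so, target)
```

The computed relative frequency agrees with the 1007.86 Hz quoted in the text to within rounding in the second decimal place.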

In sum, models in which Lorentz transformations are employed can be used to explain the common findings of octave stretch studies.

(1) The increased stretching at higher frequencies is inherent in equation 12. When frequencies are small relative to the threshold value, little stretching will occur. As frequency increases, the stretching will become more pronounced.

(2) For each observer there is the possibility of a threshold value as well as of a different fixed frame (in the general case). Thus, for a given subject, variability should be small.

(3) Across subjects, however, threshold values can be quite different. Thus, in terms of an adjustment task, there may be large intersubject variability.
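Points (1) and (3) can be illustrated numerically. The sketch below assumes the single-parameter form SO = f_std + f_std/(1 - 2(f_std/k)^2) and expresses the stretch in cents relative to the physical octave. The two threshold values are arbitrary illustrations, not estimates for any observer.

```python
import math


def octave_stretch_cents(f_std, k):
    """Stretch of the predicted subjective octave over 2 * f_std, in cents."""
    so = f_std + f_std / (1.0 - 2.0 * (f_std / k) ** 2)
    return 1200.0 * math.log2(so / (2.0 * f_std))


standards = [200.0, 500.0, 1000.0, 2000.0]

# Two hypothetical observers with different thresholds (Hz).
for k in (10000.0, 16000.0):
    row = [round(octave_stretch_cents(f, k), 2) for f in standards]
    print(k, row)
```

Within a row the stretch grows with the standard frequency, and the observer with the lower threshold shows the larger stretch at every standard.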

In the last few sections several models have been discussed in terms of an octave stretch task. These models all relate a subjective distortion to the physical dimension of frequency. A purely Galilean model would predict octave judgments would be veridical. Other models predict a stretching of subjective octaves. Terhardt's model predicts octave stretch based upon pitch shifts whose values are determined in psychoacoustic experiments. For any given frequency, these pitch shifts would have to be determined to predict the amount of stretch. Furthermore, these pitch shifts are assumed to be the same across subjects and so inter-subject variability should be small. In contrast to these models, a relativistic model which incorporates a threshold value can predict stretched octaves and can also account for individual differences.

A Subjective Representation Based upon the Logarithmic Spiral; A Psychophysical Model with Implications for Multidimensional Scaling

As already indicated, octave stretch is merely one illustration of a more general phenomenon involving subjective distortion of musical intervals. The stretching of octaves and indeed of musical intervals in general should also be expected to be reflected in tasks other than frequency adjustment. Furthermore, given that a logarithmic spiral can be considered as an accurate representation of the physical dimension of frequency, a subjective representation should reflect this as well as the enlargement of the higher octaves. The fundamental assumption of the spiral representation is that different frequencies can be represented as continuous motions. The spiral expression, from equations 1 and 2, shows that the relation between two frequencies can be represented by the ratio $c^t$. Note that in the octave stretch task, $c^t$ is 2 and this ratio is taken in equation 10 to define the target frequency, $2f_{\mathrm{std}}$. If we now extend the reasoning behind the octave stretch task to include any interval, then the most general expression for equation 10 is:

$$c^t f_{\mathrm{std}} = \frac{f_{\mathrm{std}} + f_{\mathrm{rel}}}{1 + f_{\mathrm{std}} f_{\mathrm{rel}}/k^2} \tag{13}$$

Thus, we can rearrange this formula algebraically to achieve, not merely an expression for a subjective octave, but more generally an equation for a subjective interval, SI:

$$SI = f_{\mathrm{std}} + \frac{(c^t - 1)\, f_{\mathrm{std}}}{1 - c^t \left(f_{\mathrm{std}}/k\right)^2} \tag{14}$$

Notice that in this form we have a psychophysical relationship which incorporates both the logarithmic relations underlying the spiral and the relativistic nature of the judgment task. The amount of stretching of a given subjective interval will be a function of the standard under consideration, the size of the interval to be determined, and the threshold.

This model attains several goals. It incorporates the physical dimension of frequency, but does not rely on a linear representation of it. Furthermore, it predicts stretching, but allows that the amount of stretching is dependent upon several factors. Equation 14 shows that stretch is a function not only of standard frequency, but also of the size of the interval being evaluated, and the threshold value. Differences in stretch are therefore expected across tasks and across individuals.
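The generalization to arbitrary intervals can be sketched as follows. The code assumes that the subjective interval is f_std + f_rel, where f_rel solves ratio * f_std = (f_std + f_rel)/(1 + f_std f_rel/k^2), which gives f_rel = (ratio - 1) f_std/(1 - ratio (f_std/k)^2). Here `ratio` stands for the target frequency ratio (2 for an octave); the function name, the example ratio of 1.5, and the threshold value are illustrative assumptions.

```python
def subjective_interval(f_std, ratio, k):
    """Predicted physical setting for a target frequency ratio above f_std."""
    f_rel = (ratio - 1.0) * f_std / (1.0 - ratio * (f_std / k) ** 2)
    return f_std + f_rel


k = 16000.0  # hypothetical threshold (Hz)

# A ratio of 2 should reproduce the subjective octave for a 1000 Hz standard.
octave_setting = subjective_interval(1000.0, 2.0, k)

# A ratio of 3:2 (a fifth in just intonation) above the same standard.
fifth_setting = subjective_interval(1000.0, 1.5, k)

# Check the defining condition for the fifth by substituting back.
f_rel_fifth = fifth_setting - 1000.0
perceived = (1000.0 + f_rel_fifth) / (1.0 + 1000.0 * f_rel_fifth / k ** 2)

print(octave_setting, fifth_setting, perceived)
```

For a given standard and threshold, the smaller interval is stretched by fewer hertz than the octave, consistent with stretch depending on interval size.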

There are numerous ways to explore the validity of equation 14. One involves the examination of subjects' judgments about intervals other than octaves. In scaling studies which require similarity judgments about tone pairs, it is already clear that some evidence for a curvilinear spiral representation exists. However, current scaling studies do not reflect the enlargement of intervals between tones at higher frequency levels. In fact, from the earliest scaling studies of musical stimuli, often an attempt was made to suppress a particular dimension. For example, Levelt, Van de Geer, and Plomp (1966) presented as stimuli intervals composed of two tones, but each interval was constructed to have the same mean frequency. Thus their solution did not reflect tone height. Krumhansl (1979), on the other hand, used only tones within a single octave. Thus relationships of notes across octaves were not represented.

In a study involving tones from several octaves, it would be possible to recover dimensions relating to both tone height and tone chroma. A study of this type would be somewhat exploratory as many of these questions have not been previously addressed. It may be that the logarithmic representation would be roughly captured by the subjective representation. However, as noted earlier, the logarithmic spiral could also be the basis for the construction of other dimensions which are represented as groups of tones generated by particular polar angles. These subgroups could lead to more than the two dimensional representation indicated by the logarithmic spiral. In any event, the threshold model would be evident in that intervals in the higher octaves should be stretched relative to intervals involving tones of lower frequencies.

In sum, the model developed makes several predictions. The logarithmic spiral representation indicates that a curvilinear subjective representation should be found. This curvilinear dimension should capture both tone height and tone chroma. The model further indicates that intervals in the higher octave should be stretched relative to those involving tones with lower frequencies.

Summary

In this chapter several models which are able to predict octave stretch were discussed. One of these models assumes that stretching is due to a learning process. This model was shown to be limited in several respects. Another model considers stretching to be the direct result of a system which is governed by a threshold. This relativistic model based on Jones' theory was shown to have a basis for predicting not only octave stretch, but the stretching of other intervals as well. This stretching of subjective intervals, as shown in equation 14, depends upon not only the frequency involved in the task but also upon a threshold value which can differ across individuals. Finally, implications of both a logarithmic spiral representation of frequency and the threshold model were discussed with respect to a scaling solution based upon notes from several octaves.

These issues will be explored in two studies. The first experiment will be a replication of prior studies of octave stretch. Results will be evaluated on the basis of three models: (1) a Galilean model, which assumes that octave adjustments will be veridical; (2) a model based upon Terhardt's theory; this model assumes stretch is due to learning which takes place during early experience with speech; pitch shift values will be taken from Terhardt's 1974 article; (3) the relativistic model described by equation 14; based upon individuals' observed adjustments, estimates of the threshold value will be obtained.

A second study will look at the subjective representation of frequency as recovered by multidimensional scaling. Tones from three octaves will be employed as stimuli so that relationships both between and within octaves can be represented.

Hypotheses relating to the two tasks of interest are summarized below:

I. In a frequency adjustment task, wherein target frequencies require doubling:

1. Octave stretch will be predicted by a model using subjective Lorentz transformations. Different subjects will exhibit different threshold values.

2. The specific octave stretch model based on the theory outlined by Jones will more closely fit the observed settings than will models based on Terhardt's theory or a Galilean model.

II. In a scaling study involving tones from several octaves:

1. Dissimilarity measures for intervals occurring in higher octaves will be larger than for those occurring in lower octaves.

2. The logarithmic spiral representation may be evidenced in a multidimensional subjective scaling representation involving at least tone height and chroma, and perhaps other dimensions corresponding to particularly salient musical intervals.

3. Individuals will differ in how they weight the recovered dimensions.

CHAPTER FOUR

Experiment One

Experiment one involved a frequency adjustment task wherein subjects were asked to double the frequency of a standard tone.

Method

Subjects. Eight subjects were paid $4.00 each for participation in the experiment. Subjects were required to have had at least 2 years of musical training. According to their reports, they had an average of 8.9 years of musical training and played an average of 3 instruments.

Apparatus. Pure tones were generated using a Wavetek Model 171 Synthesizer/Function Generator. Two tones can be generated with the Wavetek Model 171. One tone is fixed by setting switches to a particular frequency. The other tone is set by turning a dial. The frequency generated is accurate to .005% of the setting. Subjects heard these tones at 40 dB SPL over Koss A440 headphones. A Hewlett-Packard Model 5223L frequency counter recorded the setting of the variable tone.

Stimuli. The standard tone took on ten frequency values: 200, 261.63, 300, 500, 587.33, 700, 1000, 1318.5, 1500 and 2000 Hz.

Procedure. Subjects were instructed that they would hear a standard tone over the headphones. This tone corresponded to one set by the experimenter using the switches on the Wavetek. By flipping a switch on the Wavetek, subjects could hear a second tone, the frequency of which they could adjust by turning the dial. Subjects did not see the setting of the standard tone, and the dial they adjusted had no markings. They were asked to adjust the variable tone to be an octave higher than the standard tone. They were allowed to freely switch back and forth between the standard and the variable tone, and they could take as long as they wanted to make the adjustment. Subjects were instructed to bracket the octave, that is, to take the variable tone both too high and too low before making a final adjustment. When the subject was satisfied that the variable tone was an octave above the standard, he informed the experimenter, who recorded the final frequency of the variable tone.

Two practice trials were given to familiarize the subject with the procedure. No feedback was given, unless the subject adjusted the variable tone to some other interval such as a fifth or two octaves. In that case, the subject was informed of his error and asked to readjust the variable tone. Four repetitions of each of the ten standard tones were given during a session, which lasted approximately one hour. The ten standard tones were presented to the subjects in four blocks of trials. Within each block the order of the ten tones was randomized. The initial setting of the variable tone was greater than an octave above the standard on one half of the trials per block and less than an octave on the other half.

Results

Table 3 shows the number of overestimations and underestimations that were made for each standard frequency. Each frequency served as the standard 30 times (six of the subjects completed four settings at each standard frequency in the hour-long session; two subjects made only three settings in that time). As seen in the table, more overestimations than underestimations were made for each standard frequency. In general, as the standard frequency increased, the number of overestimations also increased.

Table 4 contains descriptive information in the form reported by Ward. In this table are shown the standard frequencies employed in the experiment (column 1); the group mean setting at each standard in Hz (column 2); the group mean difference between the subjective octave (SO) and the physical octave (PO) in cents (column 3); the standard error of this mean difference in cents (column 4); the mean of the standard errors of individual observers' settings in cents (column 5); and the standard error of these standard errors (column 6).
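For readers unfamiliar with the unit, the sketch below shows the standard conversion of a frequency ratio to cents, 1200 times the base-2 logarithm of the ratio, applied to the difference between a setting and the physical octave. The function names and the example setting are illustrative assumptions. Column 3 is described as a mean of differences, so applying the conversion to a group mean setting need not reproduce the tabled value exactly.

```python
import math


def cents(f_upper, f_lower):
    """Size of the interval from f_lower up to f_upper, in cents."""
    return 1200.0 * math.log2(f_upper / f_lower)


def so_minus_po_cents(setting_hz, standard_hz):
    """Difference between an octave setting and the physical octave, in cents."""
    return cents(setting_hz, 2.0 * standard_hz)


# A physical octave spans exactly 1200 cents.
print(cents(400.0, 200.0))

# Hypothetical single setting of 406.5 Hz against a 200 Hz standard.
print(so_minus_po_cents(406.5, 200.0))
```

A positive value indicates a setting above the physical octave and a negative value a setting below it.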

The data in Tables 3 and 4 illustrate the consistency of the stretching of subjective octaves relative to physical octaves. Octave settings were generally overestimations as shown in Table 3. The mean values reported in Table 4 indicate the magnitude of this overestimation. This table further shows the degree of variability of the octave settings.

Table 3: Number of times that the variable tone was set higher or lower than an octave (n = 30)

Standard Tone (Hz)    Number of Overestimations    Number of Underestimations
200                   22                           8
261.63                17                           13
300                   19                           11
500                   22                           8
587.33                25                           5
700                   26                           4
1000                  29                           1
1318.5                22                           8
1500                  29                           1
2000                  30                           0

Table 4: Means and standard errors for octave adjustment task.

Standard Tone (Hz)    Mean SO (Hz)    SO-PO (cents)    Std. error (cents)    Mean std. error (cents)    Std. error of std. errors (cents)
200                   404.26          17.91            7.29                  7.71                       1.57
261.63                521.36          -6.09            6.30                  9.28                       1.02
300                   601.56          4.38             6.91                  8.48                       2.15
500                   1003.12         5.26             3.86                  5.46                       0.87
587.33                1188.99         20.52            6.27                  6.36                       1.36
700                   1424.20         29.07            6.88                  6.00                       1.16
1000                  2020.81         17.50            2.53                  5.22                       1.45
1318.5                2657.68         13.19            5.70                  7.86                       2.06
1500                  3051.76         29.02            6.58                  5.40                       1.26
2000                  4108.06         45.40            9.49                  7.15                       1.84

The data were evaluated with respect to several models. The simplest model considered was a Galilean model, which assumes that subjective octaves are identical to physical octaves. The performance of this model for each of the eight subjects is shown in Table 5. For each subject, the average absolute deviation of the predicted values from the observed values as well as the root mean square deviation are shown. Values are based on ten data points for each subject, corresponding to mean settings for the ten standard frequencies. Since this model assumes subjects can estimate octaves with no error, discrepancies from this model directly indicate the magnitude of stretching.
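The two fit measures can be sketched as below. Because the individual subjects' mean settings (Appendix 1) are not reproduced in this section, the example applies the measures to the group mean settings of Table 4 purely as an illustration; the results are therefore not the per-subject values of Table 5. The function names are assumptions.

```python
import math


def rmsd(observed, predicted):
    """Root mean square deviation between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def average_absolute_deviation(observed, predicted):
    """Mean of the absolute differences between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


standards = [200, 261.63, 300, 500, 587.33, 700, 1000, 1318.5, 1500, 2000]

# Group mean settings from Table 4 (Hz), used here only for illustration.
group_mean_settings = [404.26, 521.36, 601.56, 1003.12, 1188.99,
                       1424.20, 2020.81, 2657.68, 3051.76, 4108.06]

# The Galilean model predicts the physical octave, twice the standard.
galilean_predictions = [2.0 * f for f in standards]

group_rmsd = rmsd(group_mean_settings, galilean_predictions)
group_ad = average_absolute_deviation(group_mean_settings, galilean_predictions)
print(group_rmsd, group_ad)
```

The same two functions apply unchanged to any other model's predictions, so the models can be compared on a common footing.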

Table 5: Root mean square deviations and average absolute deviations of observed settings from values predicted by Galilean model.

Subject number    RMSD* (Hz)    AD** (Hz)
1                 46.01         26.31
2                 30.32         21.87
3                 52.20         35.34
4                 33.40         21.79
5                 34.08         23.24
6                 80.62         40.31
7                 48.55         30.81
8                 21.75         18.54

* Root mean square deviation
** Average absolute deviation

The second model evaluated was that proposed by Terhardt. To evaluate this model, it was necessary to calculate the frequency ratio defining an octave for each of the standard frequencies used in the experiment. These were computed using equation 4 and pitch shift values given in Terhardt's 1974 article. For a frequency of 200 Hz, this ratio was calculated to be 2.0325; for 261.63 Hz, the ratio was 2.0315; for 300 Hz it was 2.0308; and for 500 Hz it was 2.0275. Though Terhardt's model was developed for frequencies ranging from 50 to 500 Hz, stretch becomes more pronounced as frequency increases above 500 Hz. To use Terhardt's model in the current study, therefore, it was assumed that the frequency ratio for octaves of standard tones above 500 Hz is equal to that for 500 Hz, or 2.0275. Predicted values of octave intervals based upon these frequency ratios are shown in Table 6. According to Terhardt's formulations these predicted values are assumed to be the same for all subjects. Table 6 also shows the root mean square deviations of these predicted values from the observed octave settings as well as the average absolute deviations for each subject. Each of these values is based on ten observed points, which are the mean frequency adjustments for the standard frequencies.

Table 6: Predictions and fit of Terhardt model.

Standard Tone (Hz)    Predicted Setting (Hz)
200                   406.50
261.63                531.50
300                   609.24
500                   1013.75
587.33                1190.81
700                   1419.25
1000                  2027.50
1318.5                2673.26
1500                  3041.25
2000                  4055.00

Subject    RMSD     AD
1          25.79    17.76
2          10.46    8.69
3          28.23    16.22
4          17.44    13.90
5          28.85    23.77
6          61.78    30.39
7          26.46    14.04
8          11.17    8.23
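The predicted settings follow directly from the stated ratios, as the sketch below shows. It uses the four ratios quoted in the text for standards up to 500 Hz and, following the stated assumption, reuses the 500 Hz ratio of 2.0275 for every standard above 500 Hz. The dictionary and function names are illustrative.

```python
# Octave ratios quoted in the text for standards at or below 500 Hz.
OCTAVE_RATIOS = {200.0: 2.0325, 261.63: 2.0315, 300.0: 2.0308, 500.0: 2.0275}

# Ratio assumed in the text for all standards above 500 Hz.
RATIO_ABOVE_500 = 2.0275


def terhardt_prediction(f_std):
    """Predicted octave setting (Hz) for one of the ten standard tones."""
    ratio = OCTAVE_RATIOS.get(f_std, RATIO_ABOVE_500)
    return f_std * ratio


standards = [200.0, 261.63, 300.0, 500.0, 587.33,
             700.0, 1000.0, 1318.5, 1500.0, 2000.0]

for f in standards:
    print(f, round(terhardt_prediction(f), 2))
```

Because the ratios are fixed, the same predictions apply to every subject, and only the deviations from them differ between subjects.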

Finally, the Lorentz threshold model was evaluated through curve fitting, thus permitting an estimate of the threshold parameter. Equation 11 was used, in which the SO was the frequency setting made by the observer. A single parameter was estimated for each subject. This parameter corresponds to the threshold value, k. Equation 11 was fit to the data using NONLIN, a nonlinear least squares curve fitting program. This program uses an algorithm developed by Marquardt (1963) which is basically an interpolation between Newton-Raphson and gradient methods. Results of this curve fitting are shown in Table 7. For each subject the value of the parameter, k, is reported which gave the best fit, as well as the root mean square deviation and the average absolute deviation. In Appendix 1, the observed settings and the settings predicted by this model are tabled for each subject.
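A one-parameter least-squares fit of this kind can be sketched without NONLIN. The code below is not Marquardt's algorithm: it substitutes a plain golden-section search over k, which is adequate for a single parameter when the sum of squared errors has one minimum in the search bracket. It assumes the model SO = f_std + f_std/(1 - 2(f_std/k)^2). The settings are synthetic values generated from a known threshold, not data from this experiment, and the bracket limits are arbitrary assumptions.

```python
import math


def subjective_octave(f_std, k):
    """Model prediction of the octave setting (Hz) for threshold k."""
    return f_std + f_std / (1.0 - 2.0 * (f_std / k) ** 2)


def sum_squared_error(k, standards, settings):
    """Sum of squared differences between model predictions and settings."""
    return sum((subjective_octave(f, k) - s) ** 2
               for f, s in zip(standards, settings))


def fit_threshold(standards, settings, k_low=5000.0, k_high=40000.0, tol=1e-3):
    """Golden-section search for the k that minimizes the squared error."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = k_low, k_high
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if (sum_squared_error(c, standards, settings)
                < sum_squared_error(d, standards, settings)):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2.0


standards = [200.0, 261.63, 300.0, 500.0, 587.33,
             700.0, 1000.0, 1318.5, 1500.0, 2000.0]

# Synthetic, noise-free settings from a hypothetical observer with k = 12,000 Hz.
true_k = 12000.0
synthetic_settings = [subjective_octave(f, true_k) for f in standards]

estimated_k = fit_threshold(standards, synthetic_settings)
print(estimated_k)
```

The lower bracket limit must stay above the largest standard multiplied by the square root of 2, so that the model's denominator remains positive throughout the search.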

Table 7: Root Mean Square Deviations, Average Absolute Deviations, and Estimated k Values for Threshold Model

Subject    RMSD     AD       Best Fitting k Value
1          11.35    8.37     11,640
2          12.75    9.83     14,678
3          24.63    17.23    11,483
4          8.49     7.59     13,579
5          24.80    19.20    15,855
6          25.23    18.23    9,039
7          9.80     7.68     11,292
8          15.60    13.18    19,635

Discussion

As illustrated in Table 3, the phenomenon of octave enlargement was evident for all standard frequencies. Though the initial setting of the variable tone was sometimes lower and sometimes higher than an octave, and though subjects were instructed to bracket the octave before making their final setting, subjects consistently set the variable tone higher than a physical octave. Table 4 gives the mean frequency setting in Hz for each standard frequency. In all cases it was larger than the physical octave except for a standard tone of 261.63 Hz, which is middle C. Even there, the mean setting, though smaller than the physical octave, was very close to it (PO = 523.26 vs. SO = 521.36). As shown in this table, the largest distortions occur for tones above 500 Hz. These findings are consistent with what has been found in other studies.

Three models were evaluated with respect to their ability to predict the subjects' settings. Not surprisingly, the Galilean model showed the poorest performance. This model predicts that subjective octaves are equivalent to physical octaves. It would be a promising model if subjects were equally likely to over- and underestimate octaves, so that their mean settings were close to the correct physical values.

However, as seen in Table 5, the average absolute deviations of the predicted settings from the observed settings for this task ranged from 18.54 to 40.31. As expected, the two models which have a basis for predicting enlargement perform better. For Terhardt's model the average absolute deviations ranged from 8.23 to 30.39. For the threshold model, they ranged from 7.59 to 19.20. Clearly these models are more appropriate than the Galilean model, and the threshold model is somewhat better than Terhardt's model.

Comparing these three models in more detail, we find that on the basis of the size of the average absolute deviation, none of the subjects were best described by the Galilean model. Terhardt's model performed better than the threshold model for three subjects. For subject 8, the average absolute deviation was 8.23 under Terhardt's model and 13.18 under the relativistic model. For subjects 2 and 3 the average absolute deviation was very close for the two models, with

Terhardt's model yielding values of 8.69 and 16.22 respectively, and the threshold model yielding values of 9.83 and 17.23 respectively. The other five subjects (subjects 1, 4, 5, 6, and 7) were better fit by the threshold model.

Looking more closely at the relativistic model, we find that the largest estimated threshold value was 19,634.6 for subject 8, and the lowest estimate was 9,039.3 for subject 6. The estimates for the remaining six subjects were all between 11,000 and 16,000. As the upper limit of hearing is generally between 15,000 and 20,000 Hz, the threshold values found in this study do not appear to correspond to this limit.

However, as indicated in chapter 3, there is some evidence to suggest that there may be other limits on the auditory system. For example, though a person may be able to hear tones above the frequencies which were estimated here as thresholds, he may not be able to make judgments about these tones. That is, a tone of 17,000 Hz may not sound "higher" than a tone of 16,000 Hz, though both will be heard as sounds.

The range of the estimated threshold values reflects the fact that subjects differed in the amount of stretch they exhibited. This finding is one which Terhardt's model does not address. For example, subject 8, whose threshold value was estimated to be 19,634.6, exhibited the least amount of stretching. Of the 30 adjustments this subject made, the largest deviation of the subjective octave from the physical octave was

48 Hz. Subject 6, on the other hand, whose threshold value was estimated to be 9,039.29, exhibited pronounced stretching. The largest deviation for this subject was 281 Hz. Under either the Galilean model or

Terhardt's model, the predicted values for both of these subjects would be the same. In contrast, the relativistic model can accommodate these individual differences.

In sum, three models were fit to the data. The Galilean model predicts no stretching. The model proposed by Terhardt predicts stretch, but the same stretch is predicted for all subjects. This stretch is predicted by computing, for each frequency, the ratio necessary to form an octave. Calculation of this ratio depends on knowing the pitch shift values appropriate for the particular frequency of interest.

Furthermore, as the sizes of the octave intervals are assumed to be learned through exposure to the complex sounds of speech, these ratios are strictly appropriate only for frequencies less than 500 Hz. In contrast, the threshold model, while predicting stretching, also allows for individual variability. The only value which must be known to predict stretch for any standard frequency is the threshold value, k.

CHAPTER FIVE

Experiment Two

Experiment two involved a scaling task wherein subjects were asked to rate the similarity of pairs of pure tones drawn from several octaves.

Method

Subjects Fourteen subjects were paid $9.00 each for participating in the study. None of the subjects from experiment one participated in this experiment. Subjects were required to have had two years of musical training within the last five years, and were recruited by advertisement.

There were ten female and four male participants. Each subject completed a short questionnaire describing his or her musical training and experience. Subjects reported having an average of 9.8 years of musical training with an average of 3.5 years in the last 5 years. Subjects played an average of 3.1 instruments.

Apparatus Stimulus tapes were made by recording sequences of pure tone pairs. These pairs were generated by a Wavetek Model 159 Waveform Generator controlled by a Cromemco Z-2 Microcomputer, and recorded on a Nakamichi 550 tape recorder. During the experimental session, playback by the Nakamichi tape recorder was at a comfortable loudness level via a Kenwood Model KA-5700 amplifier and an Epicure Model 5 speaker.

Stimulus Materials Each trial consisted of a pair of tones. There were 36 tones in the experiment, the tones being the 12 notes of the equally tempered scale from each of the fourth, fifth, and sixth octaves. As there are 630 pairwise combinations of notes from these three octaves, not all combinations of notes were presented. MacCallum (1978) demonstrated the ability of ALSCAL, a three-way multidimensional scaling procedure, to recover the structure inherent in data sets when as many as 60% of the stimulus pairs are omitted. Recovery was good as long as the level of error in the data was low and there was a sufficiently large sample size. Furthermore, when the same stimulus pairs were omitted for all subjects, recovery was as good as when different stimulus pairs were omitted, again assuming low error and a sufficiently large sample size. In light of this, 50% of the stimulus pairs (315 pairs) were randomly sampled. The pairs presented included pairs in which both tones were in the same octave as well as pairs with tones from different octaves. These pairs ranged from tones that were one step apart on the equally tempered scale (e.g., G5, G#5) to tones 34 steps apart (e.g., C#4, B6). These pairs are listed in Appendix 2.
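The stimulus counts above can be checked with a short sketch. It enumerates the 36 tones, forms all unordered pairs, and draws half of them at random. The seed, the sharp-only note spelling, and the names tones, all_pairs, steps_apart, and sampled_pairs are assumptions made for illustration; the pairs actually used are those listed in Appendix 2.

```python
import itertools
import random

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# 36 tones: the 12 notes in each of octaves 4, 5, and 6, in ascending order.
tones = [f"{name}{octave}" for octave in (4, 5, 6) for name in NOTE_NAMES]

# All unordered pairs of distinct tones.
all_pairs = list(itertools.combinations(tones, 2))

def steps_apart(pair):
    """Number of equal-tempered semitone steps between the two tones."""
    low, high = pair
    return tones.index(high) - tones.index(low)

rng = random.Random(1978)  # arbitrary seed, used only for reproducibility
sampled_pairs = rng.sample(all_pairs, len(all_pairs) // 2)

print(len(all_pairs), len(sampled_pairs))                       # 630 315
print(steps_apart(("G5", "G#5")), steps_apart(("C#4", "B6")))   # 1 34
```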

Procedure Subjects were presented with prerecorded sequences of pairs

of tones. They were asked to judge the similarity of each pair after it was presented. Tones within each tone pair lasted one second. After

the second tone was presented, the subject had 10 seconds to make his

judgment. A 5,000 Hz warning tone signalled the start of a new trial.

This warning tone lasted 1.5 seconds and was followed by 2.5 seconds of

silence before the first tone of the next pair was presented. Subjects were tested in groups of 2 to 4.

Subjects were required to rate the pairs of tones on a scale of

1 to 7, with 1 describing two very similar tones and 7 describing two very dissimilar tones. These ratings were done on an answer sheet provided by the experimenter. During the instructions, subjects were encouraged to use the full range of the rating scale. In order to reduce the error as much as possible, subjects were instructed to listen to a set of 10 practice pairs and during that time to try to decide what dimension or dimensions were important to them. They were asked to use these dimensions in making their ratings throughout the experiment. The need for consistency was stressed. A copy of the instructions is in

Appendix 3.

Each of the 315 pairs of stimuli was rated twice by each subject, during two sessions which lasted approximately 1½ hours each. Subjects rated one half of the stimuli twice rather than all of the stimuli once in order to get more stable measures of similarity and also to ensure that subjects were responding consistently. Each experimental session was divided into 4 blocks of trials, 3 of which contained 79 pairs of tones and 1 of which contained 78 pairs. Between each of these blocks, subjects were given a short rest. Order of presentation of the 4 blocks was randomized for each session.

After the second session subjects answered a questionnaire describing their musical training and experience and also indicating what dimensions they used in making their judgments.

Results

Each subject was represented by two dissimilarity matrices corresponding to the two experimental sessions. Pilot data had indicated that subjects might be responding in a way inconsistent with multidimensional scaling models. That is, it was found that some subjects seemed to be attending to one dimension on some trials and to a different dimension on other trials. If that were the case, stimuli rated very similar on one trial might be rated as very dissimilar on another. To determine if this was occurring, the proportion of agreement across the two sessions was computed for each subject. Table 8 gives the cumulative proportion of responses differing by 0, 1, and 2 points on the rating scale.
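A minimal sketch of this agreement measure follows, assuming each session is stored as a list of ratings in the same stimulus order. The function name cumulative_agreement and the ten ratings are hypothetical and are not data from the study.

```python
def cumulative_agreement(session1, session2, max_diff=2):
    """Cumulative percentage of pairs whose two ratings differ by at most d points.

    session1, session2: equal-length lists of ratings (1 to 7) for the same
    stimulus pairs in the same order.
    Returns [pct(diff <= 0), pct(diff <= 1), ..., pct(diff <= max_diff)].
    """
    if len(session1) != len(session2) or not session1:
        raise ValueError("sessions must be non-empty and of equal length")
    diffs = [abs(a - b) for a, b in zip(session1, session2)]
    n = len(diffs)
    return [100.0 * sum(d <= limit for d in diffs) / n
            for limit in range(max_diff + 1)]

# Hypothetical ratings for ten stimulus pairs (not data from the study).
first = [1, 2, 3, 4, 5, 6, 7, 3, 2, 5]
second = [1, 3, 3, 6, 5, 3, 7, 4, 2, 4]
print(cumulative_agreement(first, second))  # [50.0, 80.0, 90.0]
```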

Agreement across the two sessions was generally good. Therefore the two matrices were averaged for each subject and these 14 dissimilarity matrices were analyzed with the multidimensional scaling program ALSCAL.

This program was chosen as it can handle missing data. It was specified that the data were ordinal, matrix conditional, and that an individual differences model was desired.

Table 9 gives the mean rating across subjects for musical intervals within the fourth, fifth, and sixth octaves. Within an octave, there are 11 musical intervals which can be defined. These intervals correspond to the number of steps separating the tones comprising the interval. These intervals are, from smallest to largest: semitone (1 step apart), whole tone (2 steps apart), minor third (3 steps apart), major third (4 steps apart), perfect fourth (5 steps apart), diminished fifth (6 steps apart), perfect fifth (7 steps apart), minor sixth (8 steps apart), major sixth (9 steps apart), minor seventh (10 steps apart), and major seventh (11 steps apart). The number of presentations of a particular interval within each of the three octaves was different due to the random sampling of pairs. For example, in the fourth octave there were two major thirds

Table 8: Cumulative proportions (in percent) of ratings differing by 0, 1, or 2 points across the two sessions.

Subject number    0 points    1 point    2 points

1                 38.413      78.730     94.921
2                 27.302      61.905     79.048
3                 25.714      57.460     78.413
4                 42.540      81.270     93.333
5                 43.810      81.587     94.286
6                 38.095      75.556     90.794
7                 35.556      84.127     97.460
8                 45.079      66.984     85.397
9                 43.175      80.952     92.698
10                75.556      93.016     98.095
11                43.810      81.270     92.381
12                52.063      79.048     84.762
13                40.952      74.921     90.476
14                44.440      70.476     80.635

Table 9: Mean Ratings for musical intervals

Interval          4th octave    5th octave    6th octave

Semitone          2.225(7)      2.435(6)      2.334(3)
Whole tone        2.119(6)      2.312(8)      2.559(3)
Major 3rd         3.206(4)      3.50(3)       3.543(5)
Minor 3rd         2.75(2)       3.131(3)      3.447(6)
Perfect 4th       3.277(4)      3.277(4)      3.705(4)
Diminished 5th    4.012(3)      4.465(2)      4.679(4)
Perfect 5th       3.554(2)      3.833(3)      4.054(2)
Minor 6th         4.214(1)      4.50(2)       5.059(3)
Major 6th         3.911(2)      4.071(1)      5.036(1)

Minor 7th 4.964(1)

Major 7th 5.107(1)

presented, while in the fifth octave there were three, and in the sixth

octave, six. However, all 14 subjects responded to each interval.

The number of intervals that each mean rating is based on is given in

parentheses.

To determine if the differences between these ratings were statistically significant, a one-factor repeated measures analysis of variance was performed on the ratings for each of the intervals from the semitone to the major sixth. For each interval, the analysis involved mean ratings for the 14 subjects at each of three octaves. No difference was found for semitones (F = 1.51, p<.24). The difference was marginally significant for the major third (F = 2.74, p<.08) and significant for the whole tone (F = 3.65, p<.05), the minor third (F = 6.53, p<.01), the perfect fourth (F = 11.57, p<.01), the perfect fifth (F = 6.68, p<.01), the minor sixth (F = 7.63, p<.01), and the major sixth (F = 11.11, p<.01).
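For readers unfamiliar with the one-factor repeated measures design, the following sketch computes the F ratio from a subjects-by-octaves table of mean ratings. The function name repeated_measures_f and the three-subject example are hypothetical; the software actually used for the analysis is not stated in the text.

```python
def repeated_measures_f(scores):
    """One-factor repeated-measures ANOVA.

    scores[i][j] is the mean rating of subject i in condition j (here, octave j).
    Returns (F, df_conditions, df_error).
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Subject variability is removed from the error term.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error

# Hypothetical mean ratings for three subjects at octaves 4, 5, and 6.
example = [[1.0, 2.0, 3.0],
           [2.0, 3.0, 5.0],
           [3.0, 5.0, 6.0]]
print(repeated_measures_f(example))
```

With 14 subjects and three octaves, as in this experiment, the degrees of freedom would be 2 and 26.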

Two scaling solutions were investigated. For the two dimensional solution SSTRESS was .43. Individual r-square values ranged from .463 for subject 8 to .845 for subject 7. The two dimensional solution is shown in Figure 6. The solution is basically a curvilinear dimension along which the notes from each of the 3 octaves cluster together. As shown in the figure, the order of notes within each octave does not correspond strictly to tone height. To determine if there was a third dimension which could account for this reordering, the three dimensional solution was also investigated. For this solution, SSTRESS was .335 and r-square values ranged from .642 to .832. The solution is shown in

Figures 7 and 8. Figure 7 is a plot of dimensions 1 and 2. Figure 8 is a plot of dimensions 1 and 3. Table 10 gives the weights for each subject for both the two dimensional and the three dimensional solutions. These weights indicate how important each of the dimensions is to the subjects.

Figure 6: Two-dimensional scaling solution. Dimension 1 (horizontal) vs. Dimension 2 (vertical).

Figure 7: Three-dimensional scaling solution. Dimension 1 (horizontal) vs. Dimension 2 (vertical).

Figure 8: Three-dimensional scaling solution. Dimension 1 (horizontal) vs. Dimension 3 (vertical).

Discussion

Some support was found for both a curvilinear representation of frequency and for the stretching of musical intervals. As shown in

Table 9, the mean dissimilarity ratings generally increased from the fourth to the sixth octaves. Though these tones all form identical musical intervals in terms of frequency ratios, the notes comprising the intervals in the upper octaves sound less similar to each other than notes having the same physical frequency ratio but occurring in a lower octave. This would be the case if intervals in the upper octaves were undergoing stretch as predicted by equation 14.

Consider how equation 14 might be applied in this task. The first tone of a given pair establishes a moving reference frame, based upon which the second tone of the pair is evaluated for a given target interval. The higher the frequency of this first tone, the larger the amount of stretch will be in the subjective interval. Thus, the second tone in the pair would have to be higher in frequency than is required for the physical ratio in order to be perceived as the same musical interval in a lower octave where less stretching occurs.

This stretching is also reflected in the scaling solution. As seen in Figure 6, the two dimensional solution is basically a circular arrangement in which, starting in quadrant III, the notes proceed from the fourth octave through the sixth octave. Notes in the

Table 10: Subject weights for the 2 and 3-dimensional scaling solutions

                  Two-dimensional       Three-dimensional
                  Solution              Solution

Subject Number    Dim. 1    Dim. 2      Dim. 1    Dim. 2    Dim. 3

1                 .512      .496        .449      .476      .503
2                 .771      .449        .688      .434      .394
3                 .526      .504        .460      .479      .497
4                 .705      .528        .605      .502      .422
5                 .611      .516        .527      .500      .470
6                 .694      .443        .609      .448      .444
7                 .754      .526        .645      .515      .388
8                 .475      .488        .429      .479      .478
9                 .621      .504        .554      .495      .427
10                .666      .458        .583      .462      .461
11                .651      .549        .570      .522      .442
12                .576      .539        .500      .508      .488
13                .642      .588        .556      .541      .450
14                .520      .560        .444      .516      .494

fourth octave cluster within quadrant III. Notes in the fifth octave

are located primarily within quadrant II, but 3 tones from this octave

lie within quadrant I. The sixth octave is represented in quadrants I

and IV. Thus the lowest octave has less spread than the upper two

octaves. This may be reflecting that less stretching occurs in the lower

octave. As indicated earlier, within an octave the order of the notes

does not correspond to tone height. The arrangement of tones within an

octave appears to be reflecting the salience of certain musical intervals.

For example, in quadrant III, the note G#4 is close to F#4, which is a whole tone below it; to G4, which is a semitone below it; to C#4, which is a fifth below it; and to A4, which is a semitone above it. Though F#4, G4 and A4 are all close to G#4 in terms of tone height, the closeness of C#4 is due to the importance of the perfect fifth interval.

However, the ordering of notes in octave 5 differs from that in octave 4. Likewise there is a different ordering for octave 6. This may be due to the manner in which stimuli were chosen for this study.

Notes from the fifth octave can form musical intervals with notes not only in the same octave but also with notes from both the fourth and the sixth octaves. For example, F5 forms a fifth both with C6 and with A#4.

However, notes in the fourth octave can combine in musical intervals only with notes in the fifth octave, as notes an octave lower were not included in the study. Similarly, notes in the sixth octave can only combine with notes in the fifth octave. Thus, the ordering of notes in octave 5 must reflect relationships tones in this octave have among themselves and also with tones from both other octaves.

The three dimensional solution was also examined to determine if more than two dimensions were interpretable. The three dimensional solution, shown in Figures 7 and 8, also exhibits some curvilinearity.

In Figure 7, dimensions 1 and 2 of this solution are shown. In this solution the notes, though still clustered by octave, exhibit more spread. However, the ordering of the notes within an octave does not differ greatly from the ordering in the two dimensional solution.

The third dimension, shown in Figure 8 with dimension 1, seems to be pulling flats from sharps. As seen in this figure, 11 out of 15 sharps lie above the horizontal axis, while 14 out of 21 flats lie below the axis.

Subjects' weights for these dimensions are shown in Table 10.

These weights are, in general, equal for both dimensions of the two dimensional solution as well as for the three dimensions of the three dimensional solution. This may be due to the fact that the subjects were all highly trained musicians and so were a relatively homogeneous group.

In sum, hypothesis II(1) was supported for most intervals.

Stretching was indicated by larger dissimilarity ratings for intervals in the higher octaves. Hypothesis II(2) received some support in that a curvilinear representation was recovered. However, subjects did not exhibit different weights for these dimensions.

CHAPTER SIX

GENERAL DISCUSSION

Introduction

The octave interval, which is a particularly salient musical interval, has been a recurrent subject of psychological studies. In one form or another it has played a role in the development of psychophysical scales relating frequency to pitch. Stevens and Volkmann (1940) asked subjects to adjust a variable tone to be half the pitch of a standard tone. In so doing they were effectively asking for the note an octave below the standard. Their study was the basis for the construction of the mel scale. Ward (1954) specifically required subjects to adjust a variable tone to be an octave above a standard tone. This study produced a scale relating the subjective octave to the physically defined octave. In other studies, octave relationships have also played a prominent role. Evidence, already reviewed, suggests that in some tasks, tones separated by an octave are similar enough to be substituted for one another by both humans (Bachem, 1937) and rats

(Blackwell and Schlosberg, 1943).

Several questions surround the importance of this interval.

Consider again the definition of the octave interval from the Harvard

Dictionary of Music. "The octave interval is the most perfect consonance, so perfect that it gives the impression of duplicating the original tone,

a phenomenon for which no convincing explanation has been found."

Another often reported phenomenon regarding this interval, for which "no convincing explanation has been found", is the stretching of subjective octaves relative to the physically defined ratio for octaves of 2:1. Though this stretching has been addressed by several theories, the theories have in general not been quantitative and few have been rigorously tested.

This thesis attempted to develop and test a quantitative model for predicting octave stretch. The model developed has two features which set it apart from other theories. First, it is based upon a particular physical representation of frequency, the logarithmic spiral.

An important advantage of this representation is that the salience of the octave interval is seen to have a physical basis. Along the curvilinear dimension defining frequency, octaves are those tones which are separated by a rotation through 360°. Secondly, the model assumes that the perception of relationships among events, such as tones comprising an octave, is not a passive process. It is assumed that the observer perceives these relationships by establishing reference frames which are based on initial events and from which subsequent events can be evaluated. Furthermore, the degree to which this process can lead to veridical perception of events and inter-event relationships is limited by a threshold value.

This model, quantitatively expressed in equation 14, can predict the stretching of octave intervals as well as of other intervals. The amount of stretch predicted depends both upon the particular interval being evaluated and upon the magnitude of the threshold, which can vary across individuals. This variation in threshold values leads to a prediction of variability across subjects in tasks such as are typically performed in octave stretch studies. Also, individual variability in these tasks might further be accounted for by differences in the establishment of stationary and moving frames of reference.

This model, developed in chapter 3, was evaluated through two experiments that were reported in chapters 4 and 5. In the first study, the stretching of the octave interval was specifically investigated.

The threshold model was contrasted to two other models and shown to be generally superior. In the second experiment, a multidimensional scaling technique was employed to explore the subjective representation of frequency. Furthermore, independent evidence for the stretching of intervals other than octaves was also sought.

Experiment One

Data collected in experiment one (Table 4 of Chapter 4) were generally consistent with those reported in other octave adjustment tasks (e.g., Ward, 1954). As previously discussed, these trends include stretching of subjective intervals; increased stretching as frequency increases; small intra-subject variability; and large inter-subject variability. In the current study there is clear evidence of these trends. Consider first the average data collected in this study. When the highest frequency (2,000 Hz) was used as the standard, the greatest amount of stretch was observed. The mean frequency settings from the lowest to the highest show that all standards except middle C (261.63 Hz) resulted in stretching.

Interestingly, however, the amount of stretch does not increase smoothly with increasing frequency. This is also consistent with

Ward's findings. The standard errors reported in this study were, in general, lower than those reported by Ward and may reflect the fact that, unlike Ward's subjects, all subjects used in the present study had had prior experience with pure tones. Intra-subject variability tended to be greater than Ward's values, particularly at the lower frequencies.

This discrepancy, however, appears to be due to one subject whose adjustments were quite variable.

The threshold model was applied individually to each subject's data and was found to be superior to a simple non-threshold (Galilean) model for all subjects. For a majority of the subjects the threshold model also performed better than Terhardt's learning model. Briefly,

Terhardt's model assumed that the ratio defining a subjective octave is not exactly equal to the physically defined ratio. This discrepancy is assumed to be due to pitch shifts which reflect early experience with speech. Further, the values of these pitch shifts were assumed to be the same for all individuals. Thus the same subjective octave settings were predicted for all subjects in the current study. In contrast, the threshold model predicts different settings for each individual. This is due to the assumption that thresholds can be different across individuals.

In fitting the threshold model to the data, individual estimates of the subjects' threshold values were found.

The threshold model evaluated in experiment one is a special case of a more general approach. For the special case it was assumed that:

(1) there is a stationary frame identified with the observer and associated with 0 Hz; (2) the standard tone establishes a moving reference frame; and (3) the variable tone is adjusted relative to the moving frame such that an octave interval is perceived with respect to the stationary frame. This model, however, can be extended to one in which the stationary frame is associated with some other frequency. In this general form, the model can address findings different from the traditionally reported stretching.

For example, consider the adjustments made by subject 5. This subject, whose observed settings are shown in Appendix 1, was the subject most poorly fit by the simple threshold model. This is partly because, at the lowest frequencies (200, 261.63 and 300 Hz) this subject underestimated physical octaves instead of overestimating these intervals as the simple model would predict. These findings could however be accounted for by the more general case. If the stationary reference frame were associated with 440 Hz, different predictions would have been made. This amounts to assuming that subject 5 tends to have a modal level of attention centered on middle range frequencies instead of on very low ones. Under this assumption and with an estimated threshold of 15,855, the following form of equation 11 could be applied to predict the observed octave adjustment for a standard tone of 300 Hz:

\[
2 f_{std} - 440 = \frac{(f_{std} - 440) + f_{rel}}{1 + \dfrac{(f_{std} - 440)\, f_{rel}}{k^{2}}}
\]

This equation predicts f_rel = 299.97 and so a subjective octave of 599.97.

Thus the model predicts an underestimation. Though the difference between this predicted value and the physically defined octave is small, it does indicate the potential of this approach to address both under- and overestimations. For simplicity, it was assumed in this study that all subjects had a stationary frame associated with 0 Hz, but this is not necessarily an accurate reflection of the frames the subjects used. In

fact, for different standard frequencies, the frequency associated with the stationary frame might be expected to vary. For example, for tones which occur within the frequency range of the fourth octave, subjects might use C4 as a stationary point; for tones in the frequency range of the fifth octave the stationary point might be shifted to C5. Shifting stationary frames could lead to overestimations at some frequencies and underestimations at others, although in general the magnitude of these distortions will increase as frequency increases.
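The subject 5 example with a 440 Hz stationary frame can be reproduced with the sketch below. It assumes the frame-shifted form of the equation is a velocity-addition expression in which the standard and the target octave are both measured from the stationary frequency. This form is inferred from the reported values (299.97 and 599.97), and the names relative_setting and subjective_octave are hypothetical.

```python
def relative_setting(f_std, k, f_stationary=0.0):
    """Solve the (assumed) frame-shifted equation for f_rel.

    Starting from
        (2*f_std - f0) = ((f_std - f0) + f_rel) / (1 + (f_std - f0) * f_rel / k**2)
    and rearranging gives
        f_rel = f_std / (1 - (2*f_std - f0) * (f_std - f0) / k**2).
    """
    v = f_std - f_stationary             # standard relative to the stationary frame
    target = 2.0 * f_std - f_stationary  # physical octave relative to that frame
    return f_std / (1.0 - target * v / k ** 2)

def subjective_octave(f_std, k, f_stationary=0.0):
    """Predicted octave setting: the standard plus the relative adjustment."""
    return f_std + relative_setting(f_std, k, f_stationary)

# Subject 5: estimated threshold 15,855, standard 300 Hz, stationary frame 440 Hz.
print(round(relative_setting(300.0, 15855.0, 440.0), 2))   # 299.97
print(round(subjective_octave(300.0, 15855.0, 440.0), 2))  # 599.97
```

Leaving f_stationary at 0 Hz gives the simple model, which for the same standard and threshold predicts a setting slightly above 600 Hz, that is, stretch rather than underestimation.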

In contrast to subject 5, the subjects best fit by the model, subjects 1, 4, and 7, did not reveal underestimation at the lower frequencies. Subject 1 showed consistent stretching across all tones.

Subject 7 showed stretch for all tones except one, middle C. Subject 4 showed stretch for both lower and upper frequencies, but lack of stretch at three middle range frequencies. The simplest case of the model fits these subjects as they appear to stay primarily with one stationary frame. However, for other subjects, the estimated threshold parameters may be somewhat in error due to the fact that multiple frequencies may serve as the stationary frame while the fits were based on a 0 Hz stationary point.

Results of experiment one illustrate the potential of a relativistic model. Another model, the model developed by Terhardt, is simpler to apply and also performed reasonably well. If pitch shift values are known, the predicted octave setting can be found by computation of a

frequency ratio using equation 5. However, this model is also a static model. As octave intervals are assumed to be based on learned speech

sounds, there is no basis in this model to predict variability in performance. For example, as seen in the present study, though overestimation was most commonly found, there was also some underestimation.

Similar findings have been remarked on by others (Sundberg and Lindqvist,

1973) and this phenomenon has been termed "negative stretch".

Terhardt’s model cannot address this finding, nor can it predict variability across subjects. The relativistic model, by requiring specification of reference frames and inclusion of a threshold value,

can account for these findings. In fact, a relatively simple case of

this model generally performed better than Terhardt’s model.

In light of this discussion, it is valuable to consider in more detail one of the major theoretical issues raised by the comparison of simple and general versions of the relativistic model. This issue concerns the manner in which the correct reference frames are established for a particular task and/or subject. This is a very difficult problem, as there are a variety of ways in which these frames can be conceived of, but one very worth tackling. This is because the general model has

the potential for addressing a great variety of experimental tasks as well as explaining individual differences. To illustrate the nature of the problem, consider what modifications of the model might be required if the octave adjustment task were changed in the following fashion. A minor change might involve requiring subjects to adjust a variable tone to be an octave below a standard tone rather than above. In this case, the simple model might again be applied as a first approximation.

Assuming a threshold value of 10,000 Hz and a standard tone of 2,000 Hz, we find using equation 11 that the predicted setting is:

\[
SO = 2000 + \frac{-2000/2}{1 - \dfrac{(2000)(2000/2)}{(10{,}000)^{2}}} = 979.6
\]

Thus, this simple model predicts underestimation of a subjective lower octave. This prediction is consistent with the results of a fractionation study by Stevens and Volkmann (1940). In this study subjects were asked to adjust a variable tone to be half the pitch of a standard tone. The observed settings were consistently underestimations of the correct half frequencies. For this study the simple model is probably a reasonable approach due to the fact that subjects were given a third tone of 40 Hz to use as an anchor or zero point when making their adjustments. But it is significant that in an earlier study by Stevens,

Volkmann, and Newman (1937) using an almost identical task subjects consistently overestimated the half frequencies. The single change in the experiment seems to be a critical one. In the 1937 study subjects were not provided with an anchor point. As the subjects who participated in these studies were not trained musicians, they may have had difficulty establishing a stationary frame without an anchoring point.

Furthermore, as the variability reported in these two studies was greater than that usually reported in octave adjustment studies, subjects may also have had difficulty in establishing the moving frame. If these observers were not establishing a moving frame associated with the standard, but were instead trying to establish some moving frame which, when doubled, would give the standard tone with respect to their stationary frame, overestimation would occur. Thus, overestimation could be predicted by a version of the model similar to Caelli, Hoffman, and Lindman’s approach. Equation 8, developed for predicting half-velocities, could be used to predict these overestimations of half-frequencies.
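The lower-octave prediction of 979.6 Hz quoted earlier in this section (threshold 10,000 Hz, standard 2,000 Hz) can be checked with a small sketch of the simple model, in which the stationary frame is at 0 Hz. The velocity-addition form used here is an assumption chosen because it reproduces that value; the function name predicted_setting and its ratio parameter are hypothetical.

```python
def predicted_setting(f_std, ratio, k):
    """Predicted setting of the variable tone under the simple (0 Hz frame) model.

    ratio is the physical frequency ratio of the target interval to the
    standard: 2.0 for an octave above, 0.5 for an octave below.
    Assumed relation: target = (f_std + f_rel) / (1 + f_std * f_rel / k**2).
    """
    target = ratio * f_std
    # Solve the assumed relation for the adjustment made relative to the standard.
    f_rel = (target - f_std) / (1.0 - target * f_std / k ** 2)
    return f_std + f_rel

# Octave below a 2,000 Hz standard with a 10,000 Hz threshold.
print(round(predicted_setting(2000.0, 0.5, 10000.0), 1))  # 979.6
```

Under the same assumptions an octave above (ratio 2.0) yields a setting greater than 4,000 Hz, so the one expression gives stretch for upper octaves and underestimation for lower ones.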

Relativistic models, therefore, have potential for addressing both individual variability and variability across tasks. However, in order to apply a model of this type several difficult questions must be answered. What are the threshold values for the observers being evaluated? What is serving as a stationary reference frame? What is serving as a moving reference frame? If these questions can be answered, the model can account for diverse findings under one theoretical framework.

Experiment Two

The phenomenon of the stretching of subjective intervals, if indicative of a general process as described by equation 14, should be revealed in tasks other than adjusting frequencies. In experiment two, which was designed to look at the subjective representation of frequency, stretching of intervals other than octaves was evident. As shown in Table 9, most musical intervals were rated more dissimilar in the higher octaves than in the lower one. Furthermore, the results of this study indicated the importance of the octave interval, and lent support to a curvilinear representation of frequency.

Table 11 gives the rank ordering, from low to high dissimilarity rating, of the musical intervals used in this study. Some intervals comprised two notes from the same octave (columns 2, 3, and 4). Other intervals involved one tone from the fourth octave and one from the fifth (column 5) or one from the fifth octave and one from the sixth (column 6). For example, in the fourth octave the interval with the smallest dissimilarity rating was a whole tone, while the interval with the largest dissimilarity rating was a minor sixth.

The importance of the octave interval is also illustrated in Table 11. For musical intervals involving notes from the fourth and fifth octaves, only two intervals were rated as more similar than the octave. These were the semitone and whole-tone intervals, which involve tones differing by only one or two steps of the scale. Similarly, only a semitone was rated more similar than an octave for intervals with tones from the fifth and sixth octaves. Within each octave, tone height evidently influences the ratings, as intervals involving tones far apart in frequency are generally rated as more dissimilar than intervals involving tones close together. For example, in those cases where a major seventh was included, this interval was always rated as the least similar. This interval is composed of tones at opposite ends of the scale (e.g., C5, B5). However, tone height is not the entire basis for the ratings. For example, in all three octaves tones comprising a major third (four steps apart) appear before tones comprising minor thirds (three steps apart). The intervals that are out of order with respect to tone height correspond to musically important intervals.

Table 11: Rank ordering of dissimilarity ratings by octave

Column 1 gives each musical interval. Columns 2 to 6 give the ordering of these intervals with respect to dissimilarity ratings.

Interval               4th octave   5th octave   6th octave   4-5   5-6

Semitone (ST)          WT           WT           ST           ST    ST
Whole tone (WT)        ST           ST           WT           WT    O
Minor 3rd (m3)         M3           M3           M3           O     M3
Major 3rd (M3)         m3           P4           m3           M3    P4
Perfect 4th (P4)       P4           m3           P4           P4    WT
Diminished 5th (d5)    P5           P5           P5           m3    P5
Perfect 5th (P5)       M6           M6           d5           P5    m3
Minor 6th (m6)         d5           d5           M6           M6    M6
Major 6th (M6)         m6           m6           m6           m6    d5
Minor 7th (m7)         --           m7           --           m7    m7
Major 7th (M7)         --           M7           --           M7    M7
Octave (O)

The two-dimensional scaling solution based on these dissimilarity ratings and shown in Figure 6 was basically a circular arrangement. This circularity allows notes an octave apart to come into closer spatial proximity than they would if frequency were represented by a linear one-dimensional continuum. Tone height was reflected in that notes within each octave were grouped together and the octaves were spaced along the circle starting with the fourth and ending with the sixth. The ordering of the notes within each octave, which did not correspond to tone height, reflected the differential importance of some musical intervals.

This subjective representation of frequency differs from others that have been reported. A major reason for this was the inclusion in this study of stimuli from three octaves. Other scaling studies have concentrated on notes from a single octave (e.g., Krumhansl, 1979) or have used as stimuli musical intervals rather than individual tones (e.g., Levelt, Van de Geer, and Plomp, 1966). Thus the representations they recovered were limited. Krumhansl's study explored relationships within an octave. Levelt et al. looked at intervals without regard to the particular tones comprising them. The current study examined relationships both within and between octaves. Support was found for a curvilinear subjective representation in which both octave relationships and tone height are important.

Implications for Further Research

The present studies, though showing support for the proposed model, were limited in several ways. In the octave adjustment task, the model was evaluated by curve fitting with one parameter being estimated from the data. Ideally, independent estimates of the threshold should be obtained and these values used to predict stretch.
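One-parameter curve fitting of this kind can be sketched as follows. This is a minimal illustration, not the fitting routine used in the study: the prediction formula is an assumed form inferred from the Appendix A tables, a plain grid search stands in for whatever least-squares procedure was actually used, and the search bounds are arbitrary. The data are the Subject 1 settings from Appendix A.

```python
# Subject 1 settings copied from Appendix A: (standard Hz, observed Hz).
SUBJECT_1 = [
    (200.0, 410.25), (261.63, 524.75), (300.0, 602.75), (500.0, 1001.75),
    (587.33, 1183.75), (700.0, 1417.00), (1000.0, 2010.75),
    (1318.5, 2650.75), (1500.0, 3074.50), (2000.0, 4121.75),
]


def predicted_octave(standard_hz, k):
    # ASSUMED form, inferred from the Appendix A tables (not quoted from equation 11).
    return standard_hz + standard_hz / (1.0 - 2.0 * standard_hz ** 2 / k ** 2)


def sse(k, data):
    """Sum of squared differences between observed and predicted settings."""
    return sum((obs - predicted_octave(s, k)) ** 2 for s, obs in data)


def fit_threshold(data, k_min=5000, k_max=30000):
    """Whole-number threshold with the smallest squared error (grid search)."""
    return min(range(k_min, k_max + 1), key=lambda k: sse(k, data))


k_hat = fit_threshold(SUBJECT_1)
print(k_hat, round(sse(k_hat, SUBJECT_1), 1))
```

The estimate obtained this way can be compared with the threshold reported for Subject 1 in Appendix A. An independent estimate of the threshold, as suggested above, would replace `fit_threshold` with a measured value.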

Another limitation of this study involves the stimuli chosen as standards. Ten frequencies ranging from 200 to 2000 Hz were employed in this study. To more rigorously test the model, more frequencies at the upper end of the frequency range should be incorporated as this is the range where stretching primarily occurs. Alternatively, a more rigorous test would involve devising experimental procedures which would be capable of leading subjects to adopt particular frames of reference as their stationary and moving frames. In that case, predictions for identical tasks could be for overestimations in some cases and underestimations in others. One last approach to testing the model more thoroughly would involve using more intervals such as fourths or fifths and predicting observed settings for these intervals using equation 11.

With respect to finding a subjective representation of frequency through scaling techniques, there are several approaches which could be taken. One would involve using tones from only two octaves as stimuli and gathering data for all pairwise comparisons. In this way information on all of the musical intervals would be available. Another approach would involve gathering the same ratings from nonmusicians as well as musicians. This would give information as to whether the salience of some musical intervals was due to learning. This had been originally planned as a part of the current studies. However, pilot studies indicated that nonmusicians appeared to be flipping from dimension to dimension at different points in time. That is, tones rated as being very similar in one session would be rated as very dissimilar in a second session. Data such as these are not appropriately analyzed by a scaling model which assumes that distances are formed as weighted sums of differences across dimensions.

Summary and Conclusions

This paper developed a model for octave stretch. This model differed from models previously proposed in that (1) it was based upon a particular physical representation of frequency, the logarithmic spiral; and (2) it involved assuming that a threshold exists which places certain restrictions on the accuracy of perception. This model was examined through two studies.

In experiment one subjects were required to adjust the frequency of a variable tone to be an octave above a standard tone. Results of this study supported Hypothesis I(1). Subjects exhibited subjective stretching of octave intervals. Further, subjects displayed different degrees of this stretching. Hypothesis I(2) was also supported. The threshold model was found to be generally superior to a Galilean model and superior to Terhardt's model for a majority of the subjects.

Experiment two employed multidimensional scaling to examine the subjective representation of frequency. Hypothesis II(1) was supported by this study. Evidence for stretch was found in the dissimilarity ratings of intervals in the three octaves. For most intervals, dissimilarity measures were greater for tone pairs from the upper octaves than for the lower octave. Hypothesis II(2) was not totally supported. Evidence was found for a curvilinear dimension, but this dimension was basically circular in shape. This dimension incorporated both tone height and tone chroma. Hypothesis II(3) was not supported. Subjects weighted the dimensions approximately equally. This may reflect the fact that the subjects, all being trained musicians, were a relatively homogeneous group.

The model developed in chapter three received some support from the two studies performed. The more general form of this model has the ability to address both individual differences and task specific differences. Thus it has potential for addressing a wide range of experimental findings. The support offered by the current studies in conjunction with the potential the model offers for incorporating many studies under a single theoretical framework indicates the value of pursuing further research using this model.

APPENDIX A

Observed Frequency Adjustments and Values Predicted by Threshold Model
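The predicted columns in the tables of this appendix can be approximately regenerated from each subject's threshold k. The sketch below assumes the form predicted = s + s / (1 - 2s^2/k^2) for a standard s. This form is inferred from the tabulated values rather than quoted from equation 11, and the helper names are hypothetical. A few rows from Subjects 1, 6, and 8 are checked.

```python
def predicted_octave(standard_hz, k):
    # ASSUMED form, inferred from the tabulated predictions.
    return standard_hz + standard_hz / (1.0 - 2.0 * standard_hz ** 2 / k ** 2)


# (threshold k, standard Hz, tabulated predicted value), copied from the tables.
ROWS = [
    (11640, 200.0, 400.12), (11640, 1000.0, 2014.98), (11640, 2000.0, 4125.49),
    (9039, 200.0, 400.20), (9039, 2000.0, 4217.07),
    (19635, 1000.0, 2005.21), (19635, 2000.0, 4042.38),
]

# Absolute difference between the recomputed and the tabulated prediction.
gaps = [abs(predicted_octave(s, k) - tabulated) for k, s, tabulated in ROWS]
max_gap = max(gaps)
print(f"largest discrepancy: {max_gap:.3f} Hz")
```

Discrepancies of a few hundredths of a hertz are consistent with k being reported as a whole number.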

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 1

k = 11,640

Standard     Observed     Predicted
Frequency    Value        Value

200          410.25       400.12
261.63       524.75       523.52
300          602.75       600.40
500          1001.75      1001.85
587.33       1183.75      1177.67
700          1417.00      1405.10
1000         2010.75      2014.98
1318.5       2650.75      2671.72
1500         3074.50      3051.53
2000         4121.75      4125.49

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 2

k = 14,678

Standard     Observed     Predicted
Frequency    Value        Value

200          403.33       400.07
261.63       528.00       523.43
300          594.00       600.25
500          995.00       1001.16
587.33       1178.00      1176.55
700          1417.33      1403.20
1000         2029.33      2009.37
1318.5       2686.33      2658.63
1500         3035.00      3032.00
2000         4065.33      4077.13

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 3

k = 11,483

Standard     Observed     Predicted
Frequency    Value        Value

200          400.50       400.12
261.63       521.00       523.53
300          612.75       600.41
500          1007.00      1001.90
587.33       1193.50      1177.75
700          1427.50      1405.24
1000         2028.75      2015.40
1318.5       2676.50      2672.71
1500         3116.75      3053.00
2000         4099.50      4129.18

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 4

k = 13,579

Standard     Observed     Predicted
Frequency    Value        Value

200          408.50       400.09
261.63       525.50       523.45
300          605.75       600.29
500          992.00       1001.36
587.33       1165.75      1176.87
700          1394.00      1403.74
1000         2009.00      2010.97
1318.5       2677.25      2662.34
1500         3044.50      3037.52
2000         4084.75      4090.71

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 5

k = 15,855

Standard     Observed     Predicted
Frequency    Value        Value

200          395.50       400.06
261.63       517.25       523.40
300          592.25       600.20
500          1005.25      1001.10
587.33       1203.00      1176.28
700          1449.50      1402.74
1000         2014.50      2008.02
1318.5       2609.25      2655.49
1500         3003.75      3027.34
2000         4085.00      4065.74

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 6

k = 9,039

Standard     Observed     Predicted
Frequency    Value        Value

200          401.75       400.20
261.63       511.00       523.70
300          596.00       600.66
500          1005.50      1003.08
587.33       1193.50      1179.66
700          1434.50      1408.50
1000         2027.25      2025.09
1318.5       2636.50      2695.60
1500         3054.50      3087.43
2000         4244.00      4217.07

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 7

k = 11,292

Standard     Observed     Predicted
Frequency    Value        Value

200          407.25       400.13
261.63       518.75       523.54
300          603.00       600.42
500          1010.50      1001.20
587.33       1199.75      1177.85
700          1421.75      1405.42
1000         2019.25      2015.93
1318.5       2666.50      2673.96
1500         3051.75      3054.87
2000         4135.50      4133.88

Observed Frequency Adjustments and Values Predicted by Threshold Model for Subject 8

k = 19,635

Standard     Observed     Predicted
Frequency    Value        Value

200          407.00       400.04
261.63       524.67       523.35
300          606.00       600.14
500          1008.00      1000.65
587.33       1194.67      1175.71
700          1432.00      1401.78
1000         2027.67      2005.21
1318.5       2658.33      2649.00
1500         3033.33      3017.72
2000         4028.67      4042.38

APPENDIX B

Tone Pairs Used in Scaling Study
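The tone pairs in this appendix are written as note names with octave numbers. For relating a pair to the interval names used in Table 11, it can help to reduce each pair to a size in semitones. The sketch below does this with hypothetical helper names. The frequency conversion assumes equal temperament with A4 = 440 Hz, which is an assumption: this section does not state the tuning of the scaling stimuli, although the octave-adjustment standards of 261.63, 587.33, and 1318.5 Hz in Appendix A do match equal-tempered C4, D5, and E6.

```python
NOTE_INDEX = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
              "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


def semitone_number(name):
    """'F#4' -> semitones above C0 (the octave digit is the last character)."""
    pitch_class, octave = name[:-1], int(name[-1])
    return 12 * octave + NOTE_INDEX[pitch_class]


def interval_in_semitones(low, high):
    """Signed size of the interval from `low` up to `high`."""
    return semitone_number(high) - semitone_number(low)


def equal_tempered_hz(name, a4_hz=440.0):
    # ASSUMPTION: equal temperament referenced to A4 = 440 Hz.
    steps_from_a4 = semitone_number(name) - semitone_number("A4")
    return a4_hz * 2.0 ** (steps_from_a4 / 12.0)


print(interval_in_semitones("A5", "B6"))   # 14 (Block 1, pair 1)
print(interval_in_semitones("G4", "G6"))   # 24 (two octaves)
print(round(equal_tempered_hz("C4"), 2))   # 261.63
```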

Block 1

1. A5 B6         41. F#4 A6
2. C5 B5         42. D5 E5
3. A4 B6         43. C#5 F#5
4. F#4 A#4       44. D#6 E6
5. C#4 B6        45. F4 A#4
6. E4 F#4        46. G5 A5
7. E6 G#6        47. G4 B6
8. E4 B6         48. C6 G#6
9. C4 D5         49. G4 D5
10. C#5 A6       50. C4 F6
11. G6 B6        51. C4 F5
12. F4 G#6       52. D#5 D#6
13. D#4 G#6      53. C#4 A6
14. D6 D#6       54. F#5 D6
15. F4 B5        55. G5 B6
16. G#6 B6       56. G#4 A6
17. C4 C#4       57. G4 G6
18. A4 D#6       58. E4 D5
19. D6 G#6       59. G4 D6
20. G4 E5        60. C6 C#6
21. G5 B5        61. E5 A5
22. F#4 A#6      62. C#5 B6
23. C5 F6        63. E5 B5
24. F4 B4        64. B4 G#5
25. D6 F#6       65. C5 F#6
26. G#5 A#5      66. D#4 D6
27. G#4 E5       67. D#4 F5
28. C5 B6        68. C#5 B5
29. C#4 D#4      69. C4 D4
30. D#4 G5       70. C5 D5
31. C#4 F6       71. G5 G#5
32. C4 F#6       72. B4 B6
33. A4 F5        73. G#4 D#6
34. D#5 F6       74. C#4 G5
35. D#4 C#5      75. G5 D6
36. C6 F6        76. F#4 C6
37. B4 A#6       77. D4 C6
38. D6 A#6       78. E4 C#5
39. F5 F#6       79. D#4 F#5
40. D4 E5

Block 2

1. E5 F#4        41. F4 E5
2. F4 G#5        42. D#5 C#6
3. D5 D6         43. E4 F6
4. G5 D#6        44. D#4 E4
5. D5 A5         45. F#4 A4
6. G#4 C6        46. B4 C#5
7. G#5 B5        47. F#4 E5
8. A4 G5         48. B4 E6
9. D#4 A4        49. C5 A6
10. C#5 A5       50. G#4 C5
11. D#4 A#4      51. C5 G#6
12. C4 C#6       52. D#5 A5
13. F5 E6        53. E5 B6
14. F5 B6        54. C#5 F5
15. F4 G#4       55. E5 A6
16. C#6 G6       56. G#5 B6
17. F6 G6        57. A#5 F#6
18. C6 D#6       58. B4 D5
19. F4 F5        59. G#5 A#6
20. D5 D#5       60. D4 D5
21. D4 E6        61. B5 B6
22. D4 F5        62. F#6 A#6
23. D#5 A#5      63. D5 C#6
24. D4 A#5       64. A4 F#5
25. A4 A#6       65. D#4 G6
26. F6 G#6       66. E4 A4
27. D#4 D5       67. G#4 B6
28. F#5 G#5      68. D#5 G6
29. E4 A5        69. F4 D#6
30. F6 B6        70. G5 E6
31. C4 A5        71. A5 B5
32. C#4 E4       72. D5 B5
33. F4 E6        73. F#5 B5
34. C4 B5        74. F4 C6
35. F#4 D6       75. B4 G#6
36. A4 A#4       76. A5 G#6
37. C#5 C#6      77. D#5 G5
38. C#4 D6       78. F#4 F#6
39. D#5 D6       79. F#4 B5
40. D4 B4

Block 3

1. D#5 B6        41. C#4 E5
2. B4 F#5        42. C4 E5
3. E4 C#6        43. C4 G6
4. D4 C#6        44. D#5 F5
5. E4 F5         45. C#6 A6
6. C5 D#6        46. A#4 F5
7. A#4 D#5       47. F#4 G#4
8. A5 D#6        48. C#5 D#6
9. C5 C#6        49. F#5 D#6
10. G6 A6        50. C#6 F#6
11. G4 F6        51. B4 A5
12. C#4 G#6      52. A5 D6
13. E4 A#6       53. F4 B6
14. D5 A#5       54. G#5 G#6
15. G#4 A#6      55. G#5 F#6
16. F#4 G#6      56. F5 D6
17. C6 E6        57. F#5 A#6
18. B4 C5        58. A#4 A#6
19. G#4 F5       59. A4 A#5
20. A#4 D#6      60. A#4 G6
21. F#5 A#5      61. E4 F#5
22. E4 C6        62. G6 G#6
23. F#5 A6       63. D#4 C5
24. G#5 A5       64. A4 D#5
25. B5 A#6       65. G#4 A5
26. D4 B5        66. D#6 A6
27. C4 A#6       67. D#5 A6
28. F5 A#5       68. D5 B6
29. F4 F6        69. E4 C5
30. A4 C6        70. C4 F4
31. G#4 A#4      71. A#4 D5
32. D#5 C6       72. E5 F5
33. C#6 G#6      73. C4 G4
34. F#4 G4       74. F#4 D#5
35. F#5 G#6      75. B5 D6
36. A#4 C6       76. G4 G5
37. F#4 F6       77. F#4 C#6
38. D#4 F#6      78. D#5 F#5
39. D4 A5        79. F5 C6
40. B5 G6

Block 4

1. C5 F#5        41. D#4 D#5
2. A5 A6         42. G4 E6
3. G4 A4         43. F#5 E6
4. F5 A6         44. D6 G6
5. A4 A6         45. G4 C#6
6. C#5 F#6       46. D#4 G#4
7. D4 B6         47. C#4 A#6
8. A4 C#6        48. A4 B5
9. E6 G6         49. C#4 C5
10. F#4 D5       50. C4 A4
11. B5 E6        51. C#5 D5
12. C4 E4        52. A4 E6
13. G#5 C6       53. D4 D#4
14. D6 B6        54. E5 A#6
15. A#4 D6       55. C6 G6
16. E5 G6        56. D4 G#5
17. D6 E6        57. F4 F#5
18. A#5 C6       58. B5 C#6
19. A5 A#5       59. G#4 A#5
20. G#4 F#5      60. F4 C#5
21. F#4 G6       61. C4 C6
22. F#5 F6       62. C4 G#5
23. A#4 B4       63. G#4 A4
24. B4 B5        64. F4 A6
25. D4 G6        65. G4 C6
26. F5 G6        66. F4 C5
27. D#6 G6       67. F#4 G5
28. E4 G#5       68. D#4 0#6
29. F5 A#6       69. D6 F6
30. B4 C#6       70. D5 C6
31. C4 E6        71. A4 A5
32. C5 D#5       72. A#4 E6
33. E5 F6        73. E4 E6
34. A4 E5        74. A#4 G5
35. C4 C5        75. F4 A5
36. G4 A#4       76. D5 G6
37. F6 A#6       77. D#4 B4
38. F4 G5        78. G4 D#5
39. G4 C5
40. D5 G#6

APPENDIX C

Instructions for Scaling Study

In this experiment you are going to be asked to rate how similar two pure tones are to one another. In spite of the fact that they are pure tones, we would like you to make judgments about them using the musical dimensions you ordinarily apply to musical tones.

You will hear a pair of tones sounded successively. After the second tone of the pair is presented you will be required to rate on a scale of 1 to 7 how similar the two tones sound to you. A scoring sheet will be provided for you to make these ratings. If the tones sound very similar to you, you should rate them 1. If they sound very dissimilar you should rate them 7. Some tones will sound very similar and some very dissimilar so don't hesitate to use these extreme scores. No tone will ever be paired with itself. As shown on the scoring sheet, numbers between 1 and 7 indicate intermediate amounts of similarity or dissimilarity. You will hear many different pairs of tones, so you should try to use the full range of the scale in making your judgments.

Before each pair of tones is presented, you will hear a high-pitched warning tone to let you know a tone pair is about to be presented. After this warning tone there will be a short silence and then you will hear the two tones. After the second tone of the pair is sounded, you will have 10 seconds to mark your rating of the pair on the scoring sheet. You will be rating many pairs during the experiment; do not leave any blanks on the scoring sheet. There will be four blocks of tone pairs; after each you will have a short rest period.

Before making your judgments you should decide on a context for making your judgments. That is, you should decide what characteristics, attributes, or dimensions are most important to you. Before the experiment begins you will hear several tone pairs to familiarize you with the range of the tones involved and with the procedure. While listening to these pairs try to decide what dimensions are important to you. There are no right or wrong judgments. What is important is that you make your judgments conscientiously, accurately, and consistently throughout the experiment. Any questions?

REFERENCE NOTES

1. Hahn, J., and Jones, M.R. Invariants in Auditory Frequency Relations. Submitted for publication.

2. Jones, M.R., and Hahn, J. A prospectus for a theory of space-time expectancies: Part I. Some determinants of expectancy. Unpublished manuscript.

3. Jones, M.R., and Hahn, J. A prospectus for a theory of space-time expectancies: Part II. Expectancy and perceptual learning. Unpublished manuscript.

4. Jones, M.R., Kidd, G., and Hahn, J. Space-time expectancies in auditory pattern memory. (OSURF Technical Report No. 2) Columbus, Ohio: Ohio State University, 1978.

REFERENCES

Apel, W. Harvard Dictionary of Music. Belknap Press of Harvard University Press, Cambridge, Massachusetts, 1972.

Attneave, F., and Olson, R.K. Pitch as a Medium: A New Approach to Psychophysical Scaling. American Journal of Psychology, 1971, 84, 147-166.

Arzelies, H. Relativistic Kinematics. Oxford: Pergamon Press, 1966.

Bachem, A. Tone height and tone chroma as two different pitch qualities. Acta Psychologica, 1950, 7, 80-88.

Bachem, A. Time factors in relative and absolute pitch determination. Journal of the Acoustical Society of America, 1954, 26, 5, 751-753.

Backus, J. The Acoustical Foundations of Music. New York: W. W. Norton and Company, Inc., 1969.

Balzano, G. On the basis of similarity of musical intervals: a chronometric analysis. Journal of the Acoustical Society of America, 1977, 61, S51.

Blackwell, H.R. and Schlosberg, H. Octave generalization, pitch discrimination, and loudness thresholds in the white rat. Journal of Experimental Psychology, 1943, 33, 407-419.

Burns, E.M. Octave adjustment by non-western musicians. Journal of the Acoustical Society of America, 1974, 56, S25-26.

Caelli, T., Hoffman, W., and Lindman, H. Subjective Lorentz transformations and the perception of motion. Journal of the Optical Society of America, 1978, 68, 3, 402-411.

Deutsch, D. The psychology of music. In E.C. Carterette and M.P. Friedman (Eds.), Handbook of Perception (Vol. 10). New York: Academic Press, 1978.

Deutsch, D. Music Recognition. Psychological Review, 1969, 76, 300-307.

Dowling, W.J. The 1215-cent octave: convergence of western and non-western data on pitch scaling. Journal of the Acoustical Society of America, 1973, 53, 373.

Elfner, L. Systematic shifts in the judgment of octaves of high frequencies. Journal of the Acoustical Society of America, 1963, 36, 2, 270-276.

Jones, M.R. Time, our lost dimension: Toward a new theory of perception, attention, and memory. Psychological Review, 1976, 83, 323-355.

Jones, M.R. Only time can tell: An essay on the topology of mental space and time. In press.

Krumhansl, C.L. The psychological representation of musical pitch in a tonal context. Cognitive Psychology, 1979, 11, 346-374.

Krumhansl, C.L., and Shepard, R.N. Quantification of the hierarchy of tonal functions within a diatonic context. Journal of Experimental Psychology: Human Perception and Performance, 1979, 5, 579-594.

Kruskal, J.B. Nonmetric multidimensional scaling: A numerical method. Psychometrika, 1964, 29, 28-42.

Levelt, W.J.M., Van de Geer, J.P., and Plomp, R. Triadic comparisons of musical intervals. The British Journal of Mathematical and Statistical Psychology, 1966, 19, 163-179.

MacCallum, R.C. Recovery of structure in incomplete data by ALSCAL. Psychometrika, 1978, 44, 1, 69-74.

Marquardt, D.W. An algorithm for least-squares estimation of nonlinear parameters. Journal of the Society for Industrial and Applied Mathematics, 1963, 11, 2, 431-441.

Miller, J.P. and Carterette, E.C. Perceptual space for musical structures. Journal of the Acoustical Society of America, 1975, 58, 3, 711-720.

Ogden, R.M. The tonal manifold. Psychological Review, 1920, 27, 136-146.

Rindler, W. Essential Relativity. New York: Van Nostrand Reinhold, 1969.

Risset, J.C. Pitch control and pitch paradoxes demonstrated with computer synthesized sounds. Journal of the Acoustical Society of America, 1969, 46, 88.

Rothenberg, D. A model for pattern perception with musical applications. Part I: Pitch structures as order-preserving maps. Mathematical Systems Theory, 1978, 11, 199-234.

Rothenberg, D. A model for pattern perception with musical applications. Part II: The information content of pitch structures. Mathematical Systems Theory, 1978, 11, 353-372.

Rothenberg, D. A model for pattern perception with musical applications. Part III: The graph embedding of pitch structure. Mathematical Systems Theory, 1978, 12, 73-101.

Ruckmick, C.A. A new classification of tonal qualities. Psychological Review, 1929, 36, 172-180.

Shepard, R.N. Circularity in judgments of relative pitch. Journal of the Acoustical Society of America, 1964, 36, 12, 2346-2353.

Shepard, R.N. Representation of structure in similarity data: Problems and prospects. Psychometrika, 1974, 39, 373-421.

Schouten, J.F. The residue revisited. In Frequency Analysis and Periodicity Detection in Hearing. Plomp, R. and Smoorenburg, G.F. (Eds.). A.W. Sijthoff, Leiden, 1970.

Stevens, S.S. and Volkmann, J. The relation of pitch to frequency: A revised scale. The American Journal of Psychology, 1940, 53, 329-353.

Stevens, S.S., Volkmann, J., and Newman, E.B. A scale for the measurement of the psychological magnitude pitch. Journal of the Acoustical Society of America, 1937, 8, 185-190.

Sundberg, J.E.F. and Lindqvist, J. Musical octaves and pitch. Journal of the Acoustical Society of America, 1973, 54, 4, 922-928.

Terhardt, E. Pitch, consonance, and harmony. Journal of the Acoustical Society of America, 1974, 55, 5, 1061-1069.

Terhardt, E. Psychoacoustic evaluation of musical sounds. Perception and Psychophysics, 1978, 23, 6, 483-492.

Thompson, R.F. Foundations of Physiological Psychology. Harper and Row Publishers, New York, 1967.

Thurlow, W.R. and Erchul, W.P. Judged similarity in pitch of octave multiples. Perception and Psychophysics, 1977, 22, 2, 177-182.

Titchener, E.B. Experimental Psychology, 2, part 2, 1905, 232-248.

Ward, W.D. Subjective musical pitch. Journal of the Acoustical Society of America, 1954, 26, 3, 369-380.

Ward, W.D. Musical Perception. In J.V. Tobias (Ed.), Foundations of modern auditory theory (Vol. 1). New York: Academic Press, 1970.