Shepard, 1982
Psychological Review
VOLUME 89 NUMBER 4 JULY 1982

Geometrical Approximations to the Structure of Musical Pitch

Roger N. Shepard
Stanford University

Rectilinear scales of pitch can account for the similarity of tones close together in frequency but not for the heightened relations at special intervals, such as the octave or perfect fifth, that arise when the tones are interpreted musically. Increasingly adequate accounts of musical pitch are provided by increasingly generalized, geometrically regular helical structures: a simple helix, a double helix, and a double helix wound around a torus in four dimensions or around a higher order helical cylinder in five dimensions. A two-dimensional "melodic map" of these double-helical structures provides for optimally compact representations of musical scales and melodies. A two-dimensional "harmonic map," obtained by an affine transformation of the melodic map, provides for optimally compact representations of chords and harmonic relations; moreover, it is isomorphic to the toroidal structure that Krumhansl and Kessler (1982) show to represent the psychological relations among musical keys.

I first described the double helical representation of pitch and its toroidal extensions in 1978 (Shepard, Note 1; also see Shepard, 1978b, p. 183, 1981a, 1981c, p. 320). The present, more complete report, originally drafted before I went on sabbatical leave in 1979, has been slightly revised to take account of a related, elegant development by my colleagues Krumhansl and Kessler (1982).

The work reported here was supported by National Science Foundation Grant BNS-75-02806. It owes much to the innovative researches of Gerald Balzano and Carol Krumhansl, with whom I have been fortunate enough to share the excitement of various attempts to bridge the gap between cognitive psychology and the perception of music. Less directly, the work has been influenced by the writings of Fred Attneave and Jay Dowling. Finally, I am indebted to Shelley Hurwitz for assistance in the collection and analysis of the data and to Michael Kubovy for his many helpful suggestions on the manuscript.

Requests for reprints should be sent to Roger N. Shepard, Department of Psychology, Jordan Hall, Building 420, Stanford University, Stanford, California 94305.

Copyright 1982 by the American Psychological Association, Inc. 0033-295X/82/8904-0305$00.75

A piece of music, just as any other acoustic stimulus, can be physically described in terms of two time-varying pressure waves, one incident at each ear. This level of analysis has, however, little correspondence to the musical experience. Because the ear is responsive to frequencies up to 20 kHz or more, at a sampling rate of two pressure values per cycle per ear, the physical specification of a half-hour symphony requires well in excess of a hundred million numbers. Clearly, our response to the music is based on a much smaller set of psychological attributes abstracted from this physical stimulus. In this respect the perception of music is like the perception of other stimuli such as colors or speech sounds, where the vast number of physical values needed to specify the complete power spectrum of a stimulus is reduced to a small number of psychological values, corresponding, say, to locations on red-green, blue-yellow, and black-white dimensions for homogeneous colors (Hurvich & Jameson, 1957) or on high-low and front-back dimensions for steady-state vowels (Peterson & Barney, 1952; Shepard, 1972). But what, exactly, are the basic perceptual attributes of music?

Just as continuous signals of speech are perceptually mapped into discrete internal representations of phonemes or syllables,
continuous signals of music are perceptually mapped into discrete internal representations of tones and chords; just as each speech sound can be characterized by a small number of distinctive features, each musical tone can be characterized by a small number of perceptual dimensions of pitch, loudness, timbre, vibrato, tremolo, attack, decay, duration, and spatial location. In addition, much as the internal representations of speech sounds are organized into higher level internal representations of meaningful words, phrases, and sentences, the internal representations of musical tones and chords are organized into higher level internal representations of meaningful melodies, progressions, and cadences.

The Fundamental Roles of Pitch and Time in Music

The perceptually salient attributes of the tones making up the musically significant chords and melodies are not equally important for the determination of those higher order units. In fact, for the music of all human cultures, it is the relations specifically of pitch and time that appear to be crucial for the recognition of a familiar piece of music. Other attributes, for example, loudness, timbre, vibrato, attack, decay, and apparent spatial location, although contributing to audibility, comfort, and aesthetic quality, can be varied widely without disrupting recognition or even musical appreciation.

The reasons for the primacy of pitch and time in music are both musical and extramusical. From the extramusical standpoint there are compelling arguments, recently advanced by Kubovy (1981), that pitch and time alone are the attributes that are "indispensable" for the perceptual segregation of the auditory ensemble into discrete tones. From the musical standpoint a case can be made that the richness and power of music depends on the listener's interpretation of the tones in terms of a discrete structure that is endowed with particular group-theoretic properties (Balzano, 1980, in press, Note 2). In the case of the human auditory system, moreover, the requisite properties appear to be fully available only in the dimensions of pitch (Balzano, 1980; Krumhansl & Shepard, 1979; Shepard, 1982) and time (Jones, 1976; Pressing, in press), the dimensions within which higher order musical units such as melodies and chords are capable of structure-preserving transformations.

In this paper I confine myself to the case of pitch and to the question of how a single psychological attribute corresponding, in the case of pure sinusoidal tones, to a simple one-dimensional physical continuum of frequency affords the structural richness requisite for tonal music. One can perhaps readily conceive how structural complexity is achievable in the dimension of time, through overlapping patterns of rhythm and stress, but the structural properties of pitch seem to be manifested even in purely melodic sequences of purely sinusoidal tones (Krumhansl & Shepard, 1979). In the absence of physical overlap of upper harmonics of the sort considered by Helmholtz (1862/1954) and by Plomp & Levelt (1965), wherein does this structure reside?

Cognitive Versus Psychoacoustic Approaches to Pitch

Until recently, attempts to bring scientific methods to bear on the perception of musical stimuli have mostly adopted a psychoacoustic approach. The goal has been to determine the dependence of psychological attributes, such as pitch, loudness, and perceived duration, on physical variables of frequency, amplitude, and physical duration (Stevens, 1955; Stevens & Volkmann, 1940) or on more complex combinations of physical variables (de Boer, 1976; J. Goldstein, 1973; Plomp, 1976; Terhardt, 1974; Wightman, 1973).

By contrast, the cognitive psychological approach looks for structural relations within a set of perceived pitches independently of the correspondence that these structural relations may bear to physical variables. This approach is particularly appropriate when such structural relations reside not in the stimulus but in the perceiver, a circumstance that is well known to students of music theory, who recognize that an interval defined by a given physical difference in log frequency may be heard very differently in different musical contexts. To elaborate on an example mentioned by Risset (1978, p. 526), in a C-major context the interval B-F is heard as having a strong tendency to resolve by contraction into the smaller interval C-E, whereas in an F#-major context the physically identical interval (now called B-E#, however) is heard as having a similarly strong tendency to resolve by expansion into the larger interval A#-F#. Moreover, Krumhansl (1979) has now provided systematic, quantitative evidence that the perceived relations between the various tones within an octave do indeed depend on the context-induced musical key with respect to which those tones are interpreted. How, then, are we to represent the relations of pitch between tones as those relations are perceived by a listener who is interpreting the tones musically?

Previous Representations of Pitch

Rectilinear Representations

The simplest representations that have been proposed for pitch have been unidimensional scales based on judgments made in nonmusical contexts.

cally significant relations of pitch to emerge as invariant in these scales seems to be a direct consequence of the assiduous avoidance, by the psychoacoustic investigators, of any musical context or tonal framework within which the listeners might interpret the stimuli musically. Attneave and Olson (1971) showed that when familiar melodies, as opposed to arbitrarily selected nonmusical tones, were to be transposed in pitch, listeners required that the separations between the tones be preserved on the musically relevant scale of log frequency, not on a nonlinearly related scale such as the mel scale. That the scale underlying judgments of musical pitch must be logarithmic with frequency is in fact now supported by several kinds of empirical evidence (Dowling, 1978b; Null, 1974; Ward, 1954, 1970).

Second, because of the unidimensionality of scales of pitch such as the mel scale, perceived similarity must decrease monotonically with increasing separation between tones on the scale. There is, therefore, no provision for the possibility that tones separated by a particularly significant musical interval, such as the octave, may be per-