Contemporary Review, 1987, Vol. 2, pp. 1-61. © 1987 Harwood Academic Publishers GmbH. Photocopying permitted by license only.

Music: A science of the mind?

Stephen McAdams
Institut de Recherche et Coordination Acoustique/Musique, Paris

There is an increasing interest in psychological studies of music for the advancement of both musical and scientific thought. An historical perspective of psychological considerations of music reveals a trend leading from physical thought through theories of sensation and finally up to modern cognitive . What might truly be called the field of music psychology can be shown to exist by an overview of the domains of research covered by contemporary theorists and researchers. Within the framework of one can demonstrate the importance of mental representation of the various dimensions and structures of music and of organizational processes underlying music listening. The relation between a musical structure and the form that is "accumulated" by a listener depends to a large extent on his or her musical experience within a given culture. A difference in experience may be hypothesized to result from the nature of mental schemata acquired by listeners. A great deal of work is still needed, however, to approach more affective and aesthetic aspects of musical experience.

KEY WORDS: music psychology, internal representation, organizational processes, musical structure.

There is one real, and graded, distinction between sciences like the biologies and the physical sciences. The former are unrestricted and their investigator must be prepared to follow their problems into any other science whatsoever.
C.F.A. Pantin (1968, p. 24)

Downloaded By: [McGill University] At: 22:23 22 August 2008

Introduction

Music is a fertile ground for the development of thought in the cognitive sciences, but why would a scientist of the mind dare attempt an analysis of the nature of musical experience? Can there be any realm of collective human production and response more obscure, mysterious and, finally, so absolutely personal and individual? At the same time, can any human science ignore one of the species' most unique capacities? The answer to this last question being an obvious "No!", the question arises of how to proceed. With current knowledge drawn from the various human sciences, from diverse theories of music and from the speculations of practising musicians, we must take stock of what we know and what we can formulate as a starting point for a psychology of music.

A thorough scientific consideration of musical experience necessitates drawing from many fields including acoustics, psychology, sciences, artificial intelligence and . Although any single researcher or thinker must necessarily limit the scope of his investigations, much of what today calls itself the psychology of music runs the opposite risk of narrowness of focus and lack of appropriate musical culture. There is a need at this point in the development of cognitive explorations into music to have a plan of approach and a vision of the totality with which to judge the relevance of more specific considerations. Included in this global vision is an intimate knowledge of music itself.

The study of music has a great deal to offer an understanding of the human mind. The starting assumption of the psychologist of music is that the structure and process of music can indicate the nature of certain mental structures and processes. The study of mental structures and processes is the domain of cognitive psychology, which seeks to understand internal (or mental) representations and the things that these representations allow one to do with music. The nature of the representations is inferred from the ways people listen to, memorize, perform, create and react to music. The psychologist of music seeks the neatest and most economical manner of describing music's structure in a way that most closely resembles the psychological processes by which music is created, reproduced and understood (Sloboda, 1985, p. 11). We may suppose, then, that since music is a uniquely human product, we can logically draw conclusions about the connection between the observed structure of music and the nature of the human mind that produces it to be heard by others.
What might these conclusions be, and what do they contribute to a general understanding about human mental activity and about the possibilities of music itself?

Another of the underlying beliefs of this endeavour is that psychological investigation and theory also have something to offer the musical disciplines. There are two main constraints, however: the work must conform to the accepted procedures of scientific investigation and must, as well, succeed in being relevant to the experience of real music and not only limited to that of overly reduced pre- and proto-musical collections of , though these lower-level investigations will always be important in analysing the contribution of various psychological processes to the comprehension of music.

I will attempt to demonstrate to both and the relevance of music psychology for contemporary musical and scientific thought. Starting with an historical perspective that traces psychological considerations of music from the ancient Greeks to modern cognitive psychology, I will then describe some of the main problem areas in the realm of music psychology in order to situate the framework for a more detailed discussion of the two main problems in as applied to an understanding of musical experience. These include the mental representation of the dimensions and structures of music, and the organizational mechanisms underlying the mental processing of musical structures. The aim is to arrive at an understanding of the experience of musical form, as rich or poor as that may be depending on one's previous experience with the music of a given culture. The psychology of music clearly includes more areas of study than are treated in this article, but I have deliberately limited the scope of this volume to reception and processing of musical structure and the experience of musical form. Also, while this article is consciously biased towards the theory of psychological issues, it is balanced by the more specifically musical considerations of my musical colleagues in the other articles of this issue.

The cognitive trajectory

The history of psychological considerations of music reveals a trajectory from physical thought to sensation, and cognition (Figure 1).¹ The Greeks of the 6th Century BC attempted an integration of the laws of nature with theories of . For Pythagoras, there was an important relation of ratios of small numbers between pitches (string lengths) to musical consonance. The apprehension of these ratios was a manifestation of a "higher" sense he called "pleasure-in-proportion." This sense was to be distinguished from the more base senses involved in ordinary perception. The idea was developed by Plato, who made a psychological distinction between mere perception and an inbuilt response to proportion and the of the universe (the of the spheres). This response naturally leads one, he mused, to a state of being where the flux of the base senses is resolved in a higher state of equilibrium.

In contemporary times this classical notion of resolution as equilibrium or a state of quiescence has yielded to a more dynamic homeostatic model of perception which takes into account the necessity within an organism of fluctuations around an optimal level. At a physiological level, things must constantly be in a state of flux or the sensory systems cease to respond. The best known example of this is the fact that the eyes constantly make small random movements to keep the image of the environment moving across the retina. If this image is held in the same place, by pressing one's finger against the eyeball, for example, it slowly fades away. Similar kinds of "" or "" to constant stimuli have been shown in auditory perception.
What this may imply at a higher, more musical, level is that the tensing and releasing functions in the development of musical ideas may characterize an aesthetic response by first defining a certain "optimal" state, and then departing from and returning to this state as a way of modulating the experienced tension.

Another assumption in Pythagorean thought that is implicitly psychological is the notion that there is an identical correspondence between some specific external, or physical, property (string length) and the resulting internal, or perceptual, property (sensation of pitch). This "naive realism," as it is called in , has been systematically modified in Western thought by the development of psychological methods for investigating the relations between physical facts (the vibration frequency resulting from a string of certain length, tension and density, for example) and facts of sensation (pitch). This approach has shown that the correspondences between physical and perceptual dimensions are far from being simple and are very dependent on other psychological processes in the listeners such as .

[Figure 1: diagram not reproduced. Legible labels: Physicalists; "Natural" Intonations and Microtonal Systems; Psychoaesthetics; Music Cognition.]

Figure 1 This diagram represents a possible of the historical trajectory of music psychology (with time progressing from top to bottom) from the physically based thought of the ancient Greeks to modern cognitive science. The arrows indicate transitions to subsequent schools of thought and the double bars indicate a rupture. At the bottom are some current areas of systematic or scientific investigation of music.

Some 18th Century music theorists such as Rameau and Tartini made attempts to emphasize sensation and more affective qualities of sound to a greater extent than did their predecessors of the 17th Century, who were more mathematically and mechanistically inclined. However, certain aspects of this earlier rationalism persevered even into the 19th Century with the claim of Delezenne that "the sense of the interval must be innate and not a product of convention because 'our scale ... can be found in identical form throughout Europe'" (Spender, 1980, p. 389). To his credit, and in spite of what Spender calls his anthropological parochialism, Delezenne (1826-7) did devise rigorous experimental techniques for measuring people's abilities to tune octaves, 5ths, 3rds and 6ths and to correlate these with their degree of musical training. We find, even in the latter part of the 20th Century, a resurgence of some of these earlier notions in the concern of various composers and theorists for the relative aesthetic and emotional "value" of different tuning systems based on whether or not they contain "pure" or "irrational" interval ratios. See, for example, Lou Harrison's notion of the Surd (from the Latin for deaf), which he uses to refer to irrational intervals: "The intervals of equal temperament are surds. Our word 'absurd' carries the suggestion 'from deafness'" (1971, p. 7; see also Partch, 1974, for the development of a compositional system based on small number pitch ratios, and Makeig, 1982, for a study of affective perception of intervals).

After 1850, psychology departed from philosophy and purely theoretical considerations to establish itself in the experimental laboratory, adopting scientific method, borrowing theoretical models from the physical sciences and beginning to combine an interest in the various aspects of sensation with a number of advances in human physiology, particularly the physiology of the sense organs and of the nervous system. One might say that the field of psychophysics (of which psychoacoustics is the auditory part) was born in with the work of physicists and physiologists such as Fechner and Helmholtz. Psychophysics is a branch of which attempts to relate physical descriptions of stimuli with measurable believed to be related to sensation, such as saying that one light is brighter than another, or that one sound is higher in pitch than another. The sensationists believed that the primacy of sense implied that science should study sense organs. They conceived of the mind as a device for recording and combining sense impressions. Intelligence, then, was a product of impression, and association (Langer, 1942).
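
The contrast drawn earlier in this section between "pure" small-number ratios and the irrational intervals of equal temperament can be made concrete with a little arithmetic. The following is only a minimal sketch, not anything taken from the works cited: it assumes the usual just-intonation ratios (2:1, 3:2, 5:4, 5:3) for the octave, fifth, major third and major sixth, assumes twelve-tone equal temperament with a semitone ratio of 2^(1/12), and uses the cent (1/1200 of an octave on a logarithmic scale) as the unit of comparison. The function and dictionary names are hypothetical.

```python
import math

# Assumed just-intonation ratios (small whole numbers) for four intervals.
JUST_RATIOS = {
    "octave": (2, 1),
    "fifth": (3, 2),
    "major third": (5, 4),
    "major sixth": (5, 3),
}

# Size of the same intervals in equal-tempered semitones (assumed 12-tone system).
SEMITONES = {"octave": 12, "fifth": 7, "major third": 4, "major sixth": 9}


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


def tempered_ratio(semitones):
    """Frequency ratio of an interval spanning the given number of tempered semitones."""
    return 2 ** (semitones / 12)


def deviation_in_cents(name):
    """Tempered interval size minus just interval size, in cents."""
    numerator, denominator = JUST_RATIOS[name]
    return cents(tempered_ratio(SEMITONES[name])) - cents(numerator / denominator)


if __name__ == "__main__":
    for name in JUST_RATIOS:
        print(f"{name}: {deviation_in_cents(name):+.2f} cents")
```

Under these assumptions the tempered octave coincides with the 2:1 ratio, the tempered fifth is about 2 cents narrower than 3:2, and the tempered major third and major sixth are roughly 14 and 16 cents wider than 5:4 and 5:3. Apart from the octave, the tempered ratios are irrational numbers, which is the sense of "surd" in the passage quoted above.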

Fechner, in addition to creating the field of psychophysics and developing many of its experimental methods (1860), established the field of what is today called experimental aesthetics (1876). This was an attempt to bring objectivity into the assessment of personal preference and to search for a sensationist definition of beauty and aesthetics based on liking and disliking. It essentially represents a conception of art as the satisfaction of taste. In particular, Fechner developed a method of preference for "pleasingness" regarded as appropriate to musical aesthetics and which survives to some extent today. In spite of the fact that increasingly sophisticated methods of data analysis have made possible a clearer definition of some of the dimensions of artistic attributes, several aspects of artistic experience that are not specifically due to the structure of the art object, such as the influence of culture and personal associations, confound to a great extent any generalizations that one might make about aesthetic experience. Fechner's 16 principles of psycho-aesthetics were not entirely empirically based, and as yet no truly scientific theory of musical aesthetics has been developed, though some researchers still continue in this direction (Frances, 1984; Berlyne, 1971). According to Langer (1942), "... it seems to be an essentially barren adventure" (p. 180).

In contrast to Fechner, Helmholtz (1877/1885) investigated more generalizable psychoacoustic aspects of musical sounds rather than individual taste. From his studies of the mathematical structure of the inner ear, he developed a model of pitch perception by resonance which, though no longer accepted in its original form, still shows its influence on several modern theories of pitch.
His theory was based both on Ohm's Acoustical Law, which states that the ear performs a Fourier, or frequency, analysis on the incoming sound wave, and on Müller's Law of Specific Nerve Energies, which states that each nerve fibre leaving the inner ear reacts to but one small range of the many frequencies that stimulate the ear. This model led him to presume that each discriminable pitch stimulated one single nerve. He also noted a problem with this model that is only now beginning to be understood, which was: why then do we tend to hear the complex sound coming from a single musical sound source as integrated, fused or unanalyzed, i.e. as a perceptual whole rather than as a collection of pitches? Helmholtz also developed many notions concerning the sensory basis for the perception of consonance and dissonance, of the physical origins of differences in , and of the role of unconscious perceptual processes in the perception of melodic succession and progression. He already intuited the importance of higher level psychological processes in the creation and perception of art, not finding sufficient evidence to "reduce" aesthetic perception to the laws of sensation.

... I have endeavoured to shew that the construction of scales and of harmonic tissue is a product of artistic invention, and by no means furnished by the natural formation or natural function of our ear, as it has been hitherto most generally asserted. Of course the laws of the natural function of our ear play a great and influential part in this result; these laws are, as it were, the building stones with which the edifice of our musical system has been erected, and the necessity of accurately understanding the nature of these materials in order to understand the construction of the edifice itself, has been clearly shewn by the course of our investigations upon this very subject. But just as people of differently directed tastes can erect extremely different kinds of buildings with the same stones, so also the history of music shews us that the same properties of the human ear could serve as the foundation of very different musical systems.
Helmholtz (1877/1885, pp. 365-6)

Mach (1886) worked in the realms of both auditory and visual psychology. In addition he did important work in the perceptual analysis of temporal order and musical rhythms. His work laid the ground for later research by Fraisse (1978, 1982). Another colleague of the same era, Stumpf (1883), wrote an important work on the psychology of the qualities of musical tones. But as Susanne Langer (1942) has observed, Stumpf gave us a Tonpsychologie and not really a Musikpsychologie, a statement which summarizes well the limits of the sensationist approach to music psychology.

The mentalist school, active between the two World Wars, adhered to the belief that body and mind were separate but parallel entities. Two of the greater thinkers of this period were Wundt (1896) and Titchener (1909), who believed that conscious experience could be analyzed into atoms of mental feelings and sensations. According to this view the ear presents acoustic information to the great repository of musical talent, the mind.
This clear dichotomy is difficult to maintain in the face of more contemporary evidence that the information received by the sensory systems is progressively integrated at all levels of psychological processing along the sensory pathways. "The notion that the ear conveys a sensation to the mind which the mind in turn listens to begs the question of how the mind might 'listen', and obscures the fact that the coded pattern of brain activity during perception represents the musical experience" (Spender, 1980, p. 390; my emphasis). Three important works on the psychology of music were published during this period: Mursell (1937) concerning the , particularly , Seashore (1938) mapping out a certain number of musical phenomena such as instrumental timbre and vibrato and developing several measures of musical aptitude, and Schoen (1940) concerning the affective and aesthetic influences of music.

The main thesis of the Gestalt theorists was that the perception of form or pattern was due to an innate (rather than learned) response of the nervous system as a whole to the entire pattern of energy stimulating its receptors. There is a one-to-one (or isomorphic) relation between the pattern of stimulation and the pattern of nervous activity in the brain (Köhler, 1929). Current knowledge of neurophysiological processes makes this claim untenable, but what remains from the work of the Gestaltists are two sets of analytic tools which are relevant to the perception of the properties of objects: 1) the figure-ground phenomenon that distinguishes some "object" to which we might pay attention (the figure), in spite of the presence of a lot of other information which is relegated by perception to the background, and 2) a series of laws of perceptual organization that attempt to resolve ambiguities about what is figure and what is ground. While these tools may serve as guiding principles for experimentation, they are merely descriptions of organizational tendencies rather than explanations of the psychological mechanisms underlying perception (Hochberg, 1974). At the beginning of many musical examples were cited, but little musical experimentation was actually done and most of the work concerned visual grouping. This thought has, however, influenced a number of contemporary researchers interested in the problems of musical organization (see below).

The mentalist and Gestalt stances were fiercely attacked by the behaviorists, who adopted the empiricist principles of the association of ideas and more specifically applied them to the association between environmental stimuli and behavioral responses. Few behaviorists approached the problems of music, thus leaving a gap between the low-level experiments of psychophysics and the high-level aesthetic theories derived from Gestalt psychology or Freudian principles. The basic element of behaviorist theories of is the stimulus-response unit, which is considered to be an irreducible unit in a chain of learned associations. This principle is tacitly used in psychoacoustic discrimination and detection experiments where, rather than telling the subject which dimension of a sound should be paid attention to, reinforcement (feedback) is given in the form of a light blink that indicates when the correct response was received. The subject then does whatever is necessary to maximize the number of light blinks.
Spender (1980) remarks that the formulation of the notion of stimulus-response is confounded by a disregard for individual differences in musical education, leading in part to a present day lack of rapport between psychoacoustic and musical experimentation. A behaviorist approach to the psychology of music is found in Lundin (1967).

A more recent approach to the psychology of music that picks up some of the trends of mentalist and Gestalt theory is what might be called the cognitive approach. Cognitive science has emerged from the cross-fertilization of a number of disciplines including psychology, , neurophysiology, philosophy and . Its main concern is to characterize the mental capacities of human beings (and other organisms in some cases). Its most important goals include the attempts to understand the nature of mental representation and memory, the processes of organization of perception and thought, and the ability to and solve problems. A recent work written by a composer/theorist and a linguist (Lerdahl & Jackendoff, 1983) makes it very clear that with appropriate conceptual tools, the concerns of music theory can be expressed in such a way as to render this domain accessible to experimental psychology. In particular they borrowed many ideas from linguistics in the forming of their theory of the cognitive structure of musical forms belonging to the Western tonal idiom. The success of this approach is witnessed by the growing number of experimental studies on the claims made by the theory. As Sloboda (1986) has remarked, it is with this work that the psychology of music comes of age!

The realm of music psychology

The psychology of music as a field studies all the possible forms of musical with the accepted scientific methods of experimental psychology (see Spender, 1980; Sloboda, 1985; Dowling & Harwood, 1986). The many and varied concerns of the field include some which fall within the realm of experimental psychology such as:
— the perception of musical qualities of sound,
— the cognitive processes of organization, representation and of music,
— the acquisition and exercise of musical skills in performing and listening,
— the emotional and aesthetic responses to music,
— the processes of creation (composition and ), and
— the nature of inborn and learned musical aptitudes;
some which are slightly more physiological such as:
— the brain functions underlying musical processes, and
— the role of music in healing mental and physical illness;
and some which are more social and historical such as:
— the of collective music and the cultural contexts of musical experience, and
— the historical development of musical material and forms.

Clearly, not all of these areas have been explored in depth, despite the ancient Greek origins of psychological considerations of musical experience. The areas most extensively explored up to the present include the perception of musical qualities, organizational processes in auditory perception and the nature of musical aptitudes. How human beings make sense of the complex and highly structured acoustic message we call music has been the principal task of experiments since the beginning of the 1970s. It is from that time that the cognitive framework began to exert its influence on music psychology.

Music psychology shares some problems with other experimental fields such as and .
Music psychology and psycholinguistics attempt to explain the capacity to produce and comprehend grammatical utterances and to relate the reception of acoustic properties to understanding. Many theorists have considered the relations between language and music at levels both of the structure and syntactic relations of the products of these activities and of their notational systems. They have also considered the role of culture in the acquisition of musical and linguistic skills. The psychologist's task is to understand the mental representations and processes that make this possible (see, for example, Sloboda, 1985, chaps. 2 & 6).

Psychologists of music are equally interested in the question of which aspects of musical experience are universal or culture specific and how a given cultural experience throughout one's life limits the understanding of aspects of musical form specific to the music of other, very different cultures. Here there is an evident overlap with some problems in comparative ethnomusicology and the possibility of bridging part of the gap between psychoacoustics and ethnology. Nonetheless, a large gap still exists between what musicians recount of their experience, what psychoacousticians say about human auditory discriminatory powers, and what perceptual psychologists say about limits of human auditory organization.

Some of the areas listed at the beginning of this section, such as affective and aesthetic response to music and all of the questions that have to do with "" in music, will be approached only with great difficulty in the near future. This is primarily due to the limits in simply defining what the real problems are that need to be addressed and investigated.

Healthy research paradigm

In a remarkably clear-sighted paper entitled "Cognition and real music: The psychology of music comes of age," Sloboda (1986)² insists that it is only within the last 10 years that a true scientific discipline deserving the title "psychology of music" has come into existence. The reason is that a coherent field of investigation, what Thomas Kuhn in his seminal book The Structure of Scientific Revolutions (1972) calls normal paradigmatic science, is only now taking form. The coherence of this form, i.e. the paradigm, requires a number of criteria to be satisfied. Researchers and theorists must have (1) an agreed set of experimental problems, (2) an agreed set of methods for working on the problems, (3) agreed theoretical frameworks within which to discuss them, (4) techniques and theories which are specific to the paradigm, and (5) research programs which are appropriate to the whole range of phenomena in the domain the paradigm addresses. I will attempt in this section to elaborate upon his on these five criteria, as his is one of the clearest statements to date of what it means to have a field called "psychology of music."³

There are three major experimental problem areas that are being or need to be addressed in our cognitive considerations about musical experience. 1) What is the nature of musical knowledge and representation? 2) What are the processes involved in music production and comprehension? 3) How and why do these representations and processes figure in the aesthetic and emotional effects of music? The majority of the work to date has been in the first area, with increasing activity in the second. The lack of experimental work in the third seems primarily due to difficulties in defining the nature of aesthetic and emotional effects.

According to Sloboda, these kinds of questions exclude from the psychology of music a good number of areas of consideration including and pedagogy (that do tests on educational techniques which often make simplistic assumptions about the nature of musical experience), as well as experiments which derive data from musical situations but do not attempt to explain their musical significance, as is the case with a great deal of psychoacoustic and auditory perception work. While Sloboda is absolutely correct in placing emphasis on real musical materials and experimental tasks that resemble normal musical behavior, this is too strong a position for a couple of reasons. One is that it has become increasingly possible to mingle education and experimentation with the use of computers. This is already taking place in teaching musical skills to children (Seymour Papert at Massachusetts Institute of Technology) and to college students (Gerald Balzano at the University of California at San Diego). The second reason is related to the open-ended nature of psychology invoked in the quote at the beginning of this article. In order to explain a certain number of related to representation, processing and response to music, we cannot, without forsaking the scientific status of music psychology, refuse to descend into the lower auditory levels associated with pre-musical processing. After all, musical is first of all hearing.
One of the seminal publications that did move the field in the direction of being truly musical is a paper by Longuet-Higgins (1976) in which a cognitive model of musical took as input idealized information about a performance (pitches and times of notes) and produced a sensible with some degree of success. The cognitively interesting features of the system are that it assigned the pitches and time values within a tonal/metric system using heuristics that embody psychological hypotheses about musical organization. The analysis that resulted from this "artificial musical intelligence" system accords reasonably well with the of musically trained listeners.

In searching for an agreed set of experimental methods, early researchers in music psychology have had to summon the courage (or to have the maturity, as Sloboda says) to leave the experimental paradigms of their less musical colleagues that are often limited to judgments of "same or different" between two tones. Some of the newer methods currently being employed include
— direct recording of performances on specially prepared instruments that allow precise determination of performance parameters such as time of attack, duration, velocity of attack on a key, variation in air pressure in a wind instrument, and so on,
— methods for measurement of memory performance by reproducing musical fragments on an instrument or notating them on paper in conventional music notation,
— synthesis of sounds by computer in order to have precise control over the way a sound varies and then to be able to relate these changes to variations in reported musical experience,
— studies of as a way to explore more aesthetic and qualitative aspects of musical experience and to collect verbal protocols in order to follow to some extent the thought processes of someone in the act of creative musical problem-solving (though this is of course fraught with the danger of the interaction between analytic introspective and creative processes),
— recording of physiological correlates in response to music listening or generated in performing, and
— computer-based simulations of musical behavior as a means of formalizing our understanding of the way a human processes a given musical piece (the attempt being to get the computer to produce some abstract structure from the musical signal that corresponds with the responses gathered from musical subjects).

Sloboda emphasizes, above all, the importance of experimentation on real musical material that embodies the real complexities of music and which is performed in the context of response situations that are more closely related to normal musical activity. This constraint would separate truly musical psychology from many types of experimentation and theory that one might call pre-musical or proto-musical, where the material is not truly musical and the tasks demanded of the experimental subjects are quite different from those actually performed in the act of music listening, reading, performing, writing or improvising. Here again, I find Sloboda a bit restrictive, particularly with respect to the task of listening.
Many of the response situations listed above (reproduction and notation) can only be performed by trained musicians, and the relation between physiological recordings and psychological experience is far from beginning to be understood with any degree of clarity for even very simple sound stimuli. Several of the more classical experimental paradigms relying on comparison and recognition could still be quite relevant to the experience of untrained musical listeners.

A somewhat more difficult criterion for any new science to fulfil is the creation of a set of agreed theoretical frameworks. Among those listed by Sloboda (and slightly modified here) are included the assumptions

- that music is lawfully constructed and these laws are used by listeners to decode the "meaning" of a musical "utterance,"
- that the internal representation of music listening has a hierarchical component,
- that scales, meter, and rhythm are psychologically real organizing principles and are possibly instantiations of musical universals which may be found in similar forms (to some degree) in almost any musical culture,
- that the processes of music composition, performance and perception access the same core representations in the mind.

These theoretical assumptions should account for a number of observable aspects of musical behavior, such as the nature and position of errors in performance, the distribution of expressive parameters in performance, the differential memorability of musical passages, the differential salience in the perception of musical events, the judgments of well-formedness by listeners, cultural differences in music perception, the process of enculturation and learning (how people acquire competence in musical language⁴), and, eventually, the structural and cognitive determinants of musical meaning and .

The next criterion insists that the techniques and theories of the previous two criteria be specific to the paradigm. Sloboda states that there is a strong tendency among researchers to study music because it is a good example of something else, such as a complex motor skill, a language-like phenomenon, a complex auditory phenomenon or a set theoretic entity. It is or can be, of course, all these things, and often the financial needs of music researchers require that they disguise their work as these other things in order to convince national funding bodies of the "scientific validity" of the proposed studies. The courageous music psychologist would, however, try to understand where music is wholly unlike anything else, all the while keeping an eye on relations with other fields, since no human mental activity is entirely divorced from the rest. Some of the unique aspects of music include the fact that it creates a non-referential (or perhaps, self-referential) world, practically making it unlike language and several other arts; that it does so through, for example, psychological dimensions which are essentially unique to music in the way they are used, such as pitch hierarchies and meter; and that despite its lack of specific reference, it can have deep emotional significance.
The last criterion states that the paradigm must generate research that is appropriate to the whole domain. It is easy to show that, even in its infancy, music psychology demonstrates a wide range of musical activities. Most research to date consists of work on perception, but one is starting to see work on performance, memory, analysis, composition and improvisation. There is a great effort on the part of an increasing number of researchers to use real music (at least fragments, if not small pieces) rather than impoverished music-like material. What makes this work difficult is using the "real thing" in order to move towards a greater "ecological validity," or simply musical relevance in the research, without giving up rigorous scientific control. This often means supplementing musical material with specifically constructed material that varies along pre-determined dimensions. There is also an increasing use of subjects and behaviors that are close analogues of normal musical activities such as memorizing for , sight reading, extended listening, transcribing and improvising. Most importantly, this trend toward musical relevance stimulates the interest of musicians to participate in the larger research project. A side effect of this new interest is the appearance of courses on psychoacoustics and music psychology in music curricula and of bona fide psychologists in music departments and conservatories.

We would like to use these criteria in the exploration of composition, performance and listening. Let us now consider the products of musical activity and unfold from these the central concerns and the major substance of the field of music psychology.

The products of musical activity

There are essentially three products of musical activity that need to be investigated and explained from a psychological standpoint: 1) a notation or text resulting from composition and from which the generation of sound can proceed by a third party, 2) a series of motor patterns (or patterns of muscle activity) generated by a performer in order to control an instrument or voice, and 3) the perception and accumulation of musical images and musical form in memory and in the conscious mind.

Notation or text includes, of course, the many kinds of musical scores with their varying degrees of precision of notation. An important feature of notational schemes is that they embody a large number of cultural conventions and assumptions about how the notation is to be interpreted. Thus, there are aspects of the performance that are not made explicit in the text but are implied in the cultural practice and usually transmitted from teacher to student either orally or by demonstration. In this, a certain degree of freedom is allowed for modifying the precise values expressed in the text. This modification is, in fact, expected of the performer: a musical text played exactly as written is often considered to be mechanical and musically uninteresting. One expects certain deviations in tempo, timing and dynamics, as well as micro-inflections of pitch and timbre, to bring into relief the phrasing of a piece, which serves to enhance one's appreciation of its form. The problem of the psychologist is to understand how three different things contribute to the final sonic result: 1) the notated structure of the piece, 2) the knowledge structures acquired by the performing musicians relating to this piece, to the work of its composer, to the musical idiom of which it is an example, and to the culture beyond the idiom, and 3) the activity of other musicians.
There is a relatively new kind of notation which is a text describing a procedure for the generation of sound sequences by a computer. This text can include everything from a list of values to be plugged into a synthesizer at given times to a whole computer program that has to make various decisions on exactly what and how to synthesize depending on what else is going on in the musical environment at a given moment. The case of the list of values fed into a synthesizer is one where there is no further musical interpretation of the notation. The commands are explicit and are executed as such, i.e. there is an absolute invariability in the realization of the score. One might say, then, that the notation is complete with respect to its realization. The problem posed for the composer in working in such a medium is that the realm of expressive interpretation from a musical text that is usually left to the performer must be made absolutely explicit in the values fed into the computer synthesizer. They must, for example, include the small variations in tempo, timing and dynamics that are necessary for musical phrasing. The role of the composer here is one of both creator and interpreter.

In the case of an interpretive program, two types of text are usually presented. One is a score in the more classic sense of a text to be interpreted musically and synchronized with other players. The other is a text that describes the nature and behavior of the "synthetic performer" and its musical decision-making (interpretive) process. Attempts to formalize this understanding (or at least intuitions about it) have resulted in computer programs that interpret musical texts (Sundberg, Askenfelt & Frydén, 1983), accompany other musicians (Bloch & Dannenberg, 1985; Vercoe, 1985; Chabot, Dannenberg & Bloch, 1986), or improvise in interaction with human or other "synthetic" musicians (unpublished work by George Lewis, STEIM, Amsterdam and David Wessel, IRCAM, Paris). In these cases one is obliged to model the various behaviors of a performer including perceiving sound, abstracting musical structure, making interpretive decisions and producing commands that generate sound. One quickly realizes the simplicity of one's assumptions when one hears the relative lack of musicality in the early versions of these programs. In refining the programs, one realizes the enormous number of rules necessary first to extract an abstract structure from a stream of physical signals, and then to realize an invented abstract structure in sound in a musically acceptable way.
One of the main psychological interests in the production of motor patterns in performance is the relation between the temporal sequencing of muscle activity and the mental representation of the music being performed. In other words, how is a musical structure that is read from a page or stored in memory unfolded in time as an organized series of commands to the appropriate muscles? There are interesting features of this with respect to the learning of performance skills. For example, it seems that as one passes from the stage of novice to expert a greater and greater "vocabulary" of motor sequences becomes automatic and, as a result, more easily commanded. One might imagine a little muscle pattern program in the brain that is triggered by a single command which then chains together all of the necessary movements to realize it, whereas the novice is obliged to execute each movement with a separate command. In fact, the reason for the long hours of practising scales and exercises is to encode more and more efficiently and permanently these various kinds of patterns such that they become a more or less automatic sequence and no longer require the intervention of the conscious mind, which slows things down and can only deal with a certain amount of information at a time. The fact that the sequence itself becomes automatic does not necessarily imply, however, that it cannot be modified by higher level processes to make it cohere expressively with its surrounding musical context, such as creating tempo fluctuations. What is implied is that such expressive modifications are more efficiently realized by perhaps simply modulating the rate at which the sequence unfolds in time rather than recomputing the temporal relations of all the constituent movements necessary to play a fast run, for example.

The role of auditory and proprioceptive feedback during the execution of a motor pattern also interests psychologists. That is, how do the sound and feel of what one is producing at a given point affect what is produced shortly thereafter? This is interesting from the standpoint of the interaction between the mental representations of the motor commands, the incoming auditory information, and the sensory information coming from the parts of the body being used to produce the sound, such as the fingers, mouth, tongue, diaphragm and foot in the case of a musician playing a clarinet and tapping her foot to keep time.

Another problem of interest that is related to the production of sound is the translation of notation through the visual system into a mental representation of what is to be produced and what should be heard as an end result. One can easily imagine the development of automatisms through the repeated experience of correlating the visual representation and the imagined auditory result with the actual sound produced by the reading.

The third product of musical activity, and the one that most concerns the remainder of this article specifically and this volume in general, is musical perception and the accumulation of musical form in memory. In cognitive psychology, we talk about the internal representations of perceptual objects and events, of concepts, and so on.
It seems fairly clear that what is "in the head" when we perceive and remember is not a complete and identical copy of something that is or was in the world, but rather a kind of abstract description of it. To illustrate this, imagine a small girl singing a melody she heard her mother sing previously. Most likely, in order to learn the melody she has to have heard it sung several times by her mother, and no two of these times would be identical. The mother might start on different pitches from one day to the next, the sound of her voice might be clear one day and then froggy another if she has a cold, she might sing it slower one time and faster another, and so on. And the mother is entirely capable in her own right of deciding whether or not the girl has correctly reproduced the melody even though she has never heard her sing it before. The fact that this is possible suggests that the way at least this type of melody is represented is somehow independent of the exact starting pitch, tempo or timbre of the voice. The representation most probably contains aspects of pitch interval relations, pitch contour, scale, etc. One can imagine that other dimensions such as rhythm, harmony and timbral relations would have their representations as well. The representation of a melody must also have the property that one can perform a certain number of transformations on it and have listeners still recognize these transformations as being related to the original. Such transformations might include transposition to another scale step in the same key (thereby changing some of the intervals by a half step), or various kinds of ornamentation and elaboration of the melody's theme. Without this capacity, all of what we call musical development would not be possible.

What actually gets represented internally depends to a certain extent on the experience we have acquired up to the present as well as on a whole gamut of processes that organize the incoming sensory information into comprehensible entities such as sound sources and musical forms. The framework of cognitive psychology rests on an important distinction between mental representation and the processes of mental computation that are responsible for organization and comprehension. These two concepts,⁵ while being formally distinct, are entirely complementary. Neither could exist without the other and both are necessary for comprehension (Sperber & Wilson, 1986). The following quote reveals this for the realm of visual cognition:

... vision is the process of discovering from images what is present in the world, and where it is. Vision is therefore, first and foremost, an information-processing task, but we cannot think of it just as a process. For if we are capable of knowing what is where in the world, our brains must somehow be capable of representing this information, in all its profusion of color and form, beauty, motion and detail. The study of vision must therefore include not only the study of how to extract from images the various aspects of the world that are useful to us, but also an inquiry into the nature of the internal representations by which we capture this information and thus make it available as a basis for decisions about our thoughts and actions. This duality, the representation and the processing of information, lies at the heart of most information-processing tasks and will profoundly shape our investigation of the particular problems posed by vision. (Marr, 1982, p. 3)

The remainder of this article will consider in more detail the nature of the internal representations and organizational processes involved in music listening.
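
The idea that a melody's representation is independent of starting pitch and tempo, yet still sensitive to a shift of scale step, can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a model proposed in the text: the melody, its encoding as MIDI note numbers and durations in seconds, and the function names `intervals`, `contour` and `duration_ratios` are all hypothetical.

```python
def intervals(pitches):
    """Signed semitone steps between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def contour(pitches):
    """Direction of each step: +1 up, -1 down, 0 for a repeated note."""
    return [(step > 0) - (step < 0) for step in intervals(pitches)]


def duration_ratios(durations):
    """Durations relative to the first note, so overall tempo drops out."""
    return [d / durations[0] for d in durations]


# Hypothetical four-note melody (C4 D4 E4 C4) as MIDI note numbers,
# with note durations in seconds.
original_pitches = [60, 62, 64, 60]
original_durations = [0.5, 0.5, 0.5, 1.0]

# The same tune started on a different pitch and sung twice as fast.
transposed_pitches = [p + 5 for p in original_pitches]
faster_durations = [d / 2 for d in original_durations]

# The tune moved up one scale step within C major (D4 E4 F4 D4):
# some intervals change by a half step, but the contour survives.
shifted_pitches = [62, 64, 65, 62]

same_intervals = intervals(original_pitches) == intervals(transposed_pitches)
same_rhythm = duration_ratios(original_durations) == duration_ratios(faster_durations)
shift_keeps_intervals = intervals(original_pitches) == intervals(shifted_pitches)
shift_keeps_contour = contour(original_pitches) == contour(shifted_pitches)
```

In this sketch an exact transposition and a tempo change leave the interval and duration-ratio descriptions untouched, whereas the shift of scale step preserves only the contour. That is one simple way to see why several levels of description (interval, contour, scale) may each be needed to explain recognition under transformation.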

Internal representation of musical dimensions and structures⁶

A primary presumption in music psychology is that there are underlying psychological principles that channel musical cognition. The psychologist is interested in the nature of these principles and how they are acquired, if they are not innate. There are many kinds of musical activity where cognitive skills are in evidence: hearing, performing, remembering, imagining, reading and creating. Some of the key features of the skills behind these activities are worth noting. They have access to a vast array of mental structures (such as knowledge of the names and relations in a pitch system, their notation, and ways to reproduce them on various instruments, and knowledge of the individual differences in tone quality obtained when the same pitch is played on several instruments, etc.). These structures must be easily and quickly deployed in reading, listening and performing. In this section I will first address some general considerations on the nature of mental representations and memory processes⁷ and then briefly survey current understanding of these for pitch, rhythm, timbre and musical structure.

One of the major contentions of the cognitive approach to psychology is that all mental activity is mediated by internal (or mental) representations. What the actual forms and nature of these representations are is the major axis of cognitive research and there is anything but a consensus, even about whether they exist or not!⁸ Let us take, however, the point of view of Pylyshyn (1985) that we must refer to "representations" in order to express certain cognitive generalizations that are, for our present purpose, useful in explaining people's capacities in music listening. As he notes, "In science, the process of seeking to understand is called explanation" (p. 1).

The importance of the notion of representation as an explanatory tool for symbolic function in human thought is beginning to be understood by philosophers and scientists alike. According to Langer (1942), we use symbols to attain and organize belief, and intelligence is related to the power of using symbols for conception and expression. She states that symbolization is a basic, characteristically human need, that it is an act essential to thought and prior to it, and indeed, that it is the essential act of mind.

[Symbolization] is the starting point of all intellection in the human sense, and is more general than thinking, fancying, or taking action.... The current of experience that passes through [the brain] undergoes a change of character, not through the agency of the sense by which the perception entered, but by virtue of a primary use which is made of it immediately: it is sucked into the stream of symbols which constitutes a human mind. Langer (1942, p. 46)

As an element of explanation, a representation may be defined as "a formal system for making explicit certain entities or types of information, together with a specification of how the system does this" (Marr, 1982, p. 20). An example of this in the external, or public, domain is the representation of musical events and pieces of music by a symbol system with conventions for notating temporal ordering, pitch, dynamics, specific instruments and various sound production techniques specific to a given instrument. This is a formal system because it is a set of symbols with rules for putting them together, where the symbols are used to stand for, or represent, things. Any degree of complexity can be, in theory, attained with a symbol system since a symbol can also stand for a lawful combination of other symbols. Internal representations, on the other hand, are physical causes of behavior in the form of physical codes or symbols, i.e. patterns of brain activity. Acts are governed by representations which are symbols of various kinds. Certain of these representations correspond, for example, to the perception or imagination of a major chord, or to the connecting of a sequence of clarinet notes into a melody, or to the expectation of a tonic chord to resolve the tension state evoked by a dominant seventh chord. "To be in a certain representational state is to have a certain symbolic expression in some part of memory" (Pylyshyn, 1985, p. 29).

One of the tasks of the psychologist is to determine the forms these representations take. The form is important in that different forms of representation, even though they carry the same meaning, may have different properties with respect to what the processes of transformation can do with them. Certain aspects of reality are more apparent in some representations than in others since representation makes certain information explicit at the expense of other information. As a physical analogy, consider for a moment the various kinds of visual representation of acoustic information such as the oscilloscope, the Fourier transform and the sonogram (Figure 2). There are different aspects of the perception of sounds represented in these ways that become apparent with each one. The moment at which a sound event starts and stops is more easily seen in an oscilloscope trace and a sonogram than in a classic long-term Fourier transform, since the two former have the time domain more readily apparent in their representation and the Fourier transform computes the frequency content of a sound over a long time period, in a sense collapsing the analysis over this period of time. The time domain is implicit in the Fourier spectrum but not visually explicit.
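
The contrast between a long-term Fourier transform and a time-localized analysis can be illustrated with a short sketch. It is a toy example under stated assumptions rather than anything taken from the article: the sample rate, event length, the two test tones and the helper names (`tone`, `dft_magnitudes`, `strongest_frequencies`, `windowed_peaks`) are all hypothetical, and a naive discrete Fourier transform is used so that only the Python standard library is needed.

```python
import cmath
import math

SAMPLE_RATE = 640  # samples per second (hypothetical value)
EVENT_LEN = 64     # samples per sound event (hypothetical value)


def tone(freq_hz, n_samples):
    """A sine tone of n_samples samples starting at phase zero."""
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
            for t in range(n_samples)]


def dft_magnitudes(samples):
    """Naive discrete Fourier transform; magnitudes for bins 0..N/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def strongest_frequencies(samples, count):
    """Frequencies in Hz of the `count` largest spectral bins, ascending."""
    mags = dft_magnitudes(samples)
    bins = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:count]
    return sorted(k * SAMPLE_RATE / len(samples) for k in bins)


def windowed_peaks(samples):
    """Strongest frequency in each successive EVENT_LEN-sample window."""
    return [
        strongest_frequencies(samples[start:start + EVENT_LEN], 1)[0]
        for start in range(0, len(samples), EVENT_LEN)
    ]


# Two signals containing the same two events in opposite temporal order.
low_then_high = tone(50, EVENT_LEN) + tone(100, EVENT_LEN)
high_then_low = tone(100, EVENT_LEN) + tone(50, EVENT_LEN)

# One transform over the whole signal: the same two peaks either way.
long_term_a = strongest_frequencies(low_then_high, 2)
long_term_b = strongest_frequencies(high_then_low, 2)

# One transform per window: the order of events becomes explicit.
windows_a = windowed_peaks(low_then_high)
windows_b = windowed_peaks(high_then_low)
```

Under these assumptions the two long-term magnitude spectra have the same strongest peaks, although the events occur in opposite order; the ordering information is carried by the phases, which a magnitude display does not show. The per-window analysis, which is essentially what a sonogram displays, makes the order of events explicit.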
As such, we have to explicitly take separate Fourier transforms over different time windows if we wish to contrast the spectral characteristics at one moment with those at another. On the other hand, certain aspects of timbre related to the form of resonance structure of a sound source are more easily "read" from the Fourier spectrum, to a great extent from the sonogram and not at all on the oscilloscope trace. Again, the frequency resonance characteristics of the sound source are present in the oscilloscope's time trace, but are not readily accessible due to the form of the representation.

Psychoacousticians have the problem of finding the most appropriate and relevant acoustic representation for their stimuli, one that allows them to draw conclusions about the relation between the measured responses of experimental subjects and the perceptual representation that results from the stimuli. The psychoacoustician measures both the acoustic signal and the response of the experimental subject, compares them and then tries to infer what is happening inside the subject's head. For the composer working with computer sound synthesis, it would therefore be necessary to have a familiarity with several ways of representing sounds in order to focus on the aspects that are relevant to the compositional decisions at hand for a given sound context.

[Figure 2 panels: OSCILLOSCOPE TRACE (horizontal axis: time; events 1 and 2 marked), FOURIER TRANSFORM (two panels; horizontal axis: freq), SONOGRAM (horizontal axis: time)]

Figure 2 Three different visual representations (based on acoustic analyses) of a musical signal. The first shows the equivalent of an oscilloscope trace representing time on the horizontal axis and amplitude (or pressure) on the vertical axis. The second shows 2 Fourier transforms representing frequency on the horizontal axis and amplitude on the vertical axis, one each taken from the middle portion of sound events 1 and 2 as indicated in the diagram above. The third shows a sonogram representing time on the horizontal axis, frequency on the vertical axis and amplitude by the degree of darkening.

Before returning to mental representations, let us diverge for a moment to consider the two major kinds of memory that are relevant to our concern with musical memory: short-term memory (STM) and long-term memory (LTM). STM is what we use for temporarily storing incoming information or recently activated information from long-term storage in order to make calculations on it or make comparisons with succeeding information or with other elements in long-term memory. What is stored in STM is already an interpretation of what has occurred in the world and is not a complete image of the incoming sensory information. This memory is also of a very limited capacity: only about 5-9 items can be stored at any one time (Miller, 1956). The storage in STM is also of limited duration. If no effort is made to refresh the information, by repeating or rehearsing it (as one does when trying to remember a telephone number, by saying it to oneself over and over again), the information very quickly "fades" after about 6-9 seconds or is soon replaced by new information. By contrast, LTM is more like permanent information storage and has no known upper bound on the quantity that can be stored. Information in LTM is highly structured. It takes a great deal of time and effort to store new material in LTM, and then, it is often not so easily recalled, though it might be recognized immediately.

In a loose way we may consider representations to include the structures assumed by of things we have experienced, the conclusions and assumptions we have arrived at that stand for beliefs or models we have about how the world works, and so on. These structures and models may be conceived of as schemata that combine symbols for concepts and relations between these concepts. All of these are operated upon by processes or rules whose goal it is to build up models of what is transpiring in the world in order best to predict what will happen next and to interpret and comprehend the meaning of what is happening (Neisser, 1976; Sowa, 1984).
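
The capacity and duration limits of STM described above can be caricatured in a few lines of code. This is a deliberately crude sketch, not a psychological model: the class name `ShortTermStore`, its methods, the hard cutoff used for "fading" and the default values (7 items, 8 seconds, chosen from within the ranges cited above) are all assumptions made for illustration.

```python
from collections import OrderedDict


class ShortTermStore:
    """Toy store with a limited capacity and a limited retention time."""

    def __init__(self, capacity=7, decay_seconds=8.0):
        self.capacity = capacity            # assumed value within the 5-9 range
        self.decay_seconds = decay_seconds  # assumed value within 6-9 seconds
        self._items = OrderedDict()         # item -> time it was last refreshed

    def _forget_faded(self, now):
        """Drop every item that has gone unrefreshed for too long."""
        faded = [item for item, t in self._items.items()
                 if now - t > self.decay_seconds]
        for item in faded:
            del self._items[item]

    def store(self, item, now):
        """Add an item; the least recently refreshed item is displaced if full."""
        self._forget_faded(now)
        self._items.pop(item, None)
        self._items[item] = now
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)

    def rehearse(self, item, now):
        """Refresh an item that is still present, as in silent repetition."""
        self._forget_faded(now)
        if item in self._items:
            self._items[item] = now
            self._items.move_to_end(item)

    def contents(self, now):
        """Items still available at time `now`, least recently refreshed first."""
        self._forget_faded(now)
        return list(self._items)


# A small store so that displacement is easy to see (times in seconds).
stm = ShortTermStore(capacity=3, decay_seconds=8.0)
for t, item in enumerate(["A", "B", "C", "D"]):
    stm.store(item, now=t)
after_overflow = stm.contents(now=3)   # "A" has been displaced by new items

stm.rehearse("B", now=8)
after_delay = stm.contents(now=12)     # only the rehearsed item survives
```

The sketch treats items as unstructured tokens. As the surrounding discussion makes clear, what counts as an "item" for a listener depends on the schemata available for structuring the material, which is exactly what such a toy store leaves out.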
Both schemata as representations of knowledge and the constructive and interpretive processes that operate upon this knowledge have some interesting formal properties that make them reasonable candidates for the way the human mind works in the perception of complex acoustic structures like music. In this view, memory may be represented as a kind of storage that is itself a model of the evolving physical world. The knowledge that has been acquired from the world is represented by the state of the model at any given moment. This, of course, implies that the storage is structured in relevant ways with respect to what is important in the world.

The perception process depends a great deal on different levels of storage and representation of incoming information. Sound impinging on the ear is encoded by the inner ear into a of neural information to the brain. Sensory processes in the brain that detect various features in this neural information prepare a description of the characteristics of events in the environment that stay in what is called auditory storage (or echoic memory) for a very short period of time. This sensory representation is easily disturbed by the arrival of new information (a process called masking).

From this representation, organizational and pattern recognition processes use rules that are based on the current context for combining the features. These rules are guided by attention, which operates from previously acquired schemata to assemble the auditory image, which is then stored in short-term memory for comparison with previously heard sounds and subsequently arriving sounds. Items in this short-term storage can come from the senses or can be recalled from long-term memory. The information in short-term memory is in a special active state that allows it to be used. An important feature of this active state is that only a limited amount of information can be kept active. However, the amount of information that can be kept active by a given person depends on the experience that person has had with the sound structures in question (cf. Anderson, 1985, chaps. 3, 4, 6). Someone who knows a large number of Haydn symphonies in great detail will be able to more easily store passages from another Haydn symphony since he would have schemata that allow him to structure the information in efficient ways for storage. These schemata would not, however, help him store passages from music by Babbitt or Xenakis, the organizational principles being very different. Conflicting schemata that may each partially match the incoming information and the musical form that is gradually accumulating in memory may generate illusions or ambiguities. This is, of course, one of the most commonly used "cognitive" tools of composers who are interested in structural richness and ambiguity.

Multiple levels of perception help deal with novelty and help process complex structures. In cases of complexity, a more bottom-up approach may be used wherein the total object is constructed from more familiar lower-level percepts. When more familiar objects or structures are encountered, though, large-scale schemata are activated where possible so the most global percept is the first to occur.
This would be the case with face recognition (i.e. we immediately recognize a face as a whole without having to put together two eyes, a nose and so on) and recognition of familiar pieces of music based on a few notes. The part immediately evokes the whole, in a certain sense.

This raises the question of what is retained by a listener and of how the degree of complexity of retention depends on musical experience and training. There are a couple of important differences between novice listeners and those with much experience in a particular musical idiom (Sloboda, 1985, chap. 1). One is in the number and complexity of structural features in terms of which the listener is capable of representing music. Expert listeners can notice, remember and often reproduce on an instrument much finer levels of expressive subtlety in the structure of musical phrases, and much more global levels of musical form that escape naive listeners. Secondly, expert listeners have a greater of the structures they are using, implying that they have a more extensive vocabulary of description which enhances memory capacity, or at least the access to remembered things.

The capacity of memory structures in music listening is of paramount importance since musical structures are extended in time. The perception of movement, of transformation and of musical significance depends on the perceived element being heard in relation to remembered elements. We might say that perception really only becomes musical when it is "in relation to" events, sequences, progressions and structuring in memory. The form of a piece of music is what gets accumulated in memory, and thus the richness of that form depends very heavily on one's capacities and experience as a listener.

This model of the perceptual process as a building up of representational schemata has some important assumptions embedded within it. One is that percepts are pre-fabricated building blocks that are derived from experience. Another is that a schema is a pattern for assembling perceptual units or other schemata into larger structures or unitary wholes. And, finally, these schemata can operate on various levels to discern structures in the sensory information either at a level of expressive variation or at a level of global form. All of these assumptions seem to have their psychological reality with respect to the way we represent and appreciate musical forms.

Evidence that visual images can be transformed or examined in supports the notion that they are derived from models or regenerated from some kind of representation like schemata (cf. Kosslyn, 1980; Shepard & Cooper, 1982; Pinker, 1984). Neisser (1976) has proposed that images can serve as anticipations (or as perceptual readinesses) for what we expect to find in the surrounding environment at a given moment in a given context. This is supported by evidence that familiar forms are matched by ready-made percepts (previously assembled structures stored in long-term memory that can be recognized very quickly, and do not need to be reconstructed from low-level percepts), whereas unfamiliar forms are reconstructed from percepts for their parts.
There is most likely a parallel processing going on here, where, as we build up a musical form in LTM from assembled fragments in STM, we also look up these fragments in LTM. This comparison affects how the new fragment gets placed in the whole form.

A key process in music listening is the way we select certain things from the mass of information that comes to us as a musical surface. As we listen, our attention fluctuates through the many paths of this surface. What we remember, what gets stored away for future comparison with later moments on the surface, depends to a great extent on what we have attended to in listening. There are two basic kinds of attention. Automatic attention seizes upon striking or salient features of the incoming information. This kind of attending is involuntary and little influenced by the conscious processes of the listener. Willful attention, by contrast, seeks cognitively important features of the information flow which are determined by what we expect to hear or want to hear. This kind of attention is involved in following the line of a particular instrument in a complex musical texture. In this case attention is guided by knowledge structures that have been developed through experience. These important structures of

memory and comprehension are what we have called mental schemata (Bartlett, 1932; Neisser, 1976; Sowa, 1984).

The ultimate result of our ability to perceive, remember, conceptualize, and act on musical information is the formation of internal frameworks, or schemata, for representing and reproducing more complex musical knowledge. Schemata ... are flexible. They certainly change as we get older and as our familiarity with musical tradition grows. (Dowling & Harwood, 1986, p. 4)

A general feature of schemata as mental representations of knowledge is that they embody both general and specific knowledge structures. They incorporate general knowledge of musical stimulus properties common to many pieces of a given culture, such as tonal scales and , or specific knowledge of relations among tones within a given melody such as its contour, for example. Since music unfolds in time and musical schemata represent various kinds of progression and development through time, they also serve to generate expectations by anticipation. Both the general and specific levels of schemata are more general than the actual sound one hears in the act of listening, and so the generated expectations of what is to follow would very rarely be so specific or dominant as to lead us into perceptual errors (Dowling & Harwood, 1986, p. 125). They tend more to be indicators of the most likely unfolding of events.

This is a general cognitive capacity that is not, of course, limited to music listening. Schemata are just as important for sleepily going through the motions of preparing one's morning tea as they are for finding one's way to work whilst absorbed in thought or to the toilet at night in an unlit apartment. They can also be useful in deciphering very complex structures since lower-level schemata can extract structures that are embedded in a larger-scale musical form. It is with schemata that we pay attention to things in the world.
Perception takes place when appropriate schemata are actively and continuously tuned to the temporally extended information that specifies an individual

event. Irrelevant events present information too, but remain unperceived simply because no such active tuning occurs with respect to it. (Bahrick, Walker & Neisser, 1981, cited in Dowling & Harwood, 1986, p. 127)

This notion is important in explaining the inherent selectivity of perception, i.e. we do not notice everything that is going on around us; we tend to select out of the environment things that have some meaning to us individually. It is also important in explaining the difference in effort required when listening to forms of music that are relatively new to us compared with what we normally listen to, or when listening to those that are very familiar but also very long.

Musical schemata are not at all literal copies of sound events. They are more abstract entities representing meaningful aspects of sound structures. This aspect of schemata is apparent with respect to language when people are asked to recall as precisely as possible a story they have

been read. None of them reproduces a verbatim account of the story. They elaborate upon the structure and the main elements of their understanding of the story; that is, they construct a linguistic utterance that recommunicates the meaning. What is meaningful will, of course, be different for each person, as the varied witnesses' accounts of the scene of a crime reveal.

With this notion of the image as a reconstruction from conceptual structures or schemata, it is easy to describe the relation between perceived and imagined or recalled images. Internal images have the same nature as (though they are not identical to) sensory images, which allows the brain to analyze and reinterpret an internal image using the same perceptual mechanisms used for sensory input. If we then consider the temporal nature of auditory perception, these anticipatory schemata must have some kind of temporal ordering in their structure which constrains the construction of auditory images from them, or the recognition of auditory patterns by them. The important implication here is that these auditory images require a structural coherence. Since the schemata we are presuming to underlie them are ordered structures, perceptual grouping processes define the constraints on these structured relations. It seems entirely reasonable that the structure of these schemata (at least at the level of sound events though not necessarily at the level even of a musical phrase) would be required to reflect the reasonable behavior of physical sound sources. (See Shepard, 1984 for a fascinating discussion of the embodiment through of physical constraints on internal representations of visual forms).

Let us examine the nature of memory for the different features of musical material in more detail. As one might expect, most of the research in this domain has focussed on pitch and, to a lesser extent, rhythm. Very little work has yet approached the problems of remembering or processing timbre structures.

Pitch

There are several features of pitch structures that are likely candidates for memorization. These include, at the lowest level of structure, pitch height or value, chroma (or pitch class) and interval. At a higher level we have contour, scale, key and harmonic progression. These more basic elements contribute, along with the elements for rhythm and timbre, to the storing of more complicated structures such as phrases, sections and entire pieces.

For harmonic sounds the mapping of frequency to pitch may be considered as a logarithmic psychophysical function. Constant ratio relations in frequency give constant interval relations in pitch, at least within the range of musical pitch. The mental representation of pitch is in terms of scales with discrete steps. This discretization of the pitch continuum into conventional scale degrees is found in all cultures of the

world that use pitch in their music. The steps in the scale function as perceptual categories that facilitate hearing, remembering and reproducing a musical message. The dissolution of the relation of pitch classes to a scale is a relatively recent development in contemporary Western music which can pose a certain number of problems of memorability.

There are a number of constraints on scales that one finds and which seem to have important psychological consequences. The steps of a scale are easily discriminable. There is a musical equivalence of two pitches related by an octave. There are a moderate number of scale degrees within the octave. The aspects of discrete, distinguishable perceptual categories and a small number of elements to be remembered are probably very important for any perceptual dimension that is to carry some aspect of form (McAdams & Saariaho, 1985). The fact of octave equivalence enhances the usable range of pitches since a pattern of intervals within an octave is considered perceptually equivalent with respect to pitch structure in any other octave.

The feature of octave equivalence has another perceptual consequence. It contributes to the multidimensional subjective experience of pitch. In other words, pitch is not merely organized along a single dimension of higher-lower (called pitch height), but has, in addition, another dimension which is circular (called chroma). This subjective circularity is evident in the verbal labelling schemes used in many cultures to identify the pitches of a scale, e.g. do, re, mi, fa, sol, la, ti, do in Western solfege, or sa, re, ga, ma, pa, dha, ni, sa in North Indian music, where the names repeat in each octave. Thus, chroma is a subjective attribute of pitch that is independent of the octave in which it is found.
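The logarithmic mapping and the height/chroma distinction described above can be illustrated with a short sketch. It assumes twelve equal-tempered steps per octave and a 440 Hz reference, neither of which is specified in the text, and the function names are hypothetical.

```python
import math

A4_HZ = 440.0  # assumed reference frequency; the text does not specify one


def semitones_between(f1_hz, f2_hz):
    """Interval from f1 to f2 in equal-tempered semitones: 12 * log2(f2 / f1)."""
    return 12.0 * math.log2(f2_hz / f1_hz)


def octave_and_chroma(freq_hz, ref_hz=A4_HZ):
    """Split a pitch into (octave offset from ref, chroma step 0-11 above ref)."""
    steps = round(semitones_between(ref_hz, freq_hz))
    return steps // 12, steps % 12


# The same frequency ratio (3:2) yields the same interval in any register.
low_fifth = semitones_between(110.0, 165.0)
high_fifth = semitones_between(880.0, 1320.0)

# Tones an octave apart differ in height but share a chroma.
a3 = octave_and_chroma(220.0)
a5 = octave_and_chroma(880.0)
```

Here the octave offset stands in for pitch height (linear) and the remainder modulo 12 for chroma (circular). A scale with a different number of steps per octave would change only the two constants.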
In addition to this, another circular dimension of subjective experience is found in relation to the Western tonal idiom where the interval of the perfect fifth has a strong structural importance. This dimension is the circle of fifths which represents a distance, or degree of commonality of pitch classes, between keys. A number of psychologists have investigated the mental structure of this subjective multidimensionality (with a technique called "multidimensional scaling") and find that it varies according to the culture and musical sophistication of the listener. The structures of less experienced listeners have fewer dimensions than those of more experienced listeners, notably with respect to the circle of fifths dimension, but the other dimensions are relatively similar in both types of listeners. This would seem to imply that with greater experience, one acquires increasingly rich mental structures with which to process the music one hears (Shepard, 1964, 1982; Krumhansl, 1979; Krumhansl & Shepard, 1979; Krumhansl, Bharucha & Castellano, 1982; Krumhansl & Keil, 1982; Balzano, 1982).

Another level of mental pitch structure assigns a hierarchy of organization to the pitches of a given scale system. This functional organization at, for example, the base of the Western tonal system establishes dynamic patterns of tension as one learns to expect that a certain scale degree will tend to move toward another. One of the more difficult tasks of serial composers and others composing outside of the tonal system is to avoid at all costs any sequential or

chordal pitch combinations that might invoke these hierarchical structures and generate expectancies that are foreign to the ordering scheme they are trying to establish.

With an understanding of these culturally acquired mental schemata for the organization of pitch material, it comes as no great surprise that the memory for single pitches, intervals and chords is strongly affected by the musical context in which they are placed, as sparse as this context may be in the many experimental situations where it was tested. In early studies by Krumhansl (1979), for example, additional information provided by the context helped in remembering a pitch that was related to a tonal context or interfered with one that was unrelated. This demonstrates that we remember pitches with respect to a reference frame derived from the surrounding context in such a way that the context evokes a given mental structure within which the pitch to be remembered is interpreted.

In a study by Shepard & Jordan (1982), the context was a major or minor scale in which all of the intervals were "stretched" such that the "octave" fell a semitone higher. Subjects were then presented with a pitch that came from the normal scale based on the starting pitch, from the normal scale based on the final pitch, or from the stretched scale as actually presented. They were asked to decide whether the pitch belonged to the scale they had just heard. Most subjects tended to choose pitches that belonged to the normal scale based on the final pitch, indicating that they have a kind of fixed pattern or template representing a normal diatonic scale and that as they heard the stretched scale this template was slid along the pitch continuum. In this case it would, of course, be anchored on the real final note which was a semitone higher, and when the test pitch was presented they would make their comparison based on a shifted rather than stretched scale pattern.
Very few people actually chose pitches that corresponded to the actual pitch played, indicating that memory for pitches is not absolute but occurs with respect to the structure that seems to be appropriate for a given context.

An important feature of sequential pitch patterns is their contour, or the pattern of ups and downs in a melody. This feature seems primarily to be used in STM. The use of contour for recognition of melody patterns seems to have importance where the relationship between the melody and a scale schema is not yet established or where there is no scale schema, as in atonal music. It also seems to be a particularly strong factor in recognition for people with little musical training or listening experience (Dowling & Fujitani, 1971). This contour factor remains constant when one performs a tonal imitation of a melody (translation of the melody to another scale degree of the same scale where some major intervals become minor and vice versa). In this case the pattern of ups and downs remains constant even though the actual intervals and chromas have changed. As a dimension of similarity, it may be important in the development of musical material. Bartók uses this similarity as a cohesive force (wittingly or no) when he augments the size of some or all of the intervals in a melodic theme: again there is a

serious modification of chroma and interval size but the global contour is maintained. Frances (1984) has shown the perceptual salience of this feature in experiments where atonal were compared to either direct transpositions or imitations of these transpositions which kept the same contour but modified certain intervals. Subjects had difficulty distinguishing the transpositions from the imitations, indicating that contour was the stronger feature used in the comparison. This was corroborated by Dowling & Fujitani (1971) who showed that subjects could readily distinguish similar melodies where the contours were modified.

At a higher level of perceptual abstraction, studies that extract multiple subjective dimensions from listeners' comparisons among chords have shown that the intuitions of musicians on the importance of key distance have a psychological reality. In these kinds of studies (Krumhansl, Bharucha & Castellano, 1982) listeners with varying degrees of musical training are asked to judge the relative similarity of chords derived from different keys. The quantitative result is a kind of map of keys in which one finds continuous paths of harmonic modulation easily represented.

Pitch intervals and chroma are primarily used in long-term memory. Studies by Dowling & Bartlett (1981) suggest that is important for melodic material stored in LTM which may mostly be material that we have become familiar with, since to get new material into LTM it must be repeated several times. However, if one has schemata for more efficiently new material, it can more easily be stored in LTM. Both experienced and inexperienced listeners, for example, store accurate interval information in LTM on first hearing a new melody when this melody is tonal (Dowling, 1982). With inexperienced listeners the information is simply stored as raw intervals, but with experienced listeners the intervals seem to be stored as chroma patterns linked to tonal scale schemata.
Dowling & Fujitani (1971) found no difference between experienced and inexperienced listeners in discrimination of imitations and transpositions of "atonal" sequences.
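The distinction between contour and exact interval content, and the way a tonal imitation preserves the first while altering the second, can be made concrete with a small sketch. The theme, the choice of C major and all function names are hypothetical illustrations, not material from the studies cited.

```python
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic for each scale degree


def degrees_to_pitches(degrees, scale=C_MAJOR):
    """Convert scale-degree indices (0 = tonic, 7 = tonic an octave up) to semitones."""
    n = len(scale)
    return [12 * (d // n) + scale[d % n] for d in degrees]


def intervals(pitches):
    """Signed size in semitones of each successive melodic step."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def contour(pitches):
    """Direction of each successive step: +1 up, -1 down, 0 repeated note."""
    return [(b > a) - (b < a) for a, b in zip(pitches, pitches[1:])]


theme_degrees = [0, 2, 4, 2, 0]                     # hypothetical theme: C E G E C
imitation_degrees = [d + 1 for d in theme_degrees]  # same shape, one degree higher

theme = degrees_to_pitches(theme_degrees)
imitation = degrees_to_pitches(imitation_degrees)
```

Moving the theme up one scale degree turns its major third into a minor third and vice versa, so `intervals(theme)` and `intervals(imitation)` differ, while `contour(theme)` and `contour(imitation)` are identical.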

It should be mentioned that one often reads of pitch memory studies in the music psychology literature contrasting "tonal" with "atonal" melodies. In most cases the "atonal" melodies are constructed by randomly selecting pitches from the chromatic scale (though the work by Frances dating from the late 1950s was more carefully composed). Thus a structured tonal melody is compared to a random atonal melody. This hardly does justice to the pitch structuring possibilities outside of the tonal system, and, in effect, does not test tonal versus atonal, but structured versus unstructured material. What is sometimes demonstrated is mostly a markedly unscientific (as well as naive) aesthetic bias on the part of the researcher who wants to "prove" the unreasonable lack of psychological reality in atonal music. We should take studies where researchers without appropriate musical culture place themselves in the shoes of composers with a large grain of salt (and swallow them with lots of milk to avoid the bad taste).

Rhythm

The analog of the pitch scale schema for rhythm is metric structure. The general consensus among rhythm researchers is that we use a metric beat pattern as a cognitive framework with which to organize rhythmic patterns in perception and production. Even in the case of very irregular rhythm sequences people attempt to infer a regular beat pattern (probably based on the generation of some kind of internal clock) and regular metric structure (Povel, 1984; Longuet-Higgins & Lee, 1984; Povel & Essens, 1985). This structure can be revealed by the way people's reproductions of heard rhythms deviate from the presented pattern, presumably moving towards a simpler pattern representation.

Povel (1981) studied people's abilities to synchronize tapping with a repeated rhythmic pattern or to continue the pattern after it was stopped. One result of the study was an apparent tendency to reduce more complicated patterns to binary rhythms with period subdivisions that had a duration ratio of 2:1. If a metric context could be provided that made it easier to encode the more complicated patterns, accuracy of reproduction increased. Povel hypothesizes a two-level cognitive structure including the determination of a regular beat pattern with a moderate tempo and then a simple (preferably binary) subdivision of beat intervals. The inferred metric structure that results from this operation then organizes the rhythmic patterns.

According to Longuet-Higgins & Lee (1984), we still need a few more assumptions about this process, since a given rhythmic pattern could be placed on a metric structure starting on any of several different beats in the period of the meter. One such additional constraint would be that in the interpretive process one tends to avoid as much as possible relations between metric frame and rhythmic pattern that give syncopations. This is, of course, a very strong culturally-based assumption that would not apply to many non-Western rhythmic systems.
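The syncopation-avoidance constraint can be sketched as a search over the possible alignments of a beat grid with a pattern. This is a deliberately simplified toy, not the procedure of Longuet-Higgins & Lee: it assumes that the beat period is already known, that onsets fall on an integer time grid, and that a "syncopation" is simply a beat inside the pattern on which no note starts. The onset list is a made-up example.

```python
def count_syncopations(onsets, period, phase):
    """Count beats inside the pattern on which no note starts (toy definition)."""
    onset_set = set(onsets)
    first, last = min(onsets), max(onsets)
    beats = range(phase, last + 1, period)
    return sum(1 for t in beats if t >= first and t not in onset_set)


def best_phase(onsets, period):
    """Pick the beat alignment with the fewest syncopations (ties go to the earliest)."""
    return min(range(period),
               key=lambda p: (count_syncopations(onsets, period, p), p))


# Hypothetical pattern: note onsets in grid units, beat period of 4 units.
onsets = [2, 4, 6, 10, 12, 14, 18]
phase = best_phase(onsets, period=4)
```

For this pattern, a grid starting on unit 2 places a note on every beat, whereas a grid starting on unit 0 leaves two beats empty, so the former alignment is preferred under the stated assumptions.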
To draw an analogy with the relation between melodic pitch contours and scale framework, Monahan (1984) suggests that patterns of rhythmic subdivision are anchored to a beat framework, such that the encoding of subdivisions is of relative, rather than absolute, temporal relations. As with melodic contours, rhythmic contours (or relative patterns of long and short) can be expanded or contracted to fit different tempo schemes or translated along a pattern of accents. Listeners tend to use these two dimensions together to judge the relative similarity of rhythmic patterns. Different patterns played at different tempi might be judged similar if the inferred relative rhythmic contours are similar in form. It remains to be shown if rhythmic contours behave similarly to melodic contours with respect to their potential for memorization.

Cross-cultural studies show that the form of the inferred metric structure depends entirely on cultural norms. A given rhythmic pattern might be anchored to a binary metric system with accents on the beat by a Westerner and to a ternary metric system with accents just after the beat by a Central African, if we define the beat as being where one would

clap in time, for example (cf. Arom, 1984, 1985). This implies that the schemata we develop for rhythmic interpretation are constrained by the norms of our culture. As Simha Arom is fond of saying, even after studying this music for 20 years he will never be able to hear these rhythms as the Africans hear them! There is a latent assumption here that there may be a limited critical period for the "natural" acquisition of a musical system in childhood as is evident in the acquisition of language. For further remarks on cross-cultural differences in rhythmic organization in African, Indian and Indonesian music, see Dowling & Harwood (1986, chap. 7). As with pitch, once again, rhythmic organization can be hierarchically structured. Much theory has been expounded on this subject and the reader is referred to work by Clarke (1985, this volume), Grisey (this volume) and by Lerdahl & Jackendoff (1983).

Timbre

Not much work has been done on memory for timbral identities and timbral structures. It is clear that have a great capacity for timbral discrimination and the use of timbral differences to convey information. This is primarily found in our use of , where two of the main categories that carry meaning, i.e. vowels and consonants, have many physical similarities to the domain of sound structure we normally call timbre with respect to musical sound.

The most notable research on mental structures for timbre has been done by Grey (1977) and by Wessel (1979; Risset & Wessel, 1982). In these studies, subjects were asked to judge the similarity of various pairs of instrumental and from these judgments a geometric "timbre space" was derived in which distance corresponded to degree of dissimilarity (Figure 3a). This notion corresponds psychologically to the "pitch space" represented by pitch height, the chroma circle and the circle of fifths, and may serve as a kind of mental schema that organizes timbral sequences. We might expect to be able to extract from this schema a certain quality in the relation between two points and judge other relations as having a similar quality in other parts of the space just as we consider a pitch interval to be similar through all of the registers of musical pitch.

Ehresman & Wessel (1978) tested this notion by asking subjects to judge the similarity of relations between pairs of timbres in an analogy task (Figure 3b,c). A pair of instrument timbres, A and B, were presented sequentially. Then a third timbre C was presented and a series of possible timbres D whose relation to C was to be judged. Listeners were to choose the D timbre whose relation to C was the most like the relation between A and B; a bit like the verbal analogy task, mother is to daughter as father is to....
The results indicate that if one considers the relation between A and B to be represented by a vector in the multidimensional timbre space, then the best completion of the analogy would be that D which formed the end-point of a vector that

[Figure 3. Panel (a): two-dimensional timbre space, with one axis running from bright to dull and the other from sharp attack to soft attack. Panel (b): timbre analogy, "A is to B as C is to which D", with A = English horn, B = oboe, C = cello. Panel (c): best solution is D1; the timbre interval English horn to oboe is approximately analogous to the interval cello to Eb clarinet.]

Figure 3 A geometric visualization of the mental representation of timbre can predict certain kinds of judged relations among timbres such as the timbre interval. The top square (a) shows 12 instruments from the 2-dimensional timbre space of Wessel (1979) for the dimensions "brightness" and "bite". The next square (b) shows the subject's task which is to complete the timbral analogy (by listening, of course), i.e. to find a timbre interval similar to A-B that starts on C. The last square (c) shows that the best prediction based on the mental representation of timbral relations is satisfied by subjects' reports. We can therefore infer that there is a mental representation of timbres whose form is similar to this geometric plot where distance corresponds to degree of similarity.

was parallel to the vector AB and of the same length. Thus, a timbre interval has not just distance in the timbre space, but orientation within it as well. To make a timbral transposition would involve sliding the timbral vector within some plane containing that vector. Based upon psychological considerations such as these, and others concerning the nature of potential form-bearing elements in music (cf. McAdams & Saariaho, 1985) or of general cognitive constraints on compositional systems (Lerdahl, in press), we should be able to develop ideas about the structural possibilities of timbre as an ordered and composed element in music (see chapters by Boulez, Bonnet, Lerdahl and Saariaho, in this volume).
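The vector reading of the timbre analogy can be sketched as follows: the ideal D is the point C + (B - A), and the predicted choice is the candidate nearest to it. The two-dimensional coordinates and candidate labels below are invented for illustration and do not come from the timbre space of Wessel (1979); the function name is likewise hypothetical.

```python
import math


def complete_analogy(a, b, c, candidates):
    """Return the candidate label closest to c + (b - a), plus that target point."""
    target = tuple(ci + (bi - ai) for ai, bi, ci in zip(a, b, c))
    best = min(candidates, key=lambda label: math.dist(candidates[label], target))
    return best, target


# Made-up coordinates in a two-dimensional space (not measured data).
A, B, C = (0.0, 0.0), (1.0, 0.5), (2.0, -1.0)
candidates = {"D1": (2.9, -0.4), "D2": (2.0, 0.5), "D3": (1.0, -1.5)}

best, target = complete_analogy(A, B, C, candidates)
```

Because the target is defined by adding the same displacement to C, the vector from C to the chosen D has roughly the same length and orientation as the vector from A to B, which is the sense in which a timbre interval is "transposed".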

Memory, musical structure and form

It seems almost banal to reiterate that there is an immense amount of structure in musical stimuli. For the psychologist struggling to make sense out of the human capacity to comprehend music, however, a description of the stimulus is only the beginning. The psychological questions start with the given structure and ask what aspects of that structure are relevant to cognition: which of the many dimensions of a musical stimulus structure have psychological reality, and how fine a grain of discrimination on those dimensions is relevant. (Dowling & Harwood, 1986, p. 163)

The pattern of memory processes that evolves through the development of a piece includes both specific memories for salient phrases or themes and less specific memories for more global invariants in the pattern. These two kinds of memory correspond more or less to Tulving's (1972) episodic and semantic memory, respectively. The time of an episode may be related to the length of the "perceptual present" or the time within which ongoing experience is available to consciousness and during which incoming information can be organized. We might think of it as a kind of sliding window, the width of which depends on the process of attention and on the density of new information coming in to be processed. Estimates of its width tend to be on the order of 2-5 seconds though some extend it to about 10 seconds. The processing and organizing limits of this window have a considerable influence on the amount of information that could be integrated into a conceptual unit to serve as a theme, for example. It is rather unthinkable that a melody lasting 2 minutes could be held in memory and understood in various kinds of transformation as a whole unit. The memory of a piece is organized in a series of episodes of different lengths that depend on the musical structure.
Rhythm is a very important element for episodic musical memory since it can help to hierarchically organize material for more efficient storage. It also tends to help segment material into phrases or sub-phrases. With respect to the entire structure of a piece an episode can correspond to the level of

the phrase. Phrases tend to have a kind of stability in memory. Memory of sequences within a phrase structure has been shown to be better than for sequences that traverse a phrase (or episode) boundary. This may imply that episodes are stored as structural units (Dowling, 1973). They are probably very closely linked to structural organization processes such as auditory stream formation and segmentation, as well as to processes that infer harmonic and metric structure (see next section on organizational processes). They are also probably linked to structural schemata that give rise to expectancies and then evaluate whether the music goes in the expected direction or not. What stands out as unexpected and unique in form is more likely to be remembered.

"Semantic" musical memory, or memory for global meaning or significance, is much less well understood, because less thoroughly investigated. There is first of all a problem of defining what "meaning" or "significance" mean with respect to musical structures, though we may be on the verge of being able to follow out some of the insights given us by Leonard Meyer in the 1950s with the carrying over of methodologies from psycholinguistics into music psychology. This type of memory is probably linked at a structural level to processes underlying pattern perception, and at a contextual level to processes of mental set and expectation based on wider cultural influences.

No discussion of memory and musical form would be complete without some mention of the importance of hierarchical structural organization. This notion has already come up in relation to memory for pitch, rhythm and timbre, but all of these different elements must converge somewhere. The main question that considerations of hierarchy attempt to address is how local and global features of structure are related.
Concerns with "formal coherence" on the part of many composers have led to all kinds of theorizing about how far one can push structural coherence between the micro-acoustic universe and the total structure of a composition. Without getting lost in such arcana, let us, anyway, look at a few basic points.

One advantage of hierarchical structures is that they are an efficient way to relate local and global features. Such structures have been recognized for some time as possible ways of representing many different kinds of information in vision, , motor performance and language (see Lerdahl & Jackendoff, 1983 for a hierarchical theory of music structure organization; Lerdahl & Jackendoff, 1984 present a concise overview of their theory, and Swain, 1986, discusses some of the psychological limits that should be considered in any theory of hierarchical structure). Their existence has been demonstrated with greater or lesser degrees of success for linear orderings of lists (Klahr, Chase & Lovelace, 1983; Johnson, 1970), for mental images (Reed, 1974; Reed & Johnsen, 1975), for propositional structures in language (Anderson, 1985, chap. 5), for mental schemata (Sowa, 1984, chap. 2), and for musical phrase structure (Stoffer, 1985).

Other evidence for the existence of a hierarchical representation of temporal structure is drawn from research on visual or musical phrase

structures. Phrasing is the introduction of various expressive variations in a sequence that enhance the perception of the structure (at least in the case of good phrasing!). Good phrasing enhances memorability and bad phrasing hinders it, presumably by strengthening or weakening the hierarchical sense of the structure. This has been shown by recognition experiments for temporal patterns of light flashes (Restle, 1972), by musical dictation (Deutsch, 1980), and by the time it takes to indicate the presence of a click superimposed on a musical phrase (Stoffer, 1985). What one does with phrasing in music is not only to show the basic structure; it has a functional value of articulation as well. This is different from merely revealing structure. It helps one hear something else in the structure by, for example, resolving a structural ambiguity in one way or another (Longuet-Higgins & Lee, 1984).

There is evidence that purely symmetrical tree structures may be too simple for communication or aesthetic purposes. Interesting variations usually take place along at least two dimensions in the development of musical material, e.g. rhythmic elaboration plus transposition in a melody (see chapter by Lerdahl on developing hierarchical structures with timbre in this volume). In the sonata allegro form, for example, two contrasting themes that are easily distinguished one from the other make it easier for listeners to keep track of where they are in the piece. The asymmetry provided by alternation of different materials also makes the structure more easily remembered by breaking a structural homogeneity (cf. Restle, 1970). Departures from structural homogeneity can be accomplished by the elaboration of repetitions and by introducing different material between repetitions as in the sonata allegro form. This may also be accomplished by embedding differing sub-phrase structures within a phrase in a hierarchical fashion (Dowling & Harwood, 1986, chap. 6).
There evidently are other kinds of structural relations in addition to hierarchies, which may be of a more associational and functional nature. A theme can be associated with any number of variations on it in memory, the strength of association depending on the degree of similarity and the clarity of the understanding of how one got to that transformation in the music. Relations among elements in a musical structure at this level can be ones of transposition, harmonic modulation, rhythmic and melodic elaboration, repetition, timbral variation and so on. The perception of the well-formedness of a phrase or section may well depend on some optimum balance between recognizable transformation and the degree of difference between the original material and the transformed version. Simplistically speaking, similarity among lower level features such as melodic themes may be given relief or musical interest by dissimilarity at a higher level such as harmonic modulation or some structural transformation; or, to the contrary, greater variability of the material's features can be rendered coherent with less change at higher levels of structure.

To close this section on mental representation of musical structure I would like to summarize the three possible levels of knowledge that can

be represented by any structural description of a piece of music. The first is the musical structure itself, which may be developed independently of psychological considerations, purely on the basis of the material that is proper to the piece. This is classically the work of the musicologist and music analyst. The second level is one of describing the knowledge the listener has and uses in understanding and producing music, what might be called musical competence in the terms used by the linguist Chomsky (1965). Here knowledge is described in terms of invariants of a given piece and style, and it represents more than merely the memory for particular instances of the piece as heard; it is rather an extracting of the grammar of an idiom. This level is very nicely exemplified by the work of Longuet-Higgins (1976) and of Lerdahl & Jackendoff (1983). The third level is one of description of the processes by which musical information is handled by the brain-mind. This level must represent the knowledge of the listener with the actual structural properties of mental (or physiological) representations and with the processes by which the knowledge is put to use by the listener, performer or creator. This level of description is the goal of theory and experimentation in music psychology. The important element in evaluating a theory is whether a given structural description makes a suitable basis for an account of the processing that experimentation allows one to infer. The form of representation is very important for discerning the nature of the processing that is done, because it heavily affects how easy it is to do certain kinds of things with it.

Organizational processes in music listening

Music derives from a complex acoustic signal that is highly structured and rich in intrinsic and extrinsic context. Intrinsic context consists of the relations among the elements within a given musical structure and extrinsic context consists of the relations of the piece to a musical style

and culture, including any extra-musical symbolic and codified emotional associations. A psychological theory of musical organization aims to describe the way a listener organizes this complex signal into a complete form. What will be described in this section is basically a beginning approach to a theory of the processes underlying music listening and by no means claims to address all of the processes involved in musical experience. It will be limited to the perception and comprehension of musical structure and will not attempt at this point to respond specifically to aesthetic, emotional and sociocultural aspects of musical experience that contribute significantly, nonetheless, to an appreciation of musical form. The primary query of such a theory is "what can possibly be heard and what tends to be heard in a complex musical work?" The question implies that there are acoustic structures that one cannot apprehend as such, and thus proposes that there must be, after all, biological and psychological

limits to the comprehension of musical form. The question also implies that there are default tendencies in hearing that one brings from the everyday world or from specialized musical training into a musical listening situation. These tendencies can place limits on the way in which certain musical structures are apprehended.

As an approach to understanding musical experience through the framework of cognitive science, the theory should address the mechanisms of processing, the forms of representation (and storage) that these processes operate upon, and the final comprehension of extended temporal structures. Some of the primary psychological interests here include:

- the temporal and hierarchical ordering of processes of musical organization from psychological to cognitive levels,
- the development of the notion of psychological coherence with respect to the cognitive processing of complex temporal structures and in relation to the inherent coherence of the acoustic environment,
- the nature and operation of preference rule systems in auditory organization that select one given possible organization of a structure over another,
- the psychological constraints on perceptual dimensions (such as pitch, rhythm and timbre) that are capable of carrying temporal form,
- the interaction between perceptual processing and memory structures in the assembling of large-scale temporal forms,
- the construction and acquisition through learning of mental schemata that permit a sense of development of tension and relaxation, expectancy and directed motion in music, and
- the development of notions of lexicon and meaning in a non-verbal medium without committing the error of expecting them to behave in the same manner as language.
The theory must ultimately describe the main functionalities of a "language" that is specifically musical and propose constraints on its form-bearing elements as well as provide the structural and procedural grounds for eventually investigating aesthetic and emotional aspects of musical experience. We may consider five main areas that need to be included in the theory: (1) reading the acoustic surface, (2) organization of acoustic information into coherent auditory "images", (3) segmentation and the extraction of the musical lexicon, (4) building structural relations, and (5) following a musical discourse. All of these levels and kinds of processing create and operate upon the mental representations of musical material and structure that were discussed in the last section.

Reading the acoustic surface

This level of the theory must account for the type of information that is made available to higher levels of processing by the peripheral auditory system as it transduces the vibratory information into nerve impulses. This is classically considered as the domain of psychoacoustics. Within the limits of its accuracy, the ear encodes a sound in such a way as to represent its temporal and spectral (frequency) characteristics. It does this by encoding the sound activity within small bands of frequencies in separate nerve channels, each of which is capable of following the fine-grained time structure within that frequency band. This allows for the eventual extraction of periodic behavior as well as the tracking of overall intensity fluctuations. In this way the frequency domain is represented across the array of some 80,000 nerve fibers that enter the brain from each ear, and the time domain is represented within the pattern of activity in each nerve fiber. This parallel representation of reciprocal properties (periodicity is the mathematical inverse of frequency) is important since various characteristics of a sound-producing body are at times more accurately represented in one domain than in the other, as we discussed at the beginning of the last section. At this level also, one takes into account the important effects due to masking, i.e. the partial or complete covering of one sound by others. Masking, like other limits of auditory discrimination, is considered to be constrained by the temporal and spectral resolving power of the ear, that is, by the degree to which the ear is capable of encoding very fine-grained differences in frequency and time.
For example, the information from two sound events that are similar in character and arrive within about 25-50 milliseconds of each other will tend to be integrated over time and perceptually fused into a single event; whereas those separated by more than 50 milliseconds are easily heard as separate events. Similarly, two frequencies that fall within a certain range (within about a minor third for middle and upper registers, and as much as a fourth or more at the bottom of the piano) tend to

interact with one another, creating beats and a rough sound quality, among other effects. This range, called the critical bandwidth, represents the limit within which separate frequencies tend more to be fused together than separated perceptually. These two kinds of resolution limit, temporal and spectral, depend a great deal, of course, on the constitution and behavior of the sound events.

An understanding of the nature of this parallel, low-level representation of the acoustic environment becomes important as an explanatory tool for the mechanisms underlying the organization of the sound world. Attentional processes guided either by listening or by visual information in a musical score, for example, may modify this phase of analysis and encoding so that the detection of "primitive" features, such as the beginning of a new sound event, may not be totally automatic, but, to some extent, under the influence of activity at more cognitive levels that "scan" the information array for certain objects or events that

may be "expected" to occur given the preceding context.

Research on the perception of sound qualities such as pitch, roughness, brightness and so on has classically been treated at this level, considering such perception to precede most other levels of auditory organization. However, recent work strongly suggests that the perception of qualities even as "primitive" as these is dependent on the result of sound source organization, which is discussed at length below. The data from experiments on sound qualities may still be considered relevant to our present concerns, with a couple of caveats. The results are almost always obtained for single sound complexes presented in isolation to listeners. There is, therefore, very little context to conflict with an interpretation of the sound complex as a single source. One might presume, in this case, that the results indicate what one perceives after the acoustic elements have been organized into sound sources. However, it should be understood that these results are often obtained under conditions of extremely prolonged listening by highly trained and analytic listeners. They represent refined listening capacities under circumstances that are seldom found in a musical situation, where one does not hear the same family of sounds presented thousands of times over headphones in a sound-proofed listening chamber. This is not to diminish the importance of these results for understanding the limits and certain aspects of the mechanisms underlying the perception of sound qualities. I want merely to signal the fact that the results of controlled laboratory listening can be quite different from what would be the case in a concert hall, where in the same hour's time a myriad different sounds in a rich context go flying by.
It is increasingly clear that context considerably affects the perceptual end result of even very "low-level" qualities, which may be considered as properties that emerge from a given organization of the sound environment.

Organization of acoustic information into coherent auditory images

Most of my own experimental work to date falls into this phase of the hearing process and has been concerned with elaborating a metaphor, the auditory image, that informs both compositional and experimental paradigms (McAdams, 1984a,b). A good metaphor in any domain of systematic thought should both provide a generality of description of existing knowledge and have predictive capabilities that generate new ideas, experiments and structures of thought. The auditory image is defined as a psychological representation of a sound entity that exhibits an internal consistency (or coherence) in its acoustic behavior. For this metaphor to be of any use in illuminating musical experience, it is necessary that the image, as a mental projection into consciousness, be derivable from perception, memory and imagination and that there be a commonality of mental representation that allows for this. It is also important that the metaphor allow for multi-leveled (or hierarchical) representations since a sound "entity", such as a chord or orchestral texture in polyphonic music, may also be a collection of other sound entities. In this way, the metaphor allows for the representation of more complex perceptual structures that result from the grouping of less complex structures.

Two of the key concepts here are the notion of coherent behavior of a sound entity and the notion of grouping processes that operate upon or seek this coherence in their attempt to form an accurate and plausible representation of the world's behavior. There are several levels of coherence that might be considered. The first is the coherent behavior of physical sound-producing objects that obey the laws of physics. It seems reasonable to presume, since the human perceptual system evolved in relation to the physical world, that this level of coherence should be reflected to a large extent in perception itself (Shepard, 1984). Most musical instruments are physical vibrating systems, with the exception of electronic sound synthesizers and processors whose complex behavior (such as found in additive synthesis or FM synthesis) maintains a mathematical consistency, but does not necessarily correspond to that of a vibrating physical system. Sound structures produced with these instruments can often present illusory situations to the hearing system, since many of the acoustic dimensions are not always correlated among themselves as they are in a physical system. Another level of coherence is psychological, and relates to the tendencies and limits of processing complex sound structures. It is possible, for example, to create sound structures with an acoustic musical instrument that are neither assimilated nor comprehended as such as musical form, and there are several examples from the integral and stochastic process idioms that illustrate this.
This does not mean that no form is heard in these works nor that they do not have aesthetic value based on something else. It rather points to the gap that is possible between the intended structure notated on the page and produced in the air by instruments and the received form that is accumulated "in the head." These are examples of what Lerdahl (in press) calls "cognitively opaque" musical structures where there is a significant discrepancy between the compositional grammar plus some intuitive constraints used by the composer to create the piece, and the cognitive grammar possessed by the listener who tries to understand the work. Part of this reflects the fact that psychological coherence depends to some extent on the interpretive schemata of everyday perceptual processes and past experience. This implies that what is coherent psychologically may evolve with specialized musical experience. However, this most likely also reflects some biological limits in structural processing. That is, we have a biological predisposition to process certain information structures and not others. (Cf. Gould & Marler, 1987, for a more general consideration of the biological limits of learning.) The third level of coherence would be that of the musical

[Figure 4: flow diagram, not reproduced here. Labels recoverable from the original: grouping rules; simultaneous grouping processes and sequential grouping processes (generation of preferences); processes of image formation, attention and selection (evaluation of preferences); acquired mental schemata and the accumulation of musical knowledge; emergence of image qualities and properties; "musical surface"; processes of segmentation and higher-level grouping; "musical" perception; building of structures; following musical discourse.]

Figure 4 Schematic diagram showing the flow of information through the various levels of organizational processing in music listening, the building up of a description of the current musical world, and the accumulation of musical knowledge.

structure developed by a composer, which may or may not correspond to a psychological coherence. A theory of musical organization processes should address itself mainly to the notion of psychological coherence, but should consider physical coherence as a limiting case. A more difficult task would be to investigate musical coherence in terms of the larger aesthetic and cultural issue of what a composer might possibly expect an audience to assimilate psychologically.

In order to understand the emanations of a sound entity, the auditory system must be able to extract the properties that derive from the ensemble of its elements. To do this it must know which ones to consider as an ensemble. The process of grouping is an attempt on the part of the hearing system to decide which acoustic elements "belong" to a given sound entity. This proposes that music listening is an information processing network wherein the grouping processes form a model or description of the current state of the world based on past experience and on incoming sensory data (Figure 4). This model also embodies expectations of the most probable continuation of the current situation, which constrains possible perceptual and musical interpretations of the behavior of the sources. A re-evaluation of the grouping decisions may be triggered by the detection of discrepancies between the current model and incoming sensory data, indicating the arrival of a new event and, potentially, a new source.

The theory postulates two sets of rules for grouping concurrent and successive acoustic elements. These rules define psychological coherence and many of them reflect very strongly the relation between physical and psychological coherence at this level of organization. These rules serve to form a description of source events and to group them into source streams (Bregman, 1981; McAdams & Bregman, 1979; McAdams, 1984a,b).
This is an important step for auditory theory in general since the form the rules take bridges what may be considered a long-standing gap between studies of higher level cognitive processes of hearing and the lower level processes of auditory sensation and perception that are classically studied with the quantifying methods of psychophysics. This

form refines the classic Gestalt principles of organization for audition by connecting them to their underlying psychoacoustic processes. This is equally an important step for music theory as it proposes a framework within which to evaluate realistically the relation between the limits on the processing of musical structure and the assimilation and appreciation of a musical form.

There are several criteria for the grouping of concurrent or simultaneous elements into source events. The acoustic elements from a single source tend to have a common spatial origin, coherent frequency behavior (McAdams, 1984b), coherent amplitude behavior (Dannenbring & Bregman, 1978; Hall, Haggard & Fernandes, 1984; Bregman, Abramson, Doehring & Darwin, 1985), harmonicity or common periodicity of components (Darwin, 1981; McAdams, 1982; Scheffers, 1983; Hartman, McAdams & Smith, 1986; Moore, Glasberg & Peters, 1986) and coherent behavior of resonance structure (McAdams, 1984b;

Darwin, 1984). This set of rules assembles acoustic elements that together constitute an event emanating from a sound source. They operate mainly on the way the various acoustic dimensions change over small periods of time and attempt to track these changes and determine which elements are changing in a similar fashion. This set may therefore be paying attention primarily to the temporal encoding of information across the nerve fiber array. The criteria for grouping sequential elements include continuity of spectral content (related to changes in pitch and timbre) and continuity of intensity (Bregman & Campbell, 1971; van Noorden, 1977; Bregman & Pinker, 1978; McAdams & Bregman, 1979; Bregman, 1981). This set of rules connects events across time that come from the same sound source. They appear to operate primarily on average estimates of spectral activity within events and thus over a slightly larger time frame than concurrent grouping. This set would appear to pay more attention to the general form of activity across the nerve fiber array and detect sudden changes in this activity indicating a new sound source entry. It would also need to invoke higher level processes to connect similar events across time. Both sets are, of course, limited by the spectral and temporal resolving power of the auditory system, i.e. perceptual integration into "texture" or a complex timbre may occur for high temporal and spectral densities. All of these criteria are graded conditions for grouping, meaning that their presence may influence, but does not necessarily guarantee, grouping. The degree to which they induce grouping depends on their strength. For example, the fusion of frequency components into a single image due to vibrato can be stronger than their separation due to the fact that the components come from two different spatial locations, i.e. 
frequency components split between two loudspeakers can be made to fuse into a single image if they have a common vibrato (McAdams, 1984a). The rules establish preferences among a set of logically possible auditory organizations, and a projected form is imposed on the acoustic surface that represents the evaluated highest degree of overall preference. Indeed this form is the one judged to be the most coherent or stable and results in the image formed in consciousness (Figure 4). Considering these criteria in this way leaves open the possibility that they may interact with one another, which is clearly the case. The various criteria may converge on a similar structure, leading to a clarity of perception, or they may diverge, proposing different structures, in which case structural ambiguity results. Thus, the grouping rules form an interactive system that operates according to principles of reinforcement and conflict. These principles operate at many different levels of musical organization, which is one of the properties of musical structures that gives them their vast richness. We should be careful to distinguish ambiguity from indeterminacy here. I refer specifically to a desired structural ambiguity that gives rise to a polyvalence of form. This raises the aesthetic question of what kinds of ambiguity are relevant and valued, and in what kinds of contexts.
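The idea that graded criteria reinforce or conflict with one another can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Cue` type, the numerical strengths and the simple additive combination are all hypothetical, and nothing in the experimental literature cited here specifies how the auditory system actually weighs its criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """One graded grouping criterion acting on a set of frequency components."""
    name: str
    strength: float      # hypothetical graded strength, 0.0 (absent) to 1.0 (strong)
    favors_fusion: bool  # True pulls toward one image, False toward separate images


def net_preference(cues):
    """Sum the cue strengths, counting cues against fusion as negative."""
    return sum(c.strength if c.favors_fusion else -c.strength for c in cues)


def preferred_organization(cues):
    """Pick the organization with the higher overall preference.

    An exact balance is reported as "ambiguous"; this is a crude stand-in
    for the structural ambiguity that arises when criteria diverge.
    """
    score = net_preference(cues)
    if score > 0:
        return "fused"
    if score < 0:
        return "segregated"
    return "ambiguous"


# Illustrative values only: a strong common vibrato against a weaker spatial cue.
common_vibrato = Cue("common vibrato", 0.8, True)
two_loudspeakers = Cue("different spatial origins", 0.5, False)
print(preferred_organization([common_vibrato, two_loudspeakers]))
```

With these invented strengths the common vibrato outweighs the spatial separation, which mirrors the loudspeaker demonstration qualitatively; changing the numbers changes the outcome, which is the sense in which the criteria are graded rather than decisive.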

Higher level grouping may be conjectured to involve criteria of "common fate" behavior, such as similar trajectories along a given musical dimension, or even higher level musical coherence such as the degree to which elements belong to or contribute to a similar structural frame (harmonic, metric, textural, etc.). It would appear that what is derived as a perceptual quality after one level of auditory organization may become an "element" contributing to grouping decisions at a higher level of organization. For example, the separation of several series of harmonic frequencies and the subsequent perceptual fusion of each one gives rise to a distinct pitch for each series. At the next level of organization, some of these pitches may be grouped according to the musical context as members of a common chord whose quality of being consonant or dissonant arises from their being considered as a group. Melodic context may, however, pull one of these pitches into a sequential organization and its perceptual contribution to the dissonance of the chord, for example, may be diminished. Harmonic consonance and dissonance can thus be influenced by polyphonic organization (see Wright & Bregman, this volume).

With most of the sound sources encountered in everyday life, the simultaneous criteria at the lowest level of grouping assemble acoustic elements into source events and the sequential criteria connect events appropriately into meaningful source streams. In these cases, all grouping criteria mutually reinforce one another and what emerges may be considered as a stereotypical (or prototypical) grouping (Jackendoff, 1983, chap. 8). However, some work has shown that with certain stimulus configurations, even simultaneous and sequential grouping processes can conflict with one another, creating situations with multiple perceptual interpretations. The resulting perceived qualities of the sources depend on the way the conflict is resolved (Bregman & Pinker, 1978).
It is clear that active and passive attentional processes can play a strong role in the resolution of these conflicts, particularly at higher levels of grouping where functional ambiguities resulting from conflicting "vertical" (harmonic) and "horizontal" (melodic) organizations can be of great musical value.

One of the more compelling conclusions from work on auditory organization, as mentioned in the previous section, is that the grouping of elements into auditory images occurs before the perceptual qualities of the source image are evaluated (at least within a relatively small time frame, probably less than a second; see Figure 4). This is evidenced by the fact that with the same number of harmonically related frequency components, for example, one can obtain either a single pitch or two pitches an octave apart simply by introducing independent vibratos onto each subset, thus inducing the auditory system to split the even and odd components into separate images. This has important consequences for music composition with electronic or computer sound synthesis and for technique since it implies that arriving at a given musical quality depends on arranging one's sound elements so that they are organized appropriately by the auditory system. In

essence, these principles of auditory organization give composers additional conceptual tools with which to ply their trade.

The important psychological notion here is that of emergent properties of a perceptual organization. Qualities emerge from perceptual groupings or, as Bregman claims, are assigned to sources. Thus, the timbre or pitch of a sound depends in part on the relations of the amplitudes and frequencies of the components that constitute the auditory object. If some of them are extracted from an object and associated with another, the qualities of the first one would change. At a higher level, the dissonance of a chord depends upon which of a group of concurrent notes are grouped together as the chord. If a more dissonant component is pulled into a separate melodic organization, its departure from the concurrent organization may diminish the perceived dissonance of the chord.

Much of the composition of what is called texture in the music of this century, in effect, plays upon the strength of grouping of elements extended across pitch and time into large-scale groupings. It depends on a kind of balancing between fusion of the elements into a coherent organization and their fission into separate organizations. We perceive on the one hand its global texture (or the group as a whole) and on the other its grain (or a partial distinguishing of the individual elements). This play between what Boulez calls fusion and articulation is an important aspect of orchestration in the composition of musical timbre (see his chapter and that of Bonnet, this volume, as well as Erickson, 1975, chap. 6, and 1982). This suggests a continuum between fusion and fission that allows the composer to move between the perception of unity and multiplicity in an agglomeration of sounds, and thus to move between timbre and harmony (see Saariaho, this volume).
The notion that fusion and fission are not two points of a binary relation but rather opposite poles of a continuum means that the strength of presence of emergent qualities can be modulated by the strength of the grouping: the stronger the tendency for a given organization, the stronger the presence of the quality. It seems clear that in intermediate cases the attentional processes of the listener can strongly influence where on the continuum the perception falls.

To conclude this section, let us say that what emerges from this stage of perceptual processing is what Lerdahl & Jackendoff (1984) have called the "musical surface" (see Figure 4), though I depart significantly from them since they consider the surface to be the "physical signal of a piece when it is played," and I consider it to exist at a certain level of perceptual abstraction, i.e. after the detection of events, their organization and the assignment of their qualities. The acoustic environment has been organized into objects and one may now pay attention to the events, their qualities, their relations to other events and the structure that they carry.
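The arithmetic behind the demonstration discussed in this section, in which even and odd harmonics split into two images an octave apart, can be checked with a short sketch. It rests on a deliberately crude assumption: the pitch of a fused set of harmonic components is taken to be the greatest common divisor of their frequencies, which ignores most of what is known about pitch perception. The fundamental of 220 Hz and the function name are hypothetical choices made for illustration.

```python
from functools import reduce
from math import gcd


def implied_fundamental(frequencies_hz):
    """Greatest common divisor of integer frequencies in Hz.

    Used here only as a rough stand-in for the pitch of a fused
    harmonic complex; it is not a model of pitch perception.
    """
    return reduce(gcd, frequencies_hz)


f0 = 220  # hypothetical fundamental frequency in Hz
harmonics = [f0 * n for n in range(1, 9)]  # harmonics 1 to 8
odd_components = harmonics[0::2]           # harmonics 1, 3, 5, 7
even_components = harmonics[1::2]          # harmonics 2, 4, 6, 8

# One fused image: all components share the fundamental f0.
single_image = implied_fundamental(harmonics)

# Two images after the subsets are split by independent vibratos.
odd_image = implied_fundamental(odd_components)
even_image = implied_fundamental(even_components)

print(single_image, odd_image, even_image, even_image / odd_image)
```

Under this assumption the odd subset keeps the original fundamental while the even subset implies a fundamental twice as high, that is, an octave above. The same components therefore yield different qualities depending on how they are grouped.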

Segmentation and the extraction of the musical lexicon

It is at this next level that truly musical organization begins to take place. The theoretical concern is how the organized stream of musical events is broken into meaningful chunks at various levels of hierarchical organization and represented in memory so that recurring patterns (with variations) can be recognized later in a piece. Essentially, this aspect of a theory should describe the auditory analysis of the thematic musical material used by a composer that is subsequently re-presented in varied forms in the development of a composition. The thematic material may be considered as the musical lexicon used by the composer within the confines of the grammar of his musical language to construct a musical discourse. A great deal of description of how variation and transformation is effected in a musical discourse has been offered by music theorists, but what music theory itself has not addressed until recently is the psychological nature of the listening activity and its tendencies and limits of analysis and reconstruction of musical form as the sound events come incessantly through time to the ears of the listener.

A laudable attempt at a more listener-oriented music theory has been recently published by Lerdahl & Jackendoff (1983), who develop a cognitive theory of musical competence for listeners experienced in the classical Western tonal idiom (e.g. the music of Bach, Haydn, Mozart and Beethoven). The major achievement of this theory is its formal expression, which should in practice lend itself to scientific verification. A primary focus of the theory is how a listener hierarchically segments a stream of musical events (the musical surface structure) into motives, phrases and sections and derives a metric structure from the patterns of strong and weak events.
The authors rightly intuit that a music theory which is "generative" (in the sense of Chomsky, 1965) must not only assign structural descriptions to a piece (of which many are logically possible), but must, as well, describe how a listener "selects" among the possibilities and arrives at a single heard structure at a given listening. To accomplish this, two types of structuring rules are postulated: the well-formedness rules which specify the possible structural descriptions (the ensemble of rules corresponding more or less to a mathematical definition of hierarchy) and the preference rules which designate those descriptions which correspond to a listener's preferred hearing of the piece. The well-formedness rules put limits on what constitutes an allowable group structure and represent the rules of grammar, while the preference rules correspond more to perceptual laws of organization of form. The relative importance of the preference rules is a reflection of the sense of music as more or less pure structure to be manipulated psychologically, within certain bounds, by the listener. It is in relation to the nature and organization of preference rules, which may be considered as laws of aesthetic preference modulated by cultural convention and personal experience, that one can eventually begin seriously to entertain questions about aesthetic aspects of musical experience.
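The respective roles of the two rule types can be illustrated with a deliberately small sketch. It is not Lerdahl & Jackendoff's rule system: the function names, the toy melody, the weights and the boundary cost are all assumptions introduced here for illustration only. The well-formedness step enumerates every way of cutting a note sequence into contiguous, non-overlapping groups; the preference step scores each candidate, favoring boundaries that fall at longer time gaps and larger pitch leaps, and keeps the highest-scoring candidate as the "heard" grouping.

```python
from itertools import combinations

def well_formed_groupings(n_events):
    """Return every set of boundary positions that cuts events 0..n-1
    into contiguous, non-overlapping, exhaustive groups."""
    positions = range(1, n_events)  # a boundary at i falls between events i-1 and i
    return [cuts
            for k in range(n_events)               # k = number of boundaries
            for cuts in combinations(positions, k)]

def preference_score(cuts, onsets, pitches, leap_weight=0.1, boundary_cost=1.0):
    """Toy preference rule (hypothetical weights): a boundary is rewarded for
    the time gap and pitch leap it spans, and pays a fixed cost for existing."""
    score = 0.0
    for c in cuts:
        time_gap = onsets[c] - onsets[c - 1]
        pitch_leap = abs(pitches[c] - pitches[c - 1])
        score += time_gap + leap_weight * pitch_leap - boundary_cost
    return score

def preferred_grouping(onsets, pitches):
    """Select, among all well-formed groupings, the one the toy rule prefers."""
    candidates = well_formed_groupings(len(onsets))
    return max(candidates, key=lambda cuts: preference_score(cuts, onsets, pitches))

# Hypothetical six-note melody: onset times in seconds, pitches as MIDI numbers.
onsets = [0.0, 0.5, 1.0, 2.0, 2.5, 3.0]
pitches = [60, 62, 64, 72, 71, 69]
best = preferred_grouping(onsets, pitches)
print(len(well_formed_groupings(len(onsets))), best)
```

For this melody there are 32 well-formed candidates, and the toy rule selects the single boundary before the fourth note, where the longest gap and the largest leap coincide. Changing the assumed weights changes the preferred hearing, which is the sense in which preference rules, unlike well-formedness rules, are open to modulation by convention and experience.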

Experimental work suggests that Lerdahl & Jackendoff's conception of the ensemble of preference rules for grouping structure is reasonably satisfactory for predicting the responses of listeners. Deliege (1985, in press) proposes a repartitioning of the rules based on perceptual mechanisms, as opposed to music theoretic dimensions. The two kinds of mechanisms that appear to effect segmentation operate upon 1) spectral continuity which is directly related to changes in pitch, timbre and dynamics (as with sequential grouping processes), and 2) a number of temporal factors such as change in duration, articulation and rests between groups of notes. This repartitioning may both simplify the rule set and better predict the nature of structural ambiguities that do result when rules enter into conflict with one another. Also, these perceptual mechanisms appear to be closely related to those responsible for sequential auditory organization, indicating that segmentation grouping could influence event stream formation, i.e. influence the construction of the "musical surface" which should ostensibly already be organized into source streams. The construction of the "musical surface" is accepted as a given that is not even conceivably operated upon by higher level processes in Lerdahl & Jackendoff's theory. This shows the bias of the theory as one of musical structure representation more than of structural processing. These issues will eventually need to be addressed in further refinements of the theory. Another area not considered by Lerdahl & Jackendoff which a more complete psychological theory would need to address is the involvement of memory processes in the storage, categorization and later recognition of musical motives and themes. Segmentational grouping is often referred to as "chunking" in memory research.
A great deal of work has shown that there are severe limits on how much information can be held in short-term memory (STM), which is a kind of temporary storage for information that has recently arrived through the senses or has been retrieved from long-term memory (LTM). Remember that about 5-9 organized chunks of information can be held in STM. A chunk can be as simple as a single note, and as complicated as several notes arranged in a small hierarchical organization, a phrase for instance. The more organized the information is, the more of it one can hold on to. It is convenient to consider these organized chunks as musical fragments. What this research implies for musical organization is that segments that are to serve as musical material for further development must be either very simple or well-organized, making them unified enough to be a chunk in STM, and small enough to fit into a limited time frame, what might be called the window of the perceptual present. A grouped fragment will only be successful as musical material insofar as it can be stored and can serve as a pattern against which later variations or related structures are recognized or compared. It is certain that the nature of memory encoding places certain limits on the form that groups can assume and still be easily recognized in order to "tie together" similar groups at more distant points in a composition. For a segment to contribute to the perception of relatedness with variations or transformations of the material, it must be striking enough to have been presented enough times to be stored in LTM. This may occur through repetition within a given piece or across several hearings of a piece. This shows the primary importance for comprehension of redundancy and repetition in music. To comprehend large-scale temporal structures, their building blocks must be storable in LTM in order to contribute to the building of structural relations. What are equally important are the kinds of mental schemata for music one has acquired, since the more familiar the material is stylistically, the easier it will be to store in LTM as well. Considering the segmentation of the musical surface into groups as a crucial step in the extraction of the musical vocabulary of a piece raises some interesting issues about the possible nature of the material of music and the structural limits of a musical "object." This kind of formalization of the problem should eventually allow experimentation on the degree to which the segmentation criteria are universal or culture-specific. (See Reynolds, this volume for a consideration of the importance of recognition and identifiability of musical "cells" in the construction of musical form.)
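The short-term memory limit discussed above can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the melody, the fixed chunk size, and the capacity of 7 (taken as the midpoint of the 5-9 range cited in the text). Real chunking is driven by grouping structure rather than by a fixed size; the sketch only shows why the same twelve notes overflow STM as isolated events yet fit comfortably once organized into three-note fragments.

```python
def chunk(notes, size):
    """Group a flat note list into consecutive chunks of `size` notes."""
    return [tuple(notes[i:i + size]) for i in range(0, len(notes), size)]

def fits_in_stm(chunks, capacity=7):
    """True if the number of chunks does not exceed the assumed STM capacity."""
    return len(chunks) <= capacity

# Hypothetical twelve-note melody, pitches as MIDI numbers.
melody = [60, 62, 64, 65, 67, 69, 67, 65, 64, 62, 60, 59]

as_single_notes = chunk(melody, 1)  # twelve one-note chunks
as_fragments = chunk(melody, 3)     # four three-note chunks

print(fits_in_stm(as_single_notes), fits_in_stm(as_fragments))
```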

Building structural relations10

This level of the theory would describe the way in which relations among musical elements assembled at lower levels begin to give rise to mental structures (Figure 4). Part of this process involves deriving a hierarchy of structurally important events or motives with respect to their position in the grouping organization determined in the previous stage. As such it corresponds partially with the "metrical structure" and "time-span reduction" rule sets of Lerdahl & Jackendoff (1983). The metrical structure rule set derives a hierarchical ordering or ranking by degree of importance of the beats that a listener attributes to a piece. The time-span reduction rule set determines the relative structural importance of pitch events within the heard rhythmic units of the piece. In fact, Lerdahl & Jackendoff's theory is admitted to be purely hierarchical, not taking into account associative relations in the description of musical structure. The main issues at this level should include the categorization of musical material, the recognition of similarity or relatedness of various transformed versions of the material, the encoding in memory of associative relations and of hierarchical relations, and the emergence of various structural properties. It is not yet clear to what extent musical qualities such as timbre and texture lend themselves to hierarchical organization, though serious attempts are being made to move in this direction (see papers by Lerdahl and Saariaho, this volume). It is quite clear, however, that timbre can play a strong role in one's perception of the structure of the piece. It seems possible that building timbre organizations on the basis of consonance and dissonance in a much larger sense of these words could give rise to a sense of directed motion and thus to the kinds of anticipatory mental schemata that are the requisite for a hierarchical form-bearing dimension.

The apprehension of form in music obviously depends on the perception of the structure of events across time (see papers by Clarke and Grisey, this volume). We need to investigate the extent to which and the manner in which this structure can be apprehended by a listener. The extent partially depends on morphophoric, or form-bearing, capacities of the sound medium with respect to the perceiving individual. These capacities represent basic biopsychological constraints on the apprehension of form. The manner depends on the degree and type of musical experience and training of the listener. An important question is the way in which and the degree to which musical material can be transformed or varied and still be perceived as related to previously presented material. The perception of the relatedness of musical elements separated in time and the perception of redundancy partially determine the perception of temporal form. Redundancy results from a strong perception of relatedness. The notion of relatedness bears very heavily on the nature of the constraints on form-bearing elements in music, i.e. one cannot simply choose any arbitrary combination of notes or organize variations of that combination and expect in all cases that it will give rise to a strong or perceivable musical form. Large-scale form is in some sense limited by the nature of the material that is structured to give rise to that form, and the task of psychological theory is to understand the limits of the cognitive capacities of a listener to process such material. The perception of relatedness and the degree of relatedness depend on the perception of similarity under various classes of transformation of musical material and the nature of morphophoric capacities determines the kinds of possible transformation.
If some musical element is easily remembered, and its transformations easily recognized, its potential contribution to a form will be greater than that of another element which is very difficult to remember. The relation between the mechanisms of grouping and the perceptual dimensions along which groups are organized is of paramount importance for considerations of which perceptual dimensions are capable of carrying musical form, such as pitch, rhythm, and some dimensions of timbre. One can easily imagine the formation of groups with similar pitch, timbre and rhythmic content, but it is less likely that one would group musical notes on the basis of the amount of vibrato present, for example. From recent investigations in the realm of musical timbre have emerged several criteria for musical form-bearing elements. These include the following six criteria (from McAdams & Saariaho, 1985).

Form-bearing elements are perceptually differentiated into discrete categories. This means they are either artificially restricted to differentiable categories, as is the case with musical pitch, or a continuous acoustic dimension is broken into categories by the auditory system, as is the case with speech phonemes. We remember discrete dimensions much better than continuous ones. This is not to say that in the case of musical pitch, for instance, we cannot appreciate whether a note is out of tune or not: of course, most of us can; but if a melody is played that continually changes in pitch and only passes through the appropriate pitch at the right time, one generally has a difficult time remembering this form in order to recognize it later or to compare it with another similar form. Where a categorized dimension may serve the building of structure, continuous variation along this dimension may serve for expressive transformations of the "ideal" categories as we find in expressive intonation and in micro-tonal ornamentation in Indian and , or in micro-temporal variations in expressive rhythmic performance (Clarke, 1985). The classification of points along a single dimension or several correlated dimensions helps also with the characterization of the properties of musical entities. It is with the contrasts between these entities that musical structure is built.

The perceptual categories are ordered so that the relations among them are functional. This relates to the syntactic characterization of the properties of musical entities, and is a very strong characteristic of the pitches in modal and tonal music systems, e.g. the functional relationship between the seventh degree of the scale and the tonic is different from that between the seventh and the sixth. These relations are not, of course, entirely independent of the context, but they certainly have strong tendencies of their own due to their position within the system. This ordering may be of different types, e.g. the hierarchical ordering of or the serial ordering of dodecaphonic music. The way of ordering often places severe constraints on the possibilities of musical form that can successfully be realized within such a system.
The classification schemes and their ordering must therefore reflect psychological possibilities or these structures will not be decodable by the listener and thus not contribute in themselves to an appreciation of musical form. One of the contemporary compositional problems is the invention of psychologically relevant functionalities within the increasing number of musical dimensions of compositional interest.

The functional relations are of varied strengths and types which allow for the building and release of tension. In tonality, for example, the strength of the relation between the first and fifth degrees of the scale is much greater than that between the third and the fifth, and indeed such a movement is structurally less important for tension than in the former case. There are many different ways to describe the function and strength of relations among entities which include similarity and dissimilarity, consonance and dissonance, dominance and submission, and so on. It is important that the nature of the relations be relatively independent of the local context of the piece and yet closely bound to the "context" of the ordering system itself in order to have a generality of application.

Attention can be paid directly to qualities of a category, to qualities of relations among categories, or to combinations of relations. With pitch, for instance, one can focus on the pitch itself, on the quality of an interval between two pitches or on the quality of a chord as an assembly of intervals. In different musical contexts, one or the other of these levels of attention is more important for the extraction of musical structure. This demonstrates the importance of emergent properties of perceived relations or groupings, a realm where the psychological processes of auditory organization play a strong role in what is available as musical material emerging from the acoustic structure at a given moment. There is obviously an interest in combining qualities as well as their emergent relations. This raises compositional problems concerning the interdependence of manipulating both musical entities and the properties that emerge from their interactions.

The categories, functional relations and ordering within a classification system must either reflect the existing structure of the mind and world or be susceptible to learning by listeners if the structures are to be apprehended as musical form. This is a criterion that evokes the role of mental schemata in the organization of perceptual experience. Obviously, one of the most important elements here is the categorization, structuring and then committing to memory of experience. Important work in the development of perceptual skill has shown that the basic machinery for interpreting the world exists at birth and that is a process of progressive discrimination, or reduction of discrimination, i.e. the elaboration of innate schemata and the further division of existing categories, or the removal of some pre-existing boundaries between categories that have no functional value.

Relations among categories must be able to maintain a certain degree of invariance under various classes of transformation. If patterns composed within a set of dimensions are not susceptible to being varied and still being perceived as similar, then the set of dimensions cannot be strong contributors to musical form. After all, variation and transformation are some of the main features of music throughout the world.
The limits of perceiving a transformed pattern as somehow related to an original one point to psychological limits of viable processes of musical transformation. Pitch patterns, for example, can be transposed, played in a related key, expanded or contracted, and still be perceived as related to the original motif. Psychologically this implies a mental representation of the original concept or model (motif) which maintains certain properties in its structure during these transformations and which are perceivable as such (similar contour, for example). An exploration of this domain could ultimately open up a wide territory of rich functional substitutes for the process of variation. However, it also implies limits to the possibilities of transformation through the many quality dimensions of musical entities, and limits to the possibilities of significant coherent structuring of musical material that is to be transformed. More concerted theoretical and experimental efforts are needed to map out the kinds of transformations that are possible for each potential morphophoric dimension, such as pitch, rhythm and timbre. It remains to be seen to what extent the various "dimensions" of timbre (such as brightness, roughness, harmonicity, attack quality, etc.) are capable of carrying structural relations.

The psychological processes that underlie the perception of relatedness would include processes of comparison for similarity of information just received in short-term memory and structures previously stored in long-term memory. This comparison can be conceived of as an activation of associations between the segment in STM and the material in LTM. What is of primary importance here is the process of activation that allows us to recognize the previously presented material. There must be a relation of the degree of similarity to the strength of this recognition and thus to the strength of the contribution that the current segment makes to the apprehension of formal relations. One would expect, for example, that an identical re-presentation of a theme would have a very strong association with its earlier presentation; whereas a retrograde-inversion of the theme would have a very weak association or none at all. In the former case a stronger structural tie would result than in the latter case. Though hierarchical rather than associational, the prolongational reduction in Lerdahl & Jackendoff's theory shows a certain commonality with these aspects of the representation of musical form. (See also paper by Reynolds, this volume, on variation and transformation of musical material.) It should be emphasized again that this process is limited by the operations of the organizational processes. A transformed segment cannot be too long in time or be too complicated or it would not be easily stored in short-term memory for comparison. It cannot have its pitch intervals stretched too much at a faster tempo, for example, because the segment would tend to segregate into other segments and, first of all, be no longer processed as a whole segment and, consequently, not be recognized as related to its untransformed predecessor.
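The idea that a motif keeps certain properties under some transformations but not others can be sketched in a few lines. The representation is an assumption made for illustration: pitches are MIDI note numbers, "expansion" is modeled as multiplying every interval by an integer factor, and the function names and example motif are hypothetical. The sketch makes no claim about how listeners actually compute similarity; it only shows which structural descriptions survive each operation.

```python
def intervals(motif):
    """Successive pitch intervals in semitones."""
    return [b - a for a, b in zip(motif, motif[1:])]

def contour(motif):
    """Direction of each step: +1 up, -1 down, 0 repeated pitch."""
    return [(d > 0) - (d < 0) for d in intervals(motif)]

def transpose(motif, semitones):
    """Shift every pitch by the same amount."""
    return [p + semitones for p in motif]

def expand(motif, factor):
    """Scale every interval by `factor`, keeping the first pitch fixed."""
    first = motif[0]
    return [first + factor * (p - first) for p in motif]

def retrograde_inversion(motif):
    """Mirror the motif around its first pitch, then reverse its order."""
    inverted = [2 * motif[0] - p for p in motif]
    return inverted[::-1]

motif = [60, 62, 64, 60]  # hypothetical motif: up, up, down

print(intervals(transpose(motif, 5)) == intervals(motif))       # intervals preserved
print(contour(expand(motif, 2)) == contour(motif))              # contour preserved
print(contour(retrograde_inversion(motif)) == contour(motif))   # contour not preserved
```

Transposition leaves the interval sequence intact and expansion leaves the contour intact, whereas the retrograde-inversion of this motif shares neither with the original in order, which is at least consistent with the weak association the text expects for that transformation.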
Another important psychological issue is the way in which the structuring of musical material gives rise to the perception of more global properties of structure such as tempo, meter, key, interval, set, harmonic progression, textural evolution and large scale control of spectral/registral form. These last two properties are masterfully used by the composer Ligeti who sculpts massive trajectories of timbre and texture through time. We urgently need ideas of how to proceed experimentally in the understanding of the contribution of organizational processes to perception of global properties.

Following a musical discourse

At this level the psychological issues include questions of how one perceives and represents the flow of musical meaning: the development of tension and relaxation, a sense of continuity and progression, the creation of expectation and a sense of directed motion, and the nature and limits of auditory attention as a listener "participates" (perhaps by conscious ) in the re-creation of a musical form. It partially corresponds to the "prolongational reduction" rule set of Lerdahl & Jackendoff. This rule set derives a hierarchy of the stability of pitch events in terms of perceived patterns of tension and relaxation.

The major psychological question concerns the nature of the processes that create models or schemata for what the listener believes the music to be doing and where he believes it to be going (either tacitly or consciously). This kind of mental model, in a sense, inspires expectation if the musical language is accessible to the listener. Anomalies in what actually takes place with respect to the mental model can introduce strains in the expected structural completion and give rise to musical tension. It is probably at this level that one can begin to consider seriously the structural source of emotional experience in relation to music. The temporal perspective necessary in music is a kind of veil that requires we make decisions about our directions, and the directing of attention serves to define immediate or local structural disposition. This is one of the areas that allows the richness of musical experience in that there are always several paths to be taken in the accumulation of a musical form. Another psychological issue concerns how the assembly of mental schemata that create expectations feeds back to the lower level organizational processes and influences what is actually constructed and expected as a musical form. It is at this level that one can really begin to deal with the notion of musical grammar. There are clearly similarities and differences between musical and language grammars, but not a great deal of clarity about the nature of these differences. The problem is badly in need of being properly situated in the framework of cognitive science. Sloboda (1985, chap. 2) poses the following question with respect to musical grammar: what does the structure of music tell us about the structure of mind? One view is that the rules of grammar are psychological procedures that are used to organize music. The problem with this is that there is no such thing as a unique grammar for any body of data on listeners' responses to music. 
Humans tend to violate the rules that seem to account for their behavior, which means the grammars are wrong and that the rules are only a rough approximation. Another view is that some features of grammar have implications for the general way in which we think about psychological principles: a) we decide what to emit by applying a finite set of rules to some unspoken thought we wish to utter, whether this utterance be a musical phrase or a sentence in language, b) listeners can sort utterances into acceptable and unacceptable, c) structures embodied in generative grammars have psychological reality: elements closely linked in a grammar are closely linked psychologically, whatever the form of cognitive representation. The first of these features applies more to composition and improvisation than to listening. As with language, we might imagine that from the flash of musical insight a composer has, there are any number of ways of "uttering" a musical sequence that realizes this insight structurally. The schemata (or conceptual tools and technical training) that a composer or improvisor has at his or her disposal can be a limiting factor in how the insight is made concrete. The second feature has some rather heavy implications with respect to what is conventional and may be too restrictive a concept for many composers' tastes. What is acceptable relates, of course, to an acquired sense of what conforms to the convention of the grammar in question. Since the possible grammars of music have multiplied significantly in the latter part of our century, it is hard to imagine what acceptable or unacceptable means in any general sense, even if one were to limit the population of people to which the grammars apply to those highly experienced in listening to and studying contemporary music. There have actually been some reasonably successful attempts at experimentation on the psychological reality of conventional musical grammars. Stoffer (1985), for example, employed techniques drawn from psycholinguistics to study the representation of phrase structure in . He found evidence that listeners use tacit (or unconscious) knowledge about phrase structures expected in musical forms taken from the Western tonal/metric idiom and that this knowledge structure is organized in the form of hierarchies of mental representations of musical concepts. Of interest to our consideration in the processing of musical discourse is his claim that the act of "listening attentively to music may be characterized by an interaction between several cognitive schemata that form a hierarchy of musical concepts according to abstractness or generality" (p. 217). The results demonstrate the value of his syntax model for modelling structural aspects of the processing of perceived music. Nonetheless, there are restrictions on the way we might postulate a grammar to be used either for production or for comprehension. It is clear that in music, the grammar is conformed to in a rather loose way. In composing, one may know a grammar and be able to adhere to it. However, the composer also insists on moving outside the grammar.
Having a global view of it, one can anticipate the listening strategies that are derived from our own culture and which are used to structure our musical experience, and seek ways to thwart these strategies in interesting ways. What constitutes an interesting way that is still somehow understandable is an important question in the realm of aesthetics. Grammars that are shared among a community of music listeners, however, do not generate musical compositions. Their existence is a major consideration in determining the nature and degree of freedom a composer can exercise to transform and extend a musical style (Sloboda, 1985, chap. 2). It seems, though, that there is some confusion in 20th Century music between stepping outside a consensual grammar and creating a set of arbitrary methods of composition and stepping outside of those. While the composer may think this creates tension, it presumes the listener "feels" the method in order to be able to experience a departure from it as a violation or deviation of expectation. The important element here is the necessity of a commonly shared set of mental schemata that allows the listener to sense the music as going in the way the composer thinks it is going. The primal importance of directed motion in music and of the psychological constraints on its experience comes very much to the fore. It is here as well that the great importance of the nature of underlying mental representations for the processes of organization becomes apparent.

Directed motion depends on several things. First of all, motion of any kind is perceived with respect to a stable ground. In the case of music, we might imagine that the stable ground is provided by processes that seek stable mental structures through which the incoming information is organized on many levels (see Povel & Essens, 1985, on the perception of temporal patterns in this regard). This bias toward regularity instantiates a kind of perceptual inertia. To move away from stability is to induce a sense of motion in a psychological sense. But for the motion to appear directed, some kind of trajectory must be inferred, some path of transformation has to be understood, and mental schemata must have been acquired that anticipate an end point of this trajectory. Thus, one must have a sense not only that something changes but must sense as well the nature of the process of change and the direction implied by this process. These kinds of anticipatory mental schemata direct our attentional processes to certain aspects of the musical structure, prime us for the arrival of certain things. It is in the time course of the trajectory that projects toward the anticipated event's arriving that, according to Meyer (1956), lie the seeds of affect and meaning in musical experience. There are, however, many unanswered questions from a psychological standpoint about the nature of directed motion and the understanding of the flow of musical discourse. This final level of the theory must integrate all the aspects of internal representation of musical elements, from the smallest to the largest levels of structure, and the processes by which these are accumulated in memory and give rise to our experience of form. This addresses again the major question of assembling a received structure into an experienced form.

The apprehension of musical form

Structure is in the world, either notated on paper, stored in computer memory, on magnetic audio tape, impressed on vinyl, or present as vibrations in the air. Form is in the mind and is thus limited by the possibilities of mind. In the preceding pages we have rapidly scanned some of the psychological considerations of what underlies the deriving of musical form from the acoustic structures presented to our ears. The mind seeks a coherence of organization at various levels of structure. As such, comprehension implies the presence of internal representations and of organizational processes that operate upon these mental structures. Langer (1942) proposes the notion of structural resonance to approach the problem of the origins of meaning and emotional response to music. The structural resonance hypothesis can be viewed as a body of psychological constraints. There must be a logical resemblance of the structure of music and experience. When these symbolic structures (or mental schemata) are similar, an important requirement for a connotative relationship is satisfied and musical structures can have meaning. The problem of the psychologist is to show that this resemblance of form exists at all of the levels believed to be involved in music comprehension and to demonstrate the resonance that is believed to take place. This task is most likely far beyond our scientific capabilities at present, but there are several notions here that recall things we have already discussed. A theory of musical organization based on the assembly of mental schemata, including their anticipatory nature, their ability to be assembled into hierarchies and our tendency to use them from higher levels to lower ones in the recognition and processing of complex structures, lends itself very well to a move in this direction.

There are several properties of musical structures that recommend them for symbolic use and for the eventual extraction of musical "meaning." They are composed of separable items that are easily distinguishable, easily produced and easily combined in a great variety of ways. These are, in effect, psychological constraints on form-bearing elements in general. The criterion of categorization implies also the importance of having prototypes of categories that are the more stable and easily recognized Gestalt forms of which well-formed musical motivic material is made. These discrete identities are essential for the building of complex syntax and structure. These forms must not only be readily distinguished one from another, but must also be readily remembered and repeated. However, as Boulez (this volume) remarks, the building blocks of musical structure must also have a certain neutrality of identity if they are not to invoke too strong of a centrifugal force that thrusts them into conflict with the structure of which they should be a part.
They must not play too strong a role on their own, one that overshadows their structural and semantic function in the music. A relative neutrality of identity allows a certain degree of blending and a certain malleability of function, by which each element tends to modify the character of the others when they are put into combination. This is a crucial factor for a semantic network, that each element be able to serve each other element as context.

One of the problems with musique concrète is that the sound elements selected as a vocabulary for a given piece often have such strong references to everyday life that they are made to cohere with an overriding structure only with great difficulty. They tend to stand out on their own and thus do not contribute to the more global structure. In a sense, the material is not only too identifiable but is also too discontinuous or categorized to be assimilable into a form that is foreign to its already strong semantic function.

Deriving a form from a structure requires, in addition to categorization of the material, a comprehension of its transformation and elaboration. The process of transformation must be sensed, if not completely understood. It is, of course, not very interesting if everything that happens in a composition is absolutely transparent. One is very quickly bored by this type of music. However, some impression of the nature and of transformation is essential for one to appreciate the directionality of music. Boulez (1986) states that a heavy directionality is necessary in the nature of a device or method for it to be useful musically. This must also be in the direction of the musical necessity. Transformations must be non-neutral.

The question concerning the relation between structure and form in music asks, in a broad sense, what the nature of the comprehension of music is and, above all, what its capacities, tendencies and limits are. What does it mean to "comprehend" or "apprehend" music? What might the sciences of human cognition bring to the search for understanding? Are there universal and culturally bound aspects to modern comprehension? How is music comprehension affected by incidental experience with music and by musical training? To what extent is comprehension tied to conscious awareness of what one is experiencing? In this sense we speak often of "feeling" more than "understanding" musical form, of "apprehending" more than "perceiving." Rosen (1971, p. 35) remarks on the psychological reality of structural hearing approaches to music analysis and states that the notion that these structures are consciously perceived is unsatisfactory. It "reduces" a piece too far with respect to important heard elements and needs to be further developed with a consideration of perceived and stored motivic elements and their relation to an inferred structure. The structure is extended over a much greater expanse of time than we can consciously perceive. Our limits of appreciation of musical form are thus determined by the limits of what we can represent in memory. The extent to which we can follow a musical discourse depends on our having previously acquired the necessary grammatical structures or on our being able to infer or reconstruct these structures from the musical material itself. This is the process of accumulating and experiencing a musical form.
This process is an essential substance of music itself. The experience of music involves acts of the mind, and the study of this experience and of the human knowledge embodied in musical culture warrants the attention of the sciences of mind. In reading the reflections of the musicians and psychologists in the following articles we should keep the reciprocal question in view: how well does this direction of thought correspond with and nourish musical thought itself?

Acknowledgements

I am deeply indebted to four recent works that were the source of much inspiration for this article, namely, Spender (1980), Lerdahl & Jackendoff (1983), Sloboda (1985) and Dowling & Harwood (1986). I would like to express my gratitude to Deke Dusinberre, Dick Carter and Marco Stroppa for thoughtful and thorough critiques of the manuscript. I would also like to thank Nigel Osborne for the possibility of realizing this ambitious project. Finally I would like to thank my musical and scientific colleagues at IRCAM for inspiration and for keeping me honest and more or less on the track. Of course, any detours that are less than musical are no fault of theirs.

Notes

1. Much of the material in this section was derived from a thorough review of the history of music psychology in Spender (1980, section I).
2. This summary of Sloboda's (1986) paper is derived from a recording of an oral presentation of the material given to the Belgian Psychological Society seminar on the Psychology of Music in Brussels in December 1985. There may have been changes in the printed version to which the reader is referred for a more thorough treatment of the subject.
3. The readers are invited to verify for themselves whether this coherence is actually taking form by consulting the several new books and journals on music psychology that are listed in the references.
4. Some people question whether we can even still speak of "language" anymore with respect to Western contemporary music. We would seem to be in a period after the fall of the tonal Tower of Babel, where there appear to be as many "languages" as there are composers. Can we still call it "language" if it is not "conventional" in the sense of being shared by a population of people? Or does a global look at these reveal a certain number of common characteristics that will eventually be absorbed by the culture?
5. It should be understood that the notions of representation and computation are terms that we use in order to express certain generalizations about what we believe to be happening in the human mind (Pylyshyn, 1985, p. 1). We use the concepts to explain the way the mind functions. In order to explain things, we need a vocabulary that allows the concepts to be communicated.
6. Much of the material briefly reviewed in this section was drawn from two important recent books on music cognition: The Musical Mind by John Sloboda (1985) and Music Cognition by Jay Dowling & Dane Harwood (1986).
7. There is a vast amount of research on memory processes. For a fairly complete introduction to current thought in this field refer to chapters 8-11 in Lindsay & Norman (1977) and chapters 6-7 in Anderson (1985).
8. See the special issue of The Behavioral and Brain Sciences, vol. 3, no. 1, March 1980, on the foundations of cognitive science.
9. For more in-depth reading on this level of perception refer to Moore (1982) and Plomp (1976).
10. The discussion of the next two levels of musical organization will necessarily be very general and speculative as there has not been much research on the question.

References

J.R. Anderson, Cognitive Psychology and its Implications, 2nd ed. (W.H. Freeman, New York, 1985).
S. Arom, Structuration du temps dans les musiques d'Afrique centrale: périodicité, mètre, rythmique et polyrythmie, Revue de Musicologie, 70, 5-36 (1984).
S. Arom, Polyphonies et polyrythmies d'Afrique centrale: Structure et méthodologie, Doctorat d'Etat thesis, University of Paris-Sorbonne (1985).
L.E. Bahrick, A.S. Walker & U. Neisser, Selective looking by infants, Cognitive Psychology, 13, 377-390 (1981).
G.J. Balzano, The pitch set as a level of description for studying musical pitch perception, in: M. Clynes (ed.), Music, Mind and Brain: The Neuropsychology of Music, pp. 321-351 (Plenum Press, New York, 1982).

F.C. Bartlett, Remembering: A Study in Experimental and Social Psychology (Cambridge University Press, London, 1932).
D. Berlyne, Æsthetics and Psychobiology (Appleton-Century-Crofts, New York, 1971).
J. Bloch & R.B. Dannenberg, Real-time computer accompaniment of keyboard performance, Proceedings of the 1985 International Computer Music Conference, Vancouver, British Columbia, Canada, pp. 279-290 (Computer Music Association, Berkeley, 1985).
A.S. Bregman, Asking the "what for" question in auditory perception, in: M. Kubovy & J. Pomerantz (eds.), Perceptual Organization, pp. 99-118 (Erlbaum, Hillsdale, N.J., 1981).
A.S. Bregman, J. Abramson, P. Doehring & C. Darwin, Spectral integration based on common amplitude modulation, Perception and Psychophysics, 37, 483-492 (1985).
A.S. Bregman & J. Campbell, Primary auditory stream segregation and the perception of order in rapid sequences of tones, Journal of Experimental Psychology, 89, 244-249 (1971).
A.S. Bregman & S. Pinker, Auditory streaming and the building of timbre, Canadian Journal of Psychology/Revue Canadienne de Psychologie, 32, 19-31 (1978).
P. Boulez, Composition et technologie, conference presented in the seminar Nouvelles Technologies et Mutation des Savoirs, IRCAM, Paris, France (October, 1986).
X. Chabot, R.B. Dannenberg & G. Bloch, A workstation in live performance: Composed improvisation, Proceedings of the 1986 International Computer Music Conference, Den Haag, The Netherlands, pp. 57-60 (Computer Music Association, Berkeley, 1986).
N. Chomsky, Aspects of the Theory of Syntax (MIT Press, Cambridge, Mass., 1965).
E.F. Clarke, Structure and expression in rhythmic performance, in: P. Howell, I. Cross & R. West (eds.), Musical Structure and Cognition, pp. 209-236 (Academic Press, London, 1985).
G.L. Dannenbring & A.S. Bregman, Stream segregation and the illusion of overlap, Journal of Experimental Psychology/Human Perception and Performance, 2, 544-555 (1978).
C.J. Darwin, Perceptual grouping of speech components differing in and onset-time, Quarterly Journal of Experimental Psychology, 33A, 185-207 (1981).
C.J. Darwin, Perceiving vowels in the presence of another sound: Constraints on formant perception, Journal of the Acoustical Society of America, 76, 1636-1647 (1984).
C. Delezenne, Mémoires sur les valeurs numériques des notes de la gamme, Recueil des travaux de la Société des sciences, de l'agriculture et des arts, de Lille, 1, 1-56 (1826-7).
I. Deliège, Les règles préférentielles de groupement dans la perception musicale, thesis, Université Libre de Bruxelles, Brussels (1985).
I. Deliège, Grouping conditions in music listening: An approach to Lerdahl & Jackendoff's grouping preference rules, Music Perception (in press).
D. Deutsch, The processing of structured and unstructured tonal sequences, Perception and Psychophysics, 28, 381-389 (1980).
W.J. Dowling, Rhythmic groups and subjective chunks in memory for melodies, Perception and Psychophysics, 14, 37-40 (1973).
W.J. Dowling, Melody information processing and its development, in: D. Deutsch (ed.), The Psychology of Music, pp. 413-429 (Academic Press, New York, 1982).
W.J. Dowling & J.C. Bartlett, The importance of interval information in long-term memory for melodies, Psychomusicology, 1, 30-49 (1981).
W.J. Dowling & D.S. Fujitani, Contour, interval, and pitch recognition in memory for melodies, Journal of the Acoustical Society of America, 49, 524-531 (1971).
W.J. Dowling & D. Harwood, Music Cognition (Academic Press, New York, 1986).
D. Ehresman & D. Wessel, Perception of timbral analogies, Rapports IRCAM, no. 13 (IRCAM, Paris, 1978).
R. Erickson, Sound Structure in Music (University of California Press, Berkeley, 1975).
R. Erickson, New music and psychology, in: D. Deutsch (ed.), The Psychology of Music, pp. 517-536 (Academic Press, New York, 1982).
G. Fechner, Elemente der Psychophysik (Breitkopf und Härtel, Leipzig, 1860).
G. Fechner, Vorschule der Æsthetik (Breitkopf und Härtel, Leipzig, 1876).
P. Fraisse, Time and rhythm perception, in: E.C. Carterette & M.P. Friedman (eds.), Handbook of Perception, vol. 8: Perceptual Coding, pp. 203-254 (Academic Press, New York, 1978).

P. Fraisse, Rhythm and tempo, in: D. Deutsch (ed.), The Psychology of Music, pp. 149-180 (Academic Press, New York, 1982).
R. Francès, La perception de la musique, 2nd ed. (J. Vrin, Paris, 1984).
J.L. Gould & P. Marler, Learning by instinct, Scientific American, 256(1), 62-73 (1987).
J.M. Grey, Multidimensional perceptual scaling of musical timbres, Journal of the Acoustical Society of America, 61, 1270-1277 (1977).
J.W. Hall, M.P. Haggard & M.A. Fernandes, Detection in noise by spectro-temporal pattern analysis, Journal of the Acoustical Society of America, 76, 50-56 (1984).
L. Harrison, Lou Harrison's Music Primer: Various Items About Music to 1970 (C.F. Peters, New York, 1971).
W.M. Hartmann, S. McAdams & B.K. Smith, Matching the pitch of a mistuned harmonic in an otherwise periodic complex tone, Journal of the Acoustical Society of America, 80, S93(A) (1986).
H. von Helmholtz, On the Sensations of Tone as a Physiological Basis for the Theory of Music, 2nd English ed. trans. in 1885 by A.J. Ellis from the 4th German ed. 1877 (reprinted Dover, New York, 1954).
J. Hochberg, Organization and the Gestalt tradition, in: E.C. Carterette & M.P. Friedman (eds.), Handbook of Perception, vol. 1: Historical and Philosophical Roots of Perception, pp. 180-211 (Academic Press, New York, 1974).
R.S. Jackendoff, Semantics and Cognition (MIT Press, Cambridge, Mass., 1983).
N.F. Johnson, The role of chunking and organization in the process of recall, in: G. Bower (ed.), Psychology of Language and , vol. 4 (Academic Press, New York, 1970).
D. Klahr, W.G. Chase & E.A. Lovelace, Structure and process in alphabetic retrieval, Journal of Experimental Psychology/Learning, Memory and Cognition, 9, 462-477 (1983).
W. Köhler, Gestalt Psychology (Horace Liveright, New York, 1929).
S. Kosslyn, Image and Mind (Harvard University Press, Cambridge, Mass., 1980).
C.L. Krumhansl, The psychological representation of musical pitch in a tonal context, Cognitive Psychology, 11, 346-374 (1979).
C.L. Krumhansl, J. Bharucha & M.A. Castellano, Key distance effects on perceived harmonic structure in music, Perception and Psychophysics, 31, 75-85 (1982).
C.L. Krumhansl & F.C. Keil, Acquisition of the hierarchy of tonal functions in music, Memory and Cognition, 10, 243-251 (1982).
C.L. Krumhansl & R.N. Shepard, Quantification of the hierarchy of tonal functions within a diatonic context, Journal of Experimental Psychology/Human Perception and Performance, 5, 579-594 (1979).
T.S. Kuhn, The Structure of Scientific Revolutions, 2nd ed. (University of Chicago Press, Chicago, 1972).
S. Langer, Philosophy in a New Key (Harvard University Press, Cambridge, Mass., 1942).
F. Lerdahl, Cognitive constraints on compositional systems, in: J. Sloboda (ed.), Generative Processes in Music (Oxford University Press, Oxford, in press).
F. Lerdahl & R.S. Jackendoff, A Generative Theory of Tonal Music (MIT Press, Cambridge, Mass., 1983).
F. Lerdahl & R.S. Jackendoff, An overview of hierarchical structure in music, Music Perception, 2, 229-252 (1984).
P.H. Lindsay & D.A. Norman, Human Information Processing: An Introduction to Psychology, 2nd ed. (Academic Press, New York, 1977).
H.C. Longuet-Higgins, Perception of melodies, Nature, 263, 646-653 (1976).
H.C. Longuet-Higgins & C.S. Lee, The rhythmic interpretation of monophonic music, Music Perception, 1, 424-441 (1984).
R.W. Lundin, An Objective Psychology of Music, 2nd ed. (Ronald Press, New York, 1967).
E. Mach, Beiträge zur Analyse der Empfindungen (Jena, 1886) [cited in Spender (1980)].
S. Makeig, Affective versus analytic perception of musical intervals, in: M. Clynes (ed.), Music, Mind and Brain: The Neuropsychology of Music, pp. 227-250 (Plenum Press, New York, 1982).
D. Marr, Vision (W.H. Freeman, New York, 1982).
S. McAdams, Spectral fusion and the creation of auditory images, in: M. Clynes (ed.), Music, Mind and Brain: The Neuropsychology of Music, pp. 279-298 (Plenum Press, New York, 1982).

S. McAdams, The auditory image: A metaphor for musical and on auditory organization, in: W.R. Crozier & A.J. Chapman (eds.), Cognitive Processes in the Perception of Art, pp. 289-324 (North-Holland, Amsterdam, 1984a).
S. McAdams, Spectral fusion, spectral parsing and the formation of auditory images, Ph.D. dissertation (available as Dept. of Music Report STAN-M-22), Stanford, California (1984b).
S. McAdams & A.S. Bregman, Hearing musical streams, Computer Music Journal, 3(4), 26-43 (1979).
S. McAdams & K. Saariaho, Qualities and functions of musical timbre, Proceedings of the 1985 International Computer Music Conference, Vancouver, British Columbia, Canada, pp. 367-374 (Computer Music Association, Berkeley, 1985).
L. Meyer, Emotion and Meaning in Music (University of Chicago Press, Chicago, 1956).
G.A. Miller, The magical number seven, plus or minus two: Some limits on our capacity for processing information, , 63, 81-97 (1956).
C. Monahan, Parallels between pitch and time: The determinants of musical space, Ph.D. dissertation, University of California, Los Angeles (1984).
B.C.J. Moore, An Introduction to the Psychology of Hearing, 2nd ed. (Academic Press, London, 1982).
B.C.J. Moore, B.R. Glasberg & R.W. Peters, Thresholds for hearing mistuned partials as separate tones in harmonic complexes, Journal of the Acoustical Society of America, 80, 479-483 (1986).
J.L. Mursell, The Psychology of Music (W.W. Norton, New York, 1937).
U. Neisser, Cognition and Reality (W.H. Freeman, New York, 1976).
L.P.A.S. van Noorden, Minimum differences of level and frequency for perceptual fission of tone sequences ABAB, Journal of the Acoustical Society of America, 61, 1041-1045 (1977).
C.F.A. Pantin, The Relations between the Sciences (Cambridge University Press, London, 1968).
H. Partch, The Genesis of a Music, 2nd ed. (DaCapo Press, New York, 1974).
S. Pinker, Visual cognition: An introduction, in: S. Pinker (ed.), Visual Cognition, pp. 1-64 (MIT Press, Cambridge, Mass., 1984).
R. Plomp, Aspects of Tone Sensation: A Psychophysical Study (Academic Press, London, 1976).
D.J. Povel, Internal representation of simple temporal patterns, Journal of Experimental Psychology/Human Perception and Performance, 7, 3-18 (1981).
D.J. Povel, A theoretical framework for rhythm perception, Psychological Research, 45, 315-337 (1984).
D.J. Povel & P. Essens, The perception of temporal patterns, Music Perception, 2, 411-440 (1985).
Z. Pylyshyn, Computation and Cognition (MIT Press, Cambridge, Mass., 1985).
S.K. Reed, Structural descriptions and the limitations of visual images, Memory and Cognition, 2, 329-336 (1974).
S.K. Reed & J.A. Johnsen, Detection of parts in patterns and images, Memory and Cognition, 3, 569-575 (1975).
F. Restle, Theories of serial pattern learning: Structural trees, Psychological Review, 77, 481-495 (1970).
F. Restle, Serial patterns: The role of phrasing, Journal of Experimental Psychology, 92, 385-390 (1972).
J.C. Risset & D.L. Wessel, Exploration of timbre by analysis and synthesis, in: D. Deutsch (ed.), The Psychology of Music, pp. 26-58 (Academic Press, New York, 1982).
C. Rosen, The Classical Style: Haydn, Mozart, Beethoven (W.W. Norton, New York, 1971).
M.T.M. Scheffers, Sifting vowels: Auditory pitch analysis and sound segregation, Doctoral dissertation, University of Groningen, The Netherlands (1983).
M. Schoen, The Psychology of Music (Ronald Press, New York, 1940).
C. Seashore, Psychology of Music (McGraw-Hill, New York, 1938).
R.N. Shepard, Circularity in judgments of , Journal of the Acoustical Society of America, 36, 2346-2353 (1964).
R.N. Shepard, Structural approximations of musical pitch, in: D. Deutsch (ed.), The Psychology of Music, pp. 343-390 (Academic Press, New York, 1982).

R.N. Shepard, Ecological constraints on internal representation: Resonant kinematics of perceiving, imagining, thinking and dreaming, Psychological Review, 91, 417-447 (1984).
R.N. Shepard & L.A. Cooper, Mental Images and their Transformations (MIT Press, Cambridge, Mass., 1982).
R.N. Shepard & D. Jordan, Auditory illusions demonstrating that pitches are assimilated to an internalized musical scale, paper presented at the Psychonomic Society meeting, Minneapolis, Minnesota (November, 1982).
J. Sloboda, The Musical Mind: The Cognitive Psychology of Music (Clarendon Press, Oxford, 1985).
J. Sloboda, Cognition and real music: The psychology of music comes of age, Psychologica Belgica, 26, 199-219 (1986).
J.F. Sowa, Conceptual Structures: Information Processing in Mind and Machine (Addison-Wesley, Reading, Mass., 1984).
N. Spender, Psychology of music (I-III), in: S. Sadie (ed.), The New Grove's Dictionary of Music and Musicians, vol. 15, pp. 388-427 (1980).
D. Sperber & D. Wilson, Relevance: Communication and Cognition (Basil Blackwell, Oxford, 1986).
T.S. Stoffer, Representation of phrase structure in the perception of music, Music Perception, 3, 191-220 (1985).
C. Stumpf, Tonpsychologie (Hirzel, Leipzig, 1883).
J. Sundberg, A. Askenfelt & L. Frydén, Musical performance: A synthesis-by-rule approach, Computer Music Journal, 7(1), 37-43 (1983).
J.P. Swain, The need for limits in hierarchical theories of music, Music Perception, 4, 121-148 (1986).
E.B. Titchener, A Text-book of Psychology (Macmillan, New York, 1909).
E. Tulving, Episodic and semantic memory, in: E. Tulving & W. Donaldson (eds.), Organization of Memory, pp. 381-403 (Academic Press, New York, 1972).
B. Vercoe, The synthetic performer in the context of live performance, Proceedings of the 1984 International Computer Music Conference, Paris, France, pp. 199-200 (Computer Music Association, Berkeley, 1985).
D. Wessel, Timbre space as a musical control structure, Computer Music Journal, 3(2), 45-52 (1979).
W. Wundt, Grundriss der Psychologie (Engelmann, Leipzig, 1896).