AN INTRODUCTION TO : THE GESTURE IN MODELIZATION

Ole Kühl Independent researcher

ABSTRACT

Musical meaning is systematically grounded in human experience [5]. The semantic relationship between musical structure and the biological constraints of the human brain and auditory system is described in the following. It is largely mediated through the musical gesture, and is therefore tied to the built-in tendency of human temporal perception to organize input in chunks. These ideas, which are of interest inside the man-machine paradigm, are supported by recent research in neurobiology and psychology.

1. INTRODUCTION

The building of mathematical models and computer simulations of human behavior must be grounded in our knowledge of human action, perception and cognition. In this theoretical paper I aim to provide an overview of research pertaining to a description of musical cognition. I shall further offer some speculations based on this body of knowledge that can be useful in further modelizations of musical behavior inside the man-machine paradigm.

In developing and refining models of musical behavior that can be tested against real-life musical production, something like motivation and intentionality becomes important. Music is meaningful, and musical acts convey meaning and intentional behavior. Meaning, then, becomes a factor to be considered in modelization.

I propose that the semantic dimension of music be seen as linked to the musical gesture. Evidence from psychology [9,11] and neuroscience [15] supports the view that a musical phrase represents the simulation of a motor act [2,5,6]. Therefore the musical gesture can be seen as an intentional and meaningful act. This brings the perceptual mechanism of chunking into the foreground.

2. THE QUESTION OF MEANING IN MUSIC

The often discussed question of meaning in music remains unresolved. In order to build a satisfactory model we must examine the concept of meaning itself, the ‘meaning of meaning’. The problem seems to lie with the traditional, narrow definition of meaning derived from language studies, where the meaning of a word is determined as the precise definition found in a dictionary or a similar text. Words are defined by words.

A broader definition of ‘meaning’ has been made possible by cognitive science, which sees meaning as based on acquired embodied experience. For some event to be meaningful, it must in some way comply with prior experience, and when we follow this line of ‘interpretants’ backwards in personal time, we find that experience is ultimately based on simple, generic bodily experience. According to George Lakoff and Mark Johnson, language itself derives its meaning from such abstract structures [6], and, according to Merlin Donald, language has developed from a primordial level of mimetic communication, the sharing of gestures [2].

A proper definition of musical meaning seems to be suspended between these two extremes: on the one hand, music cannot ‘mean’ something the way a word or a sentence ‘means’; but, on the other hand, we have all had the experience that music can convey moods, emotional content, etc. In order to develop our understanding of the way human beings use music as an expressive and communicative strategy, it therefore seems feasible to study what recent relevant research has taught us about the biological conditions for, and the cognitive structures involved with, music.

3. BIOLOGICAL CONSTRAINTS

The human auditory system processes huge amounts of sonic information, often referred to as the auditory stream [1]. When listening to a piece of music, what enters the ears as sound waves or kinetic energy ultimately emerges as music in the human mind. Auditory perception pre-attentively organizes and structures this information in certain ways, partly dependent on innate, biological properties of the system, and partly dependent on learned, culturally derived mechanisms (which will not be considered here). Built-in properties of the auditory system are sometimes seen as manifesting musical universals, features of music that can be found in all cultures at all times [4,13].

One of the most interesting of the properties of the auditory system may be that of pattern extraction [1,10]. The human brain needs to organize its input in order to avoid overflow of information. Infants are extremely well equipped to recognize and extract regularities and patterns from their surroundings, and language acquisition, for instance, is completely dependent on this ability. In music, such features as 1) the
perception of simple pitch ratios, including octave similarity, fourths and fifths, and 2) musical pulse (regularity extraction) are probably tied to this ability [4]. Other interesting features that occur universally are: 3) the categorization of notes, and the division of the scale into between 5 and 7 unequal steps; 4) perception of melodic contour (upwards and downwards movement); 5) the formation of groups; and 6) the meter as an evoked response [4] (see table 1).

    1. Perception of simple pitch ratios (octaves, fourths and fifths)
    2. Regularity extraction
    3. Categorization of notes in scales of 5 to 7 steps
    4. Perception of melodic contour
    5. Group formation
    6. Meter as evoked response

Table 1. Biological constraints on musical perception.

Characteristics of this type shape all simple music everywhere and at all times (of course, art music will often transgress these limitations). To a Western ear it will be interesting to note the absence from the list of three musical features that we would expect to be generic: the major/minor (happy/sad) distinction; chords and harmonic development; and the so-called ‘Mozart-effect’, according to which large-scale architectural properties of musical form are important devices. Extant evidence does not support the view that these musical elements are given by nature. However, this whole question is too complex to be given proper treatment here.

4. TEMPORAL PROPERTIES OF THE BRAIN

    > 30 msec    Minimum distinction window
    300 msec     Pre-attentive window
    3 sec        Subjective present
    > 30 sec     Extended present

Table 2. Temporal constraints on musical cognition.

As seen from the standpoint of the cognitive construction of music, the temporal properties of the brain show some noteworthy features. The temporal aspect of human cognition is not only exceedingly complex, but it is also difficult to investigate even at the current level of technology. How does the brain process temporal events in real-time? No doubt, re-entry loops are involved in an on-line integration of structures with perceived content [16]. Certain time-windows have been proposed as innate properties of the brain, and they can be considered as a second set of biological constraints comparable to those given above (see table 2).

There is a minimum distinction window at 10 to 30 ms, which is of little interest here, below which one sound becomes indistinguishable from another [7,8]. More interesting is the pre-attentive window at approximately 300 to 500 ms, which marks the boundary between two distinct modes of perception [3]. While the shorter sounds seem to reverberate through the auditory system (echoic memory?), some of the properties of the longer sounds are sustained in working memory, possibly via re-entry loops.

Of even more interest is the window of the subjective present, sometimes called the three second window [7,8,11]. Inside this, the unstructured flow of temporal information is organized as a single temporal event. Such events cannot be shorter than half a second, and are seldom longer than 6 to 8 sec. Normally they range from 2 to 5 sec. The mechanism seems to work in a uniform way across modalities: for language, motor events, visual events, and music. Most likely, this organization of temporal events in chunks (in the present context called musical gestures [5]) of a certain size is necessary for the brain to avoid overflow and chaos. It indicates a deeper, amodal level of perception, as claimed by Daniel Stern and others [9], and, as it has been shown to be functional in neonates, it can be seen as an innate property of the brain [11].

One more time-window should be mentioned here, namely that of the extended present [8,11]. It concerns our ability to keep a limited number of events in view at the same time, and to conceptualize them as units at a higher level of organization. Such formations are comparable to structures like sentences, groups of motor-actions with a common purpose, or simple songs.

5. CHUNKING

Both the expression and the perception of music are ruled by the same biological constraints. Perhaps the most striking feature in the evidence presented above is that auditory perception organizes the auditory stream in chunks of a certain size. This process is largely brought about through a top-down projection of culturally derived schemas: we hear (more or less) what we expect to hear, and we organize our sonic perception according to culturally derived rules.

There is evidence of chunk formation from several sources. The gestalt quality of grouping found in primitive perception, as discussed above, is one [1]. Another set of evidence derives from the mechanism of binding, first found in visual perception but later thought to be of a more general nature and called
selective binding [2]. Yet another set of evidence can be deduced from the peculiar quality of projection or transposition of temporal events from one modality to another, primarily the mapping of movement patterns (gestures) on to sound patterns [5,9]. Finally we see that most music, like language and other temporal event series, is organized in melodic or rhythmic phrases of a certain size [8].

The musical chunk, or gesture, represents a mesolevel of cognitive organization, wedged between the microlevel and the macrolevel. The mesolevel seems to be the basic or generic level, at which we interact with the world. When we access the world at this level of more or less pre-organized chunks, the brain is spared the huge burden of work necessary to process millions of tiny bits of information, and can move directly to the operational mode.

At the mesolevel, notes, rhythmic information, and sound patterns are organized in a single gestalt, a musical phrase or gesture. From the level of the chunk we can ‘look down’ towards the microlevel of subchunks, single items of information on pitch, timing and sound. And we can ‘look up’ towards the macrolevel of superchunks, where chunks are grouped at a higher level of cognitive organization, primarily as melodies [14].

6. THE SONG

As a working hypothesis we can say that chunks segment the auditory stream and compress several types of information, such as melodic curve, intensity, dynamics, and timbre, into a single gestalt. At a higher level in the temporal hierarchy these chunks are organized in groups or sequences, such as sentences and melodies.

It is of interest to note that nursery rhymes and children’s songs from all cultures are organized according to a simple schema, in compliance with the chunking hypothesis (see fig. 1 and 2): a four-line stanza in moderate tempo, where each line lasts about three seconds and the whole stanza twelve seconds [11]. In fig. 1, we see a well-known children’s song. Each line consists of paired gestures, as shown in fig. 2.

    Itsy Bitsy spider climbing up the spout
    Down came the rain and washed the spider out
    Out came the sun and dried up all the rain
    Now Itsy Bitsy spider went up the spout again!

Figure 1. Structure of a nursery rhyme.

Figure 2. ‘Chunking’ a prototypical song.

In the sharing with others of activities like this, infants train and exercise their basic brain capacities of pattern extraction, chunking and grouping. This seems to be a basic form of organizing our cognition of events in the world, in which the micro-, the meso- and the macrolevel are integrated in purposeful action.

7. CONCLUSION

In discussing the biological constraints on musical perception, the most noteworthy feature may be the looseness of the constraints. Our basic biological equipment for music does not bring us anywhere near the notions we have of what music is: important features like harmony and musical form are absent at this level of primary adaptation. But when we think about the great variety of music in the world’s cultures, it clearly must be so. Music is a cultural artifact, and most of its properties take on their particular value inside its cultural setting.

At the generic level, the temporal constraints on perception become important factors to be considered. Not only nursery rhymes, but all kinds of songs and functional music are organized according to the temporal properties of our auditory system: the division in chunks, subchunks and superchunks is ubiquitous. This leads us to the notion of musical gesture.

The idea of the musical gesture includes ‘a strong sensorimotor component and a tight coupling between perception and action’ [12]. The sequential chunking of the auditory stream of sound is a vital built-in necessity of the perceptual system. The chunk, however, is not a gesture: it is merely a segmentation of time, demonstrating the limitations of our memory systems. The musical gesture, as an inner simulation of movement, arises in response to certain qualities of the patterns in the auditory stream (perceived for listeners, projected for players). The importance of this phenomenal action should not be overlooked in attempts to build models of musical behavior that are truly musical.
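
As a purely illustrative aside, the temporal segmentation discussed in sections 5 and 6 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a model proposed in the literature cited here: the function name chunk_onsets, the greedy grouping rule, and the evenly spaced half-second onsets are all hypothetical choices made for the example. Only the window sizes (about three seconds per line, twelve seconds per stanza, two gestures per line) follow the schema described above. In keeping with the distinction just drawn, the sketch covers only the chunk as a segmentation of time and says nothing about the gesture as a simulated movement.

```python
from typing import List


def chunk_onsets(onsets: List[float], max_span: float = 3.0) -> List[List[float]]:
    """Greedily group onset times (in seconds) into chunks.

    A new chunk starts whenever the next onset lies max_span seconds or more
    after the first onset of the current chunk. The 3-second default is an
    assumption borrowed from the 'three second window'; it is not a fitted value.
    """
    if max_span <= 0:
        raise ValueError("max_span must be positive")
    chunks: List[List[float]] = []
    for t in sorted(onsets):
        if chunks and t - chunks[-1][0] < max_span:
            chunks[-1].append(t)   # still inside the current window
        else:
            chunks.append([t])     # open a new chunk at this onset
    return chunks


# Hypothetical input: one onset every half second over a 12-second stanza.
stanza = [0.5 * i for i in range(24)]

lines = chunk_onsets(stanza, max_span=3.0)      # line-sized chunks
gestures = chunk_onsets(stanza, max_span=1.5)   # two gesture-sized chunks per line

print(len(lines), len(gestures))  # 4 8
```

With these assumed inputs the twelve-second stream falls into four line-sized chunks and eight gesture-sized chunks, which mirrors the four-line stanza with paired gestures. A fixed window is a deliberate simplification; real phrase boundaries depend on the patterns in the auditory stream and not on elapsed time alone.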

8. REFERENCES

[1] Bregman, A. Auditory Scene Analysis. The MIT Press, Cambridge, Mass., 1991.
[2] Donald, M. A Mind So Rare. W. W. Norton & Co., New York, 2001.
[3] Fraisse, P. “Rhythm and Tempo”. In Deutsch, D. (ed.), The Psychology of Music. Academic Press, New York, 1982.
[4] Justus, T. and Hutsler, J. “Fundamental Issues in the Evolutionary Psychology of Music: Assessing Innateness and Domain Specificity”, Music Perception, 23/1, 2005.
[5] Kühl, O. Musical Semantics. Peter Lang, Bern, 2007.
[6] Lakoff, G. and Johnson, M. Philosophy in the Flesh. Basic Books, New York, 1999.
[7] London, J. Hearing in Time. Oxford University Press, Oxford, 2004.
[8] Snyder, B. Music and Memory. The MIT Press, Cambridge, Mass., 2000.
[9] Stern, D. The Interpersonal World of the Infant. Karnac, London, 1998 (1985).
[10] Tomasello, M. Constructing a Language. Harvard University Press, Cambridge, Mass., 2003.
[11] Trevarthen, C. “Musicality and the Intrinsic Motive Pulse”, Musicae Scientiae, Special Issue, 2000.
[12] Leman, M. and Camurri, A. “Understanding Musical Expressiveness Using Interactive Multimedia Platforms”, Musicae Scientiae, Special Issue, 2006.
[13] McDermott, J. and Hauser, M. “The Origins of Music: Innateness, Uniqueness, and Evolution”, Music Perception, 23/1, 2005.
[14] Godøy, R.I. “Gestural-Sonorous Objects: Embodied Extensions of Schaeffer’s Conceptual Apparatus”, Organised Sound, 11, 2006.
[15] Gallese, V. “The Inner Sense of Action: Agency and Motor Representations”, Journal of Consciousness Studies, 7/10, 2000.
[16] Edelman, G. Bright Air, Brilliant Fire. Basic Books, New York, 1992.
