Some Perspectives on Performed

Jeff Pressing Some on Performed Department of Psychology Perspectives University of Melbourne Sound and Music in Virtual Parkville, Victoria 3052 Australia Environments Abstract The great variety of functions possible for sound in virtual environments is surveyed in relation to the traditions that primarily inform them. These traditions are examined, classifying sound into the three categories of artistic expression, information transfer, and environmental sound. The potentials of and relations between sonification, algo- rithmic composition, musicogenic and sonigenic displays, virtual musical instruments and virtual sound sources are examined, as well as the practical technical limitations that govern performance control of MIDI and real-time DSP sound synthesis in coor- dination with visual display. The importance of music-theoretic and psychological re- search is emphasized. The issues and developed categorizations are then applied to a case study: the examination of a specific virtual environment performance by a team of workers in Australia in which the author worked as composer/performer/ programmer. I Introduction The importance of sound in the design ofvirtual environments was rec- ognized early (e.g., Sutherland 1965), but the scale of practical possibilities and participant expectations has seen dramatic recent growth. This growth is trace- able in the first instance to the explosion in computer audio that took place in the 1980s; that period saw the widespread advent of special-purpose micro- computer-based software, sophisticated digital synthesizers and samplers, digital signal-processing hardware, new audio formats, and establishment of a uni- versal musical performance control protocol, MIDI. These developments contributed to a rise in the quality of audio information storage and transmis- sion. More significantly, the digitization of audio and performance control en- abled vastly improved interactivity and the ready application of artificial intelli- gence methods to sound. These potentials have been furthered by developments in multiuser networking and teleconferencing. As a result, musical and sonic information can now easily proceed bidirec- tionally between human operators separated by a great distance, or between humans and computer agents that may be configured to act as composers, im- provisers, conductors, listeners, accompanists, signal processors, marketing agents, voice recognition and production systems, intelligent musical instru- databases of customized sound and a host of Presence, Vol. 6, No, 4. August 1997. 482-503 ments, environmentally effects, © 1997 by the Massachusetts Institute of Technology other less traditionally defined categories (Chadabe, 1984; Pressing, 1990, 482 PRESENCE: VOLUME 6, NUMBER 4 Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/pres.1997.6.4.482 by guest on 01 October 2021 Pressing 483 1992; Rowe, 1992). To an apparently greater degree calls, wind sounds, and the noises of machinery. TJiis than in the visual domain, nonsimulative approaches to classification is not disjunctive—for example, both music sound, drawing strongly on the structural, kinetic, emo- and speech can transmit emotional messages, and envi- tional, motor, and nonrepresentational properties of ronmentally based animal sounds provide information music, have grown in relevance over this period. shaped by evolution that those who know the code can The diversity of potentials of sound in software-based read. environments is also apparent through burgeoning de- Despite such overlap, this broad division into three velopments in sonification and auditory representation categories appears useful; notably, it parallels the norma- (e.g., Scaletti, 1994; Kramer 1994), sound localization tive breakdown into three separate audio tracks found in (e.g., Blauert, 1983; Middlebrooks and Green, 1991; film and video production (music, dialogue, and sound Durlach, 1991; Wenzel, 1992), human-computer inter- effects/ambience). There is also considerable evidence action (e.g., Blattner, Sumikawa, and Greenberg, 1989; that the primary components of these three categories Gaver, 1989; Bailas, 1994; Cohen, 1993), and the in the receive at least partially independent processing in the psychology of music and acoustic communication (e.g., human nervous system (Peretz, 1993). For example, it is Bregman, 1990; Truax, 1984; McAdams and Bigand, possible to find persons who display amusia without 1993; Handel, 1989; Lehrdahl and Jackendoff, 1983; aphasia, and others who display aphasia without amusia, Carlyon, Darwin, and Russell, 1993). Yet the results of although the two conditions are more commonly linked these developments inevitably operate within a cultural (Marin, 1982; Sergent, 1993). Likewise, other subjects context ofwell-defined traditions ofpresentation of can display amusia or language deficiencies without any sound in such venues as theatre, opera, concert halls, significant deficit in environmental sound recognition dance clubs, work environments, film, video, computer capacity (Peretz, 1993). We examine the three categories music, and performance art. Therefore, before focusing in turn. on the function of performed sound and music in virtual Music and song constitute the first, artistic/expressive environments, it seems essential to generally characterize category, and their functions are many. In the context of the various traditions of sound functionality in electronic multimedia presentation, music and song traditionally: media. • set or enhance scene character by identifying mood, emotion, pace, historical period, and physical loca- tion; 2 Roles of Sound in Electronic and • and the use of Software-based Media provide unity continuity by themes, leitmotivs, and musical progression; 2.1 Sound Function • heighten identity, realism, and engagement by emotional and gestural amplification through song and Sound and music can have a wide variety of func- diegetic sound. tions in any society (Radocy and Boyle, 1988). The dis- cussion here is restricted to the general functions of au- The leitmotiv, a technique developed by operatic com- dio in electronic media and software-based systems, poser Richard Wagner, is a distinct musical theme linked including virtual environments (VEs) and hypermedia. to a particular character or scene. There are many non- A useful starting point is the classification of sound as operatic examples, such as Sergei Prokofiev's Peter and artistic, informational, and/or environmental. Examples the Wolf, and Walt Disney's Fantasia. Diegetic sound is of artistic (or expressive) sound are all kinds of music sound that arises from an apparent source in the pre- and song; examples of sound whose main function is sented world. An example might be the sound that oc- information transfer are speech, alarms, and sonified curs if a character sings, plays a musical instrument, or data. Examples of environmental sounds include animal operates a power lathe. Downloaded from http://www.mitpressjournals.org/doi/pdf/10.1162/pres.1997.6.4.482 by guest on 01 October 2021 484 PRESENCE: VOLUME 6, NUMBER 4 In interactive environments, music has an additional mantic content), or emotional/personal (coded function: substantially by voice timbre and prosody). Another of 2 is • major component category provided to act as a medium ofinteraction between the par- what can be termed condition This com- ticipants) and the environment. by notification. ponent includes sound procedures such as audio cueing, This last function is central in VEs, leading to the idea alerts, alarms, "earcons" (Blattner, Sumikawa, and of the virtual musical instrument, where music is pro- and notification. These duced by participant operations within the virtual envi- Greenberg, 1989), boundary procedures may be one-shot or ongoing. In the first case ronment, as discussed in detail below. Conversely, gener- they provide on/off information and commonly act as a ated or performed music can generate visual, haptic, or cue to direct the eyes, engaging the psychological ori- other types of displays.1 These are discussed further be- also act to reinforce low under the heading of musicogenic displays. enting response; they commonly visual identification of objects. In the second (ongoing) The second category, sound emphasising information case they provide quasi-continuous information, typi- transfer, includes first of all speech, which in cinematic cally to allow uninterrupted monitoring, as in the use of and theatrical terms is either dialogue (including mono- audio beacons for navigation in a VE. logue) or narration/commentary. Dialogue is diegetic Condition notification has been the subject of consid- sound—it emerges from actions in the artificial world. erable investigation because of its practical applications. Narration is nondiegetic; it is imposed on the scene from Thus, there is a substantial, primarily older and visual, without the or (for example, travelogue narrator, the literature on vigilance, a situation in which an individual ancient In interactive a Greek chorus). environments performs extended observation or inspection tasks (Jeri- exists: third option conversation between agents and son and Pickett, 1963). More recent studies ofthe ef- human participant(s). Multiple participants can commu- fects of audio signals have also been done. For example, nicate with each other via customized as interfaces, such the perceived urgency of an audio alert is affected by a Audio Windows (Ludwig, Pincever, and Cohen, 1990;

Some Perspectives on Performed

DISORDERS of AUDITORY PROCESSING: EVIDENCE for MODULARITY in AUDITION Michael R

Cortical Auditory Disorders: Clinical and Psychoacoustic Features

The Auditory Agnosias

Phonagnosia: a Dissociation Between Familiar and Unfamiliar Voices

Visual Perception

The Cognitive Neuroscience of Music

Apraxia, Neglect, and Agnosia

Detection of Errors During Speech Production: a Review of Speech Monitoring Models

LANGUAGE and APHASIAS

A Cognitive Neuroscience View of Inner Language: to Predict and to Hear

Autism & the Senses

Modularity of Music Processing