Combining Visual and Textual Representations for Flexible Interactive Audio Signal Processing*
Roger B. Dannenberg
School of Computer Science, Carnegie Mellon University
[email protected]

*Published as: Roger B. Dannenberg, "Combining Visual and Textual Representations for Flexible Interactive Signal Processing," in The ICMC 2004 Proceedings, San Francisco: The International Computer Music Association, 2004.

Abstract

Interactive computer music systems pose new challenges for audio software design. In particular, there is a need for flexible run-time reconfiguration for interactive signal processing. A new version of Aura offers a graphical editor for building fixed graphs of unit generators. These graphs, called instruments, can then be instantiated, patched, and reconfigured freely at run time. This approach combines visual programming with traditional text-based programming, resulting in a structured programming model that is easy to use, is fast in execution, and offers low audio latency through fast instantiation time. The graphical editor has a novel type resolution system and automatically generates graphical interfaces for instrument testing.

1 Introduction

Flexible audio signal processing is a key ingredient in most interactive computer music. In traditional signal-processing tasks, audio processes tend to be static. For example, a digital audio effect performs a single function, and the internal configuration of processing elements is fixed. Non-interactive computer music languages, in particular the Music n languages (Mathews 1969), introduced the idea of dynamically instantiating "instruments" to reconfigure the processing according to timed entries in a score file. In real-time interactive music systems, software must be even more dynamic, handling different media, performing computation on both symbolic and audio representations, and reconfiguring signal computation under the control of various processes.

A popular approach to building interactive software is that of MAX MSP (Puckette 1991) and its many descendants. This visual programming language allows the user to connect various modules in a directed graph. This representation is very close to the "boxes and arrows" diagrams used to describe all sorts of signal processing systems, and it can be said with confidence that this representation has withstood the test of time. However, visual representations have problems dealing with dynamic structures. It is difficult to express dynamic data structures or to reconfigure processing graphs using MAX-like languages.

The main alternative is to use text-based languages. SuperCollider (McCartney 1996) is a good example of a text-based system for interactive computer music. Text-based languages at least offer the possibility of reconfiguring audio computation graphs, which is an attractive possibility for music performance. The ability to decide at any time to apply an effect or to change the way parameters are computed can make programs more versatile while minimizing the computation load that would occur if all options were running all the time.

Aura is another text-based (until recently) programming environment for interactive music performance systems. Aura began almost 10 years ago as an object-oriented generalization of the CMU MIDI Toolkit (Dannenberg 1986). The Aura object model provides a natural way to create, connect, and control audio signal processing modules. Aura was designed with audio processing in mind, and it therefore offers multiple threads and support for low-latency real-time audio computation.

The goal of this paper is to discuss software architecture issues associated with interactive audio, especially in the light of years of experience creating interactive compositions with Aura. In particular, the latest version of Aura, which I will call Aura 2, combines a visual editor with a text-based scripting language in an effort to obtain the best of both worlds of visual and text-based programming languages.

Aura 2 uses a visual editor to combine unit generators into (potentially) larger units called instruments. At run time, under program control, instruments can be dynamically allocated and patched to form an audio computation graph. The periodic evaluation of this graph is scheduled along with other events that can update instrument parameters, create new instruments, and modify the graph. A remote-procedure-call facility allows I/O, graphics, and control computation to run outside of the time-critical audio-computation thread.
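The run-time model just described can be summarized in a short, hypothetical C++ sketch. The class names (Instr, Sine, Graph), the tick() and add() methods, and the summing strategy are stand-ins invented for illustration, not Aura's actual API; the point is only that instruments are allocated and patched under program control and that the resulting graph is evaluated periodically.

    // Hypothetical sketch of run-time instrument allocation and patching.
    #include <cstdio>
    #include <cmath>
    #include <memory>
    #include <vector>

    struct Instr {                        // stand-in for an instrument built from unit generators
        virtual ~Instr() = default;
        virtual float tick() = 0;         // produce one output sample per evaluation
    };

    struct Sine : Instr {                 // a trivial "instrument": a single sine oscillator
        float phase = 0.0f, freq, sr;
        Sine(float freq_hz, float sample_rate) : freq(freq_hz), sr(sample_rate) {}
        float tick() override {
            phase += 6.2831853f * freq / sr;
            return std::sin(phase);
        }
    };

    struct Graph {                        // the audio computation graph, evaluated periodically
        std::vector<std::unique_ptr<Instr>> nodes;
        void add(std::unique_ptr<Instr> instr) {   // "patch" a newly created instrument in
            nodes.push_back(std::move(instr));
        }
        float evaluate() {                // one periodic evaluation: sum all instrument outputs
            float mix = 0.0f;
            for (auto& n : nodes) mix += n->tick();
            return mix;
        }
    };

    int main() {
        Graph graph;
        // Instruments are created and patched under program control, e.g. in
        // response to a performance event, rather than fixed at compile time.
        graph.add(std::make_unique<Sine>(440.0f, 44100.0f));
        graph.add(std::make_unique<Sine>(660.0f, 44100.0f));
        for (int i = 0; i < 4; ++i)
            std::printf("mixed sample %d: %f\n", i, graph.evaluate());
        return 0;
    }

In Aura 2 itself, instruments are designed from unit generators in the graphical editor, and graph evaluation is scheduled in the time-critical audio thread rather than in a simple loop like this.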
Section 2 describes some related work, and Section 3 discusses design alternatives and the choices made for Aura 2. Section 4 describes the Aura Instrument Editor, a new Aura component that allows users to edit signal-processing algorithms graphically. This is followed by a description of how instrument designs are incorporated into Aura programs and debugged there. Section 6 discusses the range of programming styles supported, from static to very dynamic, and the current status and future work are described in Section 7. Finally, Section 8 presents a summary and conclusions.

2 Related Work

There are many real-time audio processing frameworks and systems, but space limitations prevent a detailed review. Once the basic problems of audio I/O are surmounted (using systems like PortAudio (Bencina and Burk 2001) or RtAudio (Scavone 2002)), the next big issue is how to modularize DSP algorithms for reuse. The standard approach is the unit generator concept, introduced in Music n, in which signal processing objects have inputs, outputs, some internal state, and a method that processes one unit of input to produce outputs.
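As a concrete illustration of the unit generator concept described above, the following C++ sketch shows a trivial one-pole lowpass filter with an input, an output, internal state, and a method that processes one block of samples. The names OnePole and process() are invented for this example and are not drawn from any of the systems cited.

    // A minimal, hypothetical unit generator: inputs, an output, internal
    // state, and a method that processes one block of samples at a time.
    #include <cstdio>
    #include <cstddef>
    #include <vector>

    class OnePole {                            // simple one-pole lowpass unit generator
    public:
        explicit OnePole(float coef) : coef_(coef) {}

        // Process one "unit" (block) of input samples, writing the result to out.
        void process(const std::vector<float>& in, std::vector<float>& out) {
            out.resize(in.size());
            for (std::size_t i = 0; i < in.size(); ++i) {
                state_ = coef_ * state_ + (1.0f - coef_) * in[i];   // internal state update
                out[i] = state_;
            }
        }

    private:
        float coef_;            // filter coefficient, fixed at construction
        float state_ = 0.0f;    // internal state retained between blocks
    };

    int main() {
        OnePole lp(0.9f);
        std::vector<float> in(64, 1.0f), out;   // one block of a unit-step input
        lp.process(in, out);
        std::printf("last output sample: %f\n", out.back());
        return 0;
    }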
Systems vary according to how unit generators are assembled and managed at run time. The simplest approach is through mainly static graphs as in MAX-like languages (Chaudhary, Freed, and Wright 2000; Dechelle et al. 1998; Puckette 1991; Puckette 1997; Puckette 2002a; Zicarelli 1998). Kyma (Scaletti and Johnson 1988) might also belong in this category because it uses precompiled graphs.

A more elaborate approach is based on Music n, in which a group of unit generators (called a patch or an instrument) is instantiated dynamically. A feature of this approach is that instruments add their outputs to a global buffer. The instrument can therefore be deallocated without leaving behind any "dangling" pointers to deleted objects. The NeXT Music Kit (Jaffe and Boynton 1989) and csound (Vercoe and Ellis 1990) adopted this approach, and to a large extent, this is also the basis for managing instruments in SuperCollider (McCartney 1996). Similar principles are also at work in ChucK (Wang and Cook 2003) insofar as audio "shreds" can be started or stopped independently.

Another way to structure computation is to use the tree-like structure of nested expressions, e.g. mult(osc(), env()), to denote a corresponding graph structure of unit generators. This sort of implicit connection is found in SuperCollider, STK (Cook and Scavone 1999), Sizzle (Pope and Ramakrishnan 2003), and Nyquist (Dannenberg 1997).

Other systems, including CLM (Schottstaedt 1994), JSyn (Burk 1998), and Aura 1 (Dannenberg and Brandt 1996), impose even less organization.

Are interconnections known at design time vs. run time? When connections are known in advance, they can offer more efficient compilation, more context for debugging, and faster instantiation at run time. On the other hand, when connections are made at run time, configurations and reconfigurations can be generated interactively under computer control. This is certainly more flexible.

Aura 2 includes a new audio subsystem that supports a range of processing models, from fully static audio processing graphs to fully dynamic ones. Static graphs of unit generators are configured using a graphical editor (see Figure 1) to form instruments, members of the Aura Instr class. These instruments can be flexibly interconnected at run time under program control.

Experience with fully dynamic and reconfigurable graphs in the first version of Aura indicates, not surprisingly, that most audio processing designs are mainly static. While there may be good reason to allow and even promote dynamic changes to audio signal processing graphs, in practice, most connections are set up when an instrument is created and stay that way for as long as the instrument exists. Often, boxes-and-arrows diagrams are used at the design stage, even when the design must be translated to a textual implementation, so Aura 2 supports this style of working by providing a graphical patch editor.

By organizing collections of unit generators into instruments, we benefit in several ways. First, we do not pay for the overhead of dynamic patching when statically allocated unit generators executed in a fixed order will suffice. Second, when problems occur, symbolic debuggers tend to provide more useful information when unit generators have names and exist in the context of a larger object than when they are dynamically allocated individually on the heap. Third, in-lining unit generators and other compilation tricks can improve the performance of the instrument model. Finally, and most importantly, users can reason about their own code more easily when more of the design is static.

The latest version of SuperCollider reflects some of these thoughts and design principles; in particular, SuperCollider 3 instruments are compiled into a single object as opposed to composing unit generators at run time. The justification is in part to avoid heavy computation in order
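To make the earlier contrast concrete (unit generators statically allocated as named members of an instrument and executed in a fixed order, as opposed to unit generators individually allocated on the heap and patched at run time), here is a minimal, hypothetical C++ sketch. It is not Aura's Instr implementation; Osc, Env, and FmInstr are invented names for illustration only.

    // Hypothetical static-instrument organization: unit generators are named
    // members executed in a fixed order, so there is no per-connection
    // run-time patching overhead, and a debugger sees each unit generator
    // in the context of the enclosing instrument object.
    #include <cmath>
    #include <cstdio>

    struct Osc {                               // unit generator: sine oscillator
        float phase = 0.0f;
        float tick(float freq, float sr) {
            phase += 6.2831853f * freq / sr;
            return std::sin(phase);
        }
    };

    struct Env {                               // unit generator: simple exponential decay
        float level = 1.0f;
        float tick() { return level *= 0.9995f; }
    };

    struct FmInstr {                           // the "instrument"
        Osc carrier, modulator;                // statically allocated unit generators
        Env env;
        float sr;
        explicit FmInstr(float sample_rate) : sr(sample_rate) {}
        float tick(float freq) {
            float mod = modulator.tick(freq * 2.0f, sr) * 100.0f;   // fixed order:
            float car = carrier.tick(freq + mod, sr);               // modulator -> carrier -> env
            return car * env.tick();
        }
    };

    int main() {
        FmInstr instr(44100.0f);               // the whole instrument is one object
        for (int i = 0; i < 4; ++i)
            std::printf("%f\n", instr.tick(440.0f));
        return 0;
    }

Instances of such an instrument can still be created and patched dynamically, which is the combination of static internal structure and dynamic interconnection that Aura 2 aims to support.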