Kronos: A Declarative Metaprogramming Language for Digital Signal Processing

Vesa Norilo
Centre for Music and Technology
University of Arts Helsinki, Sibelius Academy
PO Box 30
FI-00097 Uniarts, Finland
[email protected]

Computer Music Journal, 39:4, pp. 30–48, Winter 2015
doi:10.1162/COMJ_a_00330
© 2015 Massachusetts Institute of Technology.

Abstract: Kronos is a signal-processing programming language based on the principles of semifunctional reactive systems. It is aimed at efficient signal processing at the elementary level, and built to scale towards higher-level tasks by utilizing the powerful programming paradigms of “metaprogramming” and reactive multirate systems. The Kronos language features expressive source code as well as a streamlined, efficient runtime. The programming model presented is adaptable for both sample-stream and event processing, offering a cleanly functional programming paradigm for a wide range of musical signal-processing problems, exemplified herein by a selection and discussion of code examples.

Signal processing is fundamental to most areas of creative music technology. It is deployed on both commodity computers and specialized sound-processing hardware to accomplish transformation and synthesis of musical signals. Programming these processors has proven resistant to the advances in general computer science. Most signal processors are programmed in low-level languages, such as C, often thinly wrapped in rudimentary C++. Such a workflow involves a great deal of tedious detail, as these languages do not feature language constructs that would enable a sufficiently efficient implementation of abstractions that would adequately generalize signal processing. Although a variety of specialized musical programming environments have been developed, most of these do not enable the programming of actual signal processors, forcing the user to rely on built-in black boxes that are typically monolithic, inflexible, and insufficiently general.

In this article, I argue that much of this stems from the computational demands of real-time signal processing. Although much of signal processing is very simple in terms of program or data structures, it is hard to take advantage of this simplicity in a general-purpose compiler to sufficiently optimize constructs that would enable a higher-level signal-processing idiom. As a solution, I propose a highly streamlined method for signal processing, starting from a minimal dataflow language that can describe the vast majority of signal-processing tasks with a handful of simple concepts. This language is a good fit for hardware (ranging from CPUs to GPUs and even custom-made DSP chips) but unpleasant for humans to work in. Human programmers are instead presented with a very high-level metalanguage, which is compiled into the lower-level data flow. This programming method is called Kronos.

Musical Programming Paradigms and Environments

The most prominent programming paradigm for musical signal processing is the unit generator language (Roads 1996, pp. 787–810). Some examples of “ugen” languages include the Music N family (up to the contemporary Csound; see Boulanger 2000), as well as Pure Data (Puckette 1996) and SuperCollider (McCartney 2002).

The Unit Generator Paradigm

The success of the unit generator paradigm is driven by the declarative nature of the ugen graph; the programmer describes data flows between primitive, easily understood unit generators.

It is noteworthy that the typical selection of ugens in these languages is very different from the primitives and libraries available in general-purpose languages. Whereas languages like C ultimately consist of the data types supported by a CPU and the primitive operations on them, a typical ugen could be anything from a simple mathematical operation to a reverberator or a pitch tracker. Ugen languages tend to offer a large library of highly specialized ugens rather than focusing on a small orthogonal set of primitives that could be used to express any desired program. The libraries supplied with most general-purpose languages tend to be written in those languages; this is not the case with the typical ugen language.

The Constraints of the Ugen Interpreter

I classify the majority of musical programming environments as ugen interpreters. Such environments are written in a general-purpose programming language, along with the library of ugens that are available for the user. These are implemented in a way that allows late binding composition: the ugens are designed to be able to connect to the inputs and outputs of other, arbitrary ugens. This is similar to how traditional program interpreters work (threading user programs from predefined native code blobs), hence the term ugen interpreter.

In this model, ugens must be implemented via parametric polymorphism: as a set of objects that share a suitable amount of structure, to be able to interconnect and interact with the environment, regardless of the exact type of the ugen in question. Dynamic dispatch is required, as the correct signal-processing routine must be reached through an indirect branch. This is problematic for contemporary hardware, as the hardware branch prediction relies on the hardware instruction pointer: In an interpreter, the hardware instruction pointer and the interpreter program location are unrelated.

A relevant study (Ertl and Gregg 2007) cites misprediction rates of 50 percent to 98 percent for typical interpreters. The exact cost of misprediction depends on the computing hardware and is most often not fully disclosed. On a Sandy Bridge CPU by Intel, the cost is typically 18 clock cycles; as a point of reference, the chip can compute 144 floating-point operations in 18 cycles at peak throughput. There are state-of-the-art methods (Ertl and Gregg 2007; Kim et al. 2009) to improve interpreter performance. Most musical programming environments choose instead a simple, yet reasonably effective, means of mitigating the cost of dynamic dispatch. Audio processing is vectorized to amortize the cost of dynamic dispatch over buffers of data instead of individual samples.

Consider a buffer size of 128 samples, which is often considered low enough to not interfere in a real-time musical performance. For a ugen that semantically maps to a single hardware instruction, misprediction could consume 36 percent to 53 percent of the time on a Sandy Bridge CPU, as derived from the numbers previously stated. To reduce the impact of this cost, the ugen should spend more time doing useful work.

Improving the efficiency of the interpreter could involve either increasing the buffer size further or increasing the amount of useful work a ugen does per dispatch. As larger buffer sizes introduce latency, ugen design is driven to favor large, monolithic blocks, very unlike the general-purpose primitives most programming languages use as the starting point, or the native instructions of the hardware.

In addition, any buffering introduces a delay equivalent to the buffer size to all feedback connections in the system, which precludes applications such as elementary filter design or many types of physical modeling.

An Ousterhout Corollary

John Ousterhout’s dichotomy claims that programming languages are either systems programming languages or scripting languages (Ousterhout 1998). To summarize, the former are statically typed, and produce efficient programs that operate in isolation. C is the prototypical representative of this group. The latter are dynamically typed, less concerned with efficiency, intended to glue together distinct components of a software system. These are represented by languages such as bash, or Ousterhout’s own Tcl.

The Ousterhout dichotomy is far from universally accepted, although an interesting corollary to musical programming can be found. Unit generators are the static, isolated, and efficient components in most musical programming languages. They are typically built with languages aligned with Ousterhout’s systems programming group. The scripting group is mirrored by the programming surfaces such as the patching interface described by Miller Puckette (1988) or the control script languages in ChucK (Wang and Cook 2003) or SuperCollider. Often, these control languages are not themselves capable of implementing actual signal-processing routines with satisfactory performance, as they focus on just acting as the glue layer between black boxes that do the actual signal processing.

A good analysis of the tradeoffs and division of labor between “systems” languages and “glue” languages in the domain of musical programming has been previously given by Eli Brandt (2002, pp. 3–4). Although the “glue” of musical

Nyquist: Signals as Values

Nyquist (Dannenberg 1997) is a Lisp-based synthesis environment that extends the XLisp interpreter with data types and operators specific to signal processing. The main novelty here is to treat signals as value types, which enable user programs to inspect, modify, and pass around audio signals in their entirety without significant performance penalties. Composition of signals rather than ugens allows for a wider range of constructs, especially regarding composition in time.

As for DSP, Nyquist remains close to the ugen interpreter model. Signals are lazily evaluated, and