Automatic Design of Sound Synthesizers As Pure Data Patches Using Coevolutionary Mixed-Typed Cartesian Genetic Programming

Automatic Design of Sound Synthesizers as Pure Data Patches using Coevolutionary Mixed-typed Cartesian Genetic Programming Matthieu Macret Philippe Pasquier School of Interactive Arts and Technology School of Interactive Arts and Technology Simon Fraser University Simon Fraser University B.C., Canada V3T 0A3 B.C., Canada V3T 0A3 [email protected] [email protected] ABSTRACT 1. INTRODUCTION A sound synthesizer can be defined as a program that takes Replicating sounds using synthesis techniques is a prob- a few input parameters and returns a sound. The general lem frequently addressed in the field of computer music [12]. sound synthesis problem could then be formulated as: given A well-researched problem is the following: Given a tar- a sound (or a set of sounds) what program and set of input get sound and a synthesizer, what set of input parameters parameters can generate that sound (set of sounds)? We can replicate or approximate this sound. Manually estimat- propose a novel approach to tackle this problem in which ing input parameters of a particular synthesizer is usually we represent sound synthesizers using Pure Data (Pd), a difficult and time consuming, especially when there is no graphic programming language for digital signal processing. intuitive relationship between the input parameter values We search the space of possible sound synthesizers using Co- and the sound produced. This search for input parame- evolutionary Mixed-typed Cartesian Genetic Programming ters is then a good candidate for an automatic optimiza- (MT-CGP), and the set of input parameters using a stan- tion scheme. Thus, diverse optimization methods have been dard Genetic Algorithm (GA). The proposed algorithm co- used for automatic calibration, such as Particle Swarm [11], evolves a population of MT-CGP graphs, representing the HMM [34], Neural Nets [27], Cellular Automata [29] and Ge- functional forms of synthesizers, and a population of GA netic Algorithms [12]. It has been suggested that evolution- chromosomes, representing their inputs parameters. A fit- ary approaches such as Genetic Algorithms (GAs) are per- ness function based on the Mel-frequency Cepstral Coeffi- forming the best to matching musical instrument tones [27]. cients (MFCC) evaluates the distance between the target GAs have been used extensively for estimating the param- and produced sounds. Our approach is capable of suggest- eters of various synthesis techniques [1, 8, 13, 14, 17, 22, ing novel functional forms and input parameters, suitable to 28, 31] and real-world synthesizers [16, 33]. For instance, approximate a given target sound (and we hope in future we developed a GA-based system [17] able to automatically iterations a set of sounds). Since the resulting synthesiz- calibrate a ModFM synthesizer to replicate harmonic in- ers are presented as Pd patches, the user can experiment, strument tones. We then worked on a more complex prob- interact with, and reuse them. lem: calibrating the OP-1 synthesizer [16]. This commercial synthesizer contains several synthesis engines, effects Categories and Subject Descriptors and Low-Frequency-Oscillators, which make the parameter search space large, complex, but also discontinuous. In order H.5.5 [Sound and Music Computing]: Methodologies to tackle this difficult problem we proposed using a Non- and techniques. dominated Sorting Genetic Algorithm-II (NSGA-II) with multiple objectives. General Terms In this work, instead of only searching the space of input parameters for a specific synthesizer, we attempt to solve the Algorithms more general sound synthesis problem: given a sound, what program (sound synthesizer) and set of input parameters can Keywords generate that target sound? We propose a system in which Cartesian Genetic Programming; Coevolution; Sound Syn- we represent sound synthesizers as directed acyclic graphs thesis using the graphical programming language: Pure Data (Pd). We search the space of possible sound synthesizers using Co- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed evolutionary Mixed-typed Cartesian Genetic Programming for profit or commercial advantage and that copies bear this notice and the full citation (MT-CGP). The proposed algorithm co-evolves a popula- on the first page. Copyrights for components of this work owned by others than the tion of MT-CGP graphs, representing functional forms of author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or synthesizers, and a population of GA chromosomes, repre- republish, to post on servers or to redistribute to lists, requires prior specific permission senting their inputs parameters. A fitness function based and/or a fee. Request permissions from [email protected]. on the Mel-frequency Cepstral Coefficients (MFCC) evalu- GECCO’14, July 12–16, 2014, Vancouver, BC, Canada. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ates the distance between the target and produced sounds. ACM 978-1-4503-2662-9/14/07 ...$15.00. We run experiments using contrived and recorded target http://dx.doi.org/10.1145/2576768.2598303. 309 sounds. The resulting sounds and synthesizers presented A control (audio) signal can only be produced by a control as Pd patches are available online [5] and are discussed in (audio) object. The presence of the sign ∼ in an object box Section 5. indicates that it is an audio object (see Figure 1c). Audio (control) signals can only be transmitted between specifi- 2. RELATED WORK cally defined audio (control) outlets and inlets. We represent the synthesizers in Pd as directed acyclic Only a few attempts have been made at solving the general graphs. Hundreds of object types are available in Pd. In sound synthesis problem. The resulting systems differ from order to limit the size of the search space, we select only other GA sound matching systems [1, 8, 13, 14, 17, 22, 28, 31] objects that are the common building blocks of sound syn- in that they are not only searching the inputs parameters but thesizers in Pd. Table 1 shows a selection of Pd objects also the functional form of the synthesizer. available to our search algorithm. This list includes arith- Takala et al. [30] evolved what they called Timbre Trees. metic, trigonometric operations for controls and signals but The nodes of these trees were arithmetic operations, analytic also oscillators and filters. These objects are commonly used functions or noise generators. They defined a roulette-wheel by Pd programmers [25] to implement major sound synthe- selection operator as well as a crossover and mutation opera- sis techniques such as additive, subtractive, FM, AM or PM tor for trees. The user had to listen to the sounds generated synthesis. Each of these Pd objects has one to three inlets using these trees and interactively evaluate them. Takala et and only one outlet. Each inlet (outlet) is connected to one al used the term Genetic Algorithm to describe their sys- or more outlets (inlets). Figure 1c shows an example of Pd tem but the term Genetic Programming (GP) seems more graphs evolved by our algorithm. In principle, any synthe- appropriate given the algorithmic process and genetic oper- sizer can be represented as a Pd patch [25]. ators they defined. They implemented directly their system in C++ without using any generic sound synthesis library. While the authors reported that they successfully used their 4. MIXED TYPED CARTESIAN GENETIC system to produce an entire class of bee-like sounds (rang- PROGRAMMING ing from mosquitos to chain saws), they did not present any proper evaluation. Cartesian Genetic Programming (CGP) [21] is a form of More recently, Garcia [9] designed a Genetic Program- Genetic Programming that encodes a directed acyclic graph ming (GP) system for automatic generation of sound syn- representation of a computer program instead of a tree. The thesis techniques. He encodes the instructions for building genotype, a string of integers, determines the functions of sound synthesis topology graphs in the GP expression trees. nodes in the graph, the connections between nodes, the con- Production rules are defined to ensure that any valid ex- nections to inputs and the locations in the graph where pression tree will produce a valid topology. The nodes of outputs are taken from. Using a graph representation is the topologies are chosen from six different types of basis very flexible as many computational structures can be rep- synthesizer components (oscillator, addition/multiplication resented as a graph. Examples of this are artificial neural operators, filters, etc). These topologies have two inputs networks, electronic circuits or mathematical equations [21]. parameters that are optimized by a canonical GA. He de- In this work, we represents Pd patches which are directed fined the fitness function as the least-squared error of the acyclic graphs in their general form. short-time spectra where the frequencies that are not heard This graph representation presents advantages over the by the average listener are ignored. The system was im- classical tree-based GP representation. In a CGP graph, plemented directly in C++ and Matlab without using any the output of a given node can be connected to one or more generic sound synthesis library. Garcia tested his system inputs of other nodes. It is not the case with a tree repre- with a FM woodwind instrument and verified that it could sentation where a node output can only be connected to one generate an expression tree with close similarity to the tar- other node. In addition, nodes of the graph are not neces- get FM equation used to design the instrument. However, sary connected indirectly or directly to the output node of there were higher energies at high frequencies. He also got the program (e.g. second node see Figure 1b). As a result, modest results evolving a synthesizer for a sampled piano nodes in the representation can have no effect on the output, tone that sounded like a string hit by a hammer.

Load more