
An Approach Towards Perceptual Sound Synthesis
Mihir Sarkar, MIT Media Lab
December 8, 2005

THE CONCEPT

As musicians or sound designers, we look for sounds to describe our emotions or our perception of the world. Sometimes we browse sounds at random in the hope that a particular tone will evoke a mood or an idea conducive to further exploration. Whether we invent tones using a novel synthesis method, modify existing sounds, or search for a preset in the sound bank of a commercial synthesizer, we usually need to delve into the technical aspects of sound design before arriving at a result that adequately conveys our mental representation or intuition.

We note however that musicians have developed an informal language to describe timbre and other sonic characteristics: we say of a sound that it is “warm” or “fat”, “ethereal” or “buzzing”; some of these terms have a significance proper to each individual, whereas other terms such as “loud” and “soft” have a more immediate and accepted meaning.

Like the new generation of consumer equalizers, which have presets that read “pop”, “classical” or “acoustic” without mentioning the EQ settings, I propose to ground descriptive words in auditory perception through sound generators and modifiers. My Perceptual Sound Synthesizer is not based on a new synthesis algorithm; rather it is a common interface to an existing sound synthesis engine using Frequency Modulation.

THE METHOD

Disclaimer: the Perceptual Synthesizer incorporates my own descriptive words for sounds, which may not meet the expectations of other individuals. Because this is an experiment, some tones may not sound very “musical” either. Far from demonstrating the ultimate Perceptual Synthesizer, this project is an occasion for me to explore the idea further and come up with a proof of concept while learning the intricacies of Csound as a sound synthesis language and experimenting with FM parameters to discover their effect on our perception.

My idea was to map intuitive parameters to technical parameters. This is not necessarily a one-to-one mapping. I chose a basic FM oscillator pair because it can create complex and dynamic timbres while keeping the number of parameters to a minimum. The output of the carrier oscillator is post-processed by a low-pass filter and a resonant filter to further alter its sonic character. The parameters are gathered from the user as MIDI-controlled values or via a GUI (developed with the Fast Light Tool Kit) and are either mapped linearly to the range of interest of the FM parameter(s) considered, or passed as an index to a look-up table for a more complex mapping.
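The two mapping strategies described above can be sketched as follows. This is a minimal illustration in Python rather than the author's Csound code; the function names, the 0 to 127 controller range, the parameter ranges, and the table contents are all assumptions made for the example.

```python
def map_linear(value, lo, hi, max_value=127):
    """Scale a controller value in 0..max_value linearly onto [lo, hi]."""
    value = min(max(value, 0), max_value)  # clamp out-of-range input
    return lo + (hi - lo) * value / max_value


def map_table(value, table, max_value=127):
    """Use a controller value as an index into a look-up table."""
    value = min(max(value, 0), max_value)
    index = value * (len(table) - 1) // max_value
    return table[index]


# Hypothetical "fat / thin" control mapped linearly to a modulation index range.
modulation_index = map_linear(64, lo=0.0, hi=10.0)

# Hypothetical "light / heavy" control choosing among simple c/m ratios.
CM_RATIOS = [(1, 1), (1, 2), (2, 1), (3, 1)]
cm_ratio = map_table(127, CM_RATIOS)
```

A look-up table is useful when the perceptual effect does not vary smoothly with the technical parameter, as with discrete c/m ratios.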

Some parameters have a direct or deterministic impact: the modulation index, for instance, directly defines the number of sidebands, and therefore alters the “richness” of the spectrum. On the other hand, the c/m ratio is more subjective, as it gives a sense of pitch when it is rational and produces a metallic sound due to inharmonic partials when it is irrational.
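To make the effect of these two parameters concrete, the sketch below lists the partial frequencies of a simple FM pair, which lie at the carrier frequency plus or minus integer multiples of the modulator frequency. It is an illustration in Python, not the author's code; the function name and the example frequencies are assumptions, and the count of significant sideband pairs uses the common rule of thumb of roughly the index plus one.

```python
import math


def sideband_frequencies(carrier_hz, modulator_hz, index):
    """Return the sorted partials fc +/- k*fm for k up to about index + 1."""
    pairs = int(math.ceil(index)) + 1  # rule-of-thumb count of significant pairs
    freqs = {carrier_hz}
    for k in range(1, pairs + 1):
        freqs.add(carrier_hz + k * modulator_hz)
        # Negative frequencies fold back around 0 Hz.
        freqs.add(abs(carrier_hz - k * modulator_hz))
    return sorted(freqs)


# c/m = 1/1: every partial is a multiple of 200 Hz, so the tone has a clear pitch.
harmonic = sideband_frequencies(200.0, 200.0, index=2)

# c/m = 1/sqrt(2): the partials do not share a common fundamental.
inharmonic = sideband_frequencies(200.0, 200.0 * math.sqrt(2), index=2)
```

Raising `index` adds more sideband pairs, which corresponds to the richer spectrum described above.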

TO EXPERIMENT YOURSELF

My Csound code and documentation are available at: http://www.media.mit.edu/~mihir/software/perceptual_synth.zip

FOR MORE INFORMATION…

R. Boulanger (ed.), The Csound Book, The MIT Press, Cambridge, MA, USA, 2000: chapters 2, 6, 9 and 12.

J. Chowning, D. Bristow, FM Theory & Applications - By Musicians For Musicians, Yamaha Music Foundation, Tokyo, Japan, 1986.

Perceptual Synthesizer


[Figure: system block diagrams in three configurations. First: master keyboard, MIDI-USB, MIDI (events), GUI (FLTK), audio out. Second: speech, speech recognizer (VB / MS SDK), commands, GUI (FLTK), MIDI (events), Csound orchestra, audio out. Third: speech, speech recognizer (VB / MS SDK), commands, MIDI (ActiveX), GUI (FLTK), MIDI (controls), audio in, MIDI (events), Csound orchestra, Csound score, audio out.]

[Figure: FM instrument signal flow. Labels shown: ndx = dev / m, dev, m*frq, c/m, oscil (modulator), c*frq, * vel, + vib, ADSR envelope (MIDI events: note on, note off), amp (MIDI event), oscil (carrier), out.]
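The oscillator pair named in the diagram labels can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the author's Csound orchestra: it uses the standard phase-modulation form of simple FM, in which the peak frequency deviation equals the index times the modulator frequency, and it omits the vibrato and the two filters. The function names, envelope times, and sample rate are hypothetical.

```python
import math


def adsr(t, note_len, attack=0.01, decay=0.1, sustain=0.7, release=0.2):
    """Piecewise-linear ADSR value at time t (assumes note_len >= attack + decay)."""
    if t < attack:
        return t / attack
    if t < attack + decay:
        return 1.0 - (1.0 - sustain) * (t - attack) / decay
    if t < note_len:
        return sustain
    if t < note_len + release:
        return sustain * (1.0 - (t - note_len) / release)
    return 0.0


def fm_note(frq, c=1.0, m=1.0, ndx=2.0, vel=1.0, note_len=0.5, sr=8000):
    """Render one FM note as a list of float samples in [-1, 1]."""
    release = 0.2
    n = round((note_len + release) * sr)
    samples = []
    for i in range(n):
        t = i / sr
        # Modulator at m*frq, scaled by the modulation index.
        modulator = ndx * math.sin(2 * math.pi * m * frq * t)
        # Carrier at c*frq, phase-modulated by the modulator output.
        carrier = math.sin(2 * math.pi * c * frq * t + modulator)
        # Velocity and the envelope shape the output amplitude.
        samples.append(vel * adsr(t, note_len, release=release) * carrier)
    return samples


samples = fm_note(220.0, c=1.0, m=2.0, ndx=3.0, vel=0.8)
```

The resulting list could be written to a WAV file with the standard `wave` module after scaling to 16-bit integers.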

Processing steps: capture perceptual parameters and real-time controls; map parameters; define technical parameters; synthesize sound.

Intuitive Parameter     Technical Parameter
Soft / Loud             Amplitude, LPF
Slow / Sharp            Envelope
Fat / Thin              Modulation index
Warm / Bright           Waveform
Light / Heavy           c/m (rational)
Metallic                c/m (irrational)
Ringing                 Reson filter bw

Real-time Control       Technical Parameter
Note                    Frequency, vibrato depth
Velocity                Amplitude
Modulation wheel        Vibrato rate

Improvements

• Define parameters from experiments
• Define starting point / “I’m feeling lucky”
• “Unified” sound synthesis engine
• Musical or acoustical viability
• ASR with AEC
