CTP431: Music and Audio Computing
Sound Synthesis

Graduate School of Culture Technology, KAIST
Juhan Nam

1 Musical Sound Synthesis

§ Modeling the patterns of musical tones and generating them

§ (Typical) musical tones – Time-wise: amplitude envelope (ADSR) → Waveform – Frequency-wise: harmonic distribution → Spectrogram

§ How are musical tones different from other sounds (e.g. speech)?

2 Types of Tones

§ Musical tones have more diverse types – Harmonic: guitar, flute, violin, organ, singing voice (vowel) – Inharmonic: piano, vibraphone – Non-harmonic: drum, percussion, singing voice (consonant)

[Figures: vibraphone; inharmonicity in piano. From Klapuri’s slides]

3 Information in Tones

§ Musical tones carry their main information in “pitch” – Speech tones carry it mainly in “formants” (i.e. the spectral envelope)

§ Examples – Original music – Reconstruction from timbre features (MFCC), using white noise as a source – Reconstruction from tonal features (Chroma)

4 Pitch Scale and Range

§ In music, pitch is arranged on a tuning system and the range is much wider

[Figure: frequency (Hz, 0 to 4000) versus time (seconds, 0 to 50)]

5 Control of Tones

§ Musical Instrument

Note Number, Velocity, Duration (+ Expressions) → Music Synthesizer → Output

§ Speech

Phonemes (+ Expressions) → Speech Synthesizer → Output

6 Overview of Sound Synthesis Techniques

§ Signal model (analog / digital) – Additive Synthesis – Subtractive Synthesis – Modulation Synthesis: ring modulation, frequency modulation – Distortion Synthesis: non-linear

§ Sample model (digital) – Sampling Synthesis – – Concatenative Synthesis

§ Physical model (digital) – Digital Waveguide Model

7 Theremin

§ A sinusoidal tone generator

§ Two antennas are remotely controlled to adjust pitch and volume

[Figures: Theremin (by Léon Theremin, 1928); Theremin (Clara Rockmore)]

https://www.youtube.com/watch?v=pSzTPGlNa5U

9 Additive Synthesis

§ Synthesize sounds by adding multiple sine oscillators – Also called Fourier synthesis

[Diagram: parallel branches, each OSC → Amp (Env), summed (+) into one output]
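The oscillator bank above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `additive_tone`, the `(harmonic, amplitude, decay)` tuple layout, and the exponential envelope shape are hypothetical choices, not taken from the slides.

```python
import math

def additive_tone(f0, partials, duration, sr=44100):
    """Sum sine oscillators, each scaled by its own decaying amplitude envelope.

    partials: list of (harmonic_number, amplitude, decay_per_second) tuples
    (a hypothetical parameter layout chosen for this sketch).
    """
    n = int(round(duration * sr))
    out = [0.0] * n
    for harmonic, amp, decay in partials:
        w = 2.0 * math.pi * harmonic * f0 / sr  # phase increment per sample
        for i in range(n):
            env = amp * math.exp(-decay * i / sr)  # Amp (Env) stage
            out[i] += env * math.sin(w * i)        # OSC stage, summed into the output
    return out

# Three harmonics of 220 Hz; higher partials are quieter and decay faster.
tone = additive_tone(220.0, [(1, 1.0, 2.0), (2, 0.5, 4.0), (3, 0.25, 6.0)], 0.05)
```

Each tuple plays the role of one OSC → Amp (Env) branch, so adding a partial means adding one tuple.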

10 Hammond Organ

§ Drawbars – Control the levels of individual tonewheels

11 Hammond Organ

https://www.youtube.com/watch?v=2rqn4bYFUZU

12 Sound Examples

§ Web Audio Demo – http://femurdesign.com/theremin/ – http://www.venlabsla.com/x/additive/additive.html – http://codepen.io/anon/pen/jPGJMK

§ Examples (instruments) – Kurzweil K150 • https://soundcloud.com/rosst/sets/kurzweil-k150-fs-additive – Kawai K5, K5000

13 Subtractive Synthesis

§ Synthesize sounds by filtering wide-band oscillators
  – Source-Filter model
  – Examples
    • Analog synthesizers: oscillators + resonant lowpass filters
    • Voice synthesizers: glottal pulse train + formant filters

[Figure: magnitude (dB) versus frequency (kHz) in three panels: Source, Filter, Filtered Source]
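A minimal sketch of the source-filter idea, assuming a naive sawtooth as the wide-band source and a second-order resonant lowpass built from the widely used RBJ "Audio EQ Cookbook" coefficients. The function names and parameter values are illustrative assumptions, not the design of any particular synthesizer.

```python
import math

def sawtooth(f0, n, sr=44100):
    """Naive (non-band-limited) sawtooth in [-1, 1); it aliases at high f0."""
    return [2.0 * ((f0 * i / sr) % 1.0) - 1.0 for i in range(n)]

def resonant_lowpass(x, cutoff, q, sr=44100):
    """Second-order lowpass (RBJ cookbook coefficients), direct form I."""
    w0 = 2.0 * math.pi * cutoff / sr
    alpha = math.sin(w0) / (2.0 * q)
    cosw = math.cos(w0)
    a0 = 1.0 + alpha
    # Coefficients normalized by a0
    b0 = (1.0 - cosw) / 2.0 / a0
    b1 = (1.0 - cosw) / a0
    b2 = b0
    a1 = -2.0 * cosw / a0
    a2 = (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

source = sawtooth(110.0, 4410)                            # wide-band source
filtered = resonant_lowpass(source, cutoff=800.0, q=4.0)  # filtered source
```

Raising `q` sharpens the resonance peak near the cutoff; sweeping `cutoff` over time gives the familiar filter-sweep sound.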

14 Moog Synthesizers

[Diagram: control sources and audio path]
– Soft control: Envelope, LFO
– Physical control: Keyboard, Wheels, Slides, Pedal
– Audio path: Oscillators → Filter → Amp, each with its own parameters

Parameter = offset + depth*control (e.g. filter cut-off frequency), where offset is a static value and depth*control is a dynamic value
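As a small worked instance of the parameter formula, the sketch below lets a sine LFO act as the control signal for a filter cut-off frequency. The names `lfo` and `parameter` and the numeric values (1000 Hz offset, 500 Hz depth, 5 Hz LFO rate) are assumptions for illustration only.

```python
import math

def lfo(rate_hz, t):
    """Low-frequency sine oscillator used as a control signal in [-1, 1]."""
    return math.sin(2.0 * math.pi * rate_hz * t)

def parameter(offset, depth, control):
    """Parameter = offset + depth * control (static value plus dynamic value)."""
    return offset + depth * control

# Cut-off frequency sampled every 10 ms over one second:
# it swings between 500 Hz and 1500 Hz around the 1000 Hz offset.
cutoffs = [parameter(1000.0, 500.0, lfo(5.0, i / 100.0)) for i in range(100)]
```

Swapping `lfo` for an envelope or a wheel position changes the control source without touching the formula.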

15 Oscillators

§ Classic waveforms

[Figure: waveforms and magnitude spectra (dB versus frequency in kHz) of Sawtooth (−6 dB/oct), Square (−6 dB/oct) and Triangular (−12 dB/oct)]

§ Modulation – Pulse width modulation – Hard-sync – Richer harmonics
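The spectral slopes of the classic waveforms follow from their Fourier series: harmonic k of an ideal sawtooth has relative amplitude 1/k, a square wave has 1/k on odd harmonics only, and a triangle wave has 1/k² on odd harmonics only. The sketch below (function names are hypothetical) turns those amplitudes into a slope in dB per octave.

```python
import math

def harmonic_amplitude(shape, k):
    """Relative amplitude of harmonic k (k >= 1) for an ideal waveform."""
    if shape == "sawtooth":
        return 1.0 / k
    if shape == "square":
        return 1.0 / k if k % 2 == 1 else 0.0
    if shape == "triangle":
        return 1.0 / (k * k) if k % 2 == 1 else 0.0
    raise ValueError("unknown shape: " + shape)

def rolloff_db_per_octave(shape, k1, k2):
    """Slope of the harmonic envelope between two non-zero harmonics k1 < k2."""
    a1 = harmonic_amplitude(shape, k1)
    a2 = harmonic_amplitude(shape, k2)
    return 20.0 * math.log10(a2 / a1) / math.log2(k2 / k1)

saw_slope = rolloff_db_per_octave("sawtooth", 1, 2)  # about -6 dB/oct
sqr_slope = rolloff_db_per_octave("square", 1, 3)    # about -6 dB/oct
tri_slope = rolloff_db_per_octave("triangle", 1, 3)  # about -12 dB/oct
```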

16 Amp Envelope Generator

§ Amplitude envelope generation – ADSR curve: attack, decay, sustain and release – Each state has a pair of time and target level

[Figure: ADSR curve, amplitude (dB) versus time; Attack, Decay and Sustain follow Note On, Release follows Note Off]
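A minimal ADSR sketch, assuming linear segments on a linear amplitude scale (the figure's axis is in dB, so a real envelope generator may well use exponential segments instead). The function name `adsr` and its argument layout are hypothetical; `hold` is the time from Note On to Note Off.

```python
def adsr(attack, decay, sustain, release, hold, sr=1000):
    """Linear ADSR envelope as a list of gains.

    attack, decay, release, hold are in seconds; sustain is a level in [0, 1].
    """
    def ramp(start, end, seconds):
        n = int(round(seconds * sr))
        return [start + (end - start) * (i + 1) / n for i in range(n)]

    n_hold = int(round(hold * sr))
    env = ramp(0.0, 1.0, attack) + ramp(1.0, sustain, decay)
    env += [sustain] * max(0, n_hold - len(env))  # sustain until Note Off
    env = env[:n_hold]                            # Note Off may cut attack/decay short
    level = env[-1] if env else 0.0
    return env + ramp(level, 0.0, release)        # release starts from current level

env = adsr(attack=0.01, decay=0.02, sustain=0.5, release=0.03, hold=0.1)
```

Multiplying an oscillator's output sample by sample with `env` applies the envelope.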

17 Examples

§ Web Audio Demos – http://www.google.com/doodles/robert-moogs-78th-birthday – http://webaudiodemos.appspot.com/midi-synth/index.html – http://aikelab.net/websynth/ – http://nicroto.github.io/viktor/

§ Example Sounds – SuperSaw – Leads – Pad – MoogBass – 8-Bit sounds: https://www.youtube.com/watch?v=tf0-Rrm9dI0 – TR-808: https://www.youtube.com/watch?v=YeZZk2czG1c

18 Modulation Synthesis

§ Modulation is originally from communication theory – Carrier: channel signal, e.g., radio or TV channel – Modulator: information signal, e.g., voice, video

§ Lowering the carrier frequency into the audible range lets modulation be used to synthesize sound

§ Types of modulation synthesis – Amplitude modulation (or ring modulation) – Frequency modulation

§ Modulation is non-linear processing – Generate new sinusoidal components

19 Ring Modulation / Amplitude Modulation

§ Change the amplitude of one source with another source – Slow change: tremolo – Fast change: generate a new tone

Ring Modulation: Modulator OSC × Carrier OSC

$a_m(t)\, A_c \cos(2\pi f_c t)$

Amplitude Modulation: (1 + Modulator OSC) × Carrier OSC

$(1 + a_m(t))\, A_c \cos(2\pi f_c t)$

20 Ring Modulation / Amplitude Modulation

§ Frequency domain – Expressed in terms of its sideband – The sum and difference of the two frequencies are obtained according to trigonometric identity – If the modulator is a non-sinusoidal tone, a mirrored-spectrum with regard to the carrier frequency is obtained

$a_m(t) = A_m \sin(2\pi f_m t)$

[Figure: spectrum with the carrier at fc and sidebands at fc−fm and fc+fm]
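The sideband picture can be checked numerically. The sketch below multiplies a sine modulator with a cosine carrier and measures the amplitude at fc−fm, fc and fc+fm with a single-bin DFT. The helper names and the choice of fc = 1000 Hz, fm = 100 Hz at an 8 kHz sample rate are assumptions for illustration.

```python
import cmath
import math

def ring_mod(fc, fm, n, sr, ac=1.0, am=1.0):
    """a_m(t) * Ac * cos(2*pi*fc*t) with a_m(t) = Am * sin(2*pi*fm*t)."""
    return [am * math.sin(2.0 * math.pi * fm * i / sr)
            * ac * math.cos(2.0 * math.pi * fc * i / sr) for i in range(n)]

def amp_mod(fc, fm, n, sr, ac=1.0, am=1.0):
    """(1 + a_m(t)) * Ac * cos(2*pi*fc*t): the carrier itself is kept."""
    return [(1.0 + am * math.sin(2.0 * math.pi * fm * i / sr))
            * ac * math.cos(2.0 * math.pi * fc * i / sr) for i in range(n)]

def amplitude_at(x, freq, sr):
    """Amplitude of the sinusoidal component at freq (single-bin DFT)."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * i / sr) for i, v in enumerate(x))
    return 2.0 * abs(acc) / len(x)

sr, n = 8000, 800  # 10 Hz bin spacing, so all test frequencies fall on bins
rm = ring_mod(1000.0, 100.0, n, sr)
am = amp_mod(1000.0, 100.0, n, sr)
rm_spectrum = [amplitude_at(rm, f, sr) for f in (900.0, 1000.0, 1100.0)]
am_spectrum = [amplitude_at(am, f, sr) for f in (900.0, 1000.0, 1100.0)]
```

Under these settings ring modulation leaves only the two sidebands, each at half the product of the two amplitudes, while amplitude modulation also keeps the carrier at fc.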

21 Examples

§ Tone generation – SawtoothOsc x SineOsc – https://www.youtube.com/watch?v=yw7_WQmrzuk

§ Ring modulation is often used as an audio effect – http://webaudio.prototyping.bbc.co.uk/ring-modulator/

22 Frequency Modulation

§ Change the frequency of one source with another source – Slow change: vibrato – Fast change: generate a new (and rich) tone – Invented by John Chowning in 1973 → Yamaha DX7

[Diagram: Modulator OSC drives the frequency input of the Carrier OSC]

$A_c \cos(2\pi f_c t + \beta \sin(2\pi f_m t))$

$\beta = A_m / f_m$: index of modulation

23 Frequency Modulation

§ Frequency Domain – Expressed in terms of its sideband frequencies – Their amplitudes are determined by the Bessel function – The sidebands below 0 Hz or above the Nyquist frequency are folded

$y(t) = A_c \sum_{k=-\infty}^{\infty} J_k(\beta) \cos(2\pi (f_c + k f_m) t)$

[Figure: spectrum with the carrier at fc and sideband pairs 1, 2 and 3 at fc±fm, fc±2fm and fc±3fm]
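A minimal sketch of an FM tone generator, plus a helper that lists where the sidebands fc + k·fm land after folding at 0 Hz and at the Nyquist frequency. The function names are hypothetical, and the folding helper only reports frequencies, not the Bessel-weighted amplitudes.

```python
import math

def fm_tone(fc, fm, beta, n, sr=44100, ac=1.0):
    """y[i] = Ac * cos(2*pi*fc*t + beta * sin(2*pi*fm*t)), with t = i / sr."""
    out = []
    for i in range(n):
        t = i / sr
        phase = 2.0 * math.pi * fc * t + beta * math.sin(2.0 * math.pi * fm * t)
        out.append(ac * math.cos(phase))
    return out

def sideband_frequencies(fc, fm, order, sr=44100):
    """Frequencies fc + k*fm for k in [-order, order], folded into [0, sr/2]."""
    nyquist = sr / 2.0
    freqs = []
    for k in range(-order, order + 1):
        f = abs(fc + k * fm)   # components below 0 Hz reflect back
        f = f % sr
        if f > nyquist:        # components above Nyquist alias downward
            f = sr - f
        freqs.append(f)
    return freqs

tone = fm_tone(500.0, 50.0, 10.0, 2000)       # fc = 500, fm = 50, beta = 10
bands = sideband_frequencies(500, 50, 3)      # three sideband pairs around 500 Hz
```

With a low carrier such as fc = 100 and fm = 50, the lower sidebands fold back onto the upper ones, which is part of why FM spectra change so quickly with the fc/fm ratio.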

24 Bessel Function

$J_k(\beta) = \sum_{n=0}^{\infty} \frac{(-1)^n \left(\frac{\beta}{2}\right)^{k+2n}}{n!\,(n+k)!}$

[Figure: $J_k(\beta)$ versus beta for the carrier (k = 0) and sidebands 1 to 4]
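The series above can be evaluated directly for integer k. This is a sketch with a fixed number of terms (the `terms` parameter is an assumption of this example), adequate for moderate beta and small k; a numerical library would be the safer choice for large arguments.

```python
import math

def bessel_j(k, beta, terms=60):
    """J_k(beta) for integer k, summed from the power series."""
    if k < 0:
        # Symmetry for integer orders: J_{-k}(beta) = (-1)^k * J_k(beta)
        return (-1) ** (-k) * bessel_j(-k, beta, terms)
    half = beta / 2.0
    total = 0.0
    for n in range(terms):
        total += ((-1) ** n * half ** (k + 2 * n)
                  / (math.factorial(n) * math.factorial(n + k)))
    return total

# Carrier and first four sideband amplitudes for beta = 1
amps = [bessel_j(k, 1.0) for k in range(5)]
```

Two checks are useful when experimenting: at beta = 0 only the carrier is present, and the squared amplitudes over all k sum to 1, so FM redistributes energy among sidebands rather than adding any.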

25 Bessel Function

26 The Effect of Modulation Index

[Figure: waveforms (amplitude versus time in samples) and magnitude spectra (dB versus frequency in kHz) for Beta = 0, 1, 10 and 20, with fc = 500, fm = 50]

27 Yamaha DX7 (1983)

28 “Algorithms” in DX7

http://www.audiocentralmagazine.com/yamaha-dx-7-riparliamo-di-fm-e-non-solo-seconda-parte/yamaha-dx7-algorithms/

29 Examples

§ Web Audio Demo – http://www.taktech.org/takm/WebFMSynth/

§ Sound Examples – Bell – Wood – Brass – Electric Piano – Vibraphone

30 Non-linear Synthesis (wave-shaping)

§ Generate a rich sound spectrum from a sinusoid using non-linear transfer functions (also called “distortion synthesis”)

§ Examples of transfer functions: y = f(x), where x’ = gx and g corresponds to the “gain knob” of the distortion
  – $y = 1.5x' - 0.5x'^3$
  – $y = x' / (1 + |x'|)$
  – $y = \sin(x')$
  – Chebyshev polynomials: $T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)$, with $T_0(x) = 1$, $T_1(x) = x$, $T_2(x) = 2x^2 - 1$, $T_3(x) = 4x^3 - 3x$

[Figure: a transfer function, with the resulting waveforms (amplitude versus time in samples) and magnitude spectra (dB versus frequency in kHz)]
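Chebyshev polynomials are convenient wave-shapers because $T_k(\cos\theta) = \cos(k\theta)$: a full-scale cosine passed through $T_k$ comes out as exactly its k-th harmonic. The sketch below evaluates $T_k$ by the recurrence and includes the $x'/(1+|x'|)$ soft clipper for comparison; the function names are hypothetical.

```python
import math

def chebyshev(k, x):
    """T_k(x) via the recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    if k == 0:
        return 1.0
    t_prev, t = 1.0, x
    for _ in range(k - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

def soft_clip(x, gain):
    """y = x' / (1 + |x'|) with x' = gain * x; output stays inside (-1, 1)."""
    xg = gain * x
    return xg / (1.0 + abs(xg))

n, cycles = 200, 2
cosine = [math.cos(2.0 * math.pi * cycles * i / n) for i in range(n)]
third_harmonic = [chebyshev(3, v) for v in cosine]  # cosine at 3x the frequency
clipped = [soft_clip(v, 10.0) for v in cosine]      # heavily driven soft clip
```

A weighted sum of several $T_k$ therefore sets the level of each harmonic independently, but only for a full-scale input; at lower input levels the harmonic mix changes.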

31 Physical Modeling

§ Modeling Newton’s laws of motion (i.e. $F = ma$) on musical instruments – Every instrument has a different model

§ The ideal string

– Wave equation: $K y'' = \epsilon \ddot{y}$ → $c = \sqrt{K/\epsilon}$ ($K$: tension, $\epsilon$: linear mass density) – General solution: $y(t, x) = y_r(t - x/c) + y_l(t + x/c)$ → right-going traveling wave and left-going traveling wave

32 Physical Modeling

§ Waveguide Model – With boundary condition (fixed ends)

[Figure: waveguide with reflection coefficients -1.0 and -0.99 at the two ends]

§ The Karplus-Strong model

[Diagram: noise burst x(n) → (+) → delay line $z^{-M}$ → y(n), with a lowpass filter in the feedback path]
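A minimal Karplus-Strong sketch following the diagram: a noise burst excites a delay line of M samples, and the delayed output is fed back through a lowpass filter. The two-point average used here is the classic choice for that filter; the extra `decay` factor, the `seed` argument and the function name are assumptions of this example.

```python
import random

def karplus_strong(period, n, decay=0.996, seed=0):
    """Plucked-string tone: y[i] = x[i] + decay * 0.5 * (y[i-M] + y[i-M-1])."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    y = [0.0] * n
    for i in range(n):
        # Noise burst: one delay-line length of random samples, then silence
        x = rng.uniform(-1.0, 1.0) if i < period else 0.0
        feedback = 0.0
        if i >= period:
            older = y[i - period - 1] if i - period - 1 >= 0 else 0.0
            feedback = decay * 0.5 * (y[i - period] + older)  # lowpass on delayed output
        y[i] = x + feedback
    return y

# Delay of 100 samples at 44.1 kHz: pitch near 44100 / 100.5, about 439 Hz
pluck = karplus_strong(period=100, n=44100)
```

The averaging filter adds half a sample of delay, which is why the pitch is sr / (M + 0.5) rather than sr / M, and it damps high harmonics faster than low ones, much like a real string.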

33 Physical Modeling

§ The Extended Karplus-Strong model

https://ccrma.stanford.edu/~jos/pasp/Extended_Karplus_Strong_Algorithm.html

34 Sample-based Synthesis

§ The majority of digital sound and music synthesis today is accomplished via the playback of stored waveforms
  – Media production: sound effects, narration, prompts
  – Digital devices: ringtone, sound alert
  – Musical Instruments
    • Native Instruments Kontakt 5: 43+ GB (1000+ instruments)
    • Synthogy Ivory II Piano: 77+ GB (Steinway D Grand, …)

[Figures: Foley (filmmaking); Ringtones; Synthogy Ivory II Piano]

35 Why Don’t We Just Use Samples?

§ Advantages – Reproduce realistic sounds (needless to say) – Less use of CPU

§ Limitations – Not flexible: repeat the same sound again, not expressive – Can require a great deal of storage – Need high-quality recording – Limited to real-world sounds

§ Better ways – Modify samples based on existing sound processing techniques • Much richer spectrum of sounds – Trade-off: CPU, memory and programmability

36 Sample-based Synthesis

Off-line: Recording → Off-line Processing → Storing in table
  – Off-line Processing: sample editing; pitch change; filter, EQ, envelope; normalization
  – Storing in table: meta-data (pitch, loudness, action, text)

On-line: User input → Sample Fetching → Online Processing → Audio Effect
  – User input: pitch, velocity, timbre; speed, strength; text
  – Sample Fetching: read samples from the table
  – Online Processing: pitch change; filter, EQ; envelope
  – Audio Effect: delay-based effects (e.g. room effects)

37

§ Playback samples stored in tables – Multi-sampling: choose different sample tables depending on input conditions such as pitch and loudness • Velocity switching

§ Reducing sample tables in musical synthesizers – Sample looping: reduce the size of tables – Pitch shifting by re-sampling: avoid sampling every single pitch – Filtering: avoid sampling every single loudness • e.g. low-pass filtering for soft input
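Pitch shifting by re-sampling can be sketched as reading the stored table at a different rate. The example assumes linear interpolation between neighbouring samples (real samplers often use higher-order interpolation); the function names are hypothetical.

```python
def semitone_ratio(semitones):
    """Read-speed ratio for a pitch shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

def resample(samples, ratio):
    """Read `samples` at `ratio` times the original speed (linear interpolation).

    ratio > 1 raises the pitch and shortens the sound; ratio < 1 lowers it.
    """
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append((1.0 - frac) * samples[i] + frac * nxt)
        pos += ratio
    return out

table = [0.0, 1.0, 2.0, 3.0, 4.0]
octave_up = resample(table, semitone_ratio(12))     # reads every second sample
octave_down = resample(table, semitone_ratio(-12))  # reads at half speed
```

Because duration changes together with pitch, and formants shift as well, large shifts sound unnatural; this is one reason multi-sampling stores tables for several pitches instead of one.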

38 Sample Looping

§ Find a periodic segment and repeat it seamlessly during playback – Particularly for instruments with forced oscillation (e.g. woodwind) – Usually taken from the sustained part of a pitched musical note

[Figure: a sample divided into Attack and Loop segments; playback using looping]

§ It is not easy to find an exactly clean loop – The amplitude envelope often decays or is modulated: • e.g. piano, guitar, violin – The period in samples is not an integer → non-integer-size sample table?

39 Sample Looping

§ Solutions
  – Decaying amplitude: normalize the amplitude
    • Compute the envelope and multiply by its inverse
    • Then, multiply the envelope back later
  – Non-integer period in samples
    • Use multiple periods for the loop such that the total length is close to an integer
      * e.g. Period = 100.2 samples → 5*Period = 501 samples
  – Amplitude modulation
    • Crossfade where the end of the loop meets the beginning of the loop

§ Automatic loop search – Pitch detection and zero-crossing detection: c.f. samplers
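Two of the looping fixes above can be sketched directly: choosing how many periods make the loop length close to an integer, and crossfading the loop tail so that the jump back to the loop start is continuous. The function names, the tolerance, and the linear fade are assumptions of this example.

```python
def periods_for_integer_loop(period, max_periods=64, tol=1e-6):
    """Smallest number of periods whose total length is within tol of an integer.

    Returns (number_of_periods, loop_length_in_samples), or None if not found.
    """
    for k in range(1, max_periods + 1):
        total = k * period
        if abs(total - round(total)) <= tol:
            return k, int(round(total))
    return None

def crossfaded_loop(samples, start, end, fade):
    """Return samples[start:end] with its tail blended into the audio just
    before `start`, so wrapping from the loop end back to `start` is smooth.

    Requires fade <= start and fade <= end - start.
    """
    loop = list(samples[start:end])
    for i in range(fade):
        w = (i + 1) / fade                    # fade weight rises to 1.0
        tail = samples[end - fade + i]        # original end of the loop
        lead_in = samples[start - fade + i]   # audio leading into the loop start
        loop[len(loop) - fade + i] = (1.0 - w) * tail + w * lead_in
    return loop

loop_plan = periods_for_integer_loop(100.2)  # the slide's example period
```

After the crossfade the last loop sample equals the sample that originally preceded `start`, so playback continues into `samples[start]` without a jump.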

40 Concatenative Synthesis

§ Splicing sample segments based on input information – Typically done in speech synthesis: unit selection

§ Sample size depends on applications – ARS: limited expression and context-dependent • word or phrase level – TTS: unlimited expression and context-independent • phone or di-phone (phone-to-phone transition) level

41 Summary

                 Memory      Programmability        Reproducibility     Interpretability   Computation
                 (Storage)   (by # of parameters)   of natural sounds   of parameters      power
Additive         **          *****                  ****                ****               ****
Subtractive      *           ***                    **                  ***                **
Non-linear       *           ***                    **                  **                 **
Physical model   ***         **                     ****                *****              *** ~ *****
Sample-based     *****       *                      *****               N/A                * ~ ***
