<<

Historical Overview of Audio Spectral Modeling

Julius Smith CCRMA, Stanford University

Music 421 Applications Lecture

January 7, 2014

Julius Smith Music 421 Applications Lecture – 1 / 50 Milestones in Audio Spectral Modeling

• Fourier’s theorem (1822) Outline • Telharmonium (1898) Telharmonium • Voder (1920s) Voder • Channel Vocoder Vocoder (1920s)

Phase Vocoder • Hammond Organ (1930s)

Additive Synthesis • Vocoder (1966) FM Synthesis • Digital Organ (1968) Sinusoidal Modeling • (1969) Future • FM Brass Synthesis (1970) • Synclavier 8-bit FM/Additive synthesizer (1975) • FM singing voice (1978) • Sinusoidal Modeling (1985) • Sines + Noise (1988) • Sines + Noise + Transients (1988,1996,1998,2000) • Inverse FFT synthesis (1992) • Future Directions

Julius Smith Music 421 Applications Lecture – 2 / 50 Outline Telharmonium Voder Channel Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future Telharmonium (1898)

Julius Smith Music 421 Applications Lecture – 3 / 50 Telharmonium (Cahill 1898)

U.S. patent 580,035: “Art of and Apparatus for Generating and Distributing Music Electrically”

Julius Smith Music 421 Applications Lecture – 4 / 50 Telharmonium Rheotomes

Forerunner of the Hammond Organ Tone Wheels

Julius Smith Music 421 Applications Lecture – 5 / 50 Telharmonium Rotor: Forerunner of Hammond Organ Tone Wheels

Outline

Telharmonium • Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis

FM Synthesis

Sinusoidal Modeling

Future

Julius Smith Music 421 Applications Lecture – 6 / 50 Outline Telharmonium Voder Channel Vocoder Phase Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future The Voder (1939)

Julius Smith Music 421 Applications Lecture – 7 / 50 The Voder (Homer Dudley — 1939 Worlds Fair)

[< http://davidszondy.com/future/robot/voder.htm

Julius Smith Music 421 Applications Lecture – 8 / 50 Voder Keyboard

http://www.acoustics.hut.fi/publications/files/theses/ lemmetty mst/chap2.html — (from Klatt 1987)

Julius Smith Music 421 Applications Lecture – 9 / 50 Voder Schematic

http://ptolemy.eecs.berkeley.edu/~eal/audio/voder.html

Julius Smith Music 421 Applications Lecture – 10 / 50 Voder Demos

• Video Outline • Audio Telharmonium

Voder • Voder Keyboard • Voder Schematic • Voder Demos

Channel Vocoder

Phase Vocoder

Additive Synthesis

FM Synthesis

Sinusoidal Modeling

Future

Julius Smith Music 421 Applications Lecture – 11 / 50 Outline Telharmonium Voder Channel Vocoder Phase Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future The Channel Vocoder (1928) (“Voice Coder”)

Julius Smith Music 421 Applications Lecture – 12 / 50 Vocoder Analysis & Resynthesis (Dudley 1928)

Analysis: Outline

Telharmonium • Ten analog bandpass filters between 250 and 3000 Hz: Voder Bandpass → rectifier → lowpass filter → envelope Channel Vocoder • Voiced/Unvoiced decision made • Vocoder Examples • Fundamental F0 measured for voiced case Phase Vocoder

Additive Synthesis Synthesis:

FM Synthesis • Ten matching bandpass filters driven by a Sinusoidal Modeling

Future ◦ “buzz source” (voiced), or ◦ “hiss source” (unvoiced) • Bands were scaled by amplitude envelopes and summed • Said to have an “unpleasant electrical accent” Related Speech Models: • The Vocoder is an early source-filter model for speech • Linear Predictive Coding (LPC) of speech is another

Julius Smith Music 421 Applications Lecture – 13 / 50 Vocoder Filter Bank Analysis/Resynthesis

magnitude, or Outline magnitude and phase extraction Telharmonium

Voder A Channel Vocoder x0(t) xˆ0(t) • Vocoder Examples f

Phase Vocoder , A Transmission, Additive Synthesis xˆ(t) x(t) x1(t) Storage, xˆ1(t) Manipulation, FM Synthesis f Noise reduction, ... Sinusoidal Modeling

Future

A xN−1 xˆN−1

f

Analysis Processing Synthesis

Julius Smith Music 421 Applications Lecture – 14 / 50 Channel Vocoder Examples

Outline • Original Telharmonium

Voder • 10 channels, sine carriers Channel Vocoder • • Vocoder Examples 10 channels, narrowband-noise carriers

Phase Vocoder

Additive Synthesis • 26 channels, sine carriers FM Synthesis • 26 channels, narrowband-noise carriers Sinusoidal Modeling • 26 channels, narrowband-noise carriers, channels reversed Future • Phase Vocoder: Identity system in absence of modifications

• The FFT Phase Vocoder next transitioned to the Short- (STFT) (Allen and Rabiner 1977)

Julius Smith Music 421 Applications Lecture – 15 / 50 Outline Telharmonium Voder Channel Vocoder Phase Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future The Phase Vocoder (1966)

Julius Smith Music 421 Applications Lecture – 16 / 50 Phase Vocoder Analysis for Additive Synthesis (1976)

Outline Analysis Model Synthesis Model ω ∆ω Telharmonium Channel Filter ak ak(t) k+ k(t) Voder Response A F ∆ω Sine Osc Channel Vocoder k

Phase Vocoder 0 ω ω k Out Additive Synthesis • Early “channel vocoder” implementations (hardware) only FM Synthesis

Sinusoidal Modeling measured amplitude ak(t) (Dudley 1939) • Future The “phase vocoder” (Flanagan and Golden 1966) added phase tracking in each channel • Portnoff (1976) developed the FFT phase vocoder, which replaced the heterodyne comb in computer-music additive-synthesis analysis (James A. Moorer) • Inverse FFT synthesis (Rodet and Depalle 1992) gave faster sinusoidal oscillator banks

Julius Smith Music 421 Applications Lecture – 17 / 50 Amplitude and Frequency Envelopes

ak(t) Outline

Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis

FM Synthesis

Sinusoidal Modeling t ωk t φ˙k t Future ∆ ( )= ( )

0

t

Julius Smith Music 421 Applications Lecture – 18 / 50 Outline Telharmonium Voder Channel Vocoder Phase Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future Additive Synthesis (1969)

Julius Smith Music 421 Applications Lecture – 19 / 50 Classic Additive-Synthesis Analysis (Heterodyne Comb)

Outline

Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis • Additive Analysis • Additive Synthesis

FM Synthesis

Sinusoidal Modeling

Future

John Grey 1975 — CCRMA Tech. Reports 1 & 2 (CCRMA “STANM” reports — available online)

Julius Smith Music 421 Applications Lecture – 20 / 50 Classic Additive-Synthesis (Sinusoidal Oscillator Envelopes)

Outline

Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis • Additive Analysis • Additive Synthesis

FM Synthesis

Sinusoidal Modeling

Future

John Grey 1975 — CCRMA Tech. Reports 1 & 2 (CCRMA “STANM” reports — available online)

Julius Smith Music 421 Applications Lecture – 21 / 50 Classic Additive Synthesis Diagram (Computer Music, 1960s)

Outline A1(t) f1(t) A2(t) f2(t) A3(t) f3(t) A4(t) f4(t)

Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis • Additive Analysis • Additive Synthesis

FM Synthesis

Sinusoidal Modeling

Future

noise FIR Σ

4 t y(t)= Ai(t)sin ωi(t)dt + φi(0) Xi=1 Z0 

Julius Smith Music 421 Applications Lecture – 22 / 50 Classic Additive-Synthesis Examples

Outline • Bb Clarinet Telharmonium • Eb Clarinet Voder • Oboe Channel Vocoder • Bassoon Phase Vocoder • Tenor Saxophone Additive Synthesis • Additive Analysis • Trumpet • Additive Synthesis • English Horn FM Synthesis • French Horn Sinusoidal Modeling • Flute Future

• All of the above • Independently synthesized set

(Synthesized from original John Grey data)

Julius Smith Music 421 Applications Lecture – 23 / 50 Outline Telharmonium Voder Channel Vocoder Phase Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future Synthesis (1973)

Julius Smith Music 421 Applications Lecture – 24 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling waveforms using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling waveforms using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling waveforms using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling waveforms using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling waveforms using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling waveforms using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 Frequency Modulation (FM) Synthesis

Outline FM synthesis is normally used as a spectral modeling technique Telharmonium • Voder Discovered and developed (1970s) by John M. Chowning

Channel Vocoder (CCRMA Founding Director)

Phase Vocoder • Key paper: JAES 1973 (vol. 21, no. 7)

Additive Synthesis • Commercialized by Yamaha Corporation: FM Synthesis ◦ • FM Synthesis DX-7 synthesizer (1983) • FM Formula ◦ OPL chipset (SoundBlaster PC sound card) • FM Patch • FM Spectra ◦ Cell phone ring tones • FM Examples • FM Voice • On the physical modeling front, synthesis of vibrating-string Sinusoidal Modeling waveforms using finite differences started around this time: Future Hiller & Ruiz, JAES 1971 (vol. 19, no. 6)

Julius Smith Music 421 Applications Lecture – 25 / 50 FM Formula

Outline

Telharmonium x(t)= Ac sin[ωct + φc + Am sin(ωmt + φm)] Voder where Channel Vocoder

Phase Vocoder (Ac,ωc,φc) specify the carrier sinusoid Additive Synthesis (Am,ωm,φm) specify the modulator sinusoid

FM Synthesis • FM Synthesis Can also be called phase modulation • FM Formula • FM Patch • FM Spectra • FM Examples • FM Voice

Sinusoidal Modeling

Future

Julius Smith Music 421 Applications Lecture – 26 / 50 Simple FM “Brass” Patch (Chowning 1970–)

Jean-Claude Risset observation (1964–1969): Outline Brass bandwidth ∝ amplitude Telharmonium

Voder

Channel Vocoder

Phase Vocoder fm = f0 Additive Synthesis g FM Synthesis • FM Synthesis A F • FM Formula • FM Patch • FM Spectra • FM Examples fc = f0 • FM Voice

Sinusoidal Modeling Future A F

Out

Julius Smith Music 421 Applications Lecture – 27 / 50 FM (Bessel Function of First Kind)

Outline Harmonic number k, FM index β:

Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis

FM Synthesis 1 • FM Synthesis • FM Formula • FM Patch 0.5 • FM Spectra )

• FM Examples β ( k • FM Voice J 0 Sinusoidal Modeling 0

Future 2 −0.5 4 0 5 10 6 15 20 8 25 Order k 30 10 Argument β

Julius Smith Music 421 Applications Lecture – 28 / 50 Frequency Modulation (FM) Examples

All examples by John Chowning unless otherwise noted: Outline Telharmonium • FM brass synthesis Voder • Channel Vocoder Low Brass example

Phase Vocoder • Dexter Morril’s FM Trumpet Additive Synthesis • FM singing voice (1978) FM Synthesis • FM Synthesis Each synthesized using an FM operator pair • FM Formula (two sinusoidal oscillators) • FM Patch • FM Spectra • Chorus • FM Examples • FM Voice • Voices

Sinusoidal Modeling • Basso Profundo Future • Other early FM synthesis • Clicks and Drums • Big Bell • String Canon

Julius Smith Music 421 Applications Lecture – 29 / 50 FM Voice

Outline FM voice synthesis can be viewed as compressed modeling of Telharmonium spectral Voder

Channel Vocoder Carrier 2 Phase Vocoder Carrier 1 Carrier 3 Additive Synthesis

FM Synthesis • FM Synthesis • FM Formula

• FM Patch Magnitude • FM Spectra • FM Examples • FM Voice 0 Frequency f Sinusoidal Modeling Modulation Frequency (all three) Future

Julius Smith Music 421 Applications Lecture – 30 / 50 Outline Telharmonium Voder Channel Vocoder Phase Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future Sinusoidal Modeling Synthesis (1988)

Julius Smith Music 421 Applications Lecture – 31 / 50 Tracking Spectral Peaks in the Short-Time Fourier Transform

Outline Phases Telharmonium atan

Voder s(t) Channel Vocoder FFT Phase Vocoder Quadratic Peak Additive Synthesis dB mag Peak tracking Interpolation FM Synthesis window w(n) Amplitudes Sinusoidal Modeling • Sinusoidal Modeling • Spectral Trajectories • Sines + Noise • S+N Examples • STFT peak tracking at CCRMA: mid-1980s (PARSHL program) • S+N FX • S+N XSynth • Sines + Transients • Motivated by vocoder analysis of piano tones • S + N + Transients • • S+N+T TSM Influences: STFT (Allen and Rabiner 1977), • S+N+T Freq Map ADEC (1977), MAPLE (1979) • S+N+T Windows • HF Noise Modeling • Independently developed for speech coding by McAulay and • HF Noise Band Quatieri at Lincoln Labs (1985) • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 32 / 50 Example Spectral Trajectories

Outline f Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis

FM Synthesis

Sinusoidal Modeling • Sinusoidal Modeling • Spectral Trajectories • Sines + Noise • S+N Examples • S+N FX • S+N XSynth • Sines + Transients • S + N + Transients t • S+N+T TSM • S+N+T Freq Map • S+N+T Windows • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 33 / 50 Parametric Spectral Modeling

1 1 2 2 3 3 4 4 Outline A (t) f (t) A (t) f (t) A (t) f (t) A (t) f (t)

Telharmonium

Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis

FM Synthesis

Sinusoidal Modeling • Sinusoidal Modeling • Spectral Trajectories • Sines + Noise • S+N Examples • S+N FX • S+N XSynth • filter(t) Sines + Transients u(t) • S + N + Transients ht(τ) • S+N+T TSM P • S+N+T Freq Map • S+N+T Windows 4 • HF Noise Modeling t i i i t ∗ • HF Noise Band y(t)= A (t)cos 0 ω (t)dt + φ (0) +(h u)(t) i=1 • S+N+T Examples P hR i Future Julius Smith Music 421 Applications Lecture – 34 / 50 Sines + Noise Sound Examples

Outline Xavier Serra thesis demos (Sines + Noise signal modeling) Telharmonium • Voder Piano Channel Vocoder ◦ Original Phase Vocoder ◦ Sinusoids alone Additive Synthesis ◦ Residual after sinusoids removed FM Synthesis ◦ Sines + noise model Sinusoidal Modeling • Sinusoidal Modeling • Voice • Spectral Trajectories • Sines + Noise ◦ • S+N Examples Original • S+N FX ◦ Sinusoids • S+N XSynth ◦ • Sines + Transients Residual • S + N + Transients ◦ Synthesis • S+N+T TSM • S+N+T Freq Map • S+N+T Windows • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 35 / 50 Musical Effects with Sines+Noise Models

Outline • Piano Effects Telharmonium ◦ Voder Pitch downshift one octave ◦ Channel Vocoder Pitch flattened

Phase Vocoder ◦ Varying partial stretching Additive Synthesis • Voice Effects FM Synthesis

Sinusoidal Modeling ◦ Frequency-scale by 0.6 • Sinusoidal Modeling ◦ Frequency-scale by 0.4 and stretch partials • Spectral Trajectories • Sines + Noise ◦ Variable time-scaling, deterministic to stochastic • S+N Examples • S+N FX • S+N XSynth • Sines + Transients • S + N + Transients • S+N+T TSM • S+N+T Freq Map • S+N+T Windows • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 36 / 50 Cross-Synthesis with Sines+Noise Models

Outline • Voice “modulator” Telharmonium Voder • Creaking ship’s mast “carrier” Channel Vocoder • Voice-modulated creaking mast Phase Vocoder • Same with modified spectral envelopes Additive Synthesis

FM Synthesis

Sinusoidal Modeling • Sinusoidal Modeling • Spectral Trajectories • Sines + Noise • S+N Examples • S+N FX • S+N XSynth • Sines + Transients • S + N + Transients • S+N+T TSM • S+N+T Freq Map • S+N+T Windows • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 37 / 50 Sines + Transients Sound Examples

Outline In this technique, the sinusoidal sum is phase-matched at the Telharmonium cross-over point only (with no cross-fade). Voder

Channel Vocoder • Marimba

Phase Vocoder ◦ Original Additive Synthesis ◦ Sinusoidal model FM Synthesis ◦ Sinusoidal Modeling Original attack, followed by sinusoidal model • Sinusoidal Modeling • Spectral Trajectories • Piano • Sines + Noise • S+N Examples ◦ Original • S+N FX ◦ Sinusoidal model • S+N XSynth • Sines + Transients ◦ Original attack, followed by sinusoidal model • S + N + Transients • S+N+T TSM • S+N+T Freq Map • S+N+T Windows • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 38 / 50 Multiresolution Sines + Noise + Transients

Outline Why Model Transients Separately? Telharmonium

Voder • Sinusoids efficiently model spectral peaks over time Channel Vocoder • Filtered noise efficiently models spectral residual vs. t Phase Vocoder • Neither is good for abrupt transients in the Additive Synthesis • Phase-matched oscillators are expensive FM Synthesis • More efficient to switch to a transient model during transients Sinusoidal Modeling • • Sinusoidal Modeling Need sinusoidal phase matching at the switching • Spectral Trajectories • Sines + Noise Transient models: • S+N Examples • S+N FX • S+N XSynth • Original waveform slice (1988) • Sines + Transients • • S + N + Transients expansion (Ali 1996) • S+N+T TSM • MPEG-2 AAC (with short window) (Levine 1998) • S+N+T Freq Map • • S+N+T Windows Frequency-domain LPC • HF Noise Modeling (time-domain amplitude envelope) (Verma 2000) • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 39 / 50 Time Scale Modification of Sines + Noise + Transients Models

Outline transients transients sines+ sines+ sines+ Telharmonium originalsignal noise noise noise Voder

Channel Vocoder

Phase Vocoder

Additive Synthesis transients transients time-scaled sines+ sines+ sines+ FM Synthesis signal noise noise noise Sinusoidal Modeling • Sinusoidal Modeling • Spectral Trajectories • Sines + Noise time • S+N Examples • S+N FX • S+N XSynth Time-Scale Modification (TSM) becomes well defined: • Sines + Transients • S + N + Transients • Transients are translated in time • S+N+T TSM • S+N+T Freq Map • Sinusoidal envelopes are scaled in time • S+N+T Windows • • HF Noise Modeling Noise-filter envelopes also scaled in time • HF Noise Band • Dual of TSM is frequency scaling • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 40 / 50 Sines + Noise + Transients Time-Frequency Map

16 Outline 14 Telharmonium 12 Voder

Channel Vocoder 10

Phase Vocoder 8

Additive Synthesis 6 frequency [kHz] FM Synthesis 4

Sinusoidal Modeling 2 • Sinusoidal Modeling • 0 Spectral Trajectories 0 50 100 150 200 250 • Sines + Noise 1 • S+N Examples 0.5 • S+N FX • S+N XSynth 0 • Sines + Transients amplitude • S + N + Transients −0.5 • S+N+T TSM −1 • S+N+T Freq Map 0 50 100 150 200 250 time [milliseconds] • S+N+T Windows • HF Noise Modeling (Levine 1998) • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 41 / 50 Corresponding Analysis Windows

Outline

Telharmonium transient Voder

Channel Vocoder

Phase Vocoder high octave Additive Synthesis

FM Synthesis

Sinusoidal Modeling

• Sinusoidal Modeling middle octave • Spectral Trajectories • Sines + Noise • S+N Examples low octave • S+N FX • S+N XSynth • Sines + Transients • S + N + Transients • S+N+T TSM • S+N+T Freq Map • S+N+T Windows amplitude • HF Noise Modeling • HF Noise Band 0 50 100 150 200 250 • S+N+T Examples time [milliseconds]

Future Julius Smith Music 421 Applications Lecture – 42 / 50 Quasi-Constant-Q (Wavelet) Time-Frequency Map

16 Outline 14 Telharmonium 12 Voder

Channel Vocoder 10

Phase Vocoder 8

Additive Synthesis 6 frequency [kHz] FM Synthesis 4

Sinusoidal Modeling 2 • Sinusoidal Modeling • Spectral Trajectories 0 0 4 50 100 150 200 250 • Sines + Noise x 10 • S+N Examples 2 • S+N FX 1 • S+N XSynth 0 • Sines + Transients

• S + N + Transients amplitude −1 • S+N+T TSM −2 • S+N+T Freq Map 0 50 100 150 200 250 • S+N+T Windows time [milliseconds] • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 43 / 50 Bark-Band Noise Modeling at High Frequencies (Levine 1998)

Outline 85

Telharmonium 80 Voder

Channel Vocoder 75

Phase Vocoder 70 Additive Synthesis 65 FM Synthesis

Sinusoidal Modeling 60 • Sinusoidal Modeling • Spectral Trajectories 55 • Sines + Noise magnitude [dB] • S+N Examples 50 • S+N FX • S+N XSynth 45 • Sines + Transients • S + N + Transients 40 • S+N+T TSM • S+N+T Freq Map 35 • S+N+T Windows • HF Noise Modeling 30 • 0 500 1000 1500 2000 2500 3000 3500 4000 4500 HF Noise Band frequency [Hz] • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 44 / 50 Amplitude Envelope for One Noise Band

Outline 80

Telharmonium 70 Voder 60 Channel Vocoder

Phase Vocoder 50 Original Mag. [dB] Additive Synthesis 40 FM Synthesis 0 50 100 150 200 250 300 350

Sinusoidal Modeling • Sinusoidal Modeling 80 • Spectral Trajectories 75 • Sines + Noise 70 • S+N Examples 65 • S+N FX 60 • S+N XSynth 55 • Sines + Transients LSA Mag. [dB] • S + N + Transients 50 • S+N+T TSM 45 • 0 50 100 150 200 250 300 350 S+N+T Freq Map time [milliseconds] • S+N+T Windows • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 45 / 50 Sines + Noise + Transients Sound Examples

Outline Scott Levine Thesis Demos (Sines + Noise + Transients at 32 kbps) Telharmonium (http://ccrma.stanford.edu/~scottl/thesis.html) Voder

Channel Vocoder

Phase Vocoder Additive Synthesis “It Takes Two” by Rob Base & DJ E-Z Rock FM Synthesis

Sinusoidal Modeling • Original • Sinusoidal Modeling • MPEG-AAC at 32 kbps • Spectral Trajectories • Sines + Noise • Sines+transients+noise at 32 kbps • S+N Examples • S+N FX • S+N XSynth • Multiresolution sinusoids • Sines + Transients • S + N + Transients • Residual Bark-band noise • S+N+T TSM • Transform-coded transients (AAC) • S+N+T Freq Map • S+N+T Windows • Bark-band noise above 5 kHz • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 46 / 50 Time Scale Modification using Sines + Noise + Transients

Outline Scott Levine Thesis Demos (Sines + Noise + Transients at 32 kbps) Telharmonium (http://ccrma.stanford.edu/~scottl/thesis.html) Voder

Channel Vocoder

Phase Vocoder Additive Synthesis Time-Scale Modification (pitch unchanged) FM Synthesis • S+N+T time-scale factors [2.0, 1.6, 1.2, 1.0, 0.8, 0.6, 0.5] Sinusoidal Modeling • Sinusoidal Modeling • Spectral Trajectories • Sines + Noise • S+N Examples S+N+T Pitch Shifting (timing unchanged) • S+N FX • S+N XSynth • Pitch-scale factors [0.89, 0.94, 1.00, 1.06, 1.12] • Sines + Transients • S + N + Transients • S+N+T TSM • S+N+T Freq Map • S+N+T Windows • HF Noise Modeling • HF Noise Band • S+N+T Examples

Future Julius Smith Music 421 Applications Lecture – 47 / 50 Outline Telharmonium Voder Channel Vocoder Phase Vocoder Additive Synthesis FM Synthesis Sinusoidal Modeling Future Summary and Future Prospects

Julius Smith Music 421 Applications Lecture – 48 / 50 Spectral Modeling History Highlights

• Bernoulli’s modal sums (1733) Perceptual audio coding: • Fourier’s initial theorem (1822) • Princen-Bradley filterbank • Telharmonium (1906) (1986) • Hammond organ (1930s) • K. Brandenburg thesis (1989) • Channel Vocoder (1939) • Auditory masking usage • Phase Vocoder (1966) • Dolby AC2 • “Additive Synthesis” (1969) • Musicam • FFT Phase Vocoder (1976) • ASPEC • Sinusoidal Modeling • MPEG-I,II,IV (1977,1979,1985) (S+N+T “parametric ”) • Sines+Noise (1989) • Sines+Transients (1989) • TF Reassignment (1995) • Sines+Noise+Transients (1998)

Julius Smith Music 421 Applications Lecture – 49 / 50 Future Prospects

Observations: • Sinusoidal modeling of sound is “Unreasonably Effective” • Basic “auditory masking” discards ≈ 90% information • Interesting neuroscience observation: “... most neurons in the primary auditory cortex A1 are silent most of the time ...” (from “Sparse Time-Frequency Representations”, Gardner and Magnesco, PNAS:103(16), April 2006) • What is the right “psychospectral model” for sound? ◦ The cochlea of the ear is a real-time analyzer ◦ How is the “ear’s spectrogram” represented at higher levels?

Julius Smith Music 421 Applications Lecture – 50 / 50 Future Prospects

Observations: • Sinusoidal modeling of sound is “Unreasonably Effective” • Basic “auditory masking” discards ≈ 90% information • Interesting neuroscience observation: “... most neurons in the primary auditory cortex A1 are silent most of the time ...” (from “Sparse Time-Frequency Representations”, Gardner and Magnesco, PNAS:103(16), April 2006) • What is the right “psychospectral model” for sound? ◦ The cochlea of the ear is a real-time ◦ How is the “ear’s spectrogram” represented at higher levels?

Julius Smith Music 421 Applications Lecture – 50 / 50 Future Prospects

Observations: • Sinusoidal modeling of sound is “Unreasonably Effective” • Basic “auditory masking” discards ≈ 90% information • Interesting neuroscience observation: “... most neurons in the primary auditory cortex A1 are silent most of the time ...” (from “Sparse Time-Frequency Representations”, Gardner and Magnesco, PNAS:103(16), April 2006) • What is the right “psychospectral model” for sound? ◦ The cochlea of the ear is a real-time spectrum analyzer ◦ How is the “ear’s spectrogram” represented at higher levels?

Julius Smith Music 421 Applications Lecture – 50 / 50 Future Prospects

Observations: • Sinusoidal modeling of sound is “Unreasonably Effective” • Basic “auditory masking” discards ≈ 90% information • Interesting neuroscience observation: “... most neurons in the primary auditory cortex A1 are silent most of the time ...” (from “Sparse Time-Frequency Representations”, Gardner and Magnesco, PNAS:103(16), April 2006) • What is the right “psychospectral model” for sound? ◦ The cochlea of the ear is a real-time spectrum analyzer ◦ How is the “ear’s spectrogram” represented at higher levels?

Julius Smith Music 421 Applications Lecture – 50 / 50 Future Prospects

Observations: • Sinusoidal modeling of sound is “Unreasonably Effective” • Basic “auditory masking” discards ≈ 90% information • Interesting neuroscience observation: “... most neurons in the primary auditory cortex A1 are silent most of the time ...” (from “Sparse Time-Frequency Representations”, Gardner and Magnesco, PNAS:103(16), April 2006) • What is the right “psychospectral model” for sound? ◦ The cochlea of the ear is a real-time spectrum analyzer ◦ How is the “ear’s spectrogram” represented at higher levels?

Julius Smith Music 421 Applications Lecture – 50 / 50 Future Prospects

Observations: • Sinusoidal modeling of sound is “Unreasonably Effective” • Basic “auditory masking” discards ≈ 90% information • Interesting neuroscience observation: “... most neurons in the primary auditory cortex A1 are silent most of the time ...” (from “Sparse Time-Frequency Representations”, Gardner and Magnesco, PNAS:103(16), April 2006) • What is the right “psychospectral model” for sound? ◦ The cochlea of the ear is a real-time spectrum analyzer ◦ How is the “ear’s spectrogram” represented at higher levels?

Julius Smith Music 421 Applications Lecture – 50 / 50 Future Prospects

Observations: • Sinusoidal modeling of sound is “Unreasonably Effective” • Basic “auditory masking” discards ≈ 90% information • Interesting neuroscience observation: “... most neurons in the primary auditory cortex A1 are silent most of the time ...” (from “Sparse Time-Frequency Representations”, Gardner and Magnesco, PNAS:103(16), April 2006) • What is the right “psychospectral model” for sound? ◦ The cochlea of the ear is a real-time spectrum analyzer ◦ How is the “ear’s spectrogram” represented at higher levels?

Julius Smith Music 421 Applications Lecture – 50 / 50