<<

ELEN E4896 PROCESSING Lecture 6: Linear Prediction (LPC)

1. & The Source-Filter Model 2. Linear Prediction (LPC) 3. LP Representations 4. LP Synthesis & Modification

Dan Ellis Dept. Electrical Engineering, Columbia University [email protected] http://www.ee.columbia.edu/~dpwe/e4896/ E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 1 /16 1. Resonance • Resonance is ubiquitous in physical systems e.g. plucked/struck string, drum head A

room “coloration” B vocal tract

C Time • D ↔ Poles easily implemented 00 in LTI filters Position Velocity

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 2 /16 Singe Pole-Pair Resonance x[n] y[n] • Simple resonance = + Second-order IIR z-1

Band Pass Filter 2rcosωc z-1

imag 1 2 0.5 -r r 0 ωc

-0.5

-1 z-plane

-1 -0.5 0 0.5 1 real )| jw ωc ωc |H(e Q = B B ≈ 2(1-r)

0 0 0.1 0.2 0.3 ω / π impulse_bpf.pd E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 3 /16 Resonances in • Vocal Tract (throat + tongue + lips) acts as variable resonator resonances = “

4000

3000

freq / Hz 2000

1000

^ 0 e 0 0.5 1 1.5 h ε z w tcl c θ I n timeI z / s I d ay m has a watch thin as a dime

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 4 /16 Source-Filter Model • Separation of: Source: fine structure in time/frequency Filter: subsequent shaping by physical resonances

Formants t Pitch Glottal pulse train Vocal tract Voiced/ resonances Radiation Speech unvoiced + characteristic Frication f t Source Filter • Advantages Good match to real Salient pieces

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 5 /16 2. Linear Prediction (LPC) • LPC = remove redundancy in signal try to predict next point as linear combination of previous values p s[n]= a s[n k]+e[n] k k=1 th a k are p order linear predictor coefficients { e [ n ] } is residual “innovation” a/k/a prediction error • Transfer function S(z) 1 1 = = E(z) 1 p a z k A(z) k=1 k all-pole “autoregressive” (AR) modeling E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 6 /16 Modeling & LPC Direct expression of source-filter model • p s[n]= a s[n k]+e[n] k k=1 Pulse/noise Vocal tract excitation 1 e[n] H(z) = /A(z) s[n]

H(z) |H(ejω)|

f z-plane acoustic tube model of vocal tract is all-pole vocal tract resonances change slowly ~ 10-20ms but: nasals

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 7 /16 Estimating LPC Models • You can “see” -40 resonances in -60 -80 a spectral slice: dB / energy -100 0 1000 2000 3000 freq / Hz

We can find LPC coefficients ak • { } to minimize energy of residual e [ n ] : p 2 e2[n]= s[n] a s[n k] k n n k=1 differentiate w.r.t. ak & solve end up with p linear equations involving autocorrelations r ( j k )= s[n j]s[n k] ss | | n E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 8 /16 LPC Illustration 0.1

0 windowed original -0.1

-0.2 LPC residual -0.3 0 50 100 150 200 250 300 350 400 time / samp dB original spectrum 0 LPC spectrum -20

-40 residual spectrum -60 0 1000 2000 3000 4000 5000 6000 7000 freq / Hz • Actual poles:

z-plane E4896_L06_lpc_diary E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 9 /16 Short-Time LP Analysis • Solve LPC for each ~20 ms frame

8

6

freq / kHz / freq 4

2

0 20 1 15

0.5 10

12 5 0 0

Imaginary Part Imaginary -0.5 -5

-10 -1 -1 0 1 -15 0 0.2 0.4 0.6 0.8 1 Real Part 8

6

freq / kHz / freq 4

2

0 0.5 1 1.5 2 2.5 3 time / s E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 10 /16 3. LP Representations • Can interpret LPC filter fit many ways: • Picking out resonances if signal was source + resonances, should find them • Low-order spectral approximation minimizing e 2 [ n ] also minimizes E(ej) 2 | | different from e.g. Fourier approximation... Finding & removing smooth spectrum • 1 A ( z ) is smooth approximation of S(z) S(z) 1 = E ( z )= S ( z ) A ( z ) is “unsmoothed” S(z) E(z) A(z) • Signal whitening removing linear dependence makes residual like white noise (iid, flat spectrum) E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 11 /16 Alternative Forms • Many formulations for pth order all-pole IIR: predictor coefficients a / polynomial A(z) { k} roots = r e j i of A(z) { i i } reflection coefficients (for lattice filter structure) Line Spectral Frequencies (LSF) • Choice depends on: mathematical convenience numerical stability statistical properties (e.g. for coding) opportunities for modification

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 12 /16 4. LPC Synthesis • LP analysis on ~20ms frames gives prediction filter A ( z ) and residual e[n] recombining them should yield perfect s[n] coding applications further compress e[n]

Encoder Filter coefficients {ai} Decoder |1/A(ej!)| Represent Input f s[n] & encode LPC Output analysis e^[n] s^[n] Represent Excitation All-pole Residual & encode generator filter e[n] 1 t H(z) = -i 1 - "aiz e.g. simple pitch tracker ➝ “buzz-hiss” encoding Pitch period values 16 ms frame boundaries 100 50 0 -50 1.3 1.35 1.4 1.45 1.5 1.55 1.6 1.65 1.7 1.75 time / s E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 13 /16 LPC Warping Replacing delays z-1 with allpass elements z + • z +1 warps frequencies but not magnitudes

Original 8000

6000 P W 4000

= 0.6 Frequency 0.8 A 2000

0.6 0 0.5 1 1.5 2 2.5 3 Time 0.4 Warped LPC resynth, = -0.2 8000 0.2 A = -0.6 6000 0 0 0.2 0.4 0.6 0.8 P 4000 ^ W Frequency 2000

0 0.5 1 1.5 2 2.5 3 Time

http://www.ee.columbia.edu/~dpwe/resources/matlab/polewarp/

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 14 /16 Cross-Synthesis • Mix residual (source) of one signal with resonances (filter) of another or: just use white noise as excitation formants carry phonemes ➝

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 15 /16 Summary • Resonances (poles) color sound

• Source + Filter model decouples excitation and resonances

• Linear Prediction is a simple way to model and implement resonances (filter)

• Many interpretations, representations, modifications

E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 16 /16