Linear Prediction (LPC)

ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 6: Linear Prediction (LPC) 1. Resonance & The Source-Filter Model 2. Linear Prediction (LPC) 3. LP Representations 4. LP Synthesis & Modification Dan Ellis Dept. Electrical Engineering, Columbia University [email protected] http://www.ee.columbia.edu/~dpwe/e4896/ E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 1 /16 1. Resonance • Resonance is ubiquitous in physical systems e.g. plucked/struck string, drum head A room “coloration” B vocal tract C Time • Resonances D ↔ Poles easily implemented 00 in LTI filters Position Velocity E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 2 /16 Singe Pole-Pair Resonance x[n] y[n] • Simple resonance = + Second-order IIR z-1 Band Pass Filter 2rcosωc z-1 imag 1 2 0.5 -r r 0 ωc -0.5 -1 z-plane -1 -0.5 0 0.5 1 real )| jw ωc ωc |H(e Q = B B ≈ 2(1-r) 0 0 0.1 0.2 0.3 ω / π impulse_bpf.pd E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 3 /16 Resonances in Speech • Vocal Tract (throat + tongue + lips) acts as variable resonator resonances = “formants” 4000 3000 freq / Hz 2000 1000 ^ 0 e 0 0.5 1 1.5 h ε z w tcl c θ I n timeI z / s I d ay m has a watch thin as a dime E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 4 /16 Source-Filter Model • Separation of: Source: fine structure in time/frequency Filter: subsequent shaping by physical resonances Formants t Pitch Glottal pulse train Vocal tract Voiced/ resonances Radiation Speech unvoiced + characteristic Frication f noise t Source Filter • Advantages Good match to real signals Salient pieces E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 5 /16 2. Linear Prediction (LPC) • LPC = Linear Predictive Coding remove redundancy in signal try to predict next point as linear combination of previous values p s[n]= a s[n k]+e[n] k − k=1 th a k are p order linear predictor coefficients { e [ n ] } is residual “innovation” a/k/a prediction error • Transfer function S(z) 1 1 = = E(z) 1 p a z k A(z) − k=1 k − all-pole “autoregressive” (AR) modeling E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 6 /16 Voice Modeling & LPC Direct expression of source-filter model • p s[n]= a s[n k]+e[n] k − k=1 Pulse/noise Vocal tract excitation 1 e[n] H(z) = /A(z) s[n] H(z) |H(ejω)| f z-plane acoustic tube model of vocal tract is all-pole vocal tract resonances change slowly ~ 10-20ms but: nasals E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 7 /16 Estimating LPC Models • You can “see” -40 resonances in -60 -80 a spectral slice: energy / dB -100 0 1000 2000 3000 freq / Hz We can find LPC coefficients ak • { } to minimize energy of residual e [ n ] : p 2 e2[n]= s[n] a s[n k] − k − n n k=1 differentiate w.r.t. ak & solve end up with p linear equations involving autocorrelations r ( j k )= s[n j]s[n k] ss | − | − − n E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 8 /16 LPC Illustration 0.1 0 windowed original -0.1 -0.2 LPC residual -0.3 0 50 100 150 200 250 300 350 400 time / samp dB original spectrum 0 LPC spectrum -20 -40 residual spectrum -60 0 1000 2000 3000 4000 5000 6000 7000 freq / Hz • Actual poles: z-plane E4896_L06_lpc_diary E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 9 /16 Short-Time LP Analysis • Solve LPC for each ~20 ms frame 8 6 freq / kHz 4 2 0 20 1 15 0.5 10 12 5 0 0 Imaginary Part -0.5 -5 -10 -1 -1 0 1 -15 0 0.2 0.4 0.6 0.8 1 Real Part 8 6 freq / kHz 4 2 0 0.5 1 1.5 2 2.5 3 time / s E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 10 /16 3. LP Representations • Can interpret LPC filter fit many ways: • Picking out resonances if signal was source + resonances, should find them • Low-order spectral approximation minimizing e 2 [ n ] also minimizes E(ejω) 2 | | different from e.g. Fourier approximation... Finding & removing smooth spectrum • 1 A ( z ) is smooth approximation of S(z) S(z) 1 = E ( z )= S ( z ) A ( z ) is “unsmoothed” S(z) E(z) A(z) ⇒ • Signal whitening removing linear dependence makes residual like white noise (iid, flat spectrum) E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 11 /16 Alternative Forms • Many formulations for pth order all-pole IIR: predictor coefficients a / polynomial A(z) { k} roots λ = r e j ω i of A(z) { i i } reflection coefficients (for lattice filter structure) Line Spectral Frequencies (LSF) • Choice depends on: mathematical convenience numerical stability statistical properties (e.g. for coding) opportunities for modification E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 12 /16 4. LPC Synthesis • LP analysis on ~20ms frames gives prediction filter A ( z ) and residual e[n] recombining them should yield perfect s[n] coding applications further compress e[n] Encoder Filter coefficients {ai} Decoder |1/A(ej!)| Represent Input f s[n] & encode LPC Output analysis e^[n] s^[n] Represent Excitation All-pole Residual & encode generator filter e[n] 1 t H(z) = -i 1 - "aiz e.g. simple pitch tracker ➝ “buzz-hiss” encoding Pitch period values 16 ms frame boundaries 100 50 0 -50 1.3 1.35 1.4 1.45 1.5 1.55 1.6 1.65 1.7 1.75 time / s E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 13 /16 LPC Warping Replacing delays z-1 with allpass elements z + α • αz +1 warps frequencies but not magnitudes Original 8000 6000 P W 4000 = 0.6 Frequency 0.8 A 2000 0.6 0 0.5 1 1.5 2 2.5 3 Time 0.4 Warped LPC resynth, = -0.2 8000 0.2 A = -0.6 6000 0 0 0.2 0.4 0.6 0.8 P 4000 ^ W Frequency 2000 0 0.5 1 1.5 2 2.5 3 Time http://www.ee.columbia.edu/~dpwe/resources/matlab/polewarp/ E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 14 /16 Cross-Synthesis • Mix residual (source) of one signal with resonances (filter) of another or: just use white noise as excitation formants carry phonemes ➝ vocoder E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 15 /16 Summary • Resonances (poles) color sound • Source + Filter model decouples excitation and resonances • Linear Prediction is a simple way to model and implement resonances (filter) • Many interpretations, representations, modifications E4896 Music Signal Processing (Dan Ellis) 2013-02-25 - 16 /16.

Load more