Digital Speech Processing— Lecture 10 Short-Time Fourier Analysis

Digital Speech Processing— Lecture 10 Short-Time Fourier Analysis

Digital Speech Processing— Lecture 10 Short-Time Fourier Analysis Methods - Filter Bank Design 1 Review of STFT ∞ jjmωωˆˆˆ − 1.()[][]Xenˆ =−∑ xmwnme m=−∞ function of nˆ (looks like a time sequence) function of ωˆ (looks like a transform) jωˆ ˆ Xenˆ ( ) defined for n=≤≤123, , ,...; 0 ωπˆ jωˆ 2. Interpretations of Xenˆ ( ) ˆ jω ˆ 1.n fixed, ωω==−ˆ variable; Xenˆ ( ) DTFT⎣⎡ xmwnm [ ] [ ]⎦⎤ ⇒⇒DFT View OLA implementation jjnωωˆˆ− 2. nn==∗ˆ variable, ωˆ fixed; Xen ( ) xne [ ] wn [ ] ⇒⇒Linear Filtering view filter bank implementation ⇒ FBS implementation 2 Review of STFT 3. Sampling Rates in Time and Frequency ⇒ ˆ jωˆ recover xn[ ] exactly from Xnˆ ( e ) 1. time: We(jω ) has bandwidth of B Hertz ⇒ 2B samples/sec rate 2F Hamming Window: B =⇒S (Hz) sample at L 4F S (Hz) or every L/4 samples L 2. frequency: wn[ ] is time limited to L samples ⇒ need at least L frequency samples to avoid time aliasing 3 Review of STFT with OLA method can recover xn( ) exactly using lower sampling rates in either time or frequency, e.g., can sample every L samples (and divide by window), or can use fewer than L frequency samples (filter bank channels); but these methods are highly subject to aliasing errors with any modifications to STFT can use windows (LPF) that are longer than L samples and still recover with NL<⇒ frequency channels; e.g., ideal LPF is infinite in time duration, but with zeros spaced N samples apart where FNS / is the BW of the ideal LPF 4 Review of STFT jω H0(e ) jω H1(e ) X(ejω) Y(ejω) + … jω HN-1(e ) jnωk hnk []= wne [] N−1 jjωω He%()==∑ Hk () e 1 k =0 N−1 hn%[]===∑ hk [] nδ [] n wnpn [][] k =0 ∞ pn[]=− N∑ δ [ n rN ] r =−∞ ∞ hn%[]=−=++ N∑ wrN [ ][δ n rN ] w []0 wN [ ] ... r =−∞ ⇒ need to design digital filters that match criteria for exact reconstruction of xn[ ] and which still work with modifications to STFT 5 Tree-Decimated Filter Banks 6 Tree-Decimated Filter Banks • can sample STFT in time and frequency using lowpass filter (window) which is moved in jumps of R<L samples if L≤N where: – L is the window length, – R is the window shift, and – N is the number of frequency channels • for a given channel at ω=ωk, the sampling rate of the STFT need only be twice the bandwidth of the window Fourier transform – down-sample STFT estimates by a factor of R at the transmitter – up-sample back to original sampling rate at the receiver – final output formed by convolution of the up-sampled STFT with an appropriate lowpass filter, f [n] 7 Filter Bank Channels Fully decimated and interpolated filter bank channels; (a) analysis with bandpass filter, down- shifting frequency and down-sampling; (b) synthesis with up-sampler followed by lowpass interpolation filter and frequency up-shift; (c) analysis with frequency down-shift followed by lowpass filter followed by down-sampling; (d) synthesis with up-sampling followed by frequency 8 up-shift followed by bandpass filter. Full Implementation of Analysis-Synthesis System Full implementation of STFT analysis/synthesis system with channel decimation by a factor of R, and channel interpolation by 9 the same factor R (including a box for short-time modifications. Review of Down and Up-Sampling x[n] xnd []= xnR [ ] ↓ R R−1 Xe()jωωπ= 1 Xe (jrR()/−2 ) d R ∑ r =0 aliasing addition ⎧xn[ / R ] n=±± 0, 1, 2,... x[n] xnu []= ⎨ ↑ R ⎩ 0otherwise jjRωω Xeu ()= Xe ( ) imaging removal 10 Filter Bank Reconstruction • assuming no short time modifications jjmωωkk− XkrR( )== Xe rR ( )∑ wrRmxme ( − ) ( ) mW∈ rR N−1 1 jjnωωkk yn()= ∑ PkY ()rR ( e ) e N k =0 N−1 1 jnωk =−∑Pk()[] XrR [][ kfn rR ] e N k =0 ∞ N−1 ⎡⎤1 jω ()nm− =−−∑∑xm()⎢⎥ wrR ( mfn )( rR )∑ Pke () k mW∈=−∞rR ⎣⎦ rN k =0 •=∀ recognizing that when Pk( )1 , k 1 N−∞1 ∑∑enmqNjnmωk ()− =−−δ ( ) N kq==−∞0 •=∀ can show that the condition for yn( ) xn ( ), n is ∞ ∑ wrR()()()−+ n qNfn − rR =δ q ⇒ m = n 11 r =−∞ Filter Bank Reconstruction • frequency domain equations, assuming: jnωωkk jn hnkk[]== wne [] ; gn [] fne [] ; jjωωωωωωkkkk()−− jj () Hekk()== We(); Ge () Fe (); N−1 jω 1 jωk Ye( )=⋅∑ PkG ( ) k () e N k=0 22ππ ⎡⎤R−1 jj()ωω−− () 1 RRll ⎢⎥∑Hek ()() Xe ⎣⎦R l=0 N−1 jω 1 jωk jω = Xe() ∑PkG()kk() e H ( e ) RN k=0 RN−−1122ππ jj()ωω−−ll1 () RRjωk +⋅∑∑Xe() PkG ()()()kk e H e l==10RN k =⋅+He% ()()aliasingjjωω Xe terms 12 Filter Bank Reconstruction •= conditions for perfect reconstruction (Yeˆ (jjωω ) Xe ( )) 1 N−1 % jjωωjωk He()==∑ PkG ()()()kk e H e 1 RN k=0 Flat Gain and N−1 2π 1 j()ω− l jωk R ∑PkG( )kk ( e ) H ( e )==012 forl , ,..., R RN k=0 Alias Cancellation 13 Filter Bank Reconstruction 14 Decimated Analysis- Synthesis Filter Bank •=== practical solution for the case wn( ) fn ( ), N R 2 (2-band solution) where the conditions for exact reconstruction become 1 ⎡⎤PFeWe(01 ) (jjω ) ( ω )+= PFe ( ) ( j() ωπ−− ) We ( j () ωπ ) 1 4 ⎣⎦ and 1 ⎡⎤PFeWe()(0 jjωωπ ) ( ()− )()(+=PFe10jj()ωπ−− )( We ( ω2 π ) ) 4 ⎣⎦ •=−== aliasing cancels exactly if PP(011 ) ( ) (with FeWe(jjωω ) ( ))giving 1 22 ⎡⎤We(jjωωπω )−= We (()−− ) e jM 4 ⎣⎦⎢⎥()() ynˆ()=− xn ( M ) 15 Decimated Analysis- Synthesis Filter Bank when aliasing is completely eliminated (by suitable choice of filters), the overall analysis-synthesis system has frequency response: N −1 jω 1 j()ωω− k He% ()= ∑ PkW [](e e ) RN k=0 where jjjωωω Wee ()= We ()() Fe h%[]nwnpn= e [][] where wne[]=∗ wn [] fn [] 1 N −1 pn[]= ∑ Pk [] ejnωk RN k=0 16 Decimated Analysis- Synthesis Filter Bank • To design a decimated analysis/synthesis filter bank system, need to determine the following: 1. the number of channels, N, and the decimation/interpolation ratio R. (Usually determined by the desired frequency resolution) 2. the window, w[n], and the interpolation filter impulse response, f [n]. (These are lowpass filters that should have good frequency-selective properties such that the bandpass channel responses do not overlap into more than one band on either side of the channel). 3. the complex gain factors, P[k], k=0,1,…,N-1. (These constants are important for achieving flat overall response and alias cancellation.) 17 Maximally Decimated Filter Banks • we show here that it is possible to obtain exact reconstruction with R=N and N<L. – R=N is termed maximal decimation because this is the largest decimation factor that can be used with an N- channel filter bank and still achieve exact reconstruction. – nominal bandwidth of the bandpass channel signal is increased from ±π/N to ±π – down-sampling by N accomplishes frequency down- shifting normally accomplished by modulation 18 Maximally Decimated Filter Banks 19 Analysis-Synthesis Filter Bank (all modulators removed) 20 Two Channel Filter Banks 21 Two Channel Filter Banks For this two-band system, we have: RN==2;ωπk = kk , = 0,1 Applying the frequency domain analysis we get: 1 Ye()jjjjjjωωωωωω=+⎡⎤ G ()() e H e G ()() e H e Xe () 2 ⎣⎦00 11 1 ++⎡Ge()(jjωωπωωπ He()−− ) Ge ()( jj He () )⎤ Xe()j()ωπ− 2 ⎣⎦00 11 not including multiplier 1/NPP or the gain factors [0], [1] Aliasing can be eliminated completely by choosing the filters such that: ⎡⎤jjωωπωωπ()−− jj () ⎣⎦Ge00()( He )+= Ge 11 ()( He ) 0(aliasing cancelation) Perfect reconstruction requires: 1 ⎡⎤Ge()()jjωω He+= Ge ()()1 jj ωω He (flat gain condition) 22 2 ⎣⎦00 11 Quadrature Mirror Filter (QMF) Banks Alias cancellation achieved if the filters are chosen to satisfy the conditions: jnπωωπ j j()− hn10[]=⇔= e hn [] H 10 ( e ) H ( e ) jjωω gn00[]=⇔= 2 hn [] Ge 0 ( ) 2 He 0 ( ) jjωωπ()− gn11[]=− 2 hn [] ⇔ Ge 1 ( ) =− 2 H 0 ( e ) Note that there is only a single lowpass filter, with impulse response hn0[] on which all other filters are based; filter has nominal cutoff of π /2 rad with a narrow transition to a stopband with adequate attenuation to isolate the two bands jnπ The filter hn10[]= e hn [] is the complementary highpass filter The filters hn01[] and hn [] are called Quadrature Mirror Filters (QMF) 23 Quadrature Mirror Filter (QMF) Banks Substituting for the filter relationships we get: jjωωπωπωπ()−−− j () j (2) 2()(He00 He )2(−= He 0 )( He 0 )0 i.e., perfect alias cancellation for any choice of lowpass filter 22 Ye()jjωω=−⎡⎤ He () He ( j() ωπ− ) Xe()jjjωωω= He% ()() Xe ⎣⎦⎢⎥()()00 For perfect reconstruction we require that: 22 ⎡⎤He()jjωωπω−= He (()−− ) e jM ⎣⎦⎢⎥()()00 i.e., flat magnitude with delay of M samples, due to the delay of the causal filters of the filter bank 24 Quadrature Mirror Filters Assume FIR lowpass filter of length L samples such that hn00[ ]=−−≤≤− h [( L 1) n ], 0 n L 1; this filter has linear phase with a delay of (1)/2L − samples and a frequency response of the form: jjω ωω−−jL(1)/2 He00()= Ae ()(1)eL− must be odd jω where Ae0 () is real. Using this lowpass filter in the filter bank, the condition for perfect reconstruction becomes: 22 He% ()jjjLjjωωπωπω=−⎡⎤ A () e e(1)−−− A ( e ( ) ) e (1)L− ⎣⎦⎢⎥()00 ( ) If L −1 is odd, we get: 22 He% ()jjωω=+⎡⎤ Ae () Ae ( jjL() ωπω−−− ) e (1) ⎣⎦⎢⎥()()00 linear phase is guaranteed, but not perfect reconstruction 25 Quadrature Mirror Filters For flat gain we need to satisfy the condition: 22 ⎡⎤Ae()jjωωπ+= Ae (()− ) 1 ⎣⎦⎢⎥()()00 Solution is to use high order linear phase QMF filters Johnston proposed an algorithm for design of the basic lowpass filter that minimized the deviation from unity while providing a desired level of stopband attenuation 26 Quadrature Mirror Filters hn[]== wn [] gn [] hn[]== wne []jnπ gn [] 001127 Quadrature Mirror Filters Lowpass filter Highpass filter is designed by mirror image of windowing lowpass filter • practical realization of QMF filters • note that aliasing is cancelled, but the overall frequency response is not perfectly flat 28 Quadrature Mirror Filters 29 Issues with Perfect Reconstruction • The motivation for the subband approach is to process different speech properties independently in each band and thus being able to localize the band (compact bandwidth) is important.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    57 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us