Estimation of Power Spectral Density Using Wavelet Thresholding

Proceedings of the 7th WSEAS International Conference on CIRCUITS, SYSTEMS, ELECTRONICS, CONTROL and SIGNAL PROCESSING (CSECS'08) Estimation of Power Spectral Density using Wavelet Thresholding PETR SYSEL JIRI MISUREC Brno University of Technology Brno University of Technology Department of Telecommunications Department of Telecommunications Purkynova 464/118, 612 00 Brno Purkynova 464/118, 612 00 Brno CZECH REPUBLIC CZECH REPUBLIC Abstract: The basic problem of the single-channel speech enhancement methods lies in a rapid and precise method for estimating noise, on which the quality of enhancement method depends. The paper describes a new method of power spectral density estimation using wavelet transform in spectral domain. To reduce periodogram variance the proposed method use the procedure of thresholding the wavelet coefficients of a periodogram. Then the smoothed estimate of power spectral density of noise is obtained using the inverse discrete wavelet transform. Key–Words: Speech Enhancement, Power Spectral Density, Periodogram, Wavelet Transform Thresholding 1 Introduction the theoretical power spectral density Gxx [k] and the mean value of power spectral density estimation Most of the speech enhancement methods use estima- Gˆ [k] [8] tion of noise and interference characteristics. Thus the xx notion of power spectral density is introduced, which Gˆ k G k Gˆ k . (3) defines the density of total noise energy of a random B xx [ ] = xx [ ] − E xx [ ] signal in dependence on frequency. n o n o Estimation is unbiased when the mean value of esti- By the Wiener-Khinchin relation [3, 8] it can be mation is equal to the theoretical power spectral den- derived that the power spectral density G [k] of a xx sity stationary random process can be obtained by the ˆ discrete Fourier transform of the autocorrelation se- E Gxx [k] = Gxx [k] . (4) quence Estimation is asymptoticallyn o unbiased when this equa- ∞ tion holds for a sufficiently large count of realizations. −j2πkm G [k] = γxx [m] e , (1) ˆ xx Estimation variance D Gxx [k] is defined as the m=−∞ X mean square value of the differencen o between the esti- where γxx [m] is the autocorrelation sequence of sta- mate and the mean value of estimate [8] tionary random process, for which it holds ˆ ˆ ˆ 2 ∞ D Gxx [k] = E Gxx [k] − E Gxx [k] . γxx [m] = x[n + m]x[n]. (2) n o n o (5) n=−∞ X Thus the estimation error can be defined by nor- The stationary random process denotes a random pro- malized mean square error [8] cess whose mean value and variance are independent ǫ2 = ǫ2 + ǫ2, (6) of time n while the autocorrelation sequence γxx [m] c b r depends only on lags m. 2 where ǫb is the normalized mean square error due to The autocorrelation sequence in formula (1) is 2 bias, and ǫr is the normalized mean square error due theoretically infinite and in practice it must be re- to variance. Both errors are defined as placed by an estimate from one or a few process realizations of finite length. On the other hand, this 2 B Gˆxx [k] causes distortion and the power spectral density ob- ǫ2 = , (7) b nG [k] o tained is only an estimate Gˆxx [k] of theoretical power xx spectral density Gxx [k]. ˆ The estimation bias (systematic error) D Gxx [k] ǫ2 = . (8) r n 2 o B Gˆxx [k] is defined as the difference between Gxx [k] n o ISSN: 1790-5117 207 ISBN: 978-960-474-035-2 Proceedings of the 7th WSEAS International Conference on CIRCUITS, SYSTEMS, ELECTRONICS, CONTROL and SIGNAL PROCESSING (CSECS'08) 2 Power spectral density estimation achieves the value 1, i.e. 100 %. For this reason esti- methods mation using the periodogram is improper. One of the methods of smoothing the power spec- 2.1 Non-parametric methods tral density estimation is the mean value of a few periodograms, which are obtained from a few suffi- Non-parametric methods of power spectral density es- ciently long random process realizations. [3, 8] If timation use the discrete Fourier transform. The auto- we have at our disposal only one random process correlation sequence can be estimated by the relation realization, then we subdivide an N-point signal into [3] K segments. The effect of subdividing is reduced frequency resolution by a factor K. For each segment, 1 N−m−1 r [m] = x [n + m] x [n], 0 ≤ m < N, the periodogram is computed xx N n=0 X 2 N−1 M−1 1 (i) 1 −j2πkn rxx [m] = x [n + m] x [n], −N <m< 0, (9) Pxx [k] = xi[n]e , N M n=|m| n=0 X X where N is the signal length. The estimation used is where M = N is the length of the each segment. biased but it asymptotically approximates the theoret- K Then the powerj spectralk density estimation is equal to ical autocorrelation sequence the mean value of partial periodograms lim rxx [m] = γxx [m] . (10) N→∞ 1 K−1 P B [k] = P (i) [k] . (16) Moreover, estimation variance converges to zero as xx K xx i=1 N → ∞, therefore rxx [m] is a consistent estimate X [3]. This method is called the Bartlett method [3]. The If we substitute (9) into (1), an estimate of power power spectral density estimation by the Bartlett spectral density of an ergodic random process, so method is again an asymptotically unbiased estima- called periodogram, can be obtained tion. However, the estimation variance is reduced by N−1 a factor K [3] −j2πkm 1 2 P [k] = rxx [m] e = |X[k]| . xx N k 2 m=−(N−1) 1 sin 2π N X P B k G2 k 2N−1 . (11) D xx [ ] = xx [ ] 1 + k K N sin 2π ! However, the periodogram is also a biased estimate n o 2N−1 (17) but asymptotically it is an unbiased estimate After substituting into (8) the normalized mean square lim E {Pxx [k]} = Gxx [k] . (12) error due to variance is shown to be reduced by a fac- N→∞ tor K too. When the random process is a Gaussian random Another modification of averaging periodograms process, the estimation variance is shown to be [3] is proposed by Welch. The fundamentals are the same as in the Bartlett method but Welch proposed segment k 2 2 sin 2π 2N−1 N overlapping and weighting segments by a chosen win- D {Pxx [k]} = Gxx [k] 1 + k . N sin 2π ! dow. The Welch method can reduce the estimation 2N−1 variance while preserving sufficient frequency resolu- (13) tion. [3] Hence the estimation variance is approaching the square of power spectral density 2 2.2 Parametric methods lim D {Pxx [k]} = Gxx [k] (14) N→∞ Unlike non-parametric methods the parametric as N → ∞. Therefore estimation using the pe- methods emulate a model for the generation of sig- riodogram is an inconsistent estimation. Moreover, nals that is constructed from a number of parameters estimation variance achieves the value of the square estimated from a short realization of the random of power spectral density and the normalized mean process observed. The modelling aproach makes square error due to variance it possible to extrapolate the value of the random process outside the realization and also to extrapolate ˆ D Gxx [k] G2 [k] the value of the autocorrelation sequence for lags ǫ2 = lim = xx = 1 (15) r n 2 o 2 |m| ≥ N. An advantage is that these methods N→∞ Gxx [k] Gxx [k] ISSN: 1790-5117 208 ISBN: 978-960-474-035-2 Proceedings of the 7th WSEAS International Conference on CIRCUITS, SYSTEMS, ELECTRONICS, CONTROL and SIGNAL PROCESSING (CSECS'08) provide a better frequency resolution than non- where Pxx [k] are samples of the periodogram of the parametric methods and their frequency resolution is implementation of a random process of length 2N = almost independent of the length of random process 2M+1 obtained by the discrete Fourier transform, and realization. [3] the base functions ψj,m [k] are derived by a time shift The discrete stationary random process x [n] with j = 0, 1,..., 2m − 1 and dilatation with the scale unknown power spectral density Gxx [k] can be trans- m = 0, 1,..., log2 N of a single mother function formed into white noise w[n] when sequence x [n] is ψ [k] according to the relation filtered by a linear digital filter with a transfer function 1/H(z) that is inverse to the transfer function (20). 1 k ψ [k] = √ ψ − j . (23) This filter is then called whitening filter. If it is possi- j,m m 2m 2 ble to find out the transfer function of whitening filter then the power spectral density is shown to be The transformation is linear and therefore the coefficients will represent the sum of the coefficients repre- j2πf 2 j2πf j2πf Gxx e = σwH(e )eH( ), (18) senting the sought power spectral density gj,m[k] and the coefficients representing the noise ej,m[k] where 2 υ[0] σw = e (19) Cj,m[k] = gj,m[k] + ej,m[k]. (24) is the variance of white noise, As the random processes ς[k] are independent and the ∞ −m transformation is orthogonal, the coefficients ej,m[k] H(z) = exp υ [m] z = (20) will be non-correlated. At the same time, however, m=1 ! r X they are not independent since their probability distri- −i biz bution is independent of the shift j but is dependent i=0 , z > r , r < , on the scale m. But according to the central limiting = Ps | | 1 1 1 −j theorem their probability distribution converges to the 1 + ajz j=0 Gaussian normal distribution with m → ∞ [2].

Load more