Equalization of Audio Channels: A Practical Approach for Speech Communication
Equalization of Audio Channels
A Practical Approach for Speech Communication

Nils Westerlund
November 2000

Abstract

Many occupations today require the use of personal protective equipment, such as a mask to protect the employee from dangerous substances or a pair of ear-muffs to damp high sound pressure levels. The goal of this Master thesis is to investigate the possibility of placing a microphone for communication purposes inside such a protective mask, as well as the possibility of placing the microphone inside a person's auditory meatus, and to perform a digital channel equalization on the speech path in question in order to enhance speech intelligibility. Subjective listening tests indicate that the speech quality and intelligibility can be considerably improved using some of the methods described in this thesis.

Acknowledgements

I would like to express my gratitude to Dr. Mattias Dahl for his support and extraordinary ability to explain complex systems and relationships in an understandable way. I would also like to express my appreciation to Svenska EMC-Lab in Karlskrona for letting me use their semi-damped room for measurement purposes.

Contents

1 Channel Equalization: An Introduction
  1.1 Non-Adaptive Methods
  1.2 Adaptive Channel Equalization
2 Equalization of Mask Channel
  2.1 Gathering of Measurement Data
  2.2 Coherence Function
  2.3 Channel Equalization Using tfe
  2.4 Adaptive Channel Equalization
    2.4.1 The LMS Algorithm
    2.4.2 The NLMS Algorithm
    2.4.3 The LLMS Algorithm
    2.4.4 The RLS Algorithm
  2.5 Minimum-Phase Approach
  2.6 Results of Mask Channel Equalization
3 Equalization of Mouth-Ear Channel
  3.1 Gathering of Measurement Data
  3.2 Coherence Function of Mouth-Ear Channel
  3.3 Channel Equalization Using tfe
  3.4 Adaptive Channel Equalization
    3.4.1 The LMS Algorithm
  3.5 Results of Mouth-Ear Channel Equalization
4 Identification of "True" Mouth-Ear Channel
  4.1 Basic Approach
5 Conclusions
  5.1 Further Work
A MatLab Functions
  A.1 LMS Algorithm
  A.2 NLMS Algorithm
  A.3 LLMS Algorithm
  A.4 RLS Algorithm
  A.5 Minimum-Phase Filter Design
  A.6 Coherence and Transfer Function
  A.7 Estimate of "True" Channel

Chapter 1
Channel Equalization: An Introduction

Figure 1.1: System with input and output signals and the corresponding system in the z-domain.

A linear time-invariant system h(n) takes an input signal x(n) and produces an output signal y(n), which is the convolution of x(n) and the unit sample response h(n) of the system, see fig. 1.1. The input, output and system are assumed to be real, and only real signals will be considered in this thesis. The convolution described above can be written as

    y(n) = x(n) ∗ h(n)    (1.1)

where the convolution operation is denoted by an asterisk ∗. In the z-domain, the convolution corresponds to a multiplication,

    Y(z) = X(z)H(z)    (1.2)

where Y(z) is the z-transform of the output y(n), X(z) is the z-transform of the input x(n) and H(z) is the z-transform of the unit sample response h(n) of the system. In many practical applications there is a need to correct the distortion caused by the channel and in this way recover the original signal x(n). In this thesis, this corrective operation will be called channel equalization.
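To make the channel model concrete, the short MatLab sketch below passes a test signal through an example channel. The impulse response h, the sampling frequency and the test tone are arbitrary illustration values, not measured data from this thesis.

    % Minimal illustration of (1.1)-(1.2): a signal x(n) is distorted by an
    % example channel h(n). All values below are arbitrary illustration values.
    fs = 8000;                     % sampling frequency in Hz
    n  = 0:fs-1;                   % one second of samples
    x  = sin(2*pi*440*n/fs);       % input signal x(n), a 440 Hz tone
    h  = [1 0.5 -0.3 0.1];         % example channel impulse response h(n)
    y  = filter(h, 1, x);          % y(n) = x(n) * h(n), eq. (1.1)
    % In the z-domain the same relation reads Y(z) = X(z)H(z), eq. (1.2).
    % Channel equalization aims at recovering x(n) from the distorted y(n).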
1.1 Non-Adaptive Methods

A cascade connection of a system h(n) and its inverse hI(n) is illustrated in fig. 1.2. Suppose the distorting system has an impulse response h(n) and let hI(n) denote the impulse response of the inverse system.

Figure 1.2: System h(n) cascaded with its inverse system hI(n) results in an identity system.

We can then write

    d(n) = x(n) ∗ h(n) ∗ hI(n) = x(n)    (1.3)

where d(n) is the desired signal, i.e. the original input signal x(n). This implies that

    h(n) ∗ hI(n) = δ(n)    (1.4)

where δ(n) is a unit impulse. In the z-domain, (1.4) becomes

    H(z)HI(z) = 1    (1.5)

Thus, the transfer function of the inverse system is

    HI(z) = 1/H(z)    (1.6)

Note that the zeros of H(z) become the poles of the inverse system and vice versa. If the characteristics of the system are unknown, it is often necessary to excite the system with a known input signal, observe the output, compare it with the input and then determine the characteristics of the system. This operation is called system identification [1]. If we obtain an output signal y(n) from a system h(n) excited with a known input signal x(n), we could of course use the z-transforms of y(n) and x(n) to form

    H(z) = Y(z)/X(z)    (1.7)

However, this is an analytical example and the transfer function H(z) is most likely infinite in duration. A more practical approach is based on a correlation method. The crosscorrelation of the signals x(n) and y(n) is given by

    rxy(l) = Σ_{n=−∞}^{∞} x(n)y(n − l) ,   l = 0, 1, 2, ...    (1.8)

The index l is the lag parameter (also commonly referred to as the time shift parameter), and the subscripts xy on the crosscorrelation sequence rxy(l) indicate the sequences being correlated. If the roles of x(n) and y(n) are reversed, we obtain

    ryx(l) = Σ_{n=−∞}^{∞} y(n)x(n − l) ,   l = 0, 1, 2, ...    (1.9)

Thus,

    rxy(l) = ryx(−l)    (1.10)

Note the similarities between the computation of the crosscorrelation of two sequences and the convolution of two sequences. Hence, if the sequence x(n) and the folded sequence y(−n) are provided as inputs to a convolution algorithm, the convolution yields the crosscorrelation rxy(l), i.e.

    rxy(l) = x(l) ∗ y(−l)    (1.11)

In the special case when x(n) = y(n), the operation results in the autocorrelation of x(n), rxx(l). Recall that y(n) = x(n) ∗ h(n). Inserting this expression for y(n) into (1.11) yields

    rxy(l) = h(−l) ∗ rxx(l)    (1.12)

In the z-domain, (1.12) becomes

    Pxy(z) = H*(z)Pxx(z)    (1.13)

where H*(z) is the complex conjugate of H(z) and Pxx(z) is the power spectral density of x(n). The transfer function of the identified system is then

    H*(z) = Pxy(z)/Pxx(z)    (1.14)

where Pxy(z) is the cross spectral density between x(n) and y(n). If rxy(l) is replaced by ryx(−l) in (1.12), the complex conjugate in (1.14) is eliminated and we obtain the following estimate of the transfer function:

    H(z) = Pyx(z)/Pxx(z)    (1.15)

The MatLab function tfe (Transfer Function Estimate; MatLab is a trademark of The MathWorks, Inc.) uses this method to estimate the transfer function of the system in question [4]. In later sections it will be clear that this method is both straightforward and powerful when identifying a given system.
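As a rough illustration of how (1.15) is applied, the sketch below estimates the transfer function of the example channel from the earlier listing with tfe. The FFT length is an arbitrary choice, and in later MatLab releases the equivalent function is tfestimate.

    % Sketch of the spectral-density based identification in (1.15).
    % x, y and fs are assumed to be the input/output pair and sampling
    % frequency from the earlier listing (or measured signals).
    nfft = 512;                              % arbitrary example FFT length
    [Hhat, f] = tfe(x, y, nfft, fs);         % quotient of cross and input spectra
    % In recent releases: [Hhat, f] = tfestimate(x, y, [], [], nfft, fs)
    plot(f, 20*log10(abs(Hhat)));            % estimated magnitude response
    xlabel('Frequency [Hz]'); ylabel('Magnitude [dB]');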
1.2 Adaptive Channel Equalization

Another approach to equalizing a channel is to use adaptive algorithms. There is a vast number of application areas for adaptive algorithms, and the mathematical theory is quite complex and reaches beyond the scope of this thesis. Therefore, only a brief description of the basic principles of adaptive filtering will be given in this section [2]. A block diagram of an adaptive filter is shown in fig. 1.3. It consists of a shift-varying filter and an adaptive algorithm for updating the filter coefficients.

Figure 1.3: Basic structure for an adaptive filter.

The goal of adaptive FIR filters is to find the Wiener filter w(n) that minimizes the mean-square error

    ξ(n) = E{|d(n) − d̂(n)|^2} = E{|e(n)|^2}    (1.16)

where E{·} is the expected value and d̂(n) is the estimate of the desired signal d(n). We know that if x(n) and d(n) are jointly wide-sense stationary processes, the filter coefficients that minimize the mean-square error ξ(n) are found by solving the Wiener-Hopf equations [2]

    Rxx w = rdx    (1.17)

where Rxx denotes the autocorrelation matrix of x(n), w denotes the vector containing the filter coefficients and rdx denotes the crosscorrelation vector of d(n) and x(n). Solving the Wiener-Hopf equations directly is a computationally demanding operation that includes an inversion of the autocorrelation matrix Rxx. If the input signal or the desired signal is nonstationary, this operation would have to be performed iteratively. Instead, the requirement that w(n) should minimize the mean-square error at each time n can be relaxed, and a coefficient update equation of the form

    w(n + 1) = w(n) + ∆w(n)    (1.18)

can be used. In this equation ∆w(n) is a correction that is applied to the filter coefficients w(n) at time n to form a new set of coefficients, w(n + 1), at time n + 1. Equation (1.18) is the heart of all adaptive algorithms used in this thesis. Since the error function ξ(n) is a quadratic function of the filter coefficients, its error surface can be viewed as a "bowl" with the minimum error at the bottom of this bowl.
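As an illustration of the update (1.18), the sketch below uses the common LMS-type correction ∆w(n) = μ e(n)x(n). The specific algorithms applied in this thesis (LMS, NLMS, LLMS and RLS) are described in chapter 2 and listed in appendix A; the filter length, step size and signal names below are only illustrative assumptions.

    % Generic sketch of the coefficient update (1.18) with an LMS-type
    % correction delta_w(n) = mu*e(n)*x(n). The filter length M, step size mu
    % and the signals x (filter input, row vector) and d (desired signal) are
    % assumed example quantities, not the thesis measurement data.
    M  = 32;                       % number of filter coefficients
    mu = 0.01;                     % step size
    w  = zeros(M, 1);              % initial filter coefficients
    for n = M:length(x)
        xn   = x(n:-1:n-M+1).';    % most recent M input samples
        dhat = w.' * xn;           % filter output, estimate of d(n)
        e    = d(n) - dhat;        % error e(n) = d(n) - dhat(n)
        w    = w + mu * e * xn;    % w(n+1) = w(n) + delta_w(n)
    end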