<<

Hindawi Publishing Corporation EURASIP Journal on Applied Signal Processing Volume 2006, Article ID 63297, Pages 1–14 DOI 10.1155/ASP/2006/63297

Dual-Channel Speech Enhancement by Superdirective Beamforming

Thomas Lotter and Peter Vary

Institute of Communication Systems and Data Processing, RWTH Aachen University, 52056 Aachen, Germany

Received 31 January 2005; Revised 8 August 2005; Accepted 22 August 2005

In this contribution, a dual-channel input-output speech enhancement system is introduced. The proposed algorithm is an adaptation of the well-known superdirective beamformer, including postfiltering, to the binaural application. In contrast to conventional beamformer processing, the proposed system outputs enhanced stereo signals while preserving the important interaural amplitude and phase differences of the original signal. Instrumental performance evaluations in a real environment with multiple speech sources indicate that the proposed computationally efficient spectral weighting system can achieve significant attenuation of speech interferers while maintaining a high speech quality of the target signal.

Copyright © 2006 T. Lotter and P. Vary. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Speech enhancement by beamforming exploits the spatial diversity of desired speech and interfering speech or noise sources by combining multiple noisy input signals. Typical beamformer applications are hands-free telephony, speech recognition, teleconferencing, and hearing aids. Beamformer realizations can be classified into fixed and adaptive.

A fixed beamformer combines the noisy signals of multiple microphones by a time-invariant filter-and-sum operation. The combining filters can be designed to achieve constructive superposition towards a desired direction (delay-and-sum beamformer) or in order to maximize the SNR improvement (superdirective beamformer), for example, [1]. As practical problems such as self-noise and amplitude or phase errors of the microphones limit the use of optimal beamformers, constrained solutions have been introduced that limit the directivity to the benefit of reduced susceptibility [2–4]. Most fixed beamformer design algorithms assume the desired source to be positioned in the far field, that is, the distance between the microphone array and the source is much greater than the dimension of the array. Near-field superdirectivity [5] additionally exploits amplitude differences between the microphone signals.

Adaptive beamformers commonly consist of a fixed beamformer steered towards a desired direction and a time-varying branch, which adaptively steers beamformer spatial nulls towards interfering sources. Among the various adaptive beamformers, the Griffiths-Jim beamformer [6], or extensions, for example, in [7, 8], is most widely known. Adaptive beamformers can be considered less robust against distortions of the desired signal than fixed beamformers.

Beamforming for binaural input signals, that is, signals recorded by single microphones at the left and right ears, has found significantly less attention than beamforming for (linear) microphone arrays. An important application is the enhancement of speech in a difficult multitalker situation using binaural hearing aids.

Current hearing aids achieve a speech intelligibility improvement in difficult acoustic conditions by the use of independent small endfire arrays, often integrated into behind-the-ear devices with low microphone distances of around 1-2 cm. When hearing aids are used in combination with eyeglasses, larger arrays are feasible, which can also form a binaurally enhanced signal [9].

Binaural noise reduction techniques come into focus when space limitations forbid the use of multiple microphones in one device, or when the enhancement benefits of two independent endfire arrays are to be combined with the binaural processing benefit. In contrast to an endfire array, a binaural speech enhancement system must work with a dual-channel input-output signal, at best without modification of the interaural amplitude and phase differences in order not to disturb the original spatial impression.

Enhancement by exploiting coherence properties [10] of the desired source and the noise [3, 11] has the ability to reduce diffuse noise to a high degree; however, it fails in suppressing sound from directional interferers, especially unwanted speech. Also, due to the adaptive estimation of the instantaneous coherence in frequency bands, musical tones can occur. In [12, 13], a noise reduction system has been proposed that applies a binaural processing model of the human ear. To suppress lateral noise sources, the interaural level and phase differences are compared to reference values for the frontal direction. Frequency components are attenuated by evaluation of the deviation from reference patterns. However, the system suffers severely from susceptibility to reverberation. In [14], the Griffiths-Jim adaptive beamformer [6] has been applied to binaural noise reduction in subbands, and listening tests have shown a performance gain in terms of speech intelligibility. However, the subband Griffiths-Jim approach requires a voice activity detection (VAD) for the filter adaptation, which can cause cancellation of the desired speech when the VAD frequently fails, especially at low signal-to-noise ratios.

In [15], a two-microphone adaptive system is presented with a modified Griffiths-Jim beamformer at its core. By lowband-highband separation, a tradeoff is provided between the array-processing benefit and the binaural benefit by the choice of the cutoff frequency. In the lower band, the binaural signal is passed to the respective ear. The directional filter is only applied to the high-frequency regions, whose influence on sound localization and lateralization is considered less significant. Both adaptive algorithms from [14, 15] have the ability to adaptively cancel out an interfering source. However, the beamformer adaptation procedure needs to be coupled to a voice activity detection (VAD) or a correlation-based measure to counteract possible target cancellation.

In this contribution, a full-band binaural input-output array is presented that applies a binaural signal model and the well-known superdirective beamformer as its core [16]. The dual-channel system thus comprises the advantages of a fixed beamformer, that is, a low risk of target cancellation and computational simplicity. To deliver an enhanced stereo signal instead of a mono output, an efficient adaptive spectral weight calculation is introduced, in which the desired signal is passed unfiltered and which does not modify the perceptually important interaural time and phase differences of the target and residual noise signal. To further increase the performance, a well-known Wiener postfilter is also adapted for the binaural application under consideration of the same requirements.

The rest of the paper is organized as follows. In Section 2, the binaural signal model is introduced as a basis for the beamformer algorithm. Section 3 includes the proposed superdirective beamformer with dual-channel input and output as well as the adaptive postfilter. Finally, in Section 4, performance results are given in a real environment.

2. BINAURAL SIGNAL MODEL

For the derivation of binaural beamformers, an appropriate signal model is required. The microphone signals at the left and right ears do not only differ in the time difference depending on the position of the source relative to the head. Furthermore, the shadowing effect of the head causes significant intensity differences between the left- and right-ear microphone signals. Both effects are described by the head-related transfer functions (HRTFs) [17].

Figure 1(a) shows a time signal s arriving at the microphones from the angle θ_S in the horizontal plane. The time signals at the left and right microphones are denoted by y_l, y_r. The microphone signal spectra can be expressed by the HRTFs towards the left and right ears, D_l(ω), D_r(ω). As the beamformer will be realized in the DFT domain, a DFT representation of the spectra is chosen. At discrete DFT frequencies ω_k with frequency index k, the left- and right-ear signal spectra are given by

  Y_l(ω_k) = D_l(ω_k) S(ω_k),   Y_r(ω_k) = D_r(ω_k) S(ω_k).   (1)

Here, S(ω_k) denotes the spectrum of the original signal s. For brevity, the frequency index k is used instead of ω_k.

The acoustic transfer functions are illustrated in Figure 1. The shadowing effect of the head is described by multiplication of each spectral coefficient of the input spectrum S(k) with angle- and frequency-dependent physical amplitude factors α_l^phy, α_r^phy for the left- and right-ear sides. The physical time delays τ_l^phy, τ_r^phy, which characterize the propagation time from the origin to the left and right ears, are approximately considered to be frequency-independent. The HRTF vector D can thus be written as

  D(θ_s, k) = [ α_l^phy(θ_s, k) e^{−jω_k τ_l^phy(θ_s)},  α_r^phy(θ_s, k) e^{−jω_k τ_r^phy(θ_s)} ]^T.   (2)

For convenience, the physical transfer functions can be normalized to those of zero degrees. With α^phy(0°, k) := α_l^phy(0°, k) = α_r^phy(0°, k) and τ^phy(0°) := τ_l^phy(0°) = τ_r^phy(0°), the normalized amplitude factors α_l^norm, α_r^norm and time delays τ_l^norm, τ_r^norm, respectively, can be written as

  α_l^norm(θ_S, k) = α_l^phy(θ_S, k) / α_l^phy(0°, k),
  τ_l^norm(θ_S) = τ_l^phy(θ_S) − τ_l^phy(0°),
  α_r^norm(θ_S, k) = α_r^phy(θ_S, k) / α_r^phy(0°, k),
  τ_r^norm(θ_S) = τ_r^phy(θ_S) − τ_r^phy(0°).   (3)

The transfer vector D, or the amplitudes α_l^phy, α_r^phy and time delays τ_l^phy, τ_r^phy as well as their normalized versions, are in the following obtained by two different approaches. Firstly, a database of measured head-related impulse responses is used to extract the transfer vectors for a number of relevant spatial directions. Secondly, a binaural model is applied to approximate transfer vectors.
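The signal model of (1)–(3) can be sketched numerically. The cue values below are illustrative stand-ins, not data from the paper's HRTF measurements:

```python
import numpy as np

# Illustrative physical cues for one source angle (hypothetical values,
# not taken from the paper's HRTF data).
omega_k = 2 * np.pi * np.array([250.0, 500.0, 1000.0])  # DFT bin frequencies (rad/s)
alpha_l_phy = np.array([1.20, 1.15, 1.10])  # left-ear amplitude factors over k
alpha_r_phy = np.array([0.70, 0.65, 0.60])  # right-ear amplitude factors over k
tau_l_phy, tau_r_phy = 3.2e-4, 8.0e-5       # propagation delays to the ears (s)

# HRTF vector entries according to (2): amplitude factor times linear phase.
D_l = alpha_l_phy * np.exp(-1j * omega_k * tau_l_phy)
D_r = alpha_r_phy * np.exp(-1j * omega_k * tau_r_phy)

# Normalization to the frontal (0 degree) direction according to (3).
alpha_phy_0 = np.array([1.00, 0.95, 0.90])  # 0-degree reference (equal for both ears)
tau_phy_0 = 2.0e-4                          # 0-degree reference delay (s)
alpha_l_norm = alpha_l_phy / alpha_phy_0
alpha_r_norm = alpha_r_phy / alpha_phy_0
tau_l_norm = tau_l_phy - tau_phy_0
tau_r_norm = tau_r_phy - tau_phy_0
```

The amplitude factors set the magnitudes of the HRTF vector entries, while the delays only affect their phases, which is what the weight calculation in Section 3 relies on.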


Figure 1: Acoustic transfer of a source from θS towards the left and right ears.


Figure 2: Generation of the physical binaural transfer cues α_l^phy, α_r^phy, τ_l^phy, τ_r^phy using a database of head-related impulse responses.
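The delay-extraction part of Figure 2 can be sketched as follows. The impulse responses here are hypothetical pure delays (so the true lag is known), standing in for measured KEMAR responses:

```python
import numpy as np

fs = 20000                       # sampling frequency used in the paper (20 kHz)
rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)

# Stand-ins for the head-related impulse responses d_l, d_r: two pure
# delays of 4 and 9 samples, so the true relative delay is 5 samples.
d_l = np.zeros(16); d_l[4] = 1.0
d_r = np.zeros(16); d_r[9] = 1.0

# Filter white noise with the left- and right-ear impulse responses.
y_l = np.convolve(noise, d_l)
y_r = np.convolve(noise, d_r)

# A maximum search of the cross-correlation function delivers the
# relative time difference between the ears (in samples, then seconds).
ccf = np.correlate(y_l, y_r, mode="full")
lag = np.argmax(ccf) - (len(y_r) - 1)   # negative: left ear receives later
itd = lag / fs
```

With `np.correlate(y_l, y_r, "full")`, index i corresponds to lag i − (len(y_r) − 1), so the peak lands at lag −5 here.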

2.1. HRTF database

The first approach to extract interaural time differences and amplitude differences is to use a database of head-related impulse responses, for example, [18]. This database comprises recordings of head-related impulse responses d_l(θ_n, i), d_r(θ_n, i) with time index i for several spatial directions, made with in-the-ear microphones using a Knowles Electronics Manikin for Auditory Research (KEMAR) head. For a given resolution of the azimuths, for example, 5 degrees, the values of α_l^phy, α_r^phy, τ_l^phy, τ_r^phy are determined according to Figure 2. White noise is filtered with the impulse responses d_l(θ_n), d_r(θ_n) for the left and right ears. A maximum search of the cross-correlation function of the output signals delivers the relative time differences τ_l^phy, τ_r^phy. The left- and right-ear delays can then be calculated using (3). For the extraction of the amplitude factors α_l^phy, α_r^phy, a frequency analysis is performed. Here, the same analysis should be applied as in the frequency-domain realization of the beamformer.

2.2. HRTF model

Using binaural cues extracted from a database delivers fixed HRTFs. The real HRTFs will, however, vary greatly between persons and also on a daily basis depending on the position of the hearing aids. An adjustment of the beamformer to the user without the demand to measure the customer's HRTFs is desirable. This can be achieved by using a parametric binaural model.

In [19], binaural sound synthesis is performed using two filter blocks that approximate the interaural time differences (ITDs) and the interaural intensity differences (IIDs), respectively, of a spherical head. Useful results have been obtained by cascading a delay element with a single-pole and single-zero head-shadow filter according to

  D_mod(θ, ω) = [ (1 + jγ_mod(θ)ω/(2ω_0)) / (1 + jω/(2ω_0)) ] · e^{−jωτ_mod(θ)},   (4)

with ω_0 = c/a, where c is the speed of sound and a is the radius of the head. The model is determined by the angle-dependent parameters γ_mod and τ_mod with

  γ_mod(θ) = (1 + β_min/2) + (1 − β_min/2) cos( (θ − π/2)/θ_min · 180° ),

  τ_mod(θ) = −(a/c) cos(θ − π/2)   for −π/2 ≤ θ < 0,
  τ_mod(θ) = (a/c) |θ|             for 0 ≤ θ < π/2.   (5)

The parameters of the model are set to β_min = 0.1, θ_min = 150°, which produces a fairly good approximation to the ideal frequency response of a rigid sphere (see [19]). The transfer vector D = [D_l, D_r]^T can be extracted from (4) with

  D_l(θ_s, k) = D_mod(θ_s, ω_k),   D_r(θ_s, k) = D_mod(π − θ_s, ω_k).   (6)

The model provides the radius of the spherical head a as a parameter. It is set to 0.0875 m, which is commonly considered as the average radius of an adult human head.

Figure 3: Normalized time differences of the left ear τ_l^norm(θ) using the HRTF database and the binaural model, respectively.

Figure 4: Normalized amplitude factors α_l^norm(θ, k) for different azimuth angles extracted from the HRTF database.

Figure 5: Normalized amplitude factors α_l^norm(θ, k) for different azimuth angles extracted from the binaural model.

2.3. Comparison of HRTF extraction methods

Figure 3 shows the normalized time differences τ_l^norm in dependence of the azimuth angle, extracted from the HRTF database and by applying the binaural model, respectively. While the model-based approach delivers smaller absolute values, the time differences are very similar.

Figure 4 plots the normalized amplitude factors α_l^norm over frequency for different azimuths using the HRTF database, while Figure 5 shows the normalized amplitude factors obtained by the HRTF model. The model-based approach delivers amplitude values that interpolate the angle- and frequency-dependent amplitude factors of the KEMAR head; in other words, the fine structure of the HRTF is not considered by the simple model.

Due to the high variance between persons, measurements of the target person's HRTFs should at best be provided to a binaural speech enhancement algorithm. However, we think that a strenuous and time-consuming measurement for several angles is not feasible for many application scenarios, for example, not during the hearing aid fitting process. In case of the target person's HRTFs being unknown to the binaural algorithm, the fine structure of a specific HRTF cannot be exploited.
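The parametric model of (4)–(6) can be transcribed directly. This is a sketch: the fragmentary source leaves the exact angle convention of (5)–(6) somewhat ambiguous, so the evaluation of (6) outside the range quoted for (5) is taken literally here:

```python
import numpy as np

c = 343.0        # speed of sound (m/s)
a = 0.0875       # head radius used in the paper (m)
omega0 = c / a
beta_min = 0.1
theta_min = np.deg2rad(150.0)

def gamma_mod(theta):
    # Angle-dependent zero of the head-shadow filter, eq. (5);
    # theta in radians, 0 pointing to the front.
    return (1 + beta_min / 2) + (1 - beta_min / 2) * np.cos(
        (theta - np.pi / 2) / theta_min * np.pi)

def tau_mod(theta):
    # Angle-dependent delay, eq. (5), stated for -pi/2 <= theta < pi/2.
    if theta < 0:
        return -(a / c) * np.cos(theta - np.pi / 2)
    return (a / c) * abs(theta)

def D_mod(theta, omega):
    # Single-pole/single-zero head-shadow filter cascaded with a
    # delay element, eq. (4).
    shadow = (1 + 1j * gamma_mod(theta) * omega / (2 * omega0)) / (
        1 + 1j * omega / (2 * omega0))
    return shadow * np.exp(-1j * omega * tau_mod(theta))

def hrtf_pair(theta_s, omega):
    # Left- and right-ear transfer functions according to (6).
    return D_mod(theta_s, omega), D_mod(np.pi - theta_s, omega)

D_l, D_r = hrtf_pair(0.0, 2 * np.pi * 1000.0)
```

As a sanity check, the filter is transparent at ω = 0, and a frontal source produces identical left- and right-ear magnitudes, that is, no interaural level difference.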
Therefore, we prefer the model-based approach, which can be customized to some extent with little effort by choosing a different head radius, for example, during the hearing aid fitting process. In the following, the dual-channel input-output beamformer design will be illustrated only with the model-based HRTF as the underlying signal model.

3. SUPERDIRECTIVE BINAURAL BEAMFORMER

In this section, the superdirective beamformer with Wiener postfilter is adapted for the binaural application. The proposed fixed beamformer uses superdirective filter design techniques in combination with the signal model to optimally enhance signals from a given desired spatial direction compared to all other directions. The enhancement of the beamformer and postfilter is then exploited to calculate spectral weights for the left- and right-ear spectral coefficients under the constraint of the preservation of the interaural amplitude and phase differences.

3.1. Superdirective beamformer design in the DFT domain

Consider a microphone array with M elements. The noisy observations for each microphone m are denoted as y_m(i) with time index i. Since the superdirective beamformer can efficiently be implemented in the DFT domain, noisy DFT coefficients Y_m(k) are calculated by segmenting the noisy time signals into frames of length L and windowing with a function h(i), for example, a Hann window, including zero-padding. The DFT coefficient of microphone m, frame λ, and frequency bin k can then be calculated with

  Y_m(k, λ) = Σ_{i=0}^{L−1} y_m(λR + i) h(i) e^{−j2πki/L},   m ∈ {1, ..., M}.   (7)

For the computation of the next DFT, the window is shifted by R samples. These parameters are chosen as N = 256 and R = 112 at a sampling frequency of f_s = 20 kHz. For the sake of brevity, the index λ is omitted in the following.

In the DFT domain, the beamformer is realized as a multiplication of the noisy input DFT coefficients Y_m, m ∈ {1, ..., M}, with complex factors W_m. The output spectral coefficient is given as

  Z(k) = Σ_m W_m^*(k) Y_m(k) = W^H Y.   (8)

The objective of the superdirective design of the weight vector W is to maximize the output SNR. This can be achieved by minimizing the output energy with the constraint of an unfiltered signal from the desired direction. The minimum variance distortionless response (MVDR) approach can be written as (see [1–3])

  min_W W^H(θ_S, k) Φ_MM(k) W(θ_S, k)   w.r.t.   W^H(θ_S, k) D(θ_S, k) = 1.   (9)

Here Φ_MM denotes the cross-spectral-density matrix,

  Φ_MM(k) = [ Φ_11(k)  Φ_12(k)  ···  Φ_1M(k)
              Φ_21(k)  Φ_22(k)  ···  Φ_2M(k)
                 .        .      .      .
              Φ_M1(k)  Φ_M2(k)  ···  Φ_MM(k) ].   (10)

If a homogeneous isotropic noise field is assumed, then the elements of Φ_MM are determined only by the distance d_mn between microphones m and n [10]:

  Φ_mn(k) = si(ω_k d_mn / c).   (11)

The vector of coefficients can then be determined by gradient calculation or using Lagrangian multipliers to

  W(θ_S, k) = Φ_MM^{−1}(k) D(θ_S, k) / ( D^H(θ_S, k) Φ_MM^{−1}(k) D(θ_S, k) ).   (12)

If a design should be performed with limited superdirectivity to avoid the loss of directivity caused by microphone mismatch, the design rule can be modified by inserting a tradeoff factor μ_s [3]:

  W(θ_S, k) = (Φ_MM(k) + μ_s I)^{−1} D(θ_S, k) / ( D^H(θ_S, k) (Φ_MM(k) + μ_s I)^{−1} D(θ_S, k) ).   (13)

If μ_s → ∞, then W → D/(D^H D), that is, a delay-and-sum beamformer results from the design rule. A more general approach to control the tradeoff between directivity and robustness is presented in [4].

The directivity of the superdirective beamformer strongly depends on the orientation of the microphone array towards the desired direction. If the axis of the microphone array coincides with the direction of arrival, an endfire array is obtained, with higher directivity than for a broadside array, where the axis is orthogonal to the direction of arrival.

3.1.1. Binaural superdirective coefficients

In the binaural application, M = 2 microphones are used; the spectral coefficients are indexed by l and r to express the left and right sides of the head. The superdirective design rule according to (13) requires the transfer vector for the desired direction D(θ_s, k) = [D_l(θ_s, k), D_r(θ_s, k)]^T and the matrix of cross-power-spectral densities Φ_22 as inputs for each frequency bin k. The transfer vector can be extracted from (4) according to (6). On the other hand, the 2×2 cross-power-spectral density matrix Φ_22(k) can be calculated using the head-related coherence function. After normalization by √(Φ_ll(k)Φ_rr(k)), where Φ_ll(k) = Φ_rr(k), the matrix is

  Φ_22(k) = [ 1         Γ_lr(k)
              Γ_lr(k)   1       ],   (14)

with the coherence function

  Γ_lr(k) = Φ_lr(k) / √(Φ_ll(k)Φ_rr(k)).   (15)

The head-related coherence function is much lower than the value that could be expected from (11) when only taking the microphone distance between the left and right ears into account [3]. It can be calculated by averaging a number N of equidistant HRTFs across the horizontal plane, 0 ≤ θ < 2π:

  Γ_lr(k) = Σ_{n=1}^{N} D_l(θ_n, k) D_r^*(θ_n, k) / √( Σ_{n=1}^{N} |D_l(θ_n, k)|² · Σ_{n=1}^{N} |D_r(θ_n, k)|² ).   (16)

In this work, an angular resolution of 5 degrees in the horizontal plane is used, that is, N = 72.

3.1.2. Dual-channel input-output beamformer

A beamformer that outputs a monaural signal would be unacceptable, because the benefit in terms of noise reduction is consumed by the loss of spatial hearing. We therefore propose to utilize the beamformer output for the calculation of spectral weights. Figure 6 shows a block diagram of the proposed superdirective stereo input-output beamformer in the frequency domain.

Figure 6: Superdirective binaural input-output beamformer.

In analogy to (8), the input DFT coefficients are summed after complex multiplication by the superdirective coefficients:

  Z(k) = W^H(k) Y(k) = W_l^*(k) Y_l(k) + W_r^*(k) Y_r(k).   (17)

The enhanced Fourier coefficients Z can then serve as reference for the calculation of weight factors (as defined in the following), which output binaural enhanced spectra Ŝ_l, Ŝ_r via multiplication with the input spectra Y_l, Y_r. Afterwards, the enhanced dual-channel time signal is synthesized via IDFT and overlap-add.

Regarding the weight calculation method, it is advantageous to determine a single real-valued gain for both left- and right-ear spectral coefficients. By doing so, the interaural time and amplitude differences will be preserved in the enhanced signal. Consequently, distortions of the spatial impression will be minimized in the output signal. Real-valued weight factors G_super(k) are desirable in order to minimize distortions from the frequency-domain filter. In addition, a distortionless response for the desired direction should be guaranteed, that is, G_super(θ_s, k) = 1.

To fulfil the demand of just one weight for both left- and right-ear sides, the weights are calculated by comparing the spectral amplitude of the beamformer output to the sum of both input spectral amplitudes:

  G_super(k) = |Z(k)| / ( |Y_l(k)| + |Y_r(k)| ).   (18)

To avoid amplification, the weight factor is upper-limited to one afterwards. To fulfil the distortionless response for the desired signal with (18), the MVDR design rule according to (13) has to be modified with a correction factor corr_super:

  min_W W^H(θ_S, k) Φ_MM(k) W(θ_S, k)   w.r.t.   W^H(θ_S, k) D(θ_S, k) = corr_super(θ_S, k).   (19)

corr_super(θ, k) is to be determined in the following. Assume that a desired signal arrives from θ_s, that is, Y(k) = D(θ_s, k) S(k) and consequently |Y_l(k)| = α_l^phy(θ_S, k)|S(k)|, |Y_r(k)| = α_r^phy(θ_S, k)|S(k)|. Also assume that the coefficient vector W has been designed for this angle θ_s. Then, after insertion of (17) into (18), we obtain

  G_super(k) = corr_super(θ_s, k)|S(k)| / ( α_l^phy(θ_s, k)|S(k)| + α_r^phy(θ_s, k)|S(k)| ).   (20)

The demand G_super = 1 for a signal from θ_S yields

  corr_super(θ_s, k) = α_l^phy(θ_s, k) + α_r^phy(θ_s, k).   (21)

The design of the superdirective coefficient vector W(θ_s, k) for frequency bin k and desired angle θ_s with tradeoff factor μ_s is therefore

  W(θ_s, k) = ( α_l^phy(θ_s, k) + α_r^phy(θ_s, k) ) · (Φ_MM(k) + μ_s I)^{−1} D(θ_s, k) / ( D^H(θ_s, k) (Φ_MM(k) + μ_s I)^{−1} D(θ_s, k) ).   (22)

3.1.3. Directivity evaluation

Now, the performance of the beamformer is evaluated in terms of spatial directivity and directivity gain plots. The directivity pattern Ψ(θ_s, θ, k) is defined as the squared transfer function for a signal that arrives from a certain spatial direction θ if the beamformer is designed for angle θ_s.
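The per-bin design chain (14), (16), (18), (21), (22) can be sketched as follows. The steering vector and coherence value are hypothetical single-bin examples, not values computed from the paper's HRTF model:

```python
import numpy as np

def head_coherence(D_l_set, D_r_set):
    # Head-related coherence, eq. (16), averaged over a set of HRTFs
    # sampled equidistantly in the horizontal plane.
    num = np.sum(D_l_set * np.conj(D_r_set))
    den = np.sqrt(np.sum(np.abs(D_l_set) ** 2) * np.sum(np.abs(D_r_set) ** 2))
    return num / den

def superdirective_weights(D, gamma_lr, mu_s):
    # MVDR design with tradeoff factor mu_s and the binaural correction
    # factor corr_super = alpha_l + alpha_r, eqs. (13), (14), (21), (22).
    Phi22 = np.array([[1.0, gamma_lr], [np.conj(gamma_lr), 1.0]])
    v = np.linalg.solve(Phi22 + mu_s * np.eye(2), D)
    W = v / (np.conj(D) @ v)                 # distortionless solution, eq. (13)
    corr_super = np.abs(D[0]) + np.abs(D[1])  # eq. (21)
    return corr_super * W

def g_super(Z, Y_l, Y_r):
    # Real-valued spectral weight, eq. (18), upper-limited to one.
    return min(1.0, abs(Z) / (abs(Y_l) + abs(Y_r)))

# Hypothetical steering vector for one bin (amplitudes 0.9/1.1, some phase).
D = np.array([0.9 * np.exp(-1j * 0.4), 1.1 * np.exp(-1j * 0.1)])
W = superdirective_weights(D, gamma_lr=0.3, mu_s=0.1)

# A noise-free signal from the design direction passes with gain one.
S = 0.8 + 0.2j
Y_l, Y_r = D[0] * S, D[1] * S
Z = np.conj(W) @ np.array([Y_l, Y_r])   # beamformer output, eq. (17)
G = g_super(Z, Y_l, Y_r)
```

The check at the end reproduces the derivation of (20)–(21): because W^H D equals the summed amplitude factors, the weight of (18) is exactly one for the design direction.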


Figure 7: Beam pattern (frequency-independent) of a typical delay-and-subtract beamformer applied in a single behind-the-ear device. Parameters are the microphone distance d_mic = 0.01 m and an internal delay of the beamformer for the rear microphone signal of τ = (2/3) · (d_mic/c).

Figure 8: Beam pattern Ψ(θ_s = 0°, θ, f) of the superdirective binaural input-output beamformer for DFT bins corresponding to 300 Hz, 1000 Hz, and 3000 Hz (special case of a broadside delay-and-sum beamformer).

Figure 9: Beam pattern Ψ(θ_s = −60°, θ, f) of the superdirective binaural input-output beamformer for DFT bins corresponding to 300 Hz, 1000 Hz, and 3000 Hz (design parameter μ_s = 10, which corresponds to a low degree of superdirectivity).

As a reference, Figure 7 plots the directivity pattern of a typical hearing aid first-order delay-and-subtract beamformer integrated, for example, in a single behind-the-ear device. In the example, the rear microphone signal is delayed by 2/3 of the time which a source from θ_S = 0° needs to travel from the front to the rear microphone, and is subtracted from the front microphone signal. The approach is limited to low microphone distances, typically lower than 2 cm, to avoid spectral notches caused by spatial aliasing. Also, the lower-frequency region needs to be excluded because of its low signal-to-microphone-noise ratio caused by the subtract operation.

The behind-the-ear endfire beamformer can greatly attenuate signals from behind the hearing-impaired subject but cannot differentiate between the left- and right-ear sides. The dual-channel input-output beamformer behaves the opposite way. Due to the binaural microphone positions, its directivity shows a front-rear ambiguity.

In the case of the stereo input-output binaural beamformer, the directivity pattern is determined by the squared weight factors G_super², according to (18), that are applied to the spectral coefficients:

  Ψ(θ_s, θ, k)/dB = 20 log_10 G_super(θ_s, θ, k),   (23)

which can be written as

  Ψ(θ_s, θ, k)/dB = 20 log_10 [ |W^H(θ_s, k) D(θ, k)| / ( α_l^phy(θ, k) + α_r^phy(θ, k) ) ].   (24)

Figure 8 shows the beam pattern for the desired direction θ_s = 0°. In this case, the superdirective design leads to the special case of a simple delay-and-sum beamformer, that is, a broadside array with two elements. Thus, the achieved directivity is low at low frequencies. At higher frequencies, the phase difference generated by a lateral source becomes significant and causes a narrow main lobe along with sidelobes due to spatial aliasing. However, the sidelobes are of lower magnitude due to the different amplitude transfer functions.
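The pattern computation of (24) can be sketched as follows. To keep the sketch self-contained, the HRTF vector is replaced by a free-field stand-in with unit amplitude factors and a pure interaural delay (the paper uses the head model of Section 2.2 instead):

```python
import numpy as np

c = 343.0
d = 0.175           # interaural microphone distance (m)
f = 1000.0
omega = 2 * np.pi * f

def D_freefield(theta):
    # Free-field stand-in for D(theta, k): unit amplitudes, pure delay.
    tau = (d / 2) * np.sin(theta) / c
    return np.array([np.exp(1j * omega * tau), np.exp(-1j * omega * tau)])

def beam_pattern_db(W, theta):
    # Directivity pattern of the spectral-weighting beamformer, eq. (24).
    D = D_freefield(theta)
    g = np.abs(np.conj(W) @ D) / (np.abs(D[0]) + np.abs(D[1]))
    return 20 * np.log10(g)

# Delay-and-sum weights steered to the front; W^H D(0) = 2 matches the
# correction factor alpha_l + alpha_r = 2 of eq. (21) for unit amplitudes.
W = D_freefield(0.0)
thetas = np.deg2rad(np.arange(-90, 91, 5))   # 5-degree grid; index 18 is 0 deg
psi = np.array([beam_pattern_db(W, th) for th in thetas])
```

By construction, the pattern is 0 dB towards the steering direction and drops off towards lateral angles.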

Figure 10: Beam pattern Ψ(θ_s = −60°, θ, f) of the superdirective binaural input-output beamformer for DFT bins corresponding to 300 Hz, 1000 Hz, and 3000 Hz (design parameter μ_s = 0, i.e., maximum degree of superdirectivity).

Figure 9 shows the directivity pattern for the desired angle θ_s = −60°. The design parameter was set to μ_s = 10, that is, a low degree of superdirectivity. Hence, approximately a delay-and-sum beamformer with amplitude modification is obtained. Because of the significant interaural differences, the directivity is much higher compared to that of the frontal desired direction; especially signals from the opposite side will be highly attenuated. The main lobe is comparably large at all plotted frequencies.

Figure 10 shows the directivity if the design parameter is adjusted for a maximum degree of superdirectivity, that is, μ_s = 0. As expected, the directivity further increases, especially at low frequencies, and the main lobe becomes narrower.

To measure the directivity of the dual-channel input-output system in a more compact way, the overall gain can be considered. It is defined as the ratio of the directivity towards the desired direction θ_s and the average directivity. As only the horizontal plane is considered, the average directivity can be obtained by averaging over 0 ≤ θ < 2π with equidistant angles at a resolution of 5 degrees, that is, N = 72. The directivity gain DG is given as

  DG(θ_s, k) = Ψ(θ_s, θ_s, k) / ( (1/N) Σ_{n=1}^{N} Ψ(θ_s, θ_n, k) ).   (25)

Figure 11 depicts the directivity gain as a function of frequency for different desired directions with a low degree of superdirectivity. The gain increases from 0 dB up to 4–5.5 dB below 1 kHz, depending on the desired direction. Since the microphone distance between the ears is comparably high with 17.5 cm, phase ambiguity causes oscillations in the frequency plot.

Towards higher frequencies, the interaural amplitude differences gain more influence on the directivity gain. For θ_S = 0°, unbalanced amplitudes of the left- and right-ear spectral coefficients decrease the gain in (18) towards high frequencies due to the simple addition of the coefficients in the numerator, while the denominator is dominated by one input spectral amplitude for a lateral signal. For lateral desired directions, however, the interaural amplitude differences are exploited in the numerator of (18), resulting in directivity gain values of up to 5 dB.

Figure 12 shows the directivity for the case that the coefficients are designed with respect to a high degree of superdirectivity. Now, even at low frequencies, a gain of up to nearly 6 dB can be accomplished.

3.2. Multichannel postfilter

The superdirective beamformer produces the best possible signal-to-noise ratio for a narrowband input by minimizing the noise power subject to the constraint of a distortionless response for a desired direction [20]. It can be shown [21] that the best possible estimate in the MMSE sense is the multichannel Wiener filter, which can be factorized into the superdirective beamformer followed by a single-channel Wiener postfilter. The optimum weight vector W_opt(k) that transforms the noisy input vector Y(k) = S(k) + N(k) into the best scalar estimate Ŝ(k) is given by

  W_opt(k) = [ Φ_ss(k) / ( Φ_ss(k) + Φ_nn(k) ) ] · [ Φ_MM^{−1}(k) D(θ_S, k) / ( D^H(θ_S, k) Φ_MM^{−1}(k) D(θ_S, k) ) ],   (26)

where the first factor is the single-channel Wiener filter and the second factor is the MVDR beamformer.

Possible realizations of the Wiener postfilter are based on the observation that the noise correlation between the microphone signals is low [22, 23]. An improved performing algorithm is presented in [21], where the transfer function H_post of the postfilter is estimated by the ratio of the output power spectral density Φ_zz and the average input power spectral density Φ_yy of the beamformer with

  H_post(k) = Φ_zz(k) / Φ_yy(k) = Φ_zz(k) / ( (1/M) Σ_{i=1}^{M} Φ_ii(k) ).   (27)

3.2.1. Adaptation to dual-channel input-output beamformer

In the following, the dual-channel input-output beamformer is extended by also adapting the formulation of the postfilter according to (27) into the spectral weighting framework. The goal is to find spectral weights with similar requirements as for the beamformer gains. Again, only one postfilter weight is to be determined for both left- and right-ear spectral coefficients in order not to disturb the original spatial impression, that is, the interaural amplitude and phase differences. Secondly, a source from a desired direction θ_S should pass unfiltered, that is, the spectral postfilter weight for a signal from that direction should be one.
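The two evaluation quantities above, the directivity gain (25) and the postfilter estimate (27), can be sketched as follows; the pattern values used in the check are hypothetical, not data from the paper's figures:

```python
import numpy as np

def directivity_gain_db(psi_db, steer_idx):
    # Directivity gain, eq. (25): pattern towards the desired direction
    # divided by the average pattern, evaluated in the linear domain.
    psi_lin = 10.0 ** (psi_db / 10.0)
    return 10.0 * np.log10(psi_lin[steer_idx] / psi_lin.mean())

def h_post(phi_zz, phi_ii):
    # Wiener postfilter transfer function, eq. (27): beamformer output
    # PSD over the average input PSD of the M channels.
    return phi_zz / np.mean(phi_ii)

# Hypothetical pattern: 0 dB towards the target, -10 dB for the other
# 71 of the N = 72 equidistant directions.
psi = np.full(72, -10.0)
psi[0] = 0.0
dg = directivity_gain_db(psi, 0)   # roughly 9.5 dB for this toy pattern
```

Note that the averaging in (25) is done on the linear pattern values, not on the dB values, which is why the toy pattern yields a gain of about 9.5 dB rather than 10 dB.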


Figure 11: Directivity gain according to (25) of the superdirective stereo input-output beamformer for desired directions θ_s = 0° (solid), θ_s = −30° (dashed), and θ_s = −60° (dotted) for a low degree of superdirectivity (μ_s = 10).


Figure 12: Directivity gain according to (25) of superdirective stereo input-output beamformer for desired direction θS = 0° (solid), θS = −30° (dashed), and θS = −60° (dotted) for high degree of superdirectivity (μs = 0).

To realize the postfilter according to (27) in the spectral weighting framework, postfilter weights G_post(k) are calculated and multiplicatively combined with the beamformer weights G_super(k) according to (18) to give the resulting weights G(k):

  G(k) = G_super(k) · G_post(k).  (28)

In analogy to the optimal MMSE estimate according to (26), the postfilter weights are calculated as

  G_post(k) = [2 |Z(k)|² / (|Y_l(k)|² + |Y_r(k)|²)] · corr_post(θ_S, k).  (29)

The desired angle- and frequency-dependent correction factor corr_post will guarantee a distortionless response towards a signal from the desired direction θ_S. For a signal from θ_S, (29) can be rewritten as

  G_post(k) = [2 |W^H(θ_S, k) D(θ_S, k)|² |S(k)|² / (|Y_l(k)|² + |Y_r(k)|²)] · corr_post(θ_S, k).  (30)
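A direct transcription of (28) and (29) into code might look as follows (a sketch; `corr_post` is assumed to be precomputed for the steering direction θ_S, and the small constant only guards against division by zero):

```python
import numpy as np

def postfilter_weight(Z, Y_l, Y_r, corr_post):
    """Dual-channel postfilter weight per (29), upper-limited to one."""
    num = 2.0 * np.abs(Z) ** 2
    den = np.maximum(np.abs(Y_l) ** 2 + np.abs(Y_r) ** 2, 1e-12)
    return np.minimum(num / den * corr_post, 1.0)

def combined_weight(G_super, G_post):
    """Resulting spectral weight per (28): G(k) = G_super(k) * G_post(k)."""
    return G_super * G_post
```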

Since the beamformer coefficients have been designed with respect to W^H(θ_S, k) D(θ_S, k) = α_l^phy(θ_S, k) + α_r^phy(θ_S, k), the spectral weights can be reformulated as

  G_post(k) = [2 |S(k)|² |α_l^phy(θ_S, k) + α_r^phy(θ_S, k)|² / (|α_l^phy(θ_S, k)|² |S(k)|² + |α_r^phy(θ_S, k)|² |S(k)|²)] · corr_post(θ_S, k)
            = [2 |α_l^phy(θ_S, k) + α_r^phy(θ_S, k)|² / (|α_l^phy(θ_S, k)|² + |α_r^phy(θ_S, k)|²)] · corr_post(θ_S, k).  (31)

Demanding G_post(k) = 1 gives

  corr_post(θ_S, k) = (|α_l^phy(θ_S, k)|² + |α_r^phy(θ_S, k)|²) / (2 |α_l^phy(θ_S, k) + α_r^phy(θ_S, k)|²).  (32)

Consequently, after insertion of (32) into (29), the resulting postfilter weight calculation for combination with the dual-channel input-output beamformer according to (18), (22) can finally be written as

  G_post(k) = [|Z(k)|² / (|Y_l(k)|² + |Y_r(k)|²)] · [(|α_l^phy(θ_S, k)|² + |α_r^phy(θ_S, k)|²) / |α_l^phy(θ_S, k) + α_r^phy(θ_S, k)|²].  (33)

Again, to avoid amplification, the postfilter weight should be upper-limited to one. Figure 13 shows a block diagram of the resulting system with stereo input-output beamformer plus Wiener postfilter in the DFT domain. After the dual-channel beamformer processing, the postfilter weights are calculated according to (33) and are multiplicatively combined with the beamformer gains according to (28). The dual-channel output spectral coefficients Ŝ_l(k), Ŝ_r(k) are generated by multiplication of the left- and right-side input coefficients Y_l(k), Y_r(k) with the respective weight G(k). Finally, the binaural enhanced time signals are resynthesized using IDFT and overlap-add.

Figure 13: Superdirective input-output beamformer with postfiltering.

4. PERFORMANCE EVALUATION

In this section, the performance of the dual-channel input-output beamformer with postfilter is evaluated in a multitalker situation in a real environment. The performance of the system depends on various parameters of the real environment in which it is applied. First of all, the unknown HRTFs of the target person, for example, a hearing-impaired person, will deviate from the binaural model or from a pre-evaluated HRTF database. The noise reduction performance of the system, which relies on the erroneous database, will thus decrease. Secondly, reverberation will degrade the performance.

In order to evaluate the performance of the beamformer in a realistic environment, recordings of speech sources were made in a conference room (reverberation time T60 ≈ 800 ms) with two source-target distances as depicted in Figure 14. All recordings were performed using a head measurement system (HMS) II dummy head with binaural hearing aids attached above the ears without taking special precautions to match exact positions. In the first scenario, the speech sources were located within a short distance of 0.75 m to the head. Also, the head was located at least 2.2 m away from the nearest wall. In the second scenario, the loudspeakers were moved 2 m away from the dummy head. Thus, the recordings from the two scenarios differ significantly in the direct-to-reverberation ratio. In the experiments, a desired speech source s1 arrives from angle θS1, towards which the beamformer is steered, and an interfering speech signal s2 arrives from angle θS2. The superdirectivity tradeoff factor was set to μs = 0.5.

Firstly, the spectral attenuation of the desired and unwanted speech for one source-interferer configuration, θS1 = −60°, θS2 = 30°, at a distance of 0.75 m from the head is illustrated. The theoretical behavior of the beamformer without postfilter for that specific scenario is indicated by Figure 12. The desired source should pass unfiltered, while the interferer from θS2 = 30° should be frequency-dependently attenuated. A lower degree of attenuation is expected at f = 1000 Hz due to spatial aliasing.

Figure 15 plots the measured results in the real environment. The attenuation of the interfering speech source varies mainly between 2–7 dB, while the desired source is also attenuated by 1–2 dB, more or less constant over the frequency. At frequencies below 700 Hz, the superdirectivity already allows a significant attenuation of the interferer. Due to spatial aliasing, the attenuation difference is very low around 1200 Hz.
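Putting the pieces of Section 3 together, the per-bin weight (33) with the upper limit of one can be sketched as below. The binaural amplitude factors `alpha_l`, `alpha_r` for the steering direction are assumed to come from the HRTF model or a measured database, as discussed above; this is an illustrative fragment, not the authors' code:

```python
import numpy as np

def postfilter_weight_33(Z, Y_l, Y_r, alpha_l, alpha_r):
    """Postfilter weight per (33), upper-limited to one to avoid amplification."""
    snr_term = np.abs(Z) ** 2 / np.maximum(
        np.abs(Y_l) ** 2 + np.abs(Y_r) ** 2, 1e-12)
    corr_term = (np.abs(alpha_l) ** 2 + np.abs(alpha_r) ** 2) / np.maximum(
        np.abs(alpha_l + alpha_r) ** 2, 1e-12)
    return np.minimum(snr_term * corr_term, 1.0)
```

For a source from the steered direction, Y_l = α_l S, Y_r = α_r S, and Z = (α_l + α_r) S by the beamformer design, so the weight evaluates to one, reflecting the distortionless-response requirement.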

At high frequencies, the attenuation difference rises again because the beamformer can exploit the significance of the interaural amplitude differences here.

Figure 14: Recording setup inside conference room (reverberation time T60 ≈ 800 ms).

Figure 15: Spectral attenuation of desired source from θS1 = −60° and unwanted source from θS2 = 30°.

4.1. Intelligibility-weighted improvement

To judge the benefit of a frequency-dependent noise reduction gain like that exemplarily shown in Figure 15, a speech intelligibility-weighted noise reduction gain is applied in the following. For its calculation, a spectral noise reduction gain is determined as the power spectral density attenuation of the undesired source minus that of the desired source, that is,

  AG(ω) = 10 log₁₀ [Φ_s2s2(ω) / Φ̂_s2s2(ω)] − 10 log₁₀ [Φ_s1s1(ω) / Φ̂_s1s1(ω)].  (34)

Here, Φ_s1s1(ω) is the power spectral density of the desired speech and Φ_s2s2(ω) that of the unwanted speech. Φ̂_s1s1 is the power spectral density of the remaining components of s1 after beamformer processing according to Figure 6 or Figure 13, and Φ̂_s2s2 is that of the remaining components of s2. To judge the intelligibility improvement that is achieved by the frequency-dependent reduction gain, AG(ω) is grouped into critical bands AG(ω_b) and weighted according to ANSI's speech intelligibility index standard [24]. The intelligibility-weighted gain can then be calculated by

  IWG = Σ_{b=1}^{K} a_b AG(ω_b),  (35)

where a_b are the weights according to [24] in the respective critical frequency band b. After evaluation of (34) and (35) for the left- and right-microphone signals, the intelligibility-weighted gains of the left and right ears are averaged.

Figure 16 plots the performance of the superdirective binaural input-output beamformer in terms of speech intelligibility-weighted gain for a desired speech source from 0° and speech interferers from variable directions. The two plots in Figure 16 show the gain when all sources were located 0.75 m and 2 m away from the dummy head.

Figure 16: Intelligibility-weighted gain according to (35) of superdirective stereo input-output beamformer for speech from θS1 = 0° and interfering speech from other directions (distance to dummy head 0.75 m and 2 m, resp.).

The binaural input-output superdirective beamformer only delivers about 0.6–1.2 dB intelligibility-weighted improvement because of its comparably low directivity towards the frontal direction as depicted in Figure 8.
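Numerically, (34) and (35) reduce to log-ratios and a weighted sum over critical bands. A sketch follows; the function names are illustrative, and the band-importance weights a_b must be taken from the ANSI S3.5 tables (the values in the usage below are placeholders):

```python
import numpy as np

def attenuation_gain_db(phi_s2, phi_s2_hat, phi_s1, phi_s1_hat):
    """AG per (34): interferer attenuation minus target attenuation, in dB."""
    return (10.0 * np.log10(phi_s2 / phi_s2_hat)
            - 10.0 * np.log10(phi_s1 / phi_s1_hat))

def intelligibility_weighted_gain(ag_bands, a_b):
    """IWG per (35): band-importance-weighted sum of critical-band gains."""
    return float(np.dot(a_b, ag_bands))
```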

Due to the decreased direct-to-reverberation ratio, the overall performance is always lower for the 2 m distance from the dummy head.

Figure 17: Intelligibility-weighted gain according to (35) of superdirective stereo input-output beamformer (μs = 0) for speech from θS1 = −60° and speech interferer from other directions (distance to dummy head 0.75 m or 2 m, resp.).

Figure 17 plots the performance of the superdirective binaural input-output beamformer when the desired signal arrives from θS1 = −60°. Results for desired and interfering sources at distances of 0.75 m and 2 m from the dummy head are shown. For low angular differences to the desired direction, that is, θS2 = −90° and θS2 = −30°, the gain mostly stays below 1 dB. When the interfering speech source is located at the other side, the superdirective beamformer achieves the highest intelligibility-weighted gain, whose value is nearly 3 dB. Due to the decreased direct-to-reverberation ratio at the distance of 2 m, the gain remains below 2 dB.

Now, the influence of the additional binaural postfilter for the superdirective input-output beamformer is examined. Figures 18 and 19 show the intelligibility-weighted noise reduction gain of the superdirective stereo input-output beamformer with and without postfilter for desired directions θS1 = 0° and θS1 = −60°.

Figure 18: Intelligibility-weighted gain according to (35) of superdirective stereo input-output beamformer with and without postfilter for speech from θS1 = 0° and speech interferer from other directions (distance to dummy head 0.75 m).

Figure 19: Intelligibility-weighted gain according to (35) of superdirective stereo input-output beamformer with and without postfilter for speech from θS1 = −60° and speech interferer from other directions (distance to dummy head 0.75 m).

The postfilter provides an additional intelligibility-weighted gain which is proportional to that obtained by the superdirective binaural beamformer. For a desired signal from a lateral direction, the absolute improvement is much larger than for the frontal direction.

4.2. Improvements for both ear sides
Figure 20 plots the intelligibility-weighted gains independently for both ear sides when applying the beamformer with postfiltering. The left part of Figure 20 shows the performance for a lateral desired direction, θS1 = −60°, while the right part plots that for the frontal direction, θS1 = 0°. Due to the formulation of the beamformer with postfilter as a single spectral weight for both left- and right-ear spectral coefficients, the proposed algorithm delivers almost the same improvement for the good- and bad-ear sides.
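That both ear sides improve alike is a direct consequence of the common weight: multiplying both spectra by the same gain leaves the complex ratio Y_l(k)/Y_r(k), and hence the interaural amplitude and phase differences, untouched. A minimal sketch of this property (names are illustrative):

```python
import numpy as np

def apply_binaural_weight(Y_l, Y_r, G):
    """Apply the common spectral weight G(k) to both ear channels."""
    return G * Y_l, G * Y_r

# For any nonzero G(k), the interaural ratio is preserved:
#   (G * Y_l) / (G * Y_r) == Y_l / Y_r
```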

4.5 4.5

4 4

3.5 3.5

3 Worse ear 3 Worse ear (lower SNR) (lower SNR) 2.5 2.5

2 2

1.5 1.5

1 1 Intelligibility-weighted gain (dB) Intelligibility-weighted gain (dB) 0.5 0.5

0 0 −80 −60 −40 −20 0 20 40 60 80 −80 −60 −40 −20 0 20 40 60 80 θ θ Angle of interfering speech S2 Angle of interfering speech S2 (a) (b)

Figure 20: Intelligibility-weighted gain for left ear (dashed line) and right ear (dotted line): (a) θS1 = −60°; (b) θS1 = 0°.

The improvement at the ear side with the lower SNR is only slightly higher.

4.3. Speech quality of target source

To measure the speech quality of the target signal after processing, the segmental SNR is measured. Again, the target speech was mixed with interferers from other directions. The speech quality was then determined by applying the resulting filter to the target signal alone and calculating the segmental speech SNR between input and filtered output. Figure 21 plots the segmental speech SNR for the two considered desired angles, θS1 = −60° and θS1 = 0°. The speech quality of the target source is somewhat degraded due to its attenuation caused by imperfect knowledge of the HRTFs, as also depicted in Figure 15; however, the speech SNR is always high at 15–25 dB. For the lateral desired direction, the target attenuation is always higher than for the frontal direction.

Figure 21: Segmental speech SNR of target signal for two different desired directions.

5. CONCLUSION

We have presented a dual-channel input-output algorithm for binaural speech enhancement, which consists of a superdirective beamformer and a postfilter with an underlying binaural signal model, realized as a simple spectral weighting scheme. The system perfectly preserves the interaural amplitude and phase differences and passes a source from a desired direction unfiltered in principle.

Instrumental measurements and informal listening tests indicate a significant attenuation of speech interferers in a real environment, while the target source is only slightly degraded. The amount of interference rejection depends on the spatial positions of the desired and unwanted sources; a high rejection of interfering noise or speech can particularly be expected for lateral desired directions. The proposed algorithms can efficiently be realized in real time.

REFERENCES

[1] J. Bitzer and K. U. Simmer, "Superdirective microphone arrays," in Microphone Arrays: Signal Processing Techniques and Applications, M. S. Brandstein and D. B. Ward, Eds., chapter 2, pp. 19–38, Springer, Berlin, Germany, 2001.
[2] E. N. Gilbert and S. P. Morgan, "Optimum design of directive antenna arrays subject to random variations," Bell System Technical Journal, vol. 34, pp. 637–663, 1955.

[3] M. Dörbecker, Mehrkanalige Signalverarbeitung zur Verbesserung akustisch gestörter Sprachsignale am Beispiel elektronischer Hörhilfen, Ph.D. thesis, Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany, 1998.
[4] S. Doclo and M. Moonen, "Design of broadband beamformers robust against gain and phase errors in the microphone array characteristics," IEEE Transactions on Signal Processing, vol. 51, no. 10, pp. 2511–2526, 2003.
[5] W. Tager, "Near field superdirectivity (NFSD)," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98), vol. 4, pp. 2045–2048, Seattle, Wash, USA, May 1998.
[6] L. J. Griffiths and C. W. Jim, "An alternative approach to linearly constrained adaptive beamforming," IEEE Transactions on Antennas and Propagation, vol. 30, no. 1, pp. 27–34, 1982.
[7] O. Hoshuyama and A. Sugiyama, "Robust adaptive beamforming," in Microphone Arrays, M. S. Brandstein and D. B. Ward, Eds., pp. 87–110, Springer, Berlin, Germany, 2001.
[8] W. Herbordt and W. Kellermann, "Adaptive beamforming for audio signal acquisition," in Adaptive Signal Processing—Applications to Real-World Problems, J. Benesty and Y. Huang, Eds., pp. 155–194, Springer, Berlin, Germany, 2003.
[9] J. G. Desloge, W. M. Rabinowitz, and P. M. Zurek, "Microphone-array hearing aids with binaural output. I. Fixed-processing systems," IEEE Transactions on Speech and Audio Processing, vol. 5, no. 6, pp. 529–542, 1997.
[10] G. W. Elko, "Spatial coherence functions for differential microphones in isotropic noise fields," in Microphone Arrays: Signal Processing Techniques and Applications, M. S. Brandstein and D. B. Ward, Eds., chapter 4, pp. 61–86, Springer, Berlin, Germany, 2001.
[11] V. Hamacher, "Comparison of advanced monaural and binaural noise reduction algorithms for hearing aids," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 4, pp. 4008–4011, Orlando, Fla, USA, May 2002.
[12] V. Hohmann, J. Nix, G. Grimm, and T. Wittkop, "Binaural noise reduction for hearing aids," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 4, pp. 4000–4003, Orlando, Fla, USA, May 2002.
[13] T. Wittkop, S. Albani, V. Hohmann, J. Peissig, W. S. Woods, and B. Kollmeier, "Speech processing for hearing aids: noise reduction motivated by models of binaural interaction," Acustica United with Acta Acustica, vol. 83, no. 4, pp. 684–699, 1997.
[14] D. R. Campbell and P. W. Shields, "Speech enhancement using sub-band adaptive Griffiths–Jim signal processing," Speech Communication, vol. 39, no. 1-2, pp. 97–110, 2003, special issue on speech processing for hearing aids.
[15] D. P. Welker, J. E. Greenberg, J. G. Desloge, and P. M. Zurek, "Microphone-array hearing aids with binaural output. II. A two-microphone adaptive system," IEEE Transactions on Speech and Audio Processing, vol. 5, no. 6, pp. 543–551, 1997.
[16] T. Lotter, Single and multimicrophone speech enhancement for hearing aids, Ph.D. thesis, Rheinisch-Westfälische Technische Hochschule Aachen, Aachen, Germany, 2004.
[17] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization, MIT Press, Cambridge, Mass, USA, 1983.
[18] B. Gardner and K. Martin, "HRTF measurement of a KEMAR dummy-head microphone," Tech. Rep. 280, MIT Media Laboratory Perceptual Computing, Cambridge, Mass, USA, May 1994, http://sound.media.mit.edu/KEMAR.html.
[19] C. P. Brown and R. O. Duda, "A structural model for binaural sound synthesis," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 5, pp. 476–488, 1998.
[20] R. A. Monzingo and T. W. Miller, Introduction to Adaptive Arrays, John Wiley & Sons, New York, NY, USA, 1980.
[21] K. U. Simmer, J. Bitzer, and C. Marro, "Post-filtering techniques," in Microphone Arrays: Signal Processing Techniques and Applications, M. S. Brandstein and D. B. Ward, Eds., chapter 3, pp. 39–60, Springer, Berlin, Germany, 2001.
[22] R. Zelinski, "A microphone array with adaptive post-filtering for noise reduction in reverberant rooms," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '88), vol. 5, pp. 2578–2581, New York, NY, USA, April 1988.
[23] C. Marro, Y. Mahieux, and K. U. Simmer, "Analysis of noise reduction and dereverberation techniques based on microphone arrays with postfiltering," IEEE Transactions on Speech and Audio Processing, vol. 6, no. 3, pp. 240–259, 1998.
[24] ANSI S3.5-1997, "Methods for Calculation of the Speech Intelligibility Index," American National Standards Institute, 1997.

Thomas Lotter received the Dipl.-Ing. degree in electrical engineering in 2000 from the Aachen University of Technology, RWTH Aachen. He received the Ph.D. degree from the RWTH Aachen University in 2004 after working at the Institute of Communication Systems and Data Processing in the area of single- and multimicrophone speech enhancement. In 2004, he joined Siemens Audiological Engineering Group in Erlangen, Germany, with focus on hearing aid applications. His main interests include speech enhancement, signal processing for wireless systems, wireless standards, and audio coding.

Peter Vary received the Dipl.-Ing. degree in electrical engineering in 1972 from the University of Darmstadt, Darmstadt, Germany. In 1978, he received the Ph.D. degree from the University of Erlangen-Nuremberg, Germany. In 1980, he joined Philips Communication Industries (PKI), Nuremberg, where he became the Head of the Digital Signal Processing Group. Since 1988, he has been a Professor at Aachen University of Technology, Aachen, Germany, and the Head of the Institute of Communication Systems and Data Processing. His main research interests are speech coding, channel coding, error concealment, adaptive filtering for acoustic echo cancellation and noise reduction, and concepts of mobile transmission.