Spectral Smoothing for Frequency-Domain Blind Source Separation
Total Page:16
File Type:pdf, Size:1020Kb
SPECTRAL SMOOTHING FOR FREQUENCY-DOMAIN BLIND SOURCE SEPARATION Hiroshi Sawada Ryo Mukai Se´bastien de la Kethulle de Ryhove Shoko Araki Shoji Makino NTT Communication Science Laboratories, NTT Corporation 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237, Japan {sawada,ryo,delaketh,shoko,maki}@cslab.kecl.ntt.co.jp ABSTRACT We need thousands of filter taps to separate acoustic signals This paper describes the circularity problem of frequency- in a room. The approach is impractical for such cases unless domain blind source separation (BSS), and presents a new starting from a good initial solution. method for solving it. Frequency-domain BSS performs in- The other approach is frequency-domain BSS, where dependent component analysis (ICA) in each frequency bin. complex-valued ICA for an instantaneous mixture is applied It is more efficient than time-domain BSS where ICA is ap- in each frequency bin [3–8]. The merit of this approach is plied to convolutive mixtures. However, frequency-domain that the ICA algorithm can be performed separately at each BSS has two problems. The first is the permutation prob- frequency, and the convergence of each ICA is fast. The dif- lem, for which we have recently proposed a method. It ficulty lies in solving the permutation problem and the cir- provides a robust and precise solution for the permutation cularity problem. The permutation problem is well known problem and reveals the influence of the second problem, to be a difficult problem. Recently, we have proposed a namely the circularity problem. Our solution for this sec- method [6] that provides a robust and precise solution and ond problem is based on spectral smoothing by window- also enables the separation of more than two sources. As ing. However, the direct application of windowing changes a consequence, the influence of the circularity problem is the frequency responses for separation obtained by ICA and highlighted. It deteriorates separation performance as ex- causes an error. Therefore, we adjust the frequency responses plained in Sec. 3. before windowing so that the error is minimized. The effec- One well-known aspect of the circularity problem is that tiveness of the method is shown by experimental results for a multiplication in the frequency domain corresponds to a the separation of up to four sources. circular convolution in the time domain, which is different from a linear one. A widely used technique for simulating a linear convolution by frequency-domain multiplication in- 1. INTRODUCTION volves constraining frequency responses such that the cor- responding time-domain filter contains enough consecutive We consider the problem of blind source separation (BSS) zeros at the end [9]. To maintain this constraint strictly in for convolutive mixtures [1]. Suppose that N source sig- BSS, the gradient of a filter in the ICA algorithm should also nals sp(t) are mixed and observed at M sensors xq(t)= be constrained to contain enough consecutive zeros. If this N ( ) ( − ) ( ) p=1 l hqp l sp t l , where hqp l represents the im- constraint is followed in frequency-domain BSS, it becomes pulse response from source p to sensor q. The goal is to sep- a frequency-domain implementation of time-domain convo- ( ) arate the mixtures xq t and to obtain a filtered version of lutive ICA [10, 11], whose convergence is slow as in the ( ) ( ) a source sp t at each output yr t . The separation system case of time-domain BSS. Some BSS methods interleave typically consists of a set of FIR filters wrq(l) of length L M L−1 this constraint with a frequency-domain ICA algorithm [7, to produce separated signals yr(t)= q=1 l=0 wrq(l) 8]. With these methods, the constraint conflicts with the xq(t − l). In a practical situation, separation should be per- ICA solution, and these two different operations should be formed without knowing sp(t) or hqp(l). Independent com- applied repeatedly until convergence. In both cases, the al- ponent analysis (ICA) [2] is one of the major statistical tools gorithm moves back and forth between the two domains, for solving this problem. We can classify BSS methods into and loses the attractive characteristic of frequency-domain two approaches based on how we apply ICA. BSS whereby ICA can be applied separately in each fre- The first approach is time-domain BSS, where ICA is quency bin. applied directly to the convolutive mixture model. The ICA Our recognition of the circularity problem is more gen- algorithm correctly evaluates the independence of separated eral than the circular convolution problem. The circularity signals yr(t). Thus, the approach achieves good separation problem arises from discrete frequency representation as ex- once the algorithm converges. However, if the algorithm plained in Sec. 3. Our technique for solving the problem in- starts from a solution far from the final one, it takes many volves spectral smoothing by a window that tapers smoothly iterations and much time to converge. This is because filter to zero at each end, and forcing a filter to fit a specific length coefficients wrq(l) depend on each other in the algorithm. L. The direct application of windowing, however, changes mixed signals separated signals x(t) w(l) y(t) 1 0 STFT IDFT X( f ,t) W( f ) −1 Amplitude −2 ICA permutation scaling smoothing W( f ) 1000 2000 3000 4000 5000 6000 1 Fig. 1. Flow of frequency-domain BSS 0 −1 the ICA solutions and causes an error. Therefore, we adjust Amplitude −2 the ICA solutions within their scaling ambiguity before win- 1000 2000 3000 4000 5000 6000 dowing so that the error is minimized in the least-squares Time (sample) sense. These techniques are explained in Sec. 4. Fig. 2. Periodical time-domain filter represented by fre- quency responses sampled at L = 2048 points (above) and 2. FREQUENCY-DOMAIN BSS its one-period realization (below). This section describes frequency-domain BSS where ICA Target: u (l) Interference: u (l) is applied separately in each frequency bin [3–6]. Figure 1 11 13 shows the flow. First, time-domain signals xq(t) are con- 1 1 ( ) verted into frequency-domain time-series signals X q f,t 0.5 0.5 by short-time Fourier transform (STFT), where t is now 0 0 down-sampled with the distance of the frame shift. Then, Amplitude −0.5 −0.5 to obtain the frequency responses Wrq(f) of filters wrq(l), complex-valued ICA Y(f,t)= W(f)X(f,t) is solved, 3000 4000 5000 3000 4000 5000 T where X(f,t)=[X1(f,t), ..., XM (f,t)] , Y(f,t)= T 1 1 [Y1(f,t), ...,YN (f,t)] , and W(f) is a separation matrix 0.5 0.5 whose elements are Wrq(f). Note that any ICA algorithm can be used in this scheme. The ICA solution in each fre- 0 0 quency has scaling and permutation ambiguity. We perform Amplitude −0.5 −0.5 W( ) ← P( )W( ) P( ) permutation alignment f f f , where f 3000 4000 5000 3000 4000 5000 is a permutation matrix obtained by the method [6]. The Time (sample) Time (sample) scaling ambiguity is solved by the minimal distortion prin- Fig. 3. Impulse responses urp(l) obtained by the periodical W( ) ← (W( )−1)W( ) ( ) ciple, f diag f f , to make Yr f,t as filter (above) and by the finite-length filter (below). close to Xr(f,t) as possible [4, 12]. Then, we solve the circularity problem by spectral smoothing as described in ( ) Sec. 4. Finally, separation filters wrq l are obtained by ap- are obtained by the infinite-length filters, and the lower ones ( ) plying inverse DFT to Wrq f . are by the one-period filters. We see that the one-period fil- ters create spikes, which distort a target signal and degrade 3. THE CIRCULARITY PROBLEM the separation performance. Since these spikes does not exist in the infinite-length case, both adjacent periods are The frequency-domain BSS described in the previous sec- needed to eliminate a spike. Here, we consider two reasons tion is influenced by the circularity of the discrete frequency for these spikes. One is that the frequency responses are un- representation. The circularity refers to the fact that fre- dersampled and the corresponding time domain filter has an quency responses sampled at L points with an interval f s/L overlap with another period. This problem does not occur (fs: sampling frequency) represent a time-domain signal if the required filter length is less than L. The other rea- whose period is L/fs. Figure 2 shows two time-domain fil- son is that adjacent periods work together to perform some ters. The upper one is a periodical infinite-length filter rep- filtering even if the first problem is solved. The effect of ( ) resented by frequency responses Wrq f calculated by ICA the second problem can be mitigated if the amplitude of the at L points. Since this filter is unrealistic, we usually use the filter coefficients around both ends is small. one-period realization of the filter shown in the lower part. The ICA algorithm in the frequency domain does not However, one-period filters may cause a problem. Fig- know the length L of the time-domain filters, and ICA solu- ure 3 shows impulse responses from a source sp(t) to a sep- M L−1 tions in various frequency bins might require the length of arated signal yr(t): urp(l)= q=1 τ=0 wrq(τ)hqp(l−τ). the time-domain filter to be longer than L and generally infi- Those on the left u11(l) correspond to the extraction of a nite.