Eric Jacobsen and Richard Lyons

The Sliding DFT

IEEE SIGNAL PROCESSING MAGAZINE, MARCH 2003, 1053-5888/03/$17.00 ©2003 IEEE

The standard method for spectrum analysis in digital signal processing (DSP) is the discrete Fourier transform (DFT), typically implemented using a fast Fourier transform (FFT) algorithm. However, there are applications that require spectrum analysis only over a subset of the N center frequencies of an N-point DFT. A popular, as well as efficient, technique for computing sparse DFT results is the Goertzel algorithm that computes a single complex DFT spectral bin value for every N input time samples. This article describes a sliding DFT process whose spectral bin output rate is equal to the input data rate, on a sample-by-sample basis, with the advantage that it requires fewer computations than the Goertzel algorithm for real-time spectral analysis. In applications where a new DFT output spectrum is desired every sample, or every few samples, the sliding DFT is computationally simpler than the traditional radix-2 FFT.

We'll start our sliding DFT discussion by providing a brief review of the Goertzel algorithm and use its behavior as a yardstick to evaluate the performance of the sliding DFT technique. Following that, we will examine stability issues regarding the sliding DFT implementation as well as review the process of frequency-domain convolution to accomplish time-domain windowing. Finally, a modified sliding DFT structure is proposed that provides improved computational efficiency.

“DSP Tips and Tricks” introduces practical tips and tricks of design and implementation of signal processing algorithms so that you may be able to incorporate them into your designs. We welcome readers who enjoy reading this column to submit their contributions. Please contact Associate Editor Rick Lyons at [email protected].

Goertzel Algorithm

The Goertzel algorithm, used in dual-tone multifrequency decoding and phase-shift keying/frequency-shift keying modem implementations, is commonly used to compute DFT spectra [1]-[4]. The algorithm is implemented in the form of a second-order infinite impulse response (IIR) filter as shown in Figure 1. This filter computes a single DFT output (the kth bin of an N-point DFT) defined by

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}. \tag{1}$$

The filter's $y(n)$ output is equal to the DFT output frequency coefficient, $X(k)$, at the time index $n = N$. For emphasis, we remind the reader that the filter's $y(n)$ output is not equal to $X(k)$ at any time index when $n \neq N$. The frequency-domain index k is an integer in the range $0 \le k \le N-1$. The derivation of this filter's structure is readily available in the literature [5]-[7]. The z-domain transfer function of the Goertzel filter is

$$H_G(z) = \frac{1 - e^{-j2\pi k/N} z^{-1}}{1 - 2\cos(2\pi k/N)\, z^{-1} + z^{-2}} \tag{2}$$

with a single z-domain zero located at $z = e^{-j2\pi k/N}$ and conjugate poles at $z = e^{\pm j2\pi k/N}$ as shown in Figure 2(a). The pole and zero at $z = e^{-j2\pi k/N}$ cancel each other. The frequency magnitude response, provided in Figure 2(b), shows resonance centered at a normalized frequency of $2\pi k/N$, corresponding to a cyclic frequency of $k \cdot f_s/N$ Hz (where $f_s$ is the signal sample rate).

While the Goertzel algorithm is derived from the standard DFT equation, it's important to realize that the filter's frequency magnitude response is not the $\sin(x)/(x)$-like response of a single-bin DFT. The Goertzel filter is a complex resonator having an infinite-length unit impulse response, $h(n) = e^{j2\pi nk/N}$, and that's why its magnitude response is so narrow. The time-domain difference equations for the Goertzel filter are

$$v(n) = 2\cos(2\pi k/N)\, v(n-1) - v(n-2) + x(n) \tag{3a}$$

$$y(n) = v(n) - e^{-j2\pi k/N}\, v(n-1). \tag{3b}$$

An advantage of the Goertzel filter in calculating an N-point $X(k)$ DFT bin is that (3a) is implemented N times while (3b), the feed forward path in Figure 1, need only be computed once after the arrival of the Nth input sample. Thus for real $x(n)$ the filter requires $N + 2$ real multiplies and $2N + 1$ real adds to compute an N-point $X(k)$. However, when modeling the Goertzel filter, if the time index begins at $n = 0$, the filter must process $N + 1$ time samples with $x(N) = 0$ to compute $X(k)$. Now let's look at the sliding DFT process.

▲ 1. IIR filter implementation of the Goertzel algorithm.
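As a minimal sketch of the recursion (3a) followed by the single feed-forward step (3b), the Python below computes one DFT bin and compares it with a direct evaluation of (1). The function names `goertzel_bin` and `dft_bin` are illustrative and not part of the article; the appended zero sample implements the $x(N) = 0$ convention for a time index that starts at $n = 0$.

```python
import cmath
import math


def goertzel_bin(x, k):
    """Return X(k) of the len(x)-point DFT using (3a) and (3b)."""
    N = len(x)
    theta = 2.0 * math.pi * k / N
    coeff = 2.0 * math.cos(theta)      # real feedback coefficient of (3a)
    v1 = 0.0                           # v(n-1)
    v2 = 0.0                           # v(n-2)
    # Run (3a) for n = 0..N; the last input is the extra sample x(N) = 0.
    for sample in list(x) + [0.0]:
        v0 = coeff * v1 - v2 + sample
        v2, v1 = v1, v0
    # Now v1 = v(N) and v2 = v(N-1); apply (3b) once.
    return v1 - cmath.exp(-1j * theta) * v2


def dft_bin(x, k):
    """Direct evaluation of (1) for one bin, used only as a reference."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
```

For a real input block, `goertzel_bin(x, k)` and `dft_bin(x, k)` should agree to within floating-point rounding; the feedback loop uses only real arithmetic, and the complex multiply happens once per block.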

▲ 2. Goertzel filter: (a) z-domain pole/zero locations and (b) frequency magnitude response.

Sliding DFT

The sliding DFT (SDFT) algorithm performs an N-point DFT on time samples within a sliding window as shown in Figure 3. In this example the SDFT initially computes the DFT of the N = 16 time samples in Figure 3(a). The time window is then advanced one sample, as in Figure 3(b), and a new N-point DFT is calculated. The value of this process is that each new DFT is efficiently computed directly from the results of the previous DFT. The incremental advance of the time window for each output computation is what leads to the name sliding DFT or sliding-window DFT.

The principle used for the SDFT is known as the DFT shifting theorem or the circular shift property [8]. It states that if the DFT of a windowed (finite-length) time-domain sequence is $X(k)$, then the DFT of that sequence, circularly shifted by one sample, is $X(k)e^{j2\pi k/N}$. Thus the spectral components of a shifted time sequence are the original (unshifted) spectral components multiplied by $e^{j2\pi k/N}$, where k is the DFT bin of interest. We express this process by

$$S_k(n) = S_k(n-1)\, e^{j2\pi k/N} - x(n-N) + x(n) \tag{4}$$

where $S_k(n)$ is the new spectral component and $S_k(n-1)$ is the previous spectral component. The subscript k reminds us that the spectra are those associated with the kth DFT bin.

▲ 3. Signal windowing for two 16-point DFTs: (a) data samples in the first computation and (b) second computation samples.
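Recursion (4) can be sketched in a few lines of Python. The class name `SlidingDFT`, the helper `window_dft_bin`, and the use of a `deque` as the N-sample delay line are choices made for this illustration, not part of the article. In this sketch, running (4) exactly as written from a zero initial state gives a value that matches the DFT bin (1) of the current window in magnitude and differs from it by the fixed phase factor $e^{-j2\pi k/N}$; the reference helper lets that be checked directly.

```python
import cmath
import math
from collections import deque


class SlidingDFT:
    """Single-bin sliding DFT that applies recursion (4) once per input sample."""

    def __init__(self, N, k):
        self.N = N
        self.k = k
        self.twiddle = cmath.exp(2j * math.pi * k / N)  # e^{j2*pi*k/N}
        self.delay = deque([0.0] * N, maxlen=N)         # holds x(n-N) ... x(n-1)
        self.S = 0j                                     # S_k(n-1), zero-initialized

    def update(self, x_new):
        x_old = self.delay[0]        # x(n-N), the sample leaving the window
        self.delay.append(x_new)     # maxlen discards x(n-N) automatically
        self.S = self.S * self.twiddle - x_old + x_new
        return self.S


def window_dft_bin(window, k):
    """Direct evaluation of (1) on one window, used only as a reference."""
    N = len(window)
    return sum(window[n] * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
```

After at least N calls to `update`, `abs(sdft.S)` should equal the magnitude of `window_dft_bin` applied to the most recent N samples, and `sdft.S * cmath.exp(2j * math.pi * k / N)` should equal its complex value, both to within rounding.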

Equation (4), whose derivation is provided in the Appendix, reveals the value of this process in computing real-time spectra. We calculate $S_k(n)$ by phase shifting the previous $S_k(n-1)$ component, subtracting the $x(n-N)$ sample, and adding the current $x(n)$ sample. Thus the SDFT requires only one complex multiply and two real adds per output sample. The computational complexity of each successive N-point output is then $O(N)$ for the sliding DFT compared to $O(N^2)$ for the DFT and $O[N\log_2(N)]$ for the FFT. Unlike the DFT or FFT, however, due to its recursive nature the sliding DFT output must be computed for each new input sample. If a new N-point DFT output is required only every N inputs, the sliding DFT requires $O(N^2)$ computations and is equivalent to the DFT. When output computations are required every M input samples, and M is less than $\log_2(N)$, the sliding DFT can be computationally superior to traditional FFT implementations even when all N DFT outputs are required.

Equation (4) leads to the single-bin SDFT filter structure shown in Figure 4.

▲ 4. Single-bin sliding DFT filter structure.

The single-bin SDFT algorithm is implemented as an IIR filter with a comb filter followed by a complex resonator [9]. (If you want to compute all N DFT spectral components, N resonators with k = 0 to N − 1 will be needed, all driven by a single comb filter.) The comb filter delay of N samples forces the filter's transient response to be N − 1 samples in length, so the output will not reach steady state until the $S_k(N)$ sample. In practical applications the algorithm can be initialized with zero input and zero output. The output will not be valid, or equivalent to (1)'s $X(k)$, until N input samples have been processed. The z-domain transfer function for the kth bin of the sliding DFT filter is

$$H_{SDFT}(z) = \frac{1 - z^{-N}}{1 - e^{j2\pi k/N} z^{-1}}. \tag{5}$$

This complex filter has N zeros equally spaced around the z-domain's unit circle, due to the N-delay comb filter, as well as a single pole canceling the zero at $z = e^{j2\pi k/N}$. The SDFT filter's complex unit impulse response $h(n)$ and pole/zero locations are shown in Figure 5 for the example where k = 2 and N = 20.

▲ 5. Sliding DFT characteristics for k = 2 and N = 20: (a) impulse response and (b) pole/zero locations.

Because of the comb subfilter, the SDFT filter's complex sinusoidal unit impulse response is finite in length, truncated in time to N samples, and that property makes the frequency magnitude response of the SDFT filter identical to the $\sin(Nx)/\sin(x)$ response of a single DFT bin centered at a normalized frequency of $2\pi k/N$.

One of the attributes of the SDFT is that once an $S_k(n-1)$ is obtained, the number of computations to calculate $S_k(n)$ is fixed and independent of N. A computational workload comparison between the Goertzel and SDFT filters is provided later in this article. Unlike the radix-2 FFT, the SDFT's N can be any positive integer, giving us greater flexibility to tune the SDFT's center frequency by defining integer k such that $k = N \cdot f_i/f_s$, where $f_i$ is a frequency of interest in hertz. In addition, the SDFT does not require bit-reversal processing as does the FFT. Like Goertzel, the SDFT is especially efficient for narrowband spectrum analysis.

For completeness, we mention that a radix-2 sliding FFT technique exists for computing all N bins of $X(k)$ in (1) [10], [11]. This method is computationally attractive because it requires only N complex multiplies to update the N-point FFT for all N bins; however, it requires 3N memory locations (2N for data and N for twiddle coefficients). Unlike the SDFT, the radix-2 sliding FFT scheme requires address bit-reversal processing and restricts N to be an integer power of two.
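The finite-length impulse response described for the comb-plus-resonator structure can be checked numerically. The sketch below drives recursion (4) with a unit impulse; the function name `sdft_impulse_response` is illustrative only. For k = 2 and N = 20, the first N outputs should trace the complex sinusoid $e^{j2\pi nk/N}$, and every later output should be zero to within rounding, because the delayed impulse from the comb cancels the resonator state.

```python
import cmath
import math


def sdft_impulse_response(N, k, length):
    """Response of recursion (4) to a unit impulse applied at n = 0."""
    twiddle = cmath.exp(2j * math.pi * k / N)
    x = [1.0] + [0.0] * (length - 1)           # unit impulse input
    S = 0j
    h = []
    for n in range(length):
        x_old = x[n - N] if n >= N else 0.0    # x(n-N), zero before the start
        S = S * twiddle - x_old + x[n]         # recursion (4)
        h.append(S)
    return h
```

Calling `sdft_impulse_response(20, 2, 40)` returns 40 complex values; plotting the real and imaginary parts of the first 20 gives two cycles of a cosine and a sine, which is the behavior the text attributes to Figure 5(a).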

SDFT Stability

The SDFT filter is only marginally stable because its pole resides on the z-domain's unit circle. If filter coefficient numerical rounding error is not severe, the SDFT is bounded-input, bounded-output stable. Filter instability can be a problem, however, if numerical coefficient rounding causes the filter's pole to move outside the unit circle. We can use a damping factor r to force the pole to be at a radius of r inside the unit circle and guarantee stability using a transfer function of

$$H_{SDFT,gs}(z) = \frac{1 - r^N z^{-N}}{1 - r e^{j2\pi k/N} z^{-1}} \tag{6}$$

with the subscript gs meaning guaranteed-stable. The stabilized feed-forward and feedback coefficients become $-r^N$ and $r e^{j2\pi k/N}$, respectively. The difference equation for the stable SDFT filter becomes

$$S_{k,gs}(n) = S_{k,gs}(n-1)\, r e^{j2\pi k/N} - x(n-N)\, r^N + x(n) \tag{7}$$

with the stabilized-filter structure shown in Figure 6.

▲ 6. Guaranteed-stable sliding DFT filter structure.

Using a damping factor as in Figure 6 guarantees stability, but the $S_k(n)$ output, defined by

$$X_{r<1}(k) = \sum_{n=0}^{N-1} x(n) \cdot r e^{-j2\pi nk/N} \tag{8}$$

is no longer exactly equal to the kth bin of an N-point DFT in (1). While the error is reduced by making r very close to (but less than) unity, a scheme does exist for eliminating that error completely once every N output samples at the expense of additional conditional logic operations [12]. Determining if the damping factor r is necessary for a particular SDFT application requires careful empirical investigation.

Another stabilization method worth consideration is decrementing the largest component (either real or imaginary) of the filter's $e^{j2\pi k/N}$ feedback coefficient by one least significant bit. This technique can be applied selectively to problematic output bins and is effective in combating instability due to rounding errors which result in finite-precision $e^{j2\pi k/N}$ coefficients having magnitudes greater than unity.

Like the DFT, the SDFT's output is proportional to N, so in fixed-point binary implementations the designer must allocate sufficiently wide registers to hold the computed results.

Time-Domain Windowing in the Frequency Domain

The spectral leakage of the SDFT can be reduced by the standard concept of windowing the $x(n)$ input time samples. However, windowing by time-domain multiplication would compromise the computational simplicity of the SDFT. Alternatively, we can implement a time-domain window by means of frequency-domain convolution.

Spectral leakage reduction performed in the frequency domain is accomplished by convolving adjacent $S_k(n)$ values with the DFT of a window function. For example, the DFT of a Hanning window comprises only three nonzero values, −0.25, 0.5, and −0.25. As such we can compute a Hanning-windowed $S_k(n)$, the kth DFT bin, with a three-point convolution using

$$\text{Hanning-windowed } S_k(n) = -0.25 \cdot S_{k-1}(n) + 0.5 \cdot S_k(n) - 0.25 \cdot S_{k+1}(n). \tag{9}$$

Figure 7 shows this process where the comb filter stage need only be implemented once. Thus a Hanning window can be implemented by binary right shifts (assuming integer arithmetic) and two complex adds for each SDFT bin, making the Hanning window attractive in ASIC and FPGA implementations where single-cycle hardware multiplies are costly.

▲ 7. Three-resonator structure to compute three SDFT bin results and a three-point convolution.
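The three-resonator arrangement with one shared comb stage, together with the damped recursion (7) and the three-point convolution (9), can be sketched as follows. All function names here are illustrative. One assumption is specific to this sketch: the bins produced by (4) or (7) carry a bin-dependent phase factor $e^{-j2\pi m/N}$ relative to (1), so `hanning_bin` rotates each bin by $e^{j2\pi m/N}$ before combining them. With that rotation and r = 1, the result can be compared against the DFT of the window multiplied in the time domain by $0.5 - 0.5\cos(2\pi n/N)$.

```python
import cmath
import math


def sdft_bins(x, N, ks, r=1.0):
    """Final outputs of recursion (7) for each bin in ks; r = 1 reduces to (4)."""
    feedback = {k: r * cmath.exp(2j * math.pi * k / N) for k in ks}
    S = {k: 0j for k in ks}
    r_N = r ** N
    for n, sample in enumerate(x):
        x_old = x[n - N] if n >= N else 0.0
        comb = sample - r_N * x_old          # one comb stage shared by all resonators
        for k in ks:
            S[k] = S[k] * feedback[k] + comb
    return S


def hanning_bin(S, N, k):
    """Three-point convolution (9) applied to bins k-1, k, and k+1."""
    # Sketch-specific assumption: undo the per-bin phase offset before combining.
    X = {m: S[m] * cmath.exp(2j * math.pi * m / N) for m in (k - 1, k, k + 1)}
    return -0.25 * X[k - 1] + 0.5 * X[k] - 0.25 * X[k + 1]


def hanning_reference(window, k):
    """DFT bin k of the window after time-domain Hanning weighting."""
    N = len(window)
    return sum(window[n] * (0.5 - 0.5 * math.cos(2 * math.pi * n / N))
               * cmath.exp(-2j * math.pi * n * k / N) for n in range(N))
```

With r = 1 the frequency-domain and time-domain results should agree to within rounding. With r slightly below one, for example 0.9999, they should differ by a small amount that shrinks as r approaches unity, which is the trade-off the stability discussion describes.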

If a gain of four is acceptable, then only one left shift and two complex adds are required using

$$\text{Hanning-windowed } S_k(n) = -S_{k-1}(n) + 2 \cdot S_k(n) - S_{k+1}(n). \tag{10}$$

The Hanning window is a member of a category called $\cos^{\alpha}(x)$ window functions [13], [14]. These functions are also known as generalized cosine windows because their N-point time-domain samples are defined as

$$w(n) = \sum_{m=0}^{\alpha-1} (-1)^m a_m \cos(2\pi mn/N) \tag{11}$$

where $n = 0, 1, 2, \ldots, N-1$, and the integer $\alpha$ specifies the number of terms in the window's time function. These window functions are attractive for frequency domain convolution because their DFTs contain only a few nonzero samples. The frequency domain vectors of various $\cos^{\alpha}(x)$ window functions follow the form $(1/2) \cdot (a_2, -a_1, 2a_0, -a_1, a_2)$, with a few examples presented in Table 1. Additional $\cos^{\alpha}(x)$ window functions are described in the literature [14].

Table 1. $\cos^{\alpha}(x)$ windows, frequency domain coefficients.

Window function            a_0           a_1           a_2
Rectangular                1.0           -             -
Hanning (α = 2)            0.5           0.25          -
Hamming (α = 2)            0.54          0.46          -
Blackman (α = 3)           0.42          0.5           0.08
Exact Blackman (α = 3)     7938/18608    9240/18608    1430/18608

Sliding Goertzel DFT

We can reduce the number of multiplications required in the SDFT by creating a new pole/zero pair in its $H_{SDFT}(z)$ system function [7]. This is done by multiplying the numerator and denominator of $H_{SDFT}(z)$ in (5) by the factor $(1 - e^{-j2\pi k/N} z^{-1})$ yielding

$$H_{SG}(z) = \frac{(1 - e^{-j2\pi k/N} z^{-1})(1 - z^{-N})}{(1 - e^{-j2\pi k/N} z^{-1})(1 - e^{j2\pi k/N} z^{-1})} = \frac{(1 - e^{-j2\pi k/N} z^{-1})(1 - z^{-N})}{1 - 2\cos(2\pi k/N)\, z^{-1} + z^{-2}} \tag{12}$$

where the subscript SG means sliding Goertzel. The filter block diagram for $H_{SG}(z)$ is shown in Figure 8 where this new filter is recognized as the standard Goertzel filter preceded by a comb filter. The sliding Goertzel DFT filter, unlike the standard Goertzel filter, has a finite-duration impulse response identical to that shown in Figure 5(a), for k = 2 and N = 20.

▲ 8. Structure of the sliding Goertzel DFT filter.

Of course, unlike the traditional Goertzel filter in Figure 1, the sliding Goertzel DFT filter's complex feedforward computations must be performed for each input time sample. The sliding Goertzel filter's $\sin(Nx)/\sin(x)$ frequency magnitude response, for k = 2 and N = 20, is provided in Figure 9(a). The asymmetrical frequency response is defined by the filter's N zeros equally spaced around the z-domain's unit circle in Figure 9(b) due to the N-delay comb filter, as well as an additional (uncanceled) zero located at $z = e^{-j2\pi k/N}$ on account of the $(1 - e^{-j2\pi k/N} z^{-1})$ factor in the $H_{SG}(z)$ transfer function's numerator. In addition, the filter has conjugate poles canceling zeros at $z = e^{\pm j2\pi k/N}$.

The sliding Goertzel DFT filter is of interest because its computational workload is less than that of the SDFT. This is because the $v(n)$ samples in Figure 8 are real-only due to the real-only feedback coefficients. A single-bin DFT computational comparison, for real-only inputs, is provided in Table 2. For real-time processing requiring spectral updates on a sample-by-sample basis the sliding Goertzel method requires fewer multiplies than either the SDFT or the traditional Goertzel algorithm.
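The sliding Goertzel structure, a comb filter feeding the real-coefficient Goertzel resonator, can be sketched as below. The function name `sliding_goertzel` is illustrative. Because (12) is algebraically equal to (5), each output here should match what recursion (4) produces for the same input, which for the current N-sample window is the sum of $x(n-m)e^{j2\pi km/N}$ over $m = 0, \ldots, N-1$.

```python
import cmath
import math


def sliding_goertzel(x, N, k):
    """Per-sample outputs of the comb + Goertzel structure described by (12)."""
    theta = 2.0 * math.pi * k / N
    coeff = 2.0 * math.cos(theta)           # real-only feedback coefficient
    zero = cmath.exp(-1j * theta)           # complex feed-forward coefficient
    v1 = 0.0                                # v(n-1), stays real for real input
    v2 = 0.0                                # v(n-2)
    y = []
    for n, sample in enumerate(x):
        x_old = x[n - N] if n >= N else 0.0
        v0 = coeff * v1 - v2 + (sample - x_old)   # comb output drives the resonator
        y.append(v0 - zero * v1)                  # feed-forward step, every sample
        v2, v1 = v1, v0
    return y
```

The feedback loop multiplies real numbers only, so the per-sample cost is dominated by the single complex feed-forward product, in line with the workload argument made for Figure 8.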

▲ 9. Sliding Goertzel filter for N = 20 and k = 2: (a) frequency magnitude response and (b) z-domain pole/zero locations.

Table 2. Single-bin DFT comparison.

                        Single S_k(n) computation       Next S_k(n+1) computation
                        Real multiplies   Real adds     Real multiplies   Real adds
DFT                     2N                2N            2N                2N
Standard Goertzel       N + 2             2N + 1        N + 2             2N + 1
Sliding DFT             4N                4N            4                 4
Sliding Goertzel DFT    N + 2             3N + 1        3                 4

Summary

The sliding DFT process for spectrum analysis was presented and shown to be more efficient than the popular Goertzel algorithm for sample-by-sample DFT bin computations. The sliding DFT provides computational advantages over the traditional DFT or FFT for many applications requiring successive output calculations, especially when only a subset of the DFT output bins are required. Methods for output stabilization as well as time-domain data windowing by means of frequency-domain convolution were also discussed. A modified sliding DFT algorithm, called the sliding Goertzel DFT, was proposed to further reduce computational workload.

Eric Jacobsen is minister of algorithms at Intel. He currently leads the Advanced OFDM wireless research effort within Intel Labs and has interests in channel modeling, efficient algorithms, synchronization, coding, adaptive techniques, and other aspects of wireless communication systems. He has developed efficient hardware and software implementation techniques for signal processing in radar, imaging, satellite, and communications systems. With an M.S.E.E. from South Dakota School of Mines and Technology, he is a member of the IEEE and the Eta Kappa Nu honor society and occasionally road races a 1995 Taurus SHO.

Richard Lyons is a consulting systems engineer and lecturer with Besser Associates in Mt. View, California. He has been the lead hardware engineer for numerous multimillion dollar signal processing systems for both the National Security Agency (NSA) and TRW Inc. and has taught at the University of California Santa Cruz Extension. He is an associate editor for IEEE Signal Processing Magazine and author of Understanding Digital Signal Processing (Prentice-Hall, 1997). He is a member of the IEEE and the Eta Kappa Nu honor society and rides a 1981 Harley Davidson.

References

[1] M. Felder, J. Mason, and B. Evans, “Efficient dual-tone multi-frequency detection using the non-uniform discrete Fourier transform,” IEEE Signal Processing Lett., vol. 5, pp. 160-163, July 1998.
[2] Using the ADSP-2100 Family, vol. 1. Norwood, MA: Analog Devices, 1995, chap. 14 [Online]. Available: http://www.analog.com/Analog_Root/static/library/dspManuals/Using_ADSP-2100_Vol1_books.html
[3] S. Gay, J. Hartung, and G. Smith, “Algorithms for multi-channel DTMF detection for the WE® DSP32 family,” in Proc. Int. Conf. ASSP, 1989, pp. 1134-1137.
[4] K. Banks, “The Goertzel algorithm,” Embedded Syst. Programming Mag., pp. 34-42, Sept. 2002.
[5] G. Goertzel, “An algorithm for the evaluation of finite trigonometric series,” American Math. Month., vol. 65, pp. 34-35, 1958.
[6] J. Proakis and D. Manolakis, Digital Signal Processing-Principles, Algorithms, and Applications, 3rd ed. Upper Saddle River, NJ: Prentice Hall, 1996, pp. 480-481.
[7] A. Oppenheim, R. Schafer, and J. Buck, Discrete-Time Signal Processing, 2nd ed. Upper Saddle River, NJ: Prentice Hall, 1996, pp. 633-634.
[8] T. Springer, “Sliding FFT computes frequency spectra in real time,” EDN Mag., pp. 161-170, Sept. 29, 1988.
[9] L. Rabiner and B. Gold, Theory and Application of Digital Signal Processing. Upper Saddle River, NJ: Prentice Hall, 1975, pp. 382-383.
[10] B. Farhang-Boroujeny and Y. Lim, “A comment on the computational complexity of sliding FFT,” IEEE Trans. Circuits Syst. II, vol. 39, no. 12, pp. 875-876, Dec. 1992.
[11] B. Farhang-Boroujeny and S. Gazor, “Generalized sliding FFT and its application to implementation of block LMS adaptive filters,” IEEE Trans. Signal Processing, vol. 42, no. 3, pp. 532-538, Mar. 1994.
[12] S. Douglas and J. Soh, “A numerically stable sliding-window estimator and its application to adaptive filters,” in Proc. 31st Annual Asilomar Conf. on Signals, Systems, and Computers, Pacific Grove, CA, vol. 1, Nov. 1997, pp. 111-115.

[13] F. Harris, “On the use of windows for harmonic analysis with the discrete Fourier transform,” Proc. IEEE, vol. 66, pp. 51-84, Jan. 1978.
[14] A. Nuttall, “Some windows with very good sidelobe behavior,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 29, pp. 84-91, Feb. 1981.

Appendix

The derivation of the SDFT time-domain expression is straightforward by example. We start by using the principles of the z-transform and write a generalized z-domain spectrum of a four-point time-domain sequence $x(n)$, where $n = 0, 1, \ldots, 3$, evaluated at $z = z_o$ as

$$S_{z_o}(0) = x(3) + x(2) z_o + x(1) z_o^2 + x(0) z_o^3. \tag{A1}$$

Likewise we could compute a spectrum of the four $x(n+1)$ time samples as

$$S_{z_o}(1) = x(4) + x(3) z_o + x(2) z_o^2 + x(1) z_o^3. \tag{A2}$$

If we multiply $S_{z_o}(0)$ by $z_o$ we have

$$S_{z_o}(0) z_o = x(3) z_o + x(2) z_o^2 + x(1) z_o^3 + x(0) z_o^4. \tag{A3}$$

Comparing (A2) and (A3) we can rewrite (A2) as

$$S_{z_o}(1) = S_{z_o}(0) z_o - x(0) z_o^4 + x(4). \tag{A4}$$

Thus we can use $S_{z_o}(0)$ to compute $S_{z_o}(1)$. Moreover we can express the spectrum of the four $x(n+2)$ samples as

$$S_{z_o}(2) = S_{z_o}(1) z_o - x(1) z_o^4 + x(5). \tag{A5}$$

In the general case, the qth N-point spectrum can be expressed as

$$S_{z_o}(q) = S_{z_o}(q-1) z_o - x(q-1) z_o^N + x(q+N-1). \tag{A6}$$

Here's the payoff for our efforts. If we let $z_o = e^{j2\pi k/N}$, a point on the unit circle, the (A6) z-transform expression becomes the desired time-domain expression for the sliding DFT as

$$S_{e^{j2\pi k/N}}(q) = S_{e^{j2\pi k/N}}(q-1)\, e^{j2\pi k/N} - x(q-1)\, e^{j2\pi kN/N} + x(q+N-1) = S_{e^{j2\pi k/N}}(q-1)\, e^{j2\pi k/N} - x(q-1) + x(q+N-1). \tag{A7}$$

Because the angle of $e^{j2\pi k/N}$ is an integer multiple of $2\pi/N$, (A7) is seen merely as the qth single-bin DFT, $X(k)$, of $x(n)$ for the normalized frequency of $2\pi k/N$ radians, where $k = 0, 1, 2, \ldots, N-1$. We now have an expression for the sliding DFT. However, to turn that expression into a filter difference equation we're compelled to modify the indices of (A7) to make them compatible with causal filters. With no loss in generality, using the following substitutions

$$S_k(n) = S_{e^{j2\pi k/N}}(q), \quad S_k(n-1) = S_{e^{j2\pi k/N}}(q-1), \quad x(n) = x(q+N-1), \quad x(n-N) = x(q-1), \tag{A8}$$

we can rewrite (A7) in a time-domain form for the causal recursive filter

$$S_k(n) = S_k(n-1)\, e^{j2\pi k/N} - x(n-N) + x(n). \tag{A9}$$

The subscript k reminds us that the filter output is associated with the kth DFT bin. The z-domain transfer function for the kth bin of the sliding DFT filter is

$$H_{SDFT}(z) = \frac{1 - z^{-N}}{1 - e^{j2\pi k/N} z^{-1}}. \tag{A10}$$
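The general update (A6) holds for any complex $z_o$, not only for points on the unit circle, and that is easy to confirm numerically. In the sketch below, `spectrum` evaluates the N-point generalization of (A1) directly and `recursive_update` evaluates the right-hand side of (A6); both names are illustrative and not from the article.

```python
import cmath
import math


def spectrum(x, q, N, z_o):
    """N-point form of (A1): sum over p of x(q+p) * z_o**(N-1-p)."""
    return sum(x[q + p] * z_o ** (N - 1 - p) for p in range(N))


def recursive_update(S_prev, x, q, N, z_o):
    """Right-hand side of (A6), computing S(q) from S(q-1)."""
    return S_prev * z_o - x[q - 1] * z_o ** N + x[q + N - 1]
```

Choosing `z_o = cmath.exp(2j * math.pi * k / N)` makes `z_o ** N` equal to one to within rounding, which is the simplification that turns (A6) into (A7).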
