814 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, VOL. CAS-34, NO. 7, JULY 1987 Fundamental Relations Between the LMS Algorithm and the DFT

BERNARD WIDROW, FELLOW, IEEE, PHILIPPE BAUDRENGHIEN, MEMBER, IEEE, MARTIN VETTERLI, MEMBER, IEEE, AND PAUL F. TITCHENER, MEMBER, IEEE

Abstract-The digital Fourier transform (DFT) and the adaptive least with the complex LMS algorithm of Widrow, McCool, and mean square (LMS) algorithm have existed for some time. This paper Ball [2], [3]: establishes a connection between them. The result is the “LMS spectrum analyzer,” a new means for the calculation of the DFT. The method uses a Wj:.,l = wj +2pr,X,. set of N periodic complex phasors whose frequencies are equally spaced from dc to the sampling frequency. The phasors are weighted and then are The terms in this equation are defined as follows. Wj is the summed to generate a reconstructed signal. Weights are adapted to realize current complex weight vector. Wj+, is the next complex a best least squares fit between this reconstructed signal and the input weight vector: signal whose spectrum is to be estimated. The magnitude squares of the weights correspond to the. power spectrum. For a proper choice of adaptation speed, the LMS spectrum analyzer will provide an exact N-sample DFf. New DFT outputs will be available in steady flow after the introduction of each new data sample. (2)

I. INTRODUCTION The complex error cj, is given by OTH THE DIGITAL Fourier transform (DFT) and cj = dj - yj the LMS adaptive algorithm [l] have been known for a (3) long time, and both of these techniques have enjoyed wide where practical application. It is the purpose of this paper to demonstrate relationships between the DFT and the LMS (4 algorithm by showing how the DFT can be calculated by and where making use of the LMS algorithm. This study leads to a new way to calculate the DFT. The resulting algorithm lends itself to parallel computation and VLSI implementa- tion. Using a parallel implementation, processing speed is essentially independent of the size of the data block N. 2n(N-1) ,iNJ Computation time per data sample is thus also indepen- 1 1 dent of the number of frequency points in the calculated Note that xj is the conjugate of Xi and that i = J-1. The spectrum. result of the adaptive process is the weight vector Wj, which will turn out to be associatedwith the DFT of dj. II. THELMS SPECTRUMANALYZER The complex LMS algorithm minimizes the mean square The LMS spectrum analyzer, an adaptive system that of the complex error ej, i.e., minimizes the mean of the can be used in the calculation of the DFT, is shown in Fig. sum of the squares of its real and imaginary parts. The objective is to cause a weighted sum of harmonic 1. The input signal dj to be Fourier analyzed is sampled, and the time index is j. The sampling period is T, and the phasors to best match the input dj, thereby resolving dj sampling frequency is Q = 27~/T rad/s. The input dj into its Fourier components by means of an adaptive could be real or complex. As the system is configured, this algorithm. input serves as the “desired response” for the adaptive The choice of the phasor components of the Xi vector process. The weights wO,wi, . . ‘, w,- i will. in general be can be explained in the following way. We use a total of N complex. The same is true for the weighted sum yj and for basis frequencies, including zero frequency. The funda- the error ej. The weights are adjusted or adapted m accord mental basis frequency is Q/N. The entire set of basis frequencies spans the frequency range from zero to the sampling frequency Q. The fundamental phasor, a time function expressedin terms of the discrete time index j, is Manuscript received December 5, 1986; revised March ?, 1987. The authors are with the Department of Electrical Engmeering, Stan- ford University, Stanford, CA 94305. IEEE Log Number 8714716. (6) 009%4094/87/0700-0814$01.00 01987 IEEE WIDROW et al.: RELATIONS BETWEEN LMS ALGORITHM AND DFT 815

dl INPUT TO SE ANALYZED The product ( Xirzo) of (10) can be evaluated as follows:

COMPLEX 1 / WEIGHTS x;Fo= k 1 ei+JA eiF . . . ,iG-yy--2a(N-1) 1. i0 -2, - 1 I .D i N-l I.1 2771 = C eiN=o. (11) I=0

i !.X j The sum is geometric and can easily be shown to be zero. 1, N - Basically, we are adding N unit magnitude complex vec- 4 . tors that are equally spaced about the unit circle. As a result of (ll), (10) can be simplified to ,W,=2p(d,?,+d,&). (12) Next, W,= (I-2$f2X;)W2+2pd2z2

ALGORITHM = 2p(doXo + d,X, + d2y2)

Fig. 1. LMS Spectrum analyzer. - 4p2x2 ( X$odo + XTyld,). (13) There are two products in (13) that need to be evaluated: Since SIT = 2n, this can be written as

n 2n ei ei; jT = eizj.

All of the components of the X vector are powers of (7). A normalization factor l/m has been included to simplify 1 2n the analysis of the system of Fig. 1. It causesthe Xj vector 1 W2) 2a(N-l)(2) to have unit power. - The complex LMS algorithm is used to obtain the weights. The weight iteration formula is given by (1). e iwN-l) x;xlcN11 eiN .. . e’ N 1i! e-‘N N 1 Using (3) and (4), this iteration formula can be expressed = as 0. (15) Therefore, wj+l = wj +2pcjx, W,=2p(dojE;,+d,~,+d2~2). (16) =l$+2p~j(dj- yj) We can easily generalizethis result: _-_ ^ z/ . j-l = Wj+“+Xj\d J - xyw;.) - 5=2p c d,,$,,,, j=l;.., N. (17) =WL3,,11,rj I Ap&+,.L.jY - 2$ijXjTWj m=O An interesting case is that for j = N. From (17) we obtain = (I-2px,XjT)Wj+2pdjx,. (8) r1 1 Let the initial weight vector be set to zero. On a step-by-step N-l basis, the weight vector versus time can be induced. Using WN=2p 1 formula (8), m=O

w,= (r-2pX,X,T)Wo+i,do~o N-l = (1 -,2p~OX:)0+2,udOX0 c dm m=O = 2pdoXo. (9) N-l 21-1 ,~odm&” Next, =- (18) w,= (I-2p$X~)Wl+2pdlXl m = 2pdoXo -4p2~lX:~od0 +2pd,X, N-l m~odme-iqz71 =2p(do~o+d,~l)-4p2~,(X~~o)d0. (10) 816 IEEE TRANSAkIONS ON CIRCUiTS AND SYSTEMS, VOL. CAS-34, NO. 7, JULY 1987

Except for the scale factor, it is clear from the above that With some further algebraic work along the same lines, the elements of IV,,, comprise the values of the DFT of dj a completely general formula for Wj can be derived which over the uniform time window from j = 0 to j = N - 1. would be applicable over all j >l, assuming the initial Formula (17) is based on the orthogonality of X vectors condition W, = 0. The result is at different times and it applies for 1 Q j < N. Beyond this range, we need a new formula since, for example, X,, is j-l identical to and of course not orthogonal to X,. Let us H5=2p c d,X, calculate W,, i. Once again from (8) m=~‘-N j-N-1 W N+l= (I-2pgXNX;)WN+2pdNXN +2i(l-2~) c d,X, m= j-2N

= (‘-2rX,X,‘)(2y~~~d~~~) +2pd,X, j-2N-1 +2p(l-~&i)~ c d,F, m= j-3N =k&,n~m-4p2x,i X/i;dm~m)- (19) j-3N-1 +21.1(1-2~)~- c d,F, A product in (19) needs to be evaluated: mi- j-4N (26j

In using this formula, it is understood that the allowed (20) ranges of the index m for each of the sums are set by the Since X, = X0 becauseof the peiiodicity of (14) ’ upper and lower limits unless these limits are negative. The ranges of m must first be m >, 0; then they are determined 1 by the sum’s limits. Thus, in applying the formula, one 1 x,7$=x~Fo+ 1 ... l] . =l. (21) seesthat for N > j > 0 only the first term in the series(26) exists, all the rest are zero, and the first term agreesas it IIi should with the previously derived formula (17). For j = N, Furthermore, this formula gives W, as in (18). W, is proportional to the

N-l M-1 DFT of the first iv samplesof d,. XN’C d,,,zti=X,T 1 d,z,,,=O. (22) A critical choice of p is the value p = l/2. Making this choice, the series (26) reducesto its first term regardlessof This results from the orthogonalitjr between X0 and the value of j. Let y =1/2 and let j be any integer multiple of N. At time IN, the formula for Wj becomes XI, x2,* * *, XN-i. Returning now to (19) and making use of (20), (21) and (22) we can obtain Wh+l: IN-1 yN= c d,X* m=IN-N wN+l = 2p ; dj,,, - 4p2FNd, m=O IN-1

c dm = 2p f d,,& +2~(1-2~)~,d,. (23) m=lN-N *=I IN-1 1 m=g- Ndmeei%m The next weight vector can be obtained similarly. It fol- =77 . (27) lows that

N+l IN-1 m =;- Nd,e-i~m W N+2 = 2p c d,,$,,, -4p2%N+Idl -4p21G;,do m=O

N+l =2p 1 d,~m+2p(1-2p)(~IdI+~od,j). (24) It is evident from this expression that W,, is proportional m=2 to the DFT of the input signal dj. Thus far, we have established the following. If we set This result can be generalizedas p = l/2, the weight vector wj will be exactly proportional j-l j-N-1 to the DFT of the previous N samples of the input dj at F5= 2p 1 dm~,+2p(1-2p) 1 d,F,, times j that are integer multiples of N. The DFT will be m=(j-N) m=O correct in both magnitude and phase. Furthermore, from j=N+1;..,2N. (25) (26), the magnitude squares of the weights at any time j can be seen to be exactly proportional to the magnitude Formula (25) applies to W’s between W,,, and W,,. squares of the DFT of the last j samples of input data. WlDROW et al.: RELATIONS BETWEEN LMS ALGORITHM AND DFT 817

Then, for all j,

j-l FQ= c d,;i;, m=j-N

j-l

...... C dm m=j-N j-l c dme-i;m

m=j-N 1 =- j-l 2vrk Frequency , . t t t t index k=O k=l k=2 k=N-2 k-N-1 m c d,e-‘Nm m=J’-N 66,

Fig. 2. Scheme for steady-flow calculation of the discrete Fourier trans- i-1 form (DFT). C dme-iZ$!l*

m=J’-N Thus, the magnitude squaresof the weights give the DFT power spectrum of the input dj. Although formula (29) is similar to (28), they differ signifi- cantly. To make them look more alike, let the index m in (29) be changed to m + j - N, and let the new index run III. STEADY-FLOWCOMPUTATION OF THE DFT BY from 0 t? N - 1. The resuit will be no change in the THE LMS SPECTRUM ANALYZER summations but this will make it easierto connect (29) and Normally, the DFT is calculated with blocks of data N (28). &cordjngly, samples long. Sometimesit is of interest to recompute and display the DFT with the arrival of each new input data N-l

sample. The result is a “steady-flow” DFT. This steady- C dm+j-N flow DFT can be calculated with the system of Fig. 1. m=O Calculation of the steady-flow DFT is suggestedby the N-l -ig(m+j-N) block diagram of Fig. 2. The input data are applied to a C dm+j-Ne tapped delay line, and the signals at the taps are applied as m=O inputs to the DFT. The output (DFT)j is a vector whose components are indexed in terms of frequency k and N-l 2nk I. (30) -iy(m+ j- N) whose values vary with the’ time index j. We define the c dm+j-Ne steady-flow DFT vector for data sliding along in time as m’=O

N-l C d,+j~Ne-i~(m+ j-N) N-l

C dj-(N-l)+* m-0 m=O N-l c dj-cN-l),,e-‘~m The - N terms in the exponents can be deleted since they m-0 only account for additive multiples of 2mi in the argu- ments of the exponentials. Therefore, (DFT)j’h N-l 2rk (28) C dj-(N-l)+,ehiNm N-l m=O c dm+j-N m'70

277 N-1 N-l e-i-h;’ c dj-cNpl),me-i?m m~od,+~~~e~i~m . (31) m-0

-i?T!.$Tj Nfl d,+j~Ne-i3$2* We can relate the steady-flow DFT vector to the adap- e tive process in the following way. The adaptive weight m=O vector as a function of the time index j is given by the general formula (26). Consider this formula with ~=1/2. Formula (31) is quite similar to (28), yet it is still different. 818 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, VOL. CAS-34, NO. 7, JULY 1987

d, INPUT TO BE ANALYZED , system of Fig. 3, is in general given by LMS spectrum analyzer output vector i 1 SPECTRUM- 2?r ANALYZER eiyj OUTPUT VECTOR _ 1 =- .25m =(DW,., e’TJ (33) fl if p = l/Z

The weight vector yj is expressed for all time j by formula (35). It is an mfinite sum of sums. Thus far, with p=1/2,weh ad only to contend with the first sum. Refer to (32): Fig. 3. LMS spectrum analyzer EO+K computation of the steady-flow ’ LMS ’ spectrum analyzer = (DFT)j-1. (34) The relationship between them is as follows: output \ vector 1 j (DFT)j-1 Now refer to formula (26), and let ~1take a value other 1 than l/2. Then, ,211. e’TJ ’ LMS 1 ,2m spectrum =- erNJ y.. (32) fi analyzer output 2s(N - 1) \ vector e'NJ The second sum of (26) involves the input data dj taken The weights.at time j are multiplied by the corresponding over a range of time which is lagged by N samples relative phasor components of the vector Xi (defined by (5)) to to the corresponding range of time of the first sum. The give the DFT vector of time j - 1. Fig. 3 is a block time range of the third sum is lagged by 2N saniples diagram showing how the adaptive system of Fig. 1 can be relative to the time range of the.first sum, and so forth. used to provide the steady-flow DFT. The data samples in the sums are multiplied by the respec- We had seen in Section II that the components of the tive components of xj, and in accord with (33), these weight vector give the DFT at times j = N,2N,3N, * * . . components are premultiplied by the respective compo- Now we see that the components of the weighted X vector nents of Xj, yielding the LMS spectrum analyzer output give the steady-flow DFT at 41 times. (The weighted X vector. We note that Xi and xj are periodic with a period vector components are equal to the weights themselves at of N samples. The first sum of (26) gave us the DFT at times j=O, N,2N;.* .) Recall that all of these results are time j - 1. The second sum will give the DFT at time based on the initial weights being zero. With ~=1/2, the j - 1- N, and so forth. With proper coefficients for the adaptive algorithm is stable and “forgets” initial condi- terms of the series, (35) becomes tions after N samples of data have been inputted. A LMS nonzero initial weight vector ‘can be regarded as having spectrum resulted from adapting with some data before time zero. analyzer Equation (26) shows that with ~=1/2, data older than N output samples have no effect oh wj. vector ’ +2p(1-2p)(DFT)j-l-N IV. A GEOMETRIC COHERENT-AVERAGE DFT +2Cl(1-2CL)2(DFT)j-1-2N When ~1is set at some value other than l/2, a new form +2~(1--?IL)3(DFT)/-I-sN of DFT is calculated by the LMS spectrum analyzer of Fig. 3. The weighted X vector, the set of outputs’ of the . (36) WIDROW et d.: RELATIONS BETWEEN LMS ALGORITHM AND DFT 819

‘From this formula, one can see that the output of the FUTURE (HASN’T LMS spectrum analyzer of Fig. 3 is related to the steady- HAPPENEG flow DFT. It is a geometrically weighted coherent sum of DFT’s, each taken from a data block of N samples. The first DFT is taken from the present and N -1 previous MOW input data samples and is weighted by the coefficient 2~. WAVE F The second DFT is taken from the Nth previous sample and the N - 1 samples previous to that. It is weighted by the coefficient 2~(1- 2~), and so forth. A geometric ratio can be defined as

r d (1-2~). (37)

Formula (36) can be written as a geometric sum:

’ LMS \ spectrum (N complex numbers) Fig. 4. Computation of a geometric coherent-average steady-flow DFT. analyzer =2p f r’(DFT)j-l-/N. (38) output I=0 \ vector , j V. CONCLUSIONS This paper established a connection between the digital Fig. 4 illustrates the computation of the geometric Fourier transform (DFT) and the adaptive least mean coherent-average steady-flow DFT as specified by for- square (LMS) algorithm. The existenceof a close relation- mulas (36) and (38). The data waveform moves to the left ship between least squares fitting methods and spectral with time. The present data sample dj enters the picture at estimation is not surprising once we recall that the classic the “present time” as designated-on the time axis. Each Fourier coefficients can be obtained by best least squares sample time, an infinite number of N-point DFT’s are fitting of a finite number of sines and cosines to the input taken from the sampled data waveform. The outputs of the signal. DFT’s are weighted in proportion to 1, r, r2, . . . . The We have shown that the exact N-sample DFT can be system output is a geometrically weighted coherent sum of computed by means of the adaptive LMS algorithm, with a DFT’s in accord with (38). To best understand this illus- particular choice of the speed of adaptation (p = l/2). tration, think in terms of the data moving while the DFT More generally, it was shown that the “LMS spectral windows remain stationary. The DFT windows weight analyzer” computes a sliding geometric.coherent-average uniformally within each DFT data block, and geometri- DFT. The output is a weighted coherent sum of DFT’s cally from block to block. taken on adjacent N-sample data blocks. The data are In order for the adaptive algorithm of Figs. 1 and 3 to weighted uniformly within each block, and geometrically be stable or, equivalently, for the DFT computation of Fig. from block to block. 4 to be stable, it is necessary and sufficient that the The DFT calculates only an,approximation to the Four- magnitude of the geometric ratio be less than 1. Accord- ier transform. The LMS algorithm calculates only an ap- ingly, for stability, proximation to the least squaressolution. By making p= l/2, the errors in these approximations match and the H-d LMS spectrum analyzer outputs an exact DFT. or REFERENCES ll-2pl Cl PI B. Widrow and M. E. Hoff, “Adaptive switching circuits,” in 1960 WESCON Conf. Rec., pt. 4, pp. 96-140. or PI B. Widrow, J. M. McCool, and M. Ball, “The complex LMS algorithm,” Proc. IEEE, vol. 63, pp. 719-720, Apr. 1975. I l>/.l>Q. [31 B. Widrow and S. D. Steams, Adaptive . En- (3% glewood Cliffs, NJ: Prentice-Hall, 1985. 141 L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Assuming that the stability condition (39) has been met Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975. [51 A. V. Oppenhelm and R. W. Schafer, Digital Signal Processing. and that the adaptive processhas been going on for a very Englewood Cliffs, NJ: Prentice-Hall, 1.975. long time, one can visualize from Fig. 4 that the oldest [61 SEykin, Introduction to Adaptive Filters. New York: Macmillan, data will be weighted by a very high power of r, which will [71 Haykin, Adaptive Filter Theory. Englewood Cliffs, NJ: Prentice- be a very small number. The effects of very old data will Hall, 1986. PI C. F. N. Cowan and P. M. Grant, Adaptive Filters. Englewood therefore be inconsequential. Cliffs, NJ: Prentice-Hall, 1985. 820 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, VOL. CAS-34, NO. 7, JULY 1987

Bernard Widrow (M’58-SM’75-F’76) was born Martin Vetterli (S’86-M’86) was born in Switzer- in Norwich, CT, on December 24, 1929. He land in 1957. He received the Engineer’s degree received the S.B., SM., and Sc.D. degrees from from the Eidgenijsssische Technische Hochschule the Massachusetts Institute of Technology, Cam- Ziirich, Switzerland, in 1981, the Master of Sci- bridge, in 1951, 1953, and 1956, respectively. ence degree from , Stanford, From 1951 to 1956, he was a Staff Member of CA, in 1982, and the Doctorat es Science degree the M.I.T. Lincoln Laboratory and a Research from the Ecole Polytechnique Fed&ale de Assistant in the Department of Electrical En- Lausanne, Switzerland, in 1986. gineering of M.I.T. He joined the M.I.T. faculty In 1981, he worked as a Software Engineer for in 1956 and taught classes in radar, theory of Siemens, Switzerland; in 1982, he was a Re- sampled-data systems, and control theory. In search Assistant with the De- 1959, he joined the faculty of Stanford University, Stanford, CA, and is partment of Stanford University; and from 1983 to 1986, he was a now Professor of . He is presently engaged in Researcher at the Ecole Polytechnique. In the Fall of 1984, he was a research and teaching in systems theory, pattern recognition, adaptive temporary member of the Technical Staff at AT&T Bell Laboratories in filtering, and adaptive control systems. He is an Associate Editor of Holmdel, NJ, working on VLSI design. Since the Fall of 1986, he has Adaptive Control and Signal Processing, Neural Networks, Information been an Associate Research Scientist with ’s Center Sciences, and Pattern Recognition. He is coauthor with S. D. Steams of for Telecommunications Research in New York, NY. His research inter- Adaptiue Signal Processing (Prentice-Hall). In 1986, he received the IEEE ests include algorithm design for VLSI, computational complexity, multi- Alexander Graham Bell Medal for fundamental contributions to adaptive rate signal processing, and packet video transmission. filtering, adaptive noise canceling, and adaptive antennas. Dr. Vetterli was the recipient of the 1984 Best Paper Award from the Professor Widrow is a member of the American Association of Univer- European Signal Processing Society (EURASIP) and of the 1986 Re- sity Professors, the Pattern Recognition Society, Sigma Xi, Tau Beta Pi, search Prize from the Brown Bovery Corporation (Switzerland). and is a Fellow of the American Association for the Advancement of Science.

Paul F. Titchener (S’79-M’82) was born in Cincinnati, OH, in 1953. He received the B.S. degree (magna cum laude) in electrical engineer- mg from the University of Cincinnati in 1978, and the M.S. and Ph.D. degrees in electrical Philippe Baudrenghien (M’85) was born in Liege, engineering from Stanford University, in 1980 Belgium, on November 20,1958. He received the and 1983, respectively. “Li-cence en Sciences Appliquees” from the Uni- From 1983 to 1984, he was employed by versite Libre de Bruxelles. Belgium, in 1981. and Hunter Geophysics in Mountain View, CA, and the M.S. and Ph.D degrees in electrical engineer- directed the application of digital-signal ing from Stanford University, in 1982 and 1984, processing methods to geophysical measure- respectively. ments. Since 1984, he has worked in the research group at Comdisco Since 1985, he has been with the Radio Resources in San Francisco, CA. His areas of professional interest and Frequency Group of the Super Proton Synchro- activity include adaptive signal processing and signal-processing software tron Division at CERN (European Center for systems. Nuclear Research) in Geneva, Switzerland, where Dr. Titchener is a member of the Audio Engineering Society, the he is engaged in the de!sign and development of Beam Control and Beam Society of Automotive Engineers, the Hayward Engineering Club, and Observation Systems. Tau Beta Pi.