Informal Development of Fast Fourier Series

Chapter 11.05 Informal Development of Fast Fourier Transform (FFT)

Introduction Recalled the DFT pairs of Equations (22) and (23) (of Chapter 11.04) and swapping the indices n,k one obtains:

N 1  2  ~ in w0  k  N  (1) Cn   f (k)e k0

N 1  2  ~ in w0  k  1   N  f (k)   C n e (2)  N n0 where n,k  0,1,2,3,...N 1 (3) While the above DFT pairs of equations are convenient for computer implementation, they still require substantial computation effort. The objective of this chapter, therefore, is to develop the improved version of DFT (namely Fast Fourier Transform, or FFT) so that much larger sampling data can be handled more efficiently. Let 2 i N i2 E  e N (hence E  e  cos(2 )  isin(2 ) 1) (4) Then Equation (1) and Equation (2) become N 1 ~ ~ nk Cn  C(n)   f (k)E (5) k0 N 1  1  ~ nk f (k)   Cn E  N n0 It should be emphasized here that in performing interpolation, one usually has to solve a system of equations to determine the unknown coefficients of the linear combination of basis functions that fit the given data. For example, if N  4 , then one need to solve the following ~ system (see the second part of Equation (5)), for obtaining C, with a given vector f .

11.05.1 11.05.2 Chapter 11.05

~ 1 1 1 1 C(0)  f (0)  1 2 3  ~     1 1 E E E C(1)  f (1)   2 4 6  ~     (5a)  N 1 E E E C(2)  f (2)  3 6 9  ~    1 E E E C(3)  f (3)

However, the inverse of the above coefficient matrix can be easily obtained as 1  1 1 1 1  1 1 1 1    1 2 3   1 2 3   1  1 E E E 1 E E E        N 1 E 2 E 4 E 6  1 E 2 E 4 E 6    3 6 9   3 6 9   1 E E E  1 E E E  ~ Thus, the unknown vector C can be computed as matrix times vector operations, as following: Assuming N  4  2(r2) , then (see the first part of Equation (5)) ~ C(0) E (0)(0) E (0)(1) E (0)(2) E (0)(3)  f (0)  ~   (1)(0) (1)(1) (1)(2) (1)(3)   C(1) E E E E  f (1)  ~     (6) C(2) E (2)(0) E (2)(1) E (2)(2) E (2)(3)  f (2)       ~  (3)(0) (3)(1) (3)(2) (3)(3)   C(3) E E E E  f (3) ~ C(0) E 0 E 0 E 0 E 0  f (0)  ~   0 1 2 3   C(1) E E E E  f (1)  ~     (7) C(2) E 0 E 2 E 4 E 6  f (2)       ~  0 3 6 9   C(3) E E E E  f (3) For N  4 , n  2 and k  3, then E nk  E 6  E (N 4) E 2

The term inside the square bracket is equal to 1, since

For , and , then

Informal Development of Fast Fourier Series 11.05.3

In the above equation, one should recall the following Euler identity

Thus, in general (for

where (8)

= remainder of Remarks: Matrix times vector, shown in Equation (7), will require 16 (or complex multiplications and

12 (or ) complex additions.

Usage of Equation (8) will help to reduce the number of operation counts, as explained in the next section.

Factorized Matrix and Further Operation Count Equation (7) can be factorized as: (9) Remarks: The theory behind the 2 matrices on the right hand side (RHS) of Equation (9) will be clearly explained soon (see Equations 11 and 15, in chapter 11.06). The order of the left-hand-side (LHS) vector has been changed, such as rows 2 and 3 have been swapped. Let the row-interchanged LHS vector be defined as (10) Now performing the inner-product (matrix times vector) on the RHS of Equation (9), one obtains (11) or (11a)

(11c) since

Equations (11a through 11d) for the “inner” matrix times vector requires 2 complex multiplications and 4 complex additions. In Equations (11a–11d), is intentionally not reduced to the numerical value of 1.0 to facilitate the discussions of more general cases. Finally, performing the “outer” product (matrix times vector) on the RHS of Equation (9), one obtains: (13)

11.05.4 Chapter 11.05 or (14a)

Again, Equations (14a-14d) requires 2 complex multiplications and 4 complex additions. Thus, the complete RHS of Equation (9) can be computed by only 4 complex multiplications (or and 8 complex additions (or ). Since computational time is mainly controlled by the number of multiplications, hence implementing Equation (9) will significantly reduce the number of multiplication, as compared to direct matrix times vector operations (as shown in Equation (7)).

For large value of data points ( ), the ratio of complex multiplications by using Equation (7) and Equation (9) can be computed as (15) For Equation (15) gives

, which basically implies that the number of complex multiplications involved in Equation (9) is about 372 times less than the one involved in Equation (7). Informal Development of Fast Fourier Series 11.05.5

Graphical flow of Equation (9), for case Equation (9) can also be presented in the graphical form, as shown in Figure 1.

Figure 1 Graphical form of FFT (Equation 9) for the case . Remarks: a) Computed vector 1 does correspond to Equations (11a–11d). b) Computed vector 2 does correspond to Equations (14a-14d). c) Since in this example, one needs to compute 2 vectors

d) Each node in the graph is computed from 2( ) nodes in the “previous” vector. e) Factor (such as appears near the arrow head of the transmission path. Absence of

implies that . For example: , which is the same as Equation (14c).

Graphical Flow of Equation (9), for case To see a more detailed computational patterns of FFT, a slightly larger data size ( ) is shown in the graphical form, as indicated in Figure 2. 11.05.6 Chapter 11.05

Figure 2 Graphical form of FFT (Equation 9) for the case . Informal Development of Fast Fourier Series 11.05.7

Companion Node Observation Careful observation of Figure 2 reveals that for each computed -vector (where ; and ), we can always find two (companion) nodes which came from the same pair of nodes in the previous vector. For example, and are computed in terms of and . Similarly, the companion nodes and are computed from the same pair of nodes and .

Furthermore, the computation of companion nodes is independent of other nodes (within the -vector). Therefore, the computed and will override the original space of

and . Similarly, the computed and will over ride the space occupied by and , which in turn, will occupy the original space of and . Hence, only one complex vector (or 2 real vectors) of length are needed for the entire FFT process. Companion Node Spacing Observing Figure 2, the following statements can be made: a) in the first vector , the companion nodes and is separated by (or spaces.

b) In the second vector , the companion nodes and is separated by (or

Companion Node Computation The operation counts in any companion nodes (of the vector), such as and can be explained as (see Figure 2): (16)

Thus, the companion nodes and computation will require 1 complex multiplication and 2 complex additions (see Equations (16-17)). The weighting factors for the companion nodes [ and ] are and , respectively.

Thus, in general (18) (19) Skipping computation of certain nodes Because the pair of companion nodes “ ” and are separated by the “distance”. ( , hence, at the level, after every node computation, then the next nodes will be skipped! (see Figure 2).

Determination of The values of “ ” can de determined by the following steps: Step 1: Express the index ( ) in binary form, using bits. For , and ; or , one obtains

11.05.8 Chapter 11.05

Step2: Sliding this binary number “ ” positions to the right, and fill in zeros, the results are

It is important to realize that the results of Step 2 (0,0,1,0) is equivalent to express an integer

in the binary formats. In other words, .

Step3: Reverse the order of the bits, then (0,0,1,0) becomes 0,1,0,0 . Thus,

It is “NOT” really necessary to perform Step 3, since the results of Step 2 can be used to compute “ ” as following

In conclusion, for and ; the computation of companion nodes from general formulas (see

Equations (18) and (19)) gives

The above 2 equations are identical to Equations (16) and (17). Computer Implementation to Find Value of “U” (in

Based on the previous discussions (with the 3-step procedures), to find the value of “ ”, one only needs a procedure to express an integer in binary formats, with “ ” bits. Assuming (a base 10 number) can be expressed as (assuming bits) (20)

Divide by 2 (say, , multiply the truncated result by 2 (say, and compute the difference between the original number and .

(21) If then the bit

If then the bit

Once the bit has been determined, the value of is set to (or value of is reduced by a factor of 2), since the previous

and similar process can be used to determine the value of bit etc.

Example 1 For ; bits and , Find the value of .

Determine the bit : (Index )

Initialize

Informal Development of Fast Fourier Series 11.05.9

Determine the bit [Index ]

11.05.10 Chapter 11.05

Remarks: Although the “intermediate” results might be different, at the end of the do-loop process (computing ), both formulas for “ ”, such as

where (23) will eventually give the same final answers for “ ”. Example 2 For ; and Compute the corresponding value of ?

One has

Determine the bit (Index )

Initialize

Informal Development of Fast Fourier Series 11.05.11 or

11.05.12 Chapter 11.05

Remarks: Although both formulas for “ ”, shown in Equations (22) and (23), will yield the same “final” value of “ ”. Implementation of Equation (22) will be more computationally efficient. Unscrambling the FFT For the case (see Figure 2), the final ‘bit-reversing’ operation for FFT is shown in Figure 3. f (k) ~ 4 C(n) ~ 0 f4 (0000) C(0000) ~ 1 f4 (0001) C(0001) ~ 2 f4 (0010) C(0010) ~ 3 f4 (0011) C(0011) ~ 4 f4 (0100) C(0100) ~ 5 f4 (0101) C(0101) ~ 6 f4 (0110) C(0110) ~ 7 f4 (0111) C(0111) ~ 8 f4 (1000) C(1000) ~ 9 f4 (1001) C(1001) ~ 10 f4 (1010) C(1010) ~ 11 f4 (1011) C(1011) ~ 12 f4 (1100) C(1100) ~ 13 f4 (1101) C(1101) ~ 14 f4 (1110) C(1110) ~ 15 f4 (1111) C(1111)

= skip the operation Informal Development of Fast Fourier Series 11.05.13

Figure 3 Final “bit-reversing” for FFT (with N  2r  24  16) .

For do-loop index k  0  (0,0,0,0)  i  (0,0,0,0)  0 If (i.GT.k)Then

T  f 4 (k)

f 4 (k)  f 4 (i)

f 4 (i)  T Endif

Hence, f 4 (0) = f 4 (0) ; no swapping.

For k  1  (0,0,0,1)  i  (1,0,0,0)  0 =bit-reversion=8 If (i.GT.k)Then

T  f 4 (k)

f 4 (k)  f 4 (i)

f 4 (i)  T Endif

Hence, f 4 (1) = f 4 (8) ; are swapped.

For k  2  (0,0,1,0)  i  (0,1,0,0)  4

For k  3  (0,0,1,1)  i  (1,1,0,0)  12

For k  4  (0,1,0,0)  i  (0,0,1,0)  2 In this case, since “i ” is not greater than “ k ”.

Hence, no swapping, since f 4 (k  2) and f 4 (i  4) ; had already been swapped earlier. . . . etc…

Computer Implementation of FFT (for case N  2r ). The pair of companion nodes computation are given by Equations (18, 19). To avoid “complex number” operations, Equation (18) can be computed based on “real number” operations, as following: R I R I f L (k)  if L (k) f L1 (k)  if L1 (k)

U ,R U ,I  R N I N   E  iE   f L1(k  )  if L1(k  ) (24)  2L 2L  In Equation (24), the superscripts R and I denote real and imaginary components, respectively. 11.05.14 Chapter 11.05

Multiplying the last 2 complex numbers, one obtains: R I R I f L (k)  if L (k) f L1 (k)  if L1 (k)

 U ,R R N U ,I I N   E  f L1 (k  )  E  f L1 (k  )  2L 2L 

 U ,R I N U ,I R N   iE  f L1(k  )  E  f L1(k  ) (25)  2L 2L  Equating the real (and then, imaginary) components on the Left-Hand-Side (LHS), and the Right-Hand-Side (RHS) of Equation (25), one obtains

R R  U ,R R N U ,I I N  f L (k) f L1 (k) E  f L1 (k  )  E  f L1(k  ) (26a)  2L 2L 

I I  U ,R I N U ,I R N  f L (k) f L1 (k) E  f L1 (k  )  E  f L1 (k  ) (26b)  2L 2L  Recall Equation (4) 2 i E  e N Hence 2 U  i  U  N  E  e  (27)   2U i  e N  e i  cos( )  isin( ) where 2U   (28) N 6.28U  N Thus EU ,R  cos( ) (29a) EU ,I  sin( ) (29b) Substituting Equations (29a, 29b) into Equations (26a, 26b), one gets

R R  R N I N  fL (k) fL1(k) cos( )  fL1(k  )  sin( )  fL1(k  ) (30a)  2L 2L 

I I  I N R N  fL (k) fL1(k) cos( )  fL1(k  )  sin( )  fL1(k  ) (30b)  2L 2L  Similarly, the single (complex number) Equation (19) can be expressed as 2 equivalent (real number) equations, such as equations (30a, 30b). Listing of computer implementation of serial FFT algorithm is given at http://numericalmethods.eng.usf.edu/simulations/mtl/11fft/general_fft.m Informal Development of Fast Fourier Series 11.05.15

References [1] E.Oran Brigham, The Fast Fourier Transform, Prentice-Hall, Inc. (1974).

[2] S.C. Chapra, and R.P. Canale, Numerical Methods for Engineers, 4th Edition, Mc-Graw Hill (2002).

[3] W.H . Press, B.P. Flannery, S.A. Tenkolsky, and W.T. Vetterling, Numerical Recipies, Cambridge University Press (1989), Chapter 12.

[4] M.T. Heath, Scientific Computing, Mc-Graw Hill (1997).

[5] H. Joseph Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, Inc. (1983).

FAST FOURIER TRANSFORM Topic Informal Development of Fast Fourier Series Summary Textbook notes on the informal development of fast Fourier series Major General Engineering Authors Duc Nguyen Date 2018 年 6 月 6 日 Web Site http://numericalmethods.eng.usf.edu