
fea-maslen.qxp 10/15/01 10:03 AM Page 1151

The Cooley-Tukey FFT and Group Theory
David K. Maslen and Daniel N. Rockmore

David K. Maslen is a mathematician at Susquehanna International Group LLP. His e-mail address is [email protected].

Daniel N. Rockmore is professor of mathematics and computer science at Dartmouth College and on the external faculty of the Santa Fe Institute. His e-mail address is [email protected]. He is supported in part by NSF PFF Award DMS-9553134, AFOSR F49620-00-1-0280, and DOJ 2000-DT-CX-K001. He would also like to thank the Santa Fe Institute and the Courant Institute for their hospitality during some of the writing. Pieces of the introduction are similar to his paper “The FFT: an algorithm the whole family can use”, which appeared in Computing in Science & Engineering, January 2000, pp. 62–67.

Pure and Applied: Two Sides of a Coin

In November of 1979 there appeared in the Bulletin of the AMS a paper by L. Auslander and R. Tolimieri [3] with the delightful title “Is Computing with the Finite Fourier Transform Pure or Applied Mathematics?” This rhetorical question was answered by showing that in fact the finite Fourier transform and the family of efficient algorithms used to compute it (the Fast Fourier Transform (FFT), a pillar of the world of digital signal processing) are of interest to both pure and applied mathematicians.

Auslander had come of age as an applied mathematician at a time when pure and applied mathematicians still received much the same training. The ends towards which these skills were then directed became a matter of taste. As Tolimieri retells it,¹ Auslander had become distressed at the development of a separate discipline of applied mathematics which had grown apart from much of core mathematics. The effect of this development was detrimental to both sides. On the one hand, applied mathematicians had fewer tools to bring to problems, and, conversely, pure mathematicians were often ignoring the fertile bed of inspiration provided by real-world problems. Auslander hoped their paper would help mend a growing perceived rift in the mathematical community by showing the ultimate unity of pure and applied mathematics.

¹ Private communication.

We will show that investigation of finite and fast Fourier transforms continues to be a varied and interesting direction of mathematical research. Whereas Auslander and Tolimieri concentrated on relations to nilpotent harmonic analysis and theta functions, we emphasize connections between the famous Cooley-Tukey FFT and group representation theory. In this way we hope to provide further evidence of the rich interplay of ideas which can be found at the nexus of pure and applied mathematics.

Background

The finite Fourier transform or discrete Fourier transform (DFT) has several representation theoretic interpretations: either as an exact computation of the Fourier coefficients of a function on the cyclic group $\mathbb{Z}/n\mathbb{Z}$ or a function of band-limit $n$ on the circle $S^1$, or as an approximation to the Fourier transform of a function on the real line. For each of these points of view there is a natural group-theoretic generalization and also a corresponding set of efficient algorithms for computing the quantities involved. These algorithms collectively make up the Fast Fourier Transform or FFT.

Formally, the DFT is a linear transformation mapping any complex vector of length $n$, $f = (f(0), \dots, f(n-1))^t \in \mathbb{C}^n$, to its Fourier transform, $\hat f \in \mathbb{C}^n$. The $k$th component of $\hat f$, the DFT of $f$ at frequency $k$, is

$$\hat f(k) = \sum_{j=0}^{n-1} f(j)\, e^{2\pi i jk/n}, \tag{1}$$

where $i = \sqrt{-1}$, and the inverse Fourier transform is

$$f(j) = \frac{1}{n} \sum_{k=0}^{n-1} \hat f(k)\, e^{-2\pi i jk/n}. \tag{2}$$
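As a concrete reading of (1) and (2), the following minimal Python sketch evaluates both sums directly, using on the order of $n^2$ scalar operations. The function names `dft` and `inverse_dft` are illustrative choices rather than notation from the text, and the sign convention follows (1), with a positive exponent in the forward transform.

```python
import cmath


def dft(f):
    """Direct evaluation of equation (1): fhat(k) = sum_j f(j) e^{2 pi i jk/n}."""
    n = len(f)
    return [
        sum(f[j] * cmath.exp(2 * cmath.pi * 1j * j * k / n) for j in range(n))
        for k in range(n)
    ]


def inverse_dft(fhat):
    """Direct evaluation of equation (2): f(j) = (1/n) sum_k fhat(k) e^{-2 pi i jk/n}."""
    n = len(fhat)
    return [
        sum(fhat[k] * cmath.exp(-2 * cmath.pi * 1j * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]
```

For instance, with `f = [1, 2, 0, -1]` the value `dft(f)[1]` is approximately `1+3j`, and `inverse_dft(dft(f))` recovers `f` up to rounding error.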

NOVEMBER 2001 NOTICES OF THE AMS 1151 (VOLUME 48, NUMBER 10)

Thus, with respect to the standard basis, the DFT can be expressed as the matrix-vector product $\hat f = F_n \cdot f$, where $F_n$ is the Fourier matrix of order $n$ whose $j,k$ entry is equal to $e^{2\pi i jk/n}$. Computing a DFT directly would require $n^2$ scalar operations.² Instead, the FFT is a family of algorithms for computing the DFT of any $f \in \mathbb{C}^n$ in $O(n \log n)$ operations. Since inversion can be framed as the DFT of the function $\check f(k) = \frac{1}{n} \hat f(-k)$, the FFT also gives an efficient inverse Fourier transform.

² At this point we must come clean about how we count operations. Our count is either the number of complex additions or the number of complex multiplications, whichever is greater.

One of the main practical implications of the FFT is that it allows any cyclically invariant linear operator to be applied to a vector in only $O(n \log n)$ scalar operations. Indeed, the DFT diagonalizes any cyclic group-invariant operator, making possible the following algorithm: (1) Compute the Fourier transform (DFT). (2) Multiply the DFT by the eigenvalues of the operator, which are also found using the Fourier transform. (3) Compute the inverse Fourier transform of the result. This technique is the basis of efficient convolution (i.e., filtering) and is also used for the efficient numerical solution of partial differential equations.

Some History

Since the Fourier matrix is effectively the character table of a cyclic group, it is not surprising that some of its earliest appearances are in number theory, the subject which gave birth to character theory. Consideration of the Fourier matrix goes back at least as far as Gauss, who was interested in its connections to quadratic reciprocity. In particular, Gauss showed that for odd primes $p$ and $q$,

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = \frac{\operatorname{Trace}(F_{pq})}{\operatorname{Trace}(F_p)\,\operatorname{Trace}(F_q)}, \tag{3}$$

where $\left(\frac{p}{q}\right)$ denotes the Legendre symbol. Gauss also established a formula for the quadratic Gauss sum $\operatorname{Trace}(F_n)$, which is discussed in detail in [3].

Another early appearance of the DFT occurs in the origins of representation theory in the work of Dedekind and Frobenius on the group determinant. For a finite group $G$, the group determinant $\Theta_G$ is defined as the homogeneous polynomial in the variables $x_g$ (for each $g \in G$) given by the determinant of the matrix whose rows and columns are indexed by the elements of $G$ with $g,h$-entry equal to $x_{gh^{-1}}$. Frobenius showed that when $G$ is abelian, $\Theta_G$ admits the factorization

$$\Theta_G = \prod_{\chi \in \hat G} \left( \sum_{g \in G} \chi(g)\, x_g \right), \tag{4}$$

where $\hat G$ is the set of characters of $G$. The linear form defined by the inner sum in (4) is a “generic” DFT at the frequency $\chi$.

In the nonabelian case, $\Theta_G$ admits an analogous factorization in terms of irreducible polynomials of the form

$$\Theta_D(G) = \det\left( \sum_{g \in G} D(g)\, x_g \right),$$

where $D$ is an irreducible matrix representation of $G$. The inner sum here is a generic Fourier transform over $G$. See [12] for a beautiful historical exposition of these ideas.

Gauss’s interests ranged over all areas of mathematics and its applications, so it is perhaps not surprising that the first appearance of an FFT can also be traced back to him [10]. Gauss was interested in certain astronomical calculations, a recurrent area of application of the FFT, needed for interpolation of asteroidal orbits from a finite set of equally spaced observations. Surely the prospect of a huge laborious hand calculation was good motivation for the development of a fast algorithm. Making fewer hand calculations also implies less opportunity for error and hence increased numerical stability!

Gauss wanted to compute the Fourier coefficients $a_k$ and $b_k$ of a function represented by a Fourier series of bandwidth $n$,

$$f(x) = \sum_{k=0}^{m} a_k \cos 2\pi kx + \sum_{k=1}^{m} b_k \sin 2\pi kx, \tag{5}$$

where $m = (n-1)/2$ for $n$ odd and $m = n/2$ for $n$ even. He first observed that the Fourier coefficients can be computed by a DFT of length $n$ using the values of $f$ at equispaced sample points. Gauss then went on to show that if $n = n_1 n_2$, this DFT can in turn be reduced to first computing $n_1$ DFTs of length $n_2$, using equispaced subsets of the sample points, i.e., a subsampled DFT, and then combining these shorter DFTs using various trigonometric identities. This is the basic idea underlying the Cooley-Tukey FFT.

Unfortunately, this reduction never appeared outside of Gauss’s collected works. Similar ideas, usually for the case $n_1 = 2$, were rediscovered intermittently over the succeeding years. Notable among these is the doubling trick of Danielson and Lanczos (1942), performed in the service of x-ray crystallography, another frequent employer of FFT technology. Nevertheless, it was not until the publication of Cooley and Tukey’s famous paper [7] that the algorithm gained any notice. The story of Cooley and Tukey’s collaboration is an interesting one. Tukey arrived at the basic reduction while in a meeting of President Kennedy’s Science Advisory Committee, where among the topics of discussion were techniques for offshore detection of nuclear tests in the Soviet Union. Ratification of a proposed United States/Soviet Union nuclear test ban depended upon the development of a method for detecting the tests without actually visiting the Soviet nuclear facilities. One idea required the


analysis of seismological signal data obtained from offshore seismometers, the length and number of which would require fast algorithms for computing the DFT. Other possible applications to national security included the long-range acoustic detection of nuclear submarines.

R. Garwin of IBM was another of the participants at this meeting, and when Tukey showed him this idea, Garwin immediately saw a wide range of potential applicability and quickly set to getting this algorithm implemented. Garwin was directed to Cooley, and, needing to hide the national security issues, told Cooley that he wanted the code for another problem of interest: the determination of the periodicities of the spin orientations in a 3-D crystal of He$^3$. Cooley had other projects going on, and only after quite a lot of prodding did he sit down to program the “Cooley-Tukey” FFT. In short order Cooley and Tukey prepared their paper, which, for a mathematics/computer science paper, was published almost instantaneously: in six months! This publication, Garwin’s fervent proselytizing, as well as the new flood of data available from recently developed fast analog-to-digital converters, did much to help call attention to the existence of this apparently new fast and useful algorithm. In fact, the significance of and interest in the FFT was such that it is sometimes thought of as having given birth to the modern field of analysis of algorithms. See also [6] and the 1967 and 1969 special issues of the IEEE Transactions on Audio and Electroacoustics for more historical details.

The Fourier Transform and Finite Groups

One natural group theoretic interpretation of the Fourier transform is as a change of basis in the space of complex functions on $\mathbb{Z}/n\mathbb{Z}$. Given a complex function $f$ on $\mathbb{Z}/n\mathbb{Z}$, we may expand $f$ in the basis of irreducible characters $\{\chi_k\}$ defined by $\chi_k(j) = e^{2\pi i jk/n}$. By (2) the coefficient of $\chi_k$ in the expansion is equal to the scaled Fourier coefficient $\frac{1}{n} \hat f(-k)$, whereas the Fourier coefficient $\hat f(k)$ is the inner product of the vector of function values of $f$ with those of the character $\chi_k$.

For an arbitrary finite group $G$ there is an analogous definition. The characters of $\mathbb{Z}/n\mathbb{Z}$ are the simplest example of a matrix representation, which for any group $G$ is a matrix-valued function $\rho(g)$ on $G$ such that $\rho(ab) = \rho(a)\rho(b)$ and $\rho(e)$ is the identity matrix. Given a matrix representation $\rho$ of dimension $d_\rho$ and a complex function $f$ on $G$, the Fourier transform of $f$ at $\rho$ is defined as the matrix sum

$$\hat f(\rho) = \sum_{x \in G} f(x)\, \rho(x). \tag{6}$$

Computing $\hat f(\rho)$ is equivalent to the computation of the $d_\rho^2$ scalar Fourier transforms at each of the individual matrix elements $\rho_{ij}$,

$$\hat f(\rho_{ij}) = \sum_{x \in G} f(x)\, \rho_{ij}(x). \tag{7}$$

A set of matrix representations $R$ of $G$ is called a complete set of irreducible representations if and only if the collection of matrix elements of the representations, relative to an arbitrary choice of basis for each matrix representation in the set, forms a basis for the space of complex functions on $G$. The Fourier transform of $f$ with respect to $R$ is then defined as the collection of individual transforms, while the Fourier transform on $G$ means any Fourier transform computed with respect to some complete set of irreducibles. In this case, the inverse transform is given explicitly as

$$f(x) = \frac{1}{|G|} \sum_{\rho \in R} d_\rho \operatorname{Trace}\left(\hat f(\rho)\, \rho(x^{-1})\right). \tag{8}$$

Equation (8) shows us a relation between the group Fourier transform and the expansion of a function in the basis of matrix elements. The coefficient of $\rho_{ij}$ in the expansion of $f$ is the Fourier transform of $f$ at the dual representation $[\rho_{ji}(g^{-1})]$ scaled by the factor $d_\rho/|G|$.

Viewing the Fourier transform on $G$ as a simple matrix-vector multiplication leads to some simple bounds on the number of operations required to compute the transform. The computation clearly takes no more than the $|G|^2$ scalar operations required for any matrix-vector multiplication. On the other hand, the column of the Fourier matrix corresponding to the trivial representation is all 1s, so at least $|G| - 1$ additions are necessary. One main goal of this finite group FFT research is to discover algorithms which can significantly reduce the upper bound for various classes of groups or even all finite groups.

The Current State of Affairs for Finite Group FFTs

Analysis of the Fourier transform shows that for $G$ abelian, the number of operations required is bounded by $O(|G| \log |G|)$. For arbitrary groups $G$, upper bounds of $O(|G| \log |G|)$ remain the holy grail in group FFT research. In 1978 A. Willsky provided the first nonabelian example by showing that certain metabelian groups have an $O(|G| \log |G|)$ Fourier transform algorithm [20]. Implicit in the big-O notation is the idea that a family of groups is under consideration, with the size of the individual groups going to infinity.

Since Willsky’s initial discovery, much progress has been made. U. Baum has shown that the supersolvable groups admit an $O(|G| \log |G|)$ FFT, while others have shown that symmetric groups admit $O(|G| \log^2 |G|)$ FFTs (see the section below on symmetric groups). Other groups for which highly improved (but not $O(|G| \log^c |G|)$) algorithms have been discovered include the matrix groups over finite fields and, more generally, the Lie groups of finite type. See [15] for pointers to


the literature. There is much work to be done finding new classes of groups which admit fast transforms and improving on the above results. The ultimate goal is to settle or make progress on the following conjecture.

Conjecture. There exist constants $c_1$ and $c_2$ such that for any finite group $G$ there is a complete set of irreducible matrix representations for which the Fourier transform of any complex function on $G$ may be computed in fewer than $c_1 |G| \log^{c_2} |G|$ scalar operations.

The Cooley-Tukey Algorithm

Cooley and Tukey showed [7] how the Fourier transform on the cyclic group $\mathbb{Z}/n\mathbb{Z}$, where $n = pq$ is composite, can be written in terms of Fourier transforms on the subgroup $q\mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/p\mathbb{Z}$. The trick is to change variables so that the one-dimensional formula (1) is turned into a two-dimensional formula which can be computed in two stages. Define variables $j_1, j_2, k_1, k_2$ through the equations

$$j = j(j_1, j_2) = j_1 q + j_2, \qquad 0 \le j_1 < p, \quad 0 \le j_2 < q, \tag{9}$$

$$k = k(k_1, k_2) = k_2 p + k_1, \qquad 0 \le k_1 < p, \quad 0 \le k_2 < q. \tag{10}$$

It follows from these equations that (1) can be rewritten as

$$\hat f(k_1, k_2) = \sum_{j_2=0}^{q-1} e^{2\pi i j_2 (k_2 p + k_1)/n} \sum_{j_1=0}^{p-1} e^{2\pi i j_1 k_1/p} f(j_1, j_2). \tag{11}$$

We now compute $\hat f$ in two stages:

• Stage 1: For each $k_1$ and $j_2$ compute the inner sum

$$\tilde f(k_1, j_2) = \sum_{j_1=0}^{p-1} e^{2\pi i j_1 k_1/p} f(j_1, j_2). \tag{12}$$

This requires at most $p^2 q$ scalar operations.

• Stage 2: For each $k_1$ and $k_2$ compute the outer sum

$$\hat f(k_1, k_2) = \sum_{j_2=0}^{q-1} e^{2\pi i j_2 (k_2 p + k_1)/n}\, \tilde f(k_1, j_2). \tag{13}$$

This requires an additional $q^2 p$ operations.

Thus, instead of $(pq)^2$ operations, the above algorithm uses $(pq)(p + q)$ operations.

Stage 1 has the form of a DFT on the subgroup $q\mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/p\mathbb{Z}$, embedded as the set of multiples of $q$, whereas Stage 2 has the form of a DFT on a cyclic group of order $q$. So if $n$ could be factored further, we could apply the same trick to these DFTs in turn. Thus, if $N$ has the prime factorization $N = p_1 \cdots p_m$, then we reobtain Cooley and Tukey’s original $m$-stage algorithm, which requires $N \sum_i p_i$ operations [7].

A Group Theoretic Interpretation

Auslander and Tolimieri’s paper [3] related the Cooley-Tukey algorithm to the Weil-Brezin map for the finite Heisenberg group. Here we present an alternate group theoretic interpretation, originally due to Beth [4], that is more amenable to generalization.

The change of variables (9) may be interpreted as the factorization of the group element $j$ as the (group) product of $j_1 q \in q\mathbb{Z}/n\mathbb{Z}$ with the coset representative $j_2$. Thus, if we write $G = \mathbb{Z}/n\mathbb{Z}$, $H = q\mathbb{Z}/n\mathbb{Z}$, and let $Y$ denote our set of coset representatives, (9) can be rewritten as

$$g = y \cdot h, \qquad y \in Y, \; h \in H. \tag{14}$$

The second change of variables (10) can be interpreted using the notion of restriction of representations. It is easy to see that restricting a representation on a group $G$ to a subgroup $H$ yields a representation of that subgroup. In the case of $q\mathbb{Z}/n\mathbb{Z}$ this amounts to the observation that

$$e^{2\pi i (j_1 q)(k_2 p + k_1)/n} = e^{2\pi i j_1 k_1/p},$$

so that the character $\chi_k$ of $\mathbb{Z}/n\mathbb{Z}$ with $k = k_2 p + k_1$ restricts to the character $\chi_{k_1}$ of $q\mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/p\mathbb{Z}$. These restriction relations can be recorded in a graph whose vertices at each level are the irreducible representations of the corresponding subgroup and whose edges join a representation to its restriction: the Bratteli diagram for the chain of subgroups $\mathbb{Z}/n\mathbb{Z} > q\mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/p\mathbb{Z} > 1$. Figure 1 shows the situation for the chain $\mathbb{Z}/6\mathbb{Z} > 2\mathbb{Z}/6\mathbb{Z} \cong \mathbb{Z}/3\mathbb{Z} > 1$.

In this way the irreducible representations of $\mathbb{Z}/n\mathbb{Z}$ are indexed by paths $(k_1, k_2)$ in the Bratteli diagram for $\mathbb{Z}/n\mathbb{Z} > \mathbb{Z}/p\mathbb{Z} > 1$. The DFT factorization (11) now becomes

$$\hat f(k_1, k_2) = \sum_{y \in Y} \chi_{k_1, k_2}(y) \sum_{h \in H} f(y \cdot h)\, \chi_{k_1}(h). \tag{15}$$

The two-stage algorithm is now restated as first computing a set of sums that depend on only the first leg of the paths and then combining these to compute the final sums that depend on the full paths.

In summary, the group elements have been indexed according to a particular factorization scheme, while the irreducible representations (the dual group) are now indexed by paths in a Bratteli diagram, describing the restriction of representations. This


Figure 1. The Bratteli diagram for $\mathbb{Z}/6\mathbb{Z} > 2\mathbb{Z}/6\mathbb{Z} > 1$. The representation $\chi_k$ of $\mathbb{Z}/m\mathbb{Z}$ is defined by $\chi_k(l) = e^{2\pi i kl/m}$.

allows us to compute the Fourier transform in stages, using one fewer group element factor at each stage but using paths of increasing length in the Bratteli diagram.

Fast Fourier Transforms on Symmetric Groups

A fair amount of attention has been devoted to developing efficient Fourier transform algorithms for the symmetric group. One motivation for developing these algorithms is the goal of analyzing data on the symmetric group using a spectral approach. In the simpler case of time series data on the cyclic group, this approach amounts to projecting the data vector onto the basis of complex exponentials.

The spectral approach to data analysis makes sense for a function defined on any kind of group, and such a general formulation is due to Diaconis (see, e.g., [8]). The case of the symmetric group corresponds to considering ranked data. For instance, a group of people might be asked to rank a list of four restaurants in order of preference. Thus, each respondent chooses a permutation of the original ordered list of four objects, and counting the number of respondents choosing each permutation yields a function on $S_4$. It turns out that the corresponding Fourier decomposition of this function naturally describes various coalition effects that may be useful in describing the data.

To get some feel for this, notice that the Fourier transform at the matrix element $\rho_{ij}(\pi)$ of the (reducible) defining representation counts the number of people ranking restaurant $i$ in position $j$. If instead $\rho$ is the (reducible) permutation representation of $S_n$ on unordered pairs $\{i, j\}$, then for each choice of $\{i, j\}$ and $\{k, l\}$ the individual Fourier transforms count the number of respondents ranking restaurants $i$ and $j$ in positions $k$ and $l$. See [8] for a more thorough explanation.

The first FFT for symmetric groups (an $O(|G| \log^3 |G|)$ algorithm) is due to M. Clausen. In what follows we summarize recent improvements on Clausen’s result.

Example: Computing the Fourier Transform on $S_4$

The fast Fourier transform for $S_4$ is obtained by mimicking the group theoretic approach to the Cooley-Tukey algorithm. More precisely, we shall rewrite the formula for the Fourier transform using two changes of variables: one using factorizations of group elements, and the other using paths in a Bratteli diagram. The former comes from the reduced word decomposition of $g \in S_4$, by which $g$ may be uniquely expressed as

$$g = s_2^4 \cdot s_3^4 \cdot s_4^4 \cdot s_2^3 \cdot s_3^3 \cdot s_2^2, \tag{16}$$

where $s_i^j$ is either $e$ or the transposition $(i \;\, i-1)$, and $s_{i_1}^j = e$ implies that $s_{i_2}^j = e$ for $i_2 \le i_1$. Thus any function on the group $S_4$ may be thought of as a function of the six variables $s_2^4, s_3^4, s_4^4, s_2^3, s_3^3, s_2^2$.

To index the matrix elements of $S_4$, paths in a Bratteli diagram are used, this time relative to the chain of subgroups $S_4 \ge S_3 \ge S_2 \ge S_1 \ge 1$. The irreducible representations of $S_n$ are in one-to-one correspondence with partitions of the integer $n$, with restriction of representations corresponding to deleting a box in the Young diagram. The corresponding Bratteli diagram is called Young’s lattice and is shown in Figure 2.

Figure 2. Young’s lattice up to level 4.

Paths in Young’s lattice from the empty partition $\emptyset$ to $\beta_4$, a partition of 4, index the basis vectors of the irreducible representation corresponding to $\beta_4$. Matrix elements, however, are determined by specifying a pair of basis vectors, so to index the matrix elements, we must use pairs of paths in Young’s lattice, starting at $\emptyset$ and ending at the same partition of 4. Since there are no multiple edges in Young’s lattice, each path may be described by the sequence of partitions $\emptyset, \beta_1, \beta_2, \beta_3, \beta_4$ through which it passes.

Before we can state a formula for the Fourier transform analogous to (11) and (15), we must choose bases for the irreducible representations of $S_4$ in order to define our matrix elements. Efficient algorithms are known only for special choices of bases, and our algorithm uses the representations in Young’s orthogonal form, which is equivalent to the following equation (17) for the Fourier transform in the new sets of variables.
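As an aside before the formula, the two indexing schemes just described can be checked by brute force. The following Python sketch is only an illustration: the helper names (`transposition`, `compose`, `block`, `reduced_word_products`, `add_box`, `count_paths`) and the convention that a product of permutations applies its right-hand factor first are assumptions of the sketch, not notation from the text. It enumerates the products allowed by (16) and counts paths in Young’s lattice by repeatedly adding a box.

```python
from itertools import product


def transposition(i, n=4):
    """Adjacent transposition (i i-1) on {1..n}, stored 0-indexed as a tuple of images."""
    images = list(range(n))
    images[i - 1], images[i - 2] = images[i - 2], images[i - 1]
    return tuple(images)


def compose(p, q):
    """Product p*q of two permutations; q is applied first (an assumed convention)."""
    return tuple(p[q[x]] for x in range(len(p)))


def block(j, n=4):
    """Allowed values of s_2^j ... s_j^j in (16): e, s_j, s_{j-1} s_j, ..., s_2 ... s_j."""
    elements = []
    for m in range(j + 1, 1, -1):  # m = j + 1 gives the identity
        g = tuple(range(n))
        for i in range(m, j + 1):
            g = compose(g, transposition(i, n))
        elements.append(g)
    return elements


def reduced_word_products(n=4):
    """All products (block n)(block n-1)...(block 2), listed with repetition."""
    products = []
    for factors in product(*(block(j, n) for j in range(n, 1, -1))):
        g = tuple(range(n))
        for y in factors:
            g = compose(g, y)
        products.append(g)
    return products


def add_box(partition):
    """Partitions obtained from `partition` (a weakly decreasing tuple) by adding one box."""
    result = []
    for row in range(len(partition) + 1):
        new = list(partition) + ([0] if row == len(partition) else [])
        new[row] += 1
        if row == 0 or new[row - 1] >= new[row]:
            result.append(tuple(new))
    return result


def count_paths(level):
    """Number of paths in Young's lattice from the empty partition to each partition of `level`."""
    counts = {(): 1}
    for _ in range(level):
        nxt = {}
        for partition, c in counts.items():
            for bigger in add_box(partition):
                nxt[bigger] = nxt.get(bigger, 0) + c
        counts = nxt
    return counts
```

With these conventions, `reduced_word_products(4)` lists 24 pairwise distinct permutations, consistent with the uniqueness claimed for (16). `count_paths(4)` gives 1, 3, 2, 3, 1 paths ending at the five partitions of 4, so there are $1 + 9 + 4 + 9 + 1 = 24$ pairs of paths with a common endpoint, one for each matrix element in a complete set of irreducible representations of $S_4$.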


[Equation (17)]

The functions $P^i_{s_i^j}$ in equation (17) are defined below, and for each $i$, the variables $\beta_i$, $\gamma_i$, $\phi_i$, $\eta_i$ are partitions of $i$ satisfying the restriction relations described by Figure 3. A solid line between partitions means that the right-hand partition is obtained from the left-hand partition by removing a box.

Figure 3. Restriction relations for (17).

The relationship between (17) and Figure 3 is extremely close: we derived the diagram from the reduced word decomposition first and then read the equation off the diagram. Each 2-cell in Figure 3 corresponds to a factor in the product of $P$ functions in (17), and the labels on the boundary of each cell give the arguments of $P^i_{s_i^j}$. The sum in (17) is over those variables occurring in the interior of Figure 3. Thus, the variables describing the Fourier transformed function are exactly those appearing on the boundary of the figure.

Equation (17) can be summarized by saying that we take the product over 2-cells and sum on interior indices in Figure 3. This suggests a generalization of the Cooley-Tukey algorithm that corresponds to building up the diagram one cell at a time. At each stage multiply by the factor corresponding to a 2-cell and form the diagram consisting of those 2-cells that have been considered so far. Then sum over any indices that are in the interior of the diagram for this stage but were not in the interior for previous stages. At the end of this algorithm we have multiplied by the factors for each 2-cell and summed over all the interior indices, and have therefore computed the Fourier transform.

The order in which the cells are added matters, of course. The order $s_2^2, s_2^3, s_3^3, s_2^4, s_3^4, s_4^4$ is known to be most efficient. Here is the algorithm in detail.

• Stage 0: Start with $f(s_2^4 s_3^4 s_4^4 s_2^3 s_3^3 s_2^2)$, for all reduced words.
• Stage 1: Multiply by $P^2_{s_2^2}$. Sum on $s_2^2$.
• Stage 2: Multiply by $P^2_{s_2^3}$. Sum on $s_2^3$.
• Stage 3: Multiply by $P^3_{s_3^3}$. Sum on $\eta_1, s_3^3$.
• Stage 4: Multiply by $P^2_{s_2^4}$. Sum on $s_2^4$.
• Stage 5: Multiply by $P^3_{s_3^4}$. Sum on $\phi_1, s_3^4$.
• Stage 6: Multiply by $P^4_{s_4^4}$. Sum on $\phi_2, s_4^4$.

The indices occurring in each stage of the algorithm are shown in Figure 4.

To count the number of additions and multiplications used by the algorithm, we must count the number of configurations in Young’s lattice corresponding to each of the diagrams in Figure 4. This yields a grand total of 130 additions and 130 multiplications for the Fourier transform on $S_4$.

The generalization to higher-order symmetric groups is straightforward. The reduced word decomposition gives the group element factorization, Young’s orthogonal form allows us to change variables, and the formula and algorithm for the Fourier transform can be read off a diagram generalizing Figure 3. The diagram for $S_5$, for example, is shown in Figure 5.

We have computed the exact operation counts for symmetric groups $S_n$ with $n \le 50$,³ and a general formula seems hard to come by. However, bounds are easier to obtain.

³ This would seem to include all cases where the algorithm might ever be implemented, but the same numbers arise in FFTs on homogeneous spaces, which have far fewer elements.

Theorem 1 [13]. The number of additions (or multiplications) required by the above algorithm (as generalized to $S_n > S_{n-1} > \cdots > S_1$) is exactly

$$n! \cdot \sum_{k=2}^{n} \frac{1}{k} \sum_{i=2}^{k} \frac{1}{(i-1)!}\, F_i,$$

where $F_i$ is the number of configurations in Young’s lattice of the form

[Diagram (18)]

Furthermore, $F_i \le 3\left(1 - \frac{1}{i}\right) i!$, so the number of additions (multiplications) is bounded by $\frac{3}{4}\, n(n-1) \cdot n!$.

Why stop at $S_n$? The algorithm for the FFT on $S_n$ generalizes to any wreath product $S_n[G]$ with


Figure 4. Variables occurring at each stage of the fast Fourier transform for S4.

Figure 5. Restriction relations in the Fourier transform formula for $S_5$.

the symmetric group. The subgroup chain is replaced by the chain

$$S_n[G] > S_{n-1}[G] \times G > S_{n-1}[G] > \cdots > S_2[G] > G \times G > G, \tag{19}$$

and the reduced word decomposition is replaced by the factorization

$$x = s_2^n \cdots s_n^n\, g^n\, s_2^{n-1} \cdots s_{n-1}^{n-1}\, g^{n-1} \cdots s_2^2\, g^2\, g^1. \tag{20}$$

Adapting the $S_n$ argument along these lines gives the following new result.

Theorem 2. The number of operations needed to compute a Fourier transform on $S_n[G]$ is at most

$$\left( \frac{3n(n-1)}{4}\, |G|\, d_G^2 + n\, t_G + \frac{1}{4}\, |G| \left( h_G d_G^2 - |G| \right) \right) |S_n[G]|,$$

where $h_G$ is the number of conjugacy classes in $G$, $d_G$ is the maximal degree of an irreducible representation of $G$, and $t_G$ is the number of operations required to compute a Fourier transform on $G$. If $G$ is abelian, then the inner term $h_G d_G^2 - |G| = 0$.

The functions $P^i_{s_i^j}$ defining Young’s orthogonal form are defined as follows: For any two boxes $b_1$ and $b_2$ in a Young diagram, we define the axial distance from $b_1$ to $b_2$ to be $d(b_1, b_2)$, where $d(b_1, b_2) = \operatorname{row}(b_1) - \operatorname{row}(b_2) + \operatorname{column}(b_1) - \operatorname{column}(b_2)$. Now suppose $\beta_i$, $\beta_{i-1}$, $\alpha_{i-1}$, $\alpha_{i-2}$ are partitions and that $\alpha_{i-1}$ and $\beta_{i-1}$ are obtained from $\beta_i$ by removing a box and are obtained from $\alpha_{i-2}$ by adding a box. Then the skew diagrams of $\beta_i - \beta_{i-1}$ and $\beta_{i-1} - \alpha_{i-2}$ each consist of a single box, and $P^i$ is given by

[Equation (21)]

For a proof of this formula, in slightly different notation, see [11, Chapter 3].

Generalization to Other Groups

The FFT described for symmetric groups suggests a general approach to computing Fourier transforms on finite groups. Here is the recipe.

1. Choose a chain of subgroups

$$G = G_m \ge G_{m-1} \ge \cdots \ge G_1 \ge G_0 = 1 \tag{22}$$

for the group. This determines the Bratteli diagram that we will use to index the matrix elements of $G$. In the general case, this Bratteli diagram may have multiple edges, so a path is no longer determined by the nodes it visits.

2. Choose a factorization $g = g_n \cdot g_{n-1} \cdots g_1$ of each group element $g$. Choose the $g_i$ so that they lie in as small a subgroup $G_k$ as possible and commute with as large a subgroup $G_l$ as possible.

3. Choose a system of Gel’fand-Tsetlin bases [9] for the irreducible representations of $G$ relative to the chain (22). These are bases that are indexed by paths in the Bratteli diagram and that behave well under restriction of representations. Relative to such a basis, the representation matrices of $g_i$ will be block diagonal


whenever $g_i$ lies in a subgroup from the chain and block scalar whenever $g_i$ commutes with all elements of a subgroup from the chain.

4. Now write the Fourier transform in coordinates as a function of the pairs of paths in the Bratteli diagram with a common endpoint and with the original function written as a function of $g_1, \ldots, g_n$. This will be a sum of products indexed by edges in the Bratteli diagram which lie in some configuration generalizing (3). This configuration of edges specifies the way in which the nonzero elements of the representation matrices appear in the formula for the Fourier transform in coordinates.

5. The algorithm proceeds by building up the product piece by piece and summing on as many partially indexed variables as possible.

Further Considerations and Generalizations

The efficiency of the above approach, both in theory (in terms of algorithmic complexity) and in practice (in terms of execution time), depends on both the choice of factorization and the Gel'fand-Tsetlin bases. In particular, very interesting work of L. Auslander, J. Johnson, and R. Johnson [2] shows how, in the abelian case, different factorizations correspond to different well-known FFTs, each well suited for execution on a different computer architecture. This work shows how to relate the 2-cocycle of a group extension to construction of the important "twiddle factor" matrix in the factorization of the Fourier matrix. It marks the first appearance of group cohomology in signal processing and derives an interesting connection between group theory and the design of retargetable software.

The analogous question for nonabelian groups and other important signal processing transform algorithms, i.e., the problem of finding architecture-optimized factorizations, is currently being investigated by the SPIRAL project at Carnegie Mellon [19].

Another Abelian Idea: Rader's Prime FFT

The use of subgroups depends upon the existence of a nontrivial subgroup. Thus, for a reduction in the case of a cyclic group of prime order, a new idea is necessary. In this case, one possibility is an algorithm due to C. Rader [16] which proceeds by turning computation of the DFT into computation of convolution on a different, albeit related, group. Let $p$ be a prime. Since $\mathbb{Z}/p\mathbb{Z}$ is also a finite field, there exists a generator $g$ of $(\mathbb{Z}/p\mathbb{Z})^{\times}$, a cyclic group (under multiplication) of order $p - 1$. Thus, for any $f : \mathbb{Z}/p\mathbb{Z} \to \mathbb{C}$ and nonzero frequency index $g^{-b}$, $\hat{f}(g^{-b})$ can be written as

(23)  $\hat{f}(g^{-b}) = f(0) + \sum_{a=0}^{p-2} f(g^{a})\, \zeta_p^{\,g^{a-b}}.$

The summation in (23) has the form of a convolution on $\mathbb{Z}/(p-1)\mathbb{Z}$ of the sequence $f'(a) = f(g^{a})$ with the function $z(a) = \zeta_p^{\,g^{a}}$, so that $\hat{f}$ may be almost entirely computed using Fourier transforms of length $p - 1$, for which Cooley-Tukey-like ideas may be used. It is a very interesting open question to discover if this idea has a nonabelian generalization.

Modular FFTs

A significant application of the abelian FFT is in the efficient computation of Fourier transforms for functions on cyclic groups defined over finite fields. These are needed for the efficient encoding and decoding of various polynomial error-correcting codes. Many abelian codes, e.g., the Golay codes used in deep-space communication, are defined as $\mathbb{F}_p$-valued functions on a group $\mathbb{Z}/m\mathbb{Z}$ with the property that $\hat{f}(k) = 0$ for $k$ in some specified set of indices $S$, where now the Fourier transform is defined in terms of a primitive $(p-1)$st root of unity. These sorts of spectral constraints define cyclic codes, and they may immediately be generalized to any finite group. Recently, this has been done in the construction of codes over $SL_2(\mathbb{F}_p)$ using connections between expander graphs and linear codes discovered by M. Sipser and D. Spielman. For further discussion of this and other applications, see [17].

FFTs for Compact Groups

The DFT and FFT have a natural extension to continuous compact groups. The terminology "discrete Fourier transform" derives from the algorithm having been originally designed to compute the (possibly approximate) Fourier transform of a continuous signal from a discrete collection of sample values.

Under the simplifying assumption of periodicity, a continuous function may be interpreted as a function on the unit circle $S^1$, a compact abelian group. Any such function $f$ has a Fourier expansion defined as

(24)  $f(e^{2\pi i t}) = \sum_{l \in \mathbb{Z}} \hat{f}(l)\, e^{-2\pi i l t},$

where

(25)  $\hat{f}(l) = \int_0^1 f(e^{2\pi i t})\, e^{2\pi i l t}\, dt.$

If $\hat{f}(l) = 0$ for $|l| \geq N$, then $f$ is band-limited with band-limit $N$, and the DFT (1) is in fact a quadrature rule or sampling theorem for $f$. That is, the DFT of the function $\frac{1}{2N-1} f(e^{2\pi i t})$ on the group of $(2N-1)$st roots of unity computes exactly the Fourier coefficients of the band-limited function. The FFT then efficiently computes these Fourier coefficients.

The first nonabelian FFT for a compact group was a fast spherical harmonic expansion algorithm discovered by J. Driscoll and D. Healy. Several ingredients were required: (1) a notion of "band-limit" for functions on $S^2$, (2) a sampling theory for such functions, and (3) a fast algorithm for the computation.
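Rader's reindexing in equation (23) can be made concrete with a short sketch. It is only an illustration under stated assumptions: $\zeta_p = e^{2\pi i/p}$, matching the sign convention of equation (1); $p$ is an odd prime; the generator $g$ is found by brute force; and the length $p-1$ transforms are evaluated naively rather than by a fast algorithm. The function names (`dft`, `cyclic_convolve`, `primitive_root`, `rader_dft`) are hypothetical and do not come from the article.

```python
import cmath

def dft(x, sign=1):
    """Naive DFT with kernel exp(sign*2*pi*i*s*k/n); sign=+1 matches equation (1)."""
    n = len(x)
    return [sum(x[s] * cmath.exp(sign * 2j * cmath.pi * ((s * k) % n) / n)
                for s in range(n))
            for k in range(n)]

def cyclic_convolve(u, v):
    """(u * v)(b) = sum_a u(a) v(b - a) on Z/mZ, via the convolution theorem."""
    m = len(u)
    product = [U * V for U, V in zip(dft(u), dft(v))]
    return [y / m for y in dft(product, sign=-1)]

def primitive_root(p):
    """Brute-force search for a generator of the multiplicative group mod p."""
    for g in range(2, p):
        x = 1
        for order in range(1, p):
            x = x * g % p
            if x == 1:
                break
        if x == 1 and order == p - 1:
            return g
    raise ValueError("no generator found; p must be an odd prime")

def rader_dft(f):
    """DFT of a sequence of odd prime length p, organised as in equation (23)."""
    p = len(f)
    m = p - 1
    g = primitive_root(p)
    g_pow = [pow(g, a, p) for a in range(m)]        # g^a mod p
    u = [f[g_pow[a]] for a in range(m)]             # u(a) = f(g^a)
    # v(c) = zeta_p^(g^(-c)), so sum_a u(a) v(b - a) = sum_a f(g^a) zeta_p^(g^(a-b))
    v = [cmath.exp(2j * cmath.pi * g_pow[(-c) % m] / p) for c in range(m)]
    conv = cyclic_convolve(u, v)
    f_hat = [0j] * p
    f_hat[0] = sum(f)                               # frequency 0 is a plain sum
    for b in range(m):
        f_hat[g_pow[(-b) % m]] = f[0] + conv[b]     # frequency g^(-b)
    return f_hat

f = [1, 2 - 1j, 0.5, -3, 4j, 2, 1.5]                # length 7, a prime
max_error = max(abs(x - y) for x, y in zip(rader_dft(f), dft(f)))
print(max_error)                                    # should be at rounding level
```

In a real implementation the three naive length $p-1$ transforms inside `cyclic_convolve` would be replaced by a fast transform, which is where the Cooley-Tukey-like savings would come from.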
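The sampling statement for band-limited functions on the circle can also be checked numerically. The sketch below assumes the conventions of equations (24), (25), and (1): it builds a function with $\hat{f}(l) = 0$ for $|l| \geq N$, samples it at the $(2N-1)$st roots of unity, and applies the DFT scaled by $1/(2N-1)$. The helper names and the test coefficients are hypothetical choices made for illustration.

```python
import cmath

def sample_on_roots_of_unity(coeffs, N):
    """Evaluate f(e^{2 pi i t}) = sum_l coeffs[l] e^{-2 pi i l t} at t = s/(2N-1)."""
    n = 2 * N - 1
    return [sum(c * cmath.exp(-2j * cmath.pi * ell * s / n)
                for ell, c in coeffs.items())
            for s in range(n)]

def scaled_dft(samples):
    """DFT with the kernel of equation (1), divided by the number of samples."""
    n = len(samples)
    return [sum(samples[s] * cmath.exp(2j * cmath.pi * s * k / n)
                for s in range(n)) / n
            for k in range(n)]

N = 4                                               # band-limit: indices -3..3
coeffs = {ell: complex(ell + 1, -0.5 * ell) for ell in range(-(N - 1), N)}
recovered = scaled_dft(sample_on_roots_of_unity(coeffs, N))

# Coefficient l is found at DFT index l mod (2N-1).
max_error = max(abs(recovered[ell % (2 * N - 1)] - c) for ell, c in coeffs.items())
print(max_error)                                    # should be at rounding level
```

The recovery is exact (up to rounding) because the $2N-1$ admissible indices are distinct modulo $2N-1$, so no two coefficients alias onto the same DFT output.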

NOTICES OF THE AMS, VOLUME 48, NUMBER 10

The spherical harmonics are naturally indexed according to their order (the common degree of a set of homogeneous polynomials on $S^2$). With respect to the usual coordinates of latitude and longitude, the spherical harmonics separate as a product of exponentials and associated Legendre functions, each of which separately has a sampling theory. Finally, combining the usual FFT for the exponential part with a new fast algorithm (based on three-term recurrences) for the Legendre part yields an FFT for $S^2$.

These ideas generalize nicely. Keep in mind that the representation theory of compact groups is much like that of finite groups: there is a countable complete set of irreducible representations, and any square-integrable function (with respect to Haar measure) has an expansion in terms of the corresponding matrix elements. There is a natural definition of band-limited in the compact case, encompassing those functions whose Fourier expansion has only a finite number of terms. The simplest version of the theory is as follows:

Definition. Let $\mathcal{R}$ denote a complete set of irreducible representations of a compact group $G$. A system of band-limits on $G$ is a decomposition $\mathcal{R} = \bigcup_{b \geq 0} \mathcal{R}_b$ such that

1. $\mathcal{R}_b$ is finite for all $b \geq 0$,
2. $b_1 \leq b_2$ implies that $\mathcal{R}_{b_1} \subseteq \mathcal{R}_{b_2}$,
3. $\mathcal{R}_{b_1} \otimes \mathcal{R}_{b_2} \subseteq \operatorname{span}_{\mathbb{Z}} \mathcal{R}_{b_1 + b_2}$.

Suppose $\{\mathcal{R}_b\}_{b \geq 0}$ is a system of band-limits on $G$ and $f \in L^2(G)$. Then $f$ is band-limited with band-limit $b$ if the Fourier coefficients are zero for all matrix elements in $\rho$ for all $\rho \notin \mathcal{R}_b$.

The case of $G = S^1$ provides the classical example. If $\mathcal{R}_b = \{\chi_j : |j| \leq b\}$, where $\chi_j(z) = z^j$, then $\chi_j \otimes \chi_k = \chi_{j+k}$, and the notion of band-limited corresponding to the definition coincides with the usual notion.

For a nonabelian example, consider $G = SO(3)$. In this case the irreducible representations of $G$ are indexed by the nonnegative integers, with $V_\lambda$ the unique irreducible representation of dimension $2\lambda + 1$. Let $\mathcal{R}_b = \{V_\lambda : \lambda \leq b\}$. The Clebsch-Gordan relations

(26)  $V_{\lambda_1} \otimes V_{\lambda_2} = \bigoplus_{j = |\lambda_1 - \lambda_2|}^{\lambda_1 + \lambda_2} V_j$

imply that this is a system of band-limits for $SO(3)$. When restricted to the quotient $S^2 \cong SO(3)/SO(2)$, band-limits are described in terms of the highest order spherical harmonics that appear in a given expansion.

This notion of band-limit permits the construction of a sampling theory [14]. For example, in the case of the classical groups, a system of band-limits $\mathcal{R}^n_b$ is chosen with respect to a particular norm on the dual of the associated Cartan subalgebra. Such a norm $\|\cdot\|$ (assuming that it is invariant under taking duals and that $\|\alpha\| \leq \|\beta\| + \|\gamma\|$ for $\alpha$ occurring in $\beta \otimes \gamma$) defines a notion of band-limit given by all $\alpha$ with norm less than a fixed $b$. This generalizes the definition above. The associated sampling sets $X^n_b$ are contained in certain one-parameter subgroups. These sampling sets permit a separation of variables analogous to that used in the Driscoll-Healy FFT. Once again the special functions satisfy certain three-term recurrences which admit a similar efficient divide-and-conquer computational approach (see [15] and references therein). One may derive efficient algorithms for all the classical groups $U(n)$, $SU(n)$, and $Sp(n)$.

Theorem 3. Assume $n \geq 2$.
(i) For $U(n)$, $T_{X^n_b}(\mathcal{R}^n_b) \leq O(b^{\dim U(n) + 3n - 3})$,
(ii) for $SU(n)$, $T_{X^n_b}(\mathcal{R}^n_b) \leq O(b^{\dim SU(n) + 3n - 2})$,
(iii) for $Sp(n)$, $T_{X^n_b}(\mathcal{R}^n_b) \leq O(b^{\dim Sp(n) + 6n - 6})$,
where $T_{X^n_b}(\mathcal{R}^n_b)$ denotes the number of operations needed for the particular sample set $X^n_b$ and representations $\mathcal{R}^n_b$ for the associated group.

Further and Related Work

Noncompact Groups

Much of modern signal processing relies on the understanding and implementation of Fourier analysis for $L^2(\mathbb{R})$, i.e., the noncompact abelian group $\mathbb{R}$. Nonabelian, noncompact examples have begun to attract much attention.

In this area some of the most exciting work is being done by G. Chirikjian and his collaborators. They have been exploring applications of convolution on the group of rigid motions of Euclidean space to such diverse areas as robotics, polymer modeling, and pattern matching. See [5] for details and pointers to the literature.

To date the techniques used here are approximate in nature, and interesting open problems abound. Possibilities include the formulation of natural sampling, band-limiting, and time-frequency theories. The exploration of other special cases such as semisimple Lie groups (see [1] for a beautifully written, succinct survey of the Harish-Chandra theory) would be one natural place to start. A sampling and band-limiting theory would be the first step towards developing a computational theory, i.e., an FFT. "Fast Fourier transforms on semisimple Lie groups" has a nice ring to it!

Approximate Techniques

The techniques in this paper are all exact, in the sense that if computed in exact arithmetic, they yield exactly correct answers. Of course, in any actual implementation, errors are introduced, and the utility of an algorithm will depend highly on its numerical stability.

There are also "approximate methods", approximate in the sense that they guarantee a certain specified approximation to the exact answer

NOVEMBER 2001, NOTICES OF THE AMS

that depends on the running time of the algorithm. For computing Fourier transforms at nonequispaced frequencies, as well as spherical harmonic expansions, the fast multipole method due to V. Rokhlin and L. Greengard is a recent and very important approximate technique. Multipole-based approaches efficiently compute these quantities approximately in such a way that the running time increases by a factor of $\log(1/\epsilon)$, where $\epsilon$ denotes the precision of the approximation. M. Mohlenkamp has applied quasi-classical frequency estimates to the approximate computation of various special function transforms.

Quantum Computing

Another related and active area of research involves connections with quantum computing. One of the first great triumphs of the quantum computing model is P. Shor's fast algorithm for integer factorization on a quantum computer [18]. At the heart of Shor's algorithm is a subroutine which computes (on a quantum computer) the DFT of a binary vector representing an integer. The implementation of this transform as a sequence of one- and two-bit quantum gates is the quantum FFT, effectively the Cooley-Tukey FFT realized as a particular factorization of the Fourier matrix into a product of matrices composed as tensor products of certain $2 \times 2$ unitary matrices, each of which is a "local unitary transform". Extensions of these ideas to the more general group transforms mentioned above are currently an important area of research of great interest in computer science.

Final Remarks

So, these are some of the things that go into the computation of the finite Fourier transform. It is a tapestry of mathematics both pure and applied, woven from algebra and analysis, complexity theory, and scientific computing. It is on the one hand a focused problem, but like any good problem, its "solution" does not end a story, but rather initiates an exploration of unexpected connections and new challenges.

References

[1] J. ARTHUR, Harmonic analysis and group representations, Notices Amer. Math. Soc. 47 (2000), 26–34.
[2] L. AUSLANDER, J. R. JOHNSON, and R. W. JOHNSON, Multidimensional Cooley-Tukey algorithms revisited, Adv. Appl. Math. 17 (1996), 477–519.
[3] L. AUSLANDER and R. TOLIMIERI, Is computing with the finite Fourier transform pure or applied mathematics? Bull. Amer. Math. Soc. (N.S.) 1 (1979), 847–897.
[4] T. BETH, Verfahren der schnellen Fourier-Transformation, Teubner Studienbücher, Stuttgart, 1984.
[5] G. S. CHIRIKJIAN and A. B. KYATKIN, Engineering Applications of Noncommutative Harmonic Analysis, CRC Press, Boca Raton, FL, 2000.
[6] J. W. COOLEY, The re-discovery of the fast Fourier transform algorithm, Mikrochimica Acta III (1987), 33–45.
[7] J. W. COOLEY and J. W. TUKEY, An algorithm for machine calculation of complex Fourier series, Math. Comp. 19 (1965), 297–301.
[8] P. DIACONIS, Group Representations in Probability and Statistics, Inst. Math. Stat., Hayward, CA, 1988.
[9] I. GEL'FAND and M. TSETLIN, Finite dimensional representations of the group of unimodular matrices, Dokl. Akad. Nauk SSSR 71 (1950), 825–828 (Russian).
[10] M. T. HEIDEMAN, D. H. JOHNSON, and C. S. BURRUS, Gauss and the history of the fast Fourier transform, Arch. Hist. Exact Sci. 34 (1985), 265–277.
[11] G. JAMES and A. KERBER, The Representation Theory of the Symmetric Group, Encyclopedia Math. Appl., vol. 16, Addison-Wesley, Reading, MA, 1981.
[12] T. Y. LAM, Representations of finite groups: A hundred years, I and II, Notices Amer. Math. Soc. 45 (1998), 361–372, 465–474.
[13] D. K. MASLEN, The efficient computation of Fourier transforms on the symmetric group, Math. Comp. 67 (1998), 1121–1147.
[14] D. K. MASLEN, Efficient computation of Fourier transforms on compact groups, J. Fourier Anal. Appl. 4 (1998), 19–52.
[15] D. K. MASLEN and D. N. ROCKMORE, Generalized FFTs - a survey of some recent results, Groups and Computation, II (New Brunswick, NJ, 1995), DIMACS Ser. Discrete Math. Theoret. Comput. Sci., vol. 28, Amer. Math. Soc., Providence, RI, 1997, pp. 183–237.
[16] C. RADER, Discrete Fourier transforms when the number of data samples is prime, IEEE Proc. 56 (1968), 1107–1108.
[17] D. N. ROCKMORE, Some applications of generalized FFTs (an appendix w/D. Healy), Groups and Computation, II (New Brunswick, NJ, 1995), DIMACS Ser. Discrete Math. Theoret. Comput. Sci., vol. 28, Amer. Math. Soc., Providence, RI, 1997, pp. 329–369.
[18] P. W. SHOR, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM J. Comput. 26 (1997), 1484–1509.
[19] http://www.ece.cmu.edu/~spiral/.
[20] A. WILLSKY, On the algebraic structure of certain partially observable finite-state Markov processes, Inform. Contr. 38 (1978), 179–212.
