
Numerical Algorithms 24 (2000) 99–116

Coupled Vandermonde matrices and the superfast computation of Toeplitz determinants ∗

Peter Kravanja ∗∗ and Marc Van Barel Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200 A, B-3001 Heverlee, Belgium E-mail: [email protected]; [email protected]

Let $n$ be a positive integer, let $a_{-n+1}, \ldots, a_{-1}, a_0, a_1, \ldots, a_{n-1}$ be complex numbers and let $T := [a_{k-l}]_{k,l=0}^{n-1}$ be a nonsingular $n \times n$ complex Toeplitz matrix. We present a superfast algorithm for computing the determinant of $T$. Superfast means that the arithmetic complexity of our algorithm is $O(N \log^2 N)$, where $N$ denotes the smallest power of 2 that is larger than or equal to $n$. We show that $\det T$ can be computed from the determinant of a certain coupled Vandermonde matrix. The latter matrix is related to a linearized rational interpolation problem at roots of unity and we show how its determinant can be calculated by multiplying the pivots that appear in the superfast interpolation algorithm that we presented in a previous publication.

Keywords: Toeplitz determinants, rational interpolation, coupled Vandermonde matrices

AMS subject classification: 65F40, 65D05

1. Introduction

Let $n$ be a positive integer, let $a_{-n+1}, \ldots, a_{-1}, a_0, a_1, \ldots, a_{n-1}$ be complex numbers and let $T = T_n := [a_{k-l}]_{k,l=0}^{n-1}$ be a nonsingular $n \times n$ complex Toeplitz matrix. We consider the problem of computing the determinant of $T$. Let $N$ denote the smallest power of 2 that is larger than or equal to $n$. In [5,6] we used a formula of Heinig and Rost [4] for the generating function of $T^{-1}$ to represent the $N \times N$ matrix
\[
[T^{-1}]_N := \begin{bmatrix} T^{-1} & 0 \\ 0 & 0 \end{bmatrix}
\]
in terms of discrete Fourier transform matrices and diagonal matrices. The latter matrices involve certain polynomials (known as the canonical fundamental system of $T$) evaluated at roots of unity. These polynomials can be computed by solving two linearized rational interpolation problems at the $2N$th roots of unity. In [5,6] we presented a stabilized generically superfast algorithm for solving such interpolation problems. (An earlier version of this algorithm can be found in [8].) Superfast means that the arithmetic complexity of our algorithm is $O(\nu \log^2 \nu)$ for a problem that consists of $\nu$ interpolation points. Generically refers to the fact that in some exceptional cases the complexity of the algorithm is only $O(\nu^2)$. We combined our superfast interpolation algorithm and the explicit formula for $[T^{-1}]_N$ to obtain a superfast Toeplitz solver, i.e., a superfast algorithm for solving linear systems of equations that have Toeplitz structure.

The two linearized rational interpolation problems that are related to the canonical fundamental system can be expressed in terms of a certain coupled Vandermonde matrix $V_C$. We will show how $\det T$ can be computed from $\det V_C$ and how $\det V_C$ can be computed from the pivots that appear in our interpolation algorithm. This will enable us to compute $\det T$ in a generically superfast way.

In a companion paper [7] we considered the related problem of computing the determinant of a complex Hankel matrix $H$. By exploiting the connections that exist between Hankel, Loewner, Cauchy and coupled Vandermonde matrices, we were able to show that the determinant of $H$ can be computed from the determinant of a certain coupled Vandermonde matrix $V_C^h$ (different from $V_C$). The present paper constitutes an improvement on [7] in the following sense: in [7] the size of $H$ had to be a power of 2 whereas in the present paper there is no restriction on the size of $T$. Also, the entries that appear in $V_C$ are less expensive to compute than those in $V_C^h$.

∗ This research was partially supported by the Fund for Scientific Research-Flanders (FWO-V), project "Orthogonal Systems and Their Applications", grant #G.0278.97.
∗∗ Corresponding author.

© J.C. Baltzer AG, Science Publishers
P. Kravanja, M. Van Barel / Superfast computation of Toeplitz determinants

Note. In the case of linear systems of Toeplitz equations, the idea of transforming a Toeplitz matrix into a coupled Vandermonde matrix was proposed by Heinig [2,3].

2. A coupled Vandermonde matrix based on the symbol of T

The symbol of $T$ is defined as the function
\[
a : \mathbb{C} \to \mathbb{C} : z \mapsto a(z) := \frac{a_{-n+1}}{z^{n-1}} + \cdots + \frac{a_{-1}}{z} + a_0 + a_1 z + \cdots + a_{n-1} z^{n-1}.
\]

Define $\omega_0, \ldots, \omega_{2N-1}$ as the $2N$th roots of unity,
\[
\omega_k := \exp\left(\frac{2\pi i}{2N} k\right), \quad k = 0, 1, \ldots, 2N-1,
\]
and let $V_{2N}$ be the corresponding Vandermonde matrix,
\[
V_{2N} := \begin{bmatrix} 1 & \omega_0 & \cdots & \omega_0^{2N-1} \\ \vdots & \vdots & & \vdots \\ 1 & \omega_{2N-1} & \cdots & \omega_{2N-1}^{2N-1} \end{bmatrix} \in \mathbb{C}^{2N \times 2N}.
\]
Note that $V_{2N}/\sqrt{2N}$ is a unitary matrix. In theorem 2 below we will need an explicit expression for the determinant of $V_{2N}$. The fact that $V_{2N}^H V_{2N} = 2N I_{2N}$ immediately implies that $|\det V_{2N}|^2 = (2N)^{2N}$ and, hence, $|\det V_{2N}| = (2N)^N$. In other words, $\det V_{2N} = \alpha_N (2N)^N$, where $\alpha_N \in \mathbb{C}$ and $|\alpha_N| = 1$. In the next theorem we obtain the value of this constant $\alpha_N$ of modulus one in case $N$ is a power of 2, which is the case that interests us in this paper.

Theorem 1. Suppose $N = 2^p$, where $p \in \mathbb{N} \setminus \{0, 1\}$. Then $\det V_{2N} = i(2N)^N$. Also, $\det V_2 = -2$ and $\det V_4 = -16i$.

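Proof. The $2N$th roots of unity $\omega_0, \omega_1, \ldots, \omega_{2N-1}$ can be grouped into two groups. Define
\[
\omega_k^+ := \omega_{2k} \quad \text{and} \quad \omega_k^- := \omega_{2k+1}
\]
for $k = 0, 1, \ldots, N-1$. Then $\omega_0^+, \omega_1^+, \ldots, \omega_{N-1}^+$ are the $N$th roots of unity and $\omega_0^-, \omega_1^-, \ldots, \omega_{N-1}^-$ are rotated $N$th roots of unity,
\[
\omega_k^- = \eta\, \omega_k^+, \quad k = 0, 1, \ldots, N-1,
\]
where $\eta := \exp(\pi i/N)$. Note that $\eta^N = -1$. Let $V_N \in \mathbb{C}^{N \times N}$ be the Vandermonde matrix based on the $N$th roots of unity,
\[
V_N := \begin{bmatrix} 1 & \omega_0^+ & \cdots & (\omega_0^+)^{N-1} \\ \vdots & \vdots & & \vdots \\ 1 & \omega_{N-1}^+ & \cdots & (\omega_{N-1}^+)^{N-1} \end{bmatrix}.
\]

By rearranging the rows of $V_{2N}$ we obtain the following:
\[
P_{2N} V_{2N} = \begin{bmatrix}
1 & \omega_0^+ & \cdots & (\omega_0^+)^{N-1} & (\omega_0^+)^{N} & \cdots & (\omega_0^+)^{2N-1} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & \omega_{N-1}^+ & \cdots & (\omega_{N-1}^+)^{N-1} & (\omega_{N-1}^+)^{N} & \cdots & (\omega_{N-1}^+)^{2N-1} \\
1 & \omega_0^- & \cdots & (\omega_0^-)^{N-1} & (\omega_0^-)^{N} & \cdots & (\omega_0^-)^{2N-1} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & \omega_{N-1}^- & \cdots & (\omega_{N-1}^-)^{N-1} & (\omega_{N-1}^-)^{N} & \cdots & (\omega_{N-1}^-)^{2N-1}
\end{bmatrix}.
\]
Here $P_{2N} \in \mathbb{R}^{2N \times 2N}$ is the permutation matrix defined by
\[
P_{2N} := \begin{bmatrix} e_1 & e_3 & \cdots & e_{2N-1} & e_2 & e_4 & \cdots & e_{2N} \end{bmatrix}^T,
\]
where $e_j \in \mathbb{R}^{2N}$ denotes the $j$th canonical vector for $j = 1, \ldots, 2N$. Let $D_\eta := \operatorname{diag}(1, \eta, \ldots, \eta^{N-1}) \in \mathbb{C}^{N \times N}$. Since $[\omega_k^+]^N = 1$ and $[\omega_k^-]^N = \eta^N = -1$ for $k = 0, 1, \ldots, N-1$, it follows that
\[
P_{2N} V_{2N} = \begin{bmatrix} V_N & V_N \\ V_N D_\eta & -V_N D_\eta \end{bmatrix}.
\]
The Schur complement formula then implies that
\[
\begin{aligned}
\det P_{2N} \det V_{2N} &= \det V_N \det\left(-V_N D_\eta - V_N D_\eta V_N^{-1} V_N\right) \\
&= \det V_N \det(-2 V_N D_\eta) = (-2)^N \det D_\eta\, [\det V_N]^2 .
\end{aligned}
\]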

Since $\det D_\eta = \exp[(\pi i/N)(1/2)(N-1)N] = i^{N-1}$ and $\det P_{2N} = (-1)^{(1/2)N(N-1)} = i^{N(N-1)}$ we may conclude that
\[
\begin{aligned}
\det V_{2N} &= (-1)^N 2^N i^{N-1} i^{-N(N-1)} [\det V_N]^2 = 2^N i^{2N-(N-1)^2} [\det V_N]^2 \\
&= 2^N i^{4N-(1+N^2)} [\det V_N]^2 = 2^N i^{-(1+N^2)} [\det V_N]^2 .
\end{aligned}
\]
Here we have used the fact that $i^4 = 1$. As $N$ is even, $N^2$ is a multiple of 4 and, hence, $i^{N^2} = 1$. Thus,
\[
\det V_{2N} = 2^N i^3 [\det V_N]^2 .
\]
The fact that
\[
\det V_2 = \det \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = -2 = 2 i^2
\]
implies that $\det V_{2N}$ consists of two factors: a power of 2 and a power of $i$. Let $\varepsilon_{2N}$ denote the power of 2 that appears in $\det V_{2N}$. Then $\varepsilon_2 = 1$ and in general we have the recursion
\[
\varepsilon_{2N} = N + 2\varepsilon_N .
\]
The solution to this difference equation is given by
\[
\varepsilon_{2N} = N + N \log_2 N .
\]
Similarly, if we let $\zeta_{2N}$ denote the power of $i$ that appears in $\det V_{2N}$, then $\zeta_2 = 2$ and
\[
\zeta_{2N} = 3 + 2\zeta_N .
\]
It follows that
\[
\zeta_{2N} = 5N - 3 .
\]
Therefore,
\[
\det V_{2N} = 2^{\varepsilon_{2N}} i^{\zeta_{2N}} = 2^{N + N \log_2 N}\, i^{5N-3} = i^{N+1} (2N)^N .
\]
For $N = 1$ and $N = 2$ we have indeed that $\det V_2 = -2$ and $\det V_4 = -16i$. If $N$ is a power of 2 that is larger than or equal to 4, then $N$ is a multiple of 4 and, hence, $i^N = 1$. This proves the theorem. □
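The closed form $\det V_{2N} = i^{N+1}(2N)^N$ obtained in the proof can be checked numerically for small $N$. The following is only a sketch in pure Python, not the implementation used in this paper: the helper names `det`, `vandermonde_roots_of_unity` and `closed_form` are illustrative, and the determinant is computed by ordinary Gaussian elimination.

```python
import cmath

def det(a):
    """Determinant via Gaussian elimination with partial pivoting (complex entries)."""
    a = [row[:] for row in a]
    n = len(a)
    d = 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0:
            return 0j
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d                      # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            t = a[r][c] / a[c][c]
            for k in range(c + 1, n):
                a[r][k] -= t * a[c][k]
    return d

def vandermonde_roots_of_unity(size):
    """Matrix with (k, l) entry omega_k**l, where omega_k = exp(2*pi*i*k/size)."""
    w = [cmath.exp(2j * cmath.pi * k / size) for k in range(size)]
    return [[w[k] ** l for l in range(size)] for k in range(size)]

def closed_form(N):
    """The value i**(N+1) * (2N)**N derived in the proof of theorem 1."""
    return (1j ** (N + 1)) * (2 * N) ** N

for N in (1, 2, 4, 8):
    print(N, det(vandermonde_roots_of_unity(2 * N)), closed_form(N))
```

For $N = 1$ and $N = 2$ the two columns reproduce $-2$ and $-16i$ up to rounding; for $N \geq 4$ the factor $i^{N+1}$ reduces to $i$.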

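Let us define the coupled Vandermonde matrix $V_C \in \mathbb{C}^{2N \times 2N}$ as follows: the $(k+1)$th row of $V_C$ is given by
\[
\begin{bmatrix} \omega_k^n & \omega_k^{n+1} & \cdots & \omega_k^{2N-1} & a(\omega_k) & \omega_k a(\omega_k) & \cdots & \omega_k^{n-1} a(\omega_k) \end{bmatrix}
\]
for $k = 0, 1, \ldots, 2N-1$. The following theorem shows how $\det T$ can be computed from $\det V_C$.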

Theorem 2. $\det V_C = (-1)^n \det V_{2N} \det T$.

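Proof. We partition the Vandermonde matrix $V_{2N}$ as
\[
V_{2N} =: \begin{bmatrix} V_{2N}^{(n)} & V_{2N}^{(2N-n)} \end{bmatrix},
\]
where $V_{2N}^{(n)} \in \mathbb{C}^{2N \times n}$ and $V_{2N}^{(2N-n)} \in \mathbb{C}^{2N \times (2N-n)}$. Let us consider the rectangular Toeplitz matrix in $\mathbb{C}^{2N \times n}$ whose first row is given by
\[
\begin{bmatrix} a_0 & a_{-1} & \cdots & a_{-n+1} \end{bmatrix}
\]
and whose first column is given by
\[
\begin{bmatrix} a_0 & a_1 & \cdots & a_{n-1} & 0 & \cdots & 0 & a_{-n+1} & \cdots & a_{-2} & a_{-1} \end{bmatrix}^T .
\]
Let us partition this matrix as
\[
\begin{bmatrix} T \\ \widetilde{T} \end{bmatrix},
\]
where $\widetilde{T}$ belongs to $\mathbb{C}^{(2N-n) \times n}$. Define $D_a$ as the diagonal matrix
\[
D_a := \operatorname{diag}\bigl(a(\omega_k)\bigr)_{k=0}^{2N-1} .
\]
Then the following holds:
\[
V_{2N} \begin{bmatrix} T & 0 \\ \widetilde{T} & I_{2N-n} \end{bmatrix} = \begin{bmatrix} D_a V_{2N}^{(n)} & V_{2N}^{(2N-n)} \end{bmatrix} = V_C \begin{bmatrix} 0 & I_{2N-n} \\ I_n & 0 \end{bmatrix} .
\]
One can easily verify that
\[
\det \begin{bmatrix} 0 & I_{2N-n} \\ I_n & 0 \end{bmatrix} = (-1)^{n(2N-n)} \det I_n = (-1)^{n^2} = (-1)^n .
\]
It follows that indeed $\det V_C = (-1)^n \det V_{2N} \det T$. □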

Theorems 1 and 2 imply that the determinant of the Toeplitz matrix $T$ can be computed from the determinant of the coupled Vandermonde matrix $V_C$:
\[
\det T = \frac{1}{i} (-1)^n (2N)^{-N} \det V_C
\]
if $N = 2^p$, where $p \in \mathbb{N} \setminus \{0, 1\}$. It remains to compute $\det V_C$ in an efficient and accurate way. In the next section we will exploit the connection between coupled Vandermonde matrices and linearized rational interpolation problems. We will show how the determinant of a coupled Vandermonde matrix can be computed by multiplying the pivots that appear in the superfast interpolation algorithm that we presented in [5,8].
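As an illustration of theorem 2, the sketch below builds $V_C$ from the symbol of a small random Toeplitz matrix and recovers $\det T$ from $\det V_C$. It is a plain $O(N^3)$ check, not the superfast method of this paper. The function names and the storage convention `coeffs[j + n - 1]` for $a_j$ are assumptions made for this example. Instead of the formula for $N \geq 4$ it uses the value $\det V_{2N} = i^{N+1}(2N)^N$ from the proof of theorem 1, which also covers $N = 1$ and $N = 2$.

```python
import cmath
import random

def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, d = len(a), 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0:
            return 0j
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            t = a[r][c] / a[c][c]
            for k in range(c + 1, n):
                a[r][k] -= t * a[c][k]
    return d

def symbol(coeffs, n, z):
    """a(z) = sum of a_j z**j for j = -(n-1), ..., n-1; coeffs[j + n - 1] holds a_j."""
    return sum(coeffs[j + n - 1] * z ** j for j in range(-(n - 1), n))

def toeplitz(coeffs, n):
    """T = [a_{k-l}] for k, l = 0, ..., n-1."""
    return [[coeffs[(k - l) + n - 1] for l in range(n)] for k in range(n)]

def coupled_vandermonde(coeffs, n, N):
    """Row k+1: omega_k**n, ..., omega_k**(2N-1), a(omega_k), ..., omega_k**(n-1) a(omega_k)."""
    rows = []
    for k in range(2 * N):
        w = cmath.exp(2j * cmath.pi * k / (2 * N))
        a = symbol(coeffs, n, w)
        rows.append([w ** p for p in range(n, 2 * N)] + [w ** p * a for p in range(n)])
    return rows

def det_toeplitz_via_vc(coeffs, n):
    """det T = (-1)**n det V_C / det V_2N, with det V_2N = i**(N+1) (2N)**N."""
    N = 1
    while N < n:            # smallest power of 2 that is >= n
        N *= 2
    det_v2n = (1j ** (N + 1)) * (2 * N) ** N
    return (-1) ** n * det(coupled_vandermonde(coeffs, n, N)) / det_v2n

random.seed(0)
n = 3
coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(2 * n - 1)]
print(det(toeplitz(coeffs, n)), det_toeplitz_via_vc(coeffs, n))
```

The two printed values should agree up to rounding errors.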

3. Superfast rational interpolation

Instead of the coupled Vandermonde matrix $V_C$ that we have defined in the previous section, we will now consider a more general coupled Vandermonde matrix in $\mathbb{C}^{2N \times 2N}$, which we will also denote by $V_C$. Let $m := 2N - n$ and suppose that the complex numbers $s_k$, $e_k$ and $f_k$ are given for $k = 1, \ldots, m+n$. We assume that the $s_k$'s are mutually distinct. Suppose that the $k$th row of $V_C$ is given by
\[
\begin{bmatrix} e_k & e_k s_k & \cdots & e_k s_k^{m-1} & f_k & f_k s_k & \cdots & f_k s_k^{n-1} \end{bmatrix}
\]
for $k = 1, \ldots, m+n$. Observe that by setting
\[
s_k := \omega_{k-1}, \quad e_k := \omega_{k-1}^n \quad \text{and} \quad f_k := a(\omega_{k-1})
\]
for $k = 1, \ldots, m+n = 2N$ we obtain the coupled Vandermonde matrix defined in the previous section. Without loss of generality we may assume that $m \geq n$. Indeed, if $m$ happens to be less than $n$, then we can swap the left and right block of $V_C$, an operation that leads only to a possible sign change of $\det V_C$.

Let us introduce the following notation. Let $g = [g_k]_{k=1}^{\gamma} \in \mathbb{C}^{\gamma}$ be a column vector of length $\gamma$ for some positive integer $\gamma$. Then $\Lambda(g)$ denotes the diagonal matrix
\[
\Lambda(g) := \operatorname{diag}(g_1, \ldots, g_\gamma) \in \mathbb{C}^{\gamma \times \gamma} .
\]
Also, let $g_{\alpha:\beta} := [g_k]_{k=\alpha}^{\beta} \in \mathbb{C}^{\beta-\alpha+1}$ for some integers $\alpha$ and $\beta$, $1 \leq \alpha \leq \beta \leq \gamma$. Define the column vectors $s, e, f \in \mathbb{C}^{m+n}$ as
\[
s := \begin{bmatrix} s_1 \\ \vdots \\ s_{m+n} \end{bmatrix}, \quad e := \begin{bmatrix} e_1 \\ \vdots \\ e_{m+n} \end{bmatrix} \quad \text{and} \quad f := \begin{bmatrix} f_1 \\ \vdots \\ f_{m+n} \end{bmatrix}
\]
and let $V_{:,1:l}(s)$ denote the first $l$ columns of the Vandermonde matrix based on $s$,
\[
V_{:,1:l}(s) := \begin{bmatrix} 1 & s_1 & \cdots & s_1^{l-1} \\ \vdots & \vdots & & \vdots \\ 1 & s_{m+n} & \cdots & s_{m+n}^{l-1} \end{bmatrix} \in \mathbb{C}^{(m+n) \times l}
\]
for some integer $l \in \{1, \ldots, m+n\}$. Then the coupled Vandermonde matrix $V_C$ can be written as
\[
V_C = \begin{bmatrix} \Lambda(e) V_{:,1:m}(s) & \Lambda(f) V_{:,1:n}(s) \end{bmatrix} .
\]

The numerical stability of the algorithm for computing $\det V_C$ that we are going to derive is enhanced via pivoting. The algorithm uses a certain criterion to change the order of the components of $s$, which leads to a corresponding change in $e$ and $f$. This form of pivoting corresponds to a permutation of the rows of $V_C$ and hence can only change the sign of the determinant of $V_C$. It holds that
\[
V_C' := P V_C = \begin{bmatrix} \Lambda(e') V_{:,1:m}(s') & \Lambda(f') V_{:,1:n}(s') \end{bmatrix},
\]
where $P$ is the permutation matrix, $s' := Ps$, $e' := Pe$ and $f' := Pf$ are the permuted $s$, $e$ and $f$ vectors. To simplify the notation we will henceforth omit the primes. In other words, we will identify $s'$ with $s$, $e'$ with $e$, etc.

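Let us partition the matrices $V_{:,1:m}(s)$ and $V_{:,1:n}(s)$ as follows:
\[
V_{:,1:m}(s) = \begin{bmatrix} V_{1,1} & V_{1,2} \\ V_{2,1} & V_{2,2} \end{bmatrix} \quad \text{and} \quad V_{:,1:n}(s) = \begin{bmatrix} V_1 \\ V_2 \end{bmatrix},
\]
where the row size of $V_{1,1}$, $V_{1,2}$ and $V_1$ is equal to $m-n$, the row size of $V_{2,1}$, $V_{2,2}$ and $V_2$ is equal to $2n$, the column size of $V_{1,1}$ and $V_{2,1}$ is equal to $m-n$ and the column size of $V_{1,2}$, $V_{2,2}$, $V_1$ and $V_2$ is equal to $n$. Then $V_C$ can be written as
\[
V_C = \begin{bmatrix} \Lambda(e_{1:m-n}) V_{1,1} & \Lambda(e_{1:m-n}) V_{1,2} & \Lambda(f_{1:m-n}) V_1 \\ \Lambda(e_{m-n+1:m+n}) V_{2,1} & \Lambda(e_{m-n+1:m+n}) V_{2,2} & \Lambda(f_{m-n+1:m+n}) V_2 \end{bmatrix} .
\]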

Our algorithm for computing the determinant of VC will emerge from the constructive proof of the following equation.

Theorem 3.
\[
V_C \begin{bmatrix} N_{1,1} & N_{1,2} \\ 0 & N_{2,2} \\ 0 & D \end{bmatrix} = \begin{bmatrix} L_1 & 0 \\ X & L_2 \end{bmatrix}, \tag{1}
\]
where the row size of $N_{1,1}$, $N_{1,2}$ and $L_1$ is equal to $m-n$, the row size of $N_{2,2}$ and $D$ is equal to $n$, the row size of $X$ and $L_2$ is equal to $2n$, the column size of $N_{1,1}$, $L_1$ and $X$ is equal to $m-n$ and the column size of $N_{1,2}$, $N_{2,2}$, $D$ and $L_2$ is equal to $2n$. The matrix $N_{1,1}$ has to be an upper triangular matrix containing ones on the main diagonal, the matrices $N_{2,2}$ and $D$ have to be block upper triangular matrices whose blocks have size $1 \times 2$, the matrix $L_1$ has to be a lower triangular matrix and the matrix $L_2$ has to be a block lower triangular matrix whose blocks have size $2 \times 2$.

From the proof we will obtain an algorithm for constructing these matrices. If the 1 × 2 diagonal blocks of N2,2 and D are combined into square blocks, then the determinant of these blocks has to be equal to one. Hence,

\[
\det V_C = \pm \det L_1 \det L_2, \tag{2}
\]
where the possible sign change is equal to the determinant of
\[
\begin{bmatrix} N_{2,2} \\ D \end{bmatrix}
\]
and corresponds to the permutation needed to combine the diagonal blocks of $N_{2,2}$ and $D$.

Proof. Let us translate (1) into interpolation terms. We consider the $j$th column of the matrix $N_{1,1}$ as the stacking vector of a polynomial $n_{1,1}^{(j)}(z)$ for $j = 1, \ldots, m-n$. These polynomials have to satisfy the following properties. Let $j \in \{1, \ldots, m-n\}$. Then

  $n_{1,1}^{(j)}(z)$ is a monic polynomial of degree $j-1$,
  $e_k n_{1,1}^{(j)}(s_k) = 0$ for $k = 1, \ldots, j-1$.

The latter implies that $n_{1,1}^{(j)}(s_k) = 0$ for each $e_k \neq 0$, $k \in \{1, \ldots, j-1\}$. Without loss of generality we may assume that $e_k \neq 0$ for $k = 1, \ldots, m-n-1$. Indeed, suppose that there exists no permutation of the $e_k$'s such that the first $m-n-1$ ones are different from zero. Then at least $m+n-(m-n-2) = 2n+2$ $e_k$'s are equal to zero. However, as $2n+2 > n$ it follows that in this case $\det V_C = 0$, which we exclude. Thus, $n_{1,1}^{(j)}(s_k) = 0$ for $j = 1, \ldots, m-n$ and $k = 1, \ldots, j-1$. In other words,
\[
n_{1,1}^{(j)}(z) = \prod_{k=1}^{j-1} (z - s_k)
\]
for $j = 1, \ldots, m-n$. Note that the diagonal entries of the lower triangular matrix $L_1$ are equal to
\[
e_j n_{1,1}^{(j)}(s_j) = e_j \prod_{k=1}^{j-1} (s_j - s_k)
\]
for $j = 1, \ldots, m-n$. Hence,
\[
\det L_1 = \prod_{j=1}^{m-n} e_j \prod_{k=1}^{j-1} (s_j - s_k) = e_1 \cdots e_{m-n} \det V(s_1, \ldots, s_{m-n}), \tag{3}
\]
where $V(s_1, \ldots, s_{m-n})$ denotes the Vandermonde matrix with nodes $s_1, \ldots, s_{m-n}$.

We consider the $j$th block column (of size $m \times 2$) of
\[
\begin{bmatrix} N_{1,2} \\ N_{2,2} \end{bmatrix}
\]
as the stacking vector of a row polynomial vector $n^{(j)}(z) \in \mathbb{C}[z]^{1 \times 2}$ for $j = 1, \ldots, n$. Similarly, we consider the $j$th block column (of size $n \times 2$) of $D$ as the stacking vector of a row polynomial vector $d^{(j)}(z) \in \mathbb{C}[z]^{1 \times 2}$ for $j = 1, \ldots, n$. Then $\deg n^{(j)}(z)$ […]
\[
e_k n^{(j)}(s_k) + f_k d^{(j)}(s_k) = 0 \quad \text{for } k = 1, \ldots, m-n+2(j-1). \tag{4}
\]
If we define the $2 \times 2$ matrix polynomial $B_j(z)$ as
\[
B_j(z) := \begin{bmatrix} n^{(j)}(z) \\ d^{(j)}(z) \end{bmatrix} \in \mathbb{C}[z]^{2 \times 2},
\]
then we can write the last property as
\[
\begin{bmatrix} e_k & f_k \end{bmatrix} B_j(s_k) = \begin{bmatrix} 0 & 0 \end{bmatrix}, \quad k = 1, \ldots, m-n+2(j-1).
\]
In other words, $B_j(z)$ has to satisfy certain interpolation conditions, in addition to certain degree conditions and a determinant condition. □

Let us summarize what we have obtained so far. The determinant of the coupled Vandermonde matrix $V_C$ can be computed from the product of $\det L_1$ and $\det L_2$, cf. equation (2). We have already obtained an explicit expression for $\det L_1$, cf. equation (3). The matrix $L_2$ is a block lower triangular matrix (whose blocks have size $2 \times 2$) and, hence, its determinant is equal to the product of the determinants of the diagonal blocks. We have seen that this matrix is related to linearized rational interpolation problems. We will now show how the matrix polynomials $B_j(z)$ can be computed. The $2 \times 2$ diagonal blocks of $L_2$ are obtained as a by-product of our interpolation algorithm.

Let $b_1(z)$ be the interpolating polynomial of degree $\leq m-n-1$ that satisfies $b_1(s_k) = -f_k/e_k$ for $k = 1, \ldots, m-n$. Define $B_1(z)$ as
\[
B_1(z) := \begin{bmatrix} \prod_{k=1}^{m-n} (z - s_k) & b_1(z) \\ 0 & 1 \end{bmatrix} . \tag{5}
\]
Clearly $B_1(z)$ satisfies (4) for $j = 1$. We will use the following recurrence relation to compute the $2 \times 2$ matrix polynomials $B_2(z), \ldots, B_n(z)$:
\[
B_{j+1}(z) = B_j(z) B_{j \to j+1}(z), \quad j = 1, \ldots, n-1.
\]

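We will determine the $2 \times 2$ matrix polynomials $B_{j \to j+1}(z)$ such that
\[
\deg B_{j \to j+1}(z) = 1 \quad \text{and} \quad \det \operatorname{hdc} B_{j \to j+1}(z) = 1 \tag{6}
\]
for $j = 1, \ldots, n-1$. Here "hdc" denotes the highest degree coefficient. This already guarantees that $B_2(z), \ldots, B_n(z)$ satisfy the first three conditions of (4). The degrees of freedom that remain in fixing $B_{j \to j+1}(z)$ are used to satisfy the interpolation condition of (4), i.e.,
\[
\begin{bmatrix} e_k & f_k \end{bmatrix} B_j(s_k) B_{j \to j+1}(s_k) = \begin{bmatrix} 0 & 0 \end{bmatrix} \quad \text{for } k = 1, \ldots, m-n+2j.
\]
Since
\[
\begin{bmatrix} e_k & f_k \end{bmatrix} B_j(s_k) = \begin{bmatrix} 0 & 0 \end{bmatrix}
\]
for $k = 1, \ldots, m-n+2(j-1)$, we have to determine $B_{j \to j+1}(z)$ such that
\[
\begin{bmatrix} e^\star & f^\star \end{bmatrix} B_j(s^\star) B_{j \to j+1}(s^\star) = \begin{bmatrix} 0 & 0 \end{bmatrix}
\]
and
\[
\begin{bmatrix} e^{\star\star} & f^{\star\star} \end{bmatrix} B_j(s^{\star\star}) B_{j \to j+1}(s^{\star\star}) = \begin{bmatrix} 0 & 0 \end{bmatrix},
\]
where $s^\star := s_{m-n+2j-1}$, $s^{\star\star} := s_{m-n+2j}$ and similarly for $e^\star$, $e^{\star\star}$, $f^\star$ and $f^{\star\star}$. If we define
\[
\begin{bmatrix} l^\star & r^\star \end{bmatrix} := \begin{bmatrix} e^\star & f^\star \end{bmatrix} B_j(s^\star)
\]
and
\[
\begin{bmatrix} l^{\star\star} & r^{\star\star} \end{bmatrix} := \begin{bmatrix} e^{\star\star} & f^{\star\star} \end{bmatrix} B_j(s^{\star\star}),
\]
then $B_{j \to j+1}(z)$ has to satisfy
\[
\begin{aligned}
\begin{bmatrix} l^\star & r^\star \end{bmatrix} B_{j \to j+1}(s^\star) &= \begin{bmatrix} 0 & 0 \end{bmatrix}, \\
\begin{bmatrix} l^{\star\star} & r^{\star\star} \end{bmatrix} B_{j \to j+1}(s^{\star\star}) &= \begin{bmatrix} 0 & 0 \end{bmatrix}.
\end{aligned} \tag{7}
\]
Note that the $2 \times 2$ diagonal block of $L_2$ that corresponds to $B_j(z)$ is equal to
\[
\begin{bmatrix} l^\star & r^\star \\ l^{\star\star} & r^{\star\star} \end{bmatrix} . \tag{8}
\]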

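Define the matrix polynomials $B_{j \to j+1}^{(L)}(z)$ and $B_{j \to j+1}^{(R)}(z)$ as
\[
B_{j \to j+1}^{(L)}(z) := \begin{bmatrix} z - s_L & \alpha_L \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad B_{j \to j+1}^{(R)}(z) := \begin{bmatrix} 1 & 0 \\ \alpha_R & z - s_R \end{bmatrix},
\]
where $\alpha_L, \alpha_R, s_L, s_R \in \mathbb{C}$, $s_L \neq s_R$. We may assume that the $2 \times 2$ matrix (8) is nonsingular. Indeed, otherwise (2) implies that $\det V_C = 0$. Thus, it is impossible that both $l^\star$ and $r^\star$ are equal to zero. Suppose that $l^\star \neq 0$. Then let $s_L := s^\star$ and $\alpha_L := -r^\star/l^\star$. It follows that
\[
\begin{bmatrix} l^\star & r^\star \end{bmatrix} B_{j \to j+1}^{(L)}(s^\star) = \begin{bmatrix} 0 & 0 \end{bmatrix} .
\]
We call the construction of $B_{j \to j+1}^{(L)}(z)$ a left step towards $B_{j \to j+1}(z)$. We have that
\[
\begin{bmatrix} l^{\star\star} & r^{\star\star} \end{bmatrix} B_{j \to j+1}^{(L)}(s^{\star\star}) = \begin{bmatrix} l^{\star\star}(s^{\star\star} - s^\star) & l^{\star\star}\left(-\dfrac{r^\star}{l^\star}\right) + r^{\star\star} \end{bmatrix} =: \begin{bmatrix} \tilde{l}^{\star\star} & \tilde{r}^{\star\star} \end{bmatrix} .
\]
Since
\[
l^\star \tilde{r}^{\star\star} = \det \begin{bmatrix} l^\star & r^\star \\ l^{\star\star} & r^{\star\star} \end{bmatrix}, \tag{9}
\]
it holds that $\tilde{r}^{\star\star} \neq 0$. Now let $s_R := s^{\star\star}$ and $\alpha_R := -\tilde{l}^{\star\star}/\tilde{r}^{\star\star}$. Then
\[
\begin{bmatrix} \tilde{l}^{\star\star} & \tilde{r}^{\star\star} \end{bmatrix} B_{j \to j+1}^{(R)}(s^{\star\star}) = \begin{bmatrix} 0 & 0 \end{bmatrix} .
\]
The construction of $B_{j \to j+1}^{(R)}(z)$ is called a right step towards $B_{j \to j+1}(z)$. We combine both steps to obtain $B_{j \to j+1}(z)$:
\[
B_{j \to j+1}(z) = B_{j \to j+1}^{(L)}(z) B_{j \to j+1}^{(R)}(z) = \begin{bmatrix} z + (\alpha_L \alpha_R - s_L) & \alpha_L (z - s_R) \\ \alpha_R & z - s_R \end{bmatrix} .
\]
This matrix polynomial satisfies indeed the degree and determinant condition (6) and the interpolation condition (7).

Similarly, in case $r^\star \neq 0$, one can construct $B_{j \to j+1}(z)$ as a right step followed by a left step.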
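A left step followed by a right step can be tried out on concrete numbers. The sketch below is illustrative only: the function names and the sample values are assumptions, with `l1, r1, s1` standing for $l^\star, r^\star, s^\star$ and `l2, r2, s2` for $l^{\star\star}, r^{\star\star}, s^{\star\star}$. It checks the interpolation conditions (7) and the pivot identity (9).

```python
def left_right_step(l1, r1, s1, l2, r2, s2):
    """Left step at s1 followed by a right step at s2.

    Assumes l1 != 0 and a nonsingular block (8).
    Returns (alpha_L, alpha_R, l1 * r2_tilde).
    """
    alpha_L = -r1 / l1
    l2_tilde = l2 * (s2 - s1)           # second residual after the left step
    r2_tilde = l2 * alpha_L + r2
    alpha_R = -l2_tilde / r2_tilde
    return alpha_L, alpha_R, l1 * r2_tilde

def B_step(z, s_L, alpha_L, s_R, alpha_R):
    """Value at z of the product of the left-step and right-step factors."""
    return [[z - s_L + alpha_L * alpha_R, alpha_L * (z - s_R)],
            [alpha_R, z - s_R]]

def row_times(row, M):
    """Row vector (length 2) times a 2 x 2 matrix."""
    return [row[0] * M[0][0] + row[1] * M[1][0],
            row[0] * M[0][1] + row[1] * M[1][1]]

# Sample data (arbitrary values chosen for this illustration)
l1, r1, s1 = 2 + 1j, -1 + 0.5j, 1 + 0j
l2, r2, s2 = 0.5 - 1j, 3 + 0j, 1j

alpha_L, alpha_R, pivot_product = left_right_step(l1, r1, s1, l2, r2, s2)
print(row_times([l1, r1], B_step(s1, s1, alpha_L, s2, alpha_R)))   # both entries ~ 0
print(row_times([l2, r2], B_step(s2, s1, alpha_L, s2, alpha_R)))   # both entries ~ 0
print(pivot_product, l1 * r2 - l2 * r1)                            # equal, cf. (9)
```

The product `pivot_product` is the contribution of this $2 \times 2$ block to $\det L_2$.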

3.1. Fast algorithms for computing the matrix polynomial Bn(z)

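We can now formulate the following algorithm for computing $B_n(z)$. One can easily check that it requires $O((m+n)^2)$ floating point operations.

  Compute $B_1(z)$ via (5).
  for $j = 1, \ldots, n-1$ do
    $s^\star \leftarrow s_{m-n+2j-1}$; $s^{\star\star} \leftarrow s_{m-n+2j}$
    $e^\star \leftarrow e_{m-n+2j-1}$; $e^{\star\star} \leftarrow e_{m-n+2j}$
    $f^\star \leftarrow f_{m-n+2j-1}$; $f^{\star\star} \leftarrow f_{m-n+2j}$
    $[\, l^\star \ \ r^\star \,] \leftarrow [\, e^\star \ \ f^\star \,] B_j(s^\star)$
    $[\, l^{\star\star} \ \ r^{\star\star} \,] \leftarrow [\, e^{\star\star} \ \ f^{\star\star} \,] B_j(s^{\star\star})$
    if left step followed by right step then
      $B_{j+1}(z) \leftarrow B_j(z) \cdot B_{j \to j+1}^{(L)}(z) \cdot B_{j \to j+1}^{(R)}(z)$
    else
      $B_{j+1}(z) \leftarrow B_j(z) \cdot B_{j \to j+1}^{(R)}(z) \cdot B_{j \to j+1}^{(L)}(z)$
    end if
  end for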

The algorithm computes the values $l^\star$, $r^\star$, $l^{\star\star}$ and $r^{\star\star}$ only at the time when they are needed to continue the computations. To compute these values, the algorithm needs the value that the matrix polynomial $B_j(z)$ takes at the points $s^\star$ and $s^{\star\star}$. Our algorithm can be categorized as a Levinson-like algorithm. Note that a pivoting strategy cannot be based on the values of $[\, e_k \ \ f_k \,] B_j(s_k)$ in the interpolation points $s_k$ that have not yet been considered, as these values are not available. A Schur-like algorithm can overcome this problem.

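  Compute $B_1(z)$ via (5).
  for $k = 1, \ldots, 2n$ do
    $[\, l_k \ \ r_k \,] \leftarrow [\, e_{m-n+k} \ \ f_{m-n+k} \,] B_1(s_{m-n+k})$
  end for
  for $j = 1, \ldots, n-1$ do
    Apply pivoting on $[\, s_{m-n+k} \ \ l_k \ \ r_k \,]_{k=2(j-1)+1}^{2n}$.
    if left step followed by right step then
      for $k = 2j+1, \ldots, 2n$ do
        $[\, l_k \ \ r_k \,] \leftarrow [\, l_k \ \ r_k \,] \begin{bmatrix} s_{m-n+k} - s^\star & \alpha_L \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \alpha_R & s_{m-n+k} - s^{\star\star} \end{bmatrix}$
      end for
      $B_{j+1}(z) \leftarrow B_j(z) \cdot B_{j \to j+1}^{(L)}(z) \cdot B_{j \to j+1}^{(R)}(z)$
    else
      for $k = 2j+1, \ldots, 2n$ do
        $[\, l_k \ \ r_k \,] \leftarrow [\, l_k \ \ r_k \,] \begin{bmatrix} 1 & 0 \\ \alpha_R & s_{m-n+k} - s^\star \end{bmatrix} \begin{bmatrix} s_{m-n+k} - s^{\star\star} & \alpha_L \\ 0 & 1 \end{bmatrix}$
      end for
      $B_{j+1}(z) \leftarrow B_j(z) \cdot B_{j \to j+1}^{(R)}(z) \cdot B_{j \to j+1}^{(L)}(z)$
    end if
  end for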

Note that the explicit computation of $B_{j+1}(z)$ can, in fact, be skipped since we only need some of the values of $[\, l_k \ \ r_k \,]$ to compute the determinant of $L_2$. Observe that the determinant of $L_2$ can be computed from the products of $l^\star$ and $\tilde{r}^{\star\star}$, cf. equation (9).
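The Schur-like recurrences can be turned into a short $O((m+n)^2)$ routine that multiplies the pivots without ever forming $B_j(z)$. The sketch below is a simplified illustration under stated assumptions. It applies no pivoting and always uses a left step followed by a right step, so it assumes that every pivot is nonzero. It obtains the pivots of $L_1$ from $m-n$ left steps applied to the residuals $[\, e_k \ \ f_k \,]$, and it returns the product of pivots, which by (2) equals $\det V_C$ only up to sign. The function names are hypothetical.

```python
import cmath
import random

def det(a):
    """Reference determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, d = len(a), 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0:
            return 0j
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            t = a[r][c] / a[c][c]
            for k in range(c + 1, n):
                a[r][k] -= t * a[c][k]
    return d

def coupled_vandermonde(s, e, f, m, n):
    """Row k: e_k, e_k s_k, ..., e_k s_k**(m-1), f_k, f_k s_k, ..., f_k s_k**(n-1)."""
    return [[e[k] * s[k] ** p for p in range(m)] + [f[k] * s[k] ** p for p in range(n)]
            for k in range(m + n)]

def pivot_product(s, e, f, m, n):
    """Product det L1 * det L2 of the pivots (no pivoting, m >= n assumed)."""
    l, r = list(e), list(f)        # residuals [l_k r_k]; initially B = I
    total = m + n
    prod = 1 + 0j
    for j in range(m - n):         # m - n left steps: pivots of L1
        prod *= l[j]
        alpha = -r[j] / l[j]
        for k in range(j + 1, total):
            l[k], r[k] = l[k] * (s[k] - s[j]), l[k] * alpha + r[k]
    for j in range(n):             # n blocks of L2: left step, then right step
        p = m - n + 2 * j
        q = p + 1
        alpha_L = -r[p] / l[p]
        l_t = l[q] * (s[q] - s[p])
        r_t = l[q] * alpha_L + r[q]
        prod *= l[p] * r_t         # determinant of the 2 x 2 block, cf. (9)
        alpha_R = -l_t / r_t
        for k in range(q + 1, total):
            lk = l[k] * (s[k] - s[p])
            rk = l[k] * alpha_L + r[k]
            l[k], r[k] = lk + rk * alpha_R, rk * (s[k] - s[q])
    return prod

random.seed(0)
m, n = 5, 3
s = [cmath.exp(2j * cmath.pi * k / (m + n)) for k in range(m + n)]
e = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(m + n)]
f = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(m + n)]
print(abs(det(coupled_vandermonde(s, e, f, m, n))), abs(pivot_product(s, e, f, m, n)))
```

The two moduli should agree up to rounding. A practical implementation would add the pivoting described above and keep track of the sign.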

3.2. Superfast algorithms for computing the matrix polynomial Bn(z)

Superfast algorithms for computing $B_n(z)$ are based on a divide-and-conquer strategy. Let us start by considering the case that the matrix polynomial $B_1(z)$ is given and that $n$ is even. Then $B_n(z)$ can be computed as
\[
B_n(z) \leftarrow B_1(z) B_{1 \to n/2}(z) B_{n/2 \to n}(z),
\]
where
\[
\begin{aligned}
\deg B_{1 \to n/2}(z) &= \frac{n}{2} \quad \text{and} \quad \det \operatorname{hdc} B_{1 \to n/2}(z) = 1, \\
\deg B_{n/2 \to n}(z) &= \frac{n}{2} \quad \text{and} \quad \det \operatorname{hdc} B_{n/2 \to n}(z) = 1,
\end{aligned}
\]
and the following interpolation conditions are satisfied:
\[
\begin{bmatrix} e_k & f_k \end{bmatrix} B_1(s_k) B_{1 \to n/2}(s_k) = \begin{bmatrix} 0 & 0 \end{bmatrix}
\]
for $k = (m-n)+1, \ldots, (m-n)+n$ and
\[
\begin{bmatrix} e_k & f_k \end{bmatrix} B_1(s_k) B_{1 \to n/2}(s_k) B_{n/2 \to n}(s_k) = \begin{bmatrix} 0 & 0 \end{bmatrix}
\]
for $k = (m-n)+n+1, \ldots, (m-n)+2n$. If $n$ is a power of 2, then these considerations lead to the following recursive algorithm:

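  $B_n(z) \leftarrow$ compute_B_power_of_2 $(B_1(z), s, e, f, m, n)$
  – Let $\tau_{\text{fast}}$ be an integer $\geq 1$.
  if $2n \leq 2^{\tau_{\text{fast}}}$ then
    compute $B_n(z)$ using a fast method and return
  end if
  for $k = (m-n)+1, \ldots, (m-n)+n$ do
    $[\, l^{(1)}_{k-(m-n)} \ \ r^{(1)}_{k-(m-n)} \,] \leftarrow [\, e_k \ \ f_k \,] B_1(s_k)$   ... [A]
  end for
  $s^{(1)} \leftarrow s_{(m-n)+1:(m-n)+n}$
  $l^{(1)} \leftarrow l^{(1)}_{1:n}$; $r^{(1)} \leftarrow r^{(1)}_{1:n}$
  $B_{1 \to n/2}(z) \leftarrow$ compute_B_power_of_2 $(I_2, s^{(1)}, l^{(1)}, r^{(1)}, n/2, n/2)$
  $B_{n/2}(z) \leftarrow B_1(z) \cdot B_{1 \to n/2}(z)$   ... [B]
  for $k = (m-n)+n+1, \ldots, (m-n)+2n$ do
    $[\, l^{(2)}_{k-m} \ \ r^{(2)}_{k-m} \,] \leftarrow [\, e_k \ \ f_k \,] B_{n/2}(s_k)$   ... [A′]
  end for
  $s^{(2)} \leftarrow s_{m+1:m+n}$
  $l^{(2)} \leftarrow l^{(2)}_{1:n}$; $r^{(2)} \leftarrow r^{(2)}_{1:n}$
  $B_{n/2 \to n}(z) \leftarrow$ compute_B_power_of_2 $(I_2, s^{(2)}, l^{(2)}, r^{(2)}, n/2, n/2)$
  $B_n(z) \leftarrow B_{n/2}(z) \cdot B_{n/2 \to n}(z)$   ... [B′]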

The matrix polynomial multiplications in steps [B] and [B′] can be done via the celebrated fast Fourier transform (see, for example, [1]). In this case, these steps require $O((m - n/2) \log(m - n/2))$ and $O(m \log m)$ floating point operations, respectively. If the interpolation points $s_k$ are the $(m+n)$th roots of unity, then these points can be permuted such that steps [A] and [A′] can also be done via FFT. The algorithm needs to be adapted only slightly. We will return to this problem in a moment. Let us first concentrate on how to compute $B_1(z)$. After all, compute_B_power_of_2 assumes that $B_1(z)$ is already available.

We are going to compute the matrix polynomial $B_1(z)$ as the last element in the sequence $B_1^{(0)}(z), B_1^{(1)}(z), \ldots, B_1^{(m-n)}(z)$, where
\[
B_1^{(0)}(z) := \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
and
\[
B_1^{(j)}(z) := \begin{bmatrix} \prod_{k=1}^{j} (z - s_k) & \beta_j(z) \\ 0 & 1 \end{bmatrix}, \quad j = 1, \ldots, m-n,
\]
where $\beta_j(z)$ denotes the interpolating polynomial of degree $\leq j-1$ that satisfies $\beta_j(s_k) = -f_k/e_k$ for $k = 1, \ldots, j$. Clearly, $B_1(z) = B_1^{(m-n)}(z)$, cf. equation (5). One can easily verify that the matrix polynomials $B_1^{(j)}(z)$, $j = 1, \ldots, m-n$, can be computed in a fast way via the following Schur-type algorithm:

  $B_1^{(0)}(z) \leftarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
  for $j = 1, \ldots, m-n$ do
    $\alpha_L \leftarrow -f_j/e_j$
    for $k = j+1, \ldots, m-n$ do
      $[\, e_k \ \ f_k \,] \leftarrow [\, e_k (s_k - s_j) \ \ \ e_k \alpha_L + f_k \,]$
    end for
    $B_1^{(j)}(z) \leftarrow B_1^{(j-1)}(z) \begin{bmatrix} z - s_j & \alpha_L \\ 0 & 1 \end{bmatrix}$
  end for
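The Schur-type recurrence above can be illustrated with polynomials stored as coefficient lists. This is a sketch under assumptions: the names `compute_B1` and `poly_eval` and the storage order (lowest degree first) are choices made for this example, and all $e_j$ encountered are assumed to be nonzero. Only the first row of $B_1^{(j)}(z)$ is tracked, since the second row stays $[\, 0 \ \ 1 \,]$.

```python
def poly_eval(c, z):
    """Horner evaluation; c[i] is the coefficient of z**i."""
    v = 0j
    for a in reversed(c):
        v = v * z + a
    return v

def compute_B1(s, e, f, count):
    """Apply `count` left steps; return (p, beta), the (1,1) and (1,2) entries of B_1."""
    e, f = list(e), list(f)             # working copies of the residuals
    p, beta = [1 + 0j], [0j]            # B_1^(0) = I: p = 1, beta = 0
    for j in range(count):
        alpha = -f[j] / e[j]
        for k in range(j + 1, count):
            e[k], f[k] = e[k] * (s[k] - s[j]), e[k] * alpha + f[k]
        # first row of B times [[z - s_j, alpha], [0, 1]]:
        # beta <- alpha * p + beta, then p <- p * (z - s_j)
        beta = [(beta[i] if i < len(beta) else 0j) + alpha * p[i] for i in range(len(p))]
        p = [(p[i - 1] if i > 0 else 0j) - s[j] * (p[i] if i < len(p) else 0j)
             for i in range(len(p) + 1)]
    return p, beta

# Sample data (arbitrary values chosen for this illustration)
s = [1 + 0j, 1j, -1 + 0j, -1j]
e = [1 + 1j, 2 + 0j, 0.5 - 1j, 1 - 2j]
f = [0.5 + 0j, -1 + 1j, 2 + 2j, 1j]
p, beta = compute_B1(s, e, f, 4)
for k in range(4):
    print(abs(poly_eval(p, s[k])), abs(poly_eval(beta, s[k]) + f[k] / e[k]))  # both ~ 0
```

Each printed pair confirms that the $(1,1)$ entry vanishes at $s_k$ and that the $(1,2)$ entry interpolates $-f_k/e_k$ there.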

Note that this algorithm corresponds to the Schur-type algorithm that we have presented earlier. However, only left steps are used. Otherwise, we will not obtain the required degree structure. If $m-n$ is a power of 2, then a divide-and-conquer strategy similar to the one presented above can be used to obtain a superfast algorithm for computing $B_1(z)$.

Let us combine what we have obtained so far. The matrix polynomial $B_1(z)$ is computed as the last element in the sequence $B_1^{(0)}(z), B_1^{(1)}(z), \ldots, B_1^{(m-n)}(z)$. Our algorithm takes into account the first $m-n$ interpolation conditions. If $m-n$ is a power of 2, then the computations can be done in a superfast way. Given $B_1(z)$, the matrix polynomial $B_n(z)$ is computed as the last element in the sequence $B_1(z), B_2(z), \ldots, B_n(z)$. Our algorithm takes into account the remaining $2n$ interpolation conditions. If $2n$ is a power of 2, then the computations can be done in a superfast way. Of course, we do not know if $m-n$ and $2n$ are a power of 2. We only know that $m+n = 2N$ is a power of 2. However, by combining the two sequences, i.e., by computing $B_n(z)$ as the last element in the sequence
\[
B_1^{(0)}(z), B_1^{(1)}(z), \ldots, B_1^{(m-n)}(z), B_2(z), \ldots, B_n(z),
\]
we can obtain a superfast algorithm. Note that the degree structure of $B_1(z)$ requires that the first $m-n$ steps in this algorithm have to be left steps. The remaining steps can either be a left step followed by a right step or a right step followed by a left step.

The previous considerations lead to the following algorithm. We assume that the interpolation points $s_k$ are the $(m+n)$th roots of a complex number $\gamma$ of modulus one. In other words, let $\alpha \in [0, 2\pi)$ and $\gamma := e^{i\alpha}$. Then we define
\[
s_k := \exp\left(\frac{i\alpha}{m+n} k\right), \quad k = 1, \ldots, m+n.
\]
In the divide-and-conquer step, we will split the vector $s$ up into its odd and its even part: $\nu := (m+n)/2$ and
\[
s_k^{(1)} := s_{2k-1} \quad \text{and} \quad s_k^{(2)} := s_{2k}
\]
for $k = 1, \ldots, \nu$. Also, let $s^{(1)} := s^{(1)}_{1:\nu}$ and $s^{(2)} := s^{(2)}_{1:\nu}$.

  $B_n(z) \leftarrow$ compute_B $(s, e, f, m, n)$
  – We assume that $m \geq n$ and that $m+n$ is $\geq 2$ and a power of 2.
  – $s = [s_k]_{k=1}^{m+n}$ where $s_k = \exp\left(\frac{i\alpha}{m+n} k\right)$ for $k = 1, \ldots, m+n$ and $\alpha \in [0, 2\pi)$.
  – Is the interpolation problem a polynomial one?
  if $n = 0$ then
    Let $b^{(m)}(z)$ be the interpolating polynomial of degree […]
  […]
    if $m+n \leq 2^{\tau_{\text{fast}}}$ then
      Use a fast method to compute $B_n(z)$. Keep in mind that the first $m-n$ interpolation conditions can only correspond to left steps while the remaining $2n$ interpolation conditions can correspond to a left step followed by a right step or vice versa.
    else
      – divide-and-conquer
      $\nu \leftarrow (m+n)/2$
      $s^{(1)} \leftarrow [s_{2k-1}]_{k=1}^{\nu}$; $s^{(2)} \leftarrow [s_{2k}]_{k=1}^{\nu}$
      $e^{(1)} \leftarrow [e_{2k-1}]_{k=1}^{\nu}$; $e^{(2)} \leftarrow [e_{2k}]_{k=1}^{\nu}$
      $f^{(1)} \leftarrow [f_{2k-1}]_{k=1}^{\nu}$; $f^{(2)} \leftarrow [f_{2k}]_{k=1}^{\nu}$
      $n_1 \leftarrow \max\{0, n - \nu/2\}$; $m_1 \leftarrow \nu - n_1$
      $B^{(1)}(z) \leftarrow$ compute_B $(s^{(1)}, e^{(1)}, f^{(1)}, m_1, n_1)$
      for $k = 1, \ldots, \nu$ do
        $[\, e_k^{(2)} \ \ f_k^{(2)} \,] \leftarrow [\, e_k^{(2)} \ \ f_k^{(2)} \,] B^{(1)}(s_k^{(2)})$
      end for
      $n_2 \leftarrow n - n_1$; $m_2 \leftarrow m - m_1$
      $B^{(2)}(z) \leftarrow$ compute_B $(s^{(2)}, e^{(2)}, f^{(2)}, m_2, n_2)$
      $B_n(z) \leftarrow B^{(1)}(z) \cdot B^{(2)}(z)$
    end if
  end if

3.3. Numerical stability

The numerical stability of our superfast algorithm can be enhanced in several ways: via pivoting of the interpolation conditions, via iterative refinement of the computed solutions to the linearized rational interpolation problems, via downdating of interpolation conditions, and by postponing what we have called "difficult interpolation points" until the very end of the algorithm. We have introduced these stabilizing techniques in the context of fast and superfast solvers for linear systems of equations that have Hankel or Toeplitz structure and we refer the reader to [5,8] for more details. An important difference with the solution of structured linear systems, though, is the fact that we cannot use iterative refinement at the very end of the algorithm to increase the accuracy of the computed determinant. It is an open question if a procedure exists for refining an approximation for a determinant that is similar to the classical iterative refinement procedure that can be used to improve an approximation for the solution of a linear system of equations.

4. Numerical experiments

We have implemented our algorithm in Fortran 90.

Figure 1. The accuracy of the results obtained by our approach.

Figure 2. Execution time.

In the following numerical experiments we have considered circulant Toeplitz matrices whose entries are uniformly distributed in $[0, 1]$. It is well known that circulant matrices can be transformed into diagonal matrices via FFT [1]. We have used this to obtain the exact value of the determinant. We let $n = 2^k$, where $k = 8, \ldots, 18$. Note that $2^{18} = 262144$. For each value of $n$ we consider five matrices. The accuracy of the results obtained by our approach is shown in figure 1. We plot the logarithm with base 10 of the absolute value of the relative error versus the order of the Toeplitz matrix. We show the accuracy obtained by the fast algorithm, by the superfast algorithm in case no iterative refinement is applied, and by the superfast algorithm in case the computed solutions to interpolation problems involving more than $2^8$ interpolation conditions are refined iteratively. In figure 2 we plot the corresponding execution time in seconds versus the order of the Toeplitz matrix.
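The reference value used in these experiments rests on the fact that the eigenvalues of a circulant matrix are the discrete Fourier transform of its first column, so the determinant is their product. The following sketch illustrates this with a naive $O(n^2)$ transform in pure Python. The function name `circulant_det` is hypothetical; the experiments reported here used an FFT in Fortran 90, not this code.

```python
import cmath

def circulant_det(c):
    """Determinant of the circulant matrix with first column c.

    The eigenvalues are the DFT values of c; a naive DFT is used here
    for brevity where an FFT would be used in practice.
    """
    n = len(c)
    prod = 1 + 0j
    for k in range(n):
        w = cmath.exp(-2j * cmath.pi * k / n)
        prod *= sum(c[j] * w ** j for j in range(n))
    return prod

# The circulant matrix [[1, 3, 2], [2, 1, 3], [3, 2, 1]] has determinant 18.
print(circulant_det([1, 2, 3]))
```

For large $n$ the product of the eigenvalues can overflow or underflow in floating point arithmetic, so in practice one might accumulate logarithms of the moduli and the phases separately.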

References

[1] D.A. Bini and V.Y. Pan, Polynomial and Matrix Computations, Vol. 1: Fundamental Algorithms (Birkhäuser, Basel, 1994).
[2] G. Heinig, Solving Toeplitz systems via extension and transformation, in: Proc. of the Workshop Toeplitz Matrices: Structure, Algorithms and Applications, Cortona, Italy (September 9–12, 1996), Calcolo 33 (1996) 115–129.
[3] G. Heinig, Transformation approaches for fast and stable solution of Toeplitz systems and polynomial equations, in: Proc. of the Internat. Workshop "Recent Advances in Applied Mathematics", Kuwait (May 4–7, 1996) pp. 223–238.
[4] G. Heinig and K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators, Operator Theory: Advances and Applications, Vol. 13 (Birkhäuser, Basel, 1984).
[5] P. Kravanja, On computing zeros of analytic functions and related problems in structured numerical linear algebra, Ph.D. thesis, Department of Computer Science, Katholieke Universiteit Leuven (1999).
[6] M. Van Barel, G. Heinig and P. Kravanja, A stabilized superfast solver for nonsymmetric Toeplitz systems, Report TW 293, Department of Computer Science, Katholieke Universiteit Leuven (October 1999).
[7] M. Van Barel and P. Kravanja, On the generically superfast computation of Hankel determinants, in: Large-Scale Scientific Computations of Engineering and Environmental Problems II, eds. M. Griebel, S. Margenov and P. Yalamov, Notes on Numerical Fluid Mechanics, Vol. 73 (Vieweg, 2000) pp. 57–64; Proc. of the 2nd Workshop on Large-Scale Scientific Computations, Sozopol, Bulgaria (June 2–6, 1999).
[8] M. Van Barel and P. Kravanja, A stabilized superfast solver for indefinite Hankel systems, in: Internat. Linear Algebra Society Symposium "Linear Algebra in Control Theory, Signals and Image Processing", University of Manitoba, Canada (6–8 June 1997), special issue of Linear Algebra Appl. 284(1–3) (1998) 335–355.