
The Unreasonable Effectiveness of Banach Algebras in Numerical Analysis

Thomas Strohmer, Dept. of Mathematics, University of California, Davis

May 2007

Parts of this research were done in collaboration with Karlheinz Gröchenig, Ziemowit Rzeszotnik, and Ilya Krishtal.

Some problems from numerical analysis

◮ Solving an infinite system of linear equations
◮ Causal Wiener filters and spectral factorizations
◮ Localized Gram-Schmidt factorization
◮ Matrix exponentials

Solving an infinite system of linear equations

Assume we want to solve

Ax = b

where A is a bi-infinite matrix: A : ℓ^2(Z) → ℓ^2(Z), b ∈ ℓ^2(Z).

Finite Section Method: Let

P_n b = (. . . , 0, b_{−n}, b_{−n+1}, . . . , b_{n−1}, b_n, 0, . . . )

be the orthogonal projection of rank 2n + 1. We set

A_n = P_n A P_n and b_n = P_n b,

and try to solve the finite system A_n x_n = b_n. Questions:

◮ Does x_n converge to x as n → ∞?
◮ How fast is the convergence?

◮ What if x_n does not converge to x?

Example

Consider Ax = b where A is the bi-infinite block-diagonal matrix

        ( ⋱              )
        (    0 1          )
    A = (    1 0          )
        (        0 1      )
        (        1 0      )
        (              ⋱  )

The solution to Ax = b is x = Ab, since A^{−1} = A. Now assume we don't know A^{−1}.

Example contd.

Now define P_n b = (. . . , 0, b_{−n}, b_{−n+1}, . . . , b_{n−1}, b_n, 0, . . . ) and try to solve P_n A P_n x_n = P_n b for n = 1, 2, . . . . This fails, since none of the finite sections A_n, n ∈ N, is invertible (see the sketch below).
Note that A is symmetric, so it is not a question of picking the right main diagonal!
In this block-Toeplitz case it is easy to see how to construct a finite approximation scheme. But how do we handle more general matrices?
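The failure can be checked directly. Here is a minimal numerical sketch (Python with numpy; the function flip_section and all conventions are ours, not from the talk): it builds the sections P_n A P_n of the block-flip matrix and confirms that every one of them is singular.

import numpy as np

# Finite section P_n A P_n (indices -n..n) of the block-flip matrix:
# index k is paired with k+1 if k is even, with k-1 if k is odd, so A^{-1} = A.
def flip_section(n):
    size = 2 * n + 1
    idx = np.arange(-n, n + 1)
    A = np.zeros((size, size))
    for i, k in enumerate(idx):
        partner = k + 1 if k % 2 == 0 else k - 1
        j = partner + n                     # column index of the partner
        if 0 <= j < size:
            A[i, j] = 1.0
    return A

for n in range(1, 6):
    s_min = np.linalg.svd(flip_section(n), compute_uv=False).min()
    print(f"n = {n}: smallest singular value of A_n = {s_min:.1e}")

# Every section has a zero row (one 2x2 block is always cut in half), so the
# smallest singular value is 0 and A_n x_n = b_n cannot be solved.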

Causal Wiener filters and spectral factorizations

Given a wide-sense stationary process y = {y_k}_{k∈Z} with known power spectrum Φ as input of a (yet to be determined) linear filter h = {h_k}_{k∈Z}. Let x = {x_k}_{k∈Z} be the given desired output of h, and let Ψ be the Fourier transform of x ∗ y.

Problem: We look for a causal h (i.e., h_k = 0 if k < 0) such that the output of this filter, x̂_k = Σ_{l∈Z} h_l y_{k−l}, minimizes ‖x − x̂‖_2.

Solution (Wiener '32): requires spectral factorization of Φ, i.e., Cholesky factorization of the bi-infinite Toeplitz matrix T with entries T_{kl} = (y ∗ y)(k − l):

Compute T = C^∗C, where C and C^{−1} are upper triangular.
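As an illustration, here is a minimal sketch of such a spectral factorization via the Cholesky factor of a truncated Toeplitz matrix (Bauer's method); the MA(1) example process and all parameters are assumptions for the demo, not part of the talk.

import numpy as np
from scipy.linalg import toeplitz, cholesky

# y_k = w_k + a*w_{k-1} (MA(1) process, white noise w): the autocorrelation is
# r(0) = 1 + a^2, r(+-1) = a, r(k) = 0 otherwise, and the causal (minimum
# phase) spectral factor is c = (1, a).
a = 0.6
n = 200
r = np.zeros(n)
r[0], r[1] = 1 + a ** 2, a

T = toeplitz(r)                 # finite section of the biinfinite Toeplitz matrix T
L = cholesky(T, lower=True)     # T = L L^T with L lower triangular

# Bauer's method: the last row of L, read backwards from the diagonal,
# approximates the causal factor c_0, c_1, c_2, ...
print(np.round(L[-1, ::-1][:5], 4))     # approximately [1.  0.6  0.  0.  0.]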

Cholesky factorization for Toeplitz and other matrices

Problems of Cholesky factorization for bi-infinite matrices:
◮ C is not unique
◮ C^{−1} may not be upper triangular
◮ we need a numerical scheme to (approximately) compute C
◮ finite-dimensional triangular matrices can behave fundamentally differently from infinite-dimensional ones
◮ for the Toeplitz case: we want to approximate h by an FIR filter.

The last condition means, for the classical Wiener filter setup, that h should be well approximated by an FIR filter.

It follows from Arveson's work ['75] on factorization in nest algebras that a bi-infinite positive-definite matrix A has a unique factorization A = C^∗C such that C and C^{−1} are upper triangular (see also Chui, Ward, Smith ['82]).

Localized ONBs and Gram-Schmidt algorithm

In various applications (kernel-based data mining, wireless communications, ...) one is given a (symmetric) matrix A whose entries decay fast off the diagonal, and one wants to construct a "localized" orthonormal basis (ONB) {ψ_k} where the vectors ψ_k are approximately compactly supported.

Eigenvectors of A are in general not localized.

What about an ONB constructed via the Gram-Schmidt method?
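A small numerical illustration of the second point (the tridiagonal test matrix is our choice, not from the talk): even a maximally banded symmetric matrix can have completely delocalized eigenvectors.

import numpy as np

# The discrete Laplacian: tridiagonal, so its entries certainly decay off the diagonal.
n = 200
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

eigvals, eigvecs = np.linalg.eigh(A)
v = eigvecs[:, n // 2]                 # a typical eigenvector
print(np.abs(v).max())                 # ~ sqrt(2/(n+1)) ~ 0.1

# The eigenvectors are discrete sine waves: their mass is spread over the whole
# index range, so they are far from being approximately compactly supported.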

Matrix exponentials

The matrix exponential e^{At} forms a vital part of many recent algorithms for differential equations evolving on Lie groups.

Iserles ['00]: Assume A is an M-banded matrix, i.e., a_{kl} = 0 for |k − l| > M. Then the entries of e^A decay exponentially off the diagonal, with exponent and constants depending on M.
Benzi, Golub ['99]: consider more general matrix functions f(A) where A is banded.

Using non-trivial, lengthy proofs, both obtain very good estimates when M is small, but for large M the current theory predicts practically no decay. Many matrices are not exactly banded, but have entries which decay fast off the main diagonal. Approximating them by a banded matrix gives rise to a fairly large M and thus overly pessimistic estimates.
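A quick numerical check of this decay behaviour (a sketch; the random banded test matrix and the bandwidth M are arbitrary choices):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n, M = 120, 2
A = np.zeros((n, n))
for d in range(M + 1):                 # random symmetric M-banded matrix
    band = rng.standard_normal(n - d)
    A += np.diag(band, d)
    if d > 0:
        A += np.diag(band, -d)

E = expm(A)
for d in (0, 5, 10, 15, 20, 25):
    print(f"|k-l| = {d:2d}: max |exp(A)_kl| = {np.abs(np.diag(E, d)).max():.2e}")

# Although exp(A) is a full matrix, the maxima drop off rapidly (indeed
# superexponentially) once |k-l| is larger than a few multiples of the bandwidth.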

Banach algebras

Take an algebra A which is also a Banach space. The algebra multiplication and the norm must satisfy:

∀ x, y ∈ A : ‖xy‖ ≤ ‖x‖ ‖y‖.

We assume that A is a B∗-algebra, i.e., a Banach algebra with involution ∗.
Note: C∗-algebras are not sufficient to solve the aforementioned problems.
We focus on B∗-algebras which describe matrices with off-diagonal decay.

Weights

Off-diagonal decay is quantified by weight functions. We consider weight functions v which satisfy:
(i) v is even and normalized such that v(0) = 1.
(ii) v is submultiplicative, i.e., v(k + l) ≤ v(k)v(l) for all k, l ∈ Z.
(iii) v satisfies the Gelfand-Raikov-Shilov (GRS) condition

lim_{n→∞} v(nk)^{1/n} = 1 for all k ∈ Z.

Example:

v(x) = e^{a|x|^b} (1 + |x|)^s,   a, s > 0,  0 ≤ b < 1
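A numerical sanity check of the GRS condition for this family (a sketch; the parameter values are arbitrary): for b < 1 the n-th root tends to 1, while for b = 1 it would tend to e^{a|k|} > 1, which is why genuinely exponential weights are excluded.

import numpy as np

# log v(x) for v(x) = exp(a|x|^b) (1+|x|)^s
def log_v(x, a, b, s):
    x = abs(x)
    return a * x ** b + s * np.log1p(x)

a, s, k, n = 0.5, 2.0, 3, 10 ** 6
for b in (0.5, 1.0):
    root = np.exp(log_v(n * k, a, b, s) / n)   # v(nk)^(1/n), computed via logarithms
    print(f"b = {b}: v(nk)^(1/n) at n = 1e6 is {root:.4f}")

# b = 0.5 gives ~1.0009 (GRS holds); b = 1.0 gives exp(a*k) = 4.4817 (GRS fails).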

Examples for matrix Banach algebras

Jaffard class A_v: consists of matrices A = (a_{kl})_{k,l∈Z} which satisfy

|a_{kl}| ≤ C (1 + |k − l|)^{−s}  ∀ k, l ∈ Z,

with norm

‖A‖_{A_v} = sup_{k,l∈Z} |a_{kl}| (1 + |k − l|)^s

and weight v(x) = (1 + |x|)^s.

Gohberg-Baskakov-Sjöstrand class C_v: consists of matrices A = (a_{kl})_{k,l∈Z} which satisfy

‖A‖_{C_v} := Σ_{l∈Z} sup_{k∈Z} |a_{k,k−l}| v(l) < ∞.
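For concreteness, both norms can be written out in a few lines of code (a sketch for finite matrices with the polynomial weight v(l) = (1+|l|)^s; the helper names and the 3×3 test matrix are ours):

import numpy as np

# Jaffard norm: sup_{k,l} |a_kl| (1+|k-l|)^s
def jaffard_norm(A, s):
    k, l = np.indices(A.shape)
    return np.max(np.abs(A) * (1 + np.abs(k - l)) ** s)

# Gohberg-Baskakov-Sjostrand norm: sum_l sup_k |a_{k,k-l}| (1+|l|)^s.
# np.diag(A, -l) extracts exactly the entries a_{k,k-l}.
def gbs_norm(A, s):
    n = A.shape[0]
    return sum(np.abs(np.diag(A, -l)).max() * (1 + abs(l)) ** s
               for l in range(-(n - 1), n))

A = np.array([[2.0, 0.5, 0.1],
              [0.5, 2.0, 0.5],
              [0.1, 0.5, 2.0]])
print(jaffard_norm(A, s=2), gbs_norm(A, s=2))   # 2.0 and 7.8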

Wiener's Lemma and Banach algebras

Norbert Wiener [1932]: Let f be a periodic function with

f(x) = Σ_{k∈Z} a_k e^{2πikx}  for which  Σ_{k∈Z} |a_k| < ∞.

If f(x) ≠ 0 for all x, then

f(x)^{−1} = Σ_{k∈Z} b_k e^{2πikx}  with  Σ_{k∈Z} |b_k| < ∞.

This result can also be stated in the following way: Let A be a bi-infinite Toeplitz matrix with entries A_{k,l} = a_{k−l} for k, l ∈ Z. If A is invertible on ℓ^2(Z), then its inverse B := A^{−1} has entries B_{k,l} = b_{k−l} which satisfy Σ_{k∈Z} |b_k| < ∞.
The bi-infinite Toeplitz matrices with absolutely summable entries form a commutative inverse-closed Banach algebra.
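A numerical illustration of Wiener's lemma in this Toeplitz picture (a sketch; the particular symbol f is an arbitrary choice): the Fourier coefficients of 1/f, computed by FFT, again decay rapidly, hence are absolutely summable.

import numpy as np

N = 1024
x = np.arange(N) / N
coeffs = {0: 3.0, 1: 1.0, -1: 1.0, 2: 0.25, -2: 0.25}        # a_k of the symbol f
f = sum(c * np.exp(2j * np.pi * m * x) for m, c in coeffs.items())
assert np.abs(f).min() > 0                                   # f has no zeros

b = np.fft.fft(1.0 / f) / N            # Fourier coefficients b_k of 1/f
for m in (0, 1, 2, 5, 10, 20):
    print(f"|b_{m}| = {abs(b[m]):.2e}")

# The |b_m| decay geometrically, so the inverse of the corresponding Toeplitz
# matrix again has absolutely summable diagonals.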

Inverse-closedness (Wiener property)

Assume A is a matrix Banach algebra and A ∈ A. If the property that A is invertible as an operator on ℓ^2(Z) implies that A^{−1} ∈ A, then A is called inverse-closed.

This is equivalent to: the spectrum σ_A(A) in the algebra A coincides with the spectrum σ(A) of A as an operator on ℓ^2(Z).
Theorem: A_v is inverse-closed [Jaffard '91, Gröchenig-Leinert '04]
Theorem: C_v is inverse-closed [Gohberg '89, Kurbatov '90, Baskakov '90, Sjöstrand '94]

The finite section method revisited

Recall: We want to approximate the solution of Ax = b by solving A_n x_n = b_n for increasing n.

Theorem: Assume A is symmetric positive definite on ℓ^2(Z); then x_n converges to x in ℓ^2(Z).

Proof [Gohberg-Feldman, Boettcher-Silbermann, ...]: By assumption σ(A) ⊆ [λ_−, λ_+].
Step 1: Show that σ(A_n) ⊆ [λ_−, λ_+]; this implies that

sup_{n∈N} ‖A_n^{−1}‖_op ≤ ‖A^{−1}‖_op

Step 2: Define

Ã_n := A_n + λ_+ (I − P_n);

then σ(Ã_n) ⊆ [λ_−, λ_+]. Moreover, all Ã_n are invertible on ℓ^2(Z), and Ã_n converges strongly to A.

Step 3:

‖Ã_n^{−1} b − A^{−1} b‖_2 ≤ sup_n ‖Ã_n^{−1}‖_op ‖(A − Ã_n) A^{−1} b‖_2

The strong convergence Ã_n → A implies that Ã_n^{−1} converges strongly to A^{−1}.

Step 4:

‖x − x_n‖_2 ≤ ‖(A^{−1} − Ã_n^{−1}) b‖_2 + ‖Ã_n^{−1}(b − P_n b)‖_2

The first term goes to 0 by Step 3, and the second term is estimated by

‖Ã_n^{−1}(b − P_n b)‖_2 ≤ sup_n ‖Ã_n^{−1}‖_op ‖b − P_n b‖_2 ≤ λ_−^{−1} ‖b − P_n b‖_2

and also goes to zero.

Questions:
◮ Does the finite section method also converge in other norms, e.g. in ℓ^p_v-norms?
◮ Can we give quantitative estimates about the rate of convergence?
◮ What if A is not symmetric positive definite?

Inspection of the proof shows that we can answer the first two questions if we can show:
◮ both A and A^{−1} are bounded on ℓ^p_v;
◮ sup_n ‖Ã_n^{−1}‖_{ℓ^p_v → ℓ^p_v} is finite; and
◮ the finite sequences are dense in ℓ^p_v.

Let A be one of the matrix algebras A_v, C_v (or others).
Lemma: Assume A ∈ A. Inverse-closedness of A implies that A is bounded on all ℓ^p_m(Z). Here, the weight m is "almost the same" as the weight v.

Harder part of the proof: to show that

sup_n ‖Ã_n^{−1}‖_{ℓ^p_v → ℓ^p_v} < ∞

This would be easy for ℓ^2, or if we considered C∗-algebras instead of B∗-algebras. But C∗-algebras do not allow quantitative estimates! Our current proofs are lengthy and complicated. Is there a more natural proof?

Convergence in ℓ^p_v and rate of convergence

Theorem: Let A be one of the inverse-closed algebras A_v, C_v. Assume that A ∈ A is positive and invertible on ℓ^2(Z) and acts boundedly on ℓ^p_v(Z).
If p < ∞, then the finite section method converges in the norm of ℓ^p_v(Z).
If p = ∞, then the finite section method converges in the weak∗-topology. In particular, x_n converges to x entrywise.
If b ∈ ℓ^p_v, then the finite section method converges in ℓ^q_v with the error estimate

‖x − x_n‖_{ℓ^q_v} ≤ C ‖b‖_{ℓ^p_v} ϕ(n),

where C = ‖A^{−1}‖_{ℓ^q_v} (1 + ‖A‖_{ℓ^p_v} ‖Ã_n^{−1}‖_A) and

ϕ(n) = ( Σ_{|k|>n} v(k)^{−2} )^{1/2}.
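A small numerical experiment illustrating this convergence (a sketch only: a large finite matrix stands in for the bi-infinite operator, and the Jaffard-type test matrix and right-hand side are our choices, not from the talk):

import numpy as np

N = 401                                    # proxy grid for Z, indices -200..200
idx = np.arange(N) - N // 2
K, L = np.meshgrid(idx, idx, indexing="ij")
A = 1.0 / (1 + np.abs(K - L)) ** 3 + 3 * np.eye(N)   # SPD, polynomial off-diagonal decay
b = 1.0 / (1 + np.abs(idx)) ** 3                     # decaying right-hand side

x = np.linalg.solve(A, b)                  # "exact" solution on the big grid

for n in (5, 10, 20, 40, 80):
    sec = np.abs(idx) <= n                 # the finite section, indices -n..n
    xn = np.zeros(N)
    xn[sec] = np.linalg.solve(A[np.ix_(sec, sec)], b[sec])
    print(f"n = {n:3d}: ||x - x_n||_2 = {np.linalg.norm(x - xn):.2e}")

# The errors decrease at a polynomial rate in n, consistent with the bound phi(n).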

Finite section method for non-symmetric matrices

The assumption that A is symmetric positive definite was crucial. Up to now no good results are known when A is not symmetric, unless A is of Toeplitz type. We cannot simply consider A^∗Ax = A^∗b instead of Ax = b, since in general we cannot compute A^∗A exactly.
Consider the system Ax = b where A has a (left) inverse, but is not necessarily symmetric. We set

A_{r,n} = P_r A P_n  and  b_{r,n} = A_{r,n}^∗ b,

and try to solve the system

A_{r,n}^∗ A_{r,n} x_{r,n} = b_{r,n}

for properly chosen r and n.

Nonsymmetric case contd.

Theorem: Let Ax = b be given and assume that b ∈ ℓ^p_v(Z) and that A ∈ A_v is invertible on ℓ^2(Z) and acts boundedly on ℓ^p_m. Then, for every n there exists an R(n) (depending on λ_− and v) such that x_{r(n),n} converges to x in the norm of ℓ^p_m, for every choice r(n) ≥ R(n).

Simple key idea: We need to let r grow faster than n.
If v(k) = (1 + |k|)^s for s > 1, then R(n) can be chosen to be n^α for α > 2s/(2s − 1).

This theorem even considerably generalizes the class of Laurent operators (= bi-infinite Toeplitz matrices) for which the (nonsymmetric) finite section method is applicable.
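A sketch of this rectangular scheme (again with a large finite matrix standing in for ℓ^2(Z); the choice r = n^{1.5} + n and the test matrix are illustrative, not the R(n) of the theorem):

import numpy as np

N = 601                                    # proxy grid, indices -300..300
idx = np.arange(N) - N // 2
K, L = np.meshgrid(idx, idx, indexing="ij")
rng = np.random.default_rng(1)
A = rng.standard_normal((N, N)) / (1 + np.abs(K - L)) ** 3 + 5 * np.eye(N)   # nonsymmetric, decaying
b = 1.0 / (1 + np.abs(idx)) ** 3

x = np.linalg.solve(A, b)                  # reference solution on the big grid

for n in (5, 10, 20, 40):
    r = int(n ** 1.5) + n                  # let r grow faster than n
    cols, rows = np.abs(idx) <= n, np.abs(idx) <= r
    Arn = A[np.ix_(rows, cols)]            # A_{r,n} = P_r A P_n
    xn = np.zeros(N)
    xn[cols] = np.linalg.lstsq(Arn, b[rows], rcond=None)[0]   # solves the normal equations
    print(f"n = {n:3d}, r = {r:4d}: ||x - x_n||_2 = {np.linalg.norm(x - xn):.2e}")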

Localized ONBs via Gram-Schmidt

Let A ∈ A and assume A has a left inverse on ℓ^2(Z). We want to construct an ONB {ψ_l}_{l∈Z} for the range of A such that the ψ_l are "nicely localized".
Applying Gram-Schmidt to the columns of A is equivalent to computing the QR decomposition of A: A = QR, where Q has orthonormal columns and R is upper triangular.
The existence of the QR decomposition for A in the ℓ^2(Z)-setting follows from Arveson.

Localized QR decomposition

Theorem: Let A ∈ C_v and A = QR. Then Q, R ∈ C_v.

Proof: We first prove that A^∗A has a unique Cholesky factorization A^∗A = C^∗C such that C, C^{−1} are upper triangular and C, C^{−1} ∈ A.
Key idea (due to Gohberg, Baskakov): expand A into a noncommutative Fourier series

A(ω) = Σ_{k∈Z} e^{2πikω} A_k,

where A_k is the k-th diagonal of A. Now we apply results of Wiener and Gohberg on spectral factorization; the weighted case can be handled by techniques due to Gelfand et al.
Now observe that R^∗R = R^∗Q^∗QR = A^∗A = C^∗C. Uniqueness implies that R ∈ A.
Since C_v is a Banach algebra we have AR^{−1} = Q ∈ A. Hence the ψ_l are "as localized" as the off-diagonal decay of A.
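A finite-dimensional illustration of the theorem (a sketch; the test matrix is our choice): the QR factors of a matrix with polynomial off-diagonal decay show the same kind of decay.

import numpy as np

N = 200
K, L = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
A = 1.0 / (1 + np.abs(K - L)) ** 4 + 2 * np.eye(N)   # polynomial off-diagonal decay

Q, R = np.linalg.qr(A)

def offdiag_max(M, d):
    return max(np.abs(np.diag(M, d)).max(), np.abs(np.diag(M, -d)).max())

for d in (0, 5, 10, 20, 40):
    print(f"|k-l| = {d:2d}: max|Q_kl| = {offdiag_max(Q, d):.1e}, "
          f"max|R_kl| = {offdiag_max(R, d):.1e}")

# Both Q and R decay away from the diagonal, as the theorem predicts
# (checked here only on a finite section).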

Cholesky factorization in Banach algebras

Theorem: Let A be a bi-infinite lower triangular matrix such that all its finite sections A_n are invertible, sup_n ‖(A_n)^{−1}‖_1 = C < ∞ and sup_n ‖(A_n)^{−1}‖_∞ = C < ∞. Then A has a lower-triangular inverse and ‖A^{−1}‖ ≤ C.
The Schur-type conditions can be weakened.

Our results also answer questions about the possibility and quality of approximating causal Wiener filters by FIR filters. Boche et al. proved some no-go theorems for such causal FIR approximations. Our results show that with modest decay, stable causal approximation by FIR filters is possible.
Other applications: design of feedback equalizers for MIMO communications, ...

Open problem

Assume A ∈ A, where A is any inverse-closed Banach algebra. Let A be positive definite in A. Is it true that there exists a Cholesky factorization A = CC^∗ such that C, C^{−1} ∈ A?
Some progress with Ilya Krishtal on this question (using the connection to the finite section method, ...), but by far no complete solution.

Polar decomposition, SVD

Polar decomposition:
Theorem: Let A ∈ A. Then A has a polar decomposition A = PU with P, U ∈ A.
... follows easily from the Banach square root theorem.

Singular value decomposition, spectral decomposition:
Singular vectors or eigenvectors of A do not inherit the localization of the matrix A.
But if {φ_l} are left singular vectors of A, {ψ_k} are right singular vectors of A, and if A ∈ A, then the "cross-correlation matrix" (⟨φ_l, ψ_k⟩)_{k,l} ∈ A.

Matrix exponentials

Theorem: Let A be invertible on ℓ^2(Z) and A ∈ A, where A is inverse-closed, and let f be analytic on (a neighborhood of) σ(A). Then f(A) ∈ A.
This follows from the Dunford representation of f(A):

f(A) = (1 / 2πi) ∮_Γ f(z) (zI − A)^{−1} dz

where Γ is a closed contour containing σ(A) in its interior.
Corollary: Let A be invertible on ℓ^2(Z) and A ∈ A; then e^A ∈ A.
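The Dunford formula can be evaluated numerically; the following sketch approximates the contour integral for f = exp by the trapezoidal rule on a circle enclosing σ(A) and compares with scipy.linalg.expm (the test matrix and quadrature parameters are arbitrary choices):

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n = 50
A = np.diag(rng.standard_normal(n)) + rng.standard_normal((n, n)) / 10

radius = np.abs(np.linalg.eigvals(A)).max() + 1.0   # Gamma = circle enclosing sigma(A)
m = 200                                             # quadrature nodes on Gamma
z = radius * np.exp(2j * np.pi * np.arange(m) / m)

F = np.zeros((n, n), dtype=complex)
for zj in z:                                        # trapezoidal rule for the contour integral
    F += np.exp(zj) * zj * np.linalg.inv(zj * np.eye(n) - A)
F /= m                                              # 1/(2 pi i) and dz = i z (2 pi / m) combine to z/m

err = np.linalg.norm(F.real - expm(A)) / np.linalg.norm(expm(A))
print(f"relative difference between contour integral and expm: {err:.1e}")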

Conclusion

B∗-algebras are an excellent example of abstract nonsense, yet they provide an "unreasonably" powerful and elegant tool for a variety of problems in numerical analysis.
C∗-algebras (advocated by Arveson, Boettcher, Silbermann, ...) are useful to prove qualitative statements in numerical analysis.
B∗-algebras are useful to prove quantitative statements in numerical analysis.