Block Matrix Computations and the Singular Value Decomposition

A Tale of Two Ideas

Charles F. Van Loan, Department of Computer Science, Cornell University

Supported in part by NSF contract CCR-9901988.

Block Matrices

A block matrix is a matrix whose entries are themselves matrices, e.g.,

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad A_{ij} \in \mathbb{R}^{3 \times 2}

Operations are pretty much "business as usual", e.g.,

\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}^T
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
=
\begin{bmatrix} A_{11}^T & A_{21}^T \\ A_{12}^T & A_{22}^T \end{bmatrix}
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
=
\begin{bmatrix} A_{11}^T A_{11} + A_{21}^T A_{21} & \text{etc} \\ \text{etc} & \text{etc} \end{bmatrix}

E.g., Strassen Multiplication

\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
=
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}

P_1 = (A_{11} + A_{22})(B_{11} + B_{22})
P_2 = (A_{21} + A_{22}) B_{11}
P_3 = A_{11}(B_{12} - B_{22})
P_4 = A_{22}(B_{21} - B_{11})
P_5 = (A_{11} + A_{12}) B_{22}
P_6 = (A_{21} - A_{11})(B_{11} + B_{12})
P_7 = (A_{12} - A_{22})(B_{21} + B_{22})

C_{11} = P_1 + P_4 - P_5 + P_7
C_{12} = P_3 + P_5
C_{21} = P_2 + P_4
C_{22} = P_1 + P_3 - P_2 + P_6
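As a sanity check, one level of the seven-product scheme can be written out directly. A minimal sketch assuming NumPy; recursion on the blocks and padding for odd dimensions are omitted:

```python
import numpy as np

def strassen_step(A, B):
    """One level of Strassen multiplication for even-dimensioned square A, B."""
    h = A.shape[0] // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    # The seven block products P1..P7.
    P1 = (A11 + A22) @ (B11 + B22)
    P2 = (A21 + A22) @ B11
    P3 = A11 @ (B12 - B22)
    P4 = A22 @ (B21 - B11)
    P5 = (A11 + A12) @ B22
    P6 = (A21 - A11) @ (B11 + B12)
    P7 = (A12 - A22) @ (B21 + B22)

    # Reassemble C from the P's.
    C = np.empty_like(A)
    C[:h, :h] = P1 + P4 - P5 + P7
    C[:h, h:] = P3 + P5
    C[h:, :h] = P2 + P4
    C[h:, h:] = P1 + P3 - P2 + P6
    return C
```

Applied recursively to the blocks, this yields the familiar O(n^{log2 7}) operation count.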

Singular Value Decomposition (SVD)

If A \in \mathbb{R}^{m \times n}, then there exist orthogonal U \in \mathbb{R}^{m \times m} and V \in \mathbb{R}^{n \times n} so that

U^T A V = \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n) \in \mathbb{R}^{m \times n}

where \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n are the singular values.


Fact 1. The columns of V = [\,v_1 \ \cdots \ v_n\,] and U = [\,u_1 \ \cdots \ u_n\,] are the right and left singular vectors, and they are related by

A v_j = \sigma_j u_j, \qquad A^T u_j = \sigma_j v_j, \qquad j = 1{:}n
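Fact 1 is easy to verify numerically. A small check assuming NumPy, whose `np.linalg.svd` returns U, the singular values in decreasing order, and V^T:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A)   # U is m x m, Vt is n x n
V = Vt.T

# Fact 1: A v_j = sigma_j u_j  and  A^T u_j = sigma_j v_j.
for j in range(n):
    assert np.allclose(A @ V[:, j], s[j] * U[:, j])
    assert np.allclose(A.T @ U[:, j], s[j] * V[:, j])
```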


Fact 2. The SVD of A is related to the eigendecompositions of A^T A and A A^T:

V^T (A^T A) V = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)

U^T (A A^T) U = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2, \underbrace{0, \ldots, 0}_{m-n})


Fact 3. The smallest singular value is the distance from A to the set of rank-deficient matrices:

\sigma_{\min} = \min_{\mathrm{rank}(B) < n} \| A - B \|_F


Fact 4. The matrix \sigma_1 u_1 v_1^T is the closest rank-1 matrix to A, i.e., it solves the problem

\min_{\mathrm{rank}(B) = 1} \| A - B \|_F
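A quick numerical illustration of Fact 4, assuming NumPy; by the Eckart–Young theorem the Frobenius error of the rank-1 truncation is \sqrt{\sigma_2^2 + \cdots + \sigma_n^2}:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
U, s, Vt = np.linalg.svd(A)

# The closest rank-1 matrix to A (Fact 4).
B1 = s[0] * np.outer(U[:, 0], Vt[0, :])
assert np.linalg.matrix_rank(B1) == 1

# Its Frobenius-norm error equals sqrt(sigma_2^2 + ... + sigma_n^2).
err = np.linalg.norm(A - B1, 'fro')
assert np.isclose(err, np.sqrt(np.sum(s[1:] ** 2)))
```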

The High-Level Message...

• It is important to be able to think at the block level because of problem structure.
• It is important to be able to develop block matrix algorithms.
• There is a progression:

"Simple" Linear Algebra
        ↓
Block Linear Algebra
        ↓
Multilinear Algebra

Reasoning at the Block Level

Uncontrollability

The system

\dot{x} = Ax + Bu, \qquad A \in \mathbb{R}^{n \times n},\ B \in \mathbb{R}^{n \times p},\ n > p

is uncontrollable if

G = \int_0^t e^{A(t-\tau)} B B^T e^{A^T (t-\tau)} \, d\tau

is singular.


The Gramian G can be obtained from a single block matrix exponential:

\tilde{A} = \begin{bmatrix} A & BB^T \\ 0 & -A^T \end{bmatrix}
\ \longrightarrow\
e^{\tilde{A} t} = \begin{bmatrix} F_{11} & F_{12} \\ 0 & F_{22} \end{bmatrix}
\ \longrightarrow\
G = F_{12} F_{11}^T

Nearness to Uncontrollability

The same system is nearly uncontrollable if the minimum singular value of G is small; the same block exponential computation applies.
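The block-exponential route to G can be sketched as follows, assuming NumPy. Here `expm_taylor` is a deliberately crude stand-in for a proper matrix-exponential routine such as `scipy.linalg.expm`, and the recovery G = F12 F11^T follows the block formula above:

```python
import numpy as np

def expm_taylor(M, terms=24, squarings=10):
    """Crude scaling-and-squaring matrix exponential (illustrative only;
    use a library routine such as scipy.linalg.expm in practice)."""
    S = M / 2.0 ** squarings
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms):
        T = T @ S / k          # accumulate S^k / k!
        E = E + T
    for _ in range(squarings):
        E = E @ E              # undo the scaling by repeated squaring
    return E

rng = np.random.default_rng(0)
n, p, t = 3, 1, 1.0
A = rng.standard_normal((n, n)) - 2.0 * np.eye(n)   # shifted to keep magnitudes modest
B = rng.standard_normal((n, p))

# One block exponential delivers the Gramian:
#   Atil = [[A, B B^T], [0, -A^T]],  e^{Atil t} = [[F11, F12], [0, F22]],
#   G = F12 @ F11^T.
Atil = np.block([[A, B @ B.T], [np.zeros((n, n)), -A.T]])
F = expm_taylor(Atil * t)
G = F[:n, n:] @ F[:n, :n].T
```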

Developing Algorithms at the Block Level

Block Matrix Factorizations: A Challenge

By a block algorithm we mean an algorithm that is rich in matrix-matrix multiplication.

Is there a block algorithm for a given factorization? I.e., is there a way to rearrange its O(n^3) operations so that the implementation spends at most O(n^2) time not doing matrix-matrix multiplication?

Why?

Re-use ideology: when you touch data, you want to use it a lot.

Not all linear algebra operations are equal in this regard:

Level   Example           Data      Work
1       α = y^T z         O(n)      O(n)
1       y = y + αz        O(n)      O(n)
2       y = y + As        O(n^2)    O(n^2)
2       A = A + yz^T      O(n^2)    O(n^2)
3       A = A + BC        O(n^2)    O(n^3)

Here, α is a scalar, y, z are vectors, and A, B, C are matrices. Only the level-3 operation does more work than the data it touches.

Scalar LU

\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ \ell_{21} & 1 & 0 \\ \ell_{31} & \ell_{32} & 1 \end{bmatrix}
\begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}

Equating entries and solving in order:

a_{11} = u_{11}                                  u_{11} = a_{11}
a_{12} = u_{12}                                  u_{12} = a_{12}
a_{13} = u_{13}                                  u_{13} = a_{13}
a_{21} = \ell_{21} u_{11}                        \ell_{21} = a_{21} / u_{11}
a_{31} = \ell_{31} u_{11}              =>        \ell_{31} = a_{31} / u_{11}
a_{22} = \ell_{21} u_{12} + u_{22}               u_{22} = a_{22} - \ell_{21} u_{12}
a_{23} = \ell_{21} u_{13} + u_{23}               u_{23} = a_{23} - \ell_{21} u_{13}
a_{32} = \ell_{31} u_{12} + \ell_{32} u_{22}     \ell_{32} = (a_{32} - \ell_{31} u_{12}) / u_{22}
a_{33} = \ell_{31} u_{13} + \ell_{32} u_{23} + u_{33}     u_{33} = a_{33} - \ell_{31} u_{13} - \ell_{32} u_{23}

Recursive Description

If \alpha \in \mathbb{R}, v, w \in \mathbb{R}^{n-1}, and B \in \mathbb{R}^{(n-1) \times (n-1)}, then

A = \begin{bmatrix} \alpha & w^T \\ v & B \end{bmatrix}
  = \begin{bmatrix} 1 & 0 \\ v/\alpha & \tilde{L} \end{bmatrix}
    \begin{bmatrix} \alpha & w^T \\ 0 & \tilde{U} \end{bmatrix}

is the LU factorization of A if \tilde{L}\tilde{U} = \tilde{A} is the LU factorization of

\tilde{A} = B - v w^T / \alpha.

Rich in level-2 operations.

Block LU: Recursive Description

\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
=
\begin{bmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{bmatrix}
\begin{bmatrix} U_{11} & U_{12} \\ 0 & U_{22} \end{bmatrix}
\qquad (A_{11} \text{ is } p \times p,\ A_{22} \text{ is } (n-p) \times (n-p))

A_{11} = L_{11} U_{11}                     Get L_{11}, U_{11} via LU of A_{11}            O(p^3)
A_{21} = L_{21} U_{11}                     Solve triangular systems for L_{21}            O(np^2)
A_{12} = L_{11} U_{12}                     Solve triangular systems for U_{12}            O(np^2)
A_{22} = L_{21} U_{12} + L_{22} U_{22}     Form \tilde{A} = A_{22} - L_{21} U_{12}        O(n^2 p)
                                           Get L_{22}, U_{22} via LU of \tilde{A} (recur)

Rich in level-3 operations!!!
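The four steps can be sketched recursively. A minimal NumPy version without pivoting; `lu_unblocked` is an illustrative textbook helper, and `np.linalg.solve` stands in for proper triangular solves:

```python
import numpy as np

def lu_unblocked(A):
    """Textbook LU without pivoting (assumes nonzero pivots)."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float)
    for k in range(n - 1):
        L[k+1:, k] = U[k+1:, k] / U[k, k]
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
        U[k+1:, k] = 0.0
    return L, U

def block_lu(A, p):
    """One step of the recursive block LU above, recurring on the trailing block."""
    n = A.shape[0]
    if n <= p:
        return lu_unblocked(A)
    A11, A12 = A[:p, :p], A[:p, p:]
    A21, A22 = A[p:, :p], A[p:, p:]
    L11, U11 = lu_unblocked(A11)               # LU of A11           O(p^3)
    L21 = np.linalg.solve(U11.T, A21.T).T      # triangular solves   O(np^2)
    U12 = np.linalg.solve(L11, A12)            # triangular solves   O(np^2)
    L22, U22 = block_lu(A22 - L21 @ U12, p)    # level-3 update      O(n^2 p)
    L = np.block([[L11, np.zeros((p, n - p))], [L21, L22]])
    U = np.block([[U11, U12], [np.zeros((n - p, p)), U22]])
    return L, U
```

A practical implementation would pivot and overwrite A in place; the point here is only that all the O(n^3) work sits in the matrix-matrix update.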

Consider the level-3 update \tilde{A} = A_{22} - L_{21} U_{12} with p = 3: a rank-3 correction of the trailing (n-p) \times (n-p) block, computed as a single matrix-matrix product. p must be "big enough" so that the advantage of data re-use is realized.

Block Algorithms for (Some) Matrix Factorizations

• LU with pivoting: PA = LU, with P a permutation, L lower triangular, U upper triangular.

• Cholesky factorization for symmetric positive definite A: A = G G^T, with G lower triangular.

• The QR factorization for rectangular matrices: A = QR, with Q \in \mathbb{R}^{m \times m} orthogonal and R \in \mathbb{R}^{m \times n} upper triangular.

Block Matrix Factorization Algorithms via Aggregation

Developing a Block QR Factorization

The standard algorithm computes Q as a product of Householder reflections,

Q^T A = H_n \cdots H_1 A = R.

After two steps, H_2 H_1 A has zeros below the diagonal in its first two columns.

The H matrices look like this:

H = I - 2 v v^T, \qquad v \text{ a unit 2-norm column vector}


After three steps, H_3 H_2 H_1 A has zeros below the diagonal in its first three columns, and so on.

Aggregation

Partition A = [\,A_1 \ A_2\,], where A_1 has p columns and p \ll n.

• Generate the first p Householders based on A_1:

  H_p \cdots H_1 A_1 = R_{11} \quad (\text{upper triangular})

• Aggregate H_1, \ldots, H_p:

  H_p \cdots H_1 = I - 2 W Y^T, \qquad W, Y \in \mathbb{R}^{m \times p}

• Apply to the rest of the matrix:

  (H_p \cdots H_1) A = [\,(H_p \cdots H_1) A_1 \ \ (I - 2 W Y^T) A_2\,]

The WY Representation

• Aggregation:

  (I - 2 W Y^T)(I - 2 v v^T) = I - 2 W_+ Y_+^T

  where

  W_+ = [\,W \ \ (I - 2 W Y^T) v\,], \qquad Y_+ = [\,Y \ \ v\,]

• Application:

  A \leftarrow (I - 2 W Y^T) A = A - (2W)(Y^T A)
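The update rules can be exercised directly. A small NumPy sketch: the aggregate I - 2WY^T is checked against the explicitly formed product of reflections, multiplied in the order the updates are applied:

```python
import numpy as np

def householder(x):
    """Unit 2-norm v such that (I - 2 v v^T) x is a multiple of e_1."""
    v = x.astype(float)
    v[0] += np.copysign(np.linalg.norm(x), x[0])
    return v / np.linalg.norm(v)

def wy_aggregate(vs):
    """Accumulate reflections into I - 2 W Y^T using the update rule above:
    W+ = [W, (I - 2 W Y^T) v],  Y+ = [Y, v]."""
    m = vs[0].shape[0]
    W = np.zeros((m, 0))
    Y = np.zeros((m, 0))
    for v in vs:
        w = v - 2.0 * W @ (Y.T @ v)     # (I - 2 W Y^T) v
        W = np.column_stack([W, w])
        Y = np.column_stack([Y, v])
    return W, Y
```

Note that applying I - 2WY^T costs two skinny matrix-matrix products, which is exactly the level-3 payoff.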

The Curse of Similarity Transforms

A Block Householder Tridiagonalization?

A symmetric. Compute orthogonal Q such that Q^T A Q = T is tridiagonal.

The standard algorithm computes Q as a product of Householder reflections,

Q = H_1 \cdots H_{n-2}.

H_1 introduces zeros in the first column: H_1^T A has zeros below the subdiagonal in column 1, but rows 2 through 6 are scrambled.

Must also post-multiply: in H_1^T A H_1, the reflection H_1 likewise scrambles columns 2 through 6.

H_2 introduces zeros into the second column of H_1^T A H_1.

Note that because of H_1's impact, H_2 depends on all of A's entries.

The H_i can be aggregated, but A must be completely updated along the way, which destroys the advantage.

Jacobi Methods for Symmetric Eigenproblem

These methods do not initially reduce A to "condensed" form. Instead they repeatedly reduce the sum of squares of the off-diagonal elements with rotations, e.g.,

A \leftarrow
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c & 0 & s \\ 0 & 0 & 1 & 0 \\ 0 & -s & 0 & c \end{bmatrix}^T
A
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c & 0 & s \\ 0 & 0 & 1 & 0 \\ 0 & -s & 0 & c \end{bmatrix}

The (2,4) subproblem:

\begin{bmatrix} d_1 & 0 \\ 0 & d_2 \end{bmatrix}
=
\begin{bmatrix} c & s \\ -s & c \end{bmatrix}^T
\begin{bmatrix} a_{22} & a_{24} \\ a_{42} & a_{44} \end{bmatrix}
\begin{bmatrix} c & s \\ -s & c \end{bmatrix}

with c^2 + s^2 = 1; the off-diagonal sum of squares is reduced by 2 a_{24}^2.
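A single rotation of a sweep can be sketched as follows, assuming NumPy; the c, s formulas are the standard symmetric 2-by-2 Schur computation (as in Golub and Van Loan):

```python
import numpy as np

def jacobi_pair(A, p, q):
    """Return J^T A J where the rotation J in plane (p, q) zeros A[p, q]."""
    if A[p, q] == 0.0:
        return A
    tau = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
    if tau != 0.0:
        t = np.sign(tau) / (abs(tau) + np.sqrt(1.0 + tau * tau))
    else:
        t = 1.0                         # 45-degree rotation
    c = 1.0 / np.sqrt(1.0 + t * t)
    s = t * c
    J = np.eye(A.shape[0])
    J[p, p] = J[q, q] = c
    J[p, q] = s
    J[q, p] = -s
    return J.T @ A @ J
```

Each call is a similarity transform, so the eigenvalues are preserved while the off-diagonal mass shrinks.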

Cycle through the subproblems

(1,2), (1,3), (1,4), (2,3), (2,4), (3,4)

applying the corresponding rotation each time, e.g., the (1,2) step uses

\begin{bmatrix} c & s & 0 & 0 \\ -s & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

and the (1,3) step uses

\begin{bmatrix} c & 0 & s & 0 \\ 0 & 1 & 0 & 0 \\ -s & 0 & c & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},

etc.

Parallel Jacobi

Parallel Ordering: {(1,2), (3,4)}, {(1,4), (2,3)}, {(1,3), (2,4)}

The subproblems within each set, e.g., (1,2) and (3,4), involve disjoint rows and columns, so their rotations can be applied concurrently.

Block Jacobi

A \leftarrow
\begin{bmatrix} I & 0 & 0 & 0 \\ 0 & Q_{11} & 0 & Q_{12} \\ 0 & 0 & I & 0 \\ 0 & Q_{21} & 0 & Q_{22} \end{bmatrix}^T
A
\begin{bmatrix} I & 0 & 0 & 0 \\ 0 & Q_{11} & 0 & Q_{12} \\ 0 & 0 & I & 0 \\ 0 & Q_{21} & 0 & Q_{22} \end{bmatrix}

The (2,4) subproblem:

\begin{bmatrix} D_1 & 0 \\ 0 & D_2 \end{bmatrix}
=
\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}^T
\begin{bmatrix} A_{22} & A_{24} \\ A_{42} & A_{44} \end{bmatrix}
\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}

Convergence analysis requires an understanding of 2-by-2 block matrices that are orthogonal...

The CS Decomposition

The blocks of an orthogonal matrix have related SVDs:

\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}
=
\begin{bmatrix} U_1 & 0 \\ 0 & U_2 \end{bmatrix}
\begin{bmatrix} C & S \\ -S & C \end{bmatrix}
\begin{bmatrix} V_1 & 0 \\ 0 & V_2 \end{bmatrix}^T

C = \mathrm{diag}(\cos\theta_1, \ldots, \cos\theta_m), \qquad S = \mathrm{diag}(\sin\theta_1, \ldots, \sin\theta_m)

U_1^T Q_{11} V_1 = C
U_1^T Q_{12} V_2 = S
U_2^T Q_{21} V_1 = -S
U_2^T Q_{22} V_2 = C

The Curse of Sparsity

How to Extract Level-3 Performance?

Suppose we want to solve a large sparse linear system Ax = b.

Iterative methods for this problem can often be dramatically accelerated if we can find a matrix M with two properties:

• It is easy/efficient to solve linear systems of the form Mz = r.
• M approximates the "essence" of A.

M is called a preconditioner, and the original iteration is modified to effectively solve the equivalent linear system (M^{-1} A) x = M^{-1} b.

Idea: Preconditioners

B \otimes C =
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\otimes
\begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix}
=
\begin{bmatrix}
b_{11}c_{11} & b_{11}c_{12} & b_{11}c_{13} & b_{12}c_{11} & b_{12}c_{12} & b_{12}c_{13} \\
b_{11}c_{21} & b_{11}c_{22} & b_{11}c_{23} & b_{12}c_{21} & b_{12}c_{22} & b_{12}c_{23} \\
b_{11}c_{31} & b_{11}c_{32} & b_{11}c_{33} & b_{12}c_{31} & b_{12}c_{32} & b_{12}c_{33} \\
b_{21}c_{11} & b_{21}c_{12} & b_{21}c_{13} & b_{22}c_{11} & b_{22}c_{12} & b_{22}c_{13} \\
b_{21}c_{21} & b_{21}c_{22} & b_{21}c_{23} & b_{22}c_{21} & b_{22}c_{22} & b_{22}c_{23} \\
b_{21}c_{31} & b_{21}c_{32} & b_{21}c_{33} & b_{22}c_{31} & b_{22}c_{32} & b_{22}c_{33}
\end{bmatrix}

"Replicated Block Structure"

Some Properties and a Big Fact

Properties:

(B \otimes C)^T = B^T \otimes C^T
(B \otimes C)^{-1} = B^{-1} \otimes C^{-1}
(B \otimes C)(D \otimes F) = BD \otimes CF
B \otimes (C \otimes D) = (B \otimes C) \otimes D

Big Fact: If B and C are m-by-m, then the m^2-by-m^2 linear system (B \otimes C) z = r can be solved in O(m^3) time rather than O(m^6) time.
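The Big Fact rests on the identity (B \otimes C)\,\mathrm{vec}(X) = \mathrm{vec}(C X B^T), where vec stacks columns, so the m^2-by-m^2 system reduces to two m-by-m solves. A minimal NumPy sketch:

```python
import numpy as np

def kron_solve(B, C, r):
    """Solve (B kron C) z = r in O(m^3) via (B kron C) vec(X) = vec(C X B^T)."""
    m = B.shape[0]
    R = r.reshape(m, m, order='F')      # un-vec: column-major reshape
    X = np.linalg.solve(C, R)           # first solve:  C X1 = R
    X = np.linalg.solve(B, X.T).T       # second solve: X1 = X B^T
    return X.reshape(-1, order='F')     # re-vec
```

The two `solve` calls cost O(m^3) each, versus O((m^2)^3) = O(m^6) for a dense solve on the full Kronecker product.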

Capturing Essence with a Kronecker Product

Given

A = \begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}, \qquad A_{ij} \in \mathbb{R}^{m \times m},

choose B and C to minimize

\| A - B \otimes C \|_F
=
\left\|
\begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix}
-
\begin{bmatrix} b_{11}C & b_{12}C & b_{13}C \\ b_{21}C & b_{22}C & b_{23}C \\ b_{31}C & b_{32}C & b_{33}C \end{bmatrix}
\right\|_F

The exact solution can be obtained via the SVD.

Solution of the \min \| A - B \otimes C \|_F Problem

• Make each block of A into a vector and arrange them in block-column-major order:

  \tilde{A} = [\, \mathrm{col}(A_{11}) \ \ \mathrm{col}(A_{21}) \ \ \mathrm{col}(A_{31}) \ \ \mathrm{col}(A_{12}) \ \cdots \ \mathrm{col}(A_{33}) \,]

• Compute the largest singular value \sigma_{\max} of \tilde{A} and the corresponding singular vectors u_{\max} and v_{\max}.

• B_{\mathrm{opt}} = \sqrt{\sigma_{\max}} \cdot \mathrm{reshape}(v_{\max}, 3, 3)

• C_{\mathrm{opt}} = \sqrt{\sigma_{\max}} \cdot \mathrm{reshape}(u_{\max}, m, m)
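The three steps translate directly into code. A NumPy sketch of this Van Loan–Pitsianis construction; block sizes are passed explicitly, and `reshape(..., order='F')` plays the role of col and reshape:

```python
import numpy as np

def nearest_kron(A, p, q, m, n):
    """B (p x q) and C (m x n) minimizing ||A - kron(B, C)||_F.
    A must be (p*m) x (q*n)."""
    # Step 1: one column per block, blocks in block-column-major order,
    # each block vectorized column-major.
    cols = []
    for j in range(q):
        for i in range(p):
            blk = A[i*m:(i+1)*m, j*n:(j+1)*n]
            cols.append(blk.reshape(-1, order='F'))
    At = np.column_stack(cols)

    # Step 2: dominant singular triple of the rearranged matrix.
    U, s, Vt = np.linalg.svd(At, full_matrices=False)

    # Step 3: fold the singular vectors back into B and C.
    B = np.sqrt(s[0]) * Vt[0, :].reshape(p, q, order='F')
    C = np.sqrt(s[0]) * U[:, 0].reshape(m, n, order='F')
    return B, C
```

When A is exactly a Kronecker product, the rearranged matrix has rank 1 and the construction recovers it exactly (up to a paired sign flip in B and C).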

Conclusions

• It is important to be able to think at the block level because of problem structure.
• It is important to be able to develop block matrix algorithms.
• There is a progression:

"Simple" Linear Algebra
        ↓
Block Linear Algebra
        ↓
Multilinear Algebra (Thursday)