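The Kronecker Product SVD

Charles Van Loan
October 19, 2009

The Kronecker product $B \otimes C$ is a block matrix whose $ij$-th block is $b_{ij} C$. E.g.,

\[
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \otimes C
=
\begin{bmatrix} b_{11} C & b_{12} C \\ b_{21} C & b_{22} C \end{bmatrix}
\]

Replicated block structure.

The KP-SVD

If

\[
A = \begin{bmatrix} A_{11} & \cdots & A_{1N} \\ \vdots & \ddots & \vdots \\ A_{M1} & \cdots & A_{MN} \end{bmatrix},
\qquad A_{ij} \in \mathbb{R}^{p \times q},
\]

then there exists a positive integer $r_A$ with $r_A \le MN$ so that

\[
A = \sum_{k=1}^{r_A} \sigma_k \, B_k \otimes C_k, \qquad r_A = \mathrm{rank}_{KP}(A).
\]

The KP-singular values: $\sigma_1 \ge \cdots \ge \sigma_{r_A} > 0$.

The $B_k \in \mathbb{R}^{M \times N}$ and $C_k \in \mathbb{R}^{p \times q}$ satisfy $\langle B_i, B_j \rangle = \delta_{ij}$ and $\langle C_i, C_j \rangle = \delta_{ij}$, where $\langle F, G \rangle = \mathrm{trace}(F^T G)$.

Nearness Property

Let $r$ be a positive integer that satisfies $r \le r_A$. The problem

\[
\min_{\mathrm{rank}_{KP}(X) = r} \| A - X \|_F
\]

is solved by setting

\[
X^{(opt)} = \sum_{k=1}^{r} \sigma_k \, B_k \otimes C_k .
\]

Talk Outline

1. Survey of Essential KP Properties
   Just enough to get through the talk.
2. Computing the KP-SVD
   It's an SVD computation.
3. Nearest KP Preconditioners
   Solving KP systems is fast.
4. Some Constrained Nearest KP Problems
   Nearest (Markov) $\otimes$ (Markov)
5. Multilinear Connections
   A low-rank approximation of a 4-dimensional tensor
6. Off-The-Wall / Just-For-Fun
   Computing log(det(A)) for large sparse positive definite A

Essential KP Properties

Every $b_{ij} c_{kl}$ Shows Up

\[
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\otimes
\begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix}
=
\begin{bmatrix}
b_{11}c_{11} & b_{11}c_{12} & b_{11}c_{13} & b_{12}c_{11} & b_{12}c_{12} & b_{12}c_{13} \\
b_{11}c_{21} & b_{11}c_{22} & b_{11}c_{23} & b_{12}c_{21} & b_{12}c_{22} & b_{12}c_{23} \\
b_{11}c_{31} & b_{11}c_{32} & b_{11}c_{33} & b_{12}c_{31} & b_{12}c_{32} & b_{12}c_{33} \\
b_{21}c_{11} & b_{21}c_{12} & b_{21}c_{13} & b_{22}c_{11} & b_{22}c_{12} & b_{22}c_{13} \\
b_{21}c_{21} & b_{21}c_{22} & b_{21}c_{23} & b_{22}c_{21} & b_{22}c_{22} & b_{22}c_{23} \\
b_{21}c_{31} & b_{21}c_{32} & b_{21}c_{33} & b_{22}c_{31} & b_{22}c_{32} & b_{22}c_{33}
\end{bmatrix}
\]

Hierarchical

\[
A =
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\otimes
\begin{bmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ c_{21} & c_{22} & c_{23} & c_{24} \\ c_{31} & c_{32} & c_{33} & c_{34} \\ c_{41} & c_{42} & c_{43} & c_{44} \end{bmatrix}
\otimes
\begin{bmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{bmatrix}
\]

A is a 2-by-2 block matrix whose entries are 4-by-4 block matrices whose entries are 3-by-3 matrices.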
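The block structure described above can be checked numerically. The following is a minimal sketch using NumPy's `np.kron`; the particular matrices and the variable names (`B2`, `C4`, `D3`, `H`) are illustrative choices, not part of the talk.

```python
import numpy as np

# Small factors with easily recognizable entries (illustrative values).
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])
C = np.arange(1.0, 10.0).reshape(3, 3)

A = np.kron(B, C)  # 6-by-6: a 2-by-2 block matrix with 3-by-3 blocks

# Block (i, j) of B kron C is b_ij * C, so every product b_ij * c_kl shows up.
p, q = C.shape
for i in range(B.shape[0]):
    for j in range(B.shape[1]):
        block = A[i * p:(i + 1) * p, j * q:(j + 1) * q]
        assert np.allclose(block, B[i, j] * C)

# Hierarchical case: (2-by-2) kron (4-by-4) kron (3-by-3) is 24-by-24.
rng = np.random.default_rng(0)
B2, C4, D3 = rng.random((2, 2)), rng.random((4, 4)), rng.random((3, 3))
H = np.kron(B2, np.kron(C4, D3))
print(A.shape, H.shape)
```

The grouping of the triple product does not matter, so `np.kron(np.kron(B2, C4), D3)` gives the same 24-by-24 matrix.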
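Algebra

\[
\begin{aligned}
(B \otimes C)^T &= B^T \otimes C^T \\
(B \otimes C)^{-1} &= B^{-1} \otimes C^{-1} \\
(B \otimes C)(D \otimes F) &= BD \otimes CF \\
B \otimes (C \otimes D) &= (B \otimes C) \otimes D
\end{aligned}
\]

No: $B \otimes C \ne C \otimes B$ in general.

Yes: $B \otimes C = (\text{Perfect Shuffle})\,(C \otimes B)\,(\text{Perfect Shuffle})^T$.

The vec Operation

Turns matrices into vectors by stacking columns:

\[
X = \begin{bmatrix} 1 & 10 \\ 2 & 20 \\ 3 & 30 \end{bmatrix}
\quad\Rightarrow\quad
\mathrm{vec}(X) = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 10 \\ 20 \\ 30 \end{bmatrix}
\]

Important special case:

\[
\mathrm{vec}(\text{rank-1 matrix})
= \mathrm{vec}\!\left( \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \begin{bmatrix} 1 & 10 \end{bmatrix} \right)
= \begin{bmatrix} 1 \\ 10 \end{bmatrix} \otimes \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
\]

Reshaping

The matrix equation

\[
Y = C X B^T
\]

can be reshaped into a vector equation

\[
\mathrm{vec}(Y) = (B \otimes C)\, \mathrm{vec}(X).
\]

This implies fast linear equation solving and fast matrix-vector multiplication. (More later.)

Inheriting Structure

    If B and C are ...          then B ⊗ C is ...
    --------------------------  --------------------------
    nonsingular                 nonsingular
    lower (upper) triangular    lower (upper) triangular
    banded                      block banded
    symmetric                   symmetric
    positive definite           positive definite
    stochastic                  stochastic
    Toeplitz                    block Toeplitz
    permutations                a permutation
    orthogonal                  orthogonal

Computing the KP-SVD

Warm-Up: The Nearest KP Problem

Given $A \in \mathbb{R}^{m \times n}$ with $m = m_1 m_2$ and $n = n_1 n_2$.

Find $B \in \mathbb{R}^{m_1 \times n_1}$ and $C \in \mathbb{R}^{m_2 \times n_2}$ so

\[
\phi(B, C) = \| A - B \otimes C \|_F = \min
\]

A bilinear least squares problem. Fix B (or C) and it becomes linear in C (or B).

Reshaping the Nearest KP Problem

\[
\phi(B, C) =
\left\|
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44} \\
a_{51} & a_{52} & a_{53} & a_{54} \\
a_{61} & a_{62} & a_{63} & a_{64}
\end{bmatrix}
-
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}
\otimes
\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}
\right\|_F
\]

\[
=
\left\|
\begin{bmatrix}
a_{11} & a_{21} & a_{12} & a_{22} \\
a_{31} & a_{41} & a_{32} & a_{42} \\
a_{51} & a_{61} & a_{52} & a_{62} \\
a_{13} & a_{23} & a_{14} & a_{24} \\
a_{33} & a_{43} & a_{34} & a_{44} \\
a_{53} & a_{63} & a_{54} & a_{64}
\end{bmatrix}
-
\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \\ b_{12} \\ b_{22} \\ b_{32} \end{bmatrix}
\begin{bmatrix} c_{11} & c_{21} & c_{12} & c_{22} \end{bmatrix}
\right\|_F
\]

!!! Finding the nearest rank-1 matrix is an SVD problem !!!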
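The reshaping above can be sketched in NumPy as follows. The function names `rearrange` and `nearest_kp` are hypothetical helpers introduced for illustration, and a dense `np.linalg.svd` is assumed here in place of any iterative method. The sketch builds the rearranged matrix whose rows are the column-stacked blocks of A, takes its dominant singular triplet, and reshapes the result into B and C.

```python
import numpy as np

def rearrange(A, m1, n1, m2, n2):
    """Rearranged matrix: row (i + j*m1) holds vec(A_ij)^T, with column-major vec."""
    # blocks[i, k, j, l] = A[i*m2 + k, j*n2 + l], i.e. entry (k, l) of block (i, j)
    blocks = A.reshape(m1, m2, n1, n2)
    # Reorder axes to (j, i, l, k): row index j*m1 + i, column index l*m2 + k
    return blocks.transpose(2, 0, 3, 1).reshape(m1 * n1, m2 * n2)

def nearest_kp(A, m1, n1, m2, n2):
    """B (m1-by-n1) and C (m2-by-n2) minimizing ||A - kron(B, C)||_F via a dense SVD."""
    U, s, Vt = np.linalg.svd(rearrange(A, m1, n1, m2, n2), full_matrices=False)
    b = np.sqrt(s[0]) * U[:, 0]
    c = np.sqrt(s[0]) * Vt[0, :]
    # Undo the column-major vec on each factor
    return b.reshape(m1, n1, order="F"), c.reshape(m2, n2, order="F")

# The 6-by-4 example: B is 3-by-2 and C is 2-by-2.
rng = np.random.default_rng(0)
m1, n1, m2, n2 = 3, 2, 2, 2
A = rng.standard_normal((m1 * m2, n1 * n2))
B, C = nearest_kp(A, m1, n1, m2, n2)

# The Frobenius error equals the norm of the discarded singular values.
s = np.linalg.svd(rearrange(A, m1, n1, m2, n2), compute_uv=False)
err = np.linalg.norm(A - np.kron(B, C), "fro")
print(err, np.sqrt(np.sum(s[1:] ** 2)))
```

The factors are determined only up to a scaling that cancels in the product; splitting the singular value evenly between B and C is one convenient choice.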
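SVD Primer

\[
A \in \mathbb{R}^{m \times n} \quad\Rightarrow\quad U^T A V = \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)
\]

If $U = [\,u_1 \mid u_2 \mid \cdots \mid u_m\,]$ and $V = [\,v_1 \mid v_2 \mid \cdots \mid v_n\,]$ then

• The rank-1 matrix $\sigma_1 u_1 v_1^T$ solves

\[
\min_{\mathrm{rank}(\tilde{A}) = 1} \| A - \tilde{A} \|_F
\]

• $v_1$ is the dominant eigenvector for $A^T A$:

\[
A^T A v_1 = \sigma_1^2 v_1, \qquad A v_1 = \sigma_1 u_1, \qquad \sigma_1 = u_1^T A v_1
\]

• $u_1$ is the dominant eigenvector for $A A^T$:

\[
A A^T u_1 = \sigma_1^2 u_1, \qquad A^T u_1 = \sigma_1 v_1, \qquad \sigma_1 = v_1^T A^T u_1
\]

Sol'n: SVD of Permuted A + Reshaping

\[
\phi(B, C) =
\left\|
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44} \\
a_{51} & a_{52} & a_{53} & a_{54} \\
a_{61} & a_{62} & a_{63} & a_{64}
\end{bmatrix}
-
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}
\otimes
\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}
\right\|_F
\]

\[
=
\left\|
\begin{bmatrix}
a_{11} & a_{21} & a_{12} & a_{22} \\
a_{31} & a_{41} & a_{32} & a_{42} \\
a_{51} & a_{61} & a_{52} & a_{62} \\
a_{13} & a_{23} & a_{14} & a_{24} \\
a_{33} & a_{43} & a_{34} & a_{44} \\
a_{53} & a_{63} & a_{54} & a_{64}
\end{bmatrix}
-
\begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \\ b_{12} \\ b_{22} \\ b_{32} \end{bmatrix}
\begin{bmatrix} c_{11} & c_{21} & c_{12} & c_{22} \end{bmatrix}
\right\|_F
\]

General Solution Procedure

Minimize

\[
\phi(B, C) = \| A - B \otimes C \|_F = \left\| \tilde{A} - \mathrm{vec}(B)\, \mathrm{vec}(C)^T \right\|_F
\]

where, for the 3-by-2 block example,

\[
\tilde{A} =
\begin{bmatrix}
\mathrm{vec}(A_{11})^T \\ \mathrm{vec}(A_{21})^T \\ \mathrm{vec}(A_{31})^T \\
\mathrm{vec}(A_{12})^T \\ \mathrm{vec}(A_{22})^T \\ \mathrm{vec}(A_{32})^T
\end{bmatrix}.
\]

Solution: Compute the SVD $U^T \tilde{A} V = \Sigma$ and set

\[
\mathrm{vec}(B^{(opt)}) = \sqrt{\sigma_1}\, U(:, 1), \qquad \mathrm{vec}(C^{(opt)}) = \sqrt{\sigma_1}\, V(:, 1).
\]

Lanczos SVD Algorithm

Need to compute the dominant eigenvector $v_1$ of $A^T A$ and the dominant eigenvector $u_1$ of $A A^T$.

The power method approach...

    b = initial guess of v1;  c = initial guess of u1;  s = c^T A b;
    while ( ||A b - s c||_2  ≈  ||A v1 - σ1 u1||_2  is too big )
        c = A b;    c = c / ||c||_2;
        b = A^T c;  b = b / ||b||_2;
        s = c^T A b;
    end

The Lanczos method is better than this because it uses more than just the most recent b and c vectors. It too lives off of matrix-vector products, i.e., it is "sparse friendly."

The Nearest KP-rank r Problem

Use block Lanczos. E.g., to minimize

\[
\| A - B_1 \otimes C_1 - B_2 \otimes C_2 - B_3 \otimes C_3 \|_F
\]

use block Lanczos SVD with block width 3 and set

\[
\mathrm{vec}(B_i^{(opt)}) = \sqrt{\sigma_i}\, U(:, i), \qquad
\mathrm{vec}(C_i^{(opt)}) = \sqrt{\sigma_i}\, V(:, i), \qquad i = 1{:}3.
\]

The Complete KP-SVD

Given:

\[
A = \begin{bmatrix} A_{11} & \cdots & A_{1N} \\ \vdots & \ddots & \vdots \\ A_{M1} & \cdots & A_{MN} \end{bmatrix},
\qquad A_{ij} \in \mathbb{R}^{p \times q}
\]

Form $\tilde{A}$ (MN-by-pq) and apply LAPACK SVD:

\[
\tilde{A} = \sum_{i=1}^{r_A} \sigma_i u_i v_i^T
\]

Then:

\[
A = \sum_{i=1}^{r_A} \sigma_i \cdot \mathrm{reshape}(u_i, M, N) \otimes \mathrm{reshape}(v_i, p, q)
\]

The Theorems Follow From This

\[
A \iff \tilde{A}
\]

\[
A = \sum_{i=1}^{r_A} \sigma_i \, B_i \otimes C_i
\quad\iff\quad
\tilde{A} = \sum_{i=1}^{r_A} \sigma_i u_i v_i^T
\]

A Related Problem

Problem. Find X and Y to minimize

\[
\| A - (X \otimes Y - Y \otimes X) \|_F
\]

Solution. Find vectors x and y so

\[
\| \tilde{A} - (x y^T - y x^T) \|_F
\]

is minimized and reshape x and y to get $X^{(opt)}$ and $Y^{(opt)}$. The Schur decomposition of $\tilde{A} - \tilde{A}^T$ is involved.

Another Related Problem

Problem. Find X to minimize

\[
\| A - X \otimes X \|_F
\]

Solution. Find a vector x so

\[
\| \tilde{A} - x x^T \|_F
\]

is minimized and reshape to get $X^{(opt)}$. The Schur decomposition of $\tilde{A} + \tilde{A}^T$ is involved.

A Much More Difficult Problem

\[
\min_{B, C, D} \| A - B \otimes C \otimes D \|_F
\]

Computational multilinear algebra is filled with problems like this.

Nearest KP Preconditioners

Main Idea

(i) Suppose A is an N-by-N block matrix with p-by-p blocks.

(ii) Need to solve Ax = b. Ordinarily this is $O(N^3 p^3)$.

(iii) A system of the form

\[
(B_1 \otimes C_1 + B_2 \otimes C_2)\, z = r
\]

can be solved in $O(N^3 + p^3)$ time. Hint: $C_1 Z B_1^T + C_2 Z B_2^T = R$.

(iv) If

\[
(B_1 \otimes C_1 + B_2 \otimes C_2) \approx A
\]

we have a potential preconditioner.

A Block Toeplitz De-Blurring Problem

(Nagy, O'Leary, Kamm (1998))

Need to solve a large block Toeplitz system

\[
T x = b
\]

Preconditioner:

\[
T \approx T_1 \otimes T_2
\]

Can solve the nearest KP problem with the constraint that the factor matrices $T_1$ and $T_2$ are Toeplitz.

A Poisson-Related Problem

Poisson's equation on a rectangle with a regular (M+1)-by-(N+1) grid discretizes to

\[
A u = (I_M \otimes T_N + T_M \otimes I_N)\, u = f
\]

where the T's are 1-2-1 tridiagonals. Can be solved very fast.

A new method for the Navier-Stokes problem being developed by Diamessis and Escobar-Vargas leads to a linear system where the highly structured A-matrix has KP-rank $r_A = 16$.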
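As a rough illustration of why the Poisson system above "can be solved very fast", here is a minimal sketch that never forms the MN-by-MN matrix. It assumes the "1-2-1" tridiagonals are the usual symmetric second-difference matrices (2 on the diagonal, -1 beside it); the helper names `second_difference` and `solve_kp_poisson` are hypothetical. Using $\mathrm{vec}(CXB^T) = (B \otimes C)\,\mathrm{vec}(X)$, the system is equivalent to $T_N U + U T_M^T = F$ with $U$ of size N-by-M, which two small symmetric eigendecompositions diagonalize.

```python
import numpy as np

def second_difference(n):
    """n-by-n tridiagonal with 2 on the diagonal and -1 beside it (assumed form of T)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def solve_kp_poisson(T_M, T_N, F):
    """Solve (I_M kron T_N + T_M kron I_N) vec(U) = vec(F) for the N-by-M matrix U.

    Assumes T_M and T_N are symmetric, so eigh applies.
    """
    lam_M, Q_M = np.linalg.eigh(T_M)
    lam_N, Q_N = np.linalg.eigh(T_N)
    # In the eigenbases the equation T_N U + U T_M = F decouples entrywise.
    F_hat = Q_N.T @ F @ Q_M
    U_hat = F_hat / (lam_N[:, None] + lam_M[None, :])
    return Q_N @ U_hat @ Q_M.T

M, N = 5, 4
T_M, T_N = second_difference(M), second_difference(N)
rng = np.random.default_rng(0)
F = rng.standard_normal((N, M))
U = solve_kp_poisson(T_M, T_N, F)

# Check against the explicitly formed (MN)-by-(MN) matrix; vec stacks columns.
A = np.kron(np.eye(M), T_N) + np.kron(T_M, np.eye(N))
residual = np.linalg.norm(A @ U.flatten(order="F") - F.flatten(order="F"))
print(residual)
```

The dominant costs in this sketch are the two eigendecompositions and a handful of small matrix products, compared with a dense factorization of the full MN-by-MN matrix.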
Looking for a KP-preconditioner M of the form

\[
M = B_1 \otimes C_1 + B_2 \otimes C_2
\]

Some Constrained Nearest KP Problems

Joint with Stefan Ragnarsson

NOT Inheriting Structure

In the

\[
\min_{B, C} \| A - B \otimes C \|_F
\]

problem, sometimes B and C fail to inherit A's special attributes.

    If A is stochastic, then B and C are not quite stochastic.
    If A is orthogonal, then B and C are not quite orthogonal.

KP Approximation of Stochastic Matrices

If $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n_1 \times n_1}$, and $C \in \mathbb{R}^{n_2 \times n_2}$, and

\[
A = B \otimes C = \text{stochastic} \otimes \text{stochastic}
\]

then each A-entry has the form $b_{ij} c_{pq}$.