Rank Revealing Factorizations and Low Rank Approximations

L. Grigori, Inria Paris, UPMC
January 2018

Plan
- Low rank matrix approximation
- Rank revealing QR factorization
- LU CRTP: Truncated LU factorization with column and row tournament pivoting
- Experimental results, LU CRTP
- Randomized algorithms for low rank approximation

Low rank matrix approximation

Problem: given an m × n matrix A, compute a rank-k approximation $Z W^T$, where Z is m × k and $W^T$ is k × n.

The problem has diverse applications, from scientific computing (fast solvers for integral equations, H-matrices) to data analytics (principal component analysis, image processing, ...).

Replacing $Ax$ by $Z W^T x$ reduces the flops from $2mn$ to $2(m + n)k$.

Singular value decomposition

Given $A \in \mathbb{R}^{m \times n}$, $m \ge n$, its singular value decomposition is

\[
A = U \Sigma V^T =
\begin{pmatrix} U_1 & U_2 & U_3 \end{pmatrix}
\begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} V_1 & V_2 \end{pmatrix}^T
\]

where
- U is an m × m orthogonal matrix whose columns are the left singular vectors of A; $U_1$ is m × k, $U_2$ is m × (n − k), $U_3$ is m × (m − n)
- Σ is m × n, its diagonal is formed by $\sigma_1(A) \ge \dots \ge \sigma_n(A) \ge 0$; $\Sigma_1$ is k × k, $\Sigma_2$ is (n − k) × (n − k)
- V is an n × n orthogonal matrix whose columns are the right singular vectors of A; $V_1$ is n × k, $V_2$ is n × (n − k)

Norms

\[
\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2} = \sqrt{\sigma_1^2(A) + \dots + \sigma_n^2(A)}
\]
\[
\|A\|_2 = \sigma_{\max}(A) = \sigma_1(A)
\]

Some properties:

\[
\|A\|_2 \le \|A\|_F \le \sqrt{\min(m, n)}\, \|A\|_2
\]

Orthogonal invariance: if $Q \in \mathbb{R}^{m \times m}$ and $Z \in \mathbb{R}^{n \times n}$ are orthogonal, then

\[
\|QAZ\|_F = \|A\|_F, \qquad \|QAZ\|_2 = \|A\|_2
\]

Low rank matrix approximation

The best rank-k approximation $A_k = U_k \Sigma_k V_k^T$ is the rank-k truncated SVD of A [Eckart and Young, 1936]:

\[
\min_{\operatorname{rank}(\tilde{A}_k) \le k} \|A - \tilde{A}_k\|_2 = \|A - A_k\|_2 = \sigma_{k+1}(A) \qquad (1)
\]
\[
\min_{\operatorname{rank}(\tilde{A}_k) \le k} \|A - \tilde{A}_k\|_F = \|A - A_k\|_F = \sqrt{\sum_{j=k+1}^{n} \sigma_j^2(A)} \qquad (2)
\]

[Figure: original image of size 919 × 707; rank-38 approximation, SVD; rank-75 approximation, SVD.]
Image source: https://upload.wikimedia.org/wikipedia/commons/a/a1/Alan_Turing_Aged_16.jpg

Large data sets

Matrix A might not exist entirely at a given time; rows or columns are added progressively.
- Streaming algorithm: can solve an arbitrarily large problem with one pass over the data (a row or a column at a time).
- Weakly streaming algorithm: can solve a problem with O(1) passes over the data.

Matrix A might exist only implicitly, and it is never formed explicitly.

Low rank matrix approximation: trade-offs

[Figure]

Rank revealing QR factorization

Given A of size m × n, consider the decomposition

\[
A P_c = QR = Q \begin{pmatrix} R_{11} & R_{12} \\ & R_{22} \end{pmatrix}, \qquad (3)
\]

where $R_{11}$ is k × k, and $P_c$ and k are chosen such that $\|R_{22}\|_2$ is small and $R_{11}$ is well-conditioned.

By the interlacing property of singular values [Golub, Van Loan, 4th edition, page 487],

\[
\sigma_i(R_{11}) \le \sigma_i(A) \quad \text{and} \quad \sigma_j(R_{22}) \ge \sigma_{k+j}(A)
\]

for $1 \le i \le k$ and $1 \le j \le n - k$. In particular,

\[
\sigma_{k+1}(A) \le \sigma_{\max}(R_{22}) = \|R_{22}\|_2
\]

Rank revealing QR factorization

Given A of size m × n, consider the decomposition

\[
A P_c = QR = Q \begin{pmatrix} R_{11} & R_{12} \\ & R_{22} \end{pmatrix}. \qquad (4)
\]

If $\|R_{22}\|_2$ is small:
- $Q(:, 1:k)$ forms an approximate orthogonal basis for the range of A,

\[
a(:, j) = \sum_{i=1}^{\min(j,k)} r_{ij}\, Q(:, i) \in \operatorname{span}\{Q(:, 1), \dots, Q(:, k)\}
\]
\[
\operatorname{ran}(A) \in \operatorname{span}\{Q(:, 1), \dots, Q(:, k)\}
\]

- $P_c \begin{pmatrix} -R_{11}^{-1} R_{12} \\ I \end{pmatrix}$ is an approximate right null space of A.

Rank revealing QR factorization

The factorization from equation (4) is rank revealing if

\[
1 \le \frac{\sigma_i(A)}{\sigma_i(R_{11})},\ \frac{\sigma_j(R_{22})}{\sigma_{k+j}(A)} \le q_1(n, k),
\]

for $1 \le i \le k$ and $1 \le j \le \min(m, n) - k$, where

\[
\sigma_{\max}(A) = \sigma_1(A) \ge \dots \ge \sigma_{\min}(A) = \sigma_n(A)
\]

It is strong rank revealing [Gu and Eisenstat, 1996] if in addition

\[
\|R_{11}^{-1} R_{12}\|_{\max} \le q_2(n, k)
\]

Gu and Eisenstat show that given k and f, there exists a $P_c$ such that $q_1(n, k) = \sqrt{1 + f^2 k (n - k)}$ and $q_2(n, k) = f$.
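The rank revealing conditions can be checked numerically for any given column permutation. The following is a minimal sketch in Python with NumPy, not code from the slides: the helper names `greedy_column_order` and `rank_revealing_report`, the test matrix, and the choice k = 8 are all assumptions made for illustration. The column order is chosen greedily by largest residual norm, which mimics the pivot choice of QR with column pivoting.

```python
import numpy as np

def greedy_column_order(A):
    """Hypothetical helper: order columns by largest residual norm,
    deflating with Gram-Schmidt after each pick."""
    W = np.array(A, dtype=float)
    m, n = W.shape
    order = np.arange(n)
    for j in range(min(m, n)):
        # remaining column with the largest residual norm
        p = j + int(np.argmax(np.sum(W[:, j:] ** 2, axis=0)))
        W[:, [j, p]] = W[:, [p, j]]
        order[[j, p]] = order[[p, j]]
        nrm = np.linalg.norm(W[:, j])
        if nrm == 0.0:
            break
        q = W[:, j] / nrm
        # remove the component along q from the remaining columns
        W[:, j + 1:] -= np.outer(q, q @ W[:, j + 1:])
    return order

def rank_revealing_report(A, order, k):
    """Quantities appearing in the (strong) rank revealing conditions."""
    m, n = A.shape
    R = np.linalg.qr(A[:, order], mode="r")      # A P_c = Q R, keep R only
    R11, R12, R22 = R[:k, :k], R[:k, k:], R[k:, k:]
    sA = np.linalg.svd(A, compute_uv=False)
    s11 = np.linalg.svd(R11, compute_uv=False)
    s22 = np.linalg.svd(R22, compute_uv=False)
    r = min(m, n) - k
    X = np.linalg.solve(R11, R12)                # R11^{-1} R12
    # approximate right null space P_c [-R11^{-1} R12; I]
    N = np.zeros((n, n - k))
    N[order, :] = np.vstack([-X, np.eye(n - k)])
    return {
        "ratio11": sA[:k] / s11,                 # sigma_i(A) / sigma_i(R11)
        "ratio22": s22[:r] / sA[k:k + r],        # sigma_j(R22) / sigma_{k+j}(A)
        "max_entry": float(np.max(np.abs(X))),   # ||R11^{-1} R12||_max
        "R22_norm": float(np.linalg.norm(R22, 2)),
        "sigma_k_plus_1": float(sA[k]),
        "null_residual": float(np.linalg.norm(A @ N, 2)),
    }

# Test matrix with prescribed, decaying singular values (an assumption).
rng = np.random.default_rng(0)
m, n, k = 30, 20, 8
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
sigma = np.logspace(0, -6, n)
A = (U * sigma) @ V.T

report = rank_revealing_report(A, greedy_column_order(A), k)
print(report["ratio11"].max(), report["ratio22"].max(), report["max_entry"])
```

Both ratio vectors are at least 1 for any permutation (this is the interlacing property), while their largest values and `max_entry` indicate how close the chosen permutation comes to the bounds $q_1(n, k)$ and $q_2(n, k)$. The residual $\|A N\|_2$ of the approximate null space equals $\|R_{22}\|_2$.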
Factorization computed in 4mnk (QRCP) plus O(mnk) flops.

QR with column pivoting [Businger and Golub, 1965]

Idea:
- At the first iteration, the trailing columns are decomposed into a part parallel to the first column (or $e_1$) and an orthogonal part (in rows 2 : m).
- The column of maximum norm is the column with the largest component orthogonal to the first column.

Implementation:
- Find at each step of the QR factorization the column of maximum norm. Permute it into the leading position.
- If rank(A) = k, at step k + 1 the maximum norm is 0.
- There is no need to recompute the column norms at each step; they can be updated, since

\[
Q^T v = w = \begin{pmatrix} w_1 \\ w(2:m) \end{pmatrix}, \qquad \|w(2:m)\|_2^2 = \|v\|_2^2 - w_1^2
\]

QR with column pivoting [Businger and Golub, 1965]

Sketch of the algorithm

column norm vector: colnrm(j) = $\|A(:, j)\|_2$, j = 1 : n.
for j = 1 : n do
    Find column p of largest norm
    if colnrm[p] > ε then
        1. Pivot: swap columns j and p in A and modify colnrm.
        2. Compute Householder matrix $H_j$ s.t. $H_j A(j:m, j) = \pm \|A(j:m, j)\|_2\, e_1$.
        3. Update $A(j:m, j+1:n) = H_j A(j:m, j+1:n)$.
        4. Norm downdate $\text{colnrm}(j+1:n)^2 \mathrel{-}= A(j, j+1:n)^2$.
    else
        Break
    end if
end for

If the algorithm stops after k steps,

\[
\sigma_{\max}(R_{22}) \le \sqrt{n - k} \max_{1 \le j \le n-k} \|R_{22}(:, j)\|_2 \le \sqrt{n - k}\, \varepsilon
\]

Strong RRQR [Gu and Eisenstat, 1996]

Since

\[
\det(R_{11}) = \prod_{i=1}^{k} \sigma_i(R_{11}) = \sqrt{\det(A^T A)} \Big/ \prod_{i=1}^{n-k} \sigma_i(R_{22}),
\]

a strong RRQR is related to a large $\det(R_{11})$. The strong RRQR algorithm interchanges columns that increase $\det(R_{11})$, given f and k.
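Before moving on to the strong RRQR loop, the QRCP steps sketched above (pivot, Householder reflection, trailing update, norm downdate) and the determinant identity can be illustrated in a few lines of Python with NumPy. This is a sketch under stated assumptions, not the slides' implementation: the function name `qrcp`, the default threshold `eps`, and the random test matrix are hypothetical. Negative downdated norms are simply clipped to zero here, whereas production codes guard against cancellation more carefully.

```python
import numpy as np

def qrcp(A, eps=1e-12):
    """Householder QR with column pivoting (illustrative sketch).
    Returns the m x n triangularized array R, the column permutation,
    and the number of steps performed."""
    R = np.array(A, dtype=float)
    m, n = R.shape
    perm = np.arange(n)
    colnrm = np.sum(R ** 2, axis=0)          # squared column norms
    steps = 0
    for j in range(min(m, n)):
        p = j + int(np.argmax(colnrm[j:]))   # column of largest norm
        if colnrm[p] <= eps ** 2:
            break
        # 1. Pivot: swap columns j and p, and the stored norms.
        R[:, [j, p]] = R[:, [p, j]]
        perm[[j, p]] = perm[[p, j]]
        colnrm[[j, p]] = colnrm[[p, j]]
        # 2. Householder vector v with H = I - 2 v v^T and
        #    H x = alpha e_1, alpha = -sign(x_0) ||x||_2.
        x = R[j:, j].copy()
        alpha = -np.copysign(np.linalg.norm(x), x[0])
        v = x
        v[0] -= alpha
        vnorm = np.linalg.norm(v)
        if vnorm > 0.0:
            v /= vnorm
        # 3. Update the trailing block (column j included).
        R[j:, j:] -= 2.0 * np.outer(v, v @ R[j:, j:])
        R[j + 1:, j] = 0.0
        # 4. Norm downdate; clip small negative values from rounding.
        colnrm[j + 1:] -= R[j, j + 1:] ** 2
        colnrm[j + 1:] = np.maximum(colnrm[j + 1:], 0.0)
        steps += 1
    return R, perm, steps

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
R, perm, steps = qrcp(A)

# Check |det(R11)| = sqrt(det(A^T A)) / prod_i sigma_i(R22) for k = 2.
k = 2
R11, R22 = R[:k, :k], R[k:, k:]
s22 = np.linalg.svd(R22, compute_uv=False)
lhs = abs(np.linalg.det(R11))
rhs = np.sqrt(np.linalg.det(A.T @ A)) / np.prod(s22)

# Bound sigma_max(R22) <= sqrt(n - k) * max_j ||R22(:, j)||_2.
bound = np.sqrt(R22.shape[1]) * np.max(np.linalg.norm(R22, axis=0))
print(lhs, rhs, s22[0], bound)
```

The determinant is compared in absolute value because Householder reflections may flip the sign of the diagonal entries of R.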
Compute a strong RRQR factorization, given k:
    Compute $A\Pi = QR$ by using QRCP
    while there exist i and j such that $\det(\tilde{R}_{11}) / \det(R_{11}) > f$,
          where $R_{11} = R(1:k, 1:k)$, $\Pi_{i,j+k}$ permutes columns i and j + k,
          $R\,\Pi_{i,j+k} = \tilde{Q}\tilde{R}$, $\tilde{R}_{11} = \tilde{R}(1:k, 1:k)$ do
        Find i and j
        Compute $R\,\Pi_{i,j+k} = \tilde{Q}\tilde{R}$ and $\Pi = \Pi\,\Pi_{i,j+k}$
    end while

Strong RRQR (contd)

It can be shown that

\[
\frac{\det(\tilde{R}_{11})}{\det(R_{11})} = \sqrt{\left(R_{11}^{-1} R_{12}\right)_{i,j}^2 + \omega_i^2(R_{11})\, \chi_j^2(R_{22})} \qquad (5)
\]

for any $1 \le i \le k$ and $1 \le j \le n - k$ (the 2-norm of the j-th column of A is $\chi_j(A)$, and the 2-norm of the j-th row of $A^{-1}$ is $\omega_j(A)$).

Compute a strong RRQR factorization, given k:
    Compute $A\Pi = QR$ by using QRCP
    while $\max_{1 \le i \le k,\ 1 \le j \le n-k} \sqrt{\left(R_{11}^{-1} R_{12}\right)_{i,j}^2 + \omega_i^2(R_{11})\, \chi_j^2(R_{22})} > f$ do
        Find i and j such that $\sqrt{\left(R_{11}^{-1} R_{12}\right)_{i,j}^2 + \omega_i^2(R_{11})\, \chi_j^2(R_{22})} > f$
        Compute $R\,\Pi_{i,j+k} = \tilde{Q}\tilde{R}$ and $\Pi = \Pi\,\Pi_{i,j+k}$
    end while

Strong RRQR (contd)

$\det(R_{11})$ strictly increases with every permutation and no permutation repeats, hence there is a finite number of permutations to be performed.

Strong RRQR (contd)

Theorem [Gu and Eisenstat, 1996]. If the QR factorization with column pivoting as in equation (4) satisfies the inequality

\[
\sqrt{\left(R_{11}^{-1} R_{12}\right)_{i,j}^2 + \omega_i^2(R_{11})\, \chi_j^2(R_{22})} < f
\]

for any $1 \le i \le k$ and $1 \le j \le n - k$, then

\[
1 \le \frac{\sigma_i(A)}{\sigma_i(R_{11})},\ \frac{\sigma_j(R_{22})}{\sigma_{k+j}(A)} \le \sqrt{1 + f^2 k (n - k)},
\]

for any $1 \le i \le k$ and $1 \le j \le \min(m, n) - k$.

Tournament pivoting [Demmel et al., 2015]

One step of CA RRQR, tournament pivoting used to select k columns:
- Partition $A = (A_1, A_2, A_3, A_4)$.
- Select k columns from each column block, by using QR with column pivoting.
- At each level i of the tree
    At each node j do in parallel
        Let $A_{v,i-1}$, $A_{w,i-1}$ be the columns selected by the children of node j
        Select k columns from $(A_{v,i-1}, A_{w,i-1})$, by using QR with column pivoting
- Permute $A_{ji}$ in leading positions, compute QR with no pivoting:

\[
A P_{c1} = Q_1 \begin{pmatrix} R_{11} & * \\ & * \end{pmatrix}
\]
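The tournament described above can be sketched as a binary reduction over column blocks. The Python/NumPy code below is a hedged illustration only: the names `select_columns` and `tournament_select`, the block count, and the test matrix are assumptions, the per-node selection uses a simple greedy residual-norm rule in place of a full QRCP routine, and the nodes of each level are processed sequentially rather than in parallel.

```python
import numpy as np

def select_columns(B, k):
    """Hypothetical helper: local indices of up to k columns of B,
    chosen greedily by largest residual norm (QRCP-style pivots)."""
    W = np.array(B, dtype=float)
    m, n = W.shape
    order = np.arange(n)
    for j in range(min(m, n, k)):
        p = j + int(np.argmax(np.sum(W[:, j:] ** 2, axis=0)))
        W[:, [j, p]] = W[:, [p, j]]
        order[[j, p]] = order[[p, j]]
        nrm = np.linalg.norm(W[:, j])
        if nrm == 0.0:
            break
        q = W[:, j] / nrm
        W[:, j + 1:] -= np.outer(q, q @ W[:, j + 1:])
    return order[:k]

def tournament_select(A, k, nblocks=4):
    """Select k global column indices of A by a binary tournament."""
    n = A.shape[1]
    # Leaves: each column block proposes k candidate columns.
    blocks = [idx for idx in np.array_split(np.arange(n), nblocks) if idx.size]
    candidates = [idx[select_columns(A[:, idx], k)] for idx in blocks]
    # Internal nodes: merge the candidates of two children, keep k.
    while len(candidates) > 1:
        merged = []
        for a in range(0, len(candidates), 2):
            idx = np.concatenate(candidates[a:a + 2])
            merged.append(idx[select_columns(A[:, idx], k)])
        candidates = merged
    return candidates[0]

# Test matrix (an assumption): small noise plus three dominant,
# mutually orthogonal columns at positions 1, 6 and 11.
rng = np.random.default_rng(2)
A = 1e-3 * rng.standard_normal((20, 16))
for row, col in enumerate([1, 6, 11]):
    A[:, col] = 0.0
    A[row, col] = 10.0

k = 3
selected = tournament_select(A, k, nblocks=4)

# Permute the winners into leading positions, then QR with no pivoting.
rest = np.setdiff1d(np.arange(A.shape[1]), selected)
perm = np.concatenate([selected, rest])
Q1, R = np.linalg.qr(A[:, perm])
print(selected, np.abs(np.diag(R))[:k])
```

In this example each dominant column wins its own block and survives every merge, so the leading k × k block of R carries the three large diagonal entries.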
