

Second Exam Presentation
Low Rank Approximation
John Svadlenka

City University of New York Graduate Center

Date Pending

Outline

1 Introduction

2 Classical Results

3 Approximation and Probabilistic Results

4 Randomized Algorithms - Strategies and Benefits

5 Research Activity

6 Open Problems and Future Research Directions

Problem Definition

Given an m × n matrix A, we are often interested in approximating A as the product of an m × k matrix B and a k × n matrix C.

A ≈ B · C

Why?

Provided that k ≪ min(m, n):

- Arithmetic cost of a matrix-vector product is 2(m + n)k
- Storage space for the factors B and C is (m + n)k
- (m + n)k ≪ m · n
- We denote the product B · C as a rank-k approximation of A
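As a concrete illustration of these savings, here is a minimal numpy sketch (the dimensions m, n, k are illustrative assumptions) comparing the storage and matrix-vector cost of the factored form B · C with those of the full matrix:

```python
import numpy as np

# Illustrative sizes only: k is much smaller than min(m, n).
rng = np.random.default_rng(0)
m, n, k = 5000, 4000, 20
B = rng.standard_normal((m, k))
C = rng.standard_normal((k, n))
x = rng.standard_normal(n)

y = B @ (C @ x)                            # about 2(m + n)k flops instead of 2mn
print("factored storage:", (m + n) * k)    # 180,000 entries
print("full storage:    ", m * n)          # 20,000,000 entries
```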

More formally, we seek a rank-k matrix approximation Âk of A for some ε > 0 such that:

‖A − Âk‖ ≤ (1 + ε)‖A − Ak‖

Ak is the theoretical best rank-k approximation of A

Matrix norms are the Frobenius norm ‖·‖F or the spectral norm ‖·‖2

‖A‖F² := Σ_{i,j=1}^{m,n} |aij|²        ‖A‖2 := sup_{‖v‖2 = 1} ‖Av‖2

Ak can be computed from the SVD with cost O((m + n)mn). So we seek less costly approaches. Why?

Suppose m = n and compare mn(m + n) = 2n³ with n² log n:

n         n³           n² log n
10        1,000        332
100       1.00e+06     66,400
1,000     1.00e+09     1.00e+07
10,000    1.00e+12     1.33e+09

Consider the above statistics in light of some recent trends:

- Conventional LRA does not scale for Big Data purposes
- Approximation algorithms are increasingly preferred
- Applications utilizing numerical linear algebra are expanding beyond traditional scientific and engineering disciplines

Conventional LRA algorithms generate decompositions; the most important of these are the SVD, Rank-Revealing QR (RRQR), and Rank-Revealing LU (RRLU):

Singular Value Decomposition (SVD) [Eckart-Young]
Let A be an m × n matrix, possibly with complex elements, and let r = rank(A). Then there exist two unitary matrices U and V and an m × n diagonal matrix Σ with nonnegative elements σi, where σ1 ≥ σ2 ≥ · · · ≥ σr > 0 and σj = 0 for j > r, such that:

A = UΣV*

U and V are m × m and n × n, respectively.

QR Decomposition Let A be an m × n matrix with m ≥ n whose elements may be complex. Then there exists an m × n matrix Q and an n × n matrix R such that A = QR where the columns of Q are orthonormal and R is upper triangular.

- Cost O(mn min(m, n)) is lower than that for the SVD
- There are several efficient strategies to orthogonalize A
- Column i of A is a linear combination of the columns of Q with the coefficients given by column i of R

The LRA problem is also significant for these related subjects:

- Principal Component Analysis
- Clustering Algorithms
- Tensor Decomposition
- Rank Structured Matrices

But a series of recent trends has provided impetus for new approaches to LRA...

Consider these examples of Emerging Applications and Big Data:

- New disciplines: Machine Learning, Data Science, Image Processing
- Modern Massive Data Sets from Physical Systems Modelling, Sensor Measurements, Internet
- New Fields: Recommender Systems, Complex Systems Science

Classical LRA algorithms and their implementations, though well-developed over many years, are characterized by:

- Limited parallelization opportunities
- Relatively high computational complexity
- Memory bottlenecks with out-of-core data sets

[Eckart-Young Theorem]
Let A ∈ C^{m×n} and let Ak be the truncated SVD of rank k, where Uk, Vk, and Σk are m × k, n × k, and k × k, respectively. We have:

Ak = Uk Σk Vk*

Then the approximation errors are defined as below. Furthermore, these are the smallest errors of any rank-k approximation of A.

‖A − Ak‖2 = σ_{k+1}

‖A − Ak‖F = ( Σ_{j=k+1}^{min(m,n)} σj² )^{1/2}

Given a rank-k SVD representation of a matrix we may generate its low rank format:

Ak = Uk · (Σk Vk*)

Other decompositions consist of matrix factors being orthogonal or having a row and/or column subset of the original matrix:

- RRQR
- UTV
- CUR
- Interpolative Decomposition (ID) (one-sided and two-sided)

We may generate a low rank format similarly with:

W = CUR = [CU]R = C[UR]
W = UTV = (UT)V = U(TV)
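To make the Eckart-Young statement and the low-rank format concrete, here is a minimal numpy sketch (the matrix and rank are illustrative assumptions) that forms the truncated SVD and checks the spectral and Frobenius error formulas above:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 120))
k = 10

U, s, Vt = np.linalg.svd(A, full_matrices=False)
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]        # truncated SVD of rank k

# Eckart-Young error formulas
print(np.linalg.norm(A - Ak, 2), s[k])                           # sigma_{k+1}
print(np.linalg.norm(A - Ak, 'fro'), np.sqrt(np.sum(s[k:]**2)))  # tail energy

# low rank format: store only the two factors Uk and (Sigma_k Vk*)
B = U[:, :k]
C = np.diag(s[:k]) @ Vt[:k]
print(np.allclose(Ak, B @ C))
```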

Existence of a QR factorization for any matrix can be proven in many ways. For example, it follows from Gram-Schmidt orthogonalization: Theorem

Suppose (a1, a2, ..., an) is a linearly independent list of vectors of a fixed dimension. Then there is an orthonormal list of vectors (q1, q2, ..., qn) such that span(a1, a2, ..., an) = span(q1, q2, ..., qn).

Shortcomings of the Gram-Schmidt QR algorithm with respect to LRA:

- Problem: The algorithm may fail if rank(A) < n
  Solution: Introduce a column pivoting strategy
  Impact: A = QRP where P is a permutation matrix
- Problem: Rounding error impacts orthogonalization
  Solution: Normalize qi before computing qi+1
  Solution: Compute the qi's up to some epsilon tolerance
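The following is a minimal sketch (an assumed illustration, not the presentation's own routine) of modified Gram-Schmidt with column pivoting, normalization at each step, and a tolerance-based stop, addressing the problems listed above:

```python
import numpy as np

def pivoted_mgs_qr(A, tol=1e-10):
    """Modified Gram-Schmidt with column pivoting.
    Returns Q (m x r), R (r x n) and the permutation perm such that
    A[:, perm] ~= Q @ R, stopping once the largest remaining column
    norm drops below tol (the numerical rank r)."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    perm = np.arange(n)
    r = 0
    for i in range(n):
        # pivot: move the remaining column of largest norm to position i
        norms = np.linalg.norm(A[:, i:], axis=0)
        j = i + int(np.argmax(norms))
        if norms[j - i] < tol:
            break
        A[:, [i, j]] = A[:, [j, i]]
        R[:, [i, j]] = R[:, [j, i]]
        perm[[i, j]] = perm[[j, i]]
        # normalize q_i before the next step
        R[i, i] = np.linalg.norm(A[:, i])
        Q[:, i] = A[:, i] / R[i, i]
        # orthogonalize the remaining columns against q_i
        R[i, i+1:] = Q[:, i] @ A[:, i+1:]
        A[:, i+1:] -= np.outer(Q[:, i], R[i, i+1:])
        r += 1
    return Q[:, :r], R[:r], perm
```

Keeping only the first k pivoted columns of Q and the first k rows of R then yields a rank-k approximation of A, up to the column permutation.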

Skeleton (CUR) Decomposition Theorem
Let A be an m × n real matrix with rank(A) = k. Then there exists a nonsingular k × k submatrix Â of A.

Moreover, let I and J be the index sets of the rows and columns of A, respectively, that form Â. Then A = CUR where U = Â⁻¹, C = A(1..m, J), and R = A(I, 1..n).

- A set of k columns and rows captures A's column and row spaces
- The skeleton is in contrast to the SVD's left and right singular vectors
- Can use QRP or LUP algorithms to find the submatrix Â

Interpolative Decomposition Lemma Suppose A is an m × n matrix of rank k whose elements may be complex. Then there exists an m × k matrix B consisting of a subset of columns of A and a k × n matrix P such that: A = B · P

The k × k identity matrix Ik appears in some column subset of P

|pij | ≤ 1 for all i and j

- ID is more appropriate for data analysis purposes
- Also appropriate if properties of A are required in the decomposition factors
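A minimal sketch of a one-sided ID via column-pivoted QR (an assumed construction for illustration; the computed coefficients are not guaranteed to satisfy |pij| ≤ 1, unlike the existential lemma above):

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

def one_sided_id(A, k):
    """A ~= B @ P, where B holds k actual columns of A and P contains
    a k x k identity block in those column positions."""
    _, R, piv = qr(A, mode='economic', pivoting=True)
    B = A[:, piv[:k]]                              # skeleton columns of A
    T = solve_triangular(R[:k, :k], R[:k, k:])     # coefficients for the rest
    P = np.zeros((k, A.shape[1]))
    P[:, piv[:k]] = np.eye(k)
    P[:, piv[k:]] = T
    return B, P
```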

What type of decomposition is better? It depends...

NLA Theoretician’s point of view: orthogonal matrices are better.

- Input error propagation is minimized
- Orthogonal bases reduce the amount of arithmetic
- They preserve vector and matrix properties in multiplication
- But they are not easy to understand for data analysis

Data Analyst's perspective: submatrices are better.

- Preserve structural properties of the original matrix
- Easier to understand in application terms
- But may not be well-conditioned

The case for approximation approaches to LRA? A large set of results concerning:

- Random matrices and subspace projections
- Existential results for rank-k approximations
- Column and/or row sampling
- Matrix skeletons (CUR) and volume maximization

New algorithmic approaches:

- Process some matrix much smaller than the original
- Provide arbitrary accuracy up to machine precision
- Employ adaptive and non-adaptive strategies
- Separate randomized and deterministic processing

Johnson-Lindenstrauss Lemma [1984]
Let X1, X2, ..., Xn ∈ R^d. Then for ε ∈ (0, 1) there exists Φ ∈ R^{k×d} with k = O(ε⁻² log n) such that:

(1 − ε)‖Xi − Xj‖2 ≤ ‖ΦXi − ΦXj‖2 ≤ (1 + ε)‖Xi − Xj‖2

Distances among vectors in Euclidean space approximately preserved in lower dimensional space independent of d

Matrix vector multiplication is O(d log n) for each Xi

Dasgupta and Gupta (2003) proved that standard Gaussian matrices with i.i.d. N(0, 1) can be used for Φ

Achlioptas (2003) showed that random {+1, −1} entries suffice.
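A minimal numerical illustration of the lemma (the point count, dimension, and ε are assumptions chosen for the example), projecting with a scaled i.i.d. Gaussian Φ in the style of Dasgupta and Gupta:

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, eps = 50, 10_000, 0.25
k = int(np.ceil(4 * np.log(n) / eps**2))        # k = O(eps^-2 log n)

X = rng.standard_normal((n, d))
Phi = rng.standard_normal((k, d)) / np.sqrt(k)  # scaled Gaussian map
Y = X @ Phi.T

ratio = pdist(Y) / pdist(X)                     # pairwise distance distortion
print(k, ratio.min(), ratio.max())              # typically within [1-eps, 1+eps]
```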

Next major result: matrix vector multiplication in O(d log d + |P|).

Fast Johnson-Lindenstrauss Transform [Ailon Chazelle 2006]
Let Φ = PHD, where P ∈ R^{k×d}, H, D ∈ R^{d×d}, and d = 2^l.

Pij ∼ N(0, q⁻¹) with probability q; Pij = 0 with probability 1 − q; q = min(Θ(log² n / d), 1)

H2 = [ d^{-1/2}  d^{-1/2} ; d^{-1/2}  −d^{-1/2} ]   and   H_{2q} := [ Hq  Hq ; Hq  −Hq ],  q = 2^h, h = 1, ..., l

D is a diagonal matrix with dii drawn uniformly from {1, −1}. Then we have with probability 2/3 that:

(1 − ε)‖Xi‖2 ≤ ‖ΦXi‖2 ≤ (1 + ε)‖Xi‖2

Relative-Error Bound (Frobenius norm) [Sarlós 2006]
Let A ∈ R^{m×n}. If Φ is an r × n J-L transform with i.i.d. zero-mean {−1, +1} entries, r = Θ(k/ε + k log k), and ε ∈ (0, 1), then with probability ≥ 0.5 we have:

‖A − Proj_{AΦᵀ,k}(A)‖F ≤ (1 + ε)‖A − Ak‖F

where Proj_{AΦᵀ,k}(A) is the best rank-k approximation of the projection of A onto the column space of AΦᵀ.

Papadimitriou et al. (2000) first applied random projections for Latent Semantic Indexing (LSI) and derived an additive error bound result.

A relative-error bound in the spectral norm uses a power iteration to offset any slow singular value decay.

Relative-Error Bound (Spectral norm) [Halko et al. 2011]
Let A ∈ R^{m×n}. If B is an n × 2k Gaussian matrix and Y = (AA*)^q AB, where q is a small non-negative integer and 2k is the target rank with 2 ≤ k ≤ 0.5 min{m, n}, then:

E ‖A − Proj_{Y,2k}(A)‖2 ≤ ( 1 + 4 √( 2 min(m, n) / (k − 1) ) )^{1/(2q+1)} ‖A − Ak‖2

- A power iteration amplifies the decay of A's singular values (σj becomes σj^{2q+1}), improving accuracy
- A refined proof [Woodruff 2014] gave a rank-k approximation

From the Relative-Error Bound results of Sarlós and Halko et al.:

With l > k random linear combinations of A's columns ⇒ we can obtain a rank-k approximation of A

How and why?

- Multiplying A by a random vector x gives y ∈ colspace(A)
- With high probability the y's are linearly independent
- We get a new approximate basis Â for A with dimension l
- Project A onto Â
- Get a rank-k matrix approximation of this projection

Consider the existence result of Ruston (1962) for a collection of k columns, C, of A ∈ R^{m×n}:

‖A − CC†A‖2 ≤ √(1 + k(n − k)) ‖A − Ak‖2

The CX approximation is A ≈ CX where X := C †A

Sampling with Euclidean norms of matrix columns [Frieze Kannan Vempala 2004] to get additive error bounds

Sampling according to the top right singular vectors [Boutsidis Mahoney Drineas 2010] for relative error bounds
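A minimal sketch of a CX approximation with column sampling proportional to squared Euclidean column norms, in the spirit of Frieze-Kannan-Vempala (an illustrative assumption; their analysis also rescales the sampled columns, omitted here since rescaling does not change CC†A):

```python
import numpy as np

def cx_approx(A, c, rng=None):
    """Sample c columns with probability proportional to squared column
    norms, form C, and set X = pinv(C) @ A so that A ~= C @ X."""
    rng = rng or np.random.default_rng(0)
    p = np.sum(A * A, axis=0)
    p = p / p.sum()
    cols = rng.choice(A.shape[1], size=c, replace=True, p=p)
    C = A[:, cols]
    X = np.linalg.pinv(C) @ A
    return C, X
```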

Another approach to LRA extends column sampling to also include row sampling:

- Extensions to both CX probability distribution approaches
- Approximation error proportional to the square of the CX error

General Approach:

1. Sample c columns of A to get C as in CX
2. Sample r rows from A using a probability distribution constructed from C
3. Re-scale the selected rows and columns
4. Additional processing steps to get an LRA

More recent directions include CUR with volume sampling:

Pseudo-Skeleton Approximation [Goreinov et al. 1997]
Suppose A ∈ R^{m×n}. Then there exists a set of k columns and rows, C and R, in A as given by their index sets c and r, respectively, and a matrix U ∈ R^{k×k} such that:

‖A − CUR‖2 ≤ O(√k (√m + √n)) ‖A − Ak‖2

Maximal Volume for LRA [Goreinov and Tyrtyshnikov 2001]
Suppose Â is a CUR approximation of the form given above and U = A(r, c)⁻¹. If A(r, c) has maximal determinant modulus among all k × k submatrices of A, then

‖A − Â‖C ≤ (k + 1)‖A − Ak‖2

where ‖·‖C denotes the entrywise maximum (Chebyshev) norm.

CUR approximation of A depends on finding a sufficiently large volume submatrix:

- The submatrix is the intersection of C and R in the CUR
- Volume quantifies the orthogonality of matrix columns
- It is NP-hard to find a submatrix of maximal volume
- Greedy algorithms find approximate maximal volume

This random projection algorithm follows from J-L Lemma and Relative-Error Bound Results:

Algorithm 1: Dimension Reduction [Halko et al. 2011]

Input: A ∈ R^{m×n}; rank k; oversampling parameter p
Output: B ∈ R^{m×(k+p)}, C ∈ R^{(k+p)×n}

1. l ← k + p
2. Construct a random Gaussian matrix G ∈ R^{n×l}
3. Y ← A · G
4. Compute an orthonormal basis matrix Q for Y
5. B ← Q; C ← Q* · A
6. Output B, C

To get a rank-l SVD approximation of A from the algorithm output:

1. Run an SVD algorithm on the matrix C = Û Σ V*
2. U ← B · Û

Comments on the algorithm:

- The algorithm itself uses conventional steps on smaller matrices
- Matrix-matrix multiplication (a block operation) is preferable for A
- The costliest step is Y ← A · G, requiring O(mnl) ops
- The QR factorization may avoid the overhead of column pivoting
- The oversampling parameter is typically higher with other random matrices
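The following is a minimal numpy sketch of Algorithm 1 together with the SVD completion step (the power-iteration parameter q is included to reflect the spectral-norm bound of Halko et al.; the default parameters are illustrative assumptions):

```python
import numpy as np

def randomized_svd(A, k, p=10, q=0, rng=None):
    rng = rng or np.random.default_rng(0)
    m, n = A.shape
    l = k + p
    G = rng.standard_normal((n, l))     # random Gaussian multiplier
    Y = A @ G                           # costliest step: O(mnl)
    for _ in range(q):                  # optional power iteration, (AA*)^q AG
        Y = A @ (A.T @ Y)               # re-orthonormalizing here helps stability
    Q, _ = np.linalg.qr(Y)              # orthonormal basis for range(Y)
    C = Q.T @ A                         # small (k+p) x n matrix
    Uh, s, Vt = np.linalg.svd(C, full_matrices=False)
    U = Q @ Uh
    return U[:, :k], s[:k], Vt[:k]      # truncate to rank k

# usage: U @ np.diag(s) @ Vt approximates the best rank-k SVD of A
```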

Other possibilities:

Introduce parallelism for the matrix-matrix multiplication

SRFT/SRHT random multipliers reduce the multiplication cost to O(mn log l)

Superfast abridged (sparse) versions of SRFT/SRHT allow further cost reduction, though with no probability guarantee.

The Subsampled Random Hadamard Transform (SRHT) is √(n/l) DHR

- D ∈ C^{n×n} is a diagonal matrix of random {−1, +1} entries
- H is the n × n Hadamard matrix
- R is an n × l matrix whose columns are random columns of the n × n identity matrix
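A minimal sketch of applying such a multiplier on the right of A without forming it explicitly (the fast Walsh-Hadamard routine below is an assumed helper; n must be a power of two):

```python
import numpy as np

def fwht_rows(X):
    """Unnormalized fast Walsh-Hadamard transform of each row of X."""
    X = X.copy()
    n = X.shape[1]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = X[:, i:i + h].copy()
            b = X[:, i + h:i + 2 * h].copy()
            X[:, i:i + h] = a + b
            X[:, i + h:i + 2 * h] = a - b
        h *= 2
    return X

def srht_sketch(A, l, rng=None):
    """Y = A @ (sqrt(n/l) D H R): flip column signs (D), apply the
    normalized Hadamard transform (H) in O(mn log n), sample l columns (R)."""
    rng = rng or np.random.default_rng(0)
    m, n = A.shape                              # n assumed a power of two
    d = rng.choice([-1.0, 1.0], size=n)
    AH = fwht_rows(A * d) / np.sqrt(n)          # A D H with H normalized
    cols = rng.choice(n, size=l, replace=False)
    return np.sqrt(n / l) * AH[:, cols]
```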

Gaussian random matrices:

- Have to generate n × l entries; multiplication is also expensive
- Probability of failure is 3e^{−p}

Fast SRFT/SRHT:

- Recursive divide and conquer ⇒ smaller complexity cost
- Only n + l random entries needed
- Probability of failure rises: O(1/k) for a rank-k approximation
- Non-sequential memory access ⇒ memory bottlenecks

In general, desirable properties of random multipliers include:

- Orthogonal
- Sparse (but not too sparse)
- Structured

Questions to consider with regard to SRFT/SRHT:

- Are there alternatives that do not have the memory issues?
- Concerns about the FFT's limited parallelization
- Alternatives: trade off arithmetic complexity for better memory performance and parallelization?
- Can we have the best of both worlds?

Results on different multipliers to be shown from my own research ...

CUR Cross-Approximation

[Figure: The first three recursive steps of a Cross Approximation algorithm output three striped matrices W1, W2, and W3]

Adapted from Low Rank Approximation: New Insights, Accurate Superfast Algorithms, Pre-processing and Extensions, Victor Y. Pan, Qi Luan, John Svadlenka, Liang Zhao 2017

To complete the CUR approximation:

1. Form the matrix U by taking the inverse of A(I, J)
2. Set C = A(:, J) and R = A(I, :)

How to approximate the maximum volume:

- Use RRLU or RRQR algorithms
- Example: LU factorization to generate an upper triangular factor [C.-T. Pan 2000]
- For a triangular matrix T ∈ R^{n×n}: det(T) = ∏_{i=1}^{n} tii
- The goal is to maximize the absolute values on T's diagonal
- Involves column interchanges and searching for maximum absolute-valued elements
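As an assumed illustration of this idea (not the algorithm benchmarked later), here is a greedy complete-pivoting sketch that repeatedly selects the largest remaining absolute entry, which tends to produce a large-volume k × k submatrix whose indices can then complete the CUR:

```python
import numpy as np

def greedy_pivot_indices(A, k):
    """Greedy LU-style elimination with complete pivoting; returns k row
    and column indices whose intersection has large (approximately
    maximal) determinant modulus."""
    S = np.array(A, dtype=float)
    rows, cols = [], []
    for _ in range(k):
        i, j = np.unravel_index(np.argmax(np.abs(S)), S.shape)
        if S[i, j] == 0:
            break                                          # numerical rank reached
        rows.append(i)
        cols.append(j)
        S = S - np.outer(S[:, j], S[i, :]) / S[i, j]       # Schur complement update
    return np.array(rows), np.array(cols)

# completing the CUR from the selected index sets I, J:
# I, J = greedy_pivot_indices(A, k)
# C, R, U = A[:, J], A[I, :], np.linalg.inv(A[np.ix_(I, J)])
```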

Some comments on the CUR Cross Approximation:

- As with Dimension Reduction, it runs an algorithm on a matrix smaller than A
- Each pass through the algorithm's loop requires only O((m + n)k²) ops
- Implications of not using all matrix entries in the algorithm?
- How to parallelize this algorithm? Perhaps a divide-and-conquer approach with small blocks.

Formulate random multipliers with the strategy:

1. Utilize structured, sparse primitive matrices of random (Gaussian, Bernoulli) variables to form families of random multipliers B
2. B ∈ R^{n×l}, B = Σ_{i=1}^{t} Bi, and t is a small constant
3. The Bi are chosen and applied from the following classes:
   - Abridged and Permuted Hadamard APH (with optional scaling S)
   - Orthogonal Permutation matrix P
   - Inverse bidiagonal matrix IBD: (I + SZ)⁻¹, where S is a diagonal matrix and Z is the down-shift matrix with ones on the first subdiagonal and zeros elsewhere

Numerical Experiments: Relative errors with various multipliers

Multiplier Sum     SVD-generated Matrices       Laplacian Matrices
                   Mean         Std             Mean         Std
Gaussian           1.07E-08     3.82E-09        2.05E-13     1.62E-13
ASPH, 2 IBD        1.23E-08     5.84E-09        1.69E-13     1.34E-13
ASPH, 3 IBD        1.33E-08     1.00E-08        1.98E-13     1.30E-13
3 IBD              1.18E-08     6.23E-09        1.78E-13     1.42E-13
APH, 3 IBD         1.28E-08     1.40E-08        2.33E-13     3.44E-13
APH, 2 IBD         1.43E-08     1.87E-08        1.78E-13     1.61E-13
ASPH, 1 P          1.22E-08     1.26E-08        2.21E-13     2.83E-13
ASPH, 2 P          1.51E-08     1.18E-08        3.57E-13     9.27E-13
ASPH, 3 P          1.19E-08     6.93E-09        2.24E-13     1.76E-13
APH, 3 P           1.26E-08     1.16E-08        2.15E-13     1.70E-13
APH, 2 P           1.31E-08     1.18E-08        1.25E-14     5.16E-14

Investigate novel approaches that decrease computation:

- Sum of IBDs without APH, ASPH
- IBD is a rank structured matrix: low-rank off-diagonal blocks

Matrix Matrix Multiplication with IBD is O((n + l)m) ops

Good spatial and temporal locality (unlike SRFT/SRHT)

Generalize to other rank structured matrices?

Our numerical experiments are promising, but new directions remain to be investigated from a computational perspective:

- Incorporate approximate leverage scores
- Avoid random memory access (max element searching, column and row interchanges)
- Look for matrix-matrix multiplication possibilities instead
- Extensions to tensors?

CUR Cross Approximation Benchmark Results

Inputs             rank    mean        std
baart              6       1.94e-07    3.57e-09
shaw               12      3.02e-07    6.84e-09
gravity            25      3.35e-07    1.97e-07
wing               4       1.92e-06    8.78e-09
foxgood            10      7.25e-06    1.09e-06
inverse Laplace    25      2.40e-07    6.88e-08

Table: CUR approximation of benchmark 1000 × 1000 input matrices (at the numerical rank of the input matrices) of discretized integral equations from the San Jose State University Singular Matrix Database

Open Problems: Do there exist random multipliers for Dimension Reduction such that Matrix Matrix multiplication can be done faster than O(mn log n)?

Does there exist a CUR approximation algorithm with a relative error (1 + ε) bound in the spectral norm?

Future Research Directions:

Theoretical, algorithmic, and computational research in Low Rank Approximation, its applications, and related problem areas

Acknowledgements

I would like to thank my mentor, Professor Victor Pan, for his thoughtful guidance, insight, and support throughout my doctoral education. I am also grateful to Professors Feng Gu and xxxxx for their participation and interest as committee members for my Second Exam. Thank you.

References

N. Halko, P. G. Martinsson, and J. A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Approximate Matrix Decompositions, SIAM Review, 53, 2, 217-288, 2011.

M. W. Mahoney, Randomized Algorithms for Matrices and Data, Foundations and Trends in Machine Learning, NOW Publishers, 3, 2, 2011. Preprint: arXiv:1104.5557 (2011). (Abridged version in: Advances in Machine Learning and Data Mining for Astronomy, edited by M. J. Way et al., 647-672, 2012.)

D. P. Woodruff, Sketching as a Tool for Numerical Linear Algebra, Foundations and Trends in Theoretical Computer Science, 10, 1-2, 1-157, 2014.

T. Sarlós, Improved Approximation Algorithms for Large Matrices via Random Projections, Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), 143-152, 2006.

G. H. Golub and C. Reinsch, Singular Value Decomposition and Least Squares Solutions, Numerische Mathematik, 14, 5, 403-420, 1970.

S. Axler, Linear Algebra Done Right, second edition, Springer, New York, NY, 1997.

S. A. Goreinov, E. E. Tyrtyshnikov, and N. L. Zamarashkin, A Theory of Pseudo-Skeleton Approximations, Linear Algebra and Its Applications, 261, 1-21, 1997.

E. Liberty, F. Woolfe, P. G. Martinsson, V. Rokhlin, and M. Tygert, Randomized Algorithms for the Low Rank Approximation of Matrices, PNAS, 104, 51, 20167-20172, 2007.

F. Woolfe, E. Liberty, V. Rokhlin, and M. Tygert, A Fast Randomized Algorithm for the Approximation of Matrices, Technical Report YALEU/DCS/TR-1380, Yale University Department of Computer Science, New Haven, CT, 2007.

W. B. Johnson and J. Lindenstrauss, Extensions of Lipschitz Mappings into a Hilbert Space, Proc. of Modern Analysis and Probability, Contemporary Mathematics, 26, 189-206, 1984.

N. Ailon and B. Chazelle, Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform, STOC 2006: Proc. 38th Ann. ACM Symposium on Theory of Computing, 557-563, 2006.

P. Drineas, M. W. Mahoney, and S. Muthukrishnan, Relative-Error CUR Matrix Decompositions, SIAM Journal on Matrix Analysis and Applications, 30, 2, 844-881, 2008.

C. H. Papadimitriou, P. Raghavan, H. Tamaki, and S. Vempala, Latent Semantic Indexing: A Probabilistic Analysis, Journal of Computer and System Sciences, 61, 2, 217-235, 2000.

S. A. Goreinov, N. L. Zamarashkin, and E. E. Tyrtyshnikov, Pseudo-Skeleton Approximations by Matrices of Maximal Volume, Mathematical Notes, 62, 4, 515-519, 1997.

S. A. Goreinov and E. E. Tyrtyshnikov, The Maximal-Volume Concept in Approximation by Low-Rank Matrices, Contemporary Mathematics, 208, 47-51, 2001.

D. Achlioptas, Database-Friendly Random Projections, Proc. ACM Symp. on the Principles of Database Systems, 274-281, 2001.

C.-T. Pan, On the Existence and Computation of Rank-Revealing LU Factorizations, Linear Algebra and Its Applications, 316, 199-222, 2000.

V. Y. Pan, Structured Matrices and Polynomials: Unified Superfast Algorithms, Birkhäuser/Springer, Boston/New York, 2001.

V. Y. Pan, J. Svadlenka, and L. Zhao, Fast Derandomized Low-Rank Approximation and Extensions, CoRR, abs/1607.05801, 2016.

M. Rudelson and R. Vershynin, Non-Asymptotic Theory of Random Matrices: Extreme Singular Values, CoRR, abs/1003.2990v2, 2010.

S. Dasgupta and A. Gupta, An Elementary Proof of a Theorem of Johnson and Lindenstrauss, Random Structures and Algorithms, 22, 1, 60-65, 2003.

A. Frieze, R. Kannan, and S. Vempala, Fast Monte-Carlo Algorithms for Finding Low-Rank Approximations, Journal of the ACM, 51, 6, 1025-1041, 2004.

B. Akin, F. Franchetti, and J. C. Hoe, FFTs with Near-Optimal Memory Access Through Block Data Layouts, Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.

L. A. Barba and R. Yokota, How Will the Fast Multipole Method Fare in the Exascale Era, SIAM News, 46, 6, 1-3, 2013.

M. Gu and S. C. Eisenstat, Efficient Algorithms for Computing a Strong Rank-Revealing QR Factorization, SIAM Journal on Scientific Computing, 17, 4, 848-869, 1996.

O. Lindtjorn et al., Beyond Traditional Microprocessors for Geoscience High-Performance Computing Applications, IEEE Micro, 31, 2, 41-49, 2011.

A. Ruston, Auerbach's Theorem and Tensor Products of Banach Spaces, Mathematical Proceedings of the Cambridge Philosophical Society, 58, 3, 476-480, doi:10.1017/S0305004100036744, 1962.

H. Cheng et al., On the Compression of Low Rank Matrices, SIAM Journal on Scientific Computing, 26, 4, 1389-1404, 2005.

APPENDIX: Traditional Applications

Applications of matrix computations have typically included:

- Physical Sciences and Engineering
- Data Collection and Analysis
- Computer Graphics
- Biological and Life Sciences

The Theoretical Computer Science (TCS) perspective is increasingly important:

- Cross-fertilization of research in both fields
- Demands of new applications of interest to TCS
- Shortcomings of conventional LRA algorithms

APPENDIX: Two-sided Interpolative Decomposition

Two-sided Interpolative Decomposition Theorem [Cheng et al 2005] Let A be an m × n matrix and k ≤ min(m, n). Then there exists:

A = P_L [ Ik ; S ] A_S [ Ik | T ] P_R* + X

such that P_L and P_R are permutation matrices, A_S is the k × k skeleton submatrix, S ∈ C^{(m−k)×k}, T ∈ C^{k×(n−k)}, and X satisfies:

‖S‖F ≤ √(k(m − k))
‖T‖F ≤ √(k(n − k))
‖X‖2 ≤ σ_{k+1}(A) √(1 + k(min(m, n) − k))

APPENDIX: Deterministic Algorithms

Theorem

Gram-Schmidt and QR Factorization: Suppose (a1, a2,..., an) is a linearly independent list of vectors in an inner product space V . Then there is an orthonormal list of vectors (q1, q2,..., qn) such that span(a1, a2,..., an) = span(q1, q2,..., qn).

Proof. Let proj(r, s) := (⟨r, s⟩ / ⟨s, s⟩) s denote the projection of r onto s.

w1 := a1
w2 := a2 − proj(a2, w1)
...
wn := an − proj(an, w1) − proj(an, w2) − ··· − proj(an, wn−1)

q1 = w1/‖w1‖, q2 = w2/‖w2‖, ..., qn = wn/‖wn‖

APPENDIX: Deterministic Algorithms

Re-arranging equations for w1, w2,..., wn to be equations with a1, a2,..., an on the left-hand side and replacing wi with qi gives A = Q · R where A = [a1, a2,..., an]

Q = [q1, q2, ..., qn]

R = [ ⟨q1, a1⟩   ⟨q1, a2⟩   ⟨q1, a3⟩   ...   ⟨q1, an⟩
      0          ⟨q2, a2⟩   ⟨q2, a3⟩   ...   ⟨q2, an⟩
      0          0          ⟨q3, a3⟩   ...   ⟨q3, an⟩
      .          .          .          .     .
      0          0          0          0     ⟨qn, an⟩ ]

APPENDIX: Deterministic Algorithms
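A minimal sketch of this construction in numpy (classical Gram-Schmidt, shown only to mirror the derivation above; the pivoted variant given earlier is preferable in practice):

```python
import numpy as np

def classical_gram_schmidt_qr(A):
    """Orthogonalize the columns of A in order and collect the inner
    products <q_i, a_j> into the upper triangular matrix R."""
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        w = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # <q_i, a_j>
            w -= R[i, j] * Q[:, i]        # subtract the projection onto q_i
        R[j, j] = np.linalg.norm(w)
        Q[:, j] = w / R[j, j]             # assumes linearly independent columns
    return Q, R
```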

As an alternative to Gram-Schmidt QR, consider a product Qn ··· Q2Q1 of orthogonal matrices that transforms A to upper triangular form R:

(Qn ··· Q2Q1)A = R

Multiplying both sides by (Qn ··· Q2Q1)⁻¹, we have that:

(Qn ··· Q2Q1)⁻¹(Qn ··· Q2Q1)A = (Qn ··· Q2Q1)⁻¹R

A = Q1Q2 ··· QnR

A product of orthogonal matrices is also orthogonal, so allowing for column pivoting we have that:

AΠ = Q1Q2 ... QnR

A Householder reflection matrix is used for each Qi, i = 1, 2, ..., n to transform A to R column-wise...

APPENDIX: Deterministic Algorithms

A Householder matrix-vector multiplication Hx = (I − 2vvᵀ)x reflects a vector x across the hyperplane normal to v.

Unit vector v is constructed for each Qi Householder matrix so that entries of column i below the diagonal of A vanish.

- x = (aii, a_{i+1,i}, ..., a_{mi}), the subdiagonal part of column i
- v depends upon x and the standard basis vector ei
- The matrix product Qi · A is applied
- The above items are repeated for each column of A

Impact on the QR algorithm:

- Householder matrices improve numerical stability
- But each matrix Qi is applied separately to A
- Therefore, parallelism options are limited

APPENDIX: Deterministic Algorithms
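A minimal sketch of Householder QR following these steps (illustrative only; a production routine would store the reflectors compactly rather than forming Q explicitly):

```python
import numpy as np

def householder_qr(A):
    """Apply one Householder reflector per column so that the entries
    below the diagonal vanish; accumulate Q as the product of reflectors."""
    R = np.array(A, dtype=float)
    m, n = R.shape
    Q = np.eye(m)
    for i in range(min(m, n)):
        x = R[i:, i]
        s = 1.0 if x[0] >= 0 else -1.0
        v = x.copy()
        v[0] += s * np.linalg.norm(x)      # v = x + sign(x_1) ||x|| e_1
        nv = np.linalg.norm(v)
        if nv == 0:
            continue
        v /= nv
        # apply H = I - 2 v v^T to the trailing block of R, and to Q on the right
        R[i:, :] -= 2.0 * np.outer(v, v @ R[i:, :])
        Q[:, i:] -= 2.0 * np.outer(Q[:, i:] @ v, v)
    return Q, R                            # A = Q @ R, with R upper triangular
```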

The SVD decomposition A = UΣV* is computed in two distinct steps:

1st Step: Use two sequences of Householder transformations to reduce A to upper bidiagonal form:

B = Qn ··· Q2Q1 A P1P2 ··· Pn−2

Therefore, we have that: A = Q1Q2 ··· Qn B Pn−2 ··· P2P1

2nd Step: Use two sequences of Givens rotations (orthogonal transformations) to reduce B to diagonal form Σ:

Σ = Gn−1 ... G2G1BF1F2 ... Fn−1

Likewise, we have that: B = G1G2 ... Gn−1ΣFn−1 ... F2F1

Set U := Q1Q2 ··· Qn G1G2 ··· Gn−1
Set V := (F1F2 ··· Fn−1)* (P1P2 ··· Pn−2)*

APPENDIX: Deterministic Algorithms

- SVD cost is O(mn max(m, n))
- QR cost is O(kmn) for a rank-k approximation

- Random memory access (e.g., column pivoting) contributes to memory bottlenecks
- This is especially the case for out-of-core data sets
- The standard QR algorithm forms Q from a product of Householder reflector matrices, which permits better numerical stability