Low Rank Approximation Lecture 1
Daniel Kressner Chair for Numerical Algorithms and HPC Institute of Mathematics, EPFL [email protected]
1 Organizational aspects
▶ Lectures: Tuesday 8-10, MA A110. First: September 25, Last: December 18.
▶ Exercises: Tuesday 8-10, MA A110. First: September 25, Last: December 18.
▶ Exam: Miniproject + oral exam.
▶ Webpage: https://anchp.epfl.ch/lowrank.
▶ [email protected], [email protected]
2 From http://www.niemanlab.org
... his [Aleksandr Kogan’s] message went on to confirm that his approach was indeed similar to SVD or other matrix factorization methods, like in the Netflix Prize competition, and the Kosinski-Stillwell-Graepel Facebook model. Dimensionality reduction of Facebook data was the core of his model.
3 Rank and basic properties

For a field F, let A ∈ F^{m×n}. Then

rank(A) := dim(range(A)).
For simplicity, F = R throughout the lecture and often m ≥ n.

Lemma. Let A ∈ R^{m×n}. Then
1. rank(A^T) = rank(A);
2. rank(PAQ) = rank(A) for invertible matrices P ∈ R^{m×m}, Q ∈ R^{n×n};
3. rank(AB) ≤ min{rank(A), rank(B)} for any matrix B ∈ R^{n×p};
4. rank([A11, A12; 0, A22]) ≥ rank(A11) + rank(A22) for A11 ∈ R^{m1×n1}, A12 ∈ R^{m1×n2}, A22 ∈ R^{m2×n2}.

Proof: See Linear Algebra 1 / Exercises.
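The first three properties are easy to sanity-check numerically. A minimal sketch with NumPy, using hypothetical random matrices (np.linalg.matrix_rank estimates the rank from an SVD, so this is an illustration, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4)) @ rng.standard_normal((4, 6))  # rank 4 generically
B = rng.standard_normal((6, 3))
P = rng.standard_normal((5, 5))  # invertible with probability 1
Q = rng.standard_normal((6, 6))

assert np.linalg.matrix_rank(A.T) == np.linalg.matrix_rank(A)        # property 1
assert np.linalg.matrix_rank(P @ A @ Q) == np.linalg.matrix_rank(A)  # property 2
assert np.linalg.matrix_rank(A @ B) <= min(np.linalg.matrix_rank(A),
                                           np.linalg.matrix_rank(B)) # property 3
```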
4 Rank and matrix factorizations

Let B = {b1, ..., br} ⊂ R^m with r = rank(A) be a basis of range(A). Then each of the columns of A = [a1, a2, ..., an] can be expressed as a linear combination of B:

a_i = b1 c_{i1} + b2 c_{i2} + ··· + br c_{ir} = [b1, ..., br] [c_{i1}; ...; c_{ir}]

for some coefficients c_{ij} ∈ R with i = 1, ..., n, j = 1, ..., r. Stacking these relations column by column,

[a1, ..., an] = [b1, ..., br] [c_{11} ··· c_{n1}; ...; c_{1r} ··· c_{nr}].
5 Rank and matrix factorizations

Lemma. A matrix A ∈ R^{m×n} of rank r admits a factorization of the form

A = BC^T, B ∈ R^{m×r}, C ∈ R^{n×r}.

We say that A has low rank if rank(A) ≪ m, n.

Illustration of low-rank factorization: storing the factors B and C^T requires mr + nr entries, compared with mn entries for A.
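The storage savings are easy to see in code. A short sketch (sizes are illustrative assumptions):

```python
import numpy as np

# A rank-r matrix stored in factored form A = B @ C.T:
# mn entries for the full matrix versus (m + n) * r for the factors.
m, n, r = 1000, 800, 10
rng = np.random.default_rng(1)
B = rng.standard_normal((m, r))
C = rng.standard_normal((n, r))
A = B @ C.T                    # full matrix: m*n = 800000 entries

print(m * n, (m + n) * r)      # 800000 vs 18000 entries
assert np.linalg.matrix_rank(A) == r
```

Working with the factors directly (e.g. computing A @ x as B @ (C.T @ x)) also reduces the cost of a matrix-vector product from O(mn) to O((m + n)r).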
▶ Generically (and in most applications), A has full rank, that is, rank(A) = min{m, n}.
▶ Aim instead at approximating A by a low-rank matrix.
6 Questions addressed in lecture series
What? Theoretical foundations of low-rank approximation.
When? A priori and a posteriori estimates for low-rank approximation. Situations that allow for low-rank approximation techniques.
Why? Applications in engineering, scientific computing, data analysis, ... where low-rank approximation plays a central role.
How? State-of-the-art algorithms for performing and working with low-rank approximations.
Will cover both matrices and tensors.
7 Literature for Lecture 1
Golub/Van Loan’2013: Golub, Gene H.; Van Loan, Charles F. Matrix Computations. Fourth edition. Johns Hopkins University Press, Baltimore, MD, 2013.
Horn/Johnson’2013: Horn, Roger A.; Johnson, Charles R. Matrix Analysis. Second edition. Cambridge University Press, 2013.
+ References on slides.
8 1. Fundamental tools
▶ SVD
▶ Relation to eigenvalues
▶ Norms
▶ Best low-rank approximation
9 The singular value decomposition

Theorem (SVD). Let A ∈ R^{m×n} with m ≥ n. Then there are orthogonal matrices U ∈ R^{m×m} and V ∈ R^{n×n} such that

A = UΣV^T, with Σ = [diag(σ1, ..., σn); 0] ∈ R^{m×n}

and σ1 ≥ σ2 ≥ ··· ≥ σn ≥ 0.
▶ σ1, ..., σn are called singular values.
▶ u1, ..., un are called left singular vectors.
▶ v1, ..., vn are called right singular vectors.
▶ Av_i = σ_i u_i, A^T u_i = σ_i v_i for i = 1, ..., n.
▶ Singular values are always uniquely defined by A.
▶ Singular vectors are never unique. If σ1 > σ2 > ··· > σn > 0, they are unique up to simultaneous sign flips u_i ← ±u_i, v_i ← ±v_i.
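The defining relations above can be checked directly with NumPy's SVD routine (a small sketch with an arbitrary random matrix):

```python
import numpy as np

# Compute an SVD and verify A v_i = sigma_i u_i and A^T u_i = sigma_i v_i.
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))                  # m >= n
U, s, Vt = np.linalg.svd(A, full_matrices=True)  # U: 6x6, s: 4 values, Vt: 4x4

assert np.all(s[:-1] >= s[1:]) and s[-1] >= 0    # sigma_1 >= ... >= sigma_n >= 0
for i in range(4):
    assert np.allclose(A @ Vt[i], s[i] * U[:, i])      # A v_i = sigma_i u_i
    assert np.allclose(A.T @ U[:, i], s[i] * Vt[i])    # A^T u_i = sigma_i v_i
```

Note that np.linalg.svd returns V^T, so the right singular vectors are the rows of Vt.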
10 SVD: Sketch of proof

Induction over n. The case n = 1 is trivial.

For general n, let v1 solve max{‖Av‖_2 : ‖v‖_2 = 1} =: ‖A‖_2. Set σ1 := ‖A‖_2 and u1 := Av1/σ1. By definition,

Av1 = σ1u1.
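This first induction step can be illustrated numerically: σ1 equals the spectral norm ‖A‖_2, and the maximizer v1 is the first right singular vector. A sketch (random test matrix is an assumption):

```python
import numpy as np

# sigma_1 = ||A||_2 = max{||Av||_2 : ||v||_2 = 1}, attained at v_1.
rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
U, s, Vt = np.linalg.svd(A)
v1 = Vt[0]                                     # maximizer of ||Av||_2

assert np.isclose(np.linalg.norm(A, 2), s[0])  # spectral norm = sigma_1
assert np.isclose(np.linalg.norm(A @ v1), s[0])
u1 = A @ v1 / s[0]
assert np.allclose(A @ v1, s[0] * u1)          # A v_1 = sigma_1 u_1
```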