Mathematical Toolkit — Autumn 2015
Lecture 6: October 19, 2015
Lecturer: Madhur Tulsiani

1 Singular Value Decomposition

Let $V, W$ be finite-dimensional inner product spaces and let $\varphi : V \to W$ be a linear transformation. Since the domain and range of $\varphi$ are different, we cannot analyze it in terms of eigenvectors. However, we can use the spectral theorem to analyze the operators $\varphi^*\varphi : V \to V$ and $\varphi\varphi^* : W \to W$ and use their eigenvectors to derive a nice decomposition of $\varphi$. This is known as the singular value decomposition (SVD) of $\varphi$.

Proposition 1.1 Let $\varphi : V \to W$ be a linear transformation. Then $\varphi^*\varphi : V \to V$ and $\varphi\varphi^* : W \to W$ are positive semidefinite linear operators with the same non-zero eigenvalues.

In fact, we can notice the following from the proof of the above proposition.

Proposition 1.2 Let $v$ be an eigenvector of $\varphi^*\varphi$ with eigenvalue $\lambda \neq 0$. Then $\varphi(v)$ is an eigenvector of $\varphi\varphi^*$ with eigenvalue $\lambda$. Similarly, if $w$ is an eigenvector of $\varphi\varphi^*$ with eigenvalue $\lambda \neq 0$, then $\varphi^*(w)$ is an eigenvector of $\varphi^*\varphi$ with eigenvalue $\lambda$.

Using the above, we get the following.

Proposition 1.3 Let $\sigma_1^2 \geq \sigma_2^2 \geq \cdots \geq \sigma_r^2 > 0$ be the non-zero eigenvalues of $\varphi^*\varphi$, and let $v_1, \ldots, v_r$ be a corresponding orthonormal eigenbasis. For $w_1, \ldots, w_r$ defined as $w_i = \varphi(v_i)/\sigma_i$, we have that

1. $\{w_1, \ldots, w_r\}$ form an orthonormal set.

2. For all $i \in [r]$,
$$\varphi(v_i) = \sigma_i \cdot w_i \quad \text{and} \quad \varphi^*(w_i) = \sigma_i \cdot v_i \,.$$

The values $\sigma_1, \ldots, \sigma_r$ are known as the (non-zero) singular values of $\varphi$. For each $i \in [r]$, the vector $v_i$ is known as the right singular vector and $w_i$ is known as the left singular vector corresponding to the singular value $\sigma_i$.

Proposition 1.4 Let $r$ be the number of non-zero eigenvalues of $\varphi^*\varphi$. Then,
$$\mathrm{rank}(\varphi) = \dim(\mathrm{im}(\varphi)) = r \,.$$

Using the above, we can write $\varphi$ in a particularly convenient form. We first need the following definition.

Definition 1.5 Let $V, W$ be inner product spaces and let $v \in V, w \in W$ be any two vectors.
The outer product of $w$ with $v$, denoted $|w\rangle\langle v|$, is the linear transformation from $V$ to $W$ defined by
$$|w\rangle\langle v| \,(u) := \langle u, v\rangle \cdot w \,.$$
Note that if $\|v\| = 1$, then $|w\rangle\langle v|\,(v) = w$ and $|w\rangle\langle v|\,(u) = 0$ for all $u \perp v$. Also, note that the rank of the linear transformation defined above is one.

We can then write $\varphi : V \to W$ in terms of outer products of its singular vectors.

Proposition 1.6 Let $V, W$ be finite-dimensional inner product spaces and let $\varphi : V \to W$ be a linear transformation with non-zero singular values $\sigma_1, \ldots, \sigma_r$, right singular vectors $v_1, \ldots, v_r$ and left singular vectors $w_1, \ldots, w_r$. Then,
$$\varphi = \sum_{i=1}^{r} \sigma_i \cdot |w_i\rangle\langle v_i| \,.$$

2 Singular Value Decomposition for matrices

Using the previous discussion, we can write matrices in a convenient form. Let $A \in \mathbb{C}^{m \times n}$, which can be thought of as an operator from $\mathbb{C}^n$ to $\mathbb{C}^m$. Let $\sigma_1, \ldots, \sigma_r$ be the non-zero singular values and let $v_1, \ldots, v_r$ and $w_1, \ldots, w_r$ be the right and left singular vectors respectively. Since $V = \mathbb{C}^n$ and $W = \mathbb{C}^m$, for $v \in V, w \in W$ we can write the operator $|w\rangle\langle v|$ as the matrix $wv^*$, where $v^*$ denotes $\bar{v}^T$. This is because for any $u \in V$, $wv^*u = w(v^*u) = \langle u, v\rangle \cdot w$. Thus, we can write
$$A = \sum_{i=1}^{r} \sigma_i \cdot w_i v_i^* \,.$$

Let $W \in \mathbb{C}^{m \times r}$ be the matrix with $w_1, \ldots, w_r$ as columns, such that the $i$-th column equals $w_i$. Similarly, let $V \in \mathbb{C}^{n \times r}$ be the matrix with $v_1, \ldots, v_r$ as columns. Let $\Sigma \in \mathbb{C}^{r \times r}$ be the diagonal matrix with $\Sigma_{ii} = \sigma_i$. Then, check that the above expression for $A$ can also be written as
$$A = W \Sigma V^* \,,$$
where $V^* = \bar{V}^T$ as before.

We can also complete the bases $\{v_1, \ldots, v_r\}$ and $\{w_1, \ldots, w_r\}$ to bases for $\mathbb{C}^n$ and $\mathbb{C}^m$ respectively and write the above in terms of unitary matrices.

Definition 2.1 A matrix $U \in \mathbb{C}^{n \times n}$ is known as a unitary matrix if the columns of $U$ form an orthonormal basis for $\mathbb{C}^n$.

Proposition 2.2 Let $U \in \mathbb{C}^{n \times n}$ be a unitary matrix. Then $UU^* = U^*U = \mathrm{id}$, where $\mathrm{id}$ denotes the identity matrix.
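As a numerical sketch (not part of the lecture), we can check the matrix form $A = W\Sigma V^*$ and the relations $Av_i = \sigma_i w_i$, $A^*w_i = \sigma_i v_i$ using NumPy's SVD routine; the variable names `A`, `W`, `sigma`, `Vh` below are illustrative, and the small random matrix is an arbitrary test case:

```python
import numpy as np

# Arbitrary complex test matrix, thought of as an operator from C^3 to C^5.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))

# full_matrices=False gives the "thin" SVD: W is 5x3, Vh = V* is 3x3,
# sigma holds the singular values in decreasing order.
W, sigma, Vh = np.linalg.svd(A, full_matrices=False)

# Reconstruct A = W Sigma V*.
assert np.allclose(W @ np.diag(sigma) @ Vh, A)

# Check the singular-vector relations of Proposition 1.3:
# A v_i = sigma_i w_i  and  A* w_i = sigma_i v_i.
V = Vh.conj().T
for i in range(len(sigma)):
    assert np.allclose(A @ V[:, i], sigma[i] * W[:, i])
    assert np.allclose(A.conj().T @ W[:, i], sigma[i] * V[:, i])

# Columns of W and V are orthonormal sets: W*W = V*V = I.
assert np.allclose(W.conj().T @ W, np.eye(3))
assert np.allclose(V.conj().T @ V, np.eye(3))
print("thin SVD checks passed")
```

Note that `np.linalg.svd` returns $V^*$ (here `Vh`) rather than $V$, so the right singular vectors are the rows of `Vh`, i.e., the columns of `Vh.conj().T`.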
Let $\{v_1, \ldots, v_n\}$ be a completion of $\{v_1, \ldots, v_r\}$ to an orthonormal basis of $\mathbb{C}^n$, and let $V_n \in \mathbb{C}^{n \times n}$ be the unitary matrix with $v_1, \ldots, v_n$ as columns. Similarly, let $W_m \in \mathbb{C}^{m \times m}$ be a unitary matrix with a completion of $\{w_1, \ldots, w_r\}$ as columns. Let $\Sigma' \in \mathbb{C}^{m \times n}$ be the matrix with $\Sigma'_{ii} = \sigma_i$ if $i \leq r$, and all other entries equal to zero. Then, we can also write
$$A = W_m \Sigma' V_n^* \,.$$

2.1 SVD as a low-rank approximation for matrices

Given a matrix $A \in \mathbb{C}^{m \times n}$, we want to find a matrix $B$ of rank at most $k$ which "approximates" $A$. For now, we will consider the notion of approximation in the spectral norm, i.e., we want to minimize $\|A - B\|_2$, where
$$\|A - B\|_2 = \sup_{x \neq 0} \frac{\|(A - B)x\|_2}{\|x\|_2} \,.$$
SVD also gives the optimal solution for another notion of approximation: minimizing the Frobenius norm $\|A - B\|_F$, which equals $\left( \sum_{ij} |A_{ij} - B_{ij}|^2 \right)^{1/2}$. We will see this later.

Let $A = \sum_{i=1}^{r} \sigma_i w_i v_i^*$ be the singular value decomposition of $A$, with $\sigma_1 \geq \cdots \geq \sigma_r > 0$. If $k \geq r$, we can simply use $B = A$ since $\mathrm{rank}(A) = r$. If $k < r$, we claim that $A_k = \sum_{i=1}^{k} \sigma_i w_i v_i^*$ is the optimal solution. It is easy to check the following.

Proposition 2.3 $\|A - A_k\|_2 = \sigma_{k+1}$.

We also show the following.

Proposition 2.4 Let $B \in \mathbb{C}^{m \times n}$ have $\mathrm{rank}(B) \leq k$ and let $k < r$. Then $\|A - B\|_2 \geq \sigma_{k+1}$.

Proof: By the rank-nullity theorem, $\dim(\ker(B)) \geq n - k$ and thus $\dim\left(\ker(B) \cap \mathrm{Span}(v_1, \ldots, v_{k+1})\right) \geq 1$. Let $z \in \ker(B) \cap \mathrm{Span}(v_1, \ldots, v_{k+1}) \setminus \{0\}$, so that $Bz = 0$. Then,
$$\|(A - B)z\|_2^2 = \|Az\|_2^2 = \langle A^*Az, z\rangle \geq \min_{z' \in \mathrm{Span}(v_1, \ldots, v_{k+1}) \setminus \{0\}} R_{A^*A}(z') \cdot \|z\|_2^2 \geq \sigma_{k+1}^2 \cdot \|z\|_2^2 \,,$$
where $R_{A^*A}(z') = \langle A^*Az', z'\rangle / \|z'\|_2^2$ denotes the Rayleigh quotient of $A^*A$ at $z'$. Thus, there exists a $z \neq 0$ such that $\|(A - B)z\|_2 \geq \sigma_{k+1} \cdot \|z\|_2$, which implies $\|A - B\|_2 \geq \sigma_{k+1}$.
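The truncation in Proposition 2.3 can be checked numerically. The following sketch (not from the lecture; the matrix and the choice $k = 2$ are arbitrary) builds $A_k = \sum_{i \leq k} \sigma_i w_i v_i^*$ from the top $k$ singular triples and verifies that the spectral-norm error equals $\sigma_{k+1}$:

```python
import numpy as np

# Arbitrary real test matrix with full column rank (almost surely).
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

W, sigma, Vh = np.linalg.svd(A, full_matrices=False)

k = 2
# Best rank-k approximation: keep only the k largest singular values,
# i.e. A_k = sum_{i<=k} sigma_i w_i v_i^*.
A_k = W[:, :k] @ np.diag(sigma[:k]) @ Vh[:k, :]

# A_k indeed has rank k.
assert np.linalg.matrix_rank(A_k) == k

# Proposition 2.3: ||A - A_k||_2 = sigma_{k+1}
# (ord=2 is the spectral norm; sigma[k] is the (k+1)-st singular value,
# since Python indexing is zero-based).
assert np.isclose(np.linalg.norm(A - A_k, ord=2), sigma[k])
print("rank-k truncation checks passed")
```

This matches Proposition 2.4: no rank-$k$ matrix can do better than spectral-norm error $\sigma_{k+1}$, and the SVD truncation attains it.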