(VII.E) The Singular Value Decomposition (SVD)

In this section we describe a generalization of the Spectral Theorem to non-normal operators, and even to transformations between different vector spaces. This is a computationally useful generalization, with applications to data fitting and data compression in particular. It is used in many branches of science, in statistics, and even in machine learning. As we develop the underlying theory, we will encounter two closely associated constructions: the polar decomposition and the pseudoinverse, in each case presenting the version for operators followed by that for matrices.

To begin, let $(V,\langle\cdot,\cdot\rangle)$ be an $n$-dimensional inner-product space and $T\colon V\to V$ an operator. By Spectral Theorem III, if $T$ is normal then it has a unitary eigenbasis $\mathcal B$, with complex eigenvalues $\sigma_1,\dots,\sigma_n$. In particular, $T$ is unitary iff all $|\sigma_i|=1$ (since preserving the lengths of a unitary eigenbasis is equivalent to preserving the lengths of all vectors), and self-adjoint iff the $\sigma_i$ are real. So $T$ is both unitary and self-adjoint iff $T$ has only $+1$ and $-1$ as eigenvalues, which is to say $T$ is an orthogonal reflection, acting on $V=W\oplus W^{\perp}$ by $\mathrm{Id}_W$ on $W$ and $-\mathrm{Id}_{W^{\perp}}$ on $W^{\perp}$. But the broader point here is that we should have in mind the following "table of analogies":

  operators        numbers
  normal           $\mathbb{C}$
  unitary          $S^1$
  self-adjoint     $\mathbb{R}$
  ??               $\mathbb{R}_{\ge 0}$

where $S^1\subset\mathbb{C}$ is the unit circle. So, how do we complete it?


DEFINITION 1. $T$ is positive (resp. strictly positive) if it is normal, with all eigenvalues in $\mathbb{R}_{\ge 0}$ (resp. $\mathbb{R}_{>0}$).

Since they are unitarily diagonalizable with real eigenvalues, positive operators are self-adjoint. Given a self-adjoint (or normal) operator $T$, any $\vec v\in V$ is a sum of orthogonal eigenvectors $\vec w_j$ with eigenvalues $\sigma_j$; so
$$\langle\vec v,T\vec v\rangle=\sum_{i,j}\langle\vec w_i,T\vec w_j\rangle=\sum_{i,j}\sigma_j\langle\vec w_i,\vec w_j\rangle=\sum_j\sigma_j\|\vec w_j\|^2$$
is in $\mathbb{R}_{\ge 0}$ for all $\vec v$ if and only if $T$ is positive. If $V$ is complex, the condition that $\langle\vec v,T\vec v\rangle\ge 0$ $(\forall\vec v)$ on its own gives self-adjointness (Exercise VII.D.7), hence positivity, of $T$. (If $V$ is real, this condition alone isn't enough.)
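To make the criterion concrete, here is a small numerical sketch (the matrix and the use of numpy are our own illustration, not part of the text): a self-adjoint matrix is positive exactly when its eigenvalues, as returned by `np.linalg.eigvalsh`, are all nonnegative.

```python
import numpy as np

# Hedged illustration (assumed example matrix): for a self-adjoint matrix,
# <v, Av> >= 0 for all v exactly when every eigenvalue is >= 0.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # self-adjoint, eigenvalues 1 and 3

eigs = np.linalg.eigvalsh(A)        # real eigenvalues of a Hermitian matrix
print(eigs)                         # [1. 3.]  -> A is (strictly) positive

rng = np.random.default_rng(0)
v = rng.standard_normal(2)
print(v @ A @ v >= 0)               # True for every choice of v
```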

EXAMPLE 2. Orthogonal projections are positive, since they are self-adjoint with eigenvalues 0 and 1.

Polar decomposition. A nonzero complex number can be written (in "polar form") as a product $re^{i\theta}$, for unique $r\in\mathbb{R}_{>0}$ and $e^{i\theta}\in S^1$. If $T$ is a normal operator, then it has a unitary eigenbasis $\mathcal B=\{\hat v_1,\dots,\hat v_n\}$, with $[T]_{\mathcal B}=\mathrm{diag}\{r_1e^{i\theta_1},\dots,r_ne^{i\theta_n}\}$ ($r_i\in\mathbb{R}_{\ge 0}$); and defining $|T|$, $U$ by $[|T|]_{\mathcal B}=\mathrm{diag}\{r_1,\dots,r_n\}$ and $[U]_{\mathcal B}=\mathrm{diag}\{e^{i\theta_1},\dots,e^{i\theta_n}\}$, we have $T=U|T|$ with $U$ unitary and $|T|$ positive, and $U|T|=|T|U$. One might expect such "polar decompositions" of operators to stop there. But there exists an analogue for arbitrary transformations, one that will lead on to the SVD as well!

To formulate this general polar decomposition, let $T\colon V\to V$ be an arbitrary linear transformation. Notice that $T^\dagger T$ is self-adjoint¹ and, in fact, positive:

$$\langle\vec v,T^\dagger T\vec v\rangle=\langle T\vec v,T\vec v\rangle\ge 0,$$

since $\langle\cdot,\cdot\rangle$ is an inner product. So we have a unitary basis $\mathcal B$ with $[T^\dagger T]_{\mathcal B}=\mathrm{diag}\{\lambda_1,\dots,\lambda_n\}$, and all $\lambda_i\in\mathbb{R}_{\ge 0}$. (Henceforth we impose the convention that $\lambda_1\ge\lambda_2\ge\dots\ge\lambda_n$.) Taking nonnegative square roots $\mu_i=\sqrt{\lambda_i}\in\mathbb{R}_{\ge 0}$, we define a new operator $|T|$ by $[|T|]_{\mathcal B}=\mathrm{diag}\{\mu_1,\dots,\mu_n\}$; clearly $|T|^2=T^\dagger T$, and $|T|$ is positive.

¹Trivially so: $(T^\dagger T)^\dagger=T^\dagger T^{\dagger\dagger}=T^\dagger T$.

THEOREM 3. There exists a unitary $U$ such that

T = U|T|.

This "polar decomposition" is unique exactly when $T$ is invertible. Moreover, $U$ and $|T|$ commute exactly when $T$ is normal.

PROOF. We have $\|T\vec v\|^2=\langle T\vec v,T\vec v\rangle=\langle\vec v,T^\dagger T\vec v\rangle=\langle\vec v,|T|^2\vec v\rangle=\langle|T|\vec v,|T|\vec v\rangle=\||T|\vec v\|^2$, since $|T|$ is self-adjoint. Consequently $T$ and $|T|$ have the same kernel, hence (by Rank + Nullity) the same rank; so while their images (as subspaces of $V$) may differ, they have the same dimension.

By Spectral Theorem I, $V=\mathrm{im}|T|\overset{\perp}{\oplus}\ker|T|$. So $\dim(\ker|T|)=\dim((\mathrm{im}\,T)^\perp)$, and we fix an invertible transformation $U_0\colon\ker|T|\to(\mathrm{im}\,T)^\perp$ sending some choice of unitary basis to another unitary basis. Writing $\vec v=|T|\vec w+\vec v_0$ (with $|T|\vec v_0=\vec 0$), we now define
$$U\colon\underbrace{\mathrm{im}|T|\overset{\perp}{\oplus}\ker|T|}_{V}\;\longrightarrow\;\underbrace{\mathrm{im}\,T\overset{\perp}{\oplus}(\mathrm{im}\,T)^\perp}_{V}$$
by $U\vec v:=T\vec w+U_0\vec v_0$. Since
$$\|T\vec w+U_0\vec v_0\|^2=\|T\vec w\|^2+\|U_0\vec v_0\|^2=\||T|\vec w\|^2+\|\vec v_0\|^2=\|\vec v\|^2,$$
$U$ is unitary; and taking $\vec v=|T|\vec w$ gives $U|T|\vec w=T\vec w$ $(\forall\vec w\in V)$.

Now if $T$ is invertible, then so is $|T|$ (all $\mu_i>0$) $\implies U=T|T|^{-1}$. If it isn't, then $\ker|T|\ne\{0\}$ and non-uniqueness enters in the choice of $U_0$.

Finally, if $U$ and $|T|$ commute, then they simultaneously unitarily diagonalize, and therefore so does $T$, making $T$ normal. If $T$ is normal, we have constructed $U$ and $|T|$ above, and they commute. $\square$

EXAMPLE 4. Consider $V=\mathbb{R}^4$ with the dot product, and

 2 0 0 0     0 1 0 0  A = [T]eˆ =   .  0 1 1 0  0 0 0 0 Since A has a Jordan block, we know it is not diagonalizable; and so T is not normal. However,  4 0 0 0    † t  0 2 1 0  −1 [T T]eˆ = AA =   = SBDSB  0 1 1 0  0 0 0 0 with B := {eˆ1, eˆ2 +√ϕ−eˆ3, eˆ2 − ϕ+eˆ3, eˆ√4} and D := diag{4, η+, η−, 0} ±1+ 5 3± 5 2 (where ϕ± := 2 and η± := 2 = ϕ±). Taking the positive 1 square root D 2 := diag{2, ϕ+, ϕ−, 0}, we get   2 0 0 0  0 √3 √1 0  1 −1  5 5  [|T|]eˆ = |A| := SBD 2 S =   . B  0 √1 √2 0   5 5  0 0 0 0

On $W:=\mathrm{span}\{\hat e_1,\hat e_2,\hat e_3\}=\mathrm{im}\,T=\mathrm{im}|T|$, we must have $U|_W=T|_W\,(|T|\,|_W)^{-1}$; while on $\mathrm{span}\{\hat e_4\}=W^\perp$, $U$ can be taken to act by any scalar of absolute value $1$. Therefore
$$[U]_{\hat e}=Q:=\begin{pmatrix}1&0&0&0\\[1mm]0&\tfrac{2}{\sqrt5}&\tfrac{-1}{\sqrt5}&0\\[1mm]0&\tfrac{1}{\sqrt5}&\tfrac{2}{\sqrt5}&0\\[1mm]0&0&0&e^{i\theta}\end{pmatrix},$$
and $A=Q|A|$. Geometrically, $|A|$ dilates the $\mathcal B$-coordinates of a vector, then $Q$ performs a rotation.
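As a hedged numerical aside (not part of the original example), one can recover this polar decomposition from the SVD: if $A=W\Sigma V^t$, then $|A|=V\Sigma V^t$ and one may take $U=WV^t$. A minimal numpy sketch:

```python
import numpy as np

# Sketch (assumed numerics): polar decomposition A = U|A| of Example 4
# obtained from the SVD A = W S V^t, with |A| = V S V^t and U = W V^t.
A = np.array([[2., 0., 0., 0.],
              [0., 1., 0., 0.],
              [0., 1., 1., 0.],
              [0., 0., 0., 0.]])

W, s, Vt = np.linalg.svd(A)
absA = Vt.T @ np.diag(s) @ Vt       # positive square root of A^t A
U = W @ Vt                          # orthogonal; this choice fixes the non-uniqueness

print(np.allclose(U @ absA, A))             # True
print(np.allclose(U @ U.T, np.eye(4)))      # True: U is orthogonal
print(np.round(absA, 3))                    # matches [|T|]_e computed above
```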

SVD for endomorphisms.² Now let $|T|$ be the (unique) positive square root of $T^\dagger T$ as above, and $\mathcal B=\{\hat v_1,\dots,\hat v_n\}$ the unitary basis of $V$ under which $[|T|]_{\mathcal B}=\mathrm{diag}\{\mu_1,\dots,\mu_n\}$.

DEFINITION 5. These {µi} are called the singular values of T.

Writing $T=U|T|$ and $\hat w_i:=U\hat v_i$, we have $U^\dagger U=\mathrm{Id}_V\implies\langle\hat w_i,\hat w_j\rangle=\langle U^\dagger U\hat v_i,\hat v_j\rangle=\langle\hat v_i,\hat v_j\rangle=\delta_{ij}\implies\mathcal C:=\{\hat w_1,\dots,\hat w_n\}=U(\mathcal B)$ is unitary. For $\vec v\in V$, we have
$$T\vec v=U|T|\sum_{i=1}^n\langle\hat v_i,\vec v\rangle\hat v_i=\sum_{i=1}^n\langle\hat v_i,\vec v\rangle U|T|\hat v_i$$

(1) $\qquad\displaystyle =\sum_{i=1}^n\mu_i\langle\hat v_i,\vec v\rangle U\hat v_i=\sum_{i=1}^n\mu_i\langle\hat v_i,\vec v\rangle\hat w_i.$

Viewing $\langle\hat v_i,\cdot\rangle=:\ell_{\hat v_i}\colon V\to\mathbb{C}$ as a linear functional, and $\hat w_i\colon\mathbb{C}\to V$ as a map (sending $\alpha\in\mathbb{C}$ to $\alpha\hat w_i$), (1) becomes

(2) $\qquad\displaystyle T=\sum_{i=1}^n\mu_i\,\hat w_i\circ\ell_{\hat v_i}.$

How does this look in matrix terms? Let $\mathcal A$ be an arbitrary³ unitary basis of $V$, and set $A:=[T]_{\mathcal A}$. Using $\{1\}$ as a basis of $\mathbb{C}$, (2) yields

(3) $\qquad\displaystyle A=\sum_{i=1}^n\mu_i\,{}_{\mathcal A}[\hat w_i]_{\{1\}}\;{}_{\{1\}}[\ell_{\hat v_i}]_{\mathcal A}=\sum_{i=1}^n\mu_i\,[\hat w_i]_{\mathcal A}\,[\hat v_i]_{\mathcal A}^*.$

Here $[\hat w_i]_{\mathcal A}[\hat v_i]_{\mathcal A}^*$ is an $(n\times1)\cdot(1\times n)$ matrix product, yielding an $n\times n$ matrix of rank one. So (2) (resp. (3)) decompose $T$ (resp. $A$) into a sum of rank-one endomorphisms (resp. matrices) weighted by the singular values. This is a first version of the Singular Value Decomposition.

²The reader may of course replace "unitary" and $\mathbb{C}$ by "orthonormal"/"orthogonal" and $\mathbb{R}$ in what follows.
³What you should actually have in mind here is, in the case where $V=\mathbb{C}^n$ with dot product, the standard basis $\hat e$.
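Here is a quick numerical sanity check of (3) (a sketch with an arbitrary made-up matrix, not from the text): numpy's `svd` returns exactly the $\mu_i$, the $\hat w_i$ (columns of the first factor), and the $\hat v_i^*$ (rows of the last factor) needed for the rank-one sum.

```python
import numpy as np

# Sketch (assumed example matrix): A equals the sum of rank-one matrices
# mu_i * w_i * v_i^*, as in formula (3).
A = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])

W, mu, Vt = np.linalg.svd(A)                 # columns of W ~ w_i, rows of Vt ~ v_i^*
rank_one_sum = sum(mu[i] * np.outer(W[:, i], Vt[i]) for i in range(len(mu)))
print(np.allclose(rank_one_sum, A))          # True
```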

REMARK 6. If $T$ is positive, we can take $U=\mathrm{Id}_V$, so that $\hat w_i\circ\ell_{\hat v_i}=\hat v_i\circ\ell_{\hat v_i}=:\mathrm{Pr}_{\hat v_i}$ is the orthogonal projection onto $\mathrm{span}\{\hat v_i\}$. Thus (2) merely restates Spectral Theorem I in this case. In fact, if $T$ is normal, then $T$, $T^\dagger T$, and $|T|$ simultaneously diagonalize. If $\tilde\mu_1,\dots,\tilde\mu_n$ are the eigenvalues of $T$, then we evidently have $\mu_i=|\tilde\mu_i|$; $U$ may then be defined by $[U]_{\mathcal B}=\mathrm{diag}\{\tilde\mu_i/\mu_i\}$ (with $0/0$ interpreted as $1$). Thus (2) reads $T=\sum_{i=1}^n\tilde\mu_i\mathrm{Pr}_{\hat v_i}$, which restates Spectral Theorem III. So in this sense, the SVD generalizes the results in §VII.D.

Now since U sends B to C, we have

$${}_{\mathcal C}[T]_{\mathcal B}=\underbrace{{}_{\mathcal C}[U]_{\mathcal B}}_{I_n}\,[|T|]_{\mathcal B}=\mathrm{diag}\{\mu_1,\dots,\mu_n\}=:D$$
and thus

(4) $\qquad A=[T]_{\mathcal A}={}_{\mathcal A}[I]_{\mathcal C}\;{}_{\mathcal C}[T]_{\mathcal B}\;{}_{\mathcal B}[I]_{\mathcal A}=:S_1DS_2,$

with $S_1$ and $S_2$ unitary matrices (as they change between unitary bases). Roughly speaking, (4) presents $A$ as a composition of the form "rotation,⁴ dilation, rotation".

THEOREM 7. Any $A\in M_n(\mathbb{C})$ can be presented as a product $S_1DS_2$, where $D$ is a diagonal matrix with entries in $\mathbb{R}_{\ge 0}$, and $S_iS_i^*=I_n$. The diagonal entries of $D$, called the singular values of $A$, are unique; and if these are all distinct, nonzero, and in decreasing order, then $(S_1,S_2)$ are unique modulo rescaling $(S_1\Delta,\Delta^{-1}S_2)$ by a unitary diagonal matrix $\Delta$.⁵

PROOF. Existence was deduced above. We have $D^*=D$, hence
$$A^*A=S_2^*D^*S_1^*S_1DS_2=S_2^*D^2S_2,$$
which is a unitary⁶ diagonalization of the positive matrix $A^*A$. So $S_2$, hence $D$, is unique; and if the diagonal entries are distinct, then the unitary eigenbasis ($\implies$ columns of $S_2^*$) is determined up to $e^{i\theta}$ scaling factors. If no singular values are $0$, $D$ is invertible and $S_1=AS_2^*D^{-1}$. $\square$

⁴More precisely, "rotation" here can mean a complicated sequence of rotations and reflections.
⁵If $A\in M_n(\mathbb{R})$, then the above construction shows that $S_1$ and $S_2$ can be taken to be real (orthogonal); and such $S_i$ are unique modulo rescaling by $\Delta$ real diagonal unitary, i.e. with diagonal entries $\pm1$.
⁶In the dot product.
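A small numerical illustration of the uniqueness statement (assumed example matrix, not from the text): the singular values produced by `np.linalg.svd` coincide with the square roots of the eigenvalues of $A^*A$.

```python
import numpy as np

# Sketch: the diagonal of D in A = S1 D S2 is determined by A, since the
# squares of the singular values are the eigenvalues of A* A.
A = np.array([[0., 2.],
              [1., 1.]])

s = np.linalg.svd(A, compute_uv=False)               # singular values, decreasing
lam = np.linalg.eigvalsh(A.conj().T @ A)[::-1]       # eigenvalues of A*A, decreasing
print(np.allclose(s, np.sqrt(lam)))                  # True
```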

EXAMPLE 8. Let's look at the operator
$$T=\frac{d}{dt}\colon P_2(\mathbb{R})\to P_2(\mathbb{R})$$
with $\langle f,g\rangle:=\int_{-1}^1 f(t)g(t)\,dt$. This is a non-normal, nilpotent transformation. We compute its singular values – which are not all zero, even though⁷ $0$ is the only eigenvalue of $T$ – and the bases $\mathcal B$ and $\mathcal C$. To compute $T^\dagger$ we will need an o.n. basis $\mathcal A$: Gram-Schmidt on $\{1,t,t^2\}$ yields $\mathcal A:=\left\{\tfrac{1}{\sqrt2},\ \sqrt{\tfrac32}\,t,\ \tfrac32\sqrt{\tfrac52}\,(t^2-\tfrac13)\right\}$ and
$$[T]_{\mathcal A}=\begin{pmatrix}0&\sqrt3&0\\0&0&\sqrt{15}\\0&0&0\end{pmatrix}\implies[T^\dagger]_{\mathcal A}=[T]_{\mathcal A}^*=\begin{pmatrix}0&0&0\\\sqrt3&0&0\\0&\sqrt{15}&0\end{pmatrix}$$
$$\implies[T^\dagger T]_{\mathcal A}=\begin{pmatrix}0&0&0\\0&3&0\\0&0&15\end{pmatrix}\implies[|T|]_{\mathcal A}=\begin{pmatrix}0&0&0\\0&\sqrt3&0\\0&0&\sqrt{15}\end{pmatrix}$$
$$\implies\text{singular values are }\sqrt{15},\ \sqrt3,\ 0,\quad\text{and}\quad\mathcal B=\left\{\tfrac32\sqrt{\tfrac52}\,(t^2-\tfrac13),\ \sqrt{\tfrac32}\,t,\ \tfrac{1}{\sqrt2}\right\}=:\{\hat v_1,\hat v_2,\hat v_3\}.$$

Now $\mathrm{im}(|T|)=\mathrm{span}\{\hat v_1,\hat v_2\}$, $\ker(|T|)=\mathrm{span}\{\hat v_3\}=\ker(T)$, $\mathrm{im}(T)=\mathrm{span}\{\hat v_2,\hat v_3\}$, and $\mathrm{im}(T)^\perp=\mathrm{span}\{\hat v_1\}$. So we define
$$U\colon\mathrm{span}\{\hat v_1,\hat v_2\}\overset{\perp}{\oplus}\mathrm{span}\{\hat v_3\}\;\longrightarrow\;\mathrm{span}\{\hat v_2,\hat v_3\}\overset{\perp}{\oplus}\mathrm{span}\{\hat v_1\}$$
by sending $\hat v_1=|T|(\tfrac{1}{\sqrt{15}}\hat v_1)\mapsto T(\tfrac{1}{\sqrt{15}}\hat v_1)=\hat v_2$, $\hat v_2=|T|(\tfrac{1}{\sqrt3}\hat v_2)\mapsto T(\tfrac{1}{\sqrt3}\hat v_2)=\hat v_3$, and $\hat v_3\mapsto\hat v_1$. We then have
$$[U]_{\mathcal A}=\begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix},$$
and $\mathcal C=U(\mathcal B)=\{\hat v_2,\hat v_3,\hat v_1\}$.⁸ The SVD of $A=[T]_{\mathcal A}$ now reads (noting that $\mathcal A=\{\hat v_3,\hat v_2,\hat v_1\}$)
$$A=S_1DS_2=\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}\begin{pmatrix}\sqrt{15}&0&0\\0&\sqrt3&0\\0&0&0\end{pmatrix}\begin{pmatrix}0&0&1\\0&1&0\\1&0&0\end{pmatrix}.$$

⁷Caution: if $T$ is diagonalizable, it may not be unitarily diagonalizable, and it's only in the latter case that $T$'s singular values are the absolute values of $T$'s eigenvalues.
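As a check (our own numerics, using the matrix $[T]_{\mathcal A}$ computed above), the singular values $\sqrt{15},\sqrt3,0$ can be confirmed with numpy:

```python
import numpy as np

# Numerical check of Example 8 (a sketch): the singular values of [T]_A
# should be sqrt(15), sqrt(3), 0.
T_A = np.array([[0., np.sqrt(3.), 0.],
                [0., 0., np.sqrt(15.)],
                [0., 0., 0.]])

print(np.round(np.linalg.svd(T_A, compute_uv=False), 6))
# [3.872983 1.732051 0.      ]  i.e.  sqrt(15), sqrt(3), 0
```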

We now turn to the more general form of the Singular Value Decomposition.

SVD for transformations. Let $T\colon V\to W$ be an arbitrary linear transformation between finite-dimensional inner-product spaces, and fix unitary bases $\mathcal A$ resp. $\mathcal A'$ of $V$ resp. $W$. Write $n=\dim V$, $m=\dim W$.

LEMMA 9. The transformation $T^\dagger\colon W\to V$ defined by ${}_{\mathcal A}[T^\dagger]_{\mathcal A'}=({}_{\mathcal A'}[T]_{\mathcal A})^*$ is the unique operator satisfying

(5) $\qquad\langle T^\dagger\vec w,\vec v\rangle=\langle\vec w,T\vec v\rangle$

for all $\vec v\in V$, $\vec w\in W$. We call it the adjoint of $T$.

PROOF. Since $\mathcal A,\mathcal A'$ are unitary, (5) is equivalent to
$$[T^\dagger\vec w]_{\mathcal A}^*\,[\vec v]_{\mathcal A}=[\vec w]_{\mathcal A'}^*\,[T\vec v]_{\mathcal A'}$$
and thus to
$$[\vec w]_{\mathcal A'}^*\left({}_{\mathcal A}[T^\dagger]_{\mathcal A'}\right)^*[\vec v]_{\mathcal A}=[\vec w]_{\mathcal A'}^*\;{}_{\mathcal A'}[T]_{\mathcal A}\,[\vec v]_{\mathcal A}. \qquad\square$$



⁸Of course, even over $\mathbb{R}$ it isn't quite unique: one can play with the signs in $S_1$ and $S_2$.

LEMMA 10. We have $V=\ker(T)\overset{\perp}{\oplus}\mathrm{im}(T^\dagger)$ and $W=\mathrm{im}(T)\overset{\perp}{\oplus}\ker(T^\dagger)$, with $T$ (resp. $T^\dagger$) restricting to an isomorphism from $\mathrm{im}(T^\dagger)\to\mathrm{im}(T)$ (resp. $\mathrm{im}(T)\to\mathrm{im}(T^\dagger)$).

PROOF. By Lemma 9, $T$ and $T^\dagger$ have the same rank $r$, so Rank + Nullity $\implies$
$$\begin{cases}\dim(\ker T)+\dim(\mathrm{im}\,T^\dagger)=(n-r)+r=n=\dim V\\ \dim(\mathrm{im}\,T)+\dim(\ker T^\dagger)=r+(m-r)=m=\dim W.\end{cases}$$

For $\vec v\in\ker(T)$, $\langle\vec v,T^\dagger\vec w\rangle=\langle T\vec v,\vec w\rangle=\langle\vec 0,\vec w\rangle=0\implies\ker(T)\perp\mathrm{im}(T^\dagger)\implies\ker(T)\cap\mathrm{im}(T^\dagger)=\{0\}\implies T|_{\mathrm{im}(T^\dagger)}$ is injective. This proves the assertions about $V$ and $T$; those for $W$ and $T^\dagger$ follow by symmetry. $\square$

The compositions $T^\dagger T\colon V\to V$ and $TT^\dagger\colon W\to W$ are clearly positive (argue as above), and preserve the $\overset{\perp}{\oplus}$'s in Lemma 10. So $T^\dagger T$ (resp. $TT^\dagger$) is an automorphism of $\mathrm{im}(T^\dagger)$ (resp. $\mathrm{im}(T)$) "$\overset{\perp}{\oplus}$" the zero map on $\ker(T)$ (resp. $\ker(T^\dagger)$). The same remarks apply to the positive operator $\sqrt{T^\dagger T}$ (resp. $\sqrt{TT^\dagger}$), and there exists a unitary basis $\mathcal B=\{\hat v_1,\dots,\hat v_n\}$ of $V$ with $\left[\sqrt{T^\dagger T}\right]_{\mathcal B}=\mathrm{diag}\{\mu_1,\dots,\mu_r,0,\dots,0\}$ and $\mu_1\ge\mu_2\ge\dots\ge\mu_r>0$.

DEFINITION 11. These $\{\mu_i\}$ are called the (nonzero) singular values of $T$.

In particular, $\{\hat v_1,\dots,\hat v_r\}\subset\mathrm{im}(T^\dagger)$ and $\{\hat v_{r+1},\dots,\hat v_n\}\subset\ker(T)$; and we define $\{\hat w_1,\dots,\hat w_r\}\subset\mathrm{im}(T)$ by $\hat w_i:=\frac{1}{\mu_i}T\hat v_i$. Since $\langle\hat w_i,\hat w_j\rangle=\frac{1}{\mu_i\mu_j}\langle T\hat v_i,T\hat v_j\rangle=\frac{\mu_i^2}{\mu_i\mu_j}\langle\hat v_i,\hat v_j\rangle=\frac{\mu_i}{\mu_j}\delta_{ij}=\delta_{ij}$, this gives a unitary basis of $\mathrm{im}(T)$, which we complete to a unitary basis $\mathcal C\subset W$ by choosing $\{\hat w_{r+1},\dots,\hat w_m\}\subset\ker(T^\dagger)$. This proves

THEOREM 12 (Abstract SVD). There exist unitary bases $\mathcal B\subset V$, $\mathcal C\subset W$ such that⁹

(6) $\qquad{}_{\mathcal C}[T]_{\mathcal B}=\mathrm{diag}_{m\times n}\{\mu_1,\dots,\mu_r,0,\dots,0\}$

with $\mu_1\ge\dots\ge\mu_r>0$, and $r=\mathrm{rank}(T)$.

REMARK 13. (i) $\mathcal C$ is an eigenbasis for $\sqrt{TT^\dagger}$, and its nonzero eigenvalues are also the $\{\mu_i\}_{i=1}^r$: indeed, $TT^\dagger\hat w_i=\frac{1}{\mu_i}TT^\dagger T\hat v_i=\frac{1}{\mu_i}T(\mu_i^2\hat v_i)=\mu_i^2\hat w_i$.

(ii) If $\mathcal B'\subset V$, $\mathcal C'\subset W$ are unitary bases such that ${}_{\mathcal C'}[T]_{\mathcal B'}=\mathrm{diag}_{m\times n}\{\mu_1',\dots,\mu_s',0,\dots,0\}$ in decreasing order, then $s=r$ and $\mu_i'=\mu_i$ $(\forall i)$. This is because ${}_{\mathcal B'}[T^\dagger T]_{\mathcal B'}={}_{\mathcal B'}[T^\dagger]_{\mathcal C'}\;{}_{\mathcal C'}[T]_{\mathcal B'}=({}_{\mathcal C'}[T]_{\mathcal B'})^*\;{}_{\mathcal C'}[T]_{\mathcal B'}=\mathrm{diag}_{n\times n}\{(\mu_1')^2,\dots,(\mu_r')^2,0,\dots,0\}$, so that the $\{\mu_i'\}$ are the eigenvalues of $\sqrt{T^\dagger T}$. So the (nonzero) singular values are the unique positive numbers that can appear in (6).

(iii) Moreover, if the $\mu_i$ are distinct and $T$ is injective (resp. surjective), the elements of $\mathcal B'$ (resp. $\mathcal C'$) are $e^{i\theta}$-multiples of the elements of $\mathcal B$ (resp. $\mathcal C$).

DEFINITION 14. The pseudoinverse of T is the transformation T∼ : W → V given by 0 on ker(T†) and inverting T on im(T).

COROLLARY 15. ${}_{\mathcal B}[T^\sim]_{\mathcal C}=\mathrm{diag}_{n\times m}\{\mu_1^{-1},\dots,\mu_r^{-1},0,\dots,0\}$. Note that if $T$ is invertible, then $T^\sim=T^{-1}$.

COROLLARY 16. $T^\sim T\colon V\to V$ is the orthogonal projection $\mathrm{Pr}_{\mathrm{im}(T^\dagger)}$, and $TT^\sim\colon W\to W$ is $\mathrm{Pr}_{\mathrm{im}(T)}$.

SVD for $m\times n$ matrices. The matrix version of Theorem 12 is one of the most important results in linear algebra. Its efficient computer implementation for large matrices has been the subject of countless articles in numerical analysis. While we won't say anything about these algorithms, they are more efficient (and accurate) than the orthogonal diagonalization of $A^*A$, which remains our algorithm of choice here.

⁹The notation means an $m\times n$ matrix, whose entries are zero except for the $(j,j)$th entries for $1\le j\le r\ (\le m,n)$.

THEOREM 17 (Matrix SVD). Let $A$ be an arbitrary $m\times n$ matrix of rank $r$, with complex entries. Then there exist unitary matrices $P\in M_m(\mathbb{C})$, $Q\in M_n(\mathbb{C})$ and real numbers $\mu_1\ge\dots\ge\mu_r>0$, such that

(7) $\qquad A=P\Delta Q^*=(\hat p_1\cdots\hat p_m)\;\mathrm{diag}_{m\times n}\{\mu_1,\dots,\mu_r,0,\dots,0\}\;(\hat q_1\cdots\hat q_n)^*$

(and $A^*=Q\,{}^t\Delta\,P^*$). The map from $\mathbb{C}^n\to\mathbb{C}^m$ induced by $A$ decomposes as an orthogonal direct sum under the dot product
$$\underset{\mathrm{span}\{\hat q_1,\dots,\hat q_r\}}{\underbrace{V_{\mathrm{col}}(A^*)}}\overset{\perp}{\oplus}\{\vec v\mid A\vec v=\vec 0\}\;\longrightarrow\;\underset{\mathrm{span}\{\hat p_1,\dots,\hat p_r\}}{\underbrace{V_{\mathrm{col}}(A)}}\overset{\perp}{\oplus}\{\vec w\mid A^*\vec w=\vec 0\}$$
given by $\hat q_j\overset{A}{\longmapsto}\mu_j\hat p_j$ $(j=1,\dots,r)$ on the first factor and zero on the second factor. (The reverse map $A^*$ sends $\hat p_j\mapsto\mu_j\hat q_j$.) The nonzero singular values $\mu_j$ are the positive square roots of the nonzero eigenvalues of $A^*A$ (resp. $AA^*$), for which the columns of $Q$ (resp. $P$) give unitary eigenbases.

PROOF. Apply Theorem 12 to the setting: $V=\mathbb{C}^n$ with $\langle\cdot,\cdot\rangle=$ dot product and $\mathcal A=\hat e$; $W=\mathbb{C}^m$ with $\langle\cdot,\cdot\rangle=$ dot product and $\mathcal A'=\hat e$; $T\colon V\to W$ such that $A={}_{\mathcal A'}[T]_{\mathcal A}$. Then
$${}_{\mathcal A'}[T]_{\mathcal A}={}_{\mathcal A'}[\mathrm{Id}_W]_{\mathcal C}\;{}_{\mathcal C}[T]_{\mathcal B}\;{}_{\mathcal B}[\mathrm{Id}_V]_{\mathcal A}$$
is $A=S_{\mathcal C}\,\Delta\,S_{\mathcal B}^*$. The remaining details are immediate from the "abstract" analysis. $\square$
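For readers who compute, here is a hedged numpy sketch of Theorem 17 on an assumed $3\times4$ example (not from the text); `np.linalg.svd` returns $P$, the $\mu_j$, and $Q^*$:

```python
import numpy as np

# Sketch of the matrix SVD: A = P @ Delta @ Q*, with P (m x m) and Q* (n x n)
# unitary and Delta the m x n "diagonal" matrix of singular values.
A = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [1., 1., 1., 1.]])        # 3 x 4, rank 2

P, mu, Qstar = np.linalg.svd(A)         # P is 3x3, Qstar is 4x4, mu has min(m,n) entries
Delta = np.zeros((3, 4))
Delta[:len(mu), :len(mu)] = np.diag(mu)

print(np.allclose(P @ Delta @ Qstar, A))     # True
print(np.round(mu, 6))                       # two nonzero singular values, then ~0
```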

DEFINITION 18. The pseudoinverse of $A$ is the $n\times m$ matrix
$$A^\sim:={}_{\mathcal A}[T^\sim]_{\mathcal A'}=Q\Delta^\sim P^*,$$
where $\Delta^\sim=\mathrm{diag}_{n\times m}\{\mu_1^{-1},\dots,\mu_r^{-1},0,\dots,0\}$.

COROLLARY 19. $A^\sim A=[\mathrm{Pr}_{V_{\mathrm{col}}(A^*)}]_{\hat e}$ and $AA^\sim=[\mathrm{Pr}_{V_{\mathrm{col}}(A)}]_{\hat e}$ are matrices of orthogonal projections (under the dot product).

REMARK 20. In all of this, if $A$ is real then $A^*={}^t\!A\implies V_{\mathrm{col}}(A^*)=V_{\mathrm{row}}(A)$. In this case, we can take $P$, $Q$ orthogonal, and $A=P\Delta\,{}^tQ$ with the $\{\hat q_i\}$ appearing as the rows of ${}^tQ$.

EXAMPLE 21. Consider the rank 1 matrix   1 −1   A =  1 −1  . 2 −2

We have
$${}^t\!A\,A=\begin{pmatrix}6&-6\\-6&6\end{pmatrix}=Q\,\mathrm{diag}\{\mu_1^2,0\}\,{}^tQ$$
with $\mu_1=\sqrt{12}$ and $Q=\tfrac{1}{\sqrt2}\begin{pmatrix}1&1\\-1&1\end{pmatrix}$. Now
$$\hat p_1=\tfrac{1}{\mu_1}A\hat q_1=\tfrac{1}{\sqrt6}\begin{pmatrix}1\\1\\2\end{pmatrix},$$
while $\hat p_2,\hat p_3$ have to be an o.n. basis of $\ker({}^t\!A)$. So
$$A=\begin{pmatrix}\tfrac{1}{\sqrt6}&\tfrac{1}{\sqrt2}&\tfrac{1}{\sqrt3}\\[1mm]\tfrac{1}{\sqrt6}&\tfrac{-1}{\sqrt2}&\tfrac{1}{\sqrt3}\\[1mm]\tfrac{2}{\sqrt6}&0&\tfrac{-1}{\sqrt3}\end{pmatrix}\begin{pmatrix}\sqrt{12}&0\\0&0\\0&0\end{pmatrix}\begin{pmatrix}\tfrac{1}{\sqrt2}&\tfrac{-1}{\sqrt2}\\[1mm]\tfrac{1}{\sqrt2}&\tfrac{1}{\sqrt2}\end{pmatrix}$$
is a SVD, and
$$A^\sim=\begin{pmatrix}\tfrac{1}{\sqrt2}&\tfrac{1}{\sqrt2}\\[1mm]\tfrac{-1}{\sqrt2}&\tfrac{1}{\sqrt2}\end{pmatrix}\begin{pmatrix}\tfrac{1}{\sqrt{12}}&0&0\\0&0&0\end{pmatrix}\begin{pmatrix}\tfrac{1}{\sqrt6}&\tfrac{1}{\sqrt6}&\tfrac{2}{\sqrt6}\\[1mm]\tfrac{1}{\sqrt2}&\tfrac{-1}{\sqrt2}&0\\[1mm]\tfrac{1}{\sqrt3}&\tfrac{1}{\sqrt3}&\tfrac{-1}{\sqrt3}\end{pmatrix}=\tfrac{1}{12}\begin{pmatrix}1&1&2\\-1&-1&-2\end{pmatrix}$$
is the pseudoinverse. So for instance $A^\sim A=\tfrac12\begin{pmatrix}1&-1\\-1&1\end{pmatrix}$ does indeed orthogonally project onto $V_{\mathrm{row}}(A)=\mathrm{span}\left\{\begin{pmatrix}1\\-1\end{pmatrix}\right\}$.
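A numerical check of this example (a sketch; `np.linalg.pinv` computes $Q\Delta^\sim P^*$ internally via the SVD):

```python
import numpy as np

# Check of Example 21: the pseudoinverse and the projection A~ A.
A = np.array([[1., -1.],
              [1., -1.],
              [2., -2.]])

A_pinv = np.linalg.pinv(A)
print(np.allclose(A_pinv, np.array([[1, 1, 2], [-1, -1, -2]]) / 12))   # True
print(np.round(A_pinv @ A, 3))   # [[ 0.5 -0.5] [-0.5  0.5]] , projection onto span{(1,-1)}
```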

Applications to least squares and linear regression. Let $T\colon V\to W$ be an arbitrary linear transformation, with $V=\mathbb{C}^n$ and $W=\mathbb{C}^m$ equipped with the dot product, and $A:={}_{\hat e}[T]_{\hat e}$. Write $\vec v\in V$, $\vec w\in W$ and $[\vec v]_{\hat e}=\vec x$, $[\vec w]_{\hat e}=\vec y$.

DEFINITION 22. For any fixed $\vec w\in W$, a least-squares solution (LSS) to $T\vec v=\vec w$ (or equivalently $A\vec x=\vec y$) is any $\tilde v\in V$ minimizing $\|T\tilde v-\vec w\|$. The minimal LSS $\tilde v_{\min}$ is the (unique) LSS minimizing $\|\tilde v\|$.

The minimum distance from $\vec w$ to $\mathrm{im}(T)$ is, of course, the distance from $\vec w$ to the orthogonal projection $\tilde w:=\mathrm{Pr}_{\mathrm{im}(T)}\vec w$,

[Figure: $\vec w$, its orthogonal projection $\tilde w$ onto the subspace $\mathrm{im}(T)$, and a point $T\tilde v\in\mathrm{im}(T)$.]

and so $\tilde v$ is just any solution to

(8) $\qquad T\tilde v=\tilde w\quad(\iff A\tilde x=\tilde y).$

THEOREM 23. (i) The least-squares solutions of $T\vec v=\vec w$ are the solutions of the normal equations

(9) $\qquad T^\dagger T\tilde v=T^\dagger\vec w\quad(\iff A^*A\tilde x=A^*\vec y).$

(ii) The minimal LSS of $T\vec v=\vec w$ is

(10) $\qquad\tilde v_{\min}=T^\sim\vec w.$

PROOF. (i) Since $\mathrm{im}(T)^\perp=\ker(T^\dagger)$, $\tilde v$ satisfies (9) $\iff\vec w-T\tilde v\in\mathrm{im}(T)^\perp\iff\tilde v$ satisfies (8).

(ii) By Corollary 16, $\tilde w=TT^\sim\vec w$. So $\tilde v$ satisfies (8) $\iff\vec\kappa:=\tilde v-T^\sim\vec w\in\ker(T)$. But $\ker(T)\perp\mathrm{im}(T^\dagger)\implies\|\tilde v\|^2=\|T^\sim\vec w\|^2+\|\vec\kappa\|^2$ is minimized by $\vec\kappa=\vec 0$. $\square$

Now $T^\dagger T$ is invertible iff $T$ is 1-to-1 (cf. Exercise (1)), in which case $\tilde v\,(=\tilde v_{\min})$ is unique and the normal equations become

(11) $\qquad\tilde v=(T^\dagger T)^{-1}T^\dagger\tilde w\quad(\iff\tilde x=(A^*A)^{-1}A^*\vec y).$

If $\ker(T)\ne\{0\}$, then (9) is less convenient and (10) seems the better result. But it turns out that if $A$ is large, computationally (10) is more useful than the normal equations regardless of invertibility of $A^*A$. This is because the SVD gives

(12) $\qquad\tilde x_{\min}=A^\sim\vec y=Q\Delta^\sim P^*\vec y,$

and $\Delta^\sim$ involves inverting the $\{\mu_i\}$ whereas $(A^*A)^{-1}$ involves inverting the $\{\mu_i^2\}$. If some nonzero $\mu_i$'s differ by orders of magnitude, the computation of $(A^*A)^{-1}$ will therefore introduce significantly more error than (12). That said, the normal equations are typically simpler for small examples, as we'll now see.

EXAMPLE 24. We find the LSS for the linear system $A\vec x=\vec y$, where
$$A=\begin{pmatrix}1&-1\\1&0\\1&1\\1&2\end{pmatrix},\qquad\vec y=\begin{pmatrix}2\\0\\-2\\1\end{pmatrix}.$$
Compute
$${}^t\!A\,A=\begin{pmatrix}4&2\\2&6\end{pmatrix},\qquad({}^t\!A\,A)^{-1}=\tfrac{1}{10}\begin{pmatrix}3&-1\\-1&2\end{pmatrix}\implies\tilde x=({}^t\!A\,A)^{-1}\,{}^t\!A\,\vec y=\tfrac12\begin{pmatrix}1\\-1\end{pmatrix}.$$

Writing $\mu_\pm=\sqrt{5\pm\sqrt5}$, the SVD is
$$A=\begin{pmatrix}\frac{3-\sqrt5}{2\sqrt{10}}&\frac{3+\sqrt5}{2\sqrt{10}}&-\frac12&\frac{1}{2\sqrt5}\\[1mm]\frac{1-\sqrt5}{2\sqrt{10}}&\frac{1+\sqrt5}{2\sqrt{10}}&\frac12&-\frac{3}{2\sqrt5}\\[1mm]\frac{-1-\sqrt5}{2\sqrt{10}}&\frac{-1+\sqrt5}{2\sqrt{10}}&\frac12&\frac{3}{2\sqrt5}\\[1mm]\frac{-3-\sqrt5}{2\sqrt{10}}&\frac{-3+\sqrt5}{2\sqrt{10}}&-\frac12&-\frac{1}{2\sqrt5}\end{pmatrix}\begin{pmatrix}\mu_+&0\\0&\mu_-\\0&0\\0&0\end{pmatrix}\begin{pmatrix}-\frac{\sqrt2}{\mu_+}&-\frac{1+\sqrt5}{\sqrt2\,\mu_+}\\[1mm]\frac{\sqrt2}{\mu_-}&\frac{1-\sqrt5}{\sqrt2\,\mu_-}\end{pmatrix},$$
and replacing the middle factor by $\Delta^\sim=\mathrm{diag}_{2\times4}\{\mu_+^{-1},\mu_-^{-1}\}$ (and reversing the order of the outer factors, now transposed) gives
$$A^\sim=\tfrac{1}{10}\begin{pmatrix}4&3&2&1\\-3&-1&1&3\end{pmatrix}.$$

But using $A^\sim=({}^t\!A\,A)^{-1}\,{}^t\!A$ (see Exercise (1) below) is much easier!
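For completeness, a short numpy sketch (our own check, not part of the text) confirming that the normal equations, the pseudoinverse, and numpy's built-in least-squares routine all give $\tilde x=\tfrac12(1,-1)$ for Example 24:

```python
import numpy as np

# Verification of Example 24: three equivalent ways to get the LSS.
A = np.array([[1., -1.],
              [1., 0.],
              [1., 1.],
              [1., 2.]])
y = np.array([2., 0., -2., 1.])

x_normal = np.linalg.solve(A.T @ A, A.T @ y)     # (t A A)^{-1} t A y
x_pinv = np.linalg.pinv(A) @ y                   # A~ y, computed via the SVD
x_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]   # numpy's built-in least squares

print(np.round(x_normal, 6), np.round(x_pinv, 6), np.round(x_lstsq, 6))
# all three give [ 0.5 -0.5]
```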

Note that this example can be seen as a data-fitting (linear regression) problem: the line minimizing the sum of squares of the vertical errors at the data points $(X_i,Y_i)=(-1,2),(0,0),(1,-2),(2,1)$ is $Y=\tfrac12-\tfrac12X$.

[Figure: the four data points and the least-squares line $Y=\tfrac12-\tfrac12X$ in the $(X,Y)$-plane.]

Here is a slightly more interesting example.

EXAMPLE 25. Having entered the sports business, Monsanto is engaged in trying to grow a better baseball player. They've tracked a few little-leaguers and got the data

[Figure: scatter plot of $(X_1,Y_1)=(0,1)$, $(X_2,Y_2)=(1,3)$, $(X_3,Y_3)=(2,3)$, $(X_4,Y_4)=(3,2)$ with vertical errors $\varepsilon_1,\dots,\varepsilon_4$; $Y=$ # of home runs / season, $X=$ tens of miles from the Arch]

which suggests a parabola¹⁰

$$Y=f(X)=b_0+b_1X+b_2X^2.$$

¹⁰More seriously, rather than looking at the shape of the data set, one should try to base the form of $f$ on physical principle (see for instance Exercise (2) below).

So write

$$Y_i=b_0+b_1X_i+b_2X_i^2+\varepsilon_i\qquad(i=1,2,3,4),$$
which translates to $\vec Y=A\vec\beta+\vec\varepsilon$, where

 2    1 X1 X1 1 0 0  1 X X2   1 1 1  =  2 2  =   A  2    .  1 X3 X3   1 2 4  2 1 X4 X4 1 3 9 Minimizing ||~ε|| = ||A~β − ~Y|| just means solving Aβe = Ye, or equiv- alently t AAβe = t A~Y which is         4 6 14 b˜0 9 1.05         6 14 36 b˜1 = 15 =⇒ βe = 2.55 .       RREF   14 36 98 b˜2 33 −0.75 So using f (X) = 1.05 + 2.55X − 0.75X2 =⇒ 0 = f 0(X) = 2.55 − 1.5X is maximized at X = 1.7. Therefore the ideal distance from the Arch for raising baseball stars must be 17 miles. Because our data wasn’t questionable at all and our sample size was YUGE.

Applications to data compression. One of the many other computational applications of the SVD is called Principal Component Analysis. Suppose in a country of $N=1$ million internet users, you want to target ads (or fake news) based on online behavior, where the number $M$ of possible clicks or purchases is also very large. Assume however that only $k$ (cultural, demographic, etc.) attributes of a person generally determine this behavior: that is, there exists an $N\times k$ "attribute matrix" and a $k\times M$ "behavior matrix" whose product is roughly $A$, the $N\times M$ matrix of raw user data. In other words, we expect $A$ to be well-approximated by a rank $k$ matrix, with $k$ much smaller than $M$ and $N$.

The SVD in the first form we encountered (cf. (3)) extends to non-square matrices: in view of (7),

(13) $\qquad\displaystyle A=\sum_{\ell=1}^r\mu_\ell\cdot\underset{N\times1}{\hat p_\ell}\cdot\underset{1\times M}{\hat q_\ell^*}$

where $r=\mathrm{rank}(A)$. It turns out that

(14) $\qquad\displaystyle A_k:=\sum_{\ell=1}^k\mu_\ell\cdot\hat p_\ell\cdot\hat q_\ell^*$

is the "best" rank $k$ approximation to $A$ (in the sense of minimizing the norm¹¹ $\|A-A_k\|$ among all rank $k$ $N\times M$ matrices). In the situation just described, we'd expect $A_k$ to be very close to $A$, while requiring us to record only $k(1+M+N)$ numbers (the $\mu_\ell$ and entries of $\hat p_\ell$, $\hat q_\ell$ for $\ell=1,\dots,k$), a significant data compression vs. the $MN$ entries of $A$. The formulas (13)-(14) are close in spirit to (discrete) Fourier analysis, which breaks a function down into its constituent frequency components -- viz., $A_{ij}=\sum_{\ell,\ell'}\alpha_{\ell,\ell'}\,e^{\frac{2\pi\sqrt{-1}\,i\ell}{M}}e^{\frac{2\pi\sqrt{-1}\,j\ell'}{N}}$. But the SVD allows for many fewer terms in (14), since the $\hat p_\ell$, $\hat q_\ell$ are not fixed trigonometric functions, but rather "adapted to" the matrix $A$. On the other hand, the discrete Fourier transform doesn't require recording $\hat p_\ell$ and $\hat q_\ell$, so has that advantage. Fortunately for computational applications, MATLAB has efficient algorithms for both!
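A minimal sketch of (13)-(14) in numpy (a small random matrix assumed for illustration): truncating the SVD after $k$ terms gives the best rank-$k$ approximation, and the operator norm of the error is $\mu_{k+1}$.

```python
import numpy as np

# Truncated SVD as a rank-k approximation / compression of A.
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 6))

P, mu, Qstar = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = P[:, :k] @ np.diag(mu[:k]) @ Qstar[:k]             # sum of the first k rank-one terms

print(np.linalg.matrix_rank(A_k))                        # 2
print(np.isclose(np.linalg.norm(A - A_k, 2), mu[k]))     # operator norm of error = mu_{k+1}
```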

Exercises

(1) (a) Show that $T$ is 1-to-1 iff $T^\dagger T$ is invertible.
(b) Show that if $T$ is 1-to-1, then $T^\sim=(T^\dagger T)^{-1}T^\dagger$. (So in this case, (10) and (11) are "the same". However, the two formulas $(A^*A)^{-1}A^*$ and $Q\Delta^\sim P^*$ for $A^\sim$ really do have different strengths.)

¹¹$\|M\|:=\max_{\|\vec v\|=1}\|M\vec v\|$; in fact, $\|A-A_k\|=\mu_{k+1}$. In addition, $A_k$ minimizes the "stupid" norm $\|M\|:=\big(\sum_{i,j}|M_{ij}|^2\big)^{\frac12}$ for $M=A-A_k$.

(2) As luck would have it, at time $t=0$ bullies stuffed your backpack full of radioactive material from the Bridgeton landfill. Your legal team thinks there are two substances in there, with half-lives of one and five days respectively, but all they can do is measure the overall masses $M_1,M_2,\dots,M_n$ of the backpack at the end of days $1,2,\dots,n$. (The bullies also superglued it shut, you see.) Measurements have error, so you need to formulate a linear regression model in order to determine the original masses $A$, $B$ of the two substances at time $t=0$. This should involve an $n\times2$ matrix.

(3) Find a polar decomposition for
$$A=\begin{pmatrix}20&4&0\\0&0&1\\4&20&0\end{pmatrix}.$$

(4) Calculate the matrix SVD for
$$A=\begin{pmatrix}1&1&1&1\\1&0&-2&1\\1&-1&1&1\end{pmatrix}.$$

(5) Calculate the minimal least-squares solution of the system
$$\begin{aligned}x_1+x_2&=1\\x_1+x_2&=2\\-x_1-x_2&=0.\end{aligned}$$