Singular Value Decomposition

Optimization Models, EECS 127 / EECS 227AT, Laurent El Ghaoui, EECS Department, UC Berkeley, Fall 2018.

Lecture 6: Singular Value Decomposition
(Figure: the license plate of Gene Golub, 1932-2007.)

Outline

1. The singular value decomposition (SVD): motivation, dyads, SVD theorem
2. Matrix properties via SVD: rank, nullspace and range; matrix norms, pseudo-inverses; projectors
3. Low-rank matrix approximation
4. Link with PCA
5. Generalized low-rank models

Motivation

Figure: votes of US Senators, 2002-2004. The raw plot is impossible to read. Can we make sense of this data? For example, do the two political parties emerge from the data alone? Could we further provide a "political spectrum score" for each Senator?

Dyads

A matrix $A \in \mathbb{R}^{m,n}$ is called a dyad if it can be written as
$$A = p q^\top$$
for some vectors $p \in \mathbb{R}^m$, $q \in \mathbb{R}^n$. Element-wise, this reads
$$A_{ij} = p_i q_j, \quad 1 \le i \le m, \ 1 \le j \le n.$$
Interpretation: the columns of $A$ are scaled copies of the same column $p$, with scaling factors given by the entries of $q$; the rows of $A$ are scaled copies of the same row $q^\top$, with scaling factors given by the entries of $p$.

Example: video frames. We are given a set of image frames representing a video. Assuming that each image is represented by a row vector of pixels, we can represent the whole video sequence as a matrix $A$ in which each row is one image (Figure: row-vector representation of an image). If the video shows a scene where no movement occurs, then every row is the same image, so the matrix $A$ is a dyad.

Sums of dyads

The singular value decomposition theorem, seen next, states that any matrix can be written as a sum of dyads,
$$A = \sum_{i=1}^r p_i q_i^\top,$$
for vectors $p_i$, $q_i$ that are mutually orthogonal. This allows us to interpret data matrices as sums of "simpler" matrices (dyads).

The singular value decomposition (SVD): SVD theorem

The singular value decomposition (SVD) of a matrix provides a three-term factorization which is similar to the spectral factorization, but holds for any, possibly non-symmetric and rectangular, matrix $A \in \mathbb{R}^{m,n}$.

Theorem 1 (SVD decomposition). Any matrix $A \in \mathbb{R}^{m,n}$ can be factored as
$$A = U \tilde{\Sigma} V^\top,$$
where $U \in \mathbb{R}^{m,m}$ and $V \in \mathbb{R}^{n,n}$ are orthogonal matrices (i.e., $U^\top U = I_m$, $V^\top V = I_n$), and $\tilde{\Sigma} \in \mathbb{R}^{m,n}$ has its first $r := \operatorname{rank} A$ diagonal entries $(\sigma_1, \ldots, \sigma_r)$ positive and decreasing, and all other entries zero:
$$\tilde{\Sigma} = \begin{bmatrix} \Sigma & 0_{r,\,n-r} \\ 0_{m-r,\,r} & 0_{m-r,\,n-r} \end{bmatrix}, \qquad \Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_r) \succ 0.$$

Compact-form SVD

Corollary 1 (Compact-form SVD). Any matrix $A \in \mathbb{R}^{m,n}$ can be expressed as
$$A = \sum_{i=1}^r \sigma_i u_i v_i^\top = U_r \Sigma V_r^\top,$$
where $r = \operatorname{rank} A$, $U_r = [u_1 \cdots u_r]$ is such that $U_r^\top U_r = I_r$, $V_r = [v_1 \cdots v_r]$ is such that $V_r^\top V_r = I_r$, and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$.

The positive numbers $\sigma_i$ are called the singular values of $A$, the vectors $u_i$ the left singular vectors, and the vectors $v_i$ the right singular vectors of $A$. These quantities satisfy
$$A v_i = \sigma_i u_i, \qquad u_i^\top A = \sigma_i v_i^\top, \qquad i = 1, \ldots, r.$$
Moreover, $\sigma_i^2 = \lambda_i(A A^\top) = \lambda_i(A^\top A)$, $i = 1, \ldots, r$, and the $u_i$, $v_i$ are eigenvectors of $A A^\top$ and of $A^\top A$, respectively.

Interpretation

The SVD theorem allows us to write any matrix as a sum of dyads,
$$A = \sum_{i=1}^r \sigma_i u_i v_i^\top,$$
where the vectors $u_i$, $v_i$ are normalized, with $\sigma_i > 0$ providing the "strength" of the corresponding dyad, and the vectors $u_i$, $i = 1, \ldots, r$ (resp. $v_i$, $i = 1, \ldots, r$) are mutually orthogonal.
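To make the compact-form SVD and the dyadic expansion concrete, here is a minimal NumPy sketch; the example matrix and variable names are illustrative choices, not part of the lecture.

```python
import numpy as np

# Build a rank-2 matrix as a sum of two dyads p q^T (illustrative data).
p1, q1 = np.array([1., 2., 0., 3., 1.]), np.array([1., 0., 2.])
p2, q2 = np.array([0., 1., 1., 0., 2.]), np.array([2., 1., 0.])
A = np.outer(p1, q1) + np.outer(p2, q2)            # A is 5 x 3, rank 2

# Full SVD: A = U @ Sigma_tilde @ V^T, with U (5x5) and V (3x3) orthogonal.
U, s, Vt = np.linalg.svd(A)        # s: singular values, in decreasing order
r = int(np.sum(s > 1e-12))         # numerical rank = number of nonzero sigma_i
print(r)                           # 2

# Compact-form SVD: A = U_r Sigma V_r^T, keeping only the first r triplets.
Ur, Sr, Vr = U[:, :r], s[:r], Vt[:r, :].T
print(np.allclose(A, Ur @ np.diag(Sr) @ Vr.T))     # True

# Dyadic expansion: A = sum_i sigma_i u_i v_i^T, with orthonormal u_i and v_i.
A_dyads = sum(Sr[i] * np.outer(Ur[:, i], Vr[:, i]) for i in range(r))
print(np.allclose(A, A_dyads))                     # True
print(np.allclose(A @ Vr, Ur * Sr))                # checks A v_i = sigma_i u_i
```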
Matrix properties via SVD: rank, nullspace and range

The rank $r$ of $A$ equals the number of nonzero singular values, that is, the number of nonzero entries on the diagonal of $\tilde{\Sigma}$. Since $r = \operatorname{rank} A$, by the fundamental theorem of linear algebra the dimension of the nullspace of $A$ is $\dim \mathcal{N}(A) = n - r$. An orthonormal basis spanning $\mathcal{N}(A)$ is given by the last $n - r$ columns of $V$, i.e.,
$$\mathcal{N}(A) = \mathcal{R}(V_{nr}), \qquad V_{nr} := [v_{r+1} \cdots v_n].$$
Similarly, an orthonormal basis spanning the range of $A$ is given by the first $r$ columns of $U$, i.e.,
$$\mathcal{R}(A) = \mathcal{R}(U_r), \qquad U_r = [u_1 \cdots u_r].$$

Matrix norms

The squared Frobenius norm of a matrix $A \in \mathbb{R}^{m,n}$ can be written as
$$\|A\|_F^2 = \operatorname{trace} A^\top A = \sum_{i=1}^n \lambda_i(A^\top A) = \sum_{i=1}^r \sigma_i^2,$$
where the $\sigma_i$ are the singular values of $A$. Hence the squared Frobenius norm is nothing but the sum of the squares of the singular values.

The squared spectral norm $\|A\|_2^2$ is equal to the maximum eigenvalue of $A^\top A$, therefore
$$\|A\|_2^2 = \sigma_1^2,$$
i.e., the spectral norm of $A$ coincides with its maximum singular value.

The so-called nuclear norm of $A$ is defined in terms of its singular values:
$$\|A\|_* = \sum_{i=1}^r \sigma_i, \qquad r = \operatorname{rank} A.$$
The nuclear norm appears in several problems related to low-rank matrix completion or rank minimization.

Spectral norm: proof

We have
$$\|A\|_2^2 = \max_{x \ne 0} \frac{\|Ax\|_2^2}{\|x\|_2^2} = \max_{x \ne 0} \frac{\|\tilde{\Sigma} V^\top x\|_2^2}{\|x\|_2^2} \quad (\text{since } U^\top U = I)$$
$$= \max_{z \ne 0} \frac{\|\tilde{\Sigma} z\|_2^2}{\|z\|_2^2} \quad (\text{with } z := V^\top x)$$
$$= \max_z \left\{ \|\tilde{\Sigma} z\|_2^2 : z^\top z = 1 \right\}$$
$$= \max_z \left\{ \sum_{i=1}^r \sigma_i^2 z_i^2 : z^\top z = 1 \right\}$$
$$= \max_{\mathbf{1}^\top p = 1,\ p \ge 0} \sum_{i=1}^r \sigma_i^2 p_i \quad (\text{with } p_i := z_i^2, \ i = 1, \ldots, r)$$
$$\le \max_{1 \le i \le r} \sigma_i^2.$$
In the last line the upper bound is attained by a feasible vector $p$ (put all the weight on the index of the largest $\sigma_i^2$), hence equality holds.

Condition number

The condition number of an invertible matrix $A \in \mathbb{R}^{n,n}$ is defined as the ratio between the largest and the smallest singular value:
$$\kappa(A) = \frac{\sigma_1}{\sigma_n} = \|A\|_2 \cdot \|A^{-1}\|_2.$$
This number provides a quantitative measure of how close $A$ is to being singular: the larger $\kappa(A)$, the closer $A$ is to singular. The condition number also measures the sensitivity of the solution of a system of linear equations to changes in the equation coefficients.

Matrix pseudo-inverses

Given $A \in \mathbb{R}^{m,n}$, a pseudoinverse of $A$ is a matrix $A^\dagger$ that satisfies
$$A A^\dagger A = A, \qquad A^\dagger A A^\dagger = A^\dagger, \qquad (A A^\dagger)^\top = A A^\dagger, \qquad (A^\dagger A)^\top = A^\dagger A.$$
A specific pseudoinverse is the so-called Moore-Penrose pseudoinverse
$$A^\dagger = V \tilde{\Sigma}^\dagger U^\top \in \mathbb{R}^{n,m}, \qquad \tilde{\Sigma}^\dagger = \begin{bmatrix} \Sigma^{-1} & 0_{r,\,m-r} \\ 0_{n-r,\,r} & 0_{n-r,\,m-r} \end{bmatrix}, \qquad \Sigma^{-1} = \operatorname{diag}\!\left(\frac{1}{\sigma_1}, \ldots, \frac{1}{\sigma_r}\right) \succ 0.$$
Due to the zero blocks in $\tilde{\Sigma}^\dagger$, $A^\dagger$ can be written compactly as
$$A^\dagger = V_r \Sigma^{-1} U_r^\top.$$

Moore-Penrose pseudo-inverse: special cases

If $A$ is square and nonsingular, then $A^\dagger = A^{-1}$.

If $A \in \mathbb{R}^{m,n}$ has full column rank, that is $r = n \le m$, then
$$A^\dagger A = V_r V_r^\top = V V^\top = I_n,$$
that is, $A^\dagger$ is a left inverse of $A$, and it has the explicit expression
$$A^\dagger = (A^\top A)^{-1} A^\top.$$

If $A \in \mathbb{R}^{m,n}$ has full row rank, that is $r = m \le n$, then
$$A A^\dagger = U_r U_r^\top = U U^\top = I_m,$$
that is, $A^\dagger$ is a right inverse of $A$, and it has the explicit expression
$$A^\dagger = A^\top (A A^\top)^{-1}.$$

Projectors

The matrix $P_{\mathcal{R}(A)} := A A^\dagger = U_r U_r^\top$ is an orthogonal projector onto $\mathcal{R}(A)$. This means that, for any $y \in \mathbb{R}^m$, the solution of the problem
$$\min_{z \in \mathcal{R}(A)} \|z - y\|_2$$
is given by $z^* = P_{\mathcal{R}(A)} y$.

Similarly, the matrix $P_{\mathcal{N}(A^\top)} := I_m - A A^\dagger$ is an orthogonal projector onto $\mathcal{R}(A)^\perp = \mathcal{N}(A^\top)$; the matrix $P_{\mathcal{N}(A)} := I_n - A^\dagger A$ is an orthogonal projector onto $\mathcal{N}(A)$; and the matrix $P_{\mathcal{N}(A)^\perp} := A^\dagger A$ is an orthogonal projector onto $\mathcal{N}(A)^\perp = \mathcal{R}(A^\top)$.
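All of the quantities above can be read off the SVD. The following NumPy sketch, using a small rank-deficient matrix chosen purely for illustration (not from the lecture), checks the rank, the three norms, the nullspace and range bases, the Moore-Penrose pseudoinverse and the projectors against NumPy's built-in routines.

```python
import numpy as np

# A deliberately rank-deficient 4 x 3 matrix (second column = 2 x first column).
A = np.array([[1., 2., 0.],
              [2., 4., 0.],
              [0., 0., 1.],
              [3., 6., 0.]])

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))                   # rank = number of nonzero singular values
print(r, np.linalg.matrix_rank(A))           # both report 2

# Norms computed from the singular values.
fro2 = np.sum(s**2)                          # ||A||_F^2 = sum_i sigma_i^2
spec = s[0]                                  # ||A||_2   = sigma_1
nuc  = np.sum(s)                             # ||A||_*   = sum_i sigma_i
print(np.isclose(fro2, np.linalg.norm(A, 'fro')**2),
      np.isclose(spec, np.linalg.norm(A, 2)),
      np.isclose(nuc,  np.linalg.norm(A, 'nuc')))

# Nullspace basis: last n-r columns of V; range basis: first r columns of U.
V = Vt.T
N_basis = V[:, r:]                           # N(A) = R(V_nr)
R_basis = U[:, :r]                           # R(A) = R(U_r)
print(np.allclose(A @ N_basis, 0))           # columns of N_basis lie in the nullspace

# Moore-Penrose pseudoinverse A^dagger = V_r Sigma^{-1} U_r^T, and projectors.
A_pinv = V[:, :r] @ np.diag(1.0 / s[:r]) @ U[:, :r].T
print(np.allclose(A_pinv, np.linalg.pinv(A)))
P_range = A @ A_pinv                         # orthogonal projector onto R(A)
P_null  = np.eye(A.shape[1]) - A_pinv @ A    # orthogonal projector onto N(A)
print(np.allclose(P_range @ P_range, P_range),   # idempotent
      np.allclose(P_null @ N_basis, N_basis))    # leaves N(A) fixed
```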
Low-rank matrix approximation

Let $A \in \mathbb{R}^{m,n}$ be a given matrix with $\operatorname{rank}(A) = r > 0$. We consider the problem of approximating $A$ with a matrix of lower rank; in particular, the rank-constrained approximation problem
$$\min_{A_k \in \mathbb{R}^{m,n}} \|A - A_k\|_F^2 \quad \text{s.t.} \quad \operatorname{rank}(A_k) = k,$$
where $1 \le k \le r$ is given. Let
$$A = U \tilde{\Sigma} V^\top = \sum_{i=1}^r \sigma_i u_i v_i^\top$$
be an SVD of $A$. An optimal solution of the above problem is obtained simply by truncating the summation to the $k$-th term, that is,
$$A_k = \sum_{i=1}^k \sigma_i u_i v_i^\top.$$

The ratio
$$\eta_k = \frac{\|A_k\|_F^2}{\|A\|_F^2} = \frac{\sigma_1^2 + \cdots + \sigma_k^2}{\sigma_1^2 + \cdots + \sigma_r^2}$$
indicates what fraction of the total variance (squared Frobenius norm) of $A$ is explained by the rank-$k$ approximation. A plot of $\eta_k$ as a function of $k$ can give useful guidance on a good rank level $k$ at which to approximate $A$. $\eta_k$ is related to the relative squared approximation error
$$e_k = \frac{\|A - A_k\|_F^2}{\|A\|_F^2} = \frac{\sigma_{k+1}^2 + \cdots + \sigma_r^2}{\sigma_1^2 + \cdots + \sigma_r^2} = 1 - \eta_k.$$

Minimum "distance" to rank deficiency

Suppose $A \in \mathbb{R}^{m,n}$, $m \ge n$, is full rank, i.e., $\operatorname{rank}(A) = n$. The distance from $A$ to the set of rank-deficient matrices, measured in spectral norm, equals the smallest singular value:
$$\min_{\delta A} \left\{ \|\delta A\|_2 : \operatorname{rank}(A + \delta A) < n \right\} = \sigma_n.$$
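As a sanity check on the truncation formula and on the identity $e_k = 1 - \eta_k$, here is a small NumPy sketch; the random data matrix and the helper name `truncated_svd_approx` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def truncated_svd_approx(A, k):
    """Rank-k approximation of A obtained by keeping the k largest singular triplets
    (optimal in Frobenius norm, per the low-rank approximation result above)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Illustrative data matrix of rank at most 8 (e.g. rows = samples, columns = features).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 8)) @ rng.standard_normal((8, 20))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
eta = np.cumsum(s**2) / np.sum(s**2)        # eta_k: fraction of variance explained
for k in (1, 2, 4, 8):
    Ak = truncated_svd_approx(A, k)
    ek = np.linalg.norm(A - Ak, 'fro')**2 / np.linalg.norm(A, 'fro')**2
    print(k, round(eta[k - 1], 4), round(1 - ek, 4))   # the two columns agree
```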
