Lecture 11 Singular Value Decomposition
Lecture 11 Singular value decomposition
Mathématiques appliquées (Applied Mathematics, MATH0504-1)
B. Dewals, Ch. Geuzaine
03/12/2020

Singular value decomposition (SVD) at a glance

Motivation: the image of the unit sphere S under any m × n matrix A is a hyperellipse.

[Figure: the unit sphere S in the domain is mapped by x ↦ Ax to the hyperellipse AS in the range.]

The singular value decomposition (SVD) is a particular matrix factorization. Through the SVD, we will infer important properties of matrix A from the shape of AS.

Why is the singular value decomposition of particular importance?

The reasons for looking at the SVD are twofold:
1. The computation of the SVD is used as an intermediate step in many algorithms of practical interest.
2. From a conceptual point of view, the SVD enables a deeper understanding of many problems in linear algebra.

Learning objectives & outline

Become familiar with the SVD and its geometric interpretation, and become aware of its significance.
1. Reminder of some fundamentals in linear algebra
2. Geometric interpretation
3. From "reduced SVD" to "full SVD", and formal definition
4. Existence and uniqueness

1 – Reminder: fundamentals in linear algebra

In this section, we briefly review the concepts of adjoint matrix, matrix rank, unitary matrix, as well as matrix norms (Chapters 2 and 3 in Trefethen & Bau, 1997).

Adjoint of a matrix

The adjoint (or Hermitian conjugate) of an m × n matrix A, written A*, is the n × m matrix
• whose i, j entry
• is the complex conjugate of the j, i entry of A.
If A = A*, A is Hermitian (or self-adjoint).
For a real matrix A,
• the adjoint is the transpose: A* = A^T;
• if the matrix is Hermitian, that is A = A^T, then it is symmetric.
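The adjoint and the Hermitian property can be checked numerically; here is a minimal NumPy sketch (the matrices are illustrative, not from the lecture):

```python
import numpy as np

# Adjoint A* = conjugate transpose of A.
A = np.array([[1 + 2j, 3],
              [4j, 5]])
A_star = A.conj().T          # n x m, entry (i, j) = conjugate of A[j, i]

# For a real matrix, the adjoint reduces to the transpose.
B = np.array([[1., 2.], [3., 4.]])
assert np.array_equal(B.conj().T, B.T)

# A matrix is Hermitian (self-adjoint) if A = A*.
H = np.array([[2., 1j], [-1j, 3.]])
print(np.allclose(H, H.conj().T))  # True
```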
Matrix rank

The rank of a matrix is the number of linearly independent columns (or rows) of the matrix; the numbers of linearly independent columns and rows of a matrix are equal.

An m × n matrix of full rank is one that has the maximal possible rank (the lesser of m and n). If m ≥ n, such a matrix is characterized by the property that it maps no two distinct vectors to the same vector.

[Figure: the image AS of the unit sphere S under a full-rank 2 × 2 matrix is a non-degenerate ellipse (no two distinct vectors are mapped to the same vector), whereas under a rank-deficient 2 × 2 matrix the ellipse collapses and distinct vectors are mapped to the same vector.]

Unitary matrix

A square matrix Q ∈ ℂ^{m×m} is unitary (or orthogonal, in the real case) if Q* = Q⁻¹, i.e. Q*Q = I.

The columns q_i of a unitary matrix form an orthonormal basis of ℂ^m: (q_i)* q_j = δ_ij, with δ_ij the Kronecker delta.

A rotation matrix is a typical example of a unitary matrix

A rotation matrix R may be written:
R = [ cos θ   −sin θ ]
    [ sin θ    cos θ ]
The image of a vector is the same vector, rotated counterclockwise by an angle θ. Matrix R
• is orthogonal
• and R*R = R^T R = I.

(Induced) matrix norms are defined from the action of the matrix on vectors

For a matrix A ∈ ℂ^{m×n}, and given vector norms
• ‖·‖(n) on the domain of A,
• ‖·‖(m) on the range of A,
the induced matrix norm ‖A‖(m,n) is the smallest number C for which the following inequality holds for all x ∈ ℂ^n:
‖Ax‖(m) ≤ C ‖x‖(n)
It is the maximum factor by which A can "stretch" a vector x.

The matrix norm can be defined equivalently in terms of the images of the unit vectors under A:
‖A‖(m,n) = max_{x ∈ ℂ^n, x ≠ 0} ‖Ax‖(m) / ‖x‖(n) = max_{x ∈ ℂ^n, ‖x‖(n) = 1} ‖Ax‖(m)
This form is convenient for visualizing induced matrix norms, as in this example:
A = [ 1   2 ]        ‖A‖₂ ≈ 2.9208
    [ 0   2 ]
[Figure: the unit circle and its image under A; the 2-norm of A is the length of the longest semiaxis of the ellipse AS.]

2 – Geometric interpretation

In this section, we introduce the SVD conceptually, by means of a simple geometric interpretation (Chapter 4 in Trefethen & Bau, 1997).
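Both the rank and the induced 2-norm of the example matrix can be recovered from its singular values; a minimal NumPy sketch:

```python
import numpy as np

# The 2x2 matrix from the norm example.
A = np.array([[1., 2.],
              [0., 2.]])

# Rank = number of nonzero singular values (here: full rank).
print(np.linalg.matrix_rank(A))               # 2

# The induced 2-norm is the maximum stretch factor of A;
# it equals the largest singular value.
print(round(np.linalg.norm(A, 2), 4))         # 2.9208
print(round(np.linalg.svd(A, compute_uv=False)[0], 4))  # 2.9208
```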
Geometric interpretation

Let S be the unit sphere in ℝ^n:
S = { x ∈ ℝ^n : ‖x‖₂ = 1 },  with ‖x‖₂ = ( x₁² + … + xₙ² )^{1/2}
Consider any matrix A ∈ ℝ^{m×n}, with m ≥ n. Assume for the moment that A has full rank n.

The image AS is a hyperellipse in ℝ^m. This fact is not obvious, but let us assume for now that it is true; it will be proved later.

A "hyperellipse" is the m-dimensional generalization of an ellipse in 2D

In ℝ^m, a hyperellipse is a surface obtained by
• stretching the unit sphere in ℝ^m
• by some factors σ₁, …, σ_m (possibly zero)
• in some orthogonal directions u₁, …, u_m ∈ ℝ^m.
For convenience, let us take the u_i to be unit vectors, i.e. ‖u_i‖₂ = 1. The vectors {σ_i u_i} are the principal semiaxes of the hyperellipse.

If A has rank r, exactly r of the lengths σ_i will be nonzero. In particular, if m ≥ n, at most n of them will be nonzero.

Singular values

We stated at the beginning that the SVD enables characterizing properties of matrix A from the shape of AS. Here we go for three definitions.

We define the n singular values of matrix A as the lengths of the n principal semiaxes of AS, denoted σ₁, …, σ_n. It is conventional to number the singular values in descending order: σ₁ ≥ σ₂ ≥ … ≥ σ_n.

Left singular vectors

We also define the n left singular vectors of matrix A as
• the unit vectors {u₁, …, u_n}
• oriented in the directions of the principal semiaxes of AS,
• numbered to correspond with the singular values.
Thus, the vector σ_i u_i is the ith largest principal semiaxis.

Right singular vectors

We also define the n right singular vectors of matrix A as the unit vectors {v₁, …, v_n} ⊂ S that are the preimages of the principal semiaxes of AS, numbered so that A v_j = σ_j u_j.
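These definitions can be checked numerically: NumPy returns the singular values in descending order, and each right singular vector v_j is mapped by A onto σ_j u_j. A small sketch with an illustrative 3 × 2 full-rank matrix:

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 1.],
              [0., 1.]])   # m = 3, n = 2, full rank

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular values, in descending order: lengths of the semiaxes of AS.
print(s)

# Each right singular vector v_j (a row of Vt) satisfies A v_j = sigma_j u_j.
for j in range(2):
    assert np.allclose(A @ Vt[j], s[j] * U[:, j])
```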
[Figure: the right singular vectors v₁, v₂ on the unit sphere S are mapped by A to the principal semiaxes σ₁u₁, σ₂u₂ of the hyperellipse AS.]

Important remarks

The terms "left" and "right" singular vectors will be understood later, as we move forward with a more formal description of the SVD.

In the geometric interpretation presented so far, we assumed that matrix A is real and m = n = 2. Actually, the SVD applies
• to both real and complex matrices,
• whatever the number of dimensions.

3 – From reduced to full SVD, and formal definition

In this section, we distinguish between the so-called "reduced SVD", often used in practice, and the "full SVD". We also introduce the formal definition of the SVD (Chapter 4 in Trefethen & Bau, 1997).

The equations relating right and left singular vectors can be expressed in matrix form

We just mentioned that the equations relating the right singular vectors {v_j} and the left singular vectors {u_j} can be written
A v_j = σ_j u_j,  1 ≤ j ≤ n.
This collection of vector equations can be expressed as a matrix equation:
A [ v₁  v₂  …  v_n ] = [ u₁  u₂  …  u_n ] · diag(σ₁, σ₂, …, σ_n)

This matrix equation can be written in a more compact form:
A V = Û Σ̂
with
• Σ̂ an n × n diagonal matrix with positive real entries (as A was assumed to have full rank n),
• Û an m × n matrix with orthonormal columns,
• V an n × n matrix with orthonormal columns.
(The hats on Û and Σ̂ distinguish them from U and Σ in the "full SVD".)
Thus, V is unitary (i.e. V* = V⁻¹), and we obtain:
A = Û Σ̂ V*

Reduced SVD

The factorization of matrix A in the form A = Û Σ̂ V* is called a reduced singular value decomposition, or reduced SVD, of matrix A. Schematically, it looks like this (m ≥ n):
A (m × n) = Û (m × n) · Σ̂ (n × n) · V* (n × n)

From reduced SVD to … full SVD

The columns of Û are n orthonormal vectors in the m-dimensional space ℂ^m. Unless m = n, they do not form a basis of ℂ^m, nor is Û a unitary matrix. However, we may "upgrade" Û to a unitary matrix!
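Both the reduced factorization A = Û Σ̂ V* and its upgrade to a unitary U can be obtained from NumPy's `np.linalg.svd` via the `full_matrices` flag; a minimal sketch with an illustrative random 5 × 3 matrix:

```python
import numpy as np

A = np.random.default_rng(0).standard_normal((5, 3))   # m = 5 >= n = 3

# Reduced SVD: U_hat is m x n, Sigma_hat is n x n, V is n x n.
U_hat, s, Vt = np.linalg.svd(A, full_matrices=False)
S_hat = np.diag(s)
print(U_hat.shape, S_hat.shape, Vt.shape)   # (5, 3) (3, 3) (3, 3)
assert np.allclose(A, U_hat @ S_hat @ Vt)   # A = U_hat Sigma_hat V*

# Columns of U_hat are orthonormal; V is unitary.
assert np.allclose(U_hat.T @ U_hat, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))

# Full SVD: U is upgraded to a unitary m x m matrix, and Sigma becomes
# Sigma_hat stacked on top of m - n rows of zeros.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(s)
print(U.shape, Sigma.shape, Vt.shape)       # (5, 5) (5, 3) (3, 3)
assert np.allclose(A, U @ Sigma @ Vt)
assert np.allclose(U @ U.T, np.eye(5))      # U unitary (orthogonal: real case)
```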
Let us adjoin an additional m − n orthonormal columns to matrix Û, so that it becomes unitary. The m − n additional orthonormal columns are chosen arbitrarily, and the result is denoted U. However, Σ̂ must change too.

For the product to remain unaltered, the last m − n columns of U should be multiplied by zero. Accordingly, let Σ be the m × n matrix consisting of
• Σ̂ in the upper n × n block,
• together with m − n rows of zeros below.
The m − n added columns of U are "silent": they multiply the zero rows of Σ and do not affect the product.

We get a new factorization of A, called the full SVD:
A = U Σ V*
• U is an m × m unitary matrix,
• V is an n × n unitary matrix,
• Σ is an m × n diagonal matrix with real entries.

Generalization to the case of a matrix A which does not have full rank

If matrix A is rank-deficient (i.e.