Lecture 4. The Singular Value Decomposition
The singular value decomposition (SVD) is a matrix factorization whose computation is a step in many algorithms. Equally important is the use of the SVD for conceptual purposes. Many problems of linear algebra can be better understood if we first ask the question: what if we take the SVD?

A Geometric Observation

The SVD is motivated by the following geometric fact:

The image of the unit sphere under any $m \times n$ matrix is a hyperellipse.

The SVD is applicable to both real and complex matrices. However, in describing the geometric interpretation, we assume as usual that the matrix is real.

The term hyperellipse may be unfamiliar, but this is just the $m$-dimensional generalization of an ellipse. We may define a hyperellipse in $\mathbb{R}^m$ as the surface obtained by stretching the unit sphere in $\mathbb{R}^m$ by some factors $\sigma_1, \dots, \sigma_m$ (possibly zero) in some orthogonal directions $u_1, \dots, u_m \in \mathbb{R}^m$. For convenience, let us take the $u_i$ to be unit vectors. The vectors $\{\sigma_i u_i\}$ are the principal semiaxes of the hyperellipse, with lengths $\sigma_1, \dots, \sigma_m$. If $A$ has rank $r$, exactly $r$ of the lengths $\sigma_i$ will turn out to be nonzero, and in particular, if $m \ge n$, at most $n$ of them will be nonzero.

Part I. Fundamentals

The statement above has the following meaning. By the unit sphere we mean the usual Euclidean sphere in $n$-space, that is, the unit sphere in the 2-norm; let us denote it by $S$. Then $AS$, the image of $S$ under the mapping $A$, is a hyperellipse as just defined. This geometric fact is not obvious. We shall restate it in the language of linear algebra and prove it later. For the moment, assume it is true.

[Figure: SVD of a $2 \times 2$ matrix. The map $A$ carries the unit circle $S$, with orthonormal vectors $v_1, v_2$, to the hyperellipse $AS$ with principal semiaxes $\sigma_1 u_1$ and $\sigma_2 u_2$.]

Let $S$ be the unit sphere in $\mathbb{R}^n$, and take any $A \in \mathbb{R}^{m \times n}$ with $m \ge n$. For simplicity, suppose for the moment that $A$ has full rank $n$. The image $AS$ is a hyperellipse in $\mathbb{R}^m$. We now define some properties of $A$ in terms of the shape of $AS$. The key ideas are indicated in the figure above.

First, we define the $n$ singular values of $A$. These are the lengths of the $n$ principal semiaxes of $AS$, written $\sigma_1, \sigma_2, \dots, \sigma_n$. It is conventional to assume that the singular values are numbered in descending order, $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n > 0$.
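This correspondence between semiaxis lengths and singular values is easy to check numerically. The following sketch (the matrix $A$ is my own example, not from the text) samples the unit circle, maps it through a $2 \times 2$ matrix, and compares the extreme radii of the resulting ellipse with the singular values computed by NumPy:

```python
# Numerical sketch (my own example matrix, not from the text): the image of
# the unit circle under a 2 x 2 matrix A is an ellipse whose principal
# semiaxis lengths equal the singular values of A.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# Sample the unit circle S densely and map it through A.
theta = np.linspace(0.0, 2.0 * np.pi, 100_000)
S = np.vstack([np.cos(theta), np.sin(theta)])   # columns are points on S
AS = A @ S                                      # image of the circle under A

# Extreme distances from the origin to the ellipse AS ...
radii = np.linalg.norm(AS, axis=0)
# ... should match the singular values (returned in descending order).
sigma = np.linalg.svd(A, compute_uv=False)

print(radii.max(), sigma[0])   # longest semiaxis,  approximately sigma_1
print(radii.min(), sigma[1])   # shortest semiaxis, approximately sigma_2
```

The agreement is only up to the sampling resolution of the circle, but it makes the geometric fact tangible before we prove it.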
Next, we define the $n$ left singular vectors of $A$. These are the unit vectors $\{u_1, u_2, \dots, u_n\}$ oriented in the directions of the principal semiaxes of $AS$, numbered to correspond with the singular values. Thus the vector $\sigma_i u_i$ is the $i$th largest principal semiaxis of $AS$.

Finally, we define the $n$ right singular vectors of $A$. These are the unit vectors $\{v_1, v_2, \dots, v_n\} \subseteq S$ that are the preimages of the principal semiaxes of $AS$, numbered so that $A v_j = \sigma_j u_j$.

The terms "left" and "right" in the definitions above are decidedly awkward. They come from the positions of the factors $U$ and $V$ in the factorizations below. What is awkward is that in a sketch like the figure above, the left singular vectors correspond to the space on the right, and the right singular vectors correspond to the space on the left. One could resolve this problem by interchanging the two halves of the figure, with the map $A$ pointing from right to left, but that would go against deeply ingrained habits.

Reduced SVD

We have just mentioned that the equations relating right singular vectors $\{v_j\}$ and left singular vectors $\{u_j\}$ can be written

$$A v_j = \sigma_j u_j, \qquad 1 \le j \le n.$$

This collection of vector equations can be expressed as a matrix equation,

$$A \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_n \end{bmatrix},$$

or, more compactly, $AV = \hat{U}\hat{\Sigma}$. In this matrix equation, $\hat{\Sigma}$ is an $n \times n$ diagonal matrix with positive real entries (since $A$ was assumed to have full rank $n$), $\hat{U}$ is an $m \times n$ matrix with orthonormal columns, and $V$ is an $n \times n$ matrix with orthonormal columns. Thus $V$ is unitary, and we can multiply on the right by its inverse $V^*$ to obtain

$$A = \hat{U} \hat{\Sigma} V^*.$$

This factorization of $A$ is called a reduced singular value decomposition, or reduced SVD, of $A$. Schematically, the $m \times n$ matrix $A$ factors as $\hat{U}$ ($m \times n$) times $\hat{\Sigma}$ ($n \times n$) times $V^*$ ($n \times n$).
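The reduced factorization just derived can be checked directly in NumPy, whose `np.linalg.svd` with `full_matrices=False` returns precisely the "hatted" factors. The example matrix below is a hypothetical one of my own, not from the text:

```python
# Sketch of the reduced SVD in NumPy (random example matrix, my own choice).
# np.linalg.svd with full_matrices=False returns the "hatted" factors:
# U_hat (m x n, orthonormal columns), the singular values, and V* (n x n).
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))      # full rank with probability 1

U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)
Sigma_hat = np.diag(s)               # n x n, positive diagonal
V = Vh.conj().T

print(U_hat.shape, Sigma_hat.shape, Vh.shape)   # (5, 3) (3, 3) (3, 3)

# The defining relations: A v_j = sigma_j u_j, i.e. A V = U_hat Sigma_hat,
# and, multiplying by V* on the right, A = U_hat Sigma_hat V*.
assert np.allclose(A @ V, U_hat @ Sigma_hat)
assert np.allclose(A, U_hat @ Sigma_hat @ Vh)

# U_hat has orthonormal columns, but (since m != n) it is not unitary.
assert np.allclose(U_hat.conj().T @ U_hat, np.eye(n))
```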
Full SVD

In most applications, the SVD is used in exactly the form just described. However, this is not the standard way in which the idea of an SVD is usually formulated. We have introduced the awkward term "reduced" and the unsightly hats on $\hat{U}$ and $\hat{\Sigma}$ in order to distinguish this factorization from the more standard full SVD. This reduced vs. full terminology and hatted notation will be maintained throughout the book, and we shall make a similar distinction between reduced and full QR factorizations.

The idea is as follows. The columns of $\hat{U}$ are $n$ orthonormal vectors in the $m$-dimensional space $\mathbb{C}^m$. Unless $m = n$, they do not form a basis of $\mathbb{C}^m$, nor is $\hat{U}$ a unitary matrix. However, by adjoining an additional $m - n$ orthonormal columns, $\hat{U}$ can be extended to a unitary matrix. Let us do this in an arbitrary fashion and call the result $U$. If $\hat{U}$ is replaced by $U$ in the factorization $A = \hat{U}\hat{\Sigma}V^*$, then $\hat{\Sigma}$ will have to change too. For the product to remain unaltered, the last $m - n$ columns of $U$ should be multiplied by zero. Accordingly, let $\Sigma$ be the $m \times n$ matrix consisting of $\hat{\Sigma}$ in the upper $n \times n$ block together with $m - n$ rows of zeros below. We now have a new factorization, the full SVD of $A$:

$$A = U \Sigma V^*.$$

Here $U$ is $m \times m$ and unitary, $V$ is $n \times n$ and unitary, and $\Sigma$ is $m \times n$ and diagonal with positive real entries. Schematically, $A$ ($m \times n$) factors as $U$ ($m \times m$) times $\Sigma$ ($m \times n$) times $V^*$ ($n \times n$); the last $m - n$ columns of $U$ and rows of $\Sigma$ are the "silent" ones, discarded in passing from the full SVD to the reduced SVD.

Having described the full SVD, we can now discard the simplifying assumption that $A$ has full rank. If $A$ is rank-deficient, the full factorization is still appropriate. All that changes is that now not $n$ but only $r$ of the left singular vectors of $A$ are determined by the geometry of the hyperellipse. To construct the unitary matrix $U$, we introduce $m - r$ instead of just $m - n$ additional arbitrary orthonormal columns. The matrix $V$ will also need $n - r$ arbitrary orthonormal columns to extend the $r$ columns determined by the geometry. The matrix $\Sigma$ will now have $r$ positive diagonal entries, with the remaining $n - r$ equal to zero.
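A short numerical sketch of these two points, using a deliberately rank-deficient example matrix of my own (not from the text): `full_matrices=True` yields the square unitary factors of the full SVD, $\Sigma$ has the same shape as $A$, and only $r$ of the singular values are nonzero.

```python
# Sketch (deliberately rank-deficient example of my own): full_matrices=True
# gives the square unitary factors of the full SVD, Sigma has the same shape
# as A, and only r = rank(A) singular values are nonzero.
import numpy as np

m, n = 5, 3
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],    # = 2 * row 1
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 4.0],    # = row 1 + row 3
              [0.0, 0.0, 0.0]])   # so rank(A) = 2

U, s, Vh = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vh.shape)          # (5, 5) (3,) (3, 3)

# Sigma is m x n: the singular values on the diagonal, zeros elsewhere.
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)
assert np.allclose(A, U @ Sigma @ Vh)

# Only r singular values are nonzero (up to rounding), matching rank(A).
r = int(np.sum(s > 1e-10))
print(r)                                   # 2
```

The tolerance `1e-10` is an arbitrary choice for this small example; in practice a threshold scaled by machine epsilon and the largest singular value is preferred.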
By the same token, the reduced SVD also makes sense for matrices $A$ of less than full rank. One can take $\hat{U}$ to be $m \times n$, with $\hat{\Sigma}$ of dimensions $n \times n$ with some zeros on the diagonal, or further compress the representation so that $\hat{U}$ is $m \times r$ and $\hat{\Sigma}$ is $r \times r$ and strictly positive on the diagonal.

Formal Definition

Let $m$ and $n$ be arbitrary; we do not require $m \ge n$. Given $A \in \mathbb{C}^{m \times n}$, not necessarily of full rank, a singular value decomposition (SVD) of $A$ is a factorization

$$A = U \Sigma V^*,$$

where

$U \in \mathbb{C}^{m \times m}$ is unitary,
$V \in \mathbb{C}^{n \times n}$ is unitary,
$\Sigma \in \mathbb{R}^{m \times n}$ is diagonal.

In addition, it is assumed that the diagonal entries $\sigma_j$ of $\Sigma$ are nonnegative and in nonincreasing order; that is, $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_p \ge 0$, where $p = \min(m, n)$.

Note that the diagonal matrix $\Sigma$ has the same shape as $A$ even when $A$ is not square, but $U$ and $V$ are always square unitary matrices.

It is clear that the image of the unit sphere in $\mathbb{R}^n$ under a map $A = U \Sigma V^*$ must be a hyperellipse in $\mathbb{R}^m$. The unitary map $V^*$ preserves the sphere, the diagonal matrix $\Sigma$ stretches the sphere into a hyperellipse aligned with the canonical basis, and the final unitary map $U$ rotates or reflects the hyperellipse without changing its shape. Thus, if we can prove that every matrix has an SVD, we shall have proved that the image of the unit sphere under any linear map is a hyperellipse, as claimed at the outset of this lecture.

Existence and Uniqueness

Theorem. Every matrix $A \in \mathbb{C}^{m \times n}$ has a singular value decomposition. Furthermore, the singular values $\{\sigma_j\}$ are uniquely determined, and, if $A$ is square and the $\sigma_j$ are distinct, the left and right singular vectors $\{u_j\}$ and $\{v_j\}$ are uniquely determined up to complex signs (i.e., complex scalar factors of absolute value 1).

Proof. To prove existence of the SVD, we isolate the direction of the largest action of $A$ and then proceed by induction on the dimension of $A$. Set $\sigma_1 = \|A\|_2$. By a compactness argument, there must be a vector $v_1 \in \mathbb{C}^n$ with $\|v_1\|_2 = 1$ and $\|u_1\|_2 = \sigma_1$, where $u_1 = A v_1$. Consider any extensions of $v_1$ to an orthonormal basis $\{v_j\}$ of $\mathbb{C}^n$ and of $u_1$ to an orthonormal basis $\{u_j\}$ of $\mathbb{C}^m$, and let $U_1$ and $V_1$ denote the unitary matrices with columns $u_j$ and $v_j$, respectively.
Then we have

$$U_1^* A V_1 = S = \begin{bmatrix} \sigma_1 & w^* \\ 0 & B \end{bmatrix},$$

where $0$ is a column vector of dimension $m - 1$, $w^*$ is a row vector of dimension $n - 1$, and $B$ has dimensions $(m-1) \times (n-1)$. Furthermore,

$$\left\| \begin{bmatrix} \sigma_1 & w^* \\ 0 & B \end{bmatrix} \begin{bmatrix} \sigma_1 \\ w \end{bmatrix} \right\|_2 \ge \sigma_1^2 + w^* w = (\sigma_1^2 + w^* w)^{1/2} \left\| \begin{bmatrix} \sigma_1 \\ w \end{bmatrix} \right\|_2,$$

implying $\|S\|_2 \ge (\sigma_1^2 + w^* w)^{1/2}$. Since $U_1$ and $V_1$ are unitary, we know that $\|S\|_2 = \|A\|_2 = \sigma_1$, so this implies $w = 0$.

If $n = 1$ or $m = 1$, we are done. Otherwise, the submatrix $B$ describes the action of $A$ on the subspace orthogonal to $v_1$. By the induction hypothesis, $B$ has an SVD $B = U_2 \Sigma_2 V_2^*$. Now it is easily verified that

$$A = U_1 \begin{bmatrix} 1 & 0 \\ 0 & U_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & V_2 \end{bmatrix}^* V_1^*$$

is an SVD of $A$, completing the proof of existence.

For the uniqueness claim, the geometric justification is straightforward: if the semiaxis lengths of a hyperellipse are distinct, then the semiaxes themselves are determined by the geometry, up to signs. Algebraically, we can argue as follows. First we note that $\sigma_1$ is uniquely determined by the condition that it is equal to $\|A\|_2$, as follows from the argument above. Now suppose that, in addition to $v_1$, there is another linearly independent vector $w$ with $\|w\|_2 = 1$ and $\|A w\|_2 = \sigma_1$. Define a unit vector $v_2$, orthogonal to $v_1$, as a linear combination of $v_1$ and $w$,

$$v_2 = \frac{w - (v_1^* w)\, v_1}{\|w - (v_1^* w)\, v_1\|_2}.$$

Since $\|A\|_2 = \sigma_1$, $\|A v_2\|_2 \le \sigma_1$; but this must be