Matrix Norms and Singular Value Decomposition

Lectures on Dynamic Systems and Control

Mohammed Dahleh, Munther A. Dahleh, George Verghese
Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology

Chapter 4
Matrix Norms and Singular Value Decomposition

4.1 Introduction

In this lecture, we introduce the notion of a norm for matrices. The singular value decomposition or SVD of a matrix is then presented. The SVD exposes the 2-norm of a matrix, but its value to us goes much further: it enables the solution of a class of matrix perturbation problems that form the basis for the stability robustness concepts introduced later; it solves the so-called total least squares problem, which is a generalization of the least squares estimation problem considered earlier; and it allows us to clarify the notion of conditioning, in the context of matrix inversion. These applications of the SVD are presented at greater length in the next lecture.

Example 4.1 To provide some immediate motivation for the study and application of matrix norms, we begin with an example that clearly brings out the issue of matrix conditioning with respect to inversion. The question of interest is how sensitive the inverse of a matrix is to perturbations of the matrix.

Consider inverting the matrix

$$ A = \begin{bmatrix} 100 & 100 \\ 100.2 & 100 \end{bmatrix} \tag{4.1} $$

A quick calculation shows that

$$ A^{-1} = \begin{bmatrix} -5 & 5 \\ 5.01 & -5 \end{bmatrix} \tag{4.2} $$

Now suppose we invert the perturbed matrix

$$ A + \delta A = \begin{bmatrix} 100 & 100 \\ 100.1 & 100 \end{bmatrix} \tag{4.3} $$

The result now is

$$ (A + \delta A)^{-1} = A^{-1} + \delta(A^{-1}) = \begin{bmatrix} -10 & 10 \\ 10.01 & -10 \end{bmatrix} \tag{4.4} $$

Here $\delta A$ denotes the perturbation in $A$ and $\delta(A^{-1})$ denotes the resulting perturbation in $A^{-1}$. Evidently a 0.1% change in one entry of $A$ has resulted in a 100% change in the entries of $A^{-1}$. If we want to solve the problem $Ax = b$ where $b = [1 \;\; -1]^T$, then $x = A^{-1} b = [-10 \;\; 10.01]^T$, while after perturbation of $A$ we get $x + \delta x = (A + \delta A)^{-1} b = [-20 \;\; 20.01]^T$. Again, we see a 100% change in the entries of the solution with only a 0.1% change in the starting data.

The situation seen in the above example is much worse than what can ever arise in the scalar case. If $a$ is a scalar, then $d(a^{-1})/(a^{-1}) = -da/a$, so the fractional change in the inverse of $a$ has the same magnitude as the fractional change in $a$ itself. What is seen in the above example, therefore, is a purely matrix phenomenon. It would seem to be related to the fact that $A$ is nearly singular, in the sense that its columns are nearly dependent, its determinant is much smaller than its largest element, and so on. In what follows (see next lecture), we shall develop a sound way to measure nearness to singularity, and show how this measure relates to sensitivity under inversion.

Before understanding such sensitivity to perturbations in more detail, we need ways to measure the "magnitudes" of vectors and matrices. We have already introduced the notion of vector norms in Lecture 1, so we now turn to the definition of matrix norms.

4.2 Matrix Norms

An $m \times n$ complex matrix may be viewed as an operator on the (finite dimensional) normed vector space $\mathbb{C}^n$:

$$ A^{m \times n} : (\mathbb{C}^n, \|\cdot\|_2) \longrightarrow (\mathbb{C}^m, \|\cdot\|_2) \tag{4.5} $$

where the norm here is taken to be the standard Euclidean norm. Define the induced 2-norm of $A$ as follows:

$$ \|A\|_2 \triangleq \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} \tag{4.6} $$

$$ \phantom{\|A\|_2} = \max_{\|x\|_2 = 1} \|Ax\|_2 . \tag{4.7} $$

The term "induced" refers to the fact that the definition of a norm for vectors such as $Ax$ and $x$ is what enables the above definition of a matrix norm. From this definition, it follows that the induced norm measures the amount of "amplification" the matrix $A$ provides to vectors on the unit sphere in $\mathbb{C}^n$, i.e. it measures the "gain" of the matrix.

Rather than measuring the vectors $x$ and $Ax$ using the 2-norm, we could use any $p$-norm, the interesting cases being $p = 1, 2, \infty$.
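As an illustration that is not part of the original notes, the numbers in Example 4.1 and the "gain" interpretation of the induced 2-norm in (4.6)-(4.7) can be checked with a short standard-library Python sketch. The helper names `inv2`, `matvec`, `norm2` and `gain2` are hypothetical, chosen only for this example. The gain is merely approximated by sampling real unit vectors on a half circle, which is assumed adequate here because the matrix is real and 2 × 2.

```python
import math

def inv2(M):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]] (assumes det != 0)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def norm2(x):
    """Euclidean 2-norm of a real vector."""
    return math.sqrt(sum(v * v for v in x))

def gain2(M, samples=20000):
    """Approximate max ||Mx||_2 over real unit vectors x by sampling angles.

    Half a circle suffices because ||M(-x)||_2 = ||Mx||_2.
    """
    best = 0.0
    for k in range(samples):
        t = math.pi * k / samples
        best = max(best, norm2(matvec(M, [math.cos(t), math.sin(t)])))
    return best

# Data of Example 4.1
A = [[100.0, 100.0], [100.2, 100.0]]
A_pert = [[100.0, 100.0], [100.1, 100.0]]
b = [1.0, -1.0]

A_inv = inv2(A)                # compare with (4.2)
A_pert_inv = inv2(A_pert)      # compare with (4.4)
x = matvec(A_inv, b)           # solution of Ax = b
x_pert = matvec(A_pert_inv, b) # solution after perturbing A

# Fractional change in the perturbed entry of A versus in the solution
rel_change_A = abs(A_pert[1][0] - A[1][0]) / abs(A[1][0])
rel_change_x = norm2([p - q for p, q in zip(x_pert, x)]) / norm2(x)

# Sampled estimate of the induced 2-norm ("gain") of A
gain_A = gain2(A)

print("relative change in A entry:", rel_change_A)
print("relative change in x:      ", rel_change_x)
print("estimated ||A||_2:         ", gain_A)
```

The sampling approach is chosen only because it mirrors definition (4.7) directly; it does not scale to larger matrices, for which the SVD presented in this chapter is the appropriate tool.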
Our notation for the induced $p$-norm is

$$ \|A\|_p = \max_{\|x\|_p = 1} \|Ax\|_p . \tag{4.8} $$

An important question to consider is whether or not the induced norm is actually a norm, in the sense defined for vectors in Lecture 1. Recall the three conditions that define a norm:

1. $\|x\| \geq 0$, and $\|x\| = 0 \iff x = 0$;
2. $\|\alpha x\| = |\alpha| \, \|x\|$;
3. $\|x + y\| \leq \|x\| + \|y\|$.

Now let us verify that $\|A\|_p$ is a norm on $\mathbb{C}^{m \times n}$, using the preceding definition:

1. $\|A\|_p \geq 0$ since $\|Ax\|_p \geq 0$ for any $x$. Furthermore, $\|A\|_p = 0 \iff A = 0$, since $\|A\|_p$ is calculated from the maximum of $\|Ax\|_p$ evaluated on the unit sphere.
2. $\|\alpha A\|_p = |\alpha| \, \|A\|_p$ follows from $\|\alpha y\|_p = |\alpha| \, \|y\|_p$ (for any $y$).
3. The triangle inequality holds since:

$$ \|A + B\|_p = \max_{\|x\|_p = 1} \|(A + B)x\|_p \leq \max_{\|x\|_p = 1} \left( \|Ax\|_p + \|Bx\|_p \right) \leq \|A\|_p + \|B\|_p . $$

Induced norms have two additional properties that are very important:

1. $\|Ax\|_p \leq \|A\|_p \|x\|_p$, which is a direct consequence of the definition of an induced norm;
2. For $A^{m \times n}$, $B^{n \times r}$,

$$ \|AB\|_p \leq \|A\|_p \|B\|_p \tag{4.9} $$

which is called the submultiplicative property. This also follows directly from the definition:

$$ \|ABx\|_p \leq \|A\|_p \|Bx\|_p \leq \|A\|_p \|B\|_p \|x\|_p \quad \text{for any } x . $$

Dividing by $\|x\|_p$ (for $x \neq 0$):

$$ \frac{\|ABx\|_p}{\|x\|_p} \leq \|A\|_p \|B\|_p , $$

from which the result follows.

Before we turn to a more detailed study of ideas surrounding the induced 2-norm, which will be the focus of this lecture and the next, we make some remarks about the other induced norms of practical interest, namely the induced 1-norm and induced $\infty$-norm. We shall also say something about an important matrix norm that is not an induced norm, namely the Frobenius norm.

It is a fairly simple exercise to prove that

$$ \|A\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^{m} |a_{ij}| \quad \text{(max of absolute column sums of } A\text{)} , \tag{4.10} $$

and

$$ \|A\|_\infty = \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \quad \text{(max of absolute row sums of } A\text{)} . \tag{4.11} $$

(Note that these definitions reduce to the familiar ones for the 1-norm and $\infty$-norm of column vectors in the case $n = 1$.)

The proof for the induced $\infty$-norm involves two stages, namely:

1. Prove that the quantity in Equation (4.11) provides an upper bound $\gamma$: $\|Ax\|_\infty \leq \gamma \|x\|_\infty \;\; \forall x$;
2. Show that this bound is achievable for some $x = \hat{x}$: $\|A\hat{x}\|_\infty = \gamma \|\hat{x}\|_\infty$ for some $\hat{x}$.

In order to show how these steps can be implemented, we give the details for the $\infty$-norm case. Let $x \in \mathbb{C}^n$ and consider

$$ \|Ax\|_\infty = \max_{1 \leq i \leq m} \Big| \sum_{j=1}^{n} a_{ij} x_j \Big| \leq \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \, |x_j| \leq \Big( \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \Big) \max_{1 \leq j \leq n} |x_j| = \Big( \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| \Big) \|x\|_\infty . $$

The above inequalities show that an upper bound $\gamma$ is given by

$$ \max_{\|x\|_\infty = 1} \|Ax\|_\infty \leq \gamma = \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{ij}| . $$

Now in order to show that this upper bound is achieved by some vector $\hat{x}$, let $\bar{i}$ be an index at which the expression of $\gamma$ achieves a maximum, that is $\gamma = \sum_{j=1}^{n} |a_{\bar{i}j}|$. Define the vector $\hat{x}$ as

$$ \hat{x} = \begin{bmatrix} \operatorname{sgn}(a_{\bar{i}1}) \\ \operatorname{sgn}(a_{\bar{i}2}) \\ \vdots \\ \operatorname{sgn}(a_{\bar{i}n}) \end{bmatrix} . $$

Clearly $\|\hat{x}\|_\infty = 1$ and

$$ \|A\hat{x}\|_\infty = \sum_{j=1}^{n} |a_{\bar{i}j}| = \gamma . $$

The proof for the 1-norm proceeds in exactly the same way, and is left to the reader.

There are matrix norms, i.e. functions that satisfy the three defining conditions stated earlier, that are not induced norms. The most important example of this for us is the Frobenius norm:

$$ \|A\|_F \triangleq \Big( \sum_{j=1}^{n} \sum_{i=1}^{m} |a_{ij}|^2 \Big)^{\frac{1}{2}} \tag{4.12} $$

$$ \phantom{\|A\|_F} = \big( \operatorname{trace}(A'A) \big)^{\frac{1}{2}} \quad \text{(verify)} \tag{4.13} $$

In other words, the Frobenius norm is defined as the root sum of squares of the entries, i.e. the usual Euclidean 2-norm of the matrix when it is regarded simply as a vector in $\mathbb{C}^{mn}$. Although it can be shown that it is not an induced matrix norm, the Frobenius norm still has the submultiplicative property that was noted for induced norms. Yet other matrix norms may be defined (some of them without the submultiplicative property), but the ones above are the ones of interest to us.
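The column-sum and row-sum formulas (4.10)-(4.11), the maximizing vector $\hat{x}$ from the $\infty$-norm proof, and the trace identity (4.13) can be checked numerically. The following standard-library Python sketch is an illustration added here, not code from the notes. It assumes a real matrix stored as a list of rows, so that $A'$ is the ordinary transpose and `sgn` is the real sign function. The function names and the test matrix are hypothetical.

```python
import math

def norm_1(A):
    """Induced 1-norm: maximum absolute column sum, as in (4.10)."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Induced infinity-norm: maximum absolute row sum, as in (4.11)."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_fro(A):
    """Frobenius norm: root sum of squares of the entries, as in (4.12)."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def trace_AtA(A):
    """trace(A'A) for a real matrix: the sum of the squared column norms."""
    return sum(sum(row[j] * row[j] for row in A) for j in range(len(A[0])))

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def sgn(a):
    """Sign of a real number: -1, 0 or 1."""
    return (a > 0) - (a < 0)

def achieving_vector(A):
    """Vector x_hat from the proof: signs of the row with the largest absolute sum."""
    i_bar = max(range(len(A)), key=lambda i: sum(abs(a) for a in A[i]))
    return [sgn(a) for a in A[i_bar]]

# Hypothetical 2 x 3 test matrix
A = [[1.0, -2.0, 3.0],
     [-4.0, 5.0, -6.0]]

gamma = norm_inf(A)          # upper bound from the first stage of the proof
x_hat = achieving_vector(A)  # unit infinity-norm vector attaining the bound
Ax_hat = matvec(A, x_hat)

print("||A||_1       =", norm_1(A))
print("||A||_inf     =", gamma)
print("x_hat         =", x_hat)
print("||A x_hat||_inf =", max(abs(v) for v in Ax_hat))
print("||A||_F^2     =", norm_fro(A) ** 2, " trace(A'A) =", trace_AtA(A))
```

For complex matrices the same construction would presumably need the conjugate phase of each entry in place of `sgn`; that case is not covered by this sketch.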
