Lectures on Dynamic Systems and Control

Mohammed Dahleh, Munther A. Dahleh, George Verghese

Department of Electrical Engineering and Computer Science


Massachusetts Institute of Technology


Chapter 4

Matrix Norms and Singular Value Decomposition

4.1 Introduction

In this lecture, we introduce the notion of a norm for matrices. The singular value decomposition or SVD of a matrix is then presented. The SVD exposes the 2-norm of a matrix, but its value to us goes much further: it enables the solution of a class of matrix perturbation problems that form the basis for the stability robustness concepts introduced later; it solves the so-called total least squares problem, which is a generalization of the least squares estimation problem considered earlier; and it allows us to clarify the notion of conditioning, in the context of matrix inversion. These applications of the SVD are presented at greater length in the next lecture.

Example 4.1 To provide some immediate motivation for the study and application of matrix norms, we begin with an example that clearly brings out the issue of matrix conditioning with respect to inversion. The question of interest is how sensitive the inverse of a matrix is to perturbations of the matrix.

Consider inverting the matrix

\[
A = \begin{bmatrix} 100 & 100 \\ 100.2 & 100 \end{bmatrix} \qquad (4.1)
\]

A quick calculation shows that

\[
A^{-1} = \begin{bmatrix} -5 & 5 \\ 5.01 & -5 \end{bmatrix} \qquad (4.2)
\]

Now suppose we invert the perturbed matrix

\[
A + \delta A = \begin{bmatrix} 100 & 100 \\ 100.1 & 100 \end{bmatrix} \qquad (4.3)
\]

The result now is

\[
(A + \delta A)^{-1} = A^{-1} + \delta(A^{-1}) = \begin{bmatrix} -10 & 10 \\ 10.01 & -10 \end{bmatrix} \qquad (4.4)
\]

Here $\delta A$ denotes the perturbation in $A$ and $\delta(A^{-1})$ denotes the resulting perturbation in $A^{-1}$. Evidently a 0.1% change in one entry of $A$ has resulted in a 100% change in the entries of $A^{-1}$. If we want to solve the problem $Ax = b$ where $b = [1 \; -1]^T$, then $x = A^{-1} b = [-10 \;\; 10.01]^T$, while after perturbation of $A$ we get $x + \delta x = (A + \delta A)^{-1} b = [-20 \;\; 20.01]^T$. Again, we see a 100% change in the entries of the solution with only a 0.1% change in the starting data.
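These numbers are easily reproduced. The following Matlab sketch is our own illustration, not part of the original text; the variable names and the use of the 2-norm to measure relative change are our choices.

    % Sensitivity of matrix inversion to a small perturbation (Example 4.1)
    A  = [100 100; 100.2 100];
    dA = [0 0; -0.1 0];             % 0.1% change in the (2,1) entry
    Ai  = inv(A);                   % expected: [-5 5; 5.01 -5]
    Api = inv(A + dA);              % expected: [-10 10; 10.01 -10]
    b  = [1; -1];
    x  = Ai  * b;                   % expected: [-10; 10.01]
    xp = Api * b;                   % expected: [-20; 20.01]
    rel_change_A    = norm(dA) / norm(A)          % about 5e-4
    rel_change_Ainv = norm(Api - Ai) / norm(Ai)   % about 1, i.e. 100%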

The situation seen in the above example is much worse than what can ever arise in the scalar case. If $a$ is a scalar, then $d(a^{-1})/(a^{-1}) = -\,da/a$, so the fractional change in the inverse of $a$ has the same magnitude as the fractional change in $a$ itself. What is seen in the above example, therefore, is a purely matrix phenomenon. It would seem to be related to the fact that $A$ is nearly singular, in the sense that its columns are nearly dependent, its determinant is much smaller than its largest element, and so on. In what follows (see next lecture), we shall develop a sound way to measure nearness to singularity, and show how this measure relates to sensitivity under inversion.

Before understanding such sensitivity to perturbations in more detail, we need ways to measure the "magnitudes" of vectors and matrices. We have already introduced the notion of vector norms in Lecture 1, so we now turn to the definition of matrix norms.

4.2 Matrix Norms

An $m \times n$ complex matrix may be viewed as an operator on the (finite dimensional) normed vector space $\mathbb{C}^n$:

\[
A^{m \times n} : \; (\mathbb{C}^n, \|\cdot\|_2) \longrightarrow (\mathbb{C}^m, \|\cdot\|_2) \qquad (4.5)
\]

where the norm here is taken to be the standard Euclidean norm. Define the induced 2-norm of $A$ as follows:

\[
\|A\|_2 \; \triangleq \; \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2} \qquad (4.6)
\]
\[
= \; \max_{\|x\|_2 = 1} \|Ax\|_2 \, . \qquad (4.7)
\]

The term "induced" refers to the fact that the definition of a norm for vectors such as $Ax$ and $x$ is what enables the above definition of a matrix norm. From this definition, it follows that the induced norm measures the amount of "amplification" the matrix $A$ provides to vectors on the unit sphere in $\mathbb{C}^n$, i.e. it measures the "gain" of the matrix.

Rather than measuring the vectors $x$ and $Ax$ using the 2-norm, we could use any p-norm, the interesting cases being $p = 1, 2, \infty$. Our notation for this is

\[
\|A\|_p = \max_{\|x\|_p = 1} \|Ax\|_p \, . \qquad (4.8)
\]

An important question to consider is whether or not the induced norm is actually a norm, in the sense defined for vectors in Lecture 1. Recall the three conditions that define a norm:

1. $\|x\| \ge 0$, and $\|x\| = 0 \iff x = 0$;
2. $\|\alpha x\| = |\alpha| \, \|x\|$;
3. $\|x + y\| \le \|x\| + \|y\|$.

Now let us verify that $\|A\|_p$ is a norm on $\mathbb{C}^{m \times n}$, using the preceding definition:

1. $\|A\|_p \ge 0$ since $\|Ax\|_p \ge 0$ for any $x$. Furthermore, $\|A\|_p = 0 \iff A = 0$, since $\|A\|_p$ is calculated from the maximum of $\|Ax\|_p$ evaluated on the unit sphere.

2. $\|\alpha A\|_p = |\alpha| \, \|A\|_p$ follows from $\|\alpha y\|_p = |\alpha| \, \|y\|_p$ (for any $y$).

3. The triangle inequality holds since:

\[
\|A + B\|_p = \max_{\|x\|_p = 1} \|(A + B) x\|_p
\le \max_{\|x\|_p = 1} \left( \|Ax\|_p + \|Bx\|_p \right)
\le \|A\|_p + \|B\|_p \, .
\]

Induced norms have two additional properties that are very important:

1. $\|Ax\|_p \le \|A\|_p \, \|x\|_p$, which is a direct consequence of the definition of an induced norm;

2. For $A \in \mathbb{C}^{m \times n}$, $B \in \mathbb{C}^{n \times r}$,

\[
\|AB\|_p \le \|A\|_p \, \|B\|_p \qquad (4.9)
\]

which is called the submultiplicative property. This also follows directly from the definition:

\[
\|ABx\|_p \le \|A\|_p \, \|Bx\|_p \le \|A\|_p \, \|B\|_p \, \|x\|_p \quad \text{for any } x \, .
\]

Dividing by $\|x\|_p$:

\[
\frac{\|ABx\|_p}{\|x\|_p} \le \|A\|_p \, \|B\|_p \, ,
\]

from which the result follows.

Before we turn to a more detailed study of ideas surrounding the induced 2-norm, which will be the focus of this lecture and the next, we make some remarks about the other induced norms of practical interest, namely the induced 1-norm and induced $\infty$-norm. We shall also say something about an important matrix norm that is not an induced norm, namely the Frobenius norm.

It is a fairly simple exercise to prove that

\[
\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}| \quad (\text{max of absolute column sums of } A) \, , \qquad (4.10)
\]

and

\[
\|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}| \quad (\text{max of absolute row sums of } A) \, . \qquad (4.11)
\]

(Note that these definitions reduce to the familiar ones for the 1-norm and $\infty$-norm of column vectors in the case $n = 1$.)
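These formulas are easy to sanity-check numerically. A minimal Matlab sketch (our own addition, using a randomly chosen complex matrix) compares the built-in induced norms with the column-sum and row-sum expressions:

    % Check (4.10) and (4.11) on a random complex matrix
    A = randn(4,3) + 1i*randn(4,3);
    col_sum = max(sum(abs(A), 1));   % max absolute column sum
    row_sum = max(sum(abs(A), 2));   % max absolute row sum
    [norm(A, 1),   col_sum]          % the two entries should agree
    [norm(A, inf), row_sum]          % likewise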

The proof for the induced $\infty$-norm involves two stages, namely:

1. Prove that the quantity in Equation (4.11) provides an upper bound $\gamma$:

\[
\|Ax\|_\infty \le \gamma \, \|x\|_\infty \quad \forall x \, ;
\]

2. Show that this bound is achievable for some $x = \hat{x}$:

\[
\|A\hat{x}\|_\infty = \gamma \, \|\hat{x}\|_\infty \quad \text{for some } \hat{x} \, .
\]

In order to show how these steps can be implemented, we give the details for the $\infty$-norm case. Let $x \in \mathbb{C}^n$ and consider

\[
\|Ax\|_\infty = \max_{1 \le i \le m} \Big| \sum_{j=1}^{n} a_{ij} x_j \Big|
\le \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}| \, |x_j|
\le \Big( \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}| \Big) \max_{1 \le j \le n} |x_j|
= \Big( \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}| \Big) \|x\|_\infty \, .
\]

The above inequalities show that an upper bound $\gamma$ is given by

\[
\max_{\|x\|_\infty = 1} \|Ax\|_\infty \le \gamma = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}| \, .
\]

Now in order to show that this upper bound is achieved by some vector $\hat{x}$, let $i$ be an index at which the expression for $\gamma$ achieves a maximum, that is $\gamma = \sum_{j=1}^{n} |a_{ij}|$. Define the vector $\hat{x}$ as

\[
\hat{x} = \begin{bmatrix} \mathrm{sgn}(a_{i1}) \\ \mathrm{sgn}(a_{i2}) \\ \vdots \\ \mathrm{sgn}(a_{in}) \end{bmatrix} .
\]

Clearly $\|\hat{x}\|_\infty = 1$ and

\[
\|A\hat{x}\|_\infty = \sum_{j=1}^{n} |a_{ij}| = \gamma \, .
\]

The proof for the 1-norm proceeds in exactly the same way, and is left to the reader.

There are matrix norms, i.e. functions that satisfy the three defining conditions stated earlier, that are not induced norms. The most important example of this for us is the Frobenius norm:

\[
\|A\|_F \; \triangleq \; \Big( \sum_{j=1}^{n} \sum_{i=1}^{m} |a_{ij}|^2 \Big)^{\frac{1}{2}} \qquad (4.12)
\]
\[
= \; \big( \operatorname{trace}(A'A) \big)^{\frac{1}{2}} \quad \text{(verify)} \qquad (4.13)
\]

In other words, the Frobenius norm is defined as the root sum of squares of the entries, i.e. the usual Euclidean 2-norm of the matrix when it is regarded simply as a vector in $\mathbb{C}^{mn}$. Although it can be shown that it is not an induced matrix norm, the Frobenius norm still has the submultiplicative property that was noted for induced norms. Yet other matrix norms may be defined (some of them without the submultiplicative property), but the ones above are the only ones of interest to us.
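The "verify" in (4.13) can be done symbolically in a line or two, but a quick numerical check is also reassuring; this sketch is our own addition:

    % Frobenius norm: root sum of squares vs. the trace identity (4.13)
    A = randn(3,5) + 1i*randn(3,5);
    [norm(A, 'fro'), sqrt(sum(abs(A(:)).^2)), sqrt(real(trace(A'*A)))]
    % all three entries should agree up to roundoff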

4.3 Singular Value Decomposition

Before we discuss the singular value decomposition of matrices, we begin with some matrix facts and definitions.

Some Matrix Facts:

- A matrix $U \in \mathbb{C}^{n \times n}$ is unitary if $U'U = UU' = I$. Here, as in Matlab, the superscript $'$ denotes the (entry-by-entry) complex conjugate of the transpose, which is also called the Hermitian transpose or conjugate transpose.

- A matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal if $U^T U = U U^T = I$, where the superscript $^T$ denotes the transpose.

- Property: If $U$ is unitary, then $\|Ux\|_2 = \|x\|_2$.

- If $S = S'$ (i.e. $S$ equals its Hermitian transpose, in which case we say $S$ is Hermitian), then there exists a unitary matrix $U$ such that $U'SU =$ [a diagonal matrix].$^1$

- For any matrix $A$, both $A'A$ and $AA'$ are Hermitian, and thus can always be diagonalized by unitary matrices.

- For any matrix $A$, the eigenvalues of $A'A$ and $AA'$ are always real and non-negative (proved easily by contradiction).

Theorem 4.1 (Singular Value Decomposition, or SVD) Given any matrix $A \in \mathbb{C}^{m \times n}$, $A$ can be written as

\[
A = \underbrace{U}_{m \times m} \; \underbrace{\Sigma}_{m \times n} \; \underbrace{V'}_{n \times n} \qquad (4.14)
\]

where $U'U = I$, $V'V = I$,

\[
\Sigma = \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & 0 \\ & & \sigma_r & \\ & 0 & & 0 \end{bmatrix} \, , \qquad (4.15)
\]

and $\sigma_i = \sqrt{\text{$i$th nonzero eigenvalue of } A'A}$. The $\sigma_i$ are termed the singular values of $A$, and are arranged in order of descending magnitude, i.e.,

\[
\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0 \, .
\]

Proof: We will prove this theorem for the case $\operatorname{rank}(A) = m$; the general case involves very little more than what is required for this case. The matrix $AA'$ is Hermitian, and it can therefore be diagonalized by a unitary matrix $U \in \mathbb{C}^{m \times m}$, so that

\[
U \Lambda U' = AA' \, .
\]

Note that $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_m)$ has real positive diagonal entries $\lambda_i$, due to the fact that $AA'$ is positive definite. We can write $\Lambda = \Sigma_1^2 = \operatorname{diag}(\sigma_1^2, \sigma_2^2, \dots, \sigma_m^2)$. Define $V_1' \in \mathbb{C}^{m \times n}$ by $V_1' = \Sigma_1^{-1} U' A$. $V_1'$ has orthonormal rows, as can be seen from the following calculation: $V_1' V_1 = \Sigma_1^{-1} U' A A' U \Sigma_1^{-1} = I$. Choose the matrix $V_2'$ in such a way that

\[
V' = \begin{bmatrix} V_1' \\ V_2' \end{bmatrix}
\]

is $n \times n$ and unitary. Define the $m \times n$ matrix $\Sigma = [\Sigma_1 \; | \; 0]$. This implies that

\[
\Sigma V' = \Sigma_1 V_1' = U' A \, .
\]

In other words, we have $A = U \Sigma V'$.

$^1$ One cannot always diagonalize an arbitrary matrix; cf. the Jordan form.

Example 4.2 For the matrix $A$ given at the beginning of this lecture, the SVD, computed easily in Matlab by writing [u, s, v] = svd(A), is

\[
A = \begin{bmatrix} .7068 & .7075 \\ .7075 & -.7068 \end{bmatrix}
\begin{bmatrix} 200.1 & 0 \\ 0 & 0.1 \end{bmatrix}
\begin{bmatrix} .7075 & .7068 \\ -.7068 & .7075 \end{bmatrix} \qquad (4.16)
\]
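Readers who want to reproduce (4.16) can run the following check, which is our own addition; it also confirms that the computed factors multiply back to $A$:

    % Reproduce Example 4.2 and confirm the factorization
    A = [100 100; 100.2 100];
    [u, s, v] = svd(A);
    diag(s)'              % approximately [200.1  0.1]
    norm(A - u*s*v')      % of the order of machine precision
    % sigma_2 = 0.1 is tiny relative to sigma_1 = 200.1, which quantifies
    % the near singularity of A observed in Example 4.1.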

Observations:

i)

\[
AA' = U \Sigma V' V \Sigma^T U' = U \Sigma \Sigma^T U'
= U \begin{bmatrix} \sigma_1^2 & & & \\ & \ddots & & 0 \\ & & \sigma_r^2 & \\ & 0 & & 0 \end{bmatrix} U' \qquad (4.17)
\]

which tells us $U$ diagonalizes $AA'$;

ii)

\[
A'A = V \Sigma^T U' U \Sigma V' = V \Sigma^T \Sigma V'
= V \begin{bmatrix} \sigma_1^2 & & & \\ & \ddots & & 0 \\ & & \sigma_r^2 & \\ & 0 & & 0 \end{bmatrix} V' \qquad (4.18)
\]

which tells us $V$ diagonalizes $A'A$;

iii) If $U$ and $V$ are expressed in terms of their columns, i.e.,

\[
U = \begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix}
\quad \text{and} \quad
V = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} ,
\]

then

\[
A = \sum_{i=1}^{r} \sigma_i u_i v_i' \, , \qquad (4.19)
\]

which is another way to write the SVD. The $u_i$ are termed the left singular vectors of $A$, and the $v_i$ are its right singular vectors. From this we see that we can alternately interpret $Ax$ as

\[
Ax = \sum_{i=1}^{r} \sigma_i \underbrace{(v_i' x)}_{\text{projection}} \, u_i \, , \qquad (4.20)
\]

which is a weighted sum of the $u_i$, where the weights are the products of the singular values and the projections of $x$ onto the $v_i$.

Observation (iii) tells us that $\operatorname{Ra}(A) = \operatorname{span}\{u_1, \dots, u_r\}$ (because $Ax = \sum_{i=1}^{r} c_i u_i$, where the $c_i$ are scalar weights). Since the columns of $U$ are independent, $\dim \operatorname{Ra}(A) = r = \operatorname{rank}(A)$, and $\{u_1, \dots, u_r\}$ constitute a basis for the range space of $A$. The null space of $A$ is given by $\operatorname{span}\{v_{r+1}, \dots, v_n\}$. To see this:

\[
U \Sigma V' x = 0 \iff \Sigma V' x = 0
\iff \begin{bmatrix} \sigma_1 v_1' x \\ \vdots \\ \sigma_r v_r' x \end{bmatrix} = 0
\iff v_i' x = 0 \, , \quad i = 1, \dots, r
\iff x \in \operatorname{span}\{v_{r+1}, \dots, v_n\} \, .
\]
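In Matlab, these subspace bases can be read off directly from the computed SVD. The sketch below is our own; the tolerance used to decide which singular values count as nonzero is one common convention, not something prescribed by the notes:

    % Range and null space bases from the SVD
    A = [1 2 3; 2 4 6];              % a rank-1 example of our choosing
    [U, S, V] = svd(A);
    tol = max(size(A)) * eps(norm(A));
    r = sum(diag(S) > tol);          % numerical rank, here r = 1
    range_basis = U(:, 1:r);         % span{u_1, ..., u_r} = Ra(A)
    null_basis  = V(:, r+1:end);     % span{v_{r+1}, ..., v_n} = null space
    norm(A * null_basis)             % essentially zero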

Example 4.3 One application of singular value decomposition is to the solution of a system of algebraic equations. Suppose $A$ is an $m \times n$ complex matrix and $b$ is a vector in $\mathbb{C}^m$. Assume that the rank of $A$ is equal to $k$, with $k < m$. We are looking for a solution of the linear system $Ax = b$. By applying the singular value decomposition procedure to $A$, we get

\[
A = U \Sigma V' = U \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix} V'
\]

where $\Sigma_1$ is a $k \times k$ non-singular diagonal matrix. We will express the unitary matrices $U$ and $V$ columnwise as

\[
U = \begin{bmatrix} u_1 & u_2 & \dots & u_m \end{bmatrix} , \qquad
V = \begin{bmatrix} v_1 & v_2 & \dots & v_n \end{bmatrix} .
\]

A necessary and sufficient condition for the solvability of this system of equations is that $u_i' b = 0$ for all $i$ satisfying $k < i \le m$. Otherwise, the system of equations is inconsistent. This condition means that the vector $b$ must be orthogonal to the last $m - k$ columns of $U$. Therefore the system of linear equations can be written as

\[
\begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix} V' x
= U' b
= \begin{bmatrix} u_1' b \\ u_2' b \\ \vdots \\ u_k' b \\ 0 \\ \vdots \\ 0 \end{bmatrix} .
\]

Using the above equation and the invertibility of $\Sigma_1$, we can rewrite the system of equations as

\[
\begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_k' \end{bmatrix} x
= \begin{bmatrix} \frac{1}{\sigma_1} u_1' b \\ \frac{1}{\sigma_2} u_2' b \\ \vdots \\ \frac{1}{\sigma_k} u_k' b \end{bmatrix} .
\]

By using the fact that

\[
\begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_k' \end{bmatrix}
\begin{bmatrix} v_1 & v_2 & \dots & v_k \end{bmatrix} = I \, ,
\]

we obtain a solution of the form

\[
x = \sum_{i=1}^{k} \frac{1}{\sigma_i} \, (u_i' b) \, v_i \, .
\]

From the observations that were made earlier, we know that the vectors $v_{k+1}, v_{k+2}, \dots, v_n$ span the kernel of $A$, and therefore a general solution of the system of linear equations is given by

\[
x = \sum_{i=1}^{k} \frac{1}{\sigma_i} \, (u_i' b) \, v_i + \sum_{i=k+1}^{n} \alpha_i v_i \, ,
\]

where the coefficients $\alpha_i$, with $i$ in the interval $k + 1 \le i \le n$, are arbitrary complex numbers.
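The particular solution obtained by setting all $\alpha_i = 0$ is the minimum-norm solution. A Matlab sketch of the construction, on a small consistent system of our own choosing (the matrix, right-hand side, and variable names are illustrative assumptions, not taken from the notes):

    % Minimum-norm solution of a consistent, rank-deficient system
    A = [1 0 1; 0 1 1; 1 1 2];       % rank k = 2, m = n = 3
    b = A * [1; 1; 0];               % consistent by construction
    [U, S, V] = svd(A);
    k = rank(A);
    U(:, k+1:end)' * b               % solvability check: essentially zero
    x = zeros(3, 1);
    for i = 1:k
        x = x + (U(:,i)' * b) / S(i,i) * V(:,i);  % (1/sigma_i)(u_i' b) v_i
    end
    norm(A*x - b)                    % essentially zero: x solves the system
    norm(x - pinv(A)*b)              % agrees with the pseudo-inverse solution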

4.4 Relationship to Matrix Norms

The singular value decomposition can be used to compute the induced 2-norm of a matrix $A$.

Theorem 4.2

\[
\|A\|_2 \; \triangleq \; \sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2}
= \sigma_1 = \sigma_{\max}(A) \, , \qquad (4.21)
\]

which tells us that the maximum amplification is given by the maximum singular value.

Proof:

\[
\sup_{x \neq 0} \frac{\|Ax\|_2}{\|x\|_2}
= \sup_{x \neq 0} \frac{\|U \Sigma V' x\|_2}{\|x\|_2}
= \sup_{x \neq 0} \frac{\|\Sigma V' x\|_2}{\|x\|_2}
= \sup_{y \neq 0} \frac{\|\Sigma y\|_2}{\|V y\|_2}
= \sup_{y \neq 0} \frac{\left( \sum_{i=1}^{r} \sigma_i^2 |y_i|^2 \right)^{\frac{1}{2}}}{\left( \sum_{i=1}^{n} |y_i|^2 \right)^{\frac{1}{2}}}
= \sigma_1 \, .
\]

For $y = [1 \; 0 \; \cdots \; 0]^T$, $\|\Sigma y\|_2 = \sigma_1$, and the supremum is attained. (Notice that this corresponds to $x = v_1$. Hence, $A v_1 = \sigma_1 u_1$.)
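Numerically, Theorem 4.2 is the statement that the built-in induced 2-norm and the largest singular value coincide; a one-line Matlab check (our addition):

    % Induced 2-norm equals the largest singular value (Theorem 4.2)
    A = randn(5,3);
    [norm(A, 2), max(svd(A))]        % the two entries should agree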

Another application of the singular value decomposition is in computing the minimal amplification a full rank matrix exerts on elements with 2-norm equal to 1.

Theorem 4.3 Given $A \in \mathbb{C}^{m \times n}$, suppose $\operatorname{rank}(A) = n$. Then

\[
\min_{\|x\|_2 = 1} \|Ax\|_2 = \sigma_n(A) \, . \qquad (4.22)
\]

Note that if $\operatorname{rank}(A) < n$, then there is an $x$ such that the minimum is zero (rewrite $A$ in terms of its SVD to see this).

Proof: For any $\|x\|_2 = 1$,

\[
\|Ax\|_2 = \|U \Sigma V' x\|_2
= \|\Sigma V' x\|_2 \quad \text{(invariant under multiplication by unitary matrices)}
= \|\Sigma y\|_2
\]

[Figure 4.1: Graphical depiction of the mapping involving $A^{2 \times 2}$. Note that $A v_1 = \sigma_1 u_1$ and that $A v_2 = \sigma_2 u_2$.]

for $y = V' x$. Now

\[
\|\Sigma y\|_2 = \left( \sum_{i=1}^{n} |\sigma_i y_i|^2 \right)^{\frac{1}{2}} \ge \sigma_n \, .
\]

Note that the minimum is achieved for $y = [0 \; 0 \; \cdots \; 0 \; 1]^T$; thus the proof is complete.
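Theorem 4.3 can be checked in the same spirit (our sketch; the random test vector simply illustrates the bound):

    % Minimum gain on the unit sphere equals sigma_n (Theorem 4.3)
    A = randn(5,3);                  % full column rank with probability 1
    s = svd(A);
    x = randn(3,1);  x = x / norm(x);
    [norm(A*x), s(end)]              % first entry is always >= sigma_n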

The Frobenius norm can also be expressed quite simply in terms of the singular values. We leave you to verify that

\[
\|A\|_F \; \triangleq \; \left( \sum_{j=1}^{n} \sum_{i=1}^{m} |a_{ij}|^2 \right)^{\frac{1}{2}}
= \left( \operatorname{trace}(A'A) \right)^{\frac{1}{2}}
= \left( \sum_{i=1}^{r} \sigma_i^2 \right)^{\frac{1}{2}} \qquad (4.23)
\]
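A numerical verification of (4.23), again our own addition:

    % Frobenius norm from the singular values (4.23)
    A = randn(4,6);
    [norm(A, 'fro'), sqrt(sum(svd(A).^2))]   % the two entries should agree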

Example 4.4 Matrix Inequality

We say $A \ge B$, for two square matrices $A$ and $B$, if

\[
x' A x \ge x' B x \quad \text{for all } x \neq 0 \, .
\]

It follows that for any matrix $A$, not necessarily square,

\[
\|A\|_2 \le \gamma \iff A'A \le \gamma^2 I \, .
\]
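In eigenvalue terms, $A'A \le \gamma^2 I$ says that $\gamma^2 I - A'A$ is positive semidefinite, which is easy to test numerically (our sketch, with an arbitrary test matrix):

    % ||A||_2 <= gamma  iff  gamma^2 I - A'A is positive semidefinite
    A = randn(3,3);
    gamma = norm(A, 2) + 0.1;                 % any gamma >= ||A||_2 works
    min(eig(gamma^2 * eye(3) - A'*A)) >= 0    % returns logical 1 (true)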

Exercises

Exercise 4.1 Verify that for any $A$, an $m \times n$ matrix, the following holds:

\[
\frac{1}{\sqrt{n}} \, \|A\|_\infty \le \|A\|_2 \le \sqrt{m} \, \|A\|_\infty \, .
\]

Exercise 4.2 Suppose $A' = A$. Find the exact relation between the eigenvalues and singular values of $A$. Does this hold if $A$ is not conjugate symmetric?

Exercise 4.3 Show that if $\operatorname{rank}(A) = 1$, then $\|A\|_2 = \|A\|_F$.

Exercise 4.4 This problem leads you through the argument for the existence of the SVD, using an iterative construction. Showing that $A = U \Sigma V'$, where $U$ and $V$ are unitary matrices, is equivalent to showing that $U' A V = \Sigma$.

a) Argue from the definition of $\|A\|_2$ that there exist unit vectors (measured in the 2-norm) $x \in \mathbb{C}^n$ and $y \in \mathbb{C}^m$ such that $Ax = \sigma y$, where $\sigma = \|A\|_2$.

2

b) We can extend b oth x and y ab ove to orthonormal bases, i.e. we can �nd unitary matrices

V and U whose �rst columns are x and y resp ectively:

1 1

~ ~

] V � [x V � U � [y U ]

1 1 1 1

Show that one way to do this is via Householder transformations, as follows:

0

hh

0

� I ; 2 V � h � x ; [1� 0� : : : � 0]

1

0

h h

. likewise for U and

1

c) Now define $A_1 = U_1' A V_1$. Why is $\|A_1\|_2 = \|A\|_2$?

d) Note that

\[
A_1 = \begin{bmatrix} y' A x & y' A \tilde{V}_1 \\ \tilde{U}_1' A x & \tilde{U}_1' A \tilde{V}_1 \end{bmatrix}
= \begin{bmatrix} \sigma & w' \\ 0 & B \end{bmatrix} .
\]

What is the justification for claiming that the lower left element in the above matrix is 0?

e) Now show that

\[
\left\| A_1 \begin{bmatrix} \sigma \\ w \end{bmatrix} \right\|_2 \ge \sigma^2 + w' w \, ,
\]

and combine this with the fact that $\|A_1\|_2 = \|A\|_2 = \sigma$ to deduce that $w = 0$, so

\[
A_1 = \begin{bmatrix} \sigma & 0 \\ 0 & B \end{bmatrix} .
\]

At the next iteration, we apply the above procedure to $B$, and so on. When the iterations terminate, we have the SVD.

[The reason that this is only an existence proof and not an algorithm is that it begins by invoking the existence of $x$ and $y$, but does not show how to compute them. Very good algorithms do exist for computing the SVD; see Golub and Van Loan's classic, Matrix Computations, Johns Hopkins Press, 1989. The SVD is a cornerstone of numerical computations in a host of applications.]

Exercise 4.5 Suppose the $m \times n$ matrix $A$ is decomposed in the form

\[
A = U \begin{bmatrix} \Sigma & 0 \\ 0 & 0 \end{bmatrix} V'
\]

where $U$ and $V$ are unitary matrices, and $\Sigma$ is an invertible $r \times r$ matrix (the SVD could be used to produce such a decomposition). Then the "Moore-Penrose inverse", or pseudo-inverse, of $A$, denoted by $A^+$, can be defined as the $n \times m$ matrix

\[
A^+ = V \begin{bmatrix} \Sigma^{-1} & 0 \\ 0 & 0 \end{bmatrix} U' \, .
\]

(You can invoke it in Matlab with pinv(A).)
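To see the definition in action, one can assemble $A^+$ directly from the SVD and compare it with the built-in pinv; this sketch, with an example matrix of our choosing, is our own addition:

    % Pseudo-inverse assembled from the SVD versus pinv
    A = [1 2; 2 4; 3 6];                       % rank r = 1
    [U, S, V] = svd(A);
    r = rank(A);
    Ap = V(:, 1:r) * inv(S(1:r, 1:r)) * U(:, 1:r)';
    norm(Ap - pinv(A))                         % essentially zero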

a) Show that $A^+ A$ and $A A^+$ are symmetric, and that $A A^+ A = A$ and $A^+ A A^+ = A^+$. (These four conditions actually constitute an alternative definition of the pseudo-inverse.)

b) Show that when $A$ has full column rank then $A^+ = (A'A)^{-1} A'$, and that when $A$ has full row rank then $A^+ = A' (AA')^{-1}$.

c) Show that, of all $x$ that minimize $\|y - Ax\|_2$ (and there will be many, if $A$ does not have full column rank), the one with smallest length $\|x\|_2$ is given by $\hat{x} = A^+ y$.

Exercise 4.6 All the matrices in this problem are real. Suppose

\[
A = Q \begin{bmatrix} R \\ 0 \end{bmatrix}
\]

with $Q$ being an $m \times m$ orthogonal matrix and $R$ an $n \times n$ invertible matrix. (Recall that such a decomposition exists for any matrix $A$ that has full column rank.) Also let $Y$ be an $m \times p$ matrix of the form

\[
Y = Q \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
\]

where the partitioning in the expression for $Y$ is conformable with the partitioning for $A$.

(a) What choice $\hat{X}$ of the $n \times p$ matrix $X$ minimizes the Frobenius norm, or equivalently the squared Frobenius norm, of $Y - AX$? In other words, find

\[
\hat{X} = \operatorname{argmin} \|Y - AX\|_F^2 \, .
\]

Also determine the value of $\|Y - A\hat{X}\|_F^2$. (Your answers should be expressed in terms of the matrices $Q$, $R$, $Y_1$ and $Y_2$.)

(b) Can your $\hat{X}$ in (a) also be written as $(A'A)^{-1} A' Y$? Can it be written as $A^+ Y$, where $A^+$ denotes the (Moore-Penrose) pseudo-inverse of $A$?

(c) Now obtain an expression for the choice $\bar{X}$ of $X$ that minimizes

\[
\|Y - AX\|_F^2 + \|Z - BX\|_F^2
\]

where $Z$ and $B$ are given matrices of appropriate dimensions. (Your answer can be expressed in terms of $A$, $B$, $Y$, and $Z$.)

Exercise 4.7 Structured Singular Values

Given a complex square matrix $A$, define the structured singular value as follows:

\[
\mu_{\mathbf{\Delta}}(A) = \frac{1}{\min_{\Delta \in \mathbf{\Delta}} \left\{ \sigma_{\max}(\Delta) \mid \det(I - \Delta A) = 0 \right\}}
\]

where $\mathbf{\Delta}$ is some set of matrices.

a) If $\mathbf{\Delta} = \{ \alpha I : \alpha \in \mathbb{C} \}$, show that $\mu_{\mathbf{\Delta}}(A) = \rho(A)$, where $\rho$ is the spectral radius of $A$, defined as $\rho(A) = \max_i |\lambda_i|$, where the $\lambda_i$'s are the eigenvalues of $A$.

b) If $\mathbf{\Delta} = \{ \Delta \in \mathbb{C}^{n \times n} \}$, show that $\mu_{\mathbf{\Delta}}(A) = \sigma_{\max}(A)$.

c) If $\mathbf{\Delta} = \{ \operatorname{diag}(\delta_1, \dots, \delta_n) \mid \delta_i \in \mathbb{C} \}$, show that

\[
\rho(A) \le \mu_{\mathbf{\Delta}}(A) = \mu_{\mathbf{\Delta}}(D A D^{-1}) \le \sigma_{\max}(D A D^{-1})
\]

where

\[
D \in \{ \operatorname{diag}(d_1, \dots, d_n) \mid d_i > 0 \} \, .
\]

Exercise 4.8 Consider again the structured singular value function of a complex square matrix $A$ defined in the preceding problem. If $A$ has more structure, it is sometimes possible to compute $\mu_{\mathbf{\Delta}}(A)$ exactly. In this problem, assume $A$ is a rank-one matrix, so that we can write $A = u v'$, where $u, v$ are complex vectors of dimension $n$. Compute $\mu_{\mathbf{\Delta}}(A)$ when

(a) $\mathbf{\Delta} = \{ \operatorname{diag}(\delta_1, \dots, \delta_n) \mid \delta_i \in \mathbb{C} \}$.

(b) $\mathbf{\Delta} = \{ \operatorname{diag}(\delta_1, \dots, \delta_n) \mid \delta_i \in \mathbb{R} \}$.

To simplify the computation, minimize the Frobenius norm of $\Delta$ in the definition of $\mu_{\mathbf{\Delta}}(A)$.

MIT OpenCourseWare
http://ocw.mit.edu

6.241J / 16.338J Dynamic Systems and Control
Spring 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.