<<

Chapter 6

Eigenvalues and Eigenvectors

Po-Ning Chen, Professor

Department of Electrical and Computer Engineering

National Chiao Tung University

Hsin Chu, Taiwan 30010, R.O.C.

6.1 Introduction to eigenvalues

Motivations

• The static system problem Ax = b has now been solved, e.g., by the Gauss-Jordan method or Cramer's rule.

• However, a dynamic system problem such as

Ax = λx

cannot be solved by the static system method.

• To solve the dynamic system problem, we need to find the static feature of A that is "unchanged" under the mapping A. In other words, A maps x to (a scalar multiple of) itself, with possibly some stretching (λ > 1), shrinking (0 < λ < 1), or reversal (λ < 0).

• These invariant characteristics of A are the eigenvalues and eigenvectors. Ax maps a vector x to the column space C(A). We are looking for a v ∈ C(A) such that Av aligns with v. The collection of all such vectors is the set of eigenvectors.

6.1 Eigenvalues and eigenvectors

Definition (Eigenvalues and eigenvectors): An eigenvalue-eigenvector pair (λi, vi) of a square matrix A satisfies

Avi = λivi (or equivalently, (A − λiI)vi = 0), where vi ≠ 0 (but λi can be zero), and the representative eigenvectors v1, v2, ... are taken to be linearly independent.

How to derive eigenvalues and eigenvectors?

• For a 3 × 3 matrix A, we can obtain 3 eigenvalues λ1, λ2, λ3 by solving

det(A − λI) = −λ³ + c2λ² + c1λ + c0 = 0.

• If λ1, λ2, λ3 are all unequal, then we continue to derive:

Av1 = λ1v1 ⇔ (A − λ1I)v1 = 0 ⇔ v1 = the only basis vector of the nullspace of (A − λ1I)
Av2 = λ2v2 ⇔ (A − λ2I)v2 = 0 ⇔ v2 = the only basis vector of the nullspace of (A − λ2I)
Av3 = λ3v3 ⇔ (A − λ3I)v3 = 0 ⇔ v3 = the only basis vector of the nullspace of (A − λ3I)

and the resulting v1, v2 and v3 are linearly independent of each other.

• If A is symmetric, then v1, v2 and v3 are orthogonal to each other.

• If, for example, λ1 = λ2 ≠ λ3, then we derive:

(A − λ1I)v1 = 0 ⇔ {v1, v2} = a basis of the nullspace of (A − λ1I)
(A − λ3I)v3 = 0 ⇔ v3 = the only basis vector of the nullspace of (A − λ3I)

and the resulting v1, v2 and v3 are linearly independent of each other when the nullspace of (A − λ1I) is two-dimensional.

• If A is symmetric, then v1, v2 and v3 are orthogonal to each other.

– Yet, it is possible that the nullspace of (A − λ1I) is one-dimensional; in that case we can only have v1 = v2.

– In such a case, we say the repeated eigenvalue λ1 only has one eigenvector. In other words, we say A only has two eigenvectors.

– If A is symmetric, then v1 and v3 are orthogonal to each other.

• If λ1 = λ2 = λ3, then we derive:

(A − λ1I)v1 = 0 ⇔ {v1, v2, v3} = the basis of the nullspace of (A − λ1I)

• When the nullspace of (A − λ1I) is three-dimensional, v1, v2 and v3 are linearly independent of each other.

• When the nullspace of (A − λ1I) is two-dimensional, we say A only has two eigenvectors.

• When the nullspace of (A − λ1I) is one-dimensional, we say A only has one eigenvector. In such a case, the "matrix-form eigensystem" becomes

A [v1 v1 v1] = [v1 v1 v1] [λ1 0 0; 0 λ1 0; 0 0 λ1]

• If A is symmetric, then distinct eigenvectors are orthogonal to each other.

Invariance of eigenvectors and eigenvalues

• Property 1: The eigenvectors stay the same for every power of A. The eigenvalues become the same power of the respective eigenvalues. I.e., Aⁿvi = λiⁿvi.

Avi = λivi ⇒ A²vi = A(λivi) = λiAvi = λi²vi

• Property 2: If the nullspace of An×n contains non-zero vectors, then 0 is an eigenvalue of A.

Accordingly, there exists a non-zero vi satisfying Avi = 0 · vi = 0.

• Property 3: Assume with no loss of generality |λ1| > |λ2| > ··· > |λk|. For any vector x = a1v1 + a2v2 + ··· + akvk that is a combination of all eigenvectors, the normalized mapping P = (1/λ1)A (namely, Pv1 = (1/λ1)Av1 = (1/λ1)λ1v1 = v1) converges to the eigenvector with the largest absolute eigenvalue when applied repeatedly. I.e.,

lim_{n→∞} Pⁿx = lim_{n→∞} (1/λ1ⁿ)Aⁿx = lim_{n→∞} (1/λ1ⁿ)(a1λ1ⁿv1 + a2λ2ⁿv2 + ··· + akλkⁿvk) = a1v1,

where a1v1 is the steady state and all remaining terms are transient states that vanish in the limit.
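Property 3 is the idea behind the power method for computing the dominant eigenvector numerically. A minimal MATLAB sketch (the matrix A and starting vector below are assumed examples, not from the slides):

    % Power iteration: repeatedly apply A and renormalize; the iterate
    % converges to the eigenvector with the largest |eigenvalue|.
    A = [2 1; 1 2];            % assumed example: eigenvalues 3 and 1
    x = [1; 0];                % any start with a nonzero v1-component
    for n = 1:50
        x = A*x / norm(A*x);   % renormalize to avoid overflow
    end
    lambda = x' * A * x;       % Rayleigh quotient: estimates lambda1
    disp(lambda)               % prints approximately 3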

6.1 How to determine the eigenvalues?

We wish to find a non-zero vector v satisfying Av = λv; then

(A − λI)v = 0 with v ≠ 0 ⇔ det(A − λI) = 0.

So by solving det(A − λI) = 0, we can obtain all the eigenvalues of A.

Example. Find the eigenvalues of A = [0.5 0.5; 0.5 0.5].

Solution.

det(A − λI) = det [0.5 − λ, 0.5; 0.5, 0.5 − λ] = (0.5 − λ)² − 0.5² = λ² − λ = λ(λ − 1) = 0
⇒ λ = 0 or 1. □

Proposition: A projection matrix (defined in Section 4.2) has eigenvalues either 1 or 0.

Proof:
• A projection matrix always satisfies P² = P. So P²v = Pv = λv.
• By the definition of eigenvalues and eigenvectors, we have P²v = λ²v.
• Hence, λv = λ²v for a non-zero vector v, which immediately implies λ = λ².
• Accordingly, λ is either 1 or 0. □

Proposition: A permutation matrix has eigenvalues satisfying λᵏ = 1 for some integer k.

Proof:
• A permutation matrix always satisfies Pᵏ⁺¹ = P for some integer k.
Example. P = [0 0 1; 1 0 0; 0 1 0]. Then Pv = [v3; v1; v2] and P³v = v. Hence, k = 3.
• Accordingly, λᵏ⁺¹v = λv, which gives λᵏ = 1 since an eigenvalue of P cannot be zero. □

Proposition: Matrix

αmAᵐ + αm−1Aᵐ⁻¹ + ··· + α1A + α0I has the same eigenvectors as A, but its eigenvalues become

αmλᵐ + αm−1λᵐ⁻¹ + ··· + α1λ + α0, where λ is the eigenvalue of A.

Proof:

• Let vi and λi be an eigenvector and eigenvalue of A. Then

(αmAᵐ + αm−1Aᵐ⁻¹ + ··· + α0I)vi = (αmλiᵐ + αm−1λiᵐ⁻¹ + ··· + α0)vi

• Hence, vi and αmλiᵐ + αm−1λiᵐ⁻¹ + ··· + α0 are an eigenvector and eigenvalue of the matrix above. □
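A quick numerical check of this proposition in MATLAB (the matrix and coefficients below are assumed for illustration): since the polynomial matrix shares the eigenvector matrix V of A, B*V should equal V times the same polynomial evaluated at D.

    A = [4 1; 2 3];                % assumed example; eigenvalues 5 and 2
    [V, D] = eig(A);
    B = 2*A^2 - 3*A + 7*eye(2);    % a polynomial in A
    pD = 2*D^2 - 3*D + 7*eye(2);   % the same polynomial in the eigenvalues
    disp(norm(B*V - V*pD))         % ~0: same eigenvectors, mapped eigenvalues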

Theorem (Cayley-Hamilton): For a square matrix A, define

f(λ) = det(A − λI) = λⁿ + αn−1λⁿ⁻¹ + ··· + α0.

(Suppose A has linearly independent eigenvectors.) Then

f(A) = Aⁿ + αn−1Aⁿ⁻¹ + ··· + α0I = the all-zero n × n matrix.

Proof: The eigenvalues of f(A) are all zeros and the eigenvectors of f(A) remain the same as those of A. By the definition of an eigensystem, we have

f(A) [v1 v2 ··· vn] = [v1 v2 ··· vn] [f(λ1) 0 ··· 0; 0 f(λ2) ··· 0; ...; 0 0 ··· f(λn)] = all-zero matrix. □

Corollary (Cayley-Hamilton): (Suppose A has linearly independent eigenvectors.)

(λ1I − A)(λ2I − A) ··· (λnI − A) = the all-zero matrix

Proof: f(λ) can be rewritten as f(λ) = (λ1 − λ)(λ2 − λ) ··· (λn − λ). □
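The theorem can be verified numerically: MATLAB's poly(A) returns the coefficients of the characteristic polynomial det(λI − A), and polyvalm evaluates a polynomial at a matrix argument (the matrix below is an assumed example).

    A = [1 2; 3 4];          % assumed example
    p = poly(A);             % coefficients of det(lambda*I - A)
    disp(polyvalm(p, A))     % evaluating at A gives (nearly) the zero matrix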

(Problem 11, Section 6.1) Here is a strange fact about 2 by 2 matrices with eigenvalues λ1 ≠ λ2: the columns of A − λ1I are multiples of the eigenvector x2. Any idea why this should be?

Hint: (λ1I − A)(λ2I − A) = [0 0; 0 0] implies

(λ1I − A)w1 = 0 and (λ1I − A)w2 = 0,

where (λ2I − A) = [w1 w2]. Hence,

the columns of (λ2I − A) give the eigenvectors of λ1 if they are non-zero vectors and

the columns of (λ1I − A) give the eigenvectors of λ2 if they are non-zero vectors.

So, the (non-zero) columns of A − λ1I are (multiples of) the eigenvector x2.

6.1 Why Gauss-Jordan cannot solve Ax = λx?

• The forward elimination may change the eigenvalues and eigenvectors?

Example. Check the eigenvalues and eigenvectors of A = [1 −2; 2 5].

Solution.
– The eigenvalues of A satisfy det(A − λI) = (λ − 3)² = 0.
– A = LU = [1 0; 2 1][1 −2; 0 9]. The eigenvalues of U apparently satisfy det(U − λI) = (1 − λ)(9 − λ) = 0.

– Suppose u1 and u2 are the eigenvectors of U, respectively corresponding to eigenvalues 1 and 9. Then they cannot be the eigenvectors of A, since if they were (both eigenvalues of A equal 3):

3u1 = Au1 = LUu1 = Lu1 = [1 0; 2 1]u1 ⇒ u1 = 0
3u2 = Au2 = LUu2 = 9Lu2 = 9[1 0; 2 1]u2 ⇒ u2 = 0.

Hence, the eigenvalues have nothing to do with the pivots (except for a triangular A).
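A numerical confirmation of this example (the LU factors are taken from the slide, so no pivoting is involved):

    A = [1 -2; 2 5];
    L = [1 0; 2 1];  U = [1 -2; 0 9];   % the factors from the slide
    disp(norm(A - L*U))                 % 0 : indeed A = LU
    disp(eig(A)')                       % 3 3 : repeated eigenvalue of A
    disp(eig(U)')                       % 1 9 : the pivots, not eigenvalues of A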

6.1 How to determine the eigenvalues? (Revisited)

Solve det(A − λI) = 0.

• det(A − λI) is a polynomial of order n.

f(λ) = det(A − λI) = det [a1,1 − λ, a1,2, a1,3, ···, a1,n; a2,1, a2,2 − λ, a2,3, ···, a2,n; a3,1, a3,2, a3,3 − λ, ···, a3,n; ...; an,1, an,2, an,3, ···, an,n − λ]

= (a1,1 − λ)(a2,2 − λ) ··· (an,n − λ) + terms of order at most (n − 2)   (by the Leibniz formula)

= (λ1 − λ)(λ2 − λ) ··· (λn − λ)

Observations
• The coefficient of λⁿ is (−1)ⁿ, provided n ≥ 1.
• The coefficient of λⁿ⁻¹ is (−1)ⁿ⁻¹(λ1 + ··· + λn) = (−1)ⁿ⁻¹(a1,1 + ··· + an,n) = (−1)ⁿ⁻¹ trace(A), provided n ≥ 2.
• ...
• The coefficient of λ⁰ is λ1λ2 ··· λn = f(0) = det(A).

These observations make it easy to find the eigenvalues of a 2 × 2 matrix.

Example. Find the eigenvalues of A = [1 1; 4 1].

Solution.
• λ1 + λ2 = 1 + 1 = 2 and λ1λ2 = 1 − 4 = −3
⇒ (λ1 − λ2)² = (λ1 + λ2)² − 4λ1λ2 = 16

⇒ λ1 − λ2 = 4 ⇒ λ = 3, −1. □

Example. Find the eigenvalues of A = [1 1; 2 2].

Solution.
• λ1 + λ2 = 3 and λ1λ2 = 0 ⇒ λ = 3, 0. □
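These trace/determinant observations can be checked numerically for the example above:

    A = [1 1; 2 2];
    disp(trace(A))    % 3 = lambda1 + lambda2
    disp(det(A))      % 0 = lambda1 * lambda2
    disp(eig(A)')     % 0 3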

6.1 Imaginary eigenvalues

In some cases, we have to allow imaginary eigenvalues.

• In order to solve polynomial equations f(λ) = 0, mathematicians were forced to imagine that there exists a number x satisfying x² = −1.

• By this technique, a polynomial equation of order n has exactly n (possibly complex, not real) solutions.

Example. Solve λ² + 1 = 0. ⇒ λ = ±i. □

• Based on this, to solve for the eigenvalues, we are forced to accept imaginary eigenvalues.

Example. Find the eigenvalues of A = [0 1; −1 0].
Solution. det(A − λI) = λ² + 1 = 0 ⇒ λ = ±i. □

Proposition: The eigenvalues of a symmetric matrix A (with real entries) are real, and the eigenvalues of a skew-symmetric (or antisymmetric) matrix B are pure imaginary.

Proof:
• Suppose Av = λv. Then,
Av* = λ*v* (A real)
⇒ (Av*)ᵀv = λ*(v*)ᵀv (equivalently, v · Av* = v · λ*v*)
⇒ (v*)ᵀAᵀv = λ*(v*)ᵀv
⇒ (v*)ᵀAv = λ*(v*)ᵀv (symmetry means Aᵀ = A)
⇒ (v*)ᵀλv = λ*(v*)ᵀv (Av = λv)
⇒ λ‖v‖² = λ*‖v‖² (an eigenvector must be non-zero, i.e., ‖v‖² ≠ 0)
⇒ λ = λ* ⇒ λ real

• Suppose Bv = λv. Then,
Bv* = λ*v* (B real)
⇒ (Bv*)ᵀv = (λ*v*)ᵀv
⇒ (v*)ᵀBᵀv = λ*(v*)ᵀv
⇒ (v*)ᵀ(−B)v = λ*(v*)ᵀv (skew-symmetry means Bᵀ = −B)
⇒ (v*)ᵀ(−λ)v = λ*(v*)ᵀv (Bv = λv)
⇒ (−λ)‖v‖² = λ*‖v‖² (an eigenvector must be non-zero, i.e., ‖v‖² ≠ 0)
⇒ −λ = λ* ⇒ λ pure imaginary □

6.1 Eigenvalues/eigenvectors of inverse

For an invertible A, the relation between the eigenvalues and eigenvectors of A and A⁻¹ can be well-determined.

Proposition: The eigenvalues and eigenvectors of A⁻¹ are

(1/λ1, v1), (1/λ2, v2), ..., (1/λn, vn),

where {(λi, vi)}ⁿᵢ₌₁ are the eigenvalues and eigenvectors of A.

Proof:
• The eigenvalues of an invertible A must be non-zero because det(A) = λ1λ2 ··· λn ≠ 0.
• Suppose Av = λv, where v ≠ 0 and λ ≠ 0. (I.e., λ and v are an eigenvalue and eigenvector of A.)

• So, Av = λv ⇒ A⁻¹(Av) = A⁻¹(λv) ⇒ v = λA⁻¹v ⇒ (1/λ)v = A⁻¹v. □

Note: The eigenvalues of Aᵏ and A⁻¹ are respectively λᵏ and λ⁻¹ with the same eigenvectors as A, where λ is an eigenvalue of A.
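A short numerical illustration (the matrix below is an assumed invertible example): the same eigenvector matrix diagonalizes both A and A⁻¹, with reciprocal eigenvalues.

    A = [4 1; 2 3];                  % assumed example; eigenvalues 2 and 5
    [V, D] = eig(A);
    disp(norm(A*V - V*D))            % ~0
    disp(norm(inv(A)*V - V*inv(D)))  % ~0: same V, eigenvalues 1/2 and 1/5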

Proposition: The eigenvalues of Aᵀ are the same as the eigenvalues of A. (But they may have different eigenvectors.)

Proof: det(A − λI) = det((A − λI)ᵀ) = det(Aᵀ − λIᵀ) = det(Aᵀ − λI). □

Corollary: The eigenvalues of an invertible (real-valued) matrix A satisfying A⁻¹ = Aᵀ are on the unit circle of the complex plane.

Proof: Suppose Av = λv. Then,
Av* = λ*v* (A real)
⇒ (Av*)ᵀv = λ*(v*)ᵀv
⇒ (v*)ᵀAᵀv = λ*(v*)ᵀv
⇒ (v*)ᵀA⁻¹v = λ*(v*)ᵀv (Aᵀ = A⁻¹)
⇒ (v*)ᵀ(1/λ)v = λ*(v*)ᵀv (A⁻¹v = (1/λ)v)
⇒ (1/λ)‖v‖² = λ*‖v‖² (an eigenvector must be non-zero, i.e., ‖v‖² ≠ 0)
⇒ λλ* = |λ|² = 1 □

Example. Find the eigenvalues of A = [0 1; −1 0], which satisfies Aᵀ = A⁻¹.
Solution. det(A − λI) = λ² + 1 = 0 ⇒ λ = ±i. □

6.1 Determination of eigenvectors

• After the identification of eigenvalues via det(A−λI) = 0, we can determine the respective eigenvectors using the nullspace technique.

• Recall (from Slide 3-42) how we completely solve

Bn×nvn×1 =(An×n − λIn×n)vn×1 = 0n×1.

Answer:

R = rref(B) = [Ir×r, Fr×(n−r); 0(n−r)×r, 0(n−r)×(n−r)]   (with no row exchange)

⇒ Nn×(n−r) = [−Fr×(n−r); I(n−r)×(n−r)]

Then, every solution v of Bv = 0 is of the form v(null)n×1 = Nn×(n−r)w(n−r)×1 for any w ∈ ℝⁿ⁻ʳ.

Here, we usually take the (n − r) orthonormal basis vectors of the nullspace as the representative eigenvectors, which are exactly the (n − r) columns of Nn×(n−r) (with proper normalization and orthogonalization).
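In MATLAB, the nullspace technique is a single call to null, which returns orthonormal basis columns (here applied to the projection matrix of Problem 12 below):

    P = [0.2 0.4 0; 0.4 0.8 0; 0 0 1];
    null(P - eye(3))   % two orthonormal eigenvectors for lambda = 1
    null(P)            % one (normalized) eigenvector for lambda = 0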

(Problem 12, Section 6.1) Find three eigenvectors for this matrix P (projection matrices have λ = 1 and 0):

Projection matrix P = [0.2 0.4 0; 0.4 0.8 0; 0 0 1].

If two eigenvectors share the same λ, so do all their linear combinations. Find an eigenvector of P with no zero components.

Solution.
• det(P − λI) = 0 gives λ(1 − λ)² = 0; so λ = 0, 1, 1.
• λ = 1: rref(P − I) = [1 −1/2 0; 0 0 0; 0 0 0];
so F1×2 = [−1/2, 0], and N3×2 = [1/2 0; 1 0; 0 1], which implies eigenvectors [1/2; 1; 0] and [0; 0; 1].
• λ = 0: rref(P) = [1 2 0; 0 0 1; 0 0 0] (with no row exchange);
the free variable is x2, and the special solution N3×1 = [−2; 1; 0] gives the eigenvector [−2; 1; 0]. □
An eigenvector with no zero components is any combination of the two λ = 1 eigenvectors, e.g., [1/2; 1; 0] + [0; 0; 1] = [1/2; 1; 1].

6.1 MATLAB revisited

In MATLAB:

• We can find the eigenvalues and eigenvectors of a matrix A by:
[V, D] = eig(A);  % find the eigenvalues/eigenvectors of A
• The columns of V are the eigenvectors.
• The diagonals of D are the eigenvalues.
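For instance, using the example matrix from Slide 6-8 (a minimal sketch):

    A = [0.5 0.5; 0.5 0.5];
    [V, D] = eig(A);
    disp(diag(D)')          % 0 1 : the eigenvalues
    disp(norm(A*V - V*D))   % ~0 : A*v_i = lambda_i*v_i, column by column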

Proposition:
• The eigenvectors corresponding to non-zero eigenvalues are in C(A).
• The eigenvectors corresponding to zero eigenvalues are in N(A).

Proof: The first follows from Av = λv, i.e., v = A(v/λ) ∈ C(A) when λ ≠ 0; the second follows from Av = 0. □

6.1 Some discussions on problems

(Problem 25, Section 6.1) Suppose A and B have the same eigenvalues λ1, ..., λn with the same independent eigenvectors x1, ..., xn. Then A = B. Reason: Any vector x is a combination c1x1 + ··· + cnxn. What is Ax? What is Bx?

Thinking over Problem 25: Suppose A and B have the same eigenvalues and eigenvectors (not necessarily independent). Can we claim that A = B?

Answer to the thinking: Not necessarily.

As a counterexample, both A = [1 −1; 1 −1] and B = [2 −2; 2 −2] have eigenvalues 0, 0 and the single eigenvector [1; 1], but they are not equal.

If however the eigenvectors span the n-dimensional space (such as when there are n of them and they are linearly independent), then A = B.

Hint for Problem 25: A [x1 ··· xn] = [x1 ··· xn] [λ1 0 ··· 0; 0 λ2 ··· 0; ...; 0 0 ··· λn].

This important fact will be re-emphasized in Section 6.2.

(Problem 26, Section 6.1) The block B has eigenvalues 1, 2 and C has eigenvalues 3, 4 and D has eigenvalues 5, 7. Find the eigenvalues of the 4 by 4 matrix A:

A = [B C; 0 D] = [0 1 3 0; −2 3 0 4; 0 0 6 1; 0 0 1 6].

Thinking over Problem 26: The eigenvalues of [B C; 0 D] are the eigenvalues of B and D, because we can show

det [B C; 0 D] = det(B) · det(D).

(See Problems 23 and 25 in Section 5.2.)

(Problem 23, Section 5.2) With 2 by 2 blocks in 4 by 4 matrices, you cannot always use block determinants:

det [A B; 0 D] = |A||D|  but  det [A B; C D] ≠ |A||D| − |C||B|.

(a) Why is the first statement true? Somehow B doesn't enter. Hint: Leibniz formula.
(b) Show by example that equality fails (as shown) when C enters.
(c) Show by example that the answer det(AD − CB) is also wrong.

(Problem 25, Section 5.2) Block elimination subtracts CA⁻¹ times the first row [A B] from the second row [C D]. This leaves the Schur complement D − CA⁻¹B in the corner:

[I 0; −CA⁻¹ I][A B; C D] = [A B; 0 D − CA⁻¹B].

Take determinants of these block matrices to prove correct rules if A⁻¹ exists:

det [A B; C D] = |A||D − CA⁻¹B| = |AD − CB| provided AC = CA.

Hint: det(A) · det(D − CA⁻¹B) = det(A(D − CA⁻¹B)).

(Problem 37, Section 6.1) (a) Find the eigenvalues and eigenvectors of A. They depend on c:

A = [.4, 1 − c; .6, c].

(b) Show that A has just one line of eigenvectors when c = 1.6. (c) This is a Markov matrix when c = .8. Then Aⁿ will approach what matrix A∞?

Definition (Markov matrix): A Markov matrix is a matrix with positive entries, for which every column adds to one.

• Note that some researchers define the Markov matrix by replacing positive with non-negative; thus, they will use the term positive Markov matrix. The observations below hold for Markov matrices with "non-negative entries."

• 1 must be one of the eigenvalues of a Markov matrix.

Proof: A and Aᵀ have the same eigenvalues, and Aᵀ[1; 1; ...; 1] = [1; 1; ...; 1]. □

• The eigenvalues of a Markov matrix satisfy |λ| ≤ 1.

Proof: Aᵀv = λv implies Σᵢ₌₁ⁿ ai,j vi = λvj; hence, by letting k satisfy |vk| = max₁≤i≤n |vi|, we have

|λvk| = |Σᵢ₌₁ⁿ ai,k vi| ≤ Σᵢ₌₁ⁿ ai,k |vi| ≤ Σᵢ₌₁ⁿ ai,k |vk| = |vk|,

which implies the desired result. □

(Problem 31, Section 6.1) If we exchange rows 1 and 2 and columns 1 and 2, the eigenvalues don't change. Find eigenvectors of A and B for λ1 = 11. Rank one gives λ2 = λ3 = 0.

A = [1 2 1; 3 6 3; 4 8 4] and B = PAPᵀ = [6 3 3; 2 1 1; 8 4 4].

Thinking over Problem 31: This is sometimes useful in determining the eigenvalues and eigenvectors.

• If Av = λv and B = PAP⁻¹ (for any invertible P), then λ and u = Pv are an eigenvalue and eigenvector of B.

Proof: Bu = PAP⁻¹u = PAv = P(λv) = λPv = λu. □

For example, P = [0 1 0; 1 0 0; 0 0 1]. In such a case, P⁻¹ = P = Pᵀ.

With the fact above, we can further claim:

Proposition. The eigenvalues of AB and BA are equal if one of A and B is invertible.

Proof: This can be proved by (AB) = A(BA)A⁻¹ (or (AB) = B⁻¹(BA)B). Note that AB has eigenvector Av (or B⁻¹v) if v is an eigenvector of BA. □

The eigenvectors of a diagonal matrix Λ = [λ1 0 ··· 0; 0 λ2 ··· 0; ...; 0 0 ··· λn] are apparently

ei = [0, ..., 0, 1, 0, ..., 0]ᵀ (1 in the i-th position), i = 1, 2, ..., n.

Hence, A = SΛS⁻¹ has the same eigenvalues as Λ and has eigenvectors vi = Sei. This implies that

S = [v1 v2 ··· vn],

where {vi} are the eigenvectors of A. What if S is not invertible? Then we have the next theorem.

6.2 Diagonalizing a matrix

A convenience for eigensystem analysis is that we can diagonalize a matrix.

Theorem. A matrix A can be written as

A [v1 v2 ··· vn] = [v1 v2 ··· vn] [λ1 0 ··· 0; 0 λ2 ··· 0; ...; 0 0 ··· λn],

i.e., AS = SΛ with S = [v1 v2 ··· vn] and Λ = diag(λ1, ..., λn), where {(λi, vi)}ⁿᵢ₌₁ are the eigenvalue-eigenvector pairs of A.

Proof: The theorem holds by the definition of eigenvalues and eigenvectors. □

Corollary. If S is invertible, then S⁻¹AS = Λ, equivalently A = SΛS⁻¹.

This makes it easy to compute a polynomial with matrix argument A.

Proposition (recall Slide 6-9): Matrix

αmAᵐ + αm−1Aᵐ⁻¹ + ··· + α0I has the same eigenvectors as A, but its eigenvalues become

αmλᵐ + αm−1λᵐ⁻¹ + ··· + α0, where λ is the eigenvalue of A.

Proposition: αmAᵐ + αm−1Aᵐ⁻¹ + ··· + α0I = S(αmΛᵐ + αm−1Λᵐ⁻¹ + ··· + α0I)S⁻¹

Proposition: Aᵐ = SΛᵐS⁻¹
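A numerical sanity check of Aᵐ = SΛᵐS⁻¹ (the matrix below is an assumed diagonalizable example):

    A = [4 1; 2 3];                  % assumed example
    [S, Lambda] = eig(A);
    m = 10;
    disp(norm(A^m - S*Lambda^m/S))   % ~0 (S*Lambda^m/S = S*Lambda^m*inv(S))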

Exception

• It is possible that an n × n matrix does not have n eigenvectors.

Example. A = [1 −1; 1 −1].

Solution.

– λ1 = 0 is apparently an eigenvalue for a non-invertible matrix.
– The second eigenvalue is λ2 = trace(A) − λ1 = [1 + (−1)] − 0 = 0.
– This matrix however only has one eigenvector [1; 1] (or some researchers may say "two repeated eigenvectors").
– In such a case, we still have

[1 −1; 1 −1][1 1; 1 1] = [1 1; 1 1][0 0; 0 0]   (A·S = S·Λ)

but S has no inverse. □

In such a case, we cannot perform S⁻¹AS; so we say A cannot be diagonalized.

When will S be guaranteed to be invertible? One easy answer: When all eigenvalues are distinct (and there are n eigenvalues).

Proof: Suppose that for a matrix A, the first k eigenvectors v1, ..., vk are linearly independent, but the (k + 1)th eigenvector is dependent on the previous k eigenvectors. Then, for some unique a1, ..., ak,

vk+1 = a1v1 + ··· + akvk, which implies

λk+1vk+1 = Avk+1 = a1Av1 + ··· + akAvk = a1λ1v1 + ··· + akλkvk

λk+1vk+1 = a1λk+1v1 + ··· + akλk+1vk

Accordingly, λk+1 = λi for every i with ai ≠ 0; i.e., vk+1 can be linearly dependent on v1, ..., vk only when the eigenvalues involved coincide. (The proof is not yet complete here! See the discussion and example below.)

The above proof says that as long as we have the (k + 1)th eigenvector, it is linearly dependent on the previous k eigenvectors only when the eigenvalues involved are all equal! But sometimes we are not guaranteed to have the (k + 1)th eigenvector.

Example. A = [5 4 2 1; 0 1 −1 −1; −1 −1 3 0; 1 1 −1 2].

Solution.
• We have four eigenvalues 1, 2, 4, 4.
• But we can only have three eigenvectors, for 1, 2 and 4.
• From the proof on the previous page, the eigenvectors for 1, 2 and 4 are linearly independent because 1, 2 and 4 are distinct. □

Proof (continued): One condition that guarantees n eigenvectors is having n distinct eigenvalues, which now completes the proof. □

Definition (GM and AM): The number of linearly independent eigenvectors for an eigenvalue λ is called its geometric multiplicity; the number of its appearances in solving det(A − λI) = 0 is called its algebraic multiplicity.

Example. In the above example, GM = 1 and AM = 2 for λ = 4.

6.2 Fibonacci series

Eigen-decomposition is useful in solving the Fibonacci series.

Problem: Suppose Fk+2 = Fk+1 + Fk with initially F1 = 1 and F0 = 0. Find F100.

Answer:

• [Fk+2; Fk+1] = [1 1; 1 0][Fk+1; Fk] ⇒ [Fk+2; Fk+1] = [1 1; 1 0]ᵏ⁺¹[F1; F0]

• So, with A = [1 1; 1 0] = SΛS⁻¹, where λ1 = (1+√5)/2, λ2 = (1−√5)/2 and S = [λ1 λ2; 1 1],

[F100; F99] = A⁹⁹[1; 0] = SΛ⁹⁹S⁻¹[1; 0],

which gives

F100 = (λ1¹⁰⁰ − λ2¹⁰⁰)/(λ1 − λ2) = (1/√5)[((1+√5)/2)¹⁰⁰ − ((1−√5)/2)¹⁰⁰].

6.2 Generalization to uk = Aᵏu0

• A generalization of the solution to the Fibonacci series is as follows.

Suppose u0 = a1v1 + a2v2 + ··· + anvn. Then

uk = Aᵏu0 = a1Aᵏv1 + a2Aᵏv2 + ··· + anAᵏvn = a1λ1ᵏv1 + a2λ2ᵏv2 + ··· + anλnᵏvn.

Examination:

uk = [Fk+1; Fk] = [1 1; 1 0]ᵏ[F1; F0] = Aᵏu0

and

u0 = [1; 0] = (1/√5)[(1+√5)/2; 1] − (1/√5)[(1−√5)/2; 1]   (a1 = 1/√5, a2 = −1/√5)

So

u99 = [F100; F99] = (1/√5)((1+√5)/2)⁹⁹[(1+√5)/2; 1] − (1/√5)((1−√5)/2)⁹⁹[(1−√5)/2; 1].
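The same computation in MATLAB (note that F100 ≈ 3.54 × 10²⁰ exceeds the integers exactly representable in double precision, so the result is accurate only to about 16 digits):

    A = [1 1; 1 0];
    [S, L] = eig(A);
    u = S * L^99 / S * [1; 0];   % u = A^99 * [F1; F0] = [F100; F99]
    disp(u(1))                   % ~3.5422e+20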

6.2 More applications of eigen-decomposition

(Problem 30, Section 6.2) Suppose the same S diagonalizes both A and B. They have the same eigenvectors in A = SΛ1S⁻¹ and B = SΛ2S⁻¹. Prove that AB = BA.

Proposition: Suppose both A and B can be diagonalized, and one of A and B has distinct eigenvalues. Then A and B have the same eigenvectors if, and only if, AB = BA.

Proof:
• (See Problem 30 above.) If A and B have the same eigenvectors, then

AB = (SΛAS⁻¹)(SΛBS⁻¹) = SΛAΛBS⁻¹ = SΛBΛAS⁻¹ = (SΛBS⁻¹)(SΛAS⁻¹) = BA.

• Suppose without loss of generality that the eigenvalues of A are all distinct.

Then, if AB = BA, we have, for given Av = λv and u = Bv,

Au = A(Bv)=(AB)v =(BA)v = B(Av)=B(λv)=λ(Bv)=λu.

Hence, u = Bv and v are both eigenvectors of A corresponding to the same eigenvalue λ.

Let u = a1v1 + ··· + anvn, where {vi}ⁿᵢ₌₁ are linearly independent eigenvectors of A. Then

Au = Σᵢ aiAvi = Σᵢ aiλivi  and  Au = λu = Σᵢ aiλvi  ⇒  ai(λi − λ) = 0 for 1 ≤ i ≤ n.

This implies u = Σ_{i: λi=λ} aivi = av = Bv (only one term survives since the eigenvalues of A are distinct).

Thus, v is an eigenvector of B. □

6.2 Problem discussions

(Problem 32, Section 6.2) Substitute A = SΛS⁻¹ into the product (A − λ1I)(A − λ2I) ··· (A − λnI) and explain why this produces the zero matrix. We are substituting the matrix A for the number λ in the polynomial p(λ) = det(A − λI). The Cayley-Hamilton Theorem says that this product is always p(A) = zero matrix, even if A is not diagonalizable.

Thinking over Problem 32: The Cayley-Hamilton Theorem can be easily proved if A is diagonalizable.

Corollary (Cayley-Hamilton): Suppose A has linearly independent eigenvectors. Then

(λ1I − A)(λ2I − A) ··· (λnI − A) = the all-zero matrix

Proof:

(λ1I − A)(λ2I − A) ··· (λnI − A)
= S(λ1I − Λ)S⁻¹ S(λ2I − Λ)S⁻¹ ··· S(λnI − Λ)S⁻¹
= S(λ1I − Λ)(λ2I − Λ) ··· (λnI − Λ)S⁻¹
  (the (1,1)th entry of the first factor is 0, the (2,2)th entry of the second is 0, ..., the (n,n)th entry of the last is 0)
= S (all-zero matrix) S⁻¹ = all-zero matrix □

6.3 Applications to differential equations

We can use eigen-decomposition to solve

du/dt = d/dt [u1(t); u2(t); ...; un(t)] = Au = [a1,1u1(t) + a1,2u2(t) + ··· + a1,nun(t); a2,1u1(t) + a2,2u2(t) + ··· + a2,nun(t); ...; an,1u1(t) + an,2u2(t) + ··· + an,nun(t)],

where A is called the state matrix. By the differential equation technique, we know that the solution is of the form u = e^{λt}v for some constant λ and constant vector v.

Question: What are all λ and v that satisfy du/dt = Au?

Solution: Take u = e^{λt}v into the equation: du/dt = λe^{λt}v = Au = Ae^{λt}v ⇒ λv = Av.

The answers are exactly the eigenvalues and eigenvectors of A.

Since du/dt = Au is a linear system, a linear combination of solutions is still a solution.

Hence, if there are n eigenvectors, the complete solution can be represented as

c1e^{λ1t}v1 + c2e^{λ2t}v2 + ··· + cne^{λnt}vn, where c1, c2, ..., cn are determined by the initial conditions.

Here, for convenience of discussion at this moment, we assume that A gives us exactly n eigenvectors.

It is sometimes convenient to re-express the solution of du/dt = Au for a given initial condition u(0) as e^{At}u(0), i.e.,

c1e^{λ1t}v1 + c2e^{λ2t}v2 + ··· + cne^{λnt}vn = [v1 v2 ··· vn] [e^{λ1t} 0 ··· 0; 0 e^{λ2t} ··· 0; ...; 0 0 ··· e^{λnt}] [c1; c2; ...; cn] = Se^{Λt}c = Se^{Λt}S⁻¹Sc = e^{At}u(0),

where we define

e^{At} = Se^{Λt}S⁻¹, if S⁻¹ exists;
e^{At} = Σ_{k=0}^∞ (1/k!)(At)ᵏ, which holds no matter whether S⁻¹ exists or not,

Key to remember: Again, we define by convention that   f(λ )0··· 0  1   0 f(λ ) ··· 0  f(A)=S  2  S−1.  ......  00··· f(λn)   eλ1t 0 ··· 0    0 eλ2t ··· 0  So, eAt = S   S−1.  ......  00··· eλnt 6.3 Applications to differential equations 6-48

We can solve a second-order equation in the same manner.

Example. Solve my″ + by′ + ky = 0.

Answer:
• Let z = y′. Then the problem is reduced to

y′ = z, z′ = −(k/m)y − (b/m)z  ⇒  du/dt = Au

with u = [y; z] and A = [0 1; −k/m −b/m].

• The complete solution for u is therefore

c1e^{λ1t}v1 + c2e^{λ2t}v2. □

Example. Solve y″ + y = 0 with initial y(0) = 1 and y′(0) = 0.

Solution.
• Let z = y′. Then the problem is reduced to

y′ = z, z′ = −y  ⇒  du/dt = Au with u = [y; z] and A = [0 1; −1 0].

• The eigenvalues and eigenvectors of A are (i, [−i; 1]) and (−i, [i; 1]).
• The complete solution for u is therefore

[y; y′] = c1e^{it}[−i; 1] + c2e^{−it}[i; 1] with initially [1; 0] = c1[−i; 1] + c2[i; 1].

So c1 = i/2 and c2 = −i/2. Thus,

y(t) = (i/2)e^{it}(−i) − (i/2)e^{−it}(i) = (1/2)e^{it} + (1/2)e^{−it} = cos(t). □

In practice, we will use a discrete approximation to approximate a continuous system. There are however three different discrete approximations. For example, how to approximate y″(t) = −y(t) by a discrete system?

Approximate y″ by the second difference

(Yn+1 − 2Yn + Yn−1)/(∆t)² = ((Yn+1 − Yn)/∆t − (Yn − Yn−1)/∆t)/∆t

and set it equal to −Yn−1 (forward approximation), −Yn (centered approximation), or −Yn+1 (backward approximation).

Let's take the forward approximation as an example, i.e.,

((Yn+1 − Yn)/∆t − (Yn − Yn−1)/∆t)/∆t = (Zn − Zn−1)/∆t = −Yn−1.

Thus,

y′(t) = z(t), z′(t) = −y(t)  ≈  (Yn+1 − Yn)/∆t = Zn, (Zn+1 − Zn)/∆t = −Yn  ≡  [Yn+1; Zn+1] = [1 ∆t; −∆t 1][Yn; Zn], i.e., un+1 = Aun.

Then, we obtain

un = Aⁿu0 = [1 1; i −i] [(1 + i∆t)ⁿ 0; 0 (1 − i∆t)ⁿ] [1/2 −i/2; 1/2 i/2] [1; 0]

and

Yn = ((1 + i∆t)ⁿ + (1 − i∆t)ⁿ)/2 = (1 + (∆t)²)^{n/2} cos(n · ∆θ), where ∆θ = tan⁻¹(∆t).

Problem: |Yn| → ∞ as n grows large.

The backward approximation will instead lead to Yn → 0 as n grows large.

y′(t) = z(t), z′(t) = −y(t)  ≈  (Yn+1 − Yn)/∆t = Zn+1, (Zn+1 − Zn)/∆t = −Yn+1  ≡  [1 −∆t; ∆t 1][Yn+1; Zn+1] = [Yn; Zn]   (i.e., Ã⁻¹un+1 = un, so un+1 = Ãun)

A solution to the problem: Interleave the forward approximation with the backward approximation.

y′(t) = z(t), z′(t) = −y(t)  ≈  (Yn+1 − Yn)/∆t = Zn, (Zn+1 − Zn)/∆t = −Yn+1  ≡  [1 0; ∆t 1][Yn+1; Zn+1] = [1 ∆t; 0 1][Yn; Zn]

We then perform this so-called leapfrog method. (See Problems 28 and 29.)

[Yn+1; Zn+1] = [1 0; ∆t 1]⁻¹[1 ∆t; 0 1][Yn; Zn] = [1, ∆t; −∆t, 1 − (∆t)²][Yn; Zn]

(the eigenvalues of this matrix have |λ| = 1 if ∆t ≤ 2).
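A minimal MATLAB sketch of this leapfrog scheme applied to y″ = −y with y(0) = 1, y′(0) = 0 (the step size is an assumed choice):

    dt = 0.1;  N = round(2*pi/dt);   % about one full period
    Y = 1;  Z = 0;
    for n = 1:N
        Y = Y + dt*Z;                % forward step for Y
        Z = Z - dt*Y;                % backward step for Z (uses the new Y)
    end
    disp([Y Z])                      % close to [1 0]: the orbit does not blow up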

(Problem 28, Section 6.3) Centering y″ = −y in Example 3 will produce Yn+1 − 2Yn + Yn−1 = −(∆t)²Yn. This can be written as a one-step difference equation for U = (Y, Z):

Yn+1 = Yn + ∆tZn, Zn+1 = Zn − ∆tYn+1:   [1 0; ∆t 1][Yn+1; Zn+1] = [1 ∆t; 0 1][Yn; Zn].

Invert the matrix on the left side to write this as Un+1 = AUn. Show that det A = 1. Choose the large time step ∆t = 1 and find the eigenvalues λ1 and λ2 = λ̄1 of A:

A = [1 1; −1 0] has |λ1| = |λ2| = 1. Show that A⁶ is exactly I.

After 6 steps to t = 6, U6 equals U0. The exact y = cos(t) returns to 1 at t = 2π.

(Problem 29, Section 6.3) That centered choice (leapfrog method) in Problem 28 is very successful for small time steps ∆t. But find the eigenvalues of A for ∆t = √2 and 2:

A = [1 √2; −√2 −1]  and  A = [1 2; −2 −3].

Both matrices have |λ| = 1. Compute A⁴ in both cases and find the eigenvectors of A. That value ∆t = 2 is at the border of instability. Time steps ∆t > 2 will lead to |λ| > 1, and the powers in Un = AⁿU0 will explode.

Note: You might say that nobody would compute with ∆t > 2. But if an atom vibrates with y″ = −1000000y, then ∆t > .0002 will give instability. Leapfrog has a very strict stability limit. Yn+1 = Yn + 3Zn and Zn+1 = Zn − 3Yn+1 will explode because ∆t = 3 is too large.

A better solution to the problem: Mix the forward approximation with the backward approximation.

y′(t) = z(t), z′(t) = −y(t)  ≈  (Yn+1 − Yn)/∆t = (Zn+1 + Zn)/2, (Zn+1 − Zn)/∆t = −(Yn+1 + Yn)/2  ≡  [1 −∆t/2; ∆t/2 1][Yn+1; Zn+1] = [1 ∆t/2; −∆t/2 1][Yn; Zn]

We then perform this so-called trapezoidal method. (See Problem 30.)

[Yn+1; Zn+1] = [1 −∆t/2; ∆t/2 1]⁻¹[1 ∆t/2; −∆t/2 1][Yn; Zn] = (1/(1 + (∆t/2)²)) [1 − (∆t/2)², ∆t; −∆t, 1 − (∆t/2)²][Yn; Zn]

(the eigenvalues have |λ| = 1 for all ∆t > 0).
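The trapezoidal step matrix can be formed and tested in a few lines (∆t below is an assumed value; any ∆t > 0 behaves the same way):

    dt = 0.5;  B = [0 1; -1 0];
    Astep = (eye(2) - dt/2*B) \ (eye(2) + dt/2*B);
    disp(abs(eig(Astep))')             % 1 1 : |eigenvalues| = 1
    disp(norm(Astep'*Astep - eye(2)))  % ~0 : the step matrix is orthogonal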

(Problem 30, Section 6.3) Another good idea for y″ = −y is the trapezoidal method (half forward/half back): This may be the best way to keep (Yn, Zn) exactly on a circle.

Trapezoidal: [1 −∆t/2; ∆t/2 1][Yn+1; Zn+1] = [1 ∆t/2; −∆t/2 1][Yn; Zn].

(a) Invert the left matrix to write this equation as Un+1 = AUn. Show that A is an orthogonal matrix: AᵀA = I. These points Un never leave the circle. A = (I − B)⁻¹(I + B) is always an orthogonal matrix if Bᵀ = −B (see the proof on the next page).

(b) (Optional MATLAB) Take 32 steps from U0 = (1, 0) to U32 with ∆t = 2π/32. Is U32 = U0? I think there is a small error.

Proof: A = (I − B)⁻¹(I + B)
⇒ (I − B)A = (I + B)
⇒ Aᵀ(I − B)ᵀ = (I + B)ᵀ
⇒ Aᵀ(I − Bᵀ) = (I + Bᵀ)
⇒ Aᵀ(I + B) = (I − B)   (using Bᵀ = −B)
⇒ Aᵀ = (I − B)(I + B)⁻¹

⇒ AᵀA = (I − B)(I + B)⁻¹(I − B)⁻¹(I + B)
= (I − B)[(I − B)(I + B)]⁻¹(I + B)
= (I − B)[(I + B)(I − B)]⁻¹(I + B)   (since (I − B)(I + B) = I − B² = (I + B)(I − B))
= (I − B)(I − B)⁻¹(I + B)⁻¹(I + B) = I □

Question: What if the number of eigenvectors is smaller than n?

Recall that it is convenient to say that the solution of du/dt = Au is e^{At}u(0) for some constant vector u(0). Conveniently, we can presume that e^{At}u(0) is the solution of du/dt = Au even if A does not have n eigenvectors.

Example. Solve y″ − 2y′ + y = 0 with initially y(0) = 1 and y′(0) = 0.

Answer.
• Let z = y′. Then the problem is reduced to

du/dt = d/dt [y; z] = [0 1; −1 2][y; z] = Au   (z′ = −y + 2z)

• The solution is still e^{At}u(0), but e^{At} ≠ Se^{Λt}S⁻¹ because there is only one eigenvector for A (it doesn't matter whether we regard this case as "S does not exist" or as "S exists but has no inverse").

Here A = [0 1; −1 2], S = [1 1; 1 1], Λ = [1 0; 0 1] (AS = SΛ), and e^{At} = e^{λIt}e^{(A−λI)t} = e^{It}e^{(A−I)t}:

e^{At} = e^{It+(A−I)t} =? e^{It}e^{(A−I)t}   (valid since (It)((A − I)t) = ((A − I)t)(It); see the note below)
= (Σ_{k=0}^∞ (1/k!)Iᵏtᵏ)(Σ_{k=0}^∞ (1/k!)(A − I)ᵏtᵏ)

(Can we use e^{At} = Σ_{k=0}^∞ (1/k!)Aᵏtᵏ directly? Yes, by definition; the factored form is simply easier to evaluate.)

where Iᵏ = I for every k ≥ 0 and (A − I)ᵏ = the all-zero matrix for k ≥ 2. This gives

e^{At} = eᵗI(I + (A − I)t) = eᵗ(I + (A − I)t)

and

u(t) = e^{At}u(0) = e^{At}[1; 0] = eᵗ(I + (A − I)t)[1; 0] = eᵗ[1 − t; −t]. □

Note that eᴬeᴮ, eᴮeᴬ and e^{A+B} may not be equal unless AB = BA!
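This hand computation can be cross-checked against MATLAB's expm, which does not require diagonalizability:

    A = [0 1; -1 2];  t = 2;                  % defective: only one eigenvector
    E = exp(t) * (eye(2) + (A - eye(2))*t);   % the formula derived above
    disp(norm(expm(A*t) - E))                 % ~0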

Properties of e^{At}

• The eigenvalues of e^{At} are e^{λt} for all eigenvalues λ of A.
For example, e^{At} = eᵗ(I + (A − I)t) has repeated eigenvalues eᵗ, eᵗ.
• The eigenvectors of e^{At} remain the same as those of A.
For example, e^{At} = eᵗ(I + (A − I)t) has only one eigenvector [1; 1].
• The inverse of e^{At} always exists (hint: e^{λt} ≠ 0), and is equal to e^{−At}.
For example, e^{−At} = e^{−It}e^{(I−A)t} = e^{−t}(I − (A − I)t).
• The transpose of e^{At} is

(e^{At})ᵀ = (Σ_{k=0}^∞ (1/k!)Aᵏtᵏ)ᵀ = Σ_{k=0}^∞ (1/k!)(Aᵀ)ᵏtᵏ = e^{Aᵀt}.

Hence, if Aᵀ = −A (skew-symmetric), then (e^{At})ᵀe^{At} = e^{Aᵀt}e^{At} = I; so e^{At} is an orthogonal matrix (recall that a matrix Q is orthogonal if QᵀQ = I).

6.3 Quick tip

• For a 2 × 2 matrix A = [a1,1 a1,2; a2,1 a2,2], the eigenvector corresponding to the eigenvalue λ1 is [a1,2; λ1 − a1,1]:

(A − λ1I)[a1,2; λ1 − a1,1] = [a1,1 − λ1, a1,2; a2,1, a2,2 − λ1][a1,2; λ1 − a1,1] = [0; ?]

where ? = 0 because the two row vectors of (A − λ1I) are parallel.

This is especially useful when solving differential equations, where a1,1 = 0 and a1,2 = 1; hence, the eigenvector v1 corresponding to the eigenvalue λ1 is v1 = [1; λ1].

Solve y″ − 2y′ + y = 0. Let z = y′. Then the problem is reduced to

du/dt = [y′; z′] = [0 1; −1 2][y; z] = Au  ⇒  v1 = [1; λ1]

6.3 Problem discussions

Problem 6.3B and Problem 31: A convenient way to solve d²u/dt² = −A²u.

(Problem 31, Section 6.3) The cosine of a matrix is defined like eᴬ, by copying the series for cos t:

cos t = 1 − (1/2!)t² + (1/4!)t⁴ − ···   cos A = I − (1/2!)A² + (1/4!)A⁴ − ···

(a) If Ax = λx, multiply each term times x to find the eigenvalue of cos A.
(b) Find the eigenvalues of A = [π π; π π] with eigenvectors (1, 1) and (1, −1). From the eigenvalues and eigenvectors of cos A, find that matrix C = cos A.
(c) The second derivative of cos(At) is −A² cos(At).

u(t) = cos(At)u(0) solves d²u/dt² = −A²u starting from u′(0) = 0.

Construct u(t) = cos(At)u(0) by the usual three steps for that specific A:

1. Expand u(0) = (4, 2) = c1x1 + c2x2 in the eigenvectors.
2. Multiply those eigenvectors by ___ and ___ (instead of e^{λt}).

3. Add up the solution u(t) = c1 ___ x1 + c2 ___ x2. (Hint: See Slide 6-48.)

(Problem 6.3B, Section 6.3) Find the eigenvalues and eigenvectors of A and write u(0) = (0, 2√2, 0) as a combination of the eigenvectors. Solve both equations u′ = Au and u″ = Au:

du/dt = [−2 1 0; 1 −2 1; 0 1 −2]u  and  d²u/dt² = [−2 1 0; 1 −2 1; 0 1 −2]u with du/dt(0) = 0.

The 1, −2, 1 diagonals make A into a second difference matrix (like a second derivative). u′ = Au is like the heat equation ∂u/∂t = ∂²u/∂x². Its solution u(t) will decay (negative eigenvalues). u″ = Au is like the wave equation ∂²u/∂t² = ∂²u/∂x². Its solution will oscillate (imaginary eigenvalues).

Thinking over Problem 6.3B and Problem 31: How about solving d²u/dt² = Au?

Solution. The answer should be e^{A^{1/2}t}u(0). Note that A^{1/2} has the same eigenvectors as A but its eigenvalues are the square roots of those of A.

e^{A^{1/2}t} = S [e^{λ1^{1/2}t} 0 ··· 0; 0 e^{λ2^{1/2}t} ··· 0; ...; 0 0 ··· e^{λn^{1/2}t}] S⁻¹.

As an example from Problem 6.3B, A = [−2 1 0; 1 −2 1; 0 1 −2].

The eigenvalues of A are −2 and −2 ± √2. So,

e^{A^{1/2}t} = S [e^{i√2 t} 0 0; 0 e^{i√(2−√2) t} 0; 0 0 e^{i√(2+√2) t}] S⁻¹.

Final reminder on the approach delivered in the textbook

• In solving du(t)/dt = Au with initial u(0), it is convenient to find the solution as

u(0) = Sc = c1v1 + ··· + cnvn  and  u(t) = Se^{Λt}c = c1e^{λ1t}v1 + ··· + cne^{λnt}vn.

This is what the textbook always does in the problems.

You should practice these problems by yourself!

Problem 15: How to solve du/dt = Au − b (for invertible A)?

(Problem 15, Section 6.3) A particular solution to du/dt = Au − b is up = A⁻¹b, if A is invertible. The usual solutions to du/dt = Au give un. Find the complete solution u = up + un:

(a) du/dt = u − 4   (b) du/dt = [1 0; 1 1]u − [4; 6].

Problem 16: How to solve du/dt = Au − e^{ct}b (for c not an eigenvalue of A)?

(Problem 16, Section 6.3) If c is not an eigenvalue of A, substitute u = e^{ct}v and find a particular solution to du/dt = Au − e^{ct}b. How does it break down when c is an eigenvalue of A? The "nullspace" of du/dt = Au contains the usual solutions e^{λit}xi.

Hint: The particular solution vp satisfies (A − cI)vp = b.

Problem 23: eᴬeᴮ, eᴮeᴬ and e^{A+B} are not necessarily equal. (They are equal when AB = BA.)

(Problem 23, Section 6.3) Generally eᴬeᴮ is different from eᴮeᴬ. They are both different from e^{A+B}. Check this using Problems 21-22 and 19. (If AB = BA, all these are the same.)

A = [1 4; 0 0]   B = [0 −4; 0 0]   A + B = [1 0; 0 0]

Problem 27: An interesting brain-storming problem!

(Problem 27, Section 6.3) Find a solution x(t), y(t) that gets large as t → ∞. To avoid this instability a scientist exchanges the two equations:

dx/dt = 0x − 4y, dy/dt = −2x + 2y  becomes  dy/dt = −2x + 2y, dx/dt = 0x − 4y

Now the matrix [−2 2; 0 −4] is stable. It has negative eigenvalues. How can this be?

Hint: The matrix is not the right one to be used to describe the transformed linear equations.

6.4 Symmetric matrices

Definition (Symmetric matrix): A matrix A is symmetric if A = AT.

• When A is symmetric,
– its eigenvalues are all real;
– it has n orthogonal eigenvectors (so it can be diagonalized). We can then normalize these orthogonal eigenvectors to obtain an orthonormal set.

Definition (Symmetric diagonalization): A symmetric matrix A can be written as A = QΛQ⁻¹ = QΛQᵀ with Q⁻¹ = Qᵀ, where Λ is a diagonal matrix with the real eigenvalues on its diagonal.
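Numerically, eig applied to a symmetric matrix returns an orthonormal Q (a minimal sketch with an assumed example matrix):

    A = [2 1 0; 1 2 1; 0 1 2];    % assumed symmetric example
    [Q, L] = eig(A);
    disp(norm(Q'*Q - eye(3)))     % ~0 : Q is orthogonal
    disp(norm(A - Q*L*Q'))        % ~0 : A = Q*Lambda*Q'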

Theorem ( or principal axis theorem) A symmetric matrix A with distinct eigenvalues can be written as A = QΛQ−1 = QΛQT with Q−1 = QT where Λ is a diagonal matrix with real eigenvalues at the diagonal. Proof: • We have proved on Slide 6-16 that a symmetric matrix has real eigenvalues and a skew-symmetric matrix has pure imaginary eigenvalues. • vTv Next, we prove that eigenvectors are orthogonal, i.e., 1 2 =0,forunequal eigenvalues λ1 = λ2.

Av1 = λ1v1 and Av2 = λ2v2
⇒ λ1v1ᵀv2 = (λ1v1)ᵀv2 = (Av1)ᵀv2 = v1ᵀAᵀv2 = v1ᵀAv2 = v1ᵀλ2v2 = λ2v1ᵀv2,

which implies (λ1 − λ2)v1ᵀv2 = 0. Therefore, v1ᵀv2 = 0 because λ1 ≠ λ2. □

• A = QΛQᵀ implies that

A = λ1q1q1ᵀ + ··· + λnqnqnᵀ = λ1P1 + ··· + λnPn

Recall from Slide 4-37 that the projection matrix Pi onto a unit vector qi is

Pi = qi(qiᵀqi)⁻¹qiᵀ = qiqiᵀ

So Ax is the sum of the projections of the vector x onto each eigenspace.

Ax = λ1P1x + ···+ λnPnx.

The spectral theorem can be extended to a symmetric matrix with repeated eigenvalues. To prove this, we need to introduce the famous Schur's theorem.

Theorem (Schur's Theorem): Every square matrix A can be factorized into

A = QTQ⁻¹ with T upper triangular and Q† = Q⁻¹,

where "†" denotes the Hermitian transpose operation (and Q and T can generally have complex entries!). Further, if A has real eigenvalues (and hence has real eigenvectors), then Q and T can be chosen real.

Proof: The existence of Q and T such that A = QTQ† and Q† = Q⁻¹ can be proved by induction.
• The theorem trivially holds when A is a 1 × 1 matrix.
• Suppose that Schur's Theorem is valid for all (n−1) × (n−1) matrices. Then we claim that Schur's Theorem holds for all n × n matrices.
– This is because we can take t1,1 and q1 to be an eigenvalue and unit eigenvector of An×n (as there must exist at least one pair of eigenvalue and eigenvector for An×n). Then choose any unit vectors p2, ..., pn such that they together with q1 span the n-dimensional space, and

P = [q1 p2 ··· pn]

is a (unitary) orthonormal matrix, satisfying P† = P⁻¹.

– We derive

P†AP = [q1†; p2†; ...; pn†] [Aq1 Ap2 ··· Apn]
     = [q1†; p2†; ...; pn†] [t1,1q1 Ap2 ··· Apn]   (since Aq1 = t1,1q1, with q1 the eigenvector and t1,1 the eigenvalue)
     = [t1,1, t1,2 ··· t1,n; 0, Ã(n−1)×(n−1)]   (in block form)

where t1,j = q1†Apj and

Ã(n−1)×(n−1) = [p2†Ap2 ··· p2†Apn; ... ; pn†Ap2 ··· pn†Apn].

– Since Schur's Theorem is true for any (n − 1) × (n − 1) matrix, we can find Q̃(n−1)×(n−1) and T̃(n−1)×(n−1) such that

Ã(n−1)×(n−1) = Q̃(n−1)×(n−1) T̃(n−1)×(n−1) Q̃†(n−1)×(n−1)  and  Q̃†(n−1)×(n−1) = Q̃⁻¹(n−1)×(n−1).

– Finally, define

Qn×n = P [1, 0ᵀ; 0, Q̃(n−1)×(n−1)]

and

Tn×n = [t1,1, [t1,2 ··· t1,n]Q̃(n−1)×(n−1); 0, T̃(n−1)×(n−1)].

They satisfy

Q†n×nQn×n = [1, 0ᵀ; 0, Q̃†] (P†P) [1, 0ᵀ; 0, Q̃] = In×n

and

Qn×nTn×n = P [1, 0ᵀ; 0, Q̃] [t1,1, [t1,2 ··· t1,n]Q̃; 0, T̃]
         = P [t1,1, [t1,2 ··· t1,n]Q̃; 0, Q̃T̃]
         = P [t1,1, [t1,2 ··· t1,n]Q̃; 0, ÃQ̃]   (using ÃQ̃ = Q̃T̃)
         = P [t1,1, t1,2 ··· t1,n; 0, Ã] [1, 0ᵀ; 0, Q̃]
         = P (P†AP) [1, 0ᵀ; 0, Q̃] = A Qn×n

• It remains to prove that if A has real eigenvalues, then Q and T can be chosen real, which can be similarly proved by induction. Suppose “If A has real eigenvalues, then Q and T can be chosen real.” is true for all (n − 1) × (n − 1) matrices. Then, the claim should be true for all n × n matrices.

– For a given An×n with real eigenvalues (and real eigenvectors), we can certainly choose real t1,1 and q1, and hence real p2, ..., pn. This makes Ã(n−1)×(n−1) and t1,2, ..., t1,n real.

The eigenvector associated with a real eigenvalue can be chosen real. For complex v, and real λ and A, by definition Av = λv is equivalent to A · Re{v} = λ · Re{v} and A · Im{v} = λ · Im{v}.

– The proof is completed by noting that the eigenvalues of Ã(n−1)×(n−1), satisfying

P†AP = [t1,1, t1,2 ··· t1,n; 0, Ã(n−1)×(n−1)],

are also eigenvalues of An×n; hence, they are all real. So, by the validity of the claimed statement for (n − 1) × (n − 1) matrices, the existence of real Q̃(n−1)×(n−1) and real T̃(n−1)×(n−1) satisfying

Ã(n−1)×(n−1) = Q̃(n−1)×(n−1) T̃(n−1)×(n−1) Q̃†(n−1)×(n−1)  and  Q̃†(n−1)×(n−1) = Q̃⁻¹(n−1)×(n−1)

is confirmed. □

Two important facts:
• P†AP and A have the same eigenvalues but possibly different eigenvectors. A simple proof: for v′ = P†v (equivalently v = Pv′), (P†AP)v′ = P†Av = P†(λv) = λP†v = λv′.

• (Section 6.1: Problem 26, or see Slide 6-25)

det(A) = det [B C; 0 D] = det(B) · det(D)

Thus,

P†AP = [t1,1, t1,2 ··· t1,n; 0, Ã(n−1)×(n−1)]

implies that the eigenvalues of Ã(n−1)×(n−1) are also eigenvalues of P†An×nP.

Theorem (Spectral theorem or principal axis theorem): A symmetric matrix A (not necessarily with distinct eigenvalues) can be written as

A = QΛQ⁻¹ = QΛQᵀ with Q⁻¹ = Qᵀ,

where Λ is a diagonal matrix with the real eigenvalues on its diagonal.

Proof:

• A symmetric matrix certainly satisfies Schur’s Theorem. A = QT Q† with T upper triangular and Q† = Q−1.

• A has real eigenvalues. So both T and Q are real.
• By Aᵀ = A, we have Aᵀ = QTᵀQᵀ = QTQᵀ = A. This immediately gives Tᵀ = T, which implies that the off-diagonal entries are zero (an upper triangular symmetric matrix must be diagonal).

• By AQ = QT, equivalently

Aq1 = t1,1q1, Aq2 = t2,2q2, ..., Aqn = tn,nqn,

we know that Q is the matrix of eigenvectors (and there are n of them) and T is the matrix of eigenvalues. □

This result immediately indicates that a symmetric A can always be diagonalized.

Summary
• A symmetric matrix has real eigenvalues and n real orthogonal eigenvectors.

6.4 Problem discussions

(Problem 21, Section 6.4) (Recommended) This matrix M is skew-symmetric and also orthogonal. Then all its eigenvalues are pure imaginary and they also have |λ| = 1. (‖Mx‖ = ‖x‖ for every x, so ‖λx‖ = ‖x‖ for eigenvectors.) Find all four eigenvalues from the trace of M:

M = (1/√3) [0 1 1 1; −1 0 −1 1; −1 1 0 −1; −1 −1 1 0] can only have eigenvalues i or −i.

Thinking over Problem 21: The eigenvalues of an orthogonal matrix satisfy |λ| = 1.

Proof: ‖Qv‖² = (Qv)†Qv = v†Q†Qv = v†v = ‖v‖² implies ‖λv‖ = ‖v‖. □

|λ| =1andλ pure imaginary implies λ = ±ı. 6.4 Problem discussions 6-81

(Problem 15, Section 6.4) Show that A (symmetric but complex) has only one line of eigenvectors:

A = [i 1; 1 −i] is not even diagonalizable: eigenvalues λ = 0, 0.

Aᵀ = A is not such a special property for complex matrices. The good property is Āᵀ = A (Section 10.2). Then all λ's are real and the eigenvectors are orthogonal.

Thinking over Problem 15: That a symmetric matrix A satisfying Aᵀ = A has real eigenvalues and n orthogonal eigenvectors is only true for real symmetric matrices. For a complex matrix A, we need to rephrase it as: "A Hermitian symmetric matrix A satisfying A† = A has real eigenvalues and n orthogonal eigenvectors."

Inner product and norm for complex vectors: v · w = w†v and ‖v‖² = v†v

Proof of the red-colored claim (in the previous slide): Suppose Av = λv. Then,

Av = λv ⇒ (Av)†v = (λv)†v   (i.e., (Āv̄)ᵀv = (λ̄v̄)ᵀv)
⇒ v†A†v = λ*v†v
⇒ v†Av = λ*v†v   (A† = A)
⇒ v†λv = λ*v†v
⇒ λ‖v‖² = λ*‖v‖²
⇒ λ = λ* □

(Problem 28, Section 6.4) For complex matrices, the symmetry Aᵀ = A that produces real eigenvalues changes to Āᵀ = A. From det(A − λI) = 0, find the eigenvalues of the 2 by 2 "Hermitian" matrix A = [4, 2+i; 2−i, 0] = Āᵀ. To see why eigenvalues are real when Āᵀ = A, adjust equation (1) of the text to Āx̄ = λ̄x̄. (See the green box above.)

(Problem 27, Section 6.4) (MATLAB) Take two symmetric matrices with different eigenvectors, say A = [1 0; 0 2] and B = [8 1; 1 0]. Graph the eigenvalues λ1(A + tB) and λ2(A + tB) for −8 < t < 8. Peter Lax says on page 113 of Linear Algebra that λ1 and λ2 appear to be on a collision course at certain values of t. "Yet at the last minute they turn aside." How close do they come?

Correction for Problem 27: The problem should be "... Graph the eigenvalues λ1 and λ2 of A + tB for −8 < t < 8 ..."

Hint: Draw the pictures of λ1(t) and λ2(t) with respect to t and check λ1(t) − λ2(t).

(Problem 29, Section 6.4) Normal matrices have ĀᵀA = AĀᵀ. For real matrices, AᵀA = AAᵀ includes symmetric, skew-symmetric, and orthogonal matrices. Those have real λ, imaginary λ, and |λ| = 1. Other normal matrices can have any complex eigenvalues λ.

Key point: Normal matrices have n orthonormal eigenvectors. Those vectors xi probably will have complex components. In that complex case orthogonality means x̄iᵀxj = 0, as Chapter 10 explains. Inner products (dot products) become x̄ᵀy. The test for n orthonormal columns in Q becomes Q̄ᵀQ = I instead of QᵀQ = I.

A has n orthonormal eigenvectors (A = QΛQ¯T) if and only if A is normal.

(a) Start from A = QΛQ̄ᵀ with Q̄ᵀQ = I. Show that ĀᵀA = AĀᵀ.
(b) Now start from ĀᵀA = AĀᵀ. Schur found A = QTQ̄ᵀ for every matrix A, with a triangular T. For normal matrices we must show (in 3 steps) that this T will actually be diagonal. Then T = Λ.

Step 1. Put A = QTQ̄ᵀ into ĀᵀA = AĀᵀ to find T̄ᵀT = TT̄ᵀ.
Step 2. Suppose T = [a b; 0 d] has T̄ᵀT = TT̄ᵀ. Prove that b = 0.
Step 3. Extend Step 2 to size n. A normal triangular T must be diagonal.

Important conclusion from Problem 29: A matrix A has n orthogonal eigenvec- tors if, and only if, A is normal.

Definition (): A matrix A is normal if A†A = A†A. Proof: • Schur’s theorem: A = QT Q† with T upper triangular and Q† = Q−1. • A†A = AA† =⇒ T †T = TT† =⇒ T diagonal =⇒ Q is the matrix of eigenvectors by AQ = QT . 2 6.5 Positive definite matrices 6-86

Definition (Positive definite): A symmetric matrix A is positive definite if its eigenvalues are all positive.

• The above definition only applies to a symmetric matrix because a non-symmetric matrix may have complex eigenvalues (which cannot be compared with zero)!

Properties of positive definite (symmetric) matrices

• Equivalent Definition: A (symmetric) is positive definite if, and only if, xᵀAx > 0 for all non-zero x.

xᵀAx is usually referred to as the quadratic form!

The proofs can be found in Problems 18 and 19.

(Problem 18, Section 6.5) If Ax = λx then xᵀAx = ___. If xᵀAx > 0, prove that λ > 0.

(Problem 19, Section 6.5) Reverse Problem 18 to show that if all λ > 0 then xᵀAx > 0. We must do this for every nonzero x, not just the eigenvectors. So write x as a combination of the eigenvectors and explain why all "cross terms" are xiᵀxj = 0. Then xᵀAx is

(c1x1 + ··· + cnxn)ᵀ(c1λ1x1 + ··· + cnλnxn) = c1²λ1x1ᵀx1 + ··· + cn²λnxnᵀxn > 0.

Proof (Problems 18 and 19):

– (only if part: Problem 19) Avi = λivi implies viᵀAvi = λiviᵀvi > 0 for all eigenvalues {λi}ⁿᵢ₌₁ and eigenvectors {vi}ⁿᵢ₌₁. The proof is completed by noting that, with x = Σᵢ₌₁ⁿ civi and {vi}ⁿᵢ₌₁ orthogonal,

xᵀ(Ax) = (Σᵢ civi)ᵀ(Σⱼ cjλjvj) = Σᵢ ci²λi‖vi‖² > 0.

– (if part: Problem 18) Taking x = vi, we obtain viᵀ(Avi) = viᵀ(λivi) = λi‖vi‖² > 0, which implies λi > 0. □

• Based on the above proposition, we can conclude, just as for two positive scalars ("if a and b are both positive, so is a + b"), that

Proposition: If A and B are both positive definite, so is A + B.

• The next property provides an easy way to construct positive definite matrices.

Proposition: If An×n = (Rm×n)ᵀRm×n and Rm×n has linearly independent columns, then A is positive definite.

Proof:
– x non-zero ⇒ Rx non-zero.
– Then xᵀ(RᵀR)x = (Rx)ᵀ(Rx) = ‖Rx‖² > 0. □

Since (symmetric) A = QΛQᵀ, we can choose R = QΛ^{1/2}Qᵀ, which requires R to be a square matrix. This observation gives another equivalent definition of positive definite matrices.

Equivalent Definition: An×n is positive definite if, and only if, there exists Rm×n with independent columns such that A = RᵀR.
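This equivalent definition suggests both a way to build positive definite matrices and a practical test: MATLAB's chol succeeds exactly when the (symmetric) argument is positive definite (R below is an assumed example with independent columns):

    R = [1 2; 0 3; 1 1];    % assumed R (3x2, independent columns)
    A = R'*R;               % then A is positive definite
    disp(eig(A)')           % all eigenvalues positive
    [~, p] = chol(A);       % p == 0 signals positive definiteness
    disp(p)                 % 0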

• Equivalent Definition: An×n is positive definite if, and only if, all pivots are positive.

Proof:
– (By LDU decomposition) A = LDLᵀ, where D is a diagonal matrix with the pivots on its diagonal, and L is a lower triangular matrix with 1's on its diagonal. Then

xᵀAx = xᵀ(LDLᵀ)x = (Lᵀx)ᵀD(Lᵀx) = yᵀDy   where y = Lᵀx = [l1ᵀx; ...; lnᵀx]
     = d1y1² + ··· + dnyn² = d1(l1ᵀx)² + ··· + dn(lnᵀx)².

– So, if all pivots are positive, xᵀAx > 0 for all non-zero x. Conversely, if xᵀAx > 0 for all non-zero x, which in turn implies yᵀDy > 0 for all non-zero y (as L is invertible), then each pivot di must be positive (take yj = 0 for all j except j = i). □

An extension of the previous proposition is:

Suppose A = BCBT with B invertible and C diagonal. Then, A is positive definite if, and only if, diagonals of C are all positive! See Problem 35.

(Problem 35, Section 6.5) Suppose C is positive definite (so yᵀCy > 0 whenever y ≠ 0) and A has independent columns (so Ax ≠ 0 whenever x ≠ 0). Apply the energy test to xᵀAᵀCAx to show that AᵀCA is positive definite: the crucial matrix in engineering.

Equivalent Definition: An×n is positive definite if, and only if, the n upper left determinants (i.e., the (n − 1) leading principal minors and det(A)) are all positive.

Definition (Minors, principal minors and leading principal minors):
– A minor of a matrix A is the determinant of some smaller square matrix, obtained by removing one or more of its rows and columns.
– A first-order (respectively, second-order, etc.) minor is a minor obtained by removing (n − 1) (respectively, (n − 2), etc.) rows and columns.
– A principal minor of a matrix is a minor obtained by removing the same rows and columns.
– A leading principal minor of a matrix is a minor obtained by removing the last few rows and columns.

The example below gives three upper left determinants (or two leading principal minors and det(A)): det(A1×1), det(A2×2) and det(A3×3).

[a1,1 a1,2 a1,3; a2,1 a2,2 a2,3; a3,1 a3,2 a3,3]

(det(A1×1) uses the upper-left 1 × 1 block, det(A2×2) the upper-left 2 × 2 block, and det(A3×3) = det(A).)

Proof of the equivalent definition: This can be proved based on Slide 5-18. By LU decomposition,

A = [1 0 0 ··· 0; l2,1 1 0 ··· 0; l3,1 l3,2 1 ··· 0; ...; ln,1 ln,2 ln,3 ··· 1] [d1 u1,2 u1,3 ··· u1,n; 0 d2 u2,3 ··· u2,n; 0 0 d3 ··· u3,n; ...; 0 0 0 ··· dn]

⇒ dk = det(Ak×k) / det(A(k−1)×(k−1)) □

Equivalent Definition: An×n is positive definite if, and only if, all (2ⁿ − 2) principal minors and det(A) are positive.

• Based on the above equivalent definitions, a positive definite matrix cannot have either a zero or a negative value on its main diagonal. See Problem 16.

(Problem 16, Section 6.5) A positive definite matrix cannot have a zero (or even worse, a negative number) on its diagonal. Show that this matrix fails to have xᵀAx > 0:

[x1 x2 x3] [4 1 1; 1 0 2; 1 2 5] [x1; x2; x3] is not positive when (x1, x2, x3) = ( , , ).

With the above discussion of positive definite matrices, we can proceed to define similar notions like negative definite, positive semidefinite, and negative semidefinite matrices.

6.5 Positive semidefinite (or nonnegative definite)

Definition (Positive semidefinite): A symmetric matrix A is positive semidefinite if its eigenvalues are all nonnegative.

Equivalent Definition: A is positive semidefinite if, and only if, xᵀAx ≥ 0 for all non-zero x.

Equivalent Definition: An×n is positive semidefinite if, and only if, there exists Rm×n (perhaps with dependent columns) such that A = RᵀR.

Equivalent Definition: An×n is positive semidefinite if, and only if, all pivots are nonnegative.

Equivalent Definition: An×n is positive semidefinite if, and only if, all (2ⁿ − 2) principal minors and det(A) are non-negative.

Example. The non-"positive semidefinite" matrix A = [1 0 0; 0 0 0; 0 0 −1] has (2³ − 2) principal minors and det(A):

det [1 0; 0 0] = 0, det [1 0; 0 −1] = −1, det [0 0; 0 −1] = 0, det [1] = 1, det [0] = 0, det [−1] = −1.

Note: It is not sufficient to define positive semidefiniteness based on the non-negativity of the leading principal minors and det(A). Check the above example.

6.5 Negative definite, negative semidefinite, indefinite

We can similarly define negative definite.

Definition (Negative definite): A symmetric matrix A is negative definite if its eigenvalues are all negative.

Equivalent Definition: A is negative definite if, and only if, xᵀAx < 0 for all non-zero x.

Equivalent Definition: An×n is negative definite if, and only if, there exists Rm×n with independent columns such that A = −RᵀR.

Equivalent Definition: An×n is negative definite if, and only if, all pivots are negative.

Equivalent Definition: An×n is negative definite if, and only if, all odd-order leading principal minors (including det(A) when n is odd) are negative and all even-order leading principal minors (including det(A) when n is even) are positive.

Equivalent Definition: An×n is negative definite if, and only if, all odd-order principal minors are negative and all even-order principal minors are positive.

We can also similarly define negative semidefinite.

Definition (Negative semidefinite): A symmetric matrix A is negative semidefinite if its eigenvalues are all non-positive.

Equivalent Definition: A is negative semidefinite if, and only if, xᵀAx ≤ 0 for all non-zero x.

Equivalent Definition: An×n is negative semidefinite if, and only if, there exists Rm×n (possibly with dependent columns) such that A = −RᵀR.

Equivalent Definition: An×n is negative semidefinite if, and only if, all pivots are non-positive.

Equivalent Definition: An×n is negative semidefinite if, and only if, all odd-order principal minors are non-positive and all even-order principal minors and det(A) are non-negative.

Note: It is not sufficient to define negative semidefiniteness based on the non-positivity of the leading principal minors and det(A).

Finally, if some of the eigenvalues of a symmetric matrix are positive and some are negative, the matrix is referred to as indefinite.

6.5 Quadratic form

• For a positive definite matrix A2×2,

xᵀAx = xᵀ(QΛQᵀ)x = (Qᵀx)ᵀΛ(Qᵀx) = yᵀΛy   where y = Qᵀx = [q1ᵀx; q2ᵀx]
     = λ1(q1ᵀx)² + λ2(q2ᵀx)².

So xᵀAx = c gives an ellipse if c > 0.

Example. A = [5 4; 4 5]
– λ1 = 9 and λ2 = 1, and q1 = (1/√2)[1; 1] and q2 = (1/√2)[1; −1].

– $x^T A x = \lambda_1 (q_1^T x)^2 + \lambda_2 (q_2^T x)^2 = 9 \left( \frac{x_1 + x_2}{\sqrt{2}} \right)^2 + 1 \left( \frac{x_1 - x_2}{\sqrt{2}} \right)^2$

– q₁ᵀx is an axis perpendicular to q₁. (Not along q₁ as the textbook said.)

– q₂ᵀx is an axis perpendicular to q₂.

Tip: A = QΛQᵀ is called the principal axis theorem (cf. Slide 6-69) because xᵀAx = yᵀΛy = c is an ellipse with axes along the eigenvectors.
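A quick numpy verification of this expansion at an arbitrary point (my own sketch; numpy's eigh returns eigenvalues in ascending order, so λ = 1 comes first):

import numpy as np

A = np.array([[5., 4.],
              [4., 5.]])
lam, Q = np.linalg.eigh(A)   # lam = [1., 9.]; columns of Q are unit eigenvectors
x = np.array([0.3, -1.7])    # arbitrary test point (my choice)

direct = x @ A @ x
via_axes = sum(l * (q @ x) ** 2 for l, q in zip(lam, Q.T))
print(np.isclose(direct, via_axes))   # True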

6.5 Problem discussions 6-99

(Problem 13, Section 6.5) Find a matrix with a > 0 and c > 0 and a + c > 2b that has a negative eigenvalue.

Missing point in Problem 13: The matrix to be determined is of the shape $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$.

6.6 Similar matrices 6-100

Definition (Similarity): Two matrices A and B are similar if B = M⁻¹AM for some invertible M.

Theorem: A and M⁻¹AM have the same eigenvalues.

Proof: For an eigenvalue λ and eigenvector v of A, define v′ = M⁻¹v. We then derive

Bv′ = (M⁻¹AM)v′ = M⁻¹AM(M⁻¹v) = M⁻¹Av = M⁻¹λv = λM⁻¹v = λv′.

So, λ is also an eigenvalue of M⁻¹AM (associated with the eigenvector v′ = M⁻¹v). □

Notes:

• The LU decomposition of a symmetric matrix A gives A = LDLᵀ, but A and D (the pivot matrix) apparently may have different eigenvalues. Why? Because in general (Lᵀ)⁻¹ ≠ L; similarity is defined based on M⁻¹, not Mᵀ.

• The converse to the above theorem is wrong! In other words, we cannot say that "two matrices with the same eigenvalues are similar."

Example. $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ have the same eigenvalues 0, 0, but they are not similar.
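The theorem is easy to probe numerically; a minimal numpy sketch (mine), assuming a randomly drawn M is invertible:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
M = rng.standard_normal((4, 4))   # almost surely invertible

B = np.linalg.inv(M) @ A @ M      # similar to A
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(B))))   # True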

6.6 Jordan form 6-101

• Why introduce matrix similarity?

Answer: We can then extend the diagonalization of a matrix with fewer than n eigenvectors to the Jordan form.

• For an undiagonalizable A, we can find an invertible M such that A = MJM⁻¹, where

$$J = \begin{bmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_s \end{bmatrix}$$

and s is the number of distinct eigenvalues, the size of Jᵢ is equal to the multiplicity of eigenvalue λᵢ, and Jᵢ is of the form

$$J_i = \begin{bmatrix} \lambda_i & 1 & 0 & \cdots & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 \\ 0 & 0 & \lambda_i & \ddots & \vdots \\ \vdots & \vdots & \vdots & \ddots & 1 \\ 0 & 0 & 0 & \cdots & \lambda_i \end{bmatrix}$$

6.6 Jordan form 6-102

• The idea behind the Jordan form is that A is similar to J.
• Based on this, we can now compute

$$A^{100} = M J^{100} M^{-1} = M \begin{bmatrix} J_1^{100} & 0 & \cdots & 0 \\ 0 & J_2^{100} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_s^{100} \end{bmatrix} M^{-1}.$$
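A sympy sketch of this (the example matrix is my own; sympy's Matrix.jordan_form returns a pair (M, J) with A = M J M⁻¹):

import sympy as sp

# A 3x3 matrix that is not diagonalizable: lambda = 2 has multiplicity 3
# but only two independent eigenvectors.
A = sp.Matrix([[3, 1, 0],
               [-1, 1, 0],
               [0, 0, 2]])
M, J = A.jordan_form()
print(J)                                            # one 2x2 block and one 1x1 block
print(sp.simplify(M * J**100 * M.inv() - A**100))   # zero matrix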

• How to find the Mᵢ corresponding to Jᵢ?

Answer: By AM = MJ with M = [M₁ M₂ ··· Mₛ], we know

AMᵢ = MᵢJᵢ for 1 ≤ i ≤ s.

Specifically, assume that the multiplicity of λᵢ is two. Then,

$$A \begin{bmatrix} v_i^{(1)} & v_i^{(2)} \end{bmatrix} = \begin{bmatrix} v_i^{(1)} & v_i^{(2)} \end{bmatrix} \begin{bmatrix} \lambda_i & 1 \\ 0 & \lambda_i \end{bmatrix}$$

which is equivalent to

$$\begin{cases} A v_i^{(1)} = \lambda_i v_i^{(1)} \\ A v_i^{(2)} = v_i^{(1)} + \lambda_i v_i^{(2)} \end{cases} \implies \begin{cases} (A - \lambda_i I) v_i^{(1)} = 0 \\ (A - \lambda_i I) v_i^{(2)} = v_i^{(1)} \end{cases}$$

6.6 Problem discussions 6-103

(Problem 14, Section 6.6) Prove that Aᵀ is always similar to A (we know the λ's are the same):

1. For one Jordan block Jᵢ: Find Mᵢ so that Mᵢ⁻¹JᵢMᵢ = Jᵢᵀ.
2. For any J with blocks Jᵢ: Build M₀ from blocks so that M₀⁻¹JM₀ = Jᵀ.
3. For any A = MJM⁻¹: Show that Aᵀ is similar to Jᵀ and so to J and to A.

6.6 Problem discussions 6-104

Thinking over Problem 14: Aᵀ and A are always similar.

Answer:

• It can be easily checked that

$$\begin{bmatrix} 0 & \cdots & 0 & 1 \\ \vdots & & 1 & 0 \\ 0 & \cdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} u_{1,1} & u_{1,2} & \cdots & u_{1,n} \\ u_{2,1} & u_{2,2} & \cdots & u_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ u_{n,1} & u_{n,2} & \cdots & u_{n,n} \end{bmatrix} \begin{bmatrix} 0 & \cdots & 0 & 1 \\ \vdots & & 1 & 0 \\ 0 & \cdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix} = \begin{bmatrix} u_{n,n} & \cdots & u_{n,2} & u_{n,1} \\ \vdots & \ddots & \vdots & \vdots \\ u_{2,n} & \cdots & u_{2,2} & u_{2,1} \\ u_{1,n} & \cdots & u_{1,2} & u_{1,1} \end{bmatrix},$$

i.e.,

$$V^{-1} \begin{bmatrix} u_{1,1} & \cdots & u_{1,n} \\ \vdots & \ddots & \vdots \\ u_{n,1} & \cdots & u_{n,n} \end{bmatrix} V = \begin{bmatrix} u_{n,n} & \cdots & u_{n,1} \\ \vdots & \ddots & \vdots \\ u_{1,n} & \cdots & u_{1,1} \end{bmatrix}$$

where V here represents the matrix with zero entries except v_{i,n+1−i} = 1 for 1 ≤ i ≤ n. We note that V⁻¹ = Vᵀ = V.

So, we can use a V of the proper size as Mᵢ to obtain

Jᵢᵀ = Mᵢ⁻¹JᵢMᵢ.
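A quick numpy check (my sketch, for a sample eigenvalue and block size) that conjugating a Jordan block by this reversal matrix V produces its transpose:

import numpy as np

lam, k = 7.0, 4
J = lam * np.eye(k) + np.diag(np.ones(k - 1), 1)   # k-by-k Jordan block
V = np.fliplr(np.eye(k))                           # ones on the anti-diagonal
print(np.allclose(V @ J @ V, J.T))                 # True; note V = V^-1 = V^T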

6.6 Problem discussions 6-105

Define

$$M = \begin{bmatrix} M_1 & 0 & \cdots & 0 \\ 0 & M_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & M_s \end{bmatrix}.$$

We have Jᵀ = MJM⁻¹, where

$$J = \begin{bmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_s \end{bmatrix}$$

and s is the number of distinct eigenvalues, the size of Jᵢ is equal to the multiplicity of eigenvalue λᵢ, and Jᵢ is of the form

$$J_i = \begin{bmatrix} \lambda_i & 1 & 0 & \cdots & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 \\ 0 & 0 & \lambda_i & \ddots & \vdots \\ \vdots & \vdots & \vdots & \ddots & 1 \\ 0 & 0 & 0 & \cdots & \lambda_i \end{bmatrix}$$

6.6 Problem discussions 6-106

Finally, we know
– A is similar to J;
– Aᵀ is similar to Jᵀ;
– Jᵀ is similar to J.
So, Aᵀ is similar to A. □

(Problem 19, Section 6.6) If A is 6 by 4 and B is 4 by 6, AB and BA have different sizes. But with blocks,

$$M^{-1} F M = \begin{bmatrix} I & -A \\ 0 & I \end{bmatrix} \begin{bmatrix} AB & 0 \\ B & 0 \end{bmatrix} \begin{bmatrix} I & A \\ 0 & I \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ B & BA \end{bmatrix} = G.$$

(a) What sizes are the four blocks (the same four sizes in each matrix)?
(b) This equation is M⁻¹FM = G, so F and G have the same eigenvalues. F has the 6 eigenvalues of AB plus 4 zeros; G has the 4 eigenvalues of BA plus 6 zeros. AB has the same eigenvalues as BA plus zeros.

6.6 Problem discussions 6-107

Thinking over Problem 19: A_{m×n}B_{n×m} and B_{n×m}A_{m×n} have the same eigenvalues except for (m − n) additional zeros.

Solution: The example shows the usefulness of similarity.

• $\begin{bmatrix} I_{m\times m} & -A_{m\times n} \\ 0_{n\times m} & I_{n\times n} \end{bmatrix} \begin{bmatrix} A_{m\times n}B_{n\times m} & 0_{m\times n} \\ B_{n\times m} & 0_{n\times n} \end{bmatrix} \begin{bmatrix} I_{m\times m} & A_{m\times n} \\ 0_{n\times m} & I_{n\times n} \end{bmatrix} = \begin{bmatrix} 0_{m\times m} & 0_{m\times n} \\ B_{n\times m} & B_{n\times m}A_{m\times n} \end{bmatrix}$

• Hence, $\begin{bmatrix} A_{m\times n}B_{n\times m} & 0_{m\times n} \\ B_{n\times m} & 0_{n\times n} \end{bmatrix}$ and $\begin{bmatrix} 0_{m\times m} & 0_{m\times n} \\ B_{n\times m} & B_{n\times m}A_{m\times n} \end{bmatrix}$ are similar and have the same eigenvalues.

• From Problem 26 in Section 6.1 (see Slide 6-25), the claim stated above is proved. □
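A numpy illustration of the claim (mine, with random 6-by-4 and 4-by-6 factors):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
B = rng.standard_normal((4, 6))

print(np.sort_complex(np.linalg.eigvals(A @ B)))   # 6 values
print(np.sort_complex(np.linalg.eigvals(B @ A)))   # the same 4 values, without
                                                   # the (6 - 4) extra (near-)zeros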

6.6 Problem discussions 6-108

(Problem 21, Section 6.6) If J is the 5 by 5 Jordan block with λ = 0, find J² and count its eigenvectors (are these the eigenvectors?) and find its Jordan form (there will be two blocks).

Problem 21 (restated): Find the Jordan form of A = J², where

$$J = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

Solution.

• $A = J^2 = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$, and the eigenvalues of A are 0, 0, 0, 0, 0.

6.6 Problem discussions 6-109

• $A v_1 = \begin{bmatrix} v_{3,1} \\ v_{4,1} \\ v_{5,1} \\ 0 \\ 0 \end{bmatrix} = 0 \implies v_{3,1} = v_{4,1} = v_{5,1} = 0 \implies v_1 = \begin{bmatrix} v_{1,1} \\ v_{2,1} \\ 0 \\ 0 \\ 0 \end{bmatrix}$

• $A v_2 = \begin{bmatrix} v_{3,2} \\ v_{4,2} \\ v_{5,2} \\ 0 \\ 0 \end{bmatrix} = v_1 \implies \begin{cases} v_{3,2} = v_{1,1} \\ v_{4,2} = v_{2,1} \\ v_{5,2} = 0 \end{cases} \implies v_2 = \begin{bmatrix} v_{1,2} \\ v_{2,2} \\ v_{1,1} \\ v_{2,1} \\ 0 \end{bmatrix}$

• $A v_3 = \begin{bmatrix} v_{3,3} \\ v_{4,3} \\ v_{5,3} \\ 0 \\ 0 \end{bmatrix} = v_2 \implies \begin{cases} v_{3,3} = v_{1,2} \\ v_{4,3} = v_{2,2} \\ v_{5,3} = v_{1,1} \\ v_{2,1} = 0 \end{cases} \implies v_3 = \begin{bmatrix} v_{1,3} \\ v_{2,3} \\ v_{1,2} \\ v_{2,2} \\ v_{1,1} \end{bmatrix}$

6.6 Problem discussions 6-110

• To summarize,

$$v_1 = \begin{bmatrix} v_{1,1} \\ v_{2,1} \\ 0 \\ 0 \\ 0 \end{bmatrix} \implies v_2 = \begin{bmatrix} v_{1,2} \\ v_{2,2} \\ v_{1,1} \\ v_{2,1} \\ 0 \end{bmatrix} \implies \text{if } v_{2,1} = 0 \text{ then } v_3 = \begin{bmatrix} v_{1,3} \\ v_{2,3} \\ v_{1,2} \\ v_{2,2} \\ v_{1,1} \end{bmatrix}$$

$$v_1^{(1)} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \implies v_2^{(1)} = \begin{bmatrix} v_{1,2}^{(1)} \\ v_{2,2}^{(1)} \\ 1 \\ 0 \\ 0 \end{bmatrix} \implies v_3^{(1)} = \begin{bmatrix} v_{1,3}^{(1)} \\ v_{2,3}^{(1)} \\ v_{1,2}^{(1)} \\ v_{2,2}^{(1)} \\ 1 \end{bmatrix}$$

$$v_1^{(2)} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \implies v_2^{(2)} = \begin{bmatrix} v_{1,2}^{(2)} \\ v_{2,2}^{(2)} \\ 0 \\ 1 \\ 0 \end{bmatrix}$$

6.6 Problem discussions 6-111

Since we wish to choose each of v₂⁽¹⁾ and v₂⁽²⁾ to be orthogonal to both v₁⁽¹⁾ and v₁⁽²⁾,

$$v_{1,2}^{(1)} = v_{2,2}^{(1)} = v_{1,2}^{(2)} = v_{2,2}^{(2)} = 0,$$

i.e.,

$$v_1^{(1)} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \implies v_2^{(1)} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \implies v_3^{(1)} = \begin{bmatrix} v_{1,3}^{(1)} \\ v_{2,3}^{(1)} \\ 0 \\ 0 \\ 1 \end{bmatrix}$$

$$v_1^{(2)} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \implies v_2^{(2)} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}$$

Since we wish to choose v₃⁽¹⁾ to be orthogonal to all of v₁⁽¹⁾, v₁⁽²⁾, v₂⁽¹⁾ and v₂⁽²⁾,

$$v_{1,3}^{(1)} = v_{2,3}^{(1)} = 0.$$

6.6 Problem discussions 6-112

As a result,

$$A \begin{bmatrix} v_1^{(1)} & v_2^{(1)} & v_3^{(1)} & v_1^{(2)} & v_2^{(2)} \end{bmatrix} = \begin{bmatrix} v_1^{(1)} & v_2^{(1)} & v_3^{(1)} & v_1^{(2)} & v_2^{(2)} \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad \square$$
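The two-block structure can be confirmed with sympy (my sketch; jordan_form returns (M, J) with A = M J M⁻¹, though its M need not match the orthogonal choice made above):

import sympy as sp

J = sp.zeros(5, 5)
for i in range(4):
    J[i, i + 1] = 1      # 5-by-5 Jordan block with lambda = 0
A = J**2

M, Jform = A.jordan_form()
print(Jform)             # lambda = 0 blocks of sizes 3 and 2, as derived above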

• Are all of v₁⁽¹⁾, v₂⁽¹⁾, v₃⁽¹⁾, v₁⁽²⁾ and v₂⁽²⁾ eigenvectors of A (satisfying Av = λv)?

Hint: A does not have 5 eigenvectors. More specifically, A only has 2 eigenvectors. Which two? Think it over.

• Hence, the problem statement may not be accurate, which is why I mark "(are these the eigenvectors?)".

6.7 Singular value decomposition (SVD) 6-113

• Now we can represent every square matrix in the Jordan form: A = MJM⁻¹.

• What if the matrix A_{m×n} is not a square one?

– Problem: Av = λv is not possible! Specifically, v cannot have two different dimensionalities.

A_{m×n}v_{n×1} = λv_{n×1} is infeasible if n ≠ m.

– So, we can only have A_{m×n}v_{n×1} = σu_{m×1}.

6.7 Singular value decomposition (SVD) 6-114

• If we can find enough orthogonal u's and v's such that

$$A_{m\times n} \underbrace{\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}}_{V,\ n\times n} = \underbrace{\begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix}}_{U,\ m\times m} \underbrace{\begin{bmatrix} \sigma_1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & \sigma_r & 0 & \cdots & 0 \\ 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & 0 & \cdots & 0 \end{bmatrix}}_{\Sigma,\ m\times n}$$

where r is the rank of A, then we can perform the so-called singular value decomposition (SVD)

A = UΣV⁻¹.

Note again that the required number of orthogonal u's and v's may be impossible to find when A has repeated "eigenvalues."

6.7 Singular value decomposition (SVD) 6-115

• If V is chosen to be an orthogonal matrix satisfying V⁻¹ = Vᵀ, then we have the so-called reduced SVD:

$$A = U \Sigma V^T = \sum_{i=1}^{r} \sigma_i u_i v_i^T = \underbrace{\begin{bmatrix} u_1 & \cdots & u_r \end{bmatrix}}_{m\times r} \underbrace{\begin{bmatrix} \sigma_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_r \end{bmatrix}}_{r\times r} \underbrace{\begin{bmatrix} v_1^T \\ \vdots \\ v_r^T \end{bmatrix}}_{r\times n}$$

• Usually, we prefer to choose an orthogonal V (as well as an orthogonal U).
• In the sequel, we will assume the U and V found are orthogonal matrices in the first place; later, we will confirm that orthogonal U and V can always be found.

6.7 How to determine U, V and {σᵢ}? 6-116

• A_{m×n} = U_{m×m} Σ_{m×n} Vᵀ_{n×n}

$$\implies A^T A = (U \Sigma V^T)^T (U \Sigma V^T) = V \Sigma^T U^T U \Sigma V^T = V \Sigma_{n\times n}^2 V^T.$$

So, V is the (orthogonal) matrix of the n eigenvectors of the symmetric AᵀA.

• A_{m×n} = U_{m×m} Σ_{m×n} Vᵀ_{n×n}

$$\implies A A^T = (U \Sigma V^T)(U \Sigma V^T)^T = U \Sigma V^T V \Sigma^T U^T = U \Sigma_{m\times m}^2 U^T.$$

So, U is the (orthogonal) matrix of the m eigenvectors of the symmetric AAᵀ.
• Remember that AᵀA and AAᵀ have the same eigenvalues except for (m − n) additional zeros.

Section 6.6, Problem 19 (see Slide 6-107): A_{m×n}B_{n×m} and B_{n×m}A_{m×n} have the same eigenvalues except for (m − n) additional zeros.

In fact, there are only r non-zero eigenvalues for AᵀA and AAᵀ, which satisfy

λᵢ = σᵢ² for 1 ≤ i ≤ r.
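A numpy check of this relation (my sketch, with a random 5-by-3 matrix):

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))

sigma = np.linalg.svd(A, compute_uv=False)          # descending
lam = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]    # descending
print(np.allclose(sigma**2, lam))                   # True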

6.7 How to determine U, V and {σᵢ}? 6-117

Example (Problem 21 in Section 6.6): Find the SVD of A = J², where

$$J = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

Solution.

• $A^T A = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$ and $A A^T = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$.

• So $V = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}$ and $U = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$ for $\Lambda = \Sigma^2 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$.

6.7 How to determine U, V and {σᵢ}? 6-118

• Hence,

$$\underbrace{\begin{bmatrix} 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \end{bmatrix}}_{U} \underbrace{\begin{bmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix}}_{\Sigma} \underbrace{\begin{bmatrix} 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \\ 1&0&0&0&0 \\ 0&1&0&0&0 \end{bmatrix}}_{V^T} \qquad \square$$

We may compare this with the Jordan form:

$$\underbrace{\begin{bmatrix} 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1&0&0&0&0 \\ 0&0&0&1&0 \\ 0&1&0&0&0 \\ 0&0&0&0&1 \\ 0&0&1&0&0 \end{bmatrix}}_{M} \underbrace{\begin{bmatrix} 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&1 \\ 0&0&0&0&0 \end{bmatrix}}_{J} \underbrace{\begin{bmatrix} 1&0&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&0&1 \\ 0&1&0&0&0 \\ 0&0&0&1&0 \end{bmatrix}}_{M^{-1}}$$

6.7 How to determine U, V and {σᵢ}? 6-119

Remarks on the previous example:

• In the above example, we only know $\Lambda = \Sigma^2 = \begin{bmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix}$.

• Hence, $\Sigma = \begin{bmatrix} \pm 1&0&0&0&0 \\ 0&\pm 1&0&0&0 \\ 0&0&\pm 1&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix}$.

• However, by Avᵢ = σᵢuᵢ = (−σᵢ)(−uᵢ), we can always choose a positive σᵢ by adjusting the sign of uᵢ.

6.7 SVD 6-120

Important notes:

• In terminology, σᵢ, where 1 ≤ i ≤ r, is called a singular value; hence the name singular value decomposition.

• The singular values are always non-zero (even though the eigenvalues of A can be zeros).

Example. $A = \begin{bmatrix} 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix}$ but $\Sigma = \begin{bmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{bmatrix}$.

• It is good to know that:
– The first r columns of V lie in the row space R(A) and form a basis of R(A).
– The last (n − r) columns of V lie in the nullspace N(A) and form a basis of N(A).
– The first r columns of U lie in the column space C(A) and form a basis of C(A).
– The last (m − r) columns of U lie in the left nullspace N(Aᵀ) and form a basis of N(Aᵀ).

How useful the above facts are can be seen from the next example (and from the numerical sketch below).
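The following numpy sketch (mine, using a random 4-by-3 matrix of rank 2) illustrates the nullspace halves of these facts:

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # rank 2

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))            # numerical rank, here 2

print(np.allclose(A @ Vt[r:].T, 0))   # last (n - r) rows of V^T: basis of N(A)
print(np.allclose(A.T @ U[:, r:], 0)) # last (m - r) columns of U: basis of N(A^T)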

6.7 SVD 6-121

Example. Find the SVD of A_{4×3} = x_{4×1} yᵀ_{1×3}.

Solution.

• A is a rank-1 matrix. So, r = 1 and $\Sigma = \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$, where σ₁ ≠ 0. (Actually, σ₁ = ‖x‖‖y‖.)

• The basis of the row space is v₁ = y/‖y‖; pick perpendicular v₂ and v₃ that span the nullspace.

• The basis of the column space is u₁ = x/‖x‖; pick perpendicular u₂, u₃, u₄ that span the left nullspace.

• The SVD is then

$$A_{4\times 3} = \begin{bmatrix} \frac{x}{\|x\|} & u_2 & u_3 & u_4 \end{bmatrix}_{4\times 4} \Sigma_{4\times 3} \begin{bmatrix} y^T / \|y\| \\ v_2^T \\ v_3^T \end{bmatrix}_{3\times 3} \qquad \square$$

where the first column of U spans the column space and u₂, u₃, u₄ span the left nullspace, while the first row of Vᵀ spans the row space and v₂ᵀ, v₃ᵀ span the nullspace.
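A concrete numpy instance (my own choice of x and y) confirming σ₁ = ‖x‖‖y‖ for a rank-1 matrix:

import numpy as np

x = np.array([1., 2., 0., 2.])    # ||x|| = 3
y = np.array([3., 0., 4.])        # ||y|| = 5
A = np.outer(x, y)                # 4-by-3, rank 1

s = np.linalg.svd(A, compute_uv=False)
print(s)                                       # [15.  0.  0.]
print(np.linalg.norm(x) * np.linalg.norm(y))   # 15.0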