Chapter 6
Eigenvalues and Eigenvectors
Po-Ning Chen, Professor
Department of Electrical and Computer Engineering
National Chiao Tung University
Hsin Chu, Taiwan 30010, R.O.C. 6.1 Introduction to eigenvalues 6-1
Motivations • The static system problem of Ax = b has now been solved, e.g., by Gauss- Jordan method or Cramer’s rule.
• However, a dynamic system problem such as
Ax = λx
cannot be solved by the static system method.
• To solve the dynamic system problem, we need to find the static feature of A that is “unchanged” with the mapping A. In other words, Ax maps to itself with possibly some stretching (λ>1), shrinking (0 <λ<1), or being reversed (λ<0).
• These invariant characteristics of A are the eigenvalues and eigenvectors. Ax maps a vector x to its column space C(A). We are looking for a v ∈ C(A) such that Av aligns with v. The collection of all such vectors is the set of eigen- vectors. 6.1 Eigenvalues and eigenvectors 6-2
Conception (Eigenvalues and eigenvectors): The eigenvalue-eigenvector pair (λi, vi) of a square matrix A satisfies
Avi = λivi (or equivalently (A − λiI)vi = 0) where vi = 0 (but λi can be zero), where v1, v2,..., are linearly independent.
How to derive eigenvalues and eigenvectors?
• For 3 × 3 matrix A, we can obtain 3 eigenvalues λ1,λ2,λ3 by solving
3 2 det(A − λI)=−λ + c2λ + c1λ + c0 =0.
• If all λ1,λ2,λ3 are unequal, then we continue to derive: Av1 = λ1v1 (A − λ1I)v1 = 0 v1 =theonly basis of the nullspace of (A − λ1I) Av λ v ⇔ A − λ I v ⇔ v A − λ I 2 = 2 2 ( 2 ) 2 = 0 2 =theonly basis of the nullspace of ( 2 ) Av3 = λ3v3 (A − λ3I)v3 = 0 v3 =theonly basis of the nullspace of (A − λ3I)
and the resulting v1, v2 and v3 are linearly independent to each other.
• If A is symmetric,thenv1, v2 and v3 are orthogonal to each other. 6.1 Eigenvalues and eigenvectors 6-3
• If, for example, λ1 = λ2 = λ3, then we derive: A − λ I v 0 {v , v } A − λ I ( 1 ) 1 = ⇔ 1 2 =thebasis of the nullspace of ( 1 ) (A − λ3I)v3 = 0 v3 =theonly basis of the nullspace of (A − λ3I)
and the resulting v1, v2 and v3 are linearly independent to each other when the nullspace of (A − λ1I)istwo dimensional.
• If A is symmetric,thenv1, v2 and v3 are orthogonal to each other.
– Yet, it is possible that the nullspace of (A − λ1I)isone dimensional, then we can only have v1 = v2.
– In such case, we say the repeated eigenvalue λ1 only have one eigenvector. In other words, we say A only have two eigenvectors.
– If A is symmetric,thenv1 and v3 are orthogonal to each other. 6.1 Eigenvalues and eigenvectors 6-4
• If λ1 = λ2 = λ3, then we derive:
(A − λ1I)v1 = 0 ⇔{v1, v2, v3} =thebasis of the nullspace of (A − λ1I)
• When the nullspace of (A − λ1I)isthree dimensional, then v1, v2 and v3 are linearly independent to each other.
• When the nullspace of (A − λ1I)istwo dimensional, then we say A only have two eigenvectors.
• When the nullspace of (A − λ1I)isone dimensional, then we say A only have one eigenvector. In such case, the “matrix-form eigensystem” becomes: λ 1 00 A v1 v1 v1 = v1 v1 v1 0 λ1 0 00λ1 • If A is symmetric, then distinct eigenvectors are orthogonal to each other. 6.1 Eigenvalues and eigenvectors 6-5
Invariance of eigenvectors and eigenvalues. A • Property 1: The eigenvectors stay the same for every power of . The eigenvalues equal the same power of the respective eigenvalues. n n I.e., A vi = λi vi.
2 2 Avi = λivi =⇒ A vi = A(λivi)=λiAvi = λi vi
A • Property 2: If the nullspace of n×n consists of non-zero vectors, then 0 is the eigenvalue of A.
Accordingly, there exists non-zero vi to satisfy Avi =0· vi = 0. 6.1 Eigenvalues and eigenvectors 6-6
Property 3: Assume with no loss of generality |λ1| > |λ2| > ··· > |λk|.For any vector x = a1v1 + a2v2 + ···+ akvk that is a linear combination of all P 1 A eigenvectors, the normalized mapping = λ1 (namely,
• 1 1 1 P v1 = A v1 = (Av1)= λ1v1 = v1) λ1 λ1 λ1 converges to the eigenvector with the largest absolute eigenvalue (when being applied repeatedly). I.e., k 1 k 1 k k k lim P x = lim A x = lim a1A v1 + a2A v2 + ···+ akA vk k→∞ k→∞ λk k→∞ λk 1 1 1 k k k k = lim a1λ v1 + a2λ v2 + ···+ λkA vk k→∞ λk 1 2 1 steady state transient states = a1v1. 6.1 How to determine the eigenvalues? 6-7
We wish to find a non-zero vector v to satisfy Av = λv;then (A − λI)v =0, where v = 0 ⇔ det(A − λI)=0.
So by solving det(A − λI) = 0, we can obtain all the eigenvalues of A.
0.50.5 Example. Find the eigenvalues of A = . 0.50.5 Solution. 0.5 − λ 0.5 det(A − λI)=det 0.50.5 − λ =(0.5 − λ)2 − 0.52 = λ2 − λ = λ(λ − 1) = 0 ⇒ λ =0 or 1. 6.1 How to determine the eigenvalues? 6-8
Proposition: Projection matrix (defined in Section 4.2) has eigenvalues either 1 or 0. Proof: • A projection matrix always satisfies P 2 = P .SoP 2v = P v = λv. • By definition of eigenvalues and eigenvectors, we have P 2v = λ2v. • Hence, λv = λ2v for non-zero vector v, which immediately implies λ = λ2. • Accordingly, λ is either 1 or 0. 2
Proposition: Permutation matrix has eigenvalues satisfying λk =1forsome integer k. Proof: • A purmutation matrix always satisfies P k+1 = P for some integer k. 001 v3 3 Example. P = 100. Then, P v = v1 and P v = v. Hence, k =3. 010 v2 • Accordingly, λk+1v = λv,whichgivesλk = 1 since the eigenvalue of P cannot be zero. 2 6.1 How to determine the eigenvalues? 6-9
Proposition: Matrix
m m−1 αmA + αm−1A + ···+ α1A + α0I has the same eigenvectors as A, but its eigenvalues become
m m−1 αmλ + αm−1λ + ···+ α1λ + α0, where λ is the eigenvalue of A. Proof:
• Let vi and λi be the eigenvector and eigenvalue of A. Then, m m−1 m m−1 αmA + αm−1A + ···+ α0I vi = αmλ + αm−1λ + ···+ α0 vi m m−1 • Hence, vi and αmλ + αm−1λ + ···+ α0 are the eigenvector and eigen- value of the polynomial matrix. 2 6.1 How to determine the eigenvalues? 6-10
Theorem (Cayley-Hamilton): For a square matrix A,define
n n−1 f(λ)=det(A − λI)=λ + αn−1λ + ···+ α0. (Suppose A has linearly independent eigenvectors.) Then
n n−1 f(A)=A + αn−1A + ···+ α0I = all-zero n × n matrix.
Proof: The eigenvalues of f(A)areall zeros and the eigenvectors of f(A)remain the same as A. By definition of eigen-system, we have f(λ1)0··· 0 0 f(λ2) ··· 0 f(A) v v ··· vn = v v ··· vn . 1 2 1 2 ...... 00··· f(λn) 2 Corollary (Cayley-Hamilton): (Suppose A has linearly independent eigen- vectors.) (λ1I − A)(λ2I − A) ···(λnI − A) = all-zero matrix
Proof: f(λ) can be re-written as f(λ)=(λ1 − λ)(λ2 − λ) ···(λn − λ). 2 6.1 How to determine the eigenvalues? 6-11
(Problem 11, Section 6.1) Here is a strange fact about 2 by 2 matrices with eigen- values λ1 = λ2:ThecolumnsofA − λ1I are multiples of the eigenvector x2.Any idea why this should be? 00 Hint: (λ I − A)(λ I − A)= implies 1 2 00
(λ1I − A)w1 = 0 and (λ1I − A)w2 = 0 where (λ2I − A)= w1 w2 . Hence,
the columns of (λ2I − A) give the eigenvectors of λ1 if they are non-zero vectors and
the columns of (λ1I − A) give the eigenvectors of λ2 if they are non-zero vectors .
So, the (non-zero) columns of A − λ1I are (multiples of) the eigen- vector x2. 6.1 Why Gauss-Jordan cannot solve Ax = λx? 6-12
• The forward elimination may change the eigenvalues and eigenvectors? 1 −2 Example. Check eigenvalues and eigenvectors of A = . 25 Solution. A A − λI λ − 2 . – The eigenvalues of satisfy det( )=( 3) =0 10 1 −2 – A = LU = . The eigenvalues of U apparently satisfy 21 09 det(U − λI)=(1− λ)(9 − λ)=0.
– Suppose u1 and u2 are the eigenvectors of U, respectively corresponding to eigenvalues 1 and 9. Then, they cannot be the eigenvectors of A since if they were, 10 3u1 = Au1 = LUu1 = Lu1 = u1 =⇒ u1 = 0 21 2 10 3u2 = Au2 = LUu2 =9Lu2 =9 u2 =⇒ u2 = 0. 21
Hence, the eigenvalues are nothing to do with pivots (except for a triangular A). 6.1 How to determine the eigenvalues? (Revisited) 6-13
Solve det(A − λI)=0. • det(A − λI) is a polynomial of order n. a − λa a ··· a 1,1 1,2 1,3 1,n a a − λa ··· a 2,1 2,2 2,3 2,n f λ A − λI a a a − λ ··· a ( )=det( )=det 3,1 3,2 3,3 3,n ...... an,1 an,2 an,3 ··· an,n − λ
=(a1,1 − λ)(a2,2 − λ) ···(an,n − λ)+terms of order at most (n − 2) (By Leibniz formula)
=(λ1 − λ)(λ2 − λ) ···(λn − λ) Observations • The coefficient of λn is (−1)n,providedn ≥ 1. • λn−1 n λ n a A n ≥ The coefficient of is i=1 i = i=1 i,i = trace( ),provided 2. • ... • λ0 n λ f A . The coefficient of is i=1 i = (0) = det( ) 6.1 How to determine the eigenvalues? (Revisited) 6-14
These observations make easy the finding of the eigenvalues of 2 × 2 matrix. 11 Example. Find the eigenvalues of A = . 41
Solution. λ λ 1 + 2 =1+1=2 2 2 • =⇒ (λ1 − λ2) =(λ1 + λ2) − 4λ1λ2 =16 λ1λ2 =1− 4=−3
=⇒ λ1 − λ2 =4 =⇒ λ =3, −1. 2 11 Example. Find the eigenvalues of A = . 22
Solution. λ + λ =3 • 1 2 =⇒ λ =3, 0. 2 λ1λ2 =0 6.1 Imaginary eigenvalues 6-15
In some cases, we have to allow imaginary eigenvalues.
• In order to solve polynomial equations f(λ) = 0, the mathematician was forced to image that there exists a number x satisfying x2 = −1 .
• By this technique, a polynomial equations of order n have exactly n (possibly complex,notreal) solutions.
Example. Solve λ2 +1=0. =⇒ λ = ±i. 2
• Based on this, to solve the eigenvalues, we were forced to accept imaginary eigenvalues. 01 Example. Find the eigenvalues of A = . −10 Solution. det(A − λI)=λ2 +1=0 =⇒ λ = ±i. 2 6.1 Imaginary eigenvalues 6-16
Proposition: The eigenvalues of a symmetric matrix A (with real entries) are real, and the eigenvalues of a skew-symmetric (or antisymmetric) matrix B are pure imaginary. Proof: • Suppose Av = λv.Then, Av∗ = λ∗v∗ (A real) =⇒ (Av∗)Tv = λ∗(v∗)Tv (Equivalently, v · Av∗ = v · λ∗(v∗)) =⇒ (v∗)TATv = λ∗(v∗)Tv =⇒ (v∗)TAv = λ∗(v∗)Tv (Symmetry means AT = A) =⇒ (v∗)Tλv = λ∗(v∗)Tv (Av = λv) =⇒ λ v 2 = λ∗ v 2 (Eigenvector must be non-zero, i.e., v 2 =0) =⇒ λ = λ∗ =⇒ λ real 6.1 Imaginary eigenvalues 6-17
• Suppose Bv = λv. Then, Bv∗ = λ∗v∗ (B real) =⇒ (Bv∗)Tv =(λ∗v∗)Tv =⇒ (v∗)TBTv = λ∗(v∗)Tv =⇒ (v∗)T(−B)v = λ∗(v∗)Tv (Skew-symmetry means BT = −B) =⇒ (v∗)T(−λ)v = λ∗(v∗)Tv (Bv = λv) =⇒ (−λ) v 2 = λ∗ v 2 (Eigenvector must be non-zero, i.e., v 2 =0) =⇒−λ = λ∗ =⇒ λ pure imaginary 2 6.1 Eigenvalues/eigenvectors of inverse 6-18
For invertible A, the relation between eigenvalues and eigenvectors of A and A−1 can be well-determined. Proposition: The eigenvalues and eigenvectors of A−1 are
1 1 1 , v1 , , v2 ,..., , vn , λ1 λ2 λn { λ , v }n A where ( i i) i=1 are the eigenvalues and eigenvectors of . Proof: • A A n λ The eigenvalues of invertible must be non-zero because det( )= i=1 i =0. • Suppose Av = λv,wherev = 0 and λ =0.(I.e., λ and v are eigenvalue and eigenvector of A.)
−1 −1 −1 1 −1 • So, Av = λv =⇒ A (Av)=A (λv)=⇒ v = λA v =⇒ λv = A v. 2 Note: The eigenvalues of Ak and A−1 are respectively λk and λ−1 with the same eigenvectors as A,whereλ is the eigenvalue of A. 6.1 Eigenvalues/eigenvectors of inverse 6-19
Proposition: The eigenvalues of AT are the same as the eigenvalues of A.(But they may have different eigenvectors.) Proof: det(A − λI)=det ((A − λI)T)=det (AT − λIT)=det (AT − λI). 2
Corollary: The eigenvalues of an invertible (real-valued) matrix A satisfying A−1 = AT are on the unit circle of the complex plain. Proof: Suppose Av = λv. Then, Av∗ = λ∗v∗ (A real) =⇒ (Av∗)Tv = λ∗(v∗)Tv =⇒ (v∗)TATv = λ∗(v∗)Tv =⇒ (v∗)TA−1v = λ∗(v∗)Tv (AT = A−1) ∗ T 1 ∗ ∗ T −1 1 =⇒ (v ) λv = λ (v ) v (A v = λv) ⇒ 1 v 2 λ∗ v 2 v 2 = λ = (Eigenvector must be non-zero, i.e., =0) =⇒ λλ∗ = |λ|2 =1 2 01 Example. Find the eigenvalues of A = that satisfies AT = A−1. −10 Solution. det(A − λI)=λ2 +1=0 =⇒ λ = ±i. 2 6.1 Determination of eigenvectors 6-20
• After the identification of eigenvalues via det(A−λI) = 0, we can determine the respective eigenvectors using the nullspace technique.
• Recall (from Slide 3-42) how we completely solve
Bn×nvn×1 =(An×n − λIn×n)vn×1 = 0n×1.
Answer: Ir×r Fr× n−r R = rref(B)= ( ) (with no row exchange) 0(n−r)×r 0(n−r)×(n−r) −Fr×(n−r) ⇒ Nn×(n−r) = I(n−r)×(n−r) Then, every solution v for Bv = 0 is of the form v(null) N w w ∈ n−r. n×1 = n×(n−r) (n−r)×1 for any
Here, we usually find the (n − r) orthonormal bases for the null space as the representative eigenvectors, which are exactly the (n − r) columns of Nn×(n−r) (with proper normalization and orthogonalization). 6.1 Determination of eigenvectors 6-21
(Problem 12, Section 6.1) Find three eigenvectors for this matrix P (projection matrices have λ =1and0): 0.20.40 Projection matrix P = 0.40.80 . 001 If two eigenvectors share the same λ, so do all their linear combinations. Find an eigenvector of P with no zero components. Solution. • det(P − λI)=0givesλ(1 − λ)2 =0;soλ =0, 1, 1. −1 1 2 0 • λ =1:rref(P − I)=000; 000 1 1 2 0 2 0 F −1 N . so 1×2 = 2 0 ,and 3×2 = 10, which implies eigenvectors = 1 and 0 01 0 1 6.1 Determination of eigenvectors 6-22 1 2 0 • λ =0:rref(P )=000 (with no row exchange); 0 0 1 2 −2 −2 so F = − ,andN3×1 = 1 , which implies eigenvector = 1 . 0 0 0 2 6.1 MATLAB revisited 6-23
In MATLAB:
• We can find the eigenvalues and eigenvectors of a matrix A by: [V D]= eig(A); % Find the eigenvalues/vectors of A • The columns of V are the eigenvectors. • The diagonals of D are the eigenvalues.
Proposition: • The eigenvectors corresponding non-zero eigenvalues are in C(A). • The eigenvectors corresponding zero eigenvalues are in N(A).
Proof: The first one can be seen from Av = λv and the second one can be proved by Av = 0. 2 6.1 Some discussions on problems 6-24
(Problem 25, Section 6.1) Suppose A and B have the same eigenvalues λ1,...,λn with the same independent eigenvectors x1,...,xn.ThenA = B. Reason: Any vector x is a combintion c1x1 + ···+ cnxn.WhatisAx?WhatisBx? Thinking over Problem 25: Suppose A and B havethesameeigenvaluesand eigenvectors (not necessarily independent). Can we claim that A = B. Answer to the thinking: Not necessarily. − − A 1 1 B 2 2 , As a counterexample, both = − and = − have eigenvalues 0 0 1 1 2 2 1 and single eigenvector but they are not equal. 1 If however the eigenvectors span the n-dimensional space (such as there are n of them and they are linearly independent),thenA = B. λ1 0 ··· 0 0 λ2 ··· 0 Hint for Problem 25: A x ··· xn = x ··· xn . 1 1 ...... 00··· λn This important fact will be re-emphasized in Section 6.2. 6.1 Some discussions on problems 6-25
(Problem 26, Section 6.1) The block B has eigenvalues 1, 2andC has eigenvalues 3, 4andD has eigenvalues 5, 7. Find the eigenvalues of the 4 by 4 matrix A: 0130 BC −2304 A = = . 0 D 0061 0016
BC Thinking over Problem 26: The eigenvalues of are the eigenvalues of B 0 D and D because we can show BC det = det(B) · det(D). 0 D
(See Problems 23 and 25 in Section 5.2.) 6.1 Some discussions on problems 6-26
(Problem 23, Section 5.2) With 2 by 2 blocks in 4 by 4 matrices, you cannot always use block determinants: AB AB = |A||D| but = |A||D|−|C||B|. 0 D CD (a) Why is the first statement true? Somehow B doesn’t enter. Hint: Leibniz formula. (b) Show by example that equality fails (as shown) when C enters. (c) Show by example that the answer det(AD − CB)isalsowrong. 6.1 Some discussions on problems 6-27
−1 (Problem 25, Section 5.2) Block elimination subtracts CA times the first row AB from the second row CD. This leaves the Schur complement D − CA−1B in the corner: I 0 AB AB = . −CA−1 I CD 0 D − CA−1B Take determinants of these block matrices to prove correct rules if A−1 exists: AB |A||D − CA−1B| |AD − CB| AC CA. CD = = provided = Hint: det(A) · det(D − CA−1B)=det A(D − CA−1B) . 6.1 Some discussions on problems 6-28
(Problem 37, Section 6.1) (a) Find the eigenvalues and eigenvectors of A. They depend on c: .41− c A = . .6 c
(b) Show that A has just one line of eigenvectors when c =1.6. (c) This is a Markov matrix when c = .8. Then An will approach what matrix A∞?
Definition (Markov matrix): A Markov matrix is a matrix with positive entries, for which every column adds to one.
• Note that some researchers define the Markov matrix by replacing positive by non-negative; thus, they will use the term positive Markov matrix.Thebelow observations hold for Markov matrices with “non-negative entries.” 6.1 Some discussions on problems 6-29
• 1 must be one of the eigenvalues of a Markov matrix. 1 1 Proof: A and AT have the same eigenvalues, and AT . = .. 2 1 1
• The eigenvalues of a Markov matrix satisfy |λ|≤1. ATv λv n a v λv v Proof: = implies i=1 i,j i = j; hence, by letting k satisfying |vk| =max1≤i≤n |vi|,wehave n n n |λv | a v ≤ a |v |≤ a |v | |v | k = i,k i i,k i i,k k = k i=1 i=1 i=1 which implies the desired result. 2 6.1 Some discussions on problems 6-30
(Problem 31, Section 6.1) If we exchange rows 1 and 2 and columns 1 and 2, the eigenvalues don’t change. Find eigenvectors of A and B for λ1 = 11. Rank one gives λ2 = λ3 =0. 121 633 A = 363 and B = PAPT = 211 . 484 844
Thinking over Problem 31: This is sometimes useful in determining the eigenval- ues and eigenvectors.
• If Av = λv and B = PAP−1 (for any invertible P ), then λ and u = P v are the eigenvalue and eigenvector of B.
Proof: Bu = PAP−1u = PAv = P (λv)=λP v = λu. 2 010 For example, P = 100.Insuchcase,P −1 = P = P T. 001 6.1 Some discussions on problems 6-31
With the fact above, we can further claim that Proposition. The eigenvalues of AB and BA are equal if one of A and B is invertible. Proof: Thiscanbeprovedby(AB)=A(BA)A−1 (or (AB)=B−1(BA)B). Note that AB has eigenvector Av or (B−1v)ifv is the eigenvector of BA. 2 6.1 Some discussions on problems 6-32 λ 0 ··· 0 1 0 λ ··· 0 The eigenvectors of a diagonal matrix Λ = 2 are apparently ...... 00··· λn 0 . 0 e i ,i , ,...,n i = 1 position =1 2 0 . 0
−1 Hence, A = SΛS have the same eigenvalues as Λ and have eigenvectors vi = Sei.Thisimpliesthat S = v1 v2 ··· vn where {vi} are the eigenvectors of A. What if S is not invertible? Then, we have the next theorem. 6.2 Diagnolizing a matrix 6-33
A convenience for the eigen-system analysis is that we can diagonalize a matrix. Theorem. A matrix A can be written as λ1 0 ··· 0 0 λ2 ··· 0 A v v ··· vn = v v ··· vn 1 2 1 2 ...... =S 00··· λn =Λ { λ , v }n A where ( i i) i=1 are the eigenvalue-eigenvector pairs of . Proof: The theorem holds by definition of eigenvalues and eigenvectors. 2 Corollary.IfS is invertible, then S−1AS = Λ equivalently A = SΛS−1. 6.2 Diagnolizing a matrix 6-34
This makes easy the computation of polynomial with argument A. Proposition (recall Slide 6-9): Matrix
m m−1 αmA + αm−1A + ···+ α0I has the same eigenvectors as A, but its eigenvalues become
m m−1 αmλ + αm−1λ + ···+ α0, where λ is the eigenvalue of A.
Proposition: m m−1 m m−1 −1 αmA + αm−1A + ···+ α0I = S αmΛ + αm−1Λ + ···+ α0 S
Proposition: Am = SΛmS−1 6.2 Diagnolizing a matrix 6-35
Exception • It is possible that an n × n matrix does not have n eigenvectors. 1 −1 Example. A = . 1 −1
Solution.
– λ1 = 0 is apparently an eigenvalue for a non-invertible matrix. λ A − λ − − λ – The second eigenvalue is 2 = trace( ) 1 =[1+( 1)] 1 =0. 1 – This matrix however only has one eigenvector (or some researchers may 1 say “two repeated eigenvectors”). – In such case, we still have 1 −1 11 11 0 0 − = 1 1 11 11 00 A S S Λ but S has no inverse. 2 In such case, we cannot preform SAS−1;sowesayA cannot be diagonalized. 6.2 Diagnolizing a matrix 6-36
When will S be guaranteed to be invertible? One easy answer: When all eigenvalues are distinct (and there are n eigenvalues).
Proof: Suppose for a matrix A,thefirstk eigenvectors v1,...,vk are linearly inde- pendent, but the (k +1)th eigenvector is dependent on the previous k eigenvectors. Then, for some unique a1,...,ak,
vk+1 = a1v1 + ···+ akvk, which implies λk+1vk+1 = Avk+1 = a1Av1 + ...+ akAvk = a1λ1v1 + ...+ akλkvk
λk+1vk+1 = a1λk+1v1 + ...+ akλk+1vk
Accordingly, λk+1 = λi for 1 ≤ i ≤ k; i.e., that vk+1 is linearly dependent on v1,...,vk only occurs when they have the same eigenvalues. (The proof is not yet completed here! See the discussion and Example below.)
Theaboveproofsaidthataslongaswehavethe(k + 1)th eigenvector, then it is linearly dependent on the previous k eigenvectors only when all eigenvalues are equal! But sometimes, we do not guarantee to have the (k + 1)th eigenvector. 6.2 Diagnolizing a matrix 6-37 5421 01−1 −1 Example. A = . −1 −13 0 11−12 Solution. • We have four eigenvalues 1, 2, 4, 4. • But we can only have three eigenvectors for 1, 2and4. • From the proof in the previous page, the eigenvectors for 1, 2 and 4 are linearly indepedent because 1, 2 and 4 are not equal. 2
Proof (Continue): One condition that guarantees to have n eigenvectors is that we have n distinct eigenvalues, which now complete the proof. 2
Definition (GM and AM): The number of linearly independent eigenvectors for an eigenvalue λ is called its geometric multiplicity;thenumberofits appearance in solving det(A − λI) is called its algebraic multiplicity. Example. In the above example, GM=1 and AM=2 for λ =4. 6.2 Fibonacci series 6-38
Eigen-decomposition is useful in solving Fibonacci series
Problem: Suppose Fk+2 = Fk+1 + Fk with initially F1 =1andF0 =0.FindF100. Answer: k+1 Fk 11 Fk Fk 11 F • +2 = +1 =⇒ +2 = 1 Fk+1 10 Fk Fk+1 10 F0 • So, 99 F 11 1 1 100 = = SΛ99S−1 F99 10 0 0 √ √ √ 99 √ √ 1+ 5 −1 1+ 5 1− 5 0 1+ 5 1− 5 1 2 2 2 2 2 = √ 99 11 1− 5 11 0 0 2 √ √ √ 99 √ 1+ 5 1+ 5 1− 5 0 1 1 −1− 5 1 2 2 2 √2 = √ 99 √ √ 11 1− 5 1+ 5 − 1− 5 −1 1+ 5 0 0 2 2 2 2 6.2 Fibonacci series 6-39 √ √ 100 − 100 1+ 5 1 5 2 2 1 1 = √ 99 √ 99 √ √ 1+ 5 1− 5 1+ 5 − 1− 5 −1 2 2 2 2 √ √ 100 − 100 1+ 5 − 1 5 1 2 2 = √ √ √ 99 √ 99 . 1+ 5 − 1− 5 1+ 5 − 1− 5 2 2 2 2 2 k 6.2 Generalization to uk = A u0 6-40 • A generalization of the solution to Fibonacci series is as follows.
Suppose u0 = a1v1 + a2v2 + ···+ anvn. k k k k Then uk = A u0 = a1A v1 + a2A v2 + ···+ anA vn a λkv a λkv ... a λkv = 1 1 1 + 2 2 2 + + n n n. Examination: k Fk+1 11 F1 k uk = = = A u0 Fk 10 F0 and √ √ 1 1 1+ 5 1 1− 5 2 2 u0 = = √ √ − √ √ 0 1+ 5 − 1− 5 1 1+ 5 − 1− 5 1 2 2 2 2 a1 a2
So √ √ 99 √ − 99 √ 1+ 5 1 5 − F 2 1+ 5 2 1 5 100 2 2 u99 = = √ √ − √ √ . F99 1+ 5 − 1− 5 1 1+ 5 − 1− 5 1 2 2 2 2 6.2 More applications of eigen-decomposition 6-41
(Problem 30, Section 6.2) Suppose the same S diagonalizes both A and B.They −1 −1 have the same eigenvectors in A = SΛ1S and B = SΛ2S . Prove that AB = BA. Proposition: Suppose both A and B can be diagnosed, and one of A and B have distinct eigenvalues. Then, A and B have the same eigenvectors if, and only if, AB = BA. Proof: • (See Problem 30 above) If A and B have the same eigenvectors, then
−1 −1 −1 −1 −1 −1 AB =(SΛAS )(SΛBS )=SΛAΛBS = SΛBΛAS =(SΛBS )(SΛAS )=BA.
• Suppose without loss of generality that the eigenvalues of A are all distinct.
Then, if AB = BA,wehaveforgivenAv = λv and u = Bv,
Au = A(Bv)=(AB)v =(BA)v = B(Av)=B(λv)=λ(Bv)=λu.
Hence, u = Bv and v are both the eigenvectors of A corresponding to the same eigenvalue λ. 6.2 More applications of eigen-decomposition 6-42 u n a v {v }n A Let = i=1 i i,where i i=1 are linearly independent eigenvectors of . Au n a Av n a λ v = i=1 i i = i=1 i i i =⇒ ai(λi − λ)=0for1≤ i ≤ n. Au λ u n a λ v = = i=1 i i This implies u = aivi = av = Bv .
i:λi=λ Thus, v is an eigenvector of B. 2 6.2 Problem discussions 6-43
−1 (Problem 32, Section 6.2) Substitute A = SΛS into the product (A − λ1I)(A − λ2I) ···(A − λnI) and explain why this produces the zero matrix. We are sub- stituting the matrix A for the number λ in the polynomial p(λ)=det(A − λI). The Cayley-Hamilton Theorem says that this product is always p(A)=zero matrix,evenifA is not diagonalizable. Thinking over Problem 32: The Cayley-Hamilton Theorem can be easily proved if A is diagonalizable. Corollary (Cayley-Hamilton): Suppose A has linearly independent eigen- vectors. (λ1I − A)(λ2I − A) ···(λnI − A) = all-zero matrix
Proof:
(λ1I − A)(λ2I − A) ···(λnI − A) −1 −1 −1 = S(λ1I − Λ)S S(λ2I − Λ)S ···S(λnI − Λ)S S λ I − λ I − ··· λ I − S−1 = ( 1 Λ) ( 2 Λ) ( n Λ) (1, 1)th entry= 0 (2, 2)th entry= 0 (n, n)th entry= 0 = S all-zero entries S−1 = all-zero entries 2 6.3 Applications to differential equations 6-44
We can use eigen-decomposition to solve u (t) a , u (t)+a , u (t)+···+ a ,nun(t) 1 1 1 1 1 2 2 1 d u (t) du a , u (t)+a , u (t)+···+ a ,nun(t) 2 = = Au = 2 1 1 2 2 2 2 dt . dt . un(t) an,1u1(t)+an,2u2(t)+···+ an,nun(t) where A is called the companion matrix. By differential equation technique, we know that the solution is of the form u = eλtv for some constant λ and constant vector v. du λ v Au Question: What are all and that satisfy dt = ?
du u eλtv λeλtv Au Aeλtv Solution: Take = into the equation: dt = = = =⇒ λv = Av.
The answer are all eigenvalues and eigenvectors. 6.3 Applications to differential equations 6-45 du Au Since dt = is a linear system, a linear combination of solutions is still a solution.
Hence, if there are n eigenvectors, the complete solution can be represented as
λ1t λ2t λnt c1e v1 + c2e v2 + ···+ cne vn where c1,c2,...,cn are determined by the initial conditions.
Here, for convenience of discussion at this moment, we assume that A gives us exactly n eigenvectors. 6.3 Applications to differential equations 6-46 du Au It is sometimes convenient to re-express the solution of dt = for a given initial condition u(0) as eAtu(0), i.e.,
λ1t λ2t λnt c e v + c e v + ···+ cne vn 1 1 2 2 eλ1t 0 ··· 0 c 1 eλ2t ··· c 0 0 2 Λt −1 At = v v ··· vn = SeS Sc = e u(0) 1 2 ...... . eAt u(0) S λnt 00··· e cn where we define λ1t e 0 ··· 0 eλ2t ··· t − 0 0 − − SeΛ S 1 = S S 1, if S 1 exists ...... eAt . . . ··· eλnt 00 ∞ 1 k − (At) , holds no matter whether S 1 exist or not k! k=0 6.3 Applications to differential equations 6-47 and hence u(0) = Sc.Notethat ∞ ∞ ∞ ∞ 1 tk tk tk (At)k = Ak = (SΛkS−1)=S Λk S−1 = SeΛtS−1. k! k! k! k! k=0 k=0 k=0 k=0
Key to remember: Again, we define by convention that f(λ )0··· 0 1 0 f(λ ) ··· 0 f(A)=S 2 S−1. ...... 00··· f(λn) eλ1t 0 ··· 0 0 eλ2t ··· 0 So, eAt = S S−1. ...... 00··· eλnt 6.3 Applications to differential equations 6-48
We can solve the second-order equation in the same manner. Example. Solve my + by + ky =0. Answer: • Let z = y . Then, the problem is reduced to y = z du =⇒ = Au z − k y − b z dt = m m y u A 01 with = z and = − k − b . m m • The complete solution for u is therefore
λ1t λ2t c1e v1 + c2e v2. 2 6.3 Applications to differential equations 6-49
Example. Solve y + y = 0 with initial y(0) = 1 and y (0) = 0. Solution. • z y Let = . Then, the problem is reduced to y = z du =⇒ = Au z = −y dt y 01 with u = and A = . z −10 • The eigenvalues and eigenvectors of A are −ı ı ı, and −ı, . 1 1 • The complete solution for u is therefore y −ı ı 1 −ı ı = c eıt + c e−ıt with initially = c + c . y 1 1 2 1 0 1 1 2 1 c ı c − ı So, 1 = 2 and 2 = 2.Thus, ı ı 1 1 y(t)= eıt(−ı) − e−ıtı = eıt + e−ıt =cos(t). 2 2 2 2 2 6.3 Applications to differential equations 6-50
In practice, we will use a discrete approximation to approximate a continuous function. There are however three different discrete approximations. For example, how to approximate y (t)=−y(t) by a discrete system?
−Yn−1 Forward approximation Y − Y Y Yn+1−Yn − Yn−Yn−1 n+1 2 n + n−1 ∆t ∆t = = −Yn Centered approximation (∆t)2 ∆t −Yn+1 Backward approximation Let’s take forward approximation as an example, i.e., Yn+1−Yn − Yn−Yn−1 ∆t ∆t Zn Zn−1 = −Yn− . ∆t 1 Thus, Yn − Yn +1 Z y (t)=z(t) = n Yn 1∆t Yn ≈ ∆t ≡ +1 = Zn − Zn Z − t Z z (t)=−y(t) +1 −Y n+1 ∆ 1 n t = n ∆ un+1 A un 6.3 Applications to differential equations 6-51
Then, we obtain 11 (1 + ı∆t)n 0 1 − ı 1 u Anu 2 2 n = 0 = ı −ı − ı t n 1 ı 0(1∆ ) 2 2 0 and ı t n − ı t n (1 + ∆ ) +(1 ∆ ) 2 n/2 Yn = = 1+(∆t) cos(n · ∆θ) 2 where ∆θ =tan−1(∆t).
Problem: |Yn|→∞as n large. 6.3 Applications to differential equations 6-52
Backward approximation will leave to Yn → 0asn large.
Yn − Yn +1 Z y (t)=z(t) = n+1 1 −∆t Yn Yn ≈ ∆t ≡ +1 = Zn − Zn t Z Z z (t)=−y(t) +1 −Y ∆ 1 n+1 n t = n+1 ∆ A˜−1 un+1 un
A solution to the problem: Interleave the forward approximation with the backward approximation. Yn − Yn +1 Z y (t)=z(t) = n 10 Yn 1∆t Yn ≈ ∆t ≡ +1 = Zn − Zn t Z Z z (t)=−y(t) +1 −Y ∆ 1 n+1 01 n t = n+1 ∆ un+1 un
We then perform this so-called leapfrog method.(SeeProblems 28 and 29.)
−1 Yn+1 10 1∆t Yn 1∆t Yn Z = t Z = − t − t 2 Z n+1 ∆ 1 01 n ∆ 1 (∆ ) n un+1 un |eigenvalues|=1 if ∆t≤2 un 6.3 Applications to differential equations 6-53
(Problem 28, Section 6.3) Centering y = −y in Example 3 will produce Yn+1 − 2 2Yn + Yn−1 = −(∆t) Yn. This can be written as a one-step difference equation for U =(Y,Z): Yn = Yn +∆tZn 10 Yn 1∆t Yn +1 +1 = . Zn+1 = Zn − ∆tYn+1 ∆t 1 Zn+1 01 Zn
Invert the matrix on the left side to write this as Un+1 = AUn. Show that detA = ¯ 1. Choose the large time step ∆t = 1 and find the eigenvalues λ1 and λ2 = λ1 of A: 11 A = has |λ | = |λ | =1. Show that A6 is exactly I. −10 1 2
After 6 steps to t =6,U6 equals U0.Theexacty =cos(t) returns to 1 at t =2π. 6.3 Applications to differential equations 6-54
(Problem 29, Section 6.3) That centered choice (leapfrog method)inProblem28is√ very successful for small time steps ∆t. But find the eigenvalues of A for ∆t = 2 and 2: √ 1 2 12 A = √ and A = . − 2 −1 −2 −3 Both matrices have |λ| = 1. Compute A4 in both cases and find the eigenvectors of A. That value ∆t = 2 is at the border of instability. Time steps ∆t>2 will n lead to |λ| > 1, and the powers in Un = A U0 will explode. Note You might say that nobody would compute with ∆t>2. But if an atom vibrates with y = −1000000y,then∆t>.0002 will give instability. Leapfrog has a very strict stability limit. Yn+1 = Yn +3Zn and Zn+1 = Zn − 3Yn+1 will explode because ∆t = 3 is too large. 6.3 Applications to differential equations 6-55
A better solution to the problem: Mix the forward approximation with the backward approximation. Y − Y Z Z n+1 n n+1 + n y (t)=z(t) = −∆t Y ∆t Y t 1 2 n+1 1 2 n ≈ ∆ 2 ≡ t = t Zn+1 − Zn Yn+1 + Yn ∆ Z −∆ Z z (t)=−y(t) − 2 1 n+1 2 1 n t = ∆ 2 un+1 un
We then perform this so-called trapezoidal method.(SeeProblem 30.) − Y −∆t 1 ∆t Y − ∆t 2 t Y n+1 1 2 1 2 n 1 1 2 ∆ n = t t = Z ∆ −∆ Z ∆t 2 ∆t 2 Z n+1 2 1 2 1 n 1+ −∆t 1 − n 2 2 u u u n+1 n |eigenvalues|=1 for all ∆t>0 n 6.3 Applications to differential equations 6-56
(Problem 30, Section 6.3) Another good idea for y = −y is the trapezoidal method (half forward/half back): Thismaybethebestwaytokeep(Yn,Zn)exactlyona circle. 1 −∆t/2 Yn 1∆t/2 Yn Trapezoidal +1 = . ∆t/21 Zn+1 −∆t/21 Zn
(a) Invert the left matrix to write this equation as Un+1 = AUn. Show that A is an T orthogonal matrix: A A = I. These points Un never leave the circle. A =(I − B)−1(I + B) is always an orthogonal matrix if BT = −B (See the proof on next page).
(b) (Optional MATLAB) Take 32 steps from U0 =(1, 0) to U32 with ∆t =2π/32. Is U32 = U0? I think there is a small error. 6.3 Applications to differential equations 6-57
Proof: A =(I − B)−1(I + B) ⇒ (I − B)A =(I + B) ⇒ AT(I − B)T =(I + B)T ⇒ AT(I − BT)=(I + BT) ⇒ AT(I + B)=(I − B) ⇒ AT =(I − B)(I + B)−1
⇒ ATA =(I − B)(I + B)−1(I − B)−1(I + B) =(I − B)[(I − B)(I + B)]−1(I + B) =(I − B)[(I + B)(I − B)]−1(I + B) =(I − B)(I − B)−1(I + B)−1(I + B) = I 6.3 Applications to differential equations 6-58
Question: What if the number of eigenvectors is smaller than n? Recall that it is convenient to say that the solution of du/dt = Au is eAtu(0) for some constant vecor u(0). Conveniently, we can presume that eAtu(0) is the solution of du/dt = Au even if A does not have n eigenvectors. Example. Solve y − 2y + y = 0 with initially y(0) = 1 and y (0) = 0. Answer. • Let z = y . Then, the problem is reduced to y = z du y 01 y =⇒ = = = Au z −y z dt z − z = +2 12 A • The solution is still eAtu(0) but eAt = SeΛtS−1 because there is only one eigenvector for A (it doesn’t matter whether we regard this case as “S does not exist” or we regard this case as “S exists but has no inverse”). 6.3 Applications to differential equations 6-59 01 11 11 10 eAt eλIte(A−λI)t eλIte(A−I)t − = and = = 12 11 11 01 A S S Λ
eAt eIt+(A−I)t ? eIte(A−I)t It A − I t A − I t It . . = = (since ( ) (( ) ) = (( ) )( ) See below ) ∞ ∞ 1 1 = Iktk (A − I)ktk k! k! k=0 k=0 ∞ ∞ 1 1 1 1 = tk I (A − I)ktk Can we use eAt = Aktk? k! k! k! k=0 k=0 k=0 where we know Ik = I for every k ≥ 0and(A − I)k =all-zero matrix for k ≥ 2. This gives eAt = etI (I +(A − I)t)=et (I +(A − I)t) and 1 1 1 − t u(t)=eAtu(0) = eAt = et (I +(A − I)t) = et 0 0 −t 2 Note that eAeB, eBeA and eA+B may not be equal except AB = BA! 6.3 Applications to differential equations 6-60
Properities of eAt • The eigenvalues of eAt are eλt for all eigenvalues λ of A. For example, eAt = et (I +(A − I)t) has repeated eigenvalues et, et. At • The eigenvectors of e remain the same as A. 1 For example, eAt = et (I +(A − I)t) has only one eigenvector . 1 • The inverse of eAt always exist (hint: eλt = 0), and is equal to e−At. For example, e−At = e−Ite(I−A)t = e−t(I − (A − I)t). • The transpose of eAt is ∞ T ∞ T 1 1 T eAt = Aktk = (AT)ktk = eA t. k! k! k=0 k=0 T Hence, if AT = −A (skew-symmetric), then eAt T eAt = eA teAt = I;so,eAt is an orthogonal matrix (Recall that a matrix Q is orthogonal if QTQ = I). 6.3 Quick tip 6-61 a a • × A 1,1 1,2 For a 2 2 matrix = a a , the eigenvector corresponding to the 2,1 2,2 a1,2 eigenvalue λ1 is . λ1 − a1,1 a1,2 a1,1 − λ1 a1,2 a1,2 0 (A − λ1I) = = λ1 − a1,1 a2,1 a2,2 − λ1 λ1 − a1,1 ?
where ?= 0 because the two row vectors of (A − λ1I) are parallel.
This is especially useful when solving the differential equation because a1,1 =0 and a1,2 = 1; hence, the eigenvector v1 corresponding to the eigenvalue λ1 is 1 v1 = . λ1 Solve y − 2y + y =0.Letz = y . Then, the problem is reduced to y = z du y 01 y 1 =⇒ = = = Au =⇒ v = dt z − z 1 z = −y +2z 12 1 A 6.3 Problem discussions 6-62
Problem 6.3B and Problem 31: Aconvenientwaytosolved2u/dt = −A2u.
(Problem 31, Section 6.3) The cosine of a matrix is defined like eA,bycopying the series for cos t: 1 1 1 1 cos t =1− t2 + t4 −··· cos A = I − A2 + A4 −··· 2! 4! 2! 4! (a) If Ax = λx, multiply each term times x to find the eigenvalue of cos A. ππ A , , − (b) Find the eigenvalues of = ππ with eigenvectors (1 1) and (1 1). From the eigenvalues and eigenvectors of cos A, find that matrix C =cosA. (c) The second derivative of cos(At)is−A2 cos(At). d2u u(t)=cos(At)u(0) solve = −A2u starting from u (0) = 0. dt2 Construct u(t)=cos(At)u(0) by the usual three steps for that specific A:
1. Expand u(0) = (4, 2) = c1x1 + c2x2 in the eigenvectors. 2. Multiply those eigenvectors by and (instead of eλt).
3. Add up the solution u(t)=c1 x1 + c2 x2. (Hint: See Slide 6-48.) 6.3 Problem discussions 6-63
(Problem 6.3B,√ Section 6.3) Find the eigenvalues and eigenvectors of A and write u(0) = (0, 2 2, 0) as a combination of the eigenvectors. Solve both equations u = Au and u = Au: − − du 21 0 d2u 21 0 du = 1 −21 u and = 1 −21 u with (0) = 0. dt dt2 dt 01−2 01−2 The 1, −2, 1 diagonals make A into a second difference matrix (like a second derivative). u = Au is like the heat equation ∂u/∂t = ∂2u/∂x2. Its solution u(t) will decay (negative eigenvalues). u = Au islikethewaveequation∂2u/∂t2 = ∂2u/∂x2. Its solution will oscillate (imaginary eigenvalues). 6.3 Problem discussions 6-64
d2u Au Thinking over Problem 6.3B and Problem 31: How about solving dt = . 1/2 Solution. The answer should be eA tu(0). Note that A1/2 has the same eigenvec- tors as A but its eigenvalues are the square root of A. λ1/2t e 1 ··· 0 0 λ1/2t A1/2t 0 e 2 ··· 0 −1 e = S S . ...... 1/2 00··· eλn t −21 0 As an example from Problem 6.3B, A = 1 −21. − √ 01 2 The eigenvalues of A are −2and−2 ± 2. So, √ eı 2 t 0 √ √ A1/2t ı − t −1 e = S 0 e 2 2 0 S . √ √ 00eı 2+ 2 t 6.3 Problem discussions 6-65
Final reminder on the approach delivered in the textbook du t • ( ) At u In solving dt = with initial (0), it is convenient to find the solution as u(0) = Sc = c1v1 + ···+ cnvn Λt λ1 λn u(t)=Se c = c1e v1 + ···+ cne vn This is what the textbook always does in Problems. 6.3 Problem discussions 6-66
You should practice these problems by yourself!
du Au − b A Problem 15: How to solve dt = (for invertible )?
−1 (Problem 15, Section 6.3) A particular solution to du/dt = Au − b is up = A b, if A is invertible. The usual solutions to du/dt = Au give un. Find the complete solution u = up + un: du du 10 4 (a) = u − 4(b) = u − . dt dt 11 6
du Au − ectb c A Problem 16: How to solve dt = (for non-eigenvalue of )? (Problem 16, Section 6.3) If c is not an eigenvalue of A, substitute u = ectv and find a particular solution to du/dt = Au−ectb. How does it break down when c is an eigenvalue of A? The “nullspace” of du/dt = Au contains the usual solutions λ t e i xi. Hint: The particular solution vp satisfies (A − cI)vp = b. 6.3 Problem discussions 6-67
Problem 23: eAeB, eBeA and eA+B are not necessarily equal. (They are equal when AB = BA.) (Problem 23, Section 6.3) Generally eAeB is different from eBeA. They are both different from eA+B. Check this using Problems 21-2 and 19. (If AB = BA,all these are the same.) 14 0 −4 10 A = B = A + B = 00 00 00
Problem 27: An interesting brain-storming problem! (Problem 27, Section 6.3) Find a solution x(t), y(t) that gets large as t →∞.To avoid this instability a scientist exchanges the two equations: dx/dt =0x − 4y dy/dt = −2x +2y becomes dy/dt = −2x +2y dx/dt =0x − 4y −22 Now the matrix is stable. It has negative eigenvalues. How can this be? 0 −4 Hint: The matrix is not the right one to be used to describe the transformed linear equations. 6.4 Symmetric matrices 6-68
Definition (Symmetric matrix): A matrix A is symmetric if A = AT.
• When A is symmetric, – its eigenvalues are all reals; – it has n orthogonal eigenvectors (so it can be diagonalized). We can then normalize these orthogonal eigenvectors to obtain an orthonormal basis.
Definition (Symmetric diagonalization): A symmetric matrix A can be written as A = QΛQ−1 = QΛQT with Q−1 = QT where Λ is a diagonal matrix with real eigenvalues at the diagonal. 6.4 Symmetric matrices 6-69
Theorem (Spectral theorem or principal axis theorem) A symmetric matrix A with distinct eigenvalues can be written as A = QΛQ−1 = QΛQT with Q−1 = QT where Λ is a diagonal matrix with real eigenvalues at the diagonal. Proof: • We have proved on Slide 6-16 that a symmetric matrix has real eigenvalues and a skew-symmetric matrix has pure imaginary eigenvalues. • vTv Next, we prove that eigenvectors are orthogonal, i.e., 1 2 =0,forunequal eigenvalues λ1 = λ2.
Av1 = λ1v1 and Av2 = λ2v2 ⇒ λ v Tv Av Tv vTATv vTAv vTλ v = ( 1 1) 2 =( 1) 2 = 1 2 = 1 2 = 1 2 2 which implies λ − λ vTv . ( 1 2) 1 2 =0 vTv λ λ 2 Therefore, 1 2 = 0 because 1 = 2. 6.4 Symmetric matrices 6-70
• A = QΛQT implies that A λ q qT ··· λ q qT = 1 1 1 + + n n n = λ1P1 + ···+ λnPn
Recall from Slide 4-37, the projection matrix Pi onto a unit vector qi is
T −1 T T Pi = qi(qi qi) qi = qiqi So, Ax is the sum of projections of vector x onto each eigenspace.
Ax = λ1P1x + ···+ λnPnx.
The spectral theorem can be extended to a symmetric matrix with repeated eigenvalues. To prove this, we need to introduce the famous Schur’s theorem. 6.4 Symmetric matrices 6-71
Theorem (Schur’s Theorem): Every square matrix A can be factorized into A = QT Q−1 with T upper triangular and Q† = Q−1, where “†” denotes Hermitian transpose operation (and Q and T can generally have complex entries!) Further, if A has real eigenvalues (and hence has real eigenvectors), then Q and T can be chosen real. Proof: The existence of Q and T such that A = QT Q† and Q† = Q−1 can be proved by induction. • The theorem trivially holds when A is a 1 × 1 matrix. • Suppose that Schur’s Theorem is valid for all (n−1)×(n−1) matrices. Then, we claim that Schur’s Theorem will hold true for all n × n matrices. t q – This is because we can take 1,1 and 1 to be the eigenvalue and unit eigenvector of An×n (as there must exist at least one pair of eigenvalue and A p ... p eigenvector for n×n). Then, choose any unit 2, , n such that they together with q span the n-dimensional space, and 1 P q p ··· p = 1 2 n is a (unitary) orthonormal matrix, satisfying P † = P −1. 6.4 Symmetric matrices 6-72
– We derive q† 1 p† † 2 P AP = Aq Ap ··· Apn . 1 2 p† n q† 1 p† 2 = t , q Ap ··· Apn (since A q = t , q ) . 1 1 1 2 1 1 1 1 p† eigenvector eigenvalue n t1,1 t1,2 ···t1,n 0 = . ˜ . A(n−1)×(n−1) 0 t q†Ap where 1,j = 1 j and p†Ap ··· p†Ap 2 2 2 n ˜ . . . A(n−1)×(n−1) = . .. . . p† Ap ··· p† Ap n 2 n n 6.4 Symmetric matrices 6-73
– Since Schur’s Theorem is true for any (n − 1) × (n − 1) matrix, we can find ˜ ˜ Q(n−1)×(n−1) and T(n−1)×(n−1) such that A˜ Q˜ T˜ Q˜† (n−1)×(n−1) = (n−1)×(n−1) (n−1)×(n−1) (n−1)×(n−1) and Q˜† Q˜−1 . (n−1)×(n−1) = (n−1)×(n−1) – Finally, define T Q P 1 0 n×n = ˜ 0 Q(n−1)×(n−1) and ˜ t1,1 [t1,2 ··· t1,n]Q(n−1)×(n−1) T 0 . n×n = . ˜ . T(n−1)×(n−1) 0 They satisfy that 1 0T T Q† Q P †P 1 0 I n×n n×n = Q˜† Q˜ = n×n 0 (n−1)×(n−1) 0 (n−1)×(n−1) 6.4 Symmetric matrices 6-74
and ˜ t1,1 [t1,2 ··· t1,n]Q(n−1)×(n−1) 1 0T Q T P 0 n×n n×n = n×n ˜ . ˜ 0 Q(n−1)×(n−1) . T(n−1)×(n−1) 0 ˜ t1,1 [t1,2 ··· t1,n]Q(n−1)×(n−1) P 0 = n×n . ˜ ˜ . Q(n−1)×(n−1)T(n−1)×(n−1) 0 ˜ t1,1 [t1,2 ··· t1,n]Q(n−1)×(n−1) P 0 = n×n . ˜ ˜ . A(n−1)×(n−1)Q(n−1)×(n−1) 0 t1,1 t1,2 ···t1,n 1 0T P 0 = n×n . ˜ ˜ . A(n−1)×(n−1) 0 Q(n−1)×(n−1) 0 T P P † A P 1 0 A Q = n×n( n×n n×n n×n) ˜ = n×n n×n 0 Q(n−1)×(n−1) 6.4 Symmetric matrices 6-75
• It remains to prove that if A has real eigenvalues, then Q and T can be chosen real, which can be similarly proved by induction. Suppose “If A has real eigenvalues, then Q and T can be chosen real.” is true for all (n − 1) × (n − 1) matrices. Then, the claim should be true for all n × n matrices.
– For a given An×n with real eigenvalues (and real eigenvectors), we can t q p ,...,p certainly have the real 1,1 and 1,andsoare 2 n. This makes real ˜ the resultant A(n−1)×(n−1) and t1,2,...,t1,n.
The eigenvector associated with a real eigenvalue can be chosen real. For complex v,andrealλ and A, by its definition, Av = λv is equivalent to A · Re{v} = A · λ · Re{v} and A · Im{v} = A · λ · Im{v}. 6.4 Symmetric matrices 6-76
˜ – The proof is completed by noting that the eigenvalues of A(n−1)×(n−1), satisfying t1,1 t1,2 ···t1,n P †AP 0 , = . ˜ . A(n−1)×(n−1) 0
are also the eigenvalues of An×n; hence, they are all reals. So, by the validity of the claimed statement for (n − 1) × (n − 1) matrices, the existence of ˜ real Q(n−1)×(n−1) and real T(n−1)×(n−1) satisfying A˜ Q˜ T˜ Q˜† (n−1)×(n−1) = (n−1)×(n−1) (n−1)×(n−1) (n−1)×(n−1) and Q˜† Q˜−1 (n−1)×(n−1) = (n−1)×(n−1) is confirmed. 2 6.4 Symmetric matrices 6-77
Two important facts: • P †AP and A have the same eigenvalues but possibly different eigenvectors. A simple proof is that for v = P v , (P †AP )v = P †Av = P †(λv)=λP †v = λv .
• (Section 6.1: Problem 26 or see slide 6-25) BC det(A)=det = det(B) · det(D) 0 D Thus, t1,1 t1,2 ···t1,n P †AP 0 = . ˜ . A(n−1)×(n−1) 0 ˜ implies that the eigenvalues of A(n−1)×(n−1) should be the eigenvalues of † P An×nP . 6.4 Symmetric matrices 6-78
Theorem (Spectral theorem or principal axis theorem) A symmetric matrix A (not necessarily with distinct eigenvalues) can be written as A = QΛQ−1 = QΛQT with Q−1 = QT where Λ is a diagonal matrix with real eigenvalues at the diagonal. Proof:
• A symmetric matrix certainly satisfies Schur’s Theorem. A = QT Q† with T upper triangular and Q† = Q−1.
• A has real eigenvalues. So, both T and Q are reals. • By AT = A,wehave AT = QT TQT = QT QT = A. This immediately gives T T = T which implies the off-diagonals are zeros. 6.4 Symmetric matrices 6-79
• By AQ = QT ,equivalently, Aq = t , q 1 1 1 1 Aq t q 2 = 2,2 2 . . Aqn = tn,nqn we know that Q is the matrix of eigenvectors (and there are n of them) and T is the matrix of eigenvalues. 2
This result immediately indicates that a symmetric A can always be diagonalized. Summary • A symmetric matrix has real eigenvalues and n real orthogonal eigenvec- tors. 6.4 Problem discussions 6-80
(Problem 21, Section 6.4)(Recommended) This matrix M is skew-symmetric and also . Then all its eigenvalues are pure imaginary and they also have |λ| =1.( Mx = x for every x so λx = x for eigenvectors.) Find all four eigenvalues from the trace of M: 0111 1 −10−11 M = √ can only have eigenvalues ı or − ı. 3 −11 0−1 −1 −11 0
Thinking over Problem 21: The eigenvalues of an orthogonal matrix satisfies |λ| =1. Proof: Qv 2 =(Qv)†Qv = v†Q†Qv = v†v = v 2 implies λv = v . 2
|λ| =1andλ pure imaginary implies λ = ±ı. 6.4 Problem discussions 6-81
(Problem 15, Section 6.4) Show that A (symmetric but complex) has only one line of eigenvectors: ı 1 A = is not even diagonalizable: eigenvalues λ =0, 0. 1 −ı AT = A is not such a special property for complex matrices. The good property is A¯T = A (Section 10.2). Then all λ’s are real and eigenvectors are orthogonal. Thinking over Problem 15: That a symmetric matrix A satisfying AT = A has real eigenvalues and n orthogonal eigenvectors is only true for real symmetric matrices. For a complex matrix A, we need to rephrase it as “A Hermitian symmetric matrix A satisfying A† = A has real eigenvalues and n orthogonal eigenvectors.”
Inner product and norm for complex vectors: v · w = w†v and v 2 = v†v 6.4 Problem discussions 6-82
Proof of the red-color claim (in the previous slide): Suppose Av = λv. Then, Av = λv =⇒ (Av)†v =(λv)†v (i.e., (A¯v¯)Tv =(λ¯v¯)Tv) =⇒ v†A†v = λ∗v†v =⇒ v†Av = λ∗v†v =⇒ v†λv = λ∗v†v =⇒ λ v 2 = λ∗ v 2 =⇒ λ = λ∗ 2 (Problem 28, Section 6.4) For complex matrices, the symmetry AT = A that ¯T produces real eigenvalues changes to A = A.From det(A − λI)=0,findthe eigenvalues of the 2 by 2 “Hermitian” matrix A = 42+ı;2− ı 0 = A¯T.To see why eigenvalues are real when A¯T = A, adjust equation (1) of the text to A¯x¯ = λ¯x¯. (See the green box above.) 6.4 Problem discussions 6-83
(Problem 27, Section 6.4) (MATLAB) Take two symmetric matrices with different 10 81 eigenvectors, say A = and B = . Graph the eigenvalues λ (A + tB) 02 10 1 and λ2(A + tB)for−8 Correction for Problem 27: The problem should be “... Graph the eigenvalues λ1 and λ2 of A + tB for −8 Hint: Draw the pictures of λ1(t)andλ2(t) with respect to t and check λ1(t)−λ2(t). 6.4 Problem discussions 6-84 (Problem 29, Section 6.4) Normal matrices have A¯TA = AA¯T. For real matrices, ATA = AAT includes symmetric, skew symmetric, and orthogonal matrices. Those have real λ, imaginary λ and |λ| = 1. Other normal matrices can have any complex eigenvalues λ. Key point: Normal matrices have n orthonormal eigenvectors. Those vectors xi T probably will have complex components. In that complex case orthogonality means x¯ i xj =0 as Chapter 10 explains. Inner products (dot products) become x¯ Ty. The test for n orthonormal columns in Q becomes Q¯TQ = I instead of QTQ = I. A has n orthonormal eigenvectors (A = QΛQ¯T) if and only if A is normal. (a) Start from A = QΛQ¯T with Q¯TQ = I. Show that A¯TA = AA¯T. (b) Now start from A¯TA = AA¯T. Schur found A = QT Q¯T for every matrix A, with a triangular T . For normal matrices we must show (in 3 steps) that this T will actually be diagonal. Then T =Λ. A QT Q¯T A¯TA AA¯T T¯TT T T¯T Step 1. Put = into = to find = . ab Step 2: Suppose T = has T¯TT = T T¯T. Prove that b =0. 0 d Step 3: Extend Step 2 to size n. A normal triangular T must be diagonal. 6.4 Problem discussions 6-85 Important conclusion from Problem 29: A matrix A has n orthogonal eigenvec- tors if, and only if, A is normal. Definition (Normal matrix): A matrix A is normal if A†A = A†A. Proof: • Schur’s theorem: A = QT Q† with T upper triangular and Q† = Q−1. • A†A = AA† =⇒ T †T = TT† =⇒ T diagonal =⇒ Q is the matrix of eigenvectors by AQ = QT . 2 6.5 Positive definite matrices 6-86 Definition (Positive definite): A symmetric matrix A is positive defi- nite if its eigenvalues are all positive. • The above definition only applies to a symmetric matrix because a non-symmetric matrix may have complex eigenvalues (which cannot be compared with zero)! Properties of positive definite (symmetric) matrices A • Equivalent Definition: (symmetric) is positive definite if, and only if, xTAx > 0 for all non-zero x. xTAx is usually referred to as the quadratic form! The proofs can be found in Problems 18 and 19. (Problem 18, Section 6.5) If Ax = λx then xTAx = .IfxTAx > 0, prove that λ>0. 6.5 Positive definite matrices 6-87 (Problem 19, Section 6.5) Reverse Problem 18 to show that if all λ>0 then xTAx > 0. We must do this for every nonzero x, not just the eigenvectors. So write x as a combination of the eigenvectors and explain why all “cross T T terms” are xi xj =0.Thenx Ax is c x ··· c x T c λ x ··· c λ x c2λ xTx ··· c2 λ xTx > . ( 1 1+ + n n) ( 1 1 1+ + n n n)= 1 1 1 1+ + n n n n 0 Proof (Problems 18): T T – (only if part : Problem 19) Avi = λivi implies vi Avi = λivi vi > 0 for all {λ }n {v }n eigenvalues i i=1 and eigenvectors i i=1. The proof is completed by x n c v {v }n noting that with = i=1 i i and i i=1 orthogonal, n n n T T 2 2 x (Ax)= civi cjλjvj = ci λi vi > 0. i=1 j=1 i=1 T T – (if part : Problem 18) Taking x = vi, we obtain that vi (Avi)=vi (λvi)= 2 λi vi > 0, which implies λi > 0. 2 6.5 Positive definite matrices 6-88 • Based on the above proposition, we can conclude similar to two positive scalars (as “if a and b are both positive, so is a + b”) that Proposition: If A and B are both positive definite, so is A + B. • The next property provides an easy way to construct positive definite matrices. T Proposition: If An×n = Rn×mRm×n and Rm×n has linearly independent columns, then A is positive definite. Proof: – x non-zero =⇒ Rx non-zero. – Then, xT(RTR)x =(Rx)T(Rx)= Rx 2 > 0. 2 6.5 Positive definite matrices 6-89 Since (symmetric) A = QΛQT,wecanchooseR = QΛ1/2QT,whichrequires R to be a square matrix. This observation gives another equivalent definition of positive definite matrices. Equivalent Definition: An×n is positive definite if, and only if, there exists T Rm×n with independent columns such that A = R R. 6.5 Positive definite matrices 6-90 A • Equivalent Definition: n×n is positive definite if, and only if, all pivots are positive. Proof: – (By LDU decomposition,) A = LDLT,whereD is a diagonal matrix with pivots as diagonals, and L is a lower triangular matrix with 1 as diagonals. Then, xTAx xT LDLT x LTx TD LTx = ( ) =( ) ( ) lTx 1 = yTDy where y = LTx = . T lnx d y2 ··· d y2 = 1 1 + + n n d lTx 2 ··· d lTx 2 . = 1 1 + + n n – So, if all pivots are positive, xTAx > 0 for all non-zero x.Conversely,if xTAx > 0 for all non-zero x, which in turns implies yTDy > 0 for all non-zero y (as L is invertible), then pivot di must be positive for the choice of yj = 0 except for j = i . 2 6.5 Positive definite matrices 6-91 An extension of the previous proposition is: Suppose A = BCBT with B invertible and C diagonal. Then, A is positive definite if, and only if, diagonals of C are all positive! See Problem 35. (Problem 35, Section 6.5) Suppose C is positive definite (so yTCy > 0 when- ever y = 0)andA has independent columns (so Ax = 0 whenever x = 0). Apply the energy test to xTATCAx to show that ATCA is positive definite: the crucial matrix in engineering. 6.5 Positive definite matrices 6-92 Equivalent Definition: An×n is positive definite if, and only if, the n upper left determinants (i.e., the (n − 1) leading principle minors and det(A)) are all positive. Definition (Minors, Principle minors and leading principle mi- nors): – A minor of a matrix A is the determinant of some smaller square matrix, obtained by removing one or more of its rows or columns. – The first-order (respectively, second-order, etc) minor is a minor, obtained by removing (n − 1) (respectively, (n − 2),etc)rowsorcolumns. – A principle minor of a matrix is a minor, obtained by removing the same rows and columns. – A leading principle minor of a matrix is a minor, obtaining by removing the last few rows and columns. 6.5 Positive definite matrices 6-93 The below example gives three upper left determinants (or two leading prin- ciple minors and det(A)), det(A1×1), det(A2×2)anddet(A3×3). a1,1 a1,2 a1,3 a2,1 a2,2 a2,3 a3,1 a3,2 a3,3 Proof of the equivalent definition: This can be proved based on Slide 5-18: By LU decomposition, 100··· 0 d u , u , ··· u ,n 1 1 2 1 3 1 l , 10··· 0 0 d u , ··· d ,n 2 1 2 2 3 2 A l l ··· d ··· d = 3,1 3,2 1 0 00 3 3,n ...... ...... ln,1 ln,2 ln,3 ··· 1 00 0··· dn det(Ak×k) =⇒ dk = det(A(k−1)×(k−1)) 2 6.5 Positive definite matrices 6-94 n Equivalent Definition: An×n is positive definite if, and only if, all (2 − 2) principle minors and det(A) are positive. • Basedontheaboveequivalent definitions, a positive definite matrix cannot have either zero or negative value in its main diagonals. See Problem 16. (Problem 16, Section 6.5) A positive definite matrix cannot have a zero (or even worse, a negative number) on its diagonal. Show that this matrix fails to have xTAx > 0: x 411 1 x1 x2 x3 102 x2 is not positive when (x1,x2,x3)=( ,,). 125 x3 With the above discussion of positive definite matrices, we can proceed to de- fine similar notions like negative definite, positive semidefinite, negative semidefinite matrices. 6.5 Positive semidefinite (or nonnegative definite) 6-95 Definition (Positive semidefinite): A symmetric matrix A is positive semidefinite if its eigenvalues are all nonnegative. Equivalent Definition: A is positive semidefinite if, and only if, xTAx≥0for all non-zero x. Equivalent Definition: An×n is positive semidefinite if, and only if, there exists T Rm×n (perhaps with dependent columns) such that A = R R. Equivalent Definition: An×n is positive semidefinite if, and only if, all pivots are nonnegative. n Equivalent Definition: An×n is positive semidefinite if, and only if, all (2 −2) principle minors and det(A)arenon-negative. 1 00 Example. Non-“positive semidefinite” A=0 0 0 has (23−2) principle minors. 00−1 1 0 1 0 0 0 det , det , det , det 1 , det 0 , det −1 . 0 0 0 −1 0 −1 Note: It is not sufficient to define the positive semidefinite based on the non- negativity of leading principle minors and det(A). Check the above example. 6.5 Negative definite, negative semidefinite, indefinite 6-96 We can similarly define negative definite. Definition (Negative definite): A symmetric matrix A is negative def- inite if its eigenvalues are all negative. Equivalent Definition: A is negative definite if, and only if, xTAx<0 for all non-zero x. Equivalent Definition: An×n is negative definite if, and only if, there exists T Rm×n with independent columns such that A = −R R. Equivalent Definition: An×n is negative definite if, and only if, all pivots are negative. Equivalent Definition: An×n is negative definite if, and only if, all odd-order leading principle minors are negative and all even-order leading principle minors and det(A)arepositive. Equivalent Definition: An×n is negative definite if, and only if, all odd-order principle minors are negative and all even-order principle minors and det(A)are positive. 6.5 Negative definite, negative semidefinite, indefinite 6-97 We can also similarly define negative semidefinite. Definition (Negative definite): A symmetric matrix A is negative semidefinite if its eigenvalues are all non-positive. Equivalent Definition: A is negative semidefinite if, and only if, xTAx≤0for all non-zero x. Equivalent Definition: An×n is negative semidefinite if, and only if, there T exists Rm×n (possibly with dependent columns) such that A = −R R. Equivalent Definition: An×n is negative semidefinite if, and only if, all pivots are non-positive. Equivalent Definition: An×n is negative definite if, and only if, all odd-order principle minors are non-positive and all even-order principle minors and det(A) are non-negative. Note: It is not sufficient to define the negative semidefinite based on the non- positivity of leading principle minors and det(A). Finally, if some of the eigenvalues of a symmetric matrix are positive, and some are negative, the matrix will be referred to as indefinite. 6.5 Quadratic form 6-98 • For a positive definite matrix A2×2, xTAx xT Q QT x QTx T QTx = ( Λ ) =( ) Λ( ) qTx yT y y QTx 1 = Λ where = = qTx 2 λ qTx 2 λ qTx 2 = 1 1 + 2 2 So, xTAx = c gives an ellipse if c>0. 54 Example. A = 45 1 1 1 1 – λ1 =9andλ2 =1,andq = √ and q = √ . 1 2 1 2 2 −1 x x 2 x − x 2 T T 2 T 2 1 + 2 1 2 – x Ax = λ1 (q x) + λ2 (q x) =9 √ + √ 1 2 2 2 qTx is an axis perpendicular to q . (Not along q as the textbook said.) – 1 1 1 qTx q . 2 is an axis perpendicular to 2 Tip: A = QΛQT is called the principal axis theorem (cf. Slide 6-69) because xTAx = yTΛy = c is an ellipse with axes along the eigenvectors. 6.5 Problem discussions 6-99 (Problem 13, Section 6.5) Find a matrix with a>0andc>0anda + c>2b that has a negative eigenvalue. ab Missing point in Problem 13: Thematrixtobedeterminedisoftheshape bc. 6.6 Similar matrices 6-100 Definition (Similarity): Two matrices A and B are similar if B = M −1AM for some invertible M. Theorem: A and M −1AM have the same eigenvalues. Proof: For eigenvalue λ and eigenvector v of A,definev = M −1v.Wethenderive Bv =(M −1AM)v = M −1AM(M −1v)=M −1Av = M −1λv = λM −1v = λv So, λ is also the eigenvalue of M −1AM (associated with eigenvector v = M −1v). 2 Notes: • The LU decomposition over a symmetric matrix A gives A = LDLT but A and D (pivot matrix) apparently may have different eigenvalues. Why? Because (LT)−1 = L. Similarity is defined based on M −1, not M T. • The converse to the above theorem is wrong!. In other words, we cannot say that “two matrices with the same eigenvalues are similar.” 01 00 Example. A = and B = have the same eigenvalues 0, 0, but 00 00 they are not similar. 6.6 Jordan form 6-101 • Why introducing matrix similarity? Answer: We can then extend the diagonalization of a matrix with less than n eigenvectors to the Jordan form. • For an un-diagonalizable matrix A, we can find invertible M such that A = MJM−1, where J 0 ··· 0 1 0 J ··· 0 J = 2 ...... 00··· Js and s is the number of distinct eigenvalues, and the size of Ji is equal to the multiplicity of eigenvalue λi,andJi is of the form λi 10··· 0 0 λi 1 ··· 0 J λ ··· . i = 00 i . ...... 1 000··· λi 6.6 Jordan form 6-102 • The idea behind the Jordan form is that A is similar to J. • Based on this, we can now compute J 100 0 ··· 0 1 0 J 100 ··· 0 A100 = MJ100M −1 = M 2 M −1. ...... 100 00··· Js • How to find Mi corresponding to Ji? Answer: By AM = MJ with M = M1 M2 ··· Ms ,weknow AMi = MiJi for 1 ≤ i ≤ s. Specifically, assume that the multiplicity of λi is two. Then, λ A (1) (2) (1) (2) i 1 vi vi = vi vi 0 λi which is equivalent to (1) (1) (1) Av = λiv (A − λiI)v =0 i i ⇒ i (2) (1) (2) = (2) (1) Avi = vi + λivi (A − λiI)vi = vi 6.6 Problem discussions 6-103 (Problem 14, Section 6.6) Prove that AT is always similar to A (we know the λ’s are the same): −1 T 1. For one Jordan block Ji:FindMi so that Mi JiMi = Ji . J J M M −1JM J T 2. For any with blocks i: Build 0 from blocks so that 0 0 = . 3. For any A = MJM−1: Show that AT is similar to J T and so to J and to A. 6.6 Problem discussions 6-104 Thinking over Problem 14: AT and A are always similar. Answer: • It can be easily checked that u1,1 u1,2 ··· u1,n 0 ··· 01 un,n ··· un,2 un,1 0 ··· 01 . . . . . . u2,1 u2,2 ··· u2,n . ··· 10 . .. . . . ··· 10 . . . . = . . . . . . . . . .. . . ··· . . . ··· u2,2 u2,1 . ··· . . un, un, ··· un,n 1 ··· 00 u ,n ··· u , u , 1 ··· 00 1 2 1 1 2 1 1 un,n ··· un,2 un,1 . . . . − . .. . . = V 1 . V . ··· u2,2 u2,1 u1,n ··· u1,2 u1,1 where here V represents a matrix with zero entries except vi,n+1−i =1for 1 ≤ i ≤ n.WenotethatV −1 = V T = V . So, we can use the proper size of Mi = V to obtain T −1 Ji = Mi JiMi. 6.6 Problem discussions 6-105 Define M 0 ··· 0 1 0 M ··· 0 M = 2 ...... 00··· Ms We have J T = MJM−1, where J 0 ··· 0 1 0 J ··· 0 J = 2 ...... 00··· Js and s is the number of distinct eigenvalues, and the size of Ji is equal to the multiplicity of eigenvalue λi,andJi is of the form λi 10··· 0 0 λi 1 ··· 0 J λ ··· . i = 00 i . ...... 1 000··· λi 6.6 Problem discussions 6-106 Finally, we know – A is similar to J; – AT is similar to J T; – J T is similar to J. So, AT is similar to A. 2 (Problem 19, Section 6.6) If A is 6 by 4 and B is 4 by 6, AB and BA have different sizes. But with blocks, I −A AB 0 IA 00 M −1FM = = = G. 0 I B 0 0 I BBA (a) What sizes are the four blocks (the same four sizes in each matrix)? (b) This equation is M −1FM = G,soF and G have the same eigenvalues. F has the 6 eigenvalues of AB plus 4 zeros; G has the 4 eigenvalues of BA plus 6 zeros. AB has the same eigenvalues as BA plus zeros. 6.6 Problem discussions 6-107 Thinking over Problem 19: Am×nBn×m and Bn×mAm×n have the same eigenval- ues except for additional (m − n) zeros. Solution: The example shows the usefulness of the similarity. Im×m −Am×n Am×nBn×m 0m×n Im×m Am×n 0m×m 0m×n • = 0n×m In×n Bn×m 0n×n 0n×m In×n Bn×m Bn×mAm×n Am×nBn×m 0m×n 0m×m 0m×n • Hence, and are similar and have the Bn×m 0n×n Bn×m Bn×mAm×n same eigenvalues. • From Problem 26 in Section 6.1 (see Slide 6-25), the desired claim (as indicated above by red-color text) is proved. 2 6.6 Problem discussions 6-108 (Problem 21, Section 6.6) If J is the 5 by 5 Jordan block with λ =0,findJ 2 and count its eigenvectors (are these the eigenvectors?) and find its Jordan form (there will be two blocks). Problem 21 : Find the Jordan form of A = J 2,where 01000 00100 J . = 00010 00001 00000 Solution. 00100 00010 • A J 2 A = = 00001, and the eigenvalues of are0,0,0,0,0. 00000 00000 6.6 Problem discussions 6-109 v v v 1,1 3,1 1,1 v v v 2,1 4,1 2,1 • Av A v v ⇒ v v v ⇒ v 1 = 3,1 = 5,1 = 0 = 3,1 = 4,1 = 5,1 =0 = 1 = 0 . v4,1 0 0 v5,1 0 0 v v v 1,2 3,2 1,2 v v v3,2 = v1,1 v 2,2 4,2 2,2 • Av2 = A v , = v , = v1 =⇒ v , = v , =⇒ v2 = v , 3 2 5 2 4 2 2 1 1 1 v , 0 v , 4 2 v5,2 =0 2 1 v5,2 0 0 v v v 1,3 3,3 v , = v , 1,3 3 3 1 2 v2,3 v4,3 v2,3 v , = v , • Av A v v v ⇒ 4 3 2 2 ⇒ v v 3 = 3,3 = 5,3 = 2 = = 3 = 1,2 v5,3 = v1,1 v4,3 0 v2,2 v v5,3 0 2,1 =0 v1,1 6.6 Problem discussions 6-110 • To summarize, v v v 1,1 1,2 1,3 v v v 2,1 2,2 2,3 v ⇒ v v ⇒ v , v v 1 = 0 = 2 = 1,1 = If 2,1 =0 then 3 = 1,2 0 v2,1 v2,2 0 0 v1,1 (1) (1) 1 v , v , 1 2 1 3 v(1) v(1) 0 2,2 2,3 (1) (1) (1) (1) v = 0 =⇒ v = 1 =⇒ v = v , 1 2 3 1 2 v(1) 0 0 2,2 0 0 1 (2) 0 v , 1 2 v(2) 1 2,2 (2) (2) v = 0 =⇒ v = 0 1 2 0 1 0 0 6.6 Problem discussions 6-111 v(1) v(2) v(1) Sincewewishtochooseeachof 2 and 2 to be orthogonal to both 1 and v(2) 1 , v(1) v(1) v(2) v(2) , 1,2 = 2,2 = 1,2 = 2,2 =0 i.e., (1) 1 0 v , 1 3 v(1) 0 0 2,3 (1) (1) (1) v = 0 =⇒ v = 1 =⇒ v = 0 1 2 3 0 0 0 0 0 1 0 0 1 0 (2) (2) v = 0 =⇒ v = 0 1 2 0 1 0 0 v(1) v(1) v(2) v(1) v(2) Since we wish to choose 3 to be orthogonal to all of 1 , 1 , 2 and 2 , v(1) v(1) . 1,3 = 2,3 =0 6.6 Problem discussions 6-112 As a result, 01000 00100 A v(1) v(1) v(1) v(2) v(2) = v(1) v(1) v(1) v(2) v(2) 00000 1 2 3 1 2 1 2 3 1 2 00001 00000 2 • v(1), v(1), v(1), v(2) v(2) A Av λv Are all 1 2 3 1 and 2 eigenvectors of (satisfying = )? Hint: A does not have 5 eigenvectors. More specifically, A only has 2 eigen- vectors? Which two? Think of it. • Hence, the problem statement may not be accurate as I mark “(are these the eigenvectors?)”. 6.7 Singular value decomposition (SVD) 6-113 • Now we can represent all square matrix in the Jordan form: A = M −1JM. • What if the matrix Am×n is not a square one? – Problem: Av = λv is not possible! Specifically, v cannot have two different dimensionalities. Am×nvn×1 = λvn×1 infeasible if n = m. – So, we can only have Am×nvn×1 = σum×1. 6.7 Singular value decomposition (SVD) 6-114 • If we can find enough numbers of orthogonal u and v such that σ ··· 00··· 0 1 ...... ··· σ ··· A v v ··· v u u ··· u 0 r 0 0 . m×n 1 2 n n×n = 1 2 m m×m ··· ··· 0 00 0 V U ...... ··· ··· 0 00 0 m×n where r is the rank of A, then we can perform the so-called singular value decomposition (SVD) A = UΣV −1. Note again that the required enough number of orthogonal u and v may be impossible when A has repeated “eigenvalues.” 6.7 Singular value decomposition (SVD) 6-115 • If V is chosen to be an orthogonal matrix satisfying V −1 = V T,thenwehave the so-called reduced SVD A = UΣV T r T = σiuivi i=1 σ ··· vT 1 0 1 u ··· u . .. . . = 1 r m×r . . . . ··· σ vT 0 r r×r r r×n • Usually, we prefer to choose an orthogonal V (as well as orthogonal U). • In the sequel, we will assume the found U and V are orthogonal matrices in the first place; later, we will confirm that orthogonal U and V can always be found. 6.7 How to determine U, V and {σi}? 6-116 T • Am×n = Um×mΣm×nVn×n T T T T =⇒ An×mAm×n = Um×mΣm×nVn×n Um×mΣm×nVn×n T T T 2 T = Vn×nΣn×mUm×mUm×mΣm×nVn×n = Vn×nΣn×nVn×n. So, V is the (orthogonal) matrix of n eigenvectors of symmetric ATA. T • Am×n = Um×mΣm×nVn×n T T T T =⇒ Am×nAn×m = Um×mΣm×nVn×n Um×mΣm×nVn×n T T T 2 T = Um×mΣm×nVn×nVn×nΣn×mUm×m = Um×mΣm×mUm×m. So, U is the (orthogonal) matrix of m eigenvectors of symmetric AAT. • Remember that ATA and AAT have the same eigenvalues except for additional (m − n) zeros. Section 6.6, Problem 19 (See Slide 6-107): Am×nBn×m and Bn×mAm×n have the same eigenvalues except for additional (m − n) zeros. In fact, there are only r non-zero eigenvalues for ATA and AAT, which satisfy 2 λi = σi , where 1 ≤ i ≤ r. 6.7 How to determine U, V and {σi}? 6-117 Example (Problem 21 in Section 6.6): Find the SVD of A = J 2,where 01000 00100 J . = 00010 00001 00000 Solution. 00000 10000 00000 01000 • ATA AAT = 00100 and = 00100. 00010 00000 00001 00000 00010 10000 10000 00001 01000 01000 • V U 2 So = 10000 and = 00100 for Λ = Σ = 00100. 01000 00010 00000 00100 00001 00000 6.7 How to determine U, V and {σi}? 6-118 • Hence, 00100 10000 10000 00100 00010 01000 01000 00010 00001 = 00100 00100 00001 00000 00010 00000 10000 00000 00001 00000 01000 A U Σ V T 2 We may compare this with the Jordan form. 00100 10000 01000 10000 00010 00010 00100 00100 00001 = 01000 00000 00001 00000 00001 00001 01000 00000 00100 00000 00010 A M J M−1 6.7 How to determine U, V and {σi}? 6-119 Remarks on the previous example 10000 01000 • 2 In the above example, we only know Λ = Σ = 00100. 00000 00000 ±10 000 0 ±1000 • ± Hence, Σ = 00 100. 00000 00000 • However, by Avi = σiui =(−σi)(−ui), we can always choose positive σi by adjusting the sign of ui. 6.6 SVD 6-120 Important notes: • In terminology, σi,where1≤ i ≤ r, is called the singular value. Hence, the name of singular value decomposition is used. • The singular value is always non-zero (even though the eigenvalues of A can be zeros). 00100 10000 00010 01000 A Example. = 00001 but Σ = 00100. 00000 00000 00000 00000 • It is good to know that: The first r columns of V ∈ row space R(A)andarebases of R(A) The last (n − r) columns of V ∈ null space N(A)andarebases of N(A) The first r columns of U ∈ column space C(A)andarebases of C(A) The last (m − r) columns of U ∈ left null space N(AT)andarebases of N(AT) How useful the above facts are can be seen from the next example. 6.6 SVD 6-121 A x yT Example. Find the SVD of 4×3 = 4×1 1×3. Solution. σ 00 1 000 • A isarank-1matrix.So,r =1andΣ= ,whereσ = 0. (Actually, 000 1 000 σ1 =1.) y • The base of the row space is v1 = y ; pick up perpendicular v2 and v3 that span the null space. x • The base of the column space is u1 = x ; pick up perpendicular u2, u3, u4 that span the left null space. • The SVD is then T x y row space u u u y 2 3 4 A4×3 = x Σ4×3 vT left null space 2 4×4 null space column space vT 3 3×3 2