Stat 309: Mathematical Computations I Fall 2013 Lecture 5
Date: October 12, 2013, version 1.0. Comments, bug reports: [email protected]

1. Jordan canonical form

• if A is not diagonalizable and we want something like a diagonalization, then the best we can do is a Jordan canonical form or Jordan normal form, where we get
    A = XJX^{-1}   (1.1)
– the matrix J has the following characteristics
  ∗ it is not diagonal, but it is the next best thing to diagonal, namely bidiagonal, i.e. only the entries a_{ii} and a_{i,i+1} can be non-zero; every other entry of J is 0
  ∗ the diagonal entries of J are precisely the eigenvalues of A, counted with multiplicity
  ∗ the superdiagonal entries a_{i,i+1} are as simple as they can be: each can take one of two possible values, a_{i,i+1} = 0 or 1
  ∗ if a_{i,i+1} = 0 for all i, then J is in fact diagonal and (1.1) reduces to the eigenvalue decomposition
– the matrix J is more commonly viewed as a block diagonal matrix
    J = \begin{bmatrix} J_1 & & \\ & \ddots & \\ & & J_k \end{bmatrix}
  ∗ each block J_r, for r = 1, \dots, k, has the form
    J_r = \begin{bmatrix} \lambda_r & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_r \end{bmatrix}
    where J_r is n_r × n_r
  ∗ clearly \sum_{r=1}^{k} n_r = n
– the set of column vectors of X is called a Jordan basis of A
• in general the Jordan basis X includes all eigenvectors of A but also additional vectors that are not eigenvectors of A
• the Jordan canonical form provides valuable information about the eigenvalues of A
• the values λ_j, for j = 1, \dots, k, are the eigenvalues of A
• for each distinct eigenvalue λ, the number of Jordan blocks having λ as a diagonal element equals the number of linearly independent eigenvectors associated with λ; this number is called the geometric multiplicity of the eigenvalue λ
• the sum of the sizes of all of these blocks is called the algebraic multiplicity of λ
• we now consider J_r's eigenvalues,
    λ(J_r) = λ_r, \dots, λ_r
  where λ_r is repeated n_r times; but because
    J_r - \lambda_r I = \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix}
is a matrix of rank n_r − 1, it follows that the homogeneous system (J_r − λ_r I)x = 0 has only one solution (up to a scalar multiple), and therefore there is only one eigenvector associated with this Jordan block
• the unique unit vector (up to sign) that solves (J_r − λ_r I)x = 0 is e_1 = [1, 0, \dots, 0]^T
• now consider the matrix
    (J_r - \lambda_r I)^2 = \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & 1 \\ & & & \ddots & 0 \\ & & & & 0 \end{bmatrix}
• it is easy to see that (J_r − λ_r I)^2 e_2 = 0
• continuing in this fashion, we conclude that
    (J_r - \lambda_r I)^k e_k = 0, \quad k = 1, \dots, n_r
• the Jordan form can be used to easily compute powers of a matrix
• for example, A^2 = XJX^{-1}XJX^{-1} = XJ^2X^{-1} and, in general, A^k = XJ^kX^{-1}
• due to its structure, it is easy to compute powers of a Jordan block J_r:
    J_r^k = \begin{bmatrix} \lambda_r & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_r \end{bmatrix}^k = (\lambda_r I + K)^k, \quad K = \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix}
    = \sum_{j=0}^{k} \binom{k}{j} \lambda_r^{k-j} K^j
  which yields, for k ≥ n_r − 1,
    J_r^k = \begin{bmatrix} \lambda_r^k & \binom{k}{1}\lambda_r^{k-1} & \binom{k}{2}\lambda_r^{k-2} & \cdots & \binom{k}{n_r-1}\lambda_r^{k-(n_r-1)} \\ & \ddots & \ddots & & \vdots \\ & & \ddots & \ddots & \binom{k}{2}\lambda_r^{k-2} \\ & & & \ddots & \binom{k}{1}\lambda_r^{k-1} \\ & & & & \lambda_r^k \end{bmatrix}   (1.2)
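The closed form (1.2) is easy to sanity-check numerically. Here is a minimal pure-Python sketch (the helper names are ours, not from the notes) that compares it against direct repeated multiplication for a 3 × 3 Jordan block with λ_r = 2 and k = 5:

```python
from math import comb

def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def jordan_block_power(lam, n, k):
    """Closed form (1.2): entry (i, j) is C(k, j-i) * lam^(k-(j-i)) for j >= i."""
    return [[comb(k, j - i) * lam ** (k - (j - i)) if j >= i else 0
             for j in range(n)]
            for i in range(n)]

lam, n, k = 2, 3, 5
J = jordan_block(lam, n)
P = J
for _ in range(k - 1):          # direct repeated multiplication: J^k
    P = matmul(P, J)

assert P == jordan_block_power(lam, n, k)
print(P)    # [[32, 80, 80], [0, 32, 80], [0, 0, 32]]
```

Note that the binomial coefficient C(k, j − i) vanishes once j − i exceeds k, which is why the sum over powers of K truncates.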
• for example,
    \begin{bmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{bmatrix}^3 = \begin{bmatrix} \lambda^3 & 3\lambda^2 & 3\lambda \\ 0 & \lambda^3 & 3\lambda^2 \\ 0 & 0 & \lambda^3 \end{bmatrix}
• we now consider an application of the Jordan canonical form
– consider the system of differential equations
    y'(t) = Ay(t), \quad y(t_0) = y_0
– using the Jordan form, we can rewrite this system as
    y'(t) = XJX^{-1}y(t)
– multiplying through by X^{-1} yields
    X^{-1}y'(t) = JX^{-1}y(t)
  which can be rewritten as
    z'(t) = Jz(t)
  where z(t) = X^{-1}y(t)
– this new system has the initial condition
    z(t_0) = z_0 = X^{-1}y_0
– if we assume that J is a diagonal matrix (which is true in the case where A has a full set of linearly independent eigenvectors), then the system decouples into scalar equations of the form
    z_i'(t) = \lambda_i z_i(t),
  where λ_i is an eigenvalue of A
– this equation has the solution
    z_i(t) = e^{\lambda_i(t - t_0)} z_i(t_0),
  and therefore the solution to the original system is
    y(t) = X \begin{bmatrix} e^{\lambda_1(t-t_0)} & & \\ & \ddots & \\ & & e^{\lambda_n(t-t_0)} \end{bmatrix} X^{-1} y_0
• the Jordan canonical form suffers, however, from one major defect that makes it useless in practice: in general it cannot be computed in finite precision, i.e. in the presence of rounding errors, a result of Golub and Wilkinson
• that is why you won't find a Matlab function for the Jordan canonical form

2. spectral radius

• the matrix 2-norm is also known as the spectral norm
• the name is connected to the fact that the norm is given by the square root of the largest eigenvalue of A^T A, i.e. the largest singular value of A (more on this later)
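Since the 2-norm is the largest singular value, it can be estimated without any library support by power iteration on A^T A. A small sketch (our own helper, assuming a square real matrix stored as a list of rows):

```python
from math import sqrt

def two_norm_estimate(A, iters=200):
    """Estimate ||A||_2 = sqrt(lambda_max(A^T A)) by power iteration on A^T A."""
    n = len(A)
    # B = A^T A, a symmetric positive semidefinite matrix
    B = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    x = [1.0] * n                        # starting vector
    for _ in range(iters):
        y = [sum(B[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]        # renormalize each step
    # Rayleigh quotient x^T B x approximates lambda_max(A^T A)
    lam = sum(x[i] * B[i][j] * x[j] for i in range(n) for j in range(n))
    return sqrt(lam)

A = [[1.0, 2.0], [0.0, 2.0]]
sigma = two_norm_estimate(A)
# exact value: sqrt((9 + sqrt(65))/2) ≈ 2.9208, the largest singular value of A
print(sigma)
```

Power iteration converges here because the two eigenvalues of A^T A are well separated; this is only a sketch, not a robust SVD routine.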
• in general, the spectral radius ρ(A) of a matrix A ∈ C^{n×n} is defined in terms of its largest eigenvalue:
    \rho(A) = \max\{|\lambda_i| : Ax_i = \lambda_i x_i,\ x_i \neq 0\}
• note that the spectral radius does not define a norm on C^{n×n}
• for example, the non-zero matrix
    J = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
  has ρ(J) = 0 since both its eigenvalues are 0
• there are some relationships between the norm of a matrix and its spectral radius
• the easiest one is that
    \rho(A) \leq \|A\|
  for any matrix norm that satisfies the inequality ‖Ax‖ ≤ ‖A‖‖x‖ for all x ∈ C^n
– here's a proof: let Ax_i = λ_i x_i with x_i ≠ 0; taking norms,
    \|Ax_i\| = \|\lambda_i x_i\| = |\lambda_i| \|x_i\|
  therefore
    |\lambda_i| = \frac{\|Ax_i\|}{\|x_i\|} \leq \|A\|
  since this holds for any eigenvalue of A, it follows that \max_i |\lambda_i| = \rho(A) \leq \|A\|
– in particular this is true for any operator norm
– this is in general not true for norms that do not satisfy the inequality ‖Ax‖ ≤ ‖A‖‖x‖ (thanks to Likai Chen for pointing out); for example, the matrix
    A = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}
  is orthogonal and therefore ρ(A) = 1, but ‖A‖_{H,∞} = 1/\sqrt{2}, and so ρ(A) > ‖A‖_{H,∞}
– exercise: show that any eigenvalue of a unitary or an orthogonal matrix must have absolute value 1
• on the other hand, the following characterization is true for any matrix norm, even the inconsistent ones:
    \rho(A) = \lim_{m \to \infty} \|A^m\|^{1/m}
• we can also get an upper bound for any particular matrix (but not for all matrices)

Theorem 1. For every ε > 0, there exists an operator norm ‖·‖_α such that ‖A‖_α ≤ ρ(A) + ε. The norm is dependent on A and ε.

• this result suggests that the largest eigenvalue of a matrix can be easily approximated
• here is an example: let
    A = \begin{bmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{bmatrix}
– the eigenvalues of this matrix, which arises frequently in numerical methods for solving differential equations, are known to be
    \lambda_j = 2 + 2\cos\frac{j\pi}{n+1}, \quad j = 1, 2, \dots, n
  the largest eigenvalue satisfies
    |\lambda_1| = 2 + 2\cos\frac{\pi}{n+1} \leq 4
  and ‖A‖_∞ = 4, so in this case the ∞-norm provides an excellent approximation
– on the other hand, suppose
    A = \begin{bmatrix} 1 & 10^6 \\ 0 & 1 \end{bmatrix}
  we have ‖A‖_∞ = 10^6 + 1 but ρ(A) = 1, so in this case the norm yields a poor approximation; however, suppose
    D = \begin{bmatrix} \varepsilon & 0 \\ 0 & 1 \end{bmatrix}
  then
    DAD^{-1} = \begin{bmatrix} 1 & 10^6\varepsilon \\ 0 & 1 \end{bmatrix}
  and ‖DAD^{-1}‖_∞ = 1 + 10^6 ε, which, for sufficiently small ε, yields a much better approximation to ρ(DAD^{-1}) = ρ(A)
• if ‖A‖ < 1 for some submultiplicative norm, then ‖A^m‖ ≤ ‖A‖^m → 0 as m → ∞
• since ‖A‖ is a continuous function of the entries of A, it follows that A^m → O, i.e. every entry of A^m goes to 0
• however, if ‖A‖ > 1, it does not follow that ‖A^m‖ → ∞
• for example, suppose
    A = \begin{bmatrix} 0.99 & 10^6 \\ 0 & 0.99 \end{bmatrix}
  in this case ‖A‖_∞ > 1, but we claim that because ρ(A) < 1, we have A^m → O and so ‖A^m‖ → 0
• let us prove this more generally; in fact, we claim the following

Lemma 1. \lim_{m \to \infty} A^m = O if and only if ρ(A) < 1.

Proof. (⇒) Let Ax = λx with x ≠ 0. Then A^m x = λ^m x. Taking limits,
    \Bigl(\lim_{m \to \infty} \lambda^m\Bigr) x = \lim_{m \to \infty} \lambda^m x = \lim_{m \to \infty} A^m x = \Bigl(\lim_{m \to \infty} A^m\Bigr) x = Ox = 0.
Since x ≠ 0, we must have \lim_{m \to \infty} \lambda^m = 0 and thus |λ| < 1. Hence ρ(A) < 1.
(⇐) Since ρ(A) < 1, there exists some operator norm ‖·‖_α such that ‖A‖_α < 1 by Theorem 1. So ‖A^m‖_α ≤ ‖A‖_α^m → 0 and so A^m → O. Alternatively, this part may also be proved directly via the Jordan form of A and (1.2) (without using Theorem 1).

• in Homework 1 we will see that if ‖A‖ < 1 for some operator norm, then I − A is nonsingular
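Both Lemma 1 and the limit characterization ρ(A) = lim ‖A^m‖^{1/m} can be observed numerically on the matrix above with ‖A‖_∞ > 1 but ρ(A) = 0.99. A pure-Python sketch (our own helpers): the powers first grow enormously before the decay predicted by the lemma takes over.

```python
def matmul2(A, B):
    """Multiply two 2 x 2 matrices."""
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def inf_norm(A):
    """Infinity norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)

A = [[0.99, 1e6], [0.0, 0.99]]      # ||A||_inf = 10^6 + 0.99, rho(A) = 0.99 < 1
P = [[1.0, 0.0], [0.0, 1.0]]        # identity; P will hold A^m
norms = {}
for m in range(1, 5001):
    P = matmul2(P, A)
    if m in (1, 100, 5000):
        norms[m] = inf_norm(P)

assert norms[100] > 1e6             # ||A^m||_inf first grows enormously ...
assert norms[5000] < 1e-9           # ... but eventually decays to 0 (Lemma 1)
# Gelfand-type limit: ||A^m||^(1/m) approaches rho(A) = 0.99
assert abs(norms[5000] ** (1 / 5000) - 0.99) < 0.01
print(norms)
```

The transient growth (here up to about 10^7 before decay) is exactly the gap between ‖A‖ and ρ(A) that the DAD^{-1} rescaling in the example above is designed to shrink.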