
A useful basis for defective matrices: Jordan vectors and the Jordan form

S. G. Johnson, MIT 18.06
Created Spring 2009; updated May 8, 2017

Abstract

Many textbooks and lecture notes can be found online deriving the existence of something called a "Jordan form" of a matrix, based on "generalized eigenvectors" (or "Jordan vectors" and "Jordan chains"). In these notes, instead, I omit most of the formal derivations and focus on the consequences of the Jordan vectors for how we understand matrices. What happens to our traditional eigenvector-based pictures of things like A^n or e^{At} when diagonalization of A fails? The answer, for any matrix function f(A), turns out to involve the derivative of f!

1 Introduction

So far in the eigenproblem portion of 18.06, our strategy has been simple: find the eigenvalues λ_i and the corresponding eigenvectors x_i of a matrix A, expand any vector of interest u in the basis of these eigenvectors (u = c_1 x_1 + ... + c_n x_n), and then any operation with A can be turned into the corresponding operation with λ_i acting on each eigenvector. So, A^k becomes λ_i^k, e^{At} becomes e^{λ_i t}, and so on. But this relied on one key assumption: we require the n × n matrix A to have a basis of n independent eigenvectors. We call such a matrix A diagonalizable.

Many important cases are always diagonalizable: matrices with n distinct eigenvalues λ_i, real symmetric or orthogonal matrices, and complex Hermitian or unitary matrices. But there are rare cases where A does not have a complete basis of n eigenvectors: such matrices are called defective. For example, consider the matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

This has characteristic polynomial λ^2 − 2λ + 1, with a repeated root (a single eigenvalue) λ_1 = 1. (Equivalently, since A is upper triangular, we can read the determinant of A − λI, and hence the eigenvalues, off the diagonal.) However, it has only a single independent eigenvector, because

$$A - I = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

is obviously rank 1, and has a one-dimensional nullspace spanned by x_1 = (1, 0).

Defective matrices arise rarely in practice, and usually only when one arranges for them intentionally, so we have not worried about them up to now. But it is important to have some idea of what happens when you have a defective matrix: partially for computational purposes, but also to understand conceptually what is possible. For example, what will be the result of

$$A^k \begin{pmatrix} 1 \\ 2 \end{pmatrix} \quad \text{or} \quad e^{At} \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

for the defective matrix A above, since (1, 2) is not in the span of the (single) eigenvector of A? For diagonalizable matrices, this would grow as λ^k or e^{λt}, respectively, but what about defective matrices? Although matrices in real applications are rarely exactly defective, it sometimes happens (often by design!) that they are nearly defective, and we can think of exactly defective matrices as a limiting case. (The book Spectra and Pseudospectra by Trefethen & Embree is a much more detailed dive into the fascinating world of nearly defective matrices.)
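Before deriving anything, it is worth seeing the answer numerically. The following quick check (my addition, not part of the original notes; it assumes NumPy and SciPy are available) simply applies A^k and e^{At} to u_0 = (1, 2) and watches how fast the result grows:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # the defective matrix above
u0 = np.array([1.0, 2.0])

# A^k u0 grows even though the only eigenvalue is 1, but it grows
# linearly in k, not exponentially:
for k in [1, 10, 100]:
    print(k, np.linalg.matrix_power(A, k) @ u0)   # gives (1 + 2k, 2)

# e^{At} u0 grows like an exponential times an extra factor of t:
for t in [1.0, 2.0, 4.0]:
    print(t, expm(A * t) @ u0)                    # gives e^t (1 + 2t, 2)
```

The printed values match the closed forms (1 + 2k, 2) and e^t (1 + 2t, 2) derived in section 3.2 below.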

The textbook (Intro. to Linear Algebra, 5th ed. by Strang) covers the defective case only briefly, in section 8.3, with something called the Jordan form of the matrix, a generalization of diagonalization, but in these notes we will focus more on the "Jordan vectors" than on the Jordan factorization. For a diagonalizable matrix, the fundamental vectors are the eigenvectors, which are useful in their own right and give the diagonalization of the matrix as a side effect. For a defective matrix, to get a complete basis we need to supplement the eigenvectors with something called Jordan vectors or generalized eigenvectors. Jordan vectors are useful in their own right, just like eigenvectors, and also give the Jordan form. Here, we'll focus mainly on the consequences of the Jordan vectors for how matrix problems behave.

2 Defining Jordan vectors

In the example above, we had a 2 × 2 matrix A but only a single eigenvector x_1 = (1, 0). We need another vector to get a basis for R^2. Of course, we could pick another vector at random, as long as it is independent of x_1, but we'd like it to have something to do with A, in order to help us with computations just like eigenvectors. The key thing is to look at A − I above, and to notice that (A − I)^2 = 0. (A matrix is called nilpotent if some power of it is the zero matrix.) So, the nullspace of (A − I)^2 must give us an "extra" basis vector beyond the eigenvector. But this extra vector must still be related to the eigenvector! If y ∈ N[(A − I)^2], then (A − I)y must be in N(A − I), which means that (A − I)y is a multiple of x_1! We just need to find a new "Jordan vector" or "generalized eigenvector" j_1 satisfying

$$(A - I)j_1 = x_1, \qquad j_1 \perp x_1.$$

Notice that, since x_1 ∈ N(A − I), we can add any multiple of x_1 to j_1 and still have a solution, so we can use Gram-Schmidt to get a unique solution j_1 perpendicular to x_1. This particular 2 × 2 equation is easy enough for us to solve by inspection, obtaining j_1 = (0, 1). Now we have a nice orthonormal basis for R^2, and our basis has some simple relationship to A!

Before we talk about how to use these Jordan vectors, let's give a more general definition. Suppose that λ_i is an eigenvalue of A corresponding to a repeated root of det(A − λ_i I), but with only a single (ordinary) eigenvector x_i, satisfying, as usual:

$$(A - \lambda_i I)x_i = 0.$$

If λ_i is a double root, we will need a second vector to complete our basis. Remarkably,¹ it turns out to always be the case for a double root λ_i that N([A − λ_i I]^2) is two-dimensional, just as for the 2 × 2 example above. Hence, we can always find a unique second solution j_i satisfying:

$$(A - \lambda_i I)j_i = x_i, \qquad j_i \perp x_i.$$

Again, we can choose j_i to be perpendicular to x_i via Gram-Schmidt; this is not strictly necessary, but gives a convenient orthogonal basis. (That is, the complete solution is always of the form x_p + c x_i, a particular solution x_p plus any multiple of the nullspace basis x_i. If we choose c = −x_i^T x_p / x_i^T x_i we get the unique orthogonal solution j_i.) We call j_i a Jordan vector or generalized eigenvector of A. The relationship between j_1 and x_1 is also called a Jordan chain.

¹ This fact is proved in any number of advanced textbooks on linear algebra; I like Linear Algebra by P. D. Lax.

2.1 More than double roots

A more general notation is to use x_i^(1) instead of x_i and x_i^(2) instead of j_i. If λ_i is a triple root, we would find a third vector x_i^(3) perpendicular to x_i^(1) and x_i^(2) by requiring (A − λ_i I)x_i^(3) = x_i^(2), and so on. In general, if λ_i is an m-times repeated root, then N([A − λ_i I]^m) is m-dimensional and we will always be able to find an orthogonal sequence (a Jordan chain) of Jordan vectors x_i^(j) for j = 2...m satisfying (A − λ_i I)x_i^(j) = x_i^(j−1) and (A − λ_i I)x_i^(1) = 0. Even more generally, you might have cases with e.g. a triple root and two ordinary eigenvectors, where you need only one generalized eigenvector, or an m-times repeated root with ℓ > 1 eigenvectors and m − ℓ Jordan vectors. However, cases with more than a double root are extremely rare in practice. Defective matrices are rare enough to begin with, so here we'll stick with the most common defective matrix, one with a double root λ_i: hence, one ordinary eigenvector x_i and one Jordan vector j_i.
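Returning to the 2 × 2 example, the Jordan vector can also be computed numerically. Here is a minimal sketch (my addition, assuming NumPy). It relies on the fact that the minimum-norm solution of (A − λI)j = x, which np.linalg.lstsq returns, lies in the row space of A − λI, and is therefore automatically the unique solution perpendicular to the nullspace vector x_1, with no explicit Gram-Schmidt step needed:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0                       # the double eigenvalue
x1 = np.array([1.0, 0.0])       # the lone eigenvector of A

# Solve (A - lam*I) j = x1. lstsq returns the minimum-norm solution,
# which lies in the row space of (A - lam*I), hence is perpendicular
# to the nullspace spanned by x1.
M = A - lam * np.eye(2)
j1, *_ = np.linalg.lstsq(M, x1, rcond=None)

print(j1)            # [0. 1.], matching the by-inspection solution
print(M @ j1 - x1)   # ~[0. 0.]: verifies (A - I) j1 = x1
print(x1 @ j1)       # ~0.0:     verifies j1 is perpendicular to x1
```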

3 Using Jordan vectors

Using an eigenvector x_i is easy: multiplying by A is just like multiplying by λ_i. But how do we use a Jordan vector j_i? The key is in the definition above. It immediately tells us that

$$Aj_i = \lambda_i j_i + x_i.$$

It will turn out that this has a simple consequence for more complicated expressions like A^k or e^{At}, but that's probably not obvious yet. Let's try multiplying by A^2:

$$A^2 j_i = A(Aj_i) = A(\lambda_i j_i + x_i) = \lambda_i(\lambda_i j_i + x_i) + \lambda_i x_i = \lambda_i^2 j_i + 2\lambda_i x_i$$

and then try A^3:

$$A^3 j_i = A(A^2 j_i) = A(\lambda_i^2 j_i + 2\lambda_i x_i) = \lambda_i^2(\lambda_i j_i + x_i) + 2\lambda_i^2 x_i = \lambda_i^3 j_i + 3\lambda_i^2 x_i.$$

From this, it's not hard to see the general pattern (which can be formally proved by induction):

$$A^k j_i = \lambda_i^k j_i + k\lambda_i^{k-1} x_i.$$

Notice that the coefficient in the second term is exactly d(λ_i^k)/dλ_i. This is the clue we need to get the general formula to apply any function f(A) of the matrix A to the eigenvector and the Jordan vector:

$$f(A)x_i = f(\lambda_i)x_i,$$
$$f(A)j_i = f(\lambda_i)j_i + f'(\lambda_i)x_i.$$

Multiplying by a function of the matrix multiplies j_i by the same function of the eigenvalue, just as for an eigenvector, but also adds a term multiplying x_i by the derivative f'(λ_i). So, for example:

$$e^{At}j_i = e^{\lambda_i t}j_i + t e^{\lambda_i t}x_i.$$

We can show this explicitly by considering what happens when we apply our formula for A^k in the Taylor series for e^{At}:

$$e^{At}j_i = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!} j_i = \sum_{k=0}^{\infty} \frac{t^k}{k!}\left(\lambda_i^k j_i + k\lambda_i^{k-1} x_i\right) = \sum_{k=0}^{\infty} \frac{(\lambda_i t)^k}{k!} j_i + t\sum_{k=1}^{\infty} \frac{(\lambda_i t)^{k-1}}{(k-1)!} x_i = e^{\lambda_i t}j_i + t e^{\lambda_i t}x_i.$$

In general, that's how we show the formula for f(A) above: we Taylor expand each term, and the A^k formula means that each term in the Taylor expansion has a corresponding term multiplying j_i and a derivative term multiplying x_i.

3.1 More than double roots

In the rare case of two Jordan vectors from a triple root, you will have a Jordan vector x_i^(3) and get

$$f(A)x_i^{(3)} = f(\lambda_i)x_i^{(3)} + f'(\lambda_i)j_i + \frac{f''(\lambda_i)}{2}x_i,$$

where the f'' term will give you k(k−1)λ_i^{k−2} and t^2 e^{λ_i t} for A^k and e^{At}, respectively. A quadruple root with one eigenvector and three Jordan vectors will give you f''' terms (that is, k^3 and t^3 terms), and so on. The theory is quite pretty, but doesn't arise often in practice so I will skip it; it is straightforward to work it out on your own if you are interested.

3.2 Example

Let's try this for our example 2 × 2 matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

from above, which has an eigenvector x_1 = (1, 0) and a Jordan vector j_1 = (0, 1) for the eigenvalue λ_1 = 1. Suppose we want to compute A^k u_0 and e^{At}u_0 for u_0 = (1, 2). As usual, our first step is to write u_0 in the basis of the eigenvectors... except that now we also include the generalized eigenvectors to get a complete basis:

$$u_0 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} = x_1 + 2j_1.$$

Now, computing A^k u_0 is easy, from our formula above:

$$A^k u_0 = A^k x_1 + 2A^k j_1 = \lambda_1^k x_1 + 2\lambda_1^k j_1 + 2k\lambda_1^{k-1} x_1 = 1^k \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2k \cdot 1^{k-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1+2k \\ 2 \end{pmatrix}.$$

For example, this is the solution to the recurrence u_{k+1} = Au_k. Even though the only eigenvalue of A satisfies |λ_1| = 1, i.e. is not greater than 1, the solution still blows up, but it blows up linearly with k instead of exponentially.

Consider instead e^{At}u_0, which is the solution to the system of ODEs du(t)/dt = Au(t) with the initial condition u(0) = u_0. In this case, we get:

$$e^{At}u_0 = e^{At}x_1 + 2e^{At}j_1 = e^{\lambda_1 t}x_1 + 2e^{\lambda_1 t}j_1 + 2te^{\lambda_1 t}x_1 = e^t \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2te^t \begin{pmatrix} 1 \\ 0 \end{pmatrix} = e^t \begin{pmatrix} 1+2t \\ 2 \end{pmatrix}.$$

In this case, the solution blows up exponentially since λ_1 = 1 > 0, but we have an additional term that blows up as an exponential multiplied by t.
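All of these formulas are easy to check numerically. Here is a short sketch (my addition, assuming NumPy/SciPy) that verifies both the general rule f(A)j_1 = f(λ_1)j_1 + f'(λ_1)x_1, for f(λ) = λ^k and f(λ) = e^{λt}, and the two closed-form answers above:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0
x1 = np.array([1.0, 0.0])   # eigenvector
j1 = np.array([0.0, 1.0])   # Jordan vector
u0 = x1 + 2 * j1            # = (1, 2)

k, t = 7, 0.5

# f(A) j1 = f(lam) j1 + f'(lam) x1 for f(lam) = lam^k and f(lam) = e^{lam t}:
Ak = np.linalg.matrix_power(A, k)
assert np.allclose(Ak @ j1, lam**k * j1 + k * lam**(k - 1) * x1)
assert np.allclose(expm(A * t) @ j1, np.exp(lam * t) * (j1 + t * x1))

# The worked example: A^k u0 = (1 + 2k, 2) and e^{At} u0 = e^t (1 + 2t, 2):
assert np.allclose(Ak @ u0, [1 + 2 * k, 2])
assert np.allclose(expm(A * t) @ u0, np.exp(t) * np.array([1 + 2 * t, 2]))
print("all Jordan-vector formulas check out")
```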

Those of you who have taken 18.03 are probably familiar with these terms multiplied by t in the case of a repeated root. In 18.03, it is presented simply as a guess for the solution that turns out to work, but here we see that it is part of a more general pattern of Jordan vectors for defective matrices.

4 The Jordan form

For a diagonalizable matrix A, we made a matrix S out of the eigenvectors, and saw that multiplying by A was equivalent to multiplying by SΛS^{-1}, where Λ = S^{-1}AS is the diagonal matrix of eigenvalues, the diagonalization of A. Equivalently, AS = SΛ: A multiplies each column of S by the corresponding eigenvalue. Now, we will do exactly the same steps for a defective matrix A, using the basis of eigenvectors and Jordan vectors, and obtain the Jordan form J instead of Λ.

Let's consider a simple 4 × 4 case first, in which there is only one repeated root λ_2 with an eigenvector x_2 and a Jordan vector j_2, and the other two eigenvalues λ_1 and λ_3 are distinct with independent eigenvectors x_1 and x_3. Form a matrix M = (x_1, x_2, j_2, x_3) from this basis of four vectors (3 eigenvectors and 1 Jordan vector). Now, consider what happens when we multiply A by M:

$$AM = (\lambda_1 x_1, \; \lambda_2 x_2, \; \lambda_2 j_2 + x_2, \; \lambda_3 x_3) = M \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & 1 & \\ & & \lambda_2 & \\ & & & \lambda_3 \end{pmatrix} = MJ.$$

That is, A = MJM^{-1}, where J is almost diagonal: it has λ's along the diagonal, but it also has 1's above the diagonal for the columns corresponding to generalized eigenvectors. This is exactly the Jordan form of the matrix A. J, of course, has the same eigenvalues as A since A and J are similar, but J is much simpler than A. The 2 × 2 block

$$\begin{pmatrix} \lambda_2 & 1 \\ & \lambda_2 \end{pmatrix}$$

is called a 2 × 2 Jordan block.

The generalization of this, when you perhaps have more than one repeated root, and perhaps the multiplicity of the root is greater than 2, is fairly obvious, and leads immediately to the formula given without proof in section 8.3 of the textbook. What I want to emphasize here, however, is not so much the formal theorem that a Jordan form exists, but how to use it via the Jordan vectors: in particular, that generalized eigenvectors give us linearly growing terms kλ^{k-1} and te^{λt} when we multiply by A^k and e^{At}, respectively.

Computationally, the Jordan form is famously problematic, because with any slight random perturbation to A (e.g. roundoff errors) the matrix typically becomes diagonalizable, and the 2 × 2 Jordan block for λ_2 disappears. One then has a basis X of eigenvectors, but it is nearly singular ("ill-conditioned"): for a nearly defective matrix, the eigenvectors are almost linearly dependent. This makes eigenvectors a problematic way of looking at nearly defective matrices as well, because they are so sensitive to errors. Finding an approximate Jordan form of a nearly defective matrix is the famous Wilkinson problem in numerical linear algebra, and has a number of tricky solutions. Alternatively, there are approaches like Schur factorization or the SVD that lead to nice orthonormal bases for any matrix, but aren't nearly as simple to use as eigenvectors.
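Both halves of this story can be seen numerically. The sketch below (my addition; the eigenvalues 2, 3, 3, 5 and the random basis M are arbitrary choices for illustration, assuming NumPy) builds a defective 4 × 4 matrix A = MJM^{-1} from the Jordan form above, then perturbs A slightly and shows that the resulting eigenvector basis is nearly singular:

```python
import numpy as np

rng = np.random.default_rng(0)

# Jordan form with eigenvalues 2, 3, 3 (one 2x2 Jordan block), and 5:
lam1, lam2, lam3 = 2.0, 3.0, 5.0
J = np.array([[lam1, 0,    0,    0],
              [0,    lam2, 1,    0],
              [0,    0,    lam2, 0],
              [0,    0,    0,    lam3]])

# Any invertible M = (x1, x2, j2, x3) gives a defective A = M J M^{-1}:
M = rng.standard_normal((4, 4))
A = M @ J @ np.linalg.inv(M)

# Check AM = MJ: in particular A x2 = lam2 x2 and A j2 = lam2 j2 + x2.
print(np.allclose(A @ M, M @ J))   # True (up to roundoff)

# Perturb A slightly: it typically becomes diagonalizable, but the
# eigenvector matrix X is nearly singular (ill-conditioned), because
# the two eigenvectors from the split double root are nearly parallel:
eps = 1e-10
evals, X = np.linalg.eig(A + eps * rng.standard_normal((4, 4)))
print(evals)                       # the double root 3 splits slightly
print(np.linalg.cond(X))           # very large condition number
```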
