
A useful basis for defective matrices: Generalized eigenvectors and the Jordan form

S. G. Johnson, MIT 18.06 Spring 2009
April 22, 2009

1 Introduction

So far in the eigenproblem portion of 18.06, our strategy has been simple: find the eigenvalues λ_i and the corresponding eigenvectors x_i of a matrix A, expand any vector of interest u in the basis of these eigenvectors (u = c_1 x_1 + ⋯ + c_n x_n), and then any operation with A can be turned into the corresponding operation with λ_i acting on each eigenvector. So, A^k becomes λ_i^k, e^{At} becomes e^{λ_i t}, and so on. But this relied on one key assumption: we require the n × n matrix A to have a basis of n independent eigenvectors. We call such a matrix A diagonalizable.

Many important cases are always diagonalizable: matrices with n distinct eigenvalues λ_i, real symmetric or orthogonal matrices, and complex Hermitian or unitary matrices. But there are rare cases where A does not have a complete basis of n eigenvectors: such matrices are called defective. For example, consider the matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

This matrix has a characteristic polynomial λ² − 2λ + 1, with a repeated root (a single eigenvalue) λ_1 = 1. (Equivalently, since A is upper triangular, we can read the determinant of A − λI, and hence the eigenvalues, off the diagonal.) However, it only has a single independent eigenvector, because

$$A - I = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

is obviously rank 1, and has a one-dimensional nullspace spanned by x_1 = (1, 0).

Defective matrices arise rarely in practice, and usually only when one arranges for them intentionally, so we have not worried about them up to now. But it is important to have some idea of what happens when you have a defective matrix, partially for computational purposes, but also to understand conceptually what is possible. For example, what will be the result of

$$A^k \begin{pmatrix} 1 \\ 2 \end{pmatrix} \quad \text{or} \quad e^{At} \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

for the A above, since (1, 2) is not in the span of the (single) eigenvector of A? For diagonalizable matrices, this would grow as λ^k or e^{λt}, respectively, but what about defective matrices?

The textbook (Intro. to Linear Algebra, 4th ed. by Strang) covers the defective case only briefly, in section 6.6, with something called the Jordan form of the matrix, a generalization of diagonalization. In that short section, however, the Jordan form falls down out of the sky, and you don't really learn where it comes from or what its consequences are. In this section, we will take a different tack. For a diagonalizable matrix, the fundamental vectors are the eigenvectors, which are useful in their own right and give the diagonalization of the matrix as a side effect. For a defective matrix, to get a complete basis we need to supplement the eigenvectors with something called generalized eigenvectors. Generalized eigenvectors are useful in their own right, just like eigenvectors, and also give the Jordan form as a side effect. Here, however, we'll focus on defining, obtaining, and using the generalized eigenvectors, and not so much on the Jordan form.
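You can see the defectiveness of this A numerically. Here is a minimal sketch in Python, assuming NumPy (the specific calls below are just one convenient way to check; they are not part of the notes' derivation):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

lam, X = np.linalg.eig(A)
print(lam)   # [1. 1.] -- the double root of lambda^2 - 2*lambda + 1
print(X)     # the two returned columns are (numerically) parallel:
             # there is only one independent eigenvector, so A is defective

# Equivalently: A - I has rank 1, so its nullspace (the eigenspace for
# lambda = 1) is only one-dimensional, spanned by x1 = (1, 0).
assert np.linalg.matrix_rank(A - np.eye(2)) == 1
```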

2 Defining generalized eigenvectors

In the example above, we had a 2 × 2 matrix A but only a single eigenvector x_1 = (1, 0). We need another vector to get a basis for R². Of course, we could pick another vector at random, as long as it is independent of x_1, but we'd like it to have something to do with A, in order to help us with computations just like eigenvectors. The key thing is to look at A − I above, and to notice that one of the columns is equal to x_1: the vector x_1 is in C(A − I), so we can find a new vector x_1^(2) satisfying

$$(A - I)x_1^{(2)} = x_1, \qquad x_1^{(2)} \perp x_1.$$

Notice that, since x_1 ∈ N(A − I), we can add any multiple of x_1 to x_1^(2) and still have a solution, so we can use Gram–Schmidt to get a unique solution x_1^(2) perpendicular to x_1. This particular 2 × 2 equation is easy enough for us to solve by inspection, obtaining x_1^(2) = (0, 1). Now we have a nice orthonormal basis for R², and our basis has some simple relationship to A!

Before we talk about how to use these generalized eigenvectors, let's give a more general definition. Suppose that λ_i is an eigenvalue of A corresponding to a repeated root of det(A − λI), but with only a single (ordinary) eigenvector x_i, satisfying, as usual:

$$(A - \lambda_i I)x_i = 0.$$

If λ_i is a double root, we will need a second vector to complete our basis. Remarkably,¹ it turns out to always be the case for a double root λ_i that x_i is in C(A − λ_i I), just as in the example above. Hence, we can always find a unique second solution x_i^(2) satisfying:

$$(A - \lambda_i I)x_i^{(2)} = x_i, \qquad x_i^{(2)} \perp x_i.$$

Again, we can choose x_i^(2) to be perpendicular to x_i via Gram–Schmidt; this is not strictly necessary, but gives a convenient orthogonal basis. (That is, the complete solution is always of the form x_p + c x_i, a particular solution x_p plus any multiple of the nullspace basis x_i. If we choose c = −x_iᵀx_p / x_iᵀx_i, we get the unique orthogonal solution x_i^(2).) We call x_i^(2) a generalized eigenvector of A.

2.1 More than double roots

If we wanted to be more notationally consistent, we could use x_i^(1) instead of x_i. If λ_i is a triple root, we would find a third vector x_i^(3) perpendicular to x_i^(2) by requiring (A − λ_i I)x_i^(3) = x_i^(2), and so on. In general, if λ_i is an m-times repeated root, then we will always be able to find an orthogonal sequence of generalized eigenvectors x_i^(j) for j = 2, ..., m satisfying (A − λ_i I)x_i^(j) = x_i^(j−1) and (A − λ_i I)x_i^(1) = 0. Even more generally, you might have cases with e.g. a triple root and two ordinary eigenvectors, where you need only one generalized eigenvector, or an m-times repeated root with ℓ > 1 eigenvectors and m − ℓ generalized eigenvectors. However, cases with more than a double root are extremely rare in practice. Defective matrices are rare enough to begin with, so here we'll stick with the most common defective matrix, one with a double root λ_i: hence, one ordinary eigenvector x_i and one generalized eigenvector x_i^(2).

3 Using generalized eigenvectors

Using an eigenvector x_i is easy: multiplying by A is just like multiplying by λ_i. But how do we use a generalized eigenvector x_i^(2)? The key is in the definition above. It immediately tells us that

$$A x_i^{(2)} = \lambda_i x_i^{(2)} + x_i.$$

It will turn out that this has a simple consequence for more complicated expressions like A^k or e^{At}, but that's probably not obvious yet. Let's try multiplying by A²:

$$A^2 x_i^{(2)} = A(A x_i^{(2)}) = A(\lambda_i x_i^{(2)} + x_i) = \lambda_i(\lambda_i x_i^{(2)} + x_i) + \lambda_i x_i = \lambda_i^2 x_i^{(2)} + 2\lambda_i x_i,$$

and then try A³:

$$A^3 x_i^{(2)} = A(A^2 x_i^{(2)}) = A(\lambda_i^2 x_i^{(2)} + 2\lambda_i x_i) = \lambda_i^2(\lambda_i x_i^{(2)} + x_i) + 2\lambda_i^2 x_i = \lambda_i^3 x_i^{(2)} + 3\lambda_i^2 x_i.$$

From this, it's not hard to see the general pattern (which can be formally proved by induction):

$$A^k x_i^{(2)} = \lambda_i^k x_i^{(2)} + k\lambda_i^{k-1} x_i.$$
¹This fact is proved in any number of advanced textbooks on linear algebra; I like Linear Algebra by P. D. Lax.
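The construction of x_1^(2) and the A^k pattern are both easy to check numerically. The following Python sketch (assuming NumPy; the least-squares call is just one convenient way to pick a particular solution of the singular system) recovers x_1^(2) = (0, 1) for our 2 × 2 example:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0
x1 = np.array([1.0, 0.0])   # the ordinary eigenvector

# (A - lam*I) x = x1 is singular but solvable, since x1 is in C(A - lam*I).
# lstsq returns one particular solution xp of this system.
xp, *_ = np.linalg.lstsq(A - lam * np.eye(2), x1, rcond=None)

# The complete solution is xp + c*x1; choosing c = -x1.T xp / x1.T x1
# (Gram-Schmidt) makes the result perpendicular to x1.
x2 = xp - (x1 @ xp) / (x1 @ x1) * x1
print(x2)   # -> [0. 1.], the generalized eigenvector found by inspection

# Verify the pattern  A^k x2 = lam^k x2 + k lam^(k-1) x1  for small k.
for k in range(1, 8):
    lhs = np.linalg.matrix_power(A, k) @ x2
    rhs = lam**k * x2 + k * lam**(k - 1) * x1
    assert np.allclose(lhs, rhs)
```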

Notice that the coefficient kλ_i^{k−1} in the second term is exactly d(λ_i^k)/dλ_i. This is the clue we need to get the general formula to apply any function f(A) of the matrix A to the eigenvector and the generalized eigenvector:

$$f(A)x_i = f(\lambda_i)x_i,$$

$$f(A)x_i^{(2)} = f(\lambda_i)x_i^{(2)} + f'(\lambda_i)x_i.$$

Multiplying by a function of the matrix multiplies x_i^(2) by the same function of the eigenvalue, just as for an eigenvector, but also adds a term multiplying x_i by the derivative f′(λ_i). So, for example:

$$e^{At} x_i^{(2)} = e^{\lambda_i t} x_i^{(2)} + t e^{\lambda_i t} x_i.$$

We can show this explicitly by considering what happens when we apply our formula for A^k in the Taylor series for e^{At}:

$$e^{At} x_i^{(2)} = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!} x_i^{(2)} = \sum_{k=0}^{\infty} \frac{t^k}{k!}\left(\lambda_i^k x_i^{(2)} + k\lambda_i^{k-1} x_i\right) = \sum_{k=0}^{\infty} \frac{(\lambda_i t)^k}{k!} x_i^{(2)} + t \sum_{k=1}^{\infty} \frac{(\lambda_i t)^{k-1}}{(k-1)!} x_i = e^{\lambda_i t} x_i^{(2)} + t e^{\lambda_i t} x_i.$$

In general, that's how we show the formula for f(A) above: we Taylor expand each term, and the A^k formula means that each term in the Taylor expansion has a corresponding term multiplying x_i^(2) and a derivative term multiplying x_i.

3.1 More than double roots

In the rare case of two generalized eigenvectors from a triple root, you will have a generalized eigenvector x_i^(3) and get a formula

$$f(A)x_i^{(3)} = f(\lambda_i)x_i^{(3)} + f'(\lambda_i)x_i^{(2)} + \frac{f''(\lambda_i)}{2}x_i,$$

where the f″ term will give you k(k − 1)λ_i^{k−2} and t²e^{λ_i t} terms for A^k and e^{At}, respectively. A quadruple root with one eigenvector and three generalized eigenvectors will give you f‴ terms (that is, k³ and t³ terms), and so on. The theory is quite pretty, but doesn't arise often in practice so I will skip it; it is straightforward to work it out on your own if you are interested.

3.2 Example

Let's try this for our example matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

from above, which has an eigenvector x_1 = (1, 0) and a generalized eigenvector x_1^(2) = (0, 1) for the eigenvalue λ_1 = 1. Suppose we want to compute A^k u_0 and e^{At} u_0 for u_0 = (1, 2). As usual, our first step is to write u_0 in the basis of the eigenvectors... except that now we also include the generalized eigenvectors to get a complete basis:

$$u_0 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} = x_1 + 2x_1^{(2)}.$$

Now, computing A^k u_0 is easy, from our formula above:

$$A^k u_0 = A^k x_1 + 2A^k x_1^{(2)} = \lambda_1^k x_1 + 2\lambda_1^k x_1^{(2)} + 2k\lambda_1^{k-1} x_1 = 1^k \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2k \cdot 1^{k-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 + 2k \\ 2 \end{pmatrix}.$$

For example, this is the solution to the recurrence u_{k+1} = A u_k. Even though the only eigenvalue of A has |λ_1| = 1 ≤ 1, the solution still blows up, but it blows up linearly with k instead of exponentially.

Consider instead e^{At} u_0, which is the solution to the system of ODEs du(t)/dt = A u(t) with the initial condition u(0) = u_0. In this case, we get:

$$e^{At} u_0 = e^{At} x_1 + 2e^{At} x_1^{(2)} = e^{\lambda_1 t} x_1 + 2e^{\lambda_1 t} x_1^{(2)} + 2t e^{\lambda_1 t} x_1 = e^t \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2t e^t \begin{pmatrix} 1 \\ 0 \end{pmatrix} = e^t \begin{pmatrix} 1 + 2t \\ 2 \end{pmatrix}.$$

In this case, the solution blows up exponentially since λ_1 = 1 > 0, but we also have an additional term that blows up as an exponential multiplied by t.

Those of you who have taken 18.03 are probably familiar with these terms multiplied by t in the case of a repeated root. In 18.03, it is presented simply as a guess for the solution that turns out to work, but here we see that it is part of a more general pattern of generalized eigenvectors for defective matrices.
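Both closed forms are easy to confirm numerically. Here is a short sketch, assuming NumPy and SciPy's expm (the handful of k and t values below are arbitrary test points):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
u0 = np.array([1.0, 2.0])

# A^k u0 should equal (1 + 2k, 2): linear blow-up despite |lambda_1| = 1.
for k in range(8):
    assert np.allclose(np.linalg.matrix_power(A, k) @ u0,
                       [1 + 2 * k, 2])

# e^{At} u0 should equal e^t (1 + 2t, 2): an exponential times a linear term.
for t in [0.0, 0.5, 1.0, 2.0]:
    assert np.allclose(expm(A * t) @ u0,
                       np.exp(t) * np.array([1 + 2 * t, 2]))
print("both closed forms check out")
```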

4 The Jordan form

For a diagonalizable matrix A, we made a matrix S out of the eigenvectors, and saw that multiplying by A was equivalent to multiplying by SΛS⁻¹, where Λ = S⁻¹AS is the diagonal matrix of eigenvalues, the diagonalization of A. Equivalently, AS = SΛ: A multiplies each column of S by the corresponding eigenvalue. Now, we will do exactly the same steps for a defective matrix A, using the basis of eigenvectors and generalized eigenvectors, and obtain the Jordan form J instead of Λ.

Let's consider a simple 4 × 4 case first, in which there is only one repeated root λ_2 with an eigenvector x_2 and a generalized eigenvector x_2^(2), and the other two eigenvalues λ_1 and λ_3 are distinct with independent eigenvectors x_1 and x_3. Form a matrix M = (x_1, x_2, x_2^(2), x_3) from this basis of four vectors (3 eigenvectors and 1 generalized eigenvector). Now, consider what happens when we multiply A by M:

$$AM = (\lambda_1 x_1,\ \lambda_2 x_2,\ \lambda_2 x_2^{(2)} + x_2,\ \lambda_3 x_3) = M \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & 1 & \\ & & \lambda_2 & \\ & & & \lambda_3 \end{pmatrix} = MJ.$$

That is, A = MJM⁻¹, where J is almost diagonal: it has the λ's along the diagonal, but it also has 1's above the diagonal for the columns corresponding to generalized eigenvectors. This is exactly the Jordan form of the matrix A. J, of course, has the same eigenvalues as A since A and J are similar, but J is much simpler than A.

The generalization of this, when you perhaps have more than one repeated root, and perhaps the multiplicity of the root is greater than 2, is fairly obvious, and leads immediately to the formula given without proof in section 6.6 of the textbook. What I want to emphasize here, however, is not so much the formal theorem that a Jordan form exists, but how to use it via the generalized eigenvectors: in particular, that generalized eigenvectors give us linearly growing terms kλ^{k−1} and te^{λt} when we multiply by A^k and e^{At}, respectively.
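To make the column-by-column reading of AM = MJ concrete, here is a sketch along the lines of the 4 × 4 case above, assuming NumPy. The eigenvalues 2, 5, 5, 3 and the random M are arbitrary choices, and we build A from J rather than the reverse, so this only illustrates how A acts on the columns of M:

```python
import numpy as np

# Jordan form as in the text: distinct roots lambda_1 = 2 and lambda_3 = 3,
# plus a double root lambda_2 = 5 with a single 2x2 Jordan block.
J = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.0, 5.0, 1.0, 0.0],
              [0.0, 0.0, 5.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])

# Any invertible M gives a defective A = M J M^{-1} whose basis of
# eigenvectors and generalized eigenvector is the columns of M.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ J @ np.linalg.inv(M)

x1, x2, x2g, x3 = M.T                        # the columns of M
assert np.allclose(A @ x1, 2 * x1)           # ordinary eigenvector
assert np.allclose(A @ x2, 5 * x2)           # ordinary eigenvector
assert np.allclose(A @ x2g, 5 * x2g + x2)    # generalized: extra + x2 term
assert np.allclose(A @ x3, 3 * x3)           # ordinary eigenvector
assert np.allclose(A @ M, M @ J)             # i.e., AM = MJ
```

The four assertions are exactly the four columns of AM = MJ: each ordinary eigenvector is simply scaled, while the generalized eigenvector picks up the extra x_2 term that puts the 1 above the diagonal of J.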
