
A useful basis for defective matrices: Generalized eigenvectors and the Jordan form

S. G. Johnson, MIT 18.06 Spring 2009
April 22, 2009

1 Introduction

So far in the eigenproblem portion of 18.06, our strategy has been simple: find the eigenvalues λ_i and the corresponding eigenvectors x_i of a matrix A, expand any vector of interest u in the basis of these eigenvectors (u = c_1 x_1 + ⋯ + c_n x_n), and then any operation with A can be turned into the corresponding operation with λ_i acting on each eigenvector. So, A^k becomes λ_i^k, e^{At} becomes e^{λ_i t}, and so on. But this relied on one key assumption: we require the n × n matrix A to have a basis of n independent eigenvectors. We call such a matrix A diagonalizable.

Many important cases are always diagonalizable: matrices with n distinct eigenvalues λ_i, real symmetric or orthogonal matrices, and complex Hermitian or unitary matrices. But there are rare cases where A does not have a complete basis of n eigenvectors: such matrices are called defective. For example, consider the matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

This matrix has a characteristic polynomial λ² − 2λ + 1, with a repeated root (a single eigenvalue) λ_1 = 1. (Equivalently, since A is upper triangular, we can read the determinant of A − λI, and hence the eigenvalues, off the diagonal.) However, it only has a single independent eigenvector, because

$$A - I = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

is obviously rank 1, and has a one-dimensional nullspace spanned by x_1 = (1, 0).

Defective matrices arise rarely in practice, and usually only when one arranges for them intentionally, so we have not worried about them up to now. But it is important to have some idea of what happens when you have a defective matrix, partially for computational purposes, but also to understand conceptually what is possible. For example, what will be the result of

$$A^k \begin{pmatrix} 1 \\ 2 \end{pmatrix} \quad \text{or} \quad e^{At} \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

for the A above, since (1, 2) is not in the span of the (single) eigenvector of A? For diagonalizable matrices, this would grow as λ^k or e^{λt}, respectively, but what about defective matrices?

The textbook (Intro. to Linear Algebra, 4th ed. by Strang) covers the defective case only briefly, in section 6.6, with something called the Jordan form of the matrix, a generalization of diagonalization. In that short section, however, the Jordan form falls down out of the sky, and you don't really learn where it comes from or what its consequences are. In this section, we will take a different tack. For a diagonalizable matrix, the fundamental vectors are the eigenvectors, which are useful in their own right and give the diagonalization of the matrix as a side effect. For a defective matrix, to get a complete basis we need to supplement the eigenvectors with something called generalized eigenvectors. Generalized eigenvectors are useful in their own right, just like eigenvectors, and also give the Jordan form as a side effect. Here, however, we'll focus on defining, obtaining, and using the generalized eigenvectors, and not so much on the Jordan form.
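You can see the defectiveness of this A numerically. Here is a minimal sketch in Python, assuming NumPy (the specific calls below are just one convenient way to check; they are not part of the notes' derivation):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

lam, X = np.linalg.eig(A)
print(lam)   # [1. 1.] -- the double root of lambda^2 - 2*lambda + 1
print(X)     # the two returned columns are (numerically) parallel:
             # there is only one independent eigenvector, so A is defective

# Equivalently: A - I has rank 1, so its nullspace (the eigenspace for
# lambda = 1) is only one-dimensional, spanned by x1 = (1, 0).
assert np.linalg.matrix_rank(A - np.eye(2)) == 1
```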

2 Defining generalized eigenvectors

In the example above, we had a 2 × 2 matrix A but only a single eigenvector x_1 = (1, 0). We need another vector to get a basis for R². Of course, we could pick another vector at random, as long as it is independent of x_1, but we'd like it to have something to do with A, in order to help us with computations just like eigenvectors. The key thing is to look at A − I above, and to notice that one of the columns is equal to x_1: the vector x_1 is in C(A − I), so we can find a new vector x_1^(2) satisfying

$$(A - I)x_1^{(2)} = x_1, \qquad x_1^{(2)} \perp x_1.$$

Notice that, since x_1 ∈ N(A − I), we can add any multiple of x_1 to x_1^(2) and still have a solution, so we can use Gram–Schmidt to get a unique solution x_1^(2) perpendicular to x_1. This particular 2 × 2 equation is easy enough for us to solve by inspection, obtaining x_1^(2) = (0, 1). Now we have a nice orthonormal basis for R², and our basis has some simple relationship to A!

Before we talk about how to use these generalized eigenvectors, let's give a more general definition. Suppose that λ_i is an eigenvalue of A corresponding to a repeated root of det(A − λI), but with only a single (ordinary) eigenvector x_i, satisfying, as usual:

$$(A - \lambda_i I)x_i = 0.$$

If λ_i is a double root, we will need a second vector to complete our basis. Remarkably,¹ it turns out to always be the case for a double root λ_i that x_i is in C(A − λ_i I), just as in the example above. Hence, we can always find a unique second solution x_i^(2) satisfying:

$$(A - \lambda_i I)x_i^{(2)} = x_i, \qquad x_i^{(2)} \perp x_i.$$

Again, we can choose x_i^(2) to be perpendicular to x_i via Gram–Schmidt; this is not strictly necessary, but gives a convenient orthogonal basis. (That is, the complete solution is always of the form x_p + c x_i, a particular solution x_p plus any multiple of the nullspace basis x_i. If we choose c = −x_iᵀx_p / x_iᵀx_i, we get the unique orthogonal solution x_i^(2).) We call x_i^(2) a generalized eigenvector of A.

2.1 More than double roots

If we wanted to be more notationally consistent, we could use x_i^(1) instead of x_i. If λ_i is a triple root, we would find a third vector x_i^(3) perpendicular to x_i^(2) by requiring (A − λ_i I)x_i^(3) = x_i^(2), and so on. In general, if λ_i is an m-times repeated root, then we will always be able to find an orthogonal sequence of generalized eigenvectors x_i^(j) for j = 2, ..., m satisfying (A − λ_i I)x_i^(j) = x_i^(j−1) and (A − λ_i I)x_i^(1) = 0. Even more generally, you might have cases with e.g. a triple root and two ordinary eigenvectors, where you need only one generalized eigenvector, or an m-times repeated root with ℓ > 1 eigenvectors and m − ℓ generalized eigenvectors. However, cases with more than a double root are extremely rare in practice. Defective matrices are rare enough to begin with, so here we'll stick with the most common defective matrix, one with a double root λ_i: hence, one ordinary eigenvector x_i and one generalized eigenvector x_i^(2).

3 Using generalized eigenvectors

Using an eigenvector x_i is easy: multiplying by A is just like multiplying by λ_i. But how do we use a generalized eigenvector x_i^(2)? The key is in the definition above. It immediately tells us that

$$A x_i^{(2)} = \lambda_i x_i^{(2)} + x_i.$$

It will turn out that this has a simple consequence for more complicated expressions like A^k or e^{At}, but that's probably not obvious yet. Let's try multiplying by A²:

$$A^2 x_i^{(2)} = A(A x_i^{(2)}) = A(\lambda_i x_i^{(2)} + x_i) = \lambda_i(\lambda_i x_i^{(2)} + x_i) + \lambda_i x_i = \lambda_i^2 x_i^{(2)} + 2\lambda_i x_i,$$

and then try A³:

$$A^3 x_i^{(2)} = A(A^2 x_i^{(2)}) = A(\lambda_i^2 x_i^{(2)} + 2\lambda_i x_i) = \lambda_i^2(\lambda_i x_i^{(2)} + x_i) + 2\lambda_i^2 x_i = \lambda_i^3 x_i^{(2)} + 3\lambda_i^2 x_i.$$

From this, it's not hard to see the general pattern (which can be formally proved by induction):

$$A^k x_i^{(2)} = \lambda_i^k x_i^{(2)} + k\lambda_i^{k-1} x_i.$$
¹This fact is proved in any number of advanced textbooks on linear algebra; I like Linear Algebra by P. D. Lax.
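The construction of x_1^(2) and the A^k pattern are both easy to check numerically. The following Python sketch (assuming NumPy; the least-squares call is just one convenient way to pick a particular solution of the singular system) recovers x_1^(2) = (0, 1) for our 2 × 2 example:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0
x1 = np.array([1.0, 0.0])   # the ordinary eigenvector

# (A - lam*I) x = x1 is singular but solvable, since x1 is in C(A - lam*I).
# lstsq returns one particular solution xp of this system.
xp, *_ = np.linalg.lstsq(A - lam * np.eye(2), x1, rcond=None)

# The complete solution is xp + c*x1; choosing c = -x1.T xp / x1.T x1
# (Gram-Schmidt) makes the result perpendicular to x1.
x2 = xp - (x1 @ xp) / (x1 @ x1) * x1
print(x2)   # -> [0. 1.], the generalized eigenvector found by inspection

# Verify the pattern  A^k x2 = lam^k x2 + k lam^(k-1) x1  for small k.
for k in range(1, 8):
    lhs = np.linalg.matrix_power(A, k) @ x2
    rhs = lam**k * x2 + k * lam**(k - 1) * x1
    assert np.allclose(lhs, rhs)
```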

Notice that the coefficient kλ_i^{k−1} in the second term is exactly d(λ_i^k)/dλ_i. This is the clue we need to get the general formula to apply any function f(A) of the matrix A to the eigenvector and the generalized eigenvector:

$$f(A)x_i = f(\lambda_i)x_i,$$

$$f(A)x_i^{(2)} = f(\lambda_i)x_i^{(2)} + f'(\lambda_i)x_i.$$

Multiplying by a function of the matrix multiplies x_i^(2) by the same function of the eigenvalue, just as for an eigenvector, but also adds a term multiplying x_i by the derivative f′(λ_i). So, for example:

$$e^{At} x_i^{(2)} = e^{\lambda_i t} x_i^{(2)} + t e^{\lambda_i t} x_i.$$

We can show this explicitly by considering what happens when we apply our formula for A^k in the Taylor series for e^{At}:

$$e^{At} x_i^{(2)} = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!} x_i^{(2)} = \sum_{k=0}^{\infty} \frac{t^k}{k!}\left(\lambda_i^k x_i^{(2)} + k\lambda_i^{k-1} x_i\right) = \sum_{k=0}^{\infty} \frac{(\lambda_i t)^k}{k!} x_i^{(2)} + t \sum_{k=1}^{\infty} \frac{(\lambda_i t)^{k-1}}{(k-1)!} x_i = e^{\lambda_i t} x_i^{(2)} + t e^{\lambda_i t} x_i.$$

In general, that's how we show the formula for f(A) above: we Taylor expand each term, and the A^k formula means that each term in the Taylor expansion has a corresponding term multiplying x_i^(2) and a derivative term multiplying x_i.

3.1 More than double roots

In the rare case of two generalized eigenvectors from a triple root, you will have a generalized eigenvector x_i^(3) and get a formula

$$f(A)x_i^{(3)} = f(\lambda_i)x_i^{(3)} + f'(\lambda_i)x_i^{(2)} + \frac{f''(\lambda_i)}{2}x_i,$$

where the f″ term will give you k(k − 1)λ_i^{k−2} and t²e^{λ_i t} terms for A^k and e^{At}, respectively. A quadruple root with one eigenvector and three generalized eigenvectors will give you f‴ terms (that is, k³ and t³ terms), and so on. The theory is quite pretty, but doesn't arise often in practice so I will skip it; it is straightforward to work it out on your own if you are interested.

3.2 Example

Let's try this for our example matrix

$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

from above, which has an eigenvector x_1 = (1, 0) and a generalized eigenvector x_1^(2) = (0, 1) for the eigenvalue λ_1 = 1. Suppose we want to compute A^k u_0 and e^{At} u_0 for u_0 = (1, 2). As usual, our first step is to write u_0 in the basis of the eigenvectors... except that now we also include the generalized eigenvectors to get a complete basis:

$$u_0 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} = x_1 + 2x_1^{(2)}.$$

Now, computing A^k u_0 is easy, from our formula above:

$$A^k u_0 = A^k x_1 + 2A^k x_1^{(2)} = \lambda_1^k x_1 + 2\lambda_1^k x_1^{(2)} + 2k\lambda_1^{k-1} x_1 = 1^k \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2k \cdot 1^{k-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 + 2k \\ 2 \end{pmatrix}.$$

For example, this is the solution to the recurrence u_{k+1} = A u_k. Even though the only eigenvalue of A has |λ_1| = 1 ≤ 1, the solution still blows up, but it blows up linearly with k instead of exponentially.

Consider instead e^{At} u_0, which is the solution to the system of ODEs du(t)/dt = A u(t) with the initial condition u(0) = u_0. In this case, we get:

$$e^{At} u_0 = e^{At} x_1 + 2e^{At} x_1^{(2)} = e^{\lambda_1 t} x_1 + 2e^{\lambda_1 t} x_1^{(2)} + 2t e^{\lambda_1 t} x_1 = e^t \begin{pmatrix} 1 \\ 2 \end{pmatrix} + 2t e^t \begin{pmatrix} 1 \\ 0 \end{pmatrix} = e^t \begin{pmatrix} 1 + 2t \\ 2 \end{pmatrix}.$$

In this case, the solution blows up exponentially since λ_1 = 1 > 0, but we also have an additional term that blows up as an exponential multiplied by t.

Those of you who have taken 18.03 are probably familiar with these terms multiplied by t in the case of a repeated root. In 18.03, it is presented simply as a guess for the solution that turns out to work, but here we see that it is part of a more general pattern of generalized eigenvectors for defective matrices.
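Both closed forms are easy to confirm numerically. Here is a short sketch, assuming NumPy and SciPy's expm (the handful of k and t values below are arbitrary test points):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
u0 = np.array([1.0, 2.0])

# A^k u0 should equal (1 + 2k, 2): linear blow-up despite |lambda_1| = 1.
for k in range(8):
    assert np.allclose(np.linalg.matrix_power(A, k) @ u0,
                       [1 + 2 * k, 2])

# e^{At} u0 should equal e^t (1 + 2t, 2): an exponential times a linear term.
for t in [0.0, 0.5, 1.0, 2.0]:
    assert np.allclose(expm(A * t) @ u0,
                       np.exp(t) * np.array([1 + 2 * t, 2]))
print("both closed forms check out")
```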

4 The Jordan form

For a diagonalizable matrix A, we made a matrix S out of the eigenvectors, and saw that multiplying by A was equivalent to multiplying by SΛS⁻¹, where Λ = S⁻¹AS is the diagonal matrix of eigenvalues, the diagonalization of A. Equivalently, AS = SΛ: A multiplies each column of S by the corresponding eigenvalue. Now, we will do exactly the same steps for a defective matrix A, using the basis of eigenvectors and generalized eigenvectors, and obtain the Jordan form J instead of Λ.

Let's consider a simple 4 × 4 case first, in which there is only one repeated root λ_2 with an eigenvector x_2 and a generalized eigenvector x_2^(2), and the other two eigenvalues λ_1 and λ_3 are distinct with independent eigenvectors x_1 and x_3. Form a matrix M = (x_1, x_2, x_2^(2), x_3) from this basis of four vectors (3 eigenvectors and 1 generalized eigenvector). Now, consider what happens when we multiply A by M:

$$AM = (\lambda_1 x_1,\ \lambda_2 x_2,\ \lambda_2 x_2^{(2)} + x_2,\ \lambda_3 x_3) = M \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & 1 & \\ & & \lambda_2 & \\ & & & \lambda_3 \end{pmatrix} = MJ.$$

That is, A = MJM⁻¹, where J is almost diagonal: it has the λ's along the diagonal, but it also has 1's above the diagonal for the columns corresponding to generalized eigenvectors. This is exactly the Jordan form of the matrix A. J, of course, has the same eigenvalues as A since A and J are similar, but J is much simpler than A.

The generalization of this, when you perhaps have more than one repeated root, and perhaps the multiplicity of the root is greater than 2, is fairly obvious, and leads immediately to the formula given without proof in section 6.6 of the textbook. What I want to emphasize here, however, is not so much the formal theorem that a Jordan form exists, but how to use it via the generalized eigenvectors: in particular, that generalized eigenvectors give us linearly growing terms kλ^{k−1} and te^{λt} when we multiply by A^k and e^{At}, respectively.
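To make the column-by-column reading of AM = MJ concrete, here is a sketch along the lines of the 4 × 4 case above, assuming NumPy. The eigenvalues 2, 5, 5, 3 and the random M are arbitrary choices, and we build A from J rather than the reverse, so this only illustrates how A acts on the columns of M:

```python
import numpy as np

# Jordan form as in the text: distinct roots lambda_1 = 2 and lambda_3 = 3,
# plus a double root lambda_2 = 5 with a single 2x2 Jordan block.
J = np.array([[2.0, 0.0, 0.0, 0.0],
              [0.0, 5.0, 1.0, 0.0],
              [0.0, 0.0, 5.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])

# Any invertible M gives a defective A = M J M^{-1} whose basis of
# eigenvectors and generalized eigenvector is the columns of M.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ J @ np.linalg.inv(M)

x1, x2, x2g, x3 = M.T                        # the columns of M
assert np.allclose(A @ x1, 2 * x1)           # ordinary eigenvector
assert np.allclose(A @ x2, 5 * x2)           # ordinary eigenvector
assert np.allclose(A @ x2g, 5 * x2g + x2)    # generalized: extra + x2 term
assert np.allclose(A @ x3, 3 * x3)           # ordinary eigenvector
assert np.allclose(A @ M, M @ J)             # i.e., AM = MJ
```

The four assertions are exactly the four columns of AM = MJ: each ordinary eigenvector is simply scaled, while the generalized eigenvector picks up the extra x_2 term that puts the 1 above the diagonal of J.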
