7 Canonical Forms

7 Canonical Forms 7.1 Jordan Forms, part I As previously, we remain concerned with the question of diagonalizability, more properly: given T 2 L(V), find a basis b such that the matrix [T]b is as close to diagonal as possible. In this chapter we see what is possible when T is not diagonalizable. We start with an example. −8 4 Example 7.1. A = −25 12 2 M2(R) has characteristic equation p(t) = (−8 − t)(12 − t) + 4 · 25 = t2 − 4t + 4 = (t − 2)2 and thus only one eigenvalue l = 2. We quickly observe that the eigenspace has dimension 1: −10 4 2 E = N = Span 2 −25 10 5 We cannot diagonalize A since there are insufficient independent eigenvectors. However, if we con- 2 sider a basis b = fv1, v2g where v1 = 5 is an eigenvector, we see that [LA]b is upper-triangular, x which is better than nothing. How simple can we make this matrix? Let v2 = ( y ), then −8x + 4y x −10x + 4y Av = = 2 + = 2v + (−5x + 2y)v 2 −25x + 12y y −25x + 10y 2 1 2 −5x + 2y =) [L ] = A b 0 2 Since v2 isn’t parallel to v1, the only thing we cannot have is a diagonal matrix. The next best thing is to have the upper right corner be 1; for instance, 2 1 2 1 b = fv , v g = , =) [L ] = 1 2 5 3 A b 0 2 The matrix [LA]b in this example is of a special type: Definition 7.2. A Jordan block is a square matrix of the form 0l 1 1 B .. C B l . C J = B . C @ .. 1A l where l 2 F and all non-indicated entries are zero. Every 1 × 1 matrix is also a Jordan block. A Jordan canonical form is a block-diagonal matrix diag(J1,..., Jm) where each Jk is a Jordan block. A Jordan canonical basis for T 2 L(V) is a basis b of V such that [T]b is a Jordan canonical form. If a map is diagonalizable, then any eigenbasis is Jordan canonical and the corresponding Jordan canonical form is diagonal. Our interest is in the situation when a map isn’t diagonalizable. 1 Example 7.3. It can easily be checked that if 0 1 80 1 0 1 0 19 −1 2 3 < 1 1 1 = A = @−4 5 4A and b = fv1, v2, v3g = @2A , @1A , @0A −2 1 4 : 0 1 1 ; then 03 1 01 Av1 = 3v1, Av2 = v1 + 3v2, Av3 = 2v3 =) [LA]b = @0 3 0A 0 0 2 Thus b is a Jordan canonical basis for A (that is, for LA). Generalized Eigenvectors We have two goals: showing that a linear map has a Jordan canonical basis, and computing such. For instance, in Example 7.3, if we were merely given A, how could we find b? We brute-forced this in Example 7.1, but such is not a reasonable approach in general. Eigenvectors get you some of the distance but not all the way: • v1 is an eigenvector in Example 7.1, but v2 is not; • v1, v3 are eigenvectors in Example 7.3, but v2 is not. The practical question is therefore how we fill out a Jordan canonical basis once we have all the eigenvectors. We now define the necessary objects. Definition 7.4. Let T 2 L(V) have an eigenvalue l: its generalized eigenspace is k [ k Kl := fx 2 V : (T − lI) (x) = 0 for some k 2 Ng = N (T − lI) k2N A non-zero element of Kl is called a generalized eigenvector. n If A 2 Mn(F) then its generalized eigenspaces are those of the linear map LA 2 L(F ). It is easy to check that our earlier Jordan canonical bases consist of generalized eigenvectors: 2 0 0 • In Example 7.1, we have one eigenvalue l = 2. Since (A − 2I) = 0 0 is the zero matrix, it 2 is immediate that we have a single generalized eigenspace: K2 = R . Every non-zero vector is therefore a generalized eigenvector. • In Example 7.3, we easily check that 2 (A − 3I)v1 = 0, (A − 3I) v2 = 0, (A − 2I)v3 = 0 so that the basis b consists of generalized eigenvectors. Indeed it can be seen that K3 = Spanfv1, v2g, K2 = E2 = Spanfv3g though to verify this thoroughly using only the definition is a little awkward. 2 In order to compute generalized eigenspaces directly, it is useful to invoke the main result of this section; we’ll delay the proof until later. Theorem 7.5. Suppose T is a linear operator on a finite-dimensional vector space V over F. Suppose that the characteristic polynomial of T splits over F: m1 mk p(t) = (l1 − t) ··· (lk − t) (∗) where the lj are the distinct eigenvalues of T with multiplicities mj. Then: m 1. For each eigenvalue, Kl = N (T − lI) ; 2. V = Kl1 ⊕ · · · ⊕ Klk so that dim Klj = mj for each eigenvalue. Compare, especially part 2, with the statement on diagonalizability from the start of the course. Observe how Example 7.3 works in this language: p(t) = det(A − tI) = (3 − t)2(2 − t)1 02 −1 −11 8011 0119 2 < = N (A − 3I) = N @0 0 0 A = Span @2A , @1A = K3 =) dim K3 = 2 2 −1 −1 : 0 1 ; 011 1 N (A − 2I) = Span @0A = E2 = K2 =) dim K2 = 1 1 3 R = K3 ⊕ K2 5 2 −1 Example 7.6. We find the generalized eigenspaces of the matrix A = 0 0 0 9 6 −1 The characteristic polynomial is 5 − t −1 p(t) = det(A − lI) = −t = −t(t2 − 5t + t − 5 + 9) = −(0 − t)1(2 − t)2 9 −1 − t whence there are two eigenvalues. 1 1 • l = 0 has multiplicity1 and so K0 = N (A − 0I) = N (A) = Span −1 . This is in fact E0. 3 • l = 2 has multiplicity2, and so 2 03 2 −11 00 −4 01 8011 0019 2 < = K2 = N (A − 2I) = N @0 −2 0 A = N @0 4 0A = Span @0A , @0A 9 6 −3 0 −12 0 : 0 1 ; 1 Observe that the corresponding eigenspace is only one-dimensional; E2 = Span 0 , and so the 3 matrix is not diagonalizable. 3 Observe also that R = K0 ⊕ K2 in accordance with the Theorem. 3 Properties of Generalized Eigenspaces and the Proof of Theorem 7.5 Quite a lot of work is required in order to justify our main result. You might want to skip the proofs at first reading. Lemma 7.7. Let l be an eigenvalue of an operator T on a vector space V. Then: 1. El is a subspace of Kl, which is indeed a subspace of V; 2. Kl is T-invariant; 3. If m 6= l and Kl is finite-dimensional, then T − mI restricted to Kl is an isomorphism of Kl. In particular, Kl does not contain the eigenspace Em. Proof. Part 1 is an easy exercise. k 2. Let x 2 Kl, then 9k 2 N such that (T − lI) (x) = 0. But then (T − lI)kT(x) = (T − lI)kT(x) − lx + lx = (T − lI)k+1(x) + l(T − lI)k(x) = 0 =) T(x) 2 Kl 3. Since Kl is T-invariant, it is also (T − mI)-invariant. It is therefore enough to show that T − mI is injective on Kl. Suppose not, then 9x 2 Kl non-zero such that (T − mI)(x) = 0. Let k 2 N be k k−1 minimal such that (T − lI) (x) = 0, and let y = (T − lI) (x). Clearly y 2 El. But then (T − mI)(y) = (T − lI + (l − m)I(T − lI)k−1(x) = (T − lI)k−1(T − lI + (l − m)I(x) = (T − lI)k−1(T − mI)(x) = 0 Thus y 2 Em \ El =) y = 0, contradicting the minimality of k. Now we come to the main proof: it’s a bit of a monster, but things become much easier once we’re through with it. Proof of Theorem 7.5. Fix an eigenvalue l. Since Kl is T-invariant, the characteristic polynomial pl(t) of the restriction TKl divides that of T. Since Kl contains no eigenvectors except those in El, we see that dim Kl pl(t) = (l − t) =) dim Kl ≤ m By the Cayley–Hamilton Theorem, TKl satisfies its characteristic polynomial, whence dim Kl m (lI − T) (x) = 0, 8x 2 Kl =) Kl ⊆ N (T − lI) m That N (T − lI) ⊆ Kl is by definition of Kl. We’ve therefore established part 1. For part 2, we prove by induction on the number of distinct eigenvalues of T. Let dim V = n and suppose the characteristic polynomial of T splits as in (∗). (Base case) If T has one eigenvalue, then p(t) = (l − t)n. By Cayley–Hamilton, (T − lI)n(x) = 0 for all x 2 V and so V = Kl. 4 (Induction step) Fix k 2 N and suppose the result is true for any map with k − 1 distinct eigenvalues. Suppose T 2 L(V) has k distinct eigenvalues. Let lk have multiplicity mk and define mk W = R(T − lkI) The subspace W has the following properties: • W is T-invariant, mk w 2 W =) 9x such that w = (T − lkI) (x) mk+1 =) T(w) = (T − lkI) (x) + lkw 2 W whence the characteristic polynomial of the restriction TW divides that of T. • By Lemma 7.7, we have isomorphisms (T − l I)K 2 L(K ) for each j 6= k.

Load more