
(VI.E) Jordan Normal Form

Set $V = \mathbb{C}^n$ and let $T : V \to V$ be any linear transformation, with distinct eigenvalues $\sigma_1, \ldots, \sigma_m$. In the last lecture we showed that $V$ decomposes into stable eigenspaces for $T$:

$$V = W_1 \oplus \cdots \oplus W_m = \ker^s(T - \sigma_1 I) \oplus \cdots \oplus \ker^s(T - \sigma_m I).$$

Let $\mathcal{B} = \{\mathcal{B}_1, \ldots, \mathcal{B}_m\}$ be a basis for $V$ subordinate to this decomposition, and set $B_k = [T|_{W_k}]_{\mathcal{B}_k}$, so that

$$[T]_{\mathcal{B}} = \mathrm{diag}\{B_1, \ldots, B_m\}.$$

Each $B_k$ has only $\sigma_k$ as eigenvalue. In the event that $A = [T]_{\hat{e}}$ is diagonalizable, or equivalently $\ker^s(T - \sigma_k I) = \ker(T - \sigma_k I)$ for all $k$, $\mathcal{B}$ is an eigenbasis and $[T]_{\mathcal{B}}$ is a

$$\mathrm{diag}\{\underbrace{\sigma_1, \ldots, \sigma_1}_{d_1 = \dim W_1};\ \ldots;\ \underbrace{\sigma_m, \ldots, \sigma_m}_{d_m = \dim W_m}\}.$$

Otherwise we must perform further surgery on the $B_k$'s separately, in order to transform the blocks $B_k$ (and so the entire matrix for $T$) into the "simplest possible" form. The attentive reader will have noticed above that I have written

$T - \sigma_k I$ in place of $\sigma_k I - T$. This is a strategic move: when dealing with characteristic polynomials it is far more convenient to write $\det(\lambda I - A)$ to produce a monic polynomial. On the other hand, as you'll see now, it is better to work on the individual $W_k$ with the nilpotent transformation $T|_{W_k} - \sigma_k I =: N_k$.

Decomposition of the Stable Eigenspaces (Take 1). Let's briefly omit subscripts and consider $T : W \to W$ with one eigenvalue $\sigma$, $\dim W = d$, $\mathcal{B}$ a basis for $W$ and $[T]_{\mathcal{B}} = B$. The operator $N = T - \sigma I$

(with matrix $[N]_{\mathcal{B}} = B - \sigma I_d$) is nilpotent, and so $f_N(\lambda) = \lambda^d$ (since its only eigenvalue is $0$). Since $f_N(\lambda)$ must be the product of the invariant factors (of $\lambda I - N$), the normal form of $\lambda I - N$ is quite limited:
$$\mathrm{nf}(\lambda I - N) = \mathrm{diag}\{1, \ldots, 1, \lambda^{q^{(r)}}, \ldots, \lambda^{q^{(d)}}\}, \quad\text{where}^{1}\ q^{(r)} + \cdots + q^{(d)} = d.$$
To express the corresponding rational canonical form for $N$ we introduce the following notation:

$$N_\sigma^q := \begin{pmatrix} \sigma & & & \\ 1 & \sigma & & \\ & \ddots & \ddots & \\ & & 1 & \sigma \end{pmatrix} \qquad [q \times q \text{ matrix}].$$
Then
$$(1)\qquad R(N) = \mathrm{diag}\left\{N_0^{q^{(r)}}, \ldots, N_0^{q^{(d)}}\right\};$$
for instance, if
$$\mathrm{nf}(\lambda I - N) = \begin{pmatrix} 1 & & & & \\ & 1 & & & \\ & & 1 & & \\ & & & \lambda^2 & \\ & & & & \lambda^3 \end{pmatrix}$$
then
$$R(N) = \begin{pmatrix} 0 & & & & \\ 1 & 0 & & & \\ & & 0 & & \\ & & 1 & 0 & \\ & & & 1 & 0 \end{pmatrix}.$$
(Notice the apparent "gap" between the first and second 1's.)

¹Once and for all: the numbers $(r), \ldots, (d)$ are not exponents, just superscripts.

So for some change-of-basis matrix S we have

$$B - \sigma I_d = [N]_{\mathcal{B}} = S\, R(N)\, S^{-1} \quad\text{or}\quad B = S\, R(N)\, S^{-1} + \sigma I_d = S\,(R(N) + \sigma I_d)\, S^{-1}.$$

Using (1) this means that if, say, $S = S_{\mathcal{C}\to\mathcal{B}}$ then

$$[T]_{\mathcal{C}} = S^{-1} B S = \mathrm{diag}\left\{N_0^{q^{(r)}}, \ldots, N_0^{q^{(d)}}\right\} + \sigma I_d = \mathrm{diag}\left\{N_\sigma^{q^{(r)}}, \ldots, N_\sigma^{q^{(d)}}\right\};$$
for example,
$$\begin{pmatrix} \sigma & & & & \\ 1 & \sigma & & & \\ & & \sigma & & \\ & & 1 & \sigma & \\ & & & 1 & \sigma \end{pmatrix} = \mathrm{diag}\left\{N_\sigma^{2},\ N_\sigma^{3}\right\}.$$

This is called a big Jordan block, and the "boxes" $N_\sigma^q$ are little Jordan blocks. They have very easy characteristic polynomials, namely $(\lambda - \sigma)^q$ (after all, $T : W \to W$ has only eigenvalue $\sigma$). The little blocks correspond to a decomposition of $W$ into $N$-cyclic subspaces

$$W = W^{(r)} \oplus \cdots \oplus W^{(d)}$$
as in the rational canonical structure theorem (§VI.C). $\mathcal{C}$ is a basis subordinate to this decomposition, consisting of a choice of "cyclic vector" for each $W^{(i)}$ and its successive images under $N$; more on this later.

Going back to a transformation $T : V \to V$ with multiple eigenvalues, what we are aiming at is a further decomposition of each of the $W_k = \ker^s(T - \sigma_k I)$ into $N_k$-cyclic subspaces:

$$V = \left(W_1^{(r_1)} \oplus \cdots \oplus W_1^{(d_1)}\right) \oplus \cdots \oplus \left(W_m^{(r_m)} \oplus \cdots \oplus W_m^{(d_m)}\right).$$

With respect to some special basis $\mathcal{C}$ subordinate to this entire direct sum decomposition² the matrix of $T$ will then be

$$\mathcal{J}(T) = \mathrm{diag}\left\{N_{\sigma_1}^{q_1^{(r_1)}}, \ldots, N_{\sigma_1}^{q_1^{(d_1)}};\ \ldots;\ N_{\sigma_m}^{q_m^{(r_m)}}, \ldots, N_{\sigma_m}^{q_m^{(d_m)}}\right\},$$
where each semicolon separates two big Jordan blocks³ (and each comma two little Jordan blocks). As it stands this looks formidable, and at least the basis $\mathcal{C}$ looks very hard to compute.
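The matrix itself, at least, is mechanical to assemble once the little-block data $(\sigma, q)$ is known. Here is a minimal numpy sketch (not from the notes; the helper names are mine) that builds little blocks with the 1's below the diagonal, matching the convention above, and strings them along the diagonal.

```python
import numpy as np

def little_block(sigma, q):
    """q x q little Jordan block N_sigma^q: sigma on the diagonal, 1's just below it."""
    return sigma * np.eye(q) + np.eye(q, k=-1)

def jordan_matrix(blocks):
    """Block-diagonal matrix assembled from a list of (sigma, q) pairs."""
    n = sum(q for _, q in blocks)
    J = np.zeros((n, n))
    i = 0
    for sigma, q in blocks:
        J[i:i+q, i:i+q] = little_block(sigma, q)
        i += q
    return J

# One big block for sigma = 5 (little blocks of sizes 2 and 3),
# then a big block for sigma = 7 (a single 1x1 little block).
print(jordan_matrix([(5, 2), (5, 3), (7, 1)]))
```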

Decomposition of $W_k$ (Take 2). So in order to facilitate computation of the entire Jordan decomposition $A = S\mathcal{J}S^{-1}$, it is useful to have an approach to the "nilpotent building blocks" $N_k : W_k \to W_k$ that does not appeal to rational canonical form.

DEFINITION 1. The height $h$ of a nilpotent transformation $N : W \to W$ is defined by $N^h = 0$, $N^{h-1} \neq 0$. (The minimal polynomial of such a transformation is simply $\lambda^h$.) One may also define the ($N$-)height $h(\vec{w})$ of $\vec{w} \in W$ by $N^{h(\vec{w})}\vec{w} = 0$, $N^{h(\vec{w})-1}\vec{w} \neq 0$. Clearly the height of $N$ is just the supremum of the ($N$-)heights of vectors $\in W$. (The height of a subspace of $W$, similarly, just means the supremum of heights of vectors in that subspace.)
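In coordinates both heights are straightforward to compute; the following numpy sketch (my own illustration, not part of the notes) simply multiplies by $N$ until the result vanishes.

```python
import numpy as np

def operator_height(N, tol=1e-10):
    """Smallest h with N^h = 0 (assumes N is nilpotent)."""
    P, h = np.eye(N.shape[0]), 0
    while np.abs(P).max() > tol:
        P, h = P @ N, h + 1
    return h

def vector_height(N, w, tol=1e-10):
    """Smallest h(w) with N^h(w) w = 0."""
    h = 0
    while np.abs(w).max() > tol:
        w, h = N @ w, h + 1
    return h

N = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0]])      # the little block N_0^3
print(operator_height(N))                             # 3
print(vector_height(N, np.array([1., 0, 0])))         # 3
print(vector_height(N, np.array([0., 0, 1])))         # 1
```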

Notation: We shall write $Z_N^{h_0}(\vec{w})$ for the $N$-cyclic subspace of $W$ generated by $\vec{w}$ ($\vec{w}$ of height $h_0$); that is,

$$\mathrm{span}\{\vec{w},\ N\vec{w},\ \ldots,\ N^{h_0-1}\vec{w}\}.$$

CLAIM 2. Given N : W → W nilpotent of height h , with

$\{N^{h-1}\vec{w}_1, \ldots, N^{h-1}\vec{w}_t\}$ a basis for $\mathrm{im}(N^{h-1})$. Then we have a decomposition

$$W = Z_N^h(\vec{w}_1) \oplus \cdots \oplus Z_N^h(\vec{w}_t) \oplus \underbrace{W^{(t+1)} \oplus \cdots \oplus W^{(s)}}_{\text{cyclic of height } \le h-1}.$$

²But this is not enough; "special" means the basis also has "cyclic" properties (to be described below).
³There is one for each eigenvalue.

PROOF (BY INDUCTION ON $h$).
$h = 1$: $W = \ker N$. Take any basis for $W$; the spans of the individual basis vectors give the desired decomposition.
Inductive step: The following picture of $W$ is useful, where $N$ acts by sending you up one level, the top row is $\ker N$, and the height of a vector is its "distance" from the top.

[Figure: the top row (height 0) is $\ker N$, containing $\mathrm{im}(N^{h-1})$ spanned by $N^{h-1}\vec{w}_1, \ldots, N^{h-1}\vec{w}_t$; the arrows labeled $N$ carry $\vec{w}_1, \ldots, \vec{w}_t$ at the bottom up, level by level, to the top.]

In order to apply the inductive hypothesis (= the Claim for height $h-1$), we consider $U = \ker N^{h-1} \subseteq W$. Now $N|_U$ has height $h-1$:

[Figure: the picture of $U$; its top row contains $\mathrm{im}\left((N|_U)^{h-2}\right)$, spanned by $N^{h-1}\vec{w}_1, \ldots, N^{h-1}\vec{w}_t, N^{h-2}\vec{u}_{t+1}, \ldots, N^{h-2}\vec{u}_r$, and its bottom row is $N\vec{w}_1, \ldots, N\vec{w}_t, \vec{u}_{t+1}, \ldots, \vec{u}_r$.]

Set $\{\vec{u}_1, \ldots, \vec{u}_t\} = \{N\vec{w}_1, \ldots, N\vec{w}_t\} \subseteq U$, and complete

$$\{N^{h-2}\vec{u}_1, \ldots, N^{h-2}\vec{u}_t\}\ \left(= \{N^{h-1}\vec{w}_1, \ldots, N^{h-1}\vec{w}_t\}\right)$$

to a basis for $\mathrm{im}\{(N|_U)^{h-2}\}$, by adding some $\{N^{h-2}\vec{u}_{t+1}, \ldots, N^{h-2}\vec{u}_r\}$ as shown. By the inductive hypothesis,

$$U = Z_N^{h-1}(\vec{u}_1) \oplus \cdots \oplus Z_N^{h-1}(\vec{u}_t) \oplus Z_N^{h-1}(\vec{u}_{t+1}) \oplus \cdots \oplus Z_N^{h-1}(\vec{u}_r) \oplus \underbrace{U^{(r+1)} \oplus \cdots \oplus U^{(s)}}_{N\text{-cyclic of height} \le h-2}.$$

The picture is now

[Figure: the columns guaranteed by the inductive hypothesis, with tops $N^{h-1}\vec{w}_1, \ldots, N^{h-1}\vec{w}_t, N^{h-2}\vec{u}_{t+1}, \ldots, N^{h-2}\vec{u}_r$ and bottom row $N\vec{w}_1, \ldots, N\vec{w}_t, \vec{u}_{t+1}, \ldots, \vec{u}_r$.]

$U$ is the direct sum of the columns (= cyclic subspaces under the action of $N$). So why not tack $\vec{w}_1, \ldots, \vec{w}_t$ on to the bottom of the first $t$ columns and obtain a decomposition

[Figure: the same diagram with $\vec{w}_1, \ldots, \vec{w}_t$ appended below $N\vec{w}_1, \ldots, N\vec{w}_t$.]

as promised? Here's how to do this rigorously: since $N^{h-1}\vec{w}_1, \ldots, N^{h-1}\vec{w}_t$ are a basis for $\mathrm{im}(N^{h-1})$, no nontrivial linear combination of $\vec{w}_1, \ldots, \vec{w}_t$ can be in $\ker(N^{h-1}) = U$. Therefore

$$(2)\qquad \mathrm{span}\{\vec{w}_1, \ldots, \vec{w}_t\} \cap U = \{0\},$$
which implies that

$$\dim\left(\mathrm{span}\{\vec{w}_1, \ldots, \vec{w}_t\}\right) + \dim U = \dim\left(\mathrm{span}\{\vec{w}_1, \ldots, \vec{w}_t\} + U\right).$$

Rank + Nullity applied to $N^{h-1}$ $\Longrightarrow$

$$\dim W = \dim\left(\mathrm{span}\{\vec{w}_1, \ldots, \vec{w}_t\} + U\right) \Longrightarrow W = \mathrm{span}\{\vec{w}_1, \ldots, \vec{w}_t\} + U.$$

Together with (2) this $\Longrightarrow$

$$W = \mathrm{span}\{\vec{w}_1, \ldots, \vec{w}_t\} \oplus U,$$
which by our decomposition of $U$

$$= \left(\mathrm{span}\{\vec{w}_1, \ldots, \vec{w}_t\} \oplus Z_N^{h-1}(\vec{u}_1) \oplus \cdots \oplus Z_N^{h-1}(\vec{u}_t)\right) \oplus Z_N^{h-1}(\vec{u}_{t+1}) \oplus \cdots \oplus Z_N^{h-1}(\vec{u}_r) \oplus U^{(r+1)} \oplus \cdots \oplus U^{(s)}.$$

Since $\{\vec{u}_1, \ldots, \vec{u}_t\}$ was just $\{N\vec{w}_1, \ldots, N\vec{w}_t\}$, the direct sum in the parentheses is just

$$Z_N^h(\vec{w}_1) \oplus \cdots \oplus Z_N^h(\vec{w}_t)$$
and we are done. $\square$

The upshot of the Claim just proved is this (weaker)

PROPOSITION 3. N : W → W nilpotent =⇒

$$W = W^{(1)} \oplus \cdots \oplus W^{(s)} = Z_N^{h_1}(\vec{w}_1) \oplus \cdots \oplus Z_N^{h_s}(\vec{w}_s);$$
that is, $W$ decomposes into a direct sum of $N$-cyclic subspaces of various heights.⁴

⁴This decomposition is not unique, but the numbers $s$ and $\{h_1, \ldots, h_s\}$ are.

So if we take

$$\mathcal{C} = \{\vec{w}_1, N\vec{w}_1, \ldots, N^{h_1-1}\vec{w}_1;\ \ldots;\ \vec{w}_s, N\vec{w}_s, \ldots, N^{h_s-1}\vec{w}_s\},$$
then
$$[N]_{\mathcal{C}} = \mathrm{diag}\left\{N_0^{h_1}, \ldots, N_0^{h_s}\right\}.$$
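Concretely, if $P$ has the vectors of $\mathcal{C}$ as its columns, then $P^{-1}NP$ is exactly this block matrix. A small numpy check (the matrix $N$ and the generators below are made up for illustration):

```python
import numpy as np

# A nilpotent N of height 2 with two chains of length 2, generated by w1 and w2.
N = np.array([[1., -1, 0, 0],
              [1., -1, 0, 0],
              [0.,  0, 0, 0],
              [0.,  0, 1, 0]])
w1 = np.array([1., 0, 0, 0])
w2 = np.array([0., 0, 1, 0])

P = np.column_stack([w1, N @ w1, w2, N @ w2])          # columns = the basis C
print(np.round(np.linalg.inv(P) @ N @ P).astype(int))  # diag{N_0^2, N_0^2}
```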

Given $T : V \to V$ with distinct eigenvalues $\{\sigma_1, \ldots, \sigma_m\}$ and associated stable eigenspaces $W_k = \ker^s(T - \sigma_k I)$, we apply this Proposition to the nilpotent transformations $(T|_{W_k} - \sigma_k I) = N_k : W_k \to W_k$ and find

$$V = W_1 \oplus \cdots \oplus W_m = \left(W_1^{(1)} \oplus \cdots \oplus W_1^{(s_1)}\right) \oplus \cdots \oplus \left(W_m^{(1)} \oplus \cdots \oplus W_m^{(s_m)}\right)$$

where the $W_k^{(i)}$ are $N_k$-cyclic. Let $\mathcal{C}_k^{(i)} = \{\vec{w}_k^{(i)},\ N_k\vec{w}_k^{(i)},\ \ldots,\ N_k^{h_k^{(i)}-1}\vec{w}_k^{(i)}\}$ be cyclic bases for the $W_k^{(i)}$, and combine them to produce $\mathcal{C}_k = \{\mathcal{C}_k^{(1)}, \ldots, \mathcal{C}_k^{(s_k)}\}$, and then $\mathcal{C} = \{\mathcal{C}_1, \ldots, \mathcal{C}_m\}$. Clearly $\mathcal{C}$ is a basis for $V$ and it is subordinate to the entire direct-sum decomposition just written. With respect to this basis
$$[T]_{\mathcal{C}} = \mathrm{diag}\left\{\left[T|_{W_1}\right]_{\mathcal{C}_1}, \ldots, \left[T|_{W_m}\right]_{\mathcal{C}_m}\right\}$$
decomposes into big Jordan blocks, and each big block
$$\left[T|_{W_k}\right]_{\mathcal{C}_k} = \mathrm{diag}\left\{\left[T|_{W_k^{(1)}}\right]_{\mathcal{C}_k^{(1)}}, \ldots, \left[T|_{W_k^{(s_k)}}\right]_{\mathcal{C}_k^{(s_k)}}\right\}$$
decomposes into little blocks. Now for $N : W \to W$ it is clear from our pictures that the "top" of each "$N$-cycle" contributes one to the nullity of $N$.

PROPOSITION 4. $s = \{\#$ of $W^{(i)}$'s in the decomposition of $W\} = \dim(\ker N)$.

This has the following important consequence concerning T : V → V .

COROLLARY 5. $s_k = \{\#$ of little Jordan blocks in the big block for $\sigma_k\} = \dim\{\ker(T - \sigma_k I)\} =$ geometric multiplicity of $\sigma_k$.
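This is easy to see in an example. Below is a quick sympy check (a made-up matrix, not one from the notes); note that sympy's `jordan_form` puts the 1's above the diagonal rather than below, but the block count is the same.

```python
from sympy import Matrix, eye

A = Matrix([[5, 1, 0],
            [0, 5, 0],
            [0, 0, 5]])

P, J = A.jordan_form()                 # A == P * J * P.inv()
print(J)                               # blocks of sizes 2 and 1 for sigma = 5

geometric_mult = len((A - 5*eye(3)).nullspace())
print(geometric_mult)                  # 2 = number of little Jordan blocks
```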

How to compute Jordan form (and an associated basis). We will work in terms of the matrix $A = [T]_{\hat{e}}$; our goal is to rewrite it as $S\mathcal{J}S^{-1}$. Begin by computing and factoring $f_A(\lambda) = \det(\lambda I - A)$, in order to find the eigenvalues $\{\sigma_1, \ldots, \sigma_m\}$. Pick some $\sigma = \sigma_k$ to concentrate on, set $W = W_k = \ker^s(A - \sigma_k I)$, and use the following two steps to construct $\mathcal{C} = \mathcal{C}_k$. (Then repeat for the remaining eigenvalues.) Note that in the following $\mathcal{A}^{(i)}$ and $\mathcal{C}^{(i)}$ always denote finite collections of vectors (like a basis).

Step I. Again consider the picture of $W$:

[Figure: the nested kernels $\ker(A - \sigma I) \subseteq \ker(A - \sigma I)^2 \subseteq \ker(A - \sigma I)^3 \subseteq \cdots \subseteq \ker(A - \sigma I)^h$, drawn as levels (1), (2), (3), (4) of $W$.]

Find bases for the successive “levels” of W :

$\mathcal{A}^{(1)}$ for $\ker(A - \sigma I)$, $\{\mathcal{A}^{(1)}, \mathcal{A}^{(2)}\}$ for $\ker\{(A - \sigma I)^2\}$,

$\{\mathcal{A}^{(1)}, \mathcal{A}^{(2)}, \mathcal{A}^{(3)}\}$ for $\ker\{(A - \sigma I)^3\}$, and so on. This may require readjustment of bases along the way: the basis for $\ker(A - \sigma I)^2$ produced by the usual method may not include the elements of $\mathcal{A}^{(1)}$, etc. WARNING: the picture is slightly deceptive because (unlike $\mathrm{span}(\mathcal{A}^{(1)})$, which is $\ker(A - \sigma I)$) $\mathrm{span}(\mathcal{A}^{(2)})$, $\mathrm{span}(\mathcal{A}^{(3)})$, etc. are not uniquely determined (a different choice of $\mathcal{A}^{(2)}$, $\mathcal{A}^{(3)}$ etc. could change them). However, the number of vectors in $\mathcal{A}^{(i)}$ is unique – call this number $a_i$, and note that $a_1 \ge a_2 \ge a_3 \ge \cdots$.

Step II. Now we find the generators $\mathcal{C}^{(i)}$ for the cyclic subspaces of height $i$ ($1 \le i \le h$). Here is the picture:⁵

[Figure: the levels (1), (2), (3), (4) of the previous diagram, with $(A - \sigma I)$ mapping each level up to the one above.]

⁵$h = 4$ in this figure.

First set $\mathcal{C}^{(h)} = \mathcal{A}^{(h)}$; this is a set of $a_h$ vectors. Then take $\mathcal{C}^{(h-1)} \subseteq \mathcal{A}^{(h-1)}$ to be some subset such that
$$\left\{(A - \sigma I)\mathcal{C}^{(h)},\ \mathcal{C}^{(h-1)}\right\}$$
is an independent set consisting of $a_{h-1}$ vectors.

REMARK 6. (1) $\mathcal{C}^{(h-1)}$ contains $(a_{h-1} - a_h)$ vectors. If $a_{h-1} = a_h$ then $\mathcal{C}^{(h-1)}$ will be empty.
(2) $\mathrm{span}\left\{(A - \sigma I)\mathcal{C}^{(h)}, \mathcal{C}^{(h-1)}\right\}$ will not in general be the same subspace as $\mathrm{span}(\mathcal{A}^{(h-1)})$, although the dimensions are equal.
(3) You may wish to put subscripts on the $\mathcal{A}^{(h)}$'s, viz. $\mathcal{A}_k^{(h)}$ (and likewise $\mathcal{C}_k^{(h)}$ below), to avoid confusing the bases for different $W_k$'s.

Now let $\mathcal{C}^{(h-2)} \subseteq \mathcal{A}^{(h-2)}$ be any subset (containing $a_{h-2} - a_{h-1}$ vectors) so that
$$\left\{(A - \sigma I)^2\mathcal{C}^{(h)},\ (A - \sigma I)\mathcal{C}^{(h-1)},\ \mathcal{C}^{(h-2)}\right\}$$
is an independent set consisting of $a_{h-2}$ vectors, and so on. It may help to actually write the vectors inside diagrams similar to the ones we have drawn.

The desired basis (for $W_k$) is then
$$\mathcal{C}_k = \left\{\mathcal{C}^{(1)};\ \mathcal{C}^{(2)},\ (A - \sigma I)\mathcal{C}^{(2)};\ \mathcal{C}^{(3)},\ (A - \sigma I)\mathcal{C}^{(3)},\ (A - \sigma I)^2\mathcal{C}^{(3)};\ \text{etc.}\right\}$$

(where e.g. $(A - \sigma I)\mathcal{C}^{(2)}$ means: apply $(A - \sigma I)$ to each of the vectors of $\mathcal{C}^{(2)}$). Each element of $\mathcal{C}^{(i)}$ will correspond to a small Jordan block of the form $N_\sigma^i$, and this is in fact all the information you need to write down the big Jordan block for $\sigma_k$. In terms of our picture: the whole diagram corresponds to a big Jordan block for one eigenvalue $\sigma$; each column corresponds to a little Jordan (i.e., $(A - \sigma I)$-cyclic) block; and the dimension of the little block is the height of the corresponding column.
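Here is one way the two steps might look in code, a sympy sketch of my own (the function name and the exact bookkeeping are not from the notes). It returns one chain $\{g, (A-\sigma I)g, (A-\sigma I)^2 g, \ldots\}$ per little Jordan block; a new generator at level $i$ is accepted when it is independent of $\ker(A - \sigma I)^{i-1}$ together with the vectors inherited from taller chains, which is the role played by the basis readjustments of Step I.

```python
from sympy import Matrix, eye

def jordan_chains(A, sigma):
    """Steps I-II for one eigenvalue sigma (M = A - sigma*I):
    return chains [g, M g, M^2 g, ...], one per little Jordan block."""
    n = A.rows
    M = A - sigma * eye(n)

    # Step I: bases of ker(M), ker(M^2), ... until they stabilize.
    kernels = [[]]                                  # ker(M^0) = {0}
    while True:
        K = (M ** len(kernels)).nullspace()
        if len(K) == len(kernels[-1]):
            break
        kernels.append(K)
    h = len(kernels) - 1                            # height on the stable kernel

    # Step II: choose chain generators from the top level downward.
    gens = []                                       # pairs (generator, height)
    for i in range(h, 0, -1):
        level = [M ** (hg - i) * g for g, hg in gens]   # vectors inherited at level i
        for v in kernels[i]:
            cols = kernels[i - 1] + level
            if not cols or Matrix.hstack(*cols, v).rank() > Matrix.hstack(*cols).rank():
                gens.append((v, i))
                level.append(v)
    return [[M ** j * g for j in range(hg)] for g, hg in gens]

# Tiny illustration (made-up matrix): one chain of length 2 and one of length 1.
A = Matrix([[5, 1, 0], [0, 5, 0], [0, 0, 5]])
for chain in jordan_chains(A, 5):
    print([list(v) for v in chain])
```

Concatenating the chains, suitably ordered as in the text, gives the columns of $S_{\mathcal{C}}$ contributed by this eigenvalue.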

EXAMPLE 7. Consider the matrix
$$A = \begin{pmatrix} 2 & -1 & 1 & -1 \\ -1 & 2 & -2 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & -1 & 1 & 0 \end{pmatrix} \longrightarrow f_A(\lambda) = \det(\lambda I - A) = (\lambda - 1)^3(\lambda - 2).$$
We will work on $\sigma = 1$ first. Since
$$\ker(A - I) = \mathrm{span}\left\{\begin{pmatrix} 0 \\ -1 \\ 0 \\ 1 \end{pmatrix}\right\}$$
is $1$-dimensional, our Corollary $\Longrightarrow$ the big Jordan block for $\sigma = 1$ contains 1 little Jordan block! That is, $\ker^s(A - I)$ is cyclic under $A - I = N$. Therefore we know immediately the Jordan normal form
$$\mathcal{J} = \begin{pmatrix} 1 & & & \\ 1 & 1 & & \\ & 1 & 1 & \\ & & & 2 \end{pmatrix}$$
of $A$, but not the basis change $S$ (yet). Continue to compute:

 −1   0       −  2  0   1  ker(A − I) = span   ,   ,  1   0     0 1 

 0   −1   0          3  0   0   1  ker(A − I) = span   ,   ,    0   1   0     1 0 0   −1   0   0       −     0   1   0  = span   ,   ,   .  1   0   0     0 1 1  12 (VI.E) JORDAN NORMAL FORM

The picture of $W = \ker^s(A - I)$ is

[Figure: a single column with levels (1), (2), (3), connected by $A - I$.]

where
$$\mathcal{A}^{(1)} = \left\{\begin{pmatrix} 0 \\ -1 \\ 0 \\ 1 \end{pmatrix}\right\}, \quad \mathcal{A}^{(2)} = \left\{\begin{pmatrix} -1 \\ 0 \\ 1 \\ 0 \end{pmatrix}\right\}, \quad \mathcal{A}^{(3)} = \left\{\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}\right\} = \{\hat{e}_4\}.$$

The beginning of "Step II" is now $\mathcal{C}^{(3)} = \mathcal{A}^{(3)}$. In fact, there is no $\mathcal{C}^{(2)}$ or $\mathcal{C}^{(1)}$ (our picture of $W$ has no "steps", just one column), and so this is also the end. Throwing out $\mathcal{A}^{(1)}, \mathcal{A}^{(2)}$ and replacing them with the cyclic images of $\mathcal{C}^{(3)} = \{\hat{e}_4\}$ under $(A - I)$, the desired basis is
$$\mathcal{C}_1 = \left\{\mathcal{C}^{(3)},\ (A - I)\mathcal{C}^{(3)},\ (A - I)^2\mathcal{C}^{(3)}\right\} = \left\{\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \\ 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ 0 \\ 1 \end{pmatrix}\right\}$$

(NOT $\mathcal{A}^{(1)}, \mathcal{A}^{(2)}, \mathcal{A}^{(3)}$!!). For the eigenvalue $\sigma = 2$ we find

 −2   −2         1   1  ker(A − 2I) = span   ; so C2 =   .  1   1    0  0 

Setting $\mathcal{C} = \{\mathcal{C}_1, \mathcal{C}_2\}$, we may now write $A = S_{\mathcal{C}}\,\mathcal{J}\,S_{\mathcal{C}}^{-1} =$
$$\begin{pmatrix} 0 & -1 & 0 & -2 \\ 0 & 1 & -1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & -1 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & & & \\ 1 & 1 & & \\ & 1 & 1 & \\ & & & 2 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 2 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & -1 & 0 \end{pmatrix}.$$
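As a sanity check, sympy reproduces this block structure directly (its convention places the 1's above the diagonal, and it may choose a basis matrix different from $S_{\mathcal{C}}$):

```python
from sympy import Matrix

A = Matrix([[ 2, -1,  1, -1],
            [-1,  2, -2,  1],
            [ 0,  1,  1,  1],
            [ 0, -1,  1,  0]])

P, J = A.jordan_form()
print(J)                        # a 3x3 block for eigenvalue 1 and a 1x1 block for eigenvalue 2
print(A == P * J * P.inv())     # True
```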

For a more "general" example, giving rise to a diagram of the form

[Figure: the corresponding staircase diagram of cyclic subspaces.]

you might look at
$$A = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ -1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & -1 & 0 \end{pmatrix}.$$
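Rather than reproduce the diagram, here is a short numpy computation of my own (not from the notes) of the level counts $a_i$ for this matrix; since its only eigenvalue is $0$, we may take $\sigma = 0$, and the counts come out to $a_1 = 3$, $a_2 = 2$, $a_3 = 1$, i.e., columns of heights 3, 2, and 1.

```python
import numpy as np

A = np.array([[ 0, 0, 0, 0,  0, 0],
              [ 1, 0, 0, 0,  0, 0],
              [-1,-1, 0, 0,  0, 0],
              [ 0, 1, 0, 0,  1, 0],
              [-1, 0, 0, 0,  0, 0],
              [ 1, 0, 0, 0, -1, 0]], dtype=float)

def nullity(M):
    return M.shape[0] - np.linalg.matrix_rank(M)

# a_i = nullity(A^i) - nullity(A^(i-1)) = number of boxes at level i of the diagram
prev, i = 0, 1
while prev < A.shape[0]:
    cur = nullity(np.linalg.matrix_power(A, i))
    print(f"a_{i} = {cur - prev}")
    prev, i = cur, i + 1
```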

Application to differential equations. So suppose we had a system of equations

$$\begin{cases} \dfrac{dy_1}{dt} = 2y_1 - y_2 + y_3 - y_4 \\[2pt] \dfrac{dy_2}{dt} = -y_1 + 2y_2 - 2y_3 + y_4 \\[2pt] \dfrac{dy_3}{dt} = y_2 + y_3 + y_4 \\[2pt] \dfrac{dy_4}{dt} = -y_2 + y_3 \end{cases} \longrightarrow\quad \frac{d\vec{y}}{dt} = A\vec{y}$$
(where $A$ is the same as in the last example). Setting $\vec{c} = S_{\mathcal{C}}^{-1}\vec{y}$ (that is, changing coordinates as usual) we obtain the new system
$$\frac{dc_1}{dt} = c_1, \quad \frac{dc_2}{dt} = c_1 + c_2, \quad \frac{dc_3}{dt} = c_2 + c_3, \quad \frac{dc_4}{dt} = 2c_4.$$
Clearly $c_4(t) = Ce^{2t}$, while $(c_1(t), c_2(t), c_3(t))$ is a linear combination of
$$\left(e^t,\ te^t,\ \frac{t^2}{2}e^t\right), \quad \left(0,\ e^t,\ te^t\right), \quad\text{and}\quad \left(0,\ 0,\ e^t\right).$$
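A quick sympy check of this computation, using the matrices as reconstructed above; the matrix exponential $e^{\mathcal{J}t}$ packages exactly the functions $e^t$, $te^t$, $\tfrac{t^2}{2}e^t$, $e^{2t}$ appearing in the solutions.

```python
from sympy import Matrix, symbols

t = symbols('t')

A = Matrix([[2,-1, 1,-1], [-1, 2,-2, 1], [0, 1, 1, 1], [0,-1, 1, 0]])
S = Matrix([[0,-1, 0,-2], [0, 1,-1, 1], [0, 1, 0, 1], [1,-1, 1, 0]])   # columns = C_1, C_2
J = Matrix([[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 2]])

print(S.inv() * A * S == J)     # True: the substitution c = S^(-1) y decouples the system
print((J * t).exp())            # entries built from exp(t), t*exp(t), t**2/2*exp(t), exp(2*t)
```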

Applying $S_{\mathcal{C}}$ to $\vec{c}(t)$, as usual, recovers $\vec{y}(t)$.

More generally, for a block equation in Jordan form
$$\frac{d}{dt}\begin{pmatrix} c_1 \\ \vdots \\ \vdots \\ c_r \end{pmatrix} = \begin{pmatrix} \lambda & & & \\ 1 & \lambda & & \\ & \ddots & \ddots & \\ & & 1 & \lambda \end{pmatrix} \begin{pmatrix} c_1 \\ \vdots \\ \vdots \\ c_r \end{pmatrix},$$
one finds that any solution $(c_1(t), \ldots, c_r(t))$ is a linear combination of
$$\left(e^{\lambda t},\ te^{\lambda t},\ \ldots\ldots,\ \frac{t^{r-1}}{(r-1)!}e^{\lambda t}\right)$$
and its "shifts" to the right (as above).

The same functions arise in a slightly different fashion (as basis vectors rather than time-dependent coefficients). We are going to find all solutions of the ODE
$$\frac{d^n f}{dx^n} + a_{n-1}\frac{d^{n-1}f}{dx^{n-1}} + \cdots + a_1\frac{df}{dx} + a_0 f = 0.$$
Working over $\mathbb{C}$, we write
$$q(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 = \prod_{i=1}^m (\lambda - \sigma_i)^{r_i}$$

(where the $\sigma_i$ are distinct), define
$$V = \text{space of solutions to } q\!\left(\tfrac{d}{dx}\right)f = 0$$
(assume it's finite-dimensional); and let
$$D : V \to V \text{ be the restriction of } \tfrac{d}{dx} \text{ to } V.$$

Clearly q(D) = 0. Let λ0 be an eigenvalue of D ; then there exists a nonzero f ∈ V such that D f = λ0 f . But then 0 = q(D) f = q(λ0) f =⇒ q(λ0) = 0, and so λ0 is a root ( =⇒ λ0 = σi ). Therefore V decomposes into stable eigenspaces

$$V = \ker^s(D - \sigma_1 I) \oplus \cdots \oplus \ker^s(D - \sigma_m I).$$

Moreover, $\dim\ker(D - \sigma_k I) = 1$ (because the only solution to $Df = \sigma_k f$ is $Ce^{\sigma_k x}$), and so each of the stable kernels is cyclic nilpotent, spanned by the iterates of some generator

$$g_k,\ (D - \sigma_k I)g_k,\ (D - \sigma_k I)^2 g_k,\ \ldots\ldots$$

Suppose this doesn't terminate at $r_k$ iterations: more precisely, say (for $N \ge 1$)

$$(D - \sigma_k I)^{r_k + N - 1} g_k \neq 0 \quad\text{but}\quad (D - \sigma_k I)^{r_k + N} g_k = 0$$
(this must happen for some $N$ by the definition of stable kernel). Now on the one hand
$$(3)\qquad q(D)g_k = 0 \Longrightarrow \left(\prod_{i \neq k}(D - \sigma_i I)^{r_i}\right)(D - \sigma_k I)^{r_k + N - 1} g_k = 0.$$

On the other hand, $(D - \sigma_k I)^{r_k + N - 1} g_k$ is an eigenvector for $D$ with eigenvalue $\sigma_k$ (because it is killed by $D - \sigma_k I$) and therefore must be $Ce^{\sigma_k x}$ ($C$ a nonzero constant). Plugging this into (3), we have
$$0 = \prod_{i \neq k}(D - \sigma_i I)^{r_i}\, e^{\sigma_k x} = \prod_{i \neq k}(\sigma_k - \sigma_i)^{r_i}\, e^{\sigma_k x},$$
which is impossible because the $\sigma_1, \ldots, \sigma_m$ are distinct!

Therefore $\ker^s(D - \sigma_k I)$ is cyclic nilpotent (under $D - \sigma_k I$) with height $r_k$, as you might expect. The generator $g_k$ is just $\frac{x^{r_k - 1}}{(r_k - 1)!}e^{\sigma_k x}$. Thus the solutions to the above ODE are of the form
$$f(x) = \sum_{i=1}^m \sum_{j=0}^{r_i - 1} C_{ij}\, \frac{x^j}{j!}\, e^{\sigma_i x}.$$
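For a concrete instance (my own example, not from the notes), take $q(\lambda) = (\lambda - 1)^2(\lambda + 2) = \lambda^3 - 3\lambda + 2$; sympy's `dsolve` returns a combination of $e^x$, $xe^x$, $e^{-2x}$, as the formula predicts.

```python
from sympy import Function, dsolve, symbols

x = symbols('x')
f = Function('f')

# q(d/dx) f = 0 with q(lambda) = (lambda - 1)^2 (lambda + 2) = lambda^3 - 3*lambda + 2
ode = f(x).diff(x, 3) - 3*f(x).diff(x) + 2*f(x)
print(dsolve(ode, f(x)))        # f(x) = (C1 + C2*x)*exp(x) + C3*exp(-2*x), up to relabeling
```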

Uniqueness. We conclude this section with the general statement about matrices and Jordan form. By a Jordan matrix, we shall mean simply a block-diagonal matrix with blocks of the form $N_\sigma^i$.

THEOREM 8. Let $A \in M_n(\mathbb{C})$ be an arbitrary matrix. There is, up to reordering the blocks, exactly one Jordan matrix $\mathcal{J}(A)$ similar to $A$.

PROOF. Clearly we have established existence: there is an invertible $S \in M_n(\mathbb{C})$ (far from unique!) such that
$$S^{-1}AS = \mathcal{J} = \mathrm{diag}\left\{N_{\sigma_1}^{h_{11}}, \ldots, N_{\sigma_1}^{h_{1s_1}};\ \ldots;\ N_{\sigma_m}^{h_{m1}}, \ldots, N_{\sigma_m}^{h_{ms_m}}\right\},$$
with $\sigma_1, \ldots, \sigma_m$ the eigenvalues of $\mathcal{J}$, hence of $A$. Furthermore, for each $k$, the dimension of the big Jordan block is

$$\sum_{j=1}^{s_k} h_{kj} = \dim\left(E^s_{\sigma_k}(\mathcal{J})\right) = \dim\left(E^s_{\sigma_k}(A)\right)$$
since $A$ and $\mathcal{J}$ are similar; likewise, we have seen that $s_k = \dim(E_{\sigma_k}(A))$.

To show that the entire list $h_{k1}, \ldots, h_{k s_k}$ is determined by $A$ as well, write $c_{k\ell}$ for the number of times $\ell$ appears in this list. We have
$$a_{k\ell} := \mathrm{nullity}\{(A - \sigma_k I)^\ell\} - \mathrm{nullity}\{(A - \sigma_k I)^{\ell-1}\} = \mathrm{nullity}\{(\mathcal{J} - \sigma_k I)^\ell\} - \mathrm{nullity}\{(\mathcal{J} - \sigma_k I)^{\ell-1}\} = \sum_{h \ge \ell} c_{kh}.$$
(You may check this by hand, or sort it out from the discussion of the bases $\mathcal{A}_k^{(\ell)}$ and $\mathcal{C}_k^{(\ell)}$ above; $a_{k\ell}$ and $c_{k\ell}$ are just the numbers of vectors in these sets.) Hence

$$c_{k\ell} = a_{k\ell} - a_{k,\ell+1}$$
is completely determined, and the uniqueness is proved. $\square$
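The formula $c_{k\ell} = a_{k\ell} - a_{k,\ell+1}$ is also a practical way to read off $\mathcal{J}$ without computing a basis at all. A small numpy sketch (the function name is mine), applied here to the matrix of Example 7:

```python
import numpy as np

def little_block_sizes(A, sigma, tol=1e-9):
    """Return {l: c_l}, where c_l = a_l - a_(l+1) counts the little blocks N_sigma^l,
    and a_l = nullity((A - sigma I)^l) - nullity((A - sigma I)^(l-1))."""
    n = A.shape[0]
    M = A - sigma * np.eye(n)
    nullities = [0]
    while True:
        p = len(nullities)
        nl = n - np.linalg.matrix_rank(np.linalg.matrix_power(M, p), tol=tol)
        if nl == nullities[-1]:
            break
        nullities.append(nl)
    a = [nullities[l] - nullities[l - 1] for l in range(1, len(nullities))] + [0]
    return {l: a[l - 1] - a[l] for l in range(1, len(a)) if a[l - 1] - a[l] > 0}

A = np.array([[2,-1, 1,-1], [-1, 2,-2, 1], [0, 1, 1, 1], [0,-1, 1, 0]], dtype=float)
print(little_block_sizes(A, 1))   # {3: 1} -- one little block of size 3
print(little_block_sizes(A, 2))   # {1: 1} -- one little block of size 1
```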

REMARK 9. Of course, a matrix A with real entries belongs to

$M_n(\mathbb{C})$, and the Theorem applies. However, it should be understood that, as with diagonalization, one has to allow $S$ and $\mathcal{J}$ to have complex entries in general.

Exercises

(1) Both of the following matrices have only one eigenvalue:
$$\begin{pmatrix} 2 & 0 & 0 & 0 & 0 & 0 \\ 1 & 2 & 0 & 0 & 0 & 0 \\ -1 & -1 & 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 2 & 1 & 0 \\ -1 & 0 & 0 & 0 & 2 & 0 \\ 1 & 0 & 0 & 0 & -1 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 3 & -5 & -35 \\ 0 & -4 & -49 \\ 0 & 1 & 10 \end{pmatrix}$$
Put them in Jordan normal form (= generalized "diagonal" form, if you will); that is, write the entire decomposition $A = S\mathcal{J}S^{-1}$. It may help you to draw out the diagrams, and write the vectors you find where they belong. For the first one it should look like

(2) Find all solutions of the system
$$\frac{dx}{dt} = 2x + z, \quad \frac{dy}{dt} = y + 2z, \quad \frac{dz}{dt} = 2z.$$
[Hint: use Jordan.]

(3) (a) Let $N$ be cyclic nilpotent on $W$, $\dim W = d$.
(i) What must the minimal polynomial of $N$ be?
(ii) So what is the minimal polynomial of a little Jordan block $N_\sigma^d$?
(iii) How about a big Jordan block $\mathrm{diag}\{N_\sigma^{d_1(\sigma)}, \ldots, N_\sigma^{d_r(\sigma)}\}$? (Assume $d_1(\sigma) \ge \cdots \ge d_r(\sigma)$.)
(iv) Finally, how about an arbitrary Jordan-form matrix $\mathcal{J}$?
(b) A matrix $A$ having this Jordan form ($A = S\mathcal{J}S^{-1}$) will have the same minimal polynomial (why?). Also the Jordan form is unique, up to reordering of the blocks along the diagonal. Use these facts to explain why $A$ is diagonalizable iff its minimal polynomial has no repeated root.

(c) What are the minimal polynomials of the matrices of problem (1)?

(4) Use Jordan form to show that if $f_A(\lambda) = \prod_i (\lambda - \sigma_i)^{n_i}$, then $\det(A) = \prod_i \sigma_i^{n_i}$. (This can be seen without Jordan – cf. the beginning of §V.C – but this proof is simpler.)

(5) Consider the transformation $T : P_3 \to P_3$ on polynomials defined by $T(p(t)) = p(t+1)$. What must the Jordan form for $T$ be? (You can do this without writing out the matrix $[T]_{\{1, t, t^2, t^3\}}$.)

(6) Let $A$ be a matrix with $\sigma$ as an eigenvalue, $\mathcal{J}$ the Jordan matrix to which $A$ is similar. Prove that the number of little Jordan blocks in $\mathcal{J}$ equal to $N_\sigma^k$ is the number of invariant factors (of $\lambda I - A$) in which $(\lambda - \sigma)^k$ appears (that is, $(\lambda - \sigma)^k \mid \Delta_i(\lambda)$ and $(\lambda - \sigma)^{k+1} \nmid \Delta_i(\lambda)$). This allows you to compute $\mathcal{J}$ from $\mathrm{nf}(\lambda I - A)$ alone.