
Chapter 6

The Jordan Canonical Form

6.1 Introduction

The importance of the Jordan canonical form became evident in the last chapter, where it frequently served as an important theoretical tool for deriving practical procedures for calculating polynomials of matrices. In this chapter we shall take a closer look at the Jordan canonical form of a given matrix A. In particular, we shall be interested in the following questions:

• how to determine its structure;

• how to calculate P such that P^{-1}AP is a Jordan matrix.

As we learned in the previous chapter in connection with the diagonalization theorem (cf. section 5.4), the eigenvalues and eigenvectors of A yield important clues for determining the shape of the Jordan canonical form. Now it is not difficult to see that for 2 × 2 and 3 × 3 matrices the knowledge of the eigenvalues and eigenvectors of A alone suffices to determine the Jordan canonical form J of A, but for larger matrices this is no longer true. However, by generalizing the notion of eigenvectors, we can obtain the additional information needed to determine J. Thus we shall:

• study some basic properties of eigenvalues and eigenvectors in section 6.2;

• learn how to find J and P when m ≤ 3 (section 6.3);

• define and study generalized eigenvectors and learn how to determine J (section 6.4);

• learn a general method for determining P in section 6.5.

In addition, we shall also look at some applications of the Jordan canonical form, such as a proof of the Cayley-Hamilton theorem (cf. section 6.6). Other applications will follow in later chapters.

6.2 Algebraic and geometric multiplicities of eigenvalues

As we shall see, much (but not all) of the structure of the Jordan canonical form J of a matrix A can be read off from the algebraic and geometric multiplicities of the eigenvalues of A, which we now define.

Definition. Let A be an m × m matrix and λ ∈ C. Then

mA(λ) = multλ(chA), the multiplicity of λ as a root of chA(t) (cf. chapter 3), is called the algebraic multiplicity of λ in A;

νA(λ) = dimCEA(λ) is called the geometric multiplicity of λ in A. Here, as before (cf. section 5.4),

EA(λ) = {~v ∈ C^m : A~v = λ~v} = Nullsp(A − λI)

denotes the λ-eigenspace of A.

Remarks. 1) Note that the above definition does not require λ to be an eigenvalue of A. Thus by definition:

λ is an eigenvalue of A ⇔ νA(λ) ≥ 1 ⇔ mA(λ) ≥ 1.

2) We shall see later (in Theorem 6.4) that we always have νA(λ) ≤ mA(λ).
3) By definition: νA(λ) = dim Nullsp(A − λI) = m − rank(A − λI).

Example 6.1. Find the algebraic and geometric multiplicities of (the eigenvalues of) the matrices

A = [ 1 1 2 ]           [ 1 0 2 ]
    [ 0 1 2 ]  and B =  [ 0 1 2 ].
    [ 0 0 3 ]           [ 0 0 3 ]

Solution. Since A and B are both upper triangular and have the same diagonal entries 1, 1, 3, we see that

chA(t) = chB(t) = (t − 1)^2 (t − 3).

Thus, both matrices have λ1 = 1 and λ2 = 3 as their eigenvalues, with algebraic multiplicities mA(1) = mB(1) = 2 and mA(3) = mB(3) = 1.

To calculate the geometric multiplicities, we have to determine the ranks of A − λiI and B − λiI for i = 1, 2. Now

 0 1 2   0 0 2   −2 1 2   −2 0 2  A−I =  0 0 2 ,B−I =  0 0 2 ,A−3I =  0 −2 2 ,B−3I =  0 −2 2  . 0 0 2 0 0 2 0 0 0 0 0 0 Section 6.2: Algebraic and geometric multiplicities of eigenvalues 275

Thus, since A − I clearly has 2 linearly independent column vectors, we see that rank(A − I) = 2, and so νA(1) = 3 − rank(A − I) = 3 − 2 = 1. Similarly, rank(B − I) = 1, and so νB(1) = 3 − rank(B − I) = 3 − 1 = 2. Furthermore, since A − 3I and B − 3I both have rank 2, it follows that νA(3) = νB(3) = 3 − 2 = 1. Thus, the geometric multiplicities of the eigenvalues of A and B are

νA(1) = 1, νB(1) = 2 and νA(3) = νB(3) = 1.
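Computations like those in Example 6.1 are easy to check numerically. The sketch below (assuming the numpy library is available; the helper names are ours, not the text's) estimates both multiplicities for the matrices A and B of Example 6.1. Since it works in floating point, a small tolerance is used when matching eigenvalues.

```python
import numpy as np

# Matrices A and B from Example 6.1.
A = np.array([[1., 1., 2.],
              [0., 1., 2.],
              [0., 0., 3.]])
B = np.array([[1., 0., 2.],
              [0., 1., 2.],
              [0., 0., 3.]])

def algebraic_multiplicity(M, lam, tol=1e-6):
    # count eigenvalues (with repetition) within tol of lam
    return int(np.sum(np.abs(np.linalg.eigvals(M) - lam) < tol))

def geometric_multiplicity(M, lam):
    # nu_M(lam) = m - rank(M - lam*I), as in Remark 3
    m = M.shape[0]
    return m - np.linalg.matrix_rank(M - lam * np.eye(m))
```

As in the text, A and B share the characteristic polynomial (t − 1)^2 (t − 3), but they are distinguished by the geometric multiplicity of λ = 1.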

Example 6.2. Consider the following three Jordan matrices:

J1 = [ 5 0 0 ]        [ 5 1 0 ]        [ 5 1 0 ]
     [ 0 5 0 ], J2 =  [ 0 5 0 ], J3 =  [ 0 5 1 ].
     [ 0 0 5 ]        [ 0 0 5 ]        [ 0 0 5 ]

Then their algebraic and geometric multiplicities are given in the following table:

            J1                J2            J3
chJi(t)     (t − 5)^3         (t − 5)^3     (t − 5)^3
mJi(5)      3                 3             3
νJi(5)      3                 2             1
EJi(5)      ⟨~e1, ~e2, ~e3⟩   ⟨~e1, ~e3⟩    ⟨~e1⟩

Here ~e1 = (1, 0, 0)^t, ~e2 = (0, 1, 0)^t, ~e3 = (0, 0, 1)^t denote the standard vectors of C^3 and ⟨...⟩ denotes the span (= set of all linear combinations) of the vectors.

Verification of table: To check the first two rows of the table we note that

chJi(t) = (−1)^3 det(Ji − tI) = − det [ 5−t * * ; 0 5−t * ; 0 0 5−t ] = −(5 − t)^3 = (t − 5)^3.

Thus, for all three matrices λ1 = 5 is the only eigenvalue and its algebraic multiplicity is mJi (5) = 3 (= the exponent of (t − 5) in chJi (t)).

To compute νJi(5), it is enough to find rank(Ji − 5I) = the number of non-zero rows of the associated row echelon form. Here we need to consider the three cases separately:

1) Since J1 − 5I = 0 and rank(0) = 0, we have νJ1(5) = 3 − rank(J1 − 5I) = 3 − 0 = 3.
2) Next, J2 − 5I = [ 0 1 0 ; 0 0 0 ; 0 0 0 ], which is in row echelon form. Thus rank(J2 − 5I) = 1, and hence νJ2(5) = 3 − rank(J2 − 5I) = 3 − 1 = 2.
3) Similarly, J3 − 5I = [ 0 1 0 ; 0 0 1 ; 0 0 0 ], which is again in row echelon form. Thus rank(J3 − 5I) = 2, and hence νJ3(5) = 3 − rank(J3 − 5I) = 3 − 2 = 1.

Finally, the indicated basis of EJi(5) is obtained by using back-substitution.

Example 6.3. Let J = J(λ, k) be a Jordan block of size k, i.e. the k × k matrix

J(λ, k) = [ λ 1        ]
          [   λ 1      ]
          [     .  .   ]
          [       λ 1  ]
          [         λ  ]

with λ on the diagonal and 1's on the superdiagonal. Then:

chJ(t) = (t − λ)^k (since J is upper triangular),
EJ(λ) = {c(1, 0, ..., 0)^t : c ∈ C} (cf. Example 5.7 of chapter 5),
mJ(λ) = multλ(chJ) = k,
νJ(λ) = dimC EJ(λ) = 1.

The above example shows us how to quickly find the algebraic and geometric multiplicities of Jordan blocks. To extend this to Jordan matrices, i.e. to matrices of the form

J = Diag(J1, J2, ..., Jr) = [ J1 0  ... 0  ]
                            [ 0  J2 ... 0  ]
                            [ ...          ]
                            [ 0  ... 0  Jr ],

where the Ji = J(λi, mi) are Jordan blocks, we shall use the following result.

Theorem 6.1 (Sum Formula). If A = Diag(B, C) = [ B 0 ; 0 C ], then the algebraic and geometric multiplicities of the eigenvalues of A are the sums of the corresponding multiplicities of those of B and C. In other words, for any λ ∈ C we have

(1) mA(λ) = mB(λ) + mC (λ) and νA(λ) = νB(λ) + νC (λ).

Example 6.4. As in Example 6.2, let J2 = Diag(J(5, 2), J(5, 1)), and write B = J(5, 2) and C = J(5, 1). Then, by (1) and Example 6.3,

mJ2(5) = mB(5) + mC(5) = 2 + 1 = 3,
νJ2(5) = νB(5) + νC(5) = 1 + 1 = 2.

This example generalizes as follows:

Corollary. If J = Diag(J11,J12,...,Jij,... ) is a Jordan matrix with Jordan blocks Jij = J(λi, kij) and λ ∈ C, then

νJ (λ) = the number of Jordan blocks Jij with eigenvalue λi = λ,

mJ(λ) = the sum of the sizes kij of the Jordan blocks Jij with eigenvalue λi = λ.

Proof. By Theorem 6.1 we have νJ(λ) = Σ_{i,j} νJij(λ). Now by Example 6.3 we know that νJij(λ) = 1 if Jij has eigenvalue λi = λ and νJij(λ) = 0 otherwise, so the assertion for νJ(λ) follows. The formula for mJ(λ) is proved similarly.
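The Corollary can be read as a recipe: given only the list of blocks (λi, kij), both multiplicities of a Jordan matrix follow by counting. A small sketch (numpy assumed; jordan_block, block_diag and multiplicities are our own helper names) that also cross-checks νJ(λ) against the rank formula:

```python
import numpy as np

def jordan_block(lam, k):
    # k x k block: lam on the diagonal, 1 on the superdiagonal
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def block_diag(mats):
    # assemble Diag(M1, ..., Mr) without external libraries
    n = sum(M.shape[0] for M in mats)
    out = np.zeros((n, n))
    i = 0
    for M in mats:
        k = M.shape[0]
        out[i:i+k, i:i+k] = M
        i += k
    return out

def multiplicities(blocks, lam):
    # Corollary: m = sum of sizes, nu = number of blocks with eigenvalue lam
    m = sum(k for (l, k) in blocks if l == lam)
    nu = sum(1 for (l, k) in blocks if l == lam)
    return m, nu

blocks = [(5.0, 2), (5.0, 1), (3.0, 2)]
J = block_diag([jordan_block(l, k) for (l, k) in blocks])
m5, nu5 = multiplicities(blocks, 5.0)
# cross-check nu via nu_J(5) = size - rank(J - 5I)
nu5_rank = J.shape[0] - np.linalg.matrix_rank(J - 5.0 * np.eye(J.shape[0]))
```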

Theorem 6.1 is, in fact, a special case of a much more precise result. To state it in a convenient form, it is useful to introduce the following notation.

Notation. (a) If ~v = (v1, ..., vn)^t ∈ C^n and ~w = (w1, ..., wm)^t ∈ C^m, then the vector

~v ⊕ ~w := (v1, ..., vn, w1, ..., wm)^t ∈ C^(n+m)

is called the direct sum of ~v and ~w.
(b) If V ⊂ C^n and W ⊂ C^m are subspaces, then the direct sum of V and W is the subspace

V ⊕ W = {~v ⊕ ~w ∈ C^(n+m) : ~v ∈ V, ~w ∈ W}.

Remarks. 1) If ~v1, ..., ~vr is a basis of V ⊂ C^n and ~w1, ..., ~ws is one of W ⊂ C^m, then ~v1 ⊕ ~0m, ..., ~vr ⊕ ~0m, ~0n ⊕ ~w1, ..., ~0n ⊕ ~ws is a basis of V ⊕ W. (Here, ~0m = (0, ..., 0)^t ∈ C^m.) Thus

(2) dim(V ⊕ W) = dim V + dim W.

2) If A is an a × m matrix and B is a b × n matrix, then for every ~v ∈ C^m and ~w ∈ C^n we have

(3) Diag(A, B)(~v ⊕ ~w) = (A~v) ⊕ (B~w).

Example 6.5. Let V = {c1(1, 2)^t + c2(3, 4)^t : c1, c2 ∈ C} and W = {c1′(1, 2, 1)^t + c2′(3, 4, 1)^t : c1′, c2′ ∈ C}. Verify the addition rule (2) for V and W.

Solution. If ~v ∈ V, then ~v = c1(1, 2)^t + c2(3, 4)^t = (c1, 2c1)^t + (3c2, 4c2)^t = (c1 + 3c2, 2c1 + 4c2)^t, and similarly each ~w ∈ W has the form ~w = (c1′ + 3c2′, 2c1′ + 4c2′, c1′ + c2′)^t. Thus

~v ⊕ ~w = (c1 + 3c2, 2c1 + 4c2)^t ⊕ (c1′ + 3c2′, 2c1′ + 4c2′, c1′ + c2′)^t
        = (c1 + 3c2, 2c1 + 4c2, c1′ + 3c2′, 2c1′ + 4c2′, c1′ + c2′)^t
        = c1(1, 2, 0, 0, 0)^t + c2(3, 4, 0, 0, 0)^t + c1′(0, 0, 1, 2, 1)^t + c2′(0, 0, 3, 4, 1)^t,

and so V ⊕ W = {c1(1, 2, 0, 0, 0)^t + c2(3, 4, 0, 0, 0)^t + c1′(0, 0, 1, 2, 1)^t + c2′(0, 0, 3, 4, 1)^t : c1, c2, c1′, c2′ ∈ C}. Therefore, dim(V ⊕ W) = 4 = dim V + dim W.

Example 6.6. Verify the rule (3) for A = [ 1 2 ; 3 4 ] and B = [ 5 2 ; 6 1 ].

Solution. Write ~v = (v1, v2)^t and ~w = (w1, w2)^t. Then

Diag(A, B)(~v ⊕ ~w) = [ 1 2 0 0 ] [ v1 ]   [ v1 + 2v2 ]
                      [ 3 4 0 0 ] [ v2 ] = [ 3v1 + 4v2 ]
                      [ 0 0 5 2 ] [ w1 ]   [ 5w1 + 2w2 ]
                      [ 0 0 6 1 ] [ w2 ]   [ 6w1 + w2  ]

= (v1 + 2v2, 3v1 + 4v2)^t ⊕ (5w1 + 2w2, 6w1 + w2)^t = A~v ⊕ B~w.
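Rule (3) is also easy to spot-check numerically for the matrices of Example 6.6. The sketch below (numpy assumed; the sample vectors are our choice) builds Diag(A, B) explicitly and compares both sides:

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 2.], [6., 1.]])
v = np.array([1., 2.])      # sample v (real entries for simplicity)
w = np.array([3., 4.])      # sample w

D = np.zeros((4, 4))        # Diag(A, B)
D[:2, :2] = A
D[2:, 2:] = B

lhs = D @ np.concatenate([v, w])       # Diag(A, B)(v ⊕ w)
rhs = np.concatenate([A @ v, B @ w])   # (Av) ⊕ (Bw)
```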

We are now ready to state and prove the following refinement of Theorem 6.1.

Theorem 6.2. If A = Diag(B, C), then

(4) chA(t) = chB(t) · chC(t) and EA(λ) = EB(λ) ⊕ EC(λ).

Proof. (a) Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, we obtain

chA(t) = det(tI − A) = det(Diag(tI − B, tI − C)) = det(tI − B) det(tI − C) = chB(t) chC(t).

(b) Suppose that B is an m × m matrix and C an n × n matrix. Now each ~u ∈ C^(m+n) can be written uniquely as ~u = ~v ⊕ ~w with ~v ∈ C^m and ~w ∈ C^n, so

(A − λI)~u = Diag(B − λI, C − λI)(~v ⊕ ~w) = (B − λI)~v ⊕ (C − λI)~w

by (3). Thus,

~u ∈ EA(λ) ⇔ (A − λI)~u = ~0 ⇔ (B − λI)~v = ~0 and (C − λI)~w = ~0 ⇔ ~v ∈ EB(λ) and ~w ∈ EC(λ) ⇔ ~u = ~v ⊕ ~w ∈ EB(λ) ⊕ EC(λ),

and so EA(λ) = EB(λ) ⊕ EC(λ), as claimed.

Remark. If A = [ B 0 ; X C ] or A = [ B X ; 0 C ], then it is still true that chA(t) = chB(t) chC(t). However, in this case the second formula of (4) no longer holds (in general).

Proof of Theorem 6.1. It is easy to see that Theorem 6.1 follows from Theorem 6.2. Indeed, by (4), mA(λ) = multλ(chA) = multλ(chB chC) = multλ(chB) + multλ(chC) = mB(λ) + mC(λ), which proves the first equality of Theorem 6.1.
For the second we observe that νA(λ) = dim EA(λ) = dim(EB(λ) ⊕ EC(λ)) (by (4)) = dim EB(λ) + dim EC(λ) (by (2)) = νB(λ) + νC(λ).

Example 6.7. Find chA(t) and EA(λ) when

A = [ 2 1 0 0 ]
    [ 1 2 0 0 ]
    [ 0 0 0 1 ]
    [ 0 0 −1 2 ].

Solution. Since A = Diag(B, C), where B = [ 2 1 ; 1 2 ] and C = [ 0 1 ; −1 2 ], it is enough by Theorem 6.2 to work out the eigenspaces and characteristic polynomials for B and C.
(a) chB(t) = (t − 2)^2 − 1 = t^2 − 4t + 3 = (t − 1)(t − 3);
EB(1) = {c(1, −1)^t : c ∈ C}, EB(3) = {c(1, 1)^t : c ∈ C}, and EB(λ) = {~0} for λ ≠ 1, 3.
(b) chC(t) = −t(2 − t) + 1 = (t − 1)^2;
EC(1) = {c(1, 1)^t : c ∈ C}, and EC(λ) = {~0} for λ ≠ 1.
(c) Thus, by (4): chA(t) = chB(t) · chC(t) = (t − 1)(t − 3) · (t − 1)^2 = (t − 1)^3 (t − 3);

EA(1) = EB(1) ⊕ EC(1) = {c1(1, −1)^t} ⊕ {c2(1, 1)^t} = {c1(1, −1, 0, 0)^t + c2(0, 0, 1, 1)^t : c1, c2 ∈ C},

EA(3) = EB(3) ⊕ EC(3) = {c(1, 1)^t} ⊕ {~0} (note: 3 is not an eigenvalue of C) = {c(1, 1, 0, 0)^t : c ∈ C},

EA(λ) = EB(λ) ⊕ EC(λ) = {~0} ⊕ {~0} = {~0}, if λ ≠ 1, 3.
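Example 6.7 can be double-checked numerically. The sketch below (numpy assumed) confirms that the eigenvalues of A are 1, 1, 1, 3 and that the geometric multiplicities and the eigenspace basis found via (4) agree with the rank computations:

```python
import numpy as np

A = np.array([[2., 1., 0., 0.],
              [1., 2., 0., 0.],
              [0., 0., 0., 1.],
              [0., 0., -1., 2.]])

# chA(t) = (t - 1)^3 (t - 3), so the eigenvalues should be 1, 1, 1, 3
eigs = np.sort(np.linalg.eigvals(A).real)

nu1 = 4 - np.linalg.matrix_rank(A - np.eye(4))
nu3 = 4 - np.linalg.matrix_rank(A - 3 * np.eye(4))

# the basis vectors of EA(1) from the example should lie in Nullsp(A - I)
b1 = np.array([1., -1., 0., 0.])
b2 = np.array([0., 0., 1., 1.])
```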

Another important property of algebraic and geometric multiplicities is the following.

Theorem 6.3 (Invariance Property). If B = P^{-1}AP, then

(5) chB(t) = chA(t) and EA(λ) = PEB(λ) := {P~v : ~v ∈ EB(λ)}.

In particular, we have

mB(λ) = mA(λ) and νB(λ) = νA(λ).

Example 6.8. Find the characteristic polynomial and the eigenspaces of

 4 0 −2   2 1 0   2 1 1  −1 A = PBP =  1 2 −1  , where B =  0 2 0  and P =  1 1 1  . 2 0 0 0 0 2 2 0 1

Solution. First note that B = Diag(J1, J2), where J1 = J(2, 2) and J2 = J(2, 1). Thus, by (5), (4) and Example 6.3,

chA(t) = chB(t) = chJ1(t) chJ2(t) = (t − 2)^2 (t − 2) = (t − 2)^3.

Thus, λ1 = 2 is the only eigenvalue of A (and of B). Moreover, by (4),

EB(2) = {c1(1, 0)^t : c1 ∈ C} ⊕ {c2(1) : c2 ∈ C} = {c1(1, 0, 0)^t + c2(0, 0, 1)^t : c1, c2 ∈ C},

and hence by (5)

EA(2) = {c1 P(1, 0, 0)^t + c2 P(0, 0, 1)^t : c1, c2 ∈ C} = {c1(2, 1, 2)^t + c2(1, 1, 1)^t : c1, c2 ∈ C},

since P(1, 0, 0)^t is the 1st column and P(0, 0, 1)^t the 3rd column of P.

Check: A(2, 1, 2)^t = (PBP^{-1})(P(1, 0, 0)^t) = PB(1, 0, 0)^t = P(2(1, 0, 0)^t) = 2(2, 1, 2)^t ⇒ (2, 1, 2)^t ∈ EA(2).

Proof of Theorem 6.3. (a) Recall that for any two n × n matrices X and Y we have det(XY) = det(X) · det(Y), and hence also det(P^{-1}XP) = det(P)^{-1} det(X) det(P) = det(X). Applying this to X = A − tI yields

chP^{-1}AP(t) = (−1)^n det(P^{-1}AP − tI) = (−1)^n det(P^{-1}(A − tI)P) = (−1)^n det(A − tI) = chA(t).

(b) We have:

~v ∈ EA(λ) ⇔ A~v = λ~v ⇔ PBP^{-1}~v = λ~v ⇔ BP^{-1}~v = P^{-1}λ~v ⇔ B(P^{-1}~v) = λ(P^{-1}~v) ⇔ P^{-1}~v ∈ EB(λ) ⇔ ~v ∈ PEB(λ).

We have thus shown that ~v ∈ EA(λ) ⇔ ~v ∈ PEB(λ), which means that EA(λ) = PEB(λ). The last two assertions follow from (5) by taking multiplicities and dimensions.
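Theorem 6.3 is easy to test numerically on the data of Example 6.8 above: a matrix and its conjugate share characteristic polynomial and multiplicities. A sketch (numpy assumed; the comparison is in floating point, hence the tolerances):

```python
import numpy as np

B = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 2.]])
P = np.array([[2., 1., 1.],
              [1., 1., 1.],
              [2., 0., 1.]])
A = P @ B @ np.linalg.inv(P)   # A = PBP^{-1} as in Example 6.8

# characteristic polynomial coefficients (np.poly of a matrix)
coeffs_A = np.poly(A)
coeffs_B = np.poly(B)

# geometric multiplicities of lambda = 2 agree
nu_A = 3 - np.linalg.matrix_rank(A - 2 * np.eye(3))
nu_B = 3 - np.linalg.matrix_rank(B - 2 * np.eye(3))
```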

Corollary. If P^{-1}AP = J is a Jordan matrix, then

νA(λ) = the number of Jordan blocks of J with eigenvalue λ

mA(λ) = the sum of the sizes of the Jordan blocks of J with eigenvalue λ.

Proof. Combine Theorem 6.3 with the Corollary of Theorem 6.1:

νA(λ) = νJ(λ) (by Th. 6.3) = #{Jordan blocks Jij with eigenvalue λ} (by Th. 6.1),

and the assertion about mA(λ) is proved similarly.

Remark. The above corollary represents a fundamental step towards computing the structure of the Jordan canonical form J associated to A: we see that the algebraic and geometric multiplicities reveal the number of Jordan blocks and the sum of their sizes. If m ≤ 3, then the above rules already determine J, as we shall see in more detail in the next section. However, if m ≥ 4, then this is no longer true, as the following example illustrates.

Example 6.9. The two Jordan matrices

J1 = [ 2 1 0 0 ]           [ 2 0 0 0 ]
     [ 0 2 0 0 ]  and J2 = [ 0 2 1 0 ]
     [ 0 0 2 1 ]           [ 0 0 2 1 ]
     [ 0 0 0 2 ]           [ 0 0 0 2 ]

clearly have the same number of Jordan blocks (so νJ1 (2) = νJ2 (2) = 2), and the sum of their sizes is also the same (so mJ1 (2) = mJ2 (2) = 4), but the Jordan matrices are not the same (even if we rearrange the blocks). Thus, the algebraic and geometric multiplicities alone cannot distinguish between these Jordan forms.
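Numerically, an invariant that does separate J1 and J2 is the rank of (J − 2I)^2, which anticipates the generalized eigenspaces of section 6.4. A sketch (numpy assumed):

```python
import numpy as np

J1 = np.array([[2., 1., 0., 0.],
               [0., 2., 0., 0.],
               [0., 0., 2., 1.],
               [0., 0., 0., 2.]])
J2 = np.array([[2., 0., 0., 0.],
               [0., 2., 1., 0.],
               [0., 0., 2., 1.],
               [0., 0., 0., 2.]])

N1 = J1 - 2 * np.eye(4)
N2 = J2 - 2 * np.eye(4)

# the ordinary geometric multiplicities agree ...
nu1 = 4 - np.linalg.matrix_rank(N1)   # 2
nu2 = 4 - np.linalg.matrix_rank(N2)   # 2

# ... but the ranks of the squares differ, so J1 and J2 are not similar
r1 = np.linalg.matrix_rank(N1 @ N1)   # 0
r2 = np.linalg.matrix_rank(N2 @ N2)   # 1
```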

Theorem 6.4 (Jordan Canonical Form). Every square matrix A is similar to a Jordan matrix. In other words, there is an invertible matrix P such that

P^{-1}AP = J = Diag(J11, ..., Jij, ...) is a block diagonal matrix consisting of Jordan blocks Jij = J(λi, kij). Moreover:

1) The λ1, λ2, . . . , λs are the (distinct) eigenvalues of A.

2) The number of Jordan blocks Ji1,Ji2,... with eigenvalue λi equals the geometric multiplicity νA(λi) = νi (so that this list ends with Ji,νi ).

3) The sum of the sizes kij of the blocks Ji1,Ji2,...,Ji,νi with eigenvalue λi equals the algebraic multiplicity mi = mA(λi):

(6) ki1 + ki2 + ··· + kiνi = mi; in particular: νi ≤ mi.

4) The Jij's are uniquely determined by A up to order.

Remarks. 1) In the above statement of the Jordan canonical form the following fact is implicitly used: If J and J 0 are two Jordan matrices which have the same lists of Jordan blocks but in a different order, then J and J 0 are similar, i.e. there is a matrix P such that J 0 = P −1JP . [To see why this is true, consider an m×m block diagonal matrix Diag(A, B) consisting of two blocks A and B of size k × k and (m − k) × (m − k), respectively. Then we have

Pk^{-1} Diag(A, B) Pk = Diag(B, A),

where Pk = (~e(k+1)|~e(k+2)|...|~em|~e1|...|~ek) and, as usual, ~ei = (0, ..., 0, 1, 0, ..., 0)^t ∈ C^m (with the 1 in the i-th position). From this the above statement about the Jordan matrices follows readily. Note that the same argument also yields the corresponding statement for arbitrary block diagonal matrices.]

As a result of this fact, we can always choose the matrix P in Theorem 6.4 in such a way that, after fixing an ordering λ1, ..., λs of the eigenvalues, the Jordan matrix has the form J = Diag(J11, ..., Jij, ...), where the Jordan blocks Jij = J(λi, kij) are ordered in decreasing size (for each eigenvalue λi), i.e. we have

ki1 ≥ ki2 ≥ ... ≥ ki,νi;

such a Jordan matrix will be said to be in standard form. For example, J = Diag(J(1,2), J(1,1), J(2,2), J(−1,3), J(−1,2)) is in standard form, but Diag(J(2,2), J(2,3)) is not.

2) Conversely, suppose J and J′ are two Jordan matrices which are similar. Then the lists of Jordan blocks of J and J′ are the same (up to order), as we shall see later (cf. Theorem 6.5, Corollary).

Corollary 1. Two m × m matrices A and B are similar (that is, B = P^{-1}AP for some P) if and only if they have the same Jordan canonical form J (up to order).

Proof. Let J and J′ be the Jordan canonical forms of A and B, respectively. Since A is similar to J and B is similar to J′, it follows that A and B are similar if and only if J and J′ are similar. Now by the above remark, J and J′ are similar if and only if they are identical up to the order of their blocks, and so the assertion follows.

Corollary 2. A matrix A is diagonable if and only if the algebraic and geometric multiplicities of all its eigenvalues λi are the same:

νA(λ1) = mA(λ1), ..., νA(λs) = mA(λs).

Proof. First note that if A is diagonable, then the associated diagonal matrix P^{-1}AP is the Jordan canonical form of A, and conversely, if the JCF of A is a diagonal matrix, then A is clearly diagonable. Thus: A is diagonable ⇔ its associated Jordan form is a diagonal matrix ⇔ all Jordan blocks have size 1 × 1 ⇔ kij = 1 for all i, j ⇔ νA(λi) = mA(λi), 1 ≤ i ≤ s, by (6).

Remark. By using terminology which will be studied in more detail in the next chapter, the above Corollary 2 can be rephrased more elegantly as follows:

Corollary 2′. A matrix A is diagonable if and only if all its eigenvalues are regular.

Here, an eigenvalue λ of A is called regular if its algebraic and geometric multiplicities coincide, i.e. if mA(λ) = νA(λ). Indeed, the above proof (or equation (6)) shows more precisely that:

Corollary 3. An eigenvalue λ of A is regular if and only if all its Jordan blocks (in the associated Jordan canonical form J of A) have size 1 × 1.
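Corollary 2 gives a practical diagonalizability test. Below is a sketch (numpy assumed; the function name, clustering of numerically close eigenvalues, and tolerance handling are ours) that compares the algebraic and geometric multiplicity of each distinct eigenvalue:

```python
import numpy as np

def is_diagonable(A, tol=1e-6):
    """True iff m_A(lam) == nu_A(lam) for every eigenvalue lam (Corollary 2)."""
    m = A.shape[0]
    eigs = np.linalg.eigvals(A)
    distinct = []
    for lam in eigs:                      # cluster numerically close eigenvalues
        if all(abs(lam - mu) >= tol for mu in distinct):
            distinct.append(lam)
    for lam in distinct:
        alg = int(np.sum(np.abs(eigs - lam) < tol))                     # m_A(lam)
        geo = m - np.linalg.matrix_rank(A - lam * np.eye(m), tol=tol)   # nu_A(lam)
        if alg != geo:
            return False
    return True
```

For instance, a diagonal matrix passes the test, while a nontrivial Jordan block such as J(2, 2) fails it, since there mA(2) = 2 but νA(2) = 1.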

Exercises 6.2.

1. Find all the eigenvalues, their algebraic and geometric multiplicities and their associated eigenspaces of the matrix A when:

 2 1 0 0   −1 4 0 0  −1 (a) A =   ; (b) A = PBP , where  −1 1 2 1  −1 1 −1 4

 1 1 1 1 1 1 1   1 1 0 0 0 0 0   3 2 1 0 −1 −2 −3   0 1 1 0 0 0 0       9 4 1 0 1 4 9   0 0 1 0 0 0 0      P =  27 8 1 0 −1 −8 −27  and B =  0 0 0 1 1 0 0  .      81 16 1 0 1 16 81   0 0 0 0 1 0 0       243 32 1 0 −1 −32 −243   0 0 0 0 0 2 1  729 64 1 0 1 64 729 0 0 0 0 0 0 2

Hint: In (b), you shouldn’t have to do any calculations.

2. Write down two 4 × 4 Jordan matrices which are not similar and yet have the same eigenvalues and the same algebraic and geometric multiplicities. Justify your answer.

3. Find a Jordan matrix J such that

PJP^{-1} = [ 4 1 −1 ]
           [ −2 1 1 ]
           [ 2 1 1 ]

for some invertible matrix P . [Do not find P !]

4. Find all the Jordan matrices J (in standard form) with chJ(t) = (t − 1)^2 (t − 2)^4.

5. Consider the Jordan matrices

J1 = [ 1 1 0 ]           [ 1 0 0 ]
     [ 0 1 0 ]  and J2 = [ 0 1 1 ].
     [ 0 0 1 ]           [ 0 0 1 ]

(a) Which of these is in standard form?
(b) Find a matrix P such that J1 = P^{-1}J2P.
(c) Find a matrix Q such that Q^{-1}JQ is in standard form, where

 1 1 0 0 0 0   0 1 0 0 0 0      J1 0  0 0 1 0 0 0  J = Diag(J1,J2) = =   . 0 J2  0 0 0 1 0 0     0 0 0 0 1 1  0 0 0 0 0 1

6. (a) Suppose that A1 and A2 are two square matrices with characteristic polynomial

chAj(t) = (t − λ1)^(m1j) (t − λ2)^(m2j) ··· (t − λs)^(msj),

where mij ≥ 0 for 1 ≤ i ≤ s and j = 1, 2. Let Eik^(Aj) denote the ik-th constituent matrix of Aj for 1 ≤ i ≤ s and 0 ≤ k ≤ mij − 1, and put Eik^(Aj) = 0 if k ≥ mij. Show that the constituent matrices Eik^A of A = Diag(A1, A2) are given by

Eik^A = Diag(Eik^(A1), Eik^(A2)), 1 ≤ i ≤ s, 1 ≤ k ≤ mi1 + mi2 − 1.

(b) Use this formula to find the constituent matrices of the Jordan matrices

(i) J = Diag(J(−1, 2), J(1, 2)) and (ii) J′ = Diag(J(−1, 2), J(−1, 3)).

7. Let a0, a1, ..., a(n−1) ∈ C and consider the matrix

A = [ 0 1 0 ... 0 ]
    [ 0 0 1 ... 0 ]
    [ . . .  .  . ]
    [ 0 0 0 ... 1 ]
    [ a0 a1 ... a(n−2) a(n−1) ].

(a) Show that the geometric multiplicity of every eigenvalue λ of A is νA(λ) = 1.

(b) Show that A is diagonable if and only if chA(t) has n distinct roots. [Recall from section 5.9 that chA(t) = t^n − a(n−1)t^(n−1) − ··· − a1t − a0, but you don't need this here.]

6.3 How to find P such that P^{-1}AP = J (for m ≤ 3)

Before explaining the general procedure for finding the Jordan canonical form J (and the associated matrix P) of an m × m matrix A, let us first look at the special case m ≤ 3. The advantage of the case m ≤ 3 is that the algebraic and geometric multiplicities suffice for finding the Jordan canonical form; this is no longer true if m ≥ 4 (cf. Example 6.9). Nevertheless, in calculating the associated matrix P, we are naturally led to a method which can be generalized to larger matrices, as will become evident in the next sections. This method consists of looking at the so-called generalized eigenvectors, which we will need here only in special cases. The basic idea is the following.

Basic Idea: In order to find P = (~v1|...|~vm) such that P^{-1}AP = J, write this equation as AP = PJ. By using the identities

AP = (A~v1| ··· |A~vm),
P(a1, ..., am)^t = a1~v1 + ··· + am~vm,

the equation AP = PJ translates into a set of (vector) equations for the ~vi's which we can solve. The following examples show how this method works.

Example 6.10. If A = [ 0 1 ; −1 −2 ], find P such that J = P^{-1}AP is a Jordan matrix.

Solution. The procedure naturally divides into two steps.

Step 1. Find the Jordan canonical form J of A.
(i) The characteristic polynomial of A is

chA(t) = (−1)^2 det [ −t 1 ; −1 −2−t ] = t(t + 2) + 1 = (t + 1)^2,

and so λ1 = −1 is the only eigenvalue; it has algebraic multiplicity m1 = mA(λ1) = 2. Thus, the sum of the sizes of the Jordan blocks of J is m1 = 2.
(ii) Since

A + I = [ 1 1 ; −1 −1 ] → [ 1 1 ; 0 0 ],

it follows that the λ1-eigenspace is EA(−1) = {c(1, −1)^t : c ∈ C}; in particular, ν1 = 1. Thus, we have 1 Jordan block.
(iii) By combining (i) and (ii) we can conclude:

m1 = 2, ν1 = 1 ⇒ 1 Jordan block of size 2 (with eigenvalue λ1 = −1)
⇒ J = [ −1 1 ; 0 −1 ] is the associated Jordan canonical form.

Step 2. Find P such that P^{-1}AP = J or, equivalently, such that AP = PJ.
Write P = (~v1|~v2), with ~v1, ~v2 ∈ C^2. Since

AP = (A~v1|A~v2),
PJ = (~v1|~v2) [ −1 1 ; 0 −1 ] = (−~v1|~v1 − ~v2),

we want to choose ~v1, ~v2 in such a way that

1) A~v1 = −~v1
2) A~v2 = ~v1 − ~v2     (together: AP = PJ)

3) ~v1,~v2 are linearly independent (⇔ P is invertible)

Observations: (a) The equations 1) and 2) can also be written in the form

1′) (A + I)~v1 = ~0,
2′) (A + I)~v2 = ~v1.

(b) By 2′) and 1′), these equations imply that (A + I)^2~v2 = (A + I)~v1 = ~0, i.e.

4) (A + I)^2~v2 = ~0.

(c) Conversely, if we pick ~v2 such that 4) holds and define ~v1 := (A + I)~v2, then both 1) and 2) hold.

(d) However: we have to pick ~v2 carefully so that condition 3) holds. It turns out that condition 3) will hold if (and only if) we take

3′) ~v2 ∉ EA(−1).

[Indeed, suppose that ~v2 ∉ EA(−1) (and satisfies 4)); then ~v1 := (A + I)~v2 ≠ ~0. Now if c1~v1 + c2~v2 = ~0, then applying A + I yields ~0 = (A + I)(c1~v1 + c2~v2) = c2~v1 (because (A + I)~v1 = ~0 by 1′)), so c2 = 0. But then c1~v1 = ~0, so c1 = 0, and hence ~v1 and ~v2 are linearly independent.]

These observations lead to the following strategy:

Pick ~v2 ∉ EA(−1) such that (A + I)^2~v2 = ~0,

put ~v1 = (A + I)~v2.

Then: P = (~v1|~v2) is invertible and P^{-1}AP = [ −1 1 ; 0 −1 ].

Let us apply this strategy here. Since (A + I)^2 = 0 (either by direct computation or by using the Cayley-Hamilton Theorem: (A + I)^2 = chA(A) = 0), it follows that every ~v ∈ C^2 satisfies 4). Thus, pick any ~v2 ∉ EA(−1) = {c(1, −1)^t}; take, for example, ~v2 = (1, 0)^t. Then

~v1 = (A + I)~v2 = [ 1 1 ; −1 −1 ] (1, 0)^t = (1, −1)^t,

so P = (~v1|~v2) = [ 1 1 ; −1 0 ] is the desired matrix.

Check: P^{-1}AP = [ 0 −1 ; 1 1 ] [ 0 1 ; −1 −2 ] [ 1 1 ; −1 0 ] = [ 1 2 ; −1 −1 ] [ 1 1 ; −1 0 ] = [ −1 1 ; 0 −1 ] = J.
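The two-step strategy of Example 6.10 can be carried out mechanically. A numpy sketch (the variable names mirror the text; this is the example's specific 2 × 2 case, not a general algorithm):

```python
import numpy as np

A = np.array([[0., 1.],
              [-1., -2.]])
N = A + np.eye(2)              # A + I, since lambda_1 = -1

# (A + I)^2 = 0 by Cayley-Hamilton, so any v2 outside E_A(-1) will do;
# v2 not in E_A(-1) means (A + I) v2 != 0.
v2 = np.array([1., 0.])
v1 = N @ v2                    # = (1, -1)^t

P = np.column_stack([v1, v2])
J = np.linalg.inv(P) @ A @ P   # should be the Jordan block J(-1, 2)
```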

Remark. Note that the above procedure depends on picking a vector ~v2 in the space

EA^2(λ) := {~v ∈ C^m : (A − λI)^2~v = ~0};

such a vector is called a generalized eigenvector (of order ≤ 2).

 1 2 −1  Example 6.11. If A =  0 2 0 , find P such that P −1AP is a Jordan matrix. 1 −2 3 Solution. We shall follow the steps of the previous example. Step 1. Find the associated Jordan canonical form J. (i) Expanding the determinant along the 2nd row yields

chA(t) = (−1)^3 (2 − t) det [ 1−t −1 ; 1 3−t ] = (t − 2)[(1 − t)(3 − t) + 1] = (t − 2)^3.

Thus, the only eigenvalue is λ1 = 2; its algebraic multiplicity is m1 = 3. (ii) By row reduction we get

 −1 2 −1   1 −2 1  A − 2I =  0 0 0  →  0 0 0  . 1 −2 1 0 0 0

Thus, the 2-eigenspace is EA(2) = {c1(2, 1, 0)^t + c2(−1, 0, 1)^t : c1, c2 ∈ C}, and hence ν1 = 2. Therefore, J has 2 Jordan blocks.
(iii) From (i) and (ii) we conclude that

J = [ 2 1 0 ]
    [ 0 2 0 ]
    [ 0 0 2 ].

Step 2. Find P such that AP = PJ.
Write P = (~v1|~v2|~v3), where ~v1, ~v2, ~v3 ∈ C^3. Then we want to choose the ~vi's such that AP = PJ, i.e. such that (A~v1|A~v2|A~v3) = (2~v1|~v1 + 2~v2|2~v3). Thus we want:

A~v1 = 2~v1, i.e. (A − 2I)~v1 = ~0,
A~v2 = ~v1 + 2~v2, i.e. (A − 2I)~v2 = ~v1,
A~v3 = 2~v3, i.e. (A − 2I)~v3 = ~0.

In addition, we need to pick the ~vi's such that ~v1, ~v2 and ~v3 are linearly independent. Following the same line of thought as in the previous example, this leads to the following

Strategy: pick ~v2 ∈ EA^2(2), not in EA(2);

define ~v1 := (A − 2I)~v2;

pick ~v3 ∈ EA(2), linearly independent from ~v1,~v2.

Now EA(2) = {c1(2, 1, 0)^t + c2(1, 0, −1)^t} (cf. Step 1) and EA^2(2) = C^3 (since (A − 2I)^2 = 0).

Thus, take ~v2 = (1, 0, 0)^t, so ~v1 = (A − 2I)~v2 = (−1, 0, 1)^t, and take ~v3 = (2, 1, 0)^t ∈ EA(2). Then

P = (~v1|~v2|~v3) = [ −1 1 2 ]
                    [ 0 0 1 ].
                    [ 1 0 0 ]

Check: P^{-1}AP = [ 0 0 1 ; 1 −2 1 ; 0 1 0 ] [ 1 2 −1 ; 0 2 0 ; 1 −2 3 ] [ −1 1 2 ; 0 0 1 ; 1 0 0 ]
= [ 1 −2 3 ; 2 −4 2 ; 0 2 0 ] [ −1 1 2 ; 0 0 1 ; 1 0 0 ] = [ 2 1 0 ; 0 2 0 ; 0 0 2 ] = J;

in the above, P^{-1} was computed by row reducing (P|I) → (I|P^{-1}):

 −1 1 2 1 0 0   1 0 0 0 0 1   1 0 0 0 0 1   0 0 1 0 1 0  →  0 1 2 1 0 0  →  0 1 0 1 −2 0  . 1 0 0 0 0 1 0 0 1 0 1 0 0 0 1 0 1 0

 2 2 1  Example 6.12. Find P such that P −1AP is a Jordan matrix when A =  0 3 1  . 0 −1 1 Step 1. Find the Jordan canonical form J. (i) By expanding the determinant along the first column, we get

chA(t) = (t − 2) det [ 3−t 1 ; −1 1−t ] = (t − 2)[(3 − t)(1 − t) + 1] = (t − 2)^3,

so λ1 = 2 and m1 = 3.
(ii) Since

A − 2I = [ 0 2 1 ]     [ 0 2 1 ]
         [ 0 1 1 ]  →  [ 0 0 1 ],
         [ 0 −1 −1 ]   [ 0 0 0 ]

we see that the 2-eigenspace is EA(2) = {c(1, 0, 0)^t : c ∈ C}. Thus, ν1 = 1, and so J consists of 1 Jordan block:

J = [ 2 1 0 ]
    [ 0 2 1 ].
    [ 0 0 2 ]

Step 2. Find P such that AP = PJ.

Again, write P = (~v1|~v2|~v3), and choose the ~vi's such that AP = PJ, i.e. such that (A~v1|A~v2|A~v3) = (2~v1|~v1 + 2~v2|~v2 + 2~v3). Thus we want:

A~v1 = 2~v1, i.e. (A − 2I)~v1 = ~0,
A~v2 = ~v1 + 2~v2, i.e. (A − 2I)~v2 = ~v1,
A~v3 = ~v2 + 2~v3, i.e. (A − 2I)~v3 = ~v2.

Extending the reasoning of Example 6.10, we see that all these conditions are satisfied if we pick ~v3 such that

~v3 ∈ EA^3(2) := {~v ∈ C^3 : (A − 2I)^3~v = ~0},

and then define ~v2 = (A − 2I)~v3 and ~v1 = (A − 2I)~v2. In addition, we need to pick the ~vi's to be linearly independent, and this means that we must require that ~v3 ∉ EA^2(2). We thus have the following

Strategy: pick ~v3 ∈ EA^3(2), not in EA^2(2),

define ~v2 := (A − 2I)~v3,

define ~v1 := (A − 2I)~v2.

For this, we first need to compute the generalized eigenspaces EA^2(2) and EA^3(2). Since

(A − 2I)^2 = [ 0 2 1 ] [ 0 2 1 ]     [ 0 1 1 ]
             [ 0 1 1 ] [ 0 1 1 ]  =  [ 0 0 0 ],
             [ 0 −1 −1 ] [ 0 −1 −1 ] [ 0 0 0 ]

it follows that EA^2(2) = {c1(0, 1, −1)^t + c2(1, 0, 0)^t : c1, c2 ∈ C}. Moreover, EA^3(2) = C^3, since (A − 2I)^3 = 0, as can be seen either by a direct computation or by applying the Cayley-Hamilton Theorem: (A − 2I)^3 = chA(A) = 0. Thus

EA(2) = Nullsp(A − 2I) = {c(1, 0, 0)^t},
EA^2(2) = Nullsp((A − 2I)^2) = {c1(0, 1, −1)^t + c2(1, 0, 0)^t},
EA^3(2) = Nullsp((A − 2I)^3) = C^3.

Take ~v3 = (0, 0, 1)^t ∈ C^3; note that ~v3 ∉ EA^2(2). Then ~v2 = (A − 2I)~v3 = (1, 1, −1)^t ∈ EA^2(2) and ~v1 = (A − 2I)~v2 = (1, 0, 0)^t ∈ EA(2). Thus

P = (~v1|~v2|~v3) = [ 1 1 0 ]
                    [ 0 1 0 ]  satisfies: P^{-1}AP = J.
                    [ 0 −1 1 ]

 1 −1 0   2 2 1   1 1 0  Check : P −1AP =  0 1 0   0 3 1   0 1 0  0 1 1 0 −1 1 0 −1 1  2 −1 0   1 1 0   2 1 0  =  0 3 1   0 1 0  =  0 2 1  = J, 0 2 2 0 −1 1 0 0 2 where (as in the previous example) we have computef P −1 by row reduction:  1 1 0 1 0 0   1 1 0 1 0 0   1 0 0 1 −1 0  (P |I) =  0 1 0 0 1 0  →  0 1 0 0 1 0  →  0 1 0 0 1 0  . 0 −1 1 0 0 1 0 0 1 0 1 1 0 0 1 0 1 1

Remark. For matrices of size ≥ 4 a similar method would also work once we could complete Step 1, i.e. predict the matrix J. As long as the geometric multiplicity of every eigenvalue satisfies νA(λ) ≤ 3, the above method generalizes without much change, but not when some νA(λ) > 3. However, we shall see presently how to do this in general!

Exercises 6.3.

1. Find an invertible matrix P such that P^{-1}AP is in Jordan canonical form, where

(a) A = [ 1 −2 ; 2 5 ];   (b) A = [ 2 1 −1 ; 0 1 0 ; 1 1 0 ].

Also, find the Jordan canonical form of A in each case.

2. Find a matrix P such that P^{-1}AP is in Jordan canonical form when

(a) A = [ −2 1 −1 ; −6 4 −1 ; 8 −2 4 ];   (b) A = [ −2 0 −2 ; −6 2 −3 ; 8 0 6 ].

3. (a) Suppose B is a 3 × 3 matrix such that B^3 = 0, and there exists ~v ∈ C^3 such that B^2~v ≠ ~0. Show that ~v, B~v, B^2~v are linearly independent and that we have

P^{-1}BP = J(0, 3) = [ 0 1 0 ]
                     [ 0 0 1 ]  if P = (B^2~v|B~v|~v).
                     [ 0 0 0 ]

(b) Let A be a matrix with characteristic polynomial chA(t) = (t − λ)^3. Suppose there exists a vector ~v ∈ C^3 such that B^2~v ≠ ~0, where B = A − λI. Show that P = (B^2~v|B~v|~v) is invertible and that P^{-1}AP = J(λ, 3).
(c) More generally, suppose A is an m × m matrix with characteristic polynomial chA(t) = (t − λ)^m and that there exists a vector ~v ∈ C^m such that B^(m−1)~v ≠ ~0, where B = A − λI. Show that P = (B^(m−1)~v|B^(m−2)~v|...|B~v|~v) is invertible and that P^{-1}AP = J(λ, m).

6.4 Generalized Eigenvectors and the JCF

While for m ≤ 3 the algebraic and geometric multiplicities of the eigenvalues of A deter- mine the Jordan canonical form (JCF), this is no longer true for m ≥ 4, as Example 6.9 shows. For this reason we need to look at generalized eigenspaces.

Definition. Let A be an m × m matrix and λ ∈ C. If p ≥ 1 is an integer, then the p-th generalized eigenspace of A with respect to λ is the subspace

EA^p(λ) = Nullsp((A − λI)^p) = {~v ∈ C^m : (A − λI)^p~v = ~0}.

Its dimension

νA^p(λ) := dim EA^p(λ) = m − rank((A − λI)^p)

is called the p-th geometric multiplicity of λ in A, and any vector ~v ∈ EA^p(λ) is called a generalized λ-eigenvector of A of order ≤ p.

Remark. The generalized eigenspaces fit into an increasing sequence of subspaces

{~0} ⊂ EA(λ) = EA^1(λ) ⊂ EA^2(λ) ⊂ ··· ⊂ EA^p(λ) ⊂ ··· ⊂ C^m,

where EA^1(λ) is the (usual) eigenspace; for if ~v ∈ EA^p(λ), then B^p~v = ~0, where B = A − λI, and hence also B^(p+1)~v = B(B^p~v) = B~0 = ~0, i.e. ~v ∈ EA^(p+1)(λ). Thus, the generalized geometric multiplicities satisfy the inequalities

0 ≤ νA(λ) = νA^1(λ) ≤ νA^2(λ) ≤ ··· ≤ νA^p(λ) ≤ ··· ≤ m.

Notation. We denote the sequence of generalized geometric multiplicities by

νA^*(λ) = (νA^1(λ), νA^2(λ), ..., νA^p(λ), ...).

Example 6.13. If J = J(λ, k) is a Jordan block of size k, then for p < k the power (J − λI)^p = J(0, k)^p is the k × k matrix with 1's on the p-th superdiagonal (entries (i, i + p)) and 0's elsewhere, so

EJ^p(λ) = Nullsp((J − λI)^p) = {c1~e1 + ··· + cp~ep},

whereas EJ^p(λ) = Nullsp(0) = C^k if p ≥ k. Thus, for all p ≥ 1 we have

(7) νJ^p(λ) = min(p, k) = { p if p ≤ k; k if p ≥ k },

and hence

νJ^*(λ) = (1, 2, 3, . . . , k − 1, k, k, . . .).
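Formula (7) can be spot-checked with numpy; the helper jordan_block below is our own convenience function, not notation from the text:

```python
import numpy as np

# jordan_block is our helper: J(lam, k) with lam on the diagonal and 1's
# on the first superdiagonal.
def jordan_block(lam, k):
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

lam, k = 4.0, 5
N = jordan_block(lam, k) - lam * np.eye(k)   # N = J - lam*I is nilpotent

# nu^p = k - rank(N^p) should equal min(p, k), as in formula (7).
nus = [int(k - np.linalg.matrix_rank(np.linalg.matrix_power(N, p)))
       for p in range(1, 8)]
print(nus)   # [1, 2, 3, 4, 5, 5, 5]
```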

Theorem 6.5 (Properties of generalized eigenvectors). Let λ ∈ C and p ≥ 1.

(a) If A = Diag(B, C), then EA^p(λ) = EB^p(λ) ⊕ EC^p(λ), and hence νA^p(λ) = νB^p(λ) + νC^p(λ).

(b) If B = P^{-1}AP, then EA^p(λ) = P EB^p(λ); in particular, νB^p(λ) = νA^p(λ).

(c) If A is similar to a Jordan matrix J = Diag(...,J(λi, kij),...), then

(8) νA^p(λ) − νA^{p−1}(λ) = #(Jordan blocks J(λi, kij) of J with λi = λ and kij ≥ p).

(d) If νA^{p+1}(λ) = νA^p(λ), then νA^{p+q}(λ) = νA^p(λ) for all q ≥ 1.

Proof. (a) Since (A − λI)^p = Diag(B − λI, C − λI)^p = Diag((B − λI)^p, (C − λI)^p), we have (cf. the proof of Theorem 6.3)

EA^p(λ) = Nullsp((A − λI)^p) = Nullsp(Diag((B − λI)^p, (C − λI)^p)) = Nullsp((B − λI)^p) ⊕ Nullsp((C − λI)^p) = EB^p(λ) ⊕ EC^p(λ).

This proves the first statement of (a), and the second follows from the first by taking dimensions.

(b) We have (B − λI)^p = P^{-1}(A − λI)^p P, and so

~v ∈ EA^p(λ) ⇔ (A − λI)^p~v = ~0 ⇔ P^{-1}(A − λI)^p P P^{-1}~v = ~0 ⇔ (B − λI)^p P^{-1}~v = ~0 ⇔ P^{-1}~v ∈ EB^p(λ),

which means that EA^p(λ) = P EB^p(λ). Taking dimensions yields νA^p(λ) = νB^p(λ).

(c) Suppose first that A = J(λ, m) is a Jordan block. Then by Example 6.13 we have

(9) νA^p(λ) − νA^{p−1}(λ) = min(p, m) − min(p − 1, m) = { 1 if p ≤ m; 0 if p > m }.

Thus, formula (8) is true for Jordan blocks. Next, suppose that A = J = Diag(J11, . . . , Jij, . . .) is a Jordan matrix, where Jij = J(λi, kij). Then by (a) and (9) we have

νJ^p(λi) − νJ^{p−1}(λi) = Σ_j (νJij^p(λi) − νJij^{p−1}(λi)) = Σ_{j : kij ≥ p} 1 = #(blocks Jij with kij ≥ p),

which proves formula (8) for Jordan matrices. Finally, if A = PJP^{-1}, then by (b) we have νA^p(λ) − νA^{p−1}(λ) = νJ^p(λ) − νJ^{p−1}(λ), and so formula (8) follows from what was just proved.

(d) We first note that if A is similar to a Jordan matrix J, then the assertion is clear by (c). Indeed, if νA^p(λ) = νA^{p+1}(λ), then it follows from (c) that J has no Jordan blocks of size ≥ p + 1 with eigenvalue λ, and hence also none of size ≥ p + q, which means that νA^{p+q}(λ) = νA^{p+q−1}(λ).

Now although every matrix A is indeed similar to a Jordan matrix (Jordan’s theorem), we do not want to use this fact here, and so we give a direct proof of (d). This proof is based on the following formula which is also interesting in itself:

(10) νA^{p+1}(λ) − νA(λ) = dim(Im(A − λI) ∩ EA^p(λ)),

in which Im(B) = {B~v : ~v ∈ C^m} denotes (as usual) the image space of a matrix B (also called the range or column space of B). From formula (10) the assertion follows immediately: if ν^{p+1} = ν^p (writing ν^p for νA^p(λ) and ν for νA(λ)), then also E^{p+1} = E^p, and hence

ν^{p+2} − ν^{p+1} = (ν^{p+2} − ν) − (ν^{p+1} − ν) = dim(Im(B) ∩ E^{p+1}) − dim(Im(B) ∩ E^p) = 0.

Thus ν^{p+2} = ν^{p+1}, and so the claim follows by induction. It thus remains to verify (10). For this, put B = A − λI. Then we have

(11) B EA^{p+1}(λ) = Im(B) ∩ EA^p(λ).

Indeed, if ~w ∈ B EA^{p+1}(λ), i.e. ~w = B~v with ~v ∈ EA^{p+1}(λ), then B^p~w = B^p(B~v) = B^{p+1}~v = ~0, and so ~w ∈ Im(B) ∩ EA^p(λ). Conversely, if ~w = B~v ∈ Im(B) ∩ EA^p(λ), then B^{p+1}~v = B^p~w = ~0, so ~v ∈ EA^{p+1}(λ) and ~w = B~v ∈ B EA^{p+1}(λ). Thus, equality holds in (11). Taking dimensions in (11) yields

dim(Im(B) ∩ EA^p(λ)) = dim(B EA^{p+1}(λ)) = dim(EA^{p+1}(λ)) − dim(Nullsp(B) ∩ EA^{p+1}(λ)),

where the latter equality follows from the general fact (the rank–nullity theorem) that for any subspace V we have dim(BV) = dim V − dim(V ∩ Nullsp(B)). Now since here Nullsp(B) = EA(λ) ⊂ EA^{p+1}(λ), it follows that dim(Nullsp(B) ∩ EA^{p+1}(λ)) = dim EA(λ) = νA(λ), and so (10) follows since νA^{p+1}(λ) = dim EA^{p+1}(λ) by definition.

Remark. By part (d) we see that the generalized geometric multiplicities νA^p(λ) exhibit the following growth pattern:

νA^1(λ) < ··· < νA^p(λ) = νA^{p+1}(λ) = ··· = νA^{p+q}(λ) = ··· .

This raises the question: what is the value at which the geometric multiplicities stabilize? It turns out (though this is more difficult to prove) that this stabilizing value is precisely the algebraic multiplicity:

νA^p(λ) = νA^{p+1}(λ) ⇔ νA^p(λ) = mA(λ).

Corollary. The numbers νA^p(λ) determine the Jordan canonical form J of A by taking second differences. More precisely, the number nA^p(λ) of Jordan blocks of J of type J(λ, p) is given by the formula

(12) nA^p(λ) = ∆^2 νA^p(λ) := (νA^p(λ) − νA^{p−1}(λ)) − (νA^{p+1}(λ) − νA^p(λ)).

In particular, two Jordan matrices J and J′ are similar if and only if νJ^p(λ) = νJ′^p(λ) for all p ≥ 1 and λ ∈ C.

Proof. By formula (8) we have (νA^p(λ) − νA^{p−1}(λ)) − (νA^{p+1}(λ) − νA^p(λ)) = #(Jordan blocks J(λ, k) with k ≥ p) − #(Jordan blocks J(λ, k) with k ≥ p + 1) = nA^p(λ), which is (12).

If J and J′ are similar, then by Theorem 6.5(b) we have νJ^p(λ) = νJ′^p(λ) for all p ≥ 1 and λ ∈ C. Conversely, if all these numbers are equal, then it follows from (12) that J and J′ have exactly the same number of Jordan blocks of each type and hence are equal up to order. By the remark after Theorem 6.4 we know that then J and J′ are similar.
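The corollary translates directly into a computation. The sketch below (block sizes chosen by us for illustration) builds a Jordan matrix with known blocks, computes the multiplicities from ranks with numpy, and recovers the block counts via the second differences of formula (12); jordan_block and block_diag are our own helpers:

```python
import numpy as np

# Helpers (ours): a Jordan block and a block-diagonal assembler.
def jordan_block(lam, k):
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def block_diag(*blocks):
    n = sum(b.shape[0] for b in blocks)
    M = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        M[i:i + k, i:i + k] = b
        i += k
    return M

# A Jordan matrix with one 1x1, two 2x2, and one 4x4 block for lam = 2.
lam = 2.0
J = block_diag(jordan_block(lam, 1), jordan_block(lam, 2),
               jordan_block(lam, 2), jordan_block(lam, 4))
m = J.shape[0]
B = J - lam * np.eye(m)

def nu(p):   # p-th geometric multiplicity; nu(0) = 0
    return int(m - np.linalg.matrix_rank(np.linalg.matrix_power(B, p)))

# Formula (12): number of blocks J(lam, p) from second differences.
counts = [(nu(p) - nu(p - 1)) - (nu(p + 1) - nu(p)) for p in range(1, 5)]
print(counts)   # [1, 2, 0, 1]
```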

Example 6.14. Determine the Jordan blocks of a Jordan matrix A with characteristic polynomial chA(t) = (t − 7)^5 and generalized geometric multiplicities

νA^*(7) = (2, 4, 5, 5, 5, . . .).

Solution. By the above corollary, we have to determine the second differences ∆^2 νA^*, which can be calculated by using the following scheme:

p         0   1   2   3   4   5
νA^*      0   2   4   5   5   5
∆νA^*       2   2   1   0   0
∆^2 νA^*      0   1   1   0

Note that we added an extra column for p = 0 in the above table in order to be able to compute the first entry of ∆^2 νA^*.

Conclusion. A has:
0 blocks of size 1 × 1
1 block of size 2 × 2
1 block of size 3 × 3
0 blocks of size 4 × 4
etc.

Thus, up to order,

A = Diag(J(7, 3), J(7, 2)) =
[ 7 1 0 0 0 ]
[ 0 7 1 0 0 ]
[ 0 0 7 0 0 ]
[ 0 0 0 7 1 ]
[ 0 0 0 0 7 ]

Remark. In place of the above calculation scheme, we could have used instead the following longer but more detailed analysis:

νA^4(7) − νA^3(7) = 5 − 5 = 0 ⇒ 0 blocks of size ≥ 4
νA^3(7) − νA^2(7) = 5 − 4 = 1 ⇒ 1 block of size ≥ 3, hence 1 block of size exactly 3
νA^2(7) − νA^1(7) = 4 − 2 = 2 ⇒ 2 blocks of size ≥ 2, hence 1 block of size exactly 2
νA^1(7) − νA^0(7) = 2 − 0 = 2 ⇒ 2 blocks of size ≥ 1, hence 0 blocks of size exactly 1

 7 0 0 1 0   0 7 0 0 0    Example 6.15. Find the Jordan canonical form of A =  1 0 7 0 0 .    0 0 0 7 0  0 1 0 0 7 Solution. Step 1: Find the characteristic polynomial of A: 7 − t 0 0 1 0  7 − t 0 0 1  0 7 − t 0 0 0   0 7 − t 0 0 ch (t) = (−1)5 det  1 0 7 − t 0 0  = (t − 7) det   A    1 0 7 − t 0   0 0 0 7 − t 0      0 0 0 7 − t 0 1 0 0 7 − t 7 − t 0 0  = −(t − 7)2 det  0 7 − t 0  = (t − 7)5. 1 0 7 − t Step 2: Calculate the generalized geometric multiplicities: Put B = A − 7I. Then  0 0 0 1 0   0 0 0 0 0   0 0 0 0 0   0 0 0 0 0   0 0 0 0 0   0 0 0 0 0    2   3   B =  1 0 0 0 0  ,B =  0 0 0 1 0  ,B =  0 0 0 0 0  ,        0 0 0 0 0   0 0 0 0 0   0 0 0 0 0  0 1 0 0 0 0 0 0 0 0 0 0 0 0 0

which have rank 3, 1, and 0, respectively. Thus, since νA^p(7) = 5 − rank(B^p), we see that νA^*(7) = (2, 4, 5, 5, . . .).

Step 3: Find the Jordan blocks by the method of second differences.
Since the Jordan canonical form J of A has chJ(t) = (t − 7)^5 and its generalized geometric multiplicities are νJ^*(7) = (2, 4, 5, 5, . . .), we can conclude by Example 6.14 that, up to order,

J = Diag(J(7, 3), J(7, 2)) =
[ 7 1 0 0 0 ]
[ 0 7 1 0 0 ]
[ 0 0 7 0 0 ]
[ 0 0 0 7 1 ]
[ 0 0 0 0 7 ]
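The rank computations of Step 2 can be confirmed with numpy (a numerical check, not a replacement for the exact calculation):

```python
import numpy as np

# The matrix A of Example 6.15 and B = A - 7I.
A = np.array([[7, 0, 0, 1, 0],
              [0, 7, 0, 0, 0],
              [1, 0, 7, 0, 0],
              [0, 0, 0, 7, 0],
              [0, 1, 0, 0, 7]], dtype=float)
B = A - 7 * np.eye(5)

# ranks of B, B^2, B^3 and the resulting multiplicities nu^p = 5 - rank(B^p)
ranks = [int(np.linalg.matrix_rank(np.linalg.matrix_power(B, p))) for p in (1, 2, 3)]
nus = [5 - r for r in ranks]
print(ranks)   # [3, 1, 0]
print(nus)     # [2, 4, 5]
```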

 1 0 6 2 0 2   1 2 −1 0 0 0     0 0 4 1 0 1  Example 6.16. Find the Jordan canonical form of A =  .  1 0 −2 2 0 0     −1 0 4 0 2 1  −1 0 −2 −2 0 0 Section 6.4: Generalized Eigenvectors and the JCF 295

Solution. Step 1: Compute the characteristic polynomial of A.
Expanding the determinant successively along the 2nd column, the 4th column and the 3rd row, we get

chA(t) = (−1)^6 det(A − tI)
= (2 − t) det [ 1−t 6 2 0 2 ; 0 4−t 1 0 1 ; 1 −2 2−t 0 0 ; −1 4 0 2−t 1 ; −1 −2 −2 0 −t ]
= (t − 2)^2 det [ 1−t 6 2 2 ; 0 4−t 1 1 ; 1 −2 2−t 0 ; −1 −2 −2 −t ]
= (t − 2)^2 [ 1 · det [ 6 2 2 ; 4−t 1 1 ; −2 −2 −t ] − (−2) det [ 1−t 2 2 ; 0 1 1 ; −1 −2 −t ] + (2 − t) det [ 1−t 6 2 ; 0 4−t 1 ; −1 −2 −t ] ]
= (t − 2)^2 [ (−4 + 6t − 2t^2) + 2(1 − t)(2 − t) + (2 − t)(4 − 8t + 5t^2 − t^3) ]
= (t − 2)^2 (t − 2) [ (2 − 2t) + 2(t − 1) + (t^3 − 5t^2 + 8t − 4) ]
= (t − 2)^3 (t^3 − 5t^2 + 8t − 4) = (t − 1)(t − 2)^5.

Thus, we have two eigenvalues: λ1 = 1 and λ2 = 2, with algebraic multiplicities mA(λ1) = 1 and mA(λ2) = 5, respectively.

Step 2: Find the p-th geometric multiplicities.

(i) For λ1 = 1: Put

B1 = A − I =
[ 0 0 6 2 0 2 ]
[ 1 1 −1 0 0 0 ]
[ 0 0 3 1 0 1 ]
[ 1 0 −2 1 0 0 ]
[ −1 0 4 0 1 1 ]
[ −1 0 −2 −2 0 −1 ]

which row reduces to

[ 1 0 −2 1 0 0 ]
[ 0 1 1 −1 0 0 ]
[ 0 0 3 1 0 1 ]
[ 0 0 0 1 0 1 ]
[ 0 0 0 0 1 0 ]
[ 0 0 0 0 0 0 ]

Thus νA(λ1) = n − rank(B1) = 6 − 5 = 1. Since also mA(λ1) = 1, and always νA^p(λ1) ≤ mA(λ1) (cf. the remark after Theorem 6.5), we see that νA^p(λ1) = 1 for all p ≥ 1. (Alternatively, we could have computed B1^2 and noticed that rank(B1^2) = rank(B1).)

(ii) For λ2 = 2: Put

B2 = A − 2I =
[ −1 0 6 2 0 2 ]
[ 1 0 −1 0 0 0 ]
[ 0 0 2 1 0 1 ]
[ 1 0 −2 0 0 0 ]
[ −1 0 4 0 0 1 ]
[ −1 0 −2 −2 0 −2 ]

which row reduces to

[ 1 0 −1 0 0 0 ]
[ 0 0 1 0 0 0 ]
[ 0 0 0 1 0 1 ]
[ 0 0 0 0 0 1 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]

Thus νA(λ2) = n − rank(B2) = 6 − 4 = 2. Furthermore, since

B2^2 =
[ 1 0 −2 0 0 0 ]
[ −1 0 4 1 0 1 ]
[ 0 0 0 0 0 0 ]
[ −1 0 2 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 1 0 −2 0 0 0 ]

which row reduces to

[ 1 0 −2 0 0 0 ]
[ 0 0 2 1 0 1 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]

and

B2^3 =
[ −1 0 2 0 0 0 ]
[ 1 0 −2 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 1 0 −2 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ −1 0 2 0 0 0 ]

which row reduces to

[ 1 0 −2 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]
[ 0 0 0 0 0 0 ]

it follows that νA^2(2) = n − rank(B2^2) = 6 − 2 = 4 and νA^3(2) = n − rank(B2^3) = 6 − 1 = 5. We can stop here because νA^3(2) = 5 = mA(2), and hence νA^p(2) = 5 for all p ≥ 3.

Step 3: Find the number of Jordan blocks via the method of second differences.
By step 2 we have:

νA^*(1) = (1, 1, 1, 1, . . .)
νA^*(2) = (2, 4, 5, 5, . . .)

Thus, since νA(1) = mA(1) = 1, we have 1 block J(1, 1). Moreover, since the values of νA^*(2) are identical to those of νA^*(7) in Example 6.14, it follows (by taking second differences) that we also have the same number of blocks.

Thus, J has:
1 block of size 1 with eigenvalue λ1 = 1
1 block of size 2 with eigenvalue λ2 = 2
1 block of size 3 with eigenvalue λ2 = 2,

and hence the Jordan canonical form J of A is, up to order,

J = Diag(J(1, 1), J(2, 2), J(2, 3)) =
[ 1 0 0 0 0 0 ]
[ 0 2 1 0 0 0 ]
[ 0 0 2 0 0 0 ]
[ 0 0 0 2 1 0 ]
[ 0 0 0 0 2 1 ]
[ 0 0 0 0 0 2 ]
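As a cross-check of Steps 1–2, the multiplicity sequences of Example 6.16 can be recomputed numerically with numpy:

```python
import numpy as np

# The matrix A of Example 6.16.
A = np.array([[ 1, 0,  6,  2, 0, 2],
              [ 1, 2, -1,  0, 0, 0],
              [ 0, 0,  4,  1, 0, 1],
              [ 1, 0, -2,  2, 0, 0],
              [-1, 0,  4,  0, 2, 1],
              [-1, 0, -2, -2, 0, 0]], dtype=float)

def nus(lam, pmax):
    """Sequence nu^1, ..., nu^pmax for the eigenvalue lam."""
    B = A - lam * np.eye(6)
    return [int(6 - np.linalg.matrix_rank(np.linalg.matrix_power(B, p)))
            for p in range(1, pmax + 1)]

print(nus(1, 3))   # [1, 1, 1]
print(nus(2, 4))   # [2, 4, 5, 5]
```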

Exercises 6.4.

1. Find the generalized geometric multiplicities of the following Jordan matrices: (a) J = Diag(J(1, 2),J(1, 3)); (b) J = Diag(J(2, 1),J(2, 2),J(2, 3)); (c) J = Diag(J(3, 3),J(3, 3),J(3, 3)); (d) J = Diag(J(1, 3),J(3, 2),J(3, 3)).

2. Let J be a Jordan matrix (in standard form) and suppose that its characteristic polynomial has the form chJ(t) = (t + 1)^m. Find J if its sequence of generalized geometric multiplicities is
(a) νJ^*(−1) = (3, 6, 6, 6, . . .);
(b) νJ^*(−1) = (3, 5, 6, 6, 6, . . .);
(c) νJ^*(−1) = (3, 6, 7, 7, 7, . . .);
(d) νJ^*(−1) = (3, 6, 7, 8, 9, 9, 9, . . .).

3. Find the Jordan canonical form J of the following matrices and justify your result.

 −2 1 0 −1 1   −2 0 0 −1 1   0 −2 1 −1 1   0 −2 0 −1 1      (a) A =  0 0 −2 0 1  ; (b) B =  0 0 −2 0 1  .      0 0 0 −2 1   0 0 0 −2 1  0 0 0 0 −2 0 0 0 0 −2

[Do not find P such that P^{-1}AP = J or P^{-1}BP = J.]

4. Find the Jordan canonical form of the matrix

A =
[ −1 1 0 0 1 1 1 1 ]
[ 0 −1 0 0 0 1 1 1 ]
[ 0 0 −1 1 0 0 1 1 ]
[ 0 0 0 −1 0 0 0 1 ]
[ 0 0 0 0 −1 1 1 0 ]
[ 0 0 0 0 0 −1 0 1 ]
[ 0 0 0 0 0 0 1 1 ]
[ 0 0 0 0 0 0 0 1 ]

Be sure to justify your result by suitable computations, but do not find P such that P^{-1}AP is a Jordan matrix.

6.5 A procedure for finding P such that P^{-1}AP = J

In the previous section we proved that the generalized geometric multiplicities νA^p(λ) = dim EA^p(λ) determine the Jordan canonical form J of a matrix A. Here we shall see that the generalized eigenspaces EA^p(λ) can also be used to find a matrix P such that P^{-1}AP = J; this partly generalizes the method which we learned in section 6.3.

Procedure for finding P such that P^{-1}AP = J

I. Compute and factor the characteristic polynomial of the n × n matrix A:

chA(t) = (t − λ1)^{m1} (t − λ2)^{m2} ··· (t − λs)^{ms}.

II. For each eigenvalue λi, 1 ≤ i ≤ s, find a basis Bi of EA^*(λi) := EA^{mi}(λi) as follows:

1. Compute νA^j(λi) = n − rank((A − λiI)^j) for j = 1, . . . , mi, and find the minimal k = ki such that νA^k(λi) = mi.
[It is usually a good idea to find the generalized eigenspaces EA^j(λi) as well.]

2. Build up a basis Bi of EA^k(λi) = EA^{mi}(λi) = EA^*(λi) as follows:

i. Pick a generalized eigenvector ~v ∈ EA^k(λi) of degree k (so ~v ∉ EA^{k−1}(λi)); start the list Bi with

Bi = {~wi1 := ~v, ~wi2 := (A − λiI)~wi1, . . . , ~wik := (A − λiI)~wi,k−1}.

ii. If we already have mi vectors in the list, Bi is the desired basis and we are done with the eigenvalue λi; otherwise, proceed to the next step.

iii. Determine the largest ℓ such that EA^ℓ(λi) ⊄ EA^{ℓ−1}(λi) + span(Bi), and pick a generalized eigenvector ~u ∈ EA^ℓ(λi) such that ~u ∉ EA^{ℓ−1}(λi) + span(Bi). (In other words, pick a generalized eigenvector ~u of highest possible degree ℓ such that ~u is linearly independent of the vectors already in the list Bi, together with those of EA^{ℓ−1}(λi).)

iv. Add the vectors ~u, (A − λiI)~u, . . . , (A − λiI)^{ℓ−1}~u at the end of the list Bi. Go back to step ii.

III. Each (ordered) list Bi now has the form Bi = {~wi1, ~wi2, . . . , ~wi,mi}. Assemble these lists as the column vectors of the matrix P by reversing the order in each list Bi; thus,

P = (~w1,m1 | ~w1,m1−1 | . . . | ~w11 | ~w2,m2 | ~w2,m2−1 | . . . | ~w21 | . . . | ~ws,ms | . . . | ~ws1).

Then J = P^{-1}AP is the Jordan canonical form of A, and the Jordan blocks with the same eigenvalue are arranged in order of increasing block size, i.e. J is in reverse standard form (cf. p. 281).
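As an illustration of steps II–III, the following sketch carries out the chain construction for the matrix of Example 6.15; the particular starting vectors ~v = ~e4 and ~u = ~e2 are our own (valid) choices:

```python
import numpy as np

# Matrix of Example 6.15; chains built as in steps II-III (vector choices ours).
A = np.array([[7, 0, 0, 1, 0],
              [0, 7, 0, 0, 0],
              [1, 0, 7, 0, 0],
              [0, 0, 0, 7, 0],
              [0, 1, 0, 0, 7]], dtype=float)
B = A - 7 * np.eye(5)
e = np.eye(5)

v = e[:, 3]                       # degree 3: B @ B @ v != 0 while B^3 = 0
chain1 = [v, B @ v, B @ B @ v]    # w11, w12, w13
u = e[:, 1]                       # degree 2, independent of chain1 and E^1
chain2 = [u, B @ u]               # w14, w15

# Step III: reverse the whole list and use the vectors as columns of P.
P = np.column_stack((chain1 + chain2)[::-1])
J = np.linalg.inv(P) @ A @ P

# Reverse standard form: blocks of increasing size, J(7,2) then J(7,3).
expected = np.array([[7, 1, 0, 0, 0],
                     [0, 7, 0, 0, 0],
                     [0, 0, 7, 1, 0],
                     [0, 0, 0, 7, 1],
                     [0, 0, 0, 0, 7]])
print(np.allclose(J, expected))   # True
```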

 3 1 0 0 0   0 3 0 0 0    Example 6.17. Verify the algorithm for A = Diag(J(3, 2),J(3, 3)) =  0 0 3 1 0 .    0 0 0 3 1  0 0 0 0 3 Solution. Since this matrix is already in (reverse standard) Jordan Canonical Form, we know that we can take P = I. Thus, we expect the algorithm to assemble the I.

I. Clearly, chA(t) = (t − 3)^5, so λ1 = 3 and m1 = n = 5.

II. 1. The generalized eigenspaces are:

EA(3) = Nullsp(A − 3I) = Nullsp [ 0 1 0 0 0 ; 0 0 0 0 0 ; 0 0 0 1 0 ; 0 0 0 0 1 ; 0 0 0 0 0 ] = {c1~e1 + c2~e3}

EA^2(3) = Nullsp((A − 3I)^2) = Nullsp [ 0 0 0 0 0 ; 0 0 0 0 0 ; 0 0 0 0 1 ; 0 0 0 0 0 ; 0 0 0 0 0 ] = {c1~e1 + c2~e2 + c3~e3 + c4~e4}

EA^3(3) = Nullsp((A − 3I)^3) = Nullsp(0) = C^5

Thus, νA(3) = 2 < νA^2(3) = 4 < νA^3(3) = 5 = m1, and so k = 3.

2. We now construct the basis B1 of EA^3(3) = C^5 as follows.

i. Pick ~v ∈ C^5 of exact degree 3, i.e., ~v ∉ EA^2(3). For example, take ~v = ~e5. Then: ~w11 = ~e5, ~w12 = (A − 3I)~e5 = ~e4, ~w13 = (A − 3I)~e4 = ~e3.

ii. At this point the list is B1 = {~e5, ~e4, ~e3}, which consists of only 3 < 5 elements, so we continue with step iii.

iii. Since EA^3(3) ⊂ EA^2(3) + span(B1), we cannot take ℓ = 3. Thus, try ℓ = 2, and look for ~u ∈ EA^2(3) = {c1~e1 + c2~e2 + c3~e3 + c4~e4} such that ~u ∉ EA(3) + span(B1) = span{~e1, ~e3, ~e4, ~e5}. Clearly, we can take ~u = ~e2 (and hence ℓ = 2).

iv. Thus, we add ~u = ~e2 and (A − 3I)~u = ~e1 at the end of B1 to get B1 = {~e5, ~e4, ~e3, ~e2, ~e1}. Since we now have 5 = m1 vectors in B1, we have constructed the desired basis of EA^3(3) = C^5.

III. Assembling P from B1 (in reverse order) yields P = (~e1 | . . . | ~e5) = I, the 5 × 5 identity matrix. Thus, P^{-1}AP = A = J is in Jordan canonical form.

Example 6.18. If A is the matrix of Example 6.16, find P such that J = P^{-1}AP is the Jordan canonical form of A.

Solution. We follow the steps of the algorithm.

Step I. Compute and factor the characteristic polynomial.
By Example 6.16 we know that the characteristic polynomial is chA(t) = (t − 1)(t − 2)^5, so we have 2 eigenvalues: λ1 = 1 and λ2 = 2.

Step II. For each i find a basis Bi of EA^{mi}(λi):

a) For λ1 = 1:

1. Again, from Example 6.16 we know that νA(1) = 1; moreover, from the reduced matrix given there we obtain EA(1) = ⟨(1, −1, 0, −1, 0, 1)^t⟩.

2. Thus, B1 = {~w11}, where ~w11 := (1, −1, 0, −1, 0, 1)^t.

b) For λ2 = 2:

1. From Example 6.16 we know that νA^*(2) = (2, 4, 5, 5, . . .), so the smallest k such that νA^k(2) = m2 is k = 3. Moreover, from the row reduced matrices of Example 6.16 we obtain:

EA(2) = ⟨~u11, ~u12⟩, where ~u11 = (0, 1, 0, 0, 0, 0)^t, ~u12 = (0, 0, 0, 0, 1, 0)^t;
EA^2(2) = ⟨~u11, ~u12, ~u21, ~u22⟩, where ~u21 = (0, 0, 0, −1, 0, 1)^t, ~u22 = (2, 0, 1, −2, 0, 0)^t;
EA^3(2) = ⟨~u11, ~u12, ~u31, ~u32, ~u33⟩, where ~u31 = (0, 0, 0, 1, 0, 0)^t, ~u32 = (0, 0, 0, 0, 0, 1)^t, ~u33 = (2, 0, 1, 0, 0, 0)^t.

Note that there are two relations among the ~uij's: ~u21 = −~u31 + ~u32 and ~u22 = ~u33 − 2~u31.

2. i. Clearly, ~v = ~u31 = (0, 0, 0, 1, 0, 0)^t ∈ EA^3(2) but ~v ∉ EA^2(2), so ~v has degree 3. Thus, we can start the list B2 with ~v as a generator. (We could also have taken ~v = ~u32 or ~u33 or most linear combinations of these vectors.) Thus, put ~w21 = ~v, ~w22 = (A − 2I)~w21 = (2, 0, 1, 0, 0, −2)^t = ~u22 − 2~u21, ~w23 = (A − 2I)~w22 = (0, 1, 0, 0, 0, 0)^t = ~u11. (Note that (A − 2I)~w23 = ~0, as expected since k = 3.) We thus have the list

B21 = {~w21, ~w22, ~w23}.

ii. Since #B21 = 3 < m2 = 5, we continue with the next step.

iii. Clearly EA^2(2) + span(B21) = ⟨~u11, ~u12, ~u21, ~u22, ~u31, ~u22 − 2~u21⟩ = EA^3(2), so ℓ < 3. However, ~u21 ∈ EA^2(2) but ~u21 ∉ EA(2) + span(B21), so ℓ = 2. Moreover, we can take ~w24 := ~u21 as the next generator. Then ~w25 := (A − 2I)~w24 = (0, 0, 0, 0, 1, 0)^t = ~u12 is the next element in the list:

B22 = {~w24, ~w25}.

ii. We have now constructed the list B2 = B21 ∪ B22 = {~w21, . . . , ~w25}. Since #B2 = 5 = mA(2), we are done with step II.

Step III. In step II we constructed the lists B1 = {~w11} and B2 = B21 ∪ B22 = {~w21, . . . , ~w25}. Assembling the elements of each list in reverse order yields

P = (~w11 | ~w25 | ~w24 | ~w23 | ~w22 | ~w21) =
[ 1 0 0 0 2 0 ]
[ −1 0 0 1 0 0 ]
[ 0 0 0 0 1 0 ]
[ −1 0 −1 0 0 1 ]
[ 0 1 0 0 0 0 ]
[ 1 0 1 0 −2 0 ]

Thus, P is the matrix which transforms A to its Jordan canonical form; i.e. P is such that J = P^{-1}AP is a Jordan matrix.

Check: With

P^{-1} =
[ 1 0 −2 0 0 0 ]
[ 0 0 0 0 1 0 ]
[ −1 0 4 0 0 1 ]
[ 1 1 −2 0 0 0 ]
[ 0 0 1 0 0 0 ]
[ 0 0 2 1 0 1 ]

we have

P^{-1}A =
[ 1 0 −2 0 0 0 ]
[ −1 0 4 0 2 1 ]
[ −2 0 8 0 0 2 ]
[ 2 2 −3 0 0 0 ]
[ 0 0 4 1 0 1 ]
[ 0 0 4 2 0 2 ]

and hence

P^{-1}AP =
[ 1 0 0 0 0 0 ]
[ 0 2 1 0 0 0 ]
[ 0 0 2 0 0 0 ]
[ 0 0 0 2 1 0 ]
[ 0 0 0 0 2 1 ]
[ 0 0 0 0 0 2 ]
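The check above can also be done in a couple of lines with numpy, using the equivalent identity AP = PJ (which avoids computing P^{-1} by hand):

```python
import numpy as np

# A, P and J as in Examples 6.16 and 6.18.
A = np.array([[ 1, 0,  6,  2, 0, 2],
              [ 1, 2, -1,  0, 0, 0],
              [ 0, 0,  4,  1, 0, 1],
              [ 1, 0, -2,  2, 0, 0],
              [-1, 0,  4,  0, 2, 1],
              [-1, 0, -2, -2, 0, 0]], dtype=float)
P = np.array([[ 1, 0,  0, 0,  2, 0],
              [-1, 0,  0, 1,  0, 0],
              [ 0, 0,  0, 0,  1, 0],
              [-1, 0, -1, 0,  0, 1],
              [ 0, 1,  0, 0,  0, 0],
              [ 1, 0,  1, 0, -2, 0]], dtype=float)
J = np.array([[1, 0, 0, 0, 0, 0],
              [0, 2, 1, 0, 0, 0],
              [0, 0, 2, 0, 0, 0],
              [0, 0, 0, 2, 1, 0],
              [0, 0, 0, 0, 2, 1],
              [0, 0, 0, 0, 0, 2]], dtype=float)

assert abs(np.linalg.det(P)) > 1e-9      # P is invertible
print(np.allclose(A @ P, P @ J))         # True, i.e. P^{-1} A P = J
```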

Exercises 6.5.

1. Find a matrix P such that P^{-1}AP is in Jordan canonical form for each of the matrices A of Problem 3 of Exercises 6.4.

2. If A is as in Problem 4 of Exercises 6.4, find P such that P^{-1}AP is in Jordan canonical form.

3. Find a matrix P such that P^{-1}AP is in Jordan canonical form, where

A =
[ −1 1 0 0 1 1 1 1 ]
[ 0 −1 0 0 0 1 1 1 ]
[ 0 0 −1 1 0 0 1 1 ]
[ 0 0 0 −1 0 0 0 1 ]
[ 0 0 0 0 −1 1 1 0 ]
[ 0 0 0 0 0 1 0 0 ]
[ 0 0 0 0 0 1 1 0 ]
[ 0 0 0 0 0 1 1 1 ]

6.6 A Proof of the Cayley–Hamilton Theorem

As was promised in the introduction, we want to use Jordan’s theorem (Theorem 6.4) to prove:1

Theorem 6.6 (Cayley-Hamilton). For any square matrix A we have chA(A) = 0.

Proof. We present here a proof that is typical of arguments using the Jordan canonical form, in that it follows this pattern:

Step 1. Prove the assertion for Jordan blocks.
Step 2. Prove the statement for Jordan matrices (using step 1).
Step 3. Deduce from step 2 and Jordan's theorem that the assertion is true for a general matrix.

We now apply this strategy to proving the Cayley-Hamilton theorem.

Step 1. Let A = J(λ, k) be a Jordan block. Then clearly chA(t) = (t − λ)^k, so

chA(A) = (J(λ, k) − λI)^k = J(0, k)^k = 0,

because for any p ≤ k we have (cf. Example 6.13):

J(0, k)^p is the k × k matrix with 1's on the p-th superdiagonal (entries (i, i + p)) and 0's elsewhere; in particular, J(0, k)^k = 0.

Thus, the Cayley-Hamilton theorem holds for Jordan blocks.

Step 2. Let A = Diag(J11,...,Jij,...) be a Jordan matrix.

Put c(t) = chA(t). Since A is a block diagonal matrix, we have

c(t) = chJ11 (t) ··· chJij (t) ··· ,

so in particular for each i, j we have c(t) = gij(t) chJij (t) for some polynomial gij(t).

Thus, c(Jij) = gij(Jij) chJij (Jij) = 0 since by step 1 we have chJij (Jij) = 0. Thus

c(A) = c(Diag(J11, . . . , Jij, . . .)) = Diag(c(J11), . . . , c(Jij), . . .) = Diag(0, . . . , 0, . . .) = 0,

and so the statement holds for Jordan matrices.

¹In the appendix we shall give a direct proof of the Cayley-Hamilton Theorem; cf. Theorem 6.7 and the remark following it.

Step 3. Let A be an arbitrary matrix. By Jordan's theorem, there is a matrix P such that J = P^{-1}AP is a Jordan matrix. Then chA(t) = chJ(t) and so, using step 2, we obtain

chA(A) = chJ(PJP^{-1}) = P chJ(J) P^{-1} = P · 0 · P^{-1} = 0.

Thus, the Cayley-Hamilton theorem holds for an arbitrary matrix A.

As was mentioned in the above proof, the basic strategy used in the proof applies to many other situations as well. For example:

Example 6.19. Find all m × m matrices A such that A^2 = I.

Solution. We follow the above strategy.

Step 1. Find all Jordan blocks J = J(λ, k) satisfying J^2 = I.
By the explicit formula for powers of Jordan blocks (cf. Theorem 5.7), this can only happen if k = 1. Furthermore, in that case we must have λ^2 = 1, i.e. either λ = 1 or λ = −1.

Step 2. Find all Jordan matrices J = Diag(J11, . . . , Jij, . . .) satisfying J^2 = I.
Since J^2 = Im implies that Jij^2 = Ikij, we obtain from step 1 that J is a diagonal matrix with ±1 along the diagonal. (Conversely, every matrix J of this form satisfies J^2 = I.)

Step 3. General case: A = PJP^{-1} is any matrix such that A^2 = I.
Then we also have J^2 = P^{-1}A^2P = I, so by step 2 A is similar to a diagonal matrix J = Diag(±1, . . . , ±1). Conversely, any matrix A of this form satisfies A^2 = I.

Conclusion. A matrix A satisfies A^2 = I if and only if it is similar to a diagonal matrix of the form Diag(±1, . . . , ±1).
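The Cayley-Hamilton theorem itself is easy to test numerically for any particular matrix. In the sketch below, the 3 × 3 matrix and its hand-expanded characteristic polynomial chA(t) = (t − 2)^2(t − 5) = t^3 − 9t^2 + 24t − 20 are our own example:

```python
import numpy as np

# Sample matrix (ours) with ch_A(t) = (t - 2)^2 (t - 5) = t^3 - 9t^2 + 24t - 20.
A = np.array([[2, 1, 0],
              [0, 2, 0],
              [1, 3, 5]], dtype=float)
coeffs = [1, -9, 24, -20]   # coefficients of ch_A, leading term first

# Evaluate ch_A(A) = A^3 - 9A^2 + 24A - 20I; Cayley-Hamilton says it vanishes.
chA_of_A = sum(c * np.linalg.matrix_power(A, 3 - i) for i, c in enumerate(coeffs))
print(np.allclose(chA_of_A, np.zeros((3, 3))))   # True
```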

Exercises 6.6.

1. Let A be a matrix with characteristic polynomial

chA(t) = (t − λ1)^{m1} (t − λ2)^{m2} ··· (t − λs)^{ms}.

Prove that for any polynomial f(t) ∈ C[t], the characteristic polynomial of f(A) is

chf(A)(t) = (t − f(λ1))^{m1} (t − f(λ2))^{m2} ··· (t − f(λs))^{ms}.

Hint: First prove it in the case that A is a Jordan block, then for a general Jordan matrix, and then use Jordan’s theorem.

2. Find all m × m matrices A satisfying A^2 = A. (A matrix satisfying this equation is called an idempotent.)