
MAS 5312 – Lecture for April 6, 2020

Richard Crew, Department of Mathematics, University of Florida

The determination of the rational canonical form of a matrix A used the invariant factors of the K[X]-module VA. A number of questions arise here. First, is there an irrational canonical form? Furthermore the elementary divisors of A are all, like, "so what are we? chopped liver"? To these, uh, questions we now turn.

But first let's review some basic facts about eigenvectors. Recall that a nonzero v ∈ K^n is an eigenvector of A if Av = λv for some λ ∈ K. If this is so, v is annihilated by A − λIn, which must then be a singular matrix. Conversely, if A − λIn is singular it annihilates a nonzero vector v, which is then an eigenvector of A with eigenvalue λ. On the other hand A − λIn is singular if and only if det(A − λIn) = ΦA(λ) = 0, so the eigenvalues of A are precisely the roots of the characteristic polynomial ΦA(X). Of course this polynomial need not have any roots in K, but there is always an extension of K, that is, a field L containing K, such that ΦA(X) factors into linear factors over L. We will see later how to construct such fields.

While we're at it let's recall another useful fact about eigenvectors: if v1, ..., vr are eigenvectors of A belonging to distinct eigenvalues λ1, ..., λr, then v1, ..., vr are linearly independent. Recall the proof: suppose there is a nontrivial dependence relation

a1v1 + a2v2 + ··· + ar vr = 0.

We can assume that this relation is the shortest possible, and in particular all the ai are nonzero. Apply A; we find

λ1a1v1 + λ2a2v2 + ··· + λr ar vr = 0.

On the other hand, multiplying the original relation by λ1 yields

λ1a1v1 + λ1a2v2 + ··· + λ1ar vr = 0.

Subtracting these gives

(λ1 − λ2)a2v2 + ··· + (λ1 − λr)ar vr = 0,

which is a shorter dependence relation, nontrivial because λ1 − λi ≠ 0 for all i > 1 and all the ai are nonzero. Contradiction.

If we are lucky, the characteristic polynomial ΦA(X) has all its roots in K and all the roots have multiplicity one; then the n eigenvectors v1, ..., vn are linearly independent elements of K^n and are thus a basis. If the corresponding eigenvalues are λ1, ..., λn, the matrix of A for the basis v1, ..., vn is the diagonal matrix with diagonal entries λ1, ..., λn. And this is good. Well, nice anyway...

Of course the roots of ΦA(X) need not lie in K and one will have to pass to a larger field to find them; this is why this diagonal form is an "irrational" canonical form. But usually replacing K by an extension is not a problem. More serious is the possibility that ΦA(X) has multiple roots. Of course a "randomly chosen" matrix will be such that ΦA(X) has only simple roots; in mathematics we usually spend most of our time worrying about the cases that, strictly speaking, arise with probability zero.

From now on we fix an n × n matrix A and assume that ΦA(X) factors

ΦA(X) = ±(X − λ1)^n1 (X − λ2)^n2 ··· (X − λr)^nr

with distinct λ1, ..., λr ∈ K, so that ni is the multiplicity of λi as a root of ΦA(X). Note in this case that Σi ni = n; the sign in the above equation, by the way, is (−1)^n. Recall now that ΦA(X) is, up to sign, the product of the invariant factors of A, and thus of the elementary divisors of A. Then for any i the elementary divisors that are powers of X − λi can be written

(X − λi)^mi1, (X − λi)^mi2, ..., (X − λi)^mis_i

and then Σj mij = ni. With this notation the K[X]-module VA is the direct sum

VA ≅ ⊕ij K[X]/(X − λi)^mij

where 1 ≤ i ≤ r and, for each i, 1 ≤ j ≤ si. If we choose a basis of each K[X]/(X − λi)^mij and conglomerate these into a basis of VA, the matrix of A will be block diagonal, with the diagonal entries being the matrix representation of "multiplication by X" on the summand K[X]/(X − λi)^mij. We could choose the same kind of basis we did before, and the resulting matrix would be the companion matrix of the elementary divisor (X − λi)^mij. There is, however, a better way.

To simplify notation we just look at the K[X]-module K[X]/(X − λ)^m for some λ ∈ K and integer m > 0. As before we denote by x^i the image of X^i in K[X]/(X − λ)^m. The companion matrix of (X − λ)^m arises from choosing the basis 1, x, x^2, ..., x^(m−1), but this is not the simplest possible matrix. Instead, we choose the basis

1, (x − λ), (x − λ)^2, ..., (x − λ)^(m−1).

That this is indeed a basis can be seen as follows: by the binomial theorem,

(x − λ)^i = x^i + (a linear combination of the x^k, k < i).

From this we see that the matrix expressing the (x − λ)^i in terms of the x^i is upper triangular with 1s on the diagonal. In particular it is invertible, so the x^i are all linear combinations of the (x − λ)^i. In particular the (x − λ)^i must span K[X]/(X − λ)^m, and are thus a basis since there are m of them.

We can now compute the matrix of "multiplication by X" (or by x; it is the same). We find

    x(x − λ)^i = (x − λ)^(i+1) + λ(x − λ)^i    for 0 ≤ i < m − 1,
    x(x − λ)^(m−1) = λ(x − λ)^(m−1)

since (x − λ)^m = 0 in K[X]/(X − λ)^m. This can be written

    x (1, (x − λ), (x − λ)^2, ..., (x − λ)^(m−1)) =

    [ λ 1 0 0 ··· 0 ]
    [ 0 λ 1 0 ··· 0 ]
    [ 0 0 λ 1 ··· 0 ]
    [ .           . ]
    [ 0 0 0 0 ··· 1 ]
    [ 0 0 0 0 ··· λ ]  (1, (x − λ), (x − λ)^2, ..., (x − λ)^(m−1))

with the basis vectors arranged in a column. The matrix on the right hand side of the last equation is called a Jordan block of size m with eigenvalue λ. In fact it follows from an (easy) direct calculation that the characteristic polynomial of this Jordan block is (λ − X)^m, so λ is the unique eigenvalue; note that the (unique up to multiples) eigenvector is (x − λ)^(m−1), by the calculations we did above. That this eigenvector is unique up to multiples can also be seen by explicit calculation, or else from the observation that the kernel of multiplication by X − λ on K[X]/(X − λ)^m is spanned by the class of (X − λ)^(m−1).

We have thus proven the following theorem:

Theorem. Suppose A is an n × n matrix with coefficients in K, and that the distinct roots λ1, ..., λr of ΦA(X) all lie in K. If the elementary divisors of A are the (X − λi)^mij for 1 ≤ i ≤ r and 1 ≤ j ≤ si, then A is similar to the block diagonal matrix whose diagonal entries are the Jordan blocks of size mij and eigenvalue λi. This block diagonal matrix is unique up to permutation of the diagonal entries.

Perhaps the only statement in the theorem that needs explanation is the very last one. In fact the number and sizes of the Jordan blocks are determined by the elementary divisors, and thus by the similarity class of A, by the theorem in our last lecture. We will write Jm(λ) to denote a Jordan block of size m with eigenvalue λ, i.e. an m × m matrix with λ on the diagonal, 1 immediately above the diagonal and zeros elsewhere.

Let's go back a bit. Recall that for any PID R, irreducible p ∈ R and R-module M, the p-primary component Mp of M is the set of elements annihilated by some power of p.
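As a concrete illustration of the theorem, here is a small computation in sympy (my own example, not part of the lecture): we conjugate a block diagonal matrix built from the Jordan blocks J2(2) and J1(3) by an invertible matrix, and recover the Jordan form.

```python
# Sketch (my own example, assuming sympy): the theorem in action.  We conjugate
# a known block diagonal matrix of Jordan blocks and recover its Jordan form.
import sympy as sp

J0 = sp.diag(sp.Matrix([[2, 1], [0, 2]]), sp.Matrix([[3]]))  # J_2(2) and J_1(3)
C = sp.Matrix([[1, 1, 0], [0, 1, 1], [1, 0, 1]])             # invertible, det = 2
A = C * J0 * C.inv()                                         # A is similar to J0

P, J = A.jordan_form()                                       # A = P J P^{-1}
assert P * J * P.inv() == A
assert sorted(J.diagonal()) == [2, 2, 3]       # eigenvalues with multiplicity
assert sum(J[i, i + 1] for i in range(2)) == 1 # exactly one 1 above the diagonal
```

The last two assertions are stated so that they hold regardless of how the blocks are ordered, since the Jordan form is unique only up to permutation of the blocks.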
In our case, if λ ∈ K is an eigenvalue of A on V = K^n, then X − λ is irreducible, and if we write V for the K[X]-module VA, then V(X−λ) is the set of v ∈ V annihilated by some power of X − λ, or equivalently by some power of A − λIn. The submodule V(X−λ) is called the generalized λ-eigenspace for the matrix A:

V(X−λ) = {v ∈ V | (A − λIn)^k v = 0 for some k ∈ N}.

It will be more convenient to write V(λ) for V(X−λ). With this notation:

Theorem. Suppose A is an n × n matrix with entries in K and set V = K^n. If the characteristic polynomial of A factors into linear factors in K, then

V = ⊕λ V(λ)

where the sum is over the eigenvalues λ of A.

Proof. In fact we know that as a K[X]-module, V is the direct sum of modules of the form K[X]/(X − λ)^n for some eigenvalue λ of A and n ∈ N. The direct sum in the theorem arises by grouping together the terms with the same eigenvalue.

We will now reprove some old results:

Corollary

If A ∈ Mn(K) has n distinct eigenvalues in K, then A is diagonalizable, i.e. similar to a diagonal matrix.

Proof. We've already proven this, but let's reprove it using our new technology. If the roots of ΦA are distinct, each V(λ) has the form K[X]/(X − λ), and is in particular a one-dimensional K-vector space. Thus A is similar to a block diagonal matrix whose diagonal entries are 1 × 1 matrices, i.e. a diagonal matrix.
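The corollary is easy to check by machine. A sketch in sympy (my own example, not from the lecture): the matrix below has characteristic polynomial with two simple roots, so it is diagonalizable.

```python
# Sketch (my example, assuming sympy): a matrix with distinct eigenvalues
# is diagonalizable.
import sympy as sp

A = sp.Matrix([[2, 1], [0, 3]])            # eigenvalues 2 and 3, distinct
lam = sp.symbols('lam')
phi = (A - lam * sp.eye(2)).det()          # Phi_A(lam) = det(A - lam*I)
assert sp.roots(sp.Poly(phi, lam)) == {2: 1, 3: 1}  # two simple roots

P, D = A.diagonalize()                     # columns of P are eigenvectors
assert P.inv() * A * P == D                # A is similar to the diagonal matrix D
assert D == sp.diag(2, 3)
```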

Corollary

Suppose λ1, . . . , λr are distinct eigenvalues of A and for 1 ≤ i ≤ r, vi is an eigenvector of A for the eigenvalue λi . Then v1,..., vr are linearly independent.

Proof. In fact the v1, ..., vr all lie in distinct summands V(λi) as in the theorem, and are thus independent.

In general, a module M over a (not necessarily commutative) ring R is said to be semisimple (try working that word into your conversation) if it is a finite direct sum of simple modules. We have seen that if R is a PID the simple modules all have the form R/(p) for some irreducible p ∈ R. The first corollary can then be rephrased as saying that A is diagonalizable if and only if the K[X]-module VA is semisimple; for this reason a diagonalizable matrix A is sometimes called a semisimple matrix.

Here's something new:

Corollary

Suppose all of the eigenvalues of A ∈ Mn(K) lie in K. Then there is a diagonalizable matrix D and a nilpotent matrix N such that A = D + N and DN = ND.

Proof. The corollary holds for any Jordan block Jm(λ): take D to be the scalar matrix with diagonal entry λ and N to be the matrix with 1s just above the diagonal and 0s elsewhere. But then the corollary holds for any block diagonal matrix with Jordan blocks on the diagonal. Finally, if C ∈ GLn(K) is such that CAC^(−1) is in Jordan canonical form, write CAC^(−1) = D0 + N0 with D0 diagonal, N0 nilpotent and D0N0 = N0D0. Then

A = C^(−1)D0C + C^(−1)N0C

and we set D = C^(−1)D0C and N = C^(−1)N0C; evidently D is diagonalizable, and if (N0)^k = 0 then N^k = C^(−1)(N0)^k C = 0 as well. By construction A = D + N, and finally, since D0 and N0 commute, so do D and N.

In fact one can show that (1) the hypothesis that the eigenvalues lie in K is unnecessary, and (2) D and N can be written as polynomials in A with coefficients in K. The decomposition A = D + N is known as the Jordan-Chevalley decomposition and is important in the theory of Lie algebras and algebraic groups.

One application of the Jordan-Chevalley decomposition is too famous to pass without mention. Suppose that

dv/dt = Av

is a first-order system with constant matrix A ∈ Mn(C). Formal calculations show that a fundamental matrix solution is X = e^(At). How does one compute the exponential? By writing A = D + N with D diagonalizable, N nilpotent and DN = ND. In fact the last relation implies that

e^((D+N)t) = e^(Dt) e^(Nt),

so all we have to do is compute e^(Dt) and e^(Nt). If D = C^(−1)EC with E diagonal, then D^k = C^(−1)E^k C and consequently e^(Dt) = C^(−1)e^(Et)C, and the exponential of a diagonal matrix is easy (exponentiate the diagonal entries). Finally, if N^r = 0,

e^(Nt) = Σ_{k≥0} N^k t^k / k! = Σ_{0≤k<r} N^k t^k / k!,

a finite sum.
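Here is a minimal 2 × 2 sketch of this computation in sympy (my own example, not from the lecture). The matrix A below has characteristic polynomial (X − 2)^2, so its Jordan-Chevalley decomposition is A = 2I + N with N^2 = 0, and e^(At) is computed from the two commuting factors. We check that the result really is a fundamental matrix solution.

```python
# Sketch (my example, assuming sympy): computing e^{At} via A = D + N.
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[3, 1], [-1, 1]])      # Phi_A(X) = (X - 2)^2, single eigenvalue 2
D = 2 * sp.eye(2)                     # diagonalizable (here even scalar) part
N = A - D                             # nilpotent part
assert N**2 == sp.zeros(2, 2) and D*N == N*D

# e^{At} = e^{Dt} e^{Nt}; the second factor is the finite sum I + Nt since N^2 = 0
expAt = (sp.exp(2*t) * sp.eye(2)) * (sp.eye(2) + N*t)

# check: X(t) = e^{At} solves dX/dt = A X with X(0) = I
assert sp.simplify(expAt.diff(t) - A * expAt) == sp.zeros(2, 2)
assert expAt.subs(t, 0) == sp.eye(2)
```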

Finally, a remark on computing the Jordan form of A. For an eigenvalue λ of A and i ≥ 0, set

di(λ) = dimK Ker (A − λIn)^i.

If mi(λ) is the number of times that (X − λ)^i occurs as an elementary divisor of VA, then

mi(λ) = −di+1(λ) + 2di(λ) − di−1(λ)

(cf. the discussion at the end of the lecture for 3/23). Note, finally, that mi(λ) is the number of Jordan blocks of size i and eigenvalue λ in the Jordan decomposition of A.
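The counting formula is easy to verify numerically. A sketch in sympy (my own example, not from the lecture): we build a matrix whose Jordan blocks for λ = 2 have sizes 1, 2, 2, so that m1(2) = 1 and m2(2) = 2, and recover these counts from the kernel dimensions di(2).

```python
# Sketch (my example, assuming sympy): checking the block counting formula
# m_i(lam) = -d_{i+1}(lam) + 2 d_i(lam) - d_{i-1}(lam).
import sympy as sp

# Jordan blocks for lam = 2 of sizes 1, 2, 2: so m_1(2) = 1 and m_2(2) = 2.
A = sp.diag(sp.Matrix([[2]]),
            sp.Matrix([[2, 1], [0, 2]]),
            sp.Matrix([[2, 1], [0, 2]]))
n, lam = 5, 2

def d(i):
    # d_i(lam) = dim_K Ker (A - lam I_n)^i
    return len(((A - lam * sp.eye(n)) ** i).nullspace())

def m(i):
    return -d(i + 1) + 2 * d(i) - d(i - 1)

assert [d(0), d(1), d(2), d(3)] == [0, 3, 5, 5]
assert (m(1), m(2), m(3)) == (1, 2, 0)
```

Note that d(1) = 3 is the total number of blocks for λ = 2 (the dimension of the ordinary eigenspace), while d(i) stabilizes at 5 = n once i reaches the largest block size.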