MAS 5312 – Lecture for April 6, 2020
Richard Crew
Department of Mathematics, University of Florida

The determination of the rational canonical form of a matrix $A$ used the invariant factors of the $K[X]$-module $V_A$. A number of questions arise here. First, is there an irrational canonical form? Furthermore, the elementary divisors of $A$ are all, like, "so what are we, chopped liver?" To these, uh, questions we now turn.

But first let's review some basic facts about eigenvectors. Recall that a nonzero $v \in K^n$ is an eigenvector of $A$ if $Av = \lambda v$ for some $\lambda \in K$. If this is so, $v$ is annihilated by $A - \lambda I_n$, which must then be a singular matrix. Conversely, if $A - \lambda I_n$ is singular it annihilates a nonzero vector $v$, which is then an eigenvector of $A$ with eigenvalue $\lambda$. On the other hand $A - \lambda I_n$ is singular if and only if $\det(A - \lambda I_n) = \Phi_A(\lambda) = 0$, so the eigenvalues of $A$ are precisely the roots of the characteristic polynomial $\Phi_A(X)$. Of course this polynomial need not have any roots in $K$, but there is always an extension of $K$ (that is, a field $L$ containing $K$) such that $\Phi_A(X)$ factors into linear factors over $L$. We will see later how to construct such fields.

While we're at it, let's recall another useful fact about eigenvectors: if $v_1, \ldots, v_r$ are eigenvectors of $A$ belonging to distinct eigenvalues $\lambda_1, \ldots, \lambda_r$, then $v_1, \ldots, v_r$ are independent. Recall the proof: suppose there is a nontrivial dependence relation
$$a_1 v_1 + a_2 v_2 + \cdots + a_r v_r = 0.$$
We can assume that this relation is the shortest possible, and in particular all the $a_i$ are nonzero. Apply $A$; we find
$$\lambda_1 a_1 v_1 + \lambda_2 a_2 v_2 + \cdots + \lambda_r a_r v_r = 0.$$
On the other hand, multiplying the original relation by $\lambda_1$ yields
$$\lambda_1 a_1 v_1 + \lambda_1 a_2 v_2 + \cdots + \lambda_1 a_r v_r = 0.$$
Subtracting these gives
$$(\lambda_1 - \lambda_2) a_2 v_2 + \cdots + (\lambda_1 - \lambda_r) a_r v_r = 0,$$
which is a shorter dependence relation, nontrivial because $\lambda_1 - \lambda_i \ne 0$ for all $i > 1$ and all the $a_i$ are nonzero. Contradiction.

If we are lucky, the characteristic polynomial $\Phi_A(X)$ has all its roots in $K$ and all the roots have multiplicity one; then the $n$ eigenvectors $v_1, \ldots, v_n$ are linearly independent elements of $K^n$ and thus form a basis. If the corresponding eigenvalues are $\lambda_1, \ldots, \lambda_n$, the matrix of $A$ in the basis $v_1, \ldots, v_n$ is the diagonal matrix with diagonal entries $\lambda_1, \ldots, \lambda_n$. And this is good. Well, nice anyway...

Of course the roots of $\Phi_A(X)$ need not lie in $K$ and one will have to pass to a larger field to find them; this is why this diagonal form is an "irrational" canonical form. But usually replacing $K$ by an extension is not a problem. More serious is the possibility that $\Phi_A(X)$ has multiple roots. Of course a "randomly chosen" matrix will be such that $\Phi_A(X)$ has only simple roots. In mathematics we usually spend most of our time worrying about the cases that, strictly speaking, arise with probability zero.

From now on we fix an $n \times n$ matrix $A$ and assume that $\Phi_A(X)$ factors as
$$\Phi_A(X) = \pm(X - \lambda_1)^{n_1} (X - \lambda_2)^{n_2} \cdots (X - \lambda_r)^{n_r}$$
with distinct $\lambda_1, \ldots, \lambda_r \in K$, so that $n_i$ is the multiplicity of $\lambda_i$ as a root of $\Phi_A(X)$. Note in this case that $\sum_i n_i = n$; the sign in the above equation, by the way, is $(-1)^n$.

Recall now that $\Phi_A(X)$ is, up to sign, the product of the invariant factors of $A$, and thus of the elementary divisors of $A$. Then for any $i$ the elementary divisors that are powers of $X - \lambda_i$ can be written
$$(X - \lambda_i)^{m_{i1}},\ (X - \lambda_i)^{m_{i2}},\ \ldots,\ (X - \lambda_i)^{m_{is_i}}$$
and then $\sum_j m_{ij} = n_i$. With this notation the $K[X]$-module $V_A$ is the direct sum
$$V_A \xrightarrow{\ \sim\ } \bigoplus_{ij} K[X]/(X - \lambda_i)^{m_{ij}}$$
where $1 \le i \le r$ and, for each $i$, $1 \le j \le s_i$.
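To make this concrete, here is a small worked example (my own illustration, not from the lecture). Suppose $n = 3$ and
$$\Phi_A(X) = -(X - 2)^2(X - 3),$$
so $r = 2$, $\lambda_1 = 2$, $\lambda_2 = 3$, $n_1 = 2$, $n_2 = 1$, and the sign is $(-1)^3 = -1$. The constraint $\sum_j m_{1j} = n_1 = 2$ leaves two possibilities for the elementary divisors at $\lambda_1 = 2$: either the single divisor $(X - 2)^2$, giving
$$V_A \simeq K[X]/(X - 2)^2 \oplus K[X]/(X - 3),$$
or the pair $(X - 2), (X - 2)$, giving
$$V_A \simeq K[X]/(X - 2) \oplus K[X]/(X - 2) \oplus K[X]/(X - 3).$$
The characteristic polynomial alone does not distinguish these two cases; the elementary divisors do.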
If we choose a basis of each $K[X]/(X - \lambda_i)^{m_{ij}}$ and conglomerate these into a basis of $V_A$, the matrix of $A$ will be block diagonal, with the diagonal entries being the matrix representation of "multiplication by $X$" on the summand $K[X]/(X - \lambda_i)^{m_{ij}}$. We could choose the same kind of basis we did before, and the resulting matrix would be the companion matrix of the elementary divisor $(X - \lambda_i)^{m_{ij}}$. There is, however, a better way.

To simplify notation we just look at the $K[X]$-module $K[X]/(X - \lambda)^m$ for some $\lambda \in K$ and integer $m > 0$. As before we denote by $x^i$ the image of $X^i$ in $K[X]/(X - \lambda)^m$. The companion matrix of $(X - \lambda)^m$ arises from choosing the basis $1, x, x^2, \ldots, x^{m-1}$, but this is not the simplest possible matrix. Instead, we choose the basis
$$1,\ (x - \lambda),\ (x - \lambda)^2,\ \ldots,\ (x - \lambda)^{m-1}.$$
That this is indeed a basis can be seen as follows: by the binomial theorem, $(x - \lambda)^i = x^i + (\text{a linear combination of the } x^k,\ k < i)$. From this we see that the matrix expressing the $(x - \lambda)^i$ in terms of the $x^i$ is upper triangular with 1s on the diagonal. In particular it is invertible, so the $x^i$ are all linear combinations of the $(x - \lambda)^i$. In particular the $(x - \lambda)^i$ must span $K[X]/(X - \lambda)^m$, and are thus a basis since there are $m$ of them.

We can now compute the matrix of "multiplication by $X$" (or by $x$; it is the same). We find
$$x(x - \lambda)^i = \begin{cases} (x - \lambda)^{i+1} + \lambda(x - \lambda)^i & 0 \le i < m - 1, \\ \lambda(x - \lambda)^{m-1} & i = m - 1 \end{cases}$$
since $(x - \lambda)^m = 0$ in $K[X]/(X - \lambda)^m$. This can be written
$$x \begin{pmatrix} 1 \\ (x - \lambda) \\ (x - \lambda)^2 \\ \vdots \\ (x - \lambda)^{m-1} \end{pmatrix} = \begin{pmatrix} \lambda & 1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & 0 & \cdots & 0 \\ 0 & 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & 0 & \cdots & \lambda \end{pmatrix} \begin{pmatrix} 1 \\ (x - \lambda) \\ (x - \lambda)^2 \\ \vdots \\ (x - \lambda)^{m-1} \end{pmatrix}.$$
The matrix on the right hand side of the last equation is called a Jordan block of size $m$ with eigenvalue $\lambda$. In fact it follows from (easy) direct calculation that the characteristic polynomial of this Jordan block is $(\lambda - X)^m$, so $\lambda$ is the unique eigenvalue; note that the (unique up to multiples) eigenvector is $(x - \lambda)^{m-1}$, by the calculations we did above. That this eigenvector is unique up to multiples can also be seen by explicit calculation, or else from the observation that the kernel of multiplication by $X - \lambda$ on $K[X]/(X - \lambda)^m$ is spanned by the class of $(X - \lambda)^{m-1}$.

We have thus proven the following theorem:

Theorem. Suppose $A$ is an $n \times n$ matrix with coefficients in $K$, and that the distinct roots $\lambda_1, \ldots, \lambda_r$ of $\Phi_A(X)$ all lie in $K$. If the elementary divisors of $A$ are the $(X - \lambda_i)^{m_{ij}}$ for $1 \le i \le r$ and $1 \le j \le s_i$, then $A$ is similar to the block diagonal matrix whose diagonal entries are the Jordan blocks of size $m_{ij}$ and eigenvalue $\lambda_i$. This block diagonal matrix is unique up to permutation of the diagonal entries.

Perhaps the only statement in the theorem that needs explanation is the very last one. In fact the number and sizes of the Jordan blocks are determined by the elementary divisors, and thus by the similarity class of $A$, by the theorem in our last lecture. We will write $J_m(\lambda)$ to denote a Jordan block of size $m$ with eigenvalue $\lambda$, i.e. an $m \times m$ matrix with $\lambda$ on the diagonal, 1s immediately above the diagonal, and zeros elsewhere.
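Continuing the illustrative example from above (again mine, not from the lecture): with $\Phi_A(X) = -(X - 2)^2(X - 3)$, the two possible lists of elementary divisors give the two possible Jordan forms
$$\begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \quad (\text{blocks } J_2(2), J_1(3)) \qquad \text{and} \qquad \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \quad (\text{blocks } J_1(2), J_1(2), J_1(3)),$$
corresponding to the elementary divisors $(X - 2)^2, (X - 3)$ and $(X - 2), (X - 2), (X - 3)$ respectively. By the uniqueness statement in the theorem, these two matrices are not similar, even though they have the same characteristic polynomial.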
Let's go back a bit. Recall that for any PID $R$, irreducible $p \in R$, and $R$-module $M$, the $p$-primary component $M_p$ of $M$ is the set of elements annihilated by some power of $p$. In our case if $\lambda \in K$ is an eigenvalue of $A$ on $V = K^n$, then $X - \lambda$ is irreducible, and if we write $V$ for the $K[X]$-module $V_A$ then $V_{(X - \lambda)}$ is the set of $v \in V$ annihilated by some power of $X - \lambda$, or equivalently by some power of $A - \lambda$.

In linear algebra $V_{(X - \lambda)}$ is called the generalized $\lambda$-eigenspace for the matrix $A$:
$$V_{(X - \lambda)} = \{ v \in V \mid (A - \lambda)^k v = 0 \text{ for some } k \in \mathbb{N} \}.$$
It will be more convenient to write $V(\lambda)$ for $V_{(X - \lambda)}$. With this notation:

Theorem. Suppose $A$ is an $n \times n$ matrix with entries in $K$ and set $V = K^n$. If the characteristic polynomial of $A$ factors into linear factors in $K$, then
$$V = \bigoplus_{\lambda} V(\lambda)$$
where the sum is over the eigenvalues of $A$.

Proof. In fact we know that as a $K[X]$-module, $V$ is the direct sum of modules of the form $K[X]/(X - \lambda)^k$ for some eigenvalue $\lambda$ of $A$ and $k \in \mathbb{N}$. The direct sum in the theorem arises by grouping together the terms with the same eigenvalue.

We will now reprove some old results:

Corollary. If $A \in M_n(K)$ has $n$ distinct eigenvalues in $K$, then $A$ is diagonalizable, i.e. similar to a diagonal matrix.

Proof. We've already proven this, but let's reprove it using our new technology. If the roots of $\Phi_A$ are distinct, each $V(\lambda)$ has the form $K[X]/(X - \lambda)$, and is in particular a one-dimensional $K$-vector space. Choosing a nonzero vector in each $V(\lambda)$, necessarily an eigenvector, then gives a basis of $V$ consisting of eigenvectors of $A$.
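A quick illustration of the corollary (my own example, not from the lecture): take $K = \mathbb{Q}$ and
$$A = \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix}, \qquad \Phi_A(X) = \det(A - X I_2) = X^2 - 3X + 2 = (X - 1)(X - 2).$$
The eigenvalues $1$ and $2$ are distinct, so the corollary applies: $V(1)$ is spanned by $(1, 1)$ and $V(2)$ by $(1, 2)$ (here each generalized eigenspace is an honest eigenspace), and in the basis $(1, 1), (1, 2)$ the matrix of $A$ is the diagonal matrix with entries $1, 2$.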