
Jim Lambers
MAT 605
Fall Semester 2015-16
Lecture 14 and 15 Notes

These notes correspond to Sections 4.4 and 4.5 in the text.

Eigenvalues and Eigenvectors

In order to compute the exponential $e^{At}$ for a given matrix $A$, it is helpful to know the eigenvalues and eigenvectors of $A$.

Definitions and Properties

Let $A$ be an $n \times n$ matrix. A nonzero vector $\mathbf{x}$ is called an eigenvector of $A$ if there exists a scalar $\lambda$ such that
$$A\mathbf{x} = \lambda \mathbf{x}.$$
The scalar $\lambda$ is called an eigenvalue of $A$, and we say that $\mathbf{x}$ is an eigenvector of $A$ corresponding to $\lambda$. We see that an eigenvector of $A$ is a vector for which matrix-vector multiplication with $A$ is equivalent to scalar multiplication by $\lambda$.

Because $\mathbf{x}$ is nonzero, it follows that if $\mathbf{x}$ is an eigenvector of $A$, then the matrix $A - \lambda I$ is singular, where $\lambda$ is the corresponding eigenvalue. Therefore, $\lambda$ satisfies the equation
$$\det(A - \lambda I) = 0.$$
The expression $\det(A - \lambda I)$ is a polynomial of degree $n$ in $\lambda$, and therefore is called the characteristic polynomial of $A$ (eigenvalues are sometimes called characteristic values). It follows from the fact that the eigenvalues of $A$ are the roots of the characteristic polynomial that $A$ has $n$ eigenvalues, which can repeat, and can also be complex, even if $A$ is real. However, if $A$ is real, any complex eigenvalues must occur in complex-conjugate pairs.

The set of eigenvalues of $A$ is called the spectrum of $A$, and denoted by $\lambda(A)$. This terminology explains why the magnitude of the largest eigenvalue is called the spectral radius of $A$. The trace of $A$, denoted by $\operatorname{tr}(A)$, is the sum of the diagonal elements of $A$. It is also equal to the sum of the eigenvalues of $A$. Furthermore, $\det(A)$ is equal to the product of the eigenvalues of $A$.

Example 1 A $2 \times 2$ matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
has trace $\operatorname{tr}(A) = a + d$ and determinant $\det(A) = ad - bc$. Its characteristic polynomial is

$$\det(A - \lambda I) = \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = (a - \lambda)(d - \lambda) - bc = \lambda^2 - (a + d)\lambda + (ad - bc) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A).$$
From the quadratic formula, the eigenvalues are
$$\lambda_1 = \frac{a + d}{2} + \frac{\sqrt{(a - d)^2 + 4bc}}{2}, \qquad \lambda_2 = \frac{a + d}{2} - \frac{\sqrt{(a - d)^2 + 4bc}}{2}.$$
It can be verified directly that the sum of these eigenvalues is equal to $\operatorname{tr}(A)$, and that their product is equal to $\det(A)$. □
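For readers who want to experiment, here is a small numerical check of Example 1, written in Python with NumPy; the matrix entries are arbitrary illustrative values, not taken from the text. The computed eigenvalues should sum to $\operatorname{tr}(A)$ and multiply to $\det(A)$.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [4.0, 2.0]])

eigvals = np.linalg.eigvals(A)
print(eigvals.sum(), np.trace(A))        # both 5.0
print(eigvals.prod(), np.linalg.det(A))  # both 2.0
```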

Decompositions

A subspace $W$ of $\mathbb{R}^n$ is called an invariant subspace of $A$ if, for any vector $\mathbf{x} \in W$, $A\mathbf{x} \in W$. Suppose that $\dim(W) = k$, and let $X$ be an $n \times k$ matrix such that $\operatorname{range}(X) = W$. Then, because each column of $X$ is a vector in $W$, each column of $AX$ is also a vector in $W$, and therefore is a linear combination of the columns of $X$. It follows that $AX = XB$, where $B$ is a $k \times k$ matrix.

Now, suppose that $\mathbf{y}$ is an eigenvector of $B$, with eigenvalue $\lambda$. It follows from $B\mathbf{y} = \lambda\mathbf{y}$ that

$$XB\mathbf{y} = X(B\mathbf{y}) = X(\lambda\mathbf{y}) = \lambda X\mathbf{y},$$
but we also have
$$XB\mathbf{y} = (XB)\mathbf{y} = AX\mathbf{y}.$$
Therefore, we have
$$A(X\mathbf{y}) = \lambda(X\mathbf{y}),$$
which implies that $\lambda$ is also an eigenvalue of $A$, with corresponding eigenvector $X\mathbf{y}$. We conclude that $\lambda(B) \subseteq \lambda(A)$.

If $k = n$, then $X$ is an $n \times n$ invertible matrix, and it follows that $A$ and $B$ have the same eigenvalues. Furthermore, from $AX = XB$, we now have $B = X^{-1}AX$. We say that $A$ and $B$ are similar matrices, and that $B$ is a similarity transformation of $A$.

Even if $A$ is not a normal matrix, it may be diagonalizable, meaning that there exists an invertible matrix $P$ such that $P^{-1}AP = D$, where $D$ is a diagonal matrix. If this is the case, then, because $AP = PD$, the columns of $P$ are eigenvectors of $A$, and the rows of $P^{-1}$ are eigenvectors of $A^T$.

By definition, an eigenvalue of $A$ corresponds to at least one eigenvector. Because any nonzero scalar multiple of an eigenvector is also an eigenvector corresponding to the same eigenvalue, an eigenvalue actually corresponds to an eigenspace, which is the span of any set of eigenvectors corresponding to the same eigenvalue, and this eigenspace must have a dimension of at least one. Any invariant subspace of a diagonalizable matrix $A$ is a union of eigenspaces.

Now, suppose that $\lambda_1$ and $\lambda_2$ are distinct eigenvalues, with corresponding eigenvectors $\mathbf{x}_1$ and $\mathbf{x}_2$, respectively. Furthermore, suppose that $\mathbf{x}_1$ and $\mathbf{x}_2$ are linearly dependent. This means that they must be parallel; that is, there exists a nonzero constant $c$ such that $\mathbf{x}_2 = c\mathbf{x}_1$. However, this implies that $A\mathbf{x}_2 = \lambda_2\mathbf{x}_2$ and $A\mathbf{x}_2 = cA\mathbf{x}_1 = c\lambda_1\mathbf{x}_1 = \lambda_1\mathbf{x}_2$. However, because $\lambda_1 \neq \lambda_2$, this is a contradiction. Therefore, $\mathbf{x}_1$ and $\mathbf{x}_2$ must be linearly independent.

More generally, it can be shown, using an inductive argument, that a set of $k$ eigenvectors corresponding to $k$ distinct eigenvalues must be linearly independent. It follows that if $A$ has $n$ distinct eigenvalues, then it has a set of $n$ linearly independent eigenvectors. If $X$ is a matrix whose columns are these eigenvectors, then $AX = XD$, where $D$ is a diagonal matrix of the eigenvalues, and because the columns of $X$ are linearly independent, $X$ is invertible, and therefore $X^{-1}AX = D$, and $A$ is diagonalizable.

Now, suppose that the eigenvalues of $A$ are not distinct; that is, the characteristic polynomial has repeated roots. Then an eigenvalue with multiplicity $m$ does not necessarily correspond to $m$ linearly independent eigenvectors. The algebraic multiplicity of an eigenvalue $\lambda$ is the number of times that $\lambda$ occurs as a root of the characteristic polynomial. The geometric multiplicity of $\lambda$ is the dimension of the eigenspace corresponding to $\lambda$, which is equal to the maximal size of a set of linearly independent eigenvectors corresponding to $\lambda$. The geometric multiplicity of an eigenvalue $\lambda$ is always less than or equal to its algebraic multiplicity. When it is strictly less, we say that the eigenvalue is defective. When both multiplicities are equal to one, we say that the eigenvalue is simple.

The Jordan canonical form of an $n \times n$ matrix $A$ is a decomposition that yields information about the eigenspaces of $A$. It has the form
$$A = XJX^{-1},$$
where $J$ has the block diagonal structure
$$J = \begin{bmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & J_p \end{bmatrix}.$$

Each diagonal block $J_i$ is a Jordan block that has the form
$$J_i = \begin{bmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{bmatrix}, \quad i = 1, 2, \ldots, p.$$
The number of Jordan blocks, $p$, is equal to the number of linearly independent eigenvectors of $A$. The diagonal element of $J_i$, $\lambda_i$, is an eigenvalue of $A$. The number of Jordan blocks associated with $\lambda_i$ is equal to the geometric multiplicity of $\lambda_i$. The sum of the sizes of these blocks is equal to the algebraic multiplicity of $\lambda_i$. If $A$ is diagonalizable, then each Jordan block is $1 \times 1$.

Example 2 Consider a matrix with Jordan canonical form
$$J = \begin{bmatrix} 2 & 1 & 0 & & & \\ 0 & 2 & 1 & & & \\ 0 & 0 & 2 & & & \\ & & & 3 & 1 & \\ & & & 0 & 3 & \\ & & & & & 2 \end{bmatrix}.$$
The eigenvalues of this matrix are 2, with algebraic multiplicity 4, and 3, with algebraic multiplicity 2. The geometric multiplicity of the eigenvalue 2 is 2, because it is associated with two Jordan blocks. The geometric multiplicity of the eigenvalue 3 is 1, because it is associated with only one Jordan block. Therefore, there are a total of three linearly independent eigenvectors, and the matrix is not diagonalizable. □
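The multiplicities in Example 2 can be verified numerically. The following Python/NumPy sketch builds the $6 \times 6$ matrix $J$ from the example and computes each geometric multiplicity as $6 - \operatorname{rank}(J - \lambda I)$, the dimension of the corresponding eigenspace.

```python
import numpy as np

J = np.zeros((6, 6))
J[0:3, 0:3] = [[2, 1, 0],
               [0, 2, 1],
               [0, 0, 2]]    # 3x3 block for lambda = 2
J[3:5, 3:5] = [[3, 1],
               [0, 3]]       # 2x2 block for lambda = 3
J[5, 5] = 2                  # 1x1 block for lambda = 2

for lam in (2, 3):
    geo = 6 - np.linalg.matrix_rank(J - lam * np.eye(6))
    print(lam, geo)          # prints "2 2" and "3 1"
```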

Suppose that an $n \times n$ matrix $A$ is not diagonalizable. Since its eigenvectors do not form a basis for all of $\mathbb{C}^n$, but only span a proper subspace of it, how can the "rest" of the space be described? In other words, if $A = XJX^{-1}$, where $J$ is the Jordan canonical form of $A$, then only some of the columns of $X$ (specifically, one column per Jordan block) are eigenvectors, so what are the other columns?

Let $J_i$ be a $k \times k$ Jordan block, with eigenvalue $\lambda$. Then, from $AX = XJ$, we have $AX_i = X_iJ_i$, where $X_i = \begin{bmatrix} \mathbf{x}_1 & \cdots & \mathbf{x}_k \end{bmatrix}$. From the structure of $J_i$, we have the following:

$$\begin{aligned} A\mathbf{x}_1 &= \lambda\mathbf{x}_1, \\ A\mathbf{x}_2 &= \lambda\mathbf{x}_2 + \mathbf{x}_1, \\ &\ \ \vdots \\ A\mathbf{x}_k &= \lambda\mathbf{x}_k + \mathbf{x}_{k-1}. \end{aligned}$$

The vector $\mathbf{x}_1$ is an eigenvector, while $\mathbf{x}_2, \ldots, \mathbf{x}_k$ are generalized eigenvectors that satisfy

$$(A - \lambda I)\mathbf{x}_j = \mathbf{x}_{j-1}, \quad j = 2, \ldots, k.$$

The vectors $\mathbf{x}_1, \ldots, \mathbf{x}_k$ form a chain of generalized eigenvectors. The chain begins with an eigenvector, and ends when the system of linear equations $(A - \lambda I)\mathbf{x}_{k+1} = \mathbf{x}_k$ cannot be solved because it is inconsistent.
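The following sketch (in Python, using SymPy for exact arithmetic; the $3 \times 3$ single-Jordan-block matrix is chosen purely for illustration) verifies a chain of generalized eigenvectors and shows the chain terminating in an inconsistent system.

```python
import sympy as sp

lam = 2
A = sp.Matrix([[2, 1, 0],
               [0, 2, 1],
               [0, 0, 2]])
N = A - lam * sp.eye(3)      # A - lambda*I

x1 = sp.Matrix([1, 0, 0])    # eigenvector: N*x1 = 0
x2 = sp.Matrix([0, 1, 0])    # generalized: N*x2 = x1
x3 = sp.Matrix([0, 0, 1])    # generalized: N*x3 = x2
assert N * x1 == sp.zeros(3, 1) and N * x2 == x1 and N * x3 == x2

# The chain ends here: N*x4 = x3 is inconsistent, because x3 is not in
# the column space of N.
try:
    N.gauss_jordan_solve(x3)
except ValueError:
    print("chain terminates: (A - 2I) x4 = x3 has no solution")
```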

Matrix Functions

One of the most useful applications of eigenvalues and eigenvectors is the computation of matrix functions, such as the matrix exponential $e^{At}$. Given a function $f(x)$ and an $n \times n$ matrix $A$ such that $f(x)$ has a power series
$$f(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k$$
whose interval of convergence contains all of the eigenvalues of $A$, the matrix function $f(A)$ is defined by
$$f(A) = \sum_{k=0}^{\infty} c_k (A - x_0 I)^k.$$
If $A$ has the Jordan decomposition $A = XJX^{-1}$, then it can be shown using the above definition of $f(A)$ that
$$f(A) = Xf(J)X^{-1}.$$
This is useful because $f(J)$ is generally far easier to compute than $f(A)$. In fact,
$$f\left( \begin{bmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & J_p \end{bmatrix} \right) = \begin{bmatrix} f(J_1) & 0 & \cdots & 0 \\ 0 & f(J_2) & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & f(J_p) \end{bmatrix}.$$
In particular, if $A$ is diagonalizable, then $A = XDX^{-1}$ and we have
$$f(A) = Xf(D)X^{-1} = X \begin{bmatrix} f(\lambda_1) & & & \\ & f(\lambda_2) & & \\ & & \ddots & \\ & & & f(\lambda_n) \end{bmatrix} X^{-1}.$$
More generally, if
$$J_i = \begin{bmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{bmatrix},$$
as before, then
$$f(J_i) = \begin{bmatrix} f(\lambda_i) & f'(\lambda_i) & \frac{1}{2}f''(\lambda_i) & \cdots & \frac{1}{(n-1)!}f^{(n-1)}(\lambda_i) \\ & f(\lambda_i) & \ddots & \ddots & \vdots \\ & & \ddots & \ddots & \frac{1}{2}f''(\lambda_i) \\ & & & f(\lambda_i) & f'(\lambda_i) \\ & & & & f(\lambda_i) \end{bmatrix}.$$
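As an illustration of the identity $f(A) = Xf(D)X^{-1}$ in the diagonalizable case, the following Python sketch computes $e^{At}$ from an eigendecomposition and compares it with SciPy's built-in matrix exponential; the matrix is the one that reappears in Example 4 below, and the value of $t$ is arbitrary.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [6.0, -1.0]])
t = 0.5                                   # arbitrary

d, X = np.linalg.eig(A)                   # A = X diag(d) X^{-1}
fA = X @ np.diag(np.exp(d * t)) @ np.linalg.inv(X)

print(np.allclose(fA, expm(A * t)))       # True
```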

Example 3 If
$$J = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},$$
then
$$e^{Jt} = \begin{bmatrix} e^{2t} & te^{2t} \\ 0 & e^{2t} \end{bmatrix}.$$
Note that to obtain the $(1, 2)$ entry of $e^{Jt}$, the function $f(\lambda) = e^{\lambda t}$ is differentiated with respect to $\lambda$, not $t$, meaning that $f'(\lambda) = te^{\lambda t}$. □
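A quick numerical check of Example 3 in Python: the closed form for $e^{Jt}$ should agree with a direct computation of the matrix exponential (the value of $t$ is arbitrary).

```python
import numpy as np
from scipy.linalg import expm

J = np.array([[2.0, 1.0],
              [0.0, 2.0]])
t = 0.7                       # arbitrary

closed_form = np.exp(2 * t) * np.array([[1.0, t],
                                        [0.0, 1.0]])
print(np.allclose(expm(J * t), closed_form))   # True
```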

The Computation of the Integral Curves

Now that we know how to compute a function of a matrix $A$ in terms of its eigenvalues, eigenvectors, and generalized eigenvectors, we can compute explicit solutions of constant-coefficient linear systems.

Real Eigenvalues

We first consider the simplest case, in which $A$ is an $n \times n$ matrix that is diagonalizable, with real eigenvalues $\lambda_i$, $i = 1, 2, \ldots, n$, and corresponding linearly independent eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$. Each eigenvector $\mathbf{v}_i$ is a solution of the system $(A - \lambda_i I)\mathbf{x} = \mathbf{0}$. It follows that

$$A = VDV^{-1},$$
where
$$V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \end{bmatrix}, \qquad D = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix},$$
and therefore $e^{At} = Ve^{Dt}V^{-1}$. Therefore, the solution of the IVP

$$\mathbf{x}' = A\mathbf{x}, \quad \mathbf{x}(0) = \mathbf{c}$$
is
$$\boldsymbol{\alpha}(t) = \phi_t(\mathbf{c}) = \sum_{i=1}^n k_i \mathbf{v}_i e^{\lambda_i t},$$
where the constants $k_1, \ldots, k_n$ are the components of the vector $\mathbf{k}$ such that $V\mathbf{k} = \mathbf{c}$. That is, $k_1, \ldots, k_n$ are the coefficients of $\mathbf{c}$ in the basis $\mathbf{v}_1, \ldots, \mathbf{v}_n$.
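A minimal Python/NumPy sketch of this formula follows; the function name integral_curve is just a label for this illustration, and the code assumes $A$ is diagonalizable with real eigenvalues so that NumPy returns a real eigendecomposition.

```python
import numpy as np

def integral_curve(A, c, t):
    """Evaluate phi_t(c) for x' = Ax, x(0) = c, assuming A is
    diagonalizable with real eigenvalues."""
    lam, V = np.linalg.eig(A)          # columns of V are eigenvectors
    k = np.linalg.solve(V, c)          # coefficients of c in the eigenbasis
    return (V * np.exp(lam * t)) @ k   # sum_i k_i e^{lam_i t} v_i

# Example usage with the matrix from Example 4 below:
A = np.array([[0.0, 1.0],
              [6.0, -1.0]])
print(integral_curve(A, np.array([1.0, 0.0]), 0.0))   # recovers x(0)
```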

Example 4 We will compute integral curves for the system $\mathbf{x}' = A\mathbf{x}$, where

$$A = \begin{bmatrix} 0 & 1 \\ 6 & -1 \end{bmatrix}.$$

First, we find the eigenvalues, which are the roots of the characteristic polynomial

$$\det(A - \lambda I) = \det \begin{bmatrix} -\lambda & 1 \\ 6 & -1 - \lambda \end{bmatrix} = \lambda^2 + \lambda - 6 = (\lambda + 3)(\lambda - 2),$$

which are $\lambda_1 = 2$ and $\lambda_2 = -3$. Next, we need the corresponding eigenvectors. For $\lambda_1 = 2$, we solve the system $(A - 2I)\mathbf{v}_1 = \mathbf{0}$, which is

$$\begin{aligned} -2v_{11} + v_{21} &= 0, \\ 6v_{11} - 3v_{21} &= 0. \end{aligned}$$

Because $A - 2I$ is a singular matrix, these equations are not independent. Therefore, we can obtain a solution by setting $v_{11} = 1$ in the first equation, which yields $v_{21} = 2$. That is,

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
is an eigenvector. For $\lambda_2 = -3$, we solve the system $(A + 3I)\mathbf{v}_2 = \mathbf{0}$, which is

$$\begin{aligned} 3v_{12} + v_{22} &= 0, \\ 6v_{12} + 2v_{22} &= 0. \end{aligned}$$

We set $v_{12} = 1$ in the first equation, which yields $v_{22} = -3$. That is,

$$\mathbf{v}_2 = \begin{bmatrix} 1 \\ -3 \end{bmatrix}$$
is also an eigenvector. Because the eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are not scalar multiples of one another, they are linearly independent, so they form a basis for $\mathbb{R}^2$. It follows that $A$ is diagonalizable, and has the decomposition $A = VDV^{-1}$, where
$$V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 2 & -3 \end{bmatrix}, \qquad D = \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix}.$$
Therefore an integral curve $\boldsymbol{\alpha}(t)$, with initial value $\mathbf{c}$, is given by

$$\begin{aligned} \boldsymbol{\alpha}(t) = e^{At}\mathbf{c} = Ve^{Dt}V^{-1}\mathbf{c} &= Ve^{Dt}\mathbf{k}, \quad V\mathbf{k} = \mathbf{c}, \\ &= \begin{bmatrix} 1 & 1 \\ 2 & -3 \end{bmatrix} \begin{bmatrix} e^{2t} & 0 \\ 0 & e^{-3t} \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} \\ &= \begin{bmatrix} k_1 e^{2t} + k_2 e^{-3t} \\ 2k_1 e^{2t} - 3k_2 e^{-3t} \end{bmatrix}. \end{aligned}$$
□
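The result of Example 4 can be checked numerically in Python: for any initial value $\mathbf{c}$ (an arbitrary one is chosen below), the formula should agree with $e^{At}\mathbf{c}$ computed directly.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [6.0, -1.0]])
V = np.array([[1.0, 1.0],
              [2.0, -3.0]])             # columns are v1, v2
c = np.array([1.0, 1.0])                # arbitrary initial value
k1, k2 = np.linalg.solve(V, c)          # V k = c

for t in (0.0, 0.5, 1.0):
    alpha = np.array([k1 * np.exp(2 * t) + k2 * np.exp(-3 * t),
                      2 * k1 * np.exp(2 * t) - 3 * k2 * np.exp(-3 * t)])
    print(np.allclose(alpha, expm(A * t) @ c))   # True
```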

In the case $n = 2$, the eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$ serve as asymptotes in the $xy$-plane, as well as straight-line solutions. As $t$ increases, these solutions are traversed toward the origin if the corresponding eigenvalue is negative, and away from the origin if it is positive. Other solution curves approach $\mathbf{v}_1$ as $t \to \infty$ if $\lambda_1 > \lambda_2$, and approach $\mathbf{v}_2$ otherwise.

Complex Eigenvalues

Next, we consider the case of complex eigenvalues of a real $n \times n$ matrix $A$, which must occur in complex-conjugate pairs. For simplicity, we consider the case $n = 2$. Let $A$ have complex eigenvalues $\lambda_1 = a + bi$, $\lambda_2 = a - bi$, with corresponding eigenvectors $\mathbf{v}_1 = \mathbf{u} + i\mathbf{w}$ and $\mathbf{v}_2 = \mathbf{u} - i\mathbf{w}$ (because $A$ is real and the eigenvalues are complex conjugates, so are the eigenvectors). Proceeding as in the case of real eigenvalues, we have the general solution

$$\begin{aligned} \boldsymbol{\alpha}(t) &= c_1\mathbf{v}_1 e^{\lambda_1 t} + c_2\mathbf{v}_2 e^{\lambda_2 t} \\ &= c_1(\mathbf{u} + i\mathbf{w})e^{at}(\cos bt + i\sin bt) + c_2(\mathbf{u} - i\mathbf{w})e^{at}(\cos bt - i\sin bt) \\ &= e^{at}\{c_1[(\mathbf{u}\cos bt - \mathbf{w}\sin bt) + i(\mathbf{w}\cos bt + \mathbf{u}\sin bt)] + c_2[(\mathbf{u}\cos bt - \mathbf{w}\sin bt) - i(\mathbf{w}\cos bt + \mathbf{u}\sin bt)]\} \\ &= e^{at}\{(c_1 + c_2)(\mathbf{u}\cos bt - \mathbf{w}\sin bt) + i(c_1 - c_2)(\mathbf{w}\cos bt + \mathbf{u}\sin bt)\}. \end{aligned}$$

Because the initial data $\boldsymbol{\alpha}(0)$ is real, the constants $c_1$ and $c_2$ are also complex conjugates. If we let $c_1 = c + di$, which implies $c_2 = c - di$, then the general solution becomes

$$\boldsymbol{\alpha}(t) = 2e^{at}[(c\mathbf{u} - d\mathbf{w})\cos bt - (c\mathbf{w} + d\mathbf{u})\sin bt].$$

As expected, the solution is real. However, it is still expressed in terms of the coefficients $c_1, c_2$ of the initial data $\boldsymbol{\alpha}(0)$ in the complex basis $\mathbf{v}_1, \mathbf{v}_2$, rather than the real basis $\mathbf{u}, \mathbf{w}$. If we equate

$$\boldsymbol{\alpha}(0) = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 = k\mathbf{u} + h\mathbf{w},$$
then we have
$$c_1(\mathbf{u} + i\mathbf{w}) + c_2(\mathbf{u} - i\mathbf{w}) = k\mathbf{u} + h\mathbf{w},$$
which yields
$$(c_1 + c_2)\mathbf{u} + i(c_1 - c_2)\mathbf{w} = k\mathbf{u} + h\mathbf{w},$$
or
$$k = c_1 + c_2 = 2c, \qquad h = i(c_1 - c_2) = -2d.$$
We conclude that
$$\boldsymbol{\alpha}(t) = e^{at}[(k\mathbf{u} + h\mathbf{w})\cos bt + (h\mathbf{u} - k\mathbf{w})\sin bt].$$

What remains is to obtain the real vectors $\mathbf{u}$ and $\mathbf{w}$ that make up the complex eigenvectors $\mathbf{v}_1 = \mathbf{u} + i\mathbf{w}$ and $\mathbf{v}_2 = \mathbf{u} - i\mathbf{w}$. This can be accomplished by explicitly computing $\mathbf{v}_1$ and $\mathbf{v}_2$, but it would be preferable to avoid complex arithmetic. To that end, we begin with the relations

$$A(\mathbf{u} + i\mathbf{w}) = (a + bi)(\mathbf{u} + i\mathbf{w}), \qquad A(\mathbf{u} - i\mathbf{w}) = (a - bi)(\mathbf{u} - i\mathbf{w}).$$
By matching real and imaginary parts of both sides, we obtain

$$A\mathbf{u} = a\mathbf{u} - b\mathbf{w}, \qquad A\mathbf{w} = b\mathbf{u} + a\mathbf{w}.$$

Putting these together yields the following equations, which can be used to obtain $\mathbf{u}$ and $\mathbf{w}$ directly:

$$[(A - aI)^2 + b^2 I]\mathbf{u} = \mathbf{0}, \qquad \mathbf{w} = -\frac{1}{b}(A - aI)\mathbf{u}.$$
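These two equations translate directly into code. In the following Python sketch, $\mathbf{u}$ is taken from the (numerical) null space of $(A - aI)^2 + b^2I$ via the SVD; the matrix is the one used in Example 5 below.

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [-3.0, 2.0]])
a, b = 2.0, 3.0                          # eigenvalues are a +- bi

M = (A - a * np.eye(2)) @ (A - a * np.eye(2)) + b**2 * np.eye(2)

# u spans the null space of M; take the right-singular vector for the
# smallest singular value.
_, s, Vt = np.linalg.svd(M)
u = Vt[-1]
w = -(A - a * np.eye(2)) @ u / b

# Check the defining relations Au = au - bw, Aw = bu + aw.
print(np.allclose(A @ u, a * u - b * w))   # True
print(np.allclose(A @ w, b * u + a * w))   # True
```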

The solution of the IVP
$$\mathbf{x}' = A\mathbf{x}, \quad \mathbf{x}(0) = \mathbf{c}$$
is given by
$$\phi_t(\mathbf{c}) = e^{at}[(k\mathbf{u} + h\mathbf{w})\cos bt + (h\mathbf{u} - k\mathbf{w})\sin bt],$$
where $k, h$ are the components of $\mathbf{c}$ in the basis $\{\mathbf{u}, \mathbf{w}\}$. This solution can be factored into

$$\phi_t(\mathbf{c}) = e^{at} \begin{bmatrix} \mathbf{u} & \mathbf{w} \end{bmatrix} \begin{bmatrix} \cos bt & \sin bt \\ -\sin bt & \cos bt \end{bmatrix} \begin{bmatrix} k \\ h \end{bmatrix}.$$

The matrix
$$R(bt) = \begin{bmatrix} \cos bt & \sin bt \\ -\sin bt & \cos bt \end{bmatrix}$$
performs a clockwise rotation through the angle $bt$ of any vector it multiplies. It follows from this description that the vectors $\mathbf{u}$ and $\mathbf{w}$ serve as axes of integral curves that are concentric ellipses centered at the origin, when $a = 0$. If $a > 0$, then the integral curves spiral away from the origin, and if $a < 0$, then they spiral toward the origin.

Example 5 We will solve the IVP

$$\mathbf{x}' = A\mathbf{x}, \quad \mathbf{x}(0) = \mathbf{c},$$
where
$$A = \begin{bmatrix} 2 & 3 \\ -3 & 2 \end{bmatrix}.$$
The eigenvalues are the roots of

$$\det(A - \lambda I) = \det \begin{bmatrix} 2 - \lambda & 3 \\ -3 & 2 - \lambda \end{bmatrix} = (\lambda - 2)^2 + 9,$$
which are $\lambda_1 = 2 + 3i$, $\lambda_2 = 2 - 3i$. Instead of computing the eigenvectors for these complex eigenvalues, we will compute the vectors $\mathbf{u}$ and $\mathbf{w}$ such that

$$[(A - 2I)^2 + 9I]\mathbf{u} = \mathbf{0}, \qquad \mathbf{w} = -\frac{1}{3}(A - 2I)\mathbf{u}.$$
In this case, $(A - 2I)^2 + 9I$ is equal to the zero matrix, so we can choose an arbitrary vector for $\mathbf{u}$. We make the simple choice $\mathbf{u} = (1, 0)$, and then apply the second equation above to obtain $\mathbf{w} = (0, 1)$. We conclude that the solution of the IVP is
$$\phi_t(\mathbf{c}) = e^{2t} \begin{bmatrix} \cos 3t & \sin 3t \\ -\sin 3t & \cos 3t \end{bmatrix} \begin{bmatrix} k \\ h \end{bmatrix},$$
where
$$\mathbf{c} = k\mathbf{u} + h\mathbf{w} = k \begin{bmatrix} 1 \\ 0 \end{bmatrix} + h \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} k \\ h \end{bmatrix}.$$
Because the eigenvalues have positive real part, the integral curve begins at $\mathbf{c}$ when $t = 0$ and spirals outward in a circular fashion (as opposed to elliptically). □
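A numerical check of Example 5 in Python: the rotation-matrix form of the solution should agree with $e^{At}\mathbf{c}$ for any $t$ (the initial value below is arbitrary, and here $(k, h) = \mathbf{c}$ because $\mathbf{u}$, $\mathbf{w}$ are the standard basis vectors).

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 3.0],
              [-3.0, 2.0]])
c = np.array([1.0, -2.0])     # arbitrary initial value; here (k, h) = c

for t in (0.3, 1.0):
    R = np.array([[np.cos(3 * t), np.sin(3 * t)],
                  [-np.sin(3 * t), np.cos(3 * t)]])
    phi = np.exp(2 * t) * (R @ c)
    print(np.allclose(phi, expm(A * t) @ c))   # True
```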

Real and Complex Eigenvector Bases

Suppose that the $n \times n$ real matrix $A$ is diagonalizable, with $q$ real eigenvalues $\lambda_1, \ldots, \lambda_q$ and corresponding eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_q$, and $p$ complex-conjugate pairs of eigenvalues $a_j \pm b_j i$ with corresponding eigenvectors $\mathbf{u}_j \pm i\mathbf{w}_j$, for $j = 1, \ldots, p$. Because $A$ is diagonalizable, $n = q + 2p$. Then the solution of the IVP $\mathbf{x}' = A\mathbf{x}$, $\mathbf{x}(0) = \mathbf{c}$ is
$$\begin{aligned} \phi_t(\mathbf{c}) &= \sum_{j=1}^q m_j e^{\lambda_j t}\mathbf{v}_j + \sum_{j=1}^p e^{a_j t} \begin{bmatrix} \mathbf{u}_j & \mathbf{w}_j \end{bmatrix} R(b_j t) \begin{bmatrix} k_j \\ h_j \end{bmatrix} \\ &= \sum_{j=1}^q m_j e^{\lambda_j t}\mathbf{v}_j + \sum_{j=1}^p e^{a_j t}[\cos(b_j t)(k_j\mathbf{u}_j + h_j\mathbf{w}_j) + \sin(b_j t)(h_j\mathbf{u}_j - k_j\mathbf{w}_j)], \end{aligned}$$
where
$$\mathbf{c} = \sum_{j=1}^q m_j\mathbf{v}_j + \sum_{j=1}^p (k_j\mathbf{u}_j + h_j\mathbf{w}_j).$$
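The following Python sketch checks this formula on a small illustrative example (not from the text): a block diagonal $3 \times 3$ matrix with one real eigenvalue, for which $\mathbf{v}_1 = \mathbf{e}_1$, and one complex pair, for which $\mathbf{u} = \mathbf{e}_2$ and $\mathbf{w} = \mathbf{e}_3$, so that the coefficients $m, k, h$ are simply the components of $\mathbf{c}$.

```python
import numpy as np
from scipy.linalg import expm

A = np.zeros((3, 3))
A[0, 0] = 1.0                 # real eigenvalue lambda_1 = 1, v1 = e1
A[1:, 1:] = [[2.0, 3.0],
             [-3.0, 2.0]]     # complex pair 2 +- 3i, u = e2, w = e3

v1, u, w = np.eye(3)          # rows of the identity: e1, e2, e3
c = np.array([1.0, 2.0, -1.0])
m, k, h = c                   # coefficients of c in the basis {v1, u, w}

for t in (0.4, 1.0):
    phi = (m * np.exp(1.0 * t) * v1
           + np.exp(2 * t) * (np.cos(3 * t) * (k * u + h * w)
                              + np.sin(3 * t) * (h * u - k * w)))
    print(np.allclose(phi, expm(A * t) @ c))   # True
```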

Repeated Eigenvalues

The preceding discussion only applies to the case where $A$ is diagonalizable; now, we construct a solution for a non-diagonalizable matrix.

Example 6 We will solve the IVP

$$\mathbf{x}' = A\mathbf{x}, \quad \mathbf{x}(0) = \mathbf{c},$$
where
$$A = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}.$$

Because $A$ is upper triangular, it is easily seen that it has a repeated eigenvalue $\lambda_1 = \lambda_2 = 2$, and because $A$ is equal to its own Jordan canonical form, consisting of one $2 \times 2$ Jordan block, it is not diagonalizable. Its only linearly independent eigenvector is $\mathbf{v}_1 = (1, 0)$, obtained by solving $(A - 2I)\mathbf{v}_1 = \mathbf{0}$. To complete a basis of $\mathbb{R}^2$, we obtain the generalized eigenvector $\mathbf{v}_2$ by solving the system

$$(A - 2I)\mathbf{v}_2 = \mathbf{v}_1,$$
which can be written as
$$v_{22} = 1, \qquad 0 = 0,$$
and therefore we can take $\mathbf{v}_2 = (0, 1)$. The form of the solution is

$$\phi_t(\mathbf{c}) = e^{At}\mathbf{c} = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 \end{bmatrix} \begin{bmatrix} e^{2t} & te^{2t} \\ 0 & e^{2t} \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} = e^{2t}[(d_1 + d_2 t)\mathbf{v}_1 + d_2\mathbf{v}_2],$$

where $V\mathbf{d} = \mathbf{c}$. In this particular case, the solution reduces to

$$\phi_t(\mathbf{c}) = e^{2t}(c_1 + c_2 t, c_2).$$

As $t \to \infty$, because of the factor of $t$ in the first component, the direction of the integral curve approaches that of $\mathbf{v}_1$. Had the repeated eigenvalue been negative, the solution would have rapidly approached the origin from $\mathbf{c}$, with the direction of approach tending toward that of $\mathbf{v}_1$. □
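A numerical check of Example 6 in Python: the formula $\phi_t(\mathbf{c}) = e^{2t}(c_1 + c_2 t, c_2)$ should match $e^{At}\mathbf{c}$, and normalizing the solution shows its direction tending toward $\mathbf{v}_1 = (1, 0)$ as $t$ grows.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
c = np.array([1.0, 1.0])

for t in (0.5, 2.0, 10.0):
    phi = np.exp(2 * t) * np.array([c[0] + c[1] * t, c[1]])
    print(np.allclose(phi, expm(A * t) @ c),   # True
          phi / np.linalg.norm(phi))           # direction tends to (1, 0)
```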

Exercises

Section 4.4: Exercises 1, 5ab Section 4.5: Exercise 8a
