Chapter 4 Eigenvalue Problems
In this chapter we look at a third class of important problems in Numerical Linear Algebra, which consists in finding the eigenvalues and eigenvectors of a given m × m matrix A, if and when they exist. As discussed in Chapter 1, numerical methods for finding eigenvalues and eigenvectors differ significantly from what one may do analytically (i.e. construction and solution of the characteristic polynomial). Instead, eigenvalue algorithms are always based on iterative methods. In what follows, we first illustrate how very simple iterative methods can actually work to find specific eigenvalues and eigenvectors of a matrix A. For simplicity, these methods assume the matrix A is real and symmetric, so the eigenvalues are real and the eigenvectors are orthogonal. Later, we relax these conditions to construct eigenvalue-revealing algorithms that can find all the eigenvalues, real or complex, of any matrix A. Before we proceed, however, let's see a few examples of applied mathematical problems where we want to find the eigenvalues of a matrix.

1. Eigenvalue problems in applied mathematics

The following are very basic examples that come up in simple ODE and PDE problems, which you may encounter in AMS 212A and AMS 214 for instance. Other, more complex examples come up all the time in fluid dynamics, control theory, etc.

1.1. A simple Dynamical Systems problem

Consider the set of m nonlinear autonomous ODEs for m variables written as

    ẋ_i = f_i(x)  for i = 1 ... m        (4.1)

where x = (x_1, x_2, ..., x_m)^T, and the functions f_i are any nonlinear functions of the components of x. Suppose a fixed point of this system is known, for which f_i(x*) = 0 for all i. Then, to study the stability of this fixed point, we consider a small displacement ε away from it such that

    f_i(x* + ε) = f_i(x*) + Σ_{j=1}^m (∂f_i/∂x_j)|_{x*} ε_j = Σ_{j=1}^m (∂f_i/∂x_j)|_{x*} ε_j        (4.2)

The ODE system becomes, near the fixed point,

    ε̇_i = Σ_{j=1}^m (∂f_i/∂x_j)|_{x*} ε_j  for i = 1 ... m        (4.3)
or, in other words,

    ε̇ = Jε        (4.4)

where J is the Jacobian matrix of the original system evaluated at the fixed point. This is now a simple linear system, and we can look for solutions of the kind ε_i ∝ e^{λt}, which implies solving for the value(s) of λ for which

    Jε = λε        (4.5)

If any of the eigenvalues λ has a positive real part, then the fixed point is unstable.

1.2. A simple PDE problem

Suppose you want to solve the diffusion equation problem

    ∂f/∂t = ∂/∂x ( D(x) ∂f/∂x )        (4.6)

This problem is slightly more complicated than usual because the diffusion coefficient is a function of x. The first step would consist in looking for separable solutions of the kind

    f(x, t) = A(x)B(t)        (4.7)

where it is easy to show that

    dB/dt = −λB        (4.8)

    d/dx ( D(x) dA/dx ) = −λA        (4.9)

where, on physical grounds, we can argue that λ ≥ 0. If the domain is periodic, say of period 2π, we can expand the solution A(x) and the diffusion coefficient D(x) in Fourier modes as

    A(x) = Σ_m a_m e^{imx}  and  D(x) = Σ_m d_m e^{imx}        (4.10)

where the {d_m} are known, but the {a_m} are not. The equation for A becomes

    d/dx [ Σ_n d_n e^{inx} Σ_m i m a_m e^{imx} ] = −λ Σ_k a_k e^{ikx}        (4.11)

and then

    Σ_{m,n} m(m+n) d_n a_m e^{i(m+n)x} = λ Σ_k a_k e^{ikx}        (4.12)

Projecting onto the mode e^{ikx} (i.e. keeping only the terms with m + n = k) we then get

    Σ_m m k d_{k−m} a_m ≡ Σ_m B_{km} a_m = λ a_k        (4.13)

or, in other words, we have another matrix eigenvalue problem Bv = λv, where the coefficients of the matrix B were given above, and the elements of v are the Fourier coefficients {a_m}. The solutions to that problem yield both the desired λs and the eigenmodes A(x), which can then be used to construct the solution to the PDE. Many other examples of eigenvalue problems exist. You are unlikely to get through your PhD without having to solve at least one!

1.3. Localizing Eigenvalues: Gershgorin's Theorem

For some purposes it suffices to know crude information about the eigenvalues, instead of determining their values exactly.
For example, we might merely wish to know rough estimates of their locations, such as bounding disks. The simplest such "bound" can be obtained as

    ρ(A) ≤ ||A||        (4.14)

This can easily be shown if we take λ to be the eigenvalue with |λ| = ρ(A), and if we let x be an associated eigenvector with ||x|| = 1 (recall we can always normalize eigenvectors!). Then

    ρ(A) = |λ| = ||λx|| = ||Ax|| ≤ ||A|| · ||x|| = ||A||        (4.15)

A more accurate way of locating eigenvalues is given by Gershgorin's Theorem, which is stated as follows:

Theorem (Gershgorin's Theorem): Let A = {a_ij} be an n × n matrix and let λ be an eigenvalue of A. Then λ belongs to one of the disks Z_k given by

    Z_k = { z ∈ ℝ or ℂ : |z − a_kk| ≤ r_k }        (4.16)

where

    r_k = Σ_{j=1, j≠k}^n |a_kj|,  k = 1, ..., n        (4.17)

Moreover, if m of the disks form a connected set S, disjoint from the remaining n − m disks, then S contains exactly m of the eigenvalues of A, counted according to their algebraic multiplicity.

Proof: Let Ax = λx, and let k be the index of a component of x of maximal modulus, so that |x_k| = max_i |x_i| = ||x||_∞. Then the k-th component of the eigenvalue equation satisfies

    λ x_k = Σ_{j=1}^n a_kj x_j        (4.18)

so that

    (λ − a_kk) x_k = Σ_{j=1, j≠k}^n a_kj x_j        (4.19)

Therefore

    |λ − a_kk| · |x_k| ≤ Σ_{j≠k} |a_kj| · |x_j| ≤ Σ_{j≠k} |a_kj| · ||x||_∞        (4.20)

Dividing through by |x_k| = ||x||_∞ gives |λ − a_kk| ≤ r_k. □

Example: Consider the matrix

    A = [  4   1   0
           1   0  −1
           1   1  −4 ]        (4.21)

Then the eigenvalues must be contained in the disks

    Z_1 : |λ − 4| ≤ 1 + 0 = 1        (4.22)
    Z_2 : |λ| ≤ 1 + 1 = 2        (4.23)
    Z_3 : |λ + 4| ≤ 1 + 1 = 2        (4.24)

Note that Z_1 is disjoint from Z_2 ∪ Z_3; therefore there exists a single eigenvalue in Z_1. Indeed, if we compute the true eigenvalues, we get

    λ(A) = {−3.76010, −0.442931, 4.20303}        (4.25)

□

2. Invariant Transformations

As before, we seek a simpler form whose eigenvalues and eigenvectors are more easily determined.
To do this we need to identify what types of transformations leave eigenvalues (or eigenvectors) unchanged or easily recoverable, and for what types of matrices the eigenvalues (or eigenvectors) are easily determined.

• Shift: A shift subtracts a constant scalar σ from each diagonal entry of a matrix, effectively shifting the origin:

    Ax = λx  ⟹  (A − σI)x = (λ − σ)x        (4.26)

Thus the eigenvalues of the matrix A − σI are translated, or shifted, from those of A by σ, but the eigenvectors are unchanged.

• Inversion: If A is nonsingular and Ax = λx with x ≠ 0, then

    A⁻¹x = (1/λ)x        (4.27)

Thus the eigenvalues of A⁻¹ are the reciprocals of the eigenvalues of A, and the eigenvectors are unchanged.

• Powers: Raising a matrix to a power raises its eigenvalues to the same power, but keeps the eigenvectors unchanged:

    Ax = λx  ⟹  A²x = λ²x  ⟹  ...  ⟹  A^k x = λ^k x        (4.28)

• Polynomials: More generally, if

    p(t) = c_0 + c_1 t + c_2 t² + ... + c_k t^k        (4.29)

is a polynomial of degree k, then we define

    p(A) = c_0 I + c_1 A + c_2 A² + ... + c_k A^k        (4.30)

Now if Ax = λx, then p(A)x = p(λ)x.

• Similarity: We have already seen this in Eq. 1.39 – Eq. 1.43.

3. Iterative ideas

See Chapter 27 of the textbook. In this section, as discussed above, all matrices are assumed to be real and symmetric.

3.1. The Power Iteration

We can very easily construct a simple algorithm to reveal the eigenvector corresponding to the largest eigenvalue of a matrix. To do so, we simply apply the matrix A over, and over, and over again to any initial seed vector x. By the properties of the eigenvalues and eigenvectors of real, symmetric matrices, we know that the eigenvectors {v_i}, for i = 1 ... m, form an orthogonal basis in which the vector x can be written as

    x = Σ_{i=1}^m α_i v_i        (4.31)

Then

    Ax = Σ_{i=1}^m λ_i α_i v_i  ⟹  A^n x = Σ_{i=1}^m λ_i^n α_i v_i        (4.32)

If we call the eigenvalue with the largest norm λ_1, then

    A^n x = λ_1^n Σ_{i=1}^m (λ_i/λ_1)^n α_i v_i        (4.33)

where, by construction, |λ_i/λ_1| < 1 for i > 1.
As n → ∞, all but the first term in that sum tend to zero, which implies that

    lim_{n→∞} A^n x = lim_{n→∞} λ_1^n α_1 v_1        (4.34)

which is aligned with the direction of the first eigenvector v_1. In general, we see that the iteration yields a sequence {x^(n+1)} = {A^(n+1) x} that converges in direction to the eigenvector v_1; with normalization,

    x_{n+1} ≡ x^(n+1) / ||x^(n+1)|| ≈ λ_1^(n+1) α_1 v_1 / ||λ_1^(n+1) α_1 v_1|| = ±v_1        (4.35)

To approximate the corresponding eigenvalue λ_1, we compute

    λ^(n+1) = x_{n+1}^T A x_{n+1} ≈ (±v_1)^T A (±v_1) = (±v_1)^T λ_1 (±v_1) = λ_1 ||v_1||² = λ_1
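The power iteration described above takes only a few lines of NumPy. The sketch below is a minimal illustration, not a production implementation; the matrix A is a hypothetical example, chosen to be real and symmetric as this section assumes, and the iteration count and seed are arbitrary:

```python
import numpy as np

def power_iteration(A, n_iter=500, seed=0):
    """Minimal power iteration sketch: repeatedly apply A to a random
    seed vector, normalizing at each step as in Eq. (4.35); the Rayleigh
    quotient x^T A x then estimates the dominant eigenvalue lambda_1."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])   # random seed vector (alpha_1 != 0 almost surely)
    for _ in range(n_iter):
        x = A @ x
        x /= np.linalg.norm(x)            # normalize so x -> +/- v_1
    lam = x @ A @ x                       # Rayleigh quotient estimate of lambda_1
    return lam, x

# Hypothetical example matrix: real, symmetric, with eigenvalues 3 - sqrt(3), 3, 3 + sqrt(3)
A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])
lam, v = power_iteration(A)
print(lam)   # ≈ 4.7320508 (= 3 + sqrt(3), the dominant eigenvalue)
```

The convergence rate is governed by |λ_2/λ_1|, as Eq. (4.33) suggests: each iteration shrinks the contribution of the subdominant eigenvectors by that factor, so a well-separated dominant eigenvalue converges quickly, while a near-degenerate one converges slowly.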