
Notes on Numerical Linear Algebra

Dr. George W Benthien December 9, 2006

E-mail: [email protected]

Contents

Preface

1 Mathematical Preliminaries
1.1 Matrices and Vectors
1.2 Vector Spaces
1.2.1 Linear Independence and Bases
1.2.2 Inner Product and Orthogonality
1.2.3 Matrices As Linear Transformations
1.3 Derivatives of Vector Functions
1.3.1 Newton's Method

2 Solution of Systems of Linear Equations
2.1 Gaussian Elimination
2.1.1 The Basic Procedure
2.1.2 Row Pivoting
2.1.3 Iterative Refinement
2.2 Cholesky Factorization
2.3 Elementary Unitary Matrices and the QR Factorization
2.3.1 Gram-Schmidt Orthogonalization
2.3.2 Householder Reflections
2.3.3 Complex Householder Matrices
2.3.4 Givens Rotations
2.3.5 Complex Givens Rotations
2.3.6 QR Factorization Using Householder Reflectors
2.3.7 Uniqueness of the Reduced QR Factorization
2.3.8 Solution of Least Squares Problems
2.4 The Singular Value Decomposition
2.4.1 Derivation and Properties of the SVD
2.4.2 The SVD and Least Squares Problems
2.4.3 Singular Values and the Norm of a Matrix
2.4.4 Low Rank Matrix Approximations
2.4.5 The Condition Number of a Matrix
2.4.6 Computation of the SVD

3 Eigenvalue Problems
3.1 Reduction to Tridiagonal Form
3.2 The Power Method
3.3 The Rayleigh Quotient
3.4 Inverse Iteration with Shifts
3.5 Rayleigh Quotient Iteration
3.6 The Basic QR Method
3.6.1 The QR Method with Shifts
3.7 The Divide-and-Conquer Method

4 Iterative Methods
4.1 The Lanczos Method
4.2 The Conjugate Gradient Method
4.3 Preconditioning

Bibliography

List of Figures

2.1 Householder reflection

2.2 Householder reduction of a matrix to bidiagonal form

3.1 Graph of $f(\lambda) = 1 + \frac{.5}{\delta_1-\lambda} + \frac{.5}{\delta_2-\lambda} + \frac{.5}{\delta_3-\lambda} + \frac{.5}{\delta_4-\lambda}$
3.2 Graph of $f(\lambda) = 1 + \frac{.5}{\delta_1-\lambda} + \frac{.01}{\delta_2-\lambda} + \frac{.5}{\delta_3-\lambda} + \frac{.5}{\delta_4-\lambda}$

Preface

The purpose of these notes is to present some of the standard procedures of numerical linear algebra from the perspective of a user and not a computer specialist. You will not find extensive error analysis or programming details. The purpose is to give the user a general idea of what the numerical procedures are doing. You can find more extensive discussions in the following references:

• Applied Numerical Linear Algebra by J. Demmel, SIAM, 1997
• Numerical Linear Algebra by L. Trefethen and D. Bau, SIAM, 1997
• Matrix Computations by G. Golub and C. Van Loan, Johns Hopkins University Press, 1996

The notes are divided into four chapters. The first chapter presents some of the notation used in this paper and reviews some of the basic results of Linear Algebra. The second chapter discusses methods for solving linear systems of equations, the third chapter discusses eigenvalue problems, and the fourth discusses iterative methods. Of course we cannot discuss every possible method, so I have tried to pick out those that I believe are the most used. I have assumed that the user has some basic knowledge of linear algebra.

Chapter 1

Mathematical Preliminaries

In this chapter we will describe some of the notation that will be used in these notes and review some of the basic results from Linear Algebra.

1.1 Matrices and Vectors

A matrix is a two-dimensional array of real or complex numbers arranged in rows and columns. If a matrix $A$ has $m$ rows and $n$ columns, we say that it is an $m \times n$ matrix. We denote the element in the $i$-th row and $j$-th column of $A$ by $a_{ij}$. The matrix $A$ is often written in the form

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}.$$

We sometimes write $A = (a_1,\dots,a_n)$ where $a_1,\dots,a_n$ are the columns of $A$. A vector (or n-vector) is an $n \times 1$ matrix. The collection of all n-vectors is denoted by $\mathbb{R}^n$ if the elements (components) are all real and by $\mathbb{C}^n$ if the elements are complex. We define the sum of two $m \times n$ matrices componentwise, i.e., the $i,j$ entry of $A + B$ is $a_{ij} + b_{ij}$. Similarly, we define the multiplication of a scalar $\alpha$ times a matrix $A$ to be the matrix whose $i,j$ component is $\alpha a_{ij}$.

If $A$ is a real matrix with components $a_{ij}$, then the transpose of $A$ (denoted by $A^T$) is the matrix whose $i,j$ component is $a_{ji}$, i.e., rows and columns are interchanged. If $A$ is a matrix with complex components, then $A^H$ is the matrix whose $i,j$-th component is the complex conjugate of the $j,i$-th component of $A$. We denote the complex conjugate of $a$ by $\bar a$. Thus, $(A^H)_{ij} = \bar a_{ji}$. A real matrix $A$ is said to be symmetric if $A = A^T$. A complex matrix is said to be Hermitian if $A = A^H$. Notice that the diagonal elements of a Hermitian matrix must be real. The $n \times n$ matrix whose diagonal components are all one and whose off-diagonal components are all zero is called the identity matrix and is denoted by $I$.

If $A$ is an $m \times k$ matrix and $B$ is a $k \times n$ matrix, then the product $AB$ is the $m \times n$ matrix with components given by
$$(AB)_{ij} = \sum_{r=1}^{k} a_{ir} b_{rj}.$$
The matrix product $AB$ is only defined when the number of columns of $A$ is the same as the number of rows of $B$. In particular, the product of an $m \times n$ matrix $A$ and an n-vector $x$ is given by
$$(Ax)_i = \sum_{k=1}^{n} a_{ik} x_k, \qquad i = 1,\dots,m.$$
It can be easily verified that $IA = A$ if the number of columns in $I$ equals the number of rows in $A$. It can also be shown that $(AB)^T = B^T A^T$ and $(AB)^H = B^H A^H$. In addition, we have $(A^T)^T = A$ and $(A^H)^H = A$.

1.2 Vector Spaces

$\mathbb{R}^n$ and $\mathbb{C}^n$ together with the operations of addition and scalar multiplication are examples of a structure called a vector space. A vector space $V$ is a collection of vectors for which addition and scalar multiplication are defined in such a way that the following conditions hold:

1. If $x$ and $y$ belong to $V$ and $\alpha$ is a scalar, then $x + y$ and $\alpha x$ belong to $V$.
2. $x + y = y + x$ for any two vectors $x$ and $y$ in $V$.
3. $x + (y + z) = (x + y) + z$ for any three vectors $x$, $y$, and $z$ in $V$.
4. There is a vector $0$ in $V$ such that $x + 0 = x$ for all $x$ in $V$.
5. For each $x$ in $V$ there is a vector $-x$ in $V$ such that $x + (-x) = 0$.
6. $(\alpha\beta)x = \alpha(\beta x)$ for any scalars $\alpha$, $\beta$ and any vector $x$ in $V$.
7. $1x = x$ for any $x$ in $V$.
8. $\alpha(x + y) = \alpha x + \alpha y$ for any $x$ and $y$ in $V$ and any scalar $\alpha$.
9. $(\alpha + \beta)x = \alpha x + \beta x$ for any $x$ in $V$ and any scalars $\alpha$, $\beta$.

A subspace of a vector space V is a subset that is also a vector space in its own right.

1.2.1 Linear Independence and Bases

A set of vectors $v_1,\dots,v_r$ is said to be linearly independent if the only way we can have $\alpha_1 v_1 + \cdots + \alpha_r v_r = 0$ is for $\alpha_1 = \cdots = \alpha_r = 0$. A set of vectors $v_1,\dots,v_n$ is said to span a vector space $V$ if every vector $x$ in $V$ can be written as a linear combination of the vectors $v_1,\dots,v_n$, i.e., $x = \alpha_1 v_1 + \cdots + \alpha_n v_n$. The set of all linear combinations of the vectors $v_1,\dots,v_r$ is a subspace denoted by $\langle v_1,\dots,v_r\rangle$ and called the span of these vectors. If a set of vectors $v_1,\dots,v_n$ is linearly independent and spans $V$, it is called a basis for $V$. If a vector space $V$ has a basis consisting of a finite number of vectors, then the space is said to be finite dimensional. In a finite-dimensional vector space every basis has the same number of vectors. This number is called the dimension of the vector space. Clearly $\mathbb{R}^n$ and $\mathbb{C}^n$ have dimension $n$. Let $e_k$ denote the vector in $\mathbb{R}^n$ or $\mathbb{C}^n$ that consists of all zeroes except for a one in the $k$-th position. It is easily verified that $e_1,\dots,e_n$ is a basis for either $\mathbb{R}^n$ or $\mathbb{C}^n$.

1.2.2 Inner Product and Orthogonality

If $x$ and $y$ are two n-vectors, then the inner (dot) product $x \cdot y$ is the scalar value defined by $x^H y$. If the vector space is real we can replace $x^H$ by $x^T$. The inner product $x \cdot y$ has the properties:

1. $y \cdot x = \overline{x \cdot y}$
2. $x \cdot (\alpha y) = \alpha(x \cdot y)$
3. $x \cdot (y + z) = x \cdot y + x \cdot z$
4. $x \cdot x \ge 0$ and $x \cdot x = 0$ if and only if $x = 0$.

Vectors $x$ and $y$ are said to be orthogonal if $x \cdot y = 0$. A basis $v_1,\dots,v_n$ is said to be orthonormal if
$$v_i \cdot v_j = \begin{cases} 0 & i \ne j \\ 1 & i = j. \end{cases}$$
We define the norm $\|x\|$ of a vector $x$ by $\|x\| = \sqrt{x \cdot x} = \sqrt{|x_1|^2 + \cdots + |x_n|^2}$. The norm has the properties

1. $\|\alpha x\| = |\alpha|\,\|x\|$
2. $\|x\| = 0$ implies that $x = 0$
3. $\|x + y\| \le \|x\| + \|y\|$.

If $v_1,\dots,v_n$ is an orthonormal basis and $x = \alpha_1 v_1 + \cdots + \alpha_n v_n$, then it can be shown that $\|x\|^2 = |\alpha_1|^2 + \cdots + |\alpha_n|^2$. The norm and inner product satisfy the Cauchy inequality
$$|x \cdot y| \le \|x\|\,\|y\|.$$

1.2.3 Matrices As Linear Transformations

An $m \times n$ matrix $A$ can be considered as a mapping of the space $\mathbb{R}^n$ ($\mathbb{C}^n$) into the space $\mathbb{R}^m$ ($\mathbb{C}^m$) where the image of the n-vector $x$ is the matrix-vector product $Ax$. This mapping is linear, i.e., $A(x + y) = Ax + Ay$ and $A(\alpha x) = \alpha Ax$. The range of $A$ (denoted by $\operatorname{Range}(A)$) is the space of all m-vectors $y$ such that $y = Ax$ for some n-vector $x$. It can be shown that the range of $A$ is the space spanned by the columns of $A$. The null space of $A$ (denoted by $\operatorname{Null}(A)$) is the vector space consisting of all n-vectors $x$ such that $Ax = 0$. An $n \times n$ square matrix $A$ is said to be invertible if it is a one-to-one mapping of the space $\mathbb{R}^n$ ($\mathbb{C}^n$) onto itself. It can be shown that a square matrix $A$ is invertible if and only if the null space $\operatorname{Null}(A)$ consists of only the zero vector. If $A$ is invertible, then the inverse $A^{-1}$ of $A$ is defined by $A^{-1}y = x$ where $x$ is the unique n-vector satisfying $Ax = y$. The inverse has the properties $A^{-1}A = AA^{-1} = I$ and $(AB)^{-1} = B^{-1}A^{-1}$. We denote $(A^{-1})^T$ and $(A^T)^{-1}$ by $A^{-T}$.

If $A$ is an $m \times n$ matrix, $x$ is an n-vector, and $y$ is an m-vector, then it can be shown that
$$(Ax) \cdot y = x \cdot (A^H y).$$

1.3 Derivatives of Vector Functions

The central idea behind differentiation is the local approximation of a function by a linear function. If $f$ is a function of one variable, then the locus of points $(x, f(x))$ is a plane curve $C$. The tangent line to $C$ at $(x, f(x))$ is the graphical representation of the best local linear approximation to $f$ at $x$. We call this local linear approximation the differential. We represent this local linear approximation by the equation $dy = f'(x)\,dx$. If $f$ is a function of two variables, then the locus of points $(x, y, f(x,y))$ represents a surface $S$. Here the best local linear approximation to $f$ at $(x,y)$ is graphically represented by the tangent plane to the surface $S$ at the point $(x, y, f(x,y))$. We will generalize this idea of a local linear approximation to vector-valued functions of $n$ variables. Let $f$ be a function mapping n-vectors into m-vectors. We define the derivative $Df(x)$ of $f$ at the n-vector $x$ to be the unique linear transformation ($m \times n$ matrix) satisfying
$$f(x + h) = f(x) + Df(x)h + o(\|h\|) \tag{1.1}$$
whenever such a transformation exists. Here the $o$ notation signifies a function with the property

$$\lim_{\|h\|\to 0}\frac{o(\|h\|)}{\|h\|} = 0.$$
Thus, $Df(x)$ is a linear transformation that locally approximates $f$.

We can also define a directional derivative $\delta_h f(x)$ in the direction $h$ by
$$\delta_h f(x) = \lim_{\epsilon\to 0}\frac{f(x + \epsilon h) - f(x)}{\epsilon} = \left.\frac{d}{d\epsilon}f(x + \epsilon h)\right|_{\epsilon=0} \tag{1.2}$$
whenever the limit exists. This directional derivative is also referred to as the variation of $f$ in the direction $h$. If $Df(x)$ exists, then
$$\delta_h f(x) = Df(x)h.$$
However, the existence of $\delta_h f(x)$ for every direction $h$ does not imply the existence of $Df(x)$. If we take $h = e_i$, then $\delta_h f(x)$ is just the partial derivative $\frac{\partial f(x)}{\partial x_i}$.

1.3.1 Newton’s Method

Newton’s method is an iterative scheme for finding the zeroes of a smooth function f . If x is a guess, then we approximate f near x by

$$f(x + h) \approx f(x) + Df(x)h.$$
If $x + h$ is the zero of this linear approximation, then
$$h = -\bigl[Df(x)\bigr]^{-1}f(x)$$
or
$$x + h = x - \bigl[Df(x)\bigr]^{-1}f(x). \tag{1.3}$$
We can take $x + h$ as an improved approximation to the nearby zero of $f$. If we keep iterating with equation (1.3), then the $(k+1)$-iterate $x^{(k+1)}$ is related to the $k$-iterate $x^{(k)}$ by
$$x^{(k+1)} = x^{(k)} - \bigl[Df(x^{(k)})\bigr]^{-1}f(x^{(k)}). \tag{1.4}$$
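A short NumPy sketch of the iteration in equation (1.4) may make this concrete; the helper name `newton` and the small test system are chosen only for illustration.

```python
import numpy as np

def newton(f, Df, x0, tol=1e-12, max_iter=50):
    """Newton iteration x_{k+1} = x_k - Df(x_k)^{-1} f(x_k), cf. equation (1.4)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(Df(x), f(x))   # solve Df(x) h = f(x) rather than inverting
        x = x - step
        if np.linalg.norm(step) < tol * (1 + np.linalg.norm(x)):
            break
    return x

# Example: intersect the circle x^2 + y^2 = 4 with the curve y = x^2.
f  = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[1] - x[0]**2])
Df = lambda x: np.array([[2*x[0], 2*x[1]], [-2*x[0], 1.0]])
print(newton(f, Df, [1.0, 1.0]))
```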

Chapter 2

Solution of Systems of Linear Equations

2.1 Gaussian Elimination

Gaussian elimination is the standard way of solving a system of linear equations $Ax = b$ when $A$ is a square matrix with no special properties. The first known use of this method was in the Chinese text Nine Chapters on the Mathematical Art written between 200 BC and 100 BC. Here it was used to solve a system of three equations in three unknowns. The coefficients (including the right-hand side) were written in tabular form and operations were performed on this table to produce a triangular form that could be easily solved. It is remarkable that this was done long before the development of matrix notation or even a notation for variables. The method was used by Gauss in the early 1800s to solve a least squares problem for determining the orbit of the asteroid Pallas. Using observations of Pallas taken between 1803 and 1809, he obtained a system of six equations in six unknowns which he solved by the method now known as Gaussian elimination. The concept of treating a matrix as an object and the development of an algebra for matrices were first introduced by Cayley [2] in the paper A Memoir on the Theory of Matrices.

In this paper we will first describe the basic method and show that it is equivalent to factoring the matrix into the product of a lower triangular and an upper triangular matrix, i.e., $A = LU$. We will then introduce the method of row pivoting that is necessary in order to keep the method stable. We will show that row pivoting is equivalent to a factorization $PA = LU$ or $A = P LU$ where $P$ is the identity matrix with its rows permuted. Having obtained this factorization, the solution for a given right-hand side $b$ is obtained by solving the two triangular systems $Ly = Pb$ and $Ux = y$ by simple processes called forward and backward substitution.

There are a number of good computer implementations of Gaussian elimination with row pivoting. Matlab has a good implementation obtained by the call [L,U,P]=lu(A). Another good implementation is the LAPACK routine SGESV (DGESV, CGESV). It can be obtained in either Fortran or C from the site www..org.

We will end by showing how the accuracy of a solution can be improved by a process called iterative refinement.

2.1.1 The Basic Procedure

Gaussian elimination begins by producing zeroes below the diagonal in the first column, i.e.,

$$\begin{pmatrix} \times & \times & \cdots & \times \\ \times & \times & \cdots & \times \\ \vdots & \vdots & & \vdots \\ \times & \times & \cdots & \times \end{pmatrix} \rightarrow \begin{pmatrix} \times & \times & \cdots & \times \\ 0 & \times & \cdots & \times \\ \vdots & \vdots & & \vdots \\ 0 & \times & \cdots & \times \end{pmatrix}. \tag{2.1}$$

If $a_{ij}$ is the element of $A$ in the $i$-th row and the $j$-th column, then the first step in the Gaussian elimination process consists of multiplying $A$ on the left by the lower triangular matrix $L_1$ given by
$$L_1 = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ -a_{21}/a_{11} & 1 & 0 & \cdots & 0 \\ -a_{31}/a_{11} & 0 & 1 & & \vdots \\ \vdots & \vdots & & \ddots & 0 \\ -a_{n1}/a_{11} & 0 & \cdots & 0 & 1 \end{pmatrix}, \tag{2.2}$$
i.e., zeroes are produced in the first column by adding appropriate multiples of the first row to the other rows. The next step is to produce zeroes below the diagonal in the second column, i.e.,
$$\begin{pmatrix} \times & \times & \times & \cdots & \times \\ 0 & \times & \times & \cdots & \times \\ 0 & \times & \times & \cdots & \times \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & \times & \times & \cdots & \times \end{pmatrix} \rightarrow \begin{pmatrix} \times & \times & \times & \cdots & \times \\ 0 & \times & \times & \cdots & \times \\ 0 & 0 & \times & \cdots & \times \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & \times & \cdots & \times \end{pmatrix}. \tag{2.3}$$

This can be obtained by multiplying $L_1 A$ on the left by the lower triangular matrix $L_2$ given by
$$L_2 = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & 0 & & 0 \\ 0 & -a^{(1)}_{32}/a^{(1)}_{22} & 1 & 0 & & 0 \\ 0 & -a^{(1)}_{42}/a^{(1)}_{22} & 0 & 1 & & 0 \\ \vdots & \vdots & & & \ddots & \vdots \\ 0 & -a^{(1)}_{n2}/a^{(1)}_{22} & 0 & \cdots & 0 & 1 \end{pmatrix} \tag{2.4}$$
where $a^{(1)}_{ij}$ is the $i,j$-th element of $L_1 A$. Continuing in this manner, we can define lower triangular matrices $L_3,\dots,L_{n-1}$ so that $L_{n-1}\cdots L_1 A$ is upper triangular, i.e.,

$$L_{n-1}\cdots L_1 A = U. \tag{2.5}$$
Taking the inverses of the matrices $L_1,\dots,L_{n-1}$, we can write $A$ as

$$A = L_1^{-1}\cdots L_{n-1}^{-1}U. \tag{2.6}$$
Let
$$L = L_1^{-1}\cdots L_{n-1}^{-1}. \tag{2.7}$$
Then it follows from equation (2.6) that

$$A = LU. \tag{2.8}$$

We will now show that L is lower triangular. Each of the matrices Lk can be written in the form

$$L_k = I - u^{(k)}e_k^T \tag{2.9}$$
where $e_k$ is the vector whose components are all zero except for a one in the $k$-th position and $u^{(k)}$ is a vector whose first $k$ components are zero. The term $u^{(k)}e_k^T$ is an $n \times n$ matrix whose elements are all zero except for those below the diagonal in the $k$-th column. In fact, the components of $u^{(k)}$ are given by

$$u^{(k)}_i = \begin{cases} 0 & 1 \le i \le k \\ a^{(k-1)}_{ik}/a^{(k-1)}_{kk} & k < i \end{cases} \tag{2.10}$$

where $a^{(k-1)}_{ij}$ is the $i,j$-th element of $L_{k-1}\cdots L_1 A$. Since $e_k^T u^{(k)} = u^{(k)}_k = 0$, it follows that
$$\bigl(I + u^{(k)}e_k^T\bigr)\bigl(I - u^{(k)}e_k^T\bigr) = I + u^{(k)}e_k^T - u^{(k)}e_k^T - u^{(k)}\bigl(e_k^T u^{(k)}\bigr)e_k^T = I, \tag{2.11}$$
i.e.,
$$L_k^{-1} = I + u^{(k)}e_k^T. \tag{2.12}$$
Thus, $L_k^{-1}$ is the same as $L_k$ except for a change of sign of the elements below the diagonal in column $k$. Combining equations (2.7) and (2.12), we obtain

$$L = \bigl(I + u^{(1)}e_1^T\bigr)\cdots\bigl(I + u^{(n-1)}e_{n-1}^T\bigr) = I + u^{(1)}e_1^T + \cdots + u^{(n-1)}e_{n-1}^T. \tag{2.13}$$
In this expression the cross terms dropped out since

$$u^{(i)}e_i^T\,u^{(j)}e_j^T = \bigl(e_i^T u^{(j)}\bigr)u^{(i)}e_j^T = 0 \quad \text{for } i < j.$$
Thus,

$$L = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ a_{21}/a_{11} & 1 & 0 & & 0 \\ a_{31}/a_{11} & a^{(1)}_{32}/a^{(1)}_{22} & 1 & & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ a_{n1}/a_{11} & a^{(1)}_{n2}/a^{(1)}_{22} & \cdots & & 1 \end{pmatrix}. \tag{2.14}$$
Having the LU factorization given in equation (2.8), it is possible to solve the system of equations $Ax = LUx = b$ for any right-hand side $b$. If we let $y = Ux$, then $y$ can be found by solving the triangular system $Ly = b$. Having $y$, $x$ can be obtained by solving the triangular system $Ux = y$. Triangular systems are very easy to solve. For example, in the system $Ux = y$, the last equation can be solved for $x_n$ (the only unknown in this equation). Having $x_n$, the next to the last equation can be solved for $x_{n-1}$ (the only unknown left in this equation). Continuing in this manner we can solve for the remaining components of $x$. For the system $Ly = b$, we start by computing $y_1$ and then work our way down. Solving an upper triangular system is called back substitution. Solving a lower triangular system is called forward substitution.

To compute $L$ requires approximately $n^3/3$ operations where an operation consists of an addition and a multiplication. For each right-hand side, solving the two triangular systems requires approximately $n^2$ operations. Thus, as far as solving systems of equations is concerned, having the LU factorization of $A$ is just as good as having the inverse of $A$ and is less costly to compute.
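As an illustration of factoring once and reusing the factors, the SciPy sketch below wraps the LAPACK routines mentioned earlier; the random test data are only an example, and the library routines also perform the row pivoting described in the next section.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# Factor A once (about n^3/3 operations).
lu_piv = lu_factor(A)

# Each new right-hand side then costs only the two triangular solves.
for _ in range(3):
    b = rng.standard_normal(5)
    x = lu_solve(lu_piv, b)
    print(np.allclose(A @ x, b))     # True
```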

2.1.2 Row Pivoting

There is one problem with Gaussian elimination that has yet to be addressed. It is possible for one of the diagonal elements $a^{(k-1)}_{kk}$ that occur during Gaussian elimination to be zero or to be very small. This causes a problem since we must divide by this diagonal element. If one of the diagonals is exactly zero, the process obviously blows up. However, there can still be a problem if one of the diagonals is small. In this case large elements are produced in both the $L$ and $U$ matrices. These large entries lead to a loss of accuracy when there are subtractions involving these big numbers. This problem can occur even for well behaved matrices. To eliminate this problem we introduce row pivoting. In performing Gaussian elimination, it is not necessary to take the equations in the order they are given. Suppose we are at the stage where we are zeroing out the elements below the diagonal in the $k$-th column. We can interchange any of the rows from the $k$-th row on without changing the structure of the matrix. In row pivoting we find the largest in magnitude of the elements $a^{(k-1)}_{kk}, a^{(k-1)}_{k+1,k},\dots,a^{(k-1)}_{nk}$ and interchange rows to bring that element to the $k,k$-position. Mathematically we can perform this row interchange by multiplying on the left by the matrix $P_k$ that is like the identity matrix with the appropriate rows interchanged. The matrix $P_k$ has the property $P_k P_k = I$, i.e., $P_k$ is its own inverse. With row pivoting equation (2.5) is replaced by
$$L_{n-1}P_{n-1}\cdots L_2 P_2 L_1 P_1 A = U. \tag{2.15}$$
We can write this equation in the form

$$\bigl(L_{n-1}\bigr)\bigl(P_{n-1}L_{n-2}P_{n-1}^{-1}\bigr)\bigl(P_{n-1}P_{n-2}L_{n-3}P_{n-2}^{-1}P_{n-1}^{-1}\bigr)\cdots\bigl(P_{n-1}\cdots P_2 L_1 P_2^{-1}\cdots P_{n-1}^{-1}\bigr)\bigl(P_{n-1}\cdots P_1\bigr)A = U. \tag{2.16}$$
Define $L'_{n-1} = L_{n-1}$ and
$$L'_k = P_{n-1}\cdots P_{k+1}L_k P_{k+1}^{-1}\cdots P_{n-1}^{-1}, \qquad k = 1,\dots,n-2. \tag{2.17}$$
Then equation (2.16) can be written

$$L'_{n-1}\cdots L'_1\bigl(P_{n-1}\cdots P_1\bigr)A = U. \tag{2.18}$$
Note that multiplying by $P_j$ on the left only modifies rows $j$ up to $n$. Similarly, multiplying by $P_j^{-1} = P_j$ on the right only modifies columns $j$ up to $n$. Therefore,
$$L'_k = P_{n-1}\cdots P_{k+1}\bigl(I - u^{(k)}e_k^T\bigr)P_{k+1}\cdots P_{n-1} = I - \bigl(P_{n-1}\cdots P_{k+1}u^{(k)}\bigr)e_k^T = I - v^{(k)}e_k^T \tag{2.19}$$
where $v^{(k)}$ is like $u^{(k)}$ except the components $k+1$ to $n$ are permuted by $P_{n-1}\cdots P_{k+1}$. Since $L'_k$ has the same form as $L_k$, it follows that the matrix $L = (L'_1)^{-1}\cdots(L'_{n-1})^{-1}$ is lower triangular. Thus, if we define $P = P_{n-1}\cdots P_1$, equation (2.18) becomes
$$PA = LU. \tag{2.20}$$
Of course, in practice we don't need to explicitly construct the matrix $P$ since the interchanges can be kept track of using a vector. To solve a system of equations $Ax = b$ we replace the system by $PAx = Pb$ and proceed as before.

It is also possible to do column interchanges as well as row interchanges, but this is seldom used in practice. By the construction of $L$ all its elements are less than or equal to one in magnitude. The elements of $U$ are usually not very large, but there are some peculiar cases where large entries can appear in $U$ even with row pivoting. For example, consider the matrix

$$A = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 1 \\ -1 & 1 & 0 & \cdots & 0 & 1 \\ -1 & -1 & 1 & & \vdots & \vdots \\ \vdots & \vdots & & \ddots & 0 & 1 \\ -1 & -1 & \cdots & -1 & 1 & 1 \\ -1 & -1 & -1 & \cdots & -1 & 1 \end{pmatrix}.$$
In the first step no pivoting is necessary, but the elements 2 through $n$ in the last column are doubled. In the second step again no pivoting is necessary, but the elements 3 through $n$ are doubled. Continuing in this manner we arrive at

$$U = \begin{pmatrix} 1 & & & & 1 \\ & 1 & & & 2 \\ & & 1 & & 4 \\ & & & \ddots & \vdots \\ & & & & 2^{n-1} \end{pmatrix}.$$

Although growth like this in the size of the elements of $U$ is theoretically possible, there are no reports of this ever having happened in the solution of a real-world problem. In practice Gaussian elimination with row pivoting has proven to be very stable.
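The element growth for this example can be observed directly with a library factorization. The sketch below is only illustrative (the size n = 60 is arbitrary); it builds the matrix above, factors it with SciPy's pivoted LU, and reports the largest entries of U and L.

```python
import numpy as np
from scipy.linalg import lu

def growth_matrix(n):
    """Ones on the diagonal and in the last column, -1 below the diagonal (the matrix above)."""
    A = -np.tril(np.ones((n, n)), -1) + np.eye(n)
    A[:, -1] = 1.0
    return A

A = growth_matrix(60)
P, L, U = lu(A)                      # scipy.linalg.lu returns P, L, U with A = P @ L @ U
print(np.abs(U).max())               # roughly 2**59: the last column of U doubles at every step
print(np.abs(L).max() <= 1.0)        # True: the elements of L stay bounded by one
```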

2.1.3 Iterative Refinement

If the solution of $Ax = b$ is not sufficiently accurate, the accuracy can be improved by applying Newton's method to the function $f(x) = Ax - b$. If $x^{(k)}$ is an approximate solution to $f(x) = 0$, then a Newton iteration produces an approximation $x^{(k+1)}$ given by

$$x^{(k+1)} = x^{(k)} - \bigl[Df(x^{(k)})\bigr]^{-1}f(x^{(k)}) = x^{(k)} - A^{-1}\bigl(Ax^{(k)} - b\bigr). \tag{2.21}$$
An iteration step can be summarized as follows:

1. Compute the residual $r^{(k)} = Ax^{(k)} - b$.
2. Solve the system $Ad^{(k)} = r^{(k)}$ using the LU factorization of $A$.
3. Compute $x^{(k+1)} = x^{(k)} - d^{(k)}$.

The residual is usually computed in double precision. If the above calculations were carried out exactly, the answer would be obtained in one iteration, as is always true when applying Newton's method to a linear function. However, because of roundoff errors, it may require more than one iteration to obtain the desired accuracy.
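As a rough sketch of the idea (sizes and precisions chosen only for illustration), the code below factors a matrix in single precision and then applies the three steps above with double-precision residuals, reusing the cheap factorization at every step.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(7)
n = 200
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Factor and solve in single precision, as if that were our working precision.
A32 = A.astype(np.float32)
lu_piv = lu_factor(A32)
x = lu_solve(lu_piv, b.astype(np.float32)).astype(np.float64)
print(np.linalg.norm(A @ x - b))

# Refinement: residual in double precision, correction from the single-precision factors.
for _ in range(3):
    r = A @ x - b                                          # step 1: residual
    d = lu_solve(lu_piv, r.astype(np.float32))             # step 2: solve A d = r
    x = x - d.astype(np.float64)                           # step 3: update
print(np.linalg.norm(A @ x - b))                           # typically several digits smaller
```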

2.2 Cholesky Factorization

Matrices that are Hermitian ($A^H = A$) and positive definite ($x^H A x > 0$ for all $x \ne 0$) occur sufficiently often in practice that it is worth describing a variant of Gaussian elimination that is often used for this class of matrices. Recall that Gaussian elimination amounted to a factorization of a square matrix $A$ into the product of a lower triangular matrix and an upper triangular matrix, i.e., $A = LU$. The Cholesky factorization represents a Hermitian positive definite matrix $A$ by the product of a lower triangular matrix and its conjugate transpose, i.e., $A = LL^H$. Because of the symmetries involved, this factorization can be formed in roughly half the number of operations needed for Gaussian elimination.

Let us begin by looking at some of the properties of positive definite matrices. If $e_i$ is the $i$-th column of the identity matrix and $A = (a_{ij})$ is positive definite, then $a_{ii} = e_i^T A e_i > 0$, i.e., the diagonal components of $A$ are real and positive. Suppose $X$ is a nonsingular matrix of the same size as the Hermitian, positive definite matrix $A$. Then

$$x^H(X^H A X)x = (Xx)^H A(Xx) > 0 \quad \text{for all } x \ne 0.$$
Thus, $A$ Hermitian positive definite implies that $X^H A X$ is Hermitian positive definite. Conversely, suppose $X^H A X$ is Hermitian positive definite. Then

$$A = (XX^{-1})^H A(XX^{-1}) = (X^{-1})^H(X^H A X)(X^{-1})$$
is Hermitian positive definite.

Next we will show that the component of largest magnitude of a Hermitian positive definite matrix $A$ always lies on the diagonal. Suppose that $|a_{kl}| = \max_{i,j}|a_{ij}|$ and $k \ne l$. If $a_{kl} = |a_{kl}|e^{i\theta_{kl}}$, let $\alpha = -e^{-i\theta_{kl}}$ and $x = e_k + \alpha e_l$. Then
$$x^H A x = e_k^T A e_k + \bar\alpha\,e_l^T A e_k + \alpha\,e_k^T A e_l + |\alpha|^2 e_l^T A e_l = a_{kk} + a_{ll} - 2|a_{kl}| \le 0.$$

This contradicts the fact that $A$ is positive definite. Therefore, $\max_{i,j}|a_{ij}| = \max_i a_{ii}$. Suppose we partition the Hermitian positive definite matrix $A$ as follows

$$A = \begin{pmatrix} B & C^H \\ C & D \end{pmatrix}.$$
If $y$ is a nonzero vector compatible with $D$, let $x^H = (0, y^H)$. Then
$$x^H A x = (0, y^H)\begin{pmatrix} B & C^H \\ C & D \end{pmatrix}\begin{pmatrix} 0 \\ y \end{pmatrix} = y^H D y > 0,$$
i.e., $D$ is Hermitian positive definite. Similarly, letting $x^H = (y^H, 0)$, we can show that $B$ is Hermitian positive definite.

We will now show that if $A$ is a Hermitian, positive-definite matrix, then there is a unique lower triangular matrix $L$ with positive diagonals such that $A = LL^H$. This factorization is called the Cholesky factorization. We will establish this result by induction on the dimension $n$. Clearly, the result is true for $n = 1$, for in this case we can take $L = (\sqrt{a_{11}})$. Suppose the result is true for matrices of dimension $n - 1$. Let $A$ be a Hermitian, positive-definite matrix of dimension $n$. We can partition $A$ as follows
$$A = \begin{pmatrix} a_{11} & w^H \\ w & K \end{pmatrix} \tag{2.22}$$
where $w$ is a vector of dimension $n - 1$ and $K$ is an $(n-1)\times(n-1)$ matrix. It is easily verified that
$$A = \begin{pmatrix} a_{11} & w^H \\ w & K \end{pmatrix} = B^H\begin{pmatrix} 1 & 0 \\ 0 & K - \dfrac{ww^H}{a_{11}} \end{pmatrix}B \tag{2.23}$$
where
$$B = \begin{pmatrix} \sqrt{a_{11}} & \dfrac{w^H}{\sqrt{a_{11}}} \\ 0 & I \end{pmatrix}. \tag{2.24}$$
We will first show that the matrix $B$ is invertible. If

$$Bx = \begin{pmatrix} \sqrt{a_{11}} & \dfrac{w^H}{\sqrt{a_{11}}} \\ 0 & I \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \sqrt{a_{11}}\,x_1 + \dfrac{w^H x_2}{\sqrt{a_{11}}} \\ x_2 \end{pmatrix} = 0,$$

then $x_2 = 0$ and $\sqrt{a_{11}}\,x_1 = 0$, so $x_1 = 0$. Therefore, $B$ is invertible. From our discussion at the beginning of this section it follows from equation (2.23) that the matrix

$$\begin{pmatrix} 1 & 0 \\ 0 & K - \dfrac{ww^H}{a_{11}} \end{pmatrix}$$
is Hermitian positive definite. By the results on the partitioning of a positive definite matrix, it follows that the matrix $K - \frac{ww^H}{a_{11}}$ is Hermitian positive definite. By the induction hypothesis, there exists a unique lower triangular matrix $\hat L$ with positive diagonals such that
$$K - \frac{ww^H}{a_{11}} = \hat L\hat L^H. \tag{2.25}$$
Substituting equation (2.25) into equation (2.23), we get

$$A = B^H\begin{pmatrix} 1 & 0 \\ 0 & \hat L\hat L^H \end{pmatrix}B = B^H\begin{pmatrix} 1 & 0 \\ 0 & \hat L \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & \hat L^H \end{pmatrix}B = \begin{pmatrix} \sqrt{a_{11}} & 0 \\ \dfrac{w}{\sqrt{a_{11}}} & \hat L \end{pmatrix}\begin{pmatrix} \sqrt{a_{11}} & \dfrac{w^H}{\sqrt{a_{11}}} \\ 0 & \hat L^H \end{pmatrix} \tag{2.26}$$
which is the desired factorization of $A$. To show uniqueness, suppose that

$$A = \begin{pmatrix} a_{11} & w^H \\ w & K \end{pmatrix} = \begin{pmatrix} l_{11} & 0 \\ v & \hat L \end{pmatrix}\begin{pmatrix} l_{11} & v^H \\ 0 & \hat L^H \end{pmatrix} \tag{2.27}$$
is a Cholesky factorization of $A$. Equating components in equation (2.27), we see that $l_{11}^2 = a_{11}$ and hence that $l_{11} = \sqrt{a_{11}}$. Also $l_{11}v = w$ or $v = w/l_{11} = w/\sqrt{a_{11}}$. Finally, $vv^H + \hat L\hat L^H = K$ or $K - vv^H = K - ww^H/a_{11} = \hat L\hat L^H$. Since $\hat L\hat L^H$ is the unique factorization of the $(n-1)\times(n-1)$ Hermitian, positive-definite matrix $K - ww^H/a_{11}$, we see that the Cholesky factorization of $A$ is unique. It now follows by induction that there is a unique Cholesky factorization of any Hermitian, positive-definite matrix.

The factorization in equation (2.23) is the basis for the computation of the Cholesky factorization. The matrix $B^H$ is lower triangular. Since the matrix $K - ww^H/a_{11}$ is positive definite, it can be factored in the same manner. Continuing in this manner until the center matrix becomes the identity matrix, we obtain lower triangular matrices $L_1,\dots,L_n$ such that

$$A = L_1\cdots L_n L_n^H\cdots L_1^H.$$

Letting $L = L_1\cdots L_n$, we have the desired Cholesky factorization. As was mentioned previously, the number of operations in the Cholesky factorization is about half the number in Gaussian elimination. Unlike Gaussian elimination, the Cholesky method does not need pivoting in order to maintain stability. The Cholesky factorization can also be written in the form
$$A = LDL^H$$
where $D$ is diagonal and $L$ now has all ones on the diagonal.
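A small NumPy/SciPy illustration of the factorization and of solving $Ax = b$ with two triangular solves; the random test matrix is made positive definite by construction and is not meant to be representative of any particular application.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(8)
B = rng.standard_normal((6, 6))
A = B @ B.T + 6 * np.eye(6)          # symmetric positive definite by construction
b = rng.standard_normal(6)

L = np.linalg.cholesky(A)            # lower triangular with positive diagonal, A = L L^H
print(np.allclose(L @ L.T, A))       # True

# Solve A x = b with forward and back substitution, as with the LU factorization.
y = solve_triangular(L, b, lower=True)
x = solve_triangular(L.T, y, lower=False)
print(np.allclose(A @ x, b))         # True
```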

2.3 Elementary Unitary Matrices and the QR Factorization

In Gaussian elimination we saw that a square matrix $A$ could be reduced to triangular form by multiplying on the left by a series of elementary lower triangular matrices. This process can also be expressed as a factorization $A = LU$ where $L$ is lower triangular and $U$ is upper triangular. In least squares problems the number of rows $m$ in $A$ is usually greater than the number of columns $n$. The standard technique for solving least-squares problems of this type is to make use of a factorization $A = QR$ where $Q$ is an $m \times m$ unitary matrix and $R$ has the form
$$R = \begin{pmatrix} \hat R \\ 0 \end{pmatrix}$$
with $\hat R$ an $n \times n$ upper triangular matrix. The usual way of obtaining this factorization is to reduce the matrix $A$ to triangular form by multiplying on the left by a series of elementary unitary matrices that are sometimes called Householder matrices (reflectors). We will show how to use this QR factorization to solve least squares problems. If $\hat Q$ is the $m \times n$ matrix consisting of the first $n$ columns of $Q$, then
$$A = \hat Q\hat R.$$
This factorization is called the reduced QR factorization. Elementary unitary matrices are also used to reduce square matrices to a simplified form (Hessenberg or tridiagonal) prior to eigenvalue calculation.

There are several good computer implementations that use the Householder QR factorization to solve the least squares problem. The LAPACK routine is called SGELS (DGELS, CGELS). In Matlab the solution of the least squares problem is given by A\b. The QR factorization can be obtained with the call [Q,R]=qr(A).

2.3.1 Gram-Schmidt Orthogonalization

A reduced QR factorization can be obtained by an orthogonalization procedure known as the Gram-Schmidt process. Suppose we would like to construct an orthonormal set of vectors $q_1,\dots,q_n$ from a given linearly independent set of vectors $a_1,\dots,a_n$. The process is recursive. At the $j$-th step we construct a unit vector $q_j$ that is orthogonal to $q_1,\dots,q_{j-1}$ using

$$v_j = a_j - \sum_{i=1}^{j-1}(q_i^H a_j)q_i, \qquad q_j = v_j/\|v_j\|.$$
The orthonormal basis constructed has the additional property

$$\langle q_1,\dots,q_j\rangle = \langle a_1,\dots,a_j\rangle, \qquad j = 1,2,\dots,n.$$

If we consider $a_1,\dots,a_n$ as columns of a matrix $A$, then this process is equivalent to the matrix factorization $A = \hat Q\hat R$ where $\hat Q = (q_1,\dots,q_n)$ and $\hat R$ is upper triangular. Although the Gram-Schmidt process is very useful in theoretical considerations, it does not lead to a stable numerical procedure. In the next section we will discuss Householder reflectors, which lead to a more stable procedure for obtaining a QR factorization.
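A direct transcription of the recursion into NumPy (classical Gram-Schmidt, shown only to make the process concrete; as noted above it is not the stable way to compute a QR factorization):

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt: returns Q_hat, R_hat with A = Q_hat @ R_hat."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]   # r_ij = q_i^H a_j
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

A = np.random.default_rng(1).standard_normal((6, 4))
Q, R = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(4)), np.allclose(Q @ R, A))   # True True
```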

2.3.2 Householder Reflections

Let us begin by describing the Householder reflectors. In this section we will restrict ourselves to real matrices. Afterwards we will see that there are a number of generalizations to the complex case. If $v$ is a fixed vector of dimension $m$ with $\|v\| = 1$, then the set of all vectors orthogonal to $v$ is an $(m-1)$-dimensional subspace called a hyperplane. If we denote this hyperplane by $H$, then
$$H = \{u : v^T u = 0\}. \tag{2.28}$$
Here $v^T$ denotes the transpose of $v$. If $x$ is a point not on $H$, let $\bar x$ denote the orthogonal projection of $x$ onto $H$ (see Figure 2.1). The difference $x - \bar x$ must be orthogonal to $H$ and hence a multiple of $v$, i.e.,
$$x - \bar x = \alpha v \quad \text{or} \quad x = \bar x + \alpha v. \tag{2.29}$$

Figure 2.1: Householder reflection

Since $\bar x$ lies on $H$ and $v^T v = \|v\|^2 = 1$, we must have
$$v^T\bar x = v^T x - \alpha v^T v = v^T x - \alpha = 0. \tag{2.30}$$
Thus, $\alpha = v^T x$ and consequently
$$\bar x = x - (v^T x)v = x - vv^T x = (I - vv^T)x. \tag{2.31}$$
Define $P = I - vv^T$. Then $P$ is a projection matrix that projects vectors orthogonally onto $H$. The projection $\bar x$ is obtained by going a certain distance from $x$ in the direction $-v$. Figure 2.1 suggests that the reflection $\hat x$ of $x$ across $H$ can be obtained by going twice that distance in the same direction, i.e.,

$$\hat x = x - 2(v^T x)v = x - 2vv^T x = (I - 2vv^T)x. \tag{2.32}$$
With this motivation we define the Householder reflector $Q$ by

$$Q = I - 2vv^T, \qquad \|v\| = 1. \tag{2.33}$$
An alternate form for the Householder reflector is
$$Q = I - \frac{2uu^T}{\|u\|^2} \tag{2.34}$$
where here $u$ is not restricted to be a unit vector. Notice that, in this form, replacing $u$ by a multiple of $u$ does not change $Q$. The matrix $Q$ is clearly symmetric, i.e., $Q^T = Q$. Moreover,
$$Q^T Q = Q^2 = (I - 2vv^T)(I - 2vv^T) = I - 2vv^T - 2vv^T + 4vv^T vv^T = I, \tag{2.35}$$
i.e., $Q$ is an orthogonal matrix. As with all orthogonal matrices, $Q$ preserves the norm of a vector, i.e.,
$$\|Qx\|^2 = (Qx)^T Qx = x^T Q^T Qx = x^T x = \|x\|^2. \tag{2.36}$$
To reduce a matrix to one that is upper triangular it is necessary to zero out columns below a certain position. We will show how to construct a Householder reflector so that its action on a given vector $x$ is a multiple of $e_1$, the first column of the identity matrix. To zero out a vector below row $k$ we can use a matrix of the form
$$Q = \begin{pmatrix} I & 0 \\ 0 & \hat Q \end{pmatrix}$$
where $I$ is the $(k-1)\times(k-1)$ identity matrix and $\hat Q$ is an $(m-k+1)\times(m-k+1)$ Householder matrix. Thus, for a given vector $x$ we would like to choose a vector $u$ so that $Qx$ is a multiple of the unit vector $e_1$, i.e.,
$$Qx = x - \frac{2(u^T x)}{\|u\|^2}u = \alpha e_1. \tag{2.37}$$
Since $Q$ preserves norms, we must have $|\alpha| = \|x\|$. Therefore, equation (2.37) becomes
$$Qx = x - \frac{2(u^T x)}{\|u\|^2}u = \pm\|x\|e_1. \tag{2.38}$$
It follows from equation (2.38) that $u$ must be a multiple of the vector $x \mp \|x\|e_1$. Since $u$ can be replaced by a multiple of $u$ without changing $Q$, we let

$$u = x \mp \|x\|e_1. \tag{2.39}$$
It follows from the definition of $u$ in equation (2.39) that

$$u^T x = \|x\|^2 \mp \|x\|x_1 \tag{2.40}$$
and
$$\|u\|^2 = u^T u = \|x\|^2 \mp \|x\|x_1 \mp \|x\|x_1 + \|x\|^2 = 2\bigl(\|x\|^2 \mp \|x\|x_1\bigr). \tag{2.41}$$
Therefore,
$$\frac{2(u^T x)}{\|u\|^2} = 1, \tag{2.42}$$
and hence $Qx$ becomes

$$Qx = x - \frac{2(u^T x)}{\|u\|^2}u = x - u = \pm\|x\|e_1 \tag{2.43}$$
as desired. From what has been discussed so far, either of the signs in equation (2.39) would produce the desired result. However, if $x_1$ is very large compared to the other components, then it is possible to lose accuracy through subtraction in the computation of $u = x \mp \|x\|e_1$. To prevent this we choose $u$ to be
$$u = x + \operatorname{sign}(x_1)\|x\|e_1 \tag{2.44}$$
where $\operatorname{sign}(x_1)$ is defined by

$$\operatorname{sign}(x_1) = \begin{cases} +1 & x_1 \ge 0 \\ -1 & x_1 < 0. \end{cases} \tag{2.45}$$
With this choice of $u$, equation (2.43) becomes

$$Qx = -\operatorname{sign}(x_1)\|x\|e_1. \tag{2.46}$$

In practice, $u$ is often scaled so that $u_1 = 1$, i.e.,
$$u = \frac{x + \operatorname{sign}(x_1)\|x\|e_1}{x_1 + \operatorname{sign}(x_1)\|x\|}. \tag{2.47}$$
With this choice of $u$,
$$\|u\|^2 = \frac{2\|x\|}{\|x\| + |x_1|}. \tag{2.48}$$
The matrix $Q$ applied to a general vector $y$ is given by

$$Qy = y - 2\frac{u^T y}{\|u\|^2}u. \tag{2.49}$$
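The construction in equations (2.44)-(2.49) takes only a few lines of NumPy. The helper name `householder_vector` and the test vector below are illustrative; note that $Q$ is applied without ever forming the matrix.

```python
import numpy as np

def householder_vector(x):
    """Return u and beta = 2/||u||^2 so that (I - beta*u*u^T) x = -sign(x1)*||x||*e1,
    following equations (2.44)-(2.46)."""
    x = np.asarray(x, dtype=float)
    sigma = np.sign(x[0]) if x[0] != 0 else 1.0
    u = x.copy()
    u[0] += sigma * np.linalg.norm(x)      # u = x + sign(x1)*||x||*e1
    beta = 2.0 / (u @ u)
    return u, beta

x = np.array([3.0, 1.0, 5.0, 1.0])
u, beta = householder_vector(x)
Qx = x - beta * (u @ x) * u                # apply Q as in (2.49), without forming it
print(Qx)                                  # approximately [-6, 0, 0, 0]
```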

2.3.3 Complex Householder Matrices

There are several ways to generalize Householder matrices to the complex case. The most obvious is to let
$$U = I - \frac{2uu^H}{\|u\|^2}$$
where the superscript $H$ denotes conjugate transpose. It can be shown that a matrix of this form is both Hermitian ($U = U^H$) and unitary ($U^H U = I$). However, it is sometimes convenient

I U H U .I wwH /.I wwH / I .   2 w 2/wwH D D D C j j k k and hence  2 w 2 2 Re./: (2.51) j j k k D Notice that replacing w by w=Á and  by Á 2 in equation (2.50) leaves U unchanged. Thus, a scaling of w can be absorbed in . We wouldj j like to choose w and  so that

H H U x x .w x/w x e1 (2.52) D D k k where 1. It can be seen from equation (2.52) that w must be proportional to the vector D ˙ x x e1. Since the factor of proportionality can be absorbed in , we choose k k

w x x e1: (2.53) D k k Substituting this expression for w into equation (2.52), we get

H H H H U x x .w x/.x x e1/ .1 w x/x  .w x/ x e1 x e1: (2.54) D k k D C k k D k k Thus, we must have 1 .wH x/ 1 or  : (2.55) D D xH w This choice of  gives H U x x e1: D k k It follows from equation (2.53) that

$$x^H w = \|x\|^2 - \gamma\|x\|\bar x_1 \tag{2.56}$$
and

$$\|w\|^2 = \bigl(x^H - \gamma\|x\|e_1^T\bigr)\bigl(x - \gamma\|x\|e_1\bigr) = \|x\|^2 - \gamma\|x\|\bar x_1 - \gamma\|x\|x_1 + \|x\|^2 = 2\bigl(\|x\|^2 - \gamma\|x\|\operatorname{Re}(x_1)\bigr). \tag{2.57}$$

Thus, it follows from equations (2.55)-(2.57) that
$$\frac{2\operatorname{Re}(\tau)}{|\tau|^2} = \frac{1}{\tau} + \frac{1}{\bar\tau} = x^H w + w^H x = 2\|x\|^2 - \gamma\|x\|\bar x_1 - \gamma\|x\|x_1 = 2\bigl(\|x\|^2 - \gamma\|x\|\operatorname{Re}(x_1)\bigr) = \|w\|^2,$$
i.e., the condition in equation (2.51) is satisfied. It follows that the matrix $U$ defined by equation (2.50) is unitary when $w$ is defined by equation (2.53) and $\tau$ is defined by equation (2.55). As before we choose $\gamma$ to prevent the loss of accuracy due to subtraction in equation (2.53). In this case we choose $\gamma = -\operatorname{sign}\bigl(\operatorname{Re}(x_1)\bigr)$. Thus, $w$ becomes

$$w = x + \operatorname{sign}\bigl(\operatorname{Re}(x_1)\bigr)\|x\|e_1. \tag{2.58}$$
Let us define a real constant $\mu$ by
$$\mu = \operatorname{sign}\bigl(\operatorname{Re}(x_1)\bigr)\|x\|. \tag{2.59}$$
With this definition $w$ becomes
$$w = x + \mu e_1. \tag{2.60}$$
It follows that
$$x^H w = \|x\|^2 + \mu\bar x_1 = \mu^2 + \mu\bar x_1 = \mu(\mu + \bar x_1), \tag{2.61}$$
and hence
$$\tau = \frac{1}{\mu(\mu + \bar x_1)}. \tag{2.62}$$
In LAPACK $w$ is scaled so that $w_1 = 1$, i.e.,
$$w = \frac{x + \mu e_1}{x_1 + \mu}. \tag{2.63}$$
With this $w$, $\tau$ becomes

$$\tau = \frac{|x_1 + \mu|^2}{\mu(\mu + \bar x_1)} = \frac{(x_1 + \mu)(\bar x_1 + \mu)}{\mu(\mu + \bar x_1)} = \frac{\mu + x_1}{\mu}. \tag{2.64}$$
Clearly this $\tau$ satisfies the inequality

$$|\tau - 1| = \frac{|x_1|}{|\mu|} = \frac{|x_1|}{\|x\|} \le 1. \tag{2.65}$$
It follows from equation (2.64) that $\tau$ is real when $x_1$ is real. Thus, $U$ is Hermitian when $x_1$ is real.

An alternate approach to defining a complex Householder matrix is to let

$$U = I - \frac{2ww^H}{\|w\|^2}. \tag{2.66}$$
This $U$ is Hermitian and
$$U^H U = \Bigl(I - \frac{2ww^H}{\|w\|^2}\Bigr)\Bigl(I - \frac{2ww^H}{\|w\|^2}\Bigr) = I - \frac{2ww^H}{\|w\|^2} - \frac{2ww^H}{\|w\|^2} + \frac{4\|w\|^2 ww^H}{\|w\|^4} = I, \tag{2.67}$$
i.e., $U$ is unitary. We want to choose $w$ so that

$$U^H x = Ux = x - \frac{2w^H x}{\|w\|^2}w = \gamma\|x\|e_1 \tag{2.68}$$
where $|\gamma| = 1$. Multiplying equation (2.68) by $x^H$, we get
$$x^H Ux = x^H U^H x = \overline{x^H Ux} = \gamma\|x\|\bar x_1. \tag{2.69}$$
Since $x^H Ux$ is real, it follows that $\gamma\bar x_1$ is real. If $x_1 = |x_1|e^{i\theta_1}$, then $\gamma$ must have the form

iÂ1 w x e x e1: (2.71) D  k k Again, to avoid accuracy problems, we choose the plus sign in the above formula, i.e.,

$$w = x \mp e^{i\theta_1}\|x\|e_1. \tag{2.71}$$
Again, to avoid accuracy problems, we choose the plus sign in the above formula, i.e.,

2 H iÂ1 T iÂ1 w x e x e1 x e x e1 k k D 2C k k C k k2 x x1 x x1 x x 2 x x x1 (2.73) Dk k C j jk k C j jk kCk k D k k k k C j j and 

H H iÂ1 T 2 iÂ1 w x x e x e x x e x1 x x x x1 : (2.74) D C k k 1 Dk k C k kDk k k k C j j Therefore,   2wH x 1; (2.75) w 2 D k k and hence iÂ1 iÂ1 Ux x w x x e x e1 e x e1: (2.76) D D C k k D k k This alternate form for the Householder matrix has the advan tage that it is Hermitian and that the multiplier of wwH is real. However, it can’t in general map a given vector x into a real multiple of e1. Both EISPACK and LINPACK use elementary unitary matrices similar to this. The LAPACK form is not Hermitian, involves a complex multiplier of wwH , but can produce a real multiple of e1 when acting on x. As stated before, this can be a big advantage when reducing matrices to triangular form prior to an eigenvalue computation.

25 2.3.4 Givens Rotations

Householder matrices are very good at producing long strings of zeroes in a row or column. Sometimes, however, we want to produce a zero in a matrix while altering as little of the matrix as possible. This is true when dealing with matrices that are very sparse (most of the elements are already zero) or when performing many operations in parallel. The Givens rotations can sometimes be used for this purpose. We will begin by considering the case where all matrices and vectors are real. The complex case will be considered in the next section.

The two-dimensional matrix
$$R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$
rotates a 2-vector through an angle $\theta$. If we let $c = \cos\theta$ and $s = \sin\theta$, then the matrix $R$ can be written as
$$R = \begin{pmatrix} c & s \\ -s & c \end{pmatrix}$$
where $c^2 + s^2 = 1$. If $x$ is a 2-vector, we can determine $c$ and $s$ so that $Rx$ is a multiple of $e_1$. Since
$$Rx = \begin{pmatrix} cx_1 + sx_2 \\ -sx_1 + cx_2 \end{pmatrix},$$
$R$ will have the desired property if $c = x_1/\sqrt{x_1^2 + x_2^2}$ and $s = x_2/\sqrt{x_1^2 + x_2^2}$. In fact $Rx = \sqrt{x_1^2 + x_2^2}\,e_1$.

Givens matrices are an extension of this two-dimensional rotation to higher dimensions. For $j > i$, the Givens matrix $G(i,j)$ is an $m \times m$ matrix that performs a rotation in the $(i,j)$ coordinate plane. It can be obtained by replacing the $(i,i)$ and $(j,j)$ components of the $m \times m$ identity matrix by $c$, the $(i,j)$ component by $s$ and the $(j,i)$ component by $-s$. It has the matrix form

$$G(i,j) = \begin{pmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & c & & s & & \\
& & & \ddots & & & \\
& & -s & & c & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{pmatrix} \tag{2.77}$$
where the entries $c$ lie in the $(i,i)$ and $(j,j)$ positions and the entries $s$ and $-s$ lie in the $(i,j)$ and $(j,i)$ positions, respectively,

and $c^2 + s^2 = 1$. The matrix $G(i,j)$ is clearly orthogonal. In terms of components
$$G(i,j)_{kl} = \begin{cases} 1 & k = l,\ k \ne i \text{ and } k \ne j \\ c & k = l,\ k = i \text{ or } k = j \\ s & k = i,\ l = j \\ -s & k = j,\ l = i \\ 0 & \text{otherwise.} \end{cases} \tag{2.78}$$

Multiplying a vector by $G(i,j)$ only affects the $i$ and $j$ components. If $y = G(i,j)x$, then
$$y_k = \begin{cases} x_k & k \ne i \text{ and } k \ne j \\ cx_i + sx_j & k = i \\ -sx_i + cx_j & k = j. \end{cases} \tag{2.79}$$

Suppose we want to make $y_j = 0$. We can do this by setting
$$c = \frac{x_i}{\sqrt{x_i^2 + x_j^2}} \quad \text{and} \quad s = \frac{x_j}{\sqrt{x_i^2 + x_j^2}}. \tag{2.80}$$
With this choice for $c$ and $s$, $y$ becomes

$$y_k = \begin{cases} x_k & k \ne i \text{ and } k \ne j \\ \sqrt{x_i^2 + x_j^2} & k = i \\ 0 & k = j. \end{cases} \tag{2.81}$$
Multiplying a matrix $A$ on the left by $G(i,j)$ only alters rows $i$ and $j$. Similarly, multiplying $A$ on the right by $G(i,j)$ only alters columns $i$ and $j$.
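A minimal NumPy sketch of equations (2.79)-(2.81); the helper names `givens` and `apply_givens` and the test vector are only illustrative.

```python
import numpy as np

def givens(xi, xj):
    """Compute c, s with c^2 + s^2 = 1 so that the rotation in (2.79) zeroes the j component,
    as in equation (2.80)."""
    r = np.hypot(xi, xj)
    if r == 0.0:
        return 1.0, 0.0
    return xi / r, xj / r

def apply_givens(x, i, j, c, s):
    """Apply G(i, j) to a vector in place, touching only components i and j."""
    xi, xj = x[i], x[j]
    x[i] = c * xi + s * xj
    x[j] = -s * xi + c * xj
    return x

x = np.array([4.0, 3.0, 2.0])
c, s = givens(x[0], x[2])
print(apply_givens(x, 0, 2, c, s))   # the third component is driven to zero
```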

2.3.5 Complex Givens Rotations

For the complex case we replace R in the previous section by

$$R = \begin{pmatrix} c & \bar s \\ -s & c \end{pmatrix} \quad \text{where } c \text{ is real.} \tag{2.82}$$

$$c^2 + |s|^2 = 1.$$

Given a 2-vector $x$, we want to choose $R$ so that $Rx$ is a multiple of $e_1$. For $R$ unitary, we must have
$$Rx = \gamma\|x\|e_1 \quad \text{where } |\gamma| = 1. \tag{2.83}$$

Multiplying equation (2.83) by $R^H$, we get

$$x = R^H Rx = \gamma\|x\|R^H e_1 = \gamma\|x\|\begin{pmatrix} c \\ s \end{pmatrix} \tag{2.84}$$
or
$$c = \frac{x_1}{\gamma\|x\|} \quad \text{and} \quad s = \frac{x_2}{\gamma\|x\|}. \tag{2.85}$$
We define $\operatorname{sign}(u)$ for $u$ complex by

$$\operatorname{sign}(u) = \begin{cases} u/|u| & u \ne 0 \\ 1 & u = 0. \end{cases} \tag{2.86}$$
If $c$ is to be real, $\gamma$ must have the form

$$\gamma = \mu\operatorname{sign}(x_1), \qquad \mu = \pm 1.$$
With this choice of $\gamma$, $c$ and $s$ become

$$c = \frac{|x_1|}{\mu\|x\|} \quad \text{and} \quad s = \frac{x_2}{\mu\operatorname{sign}(x_1)\|x\|}. \tag{2.87}$$
If we want the complex case to reduce to the real case when $x_1$ and $x_2$ are real, then we can choose $\mu = \operatorname{sign}\bigl(\operatorname{Re}(x_1)\bigr)$. As before, we can construct $G(i,j)$ by replacing the $(i,i)$ and $(j,j)$ components of the identity matrix by $c$, the $(i,j)$ component by $\bar s$, and the $(j,i)$ component by $-s$. In the expressions for $c$ and $s$ in equation (2.87), we replace $x_1$ by $x_i$, $x_2$ by $x_j$, and $\|x\|$ by $\sqrt{|x_i|^2 + |x_j|^2}$.

2.3.6 QR Factorization Using Householder Reflectors

Let $A$ be an $m \times n$ matrix with $m > n$. Let $Q_1$ be a Householder matrix that maps the first column of $A$ into a multiple of $e_1$. Then $Q_1 A$ will have zeroes below the diagonal in the first column. Now let
$$Q_2 = \begin{pmatrix} 1 & 0 \\ 0 & \hat Q_2 \end{pmatrix}$$
where $\hat Q_2$ is an $(m-1)\times(m-1)$ Householder matrix that will zero out the entries below the diagonal in the second column of $Q_1 A$. Continuing in this manner, we can construct $Q_2,\dots,Q_n$ so that
$$Q_n\cdots Q_1 A = \begin{pmatrix} \hat R \\ 0 \end{pmatrix} \tag{2.88}$$
where $\hat R$ is an $n \times n$ triangular matrix. The matrices $Q_k$ have the form
$$Q_k = \begin{pmatrix} I & 0 \\ 0 & \hat Q_k \end{pmatrix} \tag{2.89}$$
where $\hat Q_k$ is an $(m-k+1)\times(m-k+1)$ Householder matrix. If we define

$$Q^H = Q_n\cdots Q_1 \quad \text{and} \quad R = \begin{pmatrix} \hat R \\ 0 \end{pmatrix}, \tag{2.90}$$
then equation (2.88) can be written
$$Q^H A = R. \tag{2.91}$$
Moreover, since each $Q_k$ is unitary, we have

$$Q^H Q = (Q_n\cdots Q_1)(Q_1^H\cdots Q_n^H) = I, \tag{2.92}$$
i.e., $Q$ is unitary. Therefore, equation (2.91) can be written

$$A = QR. \tag{2.93}$$
Equation (2.93) is the desired factorization. The operations count for this factorization is approximately $mn^2$ where an operation is an addition and a multiplication. In practice it is not necessary to construct the matrix $Q$ explicitly. Usually only the vectors $v$ defining each $Q_k$ are saved.

If $\hat Q$ is the matrix consisting of the first $n$ columns of $Q$, then
$$A = \hat Q\hat R \tag{2.94}$$
where $\hat Q$ is an $m \times n$ matrix with orthonormal columns and $\hat R$ is an $n \times n$ upper triangular matrix. The factorization in equation (2.94) is the reduced QR factorization.
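For reference, NumPy exposes both forms of the factorization; the random test matrix below is only illustrative.

```python
import numpy as np

A = np.random.default_rng(2).standard_normal((6, 4))

# Reduced factorization: Q_hat is 6x4, R_hat is 4x4 upper triangular.
Q_hat, R_hat = np.linalg.qr(A)                  # mode='reduced' is the default
# Full factorization: Q is 6x6 unitary, R is 6x4 with zeros below row 4.
Q, R = np.linalg.qr(A, mode='complete')

print(np.allclose(Q_hat @ R_hat, A))            # True
print(np.allclose(Q.T @ Q, np.eye(6)))          # True
print(np.allclose(np.tril(R_hat, -1), 0))       # R_hat is upper triangular
```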

2.3.7 Uniqueness of the Reduced QR Factorization

In this section we will show that a matrix $A$ of full rank has a unique reduced QR factorization if we require that the triangular matrix $\hat R$ has positive diagonals. All other reduced QR factorizations of $A$ are simply related to this one with positive diagonals.

The reduced QR factorization can be written

$$A = (a_1, a_2,\dots,a_n) = (q_1, q_2,\dots,q_n)\begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & \cdots & r_{2n} \\ & & \ddots & \vdots \\ & & & r_{nn} \end{pmatrix}. \tag{2.95}$$

If $A$ has full rank, then all of the diagonal elements $r_{jj}$ must be nonzero. Equating columns in equation (2.95), we get
$$a_j = \sum_{k=1}^{j} r_{kj}q_k = r_{jj}q_j + \sum_{k=1}^{j-1} r_{kj}q_k$$
or
$$q_j = \frac{1}{r_{jj}}\Bigl(a_j - \sum_{k=1}^{j-1} r_{kj}q_k\Bigr). \tag{2.96}$$
When $j = 1$ equation (2.96) reduces to
$$q_1 = \frac{a_1}{r_{11}}. \tag{2.97}$$

Since q1 must have unit norm, it follows that

$$|r_{11}| = \|a_1\|. \tag{2.98}$$

Equations (2.97) and (2.98) determine $q_1$ and $r_{11}$ up to a factor having absolute value one, i.e., there is a $d_1$ with $|d_1| = 1$ such that
$$r_{11} = d_1\hat r_{11}, \qquad q_1 = \frac{\hat q_1}{d_1}$$

where $\hat r_{11} = \|a_1\|$ and $\hat q_1 = a_1/\hat r_{11}$. For $j = 2$, equation (2.96) becomes
$$q_2 = \frac{1}{r_{22}}\bigl(a_2 - r_{12}q_1\bigr).$$

Since the columns q1 and q2 must be orthonormal, it follows that

$$0 = q_1^H q_2 = \frac{1}{r_{22}}\bigl(q_1^H a_2 - r_{12}\bigr)$$
and hence that
$$r_{12} = q_1^H a_2 = d_1\hat q_1^H a_2. \tag{2.99}$$
Here we have used the fact that $\bar d_1 = 1/d_1$. Since $q_2$ has unit norm, it follows that

$$1 = \|q_2\| = \frac{1}{|r_{22}|}\|a_2 - r_{12}q_1\| = \frac{1}{|r_{22}|}\|a_2 - (d_1\hat q_1^H a_2)\hat q_1/d_1\| = \frac{1}{|r_{22}|}\|a_2 - (\hat q_1^H a_2)\hat q_1\|$$
and hence that
$$|r_{22}| = \|a_2 - (\hat q_1^H a_2)\hat q_1\| \equiv \hat r_{22}.$$
Therefore, there exists a scalar $d_2$ with $|d_2| = 1$ such that

$$r_{22} = d_2\hat r_{22} \quad \text{and} \quad q_2 = \hat q_2/d_2$$
where $\hat q_2 = \bigl(a_2 - (\hat q_1^H a_2)\hat q_1\bigr)/\hat r_{22}$. For $j = 3$, equation (2.96) becomes
$$q_3 = \frac{1}{r_{33}}\bigl(a_3 - r_{13}q_1 - r_{23}q_2\bigr).$$

30 Since the columns q1, q2 and q3 must be orthonormal, it follows that

$$0 = q_1^H q_3 = \frac{1}{r_{33}}\bigl(q_1^H a_3 - r_{13}\bigr), \qquad 0 = q_2^H q_3 = \frac{1}{r_{33}}\bigl(q_2^H a_3 - r_{23}\bigr)$$
and hence that

$$r_{13} = q_1^H a_3 = d_1\hat q_1^H a_3, \qquad r_{23} = q_2^H a_3 = d_2\hat q_2^H a_3.$$

Since q3 has unit norm, it follows that

$$1 = \|q_3\| = \frac{1}{|r_{33}|}\|a_3 - r_{13}q_1 - r_{23}q_2\| = \frac{1}{|r_{33}|}\|a_3 - (\hat q_1^H a_3)\hat q_1 - (\hat q_2^H a_3)\hat q_2\|$$
and hence that
$$|r_{33}| = \|a_3 - (\hat q_1^H a_3)\hat q_1 - (\hat q_2^H a_3)\hat q_2\| \equiv \hat r_{33}.$$
Therefore, there exists a scalar $d_3$ with $|d_3| = 1$ such that

$$r_{33} = d_3\hat r_{33} \quad \text{and} \quad q_3 = \hat q_3/d_3 \tag{2.100}$$
where $\hat q_3 = \bigl(a_3 - (\hat q_1^H a_3)\hat q_1 - (\hat q_2^H a_3)\hat q_2\bigr)/\hat r_{33}$. Continuing in this way we obtain the matrix $\hat Q = (\hat q_1,\dots,\hat q_n)$ with orthonormal columns and the triangular matrix

$$\hat R = \begin{pmatrix} \hat r_{11} & \hat r_{12} & \cdots & \hat r_{1n} \\ & \hat r_{22} & \cdots & \hat r_{2n} \\ & & \ddots & \vdots \\ & & & \hat r_{nn} \end{pmatrix}$$
such that $A = \hat Q\hat R$ is the unique reduced QR factorization of $A$ with $\hat R$ having positive diagonal elements. If $A = QR$ is any other reduced QR factorization of $A$, then

$$R = \begin{pmatrix} d_1 & & \\ & \ddots & \\ & & d_n \end{pmatrix}\hat R$$

and
$$Q = \hat Q\begin{pmatrix} 1/d_1 & & \\ & \ddots & \\ & & 1/d_n \end{pmatrix} = \hat Q\begin{pmatrix} \bar d_1 & & \\ & \ddots & \\ & & \bar d_n \end{pmatrix}$$

where $|d_1| = \cdots = |d_n| = 1$.

31 2.3.8 Solution of Least Squares Problems

In this section we will show how to use the QR factorization to solve the least squares problem. Consider the system of linear equations
$$Ax = b \tag{2.101}$$
where $A$ is an $m \times n$ matrix with $m > n$. In general there is no solution to this system of equations. Instead we seek to find an $x$ so that $\|Ax - b\|$ is as small as possible. In view of the QR factorization, we have

$$\|Ax - b\|^2 = \|QRx - b\|^2 = \|Q(Rx - Q^H b)\|^2 = \|Rx - Q^H b\|^2. \tag{2.102}$$

We can write $Q$ in the partitioned form $Q = (Q_1, Q_2)$ where $Q_1$ is an $m \times n$ matrix. Then
$$Rx - Q^H b = \begin{pmatrix} \hat Rx \\ 0 \end{pmatrix} - \begin{pmatrix} Q_1^H b \\ Q_2^H b \end{pmatrix} = \begin{pmatrix} \hat Rx - Q_1^H b \\ -Q_2^H b \end{pmatrix}. \tag{2.103}$$
It follows from equation (2.103) that

$$\|Rx - Q^H b\|^2 = \|\hat Rx - Q_1^H b\|^2 + \|Q_2^H b\|^2. \tag{2.104}$$
Combining equations (2.102) and (2.104), we get

$$\|Ax - b\|^2 = \|\hat Rx - Q_1^H b\|^2 + \|Q_2^H b\|^2. \tag{2.105}$$
It can be easily seen from this equation that $\|Ax - b\|$ is minimized when $x$ is the solution of the triangular system
$$\hat Rx = Q_1^H b \tag{2.106}$$
when such a solution exists. This is the standard way of solving least squares systems. Later we will discuss the singular value decomposition (SVD), which provides even more information relative to the least squares problem. However, the SVD is much more expensive to compute than the QR decomposition.
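A short sketch of this procedure using the reduced QR factorization from NumPy and a triangular solve from SciPy; the random test data are for illustration only.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

# Reduced QR, then solve the triangular system R_hat x = Q_hat^H b, cf. (2.106).
Q_hat, R_hat = np.linalg.qr(A)
x = solve_triangular(R_hat, Q_hat.T @ b, lower=False)

# Compare with the library least-squares solver.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ref))   # True
```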

2.4 The Singular Value Decomposition

The Singular Value Decomposition (SVD) is one of the most important and probably one of the least well known of the matrix factorizations. It has many applications in statistics, signal process- ing, image compression, pattern recognition, weather prediction, and modal analysis to name a few. It is also a powerful diagnostic tool. For example, it provides approximations to the rank and the condition number of a matrix as well as providing orthonormal bases for both the range and the null space of a matrix. It also provides optimal low rank approximations to a matrix. The SVD is applicable to both square and rectangular matrices. In this regard it provides a general solution to the least squares problem.

32 The SVD was first discovered by differential geometers in connection with the analysis of bilinear forms. Eugenio Beltrami [1] (1873) and Camille Jordan [10] (1874) independently discovered that the singular values of the matrix associated with a bilinear form comprise a complete set of invariants for the form under orthogonal substitutions. The first proof of the singular value decomposition for rectangular and complex matrices seems to be by Eckart and Young [5] in 1939. They saw it as a generalization of the principal axis transformation for Hermitian matrices.

We will begin by deriving the SVD and presenting some of its most important properties. We will then discuss its application to least squares problems and matrix approximation problems. Follow- ing this we will show how singular values can be used to determine the condition of a matrix (how close the rows or columns are to being linearly dependent). We will conclude with a brief outline of the methods used to compute the SVD. Most of the methods are modifications of methods used to compute eigenvalues and vectors of a square matrix. The details of the computational methods are beyond the scope of this presentation, but we will provide references for those interested.

2.4.1 Derivation and Properties of the SVD

Theorem 1 (Singular Value Decomposition). Let $A$ be a nonzero $m \times n$ matrix. Then there exist an orthonormal basis $u_1,\dots,u_m$ of m-vectors, an orthonormal basis $v_1,\dots,v_n$ of n-vectors, and positive numbers $\sigma_1,\dots,\sigma_r$ such that

1. $u_1,\dots,u_r$ is a basis of the range of $A$

2. $v_{r+1},\dots,v_n$ is a basis of the null space of $A$

3. $A = \sum_{k=1}^{r}\sigma_k u_k v_k^H$.

H H H 2 H .Avi / Avj v A Avj  v vj 0 i j; (2.108) D i D j i D ¤ i.e., Av1;:::;Avn are orthogonal. When i j D 2 H H 2 H 2 Avi v A Avi  v vi  > 0 i 1;:::;r k k D i D i i D i D 0 i > r: (2.109) D Thus, AvrC1 Avn 0 and hence vrC1;:::;vn belong to the null space of A. Define D  D D u1;:::;ur by ui .1=i /Avi i 1;:::;r: (2.110) D D 33 Then u1;:::;ur is an orthonormal set of vectors in the range of A that span the range of A. Thus, u1;:::;ur is a basis for the range of A. The dimension r of the range of A is called the rank of A. If r

n r r H H H Ax .v x/Avk .v x/kuk kukv x: (2.112) D k D k D k kXD1 kXD1 XkD1 Since x in equation (2.112) was arbitrary, we must have

r H A kukv : (2.113) D k XkD1 The representation of A in equation (2.113) is called the singular value decomposition (SVD). If x belongs to the null space of A (Ax 0), then it follows from equation (2.112) and the linear D H independence of the vectors u1;:::;ur that vk x 0 for k 1;:::;r. It then follows from equation (2.111) that D D n H x .v x/vk; D k kDXrC1 i.e., vrC1;:::;vn span the null space of A. Since vrC1;:::;vn are orthonormal vectors belonging to the null space of A, they form a basis for the null space of A.

We will now express the SVD in matrix form. Define $U = (u_1,\dots,u_m)$, $V = (v_1,\dots,v_n)$, and $S = \operatorname{diag}(\sigma_1,\dots,\sigma_r)$. If $r < \min(m,n)$, then the SVD can be written in the matrix form
$$A = U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H. \tag{2.114}$$
If $r = m < n$, $r = n < m$, or $r = m = n$, the corresponding blocks of zeros in this form are absent.
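A small NumPy illustration of the decomposition and of the rank-$r$ sum in the theorem; the test matrix is chosen (for illustration only) to have rank 2.

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])          # a 4x3 matrix of rank 2

U, s, Vh = np.linalg.svd(A)              # full matrices: U is 4x4, Vh is 3x3
r = np.sum(s > 1e-12 * s[0])             # numerical rank from the singular values
print(r, s)                              # 2, singular values in decreasing order

# Rebuild A from the rank-r sum  sigma_k u_k v_k^H  of the theorem.
A_rebuilt = sum(s[k] * np.outer(U[:, k], Vh[k, :]) for k in range(r))
print(np.allclose(A_rebuilt, A))         # True
```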

We next give a geometric interpretation of the SVD. For this purpose we will restrict ourselves to the real case. Let $x$ be a point on the unit sphere, i.e., $\|x\| = 1$. Since $u_1,\dots,u_r$ is a basis for the range of $A$, there exist numbers $y_1,\dots,y_r$ such that

$$Ax = \sum_{k=1}^{r} y_k u_k = \sum_{k=1}^{r}\sigma_k(v_k^T x)u_k.$$
Therefore, $y_k = \sigma_k(v_k^T x)$, $k = 1,\dots,r$. Since the columns of $V$ form an orthonormal basis, we have
$$x = \sum_{k=1}^{n}(v_k^T x)v_k.$$
Therefore,
$$\|x\|^2 = \sum_{k=1}^{n}(v_k^T x)^2 = 1.$$
It follows that
$$\frac{y_1^2}{\sigma_1^2} + \cdots + \frac{y_r^2}{\sigma_r^2} = (v_1^T x)^2 + \cdots + (v_r^T x)^2 \le 1.$$
Here equality holds when $r = n$. Thus, the image of $x$ lies on or interior to the hyperellipsoid with semiaxes $\sigma_1 u_1,\dots,\sigma_r u_r$. Conversely, if $y_1,\dots,y_r$ satisfy

$$\frac{y_1^2}{\sigma_1^2} + \cdots + \frac{y_r^2}{\sigma_r^2} \le 1,$$

we define $\alpha^2 = 1 - \sum_{k=1}^{r}(y_k/\sigma_k)^2$ and
$$x = \sum_{k=1}^{r}\frac{y_k}{\sigma_k}v_k + \alpha v_{r+1}.$$

Since $v_{r+1}$ is in the null space of $A$ and $Av_k = \sigma_k u_k$ ($k \le r$), it follows that
$$Ax = \sum_{k=1}^{r}\frac{y_k}{\sigma_k}Av_k + \alpha Av_{r+1} = \sum_{k=1}^{r} y_k u_k.$$
In addition,
$$\|x\|^2 = \sum_{k=1}^{r}\frac{y_k^2}{\sigma_k^2} + \alpha^2 = 1.$$

Thus, we have shown that the image of the unit sphere $\|x\| = 1$ under the mapping $A$ is the hyperellipsoid
$$\frac{y_1^2}{\sigma_1^2} + \cdots + \frac{y_r^2}{\sigma_r^2} \le 1$$

relative to the basis $u_1,\dots,u_r$. When $r = n$, equality holds and the image is the surface of the hyperellipsoid
$$\frac{y_1^2}{\sigma_1^2} + \cdots + \frac{y_n^2}{\sigma_n^2} = 1.$$

2.4.2 The SVD and Least Squares Problems

In least squares problems we seek an $x$ that minimizes $\|Ax - b\|$. In view of the singular value decomposition, we have

$$\|Ax - b\|^2 = \Bigl\|U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H x - b\Bigr\|^2 = \Bigl\|U\Bigl[\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H x - U^H b\Bigr]\Bigr\|^2 = \Bigl\|\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H x - U^H b\Bigr\|^2. \tag{2.118}$$

If we define
$$y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = V^H x \tag{2.119}$$
and
$$\hat b = \begin{pmatrix} \hat b_1 \\ \hat b_2 \end{pmatrix} = U^H b, \tag{2.120}$$
then equation (2.118) can be written

$$\|Ax - b\|^2 = \Bigl\|\begin{pmatrix} Sy_1 - \hat b_1 \\ -\hat b_2 \end{pmatrix}\Bigr\|^2 = \|Sy_1 - \hat b_1\|^2 + \|\hat b_2\|^2. \tag{2.121}$$

It is clear from equation (2.121) that $\|Ax - b\|$ is minimized when $y_1 = S^{-1}\hat b_1$. Therefore, the $y$ that minimizes $\|Ax - b\|$ is given by
$$y = \begin{pmatrix} S^{-1}\hat b_1 \\ y_2 \end{pmatrix}, \qquad y_2 \text{ arbitrary.} \tag{2.122}$$
In view of equation (2.119), the $x$ that minimizes $\|Ax - b\|$ is given by
$$x = Vy = V\begin{pmatrix} S^{-1}\hat b_1 \\ y_2 \end{pmatrix}, \qquad y_2 \text{ arbitrary.} \tag{2.123}$$
Since $V$ is unitary, it follows from equation (2.123) that

$$\|x\|^2 = \|S^{-1}\hat b_1\|^2 + \|y_2\|^2.$$
Thus, there is a unique $x$ of minimum norm that minimizes $\|Ax - b\|$, namely the $x$ corresponding to $y_2 = 0$. This $x$ is given by
$$x = V\begin{pmatrix} S^{-1}\hat b_1 \\ 0 \end{pmatrix} = V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \hat b_1 \\ \hat b_2 \end{pmatrix} = V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H b.$$
The matrix multiplying $b$ on the right-hand side of this equation is called the generalized inverse of $A$ and is denoted by $A^+$, i.e.,

$$A^+ = V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H. \tag{2.124}$$
Thus, the minimum norm solution of the least squares problem is given by $x = A^+ b$. The $n \times m$ matrix $A^+$ plays the same role in least squares problems that $A^{-1}$ plays in the solution of linear equations. We will now show that this definition of the generalized inverse gives the same result as the classical Moore-Penrose conditions.
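The generalized inverse in equation (2.124) can be assembled directly from the SVD. The sketch below does this for a full-column-rank example (so all singular values are nonzero) and checks it against NumPy's `pinv` and least squares solver; the test data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((8, 5))
b = rng.standard_normal(8)

# Build A+ from the SVD as in (2.124); here r = n, so S^{-1} is just 1/s.
U, s, Vh = np.linalg.svd(A, full_matrices=False)
A_plus = (Vh.T / s) @ U.T
print(np.allclose(A_plus, np.linalg.pinv(A)))        # True

# x = A+ b is the minimum-norm least squares solution.
x = A_plus @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_lstsq))                       # True
```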

Theorem 2. If A has a singular value decomposition given by

$$A = U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H,$$
then the matrix $X$ defined by
$$X = A^+ = V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H$$
is the unique solution of the Moore-Penrose conditions:

1. $AXA = A$
2. $XAX = X$
3. $(AX)^H = AX$
4. $(XA)^H = XA$.

Proof:
$$AXA = U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H = U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}V^H = U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H = A,$$
i.e., $X$ satisfies condition (1).
$$XAX = V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H = V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H = X,$$
i.e., $X$ satisfies condition (2). Since
$$AX = U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H = U\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}U^H$$
and
$$XA = V\begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix}U^H U\begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}V^H = V\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}V^H,$$
it follows that both $AX$ and $XA$ are Hermitian, i.e., $X$ satisfies conditions (3) and (4). To show uniqueness let us suppose that both $X$ and $Y$ satisfy the Moore-Penrose conditions. Then
$$\begin{aligned}
X &= XAX && \text{by (2)}\\
&= X(AX)^H = XX^H A^H && \text{by (3)}\\
&= XX^H(AYA)^H = XX^H A^H Y^H A^H && \text{by (1)}\\
&= XX^H A^H(AY)^H = XX^H A^H AY && \text{by (3)}\\
&= X(AX)^H AY = XAXAY && \text{by (3)}\\
&= XAY && \text{by (2)}\\
&= X(AYA)Y && \text{by (1)}\\
&= XA(YA)Y = XA(YA)^H Y = XAA^H Y^H Y && \text{by (4)}\\
&= (XA)^H A^H Y^H Y = A^H X^H A^H Y^H Y && \text{by (4)}\\
&= (AXA)^H Y^H Y = A^H Y^H Y && \text{by (1)}\\
&= (YA)^H Y = YAY && \text{by (4)}\\
&= Y && \text{by (2).}
\end{aligned}$$
Thus, there is only one matrix $X$ satisfying the Moore-Penrose conditions.

38 2.4.3 Singular Values and the Norm of a Matrix

Let $A$ be an $m \times n$ matrix. By virtue of the SVD, we have
$$Ax = \sum_{k=1}^{r}\sigma_k(v_k^H x)u_k \quad \text{for any n-vector } x. \tag{2.125}$$
Since the vectors $u_1,\dots,u_r$ are orthonormal, we have

$$\|Ax\|^2 = \sum_{k=1}^{r}\sigma_k^2|v_k^H x|^2 \le \sigma_1^2\sum_{k=1}^{r}|v_k^H x|^2 \le \sigma_1^2\|x\|^2. \tag{2.126}$$
The last inequality comes from the fact that $x$ has the expansion $x = \sum_{k=1}^{n}(v_k^H x)v_k$ in terms of the orthonormal basis $v_1,\dots,v_n$ and hence
$$\|x\|^2 = \sum_{k=1}^{n}|v_k^H x|^2.$$
Thus, we have
$$\|Ax\| \le \sigma_1\|x\| \quad \text{for all } x. \tag{2.127}$$
Since $Av_1 = \sigma_1 u_1$, we have $\|Av_1\| = \sigma_1 = \sigma_1\|v_1\|$. Hence,
$$\max_{x\ne 0}\frac{\|Ax\|}{\|x\|} = \sigma_1, \tag{2.128}$$
i.e., $A$ can't stretch the length of a vector by a factor greater than $\sigma_1$. One of the definitions of the norm of a matrix is
$$\|A\| = \sup_{x\ne 0}\frac{\|Ax\|}{\|x\|}. \tag{2.129}$$
It follows from equations (2.128) and (2.129) that $\|A\| = \sigma_1$ (the maximum singular value of $A$). If $A$ is of full rank ($r = n$), then it follows by a similar argument that
$$\min_{x\ne 0}\frac{\|Ax\|}{\|x\|} = \sigma_n.$$
If $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, then for every p-vector $x$ we have
$$\|ABx\| \le \|A\|\,\|Bx\| \le \|A\|\,\|B\|\,\|x\|$$
and hence $\|AB\| \le \|A\|\,\|B\|$.
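A quick numerical check of equations (2.128)-(2.129); the random test matrix is only an example.

```python
import numpy as np

A = np.random.default_rng(4).standard_normal((7, 5))
U, s, Vh = np.linalg.svd(A)

# The matrix 2-norm equals the largest singular value, cf. (2.128)-(2.129).
print(np.isclose(np.linalg.norm(A, 2), s[0]))                 # True
# The first right singular vector attains the maximum stretch ...
print(np.isclose(np.linalg.norm(A @ Vh[0]), s[0]))            # True
# ... and the last one attains the minimum stretch sigma_n for a full-rank A.
print(np.isclose(np.linalg.norm(A @ Vh[-1]), s[-1]))          # True
```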

2.4.4 Low Rank Matrix Approximations

You can think of the rank of a matrix as a measure of redundancy. Matrices of low rank should have lots of redundancy and hence should be capable of specification by fewer parameters than the

39 total number of entries. For example, if the matrix consists of the pixel values of a digital image, then a lower rank approximation of this image should represent a form of image compression. We will make this concept more precise in this section.

One choice for a low rank approximation to A is the matrix A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^H for k < r. A_k is a truncated SVD expansion of A. Clearly

A - A_k = \sum_{i=k+1}^{r} \sigma_i u_i v_i^H. (2.130)

Since the largest singular value of A - A_k is \sigma_{k+1}, we have

\|A - A_k\| = \sigma_{k+1}. (2.131)

Suppose B is another m \times n matrix of rank k. Then the null space N of B has dimension n - k. Let w_1,\ldots,w_{n-k} be a basis for N. The n + 1 n-vectors w_1,\ldots,w_{n-k}, v_1,\ldots,v_{k+1} must be linearly dependent, i.e., there are constants \alpha_1,\ldots,\alpha_{n-k} and \beta_1,\ldots,\beta_{k+1}, not all zero, such that

\sum_{i=1}^{n-k} \alpha_i w_i + \sum_{i=1}^{k+1} \beta_i v_i = 0.

Not all of the \alpha_i can be zero since v_1,\ldots,v_{k+1} are linearly independent. Similarly, not all of the \beta_i can be zero. Therefore, the vector h defined by

h = \sum_{i=1}^{n-k} \alpha_i w_i = -\sum_{i=1}^{k+1} \beta_i v_i

is a nonzero vector that belongs to both N and < v_1,\ldots,v_{k+1} >. By proper scaling, we can assume that h is a vector with unit norm. Since h belongs to < v_1,\ldots,v_{k+1} >, we have

h = \sum_{i=1}^{k+1} (v_i^H h) v_i. (2.132)

Therefore,

\|h\|^2 = \sum_{i=1}^{k+1} |v_i^H h|^2. (2.133)

Since Av_i = \sigma_i u_i for i = 1,\ldots,r, it follows from equation (2.132) that

Ah = \sum_{i=1}^{k+1} (v_i^H h) A v_i = \sum_{i=1}^{k+1} (v_i^H h) \sigma_i u_i. (2.134)

Therefore,

\|Ah\|^2 = \sum_{i=1}^{k+1} |v_i^H h|^2 \sigma_i^2 \ge \sigma_{k+1}^2 \sum_{i=1}^{k+1} |v_i^H h|^2 = \sigma_{k+1}^2 \|h\|^2. (2.135)

Since h belongs to the null space N, we have

\|A - B\|^2 \ge \|(A - B)h\|^2 = \|Ah\|^2 \ge \sigma_{k+1}^2 \|h\|^2 = \sigma_{k+1}^2. (2.136)

Combining equations (2.131) and (2.136), we obtain

\|A - B\| \ge \sigma_{k+1} = \|A - A_k\|. (2.137)

Thus, A_k is the rank k matrix that is closest to A.
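Below is a short NumPy sketch of the truncated SVD approximation A_k together with a check of equation (2.131); the helper name and the random test matrix are illustrative assumptions, not from the text.

```python
import numpy as np

def truncated_svd(A, k):
    """Best rank-k approximation A_k = sum_{i<=k} sigma_i u_i v_i^H."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 40))
s = np.linalg.svd(A, compute_uv=False)
A5 = truncated_svd(A, 5)
# ||A - A_5||_2 equals the sixth singular value, as in equation (2.131)
print(np.linalg.norm(A - A5, 2), s[5])
```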

2.4.5 The Condition Number of a Matrix

Suppose A is an n \times n nonsingular matrix and x is the solution of the system of equations Ax = b. We want to see how sensitive x is to perturbations of the matrix A. Let x + \delta x be the solution to the perturbed system (A + \delta A)(x + \delta x) = b. Expanding the left-hand side of this equation and neglecting the second order perturbation \delta A\,\delta x, we get

\delta A\, x + A\, \delta x = 0   or   \delta x = -A^{-1} \delta A\, x. (2.138)

It follows from equation (2.138) that

\|\delta x\| \le \|A^{-1}\| \|\delta A\| \|x\|

or

\frac{\|\delta x\| / \|x\|}{\|\delta A\| / \|A\|} \le \|A^{-1}\| \|A\|. (2.139)

The quantity \|A^{-1}\| \|A\| is called the condition number of A and is denoted by \kappa(A), i.e.,

\kappa(A) = \|A^{-1}\| \|A\|.

Thus, equation (2.139) can be written

\frac{\|\delta x\| / \|x\|}{\|\delta A\| / \|A\|} \le \kappa(A). (2.140)

We have seen previously that \|A\| = \sigma_1, the largest singular value. Since A^{-1} has the singular value decomposition A^{-1} = V S^{-1} U^H, it follows that \|A^{-1}\| = 1/\sigma_n. Therefore, the condition number is given by

\kappa(A) = \frac{\sigma_1}{\sigma_n}. (2.141)

The condition number is a sort of aspect ratio of the hyperellipsoid that A maps the unit sphere into.
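A small NumPy illustration of equation (2.141); the Hilbert-type test matrix is simply a well-known ill-conditioned example and is my own choice.

```python
import numpy as np

def condition_number(A):
    """kappa(A) = sigma_1 / sigma_n, as in equation (2.141)."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

# A Hilbert-like matrix is a classic ill-conditioned example (illustrative choice)
n = 6
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
print(condition_number(H), np.linalg.cond(H, 2))   # the two agree
```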

2.4.6 Computation of the SVD

The methods for calculating the SVD are all variations of methods used to calculate eigenvalues and eigenvectors of Hermitian matrices. The most natural procedure would be to follow the derivation of the SVD and compute the squares of the singular values and the unitary matrix V by solving the eigenproblem for A^H A. The U matrix would then be obtained from AV. Unfortunately, this procedure is not very accurate due to the fact that the singular values of A^H A are the squares of the singular values of A. As a result the ratio of largest to smallest singular value can be much larger for A^H A than for A. There are, however, implicit methods that solve the eigenproblem for A^H A without ever explicitly forming A^H A. Most of the SVD algorithms first reduce A to bidiagonal form (all elements zero except the diagonal and first superdiagonal). This can be accomplished using Householder reflections alternately on the left and right as shown in Figure 2.2.

[Figure 2.2 sketch: A_1 = U_1^H A zeros the first column below the diagonal; A_2 = A_1 V_1 zeros the first row beyond the superdiagonal; A_3 = U_2^H A_2, A_4 = A_3 V_2, A_5 = U_3^H A_4, and A_6 = U_4^H A_5 continue the pattern until only the diagonal and first superdiagonal are nonzero.]

Figure 2.2: Householder reduction of a matrix to bidiagonal form.

Since the Householder reflections applied on the right don't try to zero all the elements to the right of the diagonal, they don't affect the zeros already obtained in the columns. We have seen that, even in the complex case, the Householder matrices can be chosen so that the resulting bidiagonal matrix is real. Notice also that when the number of rows m is greater than the number of columns n, the reduction produces zero rows after row n. Similarly, when n > m, the reduction produces zero columns after column m. If we denote the products of the Householder reflections by the unitary matrices \hat U and \hat V, the reduction to a bidiagonal matrix B can be written as

B = \hat U^H A \hat V   or   A = \hat U B \hat V^H. (2.142)

If B has the SVD B = \bar U \Sigma \bar V^T, then A has the SVD

A = \hat U (\bar U \Sigma \bar V^T) \hat V^H = (\hat U \bar U) \Sigma (\hat V \bar V)^H = U \Sigma V^H,

where U = \hat U \bar U and V = \hat V \bar V. Thus, it is sufficient to find the SVD of the real bidiagonal matrix B. Moreover, it is not necessary to carry along the zero rows or columns of B. For if the square portion B_1 of B has the SVD B_1 = U_1 \Sigma_1 V_1^T, then

B = \begin{pmatrix} B_1 \\ 0 \end{pmatrix} = \begin{pmatrix} U_1 \Sigma_1 V_1^T \\ 0 \end{pmatrix} = \begin{pmatrix} U_1 & 0 \\ 0 & I \end{pmatrix} \begin{pmatrix} \Sigma_1 \\ 0 \end{pmatrix} V_1^T (2.143)

or

B = (B_1, 0) = (U_1 \Sigma_1 V_1^T, 0) = U_1 (\Sigma_1, 0) \begin{pmatrix} V_1 & 0 \\ 0 & I \end{pmatrix}^T. (2.144)

Thus, it is sufficient to consider the computation of the SVD for a real, square, bidiagonal matrix B.

In addition to the implicit methods of finding the eigenvalues of B^T B, some methods look instead at the symmetric matrix \begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix}. If the SVD of B is B = U \Sigma V^T, then \begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix} has the eigenequation

\begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix} \begin{pmatrix} V & V \\ U & -U \end{pmatrix} = \begin{pmatrix} V & V \\ U & -U \end{pmatrix} \begin{pmatrix} \Sigma & 0 \\ 0 & -\Sigma \end{pmatrix}. (2.145)

In addition, the matrix \begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix} can be reduced to a real tridiagonal matrix T by the relation

T = P^T \begin{pmatrix} 0 & B^T \\ B & 0 \end{pmatrix} P (2.146)

where P = (e_1, e_{n+1}, e_2, e_{n+2}, \ldots, e_n, e_{2n}) is a permutation matrix formed by a rearrangement of the columns e_1, e_2, \ldots, e_{2n} of the 2n \times 2n identity matrix. The matrix P is unitary and is sometimes called the perfect shuffle since its operation on a vector mimics a perfect card shuffle of the components. The algorithms based on this double size symmetric matrix don't actually form the double size matrix, but make efficient use of the symmetries involved in this eigenproblem. For those interested in the details of the various SVD algorithms, I refer you to the book by Demmel [4].

In Matlab the SVD can be obtained by the call [U,S,V]=svd(A). In LAPACK the general driver routines for the SVD are SGESVD, DGESVD, and CGESVD depending on whether the matrix is real single precision, real double precision, or complex.
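For readers working in Python rather than Matlab or Fortran, a rough NumPy equivalent is sketched below; this is an assumption about tooling, not something the notes prescribe.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Full SVD: U is 3x3, S holds the singular values, Vh is V^H
U, S, Vh = np.linalg.svd(A, full_matrices=True)

# Reconstruct A = U * diag(S) * V^H (pad diag(S) to 3x2)
Sigma = np.zeros(A.shape)
Sigma[:len(S), :len(S)] = np.diag(S)
print(np.allclose(U @ Sigma @ Vh, A))
```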

Chapter 3

Eigenvalue Problems

Eigenvalue problems occur quite often in physics. For example, in quantum mechanics eigenvalues correspond to certain energy states; in structural mechanics problems eigenvalues often correspond to resonance frequencies of the structure; and in time evolution problems eigenvalues are often related to the stability of the system.

Let A be an m \times m square matrix. A nonzero vector x is an eigenvector of A and \lambda is its corresponding eigenvalue if

Ax = \lambda x.

The set of vectors

V_\lambda = \{ x : Ax = \lambda x \}

is a subspace called the eigenspace corresponding to \lambda. The equation Ax = \lambda x is equivalent to (A - \lambda I)x = 0. If \lambda is an eigenvalue, then the matrix A - \lambda I is singular and hence

det(A - \lambda I) = 0.

Thus, the eigenvalues of A are roots of a polynomial equation of degree m. This polynomial equation is called the characteristic equation of A. Conversely, if p(z) = a_0 + a_1 z + \cdots + a_{n-1} z^{n-1} + a_n z^n is an arbitrary polynomial of degree n (a_n \ne 0), then the matrix

\begin{pmatrix}
0 &   &        &   & -a_0/a_n \\
1 & 0 &        &   & -a_1/a_n \\
  & 1 & 0      &   & -a_2/a_n \\
  &   & \ddots & \ddots & \vdots \\
  &   &        & 1 & -a_{n-1}/a_n
\end{pmatrix}

has p(z) = 0 as its characteristic equation. In some problems an eigenvalue \lambda might correspond to a multiple root of the characteristic equation. The multiplicity of the root \lambda is called its algebraic multiplicity. The dimension of the space

V_\lambda is called its geometric multiplicity. If for some eigenvalue \lambda of A the geometric multiplicity of \lambda does not equal its algebraic multiplicity, this eigenvalue is said to be defective. A matrix with one or more defective eigenvalues is said to be a defective matrix. An example of a defective matrix is the matrix

\begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}.

This matrix has the single eigenvalue 2 with algebraic multiplicity 3. However, the eigenspace corresponding to the eigenvalue 2 has dimension 1. All the eigenvectors are multiples of e_1. In these notes we will only consider eigenvalue problems involving Hermitian matrices (A^H = A). We will see that all such matrices are nondefective.

If S is a nonsingular m \times m matrix, then the matrix S^{-1}AS is said to be similar to A. Since

det(S^{-1}AS - \lambda I) = det[S^{-1}(A - \lambda I)S] = det(S^{-1}) det(A - \lambda I) det(S) = det(A - \lambda I),

it follows that S^{-1}AS and A have the same characteristic equation and hence the same eigenvalues. It can be shown that a Hermitian matrix A always has a complete set of orthonormal eigenvectors. If we form the unitary matrix U whose columns are the eigenvectors belonging to this orthonormal set, then

AU = U\Lambda   or   U^H A U = \Lambda (3.1)

where \Lambda is a diagonal matrix whose diagonal entries are the eigenvalues. Thus, a Hermitian matrix is similar to a diagonal matrix. Since a diagonal matrix is clearly nondefective, it follows that all Hermitian matrices are nondefective.

If e is a unit eigenvector of the Hermitian matrix A and \lambda is the corresponding eigenvalue, then

Ae = \lambda e   and hence   \lambda = e^H A e.

It follows that \bar\lambda = (e^H A e)^H = e^H A^H e = e^H A e = \lambda, i.e., the eigenvalues of a Hermitian matrix are real.

It was shown by Abel, Galois and others in the nineteenth century that there can be no algebraic expression for the roots of a general polynomial equation whose degree is greater than four. Since eigenvalues are roots of the characteristic equation and since the roots of any polynomial are the eigenvalues of some matrix, there can be no purely algebraic method for computing eigenvalues. Thus, algorithms for finding eigenvalues must at some stage be iterative in nature. The methods to be discussed here first reduce the Hermitian matrix A to a real, symmetric, tridiagonal matrix T by means of a unitary similarity transformation. The eigenvalues of T are then found using certain iterative procedures. The most common iterative procedures are the QR algorithm and the divide-and-conquer algorithm.

3.1 Reduction to Tridiagonal Form

The reduction to tridiagonal form can be done with Householder reflectors. I will illustrate the procedure with a 5 \times 5 matrix A, i.e.,

A = \begin{pmatrix}
\times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times \\
\times & \times & \times & \times & \times
\end{pmatrix}.

We can zero out the elements in the first column from row three to the end using a Householder reflector of the form

U_1 = \begin{pmatrix} 1 & 0 \\ 0 & Q_1 \end{pmatrix}.

This reflector does not alter the elements of the first row. Thus, multiplying U_1 A on the right by U_1^H zeros out the elements of the first row from column three on and doesn't affect the first column. Hence,

U_1 A U_1^H = \begin{pmatrix}
\times & \times & 0 & 0 & 0 \\
\times & \times & \times & \times & \times \\
0 & \times & \times & \times & \times \\
0 & \times & \times & \times & \times \\
0 & \times & \times & \times & \times
\end{pmatrix}.

Moreover, the Householder reflector can be chosen so that the 12 element and the 21 element are real. We can continue in this manner to zero out the elements below the first subdiagonal and above the first superdiagonal. Furthermore, the Householder reflectors can be chosen so that the super- and subdiagonals are real. The diagonals of the resulting tridiagonal matrix are real since the transformations have preserved the Hermitian property. Collecting the products of the Householder reflectors into a unitary matrix U, we have

UAU^H = T   or   A = U^H T U

where T is a real, symmetric, tridiagonal matrix. Since A and T are similar, they have the same eigenvalues. Thus, we only need eigenvalue routines for real symmetric matrices. In the following sections we will assume that the matrix A is real and symmetric.
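The following is a minimal, unblocked Python sketch of this Householder tridiagonalization for the real symmetric case; the function name, the sign convention for the reflector, and the example matrix are my own choices, and production codes (e.g., LAPACK) use a more careful blocked implementation.

```python
import numpy as np

def householder_tridiagonalize(A):
    """Reduce a real symmetric matrix to tridiagonal form T = U A U^T
    using Householder reflectors applied from both sides."""
    T = np.array(A, dtype=float)
    n = T.shape[0]
    U = np.eye(n)
    for k in range(n - 2):
        x = T[k + 1:, k].copy()
        alpha = -np.sign(x[0]) * np.linalg.norm(x) if x[0] != 0 else -np.linalg.norm(x)
        v = x.copy()
        v[0] -= alpha                       # v = x - alpha*e1 (no cancellation)
        if np.linalg.norm(v) < 1e-15:
            continue
        v /= np.linalg.norm(v)
        H = np.eye(n)
        H[k + 1:, k + 1:] -= 2.0 * np.outer(v, v)
        T = H @ T @ H                       # H is symmetric and orthogonal
        U = H @ U
    return T, U

A = np.array([[4.0, 1.0, 2.0, 2.0],
              [1.0, 3.0, 0.0, 1.0],
              [2.0, 0.0, 5.0, 1.0],
              [2.0, 1.0, 1.0, 2.0]])
T, U = householder_tridiagonalize(A)
print(np.allclose(U @ A @ U.T, T))                       # T = U A U^T
print(np.allclose(np.sort(np.linalg.eigvalsh(T)),
                  np.sort(np.linalg.eigvalsh(A))))       # same eigenvalues
```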

3.2 The Power Method

The power method is one of the oldest methods for obtaining the eigenvectors of a matrix. It is no longer used for this purpose because of its slow convergence, but it does underlie some of the practical algorithms. Let v_1, v_2, \ldots, v_n be an orthonormal basis of eigenvectors of the matrix A

and let \lambda_1, \ldots, \lambda_n be the corresponding eigenvalues. We will assume that the eigenvalues and eigenvectors are so ordered that

|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|.

We will assume further that |\lambda_1| > |\lambda_2|. Let v be an arbitrary vector with \|v\| = 1. Then there exist constants c_1, \ldots, c_n such that

v = c_1 v_1 + \cdots + c_n v_n. (3.2)

We will make the further assumption that c_1 \ne 0. Successively applying A to equation (3.2), we obtain

A^k v = c_1 A^k v_1 + \cdots + c_n A^k v_n = c_1 \lambda_1^k v_1 + \cdots + c_n \lambda_n^k v_n. (3.3)

You can see from equation (3.3) that the term c_1 \lambda_1^k v_1 will eventually dominate and thus A^k v, if properly scaled at each step to prevent overflow, will approach a multiple of the eigenvector v_1. This convergence can be slow if there are other eigenvalues close in magnitude to \lambda_1. The condition c_1 \ne 0 is equivalent to the condition

< v > \cap < v_2, \ldots, v_n > = \{0\}.
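A minimal sketch of the power method described above; the test matrix, iteration count, and the use of the Rayleigh quotient for the eigenvalue estimate are illustrative choices.

```python
import numpy as np

def power_method(A, num_iters=200, seed=0):
    """Power iteration: repeatedly apply A and normalize to approximate the
    dominant eigenvector; the eigenvalue estimate is the Rayleigh quotient."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)          # rescale to prevent overflow
    return v @ A @ v, v                    # approximate (lambda_1, v_1)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
lam, v = power_method(A)
print(lam, np.linalg.eigvalsh(A)[-1])      # compare with the largest eigenvalue
```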

3.3 The Rayleigh Quotient

The Rayleigh quotient of a vector x is the real number

r(x) = \frac{x^T A x}{x^T x}.

If x is an eigenvector of A corresponding to the eigenvalue \lambda, then r(x) = \lambda. If x is any nonzero vector, then

\|Ax - \alpha x\|^2 = (x^T A^T - \alpha x^T)(Ax - \alpha x)
= x^T A^T A x - 2\alpha x^T A x + \alpha^2 x^T x
= x^T A^T A x - 2\alpha r(x) x^T x + \alpha^2 x^T x + r^2(x) x^T x - r^2(x) x^T x
= x^T A^T A x + x^T x [\alpha - r(x)]^2 - r^2(x) x^T x.

Thus, \alpha = r(x) minimizes \|Ax - \alpha x\|. If x is an approximate eigenvector, then r(x) is an approximate eigenvalue.

3.4 Inverse Iteration with Shifts

For any \mu that is not an eigenvalue of A, the matrix (A - \mu I)^{-1} has the same eigenvectors as A and has eigenvalues (\lambda_j - \mu)^{-1} where \lambda_j are the eigenvalues of A. Suppose \mu is close to the eigenvalue \lambda_i. Then (\lambda_i - \mu)^{-1} will be large compared to (\lambda_j - \mu)^{-1} for j \ne i. If we apply power iteration to (A - \mu I)^{-1}, the process will converge to a multiple of the eigenvector v_i corresponding to \lambda_i. This procedure is called inverse iteration with shifts. Although the power method is not used much in practice, the inverse power method with shifts is frequently used to compute eigenvectors once an approximate eigenvalue has been obtained.
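A small sketch of inverse iteration with a fixed shift, implemented by solving a linear system at each step; in practice one would factor A - \mu I once and reuse the factorization, which this illustration does not bother to do. The shift value and test matrix are arbitrary.

```python
import numpy as np

def inverse_iteration(A, mu, num_iters=50, seed=0):
    """Inverse iteration with a fixed shift mu: power iteration applied
    to (A - mu I)^{-1}, realized by solving a linear system per step."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    M = A - mu * np.eye(n)
    for _ in range(num_iters):
        w = np.linalg.solve(M, v)      # w = (A - mu I)^{-1} v
        v = w / np.linalg.norm(w)
    return v

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
v = inverse_iteration(A, mu=1.5)       # shift near the smallest eigenvalue
print(A @ v - (v @ A @ v) * v)         # residual should be tiny
```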

3.5 Rayleigh Quotient Iteration

The Rayleigh quotient can be used to obtain the shifts at each stage of inverse iteration. The procedure can be summarized as follows.

1. Choose a starting vector v^{(0)} of unit magnitude.

2. Let \lambda^{(0)} = (v^{(0)})^T A v^{(0)} be the corresponding Rayleigh quotient.

3. For k = 1, 2, \ldots

   - Solve (A - \lambda^{(k-1)} I) w = v^{(k-1)} for w, i.e., compute w = (A - \lambda^{(k-1)} I)^{-1} v^{(k-1)}.
   - Normalize w to obtain v^{(k)} = w / \|w\|.
   - Let \lambda^{(k)} = (v^{(k)})^T A v^{(k)} be the corresponding Rayleigh quotient.

It can be shown that the convergence of Rayleigh quotient iteration is ultimately cubic. Cubic convergence triples the number of significant digits on each iteration.
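A minimal sketch of Rayleigh quotient iteration; the starting vector and the crude handling of a nearly singular shifted matrix are my own illustrative choices.

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, num_iters=10):
    """Inverse iteration in which the shift is the current Rayleigh quotient."""
    v = v0 / np.linalg.norm(v0)
    lam = v @ A @ v
    for _ in range(num_iters):
        try:
            w = np.linalg.solve(A - lam * np.eye(A.shape[0]), v)
        except np.linalg.LinAlgError:
            break                       # shift hit an eigenvalue exactly
        v = w / np.linalg.norm(w)
        lam = v @ A @ v
    return lam, v

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
lam, v = rayleigh_quotient_iteration(A, np.array([1.0, 0.5, 0.25]))
print(lam, np.linalg.eigvalsh(A))       # lam matches one of the eigenvalues
```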

3.6 The Basic QR Method

The QR method was discovered independently by Francis [6] and Kublanovskaya [11] in 1961. It is one of the standard methods for finding eigenvalues. The discussion in this section is based largely on the paper Understanding the QR Algorithm by Watkins [13]. As before, we will assume that the matrix A is real and symmetric. Therefore, there is an orthonormal basis v_1, \ldots, v_n such that A v_j = \lambda_j v_j for each j. We will assume that the eigenvalues \lambda_j are ordered so that |\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|. The QR algorithm can be summarized as follows:

1. Choose A_0 = A.

2. For m = 1, 2, \ldots

   A_{m-1} = Q_m R_m   (QR factorization)
   A_m = R_m Q_m

3. Stop when A_m is approximately diagonal.
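A minimal NumPy sketch of the basic (unshifted) QR iteration summarized above, relying on NumPy's QR factorization; the test matrix and iteration count are arbitrary.

```python
import numpy as np

def basic_qr_iteration(A, num_iters=200):
    """Unshifted QR iteration: factor A_{m-1} = Q_m R_m, then form A_m = R_m Q_m.
    For a symmetric matrix the iterates approach a diagonal matrix whose
    diagonal entries are the eigenvalues."""
    Am = np.array(A, dtype=float)
    for _ in range(num_iters):
        Q, R = np.linalg.qr(Am)
        Am = R @ Q
    return Am

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
Am = basic_qr_iteration(A)
print(np.round(Am, 6))                    # nearly diagonal
print(np.sort(np.diag(Am)), np.sort(np.linalg.eigvalsh(A)))
```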

It is probably not obvious what this algorithm has to do with eigenvalues. We will show that the QR method is a way of organizing simultaneous iteration, which in turn is a generalization of the power method.

We can apply the power method to subspaces as well as to single vectors. Suppose S is a k-dimensional subspace. We can compute the sequence of subspaces S, AS, A^2 S, \ldots. Under certain conditions this sequence will converge to the subspace spanned by the eigenvectors v_1, v_2, \ldots, v_k corresponding to the k largest eigenvalues of A. We will not provide a rigorous convergence proof, but we will attempt to make this result seem plausible. Assume that |\lambda_k| > |\lambda_{k+1}| and define the subspaces

T = < v_1, \ldots, v_k >   and   U = < v_{k+1}, \ldots, v_n >.

We will first show that all the null vectors of A lie in U. Suppose v is a null vector of A, i.e., Av = 0. We can expand v in terms of the basis v_1, \ldots, v_n giving

v = c_1 v_1 + \cdots + c_k v_k + c_{k+1} v_{k+1} + \cdots + c_n v_n.

Thus,

Av = c_1 \lambda_1 v_1 + \cdots + c_k \lambda_k v_k + c_{k+1} \lambda_{k+1} v_{k+1} + \cdots + c_n \lambda_n v_n = 0.

Since the vectors \{v_j\} are linearly independent and |\lambda_1| \ge \cdots \ge |\lambda_k| > 0, it follows that c_1 = c_2 = \cdots = c_k = 0, i.e., v belongs to the subspace U. We will now make the additional assumption S \cap U = \{0\}. This assumption is analogous to the assumption c_1 \ne 0 in the power method. If x is a nonzero vector in S, then we can write

x = c_1 v_1 + c_2 v_2 + \cdots + c_k v_k   (component in T)
  + c_{k+1} v_{k+1} + \cdots + c_n v_n.    (component in U)

Thus,

A^m x / \lambda_k^m = c_1 (\lambda_1/\lambda_k)^m v_1 + \cdots + c_{k-1} (\lambda_{k-1}/\lambda_k)^m v_{k-1} + c_k v_k
  + c_{k+1} (\lambda_{k+1}/\lambda_k)^m v_{k+1} + \cdots + c_n (\lambda_n/\lambda_k)^m v_n.

Since x doesn’t belong to U , at least one of the coefficients c1;:::;ck must be nonzero. Notice that the first k terms on the right-hand-side do not decrease in absolute value as m whereas the remaining terms approach zero. Thus, Amx, if properly scaled, approaches the!1 subspace T as m . Inthelimit AmS must approach a subspace of T . Since S U 0 , A can have no null !1 \ D f g 49 vectors in S. Thus, A is invertible on S. It follows that all of the subspaces AmS have dimension k and hence the limit can not be a proper subspace of T , i.e., AmS T as m . ! !1 Numerically, we can’t iterate on an entire subspace. Therefore, we pick a basis of this subspace 0 0 0 0 and iterate on this basis. Let q1 ;:::;qk be a basis of S. Since A is invertible on S, Aq1 ;:::;Aqk m 0 m 0 m is a basis of AS. Similarly, A q1 ;:::;A qk is a basis of A S for all m. Thus, in principle we can iterate on a basis of S to obtain bases for AS; A2S;::: . However, for large m these bases become ill-conditioned since all the vectors tend to point in the direction of the eigenvector corresponding to the eigenvalue of largest absolute value. To avoid this we orthonormalizethe basis m m m m m at each step. Thus, given an orthonormal basis q1 ;:::;qk of A S, we compute Aq1 ;:::;Aqk and then orthonormalize these vectors (using something like the Gram-Schmidt process) to obtain mC1 mC1 mC1 an orthonormal basis q1 ;:::;qk of A S. This process is called simultaneous iteration. Notice that this process of orthonormalization has the property

< A q_1^m, \ldots, A q_i^m > = < q_1^{m+1}, \ldots, q_i^{m+1} >   for i = 1, \ldots, k.

Let us consider now what happens when we apply simultaneous iteration to the complete set of orthonormal vectors e_1, \ldots, e_n where e_k is the k-th column of the identity matrix. Let us define

S_k = < e_1, \ldots, e_k >,   T_k = < v_1, \ldots, v_k >,   U_k = < v_{k+1}, \ldots, v_n >

for k = 1, 2, \ldots, n-1. We also assume that S_k \cap U_k = \{0\} and |\lambda_k| > |\lambda_{k+1}| > 0 for each 1 \le k \le n-1. It follows from our previous discussion that A^m S_k \to T_k as m \to \infty. In terms of bases, the orthonormal vectors q_1^m, \ldots, q_n^m will converge to an orthonormal basis q_1, \ldots, q_n such that T_k = < q_1, \ldots, q_k > for each k = 1, \ldots, n-1. Each of the subspaces T_k is invariant under A, i.e., A T_k \subseteq T_k. We will now look at a property of invariant subspaces. Suppose T is an invariant subspace of A. Let Q = (Q_1, Q_2) be an orthogonal matrix such that the columns of Q_1 form a basis of T. Then

Q^T A Q = \begin{pmatrix} Q_1^T A Q_1 & Q_1^T A Q_2 \\ Q_2^T A Q_1 & Q_2^T A Q_2 \end{pmatrix} = \begin{pmatrix} Q_1^T A Q_1 & 0 \\ 0 & Q_2^T A Q_2 \end{pmatrix},

i.e., the basis consisting of the columns of Q block diagonalizes A. Let Q be the matrix with columns q_1, \ldots, q_n. Since each T_k is invariant under A, the matrix Q^T A Q has the block diagonal form

Q^T A Q = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}   where A_1 is k \times k

for each k = 1, \ldots, n-1. Therefore, Q^T A Q must be diagonal. The diagonal entries are the eigenvalues of A. If we define A_m = Q_m^T A Q_m where Q_m = (q_1^m, \ldots, q_n^m), then A_m will become approximately diagonal for large m.

We can summarize simultaneous iteration as follows:

1. We start with the orthogonal matrix Q_0 = I whose columns form a basis of n-space.

2. For m = 1, 2, \ldots we compute

   Z_m = A Q_{m-1}          (power iteration step)            (3.4a)
   Z_m = Q_m R_m            (orthonormalize columns of Z_m)   (3.4b)
   A_m = Q_m^T A Q_m        (test for diagonal matrix)        (3.4c)

The QR algorithm is an efficient way to organize these calculations. Equations (3.4a) and (3.4b) can be combined to give

A Q_{m-1} = Q_m R_m. (3.5)

Combining equations (3.4c) and (3.5), we get

A_{m-1} = Q_{m-1}^T A Q_{m-1} = Q_{m-1}^T (Q_m R_m) = (Q_{m-1}^T Q_m) R_m = \hat Q_m R_m (3.6)

where \hat Q_m = Q_{m-1}^T Q_m. Equation (3.5) can be rewritten as

Q_m^T A Q_{m-1} = R_m. (3.7)

Combining equations (3.4c) and (3.7), we get

A_m = Q_m^T A Q_m = (Q_m^T A Q_{m-1}) Q_{m-1}^T Q_m = R_m (Q_{m-1}^T Q_m) = R_m \hat Q_m. (3.8)

Equation (3.6) is a QR factorization of A_{m-1}. Equation (3.8) shows that A_m has the same Q and R factors but with their order reversed. Thus, the QR algorithm generates the matrices A_m recursively without having to compute Z_m and Q_m at each step. Note that the orthogonal matrices \hat Q_m and Q_m satisfy the relation

\hat Q_1 \hat Q_2 \cdots \hat Q_k = (Q_0^T Q_1)(Q_1^T Q_2) \cdots (Q_{k-1}^T Q_k) = Q_k.

We have now seen that the QR method can be considered as a generalization of the power method. We will see that the QR algorithm is also related to inverse power iteration. In fact we have the following duality result.

Theorem 3. If A is an n \times n symmetric nonsingular matrix and if S and S^\perp are orthogonal complementary subspaces, then A^m S and A^{-m} S^\perp are also orthogonal complements.

Proof. If x and y are n-vectors, then

x \cdot y = x^T y = x^T A^T (A^T)^{-1} y = (Ax)^T (A^T)^{-1} y = (Ax)^T A^{-1} y = Ax \cdot A^{-1} y.

Applying this result repeatedly, we obtain

x \cdot y = A^m x \cdot A^{-m} y.

It is clear from this relation that every element in A^m S is orthogonal to every element in A^{-m} S^\perp. Let q_1, \ldots, q_k be a basis of S and let q_{k+1}, \ldots, q_n be a basis of S^\perp. Then A^m q_1, \ldots, A^m q_k is a basis of A^m S and A^{-m} q_{k+1}, \ldots, A^{-m} q_n is a basis of A^{-m} S^\perp. Suppose there exist scalars c_1, \ldots, c_n such that

c_1 A^m q_1 + \cdots + c_k A^m q_k + c_{k+1} A^{-m} q_{k+1} + \cdots + c_n A^{-m} q_n = 0. (3.9)

Taking the dot product of this relation with c_1 A^m q_1 + \cdots + c_k A^m q_k, we obtain

\|c_1 A^m q_1 + \cdots + c_k A^m q_k\| = 0

and hence c_1 A^m q_1 + \cdots + c_k A^m q_k = 0. Since A^m q_1, \ldots, A^m q_k are linearly independent, it follows that c_1 = c_2 = \cdots = c_k = 0. In a similar manner we obtain c_{k+1} = \cdots = c_n = 0. Therefore, A^m q_1, \ldots, A^m q_k, A^{-m} q_{k+1}, \ldots, A^{-m} q_n are linearly independent and hence form a basis for n-space. Thus, A^m S and A^{-m} S^\perp are orthogonal complements.

It can be seen from this theorem that performing power iteration on the subspaces S_k is also performing inverse power iteration on S_k^\perp. Since

< q_1^m, \ldots, q_k^m > = < A^m e_1, \ldots, A^m e_k >,

Theorem 3 implies that

< q_{k+1}^m, \ldots, q_n^m > = < A^{-m} e_{k+1}, \ldots, A^{-m} e_n >.

For k = n-1 we have < q_n^m > = < A^{-m} e_n >. Thus, q_n^m is the result at the m-th step of applying the inverse power method to e_n. It follows that q_n^m should converge to an eigenvector corresponding to the smallest eigenvalue \lambda_n. Moreover, the element in the n-th row and n-th column of A_m = Q_m^T A Q_m should converge to the smallest eigenvalue \lambda_n.

The convergence of the QR method, like that of the power method, can be quite slow. To make the method practical, the convergence is accelerated using shifts as in the inverse power method.

3.6.1 The QR Method with Shifts

Suppose we apply a shift \mu_k at the k-th step, i.e., we replace A by A - \mu_k I. Then the algorithm becomes

1. Set A_0 = A.

2. For k = 1, 2, \ldots

   A_{k-1} - \mu_k I = \hat Q_k \hat R_k   (QR factorization)
   A_k = \hat R_k \hat Q_k + \mu_k I.

3. Deflate when an eigenvalue converges.

It follows from the QR factorization of A_{k-1} - \mu_k I that

\hat Q_k^T A_{k-1} \hat Q_k - \mu_k I = \hat Q_k^T (A_{k-1} - \mu_k I) \hat Q_k = \hat Q_k^T \hat Q_k \hat R_k \hat Q_k = \hat R_k \hat Q_k. (3.10)

Equation (3.10) implies that

A_k = \hat Q_k^T A_{k-1} \hat Q_k. (3.11)

It follows by induction on equation (3.11) that

A_k = \hat Q_k^T \cdots \hat Q_1^T A \hat Q_1 \cdots \hat Q_k. (3.12)

If we define

Q_k = \hat Q_1 \cdots \hat Q_k,

then equation (3.12) can be written

A_k = Q_k^T A Q_k. (3.13)

Thus, each A_k has the same eigenvalues as A.

Theorem 4. For each k \ge 1 we have the relation

(A - \mu_k I) \cdots (A - \mu_1 I) = \hat Q_1 \cdots \hat Q_k \hat R_k \cdots \hat R_1 = Q_k R_k

where Q_k = \hat Q_1 \cdots \hat Q_k and R_k = \hat R_k \cdots \hat R_1.

Proof. For k = 1 the result is just the k = 1 step. Assume that the result holds for some k, i.e.,

(A - \mu_k I) \cdots (A - \mu_1 I) = Q_k R_k. (3.14)

From the k+1 step we have

A_k - \mu_{k+1} I = \hat Q_{k+1} \hat R_{k+1}. (3.15)

Combining equations (3.13) and (3.15), we get

A_k - \mu_{k+1} I = Q_k^T A Q_k - \mu_{k+1} I = Q_k^T (A - \mu_{k+1} I) Q_k = \hat Q_{k+1} \hat R_{k+1},

and hence

A - \mu_{k+1} I = Q_k \hat Q_{k+1} \hat R_{k+1} Q_k^T = Q_{k+1} \hat R_{k+1} Q_k^T. (3.16)

Combining equations (3.14) and (3.16), we get

(A - \mu_{k+1} I)(A - \mu_k I) \cdots (A - \mu_1 I) = Q_{k+1} \hat R_{k+1} Q_k^T Q_k R_k = Q_{k+1} R_{k+1},

which is the result for k+1. This completes the proof by induction.

It follows from Theorem 4 that

(A - \mu_k I) \cdots (A - \mu_1 I) e_1 = Q_k R_k e_1.

Since R_k is upper triangular, Q_k R_k e_1 is proportional to the first column of Q_k. Thus, the first column of Q_k, apart from a constant multiplier, is the result of applying the power method with shifts to e_1. Taking the inverse of the result in Theorem 4, we obtain

(A - \mu_1 I)^{-1} \cdots (A - \mu_k I)^{-1} = R_k^{-1} Q_k^T. (3.17)

Since for each j the factor A - \mu_j I is symmetric, its inverse (A - \mu_j I)^{-1} is also symmetric. Taking the transpose of equation (3.17), we get

(A - \mu_k I)^{-1} \cdots (A - \mu_1 I)^{-1} = Q_k R_k^{-T}. (3.18)

Applying equation (3.18) to e_n, we get

(A - \mu_k I)^{-1} \cdots (A - \mu_1 I)^{-1} e_n = Q_k R_k^{-T} e_n.

Since R_k^{-T} is lower triangular, Q_k R_k^{-T} e_n is a multiple of the last column of Q_k. Therefore, the last column of Q_k, apart from a constant multiplier, is the result of applying the inverse power method with shifts to e_n. We have yet to say how the shifts are to be chosen. One choice is to choose \mu_k to be the Rayleigh quotient corresponding to the last column of Q_{k-1}. This is readily available to us since, by equation (3.13), it is equal to the (n,n) element of A_{k-1}. By our remarks on Rayleigh quotient iteration, we should expect cubic convergence to the eigenvalue \lambda_n. This choice of shifts generally leads to convergence, but there are a few matrices for which the process fails to converge. For example, consider the matrix

A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.

The unshifted QR algorithm doesn't converge since

A = \hat Q_1 \hat R_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

A_1 = \hat R_1 \hat Q_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = A.

Thus, all the iterates are equal to A. The Rayleigh quotient shift doesn't help since A_{22} = 0. A shift that does work all the time is the Wilkinson shift. This shift is obtained by considering the lower-rightmost 2 \times 2 submatrix of A_{k-1} and choosing \mu_k to be the eigenvalue of this 2 \times 2 submatrix that is closest to the (n,n) element of A_{k-1}. When there is sufficient convergence to the eigenvalue \lambda_n, the off-diagonal elements in the last row and column of the A_k matrices will be very small. We can deflate these matrices by removing the last row and column, and then \lambda_{n-1} can be obtained using the deflated matrices. Continuing in this manner we can obtain all of the eigenvalues.
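The following sketch combines the shifted QR iteration with the Wilkinson shift and deflation as described above; the shift formula follows the standard textbook expression for the trailing 2x2 block, and the tolerances, function names, and test matrix are my own choices.

```python
import numpy as np

def wilkinson_shift(A):
    """Eigenvalue of the trailing 2x2 block closest to the (n,n) entry."""
    a, b, c = A[-2, -2], A[-2, -1], A[-1, -1]
    d = (a - c) / 2.0
    if d == 0 and b == 0:
        return c
    sign = 1.0 if d >= 0 else -1.0
    return c - sign * b * b / (abs(d) + np.hypot(d, b))

def shifted_qr_eigenvalues(A, tol=1e-12, max_iters=1000):
    """Shifted QR with deflation for a real symmetric matrix."""
    A = np.array(A, dtype=float)
    eigs = []
    while A.shape[0] > 1:
        for _ in range(max_iters):
            mu = wilkinson_shift(A)
            Q, R = np.linalg.qr(A - mu * np.eye(A.shape[0]))
            A = R @ Q + mu * np.eye(A.shape[0])
            if abs(A[-1, -2]) < tol * (abs(A[-1, -1]) + abs(A[-2, -2])):
                break
        eigs.append(A[-1, -1])
        A = A[:-1, :-1]               # deflate: drop last row and column
    eigs.append(A[0, 0])
    return np.sort(np.array(eigs))

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
print(shifted_qr_eigenvalues(A))
print(np.sort(np.linalg.eigvalsh(A)))
```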

Until recently, the QR method with shifts (or one of its variants) was the primary method for computing eigenvalues and eigenvectors. Recently a competitor has emerged called the divide-and-conquer algorithm.

3.7 The Divide-and-Conquer Method

The divide-and-conquer algorithm was first introduced by Cuppen [3] in 1981. As first introduced, the algorithm suffered from certain accuracy and stability problems. These were not overcome until a stable algorithm was introduced in 1993 by Gu and Eisenstat [8]. The divide-and-conquer algorithm is faster than the shifted QR algorithm if the size is greater than about 25 and both eigenvalues and eigenvectors are required. Let us begin by discussing the basic theory underlying the method. Let T denote a symmetric tridiagonal matrix for which we desire the eigenvalues and eigenvectors, i.e., T has the form

T = \begin{pmatrix}
a_1 & b_1 \\
b_1 & \ddots & \ddots \\
 & \ddots & a_{m-1} & b_{m-1} \\
 & & b_{m-1} & a_m & b_m \\
 & & & b_m & a_{m+1} & b_{m+1} \\
 & & & & b_{m+1} & \ddots & \ddots \\
 & & & & & \ddots & \ddots & b_{n-1} \\
 & & & & & & b_{n-1} & a_n
\end{pmatrix}. (3.19)

The matrix T can be split into the sum of two matrices as follows:

T = \begin{pmatrix}
a_1 & b_1 \\
b_1 & \ddots & \ddots \\
 & \ddots & a_{m-1} & b_{m-1} \\
 & & b_{m-1} & a_m - b_m \\
 & & & & a_{m+1} - b_m & b_{m+1} \\
 & & & & b_{m+1} & \ddots & \ddots \\
 & & & & & \ddots & \ddots & b_{n-1} \\
 & & & & & & b_{n-1} & a_n
\end{pmatrix}
+ b_m \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} (0,\ldots,0,1,1,0,\ldots,0)
= \begin{pmatrix} T_1 & 0 \\ 0 & T_2 \end{pmatrix} + b_m v v^T (3.20)

where m is roughly one half of n, T_1 and T_2 are tridiagonal, and v is the vector v = e_m + e_{m+1}. Suppose we have the following eigendecompositions of T_1 and T_2:

T_1 = Q_1 \Lambda_1 Q_1^T,   T_2 = Q_2 \Lambda_2 Q_2^T (3.21)

where \Lambda_1 and \Lambda_2 are diagonal matrices of eigenvalues. Then T can be written

T = \begin{pmatrix} T_1 & 0 \\ 0 & T_2 \end{pmatrix} + b_m v v^T
  = \begin{pmatrix} Q_1 \Lambda_1 Q_1^T & 0 \\ 0 & Q_2 \Lambda_2 Q_2^T \end{pmatrix} + b_m v v^T
  = \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix}
    \left[ \begin{pmatrix} \Lambda_1 & 0 \\ 0 & \Lambda_2 \end{pmatrix} + b_m u u^T \right]
    \begin{pmatrix} Q_1^T & 0 \\ 0 & Q_2^T \end{pmatrix} (3.22)

where

u = \begin{pmatrix} Q_1^T & 0 \\ 0 & Q_2^T \end{pmatrix} v.

Therefore, T is similar to a matrix of the form D + \rho u u^T where D = diag(d_1, \ldots, d_n). Thus, it suffices to look at the eigenproblem for matrices of the form D + \rho u u^T. Let us assume first that \lambda is an eigenvalue of D + \rho u u^T, but is not an eigenvalue of D. Let x be an eigenvector of D + \rho u u^T corresponding to \lambda. Then

(D + \rho u u^T)x = Dx + \rho (u^T x) u = \lambda x

and hence

x = -\rho (u^T x)(D - \lambda I)^{-1} u. (3.23)

Multiplying equation (3.23) by u^T and collecting terms, we get

(u^T x)\left[1 + \rho\, u^T (D - \lambda I)^{-1} u\right] = (u^T x)\left(1 + \rho \sum_{k=1}^{n} \frac{u_k^2}{d_k - \lambda}\right) = 0. (3.24)

Since \lambda is not an eigenvalue of D, we must have u^T x \ne 0. Thus,

f(\lambda) = 1 + \rho \sum_{k=1}^{n} \frac{u_k^2}{d_k - \lambda} = 0. (3.25)

Equation (3.25) is called the secular equation and f(\lambda) is called the secular function. The eigenvalues of D + \rho u u^T that are not eigenvalues of D are roots of the secular equation. It follows from equation (3.23) that the eigenvector corresponding to the eigenvalue \lambda is proportional to (D - \lambda I)^{-1} u. Figure 3.1 shows a plot of an example secular function. The slope of f(\lambda) is given by

f'(\lambda) = \rho \sum_{k=1}^{n} \frac{u_k^2}{(d_k - \lambda)^2}.

Thus, the slope (when it exists) is positive if \rho > 0 and negative if \rho < 0. Suppose the d_i are such that d_1 > d_2 > \cdots > d_n and that all the components of u are nonzero. Then there must be a root \lambda between each pair (d_i, d_{i+1}). This gives n - 1 roots. Since f(\lambda) \to 1 as \lambda \to \infty or as \lambda \to -\infty, there is another root greater than d_1 if \rho > 0 and a root less than d_n if \rho < 0. This gives n roots. The only way the secular equation will have fewer than n roots is if one or more of the components of u are zero or if one or more of the d_i are equal. Suppose \lambda is a root of the secular equation. We will show that x = (D - \lambda I)^{-1} u is an eigenvector of D + \rho u u^T corresponding to the eigenvalue \lambda. Since \lambda is a root of the secular equation, we have

f(\lambda) = 1 + \rho\, u^T (D - \lambda I)^{-1} u = 1 + \rho\, u^T x = 0

or u^T x = -1/\rho. Since x = (D - \lambda I)^{-1} u, we have

(D - \lambda I)x = u   or   Dx = u + \lambda x.

It follows that

(D + \rho u u^T)x = Dx + \rho (u^T x) u = Dx - u = \lambda x

as was to be proved.


Figure 3.1: Graph of f(\lambda) = 1 + \frac{.5}{1-\lambda} + \frac{.5}{2-\lambda} + \frac{.5}{3-\lambda} + \frac{.5}{4-\lambda}

Let us now look at the special cases where there are fewer than n roots of the secular equation. If u_i = 0, then

(D + \rho u u^T) e_i = D e_i + \rho (u^T e_i) u = D e_i + \rho u_i u = D e_i = d_i e_i,

i.e., e_i is an eigenvector of D + \rho u u^T corresponding to the eigenvalue d_i.

If d_i = d_j for i \ne j and either u_i or u_j is nonzero, then the vector x = \alpha e_i + \beta e_j is an eigenvector of D corresponding to the eigenvalue d_i for any \alpha and \beta that are not both zero. We can choose \alpha and \beta so that

u^T x = \alpha u_i + \beta u_j = 0.

For example, \alpha = u_j and \beta = -u_i would work. With this choice of \alpha and \beta, the vector x = \alpha e_i + \beta e_j is an eigenvector of D + \rho u u^T corresponding to the eigenvalue d_i. In this way we can obtain n eigenvalues and eigenvectors even when the secular equation has fewer than n roots.

Finding Roots of the Secular Equation. The first thought would be to use Newton's method to find the roots of f(\lambda). However, when one or more of the u_i are small but not small enough to neglect, the function f(\lambda) behaves pretty much like it would if the terms corresponding to the small u_i were not present until \lambda is very close to one of the corresponding d_i, where it abruptly approaches \pm\infty. Thus, almost any initial guess will lead away from the desired root. This is illustrated in Figure 3.2, where the .5 factor multiplying 1/(2 - \lambda) in the previous example is replaced by 0.01. Notice that the curve is almost vertical at the zero crossing near 2.


Figure 3.2: Graph of f(\lambda) = 1 + \frac{.5}{1-\lambda} + \frac{.01}{2-\lambda} + \frac{.5}{3-\lambda} + \frac{.5}{4-\lambda}

To solve this problem, a modified form of Newton's method is used. Newton's method approximates the curve near the guess by the tangent line at the guess and then finds the place where this line crosses zero. Alternatively, we could approximate f(\lambda) near the guess by another curve that is tangent to f(\lambda) at the guess, as long as we can find the nearby zero crossing of this curve. If we are looking for a root between d_i and d_{i+1}, we could use a function of the form

g(\lambda) = c_1 + \frac{c_2}{d_i - \lambda} + \frac{c_3}{d_{i+1} - \lambda} (3.26)

to approximate f(\lambda). Once c_1, c_2, and c_3 are chosen, the roots of g(\lambda) can be found by solving the quadratic equation

c_1 (d_i - \lambda)(d_{i+1} - \lambda) + c_2 (d_{i+1} - \lambda) + c_3 (d_i - \lambda) = 0. (3.27)

Let us write f(\lambda) as follows

f(\lambda) = 1 + \rho\, \psi_1(\lambda) + \rho\, \psi_2(\lambda) (3.28)

where

\psi_1(\lambda) = \sum_{k=1}^{i} \frac{u_k^2}{d_k - \lambda}   and   \psi_2(\lambda) = \sum_{k=i+1}^{n} \frac{u_k^2}{d_k - \lambda}. (3.29)

Notice that \psi_1 has only positive terms and \psi_2 has only negative terms for d_{i+1} < \lambda < d_i. If \lambda_j is the current guess, we approximate \psi_1 near \lambda_j by the function g_1 given by

g_1(\lambda) = \alpha_1 + \frac{\alpha_2}{d_i - \lambda} (3.30)

where \alpha_1 and \alpha_2 are chosen so that

g_1(\lambda_j) = \psi_1(\lambda_j)   and   g_1'(\lambda_j) = \psi_1'(\lambda_j). (3.31)

It is easily shown that \alpha_1 = \psi_1(\lambda_j) - (d_i - \lambda_j)\psi_1'(\lambda_j) and \alpha_2 = (d_i - \lambda_j)^2 \psi_1'(\lambda_j). Similarly, we approximate \psi_2 near \lambda_j by the function g_2 given by

g_2(\lambda) = \alpha_3 + \frac{\alpha_4}{d_{i+1} - \lambda} (3.32)

where \alpha_3 and \alpha_4 are chosen so that

g_2(\lambda_j) = \psi_2(\lambda_j)   and   g_2'(\lambda_j) = \psi_2'(\lambda_j). (3.33)

Again it is easily shown that \alpha_3 = \psi_2(\lambda_j) - (d_{i+1} - \lambda_j)\psi_2'(\lambda_j) and \alpha_4 = (d_{i+1} - \lambda_j)^2 \psi_2'(\lambda_j).

Putting these approximations together, we have the following approximation for f near \lambda_j:

f(\lambda) \approx 1 + \rho\, g_1(\lambda) + \rho\, g_2(\lambda) = (1 + \rho\alpha_1 + \rho\alpha_3) + \frac{\rho\alpha_2}{d_i - \lambda} + \frac{\rho\alpha_4}{d_{i+1} - \lambda} \equiv c_1 + \frac{c_2}{d_i - \lambda} + \frac{c_3}{d_{i+1} - \lambda}. (3.34)

This modified Newton's method generally converges very fast.
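The sketch below solves the secular equation for a small rank-one update D + \rho u u^T; for simplicity (and robustness of the illustration) it brackets each root and uses plain bisection instead of the modified Newton scheme described above, which is what production divide-and-conquer codes use. It assumes \rho > 0, distinct d_i, and nonzero u_i; all names and the test data are my own choices.

```python
import numpy as np

def secular_eigen(d, u, rho):
    """Eigenvalues/eigenvectors of diag(d) + rho*u*u^T (rho > 0, distinct d,
    nonzero u) by finding the roots of the secular equation (3.25)."""
    d = np.asarray(d, float)
    u = np.asarray(u, float)
    f = lambda lam: 1.0 + rho * np.sum(u**2 / (d - lam))
    d_sorted = np.sort(d)[::-1]                 # d_1 > d_2 > ... > d_n
    # One root in each interval (d_{i+1}, d_i), plus one above d_1 for rho > 0
    upper = d_sorted[0] + rho * np.sum(u**2)    # f > 0 here
    brackets = [(d_sorted[0], upper)] + [(d_sorted[i + 1], d_sorted[i])
                                         for i in range(len(d) - 1)]
    eigvals = []
    for lo, hi in brackets:
        a, b = lo + 1e-12, hi - 1e-12
        for _ in range(100):                    # plain bisection
            mid = 0.5 * (a + b)
            if f(a) * f(mid) <= 0:
                b = mid
            else:
                a = mid
        eigvals.append(0.5 * (a + b))
    lam = np.array(eigvals)
    vecs = np.array([u / (d - l) for l in lam]) # eigenvectors ~ (D - lam I)^{-1} u
    vecs = (vecs.T / np.linalg.norm(vecs, axis=1)).T
    return lam, vecs

d = np.array([4.0, 3.0, 2.0, 1.0])
u = np.full(4, np.sqrt(0.5))
lam, _ = secular_eigen(d, u, rho=1.0)
print(np.sort(lam))
print(np.sort(np.linalg.eigvalsh(np.diag(d) + np.outer(u, u))))
```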

Recursive Procedure. We have shown how the eigenvalues and eigenvectors of T can be obtained from the eigenvalues and eigenvectors of the smaller matrices T_1 and T_2. The procedure we have applied to T can also be applied to T_1 and T_2. Continuing in this manner we can reduce the original eigenproblem to the solution of a series of 1-dimensional eigenproblems and the solution of a series of secular equations. In practice the recursive procedure is not carried all the way down to 1-dimensional problems, but stops at some size where the QR method can be applied effectively. We saw previously that the eigenvector corresponding to the eigenvalue \lambda is proportional to (D - \lambda I)^{-1} u as in equation (3.23). There are a number of subtle issues involved in computing the eigenvectors this way when there are closely spaced pairs of eigenvalues. The interested reader should consult the book by Demmel [4] for a discussion of these issues.

Chapter 4

Iterative Methods

Direct methods for solving systems of equations Ax = b or computing eigenvalues/eigenvectors of a matrix A become very expensive when the size n of A becomes large. These methods generally involve order n^3 operations and order n^2 storage. For large problems iterative methods are often used. Each step of an iterative method generally involves the multiplication of the matrix A by a vector v to obtain Av. Since the matrix A is not modified in this process, it is often possible to take advantage of special structure of the matrix in forming Av. The special structure most often exploited is sparseness (many elements of A are zero). Taking advantage of the structure of A can often drastically reduce the cost of each iteration. The cost of iterative methods also depends on the rate of convergence. Convergence is usually better when the matrix A is well conditioned. Therefore, preconditioning of the matrix is often employed prior to the start of the iteration. There are many iterative methods. In this chapter we will discuss only two: the Lanczos method for eigenproblems and the conjugate gradient method for equation solution.

4.1 The Lanczos Method

As before, we will restrict our attention here to real symmetric matrices. We saw previously that the power method is an iterative method whose m-th iterate x^{(m)} is given by x^{(m)} = A x^{(m-1)}. Lanczos had the idea that better convergence could be obtained if we made use of all the iterates x^{(0)}, Ax^{(0)}, A^2 x^{(0)}, \ldots, A^m x^{(0)} at the m-th step instead of just the final iterate x^{(m)}. The subspace generated by x^{(0)}, Ax^{(0)}, \ldots, A^{m-1} x^{(0)} is called the m-th Krylov subspace and is denoted by K_m. Lanczos showed that you could generate an orthonormal basis q_1, \ldots, q_m of the Krylov subspace K_m recursively. He then showed that the eigenproblem restricted to this subspace is equivalent to finding the eigenvalues/eigenvectors of the tridiagonal matrix T_m = Q_m^T A Q_m where Q_m is the matrix whose columns are q_1, \ldots, q_m. As m becomes larger some of the eigenvalues of T_m converge to eigenvalues of A.

Let q_1 be defined by

q_1 = x^{(0)} / \|x^{(0)}\| (4.1)

and let q_2 be given by

q_2 = r_1 / \|r_1\|   where   r_1 = Aq_1 - (Aq_1 \cdot q_1) q_1. (4.2)

It is easily verified that r_1 \cdot q_1 = q_2 \cdot q_1 = 0. We generate the remaining vectors q_k recursively. Suppose q_1, \ldots, q_p have been generated. We form

r_p = Aq_p - (Aq_p \cdot q_p) q_p - (Aq_p \cdot q_{p-1}) q_{p-1} (4.3)
q_{p+1} = r_p / \|r_p\|. (4.4)

Clearly, r_p \cdot q_p = r_p \cdot q_{p-1} = 0 by construction. For s \le p - 2 we have

r_p \cdot q_s = Aq_p \cdot q_s = q_p \cdot Aq_s. (4.5)

But it follows from equations (4.3)–(4.4) that

Aq_s = r_s + (Aq_s \cdot q_s) q_s + (Aq_s \cdot q_{s-1}) q_{s-1}
     = \|r_s\| q_{s+1} + (Aq_s \cdot q_s) q_s + (Aq_s \cdot q_{s-1}) q_{s-1}. (4.6)

Thus, r_p \cdot q_s = q_p \cdot Aq_s = 0 since Aq_s is a linear combination of vectors q_k with k < p. It follows that q_{p+1} is orthogonal to all of the preceding q_k vectors. We will now show that q_1, \ldots, q_m is a basis for the space K_m. It follows from equations (4.3) and (4.4) that

< x^{(0)} > = < q_1 >   and   < x^{(0)}, Ax^{(0)} > = < q_1, q_2 >.

Suppose for some k we have

< x^{(0)}, Ax^{(0)}, \ldots, A^{k-1} x^{(0)} > = < q_1, q_2, \ldots, q_k >.

Then A^k x^{(0)} can be written as a linear combination of Aq_1, \ldots, Aq_k. It follows from equations (4.3) and (4.4) that Aq_i can be written as a linear combination of q_{i-1}, q_i, q_{i+1}. Therefore, A^k x^{(0)} can be written as a linear combination of q_1, \ldots, q_{k+1} and hence

< x^{(0)}, Ax^{(0)}, \ldots, A^k x^{(0)} > = < q_1, q_2, \ldots, q_{k+1} >.

It follows by induction that q_1, \ldots, q_m is a basis for K_m = < x^{(0)}, Ax^{(0)}, \ldots, A^{m-1} x^{(0)} >.

Define \alpha_p = Aq_p \cdot q_p and \beta_p = Aq_p \cdot q_{p-1}. Then

\beta_p = Aq_p \cdot q_{p-1} = q_p \cdot Aq_{p-1}
        = q_p \cdot \left[ \|r_{p-1}\| q_p + (Aq_{p-1} \cdot q_{p-1}) q_{p-1} + (Aq_{p-1} \cdot q_{p-2}) q_{p-2} \right]
        = \|r_{p-1}\|. (4.7)

It follows from equations (4.3), (4.4), and (4.7) that

Aq_p = \beta_{p+1} q_{p+1} + \alpha_p q_p + \beta_p q_{p-1}. (4.8)

In view of equation (4.8), the matrix T_m = Q_m^T A Q_m has the tridiagonal form

T_m = Q_m^T A Q_m = \begin{pmatrix}
\alpha_1 & \beta_2 \\
\beta_2 & \alpha_2 & \beta_3 \\
 & \ddots & \ddots & \ddots \\
 & & \beta_{m-1} & \alpha_{m-1} & \beta_m \\
 & & & \beta_m & \alpha_m
\end{pmatrix}. (4.9)

The original eigenvalue problem can be given a variational interpretation. Let a function \phi be defined by

\phi(x) = \frac{Ax \cdot x}{x \cdot x}. (4.10)

We will show that \phi(x) is an eigenvalue of A if and only if x is a stationary point of \phi, i.e., \delta_h \phi(x) = 0 for all h. Since

\delta_h \phi(x) \equiv \frac{d}{d\epsilon} \phi(x + \epsilon h)\Big|_{\epsilon=0}
= \frac{(x \cdot x)(2Ax \cdot h) - (Ax \cdot x)(2x \cdot h)}{(x \cdot x)^2}
= \frac{2}{x \cdot x}\left[ Ax - \frac{Ax \cdot x}{x \cdot x}\, x \right] \cdot h, (4.11)

we have

\delta_h \phi(x) = 0 for all h   \iff   \left[ Ax - \frac{Ax \cdot x}{x \cdot x}\, x \right] \cdot h = 0 for all h (4.12a)
\iff   Ax = \frac{Ax \cdot x}{x \cdot x}\, x = \phi(x)\, x. (4.12b)

Suppose in this variational principle we restrict both x and h to the subspace K_m. Then x and h can be expressed in the form x = Q_m y and h = Q_m w for some y, w \in R^m. With these relations equation (4.12a) becomes

\left[ (AQ_m)y - \frac{(AQ_m)y \cdot Q_m y}{Q_m y \cdot Q_m y}\, Q_m y \right] \cdot Q_m w = 0   for all w

or

\left[ (Q_m^T A Q_m)y - \frac{(Q_m^T A Q_m)y \cdot y}{y \cdot y}\, y \right] \cdot w = 0   for all w. (4.13)

Thus the variational principle restricted to K_m leads to the reduced eigenvalue problem

T_m y = (Q_m^T A Q_m) y = \lambda y. (4.14)

It has been found that the extreme eigenvalues usually converge the fastest with this method. The biggest numerical problem with this method is that round-off errors cause the vectors \{q_k\} generated in this way to lose their orthogonality as the number of steps increases. It has been found that this loss of orthogonality increases rapidly whenever one of the eigenvalues of T_m approaches

an eigenvalue of A. There are a number of methods that counteract this loss of orthogonality by periodically reorthogonalizing the vectors \{q_k\} based on the convergence of the eigenvalues.

We can give another way of looking at the Lanczos algorithm. Let K_m denote the matrix whose columns are x^{(0)}, Ax^{(0)}, \ldots, A^{m-1} x^{(0)}. We will show that K_m has a reduced QR factorization K_m = Q_m R_m where Q_m is the matrix occurring in the Lanczos method with columns q_1, \ldots, q_m. We have shown previously that

< x^{(0)} > = < q_1 >
< x^{(0)}, Ax^{(0)} > = < q_1, q_2 >
   \vdots
< x^{(0)}, Ax^{(0)}, \ldots, A^{m-1} x^{(0)} > = < q_1, q_2, \ldots, q_m >.

We can express this result in matrix form as

K_m = (x^{(0)}, Ax^{(0)}, \ldots, A^{m-1} x^{(0)}) = (q_1, \ldots, q_m) R_m = Q_m R_m (4.15)

where R_m is an upper triangular matrix. This is the reduced QR factorization that we set out to establish. Of course we don't want to determine Q_m and R_m directly since the matrix K_m becomes poorly conditioned for large m.
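A minimal sketch of the Lanczos recursion (4.3)–(4.4) building the tridiagonal matrix T_m of equation (4.9); it performs no reorthogonalization, so, as noted above, orthogonality of the q_k degrades for large m. The random symmetric test matrix and the choice m = 30 are arbitrary.

```python
import numpy as np

def lanczos(A, x0, m):
    """Three-term Lanczos recursion: returns T_m (tridiagonal) and the
    orthonormal Krylov basis Q_m. No reorthogonalization is performed."""
    n = A.shape[0]
    Q = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m)                 # beta[p] corresponds to beta_{p+2} below
    q = x0 / np.linalg.norm(x0)
    q_prev = np.zeros(n)
    for p in range(m):
        Q[:, p] = q
        w = A @ q
        alpha[p] = w @ q
        w = w - alpha[p] * q - (beta[p - 1] * q_prev if p > 0 else 0.0)
        beta[p] = np.linalg.norm(w)
        if beta[p] < 1e-14:            # Krylov space exhausted
            Q, alpha, beta = Q[:, :p + 1], alpha[:p + 1], beta[:p + 1]
            break
        q_prev, q = q, w / beta[p]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return T, Q

rng = np.random.default_rng(0)
B = rng.standard_normal((100, 100))
A = B + B.T                                # random symmetric matrix
T, Q = lanczos(A, rng.standard_normal(100), m=30)
# Extreme eigenvalues of T_m approximate extreme eigenvalues of A
print(np.linalg.eigvalsh(T)[[0, -1]])
print(np.linalg.eigvalsh(A)[[0, -1]])
```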

4.2 The Conjugate Gradient Method

The conjugate gradient (CG) method is a widely used iterative method for solving a system of equations Ax = b when A is symmetric and positive definite. It was first introduced in 1952 by Hestenes and Stiefel [9]. Although this was not the original motivation, the CG method can be considered as a Krylov subspace method related to the Lanczos method. We assume that q_1, q_2, \ldots are orthonormal vectors generated using the Lanczos recursion starting with the initial vector b. As before we let Q_k = (q_1, \ldots, q_k) and T_k = Q_k^T A Q_k. Since A is positive definite, we can define an A-norm by

\|x\|_A^2 = x^T A x. (4.16)

We will show that each iterate x_m in the CG method is the unique element of the Krylov subspace K_m that minimizes the error \|x - x_m\|_A where x is the solution of Ax = b.

Let r_k denote the residual r_k = b - Ax_k. Since q_1 = b / \|b\|, it follows that

Q_k^T r_k = Q_k^T (b - Ax_k) = Q_k^T b - Q_k^T A x_k
= \begin{pmatrix} q_1^T \\ \vdots \\ q_k^T \end{pmatrix} b - T_k Q_k^T x_k
= \|b\| e_1 - T_k Q_k^T x_k
= T_k Q_k^T \left( \|b\| Q_k T_k^{-1} e_1 - x_k \right). (4.17)

If x_k is chosen to be

x_k = \|b\| Q_k T_k^{-1} e_1, (4.18)

then Q_k^T r_k = 0, i.e., r_k is orthogonal to each of the vectors q_1, \ldots, q_k and hence to every vector in K_k. It follows from equation (4.18) that x_k is a linear combination of q_1, \ldots, q_k and hence is a member of K_k. If \hat x is an arbitrary element of K_k, then

\hat x = x_k + \delta   for some \delta in K_k.

Since r_k is orthogonal to every vector in K_k, we have

\|x - \hat x\|_A^2 = (x - \hat x)^T A (x - \hat x)
= (x - x_k - \delta)^T A (x - x_k - \delta)
= \|x - x_k\|_A^2 + \|\delta\|_A^2 - 2\delta^T A (x - x_k)
= \|x - x_k\|_A^2 + \|\delta\|_A^2 - 2\delta^T r_k
= \|x - x_k\|_A^2 + \|\delta\|_A^2. (4.19)

Thus \|x - \hat x\|_A^2 is minimized for \delta = 0, i.e., when \hat x = x_k. We will now develop a simple recursive method to generate the iterates x_k.

The matrix T_k = Q_k^T A Q_k is also positive definite and hence has a Cholesky factorization

T_k = L_k D_k L_k^T (4.20)

where L_k is unit lower triangular and D_k is diagonal with positive diagonal entries. Combining equations (4.18) and (4.20), we get

x_k = \|b\| Q_k (L_k^{-T} D_k^{-1} L_k^{-1}) e_1 = \tilde P_k y_k (4.21)

where \tilde P_k = Q_k L_k^{-T} and y_k = \|b\| D_k^{-1} L_k^{-1} e_1. We denote the columns of \tilde P_k by \tilde p_1, \ldots, \tilde p_k and the components of y_k by \eta_1, \ldots, \eta_k. We will show that the columns of \tilde P_{k-1} are \tilde p_1, \ldots, \tilde p_{k-1} and the components of y_{k-1} are \eta_1, \ldots, \eta_{k-1}. It follows from equation (4.20) and the definition of \tilde P_k that

\tilde P_k^T A \tilde P_k = L_k^{-1} Q_k^T A Q_k L_k^{-T} = L_k^{-1} T_k L_k^{-T} = L_k^{-1}(L_k D_k L_k^T) L_k^{-T} = D_k.

Thus

\tilde p_i^T A \tilde p_j = 0   for all i \ne j. (4.22)

It is easy to see from equation (4.9) that T_{k-1} is the leading (k-1) \times (k-1) submatrix of T_k. Equation (4.20) can be written

T_k = \begin{pmatrix}
1 \\
l_1 & \ddots \\
 & \ddots & \ddots \\
 & & l_{k-1} & 1
\end{pmatrix}
\begin{pmatrix}
d_1 \\
 & \ddots \\
 & & d_{k-1} \\
 & & & d_k
\end{pmatrix}
\begin{pmatrix}
1 & l_1 \\
 & \ddots & \ddots \\
 & & \ddots & l_{k-1} \\
 & & & 1
\end{pmatrix}
= \begin{pmatrix} L_{k-1} & 0 \\ l_{k-1} e_{k-1}^T & 1 \end{pmatrix}
  \begin{pmatrix} D_{k-1} & 0 \\ 0 & d_k \end{pmatrix}
  \begin{pmatrix} L_{k-1} & 0 \\ l_{k-1} e_{k-1}^T & 1 \end{pmatrix}^T
= \begin{pmatrix} L_{k-1} D_{k-1} L_{k-1}^T & \star \\ \star & \star \end{pmatrix}

where \star denotes terms that are not significant to the argument. Thus, L_{k-1} and D_{k-1} are the leading (k-1) \times (k-1) submatrices of L_k and D_k respectively. Since L_k has the form

L_k = \begin{pmatrix} L_{k-1} & 0 \\ \star & 1 \end{pmatrix},

the inverse L_k^{-1} must have the form

L_k^{-1} = \begin{pmatrix} L_{k-1}^{-1} & 0 \\ \star & 1 \end{pmatrix}.

Therefore, it follows from the definition of y_k that

y_k = \|b\| D_k^{-1} L_k^{-1} e_1
= \|b\| \begin{pmatrix} D_{k-1}^{-1} & 0 \\ 0 & 1/d_k \end{pmatrix} \begin{pmatrix} L_{k-1}^{-1} & 0 \\ \star & 1 \end{pmatrix} e_1
= \|b\| \begin{pmatrix} D_{k-1}^{-1} L_{k-1}^{-1} & 0 \\ \star & 1/d_k \end{pmatrix} e_1
= \|b\| \begin{pmatrix} D_{k-1}^{-1} L_{k-1}^{-1} & 0 \\ \star & 1/d_k \end{pmatrix} \begin{pmatrix} e_1 \\ 0 \end{pmatrix}   (e_1 is here a (k-1)-vector)
= \begin{pmatrix} \|b\| D_{k-1}^{-1} L_{k-1}^{-1} e_1 \\ \eta_k \end{pmatrix} = \begin{pmatrix} y_{k-1} \\ \eta_k \end{pmatrix},

i.e., y_{k-1} consists of the first k-1 components of y_k. It follows from the definition of \tilde P_k that

\tilde P_k = Q_k L_k^{-T}
= (Q_{k-1}, q_k) \begin{pmatrix} L_{k-1}^{-T} & \star \\ 0 & 1 \end{pmatrix}
= (Q_{k-1} L_{k-1}^{-T}, \tilde p_k) = (\tilde P_{k-1}, \tilde p_k),

i.e., \tilde P_{k-1} consists of the first k-1 columns of \tilde P_k.

We now develop a recursion relation for x_k. It follows from equation (4.21) that

x_k = \tilde P_k y_k
= (\tilde P_{k-1}, \tilde p_k) \begin{pmatrix} y_{k-1} \\ \eta_k \end{pmatrix}
= \tilde P_{k-1} y_{k-1} + \eta_k \tilde p_k
= x_{k-1} + \eta_k \tilde p_k. (4.23)

We now develop a recursion relation for \tilde p_k. It follows from the definition of \tilde P_k that

\tilde P_k L_k^T = Q_k

or

(\tilde p_1, \ldots, \tilde p_k) \begin{pmatrix}
1 & l_1 \\
 & \ddots & \ddots \\
 & & \ddots & l_{k-1} \\
 & & & 1
\end{pmatrix} = (q_1, \ldots, q_k). (4.24)

Equating the k-th columns in equation (4.24), we get

l_{k-1} \tilde p_{k-1} + \tilde p_k = q_k

or

\tilde p_k = q_k - l_{k-1} \tilde p_{k-1}. (4.25)

Next we develop a recursion relation for the residuals r_k. Multiplying equation (4.23) by A and subtracting from b, we obtain

r_k = r_{k-1} - \eta_k A \tilde p_k. (4.26)

Since x_{k-1} belongs to K_{k-1}, it follows that Ax_{k-1} belongs to K_k. Since b also belongs to K_k, it is clear that r_{k-1} = b - Ax_{k-1} is a member of K_k. Since r_{k-1} and q_k both belong to K_k and both are orthogonal to K_{k-1}, they must be parallel. Thus,

q_k = \frac{r_{k-1}}{\|r_{k-1}\|}. (4.27)

We now define p_k by

p_k = \|r_{k-1}\| \tilde p_k. (4.28)

Substituting equations (4.27) and (4.28) into equations (4.23), (4.26), and (4.25), we get

x_k = x_{k-1} + \frac{\eta_k}{\|r_{k-1}\|} p_k = x_{k-1} + \nu_k p_k (4.29a)

r_k = r_{k-1} - \frac{\eta_k}{\|r_{k-1}\|} A p_k = r_{k-1} - \nu_k A p_k (4.29b)

p_k = \|r_{k-1}\| q_k - \frac{l_{k-1}\|r_{k-1}\|}{\|r_{k-2}\|} p_{k-1} = r_{k-1} + \mu_k p_{k-1}. (4.29c)

Here we have used the definitions

\nu_k = \frac{\eta_k}{\|r_{k-1}\|}   and   \mu_k = -\frac{l_{k-1}\|r_{k-1}\|}{\|r_{k-2}\|}.

Equations (4.29a), (4.29b), and (4.29c) are our three basic recursion relations. We next develop a formula for \nu_k. Since r_{k-1} = \|r_{k-1}\| q_k and r_k is orthogonal to K_k, multiplication of equation (4.29b) by r_{k-1}^T gives

0 = r_{k-1}^T r_k = \|r_{k-1}\|^2 - \nu_k r_{k-1}^T A p_k.

Thus

\nu_k = \frac{\|r_{k-1}\|^2}{r_{k-1}^T A p_k}. (4.30)

Multiplying equation (4.29c) by p_k^T A, we get

p_k^T A p_k = p_k^T A r_{k-1} + 0 = r_{k-1}^T A p_k. (4.31)

Combining equations (4.30) and (4.31), we obtain the desired formula

\nu_k = \frac{\|r_{k-1}\|^2}{p_k^T A p_k}. (4.32)

We next develop a formula for \mu_k. In view of equations (4.22) and (4.28), multiplication of equation (4.29c) by p_{k-1}^T A gives

0 = p_{k-1}^T A p_k = p_{k-1}^T A r_{k-1} + \mu_k p_{k-1}^T A p_{k-1}

or

\mu_k = -\frac{p_{k-1}^T A r_{k-1}}{p_{k-1}^T A p_{k-1}}. (4.33)

Multiplying equation (4.29b) by r_k^T, we get

r_k^T r_k = 0 - \nu_k r_k^T A p_k

or

\nu_k = -\frac{r_k^T r_k}{r_k^T A p_k} = -\frac{\|r_k\|^2}{r_k^T A p_k}. (4.34)

Combining equations (4.32) and (4.34), we get

-\frac{\|r_k\|^2}{r_k^T A p_k} = \frac{\|r_{k-1}\|^2}{p_k^T A p_k}. (4.35)

Evaluating equation (4.35) for k \to k-1 and combining the result with equation (4.33), we obtain the desired formula

\mu_k = -\frac{p_{k-1}^T A r_{k-1}}{p_{k-1}^T A p_{k-1}} = \frac{\|r_{k-1}\|^2}{\|r_{k-2}\|^2}. (4.36)

We can now summarize the CG algorithm:

1. Compute the initial values x_0 = 0, r_0 = b, and p_1 = b.

2. For k = 1, 2, \ldots compute

   z = A p_k                                  (save A p_k)
   \nu_k = \|r_{k-1}\|^2 / p_k^T z            (new step length)
   x_k = x_{k-1} + \nu_k p_k                  (update approximation)
   r_k = r_{k-1} - \nu_k z                    (new residual)
   \mu_{k+1} = \|r_k\|^2 / \|r_{k-1}\|^2      (improvement of residual)
   p_{k+1} = r_k + \mu_{k+1} p_k              (new search direction)

3. Stop when \|r_k\| is small enough.

Notice that the algorithm at each step only involves one matrix-vector product, two dot products (by saving \|r_k\|^2 at each step), and three linear combinations of vectors. The storage required is only four vectors (current values of z, r, x, and p) in addition to the matrix A. As with all iterative methods, the convergence is fastest when the matrix is well conditioned. The convergence also depends on the distribution of the eigenvalues.
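A direct transcription of the CG algorithm summarized above into Python; the stopping tolerance and the random SPD test matrix are my own choices.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iters=None):
    """Conjugate gradient method for symmetric positive definite A,
    following the recursions summarized above (x_0 = 0, p_1 = b)."""
    n = len(b)
    max_iters = max_iters or n
    x = np.zeros(n)
    r = b.copy()
    p = b.copy()
    rho = r @ r                       # ||r_{k-1}||^2, reused each step
    for _ in range(max_iters):
        if np.sqrt(rho) < tol:
            break
        z = A @ p                     # one matrix-vector product per step
        nu = rho / (p @ z)            # step length
        x += nu * p                   # update approximation
        r -= nu * z                   # new residual
        rho_new = r @ r
        p = r + (rho_new / rho) * p   # new search direction
        rho = rho_new
    return x

# Small SPD test problem (illustrative): A = B^T B + I
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
A = B.T @ B + np.eye(50)
b = rng.standard_normal(50)
x = conjugate_gradient(A, b)
print(np.linalg.norm(A @ x - b))      # residual near machine precision
```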

4.3 Preconditioning

The convergence of iterative methods often depends on the condition of the underlying matrix as well as the distribution of its eigenvalues. The convergence can often be improved by applying a preconditioner M^{-1} to A, i.e., we consider the matrix M^{-1}A in place of A. If we are solving a system of equations Ax = b, this system can be replaced by M^{-1}Ax = M^{-1}b. The matrix M^{-1}A might be better suited for an iterative method. Of course M^{-1} must be fairly simple to compute, or the advantage might be lost. We often try to choose M so that it approximates A in some sense. If the original A was symmetric and positive definite, we generally choose M to be symmetric and positive definite. However, M^{-1}A is generally not symmetric and positive definite even when both A and M are. If M is symmetric and positive definite, then M = EE^T for some E (possibly obtained by a Cholesky factorization). The system of equations Ax = b can be replaced by (E^{-1}AE^{-T})\hat x = E^{-1}b where \hat x = E^T x. The matrix E^{-1}AE^{-T} is symmetric and positive definite. Since

E^{-T}(E^{-1}AE^{-T})E^T = M^{-1}A,   (similarity transformation)

E^{-1}AE^{-T} has the same eigenvalues as M^{-1}A.

The choice of a good preconditioner is more of an art than a science. The following are some of

the ways M might be chosen:

1. M can be chosen to be the diagonal of A, i.e., M = diag(a_{11}, a_{22}, \ldots, a_{nn}).

2. M can be chosen on the basis of an incomplete Cholesky or LU factorization of A. If A is sparse, then the Cholesky factorization A = LL^T will generally produce an L that is not sparse. Incomplete Cholesky factorization uses Cholesky-like formulas, but only fills in those positions that are nonzero in the original A. If \hat L is the factor obtained in this manner, we take M = \hat L \hat L^T.

3. If a system of equations is obtained by a discretization of a differential or integral equation, it is sometimes possible to use a coarser discretization and interpolation to approximate the system obtained using a fine discretization.

4. If the underlying physical problem involves both short-range and long-range interactions, a preconditioner can sometimes be obtained by neglecting the long-range interactions.

5. If the underlying physical problem can be broken up into nonoverlapping domains, then a preconditioner might be obtained by neglecting interactions between domains. In this way M becomes a block diagonal matrix.

6. Sometimes the inverse operator A^{-1} can be expressed as a matrix power series. An approximate inverse can be obtained by truncating this series. For example, we might approximate A^{-1} by a few terms of the Neumann series A^{-1} = I + (I - A) + (I - A)^2 + \cdots.

There are many more preconditioners designed for particular types of problems. The user should survey the literature to find a preconditioner appropriate to the problem at hand.
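As an illustration of choice 1 above, the following sketch applies CG with a simple Jacobi (diagonal) preconditioner; the recursion in terms of z = M^{-1} r is the standard preconditioned CG variant and is an addition of mine, not spelled out in the notes. The badly scaled test matrix is an arbitrary choice.

```python
import numpy as np

def preconditioned_cg(A, b, M_inv_diag, tol=1e-10, max_iters=None):
    """Preconditioned CG with a diagonal (Jacobi) preconditioner M = diag(A);
    only products M^{-1} r are needed, so M is never factored explicitly."""
    n = len(b)
    max_iters = max_iters or n
    x = np.zeros(n)
    r = b.copy()
    z = M_inv_diag * r               # z = M^{-1} r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iters):
        if np.linalg.norm(r) < tol:
            break
        Ap = A @ p
        nu = rz / (p @ Ap)
        x += nu * p
        r -= nu * Ap
        z = M_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

rng = np.random.default_rng(1)
B = rng.standard_normal((80, 80))
A = B.T @ B + np.diag(np.linspace(1.0, 1000.0, 80))   # badly scaled SPD matrix
b = rng.standard_normal(80)
x = preconditioned_cg(A, b, 1.0 / np.diag(A))
print(np.linalg.norm(A @ x - b))
```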

Bibliography

[1] Beltrami, E., Sulle Funzioni Bilineari, Giornale di Matematiche 11, pp. 98–106 (1873).

[2] Cayley, A., A Memoir on the Theory of Matrices, Phil. Trans. 148, pp. 17–37 (1858).

[3] Cuppen, J., A divide and conquer algorithm for the symmetric tridiagonal eigenproblem, Numer. Math. 36, pp. 177–195 (1981).

[4] Demmel, J. W., Applied Numerical Linear Algebra, SIAM (1997).

[5] Eckart, C. and Young, G., A Principal Axis Transformation for Non-Hermitian Matrices, Bull. Amer. Math. Soc. 45, pp. 118–121 (1939).

[6] Francis, J., The QR transformation: A unitary analogue to the LR transformation, parts I and II, Computer J. 4, pp. 256–272 and 332–345 (1961).

[7] Golub, G. and Van Loan, C., Matrix Computations, Johns Hopkins University Press (1996).

[8] Gu, M. and Eisenstat, S., A stable algorithm for the rank-1 modification of the symmetric eigenproblem, Computer Science Dept. Report YaleU/DCS/RR-967, Yale University (1993).

[9] Hestenes, M. and Stiefel, E., Methods of Conjugate Gradients for Solving Linear Systems, J. Res. Nat. Bur. Stand. 49, pp. 409–436 (1952).

[10] Jordan, C., Sur la réduction des formes bilinéaires, Comptes Rendus de l'Académie des Sciences, Paris 78, pp. 614–617 (1874).

[11] Kublanovskaya, V., On some algorithms for the solution of the complete eigenvalue problem, USSR Comp. Math. Phys. 3, pp. 637–657 (1961).

[12] Trefethen, L. and Bau, D., Numerical Linear Algebra, SIAM (1997).

[13] Watkins, D., Understanding the QR Algorithm, SIAM Review, vol. 24, No. 4 (1982).
