Preconditioned Techniques For Large Eigenvalue Problems

Kesheng Wu

June

Abstract

This research focuses on finding a large number of eigenvalues and eigenvectors of a sparse symmetric or Hermitian matrix, for example, finding many eigenpairs of a single very large matrix. These eigenvalue problems are challenging because the matrix size is too large for traditional QR-based algorithms and the number of desired eigenpairs is too large for most common sparse eigenvalue algorithms. In this thesis we approach this problem in two steps. First, we identify a sound preconditioned eigenvalue procedure for computing multiple eigenpairs. Second, we improve the basic algorithm through new preconditioning schemes and spectrum transformations.

Through careful analysis we see that both the Arnoldi and Davidson methods have an appropriate structure for computing a large number of eigenpairs with preconditioning. We also study three variations of these two basic algorithms. Without preconditioning these methods are mathematically equivalent, but they differ in numerical properties and complexity. However, the Davidson method is much more successful when preconditioned. Despite its success, the preconditioning scheme in the Davidson method is seen as flawed because the preconditioner becomes ill-conditioned near convergence. After comparison with other methods, we find that the effectiveness of the Davidson method is due to its preconditioning step being an inexact Newton method. We proceed to explore other Newton methods for eigenvalue problems in order to develop preconditioning schemes without the same flaws. We found that the simplest and most effective preconditioner is to use the Conjugate Gradient (CG) method to approximately solve the equations generated by the Newton methods. Also, a different strategy for enhancing the performance of the Davidson method is to alternate between the regular Davidson iteration and a polynomial method for eigenvalue problems. To use these polynomials, the user must decide which intervals of the spectrum the polynomial should suppress. We studied different schemes for selecting these intervals and found that these hybrid methods with polynomials can be effective as well. Overall, the Davidson method with the CG preconditioner was the most successful method for the eigenvalue problems we tested.

Chapter

Electronic Structure Simulation

Before entering into the main topic of this thesis, we use this chapter to introduce the context of our research, which also serves as motivation for our research. In this chapter we focus on one application that is the main driving force behind this research in eigensystem solvers: the electronic structure simulation project at the University of Minnesota. We will give some background information about electronic structure simulation and the characteristics of the eigenvalue problems generated from the simulation. The eigenvalue algorithms developed in later chapters are designed to solve eigenvalue problems with the same characteristics as these matrices.

Introduction

Many scientific and engineering applications give rise to eigenvalue problems; that is, they require the solution of the following equation,

Ax = λx,

where A ∈ C^{n×n} is an n × n matrix, λ ∈ C is an eigenvalue, and x ∈ C^n is an eigenvector.

The particular eigenvalue problem we are interested in is generated from a materials science research project. The aim of the research project is to understand the dynamics of microscopic particles, specifically the electronic structures of complex systems. Using quantum physics it is possible to explain and predict certain material properties at a microscopic scale. A key step in the simulation of electronic structure is to find a steady state or a quasi-steady state. Intuitively, the process of finding this steady state is adjusting the electron distribution to minimize the total energy of the system. The total energy is a nonlinear function of the electron distribution. Finding the minimal energy and the corresponding electron distribution can be viewed as solving for the smallest eigenvalue and the corresponding eigenvector of a nonlinear eigenvalue problem. The Self-Consistent Field (SCF) iteration is the primary scheme for solving this nonlinear eigenvalue problem. At each step of the SCF iteration a linear eigenvalue problem is generated. This is the source of our matrix eigenvalue problem.

The Schrödinger equation is a nonlinear eigenvalue problem. At each step of the SCF iteration a matrix eigenvalue problem is generated; to contrast with the overall nonlinear eigenvalue problem, we will refer to this matrix eigenvalue problem as the linear eigenvalue problem in this chapter. However, after this chapter we will be concentrating on this linear eigenvalue problem, and the term eigenvalue problem will refer to the linear eigenvalue problem.

In quantum physics the electron distribution is represented by a wavefunction Ψ. The governing equation is the Schrödinger equation,

H Ψ = E Ψ,

where H is the Hamiltonian operator for the system, E is the total energy, and Ψ is the wavefunction. The Hamiltonian operator describes the motion and interaction of the particles of the system. The wavefunction describes where the particles are. The Schrödinger equation for any nontrivial system is a complex nonlinear Partial Differential Equation (PDE). There are many numerical methods for solving a PDE. One of the most common numerical approaches to solving the Schrödinger equation is to discretize it in a plane-wave basis, which is similar to spectral techniques for solving partial differential equations. This discretization scheme turns the Hamiltonian into a large dense matrix that is often too large to store in a computer's main memory. Usually the matrix is only used in the linear eigenvalue problem; one workaround to this difficulty is to use a Lanczos-type eigenvalue routine which only needs to access the matrix through matrix-vector multiplications. Because the Fast Fourier Transformation (FFT) can be used to accomplish the most computation-intensive operation in the matrix-vector multiplication, the matrix-vector multiplication is relatively inexpensive with this alternative.

With the plane-wave basis the problem is solved in Fourier space. A different approach is to solve the problem in real space by discretizing the Schrödinger equation with a finite difference scheme. For localized systems such as a cluster of atoms, high-order finite difference schemes have been shown to be more efficient than plane-wave techniques in finding solutions of the same accuracy. The matrices resulting from both finite difference methods and plane-wave techniques are large if the number of particles in the system is large. In addition, the size of the matrix is also affected by the desired accuracy of the solution, the characteristics of the atoms involved, and the physical quantities to be computed. For complex systems involving hundreds of atoms, the matrix size could be on the order of millions.

Complex systems may also require one to solve many more linear eigenvalue problems before an acceptable solution for the nonlinear system is reached. If we solve the Schrödinger equation directly, only the smallest eigenvalue is needed. However, if we try to do so, the potential function V would be too complex to compute. In the next section we will show a scheme for simplifying this V. This scheme makes V tractable; at the same time, it requires the computation of a large number of eigenvalues from the linear eigenvalue problem. The number of eigenvalues required is proportional to the number of atoms in the system. Most of the eigenvalue methods traditionally used for this computation are not very effective for finding a large number of eigenvalues. This is an additional challenge for an effective simulation.

Ab Initio pseudopotential simulation

The electronic structure of a condensed matter system (e.g., a cluster, liquid, or solid) is described by a quantum wavefunction, which can be obtained by solving the Schrödinger equation. This equation is very complex because the Hamiltonian operator H describes the motions of all particles in the system and the interactions among all of them. A complete analytical solution is only possible for the simplest atoms. Significant simplification is required to compute any large system. Most theories of condensed matter systems make the following three fundamental approximations to make them manageable.

Born-Oppenheimer approximation. The Born-Oppenheimer approximation neglects the kinetic energy of the nuclei. This is a good approximation for two main reasons. First, the mass of the nuclei is much larger than the mass of the electrons in the system, typically thousands of times larger, so the nuclei move very slowly compared to the electrons. Second, we are not interested in the average motion of the whole system; in other words, we will solve the system in the nuclei's frame of reference. Because of this approximation, the wavefunction we use only involves the electrons. Thus the original electron-nuclear problem now becomes a pure electron problem. Under this circumstance the wavefunction describes the distribution of electrons only. Let r denote a point in space; the density of the electron distribution at r is defined by

ρ(r) = Ψ^H(r) Ψ(r),

where the superscript H denotes the complex conjugate.

Local density approximation. To explain the concept of the Local Density Approximation (LDA), we will first describe a more general theory, the density functional theory. The density functional theory transforms a many-electron problem into a one-electron problem. The simplified Schrödinger equation in this case is called the Kohn-Sham equation,

[−∇² + V_tot(r)] ψ_i(r) = E_i ψ_i(r),

where ∇² is the usual Laplacian operator, V_tot(r) denotes the total potential experienced by an electron at location r, E_i is the energy of the ith state, and ψ_i is the corresponding one-electron wavefunction. For our purposes we can simply take E_i as the ith smallest eigenvalue and ψ_i as the corresponding eigenvector. Because of the Pauli exclusion principle, each state can be occupied by two electrons at most. The electrons will fill the states with the lowest energy if the temperature is at absolute zero (about −273 °C). In this case, if the number of electrons is q, we need to find about q/2 eigenvalues. If the temperature is higher than absolute zero, or if there are degenerate states (i.e., eigenvalues that are equal to each other), some states may be partially occupied, and then more eigenvalues need to be computed. In terms of the one-electron wavefunctions, the electron distribution is defined as

ρ(r) = Σ_{occupied states} |ψ_i(r)|².

The total potential V_tot is the sum of three terms: the potential due to all nuclei and the core electrons, V_ion; the Coulomb potential from all other valence electrons, also known as the Hartree-Fock potential, V_H; and the exchange-correlation potential, V_xc,

V_tot(r) = V_ion(r) + V_H(r) + V_xc(r).

The Local Density Approximation (LDA) further simplifies the computation of the exchange-correlation potential V_xc. It states that if ρ changes slowly with respect to r, the exchange-correlation potential can be approximated by assuming the electron density to be locally uniform. The potential V_xc used in this simulation is obtained from Monte Carlo simulations.

Pseudopotential approximation. Most core electrons do not change their state under normal circumstances, so it is possible to treat a nucleus and the surrounding core electrons as one entity, i.e., an ion. This is the basis of the pseudopotential approximation. The core of this approximation is to replace the complex potential function of this ion with a smoothly varying potential without changing its effect on the chemically active valence electrons. This approximation impacts our computation in two ways. First, it reduces the number of electrons that need to be treated as unknowns, i.e., it reduces the number of eigenpairs needed. Second, it simplifies the electronic potential due to ions, V_ion. Much research has been devoted to the construction of pseudopotentials for different atoms. In our simulations we use the pseudopotential developed for silicon atoms.

For convenience, the ionic potential is broken into two parts, the local term and the nonlocal term,

V_ion ψ = Σ_a V_loc(|r_a|) ψ + Σ_a Σ_{lm} (1 / ΔV^a_lm) u_lm(r_a) ΔV_l(r_a) ∫ u_lm(r_a) ΔV_l(r_a) ψ(r) d³r,

where ΔV^a_lm is the normalization factor,

ΔV^a_lm = ∫ u_lm(r_a) ΔV_l(r_a) u_lm(r_a) d³r.

Here the superscript a denotes an atom at position R_a, r_a = r − R_a, u_lm is the atomic pseudopotential wavefunction associated with the angular momentum quantum numbers (l, m), V_l(r) is an l-dependent ionic pseudopotential generated from u_lm, and ΔV_l(r) = V_l(r) − V_loc(r) is the difference between the l component of the ionic pseudopotential and the local ionic potential. If we set U_alm(r_a) = ΔV_l(r_a) u_lm(r_a) and consider the discretization of the equation on a uniform grid, the nonlocal term becomes a sum of rank-one updates,

V_ion = V_loc + Σ_a Σ_{lm} t_alm U_alm U_alm^T,

where the t_alm are normalization coefficients and all other variables are the discretized matrix and vector forms of the operators and functions.

The Kohn-Sham equation is a nonlinear equation. A part of its nonlinearity comes from the Hartree-Fock potential V_H, which depends on the charge density ρ(r). The charge density function in turn depends on the solution of the Kohn-Sham equation via the definition of ρ above.

1. Guess an initial V_tot,in(r).
2. Solve [−∇² + V_tot,in(r)] ψ_i(r) = E_i ψ_i(r).
3. Compute ρ_out(r) = Σ_i^{occ} ψ_i^H(r) ψ_i(r).
4. Solve ∇² V_H(r) = −4π ρ_out(r).
5. Form V_tot,out(r) = V_ion(r) + V_H(r) + V_xc(ρ_out).
6. If ||V_tot,in(r) − V_tot,out(r)|| is small enough, calculate the total energy, forces, etc.; otherwise set V_tot,in(r) = F(V_tot,in(r), V_tot,out(r)) and go back to step 2.

Figure: Flowchart of the self-consistent iteration for the Kohn-Sham equation.

Once the charge density ρ(r) is known, the Hartree-Fock potential can be found by solving the Poisson equation,

∇² V_H(r) = −4π ρ(r).

The discrete form of this equation is easily solved either by the Conjugate Gradient method or by a fast Poisson solver. Both potentials V_H and V_xc have a local character and are represented by diagonal matrices when discretized using a finite difference scheme on a uniform grid.
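As a small illustration (not the simulation's actual solver), the discrete Poisson equation can be handed directly to a Conjugate Gradient routine. The sketch below assumes a 1-D uniform grid with zero boundary values and a made-up charge density; the real computation solves the 3-D analogue.

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# Solve the discrete Poisson equation  -lap(V_H) = 4*pi*rho  by CG
# on a 1-D uniform grid with zero boundary values (illustration only).
n, h = 100, 0.1                                  # interior points, grid spacing
rho = np.exp(-np.linspace(-5.0, 5.0, n)**2)      # a smooth, made-up density
lap = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
V_H, info = cg(-lap, 4.0 * np.pi * rho)          # -lap is symmetric positive definite
assert info == 0                                 # info == 0 means CG converged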

Now that we have shown how to compute most of the quantities of the Kohn-Sham equation, we can describe the Self-Consistent Field (SCF) iteration in more detail. The SCF iteration can be viewed as an inexact Newton method for accelerating the fixed-point iteration V_tot^new = M(V_tot), in which M(V_tot) represents the process of computing a new potential from the old potential. A flowchart of the SCF iteration is depicted in the figure above. The iteration starts by solving a linear eigenvalue problem based on the current approximate potential V_tot. This gives a new series of eigenpairs (E_i, ψ_i). The eigenvalues are used to determine which states are occupied, and the eigenvectors of the occupied states are used to compute the new electron density ρ_out(r). This new electron density is in turn used to find a new Hartree-Fock potential V_H and a new exchange-correlation potential V_xc. If the difference between the newly computed total potential and the previous one is larger than the prescribed tolerance, a new initial guess for V_tot is computed through extrapolation and the iteration is repeated. At the end of the self-consistent iteration, a steady state is reached, i.e., a local energy minimum for the current configuration of nuclei is found.

Table: Matrix sizes and numbers of eigenvalues computed for a series of Si_xH_y cluster simulations.

Table: Weights for computing the second-order derivative on a uniform grid, listed by order of accuracy.

If the system is time-dependent, e.g., the nuclei are in motion, the problem is discretized in time first to produce a series of quasi-static states, which can be solved using the same process shown in the flowchart above.
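To make the flowchart concrete, the following Python sketch traces the same loop. The callbacks solve_eigen, solve_poisson, and v_xc are placeholders; they, the simple linear mixing, and the assumption that the lowest n_states states are fully occupied are illustrative simplifications, not the actual simulation code.

import numpy as np

def scf_loop(v_ion, solve_eigen, solve_poisson, v_xc, n_states,
             tol=1e-6, max_iter=100, mix=0.3):
    """Sketch of the self-consistent field iteration in the flowchart above.

    solve_eigen(v_tot, n_states) -> (E, psi): lowest eigenpairs of -lap + diag(v_tot),
        with the one-electron wavefunctions psi_i as columns of psi.
    solve_poisson(rho)           -> Hartree potential V_H for the density rho.
    v_xc(rho)                    -> LDA exchange-correlation potential.
    """
    v_in = v_ion.copy()                              # initial guess for V_tot
    for _ in range(max_iter):
        E, psi = solve_eigen(v_in, n_states)         # linear eigenvalue problem
        rho = np.sum(np.abs(psi)**2, axis=1)         # density from occupied states
        v_out = v_ion + solve_poisson(rho) + v_xc(rho)
        if np.linalg.norm(v_in - v_out) < tol:       # self-consistency test
            return E, psi, rho
        v_in = (1.0 - mix) * v_in + mix * v_out      # simple extrapolation/mixing
    raise RuntimeError("SCF iteration did not converge")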

Characteristics of the eigenvalue problems

Having simplified the many-body Schrödinger system into a one-electron problem, we now proceed to discretize the Kohn-Sham equation into a matrix eigenvalue problem. Traditionally, the Kohn-Sham equation is discretized in Fourier space. In the past few years we have started using high-order finite difference schemes. A typical problem solved in our simulations might consist of a few hundred atoms, using a high-order finite difference discretization of the Kohn-Sham equation on a uniform grid. The table above shows the matrix sizes and the numbers of eigenvalues computed for a series of silicon cluster simulations. Because the simulation software excludes the region where the electron density is negligible, the matrix size is usually less than the total number of grid points for the problems solved.

In matrix form, the Hamiltonian is the sum of a Laplacian matrix, three diagonal matrices, and a matrix representing the nonlocal contributions. Two of the three diagonal matrices come from discretizing V_xc and V_H. The third one is due to the local part of the ionic potential V_ion. The nonlocal matrix comes from the nonlocal ionic potential, which can be expressed as a sum of simple rank-one updates over all atoms and quantum numbers (see the equation for V_ion above). Typically a small number of steps is required for the SCF iteration to converge. The matrix is symmetric for simple isolated systems; it can be complex Hermitian when any periodicity is present in the system, e.g., a crystal or the surface of a crystal.

The finite difference scheme used was developed by Fornberg and colleagues. It is used to discretize the Laplacian operator in the Kohn-Sham equation on a uniform grid. In the one-dimensional case, the second-order derivative can be expressed as

d²u/dx² |_{x_i} ≈ (1/h²) Σ_{j=−N}^{N} c_j u_{i+j},

where h is the distance between grid points and u is an arbitrary function. The coefficients c_j are shown in the table of weights above.

Figure: The nonzero pattern of the matrix for an Si_2 cluster (nz = 57289).

Depending on the accuracy desired, we can choose the number of neighbors involved in the above equation. To extend to the 3-D case, we simply approximate the derivative in each direction separately. For a case where high-order accuracy is needed, several neighboring nodes are used to form the discrete form of the Laplacian operator in each of the x, y, and z directions.
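To illustrate the discretization, the sketch below applies a centered difference stencil along one direction. The weights argument stands in for one row of the weights table; the familiar second-order weights [1, -2, 1] are used in the example only because the high-order rows are not reproduced here. The 3-D Laplacian is obtained by applying the same stencil along x, y, and z separately.

import numpy as np

def second_derivative(u, h, weights):
    """Approximate u'' on a uniform grid with a centered stencil.

    weights holds c_{-N}, ..., c_N from one row of the weights table;
    only interior points are returned, so the result is shorter than u.
    """
    N = (len(weights) - 1) // 2
    d2u = np.zeros(len(u) - 2 * N)
    for j, c in zip(range(-N, N + 1), weights):
        d2u += c * u[N + j : len(u) - N + j]
    return d2u / h**2

# Example with the 2nd-order stencil; a higher-order row of the table
# would simply supply a longer weight vector.
x = np.linspace(0.0, 1.0, 101)
u = np.sin(2.0 * np.pi * x)
d2u = second_derivative(u, x[1] - x[0], [1.0, -2.0, 1.0])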

In most test cases we use a slightly modified form of the natural ordering for the grid points. This ordering scheme differs from the natural ordering in that it alternates between counting along the +x direction and the −x direction. It reduces the chance of memory bank conflicts on some computers because it creates more variance in the memory access patterns of operations such as the matrix-vector multiplication.

In the finite-difference matrix, the discrete form of the Laplacian operator dominates the nonzero pattern. If a high-order centered finite difference is used, a few dozen nonzero elements are generated for each row of the matrix. The nonlocal interaction generated from V_ion involves a small cluster of grid points around a silicon atom. This adds additional nonzero elements to selected rows of the matrix. Our finite difference scheme produces Hermitian or symmetric matrices. The figure above shows the nonzero pattern of a matrix for a two-silicon cluster. This matrix, plotted in the figure, is relatively small. The net-like pattern is due to the modification of the natural ordering. In the center of the graph, the nonlocal interactions appear as two squares. The grid points with the same z coordinate are numbered closer to each other, which forms a structure that is similar to the overall structure of the matrix. The size of these smaller structures decreases away from the center of the matrix because more grid points are unused on the planes with the same z value, due to the exclusion of low electron density regions.

A linear eigenvalue problem is produced at each self-consistent iteration. The matrix for the eigenvalue problem can be attributed to four sources: the Laplacian operator, V_ion, V_H, and V_xc. From one SCF iteration to the next, the Laplacian part of the matrix will not change as long as the grid size is the same. The matrix generated from V_ion does not change if the ions do not move. Using the centered finite difference discretization, the potentials V_H and V_xc are transformed into two diagonal matrices. At each self-consistent iteration, V_H and V_xc vary with ρ, i.e., with the ψ_i. Thus the difference between matrices from two successive iterations is only on the diagonal of the matrices. As the nonlinear problem converges, we expect the change in V_H and V_xc to decrease. Therefore the solution of the previous linear eigenvalue problem will be a good initial guess for the next eigenvalue problem.

To reduce storage, the Laplacian part of the matrices is stored in stencil form, i.e., one row of the coefficients in the weights table is stored as the matrix elements. A 2-D array ja is used to maintain the index information of the matrix, where the jth neighbor of the ith grid point is ja(i, j). The potential is broken into components and stored in two parts. All three diagonal matrices are summed together into one vector. The nonlocal potential is stored as a series of short vectors to be used in the rank-one updates.
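A schematic of how the matrix-vector product y = Hψ can be assembled from these stored pieces is given below. The names stencil, ja, v_diag, u_vectors, and t_coeffs are illustrative, and the U vectors are stored full length here for simplicity, even though in practice they are short vectors localized around the atoms.

import numpy as np

def hamiltonian_matvec(psi, stencil, ja, v_diag, u_vectors, t_coeffs):
    """Schematic matrix-vector product using the storage scheme described above.

    stencil   : one row of Laplacian coefficients (stored once, not per row)
    ja        : ja[i, j] = index of the j-th stencil neighbor of grid point i
    v_diag    : sum of the three diagonal potentials (local ionic + V_H + V_xc)
    u_vectors : columns hold the vectors U_alm of the nonlocal term
    t_coeffs  : the corresponding normalization coefficients t_alm
    """
    y = psi[ja] @ stencil                     # Laplacian part via stencil + index array
    y += v_diag * psi                         # the three diagonal potentials
    for t, u in zip(t_coeffs, u_vectors.T):   # nonlocal part as rank-one updates
        y += t * np.dot(u, psi) * u
    return y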

In summary, the eigenvalue problems we would like to address here are large, sparse, and Hermitian. For realistic applications, a large number of eigenvalues and eigenvectors are desired. Good initial guesses are available in most cases.

Chapter

Krylov Eigenvalue Solvers

This chapter explores the similarities and differences between the Arnoldi method and the Davidson method. Based on their similarities, a generic eigenvalue solver is extracted. Several different variations of this generic algorithm are studied by varying the input vector given to the preconditioner. Without preconditioning these variants are mathematically equivalent to each other. However, in practice they are quite different due to differences in numerical properties. With preconditioning they behave drastically differently. Through this study we will identify a numerically stable variation that works well with simple preconditioners.

Introduction

An eigenvalue λ and an eigenvector x of a matrix A are defined by the following equation,

Ax = λx.

The eigenvalue problems of interest here have Hermitian matrices, whose eigenvalues are always real and whose corresponding eigenvectors always exist. In addition, these matrices are also sparse, which means there are only a small number of nonzero entries compared to the total number of entries in the matrix. Matrices of this type are usually not stored in a 2-D n × n array, where n is the matrix size. Thus they only require a fraction of the storage compared to a full matrix of the same size. The trade-off is that accessing a matrix element is often not as fast or as convenient as for a matrix stored in a 2-D array.

In this thesis we will address only the symmetric eigenvalue problem, since it is straightforward to extend everything discussed to the Hermitian case. There are many ways of extracting eigenvalues and eigenvectors from a matrix. In this section we review the basic techniques for large sparse symmetric eigenvalue problems to identify a suitable starting point for our study.

Eigenvalue solvers come in very different forms; however, most of them can be classified as either a QR-type method or a projection method. QR-type methods are very robust in finding eigenvalues and corresponding eigenvectors, and mature implementations of the QR algorithm are widely available. They generally require O(n²) storage and O(n³) operations. If the matrix size is relatively small, say less than a few hundred, and all eigenpairs are needed, a QR-based method is the best choice for the task. The entire set of n eigenvalues is referred to as the spectrum of the matrix. In most practical eigenvalue problems, the matrix size is large and only part of the spectrum is needed, so a QR method is usually not an effective choice.

In contrast to QR methods, projection methods are designed to extract part of the spectrum effectively. Often they only need to access the matrix A through matrix-vector multiplications. For sparse matrices, where the matrix is not stored in the usual n × n array, needing only to perform matrix-vector multiplications with the matrix can significantly reduce the complexity of the eigenvalue program. A projection method for eigenvalue problems has two major components: one to generate a basis, the other to generate an approximate solution by projecting the original eigenvalue problem onto the basis. Most of the variations among different eigenvalue solvers are in the basis generation part, which will be reviewed in the next section. Here we will focus on the projection techniques used to generate approximate eigenvalues and eigenvectors.

There are two common methods to find eigenvalue approximations on a given basis: the Rayleigh-Ritz projection and the harmonic Ritz projection. Given an m-dimensional basis V ∈ R^{n×m}, both projection techniques seek approximate solutions of the form

x = V y,    λ = (x^T A x) / (x^T x),

where λ is an approximate eigenvalue, x is an approximate eigenvector, and y is chosen by the two projection techniques according to their respective criteria. An eigenvalue and an eigenvector together are often referred to as an eigenpair. To reduce the number of symbols used, the exact value of a quantity will use the same symbol as the approximate value; the difference is that a superscript * will be attached to the exact one, for example, (λ*, x*) denotes an exact eigenpair. The approximate eigenvalue λ in the above equation is defined as a ratio known as the Rayleigh quotient. An eigenvalue approximation need not be a Rayleigh quotient; however, due to the optimality of the Rayleigh quotient, it is often the favorite choice. The eigenvalue and eigenvector approximations computed from the Rayleigh-Ritz projection are also known as the Ritz value and the Ritz vector.

Usually V_m is an orthonormal basis, in which case x^T A x = y^T V_m^T A V_m y = y^T H_m y. This means that the Ritz values are the eigenvalues of H_m = V_m^T A V_m. Because the size of H_m is usually small, a QR method can efficiently find its eigenvalues and eigenvectors. In addition, to get a normalized x we only need to choose a normalized y. Since y is much smaller in size than x, it is a lot easier to normalize y. These are good motivations to maintain V_m orthonormal. However, in general the basis V_m does not need to be orthonormal. The residual of the above approximation is

r = Ax − λx = AVy − λVy.

The Rayleigh-Ritz technique projects the original eigenvalue problem Ax = λx by making the residual of the approximate solution orthogonal to the basis V, i.e.,

V^T (AVy − λVy) = 0.

Let H = V^T A V and V be an orthonormal basis, i.e., V^T V = I; then λ and y can be found by solving the following projected eigenvalue problem,

H y = λ y.

This projected eigenvalue problem is usually solved by using a QR method from LAPACK or EISPACK. An eigenvector y of the projected problem is chosen to generate an approximation to the original eigenvalue problem based on the relation x = Vy. For example, if the smallest eigenvalue of A is sought, the eigenvector corresponding to the smallest eigenvalue of H is taken. In this case the eigenvalue approximation can be easily computed: it is simply the eigenvalue of H corresponding to the chosen y. This is another advantage of using an orthonormal basis. The following algorithm describes a version of the Rayleigh-Ritz procedure we will use later.

Algorithm (Rayleigh-Ritz projection) to find an approximate solution to the smallest eigenvalue λ and corresponding eigenvector x:
1. Given an orthonormal basis V,
2. compute H = V^T A V,
3. find the smallest eigenvalue λ of H and the corresponding eigenvector y,
4. the approximate solution to the smallest eigenvalue of A is the smallest eigenvalue λ of H; the corresponding Ritz vector is x = Vy. The residual of the approximation is r = Ax − λx = AVy − λVy.
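A direct transcription of this procedure into a few lines of numpy is given below (dense matrices are used only to keep the sketch self-contained; the basis V is assumed to have orthonormal columns).

import numpy as np

def rayleigh_ritz_smallest(A, V):
    """Rayleigh-Ritz projection on an orthonormal basis V (the algorithm above)."""
    H = V.T @ (A @ V)                 # projected matrix H = V^T A V
    theta, Y = np.linalg.eigh(H)      # eigenpairs of the small matrix H
    lam, y = theta[0], Y[:, 0]        # smallest Ritz value and its eigenvector y
    x = V @ y                         # Ritz vector
    r = A @ x - lam * x               # residual r = Ax - lam x
    return lam, x, r

# Usage: V must have orthonormal columns, e.g. the Q factor of a QR decomposition.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)); A = (A + A.T) / 2
V, _ = np.linalg.qr(rng.standard_normal((50, 5)))
lam, x, r = rayleigh_ritz_smallest(A, V)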

The harmonic Ritz projection is a relatively new technique which imposes the following orthogonality condition on the residual,

(AV)^T (AVy − λVy) = 0.

By defining W = AV, this equation can be rewritten as

W^T W y = λ W^T A^{-1} W y,

which represents a Rayleigh-Ritz projection of A^{-1} on the basis W. To find a vector y satisfying this condition, the following projected eigenvalue problem is solved,

H y = μ G y,

where H = V^T A V and G = (AV)^T (AV). Usually the matrix A is also shifted to favor the eigenvalue sought. For example, if the wanted eigenvalue is close to σ, the above definitions of H and G are replaced by

H = V^T (A − σI) V,    G = V^T (A − σI)^T (A − σI) V.

The vector y is chosen from the eigenvectors of the above generalized eigenvalue problem. Once y is chosen, we can compute the approximation to the original eigenvalue problem using x = Vy and the Rayleigh quotient.

The harmonic Ritz projection algorithm shown below is one of the schemes proposed in the literature. There are different realizations of the harmonic Ritz projection; the one chosen here keeps the basis V orthonormal, which makes it very easy for it to share the same basis generation procedures with the Rayleigh-Ritz projection algorithm described before.

Algorithm (Harmonic Ritz projection) for finding an approximation of the smallest eigenvalue and corresponding eigenvector of a symmetric matrix A:
1. Given an orthonormal basis V ∈ R^{n×m}, compute W = AV,
2. H = W^T V, G = W^T W,
3. solve the generalized eigenvalue problem HY = GYL to find Y, L ∈ R^{m×m}, with L diagonal; for convenience, the columns of Y are normalized,
4. let θ be the minimum of diag(Y^T H Y) and y be the corresponding column of Y, so that y^T H y = θ,
5. θ is the approximation to the smallest eigenvalue, x = Vy is the corresponding eigenvector approximation, and r = Wy − θx is the associated residual vector.
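The corresponding sketch for the harmonic Ritz procedure is shown below; it relies on scipy.linalg.eigh for the generalized problem, which assumes G = W^T W is positive definite (true whenever AV has full rank).

import numpy as np
from scipy.linalg import eigh

def harmonic_ritz_smallest(A, V):
    """Harmonic Ritz projection on an orthonormal basis V (the algorithm above)."""
    W = A @ V
    H = W.T @ V                            # H = W^T V (= V^T A V for symmetric A)
    G = W.T @ W                            # G = W^T W
    mu, Y = eigh(H, G)                     # generalized problem H y = mu G y
    Y = Y / np.linalg.norm(Y, axis=0)      # normalize the columns of Y
    rq = np.einsum('ij,ij->j', Y, H @ Y)   # diag(Y^T H Y), the Rayleigh quotients
    k = int(np.argmin(rq))                 # pick the smallest one
    theta, y = rq[k], Y[:, k]
    x = V @ y                              # eigenvector approximation
    r = W @ y - theta * x                  # residual r = Wy - theta x
    return theta, x, r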

Both the Rayleigh-Ritz procedure and the harmonic Ritz procedure can be used to find eigenvalues other than the minimum eigenvalue by selecting y according to different criteria. The descriptions here are geared toward the eigenvalue problem at hand. It should be a trivial change to adapt them for different needs.

The Rayleigh-Ritz projection is widely used. It is observed that the Rayleigh-Ritz procedure is effective in approximating the extreme eigenvalues, while the harmonic Ritz projection is effective in extracting eigenvalues in the interior of the spectrum. Since we are seeking extreme eigenvalues, we are naturally inclined toward using the Rayleigh-Ritz projection. However, since a large number of eigenvalues is desired, some of them could be considered interior eigenvalues, where the harmonic Ritz projection is better suited. The focus of this chapter is on the basis generation schemes; the Rayleigh-Ritz projection is used throughout.

For some nonsymmetric matrices, the projected eigenvalue problems could be defective, in which case the number of eigenvectors will be less than the size of the matrix. Because the Schur vectors can always be computed, if the matrix A is not symmetric, using a Schur vector as y is a more reliable choice. Since the matrix of interest here is Hermitian, we will only use eigenvectors of the projected problems as y.

Projection methods for eigenvalue problems

All the projection methods for eigenvalue problems can be described by the following abstract algorithm.

Algorithm (A generic algorithm for solving the eigenvalue problem AX = XΛ, where A ∈ R^{n×n} and X may have an arbitrary number of columns):
1. Given an initial basis V_i, expand its size to form V_m ∈ R^{n×m},
2. apply one of the projection schemes on V_m to generate an approximate solution,
3. if the residual of the approximation is smaller than a preset tolerance, stop; else collapse V_m to V_i and go back to step 1.

The aim of this section is to review existing schemes for generating a new basis. Different methods use different initial basis sizes i and maximum basis sizes m. More importantly, they differ on how to extend V_i to V_m.

Subspace iteration. This type of method is mainly used for finding dominant eigenvalues and can be considered an extension of the power method. Some examples include the power method, the Rayleigh quotient iteration, simultaneous iteration, the trace minimization method, and the preconditioned subspace method. Software based on this approach includes RITZIT for symmetric matrices and SRRIT for nonsymmetric matrices. One very distinct feature of this family of methods is that V_i and V_m are the same size. These methods are generally considered reliable because they converge to the desired solutions. However, they often take many more matrix-vector multiplications than the Krylov methods we will discuss next.

In iterative sparse linear system solvers, an auxiliary linear system is solved in order to speed up the convergence. Solving this auxiliary linear system is called preconditioning. Similar approaches have also been applied to speed up the convergence of eigenvalue solvers. Preconditioning techniques for Krylov eigenvalue solvers are more widely published in the United States than preconditioning for the subspace iteration techniques. This is a slight disadvantage for the subspace iteration methods. In summary, after a brief review of this type of method, we have decided to concentrate our attention on the other methods to be discussed.

Krylov methods. Given a matrix A and a vector v, the Krylov subspace is the span of {v, Av, A²v, ...}. Methods in this family generate Krylov subspace bases and then extract eigenvalue and eigenvector approximations from the bases. The power series v, Av, A²v, ... does not make a good basis because the vectors late in the sequence could become very close to each other. One effective process for generating an orthonormal basis of a Krylov subspace is the Arnoldi algorithm described below.

Algorithm (The Arnoldi method for generating an orthonormal basis). Given an initial vector v and a maximum basis size m, let v_1, ..., v_m denote the columns of the basis V, v_1 = v / ||v||, and V_j = [v_1, ..., v_j] be the first j columns of V. For j = 1, 2, ..., m, do the following:
1. z = A v_j,
2. h_{ij} = v_i^T z for i = 1, ..., j, and z = z − Σ_{i=1}^{j} h_{ij} v_i,
3. h_{j+1,j} = ||z||, v_{j+1} = z / ||z||.

This algorithm generates a basis satisfying

A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T,

where V_m = [v_1, ..., v_m], e_m is the mth column of the identity matrix, and H_m = (h_{ij}), i, j = 1, ..., m. A crucial characteristic of this equation is that the matrix H_m is upper Hessenberg. Any basis that satisfies this equation will be referred to as an Arnoldi basis. Note that this H_m is also the same matrix H needed for both projection techniques (see the projected eigenvalue problems above). This is one reason why the Arnoldi algorithm is a favorite for building bases for projection methods.
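For concreteness, a small numpy sketch of the Arnoldi algorithm is given below (a plain implementation for illustration; breakdown is handled only by an early return, and reorthogonalization is omitted).

import numpy as np

def arnoldi(A, v, m):
    """Build an orthonormal basis of the Krylov subspace span{v, Av, ..., A^m v}."""
    n = len(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        z = A @ V[:, j]                       # step 1: z = A v_j
        H[:j + 1, j] = V[:, :j + 1].T @ z     # step 2: h_ij = v_i^T z
        z -= V[:, :j + 1] @ H[:j + 1, j]      #         z = z - sum_i h_ij v_i
        H[j + 1, j] = np.linalg.norm(z)       # step 3: h_{j+1,j} = ||z||
        if H[j + 1, j] == 0.0:                # breakdown: Krylov subspace exhausted
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = z / H[j + 1, j]         #         v_{j+1} = z / ||z||
    return V, H                               # A V[:, :m] = V H holds up to round-off

For symmetric A, the computed H is tridiagonal up to round-off, which is exactly the simplification discussed next.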

For symmetric matrices, the Hessenberg matrix H becomes tridiagonal. Step 2 of the Arnoldi algorithm can be simplified, since h_{ij} = 0 for i < j − 1. The well-known Lanczos algorithm can be viewed as this simplified algorithm; step 2 of the Arnoldi algorithm can be expressed as

h_{j−1,j} = h_{j,j−1},  h_{jj} = v_j^T z,  z = z − h_{j−1,j} v_{j−1} − h_{jj} v_j,  h_{j+1,j} = ||z||,  v_{j+1} = z / ||z||.

Eigenvalue solvers based on this scheme are very effective in finding a few extreme eigenvalues of a matrix. Extensive studies have been done on the Lanczos algorithm for symmetric eigenvalue problems. The books by Parlett and by Cullum and Willoughby are good references on applying it to symmetric eigenvalue problems. A number of eigenvalue solvers based on the Lanczos algorithm are available from the NETLIB mathematical software repository (the URL is http://www.netlib.org).

When the matrix is not symmetric, the Lanczos algorithm can be generalized in two ways: the nonsymmetric Lanczos algorithm and the Arnoldi algorithm. The nonsymmetric Lanczos algorithm still tridiagonalizes the matrix, but it requires two sets of biorthogonal bases. Software packages based on the nonsymmetric Lanczos algorithm are also available from NETLIB and other archives. The Arnoldi algorithm can be used for both symmetric and nonsymmetric eigenvalue problems. In the symmetric case, the difference between the Lanczos algorithm and the Arnoldi algorithm is that the latter maintains an orthonormal basis by explicitly orthogonalizing the new vector against all existing vectors, while the Lanczos algorithm only orthogonalizes the new vector against the two previous ones. Because of this, it is easier for the Arnoldi method to maintain an orthonormal basis than for the Lanczos method. Due to round-off error in digital computers, one pass through the orthogonalization step of the two methods may not produce a result that is orthogonal to the existing basis. This is the loss of orthogonality problem. It is simpler to detect this problem in the Arnoldi algorithm than in the Lanczos algorithm. Much research has been done on detecting and correcting the loss of orthogonality problem in Lanczos eigenvalue methods. Many schemes have been developed as a result of this research, but they are usually quite complex. For simplicity we prefer the Arnoldi method over the Lanczos method.

Preconditioning can be easily introduced into Krylov eigenvalue solvers. Most of the preconditioners used for solving linear systems can be directly translated into preconditioned Krylov eigenvalue solvers. Usually preconditioning is easier to apply to the Arnoldi method than to the Lanczos method. Very simply, to perform preconditioning one can change step 1 of the Arnoldi algorithm to

z = M^{-1} A v_j,

where M is the preconditioner, which approximates A in some way, for example M = diag(A) − λI.
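A minimal sketch of this diagonal preconditioner is shown below; the small eps safeguard against near-zero denominators is an added practical assumption, not part of the formula above.

import numpy as np

def precondition_diag(z, diag_A, lam, eps=1e-10):
    """Apply M^{-1} z with the diagonal preconditioner M = diag(A) - lam*I."""
    d = diag_A - lam
    d = np.where(np.abs(d) < eps, eps, d)   # keep the division well defined
    return z / d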

In the class of Lanczos methods, Ruhe's variant of the block Lanczos method was very instrumental in our development of block eigenvalue solvers. It shows that the initial block size does not have to equal the subsequent block size.

Given a maximum basis size m, the computational complexity of the Arnoldi algorithm is on the order of m²n floating-point operations, in addition to the m matrix-vector multiplications. Computing the eigenvalues and eigenvectors of H using a QR algorithm generally requires O(m³) operations. For large m, these facts mean that the Arnoldi method for finding eigenvalues could be very expensive. For this reason, practical Arnoldi eigenvalue solvers use a relatively small m. However, the choice of m should be made with consideration of many factors, for example, the computer memory available, the number of eigenvalues sought, the size of the matrix, and so on. If m is large, fewer restarts are needed to reach convergence, but building a large basis is expensive. If m is small, it is cheap to build a basis V_m, but more restarts will be needed. Finding an optimal choice is an open research question. Traditionally, at restart, the current Ritz vector is used as the starting vector to build another Krylov basis. This is referred to as explicit restarting, to contrast with what is described next.

Implicitly Restarted Arnoldi method. This is a fairly new method, due to Sorensen, which has quickly caught the attention of many researchers. A software package based on this technique is available. It is named the Implicitly Restarted Arnoldi (IRA) method because it restarts in such a way that the starting vector for the new basis is never explicitly computed. There are two special characteristics that deserve some special attention. First, the Implicitly Restarted Arnoldi (IRA) method can be interpreted as an incomplete QR method. Thus it links the QR algorithm with the Arnoldi algorithm. A great deal of research has been done on improving the convergence rate of QR methods; now these techniques are available to be transplanted to the Arnoldi algorithm.

Another important characteristic of IRA is its restarting technique. IRA restarts by keeping a significant portion of the old Arnoldi basis. Even though the idea of saving more than one Ritz vector was discussed earlier in the literature, the recent interest in restarting techniques was directly sparked by the development of the IRA method. The IRA method successfully demonstrated the potential of keeping more vectors at restart. In the context of the Davidson method, Stathopoulos and colleagues call this strategy of restarting thick restart. Morgan gave an analytical argument for preserving more Ritz vectors in the Arnoldi method: the approximate eigenvectors saved in the basis deflate the relevant part of the spectrum and effectively increase the gap between the wanted eigenvalues and the unwanted ones. This characteristic was further exploited to produce a dynamic restarting scheme.

Davidson method. The original Davidson eigenvalue method was proposed in 1975. By the time Davidson published his survey, the method had gone through a series of significant improvements. Many novel techniques used in the Davidson method and its variants have greatly influenced the development of other methods. It pioneered an early form of preconditioning which is still commonly used today, and it was the first to allow restarting with more vectors than the desired number of eigenvectors, which is an important feature of IRA.

The Davidson method was initially designed to find a few of the smallest eigenvalues of the Schrödinger equation in quantum chemistry applications. Now it is a common technique for finding a few extreme eigenvalues of large matrices generated in numerous science and engineering fields. The publications available on the Davidson method are extensive.

The original Davidson method was proposed for symmetric eigenvalue problems, but later developments have generalized it to nonsymmetric eigenvalue problems and generalized eigenvalue problems. Another area of development is block versions of the Davidson method. More importantly, there have been considerable new developments in preconditioning and restarting techniques. One proposed scheme modifies the original Davidson preconditioning scheme to make the result of preconditioning orthogonal to the current eigenvector approximation. The Jacobi-Davidson method by van der Vorst and colleagues is another way of realizing the same intention, starting from a different point of view. Another significant contribution is the use of a biased estimate for the shift in preconditioning, which can address a number of issues facing the original Davidson preconditioning scheme.

Similar to the Arnoldi method, the Davidson method also needs restarting. So far, only explicit restarting is available for the Davidson method. Without restarting, the Davidson method would eventually converge for every matrix as m approaches n. Crouzeix, Philippe, and Sadkane showed that the restarted version of the Davidson method also converges. There is a strong connection between the Davidson method and the Newton method for eigenvalue problems: the Davidson preconditioning step can be regarded as solving an equation from the Newton method for eigenvalue problems. Due to this connection, it is believed that, given a good preconditioner, the Davidson method can converge quadratically.

In short, two classes of methods appear to be good candidates for our needs: the Krylov methods and the Davidson method. Next we consider one from each class for further evaluation. The Arnoldi method and the original Davidson method are selected because they are simple to implement and easy to use. Many steps in the Arnoldi method and the Davidson method are the same. For this reason we will only give one complete algorithm, including all the steps mentioned in the generic projection algorithm above. We have shown the Arnoldi method for generating an orthonormal basis and the Rayleigh-Ritz projection algorithm, so we will show a complete Davidson algorithm to demonstrate how the components work together (see the original paper for the original form of the algorithm).

The Davidson eigenvalue method is easiest to understand when described as an algorithm for finding one eigenvalue that restarts with the current Ritz vector. This is what is shown in the algorithm below. As before, V_j denotes the first j columns of V. For convenience, we use V_0 to mean null, or literally the first zero columns of V.

Algorithm (Davidson's algorithm for finding the smallest eigenvalue of a symmetric matrix A):
1. Start: choose an initial guess for the eigenvector x. Let z = x.
2. Iterate to build an orthonormal basis V_m = [v_1, ..., v_m]. Let j = 1.
   (a) z = z − V_{j−1} (V_{j−1}^T z), v_j = z / ||z||,
   (b) w_j = A v_j,
   (c) h_{ij} = v_i^T w_j for i = 1, ..., j,
   (d) if j < m, then find a new vector z to augment V_j:
       (i) compute the residual vector r = Ax − λx, where (λ, x) is the current best approximate solution,
       (ii) z = M^{-1} r,
       (iii) let j = j + 1 and go to (a);
       else continue to step 3.
3. Use the Rayleigh-Ritz procedure to find the best approximate solution (λ, x) and the corresponding residual r using the basis V_m.
4. Convergence test: if ||r|| is smaller than the preset limit, stop; else let z = x and go to step 2.

In the above algorithm, the parameter m is preset, and it is chosen to be a small number. Conceptually, the steps of this algorithm are the steps of the generic projection algorithm, where step 2 builds an orthonormal basis and step 3 performs the Rayleigh-Ritz projection on the basis to extract the wanted eigenvalue and eigenvector. More specifically, step 2(a) describes a classical Gram-Schmidt orthogonalization procedure: it orthogonalizes z against the existing basis vectors and appends the resulting normalized vector to the basis to form a new one. This step is responsible for keeping the basis orthonormal. Step 2(b) performs the matrix-vector multiplication. Step 2(c) computes a new column of H = V^T A V. Steps 2(b) and 2(c) are placed here to progressively compute H. If we simply let z = w_j, it is easy to verify that step 2 is a slightly rearranged form of the Arnoldi algorithm. The essential difference between the Davidson method and the Arnoldi method is in how they generate a new vector to augment the basis. To change the above algorithm into a preconditioned Arnoldi method, we could simply skip step 2(d)(i) and replace step 2(d)(ii) with z = M^{-1} w_j.
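The following numpy sketch puts the pieces of the algorithm together for one eigenvalue (explicit restart with the current Ritz vector; H is recomputed from scratch at each step instead of progressively, and breakdown handling is omitted, so this is an illustration rather than the implementation used later). The callback apply_M(r, lam) plays the role of M^{-1}; apply_M = lambda r, lam: r reproduces the unpreconditioned method.

import numpy as np

def davidson_smallest(A, x0, apply_M, m=20, tol=1e-8, max_restarts=100):
    """Sketch of Davidson's algorithm above for the smallest eigenpair of symmetric A."""
    n = len(x0)
    z = x0 / np.linalg.norm(x0)
    for _ in range(max_restarts):
        V = np.zeros((n, m))
        W = np.zeros((n, m))
        for j in range(m):
            z = z - V[:, :j] @ (V[:, :j].T @ z)      # (a) orthogonalize z against V_{j-1}
            V[:, j] = z / np.linalg.norm(z)
            W[:, j] = A @ V[:, j]                    # (b) w_j = A v_j
            H = V[:, :j + 1].T @ W[:, :j + 1]        # (c) projected matrix H_j
            theta, Y = np.linalg.eigh(H)             # Rayleigh-Ritz on the current basis
            lam, y = theta[0], Y[:, 0]
            x = V[:, :j + 1] @ y                     # current Ritz vector
            r = W[:, :j + 1] @ y - lam * x           # (d)(i) residual
            if j < m - 1:
                z = apply_M(r, lam)                  # (d)(ii) precondition the residual
        if np.linalg.norm(r) < tol:                  # step 4: convergence test
            return lam, x
        z = x                                        # explicit restart with the Ritz vector
    return lam, x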

Alternative input vectors

Based on the observed commonalities and differences between the Arnoldi method and the Davidson method, a generic algorithm for generating a basis of a given size can be written as follows.

Algorithm (A generic algorithm for generating an orthonormal basis V_m from b_0 input vectors; the basis is also required to be orthogonal to an orthonormal set X):
1. Place the b_0 input vectors in the first b_0 columns of V. Let j = 1 and b = b_0.
2. Orthonormalization:
   (a) [v_j, ..., v_{j+b−1}] = (I − XX^T − V_{j−1}V_{j−1}^T) [v_j, ..., v_{j+b−1}], where V_{j−1} = [v_1, ..., v_{j−1}],
   (b) perform a QR decomposition on the new vectors, QR = [v_j, ..., v_{j+b−1}], and replace v_j, ..., v_{j+b−1} with the first b columns of Q.
3. Let j = j + b and b = b_1.
4. If j ≤ m, continue; else stop.
5. Compute b vectors s_1, ..., s_b to give to the preconditioner.
6. Apply the preconditioner, [v_j, ..., v_{j+b−1}] = M^{-1} [s_1, ..., s_b], and go to step 2.

This algorithm combines a number of techniques used in more complex eigenvalue solvers. It can be made to reproduce either the Arnoldi algorithm or step 2 of the Davidson algorithm. If b_0 = b_1 = 1 and s = Av_j in step 5, then it is the Arnoldi algorithm. If b_0 = b_1 = 1 and s = r, then it generates the same basis as the Davidson algorithm. In addition, we have incorporated a few changes to make it more flexible. First, it can take an arbitrary number of input vectors. This feature originates from the Davidson method and has been successfully used by many researchers; the augmented Krylov subspace concept is also very similar to this idea. Our experience also shows this to be a worthwhile feature to have. For convenience of discussion, we have split step 2(d) of the Davidson algorithm into four separate steps. Steps 2(b) and 2(c) of the Davidson algorithm are not shown in this algorithm because they are primarily used to produce the projected matrix H_m for the Rayleigh-Ritz procedure. In practice we may compute H_m progressively, either for efficiency or to use the intermediate H_j in step 5 of the algorithm. A good example of this is the Davidson method: since it is necessary to compute the residual vectors for preconditioning, the matrix H_j has to be available at every step.

Step 2 of the above algorithm is the orthonormalization step. It can be viewed as a progressive QR decomposition which orthonormalizes new vectors as they are generated. The basis is also required to be orthogonal to an additional set of vectors X. Usually this additional set of vectors consists of converged eigenvectors; this is part of the process necessary for locking converged eigenvectors out of the working basis. What is shown here is only intended to be a mathematical formula; some attention to detail is required to guarantee the quality of the basis produced. This issue and many other implementation issues are addressed in the next chapter. Although it is possible to set b_1 to any number in step 3 of the algorithm, we have found it to be more effective if b_1 is set to 1 in most cases. We use b_1 = 1 throughout the thesis unless specifically stated otherwise. Both b_0 and b_1 may be referred to as the block size; in our discussion we will always call b_1 the block size and b_0 the initial basis size. Since b_1 is usually 1, we will drop the subscript of s_1 in step 5 of the algorithm.

In the Arnoldi method, if the current block size b is less than its value in the preceding step, there is a choice to be made on what to include in s. A simple scheme we use is to always choose the first b new Av_j's. This can be viewed as a different implementation of the augmented Krylov subspace idea. In the jth step of the Davidson method, it is possible to compute j residual vectors. The process of choosing b out of the j residual vectors is commonly referred to as targeting. A simple scheme, which we will use here for most of our experiments, is to choose the b residual vectors corresponding to the first b wanted eigenvalues. No matter what the size of the block is, there is only one eigenvalue we consider as the targeted eigenvalue at any time. This relates to the choice of shift used in the preconditioner.

In the framework of this generic basis generation algorithm, we can vary what is generated in step 5 to significantly affect what basis is generated. The main focus of this chapter is to vary the input vector generated in this step of the algorithm; a sketch of the framework is given below, and we then describe each choice of input vector, one at a time.
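A sketch of the framework itself, with block size b_1 = 1 and no locked vectors X (the common case used in this thesis), follows; the callback make_s supplies the vector handed to the preconditioner and is what distinguishes the variants described next, and apply_M plays the role of M^{-1}.

import numpy as np

def build_basis(A, V0, m, make_s, apply_M):
    """Generic basis generation (the algorithm above) with block size b1 = 1.

    V0      : n x b0 array holding the initial block of input vectors
    make_s  : make_s(A, V, j, W) -> vector s given to the preconditioner
    apply_M : apply_M(s) -> M^{-1} s  (the identity gives no preconditioning)
    """
    n, b0 = V0.shape
    V = np.zeros((n, m))
    W = np.zeros((n, m))                      # W = A V, kept to avoid extra products
    V[:, :b0] = V0
    j, b = 0, b0
    while True:
        # step 2: orthonormalize the new block against the existing basis
        V[:, j:j + b] -= V[:, :j] @ (V[:, :j].T @ V[:, j:j + b])
        Q, _ = np.linalg.qr(V[:, j:j + b])
        V[:, j:j + b] = Q
        W[:, j:j + b] = A @ V[:, j:j + b]
        j, b = j + b, 1                       # step 3: switch to block size b1 = 1
        if j >= m:                            # step 4: the basis is full
            return V, W
        s = make_s(A, V, j, W)                # step 5: vector for the preconditioner
        V[:, j] = apply_M(s)                  # step 6: apply the preconditioner

For example, make_s = lambda A, V, j, W: W[:, j - 1] reproduces the Arnoldi basis, while returning the residual of the current best Ritz pair reproduces the Davidson basis.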

Orthogonalized Arnoldi scheme. One difference between the Davidson method and the Arnoldi method is that the Davidson method needs to compute the current residual vector at every step. If we neglect the operations required to solve the small eigenvalue problem, and assume that both V_j and W_j = AV_j are saved in memory, computing one residual vector needs about 2jn floating-point multiplications and about as many additions. This is a significant amount of floating-point operations, and we would like to reduce this operation count if possible. Noticing that in many cases the Davidson method works better than the Arnoldi method, the question becomes: what properties of the residual vector are important to the performance of the Davidson method? The Davidson method uses the Rayleigh-Ritz projection to extract eigenvalue approximations. From the orthogonality condition of the Rayleigh-Ritz projection we know that the residual of the approximate solution is orthogonal to the current basis. If the current basis satisfies the Arnoldi recurrence relation

A V_j = V_j H_j + u_{j+1} e_j^T,

then the residual of the Rayleigh-Ritz approximation is

r = A V_j y − λ V_j y = V_j H_j y − λ V_j y + u_{j+1} e_j^T y.

Since H_j y = λ y (see the projected eigenvalue problem above),

r = (e_j^T y) u_{j+1}.

According to the Arnoldi recurrence, w_j = A v_j = Σ_{i=1}^{j} h_{ij} v_i + u_{j+1}; if we orthogonalize w_j at step 5 of the generic algorithm, the vector given to the preconditioner will be

s = u_{j+1},

which is parallel to the residual of the Davidson algorithm. Therefore the two schemes would generate bases spanning the same space.

There is a fairly inexpensive way of modifying the Arnoldi method to make Av_j orthogonal to the basis vectors,

(I − V_j V_j^T) A v_j = A v_j − V_j (V_j^T A v_j).

Since V_j^T A v_j is already computed in the process of computing H_j = V_j^T A V_j, orthogonalizing Av_j against V_j is only half as expensive as computing a residual vector. Let h ∈ R^j denote V_j^T A v_j; the preconditioning vector of this scheme is

s = A v_j − V_j h.

In theory this vector is orthogonal to the basis V_j. If no preconditioning is used, this would be redundant because the vector will be explicitly orthogonalized against V_j again in step 2 of the next iteration of the generic algorithm.

Modified Arnoldi method. This scheme can be considered a trivial implementation of the approximate Cayley transformation scheme presented in the literature. If the current targeted eigenvalue is λ, this scheme computes s as follows,

s = (A − λI) v_j.

This is a simple modification to the Arnoldi method. Since Av_j is computed while computing H_j, only O(n) additional floating-point operations are needed to compute s.

The advantages of this scheme have been articulated elsewhere. One of the main points is that the preconditioned iteration matrix M^{-1}(A − λI) has one eigenvector that is close to the solution. The eigenvalue scheme presented there is a new way of combining the Arnoldi method and the Davidson method.

Harmonic Davidson scheme. The Davidson method computes the residual vector based on the Rayleigh-Ritz projection. The Rayleigh-Ritz procedure has been shown to have many optimality properties. However, it is not the only way to extract eigenvalue approximations from a basis V_m. Notably, the harmonic Ritz values have attracted much attention in the research community. The harmonic Ritz value techniques can be immediately applied to the Davidson method by replacing the Rayleigh-Ritz procedure with the harmonic Ritz procedure. According to the norm used to maintain orthogonality among the basis vectors, three ways of implementing the harmonic Ritz procedure have been proposed: use the 2-norm (i.e., maintain V_m orthonormal), use the A^T A-norm (i.e., maintain W_m orthonormal), or use the A-norm (i.e., maintain V_m and W_m biorthogonal). An example of a harmonic Lanczos method which maintains V_m orthonormal has been shown in the literature, and a harmonic Davidson method which maintains V_m and W_m biorthogonal has also been proposed.

The scheme we will use here is a mixture of both the Rayleigh-Ritz technique and the harmonic Ritz technique. In particular, we use harmonic Ritz values to generate new input vectors for the preconditioner (the vector s in step 5 of the generic algorithm), but use the Rayleigh-Ritz procedure to generate the approximate solution when the basis is full. Because we keep the basis V orthonormal in Davidson's method, we use the harmonic Ritz procedure that maintains V_m orthonormal.

This scheme is more expensive than the Davidson scheme, mainly because it needs both H_m and G_m. Compared to the Davidson method, this scheme needs an extra O(jn) floating-point operations per step to update G_{j−1} into G_j.
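In the framework sketched earlier, the schemes of this section differ only in the make_s callback. The sketch below spells out four of the choices (the harmonic Davidson choice additionally needs G_j and is omitted); note that the modified Arnoldi variant takes the targeted eigenvalue lam as an extra argument, which would be bound via a closure or functools.partial when used as make_s.

import numpy as np

def s_arnoldi(A, V, j, W):
    """Arnoldi: s = A v_j, the last computed column of W."""
    return W[:, j - 1]

def s_orthogonalized_arnoldi(A, V, j, W):
    """Orthogonalized Arnoldi: s = A v_j - V_j h, with h = V_j^T A v_j."""
    h = V[:, :j].T @ W[:, j - 1]
    return W[:, j - 1] - V[:, :j] @ h

def s_modified_arnoldi(A, V, j, W, lam):
    """Modified Arnoldi: s = (A - lam*I) v_j, lam being the targeted eigenvalue."""
    return W[:, j - 1] - lam * V[:, j - 1]

def s_davidson(A, V, j, W):
    """Davidson: s = r, the residual of the current best Ritz pair."""
    H = V[:, :j].T @ W[:, :j]
    theta, Y = np.linalg.eigh(H)
    y = Y[:, 0]
    return W[:, :j] @ y - theta[0] * (V[:, :j] @ y)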

Equivalence properties

So far we have described five different ways of generating an orthonormal basis V_m: the Arnoldi method, the Davidson method, the orthogonalized Arnoldi method, the modified Arnoldi method, and the harmonic Davidson method. This section compares the different schemes without preconditioning, i.e., with M = I in step 6 of the generic basis generation algorithm.

Starting with the same initial vectors, all other aspects being equal, the bases V_m built by the Arnoldi method and the orthogonalized Arnoldi method are identical, because the extra orthogonalization step in the orthogonalized Arnoldi method does not alter the resulting basis vector at any step. This is true in theory; in practice, because of round-off error, the orthogonalized Arnoldi method is slightly more stable thanks to the extra orthogonalization step. The modified Arnoldi method also produces exactly the same basis as the original Arnoldi method. To compare the other two methods we need the help of the following lemma.

Lemma. Assume V_j is an Arnoldi basis (satisfying the Arnoldi recurrence above). The residual of an arbitrary Ritz pair (λ, x) generated according to the Rayleigh-Ritz projection is in the space spanned by [V_j, u_{j+1}].

Proof. Given an arbitrary normal vector y ∈ R^j, the Ritz pair generated by the Rayleigh-Ritz projection has residual r = A V_j y − λ V_j y. According to the assumption, A V_j = V_j H_j + u_{j+1} e_j^T, so the following is true:

A V_j ∈ span{V_j, u_{j+1}}.

Thus r is a linear combination of vectors in [V_j, u_{j+1}], i.e., r ∈ span{V_j, u_{j+1}}. □

The implication of the above lemma is that if V_j is an Arnoldi basis, any vector of the form A V_j y − λ V_j y can be used at step 5 of the generic algorithm: the new basis is another Arnoldi basis, and v_{j+1} is parallel to u_{j+1}. The above lemma can be applied to both the Davidson method and the harmonic Davidson method because y can be any unit vector.

It is fairly straightforward to extend the above lemma to a block version of the Arnoldi basis. We extend the Arnoldi recurrence to the block form as follows,

A V_j = V_j H_j + U B [e_{j−b+1}, ..., e_j]^T,

where H_j is a b-Hessenberg matrix that has b nonzero subdiagonals, B ∈ R^{b×b}, U ∈ R^{n×b}, and e_i is the ith column of the identity matrix. The above lemma can be trivially extended as follows.

Lemma. Assume V_j is a block Arnoldi basis (satisfying the block recurrence above). The residual of an arbitrary Ritz pair (λ, x) is in the space spanned by [V_j, U].

Lemma. If at any step j the basis V_j created by the orthogonalized Arnoldi method, the modified Arnoldi method, the Davidson method, or the harmonic Davidson method spans the same space as the basis of the Arnoldi method, and [V_j, s_1, ..., s_b] is not degenerate, then the basis V_{j+b} will also span the same space.

Proof. It is clear that the V_{j+b} produced by the Arnoldi method, the orthogonalized Arnoldi method, and the modified Arnoldi method will be in the span of [V_j, U] if they are not degenerate. All the residual vectors of the Davidson method and the harmonic Davidson method are in span([V_j, U]). If the original basis and the b residual vectors do not form a degenerate set, they must form a basis of span([V_j, U]). In conclusion, the bases produced by the five methods span the same space. □

If the number of initial guesses is the same as the block size b, then the basis satisfies the block Arnoldi recurrence. If the same initial guesses are used in the five methods mentioned, the bases produced by them should span the same space until the end of the basis generation algorithm. If these bases are given to the same projection procedure to find the same eigenpairs, the resulting Ritz vectors should be identical for all methods. If the new set of vectors after restart also satisfies the block Arnoldi recurrence, then the five methods will be identical throughout. For symmetric matrices, if we choose to save more than b Ritz vectors at restart, the new vectors satisfy this recurrence.

The results stated here combine and generalize some of the previous equivalence theorems. Previously, the restarted Arnoldi method with arbitrary b was shown to be mathematically equivalent to the implicitly restarted Arnoldi method, and the thick-restarted Davidson method was shown to be equivalent to the implicitly restarted Arnoldi method. Combining these two, we can conclude that the restarted Arnoldi method is equivalent to the restarted Davidson method if they start with one initial vector at the beginning, restart with the same Ritz vectors, and add only one vector to their bases at each step. Because of the strong connection between the implicitly restarted Arnoldi (IRA) method and the thick-restarted Arnoldi method for symmetric matrices, the above lemmas also extend to the IRA method.

method                     extra FLOPs (relative to Arnoldi)
Arnoldi                    0
Modified Arnoldi           O(n)
Orthogonalized Arnoldi     O(jn)
Davidson                   O(jn)
Harmonic Davidson          O(jn)

Table Complexity of the methods compared to the Arnoldi method

Practical differences

We have implemented the five schemes for finding eigenvalues and eigenvectors within the framework of the generic algorithm. In our implementation we save both $V$ and $W$ in memory to avoid extra matrix-vector multiplications when computing residual vectors. Comparing the five methods, the Arnoldi method is the cheapest per step. The table above shows how many more floating-point operations are used by the other four methods compared to the Arnoldi method; the comparison is for the step that extends $V_j$ to $V_{j+1}$. Because the orthogonalization procedure is implemented as an iterative procedure, the actual number of floating-point operations of the orthogonalization step depends on the quality of the input vectors, primarily how close the new vector is to being a linear combination of the existing ones. Therefore the actual differences may deviate from what is shown in the table. More details on the orthogonalization procedure will be discussed later. Without preconditioning, the five methods discussed are equivalent to each other in theory. Here we have just shown two differences among them: first, they have different complexities; second, the quality of the bases generated by them is different. When preconditioned, they differ even more.

For convenience of discussion, let us assume $b = 1$ in the algorithm; the vector $s$ generated at this step can always be expressed as

$$ s = \sum_{i=1}^{j} \beta_i v_i + \gamma u, $$

where $\beta_i = v_i^T s$, $\gamma = \| s - \sum_i \beta_i v_i \|$, and $u = (s - \sum_i \beta_i v_i)/\gamma$. If $\gamma$ happens to be zero, $u$ can be any unit vector orthogonal to $V_j$. Since the orthogonalization step normalizes the vector, the size of $\gamma$ relative to $\|\beta\|$ is an important quantity. Because the residual of the Rayleigh-Ritz procedure is orthogonal to the basis $V_j$, this ratio can be considered infinite for the Davidson method. In theory the orthogonalized Arnoldi method should also have an infinite ratio $\gamma/\|\beta\|$, but the ratio is reduced to a finite number if $A v_j$ is close to being linearly dependent on $V_j$, because a single orthogonalization pass is not enough to generate a vector that is exactly orthogonal to $V_j$. The other three methods, the Arnoldi method, the modified Arnoldi method and the harmonic Davidson method, can have much smaller ratios. This indicates that even if $u$ is the same for all five methods, $M^{-1} s$ can be very different. Most often the vector $u$ is not the same either, because the bases do not satisfy the Arnoldi recurrence or because roundoff errors have accumulated to be significant. Due to these differences, even if $V_j$ is the same for all five methods mentioned above, they will most likely produce different $v_{j+1}$. Usually the Davidson method uses a varying preconditioner, such as $M - \theta I = \mathrm{diag}(A) - \theta I$ where $\theta$ is updated at every step, which makes it even harder to quantify the differences between the methods.

Even if the preconditioner inverts $A - \theta I$ exactly, unless $V_j$ is an invariant subspace of $A$, different values of $\gamma/\|\beta\|$ lead to different results of preconditioning. More importantly, the new bases built by the five different methods will be different. This means that even with a perfect preconditioner they will produce different results.

The most effective way of showing how the methods behave is through numerical examples, and a series of numerical results is shown in the remainder of this section. The test matrices used are nondiagonal symmetric matrices. To reduce bias in the data, we have used all symmetric matrices in the collection. They come from a variety of sources, such as structural analysis (see the HB collection tables, Parts I and II). Most of the matrices are from real applications. The performance of the eigenvalue solvers on these problems will give

NAME      N     NNZ    comment
BUS                    admittance matrix, power system
BUS                    admittance matrix, power system
BUS                    admittance matrix, power system
BUS                    admittance matrix, power system
GR                     nine-point matrix on a grid
LUNDA                  A of Lund eigenvalue problem
LUNDB                  B of Lund eigenvalue problem
NOS                    biharmonic operator on beam
NOS                    biharmonic operator on beam
NOS                    biharmonic operator on plate
NOS                    beam structure
NOS                    building structure
NOS                    Poisson's equation in L shape, mixed BC
NOS                    Poisson's equation in unit cube
PLAT                   ocean model
PLAT                   Atlantic ocean model
ZENIOS                 air-traffic control model

Table Nondiagonal symmetric matrices from HB collection, Part I

us some indication of how well they will perform in general. The maximum basis size is the same in all tests, so a fixed total number of vectors is stored in memory. The results shown in the tables are from finding the five smallest eigenvalues and the corresponding eigenvectors. In the application we are interested in, many more eigenpairs are needed; here we are only performing a small test on the eigenvalue solvers. For simplicity, the initial vector used to build the first basis is always the vector of all ones, $(1, 1, \ldots, 1)^T$. At restart, the Ritz vectors corresponding to the smallest Ritz values are saved. There are many nondiagonal symmetric matrices in the Harwell-Boeing collection; to save space, only the cases where five eigenvalues were found within the allowed number of matrix-vector multiplications are shown. An entry with a dash indicates that the eigenvalue solver did not find the eigenvalues within the allowed number of matrix-vector multiplications.

Most of the test matrices are very sparse, typically having only a few nonzero entries per row. The results were obtained on a SPARC workstation, and all the Harwell-Boeing matrices shown in the tables fit into the main memory of the computer. Because the typical matrix-vector operation is less expensive than an average orthogonalization in terms of floating-point operations, the complexity of the whole eigenvalue solver is dominated by the orthogonalization and projection steps. The complexity table above clearly shows that the Arnoldi method is cheaper than the others; when the number of matrix-vector multiplications is about the same, the Arnoldi method should use the least amount of time.

The first set of test results is shown in the table captioned "Solving nondiagonal systems without preconditioning". We have run the five methods on all test problems. Only four methods are shown in the table, because the number of matrix-vector multiplications (MATVEC) used by the modified Arnoldi method is exactly the same as that of the Arnoldi method. For each of the four methods shown, two columns are displayed: the one headed MATVEC is the number of matrix-vector multiplications used, and the one headed time shows the total execution time in seconds. From this table we see that the Arnoldi method does use less time if about the same number of matrix-vector multiplications is used. Comparing the number of matrix-vector multiplications used by the different methods, the Davidson method clearly uses fewer than any other.

Generally, we can observe two trends from the data in the table:
1. The Davidson method generally uses fewer iterations to converge than the others.
2. The Arnoldi method uses less time than the others when the same number of matrix-vector multiplications is used.

NAME      N     NNZ    comment
BCSSTK                 stiffness matrix
BCSSTK                 stiffness matrix, small oil rig
BCSSTK                 stiffness matrix
BCSSTK                 stiffness matrix, oil rig
BCSSTK                 stiffness matrix, transmission tower
BCSSTK                 stiffness matrix
BCSSTK                 stiffness matrix
BCSSTK                 stiffness matrix, frame building (TV studio)
BCSSTK                 stiffness matrix, square plate clamped
BCSSTK                 stiffness matrix, buckling of hot washer
BCSSTK                 stiffness matrix, ore car (lumped masses)
BCSSTK                 stiffness matrix, ore car (consistent masses)
BCSSTK                 stiffness matrix, fluid flow
BCSSTK                 stiffness matrix, roof of Omni Coliseum, Atlanta
BCSSTK                 stiffness matrix, module of an offshore platform
BCSSTK                 stiffness matrix, Corps of Engineers dam
BCSSTK                 stiffness matrix, elevated pressure vessel
BCSSTK                 stiffness matrix, Ginna nuclear power station
BCSSTK                 stiffness matrix, part of a suspension bridge
BCSSTK                 stiffness matrix, frame within a suspension bridge
BCSSTK                 stiffness matrix, clamped square plate
BCSSTK                 stiffness matrix, textile loom frame
BCSSTK                 stiffness matrix, 3-D building
BCSSTK                 stiffness matrix, winter sports arena
BCSSTK                 stiffness matrix, story skyscraper
BCSSTK                 stiffness matrix, reactor containment floor
BCSSTK                 stiffness matrix, buckling problem
BCSSTK                 solid element model, MSC NASTRAN
BCSSTM                 mass matrix
BCSSTM                 mass matrix, buckling of hot washer
BCSSTM                 mass matrix, ore car (consistent masses)
BCSSTM                 mass matrix, fluid flow (generalized eigenvalues)
BCSSTM                 mass matrix, buckling problem

Table Nondiagonal symmetric matrices from HB collection, Part II

The orthogonalized Arnoldi method was built as a hybrid of the Arnoldi method and the Davidson method. In a couple of cases it used fewer matrix-vector multiplications than all the others (e.g., BUS and BCSSTK); however, the advantage of this method is not significant. The harmonic Davidson method is too expensive per step to be competitive in this experiment.

The next table shows the results of using the diagonal preconditioner $M - \theta I = \mathrm{diag}(A) - \theta I$. In this table all five methods are shown, but only the time for the Davidson method is reported. The value $\theta$ in the preconditioner is the latest available eigenvalue approximation: in the Davidson method and the harmonic Davidson method the eigenvalue is updated at every step, while in the Arnoldi method and the orthogonalized Arnoldi method it is only updated after the basis is full. We have made two modifications to the plain diagonal preconditioner; here is how the shifted diagonal is defined in our program:

$$ m_{ii} = \begin{cases} a_{ii} - \theta, & \text{if } |a_{ii} - \theta| \ge \delta\,|a_{ii}|,\\ \delta\,|a_{ii}|, & \text{otherwise,} \end{cases} $$

where $\delta$ is a small positive constant that keeps the diagonal entries of the preconditioner away from zero.
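A minimal sketch of applying a safeguarded diagonal preconditioner of this kind is shown below; the threshold value `delta` and the function name are illustrative choices, not the constants used in the thesis code.

```python
import numpy as np

def safeguarded_diagonal_solve(diag_A, theta, r, delta=1e-2):
    """Apply z = (M - theta*I)^{-1} r with M = diag(A), guarding small pivots.

    Entries a_ii - theta that are tiny relative to |a_ii| are replaced by
    delta*|a_ii| so the preconditioner never divides by a near-zero number
    (delta is an illustrative value)."""
    m = diag_A - theta
    small = np.abs(m) < delta * np.abs(diag_A)
    m = np.where(small, delta * np.abs(diag_A), m)
    return r / m
```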

The table only shows the cases where all five wanted eigenpairs were found by at least one of the methods. The Davidson method solves more problems than any other method in this case. Even in the cases where another method also found all eigenpairs, the Davidson method uses fewer iterations and less time.

Arnoldi Orthogonalized Davidson Harmonic

MATVEC time MATVEC time MATVEC time MATVEC time

BUS

BUS

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTM

BCSSTM

GR

LUNDA

LUNDB

NOS

NOS

NOS

ZENIOS

Table Solving nondiagonal systems without preconditioning

Arnoldi Orthogonalized Modified Davidson Harmonic

MATVEC MATVEC MATVEC MATVEC time MATVEC

BUS

BUS

BUS

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTM

BCSSTM

BCSSTM

BCSSTM

GR

LUNDA

LUNDB

NOS

NOS

NOS

NOS

NOS

ZENIOS

Table Solving the nondiagonal systems with diagonal preconditioning

Arnoldi Orthogonalized

MATVEC time MATVEC time

BUS

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTM

BCSSTM

GR

LUNDA

LUNDB

NOS

NOS

NOS

ZENIOS

Table Solving nondiagonal systems with diagonal preconditioning using better shifts

Comparing the two preceding tables, we see that the Davidson method is the only method that benefited from the diagonal preconditioner; all the others converged on fewer problems with this preconditioner than without preconditioning.

We have observed some characteristic differences among the five eigenvalue routines when a preconditioner is used. Now we will try to satisfy ourselves that these characteristics are not due to some peculiarity of the diagonal preconditioner used. The first thing we noticed is that we did not update the shift of the preconditioner in the Arnoldi method and the orthogonalized Arnoldi method, so we modified them to compute the eigenvalue approximation at every step. The table above ("using better shifts") contains the results of this experiment. The orthogonalized Arnoldi method can now converge on more systems than before; however, this is still only about half the number of systems solved by the Davidson method. More about the shifting strategy will be discussed later; here we note that changing the shifting strategy does not seem to make the orthogonalized Arnoldi method comparable with the Davidson method.

The table captioned "Solving nondiagonal systems with SOR preconditioning" shows the results of using SOR as the preconditioner to find the smallest eigenvalues of the Harwell-Boeing matrices. More systems reached convergence with the SOR preconditioner than with the diagonal preconditioner. However, we again see the same trend that is present in the diagonal preconditioning case: the method that is most successful in taking advantage of the SOR preconditioner is the Davidson method, and the Arnoldi method lags behind the others in the number of systems converged.

Both diagonal scaling and SOR are often considered weak preconditioners for linear systems. Next we use two incomplete LU factorization preconditioners, which are often regarded as more robust for linear systems. The results are shown in the ILU and ILUTP tables. The first preconditioner is ILU, which preserves the sparsity pattern of the matrix in the LU factors. The second is ILUTP, an incomplete LU factorization with column pivoting and both threshold-based and fill-based dropping strategies. Because the matrix $A - \theta I$ is often indefinite, ILUTP is more stable than ILUT. Both ILU and ILUTP are from SPARSKIT. The ILUTP preconditioner has three parameters: the level of fill, the drop tolerance and the pivoting threshold. In our experiments the drop tolerance and the pivoting threshold are kept fixed, and the level of fill is set so that each row of the L and U factors can have one more nonzero element than the average number of nonzero elements in the lower and upper triangular parts of the original matrix. Our intent is to find out which one of the five methods performs better given the same preconditioner, so we simply allowed the program to perform a new factorization every time a different shift is generated. This is not the most time-efficient way of using these two preconditioners; however, it does not affect our ability to compare how the five different eigenvalue solvers behave. As before, we see that the Davidson method reaches convergence on more problems than the others.

Arnoldi Orthogonalized Modified Davidson Harmonic

MATVEC MATVEC MATVEC MATVEC time MATVEC

BUS

BUS

BUS

BUS

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

GR

LUNDA

LUNDB

NOS

NOS

NOS

NOS

NOS

NOS

ZENIOS

Table Solving nondiagonal systems with SOR preconditioning

Even on the problems where more than one method reached convergence, the Davidson method is often the one that takes the least amount of time and the least number of matrix-vector multiplications.

One summary table below gives the total number of eigenvalue problems that converged under each method, with or without preconditioning. Without preconditioning, the five methods converged on the same problems; with preconditioning, the Davidson method converges on more problems than any other method. A second summary table counts the number of cases where a method converged in the least amount of time. In the unpreconditioned case the Arnoldi method uses the least amount of time to reach convergence on this set of problems, while in the preconditioned cases the Davidson method invariably reaches convergence faster than the others.

All preconditioners used up to now are imperfect, i.e., they solve an approximate form of $(A - \theta I) z = s$. What if this equation were solved exactly? In the linear system case, if the preconditioner is exactly the matrix itself, the linear system is solved in one step; in the eigensystem case this is not true. We have observed performance differences between the five eigenvalue solvers; the particular question we would like to answer is whether the eigenvalue solvers become identical if the preconditioner is exact. To answer this, we apply the diagonal preconditioner to a series of diagonal matrices. The diagonal matrices we use are from the Harwell-Boeing collection (see the table of diagonal RSA matrices). For our purpose here, the important

Arnoldi Orthogonalized Modified Davidson Harmonic

MATVEC MATVEC MATVEC MATVEC time MATVEC

BUS

BUS

BUS

BUS

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

GR

LUNDA

LUNDB

NOS

NOS

NOS

NOS

NOS

ZENIOS

Table Solving nondiagonal systems with ILU preconditioning

thing is that they are diagonal matrices. One table below shows the results of the unpreconditioned case and another shows the results of the preconditioned case. In this test the preconditioner improved the performance of every method; however, the Davidson method still uses fewer matrix-vector multiplications and/or less time than the other four methods in most cases.

Since the Arnoldi method we implemented always lags behind the Davidson method, it is natural to suspect that our implementation might be flawed in some way. To show that we have made a reasonably sound implementation, we compared our results against a few public-domain eigenvalue packages. The package that is closest to our implementation of the Arnoldi method is ARPACK. In the ARPACK table we show the number of matrix-vector multiplications and the time used to reach convergence by two IRA variants. Note that the basis size we use in all our tests is the same as that of the smaller IRA variant; however, we need vectors to save both $V$ and $W$, so our total workspace is very close to that of the larger variant. The ARPACK table only shows the matrices for which all wanted eigenvalues converged within the allowed number of matrix-vector multiplications. Compared with the unpreconditioned results, the Arnoldi method we used converged on slightly more problems, and it is very competitive in both the number of matrix-vector multiplications and the time used. We knew that the five methods behave differently even with a perfect preconditioner; the diagonal-matrix test shows that the Davidson preconditioning scheme is better than the others with a perfect preconditioner too.

Arnoldi Orthogonalized Modified Davidson Harmonic

MATVEC MATVEC MATVEC MATVEC time MATVEC

BUS

BUS

BUS

BUS

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTM

BCSSTM

BCSSTM

BCSSTM

GR

LUNDA

LUNDB

NOS

NOS

NOS

NOS

NOS

NOS

ZENIOS

Table Solving nondiagonal systems with ILUTP preconditioning

                Arnoldi   Orthogonalized Arnoldi   Modified Arnoldi   Davidson   Harmonic Davidson

NONE

diagonal

SOR

ILU

ILUTP

Table Number of eigenvalue problems solved by each method

                Arnoldi   Orthogonalized Arnoldi   Modified Arnoldi   Davidson   Harmonic Davidson

NONE

diagonal

SOR

ILU

ILUTP

Table Number of cases where the method reached convergence in the least amount of time

NAME      N     comment
BCSSTM          mass matrix
BCSSTM          mass matrix, small oil rig
BCSSTM          mass matrix
BCSSTM          mass matrix, oil rig
BCSSTM          mass matrix, transmission tower
BCSSTM          mass matrix
BCSSTM          mass matrix, frame building (TV studio)
BCSSTM          mass matrix, square plate clamped
BCSSTM          mass matrix, ore car (lumped masses)
BCSSTM          mass matrix, part of a suspension bridge
BCSSTM          mass matrix, frame within a suspension bridge
BCSSTM          mass matrix, clamped square plate
BCSSTM          mass matrix, textile loom frame
BCSSTM          mass matrix, 3-D building
BCSSTM          mass matrix, winter sports arena
BCSSTM          mass matrix, story skyscraper
BCSSTM          mass matrix, reactor containment floor

Table Diagonal RSA matrices from HB collection

Arnoldi Orthogonalized Davidson Harmonic

MATVEC time MATVEC time MATVEC time MATVEC time

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

Table Solving diagonal systems without preconditioning

Summary

From comparing the Arnoldi method and the Davidson method, we observed that many eigenvalue solvers fit into the framework of a generic projection eigenvalue algorithm. From this generic algorithm we described three further variations. The five methods are mathematically equivalent to each other without preconditioning. If no preconditioner is available, the most efficient method among the five is the Arnoldi method, because it is cheaper than the others per iteration. However, compared to the others, the basis it creates is more likely to lose orthogonality, which can lead to more matrix-vector multiplications being used to compute the same solution. Even though it might take a few extra steps, if the matrix-vector multiplication is relatively cheap the Arnoldi method may still take less time than the others.

When a preconditioner is applied, the five methods behave quite differently. A good preconditioner can enhance the performance of all five methods we tested. However, in the tests we have conducted, the Davidson method takes advantage of preconditioning much better than the other four methods. This trend is consistent across all the preconditioners we tried: diagonal scaling, SOR, ILU and ILUTP. Even when the preconditioner is exactly $A - \theta I$, the five methods still produce different bases. As we have seen in the case of diagonal matrices with the diagonal preconditioner, the Davidson method can still outperform the others even with exact preconditioners.

Arnoldi Orthogonalized Modified Davidson Harmonic

MATVEC MATVEC MATVEC MATVEC time MATVEC

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

Table Solving diagonal systems with diagonal scaling

IRA IRA

Name   MATVEC   time (sec)   MATVEC   time (sec)

BUS

BUS

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTM

BCSSTM

GR

LUNDA

LUNDB

NOS

NOS

NOS

ZENIOS

Table Solving the nondiagonal systems using ARPACK


Through this study we have shown that preconditioning the projection methods for eigenvalue problems is an effective way of improving their performance. The tests with diagonal matrices showed that if a good preconditioner is available, the performance of all five methods can be significantly improved. However, even with a weak preconditioner such as diagonal scaling or SOR, the Davidson method can reach convergence on more problems. On those problems that can be solved without preconditioning, the Davidson method with preconditioning can often reduce the time and the number of matrix-vector multiplications used to compute the solutions.

Chapter

Preconditioning Schemes

From the study of the five preconditioned projection methods for eigenvalue problems in the previous chapter, we see that preconditioning can significantly reduce the number of matrix-vector multiplications and the time needed to solve an eigenvalue problem. The Davidson method is especially effective in taking advantage of a preconditioner: it is fairly simple to implement, very effective in many cases, and works well with simple preconditioners. In many applications where the Davidson method is used, when a simple preconditioner is not sufficient it is often possible to construct an effective preconditioner based on the characteristics of the problem. However, there are also cases where aggressive preconditioners do not work well with the Davidson method. This and other issues, discussed later in this chapter, have generated a great deal of interest in seeking new preconditioning schemes for eigenvalue algorithms. The Davidson preconditioning scheme can be viewed as an inexact Newton method for eigenvalue problems. In this chapter we extend this theme and develop a number of preconditioning schemes that approximate Newton-type algorithms for eigenvalue problems. Similar to the Rayleigh Quotient Iteration (RQI), these Newton methods converge cubically to a simple eigenvalue of a symmetric matrix. In addition to their high convergence rate, the preconditioning schemes developed from them avoid some of the serious difficulties of the Davidson preconditioning scheme.

The first section of this chapter reviews the current state of the Davidson preconditioning, including its connection to the Newton method and its weaknesses. The next section describes a number of Newton-type recurrences for eigenvalue problems and their relation to the Rayleigh Quotient Iteration. The following section introduces strategies for using these Newton-type methods as preconditioners and practical issues related to their implementation. A few numerical examples are then presented, and a short summary closes the chapter.

Davidson preconditioning

When Davidson first described his method, he did not use the word preconditioning. Later on, many researchers noticed the resemblance between the technique used by Davidson and preconditioning techniques for solving linear systems, and the diagonal scaling step of the Davidson method came to be called preconditioning. Since the diagonal scaling can be regarded as an approximation to $A - \lambda I$, we refer to any technique that approximates $A - \lambda I$ as a Davidson preconditioning scheme. Many techniques for preconditioning linear systems are now applied in the Davidson method to improve the convergence to eigenvalues. Preconditioning for eigenvalue problems is gradually being accepted as a practical tool; theoretical study of this subject has just begun.

Preconditioning for iterative linear system solvers is a well-studied research topic. In order to speed up the convergence of an iterative procedure for a linear system $Ax = b$, the procedure is usually applied to one of the following three preconditioned systems instead:

$$ M^{-1} A x = M^{-1} b, $$
$$ A M^{-1} (M x) = b, $$
$$ M_L^{-1} A M_R^{-1} (M_R x) = M_L^{-1} b, $$

where the first form is commonly referred to as left preconditioning, the second as right preconditioning, and the third as split preconditioning. One way of viewing preconditioning is that it attempts to make $M^{-1}A$, $AM^{-1}$ or $M_L^{-1}AM_R^{-1}$ close to the identity matrix, because a linear system with the identity matrix is trivial to solve. Since the spectra of $M^{-1}A$, $AM^{-1}$ and $M_L^{-1}AM_R^{-1}$ can differ arbitrarily from that of $A$, preconditioning for eigenvalue problems has to take a different form. There are other ways of viewing preconditioning, for example preconditioning to reduce the condition number and preconditioning to increase stability; however, they are not directly relevant to eigenvalue problems either.
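For concreteness, the sketch below applies diagonal preconditioning to a small made-up symmetric positive definite linear system with the conjugate gradient method; the matrix, its size, and the diagonal scaling are illustrative assumptions only.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, LinearOperator

# Illustrative SPD tridiagonal system with a strongly varying diagonal.
n = 200
d = np.linspace(3.0, 1e4, n)
A = diags([d, -np.ones(n - 1), -np.ones(n - 1)], [0, -1, 1]).tocsr()
b = np.ones(n)

# M = diag(A) approximates A; CG is given the action of M^{-1}.
Minv = LinearOperator((n, n), matvec=lambda v: v / d)

x_plain, _ = cg(A, b)            # unpreconditioned
x_prec, _ = cg(A, b, M=Minv)     # diagonally preconditioned
print(np.linalg.norm(A @ x_plain - b), np.linalg.norm(A @ x_prec - b))
```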

An eigenvalue algorithm like the Lanczos method or the Davidson method can be viewed as having two distinct components: one builds a basis and one performs a projection onto the basis. In this setting, preconditioning is applied to the basis generation step of the algorithm. As shown in the previous chapter, the unpreconditioned Davidson method appends the current residual vector to the basis; the Davidson preconditioning scheme applies a preconditioner to the residual vector in an attempt to improve the quality of the basis. The original Davidson method solves the following linear system for preconditioning:

$$ (M - \theta I)\, z = r, $$

where $M$ is the diagonal part of the matrix $A$, $\theta$ is the approximate eigenvalue, $r = (A - \theta I)x$ is the residual, and $x$ is the approximate eigenvector. Solving this linear system is the preconditioning step, or simply preconditioning; the matrix $M - \theta I$ is called the preconditioner. A preconditioner for linear systems approximates $A$, while the Davidson preconditioner approximates $A - \lambda I$. Because of this similarity, any preconditioner for linear systems can also be adapted for eigensystems. However, using an approximation of $A - \lambda I$ as a preconditioner has the following pitfalls, most of which are pitfalls of the underlying Newton method for eigenvalue problems:

1. Linearly dependent basis. When $M$ is very close to $A$, one would expect a good preconditioner based on experience with linear system solvers. However, the above equation then suggests that $z \approx x$; appending $z$ to the current basis will cause the new basis to be almost linearly dependent. Usually this means we have to perform extra reorthogonalization to maintain the orthogonality of the basis. If the difference between $z$ and $x$ is very small, the new basis vector generated will be dominated by roundoff error, which might impede convergence.

2. Ill-conditioned iteration matrix. The matrix $A - \theta I$ is ill-conditioned when the approximate eigenvalue $\theta$ is close to an actual eigenvalue. In  it was shown that the error from this ill-conditioned system does not hurt the convergence of inverse iteration. One might be tempted to adapt that result to the preconditioner. However, the argument used there cannot be applied directly, because the preconditioner is usually applied through an iterative solver or an approximate factorization, which only approximately solves $(A - \theta I)z = r$. In addition, a linear system with an ill-conditioned matrix is generally harder to solve with an iterative method.

3. Misconvergence. When the initial Ritz value is far from the desired eigenvalue, the Davidson method may converge to an eigenvalue that is close to the initial Ritz value rather than the desired one. There is no practical procedure that can guarantee that the Lanczos method or the Davidson method will converge to the desired solution, with or without preconditioning; however, this problem is much more serious with preconditioning than without.

4. Indefinite iteration matrix. Usually $\theta$ is inside the spectrum of $A$; if $M$ is a good approximation to $A$, then $M - \theta I$ is indefinite. An indefinite system is harder to handle for a common preconditioner such as an iterative linear system solver or an incomplete factorization. An indefinite iteration matrix can cause an iterative solver to converge slowly, and an incomplete LU factorization procedure may fail on an indefinite matrix because it may encounter a zero pivot.

5. Complex shift. When the matrix is nonsymmetric, $\theta$ may be complex, and the above preconditioned linear system is then complex. This can demand added storage and/or added complexity in the preconditioning procedure.

Among these five problems, we will not encounter the fifth one because we deal with Hermitian matrices only. We will address the other four problems, with a strong emphasis on the first two.
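To make the basic Davidson preconditioning step above concrete, here is a minimal sketch of a Davidson iteration with the diagonal preconditioner $M = \mathrm{diag}(A)$; the test matrix, the block size of one, and the loop length are illustrative assumptions rather than the thesis implementation.

```python
import numpy as np

def davidson_expand(A, V):
    """One Davidson step: Rayleigh-Ritz on the basis V, then expand V with the
    diagonally preconditioned residual z = (M - theta*I)^{-1} r, M = diag(A)."""
    H = V.T @ A @ V                      # projected matrix
    theta, Y = np.linalg.eigh(H)
    x = V @ Y[:, 0]                      # Ritz vector for the smallest Ritz value
    r = A @ x - theta[0] * x             # residual
    z = r / (np.diag(A) - theta[0])      # Davidson preconditioning step
    z -= V @ (V.T @ z)                   # orthogonalize against the current basis
    z /= np.linalg.norm(z)
    return np.column_stack([V, z]), theta[0], np.linalg.norm(r)

rng = np.random.default_rng(1)
n = 100
A = np.diag(np.arange(1.0, n + 1)) + 1e-2 * rng.standard_normal((n, n))
A = (A + A.T) / 2                        # symmetric, diagonally dominant test matrix
V = np.linalg.qr(rng.standard_normal((n, 2)))[0]
for _ in range(15):
    V, theta, rnorm = davidson_expand(A, V)
print(theta, rnorm)                      # approximates the smallest eigenvalue
```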

There are several variations on the Davidson preconditioning scheme designed to overcome some of the difficulties mentioned above. The original preconditioning scheme in the Davidson method solves $(M - \theta I)z = r$, where $M$ is the diagonal part of $A$. One of the earliest modifications is due to Olsen and colleagues, who use the following equation for preconditioning:

$$ (M - \theta I)\, z = r - \epsilon_O\, x. $$

This was shown to be an effective scheme if an appropriate $\epsilon_O$ is used. The original criterion for choosing $\epsilon_O$ was to make $z$ orthogonal to $x$, as this avoids the first problem mentioned above. To achieve this, it is easy to show that we must have

$$ \epsilon_O = \frac{x^T (M - \theta I)^{-1} r}{x^T (M - \theta I)^{-1} x}. $$
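A minimal sketch of this Olsen correction with a diagonal $M$ follows; the helper name and the assumption that no diagonal entry of $M - \theta I$ vanishes are illustrative.

```python
import numpy as np

def olsen_correction(diag_A, theta, x, r):
    """Olsen-style correction with M = diag(A): solve (M - theta*I) z = r - eps*x
    with eps chosen so that the result z is orthogonal to x."""
    m = diag_A - theta            # diagonal of M - theta*I (assumed nonzero here)
    Minv_r = r / m
    Minv_x = x / m
    eps = (x @ Minv_r) / (x @ Minv_x)
    z = Minv_r - eps * Minv_x
    return z                      # x @ z == 0 up to roundoff
```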

The idea of orthogonalizing the output of the preconditioning against the Ritz vector appears to be a logical choice for avoiding the linearly dependent basis problem; the Jacobi-Davidson preconditioning scheme can be thought of as a different implementation of the same intention. In order to solve the Olsen equation, it is necessary to apply the preconditioner to both $x$ and $r$, which doubles the cost of preconditioning. Stathopoulos and coauthors later modified this scheme based on the correction equation

$$ (A - \lambda I)\, d = -(A - \lambda I)\, x, $$

where $\lambda$ and $\hat{x} = x + d$ are the exact eigenvalue and the corresponding eigenvector. Based on this equation, the error in $\theta$ is used as $\epsilon$ in the Olsen equation. This provides an inexpensive alternative to the original choice of $\epsilon_O$.

For convenience of discussion, we will refer to the first modified residual as the Olsen residual and to this later modification as the Stathopoulos residual; when referring to both of them without distinction we will use the term modified residual. In general we will write them as

$$ s = r - \epsilon x = \bigl(A - (\theta + \epsilon) I\bigr) x, $$

where the actual value of $\epsilon$ is chosen to make $s$ either the Olsen residual or the Stathopoulos residual. Most often the latter is used.

In the preconditioning equation, $\theta$ is used as the shift of the preconditioner $M$. It is a common choice; however, there is no indication that it is an optimal choice. In , the authors proposed to use a biased estimate of the eigenvalue in the above preconditioner, based on the correction equation. In order to define this biased estimate, recall that the error of a Ritz pair can be bounded as follows:

$$ |\lambda - \theta| \le \frac{\|r\|^2}{g}, $$

where $g$ is the gap between $\theta$ and the closest eigenvalue other than $\lambda$. When seeking the smallest eigenvalues, the biased estimate of $\lambda$ is $\theta - \|r\|^2/g$, where $\theta$ is the current Ritz value; when looking for the largest eigenvalues it is $\theta + \|r\|^2/g$. Using this biased eigenvalue estimate, denoted $\bar{\theta}$, the Stathopoulos residual can be written as $s = (A - \bar{\theta} I)x$. In many cases this biased eigenvalue estimate is closer to the exact eigenvalue than the Ritz value. Numerical tests have shown the biased shift to be more effective than $\theta$; tests also show that it is better at addressing the misconvergence problem in the above list. Some researchers have argued that $M - \theta I$ should be positive definite; using a biased shift will in many cases result in a definite preconditioner. One limitation of the biased estimate idea is that it does not extend easily to complex eigenvalues.

After the above two modifications of the residual and the shift, the preconditioning equation can be rewritten as

$$ (M - \bar{\theta} I)\, z = s, $$

where $\bar{\theta}$ is the biased eigenvalue estimate and $s$ is the modified residual defined above. We note that this preconditioning equation is essentially an approximate form of the correction equation. Dongarra and colleagues have studied this subject from the perspective of improving the accuracy of computed eigenvalues. It was pointed out in  that directly solving the correction equation may also face the same difficulties we have listed above.

Recently the Jacobi-Davidson method has attracted much attention. The preconditioning step of the Jacobi-Davidson method can be written as

$$ (I - x x^T)(A - \theta I)(I - x x^T)\, z = r $$

instead of the Davidson preconditioning equation. The matrix on the left-hand side is singular, which makes it unsuitable for many preconditioning techniques. However, a special advantage of this scheme is that when this linear system is solved using a Krylov iterative solver, the result is orthogonal to the approximate eigenvector. In spirit this can be regarded as a reincarnation of the Olsen scheme, because it is a different way of making $z$ orthogonal to $x$.
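The projected system above can be solved matrix-free with a Krylov method; the sketch below uses SciPy's MINRES through a LinearOperator. The sign convention (right-hand side $r$) follows the equation as written here, and the helper name is illustrative.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, minres

def jd_correction(A, theta, x, r):
    """Approximately solve (I - x x^T)(A - theta*I)(I - x x^T) z = r with MINRES;
    the computed correction is kept orthogonal to the Ritz vector x."""
    n = A.shape[0]
    proj = lambda v: v - x * (x @ v)                       # apply I - x x^T
    op = LinearOperator((n, n),
                        matvec=lambda v: proj(A @ proj(v) - theta * proj(v)))
    z, _ = minres(op, proj(r))
    return proj(z)
```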

In the remainder of this chapter we will only try to find one eigenpair $(\lambda, \hat{x})$. The subscripts on the variables $\theta_i$, $x_i$ and $r_i$ indicate that they are the values produced at the $i$th iteration; the initial values are denoted $\theta_0$ and $x_0$. For simplicity we will always assume that $x_i$ is a unit vector and $\theta_i$ is the corresponding Rayleigh quotient $x_i^T A x_i$.

Newton method for eigenvalue problems

The Newton method is a class of methods that seek a solution of $f(x) = 0$ by performing the following recurrence:

$$ x_{i+1} = x_i - \left(\left.\frac{df}{dx}\right|_{x = x_i}\right)^{-1} f(x_i). $$

The derivative $df/dx$ is known as the Jacobian, or the Jacobian matrix, and is usually denoted by $J$. For a continuously differentiable function $f$, if $J$ is nonsingular and the starting point $x_0$ is close enough to the solution, this recurrence converges quadratically. An inexact Newton method is one that solves the equation $J (x_{i+1} - x_i) = -f(x_i)$ inaccurately.

The Davidson preconditioner solves a Newton equation approximately. For this reason, given a good enough preconditioner, the Davidson method could converge quadratically. From our description in the previous section, it is clear that the Davidson preconditioning scheme can be viewed as an approximate form of the correction equation. If the exact eigenvalue $\lambda$ is known, the corresponding eigenvector can be found by applying the following Newton recurrence to solve $r(x) \equiv (A - \lambda I)x = 0$:

$$ x_{i+1} = x_i - (A - \lambda I)^{-1} r_i, \qquad r_i = (A - \lambda I)\, x_i. $$

Note that this recurrence is equivalent to the correction equation. Because the Davidson preconditioning approximates the correction equation, it is an inexact form of the above Newton recurrence.

If the above equation is actually solved exactly, the new iterate is zero: $x_{i+1} = x_i - (A - \lambda I)^{-1}(A - \lambda I)x_i = 0$. This is a different manifestation of the linearly dependent basis problem described earlier. Moreover, since $\lambda$ is an exact eigenvalue, the Jacobian matrix $A - \lambda I$ cannot be inverted, so the above Newton recurrence is not well defined. To correct this, the following recursion is considered instead:

$$ x_{i+1} = x_i - (A - \lambda I)^{+} r_i, $$

where the superscript $+$ indicates the pseudoinverse. By the definition of the pseudoinverse, $A A^{+} A = A$. The above equation is still equivalent to the correction equation:

$$ (A - \lambda I)\, x_{i+1} = (A - \lambda I)\, x_i - (A - \lambda I)(A - \lambda I)^{+}(A - \lambda I)\, x_i = (A - \lambda I)\, x_i - r_i = 0. $$

This Newton iteration removes the components orthogonal to $\hat{x}$ from $x_i$. If the initial guess $x_0$ is not orthogonal to $\hat{x}$, the vector $x_{i+1}$ computed by this recurrence is a nontrivial solution of $r(x) = 0$; in other words, $x_{i+1}$ is the wanted eigenvector.

Through the above argument we see that solving the correction equation using the pseudoinverse has a significant advantage over the plain inverse. In practice we do not know the exact eigenvalue $\lambda$; therefore $A - \theta I$ is not singular and $(A - \theta I)^{+} = (A - \theta I)^{-1}$, and we continue to face the same problems mentioned above. Since the pseudoinverse of a matrix differs from the inverse only when the matrix is singular, we could force the matrix to be singular, as is done in the Jacobi-Davidson method. Note that the Jacobi-Davidson method was not derived this way, but this is a valid interpretation. In the Jacobi-Davidson method a Krylov subspace method is used to solve the preconditioning equation; the Krylov subspace method approximates the pseudoinverse by computing a solution in the range of the iteration matrix. One difference between the plain-inverse recurrence and the pseudoinverse recurrence is that the latter generates a correction that is orthogonal to the exact eigenvector $\hat{x}$; in other words, the correction is in the range of the iteration matrix. Both the Olsen preconditioning scheme and the Jacobi-Davidson preconditioning scheme are successful because they generate orthogonal corrections. An alternative strategy to address the same issue is to find a Newton method with a nonsingular Jacobian matrix. Searching for such a well-behaved Newton method is the main objective of this section.

Newton-based schemes for eigenvalue problems have been investigated from many different points of view. Peters and Wilkinson described a Newton method for eigenvalue problems which solves an augmented system of dimension $n+1$; the eigenvalue problem is treated as an unconstrained minimization problem in this case. As shown above, the correction equation can be regarded as a Newton method for eigenvalue problems. One of the main uses of the correction equation is to improve the accuracy of computed eigenpairs; numerous variations of the correction equation have been studied for this purpose in . The Newton method is also often used as a part of the homotopy method for eigenvalue problems; in this case the Newton method used is very close to what is proposed in . The Jacobi-Davidson method is regarded as a form of Newton's method in the space perpendicular to $x$; due to this unique construction it avoids some of the pitfalls of the original Davidson preconditioning scheme. There are a number of eigenvalue methods that are based on a Newton method for eigenvalue problems, for example the trace minimization method and the inflated inverse iteration. Since we are interested in mimicking the original Davidson preconditioning scheme, we will not directly adopt these methods as preconditioners, though the idea may deserve some attention.

In this section we revisit the augmented Newton scheme proposed in , slightly reformulated so that the eigenvector is normalized. Three variations are derived from this augmented Newton method; one of them leads back to the Olsen scheme, and all three are equivalent to the Rayleigh quotient iteration in exact arithmetic. The eigenvalue problem can also be formulated as a constrained optimization problem, in which case the Newton method for constrained optimization proposed by Tapia can be applied. We will describe this constrained minimization problem and discuss its properties. In addition, we will present the Jacobi-Davidson iteration as a Newton method for eigenvalue problems; coincidentally, it too is equivalent to the Rayleigh quotient iteration.

In this section we use the subscript $i$ to indicate an arbitrary iteration. If a subscript is missing from $\theta$, $x$ or $r$, it is implied that they refer to $\theta_i$, $x_i$ or $r_i$.

Rayleigh quotient iteration

The Rayleigh Quotient Iteration (RQI) is a very effective method for finding one eigenvalue and its eigenvector. Its asymptotic convergence rate is cubic for simple eigenvalues of symmetric eigenvalue problems. Unless the eigenvalue is known, no algorithm can converge to an eigenvector faster. It is also very simple to describe; the basic iteration can be summarized in one line:

$$ \theta_i = x_i^T A x_i, \qquad x_{i+1} = \frac{(A - \theta_i I)^{-1} x_i}{\|(A - \theta_i I)^{-1} x_i\|}, $$

where $x_0$ is a unit vector.
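A minimal sketch of this iteration in floating point is shown below; the random symmetric test matrix and the fixed number of steps are illustrative choices, not one of the thesis test problems.

```python
import numpy as np

def rqi(A, x0, iters=8):
    """Rayleigh quotient iteration:
    theta_i = x_i^T A x_i,  x_{i+1} = (A - theta_i I)^{-1} x_i, normalized."""
    n = A.shape[0]
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        theta = x @ (A @ x)
        y = np.linalg.solve(A - theta * np.eye(n), x)
        x = y / np.linalg.norm(y)
    theta = x @ (A @ x)
    return theta, x

rng = np.random.default_rng(2)
A = rng.standard_normal((60, 60)); A = (A + A.T) / 2
theta, x = rqi(A, np.ones(60))
print(theta, np.linalg.norm(A @ x - theta * x))   # converges to *some* eigenpair
```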

Note that the Rayleigh quotient iteration has most of the problems we have observed for preconditioners, because the iteration matrix used in RQI is the same as in the Davidson preconditioning. Our

[Table: Ritz values theta_i, residual norms ||r_i||, and condition numbers of the iteration matrix at each step of RQI while computing the smallest eigenvalue of PLAT.]

[Table: Ritz values, residual norms, and condition numbers from RQI while computing the smallest eigenvalue of EX.]

main interest here is not to study the Rayleigh quotient iteration but to use it for comparison against the Newton methods to be discussed. Because we will often refer to its iteration matrix, we name it $J_R$: $J_R \equiv A - \theta_i I$. We will also regard it as the iteration matrix for the Newton recurrence described above, since $\lambda$ is not known before convergence.

To test the Newton methods we will describe, we apply them to a test matrix called PLAT. It is a small symmetric positive definite matrix (see the earlier table for its size and origin). All of the eigenvalues of PLAT come in pairs, i.e., they are doublets. We set the tolerance on the residual norm small enough that the eigenvalues can be accurately resolved. Our intention here is to seek the smallest eigenvalue and the corresponding eigenvector, starting with the initial guess $(1, 1, \ldots, 1)^T$. This experiment is conducted on a SPARC workstation using MATLAB, which uses 64-bit floating-point arithmetic. For our purposes the unit roundoff error $\epsilon$ can be understood as follows: if the largest eigenvalue in absolute value is $\lambda_{\max}$, any eigenvalue with absolute value less than $\epsilon\,|\lambda_{\max}|$ is indistinguishable from zero. The condition number of a symmetric matrix can be defined as $|\lambda_{\max}|/|\lambda_{\min}|$, where $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest eigenvalues in absolute value. In this arithmetic, any matrix with condition number greater than about $1/\epsilon$ can be considered practically singular. Note that finding the smallest eigenvalues of PLAT is relatively hard for the Davidson method.

In a number of cases we will also use a second test matrix, EX, which is a finite element matrix for a fully coupled Navier-Stokes equation. It is generated from the second example problem in the FIDAP package. It is a symmetric matrix with only simple eigenvalues. It is chosen because of its unusual spectrum: there is a group of well-separated negative eigenvalues and a large cluster of eigenvalues near zero. Again we use the vector $(1, 1, \ldots, 1)^T$ as the initial guess. For this test matrix we iterate until the residual norm can no longer be reduced.

1 SPARC is a trademark of Sun Microsystems.
2 MATLAB is a trademark of The MathWorks.
3 FIDAP is a trademark of Fluid Dynamics International.

The first RQI table shows $\theta_i$ and $\|r_i\|$ from solving PLAT using RQI. In this table the first column is the iteration number $i$, the second and third columns are the Ritz values $\theta_i$ and the residual norms $\|r_i\|$, and the fourth column is the condition number $\kappa_i$ of the iteration matrix $J_R$. RQI did not converge to the smallest eigenvalue in this case; however, it does converge very quickly to an eigenvalue close to the initial Ritz value. One important observation to note here is the trend of $\kappa_i$: as the iteration progresses, it quickly grows to practically infinity for this arithmetic. For later comparison we also applied RQI to EX; the results are shown in the second RQI table. Note that in this case RQI converges to an eigenvalue within the large cluster around zero.

Augmented Newton method

One way of formulating the eigenvalue problem is to express it as finding a zero residual with the pair $(x, \theta)$ as the independent variable. This formulation can be described as seeking a solution to the following quadratic system:

$$ (A - \theta I)\, x = 0, \qquad \tfrac{1}{2}\,(x^T x - 1) = 0. $$

The original form of this Newton method is presented in . Many researchers have noticed that the scaling scheme in that paper was not practical, and different variations have been used; the scheme shown here is natural for the 2-norm. This is an unconstrained quadratic problem. Given an initial guess $(x_0, \theta_0)$, the Newton iteration can be described as follows:

$$ J_A \begin{pmatrix} x_{i+1} - x_i \\ \theta_{i+1} - \theta_i \end{pmatrix} = - \begin{pmatrix} (A - \theta_i I)\, x_i \\ \tfrac{1}{2}\,(x_i^T x_i - 1) \end{pmatrix}, \qquad J_A \equiv \begin{pmatrix} A - \theta_i I & -x_i \\ x_i^T & 0 \end{pmatrix}. $$
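A minimal sketch of one step of this iteration is given below; the dense solve of the $(n+1)\times(n+1)$ system, the random test matrix, and the fixed number of steps are illustrative assumptions.

```python
import numpy as np

def augmented_newton_step(A, x, theta):
    """One step of the augmented Newton recurrence: solve the (n+1)x(n+1) system
    with Jacobian [[A - theta*I, -x], [x^T, 0]] for the corrections (dx, dtheta)."""
    n = A.shape[0]
    J = np.zeros((n + 1, n + 1))
    J[:n, :n] = A - theta * np.eye(n)
    J[:n, n] = -x
    J[n, :n] = x
    F = np.concatenate([(A - theta * np.eye(n)) @ x, [(x @ x - 1.0) / 2.0]])
    delta = np.linalg.solve(J, -F)
    return x + delta[:n], theta + delta[n]

rng = np.random.default_rng(3)
A = rng.standard_normal((40, 40)); A = (A + A.T) / 2   # illustrative symmetric matrix
x = np.ones(40) / np.sqrt(40)
theta = x @ (A @ x)
for _ in range(12):
    x, theta = augmented_newton_step(A, x, theta)
print(theta, np.linalg.norm(A @ x - theta * x))        # typically some eigenpair of A
```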

We refer to this scheme as the augmented Newton recurrence because it solves a linear system whose size is larger than the size of the original eigenvalue problem. The validity of this Newton scheme has been established in . However, we will still repeat part of the argument, mainly to introduce the techniques used, because they will be used again for other Newton schemes. First let us give two simple definitions that will be used in a couple of places later.

If $(\lambda, x)$ is an exact eigenpair and $\lambda$ is a simple eigenvalue, we will call $(\lambda, x)$ a simple eigenpair. If $(\lambda, x)$ is an eigenpair, the solution $y$ of the equation

$$ (A - \lambda I)\, y = x $$

is called a principal vector of grade 2 corresponding to $\lambda$, and corresponds to at least a quadratic elementary divisor of the characteristic polynomial of $A$. Note that $x$ may be arbitrarily scaled in this equation, even though we prefer a unit vector as an eigenvector. A basic fact about principal vectors is that if $\lambda$ is a simple eigenvalue, there is no principal vector of grade 2 or higher. This fact is used immediately in the proof of the following lemma.

Lemma. If $(\lambda, x)$ is a simple eigenpair, the Jacobian matrix $J_A$ of the augmented Newton recursion is nonsingular.

For a full proof of this lemma see . The essential argument is as follows. Assume there is a vector $(y; \alpha)$, $y \in R^n$, $\alpha \in R$, such that $J_A (y; \alpha) = 0$; we must show that both $y$ and $\alpha$ are zero. According to the definition of $J_A$, $y$ and $\alpha$ must satisfy

$$ (A - \lambda I)\, y = \alpha\, x, \qquad x^T y = 0. $$

[Table: Intermediate results (theta_i, ||r_i||, condition numbers) from the augmented Newton recurrence when computing the smallest eigenvalue of PLAT.]

[Table: Intermediate results from applying the augmented Newton recurrence to the EX test problem.]

If $y$ is not zero, then either $y$ is another eigenvector corresponding to $\lambda$ (when $\alpha$ is zero) or $y$ is a principal vector corresponding to $\lambda$ (when $\alpha$ is not zero); in either case $\lambda$ is not a simple eigenvalue, which contradicts the assumption.

This lemma has two implications. First, because the augmented system is quadratic in $(x, \theta)$, it guarantees that given a good initial guess the Newton recurrence described above converges quadratically. Second, the iteration matrix $J_A$ is nonsingular; this is important because we are searching for a Newton method for eigenvalue problems with a nonsingular Jacobian matrix, and we have found our first one.

In our implementation of this method, we only take an eigenvector initial guess as input. This initial guess is normalized to produce $x_0$, and the initial eigenvalue $\theta_0$ is taken to be the corresponding Rayleigh quotient $x_0^T A x_0$. The augmented Newton table above shows the convergence process of this method applied to the test

[Figure: Convergence histories of RQI (solid line) and the augmented Newton recurrence (dashed line) applied to the PLAT test problem; the residual norm and the error in the eigenvalue are plotted against the iteration number.]

matrix PLAT. We see that the condition number of $J_A$ stays relatively small compared with that of $J_R$ in the RQI table. It also appears that the augmented Newton method takes more steps to converge than RQI, which could be attributed to the fact that the approximate eigenvectors are not normalized as in RQI. In , a version of this augmented Newton method with a different normalization was shown to be identical to RQI; later in this section a normalized version of the above augmented Newton recurrence will also be shown to be identical to RQI. In the case of RQI the eigenvalue approximations are Rayleigh quotients, which are always contained inside the spectrum of the matrix; there is no such restriction on the eigenvalue approximations generated by the augmented Newton method, and they are clearly different from those of RQI (compare the two tables).

The figure shows the convergence history of both RQI and the augmented Newton recurrence. The error in the eigenvalue is estimated as $|\theta_i - \lambda|$, where $\lambda$ is replaced by the approximation at the last step. This error estimate leaves the error at the last iteration as zero, which is not plotted. From the figure it appears that the augmented Newton recurrence spends a significant number of steps searching for an eigenpair to converge to; once it has found a destination, it appears to converge almost as fast as RQI.

The EX table shows the intermediate results while solving the EX test problem. In this case the condition number of $J_A$ stays fairly large; however, it shows an overall decreasing trend, and near convergence it settles down near the condition number of the original matrix. This contrasts with the RQI results, where the condition number of the iteration matrix grows as convergence approaches. This test shows that even though $J_A$ is not singular around the exact solution, it may be ill-conditioned far from convergence. Another possible shortcoming of this method is that the approximate eigenvalues are not Rayleigh quotients; because the Rayleigh quotient has many optimality properties, it would be preferable if the eigenvalue approximation were computed as a Rayleigh quotient.

Variations of the augmented Newton method

Because the augmented Newton recurrence solves a linear system of a larger size, it may not be desirable to work with the augmented system directly. One solution to this problem is to factor $J_A$ and write the update expressions for $\theta$ and $x$ separately. When $J_R$ is nonsingular, a block LU factorization of $J_A$ is

$$ J_A = \begin{pmatrix} I & 0 \\ x^T J_R^{-1} & 1 \end{pmatrix} \begin{pmatrix} J_R & -x \\ 0 & x^T J_R^{-1} x \end{pmatrix}. $$

Note that $J_R \equiv A - \theta I$, and the variables $\theta$, $x$ and $r$ are assumed to carry the subscript $i$ when no subscript is shown.

Defining $\eta_i = \tfrac{1}{2}(x_i^T x_i - 1)$ and $r_i = (A - \theta_i I)\, x_i$, the augmented Newton step can be transformed, via this factorization, into the pair of updates

$$ \theta_{i+1} = \theta_i + \frac{x_i^T J_R^{-1} r_i - \eta_i}{x_i^T J_R^{-1} x_i}, \qquad x_{i+1} = x_i - J_R^{-1} r_i + (\theta_{i+1} - \theta_i)\, J_R^{-1} x_i. $$

Since the Newton iteration only requires $x_{i+1}$ to compute $\theta_{i+1}$, the above recurrence can be slightly modified by scaling $x_i$ to norm one, which ensures that $\eta_i$ is always zero. With this modification the update becomes

$$ x_{i+1} = \frac{x_i - J_O^{-1} r_i}{\| x_i - J_O^{-1} r_i \|}, \qquad J_O^{-1} \equiv \left( I - \frac{(A - \theta_i I)^{-1} x_i\, x_i^T}{x_i^T (A - \theta_i I)^{-1} x_i} \right) (A - \theta_i I)^{-1}. $$

We can make two interesting observations about this equation. First, it resembles a Newton iteration with the Jacobian matrix being $J_O$. Second, the difference between $x_i$ and $x_{i+1}$ looks very similar to the Olsen preconditioning scheme, with $M - \theta I$ in the Olsen equation replaced by $A - \theta I$ here. This establishes a connection between the Olsen preconditioning scheme and the augmented Newton method for eigenvalue problems, and provides a new interpretation of the Olsen preconditioning scheme.
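One step of this recurrence, together with a Rayleigh-quotient eigenvalue update (introduced just below), can be sketched as follows; the dense solves and the helper name are illustrative.

```python
import numpy as np

def newton_olsen_step(A, x, theta):
    """One Newton-Olsen step: x_new = (x - Jo_inv r) / ||.||, where
    Jo_inv = [I - P x x^T / (x^T P x)] P and P = (A - theta*I)^{-1};
    theta is then updated as the Rayleigh quotient of the new vector."""
    n = A.shape[0]
    J = A - theta * np.eye(n)
    r = J @ x
    Pr = np.linalg.solve(J, r)            # analytically equal to x; computed anyway
    Px = np.linalg.solve(J, x)            # note: two solves with A - theta*I per step
    z = Pr - Px * ((x @ Pr) / (x @ Px))   # Jo_inv applied to r
    y = x - z
    x_new = y / np.linalg.norm(y)
    return x_new, x_new @ (A @ x_new)
```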

Another point we should note about the matrix $J_O$ is that we only know its inverse explicitly. Because $J_O^{-1}$ is singular ($J_O^{-1} x_i = 0$), we choose not to define $J_O$ itself. Since we have the explicit form of the inverse of the Jacobian matrix in this case, the Newton recurrence can always be carried out as long as $x_i - J_O^{-1} r_i$ is not zero. Let $\tilde{x} = x_i - J_O^{-1} r_i$; then

$$ \tilde{x} = x_i - J_O^{-1} r_i = \frac{(A - \theta_i I)^{-1} x_i}{x_i^T (A - \theta_i I)^{-1} x_i}. $$

This indicates that $\tilde{x}$ is well defined if $\theta_i$ is not an exact eigenvalue and $x_i^T (A - \theta_i I)^{-1} x_i$ is not zero.

We have not yet said how to compute $\theta_{i+1}$ in this recurrence. Now that we compute $x_{i+1}$ from the normalized update rather than from the augmented system, we can no longer use the augmented Newton update for $\theta_{i+1}$. The alternative is to compute $\theta_{i+1}$ as a Rayleigh quotient, i.e., $\theta_{i+1} = x_{i+1}^T A x_{i+1}$. According to the properties of the Rayleigh quotient, this new $\theta_{i+1}$ is not worse than that given by the augmented Newton update, so the modified scheme should converge at least as fast as the augmented Newton scheme. Because of its resemblance to Olsen's preconditioning scheme, this recurrence is named the Newton-Olsen recurrence. The Jacobian matrix $J_O$ for this scheme looks complicated; however, due to the following connection with the Rayleigh quotient iteration, we know that the Newton-Olsen recurrence converges fast.

Lemma. Given the same initial unit vector for both the Newton-Olsen recurrence and the Rayleigh quotient iteration, if $A - \theta_i I$ never becomes singular and $x_i^T (A - \theta_i I)^{-1} x_i$ never becomes zero, the two recurrences produce the same result.

Proof. At step $i$, assume the $x_i$ produced by both recurrences are the same. The subscript $i$ is dropped in the rest of this proof because we will not refer to any other step. Let $x_R$ and $x_O$ denote the solutions of the Rayleigh quotient iteration and the Newton-Olsen recurrence before scaling, i.e.,

$$ x_R = (A - \theta I)^{-1} x, \qquad x_O = x - J_O^{-1} r = x - \left( I - \frac{(A - \theta I)^{-1} x\, x^T}{x^T (A - \theta I)^{-1} x} \right)(A - \theta I)^{-1} r. $$

[Table: Intermediate results of the Newton-Olsen recurrence applied to the PLAT test problem.]

[Table: Intermediate results from the normalized augmented Newton recurrence applied to PLAT.]

Using the definition $r = (A - \theta I)x$ and the fact that $x^T x = 1$, we can simplify the expression for $x_O$ as follows:

$$ x_O = x - (A - \theta I)^{-1} r + \frac{(A - \theta I)^{-1} x\; x^T (A - \theta I)^{-1} r}{x^T (A - \theta I)^{-1} x} = \frac{(A - \theta I)^{-1} x}{x^T (A - \theta I)^{-1} x} = \frac{x_R}{x^T (A - \theta I)^{-1} x}. $$

If $x^T (A - \theta I)^{-1} x \ne 0$, $x_O$ and $x_R$ only differ by a finite scalar constant. Since both RQI and the Newton-Olsen recurrence scale their approximate eigenvectors to norm one, it is clear that they give the same answer. $\Box$

This lemma indicates that RQI and the Newton-Olsen method are equivalent to each other as long as $A - \theta_i I$ never becomes singular and $x_i^T (A - \theta_i I)^{-1} x_i$ never becomes zero. When $A - \theta_i I$ is singular, $(\theta_i, x_i)$ is an exact eigenpair; whether it is the desired one or not, there is no reason to continue the current recurrence: if the desired one is found we should stop, otherwise a new initial guess is needed to start another recurrence. When $(\theta_i, x_i)$ is close to a simple eigenpair, there is no solution to the pair of equations $(A - \theta_i I) y = x_i$, $x_i^T y = 0$, so $x_i^T (A - \theta_i I)^{-1} x_i$ cannot be zero. In summary, if $\theta_i$ approaches a simple eigenvalue, then RQI and the Newton-Olsen recurrence are equivalent to each other.

Near the exact solution, $x_R = (A - \theta I)^{-1} x$ becomes very large in norm, i.e., it is poorly scaled. The same is not true for $x_O$: because of the division by $x^T (A - \theta I)^{-1} x$, $x_O$ stays comparable to $x$ in size. Intuitively, the Rayleigh quotient iteration works by amplifying the $\hat{x}$ component in $x_i$, while the Newton-Olsen scheme works by removing the components orthogonal to $\hat{x}$ from $x_i$. Numerically this suggests that $x_O$ is a better behaved vector than $x_R$, even though the two are theoretically equivalent. However, this benefit is not realized in practice, because $J_O^{-1}$ is very ill-conditioned.

The Newton-Olsen table shows the Newton-Olsen method applied to the test matrix PLAT. It appears that the eigenvalue converges to the same one as RQI; unfortunately, the residual norm does not decrease as far. This is due to the fact that applying $J_O^{-1}$ to a vector requires solving with $A - \theta_i I$ twice, and significant error can accumulate during this computation.

Since $J_A$ is well conditioned near convergence, instead of using the Newton-Olsen form to compute $x_{i+1}$ we could use the augmented Newton step and scale $x_{i+1}$ after every iteration. Following the Newton-Olsen recurrence, we can compute the eigenvalue approximation $\theta_{i+1}$ as a Rayleigh quotient instead of using the augmented Newton eigenvalue update. This is a normalized augmented Newton method, since the intermediate approximate eigenvectors are scaled to unit norm. In the normalized augmented Newton scheme the iteration matrix is still $J_A$, which is known to be nonsingular near convergence. Theoretically, the normalized augmented Newton

i kr k

i i i

e

e

e

e

e

e

e

e e

e e

Table Intermediate results from the normalized augmented Newton recurrence when applied on EX

metho d is exactly the same as the NewtonOlsen scheme Numerically this normalize augmented Newton

recurrence should b e b etter than the NewtonOlsen scheme since the matrix J has smaller condition number

A

than J

O

Table shows the results for this normalized augmented Newton recurrence We can see that the

condition numbers are must smaller than those shows in table which are pro duced with the Newton

Olsen recurrence This metho d not only is theoretically equivalent to RQI it also pro duces the exact same

eigenvalue and residual norm for most steps The residual norms dier at the last step only b ecause RQIs

iteration matrix is to o illconditioned and the roundo errors b ecome signicant Note that the condition

number of J grows at the same pace as that of J see table This is b ecause we only prove that

A R

J is well b ehaved if the eigenvalue is simple In this case we know the eigenvalue is a doublet J is

A A

not guaranteed to b e well conditioned The numbers in table indicate that J b ecomes singular as the

A

eigenvalue converges

Table shows the results of applying the normalized augmented Newton metho d on the other test

matrix EX Since all eigenvalues are simple in this case the condition number of J is not to o much larger

A

than the condition number of the original matrix

The Jacobian matrix J_O used in equation can be approximated as follows:

    J_O^{-1} ≈ (A - θI - σ x x^T)^{-1},

where σ is a scalar constant. When |σ x^T (A - θI)^{-1} x| is much larger than one, the above approximation is an accurate one, as can be seen by expanding (A - θI - σ x x^T)^{-1} with the Sherman-Morrison formula and comparing the result with the expression for J_O^{-1} r used above. The inexact Jacobian A - θI - σ x x^T is simpler than the original definition of J_O, which should make it a more viable option as a preconditioner; it is J_R with an approximate Wielandt deflation. We can use this approximation to define another well behaved Newton recurrence.

Define

    J_I = A - θ_i I - x_i x_i^T.

The following is true.

Lemma.
1. If (θ_i, x_i) is a simple eigenpair of A, J_I is nonsingular.
2. If θ_i is an eigenvalue with multiplicity greater than one, the matrix J_I is singular, but the null space of J_I is one dimension smaller than that of A - θ_i I.
3. If the matrix A - θ_i I is nonsingular and x_i^T (A - θ_i I)^{-1} x_i is not 1, the matrix J_I is nonsingular.

Proof.
1. If (θ_i, x_i) is a simple eigenpair, the spectrum of J_I is the spectrum of A shifted by θ_i, except that the zero eigenvalue of A - θ_i I is moved to -1. Since θ_i is a simple eigenvalue of A, A - θ_i I has only one zero eigenvalue, and by construction J_I has moved this zero eigenvalue away from zero.

[Table: results from the inflated Newton recurrence applied to PLAT (step i, eigenvalue estimate θ_i, residual norm ||r_i||, condition number); the numeric entries were lost in extraction.]

2. If the dimension of the null space of A - θ_i I is more than one, the construction of J_I only moves the direction parallel to x_i, so the matrix J_I has a null space of smaller dimension than A - θ_i I.

3. If the matrix A - θ_i I is nonsingular, we can try to solve the following equation to identify the null space of J_I:

    (A - θ_i I) y = x_i (x_i^T y).

The above equation appears to have solutions of the form y = α (A - θ_i I)^{-1} x_i, where α is a scalar constant. When we substitute this expression for y into the above equation, the outcome is

    α x_i = α (x_i^T (A - θ_i I)^{-1} x_i) x_i.

If x_i^T (A - θ_i I)^{-1} x_i is not 1, the above equation can only be satisfied with α = 0. This indicates that only y = 0 can satisfy J_I y = 0; in other words, J_I is nonsingular. □

From this proof we see that, when (θ_i, x_i) is not an exact eigenpair, J_I can be singular only if x_i^T (A - θ_i I)^{-1} x_i = 1. By definition x_i is a unit vector located on the n-dimensional unit sphere. If the surface defined by x^T (A - (x^T A x) I)^{-1} x = 1 intersects this unit sphere, then J_I may be singular when x_i falls on the intersection. In short, only very special values of x_i can make J_I singular.

When J_I is nonsingular and J_I x_i ≠ r_i, the following recurrence is a valid Newton method for the eigenvalue problem:

    x_{i+1} = (x_i - J_I^{-1} r_i) / ||x_i - J_I^{-1} r_i||.

Note that the condition J_I x_i ≠ r_i is satisfied because x_i is a unit vector, θ_i = x_i^T A x_i and r_i = (A - θ_i I) x_i, so that J_I x_i = r_i - x_i ≠ r_i.

Because of the connection with the Wielandt deflation, this recurrence is called the inflated Newton recurrence. We could have defined J_I as A - θI - σ x x^T; however, because of the lack of a practical way to choose σ, we simply set it to one in equation. Here are some suggestions for choosing σ if appropriate information is available. First, σ = 0 is not a good choice. One good choice is to pick a number within the spectrum of J_R; if an iterative solver like CG is used to solve equation, it is good to place the displaced eigenvalue at the center of a large cluster of J_R's spectrum, which will improve the convergence of the iterative solver. The matrix J_I is the same as the matrix for the inflated inverse iteration method shown in equation. The derivation of the inflated inverse iteration was quite different from what is shown here, but it is also developed from a Newton method.

From the proof of the lemma above it is easy to see that the following is true.

Lemma. Given the same unit vector x to both the inflated Newton recurrence and the Rayleigh quotient iteration, if A - θI is nonsingular and x^T (A - θI)^{-1} x is nonzero, the two recurrences produce the same result.
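
As a concrete illustration of the inflated Newton recurrence, here is a minimal sketch (again assuming numpy, which the thesis does not use); it performs a few steps of the recurrence on a random symmetric matrix and prints the eigenvalue estimate, the residual norm, and the condition number of J_I, which stays finite when the target eigenvalue is simple.

import numpy as np

rng = np.random.default_rng(1)
n = 50
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
for i in range(6):
    theta = x @ A @ x
    r = A @ x - theta * x
    J_I = A - theta * np.eye(n) - np.outer(x, x)   # inflated Newton Jacobian
    print(i, theta, np.linalg.norm(r), np.linalg.cond(J_I))
    x = x - np.linalg.solve(J_I, r)                # inflated Newton step
    x /= np.linalg.norm(x)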

The table above shows the results of applying the inflated Newton recurrence to the test matrix PLAT; the results for EX are shown below. The inflated Newton recurrence is a fairly stable scheme: the condition number of J_I is similar to that of the normalized augmented Newton method. Both J_A and J_I become singular when the computed eigenvalue is not simple. When the eigenvalue is simple, as in the EX case, the normalized augmented Newton recurrence and the inflated Newton recurrence have similar condition numbers for their iteration matrices near convergence.

[Table: results from the inflated Newton recurrence applied to EX (step i, eigenvalue estimate θ_i, residual norm ||r_i||, condition number); the numeric entries were lost in extraction.]

We have been concentrating on what happens near convergence. To use these recurrences as preconditioners, it is just as important to study them when the solution is far from convergence. From the tables we see that at the beginning of the Rayleigh quotient iteration the condition number of J_R is relatively small. Though J_A is well behaved near convergence, it can have a very large condition number far from convergence. The matrix J_I is better than J_A in this respect: far from convergence the condition number of J_I is fairly small compared to that of J_A. In fact, the tables show that the condition number of J_I is very close to that of J_R.

Constrained Newton method

Defining r = (A - (x^T A x) I) x, i.e., replacing θ with x^T A x, the eigenvalue problem can be expressed as an optimization problem with an equality constraint:

    A x - x (x^T A x) = 0,   subject to   ||x|| = 1.

With the eigenvalue problem stated in this form, Tapia's algorithm for constrained optimization can be applied directly.

The algorithm prescribes a formula for constructing a Newton recursion to solve equation. To apply it to the eigenvalue problem we need to find the Jacobian matrix J_C. Specifically, the recursion can be written as

    x_{i+1} = (x_i - J_C^{-1} r_i) / ||x_i - J_C^{-1} r_i||,
    J_C = dr/dx = A - θ_i I - x_i x_i^T (A + A^T).

When A is symmetric, the Jacobian matrix can be simplified to

    J_C = A - θ_i I - 2 x_i (A x_i)^T.

This scheme for computing eigenpairs will be referred to as the constrained Newton recurrence later.
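
The formula for J_C can be checked numerically. The following sketch (assuming numpy; the factor 2 in the symmetric form is part of the reconstruction above) compares the claimed Jacobian with a centered finite-difference approximation of dr/dx for a symmetric A.

import numpy as np

rng = np.random.default_rng(2)
n = 30
B = rng.standard_normal((n, n))
A = (B + B.T) / 2

def resid(v):
    return A @ v - (v @ A @ v) * v     # r(x) = A x - (x^T A x) x

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
theta = x @ A @ x
J_C = A - theta * np.eye(n) - 2.0 * np.outer(x, A @ x)   # claimed Jacobian
h = 1e-6
J_fd = np.empty((n, n))
for k in range(n):
    e = np.zeros(n)
    e[k] = 1.0
    J_fd[:, k] = (resid(x + h * e) - resid(x - h * e)) / (2 * h)
print(np.max(np.abs(J_fd - J_C)))      # small, confirming the formula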

In order for equation to be a valid Newton method, we need to be able to invert the Jacobian matrix. The following lemma states when J_C is nonsingular.

Lemma. If (λ, x) is a simple eigenpair and λ is not zero, then J_C is nonsingular.

Proof. To prove this lemma we only need to argue that there is no nontrivial vector satisfying the equation J_C y = 0. Substituting the definition of J_C into this equation and noting that (λ, x) is an exact eigenpair, it becomes

    (A - λI) y = x (x^T (A + A^T) y).

If y is not zero, there are two possibilities. If x^T (A + A^T) y is zero, then y is an eigenvector corresponding to λ; since λ is a simple eigenvalue, y = α x with α a nonzero scalar constant, and because λ is not zero, x^T (A + A^T) y = 2αλ cannot be zero. The second possibility is that x^T (A + A^T) y is not zero, in which case y is a principal vector of grade two associated with λ; this is not possible because λ is a simple eigenvalue. Thus there is no nontrivial y that can satisfy J_C y = 0. □

The residual r = A x - (x^T A x) x is a polynomial in x. A consequence of this lemma is that there exists a region around the exact eigenvector within which, for any vector x, the Jacobian matrix J_C is nonsingular. In order for the recurrence not to break down, θ_i cannot be zero. If θ_i is zero, the following is true:

    J_C x_i = r_i.

Indeed, the left-hand side of the above equation is J_C x_i = r_i - 2 θ_i x_i, so it is clear that if θ_i is zero then x_i - J_C^{-1} r_i is zero. When θ_i is close to a nonzero eigenvalue, we can expect θ_i to be nonzero, in which case the constrained Newton recurrence does not break down. If the matrix A is definite, the method does not break down either. In general, when the recurrence is far from converging, the probability of encountering a zero Rayleigh quotient is small, because the vectors x that produce a zero Rayleigh quotient form a very small portion of all possible x values.

Near a simple nonzero eigenvalue, the recurrence described by equation satisfies the conditions of the convergence theorem cited; therefore, given a good initial guess, it converges quadratically. The following lemma shows that it converges cubically for symmetric eigenvalue problems.

Lemma. When θ_i is not zero and not an exact eigenvalue, if x_i is the same for both the constrained Newton scheme and the Rayleigh quotient iteration, then x_{i+1} is also the same.

When θ_i is an exact eigenvalue, the Rayleigh quotient iteration breaks down. The constrained Newton method breaks down when θ_i is zero or when θ_i is an exact eigenvalue with multiplicity greater than one. The premise of the lemma ensures that neither recurrence breaks down.

Proof. If x_i is the same for both methods, then in order to show that x_{i+1} is also the same we first show that x_C and x_R are parallel to each other, where

    x_R = J_R^{-1} x,   x_C = x - J_C^{-1} r.

Since no other subscript is involved in the rest of this proof, we drop the subscript i from the variables. Substituting the value of J_C from equation, the relation defining x_C can be rewritten as

    (A - θI - x x^T (A + A^T)) (x - x_C) = r.

After some straightforward rearrangement of the terms, and noting that r = (A - θI) x, this can be written as

    (A - θI) x_C = x (x^T (A + A^T)(x_C - x)).

Let τ = x^T (A + A^T)(x_C - x); the equation becomes

    (A - θI) x_C = τ x.

If both RQI and the constrained Newton method do not break down, then A - θI is nonsingular and x_C is nonzero, so the left-hand side of this equation is not zero. This shows that τ must be nonzero. In conclusion, x_C = τ x_R, that is, x_C and x_R differ by a nonzero scalar constant. Since the two methods scale x_C and x_R to generate the eigenvector approximations, they generate the same vector x_{i+1} for the next iteration. □

[Table: intermediate results of the constrained Newton recurrence on PLAT (step i, eigenvalue estimate θ_i, residual norm ||r_i||, condition number); the numeric entries were lost in extraction.]

[Table: results of applying the constrained Newton recurrence to EX (same columns); the numeric entries were lost in extraction.]

To get an idea of how large the constant τ in x_C = τ x_R is, we replace x_C with τ x_R in the definition of τ:

    τ = x^T (A + A^T)(τ x_R - x).

This leads to the following expression for τ:

    τ = x^T (A + A^T) x / (x^T (A + A^T) x_R - 1).

If x is close to the exact eigenvector and A is symmetric, we can approximate τ as follows:

    τ ≈ 2θ / (2θ x^T x_R) = 1 / (x^T x_R).

The assumption behind this approximation is that x is close enough to the exact solution that x_R is nearly parallel to it and x^T x_R is large. Using this approximation, we know that near convergence

    x_C ≈ x_O.

From this relation we know that the correction to x_i produced by the constrained Newton recurrence is almost perpendicular to x_i, as in the Newton-Olsen recurrence. Similar to the Newton-Olsen recurrence, we can expect this recurrence to be numerically more stable than the Rayleigh quotient iteration. In addition, since the Jacobian matrix of this recurrence is nonsingular near convergence, we expect no obstacle to realizing this theoretical benefit.

The tables above show the eigenvalue and residual norm at each step of the constrained Newton recurrence. Note that the eigenvalue approximations and the residual norms are nearly identical with those of RQI. Similar to J_A, the condition number of J_C also grows quite rapidly on the PLAT test problem, because the eigenvalue has multiplicity two.

Jacobi-Davidson recurrence

The preconditioning step of the Jacobi-Davidson method solves the following equation approximately:

    (I - x x^T)(A - θI)(I - x x^T) z = r.

The matrix J_J = (I - x x^T)(A - θI)(I - x x^T) is singular, so it is not possible to solve the above equation in the usual sense. One of the objectives of the Jacobi-Davidson preconditioner is to make z orthogonal to x. However, the above equation does not enforce this condition; on the contrary, since x is in the null space of J_J, the solution of this equation can have an arbitrarily large component in the x direction. Following the example of the correction equation, we regard the Jacobi-Davidson preconditioning scheme as solving the following equation:

    z = [(I - x x^T)(A - θI)(I - x x^T)]^+ r.

Because the pseudo-inverse is always defined for a matrix, the result of this equation is unique, and z is orthogonal to x. If θ is the Rayleigh quotient of x and r is the corresponding residual vector, the original Jacobi-Davidson preconditioning equation is consistent; in this case the result of the pseudo-inverse equation satisfies the original equation, and it is the minimum norm solution. Based on this equation, we define the following recurrence formula for finding an eigenvector, starting from a unit vector x:

    x_{i+1} = (x_i - J_J^+ r_i) / ||x_i - J_J^+ r_i||.

This recurrence looks similar to the constrained Newton recurrence. The Jacobi-Davidson method has been argued to be a Newton method based on this equation, and it is reasonable to regard the above recurrence as a form of Newton's method for the eigenvalue problem. Because J_J^+ is always defined, J_J^+ r is computable. Since x_i is perpendicular to J_J^+ r_i, the vector x_i - J_J^+ r_i is never zero; thus the above recurrence is well defined. Because of the singular Jacobian matrix, we might not expect it to have the superlinear convergence of a Newton method with a nonsingular Jacobian. However, the following lemma shows that it actually converges cubically, because of its connection to the Rayleigh quotient iteration.

Lemma. For Hermitian matrices, the Jacobi-Davidson recurrence described above is equivalent to the Rayleigh quotient iteration, as long as the Rayleigh quotient iteration can be carried out and x_i^T (A - θ_i I)^{-1} x_i is never zero.

Proof. Start with a unit vector x and θ = x^T A x, and let

    x_R = J_R^{-1} x,   x_J = x - J_J^+ r,

where J_R = A - θI and J_J = (I - x x^T)(A - θI)(I - x x^T); x_R and x_J are the unscaled solutions of RQI and the Jacobi-Davidson recurrence. Define z = x - x_J. Because of the consistency of the equation, the following must be true:

    (I - x x^T)(A - θI)(I - x x^T) z = r,   x^T z = 0.

From this equation we know that

    (A - θI) z = r + γ x,

where γ is an undetermined scalar constant. This leads to the following solution for z when θ is not an eigenvalue:

    z = x + γ x_R.

Now we can use the orthogonality condition on z to determine γ:

    0 = x^T z = 1 + γ x^T (A - θI)^{-1} x.

Thus the solution x_J is

    x_J = x - z = -γ x_R = x_R / (x^T (A - θI)^{-1} x).

If x_i^T (A - θ_i I)^{-1} x_i is never zero, γ is a nonzero finite scalar. In this case x_J is parallel to x_R, and after they are scaled to unit norm they are identical to each other. □
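
A small numerical check of this lemma, assuming numpy: the minimum-norm solution J_J^+ r is computed with a dense pseudo-inverse, the correction is verified to be orthogonal to x, and the resulting update is compared with the RQI direction.

import numpy as np

rng = np.random.default_rng(3)
n = 40
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
theta = x @ A @ x
r = A @ x - theta * x
P = np.eye(n) - np.outer(x, x)                 # projector I - x x^T
J_J = P @ (A - theta * np.eye(n)) @ P          # singular Jacobi-Davidson matrix
z = np.linalg.pinv(J_J) @ r                    # minimum-norm solution J_J^+ r
x_J = x - z
x_R = np.linalg.solve(A - theta * np.eye(n), x)
print(abs(x @ z))                              # ~0: correction orthogonal to x
print(abs(x_J @ x_R) / (np.linalg.norm(x_J) * np.linalg.norm(x_R)))  # ~1: parallel to RQI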

[Table: solving PLAT with the Jacobi-Davidson recurrence (step i, eigenvalue estimate θ_i, residual norm ||r_i||); the numeric entries were lost in extraction.]

[Figure: condition numbers of the Jacobian matrices versus iteration when solving PLAT362; the curves correspond to RQI, NANR, DNR, CNR and NOR.]

One observation from this proof is that the solution from the Jacobi-Davidson recurrence is exactly the same as that of the Newton-Olsen recurrence; they are the same even before scaling. This confirms again that the Jacobi-Davidson preconditioning is equivalent to the Olsen preconditioning scheme.

The table above shows the approximate solutions from applying the Jacobi-Davidson recurrence to PLAT. We did not show the condition number, because J_J is known to be singular. As the lemma predicted, this method produced exactly the same results as RQI.

The figures depict the condition numbers of several iteration matrices versus the iteration number; the names of the recurrences are abbreviated to their initials. From the first figure we can see that the condition number of J_O from the Newton-Olsen recurrence (NOR) is consistent with the fact that it is singular. The four other methods shown fall into two groups: the Rayleigh quotient iteration (RQI) and the constrained Newton recurrence (CNR) form one, while the normalized augmented Newton recurrence (NANR) and the inflated Newton recurrence (DNR) form the other. In this case the condition numbers of J_R (RQI) and J_C (CNR) are nearly identical at every step, and the condition numbers of J_A (NANR) and J_I (DNR) are very close to each other. Since the eigenvalue computed is not simple, the condition numbers of all the iteration matrices increase as the iterations approach convergence, to a level that is effectively infinite in IEEE floating-point arithmetic. In the EX case (second figure), the condition number of J_A (NANR) is very large initially, which should be considered a disadvantage of the normalized augmented Newton method. Near convergence, only the condition number of J_R keeps increasing; as we have shown, near convergence J_A, J_C and J_I are guaranteed to be nonsingular, and the figure clearly illustrates this fact.

[Figure: condition numbers of the Jacobian matrices versus iteration when solving EX2; the curves correspond to RQI, NANR, DNR and CNR.]

Effects of biased shift

In the previous subsections we discussed our approaches to avoiding the singular iteration matrix problem. Here we revisit a technique used previously to address the misconvergence problem and the indefinite iteration matrix problem.

The earlier tables show the progress of seven eigensystem solvers. We wanted the smallest eigenvalue and the corresponding eigenvector; however, none of them converges to the desired eigenvalue. This is a good example of the misconvergence problem mentioned earlier: a Newton method converges quickly to an eigenpair close to the initial guess. Obviously RQI has a different measure of distance than the augmented Newton method. Nevertheless, neither converges to the smallest eigenvalue, and an additional modification is needed to reach the desired solution.

To make the eigensystem solvers converge to the smallest eigenvalue, we borrow a shifting strategy from preconditioning. Since we only compute one eigenvalue, no gap information is available, so we always use the residual norm as the error estimate. The new iteration matrices are

    J_R = A - (θ - ||r||) I,
    J_A = [ A - (θ - ||r||) I , x ; x^T , 0 ]   (the augmented matrix with the shift replaced by the biased estimate),
    J_O = ( I - (x x^T (A - (θ - ||r||) I)) / (x^T (A - (θ - ||r||) I) x) ) (A - (θ - ||r||) I),
    J_I = A - (θ - ||r||) I - x x^T,
    J_C = A - (θ - ||r||) I - 2 x (A x)^T,
    J_J = (I - x x^T)(A - (θ - ||r||) I)(I - x x^T).

To simplify the discussion, the biased estimate θ - ||r|| is used in place of θ in what follows.

The earlier tables show the results of finding an eigenpair of PLAT without shifting, and the tables below show the results with the biased shift just described. We did not show the results from the augmented Newton method, because it did not converge with the biased shift: the shift changed the algorithm enough to break it, which may make it less favorable compared to the others. The other six methods converge to the smallest eigenvalue. Again we notice the difference in the magnitude of the condition numbers. Another point to note is that, even with the biased shift, the six methods still produce identical intermediate solutions.

[Table: intermediate solutions of RQI with the biased shift on PLAT (step i, eigenvalue estimate, residual norm, condition number); the numeric entries were lost in extraction.]

[Table: intermediate solutions of the Newton-Olsen recurrence with the biased shift on PLAT (same columns); the numeric entries were lost in extraction.]
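
The effect of the biased shift can be reproduced with a few lines of code. The sketch below (assuming numpy) runs plain RQI and RQI with the shift θ - ||r|| from the same starting vector; the biased variant tends to settle on a smaller eigenvalue, though, as discussed below, it is still not guaranteed to reach the smallest one.

import numpy as np

def rqi(A, x, biased=False, steps=10):
    # Rayleigh quotient iteration; optionally bias the shift by the residual norm.
    I = np.eye(len(x))
    theta = x @ A @ x
    for _ in range(steps):
        r = A @ x - theta * x
        mu = theta - np.linalg.norm(r) if biased else theta
        x = np.linalg.solve(A - mu * I, x)
        x /= np.linalg.norm(x)
        theta = x @ A @ x
    return theta

rng = np.random.default_rng(4)
n = 60
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
x0 = rng.standard_normal(n)
x0 /= np.linalg.norm(x0)
print("plain RQI :", rqi(A, x0))
print("biased RQI:", rqi(A, x0, biased=True))
print("smallest  :", np.linalg.eigvalsh(A)[0])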

[Table: intermediate solutions of the normalized augmented Newton recurrence with the biased shift on PLAT; the numeric entries were lost in extraction.]

[Table: intermediate solutions of the inflated Newton iteration with the biased shift on PLAT; the numeric entries were lost in extraction.]

Even with the biased shift there is no guarantee that, given an arbitrary initial guess, any of these methods converges to the smallest eigenvalue. For example, if the initial guess is an eigenvector not corresponding to the smallest eigenvalue, all Newton-type methods find that the initial residual is zero and stop. The table for EX below is a more ordinary example: the Rayleigh quotient iteration with the biased shift is applied to EX with the same initial guess as before, and it converges to an eigenvalue much smaller than without the biased shift; however, the eigenvalue it converges to is still far from the smallest one.

For Hermitian eigenvalue problems the biased shift also cures the indefinite iteration matrix problem of the regular Davidson preconditioning discussed earlier. Because the biased estimate of the smallest eigenvalue is usually less than the actual one, J_R = A - (θ - ||r||) I is positive definite. By construction J_A is indefinite. If J_R is positive definite, then J_J is positive semidefinite and J_I is also positive definite. The Jacobian matrices for the Newton-Olsen recurrence and the constrained Newton recurrence may still be indefinite.

[Table: intermediate solutions of the constrained Newton recurrence with the biased shift on PLAT; the numeric entries were lost in extraction.]

[Table: intermediate solutions of the Jacobi-Davidson recurrence with the biased shift on PLAT; the numeric entries were lost in extraction.]

The figure below plots the condition numbers shown in the preceding tables. Compared with the earlier figure, the condition numbers are smaller with the biased shift. The fact that the condition number of A - (θ - ||r||) I is less than that of A - θI can be explained by the well-known fact that θ converges faster than the residual norm: since |θ - λ| is less than ||r||, A - θI is closer to singularity than A - (θ - ||r||) I. Because A - (θ - ||r||) I is the dominant part of J_A, J_C and J_I, reducing the condition number of J_R may lead to a reduction of their condition numbers as well.

[Table: intermediate solutions of RQI with the biased shift on EX (step i, eigenvalue estimate, residual norm); the numeric entries were lost in extraction.]

[Figure: condition numbers of the Jacobian matrices with the biased shift on PLAT362; the curves correspond to RQI, NANR, DNR, CNR and NOR.]

Characteristics of the Newton methods

Because of the equivalence properties, the Newton methods described above actually belong to two categories: the Rayleigh quotient iteration and the augmented Newton method. When used as a preconditioner, we usually take only one step of the Newton iteration, in which case the augmented Newton method is equivalent to the normalized augmented Newton method provided the approximate eigenvectors have unit norm. Thus all eigensystem solvers of interest are variations of the Rayleigh quotient iteration. We have found six different ways of implementing the Rayleigh quotient iteration, operating with the iteration matrices J_R, J_A, J_O, J_I, J_C and J_J. Of these matrices, we have shown that J_I, J_C and J_A are nonsingular near convergence. The next lemma shows how the spectra of some of them are related to each other.

Lemma. For any vector y perpendicular to x,

    y^T J_R y = y^T J_I y = y^T J_C y = y^T J_J y.

Proof. If y is perpendicular to x, then x^T y = y^T x = 0, and

    y^T J_I y = y^T (A - θI - x x^T) y = y^T (A - θI) y = y^T J_R y,
    y^T J_C y = y^T (A - θI) y - y^T x x^T (A + A^T) y = y^T (A - θI) y,
    y^T J_J y = y^T (I - x x^T)(A - θI)(I - x x^T) y = y^T (A - θI) y.   □

In the limit where x is the exact eigenvector, this lemma shows that the spectra of J_R, J_I, J_C and J_J are simply a shift of the spectrum of A, except for the one eigenvalue corresponding to x. In the case of J_R and J_J this eigenvalue is translated to zero, i.e., they are singular; J_I translates it to -1, and J_C translates it to -2λ if the matrix A is Hermitian. Since J_A is not of the same size as A, it is not easy to characterize how its spectrum relates to that of A; we may imagine that its effect is represented by J_I, because J_I can be viewed as an approximate Schur form of J_A. In short, the new recurrences described here avoid singularity in their Jacobian matrices by moving the zero eigenvalue to a different location.

We have addressed the problem of the singular iteration matrix. Another problem that is addressed at the same time is the linearly dependent basis problem discussed earlier. By construction, the Newton-Olsen recurrence and the Jacobi-Davidson recurrence produce corrections that are orthogonal to the current approximate eigenvector x_i. Because of its connection to the Newton-Olsen scheme, the normalized augmented Newton recurrence should produce orthogonal corrections as well. The constrained Newton recurrence and the inflated Newton recurrence do not have this orthogonality; however, the problem should be less severe than with the original Davidson preconditioning scheme.

The properties of the iteration matrices near convergence are important, but for practical use the properties of the Jacobian matrices far from convergence are just as important. It is clear that the condition number of A - θI is small when θ is far from any eigenvalue, so J_R is well behaved far from convergence. This fact can partly explain the success of the Davidson preconditioning scheme. Since a small modification can significantly change the spectrum of a matrix, it is hard to quantify the behavior of the other Jacobian matrices analytically. We have seen one example where J_A has a large condition number far from convergence, which indicates that the normalized augmented Newton method is more likely to encounter difficulties than the others. From the figures we see that the condition numbers of the Jacobian matrices from RQI, the inflated Newton recurrence and the constrained Newton recurrence are close to each other far from convergence. Compared with the condition numbers of the PLAT and EX matrices themselves, the condition numbers of J_C, J_I and J_R are remarkably small at the beginning of the Newton iterations.

Newton-type preconditioners

In the framework of the Davidson algorithm, the ultimate purpose of preconditioning is to generate a good basis V_m. The Davidson preconditioning scheme mentioned above basically approximates the correction equation. In this section we search for new preconditioning schemes by discussing different ways of approximating the Newton recurrences described in the previous section.

The Rayleigh quotient iteration is an effective method for finding one eigenvalue and its eigenvector. In order to build an eigensystem solver conforming to the structure of the algorithm, we can simply define V_m to be a basis spanning {x_1, x_2, ..., x_m}; for convenience of reference we call this basis the Rayleigh quotient basis. Because of the optimality of the Rayleigh-Ritz projection, the residual norm of the solution found by the Rayleigh-Ritz projection on this Rayleigh quotient basis is no greater than the residual norm of the solution (θ_m, x_m) from the Rayleigh quotient iteration. This simple observation indicates that if we use an approximation of the Rayleigh quotient iteration as the preconditioner for the Davidson method, it is possible to achieve a cubic convergence rate for symmetric eigenvalue problems. The Davidson preconditioner computes z = (A - θI)^{-1} r to augment the current basis; the Rayleigh quotient iteration based preconditioner gives z = (A - θI)^{-1} x instead. One might expect this second form of preconditioner to perform better than the Davidson preconditioning because of the observed difficulties related to the Davidson preconditioning, but a small experiment indicates that the contrary is true.

The table below contains the number of matrix-vector multiplications used to compute the smallest eigenvalues and their corresponding eigenvectors of three diagonal matrices. The test matrices are from the Harwell-Boeing collection; the three matrices used in this test are the three largest diagonal matrices in the collection. Diagonal matrices are used here because it is easy to invert the preconditioners.

[Table: number of MATVEC required to compute the smallest eigenpairs of three diagonal Harwell-Boeing matrices (BCSSTM series), comparing the preconditioners (A - θI)^{-1} r and (A - θI)^{-1} x; the numeric entries were lost in extraction.]

This small test indicates that the Davidson preconditioner is more effective. This is because (A - θI)^{-1} x is badly scaled and almost parallel to x.

Note that the Rayleigh quotient iteration has most of the problems we have observed in the Davidson preconditioning. This test shows that a Newton type of preconditioner is better suited to the Davidson method. In the remainder of this section we discuss how to turn the other Newton methods for eigenvalue problems into preconditioners for the Davidson method.

For all the Newton recurrences it is very simple to modify the preconditioning step of the Davidson method to use the new preconditioning scheme: in step (d)(iii) of the algorithm, instead of solving z = (diag(A) - θI)^{-1} r, we simply compute

    z = J^{-1} r,

where J is one of the Jacobian matrices described in the previous section. Because of the equivalence properties, if the above preconditioning step is solved exactly we should expect cubic convergence with these new preconditioning schemes.

Augmented Newton preconditioner. The nonzero structure of J_A is a straightforward modification of the nonzero pattern of A, so it should be easy to modify most incomplete factorization schemes to factorize J_A. Because the approximate eigenvectors always have unit norm, the preconditioning step with the augmented Newton preconditioner approximately applies J_A^{-1} to the residual vector augmented with a zero last entry and discards the last element of the solution. The exact form of this preconditioner is the normalized augmented Newton recurrence.

Olsen preconditioner. The simplest way of realizing the Olsen preconditioning scheme is to implement it as a driver over existing Davidson preconditioners. Denote the original preconditioner by M ≈ A - θI, and let z_r = M^{-1} r and z_x = M^{-1} x. The Olsen preconditioning scheme computes

    z = z_r - (x^T z_r / x^T z_x) z_x.

In exact form (M = A - θI) the above formula can be simplified slightly as follows:

    z = (A - θI)^{-1} r - (A - θI)^{-1} x / (x^T (A - θI)^{-1} x).

This leads to a modified form of the Olsen preconditioning scheme:

    z = z_r - z_x / (x^T z_x).

This new form is slightly simpler than the original formula, but its output is no longer orthogonal to x when M ≠ A - θI. We know that the Jacobian matrix of the Newton-Olsen scheme is singular; however, an implementation of this scheme uses one of the two formulas above, and the matrix J_O is never used directly.
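
A minimal sketch of the two Olsen drivers described above, assuming numpy and a user-supplied routine apply_M_inverse (here diagonal scaling of A - θI); the original form returns a correction orthogonal to x, while the modified form generally does not.

import numpy as np

def olsen(apply_M_inverse, r, x):
    # Original Olsen scheme: z = z_r - (x^T z_r / x^T z_x) z_x
    z_r = apply_M_inverse(r)
    z_x = apply_M_inverse(x)
    return z_r - (x @ z_r) / (x @ z_x) * z_x

def olsen_modified(apply_M_inverse, r, x):
    # Modified form: z = z_r - z_x / (x^T z_x)
    z_r = apply_M_inverse(r)
    z_x = apply_M_inverse(x)
    return z_r - z_x / (x @ z_x)

rng = np.random.default_rng(5)
n = 100
N = rng.standard_normal((n, n))
A = np.diag(np.arange(1.0, n + 1.0)) + 0.01 * (N + N.T)
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
theta = x @ A @ x
r = A @ x - theta * x
d = np.diag(A) - theta                         # diagonal approximation of A - theta I
apply_M_inverse = lambda v: v / d
print(abs(x @ olsen(apply_M_inverse, r, x)))            # ~0
print(abs(x @ olsen_modified(apply_M_inverse, r, x)))   # generally not ~0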

Inflated Newton preconditioner. The Jacobian matrix for the inflated Newton recurrence is J_I = A - θI - x x^T. This matrix could be inverted through the Sherman-Morrison formula, in which case it leads back to an approximate form of the Newton-Olsen recurrence. Since there are already two variations of the Olsen preconditioning scheme, we look for a different way of using J_I. A matrix-vector multiplication with J_I is not much more complex than a matrix-vector multiplication with A, which suggests a preconditioner based on matrix-vector multiplications. Common preconditioners of this kind include polynomial preconditioners, Krylov based iterative solvers, and approximate inverses built by Krylov iterative solvers. Because polynomial preconditioners usually require information about the spectrum of J_I, which is unlikely to be available, they are not as easy to use as the other choices. Approximate inverse preconditioners are attractive because they can be efficient in a parallel environment; however, the algorithms for computing approximate inverses are not as widely available as Krylov subspace methods. For simplicity we only use Krylov iterative solvers, e.g., CG and GMRES, to test this preconditioning scheme.
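
As a sketch of this approach (assuming SciPy, which the thesis does not use), the inflated Newton preconditioning step can be approximated by a few CG iterations applied to J_I matrix-free. CG is not guaranteed to behave well on an indefinite J_I, but, as in the experiments reported below, a modest number of steps usually yields a useful correction.

import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def inflated_newton_correction(matvec_A, x, theta, r, maxiter=25):
    # Approximately solve (A - theta I - x x^T) z = r with a few CG steps.
    # A is accessed only through matvec_A, so J_I is never formed explicitly.
    n = len(x)
    def J_I_mv(v):
        return matvec_A(v) - theta * v - x * (x @ v)
    J_I = LinearOperator((n, n), matvec=J_I_mv)
    z, info = cg(J_I, r, maxiter=maxiter)
    return z

Replacing the rank-one term x x^T with 2 x (A x)^T, or with the projected operator, gives the constrained Newton and Jacobi-Davidson variants of the same idea.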

Constrained Newton preconditioner. Similar to the inflated Newton preconditioner, the iteration matrix J_C of the constrained Newton recurrence is a rank-one modification of J_R. For a Hermitian eigenvalue problem, multiplying the Jacobian matrix J_C = A - θI - 2 x (A x)^T by a vector is easy to perform, because x and Ax are already computed by the Davidson method in the process of forming the residual vector. It is possible to design a special incomplete factorization method that factorizes J_C without explicitly computing the whole matrix; however, the simplest option for approximating a constrained Newton step is to use a Krylov subspace method, as indicated in the inflated Newton preconditioner case.

Jacobi-Davidson preconditioner. The iteration matrix in this case is J_J = (I - x x^T)(A - θI)(I - x x^T). This matrix is more complex than J_C and J_I, and the only practical way of using it is in matrix-vector multiplications. As before, we have three choices: a polynomial preconditioner, a Krylov iterative solver, or an approximate inverse. Since the matrix J_J is singular, an approximate inverse algorithm would need to compute an approximation of the pseudo-inverse, and how effective current approximate inverse algorithms are at finding such an approximation is an open question. Usually the Jacobi-Davidson preconditioner is solved with Krylov iterative solvers. Because the right-hand side r is in the range of J_J, the approximate solutions generated by Krylov subspace iterative solvers and polynomial methods are essentially approximations to J_J^+ r.

In summary, we have three different kinds of preconditioners: incomplete LU factorizations of J_A, Olsen preconditioning schemes, and iterative methods applied to linear systems with J_R, J_I, J_C or J_J.

Numerical Comparisons

This section contains a number of small examples to show how the different preconditioning schemes work. Due to the large number of possible implementations it is not possible to cover every case; we have selected the following preconditioners to implement.

1. Incomplete factorization of J_A. We test the potential benefit of the augmented Newton preconditioner with two incomplete LU factorizations, ILU and ILUTP. ILU is an incomplete factorization in which the LU factors have the same nonzero pattern as the original matrix. ILUTP modifies ILU in three ways: elements of the LU factors with small absolute values are dropped; the maximum number of nonzero elements in a row of the LU factors (the level of fill) is controlled by the user; and column pivoting is performed if the diagonal element is significantly smaller than another element in the same row. The ILUTP used here has the level of fill equal to half the average number of nonzero elements per row plus one, so the ILUTP factors can be slightly larger than those of ILU: on average this ILUTP allows one more nonzero element in both L and U. Elements in the incomplete LU factors that are smaller than a prescribed fraction of the row norm are discarded, and pivoting is performed if the absolute value of the diagonal element is less than a prescribed fraction of the largest element in the same row.

2. Olsen preconditioners. As described before, we have two different implementations of the Olsen preconditioner (the original and the modified form above). In each case one of the following three approximate solution schemes is used to approximate (A - θI)^{-1}: diagonal scaling, ILU and ILUTP.

3. Iterative solvers. We have implemented CG and GMRES to work with J_R, J_C, J_I and J_J. Two different kinds of tolerance schemes are used. First, a fixed number of matrix-vector multiplications and a fixed residual tolerance: the iterative solver is terminated when either the residual norm of the linear system drops below the tolerance or the number of matrix-vector multiplications exceeds the specified maximum. Second, a dynamic tolerance on both the residual and the matrix-vector multiplications: each time the iterative solver is called, the relative tolerance is halved until a minimum is reached, and the allowed number of matrix-vector multiplications is increased by a fixed increment at each of the first few calls and by a smaller increment thereafter.

[Table: conventional preconditioners applied to PLAT and EX (rows: NONE, diagonal, ILU, ILUTP; columns: MATVEC and time in seconds for each matrix); the numeric entries were lost in extraction.]

[Table: augmented Newton preconditioners on PLAT and EX (rows: ILU, ILUTP; columns: MATVEC and time in seconds for each matrix); the numeric entries were lost in extraction.]

Our first set of tests applies the Davidson method with the above preconditioning schemes to the two test matrices used in the previous section. The basis size for the Davidson method is fixed, and we seek the smallest eigenvalues of each matrix with the same initial guess for the eigenvector. In this experiment a tight tolerance is placed on the residual norm, and the maximum number of matrix-vector multiplications allowed is very large compared with the matrix sizes. The test is performed on a SPARC workstation.

The table of conventional preconditioners above shows the results of using some of the common preconditioners for iterative linear system solvers; only the Davidson method is used in this test. The preconditioners used here were also used in an earlier chapter, where more information can be found. It is hard to compute the smallest eigenvalues of PLAT because they are clustered together, whereas it is easy to find the smallest eigenvalues of EX because they are well separated. Matrix EX has many nonzero elements per row, which makes solving with the ILU preconditioners significantly more expensive than diagonal scaling. Even though the two ILU preconditioners need significantly fewer iterations to reach convergence than diagonal preconditioning, they still use more time than the diagonal scaling preconditioner. Ironically, the unpreconditioned Davidson method uses the fewest matrix-vector multiplications and the least time in this case.

The table of augmented Newton preconditioners shows the results of using ILU factorizations of J_A as preconditioners. Neither preconditioner was able to help the Davidson method reach convergence faster: not even one eigenvalue reached the residual norm requirement within the allowed number of matrix-vector multiplications for PLAT. The Davidson method with ILU on J_A as the preconditioner was able to find the five smallest eigenvalues of EX; however, it took significantly more matrix-vector multiplications and time than applying ILU to J_R = A - θI. This is largely due to the zero diagonal elements in J_A, which make it indefinite; part of it may also be attributed to the fact that the condition number of J_A can be large when x is far from any eigenvector. As a preconditioner for the Davidson method, ILU on J_A works better than ILUTP on J_A for the EX matrix.

The next two tables show the results of using the two Olsen preconditioning schemes with three incomplete factorizations of A - θI. The Davidson method failed to find any eigenpair of PLAT, so we can only compare the results for EX with the table of conventional preconditioners. In this case it is clear that the original Olsen scheme improves the effectiveness of the three incomplete factorizations, while the modified Olsen scheme does not. This difference can be explained by the fact that the original Olsen scheme produces a result that is orthogonal to the approximate eigenvector, while the modified form does not.

[Table: the Davidson method with the Olsen preconditioning scheme on PLAT and EX (rows: diagonal, ILU, ILUTP; columns: MATVEC and time in seconds); the numeric entries were lost in extraction.]

[Table: the Davidson method with the modified Olsen preconditioning scheme on PLAT and EX (same layout); the numeric entries were lost in extraction.]

The next table shows some results of using iterative linear system solvers as preconditioners, with a fixed limit on the number of matrix-vector multiplications the linear solver may use and a fixed residual tolerance. In this case we are able to find the smallest five eigenvalues of both PLAT and EX. When the matrix A is symmetric, three of the four Jacobian matrices, J_R, J_I and J_J, are symmetric; the remaining one, J_C, must be fairly close to symmetric in most cases, because the table shows that CG performs fairly well on it. For these reasons we will only use CG as the solver later on. Comparing the four different preconditioners, the Jacobi-Davidson scheme uses the least time on PLAT, and the original Davidson scheme uses the least time on EX.

The last table is similar, except that a dynamic tolerance is used for the linear system solvers. In most instances the dynamic scheme reduces the number of Davidson steps taken to reach convergence, but this is achieved at the expense of more matrix-vector multiplications in the preconditioners. Comparing the two tables, we see that this trade-off often leads to a reduction of the total execution time and/or the total number of matrix-vector multiplications; for example, the time with CG on J_R as the preconditioner for PLAT is smaller in the dynamic tolerance case than in the fixed tolerance case.

[Table: using iterative solvers with a fixed tolerance as preconditioners to solve PLAT and EX. Row groups: A - θI with CG; A - θI - x x^T with CG; A - θI - 2 x (A x)^T with CG and GMRES; (I - x x^T)(A - θI)(I - x x^T) with CG. Columns: Davidson MATVEC, total MATVEC, and time in seconds for each matrix; the numeric entries were lost in extraction.]

[Table: using CG with a dynamic tolerance as the preconditioner to solve PLAT and EX. Rows: A - θI; A - θI - x x^T; A - θI - 2 x (A x)^T; (I - x x^T)(A - θI)(I - x x^T). Columns: Davidson MATVEC, total MATVEC, and time in seconds; the numeric entries were lost in extraction.]

Summary

In this chapter we discussed a number of Newton methods for the eigenvalue problem. Through the discussion we identified the pros and cons of the Davidson preconditioning scheme in connection with the correction equation. In order for the correction equation to generate a nontrivial solution, we need to use the pseudo-inverse. Computing the pseudo-inverse is an impossible task in most large-scale applications; however, an approximate solution can be reached through Krylov subspace methods if the linear system is consistent. Since the iteration matrix A - θI is usually not exactly singular, the Jacobi-Davidson preconditioning scheme forces it to be singular by explicit orthogonalization. From the small number of experiments we have conducted, it appears that simply applying an iterative method to A - θI can work just as well as the Jacobi-Davidson scheme in some cases.

The correction generated by solving the correction equation with the pseudo-inverse is orthogonal to the exact eigenvector. The Olsen preconditioning scheme was designed explicitly to have this property, and this orthogonality is apparently important to its success: we tested a theoretically equivalent form of the Olsen scheme which does not enforce the orthogonality property, and the test results clearly indicate that the original Olsen scheme is more effective.

The thrust of this chapter is to find a Newton iteration with a nonsingular Jacobian matrix, in order to avoid the difficulty of operating with an ill-conditioned linear system in the preconditioning step. We identified several recurrences which can be considered Newton methods for eigenvalue problems. Of these, the augmented Newton method, the Jacobi-Davidson recurrence and the Newton-Olsen recurrence have been published before; the rest appear to be new. Even though the Jacobi-Davidson recurrence and the Newton-Olsen recurrence are not new, we presented them from a different point of view and proved their equivalence to the Rayleigh quotient iteration. A normalized augmented Newton method was derived through a simple modification of the augmented Newton method; it is also equivalent to RQI, because it is simply a different implementation of the Newton-Olsen recurrence. Among the various Jacobian matrices, we identified three, namely J_A, J_C and J_I, that are nonsingular near convergence. When the Ritz pairs are not close to any eigenpairs, our experiments show that J_R, J_C and J_I tend to have fairly small condition numbers.

We have implemented a number of the preconditioners and conducted a small number of tests on them. The experiments reveal that using CG to solve the Newton recurrences as preconditioners performs well compared to other schemes.

Chapter

Practical Davidson Algorithm

In previous chapters we identified the Arnoldi method and the Davidson method as the most promising eigensystem solvers for our application among the methods compared. Since the overall objective of this study is to construct a preconditioned eigensystem solver, how well a method works with a preconditioner is important to us, and compared with the different variations of the Arnoldi method, the Davidson method is usually better at taking advantage of preconditioners. Though we will not completely abandon other possibilities, the primary focus of this chapter is to discuss practical issues related to enhancing the performance of the Davidson algorithm for symmetric eigenvalue problems. In the implementation of the different eigenvalue methods, the techniques discussed here will be applied to all methods wherever applicable, not just to the Davidson method.

Briefly, the eigenvalue problems we are facing have the following characteristics. The matrices are large, sparse and Hermitian; for the moment we consider only real matrices, so the matrices are symmetric. These matrices are not stored explicitly, because the eigensystem solvers under consideration only need to access them through matrix-vector multiplications, which can be conveniently performed with the matrix stored in compact form. In addition, cheap and effective preconditioners can be constructed for the problem without direct reference to the matrices. A typical eigenvalue problem might require the solution of a few hundred of the smallest eigenvalues and corresponding eigenvectors of a matrix.

Several sequences of eigenvalue problems may be solved in one simulation, where the differences between two consecutive matrices diminish toward the end of each sequence. This makes the solution of the preceding eigenvalue problem a good initial guess for the next one; therefore the eigenvalue solver should be able to take advantage of available initial guesses.

There are many issues which need to be addressed in constructing a practical Davidson eigensystem solver, many of which have been discussed in a review paper by Davidson. Here we only add a few remarks on recent work that has strongly influenced our research. One focal point of our research is the preconditioning issue, which was reviewed in the previous chapter. Another significant development in eigenvalue methods is restarting techniques: the recent introduction of the Implicitly Restarted Arnoldi (IRA) method has sparked renewed interest in restarting. Two restarting schemes for the Davidson method are of particular interest here. The first one is due to Murray and colleagues; it saves Ritz vectors of the previous iteration in addition to the latest Ritz vectors. The second one, called dynamic thick restart, is from Stathopoulos and coauthors; it dynamically decides the number of Ritz vectors to save when restarting. Both of these schemes have been shown to work well in tests, and we will study how they work when computing a large number of eigenpairs. The computing environment has also changed significantly since the introduction of the Davidson method; part of this change is the emergence of parallel computers. There are a number of references on parallel and distributed Davidson methods. Many issues related to parallel implementation are not unique to the Davidson eigenvalue method, and there are many publications devoted to parallel and distributed computing that address important aspects of a parallel Davidson method. The publications most directly related to our work concern the overall program structure for sparse computation, the design of matrix-vector multiplications, and parallel preconditioning.

The particular issues most important to our application include how to compute a large number of eigenpairs efficiently, how to maintain a good orthonormal basis when there are many vectors to orthogonalize, and how to restart. This chapter addresses these issues one by one.

[Table: size of the test problems (columns: matrix, N, NNZ, number of eigenvalues; rows: Si2, Si4, Si6); the numeric entries were lost in extraction.]

To test the techniques numerically we use a set of three small test matrices generated from the electronic structure simulation project (see the table above). These three matrices are generated from the first self-consistent iteration in the simulation of a two-silicon cluster (Si2), a four-silicon cluster (Si4) and a six-silicon cluster (Si6). The wavefunction is taken to be zero a few atomic units away from the atoms, and the core radius for the nonlocal interaction and the grid size are fixed in atomic units. The table shows the size of the matrices extracted and the number of eigenvalues to be computed. The average number of nonzero elements per row grows from Si2 to Si6, and for larger problems we expect this average to be only slightly higher. The number of eigenvalues to be computed here is much more than needed to simulate the structures indicated; however, the ratio of the number of eigenvalues to the matrix size is close to typical applications. An eigenpair is considered converged if its residual norm is below a fixed tolerance. The maximum number of matrix-vector multiplications allowed per eigenpair is set well above the average observed in most of our simulations, so it is a reasonable upper bound. In all experiments the maximum basis size is limited, and unless otherwise specified the Ritz vectors corresponding to the smallest Ritz values are kept for restarting.

Program structure

After reviewing the basic operations performed in the eigenvalue methods considered, we notice that they share the same basic building blocks identified for Krylov linear system solvers: SAXPY, dot product, matrix-vector multiplication, and preconditioning.

To simplify the transition from a scalar program to a parallel program, we implement our eigenvalue routines in the Single-Program Multiple-Data (SPMD) paradigm. Each long column vector is divided evenly among the processors if there is more than one. Excluding the matrix-vector multiplication and the preconditioner, the only difference in the body of the eigensystem solvers between a scalar version and a parallel version is the dot product x^T y. In the parallel program the dot product is performed in two steps: first, compute the dot product of the components of the long vectors that reside on the same processor; then add up the partial sums from the different processors and return the result to everyone. This second step is called a global sum. In most distributed environments the global sum is supported as a primitive function. The global sum is often considered an expensive operation, because all processors involved have to wait for each other; if one processor lags behind for any reason, the rest of the processors are slowed down. To reduce the number of times the global sum operation is performed, a number of dot products may be lumped together and their partial sums communicated in one global sum operation, as sketched below.
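
A sketch of the two-step dot product with lumping, assuming mpi4py (the thesis's own implementation is written differently); three partial dot products are combined into a single global sum.

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rng = np.random.default_rng(comm.Get_rank())
# x_loc, y_loc, z_loc stand for this processor's slices of three distributed vectors
x_loc, y_loc, z_loc = rng.standard_normal((3, 1000))
partial = np.array([x_loc @ y_loc, x_loc @ z_loc, y_loc @ z_loc])   # local step
total = np.empty_like(partial)
comm.Allreduce(partial, total, op=MPI.SUM)   # one global sum for three dot products
# total now holds the three full dot products on every processor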

One of the central steps of the eigenvalue methods under consideration is the matrix-vector multiplication. Before building an eigenvalue routine for sparse matrices, a decision has to be made on the storage format of the sparse matrices. Many storage formats exist, and in many cases it is most convenient to store the matrix in a specialized compact form (see the earlier chapter). One way of isolating the eigenvalue routine from the sparse matrix storage format is to use reverse communication. Since the eigenvalue algorithms under consideration only need to access the matrices through matrix-vector multiplication, the reverse communication technique allows us to remove the effects of the sparse matrix storage format from the body of the eigenvalue routine. This data-hiding feature is an important characteristic of object-oriented programming, and by removing the details of the sparse matrix storage from the body of the eigenvalue routine it also significantly simplifies the implementation.

An additional advantage of using reverse communication is that the preconditioning step can be performed outside of the core of the eigensystem solver as well. This gives us the flexibility of switching between arbitrary preconditioners, which makes testing of preconditioners considerably easier.

With reverse communication, an eigensystem solver may return to the caller for one of three reasons: termination, a request for a matrix-vector multiplication, or a request for preconditioning. The reverse communication protocol we use is close to that of P_SPARSLIB. A parameter array ipar is used to carry the reverse communication information between the eigenvalue routine and the caller. Here is a piece of pseudocode demonstrating the design of the reverse communication protocol; the eigenvalue routine is named pesdvd, the matrix-vector multiplication routine is matvec, and the preconditioner is preconditioning. The symbols MATVEC and PRECON stand for the integer request codes assigned to the two operations.

 10   call pesdvd(ipar, ...)
      if (ipar(1) .eq. MATVEC) then
         call matvec(...)
         goto 10
      else if (ipar(1) .eq. PRECON) then
         call preconditioning(...)
         goto 10
      endif

Note that the request codes used for the matrix-vector multiplication and the preconditioning are the same as in the design of P_SPARSLIB.

Computing large number of eigenvalues

The application of interest requires a large number of eigenpairs, so it is crucial for the eigensystem solver to be able to find a large number of eigenpairs effectively. This is the major issue we address now.

There are a number of guidelines for a solution to this problem. First, the workspace size should be independent of the number of eigenvalues wanted. Second, all initial guesses provided should be fully utilized. One technique that is most important to realizing these two requirements is called locking. When computing a large number of eigenpairs, some of them reach convergence earlier than others; the process of taking the converged ones out of the working set is referred to as locking. Once the converged eigenpairs are taken out, their eigenvectors are used in the orthogonalization procedure to make sure the active basis vectors remain perpendicular to them. This is part of the orthogonalization step of the algorithm, which orthogonalizes each new vector against the converged eigenvectors and the existing basis vectors, as sketched below. Let m be the basis size and p the size of the working set of approximate eigenpairs. If the size of the working set is small, it is possible to keep m relatively small as well, therefore requiring only a small workspace. Once some of the p Ritz vectors reach convergence, they are taken out of the working set and new initial guesses are put into the working set to replace them. This process repeats until enough eigenpairs are found.
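
The orthogonalization against locked eigenvectors and the current basis is the key step that makes locking work. A minimal sketch, assuming numpy; the second pass is a common safeguard against cancellation and is an assumption of this sketch, not something prescribed by the text.

import numpy as np

def orthogonalize(z, V, X):
    # z      : new direction to be added to the basis
    # V (n,j): current orthonormal basis vectors
    # X (n,c): locked (converged) eigenvectors
    # returns (I - V V^T - X X^T) z, normalized; two passes for numerical safety
    for _ in range(2):
        if X.shape[1] > 0:
            z = z - X @ (X.T @ z)
        if V.shape[1] > 0:
            z = z - V @ (V.T @ z)
    nrm = np.linalg.norm(z)
    return z / nrm if nrm > 0.0 else z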

In the Davidson method, at least one residual vector is computed at every step. Let V denote the basis vectors computed by the Davidson method. If the basis size m is small, it is often necessary to save W = AV in memory and compute the residual vector as r = W y - θ V y, where (θ, y) is a chosen eigenpair of H = V^T A V. In this case only one matrix-vector multiplication is required to compute the Ritz pair and its residual vector after a new vector is added to the basis. If the basis size is large, computing W y may be more expensive than performing a matrix-vector multiplication, in which case two matrix-vector multiplications per step would be used: one to compute a new column of V^T A V and one to compute the residual.
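
The single-matvec residual computation described above amounts to a few lines once W = AV is kept; a minimal sketch assuming numpy:

import numpy as np

def ritz_pair_and_residual(V, W, which=0):
    # V holds the orthonormal basis, W = A V is kept up to date as V grows;
    # only the new column of W costs a matrix-vector product with A.
    H = V.T @ W                       # projected matrix V^T A V
    theta, Y = np.linalg.eigh(H)
    y = Y[:, which]
    u = V @ y                         # Ritz vector
    r = W @ y - theta[which] * u      # residual without another product with A
    return theta[which], u, r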

Without locking, it would be necessary to keep all the converged eigenvectors in the basis, as is done in some implementations. The advantages of that scheme include the larger basis size, which can lead to faster convergence in some cases; in addition, the accuracy of eigenpairs that converged early continues to improve, which leads to smaller residual norms for some eigenpairs. In the type of eigensystem of interest here, however, the number of eigenpairs wanted is exceedingly large, which makes locking a necessity: maintaining orthogonality among that many vectors is a difficult task, and performing a Rayleigh-Ritz projection on a basis of that size is expensive both in memory and in arithmetic operations. Because the basis would be so large, it is also impractical to save AV, in which case two matrix-vector multiplications are needed at every iteration of the Davidson method. In a distributed environment the matrix-vector multiplication is the communication-intensive operation, so using two matrix-vector multiplications per step is not a desirable option.

Another alternative is to keep working on all the unconverged eigenpairs at once and lock the converged ones as they become ready. This has been suggested in earlier work. One advantage of this alternative is that the basis can be stored in the same space reserved to store the eigenvectors. Since the number of eigenvectors wanted is very large, the basis size may be too large initially. If less than a few hundred eigenpairs are wanted, then it is possible that the advantage of working with larger bases outweighs the disadvantages. Because we design our program to compute more than a thousand eigenpairs, we have not favored this option.

If a group of eigenvectors is being processed, then a group of initial guesses is available at the start of the Davidson iterations. We could use a block version of the Davidson method, for example the Davidson-Liu variant. However, the block version is often not the most effective approach; in addition, the number of vectors in the working set is usually too large to use as the block size (see the discussion of block size under Miscellaneous Issues below). It has also been suggested that the basis V should be extended one vector at a time. The following algorithm takes all of these issues into consideration and makes two additional modifications to the basic Davidson algorithm:

1. It starts to build the basis V with b_0 starting vectors (see also the earlier algorithm).

2. It computes more than one Ritz pair in the Rayleigh-Ritz procedure; the number computed is referred to as the active window size p.

Both of these modifications have been mentioned in the literature. The following algorithm combines them to compute a large number of eigenpairs by working on a small active set at a time.

Algorithm (Restarted Davidson): restarted Davidson algorithm with locking for finding the d smallest eigenvalues lambda_i and corresponding eigenvectors x_i, i = 1, ..., d, of a symmetric matrix A.

1. Start: Choose d initial vectors x_1, x_2, ..., x_d. Choose the active window size p to be an integer less than the maximum basis size m. Let c = 0.

2. Iterate to build an orthonormal basis V_m = [v_1, ..., v_m] that is orthogonal to X_c = [x_1, ..., x_c], where c is the number of eigenpairs converged. Fill the array Z with the first p available unconverged eigenvectors; if there are fewer than p eigenvectors left, reduce p to the actual number remaining. Choose an initial basis size b, filling the remaining columns of Z with Ritz vectors of the previous iteration if available. Let b be the actual number of column vectors in Z and let j = 0.

   (a) Orthogonalization:
       Z <- (I - V_j V_j^T - X_c X_c^T) Z,  (Q, R) <- QR(Z),  [v_{j+1}, ..., v_{j+b}] <- Q.

   (b) w_i = A v_i, i = j+1, ..., j+b.

   (c) h_{ik} = v_k^T w_i for i <= k, k = 1, ..., j+b; then set j <- j + b and b <- 1.

   (d) If j < m then perform preconditioning:
       i.   compute the smallest eigenvalue theta and corresponding eigenvector y of H_j = V_j^T W_j,
       ii.  r = W_j y - theta V_j y, where V_j = [v_1, ..., v_j] and W_j = [w_1, ..., w_j],
       iii. Z = (diag(A) - theta I)^{-1} r,
       iv.  go to (a);
       else continue to the next step.

3. Rayleigh-Ritz procedure: Let theta_1 <= ... <= theta_p be the p smallest eigenvalues of H_m = (h_{ik}) and y_1, ..., y_p the corresponding eigenvectors, i.e., H_m y_i = theta_i y_i. Let Y = [y_1, ..., y_p] and set
       V_p <- V_m Y,  W_p <- W_m Y,  H_p <- diag(theta_1, ..., theta_p).
   The new approximations to the eigenvalues and eigenvectors are, for i = 1, ..., min(p, d - c),
       lambda_{c+i} = theta_i,  x_{c+i} = v_i,  r_{c+i} = w_i - lambda_{c+i} x_{c+i}.

4. Convergence test: If ||r_{c+1}|| is below the convergence tolerance, increment c by one and test the next Ritz pair. If c < d, return to step 2.

Table: Finding different numbers of eigenvalues from the Si matrix. For each value of d, the number of matrix-vector multiplications (MATVEC) and the execution time are listed for the Arnoldi, orthogonalized Arnoldi, Davidson, and harmonic Davidson methods.

Table: Increase in MATVEC and time versus increase in the number of eigenvalues, for the same four methods.

This algorithm computes d eigenvalues by working on p of them at a time, which is an extension of what was proposed earlier, where the algorithm would only compute p eigenvalues. It should be trivial to extend the Arnoldi method, or any other eigenvalue method mentioned in the previous chapter, to compute a large number of eigenpairs in a similar fashion.

In the Rayleigh-Ritz procedure of the above algorithm we order the eigenvalues of H_m from small to large, in which case the p smallest eigenvalues are simply the first p eigenvalues. If the eigenvalues are in descending order instead, the above procedure can be used to compute the largest eigenvalues. Similarly, if the eigenvalues of H_m are ordered according to their distance from a given number, the above algorithm can be used to compute eigenvalues around that chosen number.
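To make the locking orthogonalization of step 2(a) concrete, here is a minimal numpy sketch in which the new vectors are projected against both the current basis and the locked eigenvectors before a QR factorization; the dense projections and the routine name are illustrative assumptions, and a production code would use the communication-oriented procedures described later in this chapter.

    import numpy as np

    def orthogonalize_against_locked(Z, V, X):
        # Z: new vectors; V: current orthonormal basis; X: locked (converged)
        # eigenvectors.  Project Z against both sets, then orthonormalize with QR.
        for B in (V, X):
            if B is not None and B.shape[1] > 0:
                Z = Z - B @ (B.T @ Z)
        Q, _ = np.linalg.qr(Z)
        return Q        # columns to be appended to the basis V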

The first table above shows the number of matrix-vector multiplications and the execution time of four different eigenvalue methods, namely the Arnoldi method, the orthogonalized Arnoldi method, the Davidson method and the harmonic Davidson method (described in detail in an earlier chapter). The MATVEC column shows the number of matrix-vector multiplications used; the time column shows the execution time in seconds. Only the Si matrix is used in this test, and no preconditioning is used. The other parameters, namely the maximum basis size m, the active window size p, and the initial block sizes, are kept fixed across the runs.

Table: The eigenvalue routines with different preconditioning. For each of the three Si test matrices, the number of matrix-vector multiplications (MATVEC) and the execution time are listed for the Arnoldi, orthogonalized Arnoldi, Davidson, and harmonic Davidson methods, under no preconditioning, diagonal scaling, the SOR preconditioner, and the ILU preconditioner.

On average, when the number of eigenvalues doubles, the number of matrix-vector multiplications and the CPU time both increase by roughly constant factors (see the second table above). Overall, the increase in time and matrix-vector multiplications is slower for the Davidson method than for the other three. It is interesting to note that the Arnoldi method uses the least amount of time when few eigenvalues are computed, the orthogonalized Arnoldi method uses the least amount of time for an intermediate number of eigenvalues, and the Davidson method uses the least amount of time for the largest numbers of eigenvalues. Since no preconditioning is used, the four methods shown in the table are equivalent to each other. Because the Arnoldi method is the least expensive per step, it usually uses the least amount of time to execute the same number of steps. When the number of eigenpairs wanted is small, there are fewer opportunities for roundoff errors to erode the stability of the Arnoldi method; thus the number of steps taken by the Arnoldi method and the others is almost the same, and the Arnoldi method uses less time to compute the same solution. The orthogonalized Arnoldi method is more expensive but numerically more stable than the Arnoldi method. In turn, the Davidson method is more expensive than the orthogonalized Arnoldi method and even more resistant to the roundoff errors that might impede convergence. When computing many eigenpairs, the Davidson method uses significantly fewer matrix-vector multiplications than the other three. This test indicates that as we increase the number of eigenvalues computed, the Davidson method becomes more competitive compared with the Arnoldi method.

The preconditioning table above shows how the eigenvalue routines work with different preconditioners. The maximum number of matrix-vector multiplications per eigenpair is limited. If the MATVEC column has a negative number, it indicates that the method did not reach convergence on all wanted eigenvalues within the allowed number of matrix-vector multiplications; the absolute value of the number indicates the number of eigenvalues actually converged, and the corresponding time column is the time used to compute them. For example, with the SOR preconditioner the Arnoldi method computed only some of the wanted eigenpairs of the Si matrix within the allowed number of matrix-vector multiplications. In every case where more than one method reached convergence, the Davidson method uses less time than the others. The results in this table mirror what we have observed with the Harwell-Boeing test matrices in the previous chapter: the preconditioners do not improve the performance of the eigensystem solvers except for the Davidson method. Even for the Davidson method, only the SOR preconditioner uses less time on the Si case compared to the unpreconditioned case. A better preconditioner is necessary for this type of eigenvalue problem.

Restarting

The Lanczos method, the Arnoldi method and the Davidson method all require restarting after a certain number of steps. The same reasons also require restarting of the algorithm above, because as the number of steps increases, the space required to store V_j increases proportionally. This creates many difficulties, such as the storage problem and loss of orthogonality. As j increases, the cost of the Gram-Schmidt orthogonalization procedure increases as O(nj), and the cost of performing the Rayleigh-Ritz projection and the harmonic Ritz projection grows even faster with j. For these reasons the maximum basis size is usually very small; in most cases the maximum basis size we use is quite small.

The restarting issue has been addressed in many papers. One natural choice is to restart with Ritz vectors; this is what is done in the algorithm above. Davidson suggested that one should save more than d Ritz vectors if d eigenvectors are sought; exactly how many was a question left unanswered. Observations of the convergence properties of the Conjugate Gradient method motivated some authors to save the Ritz vectors from the last two steps as the initial guesses; this scheme is quite successful in some cases. A relatively new scheme named Implicit Restarting, due to Sorensen, restarts with a set of Schur vectors rather than Ritz vectors. One advantage of using Schur vectors instead of Ritz vectors is that Schur vectors always exist, even for defective matrices, while Ritz vectors may not. In addition, Schur vectors can be kept real in case the matrix has complex eigenvalues or eigenvectors. Both of these advantages apply only if the matrix is nonsymmetric; in the symmetric or Hermitian case a Schur vector differs from a Ritz vector only by a scalar constant. For this reason only Ritz vectors are considered for restarting in the symmetric case.

Before discussing the details of our restarting scheme, we review some guidelines for restarting.

1. The first and most important requirement on a restarting scheme is that the quality of the approximate solution should not decrease after restarting. For Hermitian eigenvalue problems the residual norm is a good measure of the quality of the approximation, in which case the residual norm should not increase after restarting. The simplest way of achieving this is to include the latest Ritz vectors in the basis, as the algorithms above do.

2. Preserve the Ritz vectors near the wanted eigenvectors at restart. This was suggested as a feature for the Davidson method, and a formal justification of this technique for the Arnoldi method was given by Morgan. Keeping Ritz vectors with Ritz values close to the desired one has the effect of increasing the separation between the wanted eigenvalue and the rest of the spectrum, therefore increasing the convergence rate. A variation of this scheme, which saves Schur vectors instead of Ritz vectors, is implemented in ARPACK.

3. Preserve enough information to prevent repetition. Without preconditioning, many eigenvalue methods essentially perform the Rayleigh-Ritz projection on an orthonormal basis generated from a power series. These methods always favor the extreme eigenvalues: if some Ritz vectors corresponding to extreme eigenvalues are discarded, they are likely to reappear. Thus one technique for avoiding repetition is to keep the Ritz vectors corresponding to the extreme eigenvalues in the basis. A technique for preserving unwanted eigenvectors has also been discussed in the literature, and in the linear system case the augmented Krylov subspace technique is based on the same principle. Based on the three-term recurrence of the Lanczos method, Murray and coauthors suggest that preserving Ritz vectors from an earlier iteration can avoid repetition.

4. Restart based on convergence predictions rather than heuristics, for example the dynamic thick restart scheme.

The intuition on the number of restarting vectors is as follows: the more information saved, the more likely it is that later iterations will not repeat the previous work; and the more steps taken before restarting, the more new information can be incorporated into the solutions, i.e., the more accurate the next approximate solution will be. In considering the restarting scheme, we assume the goal of the eigensystem solver is to make one eigenpair converge at any given time. With this in mind, the question of what to do at restart has two components:

1. how many nearby Ritz vectors to save;

2. whether or not to save the Ritz vector of the previous iteration.

On the first question, until recently the general guideline has been to keep all the Ritz vectors corresponding to wanted eigenvectors plus a small number of nearby ones. Recently, Stathopoulos and colleagues proposed a dynamic scheme to determine the number of Ritz vectors to be saved. The proposed scheme is based on the observation that the saved Ritz vectors in the basis approximately deflate the spectrum. To determine how many Ritz vectors to save, the dynamic thick restarting scheme uses the Ritz values to approximate the gap ratio after deflation. Saving many vectors for restarting is called thick restart. This section describes a modified form of the dynamic thick restarting scheme.

To simplify the description, we will only describe the case where the smallest eigenvalue is wanted; it is straightforward to extend this to the case where more than one eigenpair is sought or the largest eigenvalue is needed. It has been shown that the convergence rate of the thick-restarted Arnoldi method is approximately proportional to the gap ratio

    gamma = (lambda_{t+1} - lambda_1) / (lambda_n - lambda_1),

assuming lambda_1 is the smallest eigenvalue, lambda_n is the largest eigenvalue, t is the number of nearby Ritz vectors saved at restarting, and the lambda_i are exact eigenvalues. Since the exact eigenvalues are unknown, the gap ratio is approximated as

    gamma ~ (theta_{t+1} - theta_1) / (theta_m - theta_1),

where m is the basis size, i.e., the number of Ritz values theta_i computed by the Rayleigh-Ritz projection. In this case only the Ritz vectors corresponding to the t smallest Ritz values are saved. This has been extended to saving Ritz vectors corresponding to both the largest and the smallest Ritz values: if t_L Ritz vectors corresponding to the smallest Ritz values and t_R Ritz vectors corresponding to the largest Ritz values are saved at restart, the convergence rate of the Arnoldi method is expected to be proportional to

    gamma ~ (theta_{t_L+1} - theta_1) / (theta_{m-t_R} - theta_1).

This estimate of the gap ratio is used to determine the optimal t_L and t_R. The strategy can be used for all projection eigensystem solvers, including the Davidson method and the implicitly restarted Arnoldi method. The reduction in the number of matrix-vector multiplications with this dynamic restart scheme is significant in many cases.
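The following sketch shows one plausible way of selecting t_L and t_R by maximizing the estimated gap ratio from the computed Ritz values; the cap t_max and the exhaustive search are our own illustrative choices rather than the exact rule of the dynamic thick restart scheme.

    import numpy as np

    def choose_thickness(theta, t_max):
        # theta: Ritz values in ascending order; theta[0] approximates the wanted
        # eigenvalue.  Maximize the estimated gap ratio
        #   gamma = (theta[t_L] - theta[0]) / (theta[m - 1 - t_R] - theta[0])
        # (0-based indexing) subject to t_L + t_R <= t_max.
        m = len(theta)
        best_tL, best_tR, best_gamma = 1, 0, 0.0
        for t_L in range(1, t_max + 1):
            for t_R in range(0, t_max - t_L + 1):
                if t_L + t_R >= m - 1:
                    continue
                denom = theta[m - 1 - t_R] - theta[0]
                if denom <= 0.0:
                    continue
                gamma = (theta[t_L] - theta[0]) / denom
                if gamma > best_gamma:
                    best_tL, best_tR, best_gamma = t_L, t_R, gamma
        return best_tL, best_tR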

If a total of t = t_L + t_R Ritz vectors is saved for restarting, computing these t Ritz vectors from V_m requires a number of floating-point operations (FLOP for short) proportional to tmn, where n is the dimension of the vectors. We represent this step by V_t <- V_m Y, where Y contains the selected eigenvectors of H_m. To avoid matrix-vector multiplications, we can also recombine the columns of W_m to form the first t columns of W = AV, i.e., W_t <- W_m Y; this again requires about tmn FLOP. The residual vector corresponding to the smallest Ritz value needs to be computed to test convergence, which needs O(n) FLOP. The total floating-point operation count for extending the basis from size t back to m is proportional to

    sum over j = t, ..., m of j n,

excluding the matrix-vector multiplications and preconditioning operations. The average FLOP count per step is obtained by dividing this total, together with the roughly tmn operations spent recombining the basis at restart, by the m - t steps taken between restarts, and then adding op(A) and op(M), where op(A) denotes the operation count of a matrix-vector multiplication and op(M) denotes the operation count of a preconditioning operation. Based on this estimate, the larger t is, the faster the eigenvalue method will converge; however, the larger t is, the more expensive an average Davidson iteration becomes. To achieve convergence in the shortest amount of time it is necessary to balance the two factors.

Note that when t is close to m, the gap ratio estimated by the above formula becomes invalid. However, since the dynamic scheme has been shown to be effective by numerical tests, we will continue to use this formula. One slight modification is that, instead of using the gamma computed at the current step, the maximum of all gamma values computed so far is used. This modification makes the computed gamma slightly closer to the actual gap ratio.

When restarting to build a new basis, if it is not the first time step 2 is entered, there is a significant amount of information that would otherwise go unused in the algorithm above. For example, the vectors in V are orthonormal, so it is not necessary to orthogonalize them again. If no Ritz pair reached convergence in the previous iteration, there are p Ritz vectors at the beginning of the array V (see step 3). If the first p columns of V do not need to go through the orthogonalization step, then the corresponding columns of W need not undergo step 2(b) either. Similarly, step 2(c) can be skipped, because step 3 has already put the appropriate data in H. If some Ritz pairs reached convergence, we could place the unconverged Ritz vectors at the beginning of V and permute W and H so as to avoid performing steps 2(a)-(c) on these unconverged Ritz vectors.

Table: Effects of dynamic restarting on the test problems. For each of the three Si test matrices, the number of matrix-vector multiplications (MATVEC) and the execution time are listed for the Arnoldi, orthogonalized Arnoldi, Davidson, and harmonic Davidson methods, using the gap-ratio-only scheme and the gap-ratio-and-work scheme.

The table above shows the results of the two dynamic restarting schemes applied to the test problems. The upper half of the table shows the results of using the dynamic thick restart. Generally speaking, this scheme based on the gap ratio tends to decrease the total number of matrix-vector multiplications required. It does so by saving most of the Ritz vectors when restarting; in fact, an upper bound on the thickness t is needed to prevent it from becoming m, so for a given basis size the scheme is not allowed to keep more than a fixed maximum number of Ritz vectors. Most of the time this scheme saves the maximum allowed number of Ritz vectors. This makes the method expensive, because recombining the basis to form a large number of Ritz vectors is an expensive operation.

The second scheme we implemented tries to maximize the estimated convergence gain per unit of work, that is, the ratio between |log gamma| and the average floating-point operation count per step estimated above. The table shows that this new scheme generally uses more matrix-vector multiplications than the fixed restarting scheme and the dynamic thick restart based on the gap ratio alone. Because it restarts with fewer Ritz vectors, it is less expensive per matrix-vector multiplication than the other two. From the table we can see that, even though more matrix-vector multiplications are needed, it still uses less time in many cases.

Comparing this table with the unpreconditioned case in the earlier table, we notice that the dynamic schemes reduce the total execution time in some of the Si cases, but in others the fixed-size thick restart scheme uses less time than both dynamic restarting schemes.

The previous Ritz vector, if it is used, is appended to the regular Ritz vectors saved for restarting. For example, if the restarting scheme decides to save p Ritz vectors, the Ritz vector of the previous step is appended as the (p+1)st vector in V. Let y-hat be the eigenvector of H_{m-1} corresponding to the smallest eigenvalue of H_{m-1}; this (p+1)st vector is V_{m-1} y-hat.

When the old Ritz vector is used in the basis, it is not orthogonal to the p new Ritz vectors. A fairly inexpensive way of ensuring that the new basis is orthonormal is to ensure that the vector (y-hat; 0) is orthogonal to y_1, ..., y_p. Since y-hat has one element less than y_1, ..., y_p, we need to augment y-hat with a zero to make it the same size as the others. After this, the Gram-Schmidt procedure can be applied to make (y-hat; 0) orthogonal to y_1, ..., y_p. One characteristic of this orthogonalization worth mentioning is that it is performed on short vectors, in other words it is inexpensive; in addition, no data communication is necessary in distributed computing environments. Denote the resulting vector by

    y_a = (I - Y_p Y_p^T)(y-hat; 0) / ||(I - Y_p Y_p^T)(y-hat; 0)||,   where Y_p = [y_1, ..., y_p].

This vector can be appended to Y_p and used in step 3 of the algorithm as if p had been incremented by one; V_{p+1} and W_{p+1} in that step can be computed as

    V_{p+1} = V_m [y_1, ..., y_p, y_a],   W_{p+1} = W_m [y_1, ..., y_p, y_a].
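The short-vector orthogonalization just described can be sketched as follows in numpy; the function name is ours, and only vectors of length m are touched, so no distributed communication is required.

    import numpy as np

    def append_previous_ritz(Y_p, y_prev):
        # Y_p: m-by-p matrix whose columns are eigenvectors of H_m kept at restart.
        # y_prev: eigenvector of H_{m-1} (length m-1) defining the previous Ritz vector.
        # Augment y_prev with a zero, orthogonalize against the columns of Y_p,
        # normalize, and return [Y_p, y_a] for use as the recombination matrix.
        y_aug = np.append(y_prev, 0.0)
        y_a = y_aug - Y_p @ (Y_p.T @ y_aug)
        y_a /= np.linalg.norm(y_a)
        return np.column_stack([Y_p, y_a])   # V_{p+1} = V_m @ [Y_p, y_a], likewise W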

Table: Effects of saving the previous Ritz vector on the test problems. For each of the three Si test matrices, the number of matrix-vector multiplications (MATVEC) and the execution time are listed for the Arnoldi, orthogonalized Arnoldi, Davidson, and harmonic Davidson methods, using plain thick restart, dynamic restart based on the gap ratio, and dynamic restart based on the gap ratio and work.

By definition H_{p+1} = V_{p+1}^T A V_{p+1}. Since the first p vectors in V_{p+1} are Ritz vectors, the corresponding part of H_{p+1} is diagonal; only the last column needs special attention. Let v_{p+1} = V_m y_a, and let V_p denote the first p columns of V_{p+1}. The first p elements of the last column of H_{p+1} are

    V_p^T A v_{p+1} = Y_p^T V_m^T A V_m y_a = Y_p^T H_m y_a = diag(theta_1, ..., theta_p) Y_p^T y_a = 0,

because the columns of Y_p are eigenvectors of H_m and y_a is orthogonal to them. Thus the matrix H_{p+1} is also diagonal, and the last diagonal element is a Rayleigh quotient of v_{p+1}:

    (H_{p+1})_{p+1,p+1} = y_a^T H_m y_a = v_{p+1}^T A v_{p+1}.

The table above shows the results of saving one previous Ritz vector in addition to the normal restarting scheme; no preconditioning is used in this test. In many cases the same number of matrix-vector multiplications is needed with the previous vector as without it. The observation here is that the previous Ritz vector does not help the type of eigenvalue problem of interest. This could be explained by the fact that the high end of the spectrum is not significantly easier to compute than the smallest eigenvalues: for example, computing the largest eigenvalues of one of the Si matrices takes the Davidson method about two thirds of the number of matrix-vector multiplications needed to compute the same number of smallest eigenvalues. Thus the discarded Ritz vectors corresponding to the largest eigenvalues will not emerge very quickly.

Orthogonalization

The most straightforward way of implementing the orthogonalization step of the eigenvalue algorithm is the classical Gram-Schmidt (CGS) procedure.

Algorithm (Classical Gram-Schmidt): to make a new vector z orthogonal to an orthonormal set of vectors v_1, ..., v_j:

1. c = [v_1, ..., v_j]^T z,

2. z = z - [v_1, ..., v_j] c.

This algorithm is simple, but experience shows that it is not very stable. A more stable alternative is the following modified Gram-Schmidt (MGS) procedure.

Algorithm (Modified Gram-Schmidt): to make a new vector z orthogonal to an orthonormal set of vectors v_1, ..., v_j. For i = 1, ..., j do:

1. c_i = v_i^T z,

2. z = z - v_i c_i.

The main difference between CGS and MGS is that CGS performs the j dot products together, while MGS does them one after the other. In a distributed environment a global sum operation needs to be inserted between steps 1 and 2 of the above two algorithms. As discussed before, the global sum operation requires every processor to synchronize, i.e., there is a significant startup time associated with each global sum operation. The classical Gram-Schmidt procedure allows one to combine the j dot products and thus reduce communication time by reducing the number of global sum operations; therefore CGS is better suited to a distributed environment. To increase the quality of the solution from CGS, reorthogonalization is performed under certain conditions. CGS with reorthogonalization can be stated as follows.

Algorithm (Classical Gram-Schmidt with reorthogonalization): to make a new vector z orthogonal to an orthonormal set of vectors v_1, ..., v_j:

1. tau = ||z||,

2. c = [v_1, ..., v_j]^T z,

3. z = z - [v_1, ..., v_j] c,

4. if ||z|| <= eta tau, go to step 1, else continue,

5. z = z / ||z||.

It is clear that the larger eta is, the more likely it is that the Gram-Schmidt procedure is repeated. Any number between zero and one can be considered reasonable; we use a fixed value of eta throughout this thesis unless otherwise specified. A normalization step is appended to this algorithm because the procedure is intended to generate an orthonormal basis.

To increase the efficiency of the program in a parallel environment, the algorithm above can be slightly modified to remove the computation of ||z|| at step 1. The vector c in the following algorithm is a short vector that resides in every processor, thus computing c^T c does not involve global communication. The test in step 3 of the following algorithm is exactly equivalent to step 4 of the algorithm above, because v_1, ..., v_j is orthonormal and therefore the norm of z before the projection satisfies ||z_old||^2 = ||z_new||^2 + c^T c.

Algorithm (Classical Gram-Schmidt with reorthogonalization, communication-friendly form): to make a new vector z orthogonal to an orthonormal set of vectors v_1, ..., v_j:

1. c = [v_1, ..., v_j]^T z,

2. z = z - [v_1, ..., v_j] c,

3. if c^T c >= ((1 - eta^2)/eta^2) z^T z, go to step 1, else continue,

4. z = z / sqrt(z^T z).
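A compact numpy rendering of this procedure is given below; the bound on the number of passes and the particular value of eta are illustrative assumptions on our part.

    import numpy as np

    def cgs_with_reorthogonalization(V, z, eta=0.7):
        # V: n-by-j matrix with orthonormal columns; z: new vector.
        # Classical Gram-Schmidt with the reorthogonalization test written only
        # in terms of c.T @ c and z.T @ z, as in the algorithm above.
        znorm2 = z @ z
        for _ in range(3):                   # at most a few passes in practice
            c = V.T @ z                      # j dot products: one global sum in parallel
            z = z - V @ c
            znorm2 = z @ z
            if c @ c <= (1.0 - eta**2) / eta**2 * znorm2:
                break                        # little was removed; no further pass needed
        return z / np.sqrt(znorm2)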

A natural way of extending the algorithm above to handle multiple vectors is to orthonormalize each new vector one after another, with the newly orthonormalized vector appended to v_1, ..., v_j immediately. This requires at least two global sums for each new vector. Since all the new vectors are available at the same time, it is natural instead to orthogonalize the new vectors against the old ones by extending the algorithm to a block version, in which the normalization step is replaced by a QR factorization. Denote the new vectors by Z = [z_1, ..., z_b]. To fully exploit the different possibilities, consider a more general decomposition Z = QB, where Q in R^{n x b} is an orthonormal matrix and B in R^{b x b} is an arbitrary real matrix. Let S = Z^T Z. To compute Q = Z B^{-1} we need to compute B. If Z is of full rank, S is a symmetric positive definite matrix and a Cholesky factorization can be used to compute an upper triangular B. In general, S is only symmetric positive semidefinite, and a more stable decomposition is the eigenvalue decomposition, i.e., S = Y L Y^T, where the columns of Y are the eigenvectors of S and L is a diagonal matrix whose diagonal elements L_ii = lambda_i, i = 1, ..., b, are the eigenvalues of S. Assume there are r nonzero eigenvalues and that they are in descending order, i.e., lambda_1 >= ... >= lambda_r > lambda_{r+1} = ... = lambda_b = 0. For stability reasons, Q is actually computed by replacing B^{-1} with the nonsingular part of the pseudo-inverse of B, i.e.,

    Q = Z Y_r diag(1/sqrt(lambda_1), ..., 1/sqrt(lambda_r)),

where Y_r denotes the first r columns of Y. Now Q has only r columns; we replace the first r columns of Z with them. In most cases the eigenvalue routine can continue with only one new vector added to the basis, thus as long as r >= 1 the orthogonalization can stop.

The process just described essentially computes an SVD decomposition of Z: Z = Q L^{1/2} Y^T. The approach taken here is not a stable technique for computing the SVD; the main reason for using it is to reduce interprocessor communication in a distributed environment. Using 64-bit IEEE arithmetic, the unit roundoff error is about 10^{-16}; any eigenvalue of S smaller than roughly the unit roundoff times the largest eigenvalue of S should be considered unreliable. This algorithm uses about twice as many floating-point operations as a Gram-Schmidt approach. If the new vectors are known to be linearly independent, then the Cholesky factorization of S can be used instead to reduce the cost of computing Z B^{-1}. Even though the Cholesky approach is less expensive in floating-point arithmetic, it is still more expensive than the Gram-Schmidt approach; this is the price for reducing data communication. In most cases this orthogonalization procedure receives only one new vector at a time, in which case the SVD procedure reverts back to simply normalizing the vector z, as in the last step of the algorithm above.

The columns of Z may be badly scaled, in which case the accuracy of the eigenvalues of S could be compromised. To avoid this difficulty, the diagonal elements of S are scaled to one through a symmetric scaling that is equivalent to scaling the columns of Z to unit norm. Let Y_r denote the first r columns of Y. It is clear that Z Y_r has r orthogonal columns, and the norms of these columns are sqrt(lambda_1), ..., sqrt(lambda_r). If lambda_i is small, the corresponding column of Z Y_r is a combination of b unit vectors that nearly cancel to form a small vector; the cancellation error may be serious in this case, and it is necessary to recompute this column of Q.
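A numpy sketch of this eigendecomposition-based block orthonormalization follows; the drop tolerance used to discard nearly dependent directions is an assumption on our part.

    import numpy as np

    def block_orthonormalize(Z, drop_tol=1e-14):
        # Orthonormalize the columns of Z via the eigenvalue decomposition of
        # S = Z.T @ Z.  Only Z.T @ Z and products of Z with small matrices are
        # needed, so a parallel version requires a single global sum (for S).
        d = np.sqrt(np.sum(Z * Z, axis=0))
        Zs = Z / d                             # scale columns to unit norm
        S = Zs.T @ Zs                          # small b-by-b Gram matrix
        lam, Y = np.linalg.eigh(S)             # eigenvalues in ascending order
        keep = lam > drop_tol * lam[-1]        # discard (nearly) dependent directions
        Q = Zs @ (Y[:, keep] / np.sqrt(lam[keep]))
        return Q                               # orthonormal columns replacing part of Z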

A block version of the Gram-Schmidt procedure can be described as follows.

Algorithm (Block Gram-Schmidt): orthonormalize Z = [z_1, ..., z_b] against V_j = [v_1, ..., v_j], where V_j^T V_j = I.

1. C = V_j^T Z; perform a global sum on all elements of C.

2. Z = Z - V_j C.

3. S = Z^T Z; perform a global sum on all elements of S.

4. Let c_i denote the i-th column of C and s_ik the element of S in row i and column k; set the flag REORTH to zero. For i = 1, ..., b: if c_i^T c_i is large relative to s_ii (the same test as in the single-vector algorithm), set REORTH to one.

5. Symmetrically scale S so that its diagonal elements become one, i.e., S <- D^{-1/2} S D^{-1/2} with D = diag(s_11, ..., s_bb).

6. Compute the eigenvalue decomposition S = Y L Y^T.

7. Assume the first r eigenvalues lambda_1 >= ... >= lambda_r of S are nonzero and let Y_r = [y_1, ..., y_r] be the corresponding eigenvectors; replace the first r columns of Z with Z D^{-1/2} Y_r diag(1/sqrt(lambda_1), ..., 1/sqrt(lambda_r)).

8. If REORTH = 1, go back to step 1, else stop.

Table: Execution time of the orthogonalization routines (CGS versus the SVD-based scheme) for different numbers of new vectors.

Table: Execution time of the Davidson method with the two different orthogonalization schemes on the three Si test matrices.

The first table above shows the time needed to orthogonalize three different sets of new vectors on an SGI workstation; the new vectors are orthogonalized against an existing set of orthonormal vectors. As the number of new vectors doubles, the execution time of CGS increases by slightly more than a factor of two. However, the time used by the SVD-based orthogonalization routine increases considerably more slowly, despite the fact that it uses more floating-point operations than CGS. The primary reason for the saving in execution time is that the SVD-based orthogonalization uses matrix-matrix (level-3 BLAS) operations, as opposed to the lower-level BLAS operations used by CGS; for every memory access, considerably more floating-point operations are performed in the SVD-based orthogonalization routine.

The second table shows the comparison of the two orthogonalization routines on the test problems. In the Davidson method there are only a small number of cases where the orthogonalization is performed on more than one vector; therefore the difference between using the CGS and SVD orthogonalization is not significant.

Targeting

In the preconditioning step of the restarted Davidson algorithm it is possible to generate j residual vectors. Since only one will be used, a choice has to be made about which residual to use. The process of making this choice is referred to as targeting, because the eigenpair whose residual is chosen will benefit the most from preconditioning. The preconditioning equation is of the form (M - sigma I) z = r; thus, along with the residual vector, it is also necessary to choose an appropriate shift. In our implementation of targeting, the shift is chosen to be the biased estimate of the eigenvalue corresponding to the residual vector. Some of the common targeting schemes are as follows.

1. Sloppy targeting. Always target the first unconverged pair. For example, if the smallest eigenvalues are wanted, this scheme always targets the smallest unconverged Ritz value. This is the scheme shown in the restarted Davidson algorithm above.

2. Minimum residual targeting. Target the Ritz pair with the smallest residual norm. The rationale is that it will converge quickly and be locked out of the basis. A disadvantage of this scheme is that residual norms are needed at every step; since residual norms cannot be estimated as conveniently in the Davidson method as in the Arnoldi method, a significant amount of work is required to compute all residual norms. Moreover, since in most of our applications extreme eigenvalues are wanted, this targeting scheme will encourage convergence of interior eigenvalues if one happens to have a small residual norm.

Table: Comparison of different targeting schemes (sloppy, minimum improvement, maximum improvement) on the three Si test matrices; for each scheme the number of matrix-vector multiplications (MATVEC) and the time in seconds are listed.

3. Minimum improvement targeting. Because the residual norms are not always available, the minimum residual scheme is usually not a viable option. For the unpreconditioned Arnoldi and Lanczos methods there is a direct proportional relation between the i-th residual norm and the last element of the eigenvector y_i. At step j, one could choose the i-th Ritz pair corresponding to the smallest |y_{j,i}| as the target. The value y_{j,i} indicates how much the j-th basis vector v_j is used in forming the i-th Ritz vector x_i. If |y_{j,i}| is small, the contribution of the newest basis vector v_j to x_i is small, which can be interpreted as the improvement to x_i being small; thus this targeting scheme is named minimum improvement targeting. Since it is an imitation of the minimum residual targeting scheme, it may also suffer from the misconvergence problem.

4. Maximum residual targeting. This scheme targets the Ritz pair with the largest residual norm. It tries to keep all wanted Ritz pairs converging at about the same speed; in other words, it minimizes the residual norm of the whole set of wanted eigenpairs. Similar to the minimum residual targeting, we need to compute the residual norms, which is expensive in the Davidson method.

5. Maximum improvement targeting. This is similar to the minimum improvement scheme; the difference is that the i-th Ritz vector corresponding to the largest |y_{j,i}| is chosen as the target. This is the practical form of the maximum residual targeting scheme, which avoids computing the residual norms.

In general, the Rayleigh-Ritz projection on a Krylov subspace basis gives accurate solutions for the extreme eigenvalues. In selecting the target from all possible Ritz pairs, half of the Ritz pairs are never considered: for example, if there are j Ritz values and the smallest eigenvalues are wanted, the j/2 largest Ritz values are discarded automatically and the target is sought among the j/2 smallest Ritz values only. Among these targeting strategies we have chosen to test three of them: the sloppy scheme, the minimum improvement scheme and the maximum improvement scheme. For most of our experiments so far we have been using the sloppy targeting scheme; here we will provide additional tests for the other two.
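The maximum improvement scheme can be written in a few lines; the sketch below assumes the eigenvectors of the projected matrix are stored column-wise and sorted by ascending Ritz value, and the routine name is ours.

    import numpy as np

    def pick_target_max_improvement(Y):
        # Y: j-by-j eigenvector matrix of H_j, columns ordered by ascending Ritz value.
        # Only the j/2 smallest Ritz pairs are considered; among them, target the
        # pair whose eigenvector has the largest last entry |y_{j,i}|, i.e. the one
        # to which the newest basis vector contributed the most.
        j = Y.shape[0]
        candidates = range(max(1, j // 2))
        return max(candidates, key=lambda i: abs(Y[-1, i]))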

The targeting table above shows the results of using different targeting schemes on the three test problems; only the Davidson method is used here. In one of the Si cases the maximum improvement scheme found an eigenvalue missed by the other two schemes, which might be the cause of it taking extra iterations. In general, the maximum improvement scheme uses fewer matrix-vector multiplications than the other schemes if a small number of eigenpairs is sought. In another Si case the sloppy targeting scheme uses the least number of matrix-vector multiplications, which indicates that if the number of wanted eigenpairs is large, sloppy targeting is the more effective approach.

In the unpreconditioned case the residual vectors from both the Arnoldi method and the Lanczos method are parallel to each other, therefore targeting does not apply to them. Even if the residual vectors are not parallel to each other, as in the case of variable preconditioning and thick restart, computing the residual vectors and performing targeting would turn these methods into the Davidson method. For this reason targeting is only applicable to the Davidson method.

Figure: Boundary of finite difference domains (2nd order versus 4th order finite difference).

Miscellaneous Issues

This section describes a number of issues that can each be dealt with in a few paragraphs. Some of these might be very important to our application; however, we feel that they are well understood and it is fairly safe to take the existing solutions as is. Our implementation of the Davidson method is strongly influenced by earlier codes.

Matrix-vector multiplication. To move the eigenvalue solvers onto a distributed environment, one of the major tasks is to build a parallel matrix-vector multiplication. The matrix-vector multiplication can be divided into three components: the Laplacian operator, the local potential, and the nonlocal potential. The Laplacian operator is discretized on a uniform grid with a high-order finite difference scheme (see the chapter on electronic structure simulation for details). Because of this, a significant portion of the grid points are on the boundary of two or more neighboring regions; the figure above illustrates this point on a small 2-D grid. For example, using a high-order finite difference scheme on a 3-D grid distributed among many processors, if the grid is divided into cubes, the boundary points form a substantial fraction of the grid points in each domain. If a one-way dissection scheme is used instead, each domain has on average so few interior grid points that every grid point in the domain is a neighbor to the two neighboring domains. In order to perform matrix-vector multiplications, each domain needs to collect information about its neighboring grid points. The high-order finite difference scheme used therefore generates a heavy communication burden for matrix-vector multiplications.

The local potential is translated into a diagonal matrix, which is easy to multiply with a vector. The nonlocal potential consists of a series of rank-one updates, and each rank-one update only involves a group of nearby grid points. Thus it is advantageous to keep grid points that are near each other in physical space on the same processor.

Currently, because of the cutoff used to eliminate the region far away from all atoms, the actual shape of the domain is more complex than a simple cube. Because of this, a one-way dissection is used.

Block size. One of the main reasons for using a block version of the Davidson method is to find eigenvalues with multiplicity. In most of the matrices we encounter, a small fraction of the eigenvalues have multiplicity greater than one; in this case a block Davidson method may be more effective in finding these pairs. The table below shows the eigenvalues found by the Davidson method with three different block sizes; the eigenvalues found by the run with the larger block size are smaller, which is one advantage of the block versions. The second table shows the number of matrix-vector multiplications and the time spent by the Davidson method with different block sizes. As the block size increases, the number of matrix-vector multiplications also increases, and with the largest block size the Davidson method did not find all of the wanted eigenvalues within the allowed number of matrix-vector multiplications.

Table: Eigenvalues of the Si matrix found by the Davidson method with three different block sizes.

Table: Number of matrix-vector multiplications (MATVEC) and time in seconds used by the Davidson method with three different block sizes on the three Si test matrices.

Note that in both of the other Si cases the eigenvalues found with the different block sizes are the same.

The table of eigenvalues may cause alarm about the quality of the eigenvalues found. In general it is always possible that some desired eigenvalues are not found by a projection-based eigenvalue method. If the smallest eigenvalues are desired, usually the Davidson method can find most of the smallest ones without a problem: in the eigenvalue table, most of the smallest eigenvalues are the same for all three cases. This suggests that if a given number of eigenvalues is desired, we should ask the Davidson method for a few more. Also note that in this particular set of test matrices the negative eigenvalues are the ones of interest, and the Davidson method with a larger block size does find all negative eigenvalues correctly.

Workspace. We have decided to save both V and W = AV in memory. The basis vectors V are needed in many operations; unless we have to store a very large number of them, it is easiest to just keep them around. There are alternatives to saving W. The array W is used to compute the residual vectors r = W y - theta V y and to compute the projection H = V^T W. The projection matrix H can be built progressively, so only one column of W is needed at a time for that purpose; however, all the columns are required to compute r. We could instead perform one extra matrix-vector multiplication and use the formula r = A x - theta x. On average there are only a modest number of nonzero elements per row in the matrices, so in terms of floating-point operations, performing one matrix-vector multiplication is equivalent to combining a comparable number of vectors to form A x. But performing the matrix-vector multiplication has two disadvantages compared with performing linear combinations. First, the matrix-vector multiplication involves a sparse matrix; because of its irregular memory access pattern it is usually slower than performing linear combinations. Second, on parallel computers the matrix-vector multiplication demands data communication between processors, while the linear combination doesn't require any communication. In all of our experiments the basis size stays below this break-even point, therefore it is more efficient to save W rather than performing extra matrix-vector multiplications to compute the residuals.

Figure: Loss of orthogonality in the Arnoldi bases, with (lower panel) and without (upper panel) the monitoring scheme; the orthogonality measure is plotted against the number of restarts.

Table: Eigenvalues of the Si matrix found by the Arnoldi method, with and without the monitoring scheme.

Normality of Ritz vectors. In order to save matrix-vector multiplications at restart, the Ritz vectors in V are assumed to be orthonormal and the corresponding vectors in W are assumed to satisfy the relation W = AV. After a number of restarts the Ritz vectors may gradually lose orthogonality because of the accumulation of roundoff error. This is especially a problem with the Arnoldi method, so we use the Arnoldi method as an example. The Frobenius norm of V^T V - I, recorded for the first few bases built by the Arnoldi method when solving the Si test problem, shows that the orthogonality of the basis deteriorates quickly from one restart to the next. Theoretically, the unpreconditioned Arnoldi method is equivalent to the unpreconditioned Davidson method; in reality, however, the Arnoldi method often takes more iterations to compute the same eigenpair, and this loss of orthogonality is the root cause of the difference. The loss of orthogonality also happens to the Davidson method, especially when a good preconditioner is used. Thus there is a need to moderate the loss of orthogonality in the basis. Computing V^T V - I is fairly expensive, so we choose instead to monitor the norm of the first Ritz vector computed at the Rayleigh-Ritz step of the algorithm. If the norm of this Ritz vector deviates from one significantly, we discard the content of W and reorthogonalize the vectors in V before continuing.

The first Ritz vector is computed as x = V y. Let V-bar and y-bar be the quantities V and y in exact arithmetic; the error in the computed Ritz vector x is

    x - V-bar y-bar = (V - V-bar) y-bar + V-bar (y - y-bar) + (V - V-bar)(y - y-bar).

If the errors in V and y are small, the last term can be neglected. To estimate the difference between the norm of x and one, we note that

    ||x - V-bar y-bar|| <= ||y - y-bar|| + ||V - V-bar||.

Since the vector y is an eigenvector of H computed by a QR-based eigenvalue routine from LAPACK, it should be very close to a unit vector; thus the deviation in the norm of x is mostly contributed by V. If each entry of V is allowed to differ by epsilon from its exact value, then ||V - V-bar|| <= sqrt(mn) epsilon, where n is the number of rows in V and m is the number of columns, and epsilon can be taken to be the unit roundoff error. If the norm of the first Ritz vector deviates by more than sqrt(mn) epsilon from one, we consider that the basis V has lost orthogonality, and the vectors are orthonormalized again before building the next basis. The orthogonality measure ||V^T V - I||_F was also recorded for the first ten bases built by the Arnoldi method with this monitoring scheme in place.
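The monitoring test itself amounts to one norm computation per restart; a minimal sketch, with the tolerance taken to be sqrt(mn) times the unit roundoff as derived above, is:

    import numpy as np

    def basis_lost_orthogonality(x, n, m, eps=np.finfo(float).eps):
        # x: the first Ritz vector V @ y formed at restart; n, m: rows and columns of V.
        # If the norm of x deviates from one by more than sqrt(m*n)*eps, the deviation
        # cannot be explained by roundoff in an orthonormal V, so W is discarded and
        # V is reorthogonalized before the next basis is built.
        return abs(np.linalg.norm(x) - 1.0) > np.sqrt(m * n) * eps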

The figure above shows the orthogonality of the Arnoldi bases with and without this monitoring scheme. Monitoring changes the number of bases built and clearly adds extra work; the payoff is the extra eigenvalue found (see the table above), which was missed by the Arnoldi method without the monitoring scheme.

Here we have demonstrated the importance of maintaining the orthogonality of the basis with one example, and have shown one way of maintaining the orthonormality of the basis. We should point out that the interplay between the orthonormality of the basis and the quality of the eigenpairs is a fairly complex issue. Loss of normality of the Ritz vectors is only one result of the loss of orthonormality in the basis. Reducing the tolerance on the deviation of the Ritz vector norm does not always increase the quality of the basis, and reorthogonalizing the basis before every restart does not reduce the number of restarts required to compute the same solution. Maintaining good orthogonality of the basis is only one of the many factors that affect the quality of the eigenpairs. This example happens to involve a missing eigenvalue; in general, we do not believe the missing eigenvalue problem can be addressed by maintaining orthogonality alone.

Summary

This chapter is devoted to some of the implementation issues related to the development of an effective eigenvalue code for large sparse eigenvalue problems. We discussed techniques adopted to compute a large number of eigenpairs without requiring a proportionally large amount of workspace, variations of thick restarting schemes, and targeting schemes. We have also shown the reverse communication protocol used in the program to remove the dependence on the sparse matrix storage format and to ensure an easy transition to a parallel environment. Some key components of the Davidson method are reorganized to enhance the program's performance on parallel machines.

To show that it is in fact easy to adapt the program to the parallel environment, we present some preliminary results of using the eigenvalue code described above on an eigenvalue problem generated from a slightly larger simulation case. To limit the waiting time during the experiment, we only tested the eigenvalue code on the Si H case. The matrix is large and a large number of eigenpairs is wanted from the lower end of the spectrum. The test was done on a small IBM SP cluster with high-performance switches. The execution time to find all required eigenpairs during the first self-consistent iteration is shown in the table below. The results in this table are far from optimal, and we are actively seeking to identify the strengths and weaknesses of the code in a parallel environment.

Table: Execution time and speedup of the eigenvalue code on a small SP cluster, for different numbers of processors.

Chapter

Polynomials in Eigenvalue Methods

Introduction

The use of optimal polynomials has been an integral part of eigenvalue methods. Let A denote the matrix of interest and P(A) a polynomial of A. In these eigenvalue methods the polynomial P(A) is never required explicitly: the operation of multiplying P(A) by a vector can be broken into a series of matrix-vector multiplications with A. Since the matrix-vector multiplication with A is an essential part of any projection eigenvalue method, only a minimal amount of additional work is required to add a polynomial preconditioner or to construct a hybrid method that alternates between a Krylov method and a polynomial method.

Polynomials can be used to directly improve the quality of an approximate eigenvector. Some of the earlier eigenvalue methods rely on this property exclusively for computing eigenvector approximations, for example RITZIT. Recently it has been shown that a more effective way of using polynomials is to construct a hybrid method that alternates between a polynomial method and a projection method. Hybrid methods of this type are attractive on new parallel computing environments because they require little additional work beyond the basic projection eigenvalue method. In addition, the Chebyshev polynomial has been shown to successfully complement the Arnoldi method and other Krylov methods in solving linear systems and eigensystems. In the hybrid methods mentioned above, the step of applying the polynomial to improve the quality of the eigenvectors is also referred to as purification or polynomial purification.

Preconditioning techniques which seek to approximately solve (A - lambda I) z = r are known to be error prone. For this reason the polynomial preconditioners used in this chapter are not linear system solvers; instead they are designed to improve eigenvector approximations, as in polynomial purification. This avoids some of the difficulties of solving the nearly singular preconditioning equation. The second reason for not directly using polynomials to solve the preconditioning equation is that there are many Krylov methods that are easier to use and are just as effective as polynomial methods with optimal parameters. These optimal parameters for polynomial methods are often computed from the spectrum of the matrix, which is usually unknown. Since Krylov methods like the Conjugate Gradient (CG) method do not require detailed knowledge about the spectrum of the matrix, they are easier to use than polynomial methods, and in practice CG is also observed to be more effective than the Chebyshev method on symmetric linear systems. However, during the computation of eigenvalues it is possible that the parameters for the polynomials can be estimated better than is possible during the solution of a linear system. Through the use of better parameters, and by avoiding some pitfalls of the Davidson preconditioning, polynomial preconditioning could be more effective for eigenvalue problems than for linear systems.

In the previous chapter we adopted a locking mechanism to reduce the active basis size in the eigenvalue routines. When computing a large number of eigenpairs using these algorithms, a significant amount of execution time is spent on orthogonalizing the current basis vectors against the converged eigenvectors. One motivation for considering polynomials is to reduce this orthogonalization time. In terms of arithmetic operations, performing a matrix-vector multiplication with a sparse matrix costs roughly as much as orthogonalizing against a number of eigenvectors proportional to the average number of nonzero elements per row. Orthogonalizing against a large set of converged eigenvectors is therefore equivalent to a number of matrix-vector multiplications, i.e., a polynomial of considerable degree can be applied using the same amount of arithmetic. If (lambda, x) is an eigenpair of A, the corresponding eigenpair of P(A) is (P(lambda), x). With a polynomial of sufficiently high degree, an interior eigenvalue of A may be transformed into an extreme eigenvalue of P(A); in this case the Davidson method applied to P(A) will not be required to enforce orthogonality against the converged eigenvectors of A. This use of polynomials will be called polynomial spectrum transformation. It is a simpler version of the usual spectrum transformation techniques, since the commonly used ones, for example shift-invert Lanczos techniques and rational Krylov methods, all require inversion of a sequence of different matrices during the solution of one eigenpair. The polynomial spectrum transformation technique can not only be used to transform interior eigenvalues into extreme ones, but also to increase the separation between the wanted eigenvalue and the unwanted ones.

In short, this chapter will study three ways of using polynomials in eigenvalue methods: purification, preconditioning and spectrum transformation. The rest of this chapter is arranged as follows. In the next section we describe some characteristics of the polynomials to be used. The following section gives some examples of how the Davidson method behaves for interior eigenvalue problems, since we have not done so before. We then describe how we determine the coefficients of the polynomials, after which one section is devoted to each of the three techniques of using polynomials. A further section describes an example of how to use polynomials to increase efficiency when computing a large number of eigenpairs, and a brief summary is provided at the end.

Characteristics of the polynomials

The polynomials we will use in this study are the Chebyshev polynomials, the Chebyshev polynomials of the second kind, the least-squares polynomials and the kernel polynomials. This section will not give detailed algorithms for these polynomials, since they are widely available; the main purpose of this section is to familiarize readers with some general characteristics of these polynomials and to review the features that are relevant to their performance in solving eigenvalue problems.

The polynomials mentioned above can be divided into two categories according to their usage: they are either used to compute extreme eigenvalues or interior eigenvalues. We use the following polynomials for extreme eigenvalue problems: the Chebyshev polynomials, the Chebyshev polynomials of the second kind, and the least-squares polynomials on one interval. The following polynomials are used for interior eigenvalue problems: the least-squares polynomials on two intervals and the kernel polynomials. Note that the kernel polynomials may also be used for extreme eigenvalues; however, we will mainly use them for interior eigenvalue problems in this study.

The Chebyshev polynomial can be computed through a simple three-term recurrence. Let T_k denote the k-th degree polynomial; the recurrence is

    T_{k+1}(w) = 2 w T_k(w) - T_{k-1}(w),

where T_0(w) = 1 and T_1(w) = w. The Chebyshev polynomial defined this way is an orthogonal polynomial on the interval [-1, 1], which is also known as the Chebyshev interval. Within this interval the maximum absolute value of the polynomial is one; outside of this interval the absolute value of the polynomial grows rapidly. Let (lambda_i, x_i), i = 1, ..., n, denote the exact eigenpairs of the matrix A, and let the initial guess of the eigenvector be

    x = sum over i of alpha_i x_i.

For simplicity we assume the eigenvalues are in ascending order. The result of applying a Chebyshev polynomial to x can be expressed as

    T_k(A) x = sum over i of alpha_i T_k(lambda_i) x_i.

Compared with the initial vector x, we see that T_k(A) x has larger components corresponding to the lambda_i outside of the Chebyshev interval; in other words, the components corresponding to the lambda_i inside the Chebyshev interval are suppressed. If the wanted eigenvalue is the only one outside of [-1, 1], then with sufficiently large k, T_k(A) x can be made arbitrarily close to the wanted eigenvector.

Normally the unwanted part of the spectrum does not conveniently fall into [-1, 1], in which case it is necessary to shift and scale the polynomial, i.e., to change the Chebyshev interval so that it covers the unwanted eigenvalues. Complete algorithms for computing this shifted and scaled Chebyshev polynomial are available in the literature. From now on, unless stated otherwise, all polynomials used are shifted and scaled; if the approximate eigenvalue is lambda, the polynomials are scaled so that P(lambda) is one. For symmetric eigenvalue problems the Chebyshev polynomial is defined on a Chebyshev interval that contains the unwanted part of the spectrum. If the wanted eigenvalue is lambda, the center of the interval is c and the half-width of the interval is d, i.e., the Chebyshev interval is [c - d, c + d], then the following holds:

    1 / T_k(|lambda - c| / d) <= 2 ( (|lambda - c| + sqrt((lambda - c)^2 - d^2)) / d )^{-k}.

This is a well-known result about the Chebyshev polynomial, which will be used to select the degree of the polynomial and the boundaries of the interval.
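Applying the shifted and scaled Chebyshev polynomial to a vector only requires the three-term recurrence; the following numpy sketch, with an assumed matvec callback and our own function name, normalizes the result so that the polynomial equals one at the approximate eigenvalue.

    import numpy as np

    def chebyshev_filter(matvec, x, k, c, d, lam):
        # Compute T_k((A - c I)/d) x / T_k((lam - c)/d) for k >= 1, where [c-d, c+d]
        # covers the unwanted part of the spectrum and lam approximates the wanted
        # eigenvalue.  Only matrix-vector products with A are needed (matvec(v) = A v).
        t_prev, t_cur = x, (matvec(x) - c * x) / d          # T_0 x and T_1 x
        s_prev, s_cur = 1.0, (lam - c) / d                  # T_0 and T_1 at lam
        for _ in range(k - 1):
            t_prev, t_cur = t_cur, 2.0 * (matvec(t_cur) - c * t_cur) / d - t_prev
            s_prev, s_cur = s_cur, 2.0 * ((lam - c) / d) * s_cur - s_prev
        return t_cur / s_cur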

Assuming the first eigenpair is to be computed, in order to reduce the error of the approximate solution the Chebyshev interval should contain the unwanted eigenvalues lambda_2, ..., lambda_n. To measure the effectiveness of the polynomial, the following definition of the damping factor is commonly used:

    kappa = max over i > 1 of |T_k((lambda_i - c)/d)| / |T_k((lambda_1 - c)/d)|.

In practical use, since |T_k| is bounded by one inside the Chebyshev interval, the damping factor of T_k is often approximated as

    kappa ~ 1 / |T_k((lambda_1 - c)/d)|.

Since we also scale T_k by T_k((lambda_1 - c)/d), this approximate damping factor is simply the maximum absolute value of the scaled polynomial over the Chebyshev interval.
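The damping estimate gives a simple way of choosing the polynomial degree: using T_k(x) = cosh(k arccosh(x)) for x > 1, one can search for the smallest degree whose estimated damping factor falls below a requested value. The target value and the search loop in the sketch below are illustrative choices of ours.

    import numpy as np

    def chebyshev_degree_for_damping(lam, c, d, target):
        # Smallest degree k whose estimated damping factor 1 / T_k(|lam - c| / d)
        # is below `target`; lam must lie outside the Chebyshev interval [c-d, c+d].
        x = abs(lam - c) / d
        if x <= 1.0:
            raise ValueError("wanted eigenvalue lies inside the Chebyshev interval")
        k = 1
        while 1.0 / np.cosh(k * np.arccosh(x)) > target:
            k += 1
        return k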

The figure below shows two instances of the Chebyshev polynomial. The examples are designed with the following scenario in mind: the eigenvalues of the matrix lie in the range (0, 1] and the smallest eigenvalue is to be computed. The polynomials are scaled so that T_k equals one at zero. The figure shows that a degree 5 Chebyshev polynomial can reduce the contribution of eigenvalues in the range [0.1, 1] by about one eighth; in order to reduce the contribution from all eigenvalues in the range [0.01, 1] by the same ratio, a degree 17 Chebyshev polynomial is needed. This clearly demonstrates the dependency between the damping factor and the separation between the wanted and unwanted eigenvalues.

An important characteristic of the Chebyshev polynomial is that, among all polynomials of the same degree bounded by one on a given interval, it has the fastest growth rate outside that interval. Many aspects regarding the effective use of the Chebyshev polynomial have been discussed elsewhere. In this study we emphasize how to use the Chebyshev polynomial to transform part of the spectrum to help the Davidson method reach convergence. In this context we do not rely on the polynomial method to handle the whole spectrum, so the results and observations from previous studies may not apply directly; however, we are strongly influenced by previous research on adaptive Chebyshev polynomials for linear systems and eigenvalue problems.

Another property of the Chebyshev polynomial of interest is that it has the largest derivative among all the polynomials that have the same maximum value in the interval (section ). If the extreme eigenvalues are clustered, this property indicates that using the Chebyshev polynomial to transform the spectrum is most effective in increasing the separations.

The Chebyshev polynomial we referred to in the above discussion is the Chebyshev polynomial of the first kind. The above reference to the derivative leads us to consider the Chebyshev polynomial of the second kind, which is, up to a constant factor, the derivative of the Chebyshev polynomial of the first kind (T_{k+1}'(x) = (k+1) U_k(x)). Let U_k denote the kth-order Chebyshev polynomial of the second kind. The first two U's are U_0(x) = 1 and U_1(x) = 2x; the remaining ones follow the same recurrence relation as the Chebyshev polynomial of the first kind. Figure  shows two examples of the two kinds of Chebyshev polynomials; the Chebyshev polynomials of the first kind shown are the same as in Figure .

Figure  Two Chebyshev polynomials (degree 5 on [0.1, 1] and degree 17 on [0.01, 1])

Figure  Comparison of Chebyshev polynomials of the first kind and the second kind (degree 5 on [0.1, 1] and degree 17 on [0.01, 1])

Figure  Comparison of the Chebyshev polynomial and the least-squares polynomial (degree 5 on [0.1, 1] and degree 17 on [0.01, 1])

In the figure comparing the two kinds, the Chebyshev polynomial of the first kind is indicated

by Chebyshev I and the Chebyshev polynomial of the second kind is indicated by Chebyshev II. We note that the two Chebyshev polynomials are close to each other near the wanted eigenvalue. In the interior of the Chebyshev interval, the value of the Chebyshev polynomial of the second kind is smaller than that of the Chebyshev polynomial of the first kind; this indicates that the Chebyshev polynomial of the second kind could damp interior eigenvalue components better than the Chebyshev polynomial of the first kind. Near the ends of the Chebyshev interval, the Chebyshev polynomial of the second kind is again larger in magnitude, because the Chebyshev polynomial of the first kind changes rapidly near the ends of the Chebyshev interval. Overall, if the initial guess of an extreme eigenvector does not contain any contribution from the opposite end of the spectrum, then using the Chebyshev polynomial of the second kind could be more effective than the commonly used Chebyshev polynomial of the first kind.
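Since the two kinds satisfy the same three-term recurrence and differ only in the degree-one term, switching a filter from the first kind to the second kind is a one-line change. The small sketch below (our own illustration, not thesis code) evaluates both scalars side by side.

    def chebyshev_T_and_U(z, k):
        """Evaluate T_k(z) and U_k(z); both satisfy p_{j+1} = 2 z p_j - p_{j-1}
        and differ only in the degree-one term (T_1 = z, U_1 = 2 z)."""
        if k == 0:
            return 1.0, 1.0
        T_prev, T = 1.0, z
        U_prev, U = 1.0, 2.0 * z
        for _ in range(2, k + 1):
            T_prev, T = T, 2.0 * z * T - T_prev
            U_prev, U = U, 2.0 * z * U - U_prev
        return T, U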

The least-squares polynomial described in  was originally developed for solving indefinite linear systems. It was designed to damp eigencomponents in two disjoint intervals of the spectrum. Further study of using the least-squares polynomial as a preconditioner for linear systems can be found in . The same least-squares principle can be used to develop a polynomial that only damps eigencomponents in one interval, like the Chebyshev polynomial; this is what is shown in Figure . The recursive formula for computing the coefficients of the polynomial is given in  and . Without referring to the details of the algorithm, we note that the least-squares polynomial is computed through a three-term recurrence, similar to the Chebyshev polynomial. The least-squares polynomials used here are built from the Chebyshev basis; other orthogonal bases could also be used.

Figure  shows two examples of the least-squares polynomial on one interval. The Chebyshev polynomials shown in this figure are again the same as in Figure , and the least-squares polynomial in each plot is of the same degree and built on the same interval as the corresponding Chebyshev polynomial. One important observation here is that the least-squares polynomial is much closer to zero far away from the wanted eigenvalue compared to the Chebyshev polynomial; this is the main motivation for us to consider the least-squares polynomial.

Similar to the Chebyshev polynomial, the interval on which the least-squares polynomial is built determines its properties. Selection of this interval is the main issue to be addressed.

Figure  The two polynomials for interior eigenvalue problems (least-squares and Kernel)

Figure  The two-interval least-squares polynomial with different inner boundaries (gap = 0.02, 0.01, 0.001, and 0.0001)

Figure  The Kernel polynomial and the two-interval least-squares polynomial at very high degree (500, 1000, and 5000)

Figure  shows an example of the two polynomials we plan to use for interior eigenvalue problems, namely the least-squares polynomial on two intervals and the Kernel polynomial. In this case the two intervals are  and ; the Kernel polynomial is centered at  and is constructed from a sequence of Chebyshev polynomials on the interval . Both polynomials are scaled to be one at the center. Kernel polynomials can be constructed using different basis polynomials; we have chosen to use only the Chebyshev basis. Kernel polynomials have been used for interior eigenvalue problems in the literature; for example, in  the authors described how to use Kernel polynomials in the simulation of material properties. The magnitude of a Kernel polynomial decreases as it moves away from the specified center; in this respect it can be regarded as a polynomial approximation of the delta function. Using approximate delta functions for interior eigenvalue problems has also been studied, for example in . Using the Kernel polynomial for linear systems can be traced back to ; more recent references include .

Figure  uses a few examples to show the dependence of the least-squares polynomial on the intervals chosen. The polynomials shown in this figure are of degree . The wanted eigenvalue is still assumed to be . The gap refers to the distance between the wanted eigenvalue and the inner ends of the two intervals; for example, if the gap is , then the two intervals are  and . To use the least-squares polynomial and the Kernel polynomial effectively, the key is again how to select the intervals on which to build these polynomials. This is especially important for the least-squares polynomial, since there are two intervals to be determined when seeking interior eigenvalues. In this chapter we study different schemes for selecting the intervals for the polynomials and see how to combine them effectively with the Davidson method to solve eigenvalue problems.

Davidson method for interior eigenvalues

Computing interior eigenvalues is generally regarded as a more difficult problem than computing extreme eigenvalues. Theoretically, the Lanczos method converges faster toward extreme eigenvalues than toward interior eigenvalues; the same is true for the Arnoldi method and the Davidson method when applied to a symmetric problem without preconditioning. The following example shows that the interior eigenvalue problem is more difficult for polynomial methods as well. Figure  shows the Kernel polynomial and the two-interval least-squares polynomial at three different degrees, namely 500, 1000, and 5000. The two intervals for the least-squares polynomial are  and ; the Kernel polynomial is built on the interval . The centers of both polynomials are at 0.5. At degree  the two-interval least-squares polynomial is about  at  (see Figure ). If an extreme eigenvalue had the same separation from the rest, i.e., the extreme eigenvalue is  and the rest of the spectrum is in , a degree- Chebyshev polynomial could achieve a damping factor of , and a degree- least-squares polynomial on the same interval could achieve the same damping factor. As expected, with the same separation, a polynomial of much higher degree is required to achieve a given damping factor for interior eigenvalues than for extreme eigenvalues.

The main task of this section is to identify a basic algorithm for interior eigenvalue problems for the later experiments. Based on our past experience with the Davidson method, it is a good candidate for computing a few interior eigenvalues.

As we have mentioned in previous chapters, one way to compute a few eigenvalues around a given number is to slightly modify the Davidson method (algorithm ). The modification can be limited to the Rayleigh-Ritz procedure in algorithm : instead of ordering the eigenvalues of H_m in ascending order, we now order them according to their distance from the given number and place the ones closest to it at the beginning of the array. Aside from the Rayleigh-Ritz procedure, we can also use the harmonic Ritz projection scheme to compute the eigenpairs. Since the harmonic Ritz projection technique is designed to compute interior eigenvalues, the harmonic Davidson method for interior eigenvalues uses it both during the expansion of the basis and during restart, i.e., the harmonic Ritz method is used in both step d and step  of algorithm . This is different from the harmonic Davidson method for extreme eigenvalues, where the harmonic Ritz projection is only used in step d of algorithm . Because the harmonic Ritz vectors are not orthonormal in the Euclidean norm, they are orthonormalized before restart to maintain the orthonormality of the basis. An alternative would be to work with a nonorthonormal basis, which would require more modification to the code we have for computing the extreme eigenvalues; thus we will not use this alternative form.

To test the performance of the Davidson method and the harmonic Davidson method for interior eigenvalue problems, we conducted an experiment on the nondiagonal symmetric matrices in the Harwell-Boeing collection (see tables  and , and reference ). In this experiment we compute the two eigenvalues that are closest to the center of the Gershgorin interval [a_G, b_G], where

a_G = \min_i \Big( a_{ii} - \sum_{j \ne i} |a_{ij}| \Big), \qquad b_G = \max_i \Big( a_{ii} + \sum_{j \ne i} |a_{ij}| \Big).
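For reference, the Gershgorin interval in the equation above can be computed in one pass over a sparse symmetric matrix; the SciPy sketch below is illustrative, and the function name is our own.

    import numpy as np
    import scipy.sparse as sp

    def gershgorin_interval(A):
        """Return (a_G, b_G) for a sparse symmetric matrix A."""
        A = sp.csr_matrix(A)
        diag = A.diagonal()
        radii = np.abs(A).sum(axis=1).A1 - np.abs(diag)  # off-diagonal row sums
        return float((diag - radii).min()), float((diag + radii).max())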

The numbers of matrix-vector multiplications used to compute these two eigenvalues are shown in tables  and . The maximum basis size used in the Davidson method and the harmonic Davidson method is . The tolerance on the residual norm is set to  ||A||_F. No preconditioning is used. The basis is built from one initial vector. In chapter  we have shown that the Davidson method without preconditioning is equivalent to the Arnoldi method in theory; in practice, however, the Davidson method is more stable than the Arnoldi method, and it is more effective with preconditioning as well.

In tables  and , the first column is the matrix name, the second column is the number of matrix-vector multiplications used by the Davidson method for interior eigenvalue problems, the third column shows the number of matrix-vector multiplications used by the harmonic Davidson method, and the last two columns are the results from two polynomial methods. The polynomial methods simply apply either the degree- least-squares polynomial or the degree- Kernel polynomial to a set of initial guesses and then apply the Rayleigh-Ritz projection to the resulting vectors; this process is repeated until the residual norms are less than  ||A||_F. The interval used for the Kernel polynomial is the Gershgorin interval. Denote

the two intervals of the least-squares polynomial as [a_1, b_1] and [a_2, b_2]. The values of a_1 and b_2 are the left end and the right end of the Gershgorin interval. Order the Ritz values according to their distance from the center of the Gershgorin interval; then b_1 and a_2 are computed as follows:

b_1 = c - |c - \theta|, \qquad a_2 = c + |c - \theta|,

where c = (a_G + b_G)/2 in this test and \theta denotes one of the ordered Ritz values. The polynomials are applied on a block of vectors. The initial guesses are the first  columns of the identity matrix plus the vector .

                    Davidson    harmonic Davidson    least-squares    Kernel

BUS

BUS

BUS

BUS

BCSSTM

BCSSTM

BCSSTM

BCSSTM

BCSSTM

GR

LUND A

LUND B

NOS

NOS

NOS

NOS

NOS

NOS

NOS

PLAT

PLAT

ZENIOS

Table  Number of matrix-vector multiplications used to compute two interior eigenvalues of Harwell-Boeing matrices (Part I)

Among the four methods tested in this experiment, the Davidson method is the most effective one. The two polynomial methods tested generally require considerably more matrix-vector multiplications than the Davidson method. The only exception to the above observation is the matrix ZENIOS: there are a number of zero rows near the top of that matrix, which makes zero an exact eigenvalue and e_i an exact eigenvector. Since the center of the Gershgorin interval is also zero in this case, the initial guess for the two polynomial methods contains the exact solutions. Overall, on the problems tested, the Davidson method performs better than the harmonic Davidson method. There is only one exception: the harmonic Davidson method used fewer matrix-vector multiplications than the Davidson method to find two interior eigenvalues of BCSSTK; in this case the harmonic Davidson method not only used fewer matrix-vector multiplications but also found two eigenvalues that are closer to the center of the Gershgorin interval.

Comparing the two polynomial methods, the least-squares polynomial performs better than the Kernel polynomial, because it used fewer matrix-vector multiplications to achieve convergence in a large number of the test cases. This can be partly explained by the fact that the intervals for the least-squares polynomials are dynamically selected: because of this variation in the boundaries of the intervals, the least-squares polynomial has peaks and valleys located at different locations, which makes it more effective in damping out the unwanted eigenvector components (see figure ).

Based on the results of this experiment, we will only use the Davidson method for the later tests on interior eigenvalue problems. The results of the polynomial methods for interior eigenvalues provide a reference point for later comparisons.

                    Davidson    harmonic Davidson    least-squares    Kernel

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

BCSSTK

Table  Number of matrix-vector multiplications used to compute two interior eigenvalues of Harwell-Boeing matrices (Part II)

Intervals of unwanted eigenvalues

As indicated before, in order to use the polynomials effectively we need to determine their coefficients. To fully determine the three polynomials for extreme eigenvalues we need to identify an interval covering the unwanted eigenvalues. For the Kernel polynomial we need to know the range of the entire spectrum. In the two-interval least-squares polynomial case, two intervals, one on each side of the wanted eigenvalue, need to be identified. In addition to these intervals, we also need to identify an appropriate degree for the polynomials.

When considering extreme eigenvalues, the two boundaries of an interval covering all unwanted eigenvalues need to be determined. For convenience of discussion, let [a, b] denote the interval and assume that the smallest eigenvalue and the corresponding eigenvector are to be computed. We will refer to a as the left end and b as the right end. In this study we consider three different ways of selecting a and two different ways of selecting b. Since these boundaries completely determine the properties of the three polynomials we have chosen for extreme eigenvalue problems, it is crucial to select good values for both a and b.

We will use the damping factor of the Chebyshev polynomial to guide the selection of the degree of the polynomial after the interval [a, b] is selected. Note that the center c and the half width d of the interval are defined as follows:

c = \frac{a + b}{2}, \qquad d = \frac{b - a}{2}.

Figure  Left ends of the three polynomials (Chebyshev I, Chebyshev II, and least-squares) for extreme eigenvalue problems, at degree 100 and degree 200

degree degree

Chebyshev I

Chebyshev II

leastsquares

Table  Values of the three polynomials for extreme eigenvalue problems at selected points

Using equation , the damping factor of a degree-k Chebyshev polynomial is given by the following equation:

\left(\frac{d}{|\lambda_1 - c| + \sqrt{(\lambda_1 - c)^2 - d^2}}\right)^{k} = \left(\frac{b - a}{a + b - 2\lambda_1 + 2\sqrt{(a - \lambda_1)(b - \lambda_1)}}\right)^{k}.

The degree of the polynomial is computed in such a way that the components associated with the unwanted eigenvalues are reduced by a prescribed factor; in other words, |T_k((\lambda_i - c)/d)|, i = 2, ..., n, is at most that fraction of |T_k((\lambda_1 - c)/d)| (see equation ). By definition this means the damping factor equals the prescribed value. In practice it is not unusual for a - \lambda_1 to be very small relative to b - a, in which case the above formula computes an extremely large k. To avoid computing polynomials of exceedingly high degree, we limit the highest degree of a polynomial to ; this means that if the separation a - \lambda_1 is large enough relative to b - a we can achieve the prescribed damping factor, otherwise we use a polynomial of the maximum degree.
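A possible way to carry out this degree selection is sketched below; the code is our own, and the target damping factor and the degree cap are illustrative values, not necessarily the ones used in the experiments.

    import numpy as np

    def choose_degree(a, b, lam0, target=0.1, max_degree=300):
        """Smallest degree k with estimated damping 1/T_k((lam0-c)/d) <= target,
        capped at max_degree."""
        c, d = 0.5 * (a + b), 0.5 * (b - a)
        z = abs(lam0 - c) / d
        if z <= 1.0:
            return max_degree      # no separation from [a, b]: fall back to the cap
        # Solve cosh(k * arccosh(z)) = 1/target for k.
        k = np.arccosh(1.0 / target) / np.arccosh(z)
        return min(int(np.ceil(k)), max_degree)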

Figure  shows the left ends of the three polynomials chosen for extreme eigenvalues. The interval to be damped is . The top half of the figure shows three degree-100 polynomials and the bottom half shows the same polynomials at degree 200. The values of the polynomials at selected points are shown in table ; the values at  are the same as the damping factors. From Figure  and Table  we can see that there are significant differences among the three polynomials. In particular, the damping factor of the Chebyshev polynomial may differ significantly from that of the least-squares polynomial. However, since we do not know how to estimate the damping factors of the Chebyshev polynomial of the second kind and of the least-squares polynomial, using equation  for all three polynomials is a reasonable alternative.

Based on past experience reported, for example, in , we know that the performance of the Chebyshev polynomial does not depend strongly on the exact location of the boundary opposite to the wanted eigenvalue, as long as all unwanted eigenvalues are inside the Chebyshev interval. In this study all examples for extreme eigenvalue problems seek the smallest eigenvalues, so the right end of the Chebyshev interval is not as important as the left end. Thus we have decided to test only two choices for selecting the right end b. They are:

1. Let b be the right end of the Gershgorin interval. This can be determined before the start of the eigenvalue algorithm and used throughout.

2. Let b be the opposite extreme of the spectrum of the projected matrix H = V^T A V. If the smallest eigenvalue is sought and the eigenvalues of H_m are sorted in ascending order, then b = \theta_m, the largest eigenvalue of H_m, where m is the size of H_m (see the sketch below).
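A possible realization of the second choice (the function name is ours): with the current orthonormal basis V and the saved products AV, the dynamic right end is simply the largest eigenvalue of the small projected matrix.

    import numpy as np

    def dynamic_right_end(V, AV):
        """Largest eigenvalue of H = V^T A V, used as the dynamic right end b."""
        H = V.T @ AV
        H = 0.5 * (H + H.T)        # symmetrize against rounding errors
        return np.linalg.eigvalsh(H)[-1]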

In the second scheme the interval does not cover all unwanted eigenvalues, so it is possible that the components corresponding to the largest eigenvalues will dominate the vector P(A)x. One strategy to reduce the impact of this problem is to use only odd-degree polynomials for extreme eigenvalue problems. With odd-degree polynomials the smallest eigenvalues of A and the largest eigenvalues are mapped to values with different signs. Since the Davidson method is effective in separating them, as long as this does not cause overflow in the floating-point numbers, the impact of this potential problem is limited. The same strategy is also used in , where only odd-degree Chebyshev polynomials are used to precondition linear systems of equations.

The value of \theta_m is generally much smaller than the right end of the Gershgorin interval. Given the same left end for the Chebyshev interval, using the second scheme for b will generate a polynomial with better separation between the wanted eigenvalue and the nearby unwanted ones. This reflects our emphasis on eigenvalues near the wanted ones; we rely on the Davidson method to deal with the whole spectrum of A.

We have chosen three schemes for selecting the left end of the Chebyshev interval, a. They are:

1. Let a = \theta_i, where \theta_i is an eigenvalue of the projected matrix H. The value \theta_i is considered an approximation of the next eigenvalue sought. Usually several eigenvalues are computed and the first few of them could be good candidates, so there are several choices within this option for which \theta_i to use.

2. Let a = \theta_p, where p is the active window size used in algorithm . In the examples that will be shown later, the basis size for the Davidson method is , and the window size is  or the actual number of wanted eigenpairs still to be computed; in this case the value p decreases as the number of eigenpairs left becomes smaller.

3. Choose a value for a so that the estimated damping factor under the given number of matrix-vector multiplications reaches the prescribed value. This scheme always uses a polynomial of the maximum degree and computes the value of a that yields the prescribed damping factor. Writing the damping factor formula (equation ) for the interval [a, b] and the wanted eigenvalue \lambda_1, the requirement

\left(\frac{b - a}{a + b - 2\lambda_1 + 2\sqrt{(a - \lambda_1)(b - \lambda_1)}}\right)^{k} = \text{prescribed damping factor}

can be solved for a. In deriving the solution, the approximation b - \lambda_1 \approx b - a is also used, and the last equation yields an explicit expression for a in terms of b and the prescribed damping factor.
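As an illustration of this third scheme, the sketch below solves for a under the additional simplification that the wanted eigenvalue sits at the origin, in which case the per-degree reduction factor is (sqrt(b) - sqrt(a)) / (sqrt(b) + sqrt(a)); the simplification and the function name are our own.

    def left_end_for_target(b, k, target=0.1):
        """Left end a of [a, b] so that a degree-k Chebyshev polynomial reaches
        the target damping factor, assuming the wanted eigenvalue is 0."""
        rho = target ** (1.0 / k)  # required reduction per polynomial degree
        # With lam1 = 0 the damping factor is ((sqrt(b)-sqrt(a))/(sqrt(b)+sqrt(a)))**k,
        # so rho = (sqrt(b)-sqrt(a))/(sqrt(b)+sqrt(a))  =>  a = b*((1-rho)/(1+rho))**2.
        return b * ((1.0 - rho) / (1.0 + rho)) ** 2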

As expressed in equation , an approximate eigenvector contains components of the unwanted eigenvectors, i.e., the coefficients \gamma_i for i = 2, ..., n. Improving the quality of the approximation means reducing the magnitudes of these \gamma_i while keeping the norm of the vector constant.

Figure  High-degree Chebyshev polynomials on different intervals (degree 100 and degree 500; intervals [0.001, 1], [0.0001, 1], [0.00001, 1], and [0.000001, 1])

Figure  A dense spectrum transformed by Chebyshev polynomials on the intervals [0.001, 1], [0.0001, 1], [0.00001, 1], and [0.000001, 1]

The Davidson method can be considered as

good at damping the components corresponding to faraway eigenvalues; in other words, it is effective in reducing the magnitudes of the coefficients associated with eigenvalues far from the wanted one. One way of using the polynomials is therefore to make them damp the components corresponding to eigenvalues near the wanted eigenvalue. Figures ,  and  show that all three polynomials can be used to damp contributions from a large range of eigenvalues, and Figure  confirms that the Chebyshev polynomials have the fastest growth rate outside the interval. Next we will attempt to show how the selection of a might affect the behavior of Chebyshev polynomials near the wanted eigenvalue.

Figures  and  offer some intuition on which scheme might be better at improving the quality of the eigenvector. Assume the wanted eigenvalue is , the closest unwanted one is , and the largest eigenvalue is . The values of the Chebyshev polynomials of degree  and degree  at the point x =  are shown in table . One obvious trend in the figure and the table is that the value of T_k at that point becomes smaller as a becomes closer to it. The optimal case in terms of reducing the magnitude of that component is to make T_k equal to zero there, which is possible by increasing the degree of the Chebyshev polynomial and adjusting the left end of the Chebyshev interval. However, as we decrease the value of T_k at that point, the value of the polynomial at other points may start to increase; for example, when a is , the value of T_k at  is close to .

a
a

Table  Values of the Chebyshev polynomial with different intervals

Figure  shows how the lower end of the spectrum is transformed by Chebyshev polynomials on different intervals. We assume that the original spectrum is uniform in the range , and the figure only shows how the eigenvalues in this range are transformed. The polynomials used in this plot are the degree- Chebyshev polynomials shown in figure  and table ; they transform the eigenvalue  of the original matrix into . As the lower end of the Chebyshev interval becomes closer to , the separation between  and the rest of the transformed spectrum becomes larger. This observation mirrors the above observation on the value of the Chebyshev polynomial at . By increasing the gap between the wanted eigenvalue and the rest, the Lanczos method or the Davidson method can resolve the transformed spectrum much more easily than the original one.

If the approximate eigenvector is a linear combination of the first two exact eigenvectors, it is clear that the Chebyshev interval should be chosen to minimize the magnitude of the polynomial at the second eigenvalue. However, the approximate eigenvector is often a linear combination of a large number of eigenvectors, and how to choose a to reduce the residual norm is not nearly as clear as in this trivial case. If the contributions from the few closest eigenvalues do not dominate the error in the approximate eigenvector, then it is more beneficial to choose a slightly larger than , e.g., a =  in figure . In the numerical experiments to be conducted later we will attempt to identify an effective strategy for selecting a.

Denote the interval for the Kernel polynomial by [a, b], and the intervals for the two-interval least-squares polynomial by [a_1, b_1] and [a_2, b_2]. The interval selection schemes described earlier in this section are applied to the Kernel polynomial and the two-interval least-squares polynomial as follows.

Selection of a and b:
1. Let a and b be the left and right ends of the Gershgorin interval.
2. Let a and b be the smallest and the largest eigenvalues of H_m.

To simplify the selection of b_1 and a_2 when using the two-interval least-squares polynomial, we use a relation that places them symmetrically about the current approximation of the eigenvalue, so that we only need to select a value for a_2. The choice of a_2, given the approximate eigenvalue and b_1, can be made using the same techniques as selecting a for the one-interval polynomials. It is possible that b_1 is less than or equal to a_1 in some cases; if this is true, we revert to the one-interval least-squares polynomial.

Polynomials for purification

This section will use some examples to demonstrate how the above-mentioned polynomials work as the purification step of an eigenvalue method. The hybrid eigenvalue method we build alternates between the Davidson method and one of the three polynomial methods mentioned above. The Davidson method is immediately restarted if the residual norm has been reduced by a factor of  since the last restart; otherwise the polynomial method is invoked before restarting the Davidson method. The Davidson method computes a block of Ritz values and Ritz vectors before restart, and when using a polynomial method we need to choose which of these to apply the polynomial to. Our first scenario tries to preserve the structure of the thick-restarted Davidson method: it applies the polynomial only to the first Ritz vector and appends the resulting vector at the end of the input vectors. More specifically, the thick-restarted Davidson method retains a number of Ritz vectors, say t, and a block of residual vectors corresponding to the first few Ritz pairs; the Davidson method essentially restarts at step t and tries to incorporate the residual vectors in its basis. This hybrid scheme increases the number of new vectors to be included in the basis by one.
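The restart decision just described can be summarized in a few lines. The helper below is a simplified sketch of our own (the names and the improvement threshold are illustrative, not taken from the thesis code) of this first hybrid scenario.

    import numpy as np

    def restart_vectors(ritz_vectors, best_residual_norm, prev_best_norm,
                        poly_filter, improve_factor=0.1):
        """Return the vectors to restart with.  If the best residual norm has
        dropped by `improve_factor` since the last restart, restart with the
        Ritz vectors alone; otherwise purify the first Ritz vector with the
        polynomial filter and append the filtered vector."""
        vecs = list(ritz_vectors)
        if best_residual_norm > improve_factor * prev_best_norm:
            purified = poly_filter(ritz_vectors[0])
            vecs.append(purified / np.linalg.norm(purified))
        return vecs, best_residual_norm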

The first example uses a matrix from the Harwell-Boeing collection called BUS. It has  rows and  nonzero elements when stored in CSR format. We seek the five smallest eigenvalues, starting with one initial guess. As before, the maximum basis size is  and the tolerance on the residual norm is  ||A||_F, which is  in this case. Without preconditioning, the thick-restarted Davidson method can find  eigenpairs in  matrix-vector multiplications. The hybrid method that produced the results shown in table  alternates between the unpreconditioned Davidson method and one of the three polynomials for extreme eigenvalue problems. There are several different choices used for a, most of which are based on selection scheme 1 discussed above; the row headed by \theta_p refers to selection scheme 2, and the last row refers to the third scheme, where a is chosen to ensure the prescribed damping factor. The fixed-b case uses the right end of the Gershgorin interval as b, and the dynamic-b case uses \theta_m as b. Three columns are presented for each case: the first column, headed MATVEC total, is the total number of matrix-vector multiplications used to compute the five eigenpairs; the second column, headed MATVEC Davidson, is the number of matrix-vector multiplications used in the Davidson method; and the third, headed time (sec), is the total execution time in seconds. We ran this test on an SGI Challenge with MIPS  processors; the machines used have more than  MB of real memory, so our test problem fits into main memory.

A glance at table  reveals that the case using a =  and the fixed b with the Chebyshev polynomial of the first kind converges in the smallest number of matrix-vector multiplications. Overall, the two cases using the Chebyshev polynomial with fixed b and a =  or  use the least amount of CPU time to reach convergence.

One alternative to applying a polynomial on one Ritz vector is to apply the polynomial on all the Ritz vectors. In the thick-restarted Davidson method we save a number of Ritz vectors x_i and the corresponding Ax_i at restart; if we apply a polynomial on all these Ritz vectors, the corresponding Ax_i are no longer useful. This increases the number of matrix-vector multiplications used by the Davidson method. More importantly, the number of matrix-vector multiplications used to build the polynomial is significantly larger in this scheme than in the previous case. Table  shows the results of applying this hybrid eigenvalue method on BUS. Compared with table , it is clear that many more matrix-vector multiplications are used in this scheme. In addition, table  shows that there are a number of cases where this hybrid method did not find all  eigenpairs in  matrix-vector multiplications; for these cases the MATVEC total column appears as , and the MATVEC Davidson column contains a negative number whose magnitude indicates the number of eigenpairs computed within the allowed matrix-vector multiplications.

The best two cases in table  use the Chebyshev polynomial and the least-squares polynomial with a =  and dynamic b. These two cases use the same CPU time to reach convergence; however, the Chebyshev polynomial case uses slightly fewer matrix-vector multiplications. In this test, using dynamic b is generally better than using fixed b, which is different from the case where purification is applied on one Ritz vector (see table ).

The smallest number of matrix-vector multiplications in table  is ; the corresponding time is  seconds.

Chebyshev p olynomial of the st kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Chebyshev p olynomial of the nd kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

leastsquares p olynomial on one interval

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Table  Results of using polynomial purification with the Davidson method on BUS

The smallest number of matrix-vector multiplications in table  is , and the corresponding time is  seconds. Considering the dramatic difference in the number of matrix-vector multiplications between tables  and , the difference in time is relatively small. This is because the majority of the matrix-vector multiplications are done on a large block of vectors, which can be made more efficient than multiplying one single vector at a time. Comparing tables  and , we see that the number of matrix-vector multiplications used by the Davidson method is significantly smaller in the second case. In environments where the matrix-vector multiplication on a block of vectors is relatively cheap, using more block matrix-vector multiplications to reduce the number of steps taken in the Davidson method could potentially reduce the total execution time; under this circumstance, applying the polynomials on all Ritz vectors could be a more effective alternative.

The two kinds of Chebyshev polynomials and the least-squares polynomials are computed through three-term recurrences, which means the last three polynomials are available without extra work. If we give one vector v to the polynomial routines, we can obtain three different results, P_k(A)v, P_{k-1}(A)v, and P_{k-2}(A)v. Instead of only adding P_k(A)v as a new input vector to the Davidson routine, we could use two or three of them. Table  shows the results of giving the first two to the Davidson routine, and table  shows

Chebyshev p olynomial of the st kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Chebyshev p olynomial of the nd kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

leastsquares p olynomial on one interval

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Table  Results of applying polynomial purification on all Ritz vectors at restart

the results of using all three of them. Overall, all three polynomials benefited from using more than one output vector; among them the least-squares polynomial benefited the most. The execution time of the hybrid method decreased from table  to table  for most of the cases. The best combination for each polynomial is shown in table ; in that table the numbers indicate how many output polynomials were used. Overall, the hybrid method using the last three least-squares polynomials is the most successful scheme tested. This best scheme uses dynamic b and a = ; it computed the five smallest eigenvalues of BUS with  matrix-vector multiplications. This is considerably better than using the Davidson method alone, which only computed  eigenpairs in  matrix-vector multiplications.

To test the polynomials for interior eigenvalues, we have chosen to find  eigenvalues of PLAT near  as the test case. The maximum basis size is again  and the tolerance on the residual norm is  ||A||_F. The only starting vector for the Davidson method is the same as before.

The Davidson method for interior eigenvalue problems restarts with  Ritz vectors whose corresponding Ritz values are closest to . Since the Rayleigh-Ritz projection algorithm does not possess the same optimality property for interior Ritz values as for extreme eigenvalues, the hybrid method behaves very differently from the extreme eigenvalue case. We have repeated all the different combinations

Chebyshev p olynomial of the st kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Chebyshev p olynomial of the nd kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

leastsquares p olynomial on one interval

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Table  Results of polynomial purification where the last two polynomials are used

of parameters shown in tables , , and ; table  only shows the cases where the hybrid method successfully computed  eigenpairs around  within  matrix-vector multiplications. In table , the column headed by N_p is the number of output vectors used, and least-squares refers to the two-interval least-squares polynomial. For the Kernel polynomial we only need to determine the outer boundaries; however, we use one of the seven choices for the inner boundaries of the two-interval least-squares polynomial to determine the degree of the Kernel polynomial. Using  as the inner boundary indicates that the degree of the polynomial is set to . From this table we see that the only successful scheme for selecting the boundaries of the intervals is to use  as the inner boundaries and the boundaries of the Gershgorin interval as the outer boundaries.

The intermediate approximate solution to an interior eigenvalue can be either larger or smaller than the final solution, whereas for extreme eigenvalues the approximate eigenvalues of the Davidson method approach the exact solution from the inside of the spectrum. This difference could be the reason why the schemes that successfully select the boundaries for the extreme eigenvalue problems do not successfully select the inner boundaries for the two-interval least-squares polynomials. Generally, using  to compute the inner boundaries gives more consistent inner boundaries; thus it is a more

Chebyshev p olynomial of the st kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Chebyshev p olynomial of the nd kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

leastsquares p olynomial on one interval

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Table  Results of polynomial purification where the last three polynomials are used

                fixed b        dynamic b
Chebyshev I     a =            a =
Chebyshev II    a =            a =
least-squares   a =            a =

Table  The best choices for different polynomials in the hybrid method

Polynomial      N_p    inner boundaries    outer boundaries    MATVEC total    MATVEC Davidson    time (sec)
least-squares          fixed
least-squares          fixed
Kernel                 fixed

Table  Hybrid methods that successfully computed  eigenvalues around  for PLAT

MATVEC MATVEC time

preconditioner total Davidson sec

CGA I

T

CGA I xx

T

CGA I xAx

T T

CGI xx A I I xx

Chebyshev II a xed b

Table  Finding the five smallest eigenvalues of BUS with preconditioners that only require matrix-vector multiplications

suitable choice compared to the other schemes.

The number of matrix-vector multiplications taken to compute these desired eigenvalues is very high compared with the matrix size. Nevertheless, since we cannot even compute one eigenpair around  with the plain Davidson method, finding  of them with  matrix-vector multiplications is a significant improvement.

Polynomial preconditioning

In this section polynomials are used as preconditioners for the Davidson method (see algorithm ). The first modification was to simply replace step d.iii of algorithm  with

Z = P(A) r.

This preconditioning scheme is different from all previous schemes mentioned in this thesis, since it does not attempt to solve a linear system.

An advantage of polynomial preconditioning is that it is easy to apply. Using Krylov methods as preconditioners is similar to using polynomial preconditioners; for this reason we will also show examples of Krylov iterative methods as preconditioners in this section.

For numerical testing we apply the preconditioners to the two test problems used b efore ie the extreme

eigenvalues of BUS and interior eigenvalues of PLAT The convergence tolerance and the maximum

basis size for the Davidson metho d are the same as b efore Unlike the hybrid metho d case where the p olyno

mial purication step may b e skipp ed if the Davidson iteration is converging rapidly the preconditioning step

is always executed However since the workspace required by the p olynomial routines is the unused p ortion

of the basis vectors it is p ossible that there is not enough workspace to carry out the p olynomial pro cedure

which happ ens when the basis is almost full In this case no preconditioning is p erformed we simply copy

the input vectors to the output array In addition to ensure that there is a reasonable eigenvalue estimate

to establish the intervals for the p olynomial pro cedures the p olynomial preconditioning routines are not

invoked for the rst few steps of the Davidson iteration In the tests shown the p olynomial preconditioning

pro cedure is not used until the basis size is in the Davidson metho d

Table  shows the results of computing the five smallest eigenvalues of the BUS matrix using the polynomial preconditioner and four CG-based preconditioners. The four CG-based preconditioners use the Conjugate Gradient method to compute an approximate solution of B z = r, where B is the shifted or projected matrix shown in parentheses in the table (see Chapter  for the origin of these matrices). The maximum number of matrix-vector multiplications is set to  for CG; the iteration is terminated when the residual norm has been reduced by one order of magnitude or  matrix-vector multiplications have been used. Compared with the best results reported in tables , , and , the Davidson method with one of the four CG preconditioners takes significantly less time and fewer matrix-vector multiplications than the best hybrid method scheme. The total numbers of matrix-vector multiplications used by the four CG schemes are not very different, and the simplest of the four, CG applied to the shifted matrix A - \lambda I, is slightly more effective than the others.
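One of these CG-based steps, the projected (Jacobi-Davidson style) correction, can be applied with nothing but matrix-vector products. The SciPy sketch below is our own illustration; the function name and the iteration limit are assumptions, and the experiments reported here instead stop CG after a fixed number of products or a one-order-of-magnitude residual reduction.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, cg

    def jd_correction(matvec, n, x, theta, r, maxiter=30):
        """Approximately solve (I - x x^T)(A - theta I)(I - x x^T) t = -r
        with CG, using only products with A."""
        x = x / np.linalg.norm(x)

        def op(v):
            v = v - x * (x @ v)        # project out the current Ritz vector
            w = matvec(v) - theta * v  # apply the shifted matrix A - theta*I
            return w - x * (x @ w)     # project the result again

        B = LinearOperator((n, n), matvec=op, dtype=float)
        t, _ = cg(B, -r, maxiter=maxiter)
        return t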

In this test the polynomial preconditioners tested include all the schemes shown in table . In table  only one case is shown, because it is the only case where the polynomial preconditioner scheme was successful

Chebyshev p olynomial of the st kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Chebyshev p olynomial of the nd kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

leastsquares p olynomial on one interval

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Table  Results of applying the Davidson method on BUS with the modified polynomial preconditioner

in finding all five smallest eigenvalues of BUS. This indicates that appending P(A)r to the basis of the Davidson method is not a good option. An obvious alternative is to apply the polynomial to the current approximate eigenvector, i.e., to append P(A)x to the Davidson basis. In fact, we found that a more robust approach is to append both r and P(A)x to the Davidson basis. Table  shows the results of this modified preconditioning scheme.

To solve the BUS problem, the most effective hybrid scheme tested used  seconds and  matrix-vector multiplications (see table ); the Davidson method with the best modified polynomial preconditioner used only  seconds and  matrix-vector multiplications (see table ). The modified preconditioning scheme uses less time but more matrix-vector multiplications to solve the same problem. This is because the matrix-vector multiplication is fairly inexpensive with BUS: the scheme trades the number of steps used in the Davidson method against the number of matrix-vector multiplications used in the polynomial methods, and since one Davidson step is more expensive than one step in the polynomial method, the Davidson method with this modified preconditioner can reduce the execution time. However, compared with the CG preconditioners, the modified polynomial preconditioning still uses more time and more matrix-vector multiplications.

MATVEC MATVEC time

preconditioner total Davidson sec

CGA I

T

CGA I xx

T

CGA I xAx

T T

CGI xx A I I xx

Table  The results of finding interior eigenvalues of PLAT with preconditioners that only require matrix-vector multiplications

inner outer MATVEC MATVEC time

Polynomial b oundaries b oundaries total Davidson sec

leastsquares dynamic

Kernel dynamic

Kernel dynamic

Kernel dynamic

Table  The modified polynomial preconditioning schemes that successfully computed the desired eigenpairs of PLAT

Among the three polynomials, the Chebyshev polynomial of the second kind shows the best performance gain. Also, using fixed b is better than using dynamic b for preconditioning.

Table  shows the results of applying the Davidson method with matrix-vector-multiplication-based preconditioners to PLAT to find  eigenpairs around . Similar to table , we only show the successful cases. The preconditioning schemes that compute P(A)r did not reach convergence within  matrix-vector multiplications. In this case the four CG preconditioners use significantly different numbers of matrix-vector multiplications and amounts of CPU time; the best among the four is the Jacobi-Davidson preconditioning scheme.

Table  shows the results of using the modified preconditioning scheme on the PLAT test problem. Again, we only show those cases where the preconditioned Davidson method was able to compute  eigenpairs within  matrix-vector multiplications. Among the cases shown in table , we see that using the Kernel polynomial with dynamic outer boundaries is relatively more successful than the other schemes. Compared to the most successful hybrid methods (see table ), the modified polynomial preconditioner is not as effective as the hybrid method: the shortest time in table  is  seconds, which is considerably less than the shortest time in table ,  seconds. Similar to the BUS case, the modified preconditioning scheme is effective in reducing the number of steps taken by the Davidson method; however, the reduction comes with an increase in the total number of matrix-vector multiplications used. Since the matrix-vector multiplication is more expensive with PLAT, the overall CPU time increased in this case. The best result from the CG preconditioners is again considerably better than the best results from using the five polynomials.

Note that it is probably more appropriate to call this modified preconditioning scheme a hybrid method, since it does not resemble the common form of eigensystem preconditioning or linear system preconditioning. This hybrid method can be considered one extreme form, since a polynomial step is inserted after every step of the Davidson iteration, whereas the hybrid scheme presented in the previous section only invokes a polynomial procedure after the Davidson method has built a full basis. From the two tests performed here we see that invoking the polynomial method more often is only beneficial if the matrix-vector multiplication is inexpensive.

Polynomial spectrum transformation

The essence of the polynomial spectrum transformation technique is to replace the matrix-vector multiplication of the Davidson algorithm with a polynomial in A. However, since we have to perform the Rayleigh-Ritz projection to compute the actual eigenvalues of A and to estimate the parameters required to define the polynomial, the overall structure of the actual eigenvalue routine is very similar to a hybrid method. Our implementation of the polynomial spectrum transformation technique alternates between the Davidson iteration on the original eigenvalue problem and the Davidson iteration on the transformed eigenvalue problem. The iterations on the original eigenvalue problem are used to compute the eigenvalues of A after the transformed eigenvalue problem has converged, and to compute the approximate eigenvalues and eigenvectors used to define the transformed eigenvalue problem. Similar to the hybrid method, if the iteration on the original problem is effective in reducing the residual norm, then we restart and continue the iteration on the original problem; however, if the residual norm is decreasing slowly, or the residual norms at the end of the last two iterations are both larger than the previous one, then the routine switches to the Davidson method on a transformed eigenvalue problem. The polynomials used for this routine are the same as those used in the hybrid methods and in the polynomial preconditioning, and the schemes for selecting the intervals are the same as in the polynomial purification case.
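In implementation terms, the transformation amounts to handing the Davidson routine a different matrix-vector product. A minimal sketch of our own, reusing the chebyshev_filter sketch shown earlier in this chapter, is:

    def transformed_matvec(matvec, k, a, b, lam0):
        """Return a closure that applies p(A) instead of A, where p is the
        degree-k shifted and scaled Chebyshev filter built on [a, b]
        (see the chebyshev_filter sketch above)."""
        return lambda v: chebyshev_filter(matvec, v, k, a, b, lam0)

The Davidson iteration on the transformed problem then calls this closure wherever it would have multiplied by A, while the Rayleigh-Ritz projection with the original product is used to recover the eigenvalues of A.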

Similar to the preconditioning case the p olynomial routines also use the uno ccupied part of the basis

vectors as workspace This means that the actual basis size for the transformed Davidson iterations is

or less than the basis size for the Davidson iteration on the original eigenvalue problem The two

Chebyshev p olynomial routines and the two leastsquares p olynomial routines require two extra vectors for

their computation and the Kernel p olynomial routine requires three extra vectors When the basis size for

the Davidson metho d without any transformation is if the Kernel p olynomial is used to p erform the

sp ectrum transformation the actual basis size is if one of the other four p olynomials is used the basis

size is The Davidson iteration on the transformed eigenvalue problem is terminated if the residual norm

of the largest eigenvalue of the transformed matrix is less than or if the residual norm increases

compared to previous iteration Thick restart is applied to the transformed Davidson iterations as well As

b efore Ritz vectors corresp onding to the largest Ritz values are saved as starting vectors for the new

Davidson iteration

We again use BUS as the test matrix for the extreme eigenvalue problem to show how the polynomial spectrum transformation technique works. The same residual tolerance and maximum basis size are used as before. Table  shows the results of using the Davidson method with the polynomial spectrum transformation; this table shows all the different combinations of the parameters. Among the schemes for selecting a, the most successful one in this case is a = . Using the dynamic b is more effective for the least-squares polynomial, while using the fixed b is more effective for the two kinds of Chebyshev polynomials. Overall, the spectrum transformation using the least-squares polynomial with a =  and b = \theta_m is the most effective choice shown in table .

The majority of the cases shown in table  are significantly better than the unpreconditioned Davidson method on the same eigenvalue problem. However, the best hybrid schemes presented in tables ,  and  use less time and fewer matrix-vector multiplications to solve the same problem than the best results from the polynomial spectrum transformation technique. In general, the spectrum transformation technique presented here also uses more time than the modified polynomial preconditioning scheme shown in the previous section.

For interior eigenvalue problems we were unable to get the polynomial spectrum transformation scheme to converge for the PLAT problem within  matrix-vector multiplications. For this reason we switched to the matrix PLAT, which has a similar origin to the matrix PLAT. We again seek to find  eigenvalues near , with residual tolerance  ||A||_F. To compare the effectiveness of the polynomial spectrum transformation scheme, we also repeated the test with the four CG-based preconditioners. The results with the CG preconditioners are shown in table ; the cases where the polynomial spectrum transformation scheme reached convergence within  matrix-vector multiplications are shown in table .

The best time shown in table  is  seconds to find  interior eigenvalues of PLAT. The time using one of the CG preconditioners to solve the same problem is  seconds, so it is clear that using a CG preconditioner with the Davidson method is more effective in this case. In fact, the Davidson method with three out of the

Chebyshev p olynomial of the st kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Chebyshev p olynomial of the nd kind

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

leastsquares p olynomial on one interval

xed b dynamic b

MATVEC MATVEC time MATVEC MATVEC time

a total Davidson sec total Davidson sec

p

Table  Results of using polynomial spectrum transformation with the Davidson method on BUS

MATVEC MATVEC time

preconditioner total Davidson sec

CGA I

T

CGA I xx

T

CGA I xAx

T T

CGI xx A I I xx

Table  Results of finding interior eigenvalues of PLAT with CG preconditioners

inner outer MATVEC MATVEC time

Polynomial b oundaries b oundaries total Davidson sec

leastsquares dynamic

Kernel dynamic

Kernel dynamic

Kernel dynamic

Kernel dynamic

Kernel dynamic

Table  The polynomial spectrum transformation schemes that successfully computed the desired eigenpairs of PLAT

MATVEC MATVEC time

preconditioner total Davidson sec

NONE

SOR

CGA I

T

CGA I xx

T

CGA I xAx

T T

CGI xx A I I xx

Hybrid methods with polynomials for extreme eigenvalues

MATVEC MATVEC time

Polynomial N a b total Davidson sec

p

Chebyshev I xed

Chebyshev II xed

leastsquares xed

Hybrid methods with polynomials for interior eigenvalues

inner outer MATVEC MATVEC time

Polynomial N b oundaries b oundaries total Davidson sec

p

leastsquares xed

Kernel xed

Table  MATVEC and time used to find the smallest eigenvalues of the Si test problem

four CG preconditioners can solve the PLAT problem in less than  seconds.

Comparing table  with table , we see that the same preconditioners perform quite differently on two similar eigenvalue problems. This dramatic difference could be due to the special nature of the interior eigenvalue problem or to some feature of the preconditioners; at this moment we do not yet know which is the main cause.

Computing a large number of extreme eigenvalues

The ultimate goal of this study is to use polynomials to improve the efficiency of computing a large number of extreme eigenvalues. To see how the polynomial schemes perform for this task, we conducted some tests on the matrix generated from the Si test case (see table ). We compute the  smallest eigenvalues of the matrix, starting with a single initial guess. To reduce the number of test cases we decided to use the hybrid schemes discussed in section . There are a total of  possible test cases with the different polynomials (see tables ,  and ); we have chosen to show only the case with the shortest execution time for each of the polynomials in table . As references, we also show the matrix-vector multiplications and time required to solve the same problem with a small number of preconditioners. The execution time shown here was gathered on an SGI Challenge workstation. On the eigenvalue problem tested, the Davidson method with the SOR preconditioner is the only case where the execution time is less than that of the unpreconditioned Davidson method. The four CG preconditioners are able to reduce the number of iterations spent by the Davidson method, but the preconditioning steps were too expensive to reduce the overall execution time. The hybrid methods used more matrix-vector multiplications than the unpreconditioned Davidson method; because the matrix-vector multiplication is relatively expensive (the average number of nonzero elements per row is about  for the Si matrix), more CPU time is used by the hybrid methods than by the plain Davidson method.

There could be many reasons for the outcome of this test: the hybrid methods may not be effective for computing many eigenpairs, the hybrid schemes may not work well with the locking strategy, or the test problem may simply be a bad example to use. However, this one example does not negate the previous observations that polynomial methods can be used effectively to solve eigenvalue problems. What this example does remind us is that continued research is required to improve the robustness of the hybrid schemes.

Summary

In this chapter we studied how to use polynomials to help the Davidson method compute eigenpairs of large sparse matrices. Among the three techniques tested, the purification technique is the most versatile one: the hybrid methods that alternate between the Davidson iteration and the polynomial purification schemes are effective. Using polynomials as preconditioners for the Davidson method is found to be ineffective on the test problems. The polynomial spectrum transformation scheme has a sound theoretical basis, but in the limited number of tests conducted we were unable to surpass the hybrid scheme. Overall, the effectiveness of all three techniques of using polynomials is sensitive to how the parameters are selected.

The most successful hybrid method for extreme eigenvalue problems in our tests is the one that alternates between the Davidson method and the least-squares polynomial. For this polynomial, the most successful scheme for selecting the interval on which to build the polynomial for extreme eigenvalue problems is to select the interval with the dynamic right end \theta_m. In the numerical tests we also see that saving lower-degree polynomials can enhance the performance of the hybrid methods (see table ).

The test results for interior eigenvalue problems are not as extensive as those for the extreme eigenvalue case. This is not because we conducted fewer tests, but because there are fewer cases where the interior eigenvalue problem was solved successfully. This underscores the need for better schemes for selecting the parameters of the polynomials used, or for better polynomials for interior eigenvalue problems. Based on the available results, we see that the hybrid methods with the least-squares polynomial or the Kernel polynomial in the purification step solved the same test problem in about the same time. The best scheme for selecting the inner boundaries of the two-interval least-squares polynomial is to set the boundaries far enough apart to ensure a constant damping factor. The outer boundaries for the least-squares polynomial and the Kernel polynomial can simply be the end points of the Gershgorin interval. The experiments show that it is more effective if the two highest-degree polynomials generated by the polynomial routine are used (see table ).

For the three eigenvalue problems tested in this chapter, the Davidson method with CG preconditioners was able to reach convergence in less time than the best of the three types of polynomial schemes. This can be partly blamed on the weakness of the interval selection schemes tested.

The results from the Si test case (see the corresponding table) show another aspect of the CG preconditioners and the hybrid methods. In this test problem, the CG preconditioners are too expensive to give good overall performance. The polynomial schemes are ineffective as well, but because the Davidson iterations are effective in improving the quality of the solutions, the polynomial purification step is rarely invoked, which automatically limits the damage that might be incurred by constructing a high-degree polynomial. In comparison, the hybrid methods are more effective than the CG preconditioner in this case. However, both hybrid schemes take more time to find the desired solutions than the unpreconditioned Davidson method.

In conclusion, if only the matrix-vector multiplication is available, using a Krylov method as the preconditioner for the Davidson method is an effective approach. Hybrid methods with polynomial purification or polynomial spectrum transformation could be effective in some cases. For lack of good strategies for selecting the parameters, the polynomial-based schemes are not competitive with the Krylov methods on the problems tested.

Chapter

Summary and Future Work

Summary

Our main goal in this thesis was to study ways to improve algorithms for large sparse eigenvalue problems, in particular for eigenvalue problems that require a large number of eigenvalues and corresponding eigenvectors, for example, finding a large number of eigenpairs of a very large matrix. These problems are challenging for two reasons. First, the matrices involved are too large for traditional QR approaches, which were designed for dense matrices. Second, the effective algorithms for sparse eigenvalue problems are designed to find only a few eigenpairs. We addressed this problem from three different perspectives: seeking a sound preconditioned algorithm for multiple eigenpairs, improving the preconditioning scheme, and studying the use of polynomials to enhance the flexibility of the eigenvalue routine.

Seeking a sound preconditioned algorithm for multiple eigenpairs. After evaluating eigenvalue algorithms such as the QR method, the Lanczos algorithm, the Arnoldi method, and the Davidson method, we came to the conclusion that the Arnoldi method and the Davidson method are the two most promising algorithms for the task of finding many eigenvalues and eigenvectors of large sparse matrices. In addition, we also studied three different variants of these two methods. Without preconditioning, it is clear that the five methods are identical to each other if the same single starting vector is given to all five methods (see the corresponding section). If they also restart with the same Ritz vectors, they will continue to produce the same basis and the same approximate solutions. For this reason, the cheaper the algorithm is per step, the less time it will take to find the same solution in exact arithmetic. The simplest algorithm among the five is the Arnoldi method; therefore, the Arnoldi method is the best if no preconditioning is used. For symmetric matrices, if there is only one initial guess, the Lanczos algorithm should be preferred.

The five methods behave differently when more than one initial guess is provided or when finite precision arithmetic is used. Among the five methods compared, the Davidson method is the most stable one because it has the least amount of cancellation error when building its basis. In fact, without preconditioning, the new vector added to the basis is orthogonal to the basis. Because of this, there are cases where the Davidson method is preferred even though it is more expensive per step. For example, if the wanted smallest eigenvalues have significantly smaller separation than the largest eigenvalues, then, based on experience with the Lanczos method, we know that the largest eigenvalues will converge quickly and cause orthogonality problems in the basis. The same orthogonality problem can occur in the Arnoldi method, though it is less likely than in the Lanczos method, but the more stable Davidson method will be able to find the solution without encountering the same difficulty. The same is true if there are interior eigenvalues that converge very easily, as in the case of BCSSTK.

With preconditioning, the five methods behave very differently from each other. The Davidson preconditioning is regarded as an inexact-Newton method, which means that it could work well when the preconditioner is good. A theoretical study of preconditioned eigenvalue algorithms has not been conducted; we took the approach of comparing the methods through numerical experiments. Based on our experimental results, the Davidson method is by far the most effective one with all tested preconditioners.

Improving the preconditioning scheme. The preconditioning scheme of the Davidson method approximately solves a nearly singular equation (A - θI) y = r, where r is the current residual vector and θ is the current approximate eigenvalue. In the Rayleigh quotient iteration method for eigenvalue problems, inversion of the same matrix is also required. From the study of Rayleigh quotient iteration, researchers have noticed many problems when attempting to invert the matrix A - θI. However, our experiments show that the Davidson method is still very effective. Based on this observation, we believe that the inexact Newton approach is a sound basic preconditioning strategy, though this particular Jacobian matrix may have some undesirable properties.
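To make this inexact-Newton reading concrete, the following minimal sketch (not the thesis implementation) expands a Davidson-type search space by solving the correction equation (A - θI) t = r only approximately with a few CG steps; the function names, the fixed iteration counts, and the omission of restarting and locking are illustrative assumptions.

    # Minimal sketch, assuming a symmetric matrix A with a fast matrix-vector
    # product, of a Davidson-type iteration whose correction equation
    # (A - theta*I) t = r is solved loosely by a few CG steps (the inexact
    # Newton view described above).  Restarting and locking are omitted.
    import numpy as np
    from scipy.sparse.linalg import LinearOperator, cg

    def davidson_cg_sketch(A, v0, outer_steps=30, inner_steps=10, tol=1e-8):
        n = A.shape[0]
        V = (v0 / np.linalg.norm(v0)).reshape(n, 1)    # orthonormal search basis
        for _ in range(outer_steps):
            H = V.T @ (A @ V)                          # Rayleigh-Ritz projection
            w, S = np.linalg.eigh(H)
            theta, x = w[0], V @ S[:, 0]               # smallest Ritz pair
            r = A @ x - theta * x                      # residual of the Ritz pair
            if np.linalg.norm(r) < tol:
                break
            # correction equation (A - theta I) t = r, solved only approximately;
            # CG is used here even though the shifted matrix may be indefinite
            op = LinearOperator((n, n), matvec=lambda z: A @ z - theta * z)
            t, _ = cg(op, r, maxiter=inner_steps)
            t -= V @ (V.T @ t)                         # keep the basis orthonormal
            norm_t = np.linalg.norm(t)
            if norm_t < 1e-12:
                break
            V = np.hstack([V, (t / norm_t).reshape(n, 1)])
        return theta, x

Only matrix-vector products with A are required in this form, which matches the setting assumed throughout the tests summarized here.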

We studied three different Newton iterations for eigenvalue problems: the augmented Newton method of Peters and Wilkinson, the constrained Newton method based on a framework from the literature, and the Jacobi-Davidson recurrence based on the Jacobi-Davidson preconditioning scheme. The augmented Newton scheme produces a Jacobian matrix that is larger than the original matrix A. In attempting to reduce the size of this linear system, we realized that there are a number of variations which are equivalent to each other but have different Jacobian matrices. From this analysis we see that the Olsen preconditioning scheme can be regarded as a Schur form of the augmented Newton scheme.

The Jacobian matrices of the different variations of the Newton method for eigenvalue problems can be used as preconditioners in three ways:

1. Develop a specialized incomplete factorization scheme for the Jacobian matrix of the augmented Newton scheme.

2. Use the Olsen preconditioning scheme, which can be viewed as applying a regular preconditioner M ≈ A - θI twice to improve the quality of the approximate solution (a sketch follows this list).

3. For the three Newton schemes whose Jacobian matrices cannot be easily factorized, use a Krylov iterative solver or an approximate inverse to approximately solve the preconditioning equation.
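As an illustration of the second item, the Olsen correction needs only two applications of an approximate solve with M ≈ A - θI. The small function below is a sketch under the assumption that such a routine, called M_solve here, is available; it is not the code used in the experiments.

    import numpy as np

    def olsen_correction(M_solve, r, x):
        """Olsen-style correction: apply the Davidson preconditioner twice."""
        mr = M_solve(r)              # first application:  approximately (A - theta I)^{-1} r
        mx = M_solve(x)              # second application: approximately (A - theta I)^{-1} x
        alpha = (x @ mr) / (x @ mx)  # scalar chosen so the correction is orthogonal to x
        return mr - alpha * mx

The scalar alpha removes the component of the correction along the current Ritz vector, which is what keeps the near-singularity of A - θI from dominating the preconditioned residual.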

We have performed experiments on most of the techniques mentioned above. For the limited number of test cases studied, preconditioners based on Krylov iterative solvers are easy to use and fairly effective.

There are four different linear systems that could be used with Krylov iterative solvers. They share the same right-hand side but have different iteration matrices. The four matrices are J_R = A - θI, J_I = A - θI - xx^T, J_C = A - θI - xx^T A - A xx^T, and J_J = (I - xx^T)(A - θI)(I - xx^T). Each of these four linear systems seems to have its own advantage. The condition number of J_R is very small when θ is far from any exact eigenvalue; J_R could potentially run into problems if θ is very close to an accurate solution. Since during most of the iterations θ is far from an accurate solution, the Davidson preconditioning scheme is effective. It is possible for J_C to have a large condition number when θ is far from convergence, even though it is shown to be well behaved near convergence. The matrix J_J is designed to be singular, which means a Krylov method could find a solution to the linear system but a different kind of linear system solver may fail to do so. In addition, it is generally harder to construct a preconditioner for a singular linear system. Overall, using J_I is the most prudent choice. In our tests the condition number of J_I is almost as small as that of J_R when θ is far from any exact eigenvalue, and the condition number of J_I does not grow without bound like that of J_R when θ converges toward a simple eigenvalue.
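None of these operators needs to be formed explicitly. The fragment below is a hedged sketch that wraps two of them, the shifted matrix J_R and the singular projected operator J_J, as matrix-free operators and solves the correction equation loosely; MINRES is used here only as a stand-in Krylov solver that tolerates indefinite and singular symmetric operators, and the remaining matrices can be wrapped in exactly the same way.

    import numpy as np
    from scipy.sparse.linalg import LinearOperator, minres

    def correction_operators(A, theta, x):
        """Matrix-free J_R and J_J for a symmetric A and a unit-norm Ritz vector x."""
        n = A.shape[0]
        def apply_jr(z):                       # J_R z = (A - theta I) z
            return A @ z - theta * z
        def apply_jj(z):                       # J_J z = (I - xx^T)(A - theta I)(I - xx^T) z
            z = z - x * (x @ z)
            z = A @ z - theta * z
            return z - x * (x @ z)
        return (LinearOperator((n, n), matvec=apply_jr),
                LinearOperator((n, n), matvec=apply_jj))

    # usage sketch: project the residual first so the singular J_J system is consistent
    # J_R, J_J = correction_operators(A, theta, x)
    # rhs = r - x * (x @ r)
    # t, info = minres(J_J, rhs, maxiter=15)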

Using polynomials. Using polynomials for eigenvalue problems has a few advantages: they are fairly easy to add to a projection-based eigenvalue routine, and they can improve the scalability of the eigenvalue routine on parallel machines by reducing the number of dot-products needed. The main obstacle that prevents the effective use of the polynomial schemes is that the user has to determine a certain number of parameters in order to use them. In this thesis we studied different ways of selecting parameters for different polynomials, and we tested three different strategies of using these polynomials. Overall, we saw that using polynomials with the Davidson method can reduce the total number of matrix-vector multiplications significantly compared with the unpreconditioned Davidson method. The performance of all three schemes is very sensitive to the choice of the parameters. Among the three different ways of using the polynomials, the purification scheme is the most successful one on the two test problems. The most successful parameter selection schemes are different for the three different ways of using polynomials.

The most successful scheme for extreme eigenvalue problems is a hybrid method that uses the three highest-degree least-squares polynomials computed on an interval whose lower end is λ_m. For interior eigenvalue problems, the hybrid schemes with the two-interval least-squares polynomial and the Kernel polynomial use almost the same amount of time in the best cases.
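To show what the purification step does, the sketch below applies a Chebyshev filter in place of the least-squares and Kernel polynomials used in the thesis: a vector is multiplied by a polynomial of A chosen so that eigenvector components whose eigenvalues fall inside the damping interval [a, b] are suppressed, while those below a are magnified. The choice of filter, the degree, and the omission of the usual rescaling against overflow are simplifying assumptions.

    import numpy as np

    def chebyshev_filter(A, v, degree, a, b):
        """Apply a Chebyshev polynomial of A (given degree >= 1) that damps eigenvalues in [a, b]."""
        half_width = (b - a) / 2.0
        center = (b + a) / 2.0
        y_prev = v
        y = (A @ v - center * v) / half_width          # degree-1 term of the recurrence
        for _ in range(2, degree + 1):
            y_next = 2.0 * (A @ y - center * y) / half_width - y_prev
            y_prev, y = y, y_next
        return y

In a hybrid method such a filter would be applied to the current Ritz vectors between runs of the Davidson iteration, and the filtered vectors would be reorthogonalized before restarting.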

Since Krylov iterative solvers also build their solutions in terms of polynomials of A, we compared the results of using them as preconditioners with the other polynomial schemes. The particular iterative method used is the Conjugate Gradient (CG) method. The CG preconditioner is much more effective on the problems tested than the polynomial preconditioning schemes. In terms of execution time, the best results from using the polynomial purification scheme can be comparable to the average case of the CG preconditioning.

In conclusion, we identified the Davidson method as the most effective choice for finding multiple eigenpairs of large sparse matrices. We have found different preconditioning schemes that can avoid many pitfalls of the Davidson preconditioning scheme. Among the preconditioners that use only matrix-vector multiplications, our tests show that Krylov methods are effective and easy to use.

Future work

Overall, we see preconditioned methods as an effective approach to solving eigenvalue problems. We will continue to pursue the study of new preconditioning schemes and new preconditioners for eigenvalue problems to further enhance the performance of eigenvalue methods. In the immediate future we see two different directions.

The first is to build high-quality eigenvalue software. During this study we have developed a significant number of codes to prove the concepts and to conduct tests. One way of extending this research is to build a coherent software package out of the existing codes. In building this package, a few issues need to be addressed to ensure stable performance of the software; for example, a more effective use of the large eigenvector array, a more effective thick-restarting scheme, and a better dynamic scheme for switching between the Davidson iteration and the polynomial methods are needed. We would also like to use this eigenvalue code in a variety of applications. Through the use of the PVM and MPI data communication libraries, we have been able to port the eigenvalue code onto parallel environments with little effort. In constructing this software package, we should continue to improve the structure of the program so that it requires little or no additional work to run the eigenvalue package efficiently on most parallel architectures.

The second direction of research is to extend the preconditioning techniques to other types of eigenvalue problems. For example, most preconditioning techniques can be extended trivially to Hermitian eigenvalue problems and unsymmetric eigenvalue problems. Through our study here, we see that one of the most robust preconditioning schemes is the CG-based preconditioner. It has been noticed that by using a dynamic tolerance on the residual of the linear system, along with a dynamic limit on the number of matrix-vector multiplications used by CG, one could significantly reduce the execution time of the overall eigenvalue solution process. The search for this dynamic tolerance scheme could be conducted through a study of the underlying dynamics.
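One plausible form for such a control rule is sketched below; the forcing-term style tolerance and the specific thresholds are assumptions made only to illustrate the idea, not the scheme envisioned above.

    def inner_cg_controls(outer_residual_norm, max_inner=25):
        """Pick a relative tolerance and an iteration cap for the inner CG solve."""
        # loose solves while the outer eigenvalue iteration is far from convergence,
        # tighter solves (an Eisenstat-Walker-like forcing term) once it is close
        rtol = min(0.5, outer_residual_norm)
        # allow more inner matrix-vector multiplications only near convergence
        maxiter = 5 if outer_residual_norm > 1e-2 else max_inner
        return rtol, maxiter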

There are also many questions we have raised during the study of hybrid methods for eigenvalue problems; they are interesting challenges as well.

Appendix A

Acknowledgement

I would like to express my sincere gratitude to my research adviser, Professor Yousef Saad, for his support and guidance throughout my Ph.D. study. Many thanks to Edmond Chow, Dr. Andreas Stathopoulos, Dr. Xiaodun Jin, and Dr. Andrew Anda for their friendship and for their constant inspiration in life and in research. My graduate school life would not be the same without them.

I would also like to thank all the professors who have taught me and given me advice and suggestions throughout my study at the University of Minnesota. Due to their efforts, the past several years have been a very fruitful time for me.

Bibliography

M. Alfaro, J. S. Dehesa, F. J. Marcellan, J. L. Rubio de Francia, and J. Vinuesa, editors. Orthogonal Polynomials and Their Applications. Springer-Verlag, Berlin, Germany. Proceedings of an International Symposium held in Segovia, Spain, Sept.
E. L. Allgower and K. Georg. Numerical Continuation Methods: An Introduction. Springer-Verlag, Berlin, Germany.
E. L. Allgower and K. Georg. Continuation and path following. Acta Numerica.
E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen. LAPACK Users' Guide. SIAM, Philadelphia, PA.
W. E. Arnoldi. The principle of minimized iteration in the solution of the matrix eigenvalue problem. Quart. Appl. Math.
S. F. Ashby, T. A. Manteuffel, and J. S. Otto. A comparison of adaptive Chebyshev and least squares polynomial preconditioning for Hermitian positive definite linear systems. SIAM J. Sci. Statist. Comput.
O. Axelsson and I. Gustafsson. Preconditioning and two-level multigrid methods for arbitrary degree of approximation. Math. Comput.
O. Axelsson and P. S. Vassilevski. A survey of multilevel preconditioned iterative methods. BIT.
Owe Axelsson. A survey of preconditioned iterative methods for linear systems of equations. BIT.
Owe Axelsson. Iterative Solution Methods. Press Syndicate of the University of Cambridge, Cambridge, UK.
Z. Bai, D. Day, and Q. Ye. ABLE: an adaptive block Lanczos method for non-Hermitian eigenvalue problems. Technical Report, University of Kentucky, Department of Mathematics.
Z. Bai and G. W. Stewart. SRRIT: a FORTRAN subroutine to calculate the dominant invariant subspace of a nonsymmetric matrix. Technical Report UMIACS TR, University of Maryland at College Park, College Park, MD.

Zhaojun Bai. Progress in the numerical solution of the nonsymmetric eigenvalue problem. Numerical Linear Algebra with Applications.
J. G. L. Booten, H. A. van der Vorst, P. M. Meijer, and H. J. J. te Riele. A preconditioned Jacobi-Davidson method for solving large generalized eigenvalue problems. Technical Report NM-R, Mathematisch Centrum, Centrum voor Wiskunde en Informatica, Amsterdam.
John P. Boyd. A Chebyshev polynomial interval-searching method (Lanczos economization) for solving a nonlinear equation with application to the nonlinear eigenvalue problem. J. Comput. Phys.
J. H. Bramble, A. V. Knyazev, and J. E. Pasciak. A subspace preconditioning algorithm for eigenvector/eigenvalue computation. Technical Report, Center for Computational Mathematics, University of Colorado at Denver.
C. Brezinski and M. Redivo-Zaglia. Hybrid procedures for solving linear systems. Numer. Math.
A. Chapman and Y. Saad. Deflated and augmented Krylov subspace techniques. Technical Report UMSI, Minnesota Supercomputing Institute, Univ. of Minnesota.
F. Chatelin. Eigenvalues of Matrices. Wiley.
J. R. Chelikowsky, N. Troullier, and Y. Saad. Finite-difference-pseudopotential method: electronic structure calculations without a basis. Phys. Rev. Lett.
J. R. Chelikowsky, N. Troullier, K. Wu, and Y. Saad. Higher order finite difference pseudopotential method: an application to diatomic molecules. Phys. Rev. B.
J. R. Chelikowsky, N. R. Troullier, X. Jing, D. Dean, N. Binggeli, K. Wu, and Y. Saad. Algorithms for the structural properties of clusters. Computer Physics Communications.
T. S. Chihara, editor. An Introduction to Orthogonal Polynomials. Gordon and Breach Science Publishers, New York, London, Paris.
E. Chow and Y. Saad. Approximate inverse preconditioners via sparse-sparse iterations. SIAM J. Sci. Comput. To appear.
M. L. Cohen and J. R. Chelikowsky. Electronic Structure and Optical Properties of Semiconductors. Springer-Verlag, New York, Berlin, Heidelberg, 2nd edition.

M. Crouzeix, B. Philippe, and M. Sadkane. The Davidson method. SIAM J. Sci. Comput.
J. Cullum and R. A. Willoughby. Lanczos Algorithms for Large Symmetric Eigenvalue Computations: Theory. Progress in Scientific Computing. Birkhauser, Boston.
J. Cullum and R. A. Willoughby. Lanczos Algorithms for Large Symmetric Eigenvalue Computations: Programs. Progress in Scientific Computing. Birkhauser, Boston.
J. W. Daniel, W. B. Gragg, L. Kaufman, and G. W. Stewart. Reorthogonalization and stable algorithms for updating the Gram-Schmidt QR factorization. Math. Comp., October.
Ernest R. Davidson. The iterative calculation of a few of the lowest eigenvalues and corresponding eigenvectors of large real-symmetric matrices. J. Comput. Phys.
Ernest R. Davidson. Super-matrix methods. Computer Physics Communications.
Ernest R. Davidson. Monster matrices: their eigenvalues and eigenvectors. Computers in Physics.
Philip J. Davis. Interpolation and Approximation. Dover Publications, New York.
J. J. Dongarra. Improving the accuracy of computed singular values. SIAM J. Sci. Statist. Comput.
J. J. Dongarra, S. Hammarling, and J. H. Wilkinson. Numerical considerations in computing invariant subspaces. SIAM J. Matrix Anal. Appl.
J. J. Dongarra, C. B. Moler, and J. H. Wilkinson. Improving the accuracy of computed eigenvalues and eigenvectors. SIAM J. Numer. Anal.
I. S. Duff, R. G. Grimes, and J. G. Lewis. Sparse matrix test problems. ACM Trans. Math. Soft.
A. Edelman, T. Arias, and S. T. Smith. Conjugate Gradient and Newton's method on the Grassmann and Stiefel manifolds. Submitted to SIAM J. Matrix Anal. Appl.
B. Fischer and R. W. Freund. On adaptive weighted polynomial preconditioning for Hermitian positive definite matrices. SIAM J. Sci. Comput., March.
C. Froese Fischer. Multitasking the Davidson algorithm for the large sparse eigenvalue problem. International Journal of Supercomputer Applications.
D. R. Fokkema, G. L. G. Sleijpen, and H. A. van der Vorst. Jacobi-Davidson style QR and QZ algorithms for the partial reduction of matrix pencils. Technical Report, Department of Mathematics, Universiteit Utrecht.

B. Fornberg and D. M. Sloan. A review of pseudospectral methods for solving partial differential equations. Acta Numerica.
Bengt Fornberg. High-order finite differences and the pseudospectral method on staggered grids. SIAM J. Numer. Anal.
Bengt Fornberg. Rapid generation of weights in finite difference formulas. In the Dundee proceedings. Longman Sci. Tech., Harlow.
Roland W. Freund. Quasi-kernel polynomials and convergence results for quasi-minimal residual iterations. In Braess and Schumaker, editors, Numerical Methods of Approximation Theory. Birkhauser, Basel.
Roland W. Freund. Quasi-kernel polynomials and their use in non-Hermitian matrix iterations. Journal of Computational and Applied Mathematics.
Albert T. Galick. Efficient solution of large sparse eigenvalue problems in microelectronic simulation. PhD thesis, University of Illinois at Urbana-Champaign.
A. George and J. W. H. Liu. Computer Solution of Large Sparse Positive-Definite Systems. Prentice-Hall, New Jersey.
G. H. Golub, editor. Fast Poisson solvers. Math. Assoc. America, Washington, DC.
G. H. Golub and D. P. O'Leary. Some history of the Conjugate Gradient and Lanczos algorithms. SIAM Review.
G. H. Golub and C. F. van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, MD.
B. S. Garbow, J. M. Boyle, J. J. Dongarra, and C. B. Moler, editors. Matrix Eigensystem Routines: EISPACK Guide Extension. Lecture Notes in Computer Science. Springer, New York, 2nd edition.
R. G. Grimes, J. G. Lewis, and H. D. Simon. A shifted block Lanczos algorithm for solving sparse symmetric generalized eigenproblems. SIAM J. Matrix Anal. Appl.
Martin H. Gutknecht. A completed theory of the unsymmetric Lanczos process and related algorithms, Part I. SIAM J. Matrix Anal. Appl.
Martin H. Gutknecht. A completed theory of the unsymmetric Lanczos process and related algorithms, Part II. SIAM J. Matrix Anal. Appl.
B. Harrod, A. Sameh, and W. J. Harrod. Trace minimization algorithm for the generalized eigenvalue problems. In Garry Rodrigue, editor, Parallel Processing for Scientific Computing, Los Angeles, CA. SIAM, Philadelphia, PA.

Michael A. Heroux. A reverse communication interface for matrix-free preconditioned iterative solvers. In C. A. Brebbia, D. Howard, and A. Peters, editors, Applications of Supercomputers in Engineering II. Computational Mechanics Publications, Boston.
X. Jing, N. R. Troullier, J. R. Chelikowsky, K. Wu, and Y. Saad. Vibrational modes of silicon nanostructures. Solid State Communications.
X. Jing, N. R. Troullier, D. Dean, N. Binggeli, J. R. Chelikowsky, K. Wu, and Y. Saad. Ab initio molecular dynamics simulations of Si clusters using the higher-order finite-difference-pseudopotential method. Phys. Rev. B.
Xiaodun Jing. Theory of the electronic and structural properties of semiconductor clusters. PhD thesis, University of Minnesota, Minneapolis, Minnesota.
Claes Johnson. Numerical Solution of Partial Differential Equations by the Finite Element Method. Cambridge University Press, Cambridge, New York.
Wayne Joubert. A robust GMRES-based adaptive polynomial preconditioning algorithm for nonsymmetric linear systems. SIAM J. Sci. Comput., March.
Charles Kittel. Quantum Theory of Solids. John Wiley & Sons, New York.
Charles Kittel. Introduction to Solid State Physics. John Wiley & Sons, New York.
A. V. Knyazev and A. L. Skorokhodov. Preconditioned gradient-type iterative methods in a subspace for partial generalized symmetric eigenvalue problems. SIAM J. Numer. Anal.
L. Yu. Kolotilina and A. Yu. Yeremin. Factorized sparse approximate inverse preconditionings I: Theory. SIAM J. Matrix Anal. Appl.
J. D. Kress, S. B. Woodruff, G. A. Parker, and R. T. Pack. Some strategies for enhancing the performance of the block Lanczos method. Computer Physics Communications.
V. Kumar, A. Grama, A. Gupta, and G. Karypis. Introduction to Parallel Computing: Design and Analysis of Algorithms. The Benjamin/Cummings Publishing Company, Redwood City, California.

C. Lanczos. An iterative method for the solution of the eigenvalue problem of linear differential and integral operators. Journal of Research of the National Bureau of Standards.
R. B. Lehoucq. Restart an Arnoldi reduction. Technical Report MCS-P, Argonne National Laboratory, Argonne, IL. Submitted to SIAM Journal on Scientific Computing.
R. B. Lehoucq and K. Meerbergen. The inexact rational Krylov sequence method. Technical Report MCS-P, Argonne National Laboratory, Argonne, IL.
R. B. Lehoucq and D. Sorensen. Deflation techniques for an implicitly restarted Arnoldi iteration. Technical Report TR, Department of Computational and Applied Mathematics, Rice University, Houston, Texas. Accepted for publication in SIAM Journal on Matrix Analysis and Applications.
Richard B. Lehoucq. Analysis and implementation of an implicitly restarted Arnoldi iteration. PhD thesis, Rice University.
G. C. Lo and Y. Saad. Iterative solution of general sparse linear systems on clusters of workstations. Technical Report UMSI, Minnesota Supercomputer Institute, University of Minnesota, Minneapolis, MN.
Karl Meerbergen. Robust methods for the calculation of rightmost eigenvalues of nonsymmetric eigenvalue problems. PhD thesis, K. U. Leuven, Heverlee, Belgium.
J. A. Meijerink and H. A. van der Vorst. An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix. Math. Comp.
R. B. Morgan. Davidson's method and preconditioning for generalized eigenvalue problems. J. Comput. Phys.
R. B. Morgan. Computing interior eigenvalues of large matrices. Lin. Alg. Appl.
R. B. Morgan. Generalizations of Davidson's method for computing eigenvalues of large nonsymmetric matrices. J. Comput. Phys., August.
R. B. Morgan and D. S. Scott. Generalizations of Davidson's method for computing eigenvalues of sparse symmetric matrices. SIAM J. Sci. Statist. Comput.
R. B. Morgan and D. S. Scott. Preconditioning the Lanczos algorithm for sparse symmetric eigenvalue problems. SIAM J. Sci. Comput.

R. B. Morgan and M. Zeng. Estimates for interior eigenvalues of large nonsymmetric matrices.
Ronald B. Morgan. On restarting the Arnoldi method for large nonsymmetric eigenvalue problems.
K. W. Morton and D. F. Mayers. Numerical Solution of Partial Differential Equations. Cambridge University Press, Cambridge, New York.
S. G. Mulyarchik, S. S. Bielawski, and A. V. Popov. Efficient computational schemes of the Conjugate Gradient method for solving linear systems. J. Comput. Phys.
C. W. Murray, S. C. Racine, and E. R. Davidson. Improved algorithms for the lowest eigenvalues and associated eigenvectors of large matrices. J. Comput. Phys.
R. Natarajan and D. Vanderbilt. A new iterative scheme for obtaining eigenvectors of large real-symmetric matrices. J. Comput. Phys.
Roy L. Nersesian. Computer Simulation in Financial Risk Management: A Guide for Business Planners and Strategists. Quorum Books, New York.
Michael Oettli. The Homotopy Method Applied to the Symmetric Eigenproblem. PhD thesis, Swiss Federal Institute of Technology Zurich, Zurich, Switzerland.
J. Olsen, P. Jorgensen, and J. Simons. Passing the one-billion limit in full configuration-interaction (FCI) calculations. Chemical Physics Letters.
C. C. Paige, B. N. Parlett, and H. A. van der Vorst. Approximate solutions and eigenvalue bounds for Krylov subspaces. Numerical Linear Algebra with Applications.
G. A. Parker, W. Zhu, Y. Huang, D. K. Hoffman, and D. J. Kouri. Matrix pseudo-spectroscopy: iterative calculation of matrix eigenvalues and eigenvectors of large matrices using a polynomial expansion of the Dirac delta function. Computer Physics Communications.
B. N. Parlett. The Rayleigh quotient iteration and some generalizations for nonnormal matrices. Math. Comp.
B. N. Parlett. Two monitoring schemes for the Lanczos algorithm. In R. Glowinski and J. L. Lions, editors, Computing Methods in Applied Science and Engineering V. North-Holland Publishing Company.
Beresford N. Parlett. The Symmetric Eigenvalue Problem. Prentice-Hall, Englewood Cliffs, NJ.
Beresford N. Parlett. Towards a black box Lanczos program. Computer Physics Communications.
Kamran Parsaye and Mark Chignell. Intelligent Database Tools & Applications: Hyperinformation Access, Data Quality, Visualization, Automatic Discovery. Wiley, New York.

G. Peters and J. H. Wilkinson. Inverse iteration, ill-conditioned equations and Newton's method. SIAM Review, July.
S. Petiton, Y. Saad, K. Wu, and W. Ferng. Basic sparse matrix computations on the CM. International Journal of Modern Physics C.
Philip Pfaff. Financial Modeling. Allyn and Bacon, Needham Heights, Mass.
Theodore J. Rivlin. Chebyshev Polynomials. Wiley, New York, 2nd edition.
A. Ruhe. Implementation aspects of band Lanczos algorithms for computation of eigenvalues of large sparse symmetric matrices. Math. Comput.
A. Ruhe. Rational Krylov sequence methods for eigenvalue computation. Lin. Alg. Appl.
Heinz Rutishauser. Computational aspects of F. L. Bauer's simultaneous iteration method. Numer. Math.
Heinz Rutishauser. Simultaneous iteration method for symmetric matrices. Numer. Math.
Y. Saad. Analysis of augmented Krylov subspace techniques. Technical Report UMSI, Minnesota Supercomputing Institute, Univ. of Minnesota.
Y. Saad and K. Wu. Design of an iterative solution module for a parallel matrix library (P_SPARSLIB). Applied Numerical Mathematics. Also Technical Report, Department of Computer Science, University of Minnesota.
Y. Saad, K. Wu, and S. Petiton. Sparse matrix computations on the CM. In R. Sincovec, D. Keyes, M. Leuze, L. Petzold, and D. Reed, editors, Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing. SIAM, Philadelphia.
Yousef Saad. On the rate of convergence of the Lanczos and the block-Lanczos methods. SIAM J. Numer. Anal.
Yousef Saad. Variations on Arnoldi's method for computing eigenelements of large unsymmetric matrices. Lin. Alg. Appl.
Yousef Saad. Iterative solution of indefinite symmetric linear systems by methods using orthogonal polynomials over two disjoint intervals. SIAM J. Numer. Anal.
Yousef Saad. Chebyshev acceleration techniques for solving nonsymmetric eigenvalue problems. Math. Comp.
Yousef Saad. Practical use of polynomial preconditioning for the Conjugate Gradient method. SIAM J. Sci. Statist. Comput.
Yousef Saad. Preconditioning techniques for nonsymmetric and indefinite linear systems. Journal of Computational and Applied Mathematics.
Yousef Saad. SPARSKIT: a basic tool kit for sparse matrix computations. Technical Report, Research Institute for Advanced Computer Science, NASA Ames Research Center, Moffett Field, CA. Software currently available at ftp://ftp.cs.umn.edu/dept/sparse.

Yousef Saad. Numerical Methods for Large Eigenvalue Problems. Manchester University Press.
Yousef Saad. Highly parallel preconditioners for general sparse matrices. In G. Golub, A. Greenbaum, and M. Luskin, editors, Recent Advances in Iterative Methods. Springer-Verlag.
Yousef Saad. ILUT: a dual threshold incomplete LU factorization. Numerical Linear Algebra with Applications. Also Technical Report, Minnesota Supercomputer Institute, University of Minnesota.
Yousef Saad. Iterative Methods for Sparse Linear Systems. PWS Publishing, Boston, MA.
Miloud Sadkane. Block-Arnoldi and Davidson methods for unsymmetric large eigenvalue problems. Numer. Math.
Miloud Sadkane. A block Arnoldi-Chebyshev method for computing the leading eigenpairs of large sparse unsymmetric matrices. Numer. Math.
A. H. Sameh and J. A. Wisniewski. A trace minimization algorithm for the generalized eigenvalue problem. SIAM J. Numer. Anal.
V. Simoncini and E. Gallopoulos. A hybrid block GMRES method for nonsymmetric systems with multiple right-hand sides. Technical Report, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Urbana, IL.
G. L. G. Sleijpen, A. G. L. Booten, D. R. Fokkema, and H. A. van der Vorst. Jacobi-Davidson type methods for generalized eigenproblems and polynomial eigenproblems, Part I. Technical Report, Department of Mathematics, Universiteit Utrecht.
G. L. G. Sleijpen and H. A. van der Vorst. A Jacobi-Davidson iteration method for linear eigenvalue problems. SIAM J. Matrix Anal. Appl.
G. L. G. Sleijpen and H. A. van der Vorst. The Jacobi-Davidson method for eigenvalue problems and its relation with accelerated inexact Newton schemes.
B. T. Smith, J. M. Boyle, B. S. Garbow, Y. Ikebe, V. C. Klema, and C. B. Moler, editors. Matrix Eigensystem Routines: EISPACK Guide. Lecture Notes in Computer Science. Springer, New York, 2nd edition.
Gordon D. Smith. Numerical Solution of Partial Differential Equations: Finite Difference Methods. Oxford University Press.
D. Sorensen, R. Lehoucq, P. Vu, and C. Yang. ARPACK: an implementation of the Implicitly Restarted Arnoldi iteration that computes some of the eigenvalues and eigenvectors of a large sparse matrix. Available from ftp.caam.rice.edu under the directory pub/people/sorensen/ARPACK.

D. C. Sorensen and C. Yang. A truncated RQ-iteration for large scale eigenvalue calculations. Technical Report TR, Department of Computational and Applied Mathematics, Rice University, Houston, Texas.
D. C. Sorensen. Implicit application of polynomial filters in a k-step Arnoldi method. SIAM J. Matrix Anal. Appl.
G. Starke and R. S. Varga. A hybrid Arnoldi-Faber iterative method for nonsymmetric systems of linear equations. Numer. Math.
A. Stathopoulos, Y. Saad, and C. F. Fischer. Robust preconditioning of large, sparse, symmetric eigenvalue problems. Journal of Computational and Applied Mathematics.
A. Stathopoulos, Y. Saad, and K. Wu. Dynamic thick restarting of the Davidson and the implicitly restarted Arnoldi methods. Technical Report UMSI, University of Minnesota Supercomputer Institute.
A. Stathopoulos, Y. Saad, and K. Wu. Thick restarting of the Davidson method: an extension to implicit restarting. In T. Manteuffel, S. McCormick, L. Adams, S. Ashby, H. Elman, R. Freund, A. Greenbaum, S. Parter, P. Saylor, N. Trefethen, H. van der Vorst, H. Walker, and O. Widlund, editors, Proceedings of the Copper Mountain Conference on Iterative Methods, Copper Mountain, Colorado.
Andreas Stathopoulos and Charlotte Fischer. Reducing synchronization on the parallel Davidson method for the large sparse eigenvalue problem. In Supercomputing. IEEE Comput. Soc. Press, Los Alamitos, CA.
Andreas Stathopoulos and Charlotte Fischer. A Davidson program for finding a few selected extreme eigenpairs of a large sparse real symmetric matrix. Computer Physics Communications.
Eduard L. Stiefel. Kernel Polynomials in Linear Algebra and Their Numerical Applications. National Bureau of Standards, Washington, D.C.
J. Stoer and R. Bulirsch. Introduction to Numerical Analysis. Springer-Verlag, New York.
Bjarne Stroustrup. The C++ Programming Language. Addison-Wesley Publishing, 2nd edition.
Gabor Szego. Orthogonal Polynomials. American Mathematical Society, Providence, Rhode Island, 4th edition.
R. A. Tapia. Newton's method for optimization problems with equality constraints. SIAM J. Numer. Anal.
R. A. Tapia. Newton's method for problems with equality constraints. SIAM J. Numer. Anal.
R. A. Tapia. Diagonalized multiplier methods and quasi-Newton methods for constrained optimization. Journal of Optimization Theory and Applications.
James W. Thomas. Numerical Partial Differential Equations. Springer, New York.
Charles H. Tong. Parallel Preconditioned Conjugate Gradient Methods for Elliptic Partial Differential Equations. PhD thesis, University of California, Los Angeles, CA.

N. Troullier, J. R. Chelikowsky, and Y. Saad. Calculating large systems with plane waves: is it an N^3 or N^2 scaling problem? Solid State Communications.
N. R. Troullier and J. L. Martins. Efficient pseudopotentials for plane-wave calculations. Phys. Rev. B.
M. B. van Gijzen. A polynomial preconditioner for the GMRES algorithm. Journal of Computational and Applied Mathematics.
J. H. van Lenthe and P. Pulay. A space-saving modification of Davidson's eigenvector algorithm. J. Comput. Chem.
A. F. Voter, J. D. Kress, and R. N. Silver. Linear-scaling tight binding from a truncated-moment approach. Phys. Rev. B.
T. Washio and K. Hayami. Parallel block preconditioning based on SSOR and MILU. Numerical Linear Algebra with Applications.
D. S. Watkins and L. Elsner. Convergence of algorithms of decomposition type for the eigenvalue problem. Lin. Alg. Appl.
J. H. Wilkinson. The Algebraic Eigenvalue Problem. Monographs on Numerical Analysis. Oxford Science Publications, New York.
K. Wu, Y. Saad, and A. Stathopoulos. Preconditioned Krylov subspace methods for eigenvalue problems. In T. Manteuffel, S. McCormick, L. Adams, S. Ashby, H. Elman, R. Freund, A. Greenbaum, S. Parter, P. Saylor, N. Trefethen, H. van der Vorst, H. Walker, and O. Widlund, editors, Proceedings of the Copper Mountain Conference on Iterative Methods, Copper Mountain, Colorado.