AMSC 600/CMSC 760 Advanced Linear Numerical Analysis, Fall 2007. Arnoldi Methods. Dianne P. O’Leary, © 2006, 2007.
Algorithms that use the Arnoldi Basis
Reference: Chapter 6 of Saad
The Arnoldi Basis
How to use the Arnoldi Basis
FOM (usually called the Arnoldi iteration)
GMRES
Givens Rotations
Relation between FOM and GMRES iterates
Practicalities
Convergence Results for GMRES
The Arnoldi Basis
Recall:
The Arnoldi basis v_1, . . . , v_n is obtained column by column from the relation AV = VH, where H is upper Hessenberg and V^T V = I.
In particular, h_{ij} = (A v_j, v_i). The vectors v_j can be formed by a (modified) Gram-Schmidt process.
For three algorithms for computing the recurrence, see Sec 6.3.1 and 6.3.2.
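The recurrence can be sketched in NumPy as follows (a minimal modified Gram-Schmidt variant; the function name and the exact-zero breakdown test are my own choices, not from the text):

```python
import numpy as np

def arnoldi(A, v1, m):
    """Arnoldi with modified Gram-Schmidt.

    Returns V (n x (m+1)) with orthonormal columns and H ((m+1) x m)
    upper Hessenberg satisfying A V[:, :m] = V H (when no breakdown occurs).
    """
    n = len(v1)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):         # orthogonalize against v_1, ..., v_{j+1}
            H[i, j] = w @ V[:, i]      # h_ij = (A v_j, v_i)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:         # exact breakdown: invariant subspace found
            return V[:, :j + 1], H[:j + 1, :j + 1]   # square block, A V = V H exactly
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

The inner loop subtracts each projection as soon as it is computed; that is what distinguishes modified from classical Gram-Schmidt and makes the computed basis less sensitive to rounding.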
Facts about the Arnoldi Basis
Let’s convince ourselves of the following facts:
1. The first m columns of V form an orthogonal basis for K_m(A, v_1).
2. Let V_m denote the first m columns of V, and H_m denote the leading m × m block of H. Then
   A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T.
The algorithm breaks down if h_{m+1,m} = 0. In this case, we cannot get a new vector v_{m+1}.
This breakdown means that K_{m+1}(A, v_1) = K_m(A, v_1). In other words, K_m(A, v_1) is an invariant subspace of A of dimension m.
How to use the Arnoldi Basis
We want to solve Ax = b.
We choose x_0 = 0 and v_1 = b/β, where β = ||b||.
Let’s choose x_m ∈ K_m(A, v_1). Two choices:
Projection: Make the residual orthogonal to K_m(A, v_1). (Section 6.4.1) This is the Full Orthogonalization Method (FOM).
Minimization: Minimize the norm of the residual over all choices of x in K_m(A, v_1). (Section 6.5) This is the Generalized Minimum Residual Method (GMRES).
FOM (usually called the Arnoldi iteration)
The Arnoldi relation:
A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T.
If x_m ∈ K_m(A, v_1), then we can express it as x_m = V_m y_m for some vector y_m. The residual is
r_m = b − A x_m = β v_1 − A V_m y_m
    = β v_1 − (V_m H_m + h_{m+1,m} v_{m+1} e_m^T) y_m.
To enforce orthogonality, we want
0 = V_m^T r_m
  = β V_m^T v_1 − (V_m^T V_m H_m + h_{m+1,m} V_m^T v_{m+1} e_m^T) y_m
  = β e_1 − H_m y_m,
using V_m^T v_1 = e_1, V_m^T V_m = I, and V_m^T v_{m+1} = 0.
Therefore, given the Arnoldi relation at step m, we solve the linear system
H_m y_m = β e_1
and set
x_m = V_m y_m,
r_m = b − A x_m = b − A V_m y_m
    = β V_m e_1 − (V_m H_m + h_{m+1,m} v_{m+1} e_m^T) y_m
    = V_m (β e_1 − H_m y_m) − h_{m+1,m} v_{m+1} e_m^T y_m
    = −h_{m+1,m} v_{m+1} e_m^T y_m
    = −h_{m+1,m} (y_m)_m v_{m+1}.
Therefore, the norm of the residual is just |h_{m+1,m} (y_m)_m|, which can be computed without forming x_m.
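Putting the pieces together, a FOM step can be sketched as follows (self-contained, with the Arnoldi loop inlined; the function name and test problem are my own):

```python
import numpy as np

def fom(A, b, m):
    """One FOM solve: build the Arnoldi basis, solve H_m y = beta e_1,
    set x_m = V_m y, and estimate ||r_m|| as |h_{m+1,m} (y_m)_m|."""
    n = len(b)
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    for j in range(m):                     # Arnoldi (modified Gram-Schmidt)
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    rhs = np.zeros(m)
    rhs[0] = beta
    y = np.linalg.solve(H[:m, :m], rhs)    # H_m y_m = beta e_1
    x = V[:, :m] @ y
    res_est = abs(H[m, m - 1] * y[m - 1])  # |h_{m+1,m} (y_m)_m|
    return x, res_est
```

The returned estimate matches the true residual norm ||b − A x_m|| without ever forming x_m, which is exactly the point of the derivation above.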
GMRES
The Arnoldi relation:
A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T.
The iterate: x_m = V_m y_m for some vector y_m.
Define H̄_m to be the leading (m + 1) × m block of H. The residual:
r_m = β v_1 − (V_m H_m + h_{m+1,m} v_{m+1} e_m^T) y_m
    = β v_1 − V_{m+1} H̄_m y_m
    = V_{m+1} (β e_1 − H̄_m y_m).
The norm of the residual:
||r_m|| = ||V_{m+1} (β e_1 − H̄_m y_m)|| = ||β e_1 − H̄_m y_m||,
since the columns of V_{m+1} are orthonormal.
We minimize this quantity w.r.t. ym to get the GMRES iterate.
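A dense-algebra sketch of one GMRES solve (using np.linalg.lstsq for the small least squares problem rather than the Givens-rotation update described next; names are my own):

```python
import numpy as np

def gmres_dense(A, b, m):
    """GMRES iterate x_m: minimize ||beta e_1 - Hbar_m y|| over y, x_m = V_m y.
    The small least squares problem is solved with lstsq for clarity."""
    n = len(b)
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))               # Hbar_m
    V[:, 0] = b / beta
    for j in range(m):                     # Arnoldi (modified Gram-Schmidt)
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = w @ V[:, i]
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H, e1, rcond=None)
    x = V[:, :m] @ y
    res_norm = np.linalg.norm(e1 - H @ y)  # equals ||b - A x_m||
    return x, res_norm
```

Note that the residual norm is read off from the small (m+1)-vector, again without forming x_m.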
Givens Rotations
The linear system involving H_m is solved using m Givens rotations.
The least squares problem involving H̄_m is solved using 1 additional Givens rotation.
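A sketch of the triangularization, applied here all at once to a fixed H̄_m (names are my own; in the actual algorithms the rotations are applied one column at a time as H̄_m grows, so each step costs only one new rotation):

```python
import numpy as np

def solve_hessenberg_ls(Hbar, beta):
    """Solve min_y ||beta e_1 - Hbar y|| for an (m+1) x m upper-Hessenberg Hbar
    using one Givens rotation per column. The last entry of the rotated
    right-hand side gives the least squares residual norm."""
    m = Hbar.shape[1]
    R = Hbar.astype(float).copy()
    g = np.zeros(m + 1)
    g[0] = beta
    for j in range(m):
        a, bsub = R[j, j], R[j + 1, j]
        r = np.hypot(a, bsub)
        c, s = a / r, bsub / r
        G = np.array([[c, s], [-s, c]])    # zeros out the subdiagonal entry
        R[j:j + 2, j:] = G @ R[j:j + 2, j:]
        g[j:j + 2] = G @ g[j:j + 2]
    y = np.linalg.solve(R[:m, :m], g[:m])  # R is now upper triangular
    return y, abs(g[m])
```

Because each rotation is orthogonal, ||β e_1 − H̄_m y|| is preserved, and after triangularization the minimum is |g_{m+1}|, available before y is computed.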
Relation between FOM and GMRES iterates
FOM: H_m y_m^F = β e_1, or
H_m^T H_m y_m^F = β H_m^T e_1.
GMRES: minimize ||β e_1 − H̄_m y|| by solving the normal equations
H̄_m^T H̄_m y_m^G = β H̄_m^T e_1.
Note that
H̄_m = [ H_m ; h_{m+1,m} e_m^T ],
so H̄_m^T H̄_m = H_m^T H_m + h_{m+1,m}^2 e_m e_m^T; that is, H_m^T H_m and H̄_m^T H̄_m differ by a matrix of rank one.
Therefore (Sec 6.5.7 and Sec 6.5.8),
If we compute the FOM iterate, we can easily get the GMRES iterate, and vice versa.
It can be shown that the smallest residual computed by FOM in iterations 1, . . . , m is within a factor of √m of the smallest for GMRES.
There is a recursion for computing the GMRES iterates given the FOM iterates:
x_m^G = s_m^2 x_{m−1}^G + c_m^2 x_m^F,
where s_m^2 + c_m^2 = 1 and c_m is the cosine used in the last step of the reduction of H̄_m to upper triangular form.
Practicalities
Practicality 1: GMRES and FOM can stagnate.
It is possible (but unlikely) that GMRES and FOM make no progress at all until the last iteration.
Example: Let
A = [ e^T  ]
    [ I  0 ],   x_true = e_n,   b = e_1,
where the first row of A is e^T and below it sits the (n−1) × (n−1) identity followed by a zero column. (e is the vector with every entry equal to 1.) Then K_m(A, b) = span{e_1, . . . , e_m}, so the solution vector is orthogonal to K_m(A, b) for m < n.
This is called complete stagnation. Partial stagnation can also occur.
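A quick numerical illustration of the example (a sketch; here the FOM projection onto K_m(A, b) = span{e_1, . . . , e_m} is applied directly, so the projected matrix V_m^T A V_m is just the leading block of A):

```python
import numpy as np

# The stagnation example: first row of A is e^T (all ones), the rest is [I 0];
# b = e_1, and the solution is x_true = e_n.
n = 8
A = np.zeros((n, n))
A[0, :] = 1.0
A[1:, :n - 1] = np.eye(n - 1)
b = np.zeros(n)
b[0] = 1.0
x_true = np.zeros(n)
x_true[-1] = 1.0
assert np.allclose(A @ x_true, b)

# FOM on K_m(A, b) = span{e_1, ..., e_m}: solve the projected system.
res = []
for m in range(1, n):
    y = np.linalg.solve(A[:m, :m], b[:m])  # V_m^T A V_m y = beta e_1
    x = np.zeros(n)
    x[:m] = y
    res.append(np.linalg.norm(b - A @ x))
print(res)   # every entry is 1.0: the FOM residual norm never decreases for m < n
```

Each FOM iterate turns out to be x_m = e_m with residual −e_{m+1}, so the residual norm stays at ||b|| = 1 until the final step.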
The usual remedy is preconditioning (discussed later).
Practicality 2: m must be kept relatively small.
As m gets big,
The work per iteration becomes overwhelming, since the upper Hessenberg part of H_m is generally dense. In fact, the work for the orthogonalization grows as O(m^2 n).
The orthogonality relations are increasingly contaminated by round-off error, and this can cause the computation to produce bad approximations to the true GMRES/FOM iterates.
Two remedies:
restarting (Sec 6.4.1 and Sec 6.5.5)
truncating the orthogonalization (Sec 6.4.2 and Sec 6.5.6)
Restarting
In the restarting algorithm, we run the Arnoldi iteration for a fixed number of steps, perhaps m = 100.
If the resulting solution is not good enough, we repeat, by setting v1 to be the normalized residual.
This limits the work per iteration, but removes the finite-termination property.
In fact, due to stagnation, it is possible (but unlikely) that the iteration never makes any progress at all!
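The restarted scheme, often written GMRES(m), can be sketched as follows (names, tolerances, and the use of lstsq in place of the Givens-based update are my own choices):

```python
import numpy as np

def gmres_restarted(A, b, m=5, max_cycles=200, tol=1e-10):
    """Restarted GMRES(m) sketch: run m Arnoldi steps, form the GMRES
    iterate, then restart from the normalized residual."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_cycles):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta < tol:
            break
        V = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        V[:, 0] = r / beta                 # v_1 = normalized residual
        for j in range(m):                 # m Arnoldi steps
            w = A @ V[:, j]
            for i in range(j + 1):
                H[i, j] = w @ V[:, i]
                w = w - H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            V[:, j + 1] = w / H[j + 1, j]
        e1 = np.zeros(m + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H, e1, rcond=None)
        x = x + V[:, :m] @ y               # GMRES update, then restart
    return x
```

The storage and per-cycle work are now bounded by m, at the cost of the finite-termination guarantee discussed above.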
Truncating the orthogonalization
In this algorithm, we modify the Arnoldi iteration so that we only orthogonalize against the latest k vectors, rather than orthogonalizing against all of the old vectors.
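A sketch of the truncated recurrence (incomplete orthogonalization; names are my own). Note that the relation A V_m = V_{m+1} H̄_m still holds exactly, since it only records which multiples were subtracted, but the columns of V are no longer orthogonal and H becomes banded:

```python
import numpy as np

def arnoldi_truncated(A, v1, m, k):
    """Incomplete orthogonalization: orthogonalize A v_j only against the
    latest k basis vectors. A V[:, :m] = V H still holds exactly, but the
    columns of V are no longer (exactly) orthogonal; H has bandwidth k."""
    n = len(v1)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(max(0, j - k + 1), j + 1):   # only the latest k vectors
            H[i, j] = w @ V[:, i]
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

The banded H is what makes short recurrences (and bounded storage) possible in the truncated methods of Sec 6.4.2 and Sec 6.5.6.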