Sparse Approximate Inverse Methods

Bachelor thesis in Applied Mathematics July 2013 Student : K. van Geffen Primary Supervisor : dr. B. Carpentieri Secondary Supervisor : prof. dr. H. Waalkens Abstract In undergraduates numerical mathematics courses I was strongly warned that inverting a for computational purposes is generally very inefficient. Not only do we have to do more computation than factorization, but we also lose sparsity in the matrix. However, doing a search in the literature, I found that for many computational settings the inverse, although dense, may contain many small entries that can be dropped. As a result we approximate the inverse by a . Techniques for constructing such a sparse approximate inverse can be effectively used in many applications of numerical analysis, e.g. for preconditioning of linear systems and for smoothing multigrid methods. I describe some of the most popular algorithms and through some theory, numerical experiments and examples of applications, I show that sparse approximate inverse methods can be competitive, sometimes even superior, to standard factorization methods based on incomplete LU decomposition. Keywords: Sparse approximate inverse, Preconditioning, Krylov subspace methods, Iterative methods

i Acknowledgements

I would like to gratefully thank my primary supervisor dr. B. Carpentieri for the supervision during this work, the research of this thesis would not have been possible without his expertise on the subject. I would also like to thank my primary supervisor for providing me with an abundance of the needed material and for his extensive feedback on the work during the process. Thanks to his guidance the writing of this thesis has been a great learning experience.

ii Contents

1 Introduction 1

2 Krylov Methods 2 2.1 Theoretical background on Krylov Methods ...... 2 2.1.1 The nonsingular case ...... 3 2.1.2 The singular case ...... 3 2.1.3 Some remarks on Krylov methods ...... 5 2.2 Arnoldi Method ...... 5 2.3 Generalization on Richardson iterations ...... 6 2.4 Krylov subspace methods ...... 7 2.4.1 Ritz-Galerkin projection ...... 7 2.4.2 The minimum residual approach ...... 8 2.4.3 Petrov-Galerkin projection ...... 9 2.4.4 The minimum error approach ...... 10

3 Preconditioning 11 3.1 The concept of preconditioning ...... 11 3.1.1 Discussion on preconditioning ...... 14 3.2 Preconditioning techniques ...... 16 3.2.1 Incomplete LU-factorization ...... 16 3.2.2 Zero Fill-In ILU (ILU(0)) ...... 18 3.2.3 ILU(p) ...... 18 3.2.4 ILUT ...... 20

4 Sparse Approximate Inverse Preconditioners 21 4.1 Motivation to sparse approximate inverse techniques ...... 22 4.1.1 A priori pattern selection ...... 23 4.2 Frobenius norm minimization ...... 25 4.2.1 SPAI ...... 26 4.3 Factorized sparse approximate inverses ...... 28 4.3.1 FSAI ...... 28 4.3.2 AINV ...... 30 4.3.3 Factorized vs unfactorized preconditoners ...... 32 4.4 Approximate inverses of ILU-factorizations ...... 32

5 Software Implementation 34

iii 5.1 Software libraries and packages ...... 34 5.1.1 PETSc ...... 35 5.1.2 SPAI ...... 35 5.1.3 ParaSails ...... 36 5.1.4 HSL MI12 ...... 37

6 Numerical experiments 39 6.1 Comparative study on SPAI and ILUT preconditioners ...... 39 6.2 Electromagnetic Scattering - the Satellite Problem ...... 43 6.2.1 Spectral deflation ...... 45

7 Conclusion 49

References 50

Appendices 51

A ILUT algorithm 51

B MATLAB code for GMRES(m) 51

iv 1 Introduction

In my thesis I review state-of-the-art techniques for computing sparse approximate inverses. That is, we consider methods for constructing a sparse approximation to the inverse of some matrix. These techniques can be effectively used in many applications of numerical analysis, e.g. for preconditioning of linear systems and for smoothing multigrid methods. For the purpose of this thesis we are mainly interested in linear systems, the problems we address to are of the form

Ax = b, (1.1) where A is an n × n large and sparse matrix with real entries and b is a given, real, right-hand side. Sparse linear systems are ubiquitous in computational science. They arise, for instance, in the numerical solution of partial differential equations, of inverse problems and in optimization. Direct methods are in general very robust, but they are not feasible for large systems; they are expensive in terms of both work and memory. Modern iterative methods, namely Krylov subspace methods, can solve the bottleneck of memory, but it is now well established that convergence might be slow when the coefficient matrix A is highly nonsymmetric and/or indefinite. With the assistance of a preconditioner, we could try to transform system (1.1) into an equivalent system which is more amenable to an iterative solver. Well-known standard preconditioning techniques compute a nonsingular approximation M to the matrix A and the equivalent system might for example be M −1Ax = M −1b. Usually the construction of M is based on an incomplete LU decomposition. In contrast, we could construct a preconditioner M that directly approximates the inverse matrix, i.e. M ≈ A−1. In this case the equivalent system might be given by MAx = Mb. From numerical experiments it is observed that the inverse of sparse matrix is typically dense, but a lot of the entries are of small magnitude. This justifies the motivation to approximate the inverse by a sparse matrix. In the last decade, a significant amount of research has been devoted to develop such preconditioning techniques. There are several advantages of sparse approximate inverse techniques. The most obvious one is that the preconditioning operation reduces to forming one (or more) matrix-vector products, whereas for standard techniques we require a linear solve. Hence, approximate inverse meth- ods are generally less prone to instabilities. Another issue we address to in this thesis is that sparse approximate inverses are suitable for a parallel environment; matrix-vector products are nowadays already efficiently implemented in parallel, but in many cases the constructing phase of the sparse approximate inverse preconditioner is implemented in parallel as well. The stan- dard techniques, however, are highly sequential. Other issues we address to in these thesis are: pattern selection, the use of factorized or non-factorized preconditioners and numerical software packages. The structure of the thesis is as follows. In Section 2 we give a theoretical background on Krylov subspace methods. In Section 3 we discuss the concept of preconditioning and we review some well-known standard preconditioning techniques. The most popular sparse approximate inverse techniques are described in Section 4. Next, in Section 5, we will see what software packages are provided for some of the described algorithms. In Section 6 we conduct some numerical experiments and we will see an application of sparse approximate inverse in a practical implementation. Finally, in Section 7 we draw some conclusions arising from the work.

1 2 Krylov Methods

In the field of iterative techniques for solving linear systems, Krylov methods are nowadays among the most important. Many well known algorithms, such as the Generalized Minimum Residual (GMRES), Conjugate Gradient (CG), Full Orthogonalization Method (FOM), Lanczos Method, Biconjugate Gradient (Bi-CG) and Biconjugate Stabilized (Bi-CGSTAB) are based on projection techniques on Krylov subspaces [15]. Depending on the properties of the system, one method tends to be more efficient than the other. Later on we give an overview of these methods and tell for what kind of systems each particular method is suitable. But first we will give a theoretical background on Krylov methods and show that Krylov methods can be introduced as a generalization of Richardson iterations. Recall that the standard Richardson iteration is based on the A = I − (I − A). Similar techniques that are based on matrix splitting are the Jacobi and Gauss-Seidel method. As it turns out, Krylov methods are a generalization of these type of methods as well.

2.1 Theoretical background on Krylov Methods

In this theoretical background we do not concern ourselves with technical details that come into play when we consider numerical applications, nor do we go into detail on the concept of preconditioning yet. For now we will just focus on the idea behind the Krylov methods. This means that the following is based on exact arithmetic.

Now, as in all iterative methods, we start with an initial guess x(0). For the purpose of this thesis we assume that there is no information available on the solution x to the system Ax = b and hence we will always choose x(0) = 0. Krylov methods search in the kth iteration for an approximated solution x(k) in the Krylov subspace generated by the vector b, which is defined by1:

k−1 Kk(A, b) := span{b, Ab, . . . , A b}. (2.1)

The subspace is usually simply referred to by Kk. At first glance it is not clear why it makes sense to search for a solution in a Krylov subspace and it is certainly not right away clear why a Krylov method has the opportunity to converge fast. Ilse C.F. Ipsen and Carl D. Meyer tried to answer these questions in an article on Krylov methods [12]. In this theoretical background on Krylov subspace methods, we will follow their reasoning starting with the nonsingular case. After that we consider the singular case and give some restrictions to b that need to hold in order for a Krylov solution x to exist. We will always make a distinction between regular solutions and Krylov solutions. We already know that a regular solution exists if and only if b is in the range of A, i.e. b ∈ R(A). We will see that for singular matrices we have to meet slightly different restrictions for the existence of Krylov solutions.

1In other literature you may find that instead of the vector b, the Krylov subspace is generated by the vector r(0) := b − Ax(0). This is a generalization for methods that start with a nonzero vector.

2 2.1.1 The nonsingular case

When A is nonsingular we know that the unique solution is given by x = A−1b. We wish to find the minimal integer k such that x ∈ Kk. The following definition will help us to find this integer k.

Definition 2.1 The minimal polynomial q(t) of A is the unique monic polynomial of least degree such that q(A) = 0. 

As a consequence of the Cayley-Hamilton theory, the degree of q(t) can not exceed n. Using the fact that every matrix A is similar to a Jordan matrix J, i.e. A = XJX−1, we can construct the minimal polynomial q(t) from J. This is because 0 = q(A) = q(XJX−1) = Xq(J)X−1 and hence q(J) = 0. Let A have d distinct eigenvalues λ1, . . . , λd, then to each λj we can assign an index mj which is the size of the largest Jordan block associated to λj. This allows us to determine the minimal polynomial of A:

d d X Y mj m := mj and q(t) = (t − λj) (2.2) j=1 j=1 where q(t) is of degree m. Notice that this equation holds for singular matrices as well. However, only when A is nonsingular we can use this polynomial to express A−1 in terms of a finite sum of powers from A. First we write q(t) as follows:

m X j q(t) = αjt j=0

Qd mj where α0 = j=1(−λj) . To express the inverse of A in terms of powers of A, we proceed as follows:

m−1 1 X 0 = q(A) = α I + α A + ... + α Am and hence A−1 = − α Aj. (2.3) 0 1 m α j+1 0 j=0

In case A is singular we notice α0 = 0 and the inverse of eqn (2.3) is not well defined. For −1 nonsingular matrices we notice x = A b and hence x ∈ Km. We formulate the result in the following theorem.

Theorem 2.2 Let q(t) be the minimal polynomial of degree m to the nonsingular matrix A, then −1 Km is the smallest Krylov subspace that contains the exact solution x = A b. 

The minimal polynomial gives us some insight on why it makes sense to look for a solution in a Krylov subspace. Also, when A has a low degree minimal polynomial, we might expect fast convergence. Next we will see what the Krylov method computes to singular systems.

2.1.2 The singular case

As said before, every matrix is similar to a Jordan matrix. For this reason we confine this discussion to Jordan matrices, because with a little extra work one could generalize the idea using a similarity transformation. The Jordan matrix is unique up to a permutation of the

3 diagonal blocks of which it exists. Moreover, Jordan blocks from zero eigenvalues are nilpotent; a k × k Jordan block from a zero eigenvalue is a nilpotent matrix of index k. So we can even confine this discussion a little more to Jordan matrices AJ of the form: C 0  A = (2.4) J 0 N where C is a nonsingular Jordan matrix and N is a singular nilpotent matrix of index i (meaning that i is the smallest positive integer such that N i = 0, of course i is equal to the index of the zero eigenvalue of A). To determine whether a Krylov solution exist, we need the following lemma which is stated without proof (see [12]).

Lemma 2.3 For the nilpotent system Nx = c, a Krylov solution for a nonzero right-hand never exists, even when a regular solution does exist. 

With this in mind, we try to find a Krylov solution to the system AJ x = b where AJ is of the form (2.4). First partition both x and b as follows x  b  x = 1 and b = 1 x2 b2 such that Ax = b implies Cx1 = b1 and Nx2 = b2. The second is a nilpotent system and we T know that b2 = 0 must hold in order for a Krylov solution x2 to exist, i.e. b = (b1 0) . Also notice: i C 0  Ci 0  Ci 0 Ai = = = 0 N 0 N i 0 0 From this we conclude that for singular Jordan matrices, a Krylov solution can only exist if b ∈ R(Ai). To see that this is also a sufficient condition we first consider the matrix C. This matrix is nonsingular and hence we know that there exist a minimal polynomial q(t) of degree m − i and a corresponding polynomial p(t) of degree m − i − 1 such that p(C) = C−1 (m can T i not exceed n). Let b = (b1 0) ∈ R(A ), the solution satisfies:

C−1b  p(C) 0 b  p(C) 0  b  x = 1 = 1 = 1 = p(A)b ∈ K (A, b). 0 0 0 0 0 p(N) 0 m−i Ipsen and Meyer proved in their article on Krylov methods that the Krylov solution can be found with a certain pseudo inverse, namely the Drazin inverse2. They also proved that this Krylov solution is unique. In stead of following the proof, we give the conclusive theorem that tells us something about the existence and uniqueness of a Krylov solution for singular systems.

Theorem 2.4 For the singular system Ax = b, a Krylov solution exists and is unique if and only if b ∈ R(Ai) where i is the index of the zero eigenvalue of A. Let m − i be the degree of the D minimal polynomial of A, then for the Krylov solution it holds that x = A b ∈ Km−i(a, b) where AD is the Drazin inverse of A. If b∈ / R(Ai) then the system does not have a Krylov solution in Kn(A, b). 

In conclusion, whenever a linear system is solved with a Krylov subspace solver, either for nonsingular or singular systems, the solution that is found (if it exists) is unique and does not depend on the choice of x(0) .

2For nonsingular matrices, the Drazin inverse generalizes to the ordinary inverse. This shows that there is a gradual transition from the singular to the nonsingular case

4 2.1.3 Some remarks on Krylov methods

Recall the definition of the Krylov subspace from eqn (2.1). From the power method we know that as k increases, the vector Ak−1b becomes more and more part of the eigenspace that corresponds λ2 to dominant eigenvalue of A. Convergence to this eigenspace is of the order O(| |) where λ1 λ1 and λ2 are the dominant and second dominant eigenvalue. In general we might expect that, for k sufficiently large, the set {b, Ab, . . . , Ak−1b} becomes more and more linear dependent. From a numerical point of view this is not favorable and for that reason the Krylov space is always computed in a way such that the space is spanned by an orthogonal set. This is done by an algorithm called the Arnoldi method, see subsection 2.2.

As there exist an iteration step m at which the exact solution is found, Krylov methods could be interpreted as direct methods. However, in practical implementations, one would like to terminate the process of iterating far before the mth step. Also, in general one is satisfied with an approximated solution as long as the approximation holds within reasonable bounds. Since we terminate the process after some k << m steps, we will actually interpret Krylov subspace solvers as iterative techniques.

2.2 Arnoldi Method

th We start of with a Krylov subspace Ki(A, b) at the i iteration step and we assume that we know an orthogonal basis {v1, . . . , vi} for it. In the next iteration step we would like to extend the Krylov subspace with another vector, but now we are only interested in extending the orthogonal basis. In order to do this we compute the vector Avi and orthogonalize it with respect to the vectors v1, . . . , vi. The orthogonalized vector vi+1 is computed in the following way:

i X hi+1,ivi+1 = Avi − hj,ivj (2.5) j=1 where hi+1,i is a constant chosen such that the vector vi+1 is normalized. With the constants hj,i = (vj, Avi) it is ensured that (vj, vi+i) = 0 for j = 1, . . . , i and hence {v1, . . . , vi+1} is an extended orthogonal basis for Ki+1. The process is started with choosing v1 = b/kbk2. i−1 Intuitively speaking, we can say that vi is the normalized part of A b that is linear independent of span{b, Ab, . . . , Ai−2b}. From the theory on Krylov methods we also know that the process terminates at step m, where m is the degree of the minimal polynomial. So when we follow the Arnoldi method we can state:

Ki(A, b) = span{v1, . . . , vi}, ∀i ≤ m (2.6) For i = m the Krylov subspace becomes invariant under A. Recall the definition on invariance.

Definition 2.5 A set V is invariant under A if for every v ∈ V it holds that Av ∈ V. 

There is a very compact form of notation for the Arnoldi method. When we define the n × i matrix Vi which consist of the column vectors v1, . . . , vi and the (i + 1) × i upper Hessenberg matrix Hi+1,i where the elements hk,l are defined by the Arnoldi method, then from eqn (2.5) it is clear that the following holds: AVi = Vi+1Hi+1,i (2.7) This form of notation will be very helpful in understanding certain Krylov methods.

5 2.3 Generalization on Richardson iterations

We will now consider iterative solvers that are based on matrix splitting. It will become clear that these solvers indirectly search for an approximated solution in some Krylov subspace of increasing dimension in each iteration step. As an example we first consider the Richardson iteration, based on the standard splitting A = I − (I − A). The corresponding iteration is given by: x(i) = b + (I − A)x(i−1) = x(i−1) + r(i−1) where r(i−1) := b − Ax(i−1) (2.8) and an iterative scheme for the residual is given by:

r(i) = (I − A)r(i−1) = (I − A)ir(0). (2.9)

In general, matrix splitting is of the form A = N − P where N is considered to be some approximation of A for which a linear system of the form Ny = d is easily solvable and requires low costs on operations and memory. This idea is also on the basis of preconditioning and we will return to this subject later.

For the Richardson iteration we have chosen N = I and we hope that the polynomial Pi(A) = (I−A)i converges to zero reasonably fast. Of course in most cases, I will be a poor approximation of A. Alternatives are the Jacobi and Gauss-Seidel methods, where one chooses N to be the diagonal part and the lower triangular part of A respectively. Whenever N is chosen to be unequal to identity, we can view N as an preconditioner on the system Ax = b. Defining B := N −1A and c := N −1b the system to be solved becomes Bx = c and the splitting on A with N 6= I becomes equivalent with the standard splitting on B to the transformed system. Hence, if the Richardson Iteration generalizes to Krylov subspace solver, all iterative solvers based on matrix splitting do. See e.g. [9, Chapter 7].

We again assume x(0) = 0 and hence r(0) = b. Using equations (2.8) and (2.9) we obtain:

i−1 X x(i) = b + r(1) + ... + r(i−1) = (I − A)jb i=0 from which we conclude

(i+1) i−1 x ∈ span{b, Ab, ..., A b} = Ki(A, b).

As it seems, Richardson iteration approximates solutions to the linear system indirectly from a Krylov subspace of increasing dimension. In Krylov subspace methods we compute the best approximation to the exact solution that is contained in the Krylov subspace of increasing di- mension. Hence, Krylov methods are more efficient than algorithms that are based on matrix splitting.

6 2.4 Krylov subspace methods

(k) In Krylov subspace methods we are looking for a solution x ∈ Kk that is in some sense the best approximation to the solution x. Finding a solution in a subspace is done by projection and there are different types of projection that we could use. In all projections we are dealing with some sort of minimization problem. Depending on the problem, one type of minimization may be numerical more favorable then the other. We consider four types of projections [9, Chapter 8]:

1. The Ritz-Galerkin approach requires that

(k) b − Ax ⊥ Kk(A, b) (2.10)

2. The minimum residual approach requires:

(k) kb − Ax k2 to be minimal over Kk(A, b) (2.11)

3. The Petrov-Galerkin approach requires that:

b − Ax(k) is orthogonal to some other suitable k-dimensional subspace (2.12)

4. The minimum error approach requires:

(k) T T kx − x k2 to be minimal over A Kk(A , b) (2.13)

We will treat briefly some (aforementioned) popular methods. Methods as CG, the Lanczos method and FOM rely on the Ritz-Galerkin approach. The minimum residual approach has lead to methods as GMRES and MINRES. The Petrov-Galerkin approach gives rise to methods like Bi-CG and BI-CGSTAB. And at last, a method that relies on the minimum error approach is for example GMERR.

2.4.1 Ritz-Galerkin projection

We begin with a formulation of the the solution to the best approximation in a general sense: (k) k since x ∈ Kk = R(Vk), see eqn (2.7), we know that there exist y ∈ R such that:

(k) x = Vky. (2.14)

When we have constructed a basis {v1, . . . , vk} the best approximation is determined by finding y. This will hold in general for Krylov methods, but the solution we may find depends on the type of projection we choose. We will begin with the Ritz-Galerkin approach, recall that for (k) this approach we require that b − Ax ⊥ Kk(A, b). The most common method based on this approach is the Full Orthogonalization Method (FOM).

Full Orthogonalization Method In FOM we do not require A to possess any special properties and we can just apply the Ritz- Galerkin projection. Hence, we find that y must satisfy the following condition:

T (k) T Vk (b − Ax ) = Vk (b − AVky) = 0.

7 T From Arnoldi Method we know that b = kbk2v1 and hence Vk b = kbk2e1, where e1 is the first k T T canonical unit vector in R . From Arnoldi Method we also know that Vk AVk = Vk Vk+1Hk+1,k = Hk,k. If we combine this results with the Ritz-Galerkin projection, we find that y is the solution of the following linear system: Hk,ky = kb|k2e1 This system can be quite efficiently solved by performing k − 1 Givens rotations on the diagonal- and subdiagonal entries of Hk,k, after which we need to solve a lower triangular system. In essence we are performing an efficient QR factorization on the Hessenberg matrix. As it turns out this method becomes significantly more efficient when A is symmetric, it is known as the Lanczos method. If A is symmetric positive definite (SPD) there are some additional numerical benefits which lead to the even more efficient Conjugate Gradient method (CG).

Lanczos method T To see why a symmetric matrix leads to a more efficient algorithm, consider the matrix Vk AVk. As we know it equals an upper Hessenberg matrix with entries we know from Arnoldi’s method, moreover we see that it is symmetric and we conclude that the Hessenberg matrix is tri-diagonal. T The corresponding notation is as follows: Vk AVk = Tk,k. And now the best approximation is found by solving the tri-diagonal system:

Tk,ky = kbk2e1

Applying Arnoldi’s method simplifies as well, since in the next iteration step Avk has to be orthogonalized with respect to only the previous two vectors vk and vk−1. In practical imple- mentations there are algorithms that prevent the necessity to store all the vectors {v1, . . . , vk}, therefor the cost in terms of computation and memory are reduced tremendously, especially when n is very large.

Conjugate Gradient method There are different ways in which one can interpret the CG method. When we follow the above reasoning, the main benefit of the Conjugate gradient method is that we can solve the tri- diagonal even more efficiently by performing an LU factorization on Tk,k. By symmetric positive definiteness we know that this factorization exist and the tri-diagonal form makes it possible to do this in an way that is low in computational cost. Another way of interpreting the CG method is that we try to minimize the A-norm of the error e(k) in each step. We do this by making the error A-orthogonal to {v1, . . . , vk}, by using the A-norm we prevent that we actual have to compute any errors, instead the residuals are used in an elegant way.

2.4.2 The minimum residual approach

(k) In the minimum residual approach we want to minimize kb − Ax k2 = kb − AVkyk2 on the Krylov subspace Kk. Consider the following:

kb − AVkyk2 = kb − Vk+1Hk+1,kyk2

= kVk+1(kbk2e1 − Hk+1,ky)k2

But as Vk+1 is orthogonal, the norm isn’t influenced by it. Hence we are interested in minimizing kkbk2e1 − Hk+1,ky)k2 over the space Kk. This is done by determining the solution of Hk+1,ky = kbk2e1

8 in the minimum norm sense. For general matrices this approach leads to the GMRES algorithm. For symmetric matrices, Hk+1,k reduces again to a tri-diagonal matrix. Similar to the Lanczos method, this results to a more efficient algorithm known by MINRES.

2.4.3 Petrov-Galerkin projection

For nonsymmetric matrices we notice that the Ritz-Galerkin approach becomes to expensive in both memory and computational cost. To overcome this problem, we would like to mimic, in some way, the 3-term recurrence that we encounter for symmetric matrices. We can do this with a Petrov-Galerking approach (2.12): require that b − Ax(k) is orthogonal to some other suitable k-dimensional subspace. This is what is done in the Bi-Lanczos and Bi-CG method.

To find the suitable subspace, we have to perform Arnoldi’s method in a slightly different way. The idea is that when we have constructed in some way Vi, the suitable basis should satisfy T Wi Vi = Di := [dii], some diagonal matrix with diagonal entries dii := (wi, vi), and besides that T it should satisfy Wi vi+1 = 0, i.e. Wi and Vi form a bi-orthogonal basis. Then it follows that: T Wi AVi = DiHi,i and we would like to choose Wi such that Hi,i is tri-diagonal. This can be realized by defining the bi-orthogonal basis sets {v1, . . . , vi} and {w1, . . . , wi} in the following manner:

i X hi+1,ivi+1 = Avi − hj,ivj j=1

i T X hi+1,iwi+1 = A wi − hj,iwj j=1 where hi+1,i is chosen such that vi+1 is normalized. Notice we generate wi with the transpose of A. With the constants hj,i = (wj, Avi)/dj,j for j = 1, . . . , i it is ensured that the sets are bi-orthogonal, i.e. (wj, vi) = 0 for i 6= j. In matrix notation this gives: T Wi AVi = DiHi,i T T Vi A Wi = DiHi,i

Hence DiHi,i is symmetric and therefor tri-diagonal, notation: DiTi,i. This leads to the required 3-term recurrence. The process is started by taking v1 = b/kbk2 and choosing some w1 6= 0 such that (w1, v1) 6= 0.

th As in the Lanczos method, the following equality holds AVk = Vk+1Tk+1,i. In the k iteration step, the Bi-Lanczos method performs the projection similar to the Lanczos method: T (k) T Wk (b − Ax ) = Wk (b − AVky) T = Wk b − DkTk,ky = 0 −1 T −1 T Now consider the following: D Wk b = D Wk v1kbk2 = kbk2e1. We conclude that, similar as to the Lanczos method, we end up with a tri-diagonal system for y of the same form:

Tk,ky = kbk2e1 This method is known as the Bi-Lanczos method. For symmetric matrices this leads to even shorter recurrences such as the Bi-CG and Bi-CGSTAB algorithms.

9 2.4.4 The minimum error approach

The minimum error approach was formulated as follows. We require:

(k) T T ||x − x ||2 to be minimal over A Kk(A , b).

With this approach we are, for some reasons that are beyond the scope of this thesis, able to minimize the forward error. Methods in this class are SYMMLQ and GMERR.

10 3 Preconditioning

Preconditioning is a technique that is commonly used to accelerate convergence. In practice many iterative methods, including Krylov subspace methods, converge very slow for unpreconditioned systems. By preconditioning the system we try to modify the system in such a way that an iterative method converges significantly faster. In this section we first treat the concept of preconditioning in general. Then we will treat some well-known preconditioning techniques that are based on incomplete factorizations.

3.1 The concept of preconditioning

In order to understand the concept of preconditioning we need to know what properties of the system causes the bad convergence behavior. Next we will see how these properties can be improved by a preconditioning technique. The eigenvalues of the matrix play an important role in this discussion. Recall the definition on the spectrum of a matrix.

Definition 3.1 The set of all the eigenvalues of A is called the spectrum of A and is denoted by σ(A). 

Another definition that will be useful in our discussion on preconditioning is the condition number of a matrix.

Definition 3.2 The condition number of a matrix A is the quantity

K(A) = kAkkA−1k where k.k is any induced matrix norm. 

In general the condition number K(A) depends on the choice of the norm and it is not defined if A is singular. The quantity plays an important role in studying the stability properties of linear systems. (k) Now recall from section 2 eqn (2.1) that for Krylov subspace methods x ∈ Kk. Hence for (k) (k) the residuals it follows that r = b − Ax ∈ Kk as well. We could also write this as:

(k+1) (0) r = Pk(A)r (3.1) where Pk is a polynomial with degree k that satisfies Pk(0) = 1. Notice that in this thesis we will always assume r(0) = b as we assumed x(0) = 0. In methods such as GMRES and MINRES we aim to minimize the residuals at each iteration, hence for these methods we have:

(k) (0) kr k2 = minkPk(A)r k2 (3.2) Pk where we minimize over all polynomials Pk with degree k or less with Pk(0) = 1. In a similar way we could derive that for the same class of polynomials the CG method satisfies

(k) (0) ke kA = minkPk(A)e k2 (3.3) Pk

(k) (k) (k) where ke kA := (e , Ae ) is a well defined A-norm for SPD matrices. For that reason we will first consider the case where A is SPD. In this case there exist an orthogonal matrix Q and

11 T (k) 1/2 (k) a diagonal matrix D such that A = QDQ . Notice we could write ke kA = kA e k2. Then it follows from equations (3.2) and (3.3) that:

(k+1) T (0) (0) kr k2 = minkQPk(D)Q r k2 ≤ minkPk(D)k · kr k2 Pk Pk and

(k+1) 1/2 (0) T 1/2 (0) ke kA = minkA Pk(A)e k2 = minkQPk(D)Q A e k2 Pk Pk (0) ≤ minkPk(D)k · ke kA Pk where k.k denotes some appropriate matrix norm. The inequality follows because:

T T minkQPk(D)Q wk2 ≤ kQPˆk(D)Q wk2 ≤ kPˆk(D)k · kwk2 Pk where we define Pˆk to be the polynomial that minimizes kPk(D)k. We conclude:   (k+1) (0) kr k2/kr k2 ≤ min max |Pk(λi)| for MINRES (3.4) Pk i=1,...,n and   (k+1) (0) ke kA/ke kA ≤ min max |Pk(λi)| for CG. (3.5) Pk i=1,...,n −1 It is not clear whether the bounds are sharp, since the polynomial that minimizes kQPk(D)Q wk2 might not be the same as the one that minimizes kPk(D)k. Nonetheless the upper bound provides us with qualitative information; a small upper bound corresponds to the case of fast convergence. Hence, the smaller the values of the minimizing polynomial Pk (with Pk(0) = 1) on the set σ(A) are, the faster convergence we may expect. For SPD matrices we can simplify the bounds of equations (3.4) and (3.5). It can be shown that [10]: √    κ − 1k min max |Pk(λi)| ≤ 2 √ , κ := λmin/λmax (3.6) Pk i=1,...,n κ + 1 and hence the bounds do not depend on the entire spectrum of A, but only on ratio of the largest to smallest eigenvalue of A. However, for nonsymmetric matrices we could derive for the GMRES method a similar bound as in eqn (3.4). Assume for simplicity that A has a complete set of eigenvalue, i.e. there exist a non singular matrix V and a diagonal matrix D such that A = VDV −1. Then the bound is given by:   (k+1) (0) kr k2/kr k2 ≤ K(V ) · min max |Pk(λi)| for GMRES (3.7) Pk i=1,...,n where K(V ) = kV kkV −1k is the condition number. In this case we expect the bound to depend on the entire spectrum of A. We can understand this by analyzing the following scenarios:

ˆ As mentioned before we require k  n so that we think of Pk as a low degree polynomial. Consequently, when the eigenvalues are widely spread in the complex plane, the polynomial cannot be small at a large number of such points. ˆ Similarly, when a large number of eigenvalues are located around the origin, then a poly- nomial of low degree cannot be 1 at the origin and have small values at a large number of such points located around the origin.

12 ˆ In contrast, eigenvalues clustered around a single point c away from the origin are favorable. k Take for example the polynomial Pk(z) = (1 − z/c) , hence Pk(0) = 1 and the polynomial has small values for z ∈ σ(A). Note that we do not claim that this is the minimizing poly- nomial. Matrices with clustered eigenvalues will typically show good convergence behavior.

It appears that a clustered spectrum is in general good for convergence. Also for SPD matrices, the bound of equations (3.6) is reduced when the eigenvalues are clustered. Hence, it is of no surprise that the main goal of preconditioning is to transform the system with a precondition- ing matrix (or preconditioner) M such that the eigenvalues of the preconditioned system are clustered.

We can transform the system by applying the preconditioner to the original system. We will treat three different types of preconditioning:

1. Left-preconditioning. Apply the iterative method to:

M −1Ax = M −1b (3.8)

2. Right-preconditioning. Apply the iterative method to:

AM −1u = b, x := M −1u (3.9)

3. Two-sided preconditioning. In many applications the preconditioner is constructed in factored form M = M1M2. In this case we can apply apply the iterative method to:

−1 −1 −1 −1 M1 AM2 u = M1 b, x := M2 u (3.10)

Whenever we precondition a system, we are in essence solving a different system. For ex- ample, when we are left-preconditioning a system we are solving a system M −1Ax = M −1b. From the Arnoldi method we know that in the first step we have to compute the vector v1 = −1 −1 −1 M b/kM bk2, hence we need to compute the vector c = M b. We do not want to determine the inverse of M explicitly as this is likely to be too expensive. Instead we determine c by solving the system Mc = b indirectly. From eqn (2.5) we know that in the following steps we have to solve systems of the form M −1Au = v. This is done in two steps, first we apply the matrix-vector product u1 = Au and next we solve Mv = u1 indirectly. Here the subscript on u1 is to emphasize that it is a dummy variable. Hence, in order to be useful for numerical applications we require that the system Mx = b is much easier solved than the original system Ax = b. Note that we will almost never determine M −1 explicitly. This is with the exception of preconditioning techniques where M is a sparse approximate inverse preconditioner, i.e. the preconditioner is a sparse matrix such that M ≈ A−1. In this case left-preconditioning is of the form MA = Mb and now in each step we have to perform an additional matrix-vector product in stead of the additional linear solve we had to perform before. Matrix-vector products are, especially for sparse matrices, in general much cheaper to perform than linear solves. The construction of sparse approximate inverses is however less straightforward and we will treat this in the next section more extensively.

Another issue of preconditioning is that M should be chosen such that the preconditioned system is better conditioned; the condition number is directly related to the stability of a linear system. Recall the definition on the forward error.

13 Definition 3.3 Let x be such that Ax = b and let x∗ be an approximated solution to the system ∗ corresponding to A and b. The forward error is the quantity δx such that x = x − δx. 

With a posteriori analysis we aim to find the nearby system of Ax = b for which x∗ is the exact solution, i.e. we aim to find the minimal perturbations δA and δb such that:

(A + δA)(x + δx) = b + δb.

This is known as the backward error. Now assume for simplicity that δA = 0, hence A(x + δx) = b−r where we defined the residual r := b−Ax∗ = A(x−x∗) = −Aδx. With a posteriori analysis it can be shown that the forward error is related to residual in the following way [14, Section 3.1]: kδxk krk 2 ≤ K(A) 2 (3.11) kxk2 kbk2 where the condition number has the property K(A) ≥ 1. We say that the system is ill-conditioned when the condition number is relatively large. In general, a good preconditioner M will improve the condition number. That means M is chosen such thatK(M −1A)  K(A). In this case we expect from eqn (3.11) a more accurate approximated solution to the linear system.

3.1.1 Discussion on preconditioning

The concept of preconditioning is in theory quite straightforward, but in practical implemen- tations there are a lot of technical details we have to take in account. In this discussion on preconditioning we discuss some of the major issues.

When we construct a preconditioner M we have to make up a balance. A well chosen precondi- tioner will decrease the amount of iterations steps needed, but as a rule a good preconditioner is likely to be expensive to construct. Also, as we will see later, a very good preconditioner may contain more entries in comparison to other cheap preconditioners. In this case we also have to take in account the additional work needed per iteration step. In general the amount of information we have from the spectrum of A and that of the preconditioned system is limited. Therefore we can not state a priori much about the convergence behavior of the preconditioned system. For example, the spectrum of an indefinite systems may have eigenvalues on both sides of the complex plane. When we precondition such a system it may happen that the preconditioned matrix has eigenvalues close to zero. It can be shown that this is bad for convergence. Hence, improving a preconditioner can potentially be bad for convergence. This is merely to clarify that the construction of a preconditioner is not an exact science. We discussed that convergence of Krylov methods depends for an important part on the spectrum of A, or on the spectrum of the preconditioned matrix when a preconditioner is applied. We have mentioned three types of preconditioning. Notice the following:

det(M −1A − λI) = det(M −1(A − λM)) = det(A − λM)det(M −1) = det(AM −1 − λI).

14 In a similar fashion we can show the following for preconditioners in factored form M = M1M2:

−1 −1 −1 det(M A − λI) = det(M2 M1 A − λI) −1 −1 = det(M2 (M1 A − λM2)) −1 −1 = det(M1 A − λM2)det(M2 ) −1 −1 = det(M1 AM2 − λI). We conclude that spectrum of the preconditioned matrices does not depend on the type of implementation. However M −1Av = λv =6 ⇒ AM −1v = λv and hence the set of eigenvectors does depend on the choice of implementation. As convergence also depends on the degree in which the initial residual b − Ax(0) is in the dominant eigenvector direction, the convergence behavior will depend on the chosen implementation.

There are other issues we have to keep in mind when we apply left-, right- or two-sided precon- ditioning.

ˆ When we are applying left-preconditioning this will have effect on the projection of an approximated solution on the Krylov subspace. For example, in the preconditioned GMRES algorithm we are minimizing the residual

M −1(b − Ax(k))

which may differ significantly from the actual residual (b−Ax(k)) when M is ill-conditioned, see eqn (3.11). ˆ When we are applying right-preconditioning the stopping criteria may be based on

(k) ku − u k2

(k) −1 (k) which may differ significantly from the actual error kx−x k2 = kM (u−u )k2 when M is ill-conditioned. A benefit from these type of preconditioning is that we do not transform the right-hand side of the the system, see eqn (3.9).

For applications in a general sense we distinguish two types of preconditioners. The first type is the problem-specific preconditioner for which construction is based on very specific information of the problem, such as its geometry and physical properties. Mostly these type of problems are related to applications involving partial differential equations (PDEs). For a narrow class of problems one may obtain good preconditioners in this way, however, this approach requires a deep understanding of the problem. For a broad range of problems this problem specific information is very difficult to exploit or might not even be available. In this case a preconditioner can be based purely on the information that is contained in the coefficient matrix A. These are called algebraic or general-purpose preconditioners. Preconditioner of these types are generally not as efficient as the problem-specific types, however, the achieved convergence can certainly be reasonable good. Finally, one of the most important issues in the development of preconditioning techniques nowadays is the need for parallel processing. Krylov subspace methods are already efficiently implemented in parallel on high-performance computers. However, preconditioning is currently the main stumbling block in achieving high performance for large, sparse linear systems [2]. There is still a need for preconditioners which are inherently parallelizable both in construction as in implementation.

15 The characteristics of a good preconditioner are summarized:

ˆ The preconditioner clusters the eigenvalues of A to a point away from the origin.

ˆ The preconditioner is (relatively) cheap to construct. ˆ The preconditioner does not demand much memory. ˆ Linear solves with the preconditioner are cheap to perform.

ˆ The preconditioner improves the condition number of A. ˆ The preconditioner is parallelizable, both in construction as in implementation.

3.2 Preconditioning techniques

The most straightforward approach in preconditioning techniques is obtained by considering the exact solution by a direct method. A well-known direct method is the LU-factorization. The factors are constructed by the Gaussian elimination algorithm and the system to be solved is LUx = b where L and U T are lower triangular matrices. A preconditioning technique that is derived from this method is the incomplete LU-factorization (ILU). The method constructs a preconditioner M = L˜U˜ (3.12) from incomplete factors and tries to capture the biggest entries from L and U.

3.2.1 Incomplete LU-factorization

In an ILU factorization we force the factors to have some sparsity pattern. This is done by performing Gaussian Elimination and dropping elements from nondiagonal positions in a suitable way. The factorization is as follows: A = L˜U˜ − R (3.13) where R is a residual matrix. In general ILU-factorizations we define a zero pattern P such that

P ⊂ {(i, j) | i 6= j; 1 ≤ i, j, ≤ n} (3.14) and drop elements during Gaussian Elimination that are part of the zero pattern. Suppose we have predetermined the zero pattern P . In this case we perform a Static Pattern ILU- factorization. This is implemented in the following algorithm [15]: There are some remarks on the above algorithm:

ˆ The process may terminate due to a zero pivot. ˆ In practice the for-loops are run more efficiently by exploiting the zero pattern P . When P has some known structure we can implement this in the second and third for-loop such that we only evaluate these loops at a small amount of nonzero entries. In this way the algorithm can still be feasible for large n.

16 Algorithm 3.1 General Static Pattern ILU for each (i, j) ∈ P do set aij = 0 end for for k = 1, . . . , n − 1 do for i = k + 1, . . . , n and if (i, k) ∈/ P do aik = aik/akk for j = k + 1, . . . , n and for (i, j) ∈/ P do aij = aij − aik ∗ akj end for end for end for

ˆ Since the diagonal elements of L˜ are all equal to one, we don’t need to store them. Hence we could overwrite A during the process to store L˜ in the strict lower triangular part and U˜ in the upper triangular part. This is done with a variant of Gaussian elimination, called the IKJ variant. This is implemented in the algorithm 3.2. Although we could overwrite the matrix A to store both factors L˜ and U˜, this is in general not how the algorithm is implemented. Every iterative Krylov method requires to perform matrix-vector products with A, see e.g. section 2.2 on the Arnoldi method. In practical implementations we need to allocate additional memory for storing the preconditioner.

Algorithm 3.2 General Static Pattern ILU, IKJ variant for each (i, j) ∈ P do set aij = 0 end for for i = 2, . . . , n do for k = 1, . . . , i − 1 and if (i, k) ∈/ P do aik = aik/akk for j = k + 1, . . . , n and for (i, j) ∈/ P do aij = aij − aik ∗ akj end for end for end for

For general ILU-factorizations it can be shown that the following theorem holds, we will state it without proof.

Theorem 3.4 Algorithm 3.2 produces factors L˜ and U˜ such that

A = L˜U˜ − R and the entries of R are such that

rij = 0 when (i, j) ∈/ P and the elements −rij for (i, j) ∈ P are determined by algorithm 3.2. 

17 The nonzero entries −rij for (i, j) ∈ P are called fill-in elements. We conclude that for a preconditioner M = L˜U˜ we have mij = aij for (i, j) ∈/ P . Hence, depending on the choice of the zero pattern, M might be a good approximation of A. Matrix-vector products with ILU-preconditioner of the form v = M −1u are computed in two steps:

u = Mv = L˜Uv˜

=⇒ Lu˜ 1 = u

=⇒ Uv˜ = u1 where u1 is a dummy variable. In practical implementations we overwrite u1 when we are solving for v.

3.2.2 Zero Fill-In ILU (ILU(0))

For a complete LU factorization of a matrix A the factors L and U are typically less sparse than the original matrix A. In ILU(0) we allow no fill-in, i.e. we take as a zero pattern P the zero pattern of A: PA = {(i, j) | aij = 0, i 6= j} (3.15) ILU(0) is implemented with this P in algorithm 3.2. This technique ensures us that sparsity properties are maintained in the preconditioner.

3.2.3 ILU(p)

It is possible that ILU(0) produces a preconditioner that is not a good approximation to A. In ILU(p) we allow more fill-in in the factors Lp and Up in order to produce a more accurate approximation of A [15]. We will show how ILU(1) is performed and how this process generalizes to performing ILU(p) for p > 1. To this end we consider an example with a 5-point matrix that is derived from the finite difference discretization of Poisson’s equation on the unit square. Hence we consider the problem:  ∂2 ∂2  + u(x, y) = f(x, y) (3.16) ∂x2 ∂y2 for some function f and (x, y) ∈ [0, 1] × [0, 1]. We discretize the problem by applying a 5-point operator on a regular n-by-n mesh on the unit square. In this way we obtain a block tridiagonal matrix. In MATLAB we construct with the command A = gallery(´Poisson´, 4) the matrix that corresponds to the considered problem. Next we compute the ILU(0) factorization with the command [L0,U0] = ilu(A,setup) where we have set setup.type = ´nofill´. In figure 1 3 we represented the nonzero structure of the matrices A, L0, U0 and the product L0U0. Notice that there is indeed some additional fill-in in the product of the incomplete factors.

We can use the nonzero structure of the product L0U0 to allow more fill-in for some new to construct factors L1 and U1. This is what is done in ILU(1). We have performed ILU(1) on the matrix A with algorithm 3.2 where we defined the zero pattern by the complement of the nonzero structure of L0U0. In figure 2 we represent sparsity patterns of the factors L1, U1 and the incomplete product L1U1. Notice that the additional fill-in has increased. However, with

3The nonzero structure, or sparsity pattern, of a matrix is a set of the row and column indices corresponding to the nonzero entries of the matrix.

18 L0 U0 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

A L0U0 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

Figure 1: Four matrices involved in ILU(0): A, L0, U0 and L0U0 the commands normest(A-L0*U0) and normest(A-L1*U1), where L1 and U1 correspond to the incomplete factors obtained by ILU(1), it is shown that the incomplete factorization obtained by ILU(1) is a much better approximation to A.

If we define the ILU(0) factorization by [L0,U0] = ilu(A, 0), then [L1,U1] = ilu(A, 1) is the ILU(1) factorization that uses the pattern of L0U0. Thus in general [Lp,Up] = ilu(A, p) is the ILU(p) factorization that uses the pattern of the previously determined product Lp−1Up−1. As p increases this process will become too expensive. Therefore, in practice the same process is approximated by using clever algorithms where we initially associate each entry aij with a level of fill, denoted by levij, such that:

 0 if a 6= 0 or i = j lev = ij ij ∞ otherwise

During the process of Gaussian elimination we update the level of fill whenever an element is modified. This is done in the following way:

levij = min{levij, levik + levkj + 1}

For diagonal dominant matrices this approach ensures that a high level of fill is associated with a small entry. Evaluated entries in algorithm 3.2 are dropped whenever the level of fill is bigger than p.

19 L1 U1 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

L0U0 L1U1 0 0

5 5

10 10

15 15

0 5 10 15 0 5 10 15

Figure 2: Four matrices involved in ILU(1): L0U0, L1, U1 and L1U1

3.2.4 ILUT

For matrices that are not diagonally dominant it is possible that in the factors there are many small entries with a low level of fill. The preconditioner is not significantly improved by these small entries, hence the ILU(p) factorization is not very efficient in this case. In an alternative ILU factorization we drop these entries based on their numerical value, this method is known as ILUT. In algorithm A.1, see appendix A, we give an outline of the ILUT algorithm. In this th algorithm the i row of a matrix A is denoted by the MATLAB notation ai∗.

In the ILUT algorithm we include to algorithm 3.2 a set of rules for dropping small elements based on their numerical values.

ˆ We define a drop tolerance τ and for each i in the first for-loop, line 1, we define τi = τ · kai∗k2. Next we drop elements that are smaller than τi. By defining τi for each i we take in account that a matrix might be badly scaled. This is what is done in line 5.

ˆ In line 10 we again drop elements smaller than the relative tolerance τi. Besides that we restrict the use of memory storage by allowing a maximal amount of fill. For both line 11 and 12 we keep only (at most) the p largest elements.

By this dropping rules we reduce both the computational cost and the memory usage. The benefit of ILUT in comparison to ILU(p) is that we determine the level of fill in the factors based on there numerical values rather than on the structure of A.

20 4 Sparse Approximate Inverse Preconditioners

A different type of preconditioning is established by constructing a direct approximation of the inverse. With the preconditioner M ≈ A−1 the preconditioning step reduces to performing one ore more (sparse) matrix-vector products. In subsection 3.1.1 we discussed that in recent developments there is a need for efficient general-purpose, parallel preconditioners. The standard incomplete factorization techniques we described in the previous section are in general highly sequential; this becomes clear from algorithm 3.2, where in each step of the process information from the previous steps is needed to proceed. Similarly, linear solves of the form z = M −1y are highly sequential for incomplete factorization techniques. Hence, these standard techniques are certainly not inherently parallelizable. As we will see later, sparse approximate inverse techniques are indeed inherently parallelizable. The need for parallel processing has been the main driving force in the development of these techniques. There is yet another reason why approximated inverse techniques may be favored over stan- dard incomplete factorization techniques; ILU preconditioner are known to have a high failure rate for indefinite systems. In an article by Chow and Saad on ILU preconditioners for indefinite systems, it is explained that failures may be caused by breakdowns due to zero pivots, inaccuracy and instability of triangular solves [8]. We treat this issues in more detail and make clear that approximated inverse techniques do not suffer from these problems.

Incomplete LU factorizations may suffer from the same problems as complete LU factorizations do. Small pivots are common for indefinite systems and as a consequence entries in the fac- tors can grow uncontrollably. The factorization becomes unstable and the factorization will be inaccurate, which means that LU will not be a good approximation to A. Besides that, small pivots lead to unstable triangular solves; linear systems corresponding to L and U might be ill-conditioned. Moreover, this problem might even occur without the presence of small pivots. To fully understand the consequences to this particular case, consider a preconditioner M = L˜U˜ from incomplete factorization. Assume M is an accurate preconditioner, that is M ≈ A. By applying two-sided preconditioning we find from equation (3.13) the following equality:

L˜−1AU˜ −1 = I + L˜−1RU˜ −1 where the term R is negligible. In order for the system to be efficiently preconditioned, the term L˜−1RU˜ −1 should be negligible as well. However, it is possible that the incomplete factors are unstable even when M is an accurate preconditioners. This means that the last term is not necessarily small. Approximated inverse preconditioners do not suffer from this problem as the preconditioning step consist of a matrix-vector product. In these techniques we may expect M to be a satisfactory preconditioner as long as it is a good approximation to A−1.

In sparse approximate inverse techniques we assume that we are able to approximate the inverse of a sparse matrix with another sparse matrix M. This is not necessarily true since the inverse of sparse matrices is generally dense. However, for many sparse matrices it turns out that a lot of entries are very small. We will see an example of a matrix with this properties in subsection 6.1 on numerical experiments with an sparse approximate invers.

21 There are different sparse approximate inverse techniques and they can be grouped in three categories. The different techniques are based on:

ˆ Frobenius norm minimization. ˆ Factorized sparse approximate inverses. ˆ Approximate inverses of ILU-factorizations.

First we will give the motivation to approximate the inverse by a sparse matrix and then will treat the three categories of sparse approximate inverse techniques separately.

4.1 Motivation to sparse approximate inverse techniques

Many linear systems arise from partial differential equations of physical problems. In this field of research it was observed that, very often, the inverse contains many small entries. We would like to be able to predict where the smallest entries are located such that we could drop these entries in an approximation of the inverse. We have already seen that for standard techniques, based on incomplete LU decomposition, pattern selection can be a big issue in the construction of the preconditioner. As we will see later, the same is true for sparse approximate inverse techniques. However, once the sparsity pattern is determined, the construction of a sparse approximate inverse is in many cases fairly straightforward. In this subsection we will also describe how a sparsity pattern can be selected a priori. Now consider again the example of the Poisson equation from eqn (3.16), this time on some 2D domain Ω. Let P = (x, y) ∈ Ω, then the exact solution on Ω is given by: Z u(P ) = G(P,Q)f(Q)dQ Ω where Q = (ξ, η) is a variable point used for integration and G is the Green’s function. For the Poisson equation the Green’s function is given by: 1 1 G(x, y, ξ, η) = ln . 2π p(x − ξ)2 + (y − η)2

We notice that there is a rapid decay of the Green’s function when the distance between P and Q increases. Also, if we discretize the problem, we obtain a linear system of the form Ax = b. The matrix A is a discrete approximation to the continuous operator of the Poisson Equation and we observe from the exact solution that A−1 should be a good, discrete, approximation to the Green’s function. The inverse will be typically dense, however, from the rapid decay of the Green’s function we may expect that the inverse contains many small entries which can be dropped. Hence, the inverse can be approximated by a sparse matrix. We can generalize the idea for other problems arising from partial differential equations. In general, the Green’s function is a mathematical description of the decreasing influence between two points if the distance between them increases. Hence, a lot of entries in the inverse of the discrete Green’s function will be small. Unfortunately, for higher-dimensional problems the decay of the Green’s function has a more complex structure and the distance between two nodes is not necessarily a good measure to characterize the influence between them. As a consequence, it might be difficult to predict a priori sparsity patterns and we need different strategies to obtain an effective sparsity pattern.

Another motivation for approximating the inverse by a sparse matrix arises from matrices that are diagonally dominant. Recall the following definition.

Definition 4.1 The matrix A = [a_ij] is said to be strictly diagonally dominant by row if

|a_ii| > Σ_{j ≠ i} |a_ij|   for all i.



For diagonally dominant matrices the following theorem holds:

Theorem 4.2 If A is a strictly diagonally dominant matrix by row, then the matrix is nonsingular and the explicit inverse A^{-1} is strictly diagonally dominant by column. 

In the ideal case A is diagonally dominant and has a banded structure. For such matrices we have exponential decay in the inverse A^{-1} = [a_ij]:

|a_ij| ≤ C γ^{|i−j|}

for some constant C and γ < 1. Hence, entries far away from the banded structure are small and may be dropped. For many problems from practical applications, the matrix A might only be diagonally dominant by row to some degree, i.e. a few rows might not be diagonally dominant. Still, we may expect the inverse to be diagonally dominant by column to some degree, and the inverse can still be approximated by a sparse matrix.

In conclusion, we observe that for many systems arising from practical applications the inverse can be approximated by a sparse matrix. However, for more complex problems it might be hard to predict where the smallest entries are located. In the next subsection we treat a strategy to determine a priori an effective sparsity pattern for the inverse of a general matrix.

4.1.1 A priori pattern selection

In this section we describe a priori pattern selection. The pattern selection may be based on a specific structure, e.g. a band structure, and other characteristics of the matrix. In general, however, the nonzero structure of powers of A provides a good choice for the a priori pattern. To understand this, consider the second power A²; we denote the i-th row of the second power by a²_{i*}. For this row it holds that

a²_{i*} = a_{i*} A = a_{i*} [a_{1*}; … ; a_{n*}] = Σ_{j=1}^{n} a_ij · a_{j*}    (4.1)

and hence the i-th row is obtained by merging rows of A. Note that for sparse matrices only a few (≪ n) rows are merged. In numerical applications, cancellation of elements is rare due to finite precision and hence we expect additional fill-in in each row of the second power. If in general we denote the i-th row of A^k by a^k_{i*}, then it holds that

a^l_{i*} = a_{i*} A^{l−1} = a_{i*} [a^{l−1}_{1*}; … ; a^{l−1}_{n*}] = Σ_{j=1}^{n} a_ij · a^{l−1}_{j*}    (4.2)

and hence a^l_{i*} is obtained by merging rows of A^{l−1}. If we assume that there is no cancellation of elements, then for general matrices the fill-in in powers of A will increase, i.e. the nonzero structure of A^{l−1} is in general contained in the nonzero structure of A^l. By equation (2.3) we know that for nonsingular matrices A^{-1} can be written as a linear combination of powers of A. For this reason we expect the nonzero structure of A^l to be a good a priori pattern selection for some l < m, where m is the degree of the minimal polynomial of A.

The process of computing higher powers of A becomes increasingly expensive. When we increase l by one we have to compute an additional sparse matrix-matrix product A·A^l. Moreover, this product becomes increasingly more expensive because the fill-in in A^l increases as well. In practice this means that we choose small values for l. An additional argument for choosing small values is that this produces a very sparse preconditioner. Since the preconditioning step consists of a matrix-vector product with M, this reduces the computing time of a single iteration step. Although the process will require more iterations because the sparser preconditioner is a less accurate approximate inverse, the total computing time for solving MAx = Mb may be reduced in this way. Notice that we have to find a balance, since both too small and too large values for l will increase the total computing time. The cost of constructing a good pattern is reduced by sparsifying the matrix and computing powers of the sparsified matrix Ã = [ã_ij]. This is realized by thresholding A while transforming it to a binary matrix. With a given threshold τ this is done in the following way:

ã_ij = 1 if i = j or |{D^{-1/2} A D^{-1/2}}_ij| > τ,   ã_ij = 0 otherwise,    (4.3)

where D = [d_ii] is prescribed by:

d_ii = |a_ii| if |a_ii| > 0,   d_ii = 1 otherwise.    (4.4)

The matrix D is constructed to account for badly scaled matrices. The use of the a priori sparsity pattern is shown by an example of Chow [6]. Consider figure 3(a); we present a density plot of the discrete Green's function of a partial differential equation at a point near the center of a square domain. The discrete Green's function is directly related to the inverse. In figure 3(b) we present an approximation to the discrete Green's function when we use the sparsity pattern of the original matrix A that arises from the discretization of the problem. In a similar way, an approximate inverse is obtained by using the pattern of the sparsified matrix Ã, figure 3(c). Notice that in this case we only included the largest entries in the pattern. Now we can try to improve the approximate inverse by using powers of the sparsified matrix; the nonzero structure of Ã^L, for some integer L, is called the level L − 1 pattern. With the level 1 pattern we obtain the approximation to the inverse of figure 3(d). Notice that we have captured the largest entries of the discrete Green's function and that the approximation has indeed improved. A small MATLAB sketch of this construction is given below.

Figure 3: Discrete Green's function arising from a partial differential equation on a square domain.
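
The sparsification of equations (4.3)–(4.4) and the level patterns built from powers of the sparsified matrix can be sketched in a few lines of MATLAB; the function name and the parameters tau and lev below are illustrative choices, not part of any package.

function S = level_pattern(A, tau, lev)
% LEVEL_PATTERN  A priori sparsity pattern from powers of the sparsified matrix.
%   Returns a sparse 0/1 matrix S: the nonzero structure of Atilde^(lev+1),
%   where Atilde is the thresholded, symmetrically scaled matrix of eqn (4.3).
    n = size(A, 1);
    d = full(abs(diag(A)));
    d(d == 0) = 1;                              % eqn (4.4): guard against zero diagonal
    Dih = spdiags(1 ./ sqrt(d), 0, n, n);       % D^{-1/2}
    B = Dih * A * Dih;                          % symmetrically scaled matrix
    Atilde = double(abs(B) > tau) + speye(n);   % eqn (4.3): keep large entries and the diagonal
    Atilde = spones(Atilde);                    % binary pattern
    S = Atilde;
    for k = 1:lev                               % level-lev pattern = structure of Atilde^(lev+1)
        S = spones(S * Atilde);
    end
end

For example, S = level_pattern(A, 0.1, 1) returns a level 1 pattern as used in figure 3(d), while lev = 0 gives the pattern of the sparsified matrix itself.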

4.2 Frobenius norm minimization

Methods that are based on Frobenius norm minimization determine a preconditioner M with some known sparse structure. Usually the sparse structure is determined iteratively in order to find an efficient preconditioner. The process of determining a sparse structure is treated later. Recall the following definition of the Frobenius norm.

Definition 4.3 The Frobenius norm of an m × n matrix A is defined as

||A||_F = ( Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij² )^{1/2}. 

Let S be the set of all sparse matrices with some known structure. We require M to be the best approximation to A^{-1} in the Frobenius norm amongst all matrices in S. Now suppose we want to find a right-sided preconditioner (or right approximate inverse). Then M is the solution to the following minimization problem:

min_{M ∈ S} ||I − AM||_F    (4.5)

For left-preconditioning we observe that ||I − MA||_F = ||I − A^T M^T||_F. Hence for this case an equivalent minimization problem is formulated by finding a right approximate inverse for A^T. Now notice:

||I − AM||_F² = Σ_{j=1}^{n} ||e_j − A m_j||_2²    (4.6)

where m_j is the j-th column of M. Hence the minimization problem (4.5) is equivalent to solving n independent linear least squares problems. Since they are independent, the construction of this type of preconditioner is inherently parallelizable. By exploiting sparsity properties of A and M, the cost of solving each least squares problem can be reduced significantly. Let G be the nonzero pattern of M, i.e. its complement

G^c := {(i, j) | m_ij = 0}

is the zero pattern of M. This set contains the row and column indices of the entries of the preconditioner that are necessarily zero. Now we define for each j the following subset:

J := {i | (i, j) ∈ G}. (4.7)

This set contains the indices of the columns of A that will be evaluated in the matrix-vector product A m_j. Next we define yet another set by:

I = {i | A(i, J) is a nonzero row}    (4.8)

where A(:, J) is the submatrix formed by the columns of A that correspond to the set J. With the set I we determine the submatrix Â := A(I, J), dropping the zero rows that occur due to sparsity. The aforementioned matrix-vector product is then computed more efficiently via Â m̂_j, where m̂_j := m_j(J). Now we can replace each least squares problem from equation (4.6) by a much smaller least squares problem of the form:

min ||ê_j − Â m̂_j||_2²   for j = 1, …, n    (4.9)

where ê_j := e_j(I). Assuming that due to sparsity each system is relatively small, we solve each system with a direct method such as a QR factorization. Thus, for a known structure it is straightforward to compute a sparse approximate inverse. Methods in which the structure is pre-determined are called static pattern techniques. Unfortunately, for a general matrix we do not know a suitable sparse structure that captures the large entries of its inverse. Therefore we will describe a method in which we start with an initial sparse structure, usually an empty or diagonal structure, which is iteratively augmented such that the new approximate inverse preconditioner is improved. These types of methods are called adaptive pattern techniques. The most successful approach in this category is known as the Sparse Approximate Inverse algorithm (SPAI), proposed by Grote and Huckle [11].
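
Before turning to the adaptive SPAI algorithm, the static-pattern construction of equations (4.8)–(4.9) can be written down directly in MATLAB. The sketch below is only an illustration (the function name is made up and the loop is sequential); a common choice for the prescribed pattern is S = spones(A).

function M = spai_static(A, S)
% SPAI_STATIC  Static-pattern Frobenius norm minimization, min ||I - A*M||_F.
%   A : sparse n-by-n matrix.
%   S : sparse 0/1 matrix prescribing the nonzero pattern of M, e.g. spones(A).
    n = size(A, 1);
    rows = []; cols = []; vals = [];
    for j = 1:n
        J = find(S(:, j));                 % allowed nonzero positions of column m_j
        Ahat = A(:, J);
        I = find(any(Ahat, 2));            % drop rows of A(:,J) that are identically zero
        ej = double(I == j);               % e_j restricted to the row set I
        mhat = full(Ahat(I, :)) \ ej;      % small dense least-squares problem (QR)
        rows = [rows; J];
        cols = [cols; j * ones(numel(J), 1)];
        vals = [vals; mhat];
    end
    M = sparse(rows, cols, vals, n, n);
end

Each pass through the loop is independent of the others, which is precisely what makes the construction attractive for parallel implementation.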

4.2.1 SPAI

Suppose that we have defined a sparse structure S that we aim to improve. From equation (4.9) we define the corresponding residuals:

r_j = e_j − A(:, J) m̂_j   for j = 1, …, n    (4.10)

By augmenting the structure S we aim to reduce the residuals in the Euclidean norm ||r_j||_2. We therefore determine the following set:

L = {ℓ | r_j(ℓ) ≠ 0}

In general we expect that, due to sparsity, most entries of r_j are zero. Also, due to finite precision, we expect most entries of r̂_j = ê_j − Â m̂_j to be nonzero. Hence, typically L will equal I.

To reduce ||r_j||_2 it makes sense to consider columns of A that were not used before and that have a nonzero entry in the rows indexed by L. To this end we define the following set:

Ĵ = {k ∉ J | A(ℓ, k) ≠ 0 for some ℓ ∈ L}

and for each k ∈ Ĵ we want to determine whether the k-th column of A can make a significant contribution to the reduction of ||r_j||_2. This is done in a cheap way by considering the following one-dimensional minimization problems:

min_{μ_k} ||r_j − μ_k A e_k||_2   for each k ∈ Ĵ

where μ_k ∈ R. Notice that A e_k is the k-th column of A. The solution is determined simply by projecting r_j onto the space spanned by A e_k, which gives μ_k = r_j^T A e_k / ||A e_k||_2². Next we compute the Euclidean norm ρ_k of the new residual r_j − μ_k A e_k:

ρ_k² = ||r_j||_2² − (r_j^T A e_k)² / ||A e_k||_2².

In practical implementations we make a selection of the most profitable indices k, i.e. those that correspond to the smallest values of ρ_k. We also restrict the memory storage by allowing a maximal amount of fill-in; as a consequence, some indices of Ĵ are dropped. Next we determine the corresponding set Î in a similar way as in equation (4.8). Now we consider the augmented matrix A(I ∪ Î, J ∪ Ĵ), for which we again solve the minimization problem of equation (4.9) by computing a QR factorization (see footnote 4). Finally we compute the new residual norm ||r_j||_2. The process is repeated until the residual meets a certain tolerance or a maximal amount of fill-in is reached. The method is summarized in Algorithm 4.1.

We mention an important theoretical property of the SPAI algorithm. Denote by r_k the residual for every column m_k and assume that it satisfies:

||r_k||_2 = ||A m_k − e_k||_2 < ε.

For the spectral properties of the preconditioned matrix AM, the following theorem can then be derived [11].

Theorem 4.4 Let p = max_{1≤k≤n} {number of nonzero elements of r_k}. Then the eigenvalues λ_k of AM are clustered at 1 and lie inside a circle of radius √p ε. Furthermore, if √p ε < 1, then λ_max and λ_min satisfy

|λ_max| / |λ_min| ≤ (1 + √p ε) / (1 − √p ε). 
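
The clustering is easy to make plausible with a short sketch (using the cruder radius √n ε instead of the √p ε of Theorem 4.4): the columns of AM − I are exactly the residuals r_k, so

AM − I = R := [r_1, …, r_n],    ||R||_F² = Σ_{k=1}^{n} ||r_k||_2² < n ε²,

and for every eigenvalue λ of AM = I + R we have |λ − 1| ≤ ||R||_2 ≤ ||R||_F < √n ε. The sharper radius √p ε exploits, in addition, the fact that each residual column has at most p nonzero entries.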

In conclusion, depending on the user-defined parameters ε and the maximal amount of fill-in, we may expect good results from the SPAI algorithm. However, we mention that the construction of the SPAI preconditioner may be expensive. To reduce cost we can use a static technique with an a priori pattern selection, which in many cases appears to be an equally effective approach [6].

4 In practice we expand the QR factorization of A(I, J) to reduce cost [11].

Algorithm 4.1 SPAI algorithm

1: for every column m_j of M do
2:   (a) Choose an initial sparsity pattern J.
3:   (b) Determine the set of corresponding row indices I.
4:   Compute a QR decomposition of Â = A(I, J) and compute the solution m̂_j from equation (4.9).
5:   Set r_j = e_j − A(:, J) m̂_j
6:   while ||r_j||_2 > ε do
7:     (c) Set L equal to the set of indices ℓ for which r_j(ℓ) ≠ 0
8:     (d) Set Ĵ equal to the set of all new column indices that appear in the rows L of A but not in J
9:     for each k ∈ Ĵ do
10:      (e) Set ρ_k² = ||r_j||_2² − (r_j^T A e_k)² / ||A e_k||_2²
11:    end for
12:    Delete from Ĵ all but the most profitable indices
13:    (f) Determine the new row indices Î and update the QR decomposition of the submatrix A(I ∪ Î, J ∪ Ĵ); solve for the new m̂_j
14:    Set I = I ∪ Î
15:    Set J = J ∪ Ĵ
16:    Set r_j = e_j − A(:, J) m̂_j
17:  end while
18: end for
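
A compact MATLAB sketch of Algorithm 4.1 for a single column is given below. It is a simplified illustration under several assumptions: it keeps only one new index per augmentation step, it refactorizes instead of updating the QR decomposition, and the function name and parameters (eps_tol, maxnnz) are made up for this example.

function [mj, J] = spai_column(A, j, J, eps_tol, maxnnz)
% SPAI_COLUMN  Adaptive computation of one column of M (sketch of Algorithm 4.1).
%   A       : sparse n-by-n matrix
%   j       : index of the column m_j being computed
%   J       : initial sparsity pattern, e.g. J = j for a diagonal start
%   eps_tol : residual tolerance
%   maxnnz  : maximal number of nonzeros allowed in m_j
    n  = size(A, 1);
    J  = J(:);
    ej = zeros(n, 1); ej(j) = 1;
    mhat = full(A(:, J)) \ ej;                 % least-squares problem (4.9)
    rj = ej - A(:, J) * mhat;                  % residual (4.10)
    while norm(rj) > eps_tol && numel(J) < maxnnz
        L = find(rj);                          % step (c): rows with nonzero residual
        cand = setdiff(find(any(A(L, :), 1)), J');   % step (d): new candidate columns
        if isempty(cand), break; end
        rho2 = zeros(numel(cand), 1);
        for t = 1:numel(cand)                  % step (e): cheap estimate of the gain
            ak = A(:, cand(t));
            rho2(t) = norm(rj)^2 - full(rj' * ak)^2 / norm(ak)^2;
        end
        [~, best] = min(rho2);                 % keep only the single most profitable index
        J = [J; cand(best)];
        mhat = full(A(:, J)) \ ej;             % a real code would update the QR factors
        rj = ej - A(:, J) * mhat;
    end
    mj = sparse(J, 1, mhat, n, 1);
end

Looping this function over j = 1, …, n (in parallel, in an actual implementation) yields the full preconditioner M = [m_1, …, m_n].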

4.3 Factorized sparse approximate inverses

A different type of sparse approximate inverse preconditioner is constructed by incomplete inverse factorization. Suppose that A is nonsingular and admits the factorization A = LDU, where L and U^T are unit lower triangular and D is diagonal, that is, A admits a scaled LU factorization. Hence the inverse is factorized as follows: A^{-1} = U^{-1} D^{-1} L^{-1} = Z^T D^{-1} W, where Z := U^{-T} and W := L^{-1} are unit lower triangular matrices. We expect Z and W to be dense, even when the original matrix A is sparse. In factorized sparse approximate inverse methods we compute sparse approximations Z̃ ≈ Z and W̃ ≈ W such that the preconditioner is defined by [2]:

M = Z̃^T D̃^{-1} W̃ ≈ A^{-1}    (4.11)

where D̃ is a nonsingular diagonal matrix such that D̃ ≈ D. In this subsection we treat two well-known methods that are based on this concept. The first one is the Factorized Sparse Approximate Inverse (FSAI), which computes a direct approximation of the inverse factors; the method was introduced by Kolotilina and Yeremin [13, 16]. The second method constructs the approximate inverse factors with an incomplete biconjugation procedure. This method is known as AINV and was proposed by Benzi and Tůma [3, 4].

4.3.1 FSAI

The FSAI method is particularly useful for SPD matrices because, as we will see later, the obtained preconditioner will be SPD as well. The same holds for the preconditioned matrix, and hence the preconditioned system can be solved with the CG method. However, we first treat general nonsymmetric matrices [16] and then derive that the method leads to SPD preconditioners for SPD matrices [2, 13]. In FSAI we assume A to be a nonsingular matrix that allows an unscaled LU factorization, i.e. A = LU and A^{-1} = U^{-1} L^{-1}. The aim is to construct an incomplete inverse factorization of the form M = G_U G_L, where G_L and G_U^T are lower triangular matrices that have some sparse structure. The sparsity pattern must be prescribed beforehand; we require G_L and G_U^T to be constrained to a triangular zero pattern S_L satisfying [16]:

{(i, j) | i < j} ⊆ S_L ⊆ {(i, j) | 1 ≤ i ≠ j ≤ n}.    (4.12)

Initially we construct unscaled factors Ḡ_L and Ḡ_U^T that are constrained to the same set. When we denote the entries of a matrix K by {K}_ij, then these factors are constrained to S_L in the following way:

{Ḡ_L A}_ij = δ_ij for (i, j) ∉ S_L,   {Ḡ_L}_ij = 0 for (i, j) ∈ S_L,
and
{A Ḡ_U}_ij = δ_ij for (j, i) ∉ S_L,   {Ḡ_U}_ij = 0 for (j, i) ∈ S_L.    (4.13)

The factor Ḡ_L is computed by rows. Denote the i-th row by l_i; from equation (4.13) it follows that:

(l_i A)_j = δ_ij   for (i, j) ∉ S_L,  i = 1, …, n.    (4.14)

If we additionally define the set M = {j | (i, j) ∉ S_L}, where i is fixed, then the systems of equation (4.14) are equivalent to the systems of submatrices

l̂_i Â_i = ê_i   for i = 1, …, n,

where l̂_i := l_i(M), Â_i := A(M, M) and ê_i := e_i(M). Notice that each system is independent; thus the method is, just like SPAI, inherently parallelizable. Also notice that if the matrices Â_i are nonsingular, then the factor Ḡ_L is uniquely determined.

For the factor Ḡ_U we can reason in a similar way; its entries are computed by columns. Denote the j-th column by u_j and define the set:

N = {i | (j, i) ∉ S_L}, where j is fixed. Then the columns of Ḡ_U are determined by solving the systems of submatrices:

Â_j û_j = ê_j   for j = 1, …, n,

where û_j := u_j(N), Â_j := A(N, N) and ê_j := e_j(N). Again, if the matrices Â_j are nonsingular, then the factor Ḡ_U is uniquely determined. Also, in this case it can be shown that the diagonal entries of Ḡ_U and Ḡ_L are the same [16].

Assume that the diagonal entries of both Ḡ_L and Ḡ_U are all nonzero. The actual factors G_L and G_U are then defined by:

G_L := diag(|{Ḡ_L}_11|^{-1/2}, …, |{Ḡ_L}_nn|^{-1/2}) Ḡ_L    (4.15)

and

G_U := Ḡ_U diag(sign({Ḡ_U}_11)|{Ḡ_U}_11|^{-1/2}, …, sign({Ḡ_U}_nn)|{Ḡ_U}_nn|^{-1/2}).    (4.16)

The factors can be applied as a two-sided preconditioner (see equation (3.10)), which corresponds to the approximate inverse M = G_U G_L. In this case the diagonal entries of the preconditioned matrix are all equal to 1, that is {G_L A G_U}_ii = 1 for i = 1, …, n.

We will show that FSAI is a symmetry-preserving preconditioner. Since for symmetric matrices it holds that (A Ḡ_U)^T = Ḡ_U^T A, it follows from equation (4.13) that the factors Ḡ_L and Ḡ_U^T are constrained to S_L in the same way, hence Ḡ_L = Ḡ_U^T. If both factors are uniquely determined, it is then obvious from equations (4.15) and (4.16) that the resulting preconditioner M is symmetric. If in addition the matrix is positive definite, it can be shown that the diagonal entries of Ḡ_L are all positive [16]. Then we only require the computation of the single factor

G_L := diag({Ḡ_L}_11^{-1/2}, …, {Ḡ_L}_nn^{-1/2}) Ḡ_L    (4.17)

where Ḡ_L is constructed according to equation (4.13). It follows that G_U = G_L^T and the preconditioner is given by M = G_L^T G_L. Moreover, the preconditioned matrix G_L A G_L^T is SPD as well [2]. The main benefit of FSAI is that for SPD matrices we can apply the CG method to the preconditioned system. Finally, we mention that the following theorem can be derived for SPD matrices [16].

Theorem 4.5 Let A be an SPD matrix and define the nonnegative quadratic functional

F(X) := ||I − X L||_F²,

where L is the Cholesky factor of A. If we require for the unscaled factors that Ḡ_L = Ḡ_U^T, constrained to the triangular sparsity pattern S_L, then the computation of the unscaled factor of the FSAI preconditioner prescribed in subsection 4.3.1 is equivalent to the following minimization problem:

min F(X)   with X constrained to S_L. 

In conclusion, for SPD matrices the FSAI algorithm is equivalent to a Frobenius norm minimization involving the Cholesky factor L of A, i.e. the lower triangular matrix L such that A = L L^T, which is known to exist for SPD matrices. Remarkably, the factor L itself is never needed to carry out this minimization: the normal equations of the problem only involve the product L L^T = A.
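
A minimal MATLAB sketch of the SPD construction follows; it assumes A is SPD, uses (as one illustrative choice) the lower triangular pattern of A as the allowed nonzero pattern of the factor, and is meant only to illustrate equations (4.13) and (4.17), not to be an efficient implementation.

function G = fsai_spd(A, P)
% FSAI_SPD  FSAI factor G for an SPD matrix A (illustrative sketch).
%   P : sparse lower triangular 0/1 matrix with the allowed pattern of G,
%       e.g. P = tril(spones(A)). The preconditioner is M = G'*G and the
%       preconditioned matrix is G*A*G'.
    n = size(A, 1);
    rows = []; cols = []; vals = [];
    for i = 1:n
        Midx = find(P(i, :));              % allowed column indices in row i (all <= i)
        Ahat = full(A(Midx, Midx));        % small dense SPD submatrix
        ehat = double(Midx == i)';         % e_i restricted to Midx
        lhat = (Ahat \ ehat)';             % row i of the unscaled factor (A is symmetric)
        rows = [rows, i * ones(1, numel(Midx))];
        cols = [cols, Midx];
        vals = [vals, lhat];
    end
    Gbar = sparse(rows, cols, vals, n, n); % unscaled factor, constrained as in (4.13)
    d = full(diag(Gbar));                  % diagonal entries, positive for SPD A
    G = spdiags(1 ./ sqrt(d), 0, n, n) * Gbar;   % scaling of equation (4.17)
end

The CG method is then applied to the symmetrically preconditioned matrix G*A*G'.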

4.3.2 AINV

The AINV method is based on an incomplete biconjugation procedure. The complete biconjugation procedure is an alternative method for explicitly determining A^{-1}. The following is the definition of A-biconjugacy.

Definition 4.6 Two sets of vectors {z_i}_{i=1}^{n} and {w_i}_{i=1}^{n} are A-biconjugate if

w_i^T A z_j = 0 if and only if i ≠ j. 

In the biconjugation procedure we construct two sets of vectors that are A-biconjugate, assuming that two such sets exist. Let two matrices Z and W be defined by:

Z := [z1, . . . , zn]

W := [w1, . . . , wn]

By definition 4.6 we can also define p_i := w_i^T A z_i ≠ 0 for i = 1, …, n. Then it follows that W^T A Z = D := diag(p_1, …, p_n). When A is nonsingular, then so are W and Z, and it follows that:

A^{-1} = Z D^{-1} W^T = Σ_{i=1}^{n} z_i w_i^T / p_i    (4.18)

We conclude that when we are able to construct two sets of vectors which are A-biconjugate, we have explicitly determined A^{-1}. The two sets can be constructed from any two linearly independent sets {u_i}_{i=1}^{n} and {v_i}_{i=1}^{n} by performing a generalized Gram–Schmidt orthogonalization [4]. For numerical applications one would generally choose u_i = v_i = e_i. The biconjugation procedure is given in Algorithm 4.2. Here we have denoted the rows of A and A^T by a_{i*} and c_{i*} respectively.

Algorithm 4.2 Biconjugation algorithm
1: Let w_i^(0) = z_i^(0) = e_i for i = 1, …, n
2: for i = 1, …, n do
3:   for j = i, …, n do
4:     p_j^(i−1) := a_{i*} z_j^(i−1)
5:     q_j^(i−1) := c_{i*} w_j^(i−1)
6:   end for
7:   If i = n go to (13)
8:   for j = i + 1, …, n do
9:     z_j^(i) = z_j^(i−1) − (p_j^(i−1) / p_i^(i−1)) z_i^(i−1)
10:    w_j^(i) = w_j^(i−1) − (q_j^(i−1) / q_i^(i−1)) w_i^(i−1)
11:  end for
12: end for
13: Let z_i := z_i^(i−1), w_i := w_i^(i−1) and p_i := p_i^(i−1) for i = 1, …, n
14: Set Z = [z_1, …, z_n], W = [w_1, …, w_n] and D = diag(p_1, …, p_n)

Notice that the process can fail in lines 9 and 10 due to division by a zero or very small pivot. The following theorem tells us when the biconjugation algorithm does not break down.

Theorem 4.7 In exact arithmetic, Algorithm 4.2 does not break down if and only if all n leading minors of A are nonzero. 

Recall the definition of leading minors:

Definition 4.8 The kth leading minor of a matrix A is the determinant of its upper-left k × k submatrix. 

In case the process does not break down, the matrices Z and W are unit upper triangular. Hence A = W^{-T} D Z^{-1} is an LDU factorization, and the algorithm computes W = L^{-T} and Z = U^{-1} explicitly. Again we expect the factors Z and W to be dense, which makes the complete biconjugation procedure in general too expensive. With an incomplete biconjugation we aim to produce, in a relatively cheap way, factors W̃, Z̃ and D̃ such that we obtain the preconditioner

M := Z̃ D̃^{-1} W̃^T ≈ A^{-1}    (4.19)

which is of the form of equation (4.11). The incomplete biconjugation procedure can be realized by prescribing a priori sparsity patterns, see e.g. subsection 4.1.1, and dropping the elements in Algorithm 4.2 that fall outside the sparsity pattern. The factors Z and W are typically dense; however, for sparse matrices it is observed that very often many entries have small magnitude. Hence, the sparsity pattern can also be determined by an adaptive technique: we can drop elements according to a user-defined tolerance and allow a maximal amount of fill-in, similarly to the ILUT algorithm (A.1). In practice the algorithm is implemented in a more sophisticated way to prevent breakdowns due to small pivots; this is done by transforming the system with a diagonal shift A_1 = A + αI and by performing pivot modifications. The interested reader is referred to the article by Benzi and Tůma [4].
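
The incomplete biconjugation idea can be sketched in MATLAB as follows; the drop rule (a plain absolute tolerance, assumed smaller than 1 so the unit diagonal is kept) and the function name are simplifications chosen for this illustration, whereas the actual AINV code is considerably more careful about small pivots.

function [Z, W, D] = ainv_sketch(A, droptol)
% AINV_SKETCH  Incomplete biconjugation (illustration of Algorithm 4.2 with dropping).
%   Returns sparse unit upper triangular Z, W and diagonal D such that
%   M = Z * inv(D) * W' approximates inv(A), cf. equation (4.19).
    n = size(A, 1);
    Z = speye(n);
    W = speye(n);
    p = zeros(n, 1);
    for i = 1:n
        pii = full(A(i, :) * Z(:, i));        % p_i = a_i* z_i
        qii = full(A(:, i)' * W(:, i));       % q_i = c_i* w_i, c_i* being the i-th row of A'
        if abs(pii) < eps || abs(qii) < eps
            error('Breakdown: (nearly) zero pivot, cf. Theorem 4.7.');
        end
        p(i) = pii;
        for j = i+1:n
            pj = full(A(i, :) * Z(:, j));
            qj = full(A(:, i)' * W(:, j));
            Z(:, j) = Z(:, j) - (pj / pii) * Z(:, i);
            W(:, j) = W(:, j) - (qj / qii) * W(:, i);
            % incompleteness: drop small entries to preserve sparsity
            zj = Z(:, j); zj(abs(zj) < droptol) = 0; Z(:, j) = zj;
            wj = W(:, j); wj(abs(wj) < droptol) = 0; W(:, j) = wj;
        end
    end
    D = spdiags(p, 0, n, n);
end

In practice the resulting preconditioner is applied to a vector v as Z * (D \ (W' * v)), i.e. two sparse triangular products and a diagonal scaling, without ever forming M explicitly.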

4.3.3 Factorized vs unfactorized preconditioners

We end this subsection with a discussion of factorized versus unfactorized preconditioners. We already mentioned that with factorized preconditioners we are able to preserve symmetry. However, we are also interested in the construction cost. Let m be the number of nonzeros in a row of M. Then for the described methods, the required number of flops for constructing this row is of the order O(m³). For a preconditioner in factorized form, M = M_1 M_2, the same holds for the separate factors M_1 and M_2. However, in the factorized case the number of nonzeros per row is normally halved [6]. Therefore the factorized preconditioner may be cheaper to construct. We conclude that it can be advantageous to compute factorized preconditioners for unsymmetric and indefinite systems as well.

4.4 Approximate inverses of ILU-factorizations

In this subsection we treat the type of sparse approximate inverse preconditioners that are constructed by an approximate inversion of an incomplete LU factorization. The factorization may be obtained by techniques such as ILU(p) or ILUT, but for the purpose of this discussion we assume that the factorization L̃Ũ ≈ A is given. The preconditioner M ≈ A^{-1} is obtained by constructing an approximation of the product Ũ^{-1} L̃^{-1}. The approximation is constructed by considering the incomplete factors separately. We briefly treat how the approximate inverse of L̃ is determined (the process for Ũ is similar).

Given the factor L̃, we can construct its inverse column-wise. The i-th column of the inverse is L̃^{-1} e_i, and hence this column is the solution to the following linear system:

L̃ x_i = e_i    (4.20)

By solving the system for i = 1, …, n we construct L̃^{-1}. In principle the columns can be constructed in parallel, but there are some issues we need to take into account. First of all, when we construct the i-th column we have to solve a system with the trailing submatrix L̃(i : n, i : n). Thus the cost of constructing x_1 is much higher than the cost of constructing x_n, which means that we have to apply load balancing. Secondly, we note that there is some need for communication in the process.

The system of equation (4.20) is of course solved only approximately. This is for example done by applying forward substitution while forcing a sparsity pattern on the solution, either by applying a drop tolerance or by prescribing a sparsity structure and perhaps a maximal amount of fill-in. For practical reasons it is best to apply the dropping rules during the substitution. In determining a structure for the approximate inverses of the incomplete factors, there will be some communication in the process. When the factors from an incomplete LU factorization are ill-conditioned, they can still be used by this type of technique. However, these methods are less popular than other approaches because they are more difficult to implement: the two levels of incompleteness, the ILU factorization and the approximate inversion of the factors, require more user-defined parameters. Moreover, both phases of the process are not readily implemented in parallel.
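
A sketch of this column-wise approximate inversion of a lower triangular factor is given below; the dropping rule (relative to the first entry of the column) is one simple possibility and the function name is made up.

function Linv = approx_inv_lower(L, droptol)
% APPROX_INV_LOWER  Approximate inverse of a sparse lower triangular factor.
%   Column i of inv(L) solves L*x = e_i; forward substitution is performed
%   while entries below the drop tolerance are discarded as soon as they are
%   computed, which keeps the columns sparse.
    n = size(L, 1);
    rows = []; cols = []; vals = [];
    for i = 1:n
        x = zeros(n, 1);
        x(i) = 1 / L(i, i);
        for k = i+1:n
            % forward substitution restricted to the already computed part of x
            s = L(k, i:k-1) * x(i:k-1);
            if s ~= 0
                xk = -s / L(k, k);
                if abs(xk) > droptol * abs(x(i))   % simple dropping rule
                    x(k) = xk;
                end
            end
        end
        idx = find(x);
        rows = [rows; idx];
        cols = [cols; i * ones(numel(idx), 1)];
        vals = [vals; x(idx)];
    end
    Linv = sparse(rows, cols, vals, n, n);
end

Note that the inner loop is longest for i = 1 and shortest for i = n, which is exactly the load-balancing issue mentioned above.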

5 Software Implementation

In parallel software implementations the work is distributed over multiple processors to achieve high performance. In the previous section we discussed that for sparse approximate inverse preconditioning techniques, such as SPAI and FSAI, the rows or columns are computed independently from each other. Hence, the construction of the rows and columns can be distributed over multiple processors. The parallel implementation does, however, add to the complexity of the process. In order to achieve high performance we require specialized programming techniques. We will briefly mention the most important ones.

The computation of a sparse approximate inverse that contains many nonzero entries per row may be excessively expensive. Assume that the preconditioner is computed by rows. In order to construct the rows efficiently, we have to perform load balancing, i.e. we have to distribute the work in such a way that each processor has approximately the same workload. Even when each processor owns the same number of rows of the matrix, the construction phase may be unbalanced because the sparsity pattern may be irregular. The load balancing is realized by repartitioning the distribution of the work. In this process the data transfer should be reduced as much as possible, as this slows down the process as well. A consequence of parallel implementation is the need for communication between the processors. For example, the construction of an a priori zero pattern may be determined by powers of the sparsified matrix Ã, as described in section 4.1.1. Rows of Ã^l are constructed by merging rows of Ã, and in this process it will occur that a processor requires rows that are stored on a different processor. The first processor then has to communicate with the other; this is called one-sided communication. This type of communication may become asynchronous and slow down the process. By predetermining the number of requests a processor will receive, this behavior is avoided; this technique is called global communication. One can imagine that the communication becomes even more complex for highly sequential processes such as incomplete LU factorizations. Finally, we have to take the matrix data structure into account. If in the construction of a preconditioner the matrix is mainly accessed by rows, then the data structure of A should allow for efficient access by row. It might be necessary to store A twice, such that it is efficiently accessed both by rows and by columns.

5.1 Software libraries and packages

Above we discussed the difficulties in parallel implementations for large systems. For a wide range of problems, there are libraries and packages readily available on the internet that deal with these issues. We will discuss the ones that are relevant to this thesis. First we take a look at the library PETSc, a large library which can be used both for parallel Krylov subspace solvers and for parallel preconditioning (including sparse approximate inverse preconditioners). Next we treat some packages that are more specific to certain sparse approximate inverse techniques.

5.1.1 PETSc

Name PETSc - The Portable, Extensible Toolkit for Scientific Computation. [1]

Author(s) S. Balay, J. Brown, K. Buschelman, V. Eijkhout, W. Gropp, D. Kaushik, M. Knepley, L. Curfman McInnes, B. Smith, and H. Zhang.

Language The software is written in C, C++, Fortran and Python.

MATLAB PETSc is directly callable from MATLAB.

Description PETSc is a large library of data structures and routines, specialized for parallel implementations on high-performance computers. It includes parallel Krylov subspace solvers as well as sparse approximate inverse preconditioners.

Website http://www.mcs.anl.gov/petsc/index.html

PETSc is a large library of data structures and routines for parallel computation on high-performance computers. Many of the routines have been derived from computations on numerical solutions of partial differential equations and related problems. The experience of the authors has led to efficient parallel implementations; PETSc deals with the practical issues that are associated with parallel programming and computations with large systems. Some examples are [1]:

• Index sets (IS), including permutations, for indexing into vectors, renumbering, etc.
• Vectors (Vec).
• Matrices (Mat) (generally sparse).
• Krylov subspace methods (KSP) (over fifteen).
• Dozens of preconditioners, including sparse direct solvers (PC).

In conclusion, PETSc provides a rich environment of efficient algorithms and data structures for a user that has to perform parallel computations. The library provides parallel implementations of Krylov subspace solvers and is therefore especially relevant to the theory of this thesis.

5.1.2 SPAI

Name SPAI - SParse Approximate Inverse Preconditioner [7].

Author(s) Marcus Grote, Stephen Barnard. The configuration was programmed by Oliver Bröker and Michael Hagemann.

Language The software is written in C/MPI (see the footnote under subsection 5.1.3 for MPI).

MATLAB SPAI provides a MATLAB interface.

Description Given a sparse matrix A, the SPAI algorithm computes a sparse approximate inverse M by Frobenius norm minimization. The algorithm can be applied both as a static and as an adaptive technique. The preconditioner can be applied within an iterative method. The package includes a Bi-CGSTAB routine which is only intended for testing.

Website http://cccs.unibas.ch/en/education/software-packages

The SPAI algorithm computes, in parallel, a sparse approximate inverse preconditioner by Frobenius norm minimization, see equation (4.5). The algorithm can be used as a static technique; in this case the sparsity pattern is fixed by either a banded structure or a subset of the sparsity pattern of A. This simple approach may not be robust, and as an alternative we can determine the sparsity pattern iteratively by an adaptive technique, as is done in Algorithm 4.1; this algorithm was also suggested by Marcus Grote, one of the main authors of the package. If SPAI is used as an adaptive technique, the algorithm proceeds until the Frobenius norm of AM − I is less than a user-defined threshold eps. The threshold must be between 0 and 1. It is suggested to start with a relatively large value, like 0.7, and then to decrease eps until one finds an accurate preconditioner. The user must also define a parameter ns: this is the maximum number of improvement steps per column in SPAI. From section 4.2 we know that the preconditioner M is constructed column by column. If the maximum number of improvement steps is reached and the residual is not less than eps, then SPAI uses the best approximation obtained so far. It is suggested to use values of ns such that some columns do not achieve the required accuracy eps. The included Bi-CGSTAB routine can be used for testing, but it is mentioned that this routine is not efficient in a parallel environment. For practical implementations, there is a PETSc interface to SPAI. The SPAI algorithm is robust, inherently parallel, ordering independent and effective on nonsymmetric and ill-conditioned problems.

5.1.3 ParaSails

Name ParaSails - Parallel sparse approximate inverse preconditioner, using a priori sparsity patterns and least-squares (Frobenius norm) minimization (symmetric positive definite and general versions) [7].

Author(s) Edmond Chow.

Language The software is written in C.

MATLAB ParaSails does not provide a MATLAB interface.

5 Message Passing Interface (MPI) is a standardized and portable message-passing system designed by a group of researchers from academia and industry to function on a wide variety of parallel computers.

Description ParaSails is a parallel implementation of a sparse approximate inverse preconditioner, using a priori sparsity patterns and least-squares (Frobenius norm) minimization. The package includes parallel CG and GMRES solvers and a parallel matrix class. SPD problems are preconditioned with a factorized SPD preconditioner and general (nonsymmetric and/or indefinite) problems are handled with an unfactorized preconditioner. ParaSails uses post-filtering techniques to reduce the cost of applying the preconditioner.

Website http://www.llnl.gov/CASC/parasails

The a priori sparsity pattern in ParaSails is determined by constructing, in parallel, powers of the binary matrix Ã, see equation (4.3). In tests performed by Chow, powers of levels up to 4 and thresholds less than 0.3 were suitable parameters for producing an accurate preconditioner [6]. For SPD problems a factorized SPD sparse approximate inverse is constructed by Frobenius norm minimization of the Cholesky factor (cf. subsection 4.3.1). General problems are preconditioned with an unfactorized sparse approximate inverse M of the original matrix, see equation (4.5). The preconditioning step is executed more efficiently by a post-filtering technique on M (or on the factor G_L): entries of the scaled preconditioner that are smaller than a user-defined threshold, called the filter value, are dropped. In tests performed by Chow, filter values between 0.05 and 0.1 have been shown to give good results.

5.1.4 HSL MI12

Name HSL MI12.

Author(s) Numerical Analysis Group at the STFC Rutherford Appleton Laboratory

Origin N.I.N. Gould and J.A. Scott

Language The software is written in Fortran 77.

MATLAB HSL MI12 does not provide a MATLAB interface.

Description The routine finds a sparse approximate inverse M by Frobenius norm minimization. The process may be improved by first performing a block triangularization of A and then computing approximate inverses of the resulting diagonal blocks.

Website http://www.hsl.rl.ac.uk/catalogue/mi12.xml

The abbreviation HSL stands for Harwell Subroutine Library; it is a collection of packages that were written and developed by the Numerical Analysis Group at the STFC Rutherford Appleton Laboratory and other experts. The MI packages are specialized for iterative methods for sparse matrices. The MI12 package computes a sparse approximate inverse preconditioner for sparse unsymmetric matrices. The process is, similarly to the SPAI package, based on Frobenius norm minimization of AM − I. In addition, the package includes an option to perform a block triangularization of A as follows:

        [ A_11  A_12  …  A_1b ]
P A Q = [       A_22  …  A_2b ]
        [              ⋱  ⋮   ]
        [                A_bb ]

where P and Q are permutation matrices. When a block triangularization is performed, the preconditioner is constructed from sparse approximate inverses of the diagonal blocks. These are determined by a Frobenius norm minimization of A_jj M_jj − I_j for j = 1, …, b. In both cases the sparsity pattern is determined iteratively, starting with a completely zero pattern. The algorithm allows a maximum number l of nonzero entries in each column of M, according to a user-defined parameter. The sparsity pattern is augmented in a similar way as in the SPAI algorithm (4.1). As in the SPAI package, a threshold eps should be defined such that the process is terminated whenever ||A_jj m_jj^i − e_i|| < eps, where m_jj^i is the i-th column of M_jj.

6 Numerical experiments

In this section we describe some numerical experiments with sparse approximate inverse techniques. First we consider test problems with two matrices from practical applications and test the effectiveness of SPAI preconditioners in comparison with the more commonly used ILUT preconditioners. Secondly, we consider a practical application of SPAI preconditioners to electromagnetic scattering problems.

6.1 Comparative study on SPAI and ILUT preconditioners

In this section we report on experiments with SPAI and ILUT preconditioners applied to Krylov methods. In this comparative study we show that the sparse approximate inverse technique SPAI gives good results for two unsymmetric indefinite systems. In the second case we consider an ill-conditioned system; even then SPAI produces an accurate preconditioner and fast convergence is achieved. The more commonly used ILUT algorithm leads to poor convergence behavior, and for the ill-conditioned system the process breaks down because a zero pivot is encountered in the construction phase of the preconditioner.

The two matrices that we consider in the experiments originate from physical problems and are freely available on the website http://math.nist.gov/MatrixMarket. The first matrix belongs to the set denoted by BRUSSEL, a set of matrices obtained from the reaction-diffusion Brusselator model. The matrix that we consider has the matrix name RDB2048; for the sake of both notation and brevity we refer to this matrix by A_rdb. The matrix is supplied by K. Meerbergen from the Katholieke Universiteit Leuven and A. Spence from the University of Bath. The field of application is chemical engineering. The matrix is unsymmetric indefinite, is of size n = 2048 and has condition number K(A_rdb) = 1.81 · 10³, see figure 4 (left). The nonzero structure of the matrix is represented with a density plot; this is a coloured plot of the sparsity pattern from which we readily observe the pattern of the largest entries.

Figure 4: Matrices RDB2048 (left) and FIDAP032 (right).

The second matrix belongs to the set FIDAP, a set of matrices that is generated by the software package FIDAP. The matrix that we consider has the matrix name FIDAP032 and we refer to it by A_fdp. The matrix is supplied by Isaac Hasbani from Fluid Dynamics International and Barry Rackner from the Minnesota Supercomputer Center. The field of application is finite element modeling. The matrix is unsymmetric indefinite and is of size n = 1159, see figure 4 (right). With condition number K(A_fdp) = 5.24 · 10²⁰ we consider the corresponding system to be ill-conditioned.

We have explicitly determined the inverse of Ardb in MATLAB. The density plot is shown in figure 5. We notice that, although the original matrix is very sparse, the inverse is completely dense. This is typically observed for sparse matrices. However in many cases, and also in this particular case, a lot of the entries outside the sparsity pattern of the original matrix have small magnitude. We conclude that an inverse may be effectively approximated by a sparse matrix. We mention that Afdp is too ill-conditioned to invert in MATLAB.

Figure 5: Density plot of the explicit inverse A_rdb^{-1}.

With the package SPAI, described in subsection 5.1.2, we have constructed sparse approximate inverses of the two matrices. The right-hand side of each system was computed by taking for x the vector of all ones, so that the exact solution is always known. SPAI was used as an adaptive technique; for the threshold eps we chose 0.6, for the parameter ns we chose 5, and we allowed a maximum of 5 nonzero entries per column. The meaning of these parameters is described in subsection 5.1.2. The parameters were chosen such that we obtained good results with the Bi-CGSTAB routine included in the SPAI package. Although we could compute the forward error explicitly, since we know the exact solution, we used the residuals for the stopping criterion, as is standard for the Bi-CGSTAB routine. We applied right-preconditioning, so that the residuals are not changed. The tolerance for the stopping criterion was 10⁻⁸.
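
As an aside, a precomputed approximate inverse M can also be used with MATLAB's built-in bicgstab; the preconditioner argument may be a function handle that returns the preconditioned vector, which for an approximate inverse is just a matrix-vector product. The sketch below assumes A and M are already in the workspace (e.g. read from the Matrix Market files); note that MATLAB applies the preconditioner on the left, whereas the experiments above use right-preconditioning.

% Sketch: using a precomputed sparse approximate inverse M with MATLAB's bicgstab.
n     = size(A, 1);
b     = A * ones(n, 1);            % right-hand side so that the exact solution is all ones
tol   = 1e-8;
maxit = 1000;

applyM = @(r) M * r;               % preconditioning step = product with the approximate inverse
[x, flag, relres, iter] = bicgstab(A, b, tol, maxit, applyM);

fprintf('flag = %d, relres = %.2e, iterations = %g, forward error = %.2e\n', ...
        flag, relres, iter, norm(x - ones(n, 1)));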

The obtained preconditioners for A_rdb and A_fdp (M_rdb and M_fdp, respectively) are shown in figure 6 (top left and top right, respectively). Notice that both preconditioners have a very sparse pattern, comparable with the sparsity pattern of the original matrix. Also notice that M_rdb is much sparser than we might have expected from the density plot of A_rdb^{-1} (figure 5). We present convergence plots of the Bi-CGSTAB routine applied to the preconditioned systems A_rdb M_rdb and A_fdp M_fdp in figure 6 (bottom left and bottom right, respectively). Convergence was achieved in 146 iteration steps for A_rdb and in 255 steps for A_fdp. In practice, however, a looser tolerance of, say, 10⁻³ might suffice, in which case convergence would have been achieved in approximately 50 and 100 iteration steps, respectively.

Figure 6: Results of the numerical experiments with the package SPAI applied to the matrices A_rdb and A_fdp. Top: density plots of the SPAI preconditioners for RDB2048 (left) and FIDAP032 (right). Bottom: convergence plots (residual versus iteration step) of Bi-CGSTAB applied to the two preconditioned systems.

We tested the effectiveness of the SPAI preconditioners M_rdb and M_fdp in comparison with the popular and more commonly used ILUT preconditioners. We used the ITSOL package (see footnote 6); this package provides a routine for the standard ILUT algorithm described in subsection 3.2.4. The package also includes a routine for the flexible GMRES with restart m (FGMRES(m)). First, the GMRES(m) routine works as follows: in each cycle we construct a Krylov subspace of size up to m and check whether the solution is accurate enough; in the next cycle, the previous Krylov subspace is discarded and GMRES(m) restarts the process with the latest approximation as initial guess. We refer to one such restart cycle as an outer iteration of GMRES; from the Arnoldi method we know that, for unpreconditioned systems, each outer iteration of GMRES(m) requires m matrix-vector products.

6 Available on the website http://www-users.cs.umn.edu/saad/software/ITSOL/index.html; the software is written in C.

Table 1: Results of GMRES(60) applied to the preconditioned systems RDB2048L and FIDAP032. By "iter" we refer to the outer iterations, the residuals and errors are given in the 2-norm, and the abbreviation "b.d." stands for a breakdown in the construction of the preconditioner.

GMRES(60), Toler. 1e-8 |  RDB2048L          |  FIDAP032
                       |  iter    residual  |  iter    residual
SPAI                   |  3       8.98e-09  |  23      1.00e-08
ILUT                   |  +5000   3.93e+02  |  b.d.    b.d.

Notice that an outer iteration of GMRES(m) is therefore, in terms of computational work, equivalent to m iterations of GMRES. The algorithm FGMRES is called flexible because it allows a variable preconditioner in each outer iteration step; in our case the preconditioner remains the same, so we interpret the included solver simply as GMRES(m). The right-hand sides of the systems are again computed by taking for x the vector of all ones. For the construction of the ILUT preconditioners we have chosen a drop tolerance of 0.3 and a level of fill of 50. For the dimension of the Krylov subspace in FGMRES we have chosen m = 60. The SPAI preconditioners M_rdb and M_fdp were stored in MATLAB and we solved the systems with a similar GMRES(60) routine. We present the results in table 1. For the well-conditioned system RDB2048L we notice that the SPAI preconditioner is very effective, whereas the ILUT preconditioner did not manage to converge within 5000 outer iterations (the maximal number of iterations allowed); at that point the forward error is still significant. The ill-conditioned system FIDAP032 converges in 23 outer iterations when we use the SPAI preconditioner. However, from the forward error we observe that the system did not converge to an accurate solution; since the system is ill-conditioned it is prone to numerical errors and we cannot always fix this with a preconditioner. The SPAI preconditioner still performs much better, as for this second matrix the construction of the ILUT preconditioner breaks down due to a zero pivot. In this comparative study on preconditioners we have shown that sparse approximate inverse techniques may provide accurate preconditioners for unsymmetric indefinite systems and that they should be valued as alternatives to standard techniques based on incomplete LU decomposition.

6.2 Electromagnetic Scattering - the Satellite Problem

In this subsection we review the electromagnetic scattering problem for a simplified satellite, see figure 7. The problem addresses the physical issue of detecting the diffraction pattern of the electromagnetic radiation scattered from the satellite when it is illuminated by an incident incoming wave. Electromagnetic scattering problems of large and complex bodies are of interest in the design of many industrial devices, like radars, antennae, computer microprocessors, cellular telephones and so on. Maxwell's equations describe how electric and magnetic fields are generated and altered by each other, and by charges and currents. However, for computational purposes the problem is converted to an integral formulation in terms of the surface current of the object. Once the surface current is known, we can derive the diffraction pattern of the electromagnetic radiation using Maxwell's equations. Hence, the surface current is the unknown quantity. The linear systems that are derived from these types of problems are among the most difficult to solve by iterative methods. The need for preconditioning of these problems, and especially the effectiveness of sparse approximate inverse techniques based on Frobenius norm minimization in comparison with other preconditioning techniques, is described by B. Carpentieri [5].

Figure 7: Simplified model of a satellite with a mesh of triangles on its surface.

The discretization of the problem is realized by constructing a mesh of triangles on the surface of the satellite, see figure 7. The unknowns of the system are associated with the vectorial flux across the edges of the mesh; hence, each column of the matrix that arises from this problem corresponds to an edge in the mesh. The right-hand side depends on the frequency and the direction of the incoming wave. For the purpose of this thesis we will not go into the details of the construction of the linear system. We do mention that the frequency ω is related to the wavelength λ by ω = 2πc/λ, where the constant c is the speed of light. For physical consistency we require about ten discretization points per wavelength, so that the linear systems arising from

43 high-frequency scattering problems can be very large.

For the SPAI preconditioner we compute the sparsity pattern in advance; the preconditioner is then constructed by solving the n least squares problems of eqn (4.9). For the a priori pattern selection we consider a model problem: the electromagnetic scattering problem of a sphere, see figure 8. The model problem is representative of the general trend and we will use the same approach for the satellite problem.

Figure 8: Model problem of a sphere with a mesh of triangles on its surface.

In the upper left corner of figure 9 we present the density plot of the matrix that arises from the electromagnetic scattering problem of the sphere. The matrix is completely dense; however, it contains many small entries. We sparsify the matrix by dropping these small values, see figure 9 (bottom left corner). In this figure we also present a density plot of the explicit inverse and the nonzero structure of a sparsified version of the explicit inverse (upper right and bottom right corner, respectively). The general trend observed for these types of problems is that the nonzero structures of the sparsified matrix and of the sparsified explicit inverse are very similar. Since the most important information is maintained in the sparsified matrices, it makes sense to choose as sparsity pattern for the preconditioner M ≈ A^{-1} the nonzero structure of the sparsified matrix A. For the satellite problem we use the same approach. The sparsity pattern that we obtained for the preconditioner is shown in figure 10 (left); the size of the matrix is n = 1699. The preconditioner is obtained by Frobenius norm minimization and is applied as a left-sided preconditioner, eqn (3.8). We solve the system with GMRES(60) with a tolerance of 1.0e−10 for the stopping criterion. Notice that the stopping criterion is based on the preconditioned residual M r^(k) := M(b − A x^(k)), which we call the Arnoldi residual. The method converges within 121 iterations, at which point the actual residual for the unpreconditioned system has the value 0.43e−08. We present the convergence plot in figure 10 (right).

Figure 9: The matrix A corresponding to the model problem of the sphere (upper left), a sparsified version of A (bottom left), the explicit inverse A^{-1} (upper right) and a sparsified version of the explicit inverse (bottom right).

6.2.1 Spectral deflation

In this subsection we propose a refinement technique for the SPAI preconditioner. The goal is to remove the effect of the smallest eigenvalues (in magnitude) from the preconditioned matrix; we know from section 3.1 that eigenvalues close to the origin are bad for convergence. This technique is known as spectral deflation. We apply this technique to the satellite problem to see how the convergence improves. We first consider the model problem of the sphere, as it is representative of the general trend. The eigenvalue distribution of the corresponding matrix is presented in figure 11 (left). We observe that the spectrum is widely spread in the complex plane, which explains the need for preconditioning. When we precondition the system with the Frobenius norm minimization method described earlier for the model problem, the eigenvalues become clustered at the point one, see figure 11 (right). However, we observe that there are still a few eigenvalues located near the origin, and we expect these eigenvalues to slow down the convergence significantly.

45 0 10

−2 10

−4 10

−6 10

Arnoldi residual −8 10

−10 10

−12 10 0 20 40 60 80 100 120 Iteration Step Figure 10: The pattern for the SPAI preconditioner to the satellite problem (left) and the convergence plot for GMRES(60) applied to the preconditioned system (right). as described earlier for the model problem, the eigenvalues become clustered at the point one, see figure 11 (right). However, we observe that there are still a few eigenvalues located near the origin and we expect these eigenvalues to slow down the convergence significantly.

We consider a system that is left-preconditioned, i.e. the system M_1 A x = M_1 b, where M_1 is the preconditioner obtained by Frobenius norm minimization. Assume for simplicity that the preconditioned matrix M_1 A is diagonalizable; this means that there exists a diagonal matrix D = diag(λ_i), where |λ_1| ≤ … ≤ |λ_n|, and a nonsingular matrix V = [v_1, …, v_n], where v_i is an associated right eigenvector, such that:

M_1 A = V D V^{-1}.

Figure 11: The eigenvalue distribution of the coefficient matrix corresponding to the model problem (left) and of the same coefficient matrix preconditioned by the Frobenius norm minimization method (right).

In addition we define the matrix U = [u1, . . . , un], where ui is an associated left eigenvector;

we then have U^H V = diag(u_i^H v_i), with u_i^H v_i ≠ 0 for every i. Here U^H denotes the Hermitian transpose of U, as U may contain complex eigenvectors. Now let V_ε denote the matrix of right eigenvectors associated with the eigenvalues satisfying |λ_i| ≤ ε, and define U_ε in a similar way. It is shown that the following theorem holds [5].

Theorem 6.1 Let

A_c = U_ε^H M_1 A V_ε,
M_c = V_ε A_c^{-1} U_ε^H M_1 and
M = M_1 + M_c.

Then MA is diagonalizable and we have MA = V diag(η_i) V^{-1} with

η_i = λ_i     if |λ_i| > ε,
η_i = 1 + λ_i if |λ_i| ≤ ε.



We do not provide a proof of the theorem, but it is easy to derive that A_c = diag(λ_i · u_i^H v_i), a matrix of size k × k, where k is the number of eigenvalues with absolute value at most ε. By using the diagonalization of M_1 A repeatedly, the result follows. Computing the eigenvalues and the associated eigenvectors is expensive and we want to reduce the cost as much as possible. The following theorem helps to reduce the cost by providing a preconditioner that is similar to the one obtained in theorem 6.1. We state the theorem without proof and refer the interested reader to the article by B. Carpentieri [5].

Theorem 6.2 Let W be such that

Ã_c = W^H A V_ε has full rank,
M̃_c = V_ε Ã_c^{-1} W^H and
M̃ = M_1 + M̃_c.

Then M̃A is similar to a matrix whose eigenvalues are

η_i = λ_i     if |λ_i| > ε,
η_i = 1 + λ_i if |λ_i| ≤ ε.



We expect V_ε to be of full rank k; consequently, a natural choice is to take W = V_ε. Now, without computing the left eigenvectors, we are able to construct a preconditioner M̃ such that the preconditioned matrix M̃A has the desired properties; in general we assume ε ≪ 1, so that the smallest eigenvalues become clustered at the point one as well. We have applied spectral deflation to the satellite problem. With the package ARPACK we have computed the smallest eigenvalues and their corresponding approximate eigenvectors. By successively shifting these eigenvalues, as described in theorem 6.2, we observe how the convergence behavior of GMRES(60) improves. The results are shown in table 2.

Table 2: Effect of shifting the eigenvalues nearest to zero on the convergence of GMRES(60). We show the number of successively shifted eigenvalues and the number of iterations required when these eigenvalues are shifted.

The Satellite Problem
Nr. of shifted eigenvalues | GMRES(60), Toler. 1e-10
 0                         | 121
 1                         | 112
 2                         | 107
 3                         |  97
 4                         |  91
 5                         |  85
 6                         |  80
 7                         |  76
 8                         |  66
 9                         |  60
10                         |  59

The convergence behavior keeps improving as the smallest eigenvalues are successively shifted; by shifting the ten smallest eigenvalues, GMRES(60) already converges twice as fast. We mention some remarks on spectral deflation. First of all, the preconditioning phase of GMRES(60) requires a matrix-vector (M-V) product with the matrix M = M_1 + M̃_c. The cost of the correction update M̃_c v within the M-V product Mv is 2nk + k², where k is the number of shifted eigenvalues. Secondly, the ARPACK package requires a significant number of M-V products for the computation of the smallest eigenvalues and the associated eigenvectors. For example, in order to shift 12 eigenvalues for the model problem of the sphere, we require 131 such products; for the slightly more complex problem of the cylinder, this already increases to 597. This should be compared with the approximately 120N M-V products that are required for N steps of the preconditioned GMRES(60) routine. However, the additional cost of constructing the preconditioner can be amortized if the preconditioner is reused for solving linear systems with the same coefficient matrix and several right-hand sides; this is actually a typical scenario for realistic electromagnetic simulations [5].
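
A minimal MATLAB sketch of the spectral correction of Theorem 6.2, with the choice W = V_ε, is given below. It is only an illustration: the eigenvectors are computed with eigs (MATLAB's interface to ARPACK, which is also used in the experiments above), the product M_1*A is formed explicitly for simplicity, and the function name is made up.

function applyM = deflated_preconditioner(A, M1, k)
% DEFLATED_PRECONDITIONER  Sketch of the spectral correction of Theorem 6.2 (W = V).
%   A  : coefficient matrix, M1 : sparse approximate inverse preconditioner,
%   k  : number of eigenvalues nearest the origin to shift.
%   Returns a function handle applying M = M1 + V*inv(Ac)*V' to a vector.
    B = M1 * A;                               % formed explicitly only for this illustration
    [V, ~] = eigs(B, k, 'smallestabs');       % right eigenvectors of M1*A nearest the origin
                                              % ('sm' in older MATLAB releases)
    Ac = V' * (A * V);                        % k-by-k matrix  Ac = W^H A V  with W = V
    applyM = @(r) M1 * r + V * (Ac \ (V' * r));   % rank-k correction: about 2nk + k^2 extra work
end

The returned handle can then be used as the preconditioning step in a GMRES(m) routine such as the one in Appendix B.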

7 Conclusion

In my thesis I have reviewed techniques for computing a sparse approximation to the inverse of a matrix. We have seen that the inverse of a sparse matrix is typically dense, but may contain a lot of entries with small magnitude. Sparse approximate inverse techniques can be effectively used in many applications of numerical analysis; the main focus of this thesis, however, was on the preconditioning of linear systems. We have discussed the concept of preconditioning and reviewed some well-known standard preconditioning techniques based on incomplete LU decomposition. These types of techniques are effectively used in many applications; however, we have seen that they might be ineffective, or might even fail, on unsymmetric and indefinite systems. The most important causes of failure are a breakdown due to a zero pivot and unstable linear solves in the preconditioning phase. We have shown that sparse approximate inverse techniques do not suffer from these problems and can be effectively used on indefinite and unsymmetric systems. Another issue we addressed in this thesis is the need for parallel implementation on high-performance computers. The standard techniques are in general highly sequential, both in construction and in application, whereas sparse approximate inverse techniques are in general inherently parallelizable. Preconditioning is still the main stumbling block in achieving high performance for large, sparse linear systems, and sparse approximate inverse techniques are promising techniques to overcome this. In a comparative study on preconditioners, we have shown that sparse approximate inverse techniques are competitive and sometimes even superior to the standard techniques based on incomplete LU decomposition. The focus was mainly on preconditioners obtained by Frobenius norm minimization of the inverse, i.e. the SPAI preconditioner. With the SPAI preconditioner we achieved good results on unsymmetric and indefinite (and in a particular case even ill-conditioned) systems from practical applications. With the ILUT preconditioner, convergence was either slow or the construction of the preconditioner broke down due to a zero pivot. Finally, we have reviewed the use of SPAI preconditioners in practical applications of electromagnetic scattering problems. Linear systems arising from electromagnetic scattering problems are among the most difficult systems to solve by iterative methods. Guided by a model problem of a sphere, we have studied the scattering problem of a simplified satellite. With the a priori pattern selection obtained from the nonzero structure of the sparsified matrix of the original problem, we achieved good results when the system was solved with GMRES(60). In addition we proposed a refinement technique based on spectral deflation and showed that the convergence was significantly improved by shifting up to ten of the smallest eigenvalues.

References

[1] S. Balay, J. Brown, K. Buschelman, V. Eijkhout, W. Gropp, D. Kaushik, M. Knepley, L. Curfman McInnes, B. Smith, and H. Zhang. PETSc Users Manual, Revision 3.4. Technical report, Argonne National Laboratory, 2013.

[2] M. Benzi and M. Tůma. A comparative study of sparse approximate inverse preconditioners. Applied Numerical Mathematics, 30(2–3):305–340, 1999.

[3] M. Benzi, C. D. Meyer, and M. Tůma. A sparse approximate inverse preconditioner for the conjugate gradient method. SIAM Journal on Scientific Computing, 17(5):1135–1149, 1996.

[4] M. Benzi and M. Tůma. A sparse approximate inverse preconditioner for nonsymmetric linear systems. SIAM Journal on Scientific Computing, 19(3):968–994, 1998.

[5] B. Carpentieri. Fast iterative solution methods in electromagnetic scattering. Progress In Electromagnetics Research, 79:151–178, 2008.

[6] E. Chow. Parallel implementation and performance characteristics of least squares sparse approximate inverse preconditioners. International Journal of High Performance Computing Applications, 2000.

[7] E. Chow. ParaSails: Parallel sparse approximate inverse (least-squares) preconditioner. Software documentation, Lawrence Livermore National Laboratory, 2001.

[8] E. Chow and Y. Saad. Experimental study of ILU preconditioners for indefinite matrices. Journal of Computational and Applied Mathematics, 86(2):387–414, 1997.

[9] J. J. Dongarra, I. S. Duff, D. C. Sorensen, and H. A. van der Vorst. Numerical Linear Algebra for High-Performance Computers, volume 7 of Software, Environments, and Tools. SIAM, Philadelphia, PA, 1998.

[10] A. Greenbaum. Iterative Methods for Solving Linear Systems. Frontiers in Applied Mathematics, No. 17. SIAM, Philadelphia, 1997.

[11] M. J. Grote and T. Huckle. Parallel preconditioning with sparse approximate inverses. SIAM Journal on Scientific Computing, 18(3):838–853, 1997.

[12] I. C. F. Ipsen and C. D. Meyer. The idea behind Krylov methods. The American Mathematical Monthly, 105(10):889–899, 1998.

[13] L. Yu. Kolotilina and A. Yu. Yeremin. Factorized sparse approximate inverse preconditionings I. Theory. SIAM Journal on Matrix Analysis and Applications, 14(1):45–58, 1993.

[14] A. Quarteroni, R. Sacco, and F. Saleri. Numerical Mathematics, volume 37. Springer, 2007.

[15] Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, second edition, 2003.

[16] A. Yu. Yeremin and A. A. Nikishin. Factorized-sparse-approximate-inverse preconditionings of linear systems with unsymmetric matrices. Journal of Mathematical Sciences, 121(4):2448–2457, 2004.

Appendices

A ILUT algorithm

Algorithm A.1 ILUT algorithm, IKJ variant

1:  for i = 2, ..., n do
2:      w := a_{i*}
3:      for k = 1, ..., i−1 and when w_k ≠ 0 do
4:          w_k := w_k / a_{kk}
5:          Apply a dropping rule to w_k
6:          if w_k ≠ 0 then
7:              w := w − w_k u_{k*}
8:          end if
9:      end for
10:     Apply a dropping rule to row w
11:     l_{ij} := w_j for j = 1, ..., i−1
12:     u_{ij} := w_j for j = i, ..., n
13:     w := 0
14: end for
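For concreteness, the following is a minimal MATLAB sketch of the IKJ variant above; it is not part of the thesis codes. It uses a simple drop tolerance tau relative to the 2-norm of each row, while the fill-in limit p of the full ILUT(p, tau) is omitted for brevity. The division by the current diagonal of U shows where the construction breaks down on a zero pivot.

function [L, U] = ilut_sketch(A, tau)
% Minimal MATLAB sketch of Algorithm A.1 (IKJ variant of ILUT).
% Dropping rule: entries smaller in magnitude than tau * norm(A(i,:)) are
% discarded; the fill-in limit p of ILUT(p,tau) is omitted for brevity.
n = size(A, 1);
L = speye(n);                 % unit lower triangular factor
U = sparse(n, n);
U(1, :) = A(1, :);            % the first row is copied unchanged
for i = 2:n
    w = full(A(i, :));        % working row w := a_{i*}
    droptol = tau * norm(w);
    for k = 1:i-1
        if w(k) ~= 0
            w(k) = w(k) / U(k, k);    % multiplier; breaks down if U(k,k) = 0
            if abs(w(k)) < droptol
                w(k) = 0;             % dropping rule applied to w_k
            else
                w(k+1:n) = w(k+1:n) - w(k) * full(U(k, k+1:n));
            end
        end
    end
    d = w(i);                 % never drop the diagonal entry
    w(abs(w) < droptol) = 0;  % dropping rule applied to the row
    w(i) = d;
    L(i, 1:i-1) = w(1:i-1);   % l_{ij} := w_j for j < i
    U(i, i:n)   = w(i:n);     % u_{ij} := w_j for j >= i
end
end

The factors can then be applied as a preconditioner through the two triangular solves U \ (L \ r).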

B MATLAB code for GMRES(m)

The GMRES(m) routine is implemented in MATLAB with the following code.

function [x, error, iter, mvprod, flag] = gmres_spai( A, x, b, M, restrt, max_it, tol )

%  -- Iterative template routine --
%     Univ. of Tennessee and Oak Ridge National Laboratory
%     October 1, 1993
%     Details of this algorithm are described in "Templates for the
%     Solution of Linear Systems: Building Blocks for Iterative
%     Methods", Barrett, Berry, Chan, Demmel, Donato, Dongarra,
%     Eijkhout, Pozo, Romine, and van der Vorst, SIAM Publications,
%     1993. (ftp netlib2.cs.utk.edu; cd linalg; get templates.ps).
%
% [x, error, iter, mvprod, flag] = gmres_spai( A, x, b, M, restrt, max_it, tol )
%
% gmres_spai.m solves the linear system Ax = b using the Generalized
% Minimal Residual method with restarts, GMRES(m).
%
% input   A        REAL nonsymmetric matrix
%         x        REAL initial guess vector
%         b        REAL right hand side vector
%         M        REAL preconditioner matrix (applied by multiplication)
%         restrt   INTEGER number of iterations between restarts
%         max_it   INTEGER maximum number of iterations
%         tol      REAL error tolerance
%
% output  x        REAL solution vector
%         error    REAL error norm
%         iter     INTEGER number of iterations performed
%         mvprod   INTEGER number of actual M-V products
%         flag     INTEGER: 0 = solution found to tolerance
%                           1 = no convergence given max_it

iter = 0;                                          % initialization
flag = 0;
mvprod = 0;

bnrm2 = norm( b );
if ( bnrm2 == 0.0 ), bnrm2 = 1.0; end

r = M * ( b - A*x );
mvprod = mvprod + 1;

error = norm( r ) / bnrm2;
if ( error < tol ), return, end

[n,n] = size(A);                                   % initialize workspace
m = restrt;
V(1:n,1:m+1) = zeros(n,m+1);
H(1:m+1,1:m) = zeros(m+1,m);
cs(1:m) = zeros(m,1);
sn(1:m) = zeros(m,1);
e1 = zeros(n,1);
e1(1) = 1.0;

for iter = 1:max_it,                               % begin iteration

   r = M * ( b - A*x );
   mvprod = mvprod + 1;

   V(:,1) = r / norm( r );
   s = norm( r )*e1;
   for i = 1:m,                                    % construct orthonormal
      w = M * (A*V(:,i));                          % basis using Gram-Schmidt
      mvprod = mvprod + 1;

      for k = 1:i,
         H(k,i) = w'*V(:,k);
         w = w - H(k,i)*V(:,k);
      end
      H(i+1,i) = norm( w );
      V(:,i+1) = w / H(i+1,i);
      for k = 1:i-1,                               % apply Givens rotation
         temp     = cs(k)*H(k,i) + sn(k)*H(k+1,i);
         H(k+1,i) = -sn(k)*H(k,i) + cs(k)*H(k+1,i);
         H(k,i)   = temp;
      end
      [cs(i),sn(i)] = rotmat( H(i,i), H(i+1,i) );  % form i-th rotation matrix
                                                   % (rotmat.m from the same templates)
      temp   = cs(i)*s(i);                         % approximate residual norm
      s(i+1) = -sn(i)*s(i);
      s(i)   = temp;
      H(i,i) = cs(i)*H(i,i) + sn(i)*H(i+1,i);
      H(i+1,i) = 0.0;
      error  = abs(s(i+1)) / bnrm2;
      if ( error <= tol ),                         % update approximation
         y = H(1:i,1:i) \ s(1:i);                  % and exit
         x = x + V(:,1:i)*y;
         break;
      end
   end
   fprintf('iter = %d, res = %g\n', iter, error);

   if ( error <= tol ), break, end
   y = H(1:m,1:m) \ s(1:m);
   x = x + V(:,1:m)*y;                             % update approximation
   r = M * ( b - A*x );                            % compute residual
   mvprod = mvprod + 1;
   s(i+1) = norm(r);
   error = s(i+1) / bnrm2;                         % check convergence
   if ( error <= tol ), break, end;
end

if ( error > tol ), flag = 1; end;                 % no convergence within max_it

% END of gmres_spai.m

The routine is applied to the systems RDB2048 and FIDAP032, with m = 60, by the following script.

clear all; close all; clc

load matrices

fprintf('\n');
fprintf('* RDB2048L problem * \n');

A = rdb2048;
M = rdb2048_spai;

n = size(A,1);
restrt = 60;
tol = 1e-8;
b = A*ones(n,1);
x0 = zeros(n,1);
max_it = n;

[x, error, iter, mvprod, flag] = gmres_spai( A, x0, b, M, restrt, max_it, tol);

%%
fprintf('\n\n');
fprintf('* FIDAP032 problem * \n');

pause(0)

A = fidap032;
M = fidap032_spai;

n = size(A,1);
restrt = 60;
tol = 1e-8;
b = A*ones(n,1);
x0 = zeros(n,1);
max_it = n;

[x, error, iter, mvprod, flag] = gmres_spai( A, x0, b, M, restrt, max_it, tol);

fprintf('\n');
