Towards a Cost-Effective Ilu Preconditioner with Higher

TOWARDS A COST-EFFECTIVE ILU PRECONDITIONER WITH HIGHER LEVEL FILLS z y z E. F. D'AZEVEDO , P. A. FORSYTH AND WEI-PAI TANG Abstract. A recently prop osed Minimum Discarded Fill (MDF ) ordering (or pivoting) technique is e ective in nding high quality ILU (`) preconditioners, esp ecially for problems arising from unstructured nite element grids. This algorithm can identify anisotropy in complicated physical structures and orders the unknowns in a \preferred" direction. However, the MDF ordering is costly, when ` increases. In this pap er, several less exp ensive variants of the MDF technique are explored to pro duce cost- e ective ILU preconditioners. The Incomplete MDF and Threshold MDF orderings combine MDF ideas with drop tolerance techniques to identify the sparsity pattern in the ILU preconditioners. These techniques pro duce orderings that encourage fast decay of the entries in the ILU factorization. The Minimum Up date Matrix (MUM ) ordering technique is a simpli cation of the MDF ordering and is an analogue of the minimum degree algorithm. The MUM ordering is esp ecially e ective for large matrices arising from Navier-Stokes problems. Key Words. minimum discarded ll(MDF ), incomplete MDF , threshold MDF , minimum up- dating matrix(MUM ), incomplete factorization, matrix ordering, preconditioned conjugate gradient, high-order ILU factorization. AMS(MOS) sub ject classi cation. 65F10, 76S05 1. Intro duction. The use of Preconditioned Conjugate Gradient (PCG ) metho ds has proven to b e a robust and comp etitive solution technique for large sparse matrix problems [1, 22, 28]. A vital step for the successful application of PCG metho ds is the computation of a high quality preconditioner. Many previous studies have explored this topic and their various approaches can b e summarized as follows: When the incomplete LU (ILU ) preconditioner was rst prop osed it was observed that as the ll level ` in an ILU decomp osition increases, the quality of the ILU (`) preconditioner improves [1, 19, 22, 25, 29]. Unfortunately, the resulting reduction in the numb er of iterations cannot comp ensate for the in- creased cost of the factorization, and forward and backward solve in each iteration. The most cost-e ective ll level is ` = 1; 2 in most cases. For some problems, the high-level ll entries in the ILU decomp osition are numerically very small and contribute little to the quality of the preconditioner. This observation leads naturally to the improved technique describ ed b elow. Based on this observation with the high-level- ll approach, a drop tolerance technique was investigated by Munksgaard [30] and Zlatev [37]. The strategy This work was supp orted by the Natural Sciences and Engineering Research Council of Canada, by the Information Technology Research Centre, which is funded by the Province of Ontario, and by the Applied Mathematical Sciences subprogram of the Oce of Energy Research, U.S. Department of Energy under contract DE-AC05-84OR21400 with Martin Marietta Energy Systems, Inc., through an app ointment to the U.S. Department of Energy Postgraduate Research Program administered by Oak Ridge Asso ciated Universities. y Mathematical Sciences Section, Oak Ridge National Lab oratory, Oak Ridge, Tennessee 37831. z Department of Computer Science, University of Waterlo o, Waterlo o, Ontario, Canada N2L 3G1. 1 for deciding the sparsity pattern of a preconditioner was to discard all \small" ll entries that are less than a given tolerance. This approach works well for the mo del Laplace problem. However, it was so on noticed that if the original ordering was \p o or", then the decay of the ll entries in the decomp osition was very slow, sometimes leading to an unacceptably dense decomp osition. Consider the following problem, ! ! @ @P @ @P (1) K + K = q x y @ x @ x @ y @ y with a 5-p oint discretization on the regular grid, where K , K > 0. When x y K K , the drop tolerance approach will pro duce a very dense precondi- x y tioner, if a natural row ordering is used. In contrast, a cost-e ective preconditioner is yielded using the same approach, when K K [8]. Interesting y x results concerning the e ects of ordering on the p erformance of PCG were rep orted recently in [8, 14, 12, 13, 15, 31]. Greenbaum and Ro drigue [18] addressed the problem of computing an optimal preconditioner for a given sparsity pattern. The values of the nonzero elements in the preconditioner were determined by numerical optimization techniques for a given sparsity pattern. Kolotilina and Yeremin investigated some least squares approximations for blo ck incomplete factorization [24]. Their results provide insights of a theoretical nature, but have limited practical application. Numerous sp ecial techniques that pro duce go o d preconditioners for elliptic problems and domain decomp osition metho ds have also b een rep orted [3, 4, 20, 23]. These approaches are very successful for applying PCG to the solutions of elliptic PDE's. However, they are dicult to generalize to Jacobians arising from systems of PDE's and to general sparse matrix problems. In summary, the following factors are observed to have a signi cant impact on the quality of a preconditioner: 1. The ordering of the unknowns in the original matrix A. 2. The sparsity pattern of the preconditioner M . 3. How closely the sp ectrum of M resembles that of A. Motivated by the signi cant e ect of ordering on the preconditioner, we have pro- p osed the Minimum Discarded Fill (MDF (`)) ordering (or pivoting strategy) [8] for general sparse matrices. This ordering technique is e ective in nding high quality ILU (`) preconditioners, esp ecially for problems arising from unstructured nite element grids. This algorithm can identify anisotropy in complicated physical structures and orders the unknowns in a \preferred" direction. Numerical testing of the MDF (`) metho d shows that it yields b etter convergence p erformance than some other orderings. The MDF (`) ordering is successful b ecause it takes into account the numerical values of matrix entries during the factorization pro cess, and not just the top ology of the mesh. However, the MDF (`) ordering may b e costly to pro duce. A rough estimate of the 3 cost of MDF (`) ordering is N d , where d is the average numb er of nonzero elements in 2 t 1 each row of the ll matrix, L + L , and N is the size of the original matrix . Thus, if the average numb er of nonzero in each row is large, the MDF (`) ordering is practical only when similar matrix problems need to b e solved several times, such as solving the Jacobian matrices in a Newton iteration [5, 7]. In this pap er, several variants of the MDF ordering algorithm for use with a drop tolerance incomplete factorization are investigated. Section 2 contains a detailed de- scription of these ordering techniques. Test problems are describ ed in Section 3 and the numerical results and discussion are provided in Section 4, with our concluding remarks in the last section. Results from applying the MDF algorithm and its variants to unsymmetric matrices derived from a system of PDE's will b e discussed in another pap er [5]. 2. Algorithms . 2.1. ILU (`) factorization. Let us de ne the nonzero sparsity pattern of a matrix C as the the set (2) NZ [C ] = f(i; j ) j c 6= 0g : ij ~ Given a sparsity pattern P , denote C := C [P ] to b e the matrix extracted from C with sparsity pattern de ned by P , ( c if (i; j ) 2 P ; ij (3) c~ = ij 0 otherwise. Apply an Incomplete LU (ILU ) factorization to a sparse matrix problem Ax = b. 2 After k steps of the factorization, we have the following (incomplete LD U ) decomp osition , " #" #" # t L 0 D 0 U Q k k k k (4) A ! ; P I 0 A 0 I k nk k nk where L (U ) is k k lower (upp er) unit triangular, D is k k diagonal matrix, P and k k k k Q are (n k ) k , I is the (n k ) (n k ) identity, and A is the (n k ) (n k ) k nk k submatrix remaining to b e factored. Let A = A and G = (V ; E ), k = 0; 1; ; n 1 0 k k k b e the graphs [32] of A , i.e. k o n (k ) (5) 6= 0; i; j > k ; : V = fv ; : : : ; v g ; E = (v ; v ) j a k k +1 n k i j ij Some new ll entries in A may b e created in the factorization pro cess, yet many lls k are also discarded. A common criterion for discarding a new ll is by its ll level. The notion of \ ll-level" can b e de ned through reachable sets and the shortest path length in the graph G [17]. Let S : fv ; : : : ; v g b e the set of the eliminated no des at step 0 k 1 k k , S V , and cho ose no des u; v 62 S . No de u is said to b e reachable from vertex v k 0 k through S if there exists a path (v ; u ; : : : ; u ; u) in graph G , such that each u 2 S , k i i 0 i k m 1 j 1 To simplify notation, a case of symmetric nonzero structure is presented here. 2 we assume the nal elimination sequence is v ; : : : ; v . 1 n 3 1 j m. Note that m can b e zero, so that any pair of adjacent no des u; v 62 S is k reachable through S .

Load more