Proceedings of the World Congress on Engineering 2007 Vol II WCE 2007, July 2 - 4, 2007, London, U.K.

Avant-Garde Splitting for the Solution of Sparse Non-symmetric Linear Systems

A. A. Shah
Institute of Business Administration, City Campus, Garden/Kiyani Shaheed Road, Karachi.
[email protected]

Abstract--A non-symmetric matrix splitting is presented for the solution of certain sparse linear systems. The author reports the comparison and the convergence performance of previous and contemporary methods and includes explicit comments. It is shown that some recent methods may work efficiently with a symmetric matrix but exhibit deficiencies, such as numerical instability and lack of scaling invariance, with a non-symmetric matrix. The proposed matrix splitting can overcome these deficiencies.

Keywords: M-matrices, orthogonalization, Krylov space, iterative methods

1-Introduction.

Recently some orthogonalization iterative methods, e.g. GMRES or generalized minimal residual methods, have been used for the solution of sparse non-symmetric linear systems,

Ax = b ..........(1.1)

where A is a sparse matrix. These methods have been matched up and compared in many studies (see, for example, [2], [3], [15], [16], [17] and [19]). These methods in fact minimize r_n = b - Ax_n, the nth residual, over a Krylov space K^n(A; r_0). With an initial presumption x_0 a unique sequence {x_n} is created with:

x_n ∈ x_0 + < r_0, Ar_0, A^2 r_0, ..., A^(n-1) r_0 > ..........(1.2)

satisfying:

||r_n|| = minimum, ..........(1.3)

which is equivalent to the orthogonality condition:

r_n ⊥ < Ar_0, A^2 r_0, A^3 r_0, ..., A^n r_0 > ..........(1.4)

The vigorous execution of (1.2)-(1.4) is the GMRES iteration method. The execution uses the Arnoldi method (see [1]) to create an orthogonal basis for the Krylov space K^n(A; r_0), which leads to an (n + 1) x n Hessenberg least-squares problem (see [16]). At each step we have e_n = p_n(A)e_0 and r_n = p_n(A)r_0, where p_n(z) is a polynomial of degree n with p_n(0) = 1. Convergence will take place if and only if a p_n exists for which ||p_n(A)r_0|| decreases rapidly, and a sufficient condition for this is that ||p_n(A)|| should decrease rapidly. GMRES is in fact well suited to normal matrices; unfortunately, nonsymmetric matrices are rarely normal. There is also a storage-requirement problem and, to keep the storage requirement under control, GMRES is often restarted after each k steps (see [15]). It is also reported that the GMRES algorithm is unreliable, as there are instances where the residual norms produced by the algorithm, although non-increasing, do not converge to zero (see [16]). As far as the GMRES-like methods, for example the BCG and CGS methods, are concerned, it is reported that they are susceptible to the possibility of breakdown, i.e. division by zero (see [15]). Gutknecht ([9]) has presented a BiCGStab method for matrices with complex spectrum.

The slightly older methods for the solution of (1.1) rely on variants of point and block ICCG, where the conjugate gradient algorithm is preconditioned by a partial factorization of the coefficient matrix. However, these highly refined and efficient techniques do not extend to the more general case of unsymmetric linear systems (see [4]). Kershaw [12] generalizes the ICCG method to arbitrary non-singular sparse matrices arising from partial differential equations and obtains the modified equation:

ISBN: 978-988-98671-2-6

x^(k) = x^(0) + Σ_{i=1}^{k} α_i (A^T A)^(i-1) A^T (Ax^(0) - b),

where the α_i are chosen to minimize ||x^(k) - x||. The main problem with this approach is that the amount of work per iteration is almost doubled, because it is necessary to multiply by A and A^T at each iteration in order to avoid the actual formation of A^T A in the sparse case. It can also be noted that when A is not symmetric, (A^T A)^(1/2) may be a poor approximation to A^T or A and, therefore, (A^T A)^(1/2) cannot be used in place of A.

In other studies more importance is given to the splitting of the coefficient matrix, which brings the storage problem to a close and provides reliable methods for the solution of non-symmetric systems (see [17] and [13]). We present the results and findings of one of these studies for a general reader. The computer codes of the method are published. We also present some new theorems which were proved recently. The method is designed for the solution of non-symmetric systems (1.1), but it is capable of using all the robust techniques developed for the solution of symmetric linear systems. The method is found to be most suitable for the iterative solution of (1.1) where A is a sparse unsymmetric M-matrix or a nearly symmetric structured M-matrix. We propose a particular class of regular splitting, call this class the generalized regular splitting or GRS, and call the corresponding iterative method the generalized regular splitting iterative method or GRSI method. A simple technique is designed to produce a symmetric and positive definite (SPD) splitting matrix. This allows the stable and efficient incomplete Cholesky factorization of the splitting matrix as the preliminary step of an iterative solution. Because of the presence of an SPD matrix, any other efficient technique developed for symmetric systems can be used.

Definition 1.1:
Given an M-matrix A, the splitting

A = (S + Δ) - (H + Δ) ..........(1.5)

is called the generalized regular splitting or GRS if S = [s_i,j] is a symmetric M-matrix such that:

s_i,j = max{a_i,j, a_j,i} if i ≠ j,
s_i,j = a_i,j if i = j.

The matrix H = [h_i,j] is such that h_i,j = s_i,j - a_i,j ≥ 0 for all i, j. The non-negative diagonal matrix Δ = [δ_i,j] is such that:

δ_i,j = 0 if i ≠ j,
δ_i,j = δ if i = j.

2- Choice of δ.

If the matrix S is not diagonally dominant, a non-negative diagonal matrix Δ = [δ_i,j] is added to make S diagonally dominant and so avoid breakdown, where δ is small and is chosen so that (S + Δ) is a symmetric and diagonally dominant M-matrix and (H + Δ) ≥ 0. A simple choice of δ could be:

δ = max_{i=1,...,n} ( Σ_{j=1, j≠i}^{n} |a_i,j| - a_i,i ) + 10^(-7)

Such matrices (S + Δ) can easily be factorized in a stable manner into incomplete Cholesky factors (see, for example, [14]). It is noted that the smaller the Δ, the faster the convergence of the GRSI method.

3- Description of the algorithm.

Let us consider the system Ax = b, where A = [a_i,j] is a sparse nonsymmetric N x N M-matrix. Let

A = (S + Δ) - (H′ + Δ) ..........(3.1)

be the GRS of A. As (S + Δ) is a symmetric and diagonally dominant M-matrix, it can be decomposed into triangular factors or incomplete Cholesky factors, i.e. (S + Δ) = LDL^T - E is a regular splitting (see [14]), where E is computed explicitly. It then follows that A = LDL^T - (H′ + E + Δ) is a regular splitting. Substituting H for H′ + E, the system can be written:

LDL^T x^(k) = b + (H + Δ) x^(k-1), k ∈ <n> = {1, 2, ..., n},

where x^(0) is arbitrary. Substituting the values of x^(k-1) in turn we obtain:

x^(k) = [ I + {(LDL^T)^(-1)(H + Δ)} + {(LDL^T)^(-1)(H + Δ)}^2 + ... + {(LDL^T)^(-1)(H + Δ)}^(k-1) ] (LDL^T)^(-1) b + {(LDL^T)^(-1)(H + Δ)}^k x^(0) ..........(3.2)
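The splitting of Definition 1.1, the choice of δ in Section 2 and the iteration (3.2) can be sketched in a few lines. The following pure-Python sketch is illustrative only: the matrix, right-hand side and iteration count are assumptions rather than data from the paper, and a dense Gaussian-elimination solve stands in for the LDL^T (incomplete Cholesky) factorization of (S + Δ).

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting
    (a stand-in for the LDL^T factorization of S + Delta)."""
    n = len(M)
    a = [row[:] + [r] for row, r in zip(M, rhs)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x

# Illustrative nonsymmetric M-matrix (an assumption, not from the paper).
A = [[ 4.0, -1.0,  0.0],
     [-2.0,  5.0, -1.0],
     [ 0.0, -1.0,  4.0]]
n = len(A)

# Definition 1.1: s_ij = max(a_ij, a_ji) for i != j, s_ii = a_ii, H = S - A.
S = [[A[i][j] if i == j else max(A[i][j], A[j][i]) for j in range(n)]
     for i in range(n)]
H = [[S[i][j] - A[i][j] for j in range(n)] for i in range(n)]
assert all(h >= 0 for row in H for h in row)       # H >= 0, as required

# Section 2: this S is already diagonally dominant, so Delta = 0 here.
assert all(S[i][i] > sum(abs(S[i][j]) for j in range(n) if j != i)
           for i in range(n))

# GRSI iteration: (S + Delta) x^(k) = (H + Delta) x^(k-1) + b.
b = [2.0, 5.0, 10.0]     # chosen so that the exact solution is [1, 2, 3]
x = [0.0] * n
for _ in range(40):
    rhs = [sum(H[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
    x = solve(S, rhs)
print(x)
```

Since the splitting is regular and A is an M-matrix, the iteration converges for any starting vector (Section 4); here it reaches the exact solution [1, 2, 3] to machine accuracy. In the paper, (S + Δ) would be factorized once into its (incomplete) Cholesky factors and the factors reused at every step.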


If G = (LDL^T)^(-1)(H + Δ), then

x^(k) = [ I + G + G^2 + G^3 + ... + G^(k-1) ] (LDL^T)^(-1) b + G^k x^(0) ..........(3.3)

Moreover x^(k) → A^(-1)b, i.e. the solution of the linear system, because the spectral radius ρ(G) is less than one, so that G^k → 0 and, as k → ∞,

Σ_{p=0}^{k-1} G^p → (I - G)^(-1).

A number of authors, e.g. [7] and [11], have devised similar splittings for the case when A is a large sparse singular and irreducible M-matrix. They also worked with nearly symmetric matrices, which they define as having a symmetric zero structure. The splitting matrix is constructed to be symmetric by dropping terms from A and, as a result, the direct part of the direct-iterative method can take advantage of (i) symmetric pivoting, (ii) a standard symmetric ordering scheme and (iii) a static storage scheme for the factor L. The authors report that the direct-iterative method is faster than straightforward iteration with the Gauss-Seidel method. The splitting we propose has similar advantages, but no terms from A are dropped.

4-Convergence

The GRSI method associated with the GRS splitting is given by

(S + Δ) x^(k) = (H + Δ) x^(k-1) + b

or

x^(k) = (S + Δ)^(-1)(H + Δ) x^(k-1) + (S + Δ)^(-1) b

or

x^(k) = G x^(k-1) + η,

where the iteration matrix is given by G = (S + Δ)^(-1)(H + Δ) and η = (S + Δ)^(-1) b.

Theorem 4-1: ([18], p. 89). If A = P - Q is a regular splitting of the matrix A and A^(-1) ≥ 0, then

ρ(P^(-1)Q) = ρ(A^(-1)Q) / (1 + ρ(A^(-1)Q)) < 1, ..........(4.1)

i.e. the matrix P^(-1)Q is convergent, and the iterative method x^(k) = (P^(-1)Q) x^(k-1) + P^(-1) b converges for any initial value x^(0) of x if (4.1) is satisfied.

Theorem 4-2: The spectral radius of the iteration matrix of the GRSI method is less than 1, and hence the method converges for any initial value x^(0) of x when the coefficient matrix A is an M-matrix.

Proof: Follows directly from Theorem 4-1, since the GRS is a regular splitting and A^(-1) ≥ 0.

Lemma 4-1: The iteration matrix G of the GRSI method is a non-negative matrix when A is an M-matrix.

Proof: A = (S + Δ) - (H + Δ). Now (S + Δ)^(-1) ≥ 0, as (S + Δ) is an M-matrix, and (H + Δ) ≥ 0; therefore G = (S + Δ)^(-1)(H + Δ) ≥ 0.

Theorem 4-3: ([20], p. 125). Let A be a monotone matrix and let A = Q_1 - R_1 and A = Q_2 - R_2 be two regular splittings of A. If R_2 ≤ R_1, then ρ(Q_2^(-1)R_2) ≤ ρ(Q_1^(-1)R_1).

Theorem 4-4: The smaller the δ, the faster the convergence of the GRSI method.

Proof: Obvious from Theorem 4-3.

Theorem 4-5: Let A = [a_i,j] be an N x N non-symmetric M-matrix and let x ≥ 0 be any non-zero vector, with r_x = min_{1≤i≤N} { Σ_{j=1}^{N} a_i,j x_j / x_i }. If A = (S + Δ) - (H + Δ) is a GRS of A, then β ≥ α, where β is any eigenvalue of (S + Δ) and α = sup_{x ≥ 0, x ≠ 0} r_x.

Proof: A = (S + Δ) - (H + Δ) ⇒ (S + Δ)^(-1) A = I - (S + Δ)^(-1)(H + Δ) = I - G, i.e. G = I - (S + Δ)^(-1) A, and by Lemma 4-1 G ≥ 0. Since A^(-1) ≥ 0, [I - (S + Δ)^(-1) A] A^(-1) = G A^(-1) ≥ 0, i.e. A^(-1) - (S + Δ)^(-1) ≥ 0, i.e. A^(-1) ≥ (S + Δ)^(-1). By Theorem 4-3, 1/α ≥ 1/β ⇒ β ≥ α.

Theorem 4-6: Let G = [g_i,j] ≥ 0 be an N x N GRSI matrix, and let P* be the hyperoctant of vectors x ≥ 0. Then, for any x ∈ P*, either


min_{1≤i≤N} [ Σ_{j=1}^{N} g_i,j x_j / x_i ] < ρ(G) < max_{1≤i≤N} [ Σ_{j=1}^{N} g_i,j x_j / x_i ]

or

min_{1≤i≤N} [ Σ_{j=1}^{N} g_i,j x_j / x_i ] = max_{1≤i≤N} [ Σ_{j=1}^{N} g_i,j x_j / x_i ] = ρ(G).

Moreover,

sup_{x ∈ P*} { min_{1≤i≤N} [ Σ_{j=1}^{N} g_i,j x_j / x_i ] } = ρ(G) = inf_{x ∈ P*} { max_{1≤i≤N} [ Σ_{j=1}^{N} g_i,j x_j / x_i ] }.

Proof: Follows the steps of Theorem 2.2 of Varga ([18], p. 32).

Upper and lower bounds for the spectral radius of the iteration matrix of the GRSI method can thus be obtained by simple arithmetic steps.

5-Comparison with the standard methods.

There are economical and efficient techniques for the solution of symmetric linear systems. Furthermore, the situation regarding the availability of software implementing iterative schemes for symmetric systems is also satisfactory. As mentioned earlier, these techniques for the solution of symmetric systems cannot be extended to the case of nonsymmetric linear systems. As far as the availability of software is concerned, apart from the codes of Paige and Saunders most are experimental. Packages like SPARSPAK (see George and Liu [8]) and the Yale Sparse Matrix Package (see Duff and Reid [5]) are numerically sound if the coefficient matrix of the system of linear equations is SPD. If it is not, the packages may still be used but may be numerically unsound (see [13]). The GRSIM subroutine is designed to harness the ease and comfort of all the efficient methods currently solving symmetric systems of linear equations for the solution of nonsymmetric systems. Any technique which can solve a symmetric system easily and economically can be incorporated into the GRSI method to form a pristine version of the GRSI method, e.g. the GRSI Cholesky factorization or GRSI-CF version and the GRSI incomplete Cholesky factorization or GRSI-ICF version. The method uses a regular splitting; therefore the convergence is guaranteed. For an unsymmetric M-matrix it can easily be shown numerically that the GRSI method converges faster than the Jacobi and Gauss-Seidel methods. The GRSI method is not compared with the direct solution method for the unsymmetric system because the direct solution method is only about twice as expensive as a Cholesky factorization (see [6]).

6-Numerical Testing.

We illustrate the numerical behaviour of the GRSI method by the following simple examples. In most of the experiments we consider the linear elliptic equation:

a u_xx + c u_yy + d u_x + e u_y + f u = g(x, y) ..........(6.1)

in the rectangular region R: 0 ≤ x ≤ α, 0 ≤ y ≤ β, with Dirichlet boundary conditions. We suppose, for definiteness, that a > 0, c > 0 and f ≤ 0, and that all are bounded in the region R and on its boundary B. Upon employing second-order central difference procedures, the finite-difference approximation for the above equation becomes:

β_1 U_{i+1,j} + β_2 U_{i-1,j} + β_3 U_{i,j+1} + β_4 U_{i,j-1} - β_0 U_{i,j} = h^2 g_i,j

where the β_k are functions of x_i = ih, y_j = jh, given by:

β_0 = 2(a_i,j + c_i,j - ½ h^2 f_i,j), β_1 = a_i,j + ½ h d_i,j, β_2 = a_i,j - ½ h d_i,j, β_3 = c_i,j + ½ h e_i,j, β_4 = c_i,j - ½ h e_i,j.

The notation a_i,j refers to a(ih, jh), evaluated at the point where the computational module is centered. The coefficient matrix A so obtained is an irreducible M-matrix.

Example-1.

We now give the approximate number of multiplication operations needed for the solution of (1.1), obtained by the discretization of (6.1), by different iterative methods (see [14]). Suppose N denotes the order of the coefficient matrix A. The initial work, such as the work


necessary for the estimation of the iteration parameter for the SOR method and the work for the decomposition of the GRSI matrix into incomplete Cholesky factors, has been neglected. This work will in general be small compared with the computational work needed to carry out the actual iterations. Accurate determination of the SOR parameter may be difficult in some circumstances.

TABLE-1
Approximate number of multiplication operations needed for one iteration in the solution of Ax = b (A is an N x N matrix)

    Method         No. of operations
    SOR            6N
    Gauss-Seidel   5N
    Jacobi         5N
    GRSI           7N

A number of nonsymmetric linear systems of equations were obtained by using the natural mesh ordering and discretizing (6.1) with arbitrary values of the coefficients a, c, d, e, f and the mesh size h. A comparison was made on the basis of the asymptotic rate of convergence of the GRSI-CF, GRSI-ICF, Jacobi, Gauss-Seidel and SOR methods. We report a few of these results. The matrix Δ is taken to be a zero matrix and the optimum parameter for SOR is used throughout.

Example-2.

We considered a BST matrix with known sparsity and eigenvalues. The result is given in the following table. The asymptotic average rates of convergence of the GRSI-ICF, GCW-SI, SOR (with optimum relaxation factor), Gauss-Seidel and Jacobi methods are noted respectively as 2.9, 1.7, 1.6, 1.5 and 0.76, which shows that GRSIM is four times faster than the Jacobi method and two times faster than the SOR and Gauss-Seidel methods.

TABLE II
Number of iterations needed for convergence with tolerance 10^(-7)

    Method         Spectral radius S(G)   η = rate of convergence
    GRSI-ICF       0.0781830              2.5487025
    SOR            0.1528233              1.8784726
    Gauss-Seidel   0.3003972              1.2026496
    Jacobi         0.5433333              0.6100323

Example-3. (cf. [10], p. 354)

This example illustrates that in some cases the GRSI method converges even though the RF method with Chebyshev acceleration (the RF-SI method) fails to converge. Consider the partial differential equation:

u_xx + u_yy + β u_x = 0 ..........(6.2)

Assuming Dirichlet boundary conditions on the unit square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and using the standard five-point finite-difference discretization, we obtain the difference equation:

h^(-2) { u(x+h, y) + u(x-h, y) + u(x, y+h) + u(x, y-h) - 4u(x, y) } + ½ β h^(-1) { u(x+h, y) - u(x-h, y) } = 0.

Let us apply the RF-SI method and the GCW method for the case β = -3 and h = 1/3. There are four interior points, and the β_k for this special case are β_1 = 0.5, β_2 = 1.5, β_3 = 1.0, β_4 = 1.0 and β_0 = 4.0. Since ½ h |β| ≤ 1, the eigenvalues of the RF method are given by

λ_{p,q} = ½ cos(pπh) √(1 - (½hβ)^2) + ½ cos(qπh), p, q = 1, ..., h^(-1) - 1.

The spectral radius of the RF-SI method is 1.4542307, which clearly shows that the RF-SI method fails to converge. The spectral radius of the iteration matrix of the GCW method is given by:

S(G) = |hβ| cos(πh) / (4√2 sin(πh/2)) = 0.1767767.

The coefficient matrix is an irreducibly diagonally dominant M-matrix.
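The stencil coefficients quoted in Example-3 follow directly from the β_k formulas of Section 6 with a = c = 1, d = β and e = f = 0. The short pure-Python check below recomputes them for β = -3 and h = 1/3, together with the GCW spectral radius; the closed-form expression used for S(G), namely |hβ| cos(πh) / (4√2 sin(πh/2)), is a reconstruction matching the printed value and should be treated as an assumption.

```python
import math

def stencil(a, c, d, e, f, h):
    """beta_0..beta_4 of the second-order central-difference scheme for (6.1)."""
    b0 = 2.0 * (a + c - 0.5 * h * h * f)
    b1 = a + 0.5 * h * d
    b2 = a - 0.5 * h * d
    b3 = c + 0.5 * h * e
    b4 = c - 0.5 * h * e
    return b0, b1, b2, b3, b4

beta, h = -3.0, 1.0 / 3.0
# Example-3: u_xx + u_yy + beta*u_x = 0 means a = c = 1, d = beta, e = f = 0.
b0, b1, b2, b3, b4 = stencil(1.0, 1.0, beta, 0.0, 0.0, h)
print(b0, b1, b2, b3, b4)      # matches the values quoted in Example-3

# GCW spectral radius, assuming S(G) = |h*beta| cos(pi*h) / (4*sqrt(2)*sin(pi*h/2)).
s = (abs(h * beta) * math.cos(math.pi * h)
     / (4.0 * math.sqrt(2.0) * math.sin(math.pi * h / 2.0)))
print(round(s, 7))             # 0.1767767, the value reported in the text
```

With these inputs the computed coefficients are β_0 = 4.0, β_1 = 0.5, β_2 = 1.5, β_3 = β_4 = 1.0, in agreement with the example.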


REFERENCES

[1] W. E. Arnoldi, The principle of minimized iterations in the solution of the matrix eigenvalue problem. Quart. Appl. Math. 9, 1951, pp. 17-29.
[2] S. L. Campbell, I. C. F. Ipsen, C. T. Kelley, C. D. Meyer and Z. Q. Xue, Convergence estimates for solution of integral equations with GMRES. J. Integ. Equ. Appl. 8, 1996, pp. 19-34.
[3] L. Collatz, Aufgaben monotoner Art. Arch. Math. 3, 1952, pp. 366-376.
[4] I. S. Duff, The solution of nearly symmetric sparse linear equations. Report CSS 150, Computer Science and Systems Division, A.E.R.E. Harwell, Oxon, 1983.
[5] I. S. Duff and J. K. Reid, The multifrontal solution of indefinite sparse symmetric linear systems. ACM Trans. Math. Softw. 9, 1983, pp. 302-325.
[6] I. S. Duff and J. K. Reid, Some design features of a sparse matrix code. ACM Trans. Math. Softw. 5, 1979, pp. 18-35.
[7] R. E. Funderlic and R. J. Plemmons, A combined direct-iterative method for certain M-matrix linear systems. SIAM J. Alg. Disc. Meth. 5, 1984, pp. 33-42.
[8] A. George, J. Liu and E. Ng, User Guide for SPARSPAK: Waterloo Sparse Linear Package. Report CS-78-30, Department of Computer Science, University of Waterloo, 1980.
[9] M. H. Gutknecht, Variants of BiCGStab for matrices with complex spectrum. SIAM J. Sci. Comput. 14, 1993, pp. 1020-1033.
[10] L. A. Hageman and D. M. Young, Applied Iterative Methods. Academic Press, New York, 1981.
[11] W. J. Harrod and R. J. Plemmons, Comparison of some direct methods for computing stationary distributions of Markov chains. SIAM J. Sci. Stat. Comput. 5, 1984, pp. 453-469.
[12] D. S. Kershaw, The incomplete Cholesky-conjugate gradient method for the iterative solution of systems of linear equations. J. Comp. Phys. 26, 1978, pp. 43-65.
[13] G. J. Makinson and A. A. Shah, OAHM: a FORTRAN subroutine for solving a class of unsymmetric linear systems of equations. CANMER 2, 1986, pp. 499-508.
[14] J. A. Meijerink and H. A. van der Vorst, An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix. Math. Comp. 31, 1977, pp. 148-162.
[15] N. M. Nachtigal, S. C. Reddy and L. N. Trefethen, How fast are nonsymmetric matrix iterations? SIAM J. Matrix Anal. Appl. 13, 1992, pp. 778-795.
[16] Y. Saad and M. H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statist. Comput. 7, 1986, pp. 856-869.
[17] A. A. Shah, GRSIM: a FORTRAN subroutine for the solution of non-symmetric linear systems. Commun. Numer. Meth. Engng. 18, 2002, pp. 803-815.
[18] R. S. Varga, Matrix Iterative Analysis. Prentice Hall, Englewood Cliffs, New Jersey, 1962.
[19] C. Vuik and H. A. van der Vorst, A comparison of some GMRES-like methods. Lin. Alg. Appl. 160, 1992, pp. 131-162.
[20] D. M. Young, Iterative Solution of Large Linear Systems. Academic Press, New York, 1971.
