Some P-RAM Algorithms for Sparse Linear Systems

Journal of Computer Science 3 (12): 956-964, 2007 ISSN 1549-3636 © 2007 Science Publications Some P-RAM Algorithms for Sparse Linear Systems 1Rakesh Kumar Katare and 2N.S. Chaudhari 1Department of Computer Science, A.P.S. University, Rewa (M.P.) 486003, India 2Nanyang Technological University, Singapore Abstract: PRAM algorithms for Symmetric Gaussian elimination is presented. We showed actual testing operations that will be performed during Symmetric Gaussian elimination, which caused symbolic factorization to occur for sparse linear systems. The array pattern of processing elements (PE) in row major order for the specialized sparse matrix in formulated. We showed that the access function in2+jn+k contains topological properties. We also proved that cost of storage and cost of retrieval of a matrix are proportional to each other in polylogarithmic parallel time using P-RAM with a polynomial numbers of processor. We use symbolic factorization that produces a data structure, which is used to exploit the sparsity of the triangular factors. In these parallel algorithms number of multiplication/division in O(log3n), number of addition/subtraction in O(log3n) and the storage in O(log2n) may be achieved. Key words: Parallel algorithms, sparse linear systems, cholesky method, hypercubes, symbolic factorization INTRODUCTION BACKGROUND In this research we will explain the method of We consider symmetric Gaussian elimination for representing a sparse matrix in parallel by using P- the solution of the system of linear Eq. RAM model. P-RAM model is a shared memory model. According to Gibbons and Rytter[6], the PRAM model A x = b will survive as a theoretically convenient model of parallel computation and as a starting point for a Where A is an n×n symmetric, positive definite methodology. There are number of processors working matrix. Symmetric Gaussian elimination is equivalent to the square root free Cholesky method, i.e., first synchronously and communicating through a common T random access memory. The processors are indexed by factoring A into the product U DU and then forward the natural number and they synchronously execute the and back solving to obtain x and we often talk same program (through the central main control). interchangeably about these two methods of solving A x = b. Furthermore, since almost the entire of A, we Although performing the same instructions, the restrict most of our attention to that portion of the processors can be working on different data (located in solution process. a different storage location). Such a model is also called [13] T Sherman developed a row-oriented U DU a single-instruction, multiple-data stream (SIMD) factorization algorithm for dense matrices A and model. The symmetric factorization is in logarithmic [2,10] considers two modifications of it which take advantage time of the hypercube SIMD model . of sparseness in A and U. We also discuss the type of To develop these algorithms we use method of information about A and U which is required to allow symbolic factorization on sparse symmetric sparse symmetric factorization to be implemented factorization. Symbolic factorization procedures a data efficiently. structure that exploits the sparsity of the triangular Here we will present a parallel algorithm for sparse factors. We will represent an array pattern of processing symmetric matrix and will be implemented to elements (PE) for sparse matrix on hypercube. We have hypercube. already developed P-RAM algorithms for linear system Previously we have already developed matrix (dense case)[7,8,9]. These algorithms are implemented on multiplication algorithm on P-RAM model[7] and Hypercube architecture. implementation of back-substitution in Gauss- Corresponding Author: Rakesh Kumar Katare, Department of Computer Science, A.P.S. University, Rewa (M.P.) 486003 India 956 J. Computer Sci., 3 (12): 956-964, 2007 [8] elimination on P-RAM model . Now we further extend ≈ a (k) ’ Ω ik (k) ∆ (k) ÷a jk < 0 this idea for sparse linear systems of Eq.s. «a kk ◊ Ω (k) (k) ANALYSIS OF SYMMETRIC GAUSS aik or a jk is negative ELIMINATION ≈ a (k) ’ Ω (k+1) (k) ik (k) aij = aij + ∆ (k) ÷a jk «a ◊ In this section we suggest the actual testing kk operations that will be performed during Symmetric (k) (k+1) (k) (k) Gaussian elimination. Basically the Gaussian Lemma 2: If (aij − aij ) > 0 then a jk or aik both (k) elimination is applied to transform matrices of linear negative or both positive and for pivot element aij is Eq. to triangular form. This process may be performed always positive. in general by creating zeros in the first column, then the second and so forth. For k = 1,2,...,n-1 we use the Proof: - formulae ≈ a(k) ’ (k) (k +1) Ω ik (k) (aij − aij ) > 0 ∆ (k) ÷a jk > 0 «a ◊ (k) ’ kk (k +1) (k) aik (k) a = a − ∆ ÷a , i, j > k (1) ij ij (k) kj Ω (k) (k+1) (k+1) «a kk ◊ aij > aij then aij is positive Ω (k) (k) a jk or aik both negative or both positive and and (k) for pivot element aij is always positive. ≈ (k) ’ (k+1) (k) aik (k) (k) (k+1) bi = bi − ∆ (k) ÷bk , i > k (2) Lemma 3: If (aij − aij ) = 0 then no elimination will «a kk ◊ take place and one of the following condition will be (1) satisfied. where, aij = aij , i, j = 1, 2,…,n. The only assumption (k) (k) (i) aik = 0 or (ii) a jk = 0 or both (i) and (ii) are required is that the inequalities a (k) ≠ 0 , k = 1,2,..., n kk zero. hold. These entries are called pivot in Gaussian elimination. It is convenient to use the notation, Proof: (k) (k+1) if (aij -aij ) = 0 then (k) (k) A x = b ≈ a(k) ’ (k) (k+1) ∆ ij ÷ (k) aij = aij and∆ (k) ÷a jk = 0 «a kk ◊ For the system obtained after (k-1) steps, k = 1, Ω (k) (k) (1) = (1) = (n) (aik .ajk ) = 0 2,..., n with A A and b b. The final matrix A is (k) (k) [3] Ω aik = 0 or ajk = 0 or both are zero. upper triangular matrix (k) Ωaik will remain same after elimination i.e. no ≈ (k) ’ (k+1) (k) aik (k) elimination is required As we know that in aij = aij − ∆ (k) ÷a jk we are «aik ◊ (k) Case 2: if i = j then eliminating position aik ≈ a(k) ’ a (k+1) = a (k) − ∆ ik ÷a (k) ii ii a(k) ik Case 1: If i ≠ j then the above expression can be « kk ◊ ≈ (k) 2 ’ written as follows: a Ω (k+1) (k) ∆( ik ) ÷ aii = aii − ∆ a (k) ÷ ≈ (k) ’ « kk ◊ (k) (k+1) aik (k) (aij − aij ) = ∆ (k) ÷a jk (3) «a kk ◊ (k +1) (k) (k) (k) Lemma 4: If aii = 0 then aik = aii .a kk (k) (k+1) (k) (k) ≈ (k) 2 ’ Lemma 1: if (aij − aij ) < 0 then aik or a jk is a (k +1) (k) ∆( ik ) ÷ (k) (k) Proof: If a = 0 then a = Ωaii .akk = negative ii ii ∆ a (k) ÷ « kk ◊ (k) (k+1) (k) 2 Ω (k) (k) (k) Proof: (aij − aij ) < 0 (aik ) aik = aii .akk 957 J. Computer Sci., 3 (12): 956-964, 2007 (k) (k+1) (k) Lemma 5: If aik = 0 then aii = aii no iteration and for the second case the element which is to be will take place. eliminated will be greater than zero and for the third case elimination will not take place. Proof: The result is obvious. We suggested testing operation for variation occurring in the elements of Symmetric Gaussian (k) (k +1) elimination. There are more testing operations rather Lemma 6: If (aii − aii ) > 0 then aik > 0 than arithmetic operations, so that the running time of the algorithms could be proportional to the amount of Proof: testing rather than amount of arithmetic operation, (k) 2 ’ a 2 which will cause symbolic factorization to occur. To (k) (k +1) Ω ∆( ik ) ÷ Ω (k) (k) (aii − aii ) > 0 (k) > 0 (aik ) > a kk ∆ a ÷ avoid this problem Sherman pre computed the sets rak, « kk ◊ ruk, cuk and showed that in implementation of this Ω (k) 2 scheme, the total storage required is proportional to the (aik ) > 0 number of non zeroes in A and U and that the total running time is proportional to the number of arithmetic If (a (k)-a (k+1)) = 0 then a = 0 Lemma 7: ii ii ik operations on nonzero. In addition to this the (k) (k+1) (k) (k+1) preprocessing required to compute the sets { rak} { ruk} Proof: (aii -aii ) = 0 Ω aii = aii and {cuk} can be performed in time proportional to the (k) 2 (aik ) total storage. To see how to avoid these problems, let us Ω = 0 (k) assume that for each k, 1≤ k ≤ N, we have pre- a kk (k) computed: Ω aii = 0 (k) (k+1) (k) • The set rak of columns j ≥ k for which akj ≠ 0; Lemma 8: if (aii -aii ) < 0 then a kk will be • The set ruk of columns j > k for which ukj ≠ 0; eliminated. • The set cuk of columns i < k for which uik ≠ 0; Proof: Line (k) (k+1) Ω (k) (k+1) Ω (k+1) (aii -aii ) < 0 aii > aii aii = ≈ (k) 2 ’ a 1. For k ← 1 to N do (k) ∆( ik ) ÷ aii − (k) 2. [m 1; ∆ a ÷ kk ← « kk ◊ 3. dkk ← akk; 4. For j ruk do In Gauss elimination, a nonzero position in 5. [mkj← 0]; position jk, implied that there was also a nonzero 6. For j {n rak : n > k} do position in kj, which would cause fill to occur in 7.

Some P-RAM Algorithms for Sparse Linear Systems

Triangular Factorization

An Efficient Implementation of the Thomas-Algorithm for Block Penta

Pivoting for LU Factorization

Mergesort / Quicksort Steven Skiena

4. Linear Equations with Sparse Matrices 4.1 General Properties of Sparse Matrices

7 Gaussian Elimination and LU Factorization

Mixing LU and QR Factorization Algorithms to Design High-Performance Dense Linear Algebra Solvers✩

Maintaining LU Factors of a General Sparse Matrix*+ Department of Operations Research Stanford Uniuersity Stanfmd, Califmiu 9430

2 Partial Pivoting, LU Factorization

The Cholesky Factorization in Interior Point Methods

Why Sparse Matrix?

28 Matrix Operations