A Distributed Memory Parallel Fourth-Order IADEMF Algorithm

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 10, No. 9, 2019

Noreliza Abu Mansor1, Norma Alias2, Ahmad Kamal Zulkifle3, Mohammad Khatim Hasan4

Ibnu Sina Institute of Fundamental Science Studies, Universiti Teknologi Malaysia, Johor, Malaysia
Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
College of Engineering, Universiti Tenaga Nasional, Selangor, Malaysia

Abstract: The fourth-order finite difference Iterative Alternating Decomposition Explicit Method of Mitchell and Fairweather (IADEMF4) sequential algorithm has demonstrated its ability to perform with high accuracy and efficiency for the solution of a one-dimensional heat equation with Dirichlet boundary conditions. This paper develops the parallelization of the IADEMF4 by applying the Red-Black (RB) ordering technique. The proposed IADEMF4-RB is implemented on a multiprocessor distributed memory architecture based on the Parallel Virtual Machine (PVM) environment with the Linux operating system. Numerical results show that the IADEMF4-RB accelerates the convergence rate and largely improves the serial time of the IADEMF4. In terms of parallel performance evaluations, the IADEMF4-RB significantly outperforms its second-order counterpart (IADEMF2-RB), as well as the benchmarked fourth-order classical iterative RB methods, namely, the Gauss-Seidel (GS4-RB) and the Successive Over-relaxation (SOR4-RB) methods.

Keywords: Fourth-order method; finite difference; red-black ordering; distributed memory architecture; parallel performance evaluations

I. INTRODUCTION

The heat equation is a mathematical model that describes heat conduction processes of a physical system. Sahimi et al. [1] proposed a finite difference scheme known as the Iterative Alternating Decomposition Explicit (IADE) method to approximate the solution of a one-dimensional heat equation with Dirichlet boundary conditions. The IADE scheme employs the fractional splitting of the Mitchell and Fairweather (MF) variant, whose accuracy is of the order $O((\Delta t)^2 + (\Delta x)^4)$. The scheme, commonly abbreviated as the IADEMF, is developed by applying second-order spatial accuracy to the heat equation. For this reason, in this paper the IADEMF will also be referred to as the IADEMF2. It is a two-stage iterative procedure and has been proven to have merit in terms of convergence, stability and accuracy. It is generally found to be more accurate than the classical Alternating Group Explicit class of methods [2].

Several later studies have built on the IADE method. Sahimi et al. [3, 4] developed new second-order IADE methods using different variants such as the D'Yakonov (IADEDY) and the Mitchell-Griffith (IADEMG) variants. Each variant is of the order $O((\Delta t)^2 + (\Delta x)^4)$. The studies showed that the accuracies of the IADEDY and the IADEMG are comparable to that of the IADEMF. Alias [5] studied the parallel implementation of the IADEMF on distributed parallel computing using the parallel virtual machine. A fragmented numerical algorithm of the IADEMF method was designed by Alias [6] in terms of the data-flow graph, and its parallel implementation was then executed using the LuNA programming system. Sulaiman et al. [7, 8] proposed the half-sweep and the quarter-sweep IADEMF methods respectively, for the purpose of achieving a better convergence rate and faster execution time than the corresponding full-sweep method. Alias [9] implemented the Interpolation Conjugate gradient method to improve the parallel performance of the IADEMF. Shariffudin et al. [10] presented the parallel implementation of the IADEDY for solving a two-dimensional heat equation on a distributed system of Geranium Cadcam cluster (GCC) using the Message Passing Interface.

A recent study by Mansor [11] involved the development of a convergent and unconditionally stable fourth-order IADEMF sequential algorithm (IADEMF4). The proposed scheme is found to be capable of enhancing the accuracy of the original corresponding second-order method, that is, the IADEMF2. The IADEMF4 appears to be more accurate, more efficient and to have a better rate of convergence than the benchmarked fourth-order classical iterative methods, namely, the Gauss-Seidel (GS4) and the successive over-relaxation (SOR4) methods. However, the IADEMF4 may be too slow in practice, especially when the problem involves larger linear systems of equations. It is thus justified to consider parallel computing to speed up the execution time without compromising accuracy. The algorithm has explicit features which add to its advantage, so it can be fully utilized for parallelization.

This paper attempts to parallelize the IADEMF4, by applying the Red-Black (RB) ordering technique, for solving large sparse linear systems that arise from the discretization of the one-dimensional heat equation with Dirichlet boundary conditions. It aims to effectively implement the IADEMF4-RB on parallel computers, with improved performance over its serial counterpart. The computationally demanding IADEMF4-RB is implemented on a multiprocessor distributed memory architecture based on the Parallel Virtual Machine (PVM) environment with the Linux operating system.

This paper is outlined as follows. Section II recalls the formulation of the IADEMF4 scheme. Section III presents the development of the IADEMF4-RB parallel strategy. The computational complexity of the RB methods considered in this paper is given in Section IV. Section V describes the numerical experiment conducted in this study. The results and discussion on the parallel performance of the methods under consideration are presented in Section VI. The paper ends with the conclusion.

II. FORMULATION OF THE IADEMF4 (AN OVERVIEW)

In this section, the development of the IADEMF4 algorithm [11] is briefly reviewed. Consider the one-dimensional heat equation (1), which models the flow of heat in a homogeneous unchanging medium of finite extent, in the absence of a heat source,

$$
\frac{\partial U}{\partial t} = \frac{\partial^2 U}{\partial x^2} \tag{1}
$$

subject to given initial and Dirichlet boundary conditions

$$
\begin{aligned}
U(x,0) &= f(x), & 0 \le x \le 1,\\
U(0,t) &= g(t), & 0 < t \le T,\\
U(1,t) &= h(t), & 0 < t \le T.
\end{aligned} \tag{2}
$$

Based on the finite difference approach, the time-space domain is discretized by using a set of lines parallel to the $t$ axis given by $x_i = i\,\Delta x$, $i = 0, 1, \ldots, m, m+1$, and a set of lines parallel to the $x$ axis given by $t_k = k\,\Delta t$, $k = 0, 1, \ldots, n, n+1$. The grid spacings have uniform size, that is, $\Delta x = 1/(m+1)$ and $\Delta t = T/(n+1)$. At a grid-point $P(x_i, t_k)$ in the solution domain, the dependent variable $U(x,t)$, which represents the non-dimensional temperature at time $t$ and at position $x$, is approximated by $u_i^k$.

The IADEMF4 is developed by firstly applying the unconditionally stable fourth-order Crank-Nicolson approximation (3) to the heat equation [12],

$$
\frac{1}{\Delta t}\left(u_i^{k+1} - u_i^k\right) = \frac{1}{2(\Delta x)^2}\left(\delta_x^2 - \frac{1}{12}\delta_x^4\right)\left(u_i^{k+1} + u_i^k\right). \tag{3}
$$

The discretization of (3) leads to the expression given in (4), with the constants defined as in (5), where $\lambda = \Delta t/(\Delta x)^2$.

$$
a u_{i-2}^{k+1} + b u_{i-1}^{k+1} + c u_i^{k+1} + d u_{i+1}^{k+1} + e u_{i+2}^{k+1}
= -a u_{i-2}^{k} - b u_{i-1}^{k} + \hat{c}\, u_i^{k} - d u_{i+1}^{k} - e u_{i+2}^{k},
\quad i = 2, 3, \ldots, m-1 \tag{4}
$$

$$
a = \frac{\lambda}{24},\quad b = -\frac{2\lambda}{3},\quad c = \frac{4 + 5\lambda}{4},\quad d = -\frac{2\lambda}{3},\quad e = \frac{\lambda}{24},\quad \hat{c} = \frac{4 - 5\lambda}{4} \tag{5}
$$

In matrix form, the approximation in (4) can be represented by $A\mathbf{u} = \mathbf{f}$ as in (6), where $A$ is a sparse penta-diagonal coefficient matrix, the column vector $\mathbf{u} = (u_2, u_3, \ldots, u_{m-2}, u_{m-1})^T$ contains the unknown values of $u$ at the time level $k+1$, and $\mathbf{f} = (f_2, f_3, \ldots, f_{m-2}, f_{m-1})^T$ consists of boundary values and known values at the previous time level $k$.

$$
\begin{bmatrix}
c & d & e & & & & \\
b & c & d & e & & & \\
a & b & c & d & e & & \\
 & \ddots & \ddots & \ddots & \ddots & \ddots & \\
 & & a & b & c & d & e \\
 & & & a & b & c & d \\
 & & & & a & b & c
\end{bmatrix}_{(m-2)\times(m-2)}
\begin{bmatrix} u_2 \\ u_3 \\ \vdots \\ \vdots \\ \vdots \\ u_{m-2} \\ u_{m-1} \end{bmatrix}^{k+1}
=
\begin{bmatrix} f_2 \\ f_3 \\ \vdots \\ \vdots \\ \vdots \\ f_{m-2} \\ f_{m-1} \end{bmatrix} \tag{6}
$$

The entries in $\mathbf{f}$ are defined as

$$
\begin{aligned}
f_2 &= -b\left(u_1^{k} + u_1^{k+1}\right) + \hat{c}\, u_2^{k} - d u_3^{k} - e u_4^{k},\\
f_3 &= -a\left(u_1^{k} + u_1^{k+1}\right) - b u_2^{k} + \hat{c}\, u_3^{k} - d u_4^{k} - e u_5^{k},\\
f_i &= -a u_{i-2}^{k} - b u_{i-1}^{k} + \hat{c}\, u_i^{k} - d u_{i+1}^{k} - e u_{i+2}^{k}, \quad i = 4, 5, \ldots, m-3,\\
f_{m-2} &= -a u_{m-4}^{k} - b u_{m-3}^{k} + \hat{c}\, u_{m-2}^{k} - d u_{m-1}^{k} - e\left(u_m^{k} + u_m^{k+1}\right),\\
f_{m-1} &= -a u_{m-3}^{k} - b u_{m-2}^{k} + \hat{c}\, u_{m-1}^{k} - d\left(u_m^{k} + u_m^{k+1}\right).
\end{aligned} \tag{7}
$$

The IADEMF4 scheme secondly employs the fractional splitting of the higher-order accuracy formula of the MF variant [13],

$$
(rI + G_1)\,\mathbf{u}^{(p+1/2)} = (rI - gG_2)\,\mathbf{u}^{(p)} + \mathbf{f} \tag{8}
$$

$$
(rI + G_2)\,\mathbf{u}^{(p+1)} = (rI - gG_1)\,\mathbf{u}^{(p+1/2)} + g\,\mathbf{f} \tag{9}
$$

where $G_1$ and $G_2$ are two constituent matrices, and $r$, $I$ and $p$ represent an acceleration parameter, an identity matrix and the iteration index respectively. The value of $g$ is defined as $g = (6 + r)/6$, $r > 0$. The vectors $\mathbf{u}^{(p+1)}$ and $\mathbf{u}^{(p+1/2)}$ represent the approximate solution at the iteration level $(p+1)$ and at some intermediate level $(p+1/2)$, respectively.

After some algebraic manipulation of equations (8) and (9), the form $\left(G_1 + G_2 - \frac{1}{6}G_1G_2\right)\mathbf{u} = \mathbf{f}$ is obtained, suggesting that matrix $A$ in (6) can be decomposed into

$$
A = G_1 + G_2 - \frac{1}{6}G_1G_2. \tag{10}
$$

To retain the penta-diagonal structure of $A$, the matrices $G_1$ and $G_2$ have to be in the form of lower and upper tri-diagonal matrices respectively.

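The consistency of the two-stage scheme (8)-(9) with the decomposition (10) can also be checked numerically. The sketch below is not the paper's implementation: `G1` and `G2` are hypothetical lower and upper tri-diagonal matrices with made-up constant entries, `r` is an arbitrary positive acceleration parameter, and every function name is chosen only for illustration. It builds $A$ from (10), sets $\mathbf{f} = A\mathbf{u}^*$ for a known vector $\mathbf{u}^*$, and confirms that one sweep of (8) and (9) started at $\mathbf{u}^*$ returns $\mathbf{u}^*$. Each stage needs only a forward or a backward substitution, which reflects the explicit character of the method.

```python
# Fixed-point check for the two-stage sweep (8)-(9) with A = G1 + G2 - G1*G2/6.
# All matrix entries, r and u_star are arbitrary illustrative values (assumptions).

def matvec(M, x):
    """Dense matrix-vector product."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def matmul(P, Q):
    """Dense product of two square matrices."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def banded(n, diagonals):
    """n x n matrix with a constant value on each diagonal offset given."""
    M = [[0.0] * n for _ in range(n)]
    for offset, value in diagonals.items():
        for i in range(n):
            j = i + offset
            if 0 <= j < n:
                M[i][j] = value
    return M

def solve_lower(L, b):
    """Forward substitution for a lower-triangular system L x = b."""
    x = [0.0] * len(b)
    for i in range(len(b)):
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x

def solve_upper(U, b):
    """Backward substitution for an upper-triangular system U x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

def sweep(G1, G2, f, u, r):
    """One full iteration: stage (8) followed by stage (9)."""
    n = len(u)
    g = (6.0 + r) / 6.0
    # Stage (8): (rI + G1) v = (rI - g G2) u + f
    G2u = matvec(G2, u)
    rhs = [r * u[i] - g * G2u[i] + f[i] for i in range(n)]
    L = [[G1[i][j] + (r if i == j else 0.0) for j in range(n)] for i in range(n)]
    v = solve_lower(L, rhs)
    # Stage (9): (rI + G2) u_new = (rI - g G1) v + g f
    G1v = matvec(G1, v)
    rhs = [r * v[i] - g * G1v[i] + g * f[i] for i in range(n)]
    U = [[G2[i][j] + (r if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve_upper(U, rhs)

n, r = 6, 0.8
G1 = banded(n, {0: 1.0, -1: 0.3, -2: -0.1})   # lower tri-diagonal (hypothetical)
G2 = banded(n, {0: 1.5, 1: -0.4, 2: 0.05})    # upper tri-diagonal (hypothetical)
G1G2 = matmul(G1, G2)
A = [[G1[i][j] + G2[i][j] - G1G2[i][j] / 6.0 for j in range(n)] for i in range(n)]

u_star = [float(i + 1) for i in range(n)]      # chosen "exact" solution
f = matvec(A, u_star)                          # right-hand side consistent with A
u_next = sweep(G1, G2, f, u_star, r)

max_diff = max(abs(p - q) for p, q in zip(u_next, u_star))
bandwidth = max(abs(i - j) for i in range(n) for j in range(n)
                if abs(A[i][j]) > 1e-14)
print(max_diff, bandwidth)
```

Because the matrix entries are arbitrary, this check says nothing about the convergence rate or about a good choice of `r`. It only confirms that the solution of $A\mathbf{u} = \mathbf{f}$ is left unchanged by the sweep and that the product of a lower and an upper tri-diagonal matrix keeps $A$ penta-diagonal.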