
Direct and Line Based Iterative Methods for Solving Sparse Block Linear Systems

A thesis submitted to the Graduate School of the University of Cincinnati in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

in the Department of Aerospace Engineering and Engineering Mechanics of the College of Engineering and Applied Science by

Xiaolin Yang

B.S. Shandong University, June 2012

Date: Oct/31/2018

Committee Chair: Mark G. Turner, Sc.D

Abstract

Solving sparse linear systems of equations represents the major computational cost in many scientific and engineering areas. There are two major approaches for solving large sparse linear systems: direct methods and iterative methods. Each has its own advantages for certain types of problems. In general, the direct method is more robust and the iterative method has better scalability.

High-order Discontinuous Galerkin (DG) methods have gained growing interest in the Computational Fluid Dynamics (CFD) community. The Jacobian matrices that arise in the application of the DG method are sparse and block-structured. This thesis summarizes the development of direct and iterative solvers for sparse block linear systems. Block capability is achieved by using the Intel CPU library or Nvidia GPU based libraries. The direct solver uses a left-looking method with fill-reducing ordering to factorize the matrices into lower/upper triangular parts. The iterative solver uses the line-based Successive Over-Relaxation method (SLOR) and the Alternating Direction Implicit method (ADI), which exploit the characteristics of a structured grid.

The direct and iterative solvers are tested with matrices from the simulation of a flow channel using the DG method. The grid dimension is 6 × 2 × 2. The results show that the direct solver performs better on these small matrices. However, the iterative solver using the ADI method demonstrates better scalability with respect to the degree of polynomial used in the DG scheme.

This work advances the development of linear solvers for DG methods.


Acknowledgement

First and foremost, I would like to express my sincere gratitude to my advisor Dr. Mark Turner for his guidance and patience. Dr. Turner and I spent a lot of time on this thesis, and the discussions with him always benefited me.

I would also like to thank the rest of my committee members, Dr. Shaaban Abdallah and Dr. Donald French, for their time and insightful comments.

I am sincerely grateful to Nathan Wukie for providing the test cases for this thesis. This thesis could not have been done without Nathan's help.

Finally, I would like to thank my parents and Sisi for their support over the years. They have always been there for me during the hard times of my life.


Contents

Abstract ...... ii

Acknowledgement ...... iv

Contents ...... v

List of Figures ...... viii

Nomenclature ...... xi

1 Introduction ...... 1

Background ...... 1

Motivation ...... 2

Sparse Linear Solver Data Structure ...... 3

Overview of Direct Method ...... 5

1.4.1 Symbolic Analysis ...... 5

1.4.2 Numerical Factorization...... 7

1.4.3 Solving Sparse Triangular System ...... 8

Overview of Iterative Method ...... 8

Linear Algebra Package ...... 11

1.6.1 Intel Math Kernel Library ...... 11

1.6.2 Nvidia cuBLAS and cuSolver ...... 12

Thesis Outline ...... 12

2 Direct Methodology ...... 13

Overall Algorithm ...... 13

Solving Sparse Triangular System ...... 14

Block-wise Left-looking Method ...... 17

Symbolic Analysis and Fill-reducing Ordering ...... 19

Error Analysis ...... 23

Iterative Refinement...... 24

Memory Management ...... 25

Data Structure ...... 25

3 SLOR/SLOR-ADI Methodology...... 27

Overall Algorithm of SLOR Method ...... 27

Symbolic Analysis of SLOR Method ...... 28

Block-wise Thomas Algorithm ...... 31

Convergence Analysis of SLOR Method ...... 32

SLOR-ADI Method ...... 34

Memory Management ...... 39

Data Structure ...... 40

4 Implementation and Results ...... 41

Test Case ...... 41


Test Configuration ...... 43

Direct Method ...... 44

SLOR Method ...... 51

ADI Method ...... 57

GPU Application ...... 62

5 Conclusion and Future Work...... 65

References ...... 66


List of Figures

2.1: Illustration of symbolic analysis process...... 14

2.2: Finding the non-zero pattern of x...... 16

2.3: Row-merge tree of (2.6) ...... 21

3.1: Illustration of SLOR method ...... 27

3.2: A structured 4×3 2D grid with natural ordering...... 28

3.3: Illustration of SLOR-ADI method...... 34

3.4: 3×2×2 structured 3D grid with natural ordering...... 35

4.1: 6×2×2 3D structured grid ...... 42

4.2: Contours of pressure coefficient...... 42

4.3: Nonzero pattern of Jacobian ...... 43

4.4: Nonzero pattern of L+U-I with original ordering...... 44

4.5: Nonzero pattern of L+U-I with fill-reducing ordering...... 45

4.6: Direct method execution time (ms)...... 46

4.7: Execution time per equation (ms)...... 46

4.8: Log scaled execution time with original ordering...... 47

4.9: Log scaled execution time with fill-reducing ordering...... 47

4.10: Log scaled execution time per equation with original ordering...... 48

4.11: Log scaled execution time per equation with fill-reducing ordering...... 48

4.12: Flop/Time ratios of LU factorization of different orderings...... 49


4.13: Flop/Time ratios of LU factorization of different matrices...... 49

4.14: Condition number (log scale)...... 50

4.15: Log scaled upper bound of relative error...... 51

4.16: Direct method residual norm...... 51

4.17: P1 spectral radius vs omega ...... 52

4.18: P2 spectral radius vs omega...... 52

4.19: P3 spectral radius vs omega...... 53

4.20: P1 SLOR steps to converge vs omega...... 53

4.21: P2 SLOR Steps to Converge vs Omega...... 54

4.22: P3 SLOR steps to converge vs omega...... 54

4.23: x-SLOR execution time (ms)...... 55

4.24: x-SLOR execution time per equation (ms)...... 55

4.25: Log scaled x-SLOR execution time...... 56

4.26: Log scaled x-SLOR execution time per equation...... 56

4.27: Log scaled x-SLOR relative error bound...... 57

4.28: x-SLOR residual norm...... 57

4.29: P1 x-SLOR/ADI residual vs steps...... 58

4.30: P2 x-SLOR/ADI residual vs steps...... 58

4.31: P3 x-SLOR/ADI residual vs steps...... 59

4.32: ADI execution time (ms)...... 59

4.33: ADI execution time per equation (ms)...... 60


4.34: Log scaled ADI execution time...... 60

4.35: Log scaled ADI execution time per equation...... 61

4.36: Log scaled ADI relative error bound...... 61

4.37: ADI residual norm...... 62

4.38: P1 execution time (ms) ...... 63

4.39: P2 execution time (ms)...... 63

4.40: P3 execution time (ms)...... 63

4.41: Log scaled direct (left) /ADI (middle) /SLOR (right) execution time...... 64


Nomenclature

N    Matrix Dimension

n    Block Dimension

$A_{ij}$    Element of Matrix A at ith Row and jth Column

$A^{-1}$    Inverse of Matrix A

$A^T$    Transpose of Matrix A

$|A|$    Number of Nonzero Elements of Matrix A

L    Lower Triangular Matrix

U    Upper Triangular Matrix

TD    Main Tri-diagonal Part of a Matrix

G    Graph of a Matrix

$\|A\|$    Natural Norm of Matrix A


Introduction

Background

A sparse matrix can be defined as a matrix with enough zeros that it pays to take advantage of them [1]. This practical definition captures the essence of sparse linear system solving methods.

Solving large sparse linear systems lies at the heart of a wide range of scientific and engineering computations and often represents the major computational cost. There are two major approaches for solving large sparse linear systems: direct methods and iterative methods. Both have been developed for decades and have their own advantages for certain types of problems. A direct method is more robust than an iterative method and its computational workload is predictable. In addition, the factorization from a direct method can be reused many times if multiple right-hand sides are present. However, it has higher memory usage and worse scalability than an iterative method. On the other hand, iterative methods can achieve the desired accuracy through iterations, but convergence is not always guaranteed. Iterative methods are preferred for large problems due to their scalability, where direct methods are usually ineffective. Both approaches are discussed in this thesis.


Motivation

Discontinuous Galerkin (DG) methods have gained growing interest in the CFD community [2] [3] [4]. The numerical scheme represents the approximate solution with piecewise cell-local polynomials that are continuous within a given cell, while the discrete solution can have a discontinuity across cell boundaries. High-order accuracy can be obtained by increasing the order of the polynomial expansion without enlarging the stencil. Like finite volume methods, this method also relies on an integral form of the conservation law equations. The integral form is obtained by multiplying the conservation law equations by a set of test functions and integrating over a control volume. The resulting volume integral is then converted into a weak form. The flux in the boundary integral is substituted with the same numerical upwind fluxes, or approximate Riemann solvers, used for finite volume discretizations [5] [6].

The system of linear equations of interest in this thesis, which arises from the Discontinuous Galerkin method, has the form:

$\frac{\partial \Re(Q^n)}{\partial Q} \Delta Q^n = -\Re(Q^n)$    (1.1) [7]

$\Re(Q)$ is the sum of all spatial integrals. $Q$ is the vector of all cell dependent variables in the computational domain used to compute $\Delta Q^n$, the update vector used to update the entire solution vector as $Q^{n+1} = Q^n + \Delta Q^n$ for each Newton iteration $n$; that is, $\Delta Q^n$ is the global change in $Q$. On a structured mesh, the Jacobian matrix $\frac{\partial \Re(Q^n)}{\partial Q}$ forms a block tri-, block penta-, and block hepta-diagonal sparse matrix for one-, two-, and three-dimensional problems, respectively. The dimension of each dense block can be very large, which is computationally expensive to work with. This thesis explores the efficiency of direct methods and line-based iterative methods applied to such block sparse matrices.

Sparse Linear Solver Data Structure

A regular sparse matrix is typically stored in a column-major form called compressed sparse column (CSC) form or a row-major form called compressed sparse row (CSR) form [8]. In either form, an $N \times N$ sparse matrix can be represented by three 1-D arrays. For a block sparse matrix, each dense block is treated as a single element. The corresponding storage schemes are the column-major block sparse column (BSC) form and the row-major block sparse row (BSR) form, respectively. Both BSC and BSR forms are used in this thesis. In either form, each dense block can be stored in column-major or row-major fashion. As an example, the following sparse matrix (1.2) can be stored either as a $6 \times 6$ matrix in CSC form or as a $3 \times 3$ block matrix in BSR form.

$\begin{bmatrix} 4 & -1 & 0 & 0 & 0 & 0 \\ -1 & 4 & -1 & 0 & 0 & 0 \\ 0 & -1 & 4 & -1 & 0 & 0 \\ 0 & 0 & -1 & 4 & -1 & 0 \\ 0 & 0 & 0 & -1 & 4 & -1 \\ 0 & 0 & 0 & 0 & -1 & 4 \end{bmatrix}$    (1.2)

CSC form in C-style arrays (0-based index):

int col_ptr[] = {0, 2, 5, 8, 11, 14, 16};
int row_idx[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3, 4, 3, 4, 5, 4, 5};
double value[] = {4, -1, -1, 4, -1, -1, 4, -1, -1, 4, -1, -1, 4, -1, -1, 4};

The integer array col_ptr, of length $N + 1$ ($N = 6$), stores the starting point of each column in the arrays row_idx and value, except that its last entry stores the number of nonzeros in the matrix. The row indices of column j are stored in row_idx[col_ptr[j]] through row_idx[col_ptr[j + 1] - 1]. The numerical values of column j are stored in value[col_ptr[j]] through value[col_ptr[j + 1] - 1]. The lengths of the arrays row_idx and value are equal to the number of nonzeros in the matrix.

BSR form in C-style arrays (0-based index):

int row_ptr[] = {0, 2, 5, 7};
int col_idx[] = {0, 1, 0, 1, 2, 1, 2};
double value[] = {4, -1, -1, 4,  0, -1, 0, 0,  0, 0, -1, 0,
                  4, -1, -1, 4,  0, -1, 0, 0,  0, 0, -1, 0,  4, -1, -1, 4};

As an extension, the above matrix can also be stored as a $3 \times 3$ block matrix with $2 \times 2$ blocks in BSR form. Each dense block is stored in column-major form. The column indices of block row j are stored in col_idx[row_ptr[j]] through col_idx[row_ptr[j + 1] - 1]. The numerical values of block row j are stored in value[row_ptr[j] * n * n] through value[row_ptr[j + 1] * n * n - 1], where n is the dimension of the dense blocks.
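As a small illustration of this indexing (a sketch only; the array names follow the CSC and BSR examples above), the following C fragment walks over one column of the CSC matrix and one block row of the BSR matrix.

#include <stdio.h>

/* Print the entries of column j of a CSC matrix. */
void print_csc_column(const int *col_ptr, const int *row_idx,
                      const double *value, int j)
{
    for (int p = col_ptr[j]; p < col_ptr[j + 1]; p++)
        printf("A(%d,%d) = %g\n", row_idx[p], j, value[p]);
}

/* Print the block column indices of block row j of a BSR matrix with
   n x n blocks. The values of block p start at value[p * n * n]. */
void print_bsr_block_row(const int *row_ptr, const int *col_idx,
                         const double *value, int n, int j)
{
    for (int p = row_ptr[j]; p < row_ptr[j + 1]; p++)
        printf("block (%d,%d) starts at value[%d]\n", j, col_idx[p], p * n * n);
}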


Overview of Direct Method

The direct method for solving a sparse linear system $Ax = b$ starts with the $LU$ factorization of $A$,

$A = LU$    (1.3)

followed by forward and backward substitution:

$Ly = b$    (1.4)

$Ux = y$    (1.5)

The direct method will always find the solution if the entire process can be completed.
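For a small dense system, the factor-then-substitute sequence (1.3) through (1.5) can be carried out directly with LAPACK. The sketch below uses the LAPACKE C interface (available in MKL); the 3 × 3 matrix and right-hand side are made-up values for illustration, and dgetrf/dgetrs perform the factorization and the forward/backward substitution with partial pivoting.

#include <stdio.h>
#include "mkl_lapacke.h"

int main(void)
{
    /* 3 x 3 example system in column-major storage (values for illustration only). */
    double A[9] = { 4, -1, 0,   -1, 4, -1,   0, -1, 4 };
    double b[3] = { 1, 2, 3 };
    MKL_INT ipiv[3];

    /* A = LU with partial pivoting, as in (1.3). */
    MKL_INT info = LAPACKE_dgetrf(LAPACK_COL_MAJOR, 3, 3, A, 3, ipiv);
    if (info != 0) { printf("factorization failed\n"); return 1; }

    /* Forward and backward substitution, as in (1.4) and (1.5); b is overwritten by x. */
    LAPACKE_dgetrs(LAPACK_COL_MAJOR, 'N', 3, 1, A, 3, ipiv, b, 3);

    for (int i = 0; i < 3; i++) printf("x[%d] = %g\n", i, b[i]);
    return 0;
}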

1.4.1 Symbolic Analysis

Symbolic analysis is the first step of LU factorization. Its purpose is to make the subsequent operations more efficient in terms of time and memory. The factors L and U usually have more nonzero elements than the lower and upper triangular parts of A. These newly created nonzeros are called "fill-in", which may cause the factorization to take $O(N^2)$ memory and $O(N^3)$ time. The symbolic analysis phase takes this fill-in into account and tries to reduce it in the nonzero pattern of L and U.

Symbolic analysis of sparse matrices is tightly coupled with graph analysis. The symbolic Cholesky factorization of symmetric matrices was researched first, and it provides the foundation for all other factorizations [9]. The Cholesky factors $L + L^T$ from $A = LL^T$ can be viewed as the adjacency matrix of an undirected graph $G_{L+L^T}$, in which an edge between vertices $i$ and $j$ exists if $l_{ij} \neq 0$. This graph, however, can be pruned while maintaining its reachability. The pruned result is the elimination tree, which is obtained by removing all "redundant" edges from $G_{L+L^T}$ [10]. The elimination tree contains the ancestor-descendant relations between the nodes, and it can be constructed in $O(|A|)$ time [11] [12] [13].

The first symbolic analysis of unsymmetric matrices was done by Rose and Tarjan [14], but their method is costly and does not include any permutation. Later, George and Ng found that if matrix A is nonsingular and has a zero-free diagonal, then the factor of the Cholesky factorization of $A^T A$ provides an upper bound on the nonzero patterns of L and U, regardless of any partial pivoting [15]. Gilbert and Ng also showed that when matrix A has the strong Hall property [16], the bound on U is tight [17].

It has been found that the nonzero pattern of the factors depends largely on how the columns and rows are permuted [18]. Therefore, it is important to find a fill-reducing ordering, which can be stated as finding row and column permutations P and Q such that the number of nonzeros in the factorization of PAQ, or the amount of work required to compute the factorization, is minimized [1]. However, finding the best P and Q that minimize memory usage or flop count is an NP-hard problem [19]. Therefore, heuristics must be used. The result on the Cholesky factorization of $A^T A$ suggests that it is possible to find an ordering of $A^T A$ that reduces the amount of fill-in in L and U [15] [20]. However, a noticeable drawback of analyzing the structure of $A^T A$ is that it requires explicitly forming $A^T A$. Each row of A creates a clique in the adjacency graph of $A^T A$, which may result in a dense submatrix that makes the upper bound very loose. Based on a symbolic analysis of the conventional outer-product formulation of Gaussian elimination of the matrix A and the row-merge tree [21], Davis et al. developed a symbolic analysis approach that does not need to form $A^T A$ explicitly and generates a tighter bound on the nonzero patterns of L and U. They also applied the Column Approximate Minimum Degree (COLAMD) method [22] [23] to their symbolic analysis [24], which has been proven to be an effective approach to reduce fill-in. The key idea of the COLAMD method is that at each elimination step, the column that minimizes a selected metric over all candidate pivot columns is chosen as the pivot column for that step, and the metric is then recomputed for each remaining column. The permutation method used in this thesis is based on the above research; it is a modified COLAMD method that accommodates the data structure used in this thesis.

1.4.2 Numerical Factorization

The numerical factorization phase usually dominates the time consumption of a direct method. Besides the benefits from symbolic analysis, numerical factorization can be accelerated in two ways: (1) creating task-level parallelism and (2) taking advantage of BLAS or LAPACK operations. Many numerical schemes have been developed based on these approaches. Duff surveyed their impact on the performance of direct sparse solvers [25].

For the block sparse matrices considered in this thesis, applying level 3 BLAS and LAPACK operations to the factorization process is straightforward. However, little task-level parallelism can be extracted from such matrices. Therefore, a sequential left-looking method is adopted in this thesis. The left-looking LU factorization method computes L and U one column at a time, from left to right. At the kth step, it accesses columns 1 to k - 1 of L and column k of A (assuming a one-based index). The earliest implementation of this method was done by Sato and Tinney [26]. Gilbert and Peierls showed that the left-looking method takes time proportional to the number of floating-point operations [27]. According to the survey by Davis et al. [1], no other method provides this guarantee. In this thesis, the left-looking method is also used in the symbolic analysis phase.

1.4.3 Solving Sparse Triangular System

Solving a sparse triangular system Lx = b or Ux = b, where L and U are lower and upper sparse triangular matrices respectively, is the fundamental mathematical kernel of the direct sparse method. It is used not only in the solving phase after factorization, but also during the numerical factorization itself. The sparsity in L and U needs to be considered. If the right-hand side b is a dense vector, the solving process is close to that of solving a dense triangular system. However, if b is a sparse vector, then not all columns of L or U take part in the computation. The detailed algorithms for solving sparse triangular systems are discussed in Chapter 2.

Overview of Iterative Method

Iterative methods are usually preferred for large linear systems because they use less memory and are easier to parallelize than direct methods. They can also exploit physical information about the problem; thus the iteration scheme can be tailored to the specific problem to improve effectiveness and robustness [28] [29].

The generic classic iterative method has the form:

$x^{k+1} = T x^k + c$    (1.6)

The matrix $T$ is called the iteration matrix; it is the amplification matrix of the error. The method starts with a given initial solution and generates a sequence of improving approximate solutions until the termination criterion is met.

There are many ways to convert $Ax = b$ into the iteration form (1.6). The classic iteration methods are based on a splitting of A of the form:

$A = M + N$    (1.7)

where M is a nonsingular matrix. Then $Ax = b$ can be converted into the fixed-point iteration form:

$x^{k+1} = -M^{-1} N x^k + M^{-1} b$    (1.8)

Among the classic iteration methods, the most fundamental is the Jacobi method. It splits the matrix as:

$A = D + (L + U)$    (1.9)

where $D$ is the diagonal, $L$ is the strictly lower triangular part and $U$ is the strictly upper triangular part.

The corresponding iteration form of the Jacobi method is:

$x^{k+1} = -D^{-1}(L + U) x^k + D^{-1} b$    (1.10)

The Gauss-Seidel method modifies the Jacobi method by overwriting the approximate solution with the new value as soon as it is available. This results in the splitting:

$A = (D + L) + U$    (1.11)

and the iteration form:

$x^{k+1} = -(D + L)^{-1} U x^k + (D + L)^{-1} b$    (1.12)

The convergence rate of Gauss-Seidel can be improved by applying a weighted average of the new value and the one obtained during the previous iteration:

$x^{k+1} = (1 - \omega) x^k + \omega x^{k+1}_{GS}$    (1.13)

where $x^{k+1}_{GS}$ is the Gauss-Seidel value. This is called the Successive Over-Relaxation (SOR) method. The iteration form is defined by:

$x^{k+1} = (D + \omega L)^{-1}[(1 - \omega)D - \omega U] x^k + \omega (D + \omega L)^{-1} b$    (1.14)

where $\omega$ is the relaxation factor.

If the linear system $Ax = b$ is generated from a structured grid, then the matrix A has multiple tri-diagonal structures on its diagonal. Each tri-diagonal represents the connections along a single grid line. The line-based SOR method (SLOR) takes advantage of this structure by solving each tri-diagonal system simultaneously. Adding implicitness to the unknowns of the same line allows information to propagate more effectively over the entire domain compared with the point-wise methods mentioned above. The SLOR iteration is based on the splitting:

$A = TD + L' + U'$    (1.15)

The iteration form is:

$x^{k+1} = (TD + \omega L')^{-1}[(1 - \omega)TD - \omega U'] x^k + \omega (TD + \omega L')^{-1} b$    (1.16)

where $TD$ represents the main tri-diagonals, and $L'$ and $U'$ are the remaining lower triangular and upper triangular parts, respectively.

10

The SLOR method can be combined with the Alternating Direction Implicit (ADI) method [30]. The basic idea is to apply the SLOR method along a different direction in each iteration step, which requires reordering the matrix for each direction. This is particularly effective if the couplings along multiple directions are strong. Assuming the original ordering is along the $x$ direction, the iteration form of SLOR-ADI is:

$x^{k+1} = (TD + \omega_x L')^{-1}[(1 - \omega_x)TD - \omega_x U'] x^k + \omega_x (TD + \omega_x L')^{-1} b$    (1.17)

$x_y^{k+1} = (TD_y + \omega_y L_y')^{-1}[(1 - \omega_y)TD_y - \omega_y U_y'] x_y^k + \omega_y (TD_y + \omega_y L_y')^{-1} b_y$    (1.18)

$x_z^{k+1} = (TD_z + \omega_z L_z')^{-1}[(1 - \omega_z)TD_z - \omega_z U_z'] x_z^k + \omega_z (TD_z + \omega_z L_z')^{-1} b_z$    (1.19)

where $\omega_x$, $\omega_y$ and $\omega_z$ are the relaxation factors for each direction.
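Before the details in Chapter 3, the overall flow of one SLOR-ADI cycle can be sketched as follows. slor_sweep and permute_vector are hypothetical placeholders for the routines developed later; the sketch only shows that each direction applies the same line-relaxation kernel to a reordered view of the same system, under those assumptions.

/* Hypothetical helpers, standing in for the routines of Chapter 3. */
void slor_sweep(const void *A, const double *b, double *x, double omega);
void permute_vector(const int *P, const double *src, double *dst, int N, int n);

/* One SLOR-ADI cycle (schematic), following (1.17)-(1.19). A_x, A_y, A_z are the
 * x-, y- and z-ordered views of the matrix; Py, Pz and their inverses are the
 * permutations of Chapter 3; xw and bw are permuted copies of x and b. */
void adi_cycle(const void *A_x, const void *A_y, const void *A_z,
               const int *Py, const int *InvPy, const int *Pz, const int *InvPz,
               const double *b, double *x, double *xw, double *bw,
               double wx, double wy, double wz, int N, int n)
{
    slor_sweep(A_x, b, x, wx);               /* x-direction sweep, eq. (1.17) */

    permute_vector(Py, x, xw, N, n);         /* reorder x and b along y */
    permute_vector(Py, b, bw, N, n);
    slor_sweep(A_y, bw, xw, wy);             /* y-direction sweep, eq. (1.18) */
    permute_vector(InvPy, xw, x, N, n);      /* back to the original ordering */

    permute_vector(Pz, x, xw, N, n);         /* same pattern along z */
    permute_vector(Pz, b, bw, N, n);
    slor_sweep(A_z, bw, xw, wz);             /* z-direction sweep, eq. (1.19) */
    permute_vector(InvPz, xw, x, N, n);
}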

Although such classic iterative methods are rarely used alone nowadays, they serve as preconditioners for Krylov subspace methods. They can also be useful when combined with multigrid methods [31]. This thesis investigates the effectiveness of SLOR and SLOR-ADI on the block sparse matrices arising from (1.1).

Linear Algebra Package

The performance of a sparse linear solver relies heavily on dense linear algebra packages. Intel and Nvidia provide high-performance linear algebra packages optimized for their own hardware, and they are among the most popular packages in the scientific computing community.

1.6.1 Intel Math Kernel Library

Intel Math Kernel Library (MKL) is a library of math routines that is optimized specifically for Intel processors [32]. It provides BLAS and LAPACK functionality with both C and Fortran interfaces. The C interface of this library is used in this thesis.

1.6.2 Nvidia cuBLAS and cuSolver

The development of Graphics Processing Units (GPUs) is revolutionizing the scientific computing community. Nvidia provides linear algebra packages similar to MKL for its own GPUs [33] [34]. Research has shown that significant speedup can be observed on GPUs for problems involving a large amount of level 2 or level 3 BLAS operations [35] [36] [37]. However, high performance can only be achieved with a dedicated GPU for scientific computing, which was not available for the work of this thesis.

Thesis Outline

A direct solver and an iterative solver for the sparse matrices arising from (1.1) are developed. The direct method is discussed in Chapter 2 and the iterative methods are discussed in Chapter 3. Detailed algorithms are provided for both. Verification of the direct and iterative solvers is presented in Chapter 4. Conclusions and future work are given in Chapter 5.


Chapter 2

Direct Methodology

This chapter explains the direct method used to solve the sparse linear systems arising from (1.1). Algorithms for symbolic analysis, numerical factorization and solving triangular systems are given. Error analysis and data structures are also presented.

Overall Algorithm

The overall algorithm of the direct method is given below.

Algorithm 2.1 Direct method algorithm
Input: N × N block sparse matrix in BSC form, N*n vector as the right-hand side
//Symbolic analysis phase
for k = 1 to N do
    select the kth pivot column from the candidate pivot columns
    find the nonzero pattern of the kth column of L and U
end for
apply the permutation P A P^T
//Numerical factorization phase
for k = 1 to N do
    compute column k of L and U
end for
//Sparse triangular system solving phase
solve L y = P b
solve U x' = y
compute x = P^T x'

The symbolic analysis phase uses a combined left-looking and right-looking method.


This phase is illustrated by Fig 2.1. At the kth step, the selection of the pivot column is based on the result of fill-reducing ordering, which is discussed in Section 2.4. The nonzero pattern of the kth column of L and U is then computed using the left-looking method, as discussed in Section 2.3. Finally, the estimate of the nonzero pattern of the remaining N - k columns is updated, which is a right-looking approach.

Figure 2.1: Illustration of symbolic analysis process.

The numerical factorization phase is separate from the symbolic phase. This feature makes the solver GPU friendly: the numerical factorization work can be sent to the GPU without large data transfers to and from the CPU during the process. In addition, having an independent numerical factorization process allows the solver to reuse the result of symbolic analysis for matrices with the same nonzero pattern. This process also relies on the left-looking method.

Solving Sparse Triangular System

The algorithm for solving a sparse triangular system is the fundamental mathematical kernel of the direct method. Therefore, it is discussed first.

Solving a sparse triangular system with a dense right-hand side (RHS) is essentially the same as solving a dense triangular system. The block-wise algorithms are given here.

Solving a lower triangular system $Lx = b$ with a dense RHS uses forward substitution, where $x$ and $b$ are both $N \times n$ dense block vectors. The algorithm is presented here [38].

Algorithm 2.2 Algorithm for lower triangular system with dense RHS
x = b
for j = 1 to N do
    x_j = L_jj^{-1} x_j
    for each i > j for which L_ij is nonzero do
        x_i = x_i - L_ij x_j
    end for
end for
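A minimal C sketch of Algorithm 2.2 for the block case is given below. It assumes that the strictly lower blocks of L are stored in BSC arrays (cptr, ridx, val), that the inverses of the diagonal blocks have been precomputed in diag_inv, and that x and b are stored as N contiguous blocks of length n; these storage details are assumptions for illustration, not the solver's exact interface.

#include <stdlib.h>
#include <string.h>
#include "mkl.h"

/* Block forward substitution Lx = b with dense RHS (Algorithm 2.2).
 * cptr/ridx/val hold the strictly lower n x n column-major blocks of L in BSC
 * form and diag_inv[j*n*n] holds the inverse of the diagonal block L_jj. */
void block_lower_solve(int N, int n, const int *cptr, const int *ridx,
                       const double *val, const double *diag_inv,
                       const double *b, double *x)
{
    double *tmp = (double *)malloc((size_t)n * sizeof(double));
    memcpy(x, b, (size_t)N * n * sizeof(double));
    for (int j = 0; j < N; j++) {
        /* x_j = L_jj^{-1} x_j (matrix-vector product with the inverted block) */
        cblas_dgemv(CblasColMajor, CblasNoTrans, n, n, 1.0,
                    diag_inv + (size_t)j * n * n, n, x + (size_t)j * n, 1,
                    0.0, tmp, 1);
        memcpy(x + (size_t)j * n, tmp, (size_t)n * sizeof(double));
        /* x_i = x_i - L_ij x_j for every nonzero block below the diagonal */
        for (int p = cptr[j]; p < cptr[j + 1]; p++) {
            int i = ridx[p];
            cblas_dgemv(CblasColMajor, CblasNoTrans, n, n, -1.0,
                        val + (size_t)p * n * n, n, x + (size_t)j * n, 1,
                        1.0, x + (size_t)i * n, 1);
        }
    }
    free(tmp);
}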

The algorithm for an upper triangular system $Ux = b$ is similar.

Algorithm 2.3 Algorithm for upper triangular system with dense RHS
x = b
for j = N to 1 do
    x_j = U_jj^{-1} x_j
    for each i < j for which U_ij is nonzero do
        x_i = x_i - U_ij x_j
    end for
end for

Changing from a dense RHS to a sparse RHS has a significant impact on the algorithm. For a lower triangular system, if any $x_j = 0$ then the jth column of L can be skipped in the computation. The problem becomes how to determine the nonzero pattern of $x$. Gilbert and Peierls provide a topological approach to find the nonzero pattern of $x$ [39]. For a lower triangular system such as the one shown in Fig 2.2, it can be stated in the following two statements [38]:

(1) if $b_i$ is nonzero, then $x_i$ is nonzero.

(2) if $x_i$ is nonzero and $L_{ji}$ is nonzero, then $x_j$ is nonzero.

Figure 2.2: Finding the non-zero pattern of x.

The above statements can be implemented by a non-recursive depth-first search (DFS) with a stack of size N [38]. This approach is used as part of the process of factorizing A into L and U.

Algorithm 2.4 Non-recursive DFS algorithm for lower triangular system
Let S be a stack of size N
Let bool be a Boolean variable
Let B be the nonzero pattern of b
Let X be the nonzero pattern of x
for each i ∈ B do
    if i is not labeled in X then
        label i in X
        push i onto the stack
    end if
    while the stack is nonempty do
        bool = false
        j = top of the stack
        for each k > j for which L_kj is nonzero do
            if k is not labeled in X then
                label k in X
                push k onto the stack
                bool = true
                break
            end if
        end for
        if not bool then
            pop the stack
        end if
    end while
end for
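Algorithm 2.4 can be written compactly in C as the following sketch. The interface (BSC arrays cptr/ridx for L, an explicit list B of the nonzero block rows of b, caller-provided mark and stack arrays) is an assumption made for illustration.

/* Non-recursive DFS (Algorithm 2.4): mark every block row of x that becomes
 * nonzero when solving Lx = b with a sparse b. L is in BSC form (cptr/ridx),
 * B/nb lists the nonzero block rows of b, and mark[] must be zero-initialized.
 * Returns the number of nonzero block rows found in x. */
int reach(int N, const int *cptr, const int *ridx,
          const int *B, int nb, int *mark, int *stack)
{
    int count = 0;
    for (int s = 0; s < nb; s++) {
        int top = 0;
        if (mark[B[s]]) continue;
        mark[B[s]] = 1;
        stack[top++] = B[s];
        count++;
        while (top > 0) {
            int j = stack[top - 1];
            int found = 0;                 /* the "bool" of Algorithm 2.4 */
            for (int p = cptr[j]; p < cptr[j + 1]; p++) {
                int k = ridx[p];
                if (k > j && !mark[k]) {   /* unvisited row below the diagonal */
                    mark[k] = 1;
                    stack[top++] = k;
                    count++;
                    found = 1;
                    break;
                }
            }
            if (!found) top--;             /* column j fully explored: pop */
        }
    }
    return count;
}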

Once the nonzero pattern of x is found, the lower triangular system can be solved by using a modified Algorithm 2.2, which is shown as Algorithm 2.5. [38]

Algorithm 2.5 Algorithm for lower triangular system with sparse RHS
x = b
for each j ∈ X do
    x_j = L_jj^{-1} x_j
    for each i > j for which L_ij is nonzero do
        x_i = x_i - L_ij x_j
    end for
end for

Algorithms 2.4 and 2.5 are used in computing the nonzero pattern and in the numerical factorization. The algorithms for an upper triangular system with a sparse RHS are applied in a similar manner and are omitted here.

Block-wise Left-looking Method

The left-looking method can be derived from the following $3 \times 3$ block expression of the LU factorization [38]. The uppercase terms represent submatrices and the lowercase terms represent vectors. The lower triangular factor L is assumed to have an identity diagonal.

$\begin{bmatrix} L_{11} & & \\ l_{21} & 1 & \\ L_{31} & l_{32} & L_{33} \end{bmatrix} \begin{bmatrix} U_{11} & u_{12} & U_{13} \\ & u_{22} & u_{23} \\ & & U_{33} \end{bmatrix} = \begin{bmatrix} A_{11} & a_{12} & A_{13} \\ a_{21} & a_{22} & a_{23} \\ A_{31} & a_{32} & A_{33} \end{bmatrix}$    (2.1)

In (2.1), the lowercase row and column of each matrix are the kth row and column of L, U, and A, respectively. Assuming no permutation, the left-looking method uses the first k - 1 columns of L and the kth column of A to compute the kth column of L and U. This requires solving the following lower triangular system using Algorithm 2.5:

$\begin{bmatrix} L_{11} & & \\ l_{21} & 1 & \\ L_{31} & 0 & I \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix}$    (2.2)

The solution to (2.2) gives:

$u_{12} = x_1, \qquad u_{22} = x_2, \qquad l_{32} = u_{22}^{-1} x_3$    (2.3)

This method accesses L, U and A column by column, so it can easily accommodate any column ordering with a column-major data structure.

The block-wise method is derived from (2.3) by replacing the vector-vector operations with matrix-vector operations. The division is also replaced by left multiplication with the inverse of the diagonal block. The corresponding algorithm is given below [38].

Algorithm 2.6 Left-looking method algorithm
for k = 1 to N do
    apply Algorithm 2.4 to find the nonzero pattern of the kth column of L
    apply Algorithm 2.5 to solve (2.2)
    compute (2.3)
end for


Symbolic Analysis and Fill-reducing Ordering

The derivation of the ordering approach is based on a symbolic analysis of outer-product Gaussian elimination, which is a right-looking method [24]. For the matrices within the scope of this thesis, only the diagonal blocks are nonsingular, so only the column ordering needs to be found; the row ordering will be the same as the column ordering because the diagonal blocks need to be inverted in the subsequent process.

The basic idea of symbolic analysis is to find an upper bound on the nonzero patterns of L and U. Consider the Gaussian elimination process of the following 8 × 8 block sparse matrix A, shown in (2.4). Let $A^k$ represent the lower $(N - k) \times (N - k)$ submatrix of A after the kth step of Gaussian elimination has been performed.

∗ ∗ ∗

∗ ∗ ∗ ∗

∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ [ ∗ ∗ ∗ ]    (2.4)

At the kth step, the pivot row updates the nonzero pattern of $A^k$ by letting all the rows of $A^{k-1}$ that have a nonzero in its left-most column take the same upper bound as their new nonzero pattern, namely the set union of those rows. Assuming the original ordering is used, after the first step the nonzero pattern of matrix A becomes:


∗ ∗ ∗

∗ ∗ ∗ ● ● ∗

∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ● ● ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ [ ∗ ∗ ∗ ]    (2.5)

The black circles in (2.5) represent the fill-ins. The 1st and 4th rows of $A^1$ now have the same upper bound on their nonzero pattern, which is the set union of the 1st, 2nd and 5th rows of $A^0$.

The second step can be implemented in the same way and the matrix A is turned into:

∗ ∗ ∗

∗ ∗ ∗ ● ● ∗

∗ ∗ ∗ ● ● ∗ ∗ ∗ ∗ ∗ ∗ ● ● ∗ ∗ ∗ ● ∗ ● ● ∗ ∗ ∗ ∗ ∗ ∗ ∗ [ ∗ ∗ ∗ ]    (2.6)

The 1st, 3rd and 4th rows of $A^2$ now have the same upper bound on their nonzero pattern, which is the set union of the 1st, 2nd, 4th and 5th rows of $A^1$.

After the 8th step, the matrix has become:

∗ ∗ ∗

∗ ∗ ∗ ● ● ∗

∗ ∗ ∗ ● ● ∗ ∗ ∗ ∗ ● ● ∗ ∗ ● ● ∗ ∗ ∗ ● ● ∗ ● ● ∗ ∗ ∗ ● ∗ ● ● ∗ ∗ ∗ [ ∗ ● ● ∗ ∗ ]    (2.7)

This is the upper bound on the nonzero pattern of $L + U$ using the original ordering.

Let $A_i$ be the nonzero pattern of row i of A, and $R_k$ be the upper bound formed at the kth step of elimination. From the above discussion, we have:

$R_k = \Big( \bigcup_{k = \min R_i} R_i \Big) \cup \Big( \bigcup_{k = \min A_i} A_i \Big) \setminus \{k\}$    (2.8)

This process is called regular row absorption [24]. It can be described by the row-merge tree [21], which is a slightly modified elimination tree of $A^T A$. The corresponding row-merge tree of (2.7) is presented in Fig 2.3.

(2.7) is presented in Fig 2.3.

By removing nodes of A from the row-merge tree, it becomes elimination tree of 퐴푇퐴 .

Therefore, the row-merge tree captures the same connection information as the elimination tree of 퐴푇퐴.

Figure 2.3: Row-merge tree of (2.6)

In the row absorption process, the sequence in which the rows are eliminated affects the upper bound on the nonzero pattern of L and U. Because a dense pivot row could destroy the sparsity of L and U, fill-reducing ordering needs to be applied during the elimination process.

The fill-reducing ordering is given by Algorithm 2.7 below, which is derived from Davis's and Gilbert's ordering approaches [24] [40]. The main modification made is the selection of the pivot row. For the matrices within the scope of this thesis, the off-diagonal blocks are singular, which means only one row in each column can be selected as the pivot row. That being said, the column with the sparsest pivot row is picked as the pivot column at each step.

Algorithm 2.7 is implemented together with Algorithm 2.4. It computes the nonzero pattern of L and U column by column. In each step, the nonzero pattern of the selected pivot column is computed by Algorithm 2.4, and then the nonzero pattern of all the unselected rows in the pivot column is updated with a new upper bound, as described in the row-merging process.

As an example, the nonzero pattern of $L + U - I$ for (2.4) is reduced when fill-reducing ordering is applied, as shown in (2.9).

∗ ∗ ∗

∗ ∗ ∗ ∗ ∗ ∗ ∗

∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ (2.9) ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ ∗ [ ∗ ∗ ∗ ∗ ∗ ∗ ]

Algorithm 2.7 Fill-reducing ordering algorithm
Let A_i be the nonzero pattern of row i of A
Let A_i^k be the nonzero pattern of row i of A^k
Let L_k be the nonzero pattern of column k of L
Let R_k be the upper bound of the nonzero pattern formed at the kth step
Let d_i be the value of the upper bound of the nonzero pattern of row i
Let P_i be the permutation of row/column i
Let Pinv_i be the inverse/transpose of P
Let AA_i be the highest ancestor of A_i in the row-merge tree
pivot = 0; d = N; R_k = ∅
for k = 1 to N do
    pivot = k
    for each candidate pivot column i do
        if d_i < d then
            d = d_i
            pivot = i
        end if
    end for
    P_k = pivot
    Pinv_pivot = k
    apply Algorithm 2.4 to compute the nonzero pattern of column k of L and U
    for each i ∈ L_k do
        if row i has not been absorbed into a pivot row then
            R_k = R_k ∪ A_i
        else
            A_i^k = R_{AA_i}
            R_k = R_k ∪ A_i^k
        end if
    end for
    R_k = R_k \ {k}
    for each i ∈ L_k do
        AA_i = k
        d_i = |R_k|
    end for
end for

Error Analysis

The direct solution of $Ax = b$ is subject to floating-point error. Let $x'$ be the computed solution of $Ax = b$ and $r$ the residual vector for $x'$. Then for any natural norm it can be proven that [41]:

$\|x - x'\| \le \|r\| \cdot \|A^{-1}\|$    (2.10)

and if $\|x\| \neq 0$ and $\|b\| \neq 0$,

$\dfrac{\|x - x'\|}{\|x\|} \le \|A\| \cdot \|A^{-1}\| \dfrac{\|r\|}{\|b\|} = \mathrm{cond}(A) \dfrac{\|r\|}{\|b\|}$    (2.11)

in which $\mathrm{cond}(A)$ is the condition number of $A$.

If the computation is carried out in t-digit arithmetic, then it can be shown that the residual vector can be approximated by [42]:

$\|r\| \approx 10^{-t} \|A\| \cdot \|x'\|$    (2.12)

An estimate of the condition number can be obtained by solving:

$Ay = r$    (2.13)

because the vector y can be expressed as:

$y = A^{-1} r = A^{-1}(b - Ax') = x - x'$    (2.14)

Combining (2.12) and (2.14),

$\|y\| = \|A^{-1} r\| \le \|A^{-1}\| \cdot \|r\| \approx 10^{-t} \|x'\| \, \mathrm{cond}(A)$    (2.15)

Hence the condition number can be estimated from (2.15) as:

$\mathrm{cond}(A) \approx 10^{t} \dfrac{\|y'\|}{\|x'\|}$    (2.16)

where $y'$ is the solution of $Ay = r$ obtained with t-digit arithmetic. This estimate can, however, be very conservative.

Iterative Refinement

As an option of the direct solver, the residual norm of $Ax = b$ can be reduced by iterative refinement, which uses the solution of (2.13). The process can be repeated as many times as needed, but usually one step is enough. The algorithm is given below.

Algorithm 2.8 Iterative refinement algorithm [41]
compute r = b - Ax
Let M be the maximum number of refinement steps
k = 0
while k < M do
    solve Ay = r
    compute x = x + y
    compute r' = b - Ax
    r = r'
    k = k + 1
end while
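As a sketch of how Algorithm 2.8 maps onto the solver's data structures (described in the Data Structure section of this chapter), the fragment below reuses the factors for each refinement step. bsr_matvec and lu_resolve are hypothetical stand-ins for the solver's block matrix-vector product and for repeating the forward/backward substitution with the stored factors; they are not actual routines of this code.

#include "mkl.h"

struct HYBC;
struct BSCLU;
/* Hypothetical stand-ins for the solver's block mat-vec and LU re-solve. */
void bsr_matvec(const struct HYBC *A, const double *x, double *y);
void lu_resolve(const struct BSCLU *F, const double *rhs, double *y);

/* One or more steps of iterative refinement (Algorithm 2.8).
 * len = N * n is the length of the solution vector; r and y are workspaces. */
void iterative_refinement(const struct HYBC *A, const struct BSCLU *F,
                          const double *b, double *x, double *r, double *y,
                          int len, int max_steps)
{
    for (int k = 0; k < max_steps; k++) {
        /* r = b - A x */
        cblas_dcopy(len, b, 1, r, 1);
        bsr_matvec(A, x, y);
        cblas_daxpy(len, -1.0, y, 1, r, 1);
        /* solve A y = r with the existing LU factors, then x = x + y */
        lu_resolve(F, r, y);
        cblas_daxpy(len, 1.0, y, 1, x, 1);
    }
}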

Memory Management

The memory space of this solver is allocated dynamically on the heap. The solver reads a formatted matrix file into pre-allocated memory. If the space is not enough while reading a matrix from a file, the solver allocates a bigger chunk of memory, copies the contents to the new space and frees the old space. The memory for $L + U - I$ is allocated at exactly the required size using the result of symbolic analysis, and it is freed after the solver finishes. No large strides are involved during computation because the matrix is stored in column-major order. If the solver cannot allocate enough space, it frees all the memory allocated on the heap and prints an error message to the screen.

The memory needed to store the structural information is linearly proportional to the number of nonzero blocks, which is trivial compared with the memory requirement of L and U for the matrices considered in this thesis. Therefore, the memory requirement of this direct solver is bounded by the number of nonzero blocks in the matrix factors.

Data Structure

The input data structure has structural information stored in both BSC and BSR form; the numerical values are stored in BSC form, and the dense blocks are stored in column-major order. The HYBC struct is initialized by the function that reads the matrix from a file, which requires the dimension of A and the dimension of the dense blocks as input parameters.


struct HYBC {
    int cc_dim;      //number of rows and columns
    int cc_bdim;     //dimension of each dense block
    int nzb;         //number of nonzero blocks
    int *cc_cptr;    //BSC column pointers
    int *cr_rptr;    //BSR row pointers
    int *cc_ridx;    //BSC row indices
    int *cr_cidx;    //BSR column indices
    double *cc_dat;  //numerical values, each block in column major
};

The result of symbolic analysis and numerical factorization is stored in the BSCLU struct. The structural information is stored in both BSC and BSR form. The numerical values are stored in BSC form. The identity diagonal of L is omitted from both the structural information and the numerical values. The array p is the permutation vector that stores the result of the fill-reducing ordering, where p[i] = j means the jth column of A is placed at the ith position. The array pinv is the inverse of p, which is also the transpose of the permutation.

struct BSCLU {
    int dim_A;        //matrix dim
    int bdim_A;       //block dim
    int *cptr_A;      //col pointers of filled graph of A
    int *ridx_A;      //row indices of filled graph of A
    int *colct_L;     //col count of L
    int *colct_U;     //col count of U
    int *row_sq;      //the sequence of each row per col
    double *dat;      //numerical values of L + U - I
    double *diag_inv; //inverses of diagonal blocks
    int *p;           //permutation
    int *pinv;        //inverse permutation
};

The RHS input is a pointer to an array of size N * n. This array is overwritten by the solution vector when the solver finishes.


Chapter 3

SLOR/SLOR-ADI Methodology

This chapter outlines the line-based SLOR method and the SLOR-ADI method for solving the sparse linear systems in (1.1). The iterative schemes and detailed algorithms for both methods are given. Convergence analysis and data structures are also presented.

Overall Algorithm of SLOR Method

The numerical scheme has been defined in (1.16). In each iteration step, (1.16) is applied in a line-by-line manner, as illustrated in Fig 3.1. The construction of the lines is based on the structured grid. The overall algorithm is given below [28].

Figure 3.1: Illustration of SLOR method


Algorithm 3.1 SLOR algorithm
Input: N × N block sparse matrix with n × n dense blocks in BSR form, N*n vector as RHS
Let M be the maximum number of iteration steps
Let L be the number of grid lines in the domain
Let l, i, j be grid line indices
Let x^0 be the initial guess of the solution
Let k = 0
while k < M do
    for l = 1 to L do
        TD_l x_l^{k+1} + ω L'_i x_i^{k+1} = (1 - ω) TD_l x_l^k - ω U'_j x_j^k + ω b_l
    end for
    k = k + 1
end while
check residual

Symbolic Analysis of SLOR Method

The matrix elements representing coupling between grid lines must be grouped with the proper lines of unknowns. The purpose of symbolic analysis is to determine the indices i and j in Algorithm 3.1. Consider the following 4 × 3 structured 2D grid with natural ordering, shown in Fig 3.2.

Figure 3.2: A structured 2D 4×3 grid with natural ordering.


The corresponding system of linear equations $Ax = (TD + L' + U')x = b$ is structured as:

$\begin{bmatrix}
\ast & \ast & & & \circ & & & & & & & \\
\ast & \ast & \ast & & & \circ & & & & & & \\
 & \ast & \ast & \ast & & & \circ & & & & & \\
 & & \ast & \ast & & & & \circ & & & & \\
\bullet & & & & \ast & \ast & & & \circ & & & \\
 & \bullet & & & \ast & \ast & \ast & & & \circ & & \\
 & & \bullet & & & \ast & \ast & \ast & & & \circ & \\
 & & & \bullet & & & \ast & \ast & & & & \circ \\
 & & & & \bullet & & & & \ast & \ast & & \\
 & & & & & \bullet & & & \ast & \ast & \ast & \\
 & & & & & & \bullet & & & \ast & \ast & \ast \\
 & & & & & & & \bullet & & & \ast & \ast
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{12} \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_{12} \end{bmatrix}$    (3.1)

The tri-diagonal system TD is represented by stars, the remaining lower triangular part L' by black dots, and the remaining upper triangular part U' by white dots. The elements in Fig 3.2 are ordered along the longest grid dimension; therefore this grid has three lines. Each element in Fig 3.2 is mapped to its corresponding row in (3.1) with the same ordering number. Elements numbered 1 to 4 belong to the 1st line of the grid, which corresponds to rows 1 to 4 of (3.1). Similarly, the 2nd line elements correspond to rows 5 to 8, and the 3rd line elements correspond to rows 9 to 12. In addition, each row of (3.1) represents the connection of each element with its neighbors.

For grid line l, according to Algorithm 3.1, the SLOR iteration is:

$TD_l x_l^{k+1} + \omega L_i' x_i^{k+1} = (1 - \omega) TD_l x_l^k - \omega U_j' x_j^k + \omega b_l$    (3.2)

in which l, i, j are grid line numbers. To use the Thomas algorithm to solve the tri-diagonal system, L' and U' need to be matched with the proper lines of the x vector. In (3.1), the U' entries of rows 1 to 4 are matched to the 2nd line of x, which is represented by rows 5 to 8 of x. The L' and U' entries of rows 5 to 8 are matched to the 1st and 3rd lines of x, respectively. The L' entries of rows 9 to 12 are matched to the 2nd line of x.

Finding the correct match for L' and U' can be done by traversing the first row of A for each grid line, which avoids going through all the nonzero elements in matrix A. The key idea comes from the observation that elements of L' or U' from the same grid line have the same offset within their rows of the matrix. In (3.1), for example, the L' entries of the 2nd and 3rd lines are the first nonzero in each of their rows, and the U' entries of the 1st and 2nd lines are the last nonzero in each of their rows. The algorithm is given below.

Algorithm 3.2 Line-matching algorithm
Let i be the row to traverse for each grid line
Let j be a column index in row i
Let td_offset be the location of the tri-diagonal system
Let offset be the array that stores the location of column j
Let xline be the array that stores the corresponding line of column j
ct = 0
for each column j in row i of A do
    if j < i then
        offset[ct] = location of column j in row i - starting point of row i
        xline[ct] = floor(j / dimension of each line) + 1
        ct = ct + 1
    end if
    if j == i then
        td_offset = location of column j in row i - starting point of row i
    end if
    if j > i + 1 then
        offset[ct] = location of column j in row i - starting point of row (i + 1)
        xline[ct] = floor(j / dimension of each line) + 1
        ct = ct + 1
    end if
end for
for k = 1 to ct do
    if offset[k] ≥ 0 then
        i = xline[k]
        compute L'_i x_i^{k+1}
    end if
    if offset[k] < 0 then
        i = xline[k]
        compute U'_i x_i^k
    end if
end for

For example, the matrix rows that need to be traversed in (3.1) are rows 1, 5 and 9. For row 5, the first column index is 1. It belongs to L' with offset 0, and it should be matched to the 1st line of x. The L' entries in rows 5 to 8 have the same offset and matching. The second column index is 5. Its offset is 1 and it is the first element of the tri-diagonals; the tri-diagonals in rows 5 to 8 can be located by this offset. The third column index is 6 and it is ignored because it also belongs to the tri-diagonals. The fourth column index is 9 and it belongs to U'. Its offset is -1 (measured from the start of the next row) and it is matched to the 3rd line of x, which means the U' entries in rows 5 to 8 share this offset and matching.

Block-wise Thomas Algorithm

The tri-diagonal systems are solved simultaneously using the block-wise Thomas algorithm. The tri-diagonal system in (3.3) can be solved by Algorithm 3.3 below. The difference between Algorithm 3.3 and the regular Thomas algorithm is that the multiplications and divisions are replaced by level 2 and level 3 BLAS and LAPACK operations.


$\begin{bmatrix}
b_1 & c_1 & & & \\
a_2 & b_2 & c_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & a_{n-1} & b_{n-1} & c_{n-1} \\
 & & & a_n & b_n
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-1} \\ y_n \end{bmatrix}$    (3.3)

Algorithm 3.3 Block-wise Thomas algorithm [7]
Let γ_1 = β_1 = 0
Let x_{n+1} = 0
for i = 1 to n do
    γ_{i+1} = -(a_i γ_i + b_i)^{-1} c_i
    β_{i+1} = (a_i γ_i + b_i)^{-1} (y_i - a_i β_i)
end for
for i = n + 1 to 2 do
    x_{i-1} = γ_i x_i + β_i
end for
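A possible realization of Algorithm 3.3 with MKL calls is sketched below. The storage layout (separate arrays a, b, c of column-major n × n blocks for one grid line, with caller-provided gamma, beta, M and ipiv workspaces) is an assumption for illustration; LAPACKE_dgetrf/LAPACKE_dgetrs apply the inverse of a_i γ_i + b_i and cblas_dgemm/cblas_dgemv supply the block products.

#include <string.h>
#include "mkl.h"

/* Block-wise Thomas algorithm (Algorithm 3.3) for one grid line of nblk blocks.
 * a, b, c hold the sub-, main- and super-diagonal n x n blocks (column major,
 * block i at offset i*n*n; a[0] and c[nblk-1] unused); y is the RHS and x the
 * solution, both nblk blocks of length n. gamma/beta hold (nblk+1) blocks. */
void block_thomas(int nblk, int n,
                  const double *a, const double *b, const double *c,
                  const double *y, double *x,
                  double *gamma, double *beta, double *M, MKL_INT *ipiv)
{
    size_t bs = (size_t)n * n;
    memset(gamma, 0, bs * sizeof(double));                 /* gamma_1 = 0 */
    memset(beta, 0, (size_t)n * sizeof(double));           /* beta_1  = 0 */
    for (int i = 0; i < nblk; i++) {
        /* M = b_i + a_i * gamma_i (the a_i term is absent for the first block) */
        memcpy(M, b + (size_t)i * bs, bs * sizeof(double));
        if (i > 0)
            cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, n, n, n,
                        1.0, a + (size_t)i * bs, n, gamma + (size_t)i * bs, n,
                        1.0, M, n);
        /* right-hand side for beta_{i+1}: y_i - a_i * beta_i */
        memcpy(beta + (size_t)(i + 1) * n, y + (size_t)i * n, (size_t)n * sizeof(double));
        if (i > 0)
            cblas_dgemv(CblasColMajor, CblasNoTrans, n, n, -1.0,
                        a + (size_t)i * bs, n, beta + (size_t)i * n, 1,
                        1.0, beta + (size_t)(i + 1) * n, 1);
        /* right-hand side for gamma_{i+1}: -c_i (no c_i for the last block) */
        if (i < nblk - 1) {
            memcpy(gamma + (size_t)(i + 1) * bs, c + (size_t)i * bs, bs * sizeof(double));
            cblas_dscal((MKL_INT)bs, -1.0, gamma + (size_t)(i + 1) * bs, 1);
        }
        /* apply (a_i gamma_i + b_i)^{-1} to both right-hand sides */
        LAPACKE_dgetrf(LAPACK_COL_MAJOR, n, n, M, n, ipiv);
        LAPACKE_dgetrs(LAPACK_COL_MAJOR, 'N', n, 1, M, n, ipiv,
                       beta + (size_t)(i + 1) * n, n);
        if (i < nblk - 1)
            LAPACKE_dgetrs(LAPACK_COL_MAJOR, 'N', n, n, M, n, ipiv,
                           gamma + (size_t)(i + 1) * bs, n);
    }
    /* Back substitution: x_i = gamma_{i+1} x_{i+1} + beta_{i+1}, with x_{nblk+1} = 0 */
    for (int i = nblk - 1; i >= 0; i--) {
        memcpy(x + (size_t)i * n, beta + (size_t)(i + 1) * n, (size_t)n * sizeof(double));
        if (i < nblk - 1)
            cblas_dgemv(CblasColMajor, CblasNoTrans, n, n, 1.0,
                        gamma + (size_t)(i + 1) * bs, n, x + (size_t)(i + 1) * n, 1,
                        1.0, x + (size_t)i * n, 1);
    }
}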

Convergence Analysis of SLOR Method

The SLOR numerical scheme (1.16) is rewritten here as:

$x^{k+1} = (TD + \omega L')^{-1}[(1 - \omega)TD - \omega U'] x^k + (TD + \omega L')^{-1} \omega b$    (3.4)

Let

$T_s = (TD + \omega L')^{-1}[(1 - \omega)TD - \omega U']$    (3.5)

and

$c_s = (TD + \omega L')^{-1} \omega b$    (3.6)

These give (3.4) the generic iteration form

$x^{k+1} = T_s x^k + c_s$    (3.7)

The analysis of (3.7) focuses on the iteration matrix $T_s$.


The spectral radius $\rho(T_s)$ of the matrix $T_s$ is defined by:

$\rho(T_s) = \max |\lambda| = \lim_{n \to \infty} \|T_s^n\|^{1/n}$    (3.8)

where $\lambda$ is an eigenvalue of $T_s$.

The spectral radius of $T_s$ is independent of the choice of matrix norm. Hence:

$\rho(T_s) \le \|T_s\|$    (3.9)

It has been proven that (3.7) converges to the unique solution of $x = T_s x + c_s$ if and only if [41]

$\rho(T_s) < 1$    (3.10)

and the following error bound holds:

$\|x - x^k\| \le \|T_s\|^k \|x^0 - x\|$    (3.11)

The error can also be estimated by:

$\|x^k - x\| \approx \rho(T_s)^k \|x^0 - x\|$    (3.12)

The spectral radius can be found using the power method, as shown in Algorithm 3.4. If the matrix has a unique dominant eigenvalue and the initial vector has a nonzero component along the dominant eigenvector, then the power method converges to the dominant eigenvalue, whose modulus gives the spectral radius, and to the corresponding eigenvector [43]. However, the power method will not converge if the matrix has multiple dominant eigenvalues [44]. In this case, (3.12) can be used to estimate the spectral radius.

It should also be noted that the spectral radius of $T_s$ depends on the value of the relaxation factor $\omega$. W. Kahan established the maximum range of the relaxation factor [45]:

$0 < \omega < 2$    (3.13)


The choice of the optimal $\omega$ is problem dependent [46]. There is no established theorem for the matrices within the scope of this thesis, so numerical experiments need to be performed to find the optimal $\omega$.

Algorithm 3.4 Power method algorithm [41]
Let q^0 be an initial vector
Let M be the maximum number of iteration steps
while k < M do
    z^k = A q^{k-1}
    q^k = z^k / ||z^k||
    λ^k = (q^k)^T A q^k
    error = ||q^k - q^{k-1}||
    if error is small enough then
        break
    end if
    k = k + 1
end while
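The following C sketch implements Algorithm 3.4 for a dense, explicitly stored matrix; the iteration matrix $T_s$ is not formed explicitly by the solver, so this is only an illustration of the method itself, with a Rayleigh-quotient estimate of the dominant eigenvalue.

#include "mkl.h"

/* Power method (Algorithm 3.4) for a dense N x N column-major matrix A.
 * q must contain a nonzero starting vector and is overwritten with the
 * approximate dominant eigenvector; z is a workspace of length N.
 * Returns the Rayleigh-quotient estimate of the dominant eigenvalue, whose
 * modulus approximates the spectral radius when the method converges. */
double power_method(int N, const double *A, double *q, double *z,
                    int max_steps, double tol)
{
    double lambda = 0.0;
    cblas_dscal(N, 1.0 / cblas_dnrm2(N, q, 1), q, 1);      /* normalize q^0 */
    for (int k = 0; k < max_steps; k++) {
        /* z = A q and the Rayleigh quotient lambda = q^T A q */
        cblas_dgemv(CblasColMajor, CblasNoTrans, N, N,
                    1.0, A, N, q, 1, 0.0, z, 1);
        lambda = cblas_ddot(N, q, 1, z, 1);
        /* next iterate q = z / ||z||, and measure how much q changed */
        cblas_dscal(N, 1.0 / cblas_dnrm2(N, z, 1), z, 1);
        cblas_daxpy(N, -1.0, z, 1, q, 1);                  /* q = q_old - q_new */
        double change = cblas_dnrm2(N, q, 1);
        cblas_dcopy(N, z, 1, q, 1);                        /* accept q_new */
        if (change < tol)
            break;
    }
    return lambda;
}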

SLOR-ADI Method

An important enhancement of line-based iterative methods is to solve strongly coupled unknowns simultaneously. The SLOR method can be combined with the ADI method to accelerate convergence. The key idea of ADI is to apply SLOR along all three directions, since the construction of the lines is based on the structured grid, as illustrated in Fig 3.3. This gives the capability of exploiting the coupling along all three directions.

Figure 3.3: Illustration of SLOR-ADI method.


Applying SLOR along the y and z directions can be done by reordering the matrix. Consider the following 3 × 2 × 2 structured 3D grid.

Figure 3.4: 3×2×2 structured 3D grid with natural ordering.

The grid is ordered along the x direction, which is the longest dimension. The nonzero pattern of its Jacobian matrix is:

$\begin{bmatrix}
\ast & \ast & & \circ & & & \circ & & & & & \\
\ast & \ast & \ast & & \circ & & & \circ & & & & \\
 & \ast & \ast & & & \circ & & & \circ & & & \\
\bullet & & & \ast & \ast & & & & & \circ & & \\
 & \bullet & & \ast & \ast & \ast & & & & & \circ & \\
 & & \bullet & & \ast & \ast & & & & & & \circ \\
\bullet & & & & & & \ast & \ast & & \circ & & \\
 & \bullet & & & & & \ast & \ast & \ast & & \circ & \\
 & & \bullet & & & & & \ast & \ast & & & \circ \\
 & & & \bullet & & & \bullet & & & \ast & \ast & \\
 & & & & \bullet & & & \bullet & & \ast & \ast & \ast \\
 & & & & & \bullet & & & \bullet & & \ast & \ast
\end{bmatrix}$    (3.14)

The matrix (3.14) has four tri-diagonal blocks because the original ordering has four lines, and (3.14) is also the matrix used by SLOR in the x direction. The ADI method applies the SLOR scheme along all coordinate directions. To apply SLOR along the y direction, the grid and its matrix need to be reordered along the y direction with the same origin as that of the x direction. For the grid in Fig 3.4, 1 remains 1, 2 becomes 3, 3 becomes 5, 4 becomes 2, 5 becomes 4, and 6 remains 6, etc. Let $P_y$ represent the permutation along the y direction; the reordered system can be written as:

$P_y A P_y^T (P_y x) = P_y b$    (3.15)

The first step in determining $P_y$ is to find the direction of the y axis in the originally ordered grid; in other words, to find the second element of the y-direction ordered grid. Since this element is connected to the origin, it must be the lowest-numbered element in the first row of U' (denoted by white dots in (3.14)) of the original matrix. Once the direction is determined, the elements can be reordered line by line.

The algorithm for determining $P_y$ is given below.

Algorithm 3.5 y-direction reordering algorithm
Let xDim be the dimension of the x direction
Let yDim be the dimension of the y direction
Let zDim be the dimension of the z direction
Let line be the grid line number of the original ordering
Let P_y be the permutation vector
Let InvP_y be the inverse of P_y
for k = 1 to zDim do
    for i = 1 to yDim do
        line = 0 + k × yDim + i
        for n = 1 to xDim do
            start = k × xDim × yDim + i
            InvP_y[line × xDim + n] = start + n × yDim
            P_y[start + n × yDim] = line × xDim + n
        end for
    end for
end for

The nonzero pattern of the matrix for the y-direction ordered grid of Fig 3.4 is shown below. (3.16) has 6 tri-diagonal blocks because the grid has 6 lines when it is ordered along the y direction.

$\begin{bmatrix}
\ast & \ast & \circ & & & & \circ & & & & & \\
\ast & \ast & & \circ & & & & \circ & & & & \\
\bullet & & \ast & \ast & \circ & & & & \circ & & & \\
 & \bullet & \ast & \ast & & \circ & & & & \circ & & \\
 & & \bullet & & \ast & \ast & & & & & \circ & \\
 & & & \bullet & \ast & \ast & & & & & & \circ \\
\bullet & & & & & & \ast & \ast & \circ & & & \\
 & \bullet & & & & & \ast & \ast & & \circ & & \\
 & & \bullet & & & & \bullet & & \ast & \ast & \circ & \\
 & & & \bullet & & & & \bullet & \ast & \ast & & \circ \\
 & & & & \bullet & & & & \bullet & & \ast & \ast \\
 & & & & & \bullet & & & & \bullet & \ast & \ast
\end{bmatrix}$    (3.16)

The z-direction reordering can be implemented in a similar fashion:

$P_z A P_z^T (P_z x) = P_z b$    (3.17)

The algorithm for finding $P_z$ is given below. The main difference from Algorithm 3.5 is that the lines being reordered run along the z direction instead of the y direction.

Algorithm 3.6 z-direction reordering algorithm
Let xDim be the dimension of the x direction
Let yDim be the dimension of the y direction
Let zDim be the dimension of the z direction
Let line be the grid line number of the original ordering
Let P_z be the permutation vector
Let InvP_z be the inverse of P_z
for k = 1 to yDim do
    for i = 1 to zDim do
        line = 0 + k + i × yDim
        for n = 1 to xDim do
            start = k × xDim × zDim + i
            InvP_z[line × xDim + n] = start + n × zDim
            P_z[start + n × zDim] = line × xDim + n
        end for
    end for
end for

퐼푛푣푃푧[푙푖푛푒 × 푥퐷푖푚 + 푛] = 푠푡푎푟푡 + 푛 × 푧퐷푖푚 푃푧[푠푡푎푟푡 + 푛 × 푧퐷푖푚] = 푙푖푛푒 × 푥퐷푖푚 + 푛 end for end for end for

Before applying SLOR to the y and z directions, the matrix A, the solution vector x and the right-hand-side vector b need to be permuted according to (3.15) and (3.17). Permuting x and b is straightforward and lightweight; however, permuting the matrix A is expensive. Explicitly permuting A requires extra memory space equal to that of the original matrix A, which would defeat the purpose of a memory-saving iterative method. The main idea for avoiding moving the actual numerical data around is to use indirect addressing. In that case, only the structure arrays of the new matrix need to be constructed. Indirect addressing is implemented by adding an array whose size is the number of nonzero blocks, which stores the location of each nonzero block in the original matrix A. The permutation algorithms are given below.

Algorithm 3.7 Vector permutation algorithm
Let P be the permutation vector
Let inverseP be the inverse of P
Let x be the vector to permute
Let x' be the permuted x
//permute x
for k = 1 to N do
    x'[k] = x[P[k]]
end for
//recover x
for k = 1 to N do
    x[k] = x'[inverseP[k]]
end for
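Algorithm 3.7 amounts to a block-wise gather, as in the short C sketch below; the block length n and the array layout follow the BSR data structure, and the function name is illustrative.

#include <string.h>

/* Block-wise vector permutation (Algorithm 3.7): block k of the permuted
 * vector xp is block P[k] of x, where each block holds n values. The
 * recovery step is the same call with the inverse permutation. */
void permute_block_vector(int N, int n, const int *P,
                          const double *x, double *xp)
{
    for (int k = 0; k < N; k++)
        memcpy(xp + (size_t)k * n, x + (size_t)P[k] * n,
               (size_t)n * sizeof(double));
}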

Algorithm 3.8 Implicit matrix permutation algorithm
Let P be the permutation vector
Let inverseP be the inverse of P
Let A be the matrix to permute
Let A' be the permuted A
Let C be a workspace of size N
Set C to -1
for k = 1 to N do
    i = P[k]
    for each column j in row i of A do
        j' = inverseP[j]
        C[j'] = sequence of A[i][j]
    end for
    for n = 1 to N do
        if C[n] ≥ 0 then
            store n in row k of A'
            store C[n] in row k of A'
            C[n] = -1
        end if
    end for
end for

As in the SLOR method, the choice of the relaxation factor $\omega$ for each direction has a significant impact on the rate of convergence. Moreover, the combined $\omega_x$, $\omega_y$ and $\omega_z$ may act differently from how they act when applied alone [47]. In other words, the relaxation factors that minimize the spectral radius of each direction individually may not be the best choice when they are put together. The choice of $\omega$ for each direction is problem dependent, and theorems have been established for elliptic problems [48]. For the problem in this thesis, numerical experiments need to be performed to find the optimal relaxation factors.

Memory Management

The SLOR/SLOR-ADI solver uses the same memory management strategy as the direct solver, and it uses less memory than the direct solver since it does not need to store the LU factors. However, large strides are involved during the computation because the matrix is stored in row-major order. All memory is allocated dynamically and released when the solver finishes. If an allocation fails, the solver prints an error message and releases the heap memory.


Data Structure

The input data structure required by the SLOR and ADI solvers is the BSR form.

struct BSR //compressed block row form
{
    MKL_INT dim;       //block matrix dimension
    MKL_INT bdim;      //dimension of each block submatrix
    MKL_INT nzb;       //number of nonzero blocks
    MKL_INT *cr_rptr;  //row pointers
    MKL_INT *cr_cidx;  //column indices
    double *cr_dat;    //numerical values of each block, stored in column major
};

This data structure is initialized by the function that reads the matrix A from a file, which requires the dimension of A and the dimension of the dense blocks (n) as input parameters.

The RHS input is a pointer to an array of size N * n. This array is overwritten by the solution vector once the solver has finished.


Chapter 4

Implementation and Results

This chapter details the implementation and results of the direct method and the line-based iterative methods discussed in the previous chapters. A description of the test case is provided, and the test results are presented.

Test Case

The test case is an inviscid flow over a Gaussian smooth bump in a channel, computed using a high-order Discontinuous Galerkin solver [49]. The domain is represented by a 6 × 2 × 2 grid as shown in Fig 4.1. The inlet boundary condition is specified by stagnation pressure and temperature with the flow direction imposed along the x axis. The outlet boundary condition is specified by a constant static pressure. The contours of pressure coefficient are shown in Fig 4.2 for different orders of the DG polynomial.


Figure 4.1: 6×2×2 3D structured grid

Figure 4.2: Contours of pressure coefficient. [49]

This 3퐷 grid has 24 elements, so its Jacobian matrix is a 24 × 24 hepta-diagonal sparse matrix, whose nonzero pattern is shown in Fig 4.3. The number of nonzero elements is 112.


Figure 4.3: Nonzero pattern of Jacobian matrix.

As mentioned before, the Jacobian matrix is block-structured. The block dimension represents the degrees of freedom in each grid volume for the 5 equations (mass, 3 momentum components, and energy) to be solved. In 3D problems, an nth-degree polynomial basis has $(n + 1)^3$ degrees of freedom per equation. The P1 (piecewise linear), P2 (piecewise quadratic), and P3 (piecewise cubic) bases are used for constructing the solution in the test cases. Therefore, the block dimension of the P1 case is 40, of P2 is 135, and of P3 is 320.

Test Configuration

Hardware

Processor: Intel Core i7-7500 2.7 GHz

Memory: 16 GB DDR4

GPU: NVIDIA GeForce 940MX with 2GB VRAM

Software


Windows 10 Home

Microsoft Visual Studio 2017

Intel MKL 2018

CUDA Toolkit 9.2

Direct Method

The nonzero patterns of $L + U - I$ with the original ordering and with fill-reducing ordering are shown in Fig 4.4 and Fig 4.5, respectively. These figures reveal the effectiveness of fill-reducing ordering: it reduces the workload of the subsequent numerical factorization by reducing the number of nonzero elements (394 nonzero elements with the original ordering versus 238 with fill-reducing ordering; the full matrix would have $24^2 = 576$ elements).

Figure 4.4: Nonzero pattern of L+U-I with original ordering.


Figure 4.5: Nonzero pattern of L+U-I with fill-reducing ordering.

Fig 4.6 shows the total execution time for the P1, P2, and P3 matrices. The computation time is measured with the original ordering (denoted by "0") and with fill-reducing ordering (denoted by "1"). Fig 4.7 shows the execution time per equation.

It should be noted from Fig 4.6 that the LU factorization dominates the entire execution time. Since the left-looking LU factorization should take time proportional to the number of floating-point operations, this explains why reducing the number of nonzeros significantly reduces the total execution time.


[Chart data: total time and LU factorization time (ms) for P1, P2, P3; series: Time_0, Time_1, LU_0, LU_1.]

Figure 4.6: Direct method execution time (ms).

[Chart data: execution time per equation (ms) for P1, P2, P3; series: t_0 (original ordering), t_1 (fill-reducing ordering).]

Figure 4.7: Execution time per equation (ms).

The scalability of the direct method is better illustrated in log-scaled charts. Fig 4.8 through Fig 4.11 show that the original ordering and the fill-reducing ordering have similar scalability.



Figure 4.8: Log scaled execution time with original ordering.


Figure 4.9: Log scaled execution time with fill-reducing ordering.



Figure 4.10: Log scaled execution time per equation with original ordering.


Figure 4.11: Log scaled execution time per equation with fill-reducing ordering.

Since the LU factorization is the dominant part of the whole process, it determines the performance of the solver. As mentioned before, the left-looking LU factorization time should be proportional to the number of floating-point operations. The floating-point operations take place in BLAS gemm calls and LAPACK getrf/getri calls during the LU factorization. The execution time ratios of the LU factorization, compared with the corresponding ratios of floating-point operations, are shown in Fig 4.12 and Fig 4.13. Only the P2 and P3 cases are considered here because the execution time for the P1 case is too short, and the overhead could be significant.

The gemm operation is called 1984 times and the getrf/getri operations are called 24 times when using the original ordering; the corresponding numbers for the fill-reducing ordering are 710 and 24, respectively. From Fig 4.12 and Fig 4.13, the time ratios are bounded by the floating-point operation ratios for the P2 and P3 matrices.
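As a concrete illustration of where these floating-point operations occur, the sketch below shows the two kinds of MKL calls being counted: a block update of the form C = C − A·B via cblas_dgemm, and the factor-and-invert of a dense diagonal block via LAPACKE_dgetrf/LAPACKE_dgetri. It assumes column-major blocks of dimension nb; it is a sketch of the call pattern, not the thesis code itself.

#include <mkl.h>

/* Block update: C <- C - A * B, with nb x nb column-major blocks.      */
/* This is the gemm call counted above (1984 / 710 times).              */
static void block_update(MKL_INT nb, const double *A, const double *B, double *C)
{
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                nb, nb, nb, -1.0, A, nb, B, nb, 1.0, C, nb);
}

/* Factor and invert a dense diagonal block in place (getrf + getri),   */
/* called once per block row (24 times for this grid).                  */
static lapack_int block_factor_invert(MKL_INT nb, double *D, MKL_INT *ipiv)
{
    lapack_int info = LAPACKE_dgetrf(LAPACK_COL_MAJOR, nb, nb, D, nb, ipiv);
    if (info != 0) return info;
    return LAPACKE_dgetri(LAPACK_COL_MAJOR, nb, D, nb, ipiv);
}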

[Chart data: categories P2_0/P2_1 and P3_0/P3_1; series: FLOP Ratio, T Ratio.]

Figure 4.12: Flop/Time ratios of LU factorization of different orderings.

[Chart data: categories P3_0/P2_0 and P3_1/P2_1; series: FLOP Ratio, T Ratio.]

Figure 4.13: Flop/Time ratios of LU factorization of different matrices.


The accuracy of the computation relates to the condition number, especially for a direct method. It is found that the average condition number of the diagonal blocks is roughly one order of magnitude smaller than that of the full matrix. The condition numbers are shown in Fig 4.14. This result implies that the condition number of the diagonal blocks can be used to estimate the matrix condition number. However, the estimate of the condition number from (2.16) gives very small values (O(10)), which indicates that the computation reached machine zero.

[Chart data: condition numbers (log scale) for P1, P2, P3; series: Avg Diag Cond, Matrix Cond.]

Figure 4.14: Condition number (log scale).

The magnitude of the relative error |x − x′|/|x| is bounded by the condition number times the norm of the residual, as in (2.9). The upper bound of the relative error and the Euclidean norm of the residual with the fill-reducing ordering are shown in Fig 4.15 and Fig 4.16. The primed items are the results with one step of iterative refinement. From the above discussion, the error upper bound is conservative and iterative refinement may not be necessary, but the time increment for iterative refinement is not significant since this process can reuse the LU factors.
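For reference, the standard form of this bound, which (2.9) is assumed to resemble (the exact normalization used in Chapter 2 may differ), is

|x − x′| / |x| ≤ κ(A) · |b − A x′| / |b|,

where x′ is the computed solution, b − A x′ is the residual, and κ(A) is the condition number of the matrix.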



Figure 4.15: Log scaled upper bound of relative error.


Figure 4.16: Direct method residual norm.

SLOR Method

The convergence rate of the SLOR method is determined by the relaxation factor. The effect of different relaxation factors on the spectral radius for the P1, P2, and P3 matrices is shown in Fig 4.17 to Fig 4.19.


[Plot: spectral radius vs omega (0.5 to 1.2); series: rhoX, rhoY, rhoZ.]

Figure 4.17: P1 spectral radius vs omega

[Plot: spectral radius vs omega (0.5 to 1.1); series: rhoX, rhoY, rhoZ.]

Figure 4.18: P2 spectral radius vs omega.


[Plot: spectral radius vs omega (0.5 to 1.1); series: rhoX, rhoY, rhoZ.]

Figure 4.19: P3 spectral radius vs omega.

Because the primary flow direction of the test case is along the x axis, SLOR in the x direction has the smallest spectral radius and the fastest convergence rate. The number of steps required to converge in each direction, versus the relaxation factor, is shown in Fig 4.20 to Fig 4.22.
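This is consistent with the standard asymptotic estimate (quoted here for context, not taken from the earlier chapters) that relates the spectral radius ρ of the iteration matrix to the iteration count: reducing the error by a factor of 10^d requires roughly

k ≈ d / (−log10 ρ)

iterations, so a smaller ρ in the x direction translates directly into fewer steps in Fig 4.20 to Fig 4.22.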

[Plot: steps to converge vs omega (0.6 to 1.1); series: x, y, z.]

Figure 4.20: P1 SLOR steps to converge vs omega.


[Plot: steps to converge vs omega (0.6 to 1.0); series: x, y, z.]

Figure 4.21: P2 SLOR Steps to Converge vs Omega.

[Plot: steps to converge vs omega (0.6 to 1.0); series: x, y, z.]

Figure 4.22: P3 SLOR steps to converge vs omega.

It can be seen from the above charts that the SLOR method favors under-relaxation, especially for the P2 and P3 matrices. SLOR in the x direction with the optimum relaxation factor is used to test the performance. The total execution time is shown in Fig 4.23 and the execution time per equation is shown in Fig 4.24.



Figure 4.23: x-SLOR execution time (ms).


Figure 4.24: x-SLOR execution time per equation (ms).

The log-scaled charts in Fig 4.25 and Fig 4.26 show the scalability of x-SLOR. The scalability from P2 to P3 is noticeably poor. Since x-SLOR needs roughly the same number of steps to converge for the P2 and P3 cases, the reason for the poor scalability should be the increased block dimension.



Figure 4.25: Log scaled x-SLOR execution time.


Figure 4.26: Log scaled x-SLOR execution time per equation.

The upper bound of the relative error and the residual norm are shown in Fig 4.27 and Fig 4.28. They show that SLOR has better accuracy than the direct method without iterative refinement.



Figure 4.27: Log scaled x-SLOR relative error bound.


Figure 4.28: x-SLOR residual norm.

ADI Method

Applying the ADI scheme to SLOR can reduce the number of steps required to reach convergence. The choice of relaxation factors is based on the results of numerical experiments. Fig 4.29 to Fig 4.31 demonstrate the effect of ADI on the convergence rate.
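The structure of one ADI cycle used in these comparisons can be sketched as below. The line-sweep routines are hypothetical placeholders for the block line-SOR sweeps described earlier, and each direction receives its own relaxation factor (the experiments in Fig 4.29 to Fig 4.31 use wx = 0.7, wy = 0.8, wz = 0.9). This is an illustrative sketch, not the thesis code.

/* Minimal sketch of one ADI cycle: one line-SOR sweep per coordinate      */
/* direction, each with its own relaxation factor. The sweep functions are */
/* hypothetical placeholders for the block line-SOR sweeps of the solver.  */
struct BSR;  /* block matrix structure, as defined above */

typedef void (*line_sweep_fn)(const struct BSR *A, const double *b,
                              double *x, double omega);

void adi_cycle(const struct BSR *A, const double *b, double *x,
               line_sweep_fn sweep_x, line_sweep_fn sweep_y, line_sweep_fn sweep_z,
               double wx, double wy, double wz)
{
    sweep_x(A, b, x, wx);  /* relax along lines in the x direction */
    sweep_y(A, b, x, wy);  /* relax along lines in the y direction */
    sweep_z(A, b, x, wz);  /* relax along lines in the z direction */
}

A full solve then repeats the cycle until the residual norm falls below the convergence tolerance.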


[Plot: residual vs steps; series: x-SLOR (omega = 1.1) and ADI (wx = 0.7, wy = 0.8, wz = 0.9).]

Figure 4.29: P1 x-SLOR/ADI residual vs steps.

[Plot: residual vs steps; series: x-SLOR (omega = 0.9) and ADI (wx = 0.7, wy = 0.8, wz = 0.9).]

Figure 4.30: P2 x-SLOR/ADI residual vs steps.

58

[Plot: residual vs steps; series: x-SLOR (omega = 0.9) and ADI (wx = 0.7, wy = 0.8, wz = 0.9).]

Figure 4.31: P3 x-SLOR/ADI residual vs steps.

From P1 to P3, the effect of ADI becomes more significant. For the P3 case, ADI reduces the number of iteration steps by nearly 70%. The execution time and its scalability are shown in Fig 4.32 to Fig 4.35.


Figure 4.32: ADI execution time (ms).



Figure 4.33: ADI execution time per equation (ms).


Figure 4.34: Log scaled ADI execution time.



Figure 4.35: Log scaled ADI execution time per equation.

The relative error bound and the residual norm of ADI are shown in Fig 4.36 and Fig 4.37, respectively. ADI achieves a tighter error upper bound and a smaller residual norm than x-SLOR.


Figure 4.36: Log scaled ADI relative error bound.



Figure 4.37: ADI residual norm.

Comparison

The execution times for the P1, P2, and P3 cases are summarized in Fig 4.38 to Fig 4.40 ("0" represents the original ordering and "1" the fill-reducing ordering). For this test case, the direct method performs better. The ADI method is slower than the SLOR method for the P1 and P2 cases because each ADI step is more time consuming. However, ADI outperforms SLOR on the P3 case, which indicates that ADI scales better than SLOR.

The scalability of each method is compared in the log-scaled Fig 4.41. The ADI method shows the best scalability, and it is expected to perform even better when viscosity is involved, because the coupling in the y and z directions can then also be strong.



Figure 4.38: P1 execution time (ms)


Figure 4.39: P2 execution time (ms).


Figure 4.40: P3 execution time (ms).



Figure 4.41: Log scaled direct (left) /ADI (middle) /SLOR (right) execution time.

GPU Application

A GPU solver using both the direct and ADI methods is also developed for this thesis. A notable feature of the GPU is that it has its own independent memory space; the GPU communicates with the CPU through the PCIe bus, which is roughly 10 to 20 times slower than RAM.

With that in mind, the GPU solvers are developed with minimal data movement between the GPU and the CPU. They also take advantage of batched library routines, as the CPU solver does.
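As one illustration of the kind of batched routine involved (a sketch only, not the thesis GPU solver), the snippet below performs many small block products in a single cublasDgemmBatched call; the block data is assumed to already reside on the GPU so that only pointer arrays are assembled on the host. The function name and argument layout here are illustrative.

#include <cublas_v2.h>
#include <cuda_runtime.h>

/* Apply 'count' block updates C[i] <- C[i] - A[i] * B[i] on nb x nb        */
/* column-major blocks already stored on the device. dA, dB, dC are host    */
/* arrays of device pointers, one entry per block.                          */
void batched_block_update(cublasHandle_t handle, int nb, int count,
                          const double **dA, const double **dB, double **dC)
{
    const double alpha = -1.0, beta = 1.0;
    const double **dA_dev, **dB_dev;
    double **dC_dev;

    /* The batched routine expects the pointer arrays themselves on the device. */
    cudaMalloc((void **)&dA_dev, count * sizeof(double *));
    cudaMalloc((void **)&dB_dev, count * sizeof(double *));
    cudaMalloc((void **)&dC_dev, count * sizeof(double *));
    cudaMemcpy(dA_dev, dA, count * sizeof(double *), cudaMemcpyHostToDevice);
    cudaMemcpy(dB_dev, dB, count * sizeof(double *), cudaMemcpyHostToDevice);
    cudaMemcpy(dC_dev, dC, count * sizeof(double *), cudaMemcpyHostToDevice);

    cublasDgemmBatched(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                       nb, nb, nb, &alpha,
                       dA_dev, nb, dB_dev, nb, &beta,
                       dC_dev, nb, count);

    cudaFree(dA_dev);
    cudaFree(dB_dev);
    cudaFree(dC_dev);
}

Batching the block operations this way keeps the small gemm calls on the device and avoids launching one kernel per block, which is the main reason for preferring the batched routines over individual calls.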

However, high double-precision performance can only be achieved on GPUs dedicated to scientific computing, which were not available for this work. The GPU solver is validated with the P1 case on a GeForce 940MX graphics card, and the test results are shown below.

Direct method with fill-reducing ordering: 11609 ms.

ADI method: 753056 ms.


Chapter 5

Conclusion and Future Work

A CPU-based and a CPU/GPU-based solver have been developed for the sparse block linear systems arising from the DG method. Because the DG method produces singular off-diagonal blocks, a new fill-reducing ordering approach is developed for the direct solver that keeps the diagonal blocks on the diagonal.

Line-based iterative solvers (SLOR and ADI) for sparse block linear systems are also developed on CPU and CPU/GPU platforms. This is the first known application of SLOR or ADI to linear systems arising from the DG method.

From the test results, it is reasonable to choose the direct solver for such small problems. For large problems, especially when viscosity is involved, the ADI solver would be the better choice due to its lower memory usage (the direct method uses roughly twice the memory needed to store the original matrix) and its scalability. Whether a problem counts as large or small depends on the hardware capability.

For future work, the line-based iterative methods can be extended to create connections across grid boundaries when multiple grids are used. The line-based methods can also be used as preconditioners for Krylov subspace methods [50] [51] [52] [53]. In addition, further investigation of the choice of relaxation factor would be beneficial.


References

[1] T. A. Davis, S. Rajamanickam and W. M. Sid-Lakhdar, "A Survey of Direct Methods for Sparse Linear Systems," Acta Numerica, vol. 25, pp. 383-566, 2016.

[2] W. Reed and T. Hill, "Triangular Mesh Methods for the Neutron Transport Equation," in National Topical Meeting on Mathematical Models and Computational Techniques for Analysis of Nuclear Systems, Ann Arbor, Michigan, USA, 1973.

[3] B. Cockburn, G. Karniadakis and C. Shu, "The Development of Discontinuous Galerkin Methods," in Discontinuous Galerkin Methods: Theory, Computation and Application, Berlin, Heidelberg: Springer Publishing Company, 1999.

[4] C. Shu, "Discontinuous Galerkin Method for Time-Dependent Problems: Survey and Recent Developments," in Recent Developments in Discontinuous Galerkin Finite Element Methods for Partial Differential Equations, Cham: Springer, 2014.

[5] E. Toro, Riemann Solvers and Numerical Methods for Fluid Dynamics: A Practical Introduction, 2nd ed., Springer-Verlag, 1999.

[6] C. Klaij, J. van der Vegt and H. van der Ven, "Space-time Discontinuous Galerkin Method for the Compressible Navier-Stokes Equations," Journal of Computational Physics, vol. 217, no. 2, pp. 589-611, 2006.

[7] M. Galbraith, "A Discontinuous Galerkin Chimera Overset Solver," PhD Thesis, University of Cincinnati, Cincinnati, 2013.

[8] A. Sherman, "On the Efficient Solution of Sparse Systems of Linear and Nonlinear Equations," Yale University, New Haven, 1975.

[9] D. Rose, "A Graph-Theoretic Study of the Numerical Solution of Sparse Positive Definite Systems of Linear Equations," in Graph Theory and Computing, Cambridge, MA: Academic Press, 1972, pp. 183-217.

[10] D. Rose, R. Tarjan and G. Lueker, "Algorithmic Aspects of Vertex Elimination on Graphs," SIAM J. Comput., vol. 5, no. 2, pp. 266-283, 1976.

[11] R. Schreiber, "A New Implementation of Sparse Gaussian Elimination," ACM Transactions on Mathematical Software (TOMS), vol. 8, no. 3, pp. 256-276, 1982.

[12] J. Liu, "A Compact Row Storage Scheme for Cholesky Factors Using Elimination Trees," ACM Transactions on Mathematical Software (TOMS), vol. 12, no. 2, pp. 127-148, 1986.

[13] J. Liu, "The Role of Elimination Trees in Sparse Factorization," SIAM J. Matrix Anal. & Appl., vol. 11, no. 1, pp. 134-172, 1990.

[14] D. Rose and R. Tarjan, "Algorithmic Aspects of Vertex Elimination on Directed Graphs," Stanford University, 1975.

[15] A. George and E. Ng, "An Implementation of Gaussian Elimination with Partial Pivoting for Sparse Systems," SIAM J. Sci. and Stat. Comput., vol. 6, no. 2, pp. 390-409, 1985.

[16] T. Coleman, A. Edenbrandt and J. Gilbert, "Predicting Fill for Sparse Orthogonal Factorization," Journal of the ACM (JACM), vol. 33, no. 3, pp. 517-532, 1986.

[17] J. Gilbert and E. Ng, "Predicting Structure in Nonsymmetric Sparse Matrix Factorizations," in Graph Theory and Sparse Matrix Computation, The IMA Volumes in Mathematics and its Applications, vol. 56, A. George, J. R. Gilbert and J. W. H. Liu, Eds., New York, NY: Springer, 1993.

[18] A. George and J. Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice Hall Professional Technical Reference, 1981.

[19] M. Yannakakis, "Computing the Minimum Fill-in is NP-Complete," SIAM J. on Algebraic and Discrete Methods, vol. 2, no. 1, pp. 77-79, 1981.

[20] A. George and E. Ng, "Symbolic Factorization for Sparse Gaussian Elimination with Partial Pivoting," SIAM Journal on Scientific and Statistical Computing, vol. 8, no. 6, pp. 877-898, 1987.

[21] J. Liu, "A Generalized Envelope Method for Sparse Factorization by Rows," ACM Transactions on Mathematical Software (TOMS), vol. 17, no. 1, pp. 112-129, 1991.

[22] T. Davis and I. Duff, "An Unsymmetric-Pattern Multifrontal Method for Sparse LU Factorization," SIAM J. Matrix Anal. & Appl., vol. 18, no. 1, pp. 140-158, 1997.

[23] T. Davis and I. Duff, "A Combined Unifrontal/Multifrontal Method for Unsymmetric Sparse Matrices," ACM Transactions on Mathematical Software (TOMS), vol. 25, no. 1, pp. 1-20, 1999.

[24] T. Davis, J. Gilbert, S. Larimore and E. Ng, "A Column Approximate Minimum Degree Ordering Algorithm," ACM Transactions on Mathematical Software (TOMS), vol. 30, no. 3, pp. 353-376, 2004.

[25] I. Duff, "The Impact of High Performance Computing in the Solution of Linear Systems: Trends and Problems," Journal of Computational and Applied Mathematics, vol. 123, no. 1-2, pp. 515-530, 2000.

[26] N. Sato and W. Tinney, "Techniques for Exploiting the Sparsity of the Network Admittance Matrix," IEEE Transactions on Power Apparatus and Systems, vol. 82, pp. 944-950, 1963.

[27] J. Gilbert and T. Peierls, "Sparse Partial Pivoting in Time Proportional to Arithmetic Operations," SIAM J. Sci. and Stat. Comput., vol. 9, no. 5, pp. 862-874, 1988.

[28] S. Mazumder, Numerical Methods for Partial Differential Equations, Cambridge: Academic Press, 2015.

[29] Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd ed., Philadelphia: Society for Industrial and Applied Mathematics, 2003.

[30] D. Peaceman and H. Rachford, Jr., "The Numerical Solution of Parabolic and Elliptic Differential Equations," Journal of the Society for Industrial and Applied Mathematics, vol. 3, no. 1, pp. 28-41, 1955.

[31] H. Atkins and C. Shu, "Analysis of Preconditioning and Relaxation Operators for the Discontinuous Galerkin Method Applied to Diffusion," in 15th AIAA Computational Fluid Dynamics Conference, Fluid Dynamics and Co-located Conferences, Anaheim, CA, 2001.

[32] Intel, "Developer Reference for Intel® Math Kernel Library," Intel, 2018.

[33] Nvidia, "cuBLAS Library User Guide," Nvidia, 2018.

[34] Nvidia, "cuSOLVER Library User Guide," Nvidia, 2018.

[35] T. George, V. Saxena, A. Gupta, A. Singh and A. Choudhury, "Multifrontal Factorization of Sparse SPD Matrices on GPUs," in 2011 IEEE International Parallel & Distributed Processing Symposium, Anchorage, AK, USA, 2011.

[36] C. Yu, W. Wang and D. Pierce, "A CPU-GPU Hybrid Approach for the Unsymmetric Multifrontal Method," Parallel Computing, vol. 37, no. 12, pp. 759-770, 2011.

[37] S. Rennich, D. Stosic and T. Davis, "Accelerating Sparse Cholesky Factorization on GPUs," Parallel Computing, vol. 59, pp. 140-150, 2016.

[38] T. Davis, Direct Methods for Sparse Linear Systems, Philadelphia: SIAM, 2006.

[39] J. Gilbert and T. Peierls, "Sparse Partial Pivoting in Time Proportional to Arithmetic Operations," SIAM J. Sci. and Stat. Comput., vol. 9, no. 5, pp. 862-874, 1988.

[40] J. R. Gilbert, C. Moler and R. Schreiber, "Sparse Matrices in MATLAB: Design and Implementation," SIAM J. Matrix Anal. & Appl., vol. 13, no. 1, pp. 333-356, 1992.

[41] R. Burden and J. Faires, Numerical Analysis, 9th ed., Boston: CENGAGE Learning, 2011.

[42] G. Forsythe and C. Moler, Computer Solution of Linear Algebraic Systems, Upper Saddle River: Prentice-Hall, 1967.

[43] G. Golub and C. Van Loan, Matrix Computations, Baltimore: Johns Hopkins University Press, 2013.

[44] J. Wilkinson, The Algebraic Eigenvalue Problem, Oxford: Clarendon Press, 1965.

[45] W. Kahan, "Gauss-Seidel Methods of Solving Large Systems of Linear Equations," PhD Thesis, University of Toronto, Toronto, 1958.

[46] J. Ortega, Numerical Analysis: A Second Course, Philadelphia: SIAM, 1990.

[47] E. Wachspress, The ADI Model Problem, New York: Springer-Verlag New York, 2013.

[48] G. Avdelas and A. Hadjidimos, "Jordan-Wachspress Parameters in Three Dimensions," Linear Algebra and its Applications, vol. 24, pp. 251-261, 1979.

[49] N. Wukie, "A Discontinuous Galerkin Method for Turbomachinery and Acoustic Applications," University of Cincinnati, Cincinnati, 2018.

[50] S. Ma and Y. Saad, "Block-ADI Preconditioners for Solving Sparse Non-Symmetric Linear Systems of Equations," University of Minnesota, Minneapolis, 1995.

[51] M. Gasteiger, L. Einkemmer, A. Ostermann and D. Tskhakaya, "ADI Type Preconditioners for the Steady State Inhomogeneous Vlasov Equation," Journal of Plasma Physics, vol. 83, no. 1, 2017.

[52] D. J. Mavriplis and B. R. Ahrabi, "Scalable Solution Strategies for Stabilized Finite-Element Flow Solvers on Unstructured Meshes," in 55th AIAA Aerospace Sciences Meeting, Grapevine, 2017.

[53] B. R. Ahrabi and D. J. Mavriplis, "Scalable Solution Strategies for Stabilized Finite-Element Flow Solvers on Unstructured Meshes, Part II," in 23rd AIAA Computational Fluid Dynamics Conference, Denver, 2017.