ANALYSIS OF CYCLIC REDUCTION FOR THE NUMERICAL SOLUTION OF THREE-DIMENSIONAL CONVECTION-DIFFUSION EQUATIONS

by

CHEN GREIF

B. Sc. (Applied Mathematics), Tel Aviv University, 1991
M. Sc. (Applied Mathematics), Tel Aviv University, 1994

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
Department of Mathematics
Institute of Applied Mathematics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
April 1998
© Chen Greif, 1998

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics
The University of British Columbia
Vancouver, Canada

Abstract

This thesis deals with the numerical solution of convection-diffusion equations. In particular, the focus is on the analysis of applying one step of cyclic reduction to linear systems of equations which arise from finite difference discretization of steady-state three-dimensional convection-diffusion equations. The method is based on decoupling the unknowns and solving the resulting smaller linear systems using iterative methods. In three dimensions this procedure results in some loss of sparsity, compared to lower dimensions. Nevertheless, the resulting linear system has excellent numerical properties, is generally better conditioned than the original system, and gives rise to faster convergence of iterative solvers, and convergence in cases where solvers of the original system of equations fail to converge.
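As a minimal illustration of the decoupling idea described above (not code taken from the thesis), the following sketch applies one step of cyclic reduction to a one-dimensional model problem. The matrix tridiag(-1-beta, 2, -1+beta), the function names, and the dense direct solve standing in for an iterative solver are all assumptions made for brevity.

```python
import numpy as np

def convection_diffusion_1d(n, beta):
    """Hypothetical model matrix: tridiag(-1-beta, 2, -1+beta), size n x n."""
    A = 2.0 * np.eye(n)
    A += (-1.0 + beta) * np.eye(n, k=1)   # superdiagonal
    A += (-1.0 - beta) * np.eye(n, k=-1)  # subdiagonal
    return A

def one_step_cyclic_reduction(A, b):
    """Eliminate the even-indexed ("red") unknowns of a tridiagonal system.

    Returns the reduced matrix S and right-hand side g for the odd-indexed
    ("black") unknowns, plus the data needed for back-substitution.
    """
    n = A.shape[0]
    red = np.arange(0, n, 2)
    black = np.arange(1, n, 2)
    B = A[np.ix_(red, red)]      # diagonal: red points are not coupled to each other
    C = A[np.ix_(red, black)]
    D = A[np.ix_(black, red)]
    E = A[np.ix_(black, black)]
    B_inv = np.diag(1.0 / np.diag(B))
    S = E - D @ B_inv @ C        # Schur complement: the reduced operator
    g = b[black] - D @ B_inv @ b[red]
    return S, g, (red, black, B_inv, C)

def solve_via_reduction(A, b):
    """Solve A x = b by solving the reduced system, then recovering the rest."""
    S, g, (red, black, B_inv, C) = one_step_cyclic_reduction(A, b)
    x = np.empty_like(b)
    x[black] = np.linalg.solve(S, g)  # dense solve used here for simplicity
    x[red] = B_inv @ (b[red] - C @ x[black])
    return x

A = convection_diffusion_1d(9, 0.3)
b = np.arange(1.0, 10.0)
x = solve_via_reduction(A, b)
print(np.allclose(A @ x, b))
```

In this one-dimensional setting the reduced matrix is again tridiagonal and about half the size of the original; the loss of sparsity mentioned in the abstract arises only in higher dimensions.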
The thesis starts with an overview of the equations that are solved and general properties of the resulting linear systems. Then, the unsymmetric discrete operator is derived and the structure of the cyclically reduced linear system is described. Several important aspects are analyzed in detail. The issue of orderings is addressed and a highly effective ordering strategy is presented. The complicated sparsity pattern of the matrix requires careful analysis; comprehensive convergence analysis for block stationary methods is provided, and the bounds on convergence rates are shown to be very tight. The computational work required to perform cyclic reduction and compute the solution of the linear system is discussed at length. Preconditioning techniques and various iterative solvers are considered.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements

Chapter 1. Introduction
  1.1 Background
    1.1.1 The Convection-Diffusion Equation
    1.1.2 Finite Difference Methods
    1.1.3 Solving the Linear System
  1.2 One Step of Cyclic Reduction
    1.2.1 The One-Dimensional Case
    1.2.2 More Space Dimensions
  1.3 Thesis Outline and Notation

Chapter 2. Cyclic Reduction
  2.1 Complete Cyclic Reduction
  2.2 The Three-Dimensional Cyclically Reduced Operator
    2.2.1 The Constant Coefficient Model Problem
    2.2.2 The Variable Coefficient Case
  2.3 Properties of the Reduced Matrix

Chapter 3. Ordering Strategies
  3.1 Block Ordering Strategies for 3D Grids
  3.2 The Family of Two-Plane Orderings
  3.3 The Family of Two-Line Orderings
  3.4 Comparison Results

Chapter 4. Convergence Analysis for Block Stationary Methods
  4.1 Symmetrization of the Reduced System
    4.1.1 The Constant Coefficient Case
    4.1.2 The Variable Coefficient Case
  4.2 Bounds on Convergence Rates
  4.3 "Near-Property A" for 1D Partitioning of the Two-Plane Matrix
  4.4 Computational Work
  4.5 Comparison with the Unreduced System
  4.6 Fourier Analysis
  4.7 Comparisons

Chapter 5. Solvers, Preconditioners, Implementation and Performance
  5.1 Krylov Subspace Solvers and Preconditioners - Overview
  5.2 Incomplete Factorizations for the Reduced System
  5.3 The Overall Cost of Construction of the Reduced System
  5.4 Numerical Experiments
  5.5 Vector and Parallel Implementation

Chapter 6. Summary, Conclusions and Future Research
  6.1 Summary and Conclusions
  6.2 Future Research

Bibliography

List of Tables

4.1 comparison between the computed spectral radius and the bound for the 1D splitting
4.2 comparison between the computed spectral radius and the bound for the 2D splitting
4.3 comparison of computed spectral radii of the 1D Jacobi iteration matrix with matrices which are consistently ordered
4.4 comparison between iteration counts for the reduced and unreduced systems
4.5 iteration counts for different iterative schemes
4.6 iteration counts for one nonzero convective term
4.7 iteration counts for two nonzero convective terms
5.1 number of nonzero elements in the ILU(1) factorization
5.2 computational work involved in the construction of the reduced matrix
5.3 comparison between the computed spectral radii and the bounds
5.4 comparison of solving work/time for various mesh sizes
5.5 comparison between estimated condition numbers
5.6 performance of block stationary methods for Test Problem 1
5.7 construction time/work of ILU preconditioners
5.8 performance of QMR for Test Problem 1
5.9 performance of BiCG for Test Problem 1
5.10 performance of Bi-CGSTAB for Test Problem 1
5.11 performance of CGS for Test Problem 1
5.12 overall flop counts and average computed constants for Test Problem 2
5.13 performance of various incomplete factorizations for Test Problem 2
5.14 norm of the error for Test Problem 3
5.15 comparison of performance for Test Problem 3
5.16 comparison of iteration counts of natural and multicolor orderings

List of Figures

1.1 computational molecules
1.2 numerical solution of Eq. (1.24)
1.3 red/black ordering of the one-dimensional grid
1.4 the matrix associated with the red/black ordering for the one-dimensional case
1.5 eigenvalues of the reduced matrix (solid line) and the unreduced matrix (broken line) for the one-dimensional model problem with n = 65 and β = 0.3
1.6 natural lexicographic ordering of the tensor-product grids
1.7 sparsity patterns of the matrices
1.8 red/black ordering in the two-dimensional case
2.1 a three-dimensional checkerboard
2.2 red/black ordering of the 3D grid
2.3 points that are affected by the block Gaussian elimination
2.4 structure of the computational molecule associated with the reduced operator
2.5 sparsity pattern of the lexicographically ordered reduced matrix
2.6 eigenvalues of both systems for Poisson's equation
2.7 singular values of both matrices for Eq. (2.24)
3.1 orderings of the 2D block grid
3.2 three members in the family of natural two-plane orderings
3.3 red/black and toroidal two-plane ordering corresponding to x-y oriented 2D blocks
3.4 sparsity patterns of two members of the two-plane family of orderings
3.5 block computational molecule corresponding to the family of orderings 2PN
3.6 four-color 1D block ordering
3.7 ordering and sparsity pattern of the matrix associated with 2LNxy ordering
3.8 block computational molecule corresponding to the ordering strategy 2LNxy
3.9 possible block partitionings of the two-plane matrix
3.10 a zoom on 2D blocks
3.11 comparison of the spectral radii of block Jacobi iteration matrices for cross-sections of the mesh Reynolds numbers
3.12 spectral radii of iteration matrices vs. mesh Reynolds numbers
3.13 symmetric reverse Cuthill-McKee ordering of the reduced matrix
4.1 sparsity patterns of the matrices involved in the proof of Lemma 4.8
4.2 "Near Property A" for the 1D splitting
4.3 sparsity pattern of the matrix CJ1'
4.4 the function hs (at)
4.5 spectral radius of the SOR iteration matrix vs. the relaxation parameter
4.6 comparison of the spectral radii of the Gauss-Seidel iteration matrices
5.1 a two-dimensional slice of the stencil associated with U
5.2 fill-in in the construction of ILU(1) in the plane containing the gridpoint for which the discretization was done
5.3 fill-in in the construction of ILU(1) in the plane adjacent to the gridpoint for which the discretization was done
5.4 sparsity patterns of the factors of ILU(1) and ILU(2)
5.5 sparsity patterns of factors for ILU with drop tolerance 10^-2
5.6 l2-norm of relative residual for preconditioned Bi-CGSTAB
5.7 a 2D slice of the numerical solution of Test Problem 3
5.8 2D mesh architecture (3 × 3)
5.9 a 2D slice of gridpoints associated with one processor
5.10 the part of the reduced matrix that contains the gridpoints associated with a certain subcube, and the rectangular matrix before local ordering
5.11 the "local part" of the rectangular matrix after re-ordering
5.12 a 2D slice of processors that hold entries of 1D or 2D sets of unknowns

Acknowledgements

I would like to thank my supervisor, Jim Varah, for his devoted guidance.
