The Pennsylvania State University
The Graduate School

THE AUXILIARY SPACE SOLVERS AND THEIR APPLICATIONS

A Dissertation in Mathematics
by
Lu Wang

© 2014 Lu Wang

Submitted in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

December 2014

The dissertation of Lu Wang was reviewed and approved∗ by the following:

Jinchao Xu
Professor of Department of Mathematics
Dissertation Advisor, Chair of Committee

James Brannick
Associate Professor of the Department of Mathematics

Ludmil Zikatanov
Professor of the Department of Mathematics

Chao-Yang Wang
Professor of Materials Science and Engineering

Yuxi Zheng
Professor of the Department of Mathematics
Department Head

∗Signatures are on file in the Graduate School.

Abstract

Developing efficient iterative methods and parallel algorithms for solving sparse linear systems discretized from partial differential equations (PDEs) is still a challenging task in scientific computing and practical applications. Although many mathematically optimal solvers, such as the multigrid methods, have been analyzed and developed, the unfortunate reality is that these solvers have not been used much in practical applications. In order to narrow the gap between theory and practice, we develop, formulate, and analyze mathematically optimal solvers that are robust and easy to use in practice based on the methodology of Fast Auxiliary Space Preconditioning (FASP).

We develop a multigrid method on unstructured shape-regular grids by the construction of an auxiliary coarse grid hierarchy on which the multigrid method can be applied by using the FASP technique. Such a construction is realized by a cluster tree which can be obtained in O(N log N) operations for a grid of N nodes. This tree structure is used for the definition of the grid hierarchy from coarse to fine. For the constructed grid hierarchy, we prove that the condition number of the preconditioned system for an elliptic PDE is O(log N).
Then, we present a new colored block Gauss-Seidel method for general unstructured grids. By constructing the auxiliary grid, we can aggregate the degrees of freedom in the same cell of the auxiliary grid into one block. By developing a parallel coloring algorithm for the tree structure, a colored block Gauss-Seidel method can be applied with the aggregates serving as non-overlapping blocks. We also develop a new parallel unsmoothed aggregation algebraic multigrid method, built from the auxiliary grid, for PDEs defined on an unstructured mesh. It provides (nearly) optimal load balance and predictable communication patterns, factors that make our new algorithm suitable for parallel computing.

Furthermore, we extend the FASP techniques to saddle point and indefinite problems. Two auxiliary space preconditioners are presented. An abstract framework of the symmetric positive definite auxiliary preconditioner is presented so that the optimal multigrid method can be applied to indefinite problems on unstructured grids. We also numerically verify the optimality of the two preconditioners for the Stokes equations.

Table of Contents

List of Figures
List of Tables
Acknowledgments

Chapter 1 Introduction

Chapter 2 Iterative Method
  2.1 Stationary Iterative Methods
    2.1.1 Jacobi Method
    2.1.2 Gauss-Seidel Method
    2.1.3 Successive Over-Relaxation Method
    2.1.4 Block Iterative Method
  2.2 Krylov Space Method and Preconditioners
    2.2.1 Conjugate Gradient Method
  2.3 Preconditioned Iterations
    2.3.1 Preconditioned Conjugate Gradient Method
    2.3.2 Preconditioning Techniques
  2.4 Numerical Example
    2.4.1 Comparison of the Iterative Method
    2.4.2 Comparison of the Preconditioners

Chapter 3 Multigrid Method and Fast Auxiliary Space Preconditioner
  3.1 Method of Subspace Correction
    3.1.1 Parallel Subspace Correction and Successive Subspace Correction
    3.1.2 Multigrid viewed as Multilevel Subspace Corrections
    3.1.3 Convergence Analysis
  3.2 The Auxiliary Space Method
  3.3 Algebraic Multigrid Method
    3.3.1 Classical AMG
    3.3.2 UA-AMG

Chapter 4 FASP for Poisson-like Problem on unstructured grid
  4.1 Preliminaries and Assumptions
  4.2 Construction of the Auxiliary Grid-hierarchy
    4.2.1 Clustering and Auxiliary Box-trees
    4.2.2 Closure of the Auxiliary Box-tree
    4.2.3 Construction of a Conforming Auxiliary Grid Hierarchy
    4.2.4 Adaptation of the Auxiliary Grids to the Boundary
    4.2.5 Near Boundary Correction
  4.3 Estimate of the Condition Number
    4.3.1 Convergence of the MG on the Auxiliary Grids
      4.3.1.1 Stable decomposition: Proof of (A1)
      4.3.1.2 Strengthened Cauchy-Schwarz inequality: Proof of (A2)
      4.3.1.3 Condition number estimation

Chapter 5 Colored Gauss-Seidel Method by auxiliary grid
  5.1 Graph Coloring
  5.2 Quadtree Coloring
  5.3 Tree Representations
  5.4 Parallel Implementation of the Coloring Algorithm
  5.5 Block Colored Gauss-Seidel Methods

Chapter 6 Parallel FASP-AMG Solvers
  6.1 Parallel Auxiliary Grid Aggregation
  6.2 Parallel Prolongation and Restriction and Coarse-level Matrices
  6.3 Parallel Smoothers Based on the Auxiliary Grid
  6.4 GPU Implementation
    6.4.1 Sparse Matrix-Vector Multiplication on GPUs
    6.4.2 Parallel Auxiliary Grid Aggregation

Chapter 7 Numerical Applications for Poisson-like Problem on Unstructured Grid
  7.1 Auxiliary Space Multigrid Method
    7.1.1 Geometric Multigrid
    7.1.2 ASMG for the Dirichlet problem
    7.1.3 ASMG for the Neumann problem
  7.2 FASP-AMG
    7.2.1 Test Platform
    7.2.2 Performance

Chapter 8 FASP for Indefinite Problem
  8.1 Krylov Space Method for Indefinite Problems
    8.1.1 The Minimal Residual Method
    8.1.2 Generalized Minimal Residual Method
  8.2 Preconditioners for Indefinite Problems
  8.3 FASP Preconditioner

Chapter 9 Fast Preconditioners for Stokes Equation on Unstructured Grid
  9.1 Block Preconditioners
  9.2 Analysis of the FASP SPD Preconditioner
  9.3 Some Examples
    9.3.1 Use a Lower Order Velocity Space Pair as an Auxiliary Space
    9.3.2 Use a Lower Order Pressure Space as an Auxiliary Space

Chapter 10 Conclusions
  10.1 Conclusions
  10.2 Future works

Bibliography

List of Figures

2.1 Matrix splitting of A.
2.2 Comparison of the number of iterations.
2.3 Comparison of the CPU time.
2.4 Comparison of the number of iterations for preconditioners.
4.1 Left: The 2D triangulation T of Ω with elements τi. Right: The barycenters ξi (dots) and the minimal distance h between barycenters.
4.2 Examples of the region quadtree on different domains.
4.3 Tree of regular boxes with root B1 in 2D. The black dots mark the corresponding barycenters ξi of the triangles τi. Boxes with fewer than three points ξi are leaves.
4.4 The subdivision of the marked (red) box on level ℓ would create two boxes (blue) with more than one hanging node at one edge.
4.5 The subdivision of the red box makes it necessary to subdivide nodes on all levels.
4.6 Hanging nodes can be treated by a local subdivision within the box Bν. The top row shows a box with 1, 2, 2, 3, 4 hanging nodes, respectively, and the bottom row shows the corresponding triangulation of the box.
4.7 The final hierarchy of nested grids. Red edges were introduced in the last (local) closure step.
4.8 Case 1: σi is subdivided in the fine level.
4.9 Case 2: σi is not subdivided in the fine level.
4.10 A triangulation of the Baltic Sea with local refinement and small inclusions.
4.11 Hanging nodes can be treated by a local subdivision within the cube Bν: first erase the hanging nodes on the face, then connect the center of the cube.
4.12 The boundary Γ of Ω is drawn as a red line, boxes not intersecting Ω are light green, boxes intersecting Γ are dark green, and all other boxes (inside of Ω) are blue.
4.13 The boundary Γ of Ω is drawn as a red line, boxes not intersecting Ω are light green, and all other boxes (intersecting Ω) are blue.
4.14 The finest auxiliary grid σ(10) contains elements of different size. Left: Dirichlet b.c. (852 degrees of freedom). Right: Neumann b.c. (2100 degrees of freedom).
5.1 A balanced quadtree requires at least five colors.
5.2 Forced coloring rectangles.
5.3 Adaptive quadtree and its binary graph.
5.4 Six-coloring for adaptive quadtree.
5.5 The Morton code of an adaptive quadtree.
5.6 Adaptive quadtree and its binary graph.
5.7 Coloring of 3D adaptive octree.
6.1 Aggregation on level L.
6.2 Aggregation on the coarse levels.
6.3 Coloring on the finest level L.
6.4 Sparse matrix representation using the ELL format and the memory access pattern of SpMV.
7.1 Convergence rates for Auxiliary Space MultiGrid with n4 = 737,933, n5 = 2,970,149, n6 = 11,917,397, and n7 = 47,743,157 degrees of freedom.
7.2 Convergence rates for Auxiliary Space MultiGrid with n4 = 756,317, n5 = 3,006,917, n6 = 11,990,933, and n7 = 47,890,229 degrees of freedom.