DIRECT AND ITERATIVE METHODS FOR BLOCK TRIDIAGONAL LINEAR SYSTEMS

Don Eric Heller
April 1977

Department of Computer Science
Carnegie-Mellon University
Pittsburgh, PA 15213

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Carnegie-Mellon University.

This research was supported in part by the Office of Naval Research under Contract N00014-76-C-0370, NR 044-422, by the National Science Foundation under Grant MCS75-222-55, and by the National Aeronautics and Space Administration under Grant NGR-47-102-001, while the author was in residence at ICASE, NASA Langley Research Center.

ABSTRACT

Block tridiagonal systems of linear equations occur frequently in scientific computations, often forming the core of more complicated problems. Numerical methods for solution of such systems are studied with emphasis on efficient methods for a vector computer. A convergence theory for direct methods under conditions of block diagonal dominance is developed, demonstrating stability, convergence and approximation properties of direct methods. Block elimination (LU factorization) is linear, cyclic odd-even reduction is quadratic, and higher-order methods exist. The odd-even methods are variations of the quadratic Newton iteration for the inverse matrix, and are the only quadratic methods within a certain reasonable class of algorithms. Semi-direct methods based on the quadratic convergence of odd-even reduction prove useful in combination with linear iterations for an approximate solution. An execution time analysis for a pipeline computer is given, with attention to storage requirements and the effect of machine constraints on vector operations.

PREFACE

It is with many thanks that I acknowledge Professor J. F. Traub, for my introduction to numerical mathematics and parallel computation, and for his support and encouragement. Enlightening conversations with G. J. Fix, S. H. Fuller, H. T. Kung and D. K.
Stevenson have contributed to some of the ideas presented in the thesis. A number of early results were obtained while in residence at ICASE, NASA Langley Research Center; J. M. Ortega and R. G. Voigt were most helpful in clarifying these results. The thesis was completed while at the Pennsylvania State University, and P. C. Fischer must be thanked for his forbearance. Of course, the most important contribution has been that of my wife, Molly, who has put up with more than her fair share.

CONTENTS

1. Introduction
   A. Summary of Main Results
   B. Notation
2. Some Initial Remarks
   A. Analytic Tools
   B. Models of Parallel Computation
3. Linear Methods
   A. The LU Factorization
      1. Block Elimination
      2. Further Properties of Block Elimination
      3. Back Substitution
      4. Special Techniques
   B. Gauss-Jordan Elimination
   C. Parallel Computation
4. Quadratic Methods
   A. Odd-Even Elimination
   B. Variants of Odd-Even Elimination
   C. Odd-Even Reduction
      1. The Basic Algorithm
      2. Relation to Block Elimination
      3. Back Substitution
      4. Storage Requirements
      5. Special Techniques
   D. Parallel Computation
   E. Summary
5. Unification: Iteration and Fill-In
6. Higher-Order Methods
   A. p-Fold Reduction
   B. Convergence Rates
   C. Back Substitution
7. Semidirect Methods
   A. Incomplete Elimination
   B. Incomplete Reduction
   C. Applicability of the Methods
8. Iterative Methods
   A. Use of Elimination
   B. Splittings
   C. Multi-line Orderings
   D. Spiral Orderings
   E. Parallel Gauss
9. Applications
   A. Curve Fitting
   B. Finite Elements
10. Implementation of Algorithms
   A. Choice of Algorithms (Generalities)
   B. Storage Requirements
   C. Comparative Timing Analysis
Appendix A. Vector Computer Instructions
Appendix B. Summary of Algorithms and Timings
References

1. INTRODUCTION

Block tridiagonal matrices are a special class of matrices which arise in a variety of scientific and engineering computations, typically in the numerical solution of differential equations. For now it is sufficient to say that the matrix looks like

$$A = \begin{pmatrix}
b_1 & c_1 & & & & \\
a_2 & b_2 & c_2 & & & \\
& a_3 & b_3 & c_3 & & \\
& & \ddots & \ddots & \ddots & \\
& & & a_{N-1} & b_{N-1} & c_{N-1} \\
& & & & a_N & b_N
\end{pmatrix},$$

where $b_i$ is an $n_i \times n_i$ matrix and $a_i$, $c_i$ are dimensioned conformally. Thus the full matrix $A$ has dimension $(\sum_{j=1}^{N} n_j) \times (\sum_{j=1}^{N} n_j)$.

In this thesis we study numerical methods for solution of a block tridiagonal linear system Ax = v, with emphasis on efficient methods for a vector-oriented parallel computer (e.g., CDC STAR-100, Illiac IV). Our analysis primarily concerns numerical properties of the algorithms, with discussion of their inherent parallelism and areas of application. For this reason many of our results also apply to standard sequential algorithms.
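As a concrete illustration of the block elimination (block LU) studied later in the thesis, a minimal sketch in Python with NumPy is given below; the function name, the block-list representation, and the use of explicit inverses are illustrative choices of this sketch, not the thesis's formulation.

```python
import numpy as np

def block_elimination(a, b, c, v):
    """Solve a block tridiagonal system Ax = v by block LU factorization:
    forward elimination of the sub-diagonal blocks, then back substitution.

    a, b, c: lists of N square blocks (sub-, main, super-diagonal);
             a[0] and c[N-1] are unused.
    v:       list of N right-hand-side vectors.
    Returns a list of N solution vectors.
    """
    N = len(b)
    d = [None] * N          # modified diagonal blocks (the block pivots)
    w = [None] * N          # modified right-hand sides
    d[0], w[0] = b[0].copy(), v[0].copy()
    for i in range(1, N):
        # multiplier m = a_i * d_{i-1}^{-1} eliminates the sub-diagonal block
        m = a[i] @ np.linalg.inv(d[i - 1])
        d[i] = b[i] - m @ c[i - 1]
        w[i] = v[i] - m @ w[i - 1]
    x = [None] * N
    x[N - 1] = np.linalg.solve(d[N - 1], w[N - 1])
    for i in range(N - 2, -1, -1):
        # back substitution: d_i x_i = w_i - c_i x_{i+1}
        x[i] = np.linalg.solve(d[i], w[i] - c[i] @ x[i + 1])
    return x
```

In practice one would factor each block pivot rather than invert it explicitly; the sketch mirrors the algebraic description, in which the pivots d_i play exactly the role of the diagonal blocks of the block LU factorization.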
A unifying analytic theme is that a wide variety of direct and iterative methods can be viewed as special cases of a general matrix iteration, and we make considerable use of a convergence theory for direct methods. Algorithms are compared using execution time estimates for a simplified model of the CDC STAR-100 and recommendations are made on this basis.

There are two important subclasses of block tridiagonal matrices, depending on whether the blocks are small and dense or large and sparse. A computational method to solve the linear system Ax = v should take into account the internal structure of the blocks in order to obtain storage economy and a low operation count. Moreover, there are important applications in which an approximate solution is satisfactory. Such considerations often motivate iterative methods for large sparse systems, especially when a direct method would be far more expensive.

Since 1965, however, there has been an increased interest in the direct solution of special systems derived from two-dimensional elliptic partial differential equations. With a standard five-point finite difference approximation on a rectangular grid, the a_i, c_i blocks are diagonal and the b_i blocks are tridiagonal, so A possesses a very regular sparse structure. By further specializing the differential equation more structure will be available. Indeed, the seminal paper for many of the important developments dealt with the simplest possible elliptic equation. This was Hockney's work [H5], based on an original suggestion by Golub, on the use of Fourier transforms and the cyclic reduction algorithm for the solution of Poisson's equation on a rectangle. For an n × n square grid Hockney was able to solve Ax = v in O(n^3) arithmetic operations, while ordinary band-matrix methods required O(n^4) operations.
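The cyclic reduction algorithm underlying Hockney's method can be sketched, in its scalar tridiagonal form, as follows. This is an illustrative reconstruction, not Hockney's code: it assumes N = 2^k - 1 unknowns and zero padding a[0] = c[N-1] = 0, and it recursively eliminates the odd-numbered unknowns so that each level halves the system.

```python
import numpy as np

def cyclic_reduction(a, b, c, v):
    """Solve the tridiagonal system
        a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = v[i],  i = 0..N-1,
    by cyclic (odd-even) reduction.  Requires N = 2**k - 1 and the
    padding convention a[0] = 0, c[N-1] = 0.
    """
    N = len(b)
    if N == 1:
        return np.array([v[0] / b[0]])
    idx = np.arange(1, N, 2)        # retained (even-numbered) equations
    M = len(idx)
    a2, b2, c2, v2 = (np.zeros(M) for _ in range(4))
    for j, i in enumerate(idx):
        # combine equations i-1, i, i+1 to eliminate x[i-1] and x[i+1]
        al = -a[i] / b[i - 1]
        ga = -c[i] / b[i + 1]
        a2[j] = al * a[i - 1]
        b2[j] = b[i] + al * c[i - 1] + ga * a[i + 1]
        c2[j] = ga * c[i + 1]
        v2[j] = v[i] + al * v[i - 1] + ga * v[i + 1]
    y = cyclic_reduction(a2, b2, c2, v2)   # solve the half-size system
    x = np.zeros(N)
    x[idx] = y
    # back substitution recovers the eliminated unknowns
    for i in range(0, N, 2):
        s = v[i]
        if i > 0:
            s -= a[i] * x[i - 1]
        if i < N - 1:
            s -= c[i] * x[i + 1]
        x[i] = s / b[i]
    return x
```

Each reduction level costs O(N) operations on half as many equations as the last, and all combinations within a level are independent, which is what makes the method attractive on a vector computer; applied line-wise with Fourier transforms across the other grid direction, this is the core of the fast Poisson solvers discussed above.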
Other O(n^3) methods for Poisson's equation then extant had considerably larger asymptotic constants, and the new method proved to be significantly less time-consuming in practice. Subsequent discovery of the Fast Fourier Transform and Buneman's stable version of cyclic reduction created fast (O(n^2 log n) operations) and accurate methods that attracted much attention from applications programmers and numerical analysts [B20], [D2], [H6]. The Buneman algorithm has since been extended to Poisson's equation on certain nonrectangular regions and to general separable elliptic equations on a rectangle [S6], and well-tested Fortran subroutine packages are available (e.g., [S8]). Other recent work on these problems includes the Fourier-Toeplitz methods [F2], and Bank's generalized marching algorithms [B5]. The latter methods work by controlling a fast unstable method, and great care must be taken to maintain stability.

Despite the success of fast direct methods for specialized problems there is still no completely satisfactory direct method that applies to a general nonseparable elliptic equation. The best direct method available for this problem is George's nested dissection [G1], which is theoretically attractive but apparently hard to implement. Some interesting techniques to improve this situation are discussed by Eisenstat, Schultz and Sherman [E1] and Rose and Whitten [R5] among others. Nevertheless, in present practice the nonseparable case is still frequently solved by an iteration, and excellent methods based on direct solution of the separable case have appeared (e.g., [C4], [C5]). The multi-level adaptive techniques recently analyzed and considerably developed by Brandt [B13] are a rather different approach, successfully combining iteration and a sequence of discretization grids. Brandt discusses several methods suitable for parallel computation and although his work has not