
ESTIMATING THE OPTIMAL EXTRAPOLATION PARAMETER FOR

EXTRAPOLATED ITERATIVE METHODS WHEN SOLVING SEQUENCES OF

LINEAR SYSTEMS

A Thesis

Presented to

The Graduate Faculty of The University of Akron

In Partial Fulfillment

of the Requirements for the Degree

Master of Science

Curtis J. Anderson

December, 2013

ESTIMATING THE OPTIMAL EXTRAPOLATION PARAMETER FOR

EXTRAPOLATED ITERATIVE METHODS WHEN SOLVING SEQUENCES OF

LINEAR SYSTEMS

Curtis J. Anderson

Thesis

Approved:

Advisor
Dr. Yingcai Xiao

Co-Advisor
Dr. Zhong-Hui Duan

Co-Advisor
Dr. Ali Hajjafar

Department Chair
Dr. Yingcai Xiao

Accepted:

Dean of the College
Dr. Chand Midha

Dean of the Graduate School
Dr. George Newkome

Date

ABSTRACT

Extrapolated iterative methods for solving systems of linear equations require the selection of an extrapolation parameter which greatly influences the rate of convergence. Some extrapolated iterative methods provide analysis of the optimal extrapolation parameter to use; however, such analysis exists only for specific problems, and a general method for parameter selection does not exist. Additionally, the calculation of the optimal extrapolation parameter can often be too computationally expensive to be of practical use.

This thesis presents an algorithm that adaptively modifies the extrapolation parameter when solving a sequence of linear systems in order to estimate the optimal extrapolation parameter. The result is an algorithm that works for any general problem and requires very little computational overhead. Statistics on the quality of the algorithm's estimation are presented and a case study is given to show practical results.

TABLE OF CONTENTS

LIST OF FIGURES

CHAPTER

I. INTRODUCTION

1.1 Stationary Iterative Methods

1.2 Convergence of Iterative Methods

1.3 Well-Known Iterative Methods

1.4 Extrapolated Iterative Methods

1.5 Sequence of Linear Systems

II. AN ALGORITHM FOR ESTIMATING THE OPTIMAL EXTRAPOLATION PARAMETER

2.1 Spectral Radius Estimation

2.2 Extrapolated Spectral Radius Reconstruction

2.3 Parameter Selection

2.4 Solver Integration

III. PERFORMANCE ANALYSIS

IV. CASE STUDY

BIBLIOGRAPHY

LIST OF FIGURES

1.1 The CR transition disk in relation to the unit disk

2.1 Statistics on the ratio of the estimated average reduction factor ($\bar{\sigma}_i$) to the spectral radius ($\rho$) as $i$ is varied for randomly generated systems.

2.2 Statistics on the ratio of the estimated spectral radius ($\bar{\rho}$) to the spectral radius ($\rho$) compared against the number of iterations required for convergence for randomly generated systems.

2.3 Example of $\bar{\rho}$ estimating points on the extrapolated spectral radius function.

2.4 The square of the magnitude of the spectral radius shown as the composition of the square of the magnitude of the respective eigenvalues for two randomly generated systems (only relevant eigenvalues are shown).

2.5 Example of segmentation of samples based upon algorithm 2.2.1.

2.6 Constrained regression (left) versus unconstrained regression (right) applied to the same problem.

2.7 Example of estimating the extrapolated spectral radius function from samples that are all superior to non-extrapolated iteration (below the black dotted line).

3.1 Performance results for Gauss-Seidel.

3.2 Performance results for SOR with $\omega = 1.5$.

3.3 Performance results for SOR with $\omega = 1.8$.

4.1 Slices of the volume solved in the example problem at particular times.

4.2 Sparsity plot of the Crank-Nicolson coefficient matrix for an 8x8x8 discretization (blue entries are nonzero entries of the matrix).

4.3 Benchmark results for a 'pulsed' source ($S(x, y, z, t) = \sin(t)$).

4.4 Benchmark results for a constant source ($S(x, y, z, t) = 0$).

CHAPTER I

INTRODUCTION

Many applications in scientific computing require solving a sequence of linear systems $A^{\{i\}}x^{\{i\}} = b^{\{i\}}$ for $i = 0, 1, 2, \ldots$ for the exact solution $x^{\{i\}} = (A^{\{i\}})^{-1}b^{\{i\}}$, where $A^{\{i\}} \in \mathbb{R}^{n \times n}$, $x^{\{i\}} \in \mathbb{R}^n$, and $b^{\{i\}} \in \mathbb{R}^n$. Several direct methods, such as Gaussian elimination and LU decomposition, exist for solving such systems without explicitly computing $(A^{\{i\}})^{-1}$; however, such methods may not be the most efficient or accurate methods to use on very large problems. Direct methods often require the full storage of matrices in computer memory as well as $O(n^3)$ operations to compute the solution. As matrices become large these drawbacks become increasingly prohibitive. Iterative methods are an alternative class of methods for solving linear systems that alleviate many of the drawbacks of direct methods.

Iterative methods start with an initial solution vector $x^{(0)}$ and subsequently generate a sequence of vectors $\{x^{(i)}\}_{i=1}^{\infty}$ which converges to the solution $x$. Iteration takes the form of a recursive function
$$x^{(i+1)} = \Phi(x^{(i)}), \qquad (1.1)$$
which is expected to converge to $x$; however, convergence for iterative methods is not always guaranteed and depends upon both the iterative method and the problem.

The calculation of each iteration is often achieved with significantly lower computation and memory usage than direct methods.

There are several choices for the iteration function, leading to two major classes of iterative methods. Stationary methods have an iteration matrix that determines convergence, while non-stationary methods, such as Krylov subspace methods, do not have an iteration matrix and their convergence depends upon other factors [1, 2].

Throughout this thesis the name iterative method is intended to refer to stationary methods since only stationary methods and their extrapolations are studied.

1.1 Stationary Iterative Methods

Let a nonsingular $n \times n$ matrix $A$ be given together with a system of linear equations, $Ax = b$, whose exact solution is $x = A^{-1}b$. We consider an arbitrary splitting $A = N - P$ for the matrix $A$, where $N$ is nonsingular. A stationary iterative method can be found by substituting the splitting into the original problem

$$(N - P)x = b \qquad (1.2)$$

and then setting the iteration

$$Nx^{(i+1)} = Px^{(i)} + b \qquad (1.3)$$

and solving for $x^{(i+1)}$,

$$x^{(i+1)} = N^{-1}Px^{(i)} + N^{-1}b. \qquad (1.4)$$

This method was first developed in this generality by Wittmeyer in 1936 [3]. Convergence to the solution depends heavily upon the choice of splitting for $N$ and $P$ and is not guaranteed for all matrices; however, for certain types of matrices some splittings can guarantee convergence. Intelligent choices for splittings rarely require finding the matrix $N^{-1}$ explicitly; instead, computation proceeds by solving the system in equation (1.3) for $x^{(i+1)}$. In the following sections the most well-known iterative methods are introduced along with general rules of convergence.
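To make the splitting concrete, the sketch below performs a single generic stationary step by solving equation (1.3) for $x^{(i+1)}$; the function name and the use of MATLAB's backslash operator to solve with $N$ are illustrative choices, not something prescribed by the thesis.

% One step of a generic stationary method based on the splitting A = N - P,
% assuming N is nonsingular: solve N*x_new = P*x_old + b.
function [x_new] = Splitting_Iteration(N, P, x_old, b)
    x_new = N \ (P*x_old + b);
end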

1.2 Convergence of Iterative Methods

Convergence is the cornerstone of this thesis since it is required for the method to be of any use. It is well known that convergence occurs if and only if the spectral radius of the iteration matrix, $N^{-1}P$, is less than one [4]; that is, if $\lambda_i$ is an eigenvalue of the matrix $N^{-1}P$ then
$$\rho(N^{-1}P) = \max_{1 \le i \le n} |\lambda_i| < 1. \qquad (1.5)$$

Additionally, the rate of convergence for an iterative method is the rate at which $\rho(N^{-1}P)^i \to 0$ as $i \to \infty$. Thus, we are not only concerned with having $\rho(N^{-1}P)$ within unity, but also with making $\rho(N^{-1}P)$ as small as possible to achieve the fastest convergence possible. The rate of convergence between two iterative methods can be compared as follows: if $\rho_1$ and $\rho_2$ are the respective spectral radii of two iterative methods then, according to
$$\rho_1^{\,n} = \rho_2^{\,m}, \qquad (1.6)$$
iterative method 1 will require $n$ iterations to reach the same level of convergence as iterative method 2 with $m$ iterations. Solving explicitly for the ratio of the number of iterations required results in
$$\frac{n}{m} = \frac{\ln(\rho_2)}{\ln(\rho_1)}. \qquad (1.7)$$

Equation (1.7) allows the iteration requirements of different methods to be easily compared. For example, if $\rho_1 = .99$ and $\rho_2 = .999$, then $\frac{\ln(\rho_2)}{\ln(\rho_1)} = 0.0995$; thus, method 1 requires only 9.95% of the iterations that method 2 requires. Additionally, if $\rho_1 = .4$ and $\rho_2 = .5$, then $\frac{\ln(\rho_2)}{\ln(\rho_1)} = 0.7565$; thus, method 1 requires only 75.65% of the iterations that method 2 requires. The first example shows how important small improvements can be for spectral radii close to 1, while the second example shows that the absolute change in value of a spectral radius is not an accurate predictor of the improvement provided by a method. Thus, for a proper comparison, equation (1.7) should be used when comparing two methods.
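As a quick check of equation (1.7), the fragment below reproduces the two numerical examples above; it is a trivial illustration rather than anything used later in the thesis.

% Iterations needed by method 1 relative to method 2 (equation 1.7)
iteration_ratio = @(rho1, rho2) log(rho2) / log(rho1);

iteration_ratio(0.99, 0.999)   % = 0.0995: method 1 needs ~10% of the iterations
iteration_ratio(0.4, 0.5)      % = 0.7565: method 1 needs ~76% of the iterations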

1.3 Well-Known Iterative Methods

Given a nonsingular $n \times n$ matrix $A$, the matrices $L$, $U$, and $D$ are taken as the strictly lower triangular, the strictly upper triangular, and the diagonal part of $A$, respectively. The following sections detail the most well-known splittings and their convergence properties.

1.3.1 Jacobi Method

The Jacobi method is defined by the splitting $A = N - P$, where $N = D$ and $P = -(L + U)$ [5, 6]. Therefore, the Jacobi iterative method can be written in vector form as
$$x^{(k+1)} = -D^{-1}(L + U)x^{(k)} + D^{-1}b. \qquad (1.8)$$

In scalar form, the Jacobi method is written as
$$x_i^{(k+1)} = -\frac{1}{a_{ii}}\Bigl(\sum_{\substack{j=1 \\ j \ne i}}^{n} a_{ij}x_j^{(k)} - b_i\Bigr) \quad \text{for } i = 1, 2, \cdots, n. \qquad (1.9)$$

Algorithm 1.3.1 implements one Jacobi iteration in MATLAB.

Algorithm 1.3.1: Jacobi Iteration

% Compute one iteration of the Jacobi method
function [x_new] = Jacobi_Iteration(A, x_old, b)

    % Find size of the matrix
    n = length(b);

    % Initialize array for the next iteration value
    x_new = zeros(n, 1);

    for i = 1:n
        for j = 1:n
            if (i ~= j)
                x_new(i) = x_new(i) + A(i,j)*x_old(j);
            end
        end

        x_new(i) = (b(i) - x_new(i)) / A(i,i);
    end
end

The Jacobi method is guaranteed to converge for diagonally dominant systems [7]. A system is said to be diagonally dominant provided that its coefficient matrix has the property that $|a_{ii}| > \sum_{\substack{j=1 \\ j \ne i}}^{n} |a_{ij}|$ for $i = 1, 2, \ldots, n$. This means that in each row of the coefficient matrix the magnitude of the diagonal element is larger than the sum of the magnitudes of all other elements in the row.
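The row-wise condition above is easy to test directly; the helper below is a small sketch of such a check (the function name is ours, not part of the thesis).

% Check whether A is strictly diagonally dominant by rows
function [dominant] = Is_Diagonally_Dominant(A)
    d = abs(diag(A));               % |a_ii|
    off = sum(abs(A), 2) - d;       % sum of |a_ij| over j ~= i, per row
    dominant = all(d > off);
end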

1.3.2 Gauss-Seidel Method

The Gauss-Seidel method is a modification of the Jacobi method that can sometimes improve convergence. Since the Jacobi method computes elements sequentially, the elements $x_1^{(k+1)}, x_2^{(k+1)}, \ldots, x_{i-1}^{(k+1)}$ have already been computed by the time the $i$th element $x_i^{(k+1)}$ is computed. The Gauss-Seidel method makes use of these more recently computed elements by substituting them in place of the older values. Therefore, in the computation of the element $x_i^{(k+1)}$, the Gauss-Seidel method utilizes the elements $x_{i+1}^{(k)}, x_{i+2}^{(k)}, \ldots, x_n^{(k)}$ from the $k$th iteration and the elements $x_1^{(k+1)}, x_2^{(k+1)}, \ldots, x_{i-1}^{(k+1)}$ from the $(k+1)$th iteration. This substitution results in the scalar equation
$$x_i^{(k+1)} = -\frac{1}{a_{ii}}\Bigl(\sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} + \sum_{j=i+1}^{n} a_{ij}x_j^{(k)} - b_i\Bigr) \quad \text{for } i = 1, 2, \cdots, n. \qquad (1.10)$$

Algorithm 1.3.2 implements one Gauss-Seidel iteration in MATLAB.

Algorithm 1.3.2: Gauss-Seidel Iteration

% Compute one iteration of the Gauss-Seidel method
function [x_new] = GaussSeidel_Iteration(A, x_old, b)

    % Find size of the matrix
    n = length(b);

    % Initialize array for the new iteration value
    x_new = zeros(n, 1);

    for i = 1:n
        x_new(i) = b(i);

        for j = 1:i-1
            x_new(i) = x_new(i) - A(i,j)*x_new(j);
        end

        for j = i+1:n
            x_new(i) = x_new(i) - A(i,j)*x_old(j);
        end

        x_new(i) = x_new(i) / A(i,i);
    end
end

To write the vector form of the Gauss-Seidel method, let the matrices $L$, $U$, and $D$ be defined as earlier. By taking the splitting $N = D + L$ and $P = -U$, the vector form of Gauss-Seidel can be written as
$$x^{(k+1)} = -(D + L)^{-1}Ux^{(k)} + (D + L)^{-1}b. \qquad (1.11)$$
It is well known that if the matrix $A$ is diagonally dominant then the Gauss-Seidel method is convergent and the rate of convergence is at least as fast as that of the Jacobi method [7]. Also, convergence is guaranteed for positive definite matrices [4].

1.3.3 Successive Over-Relaxation Method

The successive over-relaxation (SOR) method is a modification of the Gauss-Seidel method that introduces a relaxation parameter to affect convergence. SOR was first introduced by David M. Young in his 1950 dissertation [8]. In order to approximate $x_i^{(k+1)}$, SOR introduces the temporary Gauss-Seidel approximation
$$\hat{x}_i^{(k+1)} = \frac{1}{a_{ii}}\Bigl(b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij}x_j^{(k)}\Bigr) \qquad (1.12)$$
that is extrapolated with $x_i^{(k)}$ by a parameter $\omega$. In other words, we obtain
$$x_i^{(k+1)} = \omega\hat{x}_i^{(k+1)} + (1 - \omega)x_i^{(k)} = x_i^{(k)} + \omega(\hat{x}_i^{(k+1)} - x_i^{(k)}) = (1 - \omega)x_i^{(k)} + \frac{\omega}{a_{ii}}\Bigl[b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij}x_j^{(k)}\Bigr]. \qquad (1.13)$$

Rearranging (1.13) for $i = 1, 2, \ldots, n$, the scalar form of SOR is
$$a_{ii}x_i^{(k+1)} + \omega\sum_{j=1}^{i-1} a_{ij}x_j^{(k+1)} = (1 - \omega)a_{ii}x_i^{(k)} - \omega\sum_{j=i+1}^{n} a_{ij}x_j^{(k)} + \omega b_i. \qquad (1.14)$$

In vector form (1.14) can be written as
$$Dx^{(k+1)} + \omega Lx^{(k+1)} = (1 - \omega)Dx^{(k)} - \omega Ux^{(k)} + \omega b$$
or
$$\frac{1}{\omega}(D + \omega L)x^{(k+1)} = \frac{1}{\omega}\bigl((1 - \omega)D - \omega U\bigr)x^{(k)} + b. \qquad (1.15)$$

Equation (1.15) shows that the SOR method splits $A$ as $A = N - P$, where
$$N = \frac{1}{\omega}D + L \quad \text{and} \quad P = \Bigl(\frac{1}{\omega} - 1\Bigr)D - U, \qquad (1.16)$$
resulting in the iteration matrix
$$H(\omega) = N^{-1}P = (D + \omega L)^{-1}\bigl((1 - \omega)D - \omega U\bigr). \qquad (1.17)$$
Therefore, in vector form, SOR can be written as
$$x^{(k+1)} = (D + \omega L)^{-1}\bigl((1 - \omega)D - \omega U\bigr)x^{(k)} + (D + \omega L)^{-1}\omega b. \qquad (1.18)$$

Algorithm 1.3.3 implements one iteration of SOR in MATLAB.

Algorithm 1.3.3: SOR Iteration

% Compute one iteration of the SOR method
function [x_new] = SOR_Iteration(A, x_old, b, w)

    % Find size of the matrix
    n = length(b);

    % Initialize array for the next iteration value
    x_new = zeros(n, 1);

    for i = 1:n
        x_new(i) = b(i);

        for j = 1:i-1
            x_new(i) = x_new(i) - A(i,j)*x_new(j);
        end

        for j = i+1:n
            x_new(i) = x_new(i) - A(i,j)*x_old(j);
        end

        % Extrapolate the Gauss-Seidel value with the previous iterate
        x_new(i) = x_old(i) + w*(x_new(i)/A(i,i) - x_old(i));
    end
end

Notice that if $\omega = 1$, the SOR method reduces to the Gauss-Seidel method. Convergence of SOR depends on $\rho(H(\omega))$, and it is well known (Kahan's theorem) that for an arbitrary matrix $A$, $\rho(H(\omega)) \ge |\omega - 1|$ [9]. This implies that if the SOR method converges, $\omega$ must belong to the interval $(0, 2)$. Furthermore, if $A$ is symmetric positive definite, then for any $\omega$ in $(0, 2)$, SOR is convergent [10].

1.4 Extrapolated Iterative Methods

An extrapolated scheme which converges to the same solution as (1.4) can be defined by
$$x^{(k+1)} = \mu\Phi(x^{(k)}) + (1 - \mu)x^{(k)} = x^{(k)} + \mu(\Phi(x^{(k)}) - x^{(k)}), \qquad (1.19)$$
where $\mu$ is an extrapolation parameter. Substitution of $\Phi$ from (1.4) into (1.19) results in the extrapolated iterative method [11, 12],
$$x^{(k+1)} = ((1 - \mu)I + \mu N^{-1}P)x^{(k)} + \mu N^{-1}b. \qquad (1.20)$$

Note that equation (1.20) is equivalent to equation (1.4) when µ = 1.
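The extrapolation in (1.19) can wrap any stationary step; the sketch below composes it with the Gauss-Seidel iteration defined earlier (the wrapper name is ours, and any of the other iterations could be substituted for $\Phi$).

% One extrapolated iteration: x_new = x_old + mu*(Phi(x_old) - x_old),
% where Phi is one step of a stationary method (here Gauss-Seidel).
function [x_new] = Extrapolated_Iteration(A, x_old, b, mu)
    phi_x = GaussSeidel_Iteration(A, x_old, b);
    x_new = x_old + mu*(phi_x - x_old);
end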

The addition of an extrapolation parameter changes the rules of convergence for a splitting, which is now convergent if and only if $\rho((1 - \mu)I + \mu N^{-1}P) < 1$. In [12], the controlled relaxation method (CR method) was introduced, which analyzed the convergence of extrapolated iteration.

Figure 1.1: The CR transition disk in relation to the unit disk

It was found that an appropriate $\mu$ can be chosen to make extrapolated iteration converge if the eigenvalues of the iteration matrix, $N^{-1}P$, all have real parts greater than 1 or all have real parts less than 1. If the eigenvalues of $N^{-1}P$ are $\{\lambda_j = \alpha_j + i\beta_j\}_{j=1}^{n}$, then the eigenvalues of $(1 - \mu)I + \mu N^{-1}P$ are $\{1 - \mu + \mu\lambda_j\}_{j=1}^{n}$. If $\alpha_j < 1$, then for $0 < \mu < \infty$, $1 - \mu + \mu\lambda_j$ will shift $\lambda_j$ to a position on the half-line starting at the point $(1, 0)$ and passing through $\lambda_j$. In [12], it is shown that if
$$\mu_{\lambda_j} = \frac{1 - \mathrm{Real}(\lambda_j)}{|1 - \lambda_j|^2} = \frac{1 - \mathrm{Real}(\lambda_j)}{1 + |\lambda_j|^2 - 2\,\mathrm{Real}(\lambda_j)}, \qquad (1.21)$$
then $1 - \mu_{\lambda_j} + \mu_{\lambda_j}\lambda_j$ will shift $\lambda_j$ to a point on the circle with center $\bigl(\tfrac{1}{2}, 0\bigr)$ and radius $\tfrac{1}{2}$, called the transition circle (see figure 1.1). Notice that if $\lambda_j$ is in the interior of the transition circle, then $\mu_{\lambda_j} \ge 1$; otherwise $0 < \mu_{\lambda_j} < 1$. Additionally, if $\alpha_j > 1$ then $\mu_{\lambda_j} < 0$.

Now suppose all the eigenvalues have real part less than 1 (for all $j$, $\alpha_j < 1$). For each $1 \le j \le n$, equation (1.21) provides a shift parameter $\mu_{\lambda_j}$ by which $\lambda_j$ will be transformed onto the transition circle. Next, define $\mu = \min_{1 \le j \le n} \mu_{\lambda_j}$. For $1 \le j \le n$, $1 - \mu + \mu\lambda_j$ are the eigenvalues of the matrix $(1 - \mu)I + \mu N^{-1}P$, which belong to the transition disk, and at least one of them lies on the transition circle.

In the case where all the eigenvalues of $N^{-1}P$ have real parts greater than 1 (for all $j$, $\alpha_j > 1$), equation (1.21) will shift $\lambda_j$ to $1 - \mu_{\lambda_j} + \mu_{\lambda_j}\lambda_j$ on the transition circle, and $\mu = \max_{1 \le j \le n} \mu_{\lambda_j}$ will shift all eigenvalues of the iteration matrix $(1 - \mu)I + \mu N^{-1}P$ inside or onto the transition disk.
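When the eigenvalues of the iteration matrix are available, the CR shift parameters of equation (1.21) and the resulting $\mu$ are straightforward to compute; the fragment below assumes a vector lambda of eigenvalues with all real parts less than 1 (as discussed in chapter 2, computing these eigenvalues is usually too expensive in practice).

% CR shift parameters (equation 1.21) for the eigenvalues of N^-1*P
mu_lambda = (1 - real(lambda)) ./ abs(1 - lambda).^2;

% The smallest shift places every shifted eigenvalue inside or on the
% transition disk.
mu = min(mu_lambda);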

Starting with a matrix where the eigenvalues of the iteration matrix $N^{-1}P$ have real parts all less than 1 or all greater than 1, the CR method (1.20) with an appropriate $\mu$ will converge independent of the initial vector $x^{(0)}$. However, the shift parameter $\mu$ may push the eigenvalues of $(1 - \mu)I + \mu N^{-1}P$ too close to the point $(1, 0)$, which is on the boundary of the unit disk, causing slower convergence. At this point, the spectral radius can be modified by shifting all the eigenvalues to the left along the shift lines. This can be done by applying the CR method to (1.20) with a $\mu^*$ larger than one. The resulting iterative method is called controlled over-relaxation (COR), defined by the iteration
$$x^{(k+1)} = ((1 - \mu\mu^*)I + \mu\mu^*N^{-1}P)x^{(k)} + \mu\mu^*N^{-1}b. \qquad (1.22)$$
Since one of the eigenvalues of the matrix $(1 - \mu)I + \mu N^{-1}P$ lies on the transition circle, it is shown in [12] that (1.22) with the calculated $\mu$ converges if and only if $0 < \mu^* < 2$.

While COR is helpful for understanding the rules of convergence, its reliance upon the eigenvalues for the calculation of $\mu$ makes it impractical for real-world use, as discussed later. Henceforth, the extrapolation parameter $\mu$ in this thesis has no relation to the $\mu$ calculated from COR.

One final note is that since the extrapolated spectral radius function
$$\rho(M(\mu)) = \max_{1 \le i \le n} |(1 - \mu) + \mu\lambda_i| \qquad (1.23)$$
is the composition of a linear function, the absolute value, and the max function, the spectral radius of an extrapolated iterative method is a convex function of $\mu$. Therefore, there exists a single optimal extrapolation parameter that provides the global minimum of the extrapolated spectral radius function, resulting in the fastest convergence possible.
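Because $\rho(M(\mu))$ is convex in $\mu$, its minimizer can be located by direct evaluation whenever the eigenvalues are known; the grid search below is only an illustration of that fact (it assumes a vector lambda of eigenvalues and is not the estimation algorithm of chapter 2).

% Evaluate the extrapolated spectral radius (1.23) on a grid of mu values
% and pick the minimizer.
mu_grid = linspace(0, 2.5, 1000);
rho_ext = zeros(size(mu_grid));

for k = 1:length(mu_grid)
    rho_ext(k) = max(abs((1 - mu_grid(k)) + mu_grid(k)*lambda));
end

[rho_min, idx] = min(rho_ext);
mu_opt = mu_grid(idx);   % approximately optimal extrapolation parameter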

1.5 Sequence of Linear Systems

As mentioned previously, many problems in scientific computing require solving not just one system of linear equations but a sequence of linear systems
$$A^{\{i\}}x^{\{i\}} = b^{\{i\}} \quad \text{for } i = 1, 2, \ldots \qquad (1.24)$$
Often there is the requirement that the system $A^{\{i\}}x^{\{i\}} = b^{\{i\}}$ must be solved in order to generate the system $A^{\{i+1\}}x^{\{i+1\}} = b^{\{i+1\}}$; that is, $A^{\{i+1\}} = f(A^{\{i\}}, x^{\{i\}}, b^{\{i\}})$ and $b^{\{i+1\}} = g(A^{\{i\}}, x^{\{i\}}, b^{\{i\}})$. In the case that $A^{\{i\}}$ is constant, the iteration matrix remains invariant as elements of the sequence are solved, and thus a single particular extrapolation parameter is optimal for all the systems in the sequence.

The repeated solving of linear systems is the cornerstone of this thesis, as it allows what would be expensive computations to be estimated very cheaply by evaluating properties of previously solved systems. Evaluating previous systems would not be possible if only one system were being solved, and thus is only useful when solving a sequence of linear systems. The proposed algorithm for estimating the optimal extrapolation parameter is outlined in chapter 2 and performance results are presented in chapters 3 and 4.

CHAPTER II

AN ALGORITHM FOR ESTIMATING THE OPTIMAL EXTRAPOLATION

PARAMETER

An algorithm for estimating the optimal extrapolation parameter for an extrapolated stationary solver can be broken down into four parts. The first part of the algorithm estimates the spectral radius of the extrapolated iteration matrix for a particular extrapolation parameter. The second part of the algorithm reconstructs the extrapolated spectral radius function from spectral radius estimation samples. The third part of the algorithm establishes rules that dictate how subsequent sample locations are chosen and where the optimal value lies. The fourth part of the algorithm is the integration of the previous three parts into the iterative solver. The result of this algorithm is an iteratively refined estimation of the optimal extrapolation parameter.

2.1 Spectral Radius Estimation

Determining the eigenvalues, and thus the spectral radius, of a matrix can be extremely expensive and is often more expensive than solving a system of linear equations. For an $n \times n$ matrix, typical algorithms for finding the eigenvalues, such as the QR method, require $O(n^3)$ operations as well as the full storage of the matrix in memory. Iterative techniques such as power iteration exist for finding the dominant eigenvalue and dominant eigenvector of a matrix; however, the iteration is not guaranteed to converge if there is more than one dominant eigenvalue and thus cannot be used in general. Therefore, calculating the spectral radius based upon eigenvalue techniques is not practical for use in a low-cost general algorithm.

The spectral radius can, however, be estimated based upon the rate of convergence of a stationary iterative solver. As the number of iterations computed by a solver grows, the rate of convergence is dictated by the spectral radius; thus, the average reduction factor per iteration is able to give an estimation of the spectral radius when enough iterations are computed. For a particular extrapolation parameter, iteration is carried out, generating the solution sequence $\{x_i\}_{i=0}^{n}$ which should converge to the solution $x$. The average reduction factor for $i$ iterations, defined in [4], can be written as
$$\sigma_i := \left(\frac{\|x_i - x\|}{\|x_0 - x\|}\right)^{\frac{1}{i}}. \qquad (2.1)$$
Note that the calculation of $\sigma_i$ requires the exact value $x$, which is unknown. The final iteration computed, $x_n$, can be used as an estimation of $x$, resulting in the estimation of the average reduction factor
$$\bar{\sigma}_i := \left(\frac{\|x_i - x_n\|}{\|x_0 - x_n\|}\right)^{\frac{1}{i}}, \quad \text{for } 0 < i < n. \qquad (2.2)$$

Figure 2.1: Statistics on the ratio of the estimated average reduction factor ($\bar{\sigma}_i$) to the spectral radius ($\rho$) as $i$ is varied for randomly generated systems. (Axes: $i$ used for $\bar{\sigma}_i$ as a percent of $n$ versus $\bar{\sigma}_i/\rho$; curves at the 5th, 25th, 50th, 75th, and 95th percentiles.)

Care must now be given to how well $\bar{\sigma}_i$ estimates the spectral radius, along with which $\bar{\sigma}_i$ in the set $\{\bar{\sigma}_i\}_{i=1}^{n-1}$ should be chosen to represent the final average reduction factor $\bar{\rho}$ for this set of data. The number of elements in $\{x_i\}$, and thus $x_n$, is usually

determined by the tolerance of the solver for the problem at hand. Thus, the choice of $n$ is not controllable in the calculation of $\bar{\rho}$. The remaining problem is choosing $i$ such that $0 < i < n$ and $\bar{\sigma}_i$ is as accurate as possible for estimating the spectral radius.

Figure 2.2: Statistics on the ratio of the estimated spectral radius ($\bar{\rho}$) to the spectral radius ($\rho$) compared against the number of iterations required for convergence for randomly generated systems. (Axes: iterations for solution convergence versus $\bar{\rho}/\rho$; curves at the 5th, 25th, 50th, 75th, and 95th percentiles.)

Figure 2.1 presents statistics that were gathered on the appropriate $i$ to choose for the computation of $\bar{\rho}$. Randomly generated positive definite systems were solved using the Gauss-Seidel method and the sequences $\{x_i\}$, and thus $\{\bar{\sigma}_i\}$, were analyzed. Because each system can require a different number of iterations to achieve convergence, the horizontal axis of figure 2.1 is given as $i$ as a percentage of $n$ so that the statistics are normalized for the number of iterations required. The vertical axis of figure 2.1 is the ratio of the average reduction factor $\bar{\sigma}_i$ to the spectral radius, so values close to 1 show that the estimation of the spectral radius is accurate. We see that choosing an element of the sequence $\{\bar{\sigma}_i\}$ for use as the final estimation $\bar{\rho}$ is fairly straightforward, because choosing an $i$ that is close to $n$ provides the highest quality estimation. Although there is a very small dip in accuracy towards 100% in figure 2.1, $\bar{\sigma}_{n-1}$ provides sufficient accuracy for use as $\bar{\rho}$.
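A direct transcription of this choice, assuming for illustration that the iterate history is stored as columns of a matrix X (the solver in section 2.4 computes the same quantity without storing the whole history):

% Estimate the spectral radius as sigma_bar_{n-1} from equation (2.2),
% where X(:,k) holds the iterate x_{k-1}, so X(:,end) is x_n.
n = size(X, 2) - 1;                       % number of iterations computed
rho_bar = (norm(X(:,n) - X(:,n+1)) / ...
           norm(X(:,1) - X(:,n+1)))^(1/(n-1));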

Figure 2.3: Example of $\bar{\rho}$ estimating points on the extrapolated spectral radius function. (Axes: extrapolation parameter versus spectral radius. Legend: spectral radius; $\bar{\rho}$.)

Now that the selection of $\bar{\rho}$ is understood, the accuracy and precision of its estimation of the spectral radius must be evaluated. Figure 2.1 normalized $i$ and $n$ for each sample for the purpose of analyzing $\bar{\sigma}_i$ free from particular cases of $i$ and $n$. Figure 2.2 shows that the accuracy and precision of $\bar{\rho}$ in estimating the spectral radius is somewhat dependent upon $n$. Notice that the 50th percentile remains fairly steady for all values of $n$; this shows that $\bar{\rho}$ can provide a fairly accurate estimation of the spectral radius over nearly all values of $n$. However, looking at the spread of the percentiles for smaller values of $n$, we see that the precision of the estimation is reduced. While the highest quality estimations are preferred, because the proposed algorithm is based upon statistical methods, issues due to lack of precision can be mitigated by gathering more samples.

Figure 2.3 shows an example of $\bar{\rho}$ being used to estimate the spectral radius of an extrapolated iteration matrix as the extrapolation parameter is varied. It should be noted that if iteration diverges then $\bar{\rho} \to 1$, which can be seen on the far right of figure 2.3. Detecting divergence is useful so that unnecessary computation can be avoided.

2.2 Extrapolated Spectral Radius Function Reconstruction

The previous section gives a technique for cheaply estimating the spectral radius after solving one system of linear equations. When solving a sequence of linear systems, the extrapolation parameter can be varied so that the extrapolated spectral radius function can be cheaply sampled while solving elements of the sequence. These samples contain random variations that make them unsuitable for direct reconstruction techniques such as interpolation. To deal with the random nature of the samples, a statistical regression can be used for the reconstruction of the extrapolated spectral radius function. Figure 2.3 shows an example of the spectral radius function that needs to be reconstructed along with the samples that could be used for the reconstruction.

Figure 2.4: The square of the magnitude of the spectral radius shown as the composition of the square of the magnitude of the respective eigenvalues for two randomly generated systems (only relevant eigenvalues are shown). (Axes: extrapolation parameter ($\mu$) versus square of magnitude.)

Analyzing the mathematical properties of the spectral radius function gives some insight into the regression model that should be used. The greatest consideration in the regression model is that the square of the spectral radius function is a piecewise quadratic function; that is, if $\lambda_j = \alpha_j + i\beta_j$ is an eigenvalue of the iteration matrix

$N^{-1}P$ and $M(\mu)$ is the extrapolated iteration matrix, then
$$\rho(M(\mu)) = \max_{1 \le j \le n} |(1 - \mu) + \mu\lambda_j| \qquad (2.3)$$
$$\implies \rho(M(\mu))^2 = \max_{1 \le j \le n} |(1 - \mu) + \mu\lambda_j|^2 \qquad (2.4)$$
$$= \max_{1 \le j \le n} \bigl(1 + 2(\alpha_j - 1)\mu + ((\alpha_j - 1)^2 + \beta_j^2)\mu^2\bigr). \qquad (2.5)$$

Typical extrapolated spectral radius functions are made up of 2 to 5 segments but are not strictly limited in number. Since the purpose of reconstructing the spectral radius function is to estimate the location of the minimum value, which will only occur where the derivative of a segment is equal to zero or at an intersection of segments, at most only the two segments adjacent to the minimum value are required to be reconstructed.
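Equation (2.5) is what makes the regression model of section 2.2.2 quadratic on each segment; the fragment below simply evaluates the per-eigenvalue quadratics and their upper envelope (it assumes a column vector lambda of eigenvalues and is for illustration only, in the spirit of figure 2.4).

% Each eigenvalue contributes a quadratic in mu (equation 2.5); the squared
% extrapolated spectral radius is their upper envelope.
alpha = real(lambda);  beta = imag(lambda);
mu = linspace(0, 2, 500);

quadratics = 1 + 2*(alpha - 1)*mu + ((alpha - 1).^2 + beta.^2)*mu.^2;
rho_squared = max(quadratics, [], 1);    % piecewise quadratic envelope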

2.2.1 Segmentation Method

Given a set of samples of the extrapolated spectral radius function, the goal of segmentation is to sort the samples into groups based upon the segment they were sampled from. Without any knowledge of which segment a sample came from, segments can be estimated by grouping samples such that the error of the piecewise regression is minimized. Algorithm 2.2.1 is proposed as a method for finding the optimal grouping of samples to best minimize regression error and is explained in the following.

Algorithm 2.2.1: Segmentation

1  function [segments] = FindSegments(x_input, y_input)
2  
3  % Sort samples
4  [x, sort_permutation] = sort(x_input);
5  y = y_input(sort_permutation);
6  
7  % Initialize values
8  n = length(x);
9  e(1:n) = Inf;
10 
11 % Generate error table
12 for i = 1:n-1
13     % Get regression coefficients
14     BL = Regression(x(1:i), y(1:i));
15     BR = Regression(x(i+1:n), y(i+1:n));
16 
17     % Calculate and store regression error
18     EL = RegressionError(BL, x(1:i), y(1:i));
19     ER = RegressionError(BR, x(i+1:n), y(i+1:n));
20 
21     e(i) = EL + ER;
22 end
23 
24 % Choose the split that minimizes the total regression error
25 split_location = find(e == min(e), 1);
26 BL = Regression(x(1:split_location), y(1:split_location));
27 BR = Regression(x(split_location+1:n), y(split_location+1:n));
28 a = BL(3) - BR(3);      % quadratic coefficient of the segment difference
29 b = BL(2) - BR(2);      % linear coefficient of the segment difference
30 
31 % Make sure samples to the left of the intersection belong to
32 % the left segment and samples to the right to the right segment
33 if (a ~= 0 && -b/a > 0)
34     for j = 2:n
35         if (x(j) > -b/a)
36             segments = [1 j-1; j n];
37             return;
38         end
39     end
40 end
41 
42 % If there is no intersection, return results
43 segments = [1 split_location; split_location+1 n];
44 end

As previously mentioned, at most only two segments need to be found, which we call the left segment and the right segment. Samples must belong to either the left segment or the right segment. To prepare the data, the samples need to be sorted by their $x$ position so that groups are easily determined by the segmentation position $i$ for a left group $x(1:i)$ and a right group $x(i+1:n)$. Note that $x$ and $y$ are parallel arrays, so sorting and grouping methods should handle the $y$ data accordingly. Next, all the possible segmentations are iterated through. For each potential segmentation, regressions for the left and right segments are computed, the total error of the regressions is calculated, and then the error is stored in an array that tracks the total error for each potential segmentation. The optimal segmentation will be the one that minimizes the amount of error in the regression. However, due to the nature of regression, in some scenarios samples that belong to the left segment may end up on the right side of the intersection of segments, or vice-versa with right samples. Thus, a final classification of samples into left and right segments is made by assigning all samples to the left of the intersection to the left segment and all samples to the right of the intersection to the right segment, as seen in lines 33-40 of algorithm 2.2.1.

Once the optimal segmentation is found, a regression can be run on the left and right segment samples to reconstruct the spectral radius. Figure 2.5 shows an example of segmentation done by algorithm 2.2.1 with the left segment colored green and the right segment colored red. A comparison of the regressions provided by the segmentation (solid lines) is made against the quadratic functions they are trying to reconstruct (dotted lines). The ability of the segmentation algorithm to find the segments when given good data is clearly satisfactory. Because of the statistical nature of regressions, additional sample points can be used to refine the regression for more accurate results.
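Algorithm 2.2.1 relies on two helpers, Regression and RegressionError, whose bodies are not listed; a minimal sketch consistent with how their outputs are used (coefficients ordered constant, linear, quadratic) is given below. The ordinary least-squares fit here is our assumption; section 2.2.2 replaces it with a constrained regression.

% Fit y ~ B(1) + B(2)*x + B(3)*x^2 by ordinary least squares
function [B] = Regression(x, y)
    V = [ones(length(x),1), x(:), x(:).^2];   % design matrix
    B = V \ y(:);
end

% Sum of squared residuals of the fit B on the samples (x, y)
function [err] = RegressionError(B, x, y)
    residuals = y(:) - (B(1) + B(2)*x(:) + B(3)*x(:).^2);
    err = sum(residuals.^2);
end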

2.2.2 Regression Constraints

Figure 2.5: Example of segmentation of samples based upon algorithm 2.2.1. (Axes: extrapolation parameter ($\mu$) versus square of magnitude. Legend: spectral radius, left eigenvalue, right eigenvalue, left sample, left regression, right sample, right regression.)

Figure 2.5 shows a good reconstruction generated from accurate samples that are uniformly spaced across our area of interest; however, such well-behaved samples are rarely the case. Utilizing the mathematical properties of the extrapolated spectral radius function allows constraints to be added to the regression to make reconstructions as accurate as possible. Equation (2.5) shows that the square of each segment of the extrapolated spectral radius function is a quadratic function; thus, the regression model for segment $j$ of the spectral radius function is $a_j\mu^2 + b_j\mu + c_j$. The mathematics of equation (2.5) also show that restrictions can be placed on the coefficients of

the quadratic function. First, note that $c_j$ must always be equal to 1. Additionally, as noted in [12], if $\{\lambda_i = \alpha_i + i\beta_i\}_{i=1}^{n}$ are the eigenvalues of the iteration matrix, then extrapolated iteration converges if and only if every $\alpha_i$ is less than 1; thus, $2(\alpha_i - 1)$ is always negative, dictating that $b_j$ must always be negative. Finally, $(\alpha_i - 1)^2 + \beta_i^2$ is always positive, dictating that $a_j$ must also always be positive. Applying these constraints becomes trivial by transforming the data and constraints into a non-negative least squares problem. Algorithm 2.2.2 implements these transformations, which are explained in the following.

Figure 2.6: Constrained regression (left) versus unconstrained regression (right) applied to the same problem. (Axes: extrapolation parameter ($\mu$) versus square of magnitude. Legend: spectral radius, left sample, left regression, right sample, right regression.)

If there are $n$ samples with $m$ explanatory variables and $X \in \mathbb{R}^{n \times m}$, $Y \in \mathbb{R}^{n \times 1}$, and $B \in \mathbb{R}^{m \times 1}$, where $X$ contains the explanatory samples, $Y$ contains the response samples, and $B$ contains the regression coefficients, then the typical least

squares problem attempts to choose the vector $B$ such that $\|XB - Y\|$ is minimized. The non-negative least squares problem also attempts to choose the vector $B$ such that $\|XB - Y\|$ is minimized; however, it is subject to $B \ge 0$. An algorithm for solving non-negative least squares problems is given in [13].

The first transformation to set up the non-negative least squares problem will be to account for $c_j$ being equal to 1. If the $i$th sample point consists of the values $(\mu_i, y_i)$ and the $j$th segment consists of the coefficients $a_j$, $b_j$, and $c_j$, then
$$y_i \approx a_j\mu_i^2 + b_j\mu_i + c_j \qquad (2.6)$$
$$\implies y_i \approx a_j\mu_i^2 + b_j\mu_i + 1 \qquad (2.7)$$
$$\implies y_i - 1 \approx a_j\mu_i^2 + b_j\mu_i \qquad (2.8)$$
$$\implies \bar{y}_i \approx a_j\mu_i^2 + b_j\mu_i \quad \text{where } \bar{y}_i = y_i - 1. \qquad (2.9)$$

Now the regression can be run on equation (2.9) and only two coefficients need to be determined. The additional constraints dictated by the mathematics of equation (2.5) require that $a_j > 0$ and $b_j < 0$, which means it is not yet a non-negative problem. Through substitution, the least squares problem that satisfies
$$\bar{y}_i \approx a_j\mu_i^2 + b_j\mu_i \quad \text{where } \bar{y}_i = y_i - 1,\ a_j > 0,\ \text{and } b_j < 0 \qquad (2.10)$$
can be transformed into
$$\bar{y}_i \approx a_j\mu_i^2 + b_j\bar{\mu}_i \quad \text{where } \bar{y}_i = y_i - 1,\ \bar{\mu}_i = -\mu_i,\ a_j > 0,\ \text{and } b_j > 0. \qquad (2.11)$$
The substitution of $-\mu_i$ for $\bar{\mu}_i$ in equation (2.11) allows $\bar{\mu}_i$ to absorb the negative constraint, resulting in constraints that all require positiveness and thus in a problem fit for non-negative least squares.

Figure 2.6 shows the effect that constraints can have on the reconstruction. The left plot applies the constraints and shows an accurate reconstruction that passes through $(0, 1)$ with both segments being convex. The right plot, however, has no constraints and clearly does not pass through $(0, 1)$, and its left segment is concave. Not only does this lead to an inaccurate reconstruction, but it also leads to difficulties in selecting an optimal extrapolation parameter because the segments do not intersect.

Algorithm 2.2.2: Constrained Regression

% result is a 3-element array that contains the coefficients
% of a quadratic function (constant, linear, quadratic)
function [result] = ConstrainedRegression(x_input, y_input)

    % Transform data for the constrained regression
    y = y_input(:) - 1;

    % Design matrix: one column per coefficient
    % (assumes NNLS(C, d) solves min ||C*B - d|| subject to B >= 0)
    x(:,1) = -x_input(:);
    x(:,2) = x_input(:).^2;

    % Run non-negative least squares regression
    nnls_result = NNLS(x, y);

    % Transform the NNLS result back to usable coefficients
    result(1) = 1;
    result(2) = -nnls_result(1);
    result(3) = nnls_result(2);
end
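The NNLS routine above refers to the non-negative least squares solver of [13]; MATLAB's built-in lsqnonneg implements the same algorithm, so an equivalent direct usage would be the following (the sample vectors mu_samples and y_samples are hypothetical names for the gathered samples of one segment).

% Fit the constrained model to samples of the squared spectral radius
C = [-mu_samples(:), mu_samples(:).^2];   % columns: negated linear, quadratic
d = y_samples(:) - 1;                     % account for c_j = 1
B = lsqnonneg(C, d);                      % B >= 0 componentwise

coeffs = [1, -B(1), B(2)];                % [c_j, b_j, a_j] with b_j <= 0, a_j >= 0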

2.3 Parameter Selection

The previous sections have outlined how to sample the extrapolated spectral radius function and how to reconstruct the extrapolated spectral radius function from samples; however, the question of where to take samples from has yet to be addressed. Samples up to this point have been chosen in a uniformly spaced region around the area of interest, i.e., around the optimal value. However, without any prior information about the region surrounding the optimal value, the problem of where to select samples from becomes difficult. Poor choices for sample locations can be computationally expensive due to the required solving of a linear system with an extrapolation parameter that is suboptimal. Since an extrapolation parameter of 1 is equivalent to no extrapolation, parameter selection should try to find the optimal extrapolation parameter while always using extrapolation parameters that provide superior convergence over non-extrapolated iteration. Figure 2.7 shows a reconstructed spectral radius function and contains a dotted black line that samples should ideally be taken below, since any sample in this region would be superior to non-extrapolated iteration.

The goal of parameter selection is to estimate the optimal extrapolation parameter while also refining the reconstruction of the extrapolated spectral radius function. Parameter selection can be broken down into three important functions, EstimatedOptimalMu, ApplyJitter, and ParameterSelection, which are described in the following sections.

2.3.1 EstimatedOptimalMu

Figure 2.7: Example of estimating the extrapolated spectral radius function from samples that are all superior to non-extrapolated iteration (below the black dotted line). (Axes: extrapolation parameter ($\mu$) versus square of magnitude.)

As previously mentioned, the minimum value of the extrapolated spectral radius function will occur at either the intersection of segments or on a segment which has a derivative equal to zero. Thus, the first step in finding the optimal parameter will be to find the intersection of the two segments that have been reconstructed. If $a_L$ and $b_L$ are the quadratic and linear coefficients of the left segment, respectively, and $a_R$ and $b_R$ are the coefficients of the right segment, then the intersection will be located where
$$a_L\mu^2 + b_L\mu + 1 = a_R\mu^2 + b_R\mu + 1 \qquad (2.12)$$
$$\implies (a_L - a_R)\mu^2 + (b_L - b_R)\mu = 0. \qquad (2.13)$$
Let $a_I = a_L - a_R$ and $b_I = b_L - b_R$ be the coefficients of equation (2.13); then the segment intersections occur at
$$\mu = \frac{-b_I}{a_I} \quad \text{and} \quad \mu = 0. \qquad (2.14)$$

Because the intersection at $\mu = 0$ is already known from the constraint placed on the regression, we are interested in the intersection at $\mu = \frac{-b_I}{a_I}$, which is calculated on lines 14-16 of algorithm 2.3.1. However, it is possible that $a_I$ is zero, because either the non-negative least squares regression determined 0 to be the quadratic coefficient for both segments or both segments share the same non-zero quadratic term. In either scenario, an intersection cannot be found and an optimal parameter cannot yet be estimated. The procedure under these conditions is to take another sample slightly beyond the current range of samples, as seen on lines 25-31 of algorithm 2.3.1. Additional samples will quickly eliminate scenarios with no intersections as the reconstruction is refined. Note that the value search_speed is a tune-able parameter that determines how far the next parameter can be from the previous sample locations. Choosing a small value for search_speed can result in slow exploration of the sample space; however, large values for search_speed can lead to samples far away from the optimal value, leading to expensive computation.

If $a_I \ne 0$, and thus an intersection besides $(0, 1)$ is found, then both the left and right segments are checked to see if they contain a minimum within their region; if one does, then the location of that minimum is used as the optimal parameter, otherwise the intersection is used as the minimum. Checking if a segment contains the minimum does not require explicitly calculating the value of the function at its minimum; only the locations of the segment minimums are needed, which are $\frac{-b_L}{2a_L}$ and $\frac{-b_R}{2a_R}$ for the left and right segments, respectively. Then, if the left segment's minimum is to the left of the intersection, or the right segment's minimum is to the right of the intersection, the appropriate location of the optimal extrapolation parameter can be determined, as seen in lines 34-45 of algorithm 2.3.1.

If the optimal parameter is estimated at a location that is far outside the range of where samples have been taken then the estimation may be very inaccurate.

This issue is common when there are very few samples to base an estimation on. In the case that this situation does occur, lines 50-54 in algorithm 2.3.1 limit how far away the next sample will be so parameters don’t travel too far into unsampled areas.

Algorithm 2.3.1: Estimated Optimal Mu

1  function [mu, BL, BR] = EstimatedOptimalMu(X, Y, search_speed, has_divergence)
2  
3  % Find segments
4  Segments = FindSegments(X, Y);
5  
6  % Find equation of each segment
7  BL = ConstrainedRegression(X(Segments(1,1):Segments(1,2)), ...
8       Y(Segments(1,1):Segments(1,2)));
9  
10 BR = ConstrainedRegression(X(Segments(2,1):Segments(2,2)), ...
11      Y(Segments(2,1):Segments(2,2)));
12 
13 % Find intersection of segments
14 a = BL(3) - BR(3);
15 b = BL(2) - BR(2);
16 intersection = -b/a;
17 
18 % If there is no intersection for positive mu, or the right
19 % segment is on top of the left segment
20 if (a == 0 || intersection < 0 || BL(2) < BR(2))
21 
22     % If divergence has occurred and the right segment is
23     % on top of the left segment, samples should be taken from
24     % the right side of the left group of samples
25     if (has_divergence == 1 && BL(2) < BR(2))
26         mu = max(X(Segments(1,1):Segments(1,2)))*(1 + search_speed);
27 
28     % Else, just search beyond the maximum sample parameter
29     else
30         mu = max(X)*(1 + search_speed);
31     end
32 else
33     % If the left segment contains the spectral radius minimum
34     if (BL(3) ~= 0 && -BL(2)/(2*BL(3)) < intersection)
35         mu = -BL(2)/(2*BL(3));
36 
37     % Else-if the right segment contains the spectral radius minimum
38     elseif (BR(3) ~= 0 && -BR(2)/(2*BR(3)) > intersection)
39         mu = -BR(2)/(2*BR(3));
40 
41     % Else the intersection contains the spectral radius minimum
42     else
43         mu = intersection;
44     end
45 
46 end
47 
48 % If the optimal parameter is found outside of the range of
49 % the samples, the sample should be limited by the search speed
50 if (mu < min(X)*(1 - search_speed))
51     mu = min(X)*(1 - search_speed);
52 elseif (mu > max(X)*(1 + search_speed))
53     mu = max(X)*(1 + search_speed);
54 end
55 end

2.3.2 Jitter

Occasionally, if the estimated optimal parameter is used repeatedly for the sample location, the estimation may get stuck in an inaccurate reconstruction because subsequent samples do not add enough variation to refine the estimation. To alleviate this issue a jitter term is introduced so that subsequent parameters are slightly different each time. As seen from the speedup equation in section 1.2, small changes to the spectral radius can have large effects on the amount of computation that must be done. Therefore, jitter of the extrapolation parameter should not be based upon adding unrestricted random variation to $\mu$, but should instead be intelligently applied variation to $\mu$ based upon restrictions derived from the speedup equation.

If the estimated optimal spectral radius is $\rho_{OPT}$ and requires $n$ iterations to converge, and the allowed jittered cost is $\alpha$ times the number of iterations the optimal iteration requires, with $\alpha \ge 1$, then
$$\rho_{LIMIT}^{\,n\alpha} = \rho_{OPT}^{\,n} \qquad (2.15)$$
and
$$\rho_{OPT}^{\,1/\alpha} = \rho_{LIMIT}. \qquad (2.16)$$
Equation (2.16), calculated on line 10 of algorithm 2.3.2, places a limit on the highest spectral radius allowed by jitter while remaining within the computation envelope provided by $\alpha$. Note that if the cost of jitter is allowed to be up to 10% more than the iterations of the optimal iteration, then $\alpha = 1.1$.
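For instance, if the estimated optimal spectral radius were $\rho_{OPT} = 0.9$ and a 10% iteration overhead were allowed ($\alpha = 1.1$), equation (2.16) would give
$$\rho_{LIMIT} = 0.9^{1/1.1} \approx 0.909,$$
so any jittered parameter whose spectral radius stays below roughly 0.909 remains within the allowed computation envelope (the numbers here are illustrative only).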

Next, a range for the extrapolation parameter that satisfies the limit placed on the spectral radius by jitter must be found. In the case that a segment is quadratic, it has two locations that equal the jitter limit, located at the solutions of
$$a\mu^2 + b\mu + 1 = \rho_{LIMIT}. \qquad (2.17)$$
Any value of the extrapolation parameter between the two solutions to equation (2.17) will satisfy the jitter limit. Additionally, in the case of a linear segment, the solution to
$$\rho_{LIMIT} = b\mu + 1 \qquad (2.18)$$
will provide one limit for the extrapolation parameter that satisfies the jitter limit. The distance from this linear restriction to the optimum extrapolation parameter is mirrored to the other side of the optimum parameter so that a bounded range is given in either case of a quadratic or linear segment. Lines 12-43 of algorithm 2.3.2 implement the calculation of the range bounds for the left and right segments. Finally, as calculated on lines 45-57, with the tightest bounds on the extrapolation parameter from both the left and right segments that adhere to the jitter limit, a value in the range is randomly chosen as the jittered $\mu$ value. Note that the jitter limit $\alpha$ could be made a function of the number of samples available so that well-sampled estimations use less jitter for improved performance.

Algorithm 2.3.2: Jitter Algorithm

1  function [mu] = ApplyJitter(BL, BR, mu, max_jitter_cost)
2  
3  % Find the y value at mu (the max of left and right segments)
4  left_y = BL(3)*mu^2 + BL(2)*mu + 1;
5  right_y = BR(3)*mu^2 + BR(2)*mu + 1;
6  
7  current_y = max(left_y, right_y);
8  
9  % Maximum value the jitter value should attain
10 y_limit = current_y^(1/max_jitter_cost);
11 
12 % If the left segment is quadratic
13 if (BL(3) ~= 0)
14     % Find the solutions to the quadratic equation
15     BL_left_bound = -BL(2) - sqrt(BL(2)^2 - 4*BL(3)*(1 - y_limit));
16     BL_left_bound = BL_left_bound / (2*BL(3));
17 
18     BL_right_bound = -BL(2) + sqrt(BL(2)^2 - 4*BL(3)*(1 - y_limit));
19     BL_right_bound = BL_right_bound / (2*BL(3));
20 
21 % Else, the left segment is linear
22 else
23     jitDist = abs((y_limit - 1)/BL(2) - mu);
24 
25     BL_left_bound = mu - jitDist;
26     BL_right_bound = mu + jitDist;
27 end
28 
29 % If the right segment is quadratic
30 if (BR(3) ~= 0)
31     BR_left_bound = -BR(2) - sqrt(BR(2)^2 - 4*BR(3)*(1 - y_limit));
32     BR_left_bound = BR_left_bound / (2*BR(3));
33 
34     BR_right_bound = -BR(2) + sqrt(BR(2)^2 - 4*BR(3)*(1 - y_limit));
35     BR_right_bound = BR_right_bound / (2*BR(3));
36 
37 % Else, the right segment is linear
38 else
39     jitDist = abs((y_limit - 1)/BR(2) - mu);
40 
41     BR_left_bound = mu - jitDist;
42     BR_right_bound = mu + jitDist;
43 end
44 
45 % Set up bounds on the jitter (use the tightest bounds)
46 left_bound = max(BL_left_bound, BR_left_bound);
47 right_bound = min(BL_right_bound, BR_right_bound);
48 
49 % Generate random value for jitter
50 jitter_amount = rand();
51 
52 % Apply jitter randomly to the left or right of mu
53 if (jitter_amount < .5)
54     mu = left_bound + (mu - left_bound)*(2*jitter_amount);
55 else
56     mu = mu + (right_bound - mu)*(2*(jitter_amount - .5));
57 end
58 end

36 2.3.3 Parameter Selection

The previous sections provide a method to estimate the optimal extrapolation parameter as well as a method to intelligently add variation to refine the reconstruction of the spectral radius function. Parameter selection ties these functions together with rules to manage the selection of a parameter. Algorithm 2.3.3, which implements these rules, is examined below.

First, lines 4-10 declare parameters that are used in this function and are tune-able by users. Next, lines 13-14 sort the samples by their $x$ location while making sure to also track the $y$ values accordingly. The sorted samples make further processing easier, since the data is known to be in order. The first rule in parameter selection takes place on lines 17-20, which return a default initial value if no samples have been evaluated yet. This gives a starting point for the algorithm, and an initial value of 1 makes the initial value equivalent to non-extrapolated iteration. Additionally, there need to be more than two samples for any analysis to take place; thus, lines 24-27 return a parameter slightly beyond the initial value to get additional samples.

Next, lines 30-36 search for divergence by checking for sample values that are very close to 1, as dictated by divergence_value. If all samples are declared divergent, then lines 39-42 implement a bisection of the minimum sample location in an attempt to find a parameter that will converge. Additionally, lines 45-51 detect if there has been divergence, in which case x_limit is calculated, which clamps subsequent estimations to the midpoint between the last convergent sample and the first divergent sample. This clamp limits the possibility of further divergent samples.

Finally, lines 54-55 trim out divergent samples so only convergent samples are used in the reconstruction of the spectral radius function. Lines 58-61 find the estimated optimal $\mu$ and apply jitter. Lastly, lines 65-67 apply the clamp from x_limit that was calculated earlier.

Algorithm 2.3.3: Parameter Selection

1  function [mu] = ParameterSelection(X_input, Y_input)
2  
3  % Set parameters
4  n_input = length(X_input);      % Number of samples taken
5  
6  search_speed = 0.1;             % Fraction beyond the sample range we
7                                  % can look
8  initial_value = 1;              % Where the first guess is
9  divergence_value = .9999;       % Should be slightly less than 1
10 jitter = 1.1;                   % Should be slightly larger than 1
11 
12 % Sort samples by X position
13 [X, IX] = sort(X_input);
14 Y = Y_input(IX);
15 
16 % If there are no samples to analyze, return initial value
17 if (isempty(X))
18     mu = initial_value;
19     return;
20 end
21 
22 % If there are too few samples, generate another sample
23 % before analysis can be done
24 if (length(X) <= 2)
25     mu = max(X)*(1 - search_speed);
26     return;
27 end
28 
29 % Detect where divergence occurs
30 divergence_location = 0;
31 for i = n_input:-1:1
32     if (Y(i) < divergence_value)
33         divergence_location = i;
34         break;
35     end
36 end
37 
38 % If every sample has been divergent, bisect the smallest checked mu
39 if (divergence_location <= 1)
40     mu = min(X)/2;
41     return;
42 end
43 
44 % Create an upper limit for the next parameter
45 if (divergence_location < n_input)
46     has_divergence = 1;
47     x_limit = (X(divergence_location) + X(divergence_location+1))/2;
48 else
49     has_divergence = 0;
50     x_limit = Inf;
51 end
52 
53 % Trim out divergent samples
54 X = X(1:divergence_location);
55 Y = Y(1:divergence_location);
56 
57 % Find the estimated optimal mu and the coefficients of the segments
58 [minimum_mu, BL, BR] = EstimatedOptimalMu(X, Y, search_speed, has_divergence);
59 
60 % Apply jitter
61 mu = ApplyJitter(BL, BR, minimum_mu, jitter);
62 
63 % Check to make sure the final value isn't in an area close
64 % to divergence; if so, clamp it to the limit
65 if (mu > x_limit)
66     mu = x_limit;
67 end
68 end

2.4 Solver Integration

The previous sections result in the function ParameterSelection, which abstracts all the logic of having to choose a parameter; thus, integration of the proposed algorithm into an existing solver only requires a few modifications. One of the requirements of implementing the algorithm is the storage of previous samples, so persistent variables are declared and initialized in lines 9-18, which store samples between calls to MySolver.

Line 24 contains the main iteration loop that is typical of every iterative solver; however, there is additional logic on lines 30-77 to manage sample collection. The calculation of the estimated spectral radius and the storage of samples occur on lines 51 and 59-64, respectively. Previously, in section 2.1, the number of iterations required for an accurate estimation was evaluated. The conclusion was that roughly 200 iterations are required for an accurate estimation of the spectral radius. In the case that many more iterations are required to solve a system than are needed to estimate the spectral radius, a new parameter can be selected whenever enough iterations have been computed for the estimation. Thus, lines 32, 38, and 46 check whether a sufficient number of iterations have been computed. In the case that a divergent parameter has been used, all the work computed with this parameter is useless because it has taken the iteration farther away from the solution; thus, line 55 reverts to the value from before the divergent iteration occurred.

The integration of the proposed algorithm into the solver and the use of persistent memory allow a user to call MySolver with no additional knowledge or requirements about the problem. This fully abstracts the workings of the proposed algorithm, making it available for general use by any user.

Algorithm 2.4: Solver Integration

 1  function [x, solver_iteration] = MySolver(A, x, b, SolverType, solution_tol, SOR_w)
 2
 3  % Set parameters
 4  max_iterations = 5000;
 5  restart_pos    = 200;
 6  restart_min    = 150;
 7
 8  % Create persistent (static) variables
 9  persistent mu_array;
10  persistent rate_array;
11  persistent sample;
12
13  % Initialize values if uninitialized
14  if (isempty(mu_array))
15      mu_array   = [];
16      rate_array = [];
17      sample     = 1;
18  end
19
20  % Initialize mu
21  mu = 0;
22
23  % Iterate up to max_iterations times
24  for solver_iteration = 1:max_iterations
25
26      % Calculate the error in the current solution
27      error_norm = norm(A*x - b);
28
29      % If solved OR there are enough iterations to sample
30      if (error_norm < solution_tol || ...
31          mod(solver_iteration, restart_pos) == 0 || ...
32          solver_iteration == 1)
33
34          % If it's not the first iteration
35          if (solver_iteration > 1)
36
37              % Find the number of iterations used for this sample
38              if (mod(solver_iteration, restart_pos) == 0)
39                  sample_iterations = restart_pos;
40              else
41                  sample_iterations = mod(solver_iteration, restart_pos);
42              end
43
44              % Only gather a sample if there were enough iterations or you are forced to
45              if (sample_iterations > restart_min || ...
46                  solver_iteration < restart_pos)
47
48                  % Calculate estimate for spectral radius
49                  estimated_spectral_radius = (norm(xp - x) / ...
50                      norm(sample_x0 - x))^ ...
51                      (2/(sample_iterations - 1));
52
53                  % If NOT solved AND divergent, throw away divergent iterations
54                  if (error_norm > solution_tol && estimated_spectral_radius > 0.999)
55                      x = sample_x0;
56                  end
57
58                  % Make sure x0 - x did not cause NaN, then record
59                  if (~isnan(estimated_spectral_radius))
60                      mu_array(sample)   = mu;
61                      rate_array(sample) = estimated_spectral_radius;
62                      sample             = sample + 1;
63                  end
64              end
65          end
66
67          % If solved, quit iterating
68          if (error_norm < solution_tol)
69              break;
70          end
71
72          % Select new parameter
73          mu = ParameterSelection(mu_array, rate_array);
74
75          % Record starting point for sample
76          sample_x0 = x;
77      end
78
79      xp = x;
80
81      x = ComputeIteration(A, x, b);
82
83      % Apply extrapolation
84      x = xp + mu*(x - xp);
85  end
86  end
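As a usage sketch (not part of the thesis code), the following driver shows how MySolver might be called over a sequence of right-hand sides; the test matrix, tolerance, SolverType string, and SOR weight below are illustrative placeholders only.

    % Hypothetical driver: solve a sequence of systems with MySolver.
    clear MySolver;                         % reset the persistent sample storage
    A = gallery('poisson', 10);             % example 100 x 100 SPD matrix
    n = size(A, 1);
    solution_tol = 1e-8;
    for i = 1:50
        b  = rand(n, 1);                    % next right-hand side in the sequence
        x0 = zeros(n, 1);                   % fresh initial guess for this element
        [x, iters] = MySolver(A, x0, b, 'GaussSeidel', solution_tol, 1.0);
        fprintf('element %3d solved in %4d iterations\n', i, iters);
    end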

CHAPTER III

PERFORMANCE ANALYSIS

This chapter evaluates the performance of the proposed algorithm by randomly generating sequences of linear systems, solving them, and recording the number of iterations required for convergence. To generate a sequence, a 100 × 100 symmetric positive definite coefficient matrix A with a randomly chosen condition number between 50 and 150 was generated; vectors with random entries between 0 and 1 were then generated for the b^(i) to create the sequence. For each benchmark, the first 100 elements of 250 sequences of linear systems were solved.
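One way such a test system could be generated is sketched below; the thesis does not specify the exact construction, so the use of a random orthogonal basis with a prescribed eigenvalue spread is an assumption.

    % Hypothetical generator for one random 100 x 100 SPD test system with a
    % prescribed condition number (the construction is an assumption, not the
    % thesis' generator).
    n = 100;
    kappa = 50 + 100*rand();              % target condition number in [50, 150]
    [Q, ~] = qr(randn(n));                % random orthogonal matrix
    d = linspace(1, kappa, n);            % eigenvalues spanning [1, kappa]
    A = Q*diag(d)*Q';                     % SPD with cond(A) approximately kappa
    A = (A + A')/2;                       % symmetrize against round-off
    b = rand(n, 1);                       % right-hand side with entries in [0, 1]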

Figures 3.1a, 3.2a, and 3.3a analyze the ratio of the number of iterations required using the proposed algorithm to the number of iterations required using the optimal extrapolation parameter found through a brute force bisection method.

Ratios at or near 1 indicate that the proposed algorithm accurately finds the optimal extrapolation parameter. The figures show that by roughly the 30th element in a sequence even the 95th percentile estimates the optimal extrapolation parameter very accurately. Note that there are some scenarios in which the proposed algorithm exceeds the performance of the theoretically optimal parameter found through brute force. This superoptimal improvement stems from the fact that the spectral radius determines the rate of convergence only as the number of iterations goes to infinity, whereas iteration usually terminates after a few hundred iterations once sufficient convergence is achieved. Because the proposed algorithm is based upon the average reduction factor rather than the spectral radius, superoptimal convergence can sometimes be achieved.
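As a hedged aside (the notation here follows standard references such as [4, 9] and may differ in detail from the definition used in chapter II), the average reduction factor over $k$ iterations with error $e^{(k)}$ is

\[
\bar{\sigma}_k = \left( \frac{\lVert e^{(k)} \rVert}{\lVert e^{(0)} \rVert} \right)^{1/k},
\]

which typically approaches the asymptotic rate $\rho(T)$ only as $k \to \infty$; for finite $k$ it can lie below $\rho(T)$, which is what permits the superoptimal ratios observed here.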

Figures 3.1b, 3.2b, and 3.3b analyze the ratio of the number of iterations required by the proposed algorithm to the number of iterations required by the non-extrapolated method. Ratios less than 1 show that the proposed method is more computationally efficient than solving without extrapolation. As with the previous figures, by roughly the 30th element in the sequence even the 95th percentile becomes steady as the parameter estimate becomes accurate. Figures 3.1b and 3.2b show a clear improvement, with median ratios well below 1; figure 3.3b, however, shows little improvement, with a median ratio very close to 1. The lack of improvement in figure 3.3b is caused by an optimal extrapolation parameter that is very close to 1, so even a very accurate estimate of the optimal parameter cannot yield a large improvement.

The conclusion to be drawn from these figures is that the proposed algorithm is able to accurately estimate the optimal extrapolation parameter after solving a small number of elements in the sequence; whether extrapolation provides a significant speedup, however, depends on the splitting and the problem at hand.

[Figure 3.1: Performance results for Gauss-Seidel. (a) Quality of optimal spectral radius estimation: iterations for the proposed algorithm / iterations with optimal extrapolation versus sequence position, plotted for the 5th, 25th, 50th, 75th, and 95th percentiles. (b) Improvement beyond non-extrapolated iteration: iterations for the proposed algorithm / iterations with no extrapolation versus sequence element, for the same percentiles.]

[Figure 3.2: Performance results for SOR w = 1.5. (a) Quality of optimal spectral radius estimation: iterations for the proposed algorithm / iterations with optimal extrapolation versus sequence position, plotted for the 5th, 25th, 50th, 75th, and 95th percentiles. (b) Improvement beyond non-extrapolated iteration: iterations for the proposed algorithm / iterations with no extrapolation versus sequence element, for the same percentiles.]

[Figure 3.3: Performance results for SOR w = 1.8. (a) Quality of optimal spectral radius estimation: iterations for the proposed algorithm / iterations with optimal extrapolation versus sequence position, plotted for the 5th, 25th, 50th, 75th, and 95th percentiles. (b) Improvement beyond non-extrapolated iteration: iterations for the proposed algorithm / iterations with no extrapolation versus sequence element, for the same percentiles.]

CHAPTER IV

CASE STUDY

The previous chapter utilized randomly generated systems to analyze the performance of the proposed algorithm; however, randomly generated systems may not accurately reflect problems that occur in the real world. This chapter implements the Crank-Nicholson method for solving the diffusion equation over a 3D domain in order to analyze the performance of the proposed algorithm in a real-world scenario.

The partial differential equation that will be solved is

\[
U_t = a \left( U_{xx} + U_{yy} + U_{zz} \right) + S(x, y, z, t) \qquad (4.1)
\]

where S(x, y, z, t) is a source term and a is a constant value for the diffusion coefficient.

The Crank-Nicholson method discretizes the diffusion equation over space and time, requiring a linear system to be solved at each time step and thus generating a sequence of linear systems [14, 15]. Figure 4.1 shows slices of the solved volume in this example at different time steps.

[Figure 4.1: Slices of the volume solved in the example problem at particular times. (a) t = .25; (b) t = 1.25.]

When discretized, the diffusion equation becomes

\[
\left( \frac{1}{\Delta t} + \frac{a}{\Delta x^2} + \frac{a}{\Delta y^2} + \frac{a}{\Delta z^2} \right) U^{n+1}_{i,j,k}
- \frac{a}{2\Delta x^2} \left( U^{n+1}_{i-1,j,k} + U^{n+1}_{i+1,j,k} \right)
- \frac{a}{2\Delta y^2} \left( U^{n+1}_{i,j-1,k} + U^{n+1}_{i,j+1,k} \right)
- \frac{a}{2\Delta z^2} \left( U^{n+1}_{i,j,k-1} + U^{n+1}_{i,j,k+1} \right)
\]
\[
= \left( \frac{1}{\Delta t} - \frac{a}{\Delta x^2} - \frac{a}{\Delta y^2} - \frac{a}{\Delta z^2} \right) U^{n}_{i,j,k}
+ \frac{a}{2\Delta x^2} \left( U^{n}_{i-1,j,k} + U^{n}_{i+1,j,k} \right)
+ \frac{a}{2\Delta y^2} \left( U^{n}_{i,j-1,k} + U^{n}_{i,j+1,k} \right)
+ \frac{a}{2\Delta z^2} \left( U^{n}_{i,j,k-1} + U^{n}_{i,j,k+1} \right)
+ S^{n+1/2}_{i,j,k}
\qquad (4.2)
\]
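For reference (this intermediate form is not written out explicitly in the text), equation (4.2) follows from averaging the standard seven-point difference Laplacian, denoted here by $\nabla_h^2$, between time levels $n$ and $n+1$:

\[
\frac{U^{n+1}_{i,j,k} - U^{n}_{i,j,k}}{\Delta t}
= \frac{a}{2} \left( \nabla_h^2 U^{n+1}_{i,j,k} + \nabla_h^2 U^{n}_{i,j,k} \right) + S^{n+1/2}_{i,j,k};
\]

collecting the $n+1$ terms on the left and the $n$ terms on the right yields (4.2).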

The coefficient matrix A^(i) is determined from the left side of equation (4.2), while the vector b^(i) is determined from the right side. If n_x, n_y, and n_z are the number of grid points in the respective dimensions, then the resulting coefficient matrix is of size n_x n_y n_z × n_x n_y n_z. Doubling the resolution of the grid in each dimension therefore results in a matrix containing 64 times the number of entries (the matrix dimension grows by a factor of 2^3 = 8, so the number of entries grows by 8^2 = 64), the vast majority of which are zero, as seen in figure 4.2. Considering the memory and computational requirements of direct solvers, this example clearly favours iterative methods and sparse storage as the grid size becomes large.

[Figure 4.2: Sparsity plot of the Crank-Nicholson coefficient matrix for an 8x8x8 discretization (blue entries are the nonzero entries of the matrix); nz = 1808 nonzero entries.]
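As an illustrative sketch (not the thesis' code), the left-hand-side operator of (4.2) can be assembled sparsely with Kronecker products; the boundary treatment is omitted here, so the nonzero count differs slightly from figure 4.2.

    % Sketch: assemble the left-hand-side matrix of equation (4.2) for an
    % 8x8x8 grid using sparse Kronecker products. Boundary handling is
    % omitted, so this is illustrative rather than the thesis' exact matrix.
    nx = 8; ny = 8; nz = 8;
    dx = 1; dy = 1; dz = 1;
    dt = 0.25; a = 1;
    ex = ones(nx, 1); ey = ones(ny, 1); ez = ones(nz, 1);
    Lx = spdiags([ex -2*ex ex], -1:1, nx, nx) / dx^2;   % 1D second differences
    Ly = spdiags([ey -2*ey ey], -1:1, ny, ny) / dy^2;
    Lz = spdiags([ez -2*ez ez], -1:1, nz, nz) / dz^2;
    Ix = speye(nx); Iy = speye(ny); Iz = speye(nz);
    L3 = kron(Iz, kron(Iy, Lx)) + kron(Iz, kron(Ly, Ix)) + kron(kron(Lz, Iy), Ix);
    A  = speye(nx*ny*nz)/dt - (a/2)*L3;   % diagonal: 1/dt + a/dx^2 + a/dy^2 + a/dz^2
    spy(A);                               % banded structure as in figure 4.2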

For this particular example the diffusion coefficient a is chosen to be 1, the boundary conditions are constant and equal to 2, ∆x = ∆y = ∆z = 1, ∆t = .25, and n_x = n_y = n_z = 30.

Each system was solved using both the non-extrapolated Gauss-Seidel method and the extrapolated Gauss-Seidel method with the proposed algorithm. The number of iterations required for convergence was recorded as the metric for comparison.
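A sketch of how the resulting sequence of solves might be driven with MySolver is given below; ComputeRHS is a hypothetical helper standing in for the right side of equation (4.2) plus the source term, A is assumed to have been assembled as in the previous sketch (with a 30x30x30 grid), and warm-starting each solve from the previous time step's solution is an assumption rather than something stated in the text.

    % Hypothetical time-stepping driver; ComputeRHS is a placeholder for the
    % right side of (4.2) plus the source term, and A is assumed assembled as
    % in the previous sketch for a 30x30x30 grid.
    n_grid = 30;  dt = 0.25;
    n_steps = 100;
    u = 2*ones(n_grid^3, 1);                 % initial field matching the boundary value
    clear MySolver;                          % start with an empty sample history
    for step = 1:n_steps
        b = ComputeRHS(u, step*dt);          % right-hand side for this time step
        [u, iters] = MySolver(A, u, b, 'GaussSeidel', 1e-6, 1.0);
        fprintf('time step %3d: %4d iterations\n', step, iters);
    end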

Two different scenarios for the source function S(x, y, z, t) were tested.

The first scenario utilizes a source function that is a 'pulse' of the form S(x, y, z, t) = sin(t).

[Figure 4.3: Benchmark results for a 'pulsed' source (S(x, y, z, t) = sin(t)). (a) Iterations for convergence per sequence element, proposed algorithm versus original method. (b) Total iterations computed for the sequence, proposed algorithm versus original method.]

The pulse creates constant change in the simulated volume, ensuring that work must be done by the iterative solver at each time step. The effect of the pulse is clearly visible in figure 4.3a in the form of bumps, caused by the varying change in the solution between time steps, which requires more iterations.

The first elements solved in the sequence in figure 4.3a show the initial variation in the optimal parameter estimate and the use of a suboptimal parameter that causes slower convergence for one particular sequence element. Figure 4.3b shows the accumulated improvement of the proposed algorithm over the non-extrapolated version; extrapolating the trend in figure 4.3b indicates that the proposed algorithm requires only about 70% of the iterations of the non-extrapolated version.

The second scenario uses no source term, that is, S(x, y, z, t) = 0. The solution therefore approaches a steady state, and the iterative solver has very little work to do as time passes.

[Figure 4.4: Benchmark results for a constant source (S(x, y, z, t) = 0). (a) Iterations for convergence per sequence element, proposed algorithm versus original method. (b) Total iterations for convergence, proposed algorithm versus original method.]

Figure 4.4a shows that the proposed algorithm is less successful at estimating a parameter that provides an improvement, because the decline in the amount of work to be done also degrades the quality of the spectral radius estimate. Figure 4.4b shows that even with this difficulty, the overall benefit of the proposed algorithm is not completely lost.

BIBLIOGRAPHY

[1] C.T. Kelley. Iterative Methods for Linear and Nonlinear Equations. Frontiers in Applied Mathematics. Society for Industrial and Applied Mathematics, 1995.

[2] Y. Saad and M. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing, 7(3):856–869, 1986.

[3] Helmut Wittmeyer. Über die Lösung von linearen Gleichungssystemen durch Iteration. ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte Mathematik und Mechanik, 16(5):301–310, 1936.

[4] Richard Varga. Matrix Iterative Analysis. Springer-Verlag, Berlin, New York, 2000.

[5] Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2nd edition, 2003.

[6] Gene Golub. Matrix Computations. Johns Hopkins University Press, Baltimore, 1989.

[7] Josef Stoer. Introduction to Numerical Analysis. Springer, New York, 2002.

[8] David Young. Iterative Methods for Solving Partial Difference Equations of Elliptic Type. PhD thesis, Harvard University, Cambridge, MA, 1950.

[9] L.A. Hageman and D.M. Young. Applied Iterative Methods. Dover books on mathematics. Dover Publications, 2004.

[10] Hans Schwarz. Numerical Analysis of Symmetric Matrices. Prentice-Hall, Englewood Cliffs, N.J., 1973.

[11] P. Albrecht and M. P. Klein. Extrapolated iterative methods for linear systems. SIAM Journal on Numerical Analysis, 21(1):192–201, 1984.

[12] Ali Hajjafar. Controlled over-relaxation method and the general extrapolation method. Applied Mathematics and Computation, 174(1):188–198, 2006.

[13] C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. Society for Industrial and Applied Mathematics, Philadelphia, PA, 1995.

[14] S.V. Patankar. Numerical Heat Transfer and Fluid Flow. Series in computational methods in mechanics and thermal sciences. Taylor & Francis, 1980.

[15] C. Pozrikidis. Introduction to Theoretical and Computational Fluid Dynamics. OUP USA, 1997.
