Rational Krylov Methods for Operator Functions
Dissertation approved by the Faculty of Mathematics and Computer Science of the Technische Universität Bergakademie Freiberg for the academic degree of doctor rerum naturalium (Dr. rer. nat.), submitted by Dipl.-Math. Stefan Güttel, born on 27 November 1981 in Dresden.
Supervisor: Prof. Dr. Michael Eiermann (Freiberg)
Reviewers: Prof. Dr. Axel Ruhe (Stockholm), Prof. Dr. Nick Trefethen FRS (Oxford)
Degree conferred: Freiberg, 12 March 2010

Abstract
We present a unified and self-contained treatment of rational Krylov methods for approximating the product of a function of a linear operator with a vector. With the help of general rational Krylov decompositions we reveal the connections between seemingly different approximation methods, such as the Rayleigh–Ritz or shift-and-invert method, and derive new methods, for example a restarted rational Krylov method and a related method based on rational interpolation at prescribed nodes. Various theorems known for polynomial Krylov spaces are generalized to the rational Krylov case. Computational issues, such as the computation of so-called matrix Rayleigh quotients or parallel variants of rational Arnoldi algorithms, are discussed. We also present novel estimates for the error arising from inexact linear system solves and the approximation error of the Rayleigh–Ritz method. Rational Krylov methods involve several parameters and we discuss their optimal choice by considering the underlying rational approximation problems. In particular, we present different classes of optimal parameters and collect formulas for the associated convergence rates. Often the parameters leading to the best convergence rates are not optimal in terms of the computation time required by the resulting rational Krylov method. We explain this observation and present new approaches for computing parameters that are preferable for computations.
We give a heuristic explanation of superlinear convergence effects observed with the Rayleigh–Ritz method, utilizing a new theory of the convergence of rational Ritz values. All theoretical results are tested and illustrated by numerical examples. Numerous links to the historical and recent literature are included.

Acknowledgements
I thank Prof. Dr. Michael Eiermann who made this thesis possible in the first place. Already when I was an undergraduate student he directed my interest towards Krylov methods and potential theory. I appreciate his constant support and belief in me, and the inspiring mathematical discussions.
I am grateful to Prof. Dr. Oliver Ernst for his help. In particular, I thank him for reading all the drafts and suggesting countless improvements and corrections regarding content and language. Of course, I take full responsibility for remaining errors and typos in this thesis.
I thank my colleagues and friends at TU Bergakademie Freiberg, in particular at the Institutes of Numerical Mathematics & Optimization and Geophysics. I learned a lot in the past three years of fruitful collaboration with them.
I thank Prof. Dr. Bernhard Beckermann for his hospitality during an intensive research week in Lille in February 2009, the outcome of which was a joint paper with Prof. Dr. Raf Vandebril, to whom I am also grateful. I thank Dr. Maxim Derevyagin for discussions about linear operators, and Prof. Dr. Mike Botchev for translating parts of the paper by A. N. Krylov. In 2008 I enjoyed two months of Japanese summer, working at the National Institute of Informatics, and I am thankful to Prof. Dr. Ken Hayami for this unforgettable experience.
I thank Prof. Dr. Axel Ruhe and Prof. Dr. Nick Trefethen for serving on the reading committee. I acknowledge financial support by the Deutsche Forschungsgemeinschaft.
Most importantly, I thank my parents for their invaluable support.

Contents
List of Figures
Notation
1 Introduction
  1.1 A Model Problem
  1.2 Aim and Structure of this Thesis
2 Functions of Operators
  2.1 Bounded Operators
  2.2 Algebraic Operators
  2.3 Closed Unbounded Operators
3 Subspace Approximation for f(A)b
  3.1 The Rayleigh Method
  3.2 Ritz Pairs
  3.3 Polynomial Krylov Spaces
4 Rational Krylov Spaces
  4.1 Definition and Basic Properties
  4.2 The Rayleigh–Ritz Method
    4.2.1 Rational Interpolation
    4.2.2 Near-Optimality
5 Rational Krylov Decompositions
  5.1 The Rational Arnoldi Algorithm
  5.2 Orthogonal Rational Functions
  5.3 Rational Krylov Decompositions
  5.4 Various Rational Krylov Methods
    5.4.1 A Restarted Rational Krylov Method
    5.4.2 The PAIN Method
    5.4.3 The Shift-and-Invert Method
    5.4.4 The PFE Method
  5.5 Overview of Rational Krylov Approximations
6 Computational Issues
  6.1 Obtaining the Rayleigh Quotient
  6.2 Linear System Solvers
    6.2.1 Direct Methods
    6.2.2 Iterative Methods
  6.3 Inexact Solves
  6.4 Loss of Orthogonality
  6.5 Parallelizing the Rational Arnoldi Algorithm
  6.6 A-Posteriori Error Estimation
    6.6.1 Norm of Correction
    6.6.2 Cauchy Integral Formula
    6.6.3 Auxiliary Interpolation Nodes
7 Selected Approximation Problems
  7.1 Preliminaries from Rational Approximation Theory
  7.2 Preliminaries from Logarithmic Potential Theory
  7.3 The Quadrature Approach
  7.4 Single Repeated Poles and Polynomial Approximation
  7.5 The Resolvent Function
    7.5.1 Single Repeated Poles
    7.5.2 Connection to Zolotarev Problems
    7.5.3 Cyclically Repeated Poles
  7.6 The Exponential Function
    7.6.1 Real Pole Approximations
    7.6.2 Connection to a Zolotarev Problem
  7.7 Other Functions, Other Techniques
8 Rational Ritz Values
  8.1 A Polynomial Extremal Problem
  8.2 Asymptotic Distribution of Ritz Values
  8.3 Superlinear Convergence
9 Numerical Experiments
  9.1 Maxwell's Equations
    9.1.1 Time Domain
    9.1.2 Frequency Domain
  9.2 Lattice Quantum Chromodynamics
  9.3 An Advection–Diffusion Problem
  9.4 A Wave Equation
Bibliography

List of Figures
1.1 Comparison of a polynomial and a rational Krylov method
3.1 Rayleigh–Ritz approximation does not find exact solution
4.1 A-priori error bound for the 3D heat equation
4.2 A-priori error bound for the 1D heat equation
5.1 Interpolation nodes of the shift-and-invert method
6.1 Estimation of the sensitivity error
6.2 Variants of parallel rational Arnoldi algorithms
6.3 Error curves for variants of parallel rational Arnoldi algorithms
6.4 Conditioning of Krylov basis in parallel rational Arnoldi algorithms
6.5 Comparison of a-posteriori error estimates and bounds
7.1 Equilibrium measure of an L-shaped domain
7.2 Signed equilibrium measure
7.3 Bending an integration contour
7.4 Convergence of quadrature methods and the PAIN method
7.5 Repeated pole approximation of the resolvent function
7.6 Convergence rate as a function of the parameter τ
7.7 Convergence of Rayleigh–Ritz approximations for the resolvent function
7.8 Comparison of optimal convergence rates
7.9 Approximating the exponential function with cyclically repeated poles
7.10 Approximating the exponential function with Zolotarev poles
8.1 Comparison of two min-max polynomials
8.2 Convergence of polynomial Ritz values
8.3 Convergence of rational Ritz values
8.4 Superlinear convergence of Rayleigh–Ritz approximation
9.1 Solving Maxwell's equations in the time domain
9.2 Error curves of Rayleigh–Ritz approximations
9.3 Solving Maxwell's equation in the frequency domain
9.4 Pole sequences for Maxwell's equations in frequency domain
9.5 Residual curves of Rayleigh–Ritz approximations
9.6 Nonzero structure of a Wilson–Dirac matrix
9.7 A-posteriori error estimates for the QCD problem
9.8 Spatial domain and eigenvalues for advection–diffusion problem
9.9 A-posteriori error estimates for advection–diffusion problem
9.10 Solutions of a 1D wave equation
9.11 Rational Ritz values for a differential operator

Notation
By ⊆ we denote set inclusion with possible equality, whereas ⊂ denotes strict inclusion. Unless stated otherwise, the following conventions for variable names hold: Small Latin letters (a, b, ...) and Greek letters (α, β, Φ, Ψ, ...) denote scalars or scalar-valued functions (including measures µ, ν, ...). Bold Latin letters (a, b, ...) stand for vectors, and these are usually in column format. Capital Latin letters (A, B, ...) stand for linear operators. Calligraphic letters (𝒫, 𝒬, ...) denote linear spaces and double-struck capital letters (𝔻, ℝ, ...) are special subsets of the complex plane