Computing Matrix Functions in Arbitrary Precision Arithmetic
COMPUTING MATRIX FUNCTIONS IN ARBITRARY PRECISION ARITHMETIC

A thesis submitted to The University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science and Engineering

2019

Massimiliano Fasi
School of Mathematics

CONTENTS

List of figures
List of tables
Abstract
Declaration
Copyright statement
Acknowledgements
Publications
1  Introduction
   Bibliography
2  Background material
   2.1  Linear algebra
   2.2  Floating point arithmetic
   Bibliography
3  Optimality of the Paterson–Stockmeyer method
   3.1  Introduction
   3.2  Evaluation of matrix polynomials
   3.3  Rational matrix functions of order [k/k]
   3.4  Diagonal Padé approximants to the matrix exponential
   3.5  Conclusion
   Bibliography
4  Solution of primary matrix equations
   4.1  Introduction
   4.2  Background and notation
   4.3  Classification of the solutions
   4.4  A substitution algorithm
   4.5  Numerical experiments
   4.6  Conclusions
   Bibliography
5  Multiprecision algorithms for the matrix exponential
   5.1  Introduction
   5.2  Padé approximation of the matrix exponential
   5.3  A multiprecision algorithm
   5.4  Numerical experiments
   5.5  Conclusions
   Bibliography
6  Multiprecision algorithms for the matrix logarithm
   6.1  Introduction
   6.2  Support for multiple precision arithmetic
   6.3  Approximation of hypergeometric functions
   6.4  Schur–Padé algorithm
   6.5  Transformation-free algorithm
   6.6  Numerical experiments
   6.7  Conclusions
   Bibliography
7  Weighted geometric mean times a vector
   7.1  Introduction
   7.2  Notation and preliminaries
   7.3  Quadrature methods
   7.4  Krylov subspace methods
   7.5  Computing (A #_t B)^{-1} v
   7.6  Numerical tests
   7.7  Conclusions
   Bibliography
8  Conclusions
   Bibliography

Word count: 66 973

LIST OF FIGURES

Figure 3.1  Number of matrix multiplications required to evaluate a polynomial of degree k, for k between 1 and 50, by means of the scheme (3.3) with s = ⌊√k⌋ and s = ⌈√k⌉. The dotted and dashed lines mark the values of k that are integer multiples of ⌊√k⌋ and ⌈√k⌉, respectively; the circles mark the number of matrix multiplications required to evaluate polynomials of optimal degree (in the sense of Definition 3.1) for the Paterson–Stockmeyer method.
Figure 3.2  Number of matrix multiplications required to evaluate a rational function of order [k/k], for k between 1 and 50, by means of the scheme (3.15), for s = ⌊√(2k)⌋ and s = ⌈√(2k)⌉. The dotted and dashed lines mark the values of k that are integer multiples of ⌊√(2k)⌋ and ⌈√(2k)⌉, respectively; the circles mark the number of matrix multiplications required to evaluate rational matrix functions of optimal order (in the sense of Definition 3.1) for the evaluation scheme (3.15).
Figure 3.3  Number of matrix multiplications required to evaluate the [k/k] Padé approximant to the matrix exponential, for k between 1 and 50, by means of the scheme (3.20), for s = ⌊√(k − 1/2)⌋ and s = ⌈√(k − 1/2)⌉. The dotted and dashed lines mark the values of k for which (k − 1)/2 is an integer multiple of ⌊√(k − 1/2)⌋ and ⌈√(k − 1/2)⌉, respectively; the circles mark the number of matrix multiplications required to evaluate the diagonal Padé approximants to the matrix exponential of optimal order (in the sense of Definition 3.1) for the evaluation scheme (3.20).
Figure 4.1  Relative forward error of methods for the solution of rational matrix equations.
Figure 4.2  Forward error of the Schur algorithm.
Figure 5.1  Forward error of algorithms for the matrix exponential in double precision.
Figure 5.2  Forward error of algorithms for the matrix exponential in arbitrary precision.
Figure 6.1  Comparison of old and new bounds on the forward error of diagonal Padé approximants to the matrix logarithm.
Figure 6.2  Forward error and computational cost of algorithms for the matrix logarithm in double precision.
Figure 6.3  Forward error to unit roundoff ratio.
Figure 6.4  Forward error of algorithms for the matrix logarithm in arbitrary precision.
Figure 7.1  Parameters of convergence of Gaussian quadrature formulae for the inverse matrix square root.
Figure 7.2  Convergence profiles for quadrature and Krylov methods for the geometric mean.
Figure 7.3  Clusters in the adjacency matrices of the Wikipedia RfA signed network.
Figure 7.4  Execution time of algorithms for solving (A #_t B)x = v.

LIST OF TABLES

Table 4.1  Solutions to the equation r(X) = A in Test 4.2.
Table 5.1  Timings of algorithms for the matrix exponential in quadruple precision.
Table 6.1  Execution time of algorithms for the matrix logarithm in high precision.
Table 7.1  Summary of algorithms for the matrix geometric mean used in the experiments.
Table 7.2  Summary of matrices used in the experiments.
Table 7.3  Execution time of methods for the weighted geometric mean.

ABSTRACT

Functions of matrices arise in numerous applications, and their accurate and efficient evaluation is an important topic in numerical linear algebra. In this thesis, we explore methods to compute them reliably in arbitrary precision arithmetic: on the one hand, we develop theoretical tools that are needed to reduce the impact of the working precision at the algorithm design stage; on the other, we present new numerical algorithms for the evaluation of primary matrix functions and the solution of matrix equations in arbitrary precision environments. Many state-of-the-art algorithms for functions of matrices rely on polynomial or rational approximation, and reduce the computation of f(A) to the evaluation of a polynomial or rational function at the matrix argument A.
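To make this reduction concrete, here is a minimal NumPy sketch (an illustration only, not code from the thesis) that evaluates the degree-k truncated Taylor approximant to the matrix exponential at a matrix argument A by Horner's scheme:

```python
import math

import numpy as np


def taylor_exp(A, k):
    """Degree-k truncated Taylor approximant to exp evaluated at the matrix A.

    Evaluates sum_{i=0}^{k} A**i / i! by Horner's scheme,
    using one matrix multiplication per coefficient.
    """
    n = A.shape[0]
    P = np.eye(n) / math.factorial(k)
    for i in range(k - 1, -1, -1):
        P = A @ P + np.eye(n) / math.factorial(i)
    return P
```

For a nilpotent argument the truncation becomes exact once k reaches the nilpotency index; in general, practical algorithms combine such approximants with additional techniques such as scaling and squaring.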
Most of the algorithms developed in this thesis are no exception, and thus we begin our investigation by revisiting the Paterson–Stockmeyer method, an algorithm that minimizes the number of nonscalar multiplications required to evaluate a polynomial of a given degree. We introduce the notion of optimal degree for an evaluation scheme, and derive formulae for the sequences of optimal degrees for the schemes used in practice to evaluate truncated Taylor and diagonal Padé approximants.

If the rational function r approximates f, then it is reasonable to expect that a solution to the matrix equation r(X) = A will approximate the functional inverse of f. In general, infinitely many matrices can satisfy an equation of this kind, and we propose a classification of the solutions that is of practical interest from a computational standpoint. We develop a precision-oblivious numerical algorithm to compute all the solutions that are of interest in practice, which behaves in a forward stable fashion.

After establishing these general techniques, we concentrate on the matrix exponential and its functional inverse, the matrix logarithm. We present a new scaling and squaring approach for computing the matrix exponential in high precision, which combines a new strategy for choosing the algorithmic parameters with a bound on the forward error of Padé approximants to the exponential. Then, we develop two algorithms, based on the inverse scaling and squaring method, for evaluating the matrix logarithm in arbitrary precision. The new algorithms rely on a new forward error bound for Padé approximants, which for highly nonnormal matrices can be considerably smaller than the classic bound of Kenney and Laub.
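The Paterson–Stockmeyer scheme mentioned above can be sketched as follows. This is a hedged NumPy illustration, not the thesis's implementation: it uses the common splitting parameter s ≈ √k, precomputes the powers A, A², ..., Aˢ, and then applies Horner's scheme in Aˢ to blocks of s coefficients, for a total of about s − 1 + ⌊k/s⌋ nonscalar multiplications.

```python
import math

import numpy as np


def paterson_stockmeyer(b, A):
    """Evaluate p(A) = sum_i b[i] * A**i by the Paterson-Stockmeyer scheme."""
    k = len(b) - 1
    s = max(1, math.isqrt(k))  # splitting parameter, roughly sqrt(k)
    n = A.shape[0]
    # Precompute I, A, A^2, ..., A^s (costs s - 1 multiplications).
    pows = [np.eye(n), A]
    for _ in range(s - 1):
        pows.append(pows[-1] @ A)

    r = k // s  # index of the highest block of coefficients

    def block(j):
        # Inner polynomial sum_i b[j*s + i] * A^i with i < s (clipped at degree k).
        top = min(s - 1, k - j * s)
        return sum(b[j * s + i] * pows[i] for i in range(top + 1))

    # Horner's scheme in A^s over the coefficient blocks (costs r multiplications).
    P = block(r)
    for j in range(r - 1, -1, -1):
        P = P @ pows[s] + block(j)
    return P
```

The thesis goes further and characterizes for which degrees k this kind of scheme attains its minimal multiplication count, which is what the notion of optimal degree captures.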
Our experimental results show that in double precision arithmetic the new approaches are comparable with the state-of-the-art algorithm for computing the matrix logarithm, and experiments in higher precision support the conclusion that the new algorithms behave in a forward stable way, typically outperforming existing alternatives.

Finally, we consider a problem of the form f(A)b, and focus on methods for computing the action of the weighted geometric mean of two large and sparse positive definite matrices on a vector. We present two new approaches based on numerical quadrature, compare them with several methods based on Krylov subspaces in terms of both accuracy and efficiency, and show which algorithms are better suited to a black-box approach. In addition, we show how these methods can be employed to solve a problem that arises in applications, namely the solution of linear systems whose coefficient matrix is a weighted geometric mean.

DECLARATION

No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

COPYRIGHT STATEMENT

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties.