
arXiv:1912.08805v3 [math.NA] 13 Sep 2020

Pseudospectral Shattering, the Sign Function, and Diagonalization in Nearly Matrix Multiplication Time

Jess Banks∗    Jorge Garza-Vargas    Archit Kulkarni∗    Nikhil Srivastava†
UC Berkeley
[email protected]  [email protected]  [email protected]  [email protected]

September 15, 2020

Abstract

We exhibit a randomized algorithm which, given a square matrix A ∈ ℂ^{n×n} with ‖A‖ ≤ 1 and δ > 0, computes with high probability an invertible V and diagonal D such that ‖A − VDV⁻¹‖ ≤ δ, in O(T_MM(n) log²(n/δ)) arithmetic operations on a floating point machine with O(log⁴(n/δ) log n) bits of precision. The computed similarity V additionally satisfies ‖V‖‖V⁻¹‖ ≤ O(n^{2.5}/δ). Here T_MM(n) is the number of arithmetic operations required to multiply two n × n complex matrices numerically stably, known to satisfy T_MM(n) = O(n^{ω+η}) for every η > 0, where ω is the exponent of matrix multiplication [DDHK07]. The algorithm is a variant of the spectral bisection algorithm in numerical linear algebra [BJD74] with a crucial Gaussian perturbation preprocessing step. Our running time is optimal up to polylogarithmic factors, in the sense that verifying that a given similarity diagonalizes a matrix requires at least matrix multiplication time. It significantly improves the previously best known provable running times of O(n^{10}/δ²) arithmetic operations for diagonalization of general matrices [ABB+18] and (with regards to the dependence on n) O(n³) arithmetic operations for Hermitian matrices [Par98], and is the first algorithm to achieve nearly matrix multiplication time for diagonalization in any model of computation (real arithmetic, rational arithmetic, or finite arithmetic).

The proof rests on two new ingredients. (1) We show that adding a small complex Gaussian perturbation to any matrix splits its pseudospectrum into n small well-separated components. In particular, this implies that the eigenvalues of the perturbed matrix have a large minimum gap, a property of independent interest in random matrix theory. (2) We give a rigorous analysis of Roberts' Newton iteration method [Rob80] for computing the sign function of a matrix in finite arithmetic, itself an open problem in numerical analysis since at least 1986 [Bye86]. This is achieved by controlling the evolution of the pseudospectra of the iterates using a carefully chosen sequence of shrinking contour integrals in the complex plane.

∗Supported by the NSF Graduate Research Fellowship Program under Grant DGE-1752814.
†Supported by NSF Grant CCF-1553751.

Contents

1 Introduction
  1.1 Problem Statement
    1.1.1 Accuracy and Conditioning
    1.1.2 Models of Computation
  1.2 Results and Techniques
  1.3 Related Work

2 Preliminaries
  2.1 Spectral Projectors and Holomorphic Functional Calculus
  2.2 Pseudospectrum and Spectral Stability
  2.3 Finite-Precision Arithmetic
  2.4 Sampling Gaussians in Finite Precision
  2.5 Black-box Error Assumptions for Multiplication, Inversion, and QR

3 Pseudospectral Shattering
  3.1 Smoothed Analysis of Gap and Eigenvector Condition Number
  3.2 Shattering

4 Matrix Sign Function
  4.1 Circles of Apollonius
  4.2 Exact Arithmetic
  4.3 Finite Arithmetic

5 Spectral Bisection Algorithm
  5.1 Proof of Theorem 5.5

6 Conclusion and Open Questions

A Deferred Proofs from Section 4

B Analysis of SPLIT

C Analysis of DEFLATE
  C.1 Smallest Singular Value of the Corner of a Haar Unitary
  C.2 Sampling Haar Unitaries in Finite Precision
  C.3 Preliminaries of RURV
  C.4 Exact Arithmetic Analysis of DEFLATE
  C.5 Finite Arithmetic Analysis of DEFLATE

1 Introduction

We study the algorithmic problem of approximately finding all of the eigenvalues and eigenvectors of a given arbitrary complex matrix. While this problem is quite well-understood in the special case of Hermitian n × n matrices (see, e.g., [Par98]), the general non-Hermitian case has remained mysterious from a theoretical standpoint even after several decades of research. In

particular, the currently best known provable algorithms for this problem run in time O(n^{10}/δ²) [ABB+18] or O(n^c log(1/δ)) [Cai94] with c ≥ 12, where δ > 0 is an error parameter, depending on the model of computation and notion of approximation considered.¹ To be sure, the non-Hermitian case is well-motivated: coupled systems of differential equations, linear dynamical systems in control theory, transfer operators in mathematical physics, and the nonbacktracking matrix in spectral graph theory are but a few situations where finding the eigenvalues and eigenvectors of a non-normal matrix is important.

The key difficulties in dealing with non-normal matrices are the interrelated phenomena of non-orthogonal eigenvectors and spectral instability, the latter referring to extreme sensitivity of the eigenvalues and invariant subspaces to perturbations of the matrix. Non-orthogonality slows down convergence of standard algorithms such as the power method, and spectral instability can force the use of very high precision arithmetic, also leading to slower algorithms. Both phenomena together make it difficult to reduce the eigenproblem to a subproblem by "removing" an eigenvector or invariant subspace, since this can only be done approximately and one must control the spectral stability of the subproblem.

In this paper, we overcome these difficulties by identifying and leveraging a phenomenon we refer to as pseudospectral shattering: adding a small complex Gaussian perturbation to any matrix yields a matrix with well-conditioned eigenvectors and a large minimum gap between the eigenvalues, implying spectral stability. This result builds on the recent solution of Davies' conjecture [BKMS19], and is of independent interest in random matrix theory, where minimum eigenvalue gap bounds in the non-Hermitian case were previously only known for i.i.d. models [SJ12, Ge17].
We complement the above by proving that a variant of the well-known spectral bisection algorithm in numerical linear algebra [BJD74] is both fast and numerically stable (i.e., it can be implemented using a polylogarithmic number of bits of precision) when run on a pseudospectrally shattered matrix. The key step in the bisection algorithm is computing the sign function of a matrix, a problem of independent interest in many areas including control theory and approximation theory [KL95]. Our main algorithmic contribution is a rigorous analysis of the well-known Newton iteration method [Rob80] for computing the sign function in finite arithmetic, showing that it converges quickly and numerically stably on matrices for which the sign function is well-conditioned, in particular on pseudospectrally shattered ones.

The end result is an algorithm which reduces the general diagonalization problem to a polylogarithmic (in the desired accuracy and the dimension n) number of invocations of standard numerical linear algebra routines (multiplication, inversion, and QR factorization), each of which is reducible to matrix multiplication [DDH07], yielding a nearly matrix multiplication runtime for the

1A detailed discussion of these and other related results appears in Section 1.3.

whole algorithm. This improves on the previously best known running time of O(n³ + n² log(1/δ)) arithmetic operations even in the Hermitian case [Par98].

We now proceed to give precise mathematical formulations of the eigenproblem and computational model, followed by statements of our results and a detailed discussion of related work.

1.1 Problem Statement

An eigenpair of a matrix A ∈ ℂ^{n×n} is a tuple (λ, v) ∈ ℂ × ℂⁿ such that

Av = λv,

and v is normalized to be a unit vector. The eigenproblem is the problem of finding a maximal set of linearly independent eigenpairs (λᵢ, vᵢ) of a given matrix A; note that an eigenvalue may appear more than once if it has geometric multiplicity greater than one. In the case when A is diagonalizable, the solution consists of exactly n eigenpairs, and if A has distinct eigenvalues then the solution is unique, up to the phases of the vᵢ.

1.1.1 Accuracy and Conditioning

Due to the Abel-Ruffini theorem, it is impossible to have a finite-time algorithm which solves the eigenproblem exactly using arithmetic operations and radicals. Thus, all we can hope for is approximate eigenvalues and eigenvectors, up to a desired accuracy δ > 0. There are two standard notions of approximation. We assume ‖A‖ ≤ 1 for normalization throughout this work, where ‖⋅‖ denotes the spectral norm (the ℓ² → ℓ² operator norm).

Forward Approximation. Compute pairs (λᵢ′, vᵢ′) such that

|λᵢ − λᵢ′| ≤ δ  and  ‖vᵢ − vᵢ′‖ ≤ δ

for the true eigenpairs (λᵢ, vᵢ), i.e., find a solution close to the exact solution. This makes sense in contexts where the exact solution is meaningful; e.g., the matrix is of theoretical/mathematical origin, and unstable (in the entries) quantities such as eigenvalue multiplicity can have a significant meaning.

Backward Approximation. Compute (λᵢ′, vᵢ′) which are the exact eigenpairs of a matrix A′ satisfying

‖A′ − A‖ ≤ δ,

i.e., find the exact solution to a nearby problem. This is the appropriate and standard notion in scientific computing, where the matrix is of physical or empirical origin and is not assumed to be known exactly (and even if it were, roundoff error would destroy this exactness). Note that since diagonalizable matrices are dense in ℂ^{n×n}, one can hope to always find a complete set of eigenpairs for some nearby A′ = VDV⁻¹, yielding an approximate diagonalization of A:

‖A − VDV⁻¹‖ ≤ δ.  (1)

Note that the eigenproblem in either of the above formulations is not easily reducible to the problem of computing eigenvalues, since they can only be computed approximately and it is not clear how to obtain approximate eigenvectors from approximate eigenvalues. We now introduce a condition number for the eigenproblem, which measures the sensitivity of the eigenpairs of a matrix to perturbations and allows us to relate its forward and backward approximate solutions.
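In this language, the output of a standard dense eigensolver can be checked as an approximate diagonalization in the sense of (1). The snippet below is an illustrative check only (it is not the paper's algorithm EIG): it measures the backward error ‖A − VDV⁻¹‖ for the factors returned by NumPy.

```python
import numpy as np

# Illustrative check of an approximate diagonalization in the sense of (1):
# given (V, D) from a dense eigensolver, measure ||A - V D V^{-1}|| in the
# spectral norm. (Not the paper's algorithm; just the error notion.)
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) / np.sqrt(n)  # a random test matrix, ||A|| = O(1)

evals, V = np.linalg.eig(A)
D = np.diag(evals)
backward_error = np.linalg.norm(A - V @ D @ np.linalg.inv(V), ord=2)
```

For a generic matrix like this one, `backward_error` is within a modest multiple of machine precision times the conditioning of V, illustrating that the solver produces an exact diagonalization of a nearby A′.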

Condition Numbers. For diagonalizable A, the eigenvector condition number of A, denoted κ_V(A), is defined as:

κ_V(A) := inf_V ‖V‖ ‖V⁻¹‖,  (2)

where the infimum is over all invertible V such that A = VDV⁻¹ for some diagonal D,

and its minimum eigenvalue gap is defined as:

gap(A) := min_{i≠j} |λᵢ(A) − λⱼ(A)|,

where λᵢ(A) are the eigenvalues of A (with multiplicity). We define the condition number of the eigenproblem to be²:

κ_eig(A) := κ_V(A) / gap(A) ∈ [0, ∞].  (3)

It follows from the following proposition (whose proof appears in Section 2.2) that a δ-backward approximate solution of the eigenproblem is a 6nδκ_eig(A)-forward approximate solution.³

Proposition 1.1. If ‖A‖, ‖A′‖ ≤ 1, ‖A − A′‖ ≤ δ, and {(vᵢ, λᵢ)}_{i≤n}, {(vᵢ′, λᵢ′)}_{i≤n} are eigenpairs of A, A′ with distinct eigenvalues, and δ < gap(A)/(8κ_V(A)), then

‖vᵢ − vᵢ′‖ ≤ 6nκ_eig(A)δ  and  |λᵢ − λᵢ′| ≤ κ_V(A)δ ≤ 2κ_eig(A)δ  ∀i = 1, …, n,  (4)

after possibly multiplying the vᵢ by phases. Note that κ_eig = ∞ if and only if A has a double eigenvalue; in this case, a relation like (4) is not possible since different infinitesimal changes to A can produce macroscopically different eigenpairs.

In this paper we will present a backward approximation algorithm for the eigenproblem with running time scaling polynomially in log(1/δ), which by (4) yields a forward approximation algorithm with running time scaling polynomially in log(nκ_eig/δ).
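These quantities are easy to estimate numerically. The helper below (a hypothetical name, not from the paper) bounds κ_V(A) using the eigenvector matrix returned by a dense solver (the infimum in (2) is over all diagonalizing V, so this yields only an upper bound), and evaluates gap(A) and the corresponding bound on κ_eig(A) from (3).

```python
import numpy as np

def eig_condition_numbers(A):
    """Estimate the quantities in (2) and (3).

    kappa_V is UPPER-bounded via the eigenvector matrix returned by
    np.linalg.eig (the definition takes an infimum over all diagonalizing
    V, which this does not optimize); gap(A) is computed exactly from the
    computed eigenvalues."""
    evals, V = np.linalg.eig(A)
    kappa_V = np.linalg.cond(V, 2)  # ||V|| ||V^{-1}|| in the spectral norm
    diffs = np.abs(evals[:, None] - evals[None, :])
    gap = diffs[~np.eye(len(evals), dtype=bool)].min()
    return kappa_V, gap, kappa_V / gap

# A normal matrix attains kappa_V = 1; here gap = 1 and kappa_eig = 1.
kV, g, keig = eig_condition_numbers(np.diag([1.0, 2.0, 4.0]))
```

For non-normal matrices the returned κ_V bound can be far above 1, which is exactly the regime where the smoothed analysis of Section 3 becomes relevant.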

Remark 1.2 (Multiple Eigenvalues). A backward approximation algorithm for the eigenproblem can be used to accurately find bases for the eigenspaces of matrices with multiple eigenvalues, but quantifying the forward error requires introducing condition numbers for invariant subspaces rather than eigenpairs. A standard treatment of this can be found in any numerical linear algebra textbook, e.g. [Dem97], and we do not discuss it further in this paper for simplicity of exposition.

²This quantity is inspired by but not identical to the "reciprocal of the distance to ill-posedness" for the eigenproblem considered by Demmel [Dem87], to which it is polynomially related.
³In fact, it can be shown that κ_eig(A) is related by a poly(n) factor to the smallest constant for which (4) holds for all sufficiently small δ > 0.

1.1.2 Models of Computation

These questions may be studied in various computational models: exact real arithmetic (i.e., infinite precision), variable precision rational arithmetic (rationals are stored exactly as numerators and denominators), and finite precision arithmetic (real numbers are rounded to a fixed number of bits which may depend on the input size and accuracy). Only the last two models yield actual Boolean complexity bounds, but they introduce a second source of error stemming from the fact that computers cannot exactly represent real numbers. We study the third model in this paper, axiomatized as follows.

Finite Precision Arithmetic. We use the standard axioms from [Hig02]. Numbers are stored and manipulated approximately up to some machine precision u := u(δ, n) > 0, which for us will depend on the instance size n and desired accuracy δ. This means every number x ∈ ℂ is stored as fl(x) = (1 + Δ)x for some adversarially chosen Δ ∈ ℂ satisfying |Δ| ≤ u, and each arithmetic operation ◦ ∈ {+, −, ×, ÷} is guaranteed to yield an output satisfying

fl(x ◦ y) = (x ◦ y)(1 + Δ),  |Δ| ≤ u.

It is also standard and convenient to assume that we can evaluate √x for any x ∈ ℝ, where again fl(√x) = √x (1 + Δ) for |Δ| ≤ u. Thus, the outcomes of all operations are adversarially noisy due to roundoff. The bit lengths of numbers stored in this form remain fixed at lg(1/u), where lg denotes the logarithm base 2. The bit complexity of an algorithm is therefore the number of arithmetic operations times O*(log(1/u)), the running time of standard floating point arithmetic, where the * suppresses log log(1/u) factors. We will state all running times in terms of arithmetic operations accompanied by the required number of bits of precision, which thereby immediately imply bit complexity bounds.
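For example, IEEE double precision corresponds to u = 2⁻⁵³, and both the rounding behavior and the multiplicative-error axiom can be observed directly (a small sanity check, not part of the paper's development):

```python
import fractions

u = 2.0 ** -53  # unit roundoff of IEEE double precision

# fl(1 + u) rounds back to 1 under round-to-nearest-even, while fl(1 + 2u) does not:
assert 1.0 + u == 1.0
assert 1.0 + 2 * u > 1.0

# A single division returns fl(x / y) = (x / y)(1 + Delta) with |Delta| <= u.
# Exact rational arithmetic lets us measure Delta for x = 1, y = 3:
x, y = 1.0, 3.0
exact = fractions.Fraction(x) / fractions.Fraction(y)
delta = (fractions.Fraction(x / y) - exact) / exact
assert abs(delta) <= fractions.Fraction(1, 2 ** 53)
```

The last assertion is exactly the axiom above for ◦ = ÷, verified in exact arithmetic.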

Remark 1.3 (Overflow, Underflow, and Additive Error). Using p bits for the exponent in the floating-point representation allows one to represent numbers with magnitude in the range [2^{−2^p}, 2^{2^p}]. It can be easily checked that all of the nonzero numbers, norms, and condition numbers appearing during the execution of our algorithms lie in the range [2^{−lg^c(n/δ)}, 2^{lg^c(n/δ)}] for some small constant c, so overflow and underflow do not occur. In fact, we could have analyzed our algorithm in a computational model where every number is simply rounded to the nearest rational with denominator 2^{lg^c(n/δ)}, corresponding to additive arithmetic errors. We have chosen to use the multiplicative error floating point model since it is the standard in numerical analysis, but our algorithms do not exploit any subtleties arising from the difference between the two models.

The advantages of the floating point model are that it is realistic and potentially yields very fast algorithms by using a small number of bits of precision (polylogarithmic in n and 1/δ), in contrast to rational arithmetic, where even a simple operation such as inverting an n × n integer matrix requires n extra bits of precision (see, e.g., Chapter 1 of [GLS12]). An iterative algorithm that can be implemented in finite precision (typically, with a number of bits polylogarithmic in the input size and desired accuracy) is called numerically stable, and corresponds to a dynamical system whose trajectory to the approximate solution is robust to adversarial noise (see, e.g., [Sma97]).

6 The disadvantage of the model is that it is only possible to compute forward approximations of quantities which are well-conditioned in the input — in particular, discontinuous quantities such as eigenvalue multiplicity cannot be computed in the floating point model, since it is not even assumed that the input is stored exactly.

1.2 Results and Techniques

In addition to κ_eig, we will need some more refined quantities measuring the stability of the eigenvalues and eigenvectors of a matrix to perturbations, in order to state our results. The most important of these is the ε-pseudospectrum, defined for any ε > 0 and M ∈ ℂ^{n×n} as:

Λ_ε(M) := {λ ∈ ℂ : λ ∈ Λ(M + E) for some ‖E‖ < ε}  (5)
        = {λ ∈ ℂ : ‖(λ − M)⁻¹‖ > 1/ε},  (6)

where Λ(⋅) denotes the spectrum of a matrix. The equivalence of (5) and (6) is simple and can be found in the excellent book [TE05].
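Characterization (6) gives a practical membership test, since ‖(λ − M)⁻¹‖ = 1/σ_min(λI − M). The sketch below (illustrative, not from the paper) tests whether a point lies in Λ_ε(M):

```python
import numpy as np

def in_pseudospectrum(lam, M, eps):
    """Test lam in Lambda_eps(M) via (6): ||(lam - M)^{-1}|| > 1/eps,
    equivalently sigma_min(lam*I - M) < eps."""
    n = M.shape[0]
    smin = np.linalg.svd(lam * np.eye(n) - M, compute_uv=False)[-1]
    return smin < eps

# For a normal matrix, Lambda_eps is exactly the union of eps-disks
# around the eigenvalues:
M = np.diag([0.0, 1.0])
print(in_pseudospectrum(0.05, M, 0.1))  # True: distance 0.05 to eigenvalue 0
print(in_pseudospectrum(0.5, M, 0.1))   # False: distance 0.5 to both eigenvalues
```

For non-normal M the set Λ_ε(M) can be far larger than a union of ε-disks, which is precisely the instability that pseudospectral shattering controls.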

Eigenvalue Gaps, κ_V, and Pseudospectral Shattering. The key probabilistic result of the paper is that a random complex Gaussian perturbation of any matrix yields a nearby matrix with large minimum eigenvalue gap and small κ_V.

Theorem 1.4 (Smoothed Analysis of gap and κ_V). Suppose A ∈ ℂ^{n×n} with ‖A‖ ≤ 1, and γ ∈ (0, 1/2). Let G_n be an n × n matrix with i.i.d. complex Gaussian N(0, 1_ℂ/n) entries, and let X := A + γG_n. Then

κ_V(X) ≤ n²/γ,  gap(X) ≥ γ⁴/n⁵,  and  ‖G_n‖ ≤ 4,

with probability at least 1 − 12/n².

The proof of Theorem 1.4 appears in Section 3.1. The key idea is to first control κ_V(X) using [BKMS19], and then observe that for a matrix X with small κ_V, two eigenvalues of X near a complex number z imply a small second-least singular value of z − X, which we are able to control.

In Section 3.2 we develop the notion of pseudospectral shattering, which is implied by Theorem 1.4 and says roughly that the pseudospectrum consists of n components that lie in separate squares of an appropriately coarse grid in the complex plane. This is useful in the analysis of the spectral bisection algorithm in Section 5.
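The phenomenon in Theorem 1.4 is easy to observe numerically. The sketch below (illustrative only; the parameters are not those of the theorem, and `np.linalg.eig` yields only an upper bound on κ_V) perturbs a nilpotent Jordan block, which has gap(A) = 0 and κ_V(A) = ∞:

```python
import numpy as np

rng = np.random.default_rng(1)
n, gamma = 16, 1e-3

A = np.diag(np.ones(n - 1), 1)  # nilpotent Jordan block: gap(A) = 0, kappa_V = inf
# i.i.d. complex Gaussian entries with E|G_ij|^2 = 1/n:
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
X = A + gamma * G

evals, V = np.linalg.eig(X)
diffs = np.abs(evals[:, None] - evals[None, :])
gap_X = diffs[~np.eye(n, dtype=bool)].min()       # minimum eigenvalue gap of X
kappa_V_upper = np.linalg.cond(V, 2)              # upper bound on kappa_V(X)
```

After the perturbation the eigenvalues spread out (for a Jordan block, roughly onto a circle of radius γ^{1/n}), so `gap_X` is genuinely positive and `kappa_V_upper` is finite, in the spirit of the theorem.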

Matrix Sign Function. The sign function of a number z ∈ ℂ with Re(z) ≠ 0 is defined as +1 if Re(z) > 0 and −1 if Re(z) < 0. The matrix sign function of a matrix A with Jordan normal form

A = V [ N 0 ; 0 P ] V⁻¹,

where N (resp. P) has eigenvalues with strictly negative (resp. positive) real part, is defined as

sgn(A) = V [ −I_N 0 ; 0 I_P ] V⁻¹,

where I_N and I_P denote identity matrices of the same size as N and P, respectively. The sign function is undefined for matrices with eigenvalues on the imaginary axis. Quantifying this discontinuity, Bai and Demmel [BD98] defined the following condition number for the sign function:

κ_sign(M) := inf{ 1/ε² : Λ_ε(M) does not intersect the imaginary axis },  (7)

and gave perturbation bounds for sgn(M) depending on κ_sign(M).

Roberts [Rob80] showed that the simple iteration

A_{k+1} = (A_k + A_k⁻¹)/2  (8)

converges globally and quadratically to sgn(A) in exact arithmetic, but his proof relies on the fact that all iterates of the algorithm are simultaneously diagonalizable, a property which is destroyed in finite arithmetic since inversions can only be done approximately.⁴ In Section 4 we show that this iteration is indeed convergent when implemented in finite arithmetic for matrices with small κ_sign, given a numerically stable matrix inversion algorithm. This leads to the following result:

Theorem 1.5 (Sign Function Algorithm). There is a deterministic algorithm SGN which on input an n × n matrix A with ‖A‖ ≤ 1, a number K with K ≥ κ_sign(A), and a desired accuracy β ∈ (0, 1/12), outputs an approximation SGN(A) with

‖SGN(A) − sgn(A)‖ ≤ β,

in

O((log K + log log(1/β)) T_INV(n))  (9)

arithmetic operations on a floating point machine with

lg(1/u) = O(log n log³K (log K + log(1/β)))

bits of precision, where T_INV(n) denotes the number of arithmetic operations used by a numerically stable matrix inversion algorithm (satisfying Definition 2.7).

The main new idea in the proof of Theorem 1.5 is to control the evolution of the pseudospectra Λ_{ε_k}(A_k) of the iterates with appropriately decreasing (in k) parameters ε_k, using a sequence of carefully chosen shrinking contour integrals in the complex plane. The pseudospectrum provides a richer induction hypothesis than scalar quantities such as condition numbers, and allows one to control all quantities of interest using the holomorphic functional calculus. This technique is introduced in Sections 4.1 and 4.2, and carried out in finite arithmetic in Section 4.3, yielding Theorem 1.5.
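For concreteness, iteration (8) in plain double precision looks as follows. This toy version (hypothetical helper name) is not the paper's algorithm SGN, which additionally controls roundoff and uses the convergence analysis of Theorem 1.5 to choose the number of steps:

```python
import numpy as np

def newton_sign(A, iters=30):
    """Roberts' Newton iteration (8): A_{k+1} = (A_k + A_k^{-1}) / 2.

    A minimal sketch run in ordinary double precision; unlike the paper's
    SGN it has no roundoff control and a fixed iteration count."""
    X = A.astype(complex)
    for _ in range(iters):
        X = (X + np.linalg.inv(X)) / 2
    return X

# A has eigenvalues 2 and -1, which sgn maps to +1 and -1; here
# sgn(A) = [[1, 2/3], [0, -1]], and sgn(A) is an involution.
A = np.array([[2.0, 1.0],
              [0.0, -1.0]])
S = newton_sign(A)
```

One can check that S² = I and that S has the same invariant subspaces as A, which is exactly what the bisection algorithm of Section 5 exploits.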

Diagonalization by Spectral Bisection. Given an algorithm for computing the sign function, there is a natural and well-known approach to the eigenproblem pioneered in [BJD74]. The idea is

⁴Doing the inversions exactly in rational arithmetic could require numbers of bit length n^k for k iterations, which will typically not even be polynomial.

that the matrices (I ± sgn(A))/2 are spectral projectors onto the invariant subspaces corresponding to the eigenvalues of A in the left and right open half planes, so if some shifted matrix A + z or iA + z has roughly half its eigenvalues in each half plane, the problem can be reduced to two smaller subproblems appropriate for recursion.

The two difficulties in carrying out the above approach are: (a) efficiently computing the sign function, and (b) finding a balanced splitting along an axis that is well-separated from the spectrum. These are nontrivial even in exact arithmetic, since the iteration (8) converges slowly if (b) is not satisfied, even without roundoff error. We use Theorem 1.4 to ensure that a good splitting always exists after a small Gaussian perturbation of order γ, and Theorem 1.5 to compute splittings efficiently in finite precision. Combining this with well-understood techniques such as rank-revealing QR factorization, we obtain the following theorem, whose proof appears in Section 5.1.

Theorem 1.6 (Backward Approximation Algorithm). There is a randomized algorithm EIG which on input any matrix A ∈ ℂ^{n×n} with ‖A‖ ≤ 1 and a desired accuracy parameter δ > 0 outputs a diagonal D and an invertible V such that

‖A − VDV⁻¹‖ ≤ δ  and  κ(V) ≤ 32 n^{2.5}/δ

in

O(T_MM(n) log²(n/δ))

arithmetic operations on a floating point machine with

O(log⁴(n/δ) log n)

bits of precision, with probability at least 1 − 1/n − 12/n². Here T_MM(n) refers to the running time of a numerically stable matrix multiplication algorithm (detailed in Section 2.5).

Since there is a correspondence in terms of the condition number between backward and forward approximations, and as is customary in numerical analysis, our discussion revolves around backward approximation guarantees. For the convenience of the reader, we write down below

the explicit guarantees that one gets by using (4) and invoking EIG with accuracy δ/(6nκ_eig).

Corollary 1.7 (Forward Approximation Algorithm). There is a randomized algorithm which on input any matrix A ∈ ℂ^{n×n} with ‖A‖ ≤ 1, a desired accuracy parameter δ > 0, and an estimate K ≥ κ_eig(A), outputs a forward approximate solution to the eigenproblem for A in

O(T_MM(n) log²(nK/δ))

arithmetic operations on a floating point machine with

O(log⁴(nK/δ) log n)

bits of precision, with probability at least 1 − 1/n − 12/n². Here T_MM(n) refers to the running time of a numerically stable matrix multiplication algorithm (detailed in Section 2.5).
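The bisection step underlying EIG can be sketched as follows (hypothetical helper names; the paper's algorithm uses SGN together with a randomized rank-revealing QR, DEFLATE, rather than the plain SVD of the projector below, and recurses on both halves):

```python
import numpy as np

def matrix_sign(M, iters=40):
    # Roberts' Newton iteration (8), in exact-arithmetic spirit.
    X = M.astype(complex)
    for _ in range(iters):
        X = (X + np.linalg.inv(X)) / 2
    return X

def split_spectrum(A, z):
    """One spectral bisection step at the vertical line Re(w) = z.

    (I - sgn(A - z)) / 2 is the spectral projector for eigenvalues with
    Re < z; compressing A onto an orthonormal basis of its range and onto
    the orthogonal complement yields two smaller subproblems. Illustrative
    only: the paper uses SGN plus rank-revealing QR instead of the SVD."""
    n = A.shape[0]
    S = matrix_sign(A - z * np.eye(n))
    P_minus = (np.eye(n) - S) / 2
    k = int(round(np.trace(P_minus).real))  # number of eigenvalues with Re < z
    U = np.linalg.svd(P_minus)[0]
    Qm, Qp = U[:, :k], U[:, k:]             # basis of range(P_minus) and complement
    return Qm.conj().T @ A @ Qm, Qp.conj().T @ A @ Qp

# Example: a triangular matrix with eigenvalues {-2, -1, 1, 2}, split at Re = 0.
A = np.array([[-2.0, 1.0, 0.0, 0.0],
              [ 0.0,-1.0, 1.0, 0.0],
              [ 0.0, 0.0, 1.0, 1.0],
              [ 0.0, 0.0, 0.0, 2.0]])
A_minus, A_plus = split_spectrum(A, 0.0)
```

Because range(P_minus) is an invariant subspace of A, the compression onto the orthogonal complement is the trailing block of a block-triangularization, so `A_minus` and `A_plus` carry exactly the eigenvalues on each side of the splitting line.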

Remark 1.8 (Accuracy vs. Precision). The gold standard of "backward stability" in numerical analysis postulates that

log(1/u) = log(1/δ) + log(n),

i.e., the number of bits of precision is linear in the number of bits of accuracy. The relaxed notion of "logarithmic stability" introduced in [DDHK07] requires

log(1/u) = log(1/δ) + O(log^c(n) log(κ))

for some constant c, where κ is an appropriate condition number. In comparison, Theorem 1.6 obtains the weaker relationship

log(1/u) = O(log⁴(1/δ) log(n) + log⁵(n)),

which is still polylogarithmic in n in the regime δ = 1/poly(n).

1.3 Related Work

Minimum Eigenvalue Gap. The minimum eigenvalue gap of random matrices has been studied in the case of Hermitian and unitary matrices, beginning with the work of Vinson [Vin11], who proved an Ω(n^{−4/3}) lower bound on this gap in the case of the Gaussian Unitary Ensemble (GUE) and the Circular Unitary Ensemble (CUE). Bourgade and Ben Arous [AB13] derived exact limiting formulas for the distributions of all the gaps for the same ensembles. Nguyen, Tao, and Vu [NTV17] obtained non-asymptotic inverse polynomial bounds for a large class of non-integrable Hermitian models with i.i.d. entries (including Bernoulli matrices).

In a different direction, Aizenman et al. proved an inverse-polynomial bound [APS+17] in the case of an arbitrary Hermitian matrix plus a GUE matrix or a Gaussian Orthogonal Ensemble (GOE) matrix, which may be viewed as a smoothed analysis of the minimum gap. Theorem 3.6 may be viewed as a non-Hermitian analogue of the last result.

In the non-Hermitian case, Ge [Ge17] obtained an inverse polynomial bound for i.i.d. matrices with real entries satisfying some mild moment conditions, and [SJ12]⁵ proved an inverse polynomial lower bound for the complex Ginibre ensemble. Theorem 3.6 may be seen as a generalization of these results to non-centered complex Gaussian matrices.

Smoothed Analysis and Free Probability. The study of numerical algorithms on Gaussian random matrices (i.e., the case A = 0 of smoothed analysis) dates back to [VNG47, Sma85, Dem88, Ede88]. The powerful idea of improving the conditioning of a numerical computation by adding a small amount of Gaussian noise was introduced by Spielman and Teng in [ST04], in the context of the simplex algorithm. Sankar, Spielman, and Teng [SST06] showed that adding real Gaussian noise to any matrix yields a matrix with polynomially-bounded condition number; [BKMS19] can be seen as an extension of this result to the condition number of the eigenvector matrix, where the proof crucially requires that the Gaussian perturbation is complex rather than real. The main difference between our results and most of the results on smoothed analysis (including

5At the time of writing, the work [SJ12] is still an unpublished arXiv preprint.

[ABB+18]) is that our running time depends logarithmically rather than polynomially on the size of the perturbation.

The broad idea of regularizing the spectral instability of a nonnormal matrix by adding a random matrix can be traced back to the work of Śniady [Śni02] and Haagerup and Larsen [HL00] in the context of Free Probability theory.

Matrix Sign Function. The matrix sign function was introduced by Zolotarev in 1877. It became a popular topic in numerical analysis following the work of Beavers and Denman [BD73, BJD74, DBJ76] and Roberts [Rob80], who used it first to solve the algebraic Riccati and Lyapunov equations and then as an approach to the eigenproblem; see [KL95] for a broad survey of its early history. The numerical stability of Roberts' Newton iteration was investigated by Byers [Bye86], who identified some cases where it is and is not stable. Malyshev [Mal93], Byers, He, and Mehrmann [BHM97], Bai, Demmel, and Gu [BDG97], and Bai and Demmel [BD98] studied the condition number of the matrix sign function, and showed that if the Newton iteration converges then it can be used to obtain a high-quality invariant subspace⁶, but did not prove convergence in finite arithmetic and left this as an open question.⁷ The key issue in analyzing the convergence of the iteration is to bound the condition numbers of the intermediate matrices that appear, as N. Higham remarks in his 2008 textbook:

    Of course, to obtain a complete picture, we also need to understand the effect of rounding errors on the iteration prior to convergence. This effect is surprisingly difficult to analyze. . . . Since errors will in general occur on each iteration, the overall error will be a complicated function of sign(X_k) and E_k for all k. . . . We are not aware of any published rounding error analysis for the computation of sign(A) via the Newton iteration. –[Hig08, Section 5.7]

This is precisely the problem solved by Theorem 1.5, which is as far as we know the first provable algorithm for computing the sign function of an arbitrary matrix which does not require computing the Jordan form.

In the special case of Hermitian matrices, Higham [Hig94] established efficient reductions between the sign function and the polar decomposition.
Byers and Xu [BX08] proved backward stability of a certain scaled version of the Newton iteration for Hermitian matrices, in the context of computing the polar decomposition. Higham and Nakatsukasa [NH13] (see also the improvement [NF16]) proved backward stability of a different iterative scheme for computing the polar decomposition, and used it to give backward stable spectral bisection algorithms for the Hermitian eigenproblem with O(n³)-type complexity.

Non-Hermitian Eigenproblem, Floating Point Arithmetic. The eigenproblem has been thoroughly studied in the numerical analysis community, in the floating point model of computation. While there are provably fast and accurate algorithms in the Hermitian case (see the next subsection) and a large body of work for various structured matrices (see, e.g., [BCD+05]), the general

⁶This is called an a fortiori bound in numerical analysis.
⁷[BHM97] states: "A priori backward and forward error bounds for evaluation of the matrix sign function remain elusive."

Result         Error     Arithmetic Ops                 Boolean Ops                              Restrictions
[Par98]        Backward  n³ + n² log(1/δ)               n³ log(n/δ) + n² log(1/δ) log(n/δ)ᵃ      Hermitian
[ABB+18]       Backward  n¹⁰/δ²                         n¹⁰/δ² · polylog(n/δ)
[BOE18]        Backward  n^{ω+1} polylog(n) log(1/δ)    n^{ω+1} polylog(n) log(1/δ)              Hermitian
Theorem 1.6    Backward  T_MM(n)ᵇ log²(n/δ)             T_MM(n) log⁶(n/δ) log(n)
Corollary 1.7  Forward   T_MM(n) log²(nκ_eig/δ)         T_MM(n) log⁶(nκ_eig/δ) log(n)

ᵃ Does not specify a particular bound on precision.
ᵇ T_MM(n) = O(n^{ω+η}) for every η > 0, see Definition 2.6 for details.

Table 1: Results for finite-precision floating-point arithmetic

case is not nearly as well-understood. As recently as 1997, J. Demmel remarked in his well-known textbook [Dem97]: ". . . the problem of devising an algorithm [for the non-Hermitian eigenproblem] that is numerically stable and globally (and quickly!) convergent remains open."

Demmel's question remained entirely open until 2015, when it was answered in the following sense by Armentano, Beltrán, Bürgisser, Cucker, and Shub in the remarkable paper [ABB+18]. They exhibited an algorithm (see their Theorem 2.28) which given any A ∈ ℂ^{n×n} with ‖A‖ ≤ 1 and σ > 0 produces, in expected O(n⁹/σ²) arithmetic operations, the diagonalization of the nearby random perturbation A + σG, where G is a matrix with standard complex Gaussian entries. By setting σ sufficiently small, this may be viewed as a backward approximation algorithm for diagonalization, in that it solves a nearby problem essentially exactly⁸; in particular, by setting σ = δ/√n and noting that ‖G‖ = O(√n) with very high probability, their result implies a running time of O(n¹⁰/δ²) in our setting. Their algorithm is based on homotopy continuation methods, which they argue informally are numerically stable and can be implemented in finite precision arithmetic. Our algorithm is similar on a high level in that it adds a Gaussian perturbation to the input and then obtains a high accuracy forward approximate solution to the perturbed problem. The difference is that their overall running time depends polynomially rather than logarithmically on the accuracy δ desired with respect to the original unperturbed problem.

Other Models of Computation. If we relax the requirements further and ask for any provable algorithm in any model of Boolean computation, there is only one more positive result with a polynomial bound on the number of bit operations: Jin-Yi Cai showed in 1994 [Cai94] that given a rational n × n matrix A with integer entries of bit length a, one can find a δ-forward approximation to its Jordan Normal Form A = VJV⁻¹ in time poly(n, a, log(1/δ)), where the degree of the polynomial is at least 12. This algorithm works in the rational arithmetic model of computation, so it does not quite answer Demmel's question since it is not a numerically stable algorithm.
However, it enjoys the significant advantage of being able to compute forward approximations to discontinuous quantities such as the Jordan structure. As far as we are aware, there are no other published provably polynomial-time algorithms for the general eigenproblem. The two standard references for diagonalization appearing most

^8 The output of their algorithm is n vectors, on each of which Newton's method converges quadratically to an eigenvector; they refer to this as "approximation à la Smale".

Result    Model      Error       Arithmetic Ops                  Boolean Ops                              Restrictions
[Cai94]   Rational   Forward^a   poly(a, n, log(1/ε))            poly(a, n, log(1/ε))^b                   ^c
[PC99]    Rational   Forward     n^ω + n log log(1/ε)            n^{ω+1} a + n^2 log(1/ε) log log(1/ε)    Eigenvalues only^c
[LV16]    Finite     Forward     n^ω log(n) log(1/ε)             n^ω log^4(n) log^2(n/ε)                  Hermitian, λ_1 only

^a Actually computes the Jordan Normal Form. The degree of the polynomial is not specified, but is at least 12 in n.
^b In the bit operations, a denotes the bit length of the input entries.
^c Uses a custom bit representation of intermediate quantities.

Table 2: Results for other models of arithmetic

often in theoretical computer science papers do not meet this criterion. In particular, the widely cited work of Pan and Chen [PC99] proves that one can compute the eigenvalues of A in O(n^ω + n log log(1/ε)) (suppressing logarithmic factors) arithmetic operations by finding the roots of its characteristic polynomial, which becomes a bound of O(n^{ω+1} a + n^2 log(1/ε) log log(1/ε)) bit operations if the characteristic polynomial is computed exactly in rational arithmetic and the matrix has entries of bit length a. However, that paper does not give any bound on the time taken to find approximate eigenvectors from approximate eigenvalues, and states this as an open problem.^9 Finally, the important work of Demmel, Dumitriu, and Holtz [DDH07] (see also the followup [BDD10]), which we rely on heavily, does not claim to provably solve the eigenproblem either: it bounds the running time of one iteration of a specific algorithm, and shows that such an iteration can be implemented numerically stably, without proving any bound on the number of iterations required in general.

Hermitian Eigenproblem. For comparison, the eigenproblem for Hermitian matrices is much better understood. We cannot give a complete bibliography of this huge area, but mention one relevant landmark result: the work of Wilkinson [Wil68] and Hoffman-Parlett [HP78] in the 60's and 70's, which shows that the Hermitian eigenproblem can be solved with backward error δ in O(n^3 + n^2 log(1/δ)) arithmetic operations with O(log(n/δ)) bits of precision. There has also recently been renewed interest in this problem in the theoretical computer science community, with the goal of bringing the runtime close to O(n^ω): Louis and Vempala [LV16] show how to find a δ-approximation of just the largest eigenvalue in O(n^ω log^4(n) log^2(1/δ)) bit operations, and Ben-Or and Eldar [BOE18] give an O(n^{ω+1} polylog(n))-bit-operation algorithm for finding a 1/poly(n)-approximate diagonalization of an n×n Hermitian matrix normalized to have ‖A‖ ≤ 1.

Remark 1.9 (Davies' Conjecture). The beautiful paper [Dav07] introduced the idea of approximating a matrix function f(A) for nonnormal A by f(A + E) for some well-chosen E regularizing the eigenvectors of A. This directly inspired our approach to solving the eigenproblem via regularization.

^9 "The remaining nontrivial problems are, of course, the estimation of the above output precision p [sufficient for finding an approximate eigenvector from an approximate eigenvalue], . . . . We leave these open problems as a challenge for the reader." - [PC99, Section 12].

The existence of an approximate diagonalization (1) for every A with a well-conditioned similarity V (i.e., κ(V) depending polynomially on 1/δ and n) was precisely the content of Davies' conjecture [Dav07], which was recently solved by some of the authors and Mukherjee in [BKMS19]. The existence of such a V is a prerequisite for proving that one can always efficiently find an approximate diagonalization in finite arithmetic, since if ‖V‖‖V^{-1}‖ is very large it may require many bits of precision to represent. Thus, Theorem 1.6 can be viewed as an efficient algorithmic answer to Davies' question.

Reader Guide. This paper contains a lot of parameters and constants. On first reading, it may be good to largely ignore the constants not appearing in exponents, and to keep in mind the typical setting δ = 1/poly(n) for the accuracy, in which case the important auxiliary parameters ω, 1−α, β, ε are all 1/poly(n), and the machine precision is log(1/u) = polylog(n).

2 Preliminaries

Let M ∈ ℂ^{n×n} be a complex matrix, not necessarily normal. We will write matrices and vectors with uppercase and lowercase letters, respectively. Let us denote by Λ(M) the spectrum of M and by λ_i(M) its individual eigenvalues. In the same way we denote the singular values of M by σ_i(M), and we adopt the convention σ_1(M) ≥ σ_2(M) ≥ ⋯ ≥ σ_n(M). When M is clear from the context we will simplify notation and just write Λ, λ_i, or σ_i, respectively.

Recall that the operator norm of M is

    ‖M‖ = σ_1(M) = sup_{‖x‖=1} ‖Mx‖.

As usual, we will say that M is diagonalizable if it can be written as M = VDV^{-1} for some diagonal matrix D whose diagonal entries are the eigenvalues of M. In this case we have the spectral expansion

    M = ∑_{i=1}^n λ_i v_i w_i^*,    (10)

where the right and left eigenvectors v_i and w_j^* are the columns and rows of V and V^{-1} respectively, normalized so that w_i^* v_i = 1.

2.1 Spectral Projectors and Holomorphic Functional Calculus

Let M ∈ ℂ^{n×n}, with eigenvalues λ_1, ..., λ_n. We say that a matrix P is a spectral projector for M if MP = PM and P^2 = P. For instance, each of the terms v_i w_i^* appearing in the spectral expansion (10) is a spectral projector, as M v_i w_i^* = λ_i v_i w_i^* = v_i w_i^* M and w_i^* v_i = 1. If Γ_i is a simple closed positively oriented rectifiable curve in the complex plane separating λ_i from the rest of the spectrum, then it is well-known that

    v_i w_i^* = (1/2πi) ∮_{Γ_i} (z − M)^{-1} dz,

by taking the Jordan normal form of the resolvent (z − M)^{-1} and applying Cauchy's integral formula.
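This contour integral is easy to approximate numerically by discretizing a circle around the target eigenvalue with the trapezoidal rule. The sketch below is our own illustration (the test matrix and the radius are arbitrary choices, assumed to separate the eigenvalue from the rest of the spectrum), not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = np.diag([0.0, 2.0, 3.0, 4.0]) + 0.1 * rng.standard_normal((n, n))

evals, V = np.linalg.eig(M)
W = np.linalg.inv(V)                        # rows are left eigenvectors, w_i^* v_i = 1
i = int(np.argmin(np.abs(evals)))           # target the eigenvalue near 0
P_exact = np.outer(V[:, i], W[i, :])        # the rank-one projector v_i w_i^*

# (1/2*pi*i) * contour integral of (z - M)^{-1} over a circle of radius 0.8
# around the eigenvalue, discretized by the trapezoidal rule (which converges
# geometrically for periodic analytic integrands).
center, radius, m = evals[i], 0.8, 200
P = np.zeros((n, n), dtype=complex)
for theta in 2 * np.pi * np.arange(m) / m:
    z = center + radius * np.exp(1j * theta)
    dz = 1j * radius * np.exp(1j * theta) * (2 * np.pi / m)
    P += np.linalg.inv(z * np.eye(n) - M) * dz
P /= 2j * np.pi
assert np.linalg.norm(P - P_exact) < 1e-8
```

Because the other eigenvalues lie far outside the contour, 200 quadrature nodes already recover the projector to roundoff accuracy.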

Since every spectral projector P commutes with M, its range agrees exactly with an invariant subspace of M. We will often find it useful to choose some region of the complex plane bounded by a simple closed positively oriented rectifiable curve Γ, and compute the spectral projector onto the invariant subspace spanned by those eigenvectors whose eigenvalues lie inside Γ. Such a projector can be computed by a contour integral analogous to the above.

Recall that if f is any function and M is diagonalizable, then we can meaningfully define f(M) := V f(D) V^{-1}, where f(D) is simply the result of applying f to each diagonal entry of D. The holomorphic functional calculus gives an equivalent definition that extends to the case when M is non-diagonalizable. As we will see, it has the added benefit that bounds on the norm of the resolvent of M can be converted into bounds on the norm of f(M).

Proposition 2.1 (Holomorphic Functional Calculus). Let M be any matrix, B ⊃ Λ(M) be an open neighborhood of its spectrum (not necessarily connected), and Γ_1, ..., Γ_k be simple closed positively oriented rectifiable curves in B whose interiors together contain all of Λ(M). Then if f is holomorphic on B, the definition

    f(M) := (1/2πi) ∑_{j=1}^k ∮_{Γ_j} f(z) (z − M)^{-1} dz

is an algebra homomorphism, in the sense that (fg)(M) = f(M) g(M) for any f and g holomorphic on B.

Finally, we will frequently use the resolvent identity

    (z − M)^{-1} − (z − M')^{-1} = (z − M)^{-1} (M − M') (z − M')^{-1}

to analyze perturbations of contour integrals.
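The identity can be verified term by term on any point away from both spectra; the following quick numerical check (our illustration, with arbitrary random matrices) confirms it to machine precision.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Mp = M + 0.1 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
z = 10.0 + 0.0j                       # a point far from both spectra

I = np.eye(n)
R = np.linalg.inv(z * I - M)          # resolvent of M at z
Rp = np.linalg.inv(z * I - Mp)        # resolvent of M' at z

# Resolvent identity: (z-M)^{-1} - (z-M')^{-1} = (z-M)^{-1} (M - M') (z-M')^{-1}
lhs = R - Rp
rhs = R @ (M - Mp) @ Rp
assert np.linalg.norm(lhs - rhs) < 1e-10
```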

2.2 Pseudospectrum and Spectral Stability

The ε-pseudospectrum of a matrix is defined in (5). Directly from this definition, we can relate the pseudospectra of a matrix and a perturbation of it.

Proposition 2.2 ([TE05, Theorem 52.4]). For any n×n matrices M and E and any ε > 0,

    Λ_{ε−‖E‖}(M) ⊆ Λ_ε(M + E).

It is also immediate that Λ(M) ⊂ Λ_ε(M), and in fact a stronger relationship holds as well:

Proposition 2.3 ([TE05, Theorem 4.3]). For any n×n matrix M, any bounded connected component of Λ_ε(M) must contain an eigenvalue of M.

Several other notions of stability will be useful to us as well. If M has distinct eigenvalues λ_1, …, λ_n and spectral expansion as in (10), we define the eigenvalue condition number of λ_i to be

    κ(λ_i) := ‖v_i w_i^*‖ = ‖v_i‖ ‖w_i‖.
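In NumPy these quantities can be read off directly from an eigendecomposition; the sketch below (our illustration, not code from the paper) forms the spectral expansion (10) and the eigenvalue condition numbers just defined.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

evals, V = np.linalg.eig(M)     # columns of V are right eigenvectors v_i
W = np.linalg.inv(V)            # rows of W are left eigenvectors w_i^*, with w_i^* v_i = 1

# Spectral expansion (10): M = sum_i lambda_i v_i w_i^*
M_rebuilt = sum(evals[i] * np.outer(V[:, i], W[i, :]) for i in range(n))
assert np.linalg.norm(M - M_rebuilt) < 1e-8

# Eigenvalue condition numbers kappa(lambda_i) = ||v_i|| * ||w_i||
kappas = [np.linalg.norm(V[:, i]) * np.linalg.norm(W[i, :]) for i in range(n)]
assert all(k >= 1 - 1e-12 for k in kappas)   # always >= 1, since 1 = |w_i^* v_i|
```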

By considering the scaling of V in (2) in which its columns v_i have unit length, so that κ(λ_i) = ‖w_i‖, we obtain the useful relationship

    κ_V(M) ≤ ‖V‖ ‖V^{-1}‖ ≤ ‖V‖_F ‖V^{-1}‖_F ≤ √( n ∑_{i≤n} κ(λ_i)^2 ).    (11)

Note also that the eigenvector condition number and pseudospectrum are related as follows:

Lemma 2.4 ([TE05]). Let D(z, r) denote the open disk of radius r centered at z ∈ ℂ. For every M ∈ ℂ^{n×n},

    ⋃_i D(λ_i, ε) ⊂ Λ_ε(M) ⊂ ⋃_i D(λ_i, ε κ_V(M)).    (12)

In this paper we will repeatedly use that assumptions about the pseudospectrum of a matrix can be turned into stability statements about functions applied to the matrix via the holomorphic functional calculus. Here we describe an instance of particular importance.

Let λ_i be a simple eigenvalue of M and let Γ_i be a contour in the complex plane, as in Section 2.1, separating λ_i from the rest of the spectrum of M, and assume Λ_ε(M) ∩ Γ_i = ∅. Then, for any M' with ‖M − M'‖ < ε, a combination of Proposition 2.2 and Proposition 2.3 implies that there is a unique eigenvalue λ_i' of M' in the region enclosed by Γ_i, and furthermore Λ_{ε−‖M−M'‖}(M') ∩ Γ_i = ∅. If v_i' and w_i'^* are the right and left eigenvectors of M' corresponding to λ_i', we have

    ‖v_i' w_i'^* − v_i w_i^*‖ = (1/2π) ‖ ∮_{Γ_i} ( (z − M)^{-1} − (z − M')^{-1} ) dz ‖
                              = (1/2π) ‖ ∮_{Γ_i} (z − M)^{-1} (M − M') (z − M')^{-1} dz ‖
                              ≤ ℓ(Γ_i) ‖M − M'‖ / ( 2π ε (ε − ‖M − M'‖) ).    (13)

We have introduced enough tools to prove Proposition 1.1.
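The chain of inequalities in (11) is easy to check numerically in the unit-column scaling; the following is an illustrative sketch (our experiment, on an arbitrary random matrix).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

evals, V = np.linalg.eig(M)
V = V / np.linalg.norm(V, axis=0)     # scale columns to unit length, as in (2)
W = np.linalg.inv(V)
kappa_sq = np.sum(np.linalg.norm(W, axis=1) ** 2)   # sum_i kappa(lambda_i)^2

# kappa(V) = ||V|| ||V^{-1}|| is an upper bound for kappa_V(M); the chain (11):
kv_upper = np.linalg.norm(V, 2) * np.linalg.norm(W, 2)
assert kv_upper <= np.linalg.norm(V, 'fro') * np.linalg.norm(W, 'fro') + 1e-9
assert np.linalg.norm(V, 'fro') * np.linalg.norm(W, 'fro') <= np.sqrt(n * kappa_sq) + 1e-9
```

With unit columns ‖V‖_F = √n exactly, so the last inequality in (11) holds with equality in this scaling.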

Proof of Proposition 1.1. For t ∈ [0, 1] define A(t) := (1 − t)A + tA'. Since ‖A − A'‖ < gap(A)/(8κ_V(A)), the Bauer-Fike theorem implies that A(t) has distinct eigenvalues for all t, and in fact gap(A(t)) ≥ 3 gap(A)/4. Standard results in perturbation theory [GLO20] imply that for every i = 1, …, n, A(t) has a unique eigenvalue λ_i(t) such that λ_i(t) is a differentiable trajectory, λ_i(0) = λ_i, and λ_i(1) = λ_i'. Let v_i(t) and w_i(t) be the right and left eigenvectors of A(t) corresponding to λ_i(t), with ‖v_i(t)‖ = 1.

Let Γ_i be the positively oriented contour forming the boundary of the disk centered at λ_i with radius gap(A)/2, and define ε := gap(A)/(8κ_V(A)). Lemma 2.4 implies Λ_ε(A) ∩ Γ_i = ∅, and for fixed t ∈ [0, 1], since ‖A − A(t)‖ ≤ ‖A − A'‖, Proposition 2.2 gives Λ_{ε−‖A−A(t)‖}(A(t)) ∩ Γ_i = ∅. By (13),

    |κ(λ_i) − κ(λ_i(t))| ≤ ‖v_i(t) w_i^*(t) − v_i w_i^*‖ ≤ ℓ(Γ_i) ‖A − A(t)‖ / ( 2π ε (ε − ‖A − A(t)‖) ) ≤ 2 κ_V(A),

and hence κ(λ_i(t)) ≤ κ(λ_i) + 2κ_V(A) ≤ 3κ_V(A). Combining this with (11) we obtain

    κ_V(A(t)) ≤ 2 √( n ∑_i κ(λ_i)^2 ) < 4n κ_V(A).

On the other hand, from standard perturbation theory we know that the phases of the v_i(t) may be chosen so that v_i(t) is a differentiable function, and moreover one can show that

    ‖v̇_i(t)‖ ≤ κ_V(A(t)) ‖Ȧ(t)‖ / gap(A(t));

see Section 2 of [GLO20] or the references therein for a derivation of these facts. Now, using that κ_V(A(t)) ≤ 4n κ_V(A) and gap(A(t)) ≥ 3 gap(A)/4, the above inequality yields

    ‖v̇_i(t)‖ ≤ 16n κ_V(A) ‖A' − A‖ / (3 gap(A)).

The desired result is then obtained by integrating v̇_i(t) from 0 to 1.

2.3 Finite-Precision Arithmetic

We briefly elaborate on the axioms for floating-point arithmetic given in Section 1.1. Similar guarantees to the ones appearing in that section for scalar-scalar operations also hold for operations such as matrix-matrix addition and matrix-scalar multiplication. In particular, if A is an n×n complex matrix, fl(A) = A + A∘Δ with |Δ_{i,j}| < u. It will be convenient for us to write such errors in additive, as opposed to multiplicative form. We can convert the above to additive error as follows. Recall that for any n×n matrix, the ℓ^2 → ℓ^2 operator norm (the spectral norm) is at most √n times the ℓ^1 → ℓ^2 operator norm, i.e. the maximal ℓ^2 norm of a column. Thus we have

    ‖A∘Δ‖ ≤ √n max_i ‖(A∘Δ) e_i‖ ≤ √n max_{i,j} |Δ_{i,j}| max_i ‖A e_i‖ ≤ √n u ‖A‖.    (14)

For more complicated operations such as matrix-matrix multiplication and matrix inversion, we use existing error guarantees from the literature. This is the subject of Section 2.5.

We will also need to compute the trace of a matrix A ∈ ℂ^{n×n}, and normalize a vector x ∈ ℂ^n. Error analysis of these is standard (see for instance the discussion in [Hig02, Sections 3.1-3.4, 4.1]) and the results in this paper are highly insensitive to the details. For simplicity, calling x̂ := x/‖x‖, we will assume that

    |fl(Tr A) − Tr A| ≤ n ‖A‖ u    (15)
    ‖fl(x̂) − x̂‖ ≤ n u.    (16)

Each of these can be achieved by assuming that nu ≤ ε_0 for some suitably chosen ε_0, independent of n, a requirement which will be superseded shortly by several tighter assumptions on the machine precision.

Throughout the paper, we will take the pedagogical perspective that our algorithms are games played between the practitioner and an adversary who may additively corrupt each operation. In particular, we will include explicit error terms (always denoted by E(⋅)) in each appropriate step of every algorithm. In many cases we will first analyze a routine in exact arithmetic (in which case the error terms will all be set to zero) and subsequently determine the machine precision u necessary to make the errors small enough to guarantee convergence.
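As a quick empirical check of the multiplicative-to-additive conversion (14), one can round a double-precision matrix to single precision (unit roundoff u = 2^-24) and measure the resulting perturbation; the sketch below is our illustration, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
A = rng.standard_normal((n, n))

# Rounding A to single precision gives fl(A) = A + A*Delta (entrywise) with
# |Delta_ij| <= u = 2^-24, the unit roundoff of IEEE binary32.
u = 2.0 ** -24
A32 = A.astype(np.float32).astype(np.float64)
Delta_matrix = A32 - A                    # the additive error A*Delta

# The additive bound (14): ||A*Delta|| <= sqrt(n) * u * ||A||.
assert np.linalg.norm(Delta_matrix, 2) <= np.sqrt(n) * u * np.linalg.norm(A, 2)
```

The bound is deterministic: it follows from ‖·‖_2 ≤ ‖·‖_F and the entrywise guarantee, so the assertion holds for every input matrix.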

2.4 Sampling Gaussians in Finite Precision

For various parts of the algorithm, we will need to sample from normal distributions. For our model of arithmetic, we assume that the complex normal distribution can be sampled up to machine precision in O(1) arithmetic operations. To be precise, we assume the existence of the following sampler:

Definition 2.5 (Complex Gaussian Sampling). A c_N-stable Gaussian sampler N(σ) takes as input σ ∈ ℝ_{≥0} and outputs a sample of a random variable G̃ = N(σ) with the property that there exists G ∼ N_ℂ(0, σ^2) satisfying

    |G̃ − G| ≤ c_N σ · u

with probability one, in at most T_N arithmetic operations for some universal constant T_N > 0.

We will only sample O(n^2) Gaussians during the algorithm, so this sampling will not contribute significantly to the runtime. Here as everywhere in the paper, we will omit issues of underflow or overflow. Throughout this paper, to simplify some of our bounds, we will also assume that c_N ≥ 1.

2.5 Black-box Error Assumptions for Multiplication, Inversion, and QR

Our algorithm uses matrix-matrix multiplication, matrix inversion, and QR factorization as primitives. For our analysis, we must therefore assume some bounds on the error and runtime costs incurred by these subroutines. In this section, we first formally state the kind of error and runtime bounds we require, and then discuss some implementations known in the literature that satisfy each of our requirements with modest constants.

Our definitions are inspired by the definition of logarithmic stability introduced in [DDH07]. Roughly speaking, they say that implementing the algorithm with floating point precision u yields an accuracy which is at most polynomially or quasipolynomially in n worse than u (possibly also depending on the condition number in the case of inversion).
Their definition has the property that while a logarithmically stable algorithm is not strictly speaking backward stable, it can attain the same forward error bound as a backward stable algorithm at the cost of increasing the bit length by a polylogarithmic factor. See Section 3 of their paper for a precise definition and a more detailed discussion of how their definition relates to standard numerical stability notions.
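For intuition (this is our experiment, not a claim from [DDH07]), the shape of such multiplicative error models is easy to observe by comparing a single-precision matrix product against a double-precision reference: the measured error divided by u‖A‖‖B‖ stays a small polynomial in n.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

u = 2.0 ** -24   # unit roundoff of IEEE binary32
C = (A.astype(np.float32) @ B.astype(np.float32)).astype(np.float64)
err = np.linalg.norm(C - A @ B, 2)

# Classical matmul obeys a bound of the form ||C - AB|| <= mu(n) * u * ||A|| ||B||
# with mu(n) polynomial in n; measure the empirical mu for this instance.
mu_empirical = err / (u * np.linalg.norm(A, 2) * np.linalg.norm(B, 2))
assert mu_empirical < n ** 1.5
```

In practice the empirical μ here is far below n^{3/2}; the assertion is just a generous sanity bound.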

Definition 2.6. A μ_MM(n)-stable multiplication algorithm MM(·, ·) takes as input A, B ∈ ℂ^{n×n} and a precision u > 0 and outputs C = MM(A, B) satisfying

    ‖C − AB‖ ≤ μ_MM(n) · u ‖A‖ ‖B‖,

on a floating point machine with precision u, in T_MM(n) arithmetic operations.

Definition 2.7. A (μ_INV(n), c_INV)-stable inversion algorithm INV(·) takes as input A ∈ ℂ^{n×n} and a precision u and outputs C = INV(A) satisfying

    ‖C − A^{-1}‖ ≤ μ_INV(n) · u · κ(A)^{c_INV log n} ‖A^{-1}‖,

on a floating point machine with precision u, in T_INV(n) arithmetic operations.

Definition 2.8. A μ_QR(n)-stable QR factorization algorithm QR(·) takes as input A ∈ ℂ^{n×n} and a precision u, and outputs [Q, R] = QR(A) such that

1. R is exactly upper triangular.

2. There is a unitary Q' and a matrix A' such that

    Q' A' = R,    (17)

and

    ‖Q' − Q‖ ≤ μ_QR(n) u,  and  ‖A' − A‖ ≤ μ_QR(n) u ‖A‖,

on a floating point machine with precision u. Its running time is T_QR(n) arithmetic operations.

Remark 2.9. Throughout this paper, to simplify some of our bounds, we will assume that

    1 ≤ μ_MM(n), μ_INV(n), μ_QR(n), c_INV log n.

The above definitions can be instantiated with traditional O(n^3)-complexity algorithms, for which μ_MM, μ_QR, μ_INV are all O(n) and c_INV = 1 [Hig02]. This yields easily-implementable practical algorithms with running times depending cubically on n.

In order to achieve O(n^ω)-type efficiency, we instantiate them with fast-matrix-multiplication-based algorithms, with μ(n) taken to be a low-degree polynomial [DDH07]. Specifically, the following parameters are known to be achievable.

Theorem 2.10 (Fast and Stable Instantiations of MM, INV, QR).

1. If ω is the exponent of matrix multiplication, then for every η > 0 there is a μ_MM(n)-stable multiplication algorithm with μ_MM(n) = n^{c_η} and T_MM(n) = O(n^{ω+η}), where c_η does not depend on n.

2. Given an algorithm for matrix multiplication satisfying (1), there is a (μ_INV(n), c_INV)-stable inversion algorithm with

    μ_INV(n) ≤ O(μ_MM(n) n^{lg(10)}),  c_INV ≤ 8,

and T_INV(n) ≤ T_MM(3n) = O(T_MM(n)).

3. Given an algorithm for matrix multiplication satisfying (1), there is a μ_QR(n)-stable QR factorization algorithm with

    μ_QR(n) = O(n^{c_QR} μ_MM(n)),

where c_QR is an absolute constant, and T_QR(n) = O(T_MM(n)).

In particular, all of the running times above are bounded by T_MM(n) for an n×n matrix.

Proof. (1) is Theorem 3.3 of [DDHK07]. (2) is Theorem 3.3 (see also equation (9) above its statement) of [DDH07]. The final claim follows by noting that T_MM(3n) = O(T_MM(n)), by dividing a 3n×3n matrix into nine n×n blocks and proceeding blockwise, at the cost of a factor of 9 in μ_INV(n). (3) appears in Section 4.1 of [DDH07].

We remark that for specific existing fast matrix multiplication algorithms such as Strassen's algorithm, specific small values of μ_MM(n) are known (see [DDHK07] and its references for details), so these may also be used as a black box, though we will not do this in this paper.

3 Pseudospectral Shattering

This section is devoted to our central probabilistic result, Theorem 1.4, and the accompanying notion of pseudospectral shattering which will be used extensively in our analysis of the spectral bisection algorithm in Section 5.

3.1 Smoothed Analysis of Gap and Eigenvector Condition Number

As is customary in the literature, we will refer to an n×n random matrix G_n whose entries are independent complex Gaussians drawn from N_ℂ(0, 1/n) as a normalized complex Ginibre random matrix. To be absolutely clear, and because other choices of scaling are quite common, we mean that E G_{i,j} = 0 and E |G_{i,j}|^2 = 1/n.

In the course of proving Theorem 1.4, we will need to bound the probability that the second-smallest singular value of an arbitrary matrix with small Ginibre perturbation is atypically small. We begin with a well-known lower tail bound on the singular values of a Ginibre matrix alone.

Theorem 3.1 ([Sza91, Theorem 1.2]). For an n×n normalized complex Ginibre matrix G_n and for any ε ≥ 0 it holds that

    ℙ[ σ_j(G_n) < ε(n − j + 1)/n ] ≤ ( √(2e) ε )^{2(n−j+1)^2}.

As in several of the authors' earlier work [BKMS19], we can transfer this result to the case of a Ginibre perturbation via a remarkable coupling result of P. Śniady.

Theorem 3.2 (Śniady [Śni02]). Let A_1 and A_2 be n×n complex matrices such that σ_i(A_1) ≤ σ_i(A_2) for all 1 ≤ i ≤ n. Assume further that σ_i(A_1) ≠ σ_j(A_1) and σ_i(A_2) ≠ σ_j(A_2) for all i ≠ j. Then for every t ≥ 0, there exists a joint distribution on pairs of n×n complex matrices (G_1, G_2) such that

1. the marginals G_1 and G_2 are distributed as normalized complex Ginibre matrices, and

2. almost surely σ_i(A_1 + √t G_1) ≤ σ_i(A_2 + √t G_2) for every i.

Corollary 3.3. For any fixed matrix M ∈ ℂ^{n×n} and parameters γ, ε > 0,

    ℙ[ σ_j(M + γ G_n) < ε γ (n − j + 1)/n ] ≤ ( √(2e) ε )^{2(n−j+1)^2}.
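A normalized complex Ginibre matrix in this convention can be sampled as in the sketch below (our illustration; the factor 1/√(2n) makes the real and imaginary parts each have variance 1/(2n), so E|G_ij|^2 = 1/n).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
# Normalized complex Ginibre: i.i.d. entries with E G_ij = 0, E |G_ij|^2 = 1/n.
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)

assert abs(np.mean(np.abs(G) ** 2) * n - 1.0) < 0.1   # entry variance is 1/n
assert np.linalg.norm(G, 2) < 2.5                     # ||G_n|| concentrates near 2
```

The operator norm of G_n converges to 2 as n → ∞, consistent with the ‖G_n‖ < 4 event used repeatedly below.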

Theorem 3.4 ([BKMS19, Theorem 1.5]). Suppose A ∈ ℂ^{n×n} with ‖A‖ ≤ 1 and γ ∈ (0, 1). Let G_n be a complex Ginibre matrix, and let λ_1, …, λ_n ∈ ℂ be the (random) eigenvalues of A + γG_n. Then for every measurable open set B ⊂ ℂ,

    E ∑_{λ_i ∈ B} κ(λ_i)^2 ≤ (n^2 / (π γ^2)) vol(B).

Our final lemma before embarking on the proof in earnest shows that bounds on the j-th smallest singular value and eigenvector condition number are sufficient to rule out the presence of j eigenvalues in a small region. For our particular application we will take j = 2.

Lemma 3.5. Let D(z_0, r) := {z ∈ ℂ : |z − z_0| < r}, and suppose M ∈ ℂ^{n×n} is diagonalizable with at least j eigenvalues in D(z_0, r). Then

    σ_{n−j+1}(z_0 − M) ≤ κ_V(M) · r.

Proof. Write M = VDV^{-1} with D diagonal, taking V so that κ(V) is arbitrarily close to κ_V(M). Using the variational characterization of singular values,

    σ_{n−j+1}(z_0 − M) = min_{S: dim(S)=j} max_{x ∈ S∖{0}} ‖V(z_0 − D)V^{-1} x‖ / ‖x‖
                       = min_{S: dim(S)=j} max_{y ∈ V^{-1}(S)∖{0}} ‖V(z_0 − D) y‖ / ‖V y‖        (setting x = Vy)
                       = min_{S: dim(S)=j} max_{y ∈ S∖{0}} ‖V(z_0 − D) y‖ / ‖V y‖                (since V is invertible)
                       ≤ min_{S: dim(S)=j} max_{y ∈ S∖{0}} ‖V‖ ‖(z_0 − D) y‖ / (σ_n(V) ‖y‖)
                       ≤ κ(V) · σ_{n−j+1}(z_0 − D).

Since z_0 − D is diagonal, its singular values are just |z_0 − λ_i|, so the j-th smallest is at most r, finishing the proof.

We now present the main tail bound that we use to control the minimum gap and eigenvector condition number.

Theorem 3.6 (Multiparameter Tail Bound). Let A ∈ ℂ^{n×n}. Assume ‖A‖ ≤ 1 and γ < 1/2, and let X := A + γG_n where G_n is a complex Ginibre matrix. For every t, r > 0:

    ℙ[ κ_V(X) < t, gap(X) > r, ‖G_n‖ < 4 ] ≥ 1 − ( (144/r^2) · (4trn/γ)^8 + 9n^3/(γ^2 t^2) + 2e^{−2n} ).    (18)

Proof. Write Λ(X) := {λ_1, …, λ_n} for the (random) eigenvalues of X = A + γG_n, in increasing order of magnitude (there are no ties almost surely). Let N ⊂ ℂ be a minimal r/2-net of B := D(0, 3), recalling the standard fact that one exists of size no more than (3 · 4/r)^2 = 144/r^2. The most useful feature of such a net is that, by the triangle inequality, for any two points z, z' ∈ B with |z − z'| ≤ r, there is a point y ∈ N with |y − z| ≤ r/2, satisfying z, z' ∈ D(y, 2r). In particular, defining the events

    E_gap := {gap(X) < r},  E_κ := {κ_V(X) ≥ t},  E_D := {Λ(X) ⊄ D(0, 3)},  E_y := {σ_{n−1}(y − X) < 2tr},

Lemma 3.5 (applied with j = 2 to the points y ∈ N) reveals that

    E_gap ⊆ E_D ∪ E_κ ∪ ⋃_{y∈N} E_y,

whence

    E_gap ∪ E_κ ⊆ E_D ∪ E_κ ∪ ⋃_{y∈N} E_y.

By a union bound, we have

    ℙ[E_gap ∪ E_κ] ≤ ℙ[E_D ∪ E_κ] + |N| · max_{y∈N} ℙ[E_y].    (19)

From the tail bound on the operator norm of a Ginibre matrix in [BKMS19, Lemma 2.2],

    ℙ[E_D] ≤ ℙ[‖G_n‖ ≥ 4] ≤ 2e^{−2n(4−2√2)^2} ≤ 2e^{−2n}.    (20)

Observe that by (11),

    κ_V(X) ≤ √( n ∑_i κ(λ_i)^2 ),

which implies that

    E_κ ⊆ E_D ∪ { ∑_{λ_i ∈ D(0,3)} κ(λ_i)^2 ≥ t^2/n }.

Theorem 3.4 and Markov's inequality yield

    ℙ[ ∑_{λ_i ∈ D(0,3)} κ(λ_i)^2 ≥ t^2/n ] ≤ (9n^2/γ^2) · (n/t^2) = 9n^3/(γ^2 t^2).

Thus, we have

    ℙ[E_D ∪ E_κ] ≤ 9n^3/(γ^2 t^2) + 2e^{−2n}.

Corollary 3.3 applied to M = −y + A gives the bound

    ℙ[E_y] ≤ ( √(2e) · trn/γ )^8 ≤ (4trn/γ)^8

for each y ∈ N, and plugging these estimates back into (19) we have

    ℙ[E_gap ∪ E_D ∪ E_κ] ≤ (144/r^2) · (4trn/γ)^8 + 9n^3/(γ^2 t^2) + 2e^{−2n},

as desired.

A specific setting of parameters in Theorem 3.6 immediately yields Theorem 1.4.

Proof of Theorem 1.4. Applying Theorem 3.6 with parameters t := n^2/γ and r := γ^4/n^5, we have

    ℙ[ κ_V(X) < n^2/γ, gap(X) > γ^4/n^5, Λ(X) ⊂ D(0, 3) ] ≥ 1 − ( 144 · 4^8 γ^8/n^6 + 9/n + 2e^{−2n} ) ≥ 1 − 12/n,    (21)

as desired, where in the last step we use the assumption γ < 1/2.

Since it is of independent interest in random matrix theory, we record the best bound on the gap alone that it is possible to extract from the theorem above.

Corollary 3.7 (Minimum Gap Bound). For X as in Theorem 3.6,

    ℙ[ gap(X) < r ] ≤ C n^4 r^{6/5} γ^{−16/5} + 2e^{−2n},

where C > 0 is a universal constant, obtained by optimizing the choice of t in (18).
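This smoothed-analysis phenomenon is easy to see experimentally (our experiment, with an arbitrary choice of test matrix): a nilpotent Jordan block has minimum gap exactly 0 and infinite κ_V, yet every Ginibre-perturbed sample has its eigenvalues well separated.

```python
import numpy as np

rng = np.random.default_rng(7)
n, gamma, trials = 8, 0.4, 200

def min_gap(evals):
    d = np.abs(evals[:, None] - evals[None, :])
    np.fill_diagonal(d, np.inf)       # ignore the zero diagonal
    return d.min()

A = np.diag(np.ones(n - 1), 1)        # nilpotent Jordan block, ||A|| = 1
gaps = []
for _ in range(trials):
    G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
    gaps.append(min_gap(np.linalg.eigvals(A + gamma * G)))

assert min(gaps) > 1e-6               # no trial produced a tiny minimum gap
```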

Definition 3.8 (Grid). A grid in the complex plane consists of the boundaries of a lattice of squares with lower edges parallel to the real axis. We will write

    g = grid(z_0, ω, s_1, s_2) ⊂ ℂ

to denote an s_1 × s_2 grid of ω×ω-sized squares with lower left corner at z_0 ∈ ℂ. Write diam(g) := ω √(s_1^2 + s_2^2) for the diameter of the grid.

Definition 3.9 (Shattering). A pseudospectrum Λ_ε(A) is shattered with respect to a grid g if:

1. Every square of g has at most one eigenvalue of A.

2. Λ_ε(A) ∩ g = ∅.

Observation 3.10. As Λ_ε(A) contains a ball of radius ε about each eigenvalue of A, shattering of the ε-pseudospectrum with respect to a grid with side length ω implies ε ≤ ω/2.

As a warm-up for more sophisticated arguments later on, we give here an easy consequence of the shattering property.

Lemma 3.11. If Λ_ε(M) is shattered with respect to a grid g with side length ω, then every eigenvalue condition number satisfies κ(λ_i) ≤ 2ω/(πε).

Proof. Let v, w^* be a right/left eigenvector pair for some eigenvalue λ_i of M, normalized so that w^* v = 1. Letting Γ_i be the positively oriented boundary of the square of g containing λ_i, we can extract the projector vw^* by integrating, and pass norms inside the contour integral to obtain

    κ(λ_i) = ‖vw^*‖ = (1/2π) ‖ ∮_{Γ_i} (z − M)^{-1} dz ‖ ≤ (1/2π) ∮_{Γ_i} ‖(z − M)^{-1}‖ |dz| ≤ 4ω/(2πε) = 2ω/(πε).    (22)

In the final step we have used the fact that, given the definition of pseudospectrum (6) above, Λ_ε(M) ∩ g = ∅ means ‖(z − M)^{-1}‖ ≤ 1/ε on g.

The theorem below quantifies the extent to which perturbing by a Ginibre matrix results in a shattered pseudospectrum. See Figure 1 for an illustration in the case where the initial matrix is poorly conditioned. In general, not all eigenvalues need move so far upon such a perturbation, in particular if the respective κ(λ_i) are small.
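Lemma 3.11 can be checked numerically on a toy example (our construction, with an arbitrary nearly normal matrix and grid): estimate a valid ε as the smallest value of σ_min(z − X) over a sampling of the grid lines, since ‖(z − X)^{-1}‖ = 1/σ_min(z − X), and compare against the condition-number bound.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 4
X = np.diag([0.25 + 0.25j, 0.75 + 0.25j, 0.25 + 0.75j, 0.75 + 0.75j]) \
    + 0.01 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

omega = 0.5   # grid lines Re z, Im z in {0, 0.5, 1.0}; one eigenvalue per square
ticks = (0.0, 0.5, 1.0)
zs = [complex(a, b) for a in np.linspace(0, 1, 41) for b in ticks] \
   + [complex(a, b) for b in np.linspace(0, 1, 41) for a in ticks]
# eps <= sigma_min(z - X) on the grid makes Lambda_eps(X) avoid the grid.
eps = min(np.linalg.svd(z * np.eye(n) - X, compute_uv=False)[-1] for z in zs)

# Lemma 3.11: under shattering, every kappa(lambda_i) <= 2*omega/(pi*eps).
evals, V = np.linalg.eig(X)
W = np.linalg.inv(V)
kappas = [np.linalg.norm(V[:, i]) * np.linalg.norm(W[i, :]) for i in range(n)]
assert max(kappas) <= 2 * omega / (np.pi * eps)
```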

Theorem 3.12 (Exact Arithmetic Shattering). Let A ∈ ℂ^{n×n} and X := A + γG_n for G_n a complex Ginibre matrix. Assume ‖A‖ ≤ 1 and 0 < γ < 1/2. Let g := grid(z, ω, ⌈8/ω⌉, ⌈8/ω⌉) with ω := γ^4/(4n^5), and z chosen uniformly at random from the square of side ω cornered at −4 − 4i. Then κ_V(X) ≤ n^2/γ, ‖A − X‖ ≤ 4γ, and Λ_ε(X) is shattered with respect to g for

    ε := γ^5/(16 n^9),

with probability at least 1 − 1/n − 12/n.


Figure 1: T is a sample of an upper triangular 10 × 10 Toeplitz matrix with zeros on the diagonal and an independent standard real Gaussian repeated along each diagonal above the main diagonal. G is a sample of a 10 × 10 complex Ginibre matrix with unit variance entries. Using the MATLAB package EigTool [WT02], the boundaries of the ε-pseudospectrum of T (left) and T + 10^{−6} G (right) for ε = 10^{−6} are plotted along with the spectra. The latter pseudospectrum is shattered with respect to the pictured grid.

Proof. Condition on the event in Theorem 1.4, so that

    κ_V(X) ≤ n^2/γ,  ‖X − A‖ ≤ 4γ,  and  gap(X) ≥ γ^4/n^5 = 4ω.

Consider the random grid g. Since D(0, 3) is contained in the square of side length 8 centered at the origin, every eigenvalue of X is contained in one square of g with probability 1. Moreover, since gap(X) ≥ 4ω, no square can contain two eigenvalues. Let

    dist_g(z) := min_{y ∈ g} |z − y|,

and let λ_i := λ_i(X). We now have for each i and every s < ω/2:

    ℙ[ dist_g(λ_i) > s ] = (ω − 2s)^2/ω^2 = 1 − 4s/ω + 4s^2/ω^2 ≥ 1 − 4s/ω,

since the distribution of λ_i inside its square is uniform with respect to Lebesgue measure. Setting s = ω/(4n^2), this probability is at least 1 − 1/n^2, so by a union bound

    ℙ[ min_{i≤n} dist_g(λ_i) > ω/(4n^2) ] > 1 − 1/n,    (23)

i.e., every eigenvalue is well-separated from g with probability 1 − 1/n. We now recall from (12) that

    Λ_ε(X) ⊂ ⋃_{i≤n} D(λ_i, ε κ_V(X)).

Thus, on the events (21) and (23), we see that Λ_ε(X) is shattered with respect to g as long as

    ε κ_V(X) < ω/(4n^2),

which is implied by

    ε < (γ^4/(4n^5)) · (γ/n^2) · (1/(4n^2)) = γ^5/(16 n^9).

Thus, the advertised claim holds with probability at least

    1 − 1/n − 12/n,

as desired.

Finally, we show that the shattering property is retained when the Gaussian perturbation is added in finite precision rather than exactly. This also serves as a pedagogical warmup for our presentation of more complicated algorithms later in the paper: we use E(⋅) to represent an adversarial roundoff error (as in step 2), and for simplicity neglect roundoff error completely in computations whose size does not grow with n (such as steps 3 and 4, which set scalar parameters).
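In exact (double-precision) arithmetic the construction of Theorem 3.12 can be sketched in a few lines; the routine below is our NumPy translation of the perturbation-plus-grid recipe, not the paper's finite-precision SHATTER.

```python
import numpy as np

def shatter(A, gamma, rng):
    # Sketch of the exact-arithmetic construction behind SHATTER: perturb by a
    # normalized complex Ginibre matrix, then draw the random grid of Theorem 3.12.
    n = A.shape[0]
    G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
    X = A + gamma * G                        # Ginibre perturbation
    omega = gamma ** 4 / (4 * n ** 5)        # grid side length
    z0 = complex(-4, -4) + omega * (rng.uniform() + 1j * rng.uniform())  # random corner
    eps = 0.5 * gamma ** 5 / (16 * n ** 9)   # shattering parameter
    return X, (z0, omega), eps

rng = np.random.default_rng(9)
A = np.diag(np.ones(3), 1)                   # 4x4 nilpotent Jordan block, ||A|| = 1
X, (z0, omega), eps = shatter(A, 0.4, rng)
ev = np.linalg.eigvals(X)
d = np.abs(ev[:, None] - ev[None, :])
np.fill_diagonal(d, np.inf)
assert d.min() > 1e-6                        # perturbed eigenvalues are well separated
```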

SHATTER
Input: Matrix A ∈ ℂ^{n×n}, Gaussian perturbation size γ ∈ (0, 1/2).
Requires: ‖A‖ ≤ 1.
Algorithm: (X, g, ε) = SHATTER(A, γ)

1. G_{i,j} ← N(1/√n) for i, j = 1, …, n.
2. X ← A + γG + E.
3. Let g be a random grid with ω = γ^4/(4n^5) and bottom left corner chosen as in Theorem 3.12.

4. ε ← (1/2) · γ^5/(16 n^9).

Output: Matrix X ∈ ℂ^{n×n}, grid g, shattering parameter ε > 0.
Ensures: ‖X − A‖ ≤ 4γ, κ_V(X) ≤ n^2/γ, and Λ_ε(X) is shattered with respect to g, with probability at least 1 − 1/n − 12/n.

Theorem 3.13 (Finite Arithmetic Shattering). Assume there is a c_N-stable Gaussian sampling algorithm N satisfying the requirements of Definition 2.5. Then SHATTER has the advertised guarantees as long as the machine precision satisfies

    u ≤ (1/((3 + c_N)√n)) · (1/2) · γ^5/(16 n^9),    (24)

and runs in

    n^2 T_N + 2n^2 = O(n^2)

arithmetic operations.

Proof. The two sources of error in SHATTER are:

1. An additive error from N of operator norm at most γ · n · c_N (1/√n) u ≤ c_N √n u, by Definition 2.5.

2. An additive error E of norm at most √n ‖X‖ u ≤ 3√n u, with probability at least 1 − 1/n, from the roundoff in step 2.

Thus, as long as the precision satisfies (24), we have

    ‖SHATTER(A, γ) − shatter(A, γ)‖ ≤ (1/2) · γ^5/(16 n^9),

where shatter(A, γ) refers to the (exact arithmetic) outcome of Theorem 3.12. The correctness of SHATTER now follows from Proposition 2.2. Its running time is bounded by

    n^2 T_N + 2n^2

arithmetic operations, as advertised.

4 Matrix Sign Function

The algorithmic centerpiece of this work is the analysis, in finite arithmetic, of a well-known iterative method for approximating the matrix sign function. Recall from Section 1 that if A is a matrix whose spectrum avoids the imaginary axis, then

    sgn(A) = P_+ − P_−,

where P_+ and P_− are the spectral projectors corresponding to eigenvalues of A in the open right and left half-planes, respectively. The iterative algorithm we consider approximates the matrix sign function by repeated application to A of the function

    g(z) := (1/2)(z + z^{-1}).    (25)

This is simply Newton's method to find a root of z^2 − 1, but one can verify that the function g fixes the left and right half-planes, and thus we should expect it to push those eigenvalues in the former towards −1, and those in the latter towards +1. We denote the specific finite-arithmetic implementation used in our algorithm by SGN; the pseudocode is provided below.

In Subsection 4.1 we briefly discuss the specific preliminaries that will be used throughout this section. In Subsection 4.2 we give a pseudospectral proof of the rapid global convergence of this iteration when implemented in exact arithmetic. In Subsection 4.3 we show that the proof provided in Subsection 4.2 is robust enough to handle the finite arithmetic case; a formal statement of this main result is the content of Theorem 4.9.
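In exact arithmetic the iteration is a few lines of code; the following is a minimal sketch (our illustration, without the scaling or finite-precision safeguards that the analysis of SGN addresses), on a test matrix with a known sign function.

```python
import numpy as np

def newton_sign(A, iters=30):
    # Newton iteration X <- (X + X^{-1})/2 for the matrix sign function.
    # Assumes Lambda(A) avoids the imaginary axis.
    X = A.astype(complex)
    for _ in range(iters):
        X = 0.5 * (X + np.linalg.inv(X))
    return X

rng = np.random.default_rng(10)
n = 6
D = np.diag([1.5, 2.0 + 1j, -1.0, -0.5 + 0.5j, -2.0, -3.0 + 1j])
V = np.eye(n) + 0.1 * rng.standard_normal((n, n))     # mildly conditioned similarity
A = V @ D @ np.linalg.inv(V)                          # known spectrum in both half-planes

S = newton_sign(A)
signs = np.sign(np.diag(D).real)                      # sgn acts on the eigenvalues
S_exact = V @ np.diag(signs) @ np.linalg.inv(V)
assert np.linalg.norm(S - S_exact) < 1e-8
assert np.linalg.norm(S @ S - np.eye(n)) < 1e-8       # sgn(A)^2 = I
```

Convergence is quadratic in the Apollonius parameter introduced in Section 4.1, so a handful of iterations already reaches roundoff accuracy.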

SGN
Input: Matrix A ∈ ℂ^{n×n}, pseudospectral guarantee ε_0, circle parameter α, and desired accuracy β.
Requires: Λ_{ε_0}(A) ⊂ C_α.
Algorithm: S = SGN(A, ε_0, α, β)

1. N ← ⌈ lg(1/(1 − α)) + 3 lg lg(1/(1 − α)) + lg lg(1/(β ε_0)) + 7.59 ⌉
2. A_0 ← A
3. For k = 1, ..., N,

    (a) A_k ← (1/2)(A_{k−1} + A_{k−1}^{-1}) + E_k

4. S ← A_N

Output: Approximate matrix sign function S.
Ensures: ‖S − sgn(A)‖ ≤ β.

4.1 Circles of Apollonius

It has been known since antiquity that a circle in the plane may be described as the set of points with a fixed ratio of distances to two focal points. By fixing the focal points and varying the ratio in question, we get a family of circles named for the Greek geometer Apollonius of Perga. We will exploit several interesting properties enjoyed by these circles of Apollonius in the analysis below.

More precisely, we analyze the Newton iteration map in terms of the family of Apollonian circles whose foci are the points ±1 ∈ ℂ. For the remainder of this section we will write

    m(z) = (1 − z)/(1 + z)

for the Möbius transformation taking the right half-plane to the unit disk, and for each α ∈ (0, 1) we denote by

    C_α^+ = {z ∈ ℂ : |m(z)| ≤ α},    C_α^− = {z ∈ ℂ : |m(z)^{-1}| ≤ α}

the closed regions in the right (respectively, left) half-plane bounded by such a circle. Write ∂C_α^+ and ∂C_α^− for their boundaries, and C_α = C_α^+ ∪ C_α^− for their union. See Figure 2 for an illustration.

The region C_α^+ is a disk centered at (1 + α^2)/(1 − α^2) ∈ ℝ, with radius 2α/(1 − α^2), and whose intersection with the real line is the interval (m(α), m(α)^{-1}); C_α^− can be obtained by reflecting C_α^+ with respect to the imaginary axis. For θ > 0, we will write

> A , + = C + C + A+ , , A− , for the Apollonian annulus lying inside C+ and outside⧵ C+ ; note that the circles are not concentric so this is not strictly speaking an annulus, and note also that in our notation this set does not include )C+ . In the same way define A− , for the left half-plane and write A , = A+ , ∪ A− , .

Observation 4.1 ([Rob80]). The Newton map is a two-to-one map from C+ to C+ , and a two- 2 to-one map from C− to C − . 2 g

28 3

2

1

1 2 3 4 5 −1

−2

−3

Figure 2: Apollonian circles appearing in the analysis of the Newton iteration. Depicted are C for + and , with smaller circles corresponding to larger . 2 = 0 8 = 0 1 2 3 ) k . k , , , k Proof. This follows from the fact that for each in the right half-plane,

1− ( + 1/z ) (1 − ) ó 1 ó 2 ( ( )) = ó 2 ó = ó ó = ( ) 2 | | ó1+ 1 ( + 1/ )ó ó( + 1)2 ó | | ó z z ó ó z ó m g z ó 2 ó ó ó m z and similarly for the left half-plane. z z z It follows from Observation 4.1 that under repeated application of the Newton map , any point in the right or left half-plane converges to +1 or −1, respectively. g 4.2 Exact Arithmetic

In this section, we set and k k for all . In the case of exact arithmetic, k Observation 4.1 implies0 global convergence+1 of the Newton iteration when is diagonalizable. For the convenience ofA the∶= readerA weA provide∶= g this(A ) argumentk ≥ (due0 to [Rob80]) below. A A Proposition 4.2. Let be a diagonalizable matrix and assume that C for some . Then for every we have the guarantee A ℕ n × n Λ(A) ⊂ , N ∈ (0 1) ∈ 2 N 4 N V ‖ sgn( )‖ 2 +1 ( ) N + 1 A A ≤ ⋅  A . Moreover, when does not have eigenvalues− on the imaginary axis the minimum for which C is given by Λ( ) A 4 i A ⊂ 2 max 1− i n Re( ( )) = < i 2 = 1 (| ) − sgn(A |)

≤ ≤ | A A |

29 n N Proof. Consider the spectral decomposition i i iwi∗, and denote by i( ) the eigenvalues =1 of AN . = A ∑  v N By Observation 4.1 we have that AN ⊂ C and i i( ) . Moreover, AN and Λ( ) 2 sgn( ) = sgn( ) A have the same eigenvectors. Hence N sgn( )

N N AN A i( ) viwi∗ i( ) viwi∗ . (26) − sgn( ) ô  > ( − 1) ô + ô  < ( + 1) ô ôRe( ) 0 ô ôRe( ) 0 ô ‖ ‖≤ ô ∑ ô ô ∑ ô ô i ô ô i ô Now we will use that for any matrixô X we have thatô Xô V X spr X whereô spr X denotes the of X. Observe that the spectral radii of the( two) matrices( ) appearing( ) on the right hand side of (26) are bounded by maxi i − i ‖, which‖≤ in turn is bounded by the radius of the circle C , namely +1 . On thesgn( other) hand, the eigenvector condition number + 2 2 2 2 N /( N + 1) | | of these matricesN is bounded by V A . This concludes the first part of the statement. In order to compute note that( if)z with , then =x+ x > 0

2 iy 2 2 (1−x) + y 4x ( ) = 2 2 =1− 2 2 , (1+x) + y (1+x) + y |m z | and analogously when < and we evaluate −2. x 0 ( ) The above analysis becomes useless when|m tryingz | to prove the same statement in the frame- work of finite arithmetic. This is due to the fact that at each step of the iteration the roundoff error can make the eigenvector condition numbers of the k grow. In fact, since V k is sensitive to infinitesimal perturbations whenever k has a multiple eigenvalue, it seems difficult( ) to control it against adversarial perturbations as the iteration convergesA to k (which hasA very high mul- tiplicity eigenvalues). A different approach,A also due to [Rob80sgn(], yields) a proof of convergence in exact arithmetic even when is not diagonalizable. However, thatA proof relies heavily on the fact that N is an exact power of , or more precisely, it requires the sequence k to have the same generalized( ) eigenvectors,A which( 0) is again not the case in the finite arithmetic setting. Therefore,m A a robust version, tolerantm A to perturbations, of the above proof is needed. ToA this end, instead of simultaneously keeping track of the eigenvector condition number and the spectrumof the matrices k, we will just show that for certain k , the k pseudospectra of these matrices are contained in a certain shrinking region dependent0 on . This− invariant is inherently robust to perturbationsA smaller than k, unaffected by clustering > of eigenvalues due to convergence, and allows us to bound the accuracy and other quantities ofk interest via the functional calculus. For example, the following lemma shows how to obtain a bound on N solely using information from the pseudospectrum of N . − sgn( ) ‖A A ‖ Lemma 4.3 (Pseudospectral Error Bound)A. Let be any matrix and let N be the th iterate of the Newton iteration under exact arithmetic. Assume× that N and N satisfy  N C . 
Then we have the guarantee A n n 0 A∈ (0 1) N Λ ( )  > , N N A ⊂ N2 N 8 (27) − sgn( ) N 2 N N (1 − ) (1 + ) ‖A A ‖≤ .  30 Proof. Note that N . Using the functional calculus we get sgn( ) = sgn( ) A A N N 1 z z AN −1 dz 1 z AN −1 dz z AN −1 dz − sgn( ) = ô i )C ( − ) − i )C ( − ) − )C ( − ) ô ô2 2 + − ô ‖A A ‖ ô ô ô  N  N  N ô ô H Iô ô 1 z z AN −1 z AN −1 dz 1 z z AN −1 z AN −1 dzô i )C i )C = ô + ( − ) −( − ) + − ( − ) +( − ) ô ô2 2 ô ô ô ô  N  N ô ô1 z z AN −1 dz 1 z z AN −1 dz ô  )C  )C ô + ( − 1)( − ) ô + ô − ( + 1)( − ) ô 2 ô ô 2 ô ô ≤ ô ô ô ô ô NC ô C ô N ô ô1 ) + z z ô + ô1 ô 2  ( ) sup{ − 1 ∶ ∈ }N 2 N N ≤ ⋅ N ‹ N | | 4 1+ 1 = N2 N − 1 N 1− 1− N2 08 1. = N 2 N N (1 − ) (1 + )

In view of Lemma 4.3, we would now like to find sequences k and k such that

 Ak ⊂ C Λ ( ) k k and k2 k converges rapidly to zero. The dependence of this quantity on the square of k turns out to be/ crucial. As we will see below, we can find such a sequence with k shrinking roughly at the same rate as k. This yields quadratic convergence, which will be necessary for our bound on the required machine precision in the finite arithmetic analysis of Section 4.3. The lemma below is instrumental in determining the sequences k,k.

Lemma 4.4 (Key Lemma). If  A ⊂ C , then for every ¨ > 2, we have  C where Λ ( ) Λ ¨ ( ( )) ¨ ¨ 2 2 g A ⊂ ¨ ( − )(1 − ) ∶= 8   .

Proof. From the definition of pseudospectrum, our hypothesis implies −1 <  for every ( − ) 1/ z outside of C . The proof will hinge on the observation that, for each ¨ 2, , this resolvent bound allows us to bound the resolvent of everywhere in the Appolonian‖ z∈(A annulus‖) A . , ¨ Let w A ; see Figure 3 for an illustration.( ) We must show that w . Since , ¨ ¨ w C , Observation∈ 4.1 ensures no z gCA satisfies w; in other words,∉ Λ ( the( )) function ∉ 2 ∈ ( ) = g A w −1 is holomorphic in on C . As  C , Observation 4.1 also guarantees that( − ( )) C . Thus for w in the unionΛ( of) theΛ two( g) Appolonianz annuli in question, we can 2 calculateΛ(g z( the)) resolvent of atz w using the holomorphicA ⊂ A ⊂ functional calculus: g A ⊂ ( ) g A w −1 1 w −1 −1d ( − ( )) = i )C ( − ( )) ( − ) 2 g A g z z A z,  31 +

C 0 5 + . ¨ + ( ) C2 1 2 g z C w −0 5 z .

Figure 3: Illustration of the proof of Lemma 4.4

where by this we mean to sum the integrals over C+ and C− , both positively oriented. Taking norms, passing inside the integral, and applying Observation 4.1 one final time, we get: ) )

w −1 1 w −1 −1 d ( − ( ))  )C ( − ( )) ( − ) 2 ô g A ô ≤ C | g zw |⋅‖y z A )‖C z w y ô ô + y C+ −1 − y C− −1 ( )sup ∈ 2 ( − ) + (( )sup ∈ 2 ( − )  ‹ ) | 2| ‹ | | ≤ 1 8 .  ¨ 2 2 ( − )(1 − ) ≤ In the last step we also use the forthcoming Lemma 4.5. Thus, with ¨ defined as in the theorem statement, A , contains none of the  -pseudospectrum of . Since C , Theorem ¨ ¨ 2 2.3 tells us that therecan be no -pseudospectrum in the remainder( ) of Λ(C ,( as)) such a connected ¨ ¨ component would need to contain an eigenvalue of . g A g A ⊂  ( ) ℂ⧵ Lemma 4.5. Let > be given. Then for any g AC and C , we have . 1 0 x ∈ ) y ∈ ) x−y ( − )/2 Proof. Without loss> of , generality C+ and C+ . Then we have | |≥ x ∈ ) y ∈ )

2 x − y − = (x) − ( ) (x) − ( ) = 2 x − y 1 +| x 1 +| y | | ||m | |m y || ≤ |m m y | ≤ | |. | || |

Lemma 4.4 will also be useful in bounding the condition numbers of the k , which is necessary for the finite arithmetic analysis. A Corollary 4.6 (Condition Number Bound). Using the notation of Lemma 4.4, if  C , then Λ ( ) A ⊂ −1 1 and 4 2 (1 − ) ‖A ‖≤ ‖A‖ ≤ .  32  Proof. The bound −1 follows from the fact that C  In order to bound we use the contour integral bound1/ 0 ∉ Λ ( ) ‖A ‖ ≤  ⊃ A . A

1 z z A −1 dz = ô i )C ( − ) ô ô2 ô ‖A‖ ô C ô ô )  ô ô ( ) z 1 ô  zsup)C  ‹ 2 ∈ ≤ | | 4 1+0 1 . 1 = 2  1− 1−

Another direct application of Lemma 4.4 yields the following.

Lemma 4.7. Let  > . If  A ⊂ C , and >D> then for every N we have the guarantee 0 Λ ( ) 1/ 1  AN ⊂ C , Λ ( ) NN N  D 2 for N D 2 D and N ( −1)(1−D ) . N =( ) / = N 8  Proof. Define recursively ,  , k D k2 and k 1 k k D 2 . It is easy to , 0 0 +1 +1 8 0 0 0 see by induction that this definition= = is consistent= with the definition= ( of− 1)(1N and − N) given in the statement. We will now show by induction that  Ak ⊂ C . Assume the statement is true for k, so from Lemma 4.4 we have that the statementΛ ( is also) true for Ak if we pick the pseudospectral k k parameter to be +1 k k2 k2 ¨ k ( +1 − )(1 − ) 1k k D k2 . = k = ( − 1)(1 − ) 8 8 On the other hand

1k k D k2 1k k D 2 k , ( − 1)(1 − ) ( − 1)(1 − 0 )= +1 8 8 which concludes the proof of the first statement.≥ We are now ready to prove the main result of this section, a pseudospectral version of Propo- sition 4.2.

n n Proposition 4.8. Let A × be a diagonalizable matrix and assume that  A ⊂ C for some ∈ Λ ( ) , . Then, for any

33 Proof. Using the choice of k and k given in the proof of Lemma 4.7 and the bound (27), we get that

 N2 AN A 8 − sgn( ) N 2 N N (1 − ) (1 + ) N ‖ ‖≤  N D 8 0 8 =  N 2 N D 2 0(1 − ) (1 + ) ( − 1)(1 − 0 ) N 0D3 1 D D 2 8 0 8 =( 0) N D D 2 2 D D 2  D 2 ( −( 0) N ) ( +( 0) N ) 0N ( − 1)(1 − 0 ) D2 D 0 1 D 2 8 0 8 ( 0) N D 2 D 2 ( − 1) 0 ( − 1)(1 − 0 ) ≤ N  202 D 1 +2 D 2 0(1 − 0 ) 8 , =( 0) N  D 2 8 0 ( − 1)(1 − 0 ) where the last inequality was taken solely to make0 the expression1more intuitive, since not much is lost by doing so.

4.3 Finite Arithmetic Finally, we turn to the analysis of SGN in finite arithmetic. By making the machine precision small enough, we can bound the effect of roundoff to ensure that the parameters k, k are not too far from what they would have been in the exact arithmetic analysis above. We will stop the iteration before any of the quantities involved become exponentially small, so we will only need , , bits of precision, where is the accuracy parameter. polylog(1 − 0 0 ) In exact arithmetic, recall that the Newton iteration is given by Ak k 1 k −1k Here we will consider the finite arithmetic version G of the Newton map+1 = , defined( ) = 2 as( G+ ) A where A is an adversarial perturbation coming from the round-offg A error. HencA ( e,A) the. G g A sequence of interest is given by and k G k . ∶= k g(AIn)+ thisE subsectionE we will prove0 the following+1 theorem concerning the runtime and precision ž ž ž ž of SGN. Our assumptions on theA size∶= A of theA parameters∶= (A ) are in place only to simplify the A analysis of constants; these assumptions are not required for0 the execution of the algorithm. , Theorem 4.9 (Main guarantees for SGN). Assume INV is a INV n ,cINV -stable matrix inversion algorithm satisfying Definition 2.7. Let  , , , , and assume A Až has its  - 0 0 0 pseudospectrum contained in C where < < . Run( SGN( )with) 0 0 ∈ (0 1) ∈ (0 1/12) = N 0 1− 1/100  . 0 0 0 ¡ iterations (as specified in⌈ the statement of the algorithm). Then AN SGN A satisfies⌉ the advertised accuracy guarantee = lg(1/(1 − )) + 3lglg(1/(1 − )) + lg lg(1/( ))+7 59 ¡ AN A = ( ) sgn( ) ‖ − ‖≤ 34 when run with machine precision satisfying

+1 cINV n 2 ( log +3) u 0 N , INV n nN 2 ( ) ≤ √ corresponding to at most

u O n 3  lg(1/ )= (log log (1/(1 − 0))(log(1/ ) + log(1/ 0))) required bits of precision. The number of arithmetic operations is at most

N n2 TINV n . (4 + ( )) Later on, we will need to call SGN on a matrix with shattered pseudospectrum; the lemma below calculates acceptable parameter settings for shattering so that the pseudospectrum is con- tained in the required pair of Appolonian circles, satisfying the hypothesis of Theorem 4.9.

Lemma 4.10. If A has -pseudospectrum shattered with respect to a grid g grid z ,!,s , s that 0 1 2 includes the imaginary axis as a grid line, then one has  A ⊆ C where =  ( and ) Λ 0 ( ) 0 0 = /2  . 0 =1− g 2 diam( ) In particular, if  is at least n and !s and !s are at most n), then  and are also at least n . 1/poly( ) 1 2 poly( 0 1− 0 1/poly( ) Proof. First, because it is shattered, the  -pseudospectrum of A is at least distance  from g. Recycling the calculation from Proposition/24.2, it suffices to take /2

4 z 2 max 1− . z A Re 0 = z z 2 ∈Λ /2( ) −| sgn(| )  From what we just observed about the pseudospectrum,0 | we| can1 take z  . To bound the denominator, we can use the crude bound that any two points inside theRe grid are/2 at distance no more than g . Finally, we use for any . | | ≥ diam( ) √1 − x 1 − x/2 x ∈ (0,1) The proof of Theorem 4.9 will proceed≤ as in the exact arithmetic case, with the modification that k must be decreased by an additional factor after each iteration to account for roundoff. At each step, we set the machine precision u small enough so that the k remain close to what they would be in exact arithmetic. For the analysis we will introduce an exp licit auxiliary sequence k that lower bounds the k, provided that u is small enough. e Lemma 4.11 (One-step additive error). Assume the matrix inverse is computed by an algorithm INV satisfying the guarantee in Definition 2.7. Then G for some error matrix with norm ( ) = ( )+ AcINV gn A E E −1 INV log −1 u (28) + + ( ) ( ) 4√ ‖E‖ ≤ ‖A‖ ‖A ‖  n  A ‖A ‖ n . 35 The proof of this lemma is deferred to Appendix A. With the error bound for each step in hand, we now move to the analysis of the whole iteration. It will be convenient to define , which should be thought of as a small parameter. As 0 in the exact arithmetic case, for ∶= 1−we will recursively define decreasing sequences k and k maintaining the property s 1 s k ≥ , k C for all (29) Λ ( ) 0 by induction as follows: k Až ⊂ k k ≥

1. The base case holds because by assumption,  C . = 0 Λ 0 0 2. Here we recursivelyk define k . Set ⊂ k +1

k k2 +1 ∶= (1+ /4) In the notation of Subsection 4.2, this correspondss to. setting . This definition = 1+ /4 ensures that k2 k k for all , and also gives us the bound . We also have the closed+1 form D (1 + s/4) 0 1− /2 ≤ ≤ k s ≤ s k 2 −1 2 = (1+ /4) k 0 k which implies the useful bound s , k 2 (30) (1 − /2) k 3. Here we recursively define k . Combining ≤ Lemmas 4.4. , the recursive definition of k , and the fact that +1 , we find that C , where +1 k2 2 ¨ k 1− 1− 0 1− 0 = Λ ( ) +1 ≥ ≥ s g Až ⊂ k k k2 k2 k k2  k 2 ¨ k +1 − (1 − ) k (1 − ) k = k = 8  s 32 32s    ≥ . Thus in particular k C  s2 Λ /32 ( ) +1 Since G , fork k some error matrixk arising from roundoff, Proposi- k k k k g Až  ⊂ . k k tion 2.2 ensures+1 = ( that) = if we( set)+ Až Až g Až E 2 k E E k k k (31) k +1 ∶= − s32 we will have  k C as desired. ‖E ‖  Λ +1 ( +1) +1 k k We now need to showAž that⊂ the k, do not decrease too fast as increases. In view of (31), it will be helpful to set the machine precision small enough to guarantee that k is a small fraction s  k of k 2 . 32k ‖E ‖ First, we need to control the quantities k , k , and k k k appearing in our  −1 −1 upper bound (28) on k from Lemma 4.11, as functions of k.( By) Corollary = 4.6, we have ‖Až ‖ ‖Až ‖  Až ‖Až ‖‖Až ‖ ‖E ‖ k  k−1 1 and k 4 k 4 k 2 k 2 k (1 − ) ‖Až ‖≤ ‖Až ‖≤ ≤ .   s  36 Thus, we may write the coefficient of u in the bound (28) as

cINV n log  4 1 INV 4 1  ∶= 2 k + k + ( ) 2 k2 k 4 K k  n √n K k so that Lemma 4.11 reads s   0s  1  L M k  u (32)

Plugging this into the definition (31) of k ,we havek +1‖E ‖ ≤ K . 2 k k k  u (33) +1 − s32 k Now suppose we take u small enough ≥ so that K .

2 k  u 1 k (34)

k 3 s32 For such u, we then have K ≤  . 2 k k 2 k (35) +1 3 s32 which implies  ≥  , k 1 k (36) +1 this bound is loose but sufficient for our purposes.2 Inductively, we now have the following bound ‖E ‖ ≤  ; on k in terms of k:

Lemma 4.12 (Preliminary lower bound on k). Let , and for all , assume u satisfies the requirement (34):  k ≥ ≤i≤k 2 i 0 0 − 1  u i

Then we have i s K ≤ 1 .k 3 322 k k k k 0 s In fact, it suffices to assume the hypothesis ≥e only for . . e ∶= 0 1 50 Proof. The last statement follows from the fact thati ikis decreasing in and  is increasing in . Since (34) implies (35), we may apply (35) repeatedly= − to 1 obtain  i K i i k k −1 k 2 i 0 i =0  ≥ s k k ( 2/48) ∏ 2 −1− 2 −1 by the definition of i 0 k k k 0 2 k =  (s /48) (1 + s/4) 0 s 0  k = 0 2 s 1 48(1 + k/4) < 0 0 s ≥ . ≤ , s 0 1 1 1/8 50 37 We now show that the conclusion of Lemma 4.12 still holds if we replace i everywhere in the hypothesis by ei, which is an explicit function of  and defined in Lemma 4.12. Note that we 0 0 do not know i ei a priori, so to avoid circularity we must use a short inductive argument.

Corollary 4.13≥(Lower bound on k with explicit hypothesis). Let k , and for all i k , assume u satisfies ≥ ≤ ≤ s2 i 0 0 − 1 Ke u ei (37)

i where ei is defined in Lemma 4.12. Then we have≤ 1 3 32 k ek.

In fact, it suffices to assume the hypothesis only for≥ i k .

Proof. The last statement follows from the fact that ei is decreasing in i and Ke is increasing in i. Assuming the full hypothesis of this lemma, we= prove− 1 e for i k by induction on i. i i i For the base case, we have  e  . 0 0 0 0 ≥ ≤ ≤ For the inductive step, assume i ei. Then as long as i k , the0 hypothesis of this lemma implies ≥ = ≥ ≤ s2 i − 1 K u i ,

i so we may apply Lemma 4.12 to obtain i ≤ei 1 , as desired. +1 +1 3 32 Lemma 4.14 (Main accuracy bound). Suppose≥ u satisfies the requirement (34) for all k N . Then N N k ≤ ≤ ž −1 0 AN A 8 8 N50 2 (38) sgn( ) s k k2 + 2 +2 (1 − /2) N =0 ‖E+1‖ ⋅ 0 Proof. Since ,‖ for− every we‖≤ have∑ s . sgn = sgn  s  k ◦g k k k k k k k sgn( +1) − sgn( ) = sgn( +1) − sgn( ( )) = sgn( +1) − sgn( +1 − ) From the‖ holomorphicA¡ functionalAž ‖ ‖ calculusA¡ we cang rewriteAž ‖ ‖ kA¡ kA¡ k E as‖. the norm of a certain contour integral, which in turn can be boundedsgn( as follows+1)−sgn(: +1 − ) ‖ A¡ A¡ E ‖ ¡ ¡ 1 z Ak −1 z Ak k −1 k −1 k k −1  )C +1 +1 )C +1 +1 ô + [( − ) −( −( − )) ] − − [( − ) −( −( − )) ] ô 2 ô +1 +1 ô ô E dz z A¡ z A¡ E dzô ô k ¡  k ô 1 ô z Ak k −1 k k −1 k k −1 k k −1 ô  )C +1 +1 )C +1 +1 = ô + [( −( − )) ( − ) ] − − [( −( − )) ( − ) ] ô 2 ô +1 +1 ô ô E E z A¡ dz z A¡ E E z A¡ dzô ô k ¡  k ô 1 ô z Ak k −1 k k −1 ô  C +1 +1 ) + ( −( − )) ( − ) ≤ +1 ‖ E ‖‖E ‖‖ z A¡ ‖ dz  k 1 )C + k 1 1  ( +1 ) k k k k +1 − +1 ≤ ‹ k ‖E ‖ 4 +1 k  1 ‖E ‖1 = k2 k k k 1− +1 +1 − +1 ‖E ‖ ,  ‖E ‖  38 where we use the definition (6) of pseudospectrum and Proposition 2.2, together with the property (29). Ultimately, this chain of inequalities implies

k k k k 4 +1 k 1 1 sgn( +1) − sgn( +1 − ) k2 k k k 1− +1 +1 − +1 ‖ A¡ A¡ E ‖≤ ‖E ‖ . Summing over all and using the triangle inequality, we obtain ‖E ‖ 

N k −1 k N 4 +1 k 1 1 sgn( ) − sgn( 0) k k2 k k k =1 +1 +1 N 1− +1 − ‖ A¡ Až ‖≤ ∑ k ‖E ‖ 8 −1  ‖E ‖  k k2 =0 ‖E+1‖ ≤ ∑ , where in the last step we use k and sk2  , as well as (36). By Lemma 4.3, we have 1 1− +1 ≤ ≥ s N2 N N 8 − sgn( ) N 2 N N (1 − ) (1 + ) ‖A¡ A¡ ‖≤ N 8 N  N 2 ≤ N s8 N 1 50 2 2 0 ≤ N s  0 s 1 8 2 50 2 (1 − /2) N 2 0 ≤ N s 0 1 s8 N50 2 s 2 +2 (1 − /2) N ⋅ 0 ≤ s . where we use < in the last step. s  Combining the1/2 above with the triangle inequality, we obtain the desired bound. s

We would like to apply Lemma 4.14 to ensure A¡N A is at most , the desired accuracy parameter. The upper bound (38) in Lemma . is the− sum sgn( of) two terms; we will make each term less than . The bound for the second term4 14 will‖ yield a sufficient‖ condition on the number of iterations N/2. Given that, the bound on the first term will then give a sufficient condition on the machine precision u. This will be the content of Lemmas 4.16 and 4.17. We start with the second term. The following preliminary lemma will be useful: Lemma 4.15. Let >t> and >c> be given. Then for 1/800 0 1/2 0 j t t c . , lg(1/ ) + 2 lg lg(1/ ) + lg lg(1/ ) + 1 62 we have ≥ t 2 (1 − j ) j < c. t2 39 The proof is deferred to Appendix A. Lemma 4.16 (Bound on second term of (38)). Suppose we have

N s s s2 . . lg(8/ ) + 2 lg lg(8/ ) + lg lg(16/( 0))+1 62 Then ≥ N

8 N50 s 2 . s2 +2 (1 − /2) N /2 ⋅ 0 Proof. It is sufficient that ≤ N

8 N64 s 2 . s2 +2 (1 − /8) N /2 ⋅ 0 The result now follows from applying Lemma 4.15 with≤ c s2 and t s . = 0/16 = /8

Now we move to the first term in the bound of Lemma 4.14. Lemma 4.17 (Bound on first term of (38)). Suppose

N s s s2 . , lg(8/ ) + 2 lg lg(8/ ) + lg lg(16/( 0))+1 62 and suppose the machine precision≥ u satisfies

+1 cINV n s 2 ( log +3) u (1 − ) N . INV n nN 2 ( ) Then we have ≤ √ N k 8 −1 . s k k2 /2 =0 ‖E+1‖ ∑ ≤ Proof. It suffices to show that for all k N , 0 − 1 ≤ ≤  s k2 . k N+1 16 ‖E ‖≤ In view of (32), which says k  u, it is sufficient to have for all 0 − 1 k ‖E ‖ ≤ K k2 s ≤k≤N u 1 +1 . (39)  N 16 ≤ For this, we claim it is sufficient to have for allK k k N 0 − 1 e≤ k2 ≤s u 1 +1 . (40) Ke N 16 ≤ k Indeed, on the one hand, since < and by the loose bound ek < s k < s k we have that +1 +1 s2e 1/6 (40) implies u K1 , which means that the assumption in Corollary 4.13 is satisfied. On the 3 32k ek ≤ 40 other hand Corollary 4.13 yields ek k for all k N , which in turn, combined with (40) would give (39) and conclude the proof. 0 We now show that (40) holds for≤ all k N ≤ . Because≤ Ke and ek are decreasing in k, 0 − 1 1/ it is sufficient to have the single condition k ≤ ≤ eN2 s u 1 . Ke N 16 We continue the chain of sufficient conditions≤ onN u, where each line implies the line above:

eN2 s u 1 Ke N 16 ≤ N e s u N2 1 cINV n log N s 4e e1 INV n s 4e e1 n 16 ≤ 2 + + ( ) 2 2 4 N N  N √ N e s u 4  N2 5 1 cINV n log +1 N INV n s 4e n 6 ( ) 2 4 16 ≤ cINV n N e √s log +3 u  N 2 . INV n nN 6 4 16 ( ) 4 ≤ √ where we use the bound e1 ⋅ ⋅ s 4e without much0 loss,1 and we also assume INV n and 2 2 ( ) 1 cINV n for simplicity.N ≤ N ≥ Substitutinglog 1 the value of eN as defined in Lemma 4.12, we get the sufficient condition ≥ N cINV n  s2 N s2 log +3 u 0( /50) . INV n nN 384 ( ) 4 ≤ √ Replacing N by the smaller quantity s and cleaning up the constants yields the 2N0 2N 1 sufficient condition 0 = (1− )

N cINV n  s2 s 2 s2 log +3 u 0( /50) (1 − ) N . INV n nN 400 ( ) 4 Now we finally will≤ use our hypothesis√ on the size of N to simplify this expression. Applying 0 1 Lemma 4.16, we have s N 2N  s2 4(1 − ) . 0( /50) /4 Thus, our sufficient condition becomes ≥

cINV n +1 s 2 log +3 u 4(1 − ) N . INV n nN 400 ( ) ≤ √ To make the expression simpler, since cINV0 n 1we may pull out a factor of 4 > and remove the occurrences of to yield the sufficientlog + 3 condition4 4 200 ≥ 41 +1 cINV n s 2 ( log +3) u (1 − ) N . INV n nN 2 ( ) ≤ √

Matching the statement of Theorem 4.9, we give a slightly cleaner sufficient condition on N that implies the hypothesis on N appearing in the above lemmas. The proof is deferred to Appendix A.

Lemma 4.18 (Final sufficient condition on N ). If

N s s  . , = lg(1/ ) + 3 lg lg(1/ ) + lg lg(1/( 0))+7 59 then ⌈ ⌉

N s s s2 . . lg(8/ ) + 2 lg lg(8/ ) + lg lg(16/( 0))+1 62 Taking the logarithm of≥ the machine precision yields the number of bits required: Lemma 4.19 (Bit length computation). Suppose

N s s  . = lg(1/ ) + 3 lg lg(1/ ) + lg lg(1/( 0))+7 59 and ⌈ ⌉ +1 cINV n s 2 ( log +3) u (1 − ) N . INV n nN 2 ( ) Then ≤ √

u O n s 3  . log(1/ )= log log(1/ ) (log(1/ ) + log(1/ 0)) Proof. Immediately we have 

N u O INV n n N n +1 s . log(1/ )= log(1/ ) + log ( ) + log + log + (log )2 log(1/(1 − )) N We first focus on the term +1 s . Note that s O s . Thus,  2 log(1/(1 − )) log(1/(1 − )) = ( ) N s  . +1 s s 3 lglg(1/ )+lglg(1/( 0))+9 59 O s O s 3  . 2 log(1/(1 − )) = (1/ ) 2 ( )= (log(1/ ) (log(1/ ) + log(1/ 0))) Using that INV n n and⋅ discarding subdominant⋅ terms, we obtain the desired bound. ( )=poly( )

This completes the proof of Theorem 4.9. Finally, we may prove the theorem advertised in Section 1.

Proof of Theorem 1.5. Set  K1 , . Then  A does not intersect the imaginary axis, and min{ 1} Λ ( ) furthermore  A ⊆ D , because A . Thus, we mayapplyLemma 4.10 with g to obtain parametersΛ ( ) (0,2)with∶= the property1 that and  arediam( both O) = 4K√2. Theorem 4.9 now yields0 the0 desired conclusion.‖ ‖≤ log(1/(1 − 0)) log(1/ 0) (log )

42 5 Spectral Bisection Algorithm

In this section we will prove Theorem 1.6. As discussed in Section 1, our algorithm is not new, and in its idealized form it reduces to the two following tasks:

Split: Given an n n matrix A, find a partition of the spectrum into pieces of roughly equal size, and output× spectral projectors onto each of these pieces. ± Deflate: Given an rank- projector , outputP an matrix with orthogonal columns that span the× range of . × n n k P n k Q These routines in hand, on inputP one can compute and the corresponding , and then ± ± find the eigenvectors and eigenvalues of ∗ . The observation below verifies that this recursion is sound. A ± ∶= ± ± P Q A Q AQ Observation 5.1. The spectrum of is exactly , and every eigenvector of is of the form for some eigenvector of one of Λ(. +) Λ( −) ± A ± A ⊔ A A The difficulty,Q v of course, is thatv neither ofA these routines can be executed exactly: we will never have access to true projectors , nor to the actual orthogonal matrices whose columns span their range, and must instead make± do with approximations. Because our± algorithm is re- cursive and our matrices nonnormal,P we must take care that the errors in theQ sub-instances do not corrupt the eigenvectors and eigenvalues we are hoping to find. Additionally, the Newton± iteration we will use to split the spectrum behaves poorly when an eigenvalue is close to theA imaginary axis, and it is not clear how to find a splitting which is balanced. Our tactic in resolving these issues will be to pass to our algorithms a matrix and a grid with respect to which its -pseudospectrum is shattered. To find an approximate eigenvalue, then, one can settle for locating the grid square it lies in; containment in a grid square is robust to perturbations of size smaller than . The shattering property is robust to small perturbations, inherited by the subproblems we pass to, and—because the spectrum is quantifiably far from the grid lines—allows us to run the Newton iteration in the first place. Let us now sketch the implementations and state carefully the guarantees for SPLIT and DEFLATE; the analysis of these will be deferred to Appendices B and C. 
Our splitting algorithm is presented a matrix whose -pseudospectrum is shattered with respect to a grid g. For any vertical grid line with real part , gives the difference between the number of eigen- values lying to its leftA and right. AsTr sgn( − ) ℎ A ℎ SGN SGN Tr ( − ) − Trsgn( − ) ( − ) − sgn( − ) we can determine| these eigenvalueA ℎ countsAexactlyℎ |≤n‖by runningA SGNℎ to accuracyA ℎ ‖, and round- ing SGN to the nearest integer. We will show in Appendix B that, by mounting(1/ ) a binary searchTr over( horizontal− ) and vertical lines of g, we will always arrive at a partitionO ofn the eigenval- ues into twoA partsℎ with size at least . Having found it, we run SGN one final time at the desired precision to find the approximatemin{ /5 spectral1} projectors. n ,

43 SPLIT n n Input: Matrix × , pseudospectral parameter , grid g grid , and desired accuracy ∈ = ( 0 1 2) Requires:  A is shattered with respect to g, and . n Algorithm:Λ (žA)g ℂ SPLIT g  0 05/ z ,!,s , s ( ± ± ±) = ( ) ≤ 1. ExecuteP a, binary, n search overA, ,horizontal, grid shifts ℎ until

 SGN A ℎ,  , , n . Tr − /4 1 − g 3 /5 2 diam( )2 ≤ 2. If this fails, set A iA and repeat0 with vertical grid shifts 1 ← 3. Once a shift is found,

ž 1 SGN I , ± ← 2 − /4 1 − g ± 2 diam( )2 P A ℎ,  , , and g are set to the two subgrids0 0 1 1 ± ž n n Output: Two matrices × , two subgrids g , and two numbers Ensures: Each subgrid g± ∈contains eigenvalues± of , , and± , where are the true spectral projectorsP for± theℂ eigenvalues± in the subgrids ±g respectively./5 n ± − ± ± n A n ±≥ n ‖Pž P ‖ ≤ P

Theorem 5.2 (Guarantees for SPLIT). Assume INV is a INV INV -stable matrix inversion algo- rithm satisfying Definition 2.7. Let , . n, and( A ) and g have side lengths of at most , and define 0 5 0 05/  ,c 4 8  ≤ . ≤ ‖ ‖ ≤ NSPLIT 256 256 4 . . NSPLIT ∶= lg  + 3 lg lg  + lg lg  + 7 59 Then SPLIT has the advertised guarantees when run on a floating point machine with precision

SPLIT  +1 cINV n ⎧ 2N ( log +3)  u uSPLIT 1− 256 , , uSPLIT ∶= min INV n nNSPLIT n ⎪ 2 ( ) 100 ⎪⎫ ≤ ⎪  ⎪ Using at most ⎨ √ ⎬ ⎪ ⎪ TSPLIT n, g,, ⎩ 1 NSPLIT TINV n O n⎭2 TSPLIT ( ) 12 lg ! g ( )+ ( ) ( ) arithmetic operations. The number of≤ bits required⋅ is ⋅  uSPLIT O n 3 256 1 4 . lg 1/ = log log  log + log  Deflation of the approximate projectors we obtain from SPLIT amounts to a standard rank- revealing QR factorization. This can be0 achieved deterministically0 in11O n3 time with the classic algorithm of Gu and Eisenstat [GE96], or probabilistically in matrix-multiplication( ) time with a variant of the method of [DDH07]; we will use the latter.

44 DEFLATE n n Input: Matrix × , desired rank , input precision , and desired accuracy Requires: ∈ for some rank- projector . › 1 Algorithm: −P DEFLATEℂ 4 k ‖ = ‖ ≤ ≤ ( ) k P  › 1. PQž P Haar unitaryP, k, × + ,1 2. n nQR ∗ E H( ←) ( ) 3. U , R first columnsP of . ← H Output: A tall matrixk n kU ž × Ensures:Q ←There exists a matrix∈ n k whose orthogonal columns span range , such that , Qž ℂ × n 3 ∈ ( ) − with probability at least (20 ) . ž 1 − Q2 √ ℂ P ‖Q Q‖  ≤ Theorem 5.3 (Guarantees for DEFLATE). Assume MM and QR are matrix multiplication and QR factorization algorithms satisfying Definitions 2.6 and 2.8. Then DEFLATE has the advertised guarantees when run on a machine with precision:

u u DEFLATE › ∶= min max( QR( ) MM( )) 2 QR( ) <  = uDEFLATE ≤ 4 , . The number of arithmetic operations is at most: n , n  n ‖P‖  DEFLATE( )= 2 N + 2 QR( )+ MM( ) DEFLATE

Remark 5.4. The proof ofT the aboven theorem,n T whichT n is deferredT n . to Appendix C, closely fol- T lows and builds on the analysis of the randomized rank revealing factorization algorithm (RURV) introduced in [DDH07] and further studied in [BDDR19]. The parameters in the theorem are optimized for the particular application of finding a basis for a deflating subspace given an ap- proximate spectral projector. The main difference with the analysis in [DDH07] and [BDDR19] is that here, to make it applicable to complex matrices, we make use of Haar unitary random matrices instead of Haar orthogonal random matrices. In our analysis of the unitary case, we discovered a strikingly simple formula (Corollary C.6) for the density of the smallest singular value of an × sub-matrix of an × Haar unitary; this formula is leveraged to obtain guarantees that work for any and , and not only for when − 30, aswas thecasein[BDDR19]. Finally, we explicitlyr r account for finiten arithmeticn considerations in the Gaussian randomness used in the algorithm, wheren truer Haar unitary matrices cann r never ≥ be produced. We are ready now to state completely an algorithm EIG which accepts a shattered matrix and grid and outputs approximate eigenvectors and eigenvalues with a forward-error guarantee. Aside from the a priori un-motivated parameter settings in lines 2 and 3—which we promise to justify in the analysis to come—EIG implements an approximate version of the split and deflate framework that began this section.

45 EIG

Input: Matrix × , desired eigenvector accuracy , grid g grid , pseudospectral 0 1 2 guarantee , acceptable∈ m failurem probability , and global instance size= ( ) Requires: Ais shatteredℂ with respect to g, and . z ,!,s , s Algorithm:ΛEIG( ) g  n  A ( ) m n 1. If is A,, , ,,,n ≤ 1 × 1 ( ) (1 ) 2. A 2 V,ž Dž ,A 200 ← 3.  4 2 ← 6 8 (20 ) 4 4. gn gn SPLIT g ( + ← − + − + −) ( ) 5. DEFLATEn , n A, , , P›± , P› , , , ( ± ←± ) 6. ∗ n , Qž± ← ± ± + 6P›± , , 7. EIG , g . (Až ±←±Q)ž AžQž (E ± 4 /5 ± 4 /5 ) 8. n Vž , Dž ←+ + −Až −, + ,8 ,  , , 9. Vž normalizeQž Vž Qž Vž  E ← ( ) + 9 Vž Vž E 10. ← + ž ž D − D ž Output: ←Eigenvectors0 D 1 and eigenvalues Ensures: With probability at least ( , each) entry lies in the same square as exactly one ž ž eigenvalue , and each column1 − V,ofD has norm = u, and satisfies for some exact ži ži,i unit right eigenvector∈ Λ( ) .   1 ± D − i A = v›i Vž n ‖v›i vi‖  Avi ivi ≤ Theorem 5.5 (EIG: Finite Arithmetic Guarantee). Assume MM QR, and INV are numerically stable algorithms for matrix multiplication, QR factorization, and inversion satisfying Definitions , 2.6, 2.8, and 2.7. Let < 1, A ∈ × have A 3.5 and, for some  < 1, have -pseudospectrum shattered with respect to a grid g =ngridn (z ,!,s , s ) with side lengths at most 8 and ! 1. Define  ℂ 0 1 ≤2 ‖ ‖ n n n 26 ≤ NEIG . NEIG   2 49 256 256 (5 ) Then EIG has the advertised guarantees∶= lg when+ 3 run lg lg on a floating+ lg lg point machine with precision satis- fying:

lg(1/u) ≥ 2^{14.83} ( c_INV log n + 3 ) lg³( 2⁴⁸(5n)²⁶/(ϵδ) ) + lg( (5n)³⁰ max{ μ_MM(n), μ_QR(n), n } ) = O( log³( n/(ϵδ) ) log n ).

The number of arithmetic operations is at most

T_EIG(n, δ, g, ϵ, θ, n) ≤ 60 N_EIG lg( 1/(ω(g)ϵ) ) ( T_INV(n) + O(n²) ) + 10 T_QR(n) + 25 T_MM(n)
 = O( log(1/(ω(g)ϵ)) ( log(1/(ω(g)ϵ)) + log log(1/u) ) T_MM(n) ).

Remark 5.6. We have not fully optimized the large constant 2^{14.83} appearing in the bit length above.

Theorem 5.5 easily implies Theorem 1.6 when combined with SHATTER.

Theorem 5.7 (Restatement of Theorem 1.6). There is a randomized algorithm EIG which on input

any matrix A ∈ ℂⁿˣⁿ with ‖A‖ ≤ 1 and a desired accuracy parameter δ ∈ (0, 1), outputs a diagonal D and an invertible V such that

‖A − V D V⁻¹‖ ≤ δ  and  κ(V) = ‖V‖ ‖V⁻¹‖ ≤ 32 n^{2.5}/δ

in

O( T_MM(n) log²(n/δ) )

arithmetic operations on a floating point machine with

O( log⁴(n/δ) log n )

bits of precision, with probability at least 1 − 1/n − 12/n². Here T_MM(n) refers to the running time of a numerically stable matrix multiplication algorithm (detailed in Section 2.5).

Proof. Given A and δ, consider the following two step algorithm:

1. (X, g, ϵ) ← SHATTER(A, δ/8).
2. (V, D) ← EIG(X, δ′, g, ϵ, 1/n, n), where

δ′ := δ³ / (6 ⋅ 128² ⋅ n⁵).

We will show that this choice of δ′ guarantees

‖X − V D V⁻¹‖ ≤ δ/2.

Theorem 3.13 implies that X is diagonalizable with probability one; write X = W C W⁻¹ with C diagonal. Moreover,

κ(W) = ‖W‖ ‖W⁻¹‖ ≤ 8n²/δ

when W is normalized to have unit columns, by (11) (where we are using the proof of Theorem 3.6), with probability at least 1 − 12/n².

Since ‖X‖ ≤ ‖A‖ + ‖A − X‖ ≤ 1 + δ/2 ≤ 3.5 from Theorem 3.13, the hypotheses of Theorem 5.5 are satisfied. Thus EIG succeeds with probability at least 1 − 1/n. Taking a union bound with the success of SHATTER, we have V = W + E for some ‖E‖ ≤ δ′√n, so

‖V‖ ≤ ‖W‖ + δ′√n ≤ 2√n,

as well as

σ_min(V) ≥ σ_min(W) − ‖E‖ ≥ δ/(8n²) − δ′√n ≥ δ/(16n²),

since our choice of δ′ satisfies δ′√n ≤ δ/(16n²). This implies that

κ(V) = ‖V‖ ‖V⁻¹‖ ≤ 2√n ⋅ 16n²/δ = 32 n^{2.5}/δ,

establishing the last item of the theorem. We can control the perturbation of the inverse as:

‖V⁻¹ − W⁻¹‖ = ‖W⁻¹(W − V)V⁻¹‖ ≤ ‖W⁻¹‖ ‖V⁻¹‖ ‖E‖ ≤ (8n²/δ)(16n²/δ) δ′√n = 128 δ′ n^{4.5}/δ².

Combining this with ‖D − C‖ ≤ δ′ from Theorem 5.5, and using ‖D‖, ‖C‖ ≤ 5, we have:

‖V D V⁻¹ − W C W⁻¹‖ ≤ ‖(V − W) D V⁻¹‖ + ‖W (D − C) V⁻¹‖ + ‖W C (V⁻¹ − W⁻¹)‖
 ≤ δ′√n ⋅ 5 ⋅ (16n²/δ) + √n ⋅ δ′ ⋅ (16n²/δ) + √n ⋅ 5 ⋅ 128 δ′ n^{4.5}/δ²
 ≤ 6 ⋅ 128 ⋅ δ′ n⁵/δ²,

which is at most δ/2 for δ′ chosen as above. We conclude that

‖A − V D V⁻¹‖ ≤ ‖A − X‖ + ‖X − V D V⁻¹‖ ≤ δ/2 + δ/2 = δ

with probability 1 − 1/n − 12/n², as desired.

To compute the running time and precision, we observe that SHATTER outputs a grid with parameters

ω = Ω( δ⁴/n⁵ ),  ϵ = Ω( δ⁵/n⁹ ).

Plugging this into the guarantees of EIG, we see that it takes

O( log(1/(ωϵ)) ( log(1/(ωϵ)) + log log(1/u) ) T_MM(n) ) = O( T_MM(n) log²(n/δ) )

arithmetic operations, on a floating point machine with

O( log³( n/(ϵδ) ) log n ) = O( log⁴(n/δ) log n )

bits, as advertised.

5.1 Proof of Theorem 5.5

A key stepping-stone in our proof will be the following elementary result controlling the spectrum, pseudospectrum, and eigenvectors after perturbing a shattered matrix.

Lemma 5.8 (Eigenvector Perturbation for a Shattered Matrix). Let Λ_ϵ(A) be shattered with respect to a grid whose squares have side length ω, and assume that ‖Ã − A‖ ≤ η < ϵ. Then, (i) each eigenvalue of Ã lies in the same grid square as exactly one eigenvalue of A, (ii) Λ_{ϵ−η}(Ã) is shattered with respect to the same grid, and (iii) for any right unit eigenvector ṽ of Ã, there exists a right unit eigenvector v of A corresponding to the same grid square, and for which

‖ṽ − v‖ ≤ √( 8ωη / (ϵ(ϵ − η)) ).

Proof. For (i), consider Aₜ = A + t(Ã − A) for t ∈ [0, 1]. By continuity, the entire trajectory of each eigenvalue is contained in a unique connected component of Λ_η(A) ⊂ Λ_ϵ(A). For (ii), Λ_{ϵ−η}(Ã) ⊂ Λ_ϵ(A), which is shattered by hypothesis. Finally, for (iii), let w∗ and w̃∗ be the corresponding left eigenvectors to v and ṽ respectively, normalized so that w∗v = w̃∗ṽ = 1. Let Γ be the boundary of the grid square containing the eigenvalues associated to v and ṽ respectively. Then, using a contour integral along Γ as in (13) above, one gets

‖ṽw̃∗ − vw∗‖ ≤ 2ωη / (π ϵ(ϵ − η)).

Thus, using that ‖v‖ = 1 and w∗v = 1,

‖ṽw̃∗ − vw∗‖ ≥ ‖(ṽw̃∗ − vw∗)v‖ = ‖(w̃∗v)ṽ − v‖.

Now, since (ṽ∗v)ṽ is the orthogonal projection of v onto the span of ṽ, we have that

‖(w̃∗v)ṽ − v‖ ≥ ‖(ṽ∗v)ṽ − v‖ = √( 1 − (ṽ∗v)² ).

Multiplying v by a phase we can assume without loss of generality that ṽ∗v ≥ 0, which implies that

1 − (ṽ∗v)² = (1 − ṽ∗v)(1 + ṽ∗v) ≥ 1 − ṽ∗v.

The above discussion can now be summarized in the following chain of inequalities:

1 − ṽ∗v ≤ √( 1 − (ṽ∗v)² ) ≤ ‖(w̃∗v)ṽ − v‖ ≤ ‖ṽw̃∗ − vw∗‖ ≤ 2ωη / (π ϵ(ϵ − η)).

Finally, note that ‖v − ṽ‖ = √( 2 − 2 ṽ∗v ) ≤ √( 8ωη / (ϵ(ϵ − η)) ), as we wanted to show.

The algorithm EIG works by recursively reducing to subinstances of smaller size, but requires a pseudospectral guarantee to ensure speed and stability. We thus need to verify that the pseudospectrum does not deteriorate too substantially when we pass to a sub-problem.
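The contour-integral representation of spectral projectors used in the proof above can be reproduced numerically. This sketch is ours (toy matrix, midpoint quadrature, numpy assumed): it forms P = (1/2πi) ∮ (zI − A)⁻¹ dz over a square contour around a single eigenvalue and checks that the result is a rank-one spectral projector.

```python
import numpy as np

def spectral_projector(A, center, half_width, m=1000):
    """Approximate P = (1/2*pi*i) * contour integral of (zI - A)^{-1} dz over
    the boundary of the square of given center and half-width, using the
    midpoint rule with m nodes per side."""
    n = A.shape[0]
    I = np.eye(n)
    corners = center + half_width * np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, 1 + 1j])
    P = np.zeros((n, n), dtype=complex)
    for a, b in zip(corners[:-1], corners[1:]):
        dz = (b - a) / m
        for k in range(m):
            z = a + (k + 0.5) * (b - a) / m   # midpoint node on this side
            P += np.linalg.solve(z * I - A, I) * dz
    return P / (2j * np.pi)

# Toy matrix with eigenvalues 0.5+0.5j and -0.7, lying in separate squares.
A = np.array([[0.5 + 0.5j, 1.0], [0.0, -0.7]], dtype=complex)
P = spectral_projector(A, 0.5 + 0.5j, 0.4)
```

P satisfies P² ≈ P and tr P ≈ 1; perturbing A by η well below the distance to the contour moves P by O(η), which is the mechanism behind the bound on ‖ṽw̃∗ − vw∗‖ above.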

Lemma 5.9 (Compressing a Shattered Matrix). Suppose P is a spectral projector of A ∈ ℂⁿˣⁿ of rank k, and Q is an n × k matrix with Q∗Q = I_k and QQ∗ = P. Then for every ϵ > 0,

Λ_ϵ( Q∗AQ ) ⊂ Λ_ϵ( A ).

Proof. Take z ∈ Λ_ϵ(Q∗AQ). Then there exists v ∈ ℂᵏ satisfying

‖(z − Q∗AQ)v‖ ≤ ϵ‖v‖.

Since Q∗Q = I, the map Q is an isometry onto range(P), and since range(P) is an invariant subspace of A, we have (z − A)Qv ∈ range(Q), and hence

QQ∗(z − A)Qv = (z − A)Qv,  so  ‖(z − A)Qv‖ = ‖Q∗(z − A)Qv‖ = ‖(z − Q∗AQ)v‖ ≤ ϵ‖v‖ = ϵ‖Qv‖,

showing that z ∈ Λ_ϵ(A).

Observation 5.10. Since ω(g) ≤ 1, our assumption on η in Line 2 of the pseudocode of EIG implies the following bounds on η, which we will use below:

η = ω(g) δϵ/200 ≤ min{ 0.02 ω(g), 2ϵ/75, δ/100 }.

Initial lemmas in hand, let us begin to analyze the algorithm. At several points we will make an assumption on the machine precision in the margin. These will be collected at the end of the proof, where we will verify that they follow from the precision hypothesis of Theorem 5.5.
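A quick numerical sanity check of Lemma 5.9 (our own toy instance, numpy assumed): for Q an isometry onto an invariant subspace, σ_min(zI − Q∗AQ) ≥ σ_min(zI − A) at every point z, which is exactly the inclusion Λ_ϵ(Q∗AQ) ⊆ Λ_ϵ(A).

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 10, 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Build an isometry Q whose range is the invariant subspace for the
# k eigenvalues of smallest real part (so QQ* is a spectral projector's range).
w, V = np.linalg.eig(A)
idx = np.argsort(w.real)[:k]
Q, _ = np.linalg.qr(V[:, idx])
B = Q.conj().T @ A @ Q                  # the compressed k x k matrix

def smin(M):
    """Smallest singular value of M."""
    return np.linalg.svd(M, compute_uv=False)[-1]

zs = rng.standard_normal(25) + 1j * rng.standard_normal(25)
for z in zs:
    # Pseudospectra can only shrink under compression to an invariant subspace.
    assert smin(z * np.eye(k) - B) >= smin(z * np.eye(n) - A) - 1e-8
```

The slack of 1e-8 only absorbs roundoff; in exact arithmetic the inequality is strict in the direction shown, as in the proof above.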

Correctness.

Lemma 5.11 (Accuracy of Ã±). When DEFLATE succeeds, each eigenvalue of A shares a square of g with a unique eigenvalue of either Ã₊ or Ã₋, and furthermore Λ_{4ϵ/5}(Ã±) ⊂ Λ_ϵ(A).

Proof. Let P± be the true projectors onto the two bisection regions found by SPLIT, Q± be the matrices whose orthonormal columns span their ranges, and A± := Q±∗ A Q±. From Theorem 5.3, on the event that DEFLATE succeeds, the approximation Q̃± that it outputs satisfies ‖Q̃± − Q±‖ ≤ η,

so in particular ‖Q̃±‖ ≤ 1 + η ≤ 1.02, as η ≤ 0.02. The error E₆ from performing the matrix multiplications necessary to compute Q̃±∗ A Q̃± admits the bound

‖E₆‖ ≤ μ_MM(n) u ‖Q̃±∗‖ ‖A Q̃±‖ + μ_MM(n) u ‖Q̃±∗ A‖ ‖Q̃±‖ + μ_MM(n)² u² ‖Q̃±‖² ‖A‖ ≤ 16 μ_MM(n) u ≤ η/10,

the last step being one of our margin assumptions on the precision u.

, ± − ± 6 ± + ( ± − ±) ± + ± ( ± − ±) ž ž ž ž A A ≤ 3E + 8 +Q 4 Q AQ Q A Q Q ± − ± ‖ ‖ ‖ ‖ ‖ ‖ ‖ ‖ ≤ /5   Qž /75Q ≤ ‖ ‖ We can now apply Lemma≤ 5.8. ≤ . Everything is now in place to show that, if every call to DEFLATE succeeds, EIG has the advertised accuracy guarantees. After we show this, we will lower bound this success probability and compute the running time.

When A ∈ ℂ^{1×1}, the algorithm works as promised. Assume inductively that EIG has the desired guarantees on instances of size strictly smaller than n. In particular, maintaining the notation from the above lemmas, we may assume that

(Ṽ±, D̃±) = EIG(Ã±, 4δ/5, g±, 4ϵ/5, θ, n)

satisfy (i) each eigenvalue of D̃± shares a square of g with exactly one eigenvalue of Ã±, and (ii) each column of Ṽ± is 4δ/5-close to a true eigenvector of Ã±. From Lemma 5.8, each eigenvalue of Ã± shares a grid square with exactly one eigenvalue of A±, and thus the output

D̃ = diag( D̃₊, D̃₋ )

satisfies the eigenvalue guarantee.

To verify that the computed eigenvectors are close to the true ones, let ṽ± be some approximate right unit eigenvector of Ã± output by EIG (with norm 1 ± nu), v̂± the exact unit eigenvector of Ã± that it approximates, and v the corresponding exact unit eigenvector of A. Recursively, EIG(A, δ, g, ϵ, θ, n) will output the approximate unit eigenvector

v̌ := (Q̃± ṽ± + e)/‖Q̃± ṽ± + e‖ + e′,

whose proximity to the actual eigenvector v we now need to quantify. The error term e is a column of the error matrix E₈, whose norm we can crudely bound by

‖e‖ ≤ ‖E₈‖ ≤ μ_MM(n) u ‖Q̃±‖ ‖Ṽ±‖ ≤ 4 μ_MM(n) u √n,

and e′ is a column incurred by performing the normalization in floating point; in our initial discussion of floating point arithmetic we assumed in (16) that ‖e′‖ ≤ nu. The distance between Q̃±ṽ± + e and its normalization is just the difference in their norms, since they are parallel, so

| ‖Q̃±ṽ± + e‖ − 1 | ≤ (1 + nu)(1 + μ_MM(n) u) − 1 + 4 μ_MM(n) u √n ≤ 4 μ_MM(n) u (1 + √n).

Inductively, ‖ṽ± − v̂±‖ ≤ 4δ/5, and since ‖Ã± − A±‖ ≤ ϵ/5 and A± has shattered ϵ-pseudospectrum by Lemma 5.9, Lemma 5.8 together with our choice of η in Line 2 ensures

‖v̂± − v±‖ ≤ δ/10.
Thus, iterating the triangle inequality and using ‖Q±‖ = 1,

‖v̌ − v‖ ≤ ‖e′‖ + | ‖Q̃±ṽ± + e‖ − 1 | + ‖e‖ + ‖(Q̃± − Q±)ṽ±‖ + ‖Q±(ṽ± − v̂±)‖ + ‖Q±(v̂± − v±)‖
 ≤ nu + 4 μ_MM(n) u (1 + √n) + 4 μ_MM(n) u √n + η + 4δ/5 + δ/10
 ≤ 9 n μ_MM(n) u + η + 4δ/5 + δ/10
 ≤ δ/200 + δ/100 + 4δ/5 + δ/10 ≤ δ.

This concludes the proof of correctness of EIG.

Running Time and Failure Probability. Let's begin with a simple lemma bounding the depth of EIG's recursion tree.

Lemma 5.12 (Recursion Depth). The recursion tree of EIG has depth at most log_{5/4} n, and every branch ends with an instance of size 1 × 1.

Proof. By Theorem 5.2, SPLIT can always find a bisection of the spectrum into two regions containing n± eigenvalues respectively, with n₊ + n₋ = n and n± ≤ 4n/5 when n ≥ 5, and when n ≤ 5 can always peel off at least one eigenvalue. Thus the depth d(n) satisfies

d(n) = 1 + max_{t ∈ [1/5, 4/5]} d(tn),  n > 5.   (41)

As d(n) ≤ n ≤ log_{5/4} 5 for 1 < n ≤ 5, the result is immediate from induction.

We pause briefly and verify that the assumptions ϵ < 1/2, δ < 1, and ‖A‖ ≤ 3.5 in Theorem 5.5 ensure that every call to SPLIT throughout the algorithm satisfies the hypotheses in Theorem 5.2. Since ϵ, δ are non-increasing as we travel down the recursion tree of EIG, we need only verify this for their initial settings. Theorem 5.2 needs β < 1/2, which is satisfied immediately, and we additionally have 4β = η⁴ϵ²/(2(20n)⁶) ≤ 0.05/n. Finally, we need that every matrix passed to SPLIT throughout the course of the algorithm has norm at most 4. Lemma 5.11 shows that if ‖A‖ is bounded and A has its ϵ-pseudospectrum shattered, then ‖Ã±‖ ≤ ‖A±‖ + ϵ/5 ≤ ‖A‖ + ϵ/5. Thus each time we pass to a subproblem, the norm of the matrix we pass to EIG (and thus to SPLIT) increases by at most ϵ/5. Since ϵ decreases by a factor of 4/5 on each recursion, this means that by the end of the algorithm the norm of the matrix passed to EIG will be at most

3.5 + ϵ/(5(1 − 4/5)) = 3.5 + ϵ ≤ 3.5 + 1/2 ≤ 4.

Thus we will be safe if our initial matrix has norm at most 3.5, as assumed.

Lemma 5.13 (Lower Bounds on the Parameters). The input parameters given to every recursive call EIG(A′, δ′, g′, ϵ′, θ, n) and SPLIT(A′ − h′, g′, ϵ′, β′) satisfy

ϵ′ ≥ ϵ/n,  δ′ ≥ δ/n,  η′ ≥ ω(g)δϵ/(200 n²),  β′ ≥ ω(g)⁴ δ⁴ ϵ⁶ / ( 2⁴⁸ (5n)²⁶ ).

Proof.
Along each branch of the recursion tree, we replace ϵ ← 4ϵ/5 and δ ← 4δ/5 at most log_{5/4} n times, so each can only decrease by a factor of n from their initial settings.

Lemma 5.14 (Failure Probability). EIG fails with probability no more than θ.

Proof. Since each recursion splits into at most two subproblems, and the recursion tree has depth log_{5/4} n, there are at most

2^{log_{5/4} n} = n^{log_{5/4} 2} ≤ n⁴

calls to DEFLATE. We have set every β and θ so that the failure probability of each call is at most θ/(2n⁴), so a crude union bound finishes the proof.

The arithmetic operations required for EIG satisfy the recursive relationship

T_EIG(n, δ, g, ϵ, θ, n) ≤ T_SPLIT(n, ϵ, β) + T_DEFLATE(n, η) + 2 T_MM(n)
 + T_EIG(n₊, 4δ/5, g₊, 4ϵ/5, θ, n) + T_EIG(n₋, 4δ/5, g₋, 4ϵ/5, θ, n)
 + 2 T_MM(n) + O(n²).

Each of the T∘ terms is of the form poly(n) polylog(n), where both polynomials have nonnegative coefficients, and the exponent on n is at least 2. Thus, when we split into problems of sizes n₊ + n₋ = n with n± ≤ 4n/5, by convexity

T∘(n₊, ...) + T∘(n₋, ...) ≤ ((4² + 1²)/5²) T∘(n, ...) = (17/25) T∘(n, ...).

Recursively then, if we were to keep all accuracy parameters fixed, the total cost of the operations we perform in each layer is at most 17/25 times the cost of the previous one. Using our parameter lower

bounds from Lemma 5.13, and these geometrically decreasing bit operations, we then have

T_EIG(n, δ, g, ϵ, θ, n) ≤ (25/8) ( T_SPLIT( n, g, ϵ/n, ω(g)⁴δ⁴ϵ⁶/(2⁴⁸(5n)²⁶) ) + T_DEFLATE( n, ω(g)⁴δ⁴ϵ⁶/(2⁴⁸(5n)²⁶), θ/(2n⁴) ) + 4 T_MM(n) + O(n²) )
 = (25/8) ( 12 N_EIG lg( 1/(ω(g)ϵ) ) ( T_INV(n) + O(n²) ) + 2 T_QR(n) + 5 T_MM(n) + n² T_N + O(n²) )

 ≤ 60 N_EIG lg( 1/(ω(g)ϵ) ) ( T_INV(n) + O(n²) ) + 10 T_QR(n) + 25 T_MM(n),

where

N_EIG := lg( 256n/(ω(g)ϵ) ) + 3 lg lg( 256n/(ω(g)ϵ) ) + lg lg( 2⁴⁹(5n)²⁶/(δϵ) ).

In the final expression for T_EIG we have used the fact that T_N = O(1). Thus we have

T_EIG(n, δ, g, ϵ, θ, n) = O( log(1/(ω(g)ϵ)) ( log(1/(ω(g)ϵ)) + log log(1/u) ) T_MM(n) )

by Theorem 2.10.

Required Bits of Precision. We will need the following bound on the norms of all spectral projectors.

Lemma 5.15 (Sizes of Spectral Projectors). Throughout the algorithm, every approximate spectral projector P̃ given to DEFLATE satisfies ‖P̃‖ ≤ 10n/ϵ.

Proof. Every such P̃ is β-close to a true spectral projector P of a matrix whose (ϵ/n)-pseudospectrum is shattered with respect to the initial unit grid g. Since we can generate P by a contour integral around the boundary of a rectangular subgrid of dimensions at most 8 × 8, we have

‖P̃‖ ≤ β + ‖P‖ ≤ 2 + (32/2π)(n/ϵ) ≤ 10n/ϵ,

with the last inequality following from ϵ < 1 ≤ n.

Collecting the assumptions on the machine precision made in the margins above, and using the lower bounds from Lemma 5.13, we need u to satisfy

u ≤ min{ (1 − 2^{−1/(256 n N_EIG)}) / ( 2 μ_INV(n) n^{c_INV log n + 3} ),
  η²ϵ⁴δ⁸ / ( (5n)²⁶ max{ μ_QR(n), μ_MM(n) } ),
  ϵ / ( 100 n³ ⋅ 4 μ_QR(n) ‖P̃‖ ),
  ϵ / ( 100 n³ max{ 4 μ_MM(n), 2 μ_QR(n) } ) }.

From Lemma 5.15, ‖P̃‖ ≤ 10n/ϵ, so the conditions in the second two lines are all satisfied if we make the crass upper bound

u ≤ η²ϵ⁴δ⁸ / ( (5n)³⁰ max{ μ_QR(n), μ_MM(n), n } ),

i.e. if lg(1/u) ≥ lg( (5n)³⁰ max{ μ_QR(n), μ_MM(n), n } / (η²ϵ⁴δ⁸) ). Unpacking the first requirement, using the definition of N_EIG and the fact that 1 − 2^{−x} ≥ x/2 for x ∈ (0, 1), we have

u ≤ 1 / ( 2^{10} n √n N_EIG ⋅ μ_INV(n) n^{c_INV log n + 3} ),

so the final expression is a sufficient upper bound on u. This gives

lg(1/u) ≥ 2^{14.83} ( c_INV log n + 3 ) lg³( 2⁴⁸(5n)²⁶ / (η²ϵ⁴δ⁸) ) + lg( n √n N_EIG ) = O( log³( n/(ϵδ) ) log n ).

This dominates the precision requirement above, and completes the proof of Theorem 5.5.

6 Conclusion and Open Questions

In this paper, we reduced the approximate diagonalization problem to a polylogarithmic number of matrix multiplications, QR factorizations, and inversions on a floating point machine with precision depending only polylogarithmically on n and 1/δ. The key phenomena enabling this were: (a) every matrix is δ-close to a matrix with well-behaved pseudospectrum, and such a matrix can be found by a complex Gaussian perturbation; and (b) the spectral bisection algorithm can be shown to converge rapidly to a forward approximate solution on such a well-behaved matrix, using an amount of precision and number of iterations polylogarithmic in n and 1/δ. The combination of these facts yields a δ-backward approximate solution for the original problem.

Using fast matrix multiplication, we obtain algorithms with nearly optimal asymptotic computational complexity (as a function of n, compared to matrix multiplication), for general complex matrices with no assumptions. Using naive matrix multiplication, we get easily implementable algorithms with O(n³)-type complexity and much better constants, which are likely faster in practice. The constants in our bit complexity and precision estimates, while not huge, are likely suboptimal. The reasonable practical performance of spectral bisection based algorithms is witnessed by the many empirical papers (see e.g. [BDG97]) which have studied it. The more recent of these works further show that such algorithms are communication-avoiding and have good parallelizability properties.
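For concreteness, here is the classical exact-arithmetic core of the spectral bisection method discussed above: Roberts' Newton iteration for the matrix sign function, and the spectral projector it yields. This is a bare textbook sketch of ours (numpy assumed), with none of the finite-precision or pseudospectral safeguards that the paper's SGN and SPLIT require.

```python
import numpy as np

def matrix_sign(A, iters=40):
    """Roberts' Newton iteration X <- (X + X^{-1})/2, which converges
    quadratically to sgn(A) when A has no purely imaginary eigenvalues."""
    X = A.astype(complex)
    for _ in range(iters):
        X = 0.5 * (X + np.linalg.inv(X))
    return X

rng = np.random.default_rng(2)
n = 8
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = matrix_sign(A)
P_right = 0.5 * (np.eye(n) + S)          # spectral projector onto Re(lambda) > 0
k = int(round(np.trace(P_right).real))   # eigenvalue count in the right half-plane
```

Splitting along other vertical or horizontal lines amounts to applying the same iteration to a shifted or rotated matrix (e.g. −i(A − ih) for the horizontal line Im z = h); a rank-revealing factorization of the resulting projector then produces the deflating basis.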

Remark 6.1 (Hermitian Matrices). A curious feature of our algorithm is that even when the input matrix is Hermitian or real symmetric, it begins by adding a complex non-Hermitian perturbation to regularize the spectrum. If one is only interested in this special case, one can replace this first

step by a Hermitian GUE or symmetric GOE perturbation and appeal to the result of [APS+17] instead of Theorem 1.4, which also yields a polynomial lower bound on the minimum gap of the perturbed matrix. It is also possible to obtain a much stronger analysis of the Newton iteration in the Hermitian case, since the iterates are all Hermitian and κ(V) = 1 for such matrices. By combining these observations, one can obtain a running time for Hermitian matrices which is significantly better (in logarithmic factors) than our main theorem. We do not pursue this further since our main goal was to address the more difficult non-Hermitian case.

We conclude by listing several directions for future research.

1. Devise a deterministic algorithm with similar guarantees. The main bottleneck to doing this is deterministically finding a regularizing perturbation, which seems quite mysterious. Another bottleneck is computing a rank-revealing QR factorization in near matrix multipli-

cation time deterministically (all of the currently known algorithms require Ω(n³) time).

2. Determine the correct exponent for smoothed analysis of the eigenvalue gap of A + γG, where G is a complex Ginibre matrix. We currently obtain roughly (γ/n)^{8/3} in Theorem 3.6. Is it possible to match the n^{−4/3} type dependence [Vin11] which is known for a pure Ginibre matrix?

3. Reduce the dependence of the running time and precision to a smaller power of log(1/δ). The bottleneck in the current algorithm is the number of bits of precision required for stable convergence of the Newton iteration for computing the sign function. Other, "inverse-free" iterative schemes have been proposed for this, which conceivably require lower precision.

4. Study the convergence of “scaled Newton iteration” and other rational approximation meth- ods (see [Hig08, NF16]) for computing the sign function on non-Hermitian matrices. Per- haps these have even faster convergence and better stability properties?
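As a small illustration of question 4, here is a hedged sketch of ours (numpy assumed) comparing the plain Newton sign iteration with Byers' determinantal scaling μ = |det X|^{−1/n}, one of the scalings surveyed in [Hig08] and analyzed in [BX08]: on a badly scaled input the scaled variant needs noticeably fewer iterations.

```python
import numpy as np

def sign_iteration(A, scaled, tol=1e-12, maxit=100):
    """Newton iteration for sgn(A); with scaled=True, each step uses
    Byers' determinantal scaling mu = |det X|^(-1/n)."""
    X = A.astype(complex)
    n = X.shape[0]
    for it in range(1, maxit + 1):
        mu = abs(np.linalg.det(X)) ** (-1.0 / n) if scaled else 1.0
        X_new = 0.5 * (mu * X + np.linalg.inv(mu * X))
        if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X_new):
            return X_new, it
        X = X_new
    return X, maxit

rng = np.random.default_rng(3)
n = 8
A = 500.0 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
S_plain, it_plain = sign_iteration(A, scaled=False)
S_scaled, it_scaled = sign_iteration(A, scaled=True)
```

Determinantal scaling normalizes the geometric mean of the eigenvalue moduli to 1 at every step, which removes the slow initial phase in which the plain iteration merely halves the norm; whether such schemes also admit better finite-precision guarantees in the non-Hermitian setting is exactly the open question above.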

More broadly, we hope that the techniques introduced in this paper—pseudospectral shattering and pseudospectral analysis of matrix iterations using contour integrals—are useful in attacking other problems in numerical linear algebra.

Acknowledgments

We thank Peter Bürgisser for introducing us to this problem, and Ming Gu, Olga Holtz, Vishesh Jain, Ravi Kannan, Pravesh Kothari, Lin Lin, Satyaki Mukherjee, Yuji Nakatsukasa, and Nick Trefethen for helpful conversations. We also thank the Institute for Pure and Applied Mathematics, where part of this work was carried out.

References

[AB13] Gérard Ben Arous and Paul Bourgade. Extreme gaps between eigenvalues of random matrices. The Annals of Probability, 41(4):2648–2681, 2013.

[ABB+18] Diego Armentano, Carlos Beltrán, Peter Bürgisser, Felipe Cucker, and Michael Shub. A stable, polynomial-time algorithm for the eigenpair problem. Journal of the Euro- pean Mathematical Society, 20(6):1375–1437, 2018.

[APS+17] Michael Aizenman, Ron Peled, Jeffrey Schenker, Mira Shamis, and Sasha Sodin. Ma- trix regularizing effects of Gaussian perturbations. Communications in Contemporary Mathematics, 19(03):1750028, 2017.

[BCD+05] David Bindel, Shivkumar Chandresekaran, James Demmel, David Garmire, and Ming Gu. A fast and stable nonsymmetric eigensolver for certain structured matrices. Technical report, University of California, Berkeley, CA, 2005.

[BD73] A. N. Beavers and E. D. Denman. A computational method for eigenvalues and eigen- vectors of a matrix with real eigenvalues. Numerische Mathematik, 21(5):389–396, 1973.

[BD98] Zhaojun Bai and James Demmel. Using the matrix sign function to compute invariant subspaces. SIAM Journal on Matrix Analysis and Applications, 19(1):205–225, 1998.

[BDD10] Grey Ballard, James Demmel, and Ioana Dumitriu. Minimizing communication for eigenproblems and the singular value decomposition. arXiv preprint arXiv:1011.3077, 2010.

[BDDR19] Grey Ballard, James Demmel, Ioana Dumitriu, and Alexander Rusciano. A generalized randomized rank-revealing factorization. arXiv preprint arXiv:1909.06524, 2019.

[BDG97] Zhaojun Bai, James Demmel, and Ming Gu. An inverse free parallel spectral divide and conquer algorithm for nonsymmetric eigenproblems. Numerische Mathematik, 76(3):279–308, 1997.

[BHM97] Ralph Byers, Chunyang He, and Volker Mehrmann. The matrix sign function method and the computation of invariant subspaces. SIAM Journal on Matrix Analysis and Applications, 18(3):615–632, 1997.

[BJD74] A. N. Beavers Jr. and E. D. Denman. A new similarity transformation method for eigenvalues and eigenvectors. Mathematical Biosciences, 21(1-2):143–169, 1974.

[BKMS19] Jess Banks, Archit Kulkarni, Satyaki Mukherjee, and Nikhil Srivastava. Gaus- sian regularization of the pseudospectrum and Davies’ conjecture. arXiv preprint arXiv:1906.11819, to appear in Communications on Pure and Applied Mathematics, 2019.

[BOE18] Michael Ben-Or and Lior Eldar. A quasi-random approach to matrix spectral analysis. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.

[BX08] Ralph Byers and Hongguo Xu. A new scaling for Newton’s iteration for the polar decomposition and its backward stability. SIAM Journal on Matrix Analysis and Ap- plications, 30(2):822–843, 2008.

[Bye86] Ralph Byers. Numerical stability and instability in matrix sign function based al- gorithms. In Computational and Combinatorial Methods in Systems Theory. Citeseer, 1986.

[Cai94] Jin-yi Cai. Computing Jordan normal forms exactly for commuting matrices in poly- nomial time. International Journal of Foundations of Computer Science, 5(03n04):293– 302, 1994.

[CGH+96] Robert M. Corless, Gaston H. Gonnet, David E. G. Hare, David J. Jeffrey, and Donald E. Knuth. On the Lambert W function. Advances in Computational Mathematics, 5(1):329–359, 1996.

[Dav07] E Brian Davies. Approximate diagonalization. SIAM Journal on Matrix Analysis and Applications, 29(4):1051–1064, 2007.

[DBJ76] Eugene D. Denman and Alex N. Beavers Jr. The matrix sign function and computations in systems. Applied Mathematics and Computation, 2(1):63–94, 1976.

[DDH07] James Demmel, Ioana Dumitriu, and Olga Holtz. Fast linear algebra is stable. Nu- merische Mathematik, 108(1):59–91, 2007.

[DDHK07] James Demmel, Ioana Dumitriu, Olga Holtz, and Robert Kleinberg. Fast matrix mul- tiplication is stable. Numerische Mathematik, 106(2):199–224, 2007.

[Dem87] James Weldon Demmel. On condition numbers and the distance to the nearest ill- posed problem. Numerische Mathematik, 51(3):251–289, 1987.

[Dem88] James W. Demmel. The probability that a numerical analysis problem is difficult. Mathematics of Computation, 50(182):449–480, 1988.

[Dem97] James W. Demmel. Applied numerical linear algebra, volume 56. SIAM, 1997.

[Dum12] Ioana Dumitriu. Smallest eigenvalue distributions for two classes of β-Jacobi ensembles. Journal of Mathematical Physics, 53(10):103301, 2012.

[Ede88] Alan Edelman. Eigenvalues and condition numbers of random matrices. SIAM Journal on Matrix Analysis and Applications, 9(4):543–560, 1988.

[ER05] Alan Edelman and N. Raj Rao. Random matrix theory. Acta Numerica, 14:233–297, 2005.

[ES08] Alan Edelman and Brian D. Sutton. The beta-Jacobi matrix model, the CS decom- position, and generalized singular value problems. Foundations of Computational Mathematics, 8(2):259–285, 2008.

[For10] Peter J. Forrester. Log-gases and random matrices (LMS-34). Princeton University Press, 2010.

[GE96] Ming Gu and Stanley C. Eisenstat. Efficient algorithms for computing a strong rank- revealing QR factorization. SIAM Journal on Scientific Computing, 17(4):848–869, 1996.

[Ge17] Stephen Ge. The Eigenvalue Spacing of IID Random Matrices and Related Least Singular Value Results. PhD thesis, UCLA, 2017.

[GLO20] Anne Greenbaum, Ren-cang Li, and Michael L Overton. First-order perturbation theory for eigenvalues and eigenvectors. SIAM Review, 62(2):463–482, 2020.

[GLS12] Martin Grötschel, László Lovász, and Alexander Schrijver. Geometric algorithms and combinatorial optimization, volume 2. Springer Science & Business Media, 2012.

[Hig94] Nicholas J. Higham. The matrix sign decomposition and its relation to the polar decomposition. Linear Algebra and its Applications, 212:3–20, 1994.

[Hig02] Nicholas J. Higham. Accuracy and stability of numerical algorithms, volume 80. SIAM, 2002.

[Hig08] Nicholas J. Higham. Functions of matrices: theory and computation, volume 104. SIAM, 2008.

[HJ12] Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University Press, 2012.

[HL00] Uffe Haagerup and Flemming Larsen. Brown's spectral distribution measure for R-diagonal elements in finite von Neumann algebras. Journal of Functional Analysis, 176(2):331–367, 2000.

[HP78] Walter Hoffmann and Beresford N. Parlett. A new proof of global convergence for the tridiagonal QL algorithm. SIAM Journal on Numerical Analysis, 15(5):929–937, 1978.

[KL95] Charles S. Kenney and Alan J. Laub. The matrix sign function. IEEE Transactions on Automatic Control, 40(8):1330–1348, 1995.

[LV16] Anand Louis and Santosh S. Vempala. Accelerated Newton iteration for roots of black box polynomials. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 732–740. IEEE, 2016.

[Mal93] Alexander N. Malyshev. Parallel algorithm for solving some spectral problems of linear algebra. Linear Algebra and its Applications, 188:489–520, 1993.

[Mez06] Francesco Mezzadri. How to generate random matrices from the classical compact groups. arXiv preprint math-ph/0609050, 2006.

[NF16] Yuji Nakatsukasa and Roland W. Freund. Computing fundamental matrix decom- positions accurately via the matrix sign function in two iterations: The power of Zolotarev’s functions. SIAM Review, 58(3):461–493, 2016.

[NH13] Yuji Nakatsukasa and Nicholas J. Higham. Stable and efficient spectral divide and conquer algorithms for the symmetric eigenvalue decomposition and the SVD. SIAM Journal on Scientific Computing, 35(3):A1325–A1349, 2013.

[NTV17] Hoi Nguyen, Terence Tao, and Van Vu. Random matrices: tail bounds for gaps be- tween eigenvalues. Probability Theory and Related Fields, 167(3-4):777–816, 2017.

[Par98] Beresford N. Parlett. The symmetric eigenvalue problem, volume 20. SIAM, 1998.

[PC99] Victor Y. Pan and Zhao Q. Chen. The complexity of the matrix eigenproblem. In Proceedings of the thirty-first annual ACM symposium on Theory of computing, pages 507–516. ACM, 1999.

[Rob80] John Douglas Roberts. Linear model reduction and solution of the algebraic Riccati equation by use of the sign function. International Journal of Control, 32(4):677–687, 1980.

[SJ12] Dai Shi and Yunjiang Jiang. Smallest gaps between eigenvalues of random matrices with complex Ginibre, Wishart and universal unitary ensembles. arXiv preprint arXiv:1207.4240, 2012.

[Sma85] Steve Smale. On the efficiency of algorithms of analysis. Bulletin (New Series) of The American Mathematical Society, 13(2):87–121, 1985.

[Sma97] Steve Smale. Complexity theory and numerical analysis. Acta Numerica, 6:523–551, 1997.

[Śni02] Piotr Śniady. Random regularization of Brown spectral measure. Journal of Functional Analysis, 193(2):291–313, 2002.

[SST06] Arvind Sankar, Daniel A. Spielman, and Shang-Hua Teng. Smoothed analysis of the condition numbers and growth factors of matrices. SIAM Journal on Matrix Analysis and Applications, 28(2):446–476, 2006.

[ST04] Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. Journal of the ACM (JACM), 51(3):385–463, 2004.

[Sun91] Ji-Guang Sun. Perturbation bounds for the Cholesky and QR factorizations. BIT Numerical Mathematics, 31(2):341–352, 1991.

[Sza91] Stanislaw J. Szarek. Condition numbers of random matrices. Journal of Complexity, 7(2):131–149, 1991.

[TE05] Lloyd N. Trefethen and Mark Embree. Spectra and pseudospectra: the behavior of nonnormal matrices and operators. Princeton University Press, 2005.

[Vin11] Jade P. Vinson. Closest spacing of eigenvalues. arXiv preprint arXiv:1111.2743, 2011.

[VNG47] John Von Neumann and Herman H. Goldstine. Numerical inverting of matrices of high order. Bulletin of the American Mathematical Society, 53(11):1021–1099, 1947.

[Wil68] James Hardy Wilkinson. Global convergence of tridiagonal QR algorithm with origin shifts. Linear Algebra and its Applications, 1(3):409–420, 1968.

[WT02] Thomas G. Wright and Lloyd N. Trefethen. EigTool. Software available at http://www.comlab.ox.ac.uk/pseudospectra/eigtool, 2002.

A Deferred Proofs from Section 4

Lemma A.1 (Restatement of Lemma 4.11). Assume the matrix inverse is computed by an algorithm INV satisfying the guarantee in Definition 2.7. Then

G(A) = g(A) + E   (42)

for some error matrix E with norm

‖E‖ ≤ 4 ( ‖A‖ + μ_INV(n) n^{c_INV log n} ‖A⁻¹‖ ) u √n.

Proof. The computation of G(A) consists of three steps:

1. Form INV(A) according to Definition 2.7. This incurs an additive error E_INV with ‖E_INV‖ ≤ μ_INV(n) ⋅ n^{c_INV log n} ⋅ ‖A⁻¹‖ u. The result is INV(A) = A⁻¹ + E_INV.

2. Add A to INV(A). This incurs an entry-wise relative error of size u: the result is

( A + A⁻¹ + E_INV ) ∘ ( J + E_add ),  ‖E_add‖_max ≤ u,

where J denotes the all-ones matrix, and ∘ denotes the entrywise

(Hadamard) product of matrices.

G 1 −1 INV ( )= ( + + ) ( + ) ( + ) 2 iv where max u. A A A E ◦ J Eadd ◦ J Ed Finally, recallE thatdiv for≤ any matrices and , we have the relation (14) ‖ ‖ × n n M E

ma Putting it all together, we have M◦E ≤ M E x √n. ‖ ‖ ‖ ‖‖ ‖ −1 2 2 G 1 u u INV u ( )− ( ) + (2 + ) + (1 + ) 2 A g A ≤ A A−1 2 n E n INV 1 u u √ INV u √ log −1 u 2 ‖ ‖ ‖ ‖ + ‖ ‖ (2 + ) + ‖ (‖ ) ( ) n (1 + ) 2 INV c ≤ A −1A INV √n log n−1 ⋅ ⋅ uA A √n ‖ +‖ ‖ +‖ ( ) ( )c n 4 ‖ ‖ where we use u < in≤ theA lastA line.  n  A A √n 1 ‖ ‖ ‖ ‖ ‖ ‖ In what remains of this section we will repeatedly use the following simple calculus fact. Lemma A.2. Let , then 0 1 x,ylog( > + ) log( )+ and lg( + ) lg( )+ log 2 Proof. This follows directly from≤ the concavityy of the logarithm.≤ y x y x x y x . x x Lemma A.3 (Restatement of Lemma 4.15). Let and be given. Then for 1/800 0 1/2 0 t > >c> lg(1/ ) + 2 lg lg(1/ ) + lg> lg(1/ ) + 1 62 we have j ≥ t t c . , 2 (1 − ) j < c. 2t Proof of Lemma 4.15. An exact solution for j canj be written in terms of the Lambert W -function; t see [CGH+96] for further discussion and a useful series expansion. For our purposes, it is simpler to derive the necessary quantitative bound from scratch. Immediately from the assumption t < , we have j > t . First let us solve the case c . We will1/800 prove the contrapositive,log(1/ ) 9 so assume = 1/2 ≥ t 2 (1 − ) j . t2 1/2 Then taking on both sides, we have j ≥ log j t t t. 2 log(1/ ) + 1 −2j log(1 − ) 2j ≥ 62 ≥ Taking and applying Lemma A.2, we obtain lg j t 1 1 j t. 1+lg + lg log(1/ )+ j t + lg log 2 2 log(1/ ) ≥ Since t < we have 1 1 < . , so 1/800 log2 2 log(1/ ) 0 01 j j tj t t . t t . K. − lg lg(1/ ) + lg log(1/ ) + 1 01 lg(1/ ) + lg lg(1/ ) + 0 49 =∶ But since j , we have≤ j j . j, so ≤ 9 − lg 0 64 ≥ ≥ j 1 j j 1 K . ( − lg ) . 0 64 0 64 which implies ≤ ≤ j K j K . K K K . . + lg + lg(1 57 )= + lg + 0 65 Note K . t , because K t t . . t for t . Thus 1 39 lg(1/ ) ≤ −≤ lg(1/ ) = lg lg(1/ ) + 0 49 0 39 lg(1/ ) 1/800 ≤ K . t t ≤. , ≤ lg lg(1 39 lg(1/ )) lg lg(1/ ) + 0 48 so for the case c we conclude≤ the proof of the≤ contrapositive of the lemma: = 1/2 j K K . + lg + 0 65 t t . t . . 
\[
j \le K + \lg K + 0.65 \le \lg(1/t) + \lg\lg(1/t) + 0.49 + \big(\lg\lg(1/t) + 0.48\big) + 0.65 = \lg(1/t) + 2\lg\lg(1/t) + 1.62.
\]
For the general case, note that once $2^j(1-t)^{2^j}/(2t) \le 1/2$, incrementing $j$ has the effect of squaring the left hand side and then multiplying it by $2t\cdot 2^{1-j} \le 1$, which makes it even smaller. At most $\lg\lg(1/c)$ increments are required to bring the left hand side down to $c$, since $(1/2)^{2^{\lg\lg(1/c)}} = c$. This gives the value of $j$ stated in the lemma, as desired.

Lemma A.4 (Restatement of Lemma 4.18). If
\[
N = \Big\lceil \lg(1/s) + 3\lg\lg(1/s) + \lg\lg\big(1/(s\epsilon_0)\big) + 7.59 \Big\rceil,
\]
then

\[
N \ge \lg(8/s) + 2\lg\lg(8/s) + \lg\lg\big(16/(s^2\epsilon_0)\big) + 1.62.
\]

Proof of Lemma 4.18. We aim to provide a slightly cleaner sufficient condition on $N$ than the condition

\[
N \ge \lg(8/s) + 2\lg\lg(8/s) + \lg\lg\big(16/(s^2\epsilon_0)\big) + 1.62.
\]
Repeatedly using Lemma A.2, as well as the cruder fact that $\lg\lg(ab) \le \lg\lg a + \lg\lg b$ provided $a, b \ge 4$, we have
\begin{align*}
\lg\lg\big(16/(s^2\epsilon_0)\big) &\le \lg\lg(16/s) + \lg\lg\big(1/(s\epsilon_0)\big) \\
&\le 1 + \lg\big(3 + \lg(1/s)\big) + \lg\lg\big(1/(s\epsilon_0)\big) \\
&\le 1 + \lg\lg(1/s) + \frac{3}{\log 2\,\lg(1/s)} + \lg\lg\big(1/(s\epsilon_0)\big) \\
&\le \lg\lg(1/s) + \lg\lg\big(1/(s\epsilon_0)\big) + 1.66,
\end{align*}
where in the last line we use the assumption $s < 1/100$. Similarly,
\begin{align*}
\lg(8/s) + 2\lg\lg(8/s) &\le 3 + \lg(1/s) + 2\lg\big(3 + \lg(1/s)\big) \\
&\le 3 + \lg(1/s) + 2\lg\lg(1/s) + \frac{6}{\log 2\,\lg(1/s)} \\
&\le \lg(1/s) + 2\lg\lg(1/s) + 4.31.
\end{align*}
Thus, a sufficient condition is

\[
N = \Big\lceil \lg(1/s) + 3\lg\lg(1/s) + \lg\lg\big(1/(s\epsilon_0)\big) + 7.59 \Big\rceil.
\]

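As a quick numerical sanity check of the chain of estimates above (a sketch only — the interpretation of the symbols $s$ and $\epsilon_0$, and the ranges tested, follow the statement of Lemma A.4 as reconstructed here), one can test the claimed implication on random inputs:

```python
import math
import random

def lhs(s, eps0):
    # N as defined in Lemma A.4 (without the ceiling, which only helps)
    lg = lambda x: math.log2(x)
    return lg(1 / s) + 3 * lg(lg(1 / s)) + lg(lg(1 / (s * eps0))) + 7.59

def rhs(s, eps0):
    # the sufficient condition that N must dominate
    lg = lambda x: math.log2(x)
    return lg(8 / s) + 2 * lg(lg(8 / s)) + lg(lg(16 / (s ** 2 * eps0))) + 1.62

random.seed(0)
for _ in range(10000):
    s = 10 ** random.uniform(-8, -2.01)      # s < 1/100, as the proof assumes
    eps0 = 10 ** random.uniform(-8, -0.01)   # eps0 < 1
    assert lhs(s, eps0) >= rhs(s, eps0) - 1e-9
```

The margin between the two sides is exactly the slack (1.66 + 4.31 + 1.62 versus 7.59) collected in the proof.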
B Analysis of SPLIT

Although it has many potential uses in its own right, the purpose of the approximate matrix sign function in our algorithm is to split the spectrum of a matrix into two roughly equal pieces, so that approximately diagonalizing $A$ may be recursively reduced to two subproblems of smaller size. First, we need a lemma ensuring that a shattered pseudospectrum can be bisected by a grid line with at least $n/5$ eigenvalues on each side.

Lemma B.1. Let $A$ have $\epsilon$-pseudospectrum shattered with respect to some grid $\mathsf{g}$. Then there exists a horizontal or vertical grid line of $\mathsf{g}$ partitioning $\mathsf{g}$ into two grids $\mathsf{g}_\pm$, each containing at least $\min\{n/5, 1\}$ eigenvalues.

Proof. We will view $\mathsf{g}$ as an $s_1\times s_2$ array of squares. Write $r_1, r_2, \dots, r_{s_1}$ for the number of eigenvalues in each row of the grid. Either there exists $i < s_1$ such that $r_1 + \cdots + r_i \ge n/5$ and $r_{i+1} + \cdots + r_{s_1} \ge n/5$ — in which case we can bisect at the grid line dividing the $i$th from the $(i+1)$st rows — or there exists some $i$ for which $r_i \ge 3n/5$. In the latter case, we can always find a vertical grid line so that at least $n/5$ of the eigenvalues in the $i$th row are on each of the left and right sides. Finally, if $n \le 5$, we may trivially pick a grid line to bisect along so that both sides contain at least one eigenvalue.

Proof of Theorem 5.2. We'll prove first that SPLIT has the advertised guarantees. The main observation is that, given any matrix $A$, we can determine how many eigenvalues are on either side of any horizontal or vertical line by approximating the matrix sign function. In particular, $\mathrm{Tr}\,\mathrm{sgn}(A - h) = n_+ - n_-$, where $n_\pm$ are the eigenvalue counts on either side of the line $\mathrm{Re}\,z = h$. Running SGN to a final accuracy of $\beta$,

we have
\[
\big|\,\mathrm{Tr}\,\mathsf{SGN}(A-h) - \mathrm{Tr}\,\mathrm{sgn}(A-h)\,\big|
\;\le\; n\,\big\|\mathsf{SGN}(A-h) - \mathrm{sgn}(A-h)\big\| + n\,\mathbf{u}\,\big\|\mathsf{SGN}(A-h)\big\|
\;\le\; n\big(\beta + \mathbf{u} + \mathbf{u}\,\|\mathrm{sgn}(A-h)\|\big),
\]
using (15) in the first inequality.

SPLIT

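The eigenvalue-counting identity $\mathrm{Tr}\,\mathrm{sgn}(A - h) = n_+ - n_-$ that drives SPLIT can be illustrated with a minimal sketch. Here the sign function is computed by the plain Newton iteration $M \leftarrow (M + M^{-1})/2$ in ordinary double precision, rather than by the finite-arithmetic SGN analyzed in Section 4, and the test matrix and shift $h$ are hypothetical:

```python
import numpy as np

def matrix_sign(M, iters=60):
    # Newton iteration M <- (M + M^{-1})/2, which converges to sgn(M)
    # when M has no eigenvalues on the imaginary axis.
    for _ in range(iters):
        M = (M + np.linalg.inv(M)) / 2
    return M

rng = np.random.default_rng(0)
n, h = 8, 0.5                     # count eigenvalues on either side of Re z = h
d = np.array([-1.0, -0.6, -0.2, -0.8, 0.9, 1.2, 1.7, 1.4])
A = np.diag(d).astype(complex) + 0.02 * rng.standard_normal((n, n))
S = matrix_sign(A - h * np.eye(n))
n_diff = round(np.trace(S).real)  # = n_plus - n_minus
n_plus = (n + n_diff) // 2
assert n_plus == int(np.sum(np.linalg.eigvals(A).real > h))
```

The eigenvalues here sit well away from the line $\mathrm{Re}\,z = h$, which is exactly the role the shattering assumption plays in the analysis above.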
Input: Matrix $A \in \mathbb{C}^{n\times n}$, grid $\mathsf{g} = \mathsf{grid}(z_0, \omega, s_1, s_2)$, pseudospectral guarantee $\epsilon$, and a desired accuracy $\beta$.
Requires: $\Lambda_\epsilon(A)$ is shattered with respect to $\mathsf{g}$, and $\beta \le 0.05/n$.
Algorithm: $(\tilde P_+, \tilde P_-, \mathsf{g}_+, \mathsf{g}_-, n_+, n_-) = \mathsf{SPLIT}(A, \mathsf{g}, \epsilon, \beta)$
1. Execute a binary search over horizontal grid-line shifts $h$ of $\mathsf{g}$: set $M \leftarrow A - h$, compute $\tilde S \leftarrow \mathsf{SGN}(M, \epsilon_0, \alpha_0, \beta/4)$, and let $\tilde n \leftarrow \mathrm{round}\big(\mathrm{Tr}\,\tilde S\big)$, until $\min\{(n + \tilde n)/2,\ (n - \tilde n)/2\} \ge \min\{n/5, 1\}$.
2. If this fails, set $A \leftarrow iA$ and execute the analogous binary search among vertical shifts from the original grid.
3. Output the sub-grids $\mathsf{g}_\pm$ on either side of the successful shift, the approximate spectral projectors $\tilde P_\pm \leftarrow (1 \pm \tilde S)/2$, and the ranks $n_\pm = (n \pm \tilde n)/2$.
Output: Sub-grids $\mathsf{g}_\pm$, approximate spectral projectors $\tilde P_\pm$, and ranks $n_\pm$.
Ensures: There exist true spectral projectors $P_\pm$ satisfying (i) $P_+ + P_- = 1$, (ii) $\mathrm{rank}(P_\pm) = n_\pm$, (iii) $\|\tilde P_\pm - P_\pm\| \le \beta$, and (iv) $n_\pm \ge \min\{n/5, 1\}$; here $P_\pm$ are the spectral projectors onto the interiors of $\mathsf{g}_\pm$.

Since we can form $\mathrm{sgn}(A - h)$ by integrating around the boundary of the portions of $\mathsf{g}$ on either side of the line $\mathrm{Re}\,z = h$, the fact that $\Lambda_\epsilon(A)$ is shattered means that
\[
\|\mathrm{sgn}(A - h)\| \le \frac{1}{2\pi}\cdot\frac{\big(2(s_1 + s_2) + 4\big)\,\omega}{\epsilon} \le \frac{8}{\epsilon};
\]
in the last inequality we have used that $\mathsf{g}$ has side lengths of at most 8. Since we have run SGN to accuracy $\beta$, this gives a total additive error of $n(\beta + \mathbf{u} + 8\mathbf{u}/\epsilon)$ in computing the trace. If $\beta \le 0.1/n$ and $\mathbf{u} \le \epsilon/(100 n)$, then this error will be strictly less than $0.5$ and we can round $\mathrm{Tr}\,\mathsf{SGN}(A - h)$ to the nearest real integer. Horizontal bisections work similarly, with $iA - ih$ in place of $A - h$.

Since we need only modify the diagonal entries of $A$ when creating $M$, we incur a diagonal error matrix $E$ of norm at most $\max_i |A_{i,i} - h|\,\mathbf{u}$ when creating $M$.
Using $|h| \le 4$ and $|A_{i,i}| \le 4$, the fact that $\mathbf{u} \le \epsilon/(100 n)$ ensures $\|E\| \le \epsilon/16$, so the $3\epsilon/4$-pseudospectrum of $M$ will still be shattered with respect to a translation of the original grid $\mathsf{g}$ that includes a segment of the imaginary axis. Using Lemma 4.10 and the fact that $\mathrm{diam}(\mathsf{g})^2 = 128$, we can safely call SGN with parameters
\[
\epsilon_0 = \epsilon/4 \qquad\text{and}\qquad \alpha_0 = 1 - \epsilon^2/256.
\]
Plugging these in to Theorem 4.9 ($\epsilon_0 < 1/2$, $1 - \alpha_0 \le 1/100$, and $\beta \le 0.05/n \le 1/12$, so the hypotheses are satisfied), for final accuracy $\beta/4$ a sufficient number of iterations is
\[
N_{\mathsf{SPLIT}} := \Big\lceil \lg\frac{256}{\epsilon^2} + 3\lg\lg\frac{256}{\epsilon^2} + \lg\lg\frac{4}{\beta} + 7.59 \Big\rceil.
\]
In the course of these binary searches, we make at most $\lg s_1 + \lg s_2$ calls to SGN at accuracy $\beta/4$. These require at most
\[
\lg(s_1 s_2)\; T_{\mathsf{SGN}}(n, \epsilon/4, \beta/4)
\]
arithmetic operations. In addition, creating $M$ and computing the trace of the approximate sign function cost us $O(n\,\lg(s_1 s_2))$ scalar addition operations. We are assuming that $\mathsf{g}$ has side lengths at most 8, so $s_1, s_2 \le 8/\omega(\mathsf{g})$ and $\lg(s_1 s_2) \le 12\lg(1/\omega(\mathsf{g}))$. Combining all of this with the runtime analysis and machine precision of SGN appearing in Theorem 4.9, we obtain
\[
T_{\mathsf{SPLIT}}(n, \mathsf{g}, \epsilon, \beta) \;\le\; 12\,\lg\frac{1}{\omega(\mathsf{g})}\cdot N_{\mathsf{SPLIT}}\cdot T_{\mathsf{INV}}(n) + O(n^2).
\]

C Analysis of DEFLATE

The algorithm DEFLATE, defined in Section 5, can be viewed as a small variation of the randomized rank-revealing algorithm introduced in [DDH07] and revisited subsequently in [BDDR19]. Following these works, we will call this algorithm RURV. Roughly speaking, in finite arithmetic, RURV takes a matrix $A$ with $\sigma_r(A)/\sigma_{r+1}(A) \gg 1$, for some $1 \le r \le n-1$, and finds nearly unitary matrices $U, V$ and an upper triangular matrix $R$ such that $A \approx URV$. Crucially, $R$ has the block decomposition
\[
R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix} \tag{43}
\]
where $R_{11} \in \mathbb{C}^{r\times r}$ has smallest singular value close to $\sigma_r(A)$, and $R_{22}$ has largest singular value roughly $\sigma_{r+1}(A)$. We will use and analyze the following implementation of RURV.

RURV
Input: Matrix $A \in \mathbb{C}^{n\times n}$
Algorithm: $(U, R) = \mathsf{RURV}(A)$
1. $G \leftarrow$ $n\times n$ complex Ginibre matrix
2. $(V, R') \leftarrow \mathsf{QR}(G)$
3. $B \leftarrow AV^*$
4. $(U, R) \leftarrow \mathsf{QR}(B)$
Output: A pair of matrices $(U, R)$.
Ensures: With probability $1 - \theta^2$, $\|R_{22}\| \le \sqrt{r(n-r)}\,\sigma_{r+1}(A)/\theta$, for every $1 \le r \le n-1$ and $\theta > 0$, where $R_{22}$ is the $(n-r)\times(n-r)$ lower-right corner of $R$.

As discussed in Section 5, we hope to use DEFLATE to approximate the range of a projector $P$ with $\mathrm{rank}(P) = r < n$, given an approximation $\tilde P$ close to $P$ in operator norm. We will show that from the output of $\mathsf{RURV}(\tilde P)$ we can obtain a good approximation to such a subspace. More specifically, under certain conditions, if $(U, R) = \mathsf{RURV}(\tilde P)$, then the first $r$ columns of $U$ carry all the information we need. For a formal statement see Proposition C.12 and Proposition C.18 below.

Since it may be of broader use, we will work in somewhat greater generality, and define the subroutine DEFLATE, which receives a matrix $A$ and an integer $r$ and returns a matrix $S \in \mathbb{C}^{n\times r}$ with nearly orthonormal columns. Intuitively, if $A$ is diagonalizable, then under the guarantee that $r$ is the smallest integer such that $\sigma_r(A)/\sigma_{r+1}(A) \gg 1$, the columns of the output span a space close to the span of the top $r$ eigenvectors of $A$.
Our implementation of DEFLATE is as follows. Throughout this section we use $\mathsf{rurv}(\cdot)$ and $\mathsf{deflate}(\cdot,\cdot)$ to denote the exact arithmetic versions of $\mathsf{RURV}(\cdot)$ and $\mathsf{DEFLATE}(\cdot,\cdot)$, respectively. In Subsection C.1 we present a random matrix result that will be needed in the analysis of DEFLATE. In Subsection C.3 we state the properties of RURV and rurv that will be needed. Finally, in Subsections C.4 and C.5 we prove the main guarantees of deflate and DEFLATE, respectively, that are used throughout this paper.

C.1 Smallest Singular Value of the Corner of a Haar Unitary

We recall the defining property of the Haar measure on the unitary group:

Definition C.1. A random $n\times n$ matrix $V$ is Haar-distributed if, for any other unitary matrix $W$, both $WV$ and $VW$ are Haar-distributed as well. For short, we will often refer to such a matrix as a Haar unitary.

Let $n > r$ be positive integers. In what follows we will consider an $n\times n$ Haar unitary matrix $V$ and denote by $X$ its $r\times r$ upper-left corner. The purpose of the present subsection is to derive a tail bound for the random variable $\sigma_r(X)$. We begin with the well-known fact that we can always reduce our analysis to the case $r \le n/2$.

Observation C.2. Let $n > r > n/2$, let $V$ be a unitary matrix, and denote by $V_{11}$ its $r\times r$ upper-left corner and by $V_{22}$ its $(n-r)\times(n-r)$ lower-right corner. Then $2r - n$ of the singular values of $V_{11}$ are equal to 1, while the remaining $n - r$ are equal to those of $V_{22}$.

Proposition C.3 ($\sigma_{\min}$ of a submatrix of a Haar unitary). Let $n > r > 0$ and let $V$ be an $n\times n$ Haar unitary. Let $X$ be the upper-left $r\times r$ corner of $V$. Then, for all $t > 0$,
\[
\mathbb{P}\big[\sigma_r(X) \le t\big] = 1 - (1 - t^2)^{r(n-r)}. \tag{44}
\]

DEFLATE

Input: Matrix $\tilde A \in \mathbb{C}^{n\times n}$ and parameter $r \in \mathbb{N}$.
Requires: $\|\tilde A - A\| \le \eta \le 1/4$ for some $A \in \mathbb{C}^{n\times n}$ with $\mathrm{rank}(A) = \mathrm{rank}(A^2) = r$ and $\|A\| \le 1$, as well as the precision assumptions (59) on $\mathbf{u}$, $\mu_{\mathsf{MM}}(n)$, $\mu_{\mathsf{QR}}(n)$ and $c_{\mathsf N}$.
Algorithm: $\tilde S = \mathsf{DEFLATE}(\tilde A, r)$
1. $(U, R) \leftarrow \mathsf{RURV}(\tilde A)$
2. $\tilde S \leftarrow$ first $r$ columns of $U$
3. Output $\tilde S$
Output: Matrix $\tilde S \in \mathbb{C}^{n\times r}$.
Ensures: There exists a matrix $S \in \mathbb{C}^{n\times r}$ whose orthonormal columns span $\mathrm{range}(A)$, such that $\|\tilde S - S\|$ is small with high probability; the precise bound and failure probability are given in Proposition C.18, Theorem 5.3 and Remark C.19.

In particular, for every $t > 0$ we have

\[
\mathbb{P}\bigg[\sigma_r(X) \ge \frac{t}{\sqrt{r(n-r)}}\bigg] \ge 1 - t^2. \tag{45}
\]
This exact formula for the CDF of the smallest singular value of $X$ is remarkably simple, and we have not seen it anywhere in the literature. It is an immediate consequence of substantially more general results of Dumitriu [Dum12], from which one can extract and simplify the density of $\sigma_r(X)$. We will begin by introducing the relevant pieces of [Dum12], deferring the final proof

until the end of this subsection. Some of the formulas presented here are written in terms of the generalized hypergeometric function, which we denote by ${}_2F_1(a, b;\, c;\, (x_1, \dots, x_m))$. For our application it is sufficient to know that
\[
{}_2F_1\big(0, b;\, c;\, (x_1, \dots, x_m)\big) = 1 \tag{46}
\]
whenever $c > 0$ and ${}_2F_1$ is well defined. The above equation can be derived directly from the definition of ${}_2F_1$ (see Definition 13.1.1 in [For10], or Definition 2.2 in [Dum12]). The generic results in [Dum12] concern the $\beta$-Jacobi random matrix ensembles, which we have no cause here to define in full. Of particular use to us will be [Dum12, Theorem 3.1], which expresses the density of the smallest singular value of such a matrix in terms of the generalized

hypergeometric function:

Theorem C.4 ([Dum12]). The density of the probability distribution of the smallest eigenvalue of the $\beta$-Jacobi ensemble of parameters $a, b$ and size $m$, which we denote by $f_{\beta,a,b}^{m}(\lambda)$, is given by
\[
f_{\beta,a,b}^{m}(\lambda) = C_{\beta,a,b}^{m}\,\lambda^{\frac{\beta}{2}(a+1)-1}\,(1-\lambda)^{\frac{\beta}{2}m(b+m)-1}\;{}_2F_1\Big(1 - \tfrac{\beta}{2},\ \tfrac{\beta}{2}(b+m-1);\ \tfrac{\beta}{2}(b+2m-1)+1;\ (1-\lambda, \dots, 1-\lambda)\Big) \tag{47}
\]

for some normalizing constant $C_{\beta,a,b}^{m}$.

For a particular choice of parameters, the above theorem can be applied to describe the distribution of $\sigma_r(X)^2$. The connection between singular values of corners of Haar unitary matrices and $\beta$-Jacobi ensembles is the content of [ES08, Theorem 1.5], which we rephrase below to match our context.

Theorem C.5 ([ES08]). Let $V$ be an $n\times n$ Haar unitary matrix and let $r \le n/2$. Let $X$ be the $r\times r$ upper-left corner of $V$. Then the eigenvalues of $XX^*$ are distributed as the eigenvalues of a $\beta$-Jacobi matrix of size $r$ with parameters $\beta = 2$, $a = 0$ and $b = n - 2r$.

In view of the above result, Theorem C.4 gives a formula for the density of $\sigma_r(X)^2$.

Corollary C.6 (Density of $\sigma_r(X)$). Let $V$ be an $n\times n$ Haar unitary and let $X$ be its $r\times r$ upper-left corner with $r \le n/2$. In this case $\beta = 2$, $m = r$, $a = 0$ and $b = n - 2r$, and the density (47) becomes
\[
f(x) = C\,(1-x)^{r(n-r)-1}\;{}_2F_1\big(0,\ n-r-1;\ n;\ (1-x, \dots, 1-x)\big) = C\,(1-x)^{r(n-r)-1}, \tag{49}
\]
where the last equality follows from (46). Using the relation between the distribution of $\sigma_r(X)^2$ and the distribution of the minimum eigenvalue of the respective $\beta$-Jacobi ensemble described in Theorem C.5, we have $f_{\sigma_r(X)^2}(x) = f(x)$ on $[0, 1]$. By integrating the right side of (49) we find $C = r(n-r)$.

Proof of Proposition C.3. From (49) we have that
\[
\mathbb{P}\big[\sigma_r(X)^2 \le t\big] = \int_0^t r(n-r)\,(1-x)^{r(n-r)-1}\,dx = 1 - (1-t)^{r(n-r)},
\]
from where (44) follows. To prove (45), note that $g(t) := (1-t)^{r(n-r)}$ is convex on $[0, 1]$, and hence
\[
g(t) \ge g(0) + t\,g'(0) = 1 - t\,r(n-r)
\]
for every $t \in [0, 1]$.

C.2 Sampling Haar Unitaries in Finite Precision

It is a well-known fact that Haar unitary matrices can be numerically generated from complex Ginibre matrices. We refer the reader to [ER05, Section 4.6] and [Mez06] for a detailed discussion. In this subsection we carefully analyze this process in finite arithmetic. The following fact (see [Mez06, Section 5]) is the starting point of our discussion.
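The recipe just mentioned — QR of a complex Ginibre matrix, with the diagonal of $R$ normalized to be nonnegative — can be sketched together with a Monte Carlo check of the distribution function (44) of Proposition C.3. This is a sketch only; the sample size and tolerance are arbitrary, and the exact formula used is the one reconstructed in (44):

```python
import numpy as np

def haar_unitary(n, rng):
    # QR of a complex Ginibre matrix; multiplying the columns of Q by the
    # phases of diag(R) makes R's diagonal nonnegative (cf. Lemma C.7 below),
    # which makes the factorization unique and Q exactly Haar-distributed.
    G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Q, R = np.linalg.qr(G)
    d = np.diag(R)
    return Q * (d / np.abs(d))

n, r, t = 4, 2, 0.5
rng = np.random.default_rng(2)
samples = [np.linalg.svd(haar_unitary(n, rng)[:r, :r], compute_uv=False)[-1]
           for _ in range(4000)]
empirical = float(np.mean(np.array(samples) <= t))
exact = 1 - (1 - t**2) ** (r * (n - r))   # Proposition C.3, eq. (44)
assert abs(empirical - exact) < 0.05
```

For $r = 1$ this reduces to the classical fact that the squared modulus of one entry of a Haar unitary is $\mathrm{Beta}(1, n-1)$-distributed.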

69 Lemma C.7 (Haar from Ginibre). Let be a complex Ginibre matrix and ℂ × be n n defined implicitly, as a function of , by then equation and the constraints that is unitary G n 10n U , R and is upper-triangular with nonnegativen diagonal entriesn × . Then, is Haar distributed∈ in the unitary group. G G = U R U R U The above lemma suggests that QR can be used to generate random matrices that are ap- proximately Haar unitaries. While doing this, one should keep in mind that when working with finite arithmetic, the matrix passed to(⋅QR) is not exactly Ginibre-distributed, and the algorithm QR itself incurs round-off errors.n Following the discussionGž in Section 2.4 we canassume that we have accessto a random matrix , with

Gžn n n n where is a complex Ginibre matrixž and ℂ × is an adversarial perturbation whose ž G = G + E, G entries are bounded by Nu. Hence, we have n n Nu. n √1 G n n E QR In what follows we usen× to denote the exact arithmetic∈ F version of . Furthermore, we assume that for any ℂcQR(, ) returns a pairE ≤ E ≤with√nc the property( that) has nonneg- QR × ‖ ‖ ‖ ‖ ative entries on the diagonal.∈ n n Since⋅QR( we) want to compare( ) with QR ⋅ it is necessary to A A U , R QR( ) ( ) R have a bound on the condition number of the decomposition.n For this, wen cite the following consequence of a result of Sun [Sun91, Theorem 1.6]: G Gž QR Lemma C.8 (Condition number for the decomposition [Sun91]). Let ℂ × with ∈ n n invertible. Furthermore assume that −1 1 . If and , then QR 2 ( ) = QR( ) ( )A, =E QR( + ) A ž › E A ≤ U−1 , R A U, R A E ‖ ‖‖ − ‖ 4 F F We are now ready to prove the mainUž resultU of≤ thisA subsection.E . As in the other sections devoted ‖ ‖ ‖ ‖‖ ‖ to finite arithmetic analysis, we will assume that u is small compared to QR ; precisely, let us assume that ( ) u QR  n (50) ( ) 1 Proposition C.9 (Guarantees for finite-arithmetic n ≤ Haar. unitary matrices). Suppose that QR sat- isfies the assumptions in Definition 2.8 and that it is designed to output upper triangular matrices with nonnegative entries on the diagonal11. If QR , then there is a Haar unitary matrix ( )= ( ) and a random matrix such that . Moreover, forn every and we have = + V , R Gž 1 0 2 2 + 1 U E tn 3 Vž U E n > > t > √ 2 2 2 ℙ < 8 cNQR n u 10 cNu e 2 e− . ( ) + 1 − 2 − 2 t n E ž ≥ Proof. From our Gaussian‖ ‖ sampling assumption, G G5 where Nu. Also, by the 4 = + assumptions on QR from Definition 2.8, there are matricesn n and such that√ , ‖ ‖ ≤ nc( ) = QR( ) 10 is almost surely invertible and under this event and are uniquelyžEn determinedE by these conditions. 
žn G U R Gž Vž Vž , R Gž 11Any algorithm that yields the decomposition can be modified in a stable way to satisfy this last condition at n QR the cost of u operations O∗(n log(1/ ))

70 and

<QR n u − ( ) ž ž ž G‖Vž GV‖ QR n u G QR n u G ncNu . − ( ) ( ) + n n n n √ The latter inequality implies,‖ using‖≤ (50), that‖ ‖≤ ‖ ‖  ž Gž G QR n u G ncNu ncNu QR n u G ncNu. (51) − ( ) + + ( ) + 2 n n n √ √ n √ Let U, ¨‖ ‖≤ . From Lemma‖ ‖ C.7 we know that≤ is Haar‖ distributed‖ on the unitary group, so( using) ∶= (51 QR() and Lemma) C.8, and the fact that for any matrix n × , we know that √ R G ‖M‖ ≤U ‖M‖F ≤ n‖M‖ n n

M QR u N QR u −1 Nu −1 (52) − − ( ) − − − − 4√ ( ) + 10 ž ž n n n Now,‖U fromV‖  −1n ≤ ‖U V‖and‖ fromV V‖≤‖U Theorem V‖≤3.1 we havenc  thatn ‖G ‖‖G ‖ nc ‖G ‖. = 1/ ( ) ‖Gn ‖ n Gn −1 2 2 ( 2 ) = 2 n √ P ‖Gn ‖≥ ≤ e e . 2 On the other hand, from Lemma 2.2 of [BKMS19  ] we have − . Hence, 2 2+ nt under the events −1 and , inequality (52) yieldsn √ n 2 2+ P ‖G ‖> t ≤ e n n √ ‖G ‖≤ ‖G ‖≤3 t 2 2 4 N QR u 10 Nu − ( ) 2 2+ + 1 + n √ n ‖U V‖≤ c  n t  c . Finally, if we can exchange the term for in the bound. Then, using a union bound we2√2 obtain + 1 the advertised guarantee. 2√2+ + 1 2 t > t t C.3 Preliminaries of RURV

Let ℂ × and . As will become clear later, in order to analyze DEFLATE it is of fundamental∈ n n ( importance) = rur to bound the quantity , where is the lower-right 22 22 22 blockA of . To thisU end, , R it willv( sufficeA) to use Corollary C.11 below, which is the complex analog(A, r) to the upper bound given in equation (4) of [BDDR19‖R, Theorem‖ 5.1].R Actually, Corollary(n−rC.11)×(nis−r a) R direct consequenceR of Lemma 4.1 in the aforementioned paper and Proposition C.3 proved above. We elaborate below.

Lemma C.10 ([BDDR19]). Let , ℂ × and ∗ be its singular value decompo- sition. Let , be the lower rightn n corner of , and be such that 22 n>r> A A P Q . Then, if ∗ ∗, 0 ∈ = Σ (U , R) = rurv(A) R (n − r)×(n − r) R V A U RV X Q V +1 = = 22 r  A11 where is the upper left block of ‖R. ‖≤ n ( ), 11  (X ) X r r X × 71 This lemma reduces the problem to obtaining a lower bound on . But, since is a Haar 11 unitary matrix by construction and ∗ with ∗ unitary, we haven that is distributed as a Haar unitary. Combining Lemma C.10 and Proposition C.3 gives the following(X ) result.V X = Q V Q X Corollary C.11. Let , ℂ × , and be the lower right corner of . Then for any n n 22 n>r> 0 A ∈ (U , R) = rurv(A) R (n −r)×(n −r) R  > 0ℙ 2 22 +1 tr n r ‖R ‖≤ ( − )r A ≥  . L  ( ) 1− C.4 Exact Arithmetic Analysis of DEFLATEM It is a standard consequence of the properties of the decomposition that if is a matrix of rank , then almost surely is a matrix with orthonormal columns that span the range of . As a warm-updef let’s recall this argument. QR A r A, r n r Let andlate(be the) unitary× matrix used by the algorithm to produce this output. SinceA we are working in exact arithmetic, is a Haar unitary matrix, and hence it is U , R A V almost surely( ) invertible. = rurv( ) Therefore, with probability 1 we have ∗ , so if ∗ we V rank( )= = will have and ℂ × , where and are as in (43). Writing 22 = 0 11 ∈ r r 11 22 AV r U R AV R R R R 11 12 = U21 U22 U for the block decomposition of with ℂU× , noteU that 11 ∈0 r r 1 U U ∗ 11 11 11 12 + 12 22 (53) = = U21R11 U21R12 + U22R22 AV U R . On the other hand, almost surely the first UcolumnsR U ofR ∗UspanR the range of . Using the right side of equation (53) we see that this subspace0 also coincides with1 the span of the first columns of , since is invertible. r AV A We will now11 prove a robust version of the above observation for a large class ofr matrices, U R 12 namely those for which 2 . 
We make this precise below and defer the proof to the end of the subsection.rank( ) = rank( ) A A A Proposition C.12 (Main guarantee for ). Let 0 and ∈ ℂ × be such that − def n n and 2 . Denote and . Then, for any rank( ) = rank( ) = ∶= def ≤ (0 1), with probability 1− 2 there existslate a unitary ∈ ℂ × suchž that ž > A,T A A, r ‖A A‖ late( ž ) r r ∶= deflate( )  , A A r  S 8 (A,−W r ) ∈ − (54) ∗ z ( ) tr n∗ r w Remark C.13 (The projector case)S . InTW the case≤ inr which the⋅ matrix of Proposition C.12 is a (not ‖ ‖  T AT . necessarily orthogonal) projector, ∗ = , and the term in the denominator of (54) becomes a 1. r r A 12 T AT For example, diagonalizable matrices satisfy thisI criterion. 

72 We begin by recalling a result about the stability of singular values which will be important throughout this section. This fact is a consequence of Weyl’s inequalities; see for example [HJ12, Theorem 3.3.16] .

Lemma C.14 (Stability of singular values). Let ∈ ℂ × . Then, for any = 1 … we have n n ( + )− ( ) k , ,n X, E k k We now show that the orthogonal X projectionE  X ≤ E . ∗ is close to a projection onto the range of , in the| sense that ≈| . ‖def‖ P A,ž r A,ž r ∶= late( )deflate( ) Lemma C.15. Let 0 andA ∈ ℂ × be suchPA that A and . Let and . Then, almostn n surely rank( ) = − ( ) ∶= rur def ≤ ž ‖ ž‖ > A, A ∗ A r A A U , R (55) v(Až) S ∶= late(A,rž ) 22 where is the lower right blockn of≤ . 22 ‖(SS − I )A‖ ‖R ‖+ , Proof. We will begin by showingn r thatn r ∗ R is small. Let be the unitary matrix that ( − )×( − ) was usedR to generate . As outputsn the first columns of , we have the block decomposition , where ℂSS and ℂ . ¨ ‖( × − I )Až‖¨ ×( − ) V On the other hand we have ⋅,n⋅ sor n n r (U , R) deflate( , ) r U U = S U  S ∈ U ∈ ∗ ∗ ¨ 0 − ¨ = 0 − ¨ Až = U RV 2 2 n , Since = = 1 from the above equation we get ( − ) . Now we can conclude ¨ ( − ) ž =( − ) = ∗ that SS I A SS I S U  RV U  RV 22U R  V. n ≤ ‖ ‖ ‖ ‖ ( ∗ − ) ( ∗ − ) + ( ∗ −‖ )( − )‖ ‖ ‖ + U V SS I Až R22 n ≤ n n ≤ ‖ ‖ ‖ ž‖ ‖ ž ‖ ‖ ‖ The inequality (55SS) canI beA appliedSS to quantifyI A theSS distanceI A betweenA R the ranges . of and in terms of , as the following result shows. def 22 late( ž ) Lemma C.16 (Bound in terms of ). Let 0 and ∈ ℂ × be such that A, r deflate( ) ‖ ‖ 22 n n rank( ) = 2 A, rand R . Denote by , and rank( ) = − ( ) ∶= rur def . Then, almost surely there‖ exists‖ a unitary ℂ מ such that ≤ R > A, A TA ‖ ž‖ v(r žr ) ∶= late( ž ) ∶= A A, r r A A U , R W A S A,r deflate( ) ∗ 22 ∈ (56) w R ∗ S TW ≤ ‖ ‖+ , where is the lower right ‖ − block‖ 2 of .r T AT 22  ( ) Proof. RFrom Lemma C.15 wen knowr thatn r almost surelyR ∗ . We will use this ( − )×( − ) 22 to show that ∗ ∗ is small, which can be interpreted asn ∗ being close to unitary. First note that SS ≤ r ‖( − ) ‖ ‖ ‖+ T SS T I AT R ‖ − ‖ ∗ ∗ I ∗ ∗ r w S T ∗ SS∗ Ir w . 
(57) w sup, w ( − ) =w supA , w ( − ) ∈ℂ ‖ ‖=1 ∈range( ) ‖ ‖=1 T SS T r r T SS T ‖ − I ‖= ‖ I73 ‖ ‖ ‖ Now, since A A2 , if w A then w Av for some v A . So by the Courant-Fischerrank( formula) = rank( ) ∈ range( ) = ∈ range( ) w Av Au r T ∗AT . v v u A u = ∈range(inf ) = ( ) We can then revisit (57) and‖ get‖ ‖ ‖ ≥ ‖ ‖ ‖ ‖ ‖ ‖ ‖ ‖ T ∗ SS∗ Ir Av T ∗ SS∗ Ir AT T ∗ SS∗ Ir w ( − ) ( − ) . (58) w supA , w ( − ) =v supA , v r T ∗AT r T ∗AT ∈range( ) ‖ ‖=1 ∈range( ) ‖ ‖ 1 ( ) ( ) ‖ ‖ ≤ ‖ ‖ ≤ On the other hand‖ T ∗ SS∗ Ir AT‖ SS∗ Ir A , so combining this fact with (57) and (58) we obtain ( − ) ( − ) 22 + ≤ ≤ ‖ ‖∗ ‖∗ r ‖ 22 ‖R+ ‖ − r ∗ ‖ ( ‖ ) Now define , 22 TandSS letT ≤ be the. polar decomposition of . Observe ∗ ¨ ‖ ‖+ ‖ ‖ R T AT ∶= ∶= ( ∗ ) I= that r  X S T RT AT  |2 | ∗ ¨ − 1( ) − 1 X 1(W)X− 1 = − X Thus r Finally note that r ∗ ≤ ≤ ¨ ≤ − = −‖| | = ‖( − ) | | ‖ ‖ X I  Xn  X X X I . T W X ∗ 2W ∗ X ∗ ≤ ∗ ‖S ‖− ‖ =‖ ( ‖ −| | I )(W−‖ . ) ∗ ∗ ∗ TW = 2S − W T S− TW ‖S ‖ ‖ r ∗ ∗ ∗ ‖ ∗ ∗ ∗ ∗ = 2 − TW( +W T −S )−( + − ) ‖ Ir S∗ ∗ ∗ ‖ ∗ ∗ ∗ ∗ ¨ 2 − T T S + W ( T S− S)T+ (W −S T )T S 4 ‖ Ir S ‖ which concludes the proof.≤ T T S S T W T S W S T T S ≤ ‖I S ‖ ‖ ‖ ‖ ‖ , Note that so far our results have been deterministic. The possibility of failure of the guarantee given in Proposition C.12 comes from the non-deterministic bound on . 22 Proof of Proposition C.12. From Lemma C.14 we have . Now combine Lemma C.16 with Corollary C.11. +1( ) ‖R ‖ r ≤  Až C.5 Finite Arithmetic Analysis of DEFLATE In what follows we will have an approximation of a matrix of rank with the guarantee that . −For the sake of readability we will not present optimal bounds for the error induced by round- Až A r off, and≤ we will assume that ‖A Až‖ 1 max{ N MM( )u N QR( )u} MM QR N (59) 4 4 and 1 min{ ( ) ( ) } ⋅ n ,c  n ≤ RURV≤ ≤ ≤ n , n ,c . We‖ ‖ begin by analyzing the subroutine ‖in‖ finite arithmetic. This was done in [DDH07, LemmaA 5.4]. 
Herec  we make the constants arising fromA this analysis expl icit and take into consid- eration that Haar unitary matrices cannot be exactly generated in finite arithmetic.

74 Lemma C.17 (RURV analysis). Assume that QR and MM satisfy the guarantees in Definitions 2.6 and 2.8. Also suppose that the assumptions in (59) hold. Then, if RURV and is the matrix used to produce such output, there are unitary matrices ( and) ∶= a matrix( ) such that and the following guarantees hold: U , R A V = U,ž Vž Až Až 1.Už RVž QR u. − ( ) 2. Uis HaarUž ≤ distributedn in the unitary group. ‖ ‖ 3.V Forž every and , the event: 1 0 2 2 + 1 tn> >3 t > n√ tn 3 n 2 2 ž 2 2 < 8 cNQR n u 10 u A A < A 9 cNQR n u MM n u 10 cNu − ( ) + and − ( ) + 2 ( ) + Vž V (60) ‖ ‖ ‖ 2 ‖ ‖ ‖ occurs with probability at least e 2 e− . 0 1 1 − 2 − 2 t n Proof. By definition V QR Gž with Gž G , where is an Ginibre matrix and u. A direct application= ( of) the guarantees= + on each step yields the× following: n n n n n n E 1.≤ √ Fromn Proposition C.9, we know that there isE a Haar unitaryG and a random matrix , ‖ ‖ such that and 0 = + 0 Vž E V Vž E tn 3 n 2 2 2 ℙ < 8 cNQR n u 10 cNu e 2 e− . (61) 0 ( ) + 1 − 2 − 2 t n E ≥ ‖ ‖ 2. If B MM A, V ∗ 4 AV ∗ , then from the guarantees5 for MM we have MM u. 1 1 Now∶= from the( guarantees)= + for QR we know that is QR u away from a unitary, and hence( ) ( ) ≤ n ‖ ‖ ‖ ‖‖ ‖ E V  n E A V  MM u QR u MM u 5 MM u ( ) (1 + ( ) ) ( ) ( ) 4 V  n ≤  n  n ≤  n where the last inequality‖ ‖ follows from the assumptions in (59). This translates into

QR u 5 + 1 (1 + ( ) ) + 1 + 1 4 B ≤ A V E ≤  n A E ≤ A E . Putting the above‖ together‖ ‖ ‖‖ and‖ ‖ using‖ (59) again, we‖ get‖ ‖ ‖ ‖ ‖ ‖ ‖

5 MM u and 5 MM u < A . (62) 1 ( ) (1 + ( ) ) 2 4 4 E ≤ A  n B≤ A  n 3. Let U, QR‖ .‖ Then‖ there‖ is a unitary and a‖ matrix‖ such that ‖ ‖ , , 2 3 and( )=, with( ) error bounds QR u and QR u. Using= + (62) we= obtain+ = 2 ( ) 3 ( ) R B Už B› U Už E B B› E ≤ QR n u < AEQR≤ nBu. n (63) B› Už R 3 ‖E ‖  ( ) 2 ‖ ‖ (‖ )‖ E ≤ B  n ‖ ‖ ‖ ‖ ‖ ‖ 75 4. Finally, define Až B›Vž. Note that Až Už and ∶= =

∗ ∗ ∗ = =( − 3) =( + 1 − 3) =(RVž ( + 0) + 1 − 3) = +( 0 + 1 − 3) which translates into Až B›Vž B E Vž AV E E Vž A Vž E E E Vž A AE E E V,ž − 0 + 1 + 3 Hence, on the event described in the left≤ side of (61), we have ‖A Až‖ ‖A‖‖E ‖ ‖E ‖ ‖E ‖. 3 2 2 8 N QR u 10 Nu 5 MM u QR u − ( ) + + ( ) + 2 ( ) tn n 4 ≤ c  n c  n  n , and using some‖ crude‖ ‖ bounds,‖ the above inequality yields the advertised bound. A Až A 0 1

We can now prove a finite arithmetic version of Proposition C.12.

Proposition C.18 (Main guarantee for DEFLATE). Let be positive integers, and let 0 and ℂ × be such that and 2 . Let DEFLATE and ∈ n n . If QR and− MM satisfyrank( the guarantees) =n>r rank( in) Definitions = 2.6∶=and 2.8, and( (59) ∶= def ≤ , > holds, then,ž for every ‖ 1 therež‖ exist a unitary ∈ ℂ × such that ž A,T A A, r A A A A r S A, r late( ) r r t > √ W 2 2+ 2 ( − ) − QR( )u + 12 (64) ∗ z ( ) tn tr ∗n r w 2 S TW ≤ n r . ‖ ‖ 2 with probability at least 1 − 7 2 − 2 − .  T AT , t n  Proof. Let ( )= RURV( ). From Lemma C.17 we know that there exist ∈ ℂ , such that  e × − and − are small, and ( )= for the respective realization of ann n exact Haar unitary matrix.U , R Then, fromAž and (60), for every and U,ž Ažž we have ‖ ž‖ ‖ ž žž‖ ž rurv(žž) U U A A ≤ U , R A3 t > √ + 2 1 0 2 2 + 1 ‖Až‖ ‖A‖ QR Nu > >MM u Nu (65) ž ž tn n ôA Ažô ≤ ôAž Ažô Až A ≤ A 9  n c  n 10 c ô − ô ô − ô + ‖ − ‖ 2 (‖ ‖ + ) ( ) + 2 ( ) + + withô probabilityô ô ô 2 − . 0 1 , t n Now, from (59) we have u 1 and N u for QR MM , so we can bound the respective terms1 − 2 in (65−) 2 by : 4 e e ≤ ≤ ≤ n , n ‖ ‖ = ( ) ( ) 3 c A    3 2 2 2 2 N QR u MM u Nu tn n tn n 9 c  n  n 10 c ≤ 9 10 (‖ ‖ + ) ( ) + 2 ( ) + + (1 + ) 2 + 2 + +(66) A 0 1 0 1 t ≤ (12 + 16)n where the last crude bound uses 3 and . 2 2 5 , 4 ≤n ≤n , ≤ t > 1 1+76 2 Observe that is the matrix formed by the first columns of , and that by def Proposition C.12 we know that for every 0, with probability 1− 2 there exists a unitary S› A,ržž r Už such that = late( )  >  W 8 ( − ) − − ∗ (67) z ( ∗ ) { žž tr n r ôA Aô On the other hand, is the matrixS› TW formed≤ by ther first .columnsô ofô. . Hence ‖ ‖  T AT ô  ô

S − − rQR( )u U

Putting the above together we get thatS S› under≤ U thisUž event≤ n . ‖ ‖ ‖ ‖

8 ( − ) − − ∗ − + − ∗ QR( )u + (68) z ( ∗ ) { žž tr n r ôA Aô S TW ≤ S S› S› TW ≤ n r . ô ô. Now, taking ‖ = , we‖ note‖ that‖ both‖ events‖ in (65) and (67) happenT AT withô probability ô at least 2 1−(2 +1) 2 −2 − . The result follows from replacing the constant 2 +1 with 7, using 2 2+1 and replacing 8(12t n + 16) with 144 , and combining the inequalities (65), (66) and (68). e  e e t > √ We end by provingt Theorem 5.3t, the guarantees on DEFLATE that we will use when analyzing the main algorithm. Proof of Theorem 5.3. As Remark C.13 points out, in the context of this theorem we are passing to DEFLATE an approximate projector , and the above result simplifies. Using this fact, as well as the upper bound ( − ) 2/4, we get that P› r n r ≤n 12 3 − ∗ QR( )u + ttn S TW ≤ n 2 with probability at least 1 − 7 2 −‖ 2 − for‖ every 2 2. If our desired. quality of approximation t n is − ∗ = , then some basic algebra gives the success probability as at least t > √ S TW   e 3 2 ‖ ‖ 1 − 1008 − 2 − ( − QR( )u) n t 2 t n e . Since 1/4, we can safely set = 2/ , giving n   ≤ t t 3 1 − 1426 − 2 −2 / ( − QR( )u)2 n n t e . n 3 To simplify even further, we’d like to use the upper bound 2 −2 / . These two terms   QR u 2 have opposite curvature in on the interval (0 1), and are equaln at( zero,− n ( so) ) it suffices to check √n e ≤  

, 77 that the inequality holds when = 1. The terms only become closer by setting = 1 everywhere except in the argument of QR( ), so we need only check that n  ⋅ 2 1

2 ( − QR( )u)2 ≤ . Under our assumptions QR( )u 1,e the right  handn side is greater than one, and the left hand less. Thus we can make the replacement, use u , and round for readability to a success , n ≤ 2  ( ) probability of no worse than QR ≤  n 1 − 6000 3 n t2 the constant here is certainly not optimal. ; Finally, for the running time, we need to sample 2 complex Gaussians, perform two QR decompositions, and one matrix multiplication; this gives the total bit operations as n

DEFLATE 2 N QR MM

T (n)= n T + 2T (n)+ T (n). Remark C.19. Note that the exact same proof of Theorem 5.3 goes through in the more general case where the matrix in question is not necessarily a projection, but any matrix close to a rank-

deficient matrix . In this case an extra ∗ term appears in the probability of success (see DEFLATE the guarantee given in the box for the Algorithmr that appears in this appendix). A  (T AT )

78