Methods for Estimating the Diagonal of Matrix Functions

Jesse Harrison Laeuchli

Williamsburg, Virginia

Bachelor of Science, University of Notre Dame, 2007
Master of Science, College of William and Mary, 2012

A Dissertation presented to the Graduate Faculty of the College of William and Mary in Candidacy for the Degree of Doctor of Philosophy

Department of Computer Science

The College of William and Mary
May 2016

© Copyright by Jesse Harrison Laeuchli 2016

ABSTRACT

Many applications, such as path integral evaluation in Lattice Quantum Chromodynamics (LQCD), variance estimation of least squares solutions and spline fits, and centrality measures in network analysis, require computing the diagonal of a function of a matrix, Diag(f(A)), where A is a sparse matrix and f is some function. Unfortunately, when A is large, this can be computationally prohibitive. Because of this, many applications resort to Monte Carlo methods. However, Monte Carlo methods tend to converge slowly.

One method for dealing with this shortcoming is probing. Probing assumes that nodes that have a large distance between them in the graph of A have only a small weight connecting them in f(A). To determine the distances between nodes, probing forms A^k. Coloring the graph of this matrix groups together nodes that have a large distance between them, and thus a small connection in f(A). This enables the construction of certain vectors, called probing vectors, that can capture the diagonal of f(A). One drawback of probing is that in many cases it is too expensive to compute and store A^k for the k that adequately determines which nodes have a strong connection in f(A). Additionally, it is unlikely that the set of probing vectors required for A^k is a subset of the probing vectors needed for A^{k+1}. This means that if more accuracy in the estimation is required, all previously computed work must be discarded.

In the case where the underlying problem arises from a discretization of a partial differential equation (PDE) onto a lattice, we can make use of our knowledge of the geometry of the lattice to quickly create hierarchical colorings for the graph of A^k. A hierarchical coloring is one in which the colors for A^{k+1} are created by splitting groups of nodes sharing a color in A^k. The hierarchical property ensures that the probing vectors used to estimate Diag(f(A)) are nested subsets, so if the results are inaccurate the estimate can be improved without discarding the previous work.

If we do not have knowledge of the intrinsic geometry of the matrix, we propose two new classes of methods that improve on the results of probing. One method seeks to determine structural properties of the matrix f(A) by obtaining random samples of the columns of f(A). The other method leverages ideas arising from similar problems in graph partitioning, and makes use of the eigenvectors of f(A) to form effective hierarchical colorings.

Our methods have thus far seen successful use in computational physics, where they have been applied to compute observables arising in LQCD. We hope that the refinements presented in this work will enable interesting applications in many other fields.

TABLE OF CONTENTS

Acknowledgments
Dedication
List of Tables
List of Figures

1 Introduction
  1.1 Motivation
    1.1.1 Prior Work and New Approach
  1.2 Overview

2 Prior Work and Applications
  2.1 Applications
    2.1.1 Statistical Applications
    2.1.2 Lattice Quantum Chromodynamics
    2.1.3 Network Centrality
  2.2 Prior Work
    2.2.1 Statistical Methods
    2.2.2 Non-Statistical Methods

3 Estimation of Diag(f(A)) on toroidal lattices
  3.1 Lattices with dimensions consisting only of powers of 2
    3.1.1 Introduction
    3.1.2 Preliminaries
    3.1.3 Lattice QCD problems
    3.1.4 The Monte Carlo method for Tr(A^{-1})
    3.1.5 Probing
    3.1.6 Hadamard vectors
    3.1.7 Overcoming probing limitations
    3.1.8 Hierarchical coloring
    3.1.9 Hierarchical coloring on lattices
    3.1.10 Splitting color blocks into conformal d-D lattices
    3.1.11 Facilitating bit reversal in higher dimensions
    3.1.12 Lattices with different sizes per dimension
    3.1.13 Coloring lattices with non-power of two sizes
    3.1.14 Generating the probing basis
    3.1.15 Removing the deterministic bias
    3.1.16 Numerical experiments
    3.1.17 Comparison with classical probing
    3.1.18 Comparison with random-noise Monte Carlo
    3.1.19 A large QCD problem
    3.1.20 Conclusions
  3.2 Lattices of arbitrary dimensions
    3.2.1 Introduction and Preliminaries
  3.3 Lattices as spans of sublattices
  3.4 Coloring sublattices
    3.4.1 Hierarchical Permutations of Lattices with Equal Sides
    3.4.2 Hierarchical Permutations of Lattices with Unequal Sides
    3.4.3 Generating Probing Vectors Quickly
  3.5 Probing Vectors for Hierarchical Coloring on General Graphs
  3.6 Performance Testing
  3.7 Conclusion

4 Estimation of Diag(f(A)) in the general case
  4.1 Graph Coloring
  4.2 Statistical Considerations
  4.3 Structural Methods
  4.4 Spectral Methods
    4.4.1 Spectral k-partitioning for the matrix inverse
  4.5 Experimental Results
  4.6 Conclusions

5 Conclusion and future work
  5.1 Methods for Lattices
  5.2 Methods for General Matrices

ACKNOWLEDGMENTS

It is difficult to convey my deep gratitude and respect for my advisor Andreas Stathopoulos, without whom this work would not have been possible. His keen perception, mathematical insight, and deep understanding of the field carried us through many difficult problems. To me he is the ideal computer scientist and mentor. I also wish to thank all of the members of my dissertation committee for their thoughtful comments and guidance, which greatly improved this work.

As part of the Computational Science research group at William and Mary, I had the good fortune to work with Lingfei Wu, and Eloy Romero Alcalde. I am grateful for their support and friendship.

During the course of this research I had several mentors at different internships, whose guidance and support contributed greatly to this work. In particular I would like to thank Chris Long, Lance Ward, and Geoff Sanders. They provided the spark for many ideas, and a great working environment to explore them.

For their many stimulating conversations on Computer Science, Mathematics and other topics, I would like to thank Philip Bozek, Douglas Tableman, and Walter McClean.

Finally, I would like to thank the Tan family for their moral support during the production of this thesis.

To My Parents, Samuel and Elizabeth Laeuchli

LIST OF TABLES

2.1 Convergence rates of different methods

3.1 Table showing run times of the new algorithm compared to the original. Results obtained on an Intel i7 860 clocked at 2.8 GHz

LIST OF FIGURES

2.1 The area zeroed out by using Hadamard vectors
2.2 An example of probing a 4-colorable graph
2.3 An example of wasted probing vectors in non-hierarchical probing

3.1 Visualizing a 4-colorable matrix permuted such that all rows corresponding to color 1 appear first, rows for color 2 appear second, and so on. Each diagonal block is a diagonal matrix. The four probing vectors with 1s in the corresponding blocks are shown on the right.
3.2 Crossed out nodes have their contribution to the error canceled by the Hadamard vectors used. Left: the first two, natural order Hadamard vectors do not cancel errors in some distance-1 neighbors in the lexicographic ordering of a 2-D uniform lattice. Right: if the grid is permuted with the red nodes first, the first and the middle Hadamard vectors completely cancel variance from nearest neighbors and correspond to the distance-1 probing vectors.
3.3 When doubling the probing distance (here from 1 to 2) we first split the 2-D grid into four conformal 2-D subgrids. Red nodes split to two 2 x 2 grids (red and green), and similarly black nodes split to blues and blacks. The smaller 2-D grids can then be red-black ordered.
3.4 Error in the Tr(A^{-1}) approximation using the MC method with various deterministic vectors. Classic probing requires 2, 16, 62, and 317 colors for probing distances 1, 2, 4, and 8, respectively. Left: classic probing approximates the trace better than the same number of Hadamard vectors taken in their natural order. Going to a higher distance k requires discarding previous work. Right: perform distance-k probing, then apply Hadamard in natural order within each color. This performs well, but hierarchical probing performs even better.
3.5 Left: the hierarchical coloring algorithm is stopped after 1, 2, 3, 4, 5 levels, corresponding to distances 2, 4, 8, 16, 32. The ticks on the x-axis show the number of colors for each distance. Trace estimation is effective up to the stopped level; beyond that the vectors do not capture the remaining large elements of A^{-1}. Compare the results with classical probing in Figure 3.4, which requires only a few fewer colors for the same distance. Right: when the matrix is shifted to have a high condition number, the lack of structure in A^{-1} causes all methods to produce similar results.
3.6 Convergence history of the Z2 random estimator, Hadamard vectors in natural order, and hierarchical probing, the latter two with bias removed as in (3.9). Because of the small condition number, A^{-1} has a lot of structure, making hierarchical probing clearly superior to the standard estimator. As expected, Hadamard vectors in natural order are not competitive. The markers on the plot of the hierarchical probing method designate the number of vectors required for a particular distance coloring to complete. It is on these markers that structure is captured and error minimized.
3.7 Convergence history of the three estimators as in Figure 3.6 for a larger condition number O(10^4). As the structure of A^{-1} becomes less prominent, the differences between methods reduce. Still, hierarchical probing has a clear advantage.
3.8 Convergence history of the three estimators as in Figure 3.6 for a high condition number O(10^6). Even with no prominent structure in A^{-1} to discover, hierarchical probing is as effective as the standard method.
3.9 Statistics over 100 random vectors z0, used to modify the sequence of 2048 hierarchical probing vectors as in (3.9). At every step, the variance of quadratures from the 100 different runs is computed, and confidence intervals are reported around the hierarchical probing convergence. Note that for the standard noise MC estimator confidence intervals are computed differently and thus are not directly comparable.
3.10 (a) Left: the variance of the hierarchical probing trace estimator as a function of the number of vectors (s) used. The minima appear when s is a power of two. The places where the colors complete are marked with the cyan markers. These minima become successively deeper as we progress from 2 to 32 to 512 vectors. (b) Right: speed-up of the LQCD trace calculation over the standard Z2 MC estimator. The cyan markers show where colors complete. The maximal speed-up is observed at s = 512. In both cases the uncertainties are estimated using the Jackknife procedure on a sample of 253 noise vectors, except for s = 256 and 512, where 37 vectors were used.
3.11 The decomposition of a 6x6 lattice into 3^2 sublattices L(3I)c0
3.12 Affine sublattices with x, y coordinates
3.13 The circled nodes constitute the C lattice of offsets. Note how C tiles the entire lattice, and that its coloring reflects the coloring of each sublattice L(bI)c. Since b = 3, each line of colors is the same as the previous line, shifted by 1 mod 3.
3.14 Comparison of the two methods on a 2D lattice with common factors 2 x 2 x 3 x 3 x 5. For the common factors of two, the methods are the same, but once these are exhausted the improved method has much lower error.

4.1 Probing vs coloring the structure of L† directly, where the percentage of the weight of L† retained varies from 0.1 to 0.5 in 0.05 increments. As the number of colors increases, probing struggles to capture the structure of L†.
4.2 Sampling 4 columns from A and then shifting them to detect if they share an off-diagonal
4.3 Approximation of the pseudo-inverse of a Laplacian with periodic boundary conditions with 10, 100, and 1000 vectors in the vv^T approximation. Here the vectors v are not sparsified; the figure shows how, even in the best case of an unsparsified v, vv^T contains mostly local structure until a significant number of vectors v are supplied.
4.4 Lattice Graph Results
4.5 Scale Free Graph Results
4.6 Wiki-Vote Graph Results
4.7 P2P-GNU Graph Results
4.8 Gre512 Matrix Results
4.9 Orseg Matrix Results
4.10 Mhd416 Matrix Results
4.11 Nos6 Matrix Results
4.12 Bcsstk07 Matrix Results
4.13 Af23560 Matrix Results

Chapter 1

Introduction

1.1 Motivation

In this work we study the problem of computing Diag(f(A)) and Σ_{i=1}^{N} Diag(f(A))_i = Tr(f(A)), where A is a sparse matrix of size N and f is some function. Some useful examples are f(A) = A^{-1}, or f(A) = exp(A). When A is small, this can be computed directly, which is an O(N^3) approach. When A is of intermediate size and is properly structured, recursive factorization methods allow for an O(N^2) solution [70]. However, in many problems of interest, the size of A can be such that even O(N^2) solutions are impractical. Because of this we abandon exact computation, and attempt to approximate the desired diagonals. There are two main methods for approximating the desired results.

The first of these is based on Monte Carlo methods [45, 22]. The second is based on matrix sparsification [62], where we hope to ignore unimportant parts of the matrix in order to speed up the computation. Our work takes advantage of the features of both types of approximations in order to produce better algorithms.
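As an illustration only (not code from the dissertation), the following minimal Python/NumPy sketch shows the direct O(N^3) baseline mentioned above: form f(A) through an eigendecomposition and read off its diagonal. The test matrix and functions are arbitrary choices for the example.

```python
import numpy as np
from scipy.linalg import eigh

# Small symmetric test matrix: a shifted 1-D Laplacian, kept invertible by the shift.
N = 200
A = 2.1 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)

# Direct O(N^3) evaluation of Diag(f(A)) via the eigendecomposition A = V E V^T.
E, V = eigh(A)

def diag_f(f):
    # Diag(f(A)) = diag(V f(E) V^T), assembled from the eigendecomposition.
    return np.einsum('ij,j,ij->i', V, f(E), V)

diag_inv = diag_f(lambda e: 1.0 / e)   # Diag(A^{-1})
diag_exp = diag_f(np.exp)              # Diag(exp(A))
print(diag_inv.sum(), np.trace(np.linalg.inv(A)))   # the two traces agree
```

This is exactly the computation that becomes infeasible at the matrix sizes discussed below, which motivates the approximation methods studied in this work.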

The need to apply efficient solutions to this problem is a result of ever increasing matrix sizes in many diverse areas. Several examples are Statistics [45, 67], Lattice Quantum

Chromodynamics (LQCD) [31], Material Science [28], and Network Analysis [68]. For example, in LQCD increasing the size of the lattice improves the physical accuracy of the

simulation. Thus, there is significant interest in increasing the size of the lattices beyond what is currently computationally feasible. The same is true in the field of social network analysis. As social networks expand and represent ever more interconnected networks, performing analysis requires the use of increasingly large matrices. Finally, as large data sets become ever more prevalent, many statistical processes require matrices that continue to increase in size, necessitating faster methods.

1.1.1 Prior Work and New Approach

This problem of computing Diag(f(A)) differs from common numerical analysis problems in a key way. Frequently, when similar problems are encountered, the problem is rewritten to be an optimization problem on a convex function. The problem can then be approached using optimization methods such as Newton’s method, Gradient Descent,

Conjugate Gradient and Non-Linear Conjugate Gradient, to converge to the value we are seeking. With this problem, no such optimization process can be undertaken. Because of this, statistical methods must be used.

Since this problem was first studied by Hutchinson [45], several statistical methods for the problem have been proposed. While these methods have the attractive feature that they provide statistical error estimates, they converge slowly and do not take advantage of information that the user may have about the matrix. The main goal of this work is to take advantage of anything that is known about a matrix in order to obtain a better estimate of Diag(f(A)) in less time. Although the matrix f(A) is normally dense, if

the smallest elements of f(A) are dropped, structure in this sparsified version of f(A) will emerge. Since exactly obtaining this structure is no easier than solving the original problem, alternative information is used to approximate it. Once this structure is known, an estimate for Diag(f(A)) can be obtained. This was the idea behind the method of

[60] known as probing. Probing uses the powers of a matrix A to obtain an estimate of the structure of f(A). For A−1 in particular this is based on the assumption that the

Neumann series of A converges to f(A). For different functions of A, other polynomials

could be considered. Once an estimate for the structure of f(A) is obtained, the authors of [62] show how to create probing vectors that allow for the recovery of the diagonal.

However, this requires taking high powers of the matrix A, which are expensive to compute and may be impractical to store. Ideally, knowledge of the matrix A that is less expensive to obtain should be used. Further, the method in [60] provides no way to tell how accurate the estimate is. Since this is important for many applications, this is a significant drawback.

Our proposed methods take advantage of several major areas of knowledge in order to obtain the structure of f(A), in ways that are cheaper than finding powers of A. The first is geometric information. Many of our target applications arise from partial differential equations (PDE) that are discretized onto lattices. For a PDE given by g(u) = y, the solution at u is often given in terms of the Green’s function G(u, u′), where u is the point we are obtaining a solution for, and u′ is some other point not equal to u. For many PDEs, as ‖u − u′‖ increases, G(u, u′) decays quickly, since the physical forces the Green’s function is attempting to model fall off rapidly with distance. Because of this, only connections between nodes that have short distances in the graph will have a large connection, or to put this another way, only elements in f(A) corresponding to links between close nodes will be large. If we can determine which points have short distances between them in the graph, we can use this distance information to obtain an estimate for which are the large elements of f(A). This is the previously mentioned approach of [60], where they use successively higher powers of A to compute the distance between nodes. In lattices, because of the known geometry, we can cheaply compute the distances between nodes without computing the powers of A.

For more general matrices where geometric information is lacking, we would prefer to bypass polynomial approximations, which are expensive to compute, and work more directly with the structure of f(A). This approach allows us to deal with matrices that do not exhibit the decay in interaction between distant nodes seen in many PDEs. We term this family of algorithms inverse probing. Inverse probing works by computing a subset of the columns v of the matrix f(A). These columns can then be used by several algorithms to build an approximation to f(A). The two main approaches we take are to form an approximation vv^T ≈ f(A), and to examine the values of v at different lags to try to predict the location of major off-diagonals of f(A). This allows for the estimation of the magnitude of the connection between nodes directly.

A final method for obtaining information on the structure of f(A) is based on examining the spectrum of f(A). If probing were used to determine all the distance-k connections of the i-th node, the algorithm would take the matrix-vector product A^k e_i. However, this is similar to the process of obtaining a single iterate of the power iteration method, which would take the product A^k r / ‖A^k r‖, where r is an appropriately chosen random vector. The power method is known to converge to the eigenvector of largest modulus of A. This suggests there could be a connection between probing and the largest eigenvector of A. The eigenvector holds the distance information that would be obtained by probing if it were taken an infinite number of steps. To state this slightly differently, the largest eigenvector holds similar structural information about f(A) to that which probing could obtain. We present a heuristic that explores the connection between these two ideas and lays the basis for future research in this area.

The geometric, structural, and spectral information that our new algorithms make use of are all more computationally and storage efficient than the high powers of A required for probing. Additionally, in many cases we observe that they provide more accurate results than probing, because they obtain a more accurate representation of the structure of f(A).

The final contribution of our work is to combine the idea of exploiting known but deterministic information about a matrix with statistical methods in order to provide the user with an improved as well as unbiased estimator. We provide a framework to analyze when the information obtained by our methods will provide meaningful improvements in the error estimates of our methods. By merging the strengths of the statistical and deterministic approaches we provide algorithms that are more robust.

1.2 Overview

The rest of this dissertation is structured as follows.

Chapter 2 We discuss in more detail the applications where Diag(f(A)) is needed. We also examine the prior approaches, and determine the areas in which they are insufficient.

Chapter 3 We introduce a method for computing Diag(f(A)) when A is a matrix arising from a toroidal lattice, that is, a lattice where the boundary conditions are periodic. Our method works by exploiting the geometry of the lattice. We show how the problem can be solved in the special case when the lattice has dimensions that are powers of 2, and in the more general case where the dimensions are arbitrarily sized. Finally we show how these methods can be combined with statistical approaches to provide unbiased estimators.

Chapter 4 We discuss the more general class of matrices, where the matrix is not a lattice but still has some structure that can be exploited. We examine structural and spectral types of information, and provide a framework for analyzing when enough structure exists for our algorithms to outperform previous approaches.

Conclusion We summarize our discussion.

Chapter 2

Prior Work and Applications

In this chapter we discuss related applications and prior work. Most prior applications are related to either computing Tr(A^{-1}) or Diag(exp(A)). Given an N × N matrix A with an eigendecomposition A = VEV^{-1}, we have Tr(A^{-1}) = Σ_{i=1}^{N} Diag(A^{-1})_i = Σ_{i=1}^{N} 1/E_ii. If the matrix is small, one could solve the problem directly by performing an LU decomposition [19], and then solving N linear equations for the diagonals of A^{-1}. Alternatively, one could obtain the eigenvalues of A, for example by using the QR method [19], and sum their reciprocals. Unfortunately, these approaches require O(N^3) work, and so are impractical for the matrices that occur in the applications we are interested in.

The exponential of a matrix is given by exp(A) = V exp(E) V^{-1} = Σ_{k=0}^{∞} A^k / k!. If the matrix is small enough, the eigendecomposition of the matrix can be computed, but in most cases of interest this is not practical. There are many methods proposed for forming exp(A) explicitly [69], but all of them require raising A to a high power. This can be computationally difficult as well as requiring impractical amounts of storage, since in many cases A^k will become dense quickly.

2.1 Applications

2.1.1 Statistical Applications

We briefly consider applications of Diag(f(A)) and Tr(f(A)), to motivate our research.

Such applications appear in physics, social network analysis, and statistics among others.

Computing Diag(f(A)) shows up in several statistical problems. The simplest of these is computing the variance of a least squares solution [67]. In the case of a least squares problem, one would like to solve min_x ‖Bx − y‖, where B is some arbitrarily sized matrix representing the observations we are trying to fit, and is likely singular. A solution can be found by computing the normal equations x = A^{-1}z, where A = B^T B and z = B^T y. While x can be found using an iterative method, one would like to know what the variance of the solution is. It can be shown that given B, the covariance matrix of x is (B^T B)^{-1} σ², where σ² is the variance of the error of our fit. We do not have this variance available, but we can estimate it. Define X_{n,p} as our n observations of the p variables we are attempting to fit. Then as an estimator for σ² we have σ² = ‖y − Bx‖² / (n − p) = Σ_{i=1}^{n} e_i² / (n − p), where the e_i are the residuals e_i = y_i − X_{i,1}x_1 − ... − X_{i,p}x_p. The variance of the individual component x_i is then computed as Diag((B^T B)^{-1})_i σ². When B is large, representing many observations, inverting B^T B to obtain Diag((B^T B)^{-1}) is difficult.
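A minimal sketch of this computation (in Python/NumPy, on a small, hypothetical regression problem; the sizes and noise level are arbitrary) makes the role of Diag((B^T B)^{-1}) explicit:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 8                          # n observations of p variables
B = rng.standard_normal((n, p))
x_true = rng.standard_normal(p)
y = B @ x_true + 0.3 * rng.standard_normal(n)

x_hat, *_ = np.linalg.lstsq(B, y, rcond=None)
resid = y - B @ x_hat
sigma2_hat = resid @ resid / (n - p)   # estimator of the error variance

# Variance of each fitted coefficient: Diag((B^T B)^{-1}) * sigma^2.
# Forming (B^T B)^{-1} explicitly is exactly what becomes infeasible when p is large.
var_coeff = np.diag(np.linalg.inv(B.T @ B)) * sigma2_hat
print(var_coeff)
```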

Another area of statistics in which this problem arises, and which originally motivated Hutchinson [45] to develop his method, is fitting a spline to a set of multidimensional noisy data z at irregularly spaced points x. This is done by choosing the function f that minimizes Σ_{i=1}^{n} (z_i − f(x_i))² + ρJ(f), where n is the number of data points, and J(f) is a rotation invariant measure of the roughness of f. This roughness is defined in terms of the partial derivatives of f. The value ρ is a positive value controlling the degree of smoothing of the data, and is chosen to minimize the generalized cross validation function [71], which is defined as GCV = (1/n)‖(I − A)z‖² / ((1/n)Tr(I − A))², where A is the n × n symmetric influence matrix which takes the data values to their fitted values [72]. Forming this influence matrix requires inverting the spline matrix B. Because of this, the main expense of the validation is obtaining Tr(I − A).

7 2.1.2 Lattice Quantum Chromodynamics

Similar statistical issues arise in many areas of physics, and in particular in Lattice Quantum Chromodynamics (LQCD) [31]. QCD is the theory of the behavior of the fundamental force known as the strong interaction, which describes the interactions among quarks, the building blocks of Hadrons. LQCD is a method for simulating these interactions. Since it is a non-perturbative method, it can be used to compute physical properties such as the masses of the various quarks, as well as the observables governing the coupling of the particles [31].

Unfortunately QCD gives rise to path integrals that are difficult to compute directly.

If the system is discretized onto a 4D lattice, they can be approximated using Monte Carlo

Integration. Normally this is done in two stages: by generating gauge fields according to a particular probability distribution, then evaluating a correlation function that depends on these fields. The physical properties of interest are determined by a Monte Carlo average of the correlation functions generated by the ensemble of gauge fields [54].

This approach requires the computation of the trace of A^{-1}. Since the systems arising in this simulation are normally very large, most approaches in this area are based on iterative methods. Aside from the size of the matrices, they are also poorly conditioned, making their solution difficult. In particular, as the simulation parameters are tuned so that they more accurately represent the physical system of interest, the matrix starts to become singular. Therefore, it is important to minimize the number of systems of equations that have to be solved to obtain an estimate for the trace, since each solution may take many iterations to converge. Despite these difficulties, LQCD has been a very successful approach, and our methods have wide applicability to it.

2.1.3 Network Centrality

Finally, we consider an application where the required f(A) is not A^{-1}, but is instead exp(A), as in [68]. The authors are interested in computing the node centrality in a network,

8 a metric of how important a particular node is in a given network. This question arises in social network analysis, as well as network design. In [68], the authors begin by defining a path as a list of distinct vertices connecting two nodes, and define a path that starts and stops at the same node as a closed path. They assume that nodes that have more closed paths are more important. Further, they give closed paths of differing lengths different weights, assigning to shorter paths a higher weight. If we let k(i)j be the number of paths of distance i for node j, and weight the paths with the inverse of their factorial distance,

then we obtain the centrality metric Σ_{i=0}^{∞} k(i)_j / i!. If one recalls that (A^i)_{jj}, which is the j-th diagonal element of the i-th power of A, gives the number of round trip paths of length i for node j, then the desired quantity is Diag(Σ_{i=0}^{∞} A^i / i!) = Diag(exp(A)).
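A small, illustrative sketch of this centrality measure (not from [68]; the toy graph is an arbitrary choice) computes Diag(exp(A)) directly for a tiny adjacency matrix:

```python
import numpy as np
from scipy.linalg import expm

# Toy undirected graph: a ring of 10 nodes plus one chord between nodes 0 and 5.
n = 10
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
A[0, 5] = A[5, 0] = 1.0

centrality = np.diag(expm(A))      # Diag(exp(A)): factorially weighted closed paths
print(np.argsort(-centrality))     # nodes with more closed paths should rank highest
```

For graphs of this size the dense exp(A) is trivial to form; the methods in this work target the regime where it is not.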

2.2 Prior Work

2.2.1 Statistical Methods

Many of the applications shown in the prior section require extremely large matrices. Further, as computational resources expand, the applications will want to increase the size of the matrices in order to achieve more accurate results. This means that it is unlikely that attempts to solve this problem directly through matrix decomposition or eigensolvers will ever be the best choice for most applications. Instead, methods that attempt to statistically estimate it are needed.

The first attempt at such a statistical solution was made by Hutchinson [45], who was interested in calculating Tr(A^{-1}) in order to compute splines. He showed that for a set of random vectors z_i, where each vector element is drawn independently from a Rademacher distribution, in which each element has a 1/2 chance to be 1 or −1,

    Tr(f(A)) = E[z^T f(A) z] ≈ (1/n) Σ_{i=1}^{n} z_i^T f(A) z_i .    (2.1)

In his case, f(A) = A^{-1}. Taking advantage of the fact that A^{-1}z = y can be computed by rewriting it in the form Ay = z and using an iterative solver, it is then possible to estimate the trace of A^{-1} even for very large matrices.
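As an illustration (not code from [45]), a minimal Python/SciPy sketch of the estimator (2.1) for f(A) = A^{-1}, using one sparse solve per Rademacher vector on a small test matrix:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

# Sparse SPD test matrix (shifted 1-D Laplacian) small enough to check the exact trace.
N = 400
A = diags([-1.0, 2.5, -1.0], [-1, 0, 1], shape=(N, N), format='csc')
exact = np.trace(np.linalg.inv(A.toarray()))

rng = np.random.default_rng(1)
n_samples, acc = 200, 0.0
for _ in range(n_samples):
    z = rng.choice([-1.0, 1.0], size=N)   # Rademacher (Z_2 noise) vector
    acc += z @ spsolve(A, z)              # z^T A^{-1} z via one linear solve
print(acc / n_samples, exact)
```

The sample size and matrix here are arbitrary; in the applications above each solve is itself expensive, which is why the slow Monte Carlo convergence matters.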

Following Hutchinson’s work, the authors of [22] investigated several variations on Hutchinson’s method, and proved bounds on their statistical variance, as well as on the number of samples needed to achieve a given accuracy, which can be seen in Table 2.1. Instead of taking the z_i in (2.1) to be random vectors with elements from the Rademacher distribution, they examined the cases where the elements of z_i are Gaussian, where they are selected so that z_i^H z_i = N, which they term the Rayleigh-quotient estimator, and the case where the z_i are random unit vectors, e_i = [0 ... 0 1 0 ... 0], where the 1 is in the i-th location. Additionally, they consider a variation on the scheme of taking z_i as unit vectors. Using the unit vectors directly computes a particular set of diagonal elements, and then attempts to extrapolate the missing diagonal elements from them. However, in cases where the values of the diagonal elements vary widely, this will work poorly. To counteract this, they instead compute (1/n) Σ_{i=1}^{n} z_i^T D^T A D z_i, where D is either the Discrete Fourier Transform (DFT) matrix or the Hadamard matrix. The DFT matrix is generated by D = FFT(I), and the Hadamard matrix [73] is formed recursively, as

    H_1 = [1],   H_2 = [1  1; 1  −1],   H_{2^k} = [H_{2^{k−1}}  H_{2^{k−1}}; H_{2^{k−1}}  −H_{2^{k−1}}] = H_2 ⊗ H_{2^{k−1}} .    (2.2)

The Hadamard matrices have the disadvantage of only having sizes that are powers of two, but avoid the use of complex arithmetic, which is a requirement of the DFT matrix. Since normally we will not need all N columns, we instead generate the matrix column by column, a process which can be done efficiently [66].
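One way to generate a single Hadamard column without storing the matrix (a sketch only; the specific algorithm of [66] is not reproduced here) uses the bit-product formula for the entries, which reappears later as (3.2):

```python
import numpy as np

def hadamard_column(j, N):
    # j-th column (0-indexed) of the N x N Hadamard matrix, N a power of two.
    # Entry i is (-1)^{<binary digits of i, binary digits of j>}, so one column
    # costs O(N log N) bit operations and the full matrix is never formed.
    parities = np.array([bin(i & j).count("1") & 1 for i in range(N)])
    return 1 - 2 * parities            # parity 0 -> +1, parity 1 -> -1

N = 16
H = np.column_stack([hadamard_column(j, N) for j in range(N)])
assert np.array_equal(H @ H.T, N * np.eye(N, dtype=int))   # H H^T = N I
```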

Because these matrices are unitary, D^T D = I, so Tr(D^T A D) = Tr(D^T D A) = Tr(A), but the transformation has the effect of smoothing out the elements of the matrix A, thus making it less likely that an important diagonal element will be missed by the estimator.

While [22] derives upper bounds for the number of vectors needed to achieve the desired accuracy with a given probability, these bounds are not tight. In practice the authors observe that the various methods perform almost identically. These methods converge slowly and cannot be improved without additional information about the matrix. However, it is seldom the case that no useful knowledge of the matrix is available, or that none can be extracted by approximation techniques.

Table 2.1: Convergence rates of different methods.

Estimator | Variance of one sample | Bound on number of samples for an (ε,δ)-approx | Random bits per sample
Gaussian | 2‖A‖_F² | 20 ε⁻² ln(2/δ) | infinite; Θ(n) in floating point
Normalized Rayleigh-quotient | - | (1/2) ε⁻² n⁻² rank²(A) ln(2/δ) κ_f²(A) | -
Hutchinson's | 2(‖A‖_F² − Σ_{i=1}^{n} A_ii²) | 6 ε⁻² ln(2 rank(A)/δ) | Θ(n)
Unit Vector | n Σ_{i=1}^{n} A_ii² − Tr²(A) | (1/2) ε⁻² ln(2/δ) r_D(A), where r_D(A) = n max_i A_ii / Tr(A) | Θ(log n)
Mixed Unit Vector (DFT/Hadamard) | - | 8 ε⁻² ln(4n²/δ) ln(4/δ) | Θ(log n)

2.2.2 Non-Statistical Methods

Several deterministic methods have been proposed that solve this problem exactly, in the case where f(A) = A^{-1}, by performing some form of matrix factorization, and avoid the problem of slow convergence that the statistical methods have. These approaches have serious drawbacks, however. In [49], the authors introduce a method which works by finding a hierarchy of Schur complements of matrices arising from grids, but the run time of this method is of order O(N^{3/2}) and O(N^2) for the 2D and 3D cases respectively. Thus, for sufficiently large N, or for higher dimensional problems, this approach is not practical. Further, it does not address the case of matrices that do not arise from PDEs.

The method introduced in [70] performs an LU factorization and computes the last

diagonal entry of the inverse directly from this factorization. It then reorders the nodes so

that each diagonal is in turn the last element of the LU matrix. In order to avoid computing

a unique LU factorization for every reordering, they decompose each LU factorization into

partial LU factorizations. This method has similar drawbacks to those in [49], requiring

that the matrix arise from a PDE, and has a run time of O(N^2) for a 2D matrix.

Proposed in [67] is a method based on Takahashi’s equations, which allow a subset

of the elements of A−1 to be recursively computed, using only the elements of an LDU

decomposition of A, and the previously generated elements of A−1, with the first step of

the recursive process requiring only the elements of the LDU decomposition to compute.

11 The subset of elements which can be computed in this manner are those elements that are non-zero in the LDU decomposition. Given a sparse LDU decomposition, it follows that the number of elements needed to compute Diag(A−1) is small. However, while the

subset of elements of A^{-1} needed to compute Diag(A^{-1}) may be smaller than that needed

to compute the entire matrix A−1, it can still be quite large.

Alternative approaches have been developed that avoid computing the result directly,

which is infeasible for large problems, while still making use of any information that is

known about the problem. The main idea behind them is to create the vectors in (2.1) in

such a way that any available structure is exploited. This method was first introduced in

[28] . The main insight is that many matrices in practice have an inverse with a periodic

and decaying structure where the magnitude of the elements falls off away from the main

diagonal of the matrix. Therefore we can set the zis to be such that they zero out as many diagonals of the matrix as possible. If the contribution to the error from the diagonals

that have not been zeroed out is small, the results will be very accurate. As more zis are used, more diagonals are zeroed out, and the solution becomes more accurate. Further, we show later how, if this is paired with statistical methods, this will reduce the variance, because there will be fewer elements contributing to the sums in Table 2.1. To achieve this zeroing out effect, the vectors of the Hadamard matrix are used in their natural order, since they zero all contributions to the error except those elements from an increasingly small subset of diagonals, as can be seen in Figure 2.1.

Another useful method supplied in [28] is how to calculate the diagonal of A instead

of simply the trace. They show that the following estimator will converge to the diagonal

    Diag(A) ≈ (Σ_{i=1}^{n} z_i ⊙ A z_i) ⊘ (Σ_{i=1}^{n} z_i ⊙ z_i) ,    (2.3)

where ⊙ is componentwise multiplication and ⊘ is componentwise division, and the z_i are random vectors. This has the same drawback as (2.1), in that while in expectation this will yield the correct answer, in practice convergence can be very slow. Therefore the

12 0 0

20 20

40 40

60 60

80 80

100 100

120 120

0 20 40 60 80 100 120 0 20 40 60 80 100 120 nz = 1024 nz = 512

(a) 16 Hadamard vectors (b) 32 Hadamard vectors

Figure 2.1: The area zeroed out by using Hadamard vectors. As the number of vectors increases, the number of diagonals that contribute to the error decease, and become further from the main diagonal.

question of picking the zis to exploit the structure of A is the same as for estimating the trace.
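For reference, a minimal sketch of the diagonal estimator (2.3) with Rademacher vectors (illustrative only; the matrix and sample count are arbitrary), in which ⊙ and ⊘ become elementwise NumPy operations:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 300
A = rng.standard_normal((N, N))
A = A + A.T                               # any square matrix works; symmetric here

num = np.zeros(N)
den = np.zeros(N)
for _ in range(2000):
    z = rng.choice([-1.0, 1.0], size=N)
    num += z * (A @ z)                    # componentwise product z .* (A z)
    den += z * z                          # componentwise product z .* z
diag_est = num / den                      # componentwise division, as in (2.3)
print(np.max(np.abs(diag_est - np.diag(A))))   # slowly decreasing Monte Carlo error
```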

While the Hadamard based method of [28] works well for a specific class of matrices which are generally those arising from a PDE with a Green’s function describing a force that decays with distance, matrices without this diagonal structure do not benefit as much. An attempt to exploit less regularly ordered structure is behind the idea of probing.

Probing has been a useful technique with a long history in the context of approximating the Jacobian matrix [17, 38], or other matrices [18]. Its use for approximating the diagonal of A^{-1} was proposed in [60] because it finds the most important areas of A^{-1} rather than the fixed structure removed by the Hadamard approach. Probing recovers the diagonals of a sparse matrix by finding the coloring of its associated graph. Coloring a graph involves assigning a color to each vertex in such a way that no two connected vertices share a color.

Unfortunately, finding the optimal coloring in the sense of using the least number of colors is an NP-Complete problem. However, for many graphs a greedy algorithm performs well

[61], and is the approach used in probing. When the rows and columns of a matrix are arranged so that all nodes that share the same color are adjacent, a zero block diagonal

structure will result, as can be seen in Figure 2.2. This structure is due to the fact that since these nodes share a color, they must have no connection (otherwise this would be an invalid coloring).

This block-diagonal zero structure can be exploited to recover the diagonal of the matrix by creating probing vectors. Given a coloring C for A, with c total colors, we will need only c vectors to recover the diagonal. We generate a probing vector for each color m, and set the i-th element of that vector to be 1 if the i-th node of the graph of A was assigned the m-th color, and zero elsewhere, as seen in (2.4). These vectors can then be used with (2.1) or (2.3).

    p^m_i = 1 if i ∈ C_m, and 0 otherwise.    (2.4)

Figure 2.2: Probing a four-colorable graph. The diagonal elements can be recovered using the probing vectors shown.

Unfortunately, f(A) is normally dense. Because of this the associated graph of f(A) is fully connected, and every node will be assigned a unique color. To avoid this, a probing structure for f(A) must be induced by sparsification. If the smallest magnitude elements of f(A) are dropped, then structure in f(A) will appear, which can be exploited by probing. Of course, computing f(A) and then sparsifying it is not easier than the original problem of obtaining

Diag(f(A)). To counteract this issue, [62] introduced the idea of probing using a matrix

polynomial q_n(A). They find a polynomial of the matrix A such that q_n(A) ≈ A^{-1} as the order n of the polynomial increases. In their paper they use the Neumann approximation to A^{-1}, which is A^{-1} ≈ (Σ_{j=0}^{n} (M^{-1}Q)^j) M^{-1}, where A = M − Q and M = Diag(A). However, they are interested only in the structure, and not the values, of q_n(A). Since in the case of the Neumann approximation the nonzero structure of q_n(A) is a subset of the nonzero structure of q_{n+1}(A), they need only color A^n to obtain an approximation to the structure of A^{-1} after sparsification. This method has the drawback that q_n(A) quickly becomes denser and thus expensive to compute for even small n, when A is large.
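A minimal sketch of this classical probing scheme (not the implementation of [62], and not the hierarchical method introduced later): greedily color the boolean pattern of A^k, build the indicator probing vectors of (2.4), and verify that they reproduce Tr(A) exactly. The pattern of A^k is formed explicitly here, which is exactly the expense the remark above warns about.

```python
import numpy as np
from scipy.sparse import diags

def greedy_coloring(S):
    # Greedy distance-1 coloring of the graph whose (sparse) structure is S.
    n = S.shape[0]
    colors = -np.ones(n, dtype=int)
    for i in range(n):
        taken = {colors[j] for j in S[i].indices if colors[j] >= 0}
        c = 0
        while c in taken:
            c += 1
        colors[i] = c
    return colors

# 1-D periodic Laplacian.  The pattern of A^k connects all nodes within distance k,
# so a distance-1 coloring of that pattern is a distance-k coloring of A.
N, k = 32, 2
S = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(N, N), format='lil')
S[0, N - 1] = S[N - 1, 0] = -1.0            # periodic wrap-around links
A = S.tocsr()
pattern_k = (abs(A) ** k).tocsr()           # structure of A^k (values irrelevant)
colors = greedy_coloring(pattern_k)

# Probing vectors as in (2.4): one 0/1 indicator vector per color.
m = colors.max() + 1
Z = np.zeros((N, m))
Z[np.arange(N), colors] = 1.0
trace_est = sum(Z[:, c] @ (A @ Z[:, c]) for c in range(m))
print(m, "colors;", trace_est, "vs exact", A.diagonal().sum())
# Exact for Tr(A); for Tr(A^{-1}) the same vectors only remove error from near neighbors.
```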

An additional drawback of probing is that once the coloring for a particular q_n(A) is obtained, and probing vectors are created, it is possible that the estimate it produces will not be sufficiently accurate. However, it is unlikely that the probing vectors of q_m(A), m < n, will be a subset of those created for q_n(A). In this case, all the previous work computing f(A)z will have to be discarded. Since these results are produced by solving

large linear systems iteratively, this is a serious shortcoming. An example illustrating this

may be seen in Figure 2.3. In the first case, the associated graph of q_1(A) was colored with three colors, but four colors were required to color the graph of q_2(A). Unfortunately the resultant probing vectors do not span the original probing vectors. If

on the other hand the colors had split as seen in the second example, the initial probing

vectors would have been spanned by the new ones, meaning the work of computing f(A)z

need not have gone to waste.

Figure 2.3: An example of wasted vectors in probing, versus an example where the vectors can be reused.

Chapter 3

Estimation of Diag(f(A)) on toroidal lattices

In this chapter, we discuss how to compute Diag(f(A)) in the case where A represents a

toroidal lattice. In this case the geometric information can be leveraged to provide high

quality probing vectors very quickly. In addition, these vectors can be constructed in

such a way that if the accuracy of the result is insufficient, the process can continue by

reusing our previously computed results, a process we term Hierarchical Probing. While

our solution works for lattices of arbitrary dimensions, we also present an even faster

method for lattices which have dimensions of powers of two.

3.1 Lattices with dimensions consisting only of powers of 2

3.1.1 Introduction

Two methods that have previously attempted to address this problem by exploiting the

structure of the matrix A are the approach of using Hadamard vectors [28], and the method

of probing [62]. We combine ideas from both of these methods in order to overcome their

respective shortcomings.

The approach based on the use of Hadamard vectors discussed in Section 2.2.2 borrows ideas from coding theory and selects deterministic vectors for the Monte Carlo (MC) estimate given in (2.1) as columns of a Hadamard matrix [28]. These vectors are orthogonal and, although they produce the exact answer in N steps, their benefit stems from systematically capturing certain diagonals of the matrix, as in Figure 2.1. For example, if we use the first 2^m Hadamard vectors, the error in the trace approximation comes only from non-zero elements on the (k·2^m)-th matrix diagonals, k = 1, ..., N/2^m. Thus, the MC iteration continues annihilating more diagonals with more Hadamard vectors, until it achieves the required accuracy. However, in most practical problems the matrix bandwidth is too large, the non-zero diagonals do not fall on the required positions, or the matrix is not even sparse (which is typically the case for A^{-1}).

Probing [62] attempts to select vectors that annihilate the error contribution from the heaviest elements of A^{-1}. For a large class of sparse matrices, elements of A^{-1} decay exponentially away from the non-zero structure of A. By this we mean that the magnitude of the element A^{-1}_{i,j} relates to the distance of the shortest path between nodes i and j in the graph of A. Assume that the graph of A has a distance-k coloring (or distance-1 coloring of the graph of A^k) with m colors. Then, if we define the vectors z_j, j = 1, ..., m, with z_j(i) = 1 if color(i) = j, and z_j(i) = 0 otherwise, we obtain Tr(A) = Σ_{j=1}^{m} z_j^T A z_j. For Tr(A^{-1}) the equation is not exact, but it does not include errors from any elements of A^{-1} that correspond to paths between vertices that are within distance k in A. The probing technique has been used for decades in the context of approximating the Jacobian matrix [35, 38] or other matrices [57]. Its use for approximating the diagonal of A^{-1} in

[62] (see also [28]) is promising as it selects the important areas of A−1 rather than the predetermined structure dictated by Hadamard vectors. However, the accuracy of the trace estimate obtained through a specific distance-k probing can only be improved by applying Monte Carlo, using random vectors that follow the structure of each probing vector. To take advantage of a higher distance probing, all previous work has to be discarded, and the method rerun for a larger k. We discuss this in Sections 3.1.5 and

3.1.7.

We introduce hierarchical probing, which avoids the problems of the previous two methods. It annihilates error stemming from the heaviest parts of A^{-1}, and it does so incrementally, until the required accuracy is met. To achieve this, we relax the requirement of a distance-k coloring of the entire graph. The idea is to obtain recursively a (suboptimal) distance-2^{i+1} coloring by independently computing distance-1 colorings of the subgraphs corresponding to each color from the distance-2^i coloring. The recursion stops when all the color-subgraphs are dense, i.e., we have covered all distances up to the diameter of the graph. We call this method “hierarchical coloring”. For regular, toroidal lattices each subgroup has the same number of colors, which enables an elegant, hierarchical basis for probing based on an appropriate ordering of the Hadamard and/or Fourier vectors: the first m such vectors constitute a basis for the corresponding m probing vectors. We call this method “hierarchical probing”. It can be implemented using only bit arithmetic, independently on each lattice site. We also address the issue of statistical bias by viewing hierarchical probing as a method to create a hierarchical basis starting from any vector, including random.

Subsection 3.2.2 presents some background for these methods and describes the limitations of classical probing. In Subsection 3.2.3, we introduce the idea of hierarchical coloring and, for the case of uniform grids and tori, we develop a hierarchical coloring method that uses only local coordinate information and bit operations. In Subsection 3.2.4, we use this coloring to produce a sequence of hierarchical probing vectors. In Subsection 3.2.5, we provide several experiments for typical lattices and problems from LQCD that show that MC with hierarchical probing has much smaller variance than random vectors and performs equally well or better than the expensive, classical probing method.

3.1.2 Preliminaries

We use vector subscripts to denote the order of a sequence of vectors, and parentheses to

denote the index of the entries of a vector. We use MATLAB notation to refer to row or

column numbers and ranges. The matrix A, of size N ×N, is assumed to have a symmetric

18 structure (undirected graph).

3.1.3 Lattice QCD problems

Lattice Quantum ChromoDynamics (LQCD) is a formulation of Quantum Chromo-Dynamics

(QCD) that allows for numerical calculations of properties of strongly interacting matter

(Hadron Physics) [64]. These calculations are performed through Monte Carlo computations of the discretized theory on a finite 4-dimensional Euclidean lattice. Physical results are obtained after extrapolation of the lattice spacing to zero. Hence calculations on multiple lattice sizes are required for taking the continuum and infinite volume limits. In this formulation, a large sparse matrix D called the Dirac matrix plays a central role. This matrix depends explicitly on the gauge fields U. The physical observables in a LQCD calculation are computed as averages over the ensemble of gauge field configurations. In various stages of the computation one needs, among other things, to estimate the determinant as well as the trace of the inverse of this matrix. The dimensionality of the matrix is 3 × 4 × L_s³ × L_t, where L_s and L_t are the dimensions of the spatial and temporal directions of the space-time lattice, 3 is the dimension of an internal space named “color”, and 4 is the dimension of the space associated with the spin and particle/antiparticle degrees of freedom. Typical lattice sizes in today's calculations have L_s = 32 and L_t = 64 (already a matrix dimension of 3 · 4 · 32³ · 64 ≈ 2.5 × 10⁷), and the largest calculations performed on leadership class machines at DOE or NSF supercomputing centers have L_s = 64 and L_t = 128. As computational resources increase and precision requirements grow, lattices will become even bigger.

3.1.4 The Monte Carlo method for Tr(A−1)

Hutchinson introduced the standard MC method for estimating the trace of A and proved the following [45].

Lemma 3.1 Let A be a matrix of size N × N and denote by Ã = A − Diag(A). Let z be a Z2 random vector (i.e., one whose entries are i.i.d. Rademacher random variables, Pr(z(i) = ±1) = 1/2). Then z^T A z is an unbiased estimator of Tr(A), i.e.,

    E(z^T A z) = Tr(A),

and

    var(z^T A z) = 2 ‖Ã‖_F² = 2 (‖A‖_F² − Σ_{i=1}^{N} A(i, i)²) .

The MC method converges with rate √(var(z^T A z)/s), where s is the sample size of the estimator (number of random vectors). Thus, the MC converges in one step for diagonal matrices, and very fast for strongly diagonally dominant matrices. More relevant to our Tr(A^{-1}) problem is that large off-diagonal elements of A^{-1} contribute more to the variance 2‖A^{-1} − Diag(A^{-1})‖_F², and thus to slower convergence.

Computationally, z^T A^{-1} z (often regarded as a Gaussian quadrature) can be computed using the Lanczos method [24, 39, 58]. This method also produces upper and lower bounds on the quadrature, which are useful for terminating the process. The alternative of solving the linear system A^{-1}z is not recommended for non-Hermitian systems because of worse floating point behavior [59], but for Hermitian systems it can be as effective if we stop the solver earlier. Specifically, the quadrature error in Lanczos converges as the square of the system residual norm [39], and therefore we need only let the residual converge to the square root of the required tolerance. A potential advantage of solving A^{-1}z is that the result can be reused when computing multiple correlation functions involving bilinear forms y^T A^{-1} z (e.g., in LQCD).
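For concreteness, a minimal sketch of one quadrature z^T A^{-1} z computed with a plain CG solve (illustrative only; it is not the Lanczos quadrature described above, and the test matrix is an arbitrary SPD example):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# One quadrature z^T A^{-1} z for a Hermitian positive definite A, via the solve A y = z.
N = 1000
A = diags([-1.0, 2.05, -1.0], [-1, 0, 1], shape=(N, N), format='csr')
rng = np.random.default_rng(3)
z = rng.choice([-1.0, 1.0], size=N)

# Per the discussion above, the solver tolerance only needs to reach roughly the
# square root of the accuracy wanted in the quadrature (keyword names vary by SciPy version).
y, info = cg(A, z)
assert info == 0
print(z @ y)          # estimate of z^T A^{-1} z, one sample of the MC estimator
```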

3.1.5 Probing

Probing has been used extensively for the estimation of sparse Jacobians [35, 38], for preconditioning [57], and in Density Functional Theory for approximating the diagonal of a dense projector whose elements decay away from the main diagonal [28, 62]. The idea is to expose the structure and recover the non-zero entries of a matrix by multiplying it with a small, specially chosen set of vectors. For example, we can recover the elements

of a diagonal matrix through a matrix-vector multiplication with the vector of N ones, 1_N = [1, ..., 1]^T. Similarly, a banded matrix of bandwidth b can be found by matrix-vector multiplications with the vectors z_k, k = 1, ..., b, where

    z_k(i) = 1, for i = k : b : N, and z_k(i) = 0 otherwise.

To find the trace (or more generally the main diagonal) of a matrix, the methods are based on the following proposition [28].

Proposition 3.1 Let Z ∈ R^{N×m} with rows Z(i, :) and columns z_k. If Z(i, :) Z(j, :)^T = δ_{ij} for all i, j such that A(i, j) ≠ 0, then Tr(A) = Tr(Z^T A Z) = Σ_{k=1}^{m} z_k^T A z_k.

In the above example of a banded matrix, we choose the vectors zk such that their rows only overlap for structurally orthogonal rows of A (i.e., for rows farther than b apart).
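A minimal sketch of this banded example (illustrative only; here the convention is that A(i, j) = 0 whenever |i − j| ≥ b) shows that b such vectors recover the main diagonal exactly:

```python
import numpy as np

# Recover the main diagonal of a banded matrix with b probing vectors
# z_k(i) = 1 for i = k, k+b, k+2b, ...
N, b = 30, 3
rng = np.random.default_rng(4)
A = np.triu(np.tril(rng.standard_normal((N, N)), b - 1), -(b - 1))   # A(i,j)=0 for |i-j|>=b

diag_rec = np.zeros(N)
for k in range(b):
    z = np.zeros(N)
    z[k::b] = 1.0
    w = A @ z
    diag_rec[k::b] = w[k::b]   # rows sharing one vector are structurally orthogonal
print(np.allclose(diag_rec, np.diag(A)))   # True: the diagonal is recovered exactly
```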

Thus, by proposition 3.1, the trace computed with these zk is exact. If A is not banded but its sparsity pattern is known, graph coloring can be used to

identify the structurally orthogonal segments of rows, and derive the appropriate probing

vectors [62]. Assume the graph of A is colorable with m colors, each color having n(k) number of vertices, k = 1, . . . , m. The coloring is best visualized by letting q be the permutation vector which reorders identically colored vertices together. Then A(q, q) has m blocks along the diagonal, the k-th block is of dimension n(k), and each block is a diagonal matrix. Figure 3.1 shows an example of the sparsity structure of a permuted

4-colorable matrix. Computationally, permuting A is not needed. If we define the vectors:

    z_k(i) = 1 if color(i) = k, and 0 otherwise, for k = 1, ..., m,    (3.1)

we see that Proposition 3.1 applies, and therefore Tr(A) = Σ_{k=1}^{m} z_k^T A z_k.

Figure 3.1: Visualizing a 4-colorable matrix permuted such that all rows corresponding to color 1 appear first, rows for color 2 appear second, and so on. Each diagonal block is a diagonal matrix. The four probing vectors with 1s in the corresponding blocks are shown on the right.

When the matrix is dense and all its elements are of similar magnitude, there is no structure to be exploited by probing. The inverse of a sparse matrix is typically dense,

21 1 0 0 0 0 . . 0 . . . 1 0 . 0 0 1 0 1 0 . 0 1 . 0 . . . 0 . . . 1 0 0 0 1 0 0 0 0 1

Figure 3.1: Visualizing a 4-colorable matrix permuted such that all rows corresponding to color 1 appear first, for color 2 appear second, and so on. Each diagonal block is a diagonal matrix. The four probing vectors with 1s in the corresponding blocks are shown on the right. but, for many applications, its elements decay on locations that are farther from the locations of the non-zero elements of A. Such small elements of A−1 can be dropped, and

the remaining A−1 is sparse and thus colorable. Diagonal dominance of the matrix is a

sufficient (but not necessary) condition for the decay to occur [35, 62]. This property is

exploited by approximate inverse preconditioners and can be explained from various points

of view, including Green’s function for differential operators, the power series expansion

of A−1, or a purely graph theoretical view [29, 30, 34, 44]. In the context of probing, we

drop elements A−1(i, j) where the vertices i and j are farther than k links apart in the

graph of A. Because this graph corresponds to the matrix Ak, our required distance-k

coloring is simply the distance-1 coloring of the matrix Ak [38, 62]. Computing Ak for

large k, however, is time and/or memory intensive.

The effectiveness of probing depends on the decay properties of the elements of A−1,

and the choice of k in the distance-k coloring. The problem is that k depends both on

the structure and the numerical properties of the matrix. If elements of A−1 exhibit slow

decay, choosing k too small does not produce sufficiently accurate estimates because large

elements of A−1 (linking vertices that are farther than k apart) contribute to the variance

in Lemma 3.1. Choosing k too large increases the number of quadratures unnecessarily,

and more importantly, makes the coloring of Ak prohibitive. This problem has also been

identified in [62] but no solution proposed.

22 A conservative approach is to use probing for a small distance (typically 1 or 2) to remove the variance associated only with the largest, off-diagonal parts of the matrix.

Then, for each of the resulting m probing vectors, we generate s random vectors that

follow the non-zero structure of the corresponding probing vector, and perform s MC

steps (requiring ms quadratures). In LQCD this method is called dilution. In its most

common form it performs a 2-color (red-black) ordering on the uniform lattice and uses

the MC estimator to compute two partial traces: one restricted on the red sites, the

other on the black sites of the lattice [25, 37, 51]. Therefore, all variance caused by the

direct red-black connections of A−1 is removed. The improvement is modest, however, so

additional “dilution” is required [23, 51].

3.1.6 Hadamard vectors

An N × N matrix H is a Hadamard matrix of order N if it has entries H(i, j) = ±1 and

HH^T = NI, where I is the identity matrix of order N [28, 43]. N must be 1, 2, or a multiple of 4. We restrict our attention to Hadamard matrices whose order is a power of

2, and can be recursively obtained as:

H_2 = [1 1; 1 −1],   H_{2n} = [H_n H_n; H_n −H_n] = H_2 ⊗ H_n.

For powers of two, Hn is also symmetric, and its elements can be obtained directly as

H_n(i, j) = (−1)^{Σ_{k=1}^{log N} i_k j_k},   (3.2)

where (i_{log N}, . . . , i_1)_2 and (j_{log N}, . . . , j_1)_2 are the binary representations of i−1 and j−1, respectively. We also use the following notation to denote Hadamard columns (vectors):

hj = Hn(:, j + 1), j = 0, . . . , n − 1. Hadamard matrices are often called the integer version of the discrete Fourier matrices,

F_n(j, k) = e^{2π(j−1)(k−1)√−1/n}.   (3.3)

For n = 2, H2 = F2, but for n > 2, Fn are complex. These matrices have been studied extensively in coding theory where the problem is to design a code (a set of s < N vectors

Z) for which ZZ^T is as close to identity as possible [28]. H_n and F_n vectors satisfy the well-known Welch bounds but H_n vectors do not achieve equality [63]. Moreover, F_n are not restricted to powers of two. Still, Hadamard matrices involve only real arithmetic,

which is important for efficiency and interoperability with real codes, and it is easy to

identify the non-zero pattern they generate. Later, we will view the Hadamard matrix as

a methodical way to build an orthogonal basis.
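As a quick illustration of (3.2) (a sketch using 0-based indices i, j, i.e., i−1 and j−1 in the 1-based notation above), a single Hadamard entry can be computed from the parity of the bitwise AND of its indices:

    import numpy as np

    def hadamard_entry(i, j):
        """Entry of the natural-order Hadamard matrix via the parity of i AND j."""
        return -1.0 if bin(i & j).count("1") % 2 else 1.0

    n = 8
    H = np.array([[hadamard_entry(i, j) for j in range(n)] for i in range(n)])
    assert np.allclose(H @ H.T, n * np.eye(n))      # H H^T = N I
    assert np.allclose(H, H.T)                      # symmetric for powers of two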

Consider the first 2^k columns of a Hadamard matrix Z = H(:, 1 : 2^k). The non-zero pattern of the matrix ZZ^T consists of the i·2^k upper and lower diagonals, i = 0, 1, . . . [28].

Because Tr(Z^T A^{-1} Z) = Tr(A^{-1} ZZ^T) and because of Lemma 3.1 and Proposition 3.1, the error in the MC estimation of the trace is induced only by the off-diagonal elements of A^{-1} that appear on the same locations as the non-zero diagonals of ZZ^T. If the matrix is banded or its diagonals do not coincide with the ones of ZZ^T, the trace estimation is exact. When the off-diagonal elements of A^{-1} decay exponentially away from the main diagonal, increasing the number of Hadamard vectors achieves a consistent (if not monotonic) reduction of the error. We note that this special structure of ZZ^T is achieved only when the number of vectors, s, is a power of two. For 2^k < s < 2^{k+1}, the structure of

ZZ^T is dense in general, but the weight of ZZ^T elements is largest on the main diagonal

(equal to s) and decreases between diagonals i·2^k and (i + 1)·2^k. Thus, estimation accuracy improves with s, even for dense matrices. However, to annihilate a certain sparsity structure of a matrix, the estimates at only s = 2^k should be considered. Similar properties apply for the F_n matrices.

3.1.7 Overcoming probing limitations

We seek to construct special vectors for the MC estimator that perform at least as well as Z2 noise vectors, but can also exploit the structure of the matrix, when such structure exists. Although Hadamard vectors seem natural for banded matrices, they cannot handle


Figure 3.2: Crossed out nodes have their contribution to the error canceled by the Hadamard vectors used. Left: the first two, natural order Hadamard vectors do not cancel errors in some distance-1 neighbors in the lexicographic ordering of a 2-D uniform lattice. Right: if the grid is permuted with the red nodes first, the first and the middle Hadamard vectors completely cancel variance from nearest neighbors and correspond to the distance-1 probing vectors.

deviations from this structure. For example, the first two Hadamard vectors compute the exact trace of a tridiagonal matrix. For the matrix that corresponds to a 2-D uniform lattice of size 2^n × 2^n with periodic boundary conditions and lexicographic ordering, producing the exact trace requires the first 2^{n+1} Hadamard vectors. However, if we consider the red-black ordering of the same matrix, only two Hadamard vectors, h_0 and h_{2^{2n-1}} (the middle column), are sufficient, as shown in Figure 3.2.

The previous example shows that although Hadamard vectors are a useful tool, probing

is the method that discovers matrix structure. Therefore, we turn to the problem of how

to perform probing efficiently on Ak and for large k. Ideally, a method should start with

a small k and increase it until it achieves sufficient accuracy. However, the colorings,

and thus the probing vectors, for two different k’s are not related in general. Thus, in

addition to the expense of the new coloring, all previously performed quadratures must

be discarded.

First let us persuade the reader that work from a previous distance-k probing cannot

be reused in general. Assume the distance-1 coloring of a matrix of size 6 produced three

colors: color 1 has rows 1 and 2, color 2 has rows 3 and 4, color 3 has rows 5 and 6. Next

we perform a distance-2 coloring of A, and assume there are four colors: color 1 has row 1,

color 2 has rows 2 and 3, color 3 has rows 4 and 5, color 4 has row 6. As in Figure 3.1, the

distance-1 and distance-2 probing vectors, Z^{(1)} and Z^{(2)} respectively, are the following:

 1 0 0   1 0 0 0   1 0 0   0 1 0 0  (1)  0 1 0  (2)  0 1 0 0  Z =   ,Z =   .  0 1 0   0 0 1 0   0 0 1   0 0 1 0  0 0 1 0 0 0 1

Unfortunately, the three computed quadratures Z^{(1)T} A^{-1} Z^{(1)} (or solutions to A^{-1} Z^{(1)}) cannot be used to avoid recomputation of the four quadratures Z^{(2)T} A^{-1} Z^{(2)}.

Consider now a matrix of size 8 with two colors in its distance-1 coloring. Assume that its distance-2 coloring produces four colors, and that all rows with the same color belong also to the same color group for distance-1. Then the subspace of the corresponding probing vectors is spanned by certain Hadamard vectors:

Z^{(1)} = [1 0; 1 0; 1 0; 1 0; 0 1; 0 1; 0 1; 0 1] ∈ span([1 1; 1 1; 1 1; 1 1; 1 −1; 1 −1; 1 −1; 1 −1]),

Z^{(2)} = [1 0 0 0; 1 0 0 0; 0 1 0 0; 0 1 0 0; 0 0 1 0; 0 0 1 0; 0 0 0 1; 0 0 0 1] ∈ span([1 1 1 1; 1 1 1 1; 1 1 −1 −1; 1 1 −1 −1; 1 −1 1 −1; 1 −1 1 −1; 1 −1 −1 1; 1 −1 −1 1]).

The four Hadamard vectors are h0, h4, h2, h6. More interesting than the equality of the spans is that the two bases are an orthogonal transformation of each other. Specifically,

Z^{(1)} = (1/2)[h_0, h_4]H_2 and Z^{(2)} = (1/4)[h_0, h_4, h_2, h_6]H_4. Because the trace is invariant under orthogonal transformations, we can use the Hadamard vectors instead (as we implicitly

did in Figure 3.2 for the lattice). Clearly, for this case, the quadratures of the first two

vectors can be reused so that the distance-2 probing will need computations for only two

additional vectors.
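The 8×8 example above can be checked numerically. The short sketch below (assuming SciPy's natural-order Hadamard matrix) verifies that both sets of probing vectors lie in the span of the stated Hadamard columns:

    import numpy as np
    from scipy.linalg import hadamard

    H = hadamard(8)                                      # natural (Sylvester) order
    Z1 = np.kron(np.eye(2), np.ones((4, 1)))             # distance-1 probing vectors
    Z2 = np.kron(np.eye(4), np.ones((2, 1)))             # distance-2 probing vectors
    for Z, cols in [(Z1, [0, 4]), (Z2, [0, 4, 2, 6])]:
        B = H[:, cols].astype(float)
        P = B @ np.linalg.inv(B.T @ B) @ B.T             # projector onto span(B)
        assert np.allclose(P @ Z, Z)                     # Z is contained in span(B)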

A key difference between the two examples is the nesting of colors between successive

colorings. In general, such nesting cannot be expected and thus an incremental approach

to probing will necessarily discard prior work. A second difference is that all color groups

are split into the same number of colors in the successive coloring. To achieve these desired

characteristics, we develop first a hierarchical, all-distance coloring, and then represent its probing basis through a convenient set of vectors. We explain the general hierarchical coloring idea next.

3.1.8 Hierarchical coloring

The key idea is to enforce nesting of colors in successive distance colorings by looking at each color group independently and coloring its nodes for the next higher distance using only local information from that color group. We apply this recursively until every node of the graph is colored uniquely.

We begin this recursive approach by finding a distance-1 coloring of the graph, thus partitioning its nodes into separate color groups. In subsequent levels of the hierarchy, the nodes in each group will never share a color with the nodes in another group. To move to the next level, we use the graph at the current level to produce a distance-2 connectivity among the nodes within each color group. Then we apply the algorithm recursively to the induced subgraphs for each color group independently. It is a straightforward inductive observation that at every level we generate a distance-2^i coloring, for all i = 0, 1, . . ., until the distance covers the diameter of the graph.
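The recursion can be sketched for a general sparse matrix as follows (an illustrative Python sketch only; the function names are ours, and a simple greedy coloring is used within each group):

    import numpy as np
    import scipy.sparse as sp

    def _greedy_color(G):
        """Greedy distance-1 coloring of the graph of the sparse matrix G."""
        G = G.tocsr()
        out = np.zeros(G.shape[0], dtype=int)
        for v in range(G.shape[0]):
            taken = {out[u] for u in G.indices[G.indptr[v]:G.indptr[v + 1]] if u < v}
            c = 0
            while c in taken:
                c += 1
            out[v] = c
        return out

    def hierarchical_coloring(A, max_levels=8):
        n = A.shape[0]
        labels = [() for _ in range(n)]                  # nested color of each node
        groups = [np.arange(n)]
        G = (sp.csr_matrix(A) != 0).astype(float)
        for _ in range(max_levels):
            if all(len(g) == 1 for g in groups):
                break
            new_groups = []
            for g in groups:
                cols = _greedy_color(G[g][:, g])          # recolor this group only
                for v, c in zip(g, cols):
                    labels[v] = labels[v] + (c,)          # nested (hierarchical) colors
                new_groups += [g[cols == k] for k in range(cols.max() + 1)]
            groups = new_groups
            G = G @ G                                      # double the covered distance
            G.data[:] = 1.0
        return labels

Because colors are only ever refined within a group, the tuple assigned to a node at level i is a prefix of its tuple at level i + 1, which is exactly the nesting property needed to reuse previous quadratures.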

Hierarchical coloring produces more colors at distance-2^i than classical coloring of the graph of A^{2^i}. If the task were to approximate the trace of the matrix A^{2^i}, the extra colors would be redundant and the additional probing vectors would represent unnecessary work. However, we approximate the trace of A^{-1}, which is dense. Thus, the larger number of hierarchical probing vectors at distance-2^i will also approximate some elements that represent node distances larger than 2^i, yielding a larger variance reduction than the corresponding A^{2^i} classical probing method.

3.1.9 Hierarchical coloring on lattices

Uniform d-D lattices allow for an extremely efficient implementation of the hierarchical

coloring approach, based entirely on bit-arithmetic.

Consider first the 1-D case, where the lattice has N = 2^k points, with k = log_2 N, which guarantees the 2-colorability of the 1-D torus. Any point has a coordinate 0 ≤ x ≤

N − 1 with a binary representation: [bk, bk−1, . . . , b1] = dec2bin(x). At the first level, the distance-1 coloring is simply red-black (we associate red with 0 and black with 1), and x gets the color of its least significant bit (LSB), b1. In the coloring permutation, we order first the N/2 red nodes. At the second level, we consider red and black points separately and split each color again, but now based on the second bit b2. Thus, points [∗ ∗ ... ∗ ∗00] and [∗∗...∗∗10] take different colors, and by construction all colors are given hierarchically.

The second level permutation will not mix nodes between the first two halves of the first level, but will permute nodes within the respective halves, i.e., points with 0 in the LSB always appear in the first half of the permutation. The process is repeated recursively for each color, until all points have a different color.

The binary tree built by the recursive algorithm splits the points of a subtree in half at the i-th level based on bi. Thus, to find the final permutation we trace the path from the root to a leaf, producing the binary string: [b1b2 . . . bk], which is the bit reversed string for x. Denote by P the final permutation array such that node x = 0,...,N − 1 in the original ordering is found in location P (x) of the final permutation. Then,

P (x) = bin2dec(bitreverse(dec2bin(x))) (3.4) and the computation is completely independent for any coordinate.
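Equation (3.4) translates directly into code; for example (a sketch, with 0-based coordinates):

    # A direct rendering of (3.4), assuming N = 2^k lattice points.
    def hier_perm_1d(x, k):
        """Location of point x in the 1-D hierarchical (bit-reversed) permutation."""
        bits = format(x, "0{}b".format(k))     # dec2bin with k digits
        return int(bits[::-1], 2)              # bitreverse, then bin2dec

    # Example for N = 8: the red points (LSB 0) occupy the first half.
    print([hier_perm_1d(x, 3) for x in range(8)])   # [0, 4, 2, 6, 1, 5, 3, 7]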

Extending to torus lattices of d dimensions, where N = ∏_{j=1}^d 2^{k_j}, has three complications: First, the subgraph of the same color nodes is not a conformal uniform lattice.

Second, the geometry does not allow a simple bit reversal algorithm. Third, not all dimensions have the same size (k_j ≠ k_i). The following sections address these.

3.1.10 Splitting color blocks into conformal d-D lattices

Consider a point with d coordinates (x_1, x_2, . . . , x_d). Let [b^j_{k_j}, . . . , b^j_2, b^j_1] be the binary representation of coordinate x_j with 0 ≤ x_j < 2^{k_j}. We know that uniform lattices are 2-colorable, so at the first level, red-black ordering involves the least significant bit of all coordinates. The color assigned to the point is mod(Σ_{j=1}^d b^j_1, 2). However, the red partition, which is half of the lattice points, is not a regular d-dimensional torus. Every

red point is distance-2 away from any red neighbor, and therefore it has more neighbors

(e.g., in case of 2-D it is connected with 8 neighbors, in 3-D with 18, and so on). To facilitate a recursive approach, we observe that the reds can be split into 2^{d-1} d-dimensional sublattices, if we consider them in groups of every other row in each dimension. Similarly for the blacks. For the 2-D case this is shown in Figure 3.3.


Figure 3.3: When doubling the probing distance (here from 1 to 2) we first split the 2-D grid to four conformal 2-D subgrids. Red nodes split to two 2 × 2 grids (red and green), and similarly black nodes split to blue and black. Smaller 2-D grids can then be red-black ordered.

This partitioning is obtained based on the value of the binary string: [b^1_1, b^2_1, . . . , b^d_1]. For each value, the resulting sublattice contains all points with the given least significant bits in its d coordinates. Because each coordinate loses one bit, the size of each sublattice is

∏_{j=1}^d 2^{k_j - 1}. At this second level, each of the 2^d sublattices can be 2-colored independently. Each sublattice will receive a distinct coloring, which will be hierarchical, as long as we track the original color of the nodes.
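For illustration, the first-level split can be computed directly from the least significant bits of the coordinates (a sketch; the bit order within the index is a convention we choose here):

    def first_level_sublattice(x):
        """Index of the 2^d conformal sublattice containing point x (d coordinates)."""
        return sum((xj & 1) << j for j, xj in enumerate(x))

    print(first_level_sublattice((3, 2)))   # LSBs (1, 0) -> sublattice 0b01 = 1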

3.1.11 Facilitating bit reversal in higher dimensions

The above splitting based on the LSBs from the d coordinates does not order the adjacent colors together. For example, the partitioning at the first level of d = 2 gives four sub-

lattices (00,01,10,11) of which the 00 and 11 are reds while 01 and 10 are blacks. We can

recursively continue partitioning and coloring the sublattices. However, if we concatenate

at every level the new 2 bits from the 2 coordinates, as in the bit reversed pattern in the

1-D case, the resulting ordering is not hierarchical. In our example, all red points in the

first level are ordered in the first half, but at the second level, the colors associated with

the 00 reds will be in the first quarter of the ordering, while the colors associated with

the 11 reds will be in the fourth quarter of the ordering. Since the hierarchical ordering

is critical for reusing previous work, we order the four sublattices not in the natural order

(00,01,10,11) but in a red black order: (00 11 01 10). Algorithm 1 produces this Red-Black

reordering in d dimensions.

A more computationally convenient way to obtain the RB permutation is based on

the fact that every point on the stencil has neighbors of opposite color. In other words,

color([x1, . . . , xd]) = ¬color([x1, . . . , xd] ± ej), where ej is the unit row-vector in the j dimension, j = 1, . . . , d, and ¬ is the logical not. With two points per dimension, in

one dimension the colors are c1 = [0, 1]. Inductively, if the colors in dimension d − 1 are cd−1, the second d − 1 plane in dimension d will have the opposite colors, and thus:

cd = [cd−1, ¬cd−1]. Therefore, we can create the RB with only a check per point instead of counting coordinate bits. This is shown in Algorithm 2.

Algorithm 1 Red-Black order of the 2^d torus (slow)
    RB = bitarray(2^d, d)
    reds = 0, blacks = 2^{d-1}
    for i = 0 : 2^d - 1
        if dec2bin(i) has an even number of 1 bits
            newbits = dec2bin(reds, d)
            reds = reds + 1
        else
            newbits = dec2bin(blacks, d)
            blacks = blacks + 1
        RB(i, :) = newbits

Algorithm 2 Red-Black order of the 2^d torus (fast)
    c_0 = 0
    for j = 1 : d
        c_j = [c_{j-1}, ¬c_{j-1}]
    RB = bitarray(2^d, d)
    reds = 0, blacks = 2^{d-1}
    for i = 0 : 2^d - 1
        if c_d(i) == 0
            newbits = dec2bin(reds, d)
            reds = reds + 1
        else
            newbits = dec2bin(blacks, d)
            blacks = blacks + 1
        RB(i, :) = newbits
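For reference, a Python transcription of Algorithm 2 follows (a sketch; variable names follow the pseudocode, and the returned RB stores the new index of each corner as an integer rather than as a bit string):

    import numpy as np

    def red_black_order(d):
        """RB array: RB[i] is the new d-bit index of corner i of the 2^d torus."""
        c = np.zeros(1, dtype=int)                  # c_0 = [0]
        for _ in range(d):
            c = np.concatenate([c, 1 - c])          # c_j = [c_{j-1}, not c_{j-1}]
        RB = np.zeros(2 ** d, dtype=int)
        reds, blacks = 0, 2 ** (d - 1)
        for i in range(2 ** d):
            if c[i] == 0:
                RB[i], reds = reds, reds + 1
            else:
                RB[i], blacks = blacks, blacks + 1
        return RB

    print(red_black_order(2))    # [0 2 3 1]: reds (00, 11) ordered first, blacks (01, 10) second

For d = 2 this reproduces the ordering (00 11 01 10) discussed above, and it agrees with Lemma 3.4.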

We are now ready to combine the Red-Black reordering with the bit-reversing scheme to address the d-dimensional case. First, assume that the lattice has the same size in each dimension, i.e., kj = k, ∀j = 1, . . . , d. Then the needed permutation is given by Algorithm 3.

3.1.12 Lattices with different sizes per dimension

At every recursive level, our algorithm splits the size of each dimension in half (removing

one bit), until there is only 1 node per dimension. When the dimensions do not all have

the same size, some of the dimensions reach 1 node first and beyond that point they are

Algorithm 3 Hierarchical permutation of the lattice – case k_j = k

% Input:
%   the coordinates of a point x = (x_1, x_2, . . . , x_d)
%   the global RB array produced by Algorithm 2
% Output:
%   The location in which x is found in the hierarchical permutation
function loc = LatticeHierPermutation0((x_1, x_2, . . . , x_d))
    % Make a d × k table of all the coordinate bits
    for j = 1 : d
        (b^j_k, . . . , b^j_2, b^j_1) = dec2bin(x_j)
    % Accumulate bit-reversed order. Start from LSB
    loc = [ ]
    for i = 1 : k
        % A vertical section of bits. Take the i-th bit of all coordinates
        % and permute it to the corresponding red-black order
        (s^1, . . . , s^d) = RB(bin2dec(b^1_i, b^2_i, . . . , b^d_i))
        % Append this string to create the reverse order string
        loc = [loc, (s^1, . . . , s^d)]
    return bin2dec(loc)

not subdivided. If m out of d dimensions have reached size 1, the above algorithm should continue as in a d−m dimension lattice, at every level concatenating only the active d−m bits in loc. In this case, however, the red-black permutation RB should correspond to that of a d−m dimensional lattice. The following three lemmas and the theorem, whose proofs are in the Appendix, allow us to avoid computing and storing RBj for each j = 1, . . . , d.

As before, we consider cd the array of 0/1 colors of the 2-point, d-dimensional torus.

Lemma 3.2 For any d > 0, c_d(2i) = c_{d-1}(i), i = 0, . . . , 2^{d-1} - 1.

Lemma 3.3 For any d > 0, c_d(2i) = ¬c_d(2i + 1), i = 0, . . . , 2^{d-1} - 1.

Lemma 3.4 For any d > 0 the values of RB_d(i), i = 0, . . . , 2^d - 1, are given by:

RB_d(i) = { ⌊i/2⌋, if c_d(i) = 0;   ⌊i/2⌋ + 2^{d-1}, if c_d(i) = 1 }.

We can now show how RBm, m < d, can be obtained from RBd.

Theorem 3.2 Let RB_d be the permutation array that groups together the same colors in a red-black ordering of the two-point, d-dimensional lattice, as produced by Algorithm 2.

For any 0 < m < d,

RB_m(i) = ⌊RB_d(i·2^{d-m}) / 2^{d-m}⌋,   i = 0, . . . , 2^m - 1.

The theorem says that given RB_d in bit format, RB_m is obtained as the left (most significant) m bits of every 2^{d-m}-th number in RB_d. We now have all the pieces needed to modify Algorithm 3 to produce the permutation of the hierarchical coloring of a d-dimensional lattice torus of size N = ∏_{j=1}^d 2^{k_j}.

For d > 1, Algorithm 4 differs from the general approach discussed in the beginning of Section 3.1.8 because it pre-splits groups of colors into conformal lattices before it induces the new connectivity. The difference in the actual number of colors, however, is small. At level i = 0, 1, . . ., Algorithm 4 performs a distance-(2^{i+1} - 1) coloring and produces 2^{d(i+1)} colors.

For classical probing, the minimum number of colors required for a distance-(2^{i+1} - 1) coloring of lattices is not known for d > 2 [32]. An obvious lower bound is the number of lattice sites in the “unit sphere” of graph diameter 2^i. If \binom{d}{j} denotes the binomial coefficient, with a 0 value if d < j, the lower bound is given by [26, Theorem 2.7]:

Σ_{j=0}^{d} \binom{d}{j} \binom{d - j - 1 + 2^{i-1}}{d}.

For sufficiently large distances, this is O(2^{3i-1}/3) for d = 3, and O(2^{4i-3}/3) for d = 4.

Thus, we can bound asymptotically how many more colors our method gives:

(Number of colors in hierarchical probing) / (Number of colors in classical probing) < 12 if d = 3, and < 48 if d = 4.

Algorithm 4 Hierarchical permutation of the lattice – case 2^{k_i} ≠ 2^{k_j}

% Input:
%   the coordinates of a point x = (x_1, x_2, . . . , x_d)
%   the global RB array produced by Algorithm 2
% Output:
%   The location in which x is found in the hierarchical permutation
function loc = LatticeHierPermutation((x_1, x_2, . . . , x_d))
    % Make a d × max(k_j) table of all the coordinate bits
    % Dimensions with smaller sizes only have up to k_j bits set
    for j = 1 : d
        (b^j_{k_j}, . . . , b^j_2, b^j_1) = dec2bin(x_j)
    % Accumulate bit-reversed order. Start from LSB
    loc = [ ]
    for i = 1 : max(k_j)
        % A vertical section of bits. Take the i-th bit of all coordinates
        % in dimensions that can still be subdivided (i ≤ k_j).
        % Record number of such dimensions
        activeDims = 0
        bits = [ ]
        for j = 1 : d
            if (i ≤ k_j)
                bits = [bits, b^j_i]
                activeDims = activeDims + 1
        % permute it to the corresponding red-black order using RB_i
        index = bin2dec(bits) · 2^{d - activeDims}
        (s^1, . . . , s^{activeDims}) = ⌊RB(index) / 2^{d - activeDims}⌋
        % Append this string to create the reverse order string
        loc = [loc, (s^1, . . . , s^{activeDims})]
    return bin2dec(loc)

In practice, we have observed ratios of 2–3. On the other hand, because hierarchical probing uses more vectors, the variance reduction it achieves when a certain distance coloring completes, i.e., after 2^{d(i+1)} quadratures in the MC estimator, is typically better

than classical probing for the same distance.

In terms of computational cost, this algorithm is very efficient, especially when com-

pared with classical probing. As an example, producing the hierarchical permutation of

a 64^4 lattice takes about 6 seconds on a MacBook Pro with a 2.8 GHz Intel Core 2 Duo.

More importantly, the permutation of each coordinate is obtained independently which

facilitates parallel computing.

3.1.13 Coloring lattices with non-power of two sizes

Consider a lattice of size N = ∏_{j=1}^d n_j. Sometimes, LQCD may generate lattices where one or more n_j are not powers of two. In this case, it is typical that n_j = 2^m p, where p ≠ 2 is a small prime number. Our hierarchical coloring method works up to m levels, but then the remaining subgrids are of odd size in the j-th dimension, causing coloring

conflicts because of wrap-around. In the next theorem, whose proof is in Appendix, we

show that such a lattice is three colorable.

Theorem 3.3 A toroidal, uniform lattice of size N = ∏_{j=1}^d n_j, where one or more n_j are odd, admits a three-coloring with point x = (x_1, . . . , x_d) receiving color:

C(x) = ( Σ_{j=1}^d x_j + Σ_{j=1}^d δ(x_j) ) mod 3,  where δ(x_j) = 1 if (x_j = n_j - 1) and (n_j - 1 mod 3 = 0), and δ(x_j) = 0 everywhere else.
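Theorem 3.3 is easy to check numerically on a small torus (a sketch; the lattice size below is arbitrary):

    import itertools

    def three_color(x, n):
        """Theorem 3.3: C(x) = (sum_j x_j + sum_j delta(x_j)) mod 3."""
        delta = sum(1 for xj, nj in zip(x, n) if xj == nj - 1 and (nj - 1) % 3 == 0)
        return (sum(x) + delta) % 3

    n = (4, 9)                                        # toroidal lattice with an odd dimension
    for x in itertools.product(*[range(nj) for nj in n]):
        for j in range(len(n)):
            y = list(x); y[j] = (y[j] + 1) % n[j]     # wrap-around neighbour in dimension j
            assert three_color(x, n) != three_color(tuple(y), n)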

After the three-coloring is produced, further hierarchical colorings are not practical,

since the coloring yields blocks of nodes that are not conformal lattices, and are of irregular shapes. This prevents the use of a splitting method similar to the one described in the previous sections.

Because of this, for lattices which have dimensions with factors other than two, we can

proceed only one level further after the factors of two have been exhausted by the hier-

35 archical coloring algorithm. This is not a shortcoming in LQCD, since, by construction, lattices have dimensions with at most one odd factor. Finally, note that the number of hi- erarchical probing vectors produced before exhausting the powers of two in each dimension is typically large, obviating the need for a last three-coloring step.

3.1.14 Generating the probing basis

Assume for the moment each color group at any level of the hierarchical method is colored with exactly two colors. Consider also the permuted matrix A(perm, perm) so that colors

appear in the block diagonal. In Section 3.1.7, we saw the Hadamard vectors required for

probing the first two levels of this recursion for an 8×8 matrix: [h_0, h_4] and [h_0, h_4, h_2, h_6].

If ⊗ denotes the Kronecker product and 1_k = [1, . . . , 1]^T the vector of k ones, these can be written as:

[h_0, h_4] = H_2 ⊗ 1_4,
[h_0, h_4, h_2, h_6] = [H_2 ⊗ H_2(:, 1), H_2 ⊗ H_2(:, 2)] ⊗ 1_2.

This pattern extends to any recursion level i = 1, 2, . . . , log_2 N. If we denote by Z^{(i)} the Hadamard vectors that span the i-th level probing vectors, these are obtained by the following recursion:

Z̃^{(1)} = H_2,
Z̃^{(i)} = [Z̃^{(i-1)} ⊗ H_2(:, 1), Z̃^{(i-1)} ⊗ H_2(:, 2)],
Z^{(i)} = Z̃^{(i)} ⊗ 1_{N/2^i}.   (3.5)

Intuitively, this says that at every level, we should repeat the pattern devised in the previous level to double the domains for the first 2^{i-1} vectors (Kronecker product with [1, 1]^T), and then should split each basic subdomain in two opposites (Kronecker product with [1, -1]^T).

The hierarchy Z^{(i-1)} = Z^{(i)}(:, 1 : 2^{i-1}) implies that quadratures performed with Z^{(i-1)} can be reused if we need to increase the probing level. To obtain the m-th probing vector, therefore, we can consider the m-th vector of Z^{(log_2 N)}. Its rows can be constructed piece by piece recursively through (3.5) and without constructing all of Z^{(log_2 N)}. In fact, we can even avoid the recursive construction and compute any arbitrary element Z^{(log_2 N)}(j, m) directly. This is useful in parallel computing where each processor generates only the local rows of this vector. The reason is that recursion (3.5) produces a known permutation of the natural order of the Hadamard matrix, specifically the column indices are:

0,N/2,N/4, 3N/4,N/8, 5N/8, 3N/8, 7N/8,.... (3.6)

We can compute a-priori this column permutation array, Hperm, for all N, or for as many vectors as we plan to use in the MC estimator. Then by using the inverse permutation

(iperm) associated with the given hierarchical coloring, the j-th element of the m-th probing vector can be computed directly through (3.2) as:

z_m(j) = H_N(iperm(j), Hperm(m)).   (3.7)
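A short sketch of (3.6)–(3.7) follows (0-based indices; the column sequence (3.6) coincides with the bit-reversal permutation of the column index, which the code uses, and iperm is whatever inverse hierarchical permutation is in effect, here the 1-D one of (3.4)):

    def bitrev(x, k):
        return int(format(x, "0{}b".format(k))[::-1], 2)

    def hadamard_entry(i, j):                         # equation (3.2), 0-based
        return -1 if bin(i & j).count("1") % 2 else 1

    def probing_element(j, m, iperm, k):
        """z_m(j) of (3.7): entry (iperm[j], Hperm[m]) of H_N with N = 2^k."""
        return hadamard_entry(iperm[j], bitrev(m, k))

    # For the 1-D lattice, the inverse of the bit-reversal permutation (3.4) is
    # again bit reversal, so the second vector is the red-black +1/-1 vector.
    k = 3
    iperm = [bitrev(x, k) for x in range(2 ** k)]
    print([probing_element(j, 1, iperm, k) for j in range(8)])   # [1, -1, 1, -1, 1, -1, 1, -1]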

We observe now that the assumption that each subgroup is colored with exactly two

colors is not necessary. The ordering given in (3.6) is the same if each subgroup is colored

by any power of two colors, which could be different at different levels. The sequence (3.6)

is built on the smallest increment of powers of two and thus subsumes any higher powers.

We can extend the above ideas to generate the probing basis for arbitrary N, when

at every level each color block is split into exactly the same (possibly non-power of two)

colors. For example, at the first level we split the graph into 3 colors, at level two, each of

the 3 color blocks is colored with exactly 5 colors, at level three, each of the 5 color blocks

is colored with exactly 2 colors, and so on. The problem is that Hadamard matrices do not

exist for arbitrary dimensions. For example, for 3 probing vectors, there is no orthogonal

basis Z of ±1 elements, such that ZZ^T = I. In this general case, we must resort to the

N-th roots of unity, i.e., the Fourier matrices Fn. Assume that the number of colors at level i is c(i) for all blocks at that level, then the probing basis is constructed recursively as:

Z̃^{(1)} = F_{c(1)},
Z̃^{(i)} = [Z̃^{(i-1)} ⊗ F_{c(i)}(:, 1), . . . , Z̃^{(i-1)} ⊗ F_{c(i)}(:, c(i))],
Z^{(i)} = Z̃^{(i)} ⊗ 1_{N/γ_i},  where γ_i = ∏_{j=1}^i c(j).   (3.8)

By construction, the vectors of Z(i−1) are contained in Z(i), and any arbitrary vector can

be generated with a simple recursive algorithm. However, we have introduced complex

arithmetic which doubles the computational cost for real matrices. On the other hand,

if a c(i) is a power of two, its Fc(i) can be replaced by Hc(i). This can be useful when the non-power of two colors appear only at later recursion levels for which the number of probing vectors is large and may not be used, or when only one or two Fc(i) will suffice. To summarize, we have provided an inexpensive way to generate, for any matrix size, an arbitrary vector of the hierarchical probing sequence through (3.8), as long as the number of colors is the same within the same level for each subgraph. If, in addition, the matrix size and the color numbers are powers of two, (3.6–3.7) provide an even simpler way to generate the probing sequence. In LQCD, many of the lattices fall in this last category.
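Recursion (3.8) can be sketched with small DFT blocks as follows (illustrative only; the names and the example block-color counts are ours):

    import numpy as np

    def dft(n):
        w = np.exp(2j * np.pi / n)
        return np.array([[w ** (j * k) for k in range(n)] for j in range(n)])

    def probing_basis(c, N):
        """Build Z^(i) of (3.8) for block-color counts c = [c(1), ..., c(i)]."""
        Zt = dft(c[0])
        for ci in c[1:]:
            F = dft(ci)
            Zt = np.hstack([np.kron(Zt, F[:, [k]]) for k in range(ci)])
        gamma = int(np.prod(c))
        return np.kron(Zt, np.ones((N // gamma, 1)))   # Z^(i) = Ztilde^(i) x 1_{N/gamma_i}

    N = 12
    Z2 = probing_basis([3, 2], N)
    Z1 = probing_basis([3], N)
    assert np.allclose(Z2[:, :Z1.shape[1]], Z1)        # nested: level-1 vectors are reused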

3.1.15 Removing the deterministic bias

The probing vectors produced in Section 3.1.14 are deterministic and, even though they give better approximations than random vectors, they introduce a bias. To avoid this, we can view formula (3.5) not as a sequence of vectors but as a process of generating an orthogonal basis starting from any vector and following a particular pattern. Therefore,

consider a random vector z_0 ∈ Z_2^N, and [z_1, z_2, . . . , z_m] the sequence of vectors produced by (3.5). If ⊙ is the element-wise Hadamard product, the vectors built as

V = [z_0 ⊙ z_1, z_0 ⊙ z_2, . . . , z_0 ⊙ z_m]   (3.9)

have the same properties as Z, i.e., V^T V = Z^T Z and V V^T has the same non-zero pattern as ZZ^T (V V^T = (z_0 z_0^T) ⊙ ZZ^T), but one can easily show that the expected value of the trace estimate over all z_0 is the matrix trace, yielding an unbiased estimator.
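A minimal sketch of the bias-removal step (3.9) and the resulting MC estimate follows (natural-order Hadamard columns are used here for brevity; in practice the hierarchically permuted sequence would be used, and the test matrix below is arbitrary):

    import numpy as np
    from scipy.linalg import hadamard

    rng = np.random.default_rng(0)
    N, s = 64, 16
    A = np.diag(np.arange(2.0, 2.0 + N)) + 0.1 * rng.standard_normal((N, N))
    A = (A + A.T) / 2                                   # a generic symmetric test matrix
    Ainv = np.linalg.inv(A)

    Z = hadamard(N)[:, :s].astype(float)                # deterministic vectors
    z0 = rng.choice([-1.0, 1.0], size=(N, 1))           # random Z_2 vector
    V = z0 * Z                                          # (3.9): elementwise product per column
    est = np.trace(V.T @ Ainv @ V) / s                  # average of the s quadratures
    print(est, np.trace(Ainv))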

3.1.16 Numerical experiments

We present a set of numerical examples on control test problems and on a large QCD calculation in order to show the effectiveness of hierarchical probing over classical probing, and over standard noise Monte Carlo estimators for Tr(A−1). We also study the effect of the unbiased estimator.

Our standard control problem is the discretization of the Laplacian on a uniform lattice with periodic boundary conditions. We control the dimensions (3-D or 4-D), the size per dimension, and the conditioning (and thus the decay of the elements of the inverse) by applying a shift to the matrix. Most importantly, for these matrices we know the trace of the inverse analytically. We will refer to such problems as Laplacian, with their size implying their dimensionality.

3.1.17 Comparison with classical probing

For this set of experiments we consider a 64^3 Laplacian, shifted so that its condition number is on the order of 10^2. Therefore, its A^{-1} exhibits dominant features on and close to (in a graph theoretical sense) the non-zero structure of A, with decay away from it.

The decay rate depends on the conditioning of A. Our methods should be able to pick this structure effectively.

Figure 3.4 shows the performance of classical probing, which is a natural benchmark for our methods. The left graph shows that for larger distance colorings, probing performs extremely well. For example, with 317 probing vectors, which correspond to a distance-8 coloring, we achieve more than two orders of magnitude reduction in the error. Of course, if the approximation is not good enough, this work must be discarded, and the algorithm must be repeated for higher distances. Hadamard vectors, in their natural order, do not capture well the nonzero structure of this A.

The right graph in Figure 3.4 shows one way to improve accuracy beyond a certain

probing distance. After using [0, . . . , 0, 1_{r(m)}^T, 0, . . . , 0]^T as the probing vector for color m, we continue building a Hadamard matrix in its natural order only for the r(m) coordinates of that color. If probing has captured the most important parts of the matrix, the remaining parts could be sufficiently approximated by natural order Hadamard vectors. This is confirmed by the results in the graph, if one knows what initial probing distance to pick.

On the other hand, hierarchical probing, which considers all possible levels, achieves better performance than all other combinations.

In Figure 3.5, left graph, we stop our recursive algorithm at various levels and use the resulting permutation to generate the vectors for the trace computation. It is clearly beneficial to allow the recursion to run for all levels. We also point out that stopping at intermediate levels behaves similarly to classical probing with the corresponding distance.

On the right graph of Figure 3.5, we observe no difference between methods for highly ill-conditioned matrices. The reason is that the eigenvector of the smallest eigenvalue of A is the vector of all ones, 1_N. The more ill-conditioned A is, the more A^{-1} is dominated by 1_N 1_N^T, which has absolutely no variation or pattern. We point out that the experiments in this subsection did not use the unbiased estimator of Section 3.1.15. This has a severe effect for the Laplacian matrix because the first vector of our Hadamard sequences is h_0 = 1_N, the lowest eigenvector. Even for a well conditioned Laplacian, starting with h_0 guarantees that the first trace estimate will have no contribution from other eigenvectors, and thus will have a large error. From a statistical

[Two plots of trace error versus number of quadratures for the 64 × 64 × 64 Laplacian with condition number 1e+02, comparing probing with k = 1, 2, 4, 8 against natural-order and hierarchical Hadamard vectors.]

Figure 3.4: Error in the Tr(A−1) approximation using the MC method with various deterministic vectors. Classic probing requires 2,16,62, and 317 colors for probing dis- tances 1,2,4, and 8, respectively. Left: Classic probing approximates the trace better than the same number of Hadamard vectors taken in their natural order. Going to higher distance-k requires discarding previous work. Right: Perform distance-k probing, then ap- ply Hadamard in natural order within each color. Performs well, but hierarchical performs even better.

point of view, h0 is the worst starting vector for Laplacians, but it better exposes the rate at which various methods reduce error.

3.1.18 Comparison with random-noise Monte Carlo

Having established that hierarchical probing discovers matrix structure as well as classical

probing, we turn to gauge its improvements over the standard Z2 noise MC estimator. First, we show three sets of graphs for increasing condition numbers of the Laplacian. We

use the 64^3, 32^4, and 64 × 128^2 lattices, and plot the convergence of the trace estimates for

hierarchical probing, natural order Hadamard, and for the standard Z2 random estimator. Both Hadamard sequences employ the bias removing technique (3.9). As it is typical, the

random estimator includes error bars designating the two standard deviation confidence

intervals, ±2(Var/s)^{1/2}, where Var is the variance estimator.

Figure 3.6 shows the convergence history of the three estimators for well conditioned shifted Laplacians, which therefore have prominent structure in A−1. Hierarchical probing

[Two plots of trace error versus number of quadratures for the 64 × 64 × 64 Laplacian: left, condition number 1e+02, with the hierarchical coloring stopped at levels 1–5 and compared against natural-order and full hierarchical Hadamard; right, condition number 1e+06, with probing k = 1, 2, 4, 8 and hierarchical Hadamard.]

Figure 3.5: Left: The hierarchical coloring algorithm is stopped after 1, 2, 3, 4, 5 levels corresponding to distances 2, 4, 8, 16, 32. The ticks on the x-axis show the number of colors for each distance. Trace estimation is effective up to the stopped level; beyond that the vectors do not capture the remaining areas of large elements in A−1. Compare the results with classical probing in Figure 3.4, which requires only a few less colors for the same distance. Right: When the matrix is shifted to have high condition number, the lack of structure in A−1 causes all methods to produce similar results. exploits this structure, and thus performs much better than the other methods. Note that the problem on the left graph is identical to the one used in the previous section. The far better performance of the Hadamard sequences in this case is due to avoiding the eigenvector h0 as the starting vector. Figures 3.7 and 3.8 show results as the condition number of the problems increase. As expected, the advantage of hierarchical probing wanes as the structure of A−1 disappears, but there is still no reason not to use it as diminishing improvement remain evident. We have included 4-D lattices in our experiments, first because of their use in LQCD, and second because their structure is more difficult to exploit than lower dimensional lattices.

For 1-D or 2-D lattices which we do not show, hierarchical probing was significantly more efficient.

Once we use a random vector z_0 to modify our sequence as in (3.9), hierarchical probing becomes a stochastic process, whose statistical properties must be studied. Thus, we generate z_0^{(i)}, i = 1 : 100, Z_2 random vectors, and for each one we produce a modified

[Two plots of the trace estimate versus number of quadratures for the 64 × 64 × 64 Laplacian (cond = 1e+02, left) and the 32 × 32 × 32 × 32 Laplacian (cond = 2e+02, right), comparing the Z2 random estimator, Z2 ⊙ natural-order Hadamard, and Z2 ⊙ hierarchical Hadamard against the exact trace.]

Figure 3.6: Convergence history of Z2 random estimator, Hadamard vectors in natural order, and hierarchical probing, the latter two with bias removed as in (3.9). Because of small condition number, A−1 has a lot of structure, making hierarchical probing clearly superior to the standard estimator. As expected, Hadamard vectors in natural order are not competitive. The markers on the plot of the hierarchical probing method designate the number of vectors required for a particular distance coloring to complete. It is on these markers that structure is captured and error minimized.

sequence of the hierarchical probing vectors. Then, we use the 100 values x_m^T A^{-1} x_m, where x_m = z_0^{(i)} ⊙ z_m, at every step of the 100 MC estimators to calculate confidence intervals.

These are shown in Figure 3.9. We emphasize that the confidence intervals for the Z2 random estimator are computed differently, based on the V ar estimator of the preceding

MC steps, so they may not be accurate initially. Even on a 4-D problem, hierarchical probing provides a clear variance improvement.

3.1.19 A large QCD problem

The methodology presented in this paper has the potential of improving a multitude of

LQCD calculations. In this section, we focus on the calculation of C = Tr(D−1), where

the Dirac matrix D is a non-symmetric complex sparse matrix. This is representative of

a larger class of calculations usually called “disconnected diagrams”. The physical observ-

able C is related to an important property of QCD called spontaneous chiral symmetry breaking.

[Two plots of the trace estimate versus number of quadratures for the 64 × 128 × 128 Laplacian (cond = 1e+04, left) and the 32 × 32 × 32 × 32 Laplacian (cond = 2e+04, right).]

Figure 3.7: Convergence history of the three estimators as in Figure 3.6 for a larger condition number O(104). As the structure of A−1 becomes less prominent, the differences between methods reduce. Still, hierarchical probing has a clear advantage.

Our goal is to compare the standard MC approach of computing the trace with our hierarchical probing method. Our test was performed on a single gauge field configuration using the Dirac matrix that corresponds to the “strange” quark, resulting from the Clover-Wilson fermion discretization [56]. The strange quark is the third lightest quark flavor in nature. The gauge configuration had dimensions of 32^3 × 64 with a lattice spacing of a = 0.11 fm, for a problem size of 24 million.

First, we used an ensemble of n = 253 noise vectors to estimate the variance of the standard MC method, with complete probing (dilution) of the internal color-spin space of dimension 12 to completely eliminate the variance due to connections in this space. Then, for each of these noise vectors, we modified as in (3.9) a sequence of hierarchical probing vectors which were generated based on space-time connections. As with the standard

MC estimator, full dilution of the color-spin space was performed. This procedure was performed in order to statistically estimate the variance of hierarchical probing, similarly to the test in Figure 3.9. In Figure 3.10(a), we present the variance of the hierarchical probing estimator as a function of the number of space-time probing vectors in the sequence. The main feature in this plot is that the variance drops as more vectors are used. Local minima occur at numbers of vectors that are powers of 2, where all connections of a given

[Two plots of the trace estimate versus number of quadratures for the 64 × 128 × 128 Laplacian (cond = 1e+06, left) and the 32 × 32 × 32 × 32 Laplacian (cond = 2e+06, right).]

Figure 3.8: Convergence history of the three estimators as in Figure 3.6 for a high con- dition number O(106). Even with no prominent structure in A−1 to discover, hierarchical probing is as effective as the standard method.

Manhattan distance are eliminated from the variance. The uncertainty of the variance,

represented by the errorbars in the plot, is estimated using the Jackknife resampling

procedure of our noise vector ensemble.

In addition to the variance, we estimate the speed-up ratio of the hierarchical probing

estimator over the standard MC estimator. We define speed-up ratio as:

R_s = V_stoc / (V_hp(s) × s),

where V_hp(s) is the variance over the n different runs when the s-th hierarchical probing vector is used, and V_stoc is the variance of the standard MC estimator as estimated from n = 253 samples. The rescaling factor of s is there to account for the fact that if one had been using a pure stochastic noise with n × s vectors, the variance would be smaller by a factor of s. Thus, the variance comparison is performed on an equal amount of computation for both methods. In Figure 3.10(b) we present the speed-up ratio R_s as a function of s. The errorbars on R_s are estimated using Jackknife resampling from our ensemble of starting noise vectors. The peaks in this plot occur at the points where s is a power of 2, as in the variance case. A maximum overall speed-up factor of about 10 is observed at s = 512.

[Plot of the trace estimate versus number of quadratures for the 16 × 16 × 16 × 16 Laplacian (cond = 2e+02), showing the exact trace, the Z2 random estimator, and Z2 ⊙ hierarchical Hadamard with ±2σ confidence intervals.]

Figure 3.9: Providing statistics over 100 random vectors z0, used to modify the sequence of 2048 hierarchical probing vectors as in (3.9). At every step, the variance of quadratures from the 100 different runs is computed, and confidence intervals reported around the hi- erarchical probing convergence. Note that for the standard noise MC estimator confidence intervals are computed differently and thus they are not directly comparable.

Note that the color completion points for this experiment are at s = 2, s = 32 and s = 512

vectors.

Finally, we report on a comparison with classical probing for this large QCD problem.

There is a variety of approaches for efficient distance-2 coloring in the literature [38, 33],

but we have not found any standard approaches for distance-k coloring. On lattices,

however, the distance-k neighborhood of a node is explicitly known geometrically. We

implemented a coloring algorithm that visits only this neighborhood for each node, thus

achieving the minimum possible complexity for this problem [33]. Specifically, for each

node, we make a list of colors previously assigned to its distance-k neighbors, and pick

the smallest color number not appearing in the list. The distance-4 coloring of our LQCD

lattice produced 123 colors and took 457 seconds on an Intel Xeon X5672, 3.2GHz server.

Using four random vectors with the structure of each of these colors (so that the total

number of quadratures is similar to our hierarchical probing), we ran 50 sets of experi-

ments, and measured the variance of classical probing. We found that its variance was

2.16 times larger than our hierarchical probing, or in other words, our method was 2.16

46 6 10

5 1 10 10

4 ) 10 s

3 variance 10 speed up (R

2 0 10 10

1 10 1 2 4 8 16 32 64 128 256 512 1 2 4 8 16 32 64 128 256 512 N N hadamard hadamard

Figure 3.10: (a) Left: The variance of the hierarchical probing trace estimator as a function of the number of vectors (s) used. The minima appear when s is a power of two. The places where the colors complete are marked with the cyan circle. These minima become successively deeper as we progress from 2 to 32 to 512 vectors. (b) Right: Speed- up of the LQCD trace calculation over the standard Z2 MC estimator. The cyan circles mark where colors complete. The maximal speed up is observed at s = 512. In both cases the uncertainties are estimated using the Jackknife procedure on a sample of 253 noise vectors, except for s = 256 and 512 where 37 vectors were used. times faster. This is expected as we explained earlier. Finally, note that computing the quadratures took 4 hours on four GPUs, on a dedicated machine for LQCD calculations.

Even though classic probing with distance-4 is feasible for this problem, computing the distance-8 coloring requires 5377 seconds, which becomes comparable to the time for com- puting the quadratures. Contrast that to the 2 seconds needed to compute the hierarchical probing.

3.1.20 Conclusions

The motivation for this work comes from our need to compute Tr(A−1) for very large

sparse matrices and LQCD lattices. Current methods are based on Monte Carlo and do

not sufficiently exploit the structure of the matrix. Probing is an attractive technique but

cannot be used incrementally, and becomes expensive for ill conditioned problems. Our

research has addressed these issues.

We have introduced the idea of hierarchical probing that produces suboptimal but nested distance-2^i colorings recursively, for all distances up to the diameter of the graph.

We have adapted this idea to uniform lattices of any dimension in a very efficient and parallelizable way.

To generate probing vectors that follow the hierarchical permutation and can be used incrementally to improve accuracy, we have developed an algorithm that produces a spe- cific permutation of the Hadamard vectors. This algorithm is limited to cases where the number of colors produced at every level is a power of two. We have also provided a re- cursive algorithm based on Fourier matrices that provides the appropriate sequence under the weaker assumption of having the same number of colors per block within a single level.

These conditions are satisfied on toroidal lattices. Finally, we proposed an inexpensive technique to avoid deterministic bias while using the above sequences of vectors.

We have performed a set of experiments in the context of computing Tr(A−1), and have shown that providing a hierarchical coloring for all possible distances is to be preferred over classical probing for a specific distance. We also showed that our methods provide significant speed-ups over the standard Monte Carlo approach.

Currently we are working to extend the idea of hierarchical coloring to general sparse matrices, and to combine it with other variance reduction techniques, in particular defla- tion type methods.

APPENDIX

Lemma 3.2. Proof: We use induction on d. For d = 2, c_2 = [0, 1, 1, 0] and the result holds. Assume the result holds for any dimension d − 1 or lower. Then for d dimensions, since the first half of c_d is the same as c_{d-1}, for i = 0, . . . , 2^{d-1} − 1, we have

c_d(2i) = ¬c_d(2i − 2^{d-1}) = ¬c_{d-1}(2i − 2^{d-1})   (recursive definition of c_d)
        = ¬c_{d-2}(i − 2^{d-2})                           (inductive hypothesis)
        = ¬(¬c_{d-1}(i)) = c_{d-1}(i)                     (recursive definition of c_d).

48 

Lemma 3.3. Proof: Because c_d are the colors of the two-point, d-dimensional torus, every even point 2i is the beginning of a new 1-D line and thus has a different color from its neighbor 2i + 1. It can also be proved inductively, since by construction 2i and 2i + 1

cannot be split across c_{d-1} and c_d. □

Lemma 3.4. Proof: Because of Lemma 3.3, after every pair of indices (2i, 2i + 1) is considered, the number of reds or blacks increases only by 1. Algorithm 2 sends all red (c_d(i) = 0) points i to the first half of the permutation in the order they are considered, which increases by 1 every two steps. Hence the first part of the equation. Black colors are sent to the second half, which completes the proof. □

Theorem 3.2. Proof: We show first for m = d − 1. Because of Lemma 3.2, we consider the even points in RB_d. Assume first c_d(2i) = c_{d-1}(i) = 0. From Lemma 3.4 we have,

RB_d(2i) = ⌊2i/2⌋ = i. Then, RB_{d-1}(i) = ⌊i/2⌋ = ⌊RB_d(2i)/2⌋. Now assume c_d(2i) = c_{d-1}(i) = 1. From Lemma 3.4 we have, RB_d(2i) = 2^{d-1} + ⌊2i/2⌋ = 2^{d-1} + i, and therefore RB_{d-1}(i) = 2^{d-2} + ⌊i/2⌋ = 2^{d-2} + ⌊(RB_d(2i) − 2^{d-1})/2⌋ = ⌊RB_d(2i)/2⌋, which proves the formula for both colors. A simple inductive argument proves the result for any

m = 1, . . . , d − 2. □

Theorem 3.3. Proof: We show that C(x) ≠ C(x′) for any two points x, x′ with ‖x − x′‖_1 = 1. These two points differ by one coordinate, j, since otherwise they are no longer unit length apart. So, C(x) − C(x′) = ( Σ_{i=1}^d (x_i − x′_i) + Σ_{i=1}^d (δ(x_i) − δ(x′_i)) ) mod 3 = ( x_j − x′_j + δ(x_j) − δ(x′_j) ) mod 3. We consider the following cases.

If neither x nor x′ lies on the boundary of the j-th dimension, x_j ≠ n_j − 1, then δ(x_j) = δ(x′_j) = 0, and C(x) − C(x′) = (x_j − x′_j) mod 3 = ±1 mod 3 ≠ 0.

Since x_j, x′_j both vary along the j-th dimension, only one of these points can lie on the boundary point of that dimension; consequently, only one of the two deltas can be equal to one. Without loss of generality, we assume that x_j is on the boundary of the j-th dimension, so C(x) − C(x′) = (x_j − x′_j + δ(x_j)) mod 3. In this case x_j − x′_j = 1, or in the wrap-around case, where x′_j = 0, x_j − x′_j = n_j − 1. There are two subcases:

1. δ(x_j) = 0, then x_j = n_j − 1 with n_j − 1 mod 3 ≠ 0, so C(x) − C(x′) = 1, or C(x) − C(x′) = (n_j − 1 mod 3) and thus is non-zero.

2. δ(x_j) = 1, then x_j = n_j − 1 and n_j − 1 mod 3 = 0, so C(x) − C(x′) is equal to (1 + δ(x_j)) mod 3 = (1 + 1) mod 3 ≠ 0, or C(x) − C(x′) is equal to (n_j − 1 + δ(x_j)) = (0 + δ(x_j)) mod 3 = 1 mod 3 ≠ 0. □



3.2 Lattices of arbitrary dimensions

3.2.1 Introduction and Preliminaries

In the previous section, we introduced an algorithm which produced hierarchical probing vectors for lattices with dimension lengths that are powers of two. The main idea behind the algorithm was to find a way to split a lattice quickly into a collection of sublattices recursively, as long as the length of the lattice was a power of two. This was achieved by splitting each lattice into 2^d sublattices, where d is the dimensionality of the lattice.

Since points which share a sublattice are all distance 2 away from each other, if they are assigned the same color, a valid distance-1 coloring of that sublattice will result. Further, if the lattice dimensions are a power of two, the lattice will split evenly, so each color will be divided into the same number of colors, and thus will be hierarchical. This process can then be continued recursively, until the desired number of colors is reached, or until every node has a unique coloring. Unfortunately, if the lattice's dimensions are not powers of two, then the sublattice cannot be split evenly into 2^d sublattices. At each level the dimensions of the sublattice being split will be reduced by a factor of two. At the level at which the dimensions of the sublattice are no longer divisible by two, the algorithm will be unable to continue, since it cannot then split further into even sublattices.

In this section we extend the hierarchical probing algorithm and the associated theory to sublattices of arbitrary dimensions, as long as those dimensions share some common

factors. Our algorithm bases the number of splits of the original lattice on the prime factors of the lengths of its dimensions. By using these factors the lattice can be divided up into sublattices of equal sizes, allowing the process to be continued recursively. When the factors of the lattice dimensions contain factors of 2, the original method using fast binary arithmetic can still be used. It should be noted that the number of colors produced at each hierarchical level m by our method is larger than the minimum number needed to produce a distance-m coloring of the lattice. In most cases this is an acceptable trade-off in order to ensure that the hierarchical property of the generated colorings is maintained.

Moreover, these additional vectors reduce the error further than the optimal coloring for the same distance.

3.3 Lattices as spans of sublattices

Formally, a lattice is a discrete additive subgroup of R^n. Intuitively, it is a collection of points such that adding the locations of any two points together gives the coordinates of

another valid point, and there is a minimum distance between the closest two points. A

good example of a d-dimensional lattice would be the Cartesian product of the integers,

which is the canonical d-dimensional regular grid. One can contrast this with a vector space such as the d-dimensional Cartesian product of the reals. A lattice need not be infinite, it can be formed on any finite group that has the required properties. In this paper we are mainly interested in finite lattices that have a periodic boundary condition.

Similarly to vector spaces, one can write down the basis for a lattice. We then say the

lattice is generated by B, a set of d vectors of R^d, as

L(B) = { Σ_{i=1}^d x_i b_i : b_i ∈ B, x_i ∈ Z } = { Bx, x ∈ Z^d }.   (3.10)

Note how the requirement that the coefficients of the basis vectors be integer enforces a

minimum distance between two points, in contrast to a vector space. Returning to the

51 example of the regular grid, we write the generating function for this lattice as

L(I) = { Σ_{i=1}^d x_i e_i : e_i ∈ I, x_i ∈ Z } = { Ix, x ∈ Z^d },   (3.11)

where I is the d-dimensional identity matrix and ei is the i-th column of I. Just as vector spaces may contain closed sub-spaces, lattices may contain closed sub- lattices. In this paper we are interested in the sublattices of L(I). In particular, we want to determine which sublattices of L(I) a point lies in. For example, the sublattice L(bI)

can describe only 1/b^d of the points of L(I), with spacing b. Consider also the concept of an affine lattice, which we define as

L(B)_c = { Σ_{i=1}^d x_i b_i + c : b_i ∈ B, x_i ∈ Z, c ∈ Z^d } = { Bx + c, x, c ∈ Z^d }.   (3.12)

We can use these to decompose L(I) into a union of affine sublattices, since non-affine sublattices represent only sublattices centered at zero. For a given sublattice spacing b,

any point in L(I) lies in one of b^d affine sublattices L(bI)_c. These sublattices can be said to span L(I). An example of this can be seen in Figure 3.11, where the 6×6 lattice is spanned by 3^2 affine sublattices of spacing 3.

(a) Sublattices with offsets (0, 0), (2, 1), (1, 2).  (b) Sublattices with offsets (1, 0), (0, 1), (2, 2).  (c) Sublattices with offsets (2, 0), (1, 1), (0, 2).

Figure 3.11: The decomposition of a 6×6 lattice into 3^2 sublattices L(3I)_{c_i}.

More formally, given b ∈ Z, the sublattices that span the entire lattice L(I) are:

52  0   1   b − 1   0   0   b − 1  L(bI)ci = L(bI) + ci, with c0 =  .  , c1 =  .  ,..., cbd−1 =  .  . (3.13)  .   .   .  0 0 b − 1

As there are b distinct options for each of the d elements of an offset c, there are b^d distinct lattice bases that span L(I). Based on the b-radix representation of integers, we can find a one-to-one function that maps the integers 0 ≤ i ≤ b^d − 1 to each offset vector c, allowing each c to be associated with a unique sublattice number. The function that maps c to i is

i = Σ_{j=1}^d c_j b^{j-1}.   (3.14)

Its inverse function that maps i to a particular offset c is computed by Algorithm 5. This gives the following general equation for the i-th affine sublattice basis

L(bI)_{c_i} = L(bI) + [ ⌊r_{d-1}/b^0⌋, . . . , ⌊r_1/b^{d-2}⌋, ⌊i/b^{d-1}⌋ ]^T.   (3.15)

Algorithm 5 c = ConvertIndexToOffset(i, b, d)
% Find the affine offset c, given its integer reference number i
% Input: i: integer lattice reference
% Output: The offset vector c
1: for m = d → 1 do
2:     c(m) ← ⌊i / b^{m-1}⌋
3:     r_m ← i (mod b^{m-1})
4:     i ← r_m
5: end for
return c

Because the sublattices span the lattice, the coordinates of any lattice point x can be represented as xi ∗ b + ci, xi, ci ∈ Z, 0 ≤ ci ≤ (b − 1), or bx + c for some offset vector c in

(3.13). Therefore, taking each coordinate mod b yields the offsets c = (ci), which determine

through (3.14) which sublattice the point lies in. Consider the example in Figure 3.12. The top left sublattice consists of the points (0, 0), (0, 3), (3, 0), (3, 3) ≡ (0, 0) (mod 3). From

(3.14), i = 0∗b1+0∗b0 = 0, indicating these points are in the 0-th sublattice. Alternatively, the points (2, 1),(5, 1),(2, 4),(5, 4) ≡ (2, 1) (mod 3). Since i = 1 ∗ 3 + 2 = 5, these points are in the 5-th sublattice. Finally, points (1, 2),(1, 5),(4, 5),(4, 2) ≡ (1, 2) (mod 3), which means these points are in the 7-th sublattice (i = 2 ∗ 3 + 1 = 7).

Figure 3.12: Affine sublattices with x, y coordinates; the 6×6 lattice nodes are labeled (0,0) through (5,5).

Consider now a finite lattice with d dimensions of length $d_i$, i = 1, . . . , d. Let $F_i = \mathrm{factor}(d_i)$ be the sorted list of the integers resulting from the prime factorization of $d_i$. Then define the list of common factors as $F = \mathrm{sort}(\bigcap_{i=1}^{d} F_i)$. In the example of Figure

3.11, F1 = F2 = {2, 3}, and so F = {2, 3}. More interestingly, consider a lattice of dimensions 60 × 140. Then F1 = {2, 2, 3, 5},F2 = {2, 2, 5, 7}, and F = {2, 2, 5}. We can use the list of common factors F to split the lattice L(I) into a hierarchy of spanning sublattices. We start with the smallest b = F(1) and obtain the sublattices in

(3.13). Then we split every sublattice into its own set of spanning sublattices based on the next common factor F(2). The process continues recursively until all common factors have been exhausted. Using the fact that for any point p in a lattice, its sublattice offset after a split is (p mod b), Algorithm 6 computes the sublattice offset vectors of p for all levels. Because of the equivalence between offsets and indices, Algorithm 6 returns only the index of the offsets through (3.14). Note that after splitting L(I) with b = F(1), the

$L(bI)_{c_i}$ have common factors F(2 : end), and all have the same size with dimensions $d_i/b$. The coordinates of p in its sublattice are $\lfloor p/b \rfloor$.

Algorithm 6 [i(1), . . . , i(f)] = SublatticeIndicesOfPoint(p, F)
% Determine which sublattice a point lies in at each splitting level
% Input: p lattice point coordinates
% Input: F the common prime factors F(1) ≤ · · · ≤ F(f)
% Output: i(m) the index corresponding to offset c_m, m = 1, . . . , f of the m-th split
1: for m = 1 → size(F) do    % At each level use splitting spacing b = F(m)
2:   c_m ← p (mod F(m))    % determine p's affine offset (i.e., sublattice) at level m
3:   i(m) ← Convert c_m to index through (3.14)
4:   p ← ⌊p/F(m)⌋    % point coordinates in this sublattice
5: end for
return i(m) for all m
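Algorithm 6 admits an equally short sketch. The following Python fragment (again a 0-based illustration with a hypothetical name, not the implementation used for the experiments) computes the sublattice index of a point at every splitting level:

def sublattice_indices_of_point(p, F):
    """At each level, convert the point's offset (p mod b) to an index via (3.14)."""
    p = list(p)
    indices = []
    for b in F:                                       # level m uses spacing b = F(m)
        c = [coord % b for coord in p]                # affine offset at this level
        indices.append(sum(ci * b**j for j, ci in enumerate(c)))   # eq. (3.14)
        p = [coord // b for coord in p]               # coordinates inside the sublattice
    return indices

# Example: point (5, 4) on the 6x6 lattice of Figure 3.11 with F = [2, 3] gives [1, 8].
print(sublattice_indices_of_point((5, 4), [2, 3]))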

3.4 Coloring sublattices

The distance between any two points $p_1, p_2 \in L(bI)_c$ is $\|p_1 - p_2\|_1$. The minimum distance of these points is b, the spacing of the sublattice. More formally, from (3.12),

there exist $x_1, x_2 \in \mathbb{Z}^d$, such that $p_1 = bx_1 + c$, $p_2 = bx_2 + c$, and, if $p_1, p_2$ are distinct,

$\|p_1 - p_2\|_1 = b\|x_1 - x_2\|_1 \ge b$. Thus, we may assign the same color to all points in $L(bI)_c$

and still have a valid distance b − 1 coloring of the points within L(bI)c. However, the minimum distance between points in two different sublattices is deter-

mined by the distance of their offsets. Using (3.12) again, if $p_1 \in L(bI)_{c_i}$ and $p_2 \in L(bI)_{c_j}$,

$\|p_1 - p_2\|_1 = \|b(x_1 - x_2) + c_i - c_j\|_1 \ge \|c_i - c_j\|_1$, since we can pick $x_1 = x_2$. For example, points $[0,\ldots,0]^T \in L(bI)_{c_0}$ and $[1, 0,\ldots,0]^T \in L(bI)_{c_1}$ are distance 1 apart. Therefore, if

the nodes in L(bI)c0 and in L(bI)c1 are all assigned the same color, we cannot achieve a valid distance 1 coloring on the entire L(I).

The problem is equivalent to coloring the finite toroidal lattice

C = {p mod b, p ∈ L(I)} , (3.16)

whose points are the $b^d$ offset vectors $c_i$ in (3.13). Note that C can be used to tile the original lattice as seen in Figure 3.13. Different coloring strategies of C achieve different dis-

tances between ci, cj with the same color, and hence between the points of L(bI)ci , L(bI)cj . More formally we have the following.

Lemma 3.5 Assume that each p ∈ L(I) is assigned a color, color(p), and that all points

in each L(bI)c are assigned the same color, i.e., ∀pi, pj ∈ L(bI)c, color(pi) = color(pj). Then color(p mod b) = color(p).

Proof: Since ∪cL(bI)c = L(I), p = bx + c. Then color(bx + c mod b) = color(c), and

since p, c ∈ L(bI)c, both have the same color. 

To take advantage of the b spacing of the points within each L(bI)c, one obvious

strategy is to assign every ci in C (equivalently each sublattice) a unique color. This guarantees a distance b − 1 coloring for the entire lattice L(I). In the context of our

recursive splitting algorithm, the first split with $b_1 \in F$ uses $b_1^d$ colors, and achieves a distance $b_1 - 1$ coloring. At the second recursive level with $b_2 \in F$, each $L(b_1 I)_c$ is split into $b_2^d$ sublattices, each with a unique color, for a total of $(b_1 b_2)^d$ colors. Points in $L(b_2 I)_c$ are at

least b2 hops apart, but these hops are edges in the L(b1I)c lattice. Thus the minimum

distance achieved by this coloring at the second level is b1b2 − 1. A simple inductive argument shows the following.

Lemma 3.6 If at every level of the recursive splitting algorithm each sublattice is assigned

a unique color, then at level m we have used $(b_1 \cdots b_m)^d$ colors and have achieved a distance

b1 ··· bm − 1 coloring.

The algorithm increases the effective distance exponentially with each level, and prob-

ing with the corresponding vectors should be very effective. However, the number of colors

(and of probing vectors) used increases rapidly too. This is not an efficiency problem but

rather an evaluation problem. After level m−1, probing cannot fully annihilate elements of

distance $b_1 \cdots b_m - 1$ until all $(b_1 \cdots b_m)^d$ colors have been used. Thus, we cannot properly

evaluate its progress for intermediate numbers of colors. Moreover, the number of probing vectors in the next level may not be affordable computationally. For example, if $b_i = 2$ and d = 4, we can only guarantee meaningful results at color numbers 16, 256, 4096, . . . .

Therefore, it would be desirable to maintain the effectiveness of the method when each node in C is uniquely colored, but also have one or more intermediate evaluation points where smaller distance colorings complete.

A problem with more than one intermediate point is that the colorings of C at two

different distances must be hierarchical, i.e., two lattice nodes that have different colors

at distance j cannot have the same color at larger distance colorings. Also, to facilitate

the generation of probing vectors on-demand, each color should have the same number of

nodes (see later discussion). Since b is prime, we can only consider colorings with b, or $b^2$, . . . , or $b^d$ colors. For example, a red-black coloring of C is not a valid distance 1 coloring for any odd b. The periodic connection links two nodes (and thus sublattices) with the same color. We will see experimentally that the errors from ignoring these connections can be significant. Instead, three colors can provide a valid distance 1 coloring of any toroidal lattice [66]. However, for $b \ne 3$, the three color subsets would not have the same number of nodes.

We are not aware of a method that produces optimal distance colorings of C for any b and d. For small values of b, d we have identified heuristics that, using b colors, produce a valid distance $O(b^{1/d})$ coloring of C. For practical problems as in LQCD, d ≤ 4 and b ≤ 7, so the effective distance achieved is not better than distance 1. Besides, an optimal coloring is not necessary as the nodes in the same colors will be hierarchically colored too so that nearby permuted nodes eventually have large distances. Therefore, we focus our attention on the following simpler method as our only intermediate point between two recursive levels. This coloring strategy produces a valid distance 1 coloring with b colors, and ensures that each color appears the same number of times. If p ∈ L(I), with

p mod b = c, define its color:

$$\mathrm{color}(p) = \Big( \sum_{i=1}^{d} p(i) \Big) \bmod b. \qquad (3.17)$$

Because of Lemma 3.5, we also have $\mathrm{color}(p) = \mathrm{color}(c) = \big(\sum_{i=1}^{d} c(i)\big) \bmod b$. Note that when we reorder the nodes of C based on this coloring, we also consider the sublattices

L(bI)c in the same order.

Figure 3.13: The circled nodes constitute the C lattice of offsets. Note how C tiles the entire lattice, and that its coloring reflects the coloring of each sublattice $L(bI)_c$. Since b = 3, each line of colors is the same as the previous line, shifted by 1 mod 3.

Lemma 3.7 The strategy (3.17) is a valid distance-1 coloring of L(I).
Proof: Let $p_1, p_2 \in L(I)$, with $\|p_1 - p_2\|_1 = 1$. This means they share all but one coordinate, say the i-th. If their connection is not due to the periodic boundary, their i-th coordinate will differ by one. Thus, $|p_1(i) - p_2(i)| \bmod b = 1$, implying that $\mathrm{color}(p_1) \ne \mathrm{color}(p_2)$. If

both points lie on a boundary and connect via the toroidal property, then $|p_1(i) - p_2(i)| =$

$(b-1) - 0 \not\equiv 0 \pmod{b}$, and therefore $\mathrm{color}(p_1) \ne \mathrm{color}(p_2)$. Since there are no other

cases possible, the result holds. 

The coloring strategy (3.17) has an efficient recursive implementation. Note that the

colors in the i-th (d − 1)-dimensional slice of C are the colors of the (i − 1)-th (d − 1)-

dimensional slice shifted by 1 mod b. This can be seen in Figure 3.13 for d = 2. Since the

0-th slice is the same as the coloring of C in d − 1 dimensions, we can build the colorings for all dimensions in the following recursive manner.

Let $c_{d,b}$¹ be the array of all $b^d$ colors of the corresponding d-dimensional C in natural ordering. This C has b (d − 1)-dimensional slices each corresponding to the (d − 1)-

dimensional lattice of offsets. Let $c_{d-1,b}$ be the array of the $b^{d-1}$ colors of these (d − 1)-

dimensional lattices. Then, cd,b is a concatenation of b shifted cd−1,b arrays,

$$c_{d,b} = \{\, c_{d-1,b},\ c_{d-1,b} + 1 \bmod b,\ \ldots,\ c_{d-1,b} + (b-1) \bmod b \,\}. \qquad (3.18)$$

Each shift applies to all the elements of the array cd−1,b. In Figure 3.13, for example,

c1,3 = {0, 1, 2} are the colors of a one dimensional lattice, but also the first row of the two

dimensional C. The colors of C are then c2,3 = {{0, 1, 2}, {0, 1, 2} + 1 mod 3, {0, 1, 2} + 2 mod 3} = {0, 1, 2, 1, 2, 0, 2, 0, 1}.

Algorithm 7 implements this recursive coloring of C starting with c0,b = 0, and then generates the permutation, Perm, that reorders sublattices of the same color together.

Since there is the same number of sublattices for each of the b colors, the first sublattice

of color i is assigned to the $i\,b^{d-1}$ location, and the index to the next available free spot for

this color stored in ColorIndex is incremented by one. We note that this coloring incurs

negligible computational cost, even for very large lattices, while it enables an additional

intermediate step between splits where the trace estimation may be monitored. This low

cost is an advantage over other colorings that could be used to define different intermediate

steps.

We now have a method that at each level m = 1, . . . , f recursively splits a sublattice

into $F(m)^d$ sublattices, giving each a different color. But before each sublattice is assigned

its own color at level m, we have an intermediate coloring that groups together $F(m)^{d-1}$

sublattices in the same color. According to Lemma 3.6, after the intermediate step before

level m we have $(b_1 \cdots b_{m-1})^d\, b_m$ colors ensuring a distance $b_1 \cdots b_{m-1}$ coloring, and after

1The notation of this array is not to be confused with the notation of offsets c which are in bold.

Algorithm 7 Perm(0, . . . , b^d − 1) = GenOffsetPermutation(b, d)
% Generate the b-coloring permutation of C which reorders the sublattices (offsets)
% Input: prime factor b, lattice dimension d
% Output: Perm, the b-color permutation of the b^d sublattices
1: % Generate the coloring using (3.18)
2: c_{0,b} ← {0}
3: for j = 1 → d do
4:   c_{j,b} ← {c_{j−1,b}, c_{j−1,b} + 1 mod b, ..., c_{j−1,b} + b − 1 mod b}
5: end for
6: % Initialize index showing where the next sublattice of color i should go
7: for i = 0 → b − 1 do
8:   ColorIndex(i) ← i ∗ b^{d−1}
9: end for
10: for i = 0 → b^d − 1 do
11:   Color ← c_{d,b}(i)    % Look up the color of sublattice i in array c_{d,b}
12:   Perm(i) ← ColorIndex(Color)    % The new location of sublattice i
13:   ColorIndex(Color) ← ColorIndex(Color) + 1
14: end for
return Perm
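As an illustration, the coloring (3.18) and the reordering of Algorithm 7 can be sketched in Python as follows (a 0-based toy version; gen_offset_permutation is our own name for it):

def gen_offset_permutation(b, d):
    """Build the coloring c_{d,b} recursively as in (3.18), then group same-colored sublattices."""
    colors = [0]                                        # c_{0,b}
    for _ in range(d):                                  # c_{j,b} from c_{j-1,b}, eq. (3.18)
        colors = [(c + shift) % b for shift in range(b) for c in colors]
    perm = [0] * (b ** d)
    color_index = [i * b ** (d - 1) for i in range(b)]  # next free slot for each color
    for i, color in enumerate(colors):
        perm[i] = color_index[color]
        color_index[color] += 1
    return perm

# For b = 3, d = 2 the colors come out as {0,1,2,1,2,0,2,0,1}, as in the text below Figure 3.13.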

the m-th level we have $(b_1 \cdots b_m)^d$ colors for a distance $b_1 \cdots b_m - 1$ coloring. Next, we describe the global hierarchical permutation and in particular how to find the location in this permutation of an arbitrary node. This will be used to efficiently generate the hierarchical probing vectors.

3.4.1 Hierarchical Permutations of Lattices with Equal Sides

The final permutation can be obtained recursively by applying the coloring permutation from level m on the permuted index from level m−1. This ordering ensures that the closer the nodes are geometrically, the farther they are ordered in the permutation. Orderings of lower levels provide no additional information, since nodes never move closer together in subsequent levels, so need not be stored. Moreover, we can avoid the above recursion by determining directly where in the final permutation a node will lie.

Consider the example in Figure 3.11. At level 0, the 3-coloring of the lattice permutes the nodes into three groups as shown below.

Level 0, after 3-coloring:
  color 0: 0 3 8 11 13 16 18 21 26 29 31 34
  color 1: 1 4 6 9 14 17 19 22 24 27 32 35
  color 2: 2 5 7 10 12 15 20 23 25 28 30 33
Level 1, after splitting to 3^2 sublattices:
  offsets 0,5,7: 0 3 18 21 8 11 26 29 13 16 31 34
  offsets 1,3,8: 1 4 19 22 6 9 24 27 14 17 32 35
  offsets 2,4,6: 2 5 20 23 7 10 25 28 12 15 30 33

At level 1, since b = 3, we split the top level lattice to 9 sublattices. Three of those lattices

(offsets 0, 5, 7) contain only the red nodes from the intermediate coloring and need to be ordered first. Notice how the actual permutation of the red nodes at level 0 is not needed as it is present in the ordering of the sublattices by Algorithm 7. An exception would be after the last level, if we decide to 3-color the remaining sublattices that do not share common prime factors (as in our previous work [66]).

Algorithm 8 shows how we can construct the final hierarchical permutation. Given the coordinates of a node, p, Algorithm 6 generates the indices [i(1), . . . , i(f)] of the sublattices that p lies in at each level. Algorithm 7 permutes these to $[\hat i(1), \ldots, \hat i(f)] = \mathrm{Perm}([i(1), \ldots, i(f)])$.

Because the permutation preserves the hierarchy, the location of the node p is determined by all the nodes that belong to sublattices that appear prior to its own sublattices in the final permutation. For example, there are exactly $\hat i(1)$ sublattices preceding it at the first level, $\hat i(2)$ sublattices preceding it at the second level, and so on. At every level m the size of each sublattice reduces by a factor of $F(m)^d$. Thus, given the lattice size $L = \prod_{i=1}^{d} d_i$, we have

$$\mathrm{Location}(p) = \sum_{m=1}^{f} \hat i(m)\, \frac{L}{\prod_{l=1}^{m} F(l)^d}. \qquad (3.19)$$

3.4.2 Hierarchical Permutations of Lattices with Unequal Sides

Algorithm 8 relies on having each lattice split into an equal number of sublattices of the

same dimensionality. However, it is possible that one dimension of the lattice may be

smaller than the others, leading to that dimension being exhausted before the others.

If the rest of the dimensions share factors, the algorithm can continue but in a lower

dimensionality lattice that removes exhausted dimensions. Let d¯(m) be the number of

active dimensions at level m. For example, the lattice 6×6×2 has factors (2, 3), (2, 3), (2), so for the first level $\bar d(1) = 3$, while for the second level $\bar d(2) = 2$, since at the second level all sublattices will be two-dimensional. Thus, $\bar d(m)$ can be computed simply by counting the number of common factors in each dimension that has not been exhausted.

Algorithm 8 Location = HPpermutation(p, d, F)
% Compute the Hierarchical Probing permutation of a node p when d_1 = ··· = d_d
% Input: point coordinates p, lattice dimension d, common prime factors F
% Output: the location of p in the HP permutation
1: [i(1), . . . , i(f)] = SublatticeIndicesOfPoint(p, F)    (Algorithm 6)
2: subLatticeSize = ∏_{i=1}^d d_i
3: Location = 0
4: for m = 1 → f do
5:   Perm = GenOffsetPermutation(F(m), d)    (Algorithm 7)
6:   subLatticeSize = subLatticeSize / F(m)^d
7:   Location = Location + Perm(i(m)) ∗ subLatticeSize
8: end for
return Location

Then, the new location of a node is given by (3.20),²

$$\mathrm{Location}(p) = \sum_{m=1}^{f} \hat i(m)\, \frac{L}{\prod_{l=1}^{m} F(l)^{\bar d(l)}}. \qquad (3.20)$$

We can avoid computing and storing coloring permutations for lattices with reducing dimensionalities by reusing the previously computed permutations for C of a higher dimensionality in (3.16), as long as the spacing b is the same. First, recall that for a given

C of dimensionality d and spacing b, the coloring cd,b in (3.18) is created recursively. This means that the color of a particular node in C can be given in terms of either a higher or a lower dimensional C plus a correctional offset as below,

$$c_{d,b}(k) = c_{d-1,b}(k \bmod b^{d-1}) + \Big\lfloor \frac{k}{b^{d-1}} \Big\rfloor \bmod b, \quad \forall k < b^d, \qquad (3.21)$$
$$c_{d-1,b}(k \bmod b^{d-1}) = c_{d,b}(k) - \Big\lfloor \frac{k}{b^{d-1}} \Big\rfloor \bmod b, \quad \forall k < b^d. \qquad (3.22)$$

²It is worth noting that just as [66] interpreted this process as representing the node number in binary and then permuting the digits, we can represent each node in mixed radix, where the radix list is the color numbers used to color the sublattices at each level, and then permute these digits.

Additionally, we shall make use of the definition of mod for positive integers

$$k \bmod b = k - b\Big\lfloor \frac{k}{b} \Big\rfloor. \qquad (3.23)$$

Lemma 3.8 For any prime b, $c_{d,b}(ib) = c_{d-1,b}(i)$, $\forall i = 0, \ldots, b^{d-1} - 1$.

Proof: We proceed by induction on d. For the base case, $c_{1,b} = \{0, \ldots, b-1\}$, and $c_{2,b} = \{c_{1,b}, c_{1,b} + 1 \bmod b, \ldots, c_{1,b} + b - 1 \bmod b\}$. Then by construction, every $c_{2,b}(ib) = 0 + i = c_{1,b}(i)$. Assume that $c_{d-1,b}(ib) = c_{d-2,b}(ib/b) = c_{d-2,b}(i)$. Then,

$$\begin{aligned}
c_{d,b}(ib) &= c_{d-1,b}(ib \bmod b^{d-1}) + \lfloor ib/b^{d-1} \rfloor \bmod b && \text{(by 3.21)}\\
&= c_{d-2,b}\big((ib \bmod b^{d-1})/b\big) + \lfloor ib/b^{d-1} \rfloor \bmod b && \text{(by the I.H.)}\\
&= c_{d-2,b}\big((ib - b^{d-1}\lfloor ib/b^{d-1} \rfloor)/b\big) + \lfloor ib/b^{d-1} \rfloor \bmod b && \text{(by 3.23)}\\
&= c_{d-2,b}(i \bmod b^{d-2}) + \lfloor ib/b^{d-1} \rfloor \bmod b && \text{(by 3.23)}\\
&= c_{d-1,b}(i) - \lfloor i/b^{d-2} \rfloor + \lfloor ib/b^{d-1} \rfloor \bmod b && \text{(by 3.22)}\\
&= c_{d-1,b}(i) - \lfloor i/b^{d-2} \rfloor + \lfloor i/b^{d-2} \rfloor \bmod b = c_{d-1,b}(i).
\end{aligned}$$



Lemma 3.9 For any prime b, $c_{d,b}(ib) = c_{d,b}(ib + q) - q \bmod b$, $\forall i = 0, \ldots, b^{d-1} - 1$, $\forall q = 0, \ldots, b - 1$.

Proof: We proceed by induction on d. For the base case, by construction we have $c_{2,b}(ib + q) \bmod b = c_{1,b}(i) + q \bmod b$. Then, $c_{2,b}(ib + q) - q \bmod b = (c_{1,b}(i) + q \bmod b - q) \bmod b = c_{1,b}(i) \bmod b = c_{1,b}(i) = c_{2,b}(ib)$ (by Lemma 3.8). We now assume $c_{d-1,b}(ib) = c_{d-1,b}(ib + q \bmod b^{d-1}) - q \bmod b$. Then,

$$\begin{aligned}
c_{d,b}(ib) &= c_{d-1,b}(ib \bmod b^{d-1}) + \lfloor ib/b^{d-1} \rfloor \bmod b && \text{(by 3.21)}\\
&= c_{d-1,b}(ib + q \bmod b^{d-1}) - q + \lfloor ib/b^{d-1} \rfloor \bmod b && \text{(by the I.H.)}\\
&= c_{d,b}(ib + q \bmod b^{d}) - q \bmod b && \text{(by 3.22)}\\
&= c_{d,b}(ib + q) - q \bmod b. && \text{(since } ib + q < b^d\text{)}
\end{aligned}$$



Lemma 3.10 Let $\mathrm{Perm}_d$ be the permutation created by Algorithm 7 associated with dimension d. Then, for any prime b and $i = 0, \ldots, b^d - 1$, $\mathrm{Perm}_d(i) = \lfloor i/b \rfloor + c_{d,b}(i)\, b^{d-1}$.

Proof: Because of Lemma 3.9, when any b-tuple of indices $(bi, bi + 1, \ldots, bi + b - 1)$ is considered, the number of nodes in every color increases by 1. Since Algorithm 7 will send

the i-th color to the $i\,b^{d-1}$-th section, the equation holds.

Theorem 3.4 For any 0 < m < d, Permm can be obtained directly as follows,

$$\mathrm{Perm}_m(i) = \big\lfloor \mathrm{Perm}_d(i\,b^{d-m})/b^{d-m} \big\rfloor, \quad i = 0, \ldots, b^m - 1.$$

Proof: Since $i \le b^m$, we can apply Lemma 3.8 recursively,

$$c_{d,b}(i\,b^{d-m}) = c_{d-1,b}(i\,b^{d-m-1}) = c_{d-2,b}(i\,b^{d-m-2}) = \cdots = c_{m,b}(i).$$

Using this and Lemma 3.10 we have

$$\Big\lfloor \frac{\mathrm{Perm}_d(i\,b^{d-m})}{b^{d-m}} \Big\rfloor = \Big\lfloor \frac{\lfloor i\,b^{d-m}/b \rfloor + c_{d,b}(i\,b^{d-m})\,b^{d-1}}{b^{d-m}} \Big\rfloor = \Big\lfloor \frac{i}{b} \Big\rfloor + c_{m,b}(i)\,b^{m-1} = \mathrm{Perm}_m(i).$$



Based on Theorem 3.4, Algorithm 9 shows how to reuse previously generated cd,b to compute permutations for lower dimensional lattices when the smaller lattice dimensions are exhausted.

Algorithm 9 Location = HPpermutation_general(p, d, d_i, F)
% Compute the Hierarchical Probing permutation of a node p for general d_i
% Input: point coordinates p, lattice dimension d, common prime factors F
% Output: the location of p in the HP permutation
1: [i(1), . . . , i(f)] = SublatticeIndicesOfPoint(p, F)    % Algorithm 6
2: subLatticeSize = ∏_{i=1}^d d_i
3: Location = 0
4: for m = 1 → f do
5:   d̄(m) = setActiveDims()
6:   maxBdim = 0
7:   for i = d downto d̄(m) do    % Search and retrieve the highest dimensional
8:     Perm = getHash(b, i)    % stored permutation for a b = F(m) split
9:     if Perm != empty then    % if found one, reuse it
10:      maxBdim = i
11:      break
12:    end if
13:  end for
14:  if maxBdim then
15:    for i = 1 → b^{d̄(m)} do
16:      newperm(i) = ⌊Perm(i · b^{maxBdim − d̄(m)}) / b^{maxBdim − d̄(m)}⌋    % Theorem 3.4
17:    end for
18:    Perm = newperm
19:  else
20:    Perm = GenOffsetPermutation(F(m), d̄(m))    % Algorithm 7
21:  end if
22:  setHash(b, d̄(m)) = Perm    % Store this permutation
23:  subLatticeSize = subLatticeSize / F(m)^{d̄(m)}    % Equation (3.20)
24:  Location = Location + Perm(i(m)) ∗ subLatticeSize
25: end for
return Location
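The reuse rule of Theorem 3.4 that Algorithm 9 exploits is easy to state in code. The following Python fragment (an illustrative sketch; perm_from_higher_dim is a hypothetical helper) derives the permutation of a lower-dimensional offset lattice from a stored higher-dimensional one:

def perm_from_higher_dim(perm_d, b, d, m):
    """Perm_m(i) = floor(Perm_d(i*b^(d-m)) / b^(d-m)), i = 0, ..., b^m - 1 (Theorem 3.4)."""
    step = b ** (d - m)
    return [perm_d[i * step] // step for i in range(b ** m)]

# With gen_offset_permutation from the earlier sketch one can verify, e.g. for b = 3, that
# perm_from_higher_dim(gen_offset_permutation(3, 3), 3, 3, 2) == gen_offset_permutation(3, 2).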

3.4.3 Generating Probing Vectors Quickly

In [66], we introduce the following recursive method for generating probing vectors for a colored lattice,

$$\tilde Z_{(1)} = F_{z(1)}, \qquad (3.24)$$
$$\tilde Z_{(i)} = \big[ \tilde Z_{(i-1)} \otimes F_{z(i)}(:,1),\ \ldots,\ \tilde Z_{(i-1)} \otimes F_{z(i)}(:,z(i)) \big], \qquad (3.25)$$
$$Z_{(i)} = \tilde Z_{(i)} \otimes \mathbf{1}_{N/\gamma_i}, \quad \text{where } \gamma_i = \prod_{j=1}^{i} z(j). \qquad (3.26)$$

Here Fz(i) is the Fourier transform of the identity matrix Iz(i), z(i) is the number of

colors each sublattice is split into at level i, 1s is the vector of s ones, and ⊗ is the Kronecker product. Essentially, these vectors recursively build a basis for the probing

vectors. At each level, we probe inside each color (i.e., sublattice) by smaller probing

vectors hierarchically, which are all assembled into a basis through the Kronecker products.

Instead of generating the whole matrix, however, we produce each probing vector one at

a time, hence requiring the same memory as the Hutchinson method.
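To make the construction (3.24)–(3.26) concrete, the following numpy sketch builds the full probing basis explicitly (for illustration only; as noted above, in practice we generate one probing vector at a time rather than the whole matrix):

import numpy as np

def probing_matrix(z, N):
    """Full probing basis for color counts z = [z(1), ..., z(f)]; N must be divisible by prod(z)."""
    Ztilde = np.fft.fft(np.eye(z[0]))                    # F_{z(1)}, eq. (3.24)
    for zi in z[1:]:
        F = np.fft.fft(np.eye(zi))
        # Z~(i) = [Z~(i-1) kron F(:,1), ..., Z~(i-1) kron F(:,z(i))], eq. (3.25)
        Ztilde = np.hstack([np.kron(Ztilde, F[:, [k]]) for k in range(zi)])
    gamma = int(np.prod(z))
    return np.kron(Ztilde, np.ones((N // gamma, 1)))     # eq. (3.26)

# Example: z = [2, 2] and N = 16 gives 4 probing vectors of length 16.
print(probing_matrix([2, 2], 16).shape)   # (16, 4)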

To produce the k-th vector of the probing matrix, we first need to identify the maximum

level i needed such that γi−1 < k ≤ γi. Then at every lower recursive level of (3.25) we

only need two vectors; one vector from Z(i−1) and one from Fz(i). By (3.25), the matrix

Z(i) is divided into z(i) blocks, with each block forming a Kronecker product with a

different column of $F_{z(i)}$. Since each block has z(i − 1) columns, the k-th vector is in block $= \lfloor k/z(i-1) \rfloor$, and thus we can generate directly the desired column $F_{z(i)}(:, \mathrm{block})$.

This should be paired with the (k mod z(i − 1)) vector of Z(i−1) which we find recursively with the above procedure.3

When F(i) = 2, i = 1, . . . , k, the sublattices in the first k levels are red-black colorable,

and thus use only F2. Because F2 is equal to the Hadamard matrix H2, all vectors in the

3We note that this process can also be described in terms of radix conversion. Let the z(i)s be taken as the radix list. If k is converted to this mixed radix form [? ], the vectors of the Fourier transforms Fz(i) needed at each level will be the digits of this representation.

first k levels can be created using real arithmetic, which yields substantial savings over complex arithmetic. Moreover, we can use the fast bit-based method we introduced in

[66] for producing the required Hadamard vectors, leading to an additional performance gain. This approach can be seen in Algorithm 10.

Algorithm 10 ProbingVector = GenerateProbingVector(k, z, N)
% Compute the k-th probing vector
% Input: k, the number of colors at each level z(i), the matrix size N
% Output: The probing vector
1: for j = size(z) downto 2 do    % Compute indices of the needed F_{z(j)} vectors
2:   block(j) = ⌊k/z(j − 1)⌋;  k = k mod z(j − 1)
3: end for
4: block(1) ← k
5: % Find the initial 2-colorable sublattices, which can be probed quickly as in [66]
6: j ← 1; while z(j) == 2 do j ← j + 1; end while
7: fastLevels ← j − 1
8: ProbingVector ← [1]
9: ProbingVector ← FastHadamardMethod(1:fastLevels)
10: % The rest of the levels are built through Fourier vectors
11: for j = fastLevels+1 → size(z) do
12:   % Create the needed Fourier vector of F_{z(j)}
13:   f ← 2π/z(j)
14:   w ← [0 : f : (2π − f/2)] ∗ √−1
15:   F_{z(j)}(:, block(j)) ← exp(−w ∗ block(j))
16:   ProbingVector ← ProbingVector ⊗ F_{z(j)}(:, block(j))
17: end for
return ProbingVector

Since Algorithm 10 is based on coloring strategy (3.17), the first γi probing vectors in (3.26) correspond to what we termed as the intermediate coloring between splitting levels i − 1 and i. However, there are other numbers k, with γi−1 < k < γi, for which Z(:, 1 : k) are the probing vectors for a different valid coloring, most importantly the one in Lemma

3.6 where each sublattice gets a unique color. This is because the coloring is hierarchical, i.e., (3.17) is applied independently on each sublattice. Let us return to the example in

Figure 3.11, where we first consider the factor b = 3 and then b = 2. The results after the 3-coloring and the first level sublattice split are shown in Section 3.4.1. Here we focus only on color 0:

Level 0, after 3-coloring — color 0: 0 3 8 11 13 16 18 21 26 29 31 34
Level 1, after splitting to 3^2 sublattices — offsets 0,5,7: 0 3 18 21 8 11 26 29 13 16 31 34

Level 1, after 2-coloring each sublattice we have a total of 2 × 3^2 colors; original color 0 now contains 6 = 2 × 3 colors: 0 21 3 18 8 29 11 26 13 34 16 31

Notice therefore that by construction the first nine indices in our final permutation (which is used to generate the probing vectors Z(:, 1 : 9)) correspond to the coloring at level 1 where each sublattice has a different color.

3.5 Probing Vectors For Hierarchical Coloring on General Graphs

The above method for generating probing vectors assumes each color splits into the same number of colors at a given level. This is the case with our methods in Section 3.4.

With an arbitrary coloring method that does not assign the same number of sublattices to each color (e.g., the 3-coloring method on a lattice not divisible by three), a different way to generate the probing vectors is needed. A simple but not as efficient solution is to create the required canonical probing vectors, and then orthogonalize them against previous vectors in the sublattice as well as each other with Gram-Schmidt. We introduce a more efficient and elegant method that works with uneven color splits and thus generalizes probing to any graph with hierarchical coloring.

The method is described better through an example. Consider a graph with seven nodes (each node could be generalized to be a subgraph). Suppose at the first level the graph is assigned three colors. After the corresponding permutation, the first color contains nodes 1–3, the second color nodes 4 and 5, and the third color nodes 6 and 7. To probe at the first level we use the following probing vectors, which are a variation of $F_3$ in

(3.24) and (3.26) to allow for a different number of nodes per color,

$$Z_{(1,0)} = \begin{bmatrix} F_3(1,:) \otimes \mathbf{1}_3 \\ F_3(2,:) \otimes \mathbf{1}_2 \\ F_3(3,:) \otimes \mathbf{1}_2 \end{bmatrix} \in \mathbb{C}^{7\times 3}. \qquad (3.27)$$

Suppose now that at the second level, the first color splits into three colors and the others

into two. Clearly, the next level of probing vectors cannot be created by (3.25) because

of uneven splitting. Each color block of the first level has to be probed independently.

Thus, we could probe the first block using F3 for the elements inside the first block (with zeros everywhere else in the probing vector). Similarly for the last two blocks, but using

F2. These seven probing vectors are shown in (3.28) —note that 0k = zeros(k, 1). The problem is that using seven vectors would be wasting the solutions of linear systems with

the three probing vectors (3.27) in the first step.

$$\begin{bmatrix} F_3(:,1) \\ \mathbf{0}_2 \\ \mathbf{0}_2 \end{bmatrix},\ \begin{bmatrix} F_3(:,2) \\ \mathbf{0}_2 \\ \mathbf{0}_2 \end{bmatrix},\ \begin{bmatrix} F_3(:,3) \\ \mathbf{0}_2 \\ \mathbf{0}_2 \end{bmatrix}, \quad \begin{bmatrix} \mathbf{0}_3 \\ F_2(:,1) \\ \mathbf{0}_2 \end{bmatrix},\ \begin{bmatrix} \mathbf{0}_3 \\ F_2(:,2) \\ \mathbf{0}_2 \end{bmatrix}, \quad \begin{bmatrix} \mathbf{0}_3 \\ \mathbf{0}_2 \\ F_2(:,1) \end{bmatrix},\ \begin{bmatrix} \mathbf{0}_3 \\ \mathbf{0}_2 \\ F_2(:,2) \end{bmatrix}. \qquad (3.28)$$

The key to remedying this problem is to note that the three first vectors of the new

color blocks,

$$I = \begin{bmatrix} F_3(:,1) & \mathbf{0}_3 & \mathbf{0}_3 \\ \mathbf{0}_2 & F_2(:,1) & \mathbf{0}_2 \\ \mathbf{0}_2 & \mathbf{0}_2 & F_2(:,1) \end{bmatrix} = \begin{bmatrix} \mathbf{1}_3 & \mathbf{0}_3 & \mathbf{0}_3 \\ \mathbf{0}_2 & \mathbf{1}_2 & \mathbf{0}_2 \\ \mathbf{0}_2 & \mathbf{0}_2 & \mathbf{1}_2 \end{bmatrix}$$

are spanned by the vectors of $Z_{(1,0)}$, since $F_3$ is a basis of $\mathbb{C}^3$. More formally, if $a \in \mathbb{C}^{3\times 3}$, from (3.27) and basic properties of the Kronecker product we have that the following

matrix equation

$$Z_{(1,0)}\, a = \begin{bmatrix} F_3(1,:)\,a \otimes \mathbf{1}_3 \\ F_3(2,:)\,a \otimes \mathbf{1}_2 \\ F_3(3,:)\,a \otimes \mathbf{1}_2 \end{bmatrix} = I,$$

is equivalent to F3a = I3, which has a unique solution a = ifft(I3), i.e., the inverse Fourier

transform of the identity. Therefore, if we saved $P = A^{-1} Z_{(1,0)}$, we can recover the probing result for the vectors in I as $A^{-1}I = P a$. Thus, we only need to apply $A^{-1}$ on

the remaining four probing vectors, exactly as in our hierarchical probing method on the

lattice. Finally, if each node represents a subgraph, each color block in (3.28) involves

Kronecker products of the rows of its Fourier matrix with columns of ones, each sized to

the cardinality of the subgraph. Thus, each block has the same form as (3.27) and the idea can be applied recursively.

To generalize we need the following definitions. First, assume a hierarchical coloring at levels i = 0, 1,..., and let li−1 be the number of colors at level i − 1. The nodes belonging

to one of these colors are called a block at the next level i. There are li−1 blocks at the i-th level. Let s(j, i) be the number of colors the j-th block splits into at level i. Thus,

$\sum_{j=1}^{l_{i-1}} s(j, i) = l_i$. For each color in block j, let n(j, i, k), k = 1, . . . , s(j, i) be the number of nodes in that color. Thus, $\sum_{k=1}^{s(j,i)} n(j, i, k)$ is the number of nodes in the j-th block.

For each j = 1, . . . , $l_{i-1}$ block, define the Fourier transform $F_{s(j,i)} = \mathrm{fft}(I_{s(j,i)})$, and $Z_{(j,i)}$ the set of probing vectors as

$$Z_{(j,i)} = \begin{bmatrix} \mathbf{0} \\ F_m(1,:) \otimes \mathbf{1}_{n(j,i,1)} \\ \vdots \\ F_m(m,:) \otimes \mathbf{1}_{n(j,i,m)} \\ \mathbf{0} \end{bmatrix}, \quad \text{where } m = s(j, i). \qquad (3.29)$$

The 0 zero matrices have m columns and rows that overlap with all other blocks. At level

0, there is only one block (with s(1, 0) colors), so the 0 matrices are empty. E.g., the first

three vectors in (3.28) are $Z_{(1,1)}$, and $I = [Z_{(1,1)}(:, 1),\ Z_{(2,1)}(:, 1),\ Z_{(3,1)}(:, 1)]$. We assume that the results of the inversions have been saved, $P = A^{-1} Z_{(j,i-1)}$, for all

blocks $j = 1, \ldots, l_{i-2}$ at level i − 1. At the i-th level, the probing results for the first vector of each block can be determined as follows:

$$a = \mathrm{ifft}(I_{l_{i-1}}), \qquad (3.30)$$
$$A^{-1} Z_{(j,i)}(:, 1) = P\, a(:, j), \quad j = 1, \ldots, l_{i-1}, \qquad (3.31)$$

or equivalently note that

$$P a = \mathrm{ifft}(P^H)^T. \qquad (3.32)$$

Systems for the rest of the probing vectors in the blocks are solved explicitly. At the end of level i, we have inversions for all the probing vectors $Z_{(i)} = [Z_{(1,i)} \ldots Z_{(l_{i-1},i)}]$. If further levels are needed, the process continues as described in Algorithm 11. We emphasize that our new method fuses the generation of probing vectors and the solution of linear systems needed for the trace computation. However, Algorithm 11 depicts only the generation of the vectors.

We note that our method requires memory to store the vectors P at the previous level. When generating the P for level i, Algorithm 11 carefully implements this by first permuting the implicitly computed vectors P a to their new positions and then solving the linear systems for the new P vectors. Because of the tree structure, the total storage is $l_{i_{\max}-1}$ vectors, which is always less than half the final number of probing vectors at level $i_{\max}$ — more accurately, it would be less than $l_{i_{\max}}/\min_j s(j, i_{\max})$.

Computationally, at level i, we have avoided the solution of $l_{i-1}$ systems of equations at the expense of $l_{i-1}$ inverse FFTs in (3.32), or a $O(N l_{i-1} \log l_{i-1})$ cost. Moreover, this is

more elegant and less expensive than a brute-force Gram-Schmidt, which costs $O(N l_{i-1}^2)$ at level i. Finally we remind the reader that this method works for hierarchical colorings on arbitrary graphs.

3.6 Performance Testing

We investigate the performance of our algorithm for lattices in two ways. First we show that the increased cost of the algorithmic extensions is not excessive. Second, we show that the probing vectors produced by our algorithm for lattices whose dimensions have sizes that are not powers of 2 provide better trace estimation than the vectors from the original algorithm in [66].

Our experimental results shown in Table 3.1 indicate that the increased computation that our method requires over the original algorithm is reasonable, given the short running times involved even for very large lattices. At the same time, the algorithm is still

Algorithm 11 GenerateAndPerformProbingVector_general(s, l, n, i_max)
% Input: s(j, i): number of colors the j-th block splits into at the i-th level,
% n(j, i, k): the number of nodes in the color k subgraph of block j
% l_{i−1}: the number of colors at level i − 1, also the number of blocks at level i
% i_max: maximum desired level
% Output: The probing vectors Z at level i_max
1: Z ← [ ], P ← [ ]
2: F_{s(1,0)} ← fft(I_{s(1,0)})
3: Build Z_{(1,0)} using (3.29) and the coloring permutation
4: P ← [A^{−1} Z_{(1,0)}]
5: for i = 1 → i_max do    % Level i
6:   P ← ifft(P^H)^T
7:   newpos(1) = 1
8:   for j = 2 → l_{i−1} do    % block j
9:     newpos(j) = newpos(j − 1) + s(j − 1, i)    % new positions of P a at level i
10:  end for
11:  P(:, newpos) = P(:, 1 : l_{i−1})    % Permute to new positions
12:  for j = 1 → l_{i−1} do    % block j
13:    F_{s(j,i)} ← fft(I_{s(j,i)})
14:    Build Z_{(j,i)} using (3.29) and the coloring permutation
15:    for k = 2 → s(j, i) do    % color k in block j
16:      P(:, newpos(j) + k − 1) ← A^{−1} Z_{(j,i)}(:, k)
17:    end for
18:  end for
19: end for

embarrassingly parallel, since each point can be reordered independently. Given the low runtimes we obtain compared to the cost of solving the linear systems during probing, we have not investigated this option. Further, we observe that the dimensionality of the lattice does not impact the performance of the algorithm.

Lattice     Original Method Time (ms)   Extended Method Time (ms)   Time ratio
8^4                 12                          39                     3.3
16^4               187                         673                     3.5
32^4              3141                       12156                     3.8
4096^2           56228                      277045                     4.9
256^3            55435                      266676                     4.8
64^4             54598                      252687                     4.6

Table 3.1: Run times of the new algorithm compared to the original. Results obtained on an Intel i7 860 clocked at 2.8 GHz.

Finally, we examine the trace estimate produced on a model lattice problem, and compare it with the trace estimate obtained using a truncated permuted Hadamard matrix produced by the original hierarchical algorithm. This is essentially the same as applying an incorrect red-black coloring to the sublattices at each level, ignoring the links which are miscolored at the borders of each sublattice. While this will cancel the error from the most important parts of the structure of $A^{-1}$, it still leaves out important connections between sublattices. Therefore, we expect larger trace estimate errors, especially as the algorithm goes further down the coloring hierarchy. As we can see in Figure 3.14, the new algorithm does indeed perform significantly better, providing a much better trace estimate than the original method.

3.7 Conclusion

We have provided several extensions to the algorithm for hierarchically coloring and probing lattices. By formalizing the use of sublattices in the algorithm, we have made the algorithm easier to reason about. This allowed us to improve its flexibility, enabling the

Figure 3.14: Comparison of the two methods on a 2D lattice with common factors 2 × 2 × 3 × 3 × 5 (relative trace error versus number of probing vectors, for the new and original methods). For the common factors of two, the methods are the same, but once these are exhausted the improved method has much lower error.

algorithm to handle lattices with arbitrary dimensions, as long as the sizes of those dimensions share common prime factors. These improvements come at minimal computational cost and retain the ease of parallelization that was an attractive feature of the original algorithm. Finally, we have introduced a method of creating probing vectors, both for the case where the colorings split evenly into the same number of colors, and for the case where the coloring does not split evenly. We note that these methods of creating probing vectors can be applied to any matrix, not just those arising from lattices.

Chapter 4

Estimation of diag(f(A)) in the general case

In Chapter 3 we introduced methods to compute Diag(f(A)) in the case where the A in

question arises from PDEs with certain geometric properties. Unfortunately, there are

many interesting applications such as those discussed in 2.1.1 and 2.1.3, which give rise

to matrices that do not have these useful properties.

This forces us to return to the original probing approaches for discovering structure in

the matrix [62]. As discussed in Section 2.2.2, probing exploits the structure of f(A) to achieve an accurate result more quickly than statistical methods, yet can be used on matrices with an unknown structure. The basis of the method is to take some polynomial approximation $q_n(A) \approx f(A)$. If the convergence of this polynomial approximation is fast, this yields information about the most important elements of f(A). The original approach involved using a Neumann series polynomial for $A^{-1}$, which implies that, if the diagonal elements

of A are nonzero, the nonzero structure of $q_n(A)$ is the same as that of $A^n$. The graph coloring of $A^n$ can then be used as an approximate coloring for the graph of $A^{-1}$, in the

sense that the coloring constraints will only be violated by edges of small weights. This

provides useful information about the structure of $A^{-1}$, which can be used as in Figure

2.3. Unfortunately, probing requires raising A to high powers to achieve more accurate

results. This is not feasible for many matrices, since it can require a significant amount of computation and, perhaps even more importantly, a significant amount of storage.

In this chapter we propose methods based on the structural and spectral characteristics of f(A) directly. These methods can be combined with the statistical methods of [22] to achieve the attractive features of these statistical approaches, as well as the benefits of probing.

4.1 Graph Coloring

Probing as introduced in Chapter 2.2.2 works by attempting to find the most important elements of $A^{-1}$, or equivalently, a sparsified version of $A^{-1}$ matching the largest magnitude elements of $A^{-1}$. It does this by taking a polynomial approximation $q_n(A) \approx A^{-1}$. In the original proposal of probing, the polynomial was chosen as the Neumann series. If the polynomial converges, then its graph approximates a sparsified structure of $A^{-1}$ where the most important (large in magnitude) elements are kept. The goal is to find a degree for which the polynomial graph is sufficiently sparse to be colored with a reasonable number of colors. This is because if the graph associated with a matrix has a k-coloring,

then the diagonal of A can be recovered with k matrix-vector multiplications. To see how this recovery is possible, consider the structure of the matrix if all the nodes that share a color are reordered to be adjacent. Since sharing a color implies a lack of communication between the nodes, the matrix can be permuted into a block diagonal form, as shown in

Figure 2.3. Then, if we create the $p_m$ probing vector as in (2.4), then $A p_m$ holds the

diagonal elements in the positions of the m-th block.
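The recovery step can be sketched as follows. The Python fragment below is an illustration that uses A directly in place of f(A) and takes the probing vectors to be the indicator vectors of the colors; it is not meant to reproduce (2.4) exactly:

import numpy as np

def diag_by_probing(A, colors):
    """colors[i] is the color (0..k-1) of node i; recover the diagonal with k matvecs."""
    N = A.shape[0]
    k = max(colors) + 1
    diag = np.zeros(N)
    for m in range(k):
        p = (np.array(colors) == m).astype(float)   # probing vector of color m
        Ap = A @ p
        diag[p == 1] = Ap[p == 1]                    # exact when the coloring is valid
    return diag

# Toy check: a tridiagonal matrix is distance-1 colorable with 2 colors (even/odd nodes).
A = np.diag(np.arange(1.0, 7.0)) + np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
print(diag_by_probing(A, [i % 2 for i in range(6)]))   # [1. 2. 3. 4. 5. 6.]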

Since coloring algorithms only consider if two nodes are connected or not, the weights

of the edges are not used. Thus instead of the entire Neumann series $q_n(A)$, only $A^n$ need be computed, since the structure of the two matrices is the same. Since $A^n$ shows all

the connections between nodes after n hops, we can either find a distance 1 coloring of

$A^n$, or a distance n coloring of A. Both operations have similar complexity. Once the

probing vectors $p_m$ have been obtained, we use an iterative method to solve $Ax = p_m$.

Since iterative methods applied for large matrices may be expensive and slow to converge

to the desired accuracy, we need to minimize the number of colors used in order to keep

the number of probing vectors produced low.

Problems arise when raising A to a high enough power n for $\|q_n(A) - A^{-1}\|$ to be small. Even if A is sparse, $A^n$ can become dense quickly, which increases both the computational

and the storage requirements of the method. Additionally, while probing is intended

to exploit the structure of A, it throws out interesting structural information by not

considering the strength of the connections. Finally, the colors produced by probing at

different levels are unlikely to be nested subsets of each other. Nodes which previously

were given separate colors may later on be assigned the same color, which means that

results with previously generated probing vectors cannot be reused if increased accuracy

is desired; the entire computation must be repeated from scratch.

In this chapter we investigate two types of methods for providing better colorings than

probing. The first set of methods are similar to probing in that they are structural; they

try to directly detect and exploit the structure of f(A).

The second type of method attempts to use computed spectral information of A to

find an appropriate coloring for f(A)—similar to the well known techniques of spectral

clustering. While the algorithm we have developed is not very efficient, it serves as an

interesting way to examine the issues involved with the design of spectral methods, and a

starting point for future research into these types of approaches. It also provides an upper

bound on how well such methods could work.

We combine both types of methods with the statistical methods of [22] discussed in

Chapter 2.2.1, observing in many experiments that we can obtain a more accurate error

bound for Diag(f(A)) than would be possible using only statistical methods.

4.2 Statistical Considerations

While both [62] and [27] consider probing and probing-like ideas from the standpoint of a deterministic process, in practice our experiments show the error from the statistical methods in [22] to be much smaller than the deterministic error reported by probing.

Additionally, many applications (such as QCD) require information about the error and require the estimator to be unbiased. Fortunately, it is possible to combine statistical methods with probing, as shown in [66]. By generating a random vector $\zeta_m$ for each color block m and performing elementwise multiplication with each probing vector to form $p_m \odot \zeta_m$, we make our estimator statistically unbiased. In addition, if probing correctly identifies the smallest elements of $A^{-1}$ and groups them into colors, then the variance and thus the error estimate will be reduced, while providing the added benefit of a statistical confidence interval for the results.
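As a sketch of this combination (using A directly in place of f(A), and a single Rademacher vector for simplicity), the following Python fragment forms the vectors $p_m \odot \zeta$ and accumulates an unbiased diagonal estimate:

import numpy as np
rng = np.random.default_rng(0)

def unbiased_probing_diag(A, colors):
    """Probing vectors masked by Rademacher noise; E[est] = diag(A)."""
    N = A.shape[0]
    colors = np.asarray(colors)
    zeta = rng.choice([-1.0, 1.0], size=N)           # Rademacher noise
    est = np.zeros(N)
    for m in range(colors.max() + 1):
        x = np.where(colors == m, zeta, 0.0)         # p_m elementwise-multiplied by zeta
        est += x * (A @ x)                           # zeta_i^2 = 1 on the color block
    return est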

As noted in Table 2.1, the variance of Hutchinson's method is $2(\|A\|_F^2 - \sum_{i=1}^N A_{ii}^2)$. If a set of s random vectors $\zeta^k$ is chosen according to [22], and $\zeta_j^k$ refers to the j-th component of the k-th sample, then the diagonal estimator is given in [27] as

$$T_{\mathrm{diag}}A_i = A_{ii} + \sum_{j=1, j\ne i}^{N} a_{ij}\, \frac{\sum_{k=1}^{s} \zeta_i^k \zeta_j^k}{\sum_{k=1}^{s} (\zeta_i^k)^2}. \qquad (4.1)$$
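In practice this estimator is computed by accumulating elementwise products of the samples with their matrix-vector products, as in the following Python sketch (shown for A itself; for f(A) the product A ζ would be replaced by an application of f(A)):

import numpy as np
rng = np.random.default_rng(1)

def stochastic_diag(A, s):
    """Diagonal estimator of the form (4.1) with s Rademacher samples."""
    N = A.shape[0]
    num = np.zeros(N)
    den = np.zeros(N)
    for _ in range(s):
        zeta = rng.choice([-1.0, 1.0], size=N)
        num += zeta * (A @ zeta)      # A_ii + sum_{j != i} a_ij zeta_i zeta_j per sample
        den += zeta * zeta            # equals s for Rademacher vectors
    return num / den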

If the ζ are i.i.d. drawn from the Rademacher distribution, then for $j \ne i$ the expectation $E[\sum_{k=1}^s \zeta_i^k \zeta_j^k] = \sum_{k=1}^s E[\zeta_i^k \zeta_j^k] = 0$, implying that $E(T_{\mathrm{diag}}A_i) = A_{ii}$. We are interested in computing the variance $E[T_{\mathrm{diag}}A_i^2] - E[T_{\mathrm{diag}}A_i]^2 = E[T_{\mathrm{diag}}A_i^2] - A_{ii}^2$. For this, we notice that $E[(\sum_{k=1}^s \zeta_i^k \zeta_j^k)(\sum_{l=1}^s \zeta_i^l \zeta_m^l)] = s$ only when $k = l$ and $j = m \ne i$. Since $\sum_{k=1}^s (\zeta_i^k)^2 = s$, we have

$$E[T_{\mathrm{diag}}A_i^2] = a_{ii}^2 + 2 a_{ii} \sum_{j=1, j\ne i}^{N} a_{ij}\, E\!\left[ \frac{\sum_{k=1}^s \zeta_i^k \zeta_j^k}{\sum_{k=1}^s (\zeta_i^k)^2} \right] + \sum_{\substack{m,j=1 \\ m,j\ne i}}^{N} a_{ij} a_{im}\, E\!\left[ \frac{\sum_{k=1}^s \zeta_i^k \zeta_j^k}{\sum_{k=1}^s (\zeta_i^k)^2} \cdot \frac{\sum_{l=1}^s \zeta_i^l \zeta_m^l}{\sum_{l=1}^s (\zeta_i^l)^2} \right] = a_{ii}^2 + \frac{1}{s} \sum_{j=1, j\ne i}^{N} a_{ij}^2. \qquad (4.2)$$

Thus, $\mathrm{Var}(T_{\mathrm{diag}}A_i) = \frac{1}{s}\sum_{j=1, j\ne i}^{N} a_{ij}^2$, which means the estimator of each element depends on the off-diagonal elements of its row.

We can now bound how large the standard deviation is relative to the trace, which

provides information on the number of digits that can be achieved using these estimators.

For symmetric positive definite matrices and a given number of samples s, we have

$$\frac{\sqrt{2(\|A\|_F^2 - \sum_i A_{ii}^2)/s}}{\mathrm{Tr}(A)} \le \frac{\sqrt{2\|A\|_F^2/s}}{\mathrm{Tr}(A)} = \frac{\sqrt{2\,\mathrm{Tr}(A^2)/s}}{\mathrm{Tr}(A)} = \frac{\sqrt{2\sum_i \lambda_i^2/s}}{\sum_i \lambda_i} \le \frac{\sqrt{2(\sum_i \lambda_i)^2/s}}{\sum_i \lambda_i} = \frac{\sqrt{2}}{\sqrt{s}}. \qquad (4.3)$$

Although the bound may be pessimistic, it indicates that we should not have problems obtaining good relative estimates. On the other hand, for the diagonal estimator,

$$\frac{\sqrt{\sum_{j=1, j\ne i}^{N} a_{ij}^2}}{\sqrt{s}\,|a_{ii}|} \le \frac{\|A(i,:)\|}{\sqrt{s}\,|a_{ii}|} \le \frac{\|A\|}{\sqrt{s}\,|a_{ii}|}, \qquad (4.4)$$

so the relative error can be quite large for some diagonal entries. One exception, which is often useful in practice, is the case of diagonally dominant matrices. Then,

$$\frac{\sqrt{\sum_{j=1, j\ne i}^{N} a_{ij}^2}}{\sqrt{s}\,|a_{ii}|} \le \frac{\sum_{j=1, j\ne i}^{N} |a_{ij}|}{\sqrt{s}\,|a_{ii}|} \le \frac{1}{\sqrt{s}}, \qquad (4.5)$$

and the stochastic estimator can provide good relative estimates.

It is possible to investigate how well probing performs compared to statistical methods

such as Hutchinson's by modeling the magnitude of the elements removed by probing. As

noted in Table 2.1 the variance of Hutchinson's method is $2(\|A\|_F^2 - \sum_{i=1}^n A_{ii}^2)$. Since in the Hutchinson method the diagonals do not contribute to the variance, in our analysis we need only consider what happens to the off-diagonal part of the matrix. Because we will refer to this portion of the matrix frequently, we define $\tilde A$ as the matrix A with the diagonal removed.

When using Hutchinson's method, the variance comes from the entire $\tilde A$ matrix. In contrast, when using probing with a statistical estimator as above, the variance comes only from inside the block diagonals, such as those seen in Figure 2.3. This is because the contributions from outside the block diagonals are zeroed out. Since the block diagonals are the only parts of the matrix contributing to the variance estimation, we can simply compute $2\|\tilde A_i\|_F^2$ for each block, and then sum the results for all k blocks. This is equivalent to considering k independent Hutchinson methods, one for each color, with variance $2\|\tilde A_i\|_F^2$. Suppose that the k color blocks are all of equal size. Then, if N is the dimension of A, there are $k\,\frac{N^2}{k^2} - N = \frac{N^2}{k} - N$ elements of $\tilde A$ contributing to the variance, i.e., only those in the block diagonal. Equivalently, the k-coloring discards $\frac{k-1}{k} N^2$ elements from contributing to the variance. Assume in addition that we have sorted all the elements of $\tilde A$ in monotonically decreasing order and that increasing the number of colors discards off-diagonal elements starting from the largest in the list. More specifically, $G = \mathrm{sort}(\tilde A_{ij}^2)$ would be an array of $N^2$ elements with the zero diagonal elements at the end. We model

this array as a monotonically decreasing function g(x), where the input $x = \frac{k-1}{k} \in [0, 1]$ is the percentage of the discarded elements.

We can begin to model the variance that is removed given certain assumptions about

g(x). First, we assume that probing finds the largest elements of A˜ in monotonically

decreasing manner and removes them. Then, with 2 probing vectors we have $\mathrm{Var}_2 = \int_{1/2}^{1} g(x)\,dx$, since with two probing vectors half the elements will be discarded. In general, with k probing vectors (i.e., with k colors), $\mathrm{Var}_k = \int_{1-1/k}^{1} g(x)\,dx$. If g(x) is constant, i.e., all the elements of $\tilde A$ are the same, then we have $\mathrm{Var}_k/\|\tilde A\|_F^2 \sim O(\frac{1}{k})$. On the other hand, if $g(x) \sim (1 - x)$, then we have that $\mathrm{Var}_k/\|\tilde A\|_F^2 \sim O(\frac{1}{k^2})$. This analysis implies that even if there is no structure to be exploited, as long as the nodes are divided up into color blocks of equal size, probing will do at least as well as the purely statistical methods of Hutchinson. Therefore algorithms should attempt to make the color sizes as equal as possible. Moreover, as long as significant structure exists in the matrix, probing should be able to outperform statistical methods.

4.3 Structural Methods

As previously discussed, the major problem with probing arises when the series qn(A) converges slowly to f(A). To see this issue in action consider the example in Figure 4.1. In this example, we consider a 4D Laplacian with periodic boundary conditions and compare probing using polynomials of order ranging from 1–8 against computing the pseudo-inverse

$A^\dagger$ directly, dropping the elements of the smallest magnitude (that is, sparsifying the matrix), and then coloring the resultant matrix. We take the pseudo-inverse because this operator has one zero-valued eigenvalue. This coloring of the sparsified version of $A^\dagger$, while not practical, provides information on how close the coloring provided by probing is to the optimal coloring. As we can see, probing is close to the optimal when few colors are considered, but as the number of colors increases the difference between the two methods becomes significant.

To address this, we seek to capture the structure of f(A) more directly. We propose here two methods to achieve this. The first method attempts to capture the off-diagonals of f(A). Many matrices of interest have an inverse with a banded structure. This implies that we should be able to find a coloring that captures most of the structure of f(A).

Indeed, this was behind the original idea of probing, where the authors note that if the coloring distance is increased far enough, then the major off-diagonals of the matrix will be captured. Similarly, the authors of [27] note that if a large enough number of columns of the

81 0

10 Probing Coloring of Sparsified A†

−1 10

−2 10 Relative Trace Error

−3

10 0 1 2 3 4 10 10 10 10 10 Colors

Figure 4.1: Probing vs Coloring the structure of L† directly, where the percentage of the weight of L† retained varies from .1 to .5. in .05 increments. As the number of colors increases probing struggles to capture the structure of L†.

Hadamard matrix is used, the off-diagonals of interest will be removed from contributing

to the error estimate. However, these are both indirect methods of achieving the goal

of discovering which nodes of the underlying graph of f(A) are connected via an off-

diagonal with large values. To gain insight into how it might be possible to discover

such connections, consider the sparsity plot of the matrix containing several off-diagonals

below. If we take a sampling of the columns of the matrix and assume that the line the off-diagonals follow has a slope of 1, then, for any two columns sampled at a distance k from each other, shifting one of the samples up by k makes the non-zero structure of the two samples the same, as can be seen in Figure 4.2, where we sample four columns.

Of course, in the case of f(A), the output is likely to be dense, so matching up samples

of f(A) and seeing where the non-zeros match is not possible. However, if we obtain some

columns of f(A) using an iterative method, and then sparsify them by dropping some

percentage of the elements with the smallest absolute values, we can then compare the

structure of the samples, and check to see where the non-zero values line up. If a significant

number of the sparsified samples share the same non-zero structure, it is likely there is

an off-diagonal at that location. We can then use these off-diagonals to compute an

approximate coloring for f(A). This approach is shown in Algorithm 12.

Figure 4.2: Sampling 4 columns from A and then shifting to detect if they share an off-diagonal.
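The shift-and-match idea behind Algorithm 12 (below) can be sketched in numpy as follows. This is a simplified illustration that uses a cyclic shift and our own function names:

import numpy as np

def detect_offdiags(V, cols, eps):
    """V[:, k] is a sparsified sample of column cols[k] of f(A); return detected offsets."""
    shifted = np.zeros_like(V)
    for k, col in enumerate(cols):
        shifted[:, k] = np.roll(V[:, k], -col)     # align entries on a common off-diagonal
    row_weight = np.abs(shifted).sum(axis=1)       # rows where many samples agree
    return np.nonzero(row_weight >= eps)[0]        # offsets of likely off-diagonals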

The second method we consider is based on the matrix-matrix multiplication approximation method proposed in [2]. They note that if some sampled columns $v = A(:, i)$ of a matrix are obtained, then $vv^T \approx A^2$. Since we have obtained the columns v of f(A) to estimate the off-diagonals of f(A), we have $vv^T \approx f(A)^2$. Since we want $ww^T = f(A)$,

we perform a QR factorization of v. Then $v = QR$ and $(vv^T)^{1/2} = (QRR^TQ^T)^{1/2} =$

$Q(RR^T)^{1/2}Q^T$. We can obtain $(RR^T)^{1/2}$ using the SVD, since R is a small dense matrix.

This procedure is shown in Algorithm 13.
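A numpy sketch of this construction follows, under the assumption (suggested by the derivation above) that the SVD is applied to the small factor R; compare Algorithm 13 below:

import numpy as np

def create_w(V):
    """V holds sampled columns of f(A); return w with w w^T = Q (R R^T)^(1/2) Q^T."""
    Q, R = np.linalg.qr(V)                 # V = Q R (reduced QR)
    U, s, Vt = np.linalg.svd(R)            # R = U diag(s) Vt  (assumption: SVD of R, not of V)
    return Q @ U @ np.diag(np.sqrt(s)) @ Vt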

The approximation $ww^T$, however, is still dense. To ensure a sparse approximation to f(A), we again sparsify w by dropping the smallest magnitude elements. While it is likely that $\|ww^T - f(A)\|$ will be large, what is important for our application is not the difference in the value of the elements of the two matrices, but how close the orderings of the elements of the matrices are, since these are the elements that probing will discard.

To measure this, consider the two sets $o_1$ and $o_2$ that contain the index pairs (i, j)

of the non-zero elements of the matrices f(A) and $ww^T$, respectively. Define $p_1$ to be the permutation of $o_1$ that sorts the corresponding elements of f(A) from largest to

smallest magnitude. Define also $p_2$, the permutation of $o_2$ that sorts the elements of $ww^T$ from largest to smallest magnitude. Since $ww^T$ is a sparse matrix, it has fewer elements, $z = \mathrm{size}(o_2) \le \mathrm{size}(o_1)$. We then want to check how closely the index pairs (not the matrix

Algorithm 12 [D] = DetectOffdiags(v, L, ε)
% Input: v columns of f(A), L array containing column locations, a tolerance ε
% Output: D a matrix of important off-diagonals
1: v ← Sparsify(v)
2: newv ← [ ]
3: for i = 1 → size(v, 2) do newv ← [newv shift(v(:, i), −L(i))]
4: end for
5: newvsums ← sum(abs(newv))
6: diaglocations ← [ ]
7: for i = 1 → size(v, 1) do
8:   if newvsums(i) ≥ ε then diaglocations(i) ← 1
9:   end if
10: end for
11: D ← diag(diaglocations)
12: return D

Algorithm 13 [w] = CreateW(v, ε)
% Input: v columns of f(A)
% Output: w for use in the approximation $ww^T$
1: v ← sparsify(v, ε)
2: [Q, R] ← QR(v)
3: [U, S, V] ← SVD(R)
4: w ← Q U S^{1/2} V^T
5: return w

elements) in $o_2(p_2)$ match the first z index pairs in $o_1(p_1)$ by computing $o_1(p_1) \cap o_2(p_2)$. If the size of the intersection is equal to z, then we have the best possible approximation over

all matrices with that number of non-zero elements, in the sense that we have captured

the z most important elements of f(A).
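This overlap measure can be sketched as follows (an illustrative Python fragment; for simplicity it compares the z largest-magnitude index pairs of the two matrices rather than restricting to declared non-zeros):

import numpy as np

def top_index_pairs(M, z):
    """The z index pairs of M with the largest magnitudes."""
    flat = np.argsort(np.abs(M), axis=None)[::-1][:z]
    return set(zip(*np.unravel_index(flat, M.shape)))

def structure_overlap(fA, wwT, z):
    """Fraction of the z most important index pairs of f(A) captured by w w^T (1.0 is best)."""
    return len(top_index_pairs(fA, z) & top_index_pairs(wwT, z)) / z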

It is interesting to contrast the parts of the structure of f(A) that the two methods

find. Algorithm 12 attempts to detect the global structure of the matrix, while Algorithm

13 detects more local structure. Figure 4.3 shows the effect of progressively increasing the

number of columns from f(A). As the number of columns increases, the local areas of the

matrix f(A) that are well approximated expand. Our experiments reveal that there is a

transition point with this method, where after a certain number of columns, the coloring

produced starts to be extremely effective, although for large matrices this point may come

too late to be practical. A similar transition point can be seen with the density, where

as the sparsity of v decreases, there is a point where the coloring becomes substantially

better, although again, this point can come too late to be practical.

Figure 4.3: Approximation of the pseudo-inverse of a Laplacian with periodic boundary conditions with 10, 100, and 1000 vectors in the $vv^T$ approximation. Here the vectors v are not sparsified; the figure shows how, in the best case of an unsparsified v, $vv^T$ contains mostly local structure until a significant number of vectors v are supplied.

A significant drawback of these two methods is that it is difficult to provide any bounds on the error associated with using the colorings they produce. Matrices which have a significant number of off-diagonals or other highly ordered structure will likely have this structure captured by these methods. Less structured matrices may be more challenging, and it may be hard to tell the difference a priori. Another drawback is that it is difficult to tell what level of sparsification should be applied to the vectors used by these methods.

One possible method is for the user to supply a maximum number of colors, representing the limit on the number of systems f(A)z to be solved, and then perform a bisection on the

densities of the two methods, doubling or halving the value until the required number of

colors is obtained. However, this still leaves open the question of the starting densities

to be used. While for some classes of matrices the user may have a good idea what will

work best, in the general case it is not possible to know the optimal value. A possible

heuristic is to examine the sampled columns v, and determine where the sharpest drop is

in the value of the sorted components. This point can then be used as a starting point

which can be refined by the previously described bisection method.

While we leave further investigation for future work, we note the similarities between

this method and the Nyström method [3] for approximating symmetric positive semi-def-

inite matrices. In the classic Nyström method a certain number of columns C are ran-

domly selected from the matrix A for which an approximation is desired. The intersec-

tion matrix W is then formed by finding the intersection of the rows and columns of C, or C = A(:, cols); W = C(cols, :); in MATLAB notation. The pseudoinverse $W^\dagger$ is then formed. If many columns of C are selected, it may not be feasible to find the pseudoinverse

of W, in which case the best rank-k approximation $W_k^\dagger$ is formed. Then $A \approx C W^\dagger C^T$. There are several variations on this method which are possible, such as taking an ensemble of such approximations or replacing the intersection matrix W with the matrix

$C^\dagger A (C^\dagger)^T$. The question then is how to best select C. If additional columns of C could be selected adaptively, this method could allow the derivation of error bounds, in contrast to our current method for which it is difficult to prove any bounds. If the bounds are tight

enough, it might even make sense to directly compute $\mathrm{Tr}(C W^\dagger C^T)$ or $\mathrm{Diag}(C W^\dagger C^T)$,

and dispense with coloring entirely, relying on the theoretical error bounds instead of the

statistical error measurements for guarantees for the accuracy.
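For concreteness, a minimal MATLAB sketch of the basic (non-adaptive) Nyström construction described above might look as follows; the column count and the uniform sampling are illustrative choices, not the adaptive scheme of [3].

% Basic Nystrom sketch for a symmetric positive semi-definite matrix A.
n    = size(A, 1);
c    = 50;                     % number of sampled columns (illustrative)
cols = randperm(n, c);         % uniform sampling without replacement
C    = A(:, cols);
W    = C(cols, :);             % intersection matrix
Anys = C * pinv(W) * C';       % A is approximated by C W^+ C^T
d    = diag(Anys);             % Diag(C W^+ C^T), as discussed above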

4.4 Spectral Methods

Spectral methods for finding communities in a graph have been well understood for some

time [6]. The general basis for these methods comes from the graph partitioning problem.

Given a graph G = (V,E) and associated graph Laplacian L,

L_{i,j} := \begin{cases} \deg(v_i) & \text{if } i = j, \\ -1 & \text{if } i \neq j \text{ and } v_i \text{ is adjacent to } v_j, \\ 0 & \text{otherwise,} \end{cases} \qquad (4.6)

the nodes must be divided into groups of nearly equal size, S and its complement \bar{S}, in such a way that the weight of the edges between the groups is minimized. We can formalize this problem as follows. Define φ(S) as

\phi(S) = \frac{|E(S, \bar{S})|}{\min(|S|, |\bar{S}|)} = \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2}. \qquad (4.7)

Then define the isoperimetric number of a graph as the value of the minimum cut

\phi_{opt} = \min_{S \subseteq V} \phi(S). \qquad (4.8)

Unfortunately, this problem is NP-Complete. However, if instead of requiring the elements of the solution vector x to be in {−1, 1}, the problem is relaxed by allowing the solution elements to take real values, then we can obtain a solution as follows

\phi_{opt} \approx \min_{x \in \mathbb{R}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i<j} (x_i - x_j)^2} = \min_{x \in \mathbb{R}^n} \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{n \sum_{i=1}^{n} x_i^2} = \min_{x \in \mathbb{R}^n} \frac{x^T L x}{n\, x^T x}. \qquad (4.9)

By the Courant–Fischer theorem, this is minimized by v_2, the eigenvector of the second smallest eigenvalue λ_2 of L, known as the Fiedler vector. While this relaxation is sufficient for regular graphs, the normalized cut is frequently preferred for irregular graphs. If we define vol(S) = \sum_{v_i \in S} \deg(v_i), then the normalized edge cut is defined as

\hat{\phi}(S) = \frac{|E(S, \bar{S})|}{\min(\mathrm{vol}(S), \mathrm{vol}(\bar{S}))}. \qquad (4.10)

In contrast to (4.7), (4.10) divides by the total degree of the nodes in each part instead of simply the number of vertices. This generally turns out to be a more robust measure

[6]. In this case, the solution of the relaxed problem is related to the spectrum of the normalized graph Laplacian, \mathcal{L}, defined as in (4.11), where D is the diagonal matrix of

the degrees of the nodes of the graph, and L is the graph Laplacian of (4.6),

\mathcal{L} = D^{-1/2} L D^{-1/2}. \qquad (4.11)

By performing the same relaxation as in (4.9), we obtain the Cheeger bounds on \hat{\phi}_{opt}, where

λ_2 is the second smallest eigenvalue of \mathcal{L},

\frac{\hat{\phi}_{opt}^{2}}{2} \le \lambda_2 \le 2\hat{\phi}_{opt}. \qquad (4.12)

It is possible to continue this process of finding an approximation to the best edge cut recursively by computing the Fiedler vector of the induced subgraph of each partition.

This allows for an arbitrary number of partitions of the graph to be obtained. This process is known as recursive spectral bisection [7].
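A minimal MATLAB sketch of one bisection step based on (4.6) and (4.11) is shown below; W is assumed to be a symmetric adjacency matrix with no isolated vertices, and a full eigendecomposition is used only because this is a sketch.

% One step of spectral bisection using the normalized Laplacian (4.11).
d  = sum(W, 2);
L  = diag(d) - W;                                  % graph Laplacian (4.6)
Ln = diag(1 ./ sqrt(d)) * L * diag(1 ./ sqrt(d));  % normalized Laplacian
[V, E]   = eig(full(Ln));
[~, ord] = sort(diag(E));                          % ascending eigenvalues
fiedler  = V(:, ord(2));                           % second smallest eigenvector
S    = find(fiedler >= median(fiedler));           % one side of the cut
Sbar = find(fiedler <  median(fiedler));           % its complement
% Recursing on W(S,S) and W(Sbar,Sbar) gives recursive spectral bisection.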

While recursive bisection is the simplest method of spectral clustering, there are many other proposals. In fact, [8] lists five distinct classes of algorithms for finding k-clusters, based on the lower part of the spectrum of L. The first of these is described as linear ordering, where the nodes are reordered based on the smallest eigenvector recursively, i.e., recursive spectral bisection. We investigate a heuristic based on this idea, and leave

a more thorough investigation of ways to adapt algorithms for spectral clustering to our problem of interest for future work.

In seeking to exploit the structure of a graph for trace estimation, probing tries to solve the opposite problem to that of community detection. Instead of trying to find communities with a large amount of intra-community interactions, probing seeks to find groups of nodes that have no or few interactions between each other. In this case, [9] has shown that the function we should seek to optimize is

\bar{\phi}(S) = \frac{2|E(S, \bar{S})|}{\mathrm{vol}(S) + \mathrm{vol}(\bar{S})}, \qquad (4.13)

a measure analogous to the normalized cut in (4.10). As in the community detection problem, we seek to optimize this function as

\bar{\phi}_{opt} = \max_{S} \frac{2|E(S, \bar{S})|}{\mathrm{vol}(S) + \mathrm{vol}(\bar{S})}. \qquad (4.14)

While this is again an intractable problem, the relaxed version admits a solution related to the spectrum of \mathcal{L}, with the solution vector being the eigenvector of the largest eigenvalue of the normalized graph Laplacian. It is possible to bound the error of the relaxed version of this problem using an adjusted version of the Cheeger bounds [9],

\frac{(1 - \bar{\phi}_{opt})^{2}}{2} \le 2 - \lambda_N \le 2(1 - \bar{\phi}_{opt}). \qquad (4.15)

The relationship between (4.14) and the relaxed quadratic form is more difficult to see in this case than it is for (4.10). The derivation of the upper and lower bounds in (4.15) is rather involved and can be found in [9]. Here we present the derivation only of the upper bound. In [9] the authors partition the graph nodes into three sets: V_1 and V_2 are the two nearly bipartite sets, and V_3 = V \setminus (V_1 \cup V_2) contains the rest of the nodes, which cannot be placed in either bipartite set. They then note that vol(V_i) = \sum_{j=1}^{3} |E(V_i, V_j)|. With this, (4.14) can be rewritten as

\bar{\phi}_{opt} = \max_{V_1, V_2} \frac{2|E(V_1, V_2)|}{\sum_{j=1}^{3} |E(V_1, V_j)| + \sum_{j=1}^{3} |E(V_2, V_j)|}. \qquad (4.16)

Hence, the connection to a quadratic form based on the eigenvectors of \mathcal{L} follows,

\begin{aligned}
\lambda_N &= \max_{x \in \mathbb{R}^n} \frac{x^T \mathcal{L} x}{x^T x} \\
&\ge \frac{\left(\frac{1}{\mathrm{vol}(V_1)} + \frac{1}{\mathrm{vol}(V_2)}\right)^{2} |E(V_1,V_2)| + \frac{1}{\mathrm{vol}(V_1)^2}\,|E(V_1,V_3)| + \frac{1}{\mathrm{vol}(V_2)^2}\,|E(V_2,V_3)|}{\frac{1}{\mathrm{vol}(V_1)} + \frac{1}{\mathrm{vol}(V_2)}} \\
&\ge \frac{(\mathrm{vol}(V_1) + \mathrm{vol}(V_2))^{2}}{2\,\mathrm{vol}(V_1)\,\mathrm{vol}(V_2)} \cdot \frac{2|E(V_1,V_2)|}{\mathrm{vol}(V_1) + \mathrm{vol}(V_2)} + \frac{\min(\mathrm{vol}(V_1),\mathrm{vol}(V_2))}{\max(\mathrm{vol}(V_1),\mathrm{vol}(V_2))} \cdot \frac{|E(V_1 \cup V_2, V_3)|}{\mathrm{vol}(V_1) + \mathrm{vol}(V_2)} \\
&\ge 2\bar{\phi}_{opt} + \frac{\min(\mathrm{vol}(V_1),\mathrm{vol}(V_2))}{\max(\mathrm{vol}(V_1),\mathrm{vol}(V_2))} \cdot \frac{|E(V_1 \cup V_2, V_3)|}{\mathrm{vol}(V_1) + \mathrm{vol}(V_2)} \\
&\ge 2\bar{\phi}_{opt}.
\end{aligned}

Here the second line is obtained by evaluating the quotient at x = D^{1/2} y, where y takes the value 1/vol(V_1) on V_1, -1/vol(V_2) on V_2, and 0 on V_3.
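To make the quantities in (4.13)–(4.16) concrete, a small MATLAB sketch that extracts a near-bipartite split from the top eigenvector of the normalized Laplacian and evaluates the dual cut measure might read as follows (Ln is the normalized Laplacian of (4.11) and W the adjacency matrix; both are assumed given).

% Near-bipartite split from the eigenvector of the largest eigenvalue of Ln.
[V, E]   = eig(full(Ln));
[~, ord] = sort(diag(E), 'descend');
vtop = V(:, ord(1));                       % eigenvector of the largest eigenvalue
S    = find(vtop >= 0);                    % one side of the split
Sbar = find(vtop <  0);                    % the other side
volS    = sum(sum(W(S, :)));               % vol(S)
volSbar = sum(sum(W(Sbar, :)));            % vol(Sbar)
phibar  = 2 * sum(sum(W(S, Sbar))) / (volS + volSbar);   % dual cut (4.13)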

4.4.1 Spectral k-partitioning for the matrix inverse

Since our goal is to find the structure of the matrix inverse, we turn our attention to L†, instead of L. We need the pseudoinverse because the graph Laplacian is singular, L D^{1/2}\mathbf{1} = 0. Note that L† D^{1/2}\mathbf{1} = 0 as well, and that L† has the same eigenvectors as L, only ordered in the opposite order. Let us consider L† as a weighted graph with adjacency matrix,

Γ = L† − Diag(L†), and ignore the fact that some weights may be negative. Then for the

(weighted) Laplacian of Γ,

L_{\Gamma} = \mathrm{Diag}(\Gamma \mathbf{1}) - \Gamma = \mathrm{Diag}\!\left(L^{\dagger}\mathbf{1} - \mathrm{Diag}(L^{\dagger})\mathbf{1}\right) - L^{\dagger} + \mathrm{Diag}(L^{\dagger}) = -\mathrm{Diag}(L^{\dagger}) - L^{\dagger} + \mathrm{Diag}(L^{\dagger}) = -L^{\dagger}. \qquad (4.17)

This implies that the graph Laplacian of the pseudoinverse shares the same eigenvectors as L, in the same order. Thus, the same partition is a solution to the relaxed version

of (4.16) for both matrices.
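As a quick numerical sanity check of (4.17), the identity can be verified on a small random graph (a throwaway MATLAB sketch; any connected graph will do):

% Verify numerically that the Laplacian of Gamma equals -pinv(L).
n = 20;
W = double(rand(n) > 0.7);  W = triu(W, 1);  W = W + W';  % random adjacency
L = diag(sum(W, 2)) - W;                                   % graph Laplacian
Ldag   = pinv(L);                                          % pseudoinverse
Gamma  = Ldag - diag(diag(Ldag));                          % adjacency of (4.17)
LGamma = diag(Gamma * ones(n, 1)) - Gamma;                 % its weighted Laplacian
norm(LGamma + Ldag, 'fro')                                 % should be close to 0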

This suggests a connection between spectral methods and probing. Probing

considers the powers A^n of A in order to learn about the structure of f(A). Similarly, the power method takes matrix-vector products with A on a starting vector v_0, computing A^n v_0, which is known to converge to the direction of the largest eigenvector. Thus, spectral methods short-cut the intermediate steps of the low order polynomial representations, and skip straight to the same information that the highest order polynomials provide.
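In MATLAB terms, this observation is just power iteration (a schematic fragment; A is assumed to be any symmetric matrix with a dominant eigenvalue):

% A^n * v0 aligns with the dominant eigenvector of A.
v = randn(size(A, 1), 1);
for k = 1:100
    v = A * v;
    v = v / norm(v);       % normalized matrix-vector products
end
% v now approximates the eigenvector of the largest-magnitude eigenvalue.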

The idea of finding communities with a low number of intra-community links using the top eigenvectors suggests that many of the algorithmic ideas used for spectral clustering might be adapted to the opposite problem. An obvious first algorithm is to obtain the largest eigenvector of the graph, and use it to divide the graph into two groups that are as close to bipartite as possible. Then we apply the method recursively on the induced subgraph for each group. We continue until a maximum number of groups is reached, or until the largest eigenvalue of the Laplacian of the partitioned matrices is too small to allow for a good bipartitioning. This method has the obvious advantage that the colors we produce will be nested subsets of each other, allowing us to continue probing without discarding previous results.

Unfortunately, this method has limited practicality when applied to f(A) = L†. While the eigenvectors needed at the first level are easily obtainable since they match those of

L, the required eigenvectors at the next level are more difficult to compute. The recursive process requires the eigenvectors of submatrices of L†, but we do not explicitly have these submatrices available. We could use iterative methods to compute their eigenvectors, but each iteration would need a matrix-vector product with the submatrix we do not have.

Therefore, for each matrix-vector product and for each submatrix, we must solve a linear system with L. With even a few color blocks, this may cause the process to become infeasible. Further, the diagonal elements of the submatrices do not quite correspond to the elements of the graph Laplacians of the subgraphs of L†, meaning we will obtain partitions which are not quite correct.

Nevertheless, we report this algorithm for the following reasons. First, if the discrepancy between the diagonal elements of L† and the appropriate Laplacian is not large, which is the case in our observations, then the algorithm has the same theoretical support as recursive spectral bisection for partitioning. In fact, for a given number of colors, the algorithm finds very good quality colorings, so it serves as a proxy for an upper bound on what spectral methods can achieve. Second, while the approach is not practical for f(A) = L†, it may be practical for other f(A), such as f(A) = A^n. We show our approach

in Algorithm 14.

Algorithm 14 c ← SpectralBisection(L, k)
% Input: Laplacian matrix L, desired number of partitions k
% Output: a coloring c
1: [evecs, evals] ← eigs(L)
2: n ← size(L, 1)
3: % Get a permutation p that sorts the leading eigenvector v = evecs(:, 1)
4: [vs, p] ← sort(evecs(:, 1))
5: ip(p) ← [1 : n]  % inverse permutation
6: m ← floor(n/2)
7: if k == 0 then
8:   c(p(1 : m)) ← 1
9:   c(p(m+1 : n)) ← 2
10: else
11:   L1 ← L(p(1 : m), p(1 : m))  % principal submatrices in the sorted ordering
12:   L2 ← L(p(m+1 : n), p(m+1 : n))
13:   smallc1 ← SpectralBisection(L1, k − 1)
14:   smallc2 ← SpectralBisection(L2, k − 1)
15:   nextC ← [smallc1, smallc2 + max(smallc1)]
16:   % Return coloring to the original ordering
17:   c ← nextC(ip)
18: end if
19: return c
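As a usage sketch (assuming the listing above is implemented as a MATLAB function of the same name), nested colorings are obtained simply by increasing the recursion depth:

% Hypothetical driver: each deeper call refines the previous coloring.
c2 = SpectralBisection(L, 0);   % 2 colors
c4 = SpectralBisection(L, 1);   % 4 colors, splitting each 2-color group
c8 = SpectralBisection(L, 2);   % 8 colors, splitting each 4-color group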

4.5 Experimental Results

In this section we present two sets of experimental results. The first set are Laplacian matrices from various graphs, chosen to help test our spectral approach. For these matrices, we form the pseudoinverse L†, since the matrices are singular, and then use this inverse to directly compute the variance and the error. The second set are various matrices selected from the Matrix Market sparse matrix collection [12], chosen because they were used as test cases for [27]. Since these matrices were all chosen to be invertible, we form A^{-1} directly, and then use this inverse to compute the variance and error.

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the hierarchical spectral, probing, structural, and pure statistical methods.]

Figure 4.4: The variance and relative trace error of the trace estimator for the 8^4 Laplacian with periodic boundary conditions. (Density range .1–.5; 50 vectors v; 20 diagonals; probing power 9.)

While only the graph Laplacians have results for our spectral method, we show the results of our structural approaches for all the graphs. For each matrix we take 50 random columns v, and then apply the sparsity listed with each figure. The number of columns in v is held constant, but the density is varied. We select a starting percentage of sparsification using the heuristics we described in Section 4.3. We then increase this density every step. While we initially increase the density by increments of 10 percent, if this proves to be either too little or too much change, we use recursive bisection to adjust the increment. This process continues until the maximum color budget is reached. This maximum budget varies from matrix to matrix, because some of the examples are small, but is always set to be less than one thousand, which is the maximum we expect users to employ in real-world scenarios. We also use our heuristics to pick the number of diagonals the model uses, which is then held constant throughout the process. For the first set of

93 3 −2 10 10 Hierarchical Spectral Hierarchical Spectral Probing Probing 2 Structural Methods Structural Methods 10 Pure Statistical Pure Statistical

−3 10 1 10

0

Variance 10 −4 10

−1 Relative Trace Error 10

−2 −5

10 0 2 4 10 0 2 4 10 10 10 10 10 10 Colors Colors Density Range Vs Probing Powers .1 − .47 50 8 4

Figure 4.5: The variance and relative trace error of the trace estimator for a randomly generated scale free graph. experiments we show only the statistical approach described in Section 4.2, forming new probing vectors as ζ P , where ζ is a noise vector. For the second set of results we show in addition the results with the original Hadamard probing vectors without the statistical approach, since these matrices were originally chosen to show how Hadamard vectors can remove specific diagonals. Finally, we also show how much the Hadamard method reduces the variance, which indicates how well it would preform if used as the starting point of a

Monte Carlo method.

The first graph we examine in Figure 4.4 is the 4D Laplacian differential operator with periodic boundary conditions on a regular lattice. These types of graphs occur in

LQCD. This is the graph Laplacian test case where the structural methods and classical probing perform the best, remaining competitive with the spectral method. This is most likely due to the highly structured nature of the lattice, where certain off-diagonals take up most of the weight of A^{-1}. Next we examine a synthetic scale-free graph with 10000 nodes, generated using the CONTEST MATLAB toolbox [13], the results of which we

see in Figure 4.5.

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the hierarchical spectral, probing, structural, and pure statistical methods.]

Figure 4.6: The variance and relative trace error of the trace estimator for the wiki-vote graph. (Density range .1–.68; 50 vectors v; 1 diagonal; probing power 3.)

A scale-free graph is a network which has a degree distribution following a power law, P(k) \sim k^{-\gamma}, where k is the degree and γ is a constant chosen

to best fit the observed data. These types of graphs are of interest because many real

world networks such as social networks, the internet, and semantic networks [14, 15, 16],

are thought to be networks of this type. Here the spectral method outperforms all other

approaches, with the structural methods performing the worst. The final two graphs in Figures 4.6 and 4.7 are small social network graphs, showing a Wikipedia voting network,

and a p2p gnutella file sharing network. In both social network experiments we see the

structural methods perform poorly, while the hierarchical spectral method works well for

both graphs.

All these graphs have the advantage that they are small enough to be directly inverted

so that the actual error and variance may be determined. In all the example graphs the

spectral methods tend to perform well, while probing and other structural methods tend to perform poorly on the less structured graphs, especially when it comes to variance

reduction. More importantly, the variance of the spectral method tends to reduce faster than the Monte Carlo method, which shows promise for developing approximate spectral algorithms that have a similar effect.

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the hierarchical spectral, probing, structural, and pure statistical methods.]

Figure 4.7: The variance and relative trace error of the trace estimator for the p2p-gnutella05 graph. (Density range .1–.92; 50 vectors v; 1 diagonal; probing power 4.)

Our structural methods have better success on the second set of matrices than they do on the graph Laplacian test cases. In general, they provide an improvement over probing and purely statistical methods. In the gre512 and orsreg test cases, the structural methods perform as well as or slightly better than probing, probably because both methods are finding the same dominant off-diagonals. The mhd416 matrix achieves similar results as probing for the range of colors it explores, but terminates early, because even with very low amounts of sparsification, the algorithm can detect very little structure. The nos6 and bcsstk07 matrices start off as being comparable to probing, but start to surpass it significantly as the density increases. The af23560 matrix is unique among the test cases because the structural methods never manage to surpass probing. However, the af23560 matrix clearly has little structure to make use of, since statistical methods, probing, Hadamard, and our method all perform the same.

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the Hadamard, probing, structural, and pure statistical methods.]

Figure 4.8: The variance and relative trace error of the trace estimator for the gre512 matrix. (Density range .1–.9; 50 vectors v; 418 diagonals; probing power 16.)

Overall, our results indicate that our proposed structural methods can provide an improvement over probing at relatively low cost, even for matrices where probing should be expected to perform very well, that is, matrices with a sharp decay away from the diagonal.

4.6 Conclusions

We have presented two classes of methods for estimating the trace of the inverse of an arbitrary matrix, which we have experimentally shown to be superior to both probing and purely statistical methods. The first class of these methods seeks to detect and exploit structural features of the matrix, and the second class seeks to use spectral information of the matrix to leverage ideas from algorithms designed for community detection and partitioning. We have also identified several promising areas of future research. In the

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the Hadamard, probing, structural, and pure statistical methods.]

Figure 4.9: The variance and relative trace error of the trace estimator for the orsreg matrix. (Density range .1–.99; 50 vectors v; 235 diagonals; classical probing power 16.)

structural arena, the prospect of leveraging research in adaptive sampling for use with the

Nyström method may make these types of approaches more useful, due to the potential to use some of the theoretical error bounds these ideas provide, making the structural results more reliable. In the spectral arena, the challenge is to find algorithms that approximate the recursive spectral bisection of Algorithm 14 efficiently. Moreover, there are many other classes of spectral clustering algorithms which we have not yet experimented with that could yield improved algorithms for finding good colorings.

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the Hadamard, probing, structural, and pure statistical methods.]

Figure 4.10: The variance and relative trace error of the trace estimator for the mhd416 matrix. (Density range .1–.999; 50 vectors v; 115 diagonals; probing power 16.)

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the Hadamard, probing, structural, and pure statistical methods.]

Figure 4.11: The variance and relative trace error of the trace estimator for the nos6 matrix. (Density range .1–.999; 50 vectors v; 181 diagonals; probing power 16.)

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the Hadamard, probing, structural, and pure statistical methods.]

Figure 4.12: The variance and relative trace error of the trace estimator for the bcsstk07 matrix. (Density range .1–.999; 50 vectors v; 248 diagonals; probing power 16.)

[Figure: variance (left panel) and relative trace error (right panel) versus number of colors, comparing the Hadamard, probing, structural, and pure statistical methods.]

Figure 4.13: The variance and relative trace error of the trace estimator for the af23560 matrix. (Density range .3–.9; 50 vectors v; 2623 diagonals; probing power 16.)

Chapter 5

Conclusion and future work

In this work we have investigated ways in which Diag(f(A)) may be computed when information about the structure of A is known, and we have explored algorithms for discovering such information when it is not known a priori. Specifically, we have made use of geometric, structural, and spectral methods to improve on the results of probing. We have also developed a method for combining this information with the statistical methods of [22] in order to obtain unbiased methods that also provide statistical error bounds on the estimates. Further, we have provided a framework for analyzing which matrices are likely to see improvement over purely statistical methods, based on how rapidly the largest elements of the matrix decay. Finally, we have performed experiments on many different test cases, ranging from applications in LQCD, social network graphs, engineering, and PDEs. In some cases we obtained an order of magnitude speedup compared to previously known methods.

5.1 Methods for Lattices

Our methods greatly improved prior methods for estimating Diag(f(A)) in matrices arising from PDEs discretized onto lattices. We have introduced two methods: an extremely efficient algorithm based on binary arithmetic, which works when the dimensions of the matrix can be factored solely into powers of two, and another method which is more general. For real-world cases arising in LQCD, we demonstrate significant improvements over prior methods for challenging cases, speeding up the statistical process by an order of magnitude.

5.2 Methods for General Matrices

Further improvement to classical probing appears difficult, due to the limitations of relying on a polynomial of A to approximate the structure of f(A). To circumvent this issue we introduced several methods that seek to approximate the structure of f(A) directly, bypassing the need to compute high order matrix polynomials. There are two main classes of information we use to discover the structure of f(A). The first of these is structural information. By solving for random columns v of f(A), we are able to form two approximations to f(A). First, by looking at the cross-correlation of the vectors at specific lags, we can locate the prominent off-diagonals of f(A). Second, we can form a rough approximation to f(A) by finding vv^T. In many experiments we have found that these two methods taken together provide more accurate colorings than classical probing.

Finally, we also make use of the spectrum of A to find good colorings of f(A) in a process analogous to that used to find clusters in graphs.

Both these methods suggest future avenues of research. The structural methods have close parallels with the Nyström algorithm for sparse matrix approximation. Leveraging some of the approaches for sampling from this method may provide significant improvements to our algorithm. It may also be possible to apply the Nyström algorithm directly, an approach that deserves further study. The spectral methods also have many interesting open questions. First, significant research is still needed to make this algorithm practical. Additionally, there are four other algorithmic ideas used for the graph clustering problem which we have not yet tried to adapt to our problem. Finally, while we have achieved good results with this method for graph Laplacians, it remains to be seen if there is a way to extend this approach to more general classes of matrices.

Bibliography

[1] M. Benzi, E. Estrada, and C. Klymko, Ranking Hubs and Authorities Using Matrix Functions, Linear Algebra and its Applications, SIAM J. Matrix Anal. Appl., 36 (2), pp. 686–706 (2015)

[2] P. Drineas, R. Kannan, and M. W. Mahoney,Fast monte carlo algorithms for

matrices I: approximating matrix multiplication, SIAM Journal on Computing, 36(1),

pp. 132- 157 (2006)

[3] Shusen Wang and Zhihua Zhang, Improving CUR Matrix Decomposition and

the Nystrom Approximation via Adaptive Sampling, Journal of Machine Learning

Research (JMLR), 14: 2729-2769, (2013)

[4] J. Barlow J. Demmel, Computing Accurate Eigensystems Of Scaled Diagonally Dominant Matrices SIAM J. Numer. Anal., v. 27, n. 3, pp. 762-791, (1990)

[5] Y. Hou, Bounds for the least Laplacian eigenvalue of a signed graph Acta Mathe-

matica Sinica, Volume 21, Issue 4, pp 955-960 (2005)

[6] Fan Chung, Spectral Graph Theory, CBMS Regional Conference Series in Mathematics, No. 92, (1996)

[7] A. Pothen, H. Simon, and K. Liou, Partitioning Sparse Matrices with Eigenvectors of Graphs, SIAM J. Matrix Anal. Appl., 11(3), pp. 430–452 (1990)

[8] C. Alpert, A. Kahng, and S. Yao, Spectral partitioning with multiple eigenvectors, Discrete Appl. Math., 90, 1-3 (January 1999)

[9] F. Bauer and J. Jost, Bipartite and neighborhood graphs and the spectrum of the normalized graph Laplacian, Comm. Anal. Geom., 21, no. 4, pp. 787–845 (2013)

[10] S. Liu, Multi-way dual Cheeger constants and spectral bounds of graphs, Advances in Mathematics, Volume 268, pp. 306–338 (2015)

[11] F. McSherry, Spectral partitioning of random graphs FOCS 01: Proceedings of the 42nd IEEE symposium on Foundations of Computer Science, 529, (2001)

[12] R. Boisvert, R. Pozo, K. Remington, R. Barrett, and J. Dongarra, Matrix Market: a web resource for test matrix collections, Proceedings of the IFIP TC2/WG2.5 working conference on Quality of numerical software: assessment and enhancement, pp. 125–137 (1997)

[13] A. Taylor and D.J. Higham CONTEST: A Controllable Test Matrix Toolbox for MATLAB ACM Transactions on Mathematical Software, 35 (4). 26:1-26:17, (2009)

[14] A. Mislove, M. Marcon, K. Gummadi, P. Druschel, and B. Bhattacharjee

Measurement and Analysis of Online Social Networks Proceedings of the 7th ACM SIGCOMM conference on Internet measurement (IMC ’07). ACM, New York, NY,

USA, 29-42,(2007)

[15] L. Li, D. Alderson, J. Doyle, and W. Willinger Towards a Theory of Scale-

Free Graphs: Definition,Properties, and Implications Internet Mathematics Volume

2, Number 4, 431-523, (2005)

[16] M. Steyvers and J. Tenenbaum, The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth, Cognitive Science, 29 (1), pp. 41–78, (2005)

[17] T. F. Coleman and J. J. Moré, Estimation of sparse Jacobian matrices and graph coloring problems, SIAM Journal on Numerical Analysis, 20, pp. 187–209, (1983)

[18] C. Siefert and E. de Sturler Probing methods for generalized saddle-point prob- lems , Electronic Transactions on Numerical Analysis, 22 , pp. 163183,(2006)

[19] G. Golub and C. Van Loan Matrix computations 3rd ed. Johns Hopkins Univer-

sity Press (1996)

[20] K. Ahuja, B. Clark, E. de Sturler, D. M. Ceperley, and J. Kim, Improved

scaling for Quantum Monte Carlo on insulators, (7 May 2011).

[21] P. R. Amestoy, I. S. Duff, Y. Robert, F.-H. Rouet, and B. Ucar, On

computing inverse entries of a sparse matrix in an out-of-core environment, Tech.

Rep. TR/PA/10/59, CERFACS, Toulouse, France, 2010.

[22] H. Avron and S. Toledo, Randomized algorithms for estimating the trace of an

implicit symmetric positive semi-definite matrix, Journal of the ACM, 58 (2011), Article 8.

[23] R. Babich, R. Brower, M. Clark, G. Fleming, J. Osborn, C. Rebbi, and

D. Schaich, Exploring strange nucleon form factors on the lattice, (4 May 2011).

[24] Z. Bai, M. Fahey, and G. H. Golub, Some large-scale matrix computation prob-

lems, Journal of Computational and Applied Mathematics, 74 (1996), pp. 71–89.

[25] G. S. Bali, S. Collins, and A. Schaefer, Effective noise reduction techniques

for disconnected loops in Lattice QCD, (2010).

[26] M. Beck and S. Robins, Computing the Continuous Discretely: Integer-Point Enu-

meration in Polyhedra, Springer, 2007.

[27] C. Bekas, A. Curioni, and I. Fedulova, Low cost high performance uncertainty quantification, in WHPCF '09: Proc. of the 2nd Workshop on High Performance Computational Finance, New York, NY, USA, 2009, ACM, pp. 1–8.

[28] C. Bekas, E. Kokiopoulou, and Y. Saad, An estimator for the diagonal of a

matrix, Appl. Numer. Math., 57 (2007), pp. 1214–1229.

[29] M. Benzi, P. Boito, and N. Razouk, Decay properties of spectral projectors with

applications to electronic structure, SIAM Review, (to appear).

[30] M. Benzi and G. H. Golub, Bounds for the entries of matrix functions with ap-

plications to preconditioning, BIT, 39 (1999), pp. 417–438.

[31] S. Bernardson, P. McCarty, and C. Thron, Monte Carlo methods for esti-

mating linear combinations of inverse matrix entries in lattice QCD, Comput. Phys.

Commun., 78 (1994), pp. 256–264.

[32] M. Blaum and J. Bruck, Interleaving schemes for multidimensional cluster errors,

IEEE Transactions on Information Theory, 44 (1998), pp. 730–743.

[33] D. Bozdag, U. Catalyurek, A. Gebremedhin, F. Manne, E. Boman, and

F. Ozguner, Distributed-memory parallel algorithms for distance-2 coloring and re-

lated problems in derivative computation, SIAM J. Sci. Comput, 32 (2010), pp. 2418–

2446.

[34] E. Chow and Y. Saad, Approximate inverse preconditioners via sparse-sparse it-

eration, SIAM J. Sci. Statist. Comput., 19 (1998), pp. 995–1023.

[35] T. F. Coleman and J. J. More´, Estimation of sparse Jacobian matrices and graph

coloring problems, SIAM Journal on Numerical Analysis, 20 (1983), pp. 187–209.

[36] I. S. Duff, A. M. Erisman, and J. K. Reid, Direct Methods for Sparse Matrices,

Oxford University Press, USA, 1989.

[37] J. Foley, K. J. Juge, A. O'Cais, M. Peardon, S. Ryan, and J.-I. Skullerud, Practical all-to-all propagators for lattice QCD, Comput. Phys. Commun., 172 (2005),

pp. 145–162.

[38] A. H. Gebremedhin, F. Manne, and A. Pothen, What color is your Jacobian?

Graph coloring for computing derivatives, SIAM Rev., 47 (2005), pp. 629–705.

[39] G. H. Golub and G. Meurant, Matrices, moments and quadrature, in Numerical

Analysis 1993, D. Griffiths and G. Watson, eds., vol. 303, Longman Scientific &

Technical, Pitman Research Notes in Mathematics Series, 1994.

[40] H. Guo, Computing traces of functions of matrices, Numerical Mathematics, A Jour-

nal of Chinese Universities (English series), 2 (2000), pp. 204–215.

[41] H. Guo and R. Renaut, Estimation of utf(a)v for large-scale unsymmetric matri-

ces, Numerical Linear Algebra with applications, 11 (2004), pp. 75–89.

[42] R. Gupta, Introduction to Lattice QCD. arXiv:hep-lat/9807028v1

[http://arxiv.org/abs/hep-lat/9807028], 1998.

[43] K. J. Horadam, Hadamard matrices and their applications, Princeton University

Press, 2006.

[44] T. Huckle, Approximate sparsity patterns for the inverse of a matrix and precondi-

tioning, Appl. Numer. Math., 30 (1999), pp. 291–303.

[45] M. F. Hutchinson, A stochastic estimator of the trace of the influence matrix for

Laplacian smoothing splines, J. Commun. Statist. Simula., 19 (1990), pp. 433–450.

[46] T. Iitaka and T. Ebisuzaki, Random phase vector for calculating the trace of a

large matrix, Phys. Rev. E, 69 (2004), p. 057701.

[47] I. C. Ipsen and D. J. Lee, Determinant approximations, Tech. Rep. TR 03-30,

North Carolina State University, Department of Mathematics, 2003.

[48] D. J. Lee and I. C. F. Ipsen, Zone determinant expansions for nuclear lattice

simulations, Phys. Rev. C, 68 (2003), p. 064003.

[49] L. Lin, J. Lu, L. Ying, R. Car, and W. E, Fast algorithm for extracting the diagonal of the inverse matrix with application to the electronic structure analysis of

metallic systems, Commun. Math. Sci., 7 (2009), pp. 755–777.

[50] L. Lin, C. Yang, J. C. Meza, J. Lu, L. Ying, and W. E., Selinv–an algorithm for

selected inversion of a sparse symmetric matrix, ACM Transactions on Mathematical

Software, 37 (4), Article 40, 19 pages.

[51] C. Morningstar, J. Bulava, J. Foley, K. Juge, D. Lenkner, M. Peardon,

and C. Wong, Improved stochastic estimation of quark propagation with Laplacian

Heaviside smearing in lattice QCD, Phys. Rev. D, 83 (2011).

[52] F. Pukelsheim, Optimal design of experiments, SIAM, Classics in Applied Mathe-

matics. 50., 1993.

[53] A. Reusken, Approximation of the determinant of large sparse symmetric positive

definite matrices, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 799–818.

[54] H. J. Rothe, Lattice Gauge Theories: An introduction, World Scientific Publishing

Co. Pte. Ltd., 2005.

[55] Y. Saad, Iterative methods for sparse linear systems, SIAM, 2nd edition, Philadel-

phia, PA, USA, 2003.

[56] B. Sheikholeslami and R. Wohlert, Improved Continuum Limit Lattice Action

for QCD with Wilson Fermions, Nucl.Phys., B259 (1985), p. 572.

[57] C. Siefert and E. de Sturler, Probing methods for generalized saddle-point prob-

lems, Electronic Transactions on Numerical Analysis, 22 (2006), pp. 163–183.

[58] Z. Strakos and G. H. Golub, Estimates in quadratic formulas, Numerical Algo-

rithms, 8 (1994), pp. 241–268.

[59] Z. Strakos and P. Tichy, On efficient numerical approximation of the bilinear

form c^* A^{-1} b, SIAM J. Sci. Comput., 33 (2011), pp. 565–587.

[60] J. Tang and Y. Saad, Domain-decomposition-type methods for computing the di-

agonal of a matrix inverse, Report UMSI 2010/114.

[61] Y. Saad, Iterative Methods for Sparse Linear Systems, Society for Industrial and

Applied Mathematics (2003)

[62] , A probing method for computing the diagonal of the matrix inverse, Report

UMSI 2010/42.

[63] L. R. Welch, Lower bounds on the maximum cross correlation of signals, IEEE

Trans. on Info. Theory, 20 (May 1974), pp. 397–399.

[64] K. G. Wilson, Confinement of quarks, Phys. Rev., D10 (1974), pp. 2445–2459.

[65] M. N. Wong, F. J. Hickernell, and K. I. Liu, Computing the trace of a function

of a sparse matrix via Hadamard-like sampling, Tech. Rep. 377(7/04), Hong Kong

Baptist University, 2004.

[66] A. Stathopoulos, J. Laeuchli, and K. Orginos, Hierarchical probing for es-

timating the trace of the matrix inverse on toroidal lattices , SIAM J. Sci. Com-

put.,35(5) (2013), pp. 299–322.

[67] F. Rouet, Calcul partiel de l’inverse d’une matrice creuse de grande taille - appli-

cation en astrophysique, Master’s Thesis (10/09), Institut National Polytechnique de

Toulouse, 2009.

[68] E. Estrada, N. Hatano, Statistical-mechanical approach to subgraph centrality in

complex networks, Chemical Physics Letters, 439, (5/07), pp. 247-251.

[69] N. Higham, Functions of Matrices: Theory and Computation, Society for Industrial

and Applied Mathematics, 2008.

[70] S. Li, E. Darve, Extension and optimization of the FIND algorithm: Computing

Green’s and less-than Green’s functions,J. Comput. Physics 231(4), (5/12), pp. 1121-

1139.

[71] S. Li, E. Darve, Some new mathematical methods for variational objective weather

analysis using splines and cross-validation,Mon. Weath. Rev. 108, (6/80), pp. 1122-

1143

[72] S. Li, E. Darve, Smoothing noisy data with spline functions,Numer. Math. 31,

(2/79), pp. 377-403

[73] S. W. Golomb and L. D. Baumert, The search for Hadamard matrices, Amer.

Math. Monthly, 70 pp. 12-17 (1963)

[74] P. Drineas , R. Kannan , M. Mahoney, Fast Monte Carlo algorithms for ma-

trices I: Approximating matrix multiplication,SIAM Journal on Computing 36,

(9/04), pp. 132157

[75] D. Chen, S. Toledo, Vaidya’s preconditioners: Implementation and experimental

study,Electronic Transactions on Numerical Analysis, 16,(9/03), pp. 30-

49
