<<

Design Using Spectral

Richard Peng

CMU-CS-13-121 August 2013

School of Science Carnegie Mellon University Pittsburgh, PA 15213

Thesis Committee: Gary L. Miller, Chair Guy E. Blelloch Alan Frieze Daniel A. Spielman,

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Copyright © 2013 Richard Peng

This research was supported in part by the National Science Foundation under grant number CCF-1018463, by a Microsoft Research PhD Fellowship, by the National Eye Institute under grant numbers R01-EY01317-08 and RO1- EY11289-22, and by the Natural Sciences and Engineering Research Council of Canada under grant numbers M- 377343-2009 and -390928-2010. Parts of this work were done while at Microsoft Research New England. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the funding parties. Keywords: Combinatorial Preconditioning, Linear System Solvers, Spectral Graph Theory, Parallel , Low Stretch Embeddings, Image Processing To my parents and grandparents iv Abstract

Spectral graph theory is the interplay between linear and combinatorial graph theory. Laplace’s and its discrete form, the Laplacian , appear ubiquitously in . Due to the recent discovery of very fast solvers for these , they are also becoming increasingly useful in combinatorial opti- mization, , , and . In this thesis, we develop highly efficient and parallelizable algorithms for solv- ing linear systems involving graph Laplacian matrices. These solvers can also be extended to symmetric diagonally dominant matrices and M-matrices, both of which are closely related to graph Laplacians. Our algorithms build upon two decades of progress on combinatorial preconditioning, which connects numerical and combinato- rial algorithms through spectral graph theory. They in turn rely on tools from numeri- cal , metric embeddings, and theory. We give two solver algorithms that take diametrically opposite approaches. The first is motivated by combinatorial algorithms, and aims to gradually break the prob- lem into several smaller ones. It represents major simplifications over previous solver constructions, and has theoretical running time comparable to sorting. The second is motivated by numerical analysis, and aims to rapidly improve the algebraic connectiv- ity of the graph. It is the first highly efficient solver for Laplacian linear systems that parallelizes almost completely. Our results improve the performances of applications of fast linear system solvers ranging from scientific to algorithmic graph theory. We also show that these solvers can be used to address broad classes of image processing tasks, and give some preliminary experimental results. vi Acknowledgments

This thesis is written with the help of many people. First and foremost I would like to thank my advisor, Gary Miller, who introduced me to most of the topics discussed here and helped me throughout my studies at CMU. My thesis committee members, Guy Blelloch, Alan Frieze, and Daniel Spielman, provided me with invaluable advice during the dissertation process. I am also grateful towards Ian Munro and for their constant guidance. I am indebted to the co-authors who I had the fortune to work with during my graduate studies. Many of the ideas and results in this thesis are due to collaborations with Yiannis Koutis, Kanat Tangwongsan, Hui Han Chin, and Shen Chen Xu. Michael Cohen, Jakub Pachocki, and Shen Chen Xu also provided very helpful comments and feedback on earlier versions of this document. I was fortunate to be hosted by Aleksander M ˛adryand while interning at Microsoft Research New England. While there, I also had many enlightening discussions with Jonathan Kelner and Alex Levin. I would like to thank my parents and grandparents who encouraged and cultivated my interests; and friends who helped me throughout graduate school: Bessie, Eric, Hui Han, Mark, Neal, Tom, and Travis. Finally, I want to thank the CNOI, CEMC, IOI, USACO, and SGU for developing my problem solving abilities, stretching my imagination, and always giving me something to look forward to.

vii viii Contents

1 Introduction 1 1.1 Graphs and Linear Systems ...... 1 1.2 Related Work ...... 4 1.3 Structure of This Thesis ...... 6 1.4 Applications ...... 8 1.5 Solving Linear Systems ...... 9 1.6 Matrices and Similarity ...... 14

2 Nearly O(m log n) Time Solver 23 2.1 Reductions in Graph Size ...... 23 2.2 Ultra-Sparsification Algorithm ...... 26 2.3 Recursive Preconditioning and Nearly-Linear Time Solvers ...... 30 2.4 Reuse, Recycle, and Reduce the ...... 34 2.5 Tighter Bound for High Stretch Edges ...... 36 2.6 Stability Under Fixed Point ...... 39

3 Polylog Depth, Nearly-Linear Work Solvers 47 3.1 Overview of Our Approach ...... 49 3.2 Parallel Solver ...... 51 3.3 Nearly-Linear Sized Parallel Solver Chains ...... 55 3.4 Alternate Construction of Parallel Solver Chains ...... 63

4 Construction of Low-Stretch Subgraphs 75 4.1 Low Stretch Spanning Trees and Subgraphs ...... 76 4.2 Parallel Partition Routine ...... 79

ix 4.3 Parallel Construction of Low Stretch Subgraphs ...... 88

4.4 Construction of low `↵-stretch spanning trees ...... 93

5 Applications to Image Processing 95 5.1 Background and Formulations ...... 97 5.2 Applications ...... 99 5.3 Related Algorithms ...... 101 5.4 Approximating Grouped Using Quadratic Minimization ...... 102 5.5 Evidence of Practical Feasibility ...... 111 5.6 Relations between Graph Problems and Minimizing LASSO Objectives ...... 112 5.7 Other Variants ...... 116 5.8 Remarks ...... 117

6 Conclusion and Open Problems 119 6.1 Low Stretch Spanning Trees ...... 119 6.2 Interaction between Iterative Methods and Sparsification ...... 120

Bibliography 120

A Deferred Proofs 133

B Spectral Sparsification by Sampling 135 B.1 From Matrix Chernoff Bounds to Sparsification Guarantees ...... 136 B.2 Proof of Concentration Bound ...... 137

C Partial Cholesky Factorization 145 .1 Partial Cholesky Factorization on Graph Laplacians ...... 146 C.2 Errors Under Partial Cholesky Factorization ...... 147

D Iterative Methods 155 D.1 Richardson Iteration ...... 155 D.2 Chebyshev Iteration ...... 158 D.3 Chebyshev Iteration with Round-off Errors ...... 163

x List of Figures

1.1 Representing a social network as a graph...... 2 1.2 A resistive electric network with resistors labeled by their conductances (f).... 3 1.3 Call structure of a 3-level W-cycle algorithm corresponding to the runtime recur- rence given in Equation 1.5 with t =2. Each level makes 2 calls in succession to a problem that’s 4 times smaller...... 12 1.4 Call structure of a 3-level V-cycle algorithm. Each level makes 1 calls to a problem of comparable size. A small number of levels leads to a fast algorithm...... 14

2.1 The effective resistance RT (e) of the blue off-tree edge in the red tree is 1/4+ 1/5+1/2=0.95. Its stretch strT (e)=weRT (e) is (1/4+1/5+1/2)/(1/2) = 1.9 27

4.1 Cycle on n vertices. Any tree T can include only n 1 of these edges. If the graph is unweighted, the remaining edge has stretch n 1...... 77 4.2 Two possible spanning trees of the unit weighted square grid, shown with red edges...... 77 4.3 Low stretch spanning tree for the square grid (in red) with extra edges of high stretch (narrower, in green). These edges can be viewed either as edges to be resolved later in the recursive algorithm, or as forming a subgraph along with the tree ...... 79

5.1 Outputs of various denoising algorithms on image with AWGN noise. From left to right: noisy version, Split Bregman [GO09], Fixed Point [MSX11], and Grouped Least Squares. Errors listed below each figure from left to right are `2 and `1 norms of differences with the original...... 111 5.2 Applications to various image processing tasks. From left to right: image segmen- tation, global feature extraction / blurring, and decoloring...... 113 5.3 Example of seamless cloning using Poisson Image Editing ...... 114

xi xii Chapter 1

Introduction

Solving a system of linear equations is one of the oldest and possibly most studied computational problems. There is evidence that humans have been solving linear systems to facilitate economic activities over two thousand years ago. Over the past two hundred years, advances in physical sciences, engineering, and made linear system solvers a central topic of applied mathe- matics. This popularity of solving linear systems as a subroutine is partly due to the existence of efficient algorithms: small systems involving tens of variables can be solved reasonably fast even by hand. Over the last two decades, the digital revolution led to a wide variety of new applications for linear system solvers. At the same time, improvements in sensor capabilities, storage devices, and networks led to sharp increases in data sizes. Methods suitable for systems with a small number of variables have proven to be challenging to scale to the large, sparse instances common in these applications. As a result, the design of efficient linear system solvers remains a crucial algorithmic problem. Many of the newer applications model entities and their relationships as networks, also known as graphs, and solve linear systems based on these networks. The resulting matrices are called graph Laplacians, which we will formally define in the next section. This approach of extracting information using graph Laplacians has been referred to as the Laplacian Paradigm [Ten10]. It has been successfully applied to a growing number of areas including clustering, machine learning, computer vision, and algorithmic graph theory. The main focus of this dissertation is the routine at the heart of this paradigm: solvers for linear systems involving graph Laplacians.

1.1 Graphs and Linear Systems

One area where graph-based linear systems arise is the analysis of social networks. A social network can be represented as a set of links connecting pairs of people; an example is shown in Figure 1.1. In graph theoretic terms, the people correspond to vertices while the links correspond to edges. Given such a network, a natural question to ask is how closely related two people are. Purely graph-based methods may measure either the length of the shortest path or the maximum number of disjoint paths between the two corresponding vertices. One way to take both measures into account is to view the network as an electric circuit with each connection corresponding to

1 Figure 1.1: Representing a social network as a graph. an electrical wire (resistor). The electrical resistance between two vertices then gives a quantity known as the effective resistance. Intuition from electrical circuits suggests that shortening wires and duplicating wires can both reduce resistance. Therefore effective resistance and its variants can be used as measures of con- nectivity on networks, with lower resistance corresponding to being more closely connected. Sim- ilar methods are applicable to a wide range of problems such as measuring the importance of specific proteins in protein-protein interaction networks [MLZ+09] and the link prediction prob- lem in social networks [LNK07]. However, as the electrical network is not physically available, we can’t measure the effective resistance experimentally. We can, however, compute it by solving a linear system. To compute effective resistance, let us consider the more general problem of finding the volt- ages at all the vertices given that 1 unit of current passes between two specific vertices. For no- tational purposes it is more convenient to describe each resistor by its conductance, which is the inverse of its resistive value and analogous to edge weight in the corresponding graph. The setting of voltages at the vertices induces an electrical flow through the edges. There are two fundamental principles governing this voltage setting and electrical flow:

• Kirchhoff’s law: With the exception of the vertices attached to the power source, the net amount of current entering and leaving each vertex is 0.

• Ohm’s law: The current on an edge equals to the voltage difference between its endpoints times the conductance of the resistor.

As an example, consider the resistive network with 3 vertices depicted in Figure 1.2. Suppose we set the voltages at the three vertices to be x1, x2 and x3 respectively. By Ohm’s law we get that the currents along the edges 1 2 and 1 3 are 1 (x x ) and 2 (x x ) respectively. ! ! · 1 2 · 1 3 2 2 f 2 1 1 f 1 1 2f 3 3

Figure 1.2: A resistive electric network with resistors labeled by their conductances (f)

Therefore the amount of current that we need to inject into vertex 1 to maintain these voltages is:

1 (x x )+2 (x x )=3x x 2x · 1 2 · 1 3 1 2 3 Identities for the required current entering/leaving vertices 2 and 3 can also be derived similarly. Therefore, if we want one unit of current to enter at vertex 1 and leave at vertex 3, the voltages need to satisfy the following system of linear equations:

3x 1x 2x =1 1 2 3 1x +2x 1x =0 1 2 3 2x 1x +3x = 1 1 2 3 Ohm’s law gives that the voltage drop required to send 1 unit of current is the resistance. Therefore, once we obtain the voltages, we can also compute the effective resistance. In our example, a solution for the voltages is x1 =2/5, x2 =1/5, and x3 =0, so the effective resistance between vertices 1 and 3 is x x =2/5. 1 3 This model of the graph as a circuit is very much simplified from the point of view of circuit theory. However, it demonstrates that problems on graphs can be closely associated with linear systems. More generally, linear systems assume the form Ax = b where A is an n n matrix ⇥ containing real numbers, b is an n 1 column vector of real numbers, and x is an n 1 column ⇥ ⇥ vector of unknowns, also called variables. For example, the above linear system can be expressed in matrix form as: 3 1 2 x 1 1 12 1 x = 0 . 2 3 2 2 3 2 3 2 13 x 1 3 4 5 4 5 4 5 Note that each off-diagonal entry is the negation of the conductance between its two vertices, and each diagonal entry is the sum of the conductances of all resistors incident to that vertex. Since resistive networks are essentially undirected graphs with edge weights equaling conductance, this type of matrix is known as a graph Laplacian. It is the fundamental object studied in spectral graph theory, and can be defined as follows:

Definition 1.1.1 The graph Laplacian LG of a weighted graph G =(V,E,w) with n vertices is

3 an n n matrix whose entries are: ⇥

u = v : LG,uv = wuv 8 6 u : LG,uu = wuv 8 v=u X6 In numerical analysis, graph Laplacians belong to the category of Symmetric Diagonally Dominant (SDD) matrices. These are a further generalization of graph Laplacians where the off-diagonal entries can be positive. Also, the diagonal entry can exceed the total weight of the off-diagonal entries, instead of having to be equal to it.

Definition 1.1.2 A M is symmetric diagonally dominant (SDD) if:

i : Mii Mij 8 | | j=i X6

Gremban and Miller [Gre96] showed that solving a symmetric diagonally dominant linear sys- tem reduces to solving a graph Laplacian system with twice the size. This reduction is also well-understood for approximate solvers [ST06], and in the presence of fixed point round-off er- rors [KOSZ13]. A more intricate extension that solves systems involving M-matrices using Lapla- cian solvers was also shown by Daitch and Spielman [DS07]. These reductions allow us to focus on solving systems involving graph Laplacians for the rest of this thesis.

1.2 Related Work Despite its long history, the problem of constructing fast linear system solvers is considered far from being solved. The speed of algorithms is commonly measured in terms of the input size. In the case of general linear systems on n variables, the matrix is of size n2. However, the sizes of sparse matrices are better measured by the number of non-zero entries, denoted by m. The best case scenario, which remains entirely consistent with our current understanding, is that any linear system can be solved with O(m)1 operations. Linear system solvers can be classified into two main categories: direct and iterative meth- ods. Direct methods aim to modify the matrix to become the identity matrix using a sequence of reversible operations. Inverses of these operations are then composed to form the inverse of the matrix. It’s fair to say that is the most well-known method for solving linear systems. It runs in O(n3) time on an n n matrix, and is a direct method. Strassen [Str69] ⇥ first demonstrated that this exponent of 3 is not tight, giving an algorithm that runs in O(nlog2 7) time. Subsequently this exponent has been further decreased, with O(n2.3754) by Coppersmith and

1We use f(n)=O(g(n)) to denote f(n) c g(n) when n n for some constants c and n .  · 0 0 4 Winograd [CW90] holding as the record for twenty years. Recently, further improvements were made by Stothers [Sto10] and Vassilevska Williams [Wil12], with the latter taking O(n2.3727) time. On a dense matrix with m n2, these algorithms can be viewed as taking between O(m3/2) and ⇡ O(m1.18) operations. However, for sparse matrices with n in the millions, a more crucial limiting factor for direct algorithms is the need to compute and store a dense, ⌦(n2) sized, inverse. Iterative methods, on the other hand, generate a sequence of solution vectors that converge to the solution. Most iterative methods only maintain a small number of length n vectors per step, and perform matrix-vector multiplications in each iteration. This makes them more suitable for solving the large sparse linear systems that are common in practice. One of the most important iterative methods the conjugate gradient algorithm due to Lanc- zos [Lan52], Hestenes, and Stiefel [HS52]. Assuming exact real arithmetic, this method takes O(n) iterations to obtain an exact answer, giving a total cost of O(nm). However, this bound is fragile in the presence of floating point round-off errors, and are needed to drasti- cally improve the performances of the conjugate algorithm. A more detailed treatment of iterative methods can be found in books such as [Axe94]. Combinatorial preconditioning is a way to combine the best from both direct and iterative methods. The combinatorial structure of graph Laplacians allows one to analyze iterative meth- ods using graph theory. The first algorithms by Vaidya [Vai91] constructed preconditioners by adding edges to a maximum weight spanning tree. These methods led to an O((dn)1.75) time al- gorithm for graphs with maximum degree d, and an O((dn)1.2) time algorithm when the graph is planar [Jos97]. Subsequently, methods of constructing preconditioners using Steiner trees have also been investigated [GMZ94]. Subsequently, Boman and Hendrickson [BH01] made the crucial observation that it is better to add edges to a different tree known as the low-stretch spanning tree. Such trees were first intro- duced by Alon, Karp, Peleg, and West [AKPW95], and have been well-studied in the setting of online algorithms and metric embeddings. Exploring this connection further led to the result by Spielman and Teng [ST03], which runs in O(m1.31) time for general graphs. They also identified spectral sparsification, the problem of finding a sparse equivalent of a dense graph, as the key sub- routine needed for a better algorithm. Subsequently, they gave a spectral sparsification algorithm that produces a nearly-linear sized sparsifiers in nearly linear time [ST08, ST06]. Incorporating it into the previous solver framework led to the the first algorithm for solving Laplacian systems that run in nearly-linear, or O(m logc n), time. Our sequential algorithm in Chapter 2 also builds upon this framework. Most direct methods can be parallelized [JáJ92], while the matrix-vector multiplications per- formed in each step of iterative methods are also inherently parallel. The first for solving graph Laplacians with non-trivial parallel speedups was obtained in the setting of planar graphs by Koutis and Miller [KM07]. It parallelizes the framework from the Spielman and Teng algorithm, and was extended to general graphs by Blelloch et al. [BGK+13]. 
Our parallel algo- rithm in Chapter 3 uses a different approach based primarily on spectral sparsification. Due to the intricacies of measuring performances of parallel algorithms, we defer a more detailed discussion of the related works on parallel algorithms to Chapter 3.

5 Recently, an alternate approach for solvers was developed by Kelner et al. [KOSZ13]. They replace the sparsification and recursive preconditioning framework with data structures, but retain the low-stretch spanning trees. The non-recursive structure of this algorithm leads to a much more direct analysis of its in the unit cost RAM model. Subsequent improvements by Lee and Sidford [LS13] led to a running time of about O(m log3/2 n).

1.3 Structure of This Thesis

This thesis presents two highly efficient algorithms for solving linear systems involving graph Laplacians, and consequently SDD matrices. Both of these algorithms run in nearly-linear time. That is, as the size of the input doubles, the running times of the algorithms increase by factors that approach 2. These algorithms take diametrically opposite approaches, and can be read in any order. However, both rely heavily on the connection between graph Laplacians and graphs. Readers more familiar with combinatorial or graph theoretic algorithms might find the sequential algorithm in Chapter 2 more natural, while the approach taken for the parallel algorithm in Chapter 3 is more motivated by numerical analysis. As matrices are multi-dimensional generalizations of scalars, we give a brief overview of our algorithms in the 1-dimensional case in Section 1.5, and introduce the relevant linear algebraic tools in Section 1.6.

1.3.1 Sequential SDD Linear System Solver

Chapter 2 gives an algorithm that solves a SDD linear system with m non-zero entries to a rela- tive error of ✏ in O(m log n log(1/✏)) time after spending O(m log n log log n) time preprocessing it. It builds upon the recursive preconditioning framework by Spielman and Teng [ST06], which reduces solving a linear system to solving several smaller systems. Our algorithm gives both an improved routine for generating these smaller systems, and a more holistic view of the framework that leads to a faster running time. It is based on two papers joint with Ioannis Koutis and Gary Miller [KMP10, KMP11], as well as subsequent unpublished improvements joint with Gary Miller.

1.3.2 Stability of Recursive Preconditioning Under Round-off Errors

A common assumption made in most analyses of combinatorial preconditioning algorithms to date is that all calculations are performed exactly. Actual computation involve numbers of bounded length, which leads to round-off errors. For the recursive preconditioning framework, which our algorithm in Chapter 2 is based on, an earlier analysis by Spielman and Teng [ST06] needed to increase the lengths of numbers by a factor of ⇥(log n) to guarantee convergence. This in part motivated the algorithms by Kelner et al. [KOSZ13] and Lee and Sidford [LS13], which are more numerically stable. In Section 2.6, we give a round-off error analysis of the recursive preconditioning framework which incorporates many of our other improvements. We show that the O(m log n log(1/✏)) time algorithm is numerically stable under similar assumptions: a constant factor increase in the lengths of numbers is sufficient for convergence under fixed-point arithmetic.

6 1.3.3 Parallel SDD Linear System Solver

Chapter 3 focuses on constructing nearly-linear work solver algorithms with better parallel per- formances. It constructs a small number of sparse matrices whose product approximates the in- verse. As matrix-vector multiplications are readily parallelizable, evaluating such a representa- tion leads to a parallel solver algorithm. For a graph Laplacian with edge weights in the range [1,U], the algorithm computes a solution with a relative error of ✏ in O (logc(nU)log(1/✏)) depth O (m logc(nU)log(1/✏)) work2 for some constant c. This result is joint with Dan Spielman and has yet to be published. Due to the relatively large logarithmic factors in the performances of this algorithm, we omit analyzing its numerical stability in the belief that significant improvements and simplifications are likely. Some possible directions for improvements are discussed in Chapter 6.

1.3.4 Graph Embeddings

At the core of our solver algorithms are routines that construct graphs that are similar to the input graph. We will formalize this similarity, and describe relevant tools and algorithms in Section 1.6. In our algorithms, such routines are often used along with an embedding of the graph into a tree. Some modifications of known results are needed to obtain the strongest versions of our results in Chapters 2 and 3. These modifications are presented in Chapter 4. As the development of better tree embeddings is an active area of research, the presentation aims for simplicity. It is based on two papers joint with Guy Blelloch, Anupam Gupta, Ioannis Koutis, Gary Miller, Kanat Tangwongsan, and Shen Chen Xu [BGK+13, MPX13], as well as unpublished improvements joint with Gary Miller and Shen Chen Xu.

1.3.5 Image Processing

The speed of SDD linear system solvers both in theory and in practice has led to many applications using solvers as a subroutine. A short survey of these applications is in Section 1.4. In Chapter 5, we focus on an area where combinatorial preconditioners are on the verge of entering practice: image processing. We show that recent advances in graph theoretic algorithms can be used to optimize a broad class of image processing objectives. We formulate an optimization objective which we term the the grouped least squares objective that’s applicable to many problems in computer vision. It also represents a smooth interpola- tion between quadratic and linear minimization in undirected graphs. This connection allows us to draw upon the approximate minimum cut algorithm by Christiano et al. [CKM+11] and de- sign an algorithm for minimizing the grouped least squares objective. We also demonstrate the practical feasibility of our approach by performing preliminary experiments on well-known image processing examples. It is based on a paper joint with Hui Han Chin, Aleksander Ma¸dry, and Gary Miller [CMMP13].

2Depth/work is a standard measure of the efficiency of parallel algorithms, with depth corresponding to the longest chain of sequential dependencies and work corresponding to the number of operations performed.

7 1.4 Applications Because of the central role played by graph Laplacians in spectral graph theory and mathematical physics, algorithmic applications of graph Laplacians predated, and in some sense motivated, the design of more efficient solver algorithms. One of the earlier such results was by Tutte [Tut63]. He embeds a 3-connected planar graph in the 2-D plane without edge crossings by solving two graph Laplacian linear systems. A more powerful object derived from the graph Laplacian is the principle eigenvector. Also known as the Fiedler vector, its properties has been extensively studied in spectral graph the- ory [Fie73, ST96, Chu97]. Along with the solver algorithm [ST06], Spielman and Teng showed that a small number of calls to the solver allows one to compute a good approximation to this vec- tor. An improved algorithm using matrix exponentials, which in turn relies on solving SDD linear systems, was given by Orecchia et al. [OSV12]. Alon and Millman [AM85] showed that a discrete variant of an inequality by Cheeger [Che70] gives bounds on the quality of the cut returned from scanning this eigenvector. Such algorithms were introduced in the setting of image processing by Shi and Malik [SM97], and is now one of the standard tools for clustering. It has a wide range of applications in statistics, computer vision, and machine learning. In numerical analysis, graph Laplacians have direct connections with elliptic finite element methods. They approximate the solution to a variety of differential equations on meshes used in applications such as collision modeling. These connections were made explicit by Boman et al. [BHV08], who showed that when the underlying of space (mesh) is well struc- tured, the resulting linear system is algebraically close to a graph Laplacian, and can therefore be solved using a few invocations to a solver for systems involving graph Laplacians. The discovery of theoretically fast solvers for SDD linear systems [ST06] allowed their use as primitives in designing graph algorithms. These solvers are the key subroutines of the fastest known algorithms for a multitude of graph problems including:

• Computing balanced separators of graphs using matrix exponentials and heat-kernel random walks by Orecchia et al. [OSV12].

• Generating spectral sparsifiers, which also act as cut-preserving sparsifiers by Spielman and Srivastava [SS08]. Improvements to this algorithm for the case of dense graphs were also given by Koutis et al. [KLP12].

• Finding minimum cost flows and generalized lossy flow by Daitch and Spielman [DS08].

• Computing maximum bipartite matchings (or more generally maximum flow in unit capacity graphs) by M ˛adry[M ˛ad13].

• Approximating maximum flows in undirected graphs by Christiano et al. [CKM+11] and Lee et al.[LRS13].

• Generating random spanning trees by Kelner and Ma¸dry [KM09a].

8 Our solver algorithm from Chapter 2 leads to improved theoretical running times for all these applications. Some of these algorithms can also be combined with parallel solvers. Somewhat surprisingly, for graph optimization problems such as maximum flow, this leads to the best par- allel algorithms whose operation count is comparable to the state-of-the-art sequential ones. This connection was first observed by Blelloch et al. [BGK+13], and our algorithm from Chapter 3 also improves these parallel algorithms. One area where combinatorial preconditioning has already been used in practice is computer vision. The lattice-like underlying connectivity structure allows for the construction of combina- torial preconditioners under local sparsity assumptions. The presence of weights and extra edges introduced by non-local methods also makes combinatorial preconditioning more suitable than methods that assume uniformity. A code package known as the Combinatorial Multigrid (CMG) Solver was shown to be applicable to a variety of image segmentation and denoising tasks by Koutis et al. [KMST09], and to gradient in-painting by McCann and Pollard [MP08]. Our image processing experiments in Chapter 5 were also performed using the CMG Solver package.

1.5 Solving Linear Systems

Many key ideas in our algorithms can be illustrated using the simplest SDD linear systems: the 1 1 matrix. Here we can omit linear algebraic notations and treat this entry as a scalar, `. As ⇥1 1 ` is equivalent to the inverse of this matrix, our goal is to find ` b without dividing by `. This is a reasonable simplification from higher dimensions since a matrix-vector multiplication by A is 1 much less expensive than finding its inverse A . This goal can also be viewed as constructing a routine that divides by ` using only multiplications. For simplicity, first consider the situation where ` is close to 1, e.g.: 1 3 ` 2   2 We can invoke an identity about geometric that tells us that when 0 <`<2, 1/` equals the following infinite sum known as a power series:

1 ` =1/(1 (1 `)) =1+(1 `)+(1 `)2 +(1 `)3 + ... (1.1) Of course, computing an infinite sum is not possible. On the other hand, the first t terms forms a and evaluating it may lead to a reasonable approximation. This can be done by either computing (1 `)i in increasing order for all 0 i

9 be interpreted as computing the stationary distribution of a random walk, and we will explore this connection in more detail in Section 3.1. The convergence of this type of method depends heavily on the value of 1 `. The error of using the first t terms is given by the tail of infinite power series:

(1 `)t +(1 `)t+1 +(1 `)t+2 + ... (1.2) =(1 `)t 1+(1 `)1 +(1 `)2 + ... (1.3) t 1 =(1 `) ` (1.4) If ` is in the range [1/2, 3/2], 1 ` is between [ 1/2, 1/2] and (1 `)t approaches 0 geomet- rically. Therefore evaluating a small number of terms of the power series given in Equation 1.1 1 1 leads to a good estimate. If we want an answer in the range [(1 ✏) ` b, (1 + ✏) ` b] for some error ✏>0, it suffices to evaluate the first O (log (1/✏)) terms of this series. On the other hand, convergence is not guaranteed when 1 ` is larger than 1. Even if 1 ` | | 1 is between ( 1, 1), convergence may be very slow. For example, if ` is  for some large , the  definition of natural logarithm gives (1 `) 1 . That is, O() steps are needed to decrease the ⇡ e error of approximation by a constant factor. Our two algorithms can be viewed as different approaches to accelerate this slow convergence. We first describe the approach taken for the sequential algorithm presented in Chapter 2, which obtains an ✏-approximation in O (m log n log (1/✏)) time.

1.5.1 Preconditioning

Preconditioning can be viewed as a systematic way to leverage the fact that some linear systems are easier to solve than others. In our restricted setting of dividing by scalars, this may not be the case for . However, it is certainly true for the reader: because our number system is base 10, it is much easier to divide a number by 0.1 (aka. multiply by 10). Consider the following situation where the recurrence given above converges slowly: ` is between 0.01 and 0.1. Since 1 ` can be as large as 0.99, about 100 iterations are needed to reduce the error by a constant factor. However, suppose instead of computing `x = b, we try to solve (10`) x =10b. The guarantees on ` gives that 10` is between 0.1 and 1, which in turn upper bounds 1 ` by 0.9. This means 10 iterations now suffice to decrease the error by a constant factor, a tenfold speedup. This speedup comes at a cost: we need to first multiply b by 10, and instead of computing 1 ` we need to compute 1 10`. This results in an extra by 0.1 at each step, but in our model of base ten arithmetic with scalars, it is much less expensive than most other operations. The role that the value of 0.1 played is known as a , and iterative methods in- volving such preconditioners are known as preconditioned iterative method. The recurrence that we showed earlier, unpreconditioned iterative methods, can be viewed as using the identity, 1 as a preconditioner. From this perspective, the reason for the tenfold speedup is apparent: for the range of ` that we are interested in, [0.01, 0.1], 0.1 is an estimate that’s better by a factor of ten when compared to 1.

10 In general, because A is a matrix, the preconditioner which we denote as B also needs to be a matrix of the same dimension. As in the case with scalars, each step of a preconditioned iterative method needs to solve a linear system involving B. Also, the number of iterations depends on the quality of approximation of B to A. This leads to a pair of opposing goals for the construction of a preconditioner:

• B needs to be similar to A.

• It is easier to solve linear systems of the form Bx = b0.

To illustrate the trade-off between these two goals, it is helpful to consider the two extreme points that completely meets one of these two requirements. If B = A, then the two matrices are identical and only one iteration is required. However, solving a system in B is identical to solving a system in A. On the other hand, using the identity matrix I as the preconditioner makes it very easy to solve, but the number of iterations may be large. Finding the right middle point between these two conditions is a topic that’s central to numerical analysis (see e.g. [Axe94]). Many existing methods for constructing preconditioners such as Jacobi iteration or Gauss- Seidel iteration builds the preconditioner directly from the matrix A. In the early 90s, Vaidya proposed a paradigm shift [Vai91] for generating preconditioners:

Since A = LG corresponds to some graph G, its preconditioner should be the graph Laplacian of another graph H.

In other words, one way to generate a preconditioner for A is to generate a graph H based on G, and let B = LH . Similarity between LG and LH can then be bounded using spectral graph theory. This in turn allows one to draw connections to combinatorial graph theory and use mappings between the edges of G and H to derive bounds. These ideas were improved by Boman and Hendrickson [BH01], and brought to full fruition in the first nearly-linear time solver by Spielman and Teng [ST06]. Our sequential solver algorithm in Chapter 2 uses this framework with a few improvements. The main idea of this framework is to find a graph H that’s similar to G, but is highly tree-like. This allows H to be reduced to a much smaller graph using a limited version of Gaussian elimination. More specifically, G and H are chosen so that:

• The preconditioned iterative method requires t iterations.

• The system that H reduces to is smaller by a factor of 2t.

This leads to a runtime recurrence of the following form: m T (m)=t m + T (1.5) 2t ⇣ ⇣ ⌘⌘ 11 Where the term of m inside the bracket comes from the cost of performing matrix-vector multipli- cations by A in each step, as well as the cost of reducing from H to the smaller system. This runtime recurrence gives a total running time of O(mt). The call structure corresponding to this type of recursive calls is known in the numerical analysis as a W -cycle, as illustrated in Figure 1.3

m

m m 4 4

m m m m 16 16 16 16

Figure 1.3: Call structure of a 3-level W-cycle algorithm corresponding to the runtime recurrence given in Equation 1.5 with t =2. Each level makes 2 calls in succession to a problem that’s 4 times smaller.

Analyzing the graph approximations along with this call structure leads to the main result in Chapter 2: a O(m log n log(1/✏)) time algorithm for computing answers within an approximation of 1 ✏. ± 1.5.2 Factorization

Our parallel algorithm is based on a faster way of evaluating the series given in Equation 1.1. As we are solely concerned with this series, it is easier to denote 1 ` as ↵, by which the first t terms of this series simplifies to the following sum:

2 t 1 1+↵ + ↵ + ...↵

Consider evaluating one term ↵i. It can be done using i multiplications involving ↵. However, if k i is a power of 2, e.g. i =2k for some k, we can user much fewer multiplications since ↵2 is the k 1 square of ↵2 . We can compute ↵2,↵4,↵8,...by repeatedly squaring the previous result, which k gives us ↵2 in k multiplications. It is a special case of the exponentiation by squaring (see e.g. Chapter 31 of [CSRL01]), which allows us to evaluate any ↵i in O(log i) multiplications. This shortened evaluation suggest that it might be possible to evaluate the entire sum of the first t terms using a much smaller number of operations. This is indeed the case, and one way to

12 t 1 t derive such a short form is to observe the result of multiplying 1+↵ + ...↵ by 1+↵ :

t t 1 t 1 1+↵ 1+↵ + ...↵ =1+↵ + ...+ ↵ t t+1 2t 1 + ↵ + ↵ ...↵ 2t 1 =1+↵ + ...+ ↵

In a single multiplication, we are able to double the number of terms of the power series that we evaluate. An easy induction then gives that when t =2k, the sum of the first t terms has a succinct representation with k terms:

2k 1 2 4 2k 1 1+↵ + ...↵ =(1+↵) 1+↵ 1+↵ ... 1+↵ ⇣ ⌘ This gives an approach for evaluating the first t terms of the series from Equation 1.1 in O(log t) matrix-vector multiplications. However, here our analogy between matrices and scalars breaks down: the square of a may be very dense. Furthermore, directly computing the square of a matrix requires , for which the current best algorithm by Vassilevska- Williams [Wil12] takes about O(n2.3727) time. Our algorithms in Chapter 3 give a way to replace these dense matrices with sparse ones. We first show that solving any graph Laplacian can be reduced to solving a linear system I such A that:

1. The power series given in Equation 1.1 converges to a constant error after a moderate (e.g. poly (n)) number of terms.

2. Each term in the factorized form of the power series involving has a sparse equivalent A with small error.

3. The error caused by replacing these terms with their approximations accumulate in a con- trollable way.

The third requirement due to the fact that each step of the repeated squaring requires approx- imation. Our approximation of 4 is based on approximating the square of our approximation to A 2. It is further complicated by the fact that matrices, unlike scalars, are non-commutative. Our A algorithm handles these issues by a modified factorization that’s able to handle the errors incurred by successive matrix approximations. This leads to a slightly more complicated algebraic form that doubles the number of matrix-vector multiplications. It multiplies by the matrix I + both A before handing a vector off to routines involving the approximation to 2, and after receiving back A an answer from it. An example of a call structure involving 3 levels is shown in Figure 1.4 . When viewed as a recursive algorithm, this is known as a V-cycle algorithm in the numerical analysis literature. One further complication arises due to the difference between scalars and matrices. The non- commutativity of approximation matrices with original ones means that additional steps are needed

13 to ensure symmetry of operators. When moving from terms involving A to A2, it performs matrix- vector multiplications involving I + A both before and after evaluating the problem involving A2 (or more accurately its approximation).

m

m

m

Figure 1.4: Call structure of a 3-level V-cycle algorithm. Each level makes 1 calls to a problem of comparable size. A small number of levels leads to a fast algorithm.

In terms of total number of operations performed, this algorithm is more expensive than the W-cycle algorithm shown in Chapter 2. However, a k-level V-cycle algorithm only has a O(k) se- quential dependency between the calls. As each call only performs matrix-vector multiplications, this algorithm is highly parallel. The V-cycle call structure is also present in the tool of choice used in numerical analysis for solving SDD linear systems: multigrid algorithms (see e.g. [BHM00]). This connection suggests that further improvements in speed, or theoretical justifications for multi- grid algorithms for SDD linear systems may be possible.

1.6 Matrices and Similarity The overview in Section 1.5 used the one-dimensional special case of scalars as illustrative exam- ples. Measuring similarity between n-dimensional matrices and vectors is more intricate. We need to bound the convergence of power series involving matrices similar to the one involving scalars in Equation 1.1. Also, as we’re working with vectors with multiple entries, converging towards a solution vector also needs to be formalized. In this section we give some background on ways of extending from scalars to matrices/vectors. All the results stated in this section are known tools for analyzing spectral approximations, and some of the proofs are deferred to Appendix A. We will distinguish between different types notationally as well: scalars will be denoted using regu- lar lower case characters, matrices by capitalized bold symbols, and vectors by bold lower case characters.

1.6.1 Spectral Decomposition

Spectral graph theory is partly named after the study of the spectral decomposition of the graph Laplacian LG. Such decompositions exist for any symmetric matrix A. It expresses A as a sum

14 of orthogonal matrices corresponding to its eigenspaces. These spaces are defined by - value/vectors pairs, (1, u1) ...(n, un) such that Aui = iui. Furthermore, we may assume that the vectors u1, u2 ...un are orthonormal, that is:

T 1 if i = j u uj = i 0 otherwise ⇢ The decomposition of A can then be written as:

n T A = iuiui i=1 X The orthogonality of the eigenspaces means that many functions involving A such as multipli- cation and inversion acts independently on the eigenspaces. When A is full rank, its inverse can be written as:

n 1 1 T A = uiu i i=1 i X and when it’s not, this decomposition allows us to define the pseudoinverse. By rearranging the eigenvalues, we may assume that the only zero eigenvalues are 1 ...n where r is the rank of the matrix (in case of the graph Laplacian of a connected graph, r = n 1). The pseudoinverse is then given by:

n 1 T A† = uiu i i=n r+1 i X This decomposition can also be extended to . Applying a polynomial p(x)= 1 2 d 2 d a0 + a1x + a2x + ...+ adx to A leads to the matrix a0 + a1A + a2A + ...+ adA . When viewed using spectral decompositions, this matrix can also be viewed as applying the polynomial toP each of the eigenvalues. P

Lemma 1.6.1 If p(x) is a polynomial, then:

n T p(A)= p(i)uiui i=1 X

1 Therefore, if p(i) approaches for all non-zero eigenvalues i, then p( ) approaches † i A A as well. This is the main motivation for the analogy to scalars given in Section 1.5. It means that polynomials that work well for scalars, such as Equation 1.1, can be used on matrices whose eigenvalues satisfy the same condition.

15 1.6.2 Similarities Between Matrices and Algorithms

The way we treated p (A) as a matrix is a snapshot of a powerful way for analyzing numerical al- gorithms. It was introduced by Spielman and Teng [ST06] to analyze the recursive preconditioning framework. In general, many algorithms acting on b can be viewed as applying a series of linear transformations to b. These transformations can in turn be viewed as matrix-vector multiplications, with each operation corresponding to a matrix. For example, adding the second entry of a length 2 vector b to the first corresponds to evaluating:

11 b 01  and swapping the two entries of a length 2 vector b corresponds to evaluating:

01 b 10  A sequence of these operations can in turn be written as multiplying b by a series of matrices. For example, swapping the two entries of b, then adding the second to the first can be written as:

11 01 b 01 10   In the absence of round-off errors, matrix products are associative. As a result, we may con- ceptually multiply these matrices together to a single matrix, which in turn represents our entire algorithm. For example, the two above operations is equivalent to evaluating:

11 b 10  Most of our algorithms acting on an input b can be viewed as symmetric linear operators on b. Such operators can in turn be represented by (possibly dense) matrices, which we’ll denote using Z with various subscripts. Even though such an algorithm may not explicitly compute Z, running it with b as input has the same effect as computing Zb. This allows us to directly bound the spectral approximation ratios between Z and A†. Measuring the similarity between two linear operators represented as matrices is more com- plicated. The situation above with p (A) and A† is made simpler by the fact that their spectral decompositions have the same eigenvectors. This additional structure is absent in most of our matrix-approximation bounds. However, if we are comparing A against the zero matrix, any spec- tral decomposition of A can also be used to give the zero matrix by setting all eigenvalues to 0. This gives a generalization of non-negativity to matrices known as positive semi-definiteness.

Definition 1.6.2 (Positive Semi-definiteness) A matrix A is positive semi-definite if all its eigen- values are non-negative.

16 This definition extends to pairs of matrices A, B in the same way as scalars. The matrix general- ization of , , can be defined as follows:  Definition 1.6.3 (Partial Ordering Among Matrices) A partial ordering can be defined on matrices by letting A B if B A is positive semi-definite. We can now formally state the guarantees of the power series given in Equation 1.1 when all eigenvalues of A are in a good range. The following proof is a simplified version of Richardson iteration, more details on it can be found in Chapter 5.1. of [Axe94].

Lemma 1.6.4 If L is an n n matrix with m non-zero entries where all eigenvalues are between 1 3 ⇥ 2 and 2 . Then for any ✏>0, there is an algorithm SOLVEL (b) that under exact arithmetic is a symmetric linear operator in b and runs in O (m log (1/✏)) time. The matrix realizing this operator, Z, satisfies:

(1 ✏) L† Z (1 + ✏) L† Proof Our algorithm is to evaluate the first N = O (log (1/✏)) terms of the power series from Equation 1.1. If we denote this polynomial as pN , then we have Z = pN (L). Its running time follows from Horner’s rule (see e.g. Chapter 2 of [CSRL01]) and the cost of making one matrix- vector multiplication in I L. We now show that Z spectrally approximates L†. Let a spectral decomposition of L be:

n T L = iuiui i=1 X Lemma 1.6.1 gives that applying pN to each eigenvalue also gives a spectral decomposition to Z:

n T Z = pn (i) uiui i=1 X Furthermore, the calculations that we performed in Section 1.5, specifically Equation 1.4 gives that 1 pn (i) is a 1 ✏ approximation to : ± i 1 1 (1 ✏) pn (i) (1 + ✏) i   i This in turn implies the upper bound on Z:

n n 1 T T (1 + ✏) L† Z = (1 + ✏) uiu pN (i) uiu i i i=1 i i=1 Xn X 1 T = (1 + ✏) pN (i) uiu i i=1 i X ✓ ◆

17 This is a spectral decomposition of (1 + ✏) L† Z with all eigenvalues non-negative, therefore Z (1 + ✏) L†. The LHS also follows similarity. ⌅ Showing bounds of this form is our primary method of proving convergence of algorithms in this thesis. Most of our analyses are simpler than the above proof as they invoke known results on iterative methods as black-boxes. Proofs of these results on iterative methods are given in Appendix D for completeness. The manipulations involving that we make mostly rely on facts that extends readily from scalars. We will make repeated use of the following two facts in our algebraic manipulations.

Fact 1.6.5 If A, B are n n positive semi-definite matrices such that A B, then: ⇥ 1. For any scalar ↵ 0, ↵A ↵B. 2. If A0 and B0 are also n n positive semi-definite matrices such that A0 B0, then A + A0 ⇥ B + B0.

The composition of under multiplication is more delicate. Unlike scalars, A B and V being positive semi-definite do not imply AV BV. Instead, we will only use compositions involving multiplications by matrices on both sides as follows:

Lemma 1.6.6 (Composition of Spectral Ordering) For any matrices V, A, B, if A B, then T T V AV V BV 1.6.3 Error in Vectors Viewing algorithms as linear operators and analyzing the corresponding matrices relies crucially on the associativity of the operations performed. This property does not hold under round-off errors: multiplying x by y and then dividing by y may result in an answer that’s different than x. As a result, guarantees for numerical algorithms are often more commonly stated in terms of the between the output and the exact answer. Our main results stated in Section 1.3 are also in this form. Below we will formalize these measures of distance, and state the their close connections to the operator approximations mentioned above. One of the simplest ways of measuring the distance between two vectors x and x¯ is by their Euclidean distance. Since distances are measured based on the difference between x and x¯, x x¯, we may assume x¯ = 0 without loss of generality. The Euclidean norm of x is pxT x, and can also be written as pxT Ix where I is the identity matrix. Replacing this matrix by A leads to matrix norms, which equals to xT Ax. Matrix norms are useful partly because they measure distances in ways aligned to the eigenspaces of A. Replacing A by its spectral decomposition A gives:

T T T x Ax = ix ui uix i X T 2 = i x u i X 18 When 0, this is analogous to the Euclidean norm with different weights on entries. Therefore, i positive semi-definite matrices such as SDD matrices leads to matrix norms. The A-norm distance between two vectors x and y can then be written as:

x x¯ = (x x¯)T A (x x¯) k kA q When A is positive semi-definite, the guarantees for algorithms for solving linear systems in A are often stated in terms of the A-norm distance between x and x¯. We can show that the operator based guarantees described above in Section 1.6.2 is a stronger condition than small distances in the A-norm.

Lemma 1.6.7 If A is a symmetric positive semi-definite matrix and B is a matrix such that:

(1 ✏) A† B (1 + ✏) A† then for any vector b = Ax¯, x = Bb satisfies:

x x¯ ✏ x¯ k kA  k kA We can also show that any solver algorithm with error < 1 can be improved into a 1 ✏ one by ± iterating on it a small number of times. A variant of Richardson iteration similar to the one used in the proof of Lemma 1.6.4 leads to the following black-box reduction to constant factor error:

Lemma 1.6.8 If A is a positive semi-definite matrix and SOLVEA is a routine such that for any vector b = Ax¯, SOLVEA (b) returns a vector x such that: 1 x x¯ x¯ k kA  2 k kA

Then for any ✏, there is an algorithm SOLVEA0 such that any vector b = Ax¯, SOLVEA0 (b) returns a vector x such that:

x x¯ ✏ x¯ k kA  k kA by making O (log (1/✏)) calls to SOLVEA0 and matrix-vector multiplications to A. Furthermore, SOLVEA0 is stable under round-off errors. This means that obtaining an algorithm whose corresponding linear operator is a constant factor approximation to the inverse suffices for a highly efficient solver. This is our main approach for analyzing the convergence of solver algorithms in Chapters 2 and 3.

1.6.4 Graph Similarity The running times and guarantees of our algorithms rely on bounds on spectral similarities between matrices. There are several ways of generating spectrally close graphs, and a survey of them can

19 be found in an article by Batson et al. [BSST13]. The conceptually simplest algorithm for spectral sparsification is by Spielman and Srivastava [SS08]. This algorithm follows the sampling scheme shown in Algorithm 1, which was first introduced in the construction of cut sparsifiers by Benczur and Karger [BK96]

Algorithm 1 Graph Sampling Scheme

SAMPLE

Input: Graph G =(V,E,w), non-negative weight ⌧˜e for each edge, error parameter ✏. Output: Graph H0 =(V,E0, w0).

2 1: for O( ⌧˜ log n✏ ) iterations do 2: Pick an edge with proportional to its weight P 3: Add a rescaled copy of the edge to the sampled graph 4: end for

This algorithm can be viewed as picking a number of edges from a distribution. These samples are rescaled so that in expectation we obtain the same graph. Spielman and Srivastava [SS08] showed the that for spectral approximations, the right probability to sample edges is weight times effective resistance. Recall that the effective resistance of an edge e = uv is the voltage needed to pass 1 unit of current from u to v. Formally the product of weight and effective resistance, ⌧e can be defined as:

T ⌧ e = wuvuvLG† uv (1.6)

Where L† is the pseudo-inverse of LG and is the vector with 1 at vertex u and 1 at vertex G uv v. It was shown in [SS08] that if ⌧˜ are set close to ⌧ , the resulting graph H is a sparsifier for G with constant probability. This proof relied on matrix-Chernoff bounds shown independently by Rudelson and Vershynin [RV07], Ahlswede and Winter [AW02], and improved by Tropp [Tro12]. Such bounds also play crucial roles in randomized numerical algorithms, where the values ⌧e described above are known as statistical leverage scores [Mah11]. Koutis et al. [KMP10] observed that if ⌧˜ upper bounds ⌧ entry-wise, H will still be a spar- sifier. Subsequent improvements and simplifications of the proof have also been made by Ver- shynin [Ver09] and Harvey [Har11]. To our knowledge, the strongest guarantees on this sampling process in the graph setting is due to Kelner and Levin [KL11] and can be stated as follows:

Theorem 1.6.9 For any constant c, there exist a setting of constants in the sampling scheme shown in Algorithm 1 such that for any weighted graph G =(V,E,w), upper bounds ⌧˜ ⌧ , and 2 error parameter ✏, the graph H given by the sampling process has O ( e ⌧˜e log n✏ ) edges and satisfies: P

(1 ✏)LG LH (1 + ✏)LG c with probability at least 1 n 20 A proof of this theorem, as well as extensions that simplify the presentation in Chapter 2 are given in Appendix B. The (low) failure rate of this algorithm and its variants makes most of our c solver algorithms probabilistic. In general, we will refer to a success probability of 1 n for arbitrary c as “high probability”. As most of our algorithms rely on such sampling routines, we will often state with high probability instead of bounding explicitly. A key consequences of this sparsification algorithm can be obtained by combining it with Foster’s theorem [Fos53].

Lemma 1.6.10 For any graph, the sum of weight times effective resistance over all edges is exactly n 1, ⌧ = n 1. e e P Combining Lemma 1.6.10 with Theorem 1.6.9 gives that for any graph G and any ✏>0, 2 there exists a graph H with O (n log n✏ ) edges such that (1 ✏)LG LH (1 + ✏)LG. In other words, any dense graph has a sparse spectral equivalent. Obtained spectrally similar graphs using upper bounds of these probabilities is a key part of our algorithms in Chapters 2 and 3. The need to compute such upper bounds is also the main motivation for the embedding algorithms in Chapter 4.

21 22 Chapter 2

Nearly O(m log n) Time Solver

In this chapter we present sequential algorithms for solving SDD linear systems based on combi- natorial preconditioning. Our algorithms can be viewed as both a speedup and simplification of the algorithm by Spielman and Teng [ST06]. One of the key insights in this framework is that finding similar graphs with fewer edges alone is sufficient for a fast algorithm. Specifically, if the number of edges is n plus a lower order term, then the graph is ‘tree like’. Repeatedly performing simple reduction steps along the tree reduces the graph to one whose size is proportional to the number of off-tree edges. This need for edge reduction will also be the main driving goal behind our presentation. We will formalize the use of these smaller graphs, known as ultra-sparsifiers in Section 2.1, then describe our algorithms in the rest of this chapter. To simplify presentation, we first present the guarantees of our algorithm in the model where all arithmetic operations are exact. The numerical stability of the algorithm is then analyzed in the fixed point arithmetic model in Section 2.6. The main components of these algorithms appeared in two papers joint with Ioannis Koutis and Gary Miller [KMP10, KMP11], while the improvements in Section 2.5 is joint with Gary Miller and unpublished. Our result in the strongest form is as follows:

Theorem 2.0.1 Let L be an n n graph Laplacian with m non-zero entries. We can preprocess ⇥ L in O (m log n log log n) time so that, with high probability, for any vector b = Lx¯, we can find in O (m log n log(1/✏)) time a vector x satisfying x x¯ <✏ x¯ . Furthermore, if fixed point k kL k kL arithmetic is used with the input given in unit lengthed words, O(1) sized words suffices for the the output as well as all intermediate calculations.

When combined with the nearly-optimal construction of ultra-sparsifiers by Kolla et al. [KMST10], our approach in Section 2.5 also implies solver algorithms that runs in O mplog n log (1/✏) time for any vector b. We omit this extension as it requires quadratic preprocessing time using existing techniques.

2.1 Reductions in Graph Size Our starting point is an iterative method with improved parameters. Given graphs G and H and a parameter  such that LG LH LG, a recurrence similar to Equation 1.1 allows us to solve a 23 linear system involving LG to constant accuracy by solving O() ones involving LH . This iteration count can be improved using Chebyshev iteration to O (p). Like Richardson iteration, it is well- known in the numerical analysis literature [Saa03, Axe94]. The guarantees of Chebyshev iteration under exact arithmetic can be stated as:

Lemma 2.1.1 (Preconditioned Chebyshev Iteration) There exists an algorithm PRECONCHEBY such that for any symmetric positive semi-definite matrices A, B, and  where

A B A any error tolerance 0 <✏ 1/2, PRECONCHEBY(A, B, b,✏) in the exact arithmetic model is a  symmetric linear operator on b such that:

1. If Z is the matrix realizing this operator, then (1 ✏)A† Z (1 + ✏)A† 2. For any vector b, PRECONCHEBY(A, B, b,✏) takes N = O (p log (1/✏)) iterations, each consisting of a matrix-vector multiplication by A, a solve involving B, and a constant number of vector operations.

A proof of this Lemma is given in Appendix D.2 for completeness. Preconditioned Chebyshev iteration will serve as the main driver of the algorithms in this chapter. It allows us to transform the graph into a similar one which is sparse enough to be reduced by combinatorial transformations. These transformations are equivalent to performing Gaussian elimination on vertices of degrees 1 and 2. In these limited settings however, this process has natural combinatorial interpretations. We begin by describing a notion of reducibility that’s a combinatorial interpretations of partial Cholesky factorization. This interpretation is necessary for some of the more intricate analyses involving multiple elimination steps.

Definition 2.1.2 A graph G with mG edges is reducible to a graph H with mH edges if given any solver algorithm SOLVEH corresponding to a symmetric linear operator on b such that:

• For any vector b, SOLVEH (b) runs in time T

• The matrix realizing this linear operator, ZH satisfies: L† ZH L† H H

We can create in O(1) time a solver algorithm for LG, SOLVEG that is a symmetric linear operator on b such that:

• For any vector b, SOLVEG (b) runs in time + O (mG mH ) T

• The matrix realizing this linear operator, ZG satisfies: L† ZG L† G G We will reduce graphs using the two rules below. They follow from standard analyses of partial Cholesky factorization, and are proven in Appendix C for completeness.

24 Lemma 2.1.3 Let G be a graph where vertex u has only one neighbor v. H is reducible to a graph H that’s the same as G with vertex u and edge uv removed.

Lemma 2.1.4 Let G be a graph where vertex u has only two neighbors v1, v2 with edge weights wuv1 and wuv2 respectively. G is reducible a graph H that’s the same as G except with vertex u and edges uv1, uv2 removed and weight

wuv1 wuv2 1 = 1 1 wuv1 + wuv2 + wuv1 wuv2 added to the edge between v1 and v2.

These two reduction steps can be applied exhaustively at a cost proportional to the size reduc- tion. However, to simplify presentation, we will perform these reductions only along a (spanning) tree. It leads to a guarantee similar to the more aggressive version: applying these tree reductions greedily on graph with n vertices and m edges results in one with O(m n) vertices and edges. We will term this routine GREEDYELIMINATION.

Lemma 2.1.5 (Greedy Elimination) Given a graph G along with a spanning tree TG, there is a routine GREEDYELIMINATION that performs the two elimination steps in Lemmas 2.1.3 and 2.1.4 on vertices that are incident to no off-tree edges and have degree at most 2. If G has n vertices and m edges, GREEDYELIMINATION(G, TG) returns in O(m) time a graph H containing at most 7(m n) vertices and edges along such that G is reducible to H, along with a spanning tree T H of H formed by

Proof Note that each of the two operations affect a constant number of vertices, and removes the corresponding low degree vertex. Therefore, one way to implement this algorithm in O(m) time is to track the degrees of tree nodes, and maintain an event queue of low degree vertices.

We now bound the size of H. Let the number of off-tree edges be m0, and V 0 be the set of vertices incident to some off tree edge, then V 0 2m0. The fact that none of the operations can | | be applied means that all vertices in V (H) V 0 have degree at least 3. So the number of edges in \ H is at least: 3 (n V 0 )+m0 2 h | | On the other hand, both the two reduction operations in Lemmas 2.1.3 and 2.1.4 preserves the fact that T is a tree. So there can be at most n 1 tree edges in H, which means the total number of H edges is at most:

n 1+m0 n + m0 H  H

25 Combining these two bounds gives: 3 (n V 0 )+m0 n + m0 2 H | |  H 3n 3 V 0 2n H | | H n 3 V 0 6m0 H  | |

Since the tree in H has n 1 edges, the number of edges can in turn be bounded by n + m0 H H  7m0. ⌅ Therefore significant savings in running times can be obtained if G is tree-like. This motivated the definition of ultra-sparsifiers, which are central to the solver framework by Spielman and Teng [ST06]. The following is equivalent to Definition 1.1 of [ST06], but is graph theoretic and makes the tree-like structure more explicit.

Definition 2.1.6 (Ultra-Sparsifiers) Given a graph G with n vertices and m edges, a (, h)-ultra- sparsifier H is a graph on the same set of vertices such that:

hm • H consists of a spanning tree T along with  off-tree edges.

• LG LH LG.

Spielman and Teng showed that fast algorithms for constructing (, h)-ultra-sparsifiers leads to solvers for SDD linear systems that run in O (mh log (1/✏)) time. At a high level this is because Lemma 2.1.5 allows us to reduce solving a linear system on the ultra-sparsifier to one on a graph hm of size O(  ) edges, while Lemma 2.6.3 puts the iteration count at . So the running time for solving a system with m edges, (m) follows a recurrence of the form: T hm (m) O p m + T  T  ✓ ✓ ◆◆ Which is optimized by setting  to about h2, and solves to (m)=O (hm). T We will first show a construction of (,O(log2 n))-ultra-sparsifiers in Section 2.2, which by themselves imply a solver that runs in about m5/4 time. In Section 2.3, we describe the recursive preconditioner framework from [ST06] and show that it leads to a O(m log2 n log log n log(1/✏)) time solver. Then we optimize this framework together with our ultra-sparsification algorithm to show an even faster algorithm in Section 2.5. This gives an algorithm for solving graph Laplacians in O(m log n log(1/✏)) time.

2.2 Ultra-Sparsification Algorithm As with the case of generating spectral sparsifiers, sampling by (upper bounds of) effective re- sistance is also a natural approach for generating ultra-sparsifiers. However, to date obtaining high quality estimates of effective resistances in nearly-linear time without using solvers remains

26 an open problem. Our workaround to computing exact effective resistances relies on Rayleigh’s monotonicity law.

Lemma 2.2.1 (Rayleigh’s Monotonicity Law) The effective resistance between two vertices does not decrease when we remove edges from a graph.

This fact has a simple algebraic proof by the inequality (LG + LG )† L† as long as the 0 G null space on both sides are the same. It’s also easy to check it in the extreme case. When edges are removed so that two vertices are disconnected, the effective resistance between them becomes infinite, which is at least any previous values. Our use of Rayleigh’s monotonicity law will avoid extreme case, but just barely. We remove as many edges as possible while keeping the graph connected. In other words, we will use resistive values w.r.t. a single tree as upper bounds for the true effective resistance ⌧e. Since there is only one path connecting the end points of e in the tree, the resistance between its two end points is the sum of the resistive values along this path. An example of this calculation is shown in Figure2.1, where the tree that we calculate the resistive value by is shown in bold.

2 5 4 2

Figure 2.1: The effective resistance RT (e) of the blue off-tree edge in the red tree is 1/4+1/5+ 1/2=0.95. Its stretch strT (e)=weRT (e) is (1/4+1/5+1/2)/(1/2) = 1.9

An edge’s weight times its effective resistance measured according to some tree can also be defined as its stretch 1. This is a central notion in metric embedding literature [AKPW95] and has been extensively studied [Bar96, Bar98, FRT04, EEST08, ABN08, AN12]. As a result, we will use strT (e) as an alternate notation for this product of weight and tree effective resistance. Formally, if the path between the end points of e is (e), the stretch of edge e is: PT

strT (e)=we re0 e (e) 02XPT 1 =we (2.1) we e (e) 0 02XPT

We can also extend this notation to sets of edges, and use strT (EG) to denote this total stretch of all edges in G. Since stretches are upper bounds for sampling probabilities, the guarantees of the sampling procedure from Theorem 1.6.9 can be stated as: 1Formally, it equals to stretch in the graph where the length of an edge is the reciprocal of its weight. We will formalize this in Chapter 4

27 Lemma 2.2.2 There is an algorithm that takes a graph G =(V,EG, w) with n vertices and a spanning tree and returns a graph H with n 1+O (strT (EG ET )logn) edges such that with \ high probability: 1 3 LG LH LG 2 2

Note that the number of off-tree edges retained equals to O(log n) times their total stretch, strT (EG ET ). As the stretch of an edge corresponds to the probability of it being sampled, \ we need a tree where the total stretch to be as low as possible. Due to similar applications of a tree with small total stretch in [ST06], algorithms generating them have received significant interest [EEST08, ABN08, KMP11, AN12]. The current state of the art result is by Abraham and Neiman [AN12]:

Theorem 2.2.3 There is an algorithm LOWSTRETCHTREE such that given a graph G =(V,E,w0), LOWSTRETCHTREEG outputs a spanning tree T of G satisfying e E = O(m log n log log n) in O(m log n log log n) time. 2 P

Combining Lemma 2.2.2 and Theorem 2.2.3 as stated gives H with O(m log2 n log log n) edges, which is more edges than the number of edges in G. Of course, these ‘extra’ edges ex- ist in the form of multi-edges, and the edge count in the resulting graph can still be bounded by m. One way to see why direct sampling by stretch is not fruitful is to consider the case where all edges in the graph have unit weight (and therefore unit resistance). Here the stretch of an edge cannot be less than 1, which means the total estimates that we get is at least m. When combined with the guarantees of the Sampling algorithm from Theorem 1.6.9, this leads to more edges (samples to be precise) than we started with. We remedy this issue by bringing in the parameter that we have yet to use: . Note that so far, the sparsifiers that we generated are off only by a constant factor from G, while the definition of ultra-sparsifiers allow for a difference of up to . We will incorporate this parameter in the simplest way possible: scale up the tree by a factor of  and applying this sampling algorithm. This leads to the algorithm shown in Algorithm 2. The guarantees of this algorithm is as follows:

Algorithm 2 Ultra-sparsification Algorithm

ULTRASPARSIFY Input: Graph G, spanning tree T , approximation factor  Output: Graph H

1: Let G0 be G with T scaled up by a factor of , and T 0 the scaled up version of T 2: Calculate ⌧˜e =min 1, strT (e) for all e/T 0 { 0 } 2 3: return T 0 + SAMPLE(EG ET , ⌧˜) 0 \ 0

28 Theorem 2.2.4 (Ultra-Sparsifier Construction) There exists a setting of constants in ULTRA- SPARSIFY such that given a graph G and a spanning tree T , ULTRASPARSIFY(G, T) returns in 2 strT (G)log n strT (G) O(m log n +  ) time a graph H that is with high probability a ,O m log n -ultra-sparsifier for G. ⇣ ⇣ ⌘⌘

Proof Let G0 be the graph G with T scaled up by a factor of . We first bound the . Since the weight of an edge is increased by at most a factor of , we have G G0 G. Furthermore, because of the scaling, the effective resistance along the tree of each non-tree edge decreased by a factor of . Thus for any off tree edge e, we have: 1 strT (e)= strT (e) 0  Summing over all edges gives: 1 strT (EG ET ) strT (EG ET ) 0 0 \ 0   \

1 Using Lemma 2.2.2, we can obtain a graph H with n vertices and n 1+ O (strT (EG ET )logn)  \ edges such that with high probability: 1 3 LG LH LG 2 0 2 0

Chaining this with the relation between G and G0 gives:

LG LG 2LH 3LG 3LG 0 0 For the running time, we need to first computed the effective resistance of each non-tree edge by the tree. This can be done using the offline LCA algorithm [Tar79], which takes O (m↵ (m)) in the pointer machine model. This is less than the first term in our bound. To perform each step of the sampling process, we map each edge to an interval of [0, 1], and find the corresponding sample in O(log n) time via. binary search. This gives the second term in our bound. ⌅ Using this ultra-sparsification algorithm with  = m1/2 log2 n log log n leads to a solver algo- rithm that runs in about m5/4 time.

Lemma 2.2.5 In the exact floating arithmetic model, given a graph Laplacian for G with n ver- tices and m edges, we can preprocess G in O (m log n log log n) time to obtain a solver algorithm SOLVEG that’s a symmetric linear operator on b such that with high probability:

5/4 • For any vector b, SOLVEG (b) runs in O m log nplog log n time.

1 3 • If Z is a matrix realizing this operator, we have: L† Z L† 2 G 2 G 29 Proof Consider the following two-level scheme: generate a (, h)-ultra-sparsifier H using Lemma 2.2.4, reduce its size using the greedy elimination procedure given in Lemma 2.1.5 to obtain G1, and compute an exact inverse of the smaller system. Since H has n vertices and n 1+ hm edges, the greedy elimination procedure from Lemma 2.1.5 k hm hm 2 gives that G1 has size O( k ) vertices and edges. Therefore, the inverse of G1 has size O k hm ! and can be computed using matrix inversion in O k time [Wil12]. ⇣ ⌘ Also, the condition number of  implies that O(p)iterations are needed to obtain a solver for LG. Each iteration consists of a matrix-vector multiplication involving LG, and a solve involving LH . This gives a total running time of:

hm 2 hm ! O p m + + (2.2)   ✓ ◆ ! ✓ ◆ ! The inner term is optimized when  = hpm, giving a total running time of:

1 5 ! O h 2 m 4 log(1/✏)+m 2 (2.3) ⇣ ⌘ ! 5 The current best bound of ! 2.37 [Wil12] gives 2 < 4 , so the last term can be discarded. 2 ⇡ Substituting in h = O(log n log log n) gives the stated running time. ⌅ 2.3 Recursive Preconditioning and Nearly-Linear Time Solvers As alluded at the end of Section 2.1, the full potential of ultra-sparsifiers are realized when they are applied in a recursive fashion. A main technical issue that needs to be resolved is that each level of the recursion produces an approximate solution. This means that errors need to be propagated up the levels of the recursion. As a result, it’s useful to list the systems involved in all layers of the recursive calls in the form of G0 ...Gd. To construct level i +1of this chain, we first construct a (i,h)-ultra-sparsifier for Gi, Hi+1. Then Gi+1 is obtained by performing greedy elimination on Hi+1. This structure can be formalized via. the definition of a preconditioner chain.

Definition 2.3.1 (Preconditioning Chain) A preconditoining chain for a graph G is a sequence of graphs G = G ,H ,G ,...,G along with size bounds m ,m ,...,m and spectral { 0 1 1 d} { 0 1 d} bounds  , ,..., such that: { 0 1 d}

1. LG LH iLG for all 0 i d 1 and d =1. i i+1 i   2. G = GREEDYELIMINATION(H ) for all i 1. i i

3. The size of Gi is at most mi and md is smaller than a fixed constant.

The extra condition of d =1is to ease the runtime analysis, since the number of matrix- vector multiplies involving Gi also depends on i. The preconditioning chain leads to recursive

30 algorithms for solving linear systems involving Gi. Preconditioned Chebyshev iteration allows us to solve this system by making O(pi) calls to solves involving Hi. This in turn leads to solves involving Gi+1. This leads to the W-cycle algorithm described in Section 1.5.1. Its performance can be described using the parameters of the preconditioning chain as follows:

Lemma 2.3.2 Given a preconditioning chain along with any error parameter ✏, for each i there is an algorithm SOLVEGi such that:

d i0 • For any vector b, SOLVEG (b) runs in time: O mi crecj , where crec is an i i0=i 0 j=i p absolute constant. ⇣P Q ⌘

• SOLVEGi corresponds to a symmetric linear operator on b, and if ZGi is the matrix realizing 1 3 this operator, it satisfies: L† ZG L† 2 Gi i 2 Gi

Proof The proof is by induction backwards on the levels i. The case where i = d follows by the assumption that md is smaller than a constant. Suppose the result is true for i +1, then we have an algorithm SOLVEGi+1 that’s a symmetric linear operator and its corresponding matrix ZGi+1 satisfies: 1 3 L† ZG L† 2 Gi+1 i+1 2 Gi+1

As Gi+1 was obtained by greedy elimination from Hi+1, Hi+1 is reducible to Gi+1. Therefore we can obtain an algorithm SOLVEHi+1 corresponding to a matrix ZHi+1 such that: 1 3 L† ZH L† 2 Hi+1 i+1 2 Hi+1 2 LH Z† 2LH 3 i+1 Hi+1 i+1

The definition of reducibility in Defintion 2.1.2 gives that computing ZHi+1 b for some vector b takes time that’s O(m m )=O(m ) more than the time of running SOLVE . Invoking the i i+1 i Hi+1 inductive hypothesis gives that SOLVEHi+1 runs in time:

d i0 d i0 O (mi)+ mi pcrecj O(mi)+ mi pcrecj 0  0 j=i+1 j=i+1 i0X=i+1 Y i0X=i+1 Y

As LH satisfies LG LH iLG , Z† also satisfies: i+1 i i+1 i Hi+1 2 LG Z† 2iLG 3 i Hi+1 i

Therefore preconditioned Chebyshev iterations as given by Lemma 2.1.1 with A = LGi and 3 = † OLVE B 2 ZHi+1 gives us a routine S Gi meeting the required conditions. Its iteration count is

31 2 2 O( 3 i)=O(pi). Furthermore, each solve involving B is equivalent to computing 3 ZHi+1 b0 for some vector 0. This is in turn equivalent to calling SOLVE ( 0). Substituting the running q b LHi+1 b time this routine gives a total running time of:

d i0

O (pi) O(mi)+ mi0 pcrecj j=i+1 ! i0X=i+1 Y d i0 = miO(pi)+O(pi) mi pcrecj · 0 j=i+1 i0X=i+1 Y

By choosing crec to be larger than the absolute constants and rearranging gives the total runtime. ⌅ This recursive framework was envisioned with ultra-sparsifiers in mind. Therefore, our ultra- sparsification algorithm from Theorem 2.2.4 can readily substituted into it. By picking a uniform  at each level and rerunning the ultra-sparsification algorithm, we obtain the following result first shown in [KMP10].

Lemma 2.3.3 In the exact arithmetic model, given a graph G with n vertices, m edges, and cor- responding graph Laplacian LG. For any vector b = LGx¯ and and error ✏>0, we can find in O(m log2 n log log n log(1/✏)) time a vector x such that with high probability:

x x¯ ✏ x¯ k kA  k kA

Algorithm 3 Simple Preconditioning Chain Construction

BUILDCHAINSIMPLE Input: Graph G. Output: Preconditioning chain.

1: G0 := G 2: d =0 3: Pick appropriate constants c and cstop. 4: while md >cstop do

5: TGd LOWSTRETCHTREE(Gd) 4 2 6:  c log n log log n d d d 7: H ULTRASPARSIFY(G ,T , ) d+1 d Gd d 8: G GREEDYELIMINATION(H ) d+1 d+1 9: d d +1

10: end while 11: return G0,H1,G1,...Hd,Gd , 0 ...d 1, 1 { } { }

Proof We will generate a preconditioning chain by calling the ultra-sparsifier on Gi and per- forming greedy elimination until Gi is of constant size. of this algorithm is shown

32 in Algorithm 3. Since we only made O(log n) calls to ULTRASPARSIFY, applying union bound gives that these calls all succeed with high probability. If we let cu be the constant from the ultra-sparsification algorithm from Theorem 2.2.4 and the low-stretch spanning tree from Theo- rem 2.2.3, the the number of off-tree edges in Hi can be bounded by:

2 cu log mi log log mi cu 4 2 mi = 2 mi c log mi log log mi c log mi log log mi

Since Gi+1 is obtained from Hi+1 using greedy elimination, its edge count can be bounded using Lemma 2.1.5:

7cu mi+1 2 mi  c log mi log log mi Applying this inductively gives;

i 1 7c m m u i  0 2 j=0 c1 log nj log log nj Y ✓ ◆

Recall that the total work for SOLVE(G0) given by Lemma 2.3.2 is:

d i

mi pcrecj i=0 j=0 X Y

Invoking the bound on mi on each term allows us to bound them as:

i i 1 i 7cu mi pcrecj m0 pcrecj  2 j=i j=0 c log nj log log nj j=0 Y Y ✓ ◆ Y i 1 i 7c m u c c log4 n log log2 n  0 2 rec j j j=0 c log nj log log nj j=0 Y ✓ ◆ Y q i 1 4 2 7cu crecc log nj log log nj = m c c log4 n log log2 n 0 rec i i 0 q 2 1 j=0 c log nj log log nj q Y i 1 @ A 7c pc = m c c log4 n log log2 n u rec 0 rec i i pc j=0 q Y ✓ ◆ i So if we pick c such that c 100c2 c , the last term can be bounded by 1 Also, since c and c u rec 2 rec are both constants and n n, the first term can be bounded by O(log2 n log log n). So the overall i 

33 work can be bounded by:

d 1 i O(m log2 n log log n) 2 i=0 X ✓ ◆ As the terms are geometrically decreasing, this gives a total of O(m log2 n log log n). The runtime needed to get to an accuracy of ✏ can then be obtained by converting error to vector form as shown in Lemma 1.6.7, and reducing error by iterating using Lemma 1.6.8 ⌅

2.4 Reuse, Recycle, and Reduce the Tree We now show that a more holistic examination of the preconditioning chain leads to an even faster algorithm. The key observation is that the previous low stretch spanning tree remains as an extremely good tree in the ultrasparsifier. Here it is helpful to view both ULTRASPARSIFY and GREEDYELIMINATION as returning a tree along with a graph. We first observe that if (H, TH ) is returned by ULTRASPARSIFY(G, TG,), the total stretch of H w.r.t TH is smaller by a factor of , aka. strTG (G)/.

Lemma 2.4.1 If (H, TH )=TREEULTRASPARSIFY(G, TG,), then strTH (H)=strTG (G)/

Proof Let Gi0 be the graph formed by scaling up TG by a factor of  in G. Let the new tree in this graph be TG0 . By definition of stretch we have strT (G0)=strTG (G)/. G0 Now consider the effect of the sampling procedure used to obtain H from G0. Clearly the tree is unchanged, TH = TG0 . When an edge e is picked, its weight is rescaled to its weight divided by the probability of it being sampled, which is in turn proportional to its stretch times log n. So the new weight is:

we we0 = pe we = strTH (e)cs log n we = weRTH (e)cs log n 1 = RTH (e)cs log n So the stretch of the edge when sampled is:

strTH (e0)=we0 RTH (e) 1 = cs log n

34 strTG (G) As the total number of edges sampled is csstrTH (G0)logn = cs log n  , we have strTH (H)= strTG (G)  . ⌅ The other operation that we perform on the graphs is GREEDYELIMINATION. We next show that it also does not increase the total stretch of the off-tree edges.

Lemma 2.4.2 Let (G, TG):=GREEDYELIMINATION(H, HT ), then:

strTG (G)=strTH (H)

Proof It suffices to show that for any off tree edge e, RTH (u, v)=RTG (u, v). Since the path between the endpoints of e, (e), along with e forms a cycle, none of the vertices can be removed PTG by the degree 1 vertex removal rule given by Lemma 2.1.3. Therefore, the only case to consider is the degree 2 vertex removal rule given by Lemma 2.1.4. Here two adjacent edges e1 and e2 on the path are replaced by an edge with weight: 1 1 + 1 we1 we2 But the resistance of the new edge is precisely the sum of resistances of the two edges removed. So the resistance of the path remains unchanged. ⌅ This leads to an algorithm for building an improved chain by reusing the spanning trees from one level to the next. Its pseudocode is given in Algorithm 4.

Algorithm 4 Preconditioning chain Construction

BUILDCHAIN

Input: Graph G, first stage reduction parameter 0, subsequent reduction parameter c. Output: Preconditioning chain.

1: Choose constant cstop 2: G G 0 3: T LOWSTRETCHTREE(G ) G0 0 4: d =0 5: while md >cstop do 6: Set d to c if d>0 7: (H ,T ) ULTRASPARSIFY(G ,T , ) i+1 Hd+1 d Gd d 8: (G ,T ) GREEDYELIMINATION(H ,T ) d+1 Gd+1 d+1 d+1 9: d d +1

10: end while 11: return G0,H1,G1,...Hd,Gd , 0 ...d 1, 1 { } { } By an argument almost analogous to the proof of the O m log2 n log log n log(1/✏) time al- gorithm from Lemma 2.3.3, we can show a running time of O(m log nplog log n log(1/✏). This was the main result shown in [KMP11].

35 2 Lemma 2.4.3 If 0 is set to log n log log n, and i to c for a suitable constant c for all i 1, then for any vector b, SOLVEG0 (b) runs in O(m log nplog log n) time.

Proof We first show that the sizes of mi are geometrically decreasing. Lemmas 2.4.1 and 2.4.2 gives: 1 (G ) (G ) strTGi+1 i+1 strTGi i  i

Also, the size of the ultrasparsifier produced by ULTRASPARSIFY is directly dependent on stretch. If we let c denote the constant in Theorem 2.2.4, we can bound m when i 1 by: s i m 7c log n (G ) i s strTGi 1 i 1  · i 1 1 7cs log n strT (G0)  · G0  j=0 j Y Substituting this into the runtime bound given in Lemma 2.3.2 gives that the total runtime can be bounded by:

d i d i 1 i 1 mi pcrecj O(p0m)+ 7cs log n strT (G0) pcrecj  · G0  i=0 j=0 i=1 j=0 j j=0 X Y X Y Y Since for all i 1,  were set to c, this becomes: i d i 1 1 1 i = O(p0m)+ 7cs log n strT (G0) pcrec0 (pcrecc) · G0  c i=1 0 ✓ ◆ X d i 1 7c c pc log n (G ) s rec strTG0 0 crec = O(p0m)+ · p c 0 i=1 r X ✓ ◆ crec An appropriate choice of constant c makes c less than 1, which in turn makes geometric series crec 2 involving c sum to a constant. Also, our choice of 0 =log n log log n means both O(p0m) 7cs log n strT (G0) and · G0 evaluate to O m log nplog log n , completing the proof. pp0 ⌅ 2.5 Tighter Bound for High Stretch Edges The analysis of the algorithm in the previous section is not tight w.r.t. high stretch edges. This 1 is because edges with stretch more than log n are treated as many samples when we construct our ultrasparsifiers, even though they can be moved without change from G to H. When such bounds are taken into account, we obtain the following slightly improved bound.

36 1 Lemma 2.5.1 For any constant ↵> 2 , there is a choice of c such that a preconditioning chain returned by BUILDCHAIN leads to routine SOLVEG0 with running time:

↵ strT (e) O(log n) m + · log n e ! X ✓ ◆

Proof Similar to the proof of Lemma 2.4.3, the running time of SOLVE(G0) can be bounded by:

d i d i mi pcrecj = pcrec0 m + mi (crecc) 2 · i=0 j=0 i=0 ! X Y X

The difference in our analysis is that mi is decomposed into terms per edge depending on its stretch:

i 1 1 mi = min 1, log nstrT (e)  e ( j=0 j ) X Y i 1 strT (e) 1 = min 1, log n c e ( ) X ✓ ◆ Therefore the second term in the total cost can be rewritten as a summation per edge:

d i d i 1 i strT (e) 1 mi pcrecj = (crecc) 2 min 1, log n c i=0 j=0 e i=1 ( ) X Y X X ✓ ◆ Therefore it suffices to bound this sum for each edge individually. For the rest of this proof, we strT (e) k 1 k will consider a single edge e. Let k be the value such that log n is between c and c . When i k +1, we can upper bound the term contained in min by 1, and obtain:  {·} k+1 i 1 k+1 i strT (e) 1 i (c c) 2 min 1, (c c) 2 rec log n c  rec i=1 ( ✓ ◆ ) i=1 X X k+2 1 (crecc) 2 (crecc) 2 = 1 (crecc) 2 1 k+2 (c c) 2 Assuming c c 4  rec rec

37 When i>k+1, we have:

i 1 i 1 strT (e) 1 1 ck log n c  c ✓ ◆ ✓ ◆ i k 1 1 = c ✓ ◆ And the summation from i = k +1to d gives:

d i 1 d i k 1 i strT (e) 1 i 1 (crecc) 2 min 1, (crecc) 2 ( log n c )  c i=Xk+1 ✓ ◆ i=Xk+1 ✓ ◆ d k 1 i/2 k+1 crec =(c c) 2 rec c i=0 X ⇣ ⌘ Once again, by choosing c>4c , we have crec 1 , so the geometric series can be bounded by rec c  4 O(1). Adding both of these two terms together gives:

d i 1 i strT (e) 1 k+2 (c c) 2 min 1, O (c c) 2 rec log n c  rec i=1 ( ) X ✓ ◆ ⇣ ⌘ 2 1 ↵ 2 For an appropriately large choice of c (e.g. c crec ), we have: 1 (c c) 2 c↵ rec  So we can in turn bound the work associated with e by:

k+2 O (c c) 2 O c(k+2)↵ rec  ↵ ⇣ ⌘ strT(e) O  log n ✓✓ ◆ ◆

k 1 strT (e) Where the last line follows from the assumption that c and c being a constant. Sum-  log n ming over all edges then gives the total bound. ⌅ This lemma shows that stretch under a different norm is more closely related to runtime of recursive methods using Chebyshev iteration. In Section 4.4, specifically Claim 4.4.1, we sketch O(m log n log log n) time algorithms for the construction of trees where the second parameter is O(m):

Claim 4.4.1 Let ↵ be any constant such that 0 ↵<1. Given weighted graph G =(V,E,w),  38 we can find in O (m log n log log n) time a spanning tree T such that:

↵ strT (e) O(m) log n  e X ✓ ◆ This leads to preconditioning chains on which recursive preconditioning runs in O (m log n log (1/✏)) time, and gives the result stated in Theorem 2.0.1 in the exact arithmetic model.

2.6 Stability Under Fixed Point Arithmetic We now show that our algorithm is stable under fixed-point arithmetic. Various models of round- off errors are possible, and a discussion of such errors can be found in most books on numerical analysis (e.g. Chapter 2 of [Hig02]). The model that we work under is fixed-point, which is more restricted than the more realistic floating-point model. A number is then stored by keeping a bounded number of digits both before and after the point. This amount of digits, which we assume is more than the number of digits that values of the input are given in, is also referred to as a word in the unit-cost RAM model. Round-off errors in this model occur when the lower digits are truncated after a computation. This can be viewed as perturbing the results of computations additively by ✏m, which is known as the . To simplify our analysis, we will assume such perturbations are worst-case. Storing a fixed number of digits after the decimal point can also be viewed as keeping all numbers involved as integers while keeping a common denominator. The connection between fixed-point and integer arithmetic means that if the length of a number is W , arithmetic in numbers of length O(W ) can be performed in O(1) time. Therefore, convergence using words of length O(1) in some sense makes an algorithm stable under fixed-point arithmetic since its running time only increases by a constant factor from the exact setting. One of the major simplifications enabled by working with fixed point arithmetic is that we can assume that our vectors are always orthogonal to the all ones vector. This is because as a post-processing step, we can set the last entry to negation of the sum of all other entries. Since the exact version of all of our intermediate vectors are also orthogonal to the all ones vector, the extra distortion caused by this can be absorbed by decreasing the machine epsilon by a factor of poly (n). At the bottom level, one of the main sources of round-off errors are the linear-algebraic oper- ations: matrix-vector multiplication, scaling a vector by a constant, and adding vectors. Here it is convenient to represent this error as a vector, which we will denote using err. We will incorporate round-off errors by showing that ✏m < 1/poly (U) suffices for convergence. Since the sizes of all matrices and vectors are bounded by by poly (n) poly (U), we can instead let ✏m be the  bound on the magnitude of err, aka. err ✏m. Furthermore, we will treat all vector scaling k k2  operations as being performed with the exact value of the corresponding coefficient. This is done by computing this coefficient to an accuracy inversely proportional to the maximum entry of the vector times ✏m/poly (U). This represents a constant factor overhead in the amount of digits being tracked.

39 We will show that when edge weights are polynomially bounded, using O(1)-sized words for all intermediate steps suffice for approximate solutions. This is done in two main steps: showing that sufficiently small values of machine epsilon leads to a small cumulative error; and bounding magnitude of all intermediate values using this assumption of small cumulative error. These two steps correspond to bounding the number of digits needed after the decimal point, and in front of it respectively. The main difficulty of the first step is the propagation of errors in our recursive algorithm. The main idea of our proof is to explicitly track all the error terms, and show that they increase by a factor of poly (i) when we move from level i +1to level i. This suffices because the key fact that the product of all i in the preconditioning chain is poly (n). The second step of upper bounding the magnitude of vectors is then done by proceeding down the preconditioning chain similarly. Since our goal is to bound increases in error by poly (i), we do not optimize for the exponent on i in our proofs. In fact, we often leave larger exponents in order to simplify presentation. In the exact arithmetic setting, our analysis treats solver algorithms as symmetric linear opera- tors and analyzed their corresponding matrices. One side-effect of this is that for a fixed precondi- tioning chain (as defined in Definition 2.3.1), recursive preconditioning is viewed as a deterministic process. 2 The introduction of adversarial round-off errors means that running recursive precon- ditioning twice using the same chain on the same input may not even give the same result. As a result, our proofs are based on bounding the deviation from this exact operator. We will make use of the following definition captures both of these error parameters, as well as the running time.

Definition 2.6.1 An algorithm SOLVE is an (✏,✏1, )-solver for a positive semi-definite matrix A A T if there exists a symmetric matrix Z such that:

1. (1 ✏) A† Z (1 + ✏)A† 2. For any vector b = Ax¯, SOLVE (b) produces in time a vector x such that: A T

x Zb ✏1 k kA 

Similar to the setting of exact arithmetic, we will refer to any matrix Z meeting these conditions as a matrix approximating SOLVEA. When b is provided in a normalized form and the additive error ✏1 is significantly less than the minimum non-zero eigenvalue of A† (aka. the inverse of the maximum eigenvalue of A), this also gives convergence in A norm.

1 Lemma 2.6.2 If the maximum eigenvalue of A, is at most ✏✏1 and SOLVEA is an (✏,✏1)-solver for A. Then for a vector b = Ax¯ such that b 1, x = SOLVE (b) satisfies: k k2 A x x¯ 2✏ x¯ k kA  k kA 2Note on the other hand that the preconditioning chains is generated probabilistically, making the overall algorithm probabilistic.

40 Proof

Let Z be a matrix approximating SOLVEA. The triangle inequality allows us to decompose the error into two terms:

x x¯ x Zb + x¯ Zb k kA k kA k kA

The first term can be bounded by ✏1 by the given condition. Lemma 1.6.7 allows us to bound the second term by ✏ x¯ . k kA We now need to lower bound x¯ . Since b is in the column space of A, x¯ = b . The k kA k kA k kA† bound on the maximum eigenvalue of A gives that the minimum non-zero eigenvalue of A† is at 1 1 1 least ✏ ✏1. Combining these gives b ✏ ✏1 b = ✏ ✏1. k kA† k k2 So both terms can be bounded in terms of relative error, giving the overall bound. ⌅ We will use this definition to show that round-off errors accumulate in a controllable way in recursive solvers that use the preconditioner chains given in Definition 2.3.1. There are three sources of these errors: iterative methods, greedy elimination, and matrix-vector operations. For iterative methods, a system in LGi is solved by solving O(pi) systems in LHi+1 . In Appendix D, we show the following bound on Chebyshev iteration with round-off errors:

Lemma 2.6.3 (Preconditioned Chebyshev Iteration) Given a positive semi-definite matrix A with 1 m non-zero entries and all non-zero eigenvalues between and , a positive semi-definite matrix ✏1 B such that A B A, and a (0.2,✏1, )-solver for B, SOLVE . If ✏m < , precon- T B 20 ditioned Chebyshev iteration gives a routine SOLVE that is a (0.1,O(✏ ),O(p(m + )))- A 1 T solver for A. Furthermore, all intermediate values in a call SOLVEA(b) have magnitude at most O(p ( b + ✏1)) and the calls to SOLVE involve vectors b0 such that b0 b + k kA† B k kB† kkA† O (✏1).

To the systems involving LHi+1 are in turn solved by performing greedy elimination on it and recursing on the result. In Appendix C, we prove the following guarantees about the greedy elim- ination process. Note that the weights can fluctuate by a factor of n in a single iteration. For example a unit weight path of length n with endpoints incident to off-tree edges becomes a single 1 d O(log n) edge of weight n . Therefore to avoid factors of n = n , we will track the total resistance of tree edges instead.

Lemma 2.6.4 (Greedy Elimination) Let G be a graph on n vertices with a spanning tree TG such that the minimum resistance of a tree edge is at least rmin 1 and the total resistance is at most rmin  rs 1. If ✏m 3 , then running GREEDYELIMINATION on G and TG in O(m) time to produces  3n rs a graph H with a spanning tree TH such that:

1. The weights of all off tree edges are unchanged.

2. The resistance of an edge in TH is at least rmin and the total is at most 2rs.

41 3. If a tree edge in TH has weight w˜ e and its weight if exact arithmetic is used would be w¯ e, 1 then w¯ e w˜ e w¯ e. 2  

4. Given a routine SOLVE that is a (✏,✏1, )-solver for LH , we can obtain a routine SOLVE LH T LG that is a (2✏, 2✏1, + O(m))-solver for LG, Furthermore, for any vector b, SOLVE makes T LG one call to SOLVELH with a vector b0 such that b0 b +1and all intermediate LH† LG† 3 1 k k kk values are bounded by O(n (rs + rmin )( b +1)). k kLG†

Part 3 means that the preservation of stretch given in Lemma 2.4.2 still holds, albeit with an extra constant. It can be checked that this leads to a constant factor increase in the runtime bounds shown in Lemmas 2.4.3, and 2.5.1, assuming that the preconditioner chain is invoked in the same way. To this end, we use the two Lemmas above to show a version of Lemma 2.3.2 that incorporates round-off errors. To start an inductive argument for bounding round-off errors, we first need to show that edge weights in all layers of the solver chain are polynomially bounded. These weights are modified in three ways:

1. Scaling of the tree in ULTRASPARSIFY

2. Rescaling of sampled off-tree edges sampled in ULTRASPARSIFY.

3. Modification of edges weights by GREEDYELIMINATION.

1 3 1 d 1 Lemma 2.6.5 If all edge weights in G = G0 are in the range [1,U] and ✏m n U (2i) ,  3 i=0 the requirements of Lemma C.2.5 are met in all calls to GREEDYELIMINATION during the con- struction of the solver chain, and the weights of all tree edges in all levels of the preconditioningQ d 2 d 1 chain can be bounded by 2 n ,U i=1 i . h Q i Proof We show by induction down the preconditioner chain that the minimum resistance in i 1 i 1 2 Gi is at least j

42 It remains to bound the weights of off-tree edges. The simplification to ULTRASPARSIFY of keeping instead of sampling high stretch edges ensure that the weights of samples do not decrease. The total edge weight also follows readily from the spectral guarantees obtained.

1 3 1 d 1 Lemma 2.6.6 If all edge weights in G = G0 are in the range [1,U] and ✏m n U (2i) ,  3 i=0 d 2 2 d then with high probability all edge weights in Gi are in the range 2 n ,nU i=0 Q(2i) . h Q i Proof The lower bound on edge weights follows from the lower bound of tree edge weights from Lemma 2.6.5 and the fact that weights of sampled off-tree weights do not decrease. The upper bound for weights of tree edges is also given by Lemma 2.6.5 For the off-tree edges, their total weight which is initially bounded by mU. If Hi+1 is an ultra-sparsifier for Gi, applying the test vector u to LG and LH gives that the weighted degree of each vertex increases by a factor of at most i. Therefore the total weight of off-tree edges increases by a factor of at most i from Gi to Hi+1 with high probability. These edge weights are then left unchanged by the greedy elimination procedure. Hence the total weight of off-tree edges in any Gi can be bounded 2 by n U j

3 1 d 1 Lemma 2.6.7 If all edge weights in G are in the range [1,U] and ✏m n U (2i) ,  i=0 there is a constant crec such that the routine SOLVEGi as given in Lemma 2.3.2 is a 2 i 1 d 2 d j Q 0.1,nU✏m j=0 j20 j=i crecj , j=i mj k=i crecpk -solver for LGi . ⇣ Q Q P Q ⌘ Proof The proof is by induction. For i = d, we can apply known results about the numerical stability of Cholesky factorizations (e.g. Theorem 10.6 in [Hig02]). Since Gd has fixed size and all edge weights in Gd fit into O(1) sized words, we can compute the answer up to round-off error. d Also, since the maximum edge weight of an edge in Gd is U j=0 j, the LGd -norm of this vector 20✏ d  is at most m j=0 j Q For the inductiveQ step, assume that the inductive hypothesis holds for i +1. Then there is a (0.1,✏1, ) solver for LG with: T i+1 i d 2 ✏1 = U✏m j crecj j=0 j=i+1 Y Y d j = m T j j=i X Yk=i Since G is obtained from H by greedy elimination, Lemma C.2.5 gives a routine SOLVE i+1 i+1 LHi+1 that is a (0.2, 2✏1,O(mi)+ )-solver for LH . T i+1 43 It remains to bound the effect of using preconditioned Chebyshev iteration obtain a solver for using SOLVE . The bounds on edge weights in G given in Lemma 2.6.6 means LGi LHi+1 i d 3 2 d that the non-zero eigenvalues of LGi+1 are in the range 2 n ,nU i=0 (2i) Therefore = 2 d n U i=0 (2i) suffices as the eigenvalue bound neededh for Lemma 2.6.3.Q It cani also be checked ✏1 that our bound on ✏m satisfies ✏m . Since LG LH LG , preconditioned Chebyshev Q  20 i i+1 i iteration as stated in Lemma 2.6.3 gives SOLVEL that is (0.1,O(i✏1),O(pi(mi + )))-solver Gi T for LG . By choosing crec to be larger than any of the constants in the O( ), these parameters can i · be simplified to:

i d 2 creci✏1 = creciU✏m j crecj j=0 j=i+1 Y Y i 1 d 2 = creciU✏m j crecj j=0 j=i Y Y d j c p (m + )=c p ) m + m c p rec i i T rec i i j rec k j=i+1 ! X Yk=i d j

= mj crecpk j=i X Yk=i Hence the inductive hypothesis holds for i as well. ⌅ It remains to show that none of the values encountered in the intermediate steps are too large. These values can be upper bounded using the assumption that overall error is not too large.

3 1 d 1 2 Lemma 2.6.8 If all edge weights in G = G0 are in the range [1,U] and ✏m n U c  ,  i=1 rec i then on input of any vector b, all intermediate values in a call to SOLVEL can be bounded by G0 Q 2 d O n U b + n i=0 creci k kLG† ⇣ ⇣ ⌘ Q ⌘

Proof The choice of ✏m satisfies the requirements of of Lemmas 2.6.3 and 2.1.5. It also implies that O( ✏ ) 1 in all the invocations of preconditioned Chebyshev iteration. Therefore the vector i 1  passed to SOLVE and SOLVE all satisfy , + d + n b LGi LHi b L† b L† b L† b L† k k Gi k k Hi k k G k k G The bounds of intermediate values then follows from the bounds on edge weights given in 2 d d 2 Lemma 2.6.6 and Lemma 2.6.3 with = n U i=0(2i) and Lemma 2.1.5 with rs 2 n and 1 2 d  r n U (2 ). Also, the bounds on intermediate values of the last level follows from min  i=0 i Q known results on Cholesky factorization [Hig02] and the bounds on the input vectors. ⌅ Q 2 As 1 = O(log n) and i are constants for all i 2, their product is poly (n). Substituting these values shows that the algorithm is stable when ✏m is bounded by 1/poly (U). This allows us to bound the numerical precision of the algorithm from Theorem 2.0.1.

44 1 Proof of Theorem 2.0.1: Let the word size be log U. That is, n, m, ✏ U, and all entries in b  and weights in L are entries between 1 and U. d Note that since 0 n, d = O (log n) and i are constant for all i>1, i is poly (U).  i=0 1 2 d d Therefore Lemma 2.6.7 gives that setting ✏m = poly (U ) leads to a O 0.2, 0Q.2U , i mi j=i crecpj - solver to . L ⇣ P Q ⌘ This choice of ✏m also meets the requirement of Lemma 2.6.8. Hence, the maximum magnitude of a vector encountered is:

d 2 n U + n c  = (U) † b L† rec j poly b LG k k Gi k k j=0 ⇣ ⌘ Y

Since all entries of b are at most U, b 2 pnU. As the graph is connected and all weights are k k  1 3/2 † at least 1, the minimum eigenvalue of LGi is at least n2 . So b L n U = poly (U). Hence, k k Gi  it suffices to keeping O(1) words both before and after the decimal point. Therefore, O(1) sized words suffice and the total running time of the solver follows from

Lemma 2.5.1 and Claim 4.4.1. Furthermore, the maximum eigenvalue of LG0 can also be bounded by nU = U 2. Lemma 2.6.2 gives that the output x also satisfies: x x¯ 0.4 x¯ . k kLG0  k kLG0 This meets the requirements of the Richardson iteration based error reduction routine given in Lemma 1.6.8. Hence, iterating O (log (1/✏)) times gives the lower error guarantees. ⌅

45 46 Chapter 3

Polylog Depth, Nearly-Linear Work Solvers

The algorithm shown in Chapter 2 is a routine for solving SDD linear systems whose operation count, or running time, is close to the size of the input. Fundamental linear algebraic primitives such as matrix-vector multiplication and Gaussian elimination are inherently parallel. This means that the speed of many linear algebraic algorithms can also be increased by dividing the work among multiple processors or machines. Therefore, a natural question to ask is whether such parallel speedups are also possible for nearly-linear time SDD linear system solvers. Since the number of processors is independent of most other parameters, parallelism can be quantified in a variety of ways. One way to measure the efficiency of a parallel algorithm is by its depth and work. The number of operations performed by the algorithm, which is equivalent to the sequential running time of the algorithm, is the work performed. On a single processor, this is equivalent to the sequential runtime of the algorithm. The other parameter, depth, measures the longest chain of sequential dependencies in the algorithm. This is known as the parallel time of the algorithm, and is equivalent to the running time of the algorithm when there is an unbounded number of processors. In this terminology, matrix-vector multiplication involving a matrix with m non-zero entries has O(log n) depth and O(m) work; and Gaussian elimination can be parallelized to O(log2 n) depth and O(n3) work. Works in parallel algorithms has led to many routines that run in polylog depth, which is also the case for the two routines described above. This in part led to the definition of the complexity class , which contains algorithms that run in poly (log n) depth and poly (n) work. However, NC the parallel speedup is also limited by number of processors, which rarely approaches the size of the problem. As a result, it is often more important for parallel algorithms to be work-efficient, or nearly so. That is, the work performed by the algorithm is similar to sequential algorithms, or is within a small factor from it. A more in depth discussion of the various models of parallelism can be found in surveys such as [Ble96]. A close examination of the recursive preconditioning framework from [ST06] shows that it can be parallelized. Each recursive call performs a constant number of matrix-vector multiplications, and an appropriate setting of constants can reduce the number of calls to about pm. To our knowl- edge, this was first formally observed in the solver for planar graphs by Koutis and Miller [KM07].

47 Their algorithm runs in O(n log(1/✏)) work and O(n1/6+↵ log(1/✏)) depth 1. Subsequently, a sim- ilar observation was made for general graphs by Blelloch et al. [BGK+13], leading to a bound of O(m logO(1) n log(1/✏)) work and O(m1/3+↵ log(1/✏)) depth. Both of these results focused on efficiently constructing the recursive solver chain in parallel, and parallelized the solver itself in a direct manner. One additional modification was needed in these two algorithms to obtain a depth less than the more direct pm bound mentioned above. Instead of terminating the recursion at constant sized problems, they invert the system directly whenever the cost is low enough. The higher costs of these steps are then offset by the slower growth of the recursive tree. The nearly m1/3 depth of the Blelloch et al. algorithm [BGK+13] comes from applying Gaussian elimination. Specifically, an n n linear system can be preprocessed in O (poly (log n)) depth and O(n3) work to create an ⇥ inverse that can be evaluated in O (poly (log n)) depth and O(n2) work. The speedup to almost n1/6depth by Koutis and Miller [KM07] relies on the existence of sparse representations of inverses when the graph is planar. The nested dissection algorithm by Lipton et al. [LRT79] computes in O (poly (log n)) depth and O(n3/2) work an inverse that can be evaluated in O (poly (log n)) depth and O(n log n) work. In terms of algorithmic cost, these sparse representations of inverses are essentially polylog depth, nearly-linear work solver algorithms. Empirically, the gains obtained by using them are often more pronounced. Only a small number of matrix-vector multiplications are needed to apply such an inverse, while more intricate algorithms such as recursive preconditioning have larger constant factors. As a result, sparse representations of inverses, or even dense inverses, are widely used as base cases of recursions in solver algorithms. Therefore, a sparse representation of the inverse would also be of value even if the preprocessing cost is higher. One of the difficulties in obtaining a polylog depth, nearly-linear work algorithm is the need for an approach different than recursive preconditioning with ultra-sparsifiers. This approach obtains a nearly-linear running time by generating a chain of increasingly sparser graphs. The optimality of Chebyshev iteration means that direct parallelizations are likely to result in poly (n) depth. Recent works by Kelner et al. [KOSZ13] and Lee and Sidford [LS13] replaced the recursive calls with data structures, leading to two-layer algorithms. However, their algorithms still use subgraphs as preconditioners, and as stated have even higher sequential dependencies of ⌦(m). The call structure of recursive preconditioning is known in numerical analysis as the W-cycle. The widely used multigrid methods on the other hand use a V-cycle call structure. The main dis- tinction between these two call structures is that each recursive level in a W-cycle algorithms makes several calls to the next, while in the V-cycle each level makes one call to the next. If the number of levels is small, and each call only makes a small number of matrix-vector multiplications, the parallelism of a V-cycle algorithm is evident. Our main result in this chapter is a V-cycle, polylog depth, and nearly-linear work parallel solver for SDD :

Theorem 3.0.1 Given an n n graph Laplacian L with m non-zero entries such that the ratio be- ⇥ tween maximum and minimum edge weights is U. In the concurrent read, concurrent write (CRCW)

1On planar graphs, m is bounded by O(n)

48 parallel RAM (PRAM) model with exact arithmetic, we can preprocess L in O(log3(nU)log7/2 n+ log3(nU)loglog(nU)log2 n) depth and O(m(log4(nU)log9/2 n +log5(nU)log3 n)) work such that with high probability for any b = Lx¯ and ✏, we can return in O(log (nU)logn log (1/✏)) depth and O(m log3 (nU)logn log (1/✏)) work a vector x such that x x¯ ✏ x¯ . k kL  k kL

3.1 Overview of Our Approach Our algorithm aims to evaluate a factorization of a simplified version of the Richardson iteration. Algebraically it can also be viewed as combining fast exponentiation with sparsification, and this view is discussed in Subsection 1.5.2. On the other hand, a combinatorial interpretation is helpful to understand the interaction of our algorithm with the various approximations that we introduce in intermediate steps. For simplicity, consider SDD matrices of the form I where has the following special A A structure:

1. Symmetric i, j : = 8 Aij Aji 2. All entries are positive, i, j : 0. 8 Aij 3. Each row and column sum to less than 1: i : < 1. 8 j Aij P It’s easy to check that I belongs to the class of SDD matrices. More specifically it is a graph A Laplacian plus a diagonal matrix so that the diagonal entries are 1. The general version of our algorithm will work with SDDM matrices, which are graph Laplacians plus non-zero diagonal matrices. It can be shown that graph Laplacians, and in turn SDD matrices, can be transformed to SDDM matrices with similar properties. Furthermore, in Section 3.3 we will show that such matrices can be normalized to ones with algebraic properties similar to I . A This special form allows us to interpret as the transition matrix of a random walk. If we are A at vertex i, we will move to vertex j with probability . Therefore, if the current probabilities of Aij being at vertices are given by the vector b, the probabilities after one step is:

b A Since the total probability of moving to other vertices is less than 1, this walk also terminates with non-zero probability at each step. Therefore, the long-term behavior of such a walk from any starting distribution is zero. We can obtain a non-trivial long term behavior by re-injecting probabilities back into the vertices at each step. Specifically, if at each step we inject probabilities given by the vector b into all vertices, the stationary distribution is:

b + b + 2b ... A A Since ib 0 when i , this stationary distribution is well defined. Therefore, we may also A ! !1 49 let it be x and solve for it, giving:

x = x + b A (I ) x = b A 1 x (I ) b A Therefore, computing the stationary distribution of this random walk is equivalent to solving the SDD linear system I . In other words, one way to solve the system I is to simulate the A A random walk given by a large number of steps. This can be viewed as a combinatorial view of A the power series given in Equation 1.1, and has the same convergence properties. If the condition number, or ratio between maximum and minimum eigenvalues, of I is , then evaluating O() A steps of the random walk suffices to give a good approximation to the answer. One way to reduce the number of steps that needs to be evaluated is the observation that inject- ing b at all vertices every step is equivalent to injecting b1 = b + b every other step. In between A these steps, two steps of the random walk can be taken. This is equivalent to a single step of the random walk computed by 2. Therefore, the stationary distribution can also be computed by a A 2 random walk with transition given by 1 = and injects b1 every step. A A The cost of computing b1 is linear in the number of entries in . Therefore, this allows us to 2 A reduce solving (I )x = b to solving a system in I = I 1. Because 1 corresponds A A A A to the two-step transition probabilities, it also has the same properties that has: symmetric, non- A negative, and all rows/columns sum to less than 1. Furthermore, we can show that the condition number of is half that of . This means that repeating this for O (log ) iterations leads to a A1 A well-conditioned matrix which can easily be solved in parallel. This is the main structure of our iterative method. The next obstacle in this approach is that = 2 could be a dense matrix Even for the star A1 A with vertices 2 ...nconnected to vertex 1, 2 is equivalent to the complete graph. As a result, we A will need to work sparse equivalents of . Here the observation that 2 also corresponds to the A1 A probability of random walks is crucial. It implies that I 1 is symmetric diagonally dominant, A ˜ and therefore has sparse equivalents. Solving such a sparse equivalent, e.g. I 1 would in turn A lead us to similar solutions. Although this line of reasoning is intuitive, it has one major technical issue: the approximation ˜ I 1 may no longer commute with I+ . Since the input to the recursive solve is b1 =(I ) b, A A 1 A ˜ we cannot bound the error of directly replacing (I 1) with I 1 . Instead, the technical A A 1 components of our algorithm relies on the following different expression⇣ ⌘ of (I ) : A

Fact 3.1.1

1 1 2 1 (I ) = I +(I + ) I (I + ) A 2 A A A ⇣ ⌘ This expression can be viewed as a symmetrized version of the two-step random walk shown

50 above. We will formalize its interaction with a chain of SDD matrices, each a sparsifier of the next, in Section 3.2. We will also show how to reduce from general SDD matrices to ones with properties similar to these random walk transition matrices. This establishes the existence of a polylog depth, nearly-linear work solver chain, or in other words as sparse inverse representation. However, as 2 may be a dense n n matrix, the construction of such solver chains may still take A ⇥ quadratic work. The rest of this chapter then focuses on obtaining spectral sparsifiers efficiently in parallel. In Section 3.3, we find a crude sparsifier of I whose size is only a factor of log n larger than . A A This allows us to invoke parallelizations of existing sparsification algorithms. and obtain a polylog depth, nearly-linear work algorithm. We then give an alternate algorithm with better guarantees for constructing these sparsifiers in Section 3.4. It is based on reducing to smaller graphs using methods similar to the construction of graph sparsifiers by Koutis et al. [KLP12]. This approach in turn relies on low stretch subgraphs and low diameter decompositions, which we present in Chapter 4.

3.2 Parallel Solver We now describe our solver algorithm in full detail. Instead of requiring to have all rows/columns A sum to less than 1, we instead work under the condition that all its eigenvalues are strictly between 1 and 1. We first formalize the definition of our sequence of approximations to I 2 as a A parallel solver chain:

Definition 3.2.1 (Parallel Solver Chain) A parallel solver chain is a sequence of matrices 0, 1 ... d 1 A A A and a set of error bounds ✏0 ...✏d 1 < such that: 10 1. Each is symmetric, and have all entries non-negative. Ai 2 2. For each i

Fact 3.1.1

1 1 2 1 (I ) = I +(I + ) I (I + ) A 2 A A A ⇣ ⌘ Proof Since commutes with I, we have: A I 2 =(I )(I + ) A A A 51 1 1 1 inverting both sides using the fact that (AB) = B A allows us to rewrite the inverse term as:

2 1 1 1 I =(I + ) (I ) A A A Substituting this in the right hand side of the required identity gives:

1 2 1 1 1 1 I +(1+ ) I (I + ) = I +(I + )(I + ) (I ) (I + ) 2 A A A 2 A A A A ⇣ ⌘ 1 1 = I +(I ) (I + ) 2 A A

1 Replacing I with I with (I ) (I ) and collecting factors accordingly then gives: A A

1 2 1 1 1 1 I +(I + ) I (I + ) = (I ) (I )+(I ) (I + ) 2 A A A 2 A A A A 1 ⇣ ⌘ =(I ) A ⌅ Alternatively, this can be proven by diagonalizing , and prove the equation for diagonal A matrices, where it in turn suffices to show it for scalars. It is algebraically identical to the proof above, but gives a more intuitive reason why identities involving scalars 1 and ↵ also hold for I and . The guarantees of our algorithm can be described by the following lemma: A

Lemma 3.2.2 Given a parallel solver chain with d levels where i has nnz ( i) n non-zero A A entries. There is a routine SOLVEI 0 that’s a symmetric linear operator in b such that: A

d • For any vector b, SOLVEI 0 (b) takes O (d log n) depth and O nnz( i) work. A i=0 A ⇣ ⌘ P d 1 • If ZI 0 is the symmetric matrix such that ZI 0 (b)=SOLVEI 0 (b), then: (1 + A A A i=0 1 1 d 1 1 1 ✏i) (I 0) ZI 0 5 (1 ✏i) (I 0) A A i=0 A Q Q Proof The proof is by induction. For the base case of d =0, all eigenvalues of 0 are in the 2 2 1 5 A range , . This in turn gives that all eigenvalues of I 0 are in the range , and we have: 3 3 A 3 3 ⇥ ⇤ 1 1 ⇥ ⇤ (I 0) I (I 0) 5 A 3 A 1 1 (I 0) 3I 5(I 0) A A

Which means that the choice of ZI 0 =3I suffices. A For the inductive case, suppose the result is true for all parallel solver chains of length d 1. Note that ... forms such a chain for . Therefore by the inductive hypothesis we have an A1 Ad A1 52 operator ZI 1 such that: A d 1 d 1 1 1 1 1 (1 + ✏i) (I 1) ZI 1 5 (1 ✏i) (I 1) A A A i=1 i=1 Y Y

In order to construct a solver for I 0, we invoke the factorization given in Fact 3.1.1, 1 A 2 and replace I 0 with ZI 1 . This leads to an algorithm SOLVEI 0 that evaluates the A A A following linear operator: 1 ZI 0 = (I +(I + 0) ZI 1 (I + 0)) A 2 A A A

SOLVEI 0 b then requires two matrix-vector multiplications involving I + 1, vector operations, A A and one call to SOLVEI 1 (b). By the assumption that nnz ( 0) n, these operations can be A A done in O (log n) depth and O (nnz ( 0)) work. Incorporating the cost of running SOLVEI 1 A A given by the inductive hypothesis leads to the required depth/work bound. It remains to bound the error. Here for simplicity we show the lower bound, and the upper bound follows similarly. The guarantees of gives A1 2 I 1 (1 + ✏0) I A A0 1 1 1 2 (1 + ✏0) I 0 (I 1) A A Combining this with the inductive hypothesis gives:

d 1 2 (1 + ✏0) I 0 ZI 1 A A The I+ 1 terms on both sides of ZI 1 allows us to compose this bound by invoking Lemma 1.6.6: A A d 1 1 1 2 (1 + ✏i) (I + 0) I 0 (I + 0) (I + 0) ZI 1 (I + 0) A A A A A A i=0 Y 1 1 Since each (1 + ✏) 1, (1 + ✏) I I as well. Adding these to both sides and dividing by two  then gives:

d 1 1 1 1 1 2 (1 + ✏i) I +(I + 0) I 0 (I + 0) (I +(I + 0) ZI 1 (I + 0)) 2 A A A 2 A A A i=0 Y ⇣ ⌘

d 1 1 By Fact 3.1.1, the LHS above is equal to (1 + ✏i) (I 0) while the RHS is ZI 0 . Ap- i=0 A A plying a similar argument for the upper bound then gives that the inductive hypothesis holds for d Q as well. ⌅

53 This shows that parallel solver chains readily lead to parallel algorithms with depth similar to the depth of the solver chain. We can also show that when ✏ is sufficiently small, the spectrum of improves rapidly. This in turn implies that after a small number of levels, meets the Ai Ai conditions required for the last level. We first need to show that the range of eigenvalues do not change much under spectral approximations.

Lemma 3.2.3 If A is a positive semi-definite matrix with eigenvalues in the range [min,max] and B is a matrix such that (1 ✏)A B (1 + ✏)A. Then all eigenvalues of B are in the range [(1 ✏) , (1 + ✏) ] min max

Proof The upper bound on eigenvalue implies that A maxI. Therefore B (1 + ✏)A (1 + ✏)maxI, and the maximum eigenvalue of B can be bounded by (1 + ✏)max. The bound on the minimum eigenvalue of B follows similarly. ⌅ We can now analyze the effect of descending one step down the parallel solver chain, and show that the eigenvalues of the next level improves by a constant factor.

Lemma 3.2.4 If for some ✏i < 1/10 we have:

2 (1 ✏i) I (I i+1) (1 + ✏i)(I i) Ai A A and has all eigenvalues in the range [ 1+, 1 ]: Then all eigenvalues of are in the Ai Ai+1 range:

• 1 , 1 4 if 1 . 10 3  3 • ⇥ 1 , 2 otherwise.⇤ 10 3

⇥ ⇤ 2 Proof Since all the eigenvalues of i are in the range [ 1+, 1 ]. the eigenvalues of 2 A A are in the range [0, (1 ) ]. When < 1 , we can use the fact that (1 t)2 1 5 t for all t 1 to reduce the upper bound 3  3  3 to: 5 (1 )2 1  3

2 5 Therefore all eigenvalues of I i are in the range 3 , 1 . Combining this with the approximation 2 A factor between I i and I i+1 and Lemma 3.2.3 gives that all eigenvalues of I i+1 are A5 A ⇥ ⇤ A in the range (1 ✏) 3 , 1+✏ . By the assumption of ✏<1/10, this can in turn be bounded by 3 11 , . Subtracting off the I then gives the bound on the eigenvalues of i+1. 2 10 ⇥ ⇤ A 1 2 2 4 ⇥ When⇤ 3 , 1 3 . So the eigenvalues of i are in the range 0, 9 , and the ones for 2 5 A I i are in the range 9 , 1 . Incorporating approximation bounds gives that the eigenvalues of A 5 11 1 5 1 ⇥1 ⇤ I i+1 are in the range (1 ✏) , . Since ✏< , (1 ✏) > . So all eigenvalues of A ⇥ ⇤ 9 10 10 9 2 3 are between 1 , 2 . Ai+1 10 3 ⇥ ⇤ ⇥ ⇤ 54 ⌅ Applying this inductively gives that the depth of the chain can be bounded logarithmically in the condition number of . A Corollary 3.2.5 Given any matrix with eigenvalues in the range 1+ 1 , 1 1 . A parallel A   solver chain for I can be terminated at depth d = O (log ). A ⇥ ⇤

Proof Applying Lemma 3.2.4 inductively gives that when i 1, the eigenvalues of i are in i A the range 1 , max 2 , 1 4 1 . 10 3 3  h n oi 1 2 For any , we have that once i>log 4  the above bound becomes , . This satisfies the 3 10 3 requirement of the last level of the parallel solver chain. So d = O (log ) suffices. ⇥ ⇤ ⌅ This gives a good bound on the depth of running parallel solver chains. It also means that as long as the size of each is reasonably small, the total amount of work is small. We next show Ai how to construct parallel solver chains for matrices that that are closely related to graph Laplacians.

3.3 Nearly-Linear Sized Parallel Solver Chains In the overview in Section 3.1, the matrix corresponds to the transition probability matrix of the A random walk. For a weighted graph, such a matrix can be defined by taking edges with probabilities proportional to their weights. If the adjacency matrix is A, then the weighted degree of j is dj = Aij ij and the probability of moving from j to i is . Therefore, the more general version of the i A Djj 1 Prandom walk transition matrix is AD where D is the diagonal matrix with weighted degrees. It is possible to analyze our algorithm using this transition matrix, but as it is not symmetric, the connection to matrix norms is not as direct. We will instead work with a more friendly form of this matrix known as the normalized Laplacian. It is obtained by a applying a similarity transform 1/2 1/2 1/2 involving D , giving D AD . In our analysis it is easier to keep I full rank. As A a result, we also need to use a slight generalizations of graph Laplacians, SDDM and SDDM0 matrices.

Definition 3.3.1 A symmetric matrix M is SDDM0 if:

• All off-diagonal entries non-positive, Mij 0 for all i = j.  6

• All row/column sums non-negative, aka. Mii j=i Mij. 6 P SDDM0 matrices are the subclass of SDD matrices where all off-diagonal entries are non- positive. We can further restrict ourselves to SDDM matrices, which have the additional restriction of being positive definite. Our analysis readily extends to SDDM0 matrices when the null space is known. However, this restriction to full rank matrices significantly simplifies our analysis. We will reduce SDD matrices to SDDM matrices by first reducing to graph Laplacians, then increasing the diagonal slightly.

55 The decomposition of a graph Laplacian into diagonal minus adjacency matrix also extends to SDDM matrices. As the 2-step random walk can have self loops, we will allow A to have diagonal entries as well. This leads to a variety of possible ways of splitting M into D A. In the numerical analysis literature, these splittings are closely related to choosing D as a preconditioner and are well studied (see Chapter 6 of [Axe94]). We will instead use them to normalize A and create 1/2 1/2 = D AD . We also need to generate a sparse equivalent of the two step transition matrix: A 2 1/2 1/2 1/2 1/2 I = I D AD D AD A 1/2 1 1/2 = D D AD A D The key fact about this two-step transition matrix is:

1 Fact 3.3.2 If M is a SDDM matrix with splitting M = D A, then Mˆ = D AD A is also a SDDM matrix.

1 Proof It is clear that Mˆ is symmetric, and all entries in AD A are non-negative. Therefore 1 therefore it suffices to check that Mˆ is strictly diagonally dominant. Let Aˆ denote AD A. Since both A and D are non-negative, the entries of Aˆ are also non-negative. Consider the sum of the off-diagonal entries in row i.

1 ˆ Aij = AD A ij j j X X 1 = AikDkk Akj j X Xk 1 = AikDkk Akj j ! Xk X Aik  Xk Dii  For the row/column where the diagonal entry is strictly dominant, the last inequality holds strictly as well. The existence of at least one such row/column in M means that Mˆ is also positive definite. ⌅ Due to the close connection between Mˆ and graph Laplacians, it has a sparse equivalent. Fur- thermore, we can show that a (scaled) version of this sparse equivalent also has a splitting that involves D.

Lemma 3.3.3 If Mˆ is a SDDM matrix with a splitting into D Aˆ and M˜ is another SDDM matrix such that (1 ✏)Mˆ M˜ Mˆ . Then M˜ has a splitting into D A˜ such that all the entries of 1/2 1/2 1/2 1/2 A˜ are non-negative. Furthermore, if I ˆ = D MDˆ and I ˜ = D MD˜ are A A normalizations corresponding to these splittings, then (1 ✏)(I ˜ ) (I ˜ ) I ˜ . A A A 56 Proof ˜ ˜ ˜ ˆ T ˜ T ˆ Let D be the diagonal of M. Since M M, we have i Mi i Mi for all indicator vectors ˆ ˆ ˜ ˜  ˜ i. Also, as A 0, Mii Dii and D = Mii Dii. Therefore, D D is a non-negative diagonal   matrix, so adding it to the negated off-diagonal entries of M˜ , D˜ M˜ , gives the A˜ that meets our 1/2 requirements. Applying composition of spectral bounds from Lemma 1.6.6 with V = D then gives:

1/2 1/2 1/2 1/2 1/2 1/2 (1 ✏) D MDˆ D MD˜ D MDˆ (1 ✏) I ˆ I ˜ I ˆ A A A ⇣ ⌘ ⌅ Therefore, we can focus on constructing a sparse equivalent of Mˆ . To do so, we need to give parallel algorithms for spectral sparsification. However, some of the existing methods, such as sparsification by effective resistance require access to a solver. Using such methods would require recursing on a smaller graph (while incurring errors), leading to a more intricate algorithm that we will show in Section 3.4. Using combinatorial methods leads to higher polylog factors, but a conceptually simpler overall algorithm. It can be checked that the spectral sparsification algorithm by Spielman and Teng [ST08], as well as subsequent improvements by Andersen et al. [ACL06] and Andersen et al. [AP09] can run in polylog depth and nearly-linear work. The approach of combining spanners of random subgraphs by Kapralov et al. [KP12] also appears to be parallelizable. As our final bounds depend on a different approach with a smaller depth/work bound, we will state these results without explicitly stating the exponent on log n and use logO(1) n instead.

Lemma 3.3.4 Given an n n SDDM matrix M with m non-zero entries and error parame- ⇥ ter ✏, we can find in O logO(1) n depth and O m logO(1) n work a SDDM matrix M˜ with

O(1) 2 O n log n✏ edges⇣ such that⌘ with high probability⇣ we have:⌘ ⇣ ⌘ (1 ✏) M M˜ (1 + ✏) M

However, we cannot directly apply this Lemma to the graph Laplacian obtained from Mˆ since it may have quadratic size. For example, if A corresponds to a star graph with all vertices connected 1 to vertex 1, AD A will have all pairs of vertices connected since they can reach each other via. vertex 1. As a result, we need to first generate a sparse matrix that’s similar to Mˆ . We do so by noting that Mˆ is the sum of a collection of complete graphs weighted by vertices.

1 1 T AD A = Dii Ai Ai (3.1) ⇤ ⇤ i X ˆ This along with Fact 3.3.2 allows us to write M as i Li + D0 where Li corresponds to the graph 1 T with edge weights given by Dii Ai Ai and D0 is a positive diagonal matrix. We can also calculate ⇤ ⇤ P 57 D0 explicitly in ways similar to the proof of Fact 3.3.2.

Although each Li can be dense, the number of vertices involved in it is still bounded by the degree of vertex i. Therefore it suffices to efficiently generate spectral sparsifiers for Li. We will do so using the sparsification routine given in Theorem 1.6.9. However, we also cannot compute all edges and need to sample the edges implicitly from the distribution. Therefore, we need closed forms for the weights and effective resistances of edges in Li.

Lemma 3.3.5 For two neighbors of i, j and k, the weight of the edge between j and k (j = k) in 6 Li is:

AijAik

Dii and the effective resistance between j and k in Li is:

Dii 1 1 + di Aij Aik ✓ ◆

Where di is the total weight of edges incident to i, j=i Aij. 6 P T Proof The edge weights follows from the corresponding term in Ai Ai . It can be checked that ⇤ when j = k, the vector: ⇤ 6 0 if l = j, k 6 Dii l = if l = j x 8 Aij di <> Dii if l = k Aikdi > Is a solution to Lix = ejk. As ejk is orthogonal: to the null space of Li, it suffices to compute the result using this vector:

T + T ejkLi ejk = ejkx

Dii 1 1 = + di Aij Aik ✓ ◆ ⌅ Therefore we can sample this star graph implicitly using sparsification by effective resistance as 2 given in Theorem 1.6.9. This leads a sparsifier for Li whose size is O(log n✏ ) times the number of neighbors of vertex i. Combining these forms the overall sparsifier.

Lemma 3.3.6 Given a SDDM matrix M = D A corresponding to a graph with n vertices and 1 m edges, along with any parameter ✏. Let Mˆ = D AD A. We can construct in O(log n) 2 2 depth O (m log n ✏ ) work a SDDM matrix M˜ corresponding to a graph on n vertices and

58 2 O(m log n✏ ) edges such that:

(1 ✏) M˜ Mˆ (1 + ✏) M˜

Proof We will apply the sparsification procedure given in Theorem 1.6.9 to each Li separately. Let the degrees of the vertices be n1 ...nn respectively. Lemma 3.3.5 gives that the sampling probability of an edge jk in Li is:

AijAik Dii 1 1 Aij + Aik + = Dii · di Aij Aik di ✓ ◆ ✓ ◆ To sample this distribution, we can compute the total probability incident to each vertex j since the only term to remove is k = j. Sampling this distribution gives j, and another sampling step then gives us k. Therefore, we can samples the edges of Li at the cost of O (log n) per edge, and as the samples are taken independently they can be done in parallel. The guarantees of Theorem 1.6.9 ˜ 2 then gives that the resulting graph Laplacian Li has O(ni log n✏ ) edges such that: ˜ ˜ (1 ✏) Li Li (1 + ✏) Li ˆ To put these together, note that M is the sum of Li plus a positive diagonal matrix D0. The fact that i ni =2m gives the bound on the total number of edges. Since we can also compute the degrees of each vertex j in Li explicitly, D0 can be obtained in in O(log n) depth and O(m + n) P ˜ work. Returning i Li + D0 then gives the sparsifier. ⌅ Putting togetherP this crude sparsification step and the sparsifiers given in Lemma 3.3.4 gives an algorithm for constructing the next level of the solver chain from the current one.

Lemma 3.3.7 Given an n n SDDM matrix M with m non-zero entries, splitting M = D A 1/2 ⇥1/2 O(1) along with = D AD , and any error parameter ✏. We can find in O log n depth A O(1) 2 O(1) 2 and O m log n✏ work a SDDM matrix M˜ with O n log n✏ edges.⇣ Furthermore,⌘ M˜ 1/2 1/2 has a splitting⇣ M˜ = D ⌘ A˜ such that with high probability⇣A˜ = D AD˜ ⌘ satisfies: (1 ✏) I 2 I ˜ (1 + ✏) I 2 A A A

1 1/2 1/2 Proof Once again let Aˆ denote AD A and Mˆ = D Aˆ. Note that I = D D Aˆ D . A The crude sparsifier given in lemma 3.3.6 allows us to find in O logO(1) n depth and⇣ O m⌘logO(1) n O(1) 2 ⇣ ⌘ ⇣ ⌘ work M˜ 0 with O n log n✏ non-zero entries such that: ⇣ ⌘ (1 ✏) Mˆ M˜ 0 (1 + ✏) Mˆ

59 The size of M˜ 0 means that we can run the spectral sparsifier given in Lemma 3.3.4 on it in O(1) O(1) 2 O(1) 2 O log n depth and O m log n✏ work. Let the resulting matrix be M˜ 00. It has O n log n✏ non-zero⇣ entries⌘ and satisfies:⇣ ⌘ ⇣ ⌘

(1 ✏) M˜ 0 M˜ 00 (1 + ✏) M˜ 0 2 Combining these identities, and letting M˜ =(1+✏) gives:

(1 ✏)2 Mˆ M˜ 00 (1 + ✏)2 Mˆ 2 2 (1 + ✏) (1 ✏) Mˆ M˜ Mˆ 1 4 The coefficient can be simplified since (1 + ✏) 1 ✏, and (1 ✏) 1 4✏. Replacing ✏ by ✏/4 and applying Lemma 3.3.3 then gives the required condition. ⌅ This allows us to construct the parallel solver chains as given in Definition 3.2.1. It in turn gives a version of our main result, Theorem 3.0.1, with a higher exponent of log n in both depth and work.

Lemma 3.3.8 Given a connected graph Laplacian L with m non-zero entries such that the ratio between maximum and minimum edge weights is U. We can preprocess L in O(log (nU)logO(1) n) depth and O(m log3 (nU)logO(1) n) work such that with high probability for any ✏>0 we have an algorithm SOLVEL that’s a symmetric linear operator in b such that:

• If Z is the symmetric matrix such that Z b = SOLVE (b), then: (1 ✏)L† Z (1+✏)L† L L L L O(1) • Given any vector b, SOLVEL (b) runs in O(log (nU)log n log (1/✏)) depth and O(m log3 (nU)logO(1) n log(1/✏)) work.

We first need to show that a SDDM matrix can be constructed from L by increasing the diagonal slightly. To bound the eigenvalues of the resulting matrix, we need to invoke the Courant-Fischer theorem:

Lemma 3.3.9 (Courant-Fischer Theorem) Let A be an n n symmetric matrix with eigenvalues ⇥ ... . Then the minimum and maximum eigenvalues are given by: 1  2   n xT Ax 1 =min x=0 T 6 x x xT Ax n =max x=0 T 6 x x

Lemma 3.3.10 Given any graph Laplacian L with such that the ratio between maximum and min- imum edge weights is U, we can create a SDDM matrix M such that:

60 1. If ⇧ is the projection onto the range space of L, aka. the space orthogonal to the all 1s vector, then L ⇧M⇧ 2L 2. If M is directly split into D A by taking its diagonal to be D, then all eigenvalues of 1/2 1/2 1 1 = D AD are in the range 1+ 3 , 1 3 . A 2n U 2n U ⇥ ⇤ Proof Without loss of generality we may assume that the edge weights in L to the range [1,U]. 1 1 Since L is connected, the minimum non-zero eigenvalue of L is at least n2 . Therefore, n2 ⇧ L 1 and M = L + n2 I is a good approximation to L after projecting onto its rank space. We now upper bound the eigenvalues of using the Courant-Fischer Theorem given in Lemma 3.3.9. 1/2 A For any vector x, by letting y = D x we have:

xT x yT Ay A = xT x yT Dy

1 Let D0 be the diagonal of the input Laplacian L, aka. D = D0 + n2 I. Since L = D0 A is T T positive semi-definite, we have y Ay y D0y. Also, since the degree of each vertex is at most T T  nU, y D0y nUy y. Combining these gives:  T T y Ay y D0y T T 1 T y Dy  y D0y + n2 y y 1 1  1+ n3U 1 1  2n3U The lower bound on the eigenvalues of follows similarly from the positive-definiteness of the A positive Laplacian D0 + A. ⌅ We also need preconditioned iterative methods to reduce the constant error to ✏. Either pre- ± conditioned Richardson or Chebyshev iteration can suffice here.

Lemma 2.1.1 (Preconditioned Chebyshev Iteration) There exists an algorithm PRECONCHEBY such that for any symmetric positive semi-definite matrices A, B, and  where

A B A any error tolerance 0 <✏ 1/2, PRECONCHEBY(A, B, b,✏) in the exact arithmetic model is a  symmetric linear operator on b such that:

1. If Z is the matrix realizing this operator, then (1 ✏)A† Z (1 + ✏)A† 2. For any vector b, PRECONCHEBY(A, B, b,✏) takes N = O (p log (1/✏)) iterations, each consisting of a matrix-vector multiplication by A, a solve involving B, and a constant number of vector operations.

61 Proof of Lemma 3.3.8: Lemma 3.3.10 gives a SDDM matrix M such that L ⇧M⇧ 2L. Combining the eigenvalue bound on with Corollary 3.2.5 gives that a parallel solver chain for I has depth bounded by A 3 A 1 d = O (log (n U)) = O (log (nU)). We will construct such a chain with ✏i = d for some constant c by invoking Lemma 3.3.7 d times. Applying induction on the sizes of the sparsifiers produced gives that with high probability each O(1) 2 2 O(1) level after the first one has O n log ✏ = O n log (nU)log edges. This in turn means that the construction of each level⇣ can be done⌘ in O⇣ logO(1) n depth and⌘ O m log2 (nU)logO(1) n work, giving a total of O log (nU)logO(1) depth⇣ and O m⌘log3 (nU)logO⇣(1) n work. Also, the⌘ size of the parallel solver⇣ chain can be bounded⌘ by O m⇣log3 (nU)logO(1) ⌘ ⇣ ⌘ This parallel solver chain can then be used by Lemma 3.2.2 to evaluate a linear operator ZI A such that:

d 1 d 1 (1 + ✏) (I ) ZI 5(1 ✏) (I ) A A A and for any vector b, ZI b can be evaluated by running SOLVEI (b) in O (d) depth and work A A 1 d d proportional to the size of the parallel solver chain. Since we set ✏ to , (1 + ✏) and (1 ✏) d can be bounded by constants, giving:

1 1 1 (I ) ZI 15 (I ) 3 A A A

Since M = D1/2 (I ) D1/2, applying Lemma 1.6.6 gives: A

1 1 1/2 1/2 1 M D ZI D 15M 3 A Finally, combining this with the guarantees on L and M gives:

1 1/2 1/2 L† ⇧D ZI D ⇧ 15L† 6 A The error can then be reduced to ✏ using preconditioned Chebyshev iteration as given in Lemma 2.1.1 1/2± 1/2 with A = L and B =15(⇧D ZI D ⇧)†. Matrix-vector multiplications involving A, ⇧ and A D1/2 can be done in O (log n) depth and O (m) work. Therefore, the total depth/work is dominated 3 O(1) by the calls to SOLVEI , each requiring O (d)=O (log (nU)) depth and O m log (nU)log A work. The O(log(1/✏)) iterations required by preconditioned Chebyshev iteration⇣ then gives the⌘ total depth/work. ⌅

62 3.4 Alternate Construction of Parallel Solver Chains We now give a more efficient algorithm for constructing parallel solver chains. It performs the spectral sparsification steps in ways similar to an algorithm by Koutis et al. [KLP12]. Specifi- cally, it uses a solver for a spectrally close matrix to compute the sampling probabilities needed to generate a sparsifier. This spectrally close matrix is in turn obtained using the ultra-sparsification algorithm from Section 2.2, and therefore can be reduced to one of smaller size. Since our goal is a polylog depth algorithm, we cannot perform all sparsification steps by recursing on the ultra-sparsifiers. Instead, we only do so for one matrix in the parallel solver chain, and reduce the solvers involving other matrices to it. This is done using the fact that I + i are A well-conditioned for all i 1. It allows us to apply polynomial approximations to factorizations that are more direct than Fact 3.1.1. The key step in constructing the parallel solver chain is sparsifing a SDDM matrix M. The current state-of-the-art methods for constructing sparsifiers with the fewest edges is to sample by effective resistance. This is described in Theorem 1.6.9, and the main algorithmic difficulty to compute the sampling probabilities. This can be done by approximately solving a number of linear systems involving M [SS08, KL11, KLP12]. When the approximate solver routine is viewed as a black-box, these construction algorithms can be stated as:

Lemma 3.4.1 Given an n n SDDM matrix M with m non-zero entries along with an algorithm ⇥ 1 1 SOLVEM corresponding to a linear operator given by the matrix ZM such that 2 M ZM 1 2M , and any error ✏>0. We can compute using O(log n) parallel calls to SOLVEM plus an 2 overhead of O(log n) depth and O(m log n) work a matrix M˜ with O (n log n✏ ) non-zero entries such that with high probability (1 ✏) M M˜ M.

1 It’s worth mentioning that any constants in the spectral approximation between M and ZM is sufficient, and the ones above are picked for simplicity. Our goal is to use this sparsification routine to generate the parallel solver chain given in Def- inition 3.2.1. We first show that if we’re given an approximate solver for I 1, we can con- 2 A 2 struct an approximate solver for I i . Our starting point is once again the identity 1 ↵ = A 1/2 (1 + ↵)(1 ↵). Its inverse can be symmetrized by moving a factor of (1 + ↵) to the right: 2 1 1/2 1 1/2 1 ↵ =(1+↵) (1 ↵) (1 + ↵) 1/2 We first show that there exists a low degree polynomial that gives a good approximation to (1 + ↵) , the inverse square root of 1+↵.

Lemma 3.4.2 For any >0, there exists a polynomial APXPWR 1 (,) of degree O (1/) such 2 that for all ↵ [ 1/2, 1]: 2 1/2 (1 )(1+↵) APXPWR 1 (↵,) (1 + )(1+↵)  2  63 2 3 1/2 Proof Scaling 1+↵ down by a factor of 3 and multiplying the result by 2 still preserves multiplicative approximations. Therefore it suffices to work under the different assumption of 1 4 2 1/2 1+↵ , , or ↵ . The Taylor expansion of f(↵)=(1+↵) , or McLaurin series of 2 3 3 | | 3 f(↵) is: ⇥ ⇤ 1 1 1 3 1 1 3 5 f(↵)=1+ ↵ + ↵2 + ↵3 + ... 2 2! 2 2 3! 2 2 2 ✓ ◆ ✓ ◆✓ ◆ ✓ ◆✓ ◆✓ ◆ j 1 ( 1) (2j)! = ↵j 2 j j=0 (j!) 4 X j 1 ( 1) 2j = ↵j 4j j j=0 X ✓ ◆

2j 2j j Since j is one of the terms in the expansion o (1 + 1) =4, the magnitude of each coefficient is at most 1. Therefore if we only take the first k terms, the additive error is at most: j 1 x x j = | | (3.2) | | 1 ↵ Xj=k | | 2 Since x 3 , taking the O (log (1/)) terms gives an additive error is at most 0.1. We will let | | 1/2 1/2 this polynomial be APXPWR 1 . Also, since 1+x 2, (1 + ↵) 2 0.1. So this error 2  is a multiplicative error as well. ⌅ Note that in terms of the original ↵, this is a polynomial in 2 (1 + ↵) 1= 2 ↵ 1 . To evaluate 3 3 3 it, we can either recollect the coefficients, or simply incur a constant factor overhead. This bound can be transferred to the matrix setting in a straight-forward manner using diagonalization.

Lemma 3.4.3 For any matrix with all eigenvalues in the range 1 , 1 and any >0, there A 2 exists a polynomial APXPWR 1 (,) of degree O (1/) such that: 2 ⇥ ⇤

2 1 1 2 1 (1 ) I APXPWR 1 ( ,)(I ) APXPWR 1 ( ,) (1 + ) I A 2 A A 2 A A 1 1 Proof Let a diagonalization of be UT ⇤U. Then I 2 = UT I ⇤2 U while the A A inner term becomes: 1 APXPWR 1 ( ,)(I ) APXPWR 1 ( ,) 2 A A 2 A T T 1 T = U APXPWR 1 (⇤,) UU (I ⇤) UU APXPWR 1 (⇤,) U 2 2 The definition of spectral decomposition gives UT U = I, removing these terms then simplifies this

64 to:

T 1 U APXPWR 1 (⇤,)(I ⇤) APXPWR 1 (⇤,) U 2 2 The middle term is a diagonal matrix, which means that the entries are operated on separately by the polynomials. Furthermore, the eigenvalue bounds gives that all entries of ⇤ are in the range 1 2 , 1 . This allows us to apply Lemma 3.4.2 with /3 to obtain APXPWR 1 such that. 2

⇥ ⇤ 2 1 T 1 2 1 (1 ) I ⇤ U APXPWR 1 (⇤,)(I ⇤) APXPWR 1 (⇤,) U (1 + ) I ⇤ 2 2 Applying the composition lemma from Lemma 1.6.6 then gives the result. ⌅ The spectrum requirement precludes us from applying it to , whose spectrum may vary A0 greatly. On the other hand, all the layers after the first in a parallel solver chain have I + i well A conditioned. Applying this lemma recursively gives that a good approximate inverse to I 1 A leads to a good approximate inverse to I i for any i 1. A ✏ Lemma 3.4.4 Let 0 ... i be the first i +1levels of a parallel solver chain with ✏1 ...✏i . A A  4 Suppose SOLVEI 1 is an algorithm corresponding to a symmetric linear operator given by the A 1 1 matrix zI 1 such that min (I 1) ZI 1 max (I 1) . Then for each i 1 we A A A A have an algorithm SOLVEI 2 corresponding to a symmetric linear operator on b such that: Ai i 2 1. If 2 is the matrix such that SOLVE 2 ( )= 2 , then  (1 ✏) ZI i I i b ZI i min I i A i 2 A A A ZI 2 max (1 + ✏) I i . Ai A

2 2. SOLVEI (b) makes one call to SOLVEI 1 plus O (log (1/✏)) matrix-vector multiplica- i A tions involvingA , ... in sequential order. A1 A2 Ai

Proof The proof is by induction on i. Lemma 3.2.4 gives that when i 1, the eigenvalues of i A are in the range [ 1 , 1]. Therefore Lemma 3.4.3 can be applied to all these matrices. If we use an 10 error = ✏/4, we have:

✏ 1 ✏ 1 ✏ ✏ 1 2 2 1 I i APXPWR 1 ( i, )(I i) APXPWR 1 ( i, ) 1+ I 4 A 2 A 4 A 2 A 4 4 A ⇣ ⌘ ⇣ ⌘ The base case of i =1then follows from applying Lemma 1.6.6. For the inductive case, we first show the lower bound as the upper bound follows similarly. The guarantees of the solver chain gives:

1 1 2 (1 ✏i 1)(I i) I i 1 A A ✏ Composing this bound with APXPWR 1 i, 4 and substituting in the above bounds gives: 2 A

✏ 1 ✏ 1 ✏ 2 2 1 (1 ✏i 1) I i APXPWR 1 i, I i 1 APXPWR 1 i, 4 A 2 A 4 A 2 A 4 ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ 65 2 1 ✏ ✏ 2 The fact that ✏i 1 and 1 1 ✏ simplifies the LHS to (1 ✏) I i . The  4 4 A induction hypothesis also gives the following guarantee for 2 : ZI i 1 A

i 1 2 1  (1 ✏) 2 min I i 1 ZI i 1 A A ✏ Composing it with APXPWR 1 i, 4 by Lemma 1.6.6 then gives: 2 A

i 1 ✏ 1 ✏ 2 min (1 ✏) APXPWR 1 i, I i 1 APXPWR 1 i, 2 A 4 A 2 A 4 ✏ ⇣ ⌘ ✏ ⇣ ⌘ APXPWR 1 i, Z 2 APXPWR 1 i, 2 I i 1 2 A 4 A A 4 ⇣ ⌘ ⇣ ⌘ A spectral upper bound can also be obtained similarly. Therefore, it suffices for SOLVE 2 to eval- I i ✏ 2 ✏ A ✏ uate APXPWR 1 i, Z APXPWR 1 i, . Lemma 3.4.2 gives that APXPWR 1 i, 2 4 I i 1 2 4 2 4 A A A 2 A can be evaluated using O(log(1/✏)) matrix-vector multiplications involving i, while ZI i 1 can A A be evaluated by calling SOLVEI i 1 . Combining the work/depth bounds of these operations gives A the inductive hypothesis for i as well. ⌅

This means that an approximate inverse to I 1 suffices for constructing the entire parallel A solver chain.

Lemma 3.4.5 Let I be a normalized n n SDDM matrix with m non-zero entries and all A1 1 ⇥ eigenvalues in the range [  , 2  ], and 1 is the second level of a solver chain with 0 = ,✏0 = 1 A A A and I 1 has m1 non-zero entries. If we’re given a routine SOLVEI corresponding to a 10 A A 9 1 11 1 matrix ZI such that ZI b = SOLVEI (b) and 10 (I ) ZI 10 (I ) . Then A A A A A 1A we can construct the rest of the parallel solver chain so that for all i 1, ✏i = c log  for some 3 constant c and the total size of the parallel solver chain is O(m+m1+n log  log n). Furthermore, the construction makes O(log ) sequential batches of O(log n) parallel calls to SOLVEI 1 plus 2 2 5 3 A an overhead of O(log  log log  log n) depth and O(m1 log  log n + n log  log n) work.

Proof We will show by induction that we can construct the levels of the solver chain along with the corresponding SDDM matrices and splitting Mi = Di Ai. The base case of i =1follows from the first level of the parallel solver chain being given. For the inductive case, suppose we have level i of the chain. Since Corollary 3.2.5 limits the depth of the chain to O(log ), i c0 log  for some fixed constant c0. Therefore an appro-  priate choice of c gives (1 + 4 )i 11 3 and (1 4 )i 9 3 . Applying Lemma 3.4.4 c log  10  2 c log  10 4 then gives SOLVE 2 whose running time consists of one invocation of SOLVEI plus an over- I i A A 2 head of O(log  log log  +logn) depth and O(m1 + i n log  log log  log n) O(m1 + 3 ·  n log  log log  log n) work. Furthermore, SOLVEI 2 (b) evaluates to a symmetric linear opera- Ai tor on b and if the matrix corresponds to this operator is ZI 2 , then the guarantees of Lemma 3.4.4 Ai 66 gives:

3 1 3 1 2 2 I i ZI 2 I i 4 A Ai 2 A Then we can proceed in the same way as our proof of Lemma 3.3.7. Let the SDDM matrix ˆ corresponding to I i be Mi = Di Ai. Let M denote the SDDM matrix corresponding the A ˆ 1 ✏i next level in the exact setting, M = Di AiD Ai. We first invoke Lemma 3.3.6 with error to 4 ˆ ˆ 0 generate a crude sparsifier of M, M . The number of non-zeros in this matrix, mˆ i0 , can be bounded 2 by O(mi log n✏ ).

✏i ˆ ✏i 1 Mˆ M0 1+ Mˆ 4 4 ⇣ ⌘ ⇣ ⌘ 1/2 1/2 Since M = Di (I i)Di , combining this with the guarantees of ZI 2 and Lemma 1.6.6 A Ai gives:

✏ 3 1 ✏ 3 1 i ˆ 0 1/2 1/2 i ˆ 0 1 M Di ZI 2 Di 1+ M 4 4 Ai 4 2 ⇣ ⌘ ⇣ ⌘ 1/2 Since ✏i 1, this means that SOLVEI 2 with multiplications by D before and after suffices as  Ai an approximate solver of Mˆ 0 for the sparsification process provided in Lemma 3.4.1. Using it with ✏i ˆ 0 ˆ error 4 then gives Mi+1 that’s a sparsifier for M , and in turn M. The splitting and normalization are then given by Lemma 3.3.3. Therefore we can construct level i +1of the solver chain as well.

This construction makes O(log n) calls to SOLVE 2 plus an overhead of O(log n) depth and I i A 2 O(mˆ0i log n) work. When i =1, this work overhead is O(mi log  log n), while when i>1, it is O(n log4  log3 n). Combining this with the O(log ) bound on the number of levels and the costs of the calls to SOLVEI 2 then gives the total. ⌅ Ai It remains to obtain this approximate inverse for I 1. We do so by reducing it to a smaller A system that’s spectrally close, and use an approximate inverse for that as an approximate inverse to I 1. To construct a parallel solver chain for that system, we need to in turn recurse on an even A smaller system. This leads to an algorithm that generates a sequence of matrices similar to the sequential preconditioner chain given in Definition 2.3.1, with the main difference being that the spectral conditions between levels are much more relaxed. This is because the solver for I (i+1) (i) A is only used in the construction of the parallel solver chain for I . A Definition 3.4.6 A parallel solver construction chain for is two sequences of matrices (0) = (1) (l) (0) (1) (l 1) A(0) (1) (l) A , ,..., and , ,..., , size bounds m ,m ,...,m , approximation fac- A A A A1 A1 A tor apx, and bound on eigenvalues  such that:

1. For all j

67 3. For all j

OLVE (j) (j) Then we can obtain a routine S I corresponding to a matrix ZI such that: A1 A1 1 1 (j) (j)  (j)   min I 1 ZI apx max I 1 A A1 · A ⇣ ⌘ ⇣ ⌘ OLVE (i) OLVE (i+1) O(log n) and S I involves one call to S I plus an overhead of depth and A1 A O(mi log n) work.

Such a solver construction chain allows us to construct parallel solver chain backwards starting from the last level.

3 Lemma 3.4.7 Given a parallel solver construction chain for , we can obtain in O(papx log  log n+ 3 2 4 2 A 5 3 log  log log  log n) depth and O(m(papx log  log n+log  log n)) work a parallel solver 3 1 1 chain with = , total size O(m log  log n), ✏ = , and ✏ = 3 for all i 1. A0 A 0 5 i c log 

Proof We show by induction backwards on the levels j that we can build a parallel solver chain (j) 3 2 with 0 = , total size O(mi log  log n), and the required errors in O(papx log  log n + 2 A A 2 (j) 4 2 5 3 log  log log  log n) depth and O(m (papx log  log n +log  log n)) work

The base case follows from mi being smaller than a fixed constant. For the inductive case, suppose the result is true for j +1. Then the inductive hypothesis gives parallel solver chain with (j+1) = (j+1) with total size size O(m log3  log n). Lemma 3.2.2 then gives a solver algorithm A0 A SOLVEI (j+1) corresponding to a matrix ZI (j+1) such that A A 1 1 1 (j+1) (j+1) I ZI (j+1) 10 I 2 A A A ⇣ ⌘ ⇣ ⌘ (j+1) 3 and SOLVEI (j+1) (b) runs in O(log  log n) depth and O(m log  log n) work. By the solver A conversion condition given as Part 5 of Definition 3.4.6, this leads a routine SOLVE0 (j) corre- I A1 sponding to a matrix Z0 (j) such that: I A1 1 1 1 (j) (j) I Z0 (j) 10apx I 1 I 1 2 A A1 A ⇣ ⌘ ⇣ ⌘ Preconditioned Chebyshev iteration as stated in Lemma 2.6.3 then allows us improve the quality of O(  ) OLVE (j) this solver to any constant in p apx iterations. This gives a routine S I corresponding A1 68 to a matrix Z0 (j) such that: I A1 1 1 9 (j) 11 (j) (j) I 1 ZI I 1 10 A A1 10 A ⇣ ⌘ ⇣ ⌘ (j) OLVE (j) ( ) O(  log  log n) O(  (m log n+ and for any vector b,S I b runs in p apx depth and p apx A1 (j+1) 3 (j) 3 m log  log n)) O(papxm log  log n) work. This routine meets the requirement of  Lemma 3.4.5, and thus we can construct the rest of the parallel solver chain with (j) = (j). The A0 A total work/depth can then be obtained by combining the number of calls made to SOLVE (j) (b) I 1 and the work/depth overhead. A

Summing over the log n levels of the construction chain and using the fact that mis are geo- metrically decreasing gives the overall bounds. ⌅ Building the parallel solver construction chain is done using the following lemma:

Lemma 3.4.8 (Crude Sparsification) Given any weighted graph G =(V,E,w), and any param- 2 3 eter , we can find in O( log n) depth and O(m log n) work a graph H on the same vertex set with n 1+m edges such that: 2 3 LG LH O log n LG The proof of this lemma requires the construction of low-stretch subgraphs, which in turn relies on parallel low diameter partition schemes. We will give a detailed exposition of this in Chapter 4, specifically Lemma 3.4.8 will be proven in Section 4.3. We also need to reduce LH to one of size O (m) using greedy-elimination. A parallelization of this procedure was given as Lemma 26 in [BGK+13].

Lemma 3.4.9 (Parallel Greedy Elimination) There is a procedure PARGREEDYELIMINATION that given an n n SDDM matrix M = D A corresponding to a graph on n vertices and ⇥ m = n 1+m0 edges, produces with high probability in O(n + m) work and O(log n) depth a factorization:

I 0T PT MP = UT U 0 M0 

Where M0 is a SDDM matrix with at most O(m0) vertices and edges, P is a permutation matrix and U is an upper such that:

• The lower-right n0 n0 block of U corresponding to the entries in M0 is the identity matrix. ⇥ T • The diagonal of M0 is at most the corresponding diagonal entries in P MP.

69 T 1 T • Matrix-vector products with U, U , U and U can be computed in O(log n) depth and O(n) work.

We need an additional bound on the effect on eigenvalues of this greedy elimination procedure.

Lemma 3.4.10 Let M0 be a matrix outputted by greedy elimination procedure given in Lemma 3.4.9 above, and let M0 = D0 A0 be a splitting obtained by taking its diagonal. The minimum eigen- 1/2 1/2 value of the normalized matrix I 0 = D0 M0D0 is at least the minimum eigenvalue of 1/2 1/2 A I = D MD . A

Proof Since eigenvalues are invariant under simultaneous permutations of rows/columns, we may assume during the rest of this proof that P = I. The Courant Fischer theorem given in Lemma 3.3.9 gives that the minimum eigenvalue of I 0 is: A T T x0 (I 0) x0 y0 M0y0 min T A =min x0 x0 x0 y0 y0D0y0

T Similarly, for any vector y, y My gives an upper bound on the minimum eigenvalue of I . yDy A Therefore, it suffices to show that for any vector y0, we can obtain a y with a smaller quotient.

For a vector y0, consider the vector:

1 0 y = U y0  We have: I 0T yT My = yT UT Uy 0 M0 ✓ ◆ T 0 I 0T 0 = y0 0 M0 y0    T = y0 M0y0

Also, since the U is the identity matrix in the lower-right block corresponding to M0, its inverse is the identity in this block as well. Therefore the entries in y corresponding to y are the same. T Since D is non-negative and its entries corresponding to D0 are larger, we have y Dy y0D0y0. Combining these gives:

T T y My y0 M0y0 T T y Dy  y0 D0y0

Therefore the minimum eigenvalue of I 0 is greater or equal to the minimum eigenvalue of A I . A ⌅ 70 These two lemmas allows us to construct the entire sequence of sparsifiers needed to build the parallel solver chain. Since we are incurring spectral distortions of O (poly (log n)) for O (log n) steps, a direct calculation leads to a distortion of logO(log n) n to the spectrum. We can do better by only controlling the minimum eigenvalue and using the fact that all eigenvalues of are between A [ 1, 1].

Lemma 3.4.11 If a SDDM matrix M with splitting D A such that all eigenvalues of I are A at least , then the matrix M0 =(1+) D A satisfies:

1 1. All eigenvalues of 0 are in the range 1+, 1 . A 2 1 2. (I ) I 0 2(I ). ⇥ ⇤ 2 A A A

Proof Normalizing M0 using the splitting (1 + ) D A gives: 1/2 1/2 0 =((1+) D) A ((1 + ) D) A 1 = 1+A

1 1 Since ( 1+) I I, ( 1+) I I. The lower eigenvalue of I 0 is at A 1+ A 1+ A least and the upper eigenvalue is at most 1+ 1 =2 2 since 1. 1+ 1+  2  We now show the relation between I and I 0. The LHS side follows from I A0 A A 1 (I ) and 1 1 . The RHS can be obtained by adding the following two inequalities: 1+ A 1+ 2 1 1 I 0 = (I ) 1+ A 1+ A I A I I I 1+ A

⌅ This method of adding a small identity matrix can also be used to increase the minimum eigen- value of the matrix returned by the crude sparsification algorithm given in Lemma 3.4.8. This process leads to the following size reduction step for constructing the next level of a parallel solver construction chain.

Lemma 3.4.12 Given a SDDM matrix M with splitting D A such that the minimum eigenvalue of 2 3 I is at least . For any parameter we can construct in O(m log n) work and O( log n) A depth. a SDDM matrix M0 = D0 A0 with m vertices and edges such that:

1 1. The minimum eigenvalue of I 0 is at least . A 2 71 2. Suppose SOLVEI is a routine corresponding to a matrix ZI such that: A0 A0 1 1 min (I 0) ZI max (I 0) A A0 A

Then we can obtain a routine SOLVEI corresponding to a matrix ZI such that: A A 1 2 3 1 min (I ) ZI O log n max (I ) A A · A and SOLVEI involves one call to SOLVEI plus an overhead of O(log n) depth and A A0 O(n log n) work.

Proof Let G be the graph corresponding to A, DG its diagonal (aka. weighted degrees of A) and D+ be D DG, the extra diagonal elements in M. Applying Lemma 3.4.8 gives H with n 1+m edges such that

LG LH apxLG 2 3 Where apx = O log n . The eigenvalue bound on and the fact that A D gives ˆ A D M 2D. Therefore if we let M = D+ + LH + apxD, we have: ˆ M M apxM + apxD ˆ So M and M are within a factor of 2apx of each other. We now lower bound the minimum ˆ eigenvalue of I . Since LH apxLG, using the indicator vectors as test vectors gives DH A apxDG So we have: ˆ D apxDG + D+ + apxD 2apxD Since 1  ˆ On the other hand since LH and D+ are both positive semi-definite, we also have apxD M. Hence 1 Dˆ M and the minimum eigenvalue of I ˆ is at least 1 . 2 A 2 We can then obtain M0 by the greedy elimination procedure given in Lemma 3.4.9 on Mˆ . 0 A can be in turn be obtained by splitting M0 directly using its diagonal. The bound on its minimum eigenvalue follows from Lemma 3.4.10. For the conversion of solver guarantees, the factorization given by Lemma 3.4.9 gives:

I 0T PT MPˆ = UT U 0 M0  1 0T ˆ 1 I T T M = PU 1 U P 0 M0  T 1 I 0 T T = PU 1/2 1 1/2 U P 0 D0 (I 0) D0  A

72 1 ˆ Replacing (I 0 ) by ZI 0 gives the following approximate solver for M: A A T 1 I 0 T T ZMˆ = PU 1/2 1/2 U P 0 D0 ZI 0 D0  A T I 0 T 1/2 Applying Lemma 1.6.6 with V = 1/2 U D gives: 0 D0  1 1 ˆ ˆ minM Z ˆ maxM M 1 ˆ 1 ˆ 1 Inverting the spectral bounds between M and M gives M 2apxM 2apxM . Combining 1 1/2 1/2 this with the fact that (I ) = D MD and applying Lemma 1.6.6 once again gives: A 1 1/2 1/2 1 minM 2apxD Z ˆ D 2apx maxM M · Hence one possible linear operator to evaluate is:

1/2 1/2 ZI =2apxD Z ˆ D A M T 1/2 1 I 0 T T 1/2 =2apxD PU 1/2 1/2 U P D 0 D0 ZI 0 D0  A T 1/2 1/2 1 T Note that matrix-vector products involving P, P , D , D0 U , and U can all be evaluated in O(log n) depth and O(n) work. Therefore we can obtain a routine SOLVEMˆ that evaluates the linear operator given by Z ˆ using one call to SOLVEI 0 plus an overhead of O(log n) depth and M A O(n) work. ⌅

Lemma 3.4.13 Given an n n SDDM matrix M and splitting D A such that all eigenvalues of (i) ⇥ 1 1 5 are in the range [ 1+ (0) , 1 (0) ]. We can construct in O log n depth and O (m log n) M 1/2 1/2 5 work a parallel solver construction chain for = D AD with apx = O log n and A  m(0). 

(0) 1/2 1/2 Proof Consider the following algorithm that sets = D AD and iteratively generates A (j) and spectrum bounds (j) until the size of (j) is smaller than some constant. A A • Let M(j) = D(j) A(j) be the SDDM matrix corresponding to I (j). A (j) (j) (j) 1 (j) • Compute an approximation to D A D A using the square sparsifier in Lemma 3.3.6 1 (j) with error ✏0 = 10 , M1 .

(j) (j) • Set to the splitting of M given by Lemma 3.3.3. A1 1 (j) • Apply Lemma 3.4.12 to M1 with =1/O (log n) for some suitable (fixed) constant. to (j+1) (j+1) obtain M˜ with corresponding ˜ . A 73 (j+1) • Apply Lemma 3.4.11 to ˜ with = 1 to create (j+1). Set (j+1) to 4(j). A (j) A

(j) We first show that the size of decreases geometrically. Since ✏0 was set to a constant, A (j) the guarantee of Lemma 3.3.6 gives that 1 has O (m log n) edges. This meets the requirement A (j) of Definition 3.4.6, and the spectral guarantees of 1 follows from Lemma 3.3.3. Also, the (j) A guarantees of Lemma 3.4.8 then gives that M˜ has n 1+O (mlog n) edges, which is then 1 reduced to O (mlog n) via. greedy elimination. Therefore setting =1/O (log n) for the right choice of constants gives the desired reduction factor, which in turn implies l = O (log n). The conversion of solver routines follows from Lemma 3.4.12 and the spectral guarantees of Lemma 3.4.11. It remains to lower bound the spectrum of the normalized matrices generated. Suppose all eigenvalue of (j) are in the range 1+(j), 1 (j) . Then Lemma 3.2.4 gives A (j) (j) that the minimum eigenvalue of I 1 is at least . The guarantees of the size reduction A ⇥ ⇤ (j+1) routine from Lemma 3.4.12 ensures that minimum eigenvalue of I ˜ is at least 1 (j). This A 2 can only be further halved the routine from Lemma 3.4.11, so all eigenvalues of I (j+1) are 1 (j) 1 (j) (j+1) (j) A in the range [ 1+ 4 , 1 4 ] and we can set  to 4 . Since the sizes of the levels 1 (0) decrease by factors of 10,  m  suffices as an overall bound.  ⌅ We can now put these together to prove the main result of this chapter. Proof of Theorem 3.0.1: By Lemma 3.3.10, it suffices to solve a linear system I where all 1 1 A eigenvalues of are in the range 1+ 3 , 1 3 . We first build a parallel solver construc- A 2n U 2n U tion chain using Lemma 3.4.13 with  = O log5 n and  =2n5U in O log5 n depth and ⇥ apx ⇤ O (m log n) work. 3 3 2 Invoking Lemma 3.4.7 then gives a parallel solver chain in O(papx log  log n+log  log log  log n)= 3 7/2 3 2 4 2 5 3 O(log  log n +log  log log  log n) depth and O(m(papx log  log n +log  log n)) = O(m(log4  log9/2 n +log5  log3 n)) work. The resulting chain has total size O(m log3  log n). Invoking this chain with Lemma 3.2.2 then gives the depth/work bounds for a solver with con- stant relative error. This error can then be reduced using preconditioned iterative methods such as Chebyshev iteration as stated in Lemma 2.1.1, and the resulting guarantee can also be converted using Lemma 1.6.7. ⌅ The factorization used in our construction should also extend to give solvers. The only addi- tional step needed is to evaluate we evaluate the operator (I + )1/2, which can be done in ways A similar to APXPWR. We omit presenting this since the resulting bound is are more than the variant using the factorization given in Fact 3.1.1. However, we believe further developments of this ap- proach may allow us to compute a wider range of functions based on the eigenvalues of the matrix, such as matrix exponentials/logarithms, and log determinants.

74 Chapter 4

Construction of Low-Stretch Subgraphs

The sparsifiers and preconditioners in the previous two chapters rely on upper bounding effective resistances, or equivalently edge sampling probabilities, using a tree. These trees were first used in subgraph preconditioners by Boman and Hendrickson [BH01], and played crucial roles in all subsequent constructions. Due to connections to metric embeddings, these probabilities are often known as stretch. In this view, both the graph and the tree can be viewed as defining metrics given by their shortest path distance. For a tree T , we use distT (u, v)to denote the length of the shortest path between vertices u and v in the tree. Using this notation the stretch of an edge e = uv can be defined as:

Definition 4.0.1 Given a graph G =(V,E), edge lengths l, and a tree T , the stretch of an edge e = uv is:

distT (u, v) strT (e)= le

In metric embedding definitions, the denominator term is often written as distG(u, v). These two terms are equivalent when G is a metric, and this modification is needed since our input graphs may not have this property.

Intuitively, strT (e) corresponds to the factor by which the edge e = uv is ‘stretched’ by rerout- ing it into the tree. In comparison to our definition of stretch from Section 2.1, they are identical in the unweighted setting since 1 =1. In the weighted case, they are equivalent by setting the we length of an edge to be the inverse of its weight. 1 le = we This change of variables is in some sense natural because weights correspond to conductance, and it is the inverse of them, resistances, that are additive along tree paths. The use of low stretch spanning trees and the exact stretch values in combinatorial precondi- tioning is continuously evolving. They were first used as ways of explicitly rerouting the edges by Boman and Hendrickson [BH01] and Spielman and Teng [ST06]. More recently, stretch has

75 been used as sampling probabilities by Koutis et al. [KMP11] (see Chapter 2) and Kelner et al. [KOSZ13]. One property of stretch remains the same in applications of low stretch embeddings and is unlikely to change:

smaller total stretch = better performance.

Due to the direct dependency of solver running times on total stretch, and the relation to metric embedding, algorithms for generating low stretch spanning trees have been extensively studied. All the algorithms for generating low stretch spanning trees to date have two main components: an inner loop that partitions the graph into pieces; and an outer loop that stitches these partitions together to form a good spanning tree. The outer loops controls the overall flow of the algorithm, as well as the form of the tree generated. As a result, they have been the main focus in almost all the low-stretch spanning tree constructions to date. A brief overview of the trade-offs made in these algorithms is in Section 4.1. Our improvements to this framework relies on the fact that for the construction of sparsifiers, the stretch of a small fraction of edges can be very large. This is the main insight in our parallel algorithm for constructing low stretch subgraphs in Section 4.3 as well as low `1/2+↵-moment spanning trees in Section 4.4. The partition routines were first introduced in the setting of distributed algorithms by Awerbuch et al. [Awe85, ALGP89]. Their aggregated bound across multiple iterations were improved by Seymour [Sey95], but otherwise these subroutines have remained largely unchanged in various low stretch embedding algorithms. One drawback of this routine is that it’s highly sequential: each piece in the decomposition relies on the result of all previous ones. As a result, one of the main components of the Blelloch et al. parallel solver algorithm [BGK+13] was an alternate parallel decomposition scheme. In Section 4.2 we show a more refined version of this scheme that obtains the same parameters as the sequential algorithm. The low stretch subgraph algorithm in Section 4.3 is a direct consequence of this algorithm and techniques introduced by Blelloch et al. [BGK+13]. However, in order to understand the goal of these decompositions, it is helpful to give a picture of the overall algorithm.

4.1 Low Stretch Spanning Trees and Subgraphs A direct approach for generating trees with small total stretch would be to find trees such that every edge have small stretch. Unfortunately this is not possible for the simplest non-tree graph: the cycle on n vertices as shown in Figure 4.1. As a tree contains n 1 edges, each possible tree omits one edge from the cycle. If all the edges on the cycle have the same weight, this edge then has a stretch of n 1=⌦(n). As a result, it is crucial that only the total stretch is small. Or in other words, the edges have small stretch on average. This requirement is still significantly different than other definitions of special spanning trees such as the minimum weight spanning tree. Here an illustrative example is the pn pn square grid with unit edge weights shown in Figure 4.2. The tree on the left is ⇥ both a minimum spanning tree and a shortest path tree (under appropriate tie breaking schemes). However, it leads to a high total stretch. For each vertical edge beyond the middle column, at least

76 Figure 4.1: Cycle on n vertices. Any tree T can include only n 1 of these edges. If the graph is unweighted, the remaining edge has stretch n 1. pn horizontal edges are needed to travel between its endpoints. Therefore its stretch is at least pn. So the n/2 edges in the right half of the square grid contribute a total stretch of n1.5. In the tree on the right, all edges along the middle row and column still have stretch O(pn). However, the middle row and column only have O(pn) edges and so they contribute only O(n) to the total stretch. Recall that all we need is a low total stretch, so a small number of high-stretch edges is permitted. Having accounted for the edges in the middle row and column, the argument can then be repeated on the 4 smaller subgraphs of size n/4 formed by removing the middle row and column. These pieces have trees that are constructed similarly, leading to the recurrence

TotalStretch(n)=4 TotalStretch(n/4) + O(n). · which solves to TotalStretch(n)=O(n log n).

Figure 4.2: Two possible spanning trees of the unit weighted square grid, shown with red edges.

77 A generalization of this type of trade-off, which keeps the number of high stretch edges small, forms the basis of all low stretch spanning tree algorithms. Constructions of such trees for general graphs were first proposed in the context of online algorithms by Alon et al. [AKPW95], leading to the AKPW low stretch spanning trees. In these settings, a more crucial quantity is the expected stretch of a single edge. This is a stronger requirement since applying linearity of expectation over it also allows us to bound the total stretch. On the other hand, here it is not crucial for the tree to be a spanning tree. As a result, subsequent works generated trees that are not necessarily sub- trees, and the problem was more commonly formulated as embedding arbitrary metrics. A series of results by Bartal [Bar96, Bar98], and Fakcharoenphol et al. [FRT04] led to an expected stretch of O(log n) per edge, which was shown to be optimal by Alon et al. [AKPW95]. However, to date it’s only known how to use subtrees to generate sparsifiers and precondi- tioners. The first nearly-linear time solver used the AKPW low stretch spanning trees, which had O m exp plog n log log n total stretch [ST06]. As a result, subsequent improvements in SDD linear system solvers have often gone hand in hand with improvements in algorithms for finding low stretch spanning trees. A series of works by Elkin et al. [EEST08], Abraham et al. [ABN08] and Abraham and Neiman [AN12] led to sequential algorithms for finding trees with O(m log n log log n) total stretch. The nearly m log n running time of the solver algorithm pre- sented in Chapter 2 also made the runtime of finding the tree one of the bottlenecks. Koutis et al. [KMP11] addressed this by modifying the Abraham et al. [ABN08] low stretch spanning tree algorithm, giving a running time of O (n log n log log n + m log n). This modification is also used in the current best low stretch spanning tree result by Abraham and Neiman [AN12], a nearly- optimal total stretch of O (m log n log log n) in O (n log n log log n + m log n) time. The situation is similar for parallel solvers. Finding a low stretch spanning tree is one of the main technical components in the first nearly-linear work, sublinear depth algorithm by Blelloch et al. [BGK+13]. The analysis of this algorithm utilized a quantity more specialized for solvers. Because the tree is only used to reduce the number of edges by a small factor (a constant or poly (log n)), it is acceptable for a small fraction of edges to have very high stretch. One way to view this, as described in Section 2.5 is that edge’s sampling probability can be upper bounded by 1. Alternatively, they can be viewed as extra edges in to the tree, and the sum of them forms a low stretch subgraph. One example of such a subgraph is shown in Figure 4.3. For a grid, the number of edges of stretch s is about n/s. Therefore, if we treat all edges of stretch more than poly (logn) as ‘extra’ edges, we obtain a total of n log n/poly (logn)= n/poly (logn) extra edges. On the other hand, since stretches grow geometrically, there are only log log n ‘groups’ with stretch less than poly (logn). As the total stretch from each such group is O(n), their total is O(n log log n). Another way to view this is that we begin by partitioning the grid into n n sized sub-grids, 0 ⇥ 0 and applying the recursive-C construction to each grid. The total stretch in each of these sub-grid 2 is n2 log n , which when summed over the pn sub-grids gives O (n log n ). 
On the other hand, 0 0 n0 0 the number of edges between them is O ⇣n .⌘ Setting n = (log n) then gives both the total n0 0 poly stretch and number of extra edges. ⇣ ⌘

78 Figure 4.3: Low stretch spanning tree for the square grid (in red) with extra edges of high stretch (narrower, in green). These edges can be viewed either as edges to be resolved later in the recursive algorithm, or as forming a subgraph along with the tree

Blelloch et al. showed that including these extra edges leads to nearly-linear work solvers using the AKPW low stretch spanning trees [BGK+13]. As such trees have O m exp plog n log log n total stretch, it suggests that higher stretch edges need to be accounted differently. The results in this chapter are based around implications of such treatments of high stretch edges that are closely related to the solvers shown in Chapters 2 and 3. Specifically we show:

1. A simplified version of the low stretch subgraph algorithm by Blelloch et al. [BGK+13] that suffices for the parallel edge reduction component from Section 3.4. Here high stretch edges are kept as extras, leading to subgraphs instead of spanning trees. 2. We show that slight modifications to the low stretch spanning tree algorithm by Abraham and Neiman [AN12] meets the requirements of Lemma 2.5.1 for a O (m log n log (1/✏)) time solver. Here stretch is measured under a different that’s smaller for higher stretch values.

The low stretch subgraph routine shown in Section 4.3 also requires a partition routine known as low (strong) diameter decomposition. Such routines are used at the core of most tree embedding algorithms in ways similar to partitioning the grid into smaller ones. Parallel low (strong) diameter decomposition algorithms were first introduced by Blelloch et al. [BGK+13]. We will start by showing an improved version of it that obtains the same parameters as sequential algorithms.

4.2 Parallel Partition Routine At the core of various low stretch embedding schemes are routines that partition the vertices of a graph into low diameter subgraphs with few edges between them. Such routines are also used at the

79 core for a number of algorithms such as: approximations to sparsest cut [LR99, She09]; construc- tion of spanners [Coh98]; parallel approximations of shortest path in undirected graphs [Coh00]. In order to formally specify the diameter of a piece, it is crucial to emphasize the distinction between weak and strong diameter. The diameter of a piece S V can be defined in two ways, ✓ weak and strong diameter. Both of them set diameter to the maximum length of a shortest path between two vertices in S, while the difference is in the set of allowed paths. Strong diameter restricts the shortest path between two vertices in S to only use vertices in S, while weak diameter allows for shortcuts through vertices in V S. The optimal tree metric embedding algorithm relies \ on weak diameter [FRT04]. It has been parallelized with polylog work overhead by Blelloch et al. [BGT12], although the total work remains quadratic. A trend in algorithms that use weak diameter is that their running time tends to be super-linear. This is also the case with many parallel algorithms for computing low diameter decompositions such as the ones by Awerbuch et al. [ABCP92] and Linial and Sax [LS93] 1. To date, nearly- linear work algorithms for finding tree embedding use strong diameter instead. While this leads to more difficulties in bounding diameters, the overall work is easier to bound since each piece certifies its own diameter, and therefore does not need to examine other pieces. For generating low stretch spanning trees, strong diameter is also crucial because the final tree (which the graph embeds into) is formed by combining the shortest path tree in each of the pieces. As a result, we will use diameter of a piece to denote strong diameter, and define a low diameter decomposition as follows:

Definition 4.2.1 Given an undirected, unweighted graph G =(V,E),a(,d) decomposition is a partition of V into subsets S1 ...Sk such that:

• The (strong) diameter of each Si is at most d. • The number of edges uv with vertices belonging to different pieces is at most m.

log n A standard choice of parameters for such decompositions is (,O( )), which are in some sense c optimal. The simpler bottom-up schemes for constructing low stretch embeddings set to log n 1 for some constant c. As a result, a dependency in in the depth of the decomposition algorithm is acceptable. This makes the depth of these algorithms more than the algorithms for large NC values of . However, they suffice for the purpose of generating tree/subgraph embedding in polylog depth, as well as low work parallel graph algorithms. Obtaining these decompositions in the sequential setting can be done via. a process known as ball growing. This process starts with a single vertex, and repeatedly adds the neighbors of the current set into the set. It terminates when the number of edges on the boundary is less than a fraction of the edges within, which is equivalent to the piece having small conductance. Once the first piece is found, the algorithm discards its vertices and repeats on the remaining graph. The final bound of m edges between the pieces can in turn be obtained by summing this bound over all

1The Linial and Sax algorithm [LS93] adapted to low diameter decompositions could be improved to O (m diameter) work · 80 the pieces. Using a consumption argument, one could prove that the diameter of a piece does not log n exceed O( ). Because the depth can depend on 1/, and the piece’s diameter can be bounded log n by O( ), finding a single piece is easy to parallelize. However, the strong diameter requirement means that we cannot start finding the second ball until we’re done finding the first. This leads to a chain of sequential dependencies that may be as long as ⌦(n), and is the main challenge in obtaining a parallel decomposition algorithm. An algorithm for low diameter decompositions was given by Blelloch et al. [BGK+13]. It log4 n log3 n 2 gave a ,O decomposition in O( ) depth and O(m log n) work. This algorithm showed⇣ that some⇣ of⌘⌘ the ball growing steps can be performed simultaneously in parallel, leading to balls which have small overlap. Then a randomly shifted shortest path routine is used to resolve these overlaps. We will show an improved version of these ideas that combines these two steps into a simple, global routine. This leads to a simple algorithm that picks random shifts in a more intricate way and assigns vertices to pieces using one shortest path invocation. Our algorithm can also be viewed as a parallel version of the probabilistic decompositions due to Bartal [Bar96]. It is also related to a decomposition algorithm by Gupta and Talwar [GT13]. To describe our partition scheme, we first need to define some notations.

4.2.1 Notations

We begin by stating some standard notations. Given a graph G, we use dist(u, v) to denote the length of the shortest path from u to v. As with earlier works on tree embedding, we will pick a special vertex in each piece, and use the distance to the farthest vertex from it as an estimate for the diameter. This simplification can be made since the graph is undirected and the final bound allows for constant factors (specifically 2). We will denote this special vertex the center of the piece, and denote the piece centered at u using Su. As the number of pieces in the final decomposition may be large (e.g. the line graph), a par- allel algorithm needs to construct a number of pieces simultaneously. On the other hand, for closely connected graphs such as the complete graph, a single piece may contain the entire graph. As a result, if too many pieces are grown independently, the total work may become quadratic. The decomposition algorithm from [BGK+13] addressed this trade-off by gradually increasing the number of centers picked iteratively. It was motivated by the (,W) decompositions used in an algorithm by Cohen for approximating shortest paths in undirected graphs [Coh00]. By running iterations with gradually more centers, it can be shown that the resulting pieces at each iteration have small overlap. This overlap is in turn resolved using a shifted shortest path algorithm, which introduces shifts (denoted by ) at the centers and assigns vertex v to Su that minimizes the shifted distance:

dist (u, v)=dist(u, v) u. (4.1) logc n It was shown that by picking shifts uniformly from a sufficiently large range, a (,O( )) de- composition can be obtained. Our algorithm can be viewed as a more streamlined algorithm that combines these two compo-

81 nents. The gradual increase in the number of centers can be emulated by adding a large step-like increase to the shifts of centers picked in earlier iterations. Furthermore, the need to have expo- nentially decreasing number of centers in the iterations suggests that the exponential distribution can be used in place of the (locally) uniform distribution. This distribution has been well-studied, and we will rely on some of its properties that are often used to estimate fault tolerance [Tri02]. For a parameter , this distribution is defined by the density function:

exp( x) if x 0, fExp(x, )= 0 otherwise. ⇢ We will denote it using Exp() and will also make use of its cumulative density function:

1 exp( x) if x 0, FExp(x, )=Pr [Exp() x]=  0 otherwise. ⇢

4.2.2 Partition Routine

We now describe our partition routine. Aside from being parallelizable, this routine can be viewed as an alternative to the repeated ball-growing arguments used in sequential algorithms. Pseudocode of a simplified version of our algorithm is shown in Algorithm 5. As states, it is a quadratic algorithm. However, we will describe efficient versions of it in Sections 4.2.3.

Algorithm 5 Partition Algorithm Using Exponentially Shifted Shortest Paths PARTITION Input: Undirected, unweighted graph G =(V,E), parameter and parameter d indicating failure probability. d Output: (,O(log n/)) decomposition of G with probability at least 1 n . 1: For each vertex u, pick u independently from Exp() 2: Compute Su by assigning each vertex v to the vertex that minimizes dist (u, v), breaking ties lexicographically 3: return Su { }

We start by showing that the assignment process readily leads to bounds on strong diameter. Specifically, the strong diameter of Su can be measured using distances from u in the original graph.

Lemma 4.2.2 If v Su and v0 is the last vertex on the shortest path from u to v, then v0 Su as 2 2 well.

Proof The proof is by contradiction, suppose v0 belongs to Su for some u0 = u. The fact that v0 0 6 is the vertex before v on the shortest path from u implies dist (u, v)=dist (u, v0)+1. Also, as 82 v0 is adjacent to v, we also have dist (u0,v) dist (u0,v0)+1. Since v0 belongs to Su instead  0 of Su, we must have one of the following two cases:

1. v0 is strictly closer to u0 than u in terms of shifted distance. In this case we have dist (u0,v0) < dist (u, v0), which when combined with the conditions above gives:

dist (u0,v) dist (u0,v0)+1 

So v is strictly closer to u0 than u as well, which implies that v should not be assigned to Su.

2. The shifted distances are the same, and u0 is lexicographically earlier than u. Here a similar calculation gives dist (u0,v) dist (u, v). If the inequality holds strictly, we are back to  the case above. In case of equality, the assumption that u0 is lexicographically earlier than u means v should not be in Su as well.

⌅ Note that the second case is a zero probability event, and its proof is included to account for roundings in implementations that we will describe in Section 4.2.3. To bound the strong diameter of the pieces, it suffices to bound the distance from a vertex to the center of the piece that it is assigned to. Since any vertex v S could have been potentially 2 u included in Sv, the shift value of the center u serves as an upper bound on the distance to any vertex in Su. Therefore, max =maxu u serves as an upper bound for the diameter of each piece.

log n Lemma 4.2.3 Furthermore, with high probability, u O( ) for all vertices u. 

Our proof below proof closely following the presentation in Chapter 1.6. of [Fel71]. Proof By the cumulative distribution function of the exponential distribution, the probability of (d +1) ln n is: u · ln n exp (d +1) =exp( (d +1)lnn) · ✓ ◆ (d+1) n .  Applying union bound over the n vertices then gives the bound. ⌅ The other property that we need to show is that few edges are between the pieces. We do so by bounding the probability of two endpoints of an edge being assigned to two different pieces. In order to keep symmetry in this argument, it is helpful to consider shifted distances from a vertex to the midpoint of an edge. This slight generalization can be formalized by replacing an edge uv with two length 1/2 edges, uw and wv. We first show that an edge’s end points can be in different

83 pieces only if there are two different vertices whose shifted shortest path to its midpoint are within 1 of the minimum.

Lemma 4.2.4 Let uv be an edge with midpoint w such that when partitioned using shift values , u Su and v Sv . Then both dist (u0,w) and dist (v0,w) are within 1 of the minimum shifted 2 0 2 0 distance to w.

Proof Let the pieces that contain u and v be Su and Sv respectively (u0 = v0). Let the minimizer 0 0 6 of dist (x, w) be w0. Since w is distance 1/2 from both u and v, we have

dist (w0,u), dist (w0,v) dist (w0,w)+1/2. 

Suppose dist (u0,w) > dist (w0,w)+1, then we have:

dist (u0,u) dist (u0,w) 1/2 >dist (w0,w)+1/2 dist (w0,u), a contradiction with u0 being the minimizer of dist (x, u). The case with v follows similarly. ⌅ An even more accurate characterization of this situation can be obtained using the additional constraint that the shortest path from w0 to w must go through one of u or v. However, this lemma suffices for abstracting the situation further to applying random decreases 1,2 ...n to a set of numbers d1 ...dn corresponding to dist(x, w). We now turn our attention to analyzing the probability of another shifted value being close to the minimum when shifts are picked from the exponential distribution. The memoryless property of the exponential distribution gives an intuitive way to bound this probability. Instead of considering the vertices picking their shift values independently, consider them as light bulbs with lifetime distributed according to Exp(), and the dis indicate the time each light bulb is being turned on. Then min d corresponds to the time when the last light i i i bulb burns out, and we want to bound the time between that and the second last. In this setting, the memoryless property of exponentials gives that when the second to last light bulb fails, the behavior of the last light bulb does not change and its lifetime after that point still follows the same distribution. Therefore, the probability that the difference between these two is less than c can be bounded using the cumulative distribution function:

1 exp( c) 1 (1 c) (When c is small) ⇡ = c.

The only case that is not covered here is when the last light bulb has not been turned on yet when the second last failed. However, in that case this probability can only be less. Below we give an algebraic proof of this.

84 Lemma 4.2.5 Let d1 ... dn be arbitrary values and 1 ...n be independent random vari-   ables picked from Exp(). Then the probability that between the smallest and the second smallest values of d are within c of each other is at most O(c). i i

Proof Let denote the number of indices i such that: E d d + c j. i i  j j 8 For each vertex i, let be an indicator variable for the event that: Ei d d + c j. i i  j j 8 We will integrate over the value of t = d . For a fixed value of t, occurs if and only if i i Ei d t + c for each j. As the shift values are picked independently, we can multiply the j  j cumulative distribution functions for Exp() and get:

Pr [ i] E 1 = f (d t, ) F (d t + c, ) Exp i Exp j t= j=i Z 1 Y6 When t>d + c, d t + c<0 and f (d t, )=F (d t + c, )=0. So it suffices 1 1 Exp 1 Exp 1 to evaluate this up to t = d + c. Also, we may use exp( x) as an upper bound as 1 fExp(x, ), and arrive at:

Pr [ i] E d1+c exp( (d t)) F (d t + c)  i Exp j t= j=i Z 1 Y6 d1+c exp( (d t)) (1 exp( (d t + c))))  i j t= j=i Z 1 Y6

85 We now bound E [ ]=E [ i]. By linearity of expectation we have: E i E

P d1+c E i exp ( (di t)) E  " i # i t= X X Z 1 (1 exp( (d (t c)))) j j=i Y6 d1+c =exp(c) exp ( (d t + c)) i t= i Z 1 X (1 exp( (d t + c))) j j=i Y6 Observe that the expression being integrated is the w.r.t. t of:

(1 exp( (d t + c))) i i Y Therefore we get:

t=d1+c

E [ ] exp(c) (1 exp( (di t + c))) E  i t= Y 1 When t , (di t + c) . Therefore exp( (di t + c)) 0, and the overall !1 !1 ! product tends to exp(c). When t = d1 + c, we have:

exp(c) (1 exp( (d (d + c)+c))) i 1 i Y = exp(c) (1 exp( (d d ))) i 1 i Y exp(c) (1 exp(0)) = 0 Since d d  i 1 i Y Combining these two gives E [ ] exp(c). E  By Markov’s inequality the probability of there being another vertex being within c of the minimum is at most exp(c) 1 O(c) for c =1.  ⌅ Using this Lemma with c =1and applying linearity of expectation gives the bound on the number of edges between pieces.

Corollary 4.2.6 The probability of an edge e = uv having u and v in different pieces is bounded by O(), and the expected number of edges between pieces is O(m).

86 4.2.3 Parallel Implementation in Unweighted Case

Our partition routine as described in Algorithm 5 requires computing dist (u, v) for all pairs of vertices u and v. Standard modifications allow us to simplify it to the form shown in Algorithm 6, which computes BFS involving small integer distances.

Algorithm 6 Parallel Partition Algorithm PARALLEL-PARTITION Input: Undirected, unweighted graph G =(V,E), parameter 0 <<1 and parameter d indicating failure probability. d Output: (,O(log n/)) decomposition of G with probability at least 1 n . 1: IN PARALLEL each vertex u picks u independently from an exponential distribution with mean 1/. 2: IN PARALLEL compute =max u V max { u | 2 } 3: Perform PARALLEL BFS, with vertex u starting when the vertex at the head of the queue has distance more than . max u 4: IN PARALLEL Assign each vertex u to point of origin of the shortest path that reached it in the BFS.

The first observation is that the shift at vertex u can be simulated by introducing a super u source s with distance to each vertex u. Then if we compute single source shortest path from s u to all vertices, the component that v belongs to is given by the first vertex on the shortest path from s to it. Two more observations are needed to transform this shortest path setup to a BFS. First, the negative lengths on edges leaving s can be fixed by adding max =maxu u to all these weights. Second, note that the only edges with non-integral lengths are the ones leaving s. In this shortest path algorithm, the only time that we need to examine the non-integer parts of lengths is when we compare two distances whose integer parts are tied. So the fractional parts can be viewed as tie- breakers for equal integer distances, and all distances with the same integer part can be processed in parallel. We’ll show below that these tie breakers can also be replaced by a random permutation of integers. Therefore, the algorithm is equivalent to computing shortest path when all edge lengths are integer, with an extra tie breaking rule for comparing distances. In order to use unweighted BFS, it remains to handle the edges with non-unit lengths leaving s, and we do so by processing the those edges in a delayed manner. An edge from s to v only causes v to be added to the BFS queue when the frontier of the search has reached a distance larger than the length of that edge and v has not been visited yet. So it suffices to check all vertices v with a length L edge to s when the BFS frontier moves to distance L, and add the unvisited ones to that level. The exact cost of running a parallel breadth first search depends on the model of parallelism. There has been much practical work on such routines when the graph has small diameter [LS10, BAP12, SB13]. For simplicity we will use the O(log n) depth and O(m) work bound in the PRAM model by Klein and Subramanian [KS97]. Here is the maximum distance that we run log n the BFS to, and can be bounded by O( ). In the PRAM model, our result can be described by

87 the following theorem:

Theorem 4.2.7 There is an algorithm PARTITION that takes an unweighted graph with n vertices, 2 m edges, a parameter 1/2 and produces a (,O( log n )) decomposition in expected O( log n )  depth and O(m) work.

Proof Consider running PARTITION using the BFS based implementation described above, and log n repeating until we have an (,O( )) partition. Since the us are generated independently, they can be computed in O(n) work and O(1) depth in parallel. The rest of the running time comes from assigning vertices to pieces using shifted shortest path. As the maximum distance from a vertex to log n the center of its piece is O( ) (or we could stop the algorithm at this point), this BFS can be done log2 n in O(m) work and O( ) depth using parallel BFS algorithms. The resulting decomposition can also be verified in O(log n) depth and O(m) time. It remains to bound the success probability of each iteration. Lemma 4.2.2 gives that the shortest path from u to any v Su is contained in Su, so maxv Su dist(u, v) is an upper bound for 2 2 the strong diameter of each subset. For each vertex v Su, since dist (v, v)=d(v, v) v 0 2  is a candidate, dist (u, v) v. Lemma 4.2.3 then allows us to bound this value by O(log n/)  with high probability. The expected number of edges between pieces follows from Corollary 4.2.6, log n so with constant probability we meet both requirements of a (,O( )) partition. Therefore, we are expected to iterate a constant number of times, giving the expected depth and work bounds. ⌅ 4.3 Parallel Construction of Low Stretch Subgraphs We now give a parallel algorithm for constructing the low-stretch subgraphs needed for the low- quality sparsifiers from Section 3.4. These subgraphs certify that most of the edges have small stretch w.r.t. a forest. It computes this forest T G along with stretch bounds of the edges that ✓ do have small stretch w.r.t. T . It also identifies the edges whose stretch we cannot bound, aka. a ¯ ¯ subset of “extra” edges E. In other words, it produces a vector str : E E R+, for the stretch \ ! of edges in E E¯. \ In general, the exact values of stretches with respect to a forestc T can be computed in parallel using the tree contraction framework from [MR89], but having these upper bounds allows us to simplify the presentation of our overall algorithm. Our algorithm is directly motivated by the AKPW low-stretch spanning tree algorithm [AKPW95], but is significantly simpler. As a motivating example, note that if all edges are unweighted, per- forming one step of low diameter decompositions as given in Theorem 4.2.7 gives:

• O(m) edges between the pieces

1 • All edges within pieces have stretch O( log n).

In order to handle varying weights, note that the diameter of pieces after one level of partition is 1 also O( log n). Therefore if all pieces are shrunk into vertices, the cost of traversing a vertex

88 1 is identical to traversing an edge of weight O( log n). Therefore, we can consider all edges in geometrically increasing buckets of weights, and at each level consider edges whose weights are comparable to the diameter of our ‘vertices’. Pseudocode of our algorithm for generating low stretch subgraphs is given in Algorithm 7 is a crude version of the AKPW algorithm. In words, iteration t of Algorithm 7 looks at a graph (V (t),E(t)) which is a minor of the original graph (because components were contracted in pre- vious iterations, and because it only considers the edges in the first j weight classes). It uses PARTITION(V (t),E,) to decompose this graph into components such at most E edges are t | t| between pieces. These edges are then added to the list of extra edges E¯. Each such piece is then shrunk into a single node while adding a BFS tree on that component to T , and the algorithm iter- ates once again on the resulting graph. Adding these BFS trees maintains the invariant that the set of original nodes that have been contracted into a (super-)node in the current graph are connected in T . This means upon termination T is a forest.

Algorithm 7 Low-stretch Spanning Subgraph Algorithm LSSG-AKPW Input: Weighted graph G =(V,E,w) with n vertices, edge reduction factor . Routine PARTITION 1 that produces a (,y )-low diameter decomposition for graphs with n vertices, where y is a known parameter dependent on n (O (log n)). Output: Forest T , extra edges E¯ and upper bounds for stretch w.r.t. T , str for edges in E E¯. \ Normalize the edges so that min we : e E =1. 1 { 2 } c z 4y

Initialize T , E¯ ; ; i 1 i Divide E into E1,E2,..., where Ei = e E we [z ,z ) { 2 | 2 } for t =1, 2,...,until the graph is exhausted do (t) (t) Create Ei from Ei by taking into account the contractions used to form V . (t) (t) (C1,C2,...,Cp)=PARTITION((V ,Ei ),z/4). Add a BFS tree of each component Ci to T . 2 Set stre to 2z for all edges with endpoints in the same piece Add all edges with endpoints in different pieces to E¯ Createc vertex set V (t+1) by contracting all edges within the components and removing all self-loops (but keeping parallel edges). end for Output forest T , extra edges E¯ and stretch upper bounds str.

c We begin the analysis of the total stretch and running time by proving two useful facts:

Fact 4.3.1

E¯ E  | | 89 Proof The guarantee of PARTITION gives that at most fraction of the edges of Et remain between the pieces after iteration t. As these are the only edges added to E¯, summing over the iterations gives the bound. ⌅ (t) Therefore, it suffices to bound the stretch of edges of Et that are contained in a single piece. 1 The hop radius of the piece can be bounded by y z/4, and because all edges in E have  t weights within a factor of z of each other, there is hope that a O(z2) bound on stretch can suffice. However, each vertex in Gt corresponds to a set of vertices in G, and distances between them are omitted in this calculation. As a result, we need to start by bounding the weighted radius of each of the pieces.

Fact 4.3.2 In iteration t, the weighted diameter of a component in the expanded-out graph is at most zt+1.

Proof The proof is by induction on j. First, note that each of the clusters computed in any iteration j has edge-count radius at most z/4. Now the base case j =1follows by noting that each 2 j+1 edge in E1 has weight less than z, giving a radius of at most z /4

Lemma 4.3.3 For any edge e, strT (e) str(e)  c Proof Let e be an edge in Et which is contained in a piece contracted during iteration t The t 1 initial partition gives w(e) >z . By Fact 4.3.2, the path connecting the two endpoints of e in F t+1 t+1 t 1 2 has distance at most 2z . Thus, strT (e) 2z /z =2z .  ⌅ Combining these facts means that LSSG-AKPW obtains a low-stretch subgraph with n 2 2 2 1+m edges such that all off tree edges have stretch z = O log n . A drawback of this algorithm is that it performs log calls to PARTITION, where is the ratio between the maximum and minimum edges weights in w. We next show that by ignoring some of the buckets, we can run this algorithm on the bucketing scheme in parallel to reduce the depth.

4.3.1 Reducing Depth to Polylog The depth of the LSSG-AKPW algorithm still depends on log , and the reason is straightfor- ward: the graph G(t) used in iteration t is built by taking G(1) and contracting edges in each iteration—hence, it depends on all previous iterations. However, note that for edges that are lighter by a factor of n can only affect the final stretch by a factor of 1/n. Therefore, if a significant range of weight classes are empty, we can break this chain of dependencies. To create such empty

90 weight classes, it suffices to identify edge classes totaling fewer than m edges, and move them to E¯ directly.

Consider a graph G =(V,E,w) with edge weights w(e) 1, and let Ei(G):= e E(G) i 1 i { 2 | w(e) [z ,z ) be the weight classes. Then, G is called (,⌧)-well-spaced if there is a set of 2 } special weight classes Ei(G) i I such that for each i I, (a) there are at most weight classes { } 2 2 before the following special weight class min i0 I i0 >i , and (b) the ⌧ weight classes { 2 [{1} | } Ei 1(G),Ei 2(G),...,Ei ⌧ (G) preceding i are all empty.

Lemma 4.3.4 Given any graph G =(V,E), ⌧ Z+, and 1, there exists a graph G0 =(V,E0) 2  which is (4⌧/,⌧)-well-spaced, and E E0 E . Moreover, G0 can be constructed in O(m) | \ | ·| | work and O(log n) depth.

log Proof Let = log z ; note that the edge classes for G are E1,...,E, some of which may be empty. Denote by EJ the union i J Ei. We construct G0 as follows: Divide these edge classes [ 2 into disjoint groups J ,J ,... [], where each group consists of ⌧/ consecutive classes. 1 2 ✓ d e Within a group J , by an averaging argument, there must be a range L J of ⌧ consecutive i i ✓ i edge classes that contains at most a fraction of all the edges in this group, i.e., E E | Li | ·| Ji | and L ⌧. We form G0 by removing these the edges in all these groups L ’s from G, i.e., | i| i G0 =(V,E ( E )). This removes only a fraction of all the edges of the graph. \ [i Li We claim G0 is (4⌧/,⌧)-well-spaced. Indeed, if we remove the group Li, then we designate the smallest j [] such that j>max j0 L as a special bucket (if such a j exists). Since 2 { 2 i} we removed the edges in ELi , the second condition for being well-spaced follows. Moreover, the number of buckets between a special bucket and the following one is at most

2 ⌧/ (⌧ 1) 4⌧/. d e  Finally, these computations can be done in O(m) work and O(log n) depth using standard tech- niques such as the ones described in [JáJ92] and [Lei92]. ⌅ Therefore, we will construct LSSGs separately on each of the intervals in parallel, and merge the results together. This gives our main result for low-stretch subgraphs:

Theorem 4.3.5 (Low-Stretch Subgraphs) There is an algorithm LSSUBGRAPH(G,) that for any weighted graph G =(V,E,w), and any parameter , finds a forest T and a subset of edges E¯ E(G) along with upper bounds on stretch str for all edges e E(G) E¯ such that ✓ 2 \ 1. E¯ O (m). c | |

2. For each edge e E(G) E¯, strT (e) stre; and 2 \  2 2 3. e E(G) E¯ str(e) O m log n , 2 \  c

P 2 3 Moreover, the procedurec runs in O (m) work and O( log n) depth.

91 Proof We set ⌧ =3logn and apply Lemma 4.3.4 to delete at most m edges to form G0. This 1 leaves us with a (3 log n , 3logn)-well-spaced graph G0. For each special bucket i, we implicitly find the graph formed by contracting all edges in earlier buckets. Then we run LSSG-AKPW on this graph consisting of edges until the next special bucket. Afterwards, we add some of the extra edges to the forest so that its connected components match the one that results from contracting all edges up until the next special bucket. The first step can be done using a minimum spanning tree of the entire graph. The second step is done by 1 performing log n iterations of PARTITION one after another, each involving a different set of 2 3 edges. Combining these over all edges gives O(m) work and O( log n) depth. The final step is akin to finding connected components, and can be done in O(m) work and O(log2 n) depth, which is a lower order term. To bound stretch, note that the bucketing scheme means that edges contracted have weight 3logn 2 at most z n times the weight of an edge processed in this batch. As the path in the  1 resulting tree contains at most n edges, these edges can result in an additive stretch of at most n . 2 2 So the total stretch can still be bounded by O m log n .

Finally, the number of edges in E¯ = E¯0 (E(G) E(G0)) is at most 2m by summing the two [ \ bounds. ⌅ For different choices of parameters, this algorithm was used to to generate the AKPW low- stretch spanning trees [AKPW95]. The resulting algorithm has a logarithmic dependency on , as well as an n✏ term. It was presented along with the first polylog depth low diameter decomposition algorithms in [BGK+13], and can be found in more details in [Tan11]. This particular embedding is needed in Section 3.4 to reduce the number of edges in the graph. Combining it with the ultrasparsifier construction gives the following crude sparsification:

Lemma 3.4.8 (Crude Sparsification) Given any weighted graph G =(V,E,w), and any param- 2 3 eter , we can find in O( log n) depth and O(m log n) work a graph H on the same vertex set with n 1+m edges such that: 2 3 LG LH O log n LG The proof of this Lemma can be readily obtained by combining Theorem 4.3.5 with a paral- lelization of the ultrasparsifiers from Theorem 2.2.4.

Theorem 2.2.4 (Ultra-Sparsifier Construction) There exists a setting of constants in ULTRA- SPARSIFY such that given a graph G and a spanning tree T , ULTRASPARSIFY(G, T) returns in 2 strT (G)log n strT (G) O(m log n +  ) time a graph H that is with high probability a ,O m log n -ultra-sparsifier for G. ⇣ ⇣ ⌘⌘

Proof of Lemma 3.4.8:

92 ¯ Applying Theorem 4.3.5 with 2 gives set of extra edges E and a spanning tree T such that 2 2 strT (E E¯) m log n. Since the sampling steps in ULTRASPARSIFY are performed inde- \  2 pendently, it parallelizes to O log n depth without work overhead. Running it on E E¯ for a  str (E E¯) 4 2 log3 n \ to be determined gives a graph H with n 1+ T \ log n = n 1+ edges such that:  

LG LH LG

2 3 Adding E¯ to H then leads to a graph with n 1+ + 4 log n m. Therefore if we set  to 2  3 3 8 log n, the number of edges in this graph is n ⇣1+m. The⌘ bound between G and H also follows from the guarantees of Theorem 2.2.4. ⌅

4.4 Construction of low `↵-stretch spanning trees We now give a sketch for generating low stretch spanning tree that leads to the O (m log n log (1/✏)) time solver algorithm from Section 2.5. The construction of low stretch spanning trees received sig- nificant attention due to their use in the nearly-linear time solvers for SDD linear systems [ST06]. The state of the art both in stretch and running time have been continuously improved [EEST08, ABN08, AN12]. As a result, a self-contained presentation of modifications to the algorithm falls outside of the scope of this thesis. Instead, we will sketch the necessary modifications needed for obtaining a spanning tree with small `↵ stretch.

Claim 4.4.1 Let ↵ be any constant such that 0 ↵<1. Given weighted graph G =(V,E,w),  we can find in O (m log n log log n) time a spanning tree T such that:

↵ strT (e) O(m) log n  e X ✓ ◆ To date, all algorithms that generate spanning trees with polylog average stretch rely on the star-decomposition framework [EEST08, ABN08, AN12]. The framework, introduced by Elkin et al. [EEST08], aims to partition into a graph into pieces whose diameters are smaller by a con- stant factor. As a result, the diameter of the pieces, decreases geometrically. Furthermore, contracting edges shorter than n3 ensures that each edge is only involved in O(log n) levels of partitioning [EEST08]. The stretch bound of edges cut then relies on two steps:

1 p m 1. Let E be the set of edges between the pieces. One needs to show that · , cut e Ecut we 2  where m is the number of edges involved. P 2. The final diameter of the tree produced on a graph with diameter is O().

The second step is the main focus in the recent advancements by Abraham et al. [ABN08, AN12]. Assuming this step, it’s easy to see that the total stretch can be bounded by O(p log nm). This O(log n) factor from all the levels was reduced to O(log log n) using an approach by Seymour [Sey95]. However, using simpler schemes such as probabilistic decompositions [Bar96], one can obtain an

93 algorithm with p = O(log n) that does not have the guarantees across all the layers. This leads to a total stretch of O(m log2 n). However, probabilistic decompositions have the even stronger property that an edge e with weight we is cut with probability at most:

we log n We can show that as long as higher stretch values are discounted, this suffices for the required low moment stretch.

Lemma 4.4.2 Let ↵ be a constant that’s strictly less than 1. Consider an edge weight we and a i sequence of diameter values 1 ...l such that i c we log n for some constant c>1. If e is cut in the decomposition at level i with probability at most we log n, then: i

↵ strT (e) E O (1) log n  ✓ ◆

Proof By linearity of expectation, we can sum over all the levels:

↵ l ↵ strT (e) O(i) E = Pr [e cut at level i] log n log n i=1 we ✓ ◆ X ✓ ◆ l ↵ O(i) we log n  log n i=1 we i X ✓ ◆ l 1 ↵ we log n = O i=1 i ! X ✓ ◆ i Note that due to the geometrical decrease in diameters, i satisfies i we log nc , or alterna- we log n i tively c . Substituting this gives: i 

↵ l i strT (e) 1 E O log n  c1 ↵ i=1 ! ✓ ◆ X ✓ ◆ = O (1)

⌅ Therefore replacing the more sophisticated decomposition schemes with probabilistic decom- positions leads to spanning trees with low moment stretch. The running time of the algorithm follows similarly to Section 7 of the algorithm by Abraham and Neiman [AN12].

94 Chapter 5

Applications to Image Processing

In this chapter we show that SDD linear system solvers can be used as a subroutine to solve a wide range of regression problems. These problems are motivated by the LASSO framework and have applications in machine learning and computer vision. We show that most of these problems can be formulated as a grouped least squares problem. Our main result shown in Section 5.4 is an 4/3 1 O m poly (✏ ) time algorithm for computing (1 + ✏)-approximate solutions to this formula- tion. We also present some preliminary experiments using it on image processing tasks. The problem of recovering a clear signal from noisy data is an important problem in . One general approach to this problem is to formulate an objective based on required properties of the answer, and then return its minimizer via optimization algorithms. The power of this method was first demonstrated in image denoising, where the minimization approach by Rudin, Osher and Fatemi [ROF92] had much success. More recent works on sparse recovery led to the theory of compressed sensing [Can06], which includes approaches such as the least absolute shrinkage and selection operator (LASSO) objective due to Tibshirani [Tib96]. These objective functions have proven to be immensely powerful tools, applicable to problems in signal processing, statistics, and computer vision. In the most general form, given vector y and a matrix A, one seeks to minimize:

2 min y Ax 2 (5.1) x k k subject to: x c k k1  It can be shown to be equivalent to the following by introducing a Lagrangian multiplier, :

2 min y Ax 2 + x 1 (5.2) x k k k k

Many of the algorithms used to minimize the LASSO objective in practice are first order meth- ods [Nes07, BCG11], which update a sequence of solutions using well-defined vectors related to the gradient. These methods converge well when the matrix A is well-structured. The formal def- inition of this well-structuredness is closely related to the conditions required by the guarantees given in the compressed sensing literature [Can06] for the recovery of a sparse signal. As a result, these algorithms perform very well on problems where theoretical guarantees for solution quality

95 are known. This good performance, combined with the simplicity of implementation, makes these algorithms the method of choice for most problems. However, LASSO-type approaches have also been successfully applied to larger classes of problems. This has in turn led to the use of these algorithms on a much wider variety of prob- lem instances. An important case is image denoising, where works on LASSO-type objectives predates the compressed sensing literature [ROF92]. The matrices involved here are based on the connectivity of the underlying pixel structure, which is often a pn pn square mesh. Even in an ⇥ unweighted setting, these matrices tend to be ill-conditioned. In addition, the emergence of non- local formulations that can connect arbitrary pairs of vertices in the graph also highlights the need to handle problems that are traditionally considered ill-conditioned. We will show in Section 5.6 that the broadest definition of LASSO problems include well-studied problems from algorithmic graph theory:

Fact 5.0.1 Both the s-t shortest path and s-t minimum cut problems in undirected graphs can be solved by minimizing a LASSO objective.

Although linear time algorithms for unweighted shortest path are known, finding efficient par- allel algorithms for this has been a long-standing open problem. The current state of the art parallel algorithm for finding 1+✏ approximate solutions, due to Cohen [Coh00], is quite involved. Fur- thermore, these reductions are readily parallelizable, efficient algorithms for LASSO minimization would also lead to efficient parallel shortest path algorithms. This suggests that algorithms for minimizing LASSO objectives, where each iteration uses simple, parallelizable operations, are also difficult. Finding a minimum s-t cut with nearly-linear running time is also a long standing open question in algorithm design. In fact, there are known hard instances where many algorithms do exhibit their worst case behavior [JM93]. The difficulty of these problems and the non-linear nature of the objective are two of the main challenges in obtaining fast run time guarantees. Previous run time guarantees for minimizing LASSO objectives rely on general convex opti- mization routines [BV04], which take at least ⌦(n2) time. As the resolution of images are typically at least 256 256, this running time is prohibitive. As a result, when processing image streams or ⇥ videos in real time, gradient descent or filtering based approaches are typically used due to time constraints, often at the cost of solution quality. The continuing increase in problem instance size, due to higher resolution of streaming videos, or 3D medical images with billions of voxels, makes the study of faster algorithms an increasingly important question. While the connection between LASSO and graph problems gives us reasons to believe that the difficulty of graph problems also exists in minimizing LASSO objectives, it also suggests that techniques from algorithmic graph theory can be brought to bear. To this end, we draw upon recent developments in algorithms for maximum flow [CKM+11] and minimum cost flow [DS08]. We show that relatively direct modifications of these algorithms allows us to solve a generalization of most LASSO objectives, which we term the grouped least squares problem. Our algorithm is similar to convex optimization algorithms in that each iteration of it solves a quadratic minimiza- tion problem, which is equivalent to solving a linear system. The speedup over previous algorithms come from the existence of much faster solvers for graph related linear systems [ST06], although

96 our approaches are also applicable to situations involving other underlying quadratic minimization problems.

5.1 Background and Formulations

The formulation of our main problem is motivated by the total variation objective from image denoising. This objective has its origin in the seminal work by Mumford and Shah [MS89]. There are two conflicting goals in recovering a smooth image from a noisy one, namely that it must be close to the original image, while having very little noise. The Mumford-Shah function models the second constraint by imposing penalties for neighboring pixels that differ significantly. These terms decrease with the removal of local distortions, offsetting the higher cost of moving further away from the input. However, the minimization of this functional is computationally difficult and subsequent works focused on minimizing functions that are close to it. The total variation objective is defined for a discrete, pixel representation of the image and measures noise using a smoothness term calculated from differences between neighboring pixels. This objective leads naturally to a graph G =(V,E) corresponding to the image with pixels. The original (noisy) image is given as a vertex labeling s, while the goal of the optimization problem is to recover the ‘true’ image x, which is another set of vertex labels. The requirement of x being 2 close to s is quantified by x s , which is the square of the `2 norm of the vector that’s identified k k2 as noise. To this is added the smoothness term, which is a sum over absolute values of difference between adjacent pixels’ labels:

2 x s + xu xv (5.3) k k2 | | (u,v) E X2 This objective can be viewed as an instance of the fused LASSO objective [TSR+05]. As the orientation of the underlying pixel grid is artificially imposed by the camera, this method can introduce rotational bias in its output. One way to correct this bias is to group the differences of each pixel with its 4 neighbors. For simplicity, we break symmetry by only consider the top and right neighbors, giving terms of the form:

(x x )2 +(x x )2 (5.4) u v u w where v and w are the horizontal andp vertical neighbor of u. Our generalization of these objectives is based on the key observation that (x x )2 +(x x )2 u v u w and x x are both ` norms of vectors consisting of differences of values between adjacent | u v| 2 p pixels. Each such difference can be viewed as an edge in the underlying graph, and the grouping gives a natural partition of the edges into disjoint sets S1 ...Sk:

2 2 x s + (xu xv) (5.5) k k2 1 i k (u,v) S X  s X2 i

97 When each Si contain a single edge, this formulation is identical to the objective in Equation 5.3 since (x x )2 = x x . To make the first term resemble the other terms in our u v | u v| objective, we will take a square root of it – as we prove in Appendix 5.7, algorithms that give p exact minimizers for this variant still captures the original version of the problem. Also, the terms inside the square roots can be written as quadratic positive semi-definite terms involving x. The simplified problem now becomes:

T x s + x Lix (5.6) k k2 1 i k X  p

We can also use to denote the norm induced by the PSD matrix Li, and rewrite each of the k·kLi later terms as x . Fixed labels s1 ...sk can also be introduced for each of the groups, with roles k kLi similar to s (which becomes s0 in this notation). As the `2 norm is equivalent to the norm given by the identity matrix, x s0 is also a term of the form x si . These generalizations allows k k2 k kLi us to define our main problem:

Definition 5.1.1 The grouped least squares problem is: n Input: n n matrices L1 ...Lk and fixed values s1 ...sk R . ⇥ 2 Output:

min (x)= x si x OBJ k kLi 1 i k X 

Note that this objective allows for the usual definition of LASSO involving terms of the form xu by having one group for each such variable with si = 0. It is also related to group LASSO | | [YL06], which incorporates similar assumptions about closer dependencies among some of the terms. To our knowledge grouping has not been studied in conjunction with fused LASSO, al- though many problems such as the ones listed in Section 5.2 require this generalization.

5.1.1 Quadratic Minimization and Solving Linear Systems

Our algorithmic approach to the group least squares problem crucially depends on solving a re- lated quadratic minimization problem. Specifically, we solve linear systems involving a weighted + combination of the Li matrices. Let w1 ...wk R denote weights, where wi is the weight on the 2 ith group. Then the quadratic minimization problem that we consider is: 1 min 2( , )= 2 x w x si Li (5.7) x OBJ i k k 1 i k w X  We will use OPT2(w) to denote the minimum value that is attainable. This minimizer, x, can be obtained using the following Lemma:

98 Lemma 5.1.2 2(x, w) is minimized for x such that OBJ 1 1 Li x = Lisi wi wi 1 i k ! 1 i k X  X 

1 Therefore the quadratic minimization problem reduces to a linear system solve involving i, i wi L ! or ↵iLi where ↵ is an arbitrary set of positive coefficients. In general, this can be done in O(n ) i P time where ! is the matrix multiplication constant [Str69, CW90, Wil12]. When Li is symmetric diagonallyP dominant, which is the case for image applications and most graph problems, these systems can be approximately solved to ✏ accuracy in O˜(m log(1/✏)) time 1, where m is the total number of non-zero entries in the matrices [ST06, KMP11], and also in O˜(m1/3+✓ log(1/✏)) par- allel depth [BGK+13]. There has also been work on extending this type of approach to a wider class of systems [AST09], with works on systems arising from well-spaced finite-element meshes [BHV08], 2-D trusses [DS07], and certain types of quadratically coupled flows [KMP12]. For the analysis of our algorithms, we treat this step as a black box with running time T (n, m). Further- more, to simplify our presentation we assume that the solvers return exact answers, as errors can be brought to polynomially small values with an extra O(log n) overhead. We believe analyses similar to those performed in [CKM+11, KMP12] can be adapted if we use approximate solvers instead of exact ones.

5.2 Applications A variety of problems ranging from computer vision to statistics can be formulated as grouped least squares. We describe some of them below, starting with classical problems from image processing.

5.2.1 Total Variation Minimization As mentioned earlier, one of the earliest applications of these objectives was in the context of image processing. More commonly known as total variation minimization in this setting [CS05], various variants of the objective have been proposed with the anisotropic objective the same as Equation 5.3 and the isotropic objective being the one shown in Equation 5.5. Obtaining a unified algorithm for isotropic and anisotropic TV was one of the main motivations for our work. Our results lead to an algorithm that approximately minimizes both variants in 4/3 8/3 O˜(m ✏ ) time. This guarantee does not rely on the underlying structure of the graph, so the algorithm readily applicable to 3-D images or non-local models involving the addition of edges across the image. However, when the neighborhoods are those of a 2-D image, a log n factor speedup can be obtained by using the optimal solver for planar systems given in [KM07].

5.2.2 Denoising with Multiple Colors Most works on image denoising deals with images where each pixel is described using a single number corresponding to its intensity. A natural extension would be to colored images, where

1We use O˜(f(m)) to denote O˜(f(m) logc f(m)) for some constant c.

99 each pixel has a set of c attributes (in the RGB case, c =3). One possible analogue of x x | i j| in this case would be x x , and this modification can be incorporated by replacing a cluster || i j||2 involving a single edge with clusters over the c edges between the corresponding pixels. This type of approach can be viewed as an instance of image reconstruction algorithms using Markov random fields. Instead of labeling each vertex with a single attribute, a set of c attributes are used instead and the correlation between vertices is represented using arbitrary PSD matrices. When such matrices have bounded condition number, it was shown in [KMP12] that the result- ing least squares problem can still be solved in nearly-linear time by preconditioning with SDD matrices.

5.2.3 Poisson Image Editing

The Poisson Image Editing method of Perez, Gangnet and Blake [PGB03] is a popular method for image blending. This method aims to minimize the difference between the gradient of the image and a guidance field vector v. We show here that the grouped least square problem can be used for minimizing objectives from this framework. The objective function as given in equation (6) of [PGB03] is:

2 min (fp fq vpq) , with fp = fp⇤ p @⌦ f ⌦ 8 2 | (p,q) ⌦= X\ 6 ; where ⌦ is the domain and @⌦ is the boundary. It is mainly comprised of terms of the form:

(x x v )2 p q pq 2 This term can be rewritten as ((xp xq) (vpq 0)) . So if we let si be the vector where si,p = vpq and si,q =0, and Li be the graph Laplacian for the edge connecting p and q, then the term equals 2 to x si . The other terms on the boundary will have xq as a constant, leading to terms of the || ||Li form x s 2 where s = x . Therefore the discrete Poisson problem of minimizing the sum || i,p i,p||2 i,p q of these squares is an instance of the quadratic minimization problem as described in Section 5.1.1. Perez et al. in Section 2 of their paper observed that these linear systems are sparse, symmetric and positive definite. We make the additional observation here that the systems involved are also symmetric diagonally dominant. The use of the grouped least squares framework also allows the possibility of augmenting these objectives with additional L1 or L2 terms.

5.2.4 Clustering

Hocking et al. [HVBJ11] recently studied an approach for clustering points in d dimensional d space. Given a set of points x1 ...xn R , one method that they proposed is the minimization of 2 the following objective function:

n 2 min xi yi + wij yi yj 2 d 2 y1...yn R || || || || 2 i=1 ij X X 100 Where wij are weights indicating the association between items i and j. This problem can be viewed in the grouped least squares framework by viewing each xi and yi as a list of d variables, giving that the x y and y y terms can be represented using a cluster of d edges. || i i||2 || i j||2 Hocking et al. used the Frank-Wolfe algorithm to minimize a relaxed form of this objective and observed fast behavior in practice. In the grouped least squares framework, this problem is an instance with O(n2) groups and O(dn2) edges. Combining with the observation that the underlying quadratic optimization problems can be solved efficiently allows us to obtain an 1+✏ approximate 8/3 8/3 solution in O˜(dn ✏ ) time.

5.3 Related Algorithms Due to the importance of optimization problems motivated by LASSO there has been much work on efficient algorithms for minimizing the objectives described in Section 5.1. We briefly describe some of the previous approaches for LASSO minimization below.

5.3.1 Second-order Cone Programming

To the best of our knowledge, the only algorithms that provide robust worst-case bounds for the entire class of grouped least squares problems are based on applying tools from convex optimiza- tion. In particular, it is known that interior point methods applied to these problems converge in O˜(pk) iterations with each iterations requiring solving a linear system [BV04, GY04]. Unfor- tunately, computing these solutions is computationally expensive – the best previous bound for solving one of these systems is O(m!) where ! is the matrix multiplication exponent. This results in a total running time of O(m1/2+!), which in well structures situations can be lowered to O(m2). However, for images with m in the millions, it is still cost-prohibitive.

5.3.2 Graph Cuts

For the anisotropic total variation objective shown in Equation 5.3, a minimizer can be found by solving a large number of almost-exact maximum flow calls [DS06, KZ04]. Although the number of iterations can be large, these works show that the number of problem instances that a pixel can appear in is small. Combining this reduction with the fastest known exact algorithm for the maximum flow problem by Goldberg and Rao [GR98] gives an algorithm that runs in O˜(m3/2) time. Both of these algorithms requires extracting the minimum cut in order to construct the problems for subsequent iterations. As a result, it is not clear whether recent advances on fast approximations of maximum flow and minimum s-t cuts [CKM+11] can be directly used as a black box with these algorithms. Extending this approach to the non-linear isotropic objective also appears to be difficult.

5.3.3 Iterative Reweighted Least Squares

An approach similar to convex optimization methods, but has much better observed rates of con- vergence is the iterative reweighted least squares (IRLS) method. This method does a much more

101 aggressive adjustment each iteration and to give good performances in practice [WR07].

5.3.4 First Order Methods

The method of choice in practice are first order methods such as [Nes07, BCG11]. Theoretically these methods are known to converge rapidly when the objective function satisfies certain Lips- chitz conditions. Many of the more recent works on first order methods focus on lowering the dependency of ✏ under these conditions. As the grouped least squares problem is a significantly more general formulation, this direction is somewhat different than our goals,

5.4 Approximating Grouped Least Squares Using Quadratic Minimization

We now describe an approximation algorithm for the grouped least squares problem. The analysis of the algorithm is intricate, but is closely based on the approximate minimum cut algorithm given by Christiano et al. [CKM+11]. It has better error dependencies, but also takes much more time when the number of groups is moderate. Both of these algorithms can be viewed as reductions to the quadratic minimization problems described in Section 5.1.1. As a result, they imply efficient algorithms for problems where fast algorithms are known for the corresponding least squares problems.

Recall that the minimum s-t cut problem, which is equivalent to an $\ell_1$-minimization problem, is a special case of the grouped least squares problem where each edge belongs to its own group (i.e., $k = m$). As a result, it is natural to extend the approach of [CKM+11] to the whole spectrum of values of $k$ by treating each group as an edge. One view of the cut algorithm from [CKM+11] is that it places a weight on each group, and minimizes a quadratic, or $\ell_2^2$, problem involving terms of the form $\frac{1}{w_i}\|x - s_i\|_{L_i}^2$. Their algorithm then adjusts the weights based on the flow on each edge using the multiplicative weights update framework [AHK05, LW94]. This flow is in turn obtained from the dual of the quadratic minimization problem. We simplify this step by showing that the energies of the groups from the quadratic minimization problem can be used directly. Pseudocode of the algorithm is shown in Algorithm 1. The main difficulty in analyzing this algorithm is that the analysis of the minimum s-t cut algorithm of [CKM+11] relies strongly on the existence of a solution where $x$ is either 0 or 1. Our analysis extends to the fractional setting using a potential function based on the Kullback-Leibler (KL) divergence [KL51]. To our knowledge the use of this potential with multiplicative weights was first introduced by Freund and Schapire [FS99], and is common in learning theory. This function can be viewed as measuring the KL-divergence between $w_i^{(t)}$ and $\|\bar{x} - s_i\|_{L_i}$ over all groups, where $\bar{x}$ is an optimum solution to the grouped least squares problem. This term, which we denote as $D_{KL}$, is:

\[
D_{KL} = \sum_i \|\bar{x} - s_i\|_{L_i} \log\left(\frac{\|\bar{x} - s_i\|_{L_i}}{w_i^{(t)}}\right) \tag{5.8}
\]

Algorithm 1 Algorithm for the approximate decision problem of whether there exist vertex potentials with objective at most OPT

APPROXGROUPEDLEASTSQUARES
Input: PSD matrices $L_1, \ldots, L_k$ and fixed values $s_1, \ldots, s_k$ for each group; a routine SOLVE for solving linear systems; width parameter $\rho$ and error bound $\epsilon$.
Output: Vector $x$ such that $\mathrm{OBJ}(x) \le (1 + 10\epsilon)\mathrm{OPT}$.
1: Initialize $w_i^{(0)} = 1$ for all $1 \le i \le k$.
2: $N \leftarrow 10\rho \log n\,\epsilon^{-2}$.
3: for $t = 1 \ldots N$ do
4: $\quad \mu^{(t-1)} \leftarrow \sum_i w_i^{(t-1)}$.
5: $\quad$ Use SOLVE as described in Lemma 5.1.2 to compute a minimizer for the quadratic minimization problem where $\alpha_i = 1/w_i^{(t-1)}$. Let this solution be $x^{(t)}$.
6: $\quad$ Let $\lambda^{(t)} = \sqrt{\mu^{(t-1)}\,\mathrm{OBJ}_2(x^{(t)})}$, where $\mathrm{OBJ}_2(x^{(t)})$ is the quadratic objective given in Equation 5.7.
7: $\quad$ Update the weight of each group:
\[
w_i^{(t)} \leftarrow w_i^{(t-1)} + \frac{\epsilon}{\rho}\cdot\frac{\|x^{(t)} - s_i\|_{L_i}}{\lambda^{(t)}}\,\mu^{(t-1)} + \frac{2\epsilon^2}{k\rho}\,\mu^{(t-1)}
\]
8: end for
9: $\bar{t} \leftarrow \arg\min_{0 \le t \le N} \mathrm{OBJ}(x^{(t)})$.
10: return $x^{(\bar{t})}$.
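The following Python sketch renders the loop above schematically, under the assumption that a black-box routine solve_quadratic(alphas) returns the minimizer of $\sum_i \alpha_i\|x - s_i\|_{L_i}^2$ (standing in for SOLVE from Lemma 5.1.2). The function names and the dense-matrix representation of the $L_i$s are illustrative assumptions; they are not the implementation used in the experiments later in this chapter.

import numpy as np

def approx_grouped_least_squares(Ls, ss, solve_quadratic, rho, eps, n):
    # Ls: list of k PSD matrices L_i; ss: list of k vectors s_i.
    # solve_quadratic(alphas) is assumed to return argmin_x sum_i alphas[i] * ||x - s_i||_{L_i}^2.
    k = len(Ls)
    def group_norms(x):
        return np.array([np.sqrt((x - s) @ L @ (x - s)) for L, s in zip(Ls, ss)])
    def obj(x):
        return group_norms(x).sum()

    w = np.ones(k)
    N = int(10 * rho * np.log(n) / eps**2)
    best_x, best_val = None, np.inf
    for _ in range(N):
        mu = w.sum()
        x = solve_quadratic(1.0 / w)                 # line 5: alpha_i = 1 / w_i
        norms = group_norms(x)
        lam = np.sqrt(mu * np.sum(norms**2 / w))     # line 6: lambda = sqrt(mu * OBJ_2)
        if obj(x) < best_val:                        # track the best iterate (lines 9-10)
            best_x, best_val = x, obj(x)
        w = w + (eps / rho) * (norms / lam) * mu + (2 * eps**2 / (k * rho)) * mu   # line 7
    return best_x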

One way to interpret this potential is that $\|x - s_i\|_{L_i}$ and $\frac{1}{w_i}\|x - s_i\|_{L_i}^2$ are equal when $w_i = \|x - s_i\|_{L_i}$. Therefore, the algorithm can be viewed as gradually adjusting the weights to become a scaled copy of the $\|\bar{x} - s_i\|_{L_i}$, and $D_{KL}$ serves as a way to measure this difference. It can be simplified by subtracting the constant term given by $\sum_i \|\bar{x} - s_i\|_{L_i}\log(\|\bar{x} - s_i\|_{L_i})$ and multiplying by $1/\mathrm{OPT}$. This gives our key potential function, $\nu^{(t)}$:

\[
\nu^{(t)} = \frac{1}{\mathrm{OPT}}\sum_i \|\bar{x} - s_i\|_{L_i}\log\bigl(w_i^{(t)}\bigr) \tag{5.9}
\]

It is worth noting that in the case of the cut algorithm, this function is identical to the potential function used in [CKM+11]. We show the convergence of our algorithm by proving that if the solution produced in some iteration is far from optimal, $\nu^{(t)}$ increases substantially. Upper bounding it by a term related to the sum of the weights, $\mu^{(t)}$, allows us to prove convergence.

To simplify the analysis, we assume that all entries of $s$, the spectrum of the $L_i$s, and $\mathrm{OPT}$ are polynomially bounded in $n$. That is, there exists some constant $d$ such that $n^{-d} \le s_{i,u} \le n^d$ and $n^{-d} I \preceq \sum_i L_i \preceq n^d I$, where $A \preceq B$ means $B - A$ is PSD. Some of these assumptions can be relaxed via analyses similar to Section 2 of [CKM+11].

Theorem 5.4.1 On input of an instance of $\mathrm{OBJ}$ with edges partitioned into $k$ sets, where all parameters are polynomially bounded between $n^{-d}$ and $n^d$, running APPROXGROUPEDLEASTSQUARES with $\rho = 2k^{1/3}\epsilon^{-2/3}$ returns a solution $x$ such that $\mathrm{OBJ}(x) \le \max\{(1+10\epsilon)\mathrm{OPT},\, n^{-d}\}$, where $\mathrm{OPT}$ is the value of the optimum solution.

The additive $n^{-d}$ case is included to deal with the case where $\mathrm{OPT} = 0$, or is close to it. We believe it should also be possible to handle this case by restricting the condition number of $\sum_i L_i$. For readers familiar with the analysis of the Christiano et al. algorithm [CKM+11], the following mapping of terminology used in the analysis below might be useful:

• edge $e$ $\rightarrow$ group $i$.

• flow along edge $e$ $\rightarrow$ value of $\|x - s_i\|_{L_i}$.

• weight on edge $e$ $\rightarrow$ weight of group $i$, $w_i$.

• electrical flow problem $\rightarrow$ quadratic minimization problem (defined in Section 5.1.1).

• total energy of electrical flow / effective resistance $\rightarrow$ $\mathrm{OBJ}_2(w)$.

We first show that $\lambda^{(t)}$, as defined on Line 6 of Algorithm 1, is an upper bound for $\mathrm{OBJ}(x^{(t)})$. This is crucial for its use as the normalizing factor in our update step on Line 7.

Lemma 5.4.2 In all iterations we have:

\[
\mathrm{OBJ}(x^{(t)}) \le \lambda^{(t)}
\]

Proof By the Cauchy-Schwarz inequality we have:

\begin{align*}
(\lambda^{(t)})^2 &= \left(\sum_i w_i^{(t-1)}\right)\left(\sum_i \frac{1}{w_i^{(t-1)}}\|x^{(t)} - s_i\|_{L_i}^2\right)\\
&\ge \left(\sum_i \|x^{(t)} - s_i\|_{L_i}\right)^2\\
&= \mathrm{OBJ}(x^{(t)})^2 \tag{5.10}
\end{align*}
Taking square roots of both sides completes the proof. $\blacksquare$

At a high level, the algorithm assigns a weight $w_i$ to each group, and iteratively reweights them for $N$ iterations. Recall that our key potential functions are $\mu^{(t)}$, the sum of the weights of all groups, and:

\[
\nu^{(t)} = \frac{1}{\mathrm{OPT}}\sum_i \|\bar{x} - s_i\|_{L_i}\log\bigl(w_i^{(t)}\bigr) \tag{5.11}
\]

where $\bar{x}$ is a solution such that $\mathrm{OBJ}(\bar{x}) = \mathrm{OPT}$. We will show that if $\mathrm{OBJ}(x^{(t)})$, or in turn $\lambda^{(t)}$, is large, then $\nu^{(t)}$ increases at a rate substantially faster than $\log(\mu^{(t)})$. These bounds, and the relations between $\mu^{(t)}$ and $\nu^{(t)}$, are summarized below:

Lemma 5.4.3
1.
\[
\nu^{(t)} \le \log(\mu^{(t)}) \tag{5.12}
\]
2.
\[
\mu^{(t)} \le \left(1 + \frac{\epsilon(1+2\epsilon)}{\rho}\right)\mu^{(t-1)} \tag{5.13}
\]
and
\[
\log(\mu^{(t)}) \le \frac{\epsilon(1+2\epsilon)}{\rho}\,t + \log k \tag{5.14}
\]
3. If in iteration $t$ we have $\lambda^{(t)} \ge (1+10\epsilon)\mathrm{OPT}$ and $\|x^{(t)} - s_i\|_{L_i} \le \rho\,\lambda^{(t)}\,\frac{w_i^{(t-1)}}{\mu^{(t-1)}}$ for all groups $i$, then:
\[
\nu^{(t)} \ge \nu^{(t-1)} + \frac{\epsilon(1+8\epsilon)}{\rho} \tag{5.15}
\]

The relationship between the upper and lower potentials can be established using the fact that wi is non-negative:

Proof of Lemma 5.4.3, Part 1:

\begin{align*}
\nu^{(t)} &= \frac{1}{\mathrm{OPT}}\sum_i \|\bar{x} - s_i\|_{L_i}\log\bigl(w_i^{(t)}\bigr)\\
&\le \frac{1}{\mathrm{OPT}}\sum_i \|\bar{x} - s_i\|_{L_i}\log\left(\sum_j w_j^{(t)}\right)\\
&= \log(\mu^{(t)})\cdot\frac{1}{\mathrm{OPT}}\left(\sum_i \|\bar{x} - s_i\|_{L_i}\right)\\
&\le \log(\mu^{(t)}) \tag{5.16}
\end{align*}
$\blacksquare$

Part 2 follows directly from the local behavior of the log function:

Proof of Lemma 5.4.3, Part 2: The update rule gives:

\begin{align*}
\mu^{(t)} &= \sum_i w_i^{(t)}\\
&= \sum_i \left(w_i^{(t-1)} + \frac{\epsilon}{\rho}\frac{\|x^{(t)} - s_i\|_{L_i}}{\lambda^{(t)}}\,\mu^{(t-1)} + \frac{2\epsilon^2}{k\rho}\,\mu^{(t-1)}\right) \quad \text{by the update rule on Line 7}\\
&= \mu^{(t-1)} + \frac{\epsilon}{\rho}\frac{\sum_i\|x^{(t)} - s_i\|_{L_i}}{\lambda^{(t)}}\,\mu^{(t-1)} + \frac{2\epsilon^2}{\rho}\,\mu^{(t-1)}\\
&= \mu^{(t-1)} + \frac{\epsilon}{\rho}\frac{\mathrm{OBJ}(x^{(t)})}{\lambda^{(t)}}\,\mu^{(t-1)} + \frac{2\epsilon^2}{\rho}\,\mu^{(t-1)}\\
&\le \mu^{(t-1)} + \frac{\epsilon}{\rho}\,\mu^{(t-1)} + \frac{2\epsilon^2}{\rho}\,\mu^{(t-1)} \quad \text{by Lemma 5.4.2}
\end{align*}

\[
= \left(1 + \frac{\epsilon(1+2\epsilon)}{\rho}\right)\mu^{(t-1)} \tag{5.17}
\]
Using the fact that $1 + x \le \exp(x)$ when $x \ge 0$ we get:

\begin{align*}
\mu^{(t)} &\le \exp\left(\frac{\epsilon(1+2\epsilon)}{\rho}\right)\mu^{(t-1)}\\
&\le \exp\left(\frac{\epsilon(1+2\epsilon)}{\rho}\,t\right)\mu^{(0)}\\
&= \exp\left(\frac{\epsilon(1+2\epsilon)}{\rho}\,t\right)k
\end{align*}
Taking logs of both sides gives Equation 5.14. $\blacksquare$

This upper bound on the value of $\mu^{(t)}$ also allows us to show that the update rule keeps the $w_i^{(t)}$s reasonably balanced, within a factor of $k$ of each other. The following corollary can also be obtained.

Corollary 5.4.4 The weights at iteration $t$ satisfy $w_i^{(t)} \ge \frac{\epsilon}{k}\,\mu^{(t)}$.

Proof The proof is by induction on $t$. When $t = 0$ we have $w_i^{(0)} = 1$ and $\mu^{(0)} = k$, and the claim follows from $\frac{\epsilon}{k}\cdot k = \epsilon < 1$. When $t \ge 1$, we have:

\begin{align*}
w_i^{(t)} &\ge w_i^{(t-1)} + \frac{2\epsilon^2}{k\rho}\,\mu^{(t-1)} \quad \text{by Line 7}\\
&\ge \frac{\epsilon}{k}\,\mu^{(t-1)} + \frac{2\epsilon^2}{k\rho}\,\mu^{(t-1)} \quad \text{by the inductive hypothesis}\\
&= \frac{\epsilon}{k}\left(1 + \frac{2\epsilon}{\rho}\right)\mu^{(t-1)}\\
&\ge \frac{\epsilon}{k}\left(1 + \frac{\epsilon(1+2\epsilon)}{\rho}\right)\mu^{(t-1)}\\
&\ge \frac{\epsilon}{k}\,\mu^{(t)} \quad \text{by Lemma 5.4.3, Part 2} \tag{5.18}
\end{align*}
$\blacksquare$

The proof of Part 3 is the key part of our analysis. The first-order change of $\nu^{(t)}$ is written as a sum of products of $L_i$-norms, which we analyze using the fact that $x^{(t)}$ is the solution of a linear system arising from the quadratic minimization problem.

Proof of Lemma 5.4.3, Part 3: We make use of the following known fact about the behavior of the log function around 1:

Fact 5.4.5 If $0 \le x \le \epsilon$, then $\log(1 + x) \ge (1 - \epsilon)x$.

\begin{align*}
\nu^{(t)} - \nu^{(t-1)} &= \frac{1}{\mathrm{OPT}}\sum_{1 \le i \le k}\|\bar{x} - s_i\|_{L_i}\log\bigl(w_i^{(t)}\bigr) - \frac{1}{\mathrm{OPT}}\sum_{1 \le i \le k}\|\bar{x} - s_i\|_{L_i}\log\bigl(w_i^{(t-1)}\bigr) \quad \text{by Equation 5.11}\\
&= \frac{1}{\mathrm{OPT}}\sum_{1 \le i \le k}\|\bar{x} - s_i\|_{L_i}\log\left(\frac{w_i^{(t)}}{w_i^{(t-1)}}\right)\\
&\ge \frac{1}{\mathrm{OPT}}\sum_{1 \le i \le k}\|\bar{x} - s_i\|_{L_i}\log\left(1 + \frac{\epsilon}{\rho}\,\frac{\|x^{(t)} - s_i\|_{L_i}\,\mu^{(t-1)}}{\lambda^{(t)}\,w_i^{(t-1)}}\right) \quad \text{by the update rule on Line 7}\\
&\ge \frac{1}{\mathrm{OPT}}\sum_{1 \le i \le k}\|\bar{x} - s_i\|_{L_i}\,\frac{\epsilon(1-\epsilon)}{\rho}\,\frac{\|x^{(t)} - s_i\|_{L_i}\,\mu^{(t-1)}}{\lambda^{(t)}\,w_i^{(t-1)}} \quad \text{by Fact 5.4.5}\\
&= \frac{\epsilon(1-\epsilon)\,\mu^{(t-1)}}{\rho\,\mathrm{OPT}\,\lambda^{(t)}}\sum_{1 \le i \le k}\frac{1}{w_i^{(t-1)}}\|\bar{x} - s_i\|_{L_i}\|x^{(t)} - s_i\|_{L_i} \tag{5.19}
\end{align*}
Since $L_i$ defines a PSD norm, by the Cauchy-Schwarz inequality we have:

\begin{align*}
\|\bar{x} - s_i\|_{L_i}\|x^{(t)} - s_i\|_{L_i} &\ge (\bar{x} - s_i)^T L_i (x^{(t)} - s_i)\\
&= \|x^{(t)} - s_i\|_{L_i}^2 + (\bar{x} - x^{(t)})^T L_i (x^{(t)} - s_i) \tag{5.20}
\end{align*}
Recall from Lemma 5.1.2 that since $x^{(t)}$ is the minimizer of $\mathrm{OBJ}_2(w^{(t-1)})$, we have:

\begin{align*}
\left(\sum_i \frac{1}{w_i^{(t-1)}} L_i\right) x^{(t)} &= \sum_i \frac{1}{w_i^{(t-1)}} L_i s_i \tag{5.21}\\
\sum_i \frac{1}{w_i^{(t-1)}} L_i (x^{(t)} - s_i) &= 0 \tag{5.22}\\
(\bar{x} - x^{(t)})^T \sum_i \frac{1}{w_i^{(t-1)}} L_i (x^{(t)} - s_i) &= 0 \tag{5.23}
\end{align*}
Substituting this into Equation 5.19 gives:

\begin{align*}
\nu^{(t)} - \nu^{(t-1)} &\ge \frac{\epsilon(1-\epsilon)\,\mu^{(t-1)}}{\rho\,\mathrm{OPT}\,\lambda^{(t)}}\sum_i \frac{1}{w_i^{(t-1)}}\|x^{(t)} - s_i\|_{L_i}^2\\
&= \frac{\epsilon(1-\epsilon)}{\rho\,\mathrm{OPT}\,\lambda^{(t)}}\,\mu^{(t-1)}\,\mathrm{OBJ}_2(w^{(t-1)}, x^{(t)})\\
&= \frac{\epsilon(1-\epsilon)}{\rho\,\mathrm{OPT}\,\lambda^{(t)}}\,(\lambda^{(t)})^2 \quad \text{by the definition of } \lambda^{(t)} \text{ on Line 6}\\
&\ge \frac{\epsilon(1-\epsilon)(1+10\epsilon)}{\rho} \quad \text{by the assumption that } \lambda^{(t)} \ge (1+10\epsilon)\mathrm{OPT}\\
&\ge \frac{\epsilon(1+8\epsilon)}{\rho} \tag{5.24}
\end{align*}
$\blacksquare$

Since the iteration count largely depends on $\rho$, it suffices to provide bounds for $\rho$ over all the iterations. The proof makes use of the following lemma about the properties of electrical flows, which describes the behavior of modifying the weights of a group $S_i$ that has a large contribution to the total energy. It can be viewed as a multiple-edge version of Lemma 2.6 of [CKM+11].

Lemma 5.4.6 Assume that $\epsilon^2\rho^2 < \frac{1}{10}k$ and $\epsilon < 0.01$, and let $x^{(t)}$ be the minimizer for $\mathrm{OBJ}_2(w^{(t-1)})$. Suppose there is a group $i$ such that $\|x^{(t)} - s_i\|_{L_i} \ge \rho\,\lambda^{(t)}\,\frac{w_i^{(t-1)}}{\mu^{(t-1)}}$. Then:
\[
\mathrm{OPT}_2(w^{(t)}) \le \exp\left(-\frac{\epsilon^2\rho^2}{2k}\right)\mathrm{OPT}_2(w^{(t-1)})
\]

Proof

We first show that group $i$ contributes a significant portion of $\mathrm{OBJ}_2(w^{(t-1)}, x^{(t)})$. Squaring both sides of the given condition gives:

\begin{align*}
\|x^{(t)} - s_i\|_{L_i}^2 &\ge \rho^2\,\frac{(w_i^{(t-1)})^2}{(\mu^{(t-1)})^2}\,(\lambda^{(t)})^2\\
&= \rho^2\,\frac{(w_i^{(t-1)})^2}{(\mu^{(t-1)})^2}\,\mu^{(t-1)}\,\mathrm{OBJ}_2(w^{(t-1)}, x^{(t)}) \tag{5.25}\\
\frac{1}{w_i^{(t-1)}}\|x^{(t)} - s_i\|_{L_i}^2 &\ge \rho^2\,\frac{w_i^{(t-1)}}{\mu^{(t-1)}}\,\mathrm{OBJ}_2(w^{(t-1)}, x^{(t)})\\
&\ge \frac{\epsilon\rho^2}{k}\,\mathrm{OBJ}_2(w^{(t-1)}, x^{(t)}) \quad \text{by Corollary 5.4.4} \tag{5.26}
\end{align*}

Also, by the update rule we have $w_i^{(t)} \ge (1+\epsilon)\,w_i^{(t-1)}$ and $w_j^{(t)} \ge w_j^{(t-1)}$ for all $1 \le j \le k$. So we have:

\begin{align*}
\mathrm{OPT}_2(w^{(t)}) &\le \mathrm{OBJ}_2(w^{(t)}, x^{(t)})\\
&\le \mathrm{OBJ}_2(w^{(t-1)}, x^{(t)}) - \left(1 - \frac{1}{1+\epsilon}\right)\frac{1}{w_i^{(t-1)}}\|x^{(t)} - s_i\|_{L_i}^2\\
&\le \mathrm{OBJ}_2(w^{(t-1)}, x^{(t)}) - \frac{\epsilon}{2}\cdot\frac{1}{w_i^{(t-1)}}\|x^{(t)} - s_i\|_{L_i}^2\\
&\le \left(1 - \frac{\epsilon^2\rho^2}{2k}\right)\mathrm{OBJ}_2(w^{(t-1)}, x^{(t)})\\
&\le \exp\left(-\frac{\epsilon^2\rho^2}{2k}\right)\mathrm{OBJ}_2(w^{(t-1)}, x^{(t)})\\
&= \exp\left(-\frac{\epsilon^2\rho^2}{2k}\right)\mathrm{OPT}_2(w^{(t-1)}) \tag{5.27}
\end{align*}
$\blacksquare$

This means the value of the quadratic minimization problem can be used as a second potential function. We first show that it is monotonic and establish rough bounds for it.

Lemma 5.4.7 $\mathrm{OPT}_2(w^{(0)}) \le n^{3d}$, and $\mathrm{OPT}_2(w^{(t)})$ is monotonically decreasing in $t$.

Proof By the assumption that the input is polynomially bounded, we have that all entries of $s$ are at most $n^d$ and $L_i \preceq n^d I$. Setting $x_u = 0$ for all $u$ gives $\|x - s_i\|_2 \le n^{d+1}$. Combining this with the spectrum bound then gives $\|x - s_i\|_{L_i} \le n^{2d+1}$. Summing over all the groups gives the upper bound. The monotonicity of $\mathrm{OPT}_2(w^{(t)})$ follows from the fact that the weights $w_i^{(t)}$ are non-decreasing in $t$, so the coefficients $1/w_i^{(t)}$ in the quadratic objective are non-increasing. $\blacksquare$

Combining this with the fact that $\mathrm{OPT}_2(w^{(N)})$ is not low enough for termination gives our bound on the total iteration count.

Proof of Theorem 5.4.1: The proof is by contradiction. Suppose otherwise; then $\mathrm{OBJ}(x^{(t)}) > \max\{(1+10\epsilon)\mathrm{OPT},\, n^{-d}\}$ holds in every iteration. In particular, since $\mathrm{OBJ}(x^{(N)}) > n^{-d}$ we have:

\begin{align*}
\lambda^{(N)} \ge \mathrm{OBJ}(x^{(N)}) &> n^{-d} \tag{5.28}\\
\sqrt{\mu^{(N)}\,\mathrm{OPT}_2(w^{(N)})} &\ge n^{-d} \tag{5.29}\\
\mathrm{OPT}_2(w^{(N)}) &\ge \frac{1}{n^{2d}\,\mu^{(N)}} \tag{5.30}
\end{align*}

Combining this with $\mathrm{OPT}_2(w^{(0)}) \le n^{3d}$ from Lemma 5.4.7 gives:
\[
\frac{\mathrm{OPT}_2(w^{(0)})}{\mathrm{OPT}_2(w^{(N)})} \le n^{5d}\,\mu^{(N)} \tag{5.31}
\]
By Lemma 5.4.3, Part 2, we have:

\[
\log(\mu^{(N)}) \le \frac{\epsilon(1+2\epsilon)}{\rho}\,N + \log k
\]

\begin{align*}
&\le \frac{\epsilon(1+2\epsilon)}{\rho}\cdot 10d\rho\log n\,\epsilon^{-2} + \log n \quad \text{by the choice of } N = 10d\rho\log n\,\epsilon^{-2}\\
&= 10d(1+2\epsilon)\epsilon^{-1}\log n + \log n\\
&\le 10d(1+3\epsilon)\epsilon^{-1}\log n \quad \text{when } \epsilon < 0.01 \tag{5.32}
\end{align*}
Combining this with Lemma 5.4.6 implies that the number of iterations in which $\|x^{(t)} - s_i\|_{L_i} \ge \rho\,\lambda^{(t)}\,\frac{w_i^{(t-1)}}{\mu^{(t-1)}}$ for some group $i$ is at most:

\begin{align*}
\log\left(n^{5d}\mu^{(N)}\right)\Big/\frac{\epsilon^2\rho^2}{2k} &\le 10d(1+3\epsilon)\epsilon^{-1}\log n\Big/\frac{2\epsilon^{2/3}}{k^{1/3}} \quad \text{by the choice of } \rho = 2k^{1/3}\epsilon^{-2/3}\\
&\le 8d\,\epsilon^{-5/3}k^{1/3}\log n\\
&= 4d\,\epsilon^{-1}\rho\log n\\
&\le \epsilon N \tag{5.33}
\end{align*}
This means that we have $\|x^{(t)} - s_i\|_{L_i} \le \rho\,\lambda^{(t)}\,\frac{w_i^{(t-1)}}{\mu^{(t-1)}}$ for all $1 \le i \le k$ in at least $(1-\epsilon)N$ iterations, and therefore by Lemma 5.4.3, Part 3:
\[
\nu^{(N)} \ge \nu^{(0)} + \frac{\epsilon(1+8\epsilon)}{\rho}(1-\epsilon)N > \log\left(\mu^{(N)}\right) \tag{5.34}
\]

This contradicts Part 1 of Lemma 5.4.3, completing the proof. $\blacksquare$

5.5 Evidence of Practical Feasibility

We performed a series of experiments using the algorithm described in Section 5.4 in order to demonstrate its practical feasibility. Empirically, our running times are slower than those of state of the art methods, but are nonetheless reasonable. This suggests the need for further experimental work on a more optimized version, which is outside the scope of this work. The SDD linear systems that arise in the quadratic minimization problems were solved using the combinatorial multigrid (CMG) solver [KM09b, KMT09]. One side observation confirmed by these experiments is that for the sparse SDD linear systems that arise from image processing, the CMG solver yields good results both in accuracy and running time.

5.5.1 Total Variational Denoising

Total Variational Denoising is the concept of applying Total Variational Minimization as a denoising process. This was pioneered by Rudin, Osher and Fatemi [ROF92] and is commonly known as the ROF image model. Our formulation of the grouped least squares problem yields a simple way to solve the ROF model and most of its variants. In Figure 5.1, we present a simple denoising experiment using the standard image processing data set, 'Lenna'. The main goal of the experiment is to show that our algorithm is competitive in terms of accuracy, while having running times comparable to first-order methods. On a $512 \times 512$ grayscale image, we introduce Additive White Gaussian Noise (AWGN) at a measured Signal to Noise Ratio (SNR) of 2. AWGN is the most common noise model in photon capturing sensors, from consumer cameras to space satellite systems. We compare the results produced by our algorithm with those of the Split Bregman algorithm from [GO09] and the Gauss-Seidel variation of the fixed point algorithm from [MSX11]. These methods minimize an objective with the $\ell_2^2$ fidelity term given in Equation 5.3, while we used the variant with $\ell_2$ fidelity shown in Equation 5.6. Also, the parameters in these algorithms were picked to give the best results for the objective functions being minimized. As a result, for measuring the quality of the output images we only use the $\ell_2$ and $\ell_1$ norms of pixel-wise differences with the original image.
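The core computational step in each reweighting iteration is an SDD linear system over the image grid. The following Python/SciPy sketch shows one such weighted quadratic subproblem for a grayscale image, of the form $(I + \lambda L_w)y = s$ after normalization; the function names, the direct sparse solve, and the specific normalization are illustrative assumptions (the experiments reported here used the CMG solver from MATLAB), not the actual implementation.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def grid_laplacian(h, w, weights_h, weights_v):
    # Weighted Laplacian of the 4-connected h-by-w grid.
    # weights_h: (h, w-1) horizontal edge weights; weights_v: (h-1, w) vertical edge weights.
    n = h * w
    idx = np.arange(n).reshape(h, w)
    rows, cols, vals = [], [], []
    def add_edges(a, b, wts):
        rows.extend(a); cols.extend(b); vals.extend(-wts)
        rows.extend(b); cols.extend(a); vals.extend(-wts)
    add_edges(idx[:, :-1].ravel(), idx[:, 1:].ravel(), weights_h.ravel())
    add_edges(idx[:-1, :].ravel(), idx[1:, :].ravel(), weights_v.ravel())
    L = sp.csr_matrix((vals, (rows, cols)), shape=(n, n))
    L = L - sp.diags(np.asarray(L.sum(axis=1)).ravel())   # put degrees on the diagonal
    return L

def denoise_step(noisy, lam, weights_h, weights_v):
    # One quadratic minimization: argmin_y ||y - s||^2 + lam * y^T L_w y,
    # i.e. solve (I + lam * L_w) y = s.  An SDD solver such as CMG could replace spsolve here.
    h, w = noisy.shape
    L = grid_laplacian(h, w, weights_h, weights_v)
    A = sp.eye(h * w) + lam * L
    y = spla.spsolve(A.tocsc(), noisy.ravel())
    return y.reshape(h, w)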

Figure 5.1: Outputs of various denoising algorithms on an image with AWGN noise. From left to right: noisy version, Split Bregman [GO09], Fixed Point [MSX11], and Grouped Least Squares. Errors listed below each figure are the $\ell_2$ and $\ell_1$ norms of differences with the original image; the values recovered from the figure are: noisy version 29.29E3, 8.85E6; Split Bregman 6.70E3, 1.89E6; Grouped Least Squares 5.99E3, 1.39E6; Fixed Point 6.64E3, 1.87E6.

Our experiments were conducted on a single core of a 64-bit Intel(R) Xeon(R) E5440 CPU @ 2.83GHz. The non-solver portion of the algorithm was implemented in MATLAB(R). On images of size $256 \times 256$, $512 \times 512$ and $1024 \times 1024$, the average running times are 2.31, 9.70 and 47.61 seconds respectively. These running times are noticeably slower than the state of the art. However, on average only 45% of the total running time is spent solving the SDD linear systems using the CMG solver. The rest is mostly from reweighting edges and MATLAB function calls, which should be much faster in more optimized versions. More importantly, in all of our experiments the weights are observed to converge in under 15 iterations, even for larger images of size up to $3000 \times 3000$.

5.5.2 Image Processing

As exemplified by the denoising with colors application discussed in Section 5.2.2, the grouped least squares framework can be used for a wide range of image processing tasks. Some examples of such applications are shown in Figure 5.2. Our denoising algorithm can be applied as a pre-processing step to segmenting images of the retina obtained from Optical Coherence Tomography (OCT) scans. Here the key is to preserve the sharpness between the nerve fiber layers, and this is achieved by using an $\ell_1$ regularization term. Variations of this formulation allow one to emulate a large variety of established image preprocessing applications. For example, introducing additional $\ell_2$ terms containing differences of neighboring pixels of a patch leads to the removal of boundaries, giving an overall blurring effect. On our examples, this leads to results similar to methods that apply a filter over the image, such as Gaussian blurring. Introducing such effects using an objective function has the advantage that it can be used in conjunction with other terms. By mixing and matching penalty terms on the groups, we can preserve global features while favoring the removal of small artifacts introduced by sensor noise. An example of the Poisson Image Editing application mentioned in Section 5.2.3 is shown in Figure 5.3. The application is seamless cloning as described in Section 3 of [PGB03], which aims to insert complex objects into another image. Given two images, they are blended by solving the discrete Poisson equation based on a mix of their gradients and boundary values. We also added $\ell_2$ constraints on different parts of the image to give a smoother result. The input consists of the locations of the foreground pictures over the background, along with boundaries (shown in red) around the objects in the foreground. These rough boundaries make the blending of surrounding textures the main challenge, and our three examples (void/sky, sea/pool, snow/sand) are some representative situations. These examples also show that our approaches can be extended to handle multichannel images (RGB or multi-spectral) with only a few modifications.
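For concreteness, the following is a minimal single-channel sketch of the discrete Poisson blending step in the spirit of [PGB03]. It is an illustrative toy under stated assumptions (the function name poisson_blend is hypothetical, the mask is assumed not to touch the image border, and the additional $\ell_2$ constraints and multichannel handling described above are omitted); it is not the implementation used for Figure 5.3.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def poisson_blend(background, foreground, mask):
    # Inside the mask, match the foreground's gradients; on the boundary, keep the
    # background's values (Dirichlet conditions).  mask is a boolean array.
    h, w = background.shape
    idx = -np.ones((h, w), dtype=int)
    ys, xs = np.nonzero(mask)
    idx[ys, xs] = np.arange(len(ys))
    n = len(ys)
    A = sp.lil_matrix((n, n))
    b = np.zeros(n)
    for k, (y, x) in enumerate(zip(ys, xs)):
        A[k, k] = 4.0
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            b[k] += foreground[y, x] - foreground[ny, nx]   # guidance gradient
            if mask[ny, nx]:
                A[k, idx[ny, nx]] = -1.0
            else:
                b[k] += background[ny, nx]                  # boundary value from background
    out = background.copy()
    out[ys, xs] = spla.spsolve(A.tocsc(), b)
    return out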

5.6 Relations between Graph Problems and Minimizing LASSO Objectives

We now give formal proofs that the shortest path problem is an instance of LASSO and the minimum cut problem is an instance of fused LASSO. Our proofs do not guarantee that the answers returned are a single path or cut. In fact, when multiple solutions have the same value, it is possible for our algorithm to return a linear combination of them. However, we can ensure that the optimum solution is unique using the Isolation Lemma of Mulmuley, Vazirani and Vazirani

[Figure 5.2 panel labels: OCT scan of retina; photo of room; newspaper; scan after preprocessing; global features; decolored version.]

Figure 5.2: Applications to various image processing tasks. From left to right: image segmentation, global feature extraction / blurring, and decoloring.

[MVV87] while only incurring polynomial increase in edge lengths/weights. This analysis is sim- ilar to the one in Section 3.5 of [DS08] for finding a unique minimum cost flow, and is omitted here. We prove the two claims in Fact 5.0.1 about shortest path and minimum cut separately in Lemmas 5.6.1 and 5.6.2.

Lemma 5.6.1 Given an s-t shortest path instance in an undirected graph whose edge lengths $l : E \to \mathbb{R}^+$ are integers between 1 and $n^d$, there is a LASSO minimization instance, with all entries bounded by $n^{O(d)}$, such that the value of the LASSO minimizer is within 1 of the length of the shortest path.

Proof Our reductions rely crucially on the edge-vertex incidence matrix, which we denote by $\partial$. Entries of this matrix are defined as follows:
\[
\partial_{e,u} = \begin{cases} 1 & \text{if } u \text{ is the head of } e\\ -1 & \text{if } u \text{ is the tail of } e\\ 0 & \text{otherwise} \end{cases} \tag{5.35}
\]
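As a small illustration (the helper name is hypothetical and the head/tail orientation is an arbitrary choice), the incidence matrix and its action on a path flow can be written as follows; this is the constraint used in the reduction below.

import numpy as np

def incidence_matrix(edges, n):
    # Edge-vertex incidence matrix of Equation 5.35: +1 at the head, -1 at the tail of each edge.
    B = np.zeros((len(edges), n))
    for e, (head, tail) in enumerate(edges):
        B[e, head] = 1.0
        B[e, tail] = -1.0
    return B

edges = [(0, 1), (1, 2), (2, 3)]     # a path 0 -> 1 -> 2 -> 3 on 4 vertices
B = incidence_matrix(edges, 4)
f = np.ones(3)                       # route one unit along every edge of the path
print(B.T @ f)                       # [ 1.  0.  0. -1.]: net flow +1 at s, -1 at t, 0 elsewhere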

Figure 5.3: Example of seamless cloning using Poisson Image Editing. [Panel labels recovered from the figure: marked polar bear; polar bear on Mars.]

We first show the reduction for shortest path. A path from $s$ to $t$ corresponds to a flow $f : E \to \mathbb{R}$ assigned to the edges such that $\partial^T f = \chi_{s,t}$. If we have another flow $f_0$ corresponding to any path from $s$ to $t$, then this constraint can be written as:

\begin{align*}
\partial^T f - \chi_{s,t} &= 0 \tag{5.36}\\
\partial^T (f - f_0) &= 0\\
\|f - f_0\|_{\partial\partial^T}^2 &= 0 \tag{5.37}
\end{align*}
The first constraint is closer to the classical LASSO problem, while the last one is within our definition of the grouped least squares problem. The length of the path can then be written as $\sum_e l_e|f_e|$. Weighting these two terms together gives:
\[
\min_f \; \lambda\|f - f_0\|_{\partial\partial^T}^2 + \sum_e l_e|f_e| \tag{5.38}
\]
where the maximum entry is bounded by $\max\{n^d, n^2\}$. Clearly its objective is less than the length of the shortest path; let this solution be $\bar{f}$. Then, since the total objective is at most $n^{d+1}$, we have that the maximum deviation between $\partial^T\bar{f}$ and $\chi_{s,t}$ is at most $n^{d+1}/\lambda$. Then, given a spanning tree, each of these deviations can be routed to $s$ or $t$ at a cost of at most $n^{d+1}$ per unit of flow. Therefore we can obtain $f_0'$ such that $\partial^T f_0' = \chi_{s,t}$ whose objective is larger by at most $n^{2d+2}/\lambda$. Therefore setting $\lambda = n^{2d+2}$ guarantees that our objective is within 1 of the length of the shortest path, while the maximum entry in the problem is bounded by $n^{O(d)}$. $\blacksquare$

We now turn our attention to the minimum cut problem, which can be formulated as finding a vertex labeling $x^{(vert)}$ with $x_s^{(vert)} = 0$ and $x_t^{(vert)} = 1$ minimizing the size of the cut, $\sum_{uv \in E}|x_u^{(vert)} - x_v^{(vert)}|$. Since the $\ell_1$ term in the objective can incorporate single variables, we use an additional vector $x^{(edge)}$ to indicate differences along edges. The minimum cut problem then becomes minimizing $|x^{(edge)}|$ subject to the constraint that $x^{(edge)} = \partial_0 x^{(vert)'} + \partial\chi_t$, where $\partial_0$ and $x^{(vert)'}$ are restricted to vertices other than $s$ and $t$, and $\chi_t$ is the indicator vector that is 1 on $t$. The equality constraints can be handled similarly to the shortest path problem by increasing the weight on the first term. One other issue is that $x^{(vert)'}$ also appears in the objective term, and we handle this by scaling down $|x^{(vert)'}|$, or equivalently scaling up $\partial_0$.

Lemma 5.6.2 Given an s-t minimum cut instance in an undirected graph whose edge weights $w : E \to \mathbb{R}^+$ are integers between 1 and $n^d$, there is a LASSO minimization instance, with all entries bounded by $n^{O(d)}$, such that the value of the LASSO minimizer is within 1 of the minimum cut.

Proof Consider the following objective, where $\lambda_1$ and $\lambda_2$ are set to $n^{d+3}$ and $n^2$ respectively:

\[
\lambda_1\left\|\lambda_2\,\partial_0 x^{(vert)'} + \partial\chi_t - x^{(edge)}\right\|_2^2 + \left|x^{(vert)'}\right|_1 + \left|x^{(edge)}\right|_1 \tag{5.39}
\]
Let $\bar{x}$ be an optimum vertex labeling. Setting $x^{(vert)'}$ to the restriction of $n^{-2}\bar{x}$ to vertices other than $s$ and $t$, and $x^{(edge)}$ to $\partial\bar{x}$, makes the first term 0. Since each entry of $\bar{x}$ is between 0 and 1, the additive increase caused by $|x^{(vert)'}|_1$ is at most $1/n$. Therefore this objective's optimum is at most $1/n$ more than the size of the minimum cut. For the other direction, consider any solution $x^{(vert)'}, x^{(edge)}$ whose objective is at most $1/n$ more than the size of the minimum cut. Since the edge weights are at most $n^d$ and $s$ has degree at most $n$, the total objective is at most $n^{d+1} + 1$. This gives:

\begin{align*}
\lambda_1\left\|\lambda_2\,\partial_0 x^{(vert)'} + \partial\chi_t - x^{(edge)}\right\|_2^2 &\le n^{d+1} + 1\\
\left\|\lambda_2\,\partial_0 x^{(vert)'} + \partial\chi_t - x^{(edge)}\right\|_2^2 &\le n^{-1}\\
\left\|\lambda_2\,\partial_0 x^{(vert)'} + \partial\chi_t - x^{(edge)}\right\|_2 &\le n^{-1/2} \tag{5.40}
\end{align*}

Therefore changing $x^{(edge)}$ to $\lambda_2\,\partial_0 x^{(vert)'} + \partial\chi_t$ increases the objective by at most $n^{-1/2} < 1$. This gives a cut with weight within 1 of the objective value and completes the proof. $\blacksquare$

We can also show a more direct connection between the minimum cut problem and the fused LASSO objective, where each absolute value term may contain a linear combination of variables. This formulation is closer to the total variation objective, and is also an instance of the problem formulated in Definition 5.1.1 with each edge in its own group.

Lemma 5.6.3 The minimum cut problem in undirected graphs can be written as an instance of the fused LASSO objective.

Proof Given a graph $G = (V, E)$ and edge weights $p$, the problem can be formulated as finding a vertex labeling $x$ such that $x_s = 0$, $x_t = 1$, minimizing:
\[
\sum_{uv \in E} p_{uv}\,|x_u - x_v| \tag{5.41}
\]
$\blacksquare$

5.7 Other Variants

Although our formulation of $\mathrm{OBJ}$ as a sum of $\ell_2$ objectives differs syntactically from some common formulations, we show below that the more common formulation involving a quadratic, or $\ell_2^2$, fidelity term can be reduced to finding exact solutions of $\mathrm{OBJ}$ using two iterations of ternary search. Most other formulations differ from ours in the fidelity term, but more commonly have $\ell_1$ smoothness terms as well. Since the anisotropic smoothness term is a special case of the isotropic one, our discussion of the variations will assume anisotropic objectives.

5.7.1 $\ell_2^2$ fidelity term

In practice, total variation objectives more commonly involve $\ell_2^2$ fidelity terms. This term can be written as $\|x - s_0\|_2^2$, which corresponds to the norm defined by $L_0 = I$. This gives:
\[
\min_x \; \|x - s_0\|_{L_0}^2 + \sum_{1 \le i \le k}\|x - s_i\|_{L_i}
\]
We can establish the value of $\|x - s_0\|_{L_0}^2$ separately by guessing it as a constraint. Since $t^2$ is convex in $t$, the following optimization problem is convex in $t$ as well:

\[
\min_{x \,:\, \|x - s_0\|_{L_0}^2 \le t^2} \; \sum_{1 \le i \le k}\|x - s_i\|_{L_i}
\]
Also, due to the convexity of $t^2$, ternary searching on the minimizer of this plus $t^2$ allows us to find the optimum solution by solving $O(\log n)$ instances of the above problem. Taking square roots of both sides of the $\|x - s_0\|_{L_0}^2 \le t^2$ condition and taking its Lagrangian relaxation gives:

\[
\min_x \max_{\lambda \ge 0} \; \sum_{i=1}^{k}\|x - s_i\|_{L_i} + \lambda\bigl(\|x - s_0\|_{L_0} - t\bigr)
\]
which by the min-max theorem is equivalent to:

\[
\max_{\lambda \ge 0} \; -\lambda t + \min_x\left(\sum_{i=1}^{k}\|x - s_i\|_{L_i} + \lambda\|x - s_0\|_{L_0}\right)
\]
The term being minimized is identical to our formulation, and its objective is convex in $\lambda$ when $\lambda \ge 0$. Since $-\lambda t$ is linear in $\lambda$, their sum is convex, and another ternary search on $\lambda$ suffices to optimize the overall objective.
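The two nested one-dimensional searches described above can be carried out with a standard ternary search. The following Python sketch (the function name and iteration count are illustrative assumptions) shows the primitive for minimizing a convex function over an interval; maximizing a concave function, as in the search over $\lambda$, is obtained by negating it. In this setting, each evaluation of g corresponds to solving one grouped least squares instance.

def ternary_search(g, lo, hi, iters=100):
    # Minimize a convex (unimodal) function g over [lo, hi]
    # by repeatedly discarding a third of the interval.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0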

5.7.2 $\ell_1$ fidelity term

Another common variant is one where the fidelity term is also under the $\ell_1$ norm. In this case the objective function becomes:

\[
\|x - s_0\|_1 + \sum_{1 \le i \le k}\|x - s_i\|_{L_i}
\]
This can be rewritten as a special case of $\mathrm{OBJ}$ as:
\[
\sum_u \sqrt{(x_u - s_{0,u})^2} + \sum_{1 \le i \le k}\|x - s_i\|_{L_i}
\]
which gives, instead, a grouped least squares problem with $n + k$ groups.

5.8 Remarks

We believe that the ability of our algorithm to encompass many of the current image processing algorithms represents a major advantage in practice. It allows the use of a common data structure (the underlying graph) and subroutine (linear system solvers) for many different tasks in the image processing pipeline. Theoretically, the grouped least squares problem is also interesting as it represents an intermediate problem between linear and quadratic optimization. The performance of our algorithms given in Section 5.4 depends on $k$, which is the number of groups in the formulation given in Definition 5.1.1. Two settings of $k$ are useful for understanding the relation of the grouped least squares problem to other problems. When $k = 1$, the problem becomes the electrical flow problem, and the running time of our algorithm is similar to directly solving the linear system. This is also the case when there is a small (constant) number of groups. The other extreme is when each edge belongs to its own group, i.e. $k = m$. Here our approximate algorithm is the same as the minimum s-t cut algorithm given in [CKM+11]. This means that although the problem with a smaller number of groups is no longer captured by linear optimization, the minimum s-t cut problem, which still falls within the framework of linear optimization, is in some sense the hardest problem in this class. Therefore the grouped least squares problem is a natural interpolation between $\ell_1$ and $\ell_2^2$ optimization. Theoretical consequences of this interpolation for undirected multicommodity flow [KMP12] and flows on separable graphs [MP13] have been explored. However, these algorithms have been superseded by subsequent works using different approaches [KLOS13, She13]. The preliminary experimental results from Section 5.5 show that more aggressive reweightings of edges lead to much faster convergence than what we showed for our two algorithms. Although the running times from these experiments are slower than state of the art methods, we believe the results suggest that more thorough experimental studies with better tuned algorithms are needed. Also, the Mumford-Shah functional can be better approximated by non-convex functions [MS89]. Objectives such as hinge loss often lead to better results in practice [RDVC+04], but few algorithmic guarantees are known for them. Designing algorithms with strong guarantees for minimizing these objectives is an interesting direction for future work.

Chapter 6

Conclusion and Open Problems

We showed highly efficient algorithms for solving symmetric diagonally dominant linear systems. The ever increasing number of theoretical applications using these solvers as subroutines, as well as extensive practical work on solving such systems, make the practicality of our approaches a pressing question. Existing combinatorial preconditioning packages such as the combinatorial multigrid solver [KMST09] have much in common with the algorithm from Chapter 2. This package performs well on many real world graphs such as the ones derived from images in Chapter 5. Its performance is comparable to other methods such as LAPACK [ABB+99], algebraic multigrid packages, and recent solvers specialized for images by Krishnan et al. [KFS13]. However, we believe several theoretical questions need to be addressed in order to bring the algorithms from Chapters 2 and 3 closer to practice.

6.1 Low Stretch Spanning Trees

The main difficulty in implementing the algorithm from Chapter 2 is that of generating a good low stretch subgraph or spanning tree. Although the algorithm for constructing these trees was greatly simplified by Abraham and Neiman [AN12], their algorithm, as well as previous ones, relies on a partition routine by Seymour [Sey95]. A close examination of these routines, as well as of the overall algorithm, shows several parameters that need to be set carefully. The running time of the algorithm, at $\Omega(n\log n\log\log n)$, is also the overall runtime bottleneck when the graph is sparse and $\epsilon$ is relatively large (e.g. constant). Their parallelization is also difficult, due to the shortest path computations performed across the entire graph. As a result, routines with sub-optimal parameters (compared to sequential algorithms), such as the low stretch subgraph algorithm in Section 4.3, are needed for the parallel algorithm.

The observation that higher stretch edges can be treated differently, made in Lemma 2.5.1 and Section 4.4, represents a way to replace this routine with simpler low diameter decomposition routines such as the one shown in Section 4.2. The highest level structure in recent low stretch spanning tree algorithms is known as a star decomposition. One of the main intricacies in this decomposition is the need to bound the overall diameter growth incurred by subsequent recursive calls. This is in turn due to the need to bound the stretch of the edges cut, especially ones of much smaller weight. The use of $\ell_{1/2+\alpha}$ moments means

the contribution of such edges to the total stretch is much smaller. As a result, closer examinations of the scheme by Abraham and Neiman [AN12] in conjunction with such weights are likely to lead to faster, simpler, and more practical algorithms for generating low stretch spanning trees.
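As a reminder of the quantities involved (stated here in one common length-based convention, which may differ in normalization from the definitions used in the earlier chapters), the stretch of an edge with respect to a spanning tree and the moment discussed above can be written as:
\[
\mathrm{str}_T(e) = \frac{\sum_{f \in P_T(e)} \ell(f)}{\ell(e)},
\qquad
\sum_{e \in E} \mathrm{str}_T(e)^{1/2+\alpha},
\]
where $P_T(e)$ is the tree path connecting the endpoints of $e$ and $\ell$ denotes edge lengths. Raising the stretch to the power $1/2+\alpha < 1$ is what damps the contribution of the few edges whose stretch is very large.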

6.2 Interaction between Iterative Methods and Sparsification

The routine central to the parallel algorithm from Chapter 3 repeatedly squares a matrix and approximates the result. The final error is in turn given by the product of these errors of approximation. As a result, for $k$ levels of this scheme, an error of $O(1/k)$ is needed per step to ensure a constant overall error. This results in an increase by a factor of $O(k^2)$ in the sizes of the approximations. In our algorithm, $k$ was set to $O(\log(nU))$, where $U$ is the ratio between the maximum and minimum edge weights. If this dependency can be replaced by a constant, the work of this algorithm will reduce from $O(n\log n\log^3(nU))$ to $O(n\log n\log(nU))$, which is much closer to being feasible in practice.

From the viewpoint of squaring a scalar and approximating the result at each step, the accumulation of error is almost unavoidable. However, the final purpose of this accumulation is to approximate the inverse. The algorithm can be viewed as trying to do two things at once: simulate an iterative method given by an infinite power series, while maintaining the sparsity of the matrices via spectral sparsification. The former reduces error but increases problem size, while the latter does the opposite. The relatively direct manner in which we trade between these two is likely the source of the accumulation of errors.

This leads to the question of whether a better middle ground is possible. Specifically, whether it is possible to analyze iterative methods in conjunction with spectral sparsification. Our rudimentary iterative method behaves very differently on eigenvalues based on their magnitude. In numerical analysis, a useful technique is to analyze the eigenvalues instead of proving sweeping bounds on the entire possible range (see e.g. [TB97]). These approaches are also related to the construction of deterministic spectral sparsifiers with optimal parameters by Batson et al. [BSS09] and Kolla et al. [KMST10]. Combining these approaches may lead to iterative methods where errors from repeated sparsification accumulate in a more controllable manner.
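A back-of-the-envelope calculation, consistent with the accounting above, shows why this direct composition forces a per-level error of $O(1/k)$: if level $i$ incurs a relative spectral error of $\epsilon_i$, then
\[
\prod_{i=1}^{k}(1+\epsilon_i) \le \exp\Bigl(\sum_{i=1}^{k}\epsilon_i\Bigr),
\qquad
\epsilon_i = \frac{c}{k} \;\Longrightarrow\; \prod_{i=1}^{k}(1+\epsilon_i) \le e^{c},
\]
so keeping the overall error constant requires $\epsilon_i = O(1/k)$; and since the sparsifiers used have sizes scaling as $O(n\epsilon_i^{-2})$, this choice translates into the $O(k^2)$ size increase mentioned above.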

Bibliography

[ABB+99] E. Anderson, Z. Bai, C. Bischof, L. S. Blackford, J. Demmel, Jack J. Dongarra, J. Du Croz, S. Hammarling, A. Greenbaum, A. McKenney, and D. Sorensen, Lapack users’ guide (third ed.), Society for Industrial and , Philadel- phia, PA, USA, 1999. 6

[ABCP92] Baruch Awerbuch, Bonnie Berger, Lenore Cowen, and David Peleg, Low-diameter graph decomposition is in NC, Proceedings of the Third Scandinavian Workshop on Algorithm Theory (London, UK, UK), SWAT ’92, Springer-Verlag, 1992, pp. 83–93. 4.2

[ABN08] Ittai Abraham, Yair Bartal, and Ofer Neiman, Nearly tight low stretch spanning trees, Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Com- puter Science (Washington, DC, USA), FOCS ’08, IEEE Computer Society, 2008, pp. 781–790. 2.2, 2.2, 4.1, 4.4, 4.4, 4.4

[ACL06] Reid Andersen, Fan Chung, and Kevin Lang, Local graph partitioning using pager- ank vectors, Proceedings of the 47th Annual IEEE Symposium on Foundations of (Washington, DC, USA), FOCS ’06, IEEE Computer Society, 2006, pp. 475–486. 3.3

[AHK05] Sanjeev Arora, Elad Hazan, and Satyen Kale, The multiplicative weights update method: a meta algorithm and applications, Tech. report, Princeton University, 2005. 5.4

[AKPW95] N. Alon, R. Karp, D. Peleg, and D. West, A graph-theoretic game and its application to the k-server problem, SIAM J. Comput. 24 (1995), no. 1, 78–100. 1.2, 2.2, 4.1, 4.3, 4.3.1

[ALGP89] B. Awerbuch, M. Luby, A. V. Goldberg, and S. A. Plotkin, Network decomposi- tion and locality in distributed computation, Proceedings of the 30th Annual Sym- posium on Foundations of Computer Science (Washington, DC, USA), SFCS ’89, IEEE Computer Society, 1989, pp. 364–369. 4

[AM85] Noga Alon and V. D. Milman, $\lambda_1$, isoperimetric inequalities for graphs, and superconcentrators, J. Comb. Theory, Ser. B 38 (1985), no. 1, 73–88. 1.4

[AN12] Ittai Abraham and Ofer Neiman, Using petal decompositions to build a low stretch spanning tree, Proceedings of the 44th symposium on Theory of Computing (New York, NY, USA), STOC ’12, ACM, 2012, pp. 395–406. 2.2, 2.2, 4.1, 2, 4.4, 4.4, 4.4, 4.4, 6.1

[AP09] Reid Andersen and Yuval Peres, Finding sparse cuts locally using evolving sets, Proceedings of the 41st annual ACM symposium on Theory of computing (New York, NY, USA), STOC ’09, ACM, 2009, pp. 235–244. 3.3

[AST09] Haim Avron, Gil Shklarski, and Sivan Toledo, On element SDD approximability, CoRR abs/0911.0547 (2009). 5.1.1

[AW02] Rudolf Ahlswede and Andreas Winter, Strong converse for identification via quan- tum channels, IEEE Transactions on 48 (2002), no. 3, 569–579. 1.6.4, B

[Awe85] Baruch Awerbuch, Complexity of network synchronization, J. ACM 32 (1985), no. 4, 804–823. 4

[Axe94] Owe Axelsson, Iterative solution methods, Cambridge University Press, New York, NY, 1994. 1.2, 1.5.1, 1.6.2, 2.1, 3.3

[BAP12] Scott Beamer, Krste Asanovic,´ and David Patterson, Direction-optimizing breadth- first search, Proceedings of the International Conference on High Performance Com- puting, Networking, Storage and Analysis (Los Alamitos, CA, USA), SC ’12, IEEE Computer Society Press, 2012, pp. 12:1–12:10. 4.2.3

[Bar96] Y. Bartal, Probabilistic approximation of metric spaces and its algorithmic applica- tions, Foundations of Computer Science, 1996. Proceedings., 37th Annual Sympo- sium on, 1996, pp. 184–193. 2.2, 4.1, 4.2, 4.4

[Bar98] Yair Bartal, On approximating arbitrary metrices by tree metrics, Proceedings of the thirtieth annual ACM symposium on Theory of computing (New York, NY, USA), STOC ’98, ACM, 1998, pp. 161–168. 2.2, 4.1

[BCG11] Stephen R. Becker, Emmanuel J. Candes, and Michael Grant, Templates for con- vex cone problems with applications to sparse signal recovery, MATHEMATICAL PROGRAMMING COMPUTATION 3 (2011), 165. 5, 5.3.4

[BGK+13] GuyE. Blelloch, Anupam Gupta, Ioannis Koutis, GaryL. Miller, Richard Peng, and Kanat Tangwongsan, Nearly-linear work parallel sdd solvers, low-diameter decom- position, and low-stretch subgraphs, Theory of Computing Systems (2013), 1–34 (English). 1.2, 1.3.4, 1.4, 3, 3.4, 4, 4.1, 4.1, 1, 4.1, 4.2, 4.2.1, 4.3.1, 5.1.1

[BGT12] Guy E. Blelloch, Anupam Gupta, and Kanat Tangwongsan, Parallel probabilistic tree embeddings, k-median, and buy-at-bulk network design, Proceedinbgs of the

122 24th ACM symposium on Parallelism in algorithms and architectures (New York, NY, USA), SPAA ’12, ACM, 2012, pp. 205–213. 4.2

[BH01] Eric Boman and Bruce Hendrickson, On spanning tree preconditioners, Manuscript, Sandia National Lab., 2001. 1.2, 1.5.1, 4, 4

[BHM00] William L. Briggs, Van Emden Henson, and Steve F. McCormick, A multigrid tuto- rial: second edition, Society for Industrial and Applied Mathematics, 2000. 1.5.2

[BHV08] Erik G. Boman, Bruce Hendrickson, and Stephen A. Vavasis, Solving elliptic finite element systems in near-linear time with support preconditioners, SIAM J. Numeri- cal Analysis 46 (2008), no. 6, 3264–3284. 1.4, 5.1.1

[BK96] András A. Benczúr and David R. Karger, Approximating s-t minimum cuts in O˜(n2) time, Proceedings of the twenty-eighth annual ACM symposium on Theory of com- puting (New York, NY, USA), STOC ’96, ACM, 1996, pp. 47–55. 1.6.4

[Ble96] Guy E. Blelloch, Programming parallel algorithms, Commun. ACM 39 (1996), no. 3, 85–97. 3

[BSS09] Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava, Twice-Ramanujan sparsifiers, Proceedings of the 41st Annual ACM Symposium on Theory of Com- puting, 2009, pp. 255–262. 6.2

[BSST13] Joshua Batson, Daniel A. Spielman, Nikhil Srivastava, and Shang-Hua Teng, Spec- tral sparsification of graphs: theory and algorithms, Commun. ACM 56 (2013), no. 8, 87–94. 1.6.4

[BV04] S. Boyd and L. Vandenberghe, Convex optimization, Camebridge University Press, 2004. 5, 5.3.1

[Can06] Emmanuel J. Candès, Compressive sampling, Proceedings of the International Congress of Mathematicians (2006). 5, 5

[Che70] J. Cheeger, A lower bound for the smallest eigenvalue of the laplacian, Problems in Analysis (1970), 195–199. 1.4

[Chu97] F.R.K. Chung, Spectral graph theory, Regional Conference Series in Mathematics, vol. 92, American Mathematical Society, 1997. 1.4

[CKM+11] Paul Christiano, Jonathan A. Kelner, Aleksander Mądry, Daniel Spielman, and Shang-Hua Teng, Electrical Flows, Laplacian Systems, and Faster Approximation of Maximum Flow in Undirected Graphs, Proceedings of the 43rd ACM Symposium on Theory of Computing (STOC), 2011. 1.3.5, 1.4, 5, 5.1.1, 5.3.2, 5.4, 5.4, 5.4, 5.4, 5.4, 5.8

[CMMP13] Hui Han Chin, Aleksander Mądry, Gary L. Miller, and Richard Peng, Runtime guarantees for regression problems, Proceedings of the 4th conference on Innovations in Theoretical Computer Science (New York, NY, USA), ITCS ’13, ACM, 2013, pp. 269–282. 1.3.5

[Coh98] Edith Cohen, Fast algorithms for constructing t-spanners and paths with stretch t, SIAM J. Comput. 28 (1998), no. 1, 210–236. 4.2

[Coh00] , Polylog-time and near-linear work approximation scheme for undirected shortest paths, J. ACM 47 (2000), no. 1, 132–166. 4.2, 4.2.1, 5

[CS05] Tony Chan and Jianhong Shen, Image processing and analysis: Variational, PDE, wavelet, and stochastic methods, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2005. 5.2.1

[CSRL01] Thomas H. Cormen, Clifford Stein, Ronald L. Rivest, and Charles E. Leiserson, Introduction to algorithms, 2nd ed., McGraw-Hill Higher Education, 2001. 1.5, 1.5.2, 1.6.2

[CW90] D. Coppersmith and S. Winograd, Matrix multiplication via arithmetical progres- sions, J. Symbolic Computation 9 (1990), 251–280. 1.2, 5.1.1

[DS06] Jérôme Darbon and Marc Sigelle, Image restoration with discrete constrained total variation part i: Fast and exact optimization, Journal of Mathematical Imaging and Vision 26 (2006), no. 3, 261–276. 5.3.2

[DS07] Samuel I. Daitch and Daniel A. Spielman, Support-graph preconditioners for 2- dimensional trusses, CoRR abs/cs/0703119 (2007). 1.1, 5.1.1

[DS08] Samuel I. Daitch and Daniel A. Spielman, Faster approximate lossy generalized flow via interior point algorithms, Proceedings of the 40th annual ACM symposium on Theory of computing (New York, NY, USA), STOC ’08, ACM, 2008, pp. 451–460. 1.4, 5, 5.6

[EEST08] Michael Elkin, Yuval Emek, Daniel A Spielman, and Shang-Hua Teng, Lower- stretch spanning trees, SIAM Journal on Computing 38 (2008), no. 2, 608–628. 2.2, 2.2, 4.1, 4.4, 4.4

[Fel71] William Feller, An introduction to and its applications. Vol. II., Second edition, John Wiley & Sons Inc., New York, 1971. 4.2.2

[Fie73] M. Fiedler, Algebraic connectivity of graphs, Czechoslovak Mathematical Journal 23 (1973), no. 98, 298–305. 1.4

[Fos53] F. G. Foster, On the stochastic matrices associated with certain queueing processes, Ann. Math. Statistics 24 (1953), 355–360. 1.6.4

[FRT04] Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar, A tight bound on approximating arbitrary metrics by tree metrics, J. Comput. Syst. Sci. 69 (2004), no. 3, 485–497. 2.2, 4.1, 4.2

[FS99] Yoav Freund and Robert E. Schapire, Adaptive game playing using multiplicative weights, Games and Economic Behavior 29 (1999), no. 1-2, 79–103. 5.4

[GMZ94] K.D. Gremban, G.L. Miller, and M. Zagha, Performance evaluation of a parallel preconditioner, CS CMU-CS-94-205, CMU, October 1994. 1.2

[GO09] Tom Goldstein and Stanley Osher, The split bregman method for l1-regularized prob- lems, SIAM J. Img. Sci. 2 (2009), 323–343. (document), 5.5.1, 5.1

[Gol65] Sidney Golden, Lower bounds for the helmholtz function, Phys. Rev. 137 (1965), B1127–B1128. 1

[GR98] Andrew V. Goldberg and Satish Rao, Beyond the flow decomposition barrier, J. ACM 45 (1998), 783–797. 5.3.2

[Gre96] Keith Gremban, Combinatorial preconditioners for sparse, symmetric, diagonally dominant linear systems, Ph.D. thesis, Carnegie Mellon University, Pittsburgh, Oc- tober 1996, CMU CS Tech Report CMU-CS-96-123. 1.1

[GT13] A. Gupta and K. Talwar, Random Rates for 0-Extension and Low-Diameter Decom- positions, CoRR abs/1307.5582 (2013). 4.2

[GY04] Donald Goldfarb and Wotao Yin, Second-order cone programming methods for total variation-based image restoration, SIAM J. Sci. Comput 27 (2004), 622–645. 5.3.1

[Har11] Nicholas Harvey, C&O 750: Randomized algorithms, winter 2011, lecture 11 notes, http://www.math.uwaterloo.ca/~harvey/W11/Lecture11Notes.pdf, 2011. 1.6.4, B.1

[Hig02] Nicholas J. Higham, Accuracy and stability of numerical algorithms, 2nd ed., Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2002. 2.6, 2.6, 2.6, C, C.2

[HS52] Magnus R. Hestenes and Eduard Stiefel, Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards 49 (1952), no. 6, 409–436. 1.2

[HVBJ11] Toby Hocking, Jean-Philippe Vert, Francis Bach, and Armand Joulin, Clusterpath: an algorithm for clustering using convex fusion penalties, Proceedings of the 28th International Conference on Machine Learning (ICML-11) (New York, NY, USA) (Lise Getoor and Tobias Scheffer, eds.), ICML ’11, ACM, June 2011, pp. 745–752. 5.2.4

[JáJ92] Joseph JáJá, An introduction to parallel algorithms, Addison-Wesley, 1992. 1.2, 4.3.1

[JM93] David S. Johnson and Catherine C. McGeoch (eds.), Network flows and matching: First dimacs implementation challenge, American Mathematical Society, Boston, MA, USA, 1993. 5

[Jos97] Anil Joshi, Topics in optimization and sparse linear systems, Ph.D. thesis, University of Illinois at Urbana Champaing, 1997. 1.2

[KFS13] Dilip Krishnan, Raanan Fattal, and Richard Szeliski, Efficient preconditioning of laplacian matrices for computer graphics, ACM Trans. Graph. 32 (2013), no. 4, 142. 6

[KL51] S. Kullback and R. A. Leibler, On information and sufficiency, Ann. Math. Statist. 22 (1951), no. 1, 79–86. 5.4

[KL11] Jonathan A. Kelner and Alex Levin, Spectral Sparsification in the Semi-Streaming Setting, 28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011) (Dagstuhl, Germany) (Thomas Schwentick and Christoph Dürr, eds.), Leibniz International Proceedings in (LIPIcs), vol. 9, Schloss Dagstuhl– Leibniz-Zentrum fuer Informatik, 2011, pp. 440–451. 1.6.4, 3.4

[KLOS13] Jonathan A. Kelner, Yin Tat Lee, Lorenzo Orecchia, and Aaron Sidford, An almost- linear-time algorithm for approximate max flow in undirected graphs, and its multi- commodity generalizations, CoRR abs/1304.2338 (2013). 5.8

[KLP12] Ioannis Koutis, Alex Levin, and Richard Peng, Improved Spectral Sparsification and Numerical Algorithms for SDD Matrices, 29th International Symposium on Theoret- ical Aspects of Computer Science (STACS 2012) (Dagstuhl, Germany) (Christoph Dürr and Thomas Wilke, eds.), Leibniz International Proceedings in Informat- ics (LIPIcs), vol. 14, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2012, pp. 266–277. 1.4, 3.1, 3.4

[KM07] Ioannis Koutis and Gary L. Miller, A linear work, O(n1/6) time, parallel algorithm for solving planar Laplacians, Proc. 18th ACM-SIAM Symposium on Discrete Al- gorithms (SODA), 2007. 1.2, 3, 5.2.1

[KM09a] Jonathan A. Kelner and Aleksander Mądry, Faster generation of random spanning trees, Proceedings of the 2009 50th Annual IEEE Symposium on Foundations of Computer Science (Washington, DC, USA), FOCS ’09, IEEE Computer Society, 2009, pp. 13–21. 1.4

[KM09b] Ioannis Koutis and Gary Miller, The combinatorial multigrid solver, Conference Talk, March 2009. 5.5

[KMP10] Ioannis Koutis, Gary L. Miller, and Richard Peng, Approaching optimality for solving SDD linear systems, Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science (Washington, DC, USA), FOCS ’10, IEEE Computer Society, 2010, pp. 235–244. 1.3.1, 1.6.4, 2, 2.3, B

[KMP11] , A nearly-m log n time solver for sdd linear systems, Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (Washington, DC, USA), FOCS ’11, IEEE Computer Society, 2011, pp. 590–598. 1.3.1, 2, 2.2, 2.4, 4, 4.1, 5.1.1

[KMP12] Jonathan A. Kelner, Gary L. Miller, and Richard Peng, Faster approximate multicom- modity flow using quadratically coupled flows, Proceedings of the 44th symposium on Theory of Computing (New York, NY, USA), STOC ’12, ACM, 2012, pp. 1–18. 5.1.1, 5.2.2, 5.8

[KMST09] Ioannis Koutis, Gary L. Miller, Ali Sinop, and David Tolliver, Combinatorial pre- conditioners and multilevel solvers for problems in computer vision and image pro- cessing, Tech. report, CMU, 2009. 1.4, 6

[KMST10] Alexandra Kolla, Yury Makarychev, Amin Saberi, and Shang-Hua Teng, Subgraph sparsification and nearly optimal ultrasparsifiers, Proceedings of the 42nd ACM symposium on Theory of computing (New York, NY, USA), STOC ’10, ACM, 2010, pp. 57–66. 2, 6.2

[KMT09] Ioannis Koutis, Gary L. Miller, and David Tolliver, Combinatorial preconditioners and multilevel solvers for problems in computer vision and image processing, Pro- ceedings of the 5th International Symposium on Advances in Visual Computing: Part I (Berlin, Heidelberg), ISVC ’09, Springer-Verlag, 2009, pp. 1067–1078. 5.5

[KOSZ13] Jonathan A. Kelner, Lorenzo Orecchia, Aaron Sidford, and Zeyuan Allen Zhu, A simple, combinatorial algorithm for solving sdd systems in nearly-linear time, Pro- ceedings of the 45th annual ACM symposium on Symposium on theory of computing (New York, NY, USA), STOC ’13, ACM, 2013, pp. 911–920. 1.1, 1.2, 1.3.2, 3, 4

[KP12] Michael Kapralov and Rina Panigrahy, Spectral sparsification via random spanners, Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (New York, NY, USA), ITCS ’12, ACM, 2012, pp. 393–398. 3.3

[KS97] Philip N. Klein and Sairam Subramanian, A randomized parallel algorithm for single-source shortest paths, Journal of Algorithms 25 (1997), 205–220. 4.2.3

[KZ04] Vladimir Kolmogorov and Ramin Zabih, What energy functions can be minimized via graph cuts, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (2004), 65–81. 5.3.2

[Lan52] Cornelius Lanczos, Solution of systems of linear equations by minimized iterations, Journal of Research of the National Bureau of Standards 49 (1952), 33–53. 1.2

[Lei92] F. Thomson Leighton, Introduction to parallel algorithms and architectures: Ar- ray, trees, hypercubes, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1992. 4.3.1

[LNK07] David Liben-Nowell and Jon M. Kleinberg, The link-prediction problem for social networks, JASIST 58 (2007), no. 7, 1019–1031. 1.1

[LR99] Tom Leighton and Satish Rao, Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms, J. ACM 46 (1999), no. 6, 787–832. 4.2

[LRS13] Yin Tat Lee, Satish Rao, and Nikhil Srivastava, A new approach to computing max- imum flows using electrical flows, Proceedings of the 45th annual ACM symposium on Symposium on theory of computing (New York, NY, USA), STOC ’13, ACM, 2013, pp. 755–764. 1.4

[LRT79] R. J. Lipton, D. J. Rose, and R. E. Tarjan, Generalized nested dissection, SIAM J. on Numerical Analysis 16 (1979), 346–358. 3

[LS93] Nathan Linial and Michael Saks, Low diameter graph decompositions, Combinator- ica 13 (1993), no. 4, 441–454. 4.2, 1

[LS10] Charles E. Leiserson and Tao B. Schardl, A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers), Proceed- ings of the 22nd ACM symposium on Parallelism in algorithms and architectures (New York, NY, USA), SPAA ’10, ACM, 2010, pp. 303–314. 4.2.3

[LS13] Yin Tat Lee and Aaron Sidford, Efficient accelerated coordinate descent methods and faster algorithms for solving linear systems, CoRR abs/1305.1922 (2013). 1.2, 1.3.2, 3

[LW94] Nick Littlestone and Manfred K. Warmuth, The weighted majority algorithm, Inf. Comput. 108 (1994), no. 2, 212–261. 5.4

[Mąd13] Aleksander Mądry, Navigating central path with electrical flows: from flows to matchings, and back, CoRR abs/1307.2205 (2013). 1.4

[Mah11] Michael W. Mahoney, Randomized algorithms for matrices and data., Foundations and Trends in Machine Learning 3 (2011), no. 2, 123–224. 1.6.4

[MLZ+09] Patrycja Vasilyev Missiuro, Kesheng Liu, Lihua Zou, Brian C. Ross, Guoyan Zhao, Jun S. Liu, and Hui Ge, Information flow analysis of interactome networks, PLoS Comput Biol 5 (2009), no. 4, e1000350. 1.1

[MP08] James McCann and Nancy S. Pollard, Real-time gradient-domain painting, ACM Trans. Graph. 27 (2008), no. 3, 1–7. 1.4

[MP13] Gary L. Miller and Richard Peng, Approximate maximum flow on separable undi- rected graphs, Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, 2013, pp. 1151–1170. 5.8

[MPX13] Gary L. Miller, Richard Peng, and Shen Chen Xu, Parallel graph decompositions using random shifts, Proceedings of the 25th ACM symposium on Parallelism in al- gorithms and architectures (New York, NY, USA), SPAA ’13, ACM, 2013, pp. 196– 203. 1.3.4

[MR89] Gary L. Miller and John H. Reif, Parallel tree contraction part 1: Fundamentals, Randomness and Computation (Silvio Micali, ed.), JAI Press, Greenwich, Connecticut, 1989, Vol. 5, pp. 47–72. 4.3

[MS89] D. Mumford and J. Shah, Optimal approximations by piecewise smooth functions and associated variational problems, Communications on Pure and Applied Mathematics 42 (1989), 577–685. 5.1, 5.8

[MSX11] Charles A Micchelli, Lixin Shen, and Yuesheng Xu, Proximity algorithms for image models: denoising, Inverse Problems 27 (2011), no. 4, 045009. (document), 5.5.1, 5.1

[MVV87] K. Mulmuley, U. V. Vazirani, and V. V. Vazirani, Matching is as easy as matrix inversion, Combinatorica 7 (1987), no. 1, 105–113. 5.6

[Nes07] Yu. Nesterov, Gradient methods for minimizing composite objective function, CORE Discussion Papers 2007076, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE), 2007. 5, 5.3.4

[OSV12] Lorenzo Orecchia, Sushant Sachdeva, and Nisheeth K. Vishnoi, Approximating the exponential, the Lanczos method and an O˜(m)-time spectral algorithm for balanced separator, Proceedings of the 44th symposium on Theory of Computing (New York, NY, USA), STOC ’12, ACM, 2012, pp. 1141–1160. 1.4

[PGB03] P. Pérez, M. Gangnet, and A. Blake, Poisson image editing, ACM Transactions on Graphics (SIGGRAPH’03) 22 (2003), no. 3, 313–318. 5.2.3, 5.5.2

[RDVC+04] Lorenzo Rosasco, Ernesto De Vito, Andrea Caponnetto, Michele Piana, and Alessan- dro Verri, Are loss functions all the same?, Neural Computation 16 (2004), no. 5, 1063–1076. 5.8

[ROF92] L. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithm, Physica D 1 (1992), no. 60, 259–268. 5, 5, 5.5.1

[RV07] Mark Rudelson and Roman Vershynin, Sampling from large matrices: An approach through geometric functional analysis, J. ACM 54 (2007), no. 4, 21. 1.6.4, B
[Saa03] Y. Saad, Iterative methods for sparse linear systems, 2nd ed., Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2003. 2.1

[SB13] Julian Shun and Guy E. Blelloch, Ligra: a lightweight graph processing framework for shared memory, Proceedings of the 18th ACM SIGPLAN symposium on Princi- ples and practice of parallel programming (New York, NY, USA), PPoPP ’13, ACM, 2013, pp. 135–146. 4.2.3

[Sey95] P. D. Seymour, Packing directed circuits fractionally, Combinatorica 15 (1995), no. 2, 281–288. 4, 4.4, 6.1

[She09] Jonah Sherman, Breaking the multicommodity flow barrier for O(plogn)- approximations to sparsest cut, Proceedings of the 2009 50th Annual IEEE Sym- posium on Foundations of Computer Science (Washington, DC, USA), FOCS ’09, IEEE Computer Society, 2009, pp. 363–372. 4.2

[She13] Jonah Sherman, Nearly maximum flows in nearly linear time, CoRR abs/1304.2077 (2013). 5.8

[SM97] Jianbo Shi and Jitendra Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (1997), 888–905. 1.4 [SS08] Daniel A. Spielman and Nikhil Srivastava, Graph sparsification by effective resis- tances, Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC), 2008, pp. 563–568. 1.4, 1.6.4, 1.6.4, 1.6.4, 3.4, B

[ST96] D. A. Spielman and Shang-Hua Teng, Spectral partitioning works: planar graphs and finite element meshes, Proc. of the 37th Annual Symposium on the Foundations of Computer Science (FOCS), 1996, pp. 96–105. 1.4

[ST03] Daniel A. Spielman and Shang-Hua Teng, Solving sparse, symmetric, diagonally- dominant linear systems in time 0(m1.31), Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (Washington, DC, USA), FOCS ’03, IEEE Computer Society, 2003, pp. 416–. 1.2

[ST06] Daniel A. Spielman and Shang-Hua Teng, Nearly-linear time algorithms for pre- conditioning and solving symmetric, diagonally dominant linear systems, CoRR abs/cs/0607105 (2006). 1.1, 1.2, 1.3.1, 1.3.2, 1.4, 1.5.1, 1.6.2, 2, 2.1, 2.1, 2.2, 3, 4, 4.1, 4.4, 5, 5.1.1

[ST08] , Spectral sparsification of graphs, CoRR abs/0808.4134 (2008). 1.2, 3.3 [Sto10] A.J. Stothers, On the complexity of matrix multiplication, Ph.D. thesis, University of Edinburgh, 2010. 1.2

[Str69] Volker Strassen, Gaussian elimination is not optimal, Numerische Mathematik 13 (1969), 354–356. 1.2, 5.1.1

[SW09] Daniel A. Spielman and Jaeoh Woo, A note on preconditioning by low-stretch span- ning trees, CoRR abs/0903.2816 (2009). B.1 [Tan11] Kanat Tangwongsan, Efficient parallel approximation algorithms, Ph.D. thesis, Carnegie Mellon University, 5000 Forbes Ave. Pittsburgh, PA 15213, August 2011, CMU CS Tech Report CMU-CS-11-128. 4.3.1

[Tar79] Robert Endre Tarjan, Applications of path compression on balanced trees, J. ACM 26 (1979), no. 4, 690–715. 2.2 [TB97] L.N. Trefethen and D.I. Bau, , Society for Industrial and Applied Mathematics, 1997. 6.2

[Ten10] Shang-Hua Teng, The Laplacian Paradigm: Emerging Algorithms for Massive Graphs, Theory and Applications of Models of Computation, 2010, pp. 2–14. 1

[Tho65] C. J. Thompson, Inequality with applications in , Journal of Mathematical Physics 6 (1965), 1812–1813. 1 [Tib96] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society (Series B) 58 (1996), 267–288. 5 [Tri02] Kishor S. Trivedi, Probability and statistics with reliability, queuing and computer science applications, 2nd edition ed., John Wiley and Sons Ltd., Chichester, UK, 2002. 4.2.1

[Tro12] Joel A. Tropp, User-friendly tail bounds for sums of random matrices, Found. Com- put. Math. 12 (2012), no. 4, 389–434. 1.6.4 [TSR+05] Robert Tibshirani, Michael Saunders, Saharon Rosset, Ji Zhu, and Keith Knight, Sparsity and smoothness via the fused lasso, Journal of the Royal Statistical Society Series B 67 (2005), no. 1, 91–108. 5.1 [Tut63] W. T. Tutte, How to Draw a Graph, Proceedings of the London Mathematical Society s3-13 (1963), no. 1, 743–767. 1.4 [Vai91] Pravin M. Vaidya, Solving linear equations with symmetric diagonally dominant ma- trices by constructing good preconditioners, A talk based on this manuscript was presented at the IMA Workshop on Graph Theory and Sparse Matrix Computation, October 1991. 1.2, 1.5.1

[Ver09] R. Vershynin, A note on sums of independent random matrices after ahlswede- winter, http://www-personal.umich.edu/~romanv/teaching/reading-group/ahlswede- winter.pdf, 2009. 1.6.4, B.1

131 [Wil12] Virginia Vassilevska Williams, Multiplying matrices faster than coppersmith- winograd, Proceedings of the 44th symposium on Theory of Computing (New York, NY, USA), STOC ’12, ACM, 2012, pp. 887–898. 1.2, 1.5.2, 2.2, 2.2, 5.1.1

[WR07] B. Wohlberg and P. Rodriguez, An iteratively reweighted norm algorithm for mini- mization of total variation functionals, Signal Processing Letters, IEEE 14 (2007), no. 12, 948 –951. 5.3.3

[YL06] Ming Yuan and Yi Lin, and estimation in regression with grouped variables, Journal of the Royal Statistical Society, Series B 68 (2006), 49–67. 5.1

132 Appendix A

Deferred Proofs

We now formally prove some of the results stated in Section 1.6 about matrix norms, solver oper- ators, and spectral approximation.

Lemma 1.6.1 If p(x) is a polynomial, then:

n T p(A)= p(i)uiui i=1 X Proof Since both sides are linear in the monomials, it suffices to prove the claim for monomials. We show by induction that for any t,

n t t T A = iuiui i=1 X The base case of t =0follows from Ui being an orthonormal basis. For the inductive case, suppose the result is true for t, then writing At+1 = AtA and substituting the inductive hypothesis gives:

n t+1 t T A = iuiui A i=1 ! Xn n t T T = iuiui iuiui A i=1 ! i=1 ! nXn X t T T = ijuiui ujuj i=1 j=1 X X

Since ui form an orthonormal basis, the only non-zero terms are the ones with i = j. These terms in turn leads to:

n t+1 t+1 T A = i uiui j=1 X

133 So the inductive hypothesis holds for t +1as well. ⌅ Lemma 1.6.6 (Composition of Spectral Ordering) For any matrices V, A, B, if A B, then T T V AV V BV Proof For any vector x, if we let y = Vx, then we have:

xT VT AVx = yT Ay yT By By definition of A B  = xT VT BVx

⌅ Lemma 1.6.7 If A is a symmetric positive semi-definite matrix and B is a matrix such that:

(1 ✏) A† B (1 + ✏) A† then for any vector b = Ax¯, x = Bb satisfies:

x x¯ ✏ x¯ k kA  k kA Proof Applying Lemma 1.6.6 to both sides of the given condition with A1/2 gives:

(1 ✏) ⇧ B (1 + ✏) ⇧ Where ⇧ is the projection matrix onto the rank space of A. Rearranging the LHS of the error term gives:

2 2 x¯ x = A†b Bb k kA A T = b A† B A A† B b 2 T 1/2 1/2 1/2 1/2 T = b A † ⇧ A BA A† b ⇣ ⌘ 1/2 Where A† is the 1/2 power of the pseudoinverse of A. Note that the given condition implies that all eigenvalues of ⇧ A1/2BA1/2 are between ✏ 2 and ✏. Therefore ⇧ A1/2BA1/2 ✏2I and we have: ⇣ ⌘ 2 2 T 1/2 1/2 T x¯ x ✏ b A† IA† b k kA  2 T = ✏ b A†b 2 2 2 2 = ✏ A†b = ✏ x¯ A k kA Taking square roots of both sides then gives the error bound. ⌅

134 Appendix B

Spectral Sparsification by Sampling

Before describing and proving the sparsification algorithms, we need to describe a way to decom- pose a graph Laplacian into a sum of outer-products. Our basic building block is the indicator vector u, which is 1 in position u and 0 everywhere else. Then an edge e = uv can also corre- spond to the vector = , which is 1 in u, 1 in v and zero everywhere else. This is one uv u v row of the edge-vertex incidence matrix @, and the graph Laplacian for a graph G =(V,E,w) can be written as:

T LG = weuvuv e=uv X Recall that the effective resistance of an edge, Re was defined as:

T Re = u,vLG† u,v and its statistical leverage score is the product of this value and the weight of the edge:

⌧e = weRe T = u,vLG† u,v

2 It was shown in [SS08] that if we sample each edge with probabilities cs⌧e log n✏ for some absolute constant cs, the resulting graph is a 1+✏-approximation. Their proof was based on a matrix-concentration bound by Rudelson and Versynin [RV07], which can be viewed as a stronger variant of an earlier result by Ahlswede and Winter [AW02]. This sampling scheme was used in [KMP10] with one slight modification: instead of computing Re with respect to the entire graph, an upper bound Re0 was computed using an (artificially) heavy spanning tree. We will show the following concentration bound that allows us to show the convergence of both of these processes.

Lemma B.0.1 Given n n positive semi-definite matrices A = m yT y and B such that the ⇥ i=1 i i image space of A is contained in the image space of B, along with estimates ⌧˜i such that: P T ⌧˜i y B†y 1 i m i i 8  

135 d Then for any error ✏ and any failure probability = n , there exists a constant cs such that if we 2 construct A˜ using by repeating the following process N = cs ln n✏ ⌧˜i times ( ⌧˜i is the total k k1 k k1 of all the estimates):

• Pick index i0 with probability proportional to ⌧˜i

2 • Add ✏ T to ˜ cs ln n⌧˜i yiyi A

d Then with probability at least 1 =1 n , B satisfies: A ✏ (A + B) A˜ A + ✏ (A + B) The use of ln instead of log in the notation is for convenience of proof. For algorithmic · · purposes they’re interchangeable with the use of a different constant cs. We will first show in Section B.1 that Lemma B.0.1 implies the bounds on sparsification algorithms, and then prove this concentration bound in Section B.2.

B.1 From Matrix Chernoff Bounds to Sparsification Guarantees Both sparsification by effective resistance in Theorem 1.6.9 and sampling off-tree edges by stretch in Lemma 2.2.2 are direct consequences of this concentration bound. We first give bounds for sampling by effective resistance. Proof of Theorem 1.6.9: ✏ Consider applying Lemma B.0.1 with A = B = LG and error 8 . For each edge e, we map it to the vector yi = pwee. It can be checked that:

m m T T yiyi = weee i=1 i=1 X X = LG = A

T Also, the definition of ⌧i is identical to yi B†yi. Therefore, if the resulting matrix given by the ˜ d sampling procedure is LH = A, we have with probability at least 1 n : ✏ ✏ 1 LG LH 1+ LH 4 4 ⇣ ⌘ ⇣ ⌘ ✏ 1 ✏ ✏ 1 Rescaling LH by (1 ) and using the fact that 1+ 1 1+✏ gives the desired 4 4 4  bound. ⌅ T For sampling with stretch, it was formally shown in [SW09] that strT (e)=e LT† e. So applying the same concentration bound from Lemma B.0.1 except with A set to the off-tree edges and B set to the tree gives the bound. Proof of Lemma 2.2.2:

136 Let LG T denote the graph Laplacian for the off-tree edges. We invoke Lemma B.0.1 with \ 1 A = LG T , B = LT , error , and the same mapping of yis. For any edge e which is mapped to \ 4 index i, we have:

T T yi B†yi = e LT† = strT (e)

d Let the resulting graph be H T , we have with probability at least 1 n : \ 1 1 LG T LG LH T LG T + LG \ 4 \ \ 4

Then if we set LH to LH T + LT we have: \ 3 5 LG LH LT 4 4

4 Therefore returning 3 LH satisfies the required bounds. ⌅ In the rest of this Chapter we will prove Lemma B.0.1. Our presentation closely mirrors the ones in [Ver09, Har11],

B.2 Proof of Concentration Bound We now set forth proving Lemma B.0.1. We first show that it suffices to prove the claim for the case where A + B = I.

Lemma B.2.1 Lemma B.0.1 restricted to the setting A+B = I implies the general version without this restriction.

Proof Since the image spaces of A and B are contained in the image space of A + B, it suffices to consider the case where A + B is full rank. This restriction also implies that B is full rank, as we can use inverses in place of pseudoinverses in our proofs. 1 1 1 1 Consider a modified problem with Aˆ =(A+B) 2 A(A+B) 2 and Bˆ =(A+B) 2 B(A+B) 2 . m 1 ˆ ˆ T 2 Then the decomposition of A in turn becomes A = i=1 yˆ y where yˆ =(A + B) y. ˆ ˆ 1 1 Then A + B =(A + B) 2 (A + B)(A + B) 2P= I. And it suffices to check that both the sampling process and required spectral bounds are equivalent. Both parts can be proven by direct algebraic manipulations. For the sampling process, it suffices to check that the leverage scores are same as before:

1 T 1 T 1 1 1 1 T yˆ Bˆ yˆ = y (A + B) 2 (A + B) 2 B(A + B) 2 (A + B) 2 y

T 1 ⇣ 1 1 1 ⌘ 1 T = y (A + B) 2 (A + B) 2 B (A + B) 2 (A + B) 2 y T 1 = y B y

137 For the spectral conditions, we will only show the LHS sides are implied by each other, as the RHS follows analogously. The Courant-Fischer theorem as stated in Lemma 3.3.9 gives A ✏ (A + B) m T n siyiyi if and only if for all vectors x R , we have: i=1 2 P m T T T x Ax ✏ (A + B) x x siy y x  i i i=1 ! X 1 . Consider the reversible mapping x xˆ =(A + B) 2 x. We have: $ m m m 2 T T T T T xˆ siyˆiyˆi xˆ = sixˆ yˆiyˆi xˆ = si xˆ yˆi i=1 ! i=1 i=1 X Xm X ⇣ ⌘ m 1 1 2 2 T 2 2 T = si x (A + B) (A + B) yi = si x yi i=1 i=1 X ⇣ ⌘ X Where the last equality follows from yi being in the image space of A and therefore B. Setting T ˆ T T ˆ T si =1also gives that xˆ Axˆ = x Ax. It remains to check that xˆ Bxˆ = x Bx:

T 1 T 1 1 1 1 xˆ Bˆ xˆ = x (A + B) 2 (A + B) 2 B (A + B) 2 B 2 x = xT Bx

⌅ Therefore from here and on we will assume that A + B = I. As A, B are positive semi-definite, this also implies that A, B I. We start of with some notations on the sampling procedure Let the iterations be j =1...N and ij be the random variable corresponding to the rows picked in iterations j. The final matrix generated by the procedure, A˜ is a randomized matrix, and we will denote it using the random variable . We will also use to denote the random matrix added to Y Yj A˜ in iteration j. We first check that the procedure returns A in expectation:

Lemma B.2.2

E [ ]=A i1...iN Y

138 1 Proof By linearity of expectation, it suffices to check that each sample’s expectation is N A.

m 2 ✏ T E [ j]= Pr [ij = i] yiyi ij Y c ln n⌧˜ i=1 s i X m 2 ⌧˜i ✏ = y yT ⌧˜ c ln n⌧˜ i i i=1 1 s i X k k m ✏2 = y yT c ln n ⌧˜ i i i=1 s 1 X k k 1 m = y yT N i i i=1 X ⌅ Therefore, bounding the difference between A˜ and A is essentially bounding the deviation of Y from its expectation. Once again, we can measure this per iteration. At iteration j, let be random Xj variable corresponding to j Ei [ j]. Also, let 0 ... N be random variables corresponding to Y j Y S S the partial sums of the s: Xj j = Sj Xj0 jX0=1 Our goal is then to bound the spectral norm of: = N . For its maximum eigenvalue, our SN j=1 Xi goal is to measure the probability that: P

Pr [max( n) >✏] S

We will bound this probability by taking the exponential of and bounding its trace. We have Sn the following for any multiplier :

n ✏ Pr [max( n) >✏] Pr tr e S >e S  ✏ n e ⇥ E⇥ tr⇤ e S ⇤ By Markov’s inequality (B.1)  i1...iN ⇥ ⇥ ⇤⇤ The term inside the expectation is a the trace of a matrix exponential. It can be decomposed using the following two facts:

1. Golden Thompson Inequality [Gol65, Tho65]:

tr eA+B tr eAeB  ⇥ ⇤ ⇥ ⇤ 139 2. Inequalities involving matrix norm and trace:

tr [UV] U tr [V] k k2 Lemma B.2.3

N n j E tr e S n E [ j] eX 1... n  · X 2 X X j=1 ⇥ ⇥ ⇤⇤ Y

n E tr e S 1... n X X n n 1 E ⇥tr ⇥e X ⇤⇤e S (by the Golden-Thompson inequality) 1... n  X X · ⇥ ⇥ ⇤⇤ n n 1 = E tr E e X e S By linearity of expectation 1... n 1 n · X X  X ⇥ ⇤ Xn Sn 1 = E tr E e e By independence of n with 1 ... n 1 X1...Xn 1 Xn · X X X   ⇥ ⇤ Sn 1 E E e X\ e since tr [UV] U 2 tr [V]  1... n 1 n · k k X X X 2 ⇥ ⇤ Applying this inductively to n 1 ... 1 gives: X X

n E tr e S 1... n X X ⇥ ⇥N ⇤⇤ 0 tr e S E e Xj  · j j=1 X 2 ⇥ ⇤ Y ⇥ ⇤ N n E eXj  · j j=1 X 2 Y ⇥ ⇤ Where the last line follows from being the zero matrix. S0 ⌅ Therefore our goal now shifts to bounding E eXj . Since the sampling process is inde- Xj 2 pendent over the iterations, this value is the same for all 1 j N. The following fact can be ⇥ ⇤   derived from the Taylor expansion of e when 1:  1+ e 1+ + 2  

Turning this into matrix form gives that when I j I: X

j 2 2 I + j e X I + j + X X Xj To apply this bound, we need 1. This can be obtained from our setup as follows: kXjk2  140 Lemma B.2.4 1 ✏2 I j I N X cs ln n

Proof It suffices to show this bound for any case of ij = i. Here we have:

2 ✏ T 1 j = yiyi A X cs ln n⌧˜i N

The LHS follows from A I. For the RHS, it suffices to upper bound the Frobenius norm of the first term. We have: ✏2 ✏2 y yT = y 2 c ln n⌧˜ i i c ln n⌧˜ k ik2 s i F s i 1 1 1 2 Note that ⌧˜i YiB Yi. As B I, B I and YiB Y y . Incorporating these gives: ⌫ k ik2 ✏2 ✏2 y yT y 2 c ln n⌧˜ i i  2 k ik2 s i F cs ln n yi 2 2 k k ✏  cs ln n

⌅ Therefore, as long as is reasonably small, we can apply Taylor expansion and in turn bound the spectral norm of 2. Xj

2 Lemma B.2.5 If cs ln n✏ , we have:  2 2 ✏ E e Xj e Ncs ln n ij  2 ⇥ ⇤ Proof

Lemma B.2.4 gives that for any value of ij we have:

I j I X Therefore we can apply Taylor expansion to obtain:

j 2 2 e X I + j + X Xj

141 Adding expectation and combining with Eij [Xi]=0gives:

j 2 2 E e X I + E j ij ij X ⇥ ⇤ ⇥ ⇤ So it remains to bound the term corresponding to the second moment:

2 2 1 E j = E j A ij X ij Y N "✓ ◆ # ⇥ ⇤ 2 1 1 2 = E j AE[ j]+ 2 A ij Y N ij Y N

⇥ 2⇤ 1 2 1 = E j 2 A Since E [ j]= A ij Y N ij Y N ⇥ 2⇤ E j ij Y ⇥ ⇤ We then consider the value of over choices of i = i: Yj j 2 2 2 ⌧˜i ✏ T T E j = yiyi yiyi ij Y ⌧˜ c ln n⌧˜ i k k1 ✓ s i ◆ ⇥ ⇤ X ✏2 y 2 = k ik2 y yT Nc ln n ⌧˜ i i s i i X ✏2 ✏2 A I Ncs ln n Ncs ln n

Substituting this and applying the LHS of the bound on e gives the result:

2 2 ✏ E e Xj 1+ ij  Nc ln n 2 s ⇥ ⇤ ✏22 e Ncs ln n 

⌅ We can now combine things to give the overall bound.

Proof of Lemma B.0.1:

142 Lemma B.2.3 along with the Markov inequality shown earlier gives:

✏ n Pr [max( n) >✏] e E tr e S From Equation B.1 S  i1...iN ⇥N ⇥ ⇤⇤ 2 ✏ j e n e E [ j] eX By Lemma B.2.3  · ·· X 2 i=1 NY 2 2 ✏ ✏ e n e cs ln nN By Lemma B.2.5  · · i=1 Y2 2 ln n ✏+ ✏ =e cs ln n

1 1 Setting = 2 ✏ cs ln n gives:

1 1 ln n cs ln n+ cs ln n Pr [max( n) >✏] e 2 4 S  1 ln n cs ln n =e 4

ln n d ln n d 1 Hence the choice of cs =4(d +2)suffices to give a probability of e = n . Prob- abilistic bound on the lower eigenvalue of follows similarly, and combining them via. union Y bound gives the overall bound. ⌅

143 144 Appendix C

Partial Cholesky Factorization

We now prove the guarantees of the greedy elimination steps used in the sequential algorithm in Chapter 2. These proofs can be found in most texts on numerical analysis (e.g. Chapter 9 of [Hig02]). However, we need to track additional parameters such as edge weights across log n levels of recursive calls. As a result, most of the bounds below include more edge weight related information in their statements. Partial Cholesky factorization multiplies both sides of an equation by a lower triangular matrix. To avoid confusion with graph Laplacians, we will instead denote these as transposes of upper- triangular matrices, e.g. UT . Starting with the problem:

Ax = b

Left multiplying both sides by UT gives:

UT Ax = Ub

1 To preserve symmetry, we insert U and U between A and x:

T 1 U AUU x = Ub T 1 T U AU U x = U b

T Then the solution can be given by solving a linear system in B = U AU.

1 T T U x = U AU † U b T = B†U b T x = UB†U b

Errors also propagate naturally:

Lemma C.0.1 If U as given above has mu non-zero off-diagonal entries and ZB is a matrix such T that B† Z B†. Then Z = UZ U satisfies A† Z A†. Also, for any vector b, Z b B A B A A 145 can be evaluated with an overhead of O(mu) to the cost of evaluating ZBb0.

T Proof The approximation bound follows from A† = UB†U and Lemma 1.6.6. For the running time, since the diagonal entries of U form the identity, they incur no additional cost if we keep the original vector. The running time overhead then corresponds to multiplying by T U and then U. ⌅ C.1 Partial Cholesky Factorization on Graph Laplacians Both of our removal procedures for low-degree vertices can be viewed as special cases of partial Cholesky factorization. The error and running time propagation of Lemma C.0.1 means it suffices to exhibit lower-triangular matrices with a constant number of off-diagonal entries, and show their effect on the combinatorial graph. We first consider the case of removing a vertex of degree 1.

Lemma 2.1.3 Let G be a graph where vertex u has only one neighbor v. H is reducible to a graph H that’s the same as G with vertex u and edge uv removed.

Proof Without loss of generality (by permuting rows/columns), we assume that this vertex corre- sponds to row/column 1, and its neighbor is in row 2 and column 2. Then the matrix UT is:

100... 0 110... 0 2 001... 0 3 6 ...... 7 6 7 6 000... 1 7 6 7 4 5 It can be checked that since the only non-zero entries in the first row/column are in row/column 2, applying this matrix leads to exact cancellation. ⌅ The case of pivoting out a degree 2 vertex can be done similarly.

Lemma 2.1.4 Let G be a graph where vertex u has only two neighbors v1, v2 with edge weights wuv1 and wuv2 respectively. G is reducible a graph H that’s the same as G except with vertex u and edges uv1, uv2 removed and weight

wuv1 wuv2 1 = 1 1 wuv1 + wuv2 + wuv1 wuv2 added to the edge between v1 and v2.

Proof Without loss of generality suppose the two neighbors are 2 and 3 and the edge weights are T w12, w13, then U is:

146 100... 0 w12 10... 0 2 w12+w13 3 w13 01... 0 w12+w13 6 ...... 7 6 7 6 000... 1 7 6 7 4 5 The only new off-diagonal entry affected by this is in row 2, column 3 and row 3, column 2. By symmetry it suffices to check the change to row 2 column 3. Here it is the value of row 1, column w12 3(w13) times the coefficient in row 2, column 1, which gives w13 . ⌅ w12+w13 Pivoting out the first vertex leads to a lower-triangular matrix UT where the only non-zero off diagonal entries are in the first column. Furthermore, these entries are edge weights divided by the total weighted degree of the vertex removed, and can be viewed as the normalized edge weights. The inverse of UT can also be described simply:

Lemma C.1.1 Given a lower triangular matrix of the form:

100... 0 l2 10... 0 T 2 3 U = l3 01... 0 6 ...... 7 6 7 6 l 00... 1 7 6 n 7 4 5 Its inverse is: 100... 0 l2 10... 0 T 2 3 U = l3 01... 0 6 ...... 7 6 7 6 l 00... 1 7 6 n 7 4 5 C.2 Errors Under Partial Cholesky Factorization Errors in matrix norms propagate naturally under partial Cholesky factorizations.

Lemma C.2.1 Let B be the matrix obtained by applying partial Cholesky factorization to A:

B = UT AUT

For any vector x, we have:

x = Ux k kB k kA 147 Proof A direct substitution gives:

x 2 = xT Bx k kB = xT UT AUx =(xU)T A(Ux) = x 2 k kA ⌅ This allows us to prove general statements regarding the numerical stability of partial Cholesky factorizations. In numerical analysis, the stability of such instances are well-known [Hig02]. How- ever, several technical issues need to be addressed for a complete proof. The main reason is that floating point errors can occur in all the steps:

T 1. Finding LH from U LGU.

2. Solves involving LH and the diagonal entry corresponding to u.

T 1 3. Multiplying vectors by U and U .

Since the U involved in our computations only have a constant number of off-diagonal entries, we can view operations involving U as a constant number of vector scaling/. This then allows us to view U as an exact matrix with no round-off errors for our analysis. We first show that the edge weights in the resulting graph Laplacian are close to the ones in the exact one. This step is simpler when we view the partial Cholesky factorization process implicitly.

Lemma C.2.2 Given a graph G where the resistance of all edge weights are between rmin and rmax, along with a vertex u of degree at most 2. Let H¯ correspond to the exact partial Cholesky factorization with u removed. We can find in O(1) time H˜ whose edge weights approximate those 6rmax in H¯ within multiplicative error of ✏m. ± rmin

Proof In the case of u having degree 1, we simply remove the edge between u and its neighbor, so the bound on edge weights and total resistance holds in H. If u is degree 2, let the two edges incident to u have weights w1 and w2 respectively. Then the weight of the new edge in the absence of ¯ 1 floating point error, LH , is: 1 + 1 . If an additive error of ✏m is incurred at each step, the resulting w1 w2 value can be upper bounded by: 1 1 + ✏m = 1 1 + ✏m 1 1 + 3✏m ✏m + ✏m ✏m w1 w2 w1 w2 ⇣ ⌘ ⇣ ⌘ 148 1 ✏m Since rmin, each additive factor of ✏m incurs a multiplicative error of . So the result can wi rmin be upper bounded by:

1 ✏m 1 1 3 1 1 + ✏m rmin + ✓ ◆ w1 w2 1 1 1 As , rmax, the exact result is at least . So the additive factor of ✏m equals to a w1 w2  2rmax multiplicative factor of 2rmax✏m. Combining these two terms gives a total multiplicative distortion ✏m 1 6rmax of (1 3 ) +2rmax✏m ✏m. rmin  rmin A lower bound on the resulting value can be obtained similarly. ⌅ This allows us to bound the effect of round-off errors across all the elimination steps performed during greedy elimination.

Lemma C.2.3 Let G be a graph on n vertices with a spanning tree TG such that the minimum rmin resistance of an edge is at least rmin and the total resistance is at most rs. Then if ✏m ,  20nrmax all intermediate graph/tree pairs during greedy elimination satisfies:

1. The weights of all off tree edges are unchanged.

2. The minimum resistance of an edge in TH is rmin and the total is at most 2rs.

3. If a tree edge in TH has weight we and its weight if exact arithmetic is used would be w¯ e, 1 then w¯ e we w¯ e. 2   Proof Note that aside from discarding edges, the other operation performed by greedy elimina- tion is to sum the resistance of edges. Therefore, we can ensure that the resistance of the resulting edge does not decrease. Also, Lemma C.2.2 gives that the total resistance can only increase by a 6rmax 1 factor of 1+ ✏m 1+ . As there are at most n tree edges, the cumulative effect of this is rmin  3n at most (1 + 1 )n 2. 3n  Also, note that the resistance of a single edge in H in the absence of round-off errors is the sum of resistances of edges contracted to it. Therefore, the same analysis as total resistance is applicable here. Inverting ¯re ˜re 2¯re then gives the bound on edge weights.   ⌅ This approximation in edges also means that the resulting graph LH is spectrally very close to ¯ the exact one, LH . Therefore we can show that a solver for LH leads to a solver for LG as well.

Lemma C.2.4 Let G be a graph where all edge weights are in the range [wmin,wmax] with wmin wmin  1 w . If ✏ 3 and H is obtained by performing partial Cholesky factorization on max m 10n wmax   1 1 a single vertex u in G. Given any (✏,✏1, )-solver for LH , SOLVELH with n ✏,✏1 2 , we can 1 1 T   obtain a (1 + )✏, (1 + )✏1, + O(1) -solver for LG such that for any vector b, SOLVE (b) 3n 3n T LG makes one call to SOLVE (b0) with a vector b0 such that: LH n2 0 + ✏ b L† b L† m k k H k k G wmin

149 3 1 and the magnitude of entries in all intermediate vectors do not exceed 2n (wmin + wmax) b . k kLG†

Proof

Let CG be the exact partial Cholesky factorization:

T CG = U LGU

du 0 = 0 L ¯  H Where H¯ is the graph that would be produced by partial Cholesky factorization in the absence of T round-off errors. Then LG† = UCG† U . We first show that SOLVELH readily leads to a solver for CG. The exact inverse of CG is:

1 du 0 CG† = 0 L†  H

Note that CG-norm is the sum of the entry-wise norm in du of the entry corresponding to u and the LH -norm of the rest of the entries. Since all weights are at most wmax, du 2wmax. This means  that an absolute error of ✏m is at most 2wmax✏m in CG norm . 1 Also, Lemma C.2.2 gives that all edge weights in H are within a multiplicative factor of n3 ¯ 1 1 from those in H. Therefore (1 n3 )LH¯ LH (1 + n3 )LH¯ If ZLH is the matrix related to 1 1 † ¯ SOLVELH , then (1 n3 )(1 ✏)LH¯ ZLH (1 + n3 )(1 + ✏)LH . Also, the deviation term 1 may increase by a factor of 1+ n3 due to the spectral difference between LH and LH¯ . Therefore 1 1 SOLVE is a ((1 + 3 )✏, (1 + 3 )✏1, ) solver for L ¯ . LH n n T H

Combining SOLVELH with the scalar inversion of du then gives a solver for CG,SOLVECG . Since CG is block-diagonal, both the bound of approximate inverse operations and errors in CG- 1 norm can be measured separately. This gives a total deviation of up to (1 + n3 )✏1 +2wmax✏m 2  (1 + n3 )✏1 in CG-norm, while the approximation factor only comes from LH . Therefore SOLVECG 1 2 is a ((1 + 3 )✏, (1 + 3 )✏1, + O(1)) solver for CG. Also, the only operation performed by n n T SOLVECG is to call SOLVELH on the part of the vector without u. The block-diagonal structure of du and LH again gives that the LH norm of this vector is at most the CG-norm of the input vector. T It remains to turn SOLVECG into a solver for LG. Since LG† = UCG† U , it suffices to bound the errors of multiplying by UT and U.

Let ZCG be the linear operator corresponding to SOLVECG . Then combining the guarantees of the algorithm with Lemma C.0.1 gives:

1 T 1 1 1+ ✏ L† UZ U 1+ 1+ ✏ L† 3n G CG 3n G ✓ ✓ ◆ ◆ ✓ ✓ ◆ ◆ T Therefore ZLG = UZCG U suffices as the matrix related to SOLVELG . We now need to bound T the deviations caused by SOLVECG , as well as multiplications by U and U. Let these three error

150 vectors be err1, err2, and err3. Then the vector returned by SOLVELG is :

T U ZCG U b + err1 + err2 + err3 T =UZCGU b + UZCG err1 +Uerr2 + err3

= ZLG b + UZCG err1 + Uerr2 + err3

The first term is the error-free result, therefore it suffices to bound the LG norm of the three later terms separately. Our assumptions about round-off errors gives err1 2 , err3 2 ✏m, and the 2 k k k k  guarantees of SOLVE shown above gives err2 (1 + 3 )✏1. For err2, we have: CG k kCG  n

T T Uerr2 = err U LGUerr2 k kLG 2 q T = err2 CGerr2

= qerr2 k kCG 2 1+ ✏  n3 1 ✓ ◆

For err3, since the maximum edge weight is at most wmax, the Frobenius norm, and in turn 2 2 1 2-norm of LG is at most wmax n . So we have err3 n wmax✏m ✏1.  k kLG   20n The only remaining term is UZ err1 The term can be rewritten as: k CG kLG T T T UZCG U U err1 = ZLG U err1

The LG norm of it is then:

T 2 T T T ZL U err1 =(U err) ZL LGZL (U err) G LG G G A weaker version of the guarantees of ZLG then gives:

T 2 T 2 2 ZLG U err1 L U err Z G  LG T 2 4 U err  LG† Since G is connected by edges weighting at least wmin , the minimum non-zero eigenvalue of LG 1 2n2 is at least 2 . So L† I. Also, since all entries in U are at most 1, the Frobenius norm n U G wmin T 1 4n3 1 of U U can be bounded by n. Combining these gives an upper bound of ✏m ✏1. wmin  20n 1 Therefore the total deviation can be bounded by (1 + 3n )✏1. q

151 T The vector passed to SOLVECG is vector is U b + err1. Decomposing this using the triangle inequality gives:

T + T + U b err1 U b err1 C† CG†  CG† k k 2 T n + ✏m U b C†  G wmin n2 + ✏ b L† m k k G wmin

Where the second last inequality follows bounding the minimum non-zero eigenvalue of CG by 1 2 . This also bounds the maximum entry of this vector. n wmin

These bounds in LG and LG† norms can be converted to bounds on magnitude of entries using the `2 norm. The bounds on edge weights gives that all non-zero eigenvalues of LG are between wmin 2 2 n2 and n wmax. This immediately allows us to bound b 2 by n wmax b † . Also, our guaran- k k k kLG tees implies the output, x satisfies x L 2 b † . So the `2-norm of this vector can be bounded k k G  k kLG n2 1 by b as well. This combined with bounds on err3 and the Frobenius norm of U then wmin LG† k k T k k T gives bounds on the magnitudes of U(ZCG (U b + err1)+err2) and ZCG (U b + err1) as well. ⌅ Applying this Lemma repeatedly gives the following guarantees of greedy elimination.

Lemma C.2.5 (Greedy Elimination) Let G be a graph on n vertices with a spanning tree TG such that the minimum resistance of a tree edge is at least rmin 1 and the total resistance is at rmin  most rs 1. If ✏m 3 , then running GREEDYELIMINATION on G and TG in O(m) time to  3n rs produces a graph H with a spanning tree TH such that:

1. The weights of all off tree edges are unchanged.

2. The resistance of an edge in TH is at least rmin and the total is at most 2rs.

3. If a tree edge in TH has weight w˜ e and its weight if exact arithmetic is used would be w¯ e, 1 then w¯ e w˜ e w¯ e. 2  

4. Given a routine SOLVE that is a (✏,✏1, )-solver for LH , we can obtain a routine SOLVE LH T LG that is a (2✏, 2✏1, + O(m))-solver for LG, Furthermore, for any vector b, SOLVE makes T LG one call to SOLVELH with a vector b0 such that b0 b +1and all intermediate LH† LG† 3 1 k k kk values are bounded by O(n (rs + rmin )( b +1)). k kLG†

Proof Let the sequence of graphs generated by the pivoting be G0 = G, G1 ...Gk = H for some k

153 154 Appendix D

Iterative Methods

In this chapter we prove guarantees of iterative methods for completeness. Our precision analysis is in the fixed-point model, and the necessary assumptions are stated at the start of Section 2.6.

D.1 Richardson Iteration

We now show Richardson iteration, which reduces the error from a constant to ✏.

Lemma 1.6.8 If A is a positive semi-definite matrix and SOLVEA is a routine such that for any vector b = Ax¯, SOLVEA (b) returns a vector x such that: 1 x x¯ x¯ k kA  2 k kA

Then for any ✏, there is an algorithm SOLVEA0 such that any vector b = Ax¯, SOLVEA0 (b) returns a vector x such that:

x x¯ ✏ x¯ k kA  k kA by making O (log (1/✏)) calls to SOLVEA0 and matrix-vector multiplications to A. Furthermore, SOLVEA0 is stable under round-off errors.

The assumptions that we make about round-off errors are presented in detail in Section 2.6. Specifically, any matrix/vector computations leads to an answer that differs from the true answer by a vector whose magnitude is at most ✏m. Furthermore, we assume that b is normalized so that b =1, and at each step we project our solution vector onto the column space of A. k k2 1 Let be a bound such that all non-zero eigenvalues of A are between and , We will show ✏ convergence when ✏ 4 . m  60 155 Proof Consider the following recurrence:

x0 = 0 1 xi+1 = xi SOLVE (Axi b) 4 A

Implementing this recurrence as given leads to two sources of additional round-off errors: com- puting Axi b, and subtracting the a scaled version of the output of SOLVEA from xi. If we denote (1) (2) them as erri and erri , the result of our algorithm is then produced by the following recurrence:

x˜0 = 0 1 (1) (2) x˜i+1 = x˜i SOLVE Ax˜i b + err + err 4 A i i ⇣ ⌘ ✏ Also, at each iteration, we will check the value of Ax˜i b 2 2 and terminate if it is too k k  ✏ ✏ small. This is because the spectrum bounds on A gives that if Ax˜i b 2 2 , x˜i x¯ A . =1 ¯ = 1 k k  k k  Since b 2 , x A b A† and the answer meets the requirement. Therefore we may k k k k k k ✏ work under the assumption that Ax˜i b 2 2 , which once again using spectral bounds on A ✏ k k translates to x˜i x¯ 3 . k kA We will show that the error reduces geometrically over the iterations.

For the starting point of x˜0 = 0, we have:

2 2 x˜0 x¯ = 0 x¯ k kA k kA = x¯ 2 k kA

We will then bound the change of the error from x˜i to x˜i+1. To simplify notation, we use ri to denote Ax˜i b, and i to denote the difference, x˜i+1 x˜i.

2 2 2 2 x˜i x¯ x˜i+1 x¯ = x˜i x¯ i + x˜i x¯ (D.1) k kA k kA k kA k kA T T = 2(x˜i x¯) Ai Ai (D.2) i T T = 2(Ax˜i b) i Ai (D.3) i T T = 2r i Ai (D.4) i i

We will upper bound this quantity by showing that 4i is close to x¯ x˜i. The guarantees of

156 SOLVEA can be expressed as:

(1) (1) 1 (1) SOLVEA Ax˜i b + erri A† Ax˜i b + erri A† Ax˜i b + erri A  2 A ⇣ ⌘ ⇣ ⌘ ⇣ ⌘ (1) (1) 1 (1) SOLVEA Ax˜i b + erri (x˜i x¯) A†erri x˜i x¯ + A†erri A  2 A ⇣ ⌘ (1) (2) Moving the A†erri term out of both sides using the triangle inequality, and incorporating in erri then gives:

(1) 1 3 (1) SOLVEA Ax˜i b + erri (x˜i x¯) x˜i x¯ A + erri A  2 k k 2 A† ⇣ ⌘ 1 3 4 (˜ ¯) ˜ ¯ + (1) +4 (2) i xi x A xi x A erri erri k k  2 k k 2 A† A (1) (2) Our bounds on round-off errors gives erri , erri ✏m, Furthermore, the assumption 2 2  1 that all non-zero eigenavalues of are between and gives , † . Therefore the norms A A A I · ✏ of the two (additive) error terms can be bounded by 6✏m. By the assumption that x˜i x¯ A 3 ✏ 1 k k and ✏ 4 , this translates to a multiplicative error of , giving: m  60 10 3 4i (x˜i x¯) x˜i x¯ k kA  5 k kA

Squaring both sides of it and and substituting in ri = Axi b gives:

3 4i (x˜i x¯) xi x¯ (D.5) k kA  5 k kA T T 2 9 2 16 Ai +8r i + x˜i x¯ x˜i x¯ (D.6) i i k kA  25 k kA T T 4 2 2r i +4 Ai x˜i x¯ (D.7) i i 25 k kA

T T Since A is positive semi-definite, Ai 4 Ai. Thus we have: i i

2 2 4 2 xi x¯ xi+1 x¯ xi x¯ (D.8) k kA k kA 25 k kA Rearranging and taking square roots of both sides gives:

4 xi+1 x¯ A 1 xi x¯ A (D.9) k k  r 25 k k Which means that unless the algorithm is terminated early, the error is decreasing geometrically at a constant rate. Therefore in O (log (1/✏)) iterations it becomes less than ✏.

157 It remains to bound the magnitude of intermediate vectors. Note that the assumption of b = k k2 1 and assumption of on the eigenvalues of A gives x¯ p. The convergence condition shown k kA  above gives x˜i x¯ x¯ . The triangle inequality then gives x˜i 2 x¯ 2p. By the k kA k kA k kA  k kA  assumption that x˜i is projected into the column space of A at each step and the minimum non-zero 1 eigenvalue of A being is least , x˜i p x˜i . Therefore the maximum magnitude of the k k2 k kA entries in x˜i can be bounded by 2p. For the intermediate steps, we have:

Ax˜i b Ax˜i b 2 k k1 k k = A(x˜i x¯) k k2 p x˜i x¯  k kA 

Since the maximum magnitude of an entry in b is at most 1, the magnitude of entries in Ax˜i can be bounded by +1 2. Also, the magnitude of the entries in intermediate vectors passed to  (1) SOLVEA can be bounded by A(x˜i x¯)+err2 + ✏m 2. 2   ⌅ D.2 Chebyshev Iteration As the name suggests, Chebyshev iteration is closely related with Chebyshev Polynomials. There are two kinds of Chebyshev Polynomials, both defined by recurrences. Chebyshev polynomials of the first kind, Tn(x) can be defined as:

T0(x)=1

T1(x)=x

Ti+1(x)=2xTi(x) Ti 1(x)

Alternatively, Ti(x) can be characterized using the following closed from:

i i x px2 1 + x + px2 1 Ti(x)= 2 The following facts about Chebyshev polynomials of the first kind will be used to bound conver- gence.

Fact D.2.1 If x =cos(✓), then

Tn(x)=cos(n✓)

This allows us to show that if x 1, T (x) 1. For proving convergence, we also need the | | | n | opposite statement for lower bounding Tn(x) when x is large.

158 1 Fact D.2.2 If x =1+ , then: 1 i T (x) x + px2 1 i 2 ⇣ ⌘ i 1 2 1+ 1+ 1 2 r  ! 1 1 i 1+ 2 p ✓ ◆ We can also show that these terms are steadily inceasing:

Fact D.2.3 If i j and x 1, then Ti(x) 2Tj(x). 

Proof x 1 implies 0 x px2 1 1 and 1 x + px2 1. Therefore:    i+1 x + px2 1 T (x) i+1 2 i x + px2 1 2 1 T (x) i 2 Fact D.2.2 also gives T (x) 1 . Combining these gives T () 2T (). i+1 2 i j ⌅ For bounding the extraneous error, we also need Chebyshev polynomials of the second kind. These polynomials, given as Un(x), follow the same recurrence, but has a different base case:

U 1(x)=0 U0(x)=1

Ui+1(x)=2xTi(x) Ti 1(x) These polynomials are related to Chebyshev polynomials of the first kind by the following identity:

Fact D.2.4

2 j odd Tj(x) If i is odd Ui(x)= 2 T (x) If i is even ( Pj even j P Throughout our presentation, we will apply both of these polynomials to matrices. Recall from Lemma 1.6.1 the resulting linear operator equals to the one obtained by applying the polynomial on each of the eigenvalues.

159 We first show the convergence of Chebyshev iteration under exact arithmetic.

Lemma 2.1.1 (Preconditioned Chebyshev Iteration) There exists an algorithm PRECONCHEBY such that for any symmetric positive semi-definite matrices A, B, and  where

A B A any error tolerance 0 <✏ 1/2, PRECONCHEBY(A, B, b,✏) in the exact arithmetic model is a  symmetric linear operator on b such that:

1. If Z is the matrix realizing this operator, then (1 ✏)A† Z (1 + ✏)A† 2. For any vector b, PRECONCHEBY(A, B, b,✏) takes N = O (p log (1/✏)) iterations, each consisting of a matrix-vector multiplication by A, a solve involving B, and a constant number of vector operations.

1 Preconditioned Chebyshev iteration is given by the following recurrence with set to 1+  : Base case:

x0 = 0

x1 = B†b

Iteration:

y = B† (Axi b) i+1 2Ti () Ti 1 () xi+1 = xi yi+1 xi 1 Ti+1 () Ti+1 () We first show that the residue, x¯ xi can be characterized in terms of Chebyshev polynomials. Lemma D.2.5 Let x¯ = A†b. The result produced after i iterations of Chebyshev iteration under exact arithmetic, xi satisfies:

1/2 1/2 1/2 1/2 Ti ()(x¯ xi)=A† Ti I A B†A A x¯ ⇣ ⇣ ⌘⌘ Proof The proof is by induction. The base case can be checked as follows:

x¯ x0 = x¯ 1/2 1/2 = A† A x¯

x¯ x1 = x¯ B†b = x¯ B†Ax¯ 1/2 1/2 1/2 1/2 = A† I A B†A A x¯ ⇣ ⌘ 160 For the inductive case, the recurrence can be rearranged to give:

Ti+1 () xi+1 =2Ti () xi yi+1 Ti 1 ()() xi 1 Recall from the definition of Chebyshev polynomials of the first kind that:

Ti+1 ()=2() Ti () Ti 1 ()

So we can subtract both sides from Ti+1 () x¯ to get:

Ti+1 ()(x¯ xi+1)=2Ti ()(x¯i xi + yi) Ti 1 ()(x¯ xi 1) 1 The change, yi, can be viewed as computed by multiplying the difference at iteration i by B A:

yi+1 = B† (Axi b) = B† (Axi Ax¯) = B†A (xi x¯) Substituting this in gives:

Ti+1 ()(x¯ xi+1)=2Ti () x¯ xi B†A (x¯ xi) Ti 1 ()(x¯ xi 1) =2Ti () I B†A (x¯ xi) Ti 1 ()(x¯ xi 1) =2 I B†A Ti ()(x¯ xi) Ti 1 ()(x¯ xi 1) Substituting in the inductive hypothesis then gives:

Ti+1 ()(x¯ xi+1) 1/2 1/2 1/2 1/2 1/2 1/2 1/2 1/2 =2 I B†A A† Ti I A B†A A x¯ A† Ti I A B†A A x¯ 1/2 1/2 ⇣1/2⇣ 1/2⌘⌘ 1/2 1/2 ⇣ 1⇣/2 1/2⌘⌘ 1/2 1/2 =2A † I A B†A Ti I A B†A A x¯ A† Ti I A B†A A x¯ 1/2 ⇣ 1/2 ⌘ 1/2⇣ ⇣ 1/2 ⌘⌘ ⇣ ⇣ ⌘⌘ = A† Ti+1 I A B†A A x¯ ⇣ ⇣ ⌘⌘ ⌅

Therefore to bound the error, it suffices to lower bound Ti () as well as upper bounding the 1/2 1/2 effect of Ti I A B†A . The lower bound follows from Fact D.2.2, and we need to show the upper⇣ ⇣ bound: ⌘⌘

Lemma D.2.6 If A B A, we have for any i: 1 1 Ti I A 2 B†A 2 1 2  ⇣ ⇣ ⌘⌘ 161 Proof The given condition is equivalent to: 1 A† B† A†  1 1 1 1 I A 2 B†A 2 I   1 1 1 1 0 I A 2 B†A 2 1 I   ✓ ◆ Recall that =1+1 , so 1 1 1 and:   

1 1 1 1 0 I A 2 B†A 2 I  ✓ ◆

1 1 Therefore, all eigenvalues of I A 2 B†A 2 are in the range [0, 1]. Applying the upper bound for Chebyshev polynomials for⇣ values in this range⌘ then gives the result. ⌅ This allows us to prove the overall error bound

Proof of Lemma 2.1.1:

Let Zi be the linear operator corresponding to running preconditioned Chebyshev iteration for i steps. In other words, xi = Zib. Since all terms the Chebyshev polynomial are of the form 1 1 1 B AB ...AB , the operator is symmetric. Therefore it suffices to show its spectral guarantees. The goal condition of (1 ✏)A† Z (1 + ✏)A† is equivalent to: (1 ✏) I A1/2ZA1/2 (1 + ✏) I 1/2 1/2 ✏I A Z A† A ✏I 1/2 1/2 A Z A† A ✏ 2  Since Lemma D.2.5 holds for any vector x, it can also be viewed as a statement about the corresponding operators. As the input is b = Ax¯, the difference between xi and x¯ equals to ZiAx¯ x¯.

1 1/2 1/2 ZiA I = A† Ti I B†A A Ti () 1/2 1/2 1 A Zi A† A = Ti I B†A Ti () 1 By Lemma D.2.6, the spectral norm of this operator can be bounded by 1 . The lower Ti( ) bound on Chebyshev polynomials from Fact D.2.2 gives that when i O(p log(1/✏)) for an 1 appropriate constant in the big-O, T () 10✏ . i ⌅ 162 D.3 Chebyshev Iteration with Round-off Errors We now incorporate round-off errors in our analysis. There are two sources of such error here errors from evaluating B†, and errors from vector additions and matrix-vector multiplications. Our proof is based on showing that errors incurred at each step of Chebyshev iteration compound at a rate polynomial to the iteration count. In addition to showing that the total deviation is small, we also need to bound the norms of vectors passed as input to SOLVEB. Our overall bound is as follows:

Lemma 2.6.3 (Preconditioned Chebyshev Iteration) Given a positive semi-definite matrix A with 1 m non-zero entries and all non-zero eigenvalues between and , a positive semi-definite matrix ✏1 B such that A B A, and a (0.2,✏1, )-solver for B, SOLVE . If ✏m < , precon- T B 20 ditioned Chebyshev iteration gives a routine SOLVE that is a (0.1,O(✏ ),O(p(m + )))- A 1 T solver for A. Furthermore, all intermediate values in a call SOLVEA(b) have magnitude at most O(p ( b + ✏1)) and the calls to SOLVE involve vectors b0 such that b0 b + k kA† B k kB† kkA† O (✏1).

Consider the recurrence as Chebyshev iteration with a slightly less aggressive step and = 1 1+ 2 . Base case:

x0 = 0 1 x1 = SOLVE (b) 2 B

Iteration:

2Ti () 1 Ti 1 () xi+1 = xi SOLVEB (Axi b) xi 1 T () 2 T () i+1 ✓ ◆ i+1

Let Z be the matrix corresponding to SOLVE . Then 0.8B† Z 1.2B†. Combining this B B B † with A B B gives A 2ZB 4A. Therefore the ‘ideal’ iteration process similar to 1 Section D.2 would involve multiplying by 2 ZB at each iteration. It is more convenient to view this algorithm as deviating from this process with a single error vector.

✏1 Lemma D.3.1 If ✏m , then preconditioned Chebyshev iteration under round-off errors can  20 be described by the following recurrence where erri satisfies erri 9✏1. k kA 

x0 = 0 1 x1 = SOLVE (b)+err1 2 B 2Ti () 1 Ti 1 () xi+1 = xi ZB (Axi b) xi 1 + erri T () 2 T () i+1 ✓ ◆ i+1 163 Proof The presence of round-off errors causes us to deviate from this in three places:

1. Errors from computing Ax b.

2. Deviation between output of SOLVEB(b0) and ZBb0.

3. Errors from adding (rescaled) vectors.

(1) (2) (3) We will denote these error vectors erri , erri ,erri respectively. The assumptions about round- (1) (3) (2) off error and guarantee of SOLVEB gives: erri , erri ✏m and erri ✏1. The 2 2  B  recurrence under round-off error then becomes: x0 = 0 (2) x˜1 = SOLVEBb + err1

2Ti () (1) (2) Ti 1 () (3) x˜i+1 = x˜i ZB Ax˜i b + erri + erri x˜i 1 + erri Ti+1 () Ti+1 () ⇣ ⇣ ⇣ ⌘ ⌘⌘ Aggregating the errors in the iterative step gives:

2Ti () Ti 1 () 2Ti () (1) (2) (3) x˜i+1 = (x˜i ZB (Ax˜i b)) x˜i 1 Zerri + erri + erri Ti+1 () Ti+1 () Ti+1 () ⇣ ⌘ So we can set erri to the trailing terms, and it suffices to bound its norm. We first bound the coefficient in front of the first two terms. Since 1 2, Fact D.2.3 gives that the coefficient 2Ti()   is at most 4 2 8. We can now bound the A-norm of the terms individually. Since the Ti+1() ·  (3) (2) (2) maximum eigenvalue of A is , erri ✏m. Also, since A B, erri erri A  A  B  (1) (1) † ✏1. ZBerri can be rewritten as erri . Since A 2ZB 4A, applying Lemma 1.6.6 A ZBAZB gives 2 . The spectrum of can also be bounded using the given condition that ZBAZB ZB ZB (1) A† I, giving ZB 2 and in turn ZBerri 2✏m. Combining these gives that the A  total deviation is at most 16✏m +8✏1 + ✏m 8✏1 +17✏m. This also serves as a bound for  ✏1 the case of i =1where only the second error term is present. The assumption of ✏m then  20 gives the overall bound. ⌅

We will show that the deviations caused by erri accumulate in a controllable way. Specifically we show that this accumulation is described precisely by Chebyshev polynomials of the second kind. We will prove the following statement by induction:

Lemma D.3.2 The discrepancy between the approximate solution x˜i and the solution that would

164 be obtained in the absence of round-off errors, xi, is:

i 1/2 1/2 1 1/2 1/2 Ti ()(x˜i xi)= Tj () A† Ui j I A B A A errj j=1 X ⇣ ⇣ ⌘⌘

Where Ui(x) is a Chebyshev polynomial of the second kind.

Proof The proof is by induction. The base case of i =0and i =1can be checked using the fact that U0(x)=1. The inductive case can be proven by isolating each of the errj terms and checking that the coefficients satisfy the recurrence for Chebyshev polynomials of the second kind, which is the same as the recurrence for Chebyshev polynomials of the first kind.

Ti+1 ()(x˜i+1 xi+1) = Ti+1 () x˜i+1 Ti+1 () xi+1 =2Ti ()(x˜i ZB (Ax˜i b)) Ti 1 () x˜i 1 + Ti+1 () erri+1 (2Ti ()(xi ZB (Axi b)) Ti 1 () xi 1) =2Ti ()(x˜i 1 xi 1 ZB (x˜i xi)) Ti 1 () A (x˜i 1 xi 1)+Ti+1 () erri+1 1/2 1/2 1/2 1/2 =2Ti () A† I A ZBA A (x˜i 1 xi 1) Ti 1 ()(x˜i 2 xi 2)+Ti+1 () erri+1 ⇣ ⌘ Substituting in the inductive hypothesis gives:

Ti+1 ()(x˜i+1 xi+1) i 1/2 1/2 1 1/2 1/2 1/2 1/2 1 1/2 1/2 =2A† I A B A A A† Tj () Ui j I A B A A errj j=1 ⇣ ⌘ X ⇣ ⇣ ⌘⌘ i 1 1/2 1/2 1 1/2 1/2 Tj () A† Ui 1 j I A B A A errj + Ti+1 () erri+1 j=1 X ⇣ ⇣ ⌘⌘

As U 1(x)=0, we can also include in the j = i term in the second summation:

Ti+1 ()(x˜i+1 xi+1) i 1/2 1/2 1 1/2 1 1 = A† 2Tj () I A B A A Ui j I B A j=1 ✓ ✓ ◆ X ⇣ ⌘ 1/2 1 1/2 1/2 Ui j 1 I A B A A errj + Ti+1 () erri+1 i ⇣ ⇣ ⌘⌘⌘ 1/2 1 1/2 1/2 1/2 = Tj () A† Ui+1 j I B A A errj + Ti () A† A erri j=1 X

165 1 1/2 1 1/2 Since U0(x)=1, U0 I A B A = I, and we may multiply the last term by it. Hence the inductive hypothesis⇣ ⇣ holds for i +1as⌘⌘ well. ⌅ Using this Lemma as well as the relations between the two kinds of Chebyshev polynomials given in Fact D.2.4, we can prove that the error in the A-norm is amplified by at most poly (i).

Lemma D.3.3 The accumulation of errors after i iterations can be bounded by:

x˜i xi 8i errj k kA  k kA j X

Proof By the bound on total error given in Lemma D.3.2 above, and the property of norms, we have:

i 1 1/2 1 1/2 x˜i xi A = Tj () A†1/2Ui j I A B A errj k k Ti () j=1 ⇣ ⇣ ⌘⌘ A X Tj() 1/2 1/2 1 1/2 1/2 A† Ui j I A B A A errj  T () A j i X ⇣ ⇣ ⌘⌘ 1/2 1/2 1 1/2 8 A† Ui j I A B A A errj By Fact D.2.3  A j ⇣ ⇣ ⌘⌘ X Applying the decomposition of Ui(x) into a sum of Tj(x) from Fact D.2.4 gives:

i j 1/2 1/2 1 1/2 1/2 1/2 1/2 1 1/2 1/2 A† Ui j I A B A A errj A† Tk I A B A A errj A  A ⇣ ⇣ ⌘⌘ k=0 ⇣ ⇣ ⌘⌘ X Once again we apply the shrinkage properties given in Lemma D.2.6 to each of the terms:

2 1/2 1/2 1 1/2 1/2 A† Tk I A B A A errj A 2 T 1⇣/2 ⇣ 1/2 1 1/2 ⌘⌘ 1/2 = err A I A B A A errj j T ⇣ 2 ⌘ err Aerrj = errj  j k kA Taking square roots of both sides and summing gives the overall bound. ⌅ Proof of Lemma 2.6.3: The properties of the exact linear operator follows from Lemma 2.1.1 1/2 and the condition that A† 2Z 4A†. The total deviation is given by Lemmas D.3.1 and D.3.3: B

x˜i xi 8i errj k kA  k kA j X i 8i 9✏ O()✏  · · 1  1

166 The vector passed to SOLVEB, at iteration i is:

(1) b0 = Ax˜i b + err i (1) = A (x˜i x¯)+err i (1) (1) Since erri ✏m and B A  I, erri p✏m ✏1. It remains to bound the 2  · B   † ˜ ¯ † † B -norm of A(xi x) The condition A B gives B A and:

A (x˜i x¯) = (x˜i x¯) AB†A (x˜i x¯) k kB† q (x˜i x¯) AA†A (x˜i x¯)  = qx˜i x¯ k kA This can in turn be decomposed using the triangle inequality:

x˜i x¯ x˜i xi + xi x¯ k kA k kA k kA

The first term was bounded above by O()✏1, while the second term can be bounded by b by k kA† the convergence of exact Chebyshev iteration in Lemma 2.1.1. Combining these gives:

b0 b + O (✏1) k kB† k kA†

These bounds also allows us to bound the `2-norm, and in turn magnitudes entries of all inter- 1 mediate vectors. Then given assumption that all non-zero eigenvalues of A are between and implies:

x˜i , Ax˜i p x˜i k k2 k k2  k kA p ( x˜i x¯ x¯)  k kA pO ( b + ✏1)  k kA† (1) b0 2 A (x˜i x¯) 2 + erri k k k k 2 p x˜i x¯ + ✏m  k kA O p ( b + ✏1)  k kA† ⇣ ⌘ ⌅

167 168