Matroids and Canonical Forms: Theory and Applications
Total Page:16
File Type:pdf, Size:1020Kb
MATROIDS AND CANONICAL FORMS: THEORY AND APPLICATIONS Gregory F Henselman-Petrusek A DISSERTATION in Electrical and Systems Engineering Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy 2017 Supervisor of Dissertation Robert W Ghrist, Andrea Mitchell University Professor of Mathematics and Electrical and Systems Engineering Graduate Group Chairperson Alejandro Ribeiro, Rosenbluth Associate Professor of Electrical and Systems Engineering arXiv:1710.06084v3 [math.CT] 19 Apr 2020 Dissertation Committee: Chair: Alejandro Ribeiro, Rosenbluth Associate Professor of Electrical and Systems Engineering Supervisor: Robert W Ghrist, Andrea Mitchell University Professor of Mathematics and Electrical and Systems Engineering Member: Rakesh Vohra, George A. Weiss and Lydia Bravo Weiss University Professor of Economics and Electrical and Systems Engineering MATROIDS AND CANONICAL FORMS: THEORY AND APPLICATIONS COPYRIGHT 2017 Gregory F Henselman For Linda iii GIATHGIAGIAMOUPOUMOUEDW SETOMUALOGIATONPAPPOUM OUPOUMOUEDWSETOPNEUMA TOUGIATHMHTERAMOUPOUMO UEDWSETHAGAPHTHSGIATONP ATERAMOUPOUMOUEDWSETH NELPIDATOUGIATHNADELFHM OUPOUKRATAEITONDROMOMA SKAIGIATONJAKOOPOIOSFWT IZEITAMATIAMASGIAKAITLINT OUOPOIOUHFILIAEINAIHKALU TERHMOUSUMBOULHGIAJANE THSOPOIASTOERGOEINAIAUTO iv ACKNOWLEDGEMENTS I would like to thank my advisor, Robert Ghrist, whose encouragement and support made this work possible. His passion for this field is inescapable, and his mentorship gave purpose to a host of burdens. A tremendous debt of gratitude is owed to Chad Giusti, whose sound judgement has been a guiding light for the past nine years, and whose mathematical insights first led me to consider research in computation. A similar debt is owed to the colleagues who shared in the challenges of the doctoral program: Jaree Hudson, Kevin Donahue, Nathan Perlmutter, Adam Burns, Eusebio Gardella, Justin Hilburn, Justin Curry, Elaine So, Tyler Kelly, Ryan Rogers, Yiqing Cai, Shiying Dong, Sarah Costrell, Iris Yoon, and Weiyu Huang. Thank you for your friendship and support at the high points and the low. This work has benefitted enormously from interaction with mathematicians who took it upon themselves to help my endeavors. To Brendan Fong, David Lipsky, Michael Robinson, Matthew Wright, Radmila Sazdanovic, Sanjeevi Krishnan, Paweł Dlotko, Michael Lesnick, Sara Kalisnik, Amit Patel, and Vidit Nanda, thank you. Many faculty have made significant contributions to the content of this text, whether directly or indirectly. Invaluable help came from Ortwin Knorr, whose instruction is a constant presence in my writing. To Sergey Yuzvinsky, Charles Curtis, Hal Sadofsky, Christopher Phillips, Dev Sinha, Nicholas Proudfoot, Boris Botvinnik, David Levin, Peng Lu, Alexander Kleshchev, Yuan Xu, James Isenberg, Christopher Sinclair, Peter Gilkey, Santosh Venkatesh, Matthew Kahle, Mikael Vejdemo- Johansson, Ulrich Bauer, Michael Kerber, Heather Harrington, Nina Otter, Vladimir Itskov, Carina Curto, Jonathan Cohen, Randall Kamien, Robert MacPherson, and Mark Goresky, thank you. Special thanks to Klaus Truemper, whose text opened the world of matroid decomposition to my imagination. Special thanks are due, also, to my dissertation committee, whose technical insights continue to excite new ideas for the possibilities of topology in complex systems, and whose coursework led directly to my work in combinatorics. Thank you to Mary Bachvarova, whose advice was decisive in my early graduate education, and whose friendship offered shelter from many storms. Finally, this research was made possible by the encouragement, some ten years ago, of my advisor Erin McNicholas, and of my friend and mentor Inga Johnson. Thank you for bringing mathematics to life, and for showing me the best of what it could be. Your knowledge put my feet on this path, and your faith was the reason I imagined following it. v ABSTRACT MATROIDS AND CANONICAL FORMS: THEORY AND APPLICATIONS Gregory F Henselman-Petrusek Robert W Ghrist This thesis proposes a combinatorial generalization of a nilpotent operator on a vector space. The resulting object is highly natural, with basic connections to a variety of fields in pure mathematics, engineering, and the sciences. For the purpose of exposition we focus the discussion of applications on homological algebra and computation, with additional remarks in lattice theory, linear algebra, and abelian categories. For motivation, we recall that the methods of algebraic topology have driven remarkable progress in the qualitative study of large, noisy bodies of data over the past 15 years. A primary tool in Topological Data Analysis [TDA] is the homological persistence module, which leverages categorical structure to compare algebraic shape descriptors across multiple scales of measurement. Our principle application to computation is a novel algorithm to calculate persistent homology which, in certain cases, improves the state of the art by several orders of magnitude. Included are novel results in discrete, spectral, and algebraic Morse theory, and on the strong maps of matroid theory. The defining theme throughout is interplay between the combinatorial theory matroids and the algebraic theory of categories. The nature of these interactions is remarkably simple, but their consequences in homological algebra, quiver theory, and combinatorial optimization represent new and widely open fields for interaction between the disciplines. vi Contents 1 Introduction 1 2 Notation 8 I Canonical Forms 10 3 Background: Matroids 11 3.1 Independence, rank, and closure . 11 3.2 Circuits . 12 3.3 Minors . 14 3.4 Modularity . 15 3.5 Basis Exchange . 15 3.6 Duality . 17 4 Modularity 18 4.1 Generators . 18 4.2 Minimal Bases . 23 5 Canonical Forms 24 5.1 Nilpotent Canonical Forms . 24 5.2 Graded Canonical Forms . 27 5.3 Generalized Canonical Forms . 30 II Algorithms 34 6 Algebraic Foundations 35 6.1 Biproducts . 35 6.2 Idempotents . 36 6.3 Arrays . 38 6.4 Kernels . 39 6.4.1 The Splitting Lemma . 41 6.4.2 The Exchange Lemma . 41 6.4.3 Idempotents . 45 6.4.4 Exchange . 46 vii 6.5 The Schur Complement . 47 6.5.1 Diagramatic Complements . 50 6.6 Mobius¨ Inversion . 51 7 Exchange Formulae 54 7.1 Relations . 54 7.2 Formulae . 56 8 Exchange Algorithms 60 8.1 LU Decomposition . 60 8.2 Jordan Decomposition . 62 8.3 Filtered Exchange . 67 8.4 Block exchange . 69 III Applications 71 9 Efficient Homology Computation 72 9.1 The linear complex . 72 9.2 The linear persistence module . 74 9.3 Homological Persistence . 75 9.4 Optimizations . 76 9.4.1 Related work . 76 9.4.2 Basis selection . 77 9.4.3 Input reduction . 79 9.5 Benchmarks . 82 10 Morse Theory 84 10.1 Smooth Morse Theory . 85 10.2 Discrete and Algebraic Morse Theory . 87 10.3 Morse-Witten Theory . 90 11 Abelian Matroids 100 11.1 Linear Matroids . 100 11.2 Covariant Matroids . 102 viii List of Tables 9.1 Wall-time in seconds. Comparison results for eleg, Klein, HIV, drag 2, and random are reported for computation on the cluster. Results for fract r are reported for computation on the shared memory system. 82 9.2 Max Heap in GB. Comparison results for eleg, Klein, HIV, drag 2, and random are reported for computation on the cluster. Results for fract r are reported for computation on the shared memory system. 83 ix List of Figures 6.1 Arrays associated to the coproduct structures l and u. Left: 1(u;l). The pth row p p ] n of this array is the unique tuple u such that hu ;wi = up(w) for all w 2 R . Right: 1(l;u). The pth column of this array is the tuple up = up(1). The matrix products 1(u;l)1(l;u) = 1(u;u) and 1(l;u)1(u;l) = 1(l;l) are Dirac delta functions on u × u and l × l, respectively. 40 8.1 Morphisms generating a coproduct structure on Kn+1 = D(T)........... 64 9.1 A population of 600 points sampled with noise from the unit circle in the Euclidean plane. Noise vectors were drawn from the uniform distribution on the disk of radius 0.2, centered at the origin. 79 9.2 Spanning trees in the one-skeleton of the Vieotris-Rips complex with scale pa- rameter 0.5 generated by the point cloud in Figure 9.1. The cardinalities of the fundamental circuits determined by the upper basis sum to 7:3×105, with a median circuit length of 11 edges. The cardinalities of the fundamental circuits determined by the lower basis sum to 4:9 × 106, with a median length of 52 edges. 80 9.3 (a) Sparsity pattern of a subsample of 1200 columns selected at random from the row-reduced node incidence array determined by spanning tree (a) in Figure 9.2. The full array has a total of 7:3 × 105 nonzero entries. (b) Sparsity pattern of a subsample of 1200 columns selected at random from the row-reduced node incidence array determined by the spanning tree (b) in Figure 9.2. The full array has a total of 4:9 × 106 nonzero entries. 81 x Chapter 1 Introduction Motivation: computational homology In the past fifteen years, there has been a steady advance in the use of techniques and principles from algebraic topology to address problems in the data sciences. This new subfield of Topological Data Analysis [TDA] seeks to extract robust qualitative features from large, noisy data sets. At the simplest and most elementary level, one has clustering, which returns something akin to connected components. There are, however, many higher-order notions of global features in connected components of higher-dimensional data sets that are not describable in terms of clustering phenomena. Such “holes” in data are quantified and collated by the classical tools of algebraic topology: homology, and the more recent, data-adapted, parameterized version, persistent homology [16, 26, 38]. Homology and persistent homology will be described in Chapter 9. For the moment, the reader may think of homology as an enumeration of “holes” in a data set, outfitted with the structure of a sequence of vector spaces whose bases identify and enumerate the “essential” holes in a data set.