Acta Numerica (2014), pp. 289–368
© Cambridge University Press, 2014
doi:10.1017/S0962492914000051
Printed in the United Kingdom

Topological pattern recognition for point cloud data∗

Gunnar Carlsson†
Department of Mathematics, Stanford University, CA 94305, USA
E-mail:
[email protected]

In this paper we discuss the adaptation of the methods of homology from algebraic topology to the problem of pattern recognition in point cloud data sets. The method is referred to as persistent homology, and has numerous applications to scientific problems. We discuss the definition and computation of homology in the standard setting of simplicial complexes and topological spaces, then show how one can obtain useful signatures, called barcodes, from finite metric spaces, thought of as sampled from a continuous object. We present several different cases where persistent homology is used, to illustrate the different ways in which the method can be applied.

∗ Colour online for monochrome figures available at journals.cambridge.org/anu.

CONTENTS
1 Introduction
2 Topology
3 Shape of data
4 Structures on spaces of barcodes
5 Organizing data sets
References

1. Introduction

Deriving knowledge from large and complex data sets is a fundamental problem in modern science. All aspects of this problem need to be addressed by the mathematical and computational sciences. There are various different aspects to the problem, including devising methods for (a) storing massive amounts of data, (b) efficiently managing it, and (c) developing understanding of the data set. The past decade has seen a great deal of development of powerful computing infrastructure, as well as methodologies for