Topological Graph Persistence
CAIM ISSN 2038-0909
Commun. Appl. Ind. Math. 11 (1), 2020, 72–87
DOI: 10.2478/caim-2020-0005
Research Article

Mattia G. Bergomi1*, Massimo Ferri2*, Lorenzo Zuffi2
1Veos Digital, Milan, Italy
2ARCES and Dept. of Mathematics, Univ. of Bologna, Italy
*Email address for correspondence: [email protected]

Communicated by Alessandro Marani
Received on 04 28, 2020. Accepted on 11 09, 2020.

Abstract
Graphs are a basic tool in modern data representation. The richness of the topological information contained in a graph goes far beyond its mere interpretation as a one-dimensional simplicial complex. We show how topological constructions can be used to gain information otherwise concealed by the low-dimensional nature of graphs. We do this by extending previous work in homological persistence, and proposing novel graph-theoretical constructions. Beyond cliques, we use independent sets, neighborhoods, enclaveless sets and a Ramsey-inspired extended persistence.

Keywords: Clique, independent set, neighborhood, enclaveless set, Ramsey
AMS subject classification: 68R10, 05C10

1. Introduction
Currently data are produced massively and rapidly. A large part of these data is either naturally organized, or can be represented, as graphs or networks. Recently, topological persistence has proved to be an invaluable tool for the exploration and understanding of these data types. Although graphs can be considered as topological objects per se, their low dimensionality means that only limited information can be obtained by studying their topology.

It is possible to grasp more information by superposing higher-dimensional topological structures on a given graph. Many lines of research, for example, have focused on building $n$-dimensional complexes from graphs by considering their $(n+1)$-cliques (see, e.g., Section 1.1). Here, we explore how topological persistence can be used to study both classical and new simplicial complexes drawn from graphs.
In particular, we will focus on the novel information disclosed by graph-theoretical concepts that, to our knowledge, have so far been neglected in the literature. In Section 2 we set basic terminology and notation for simplicial complexes and graphs. Section 3 is devoted to the definition of several constructions of simplicial complexes based on as many basic graph-theoretical concepts, namely: the simplicial complexes of cliques, neighborhoods, enclaveless and independent sets. For each construction, we discuss both the simplicial complex representation and its filtration. The stability with respect to the bottleneck distance is examined in a separate subsection. Finally, we present an "extended persistence"-like construction based on the Ramsey principle.

1.1. State of the art
Graphs and (what would eventually be called) persistence have been bound together since the early days in the context of size theory [51]. Graphs were a tool for managing the discretization of filtered spaces in applied contexts. As far as we know, a true use of persistence in the study of graphs per se started with [25], where the clique and neighborhood complexes were built on a time-varying network for application in statistical mechanics (see also [34]). A line of research aiming at the physical application of persistence to polymer models of hypergraphs is developed in [1].

© 2020 Mattia G. Bergomi, Massimo Ferri, Lorenzo Zuffi, licensee De Gruyter Open. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.

Complex networks have been studied, more or less directly, with persistent homology also in [26, 37]. In both cases, the main example is a network of collaborating people; simplices are formed on the basis of relationship measures between members.
A study of different classes of networks through persistent homology of the clique complex is the object of [45], where the relation with the small-world paradigm [52] is also considered. Various types of networks are analyzed via persistence of clique complexes in [38].

Connectivity of neural networks is naturally modelled as a graph. Associating a simplicial complex to such graphs allows one to gather relevant information about the network, as discussed in [21]. The same paper introduces several simplicial constructions associated to graphs and evaluates their efficacy in modelling different neurological features. In this domain, the clique complex plays a major role. This is the case of [46], where a graph is built by taking 83 brain regions as vertices, with edges weighted by pairwise region connections through white matter. Correlations of hippocampal place cells in rodents during spatial navigation [36] are the subject of [22]. Even in this case, the clique topology is used. Clique complexes, persistence and particular homology generators are the keys to a study on the effect of psilocybin on human brain connections in [31].

When it comes to directed graphs, a challenge is presented by asymmetry. The problem of adapting persistence to this setting was addressed in [50] through the directed clique complex (see, e.g., [35] for this construction) and in [9] through Dowker complexes (for which we refer to [16]). An exhaustive, functorial answer is given in [10] by persistent path homology. The homology of directed cliques (with no use of persistence) is the tool for neurological exploration in [15, 42]. In the present paper, however, we will consider exclusively undirected graphs.

1.2. Contributions
Persistent homology has been widely applied to the analysis of graphs and networks, mainly by associating auxiliary topological constructions to a given graph. However, a more theoretical topological analysis of these constructions is still lacking.
We review two known strategies that allow one to associate a simplicial complex to a graph by considering its cliques [35, 40] and neighborhoods [25, 33]. Then, we introduce two new constructions that leverage the concepts of dominating and independent sets of vertices. We prove that these constructions are suitable for persistent homology, i.e. the underlying graph-theoretical concepts give rise to well defined filtered simplicial complexes. We also investigate the topological richness of each construction. To do this, we ask whether, given a persistence diagram, there exists a graph that induces it through the construction of choice.

Afterwards, we introduce the notion of natural pseudodistance for weighted graphs, borrowing from its definition for topological spaces [13], and discuss the stability of the persistence diagrams obtained from graphs through the aforementioned constructions.

Finally, we introduce the concept of extended persistence on the complex of cliques associated to a weighted graph. The extension consists in considering at the same time the information carried by the complexes of cliques associated to both a graph and its complement. We show with an example how this information is not redundant.

2. General preliminaries
We fix terminology for simplicial complexes and graphs in Sections 2.1 and 2.3, respectively.

2.1. Simplicial complexes
First, we recall that a simplicial complex $K$ (an abstract simplicial complex in the terminology of many authors) is a set of simplices, where a simplex is a finite set of elements (vertices) of a given set $V(K)$, such that
(i) any set of exactly one vertex is a simplex of $K$;
(ii) any nonempty subset of a simplex $\sigma$ of $K$ is a simplex of $K$; it is called a face of $\sigma$ [47, Sect. 3.1].
A simplex consisting of $n+1$ vertices is said to have dimension $n$ and to be an $n$-simplex. The dimension of $K$ is the maximum dimension of its simplices.
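Conditions (i) and (ii) above, together with the notion of dimension, can be checked mechanically on small examples. The following is a minimal sketch, assuming a finite complex stored as a Python set of frozensets; the helper names `closure`, `is_simplicial_complex` and `dimension` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def closure(generating_simplices):
    """Return every nonempty face of the given simplices."""
    K = set()
    for s in generating_simplices:
        s = tuple(sorted(s))
        for k in range(1, len(s) + 1):
            K.update(frozenset(c) for c in combinations(s, k))
    return K


def is_simplicial_complex(K, vertices):
    """Check conditions (i) and (ii) for a finite family K of frozensets."""
    # (i) every set of exactly one vertex is a simplex of K
    if any(frozenset([v]) not in K for v in vertices):
        return False
    # (ii) every nonempty proper subset of a simplex is again a simplex
    for s in K:
        for k in range(1, len(s)):
            if any(frozenset(c) not in K for c in combinations(s, k)):
                return False
    return True


def dimension(K):
    """An n-simplex has n + 1 vertices; dim K is the largest such n."""
    return max(len(s) for s in K) - 1


# A filled triangle <0, 1, 2> with an extra edge <2, 3> attached.
K = closure([{0, 1, 2}, {2, 3}])
print(len(K), dimension(K), is_simplicial_complex(K, {0, 1, 2, 3}))

# Removing the edge <0, 1> while keeping the triangle violates (ii).
broken = K - {frozenset({0, 1})}
print(is_simplicial_complex(broken, {0, 1, 2, 3}))
```

Storing simplices as frozensets mirrors the definition of a simplex as a finite set of vertices, so faces are simply subsets and no orientation or ordering is implied.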
A subcomplex of $K$ is a subset of $K$ which is itself a simplicial complex. For the sake of clarity, the simplex having vertices $v_0, \dots, v_n$ will be denoted by $\langle v_0, \dots, v_n \rangle$. In the remainder, all simplicial complexes will be finite.

A simplicial map from a simplicial complex $K$ to a simplicial complex $L$ is a map $h : K \to L$ such that (i) the image of a vertex of $K$ is a vertex of $L$ and (ii) if $\sigma = \langle v_0, \dots, v_n \rangle$, then $h(v_0), \dots, h(v_n)$ are the (not necessarily distinct) vertices of $h(\sigma)$. A bijective simplicial map is said to be an isomorphism. If there exists an isomorphism from $K$ to $L$, then $K$ and $L$ are said to be isomorphic.

A standard way (actually a functor) to associate a topological space $|K|$ (the space of $K$) to a simplicial complex $K$ is through barycentric coordinates [47, Sect. 3.1]. $|K|$ is defined as the set of functions $\alpha : K \to [0, 1]$ such that:
• for any $\alpha$, the set $\{v \in K \mid \alpha(v) \neq 0\}$ is a simplex of $K$;
• for any $\alpha$, one has $\sum_{v \in K} \alpha(v) = 1$.
The topology comes from the $L^2$ metric, but the most usual way of thinking of $K$ in geometrical terms is by its possible embeddings into a Euclidean space [47, Sect. 3.2, Thm. 9]. This is what we shall do in both text and figures throughout the article. We refer to [47] for terminology and notions of simplicial and algebraic topology; another very nice reference is [24].

2.2. Persistent homology
Persistent homology is a branch of computational topology of remarkable success in shape analysis and pattern recognition. Its key idea is to analyze data through filtering functions, i.e. continuous functions $f$ defined on a suitable topological space $X$ with values e.g. in $\mathbb{R}$ (but sometimes in $\mathbb{R}^n$ or in a circle). Given a pair $(X, f)$, with $f : X \to \mathbb{R}$ continuous, for each $u \in \mathbb{R}$ the sublevel set $X_u$ is the set of elements of $X$ whose value through $f$ is less than or equal to $u$. For each $X_u$ one can compute the homology modules $H_k(X_u)$.
As there exist various homology theories, some additional hypotheses might be imposed on f depending on the choice of the homology.
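To make the notion of sublevel set concrete, the following is a minimal sketch in a simplicial setting. It rests on assumptions that are not part of the text above: $f$ is given only on vertices and extended to each simplex by taking the maximum over its vertices, and only the rank of $H_0$ (the number of connected components) is computed, here with a union-find structure. The function names `sublevel` and `betti0` are hypothetical.

```python
def sublevel(K, f, u):
    """Simplices of K all of whose vertices have f-value <= u.

    Assumption: f is extended from vertices to simplices by the maximum.
    """
    return {s for s in K if max(f[v] for v in s) <= u}


def betti0(K):
    """Number of connected components of K (rank of H_0), via union-find."""
    parent = {v: v for s in K for v in s}

    def find(v):
        # Follow parent pointers to the root, halving paths on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for s in K:
        vs = list(s)
        for w in vs[1:]:
            # Vertices sharing a simplex lie in the same component.
            parent[find(vs[0])] = find(w)
    return len({find(v) for v in parent})


# Path graph a - b - c seen as a one-dimensional simplicial complex.
K = {frozenset(s) for s in ["a", "b", "c", "ab", "bc"]}
f = {"a": 0, "b": 2, "c": 1}

for u in (0, 1, 2):
    X_u = sublevel(K, f, u)
    print(u, sorted("".join(sorted(s)) for s in X_u), betti0(X_u))
```

In this toy example the sublevel sets are nested as $u$ grows: a second component appears at $u = 1$ and merges with the first at $u = 2$, which is the kind of birth and death event that persistence records.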