Matching Matchings

20132013 IEEE Third Third Eastern Eastern European European Regional Regional Conference Conference on onthe the Engineering Engineering of Computerof Computer Based Based Systems Systems Matching matchings Gabor´ Bacso´ Anita Keszler Zsolt Tuza Laboratory of Parallel and Distributed Events Analysis Renyi´ Institute, Budapest; Distributed Systems, MTA SZTAKI Research Group, MTA SZTAKI University of Pannonia, Veszprem´ [email protected] [email protected] [email protected] Abstract—This paper presents the first steps toward a graph be the same, just similar enough. This is the reason why comparison method based on matching matchings, or in other these algorithms are often referred to as error tolerant or words, comparison of independent edge sets in graphs. The approximate graph matchings. novelty of our approach is to use matchings for calculating distance of graphs in case of edge-colored graphs. This idea can The exact subgraph matching for arbitrary graphs is NP- be used as a preprocessing step of graph querying applications, complete [12]. An experimental comparison on the running to speed up exact and inexact graph matching methods. We time of some exact graph matching methods is presented in introduce the notion of colored matchings and prove some [10]. However, in case of special graph classes, for example properties of them in edge colored complete graphs and planar graphs, there exist algorithms with polynomial run- complete bipartite graphs in case of two colors. ning time [16]. We remark here that the following statement Keywords-edge-colored graph; inexact graph matching; col- is an old conjecture: the general isomorphism problem is ored matching neither polynomial nor NP-complete (it is in NP, of course). Although several approaches are also known for speeding I. INTRODUCTION up isomorphism testing as well—for example a heuristic Graph based representation has become one of the main based method in [20] or [13] using random walks—in directions of modeling in pattern recognition during the general for arbitrary graphs inexact graph matching methods last few decades. The main reason of the growing interest have become more popular. These methods also have to in graph based modeling and algorithms is the variety of deal with computational complexity issues (see [1]), but in available graph models leading to expressive and compact case of real datasets and applications flexibility and error data representations. Another motivation is that many graph tolerance are required. based pattern recognition methods have low computational Depending on the nature of application the applied inexact cost. For example graph cut based methods ([21], [17]) or graph matching methods are also varied. In case of image minimum weight spanning tree based algorithms ([15], [14]) comparison or object categorization simple structures, such are applied often in computer vision. as trees are compared (see [22]). Image processing tasks are Graph comparison is a frequently appearing problem in typical examples for the case when the shape of the graphs graph based pattern recognition applications. Graph com- can also be important, since vertices have coordinates (see parison or as it is often called graph matching is an [2]). essential part of algorithms applied in image retrieval, or in However, the most frequently applied approaches are to comparison of molecular compounds, just to mention some compare graphs using a distance measure based on graph application areas. Due to its high importance in theoretical edit distance ([29], [28]) or a maximum common subgraph approaches and engineering applications as well, several ([9]). In case of these metrics, the position of the vertices is papers have investigated this topic, see [5]. irrelevant. The main drawback of matching graphs is the high A detailed survey on graph edit distance is presented computational complexity, since most problems related to in [11]. Despite the number of papers that are concerned this topic belong to the NP-complete problem class. with this topic, very few contributions can be found in The idea is that the objects (fingerprints [25], business the literature about learning the parameters that control the processes [7], molecular compounds, shapes, etc.) are rep- matching ([26], [18]). resented by graphs, and the comparison of these objects is In [3] the author analyzes the connection between the two done by comparing the corresponding graphs. distance measures. As mentioned, matching graphs is a hard problem from Our suggestion is to define a distance function between algorithmic point of view. Two types of graph matching graphs based on a special type of maximum common sub- are usually distinguished: exact and inexact matching. Exact graph searching: finding the maximum common matching matching is also called graph isomorphism. In case of in edge colored graphs. inexact matching, we do not require the two graphs to The paper is organized as follows. In Section II we present 978-0-7695-5064-0/13 $26.00 © 2013 IEEE 85 DOI 10.1109/ECBS-EERC.2013.19 some basic definitions and notation. Section III presents the size of the available input datasets have increased our idea of comparing graphs by matching matchings: rapidly in several areas applying graph-based modeling (web Subsection III-A contains our suggestion in case of graphs analysis, protein-protein interaction networks, etc.). This without edge colors, and Subsection III-B analyzes the case naturally requires the development of efficient graph storing of edge-colored graphs. Some interesting properties of 2- and searching techniques. For example graph indexing and edge-colored complete and complete bipartite graphs are querying receives more and more attention, see [31] or [27]. presented in Section IV. The suggested algorithm for finding Testing relatively easily computable features of graphs help colored matchings in l-edge-colored graphs is introduced reducing the search space (branch-and-bound or tree pruning in Section V with some remarks on special graph classes. techniques). In our case, a pruning condition is the size of Section VI presents test results on evaluating the usefulness the matching in the query graph and the ones in the graph of comparing matchings. Section VII concludes our work database. Comparing a simple structural property can speed and also points out to some directions for future research. up exact and inexact graph matching techniques as well. Let the distance between two graphs be derived from the II. DEFINITIONS AND NOTATION difference of the size of their maximum matchings. That is, =( , ) A simple undirected graph is an ordered pair G V E , let G1 and G2 be two arbitrary graphs. The distance between where( V) = {v1,v2,...,vn} denotes the set of vertices, and these graphs is the following: ⊆ V E 2 (a collection of unordered vertex pairs) denotes the set of edges. The edge between vertex v and v is i j D(G1,G2)=abs(∣M1∣−∣M2∣) (1) denoted by (vi,v j)=eij.Avertexv is incident to edge e if v ∈ e. The number of vertices is called the order of the where ∣Mi∣ is the maximum size of a matching in graph Gi. graph. The complete graph (or clique) Kn on n vertices is B. Comparing matchings of edge colored graphs the graph where each vertex pair is adjacent: (vi,v j) ∈ E for all vi,v j ∈ V. A graph is bipartite if its vertex set V Investigation of matchings in graphs is an extensively can be partitioned into two disjoint sets A and B, such that studied topic, however the main directions of research take each edge in E connects a vertex in A to a vertex in B. graphs into consideration without edge colors. One of the In notation we write G =(A,B,E). Note that the vertex novel aspects of our approach is to compare colored match- bipartition A∪B = V is uniquely determined by a bipartite G ings as well. if and only if G is connected. The complete bipartite graph Definition 1. An edge colored (or edge labeled) graph K , is a bipartite graph, where ∣A∣ = m, ∣B∣ = n and each m n (V,E,c) is a graph such that color c(e ) is the color vertex in A is adjacent to each vertex in B. In an arbitrary ij assigned to edge e . graph two edges are said to be independent if they do not ij have a common vertex. A matching is a set of pairwise Hence, no restriction is put on the color assignments here independent edges. A matching is called a perfect matching (contrary to ‘proper edge coloring’ and many other notions). if every vertex of the graph is incident to exactly one edge of Edge colored graphs offer more possibilities for compar- it. For further introduction to graph theory and algorithmic ing matchings, or calculating the distance of graphs based on complexity, see for example [6]. matchings, than the ones without edge colors. The first idea is to extend Equation 1, to handle more colors, as follows: III. COMPARING MATCHINGS OF TWO GRAPHS √ A. Comparing matchings of graphs without edge colors nc D1 (G ,G )= ∑ w (∣M , ∣−∣M , ∣)2 (2) Finding the largest common subgraph of two graphs is in color 1 2 i ci 1 ci 2 i=1 general an NP-hard problem. Our suggestion is to modify (or th specialize) the idea of finding the largest common subgraph where nc is the number of colors, ci is the i color, and ∣ ∣ = , ≤ ≤ to finding the largest common matching of two graphs. Mci, j (for j 1 2 and 1 i nc) is the size of a maximum Matchings are an appropriate choice for comparing graphs matching in the subgraph of G j containing only the edges without colors, since it is relatively easy to find a matching with color ci.

Matching Matchings

Rainbow Matchings and Transversals∗

Parameterized Algorithms and Kernels for Rainbow Matching

Existences of Rainbow Matchings and Rainbow Matching Covers Lo, Allan

Complexity Results for Rainbow Matchings

Benny Sudakov Curriculum Vitae

On Latin Squares and Avoidable Arrays for Daniel, and for What Springtime Brings

Large Matchings in Bipartite Graphs Have a Rainbow Matching

Complexity Results for Rainbow Matchings 3

Multicolored Matchings in Hypergraphs, Moscow Journal Of

Size Conditions for the Existence of Rainbow Matchings

Minimum Color-Degree Perfect B-Matchings

Vertices of a Bipartite Graph G and the Coalitions Are the Edges of G, the Stable Marriage Theorem Says There Exists a Stable Allocation