<<

On the unification of the graph and graph matching problems Romain Raveaux

To cite this version:

Romain Raveaux. On the unification of the graph edit distance and graph matching problems. Pattern Recognition Letters, Elsevier, 2021, 145, ￿10.1016/j.patrec.2021.02.014￿. ￿hal-03163084v2￿

HAL Id: hal-03163084 https://hal.archives-ouvertes.fr/hal-03163084v2 Submitted on 19 Mar 2021

HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. On the unification of the graph edit distance and graph matching problems

Romain Raveaux1

1Universit´ede Tours, Laboratoire d’Informatique Fondamentale et Appliqu´eede Tours (LIFAT - EA 6300), 64 Avenue Jean Portalis, 37000 Tours, France

7th March 2020

Abstract Error-tolerant graph matching gathers an important family of problems. These problems aim at finding correspondences between two graphs while integrating an error model. In the Graph Edit Distance (GED) problem, the insertion/deletion of edges/nodes from one graph to another is explicitly expressed by the error model. At the opposite, the problem commonly referred to as “graph matching” does not explicitly express such operations. For decades, these two problems have split the research community in two separated parts. It resulted in the design of different solvers for the two problems. In this paper, we propose a unification of both problems thanks to a single model. We give the proof that the two problems are equivalent under a reformulation of the error models. This unification makes possible the use on both problems of existing solving methods from the two communities. Keywords— Graph edit distance, graph matching, error-correcting graph matching, discrete optimiza- tion

Graphical Abstract

1 1 Introduction then Yi,k = 1, and Yi,k = 0 otherwise. We denote by y ∈ {0, 1}n1.n2, a column-wise vectorized replica Graphs are frequently used in various fields of com- of Y . With this notation, graph matching problems puter science, since they constitute a universal mod- can be expressed as finding the assignment vector ∗ eling tool which allows the description of structured y that maximizes a score function S(G1,G2, y) as data. The handled objects and their relations are follows: described in a single and human-readable formal- ism. Hence, tools for graphs supervised classification Model 1. Graph matching model (GMM) and graph mining are required in many applications ∗ such as pattern recognition (Riesen, 2015), chemi- y =argmax S(G1,G2, y) (1a) cal components analysis, transfer learning (Das and y Lee, 2018). In such applications, comparing graphs subject to yi,k ∈ {0, 1} ∀(ui, vk) ∈ V1 × V2 (1b) is of first interest. The similarity or dissimilarity be- X yi,k ≤ 1 ∀vk ∈ V2 (1c) tween two graphs requires the computation and the ui∈V1 evaluation of the “best” matching between them. X Since exact isomorphism rarely occurs in pattern yi,k ≤ 1 ∀ui ∈ V1 (1d) analysis applications, the matching process must be vk∈V2 error-tolerant, i.e., it must tolerate differences on the where equations (1c),(1d) induces the matching topology and/or its labeling. The Graph Edit Dis- constraints, thus making y an assignment vector. tance (GED)(Riesen, 2015) problem and the Graph The function S(G ,G , y) measures the similarity Matching problem (GM) (Swoboda et al., 2017) pro- 1 2 of graph attributes, and is typically decomposed into vide two different error models. These two problems a first order similarity function s(u → v ) for a have been deeply studied but they have split the re- i k node pair u ∈ V and v ∈ V , and a second-order search community into two groups of people devel- i 1 k 2 similarity function s(e → e ) for an edge pair e ∈ oping separately quite different methods. ij kl ij E and e ∈ E . Thus, the objective function of In this paper, we propose to unify the GED prob- 1 kl 2 graph matching is defined as: lem and the GM problem in order to unify the work force in terms of methods and benchmarks. We show X X S(G ,G , y) = s(u → v ) · y that the GED problem can be equivalent to the GM 1 2 i k i,k u ∈V v ∈V problem under certain (permissive) conditions. The i 1 k 2 X X paper is organized as follows: Section 2, we give the + s(eij → ekl) · yik · yjl definitions of the problems. Section 3, the state of eij ∈E1 ekl∈E2 the art on GM and GED is presented along with the (2) literature comparing GED and GM to other prob- In essence, the score accumulates all the similarity lems. Section 4, a specific related work is detailed values that are relevant to the assignment. The GM since it is the basement of our reasoning. Section problem has been proven to be NP-hard by (Garey 5, our proposal is described and a proof is given. and Johnson, 1979). Section 6, experimental results are presented to val- idate empirically our proposal. Finally, conclusions are drawn. 2.2 Graph Edit Distance The graph edit distance (GED) was first reported in (Tsai et al., 1979). GED is a dissimilarity mea- 2 Definitions and problems sure for graphs that represents the minimum-cost sequence of basic editing operations to transform a In this section, we define the problems to be studied. graph into another graph by means classically in- An attributed graph is considered as a set of 4 tuples cluded operations: insertion, deletion and substitu- (V ,E,µ,ζ) such that: G = (V ,E,µ,ζ). V is a set of tion of vertices and/or edges. Therefore, GED can vertices. E is a set of edges such as E ⊆ V × V . µ be formally represented by the minimum cost edit is a vertex labeling function which associates a label path transforming one graph into another. Edge to a vertex. ζ is an edge labeling function which operations are taken into account in the matching associates a label to an edge. process when substituting, deleting or inserting their adjacent vertices. From now on and for simplicity, 2.1 Graph matching problem we denote the substitution of two vertices ui and vk by (ui → vk), the deletion of vertex ui by (ui → ) The objective of graph matching is to find correspon- and the insertion of vertex vk by ( → vk). Like- dences between two attributed graphs G1 and G2.A wise for edges eij and ekl,(eij → ekl) denotes edges solution of graph matching is defined as a subset of substitution, (eij → ) and ( → ekl) denote edges possible correspondences Y ⊆ V1 × V2, represented deletion and insertion, respectively. by a binary assignment matrix Y ∈ {0, 1}n1×n2, An edit path (λ) is a set of edit operations o. where n1 and n2 denote the number of nodes in G1 This set is referred to as Edit Path and it is defined and G2, respectively. If ui ∈ V1 matches vk ∈ V2, in Definition1.

2 Definition 1. Edit Path Property 1. The edges matching are driven by the A set λ = {o1, ··· , ok} of k edit operations o that vertices matching transform G1 completely into G2 is called an (com- Assuming that constraint defined in Definition4 plete) edit path. is satisfied then three cases can appear : Case 1: If there is an edge eij = (ui, uj ) ∈ E1 and an Let c(o) be the cost function measuring the edge ekl = (vk, vl) ∈ E2, edges substitution between strength of an edit operation o. Let Γ(G1,G2) be (ui, uj ) and (vk, vl) is performed (i.e., (eij → ekl)). the set of all possible edit paths (λ). The graph edit Case 2: If there is an edge eij = (ui, uj ) ∈ E1 and distance problem is defined by Problem1. there is no edge between vk and vl then an edge dele- tion of (ui, uj ) is performed (i.e., (eij → )). Problem 1. Graph Edit Distance Case 3: If there is no edge between ui and uj Let G1 = (V1,E1,µ1,ζ1) and G2 = (V2,E2,µ2,ζ2) be and there is an edge between and an edge ekl = two graphs, the graph edit distance between G1 and (vk, vl) ∈ E2 then an edge insertion of (vk, vl) is G2 is defined as: performed (i.e., ( → ekl)). X dmin(G1,G2) = min c(o) (3) The GED problem defined in Problem1 and re- λ∈Γ(G1,G2) o∈λ fined with constraints defined in Definitions2,3 and 4 is referred in the literature and in this paper as the GED problem. The GED problem has been proven The GED problem is a minimization problem and to be NP-hard by (Zeng et al., 2009). dmin is the best distance. In its general form, the GED problem (Problem1) is very versatile. The problem has to be refined to cope with the con- 2.3 Related problems and models straints of an assignment problem. First, let us de- fine constraints on edit operations (oi) in Definition GED and GM problems fall into the family of 2. error-tolerant graph matching problems. GED and GM problems can be equivalent to another prob- Definition 2. Edit operations constraints lem called Quadratic Assignment Problem (QAP) (Bougleux et al., 2017; Cho et al., 2013). In addi- 1. Deleting a vertex implies deleting all its inci- tion, GED and GM problems can be equivalent to dent edges. a constrained version of the Maximum a posteriori 2. Inserting an edge is possible only if the two ver- (MAP)-inference problem of a Conditional Random tices already exist or have been inserted. Field (CRF) (Swoboda et al., 2017). All these prob- 3. Inserting an edge must not create more than lems can be expressed by mathematical models. A one edge between two vertices. mathematical model is composed of variables, con- straints and an objective functions. A single prob- lem can be expressed by many different models. An Second, let us define constraints on edit paths (λ) Integer Quadratic Program (IQP) is a model with a in Definition3. This type of constraint prevents the quadratic objective function of the variables and lin- edit path to be composed of an infinite number of ear constraints of the variables. We chose to present edit operations. the GM problem as an IQP (Model1). At the op- posite, an Integer Linear Program (ILP) is a math- Definition 3. Edit path constraints ematical model where the objective function is a 1. k is a finite positive integer. linear combination of the variables. The objective function is constrained by linear combinations of the 2. A vertex/edge can have at most one edit oper- variables. ation applied on it.

Finally, let us define the topological constraint in 3 State of the art Definition4. This type of constraints avoids edges to be matched without respect to their adjacent ver- In this section, the state of the art is presented. tices. First, the solution methods for GED and GM are described. Finally, papers comparing GED to other Definition 4. Topological constraint matching problems are mentioned. The topological constraint implies that matching (substituting) two edges (ui, uj ) ∈ E1 and (vk, vl) ∈ E2 is valid if and only if their incident vertices are 3.1 State of the art on GM and matched (ui → vk) and (uj → vl). GED An important property of the GED can be in- The GED and GM problems have been proven to be ferred from the topological constraint defined in Def- NP-hard. So, unless P = NP, solving the problem inition4. to optimality cannot be done in polynomial time of

3 the size of the input graphs. Consequently, the run- 3.2 State of the art on comparing time complexity of exact methods is not polynomial GED problems to others but exponential with respect to the number of ver- tices of the graphs. On the other hand, heuristics are Neuhaus and Bunke (Neuhaus and Bunke., 2007) used when the demand for low computational time have shown that if each operation cost satisfies the dominates the need to obtain optimality guarantees. criteria of a distance (positivity, uniqueness, symme- try, triangular inequality) then the edit distance de- GM methods Many solver paradigms were put fines a between graphs and it can be inferred to the test for GM. These include relaxations that if GED(G1,G2) = 0 ⇔ G1 = G2. Further- based on Lagrangean decompositions (Swoboda more, it has been shown that standard concepts from et al., 2017; Torresani et al., 2013), convex/concave graph theory, such as , subgraph quadratic (Liu and Qiao, 2014) (GNCCP) and semi- isomorphism, and maximum common subgraph, are definite programming (Schellewald and Schn¨orr, special cases of the GED problem under particular 2005) , which can be used either directly to ob- cost functions (Bunke, 1997, 1999; Brun et al., 2012). tain approximate solutions or just to provide lower bounds. To tighten these bounds several cutting Deadlocks, contributions and motivations plane methods were proposed (Bazaraa and Sherali, From the literature, two main deadlocks can be 1982). On the other side, various primal heuristics, drawn. First, GED and GM problems split the re- both (i) deterministic, such as graduated assign- search community in two parts. People working on ment methods (Gold and Rangarajan, 1996), fixed GED do not work on GM and vice and versa. They point iterations (Leordeanu et al., 2009) (IPFP), do not contribute to the same journals and confer- spectral technique and its derivatives (Cour et al., ences. Second, these two communities do not use 2007; Leordeanu and Hebert, 2005) and (ii) non- the same methods to solve their problem while they deterministic (stochastic), like random walk (Cho have mainly the same applications fields (computer et al., 2010) were proposed to provide approximate vision, chemoinformatics, ...). Researchers work- solutions to the problem. Many of these methods ing on GM problems have concentrated their ef- were published in TPAMI, NIPS, CVPR, ICCV. forts on the QAP and MAP-inference solvers (Frank- Wolfe like methodology (Leordeanu et al., 2009; Liu GED methods Exact GED algorithms were and Qiao, 2014), Lagrangian decomposition meth- proposed based on search (Tsai et al., 1979; ods (Swoboda et al., 2017; Torresani et al., 2013), Riesen et al., 2007; Abu-Aisheh et al., 2015). An- ...). On the other hand, the community working other way to build exact methods is to model the on the GED problem has favored LSAP-based and problem by Integer Linear Programs. Then, a black tree-based methods. box solver is used to obtain solutions (Justice and Our motivation is to gather people working on Hero, 2006; Lerouge et al., 2017). In addition, the GED and GM problems because methods and GED community worked on simplifications of the benchmarks built from one community could help GED problem to the Linear Sum Assignment Prob- the other. A first step forward has been done by lem (LSAP) (Bougleux et al., 2017; Serratosa, 2015; (Bougleux et al., 2017) by modelling the GED prob- Riesen and Bunke, 2009). The GED problem was lem as a specific QAP and using modified solvers modeled as a QAP (Bougleux et al., 2017). Let from the graph matching community. However, our us named this model GEDQAP. The GEDQAP proposal stands apart from their work because we model has extra variables to cope with the insertion propose a single model to express the GM and and deletions cases and all costs are represented by the GED problems. In this direction, we propose 2 2 a (|V1| + |V2|) × (|V1| + |V2|) matrix D. The cost more investigations to compare GED and GM prob- matrix D can be decomposed as follows into four lems. We propose a theoretical study to relate GM blocks of size (|V1| + |V2|) × (|V1| + |V2|). The left and GED problems. Our contribution is to prove upper block of the matrix D contains all possible that GED and GM problems are equivalent in terms edge substitutions, the diagonal of the right upper of solutions under a reformulation of the similarity matrix represents the cost of all possible edge dele- function. Consequently, all the methods solving the tions and the diagonal of the bottom left corner con- GM problem can be used to solve the GED prob- tains all possible edge insertions. Finally, the bot- lems. tom right block elements cost is set to a large con- stant w which concerns the matching of  −  edges. 2 The GEDQAP model has (|V1|+|V2|) variables and 4 Related works: Integer (|V1| + |V2|) + (|V1| + |V2|) constraints. The cost 2 2 matrix size is (|V1| + |V2|) × (|V1| + |V2|) . Based Linear Program for GED on this GEDQAP model, modified versions of IPFP (Bougleux et al., 2017) and GNCCP (Bougleux In (Lerouge et al., 2017), an ILP was proposed to et al., 2017) were proposed. Finally, many GED model the GED problem. This model will play an methods were published in PRL, PR, Image and Vi- important role in our proposal so we propose to give sion Computing, GbR, SSPR. a brief definition of this model. For each type of edit

4 Table 1: Definition of binary variables of the eij and ekl can be mapped only if their tail vertices ILP. are mapped:

Name Card Role zij,kl ≤ yj,l ∀(eij , ekl) ∈ E1 × E2 (8) yi,k ∀(ui, vk) ∈ V1 × V2 =1 if ui is substi- tuted with vk zij,kl ∀(eij , ekl) ∈ E1 × E2 =1 if eij is substi- The insertions and deletions variables a, b, g tuted with ekl and h help the reader to understand how the ob- ai ∀ui ∈ V1 =1 if ui is deleted jective function and the constraints were obtained, from G1 bij ∀eij ∈ E1 =1 if eij is deleted but they are unnecessary to solve the GED prob- from G1 lem. In the equation (4), the variables a, b, g and h g ∀v ∈ V =1 if v is inserted k k 2 k are replaced by their expressions deduced from the in G1 hkl ∀ekl ∈ E2 =1 if ekl is inserted equations (5a), (5b), (6a) and (6b). For instance, in G1 from the equation (5a), the variable a is deduced: P ai = 1 − yi,k and replaced in the equation vk∈V2 (4), the part of the objective function concerned by operation, a set of corresponding binary variables is variable a becomes: defined in Table1. The objective function (4) is the overall cost in- X X c(ui → ) · ai = c(ui → ) duced by an edit path (y, z, a, b, g, h) that transforms ui∈V1 ui∈V1 a graph G1 into a graph G2. In order to get the X X + −c(ui → ).yi,k graph edit distance between G1 and G2, this objec- tive function must be minimized. ui∈V1 vk∈V2 (9)

X X Consequently, a new objective function is expressed C(y, z, a, b, g, h) = c(ui → vk) · yi,k as follows: ui∈V1 vk∈V2 X X X 0 X X  + c(eij → ekl) · zij,kl + c(ui → ) · ai C (y, z) =γ + c(ui → vk)

eij ∈E1 ekl∈E2 ui∈V1 ui∈V1 vk∈V2 X X  + c( → vk) · gk + c(eij → ) · bij − c(ui → ) − c( → vk) · yi,k vk∈V2 eij ∈E1 X X  ! + c(eij → ekl) X e ∈E1 e ∈E2 + c( → ekl) · hkl ij kl  e ∈E2 kl − c(eij → ) − c( → ekl) · zij,kl (4) X X Now, the constraints are presented. They are with γ = c(ui → ) + c( → vk) mandatory to guarantee that the admissible solu- ui∈V1 vk∈V2 tions of the ILP are edit paths that transform G1 in X X + c(eij → ) + c( → ekl) G2. The constraint (5a) ensures that each vertex of G is either mapped to exactly one vertex of G or eij ∈E1 ekl∈E2 1 2 (10) deleted from G , while the constraint (5b) ensures 1 Equation (10) shows that the GED can be ob- that each vertex of G is either mapped to exactly 2 tained without explicitly computing the variables one vertex of G or inserted in G : 1 1 a, b, g and h. Once the formulation solved, all inser- X ai + yi,k = 1 ∀ui ∈ V1 (5a) tion and deletion variables can be a posteriori de-

vk∈V2 duced from the substitution variables. X The vertex mapping constraints (5a) and (5b) g + y = 1 ∀v ∈ V (5b) k i,k k 2 are transformed into inequality constraints, without u ∈V i 1 changing their role in the program. As a side effect, The same applies for edges: it removes the a and g variables from the constraints: X bij + zij,kl = 1 ∀eij ∈ E1 (6a) X yi,k ≤ 1 ∀ui ∈ V1 (11) ekl∈E2 vk∈V2 X h + z = 1 ∀e ∈ E (6b) kl ij,kl kl 2 X eij ∈E1 yi,k ≤ 1 ∀vk ∈ V2 (12) ui∈V1 The topological constraints defined in Definition 4 can be expressed with the following constraints (7) In fact, the insertions and deletions variables a and (8): and g of the equations (5a) and (5b) can be seen as slack variables to transform inequality constraints eij and ekl can be mapped only if their head ver- tices are mapped: to equalities and consequently providing a canoni- cal form. The entire formulation is called F2 and zij,kl ≤ yi,k ∀(eij , ekl) ∈ E1 × E2 (7) described as follows :

5  Model 2. F2 Proof. 1. By setting d(ui → vk) = c(ui → vk) −

0  min C (y, z) (13a) c(ui → ) − c( → vk) and d(eij → ekl) = y,z   X c(eij → ekl)−c(eij → )−c( → ekl) , we can subject to yi,k ≤ 1 ∀ui ∈ V1 (13b)

vk∈V2 rewrite the objective function of F2 as follows X : yi,k ≤ 1 ∀vk ∈ V2 (13c)

ui∈V1 0 X X C (y, z) =γ + d(u → v ) · y X i k ui,vk zij,kl ≤ yi,k ∀vk ∈ V2, ∀eij ∈ E1 ui∈V1 vk∈V2

ekl∈E2 X X + d(e → e ) · z (13d) ij kl ij,kl e ∈E e ∈E X ij 1 kl 2 zij,kl ≤ yj,l ∀vl ∈ V2, ∀eij ∈ E1 X X with γ = c(ui → ) + c( → vk) ekl∈E2 u ∈V v ∈V (13e) i 1 k 2 X X + c(eij → ) + c( → ekl) with yi,k ∈ {0, 1} ∀(ui, vk) ∈ V1 × V2 e ∈E e ∈E (13f) ij 1 kl 2 (14) zij,kl ∈ {0, 1} ∀(eij , ekl) ∈ E1 × E2 (13g) 2. γ does not depend on variables so it does not impact the optimization problem. Therefore γ can be removed.

γ is not a function of y and z. It does not impact 0 3. By setting s (ui → vk) = −d(ui → vk) = the minimization problem. However, γ is mandatory   to obtain the GED value (i.e. dmin(G1,G2) from − c(ui → vk)−c(ui → )−c( → vk) and sim- 0 Problem1). The topological constraints (7) and (8) ilarly, s (eij → ekl) = −d(eij → ekl), we can are expressed in another way and are replaced by rewrite the objective function C0 of the model the constraints (13d) and (13e). F2 to obtain S0.

5 Proposal on the unifica- 0 X X 0 S (y, z) = s (ui → vk) · yi,k

tion of the two problems ui∈V1 vk∈V2 X X 0 In this paragraph, we propose to draw a relation + s (eij → ekl) · zij,kl between the graph matching and graph edit distance eij ∈E1 ekl∈E2 problems. Especially, we create a link between both (15) problems through a change of similarity functions. 4. In a general way, minimizing f(x) is equivalent Our proposal can be stated as follows: to maximize -f(x). So, minimizing C0 is equiv- 0 Proposition 1. GM and GED problems are alent to maximize S . equivalent in terms of solutions under a re- 5. The linear objective function S0 can be turned 0 formulation of the similarity function s (ui → into a quadratic function by removing variables vk) = − (c(ui → vk) − c(ui → ) − c( → vk)) z and replacing them by product of y variables. 0 and s (eij → ekl) = − (c(eij → ekl) − c(eij → ) − c( → ekl)) 00 X X 0 To intuitively demonstrate the exactness of the S (y) = s (ui → vk) · yi,k proposition, we proceed as follows : ui∈V1 vk∈V2 X X 0 1. We start from the GED problem expressed by + s (eij → ekl) · yi,k · yj,l

model F2 (see Model2). eij ∈E1 ekl∈E2 2. We link the similarity function s with the cost (16) function c thanks to a new similarity function 6. Topological constraints (Equations (13d) and 0 s . (13e)) in F2 are not necessary anymore and 0 3. With this similarity function s , we show that they can be removed. The product of yi,k and F2 turns to be a maximization problem and we yj,l is enough to ensure that an edge eij ∈ E1 call this new model F2’. can be matched to an edge ekl ∈ E2 only if 4. F2’ is modified by switching from a linear to a the head vertices ui ∈ V1 and vk ∈ V2, on the quadratic model called GMM’. one hand, and if the tail vertices uj ∈ V1 and vl ∈ V2, on the other hand, are respectively 5. GMM’ is identical to GMM. It is sufficient to matched. show that both models express the same prob- lem, that is to say, the graph matching problem. 7. We obtain the new model named GMM’:

6 the graph matching problem can be used to solve the graph edit distance problem under a specific similar- 0  ity function s (ui → vk) = − c(ui → vk) − c(ui →  ) − c( → vk) .

6 Experiments

In this section, we show the results of our nu- merical experiments to validate our proposal that the model GMM’ can model the GED problem if s0(i → k) = −{c(i → k) − c(i → ) − c( → k)}. We based our protocol on the ICPR GED contest1. Among the data sets available, we chose the GREC data set for two reasons. First, graphs sizes range from 5 to 20 nodes and these sizes are amendable to compute optimal solutions. Second, the GREC cost function, defined in the contest, is complex enough to cover a large range of matching cases. This cost function is not a constant value and includes eu- clidean distances between point coordinates. The reader is redirected to (Abu-Aisheh et al., 2017) for the full definition of the cost function. From Figure 1: A comparison of the graph matching the GREC database, we chose the subset of graphs and GED problems when the similarity function called ”MIXED” because it holds 10 graphs of var- s0(i → k) = −{c(i → k) − c(i → ) − c( → k)} ious sizes. We computed all the pairwise compar- isons to obtain 100 solutions. We compared the op- timal solutions obtained by our Model GMM’ and the optimal solutions found by the straightforward Model 3. GMM’ ILP formulation called F1 (Lerouge et al., 2017). We y∗ =argmax S00(y) (17a) computed the average difference between the GED y values and the objective function values of our model X GMM’. The average difference is exactly equal subject to yi,k ≤ 1 ∀vk ∈ V2 (17b) to zero. This result corroborates our theoret- ui∈V1 X ical statement. Detailed results and codes can be yi,k ≤ 1 ∀ui ∈ V1 (17c) found on the website https://sites.google.com/ vk∈V2 view/a-single-model-for-ged-and-gm. with yi,k ∈ {0, 1} ∀(ui, vk) ∈ V1 × V2 (17d) 7 Conclusion

8. Model GMM’ = Model GMM. This was to be In this paper, an equivalence between graph match- demonstrated. Proposition1 is right. ing and graph edit distance problems was proven under a reformulation of the similarity functions be- tween nodes and edges. These functions should take Under the condition of Proposition1, the optimal into account explicitly the deletion and insertion assignment obtains when solving the graph matching costs. That’s the major difference between GM and problem can be used to reconstruct an optimal solu- GED problems. In the GED problem, costs to delete tion of the GED problem. An instance of GED and or to insert vertices or edges are explicitly introduced an instance of GM are presented in Figure1. Solu- in the error model. On the other hand, deletion costs tions of the GED instance are presented with respect are implicitly set to a specific value (that is to say 0) to the cost function c while the graph matching so- in the GM problem. Many learning methods aim at lutions are presented with respect to the similarity learning edit costs (Serratosa, 2020; Martineau et al., function s0. The optimal matching of both instances 2020) or matching similarities (Zanfir and Sminchis- are the same. escu, 2018; Caetano et al., 2007). Learned matching Model GMM’ has |V |.|V | variables and |V |+|V | 1 2 1 2 similarities may include implicitly deletion and in- constraints. Similarity functions can be represented sertion costs. Does it help the learning algorithm to by a similarity matrix K of size is |V |.|V |×|V |.|V |. 1 2 1 2 learn separately insertion and deletion costs? That Proposition1 is a first attempt toward the unifi- cation of two communities working respectively on 1 https://gdc2016.greyc.fr/ (Abu-Aisheh et al., GED and GM problems. All the methods solving 2017)

7 is still an open question. However, with this paper, S. Gold, A. Rangarajan, A graduated assignment we stand for a rapprochement of the research com- algorithm for graph matching, TPAMI 18 (1996) munities that work on learning graph edit distance 377–388. and learning graph matching because edit costs can be hidden in the learned similarities. M. Leordeanu, M. Hebert, R. Sukthankar, An inte- ger projected fixed point method for graph match- ing and map inference, in: Y. Bengio, D. Schuur- References mans, J. D. Lafferty, C. K. I. Williams, A. Culotta (Eds.), NIPS, Curran Associates, Inc., 2009, pp. K. Riesen, Structural Pattern Recognition with 1114–1122. Graph Edit Distance - Approximation Algorithms and Applications, Advances in Computer Vision T. Cour, P. Srinivasan, J. Shi, Balanced graph and Pattern Recognition, Springer, 2015. matching, in: B. Sch¨olkopf, J. C. Platt, T. Hoff- man (Eds.), NIPS, MIT Press, 2007, pp. 313–320. D. Das, C. G. Lee, Sample-to-sample correspon- dence for unsupervised domain adaptation, En- M. Leordeanu, M. Hebert, A spectral technique gineering Applications of Artificial Intelligence 73 for correspondence problems using pairwise con- (2018) 80 – 91. straints, in: ICCV, volume 2, 2005, pp. 1482–1489 Vol. 2. P. Swoboda, C. Rother, H. Abu Alhaija, D. Kain- muller, B. Savchynskyy, A study of lagrangean M. Cho, J. Lee, K. M. Lee, Reweighted random decompositions and dual ascent solvers for graph walks for graph matching, in: K. Daniilidis, matching, in: CVPR, 2017. P. Maragos, N. Paragios (Eds.), ECCV, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. M. R. Garey, D. S. Johnson, Computers and In- 492–505. tractability; A Guide to the Theory of NP- Completeness, W. H. Freeman Co., USA, 1979. K. Riesen, S. Fankhauser, H. Bunke, Speeding up graph edit distance computation with a bipartite W.-h. Tsai, S. Member, K.-s. Fu, Pattern Deforma- heuristic, in: Mining and Learning with Graphs, tional Model and Bayes Error-Correcting Recog- Proceedings, 2007. nition System, IEEE Transactions on Systems, Man, and Cybernetics 9 (1979) 745–756. Z. Abu-Aisheh, R. Raveaux, J. Ramel, P. Martineau, An exact graph edit distance algorithm for solving Z. Zeng, A. K. H. Tung, J. Wang, J. Feng, L. Zhou, pattern recognition problems, in: ICPRAM, 2015, Comparing stars: On approximating graph edit pp. 271–278. distance, PVLDB 2 (2009) 25–36. D. Justice, A. Hero, A binary linear programming S. Bougleux, L. Brun, V. Carletti, P. Foggia, formulation of the graph edit distance, TPAMI B. Gauzere, M. Vento, Graph edit distance as a 28 (2006) 1200–1214. quadratic assignment problem, Pattern Recogni- tion Letters 87 (2017) 38 – 46. Advances in Graph- J. Lerouge, Z. Abu-Aisheh, R. Raveaux, P. H´eroux, based Pattern Recognition. S. Adam, New binary linear programming formu- lation to compute the graph edit distance, Pattern M. Cho, K. Alahari, J. Ponce, Learning graphs to Recognition 72 (2017) 254–265. match, in: ICCV, 2013, pp. 25–32. S. Bougleux, B. Ga¨uz`ere,L. Brun, A hungarian al- L. Torresani, V. Kolmogorov, C. Rother, A dual de- gorithm for error-correcting graph matching, in: composition approach to feature correspondence, P. Foggia, C.-L. Liu, M. Vento (Eds.), Graph- TPAMI 35 (2013) 259–271. Based Representations in Pattern Recognition, Springer International Publishing, Cham, 2017, Z. Liu, H. Qiao, Gnccp—graduated nonconvex- pp. 118–127. ityand concavity procedure, TPAMI 36 (2014) 1258–1267. F. Serratosa, Computation of graph edit distance: Reasoning about optimality and speed-up, Image C. Schellewald, C. Schn¨orr, Probabilistic subgraph Vision Comput. 40 (2015) 38–48. matching based on convex relaxation, in: A. Ran- garajan, B. Vemuri, A. L. Yuille (Eds.), Energy K. Riesen, H. Bunke, Approximate graph edit dis- Minimization Methods in Computer Vision and tance computation by means of bipartite graph Pattern Recognition, Springer Berlin Heidelberg, matching, Image Vision Comput. 27 (2009) 950– Berlin, Heidelberg, 2005, pp. 171–186. 959.

M. S. Bazaraa, H. D. Sherali, On the use of ex- M. Neuhaus, H. Bunke., Bridging the gap between act and heuristic cutting plane methods for the graph edit distance and kernel machines., Ma- quadratic assignment problem, Journal of the Op- chine Perception and Artificial Intelligence. 68 erational Research Society 33 (1982) 991–1003. (2007) 17–61.

8 H. Bunke, On a relation between graph edit dis- tance and maximum common subgraph, Pattern Recognition Letters 18 (1997) 689–694.

H. Bunke, Error correcting graph matching: On the influence of the underlying cost function, TPAMI 21 (1999) 917–922.

L. Brun, B. Ga¨uz`ere,S. Fourey, Relationships be- tween Graph Edit Distance and Maximal Com- mon Unlabeled Subgraph, Technical Report, 2012.

Z. Abu-Aisheh, B. Gauzere, S. Bougleux, et al, Graph edit distance contest: Results and fu- ture challenges, Pattern Recognition Letters 100 (2017) 96 – 103.

F. Serratosa, A general model to define the substitu- tion, insertion and deletion graph edit costs based on an embedded space, Pattern Recognition Let- ters 138 (2020) 115 – 122.

M. Martineau, R. Raveaux, D. Conte, G. Venturini, Learning error-correcting graph matching with a multiclass neural network, Pattern Recognition Letters 134 (2020) 68 – 76.

A. Zanfir, C. Sminchisescu, Deep learning of graph matching, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 2684–2693.

T. S. Caetano, Li Cheng, Q. V. Le, A. J. Smola, Learning graph matching, in: 2007 IEEE 11th In- ternational Conference on Computer Vision, 2007, pp. 1–8.

9