Exact Probabilistic Inference for Inexact Graph Matching

Rupert Brooks 110247534 April 30, 2003

Inexact graph matching is a fundamental problem in computer vision applications. It crops up whenever a correspondence between two structures must be determined in the presence of structural and attribute noise. Despite the importance of the problem, graph matching techniques in computer vision tend to be extremely task specific, and sometimes lack the power a general approach would have. For example, this project was motivated by the matching of human cortical sulci described in [1]. Rivière et al. achieve a fairly reliable solution for this problem using a congregation of neural networks. Unfortunately, their technique does not yield any sort of quality measure for the match, or the degree to which other matches are likely. A more general approach which could deliver information on match quality and report the existence of multiple valid matches would be desirable.

1 Theoretical background

The notation and terminological definitions for various elements of graph matching vary considerably between different authors. While this is to be expected to some degree, it does cause some confusion when comparing the different algorithms. In particular, the difference between a graph and the embedding of a graph is glossed over by all the approaches reviewed here, which leads to confusion. This section defines the terms used in this report in a mathematically strict¹ fashion.

A graph consists of a structure G = (V, E), where V is a set of vertices, and the predicate E ⊆ V × V is a set of edges. Two vertices are called adjacent if they form the elements of an edge in E. A graph G is a subgraph of a graph H if VG ⊆ VH and EG ⊆ EH. The degree of a vertex is the number of edges which connect to it. A graph is called bipartite if V can be divided into two parts V1 and V2 such that every edge in E has an endpoint in V1 and an endpoint in V2 [3]. A homomorphism (or simply morphism) is a mapping that preserves predicates and functions [2]. Thus a homomorphism between two graphs G, H is a mapping between VG and VH so that adjacent vertices in G map to vertices that are also adjacent in H, and non-adjacent vertices in G map to vertices that are not adjacent in H. An isomorphism is a one-to-one homomorphism.

Given two graphs, G and H, the graph isomorphism problem is to find a one-to-one, adjacency-preserving mapping between them if such a mapping exists. This problem holds an unusual place in complexity theory. No polynomial algorithm is known, but it has not been shown to be NP-complete. Some mathematicians suspect that it lies in a complexity class between P and NP [3]. The common subgraph isomorphism problem, which is to determine the largest graph F that is a subgraph of both G and H, has been shown to be NP-complete. There are a range of graph homomorphism problems, most of which also have high complexity [4].
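Under the strict definition above, whether a given mapping is a homomorphism can be checked directly. A minimal Python sketch (the original implementation was in Matlab; the function name and edge representation here are illustrative):

```python
def is_homomorphism(mapping, edges_g, edges_h):
    """Check that a vertex mapping preserves both adjacency and non-adjacency.

    mapping: dict from vertices of G to vertices of H.
    edges_g, edges_h: sets of (u, v) pairs, treated as undirected.
    """
    # Symmetrize edge sets so (u, v) and (v, u) count as the same edge.
    g = edges_g | {(b, a) for (a, b) in edges_g}
    h = edges_h | {(b, a) for (a, b) in edges_h}
    verts = set(mapping)
    # Adjacent vertices must map to adjacent vertices, and (per the strict
    # definition above) non-adjacent pairs must map to non-adjacent pairs.
    return all(((mapping[u], mapping[v]) in h) == ((u, v) in g)
               for u in verts for v in verts if u != v)
```

Note that this predicate does not require the mapping to be one-to-one; injectivity is the extra condition that distinguishes an isomorphism.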

¹Using terminology and notation from [2].

The graph and subgraph isomorphism problems are often referred to as exact graph matching. For many practical problems a true homomorphism of the graphs involved cannot be found. The area of inexact graph matching involves finding mappings between the elements of two graphs that are approximately homomorphic. Practical results depend on careful definition of what is meant by an “approximate” homomorphism. Although some authors claim that exact and inexact matching are entirely separate problems [5], in this report it will be assumed that an inexact matching technique should yield a homomorphism or isomorphism in the limiting case where such a mapping can be found.

In most inexact matching approaches, the matching is not symmetric. One graph is treated as the model graph, while the other is treated as the sample graph. The algorithm may give different results if the roles of the two graphs are reversed. In particular, it is accepted that not all the vertices in one graph can be matched to the other. Thus a null vertex, φ, is added to the model graph. Nodes that cannot be matched to a model vertex will be matched to the null vertex.

The algebraic definition of a graph says nothing about its representation. However, graphs are usually visualized as a series of dots and lines on paper. The practical examples of graph matching in computer vision use graphs as representations of structures in data. A structure X is embedded in a space Y when Y restricted to X has the properties of X. Thus, a particular representation of a graph is called an embedding of the graph. Our intuitive understanding of graph matching often involves the matching of embedded graphs rather than just the graphs themselves. In this case, some spatial properties of the embedding should be preserved as well as the graph structure. The Attributed Relational Graph (ARG) extends the notion of graph by adding a vector of attributes defined at each node [6, 7].
Thus an ARG is a triple, G = (V, E, A) where V and E are defined as before and A = {x¯u, ∀u ∈ V }. These attributes can, in principle, be anything, and there is no particular requirement to have the same attributes at each vertex although this is usually done. These attributes usually represent the spatial properties of the vertices. The word relational is for semantic purposes only. It refers to the fact that the edges in an ARG represent relationships between elements being matched.
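An ARG needs only a very small data structure; a Python sketch, under the assumption that edges are stored as vertex pairs (names are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class ARG:
    """Attributed Relational Graph: G = (V, E, A)."""
    vertices: set
    edges: set                                       # pairs (u, v) of related vertices
    attributes: dict = field(default_factory=dict)   # attribute vector x̄_u per vertex u

    def neighbors(self, u):
        """All vertices sharing an edge with u."""
        return {b if a == u else a
                for (a, b) in self.edges if u in (a, b)}
```

The attribute vectors typically hold the spatial properties of the embedding, as described above.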

2 Approaches to inexact graph matching

Fundamentally, graph matching techniques have two components. First, some definition of the quality of a match must be made. Second, a search must be made through the space of possible matches to find an optimal, or at least acceptable, match. This report focuses on defining the quality measure, and relies on available tools for probabilistic inference to search the space. It is quite likely that specific cases can be solved more efficiently using specialized techniques.

2.1 Bayesian Approaches

Of the numerous approaches to graph matching, explicitly probabilistic ones are relatively rare. For a number of years, they were considered difficult to formulate correctly, and risky to use when formulated incorrectly (e.g. [8]). In 1995, Christmas [6] and Wilson and Hancock [9] proposed probabilistic formulations for graph matching that used relaxation techniques to optimize a reasonable initial guess.

Christmas [6] matched road network features extracted from aerial photography to features in a database. Geometric properties of the features were stored as a set of attributes in an ARG. The segments in the image are treated as the vertices of the graph, and the relationships between segments make up the edges of the graph. The probabilities were calculated locally on the attributes at a particular node, and on individual consideration of relational edges.

This method appears to work extremely well on the test images, but there are two theoretical annoyances. First, the local nature of the relaxation process means that it will converge to a local minimum. Second, and of more concern, a feedback process is used, where the posterior distribution from one iteration is used as the prior distribution for the next iteration. The system works well when initialized near the right answer, but it seems improbable that the resulting joint probability distribution can be trusted.

In [7], Wilson and Hancock propose a Bayesian criterion for assessing the validity of a structural match. They propose that a correct structural match between a sample vertex, u, and a model vertex, v, will be characterized by three properties:

• the neighbors of u will be matched to neighbors of v,

• conversely, the neighbors of v will be matched to neighbors of u, and

• there will be a minimum of matches to the null vertex.

They explore both an exponential (Gibbs) distribution and a linear distribution as models for the weighted sum of these three properties. Like Christmas, the update method is local. The prior probabilities of matching are first computed based on the local attribute information. A matching model vertex is then selected at each node that maximizes the a posteriori probability given the attribute values and the neighborhood. They test their method on Delaunay triangulations to which they may add or delete nodes. The success is evaluated based on the number of correctly matched nodes. Results using their exponential criteria are impressive, as their system can correctly match the graphs even when initialized with up to 50% of the matches being incorrect, and in the presence of significant noise. The exponential distribution strikingly outperforms a linear distribution.

In her PhD thesis proposal [10], Farmer describes several possible models for using Bayesian networks for graph matching. The most promising of these has been used as the foundation for this project. It formalizes the local Bayesian criteria of Wilson and Hancock [7] as a Bayesian network. The proposed structure is shown in Figure 1. The network consists of three types of node. Each vertex ui in the sample graph has a corresponding node in the network. Each sample node has a child node f(ui) that describes the correctness of the attribute match. Each sample node also induces a neighborhood node Γui. Each Γui has ui and all the neighbors of ui in the sample graph as parents. Farmer develops the conditional probability tables for each node using the equations from [7], and assuming a constant error probability (that is, that it is half as likely to make two errors as it is to make one error). The Maximum A Posteriori (MAP) estimate of the sample-to-model matchings is proposed as the obvious best answer.
This is the mode(s) of the posterior joint distribution, P, over the sample nodes, where:

P : (Vmodel ∪ φ)^|Vsample| → [0, 1]    (1)

P assigns a probability to all possible matchings of the vertices in the sample with the vertices in the model, and of course:

Σ P = 1.    (2)
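Equations (1) and (2) can be made concrete by brute-force enumeration, which is what makes the approach exponential in the number of sample vertices. A Python sketch, with a caller-supplied unnormalized score standing in for the Bayesian network (names are illustrative):

```python
from itertools import product

def joint_distribution(sample_vertices, model_vertices, score):
    """Enumerate every assignment Vsample -> (Vmodel ∪ {phi}), score it with a
    non-negative unnormalized function, and normalize so the values sum to 1,
    as in equations (1) and (2).  Exponential in |Vsample|: a sketch only."""
    labels = list(model_vertices) + ["phi"]
    dist = {}
    for assignment in product(labels, repeat=len(sample_vertices)):
        m = dict(zip(sample_vertices, assignment))
        dist[assignment] = score(m)
    z = sum(dist.values())
    return {a: p / z for a, p in dist.items()}

def modes(dist, tol=1e-12):
    """The MAP estimate(s): every assignment attaining the maximum probability."""
    best = max(dist.values())
    return [a for a, p in dist.items() if abs(p - best) < tol]
```

Reporting all modes, rather than a single maximizer, is what allows multiple equally valid matches to be detected.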

2.2 Fuzzy Approaches

In [11], Perchant and Bloch formalize matchings between graphs in terms of fuzzy set theory. Although the implementation described in this report does not use a fuzzy approach, there are several elements of

Figure 1: The Bayesian network model proposed in [10]

interest. They define two types of fuzzy graph. A fuzzy graph of type I is a fuzzy relation µ² between two sets of vertices S1 and S2. A fuzzy graph of type II, G = (σ, µ), is defined as a fuzzy membership function σ over a set of vertices, together with a fuzzy relation, µ, defined on S × S. This can be intuitively understood as an arbitrary graph where both the nodes and edges have fuzzy degrees of existence. No edge can have a higher weight than the conjunction of its two endpoints, which makes intuitive sense. (If it seems awkward to have two meanings for the same term, observe that a fuzzy graph of type I can be defined as a bipartite fuzzy graph of type II, with normalized vertex weights. Thus any results for fuzzy graphs of type II apply to all fuzzy graphs.)

Perchant and Bloch define the notion of fuzzy morphism to be the analogue of an algebraic morphism between fuzzy graphs. A fuzzy morphism has two parts, a vertex morphism, ρσ, and an edge morphism, ρµ. Both of these morphisms are fuzzy graphs: the vertex morphism of type I (bipartite) and the edge morphism of type II. The vertex morphism assigns links between sets of vertices. The edge morphism maps S1 × S2 × S1 × S2 → [0, 1] and has two different interpretations. The first, internal, interpretation, (S1 × S2) × (S1 × S2), corresponds to the association compatibility between the vertex correspondences. It can be imagined as a weighting over the set of statements like “Matching vertex x to vertex y implies that vertex a must be matched to vertex b”. The second, external, interpretation, (S1 × S1) × (S2 × S2), corresponds to matchings between edge elements in two different graphs [11].

Their development is relevant to a probabilistic approach, as a joint probability distribution defining the matching between two graphs can be viewed as a fuzzy graph morphism (although the converse is not necessarily true).
The vertex morphism is simply the marginal probability of each possible match, while each element in the edge morphism can be viewed as the conditional probabilities obtained by fixing each node in turn.
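Under the assumption that the joint distribution is stored as a table from assignment tuples to probabilities, both views fall out of simple marginalization and conditioning; a Python sketch (names are illustrative):

```python
def vertex_morphism(dist, sample_vertices, target):
    """Marginal probability that one sample vertex takes each label:
    the probabilistic analogue of the vertex morphism."""
    i = sample_vertices.index(target)
    marg = {}
    for assignment, p in dist.items():
        marg[assignment[i]] = marg.get(assignment[i], 0.0) + p
    return marg

def conditional(dist, sample_vertices, fixed_vertex, fixed_label):
    """Condition the joint on one vertex match being fixed: one slice
    of the edge morphism."""
    i = sample_vertices.index(fixed_vertex)
    sub = {a: p for a, p in dist.items() if a[i] == fixed_label}
    z = sum(sub.values())
    return {a: p / z for a, p in sub.items()} if z > 0 else {}
```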

²A fuzzy relation, µ, is defined between fuzzy membership functions, σ1, σ2, defined over two sets, S1, S2. It is a function over the cross-product of the two sets that maps into the range [0, 1]. To be a relation, this function must map them to a value less than the conjunction of the two fuzzy membership functions. For details, see [11].

There are two important points to gain from this. First, that a matching between two graphs is comprised of two graphs. Simply treating a matching as a bipartite graph between the vertices of the sample and the model graph (i.e., the vertex morphism alone) is not sufficient, as it does not capture the information in the edge morphism. An example where a vertex morphism alone would utterly fail is given in the experiments section. Secondly, that the dependencies between vertex matches are equivalent to a set of edge matches. This implies that there is probably little to be gained by modelling both the neighborhood constraints of vertices and the matching between edges. One or the other is probably sufficient.

Bengoetxea et al. apply the framework of Perchant and Bloch in a probabilistic way in [12]. Unlike the approaches of Farmer, Wilson and Hancock, they do not take neighborhood information into account explicitly. Instead they match edges of the graph as well as vertices. (From the above discussion, more or less equivalent results are expected.) They define a Bayesian network representing the relationships between vertex and edge attributes. Rather than using exact inference to solve for the modes of the probability distribution, they use estimation of distribution algorithms (EDA).

Briefly, the EDA approach is a population-based stochastic search which was designed to overcome some of the shortcomings of genetic algorithms (GA). In both EDA and GA a population of individuals is evolved. Unlike GA, in EDA the fittest individuals in a population are used to infer a probability distribution. New individuals are then created by sampling from this distribution. The thrust of [5] is a comparison between EDA and GA for graph matching. While GA have shown promise for graph matching in the past, in these experiments the EDA approach systematically performs better.
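A minimal univariate EDA (in the style of UMDA) can be sketched in a few lines of Python; this is a simplification of the algorithms used in [12], not a reproduction of them:

```python
import random

def sample_from(rng, probs):
    """Draw one label from a {label: probability} table."""
    r, acc = rng.random(), 0.0
    for label, p in probs.items():
        acc += p
        if r <= acc:
            return label
    return label  # guard against floating-point round-off

def umda(fitness, n_vars, labels, pop_size=60, elite=20, iterations=30, seed=0):
    """Keep the fittest individuals, estimate an independent per-variable label
    distribution from them, and sample the next population from that estimate
    (no crossover or mutation, unlike a GA)."""
    rng = random.Random(seed)
    population = [tuple(rng.choice(labels) for _ in range(n_vars))
                  for _ in range(pop_size)]
    for _ in range(iterations):
        population.sort(key=fitness, reverse=True)
        best = population[:elite]
        model = []
        for i in range(n_vars):
            counts = {l: 1 for l in labels}  # Laplace smoothing
            for ind in best:
                counts[ind[i]] += 1
            total = sum(counts.values())
            model.append({l: c / total for l, c in counts.items()})
        population = [tuple(sample_from(rng, model[i]) for i in range(n_vars))
                      for _ in range(pop_size)]
    return max(population, key=fitness)
```

For graph matching, an individual would be one assignment of model labels to sample vertices, and the fitness would score how well the assignment preserves structure and attributes.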

2.3 The bigger picture

Many of the approaches described have focussed on relatively limited domains. By situating graph matching in its wider theoretical context it may be possible to apply results from other fields to the solution. Graph matching problems are a type of labelling or constraint satisfaction problem. In a labelling problem, a set of labels must be assigned to a set of vertices while obeying a set of constraints. In the discrete labelling problem [8], or equivalently a finite domain constraint satisfaction problem [13], the constraints are binary, and a labelling must be found which does not violate any of them. If such a labelling does not exist, then the constraints are inconsistent and the problem is unsatisfiable. These problems have a wide literature which is relevant to our search for a general graph matching solution. Two points of particular interest are discussed below.

Hummel and Zucker describe a general foundation for relaxation labelling problems in [8]. They extend the notion of constraint satisfaction to the case of the continuous labelling problem. In the continuous labelling problem, instead of binary label assignments, labels are assigned with weights that sum to one, and the objective is to minimize a cost function defined by the constraints. A label assignment is consistent if there is no local change to the labelling assignments which would decrease the cost function of the local constraints at that vertex. They show that consistent label assignments do not necessarily coincide with minima of the cost function. Therefore the modes of the joint probability distribution may actually define locally inconsistent matchings of the graphs.

The literature on discrete constraint satisfaction is most developed in the field of constraint logic programming (e.g. [13]).
Since the satisfiability problem SAT³ is a constraint satisfaction problem, it is clear that the general case of determining the solutions, if any, to a set of constraints is NP-complete. To deal with the heavy computational load, methods for quickly pruning a search space have been developed. The k-consistency methods use tests over limited subspaces of the problem to eliminate impossible combinations of labels. A set of possible labellings is k-consistent if, for any k−1 variables, there exists a label for any arbitrary kth variable such that all constraints among the variables are satisfied. The most common

³SAT is the original problem proved to be NP-complete by Cook in 1971.

consistency checks are node-consistency (or 1-consistency) and arc-consistency (2-consistency) [14]. The constraints used by Christmas [6] correspond to arc-consistency. The constraints used by Wilson and Hancock [7] are stronger, corresponding to 3-consistency.
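Arc-consistency itself is simple to sketch; the following is a minimal Python rendering of the AC-3 scheme (the variable and constraint representations are illustrative):

```python
from collections import deque

def ac3(domains, constraints):
    """AC-3 arc-consistency: prune values that have no support in a
    neighboring variable's domain.

    domains: dict variable -> set of candidate labels (pruned in place).
    constraints: dict (x, y) -> predicate(label_x, label_y) -> bool.
    """
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        revised = False
        for vx in set(domains[x]):
            # vx survives only if some vy in y's domain supports it.
            if not any(allowed(vx, vy) for vy in domains[y]):
                domains[x].discard(vx)
                revised = True
        if revised:
            # Re-examine every arc pointing at x.
            queue.extend((z, w) for (z, w) in constraints if w == x)
    return domains
```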

3 Project Scope

This project was primarily motivated by the search for a method of graph matching which would give information on the match quality and on alternate matchings. Problem specific graph matching techniques such as [1] produce a single match without any qualifying information. Therefore a general rather than problem specific approach was sought. Probabilistic reasoning with Bayesian networks is capable of both modelling multimodal information and quantifying the certitude of a given result. Farmer’s [10] proposed graph matching network was chosen for investigation. There were three main goals in undertaking this project:

• To review, and extend if necessary, the theoretical model proposed by Farmer, allowing the specification of quality measures and the detection of multiple modes in graph matching.

• To connect this model to whatever degree possible with related problems and approaches

• To test this model by implementation

In order to achieve a general approach, solidly connected to the theory of probabilistic inference, optimization issues were ignored. Graph matching is, like most artificial intelligence problems, a search of a vast solution space. Pruning that search is critical to getting an efficient implementation. This issue was ignored for the general treatment, and as a result the implementation is computationally intensive. Some proposals for optimization based on existing approaches to constraint satisfaction problems are discussed in the conclusion.

The implementation was in Matlab [15] using the Bayes Net Toolbox (BNT) [16]. The GraphViz [17] library was used to visualize some of the results. The BNT allows exact inference on Bayesian networks using either a junction tree or variable elimination. For this project, regrettably, variable elimination had to be used, as the junction tree algorithm fails when the queried vertices are not all in the same clique⁴.

Slight modifications were made to the system described by Farmer [10] and Wilson and Hancock [7]. Note that although a number of weight parameters are specified, the system works well when they are set to one, and they do not need adjustment in general.

• Farmer proposed to use a fixed probability of error for determining the probabilities of the attribute nodes. However, [7] demonstrates much better results with exponential distributions. Therefore all the attribute nodes were given an exponential distribution. The probability of a true match at a given attribute node is given by:

P(f(ui) = true | ui = vj) = e^(−watt ||x̄ui − X̄vj||)    (3)

where:

– watt is a weighting parameter for the attributes, used for scaling the values relative to the expected error in the observations

– x̄ui is the vector of observations at ui

⁴This is the same problem described in Assignment 4, question 2c of COMP-526.

– X̄vj is the vector of model parameters at vj

– ||x̄|| is the L2 norm of x̄

• Wilson and Hancock combine several weighting factors under products of parameters. For clarity, their formula for the probability was simplified to:

P(Γi | Ni = Mj) = e^(−wfw H(Ni) − wbw H(Mj) − wdummy Ψ)    (4)

where

– Ni = Mj indicates a set of assignments between the sample vertex neighborhood Ni and a set of model vertices Mj.

– wfw is the weighting of the neighbors of ui that are missing from the match set

– H(Ni) is the count of the neighbors of ui that are not matched to neighbors of vj. (This was referred to as the Hamming distance in [7].)

– wbw is the weighting of the extra neighbors of vj

– H(Mj) is the count of the neighbors of vj that are not matched to neighbors of ui.

– wdummy is the weighting of the number of dummy nodes.

– Ψ is the count of dummy nodes in the neighborhood of ui.

• There is actually no constraint keeping multiple sample vertices from matching to the same model vertex. When using spatial attributes, this rarely happens, but in the experiments involving structural matching only it happened frequently. As described in [18], the uniqueness constraint was implemented by creating an XOR node. This node receives input from all the sample nodes, and the probability of it being true is:

P(XOR = true | Vsample → (Vmodel ∪ φ)) = e^(−wXOR χ)    (5)

where:

– Vsample → (Vmodel ∪ φ) represents a complete mapping from the sample vertices to the model vertices.

– wXOR is a weight parameter for the XOR node.

– χ is the number of duplicated matchings between sample vertices and model vertices.

The system takes as input a sample graph and a model graph, specified as adjacency matrices. Optionally, observations may be provided for each sample vertex, in which case expected values must be provided for each model vertex as well. The user must also specify whether or not to use an XOR node, and if necessary, any non-default values for the weighting parameters. Under normal circumstances the only weight parameter that needs to be set is watt, which controls the spatial scale of the residuals between the observations and the expected values.

Once all necessary data is provided, a Bayesian network as described above is created. Figure 2 shows the Bayes network created using this process for matching two four-vertex triangulations. Once the network is created, all the nodes except the sample nodes are observed as having a correct match. Unlike other inexact graph matching approaches, no initialization of the matching is required. The matching is determined from uniform priors over the possible label assignments. The variable elimination inference engine in BNT is used to obtain the joint distribution over the possible label assignments to the sample vertices. The modes of this distribution are reported as the possible answers. If desired, the entire joint distribution may be obtained.
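The three probability terms (3)-(5) reduce to small functions; a Python sketch, written with negative exponents so that each expression is a valid probability (the actual implementation was in Matlab, and the names here are illustrative):

```python
import math

def attribute_prob(x_u, X_v, w_att=1.0):
    """Equation (3): the probability of a correct attribute match decays
    exponentially with the residual between observation and model."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_u, X_v)))  # L2 norm
    return math.exp(-w_att * dist)

def neighborhood_prob(h_fw, h_bw, psi, w_fw=1.0, w_bw=1.0, w_dummy=1.0):
    """Equation (4): exponential penalty on unmatched sample neighbors H(Ni),
    unmatched model neighbors H(Mj), and dummy-node matches Psi."""
    return math.exp(-w_fw * h_fw - w_bw * h_bw - w_dummy * psi)

def xor_prob(chi, w_xor=1.0):
    """Equation (5): penalty on the number of duplicated matchings chi."""
    return math.exp(-w_xor * chi)
```

With all weights set to one, as the text notes is usually sufficient, each extra structural violation multiplies the match probability by e⁻¹.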



Figure 2: A Bayes net generated for matching two four-vertex triangulations

Figure 3: A four point graph which can be matched to itself in many ways due to symmetry

4 Experiments

Due to time constraints, relatively small graph matching exercises were performed. Nevertheless, they show that the algorithm gives good results, even in the presence of both structural and attribute noise.

4.1 Structural Matching

For testing and experimentation purposes, a number of small graphs were matched based on structure alone. This differs from all the inexact graph matching approaches discussed earlier, as no spatial information about the embedding of the graph was used. This problem is much more related to the exact graph and subgraph homomorphism problems.

4.1.1 Matching a diamond - Graph isomorphism

Consider the exact graph isomorphism problem of matching the two graphs (see Figure 3):

Gsample = Gmodel = {V = {1, 2, 3, 4}; E = {(1, 2), (2, 3), (3, 4), (4, 1)}} (6)

A mapping from Gsample to Gmodel can be described as a list of the vertices in Gmodel in the order that the sample vertices map to them. By inspection, the isomorphic matches are 1234, 2341, 3412, 4123, 4321, 3214, 2143, 1432. However, if the graph matching is run without using the XOR node, then the modes of the joint probability distribution include such matches as 1232, 2343, and so forth. The graph has “folded up” on itself because none of the neighborhood constraints have enforced the uniqueness of the nodes. If the XOR node is used, then the problem disappears and the eight correct modes are found. Note that the individual matching probabilities for each node tell us little. Each node in the sample has an equal probability of being matched to any node in the model. This is an example of a case where the edge morphism described earlier is essential for describing the match.
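This example is small enough to check exhaustively; a Python sketch enumerating all one-to-one adjacency-preserving mappings of the diamond onto itself:

```python
from itertools import permutations

V = [1, 2, 3, 4]
E = {(1, 2), (2, 3), (3, 4), (4, 1)}
sym = E | {(b, a) for (a, b) in E}  # undirected adjacency

def isomorphisms():
    """Every permutation of V that preserves adjacency and non-adjacency,
    written in the list notation used above (e.g. '2341')."""
    found = []
    for perm in permutations(V):
        m = dict(zip(V, perm))
        if all(((m[u], m[v]) in sym) == ((u, v) in sym)
               for u in V for v in V if u != v):
            found.append("".join(str(m[v]) for v in V))
    return found
```

Running it returns exactly the eight matches listed above: the four rotations and four reflections of the diamond.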

4.1.2 Finding triangles - Subgraph isomorphism

Figure 4a) shows a classic children’s puzzle where the child is asked to count the triangles. The correct answer is 5: the four small triangles plus the large outer triangle. This question can be posed to the graph matching system by making the sample graph a triangle, and the model graph the structure in Figure 4a). In this case, the system returns three modes, which correspond to the three corner triangles (highlighted

Figure 4: A simple puzzle that is an example of subgraph isomorphism

in different colors in Figure 4b). The middle triangle is not a mode of the distribution because there are ’extra’ connections on each of its corners. If wbw is reduced to zero, then the middle triangle can be found as well. Can the outer triangle be found using only structural constraints? Unfortunately, it cannot. It is visible to us as a triangle only because of the way it is drawn. Figure 4c) is the same graph, but in this case there is no outer triangle.
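The structural claim can be verified by counting 3-cliques; a Python sketch over an assumed reconstruction of the Figure 4a) graph (corner vertices A, B, C and edge-midpoint vertices a, b, c are hypothetical labels):

```python
from itertools import combinations

# Assumed reconstruction of the puzzle graph: a triangle whose sides are
# subdivided at midpoints, with the midpoints joined into a central triangle.
vertices = ["A", "B", "C", "a", "b", "c"]
edges = {("A", "c"), ("c", "B"), ("B", "a"), ("a", "C"),
         ("C", "b"), ("b", "A"), ("a", "b"), ("b", "c"), ("c", "a")}
adj = edges | {(v, u) for (u, v) in edges}

def triangles():
    """Every 3-clique, i.e. every structurally present triangle."""
    return [t for t in combinations(vertices, 3)
            if all((u, v) in adj for u, v in combinations(t, 2))]
```

This count finds four structural triangles (the three corner triangles plus the middle one); the outer triangle (A, B, C) never appears, since its sides are not edges of the graph, which is exactly the point made above.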

4.2 Delaunay Triangulations

In [7], Wilson and Hancock test their structural matching algorithm by generating Delaunay triangulations of random point sets (approximately 40 points). They perturb the resulting graphs by adding or removing points, which causes adjustments in the triangulation structure. The same type of test has been used to test this implementation, although much smaller triangulations have been used (6 points).

Delaunay triangulations of six different random point sets were generated. Each of these was randomly perturbed, and the graph matching algorithm was used to match the perturbed result to the original graph. The perturbation took the form of 0, 1, or 2 edits, after which the triangulation was recomputed. The coordinates of the resulting graph were then perturbed by uniform noise with ranges of 0.025, 0.05, or 0.125. An edit was either a node deletion or addition. The random points were created in the interval [0,1], so noise of 0.125 units could move points quite far relative to their original configuration. There were thus nine different levels of perturbation. For each level, each of the 6 different graphs was randomly perturbed and matched 5 times. The matching algorithm made use of the spatial attributes and the neighborhood information, but did not require an XOR node.

In all cases, a unimodal solution was obtained. Not surprisingly, the entropy of the distribution was lowest, and the mean matching success rate was highest, with the least amount of perturbation. The results are summarized in the following tables. A typical example match is shown in Figure 5, while an example of a poor matching result is shown in Figure 6. The system is far more sensitive to the perturbations due to graph edits than due to noise. Even with the highest amount of noise used, the matching was perfect for all cases with zero edits.
While the matches to the edited graphs are poorer, one must bear in mind that in many cases it was not possible to match six out of six nodes as one or more of the original nodes had been deleted.

Number of edits   noise=0.025   noise=0.05   noise=0.125
0                 6/6/6         6/6/6        6/6/6
1                 3/4.97/7      1/4.83/7     3/4.80/7
2                 2/4.20/6      0/3.83/8     0/3.63/6

Table 1: Results of matching a triangulation to a corrupted version of itself. The numbers indicate the minimum/mean/maximum number of correct matches.

Number of edits   noise=0.025   noise=0.05   noise=0.125
0                 1.5451        1.6006       1.7529
1                 5.4213        5.7622       6.0153
2                 7.0979        6.6123       6.0229

Table 2: Mean entropy of the joint probability distribution of the computed match between graphs.

[Plot: “Matching 5 2 with 2 edits, noise 0.1; Modes 1; Entropy 5.2513; Matched 5/6”]

Figure 5: Typical matching between a triangulation (left) and a corrupted version of it (right)

[Plot: “Matching 2 3 with 2 edits, noise 0.25; Modes 1; Entropy 5.4452; Matched 1/6”]

Figure 6: Poor matching between a triangulation (left) and a corrupted version of it (right)

4.3 Performance analysis

The Matlab profiler was used to keep track of the amount of time used by various components of the graph matching process. The 270 Delaunay triangulation matches described above took an average of 42.5s to run⁵. Considering that the great majority of the Matlab functions used were interpreted, not compiled, this performance is not too bad. It is interesting to note that the inference itself did not require the most time. Of the processing time, 65.1% was spent building the Bayes nets and their probability distributions. Only 14.4% was spent doing inference. The remainder was spent on various housekeeping tasks such as plotting graphics, loading and saving results, and so forth. Since many of the nodes in the Bayes net have the same distributions, there is an immediate potential for speed-up by caching the distributions when the network is being built.

5 Discussion

This project has demonstrated a prototype system that implements the Bayesian network model for graph matching proposed in [10]. Testing similar to that done in [7] was conducted. Relatively simple datasets were used due to processing time constraints, but results indicate that the algorithm works well. The use of spatial attributes as constraints tends to greatly simplify the matching problem. Multimodal results are uncommon, and vertex uniqueness does not become an issue. When structural matching is used alone, however, there are often multiple equally valid results. While this may appear to cause confusion, there may also be hidden benefits. It is interesting to note that the multimodal nature of the matching of the diamond graph, for example, corresponds to its possible symmetries and reflections.

5.1 Controlling Complexity

This method achieves the goals of generality and global optimization. Unfortunately, these come with a price in algorithmic complexity. Even with small numbers of nodes, our graph matching becomes

⁵On Matlab 6.5 Release 13, running on Windows 2000 on a 400 MHz Pentium III with 256 MB of RAM.

computationally onerous. It will be necessary to find ways to prune the search space before this method can be used on practical problems. Two proposals for reducing the search space in a coherent way are made here.

5.1.1 Limiting the XOR function

When it is required, the XOR node connects to every single sample node in the network. Needless to say, this creates a very large clique, which leads to long processing times. However, it may not be strictly necessary to enforce a uniqueness constraint on every node. A trivial example is that the uniqueness of a node and its neighbors is already enforced by the neighborhood relationships. Thus a node really only needs to be kept unique from its neighbors once removed. The problem of finding a set of nodes which are once removed from each other is known as the vertex cover problem. Regrettably, it is NP-complete; however, a number of tractable methods for graph decimation do exist (e.g. [19]).
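The standard greedy 2-approximation for vertex cover would be one tractable way to choose which nodes receive uniqueness checks; a Python sketch:

```python
def greedy_vertex_cover(edges):
    """Classic 2-approximation: walk the edge list, and whenever an edge is
    still uncovered, add both of its endpoints to the cover.  The result is
    at most twice the size of an optimal vertex cover."""
    cover = set()
    for (u, v) in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover
```

Connecting the XOR node only to such a cover, rather than to every sample node, would shrink the largest clique while still touching every edge of the graph.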

5.1.2 Limiting the possible set of labellings

Several authors have pointed out [6, 7] that reducing the possible set of labellings of the sample graph greatly improves the search time, and a number of ad hoc approaches have been used. This problem has been addressed in other research areas, particularly in constraint satisfaction. It would be interesting to apply the node-consistency and arc-consistency algorithms used in constraint programming to the graph matching problem. Highly efficient algorithms exist for these, and they are known to prune the search space well [14, 13]. These algorithms could give a formal framework to existing heuristics for reducing the number of labels. For instance, applying node consistency would remove those candidate labels whose attributes are too incompatible with a given sample vertex, while arc consistency would remove from consideration labellings where the two endpoints of an edge are incompatible with the labelling.
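The arc-consistency step could be sketched along the lines of the classic AC-3 algorithm [13]. The code below is an illustration only, assuming a hypothetical `compatible(label_u, label_v)` predicate derived from the edge attributes; node consistency would already have filtered each domain against unary attribute checks before this runs.

```python
from collections import deque

def ac3(domains, edges, compatible):
    """AC-3 style pruning of candidate label sets.

    domains:    dict mapping each sample node to its set of candidate labels
    edges:      iterable of (u, v) pairs, the sample graph's edges
    compatible: predicate on a pair of labels (assumed symmetric here)

    Removes, in place, any label that has no compatible support in a
    neighboring node's domain, and returns the pruned domains.
    """
    neighbors = {u: set() for u in domains}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    queue = deque((u, v) for u in domains for v in neighbors[u])
    while queue:
        u, v = queue.popleft()
        # Labels of u with no compatible label left in v's domain.
        pruned = {a for a in domains[u]
                  if not any(compatible(a, b) for b in domains[v])}
        if pruned:
            domains[u] -= pruned
            # u's domain shrank, so arcs into u must be re-examined.
            queue.extend((w, u) for w in neighbors[u] if w != v)
    return domains
```

Run on the candidate label sets before inference, this would shrink the sample nodes' state spaces, and hence the Bayesian network's clique sizes, without excluding any globally consistent labelling.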

5.2 Further investigation

There are a number of interesting questions that could not be investigated within the scope of this project. Perchant and Bloch showed that matching edges is more or less equivalent to matching based on neighborhood constraints [11]. Nevertheless, it would be interesting to investigate approaches to graph matching that match on edges as well as vertices; there may well be applications where one approach or the other has an advantage. In the triangle matching problem, a human observer sees the outer triangle in the graph very strongly due to the process of perceptual grouping. The outer boundary is grouped together by the principle of good continuation: a line which continues straight through a junction has more merit than one that bends at the junction. This property is not dealt with at all in the present graph matching approach. It would be interesting to encode additional constraints that group together subunits of a graph based on their perceptual properties. There are numerous examples where perceptual grouping could be useful, such as the analysis of maps, diagrams, or handwriting.

References

[1] D. Rivière, J.-F. Mangin, D. Papadopoulos-Orfanos, J.-M. Martinez, V. Frouin, and J. Régis, “Automatic recognition of cortical sulci of the human brain using a congregation of neural networks,” Medical Image Analysis, vol. 6, no. 2, pp. 77–92, 2002.

[2] H. B. Enderton, A Mathematical Introduction to Logic. Academic Press, 1972.

[3] E. Weisstein, Eric Weisstein's World of Mathematics. Online book, http://mathworld.wolfram.com/topics/GraphTheory.html.

[4] P. Crescenzi, V. Kann, M. Halldórsson, M. Karpinski, and G. Woeginger, “A compendium of NP optimization problems.” Online book, http://www.nada.kth.se/~viggo/wwwcompendium/wwwcompendium.html, 2002.

[5] E. Bengoetxea, Inexact Graph Matching Using Estimation of Distribution Algorithms. PhD thesis, École Nationale Supérieure des Télécommunications (Paris), October 2002.

[6] W. J. Christmas, Structural Matching in Computer Vision using Probabilistic Reasoning. PhD thesis, University of Surrey, September 1995.

[7] R. C. Wilson and E. R. Hancock, “Structural matching by discrete relaxation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 6, pp. 634–648, 1997.

[8] R. A. Hummel and S. W. Zucker, “On the foundation of relaxation labeling processes,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. PAMI-5, no. 3, pp. 267–287, 1983.

[9] R. C. Wilson and E. R. Hancock, “A Bayesian compatibility model for graph matching,” Pattern Recognition Letters, vol. 17, pp. 263–276, 1996.

[10] S.-J. Farmer, “Using probabilistic networks for graph matching.” http://www-users.cs.york.ac.uk/~sara/project/publications/00submission/netmatchart.ps, 2001.

[11] A. Perchant and I. Bloch, “Fuzzy morphisms between graphs,” Fuzzy Sets and Systems, vol. 128, pp. 149–168, 2002.

[12] E. Bengoetxea, P. Larrañaga, I. Bloch, A. Perchant, and C. Boeres, “Inexact graph matching by means of estimation of distribution algorithms,” Pattern Recognition, vol. 35, pp. 2867–2880, 2002.

[13] K. Marriott and P. J. Stuckey, Programming with Constraints. MIT Press, 1998.

[14] R. Barták, “Constraint programming: In pursuit of the holy grail,” in Proceedings of the Week of Doctoral Students, Part IV, pp. 555–564, MatFyzPress, Prague, 1999.
[15] “Matlab.” Software program published by Mathworks Inc., http://www.mathworks.com, 2003.

[16] K. Murphy, “Bayes net toolbox for Matlab.” Software program available at http://www.ai.mit.edu/~murphyk/Software/BNT/bnt.html, 2003.

[17] “Graphviz: graph visualization software.” Software program published by AT&T Research, available at http://www.graphviz.org, 2003.

[18] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

[19] J.-M. Jolion, “Stochastic pyramid revisited,” Pattern Recognition Letters, vol. 24, no. 8, pp. 1035–1042, 2003.
