Spectral Geometry for Structural Pattern Recognition

HEWAYDA EL GHAWALBY

Ph.D. Thesis

This thesis is submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy.

Department of Computer Science
United Kingdom
May 2011

Abstract

Graphs are used pervasively in computer science as representations of data with a network or relational structure. The graph structure provides a flexible representation in which objects have no fixed dimensionality. However, the analysis of data in this form has proved an elusive problem; for instance, it is not robust to structural noise. One way to circumvent this problem is to embed the nodes of a graph in a vector space and to study the properties of the point distribution that results from the embedding. This is a problem that arises in a number of areas, including manifold learning theory and graph drawing.

In this thesis, our first contribution is to investigate the heat kernel embedding as a route to computing geometric characterisations of graphs. The reason for turning to the heat kernel is that it encapsulates information concerning the distribution of path lengths, and hence node affinities, on the graph. The heat kernel of the graph is found by exponentiating the Laplacian eigensystem over time. The matrix of embedding co-ordinates for the nodes of the graph is obtained by performing a Young-Householder decomposition on the heat kernel. With the embedding of the nodes to hand, we characterise the graph in a geometric manner: we establish a graph characterisation based on differential geometry by computing sets of curvatures associated with the graph nodes, edges and triangular faces.

The second contribution comes from the need to solve the problems that arise in processing noisy data over a graph.
The principal difficulty of this task is how to preserve the geometrical structures existing in the initial data. Bringing together several distinct concepts that have each received recent attention in machine learning, we propose a framework to regularize real-valued or vector-valued functions on weighted graphs of arbitrary topology. The first of these concepts is drawn from spectral graph theory, which has been applied to a wide range of clustering and classification tasks over the last decades; here we take into consideration the properties of the graph p-Laplacian as a nonlinear extension of the usual graph Laplacian. The second is the geometric point of view that comes from the heat kernel embedding of the graph into a manifold. In these techniques we use the geometry of the manifold by assuming that it has the geometric structure of a Riemannian manifold. The third conceptual framework comes from manifold regularization, which extends the classical framework of regularization in the sense of reproducing kernel Hilbert spaces to exploit the geometry of the embedded set of points. The proposed framework, based on the p-Laplacian operators, minimizes a weighted sum of two energy terms: a regularization term and an additional approximation term that helps to avoid the shrinkage effects obtained during the regularization process. The data are structured by functions depending on data features, namely the curvature attributes associated with the geometric embedding of the graph.

The third contribution is inspired by the concepts and techniques of the graph calculus of partial differential functions. We propose a new approach for embedding graphs on pseudo-Riemannian manifolds based on the wave kernel, which is the solution of the wave equation on the edges of a graph.
The eigensystem of the wave kernel is determined by the eigenvalues and the eigenfunctions of the normalized adjacency matrix and can be used to solve the edge-based wave equation. By factorising the Gram matrix for the wave kernel, we determine the embedding co-ordinates for nodes under the wave kernel.

The techniques proposed throughout this thesis are investigated as a means of gauging the similarity of graphs. We experiment on sets of graphs representing the proximity of image features in different views of different objects in three different datasets, namely the York model house, COIL-20 and TOY databases.

Contents

1 Introduction
  1.1 Thesis Motivation
  1.2 Thesis Goals
  1.3 Thesis outline
2 Literature Review
  2.1 Introduction
  2.2 Overview of the spectral approach in pattern recognition
    2.2.1 Spectral methods for Graph Embedding problems
    2.2.2 Spectral methods for graph matching problems
    2.2.3 Spectral methods for Graph Clustering problem
  2.3 Manifold Learning
  2.4 Calculus on graphs
  2.5 Conclusion
3 Heat Kernel Embedding
  3.1 Introduction
  3.2 Heat Kernels on Graphs
    3.2.1 Preliminaries
    3.2.2 Heat Equation
    3.2.3 Geodesic Distance from the Heat Kernel
    3.2.4 Heat Kernel Embedding
    3.2.5 Point Distribution Statistics
  3.3 Geometric Characterisation
    3.3.1 The Sectional Curvature
    3.3.2 The Gaussian Curvature
  3.4 Experiments
    3.4.1 Hausdorff distance
    3.4.2 A probabilistic similarity measure (PSM)
    3.4.3 Multidimensional Scaling
    3.4.4 Description of the Experimental Databases
      3.4.4.1 The York model house dataset
      3.4.4.2 The COIL dataset
      3.4.4.3 The Toy dataset
    3.4.5 Experimenting with Real-world data
  3.5 Conclusion
4 Regularization on Graphs
  4.1 Introduction
  4.2 Functions and Operators on Graphs
    4.2.1 Preliminaries
    4.2.2 Functions on Graphs
    4.2.3 Regularization by means of the Laplacian
    4.2.4 Operators on Graphs
    4.2.5 The p-Laplacian Operator
  4.3 p-Laplacian regularization framework
  4.4 The Gaussian Curvature
    4.4.1 Geometric Preliminaries
    4.4.2 Gaussian Curvature from Gauss Bonnet Theorem
  4.5 The Euler characteristic
  4.6 Experiments
  4.7 Conclusion
5 Wave Kernel Embedding
  5.1 Introduction
  5.2 Embedding graphs into Pseudo Riemannian manifolds
    5.2.1 Edge-based Wave Equation
    5.2.2 Edge-based Eigenvalues and Eigenfunctions
    5.2.3 The manifold spanned by the data
  5.3 Pseudo Euclidean Space
    5.3.1 Distance Function
    5.3.2 An Orthonormal Basis
    5.3.3 Projection into a kD Subspace
  5.4 Experiments and Results
  5.5 Conclusion
6 Conclusion and Future Work
  6.1 Summary and Conclusion
  6.2 Future Work
I Appendices
A Area Metric Manifolds
  A.1 Area Metric Geometry
    A.1.1 Area Metric Manifolds
    A.1.2 Induced-metric area metric
  A.2 The space of oriented areas
    A.2.1 Area metric curvature
  A.3 Area metric under Hyperbolic geometric flow
A The COIL dataset
References

List of Figures

3.1 The geometric embedding of the graph
3.2 Illustration of the sectional curvature
3.3 Sample images from the houses dataset
3.4 Histogram of number of Nodes from the houses database
3.5 Histogram of number of Edges from the houses database
3.6 Histogram of number of Faces from the houses database
3.7 The COIL-20 objects
3.8 The COIL-100 objects
3.9 Histogram of number of Nodes for the COIL data
3.10 Histogram of number of Edges for the COIL data
3.11 Histogram of number of Triangular Faces for the COIL data
3.12 Example images of the 4 objects of The Toy Database
3.13 Histogram of number of Nodes from the Toy database
3.14 Histogram of number of Edges from the Toy database
3.15 HD for house data represented by the sectional curvatures
3.16 HD for house data represented by the Gaussian curvatures
3.17 MHD for house data represented by the sectional curvatures
3.18 MHD for house data represented by the Gaussian curvatures
3.19 PSM for house data represented by the sectional curvatures
3.20 PSM for house data represented by the Gaussian curvatures
3.21 HD for COIL data represented by the sectional curvature
3.22 HD for COIL data represented by the Gaussian curvature
3.23 MHD for COIL data represented by the sectional curvature
3.24 MHD for COIL data represented by the Gaussian curvature
3.25 HD for TOY data represented by the sectional curvature
3.26 MHD for TOY data represented by the sectional curvature
3.27 Distribution of Gaussian curvatures for the 1st house at t = 1.0
3.28 Distribution of Gaussian curvatures for the 2nd house at t = 1.0
3.29 Distribution of Gaussian curvatures for the 3rd house at t = 1.0
3.30 Distribution of Gaussian curvatures for the 1st house at t = 0.1
3.31 Distribution of Gaussian curvatures for the 2nd house at t = 0.1
3.32 Distribution of Gaussian curvatures for the 3rd house at t = 0.1
3.33 Distribution of Gaussian curvatures for the 1st house at t = 0.01
3.34 Distribution of Gaussian curvatures for the 2nd house at t = 0.01
3.35 Distribution of Gaussian curvatures for the 3rd house at t = 0.01
4.1 HD of Laplace operator regularization for the houses data