Pattern Matching Using Similarity Measures
Total Page:16
File Type:pdf, Size:1020Kb
Pattern matching using similarity measures Patroonvergelijking met behulp van gelijkenismaten (met een samenvatting in het Nederlands) PROEFSCHRIFT ter verkrijging van de graad van doctor aan de Universiteit Utrecht op gezag van Rector Magnificus, Prof. Dr. H. O. Voorma, ingevolge het besluit van het College voor Promoties in het openbaar te verdedigen op maandag 18 september 2000 des morgens te 10:30 uur door Michiel Hagedoorn geboren op 13 juli 1972, te Renkum promotor: Prof. Dr. M. H. Overmars Faculteit Wiskunde & Informatica co-promotor: Dr. R. C. Veltkamp Faculteit Wiskunde & Informatica ISBN 90-393-2460-3 PHILIPS '$ The&&% research% described in this thesis has been made possible by financial support from Philips Research Laboratories. The work in this thesis has been carried out in the graduate school ASCI. Contents 1 Introduction 1 1.1Patternmatching.......................... 1 1.2Applications............................. 4 1.3Obtaininggeometricpatterns................... 7 1.4 Paradigms in geometric pattern matching . 8 1.5Similaritymeasurebasedpatternmatching........... 11 1.6Overviewofthisthesis....................... 16 2 A theory of similarity measures 21 2.1Pseudometricspaces........................ 22 2.2Pseudometricpatternspaces................... 30 2.3Embeddingpatternsinafunctionspace............. 40 2.4TheHausdorffmetric........................ 46 2.5Thevolumeofsymmetricdifference............... 54 2.6 Reflection visibility based distances . 60 2.7Summary.............................. 71 2.8Experimentalresults........................ 73 2.9Discussion.............................. 76 3 Computation of the minimum distance 79 3.1Generalminimisation........................ 81 3.2Exactcongruencematching.................... 84 3.3TheHausdorffmetric........................ 91 3.4Thevolumeofsymmetricdifference............... 95 3.5 Reflection visibility based similarity measures . 104 3.6Discussion.............................. 118 4 Approximation of the minimum distance 121 4.1 The geometric branch-and-bound algorithm . 122 4.2Thetracesapproach........................ 126 i ii CONTENTS 4.3Thepartitioncombinationapproach............... 132 4.4Experimentalresults........................ 139 4.5Discussion.............................. 146 A Point set topology 149 B Topological transformation groups 153 Bibliography 157 Acknowledgements 169 Samenvatting 171 Vita 175 Symbols 177 Index 183 Chapter 1 Introduction 1.1 Pattern matching There are two types of pattern matching: combinatorial pattern matching and spatial pattern matching. Examples of combinatorial pattern matching are string matching [74], DNA pattern matching [26], tree pattern matching [35], and edit distance computation [34]. Spatial pattern matching is the problem of finding a match between two given intensity images or geometric models. Here, a match may be a correspondence or a geometric transformation. The remainder of this section only considers spatial pattern matching; the term pattern matching is used as a shorthand for spatial pattern matching. The following is a simple example of pattern matching. Given are two input patterns A and B, shown in Figures 1.1 and 1.2, respectively. The problem is finding a transformation g for which g(A) is similar to B. Here, g is allowed to be a combination of scaling, rotation and translation. The output of a good pattern matching algorithm should be a transformation g like the one depicted in Figure 1.3 where g(A)issuperimposedonB. In the general case, the input of a pattern matching algorithm is a pair of patterns, and the output is a transformation. Pattern matching problems can be categorised by three components: the collection of patterns, the class of transformations, and the criterion used to select a transformation. Examples of pattern collections are colour images, grey scale images, CAD models, and vector based graphics. Examples of transformation classes are translations, Euclidean isometries (which preserve the Euclidean distance between each pair of points), affine transformations (which are compositions of linear transfor- mations and translations), elastic transformations (which do not necessarily preserve straight lines) and correspondences (which are relations between par- 1 2 CHAPTER 1. INTRODUCTION Figure 1.1: Pattern A Figure 1.2: Pattern B Figure 1.3: Match ticular features in both patterns). An example of a selection criterion is that a transformation must minimise the value of a similarity measure (which is a function that assigns a nonnegative real number to each pair of patterns). The selection criterion is not always made explicit the description of a method. Photometric and geometric pattern matching Pattern matching can be subdivided into two categories, depending on the type of input patterns: photometric1 pattern matching and geometric pattern matching. Photometric methods work directly on images which are considered as arrays of intensity values or real valued functions. Geometric methods work on geometric data such as finite point sets or polygons. This geometric data may be obtained directly from vector based object representations. It may also be obtained from images using feature extraction techniques. In this context, geometric pattern matching may be called feature based pattern matching. Sometimes photometric methods and geometric methods are used in combina- tion. For example, when features are used to generate candidate matches and photometry is used to verify the candidate matches [28]. Total and partial pattern matching There is a distinction between total pattern matching and partial pattern matching. In total pattern matching, a pattern is matched with another pat- tern as a whole. In partial pattern matching, a pattern is matched with part of another pattern. Partial pattern matching can be seen as the geometric analogue of substring matching. Care is necessary in the formulation of partial pattern matching problems. For example, define partial pattern matching as transforming one 1Photometry stands for the intensity information measured by cameras, scanners, MRI, CT, etc. 1.1. PATTERN MATCHING 3 pattern so that it becomes a subset of another pattern. Such a definition may lead to unwanted results if it is implemented, especially if transformations that include scaling are considered. There are various ways to define partial pattern matching in terms of total pattern matching. An example is identifying the subpatterns of a pattern that correspond to distinct objects; each of these subpatterns gives rise to a total pattern matching problem. Another example is repeatedly applying total pattern matching to a pattern and the subpattern of another pattern that is contained in a window whose position varies. This thesis deals primarily with total pattern matching. However, much of the theory and many of the algorithms discussed in this thesis can be applied to partial pattern matching problems. Exact and approximate pattern matching Perhaps the adjectives exact and approximate should be avoided in relation with pattern matching since they may cause confusion. However, since these terms are often used, a short discussion of them is appropriate. Exact pattern matching usually means finding a transformation under which one pattern becomes identical to (part of) another pattern. Approximate pat- tern matching means finding the transformation under which a pattern becomes similar (but not necessarily identical) to another pattern. The confusion lies in the fact that the adjective approximate is also used in relation with optimisa- tion algorithms; many types of pattern matching can be seen as optimisation problems. An approximation algorithm computes a solution that is not neces- sarily the optimal one. In this context, the approximation has nothing to do with the type of pattern matching. The word exact is often used to denote algorithms that compute an optimal solution. From here on, the terms exact and approximate are avoided in relation with pattern matching. Specialised and general purpose pattern matching Pattern matching methods are specialised in varying degrees. Some pattern matching methods are specialised to detect or recognise a restricted class of shapes such as circles and squares. Another class of highly specialised methods is found in optical character recognition (OCR). General purpose (or context independent) pattern matching methods are not specialised for any specific task and work for unrestricted classes of images or geometric shapes. These methods do not incorporate application domain dependent assumptions; they work purely on photometry or geometry. In principle, a pattern matching algorithm that is specialised for a given application domain is more efficient and reliable than a general purpose pattern 4 CHAPTER 1. INTRODUCTION matching algorithm, if applied to that domain. It is difficult to incorporate domain knowledge elegantly in an algorithm. Attempts to incorporate such knowledge may result in \messy" or \ad-hoc" methods. To keep the discussion clean, this thesis focuses on the subject of general purpose pattern matching. This is on itself an interesting and challenging problem. No attempts are made to specialise the algorithms for specific application domains. Problems related to pattern matching Object recognition is similar to pattern matching. However, there is no consen- sus about the exact meaning of the word object recognition. There are three distinct uses of the term object recognition. First, the term object recognition is used for region segmentation. Region