Object Recognition Techniques in Real Applications
Fernandez Robles, Laura


Chapter 2

State of the art

In the last decades, there has been substantial work in the computer vision field that tackles the problem of object recognition. Here we present a brief survey of different approaches to object recognition.

Some reviews divide object recognition approaches into three categories. Model-based methods deal with the representation and identification of known three-dimensional (3-D) objects (boxes, spheres, cylinders, cones, surfaces of revolution, etc.). Similarly, shape-based methods represent an object by its shape and/or contour. In contrast, appearance-based models use the appearance of the object, usually under several two-dimensional (2-D) views.

Another way of classifying object recognition techniques distinguishes between local and global approaches. Local methods search for salient regions or points that characterise the object of interest, such as corners, edges or entropy. Later, these regions are typified by given descriptors. The local descriptors of the object of interest and the local descriptors of the test image are then compared for object recognition purposes. In contrast to that, global methods model the information content of the whole object of interest. This information can range from simple statistical measures (such as mean values or histograms of features) to more advanced dimensionality reduction techniques. Global methods allow reconstruction of the original image, providing robustness to some extent, whereas local methods can better cope with partly occluded objects.
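As a minimal illustration of the local matching pipeline just described (detect keypoints, compute descriptors, compare the descriptors of the object of interest against those of a test image), the following Python/OpenCV sketch uses ORB features purely as a readily available stand-in; the detector choice, function name and distance threshold are assumptions for the example and not the methods studied in this thesis.

```python
import cv2

def count_local_matches(object_img_path, test_img_path, max_distance=40):
    """Sketch of local-feature object recognition: detect keypoints and
    descriptors in the object image and in the test image, match the
    descriptors and count sufficiently close matches. ORB is used here
    only as an illustrative detector/descriptor pair; max_distance is an
    assumed, application-dependent threshold."""
    obj = cv2.imread(object_img_path, cv2.IMREAD_GRAYSCALE)
    test = cv2.imread(test_img_path, cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create()
    _, des_obj = orb.detectAndCompute(obj, None)
    _, des_test = orb.detectAndCompute(test, None)
    if des_obj is None or des_test is None:
        return 0  # no keypoints found in one of the images

    # Brute-force matching with cross-checking; Hamming distance for ORB.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_obj, des_test)

    # A large number of close matches suggests the object is present.
    return sum(1 for m in matches if m.distance < max_distance)
```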
Local appearance based object recognition methods need to detect and describe distinctive regions or keypoints in an image. As for the detection, we can differentiate corner based detectors, region based detectors and others.

Corner based detectors locate keypoints and regions which contain a lot of image structure, like edges. Corners can be defined as points with low self-similarity in all directions. The self-similarity of an image patch can be measured by taking the sum of squared differences (SSD) between the patch and a shifted version of itself. The most popular corner based detector is the one of Harris and Stephens (1988). It works by computing a response function across all image pixels; the pixels whose response exceeds a threshold and is locally maximal are considered corners. The response function is obtained from the Harris matrix, which is computed from image derivatives. The Harris-point detector achieves a large number of keypoints with sufficient repeatability (Schmid et al., 2000). The main advantage of this detector is its high computation speed, whereas the main disadvantage is that it is only invariant to rotation, since no information about scale and orientation is provided. The Harris-Laplace detector adds invariance to scale and is based on the work of Lindeberg (1998), which studies the properties of scale space. Mikolajczyk and Schmid (2002) proposed the Harris-Affine detector by extending the Harris-Laplace detector in order to achieve invariance to affine transformations. It is based on the shape estimation properties of the second moment matrix. The main disadvantage of the Harris-Affine detector is the increase in computation time.

Region based detectors locate local blobs of uniform brightness and are therefore suited for uniform regions or regions with smooth transitions. Hessian matrix detectors (Mikolajczyk et al., 2005) are similar to Harris detectors. The Hessian matrix is computed from the second image derivatives, thus this detector responds to blob-like structures. Keypoints are selected based on the determinant of the Hessian matrix after non-maximum suppression. The main drawback is that it provides only rotational invariance. Similarly, Hessian-Laplace detectors have scale invariance properties and Hessian-Affine detectors add invariance to affine transformations (Mikolajczyk and Schmid, 2002). Instead of a scale-normalised Laplacian, Lowe (1999, 2004) uses an approximation of the Laplacian, namely the difference of Gaussians (DoG), obtained by calculating differences of Gaussian-blurred images at adjacent local scales. The main advantage is the invariance to scale, but this comes at the cost of an increase in runtime. Maximally stable extremal regions (MSER) (Matas et al., 2004) are regions that are either darker or brighter than their surroundings and that are stable across a range of thresholds of the intensity function. If single keypoints are needed, they are usually taken as the centres of gravity of the MSERs. The number of MSERs detected is rather small in comparison with the previously mentioned detectors, but Mikolajczyk et al. (2005) affirm that the repeatability is higher in most cases.

An example of approaches other than corner or region detectors are the entropy based salient region detectors (Kadir et al., 2003; Kadir and Brady, 2001, 2003). They consider the grey value entropy of a circular region in the image in order to estimate the visual saliency of a region. The main drawback is that it is time consuming, especially for the affine invariant implementation (Kadir et al., 2004). Tuytelaars and Gool (1999) and Tuytelaars and Van Gool (2004) proposed two detectors, intensity based regions (IBR) and edge based regions (EBR).
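For reference, the corner and blob measures discussed above can be summarised with the standard formulations from the literature (written here for clarity, not reproduced from the thesis): the SSD self-similarity of a patch, the second-moment (Harris) matrix and the Harris-Stephens response, and the Hessian determinant used by the blob detectors.

```latex
% Self-similarity of a patch under a shift (u,v), with a weighting
% window w (e.g. a Gaussian):
\[
  E(u,v) = \sum_{x,y} w(x,y)\,\bigl[I(x+u,\,y+v) - I(x,y)\bigr]^{2}
\]
% A first-order expansion of E yields the second-moment (Harris) matrix
% built from the first image derivatives I_x, I_y, and the
% Harris--Stephens corner response (k is typically 0.04--0.06):
\[
  M = \sum_{x,y} w(x,y)
  \begin{pmatrix} I_x^{2} & I_x I_y \\ I_x I_y & I_y^{2} \end{pmatrix},
  \qquad
  R_{\mathrm{Harris}} = \det(M) - k\,\bigl(\operatorname{trace} M\bigr)^{2}
\]
% Hessian-based detectors instead use the second image derivatives at
% scale sigma and respond to blob-like structures via the determinant:
\[
  H(\sigma) =
  \begin{pmatrix} I_{xx}(\sigma) & I_{xy}(\sigma) \\
                  I_{xy}(\sigma) & I_{yy}(\sigma) \end{pmatrix},
  \qquad
  R_{\mathrm{Hessian}} = \det(H) = I_{xx} I_{yy} - I_{xy}^{2}
\]
```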
Some works locally describe the whole object in a dense way, such as bag of words (BoW) descriptions based on dense SIFT, and histograms of oriented gradients (HOG). BoW (Sivic and Zisserman, 2009) is a vector of occurrence counts of a vocabulary of local image features. HOG (Dalal and Triggs, 2005a) counts occurrences of gradient orientations in a dense grid of uniformly spaced cells and uses overlapping local contrast normalisation to improve accuracy.

After region or point detection, feature descriptors should be computed to describe the regions or the local neighbourhoods of the points, respectively. One can distinguish among distribution based descriptors, filter based descriptors and other methods.

Distribution based descriptors represent some properties of given regions by histograms. Usually these properties come from the geometric information of the points and the local orientation information in the region. Probably the most popular descriptor is the scale invariant feature transform (SIFT), developed by Lowe (1999, 2004). He actually proposed a combination of a SIFT detector and a SIFT descriptor, where the SIFT detector is the DoG previously discussed. To obtain SIFT descriptors, the local image gradients are measured at the selected scale in the region around each keypoint. These are usually transformed into a representation of a 4x4 array of histograms with 8 orientation bins each, leading to a 128-element feature vector for each keypoint. SIFT is invariant to uniform scaling and orientation, partially invariant to affine distortion and illumination changes, and allows for object recognition under clutter and partial occlusion. The main disadvantage of SIFT is the high computational time required. Many versions of SIFT have been proposed. Ke and Sukthankar (2004) reduced the dimensionality of the SIFT descriptor by applying principal component analysis (PCA) to the scale-normalised gradient patches instead of using gradient histograms at the keypoints. Gradient location-orientation histograms (GLOH) (Mikolajczyk et al., 2005) try to obtain higher robustness and distinctiveness than SIFT descriptors. The authors divided the keypoint patch into a radial and angular grid, leading to a higher dimensional descriptor that is then reduced by applying PCA and keeping the 128 largest eigenvalues. Spin images (Johnson and Hebert, 1999), adapted to 2-D images (Lazebnik et al., 2003), use a 2-D histogram of intensity values and their distance from the centre of the region. Each row of the 2-D descriptor represents the histogram of the grey values in an annulus at a given distance from the centre. This descriptor is invariant to in-plane rotations. Belongie et al. (2002) introduced shape context descriptors.
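To make the 4x4 array of 8-bin orientation histograms behind the SIFT descriptor concrete, the following Python sketch builds a simplified 128-dimensional descriptor from a grey-level patch. It is an illustrative approximation under stated assumptions: the patch is assumed to be already cropped, scaled and rotated to the keypoint's scale and dominant orientation, and the real SIFT descriptor additionally applies Gaussian weighting, trilinear interpolation between bins and clipping of gradient magnitudes. The function name and patch size are assumptions for the example.

```python
import numpy as np

def sift_like_descriptor(patch):
    """Simplified 4x4x8 = 128-dimensional SIFT-like descriptor of a
    square grey-level patch (assumed already scale- and
    orientation-normalised around the keypoint)."""
    patch = patch.astype(np.float64)
    # Image gradients inside the patch (rows = y, columns = x).
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx) % (2 * np.pi)

    n = patch.shape[0]
    cell = n // 4                      # 4x4 spatial grid of cells
    descriptor = np.zeros((4, 4, 8))   # 8 orientation bins per cell
    bin_width = 2 * np.pi / 8

    for i in range(4):
        for j in range(4):
            sl = (slice(i * cell, (i + 1) * cell),
                  slice(j * cell, (j + 1) * cell))
            bins = (ori[sl] / bin_width).astype(int) % 8
            # Magnitude-weighted orientation histogram of the cell.
            descriptor[i, j] = np.bincount(bins.ravel(),
                                           weights=mag[sl].ravel(),
                                           minlength=8)

    vec = descriptor.ravel()           # 128-element feature vector
    return vec / (np.linalg.norm(vec) + 1e-12)

# Example: describe a random 16x16 patch.
print(sift_like_descriptor(np.random.rand(16, 16)).shape)  # (128,)
```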