David Lowe, Distinctive Image Features from Scale-Invariant
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Hough Transform, Descriptors Tammy Riklin Raviv Electrical and Computer Engineering Ben-Gurion University of the Negev Hough Transform
DIGITAL IMAGE PROCESSING Lecture 7 Hough transform, descriptors Tammy Riklin Raviv Electrical and Computer Engineering Ben-Gurion University of the Negev Hough transform y m x b y m 3 5 3 3 2 2 3 7 11 10 4 3 2 3 1 4 5 2 2 1 0 1 3 3 x b Slide from S. Savarese Hough transform Issues: • Parameter space [m,b] is unbounded. • Vertical lines have infinite gradient. Use a polar representation for the parameter space Hough space r y r q x q x cosq + ysinq = r Slide from S. Savarese Hough Transform Each point votes for a complete family of potential lines: Each pencil of lines sweeps out a sinusoid in Their intersection provides the desired line equation. Hough transform - experiments r q Image features ρ,ϴ model parameter histogram Slide from S. Savarese Hough transform - experiments Noisy data Image features ρ,ϴ model parameter histogram Need to adjust grid size or smooth Slide from S. Savarese Hough transform - experiments Image features ρ,ϴ model parameter histogram Issue: spurious peaks due to uniform noise Slide from S. Savarese Hough Transform Algorithm 1. Image à Canny 2. Canny à Hough votes 3. Hough votes à Edges Find peaks and post-process Hough transform example http://ostatic.com/files/images/ss_hough.jpg Incorporating image gradients • Recall: when we detect an edge point, we also know its gradient direction • But this means that the line is uniquely determined! • Modified Hough transform: for each edge point (x,y) θ = gradient orientation at (x,y) ρ = x cos θ + y sin θ H(θ, ρ) = H(θ, ρ) + 1 end Finding lines using Hough transform -
Scale Invariant Feature Transform (SIFT) Why Do We Care About Matching Features?
Scale Invariant Feature Transform (SIFT) Why do we care about matching features? • Camera calibration • Stereo • Tracking/SFM • Image moiaicing • Object/activity Recognition • … Objection representation and recognition • Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters • Automatic Mosaicing • http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html We want invariance!!! • To illumination • To scale • To rotation • To affine • To perspective projection Types of invariance • Illumination Types of invariance • Illumination • Scale Types of invariance • Illumination • Scale • Rotation Types of invariance • Illumination • Scale • Rotation • Affine (view point change) Types of invariance • Illumination • Scale • Rotation • Affine • Full Perspective How to achieve illumination invariance • The easy way (normalized) • Difference based metrics (random tree, Haar, and sift, gradient) How to achieve scale invariance • Pyramids • Scale Space (DOG method) Pyramids – Divide width and height by 2 – Take average of 4 pixels for each pixel (or Gaussian blur with different ) – Repeat until image is tiny – Run filter over each size image and hope its robust How to achieve scale invariance • Scale Space: Difference of Gaussian (DOG) – Take DOG features from differences of these images‐producing the gradient image at different scales. – If the feature is repeatedly present in between Difference of Gaussians, it is Scale Invariant and should be kept. Differences Of Gaussians -
Seamless Texture Mapping of 3D Point Clouds
Seamless Texture Mapping of 3D Point Clouds Dan Goldberg Mentor: Carl Salvaggio Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology Rochester, NY November 25, 2014 Abstract The two similar, quickly growing fields of computer vision and computer graphics give users the ability to immerse themselves in a realistic computer generated environment by combining the ability create a 3D scene from images and the texture mapping process of computer graphics. The output of a popular computer vision algorithm, structure from motion (obtain a 3D point cloud from images) is incomplete from a computer graphics standpoint. The final product should be a textured mesh. The goal of this project is to make the most aesthetically pleasing output scene. In order to achieve this, auxiliary information from the structure from motion process was used to texture map a meshed 3D structure. 1 Introduction The overall goal of this project is to create a textured 3D computer model from images of an object or scene. This problem combines two different yet similar areas of study. Computer graphics and computer vision are two quickly growing fields that take advantage of the ever-expanding abilities of our computer hardware. Computer vision focuses on a computer capturing and understanding the world. Computer graphics con- centrates on accurately representing and displaying scenes to a human user. In the computer vision field, constructing three-dimensional (3D) data sets from images is becoming more common. Microsoft's Photo- synth (Snavely et al., 2006) is one application which brought attention to the 3D scene reconstruction field. Many structure from motion algorithms are being applied to data sets of images in order to obtain a 3D point cloud (Koenderink and van Doorn, 1991; Mohr et al., 1993; Snavely et al., 2006; Crandall et al., 2011; Weng et al., 2012; Yu and Gallup, 2014; Agisoft, 2014). -
Robust Tracking and Structure from Motion with Sampling Method
ROBUST TRACKING AND STRUCTURE FROM MOTION WITH SAMPLING METHOD A DISSERTATION SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS for the degree DOCTOR OF PHILOSOPHY IN ROBOTICS Robotics Institute Carnegie Mellon University Peng Chang July, 2002 Thesis Committee: Martial Hebert, Chair Robert Collins Jianbo Shi Michael Black This work was supported in part by the DARPA Tactical Mobile Robotics program under JPL contract 1200008 and by the Army Research Labs Robotics Collaborative Technology Alliance Contract DAAD 19- 01-2-0012. °c 2002 Peng Chang Keywords: structure from motion, visual tracking, visual navigation, visual servoing °c 2002 Peng Chang iii Abstract Robust tracking and structure from motion (SFM) are fundamental problems in computer vision that have important applications for robot visual navigation and other computer vision tasks. Al- though the geometry of the SFM problem is well understood and effective optimization algorithms have been proposed, SFM is still difficult to apply in practice. The reason is twofold. First, finding correspondences, or "data association", is still a challenging problem in practice. For visual navi- gation tasks, the correspondences are usually found by tracking, which often assumes constancy in feature appearance and smoothness in camera motion so that the search space for correspondences is much reduced. Therefore tracking itself is intrinsically difficult under degenerate conditions, such as occlusions, or abrupt camera motion which violates the assumptions for tracking to start with. Second, the result of SFM is often observed to be extremely sensitive to the error in correspon- dences, which is often caused by the failure of the tracking. This thesis aims to tackle both problems simultaneously. -
Exploiting Information Theory for Filtering the Kadir Scale-Saliency Detector
Introduction Method Experiments Conclusions Exploiting Information Theory for Filtering the Kadir Scale-Saliency Detector P. Suau and F. Escolano {pablo,sco}@dccia.ua.es Robot Vision Group University of Alicante, Spain June 7th, 2007 P. Suau and F. Escolano Bayesian filter for the Kadir scale-saliency detector 1 / 21 IBPRIA 2007 Introduction Method Experiments Conclusions Outline 1 Introduction 2 Method Entropy analysis through scale space Bayesian filtering Chernoff Information and threshold estimation Bayesian scale-saliency filtering algorithm Bayesian scale-saliency filtering algorithm 3 Experiments Visual Geometry Group database 4 Conclusions P. Suau and F. Escolano Bayesian filter for the Kadir scale-saliency detector 2 / 21 IBPRIA 2007 Introduction Method Experiments Conclusions Outline 1 Introduction 2 Method Entropy analysis through scale space Bayesian filtering Chernoff Information and threshold estimation Bayesian scale-saliency filtering algorithm Bayesian scale-saliency filtering algorithm 3 Experiments Visual Geometry Group database 4 Conclusions P. Suau and F. Escolano Bayesian filter for the Kadir scale-saliency detector 3 / 21 IBPRIA 2007 Introduction Method Experiments Conclusions Local feature detectors Feature extraction is a basic step in many computer vision tasks Kadir and Brady scale-saliency Salient features over a narrow range of scales Computational bottleneck (all pixels, all scales) Applied to robot global localization → we need real time feature extraction P. Suau and F. Escolano Bayesian filter for the Kadir scale-saliency detector 4 / 21 IBPRIA 2007 Introduction Method Experiments Conclusions Salient features X HD(s, x) = − Pd,s,x log2Pd,s,x d∈D Kadir and Brady algorithm (2001): most salient features between scales smin and smax P. -
3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints
3D Object Modeling and Recognition Using Local Affine-Invariant Image Descriptors and Multi-View Spatial Constraints Fred Rothganger ([email protected]) Svetlana Lazebnik ([email protected]) Department of Computer Science and Beckman Institute University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA Cordelia Schmid ([email protected]) INRIA Rhone-Alpesˆ 665, Avenue de l’Europe, 38330 Montbonnot, France Jean Ponce ([email protected]) Department of Computer Science and Beckman Institute University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA Abstract. This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints. The proposed approach does not require a separate segmentation stage, and it is applicable to highly cluttered scenes. Modeling and recognition results are presented. Keywords: Three-dimensional object recognition, image-based modeling, affine-invariant image descriptors, multi- view geometry. 1. Introduction This article addresses the problem of recognizing three-dimensional (3D) objects -
Self-Driving Car Autonomous System Overview
Self-Driving Car Autonomous System Overview - Industrial Electronics Engineering - Bachelors’ Thesis - Author: Daniel Casado Herráez Thesis Director: Javier Díaz Dorronsoro, PhD Thesis Supervisor: Andoni Medina, MSc San Sebastián - Donostia, June 2020 Self-Driving Car Autonomous System Overview Daniel Casado Herráez "When something is important enough, you do it even if the odds are not in your favor." - Elon Musk - 2 Self-Driving Car Autonomous System Overview Daniel Casado Herráez To my Grandfather, Family, Friends & to my supervisor Javier Díaz 3 Self-Driving Car Autonomous System Overview Daniel Casado Herráez 1. Contents 1.1. Index 1. Contents ..................................................................................................................................................... 4 1.1. Index.................................................................................................................................................. 4 1.2. Figures ............................................................................................................................................... 6 1.1. Tables ................................................................................................................................................ 7 1.2. Algorithms ........................................................................................................................................ 7 2. Abstract ..................................................................................................................................................... -
Hough Transform 1 Hough Transform
Hough transform 1 Hough transform The Hough transform ( /ˈhʌf/) is a feature extraction technique used in image analysis, computer vision, and digital image processing.[1] The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform. The classical Hough transform was concerned with the identification of lines in the image, but later the Hough transform has been extended to identifying positions of arbitrary shapes, most commonly circles or ellipses. The Hough transform as it is universally used today was invented by Richard Duda and Peter Hart in 1972, who called it a "generalized Hough transform"[2] after the related 1962 patent of Paul Hough.[3] The transform was popularized in the computer vision community by Dana H. Ballard through a 1981 journal article titled "Generalizing the Hough transform to detect arbitrary shapes". Theory In automated analysis of digital images, a subproblem often arises of detecting simple shapes, such as straight lines, circles or ellipses. In many cases an edge detector can be used as a pre-processing stage to obtain image points or image pixels that are on the desired curve in the image space. Due to imperfections in either the image data or the edge detector, however, there may be missing points or pixels on the desired curves as well as spatial deviations between the ideal line/circle/ellipse and the noisy edge points as they are obtained from the edge detector. -
Euclidean Reconstruction of Natural Underwater
EUCLIDEAN RECONSTRUCTION OF NATURAL UNDERWATER SCENES USING OPTIC IMAGERY SEQUENCE By Han Hu B.S. in Remote Sensing Science and Technology, Wuhan University, 2005 M.S. in Photogrammetry and Remote Sensing, Wuhan University, 2009 THESIS Submitted to the University of New Hampshire In Partial Fulfillment of The Requirements for the Degree of Master of Science In Ocean Engineering Ocean Mapping September, 2015 This thesis has been examined and approved in partial fulfillment of the requirements for the degree of M.S. in Ocean Engineering Ocean Mapping by: Thesis Director, Yuri Rzhanov, Research Professor, Ocean Engineering Philip J. Hatcher, Professor, Computer Science R. Daniel Bergeron, Professor, Computer Science On May 26, 2015 Original approval signatures are on file with the University of New Hampshire Graduate School. ii ACKNOWLEDGEMENTS I would like to thank all those people who made this thesis possible. My first debt of sincere gratitude and thanks must go to my advisor, Dr. Yuri Rzhanov. Throughout the entire three-year period, he offered his unreserved help, guidance and support to me in both work and life. Without him, it is definite that my thesis could not be completed successfully. I could not have imagined having a better advisor than him. Besides my advisor in Ocean Engineering, I would like to give many thanks to the rest of my thesis committee and my advisors in the Department of Computer Science: Prof. Philip J. Hatcher and Prof. R. Daniel Bergeron as well for their encouragement, insightful comments and all their efforts in making this thesis successful. I also would like to express my thanks to the Center for Coastal and Ocean Mapping and the Department of Computer Science at the University of New Hampshire for offering me the opportunities and environment to work on exciting projects and study in useful courses from which I learned many advanced skills and knowledge. -
View Matching with Blob Features
View Matching with Blob Features Per-Erik Forssen´ and Anders Moe Computer Vision Laboratory Department of Electrical Engineering Linkoping¨ University, Sweden Abstract mographic transformation that is not part of the affine trans- formation. This means that our results are relevant for both This paper introduces a new region based feature for ob- wide baseline matching and 3D object recognition. ject recognition and image matching. In contrast to many A somewhat related approach to region based matching other region based features, this one makes use of colour is presented in [1], where tangents to regions are used to de- in the feature extraction stage. We perform experiments on fine linear constraints on the homography transformation, the repeatability rate of the features across scale and incli- which can then be found through linear programming. The nation angle changes, and show that avoiding to merge re- connection is that a line conic describes the set of all tan- gions connected by only a few pixels improves the repeata- gents to a region. By matching conics, we thus implicitly bility. Finally we introduce two voting schemes that allow us match the tangents of the ellipse-approximated regions. to find correspondences automatically, and compare them with respect to the number of valid correspondences they 2. Blob features give, and their inlier ratios. We will make use of blob features extracted using a clus- tering pyramid built using robust estimation in local image regions [2]. Each extracted blob is represented by its aver- 1. Introduction age colour pk,areaak, centroid mk, and inertia matrix Ik. I.e. -
Histogram of Directions by the Structure Tensor
Histogram of Directions by the Structure Tensor Josef Bigun Stefan M. Karlsson Halmstad University Halmstad University IDE SE-30118 IDE SE-30118 Halmstad, Sweden Halmstad, Sweden [email protected] [email protected] ABSTRACT entity). Also, by using the approach of trying to reduce di- Many low-level features, as well as varying methods of ex- rectionality measures to the structure tensor, insights are to traction and interpretation rely on directionality analysis be gained. This is especially true for the study of the his- (for example the Hough transform, Gabor filters, SIFT de- togram of oriented gradient (HOGs) features (the descriptor scriptors and the structure tensor). The theory of the gra- of the SIFT algorithm[12]). We will present both how these dient based structure tensor (a.k.a. the second moment ma- are very similar to the structure tensor, but also detail how trix) is a very well suited theoretical platform in which to they differ, and in the process present a different algorithm analyze and explain the similarities and connections (indeed for computing them without binning. In this paper, we will often equivalence) of supposedly different methods and fea- limit ourselves to the study of 3 kinds of definitions of di- tures that deal with image directionality. Of special inter- rectionality, and their associated features: 1) the structure est to this study is the SIFT descriptors (histogram of ori- tensor, 2) HOGs , and 3) Gabor filters. The results of relat- ented gradients, HOGs). Our analysis of interrelationships ing the Gabor filters to the tensor have been studied earlier of prominent directionality analysis tools offers the possibil- [3], [9], and so for brevity, more attention will be given to ity of computation of HOGs without binning, in an algo- the HOGs. -
The Hough Transform As a Tool for Image Analysis
THE HOUGH TRANSFORM AS A TOOL FOR IMAGE ANALYSIS Josep Llad´os Computer Vision Center - Dept. Inform`atica. Universitat Aut`onoma de Barcelona. Computer Vision Master March 2003 Abstract The Hough transform is a widespread technique in image analysis. Its main idea is to transform the image to a parameter space where clusters or particular configurations identify instances of a shape under detection. In this chapter we overview some meaningful Hough-based techniques for shape detection, either parametrized or generalized shapes. We also analyze some approaches based on the straight line Hough transform able to detect particular structural properties in images. Some of the ideas of these approaches will be used in the following chapter to solve one of the goals of the present work. 1 Introduction The Hough transform was first introduced by Paul Hough in 1962 [4] with the aim of detecting alignments in T.V. lines. It became later the basis of a great number of image analysis applications. The Hough transform is mainly used to detect parametric shapes in images. It was first used to detect straight lines and later extended to other parametric models such as circumferences or ellipses, being finally generalized to any parametric shape [1]. The key idea of the Hough transform is that spatially extended patterns are transformed into a parameter space where they can be represented in a spatially compact way. Thus, a difficult global detection problem in the image space is reduced to an easier problem of peak detection in a parameter space. Ü; Ý µ A set of collinear image points ´ can be represented by the equation: ÑÜ Ò =¼ Ý (1) Ò where Ñ and are two parameters, the slope and intercept, which characterize the line.