ECE 5582 Computer Vision Lec 04: Filtering and Edge Features

Spring 2020. Venue: Haag 313. Time: M/W 4-5:15pm.
Zhu Li, Dept. of CSEE, UMKC. Office: FH560E, Email: [email protected], Ph: x 2346. http://l.web.umkc.edu/lizhu
Slides created with WPS Office Linux and the EqualX LaTeX equation editor.

Outline
- Recap of Lec 03: perspective projection and homography; SVD and an example; HW-1
- Image Filtering – A Quick Review: image filtering and correlation; edge detection
- Edge descriptors: HOG – Histogram of Oriented Gradients

Camera Projection: Intrinsic + Extrinsic Parameters
A world point X projects to pixel coordinates x via x = K[R t]X:

  $w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha & 0 & u_0 \\ 0 & \beta & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$

- Intrinsic: camera center $[u_0, v_0]$, focal length / aspect ratio $(\alpha, \beta)$
- Extrinsic: rotation $R = R_x R_y R_z$, translation $t = [t_x, t_y, t_z]$

Homography
In general, a homography H maps 2-D points (in homogeneous coordinates) according to x' = Hx:

  $\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$

- H is defined only up to scale, since [x, y, w] = [sx, sy, sw]; so H has 8 degrees of freedom (DoF).
- Affine transform (6 DoF): a translation $t = [t_1, t_2]^T$ plus an invertible 2x2 matrix A: $H = \begin{bmatrix} A & t \\ 0^T & 1 \end{bmatrix}$
- Similarity transform (4 DoF): a rigid transform that preserves distance if s = 1: $H = \begin{bmatrix} s\cos\theta & -s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix}$

Homography Estimation – SVD Pseudo Inverse
In matrix form the correspondences give Ah = 0. With more than 4 correspondences the system is over-determined, and we find a least-squares solution computationally via the SVD pseudo inverse:
- SVD: $A = U\Sigma V^T$
- Pseudo inverse (minimizing the least-squares error): $A^+ = V\Sigma^+ U^T$
- Details: https://www.ecse.rpi.edu/~qji/CV/svd_review.pdf

HW-1
- Compute an HSV color histogram for CIFAR-100 image retrieval: rgb2hsv; uniformly quantize the HSV space; rank by Euclidean distance between histograms; try other metrics, such as KL distance, as a bonus.
- Compute a homography.

Outline (where we are)
- Recap of Lec 03: SVD and an example
- Image Filtering – A Quick Review: image filters; edge detection
- Edge features

What is an Image?
The famous Lenna: https://www.wired.com/story/finding-lena-the-patron-saint-of-jpegs/
(Figures: Lenna, 1971; Lena at ICIP 2015, Quebec City; Lenna, 2019.)

What is an image?
We can think of a (grayscale) image as a function f from R^2 to R (i.e., a 2-D signal): f(x, y) gives the intensity at position (x, y). A digital image is a discrete (sampled, quantized) version of this function.

Image as a function
I = f(x, y): a grid (matrix) of intensity values. (Figure: sample pixel values 92, 221, 144 with corresponding [R, G, B] triplets [90, 0, 54], [249, 215, 203], [220, 15, 77].) It is common to use one byte per value: 0 = black, 255 = white.

Motivation: Why Filtering?
Some use cases: edge detection; deep learning (e.g., the convolutional layers of VGG16).

Image filtering – looking at pixel neighbors
Modify the pixels in an image based on some function of a local neighborhood of each pixel. (Figure: a 3x3 patch of local image data is mapped by some function to a single modified output value.)

Linear filtering
One simple version: linear filtering (cross-correlation, convolution). Replace each pixel by a linear combination of its neighbors; the prescription for the linear combination is called the "kernel" (or "mask", "filter"). (Figure: a 3x3 patch of local image data combined with a 3x3 kernel produces the modified image data. Source: L. Zhang.) A minimal code sketch of this operation follows.
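To make the "linear combination of neighbors" recipe concrete, here is a minimal Matlab sketch, not from the original slides: it applies the 3x3 kernel shown in the slide figure to a small test matrix by direct cross-correlation and checks the result against imfilter. The test matrix is arbitrary.

Matlab:
    F = magic(8);                         % arbitrary small test "image"
    H = [0 0 0; 0 0.5 0; 0 1 0.5];        % the 3x3 kernel from the slide figure
    k = 1;                                % kernel half-width: size is (2k+1) x (2k+1)
    Fp = padarray(F, [k k], 0);           % zero padding at the borders
    G = zeros(size(F));
    for i = 1:size(F,1)
        for j = 1:size(F,2)
            patch = Fp(i:i+2*k, j:j+2*k); % (2k+1) x (2k+1) neighborhood around (i,j)
            G(i,j) = sum(sum(H .* patch));% linear combination of the neighbors
        end
    end
    G2 = imfilter(F, H, 0, 'corr');       % same operation with the built-in filter
    max(abs(G(:) - G2(:)))                % expect 0

The double loop is exactly the cross-correlation sum defined on the next slide; note that imfilter performs correlation (not convolution) by default.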
Cross-correlation
No flipping of the kernel. Let F be the image, H the kernel (of size (2k+1) x (2k+1)), and G the output image. The cross-correlation operation, written $G = H \otimes F$, is

  $G[i,j] = \sum_{u=-k}^{k}\sum_{v=-k}^{k} H[u,v]\, F[i+u, j+v]$

Convolution
Same as cross-correlation, except that the kernel is "flipped" (horizontally and vertically); written $G = H * F$:

  $G[i,j] = \sum_{u=-k}^{k}\sum_{v=-k}^{k} H[u,v]\, F[i-u, j-v]$

Convolution is commutative and associative, and it distributes over addition:
- G*h + G*f = G*(h+f)
- (G*h)*f = G*(h*f)

Mean filtering
Find the average over each neighborhood, i.e., filter with a kernel whose entries are all 1/9:

  $H = \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$

(Figure: a block of 90s surrounded by 0s is smoothed into a gradual ramp of values by the 3x3 mean filter.)

Linear filters: examples (source: D. Lowe)
- Identity filter: kernel [0 0 0; 0 1 0; 0 0 0] leaves the image unchanged.
- Shift filter: kernel [0 0 0; 1 0 0; 0 0 0] shifts the image left by 1 pixel.
- Average/blurring: a mean kernel [1 1 1; 1 1 1; 1 1 1]/9 blurs the image.
- Sharpen (remove the average): the kernel [0 0 0; 0 2 0; 0 0 0] - [1 1 1; 1 1 1; 1 1 1]/9 accentuates edges.

Smoothing with box filter
Box filter. A box filter in the integral-image domain is much faster: just 4 add/subtract operations per output pixel, regardless of box size. More details in the CABOX coverage. (Source: D. Forsyth)

Gaussian Kernel
Gaussian kernel with scale $\sigma$:

  $h(u,v) = \frac{1}{2\pi\sigma^2}\exp\!\left(-\frac{u^2+v^2}{2\sigma^2}\right)$

Matlab: h = fspecial('gaussian', 5, 1.0);

Gaussian Filters – A Scale Space Approximation
Repeated Gaussian blur at increasing scales approximates a scale space. Matlab:
    n = 8; s = 1.25.^[1:n]; m = fix(6.*s);
    figure(30);
    for k = 1:8
        subplot(2,4,k);
        h = fspecial('gaussian', m(k), s(k));
        imagesc(h); title(sprintf('s = %1.1f', s(k)));
    end
    figure(31);
    subplot(3,3,1); imshow(im); title('f_0(1.0)');
    for k = 1:n
        subplot(3,3,k+1);
        h = fspecial('gaussian', m(k), s(k));
        f = imfilter(im, h);
        imshow(f); title(sprintf('f_%d(%1.1f)', k+1, s(k)));
    end

Gaussian filter properties
Successive Gaussian filtering: the convolution of two Gaussians is still a Gaussian. The variances add, so the new kernel sigma is $\sqrt{\sigma_1^2 + \sigma_2^2}$. Matlab:
    h1 = fspecial('gaussian', 11, 1.2);
    h2 = fspecial('gaussian', 11, 2.0);
    h3 = conv2(h1, h2);
    h4 = fspecial('gaussian', 11, sqrt(1.2^2 + 2.0^2));  % same effective sigma as h3

Separable Filter
The Gaussian filter is separable: h is (up to scale) the outer product of a 1-D Gaussian with itself, $h = u\,v^T$. Verify via SVD, where a separable kernel has only one non-zero singular value:
    h = fspecial('gaussian', 11, 1.2);
    [u, s, v] = svd(h);
    plot(diag(s));   % only the first singular value is significantly non-zero

Sharpening revisited (source: S. Lazebnik)
What does blurring take away?
- detail = original - smoothed (5x5)
Let's add it back:
- sharpened = original + alpha * detail
A minimal unsharp-masking sketch of this recipe follows.
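The following is a minimal unsharp-masking sketch of the "add the detail back" recipe above; it is an illustration rather than code from the slides, and the test image (Matlab's built-in peppers.png) and the value of alpha are arbitrary choices.

Matlab:
    im = im2double(rgb2gray(imread('peppers.png')));   % any grayscale test image
    g  = fspecial('gaussian', 5, 1.0);                 % 5x5 Gaussian smoother
    smoothed  = imfilter(im, g, 'replicate');
    detail    = im - smoothed;                         % what blurring takes away
    alpha     = 1.5;                                   % sharpening strength (arbitrary)
    sharpened = im + alpha * detail;
    figure; imshowpair(im, sharpened, 'montage');

Larger alpha exaggerates edges (and noise); alpha = 0 returns the original image.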
Sharpen filter - LoG
The sharpening filter can also be viewed through the Laplacian of Gaussian (LoG). (Figure: a unit impulse (identity), a scaled impulse, a Gaussian, and a Laplacian of Gaussian kernel, applied to a blurred image.)

Sharpen filter
(Figure: unfiltered vs. filtered image.)

Convolution in the real world
Physical-world convolutions:
- Camera shake: the blurry photo is the sharp image convolved with the camera's motion-blur kernel. Source: Fergus et al., "Removing Camera Shake from a Single Photograph", SIGGRAPH 2006.
- Bokeh: blur in the out-of-focus regions of an image. Source: http://lullaby.homepage.dk/diy-camera/bokeh.html

Filtering in Matlab
The area of support for the operations matters at the image borders: to produce an output of the same n1 x m1 size, the edges need padding.
- Default is zero padding.
- Alternatively, replicate the last edge pixel.
- Or use mirroring (as in MPEG codecs).

Image Filtering, Sweet Deal with Matlab
It is such a nice tool.
- Main filter operation: im2 = imfilter(im, h, 'replicate')
- Design your filter: h = fspecial('filter_type', kernel_size, options)
Filter design examples:
- Sobel
- Laplacian, Laplacian of Gaussian
- Gaussian, Difference of Gaussians (SIFT)

Matlab Image Filtering Example
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    % image filters
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    im = imread('../pics/Lenna.png');
    im = rgb2gray(im);
    im1 = double(im(201:320, 201:320));

    % edge filters
    h{1} = fspecial('sobel');
    h{2} = fspecial('laplacian', 0.25);
    h{3} = fspecial('log', 7, 0.25);

    % gaussians
    sigmas = [1.226, 1.554, 1.946];
    for k = 1:length(sigmas)
        h{3+k} = fspecial('gaussian', 11, sigmas(k));
    end

    % difference of gaussians
    h{7} = h{6} - h{4};  h{7} = h{7} / sum(sum(h{7}));
    h{8} = h{5} - h{4};  h{8} = h{8} / sum(sum(h{8}));

    for k = 1:8
        fprintf('\n k=%d', k);
        figure(26); subplot(2,4,k); grid on; hold on;
        colormap('gray'); imagesc(h{k});
        figure(27); subplot(2,4,k);
        imshow(imfilter(im, h{k}, 'replicate'));
    end

As a pointer toward the edge-feature part of the lecture, a small gradient-based edge sketch follows.
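The outline promises edge detection and HOG descriptors; as a bridge from the filters above, here is a minimal, illustrative Matlab sketch (not from the slides) that computes Sobel gradient magnitude and orientation, the basic ingredients of an edge map and of HOG cells. The image path is the placeholder used in the example above, and the threshold is arbitrary.

Matlab:
    im = im2double(rgb2gray(imread('../pics/Lenna.png')));   % placeholder path
    hy = fspecial('sobel');       % responds to horizontal edges (approximates d/dy)
    hx = hy';                     % transpose responds to vertical edges (approximates d/dx)
    gx = imfilter(im, hx, 'replicate');
    gy = imfilter(im, hy, 'replicate');
    mag = sqrt(gx.^2 + gy.^2);            % gradient magnitude: large on edges
    ori = atan2(gy, gx);                  % gradient orientation in (-pi, pi]
    edges = mag > 0.25 * max(mag(:));     % crude global threshold, for illustration only
    figure; imshow(edges); title('thresholded gradient magnitude');

A HOG descriptor would bin ori, weighted by mag, over small cells instead of thresholding globally.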