Image Sequence (or Motion) Analysis

Total Pages: 16

File Type: PDF, Size: 1020 KB

Image Sequence (or Motion) Analysis, also referred to as: Dynamic Scene Analysis.

Input: a sequence of images (or frames).
Output: 3-D motion parameters of the object(s) in motion and/or 3-D structure information of the moving objects.
Assumption: the time interval ∆T between two consecutive (successive) frames is small.

Typical steps in Dynamic Scene Analysis:
• Image filtering and enhancement
• Tracking
• Feature detection
• Establish feature correspondence
• Motion parameter estimation (local and global)
• Structure estimation (optional)
• Predict occurrence in the next frame

TRACKING in a Dynamic Scene

Since ∆T is small, the amount of motion/displacement of objects between two successive frames is also small. At 25 frames per second (fps), ∆T = 40 ms; at 50 fps, ∆T = 20 ms. [Figure: stack of frames F1, F2, ..., FN.]

Image Motion

Image change is measured by the difference equation

$$f_d(x_1, x_2, t_i, t_j) = f(x_1, x_2, t_i) - f(x_1, x_2, t_j) = f(t_i) - f(t_j) = f_i - f_j.$$

Accumulated difference image:

$$f_T(X, t_n) = f_d(X, t_{n-1}, t_n) - f_T(X, t_{n-1}), \quad n \ge 3, \qquad \text{where } f_T(X, t_2) = f_d(X, t_2, t_1).$$

Moving edge (or feature) detector:

$$F_{mov\_feat}(X, t_1, t_2) = \frac{\partial f}{\partial X} \cdot f_d(X, t_1, t_2).$$

Recent methods include background and foreground modeling. [Figure: frames F1 and F2, their difference F1 − F2 and abs(F1 − F2); an input video frame, the segmented video frame, and the moving object extracted using an alpha matte.]

There are two categories of visual tracking algorithms: Target Representation and Localization, and Filtering and Data Association.

A. Some common Target Representation and Localization algorithms:
• Blob tracking: segmentation of the object interior (for example blob detection, block-based correlation, or optical flow).
• Kernel-based (mean-shift) tracking: an iterative localization procedure based on the maximization of a similarity measure (Bhattacharyya coefficient).
• Contour tracking: detection of the object boundary (e.g., active contours or the Condensation algorithm).
• Visual feature matching: registration.

B. Some common Filtering and Data Association algorithms:
• Kalman filter: an optimal recursive Bayesian filter for linear functions and Gaussian noise.
• Particle filter: useful for sampling the underlying state-space distribution of non-linear and non-Gaussian processes.

Also see: match moving; motion capture; SwisTrack.

BLOB Detection – approaches used:
• Corner detectors (Harris, Shi & Tomasi, SUSAN, level-curve curvature, FAST, etc.)
• LoG, DoG, DoH (determinant of Hessian), ridge detectors, scale-space, pyramids
• Hessian-affine, SIFT (Scale-Invariant Feature Transform)
• SURF (Speeded-Up Robust Features)
• GLOH (Gradient Location and Orientation Histogram)
• LESH (Local Energy-based Shape Histogram)

Complexities and issues in tracking: one needs to overcome difficulties that arise from noise, occlusion, clutter, moving cameras, multiple moving objects, and changes in the foreground objects or in the background environment.

DoH is the scale-normalized determinant of the Hessian, also referred to as the Monge–Ampère operator,

$$\det H_{norm} L = t^2 \,(L_{xx} L_{yy} - L_{xy}^2),$$

where $HL$ denotes the Hessian matrix of $L$. By detecting the scale-space maxima/minima of this operator, one obtains another straightforward differential blob detector with automatic scale selection, which also responds to saddles (a code sketch follows below).

Hessian-affine: SURF is based on a set of 2-D Haar wavelets and implements DoH.
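As a concrete illustration of the DoH operator above, here is a minimal SciPy sketch of a blob detector. The function names, scales, and threshold are illustrative assumptions, not values prescribed in the notes:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def doh_response(img, sigma):
    """Scale-normalized determinant of the Hessian: t^2 (Lxx Lyy - Lxy^2), t = sigma^2."""
    Lyy = gaussian_filter(img, sigma, order=(2, 0))  # second derivative along rows
    Lxx = gaussian_filter(img, sigma, order=(0, 2))  # second derivative along columns
    Lxy = gaussian_filter(img, sigma, order=(1, 1))  # mixed derivative
    return sigma**4 * (Lxx * Lyy - Lxy**2)

def detect_blobs(img, sigmas=(2, 4, 8, 16), thresh=1e-2):
    """Blobs as local maxima of the DoH response over space and scale (minima analogous)."""
    img = img.astype(float) / 255.0          # normalize so the threshold is scale-free
    stack = np.stack([doh_response(img, s) for s in sigmas])
    peaks = (stack == maximum_filter(stack, size=3)) & (stack > thresh)
    scales, rows, cols = np.nonzero(peaks)
    return [(r, c, sigmas[s]) for s, r, c in zip(scales, rows, cols)]
```

Each returned triple is a blob center with the scale at which the normalized operator peaked, which is the automatic scale selection mentioned above.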
SIFT consists of four major steps:
1. Scale-space extrema detection at various scales
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor

(Only three of the steps, 1, 3 and 4, are illustrated in the slides.) The histograms contain 8 bins each, and each descriptor contains a 4 × 4 array of histograms around the keypoint. This leads to a SIFT feature vector with 4 × 4 × 8 = 128 elements. [Figure: e.g., a smaller 2 × 2 × 8 layout.]

GLOH (Gradient Location and Orientation Histogram) is an extension of the SIFT descriptor designed to increase its robustness and distinctiveness. The SIFT descriptor is computed for a log-polar location grid with 3 bins in the radial direction (the radii set to 6, 11 and 15) and 8 bins in the angular direction, which results in 17 location bins; note that the central bin is not divided in angular directions. The gradient orientations are quantized in 16 bins. This gives a 272-bin histogram. The size of this descriptor is reduced with PCA. The covariance matrix for PCA is estimated on 47,000 image patches collected from various images (see section III-C.1 of the evaluation paper by Mikolajczyk and Schmid). The 128 largest eigenvectors are used for description. In short, GLOH extends SIFT by changing the location grid and using PCA to reduce the descriptor size.
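For reference, the full detect-and-describe pipeline (steps 1–4) is available in OpenCV. A minimal sketch, where the image filename is a placeholder:

```python
import cv2

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder input image
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each descriptor row is the 4 x 4 x 8 = 128-element vector described above.
print(len(keypoints), descriptors.shape)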
Motion Equations

Models are derived from mechanics: Euler's or Newton's equations. For a rigid object, a point $P = (x, y, z)$ at time $t_1$ moves to $P' = (x', y', z')$ at time $t_2$ according to

$$P' = R \cdot P + T, \qquad R = [m_{ij}]_{3 \times 3}, \qquad T = [\delta x \;\; \delta y \;\; \delta z]^T,$$

where, for a rotation by angle $\theta$ about an axis with direction cosines $(n_1, n_2, n_3)$, $n_1^2 + n_2^2 + n_3^2 = 1$:

$$\begin{aligned}
m_{11} &= n_1^2 + (1 - n_1^2)\cos\theta, & m_{12} &= n_1 n_2 (1 - \cos\theta) - n_3 \sin\theta, & m_{13} &= n_1 n_3 (1 - \cos\theta) + n_2 \sin\theta, \\
m_{21} &= n_1 n_2 (1 - \cos\theta) + n_3 \sin\theta, & m_{22} &= n_2^2 + (1 - n_2^2)\cos\theta, & m_{23} &= n_2 n_3 (1 - \cos\theta) - n_1 \sin\theta, \\
m_{31} &= n_1 n_3 (1 - \cos\theta) - n_2 \sin\theta, & m_{32} &= n_2 n_3 (1 - \cos\theta) + n_1 \sin\theta, & m_{33} &= n_3^2 + (1 - n_3^2)\cos\theta.
\end{aligned}$$

Observation in the image plane (perspective projection with focal length $F$):

$$X = Fx/z, \quad Y = Fy/z; \qquad X' = Fx'/z', \quad Y' = Fy'/z'; \qquad U = X' - X, \quad V = Y' - Y.$$

Mathematically (for any two successive frames), the problem is: given $(X, Y)$ and $(X', Y')$, estimate $n_1, n_2, n_3, \theta, \delta x, \delta y, \delta z$.

First look at the problem of estimating the motion parameters using 3-D knowledge only. Given only three non-linear equations, you have to obtain seven parameters; a few more constraints, and maybe assumptions too, are needed. Since ∆T is small, θ must also be small (in radians). Thus R simplifies (reduces) to

$$R \xrightarrow{\;\theta \to 0\;} \begin{bmatrix} 1 & -n_3\theta & n_2\theta \\ n_3\theta & 1 & -n_1\theta \\ -n_2\theta & n_1\theta & 1 \end{bmatrix} = \begin{bmatrix} 1 & -\phi_3 & \phi_2 \\ \phi_3 & 1 & -\phi_1 \\ -\phi_2 & \phi_1 & 1 \end{bmatrix}, \qquad \phi_1^2 + \phi_2^2 + \phi_3^2 = \theta^2.$$

The problem is now linearized: given three linear equations, you have to obtain six parameters (evaluate R and T). Solution? Take two point correspondences,

$$P_1' = R P_1 + T, \qquad P_2' = R P_2 + T,$$

and subtract one from the other, which eliminates the translation component:

$$(P_1' - P_2') = R\,(P_1 - P_2),$$

that is,

$$\begin{bmatrix} \Delta x'_{12} \\ \Delta y'_{12} \\ \Delta z'_{12} \end{bmatrix} = \begin{bmatrix} 1 & -\phi_3 & \phi_2 \\ \phi_3 & 1 & -\phi_1 \\ -\phi_2 & \phi_1 & 1 \end{bmatrix} \begin{bmatrix} \Delta x_{12} \\ \Delta y_{12} \\ \Delta z_{12} \end{bmatrix}, \qquad \text{solve for } \Phi = \begin{bmatrix} \phi_1 \\ \phi_2 \\ \phi_3 \end{bmatrix},$$

where $\Delta x_{12} = x_1 - x_2$, $\Delta x'_{12} = x'_1 - x'_2$, and so on for $y$ and $z$. Re-arranging gives

$$[\nabla_{12}]\,\Phi = \Delta^2_{12}, \qquad \nabla_{12} = \begin{bmatrix} 0 & \Delta z_{12} & -\Delta y_{12} \\ -\Delta z_{12} & 0 & \Delta x_{12} \\ \Delta y_{12} & -\Delta x_{12} & 0 \end{bmatrix}, \qquad \Delta^2_{12} = \begin{bmatrix} \Delta x'_{12} - \Delta x_{12} \\ \Delta y'_{12} - \Delta y_{12} \\ \Delta z'_{12} - \Delta z_{12} \end{bmatrix}.$$

$\nabla_{12}$ is a skew-symmetric matrix, so $\det \nabla_{12} = 0$ (interpret why this is so: every odd-order skew-symmetric matrix is singular). So what to do? Contact a mathematician?

Take two (one pair) more point correspondences, writing the same equations with the rows cyclically permuted:

$$[\nabla_{34}]\,\Phi = \Delta^2_{34}, \qquad \nabla_{34} = \begin{bmatrix} -\Delta z_{34} & 0 & \Delta x_{34} \\ \Delta y_{34} & -\Delta x_{34} & 0 \\ 0 & \Delta z_{34} & -\Delta y_{34} \end{bmatrix}, \qquad \Delta^2_{34} = \begin{bmatrix} \Delta y'_{34} - \Delta y_{34} \\ \Delta z'_{34} - \Delta z_{34} \\ \Delta x'_{34} - \Delta x_{34} \end{bmatrix}.$$

Using the two pairs (four points) of correspondences and adding:

$$[\nabla_{12} + \nabla_{34}]\,\Phi = [\Delta^2_{12} + \Delta^2_{34}] \;\Rightarrow\; [\nabla_{1234}]\,\Phi = [\Delta^2_{1234}],$$

with

$$\nabla_{1234} = \begin{bmatrix} -\Delta z_{34} & \Delta z_{12} & \Delta x_{34} - \Delta y_{12} \\ \Delta y_{34} - \Delta z_{12} & -\Delta x_{34} & \Delta x_{12} \\ \Delta y_{12} & \Delta z_{34} - \Delta x_{12} & -\Delta y_{34} \end{bmatrix}, \qquad \Delta^2_{1234} = \begin{bmatrix} \Delta y'_{34} - \Delta y_{34} + \Delta x'_{12} - \Delta x_{12} \\ \Delta z'_{34} - \Delta z_{34} + \Delta y'_{12} - \Delta y_{12} \\ \Delta x'_{34} - \Delta x_{34} + \Delta z'_{12} - \Delta z_{12} \end{bmatrix}.$$

The condition for the existence of a unique solution is based on a geometrical relationship among the coordinates of the four points in space at time $t_1$: find the geometric condition under which $\det \nabla_{1234} = 0$. This solution is often used as an initial guess for the final estimate of the motion parameters. (A numerical sketch of this two-pair solve is given at the end of this section.)

OPTICAL FLOW

A point in 3-D space, in homogeneous coordinates:

$$X_0 = [k x_o \;\; k y_o \;\; k z_o \;\; k]^T, \qquad k \neq 0 \text{ an arbitrary constant},$$

with image point $X_i = [w x_i \;\; w y_i \;\; w]^T$, where $X_i = P X_0$. Assuming a normalized focal length, $f = 1$:

$$\begin{bmatrix} x_i \\ y_i \end{bmatrix} = \begin{bmatrix} x_o / z_o \\ y_o / z_o \end{bmatrix}. \quad (1)$$

Assuming a linear motion model (no acceleration) between successive frames:

$$\begin{bmatrix} x_o(t) \\ y_o(t) \\ z_o(t) \end{bmatrix} = \begin{bmatrix} x_o + ut \\ y_o + vt \\ z_o + wt \end{bmatrix}. \quad (2)$$

Combining equations (1) and (2):

$$\begin{bmatrix} x_i(t) \\ y_i(t) \end{bmatrix} = \begin{bmatrix} (x_o + ut)/(z_o + wt) \\ (y_o + vt)/(z_o + wt) \end{bmatrix}. \quad (3)$$

Assume in equation (3) that $w$ (= dz/dt) $< 0$. In that case, the object (or points on the object) will appear to come closer to you. Now visualize: where do these points appear to come out from?

$$\lim_{t \to -\infty} \begin{bmatrix} x_i(t) \\ y_i(t) \end{bmatrix} = \begin{bmatrix} u/w \\ v/w \end{bmatrix} = e. \quad (4)$$

This point "e" is a point in the image plane known as the FOE (Focus of Expansion): the motion of the object points appears to emanate from this fixed point.
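The two-pair linear solve derived above fits in a few lines of NumPy. This is a minimal sketch assuming noise-free 3-D correspondences and a small rotation angle; the function names are hypothetical:

```python
import numpy as np

def grad_matrix(d):
    """Skew-symmetric matrix N with N @ phi = (R - I) @ d for small rotations."""
    dx, dy, dz = d
    return np.array([[0.0,  dz, -dy],
                     [-dz, 0.0,  dx],
                     [ dy, -dx, 0.0]])

def estimate_motion(P1, P2, P3, P4, Q1, Q2, Q3, Q4):
    """Small-angle motion from two pairs of 3-D correspondences Pi (t1) -> Qi (t2)."""
    d12, d34 = P1 - P2, P3 - P4                  # baselines at t1
    b12 = (Q1 - Q2) - d12                        # Delta^2_12, rows in (x, y, z) order
    b34 = np.roll((Q3 - Q4) - d34, -1)           # Delta^2_34, rows permuted to (y, z, x)
    N = grad_matrix(d12) + np.roll(grad_matrix(d34), -1, axis=0)   # nabla_1234
    phi = np.linalg.solve(N, b12 + b34)          # fails iff det(nabla_1234) = 0
    R = np.eye(3) + np.array([[0.0, -phi[2], phi[1]],
                              [phi[2], 0.0, -phi[0]],
                              [-phi[1], phi[0], 0.0]])
    T = Q1 - R @ P1                              # translation from any one point
    return phi, R, T
```

The row permutation applied to the second pair is exactly what makes ∇12 + ∇34 generically non-singular, even though each ∇ alone has zero determinant.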
Recommended publications
  • Scale Invariant Interest Points with Shearlets
Scale Invariant Interest Points with Shearlets. Miguel A. Duval-Poo¹, Nicoletta Noceti¹, Francesca Odone¹, and Ernesto De Vito². ¹Dipartimento di Informatica Bioingegneria Robotica e Ingegneria dei Sistemi (DIBRIS), Università degli Studi di Genova, Italy. ²Dipartimento di Matematica (DIMA), Università degli Studi di Genova, Italy.

Abstract: Shearlets are a relatively new directional multi-scale framework for signal analysis, which have been shown effective to enhance signal discontinuities such as edges and corners at multiple scales. In this work we address the problem of detecting and describing blob-like features in the shearlets framework. We derive a measure which is very effective for blob detection and closely related to the Laplacian of Gaussian. We demonstrate the measure satisfies the perfect scale invariance property in the continuous case. In the discrete setting, we derive algorithms for blob detection and keypoint description. Finally, we provide qualitative justifications of our findings as well as a quantitative evaluation on benchmark data. We also report an experimental evidence that our method is very suitable to deal with compressed and noisy images, thanks to the sparsity property of shearlets.

1 Introduction. Feature detection consists in the extraction of perceptually interesting low-level features over an image, in preparation of higher level processing tasks. In the last decade a considerable amount of work has been devoted to the design of effective and efficient local feature detectors able to associate with a given interesting point also scale and orientation information. Scale-space theory has been one of the main sources of inspiration for this line of research, providing an effective framework for detecting features at multiple scales and, to some extent, to devise scale invariant image descriptors.
  • Hough Transform, Descriptors. Tammy Riklin Raviv, Electrical and Computer Engineering, Ben-Gurion University of the Negev
DIGITAL IMAGE PROCESSING, Lecture 7: Hough transform, descriptors. Tammy Riklin Raviv, Electrical and Computer Engineering, Ben-Gurion University of the Negev.

Hough transform. [Figure: points in (x, y) image space and their accumulated votes in (m, b) parameter space. Slide from S. Savarese.] Issues: the parameter space [m, b] is unbounded, and vertical lines have infinite gradient. Use a polar representation for the parameter space instead: x cos θ + y sin θ = r, giving a Hough space in (r, θ). Each point votes for a complete family of potential lines; each pencil of lines sweeps out a sinusoid in Hough space, and their intersection provides the desired line equation.

Hough transform experiments: image features vote into a (ρ, θ) model-parameter histogram. With noisy data one needs to adjust the grid size or smooth the accumulator; uniform noise produces spurious peaks.

Hough Transform Algorithm: 1. Image → Canny. 2. Canny → Hough votes. 3. Hough votes → edges (find peaks and post-process). Hough transform example: http://ostatic.com/files/images/ss_hough.jpg

Incorporating image gradients: recall that when we detect an edge point, we also know its gradient direction. But this means that the line is uniquely determined! Modified Hough transform: for each edge point (x, y): θ = gradient orientation at (x, y); ρ = x cos θ + y sin θ; H(θ, ρ) = H(θ, ρ) + 1. Finding lines using the Hough transform.
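A minimal NumPy sketch of the gradient-directed voting loop just described; the accumulator sizes and input arrays are hypothetical:

```python
import numpy as np

def hough_votes(edge_xy, grad_theta, rho_max, n_rho=200, n_theta=180):
    """Accumulate H(theta, rho): one vote per edge point, with theta
    fixed by the local gradient orientation."""
    H = np.zeros((n_theta, n_rho))
    for (x, y), theta in zip(edge_xy, grad_theta):
        rho = x * np.cos(theta) + y * np.sin(theta)
        t = int((theta % np.pi) / np.pi * n_theta) % n_theta
        r = int((rho + rho_max) / (2 * rho_max) * n_rho)
        if 0 <= r < n_rho:
            H[t, r] += 1
    return H
```

Peaks of H then correspond to detected lines, subject to the grid-size and smoothing caveats noted above.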
  • A Performance Evaluation of Local Descriptors
A Performance Evaluation of Local Descriptors. Krystian Mikolajczyk (Dept. of Engineering Science, University of Oxford, Oxford OX1 3PJ, United Kingdom, [email protected]) and Cordelia Schmid (INRIA Rhône-Alpes, 655 av. de l'Europe, 38330 Montbonnot, France, [email protected]).

Abstract: In this paper we compare the performance of descriptors computed for local interest regions, as for example extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [3], steerable filters [12], PCA-SIFT [19], differential invariants [20], spin images [21], SIFT [26], complex filters [37], moment invariants [43], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor, and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low-dimensional descriptors.

Index Terms: local descriptors, interest points, interest regions, invariance, matching, recognition.

I. INTRODUCTION. Local photometric descriptors computed for interest regions have proved to be very successful in applications such as wide baseline matching [37, 42], object recognition [10, 25], texture … Corresponding author is K.
  • View Matching with Blob Features
View Matching with Blob Features. Per-Erik Forssén and Anders Moe, Computer Vision Laboratory, Department of Electrical Engineering, Linköping University, Sweden.

Abstract: This paper introduces a new region based feature for object recognition and image matching. In contrast to many other region based features, this one makes use of colour in the feature extraction stage. We perform experiments on the repeatability rate of the features across scale and inclination angle changes, and show that avoiding to merge regions connected by only a few pixels improves the repeatability. Finally we introduce two voting schemes that allow us to find correspondences automatically, and compare them with respect to the number of valid correspondences they give, and their inlier ratios.

1. Introduction: …mographic transformation that is not part of the affine transformation. This means that our results are relevant for both wide baseline matching and 3D object recognition. A somewhat related approach to region based matching is presented in [1], where tangents to regions are used to define linear constraints on the homography transformation, which can then be found through linear programming. The connection is that a line conic describes the set of all tangents to a region. By matching conics, we thus implicitly match the tangents of the ellipse-approximated regions.

2. Blob features: We will make use of blob features extracted using a clustering pyramid built using robust estimation in local image regions [2]. Each extracted blob is represented by its average colour pk, area ak, centroid mk, and inertia matrix Ik. I.e.
  • 1 Introduction
1: Introduction. Ahmad Humayun.

1 Introduction

• A pixel is not a square (or rectangle); it is simply a point sample. There are cases where the contributions to a pixel can be modeled, in a low-order way, by a little square, but never the pixel itself. The sampling theorem tells us that we can reconstruct a continuous entity from such a discrete entity using an appropriate reconstruction filter, for example a truncated Gaussian.

• Cameras are approximated by pinhole cameras. When the pinhole is too big, many directions are averaged and blur the image. When the pinhole is too small, diffraction effects blur the image.

• A lens system is there in a camera to correct for the many optical aberrations; these are, basically, aberrations where light from one point of an object does not converge into a single point after passing through the lens system. Optical aberrations fall into two categories: monochromatic (caused by the geometry of the lens and occurring both when light is reflected and when it is refracted, like pincushion distortion etc.) and chromatic (caused by dispersion, the variation of a lens's refractive index with wavelength). Lenses essentially help remove the shortfalls of a pinhole camera, i.e. how to admit more light and increase the resolution of the imaging system at the same time. With a simple lens, much more light can be brought into sharp focus. Different problems arise with lens systems: 1. Spherical aberration: a perfect lens focuses all incoming rays to a point on the optic axis, but a real lens with spherical surfaces suffers from spherical aberration: it focuses rays more tightly if they enter it far from the optic axis than if they enter closer to the axis.
  • Feature Detection Florian Stimberg
Feature Detection. Florian Stimberg.

Outline: introduction; types of image features (edges, corners, ridges, blobs); SIFT (scale-space extrema detection, keypoint localization, orientation assignment, keypoint descriptor); resources.

Introduction:
● Image features are distinct image parts.
● Feature detection is often the first operation, defining the image parts to process later.
● Finding corresponding features in pictures is necessary for object recognition.
● Multi-image panoramas need to be "stitched" together at corresponding image features.
● The description of a feature is as important as the extraction.

Types of image features:
● Edges: sharp changes in brightness; most algorithms use the first derivative of the intensity; different methods exist (one or two thresholds, etc.).
● Corners: points with two dominant and different edge directions in their neighbourhood. Moravec detector: the similarity between the patch around a pixel and overlapping patches in the neighbourhood is measured; low similarity in all directions indicates a corner.
● Ridges: curves whose points are local maxima or minima in at least one dimension; quality depends highly on the scale; used to detect roads in aerial images or veins in 3D magnetic resonance images.
● Blobs: points or regions brighter or darker than the surrounding; SIFT uses a method for blob detection.

SIFT = Scale-Invariant Feature Transform, first published in 1999 by David Lowe: a method to extract and describe distinctive image features. …
  • EECS 442 Computer Vision: Homework 2
EECS 442 Computer Vision: Homework 2

Instructions
• This homework is due at 11:59:59 p.m. on Friday October 18th, 2019.
• The submission includes two parts:
1. To Gradescope: a pdf file as your write-up, including your answers to all the questions and key choices you made. You might like to combine several files to make a submission. Here is an example online link for combining multiple PDF files: https://combinepdf.com/.
2. To Canvas: a zip file including all your code. The zip file should follow the HW formatting instructions on the class website.
• The write-up must be an electronic version. No handwriting, including plotting questions. LaTeX is recommended but not mandatory.

Python Environment. We are using Python 3.7 for this course. You can find references for the Python standard library here: https://docs.python.org/3.7/library/index.html. To make your life easier, we recommend you to install Anaconda 5.2 for Python 3.7.x (https://www.anaconda.com/download/). This is a Python package manager that includes most of the modules you need for this course. We will make use of the following packages extensively in this course:
• Numpy (https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)
• OpenCV (https://opencv.org/)
• SciPy (https://scipy.org/)
• Matplotlib (http://matplotlib.org/users/pyplot_tutorial.html)

1 Image Filtering [50 pts]. In this first section, you will explore different ways to filter images. Through these tasks you will build up a toolkit of image filtering techniques. By the end of this problem, you should understand how the development of image filtering techniques has led to convolution.
  • Image Analysis. Feature Extraction: Corners and Blobs
Image Analysis. Feature extraction: corners and blobs. Christophoros Nikou, [email protected], University of Ioannina, Department of Computer Science. Images taken from: the Computer Vision course by Svetlana Lazebnik, University of North Carolina at Chapel Hill (http://www.cs.unc.edu/~lazebnik/spring10/); D. Forsyth and J. Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2003; M. Nixon and A. Aguado, Feature Extraction and Image Processing, Academic Press, 2010.

Motivation: Panorama Stitching. Step 1: feature extraction. Step 2: feature matching. Step 3: image alignment.

Characteristics of good features:
• Repeatability: the same feature can be found in several images despite geometric and photometric transformations.
• Saliency: each feature has a distinctive description.
• Compactness and efficiency: many fewer features than image pixels.
• Locality: a feature occupies a relatively small area of the image; robust to clutter and occlusion.

Applications. Feature points are used for: motion tracking, image alignment, 3D reconstruction, object recognition, image indexing and retrieval, and robot navigation.

Image Curvature: extends the notion of edges as the rate of change in edge direction. Points where the edge direction changes rapidly are characterized as corners. We need some elements from differential geometry. Parametric form of a planar curve: C(t) = [x(t), y(t)] describes the points in a continuous curve as the endpoints of a position vector.
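One standard detector for such corner points is the Harris response; a minimal OpenCV sketch (the input filename and threshold are illustrative, and Harris is one choice among the corner detectors these notes cover):

```python
import cv2
import numpy as np

img = cv2.imread("building.png", cv2.IMREAD_GRAYSCALE)  # placeholder input
# Harris response from the local gradient covariance M: det(M) - k * trace(M)^2
response = cv2.cornerHarris(np.float32(img), blockSize=2, ksize=3, k=0.04)
corners = np.argwhere(response > 0.01 * response.max())  # (row, col) candidates
```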
  • Detection and Evaluation Methods for Local Image and Video Features. Julian Stöttinger
Technical Report CVL-TR-4. Detection and Evaluation Methods for Local Image and Video Features. Julian Stöttinger, Computer Vision Lab, Institute of Computer Aided Automation, Vienna University of Technology. 2 March 2011.

Abstract: In computer vision, local image descriptors computed in areas around salient interest points are the state-of-the-art in visual matching. This technical report aims at finding more stable and more informative interest points in the domain of images and videos. The research interest is the development of relevant evaluation methods for visual matching approaches. The contribution of this work lies on one hand in the introduction of new features to the computer vision community. On the other hand, there is a strong demand for valid evaluation methods and approaches gaining new insights for general recognition tasks. This work presents research in the detection of local features both in the spatial ("2D" or image) domain as well as for spatio-temporal ("3D" or video) features. For state-of-the-art classification the extraction of discriminative interest points has an impact on the final classification performance. It is crucial to find which interest points are of use in a specific task. One question is for example whether it is possible to reduce the number of interest points extracted while still obtaining state-of-the-art image retrieval or object recognition results. This would gain a significant reduction in processing time and would possibly allow for new applications e.g. in the domain of mobile computing. Therefore, the work investigates different corner detection approaches and evaluates their repeatability under varying alterations.
  • Edges and Binary Image Analysis
Edges and Binary Image Analysis. Thurs Jan 26, Kristen Grauman, UT Austin.

Today: edge detection and matching (process the image gradient to find curves/contours; comparing contours); binary image analysis (blobs and regions).

Gradients -> edges. Primary edge detection steps: 1. Smoothing: suppress noise. 2. Edge enhancement: filter for contrast. 3. Edge localization: determine which local maxima from the filter output are actually edges vs. noise (threshold, thin).

Thresholding: choose a threshold value t; set any pixels less than t to zero (off) and any pixels greater than or equal to t to one (on). [Figures: original image; gradient magnitude image; the gradient thresholded with a lower threshold; the gradient thresholded with a higher threshold.]

Canny edge detector:
• Filter the image with a derivative of Gaussian.
• Find the magnitude and orientation of the gradient.
• Non-maximum suppression: thin wide "ridges" down to single-pixel width.
• Linking and thresholding (hysteresis): define two thresholds, low and high; use the high threshold to start edge curves and the low threshold to continue them.
• MATLAB: edge(image, 'canny'); >> help edge. (Source: D. Lowe, L. Fei-Fei.)
[Figures: original image (Lena); norm of the gradient; thresholding; non-maximum suppression. Slide credit: Steve Seitz.]

How to turn the thick thresholded regions of the gradient into curves? Non-maximum suppression: check if a pixel is a local maximum along the gradient direction, selecting the single maximum across the width of the edge; this requires checking the interpolated pixels p and r. Problem: pixels along an edge may not survive the thresholding and thinning (non-maximum suppression). (Credit: James Hays.)

Hysteresis thresholding: use a high threshold to start edge curves, and a low threshold to continue them.
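An OpenCV equivalent of the MATLAB call above, with the two hysteresis thresholds made explicit (the filename and threshold values are placeholders):

```python
import cv2

img = cv2.imread("lena.png", cv2.IMREAD_GRAYSCALE)  # placeholder input
# threshold2 (high) starts edge curves; threshold1 (low) continues them
edges = cv2.Canny(img, threshold1=100, threshold2=200)
cv2.imwrite("edges.png", edges)
```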
  • Local Descriptors
Local descriptors

SIFT (Scale Invariant Feature Transform) descriptor
• SIFT keypoints at location (x, y) and scale σ have been obtained according to a procedure that guarantees illumination and scale invariance. By assigning a consistent orientation, the SIFT keypoint descriptor can also be made orientation invariant.
• The SIFT descriptor is therefore obtained from the following steps:
– Determine location and scale by maximizing the DoG in scale and in space (i.e. find the blurred image of closest scale).
– Sample the points around the keypoint and use the keypoint's local orientation as the dominant gradient direction. Use this scale and orientation to make all further computations invariant to scale and rotation (i.e. rotate the gradients and coordinates by the dominant orientation).
– Separate the region around the keypoint into subregions and compute an 8-bin gradient orientation histogram of each subregion, weighting samples with a σ = 1.5 Gaussian.

SIFT aggregation window
• In order to derive a descriptor for the keypoint region we could sample intensities around the keypoint, but they are sensitive to lighting changes and to slight errors in x, y, θ. In order to make the keypoint estimate more reliable, it is usually preferable to use a larger aggregation window (i.e. Gaussian kernel size) than the detection window. The SIFT descriptor is hence obtained from thresholded image gradients sampled over a 16x16 array of locations in the neighbourhood of the detected keypoint, using the level of the Gaussian pyramid at which the keypoint was detected. For each 4x4 region, samples are accumulated into a gradient orientation histogram with 8 sampled orientations.
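To make the aggregation step concrete, here is a minimal NumPy sketch of one magnitude-weighted 8-bin orientation histogram for a single 4x4 subregion. The helper name and inputs are hypothetical, and a reference SIFT implementation would additionally apply the Gaussian weighting and interpolation described above:

```python
import numpy as np

def subregion_histogram(mag, theta, bins=8):
    """Magnitude-weighted gradient orientation histogram of one 4x4 subregion.
    mag, theta: 4x4 arrays of gradient magnitudes and orientations in [0, 2*pi)."""
    idx = np.floor(theta / (2 * np.pi) * bins).astype(int) % bins
    hist = np.zeros(bins)
    np.add.at(hist, idx.ravel(), mag.ravel())  # accumulate weighted votes
    return hist

# The full descriptor concatenates 4 x 4 such histograms over the 16x16
# neighbourhood, giving the 128-element SIFT vector.
```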
  • Local Feature Detectors and Descriptors Overview
CS 6350 – COMPUTER VISION. Local Feature Detectors and Descriptors.

Overview: local invariant features; keypoint localization (Hessian detector, Harris corner detector); scale-invariant region detection (Laplacian of Gaussian (LOG) detector, Difference of Gaussian (DOG) detector); local feature descriptors (Scale Invariant Feature Transform (SIFT), Gradient Localization Oriented Histogram (GLOH)); examples of other local feature descriptors.

Motivation: a global feature from the whole image is often not desirable; instead, match local regions which are prominent to the object or scene in the image. Application areas: object detection, image matching, image stitching.

Requirements of a local feature:
• Repetitive: detect the same points independently in each image.
• Invariant to translation, rotation, scale.
• Invariant to affine transformation.
• Invariant to presence of noise, blur etc.
• Locality: robust to occlusion, clutter and illumination change.
• Distinctiveness: the region should contain "interesting" structure.
• Quantity: there should be enough points to represent the image.
• Time efficient.
Others preferable (but not a must): robustness to disturbances, attacks, noise, image blur, discretization errors, compression artifacts, and deviations from the mathematical model (non-linearities, non-planarities, etc.), as well as to intra-class variations.

General approach: 1. Find the interest points. 2. Consider the region around each keypoint. 3. Compute a local descriptor from the region and normalize the feature. [Figure: interest points marked "+" on an image with their local descriptor regions.]
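A minimal OpenCV sketch of this general approach, pairing detection and description with a ratio-test matching stage (the detector choice, filenames, and the 0.75 ratio are illustrative assumptions):

```python
import cv2

img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)  # placeholder inputs
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)  # steps 1-3: detect, describe, normalize
kp2, des2 = sift.detectAndCompute(img2, None)

# Keep only matches that pass Lowe's ratio test
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
```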