Visual Search for Objects with Straight Lines

by SIMON HAIG MELIKIAN

Submitted for the degree of Doctor of Philosophy
Case School of Engineering, Electrical Engineering and Computer Science
Thesis Adviser: Prof. Dr. Christos Papachristou
January, 2006

CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES

We hereby approve the dissertation of ________ candidate for the Ph.D. degree.*
(signed) ________ (chair of the committee)
(date) ________
*We also certify that written approval has been obtained for any proprietary material contained therein.

Copyright © 2006 by Simon Haig Melikian. ALL RIGHTS RESERVED

Abstract

I present a new method of visual search for objects that include straight lines. This is usually the case for machine-made objects. I describe existing machine vision search methods and show how my method of visual search gives better performance on objects that have straight lines. Inspired by human vision, the method uses a two-step process. First, straight line segments are detected in an image and characterized by their length, midpoint location, and orientation. Second, hypotheses that a particular straight line segment belongs to a known object are generated and tested. The set of hypotheses is constrained by spatial relationships in the known objects. I discuss the implementation of my method and its performance and limitations in real and synthetic images. The speed and robustness of my method make it immediately applicable to many machine vision problems.
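The two steps named in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the names `Segment`, `describe_segment`, and `hypothesize_pose` are hypothetical, the end-point input format is an assumption, and the hypothesis step is reduced to the scale and rotation implied by pairing one model segment with one image segment.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A straight line segment summarized by length, midpoint, and orientation."""
    length: float
    mid_x: float
    mid_y: float
    angle: float  # orientation in radians, folded into [0, pi)


def describe_segment(x1, y1, x2, y2):
    """Summarize a segment from its two end points (hypothetical helper)."""
    dx, dy = x2 - x1, y2 - y1
    # An undirected line has no preferred direction, so fold the angle into [0, pi).
    angle = math.atan2(dy, dx) % math.pi
    return Segment(math.hypot(dx, dy), (x1 + x2) / 2.0, (y1 + y2) / 2.0, angle)


def hypothesize_pose(model, image):
    """Return the (scale, rotation) that would map a model segment onto an image segment.

    Assumption for illustration: an undirected segment also matches at a
    rotation 180 degrees away; only the rotation in [0, pi) is returned.
    """
    scale = image.length / model.length
    rotation = (image.angle - model.angle) % math.pi
    return scale, rotation


if __name__ == "__main__":
    model = describe_segment(0, 0, 1, 0)   # unit-length horizontal model line
    found = describe_segment(0, 0, 0, 2)   # vertical image line, twice as long
    print(hypothesize_pose(model, found))
```

Each such pairing yields one candidate pose. In a full search, a candidate would then be tested against the remaining model lines, which is where the spatial constraints mentioned in the abstract would prune the hypothesis set.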
Table of Contents

[...]’S CONTRIBUTIONS
1.2 INTRODUCTION
1.3 SCOPE OF THIS THESIS
CHAPTER 2
2.1 A LOOK AT HUMAN VISION
  2.1.1 Modular Systems for Features and Hypotheses
  2.1.2 Are Straight Lines Salient Features?
2.2 PRIOR MACHINE VISION WORK
  2.2.1 Binary Search Methods
  2.2.2 Normalized Grayscale Correlation
  2.2.3 Geometric Based Search and Recognition
  2.2.4 Contour Based Search
  2.2.5 Affine Invariant Constellation Based Recognition
    2.2.5.1 Corner Based
    2.2.5.2 Salient Icons
    2.2.5.3 Scale Invariant Feature Transform
CHAPTER 3
3.1 VISUAL SEARCH WITH STRAIGHT LINES
  3.1.1 Search Constraints for Machine Vision
  3.1.2 Using Straight Lines as Icons
  3.1.3 Search with Lines
  3.1.4 The Cost of Hypothesis Generation
  3.1.5 The Cost of Verification
  3.1.6 Gradient Angle of a Line
  3.1.7 The Number of Reference Lines Needed for Robust Search
3.2 ABSTRACT LOOK AT VISUAL SEARCH WITH STRAIGHT LINES
CHAPTER 4
4.1 CURVATURE BASED STRAIGHT LINE EXTRACTION (CBSLE)
  4.1.1 Effect of Span Value
4.2 SPLIT AND MERGE METHOD FOR STRAIGHT LINE EXTRACTION
4.3 LINE EXTRACTION PERFORMANCE
  4.3.1 End Points Position Accuracy
    4.3.1.1 Effect of Object Size on End Point Accuracy
    4.3.1.2 Effect of Noise on End Point Accuracy
[...]
  [...] Speed vs. Background Straight Lines
  6.1.2 Speed vs. Target Position
  6.1.3 Speed vs. Angle
  6.1.4 Speed vs. Scale
  6.1.5 Speed vs. Scale and Angle
6.2 REAL WORLD TESTING
6.3 FEASIBILITY OF TEACHING WITH SYNTHETIC MODEL
CONCLUSION
APPENDIX 1. COMPARISON WITH OTHER METHODS
REFERENCES

Table of Figures

Figure 1. Ice cream package lid on a conveyer belt
Figure 2. Wheel-rim identification