Introduction to Computer Vision & Robotics

Lecture 7, 16 December 2020
[email protected], Georg-August University, Göttingen, Germany

0 Last time we talked about

1. Morphological image processing
   • Erosion, dilation, opening, closing
   • Detection of isolated pixels or certain (simple) shapes
   • Convex hull
   • Hole filling
   • Morphological gradients
2. Image features
   • Gradients and edges
   • Corners

1 What is a feature descriptor?

A feature descriptor:

• captures useful information
• simplifies an image (or image patch)
• is much smaller in size than the whole image
• should not depend on rotation, scale, illumination, and so on

Descriptors are then used for matching!

2 Example of a naive descriptor

How can we describe these images in a compressed form?

We can use only the circle and color information.

3 Basic shape descriptors: Hough Transformation

• Invented and patented in the 1960s
• Extended to arbitrary shapes in the 1970s
• The idea is to find imperfect instances of objects within a certain class of shapes by a voting procedure
• The simplest case of the Hough transform is detecting straight lines
• Here we'll consider lines and circles

4 Hough Lines

Different line parameterizations are possible:

• Cartesian coordinates
  • Parameters: (a, b)
  • Equation: y = ax + b
  • Problem: this representation cannot handle vertical lines
• Polar coordinates
  • Parameters: (r, θ)
  • Equation: y = (−cos θ / sin θ) x + (r / sin θ)

wiki/houghtransform

5 Hough Lines

• y = (−cos θ / sin θ) x + (r / sin θ)
• r = x cos θ + y sin θ
• Idea of the Hough algorithm: vote for possibilities
• We need to discretize our voting space (called the accumulator)

wiki/houghtransform

6 Hough Lines

7 Hough Algorithm

• Find feature locations in the image
• For each feature point x_i:
  • For each possibility p_j in the accumulator that passes through the feature point:
    • Increment the accumulator at this position
• Find the maximum element of the accumulator; its location reflects the corresponding parameter values
• Map back to image space

A minimal sketch of this voting procedure follows below.
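The following is a minimal NumPy sketch of the voting loop for lines in (r, θ) space; the accumulator resolution (1 pixel, 1 degree) and the helper name are illustrative assumptions, not the lecture's code.

import numpy as np

def hough_lines_accumulator(edges, n_thetas=180):
    # edges: binary image whose nonzero pixels are the feature points
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))       # largest possible |r|
    thetas = np.deg2rad(np.arange(n_thetas))  # discretized angles
    # accumulator rows cover r in [-diag, diag], columns cover theta
    acc = np.zeros((2 * diag + 1, n_thetas), dtype=np.int32)
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        # every (r, theta) with r = x cos(theta) + y sin(theta)
        # passes through (x, y): vote once per discretized theta
        rs = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rs + diag, np.arange(n_thetas)] += 1
    return acc, thetas, diag

The maximum of the accumulator then gives the dominant line: r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape), so r = r_idx − diag and θ = thetas[t_idx].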

8 Example: Hough Lines Algorithm

wiki/houghtransform

9 Example: Hough Lines Algorithm

wiki/houghtransform

10 OpenCV Functions

import cv2
import numpy as np

img = cv2.imread('dave.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150, apertureSize=3)

# accumulator resolution: 1 pixel in r, 1 degree in theta;
# keep lines with at least 200 votes
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
for line in lines:
    rho, theta = line[0]
    a = np.cos(theta)
    b = np.sin(theta)
    x0 = a * rho
    y0 = b * rho
    # draw a long segment through (x0, y0) along the line direction
    x1 = int(x0 + 1000 * (-b))
    y1 = int(y0 + 1000 * (a))
    x2 = int(x0 - 1000 * (-b))
    y2 = int(y0 - 1000 * (a))
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)

11 Example (Classical Hough)

http://opencvexamples.blogspot.com

12 Example (Probabilistic Hough)
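The probabilistic variant samples only a subset of the edge points and returns finite line segments instead of infinite lines. A minimal sketch using cv2.HoughLinesP (the file name and thresholds are illustrative assumptions):

import cv2
import numpy as np

img = cv2.imread('dave.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150, apertureSize=3)

# returns segment endpoints (x1, y1, x2, y2) directly
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100,
                        minLineLength=100, maxLineGap=10)
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 2)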

docs.opencv.org

13 Circle Hough Transform

• We can also detect circles!
• A circle with radius R and center (a, b) has the parametric equations
  x = a + R cos θ
  y = b + R sin θ

14 Circle Hough Transform

• What if circles of different radii should be detected?
• Then the radius becomes a third parameter: we vote in a 3D accumulator over (a, b, R), as in the sketch below
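A minimal NumPy sketch of this 3D voting space; the angular step and the helper name are illustrative assumptions:

import numpy as np

def hough_circles_accumulator(edges, radii):
    # edges: binary edge image; radii: iterable of candidate radii R
    h, w = edges.shape
    acc = np.zeros((h, w, len(radii)), dtype=np.int32)  # (b, a, R) space
    thetas = np.deg2rad(np.arange(0, 360, 4))           # coarse angle sweep
    ys, xs = np.nonzero(edges)
    for k, R in enumerate(radii):
        for x, y in zip(xs, ys):
            # each edge pixel votes for all centers it could lie on:
            # a = x - R cos(theta), b = y - R sin(theta)
            a = np.round(x - R * np.cos(thetas)).astype(int)
            b = np.round(y - R * np.sin(thetas)).astype(int)
            ok = (a >= 0) & (a < w) & (b >= 0) & (b < h)
            # np.add.at accumulates repeated (b, a) votes correctly
            np.add.at(acc[:, :, k], (b[ok], a[ok]), 1)
    return acc  # maxima give center (row, column) and radius index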

15 Hough Circles: Example

docs.opencv.org

16 OpenCV Circle Detector

import numpy as np
import cv2 as cv

img = cv.imread('opencv-logo-white.png', 0)  # read as grayscale
img = cv.medianBlur(img, 5)                  # reduce noise before voting
cimg = cv.cvtColor(img, cv.COLOR_GRAY2BGR)

circles = cv.HoughCircles(img, cv.HOUGH_GRADIENT, 1, 20,
                          param1=50, param2=30,
                          minRadius=0, maxRadius=0)
circles = np.uint16(np.around(circles))
for i in circles[0, :]:
    # draw the outer circle
    cv.circle(cimg, (i[0], i[1]), i[2], (0, 255, 0), 2)
    # draw the center of the circle
    cv.circle(cimg, (i[0], i[1]), 2, (0, 0, 255), 3)

17 Practical Problem

Industrial vision: check the state of a manufactured part

18 Practical Problem

Industrial vision: check the state of a manufactured part

19 Features: Why Features?

• Features are used for matching
• Template matching: find/locate a specific object inside an image (object recognition)
• Find corresponding features in multiple images to reconstruct the (3D) scene (navigation, mapping, tracking)

20 Template Matching

21 Template Matching

22 Panorama Creation

23 Matching

24 Feature requirements?

• Rotation invariant (e.g., Harris corner detection)
• Invariant under different lighting
• Scale invariant

25 David Lowe

26 Outline: SIFT

• Identify scale-invariant keypoint candidates in the image
• Filter out bad candidates
• Create one feature vector for each valid candidate that describes the region around the keypoint

27 Feature descriptors

A feature descriptor:

• captures useful information
• simplifies an image (or image patch)
• is much smaller in size than the whole image
• should not depend on rotation, scale, illumination, and so on
• Descriptors are then used for matching!

We now move beyond basic shapes, edges, and corners to more advanced feature descriptors.

28 SIFT: Scale invariant feature transform

• Invented (and patented) by David Lowe (1999, 2004)
• Invariant to scale, rotation, illumination, and viewpoint
• It gave rise to a whole family of advanced feature descriptors
• It has many extensions, e.g., SURF and PCA-SIFT
• It can robustly identify objects even among clutter and under partial occlusion
• Over the last decades it has found many applications in computer vision, robotics, etc.

29 Basic Idea

Image content is transformed into local features that are invariant to translation, rotation, scale, and other imaging parameters

courses.cs.washington.edu

30 Algorithm Overview

1. Detector
   • Construct a scale space
   • Detect scale-space extrema
   • Localize keypoints
2. Descriptor
   • Assign an orientation to each keypoint
   • Compute descriptor vectors
   • Normalize the vectors

31 Scale-invariant feature transform (SIFT)

docs.opencv.org

32 1. Scale Space Construction

• Goal: identify locations and scales that can be repeatably assigned under different views of the same scene or object.
• Method: search for stable features across multiple scales using a continuous function of scale.
• L(x, y, σ) = G(x, y, σ) ∗ I(x, y), where G(x, y, σ) = (1 / (2πσ²)) e^(−(x² + y²) / (2σ²))
• The scale space L(x, y, σ) of an image is produced by progressively convolving the input image with a Gaussian kernel at different scales

33

• How can we detect blobs of a specific size?
• Convolving an image with a Gaussian kernel blurs it (suppresses higher frequencies)
• Subtracting two images that were convolved with Gaussian kernels of different sizes preserves a band of frequencies (a minimal sketch follows below)

courses.cs.washington.edu
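A minimal Difference-of-Gaussians sketch along these lines; the input file and σ values are illustrative assumptions:

import cv2
import numpy as np

img = cv2.imread('image.png', 0).astype(np.float32)

# blur with two Gaussians of different width and subtract:
# the result responds to structures between the two scales
g1 = cv2.GaussianBlur(img, (0, 0), 1.6)               # ksize derived from sigma
g2 = cv2.GaussianBlur(img, (0, 0), 1.6 * np.sqrt(2))
dog = g2 - g1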

34 Variable Size Blob Detection

• Use filters of different sizes

courses.cs.washington.edu

35 Image Pyramids

• The bottom level is the original image
• Each next level is derived from the previous one by some operation (e.g., convolution with a Gaussian kernel followed by downsampling)

courses.cs.washington.edu

36 1. Scale Space Construction

• In SIFT the scale-space image representation consists of N octaves, defined by two parameters s and σ
• Each octave is an ordered set of s + 3 images such that
  L(x, y, kᵐσ) = G(x, y, kᵐσ) ∗ fᵢ(x, y), where k = 2^(1/s),
  fᵢ is the i-th subsample of I, m = 0, 1, ..., s + 2, i = 1, ..., N
• A sketch of this construction follows below
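A sketch of the octave construction under these definitions; the base σ, the number of octaves, and the plain 2× subsampling are simplifying assumptions:

import cv2
import numpy as np

def build_octave(img, sigma=1.6, s=2):
    # one octave: s + 3 progressively blurred images, k = 2**(1/s)
    k = 2.0 ** (1.0 / s)
    return [cv2.GaussianBlur(img, (0, 0), sigma * k ** m)
            for m in range(s + 3)]

def build_scale_space(img, n_octaves=4, sigma=1.6, s=2):
    octaves = []
    for _ in range(n_octaves):
        octaves.append(build_octave(img, sigma, s))
        img = img[::2, ::2]   # next octave: image subsampled by a factor of 2
    return octaves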

37 1. Scale Space Construction

s = 2, five images in each octave

David Lowe

38 1. Scale Space Construction

s = 2, five images in each octave

David Lowe

39 2. Extrema Detection

Apply the Difference of Gaussians (DoG) to approximate the Laplacian of Gaussian (LoG), which is a well-known blob detector

David Lowe

40 2. Extrema Detection

• Detect the maxima and minima of the DoG in scale space
• Each point is compared to its 8 neighbors in the same image and to the 9 neighbors each at the scale above and below within the same octave (26 neighbors in total; a minimal sketch follows below)

David Lowe
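A minimal sketch of the 26-neighbor test on a stack of DoG images; the array layout is an assumption:

import numpy as np

def is_extremum(dog, m, y, x):
    # dog: DoG images of one octave stacked as (levels, H, W);
    # (m, y, x) must be an interior point of the stack
    cube = dog[m-1:m+2, y-1:y+2, x-1:x+2]   # 3x3x3 neighborhood
    center = dog[m, y, x]
    return center == cube.max() or center == cube.min()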

41 3. Extrema Detection

• Local extrema are refined
• Reject points with low contrast (thresholding)
• Reject points with a strong edge response in one direction only
• Edge points are detected using the Hessian matrix (similar to Harris); a sketch of this test follows below
• Basically, we want to keep only corner-like keypoints
• Now we can compute the descriptors
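For the edge test, Lowe compares the ratio of the principal curvatures via the 2 × 2 Hessian of the DoG image. A sketch with finite differences (the threshold r = 10 is the value used in Lowe's paper):

import numpy as np

def passes_edge_test(dog, y, x, r=10.0):
    # 2x2 Hessian of the DoG image from finite differences
    dxx = dog[y, x+1] - 2 * dog[y, x] + dog[y, x-1]
    dyy = dog[y+1, x] - 2 * dog[y, x] + dog[y-1, x]
    dxy = (dog[y+1, x+1] - dog[y+1, x-1]
           - dog[y-1, x+1] + dog[y-1, x-1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        return False   # curvatures of different signs: reject
    # edge-like points have one large and one small curvature,
    # which makes tr^2 / det large
    return tr * tr / det < (r + 1) ** 2 / r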

42 The Aperture Problem

• Why do we want corner-like patches only and not edge-like features?
• Along an edge the local patch looks the same everywhere, so its position along the edge cannot be determined unambiguously (the aperture problem)

43 3. Extrema Detection

44 4. Orientation Assignment

• For each keypoint we have coordinates and a scale (σ)
• Use the scale of the point to choose the closest blurred image L(x, y, σ) = G(x, y, σ) ∗ I(x, y)
• Compute the gradient magnitude and orientation around each keypoint

m(x, y, σ) = √( (L(x+1, y, σ) − L(x−1, y, σ))² + (L(x, y+1, σ) − L(x, y−1, σ))² )
θ(x, y, σ) = arctan( (L(x, y+1, σ) − L(x, y−1, σ)) / (L(x+1, y, σ) − L(x−1, y, σ)) )
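A minimal NumPy sketch of these two formulas on a blurred image L (np.arctan2 is used here to resolve the full 360° range):

import numpy as np

def grad_mag_ori(L):
    # central differences on the blurred image L (2D array)
    dx = np.zeros_like(L)
    dy = np.zeros_like(L)
    dx[:, 1:-1] = L[:, 2:] - L[:, :-2]
    dy[1:-1, :] = L[2:, :] - L[:-2, :]
    mag = np.sqrt(dx**2 + dy**2)
    ori = np.degrees(np.arctan2(dy, dx)) % 360.0
    return mag, ori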

45 4. Orientation Assignment

For the region around each keypoint:

• Create a histogram with 36 bins for the gradient orientations
• Weight each sample by its gradient magnitude and by a Gaussian window with σ equal to 1.5 times the keypoint's scale
• Create an additional keypoint for every peak with a value ≥ 0.8 of the maximum bin (a sketch follows below)

www.inf.fu-berlin.de
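A simplified sketch of the orientation histogram; the window radius is an assumption, and all bins above the threshold are returned instead of true local peaks:

import numpy as np

def orientation_histogram(mag, ori, y, x, sigma, radius=8):
    # 36-bin histogram of gradient orientations around (y, x), each
    # sample weighted by its magnitude and a Gaussian window of 1.5*sigma
    hist = np.zeros(36)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < mag.shape[0] and 0 <= xx < mag.shape[1]:
                w = np.exp(-(dx*dx + dy*dy) / (2 * (1.5 * sigma)**2))
                b = int(ori[yy, xx] // 10) % 36   # 10 degrees per bin
                hist[b] += w * mag[yy, xx]
    peaks = np.nonzero(hist >= 0.8 * hist.max())[0]
    return hist, peaks * 10.0   # dominant orientation(s) in degrees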

46 5. Keypoint Descriptors

• Each keypoint now has
  • a location
  • a scale
  • an orientation
• A descriptor for the local image region around each keypoint is computed
• It must be highly distinctive
• And invariant to changes in viewpoint and illumination

47 5. Keypoint Descriptors

• Find the blurred image of the closest scale
• Take a 16 × 16 neighborhood around the keypoint
• Rotate the gradients and coordinates by the previously computed orientation (for rotation invariance)
• Divide the region into a 4 × 4 grid of subregions
• Create a histogram with 8 orientation bins for each subregion
• The descriptor is 4 × 4 × 8 = 128 numbers
• Normalize the 128-vector for illumination invariance
• A simplified sketch of this layout follows below

http://aishack.in
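A simplified sketch of the descriptor layout; the spatial rotation of the patch, the Gaussian weighting, and trilinear binning are omitted, and all names are assumptions:

import numpy as np

def sift_descriptor(mag, ori, y, x, main_ori):
    # mag, ori: gradient magnitude and orientation (degrees) at the
    # keypoint's scale; (y, x) must lie at least 8 pixels from the border
    patch_m = mag[y-8:y+8, x-8:x+8]
    patch_o = (ori[y-8:y+8, x-8:x+8] - main_ori) % 360.0  # rotate gradients
    desc = []
    for by in range(4):               # 4 x 4 grid of subregions
        for bx in range(4):
            hist = np.zeros(8)        # 8 orientation bins per subregion
            cm = patch_m[4*by:4*by+4, 4*bx:4*bx+4]
            co = patch_o[4*by:4*by+4, 4*bx:4*bx+4]
            for m, o in zip(cm.ravel(), co.ravel()):
                hist[int(o // 45) % 8] += m
            desc.extend(hist)
    desc = np.array(desc)             # 4 * 4 * 8 = 128 numbers
    return desc / (np.linalg.norm(desc) + 1e-12)  # illumination invariance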

48 Example matching

docs.opencv.org

49 OpenCV Usage

import numpy as np
import cv2
from matplotlib import pyplot as plt

img1 = cv2.imread('box.png', 0)           # queryImage
img2 = cv2.imread('box_in_scene.png', 0)  # trainImage

sift = cv2.SIFT_create()                  # cv2.SIFT() in old OpenCV versions
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# brute-force matcher, two nearest neighbors per descriptor
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)

# ratio test: keep a match only if it is clearly better
# than the second-best candidate
good = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good.append([m])

img3 = cv2.drawMatchesKnn(img1, kp1, img2, kp2, good, None, flags=2)
plt.imshow(img3), plt.show()

50 Matching

• In OpenCV there are other matchers and matching modes
• For instance, k-nearest-neighbor matching (knnMatch, as used above) or the FLANN-based matcher
• Thereafter, the matched points can be further thresholded (e.g., with the ratio test)

51 Further feature descriptors

• SURF (Speeded-Up Robust Features)
  • An approximation of the DoG is used
  • Faster than SIFT
  • But misses some points
• Histograms of oriented gradients (HOG)

52 Histogram of oriented gradients (HOG)

• Compute gradients and their angles
• Construct 1D histograms of the orientations over small cells (8 × 8 pixels)
• Normalize the histogram values over (overlapping) blocks of cells (2 × 2 cells)
• Concatenate the histograms into a descriptor vector (a usage sketch follows below)

Dalal and Triggs, 2005
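As a rough usage sketch, OpenCV ships a HOG descriptor whose defaults match the Dalal-Triggs setup (64 × 128 window, 8 × 8 cells, 2 × 2-cell blocks, 9 bins); the input file name is an assumption:

import cv2

img = cv2.imread('person.png', 0)
img = cv2.resize(img, (64, 128))   # default HOG detection window size

hog = cv2.HOGDescriptor()          # Dalal-Triggs default parameters
descriptor = hog.compute(img)
print(descriptor.shape)            # 3780 values: 105 blocks * 4 cells * 9 bins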

53 Object detection and categorization

• HOG was used for pedestrian detection
• SIFT (or SURF) was used for general object detection
