
Computer Vision: Filtering

Raquel Urtasun

TTI Chicago

Jan 10, 2013

Today's lecture ...

Image formation
Image filtering

Readings

Chapters 2 and 3 of Rich Szeliski's book

Available online here

How is an image created?

The image formation process that produced a particular image depends on the lighting conditions, the scene surface properties, and the camera.

[Source: R. Szeliski]

Image formation

What is an image?

!"#"$%&'(%)*+%'

0*1&&'23456'37'$-*6*'"7'$-"6'4&%66'

,-*'./*'

[Source: A. Efros]

From photons to RGB values

Sample the 2D space on a regular grid. Quantize each sample, i.e., the photons arriving at each active cell are integrated and then digitized.

[Source: D. Hoiem]

What is an image?

A grid (matrix) of intensity values.

#$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$"

#$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$"

#$$" #$$" #$$" #%" %" #$$" #$$" #$$" #$$" #$$" #$$" #$$"

#$$" #$$" #$$" &$" &$" &$" #$$" #$$" #$$" #$$" #$$" #$$"

#$$" #$$" &$" '$" '$" &$" #$$" #$$" #$$" #$$" #$$" #$$"

#$$" #$$" '(" )#&" )*$" )&$" #$$" #$$" #$$" #$$" #$$" #$$"

#$$" #$$" )#&" )*$" )&$" )&$" )&$" #$$" #$$" #$$" #$$" #$$" !" #$$" #$$" )#&" )*$" #%%" #%%" )&$" )&$" '$" #$$" #$$" #$$" #$$" #$$" )#&" )*$" #%%" #%%" )&$" )&$" '$" *&" #$$" #$$"

#$$" #$$" )#&" )*$" )*$" )&$" )#&" )#&" '$" *&" #$$" #$$"

#$$" #$$" &*" )#&" )#&" )#&" '$" '$" '$" *&" #$$" #$$"

#$$" #$$" #$$" &*" &*" &*" &*" &*" &*" #$$" #$$" #$$"

#$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$"

#$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$" #$$"

It is common to use one byte per value: 0 = black, 255 = white.

[Source: N. Snavely]

What is an image?

We can think of a (grayscale) image as a function f : \mathbb{R}^2 \to \mathbb{R} giving the intensity at position (x, y).

[Figure: the intensity surface f(x, y) plotted over the image coordinates x and y]

A digital image is a discrete (sampled, quantized) version of this function.

[Source: N. Snavely]

Image Transformations

As with any function, we can apply operators to an image

g(x, y) = f(x, y) + 20        g(x, y) = f(-x, y)

We'll talk about special kinds of operators: correlation and convolution (linear filtering).

[Source: N. Snavely]

Filtering

Question: Noise reduction

Given a camera and a still scene, how can you reduce noise? Take lots of images and average them!

What's the next best thing?

[Source: S. Seitz]

Image filtering

Modify the pixels in an image based on some function of a local neighborhood of each pixel

#'" !" &" 5).0"678*9)8" $" !" #" %" #" #" %"

()*+,"-.+/0"1+2+" 3)1-401"-.+/0"1+2+"

[Source: L. Zhang]

Applications of Filtering

Enhance an image, e.g., denoise, resize. Extract information, e.g., texture, edges. Detect patterns, e.g., template matching.

Noise reduction

Simplest thing: replace each pixel by the average of its neighbors. This assumes that neighboring pixels are similar and that the noise is independent from pixel to pixel.

In 1D, with uniform weights: [1, 1, 1, 1, 1]/5.

[Source: S. Marschner]


Alternatively, use non-uniform weights, e.g., [1, 4, 6, 4, 1]/16 (see the sketch below).

[Source: S. Marschner]
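To make the two weightings above concrete, here is a minimal sketch (not from the slides) of 1D smoothing in numpy, applied to a hypothetical noisy step signal; both kernels are symmetric, so correlation and convolution coincide:

import numpy as np

rng = np.random.default_rng(0)
signal = np.repeat([0.0, 1.0, 0.0], 50) + 0.1 * rng.standard_normal(150)  # noisy step

box = np.ones(5) / 5.0                        # uniform weights [1, 1, 1, 1, 1]/5
binomial = np.array([1, 4, 6, 4, 1]) / 16.0   # non-uniform weights [1, 4, 6, 4, 1]/16

smooth_box = np.convolve(signal, box, mode='same')
smooth_binomial = np.convolve(signal, binomial, mode='same')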

Moving Average in 2D

[Source: S. Seitz]


Linear Filtering: Correlation

Involves weighted combinations of pixels in small neighborhoods. The output pixel's value is determined as a weighted sum of input pixel values:

g(i, j) = \sum_{k,l} f(i + k, j + l) \, h(k, l)

The entries of the weight kernel or mask h(k, l) are often called the filter coefficients. This operator is the correlation operator:

g = f ⊗ h
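As an illustration (a sketch, not the lecture's own code), the correlation formula above can be written directly as nested sums in Python; the input is zero-padded so the output has the same size, and the kernel is indexed with its center at (k, l) = (0, 0):

import numpy as np

def correlate2d_naive(f, h):
    # g(i, j) = sum_{k,l} f(i + k, j + l) h(k, l), with zero padding at the borders
    kh, kw = h.shape
    rh, rw = kh // 2, kw // 2
    fp = np.pad(f, ((rh, rh), (rw, rw)), mode='constant')
    g = np.zeros_like(f, dtype=float)
    for i in range(f.shape[0]):
        for j in range(f.shape[1]):
            g[i, j] = np.sum(fp[i:i + kh, j:j + kw] * h)
    return g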

Smoothing by averaging

What if the filter size was 5 x 5 instead of 3 x 3?

[Source: K. Grauman]

Gaussian filter

What if we want nearest neighboring pixels to have the most influence on the output? Removes high-frequency components from the image (low-pass filter).

[Source: S. Seitz]

Smoothing with a Gaussian

[Source: K. Grauman]

Mean vs Gaussian

Gaussian filter: Parameters

Size of kernel or mask: the Gaussian function has infinite support, but discrete filters use finite kernels.

[Source: K. Grauman]

Gaussian filter: Parameters

Variance of the Gaussian: determines extent of smoothing.

[Source: K. Grauman]

Gaussian filter: Parameters

[Source: K. Grauman]

Is this the most general Gaussian?

No, the most general form, for x \in \mathbb{R}^d, is

\mathcal{N}(x; \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\!\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)

But the simplified (isotropic) version is typically used for filtering.
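For concreteness, a small sketch of the simplified isotropic case used for filtering: build a (2k+1) x (2k+1) Gaussian mask and normalize it so the weights sum to 1 (the sigma value and the 3-sigma half-width are arbitrary choices here):

import numpy as np

def gaussian_kernel(sigma, k=None):
    # half-width: 3 sigma covers most of the mass of the Gaussian
    if k is None:
        k = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-k:k + 1, -k:k + 1]
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return g / g.sum()   # normalize so the discrete weights sum to 1

G = gaussian_kernel(sigma=2.0)
print(G.shape, G.sum())   # (13, 13) 1.0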


Properties of the Smoothing

All values are positive. They all sum to 1. The amount of smoothing is proportional to the mask size. Removes high-frequency components; low-pass filter.

[Source: K. Grauman]

Example of Correlation

What is the result of filtering the impulse (image) F with the arbitrary kernel H?

[Source: K. Grauman]

Convolution

The convolution operator is

g(i, j) = \sum_{k,l} f(i - k, j - l) \, h(k, l) = \sum_{k,l} f(k, l) \, h(i - k, j - l) = f * h

h is then called the impulse response function. Convolution is equivalent to flipping the filter in both dimensions (bottom to top, right to left) and applying cross-correlation.
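A quick numerical sketch of that equivalence using scipy.signal (assuming scipy is available; the test arrays are arbitrary):

import numpy as np
from scipy.signal import convolve2d, correlate2d

rng = np.random.default_rng(1)
f = rng.random((32, 32))
h = rng.random((3, 3))

conv = convolve2d(f, h, mode='same', boundary='fill')
# flip the filter bottom-to-top and right-to-left, then cross-correlate
corr_of_flipped = correlate2d(f, h[::-1, ::-1], mode='same', boundary='fill')

print(np.allclose(conv, corr_of_flipped))   # True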

Matrix form

Correlation and convolution can both be written as a matrix-vector multiply, if we first convert the two-dimensional images f (i, j) and g(i, j) into raster-ordered vectors f and g

g = Hf

with H a sparse matrix.
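A tiny 1D sketch of this matrix form (hypothetical signal and kernel, using scipy.sparse): each row of H holds the shifted filter coefficients, so filtering becomes a sparse matrix-vector product.

import numpy as np
from scipy.sparse import lil_matrix

f = np.arange(8, dtype=float)        # a 1D "image" as a raster vector
h = np.array([1.0, 2.0, 1.0]) / 4.0  # a centered 1D kernel

n, k = len(f), len(h) // 2
H = lil_matrix((n, n))
for i in range(n):
    for j, w in enumerate(h):
        col = i + j - k              # zero padding: drop out-of-range columns
        if 0 <= col < n:
            H[i, col] = w

g_matrix = H.tocsr() @ f
g_direct = np.convolve(f, h, mode='same')  # symmetric h: correlation == convolution
print(np.allclose(g_matrix, g_direct))     # True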

Correlation vs Convolution

Convolution:  g(i, j) = \sum_{k,l} f(i - k, j - l) \, h(k, l),   written G = H * F

Correlation:  g(i, j) = \sum_{k,l} f(i + k, j + l) \, h(k, l),   written G = H ⊗ F

For a Gaussian or box filter, how will the outputs differ? If the input is an impulse signal, how will the outputs differ? That is, what are h * δ and h ⊗ δ?

Example

What's the result?

[Figure: the image convolved with an impulse kernel is unchanged; convolving with twice the impulse minus a 3x3 box filter gives a sharpening filter that accentuates differences with the local average]

[Source: D. Lowe]

Sharpening

[Source: D. Lowe]


Convolution of a Gaussian with itself is another Gaussian.

Convolving twice with a Gaussian kernel of width σ is the same as convolving once with a kernel of width σ√2.

[Source: K. Grauman]
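A quick numerical sanity check of this claim (a sketch using scipy.ndimage, with an arbitrary test image and sigma): smoothing twice with sigma is close to smoothing once with sigma*sqrt(2); small differences come from kernel truncation and boundary handling.

import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(2)
image = rng.random((128, 128))
sigma = 2.0

twice = gaussian_filter(gaussian_filter(image, sigma), sigma)
once = gaussian_filter(image, sigma * np.sqrt(2))

print(np.max(np.abs(twice - once)))   # small, but not exactly zero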

Sharpening revisited

What does blurring take away?

original - smoothed (5x5) = detail

Let's add it back:

original + detail = sharpened

[Source: S. Lazebnik]

Sharpening

!"#$%&'&()

#$%&'&()

[Source: N. Snavely]

"Optical" Convolution

Camera shake. Figure: Fergus et al., SIGGRAPH 2006.

Blur in out-of-focus regions of an image.

Figure: Bokeh

[Source: N. Snavely]

Correlation vs Convolution

The convolution is both commutative and associative. The Fourier transform of two convolved images is the product of their individual Fourier transforms. Both correlation and convolution are linear shift-invariant (LSI) operators, which obey both the superposition principle

h ◦ (f_0 + f_1) = h ◦ f_0 + h ◦ f_1

and the shift invariance principle

g(i, j) = f(i + k, j + l)  implies  (h ◦ g)(i, j) = (h ◦ f)(i + k, j + l),

which means that shifting a signal commutes with applying the operator. This is the same as saying that the effect of the operator is the same everywhere.
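The Fourier-transform property can be checked numerically; here is a sketch with numpy's FFT (zero-padding both arrays to the full linear-convolution size so that the pointwise product of their transforms matches a direct convolution):

import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(3)
f = rng.random((16, 16))
h = rng.random((5, 5))

# zero-pad to the size of the full linear convolution
shape = (f.shape[0] + h.shape[0] - 1, f.shape[1] + h.shape[1] - 1)
via_fft = np.real(np.fft.ifft2(np.fft.fft2(f, shape) * np.fft.fft2(h, shape)))

direct = convolve2d(f, h, mode='full')
print(np.allclose(via_fft, direct))   # True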

Raquel Urtasun (TTI-C) Computer Vision Jan 10, 2013 39 / 82 A number of alternative padding or extension modes have been developed.

Boundary Effects

Filtering the image in this form leads to a darkening of the corner pixels: the original image is effectively being padded with 0 values wherever the convolution kernel extends beyond the original image boundaries. A number of alternative padding or extension modes have been developed.
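For illustration, a sketch of common extension modes using np.pad (the mode names are numpy's, not the lecture's): after padding, an ordinary "valid" filtering of the padded image implements the chosen boundary rule and avoids the dark border.

import numpy as np

image = np.arange(16, dtype=float).reshape(4, 4)
r = 1   # pad radius for a 3x3 kernel

padded = {
    'zero':      np.pad(image, r, mode='constant', constant_values=0),
    'replicate': np.pad(image, r, mode='edge'),
    'mirror':    np.pad(image, r, mode='reflect'),
    'wrap':      np.pad(image, r, mode='wrap'),
}
for name, p in padded.items():
    print(name, p.shape)   # each padded image is (6, 6)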

Raquel Urtasun (TTI-C) Computer Vision Jan 10, 2013 40 / 82 If his is possible, then the convolution kernel is called separable. And it is the outer product of two kernels

Separable Filters

Performing a convolution requires K^2 operations per pixel, where K is the size (width or height) of the convolution kernel. In many cases this can be sped up by first performing a 1D horizontal convolution followed by a 1D vertical convolution, requiring only 2K operations per pixel. If this is possible, the convolution kernel is called separable, and it is the outer product of two 1D kernels:

K = v h^T
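A sketch of the separable trick (using the binomial weights from earlier as an example outer-product kernel): one 2D convolution with v h^T gives the same result as a 1D vertical pass followed by a 1D horizontal pass.

import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(4)
image = rng.random((64, 64))

v = np.array([[1.0], [4.0], [6.0], [4.0], [1.0]]) / 16.0   # column vector (vertical kernel)
h = v.T                                                    # row vector (horizontal kernel)
K = v @ h                                                  # separable 2D kernel, K = v h^T

full_2d = convolve2d(image, K, mode='full')
two_1d = convolve2d(convolve2d(image, v, mode='full'), h, mode='full')

print(np.allclose(full_2d, two_1d))   # True: 2K multiplies per pixel instead of K^2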

Let's play a game...

Is this separable? If yes, what's the separable version? What does this filter do?



How can we tell if a given kernel K is indeed separable?

Inspection... this is what we were doing: looking at its analytic form. Or look at the singular value decomposition (SVD); if only one singular value is non-zero, then it is separable:

K = U \Sigma V^T = \sum_i \sigma_i u_i v_i^T

with \Sigma = diag(\sigma_i). Then \sqrt{\sigma_1} u_1 and \sqrt{\sigma_1} v_1^T are the vertical and horizontal kernels.
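A sketch of the SVD test (the example kernel is arbitrary and separable by construction): if all singular values beyond the first are (numerically) zero, the kernel is separable, and the scaled first singular vectors give the 1D factors.

import numpy as np

K = np.outer([1, 2, 1], [1, 0, -1]).astype(float)   # Sobel-like kernel

U, S, Vt = np.linalg.svd(K)
separable = np.allclose(S[1:], 0.0)        # only one non-zero singular value

v = np.sqrt(S[0]) * U[:, 0]                # vertical 1D kernel
h = np.sqrt(S[0]) * Vt[0, :]               # horizontal 1D kernel
print(separable, np.allclose(np.outer(v, h), K))   # True True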

Application of filtering: Template matching

Filters as templates: filters look like the effects they are intended to find. Use a normalized cross-correlation score to find a given pattern (template) in the image. Normalization is needed to control for relative brightness (see the sketch below).
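A minimal sketch of normalized cross-correlation (not the lecture's implementation; names are illustrative): at each location, correlate the zero-mean template with the zero-mean window and divide by both norms, so the score is invariant to local brightness and contrast.

import numpy as np

def normalized_cross_correlation(image, template):
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.linalg.norm(t) + 1e-12
    H, W = image.shape
    scores = np.zeros((H - th + 1, W - tw + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            w = image[i:i + th, j:j + tw]
            w = w - w.mean()
            scores[i, j] = np.sum(w * t) / (np.linalg.norm(w) * tnorm + 1e-12)
    return scores   # values in [-1, 1]; the peak marks the best match location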

[Source: K. Grauman]

Template matching

[Source: K. Grauman]

More complex Scenes

Let's talk about ...

Filtering: Edge detection

Map the image from a 2D array of pixels to a set of curves, line segments or contours. More compact than pixels. Look for strong gradients, then post-process.

Figure: [Shotton et al. PAMI, 07]

[Source: K. Grauman]

Origin of edges

Edges are caused by a variety of factors:

surface normal discontinuity

depth discontinuity

surface color discontinuity

illumination discontinuity

[Source: N. Snavely]

What causes an edge?

[Source: K. Grauman]

Looking more locally...

[Source: K. Grauman]

Images as functions

Edges look like steep cliffs

[Source: N. Snavely]

Characterizing Edges

An edge is a place of rapid change in the image intensity function.

[Source: S. Lazebnik]


!" 1 -1 !" -1 1

[Source: S. Seitz]

How to Implement with Convolution

How can we differentiate a digital image F[x,y]? Option 1: reconstruct a continuous image f , then compute the partial as ∂f (x, y) f (x + , y) − f (x) = lim ∂x →0 

Option 2: take discrete derivative (finite difference) ∂f (x, y) f [x + 1, y] − f [x] ≈ ∂x 1

Raquel Urtasun (TTI-C) Computer Vision Jan 10, 2013 58 / 82 !" 1 -1 !" -1 1

[Source: S. Seitz]

How to Implement Derivatives with Convolution

How can we differentiate a digital image F[x,y]? Option 1: reconstruct a continuous image f , then compute the as ∂f (x, y) f (x + , y) − f (x) = lim ∂x →0 

Option 2: take discrete derivative (finite difference) ∂f (x, y) f [x + 1, y] − f [x] ≈ ∂x 1

What would be the filter to implement this using convolution?

Raquel Urtasun (TTI-C) Computer Vision Jan 10, 2013 58 / 82 How to Implement Derivatives with Convolution

How can we differentiate a digital image F[x,y]? Option 1: reconstruct a continuous image f , then compute the partial derivative as ∂f (x, y) f (x + , y) − f (x) = lim ∂x →0 

Option 2: take discrete derivative (finite difference) ∂f (x, y) f [x + 1, y] − f [x] ≈ ∂x 1

What would be the filter to implement this using convolution?

!" 1 -1 !" -1 1

[Source: S. Seitz] Raquel Urtasun (TTI-C) Computer Vision Jan 10, 2013 58 / 82 How to Implement Derivatives with Convolution

How can we differentiate a digital image F[x,y]? Option 1: reconstruct a continuous image f , then compute the partial derivative as ∂f (x, y) f (x + , y) − f (x) = lim ∂x →0 

Option 2: take discrete derivative (finite difference) ∂f (x, y) f [x + 1, y] − f [x] ≈ ∂x 1

What would be the filter to implement this using convolution?

!" 1 -1 !" -1 1

[Source: S. Seitz]

Partial derivatives of an image

Figure: Using correlation filters

[Source: K. Grauman]

Finite Difference Filters

[Source: K. Grauman]



Image Gradient

The gradient of an image is \nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]. The gradient points in the direction of most rapid change in intensity.

The gradient direction (the orientation of the edge normal) is given by

\theta = \tan^{-1}\!\left( \frac{\partial f}{\partial y} \Big/ \frac{\partial f}{\partial x} \right)

The edge strength is given by the gradient magnitude

\|\nabla f\| = \sqrt{ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 }

[Source: S. Seitz]

Image Gradient

[Source: S. Lazebnik]

Effects of noise

Consider a single row or column of the image. Plotting intensity as a function of position gives a signal.

!"#$%&#'()*&#+,-.&

[Source: S. Seitz]

Effects of noise

Smooth first, and then look for peaks in \frac{\partial}{\partial x}(h * f).

[Source: S. Seitz]

Derivative theorem of convolution

Differentiation property of convolution:

\frac{\partial}{\partial x}(h * f) = \left( \frac{\partial h}{\partial x} \right) * f = h * \left( \frac{\partial f}{\partial x} \right)

It saves one operation.

[Source: S. Seitz]

2D Edge Detection Filters

Gaussian:  h_\sigma(u, v) = \frac{1}{2\pi\sigma^2} \exp\!\left( -\frac{u^2 + v^2}{2\sigma^2} \right)

Derivative of Gaussian (x):  \frac{\partial}{\partial x} h_\sigma(u, v)

[Source: N. Snavely]
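A sketch of a derivative-of-Gaussian filter in code (the function name and parameter choices here are illustrative): differentiate the Gaussian analytically in x, sample it, and correlate the image with the result, which smooths and differentiates in one pass.

import numpy as np
from scipy.signal import correlate2d

def dog_x(sigma, k=None):
    # partial derivative in x of a 2D Gaussian, sampled on a (2k+1) x (2k+1) grid
    if k is None:
        k = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-k:k + 1, -k:k + 1]
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return -x / sigma**2 * g

rng = np.random.default_rng(5)
image = rng.random((64, 64))
gx = correlate2d(image, dog_x(2.0), mode='same', boundary='symm')   # smoothed x-gradient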

Derivative of Gaussians

[Source: K. Grauman]

Laplacian of Gaussians

Edges can be found by detecting the zero-crossings of the bottom graph.

[Source: S. Seitz]

2D Edge Filtering

with \nabla^2 the Laplacian operator: \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}

[Source: S. Seitz]

Effect of σ on derivatives

The detected structures differ depending on the Gaussian’s scale parameter: Larger values: larger scale edges detected. Smaller values: finer features detected.

[Source: K. Grauman]


Derivatives

Use opposite signs to get a response in regions of high contrast. The weights sum to 0, so there is no response in constant regions. High absolute value at points of high contrast.

[Source: K. Grauman]


Band-pass filters

The Sobel and corner filters are band-pass and oriented filters. More sophisticated filters can be obtained by convolving with a Gaussian filter

G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} \exp\!\left( -\frac{x^2 + y^2}{2\sigma^2} \right)

and taking the first or second derivatives. These filters are band-pass filters: they filter out both low and high frequencies.

The second derivative of a two-dimensional image is the Laplacian operator

\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}

Blurring an image with a Gaussian and then taking its Laplacian is equivalent to convolving directly with the Laplacian of Gaussian (LoG) filter.

The directional or oriented filter can be obtained by smoothing with a Gaussian (or some other filter) and then taking a directional derivative

\nabla_{\hat{u}}(G * f) = \frac{\partial}{\partial \hat{u}}(G * f) = \hat{u} \cdot \nabla(G * f) = (\nabla_{\hat{u}} G) * f

with \hat{u} = (\cos\theta, \sin\theta). The Sobel operator is a simple approximation of this.

Practical Example

[Source: N. Snavely]

Finding Edges

Figure: Gradient magnitude

[Source: N. Snavely]

Finding Edges

!"#$#%&'%("#%#)*#+%

Figure: Gradient magnitude

[Source: N. Snavely]

Non-Maxima Suppression

Figure: Gradient magnitude

Check if a pixel is a local maximum along the gradient direction; this requires interpolation (a sketch follows below).

[Source: N. Snavely]
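A compact sketch of non-maxima suppression (quantizing the gradient direction to four bins and comparing the two neighbours along it, rather than interpolating; all names are illustrative):

import numpy as np

def non_max_suppression(mag, theta):
    # mag: gradient magnitude, theta: gradient orientation in radians
    H, W = mag.shape
    out = np.zeros_like(mag)
    angle = (np.rad2deg(theta) + 180.0) % 180.0   # fold directions into [0, 180)
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            a = angle[i, j]
            if a < 22.5 or a >= 157.5:      # roughly horizontal gradient: compare left/right
                n1, n2 = mag[i, j - 1], mag[i, j + 1]
            elif a < 67.5:                  # roughly 45 degrees
                n1, n2 = mag[i - 1, j + 1], mag[i + 1, j - 1]
            elif a < 112.5:                 # roughly vertical gradient: compare up/down
                n1, n2 = mag[i - 1, j], mag[i + 1, j]
            else:                           # roughly 135 degrees
                n1, n2 = mag[i - 1, j - 1], mag[i + 1, j + 1]
            if mag[i, j] >= n1 and mag[i, j] >= n2:
                out[i, j] = mag[i, j]       # keep only local maxima along the gradient
    return out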

Finding Edges

Figure: Thresholding

[Source: N. Snavely]

Finding Edges

Figure: Thinning: Non-maxima suppression

[Source: N. Snavely]

Canny edge detector

Matlab: edge(image, 'canny')

1 Filter image with derivative of Gaussian

2 Find magnitude and orientation of gradient

3 Non-maximum suppression

4 Linking and thresholding (hysteresis): define two thresholds, low and high; use the high threshold to start edge curves and the low threshold to continue them.

[Source: D. Lowe and L. Fei-Fei]
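Besides the Matlab call above, a roughly equivalent sketch in Python (assuming scikit-image is available; the sigma and threshold values are arbitrary):

import numpy as np
from skimage import data, feature

image = data.camera().astype(float) / 255.0
# sigma controls the Gaussian blur; the two thresholds drive the hysteresis linking
edges = feature.canny(image, sigma=2.0, low_threshold=0.05, high_threshold=0.15)
print(edges.dtype, edges.sum())   # boolean edge map and number of edge pixels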

Canny edge detector

Still one of the most widely used edge detectors in computer vision.

J. Canny, A Computational Approach to Edge Detection, IEEE Trans. Pattern Analysis and Machine Intelligence, 8:679-714, 1986.

Depends on several parameters: σ of the blur and the thresholds.

Canny edge detector

Large σ detects large-scale edges; small σ detects fine edges.

[Figure: original image; Canny with large σ; Canny with small σ]

[Source: S. Seitz]

Scale space (Witkin 83)

[Figure: a Gaussian filtered signal and the peaks of its first derivative, shown at increasingly larger scales]

Properties of scale space (with Gaussian smoothing): edge position may shift with increasing scale (σ); two edges may merge with increasing scale; an edge may not split into two with increasing scale.

[Source: N. Snavely]

Next class ... more on filtering and image features
