Lecture 12: Local Descriptors, SIFT & Single Object Recognition

Professor Fei-Fei Li, Stanford Vision Lab

Lecture 12, 7-Nov-11

Midterm overview

Mean: 70.24 ± 15.50; Highest: 96; Lowest: 36.5

What we will learn today

• Local descriptors
  – SIFT (Problem Set 3 (Q2))
  – An assortment of other descriptors (Problem Set 3 (Q4))
• Recognition and matching with local features (Problem Set 3 (Q3))
  – Matching objects with local features: David Lowe
  – Panorama stitching: Brown & Lowe
  – Applications

Local Descriptors

• We know how to detect points (last lecture, #11)
• Next question: How to describe them for matching?

Point descriptor should be (this lecture, #12):
1. Invariant
2. Distinctive

Slide credit: Kristen Grauman

Rotation-Invariant Descriptors
• Find local orientation
  – Dominant direction of gradient for the image patch
• Rotate patch according to this angle
  – This puts the patches into a canonical orientation

Slide credit: Svetlana Lazebnik, Matthew Brown

Orientation Normalization: Computation [Lowe, SIFT, 1999]
• Compute orientation histogram
• Select dominant orientation
• Normalize: rotate to fixed orientation

(Orientation histogram over 0 to 2π.)

Slide adapted from David Lowe
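The three steps above can be sketched in numpy. This is a minimal illustration, not Lowe's exact implementation, which Gaussian-weights the votes and interpolates the histogram peak:

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    """Dominant gradient orientation of a grayscale patch via a
    magnitude-weighted histogram over [0, 2*pi)."""
    gy, gx = np.gradient(patch.astype(float))   # image gradients
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)      # angles in [0, 2*pi)
    hist, edges = np.histogram(ang, bins=n_bins, range=(0, 2 * np.pi),
                               weights=mag)
    peak = np.argmax(hist)                      # dominant orientation bin
    return (edges[peak] + edges[peak + 1]) / 2  # bin center

# A patch whose intensity increases along x has gradients pointing along +x,
# so the dominant orientation comes out near 0.
patch = np.tile(np.arange(16, dtype=float), (16, 1))
theta = dominant_orientation(patch)
```

Normalization then rotates the patch by -theta before computing the descriptor.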

The Need for Invariance

• Up to now, we had invariance to
  – Translation
  – Scale
  – Rotation
• Not sufficient to match regions under viewpoint changes
  – For this, we also need affine adaptation

Slide credit: Tinne Tuytelaars

Affine Adaptation
• Problem:
  – Determine the characteristic shape of the region
  – Assumption: the shape can be described by a "local affine frame"
• Solution: iterative approach
  – Use a circular window to compute the second moment matrix
  – Compute eigenvectors to adapt the circle to an ellipse
  – Recompute the second moment matrix using the new window, and iterate…

Slide adapted from Svetlana Lazebnik
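The core computation of one adaptation iteration, the second moment matrix and its eigen-decomposition, can be sketched as follows (a simplified numpy illustration; the full method weights the window and iterates until the ellipse converges):

```python
import numpy as np

def second_moment_matrix(patch):
    """Second moment matrix M = [[sum Ix^2, sum Ix*Iy],
                                 [sum Ix*Iy, sum Iy^2]]
    over the window; its eigenvectors give the axes of the adapted
    ellipse and its eigenvalues their relative scaling."""
    gy, gx = np.gradient(patch.astype(float))
    return np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                     [np.sum(gx * gy), np.sum(gy * gy)]])

# A patch varying only along x: all gradient energy lies in the
# x direction, so one eigenvalue is zero and the other carries it all.
patch = np.tile(np.arange(8, dtype=float), (8, 1))
M = second_moment_matrix(patch)
evals, evecs = np.linalg.eigh(M)    # eigenvalues in ascending order
```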

Iterative Affine Adaptation

1. Detect keypoints, e.g. multi-scale Harris
2. Automatically select the scales
3. Adapt the affine shape based on the second moment matrix
4. Refine point location

K. Mikolajczyk and C. Schmid, Scale and affine invariant interest point detectors, IJCV 60(1):63-86, 2004.

Slide credit: Tinne Tuytelaars

Affine Normalization/Deskewing


• Steps
  – Rotate the ellipse's main axis to horizontal
  – Scale the x axis so that the ellipse becomes a circle

Slide credit: Tinne Tuytelaars

Affine Adaptation Example

Scale-invariant regions (blobs)

Slide credit: Svetlana Lazebnik

Affine Adaptation Example

Affine-adapted blobs

Slide credit: Svetlana Lazebnik

Summary: Affine-Invariant Feature Extraction

Extract affine regions → Normalize regions → Eliminate rotational ambiguity → Compare descriptors

Slide credit: Svetlana Lazebnik

Invariance vs. Covariance
• Invariance: features(transform(image)) = features(image)
• Covariance: features(transform(image)) = transform(features(image))

Covariant detection ⇒ invariant description

Slide credit: Svetlana Lazebnik, David Lowe

Local Descriptors
• We know how to detect points
• Next question: How to describe them for matching?

Point descriptor should be:
1. Invariant
2. Distinctive

Slide credit: Kristen Grauman

Local Descriptors

• Simplest descriptor: a list of intensities within a patch
• What is this going to be invariant to?

Slide credit: Kristen Grauman

Feature Descriptors
• Disadvantage of patches as descriptors:
  – Small shifts can affect the matching score a lot

• Solution: histograms

(Gradient-orientation histogram over 0 to 2π.)

Slide credit: Svetlana Lazebnik

Feature Descriptors: SIFT
• Scale-Invariant Feature Transform
• Descriptor computation:
  – Divide the patch into 4x4 sub-patches: 16 cells
  – Compute a histogram of gradient orientations (8 reference angles) over all pixels inside each sub-patch
  – Resulting descriptor: 4x4x8 = 128 dimensions

David G. Lowe. "Distinctive image features from scale-invariant keypoints." IJCV 60(2), pp. 91-110, 2004.

Slide credit: Svetlana Lazebnik
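The descriptor layout described above can be sketched as follows. This is a simplified illustration; Lowe's full version adds Gaussian weighting, trilinear interpolation into neighboring bins, clipping at 0.2, and renormalization:

```python
import numpy as np

def sift_descriptor(patch):
    """Sketch of the SIFT descriptor layout: a 16x16 patch is divided
    into a 4x4 grid of cells; each cell gets an 8-bin
    gradient-orientation histogram, giving a 4*4*8 = 128-D vector."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    desc = []
    for ci in range(4):                      # 4x4 grid of cells
        for cj in range(4):
            sl = (slice(4 * ci, 4 * ci + 4), slice(4 * cj, 4 * cj + 4))
            hist, _ = np.histogram(ang[sl], bins=8, range=(0, 2 * np.pi),
                                   weights=mag[sl])
            desc.append(hist)
    desc = np.concatenate(desc)              # 16 cells x 8 bins = 128-D
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc       # unit-normalized

desc = sift_descriptor(np.random.rand(16, 16))
```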

Overview: SIFT
• Extraordinarily robust matching technique
  – Can handle changes in viewpoint up to ~60° out-of-plane rotation
  – Can handle significant changes in illumination, sometimes even day vs. night
  – Fast and efficient: can run in real time
  – Lots of code available: http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT

Slide credit: Steve Seitz

Working with SIFT Descriptors
• One image yields:
  – n 128-dimensional descriptors, each one a histogram of the gradient orientations within a patch [n x 128 matrix]
  – n scale parameters specifying the size of each patch [n x 1 vector]
  – n orientation parameters specifying the angle of the patch [n x 1 vector]
  – n 2D points giving the positions of the patches [n x 2 matrix]

Slide credit: Steve Seitz

Local Descriptors: SURF

• Fast approximation of the SIFT idea
  – Efficient computation via 2D box filters & integral images ⇒ about 6 times faster than SIFT
  – Equivalent quality for object identification
• GPU implementation available
  – Feature extraction @ 100 Hz (detector + descriptor, 640×480 image)
• Code:
  – http://www.vision.ee.ethz.ch/~surf
  – http://homes.esat.kuleuven.be/~ncorneli/gpusurf/
  – http://www.robots.ox.ac.uk/~vgg/research/affine
  – http://www.cs.ubc.ca/~lowe/keypoints/

[Bay, ECCV'06], [Cornelis, CVGPU'08]

Other local descriptors: Gray-scale intensity

• Normalize an 11x11 patch
• Project onto a PCA basis: coefficients c1, c2, …, c15
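A sketch of this PCA-coefficient descriptor. The patch size and the choice of 15 components follow the slide; the randomly generated stand-in patches and everything else are illustrative:

```python
import numpy as np

# Hypothetical stand-in data: 200 normalized 11x11 patches, flattened to 121-D.
rng = np.random.default_rng(0)
patches = rng.normal(size=(200, 121))

# Build the PCA basis from the patch collection.
mean = patches.mean(axis=0)
U, S, Vt = np.linalg.svd(patches - mean, full_matrices=False)
basis = Vt[:15]                     # top 15 principal directions

def pca_descriptor(patch_vec):
    """Descriptor = projection coefficients c1..c15 of the
    mean-subtracted patch onto the PCA basis."""
    return basis @ (patch_vec - mean)

c = pca_descriptor(patches[0])      # a 15-D descriptor
```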

Other local descriptors: Steerable filters

Other local descriptors: Gray-scale intensity

Other local descriptors: GLOH

Other local descriptors: Shape context [Belongie et al. 2002]

Other local descriptors: Geometric Blur [Berg et al. 2001]

Other local descriptors: Geometric Blur

http://www.robots.ox.ac.uk/~vgg/research/affine/detectors.html#binaries

Value of Local Features

• Advantages
  – Critical to find distinctive and repeatable local regions for multi-view matching
  – Complexity reduction via selection of distinctive points
  – Describe images, objects, parts without requiring segmentation; robust to clutter & occlusion
  – Robustness: similar descriptors despite moderate view changes, noise, blur, etc.

What we will learn today

• Local descriptors
  – SIFT
  – An assortment of other descriptors
• Recognition and matching with local features (Problem Set 3 (Q3))
  – Matching objects with local features: David Lowe
  – Panorama stitching: Brown & Lowe
  – Applications

Recognition with Local Features
• Image content is transformed into local features that are invariant to translation, rotation, and scale
• Goal: verify whether they belong to a consistent configuration

Local features, e.g. SIFT

Slide credit: David Lowe

Finding Consistent Configurations
• Global spatial models
  – Generalized Hough Transform [Lowe99]
  – RANSAC [Obdrzalek02, Chum05, Nister06]
  – Basic assumption: object is planar

• Assumption is often justified in practice
  – Valid for many structures on buildings
  – Sufficient for small viewpoint variations on 3D objects

Reminder: Hough Transform

• Origin: detection of straight lines in clutter
  – Basic idea: each candidate point votes for all lines that it is consistent with
  – Votes are accumulated in a quantized array
  – Local maxima correspond to candidate lines
• Representation of a line
  – The usual form y = ax + b has a singularity around 90°
  – Better parameterization: x cos(θ) + y sin(θ) = ρ

(Figure: a line in image (x, y) space and its representation as a point in (θ, ρ) parameter space.)
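The voting scheme above can be sketched as a minimal accumulator-array implementation under the (θ, ρ) parameterization:

```python
import numpy as np

def hough_lines(points, n_theta=180, n_rho=400, rho_max=200.0):
    """Accumulator-array voting with x*cos(theta) + y*sin(theta) = rho,
    which avoids the singularity of y = a*x + b near vertical lines."""
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        bins = ((rhos + rho_max) / (2 * rho_max) * n_rho).astype(int)
        ok = (bins >= 0) & (bins < n_rho)
        acc[np.arange(n_theta)[ok], bins[ok]] += 1   # one vote per (theta, rho)
    ti, ri = np.unravel_index(np.argmax(acc), acc.shape)  # peak = best line
    theta = thetas[ti]
    rho = (ri + 0.5) / n_rho * 2 * rho_max - rho_max
    return theta, rho, acc.max()

# Points on the horizontal line y = 50: cos(theta) must vanish,
# so the peak should be near theta = pi/2, rho = 50.
pts = [(x, 50.0) for x in range(0, 100, 5)]
theta, rho, votes = hough_lines(pts)
```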


• Generalized Hough Transform for Recognition [Lowe99]
  – Typically only 3 feature matches needed for recognition
  – Extra matches provide robustness
  – An affine model can be used for planar objects

View Interpolation

• Training
  – Training views from similar viewpoints are clustered based on feature matches
  – Matching features between adjacent views are linked
• Recognition
  – Feature matches may be spread over several training viewpoints
  ⇒ Use the known links to "transfer votes" to other viewpoints

Recognition Using View Interpolation

Location Recognition

Training

What we will learn today

• Local descriptors
  – SIFT
  – An assortment of other descriptors
• Recognition and matching with local features (Problem Set 3 (Q3))
  – Matching objects with local features: David Lowe
  – Panorama stitching: Brown & Lowe
  – Applications

Introduction

• Are you getting the whole picture?
  – Compact camera FOV = 50 × 35°

Slide credit: Matthew Brown

Introduction

• Are you getting the whole picture?
  – Compact camera FOV = 50 × 35°
  – Human FOV = 200 × 135°

Slide credit: Matthew Brown

Introduction

• Are you getting the whole picture?
  – Compact camera FOV = 50 × 35°
  – Human FOV = 200 × 135°
  – Panoramic mosaic = 360 × 180°

Slide credit: Matthew Brown

Why "Recognising Panoramas"?
• 1D Rotations (θ)
  – Ordering ⇒ matching images

Slide credit: Matthew Brown


Why "Recognising Panoramas"?
• 1D Rotations (θ)
  – Ordering ⇒ matching images
• 2D Rotations (θ, φ)
  – Ordering ⇒ matching images

Slide credit: Matthew Brown


Why "Recognising Panoramas"?

Slide credit: Matthew Brown

Algorithm (Brown & Lowe, 2003)

1. Feature Matching
   a) SIFT features
   b) Nearest-neighbor matching
2. Image Matching
3. Bundle Adjustment
4. Multi-band Blending

Nearest Neighbor Matching of SIFT Features


• Find the k nearest neighbors for each feature
  – k ≈ number of overlapping images (we use k = 4)
• Use a k-d tree
  – The k-d tree recursively bi-partitions the data at the mean, in the dimension of maximum variance
  – Approximate nearest neighbors found in O(n log n)

Slide credit: Matthew Brown
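A toy k-d tree following the slide's construction rule (bi-partition at the mean, in the dimension of maximum variance), with exact rather than approximate nearest-neighbor queries for simplicity:

```python
import numpy as np

def build_kdtree(pts, idx=None):
    """Minimal k-d tree: recursively split at the mean in the
    dimension of maximum variance."""
    if idx is None:
        idx = np.arange(len(pts))
    if len(idx) <= 1:
        return {"leaf": idx}
    dim = np.argmax(pts[idx].var(axis=0))
    split = pts[idx, dim].mean()
    left = idx[pts[idx, dim] <= split]
    right = idx[pts[idx, dim] > split]
    if len(left) == 0 or len(right) == 0:        # degenerate: all equal
        return {"leaf": idx}
    return {"dim": dim, "split": split,
            "left": build_kdtree(pts, left),
            "right": build_kdtree(pts, right)}

def nn_query(node, pts, q, best=None):
    """Exact nearest neighbor with branch-and-bound backtracking."""
    if best is None:
        best = [None, np.inf]                    # [index, squared distance]
    if "leaf" in node:
        for i in node["leaf"]:
            d = np.sum((pts[i] - q) ** 2)
            if d < best[1]:
                best[:] = [i, d]
        return best
    near, far = (node["left"], node["right"]) \
        if q[node["dim"]] <= node["split"] else (node["right"], node["left"])
    nn_query(near, pts, q, best)
    # Search the far side only if the splitting plane is close enough.
    if (q[node["dim"]] - node["split"]) ** 2 < best[1]:
        nn_query(far, pts, q, best)
    return best

rng = np.random.default_rng(1)
descs = rng.normal(size=(500, 8))                # stand-in for 128-D SIFT
tree = build_kdtree(descs)
q = descs[42] + 0.001                            # query near a known point
i, d2 = nn_query(tree, descs, q)
```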

Algorithm (Brown & Lowe, 2003)

1. Feature Matching
2. Image Matching
   a) RANSAC for homography
   b) Probabilistic model for verification
3. Bundle Adjustment
4. Multi-band Blending

RANSAC for Homography

Slide credit: Matthew Brown
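RANSAC for homography can be sketched as follows. This is illustrative numpy code, not Brown & Lowe's implementation; the homography is fitted in the 8-parameter form with h33 fixed to 1:

```python
import numpy as np

def fit_homography(src, dst):
    """Fit H (with h33 = 1) from >= 4 correspondences by solving an
    inhomogeneous linear system with least squares."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float),
                            rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def project(H, pts):
    """Apply H to 2D points (homogeneous multiply, then divide)."""
    p = H @ np.column_stack([pts, np.ones(len(pts))]).T
    return (p[:2] / p[2]).T

def ransac_homography(src, dst, n_iters=200, thresh=3.0, seed=0):
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), bool)
    for _ in range(n_iters):
        sample = rng.choice(len(src), 4, replace=False)    # minimal set
        H = fit_homography(src[sample], dst[sample])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best model.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers

# Synthetic example: a pure translation with some gross outliers mixed in.
rng = np.random.default_rng(2)
src = rng.uniform(0, 100, size=(40, 2))
dst = src + np.array([5.0, -3.0])                # ground truth: translate
dst[:8] = rng.uniform(0, 100, size=(8, 2))       # 8 gross outliers
H, inliers = ransac_homography(src, dst)
```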


Probabilistic model for verification

Slide credit: Matthew Brown


• Compare probability that this set of RANSAC inliers/outliers was generated by a correct/false image match

– ni = #inliers, nf = #features
– p1 = p(inlier | match), p0 = p(inlier | ~match)
– pmin = acceptance probability

• Choosing values for p1, p0, and pmin

Slide credit: Matthew Brown
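The verification idea can be sketched with a binomial model for the inlier count; the numeric values for p1, p0, the prior, and pmin below are illustrative assumptions, not the slide's:

```python
from math import comb

def match_accepted(n_inliers, n_features,
                   p1=0.6, p0=0.1, p_match=1e-6, p_min=0.97):
    """Compare the probability of observing n_inliers out of n_features
    under a correct match (inlier prob p1) vs. a false match (inlier
    prob p0), and accept if the posterior exceeds p_min."""
    def binom(n, k, p):
        return comb(n, k) * p**k * (1 - p)**(n - k)
    like_correct = binom(n_features, n_inliers, p1) * p_match
    like_false = binom(n_features, n_inliers, p0) * (1 - p_match)
    posterior = like_correct / (like_correct + like_false)
    return posterior > p_min

# Many inliers among the matched features -> accept; few -> reject.
```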

Finding the panoramas

Slide credit: Matthew Brown


Algorithm (Brown & Lowe, 2003)

1. Feature Matching
2. Image Matching
3. Bundle Adjustment (extra reading: Hartley & Zisserman, Chapter 18)
   a) Error function
4. Multi-band Blending

Error function
• Sum of squared projection errors

– n = #images
– I(i) = set of image matches to image i
– F(i, j) = set of feature matches between images i, j
– r_ij^k = residual of the k-th feature match between images i, j
• Robust error function

Slide credit: Matthew Brown
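Written out with the notation above, the objective is (reconstructed to be consistent with Brown & Lowe, 2003; h is a robust kernel, e.g. a Huber-style function):

```latex
e = \sum_{i=1}^{n} \; \sum_{j \in I(i)} \; \sum_{k \in F(i,j)} h\!\left(\mathbf{r}_{ij}^{k}\right),
\qquad
h(x) = \begin{cases} |x|^2 & |x| < \sigma \\ 2\sigma |x| - \sigma^2 & |x| \ge \sigma \end{cases}
```

The robust kernel behaves quadratically for small residuals and linearly for large ones, so outlier matches do not dominate the bundle adjustment.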

Bundle Adjustment

• New images are initialised with the rotation and focal length of the best matching image

Slide credit: Matthew Brown


Algorithm (Brown & Lowe, 2003)

1. Feature Matching
2. Image Matching
3. Bundle Adjustment
4. Multi-band Blending

Multi-band Blending

• Burt & Adelson 1983
  – Blend frequency bands over a range ∝ λ

Slide credit: Matthew Brown

2-band Blending

Low frequency (λ > 2 pixels)

High frequency (λ < 2 pixels)

Slide credit: Matthew Brown
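A 2-band blend as described above can be sketched as follows. This is a numpy-only illustration: a Gaussian low-pass defines the two bands; low frequencies get a smooth transition, high frequencies a hard seam:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur implemented with numpy only."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2)); k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def two_band_blend(img_a, img_b, mask, sigma=2.0):
    """Low frequencies blended with a smooth weight, high frequencies
    (fine detail) taken from one image via a hard seam."""
    low_a, low_b = gaussian_blur(img_a, sigma), gaussian_blur(img_b, sigma)
    high_a, high_b = img_a - low_a, img_b - low_b
    w = gaussian_blur(mask.astype(float), sigma)   # smooth weight for lows
    low = w * low_a + (1 - w) * low_b
    high = np.where(mask, high_a, high_b)          # hard seam for highs
    return low + high

# Two constant images joined at a vertical seam.
h, wpix = 32, 64
img_a = np.full((h, wpix), 0.2)
img_b = np.full((h, wpix), 0.8)
mask = np.zeros((h, wpix), bool); mask[:, :wpix // 2] = True
out = two_band_blend(img_a, img_b, mask)
```

Away from the seam each side keeps its own value, while the low-frequency band transitions smoothly across the seam.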

Linear Blending

2-band Blending

Results

Slide credit: Matthew Brown

Results

iPhone version available

[Brown, Szeliski, and Winder, 2005] http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html

Applications of Local Invariant Features

• Wide-baseline stereo
• Motion tracking
• Mobile robot navigation
• 3D reconstruction
• Recognition
  – Specific objects
  – Textures
  – Categories
• …

Wide-Baseline Stereo

Image from T. Tuytelaars ECCV 2006 tutorial

Recognition of Specific Objects, Scenes

Schmid and Mohr 1997 Sivic and Zisserman, 2003

Rothganger et al. 2003; Lowe 2002

Slide credit: Kristen Grauman

Recognition of Categories

Constellation model: Weber et al. (2000), Fergus et al. (2003), Fei-Fei et al. (2003)
Bags of words: Csurka et al. (2004), Fei-Fei & Perona (2005), Dorko & Schmid (2005), Sivic et al. (2005), Lazebnik et al. (2006), …

Slide credit: Svetlana Lazebnik

References and Further Reading

• More details on homography estimation can be found in Chapter 4.7 of
  – R. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd Ed., Cambridge Univ. Press, 2004
• Details about the DoG detector and the SIFT descriptor can be found in
  – D. Lowe, Distinctive image features from scale-invariant keypoints, IJCV 60(2), pp. 91-110, 2004

• Try the available local feature detectors and descriptors – http://www.robots.ox.ac.uk/~vgg/research/affine/detectors.html#binaries

What we have learned today

• Local descriptors
  – SIFT (Problem Set 3 (Q2))
  – An assortment of other descriptors (Problem Set 3 (Q4))
• Recognition and matching with local features (Problem Set 3 (Q3))
  – Matching objects with local features: David Lowe
  – Panorama stitching: Brown & Lowe
  – Applications

Supplementary materials

• Matching images
  – Affine Transformation
  – Projective Transformation (Homography)

Recognition with Local Features
• Image content is transformed into local features that are invariant to translation, rotation, and scale
• Goal: verify whether they belong to a consistent configuration

Local features, e.g. SIFT

Slide credit: David Lowe

Alignment Problem

• We have previously considered how to fit a model to image evidence
  – e.g., a line to edge points
• In alignment, we fit the parameters of some transformation T according to a set of matching feature pairs ("correspondences") x_i ↔ x_i'

Slide credit: Kristen Grauman

Basic 2D Transformations

• Basic 2D transformations as 3x3 matrices

x' 1 0 tx x x' sx 0 0x           = y' = 0 s 0 y y' 0 1 t y y    y    1  0 0 11  1   0 0 11 Translation Scaling

θθ − x' cos sin 0  x x' 1 shx 0  x     y'= sin θθ cos 0 y =   y' shy 1 0  y 1 0 011  1 0 011  Rotation Shearing Slide credit: Alyosha credit: Slide Efros

What Can be Represented by a 2×2 Matrix?

• 2D Scaling? $x' = s_x x,\; y' = s_y y$:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$
• 2D Rotation around (0,0)? $x' = \cos\theta\, x - \sin\theta\, y,\; y' = \sin\theta\, x + \cos\theta\, y$:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$
• 2D Shearing? $x' = x + sh_x y,\; y' = sh_y x + y$:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & sh_x \\ sh_y & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

Slide credit: Alyosha Efros

What Can be Represented by a 2×2 Matrix?

• 2D Mirror about the y axis? $x' = -x,\; y' = y$:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$
• 2D Mirror over (0,0)? $x' = -x,\; y' = -y$:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$
• 2D Translation? $x' = x + t_x,\; y' = y + t_y$: NO! This cannot be written as a 2×2 matrix.

Slide credit: Alyosha Efros

2D Linear Transforms

x'  abx  =   y'  cdy 

• Only linear 2D transformations can be represented with a 2x2 matrix.
• Linear transformations are combinations of:
  – Scale
  – Rotation
  – Shear
  – Mirror

Slide credit: Alyosha Efros

Homogeneous Coordinates

• Q: How can we represent translation, $x' = x + t_x,\; y' = y + t_y$, as a 3x3 matrix using homogeneous coordinates?

• A: Using the rightmost column:

$$\text{Translation} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}$$

Slide credit: Alyosha Efros

2D Affine Transformation vs. Projective Transformation

$$\text{Affine:}\;\begin{bmatrix} x' \\ y' \\ w \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ w \end{bmatrix} \qquad \text{Projective:}\;\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}\begin{bmatrix} x \\ y \\ w \end{bmatrix}$$

• Affine transformations combine: linear transformations and translations
• Projective transformations combine: affine transformations and projective warps

• Affine: parallel lines remain parallel
• Projective: parallel lines do not necessarily remain parallel

Let's Start with Affine Transformations

• Simple fitting procedure (linear least squares)
• Approximates viewpoint changes for roughly planar objects and roughly orthographic cameras
• Can be used to initialize fitting for more complex models

Slide credit: Svetlana Lazebnik, David Lowe

Fitting an Affine Transformation
• Assuming we know the correspondences $(x_i, y_i) \leftrightarrow (x_i', y_i')$, how do we get the transformation?

$$\begin{bmatrix} x_i' \\ y_i' \end{bmatrix} = \begin{bmatrix} m_1 & m_2 \\ m_3 & m_4 \end{bmatrix}\begin{bmatrix} x_i \\ y_i \end{bmatrix} + \begin{bmatrix} t_1 \\ t_2 \end{bmatrix}$$

Stacked as a linear system in the parameters:

$$\begin{bmatrix} & & \vdots & & & \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ & & \vdots & & & \end{bmatrix}\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} \vdots \\ x_i' \\ y_i' \\ \vdots \end{bmatrix}$$

Slide credit: Bastian Leibe

• How many matches (correspondence pairs) do we need to solve for the transformation parameters?
• Once we have solved for the parameters, how do we compute the coordinates of the corresponding point for $(x_{new}, y_{new})$?

Slide credit: Bastian Leibe
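Each correspondence contributes 2 equations and there are 6 unknowns, so 3 matches suffice; with more, solve by least squares. A numpy sketch of this fit:

```python
import numpy as np

def fit_affine(src, dst):
    """Solve for (m1..m4, t1, t2) by linear least squares, stacking two
    rows per correspondence:
      [x y 0 0 1 0] p = x',   [0 0 x y 0 1] p = y'."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 0, 0, 1, 0]); rhs.append(u)
        rows.append([0, 0, x, y, 0, 1]); rhs.append(v)
    p, *_ = np.linalg.lstsq(np.asarray(rows, float), np.asarray(rhs, float),
                            rcond=None)
    m1, m2, m3, m4, t1, t2 = p
    return np.array([[m1, m2], [m3, m4]]), np.array([t1, t2])

# Recover a known transform: a 30-degree rotation plus translation (4, 7).
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
dst = src @ R.T + np.array([4.0, 7.0])
M, t = fit_affine(src, dst)
```

A new point maps to M @ (x_new, y_new) + t.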

Fitting a Projective Transformation (i.e. Homography)
• A projective transform is a mapping between any two perspective projections with the same center of projection
  – i.e., two planes (PP1, PP2) in 3D along the same sight ray
• Properties
  – A rectangle maps to an arbitrary quadrilateral
  – Parallel lines are not preserved, but straight lines are
• This is called a homography

$$\begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} = \begin{bmatrix} * & * & * \\ * & * & * \\ * & * & * \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad \mathbf{p}' = H\,\mathbf{p}$$

Slide credit: Alyosha Efros

• Setting the scale factor $h_{33} = 1$ leaves 8 parameters:

$$\begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \qquad \mathbf{p}' = H\,\mathbf{p}$$

Slide credit: Alyosha Efros

Fitting a Projective Transformation (i.e. Homography)
• Estimating the transformation from correspondences $\mathbf{x}_{A_i} \leftrightarrow \mathbf{x}_{B_i}$

In homogeneous coordinates, $\mathbf{x}' = H \cdot \mathbf{x}$:

$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Image coordinates are recovered by dividing out the scale: $x'' = x'/z'$, $y'' = y'/z'$.

Slide credit: Krystian Mikolajczyk


• Dividing out the scale gives, for the first correspondence,

$$x_{A_1} = \frac{h_{11} x_{B_1} + h_{12} y_{B_1} + h_{13}}{h_{31} x_{B_1} + h_{32} y_{B_1} + 1}$$

Slide credit: Krystian Mikolajczyk

• and likewise for the y coordinate:

$$y_{A_1} = \frac{h_{21} x_{B_1} + h_{22} y_{B_1} + h_{23}}{h_{31} x_{B_1} + h_{32} y_{B_1} + 1}$$

• Multiplying out the x-equation gives a constraint that is linear in the $h_{ij}$:

$$x_{A_1} h_{31} x_{B_1} + x_{A_1} h_{32} y_{B_1} + x_{A_1} = h_{11} x_{B_1} + h_{12} y_{B_1} + h_{13}$$

Slide credit: Krystian Mikolajczyk

• Rearranging yields two linear equations per correspondence:

$$h_{11} x_{B_1} + h_{12} y_{B_1} + h_{13} - x_{A_1} h_{31} x_{B_1} - x_{A_1} h_{32} y_{B_1} - x_{A_1} = 0$$
$$h_{21} x_{B_1} + h_{22} y_{B_1} + h_{23} - y_{A_1} h_{31} x_{B_1} - y_{A_1} h_{32} y_{B_1} - y_{A_1} = 0$$

Slide credit: Krystian Mikolajczyk

Fitting a Projective Transformation (i.e. Homography)
• Stacking all correspondences $\mathbf{x}_{A_i} \leftrightarrow \mathbf{x}_{B_i}$ gives the linear system $A\,\mathbf{h} = 0$:

$$\begin{bmatrix} x_{B_1} & y_{B_1} & 1 & 0 & 0 & 0 & -x_{A_1} x_{B_1} & -x_{A_1} y_{B_1} & -x_{A_1} \\ 0 & 0 & 0 & x_{B_1} & y_{B_1} & 1 & -y_{A_1} x_{B_1} & -y_{A_1} y_{B_1} & -y_{A_1} \\ & & & & \vdots & & & & \end{bmatrix}\begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \end{bmatrix}$$

Slide credit: Krystian Mikolajczyk

Fitting a Projective Transformation (i.e. Homography)
• Solution of $A\mathbf{h} = 0$:
  – Null-space vector of A
  – Corresponds to the singular vector with the smallest singular value, obtained via the SVD

$$A = U D V^T, \qquad \mathbf{h} = \frac{[v_{19}, \ldots, v_{99}]^T}{v_{99}}$$

i.e. the last column of V, normalized so the last entry is 1; this minimizes the least-squares error.

Slide credit: Krystian Mikolajczyk
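The SVD solution above (the Direct Linear Transform) can be sketched directly in numpy:

```python
import numpy as np

def homography_dlt(src, dst):
    """Build the 2n x 9 matrix A from the two linear equations per
    correspondence, and take h as the right singular vector of A with
    the smallest singular value (the least-squares null vector)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    h = Vt[-1]                     # singular vector for smallest sigma
    H = h.reshape(3, 3)
    return H / H[2, 2]             # normalize so h33 = 1

# Unit square mapped to a translated square: H should be a pure translation.
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
dst = src + np.array([2.0, 3.0])
H = homography_dlt(src, dst)
```

With 4 correspondences A is 8x9, so the null space is one-dimensional and the solution is exact; with more correspondences the same code gives the least-squares estimate.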

Uses: Analyzing Patterns and Shapes
• What is the shape of the b/w floor pattern?

Slide credit: Antonio Criminisi

The floor (enlarged)

Analyzing Patterns and Shapes

Slide credit: Antonio Criminisi

Automatic rectification. From Martin Kemp, The Science of Art (manual reconstruction).
