CS 1674: Intro to Computer Vision
Midterm Review

Prof. Adriana Kovashka, University of Pittsburgh
October 10, 2016

Reminders

• The midterm exam is in class on this coming Wednesday
• There will be no make-up exams unless you or a close relative is seriously ill!

Review requests I received

• Textures and texture representations, image responses to size and orientation of Gaussian filter banks, comparisons – 4
• Corner detection algorithm, Harris – 4
• Invariance vs covariance, affine intensity change, and applications to know – 3
• Scale-invariant detection, blob detection, Harris automatic scale selection – 3
• SIFT and feature description – 3
• Keypoint matching algorithm, feature matching – 2
• Examples of how to compute and apply a homography – 2
• Why it makes sense to use the ratio: distance to best match / distance to second best match when matching features across images
• Summary of equations students need to know
• Pyramids
• Convolution practical use
• Filters for transforming the image

Transformations, Homographies, Epipolar Geometry

2D Linear Transformations

x' a bx       y' c dy

Only linear 2D transformations can be represented with a 2x2 matrix. Linear transformations are combinations of:
• Scale,
• Rotation,
• Shear, and
• Mirror

Alyosha Efros

2D Affine Transformations

 x' a b c  x       y'  d e f  y w' 0 0 1 w

Affine transformations are combinations of:
• Linear transformations, and
• Translations

Maps lines to lines, parallel lines remain parallel

Adapted from Alyosha Efros

Projective Transformations

 x' a b c  x  y'  d e f  y      w' g h i w Projective transformations: • Affine transformations, and • Projective warps

Parallel lines do not necessarily remain parallel

Kristen Grauman

How to stitch together a panorama (a.k.a. mosaic)?
• Basic Procedure
  – Take a sequence of images from the same position
    • Rotate the camera about its optical center
  – Compute the homography (transformation) between second image and first
  – Transform the second image to overlap with the first
  – Blend the two together to create a mosaic
  – (If there are more images, repeat)

Modified from Steve Seitz

Computing the homography

[Figure: pairs of corresponding points in the two images: (x1, y1) ↔ (x1', y1'), (x2, y2) ↔ (x2', y2'), …, (xn, yn) ↔ (xn', yn')]

To compute the homography given pairs of corresponding points in the images, we need to set up an equation where the parameters of H are the unknowns…

Kristen Grauman

Computing the homography

p' = Hp:

\begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}

Can set scale factor i = 1, so there are 8 unknowns.
Set up a system of linear equations: Ah = b, where the vector of unknowns is h = [a, b, c, d, e, f, g, h]^T.
Need at least 8 equations, but the more the better…
Solve for h. If overconstrained, solve using least-squares: \min \| Ah - b \|^2

Kristen Grauman Computing the homography

• Assume we have four matched points: How do we compute homography H?

With p' = Hp written out, each correspondence gives

\begin{bmatrix} w'x' \\ w'y' \\ w' \end{bmatrix} = \begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & h_9 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}

Stacking the unknowns into h = [h_1, h_2, \ldots, h_9]^T and eliminating w', each matched pair (x, y) ↔ (x', y') contributes two rows to a system A h = 0:

\begin{bmatrix} -x & -y & -1 & 0 & 0 & 0 & x x' & y x' & x' \\ 0 & 0 & 0 & -x & -y & -1 & x y' & y y' & y' \end{bmatrix} h = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

• Apply SVD: UDV^T = A  →  [U, S, V] = svd(A);

• h = Vsmallest (column of V corr. to smallest singular value)
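The same procedure as a minimal MATLAB sketch (the function name fit_homography and the variable names are illustrative, not course code); it assumes pts1 and pts2 are N x 2 matrices of corresponding (x, y) points with N >= 4:

% Minimal sketch: estimate a homography from N >= 4 correspondences via the DLT.
% pts1, pts2: N x 2 matrices of corresponding (x, y) points.
function H = fit_homography(pts1, pts2)
    N = size(pts1, 1);
    A = zeros(2*N, 9);
    for i = 1:N
        x  = pts1(i, 1);  y  = pts1(i, 2);   % point in image 1
        xp = pts2(i, 1);  yp = pts2(i, 2);   % corresponding point in image 2
        % two rows of A per correspondence (the DLT equations above)
        A(2*i-1, :) = [-x, -y, -1,  0,  0,  0, x*xp, y*xp, xp];
        A(2*i,   :) = [ 0,  0,  0, -x, -y, -1, x*yp, y*yp, yp];
    end
    [U, S, V] = svd(A);          % as on the slide
    h = V(:, end);               % column of V for the smallest singular value
    H = reshape(h, 3, 3)';       % H = [h1 h2 h3; h4 h5 h6; h7 h8 h9]
end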

Derek Hoiem

Transforming the second image

[Figure: image 2 warped onto the canvas of image 1]

Test point: a homogeneous point (wx, wy, w) corresponds to the image point (x, y), since dividing by w gives (wx/w, wy/w) = (x, y).

To apply a given homography H:
• Compute p' = Hp (regular matrix multiply)
• Convert p' from homogeneous to image coordinates:

\begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}

Modified from Kristen Grauman

Transforming the second image

[Figure: forward warping from image 2 to the canvas of image 1 with H(x, y): pixel (x, y) in f(x, y) maps to (x', y') in g(x', y')]

Forward warping: Send each pixel f(x,y) to its corresponding location (x’,y’) = H(x,y) in the right image
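A minimal MATLAB sketch of this forward warp, assuming im2 is a grayscale double image, canvas is a preallocated output array, and H is a 3x3 homography mapping image-2 coordinates to canvas coordinates (all names are illustrative):

% Minimal sketch of forward warping image 2 onto the canvas with homography H.
[h2, w2] = size(im2);
for y = 1:h2
    for x = 1:w2
        p  = H * [x; y; 1];              % p' = Hp (homogeneous coordinates)
        xp = round(p(1) / p(3));         % convert from homogeneous
        yp = round(p(2) / p(3));         %   to image coordinates
        if xp >= 1 && xp <= size(canvas, 2) && yp >= 1 && yp <= size(canvas, 1)
            canvas(yp, xp) = im2(y, x);  % send the pixel to its new location
        end
    end
end

Because the target coordinates are rounded, forward warping can leave holes in the output; inverse warping (looping over canvas pixels and sampling image 2 at the location given by the inverse homography) avoids this and is usually preferred in practice.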

Modified from Alyosha Efros

Depth from disparity

We have two images taken from cameras with different intrinsic and extrinsic parameters.
• How do we match a point in the first image to a point in the second?

image I(x,y) Disparity map D(x,y) image I´(x´,y´)

So if we could find the corresponding points in two images, we could estimate relative depth…

Kristen Grauman

Epipolar geometry: notation

[Figure: 3D point X projects to x in the left image and to x' in the right image]

• Baseline – line connecting the two camera centers
• Epipoles = intersections of the baseline with the image planes = projections of the other camera center
• Epipolar Plane – plane containing the baseline
• Epipolar Lines – intersections of an epipolar plane with the image planes (always come in corresponding pairs)
• Note: All epipolar lines intersect at the epipole.

Derek Hoiem

Epipolar constraint

The epipolar constraint is useful because it reduces the correspondence problem to a 1D search along an epipolar line.

Kristen Grauman, image from Andrew Zisserman

Essential matrix

XTRX  0

X[Tx ]RX  0

Let E [T x]R XEX  XT EX  0 E is called the essential matrix, and it relates corresponding image points between both cameras, given the rotation and translation. Before we said: If we observe a point in one image, its position in other image is constrained to lie on line defined by above. • Turns out Ex’ is the epipolar line through x in the first image, corresp. to x’. Note: these points are in camera coordinate systems.

Kristen Grauman

Basic stereo matching algorithm

• For each pixel in the first image
  – Find the corresponding epipolar scanline in the right image
  – Search along the epipolar line and pick the best match x'
  – Compute disparity x − x' and set depth(x) = f*T/(x − x')
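A minimal MATLAB sketch of this loop in the simple rectified setup (so the epipolar line is just the same image row), assuming imL and imR are rectified grayscale double images, w is the window half-size, maxd the largest disparity searched, and f, T the focal length and baseline; the SSD cost below is the slide's Euclidean distance, squared:

% Minimal sketch of window-based stereo matching on rectified images.
[rows, cols] = size(imL);
disp_map = zeros(rows, cols);
for y = 1+w : rows-w
    for x = 1+w : cols-w
        ref = imL(y-w:y+w, x-w:x+w);              % reference window in the left image
        best_cost = inf;  best_d = 0;
        for d = 0 : min(maxd, x-w-1)              % search along the same scanline
            cand = imR(y-w:y+w, x-d-w : x-d+w);   % candidate window at x' = x - d
            cost = sum((ref(:) - cand(:)).^2);    % SSD matching cost
            if cost < best_cost
                best_cost = cost;  best_d = d;
            end
        end
        disp_map(y, x) = best_d;                  % disparity x - x'
    end
end
depth = f * T ./ max(disp_map, 1);                % depth = f*T / disparity (avoid divide-by-zero)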

Derek Hoiem Correspondence search

[Figure: left and right images with the matching window slid along a scanline; plot of matching cost vs. disparity]

• Slide a window along the right scanline and compare contents of that window with the reference window in the left image • Matching cost: e.g. Euclidean distance

Derek Hoiem Geometry for a simple stereo system

• Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is expression for Z?

Similar triangles (p_l, P, p_r) and (O_l, P, O_r):

\frac{T + x_l - x_r}{Z - f} = \frac{T}{Z}

Solving for the depth:

Z = \frac{fT}{x_r - x_l}

where x_r − x_l is the disparity.
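As a quick sanity check with made-up numbers: for a focal length f = 500 pixels, baseline T = 0.1 m, and disparity x_r − x_l = 25 pixels, Z = (500 × 0.1) / 25 = 2 m. Halving the disparity to 12.5 pixels doubles the depth to 4 m: depth is inversely proportional to disparity.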

Kristen Grauman

Results with window search

[Figure: left and right input images; window-based matching result compared with the ground-truth disparity map]

Derek Hoiem

How can we improve?
• Uniqueness – For any point in one image, there should be at most one matching point in the other image
• Ordering – Corresponding points should be in the same order in both views
• Smoothness – We expect disparity values to change slowly (for the most part)

Derek Hoiem Many of these constraints can be encoded in an energy function and solved using graph cuts

[Figure: disparity maps before (window-based matching), with graph cuts, and the ground truth]

Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001
For the latest and greatest: http://vision.middlebury.edu/stereo/

Derek Hoiem

Projective structure from motion
• Given: m images of n fixed 3D points

xij = Pi Xj , i = 1,… , m, j = 1, … , n

• Problem: estimate m projection matrices Pi and n 3D points Xj from the mn corresponding 2D points xij

[Figure: a 3D point X_j projected to x_1j, x_2j, x_3j by cameras P_1, P_2, P_3]

Svetlana Lazebnik

Photo synth

Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," SIGGRAPH 2006

http://photosynth.net/ 3D from multiple images

Building Rome in a Day: Agarwal et al. 2009

Recap: Epipoles
• Point x in left image corresponds to epipolar line l' in right image
• Epipolar line passes through the epipole (the intersection of the cameras' baseline with the image plane)


Derek Hoiem

Recap: Essential, Fundamental Matrices
• Fundamental matrix maps from a point in one image to a line in the other

• If x and x' correspond to the same 3D point X: x'^\top F x = 0

• Essential matrix is like fundamental matrix but more constrained

Adapted from Derek Hoiem

Recap: stereo with calibrated cameras
• Given image pair, R, T
• Detect some features
• Compute essential matrix E
• Match features using the epipolar and other constraints
• Triangulate for 3D structure and get depth

Kristen Grauman

Texture representations

Correlation filtering

Say the averaging window size is 2k+1 x 2k+1:

Attribute a uniform weight to each pixel and loop over all pixels in the neighborhood around image pixel F[i, j]:

G[i, j] = \frac{1}{(2k+1)^2} \sum_{u=-k}^{k} \sum_{v=-k}^{k} F[i+u, j+v]

Now generalize to allow different weights depending on neighboring pixel’s relative position:

Non-uniform weights H[u, v]:

G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u, v]\, F[i+u, j+v]

Kristen Grauman

Convolution vs. correlation

Cross-correlation (slide the filter H over the image F with no flip):

G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u, v]\, F[i+u, j+v]

Convolution (flip the filter in both dimensions, then slide it over the image):

G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u, v]\, F[i-u, j-v]

[Figure: a 5x5 example image F containing a few bright (200) pixels, and a 3x3 Gaussian-like kernel H = [.06 .12 .06; .12 .25 .12; .06 .12 .06] centered at (0, 0)]

Filters for computing gradients

\begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix}

[Figure: image * filter = gradient response]
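A minimal MATLAB sketch of applying such a derivative filter, assuming im is a grayscale double image; note that conv2 performs true convolution (it flips the kernel), so the correlation response is obtained by flipping the kernel first:

% Minimal sketch: image gradients with the derivative filter above.
sobel_x = [1 0 -1; 2 0 -2; 1 0 -1];   % responds to horizontal intensity changes
sobel_y = sobel_x';                    % its transpose responds to vertical changes
Ix = conv2(im, sobel_x, 'same');       % d/dx responses
Iy = conv2(im, sobel_y, 'same');       % d/dy responses
grad_mag = sqrt(Ix.^2 + Iy.^2);        % gradient magnitude
grad_ori = atan2(Iy, Ix);              % gradient orientation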

Slide credit: Derek Hoiem

Texture representation: example

            mean d/dx value   mean d/dy value
Win. #1           4                 10
Win. #2          18                  7
…
Win. #9          20                 20

[Figure: original image → derivative filter responses, squared → statistics that summarize the response patterns in small windows]

Kristen Grauman

Filter banks

[Figure: filter bank spanning several orientations and scales, with “Edges”, “Bars”, and “Spots” filters]

• What filters to put in the bank? – Typically we want a combination of scales and orientations, different types of patterns.

Matlab code available for these examples: http://www.robots.ox.ac.uk/~vgg/research/texclass/filters.html

Kristen Grauman

Matching with filters
• Goal: find the template (an eye patch) in the image
• Method 0: filter the image with the eye patch:

g[m, n] = \sum_{k,l} h[k, l]\, f[m+k, n+l]

where f is the image, h is the filter (the eye patch), and g is the response.

What went wrong?

[Figure: input image and the filtered output for Method 0]

Derek Hoiem

Matching with filters
Method 0 likes bright pixels where the filter values are above average and dark pixels where the filter values are below average, which is what went wrong.
• Goal: find the template (an eye patch) in the image
• Method 1: filter the image with a zero-mean eye patch:

g[m, n] = \sum_{k,l} \left( h[k, l] - \operatorname{mean}(h) \right) f[m+k, n+l]
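A minimal MATLAB sketch of Method 1, assuming im is a grayscale double image and tmpl is the template patch (illustrative names); the 0.9 threshold is an arbitrary example value:

% Minimal sketch of Method 1: correlate the image with a zero-mean template.
h = tmpl - mean(tmpl(:));                     % subtract the template's mean
scores = conv2(im, rot90(h, 2), 'same');      % flip the kernel so conv2 computes correlation
detections = scores > 0.9 * max(scores(:));   % keep only the strongest responses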

[Figure: input image, filtered image (scaled), and thresholded image, with true detections and false detections marked]

Derek Hoiem

Showing magnitude of responses

Kristen Grauman

Representing texture by mean abs response

[Figure: the filter bank (“Filters”) and the mean absolute response of the image to each filter (“Mean abs responses”)]

Derek Hoiem

Computing distances using texture

D(a, b) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}

D(a, b) = \sqrt{\sum_{i=1}^{\#\mathrm{dim}} (a_i - b_i)^2}

[Figure: points a and b plotted in a 2D feature space, Dimension 1 vs. Dimension 2]

Kristen Grauman

Feature detection: Harris

Corners as distinctive interest points
• We should easily recognize the keypoint by looking through a small window
• Shifting a window in any direction should give a large change in intensity

“flat” region: no change in all directions
“edge”: no change along the edge direction
“corner”: significant change in all directions

A. Efros, D. Frolova, D. Simakov

Harris Detector: Mathematics

Window-averaged squared change of intensity induced by shifting the image data by [u, v]:

E(u, v) = \sum_{x,y} w(x, y) \left[ I(x+u, y+v) - I(x, y) \right]^2

where w(x, y) is the window function, I(x+u, y+v) is the shifted intensity, and I(x, y) is the intensity. The window function w(x, y) is either 1 in the window and 0 outside, or a Gaussian.

D. Frolova, D. Simakov Harris Detector: Mathematics

Expanding I(x, y) in a Taylor series, we have, for small shifts [u, v], a quadratic approximation to the error surface between a patch and itself shifted by [u, v]:

E(u, v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}

where M is a 2×2 matrix computed from image derivatives:

D. Frolova, D. Simakov Harris Detector: Mathematics

M = \sum_{x,y} w(x, y) \begin{bmatrix} I_x I_x & I_x I_y \\ I_x I_y & I_y I_y \end{bmatrix}

Notation: I_x = \frac{\partial I}{\partial x}, \quad I_y = \frac{\partial I}{\partial y}, \quad I_x I_y = \frac{\partial I}{\partial x} \frac{\partial I}{\partial y}

K. Grauman What does the matrix M reveal?

1 0  T Since M is symmetric, we have M  X  X  0 2 

Mxi  i xi

The eigenvalues of M reveal the amount of intensity change in the two principal orthogonal gradient directions in the window.

K. Grauman Corner response function

“edge”: \lambda_1 \gg \lambda_2 (or \lambda_2 \gg \lambda_1)
“corner”: \lambda_1 and \lambda_2 are large, \lambda_1 \sim \lambda_2
“flat” region: \lambda_1 and \lambda_2 are small

Adapted from A. Efros, D. Frolova, D. Simakov, K. Grauman Harris Detector: Algorithm

• Compute image gradients I_x and I_y for all pixels
• For each pixel
  – Compute the second moment matrix M = \sum_{x,y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix} by looping over its neighbors x, y
  – Compute the corner response R = \det(M) - k\,(\operatorname{trace} M)^2 = \lambda_1 \lambda_2 - k(\lambda_1 + \lambda_2)^2 (k: empirical constant, k = 0.04–0.06)
• Find points with large corner response function R (R > threshold)
• Take the points of locally maximum R as the detected feature points (i.e., pixels where R is bigger than for all the 4 or 8 neighbors)
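A minimal MATLAB sketch of this algorithm, assuming im is a grayscale double image; the 5x5 box window, the 0.01·max(R) threshold, and k = 0.04 are example choices rather than prescribed values:

% Minimal sketch of the Harris corner detector.
k  = 0.04;                                        % empirical constant
dx = [1 0 -1; 2 0 -2; 1 0 -1];   dy = dx';        % derivative filters
Ix = conv2(im, dx, 'same');   Iy = conv2(im, dy, 'same');
win = ones(5);                                    % 5x5 box window w(x, y)
Sxx = conv2(Ix.^2,  win, 'same');                 % entries of M, summed over the window
Syy = conv2(Iy.^2,  win, 'same');
Sxy = conv2(Ix.*Iy, win, 'same');
R = (Sxx .* Syy - Sxy.^2) - k * (Sxx + Syy).^2;   % R = det(M) - k*trace(M)^2
thresh = 0.01 * max(R(:));                        % example threshold
corners = [];
for y = 2:size(R, 1)-1
    for x = 2:size(R, 2)-1
        nbhd = R(y-1:y+1, x-1:x+1);
        if R(y, x) > thresh && R(y, x) == max(nbhd(:))   % local max over the 8 neighbors
            corners = [corners; x, y, R(y, x)];
        end
    end
end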

D. Frolova, D. Simakov

Example of Harris application

K. Grauman

Feature detection: Scale-invariance

Invariance vs covariance

“A function is invariant under a certain family of transformations if its value does not change when a transformation from this family is applied to its argument. A function is covariant when it commutes with the transformation, i.e., applying the transformation to the argument of the function has the same effect as applying the transformation to the output of the function. […] [For example,] the area of a 2D surface is invariant under 2D rotations, since rotating a 2D surface does not make it any smaller or bigger. But the orientation of the major axis of inertia of the surface is covariant under the same family of transformations, since rotating a 2D surface will affect the orientation of its major axis in exactly the same way.”

“Local Invariant Feature Detectors: A Survey” by Tinne Tuytelaars and Krystian Mikolajczyk, Foundations and Trends in Computer Graphics and Vision, Vol. 3, No. 3 (2007), pp. 177–280. Chapters 1, 3.2, 7. http://homes.esat.kuleuven.be/%7Etuytelaa/FT_survey_interestpoints08.pdf

What happens if: Affine intensity change

I → a I + b

• Only derivatives are used => invariance to intensity shift I → I + b
• Intensity scaling: I → a I

[Figure: corner response R vs. x (image coordinate), with a threshold line, before and after the intensity change]

Partially invariant to affine intensity change

L. Lazebnik What happens if: Image translation

• Derivatives and window function are shift-invariant

Corner location is covariant w.r.t. translation

L. Lazebnik What happens if: Image rotation

Second moment ellipse rotates but its shape (i.e. eigenvalues) remains the same

Corner location is covariant w.r.t. rotation

L. Lazebnik What happens if: Scaling

Corner

All points will be classified as edges

Corner location is not covariant to scaling!

L. Lazebnik Scale Invariant Detection

• Problem: – How do we choose corresponding circles independently in each image? – Do objects in the image have a characteristic scale that we can identify?

D. Frolova, D. Simakov Scale Invariant Detection

• Solution: – Design a function on the region which is “scale invariant” (has the same shape even if the image is resized) – Take a local maximum of this function

[Figure: the scale-invariant function f plotted against region size for Image 1 and for Image 2 (shown at 1/2 the scale); the maxima occur at corresponding region sizes s1 and s2]

Adapted from A. Torralba

Automatic Scale Selection

• Function responses for increasing scale (scale signature): evaluate f(I_{i_1 \ldots i_m}(x, \sigma)) in each image over a range of scales \sigma and take the scale at which the response attains a local maximum

K. Grauman, B. Leibe

What Is A Useful Signature Function?
• Laplacian of Gaussian = “blob” detector

K. Grauman, B. Leibe Difference of Gaussian ≈ Laplacian • We can approximate the Laplacian with a difference of Gaussians; more efficient to implement.

2 L Gxx(,,)(,,) x y  G yy x y   (Laplacian)

DoG G(,,)(,,) x y k G x y (Difference of Gaussians) Difference of Gaussian: Efficient computation • Computation in Gaussian scale pyramid

Difference of Gaussian: Efficient computation
• Computation in Gaussian scale pyramid

[Figure: Gaussian scale pyramid built from the original image, with the blur increasing level by level and subsampling once the blur has grown large enough (e.g. sampling with step 2)]

K. Grauman, B. Leibe Find local maxima in position-scale space of Difference-of-Gaussian

Position-scale space: find places where the Difference-of-Gaussian response is greater than all of its neighbors in both position and scale (the neighbors are shown in green in the figure); the output is a list of (x, y, s).

Adapted from K. Grauman, B. Leibe

Laplacian pyramid example
• Allows detection of increasingly coarse detail

Results: Difference-of-Gaussian

K. Grauman, B. Leibe

Feature description

Gradients

Gradient magnitude and orientation: m(x, y) = \sqrt{I_x^2 + I_y^2}, \; \Theta(x, y) = \tan^{-1}(I_y / I_x). For example, at a pixel where I_x = 1 and I_y = 0: m(x, y) = sqrt(1 + 0) = 1 and Θ(x, y) = atan(0/1) = 0.

Scale Invariant Feature Transform
Full version
• Divide the 16x16 window into a 4x4 grid of cells (2x2 case shown below)
• Quantize the gradient orientations, i.e. snap each gradient to one of 8 angles
• Each gradient contributes not just 1, but magnitude(gradient) to the histogram, i.e. stronger gradients contribute more
• 16 cells * 8 orientations = 128 dimensional descriptor for each detected feature

Adapted from L. Zitnick, D. Lowe

Scale Invariant Feature Transform
Full version (continued)
• Normalize the descriptor to unit length, clip (threshold) each entry at 0.2, and normalize again
• After normalizing and clipping, every entry f_i of the descriptor satisfies f_i ≤ 0.2
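A simplified MATLAB sketch of the descriptor just described, assuming mag and ori are 16x16 arrays of gradient magnitudes and orientations (in radians) around a keypoint; it omits the Gaussian weighting and soft binning that full SIFT also uses:

% Simplified sketch of the SIFT-style descriptor for one 16x16 patch.
desc = zeros(1, 128);
bin_of = @(th) mod(floor((th + pi) / (2*pi/8)), 8) + 1;   % snap each angle to one of 8 bins
for cy = 1:4
    for cx = 1:4
        cell_rows = (cy-1)*4 + (1:4);                     % one 4x4 cell of the grid
        cell_cols = (cx-1)*4 + (1:4);
        hist8 = zeros(1, 8);
        for r = cell_rows
            for c = cell_cols
                b = bin_of(ori(r, c));
                hist8(b) = hist8(b) + mag(r, c);          % stronger gradients contribute more
            end
        end
        desc(((cy-1)*4 + cx - 1)*8 + (1:8)) = hist8;      % 16 cells x 8 bins = 128 dims
    end
end
desc = desc / norm(desc);    % normalize to unit length
desc = min(desc, 0.2);       % clip entries at 0.2
desc = desc / norm(desc);    % re-normalize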

Adapted from L. Zitnick, D. Lowe

Making descriptor rotation invariant


• Rotate patch according to its dominant gradient orientation • This puts the patches into a canonical orientation

K. Grauman, image from Matthew Brown

Keypoint matching

Matching local features

[Figure: a feature detected in Image 1 and candidate patches in Image 2: which is the right match?]
• To generate candidate matches, find patches that have the most similar appearance (e.g., lowest feature Euclidean distance)
• Simplest approach: compare them all, take the closest (or closest k, or within a thresholded distance)

K. Grauman Robust matching

[Figure: one feature in Image 1 with several candidate matches in Image 2]
• At what Euclidean distance value do we have a good match?
• To add robustness to matching, can consider the ratio: distance to best match / distance to second best match
• If low, the first match looks good.
• If high, it could be an ambiguous match.

K. Grauman

Ratio: example

• Let q be the query from the first image, d1 be the closest match in the second image, and d2 be the second closest match
• Let dist(q, d1) and dist(q, d2) be the distances
• Let r = dist(q, d1) / dist(q, d2)
• What is the largest that r can be?
• What is the lowest that r can be?
• If r is 1, what do we know about the two distances?
• What about when r is 0.1?
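A minimal MATLAB sketch of matching with the ratio test, assuming desc1 and desc2 are N1 x 128 and N2 x 128 descriptor matrices; the 0.8 ratio threshold is an example value:

% Minimal sketch of feature matching with the ratio test.
matches = [];                                % rows of [index in image 1, index in image 2]
for i = 1:size(desc1, 1)
    diffs = bsxfun(@minus, desc2, desc1(i, :));
    d = sqrt(sum(diffs.^2, 2));              % Euclidean distance to every descriptor in image 2
    [sorted_d, idx] = sort(d);
    r = sorted_d(1) / sorted_d(2);           % distance to best match / distance to second best
    if r < 0.8                               % low ratio: distinctive, unambiguous match
        matches = [matches; i, idx(1)];
    end
end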

Indexing local features: Setup
• When we see close points in feature space, we have similar descriptors, which indicates similar local content.

[Figure: the query image's descriptor and the database images' descriptors as points in feature space]

K. Grauman

Image matching

Describing images w/ visual words

• Summarize the entire image based on its distribution (histogram) of visual word occurrences.
• Analogous to the bag of words representation commonly used for documents.

[Figure: feature patches mapped to visual words, with a histogram of how many times each word appears]

K. Grauman

Bag of visual words: Two uses

1. Represent the image
2. Using that representation, look for similar images
3. Can also use BOW to compute an inverted index, to simplify application #2

Visual words: main idea

• Extract some local features from a number of images …

e.g., SIFT descriptor space: each point is 128-dimensional

D. Nister, CVPR 2006

“Quantize” the space by grouping (clustering) the features. Note: For now, we'll treat clustering as a black box.

D. Nister, CVPR 2006

Inverted file index and bags of words similarity

[Figure: inverted file index mapping each visual word (e.g. w91) to the database images that contain it]

1. (offline) Extract features in database images, cluster them to find words, make index
2. Extract words in query (extract features and map each to the closest cluster center)
3. Use inverted file index to find frames relevant to query
4. For each relevant frame, rank them by comparing word counts (BOW) of query and frame

Adapted from K. Grauman
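A minimal MATLAB sketch of steps 2 and 4, assuming centers is a K x 128 matrix of visual-word centers (from the clustering, treated as a black box), descs is an M x 128 matrix of descriptors from the query, and db_hists is a D x K matrix of word-count histograms for the database images (all names illustrative); for simplicity it scores every database image directly instead of going through the inverted index:

% Minimal sketch: build the query's bag-of-words histogram and rank database images.
K = size(centers, 1);
bow = zeros(1, K);
for i = 1:size(descs, 1)
    d = sum(bsxfun(@minus, centers, descs(i, :)).^2, 2);  % distance to every word center
    [~, w] = min(d);                                      % nearest visual word
    bow(w) = bow(w) + 1;
end
bow = bow / norm(bow);                                    % normalize the query histogram
db_norm = bsxfun(@rdivide, db_hists, sqrt(sum(db_hists.^2, 2)));
scores = db_norm * bow';                                  % cosine similarity, one score per image
[~, ranking] = sort(scores, 'descend');                   % most similar database images first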

Scoring retrieval quality

Example setup: a query (e.g. images of the Golden Gate) against a database of 10 images, 5 of which are relevant in total; the results are returned in ranked order.

precision = # returned relevant / # returned
recall = # returned relevant / # total relevant
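With the numbers on this slide (a database of 10 images, 5 of them relevant), suppose a hypothetical query returns 4 images and 3 of them are relevant: precision = 3/4 = 0.75 and recall = 3/5 = 0.6. Returning more results can only keep or increase recall, but it typically lowers precision, and this trade-off is what the precision-recall curve traces out.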

[Figure: precision-recall curve, with precision on the y-axis and recall on the x-axis, both ranging from 0 to 1]

Ondrej Chum