Precise Image Registration and Occlusion Detection

A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By Vinod Khare, B. Tech.

Graduate Program in Civil Engineering
The Ohio State University
2011

Master's Examination Committee:
Asst. Prof. Alper Yilmaz, Advisor
Prof. Ron Li
Prof. Carolyn Merry

© Copyright by Vinod Khare 2011

Abstract

Image registration and mosaicking are fundamental problems in computer vision. The large number of approaches developed to this end can be broadly divided into two categories: direct methods and feature-based methods. Direct methods shift or warp the images relative to each other and measure how well the pixels agree. Feature-based methods estimate a parametric transformation between two images from point correspondences. In this work, we extend the standard feature-based approach to multiple images and adopt the photogrammetric process to improve the accuracy of registration. In particular, we use a multi-head camera mount that provides multiple non-overlapping images per time epoch, and we use multiple epochs, which increases the number of images considered during the estimation process. The existence of a dominant scene plane in 3-space, visible in all the images acquired from the multi-head platform, is formulated in a bundle block adjustment framework in image space and provides precise registration between the images. We also develop an appearance-based method for detecting potential occluders in the scene. Our method builds upon existing appearance-based approaches and extends them to multiple views.

To my mother

Acknowledgments

I would like to extend my heartfelt gratitude to my advisor, Dr. Alper Yilmaz, whose invaluable guidance in work and personal life made this work possible. His never-ending support was valuable every step of the way.
He made sure that he was always available and had constructive suggestions for my research. I would also like to thank Dr. Li and Dr. Merry for their participation as committee members for my Master's Thesis Examination. I would like to thank all the members of the Photogrammetric Computer Vision Lab for a friendly and intellectually stimulating environment that made my stay at OSU the most memorable experience of my life. I would also like to thank my friends at Ohio State for some of the most fun times ever.

Vita

April 1986: Born, Indore, India
June 2007: B. Tech. Civil Engineering, Indian Institute of Technology Kanpur, India
September 2008 - present: Graduate Student, Department of Civil & Environmental Engineering and Geodetic Science, The Ohio State University

Publications

V. Khare, A. Yilmaz, and O. Mendoza-Schrok. Precise Image Registration and Occlusion Detection. In Proceedings of the IEEE National Aeronautics and Aerospace Electronics Conference, 2010.

V. Khare, S. Yadav, A. Rastogi, and O. Dikshit. Textural Classification of SAR Images. In Proceedings of the XXVII INCA International Congress, 2007.

Fields of Study

Major Field: Civil Engineering

Table of Contents

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

1. Introduction
   1.1 Overview
   1.2 Related Work
   1.3 Motivations and Contributions
   1.4 Organization of the Thesis
   1.5 Notation

2. Projective Geometry and Homography Estimation
   2.1 Projective Space and Homogeneous Coordinates
   2.2 Projective Transformations and Planar Homography
   2.3 Estimation of Planar Homography
       2.3.1 Number of Measurements Required
       2.3.2 Direct Linear Transform (DLT)
   2.4 Cost Functions
       2.4.1 Algebraic Distance
       2.4.2 Symmetric Transfer Error
       2.4.3 Reprojection Error
   2.5 Transformation Invariance and Normalization
       2.5.1 Image Coordinate Transforms
       2.5.2 Non-Invariance of DLT
       2.5.3 Invariance of Geometric Error
       2.5.4 Normalization
   2.6 Iterative Minimization
       2.6.1 Gold Standard Algorithm

3. Image Mosaicking
   3.1 Key-point Detection
       3.1.1 Harris and Stephens' Corner Detector
       3.1.2 Scale Invariant Feature Transform (SIFT)
   3.2 Key-point Matching
   3.3 Robust Estimation (RANSAC)

4. Image Registration for Multi-Head Cameras
   4.1 Projection Model and Initial Estimates
   4.2 Algorithm Overview
       4.2.1 Key-point Detection and Matching
       4.2.2 Image Connectivity Graph
       4.2.3 Initial Estimation by Graph Traversal
       4.2.4 Optimizing with the Levenberg-Marquardt Algorithm
   4.3 Occlusion Detection

5. Experiments and Results
   5.1 Data Description
   5.2 Experiments with Simulated Data
   5.3 Registration of CLIF Dataset
   5.4 Details of Occlusion Detection

6. Conclusions and Future Work

Bibliography

Appendices

A. Software Implementation Details
   A.1 Software Environment
   A.2 Input
   A.3 Caching
   A.4 Output
   A.5 Running the Program
   A.6 Viewing Results
   A.7 Notes for Programmers

List of Tables

2.1 Characteristics of various geometries and invariants therein.
5.1 RMS errors for the registration of the simulated dataset.

List of Figures

2.1 Points become lines and lines become planes in projective space.
2.2 A central projection projects all points from one plane to another along rays passing through a perspective center. This is a simpler case of projectivity called perspectivity.
2.3 Examples of 2D homography, x′ = Hx. A planar homography is induced between two images by a world plane.
2.4 An illustration of (a) symmetric transfer error and (b) reprojection error. The points x and x′ are measured points and hence do not correspond perfectly because of noise.
The estimated points x̂ and x̂′ do match perfectly for the estimated homography Ĥ.
3.1 Detecting key-points with SIFT and finding geometrically consistent points with RANSAC. The upper figure shows the key-points detected with SIFT and matched using BBF. The lower figure shows matches geometrically consistent with the homography calculated using RANSAC.
3.2 Mosaicking images after estimating the homography using RANSAC.
4.1 The projection model used for bundle adjustment using Levenberg-Marquardt estimation.
4.2 Flow chart depicting the various stages of registering images from a multi-head camera.
4.3 The number of points detected decreases as one moves down the scale-space octaves.
4.4 Schematic of an image connectivity graph.
4.5 A typical scenario in creating the mosaic. The images I1 … In−1 are warped into the reference image space I0 as I′1 … I′n−1 to create the panorama.
5.1 The relative positioning of the cameras as seen from the camera viewpoint onboard the aircraft.
5.2 The images from the six camera heads arranged side by side. Note that there is some overlap between the images because the principal axes of the cameras are not parallel.
5.3 Capturing simulated data points over an urban terrain with two camera heads in linear motion. (a) Two camera heads forming the multi-head camera move in an oblique direction to simulate flight for four time epochs. (b), (c), (d) Random terrains are generated with a 'flat' ground and random high-rise 'buildings'. Data points are captured for the four time epochs.
5.4 Bundle adjustment of the synthetic dataset. Points captured for epoch 2 are projected onto points captured for epoch 1. (a), (b), and (c) plot the points for three random datasets. Points are divided into ground points and potential occluders (building points). Ground points register accurately, while building points show misregistration due to parallax.
5.5 'Mosaicking' for adjacent cameras in the simulated dataset.
Despite having no overlap between them, adjacent cameras are connected in the connectivity graph and hence can be 'stitched' together. The projected points are compared to the real point locations.
5.6 Mosaicking of two images. The two images taken from camera 2, shown in (a) and (b), were part of the bundle adjustment and have been precisely registered in (c). The difference in pixel values is visible in (d). The difference image was contrast-enhanced to improve visibility.
5.7 Mosaicking of two images. The two images taken from camera 4, shown in (a) and (b), were part of the bundle adjustment and have been precisely registered in (c). The difference in pixel values is visible in (d). The difference image was contrast-enhanced to improve visibility.
5.8 Comparison with the traditional approach. (a) shows the mosaic generated for 20 epochs using the traditional approach of sequentially registering pairs of images. (b) shows the same images registered using our method. Note that in (a) the misregistration increases as we move away from the reference image, and our approach (b) successfully reduces this misregistration.
5.9 Mosaic of the CLIF dataset across 20 epochs. This mosaic is generated using pixel values from only one transformed image.
5.10 Occlusion detection on the CLIF dataset across 20 epochs. The occlusion is overlaid in red.
5.11 Mosaic of the CLIF dataset across 20 epochs. This mosaic is generated by blending pixel values from all the transformed images.
5.12 (a) Mosaic across 20 epochs for a different area. (b) Mosaic with overlaid occlusions.
5.13 Mosaic across 150 epochs. (a) shows the panorama and (b) shows the overlaid potential occluders.
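The abstract and the outlines of Chapters 2 and 3 describe the core building block of the registration pipeline: estimating a planar homography x′ = Hx from point correspondences via the normalized Direct Linear Transform. As a minimal illustration only (not the thesis implementation, which is detailed in Appendix A; function names and the NumPy formulation here are the editor's assumptions), a sketch of normalized DLT on synthetic correspondences:

```python
import numpy as np

def normalize(pts):
    # Translate points so the centroid is at the origin and scale so the
    # mean distance from the origin is sqrt(2) (Section 2.5.4 normalization).
    centroid = pts.mean(axis=0)
    s = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])  # homogeneous coords
    return (T @ pts_h.T).T, T

def dlt_homography(x, xp):
    # Estimate H such that xp ~ H x, from >= 4 point correspondences,
    # using the normalized DLT (Section 2.3.2): build the 2n x 9 design
    # matrix and take the right singular vector of the smallest singular value.
    xn, T = normalize(x)
    xpn, Tp = normalize(xp)
    rows = []
    for (u, v, w), (up, vp, wp) in zip(xn, xpn):
        rows.append([0, 0, 0, -wp*u, -wp*v, -wp*w,  vp*u,  vp*v,  vp*w])
        rows.append([wp*u, wp*v, wp*w, 0, 0, 0, -up*u, -up*v, -up*w])
    _, _, Vt = np.linalg.svd(np.array(rows))
    Hn = Vt[-1].reshape(3, 3)
    H = np.linalg.inv(Tp) @ Hn @ T  # undo the normalizing transforms
    return H / H[2, 2]

# Recover a known homography from noise-free synthetic correspondences.
H_true = np.array([[1.0, 0.1, 5.0], [0.05, 0.9, -3.0], [1e-4, 2e-4, 1.0]])
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, size=(20, 2))
xh = np.column_stack([x, np.ones(20)])
xph = (H_true @ xh.T).T
xp = xph[:, :2] / xph[:, 2:]
H_est = dlt_homography(x, xp)
print(np.allclose(H_est, H_true, atol=1e-6))
```

In the thesis pipeline this linear estimate would be computed inside a RANSAC loop over SIFT matches (Chapter 3) and then refined by minimizing a geometric cost with Levenberg-Marquardt (Chapter 2's Gold Standard algorithm); the sketch above covers only the linear estimation step.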