PERFORMANCE EVALUATION FOR FULL 3D PROJECTOR CALIBRATION

METHODS IN SPATIAL AUGMENTED REALITY

A Thesis

Submitted to

the Temple University Graduate Board

In Partial Fulfillment

Of the Requirements for the Degree

MASTER OF SCIENCE

in ELECTRICAL ENGINEERING

By

Michael Korostelev

August, 2011

Thesis Approval(s):

Dr. Li Bai, Thesis Adviser, Electrical Engineering

Dr. Seong Kong, Electrical Engineering

Dr. Robert Yantorno, Electrical Engineering

Abstract

Spatial Augmented Reality (SAR) has presented itself as an interesting tool not only for novel ways of visualizing information but also for developing creative works in the performance arts. The main challenge is to determine accurate geometry of a projection space and to determine an efficient and effective way to project digital media and information to create an augmented space. In our previous implementation of SAR, we developed a projector-camera calibration approach using infrared markers. However, the projection suffers severe distortion due to the lack of depth information in the projection space. For this research, we propose to develop an RGBD sensor-projector system to replace our current projector-camera SAR system. Proper calibration between the camera or sensor and the projector links vision to projection, answering the question of which point in camera space maps to what point in the space of projection. Calibration will resolve the problem of capturing the geometry of the space and allow us to accurately augment the surfaces of volumetric objects and features. In this work three calibration methods are examined for performance and accuracy. Two of these methods are existing adaptations of 2D camera-projector calibrations (calibration using arbitrary planes and ray-plane intersection), while the third is our proposed novel technique, which utilizes point cloud information from the RGBD sensor directly. Through analysis and evaluation using re-projection error, results are presented, identifying the proposed method as practical and robust.

Contents

Abstract i

Contents ii

List of Figures v

1 Introduction 1

1.1 Art and Engineering ...... 1

1.2 Augmented Reality ...... 3

1.2.1 Types of AR ...... 5

1.2.2 Advances in the Field of SAR ...... 9

1.3 Previous Work at Temple ...... 9

2 Current and Related Work 11

2.1 "Dance and Engineering in Augmented Reality" ...... 12

2.1.1 Methodology ...... 13

2.1.2 Calibration ...... 14

2.1.3 Detection ...... 19

2.1.4 Tracking ...... 23

2.1.5 Warping and Projection ...... 29

2.2 Microsoft Kinect "Hacks" ...... 32

2.2.1 Kinect at Temple ...... 35

2.2.2 Getting Usable Sensor Data ...... 35

2.2.3 Interactive Particle Systems ...... 37

2.2.4 Superpowers with Projection ...... 38

2.2.5 Kinect 3D modeling ...... 39

2.2.6 The Point Cloud Library ...... 40

3 Calibration Methods 41

3.1 Projector Calibration Methods ...... 41

3.1.1 Arbitrary Planes and Calibrated Camera ...... 43

3.1.2 Ray-Plane Intersection Method ...... 46

3.2 3D Sensor - Projector Calibration ...... 51

3.2.1 Calibration of depth and image sensors ...... 54

3.2.2 Homography Based Alignment for Two View Correspondence . 57

3.2.3 Stereo Calibration for Two View Correspondence ...... 59

3.2.4 Comparison Between Homography Based and Stereo Calibration for Two View Correspondence ...... 62

3.2.5 Extracting Depth Information ...... 64

3.2.6 Use 3D points for projector intrinsics ...... 65

3.2.7 Determining Projector Intrinsics ...... 68

3.2.8 Find R and t between sensor and Projector, Extrinsics . . . . 73

4 Implementation 77

4.1 Displaying the Augmented Scene ...... 77

4.2 Evaluating Projection Performance ...... 79

5 Discussion and Conclusions 83

5.1 Performance ...... 83

5.2 Application ...... 85

5.3 Future Work ...... 86

5.4 Conclusion ...... 87

References 90

List of Figures

1 Jan Borchers, René Bohne and Gero Herkenrath, LumiNet Jacket and LumiNet, 2008, Media Computing Group at RWTH Aachen University, Germany 2

2 In this simple example a printed pattern is recognized by the phone’s camera, then augmented within the camera view with the cube. As the phone moves around the pattern, the cube is rendered in the correct place with correct pose...... 4

3 The Video See Through HMD(1) is the AR display. The display is combined with battery powered(3) portable computer(2) that is capable of real time tracking and reconstruction to allow for robust vision based AR applications...... 6

4 In our current system, an infra-red camera (b) senses infra-red markers in- tegrated into a panel (a). Software (c) then manipulates the coordinates of the markers to transform an image and then send it to a projection system (d) to impose it correctly into the environment...... 13

5 The homography is an invertible transformation from some plane in a projective space to another projective plane; even though the homography transformation is invariant to scale, it maps straight lines to straight lines...... 15

6 The projective rectification here is transforming the perspective warped camera image to the ideal parallel image...... 16

7 The homography we determined perspective warps the captured image to undistort the cameras view and transform it into the projected image plane. . . . 17

8 The drawback with the current calibration system exists since the 2D homography is only valid for the plane at which calibration was done, so at different distances, parallax problems are observed...... 18

9 Using the Hough transform in OpenCV allows for the detection of edges; analyzing these edges, we can find quadrangles in the image...... 20

10 Infrared fiducial markers are made from two IR LEDs diffused by ping-pong ball hemispheres...... 21

11 The camera view is thresholded based on luminance in the red channel. It can be seen that using the LEDs is a good solution for a dynamic application such as dance where tracking success is very important...... 22

12 The projection panel a consists of a plain white cardboard with an IR LED marker at each corner and one in the middle, off center. This is tracked and overlaid with projection in b...... 25

13 The detected blobs are classified by comparing the set to an ideal blob pattern. Each frame, we cycle through blob orders and compute a perspective transformation between the ideal and current pattern configuration...... 26

14 The final result of the warped media in OpenGL, when this is made full screen in the projector’s window, it overlays the projection panel...... 31

15 The final result of the warped media in OpenGL, when this is made full screen in the projector’s window, it overlays the projection panel...... 32

16 In the Microsoft Kinect a laser beam refracted through a lens creates a matrix of laser dots in the space; based on the disparity in the dots, the depth of objects in the space can be determined...... 33

17 The depth information captured with an open source Kinect driver is 640 x 480 x 640. The units are all relative to the origin of the camera...... 34

18 The smoothed depth image from the raw Kinect data is determined with the help of OpenCV's inpaint technique...... 36

19 The particle system uses attraction and repulsion forces for each particle as well as a larger attraction force to user’s hands that are detected in the scene. . . . . 37

20 A user waves a stick which attracts a swarm of particles...... 38

21 Lightning and trail effects are projected over the performer’s hands creating an illusion of superpowers...... 38

22 Virtual Playdoh allowed users to mold a warp-able NURBS surface using both of their hands and rendering in anaglyphic 3D...... 39

23 The pinhole camera model maps real space to image space...... 42

24 The two degrees of freedom, 1 in rotation and 1 in translation...... 46

25 The extrinsic parameters, that are used to compute the plane equation. . . . 49

26 Ray plane intersection in its entirety. The object is to determine vectors V from the camera center, to the projected calibration pattern...... 50

27 The depth and RGB image on top of each other, it can be seen that they do not perfectly correspond...... 54

28 Calibration [12] of depth and RGB cameras on the 3D sensor ...... 55

29 Physical 3D chessboard calibration pattern used that can be seen in both RGB and range cameras...... 56

30 Physical 3D chessboard calibration pattern with detected corners...... 57

31 Mismatched chessboard corners are matched after multiplying the RGB image by the determined homography...... 59

32 When not calibrated the position of each corner does not correspond. With a disparity of close to 30 pixels...... 63

33 Plot of correspondence when calibrated with homography...... 63

34 Plot of correspondence when calibrated with stereo calibration...... 64

35 The distance sensed in meters vs. ambiguous units.[12] ...... 66

36 Determining metric distance for object points in the camera...... 66

37 The projector's camera matrix provides information that will allow us to understand the perspective of the projected image...... 68

38 Scene generated with intrinsic matrix K1 ...... 71

39 Scene generated with intrinsic matrix K2 ...... 72

40 Scene generated with intrinsic matrix K3 ...... 72

41 Scene generated without the use of intrinsic matrix...... 73

42 Where $H_\pi$ is a planar homography and $e'$ are the epipoles, resulting in a fundamental matrix: $F = [e']_{\times} H_\pi$ ...... 74

43 The scene with default R and T, and on the right, the scene with R and T input. 77

44 On the left, the virtual 3D scene that is modified with rotation, translation and projection matrix, on the right, the scene overlaid by projection...... 77

45 The OpenGL viewing pipeline ...... 78

46 The reprojection was evaluated by projecting a virtual chessboard pattern over a real one...... 80

47 This plot compares the reprojection error between ray plane intersection and the proposed novel RGBD sensor method...... 81

48 This plot compares the distributions of the pixel locations determined with both projector calibration methods...... 82

49 A 3D scene overlay is demonstrated. Accurate overlay persists at all depths. . 85

1 Introduction

The goal of this work is to explore innovative techniques for calibrating an image projector with a 3D sensor as well as compare the technique specific to 3D sensors with projector-camera calibration for accuracy and robustness. The work is structured within the framework of a Spatial Augmented Reality (SAR) project with application to creative works in performance art. Application to performance art allows us to take more liberties in terms of technology as well as pushes our creative limits as engineers. The technology enables artists to augment and overlay spaces with virtual information and media, enhancing the user experience by making information easier to absorb as it becomes a natural part of the environment.

1.1 Art and Engineering

As computing and technology become a part of life, people build their lives around technology in virtually all fields of study and areas of interest. This becomes more problematic in creative disciplines, however. Science, technology, engineering and mathematics (STEM) have steep learning curves, and people who are interested in using these concepts creatively are left with minimal resources, most of which are out of the scope of their projects.

In recent years artists have been trying to adapt to the world of technology, and new creative fields like graphic and interactive design or video editing have emerged. These, however, focus on artists using available tools (software packages, equipment) designed by engineers and computer scientists. So even though they use these tools creatively, instead of involving STEM in their artworks they are largely consuming STEM without really using it creatively. Even more recently, small communities of artists have been building their works specifically around technology. Without academic exposure to STEM concepts they have to overcome difficulties and learn by "hacking", from trial and error, and by using the small amount of resources available on the Internet.

Figure 1: Jan Borchers, René Bohne and Gero Herkenrath, LumiNet Jacket and Lu- miNet, 2008, Media Computing Group at RWTH Aachen University, Germany

This combination of art and the use of technology has produced some very interesting works. Figure 1 shows a garment made to glow with light patterns in a performance setting. A thesis from an art school, Pratt Institute, published in 2010 states that "[capabilities within electronics] allows artists and designers to have more control of their tools and materials, including the option to create their own." [9, p. 61]

The fact that theses of this type are being published by art students, especially with the matters that they focus on, shows that efforts from our field as engineers are relevant, timely and address the problems that creative people are having in their communities. Gibb shows that the most important aspects of technology for creative people are accessibility, openness and, most importantly, simple implementation for rapid prototyping. [9, p. 66]

1.2 Augmented Reality

Augmented Reality (AR) is the art and science of interconnecting the virtual and the real. In AR, the real environment is not suppressed; rather than immersing a person into a completely synthetic world, AR attempts to embed synthetic supplements into the real environment. AR is an inexpensive technology but a rich medium for creative works, and creative works have the potential to take full advantage of what AR has to offer. AR is at the forefront of what technology can offer and poses research questions that exercise many areas of expertise. Imagine the near inevitable future where we take it for granted that virtual content is merged with our perception of and interaction with the world around us. Imagine a Jabberwocky roaming your desk, surgeons with true x-ray vision, a dancer creating supernovae with every plié, soldiers with extra sensory perception; imagine a world without bounds.

The main research challenges in AR currently deal with various areas of engineering and computer science. Tracking and registration are the most fundamental problems researchers are dealing with currently. In order to produce a convincing augmented reality experience, both the user and the real world environment around them, as well as the virtual elements produced by augmentation, must be placed onto some global coordinate system. These tracking techniques are usually camera based, but other sensors are sometimes also employed. The goal, of course, is to use the most non-invasive methods, meaning the user or the environment would need minimal equipment. To achieve this, optical tracking is highly preferred over incorporating hardware sensing onto the user or objects.

Figure 2: In this simple example a printed pattern is recognized by the phone’s camera, then augmented within the camera view with the cube. As the phone moves around the pattern, the cube is rendered in the correct place with correct pose.

After tracking, the next challenge presented in AR is real time rendering. Since AR, as in Figure 2, mainly concentrates on superimposing the real environment with graphical elements, fast and realistic rendering methods play an important role. [17, p. 8] The challenge with real time rendering aims at making the integrated virtual elements indistinguishable from reality to the observer. In order for this type of realism to be produced, rendering has to happen in real time with no latency and with photo realism, meaning the virtual objects need to be convincingly integrated into the environment (properly lit, casting shadows and susceptible to occlusion by real objects).

The AR system emerges when real time rendering is combined with tracking and registration. Next comes the question of how to display or present the processed environment to effectively immerse the user or audience.

1.2.1 Types of AR

There are currently three main ways of presenting an AR interface: Head-Mounted Displays, Mobile Based and Projector Based. It is important to note that there are several required components common to each of the three techniques: there needs to be a way to sense for tracking and registration, a computing device to process information, and a way to display the augmented environment. These components are what create the contrasts between the three techniques.

Head-Mounted Displays or HMDs are self explanatory; they employ the use of some type of goggles with small displays for each eye and a video camera mounted on the user's head. The camera simulates vision for the user by displaying the video it is capturing to the goggles. As a result the user is essentially looking at the world through this camera. As the video is captured, it is sent to some processing device that interfaces with both the camera and goggles. This device processes and "augments" the captured video and displays the result in the goggles for the user. Finally, an altered environment is produced, giving the user an enhanced view of the environment.

Looking through the goggles can be a little awkward at first.

Figure 3: The Video See Through HMD (1) is the AR display. The display is combined with a battery powered (3) portable computer (2) that is capable of real time tracking and reconstruction to allow for robust vision based AR applications.

However, even though the visual experience is inconsistent with the real world, the human vision system adapts to the new environment very flexibly. Depending on the application, the goggles can be a very good improvement for human perception. Army aviation is an example in this regard, where the aviator often needs to see in every direction. Through the HMD, the aviator sees a variety of situational information, including pilotage imagery, tactical, and operational data. [16, p. 112]

Currently, the main drawback with HMD technology is portability. An operator is required not only to wear the display and camera; a wearable computer for processing the video is also required. Apart from the active components, another requirement is a power source. The portability constraint calls for trade-offs to be made between performance, size and power consumption.

Mobile-Based Augmented Reality In recent years, mobile devices have evolved to the point where their computation power as well as a wide selection of integrated sensors now allows for AR software to be implemented. Most available smart phones have a fast processor, a large display and a high resolution camera, and as with HMDs they can be used to view an augmented video feed of the environment. The main mobile software platforms are Google's Android and Apple's iPhone OS; both operating systems allow developers access to devices' integrated sensors and cameras and support a wide selection of software libraries for image processing and rendering, such as OpenCV and OpenGL.

Most of the available applications for these devices use the camera as the primary sensor for finding fiduciary markers known to the software as points of reference. As the markers are detected in the camera's image, their orientation provides geometry information and allows for an image to be rendered in its place in the video see-through. Other applications rely on compass, GPS and accelerometer sensors available in the devices. The purpose of these apps is to enhance users' sense of location, giving them a more intuitive interface to visualize things around them. Layar, an iPhone and Android application, uses the GPS and compass to overlay place-markers for local businesses in the camera view. An entertainment app, Zombie Attack, uses the GPS to place fictional zombies on the map with the objective being to avoid them.

Mobile based AR has an exciting potential as has already been demonstrated, but still requires the user to have some kind of equipment on their person and is limited by that equipment’s capabilities.

Spatial Augmented Reality, or SAR, is built around using a projector to display graphic information on the physical environment. The work in this thesis is centered around this technology. The main advantage of using a projector to display AR is that it removes the user from the hardware required to display and process information. This allows multiple users to experience the effects as well as collaborate in the augmented space. With the other two AR techniques (mobile device based AR and wearable computing AR), the experience is limited to a single user and leaves no potential for an audience experience.

SAR creates a rich medium for new immersive ways to present information, involve an audience and stimulate collaboration. There has already been interest in SAR for creative art projects. By existing in the same space as the exhibition, the audience can understand that their existence within the design community can be flexible and cross-disciplinary. [2] SAR has already been used in advertising to promote various high profile releases. For example, Sony used SAR last year to project a spectacular display onto Rochester Castle to promote the release of AC/DC's Iron Man 2 soundtrack.

Another advantage is that, since SAR requires no portable hardware, permanent installations can be constructed with significantly higher sensing and processing capability. However, the complexity of SAR is increased since tracking and registration must happen not only in the camera view; this view must also be projected out to properly overlay real world objects. This real world registration is the focus of this work. In order to project convincingly, the projector, camera and 3D sensors must be calibrated. Some work has been done already using traditional cameras and projectors, but with the introduction of 3D sensors, interesting new methods can be employed.

1.2.2 Advances in the Field of SAR

Many researchers have implemented outstanding AR demonstrations. However, these demonstrations are tailored to technology-savvy people and often lack cohesion and relevance to anything but AR. There is great demand in the research community for artistic and creative applications of AR. Also, from the creative world, many artists are aching to create work that incorporates innovative technologies. [1]

Each of the three methods of AR by nature overlay real world objects with digital information. Work has already been done by companies like BMW to apply this to maintenance of equipment. In their proof of concept material, a maintenance person looks through video goggles at a car engine. As he looks at the hardware, instructions are overlaid into his view. For example, the system would point to necessary bolts on the engine cover that need to be removed to reveal the engine. A system like this can similarly be used by soldiers in the field when approaching unfamiliar equipment, instantly giving them the ability to troubleshoot problems.

1.3 Previous Work at Temple

In the past year, we have explored each method of the technology of AR in a variety of ways. The main project involves ongoing work to implement an AR system into a performance by the Boyer College of Music and Dance. A fun project called Virtual Playdoh let users model objects in 3D with their hands. We have in the past looked at HMD systems as well as mobile based applications. A researcher in the Computer Fusion Laboratory has created an AR panorama capture application for Android. These projects have all been the focus of a lot of attention not only from web blogs but also from local news and inter-university publications.

2 Current and Related Work

In this chapter, related work that led up to this thesis will be outlined. In the past year, through our involvement with the School of Music and Dance, we have created many different applications for stage performance. All of these employ the use of active projection mapping in spatial augmented reality, where objects are overlaid with projected media. During the process of development, various problems have presented themselves that open up a vast area for fundamental research. In this chapter, it is important to pay special attention to the calibration procedures that create the relationship between the sensing device, such as the camera, and the projector, which feeds back into the camera view. This relationship is very important in order to perform overlay accurately, and in the past it has only been done with manual alignment, with only recent attempts at automatic calibration.

In this work we present a simplistic 2D calibration method, the proposed new implementations of existing full 3D calibration techniques, as well as a proposal of a totally novel method that uses an RGBD (RGB plus Depth) sensor. Our experiences with the RGBD sensor have proved that it is a great tool to enhance standard vision systems. Here we will discuss a standard camera-projector system which was able to track and overlay media onto a plain white panel; then several proof of concept implementations of the RGBD sensor are also outlined in this chapter.

Recently we wrote a paper entitled "Dance in Augmented Reality: Calibration and Applications" that has been accepted to the 8th ACM Conference on Creativity and Cognition. The following section will describe the preliminary work done in the area over the time of development of this thesis.

2.1 "Dance and Engineering in Augmented Reality"

This project focuses on discussing our experience with this exciting field, the collaboration with creative people and the effects it may have on enhancing STEM education, as well as some technical intricacies we deal with in the course of development of our own AR platform. This platform makes use of a simple camera along with a projector to visually augment a space with digital media.

We display a plain white panel marked off with infrared indicators which is tracked and augmented with video in real time. A vision and projection system that is resilient against noise and occlusion combined with creative media and artistic direction gives us the ability to augment a live performance with computer generated special effects. Digital effects have been widely implemented in films during post processing of video. Currently artists are challenged with static set design and limited lighting and sound capabilities. Intelligent projection mapping can improve this creative process, generate ideas from artists and enhance the performance for the audience.

2.1.1 Methodology

AR applications require the use of multi-view geometry to reconstruct a 3D model of the AR workspace from sequences of video frames. With this model of the workspace and the information about the camera’s gaze and position, it is possible to augment the scene with virtual objects. Because of the nature of the application we take a more simplistic approach as a trade-off for better performance. The system needs to be totally consistent in a live performance, so it is important to avoid potential failures at all costs.

Figure 4: In our current system, an infra-red camera (b) senses infra-red markers integrated into a panel (a). Software (c) then manipulates the coordinates of the markers to transform an image and sends it to a projection system (d) to impose it correctly into the environment.

Figure 4 generally outlines the mechanics of the projection system, a planar surface marked by infrared fiducial points is seen and detected with an infrared camera, then media is transformed and appropriately projected to accurately augment the scene.

To do this there are four general components:

• Calibration

• Detection

• Tracking

• Warping and Projection

Each of these components has the potential to be an area rich in fundamental research. In the next several sections, the implementation and integration of each component will be explained.

2.1.2 Calibration

In order to accurately overlay objects in real space with projection once the objects are detected, we need to figure out where each object is in relation to the projector that produces the overlay. This is a problem because the detection happens with a camera; we only know where the objects are in the two dimensional camera space represented by a pixel coordinate system. A relationship must be derived that essentially converts the camera coordinate system into the projector's image plane.

It is important to note that a projector can be considered to be like a camera, but in reverse. Assuming there is some displacement between the devices, at least four known image point pairs are needed, which are represented in the 2D coordinates of each image plane. Still thinking of the projector as a camera, we can say that two cameras viewing the same scene see objects from a different perspective. To make a relationship between the two cameras' views we have to consider perspective transformation.

Even though there are different methods to draw a scene, i.e. perspective projection, affine projection and orthographic projection, the way cameras and projectors see or output images is perspective projection. Thus our task is to transform one view (from the camera) to another (the projector).

This perspective transform is known as the homography [11]:

Figure 5: The homography is an invertible transformation from some plane in a projective space to another projective plane; even though the homography transformation is invariant to scale, it maps straight lines to straight lines.

$$\Pi_h = H \pi_h$$

$$\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} \sim \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$

$$x' = \frac{h_{11}x + h_{12}y + h_{13}}{h_{31}x + h_{32}y + h_{33}}, \qquad y' = \frac{h_{21}x + h_{22}y + h_{23}}{h_{31}x + h_{32}y + h_{33}}$$

The homography has 8 degrees of freedom, and to compute the 8 parameters of the homography, at least four point correspondences are required to obtain 8 equations. Since the homography is scale invariant, the last (ninth) parameter at (3, 3) is arbitrary and set to a unitary value by default.
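As a rough illustration of this step, the sketch below estimates such a homography from four correspondences with OpenCV and applies it to a new point. The point values and function name are hypothetical placeholders, not data or code from our system.

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

// Estimate a homography H from four (or more) known point pairs and use it
// to map a new camera point into the projector image plane.
cv::Mat estimateHomographyExample()
{
    std::vector<cv::Point2f> cameraPts, projectorPts;
    // Hypothetical correspondences: where the points appear in the camera image
    // and where the same points lie in the projector image plane.
    cameraPts.push_back(cv::Point2f(10, 12));    projectorPts.push_back(cv::Point2f(0, 0));
    cameraPts.push_back(cv::Point2f(620, 15));   projectorPts.push_back(cv::Point2f(800, 0));
    cameraPts.push_back(cv::Point2f(615, 470));  projectorPts.push_back(cv::Point2f(800, 600));
    cameraPts.push_back(cv::Point2f(8, 465));    projectorPts.push_back(cv::Point2f(0, 600));

    // Four pairs give the 8 equations needed for the 8 degrees of freedom;
    // with more pairs, findHomography solves in a least-squares sense.
    cv::Mat H = cv::findHomography(cameraPts, projectorPts);

    // Map an arbitrary camera pixel into the projector plane. perspectiveTransform
    // performs the division by the third homogeneous coordinate.
    std::vector<cv::Point2f> src(1, cv::Point2f(320, 240)), dst;
    cv::perspectiveTransform(src, dst, H);

    return H;
}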

Once the homography is determined, it can be used for projective rectification such as exemplified below in Figure 6 [11].

Figure 6: The projective rectification here is transforming the perspective warped camera image to the ideal parallel image.

Similar methodology is used for calibrating the projector-camera system. We

need to use the homography to perspective transform the camera image to fit the

projection. As stated, at least 4 points are required and since this is a 2 view system,

a correspondence needs to be made between the two devices. In a two camera system

this would actually be more difficult since pattern and feature detection would have

to be implemented for both views. However, since the projector is a reverse camera,

the information in its image plane is known without any sort of processing because

we know what is being projected.

To initiate calibration, a chessboard calibration pattern is projected onto a plane in space. A camera located close to the projector's lens then views this scene and attempts to detect the calibration pattern using the chessboard detection function available in OpenCV.

bool findChessboardCorners(const Mat& image, Size patternSize,
                           vector<Point2f>& corners,
                           int flags = CV_CALIB_CB_ADAPTIVE_THRESH + CV_CALIB_CB_NORMALIZE_IMAGE)

findChessboardCorners thresholds the camera’s image to determine the positions of corners of the squares on the chessboard grid. These points are defined as the places where the internal corners of the black squares meet or touch each other. Once found, the corners are ordered row by row and left to right. Depending on the input size of the chessboard, many points could be identified. In our calibration we use a 6 x 8 grid, giving us 48 points.

Figure 7: The homography we determined perspective warps the captured image to undistort the camera's view and transform it into the projected image plane.

This grid is projected, and once it is detected with the camera, a homography can be computed between the pixel locations of the projected points as seen in the camera and the pixel locations of the corners in the calibration pattern as specified in software. This relationship is effectively a perspective rectification between the projected pattern and its view in the camera. Figure 7 above shows the undistorted chessboard with the homography we obtain.
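A minimal sketch of this calibration step is shown below. The helper names (cameraFrame, projectedCorners) are hypothetical, and the pattern dimensions are only illustrative; this is not the exact calibration code of the system.

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

// cameraFrame: grayscale view of the projected 6 x 8 chessboard.
// projectedCorners: the 48 corner positions in projector pixel coordinates,
// known in advance because we generated the projected pattern ourselves.
cv::Mat calibrateHomography(const cv::Mat& cameraFrame,
                            const std::vector<cv::Point2f>& projectedCorners)
{
    cv::Size patternSize(8, 6);               // interior corners of the grid
    std::vector<cv::Point2f> cameraCorners;

    bool found = cv::findChessboardCorners(cameraFrame, patternSize, cameraCorners,
                                           CV_CALIB_CB_ADAPTIVE_THRESH +
                                           CV_CALIB_CB_NORMALIZE_IMAGE);
    if (!found)
        return cv::Mat();                     // pattern not visible, try again

    // H_C maps camera pixel coordinates into projector pixel coordinates.
    return cv::findHomography(cameraCorners, projectedCorners);
}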

Figure 8: The drawback with the current calibration system exists since the 2D homography is only valid for the plane at which calibration was done, so at different distances, parallax problems are observed.

The planar homography of a camera-projector system is valid only when an image is projected on a planar screen. This is because there is always a 2D homography between the image planes of any two projective imaging devices if the devices take pictures of a planar object. [11] This is a big drawback of the current calibration system. Since we have only calibrated at a single plane in the environment, the homography is not valid for any other plane in that environment. As you move away from the plane originally calibrated on, the re-projection error increases due to parallax problems. A similar method has been used by Soon-Yong Park and Go Gwang Park [18]. They solve this problem by calibrating to a physical chessboard pattern while projecting a marker and manually aligning. In this work, global camera-projector system calibrations are examined that overcome this problem.

Once we have obtained the homographic relationship between what the camera sees and what the projector projects, we save the 3 x 3 homography matrix for future use in the other components of the software. It is important to note that this calibration is only valid for a static projector and camera; once calibrated, neither must be moved. Having saved the homography $H_C$ we can start detecting objects in the camera view. These objects' coordinates can then be warped by the homography transformation and overlaid with media in the projector's view.

2.1.3 Detection

In the application, our goal is to track and overlay a white, rectangular projection panel. To overlay, we need the coordinates of fiducial points or the corners of the panel, as well as some information about its orientation. There is a variety of ways that can be implemented to track the surfaces.

One approach, proposed in "A Projector-based Movable Hand-held Display System" [14], discusses the implementation of a real-time quadrangle detection and tracking algorithm to detect and track the white panel. In this approach, line segments are detected and extracted from the camera's image using the Hough Transform algorithm; then, based on the line segment length, they are classified as lengths or widths of the projection panel.

The line segments $L = l_1, l_2, \ldots, l_n$ are considered features during the tracking and detection. They are obtained using the Hough line detector in OpenCV (below). Out of the lines detected in the image, consistent longer lines are examined. The quadrangle is assumed to have four lines that qualify by a set of criteria. Every segment should be longer than some threshold and the sides opposing each other should be of a similar length. Next, a criterion for the angle between the lines is also used; they should range somewhere between 30 and 150 degrees to compensate for perspective tilt. The authors also considered the ratio of overlap at each side, which needed to be at least 0.7. Here $s_1$ is the length of the line segment, $s_2$ the length of the side of the quadrangle once created, and $s_3$ the overlapping length of both $s_1$ and $s_2$.

The ratios of overlap that can be associated with each line segment of the quadrangle are:

$$r_1 = \frac{s_3}{s_1} \quad \text{and} \quad r_2 = \frac{s_3}{s_2}$$

This ratio criterion is needed to definitively determine whether the lines actually make up a rigid panel and are not things like highlights or reflections in the image.

Figure 9: Using the Hough transform in OpenCV allows for the detection of edges; analyzing these edges, we can find quadrangles in the image.
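For reference, a minimal sketch of the line-segment extraction step as we understand it, using OpenCV's probabilistic Hough transform, is shown below. The thresholds are illustrative and are not the values used in [14].

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

// Extract candidate line segments from an edge map of the camera image.
std::vector<cv::Vec4i> extractSegments(const cv::Mat& gray)
{
    cv::Mat edges;
    cv::Canny(gray, edges, 50, 150);

    std::vector<cv::Vec4i> segments;      // each segment is (x1, y1, x2, y2)
    cv::HoughLinesP(edges, segments,
                    1, CV_PI / 180,       // 1 pixel and 1 degree resolution
                    80,                   // accumulator threshold
                    60,                   // minimum segment length
                    10);                  // maximum gap allowed when joining segments
    return segments;
}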

A benefit of this method is its independence from hardware markers. This is very valuable; however, in their results they claimed frame rates of less than 20 fps. The simple marker based method we implement works at camera frame rate, which is very important for the performance arts application. The markers used are white ping-pong balls that glow with infrared light. They are very easy to construct, consisting of two 800 nm infrared LEDs (1.2 V, 20 mA) on opposing sides of a perf-board and a 33 Ω resistor. The emitted IR light is diffused with ping-pong ball hemispheres. The five markers are configured in parallel and powered by a 3 V source attached to the back of the projection panel. This type of fiducial was chosen over other ways to determine fiducial information primarily because of its ease of application, relatively low computational complexity and reliability of maintaining a presence in the camera view.

(a) IR markers

(b) IR marker components

Figure 10: Infrared fiducial markers are made from two IR LEDs diffused by ping-pong ball hemispheres.

Another benefit of this method is that it does not require constant tracking. The addition of a fifth marker not only acts as an aid in determining orientation but also makes this system extremely robust against occlusion. When the fifth marker is placed off center between the fiducial points, it establishes an ideal configuration that we must search for in the captured images.

This view is thresholded to detect 'blobs' which describe the fiducial markers. With dynamic movement of the performer, the challenge becomes robustly tracking the fiducial points. In the figure below it can be seen that the markers are very prominent in the camera view and are very easy to see and analyze.

Figure 11: The camera view is thresholded based on luminance in the red channel. It can be seen that using the LEDs is a good solution for a dynamic application such as dance where tracking success is very important.

The blob tracking algorithm used has existed for over 20 years; it was first presented by Gary Agin, then implemented by Dave Grossman of OpenCV. As the frame is raster scanned, regions that stand out are numbered and classified along with regions already encountered on a lower row. It is implemented in a two stage process: first over the rows, then over the starting and ending columns of a region. [10] For the x centroid $\bar{X}$, add the contribution of each of the rows. For every row, the positions of an ROI are at $y_1, y_2, \ldots, y_n$; thus sum the values $\sum_{i=1}^{n} y_i$, or equivalently $n \cdot \frac{y_1 + y_n}{2}$. Once this operation has been executed for all rows, the accumulated value has to be divided by the area $A$. For the y centroid $\bar{Y}$: in each row $x$, if the length is $n$, then sum $x \times n$ and divide by the area $A$.
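The same centroids can also be obtained with standard OpenCV building blocks. The sketch below is an equivalent approach (thresholding plus image moments), not the raster-scan implementation described above, and the threshold value is illustrative.

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

// Threshold the IR camera view and return one centroid per detected blob.
std::vector<cv::Point2f> findMarkerCentroids(const cv::Mat& irGray)
{
    cv::Mat bin;
    cv::threshold(irGray, bin, 200, 255, cv::THRESH_BINARY);   // keep only bright IR spots

    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(bin, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    std::vector<cv::Point2f> centroids;
    for (size_t i = 0; i < contours.size(); i++) {
        cv::Moments m = cv::moments(contours[i]);
        if (m.m00 > 1e-3)                                       // skip degenerate blobs
            centroids.push_back(cv::Point2f(m.m10 / m.m00, m.m01 / m.m00));
    }
    return centroids;
}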

2.1.4 Tracking

Once blob detection is implemented and the centroid positions of the blobs are known, we need to analyze this information and determine which blob corresponds to which position on the projection panel. The orientation point helps with this. To determine the proper order of points and orientation, we must sort the blobs during every captured frame. The ideal blob pattern is known to be rectangular with one blob off center encapsulated by the rest. As the projection plane moves through the camera frame, the perspective can change to where the pattern is warped from this ideal to where the sides are no longer parallel. Many researchers use consistent tracking to determine the positions of objects in the camera frame. In order to recover from occlusions, algorithms such as mean shift are used. In this case, mean shift is difficult since the surface is featureless and of uniform color, and consistent tracking is difficult to use for occlusion recovery. Other methods that involve consistent tracking require generous memory resources as well as complex computation techniques. Knowing the four corners of the quadrangle of known size, and because calibrated cameras are used, the relative rotation and translation (or pose) can be represented in the form:

$$q_k = [r_x, r_y, r_z, t_x, t_y, t_z]$$

Here, $r_x, r_y, r_z$ represent the rotation, and $t_x, t_y, t_z$ correspond to the translation of the quadrangle in the camera space. Each position of the panel is represented as a discrete particle, with the posterior density $P(q_k \mid q_{k-1})$ being determined by a particle filter. [7] The filter's state dynamic model shows how this pose changes from frame to frame during capture, while the observation model assigns weights to the data. The current image is observed as $y_k$, so the past frames up to the current one can be defined as $y_{1:k}$. The density is determined approximately as $P(q_k \mid y_{1:k})$. The particle filter generates a set of weighted samples that describe the pose of the projection panel, $((q_k^1, w_k^1), \ldots, (q_k^S, w_k^S))$, with $S$ representing the number of samples.

In this method [14], the state dynamic model is based on uniform density U and an uncertainty factor e from the unpredictable movement of the panel. The dynamic model is then described as:

$$P(q_k \mid q_{k-1}) = U(q_{k-1} - e,\; q_{k-1} + e)$$

The interesting characteristic of the filter approach in [14] is the observation model. Since the features in the frame consist of the panel's edges in the current image $k$, $y_k = L^k = l_1^k, l_2^k, \ldots, l_n^k$, the particles can be reprojected onto the image. Using a line comparison technique that checks the angles and distances between corresponding lines (or features) in the image, the best matched line segments can be assigned higher weights based upon some criteria. Bad matches are assigned very low weighting factors. Panel fiducials are then extracted from the higher weighted particles and replace the current ones. Likelihood is determined from the overlap ratios determined in tracking:

$$P(y_k \mid q_k^n) = \sum_{t=1}^{4} r_1^t \times r_2^t$$

Finally, with the likelihood determined, the weights of each particle are given by [14]:

$$w_k^n = \frac{P(y_k \mid q_k^n)}{\sum_{n=1}^{S} P(y_k \mid q_k^n)}$$
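As a small illustration of the two formulas above (a sketch only, not code from [14]), the likelihoods could be normalized into particle weights as follows:

#include <vector>

// likelihoods[n] holds P(y_k | q_k^n) for each of the S particles,
// already computed from the overlap ratios r1 and r2.
std::vector<double> normalizeWeights(const std::vector<double>& likelihoods)
{
    double sum = 0.0;
    for (size_t n = 0; n < likelihoods.size(); n++)
        sum += likelihoods[n];

    std::vector<double> weights(likelihoods.size(), 0.0);
    if (sum > 0.0)
        for (size_t n = 0; n < likelihoods.size(); n++)
            weights[n] = likelihoods[n] / sum;   // w_k^n from the equation above
    return weights;
}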

From this result, the relative orientation of the panel is known in each frame, and the fiducial coordinates of the board can be used for projection of augmentation once modified by the homography from the calibration procedure in section 2.1.2.

In order to increase frame rate and speed, especially with low cost camera equipment, the approach developed for "Dance and Engineering in Augmented Reality" does not implement any filtering for tracking. The tracking procedure uses only information in the current frame to determine the pose and orientation of the projection panel. This method is fast, practical and immune to recovery problems from occlusion.

(a) Projection Panel (b) Augmented

Figure 12: The projection panel (a) consists of a plain white cardboard panel with an IR LED marker at each corner and one in the middle, off center. This is tracked and overlaid with projection in (b).

To find the projection panel in the single frame, we assume that the position of each marker on the panel is known in the coordinate system of the panel $\Pi$, and that the configuration of the LEDs is known to be four at the fiducial points with an orientation point off center. The $i$th LED's position can then be $P^i = (x_i, y_i, 0, 1)^T$, with $z$ being zero because all points are on the plane.

Figure 13: The detected blobs are classified by comparing the set to an ideal blob pattern. Each frame, we cycle through blob orders and compute a perspective transformation between the ideal and current pattern configuration.

In order to make a correspondence, the centroids of the LEDs are unprojected, removing lens distortions using the camera matrix obtained during calibration. The points are unprojected as follows:

$$\begin{pmatrix} \tilde{u}_i \\ \tilde{v}_i \end{pmatrix} = \mathrm{CamUnproject}\begin{pmatrix} u_i \\ v_i \end{pmatrix} = \begin{pmatrix} \dfrac{u_i - u_0}{f_u}\,\dfrac{r'}{r} \\[1.5ex] \dfrac{v_i - v_0}{f_v}\,\dfrac{r'}{r} \end{pmatrix}$$

Here $u_0$ and $v_0$ are the camera centers and $f_u$, $f_v$ are the horizontal and vertical focal parameters. The distortion $r$ is determined directly by $\sqrt{\left(\frac{u_i - u_0}{f_u}\right)^2 + \left(\frac{v_i - v_0}{f_v}\right)^2}$.
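In practice this unprojection can also be delegated to OpenCV, which applies the same normalization and radial correction given the calibrated camera matrix and distortion coefficients. A hedged sketch with placeholder intrinsic values follows; it is not the system's own implementation.

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

// Map raw pixel centroids to normalized, undistorted image coordinates.
// K and distCoeffs come from the camera calibration; the values here are placeholders.
std::vector<cv::Point2f> unprojectCentroids(const std::vector<cv::Point2f>& pixels)
{
    cv::Mat K = (cv::Mat_<double>(3, 3) << 525.0,   0.0, 319.5,
                                             0.0, 525.0, 239.5,
                                             0.0,   0.0,   1.0);
    cv::Mat distCoeffs = cv::Mat::zeros(5, 1, CV_64F);   // plug in the real coefficients

    std::vector<cv::Point2f> normalized;
    // Output is ((u - u0)/fu, (v - v0)/fv) with the radial correction applied,
    // i.e. the same quantity computed by CamUnproject above.
    cv::undistortPoints(pixels, normalized, K, distCoeffs);
    return normalized;
}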

Once the markers are detected and unprojected into the image plane, their centroids are known in the camera's pixel coordinates. Because of our five point configuration, it is known that the orientation point will always be enclosed between the other four. To determine which four points are on the outside, we use the convex hull algorithm. As a result of the convex hull, we are given the indices of the blobs that are on the outside of the panel, the fiducial points. The remaining point is the orientation point. The problem with this is that the order of the indices is not necessarily correct with respect to the panel's orientation. It could be that the board is sideways or upside down, and since the indices are just ordered clockwise, we need to find some way to recover and match the orientation.

This matching requires some prior knowledge of the configuration of points; this will let us make a correspondence quickly at frame rate. Different orders are repeatedly chosen from these outside points. Since it is known that the points are fiducials, instead of getting permutations, it is quicker to move through the indices in a circular pattern. This is shown in the figure below. For every different point combination, a planar homography $H_N$ warps the four fiducials to a homogeneous plane. $H_N$ satisfies the relationship below. A similar method is proposed in 'Outside In LED tracking' [13]; however, it is not as robust, since it lacks the implementation of the convex hull, and combining permutations of points is slower than the circular combination method.

$$\begin{pmatrix} c_1 a_1 & c_2 a_2 & c_3 a_3 & c_4 a_4 \\ c_1 b_1 & c_2 b_2 & c_3 b_3 & c_4 b_4 \\ c_1 & c_2 & c_3 & c_4 \end{pmatrix} = H_N \begin{pmatrix} x_1 & x_2 & x_3 & x_4 \\ y_1 & y_2 & y_3 & y_4 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$

where

$$\begin{pmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}$$

The last point, the orientation point, is also transformed by the homography, $O' = [c_5 a_5 \;\; c_5 b_5]^T$. This is the point that we are most interested in. It will act as a key to determine whether the combination of points is a correct one.

Using this relationship, the LED that is the key (orientation) point is expected to lie close to its image-plane analog. This way, the combination with the smallest error between the orientation points will reveal the correct order of the points. Once identified, this combination will reveal the pose of the projection panel. We compare the transformed orientation point of the current LED order, $O'$, to the ideal orientation point position on the homogeneous plane, $O^i$. The method used to determine the error in the orientation point's position is a simple L2 norm:

$$|e| = \sqrt{\sum_{k=1}^{n} |O^i|^2 - \sum_{k=1}^{n} |O'|^2}$$

Once the combination with the smallest $e$ has been found, this combination is used to describe the pose of the panel for warping the media and overlaying with projection. We denote this proper configuration as $P_c$ in the camera coordinate frame.
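A condensed sketch of this ordering search is given below. The helper names, the ideal configuration arguments and the error test are illustrative; it is meant only to show how the convex hull and the circular shifts fit together, not to reproduce the system's implementation.

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// blobs: the five unprojected marker centroids in one frame.
// idealCorners / idealOrientation: the known panel configuration on the homogeneous plane.
std::vector<cv::Point2f> orderFiducials(const std::vector<cv::Point2f>& blobs,
                                        const std::vector<cv::Point2f>& idealCorners,
                                        const cv::Point2f& idealOrientation)
{
    // The convex hull gives the indices of the four outer fiducials;
    // the remaining blob is the orientation point.
    std::vector<int> hullIdx;
    cv::convexHull(blobs, hullIdx, true);

    std::vector<cv::Point2f> outer;
    cv::Point2f orientation;
    for (int i = 0; i < (int)blobs.size(); i++) {
        if (std::find(hullIdx.begin(), hullIdx.end(), i) != hullIdx.end())
            outer.push_back(blobs[i]);
        else
            orientation = blobs[i];
    }

    // Try the four circular shifts of the hull order; keep the one whose homography
    // maps the orientation point closest to its ideal position.
    std::vector<cv::Point2f> best;
    double bestErr = 1e30;
    for (int shift = 0; shift < 4; shift++) {
        std::vector<cv::Point2f> candidate;
        for (int j = 0; j < 4; j++)
            candidate.push_back(outer[(j + shift) % 4]);

        cv::Mat HN = cv::findHomography(candidate, idealCorners);
        if (HN.empty())
            continue;

        std::vector<cv::Point2f> in(1, orientation), out;
        cv::perspectiveTransform(in, out, HN);
        cv::Point2f d = out[0] - idealOrientation;
        double err = std::sqrt(d.x * d.x + d.y * d.y);
        if (err < bestErr) { bestErr = err; best = candidate; }
    }
    return best;   // the properly ordered fiducials P_c
}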

2.1.5 Warping and Projection

This is the final component of the system. After obtaining the homography relationship between the camera and projector and finding the fiducial points of our panel in the camera space, we need to warp their coordinates to fit the panel in the projector space using the calibration homography $H_C$. Using the properly ordered points in the camera frame, $P_c$, we transform to the projector's points $P_p$ as $P_p = H_C \times P_c$.

Fitting media to these new coordinates is made simple with the use of OpenGL. The display environment for this system is set up in an OpenGL window where we are given the ability to draw shapes by specifying vertices. When the shapes are drawn they can be texturized with outside media such as images or videos. We draw vertices in the OpenGL environment using $P_p$ and fit a texture accordingly. In our software, this is defined as the "WarpAblePlane" class that contains a warp method:

void WarpAblePlane::warp(const cv::Mat& H) {
    cv::Mat cr;
    cv::transform(cv::Mat(corners), cr, H);
    for (int i = 0; i < cr.rows; i++) {
        cv::Point3f p3f = cr.at<cv::Point3f>(i);
        corners[i].x = p3f.x / p3f.z;
        corners[i].y = p3f.y / p3f.z;
    }
}

Here the points of the quadrangle are transformed and normalized by the z coordinate. The application of the texture occurs as follows:

const GLfloat vVertices[] = {
    corners[0].x, corners[0].y, 0.0f,
    0.0f, 0.0f,                        // TexCoord 0
    corners[1].x, corners[1].y, 0.0f,
    1.0f, 0.0f,                        // TexCoord 1
    corners[2].x, corners[2].y, 0.0f,
    1.0f, 1.0f,                        // TexCoord 2
    corners[3].x, corners[3].y, 0.0f,
    0.0f, 1.0f                         // TexCoord 3
};

const GLfloat* tVerts = vVertices + 3;
GLsizei stride = 5 * sizeof(GLfloat); // 3 for position, 2 for texture

texture.bind();
glPushMatrix();
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, stride, vVertices);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, stride, tVerts);

This excerpt of code shows how the texture corners are fit to the corners of the quadrangle, where the corners vector contains the properly ordered $P_p$. The resulting projector window appears as follows:
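For completeness, the excerpt above only sets up the vertex and texture coordinate arrays; a sketch of the remaining legacy OpenGL calls that would issue the draw and restore state (not necessarily the exact calls in our renderer) is:

// Draw the textured quadrangle and restore state.
glDrawArrays(GL_TRIANGLE_FAN, 0, 4);          // four perimeter-ordered vertices cover the quad

glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glDisableClientState(GL_VERTEX_ARRAY);
glPopMatrix();                                // matches the earlier glPushMatrix()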

Figure 14: The final result of the warped media in OpenGL, when this is made full screen in the projector’s window, it overlays the projection panel.

The final result of the overlay (Figure 15) appears accurately on the projection panel and follows with minor delay as the panel is moved through the projection space. This system is robust against occlusion and works in all indoor lighting conditions, but suffers from interference from sunlight IR.

There are several problems with the method of calibration of this system; they will be discussed in more detail in the Calibration Methods section.

Figure 15: The final result of the warped media in OpenGL, when this is made full screen in the projector's window, it overlays the projection panel.

2.2 Microsoft Kinect "Hacks"

The computer is now a tool for virtually any job, but ever since the introduction of the PC, the primary tools for human-computer interaction have been the keyboard and mouse. Novel techniques from the world of console gaming have influenced the way we interact with computers, not only for enhancing entertainment but for work and productivity as well. At times, complex tasks lack natural tools when performed on computers.

The Microsoft Kinect has been a game changing piece of hardware. Its actual purpose as a controller for the XBOX console is to allow players to interact with the gaming environments using their bodies alone, without any sort of handheld input devices, as had never been done in gaming previously.

The device is essentially a 3D camera, capable of capturing full spatial geometry in its field of view. The Kinect contains a conventional RGB camera, an IR camera, and an infrared laser that is refracted by a special lens to create a series of IR laser dots that cover the space. The IR camera sees these dots and, based on the disparity between nearest neighbors, the PrimeSense chip (like a GPU) computes the relative distance of objects in the space.

Figure 16: In the Microsoft Kinect, a laser beam refracted through a lens creates a matrix of laser dots in the space; based on the disparity in the dots, the depth of objects in the space can be determined.

This creates a point cloud that can be overlaid with the RGB camera image, effectively assigning a depth to each pixel in the camera frame. Figure 16 displays how this technique works. After obtaining a 3D point cloud and supplementing it with RGB information, intelligent algorithms can be used to detect bodies or objects in this point cloud. The 3D information makes detecting joints and a skeleton much easier than methods that were used previously using just camera information. As a result this controller is robust enough and practical to use in commercial applications such as console games.
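As an illustration of how a single depth pixel becomes a 3D point, a generic pinhole back-projection sketch is shown below; the focal length and principal point values are placeholders, not the Kinect's calibrated parameters.

#include <opencv2/core/core.hpp>

// Back-project one depth pixel (u, v) with metric depth z into a 3D point
// in the depth camera's coordinate frame.
cv::Point3f backProject(int u, int v, float z)
{
    const float fx = 580.0f, fy = 580.0f;   // placeholder focal lengths (pixels)
    const float cx = 319.5f, cy = 239.5f;   // placeholder principal point

    cv::Point3f p;
    p.x = (u - cx) * z / fx;
    p.y = (v - cy) * z / fy;
    p.z = z;
    return p;
}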

Even before the Kinect was officially released, hobbyists as well as researchers immediately saw the potential of an affordable 3D sensor. [19] When the Kinect was first released for sale, the editors of Make magazine, a publication which focuses on DIY projects, announced that they would be offering a $3,000 bounty to anyone who managed to release free software to make the Kinect work with a computer. This happened within days, and as soon as the software was released "hacks" began to surface. First, individual developers used the hacked software; soon after, Microsoft embraced the hacker community and an official API, OpenNI, was released. This API, though proprietary to the company that developed the Kinect (PrimeSense), let developers track the human skeleton from the 3D point cloud. With this the Kinect quickly became adapted from a gaming console controller to the PC.

Figure 17: The depth information captured with an open source Kinect driver is 640 x 480 x 640. The units are all relative to the origin of the camera.

Seeing the potential of the device, we have purchased several Kinects and incorporated them into the Dance and Engineering in Augmented Reality project.

2.2.1 Kinect at Temple

As a result of investigating the Kinect open source software and the OpenNI API, we have created several proof of concept projects in Augmented Reality. Most of these have been done in the context of performance art. The goal was to demonstrate interactive special effects that potentially could be implemented in a stage setting.

The software libraries and tools that we have implemented include both the open source drivers and OpenNI for the Kinect interface, OpenCV for easy implementation of image processing algorithms, and OpenFrameworks because of its excellent improvement over the OpenGL graphics API. This allowed us to create impressive works that capture audiences' attention and have won awards at recent competitions.

All these projects employ the same homography based calibration technique that was used in DEAR: after the Kinect is calibrated using the RGB image, a correspondence is made in the 3D point cloud. Then the coordinates of objects or body parts that are detected with the 3D sensor are translated into the projector space and overlaid with projection. In the following sections, the different effects demonstrations will be outlined.
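Concretely, once the calibration homography is saved, mapping a detected point (for example, a hand centroid in the RGB image) into projector coordinates is a single perspective transform. A minimal sketch, assuming the homography H has already been loaded, follows; the function name is hypothetical.

#include <opencv2/core/core.hpp>
#include <vector>

// H: the saved 3 x 3 camera-to-projector homography from calibration.
// cameraPoint: e.g. a hand centroid detected in the RGB image.
cv::Point2f toProjector(const cv::Mat& H, const cv::Point2f& cameraPoint)
{
    std::vector<cv::Point2f> in(1, cameraPoint), out;
    cv::perspectiveTransform(in, out, H);   // divides by the homogeneous coordinate
    return out[0];
}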

2.2.2 Getting Usable Sensor Data

With the implementation of OpenFrameworks, generating appealing effects becomes easy. In these demos, we did not use the proprietary OpenNI API and instead captured the depth information using open source Kinect drivers. In this depth image, the Kinect automatically zeroes values which are undefined because of occlusion. There is also a lot of noise in the raw depth image. So before being able to use the depth information, some kind of filtering was required.

To smooth the raw depth image, the inpainting algorithm was used. To explain simply, the way it works is that after the image is segmented, the texture or color at the border of defined segments is used to fill and propagate this color into the areas where the image is 'damaged'. [5] After segmentation, the areas are defined by convex hulls and the color information around the perimeter is mixed inside the regions. The result of this smoothing is shown in Figure 18:

Figure 18: The smoothed depth image from the raw Kinect data is determined with the help of OpenCV's inpaint technique.
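A minimal sketch of this smoothing step is shown below; the mask definition and inpainting radius are illustrative, and the demo code may differ.

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/photo/photo.hpp>

// rawDepth: 8-bit depth image from the Kinect driver with undefined pixels set to 0.
cv::Mat smoothDepth(const cv::Mat& rawDepth)
{
    // Mask of the 'damaged' regions: everywhere the sensor reported no depth.
    cv::Mat mask = (rawDepth == 0);

    cv::Mat filled;
    cv::inpaint(rawDepth, mask, filled, 3.0, cv::INPAINT_TELEA);   // 3 px inpainting radius
    return filled;
}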

Once a smooth depth image is obtained, various image processing techniques can be used to detect and track objects in the image. For the following demos, simple image subtraction, contour tracing and blob detection are used to find the users' hands, bodies and changes in the scene.

2.2.3 Interactive Particle Systems

The demonstration in Figures 19 and 20 shows the effect that the dancers are currently working with and employs binned particle systems in OpenFrameworks. Changes in the scene are detected within the depth image and attraction forces are set to where these changes occur. This technique is preferred by the performers because of its visual appeal and predictability. They are looking for predictable changes in the visualization that react to their movements. Another benefit is reaction to multiple sources of change; the particle system here supports multiple sources of attraction.

Figure 19: The particle system uses attraction and repulsion forces for each particle as well as a larger attraction force to user’s hands that are detected in the scene.

We experimented with a different particle system that uses tree and leaf structures; this is, however, more computationally intensive, so fewer particles can be rendered at the appropriate frame rate. This one, instead of setting attraction forces to users' hands, applies them to any changes in the scene.

37 Figure 20: A user waves a stick which attracts a swarm of particles.

2.2.4 Superpowers with Projection

This demo in Figure 21 shows the type of special effects that can be used to augment a performance. Here users' hands are tracked and overlaid with projections of trail and lightning effects. Currently this is implemented with only a single user. A contour of a body is found in the depth image, the extreme left and right points are defined to be the hands, and these are then tracked using a method similar to the particle filter used in DEAR.

Figure 21: Lightning and trail effects are projected over the performer’s hands creating an illusion of superpowers.

2.2.5 Kinect 3D modeling

This project is an updated version of last year's "Virtual Playdoh". Previously, Wii remotes were used as infrared cameras to track an LED control pen in stereo. It was developed around the idea that computers still lack a way to give a user a feeling of natural interaction for tasks like 3D modeling. The audience for this presentation consisted of 4th and 5th graders, so we needed to convey these ideas of human computer interaction in a way that kids would understand. Play-Doh was the most natural example of 3D modeling. To further add to the realism, the modeling scene was rendered in anaglyphic 3D. Since kids love Play-Doh and 3D glasses, and especially love playing video games, this presentation was a big hit.

Figure 22: Virtual Playdoh allowed users to mold a warp-able NURBS surface using both of their hands and rendering in anaglyphic 3D.

This year, to create a contrast between the gaming technologies, we used OpenNI to integrate the Xbox Kinect into Virtual Playdoh to track a user's hands instead of a physical tool like the control pen. As part of the presentation, we explain how anaglyphs (red/cyan 3D images) and stereo tracking work, as well as the technology behind the Kinect. The kids find out how their favorite video games work, and we trick them into getting interested in science, math and engineering.

2.2.6 The Point Cloud Library

In an effort to further distance ourselves from the closed source OpenNI, we are investigating methods to work with the raw point cloud data obtained from the Kinect sensor. Very recently an open source library, the Point Cloud Library (PCL) [4], was released as a standalone beta by Willow Garage. The PCL framework contains numerous state-of-the-art algorithms including filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation.

Algorithms packaged with the library can be used to determine outliers in noisy point cloud data, combine clouds by stitching, segment relevant objects in the scene, extract descriptors and recognize objects based on the geometry captured in the scene. Apart from computational tools, the library also provides robust visualization tools, so less time is spent on rendering and more effort can be dedicated to the development of algorithms. This is still a very new field with a lot of potential for breakthrough publications and research. The library contains many bugs and is seeking help from independent developers and researchers. After the calibration problem has been solved, most of the focus will be on analysis using techniques like those provided in this library. Though still very young, as 3D sensors become more accessible, PCL is likely to become the OpenCV of 3D point clouds.
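As an illustration of the kind of processing PCL offers, the following hedged sketch applies PCL's statistical outlier removal filter to a Kinect point cloud. The wrapper function and parameter values are our own assumptions, not code from this project.

```cpp
#include <pcl/point_types.h>
#include <pcl/filters/statistical_outlier_removal.h>

// Remove points whose mean distance to their neighbors is unusually large,
// a typical first step for noisy Kinect clouds.
pcl::PointCloud<pcl::PointXYZ>::Ptr
removeOutliers(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud)
{
    pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);

    pcl::StatisticalOutlierRemoval<pcl::PointXYZ> sor;
    sor.setInputCloud(cloud);
    sor.setMeanK(50);              // neighbors used to compute the mean distance
    sor.setStddevMulThresh(1.0);   // points beyond one sigma are discarded
    sor.filter(*filtered);
    return filtered;
}
```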

3 Calibration Methods

3.1 Projector Calibration Methods

Because the focus of much of this work is Spatial Augmented Reality, accurate projection is very important. We must be able to accurately project onto any point in the projector's field of view. For all of the work described above, the simple planar homography calibration method [18] was used to achieve acceptable overlay. This method was described in the calibration section of the DEAR project.

Even though this is a practical way to do calibration, the method is not accurate at different depths. Since the projector - camera system is an example of two view camera geometry, it suffers from parallax issues. To put it simply, one can imagine focusing on an object at some depth (the calibration depth); when another object is held closer to the eyes, an illusion of two objects is created, because two eyes form a two view system. The further that object is from the focused depth, the larger the disparity between the two objects in the illusion. As humans we counter this issue by constantly refocusing and rotating our pupils to follow objects around scenes. As a result we cannot focus on both far and near planes without the views splitting and creating an illusion of duplication. When closing one eye, however, even though the objects away from the plane in focus are blurry, no parallax exists, emulating a single camera.

Our main objective is to calibrate the projector - camera system in such a way that, instead of focusing on a certain depth as in the planar calibration method, we focus on the entire scene, like closing one eye. The camera then has to be replicated within the projector: to project onto what the camera is seeing, the projector has to have the same focal parameters, and the scene that is projected out has to be rotated and translated according to the displacement between the projector and the camera.

The goal of the calibration is to determine the projector's camera matrix and the displacement between the camera and projector. This matrix consists of the focal parameters and defines the frustum of the projector. For a camera it dictates the transformation of 3D real space coordinates to 2D coordinates in the camera's image plane. The classic pinhole relation, illustrated in Figure 23, shows how real world coordinates $M'$ relate to image coordinates $m'$ through the camera matrix $A$ (the intrinsic parameters of the camera) and the rotation and translation (extrinsic parameters) $[R|t]$:

$$m' \propto A [R|t] M'$$

Figure 23: The pinhole camera model maps real space to image space.

There has been some work done in the area of camera-projector systems in recent years, with most publications dated after 2005, so this is still a developing focus for academic researchers. Our proposed method for projector calibration makes use of the new 3D sensors and will be compared to the following existing projector calibration methods.

3.1.1 Arbitrary Planes and Calibrated Camera

This method, 'Projector Calibration Using Arbitrary Planes', defines the projective relationship between the projector's projection space and a two dimensional calibration pattern. It differs from the planar calibration generally used to rectify keystone effects, and it is the basis for systems that perform measurement in three dimensions. It is centered around the idea that the light emitted from the projector can be modeled as a reverse pinhole camera in perspective projection. The authors note that existing techniques use special objects like complex 3D calibration patterns and active patterns to perform calibration; they avoid having to project onto elaborate objects with enough depth characteristics and simply employ the standard camera calibration technique using a chessboard pattern. The result is a method that uses only a calibrated camera and a planar calibration pattern, and that is practical, accurate and not computationally intensive. The projection model for a standard projector is similar to that of a pinhole camera; the only difference is the direction of projection. Perspective projection can be described as:

$$\tilde{m} \propto P\tilde{M}$$

where $\tilde{M} = [x, y, z, 1]^T$ describes object space and $\tilde{m} = [u, v, 1]^T$ describes image space. This relationship is created by $P$, the matrix of intrinsic and extrinsic parameters:

$$P = \begin{bmatrix} \alpha & 0 & C_x \\ 0 & \beta & C_y \\ 0 & 0 & 1 \end{bmatrix} [R \mid t]$$

The intrinsic matrix contains $\alpha$ and $\beta$, the focal lengths (the distance between the camera center and the image plane) along the two image axes; two different lengths are used in the x and y directions because pixels in the camera's sensor are usually not perfectly square. The extrinsic matrix, as for a camera, consists of a rotation $R$ and a translation $t$. Since projectors and cameras both exist within the framework of perspective projection, any arbitrary plane in the space yields a homography relationship:

$$\tilde{m}_a \propto H_{ba}\tilde{m}_b$$

The points $\tilde{m}_b$ are 2D coordinates in the projector's projected pattern, while the points $\tilde{m}_a$ are the coordinates of the same pattern in the camera's 2D image. The inverse relationship $H_{ab}$ exists as the inverse matrix of $H_{ba}$. Considering this, the projector and camera pair must also have a fundamental matrix describing the epipolar geometry between the two. This matrix is a result of the two projections, from camera and projector, and an epipole:

$$F_{ba} \propto [\tilde{e}_a]_\times P_a P_b^{+}$$

To determine the fundamental matrix $F_{ba}$, a number of homographies must be determined by translating and rotating a plane through the camera and projector space. Consider $H_{ba_i}$ and $H_{ab_i}$ to be the homographies for plane $i$; an indefinite homography $H_{aa}$ is then determined from two different positions of the plane:

$$H_{aa} = H_{ba_i} H_{ab_j} \quad \text{where } i \neq j, \qquad \tilde{m} \propto H_{aa}\tilde{m}$$

The point $\tilde{m}$ must lie on the intersection line of planes $i$ and $j$, and must be a point that is projected from the focal point of $P_b$ onto the image plane of projection $P_a$. Thus the epipole $\tilde{e}_a$ is determined by decomposition of $H_{aa}$ [11]. From the epipole and the arbitrary plane homography, the matrix $F_{ba}$ is then calculated:

$$F_{ba} \propto [\tilde{e}_a]_\times H_{ba_i}$$

The same procedure is applied for the epipole $\tilde{e}_b$ and the fundamental matrix $F_{ab}$ [15]. Assuming the camera's intrinsic matrix is known, we let $P_a$ be the known camera intrinsics and $P_b$ the unknown projector intrinsics. From the above relationships it can then be said:

$$[\tilde{e}_a]_\times H_{ba} P_b \propto [\tilde{e}_a]_\times P_a$$

$$[\tilde{e}_b]_\times H_{ab} P_a \propto [\tilde{e}_b]_\times P_b$$

Since the camera matrix $P_a$ is assumed to be known, both of the planar homographies and the epipoles are determined, with $P_b$ still unknown. In the estimation, some reasonable intrinsic matrix is set, then the extrinsic and intrinsic parameters are optimized to fit the conditional expressions with the homographies and epipoles. This is straightforward since the focal points of both projector and camera are restricted by back projection lines on the epipoles $e_a$ and $e_b$; because of this the extrinsic matrix has only two degrees of freedom, as shown in Figure 24. Using an arbitrary scale factor in the homography [15], both the intrinsic and extrinsic parameters of $P_b$ are optimized with the same conditional expressions using the acquired homographies. The authors claim that when using two XGA cameras and an XGA projector, the re-projection using the calibrated projector was evaluated to have only 0.4 pixel ambiguity, which is close to the margin of error of the calibration of the cameras they used. The high accuracy is attributed to the evaluation of multiple homographies simultaneously.

Figure 24: The two degrees of freedom, one in rotation and one in translation.

3.1.2 Ray-Plane Intersection Method

This method's goal is to uncover the extrinsic and intrinsic parameters of the projector and camera in a projector - camera system. As before, both devices can be modeled with the pinhole camera geometry model, whose parameters include the image center or principal point, the focal lengths, the pixel size and the skew factor, with the extrinsics being the rotation and translation from world coordinates to the projector or camera coordinate system. The method requires determining the 3D points of the corners in the projected pattern and using them, together with the two dimensional points in the projected image, to get the intrinsics and extrinsics of the projector. The method can be broken up into several steps:

• Calibrate the camera

• Detect physical chessboard corners on calibration plane in camera image

• Project chessboard and detect corners

• Use ray-plane intersection to get 3D position of projected pattern’s corners

• Calibrate projector with 2D image points and 3D object points of the projected pattern

Camera calibration is a trivial problem. Easy to use functions exist in OpenCV that take in image points and object points to determine a 3 x 3 camera matrix. It is important to note that the image points are the coordinates of the detected chessboard corners in the camera's image, denoted in pixels. The object points are the physical corners of the chessboard in centimeters or some other real world units; for Zhang calibration with the chessboard pattern, these are also two dimensional. Several snapshots (at least four) of the calibration pattern are required to calibrate. To make life easier, OpenCV also contains a chessboard corner detection function.

Once the camera has been calibrated, the rest of the calibration procedure requires a known plane onto which to project. A large rigid poster has a small physical chessboard pattern attached to it, with enough room left to project onto. Using the calibrated camera and the corners detected on this panel, we can determine the panel's pose and normal vector. When the pose of this plane is known, we project a chessboard calibration pattern. As the panel is held at different positions, the projected pattern becomes warped and skewed; by detecting the projected points on the known plane, we can determine the projector's intrinsics.

The translation column of the camera's extrinsic matrix gives us the origin of the plane $p$, while the third column of the rotation matrix gives $n$, the normal vector of the surface of the plane carrying the chessboard pattern. The corresponding visual representation of the $K_{ext}$ matrix is shown in Figure 25.

$$K_{ext} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Figure 25: The extrinsic parameters that are used to compute the plane equation.

Here $p$ is a vector giving the position of a point known to be on the plane, and $n$ is a nonzero normal vector to that plane. The desired plane is the set of all points $r$ such that:

$$n \cdot (r - p) = 0$$

with

$$n = a\hat{x} + b\hat{y} + c\hat{z}, \qquad r = x\hat{x} + y\hat{y} + z\hat{z}$$

where $\hat{x}, \hat{y}, \hat{z}$ are unit vectors. With $d$ defined as the dot product $d = -n \cdot p$, the plane equation becomes:

$$ax + by + cz + d = 0$$

After determining the plane equation, we need to create a representation of the 3D rays from the camera center going through the corners of the projected chessboard pattern, extracted with OpenCV's chessboard function. To get the ray through an image point relative to the image center $c_x, c_y$, a projective transformation is applied using the camera's intrinsics and extrinsics.

Figure 26: Ray plane intersection in its entirety. The objective is to determine the vectors V from the camera center to the projected calibration pattern.

$$\begin{bmatrix} sR_x \\ sR_y \\ sR_z \\ s \end{bmatrix} = [K_{int}\, K_{ext}]^{-1} \begin{bmatrix} c_x \\ c_y \\ 1 \end{bmatrix}$$

The vector $[R_x\ R_y\ R_z]$ that describes the ray is defined up to a scale factor $s$. This scale defines the length at which the ray intersects the 3D point on the image of the projected calibration pattern (points $P_i$ in the figure). To get the vectors $V$ in Figure 26, the intersection must be determined between the rays going through the image of the projected pattern and the physical plane where the pattern is actually being projected. We need to find the $s$ factors that satisfy the plane equation

ax + by + cz + d = 0

and substitute for the ray:

a(sRx) + b(sRy) + c(sRz) + d = 0

Since $a$, $b$, $c$, $d$ and $R_x$, $R_y$, $R_z$ are known, $s$ can be recovered. Once $s$ is determined, we have the 3D position of each corner of the projected calibration pattern in the object coordinate frame. Once this is done for several positions of the plane [8], we can calibrate the projector in the same way that the camera was calibrated.

The camera calibration function is used with the image points of the projected pattern (its actual pixel coordinates) and the object points, the ones determined with the ray-plane intersection process.
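The scale recovery at the heart of the ray-plane intersection step can be sketched as follows. This is an illustrative fragment under the assumption that the plane coefficients and ray direction are already known; it is not the original authors' code.

```cpp
// Given the plane ax + by + cz + d = 0 and a ray direction (Rx, Ry, Rz)
// through the camera center, recover the scale s and the 3D corner position.
struct Point3 { double x, y, z; };

Point3 intersectRayWithPlane(double a, double b, double c, double d,
                             double Rx, double Ry, double Rz)
{
    // a(sRx) + b(sRy) + c(sRz) + d = 0  =>  s = -d / (a*Rx + b*Ry + c*Rz)
    double s = -d / (a * Rx + b * Ry + c * Rz);

    Point3 p;
    p.x = s * Rx;
    p.y = s * Ry;
    p.z = s * Rz;
    return p;
}
```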

3.2 3D Sensor - Projector Calibration

The two methods outlined above, calibration using arbitrary planes and ray-plane intersection calibration, both relied on a camera to see for the projector and to determine the 3D points required to get the projector's intrinsic matrix. With the availability of the 3D depth sensor, we can replace the camera and make the process of calibrating the projector faster, easier and less demanding in terms of elaborate calibration pattern setups.

The RGBD sensor has only recently begun to be explored in the field. As can be seen from the wide array of applications and projects, it has the potential to become a tool almost as important as the camera.

As stated in a previous section, the RGBD sensor consists of a standard digital camera, an infrared camera and an infrared laser projector. The image sensors of the infrared camera and regular camera are physically separated by 3cm. This essentially is a two view stereo camera, with one being a standard camera image and the other a range image.

The general outline of the new calibration method is as follows:

• Project a calibration pattern on some surface

• Detect corners of projected calibration pattern in RGB camera

• Make depth sensor correspondence to corners in RGB image

• Use 3D object points determined to get projector matrix.

• Calibrate the 3D sensor's RGB camera to get the extrinsics between the 3D sensor and projector

Even though this method is fairly straightforward, it has not yet been implemented, and the contribution it makes to easing the calibration of projection mapping systems will enable many interesting results in the fields of spatial augmented reality and projection mapping.

There are, however, some complications in these steps. Since the depth sensor has both IR and RGB cameras separated by some distance, a correspondence must be made between the two in order to "look up" depths of RGB pixels. Another problem is determining the displacement (extrinsics) between the sensor and projector.

Then finally, when the intrinsics and extrinsics have been determined, the question becomes how to use them to accurately augment a scene with the projector.

In Zhang calibration for cameras, a planar chessboard pattern is used and it is not necessary to know depth values for the detected points, since the pattern is known to be planar: real unit values are used for the x, y positions of the corners and z is assumed to be 0 [20]. For accurate calculation of the projector intrinsics, the z coordinates of the chessboard corner 'object points' on the projected image must be known relative to the camera. The advantage of the RGBD sensor is that they can be found with the help of the sensor's depth image, where previously, in the other methods of projector calibration, complex calculations and estimation techniques were required.

The performance of this method will of course vary and improve with higher resolution and better quality sensors.

Since it is impossible to detect the printed or projected chessboard pattern in the range image, we use the RGBD sensor's camera. The chessboard is easy to detect in the regular image; once it is detected, the depth of each pixel could be 'looked up' in the range image. There is, however, a problem with that. Because the cameras are separated by a small distance, their images do not overlay and are taken from slightly different positions, so looking up a pixel $x_i, y_i$ from the regular image will land on a different object in the range image. In Figure 27, the actual range and camera images are added together to display the differences. Not only are the cameras separated by a distance, their intrinsics are also different, with the range camera having a smaller field of view and thus different focal characteristics.

Figure 27: The depth and RGB images on top of each other; it can be seen that they do not perfectly correspond.

3.2.1 Calibration of depth and image sensors

Before doing anything with the detected points, a correspondence must be made so that the depth and RGB images in the above figure align accurately. To do this we can use a calibration method similar to the one implemented in Dance and Engineering in Augmented Reality: both cameras see a calibration pattern, a homography is generated from the two views of it, and transforming one of the images by this homography aligns the depth and RGB images. The task is complicated by the fact that one of these cameras cannot see visible light, only the depths of solid objects from the reflected array of dots.

In order to make accurate correspondences in both RGB and range images and look up depths of RGB pixels, the images must overlay each other perfectly. To achieve this, corresponding points must be identified in both images. One way to make the chessboard calibration pattern visible in the range image is to use transparent plexiglass. In this case, a plexiglass pattern was constructed to calibrate the two images: solid squares that reflect infrared light were adhered to the sheet of plexiglass, making them appear as solid objects in the range image, while the 'white' squares were left clear so as not to reflect the RGBD sensor's infrared light.

This created a resemblance of a chessboard in the range image. The RGB camera would see the chessboard corners as normal, so a correspondence was made between the corners in the RGB and range images. Figure 28 shows the plexiglass calibration rig with chessboard corners detected.

Figure 28: Calibration [12] of depth and RGB cameras on the 3D sensor

However, the performance of this method was not as desired. Since the chessboard corner detection algorithm works by detecting points where the squares of the chessboard connect, it often lost the pattern due to noise in the depth image and the low resolution of the range sensor. Another factor that contributed to error was that when the glass became smudged, infrared light was reflected, producing erroneous data. To achieve more accurate results, a different approach to constructing a calibration rig for the range and image cameras was taken.

This calibration rig was physically three dimensional. It consists of a physical chessboard pattern constructed from foam board, with a solid white background set back some distance from the checkers. This pattern, pictured in Figure 29, worked flawlessly and is actually detected even better in the range image than in the RGB image.

Figure 29: Physical 3D chessboard calibration pattern used that can be seen in both RGB and range cameras.

Once the calibration pattern can be identified in both images (Figure 30), two methods were examined to make the images align: homography based alignment and stereo calibration.

Figure 30: Physical 3D chessboard calibration pattern with detected corners.

3.2.2 Homography Based Alignment for Two View Correspondence

The goal of this method is to produce a homography $H_{kinect}$, a projective transformation between the RGB and range images. When this homography is applied to one of the images, it skews and warps it to align with the other so that corresponding points overlay. As in the calibration method used in the previous work, Dance and Engineering in Augmented Reality, the homography is calculated between the detected corners in the depth image $I_d$ and the RGB image $I_{rgb}$.

$$I_d = H_{kinect} I_{rgb}$$

$$\begin{bmatrix} x_d \\ y_d \\ 1 \end{bmatrix} \sim \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x_{rgb} \\ y_{rgb} \\ 1 \end{bmatrix}$$

$$x_d = \frac{h_{11}x_{rgb} + h_{12}y_{rgb} + h_{13}}{h_{31}x_{rgb} + h_{32}y_{rgb} + h_{33}}, \qquad y_d = \frac{h_{21}x_{rgb} + h_{22}y_{rgb} + h_{23}}{h_{31}x_{rgb} + h_{32}y_{rgb} + h_{33}}$$

Below are examples of the resulting $H_{kinect}$ homographies for two trials.

Trial 1:
$$\begin{bmatrix} 1.09681 & -0.0170 & -50.2755 \\ -0.0039 & 1.1009 & -37.437 \\ 3.8396\mathrm{e}{-5} & -9.2104\mathrm{e}{-5} & 1 \end{bmatrix}$$

Trial 2:
$$\begin{bmatrix} 1.02605 & -0.00610 & -2.6444 \\ 0.0072 & 1.0168 & -3.3147 \\ 3.1977\mathrm{e}{-5} & 1.0884\mathrm{e}{-5} & 1 \end{bmatrix}$$

The chessboard detection step limits the speed of homography capture to slower than frame rate, about 0.6 seconds per frame, while the homography calculation itself takes negligible time in the C++ implementation. Because capture is quick, it is trivial to obtain a large set of homography data.

Once $H_{kinect}$ is known, this step does not need to be repeated, because the cameras are fixed relative to the specific RGBD device. The homography is saved for use in the next step.
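A minimal sketch of how such a homography can be obtained with OpenCV, assuming the chessboard corners have already been detected in both views; the wrapper function name is an illustrative assumption.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Estimate H_kinect from one snapshot of the 3D chessboard rig:
// it maps RGB pixel coordinates into the range image, I_d = H_kinect * I_rgb.
cv::Mat computeKinectHomography(const std::vector<cv::Point2f>& rgbCorners,
                                const std::vector<cv::Point2f>& depthCorners)
{
    return cv::findHomography(rgbCorners, depthCorners);
}

// Warping one image by the saved homography then aligns the two views, e.g.
//   cv::warpPerspective(rgbImage, alignedRgb, H, rgbImage.size());
```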

To overlay the images accurately, the x, y positions of all the pixels of one of the captured images are simply multiplied by the homography. This results in a warped image that corresponds to the range image. Using the homography from the trial marked '1' above, Figure 31 shows the detected corners before and after the multiplication by $H_{kinect}$. When the two images are matched and a correspondence is made, the result becomes usable for direct lookup of depth from the RGB image.

Figure 31: Mismatched chessboard corners are matched after multiplying the RGB image by the determined homography.

3.2.3 Stereo Calibration for Two View Correspondence

Theoretically, this method is more accurate than the previous, homography based method. Unlike the above, both the depth and RGB cameras must be calibrated and have known intrinsics. Stereo calibration allows us to determine the physical relationship between two cameras, specifically how corresponding points are situated in each camera's field of view. The goal is to determine the rotation and translation of each camera with respect to the other, as well as the essential and fundamental matrices describing the epipolar geometry of the camera pair.

For this, the same 3D calibration pattern was used. Calibrating a camera requires several poses of the calibration pattern to be captured, from which the camera intrinsics can be estimated. The same is done for both depth and RGB cameras, and the following intrinsic parameters were estimated.

   612.2128 0 283.4850        RGB camera matrix:    0 609.8435 262.0371       0 0 1 

   591.7439 0 301.5931        Depth camera matrix:    0 588.1731 247.1483       0 0 1 

fx fy cx cy

RGB 612.2128 609.8435 283.4850 262.4850

Depth 591.7439 588.1731 301.5931 247.1483

This is reasonable with both camera centers approximately at half the width and height of their respective images. Several calibration trials result in similar numbers, all within reasonable range. Following calibration of each camera individually, a stereo calibration is performed to determine the cameras’ rotation and translation relationship to each other.

Rotation matrix:
$$\begin{bmatrix} 0.98872 & 1.1545\mathrm{e}{-3} & -1.8120\mathrm{e}{-2} \\ -1.4197\mathrm{e}{-2} & 1.0048 & -1.2718\mathrm{e}{-2} \\ 1.7512\mathrm{e}{-2} & 1.3110\mathrm{e}{-2} & 0.9992 \end{bmatrix}$$

Translation vector:
$$\begin{bmatrix} 2.0863\mathrm{e}{-2} \\ -8.8119\mathrm{e}{-4} \\ -0.9871\mathrm{e}{-2} \end{bmatrix}$$

Again, these numbers are reasonable. The off-diagonal values in the rotation matrix are very small, showing that the cameras are oriented facing essentially the same direction, as they are on the sensor. The translation values, in meters, show that the cameras are aligned laterally and separated by about 2 cm, with their virtual camera centers separated by about 1 cm, explaining the difference in the field of view.

With the rotation and translation matrices known, we can then use the 'lookup' method, factoring in the transformations, to determine the RGB pixel location of each corresponding point in the range image, by a method similar to that proposed in [6].

$$P3D' = R \cdot P3D + T$$

$$P2D_{rgb}.x = (P3D'.x \cdot f_{x,rgb} / P3D'.z) + c_{x,rgb}$$

$$P2D_{rgb}.y = (P3D'.y \cdot f_{y,rgb} / P3D'.z) + c_{y,rgb}$$
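The lookup described by these equations can be sketched as follows, assuming R and T come from the stereo calibration above; the function and variable names are illustrative, not the thesis' actual code.

```cpp
#include <opencv2/opencv.hpp>

// Transform a 3D point from the depth camera frame into the RGB camera frame
// and project it with the RGB intrinsics, following [6].
cv::Point2f projectDepthPointToRgb(const cv::Point3f& P3D,
                                   const cv::Mat& R,   // 3x3, CV_64F
                                   const cv::Mat& T,   // 3x1, CV_64F
                                   double fx_rgb, double fy_rgb,
                                   double cx_rgb, double cy_rgb)
{
    cv::Mat p = (cv::Mat_<double>(3, 1) << P3D.x, P3D.y, P3D.z);
    cv::Mat q = R * p + T;                       // P3D' = R * P3D + T

    double x = q.at<double>(0), y = q.at<double>(1), z = q.at<double>(2);
    return cv::Point2f((float)(x * fx_rgb / z + cx_rgb),
                       (float)(y * fy_rgb / z + cy_rgb));
}
```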

3.2.4 Comparison Between Homography Based and Stereo Calibration for Two View Correspondence

Both methods are appropriate for aligning the two views. Stereo calibration is the more general approach and requires calibrated cameras with known intrinsics and extrinsics. This makes it more complex than the homography based approach, which only requires a single snapshot of the calibration pattern and no camera calibration.

Figure 32 shows the distribution of the chessboard's origin corner when no calibration is used. It is obvious that this is unusable, with a disparity of about 30 pixels between the centers of the distributions. The standard deviation of the distribution is larger for the depth image due to sensor noise, with the RGB distribution falling within one pixel of the distribution center. This deviation persists with both calibration methods because of its direct relationship to the nature of the image and depth sensors.

         std(x)    std(y)
depth    1.6229    1.4206
rgb      0.8959    0.3880

Calibrating with the homography, we see the same distributions of points, deviating less than 2 pixels in the depth image and less than 1 pixel in the RGB image from the distribution centers, as in the uncalibrated case. Figure 33 shows the distribution of pixels with the homography calibration method, and Figure 34 shows the stereo calibrated distribution of points.

Figure 32: When not calibrated, the position of each corner does not correspond, with a disparity of close to 30 pixels.

Figure 33: Plot of correspondence when calibrated with homography.

From both Figures 33 and 34 it can be seen that there is no significant accuracy improvement that puts either method at an advantage. The pixel deviations are within the same range as the uncalibrated case, and the distance between the centers of each distribution falls within an acceptable range for our applications.

Figure 34: Plot of correspondence when calibrated with stereo calibration.

                     Homography    Stereo Calibration    Uncalibrated
Center Disparity     0.8215        0.9783                27.9575
Deviation x rgb      0.8171        0.7492                0.8959
Deviation y rgb      0.6482        0.8841                0.3880
Deviation x depth    1.8901        2.0857                1.6229
Deviation y depth    1.6794        1.7301                1.4206

3.2.5 Extracting Depth Information

When this correspondence has been made with the homography, we can use the RGB data to look up the depth coordinates of each RGB pixel in the undistorted images. Using the depth camera intrinsics, each pixel $x_d, y_d$ is projected to 3D space with the following:

x3d = (xd − cxd) × depth(xd, yd)/fxd

y3d = (yd − cyd) × depth(xd, yd)/fyd

z3d = depth(x, y)

In reverse, we can project each 3D point to get its RGB counterpart.
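A small illustrative helper implementing the back-projection equations above might look like this; the names are assumptions rather than code from this project.

```cpp
#include <opencv2/opencv.hpp>

// Back-project a depth pixel (xd, yd) with depth in meters into a 3D point
// in the depth camera frame, using the depth camera intrinsics.
cv::Point3f depthPixelTo3D(int xd, int yd, float depthMeters,
                           double fx_d, double fy_d, double cx_d, double cy_d)
{
    float x3d = (float)((xd - cx_d) * depthMeters / fx_d);
    float y3d = (float)((yd - cy_d) * depthMeters / fy_d);
    return cv::Point3f(x3d, y3d, depthMeters);
}
```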

Once the cameras on the 3D sensor are calibrated we can move on to projector calibration.

3.2.6 Use 3D points for projector intrinsics

With the calibrated 3D sensor, we can use standard methods to detect a calibration pattern within the image. The difference from the ray-plane intersection method is that, since we know the depths of the pixels from the sensor, we do not need a physical calibration pattern, only a known projected pattern and a plane to project onto.

Once projected, the chessboard pattern is visible to the 3D sensor; we detect the chessboard pattern within the 3D sensor's RGB image and get a corresponding depth value for each corner. Thus we get full 3D object coordinates for each corner point, in ambiguous units.

To get real metric units for these points we use data that has been experimentally determined in [12]. The depth sensor values appear to be proportional to the inverse of the actual depth to an object. This relationship was obtained by measuring the distance of the center pixel in the depth image and regressing the data. The following equation transforms the ambiguous raw depth sensor units to meters:

$$d_m = \frac{1.0}{d_{raw} \cdot (-0.0030711016) + 3.3309495161}$$
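Expressed as code, this conversion is a one-line helper (coefficients taken from [12]); the function name is an assumption.

```cpp
// Convert a raw Kinect depth reading to meters using the regression from [12].
double rawDepthToMeters(int rawDepth)
{
    return 1.0 / (rawDepth * -0.0030711016 + 3.3309495161);
}
```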

Figure 35: The distance sensed in meters vs. ambiguous units. [12]

In order to determine usable object points, the depths are taken with respect to the camera capturing the points, and the x and y positions with respect to the origin of the pattern, i.e. the first corner. These values must also be in consistent physical units, in this case meters. With the depth of each point known in meters, the case in Figure 36 can be considered: here all points are projected from real space onto the image plane.

Figure 36: Determining metric distance for object points in the camera.

Here:

$$\lambda \begin{bmatrix} x \\ y \\ f \end{bmatrix} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \quad \text{where } \lambda = \frac{Z}{f}$$

A calibrated camera is needed to determine f, but we have already done that in a previous step. So with that as well as the depths and pixel values of each corner known, we can transform them to meters with the following equations.

$$X = \frac{Z \cdot x}{10 \cdot f_x}, \qquad Y = \frac{Z \cdot y}{10 \cdot f_y}$$

where

$$x = x_{image} - x_{origin\ of\ chessboard}, \qquad y = y_{image} - y_{origin\ of\ chessboard}$$

Knowing the real physical XYZ coordinates of each corner of the projected calibration pattern, we can now use them as object points in the projector calibration. The image points in this case are simply the x, y pixel coordinates of the projected pattern in the projector's image plane, which are just the raw values specified in the projected image. This is the reverse of standard camera calibration: when calibrating a camera, the image points are determined by detecting the chessboard in the camera view, whereas for the projector, which is the opposite of a camera, all image plane coordinates are known in advance.
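A hedged sketch of this projector calibration step, reusing cv::calibrateCamera with the metric 3D corner positions as object points and the known projected pixel coordinates as image points. The wrapper function, the example resolution and the variable names are assumptions.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// The projector is treated as an inverse camera: its "image points" are the
// pixel coordinates of the projected pattern, and its "object points" are
// the metric 3D corner positions recovered from the RGBD sensor.
cv::Mat calibrateProjector(
    const std::vector<std::vector<cv::Point3f> >& objectPoints, // metric, per snapshot
    const std::vector<std::vector<cv::Point2f> >& imagePoints,  // projected pixels
    const cv::Size& projectorResolution)                        // e.g. 1024 x 768
{
    cv::Mat K, distCoeffs;
    std::vector<cv::Mat> rvecs, tvecs;
    cv::calibrateCamera(objectPoints, imagePoints, projectorResolution,
                        K, distCoeffs, rvecs, tvecs);
    return K;   // projector intrinsic matrix
}
```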

3.2.7 Determining Projector Intrinsics

With the set of metric 3D object points and the pixel coordinates of the projected calibration pattern, we can then use standard camera calibration methods, as in ray-plane intersection [8], to determine the projector's intrinsic parameters. Object points are determined by identifying the projected chessboard pattern on a surface in real space.

The metric coordinates of each chessboard corner are determined in the previous step; their depths are with respect to the camera, while their horizontal and vertical distances are with respect to the origin of the calibration pattern.

As with camera calibration, we need several snapshots of the projected calibration pattern from different angles. Again, at least four are required, but more snapshots will improve the estimation. The projector's camera matrix provides the transformation between image points (the pixels of the corners in the projected image) and rays in Euclidean 3D space. With this we are able to get the directions of the projected rays, and these rays define the perspective of objects with respect to the projector, as in Figure 37.

Figure 37: The projector's camera matrix provides information that allows us to understand the perspective of the projected image.

Given $n$ corners with correspondences $x_i \leftrightarrow X_i$, where $x_i$ is an image point and $X_i$ a corresponding 3D scene point, we can compute a projection matrix

$$P = K[R|t]$$

such that $x_i = PX_i$.

Each point correspondence will give two equations:

$$x_i = \frac{p_{11}X_i + p_{12}Y_i + p_{13}Z_i + p_{14}}{p_{31}X_i + p_{32}Y_i + p_{33}Z_i + p_{34}} \qquad \text{and} \qquad y_i = \frac{p_{21}X_i + p_{22}Y_i + p_{23}Z_i + p_{24}}{p_{31}X_i + p_{32}Y_i + p_{33}Z_i + p_{34}}$$

When multiplied out, it can be seen that linear equations are produced:

$$x_i(p_{31}X_i + p_{32}Y_i + p_{33}Z_i + p_{34}) = p_{11}X_i + p_{12}Y_i + p_{13}Z_i + p_{14}$$

$$y_i(p_{31}X_i + p_{32}Y_i + p_{33}Z_i + p_{34}) = p_{21}X_i + p_{22}Y_i + p_{23}Z_i + p_{24}$$

Concatenating the equations from $n \geq 6$ corners generates $2n$ simultaneous equations that can be written as $Ap = 0$, where $A$ is a $2n \times 12$ matrix. This will not have an exact solution, but a linear solution that minimizes $\|Ap\|$ is obtained from the vector corresponding to the smallest singular value of the singular value decomposition of $A$. This solution for a single snapshot is then used as the starting point of a non-linear minimization, over multiple snapshots, of the difference between the measured and projected points:

$$\min_{P} \sum_i^{n} \left\| (x_i, y_i) - P(X_i, Y_i, Z_i, 1) \right\|^2$$
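The linear (DLT) part of this estimation can be sketched as follows; cv::SVD::solveZ returns the unit vector that minimizes |Ap|. This is an illustrative fragment, not the thesis' implementation, and the helper name is an assumption.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Build the 2n x 12 matrix A (two rows per correspondence) and take the
// right singular vector of the smallest singular value as p.
cv::Mat estimateProjectionDLT(const std::vector<cv::Point3f>& X,  // scene points
                              const std::vector<cv::Point2f>& x)  // image points
{
    const int n = (int)X.size();                // needs n >= 6
    cv::Mat A = cv::Mat::zeros(2 * n, 12, CV_64F);

    for (int i = 0; i < n; ++i) {
        double Xi = X[i].x, Yi = X[i].y, Zi = X[i].z;
        double xi = x[i].x, yi = x[i].y;
        double row1[12] = { Xi, Yi, Zi, 1, 0, 0, 0, 0, -xi*Xi, -xi*Yi, -xi*Zi, -xi };
        double row2[12] = { 0, 0, 0, 0, Xi, Yi, Zi, 1, -yi*Xi, -yi*Yi, -yi*Zi, -yi };
        for (int j = 0; j < 12; ++j) {
            A.at<double>(2 * i,     j) = row1[j];
            A.at<double>(2 * i + 1, j) = row2[j];
        }
    }

    cv::Mat p;                                  // 12x1 unit vector minimizing |Ap|
    cv::SVD::solveZ(A, p);
    return p.reshape(1, 3);                     // 3x4 projection matrix P
}
```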

This projection matrix $P$ consists of the projector's intrinsics and extrinsics $[R|t]$. In order to extract the camera matrix $K$, the first $3 \times 3$ submatrix, which is a product of $K$ and $R$, is decomposed with QR decomposition. The translation vector $t$ is then defined as:

$$t = K^{-1}(p_{14}, p_{24}, p_{34})^T$$

The $[R|t]$ in this case is not important and can be discarded; it describes the relationship of the calibration pattern to the projector. Later we will determine the $[R|t]$ of the 3D sensor with respect to the projector, which is what matters.

The resulting intrinsic matrix $K$:

$$K = \begin{bmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

Here $s$ is a skew parameter describing the angle $\theta$ between the x and y axes, $s = \tan(\theta)$, and it should be very small.

To calibrate the projector, a calibration pattern of known pixel size is projected full screen in the projector's window. For this work we use a 6 × 8 calibration pattern with 64 pixel squares, so the image points are all known. The RGBD sensor is placed close to the projector's lens. When the chessboard pattern is detected in the image, the XYZ positions of the points are determined with the methods mentioned above, and after several snapshots ($k \geq 4$) the projector's camera matrix is determined. Below are the projection matrices for three trials, along with the views they produce:

   1163.121 0 396.337        K =   1  0 1079.167 269.162       0 0 1 

Figure 38: Scene generated with intrinsic matrix K1

   933.280 0 384.642        K =   2  0 1048.575 325.577       0 0 1 

Figure 39: Scene generated with intrinsic matrix K2

   1002.738 0 568.881        K =   3  0 925.723 373.264       0 0 1 

Figure 40: Scene generated with intrinsic matrix K3

Just as an example, the following Figure 41 shows the scene without the intrinsic matrix of the projector.

Now that we can properly generate the 3D scene with correct perspective, the next task is to determine the displacement relation between the projector and camera.

Figure 41: Scene generated without the use of intrinsic matrix.

3.2.8 Find R and t between sensor and Projector, Extrinsics

Points in space will be expressed in terms of a different coordinate frame, the world coordinate frame. Two coordinate frames are related via a rotation R and translation t.

The rotation and translation are given by:

$$[R|t] = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix}$$

[R|t] are external or extrinsic camera parameters and are not affected by the focal characteristics of the camera.

The fundamental matrix $F$ encapsulates the projective geometry between two views. It is independent of scene structure and depends only on the cameras' internal parameters and relative pose.

Figure 42: $H_\pi$ is a planar homography and $e'$ is the epipole; together they give the fundamental matrix $F = [e']_\times H_\pi$.

It is a $3 \times 3$ matrix of rank 2. If a point $X$ in 3-space is seen as $x$ in the first view and as $x'$ in the second, the image points must satisfy the relation:

$$x'^T F x = 0$$

It can be computed from image point correspondences alone, without knowledge of the camera intrinsics.

OpenCV has a built in function, cvFindFundamentalMat, that accepts two sets of image points as cvMat's and returns the fundamental matrix F.
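In the C++ API the equivalent call is cv::findFundamentalMat; the small wrapper below is an illustrative sketch, not code from this project.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Estimate the fundamental matrix from matched corner sets using RANSAC
// to reject bad correspondences.
cv::Mat fundamentalFromCorners(const std::vector<cv::Point2f>& pts1,
                               const std::vector<cv::Point2f>& pts2)
{
    return cv::findFundamentalMat(pts1, pts2, cv::FM_RANSAC);
}
```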

The Essential Matrix E is a specialized form of the fundamental matrix. The fundamental matrix can be considered a generalized case of it where cameras are not assumed to be calibrated.

$$\tilde{x}'^T E \tilde{x} = 0$$

where $\tilde{x}'$ and $\tilde{x}$ are homogeneous normalized image coordinates in the two image frames, respectively.

As a consequence of using normalized cameras, their respective coordinate systems are related by the rotation and translation mentioned above. The two sets of 3D coordinates are related by:

$$\tilde{x}' = R(\tilde{x} - t)$$

where $R$ is a $3 \times 3$ rotation matrix and $t$ is a 3-dimensional translation vector. It is then possible to define the essential matrix $E$ as:

$$E = R[t]_\times$$

where $[t]_\times$ is the matrix representation of the cross product with $t$.

The method of determining R and t from E is based on performing singular value decomposition of E.

In order to properly position the 3D scene in the projector's window, we need to shift the scene by the rotation and translation between the sensor and projector. To determine this, as above, we need to compare the two image planes. Since the image plane of the projector is simply what it is projecting, this can be done with the 'solvePnP' algorithm. This gives us a usable rotation measure; stereo calibration still has to be used to get the translation between the two lenses.
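A hedged sketch of that step with OpenCV's solvePnP; the assumption of zero distortion and the function name are ours.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// The 3D corner positions seen by the sensor and their pixel coordinates in
// the projected image give the pose of the pattern relative to the projector.
void poseFromProjectedPattern(const std::vector<cv::Point3f>& objectPoints,
                              const std::vector<cv::Point2f>& projectedPixels,
                              const cv::Mat& projectorK,
                              cv::Mat& rvec, cv::Mat& tvec)
{
    cv::Mat distCoeffs = cv::Mat::zeros(5, 1, CV_64F);  // assume no distortion
    cv::solvePnP(objectPoints, projectedPixels, projectorK, distCoeffs,
                 rvec, tvec);
    // rvec is a Rodrigues rotation vector; cv::Rodrigues(rvec, R) yields the
    // 3x3 rotation matrix if needed.
}
```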

To project overlays onto real world objects, it is desirable to position the sensor as close to the projector's lens as possible; this way the $[R|t]$ values will be small and the scene will not have to be shifted much. The effect is that the scene is shifted to display as if it had been captured from the projector rather than from the sensor. Below are several of the determined rotation and translation vectors. For easier interpretation (and also for OpenGL) we use degrees rather than radians for rotation, and meters in the translation vectors. We use vectors for simplicity as well, since the corresponding full matrices contain additional parameters which are assumed negligible for the close proximity case of video projectors.

     −1.028   0.019              R =   T =    177.50   0.127           0.206   −0.141 

     −1.055   0.022              R =   T =    177.640   0.119           0.285   −0.130 

These rotations and translations depend on the actual placement of the sensor next to the projector, so they differ between trials, but they stay consistent as long as neither device is moved. Figure 43 shows the effect rotation and translation have on the scene display.

Finally, with the correct projection matrix implemented as well as correct rotation and translation the virtual scene and real scene are shown in Figure 44.

76 Figure 43: The scene with default R and T, and on the right, the scene with R and T input.

Figure 44: On the left, the virtual 3D scene that is modified with rotation, translation and projection matrix, on the right, the scene overlaid by projection.

4 Implementation

4.1 Displaying the Augmented Scene

Once the projector's intrinsic matrix and extrinsics $[R|t]$ have been determined, we must figure out how to properly display the scene so that the overlay aligns with the actual objects.

The point cloud captured with the 3D sensor exists in a 3D environment from the point of view of the sensor. When displayed, we have various options for how to view the scene in terms of perspective: the scene can be rotated, skewed, stretched and scaled. The projector calibration gives us all we need to properly transform this cloud for overlay. In essence, we have to rotate the scene so that it appears to be viewed from the projector instead of from the sensor, and then scale and stretch it to mimic the projector's focal parameters.

For these operations, OpenGL provides a perfect platform for display. Here it is easy to perform rotation and translation operations as well as provide a parameterized camera frustum through which to view the scene. When the projector's intrinsics are mimicked by the projection matrix of the frustum, the scene will be accurately projected out.

The OpenGL transformation pipeline can be thought of as the following analogy of a camera viewing a scene.

Figure 45: The OpenGL viewing pipeline

Finally, keep in mind that the viewing transformation commands must be called before any modeling transformations are performed, so that the modeling transformations take effect on the objects first. We have to construct an OpenGL projection matrix from the determined intrinsic parameters.
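One common way to build such a matrix from fx, fy, cx, cy is sketched below. The sign of the principal-point terms depends on how the image axes are mapped to OpenGL clip space, so this is one convention among several, not the thesis' exact code.

```cpp
// Fill a column-major OpenGL projection matrix (for glLoadMatrixd) that
// mimics a pinhole projector with the given intrinsics and image size.
void buildGlProjection(double fx, double fy, double cx, double cy,
                       int width, int height,
                       double zNear, double zFar,
                       double m[16])
{
    for (int i = 0; i < 16; ++i) m[i] = 0.0;

    m[0]  =  2.0 * fx / width;                  // scale x by the focal length
    m[5]  =  2.0 * fy / height;                 // scale y by the focal length
    m[8]  =  1.0 - 2.0 * cx / width;            // principal point offset (x)
    m[9]  =  2.0 * cy / height - 1.0;           // principal point offset (y)
    m[10] = -(zFar + zNear) / (zFar - zNear);   // depth range mapping
    m[11] = -1.0;                               // perspective divide by -z
    m[14] = -2.0 * zFar * zNear / (zFar - zNear);
}

// Usage: glMatrixMode(GL_PROJECTION); glLoadMatrixd(m);
```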


4.2 Evaluating Projection Performance

The main metric for evaluating performance will be reprojection error. This metric has been used as a standard in camera calibration and now in projector calibration systems.

The reprojection error can be considered a geometric distance between a projected point and a measured point. We seek a homography $\hat{H}$ and pairs of matched points $\hat{x}_i$ and $\hat{x}'_i$ that minimize the reprojection error function:

$$\sum_i^{n} d(x_i, \hat{x}_i)^2 + d(x'_i, \hat{x}'_i)^2 \quad \text{where } \hat{x}'_i = \hat{H}\hat{x}_i$$

We will need to determine correspondences between $\hat{x}_i$ and $\hat{x}'_i$ [11]. From this we get the algebraic error vector associated with the point correspondence and the camera mapping. $d(x_i, \hat{x}_i)$ and $d(x'_i, \hat{x}'_i)$ are the geometric distances between corresponding points. The metric is used to quantify how well the estimate of a 3-space point $\hat{X}$ matches its true projection $x$.

Another metric that can be used to compare the different calibration methods is the number of feature points used in the calibration pattern, as well as the types of patterns that produce the most accurate result. In standard Zhang calibration, the chessboard pattern is a 6 x 8 grid, 48 points in total. It may be that the different methods respond differently as the number of points is increased or decreased.

It can already be seen that the method is easier and more practical than traditional calibration methods. As for accuracy, there are many potential sources of error that propagate through to the main metric for evaluating the performance of this projector calibration method, re-projection error. In the first step, the calibration of the depth and image sensors can contribute error, and the resolution of the depth image with the Microsoft Kinect device is currently just 640 x 480. Reasonably practical results can still be expected for implementation.

Figure 46: The reprojection was evaluated by projecting a virtual chessboard pattern over a real one.

In order to evaluate the determined projector intrinsics as well as the relationship between the sensor and projector, the approach taken was to project the calibration pattern on top of itself. First a physical chessboard pattern is placed in the scene; then software that implements the methods in this thesis recognizes it and attempts to re-project points over the detected pattern's corners. After the projection is made, both are seen by a camera which measures the error between the projected corners and the physical corners. The setup is shown in Figure 46.

The following plot in Figure 47 illustrates the results of this test with over 1000 data points.

Figure 47: This plot compares the reprojection error between ray plane intersection and the proposed novel RGBD sensor method.

The distributions of the data sets were much more uniform with ray plane intersection, as shown in Figure 48.

Further comparisons of the data are shown in the table below:

Figure 48: This plot compares the distributions of the pixel locations determined with both projector calibration methods.

                    Ray/Plane    RGBD Sensor
Center Disparity    0.4673       2.8232
Deviation x         0.6145       0.8676
Deviation y         0.6512       1.0143

5 Discussion and Conclusions

Of the three methods described (calibration using arbitrary planes, ray plane intersection and the RGBD sensor method), two were implemented (ray plane intersection and the RGBD sensor method) for comparison. The unimplemented method required a higher resolution camera than that in the RGBD sensor (640 × 480) to produce conclusive results.

5.1 Performance

The ray plane intersection method and RGBD sensor method both produced reasonably accurate overlays, although ray plane intersection had more consistent results. At times, calibration with the RGBD sensor resulted in completely erroneous projection matrices, which were discarded by inspecting the intrinsic parameters. For the projector used, the focal lengths in both the x and y axes were within 30 pixels of 1050, not counting the times the RGBD sensor produced focal lengths orders of magnitude larger or smaller. The principal point cx, cy was close to the projector's center of 512, 384 with both examined methods. The similarity of the results can be attributed to the very low resolution of the RGB camera on the sensor, so any improvement would fall within the bounds of noise.

As for the feasibility of the novel RGBD sensor method, apart from sparse errors, it is a robust and quick method of projector calibration. The results are of reasonable accuracy, and no special physical calibration rigs are required apart from when calibrating the range and RGB cameras on the sensor itself.

When reprojection error was evaluated, the ray plane intersection method did produce better results: the distribution had a smaller deviation and the mean was much closer to the actual location of the physical surface projected onto. The RGBD sensor had a consistent error of about 2.5 pixels, but this can be attributed to errors propagated through the other calibrations along the way. In order to get usable sensor data, the RGB camera must be calibrated with the depth camera, which can introduce additional error affecting the final result. With ray plane intersection there are fewer initial calibrations to do, so fewer error sources are present.

In comparing the homography based and stereo calibration methods of making a correspondence between the range and RGB images, both produced good results with one pixel accuracy. Further experiments showed that the homography method began to falter when evaluated at different depths. These depths, however, were outside of the projector's focus range, so even though these errors exist, they do not affect projector calibration in a practical setting. The stereo calibration did not suffer from these errors, so if a high quality projector with a large focus range is used, this method could fare slightly better at producing accurate projector intrinsics.

5.2 Application

The main advantage of the RGBD sensor calibration method is its ease of implementation. Even though the methods produce slightly different results and one cannot be definitively picked over the other, this method is so easy to deploy that it becomes much more practical in actual application. All that is needed is to project an image, and the calibration occurs automatically. With the other method, a physically marked surface must be moved through the scene at different angles in order to capture a usable intrinsic matrix for the projector. The RGBD sensor calibration method proves to be as practical as the homography calibration in Spatial Augmented Reality displays. Using this method we have been able to correctly overlay spaces with projection, as exemplified in Figure 49.

Figure 49: A 3D scene overlay is demonstrated. Accurate overlay persists at all depths.

The image overlaying the space is just the depth information captured by the RGBD sensor, projected on top of the corresponding objects in the scene. Using a camera we can determine how well the projection matches the corresponding objects. Applications can change what is being projected onto the scene and create visually impressive AR displays which change the way the space is perceived.

5.3 Future Work

There is a lot of potential for further research in the area of calibration of projector-camera systems; this thesis has covered only a small number of potential cases. Work directly related to this content can be extended with the evaluation of a wider variety of sensors and projectors with many resolution combinations. In order to get usable results from calibration with arbitrary planes, a higher resolution camera should be used. It would also be very useful to develop software with easily variable projection matrix parameters to display the resulting scene; the current method of doing so is not efficient and requires some manual code manipulation. The current implementation of the calibration is done in C++ with the open source libraries OpenCV and libfreenect. OpenCV is available for C as well as Python, while libfreenect is just a C library. It would be useful to create a MATLAB script with the help of the camera calibration toolbox; MATLAB would allow for easier and more powerful evaluation of the results as well as encourage academic interest in the area. One other method we would like to explore in the future is calibration with structured light, which has been widely used in the calibration of projector-camera systems and in spatial augmented reality with projection mapping.

5.4 Conclusion

SAR creates a rich medium for new immersive ways to present information, involve the audience and stimulate collaboration and creativity. There has already been interest in SAR for creative art projects. Since SAR requires no portable hardware, permanent installations can be constructed with significantly higher sensing and processing capability. However, the complexity of SAR is increased since tracking and registration must happen not only in the camera view; this view must also be projected out to properly overlay real world objects.

Automatic calibration methods are a much needed enhancement for technology-driven art forms closely related to augmented reality, which have become more and more popular in the past years. In 3D projection mapping, as in SAR, a projector is used to overlay visual effects onto a real world object or scene. This use of image projection has created much excitement within the art community. An example of the use of this technology is projection on buildings to make them look like dynamic structures, enhancing individual walls or creating illusions of nonexistent light sources or warp effects. In order to overcome the distortion that occurs when an image is projected onto an arbitrary surface, the projector must be aligned manually and the image pre-distorted in such a way that the projected image matches the view seen from the projector's position [3]. This can be a painstaking process. Existing camera based automatic calibration methods, as discussed in this thesis, make this procedure easier; however, even though accurate and robust, these methods require elaborate calibration setups. For practical implementation, especially for artists and creative people outside the STEM fields, the most important factor is for things to "just work". With the proposed RGBD sensor - projector calibration, no physical calibration setup is required; only a projected pattern has to be used, allowing users to deploy projection mapping SAR systems in virtually any space.

As a byproduct of this project, it is important to consider the impact such applications can have on STEM education. We face the critical challenge of stimulating student interest in the science, technology, engineering and mathematics disciplines. Often science and engineering discoveries, while extraordinary and necessary, are distant and disconnected from the lay person. With research that is motivated by creativity, we hope to enhance the approachability of the STEM disciplines and not only generate general student interest in STEM but also engage students in interactive and dynamic learning environments. Today's students live in a technologically different society. Their lives are immersed in social media, the interactive environments presented by gaming, and the constant connectivity allowed by mobile devices and computing. A sixth sense of the presence of this digital world is almost created as they integrate with this uninterrupted sensory input. In a way, their reality is already augmented.

The systems they are so integrated in have two main components: the creative application front ends that allow for user interaction and engage the user, and the technologically complex STEM back ends that make the system really work. Younger people are almost always consumers in this new environment. Even though they easily adapt to this new world's components, they are rarely driven to contribute or produce these technology components. They are instead drawn to create content or media similar to that which they are exposed to. Young people want to become writers, artists or musicians. SAR creates an obvious connection between creativity and technology.

When a creative driving force is placed behind STEM concepts, students can become more inclined to explore areas of science and mathematics on their own. In the past few years, artists have explored new technology and integrated it into their works. As they strive for a tangible goal, they are motivated to learn these concepts during the creative process. Without formal education, people in the fine arts can become experts in mathematics. Not only does this thesis present a novel method for calibrating SAR systems; with it we also aim to harness art and creative motivation and use them to create a better educational experience in science, mathematics and engineering.

References

[1] ISMAR 2010 call for papers, January 2010, http://www.ismar10.org/index.php/Call for Papers/Participation.

[2] Design Korea 2010, December 2010, http://www.kimchiandchips.com/link.php.

[3] How to project on 3D geometry [online], March 2010, VVVV.org, http://vvvv.org/documentation/how-to-project-on-3d-geometry.

[4] The Point Cloud Library documentation, March 2011, http://pointclouds.org/about.html.

[5] Gary Bradski and Adrian Kaehler. Learning OpenCV. O'Reilly Media Inc., 2008.

[6] Nicolas Burrus. Kinect calibration [online], June 2011, http://nicolas.burrus.name/index.php/Research/KinectCalibration.

[7] S. Arulampalam et al. A tutorial on particle filters for on-line non-Gaussian Bayesian tracking. In IEEE Trans. on Signal Processing, pages 50-52, 2002.

[8] Gabriel Falcao, Natalia Hurtos, and Joan Massich. Plane-based calibration of a projector-camera system. VIBOT, 2008.

[9] Melissa Gibb. New Media, Art, Design, and the Arduino Microcontroller: A Malleable Tool. Master's thesis, Pratt Institute, New York, NY, 2010.

[10] David Grossman. Blob extraction library documentation, January 2006, http://opencv.willowgarage.com/wiki/cvBlobsLib.

[11] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN: 0521540518, second edition, 2004.

[12] Ivan Dryanovski, William Morris, and Stéphane Magnenat. Automatic Calibration Tests. Willow Garage, 2011.

[13] Georg Klein. Visual Tracking for Augmented Reality. PhD thesis, University of Cambridge, 2006.

[14] Man Chuen Leung, Kai Ki Lee, Kin Hong Wong, and M.M.Y. Chang. A projector-based movable hand-held display system. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 1109-1114, 2009.

[15] Makoto Kimura, Masaaki Mochimaru, and Takeo Kanade. Projector calibration using arbitrary planes and calibrated camera. In IEEE Trans. on Signal Processing, 2007.

[16] Michael Haller, Mark Billinghurst, and Bruce H. Thomas. Emerging Technologies of Augmented Reality: Interfaces and Design. Idea Group Publishing, 2007.

[17] Oliver Bimber and Ramesh Raskar. Spatial Augmented Reality: Merging Real and Virtual Worlds. A K Peters, 2005.

[18] Soon-Yong Park and Go Gwang Park. Active calibration of camera-projector systems based on planar homography. In 2010 International Conference on Pattern Recognition, pages 320-323, 2010.

[19] Jenna Wortham. With Kinect controller, hackers take liberties. New York Times, Nov. 21, 2010.

[20] Zhengyou Zhang. A flexible new technique for camera calibration. In IEEE Transactions on Pattern Analysis and Machine Intelligence, December 2000.