The Robust 3-D Sceneflow

Total Page:16

File Type:pdf, Size:1020Kb

Load more

Technische Universität Berlin THE ROBUST 3-D SCENEFLOW Generalized Video-based 3-D Analysis using Robust Camera and Scene Geometry Estimations Von der Fakult¨at IV - Elektrotechnik und Informatik der Technischen Universit¨at Berlin zur Verleihung des akademischen Grades Doktor der Ingenieurwissenschaften Dr.-Ing. genehmigte Dissertation vorgelegt von Jang Heon Kim aus S¨ud Korea Promotionsausschuss: Vorsitzender: Prof. Dr.-Ing. Klaus-Robert M¨uller Berichter: Prof. Dr.-Ing. Olaf Hellwich Berichter: Prof. Dr.-Ing. Thomas Sikora Tag der wissenschaftlichen Aussprache : 9. 9. 2008 Berlin 2009 D83 Technische Universität Berlin THE ROBUST 3-D SCENEFLOW Generalized Video-based 3-D Analysis using Robust Camera and Scene Geometry Estimations by Jang Heon Kim Master of Science in Electronic Engineering Yonsei University, Seoul, Korea, 2003 A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Engineering in Electrical Engineering and Computer Science Faculty IV. Electrical Engineering and Computer Sciences Technische Universit¨at Berlin Berlin 2009 D83 Copyright c 2008 Jang Heon Kim All Rights Reserved. No part of the publication may be reproduced in any form by print, photoprint, microfilm or any other means without written permission from the publisher. The University reserves the right to make copies of the work for academic purposes within the University and to allow free access to the copy of the thesis retained by the Library. This page must form part of any copies made. In written agreements of sponsorship, the institute retains the ownership of the intellectual property in supported contributions. Technische Universit¨at Berlin Address: Strasse des 17. Juni 135, 10623 Berlin, Germany Phone: +49 (0)30 314-0, Fax: +49 (0)30 314-23222 http://www.tu-berlin.de Institut f¨ur Telekommunkationssysteme Address: Sekr. EN 1, Einsteinufer 17, 10587 Berlin, Germany Phone: +49 (0)30 314-25093, Fax: +49 (0)30 314-22514 http://www.nue.tu-berlin.de Declaration This dissertation is submitted to the Technische Universit¨at Berlin in partial fulfillment for the degree of Doctor of Philosophy. It is an account of work undertaken at the Faculty IV. Electrical Engineering and Computer Sciences between May 2004 and September 2008 under the supervision of Prof. Dr.-Ing. Thomas Sikora. I already read and understand all regulations for completion of the degree of Doctor of Philosophy. This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration. This dissertation is not substantially the same as any I have submitted for a degree or diploma or other qualification at any other University. I further state that no part of my dissertation/thesis has already been, or is being concurrently submitted for any other degree, or other qualification. September, 2008 ——————————————————— Jang Heon Kim iii To my Parents inspiring me and dedicating their whole lives for my education. Contents Declaration iii List of Tables x List of Fingures xiii Acknowledgments xiv Abstract xv Zusammenfassung xvi 1 Introduction 1 1.1 DepthCues .................................................... 1 1.2 3-DVideo...................................................... 3 1.3 Multi-viewApproachesfor3-DMotionAnalysis........................ 6 1.4 ProposedMethod................................................ 9 1.5 BackgroundandContributions..................................... 12 2 Related Works 15 2.1 MultiviewVideoAnalysis ......................................... 15 2.2 Image-basedModelingandRendering................................ 17 2.3 CameraGeometryEstimation...................................... 17 2.4 SceneGeometryEstimation........................................ 20 3 Camera Geometry Estimation 25 3.1 Projective Geometry Estimation . 25 3.2 PinholeCameraModel............................................ 30 3.3 Complete3-DCameraModel ...................................... 33 3.4 Ray-spaceDefinitionusingOpticalFlow.............................. 36 3.5 ExperimentalResults............................................. 39 vii 4 Spatio-temporal Collineation of View Geometry 46 4.1 EpipolarGeometry............................................... 46 4.2 Fundamental Matrix . 48 4.3 Fundamental Matrix and Camera Projections . 49 4.4 Fundamental Matrix Estimation with Parallax . 50 4.5 EssentialMatrixEstimation ....................................... 52 4.6 Maximum and Minimum Bounds of Epipolar Range . 53 4.7 Epipolar Rectification . 54 4.8 TemporalEpipolarGeometry ...................................... 57 4.9 ExperimentalResults............................................. 58 5 Unified Representation of Robust Estimations 68 5.1 RobustnessoftheEstimator....................................... 69 5.2 RobustEstimationwithRandomSampling............................ 70 5.3 Unified Representation of Robust Fundamental Matrix Estimation . 71 5.4 AnisotropicRegularizationusingRobustEstimator..................... 75 5.5 Perceptual Maximum Variation Modeling for Over-diffusion Problem . 82 5.6 Robust Anisotropic Color Image Regularization with Perceptual Maximum VariationModeling .............................................. 87 5.7 ExperimentalResults............................................. 89 6 Robust 3-D Sceneflow Estimation 99 6.1 RobustAnisotropicDisparityEstimation............................. 99 6.2 Anisotropic Disparity Estimation with Perceptual Maximum Variation . 106 6.3 RobustHybridRecursiveOpticalFlowEstimation...................... 107 6.4 3-DSceneflowEstimation ......................................... 117 6.5 ExperimentalResults............................................. 119 7 Confocal Stereo with Pinhole Image Recovery 133 7.1 ProblemofPinholeCameraAssumption.............................. 133 7.2 Real-ApertureStereoCameraAnalysis............................... 134 7.3 ConfocalDisparityEstimation...................................... 135 7.4 ConfocalConstraintofDefocus..................................... 136 7.5 Anisotropic Disparity Estimation with Confocal Constraint and Recovery of PinholeImage .................................................. 138 7.6 ExperimentalResults............................................. 139 8 Applications 143 8.1 Image-andVideo-basedRendering.................................. 143 8.2 DepthSegmentationandImageComposition.......................... 148 viii 8.3 ImageandVideo-basedModeling................................... 152 9 Discussion 155 9.1 SceneflowMethodforRobust3-DVideoAnalysis....................... 155 9.2 Future directions . 157 10 Conclusion 159 A Appendix 160 A.1 Relationship between Collineations and Epipolar Geometry . 160 A.2 Triangulation................................................... 161 A.3 HomographyandCanonicalBasis................................... 162 A.4 Projective Transformations of a Plane . 162 A.5 8-PointAlgorithm............................................... 163 A.6 ComputationofRotationandTranslation............................. 164 A.7 Bundle Adjustment . 166 ix List of Tables 1.1 2-Dand3-Dmotionmodelsincomputervision ........................ 4 3.1 Measures that remain invariant under the transformations in the hierarchy of geometries..................................................... 28 5.1 Several kinds of ρ, ψ and w functions................................. 71 5.2 Graphs of ρ, ψ and w functions..................................... 72 5.3 Algorithmoftheperceptualcolormodeling ........................... 83 5.4 Comparison using Lena image with 4, 10% color impulsive noise . 93 6.1 Performancecomparisonusingpercentoferrorpixels.................... 127 6.2 Performancecomparisonfordiscretemotion........................... 128 x List of Figures 1.1 Humanvisualsystem............................................. 2 1.2 Disparitymap .................................................. 3 1.3 Spatio-temporal motion effects . 4 1.4 Relationship between SfM (Structure-from-Motion) and multi-stereo methods 7 1.5 Blockdiagramoftheproposedmethod............................... 10 1.6 3-Dsceneflowmethod ............................................ 12 2.1 Multiviewcameraconfigurations.................................... 16 2.2 Collineation in camera calibration pattern . 19 2.3 Stereocamerabaselines........................................... 21 2.4 Combinationofwideandnarrowbaselines............................ 22 3.1 Perspective projection and parallelism . 27 3.2 Pinholecameramodel............................................ 31 3.3 Cameracalibrationandskewfactor.................................. 32 3.4 Transformation between the camera and world coordinates . 34 3.5 Plenopticfunction............................................... 36 3.6 Space-timeray-space............................................. 37 3.7 Featuretrackinginaplanarsurface ................................. 40 3.8 Trajectory of moving features by camera motion . 41 3.9 Cameracalibrationresultsfor4frames............................... 41 3.10Relative3-Dpositionbetweencamerasandtrackedpoints................ 42 3.11Optical-ray..................................................... 43 3.12Virtualviewpointinray-space...................................... 44 3.13 Fourier spectral analysis of ray-space . 45 4.1 Epipolargeometry............................................... 47 4.2 Parallax in projection rays . 51 4.3 Epipolar rectification . 55 4.4 Minimum angle to avoid pixel loss in epipolar rectification . 56 4.5 Rectified images . 56 xi 4.6 Spatio-temporalepipolargeometry.................................
Recommended publications
  • A Jackal UGV 79 A.1 Main Characteristics

    A Jackal UGV 79 A.1 Main Characteristics

    POLITECNICO DI TORINO Master of Science in Mechatronic Engineering Master’s Degree Thesis GPS-based autonomous navigation of unmanned ground vehicles in precision agriculture applications Supervisor Candidate Prof. Marcello CHIABERGE Simone CERRATO October 2020 Abstract The global population is growing exponentially and the actual agricultural tech- niques and resources will not be able to feed every person on the Earth in a few years. To account for this serious problem, groups of research are focusing their attention on precision agriculture, because it looks for the improvement of the productivity and efficiency of both agricultural and farming production processes, while reducing the environmental impact, exploiting automation and robotics. The thesis aims to design and develop a solution, based on GPS, for the autonomous navigation problem in precision agriculture, using only few sensors: an Inertial Measurement Unit, a GPS receiver and a depth camera, in order to be cost effec- tive. The proposed goal has been achieved through a system of inter-operating sub-components, that have to share information and collaborate each other in order to provide a complete autonomous navigation. In particular, the main involved entities are: a localization filter, a global and a local path planning algorithms and an obstacle avoidance approach, that have been developed and can cooperate each other by means of the Robot Operating System. Eventually, the proposed solution has been tested in a simulation environment, through different possible scenarios providing good results in each of them. However, it may be considered as a starting point for future improvement in the field of autonomous navigation for precision agriculture.
  • Optimised Implementation of Dense Stereo Correspondence for Resource Limited Hardware

    Optimised Implementation of Dense Stereo Correspondence for Resource Limited Hardware

    Journal of Computer Science Original Research Paper Optimised Implementation of Dense Stereo Correspondence for Resource Limited Hardware 1,2 Deepambika Vadekkettathu Anirudhan and 3,4 Malangai Abdul Rahiman 1Department of ECE, Karpagam Academy of Higher Education, Coimbatore, Tamilnadu, India 2Department of ECE, LBS Institute of Technology for Women, Trivandrum, Kerala, India 3Karpagam Academy of Higher Education, Coimbatore, Tamilnadu, India 4Pro-Vice Chancellor, APJ Abdul Kalam Technological University, Trivandrum, Kerala, India Article history Abstract: Computer stereo vision is a passive sensing technique that Received: 21-06-2018 helps to recover 3D information of an environment from 2D images. Revised: 16-07-2018 The stereo correspondence is a challenging task that finds out matching Accepted: 17-10-2018 pixels between the stereo image pair based on Lambertian criteria and its result is a disparity space image. The depth of the objects from the Corresponding Author: Deepambika V.A. camera can be calculated from this disparity value by using the principle Department of ECE, LBS of triangulation. For the vision guided robot navigation, the requirement Institute of Technology for of stereo matching algorithms on low power dedicated hardware that Women, Trivandrum, Kerala, can achieve a high frame rate is unambiguous. A new, highly optimized India implementation of correlation based, Sum of Absolute Differences Email: [email protected] correspondences algorithm on a low cost resource limited FPGA is presented here. This System-on-Programmable-Chip architecture based system achieved a higher frame rate of 50 fps with 64 disparity levels without using a microprocessor. On performance evaluation, the disparity map shows a maximum error of 0.308% only.
  • Real Time Distance Calculation Using Stereo Vision Technique

    Real Time Distance Calculation Using Stereo Vision Technique

    Real Time Distance Calculation using Stereo Vision Technique Session 2005-2009 Project Advisor Mr. Khan Asmar Azar Submitted by: Ahmed Tassaduq 050620-158 Aisha Ashraf 050620-128 Fatima Zehra Hassan 060820-089 Shajieuddin Hyder Khan 050620-086 Department of Electrical Engineering University of Management and Technology Real Time Distance Calculation Using Stereo Vision Technique Page i A report submitted to the Department of Electrical Engineering In partial fulfillment of the requirements for the Degree Bachelor of Science in Electrical Engineering by Ahmed Tassaduq (050620-158) Aisha Ashraf (050620-128) Fatima Zehra Hassan (060820-089) Shajieuddin Hyder Khan (050620-086) University of Management and Technology October 12, 2009 Real Time Distance Calculation Using Stereo Vision Technique Page ii CERTIFICATE OF APPROVAL It is certified that the work contained in this project report, entitled “Real Time Distance Calculation using Stereo Vision Technique” carried out by Ahmed Tassaduq (050620-158) Aisha Ashraf (050620-128) Fatima Zehra Hassan (060820-089) Shajieuddin Hyder Khan (050620-086) Under the supervision of Mr. Khan Asmar Azar for the partial fulfillment of the degree requirement of Bachelor in Electrical Engineering Approved By ________________ __________________ Dr. Aziz Bhatti Khan Asmar Azar Dean SST Project Advisor Real Time Distance Calculation Using Stereo Vision Technique Page iii Acknowledgements We truly acknowledge the cooperation and help make by our advisor Mr. Khan Asmar Azar, University of Management and Technology. He has been a constant source of guidance throughout the course of this project. We would also like to thank Mr. Saeed-ur-Rehman Turk, Government College University, Lahore for his help and guidance in understanding many important issues.
  • Real-Time Stereo Vision on Fpgas with Scenescan

    Real-Time Stereo Vision on Fpgas with Scenescan

    Real-Time Stereo Vision on FPGAs with SceneScan Konstantin Schauwecker1 Nerian Vision GmbH, Gutenbergstr. 77a, 70197 Stuttgart, Germany www.nerian.com Abstract We present a flexible FPGA stereo vision implementa- tion that is capable of processing up to 100 frames per second or image resolutions up to 3.4 megapixels, while consuming only 8 W of power. The implementation uses a variation of the Semi- Global Matching (SGM) algorithm, which provides superior re- sults compared to many simpler approaches. The stereo match- ing results are improved significantly through a post-processing chain that operates on the computed cost cube and the disparity map. With this implementation we have created two stand-alone hardware systems for stereo vision, called SceneScan and Scene- Scan Pro. Both systems have been developed to market maturity and are available from Nerian Vision GmbH. Keywords stereo vision, depth sensing, FPGA, semi-global arXiv:1809.07977v1 [cs.CV] 21 Sep 2018 matching 1 Introduction Computer stereo vision is one of the best researched areas in the field of computer vision. Its origins date back to the 1970s, and it has since seen significant scientific advancement. Compared to other approaches for dense depth perception – such as time-of-flight cameras or struc- tured light camera systems – stereo vision has the advantage of being a passive technology. This means that apart from the ambient light, no further active light source is required for performing measurements. Active camera systems struggle in situations with bright ambient light such as outdoors during bright sunlight. In this case, the active 2 K. Schauwecker light source can no longer provide sufficient contrast from the ambi- ent light, and hence measurements become impossible.
  • PMF: a Stereo Correspondence Algorithm Using a Disparity Gradient Limit Stephen B Pollard, John E W Mayhew and John P Frisby

    PMF: a Stereo Correspondence Algorithm Using a Disparity Gradient Limit Stephen B Pollard, John E W Mayhew and John P Frisby

    [1] PMF: A Stereo Correspondence Algorithm Using a Disparity Gradient Limit Stephen B Pollard, John E W Mayhew and John P Frisby AI Vision Research Unit, University of Sheffield, Sheffield S10 2TN, UK Reprinted, with permission of Pion Ltd, from Perception, 1985, 14, 449-470. ABSTRACT world, whereas this is not true ofincorrect matches. The value of 1 is not in itself critical - all that is required is a limit that The advantages of solving the stereo correspondence problem balances satisfactorily the competing requirements of by imposing a limit on the magnitude of allowable disparity disambiguating power and ability to deal with as wide a range gradients are examined. It is shown how the imposition of of surfaces as possible (section 2). However, we have chosen such a limit can provide a suitable balance between the twin 1 because this seems to be roughly the limit found for the requirements of disambiguating power and the ability to deal human visual system (Burt and Julesz 1980). with a wide range of surfaces. Next, the design of a very simple stereo algorithm called PMF is described. In In SUbsequent sections we describe the design and performance conjunction with certain other constraints used in many other of an algorithm called P:MFthat is based upon this constraint, stereo algorithms, P:MF employs a limit on allowable and several others that have been identified previously1. disparity gradients of 1, a value that coincides with that Finally we briefly discuss how the theory underlying PMF reported for human stereoscopic vision. The excellent relates to other computational theories of how to solve the performance of P1vIF is illustrated on a series of natural and stereo correspondence problem.
  • Computer Stereo Vision for Autonomous Driving

    Computer Stereo Vision for Autonomous Driving

    Computer Stereo Vision for Autonomous Driving Rui Fan, Li Wang, Mohammud Junaid Bocus, Ioannis Pitas Abstract As an important component of autonomous systems, autonomous car perception has had a big leap with recent advances in parallel computing archi- tectures. With the use of tiny but full-feature embedded supercomputers, com- puter stereo vision has been prevalently applied in autonomous cars for depth perception. The two key aspects of computer stereo vision are speed and ac- curacy. They are both desirable but conflicting properties, as the algorithms with better disparity accuracy usually have higher computational complexity. Therefore, the main aim of developing a computer stereo vision algorithm for resource-limited hardware is to improve the trade-off between speed and accu- racy. In this chapter, we introduce both the hardware and software aspects of computer stereo vision for autonomous car systems. Then, we discuss four au- tonomous car perception tasks, including 1) visual feature detection, description and matching, 2) 3D information acquisition, 3) object detection/recognition and 4) semantic image segmentation. The principles of computer stereo vision and parallel computing on multi-threading CPU and GPU architectures are then detailed. Rui Fan UC San Diego, e-mail: [email protected] Li Wang ATG Robotics, e-mail: [email protected] Mohammud Junaid Bocus University of Bristol, e-mail: [email protected] Ioannis Pitas Aristotle University of Thessaloniki, e-mail: [email protected] 1 2 Rui Fan, Li Wang, Mohammud Junaid Bocus, Ioannis Pitas 1 Introduction Autonomous car systems enable self-driving cars to navigate in complicated environments, without any intervention of human drivers.
  • A1the Eye in Detail

    A1the Eye in Detail

    A. The Eye A1. Eye in detail EYE ANATOMY A guide to the many parts of the human eye and how they function. The ability to see is dependent on the actions of several structures in and around the eyeball. The graphic below lists many of the essential components of the eye's optical system. When you look at an object, light rays are reflected from the object to the cornea , which is where the miracle begins. The light rays are bent, refracted and focused by the cornea, lens , and vitreous . The lens' job is to make sure the rays come to a sharp focus on the retina . The resulting image on the retina is upside-down. Here at the retina, the light rays are converted to electrical impulses which are then transmitted through the optic nerve , to the brain, where the image is translated and perceived in an upright position! Think of the eye as a camera. A camera needs a lens and a film to produce an image. In the same way, the eyeball needs a lens (cornea, crystalline lens, vitreous) to refract, or focus the light and a film (retina) on which to focus the rays. If any one or more of these components is not functioning correctly, the result is a poor picture. The retina represents the film in our camera. It captures the image and sends it to the brain to be developed. The macula is the highly sensitive area of the retina. The macula is responsible for our critical focusing vision. It is the part of the retina most used.
  • Distance Estimation from Stereo Vision: Review and Results

    Distance Estimation from Stereo Vision: Review and Results

    DISTANCE ESTIMATION FROM STEREO VISION: REVIEW AND RESULTS A Project Presented to the faculty of the Department of the Computer Engineering California State University, Sacramento Submitted in partial satisfaction of the requirements for the degree of MASTER OF SCIENCE in Computer Engineering by Sarmad Khalooq Yaseen SPRING 2019 © 2019 Sarmad Khalooq Yaseen ALL RIGHTS RESERVED ii DISTANCE ESTIMATION FROM STEREO VISION: REVIEW AND RESULTS A Project by Sarmad Khalooq Yaseen Approved by: __________________________________, Committee Chair Dr. Fethi Belkhouche __________________________________, Second Reader Dr. Preetham B. Kumar ____________________________ Date iii Student: Sarmad Khalooq Yaseen I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the project. __________________________, Graduate Coordinator ___________________ Dr. Behnam S. Arad Date Department of Computer Engineering iv Abstract of DISTANCE ESTIMATION FROM STEREO VISION: REVIEW AND RESULTS by Sarmad Khalooq Yaseen Stereo vision is the one of the major researched domains of computer vision, and it can be used for different applications, among them, extraction of the depth of a scene. This project provides a review of stereo vision and matching algorithms, used to solve the correspondence problem [22]. A stereo vision system consists of two cameras with parallel optical axes which are separated from each other by a small distance. The system is used to produce two stereoscopic pictures of a given object. Distance estimation between the object and the cameras depends on two factors, the distance between the positions of the object in the pictures and the focal lengths of the cameras [37].
  • Stereo Vision System Module for Low-Cost Fpgas for Autonomous Mobile Robots

    Stereo Vision System Module for Low-Cost Fpgas for Autonomous Mobile Robots

    STEREO VISION SYSTEM MODULE FOR LOW-COST FPGAS FOR AUTONOMOUS MOBILE ROBOTS A Thesis Presented to the Faculty of California Polytechnic State University San Luis Obispo In Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Science by Connor Citron August 2014 c 2014 Connor Citron ALL RIGHTS RESERVED ii COMMITTEE MEMBERSHIP TITLE: Stereo Vision System Module for Low-Cost FPGAs for Autonomous Mobile Robots AUTHOR: Connor Citron DATE SUBMITTED: August 2014 COMMITTEE CHAIR: Associate Professor John Seng, Ph.D. Department of Computer Science COMMITTEE MEMBER: Professor Franz Kurfess, Ph.D. Department of Computer Science COMMITTEE MEMBER: Associate Professor Chris Lupo, Ph.D. Department of Computer Science iii ABSTRACT Stereo Vision System Module for Low-Cost FPGAs for Autonomous Mobile Robots Connor Citron Stereo vision uses two adjacent cameras to create a 3D image of the world. A depth map can be created by comparing the offset of the corresponding pixels from the two cameras. However, for real-time stereo vision, the image data needs to be processed at a reasonable frame rate. Real-time stereo vision allows for mobile robots to more easily navigate terrain and interact with objects by providing both the images from the cameras and the depth of the objects. Fortunately, the image processing can be parallelized in order to increase the processing speed. Field-programmable gate arrays (FPGAs) are highly parallelizable and lend themselves well to this problem. This thesis presents a stereo vision module which uses the Sum of Absolute Differ- ences (SAD) algorithm. The SAD algorithm uses regions of pixels called windows to compare pixels to find matching pairs for determining depth.
  • User-Oriented Markerless Augmented Reality Framework Based on 3D Reconstruction and Loop Closure Detection

    User-Oriented Markerless Augmented Reality Framework Based on 3D Reconstruction and Loop Closure Detection

    The University of Birmingham A USER-ORIENTED MARKERLESS AUGMENTED REALITY FRAMEWORK BASED ON 3D RECONSTRUCTION AND LOOP CLOSURE DETECTION by YUQING GAO A thesis submitted to the University of Birmingham for the degree of DOCTOR OF PHILOSOPHY (Ph.D.) School of Engineering Department of Electronic, Electrical & Systems Engineering University of Birmingham November 2016 University of Birmingham Research Archive e-theses repository This unpublished thesis/dissertation is copyright of the author and/or third parties. The intellectual property rights of the author or third parties in respect of this work are as defined by The Copyright Designs and Patents Act 1988 or as modified by any successor legislation. Any use made of information contained in this thesis/dissertation must be in accordance with that legislation and must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the permission of the copyright holder. ABSTRACT Visual augmented reality aims to blend computer-generated graphics with real-world scenes to enhance the end users’ perception and interaction with a given environment. For those AR applications which require registering the augmentations in the exact locations within the real scene, the system must track the user’s camera motion and estimate the spatial relationship between the user and the environment. Vision-based tracking is one of the more effective tracking methods, and uses relatively low-cost and easy-access cameras as input devices and, furthermore, exploits several computer vision (CV) techniques to solve the problem. It can typically be divided into marker-based and markerless methods. The marker-based AR applications have, in the past, proved sufficiently robust and accurate, and the marker-based tracking methods have been widely supported by almost every AR software development kit developed and marketed to date.
  • Towards Hardware Accelerated Rectification of High Speed Stereo Image Streams

    Towards Hardware Accelerated Rectification of High Speed Stereo Image Streams

    M¨alardalenUniversity School of Innovation Design and Engineering V¨aster˚as,Sweden Thesis for the Degree of Master of Science with Specialization in Embedded Systems - 30.0 credits TOWARDS HARDWARE ACCELERATED RECTIFICATION OF HIGH SPEED STEREO IMAGE STREAMS Sudhangathan Bankarusamy [email protected] Examiner: Mikael Ekstr¨om M¨alardalenUniversity, V¨aster˚as,Sweden Supervisor: Carl Ahlberg M¨alardalenUniversity, V¨aster˚as,Sweden November 19, 2017 M¨alardalenUniversity Master Thesis Abstract The process of combining two views of a scene in order to obtain depth information is called stereo vision. When the same is done using a computer it is then called computer stereo vision. Stereo vision is used in robotic application where depth of an object plays a role. Two cameras mounted on a rig is called a stereo camera system. Such a system is able to capture two views and enable robotic application to use the depth information to complete tasks. Anomalies are bound to occur in such a stereo rig, when both the cameras are not parallel to each other. Mounting of the cameras on a rig accurately has physical alignment limitations. Images taken from such a rig has inaccurate depth information and has to be rectified. Therefore rectification is a pre-requisite to computer stereo vision. One such a stereo rig used in this thesis is the GIMME2 stereo camera system. The system has two 10 mega-pixel cameras with on-board FPGA, RAM, processor running Linux operating system, multiple Ethernet ports and an SD card feature amongst others. Stereo rectification on memory constrained hardware is a challenging task as the process itself requires both the images to be stored in the memory.
  • Computed Phase Stereo Lensless X-Ray Imaging

    Computed Phase Stereo Lensless X-Ray Imaging

    Computed phase stereo lensless X-ray imaging 1 1 1 1,2 2 3 4 4 J. Duarte , R. Cassin , J. Huijts , B. Iwan , M. Kovacev , M. Fajardo , F. Fortuna , L. Delbecq , W. Boutu1 and H. Merdji1* 1LIDYL, CEA, CNRS and Université Paris-Saclay, CEA Saclay 91191 Gif-sur-Yvette, France. 2Leibniz Universität Hannover, Institut für Quantenoptik, Hannover, Germany 3Instituto de Plasmas e Fusão Nuclear, IST Lisboa, Portugal 4 CSNSM, CNRS/IN2P3 and Université Paris-Saclay, 91405 Orsay, France. *correspondence to: [email protected] The ability to gain insights into the 3D properties of artificial or biological systems is often critical. However, 3D structures are difficult to retrieve at low dose and with extremely fast processing, as most techniques are based on acquiring and computing hundreds of 2D angular projections. This is even more challenging with ultrashort X-rays which allow realizing nanometre scale studies1–5 and ultrafast time resolved 2D movies6. Here we show that computer stereo vision concepts can be transposed to X-rays. We demonstrate nanoscale three-dimensional reconstruction from a single ultrafast acquisition. Two diffraction patterns are recorded simultaneously on a single CCD camera and after phase retrieval two stereo images are reconstructed. A 3D representation of the sample is then computed from quantitative disparity maps with about 130x130x380nm3 voxel resolution in a snapshot of 20 femtoseconds. We extend our demonstration to phase contrast X-ray stereo imaging and reveal hidden 3D features of a sample. Computed phase stereo imaging will find scientific applications at X-ray free electron lasers, synchrotrons and laser-based sources, but also in fast industrial and medical 3D diagnostics.