Realtime Mapping and Scene Reconstruction Based on Mid-Level Geometric Features
REALTIME MAPPING AND SCENE RECONSTRUCTION BASED ON MID-LEVEL GEOMETRIC FEATURES

A Dissertation Submitted to the Temple University Graduate Board In Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

by Kristiyan Georgiev
August 2014

Examining committee members:
Dr Rolf Lakaemper, Advisory Chair, Dept. of Computer and Information Sciences
Dr Alexander Yates, Department of Computer and Information Sciences
Dr Longin Jan Latecki, Department of Computer and Information Sciences
Dr M. Ani Hsieh, External Member, Drexel University

Copyright © 2014 by Kristiyan Georgiev

ABSTRACT

Robot mapping is a major field of research in robotics. Its basic task is to combine (register) spatial data, usually gained from range devices, into a single data set. This data set, called the global map, represents the environment as observed from different locations, usually without knowledge of their positions. Existing approaches can be classified into groups based on the type of sensor, e.g. laser scanners, the Microsoft Kinect, or stereo image pairs. A major disadvantage of current methods is that they are derived from poorly scalable 2D approaches that operate on small amounts of data, whereas 3D sensing yields a large amount of data in each 3D scan. Autonomous mobile robots have limited computational power, which makes it hard to run 3D robot mapping algorithms in real time. To remedy this limitation, the proposed research uses mid-level geometric features (lines and ellipses) to construct 3D geometric primitives (planar patches, cylinders, spheres and cones) from 3D point data. Such 3D primitives can serve as distinct features for faster registration, allowing real-time performance on a mobile robot; for example, the approach detects planes from a Microsoft Kinect at 30 frames per second, where previous approaches fall short of real-time performance.
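To illustrate the kind of mid-level feature extraction described above, a least-squares line fit can be maintained with running sums so that each new point costs O(1). This is a simplified, hypothetical sketch for intuition only, not the dissertation's actual implementation; the class and method names are invented for illustration.

```python
import math

class IncrementalLineFit:
    """Least-squares line fit with an O(1) update per point.

    Maintains running sums (moments) so that adding a point never
    requires revisiting earlier data. Illustrative sketch only.
    """

    def __init__(self):
        self.n = 0
        self.sx = self.sy = 0.0
        self.sxx = self.syy = self.sxy = 0.0

    def add(self, x, y):
        # Constant-time update: only the running sums change.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y

    def direction(self):
        """Angle of the principal axis of the fitted line."""
        mx, my = self.sx / self.n, self.sy / self.n
        cxx = self.sxx / self.n - mx * mx
        cyy = self.syy / self.n - my * my
        cxy = self.sxy / self.n - mx * my
        # Principal-axis angle of the 2x2 covariance matrix:
        # tan(2*theta) = 2*cxy / (cxx - cyy)
        return 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
```

Because the fit depends only on the accumulated moments, merging or extending segments during region growing stays cheap regardless of how many points each segment already contains.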
At its core, the algorithm performs fast model fitting with a constant-time (O(1)) model update for each new data point, using a three-stage approach. The first stage inspects 1.5D subspaces to find lines and ellipses. The next stage takes these lines and ellipses as input and examines their neighborhood structure to form candidate sets for the 3D geometric primitives. Finally, the candidates are fitted to the geometric primitives. The complexity of point processing is O(n); additional time of lower order is needed for working on the significantly smaller set of mid-level objects. The real-time performance suggests this approach as a pre-processing step for higher-level real-time 3D tasks in robotics, such as tracking or feature-based mapping.

In this thesis, I show how these features are derived and used for scene registration. Optimal registration is determined by finding plane-feature correspondences based on mutual similarity and geometric constraints. Our approach determines the plane correspondence in three steps. The first step computes the distance between all pairs of planes from the first scan and all pairs of planes from the second scan; the distance function captures angular, distance and co-planarity differences, and the resulting distances are accumulated in a distance matrix. The next step uses the distance matrix to compute the correlation matrix between planes from the first and second scan. Finally, the plane correspondence is found as the globally optimal assignment from the correlation matrix. After the plane correspondence is found, an optimal pose registration is computed. In addition, I provide a comparison to existing state-of-the-art algorithms.

This work is part of an industry collaboration effort sponsored by the National Institute of Standards and Technology (NIST), aiming at performance evaluation and modeling of autonomous navigation in unstructured and dynamic environments.
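The globally optimal assignment step can be sketched as follows. This is a deliberately simplified illustration: the plane-to-plane distance below combines only angular and offset terms (omitting the co-planarity term and the pairwise distance-matrix and correlation-matrix stages described above), and it enumerates all one-to-one assignments by brute force, which is viable for the handful of planes a typical scan yields. All function names and weights are illustrative.

```python
import itertools

def plane_distance(p, q, w_angle=1.0, w_offset=1.0):
    """Simplified distance between two planes, each given as
    (unit_normal, offset). Combines an angular term with an
    offset term; illustrative only."""
    (na, da), (nb, db) = p, q
    dot = sum(a * b for a, b in zip(na, nb))
    # abs(dot) so that opposite normals of the same plane match.
    return w_angle * (1.0 - abs(dot)) + w_offset * abs(da - db)

def best_assignment(planes1, planes2):
    """Globally optimal one-to-one plane assignment by brute-force
    enumeration of all permutations (fine for small plane counts)."""
    best, best_cost = None, float("inf")
    for perm in itertools.permutations(range(len(planes2)), len(planes1)):
        cost = sum(plane_distance(planes1[i], planes2[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(best), best_cost
```

For larger plane counts, the brute-force enumeration would be replaced by a polynomial-time assignment solver such as the Hungarian algorithm; the point here is only the structure of the cost-then-assign pipeline.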
Additional field work, in the form of evaluation of real robotic systems in a robot test arena, was performed.

To my parents Rosen and Rositsa. To my grandparents Ivanka, Katya and Penko. To Winnie. This dissertation is a result of their continuous love and support. In loving memory of my grandfather Kostadin Georgiev.

ACKNOWLEDGEMENTS

I am sincerely thankful to my advisor Dr Rolf Lakaemper, whose encouragement, guidance and support from the initial to the final level enabled me to gain understanding of the subject and develop as a researcher. His enormous knowledge, enthusiasm and extraordinary attention to detail were a great inspiration throughout my studies and had a major influence in shaping my critical thinking in scientific research.

During my graduate studies I was fortunate to work as a teaching assistant for Prof. Rolf Lakaemper, Prof. Ola Ajaj and Prof. Pei Wang. Their experience and enthusiasm about the topics being taught were a great inspiration throughout my years of teaching. I am also thankful to all my students at the Department of Computer and Information Sciences.

I would also like to thank Dr. Alexander Yates, Dr. Longin Jan Latecki and Dr. M. Ani Hsieh for serving on my dissertation committee and for providing their invaluable feedback. I am very thankful for the opportunity to work closely with my lab colleagues Motaz AlHami and Ross Creed. Special thanks to Dr. Raj Madhaven for his scientific pursuit and dedication during our time at the Robotics Technology Park (RTP) near Huntsville, AL. Special thanks to Jan Elseberg and Dorit Borman for clarifying the use of the SLAM6D ICP implementation, as well as Dirk Holtz and Moataz Elmasry for their PCL- and ROS-compatible implementation of my plane extraction algorithm. I thank Dr. Maurice Wright and Dr. Rolf Lakaemper for allowing me to help with and experience the production of the robot opera GALATEA RESET.
The author gratefully acknowledges the financial support from Temple University and grant ARRA-NIST-10D012 of the National Institute of Standards and Technology (NIST). Last but not least, a big thank you to all my friends through all these years.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
1 INTRODUCTION
  1.1 Mobile Robots
  1.2 Robot Sensors
    1.2.1 3D Sensors
  1.3 Robot Mapping
    1.3.1 3D Data Acquisition
2 MID-LEVEL GEOMETRY FEATURE (MLGF) REPRESENTATION
  2.1 Introduction
  2.2 Related Work
  2.3 Points to Line-Segments to Planes
    2.3.1 Line Segments
    2.3.2 Segment Neighborhood N and Connected Components
    2.3.3 Planes
    2.3.4 Plane Update
    2.3.5 Region Growing
    2.3.6 Line Segment Extraction
  2.4 Ellipse Extraction
    2.4.1 Motivation
    2.4.2 System Limitations
  2.5 3D Object Extraction
  2.6 Experiments
    2.6.1 Improving the System using Kalman Filtering
    2.6.2 Basic Test: Object Detection, Detection Speed
    2.6.3 Comparison to RANSAC
    2.6.4 Comparison to RANSAC for Ellipse Fitting
    2.6.5 Robustness of Mid-Level Geometric Features to Noise
    2.6.6 Accuracy
    2.6.7 Occlusion
    2.6.8 Object Tracking
  2.7 Conclusion and Future Work
3 CORRESPONDENCES BASED ON MID-LEVEL GEOMETRIC FEATURES
  3.1 Introduction
  3.2 Related Work
  3.3 Methodology
    3.3.1 Plane Correspondence
    3.3.2 Pose-Registration Using Planar Patches
  3.4 Plane-Correspondence Experiments
    3.4.1 Comparison to ICP
    3.4.2 Comparison to PRRUS
  3.5 Conclusion
4 ADDITIONAL PRACTICAL EXPERIENCE
  4.1 Benchmarking and Standardization
    4.1.1 NIST Test Arena
  4.2 Robot Opera
5 CONCLUSION
BIBLIOGRAPHY

LIST OF TABLES

1.1 Range sensors overview
2.1 Overall segmentation results on all 30 test images, assuming 80% pixel overlap as described by [23]
2.2 Accuracy test. Left table: without Kalman filter. DG: ground-truth distance, DM: mean measured distance, Dσ: standard deviation of distance measurements, RM: mean measured radius, Rσ: standard deviation of radius measurements, N: number of measurements. Right table: with Kalman filter, labels accordingly
2.3 Occlusion experiment of a cylinder (R = 0.2 m) at different distances (DG), with and without Kalman filtering. O: occlusion percentage (10–40); the algorithm does not handle occlusions ≥ 50%. R: measured radius, Rσ: standard deviation of radius, RK, RKσ: with Kalman filter, N: number of measurements
2.4 Observations of a moving cylinder. Estimated velocity VM is computed using the observed change in distance over time and is compared to the ground-truth velocity (VG); Vσ denotes standard deviation in velocity. Similar computation is shown using the Kalman filter (VK, VKσ). N: number