Real-Time Scene Reconstruction and Triangle Mesh Generation Using Multiple RGB-D Cameras

Siim Meerits, Vincent Nozick, Hideo Saito
Journal of Real-Time Image Processing, https://doi.org/10.1007/s11554-017-0736-x (Original Research Paper)
Received: 3 April 2017 / Accepted: 26 September 2017. © The Author(s) 2017. This article is an open access publication.

Author affiliations: Siim Meerits (1), Vincent Nozick (2,3), Hideo Saito (1). (1) Department of Information and Computer Science, Keio University, Yokohama, Japan; (2) Institut Gaspard Monge, Université Paris-Est Marne-la-Vallée, Champs-sur-Marne, France; (3) Japanese-French Laboratory for Informatics, Tokyo, Japan.

Abstract: We present a novel 3D reconstruction system that can generate a stable triangle mesh using data from multiple RGB-D sensors in real time for dynamic scenes. The first part of the system uses moving least squares (MLS) point set surfaces to smooth and filter point clouds acquired from RGB-D sensors. The second part of the system generates triangle meshes from the point clouds. The whole pipeline is executed on the GPU and is tailored to scale linearly with the size of the input data. Our contributions include changes to the MLS method that improve meshing, a fast triangle mesh generation method and GPU implementations of all parts of the pipeline.

Keywords: 3D reconstruction · Meshing · Mesh zippering · RGB-D cameras · GPU

1 Introduction

3D surface reconstruction and meshing methods have been researched for decades in the computer vision and computer graphics fields. Their applications are numerous, with practical uses in fields such as archeology [42], cinematography [45] and robotics [31]. A majority of the works produced so far have focused on static scenes. However, many interesting applications, such as telepresence [13, 37, 44], require 3D reconstruction in a dynamic environment, i.e. in a scene where geometric and colorimetric properties are not constant over time.

For most 3D reconstruction systems, the process can be divided conceptually into three stages:

1. Data acquisition: Traditionally, acquiring the 3D structure of an environment in real time using stereo or multi-view stereo algorithms has been a challenging and computationally expensive stage. The advent of consumer RGB-D sensors has enabled the 3D reconstruction process to become truly real-time, but RGB-D devices still have drawbacks of their own. The generated depth maps tend to be noisy, and scene coverage is restricted due to the sensor's limited focal length, making the following process stages more difficult to achieve. Many reconstruction systems therefore start with a preprocessing stage to reduce some of the noise inherent to RGB-D sensors.

2. Surface reconstruction: The surface reconstruction stage consolidates the available 3D information into a single consistent surface.

3. Geometry extraction: After a surface has been defined, it should be converted to a geometric representation that is useful for the particular application. Some commonly used formats are point clouds, triangle meshes and depth maps.

Our work involves both surface reconstruction and triangle mesh generation. We enhance a moving least squares (MLS)-based surface reconstruction method to fit our needs. We generate triangle meshes in the geometry extraction stage, since they are the most commonly used representation in computer graphics and are view independent.
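To make the division of labor between the three stages concrete, the following CPU-only Python sketch shows how a single RGB-D frame could flow through them. It is purely illustrative: every function name is a placeholder, stages 2 and 3 are stubs, and none of it corresponds to the GPU implementation described in this paper.

import numpy as np

def acquire_point_cloud(depth_map, color_image, intrinsics, pose):
    """Stage 1: back-project one (noisy) depth map into a world-space point cloud."""
    fx, fy, cx, cy = intrinsics
    h, w = depth_map.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth_map > 0                        # drop pixels with no depth reading
    z = depth_map[valid]
    x = (us[valid] - cx) * z / fx
    y = (vs[valid] - cy) * z / fy
    pts_cam = np.stack([x, y, z], axis=1)
    # move camera-space points into a common world frame (pose is a 4x4 matrix)
    pts_world = pts_cam @ pose[:3, :3].T + pose[:3, 3]
    return pts_world, color_image[valid]

def reconstruct_surface(points):
    """Stage 2 (stub): smooth and filter the merged cloud, e.g. by MLS projection."""
    return points

def extract_mesh(points):
    """Stage 3 (stub): convert the refined points into a triangle mesh."""
    vertices, triangles = points, np.empty((0, 3), dtype=np.int64)
    return vertices, triangles

# One frame per camera would be acquired, merged and passed down the pipeline:
# cloud = np.vstack([acquire_point_cloud(d, c, K, T)[0] for d, c, K, T in frames])
# mesh  = extract_mesh(reconstruct_surface(cloud))

Only the back-projection of stage 1 is spelled out here; what the two stub stages should actually do is the subject of the rest of this paper.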
1.1 Related works

We constrain the related literature review to 3D reconstruction methods that can work with range image data from RGB-D sensors and provide explicit surface geometry outputs, such as triangle meshes. As such, view-dependent methods are not discussed.

1.1.1 Visibility methods

Visual hull methods, as introduced by Laurentini [34], reconstruct models from the intersection of object silhouettes taken from multiple viewpoints. Polyhedral geometry [20, 39] has become a popular representation of the hull structure. To speed up the process, Li et al. [36] and Duckworth and Roberts [17] developed GPU-accelerated reconstruction methods. The general drawback of these approaches is that they need to extract objects from the image background using silhouettes, which is difficult to do in a cluttered, open scene. We may also wish to include the background in reconstructions, but this is generally not supported.

Curless and Levoy [15] use the visual hull concept together with range images and define a space carving method. Thanks to the range images, background segmentation becomes a simple task. However, space carving is designed to operate on top of volumetric methods, which means that the issues affecting volumetric methods also apply here. Using multiple depth sensors or multiple scans of a scene can introduce range image misalignments. Zach et al. [52] counter these misalignments by merging scans with a regularization procedure, albeit with even higher memory consumption. When a scene is covered by scans at different scales, fine-resolution surface details can be lost. Fuhrmann and Goesele [21] developed a hierarchical volume approach to retain the maximum amount of detail. However, their method is designed to combine a very high number of viewpoints in off-line processing; as such, it is unsuited to real-time use.

1.1.2 Volumetric methods

Volumetric reconstruction methods represent 3D data as grids of voxels. Each volume element can contain space occupancy data [12, 14] or samples of continuous functions [15]. After commodity RGB-D sensors became available, the work of Izadi et al. [28] spawned a whole family of 3D reconstruction methods, known for their ability to integrate noisy input data in real time to produce high-quality scene models. However, a major drawback is their high memory consumption. Whelan et al. [50] and Chen et al. [10] propose out-of-core approaches where the reconstruction volume is moved around in space to lower system memory requirements. However, these methods cannot capture dynamic scenes. Newcombe et al. [40], Dou et al. [16] and Innmann et al. [27] support changes to the scene by deforming the reconstructed volume. These methods require accurate object tracking, which can fail under complex or fast movement.

Variational volumetric methods, such as those of Kazhdan et al. [30] and Zach [51], reconstruct surfaces by solving an optimization task under specified constraints. Recently, the method of Kazhdan and Hoppe [29] has become a popular choice, with Collet et al. [13] further developing it for use in real-time 3D reconstruction, albeit at the cost of a very large amount of computation. Indeed, the global nature of the optimization comes with a great computational cost, making it infeasible in most situations with consumer-grade hardware.
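For readers unfamiliar with this family of methods, the following simplified, CPU-only Python sketch illustrates the kind of signed-distance fusion they build on, in the spirit of [15, 28]: each voxel stores a truncated signed distance to the observed surface together with a weight, and every incoming depth map updates both with a running average. The grid resolution, voxel size, truncation band and pinhole camera model used here are arbitrary illustrative choices, not values from any of the cited systems.

import numpy as np

RES, VOXEL_SIZE, TRUNC = 128, 0.02, 0.06       # 128^3 voxels of 2 cm, 6 cm truncation band
tsdf    = np.ones((RES, RES, RES), dtype=np.float32)
weights = np.zeros((RES, RES, RES), dtype=np.float32)

def integrate(depth, intrinsics, cam_from_world):
    """Fuse one depth map (in meters) into the global truncated signed distance volume."""
    fx, fy, cx, cy = intrinsics
    ii, jj, kk = np.meshgrid(*([np.arange(RES)] * 3), indexing="ij")
    # voxel centers in world coordinates (volume anchored at the origin)
    pts = np.stack([ii, jj, kk], axis=-1).reshape(-1, 3) * VOXEL_SIZE
    # transform voxel centers into the camera frame and project with a pinhole model
    cam = pts @ cam_from_world[:3, :3].T + cam_from_world[:3, 3]
    z = cam[:, 2]
    zs = np.where(z > 0, z, 1.0)               # avoid division by zero for voxels behind the camera
    u = np.round(cam[:, 0] * fx / zs + cx).astype(int)
    v = np.round(cam[:, 1] * fy / zs + cy).astype(int)
    h, w = depth.shape
    ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.zeros_like(z)
    d[ok] = depth[v[ok], u[ok]]
    ok &= d > 0                                # require a valid depth reading at that pixel
    sdf = np.clip((d - z) / TRUNC, -1.0, 1.0)  # signed distance, truncated to [-1, 1]
    upd = ok & (sdf > -1.0)                    # skip voxels far behind the surface
    t, wgt = tsdf.reshape(-1), weights.reshape(-1)   # flat views into the grids
    t[upd] = (t[upd] * wgt[upd] + sdf[upd]) / (wgt[upd] + 1.0)
    wgt[upd] += 1.0

The sketch also makes the memory drawback discussed above tangible: two float channels on a 512³ grid already occupy roughly 1 GB before any color or auxiliary data is stored.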
1.1.3 Point-based methods

MLS methods have a long history in data science as a tool for smoothing noisy data. Alexa et al. [2] used this concept in computer visualization to define point set surfaces (PSS). These surfaces are defined implicitly and allow points to be refined by projecting them onto the surface. Since then, a wide variety of methods based on PSS have appeared; see Cheng et al. [11] for a partial summary. In the classical formulation, Levin [35] and Alexa et al. [3] approximate the local surface around a point with a low-degree polynomial. Alexa and Adamson [1] simplify the approach by formulating a signed distance field of the scene from oriented normals. To increase result stability, Guennebaud and Gross [22] formulate a higher-order surface approximation, while Fleishman et al. [19] and Wang et al. [49] add detail-preserving MLS methods. Kuster et al. [33] introduce a temporally stable MLS for use in dynamic scenes.

Most works to date use splatting [53] to visualize MLS point clouds. While fast, this approach cannot handle texturing without blurring, so it is not as well supported in computer graphics as traditional triangle meshes are. It has been considered difficult to generate meshes on top of MLS-processed point clouds. Regarding MLS, Berger et al. [7] note that "it is nontrivial to explicitly construct a continuous representation, for instance an implicit function or a triangle mesh". Scheidegger et al. [46] and Schreiner et al. [47] propose advancing front methods to generate triangles on the basis of MLS point clouds.
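As an illustration of the projection idea behind PSS, the following Python sketch performs a simple MLS projection in the spirit of Alexa and Adamson [1]: a query point is repeatedly pulled onto a local plane defined by a weighted average of nearby oriented points. The Gaussian kernel, its radius and the iteration count are arbitrary choices for this demonstration and are not the formulation used later in this paper.

import numpy as np

def mls_project(x, points, normals, radius=0.05, iterations=3):
    """Project point x onto the MLS surface defined by (points, normals)."""
    for _ in range(iterations):
        d2 = np.sum((points - x) ** 2, axis=1)
        w = np.exp(-d2 / (radius ** 2))          # Gaussian weights
        if w.sum() < 1e-12:                      # no support nearby: leave x unchanged
            return x
        a = (w[:, None] * points).sum(axis=0) / w.sum()   # weighted centroid
        n = (w[:, None] * normals).sum(axis=0)
        n /= np.linalg.norm(n)                   # averaged, renormalized normal
        # signed distance to the local plane (a, n), then step onto that plane
        x = x - np.dot(x - a, n) * n
    return x

# Example: noisy samples of the plane z = 0 pull a query point back toward the plane.
rng = np.random.default_rng(0)
pts = rng.uniform(-0.5, 0.5, size=(2000, 3))
pts[:, 2] = rng.normal(scale=0.01, size=2000)        # noise along z only
nrm = np.tile(np.array([0.0, 0.0, 1.0]), (2000, 1))  # known plane normals
print(mls_project(np.array([0.0, 0.0, 0.05]), pts, nrm))   # z component shrinks toward 0

Applying this projection to every point of a noisy cloud yields the smoothing behavior described above, but, as noted, it does not by itself produce a triangle mesh.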