
Geometric Primitive Extraction for 3D Model Reconstruction

Diego VIEJO, Miguel CAZORLA Dept. Ciencia de la Computación e Inteligencia Artificial Universidad de Alicante Apdo. 99, 03080 Alicante, España {dviejo,miguel}@dccia.ua.es

Abstract. 3D map building is a complex robotics task that requires mathematically robust models. From a 3D point cloud, we can use the normal vectors at these points for feature extraction. In this paper, we propose to extract geometric primitives (e.g. planes) from the cloud, and we present a robust method for normal estimation.

Keywords. Robotics, Computer Vision, 3D Map Building, Feature Extraction.

Introduction

In a previous work [9], we developed a method for the reconstruction of a 3D map by a mobile robot from sets of points. As 3D sensor we used a commercial stereo camera that yields both 3D coordinates and grey level. The use of a stereo camera for the acquisition of 3D points is a handicap both for the reconstruction of a 3D environment and for the extraction of 3D features, since the data obtained with this type of camera present a high level of error. In addition, we must face the absence of information when the environment contains objects without texture. Related to this task is the well-known SLAM problem [7], which consists of constructing a map of the environment while simultaneously localizing the robot within it. The difficulty of this problem is rooted in the odometry error. Our previous approach, proposed in [9], consists of a modification of the ICP algorithm using the EM methodology, which allows us to significantly reduce the odometry error. The environments where our robot moves are man-made. For this reason, it seems reasonable to extract geometric primitives of large size. In our case, these primitives are planes corresponding to the walls of the corridors of the environment. Our approach starts from the idea of deformable templates [8], with which we perform the search for the planes that form the environment. We try to fit these templates inside large and noisy 3D point clouds. As a measure of template fit we mainly use the normal vector of the underlying surface [5, 10, 11]. Such normal estimation is critical for the later feature extraction process. Thus, in this paper we present a robust method for the calculation of the normal vectors.

Our approach can be summarized in the following steps: first, reduction of odometry error using a pose registration method; second, calculation of normal vectors from the obtained points; third, extraction of planar geometric primitives. The rest of the paper is organized as follows: in section 1, we describe the experimental setup and the hardware used. The method for pose registration is explained in section 2. Section 3 describes the method developed for the estimation of normals. Section 4 introduces an approach for the extraction of geometric primitives and, finally, some conclusions are drawn in section 5.

1. Description of the experimental setup.

The experiment was carried out in the Faculty of Economic Sciences at the University of Alicante. In Figure 1b we show the building floor plan and the approximate path followed by our robot. Our robot Frodo (see Figure 1a) is a Magellan Pro from RWI equipped with a tri-stereo Digiclops camera (see [6] for more details).

Figure 1. a) left: Robot and stereo camera used in the experiment. b) right: Floor plan of the building where the experiment was carried out and path followed by the robot.

The camera provides us with a 3D point set and the intensity value at each point. While the robot is moving, it takes 3D images approximately every 1 meter or 15 degrees. We also have the odometry information provided by the robot's encoders. This information yields an approximation of the actions performed by the robot, A = {a1, a2, …, an}, where each action is ai = (xi, yi, θi). We can obtain a reconstruction of the environment (map building) using this odometry information. To do so, since the coordinates of the points are local to each robot position, we must transform the point coordinates using that information. Figure 2 (left) shows a zenithal view of this environment. Note that the points from both the floor and the ceiling have been removed, which allows the reconstruction to be observed clearly. Certain problems in the reconstruction stem from odometry errors. The treatment of these errors is addressed in the following section.
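To make this transformation concrete, the sketch below (our illustration, not the original implementation) accumulates the odometry actions and maps each local scan into the global frame; it assumes each action is a 2D pose increment expressed in the robot frame and that z is the vertical axis:

```python
import numpy as np

def compose(pose, action):
    """Accumulate a 2D odometry action (dx, dy, dtheta), expressed in the
    robot frame, into a global pose (x, y, theta)."""
    x, y, th = pose
    dx, dy, dth = action
    return (x + dx * np.cos(th) - dy * np.sin(th),
            y + dx * np.sin(th) + dy * np.cos(th),
            th + dth)

def to_global(points, pose):
    """Rotate local 3D points about the vertical axis by the pose heading
    and translate them; z is assumed to be the vertical coordinate."""
    x, y, th = pose
    R = np.array([[np.cos(th), -np.sin(th), 0.0],
                  [np.sin(th),  np.cos(th), 0.0],
                  [0.0,         0.0,        1.0]])
    return points @ R.T + np.array([x, y, 0.0])

def build_map(scans, actions):
    """Raw odometry-based map building: chain the actions and transform
    each local scan (an (n, 3) array) into the global frame."""
    pose, world = (0.0, 0.0, 0.0), []
    for pts, a in zip(scans, actions):
        pose = compose(pose, a)
        world.append(to_global(pts, pose))
    return np.vstack(world)
```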

Figure 2. Zenithal view of the floor of the reconstructed environment. Left: using odometry. Right: using ICP-EM.

2. Rectification by pose registration.

As shown before, reconstruction with only odometry information is very noisy. We minimize the errors using a pose registration method applied to two consecutive point sets. The classic algorithm for 3D pose registration is ICP: Iterative Closest Point [2]. ICP iteratively calculates the best transformation between two point sets in two steps. In the first one, given an initial transformation, it calculates correspondences between points using a distance criterion after applying the transformation. In the second one, these correspondences are used to estimate the best transformation in the least-squares sense. The resulting transformation is then fed back into a new first step. ICP ensures convergence to a local minimum. The main problem of this algorithm is that it is sensitive to outliers: if the difference between both point sets is more than 10 or 20%, the algorithm will, with high probability, converge to a poor local minimum. The ICP-EM algorithm [3], however, is less dependent on outliers. It is based on ICP and performs the same correspondence and transformation estimation steps. Nevertheless, rather than determining the correspondences deterministically, it uses random variables to estimate the probability that a given point in one set matches another one in the other set. If we knew the transformation, we could calculate the correspondences simply by choosing the nearest point; if we knew the correspondences, we could compute the transformation. This is a typical "chicken and egg" problem [18] and it can be solved with an EM algorithm: iteratively, and similarly to ICP, it finds the transformation and the correspondences. The formal derivation of the algorithm is detailed in [3]. Figure 2 (right) shows the result of applying this algorithm.
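For reference, a minimal sketch of the classic ICP loop described above follows; it is not the ICP-EM variant of [3], and the SVD-based least-squares estimation and kd-tree correspondence search are standard choices of ours:

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(P, Q):
    """Least-squares rigid transform (R, t) aligning point set P onto
    its correspondences Q, via the standard SVD construction."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                      # reflection-safe rotation
    return R, cq - R @ cp

def icp(P, Q, iters=30):
    """Classic ICP: alternate a nearest-neighbour correspondence step and
    a least-squares transform estimation step; converges to a local
    minimum of the alignment error."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(Q)
    for _ in range(iters):
        moved = P @ R.T + t
        _, idx = tree.query(moved)              # correspondence step
        R, t = best_rigid_transform(P, Q[idx])  # estimation step
    return R, t
```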

3. Normal estimation

3.1 Normal classification

The aim of this work is to extract geometric primitives from a cloud of 3D points obtained from the environment in the previous step. As stated before, our environment is assumed to be built from planar surfaces. We estimate the normal vectors of the underlying surfaces at each of the 3D points extracted from the environment. The later geometric primitive recognition depends on the robustness of this normal estimation step. Normal estimation relies on the method described in [12]: given a triangle mesh, it returns the normal at each vertex of the mesh. The basic idea is to select a region of the mesh around a vertex. This region is specified in terms of the geodesic neighbourhood of the vertex. Each triangle $T_i$ within this neighbourhood casts a vote $N_i$, which depends on the normal vector of the plane that contains the triangle. To avoid the problem of normals with opposite orientation annihilating each other, $N_i$ is represented as a covariance matrix $V_i = N_i N_i^T$. Votes are collected as a weighted matrix sum $V_v$ with

$$ V_v = \sum_i w_i V_i = \sum_i w_i N_i N_i^T $$

Ai  gi  wi = exp−  Amax  σ  where Ai is the area of Ti, Amax is the area of the largest triangle in the entire mesh, gi is the geodesic distance of Ti from v, and σ controls the rate of decay. With these equations we obtain knowledge about the variance instead of loosing information about normal sign orientation. This knowledge allows us to draw conclusions about the relative orientation of the vertex. So, we can decompose Vv using eigen-analysis and then classify vertex v. Since Vv is a symmetric semidefinite matrix, eigen- decomposition generates real eigenvalues λ1≥λ2≥λ3≥0 with corresponding eigenvectors E1, E2, end E3. With this information we can define the saliency map [13] as:

$$ S_s = \lambda_1 - \lambda_2, \qquad S_c = \lambda_2 - \lambda_3, \qquad S_n = \lambda_3 $$

Thus, we propose the following vertex classification scheme based on the eigenvalues of $V_v$ at each vertex:

$$ \max\{S_s,\ \varepsilon S_c,\ \varepsilon\eta S_n\} = \begin{cases} S_s: & \text{surface patch, with normal } N_v = E_1 \\ \varepsilon S_c: & \text{crease junction, with tangent } T_v = E_3 \\ \varepsilon\eta S_n: & \text{no preferred orientation} \end{cases} $$

where $0 \leq \varepsilon \leq \infty$ and $0 \leq \eta \leq \infty$ are constants that control the relative significance of the saliency measures. Given such a classification, we are interested in the vertices classified as surface patches, with $N_v = E_1$, and we can filter out the rest. Using vector voting, we can thus calculate the normal of a 3D point from a triangle mesh. However, before obtaining the normals we must compute the triangulation. Our approach generates a Delaunay triangulation from the 3D points.
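As an illustration of this voting and classification procedure, the sketch below follows the method of [12] as described above; the function name and the assumption that triangle normals, areas and geodesic distances have been precomputed for the geodesic neighbourhood are ours:

```python
import numpy as np

def classify_vertex(normals, areas, geodesics, A_max, sigma, eps, eta):
    """Normal vector voting for one vertex v: each triangle T_i in the
    geodesic neighbourhood votes with the covariance matrix N_i N_i^T of
    its (unoriented) plane normal, weighted by relative area and geodesic
    distance; eigen-analysis of the summed votes gives the saliencies."""
    Vv = np.zeros((3, 3))
    for N, A, g in zip(normals, areas, geodesics):
        w = (A / A_max) * np.exp(-g / sigma)
        Vv += w * np.outer(N, N)
    lam, E = np.linalg.eigh(Vv)        # eigh returns ascending eigenvalues
    lam, E = lam[::-1], E[:, ::-1]     # reorder so lambda1 >= lambda2 >= lambda3
    Ss, Sc, Sn = lam[0] - lam[1], lam[1] - lam[2], lam[2]
    winner = np.argmax([Ss, eps * Sc, eps * eta * Sn])
    if winner == 0:
        return "surface", E[:, 0]      # normal  N_v = E_1
    if winner == 1:
        return "crease", E[:, 2]       # tangent T_v = E_3
    return "noisy", None               # no preferred orientation
```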

3.2 Triangle mesh generation

The Delaunay triangulation of a set of 3D points is a well-known topic in the literature. Many methods [16, 17] obtain the triangulation of a 3D solid. Nevertheless, our problem cannot be solved by these approaches, since we cannot assume that our set of points comes from a 3D solid. Our data come mainly from the walls of the corridors and rooms where the robot moves, and walls are, in fact, 2D surfaces in 3D space. Other approaches are used to build topographic maps from a piece of terrain. These methods also build a mesh from a 2D surface in 3D space, but what they really do is project the 3D points onto a horizontal plane, so the final calculation is a 2D Delaunay triangulation. Our approach aims to compute a Delaunay triangulation of clouds of 3D points without any assumption about the sensor or about the data coming from a closed volume. Moreover, due to the nature of the problem we want to solve, we introduce a constraint on the maximum size of the triangles in the mesh, in order to preserve openings between walls or between a wall and nearby objects in the environment.

To solve this, we use a divide & conquer (D&C) scheme based on the recursive partition and local triangulation of the point set, followed by a merging phase where the resulting triangulations are joined. In general, D&C methods work well in 2D; doing the same in 3D, however, is not a simple task, because merging is simple in 2D [14] but hard to design in more than two dimensions. DeWall [15] is an interesting triangulation method that uses a D&C strategy by reversing the order between the solution of the sub-problems and the merging phase. Instead of merging partial results, it applies a more complex dividing phase which partitions the set and builds, as a first step, the merging triangulation. The input point set is divided by a cutting plane α such that the two associated half-spaces contain two point sets P1 and P2 of comparable cardinality. Analogously, the splitting plane α divides a triangulation Σ into three disjoint subsets: the triangles intersected by the plane and the two sets of triangles completely contained in the two half-spaces defined by α.

In order to build our mesh, we use this idea and incorporate the constraints imposed by our problem. First, DeWall uses the tetrahedron as the basic geometric entity to build a mesh from a set of 3D points forming a closed volume. The closed-volume assumption is not fulfilled in our case, and it would be a source of problems due to the high noise we must handle. For this reason, we use the triangle as the basic geometric entity for the triangulation, instead of the tetrahedron. This idea arises from the fact that we are triangulating points from 2D planes inside a 3D environment and, in general, the triangle is the basic geometric entity for triangulating 2D points. In addition, the triangle size constraint has a consequence: certain parts of the space may remain unconnected and, therefore, not be triangulated. For this reason, before the triangulation process we compute clusters of connected points, which are the input to the triangulation process (a sketch of this clustering step is given below). Figure 3 (left) shows a detail of the result of applying our proposed algorithm to triangulate a large and noisy cloud of 3D points: unconnected zones appear due to the absence of information and to the maximum triangle size constraint. Figure 3 (right) shows the normal vectors calculated from the triangle mesh.
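A minimal sketch of the clustering step mentioned above, under the assumption that two points are connected when their distance falls below a radius tied to the maximum triangle size; the kd-tree and union-find machinery are our choice of illustration, not necessarily the original implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def cluster_points(points, radius):
    """Group 3D points into connected components: two points are connected
    when their distance is below `radius` (union-find over kd-tree pairs).
    Each cluster is triangulated separately, so zones left unconnected by
    the maximum-triangle-size constraint are never bridged."""
    parent = np.arange(len(points))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i, j in cKDTree(points).query_pairs(radius):
        parent[find(i)] = find(j)

    labels = np.array([find(i) for i in range(len(points))])
    return [points[labels == l] for l in np.unique(labels)]
```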

Figure 3. Left: Detail of the result of applying the triangulation process over a set of points. Right: Normal vectors calculated from the triangle mesh.

4. Vertical flat surface extraction.

Once we project the normals of all the points of the environment onto a hemisphere (Figure 5, left), we can observe certain zones with higher density. This is due to the fact that most surfaces of the environment have similar orientations (walls). We exploit this fact to extract vertical planes. We apply the k-means algorithm [4] to find the clusters of points corresponding to each of the two dominant orientations in the environment (Figure 5, right). Having the clusters of points with similar orientation, we proceed to extract planes. Inside every cluster, we consider that two points belong to the same plane if the distance between them is less than a threshold λ. What we have at this point is a high number of planes (patches) of relatively small size. To find wider planes, we try to merge those patches that fit the same object in the environment. As criteria for merging patches we use the distance between them and the relative angle of their normals. We set a threshold Ud on the maximum separation between patches and merge them if dist(Πi, Πj) < Ud. In addition, for the merge to take place, the difference in orientation must satisfy |Orientation(Πi) − Orientation(Πj)| < Uo. The values Ud and Uo are set experimentally.
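As an illustration of the orientation clustering, a minimal k-means over the hemisphere-projected normals might look as follows; the flip axis, the initialization and all names are our assumptions, not the original implementation:

```python
import numpy as np

def kmeans_orientations(normals, k=2, iters=50, seed=0):
    """Minimal k-means [4] over normals projected onto a hemisphere:
    opposite normals are flipped to the same side, so the two facing
    walls of a corridor share one orientation cluster."""
    P = np.where(normals[:, :1] < 0, -normals, normals)  # hemisphere flip
    rng = np.random.default_rng(seed)
    centers = P[rng.choice(len(P), size=k, replace=False)]
    for _ in range(iters):
        # assign each normal to the closest cluster centre
        labels = np.argmin(((P[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        # recompute centres, keeping the old one if a cluster goes empty
        centers = np.stack([P[labels == c].mean(axis=0) if (labels == c).any()
                            else centers[c] for c in range(k)])
    return labels, centers
```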

We can improve the obtained results by incorporating our knowledge about the environment. We know there are parts of an environment that remain invariant over time, for example the walls of a building. In addition, these parts exhibit a series of relations, such as geometric ones, that we can incorporate into our model as constraints that the obtained result must satisfy. Thus, we introduce model constraints such as "all the planes of the environment have to be vertical" or "two planes of the environment have to be parallel or perpendicular" [1]. By incorporating this information into our model we reduce the error both in the measuring phase and during the process of extracting and merging planes.
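One plausible realization of the parallel-or-perpendicular constraint (ours, not necessarily the method of [1]) is to snap each extracted plane normal onto the closest dominant direction obtained from the k-means step:

```python
import numpy as np

def snap_to_dominant(plane_normals, dominant_dirs):
    """Replace each extracted plane normal by the dominant direction it is
    closest to (sign-insensitive), so near-parallel walls become exactly
    parallel and the two wall families stay exactly perpendicular.
    `dominant_dirs` are assumed to be unit vectors, e.g. the normalized
    k-means centres."""
    snapped = []
    for n in plane_normals:
        scores = [abs(float(n @ d)) for d in dominant_dirs]
        d = dominant_dirs[int(np.argmax(scores))]
        snapped.append(np.sign(n @ d) * d)   # preserve the facing side
    return np.array(snapped)
```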

Figure 5. Left: Normal vector projection. Right: k-means results.

Figure 6. Result of the complete process. Left: view of the floor of the building. Right: detail of a corridor.

5. Conclusions and future work

The line of work we are following aims to build a 3D map of an environment. In this paper we have improved the normal estimation method, endowing it with higher robustness. For this, it has been necessary to develop a method to build a triangle mesh over the point set. The work is completed by the previous phase of reconstruction of the environment and the later one of plane extraction. As a continuation of this work, we intend to address the SLAM problem in order to improve the reconstruction of the 3D map. In addition, we want to build a robust system of constraints that allows us to obtain more precise 3D models. Finally, we will extend the work to extract other geometric primitives, such as boxes and so on.

Acknowledgments

This work has been supported by grant TIC2002-02792 funded by Ministerio de Ciencia y Tecnología and FEDER.

References

[1] P. Allen and I. Stamos. Solving architectural modelling problems using knowledge. In Proc. 4th Int. Conf. on 3-D Digital Imaging and Modeling, 2003.
[2] P. Besl and N. McKay. A method for registration of 3-D shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, 14(2):239–256, 1992.
[3] M. Cazorla and B. Fisher. Characterizing local minima in 3D registration methods. In Proc. of the British Machine Vision Conference (under review), 2004.
[4] R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley-Interscience, 2000.
[5] S. Hakim, P. Boulanger, and F. Blais. A mobile system for indoors 3-D mapping and positioning. In Proc. of the Fourth Conference on Optical 3-D Measurement Techniques, 1997.
[6] J. M. Saez and F. Escolano. A global 3D map-building approach using stereo vision. In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2004.
[7] S. Thrun, D. Fox, and W. Burgard. A probabilistic approach to concurrent mapping and localization for mobile robots. Machine Learning, 31:29–53, 1998. Also appeared in Autonomous Robots, 5:253–271 (joint issue).
[8] A. L. Yuille and P. Hallinan. Deformable templates. In A. Blake and A. Yuille, editors, Active Vision. MIT Press, 1992.
[9] D. Viejo and M. A. Cazorla. Construcción de mapas 3D y extracción de primitivas geométricas del entorno. In Proc. of the 5th Workshop de Agentes Físicos, 2004.
[10] C. Zhang and T. Chen. Efficient feature extraction for 2D/3D objects in mesh representation. In Proc. of the IEEE International Conference on Image Processing (ICIP), 2001.
[11] Y. Wang, B. S. Peterson, and L. H. Staib. Shape-based 3D surface correspondence using geodesics and local geometry. In Proc. of Computer Vision and Pattern Recognition (CVPR), 2000.
[12] D. L. Page, Y. Sun, A. F. Koschan, J. Paik, and M. A. Abidi. Normal vector voting: crease detection and curvature estimation on large, noisy meshes. Graphical Models, Special Issue on Large Triangle Mesh Models, 64(3-4):199–229, May 2002.
[13] G. Medioni, M. Lee, and C. K. Tang. A Computational Framework for Segmentation and Grouping. Elsevier, Amsterdam, 2000.
[14] D. T. Lee and B. J. Schachter. Two algorithms for constructing a Delaunay triangulation. Int. J. of Computer and Information Science, 9(3):219–242, 1980.
[15] P. Cignoni, C. Montani, and R. Scopigno. DeWall: a fast divide and conquer Delaunay triangulation algorithm. Computer-Aided Design, 30(5):333–341, 1998.
[16] E. Mücke. A robust implementation for three-dimensional Delaunay triangulations. In Proc. of the 1st International Computational Geometry Software Workshop, 1995.
[17] K. Hormann and M. Reimers. Triangulating point clouds with spherical topology. In Curve and Surface Design: Saint-Malo 2002, pp. 215–224, 2003.
[18] A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38, 1977.