Constructing 3D City Models by Merging Aerial and Ground Views
Total Page:16
File Type:pdf, Size:1020Kb
3D Reconstruction and Visualization Constructing 3D City Models by Merging Aerial and Christian Früh and Avideh Zakhor Ground Views University of California, Berkeley hree-dimensional models consisting of the to acquire data while driving at normal speeds on pub- Tgeometry and texture of urban surfaces lic roads. Other researchers proposed a similar system could aid applications such as urban planning, training, using 2D laser scanners and line cameras.8 In both sys- disaster simulation, and virtual-heritage conservation. tems, the researchers acquire data continuously and A standard technique used to create large-scale city quickly. In another article, we presented automated models automatically or semi-automatically is to apply methods to process this type of data efficiently to obtain stereo vision on aerial or satellite imagery.1 In recent a detailed model of the building facades in downtown years, advances in resolution and Berkeley.9 However, these facade models don’t provide accuracy have also made airborne information about roofs or terrain shape because they The authors present an laser scanners suitable for generat- consist only of surfaces visible from the ground level. ing digital surface models (DSM) In this article, we describe an approach to register and approach to automatically and 3D models.2 Although you can merge our detailed facade models with a complemen- perform edge detection more accu- tary airborne model. Figure 1 shows the data-flow dia- creating textured 3D city rately with aerial photos, airborne gram of our method. The airborne modeling process on laser scans require no camera- the left in Figure 1 provides a half-meter resolution models using laser scans and parameter estimation and feature model with a bird’s-eye view of the entire area, con- detection to obtain 3D geometry. taining terrain profile and building tops. The ground- images taken from ground Previous work has attempted to based modeling process on the right results in a detailed reconstruct polygonal models by model of the building facades. Using the DSM obtained level and a bird’s-eye using a library of predefined build- from airborne laser scans, we localize the acquisition ing shapes or by combining the DSM vehicle and register the ground-based facades to the air- perspective. with digital ground plans or aerial borne model by means of Monte Carlo localization images. While you can achieve sub- (MCL). We merge the two models with different reso- meter resolution with this technique, you can only cap- lutions to obtain a 3D model. ture the building roofs and not their facades. Several research projects have attempted to create Textured surface mesh from airborne models from ground-based views at a high level of detail laser scans to enable virtual exploration of city environments. While We use the DSM for localizing the ground-based data- most of these approaches result in visually pleasing mod- acquisition vehicle and for adding roofs and terrain to els, they involve an enormous amount of manual work, the ground-based facade models. In contrast to previ- such as importing the geometry obtained from construc- ous approaches, we don’t explicitly extract geometric tion plans or selecting primitive shapes and correspon- primitives from the DSM. While we use aerial laser scans dence points for image-based modeling or complex data to create the DSM, it’s equally feasible to use a DSM acquisition. Other attempts to acquire close-range data obtained from other sources, such as stereo vision. in an automated fashion have used image-based3 or 3D laser scanner approaches.4,5 However, these approaches Scan point resampling and DSM generation don’t scale to more than a few buildings because they During airborne laser scan acquisition using a 2D scan- acquire data in a slow, stop-and-go fashion. ner mounted on an airplane, the unpredictable roll and In previous work, we proposed an automated method tilt motion of the plane destroyed the inherent row- that would rapidly acquire 3D geometry and texture column order of the scans. Thus, the scans might be inter- data for an entire city.6,7 This method uses a vehicle preted as an unstructured set of 3D vertices in space, with equipped with 2D laser scanners and a digital camera the x, y coordinates specifying the geographical location 52 November/December 2003 Published by the IEEE Computer Society 0272-1716/03/$17.00 © 2003 IEEE and the z coordinate the altitude. To Airborne Ground based process the scans efficiently, it’s help- ful to resample the scan points to a Aerial Airborne Horizontal Vertical images laser scans laser scan laser scan row-column structure even though this step could reduce the spatial res- olution. To transfer the scans into a DSM Camera Scan matching DSM—a regular array of altitude generation and images and initial path post- values—we define a row-column computation grid in the ground plane and sort the processing scan points into the grid cells. The density of scan points is not uniform, Cleaned Image which means there are grid cells up DSM Monte Carlo Facade registration with no scan point and others with localization mesh (manual) generation multiple scan points. Because the Facade model percentage of cells without scan Mark facades points and the resolution of the DSM and depend on grid cell size, we must foreground compromise by leaving few cells Texture without a sample while maintaining mapping the resolution at an acceptable level. In our case, the scans have an Pose imagery accuracy of 30 centimeters in the Airborne Airborne mesh Fused horizontal and vertical directions model Blend mesh generation and city generation and a raw spot spacing of 0.5 meters texture mapping model or less. Measuring both the first and the last pulses of the returning laser light, we select a square cell size of 1 Dataflow diagram of our 3D city-modeling approach. Airborne modeling steps are highlight- 0.5 × 0.5 meters, resulting in about ed in green, ground-based modeling steps in yellow, and model-fusion steps in white. half the cells being occupied. We create the DSM by assigning to each cell the highest z value among its member points to pre- serve overhanging rooftops while suppressing the wall points. We fill the empty cells using nearest-neighbor interpolation to preserve sharp edges. We can interpret each grid cell as a vertex, where the x, y location is the cell center and the z coordinate is the altitude value—or as a pixel at x, y with a gray intensity proportional to z. Processing the DSM (a) The DSM contains the plain rooftops and terrain shape 2 Processing as well as objects such as cars and trees. Roofs, in partic- steps for DSM: ular, look bumpy because of a large number of smaller (a) DSM objects on them, such as ventilation ducts, antennas, and obtained from railings, which are impossible to reconstruct properly at scan-point the DSM’s resolution. Furthermore, scan points below resampling; overhanging roofs cause ambiguous altitude values, (b) DSM after resulting in jittery edges. To obtain more visually pleasing flattening roofs; roof reconstructions, we apply several processing steps. and (c) seg- We first target flattening the bumpy rooftops. We ments with apply a segmentation algorithm to all nonground pix- Ransac lines in els on the basis of the depth discontinuity between adja- (b) white. cent pixels. We replace small, isolated regions with ground-level altitude to remove objects such as cars or trees in the DSM. We subdivide the larger regions fur- ther into planar subregions by means of planar seg- mentation. Then we unite small regions and subregions with larger neighbors by setting their z values to the larg- er region’s corresponding plane. This procedure helps remove undesired small objects from the roofs and pre- vents the separation of rooftops into too many cluttered regions. Figure 2a shows the original DSM and Figure (c) 2b the resulting processed DSM. IEEE Computer Graphics and Applications 53 3D Reconstruction and Visualization building facades readily available from the ground- based acquisition. As such, we are primarily interested in adding the complementary roof and terrain geome- try, so we apply a different strategy to create a model from the airborne view. We transform the cleaned-up DSM directly into a triangular mesh and reduce the number of triangles by simplification. The advantage of this method is that we can control the mesh generation process on a per-pixel level. We exploit this property in the model fusion procedure. Additionally, this method has a low processing complexity and is robust. Because this approach requires no predefined building shapes, we can apply it to buildings with unknown shapes— even in presence of trees. Admittedly, doing so typical- ly comes at the expense of a larger number of triangles. Because the DSM has a regular topology, we can directly transform it into a structured mesh by connect- ing each vertex with its neighboring ones. The DSM for a city is large, and the resulting mesh has two triangles per cell, yielding 8 million triangles per square kilome- ter for the 0.5 × 0.5 meter grid size. Because many ver- tices are coplanar or have low curvature, we can drastically reduce the number of triangles without sig- nificant loss of quality. We use the Qslim mesh simplifi- 3 Acquisition vehicle. cation algorithm11 to reduce the number of triangles. Empirically, we have found it possible to reduce the ini- tial surface mesh to about 100,000 triangles per square The second processing step straightens the jittery kilometer at the highest level of detail without notice- edges. We resegment the DSM into regions, detect the able loss in quality.