11-051.qxd 8/17/12 6:11 PM Page 927

Evaluating the Benefits of Octree-based Indexing for Lidar Data

Abu Saleh Mohammad Mosa, Bianca Schön, Michela Bertolotto, and Debra F. Laefer

Abstract early stages. Notably, these efforts for efficient storage and Very large three-dimensional (3D) point datasets are manipulation are independent of the specific data formats increasingly common, such as from Light Detection and (e.g., LAS) that are being developed by both profit and not- Ranging (lidar). Increasingly, there are attempts to exploit for-profit entities (e.g., ASPRS) to facilitate the exchange of these 3D point data sets beyond mere visualization. lidar data while preserving the properties of the raw data. However, current Spatial Information Systems provide only In this paper, the integration of all required functionality limited 3D support. Even commercial systems advertising for storing, indexing, manipulating, and analyzing 3D point in-built, 3D data types provide only minimal functionality. clouds within an Management System Specifically, there is no effective means of indexing large 3D (SDBMS) as a viable solution is considered with respect to an point datasets, which is crucial for efficient analysis and octree index implementation atop the Oracle Extensible engineering use. Also, many datasets are information rich Indexing Framework (OEIF) (Laefer et al., 2009). This (e.g., contain color or some other associated semantic approach greatly benefits spatial queries on a variety of 3D information), which has yet to be fully exploited. This paper point clouds. The particular contribution of the method presents the implementation in a commercial spatial described within this paper is its applicability to 3D point database of a spatial indexing technique using an octree clouds of varied distributions, as well as those that contains data structure and highlights its advantages for sparse, as additional semantic information. well as uniformly distributed, aerial lidar data. The imple- mentation outperforms an existing r- index within the software, and offers additional functionality of attribute- Background based 3D grouping. A major difficulty with large lidar datasets lies in their efficient storing and indexing within conventional Spatial Information Systems (SISs). Two main efforts for storing and Introduction analysis can be identified thus far. On one side, conven- In recent years there has been an ever increasing availability tional Geographic Information Systems (GISs) store the data of three-dimensional (3D) point cloud datasets, such as those spread across several files. This approach has been followed generated from Light Detection and Ranging (lidar), also since the 1960s, when GISs were used to deal with positional known as laser scanning (e.g., Beaulieu and Clavet, 2009; data or data with spatial extent (Shekhar and Chawla, 2003). Huang et al., 2009; Kukko and Hyyppä, 2009). Lidar is a Presently, different GIS vendors utilize their own proprietary remote sensing technology that has gained widespread file formats for the representation of such data. Yet the popularity due to its use in environmental and disaster storage and management of any vast dataset in these file management scenarios (e.g., Straatsmas and Baptist, 2008; based systems continue to have the following disadvantages: Olsen, 2009; Laefer and Pradhan, 2006). The introduction of (a) data inconsistency, (b) data redundancy, (c) lack of GPS plus GLONASS and “fitting” software facilitating data multi-user concurrency, and (d) lack of data integrity. collection with increased accuracy and recent innovations in Analysis in such a scenario relies on frequent import and flight path design demonstrate new possibilities for large- export transactions of these files into various Computer scale, 3D data collection in urban environments. This Aided Design (CAD) packages or other proprietary software, increasing availability of very large lidar-based point cloud for example, Cyclone from Leica Geosystems. This process is datasets (typically containing hundreds of millions of time intensive and requires the availability of staff trained points) has challenged existing means of effective exploita- on multiple software packages. tion; support for management of these datasets is still in its Database Management Systems (DBMSs) on the other hand, provide a means for effective data handling of large data volumes, while facilitating the retrieval of information in vast datasets through Structured Query Language (SQL). Abu Saleh Mohammad Mosa is at the University of The related technology of a Spatial Database Management Missouri Informatics Institute, 241 Engineering Building System (SDBMS) relies on a DBMS. In such an arrangement, West, Columbia, MO 65211, and previously at University many vendors provide spatial extensions to their Object College Dublin. Relational Database Management System (ORDBMS). PostGIS Bianca Schön is at University College Dublin, 40 Knockmaree, Dublin, Ireland 20. Michela Bertolotto is at University College Dublin, School of Photogrammetric Engineering & Remote Sensing Computer Science and Informatics, Dublin, Ireland 4. Vol. 78, No. 9, September 2012, pp. 927–934. Debra F Laefer is at University College Dublin, School of 0099-1112/12/7809–927/$3.00/0 Civil, Environmental, and Structural Engineering, Dublin, © 2012 American Society for Photogrammetry Ireland 4 ([email protected]). and Remote Sensing

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING September 2012 927 11-051.qxd 8/17/12 6:11 PM Page 928

(http://www.postgis.org/), for example, is an open source Indexing 3D Point Cloud Data system for the storage of spatial data and conforms to the Indexing provides faster and more intelligent query execu- OGC (www.opengeospatial.org) standard. However, it does tions. Typically, data are structured into a hierarchical tree. not provide any in-built support for vast 3D point cloud Queries then need only follow certain branches and may data. Oracle Spatial, a commercial system, has recently avoid others. In principle, spatial queries on 3D point clouds included support for these data types. Their usefulness and could be performed directly on the entire dataset without capabilities are further evaluated within this paper for the indexing. In that scenario, for a particular spatial query, the purpose of comparing the efficiencies and capabilities of two corresponding spatial function would analyze an entire indexing approaches. dataset and then retrieve only relevant spatial objects. Looking forward, a scenario where many individuals However, since spatial functions are comparatively expen- and organizations are contributing data and trying to access sive, it would be rather costly to analyze an entire dataset. the subsequent combined data is easy to envision. In recent Instead, an appropriate spatial index needs to be created. calls for proposals, both Ireland’s National Road Authority1 Spatial indexes help retrieve candidate geometries for the and America’s Association for State Highway and Trans- specified spatial query, and the corresponding spatial portation Organizations2 have sought research proposals for function is then applied to this filtered dataset, which is the integration of both terrestrial and aerial remote sensing consequently reduced. In order to find the area of interest data based on increasing interest in this area. Such an and retrieve the most relevant dataset, the suitability of the environment will further strain the existing strategies to indexing method is critical. store these data in a meaningful way. Furthermore, there An effective algorithm for spatial indexing depends on will be a greater desire to exploit the 3D functionality of the the type and dimension of the spatial objects involved. For data. A key component of that is to have access to the efficient querying of point clouds, indexing must take all original data points. This will greatly facilitate the integra- three dimensions into account. The data must also be tion of multiple datasets. As such, the traditional approach processed in a timely fashion to facilitate efficient execution to store 3D point cloud data across various files or deriving of spatial queries. Some spatial indexing methods are other formats such as Digital Elevation Models (DEMs) for discussed in the following section, with a particular focus analysis purposes is likely to become less attractive. As on 3D point cloud data. such, new approaches must be considered to fully enable the increasingly rich and 3D nature of the data, such as Different Spatial Indexing Approaches better support of the raw point cloud data within SDBMSs. A spatial index organizes the spatial data and the underly- Applying an SDBMS for lidar data hosting allows for ing space in order to perform efficient execution of spatial improved data integrity, multi-user access, web access, and queries, either in an object-based or a space-based fashion. the use of SQL for spatial queries. However, such a spatial Object-based spatial indexes organize a dataset based on the system must support the data types for storing the geometry spatial objects distribution, while the space-based spatial in 3D Euclidean space (such as point, line, surface, and indexes subdivide the dataset based on a subdivision of the volume) that are based on a 3D geometric data model underlying space. (i.e., vector and/or raster data with their underlying geome- One of the most popular and enduring object-based try and topology). The query language of a 3D spatial system indexing techniques is the r-tree, which was developed by must also support operations and functions to handle 3D Guttmann (1984). A popular space-based alternative is the data types (Lee and Zlatanova, 2009). To date, support for two-dimensional (2D) and its 3D extension, the two-dimensional (2D) positional data is widely available in octree (Samet, 2006). An r-tree is a dynamic depth-balanced both GIS and SDBMS technology. However, very limited tree, which indexes the Minimum Bounding Rectangles capabilities are provided by commercial products for 3D data (MBRs) in 2D or Minimum Bounding Boxes (MBBs) in 3D of (Schön et al., 2009a). Presently, many of the benefits of spatial objects. The MBRs/MBBs of spatial objects form the these datasets remain relatively unexploited due to the leaf nodes of the tree, and multiple MBRs/MBBs are grouped inability of current systems to fully support 3D objects in together into larger rectangles/boxes to form intermediate a spatially accurate and meaningful manner. tree nodes. The process is repeated until only one rectan- The speed of data retrieval operations from a database gle/box is left that contains all the data that corresponds to table is a critical issue for handling large datasets. Indexing the root node of the tree. improves the speed with which operations are performed on In contrast, a quadtree is a space-based hierarchical tree a dataset by reducing the amount of data that needs to be structure that applies a recursive subdivision of a 2D space analyzed. In the spatial domain, indexes organize the dataset into four quadrants (also known as cells). It can be applied for efficient execution of spatial queries based on either to the indexing of spatial objects embedded in a 2D space. In objects or the underlying space. Common indexing tech- the basic quadtree structure, the subdivision of the space is niques for spatial datasets include object-based, r-tree in equally sized quadrants. Typically, a quadtree results in indexing and space-based/quadtree/octree indexing. Oracle an unbalanced tree for irregularly-distributed data. This is Spatial has provided r-tree indexing for spatial data, while beneficial, as empty patches of spaces are not stored within the previously supported quadtree has been deprecated. the structure and are, thus omitted during analysis. Several However, particular 3D spatial queries (e.g., window queries, variations of the quadtree structure have been developed for nearest neighbor) cannot currently be performed on 3D point data and linear data (Samet, 2006, p. 28). datasets using Oracle’s r-tree index, as will be further The tree-based are based on the recursive discussed. subdivision of the region into four congruent quadrants until a quadrant is homogeneous. The homogeneity condition for point data can be defined as the maximum number of points that a quadrant contains or other user-defined criteria. Alternatively, the homogeneity condition could be based on 1 http://www.nra.ie/Publications/DownloadableDocumentation/ the semantic information of the point data (e.g., color). For GeneralPublications/file,17098,en.pdf example, subdivision could occur until a single color 2 http://www.intrans.iastate.edu/reports/LIDAR_Safety.pdf percentage threshold is reached.

928 September 2012 PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING 11-051.qxd 8/17/12 6:11 PM Page 929

One adaptation of the quadtree for indexing high A standard octree implementation typically results in an volume point clouds is the so-called point region (PR) unbalanced tree. However, it can be implemented as a quadtree. It divides the underlying space up to a fixed tiling balanced tree (i.e., the entire space is subdivided only to a level. Each tile (known as the cell) is assigned with a unique specified tiling level). In this particular case, it only requires code (known as a cell code), which is used to index the storing the tiling level, as the tree structure can be rebuilt points that are covered by this tile. This process is how the during query processing by using this tiling level informa- quadtree was implemented in Oracle Spatial. The 3D tion. This approach has been adopted for the implementa- analogue of the PR quadtree is the PR octree, which has been tion of the octree structure as described later in this paper. implemented by the authors, as described in the next Another advantage of the octree is its capability to Section. This approach has the distinct advantage of being maintain the semantics of point data. Because the octree has performance efficient, as the current branch level does not the ability to store data points directly (instead of merely their need to be stored within the database, which would reduce bounding boxes), semantic information is accessible from the the efficiency of any query. On the other hand, this index and can, in fact, be used to build the index itself. As a approach loses the initial advantage of a space-based spatial practical example, this means that points with specific index to grow naturally from the underlying space and omit attributes can be grouped together in one cell. Regarding lidar empty areas. Indexing point data using a PR quadtree/octree data, an example of semantic criterion might be color based may be very useful since its quadrant/octant can contain the on red-green-blue (RGB). This criterion could allow cell data points along with their location information directly. subdivision until a given percentage of color similarity is In addition, it can store the semantic information of data reached within a cell. An r-tree index is not capable of points. The quadtree index was extended by De Floriani grouping data according to their semantics. Therefore poten- et al. (2008) to work with Triangulated Irregular Networks tially valuable information associated to individual points is (TINs). Theoretically, this approach could be generalized for lost. Figure 1 illustrates how an octree preserves the seman- Tetrahedral Irregular Networks (TENs) based on an octree tics, choosing color as a semantic criterion; for ease of illustra- structure in order to support true 3D functionality; currently, tion, all figures are presented in 2D, and the semantic of a there is no published work describing such research. region is illustrated as a fully homogenously colored region, Boubekeur et al. (2006) emphasized that hierarchical space even though the same principle applies in 3D and to regions of division based structures (e.g., octree and k-d tree) are only partial uniformity (e.g., 80 percent color similarity within critical for surface representation, as they are purely volume cells [for clarity in publishing instead of colors, different based. Therefore, they suggested a combined approach shapes were substituted to indicate distinct attributes]) called the Volume-Surface tree (VS tree), which combines an (Figure 1). A further advantage of using an octree for 3D point octree structure with a of quadtrees to describe a discrete cloud indexing lies in the possibility to optimize 3D point 3D surface. The VS-tree is constructed by switching back to cloud visualization (Koo and Shin, 2005). Rendering of 3D the quadtree during the recursive split involved in the point clouds is computationally expensive, but an octree can octree construction, as soon as a certain “height field” is be used to filter visible points for rendering a specific view reached. However, this approach was found to break down frustum, instead of rendering all points in the dataset at once. to mere octree indexing on certain surfaces. Several other Naturally, the decision of which indexing structure to strategies have been developed for efficient indexing of use depends on many factors, such as data distribution multi-dimensional data. However, there is limited vendor and type. The following Section presents the approach support for these. Most of the commercial systems provide employed within this paper, an octree implementation atop only support for 2D index creation with a simple 3D exten- the Oracle Extensible Indexing Framework (OEIF). This sion (Arens, 2005). Efficient indexing of multi-dimensional approach is further evaluated in the next section, followed data and true 3D index creation is still an ongoing research by a Discussion. problem as summarized by Schön et al. (2009b). Alterna- tively, this paper presents the implementation of a 3D space- based indexing structure based on an octree, which is not currently available commercially (Laefer et al., 2009). In the following subsection, the advantages of octree indexing over r-tree indexing for 3D point cloud data are discussed.

Advantages of an Octree Index for 3D Point Cloud Data Employing an octree structure for indexing 3D point cloud data has distinct advantages over using an r-tree data structure. One major benefit is that the octree can be applied directly on the point geometry, as opposed to merely the bounding boxes that an r-tree relies upon. As such, there is no need to decide how to implement a bounding box for a 3D point. Furthermore, the octree is a hierarchical tree where nodes are disjointed. This means that the regions correspon- ding to tree nodes are non-overlapping. On the other hand, bounding boxes in r-trees are often overlapping. If bounds overlap, more branches have to be traversed to process a query, which reduces an index’s efficiency. In Oracle Spatial, the implementation of the r-tree index stores the tree structure into a table and selects a node using internal SQL statements, while each node is visited (Kothuri, 2002). Figure 1. Octree segmentation using a For that reason, query processing using an r-tree index in semantic as a grouping parameter (regions Oracle Spatial involves processing several recursive SQL are subdivided until the tile contains only statements, which prolongs the query processing time points of one color). (Kothuri, 2002).

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING September 2012 929 11-051.qxd 8/17/12 6:11 PM Page 930

Implementation of an Octree Structure in Oracle Spatial A conventional octree is an unbalanced hierarchical This section presents an implementation of a new octree- tree, which would require storage of its logical tree structure based index structure for 3D point clouds within Oracle in a SDBMS for reconstruction of the tree structure during Spatial 11g. The extensibility capability of the Oracle SDBMS query processing. As previously described, this issue could was utilized in order to implement this index. The frame- introduce inefficient query processing due to the issuance of work for indexing is known as OEIF, and the new index is a several internal recursive SQL select statements generated domain index. The framework defines a set of interface during each node visit. This matter is resolved by construct- methods, which is required to be implemented in an object ing a balanced tree structure with a fixed tiling level. In this type, called “indextype.” The name of the interface is case, only the tiling level information (as opposed to the ODCIIndex, where ODCI stands for Oracle Data Cartridge whole tree) needs to be stored for tree reconstruction. The Interface. The methods of the ODCIIndex interface are pre-selection of an appropriate tiling level for a specific categorized into four classes: (a) index definition methods, dataset is a crucial factor, which involves considerations (b) index maintenance methods, (c) index scan methods, and regarding the dataset’s area and size. As such, this is a (d) index metadata methods. drawback of this approach, as experimentation with differ- An indextype is an object that specifies the routines that ent levels is needed in order to optimize performance for a manage a domain (application-specific) index. An indextype specific dataset. has two major components: (a) the methods that implement In the current implementation, the user is allowed to the index’s behavior, and (b) the operators that the index specify the tiling level of the octree through the parameter supports. In this paper, a new indextype was implemented OCTREE_LEVEL during index creation. The 3D region is in OEIF to enable use of the proposed octree structure for recursively divided into eight congruent cells up to the lidar data indexing in Oracle Spatial. The name of the new specified tiling level. Each cell is associated with a unique indextype is OCTREEINDEX. This index implementation is code, which is hereafter referred to as the cell code. The cell also comprised of an operator implementation called code is obtained by using z-ordering (i.e., Morton encoding) OT_CLIP_3D, which performs a window query on any given of all cells at the specified level (Morton, 1966). Each point 3D point cloud, which is stored in an SDO_GEOMETRY data is indexed by the associated cell code of the octant that type. contains the point. The ROWID of the point and the associ- Before introducing 3D data types, Oracle Spatial relied ated cell code are stored in the index storage table. The heavily on SDO_GEOMETRY. More recently Oracle Spatial metadata (e.g., tiling level, index name, index owner, max has moved to SDO_PC as the main data type employed for level, min level) for the entire index are stored as a row in the storage of multi-dimensional point cloud data. With this, the index metadata table. a set of points are grouped and stored as a Binary Large The 3D query processing using this implementation is Object (BLOB) object in a row. While there is no upper illustrated in Figure 2. To generate the result set for a bound on the number of points in an SDO_PC object, the spatial query, the octree index acts as the primary filter to current version of Oracle Spatial offers only limited place- find the area of interest or candidate geometries for this holders for the storage of information alongside locational query. Both a primary filter and a secondary filter are attributes, as only nine attributes can be stored together in used during the query process. The area of interest is the one element within SDO_PC. Another disadvantage is that union of the cells of the octree that interacts spatially Oracle Spatial does not yet offer functionality to update (e.g., intersect, touch, inside, covered-by) with the query SDO_PC objects. Consequently, the SDO_GEOMETRY data geometry, as established by the primary filter. These cells type remains highly useful, as it still provides the greatest are identified by the cell code, and candidate geometries are flexibility in storage of any geometry type including 3D data identified by the associated cell code from the index storage points. In particular, when considering 3D point clouds, it is table. These candidate geometries are passed through the desirable to store the locational and attribute information intermediate filter and divided into two sets. Cells inside or (e.g. color, intensity) in the same table as semantic informa- covered-by the query geometry are identified as exact tion often times directs feature recognition (typically applied matches. The points associated with these cells are sent at a later stage in the workflow). For these reasons, the directly to the result set. The remaining cells (those that implementation herein relies on SDO_GEOMETRY, instead intersect or touch the query window) are identified as of SDO_PC. However, the approach could easily be adapted unknown and pass through the secondary filter. The to be used with SDO_PC. secondary filter is a spatial function, which corresponds to

Figure 2. 3D query processing using octree index.

930 September 2012 PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING 11-051.qxd 8/17/12 6:11 PM Page 931

the spatial query. The required ODCIIndex interface Datasets combined from different sources are becoming methods for implementing the proposed octree index for more irregularly distributed with significantly higher lidar data atop OEIF are implemented as Java callouts. densities in some areas. To evaluate potential performance A previously available Java API was harnessed for this benefits of octree-based indexing on such irregularly- purpose (Kothuri, 2007, p. 223). It enables applications distributed lidar point clouds, portions of data were selec- written in Java to access and process geometry objects tively removed from this high density dataset. The original managed in an Oracle database with Oracle Spatial. The dataset contained nearly 66 million points over 0.25 km2, details of this implementation are available in Mosa (2010). including points on the ground and road surfaces, rooftops, tree canopies, and building facades. The initial point distribution was quite uniform, with a density of 225 points Evaluation of the Octree Spatial Index per square meter almost everywhere. The irregular dataset This section provides an evaluation of the implementation was achieved by removing points from the original data set of the octree index presented in the previous section. where the z-coordinate was larger than 20 m or less than A window query is used to compare response times of the 10 m. The resulting dataset contains points on building existing Oracle Spatial r-tree and the newly implemented facades. The dataset also contains points on rooftops and octree index. This evaluation has been conducted on a tree canopies, where the height is less than 20 m. This issue ™ computer with an Intel Core2 Duo CPU 2.53GHZ and 4GB generates an irregular distribution of points within the 3D RAM, 7,200 SATA hard drive using Oracle 11g release point cloud with a variable density in different areas. The 11.1.0.6. The 3D point cloud dataset is stored in Oracle’s original data set is shown in Figure 3 and the resulting SDO_GEOMETRY data type. In order to perform spatial irregularly distributed data set in Figure 4. Table 1 presents queries on 3D point cloud data, the dataset is indexed using the density distribution in a different area of the dataset. an r-tree and an octree in separate runs. The r-tree index Results of run times are shown in Figure 5. was created using Oracle’s existing in-built spatial index. Table 1 presents the average query response time along Presently, a 3D window query cannot be performed using with the density distribution in different regions. A square Oracle’s r-tree index, as only one 3D spatial operator can be window of size 10,000 m2 (100 m ϫ 100 m) was chosen for applied. As such, true 3D window queries cannot be per- the window query and moved around the underlying area formed using Oracle’s r-tree index. Thus, only the 2D in a random pattern. A total of 25 query windows were window query could be performed on the 2D r-tree index chosen in a 5 ϫ 5 grid. The query windows are presented in using the SDO_RELATE operator by providing “inside and Table 1 in ascending order according to their point density. touch” masks (Kothuri, 2007, p. 274). This provides func- For every query window, 10 queries were performed and the tionality similar to a general window query. Here, the 2D query report response time represents the average of these r-tree index is created on the 2D projection of the 3D point queries. Table 1 illustrates the comparison of window query cloud data. The overlap of sibling nodes and the uneven response time between the r-tree and the octree indexes. size of nodes in an r-tree are known to develop possibly The minimum density was 1.46 m2, maximum density was inefficient query execution (Zhu, 2007). 141.44 m2, and median density was 73.5 m2. With the octree index, the 3D window query can be The query response time increased with some variations performed on 3D lidar point cloud data using the operator with increased point density. Overall, the octree index OT_CLIP_3D. A tiling level of five was selected for the performs in a very consistent manner (Figure 5), and in most candidate dataset used in this example. An increased tiling cases, outperformed the in-built r-tree index by an average level would result in a decreased indexed point per cell factor of 2.4. For the window of 27.37 points/m2 density, the count, as well as decreased candidate geometries. The octree was 2.4 times faster than r-tree. Additionally, for increase of tiling levels also results in an increase of leaf window densities below the median value the octree was on nodes (e.g., the total number of leaf nodes at tiling level ‘n’ average twice as fast as the r-tree. On the other hand, for is 8n), as well as memory consumption during octree window densities over the median value the octree was on manipulation. average three times faster than the r-tree. For a highly dense The comparison of a 2D (x- and y- coordinates) window region (141.44 m2), the octree was about five times faster than query using an r-tree index with a 3D (x, y, and z) window the r-tree. The octree in most cases outperformed the r-tree. query using octree index requires the same number of resulting geometry for a window query, as query response time increases with increases with window size, as well as Discussion the number of resulting geometries (Kothuri, 2002). To The research presented in this paper implemented an octree ensure the same number of resulting geometries for both the index for 3D point cloud data, employing Oracle’s extensible octree and r-tree indexes, the minimum and maximum value indexing framework. An operator was implemented to of the z-coordinates of the query window were set to the perform a 3D window query. The implementation is minimum and maximum values of the underlying space in described along with some optimizations. The newly the case of the octree index. To this end, two randomly implemented octree index and Oracle’s inbuilt r-tree index selected data subsets from a dense aerial lidar flyover of are compared using a high density aerial lidar point cloud Dublin’s city center (Hinks et al., 2009) were selected for dataset. The evaluation highlighted distinct advantages of this experiment. One contained almost 2.9 million points using an octree based index for both regularly and irregu- and the other nearly 66 million points. Query response larly-distributed point cloud data. For regularly-distributed times were compared for the two index types for a variety data, the octree index consistently outperformed an r-tree of window sizes. For the smaller dataset (2.9 million), the index by two to eight times for almost every window size. octree was twice as fast as r-tree for the small window of For irregularly-distributed data, the octree index consistently 25 m2 and 8 times as fast for the large window of 2,500 m2. outperformed the r-tree index by 2 to 5 times, except for low For the larger dataset (66 million), the r-tree outperformed density areas (1.46 points/m2). However, the current imple- the octree for the small window of 400 m2 size, but for large mentation can be optimized further to improve performance windows of 1,600 m2 and above, the octree performed in lower density areas. distinctly better, with a six-fold improvement for a In this work, Oracle Spatial was chosen due to its 40,000 m2 window. current ability to store 3D data and the opportunity to

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING September 2012 931 11-051.qxd 8/17/12 6:12 PM Page 932

Figure 3. Original distribution of 3D point cloud (65,562,235 points).

Figure 4. Irregular distribution of 3D point cloud (17,517,406 points).

932 September 2012 PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING 11-051.qxd 8/17/12 6:12 PM Page 933

Figure 5. Window query of r-tree versus octree indexing 17,517,406 irregularly distributed points.

TABLE 1. DENSITY DISTRIBUTION OF QUERY WINDOWS (17,517,406 IRREGULARLY DISTRIBUTED POINTS)

Avg. Query Avg. Query Response Response Window Total Density Time in sec. Time in sec. No. X1 Y1 X2 Y2 Area (m2) Point (points/m2) (r-tree) (octree)

1 233,900 316,300 234,000 316,400 10,000 14,579 1.46 15.58 61.72 2 233,700 316,200 233,800 316,300 10,000 273,688 27.37 144.22 59.96 3 233,700 316,400 233,800 316,500 10,000 321,059 32.11 101.42 60.42 4 233,600 316,400 233,700 316,500 10,000 337,032 33.7 93.88 59.26 5 233,800 316,200 233,900 316,300 10,000 345,263 34.53 96.51 59.44 6 233,700 316,100 233,800 316,200 10,000 381,920 38.19 155.5 63.32 7 233,800 316,300 233,900 316,400 10,000 406,465 40.65 102.41 60.42 8 233,700 316,300 233,800 316,400 10,000 433,147 43.31 109.32 59.66 9 233,900 316,100 234,000 316,200 10,000 451,694 45.17 120.44 59.68 10 233,500 316,400 233,600 316,500 10,000 507,923 50.79 103.09 60.87 11 233,700 316,000 233,800 316,100 10,000 531,261 53.13 140.68 67.38 12 233,900 316,200 234,000 316,300 10,000 563,964 56.4 114.76 68.58 13 233,800 316,100 233,900 316,200 10,000 734,972 73.5 187.72 61.45 14 233,900 316,000 234,000 316,100 10,000 797,447 79.74 116.39 62.11 15 233,900 316,400 234,000 316,500 10,000 809,561 80.96 154.43 62.68 16 233,600 316,300 233,700 316,400 10,000 812,631 81.26 214.75 61.67 17 233,800 316,000 233,900 316,100 10,000 854,245 85.42 135.05 62.2 18 233,800 316,400 233,900 316,500 10,000 862,660 86.27 193.59 62.89 19 233,500 316,300 233,600 316,400 10,000 936,929 93.69 201.62 64.33 20 233,600 316,100 233,700 316,200 10,000 1,018,365 101.84 246.1 64.98 21 233,500 316,000 233,600 316,100 10,000 1,139,836 113.98 234.01 112.83 22 233,600 316,200 233,700 316,300 10,000 1,166,117 116.61 261.84 65.9 23 233,500 316,100 233,600 316,200 10,000 1,168,042 116.8 226.73 102.16 24 233,500 316,200 233,600 316,300 10,000 1,236,741 123.67 190.47 65.49 25 233,600 316,000 233,700 316,100 10,000 1,414,439 141.44 331.8 66.44

PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING September 2012 933 11-051.qxd 8/17/12 6:12 PM Page 934

compare indexing options. However, the implementation model generation, Journal of Computing in Civil Engineering, could be adapted for other SDBMS. Currently, in Oracle ASCE, 23(6), 330–339. Spatial, the r-tree or quadtree index can be applied on the Huang, S.L., S.A. Hager, K.Q. Halligan, I.S. Fairweather, block extent column of the SDO_PC data type. However, it A.K. Swanson, and L.A. Crabtree, 2009. A comparison of is possible to implement an octree index on the block extent individual tree and forest plot height derived from LiDAR and of the SDO_PC. Currently, using Oracle’s SDO_PC, only the InSAR, Photogrammetric Engineering & Remote Sensing, block extents are indexed rather the actual point geometries, 75(2):159-167. or one could access these points directly from the block Koo, Y.-M., and B.-S., Shin, 2005. An efficient point rendering using column and index them. Furthermore, it is also possible to octree and texture lookup, Proceedings of Computational Science and Its Applications - ICCSA 2005, Lecture Notes in implement a two-step index, where an octree indexes the Computer Science, No.3482, Springer-Verlag, Berlin, Germany, points inside a block and, therefore, maintains the semantic pp. 1187–1196. information inside that block, and an r-tree index serves as a Kothuri, R.K., S. Ravada, and D. Abugov, 2002. Quadtree and r-tree higher level index that is applied to the block extents, as indexes in Oracle Spatial: A comparison using GIS data, r-tree indexing is more suitable for polygon objects than Proceedings of the ACM SIGMOD International Conference on would be octree indexing. Further work will fully evaluate Management of Data, 02-06 June, Madison, Wisconsin, ACM, this approach. pp. 546–557. Finally, the method presented in this paper employs Kothuri, R., A. Godfrind, and E. Beinat, 2007. Pro Oracle Spatial for only one operator, which implements a window query. Oracle Database 11g, APress. A more comprehensive evaluation is needed in order to Kukko, A., and J. Hyyppä, 2009. Small-footprint laser scanning assess the octree index’s full potential for other query simulator for system validation, error assessment, and algorithm operators, such as nearest neighbor or within distance. Since development, Photogrammetric Engineering & Remote Sensing, octree indexes are useful for solving the visibility of points 75(10):1177–1189. during 3D point cloud rendering, an operator can be imple- Laefer, D.F., M. Bertolotto, B. Schoen, and A.S.M. Mosa, 2009. mented based on the octree index implemented for this Enablement of three-dimensional hosting, indexing, analysing, purpose. The operator should have the view frustum or and querying structure for spatial databases, Patent Application window as an input parameter and return only the visible No. 09177926, Europe (full filing December 2009). points for this window to facilitate the extraction of visual Laefer, D.F., and A. Pradhan, 2006. Evacuation route selection based on tree-based hazards using LiDAR and GIS, Journal of points for a window specified through SQL. In the prototype, Transportation Engineering, ASCE, 132 (4): 312–20. the tiling level was user selected. Further work will enhance the prototype by incorporating a feature for automatic tiling Morton, G.M., 1966. A Computer Oriented Geodetic Data Base and a New Technique in File Sequencing, IBM Technical Report, level determination. Ottawa, Canada. Mosa, A.S.M., 2010. Hosting, Indexing and Visualization of Three- Acknowledgments Dimensional (3D) Point Clouds, M.Sc. thesis, University College Dublin, National University of Ireland. This work was generously supported by the National Digital Research Centre Grant EoI/0701/008 and data was provided Olsen, M.J., E. Johnstone, N. Driscoll, S.A. Ashford, and F. Kuester, 2009. Terrestrial laser scanning of extended cliff sections in through funding from Science Foundation Ireland Grant dynamic environments: Parameter analysis, Journal of Surveying 05/PICA/I830 and Ireland’s Environmental Protection Engineering, ASCE, 135(4):161–169. Agency Grant 2005-CD-U1-M1. Samet, H., 2006. Object-based and image-based image representa- tions, Foundations of Multidimensional and Metric Data Structures (H. Samet, and A. Palmeiro editors), Morgan- References Kaufmann, San Francisco, pp. 211–220. Arens, C., J. Stoter, and P. van Oosterom, 2005. Modelling 3D Schön, B., D.F. Laefer, S.W. Morrish, and M. Bertolotto, 2009a. spatial objects in a Geo-DBMS using a 3D primitive, Computers Three-dimensional spatial information systems: State of the art & Geosciences, Elsevier, 31(2):165–177. review, Recent Patents in Computer Science, Bentham Science Beaulieu, A., and D. Clavet, 2009. Accuracy assessment of Canadian Publishers, 2(1):21–31. digital elevation data using ICESat, Photogrammetric Engineer- Schön, B., M. Bertolotto, and D.F. Laefer, 2009b. Storage, manipula- ing & Remote Sensing, 75(1):81–86. tion, and visualization of LiDAR data, Proceedingsof the 3rd Boubekeur, T., W. Heidrich, G. Xavier, and S. Christophe, 2006. International Workshop, 3D-ARCH-2009: 3D Virtual Reconstruc- Volume-surface trees, Computer Graphics Forum, Wiley- tion and Visualization of Complex Architectures (F. Remondino, Blackwell, 25(3):399–406. S. El-Hakim, and L. Gonzo editors)., 25-28 February, Trento, De Floriani, L., M. Facinoli, P. Magillo, and D. Dimitri, 2008. Italy, International Archives of Photogrammetry, Remote Sensing A hierarchical spatial index for triangulated surfaces, and Spatial Information Sciences, Volume XXXVIII-5/W1, ISSN Proceedings International Conference on Computer Graphics 1682–1777. Theory and Applications, 22-25 January, Funchal, Madeira, Shekhar, S., and S. Chawla, 2003. Spatial databases - A tour, Portugal, INSTICC Press, pp. 86–91. Prentice Hall. London. Lee, J., and S. Zlatanova (editors), 2009. 3D Geo-Information Straatsma M.W., and M.J. Baptist, 2008. Floodplain roughness Sciences, Lecture Notes in Geoinformation and Cartography, parameterization using airborne laser scanning and spectral Springer-Verlag, Berlin, Germany. remote sensing, Remote Sensing of Environment, 112(3): Guttman, A., 1984. R-trees: A dynamic index structure for spatial 1062–1080. searching, Proceedings of the 1984 ACM SIGMOD International Zhu, Q., J. Gong, and Z. Yeting, 2007. An efficient 3D r-tree spatial Conference on Management of Data, 18-21 June, Boston, index method for virtual geographic environments, ISPRS Massachusetts, ACM, pp. 47–57. Journal of Photogrammetry and Remote Sensing, 62(3):217–224. Hinks, T., H. Carr, and D.F. Laefer, 2009. Flight optimization (Received 05 July 2011; accepted 28 October 2011; final version algorithms for aerial LIDAR capture for urban infrastructure 26 January 2012)

934 September 2012 PHOTOGRAMMETRIC ENGINEERING & REMOTE SENSING