Using Foreign Data Wrapper in Postgresql to Expose Point Clouds on File System

MSc thesis in Geomatics Using Foreign Data Wrapper in PostgreSQL to Expose Point Clouds on File System Mutian Deng November 2020 A thesis submitted to the Delft University of Technology in partial fulfillment of the requirements for the degree of Master of Science in Geomatics Mutian Deng: Using Foreign Data Wrapper in PostgreSQL to Expose Point Clouds on File System (2020) cb This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The work in this thesis was carried out in the: Geo-Database Management Centre Faculty of Architecture & the Built Environment Delft University of Technology Supervisors: Dr. Martijn Meijers Prof.dr. Peter van Oosterom Co-reader: Dr. Hans Hoogenboom Abstract In the recent years, the point clouds data becomes an important source of geographic information thanks to the rapid development of data acquisition techniques and the wide range of the applications. Light Detection And Ranging (LiDAR) is the main survey campaign that generates highly accurate point clouds data in an efficient manner. LiDAR point clouds data has two main features: on the one hand, a point clouds data has a big volume containing million and even billion point records, on the other hand, it is multi-dimensional which means each point records consist of several fields, there can be Red,Green and Blue (RGB) values, intensity and gps time next to the 3D position. In order to release the great potential of this kind of geospatial data, the point clouds data should be made full and deep use. In this context, an efficient solution for handling LiDARdata is required. There are two main solution to manage LiDARdata, one is file-based solution using files for storage and file tools for processing; another one is Database Management System (DBMS) solution storing data in the database and supporting capabilities on LiDAR data. Both solutions have it own benefit and problems, thus a hybrid solution is proposed in this research. This research is aimed to use a hybrid solution to combine the advantages of file system solution and DBMS solution, therefore the Foreign Data Wrapper (FDW) is introduced and developed in this research. FDW is an extension of PostgreSQL that can access data residing outside the PostgreSQL using Structured Query Language (SQL) language. Thus be means of FDW, we can use LiDAR point clouds on file system directly in PostgreSQL. Since in the real-world applications, different LiDAR files are collected in a file system rather than using only massive file, it is necessary to develop a FDW of Point Cloud Data Man- agement System. The aim of this research is to answer the main research question: to what extent we can use LiDAR point clouds directly in the PostgreSQL by means of FDW, and thus a FDW supporting the Point Cloud Data Management System is implemented. Then, the range and performance of its functionality are evaluated. The results shows this FDW solution is feasible while the querying time is relevant to the number of returned points. The benefit and problem of this FDW are analyzed. v Acknowledgements My first gratitude is expressed to my mentors: Martijn and Peter who help me a lot in the thesis. The graduation cannot be finished without their support. Martijn provides me many technical support and direction when I got lost in dealing with troubles and when I missed some important points to consider in the research. Peter gives me a lot helpful advice from his broad knowledge which give me direction when I am unsure about the method. Due to the Corona, the research of graduation thesis is carried out online and it became super difficult to have face-to-face meetings. Martijn, Peter and I hold the progress meeting once two weeks and they both attended the meeting and gave me very valuable supervise during every meeting. Finally I want to thanks to my parents who gives me financial support that let me gain this great study and life experience in Delft. And I also want to thanks to my friends for their accompany and help. vii Contents 1 Introduction 1 1.1 Background . .1 1.2 Problem statement . .2 1.3 Research question . .4 1.4 Thesis outline . .5 2 Related work 7 2.1 Point Clouds Data Management . .7 2.2 File system . .7 2.2.1 File formats . .8 2.2.2 File tools . 11 2.2.3 Pros and Cons . 12 2.3 Database Management System . 13 2.3.1 Databases . 13 2.3.2 Storage model . 15 2.3.3 Pros and Cons . 16 2.4 SQL Management of External Data . 16 2.4.1 Accessing . 16 2.4.2 Querying . 18 3 Methodology 21 3.1 Foreign Data Wrapper . 21 3.1.1 Principle of FDW . 21 3.1.2 Using FDW . 21 3.1.3 Writing FDW . 22 3.2 Data access . 24 3.2.1 Data storage . 24 3.2.2 Data organization . 25 3.2.3 Data display . 26 3.3 Data functionality . 27 3.3.1 Query . 27 3.3.2 Data manipulation . 30 3.3.3 Type conversion . 31 3.4 System Architecture . 31 3.4.1 System components . 33 3.4.2 System process . 34 4 Implementation 37 4.1 Tools and datasets . 37 4.1.1 Hardware . 37 4.1.2 Software . 37 4.1.3 Datases . 38 4.2 Point Clouds Data Management System . 41 4.2.1 Metadata file . 41 4.2.2 Data access . 41 4.2.3 Querying . 42 4.3 Benchmark . 45 ix Contents 5 Results and Analysis 49 5.1 Feasibility test . 49 5.2 Efficiency test . 52 5.3 Scalability test . 54 6 Conclusion 57 6.1 Conclusion . 57 6.2 Future work . 58 x List of Figures 2.1 Management of External Data (MED) components . 18 2.2 Shortened title for the list of figures . 18 3.1 Foreign data wrapper solution for LiDAR data . 32 3.2 System Architecture . 33 4.1 Datasets from AHN3 . 38 4.3 Design of dataset scales . 46 4.4 Design of Design of querying levels on small scale datasets . 46 4.5 Feasibility test . 47 4.6 Efficiency test . 47 4.7 Scalability test . 48 5.1 Shortened title for the list of figures . 52 xi List of Tables 1.1 Advantages and disadvantages of DBMS solution . .4 1.2 Advantages and disadvantages of file system solution . .4 2.1 LAS file Point Data Record Format 0 . 10 2.2 LAS classification values (the first 10) . 10 4.1 Metadata info about each Acteel Hoogtebestand Nederland (AHN) dataset . 39 4.2 Sub datasets splitted from AHN3 C 37EN1.LAZ ........................ 40 4.3 Design of dataset scales . 46 4.4 Design of querying levels . 46 4.5 Feasibility test . 47 4.6 Scalability test . 48 5.1 Test design of number of relevant files . 49 5.2 Querying rectangles of mini level on large scale data . ..

Using Foreign Data Wrapper in Postgresql to Expose Point Clouds on File System

Having Clause in Sql Server with Example

Querying Graph Databases: What Do Graph Patterns Mean?

Sql Create Table Variable from Select

Alias for Case Statement in Oracle

Look out the Window Functions and Free Your SQL

How to Get Data from Oracle to Postgresql and Vice Versa Who We Are

SQL and Management of External Data

Max in Having Clause Sql Server

Chapter 11 Querying

Firebird 3 Windowing Functions

Relational Algebra and SQL Relational Query Languages

SQL/PSM Stored Procedures Basic PSM Form Parameters In