Integration of Spatial Vector Data in Enterprise Relational Database Environments

Integration of Spatial Vector Data in Enterprise Relational Database Environments Dissertation zur Erlangung des akademischen Grades Doktor-Ingenieur (Dr.-Ing.) vorgelegt dem Rat der Fakultät für Mathematik und Informatik der Friedrich-Schiller-Universität Jena im Juli 2006 von Dipl.-Inf. Knut Stolze geboren am 8. Januar 1975 in Jena, Thüringen Gutachter: 1. Prof. Dr. Klaus Küspert, Friedrich-Schiller-University Jena 2. Prof. Johann-Christoph Freytag, Ph.D., Humboldt-Universität zu Berlin 3. Dr. Albert Maier, IBM Deutschland Entwicklung GmbH, Böblingen Tag der letzten Prüfung des Rigorosums: 7. November 2006 Tag der öffentlichen Verteidigung: 10. November 2006 ii To My Wife Annett iv Abstract Spatial and geometric data can be found in virtually every relational database. Such data is often not explicitly modeled but rather hidden in addresses, for example. Unless spatial data is treated as first class citizen in the relational database management systems (RDBMSs), spatial features cannot be leveraged and exploited by applications to their full extent. Several products provide the capabilities to store and query geometric data directly in relational databases. Those products include dedicated data types along with associated spatial functions and indexing mechanisms. The SQL/MM spatial standard exists to ensure that the spatial interface provided by products are compatible. The standard covers basic spatial functionality only. Data types are defined and include a set of routines that are used to construct spatial values from external data formats, compare two geometries regarding their spatial relationship, generate new geometries, or to extract spatial properties. However, more advanced functionality is crucial for applications and their business logic is still lacking. Therefore, applications either disregard spatial information completely or noticeable effort is invested for the addition of customized spatial logic. This thesis is dedicated to closing the most significant gaps in the standard that we identified to further broaden the user base. The first issue addressed in the thesis is the insufficient support of spatial data types in programming languages and the respective database connectivity mechanisms, most prominently Java and JDBC. We extend the JDBC interface to incorporate a spatial class hierarchy. Instances of these classes can be directly retrieved from the RDBMS or passed to it, avoiding the current clumsy and time-consuming stream-based interface. Other major enhancements to the standard that are contributed by this thesis are graph and network related routines. We exploit the topological properties of geometries for functions that implement graph algorithms while treating graphs transparently as indexing structures for geometries only. Shortest path or traveling salesman problems can be solved directly on street networks. The results are in a relational fashion so that they can even be combined with data from other tables in the database. The SQL/MM spatial standard is only concerned with two-dimensional data. This is insufficient. Many applications require at least a third dimension, for instance CAD/CAM. We develop and implement the concepts for the integration of 3D geometries into the standard. This work also comprises the necessary considerations for operations in true three-dimensional data space. Enterprises operate diverse database systems and architectures. It is important to es- tablish transparent access to spatial and non-spatial data alike in such environments. Federation and replication are two mechanisms to provide such access in distributed systems. We develop the techniques to integrate spatial data in federation and replication setups while relying on the available products only. This restriction is based on the fact that new and independent solutions and prototypes are usually not an option in more complex configurations. Kurzfassung (German Abstract) Räumliche und geometrische Daten existieren in praktisch jeder relationalen Datenbank. Diese Daten sind jedoch oft nicht explizit modelliert, sondern z. B. in Adressen versteckt. Aber nur die explizite Modellierung erlaubt die Nutzung der räumlichen Eigenschaften durch Applikationen. Verschiedene Produkte stellen Funktionalität zur direkten Verwal- tung von räumlichen Daten in relationalen Datenbanken zur Verfugung.¨ Es sind dedi- zierte Datentypen, sowie zugehörige Routinen und Indexmechanismen implementiert. Zur Harmonisierung der Schnittstellen besagter Produkte wurde die SQL/MM Spati- al Norm entwickelt. Allerdings definiert die Norm nur Grundfunktionalität. So existieren Datentypen und zugehörige Routinen fur¨ die Konvertierung zwischen Geometrien und zeichenbasierten externen Datenformaten, Routinen zum Vergleich zweier Geome- trien gemäß ihrer räumlichen Beziehung, Routinen zum Erzeugen neuer Geometrien und Routinen zum Extrahieren von räumlichen Eigenschaften. Fur¨ Anwendungen sind aber höhere und komplexere Funktionen notwendig, damit räumlichen Daten nahtlos integriert werden können. Diese Arbeit widmet sich den wichtigsten dieser Lucken¨ und macht Vorschläge, um sie baldigst in der Norm zu schließen. Ein vordringliches Problem besteht bereits bei der direkten Einbindung der räumlichen Daten in die Schnittstellen zwischen Datenbanksystem und Anwendung, z. B. Java und JDBC. JDBC wird um eine Klassenhierarchie fur¨ räumlichen Daten erweitert. Instanzen dieser Klassen können direkt zwischen Anwendung und RDBMS ausgetauscht werden. Weitere Modifikationen der SQL/MM Spatial Norm beziehen sich auf Graphen und Netzwerkoperationen. Die topologischen Eigenschaften von Geometrien werden fur¨ neue Funktionen ausgenutzt, wobei die Graphen transparent als Indexstrukturen Anwendung finden. Somit können kurzeste¨ Pfade berechnet oder das Traveling-Salesman-Problem direkt auf den gespeicherten Straßendaten gelöst werden. Die Ergebnisse werden in relationaler Form aufgebaut, so dass sie direkt mit anderen Daten in der Datenbank verbunden werden können. Die SQL/MM Spatial Norm bezieht sich ausschließlich auf zwei-dimensionale räumliche Daten. Fur¨ CAD/CAM Applikationen ist dies z. B. unzureichend, und zumindest eine weitere Dimension ist notwendig. Die Konzepte fur¨ die Integration von 3D Objekten in die Norm werden entwickelt und erprobt. Dies umfasst auch die Erweiterung aller bereits vorhandener Operationen fur¨ den drei-dimensionalen Raum. Viele verschiedene Datenbanksysteme und Architekturen finden in kommerziellen Pro- duktionsumgebungen Einsatz. Transparenter Zugriff auf räumliche und nicht-räumliche Daten ist in solchen Umgebungen essentiell. Föderation und Replikation sind zwei Me- chanismen, um Daten zwischen den Komponenten verteilter Systeme auszutauschen. Diese Arbeit entwickelt die Techniken, um räumliche Daten in diese Prozesse zu inte- grieren, da sie bisher vernachlässigt wurden. Dabei wird gezielt auf die Möglichkeiten existierender Produkte gebaut, weil neue Lösungen und Prototypen ublicherweise¨ nicht sicher in solch komplexen Konfiguration einzusetzen sind. Contents List of Figures xiii List of Tables xvii List of Listings xix Preface xxi I Basics 1 1 Introduction 3 2 Concerning Spatial Data 9 2.1 SpatialData.................................. 10 2.1.1 TypesofGeometricPrimitives. 10 2.1.2 SpatialReferenceSystems . 11 2.1.3 PropertiesofGeometries . 15 2.2 Object-RelationalFeaturesinSQL:2003 . ..... 16 2.3 TheSQL/MMSpatialStandard . 19 2.3.1 History................................. 20 2.3.2 SpatialTypeHierarchy. 22 2.3.3 MethodsontheSpatialTypes . 25 2.3.4 SupportingDataTypes. 28 2.3.5 SpatialInformationSchema . 29 2.3.6 SampleUsageScenarios . 31 2.3.7 DiscussionoftheStandard. 35 2.3.8 Improved Spatial Type Hierarchy . 39 2.3.9 MajorGapsintheStandard . 43 2.4 GeographyMarkupLanguage . 44 2.5 Products Implementing Spatial Extensions . ...... 46 2.5.1 IBM DB2 Spatial and Geodetic Extenders . 46 2.5.2 IBMInformixSpatialDataBlade . 48 vii Contents 2.5.3 MapInfoSpatialWare. 49 2.5.4 MySQLSpatial ............................ 50 2.5.5 OracleSpatial............................. 51 2.5.6 PostgreSQL&PostGIS. 52 II Functional Enhancements for SQL/MM 53 3 Spatial Application Development 55 3.1 SpatialSupportinJDBC........................... 56 3.2 SpatialTransformGroups . 58 3.2.1 ST WellKnownText TransformGroup. 59 3.2.2 ST WellKnownBinary TransformGroup . 60 3.2.3 ST GML TransformGroup. .. .. 61 3.2.4 Transforms for Application Bindings . .. 63 3.3 EnhancementstoJDBC ........................... 63 3.3.1 SpatialClassHierarchy. 64 3.3.2 Extending the Interface ResultSet .................. 66 3.3.3 Extending the Interface PreparedStatement ............ 66 3.3.4 Extending the Interface CallableStatement ............. 68 3.3.5 Prototypical Implementation . 69 3.3.6 Conclusions for the Extended JDBC Driver . 74 3.4 Summary ................................... 75 4 Integration of Graph Functionality 77 4.1 GraphConcepts................................ 79 4.1.1 Definitions............................... 79 4.1.2 Algorithms .............................. 80 4.2 Literature&ProductOverview . 83 4.3 Mappings Between Geometries and Graphs . ... 85 4.3.1 Building Graphs from Linestrings . 85 4.3.2 Building Graphs from Point Geometries . 95 4.3.3 BuildingGraphsfromPolygons . 98 4.3.4 ImpactonGraphAlgorithms . 100 4.4 TheSpatialGraphExtenderforDB2 . 102 4.4.1 Requirements ............................. 102 4.4.2 Architecture.............................. 103 4.4.3 GraphsConstruction . 106 4.4.4 UpdatingGraphs . .. .. .. 111

Integration of Spatial Vector Data in Enterprise Relational Database Environments

Mapinfo Professional Data Sheet

Filemaker 16 ODBC and JDBC Guide

Chapter 3 DB Gateways.Pptx

Mapinfo Professional®

Mapbasic User Guide, Mapbasic’S Documentation Set Includes the Mapbasic Reference, and Online Help System

SWUTC/11/161142-1 Evaluation of Mobile Source Greenhouse Gas Emissions for Assessment of Traffic Management Strategies August 20

Preview JDBC Tutorial

PRODUCT DATA Sheet

Mapinfo Professional

Java Database Connectivity (JDBC)

Mapinfo Mapbasic V17.0 User Guide

Java Database Connectivity (JDBC) J2EE Application Model