
Bringing Location Analysis to the Semantic Web with the OGC GeoSPARQL Standard

Matthew Perry, Xavier Lopez

Agenda

• About the GeoSPARQL SWG
• Use Cases & Requirements
• From SPARQL to GeoSPARQL
• GeoSPARQL Technical Details
• Implementation Considerations
• Live Demo

OGC®

Group Members

• Open Geospatial Consortium standards working group
  – 13 voting members, 36 observers
  – Editors: Matthew Perry and John Herring
  – Chairs: John Herring and Dave Kolas
• Submitting Organizations
  – Traverse Technologies, Inc.

Standardization Process

1) Form SWG (June 2010)
2) Release candidate standard (May 2011)
3) OAB vote on candidate standard (June 2011)
4) 30-day public comment period (July 2011)
5) Process comments and update document (Feb. 2012)
6) TC/PC vote (March 22, 2012)
7) Publish standard (May 2012)

Linked Data

• Many LOD datasets have geospatial components
• Barriers to integration
  – Vendor-specific geometry support
  – Different vocabularies
    • W3C Basic Geo, GML XMLLiteral, Vendor-specific
  – Different spatial reference systems
    • WGS84 Lat-Long, British National Grid

[Figure: linked datasets, including LinkedSensorData, with example cross-dataset questions]
What DBPedia Historic Buildings are within walking distance?
What OpenStreetMap Dog Parks are inside Ordnance Survey Southampton Administrative District?

Gazetteers and Linked Open Data Services

• Provide common terms (place names) to link across existing spatial data resources
• Enable consolidated view across the map layers
• Reconcile differences in data semantics so that they can all “talk” and interoperate
• Resolve semantic discrepancies across databases, gazetteers, and applications
• Integrate full breadth of enterprise content continuum (structured, spatial, email, documents, web services)

Semantic GIS

• GIS applications with semantically complex thematic aspects
  – Logical reasoning to classify features
    • land cover type, suitable farm land, etc.
  – Complex Geometries
    • Polygons and Multi-Polygons with 1000’s of points
  – Complex Spatial Operations
    • Union, Intersection, Buffers, etc.

Find parcels with an area of at least 3 sq. miles that touch a local feeder road and are inside an area of suitable farm land.

Towards Qualitative Spatial Reasoning

• Don’t always have geometry data
  – Textual descriptions
    • Next to Hilton hotel
    • Inside Union Square
  – Incomplete geometry data
    • Only have geometries for some features
• Hybrid quantitative and qualitative spatial reasoning

• GeoSPARQL takes some steps in this direction
  – Vocabulary for asserting topological relations
  – Same query specification for qualitative and quantitative systems

Requirements for GeoSPARQL

• Provide a common target for implementers & users
  – Representation and query
• Work within SPARQL’s extensibility framework
• Simple enough for general users
  – Keep the common case simple (WGS 84 point data)
• Capable enough for GIS professionals
  – Multiple SRSs, complex geometries, complex operators
• Don’t re-invent the wheel!

Simple Features, ISO 19107 (Spatial Schema), Well Known Text (WKT), ISO 13249 (SQL/MM), GML, KML, GeoJSON

GEOSPARQL TECHNICAL DETAILS

FROM SPARQL TO GEOSPARQL

SPARQL QUERY

RDF Data

    :res1 rdf:type :House .
    :res1 :baths "2.5"^^xsd:decimal .
    :res1 :bedrooms "3"^^xsd:decimal .

    :res2 rdf:type :Condo .
    :res2 :baths "2"^^xsd:decimal .
    :res2 :bedrooms "2"^^xsd:decimal .

    :res3 rdf:type :House .
    :res3 :baths "1.5"^^xsd:decimal .
    :res3 :bedrooms "3"^^xsd:decimal .

SPARQL Query

    SELECT ?r ?ba ?br
    WHERE { ?r rdf:type :House .
            ?r :baths ?ba .
            ?r :bedrooms ?br }

Result Bindings

    ?r    | ?ba   | ?br
    ===================
    :res1 | "2.5" | "3"
    :res3 | "1.5" | "3"

SPARQL QUERY

RDF Data

    :res1 rdf:type :House .
    :res1 :baths "2.5"^^xsd:decimal .
    :res1 :bedrooms "3"^^xsd:decimal .

    :res2 rdf:type :Condo .
    :res2 :baths "2"^^xsd:decimal .
    :res2 :bedrooms "2"^^xsd:decimal .

    :res3 rdf:type :House .
    :res3 :baths "1.5"^^xsd:decimal .
    :res3 :bedrooms "3"^^xsd:decimal .

SPARQL Query

    SELECT ?r ?ba ?br
    WHERE { ?r rdf:type :House .
            ?r :baths ?ba .
            ?r :bedrooms ?br
            FILTER (?ba > 2) }

Result Bindings

    ?r    | ?ba   | ?br
    ===================
    :res1 | "2.5" | "3"
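The way the result bindings follow from the data can be sketched in a few lines of Python. This is an illustrative in-memory matcher, not how a SPARQL engine is implemented; the names `TRIPLES`, `value_of` and `select_houses` are hypothetical, and prefixed names are kept as plain strings.

```python
from decimal import Decimal

# Triples mirroring the slide's RDF data (prefixed names kept as plain strings).
TRIPLES = [
    (":res1", "rdf:type", ":House"),
    (":res1", ":baths", Decimal("2.5")),
    (":res1", ":bedrooms", Decimal("3")),
    (":res2", "rdf:type", ":Condo"),
    (":res2", ":baths", Decimal("2")),
    (":res2", ":bedrooms", Decimal("2")),
    (":res3", "rdf:type", ":House"),
    (":res3", ":baths", Decimal("1.5")),
    (":res3", ":bedrooms", Decimal("3")),
]


def value_of(subject, predicate):
    """Return the first object for (subject, predicate), or None."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


def select_houses(min_baths=None):
    """Bind ?r ?ba ?br for houses, optionally keeping only ?ba > min_baths."""
    rows = []
    for s, p, o in TRIPLES:
        if p == "rdf:type" and o == ":House":
            ba = value_of(s, ":baths")
            br = value_of(s, ":bedrooms")
            if ba is None or br is None:
                continue  # a pattern with no match yields no binding
            if min_baths is None or ba > min_baths:
                rows.append((s, ba, br))
    return rows
```

Calling `select_houses()` corresponds to the query without the FILTER, and `select_houses(2)` to the query with `FILTER (?ba > 2)`.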

Spatial SPARQL QUERY

This is what we are standardizing: Vocabulary & Datatypes, Extension Functions

Spatial RDF Data (Vocabulary & Datatypes)

    :res1 rdf:type :House .
    :res1 :baths "2.5"^^xsd:decimal .
    :res1 :bedrooms "3"^^xsd:decimal .
    :res1 ogc:hasGeometry :geom1 .
    :geom1 ogc:asWKT "POINT(-122.25 37.46)"^^ogc:wktLiteral .

    :res3 rdf:type :House .
    :res3 :baths "1.5"^^xsd:decimal .
    :res3 :bedrooms "3"^^xsd:decimal .
    :res3 ogc:hasGeometry :geom3 .
    :geom3 ogc:asWKT "POINT(-122.24 37.47)"^^ogc:wktLiteral .

GeoSPARQL Query (Extension Functions): find houses within a search polygon

    SELECT ?r ?ba ?br
    WHERE { ?r rdf:type :House .
            ?r :baths ?ba .
            ?r :bedrooms ?br .
            ?r ogc:hasGeometry ?g .
            ?g ogc:asWKT ?wkt
            FILTER(ogcf:sfWithin(?wkt, "POLYGON(…)"^^ogc:wktLiteral)) }

Components of GeoSPARQL

• Vocabulary for Query Patterns
  – Classes
    • Spatial Object, Feature, Geometry
  – Properties
    • Topological relations
    • Links between features and geometries
  – Datatypes for geometry literals
    • ogc:wktLiteral, ogc:gmlLiteral
• Query Functions
  – Topological relations, distance, buffer, intersection, …
• Entailment Components
  – RDFS entailment
  – RIF rules to compute topological relations

GeoSPARQL Vocabulary: Basic Classes and Relations

ogc:SpatialObject
  ogc:Feature (same as ISO GFI_Feature)
  ogc:Geometry (same as ISO GM_Object)

Links from ogc:Feature to ogc:Geometry
  ogc:hasGeometry (0 .. *)
  ogc:hasDefaultGeometry (0 .. 1)

ogc:Geometry metadata
  ogc:dimension : xsd:int
  ogc:coordinateDimension : xsd:int
  ogc:spatialDimension : xsd:int
  ogc:isEmpty : xsd:boolean
  ogc:isSimple : xsd:boolean

ogc:Geometry serializations (geometry encoded as a literal)
  ogc:asWKT : ogc:wktLiteral
  ogc:asGML : ogc:gmlLiteral
  …

Details of ogc:wktLiteral

All RDFS Literals of type ogc:wktLiteral shall consist of an optional IRI identifying the coordinate reference system, followed by Well Known Text (WKT) describing a geometric value [ISO 19125-1].

"POINT(-122.4192 37.7793)"^^ogc:wktLiteral

WGS84 longitude-latitude is the default CRS

"POINT(-122.4192 37.7793)"^^ogc:wktLiteral
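The lexical rule above (an optional IRI, then WKT) can be sketched as a small parser. This is a minimal illustration, not a validating WKT reader. The function name `split_wkt_literal` is hypothetical, and the default CRS IRI is an assumption taken from the published GeoSPARQL standard rather than from these slides.

```python
import re

# Assumed default CRS IRI (WGS84 longitude-latitude); taken from the published
# GeoSPARQL standard, not from the slides.
DEFAULT_CRS = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<IRI>" prefix, then the WKT text.
_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]*)>\s*)?(.+?)\s*$", re.DOTALL)


def split_wkt_literal(lexical):
    """Split a wktLiteral lexical form into (crs_iri, wkt_text)."""
    match = _WKT_LITERAL.match(lexical)
    if match is None:
        raise ValueError("empty wktLiteral")
    crs_iri, wkt_text = match.groups()
    # Fall back to the default CRS when no IRI is present.
    return (crs_iri or DEFAULT_CRS, wkt_text)
```

For example, `split_wkt_literal("POINT(-122.4192 37.7793)")` pairs the point with the default CRS, while a literal that starts with an IRI in angle brackets returns that IRI instead.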

European Petroleum Survey Group (EPSG) maintains a set of CRS identifiers.

Topological Relations between ogc:SpatialObject

[Figure: example configurations of regions A and B for each relation]
ogc:sfEquals, ogc:sfTouches, ogc:sfOverlaps, ogc:sfContains, ogc:sfWithin, ogc:sfDisjoint, ogc:sfIntersects, ogc:sfCrosses

• Assumes Simple Features Relation Family
• Also supports Egenhofer and RCC8

Example Data

Meta Information

    :City rdfs:subClassOf ogc:Feature .
    :Park rdfs:subClassOf ogc:Feature .
    :exactGeometry rdfs:subPropertyOf ogc:hasGeometry .

Non-spatial Properties

    :SanFrancisco rdf:type :City .
    :UnionSquarePark rdf:type :Park .
    :UnionSquarePark :commissioned "1847-01-01"^^xsd:date .

Spatial Properties

    :UnionSquarePark :exactGeometry :geo1 .
    :geo1 ogc:asWKT "Polygon((…))"^^ogc:wktLiteral .

    :SanFrancisco :exactGeometry :geo2 .
    :geo2 ogc:asWKT "Polygon((…))"^^ogc:wktLiteral .

    :UnionSquarePark ogc:sfWithin :SanFrancisco .

Why don’t you have ogc:myFavoriteProperty?

• GeoSPARQL vocabulary is not comprehensive
  – Just enough to define a reasonable set of query patterns

• There are other efforts for more comprehensive vocabularies
  – ISO / TC 211
  – SOCoP
  – GeoVocamps

• GeoSPARQL vocabulary can easily be extended with other application/domain-specific vocabularies

Why Encode Geometry Data as a Literal?

Advantage: single self-contained unit
• Consistent way to select geometry information
• Consistent way to pass geometry information around

Find all water bodies that are within 1 km of Route 3

    SELECT ?water ?wWKT
    WHERE { ?water rdf:type :WaterBody .
            ?water :hasExactGeometry ?wGeo .
            ?wGeo ogc:asWKT ?wWKT .
            :Route_3 :hasExactGeometry ?r3Geo .
            ?r3Geo ogc:asWKT ?r3WKT .
            FILTER(ogcf:distance(?r3WKT, ?wWKT,…) <= 1000) }

Why don’t you support W3C Basic Geo?

• Too simple to meet our requirements
  – Can’t use different datums and coordinate systems
  – Limited number of geometry types

• W3C Basic Geo data can easily be converted to wktLiteral

    PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
    PREFIX ogc: <http://www.opengis.net/geosparql#>
    SELECT (STRDT(CONCAT("POINT(", STR(?long), " ", STR(?lat), ")"), ogc:wktLiteral) AS ?wktLit)
    WHERE { ?point geo:long ?long .
            ?point geo:lat ?lat }
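The same conversion can be sketched outside SPARQL, for example when preprocessing Basic Geo data before loading. The function names below are hypothetical. The range checks are an added assumption. Longitude comes first in the WKT, matching the default longitude-latitude order.

```python
# Datatype IRI as used elsewhere in this deck for the wktLiteral index.
WKT_LITERAL = "http://www.opengis.net/geosparql#wktLiteral"


def basic_geo_to_wkt(lat, long):
    """Build a WKT point from W3C Basic Geo lat/long values (strings or numbers)."""
    lat_f, long_f = float(lat), float(long)
    if not -90.0 <= lat_f <= 90.0:
        raise ValueError("latitude out of range: %r" % (lat,))
    if not -180.0 <= long_f <= 180.0:
        raise ValueError("longitude out of range: %r" % (long,))
    # Keep the original lexical values so no precision is lost in formatting.
    return "POINT({} {})".format(long, lat)


def as_typed_literal(wkt):
    """Render the WKT text as a typed literal in N-Triples style."""
    return '"{}"^^<{}>'.format(wkt, WKT_LITERAL)
```

For instance, `basic_geo_to_wkt("37.7793", "-122.4192")` gives the point text used in the wktLiteral examples earlier in the deck.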

GeoSPARQL Query Functions

– ogcf:distance(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral, units: xsd:anyURI): xsd:double

– ogcf:buffer(geom: ogc:wktLiteral, radius: xsd:double, units: xsd:anyURI): ogc:wktLiteral

– ogcf:convexHull(geom: ogc:wktLiteral): ogc:wktLiteral
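To give a feel for what an operator such as ogcf:convexHull computes, here is a minimal planar sketch using Andrew's monotone chain algorithm on a list of (x, y) tuples. It is an illustration under simplifying assumptions (planar coordinates, point input only), not the operator as any GeoSPARQL implementation provides it. The function names are hypothetical.

```python
def _cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

Interior points, such as the centre of a square, are dropped from the result.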

GeoSPARQL Query Functions

– ogcf:intersection(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

– ogcf:union(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

GeoSPARQL Query Functions

– ogcf:difference(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

– ogcf:symDifference(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

GeoSPARQL Query Functions

– ogcf:envelope(geom: ogc:wktLiteral): ogc:wktLiteral

– ogcf:boundary(geom: ogc:wktLiteral): ogc:wktLiteral

– ogcf:getSRID(geom: ogc:wktLiteral): xsd:anyURI

GeoSPARQL Topological Query Functions

– ogcf:relate(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral, patternMatrix: xsd:string): xsd:boolean

DE-9IM Intersection Matrix for ogc:contains (rows: geom1, columns: geom2)

                       geom2
                       Interior  Boundary  Exterior
    geom1 Interior     T         T         T
    geom1 Boundary     F         F         T
    geom1 Exterior     F         F         T

patternMatrix: TTTFFTFFT
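How a pattern matrix is tested against a computed intersection matrix can be sketched as follows. The sketch assumes the usual DE-9IM conventions: matrix cells are F or a dimension 0, 1 or 2, and pattern cells are T, F, * or a dimension, all read row by row. The function name `matches_de9im` is hypothetical. Computing the matrix itself is left to a geometry library.

```python
def matches_de9im(matrix, pattern):
    """Return True if a 9-cell DE-9IM matrix satisfies a 9-cell pattern.

    matrix cells: 'F' (empty) or '0', '1', '2' (dimension of the intersection).
    pattern cells: 'T' (non-empty), 'F' (empty), '*' (any), or '0', '1', '2'.
    """
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("matrix and pattern must have 9 cells")
    for cell, want in zip(matrix.upper(), pattern.upper()):
        if want == "*":
            continue  # wildcard: any value is accepted
        if want == "T":
            if cell == "F":
                return False  # T requires a non-empty intersection
        elif cell != want:
            return False  # F or an exact dimension must match literally
    return True
```

With the matrix `212FF1FF2` (one polygon strictly inside another, seen from the containing polygon), the slide's pattern `TTTFFTFFT` matches.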

GeoSPARQL Topological Query Functions

– ogcf:sfEquals(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfDisjoint(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfIntersects(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfTouches(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfCrosses(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfWithin(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfContains(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean
– ogcf:sfOverlaps(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

Assumes Simple Features Relation Family

GeoSPARQL Query Rewrite Extension

Find all water bodies within New Hampshire

    SELECT ?water
    WHERE { ?water rdf:type :WaterBody .
            ?water ogc:rcc8Within :NH }

Same Query Specification
• Qualitative: RCC8 Backward Chaining
• Quantitative: Query Rewrite (specified with a RIF rule)

    SELECT ?water
    WHERE { ?water rdf:type :WaterBody .
            ?water ogc:hasDefaultGeometry ?wGeo .
            ?wGeo ogc:asWKT ?wWKT .
            :NH ogc:hasDefaultGeometry ?nGeo .
            ?nGeo ogc:asWKT ?nWKT .
            FILTER(ogcf:rcc8Within(?wWKT, ?nWKT)) }
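The shape of the rewrite can be illustrated with a small string template: a topological triple pattern between two features expands into geometry lookups plus a filter on the matching function. This is only a sketch of the expansion shown above. The specification expresses the rewrite as a RIF rule, and the function name and the generated variable names (`?g1`, `?wkt1`, and so on) are hypothetical.

```python
def rewrite_relation(subject, obj, relation):
    """Expand '<subject> ogc:<relation> <obj>' into its geometry-based form.

    subject and obj are SPARQL terms (variables or prefixed names);
    relation is the local name shared by the property and the function.
    """
    lines = [
        "{} ogc:hasDefaultGeometry ?g1 .".format(subject),
        "?g1 ogc:asWKT ?wkt1 .",
        "{} ogc:hasDefaultGeometry ?g2 .".format(obj),
        "?g2 ogc:asWKT ?wkt2 .",
        "FILTER(ogcf:{}(?wkt1, ?wkt2))".format(relation),
    ]
    return "\n".join(lines)
```

For the example above, `rewrite_relation("?water", ":NH", "rcc8Within")` produces the five pattern lines of the quantitative query, with different variable names.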

IMPLEMENTATION CONSIDERATIONS

Implementing Spatial Operations

• These are standard OGC operators that have been around for some time
• Lots of infrastructure available
  – Open Source
  – Commercial

Other Considerations

• Have to handle geometries from multiple Spatial Reference Systems simultaneously
  – Normalize to common SRS on-the-fly during computation
  – Pre-normalize ahead of time

• Spatial Indexing very important for performance
  – Normalize to common SRS during indexing
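Why indexing matters can be shown with a toy uniform-grid index over bounding boxes. The sketch assumes every box has already been normalized to one common SRS, as the bullets above recommend. The class name `GridIndex` and the cell size are hypothetical. A production system would more likely use an R-tree, such as the one behind a database's native spatial index.

```python
import math
from collections import defaultdict


def _overlaps(a, b):
    """True if boxes (min_x, min_y, max_x, max_y) share any point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Uniform grid over bounding boxes expressed in one common SRS."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (ix, iy) -> keys touching that cell
        self.boxes = {}                # key -> bounding box

    def _cells_for(self, box):
        min_x, min_y, max_x, max_y = box
        x0 = math.floor(min_x / self.cell_size)
        x1 = math.floor(max_x / self.cell_size)
        y0 = math.floor(min_y / self.cell_size)
        y1 = math.floor(max_y / self.cell_size)
        for ix in range(x0, x1 + 1):
            for iy in range(y0, y1 + 1):
                yield (ix, iy)

    def insert(self, key, box):
        self.boxes[key] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(key)

    def query(self, box):
        """Return sorted keys whose boxes overlap the query box."""
        candidates = set()
        for cell in self._cells_for(box):
            candidates |= self.cells.get(cell, set())
        # Grid cells only narrow the search; confirm with an exact box test.
        return sorted(k for k in candidates if _overlaps(self.boxes[k], box))
```

Only geometries sharing a grid cell with the query window are tested exactly, which is the same filter-then-refine idea that spatial indexes apply at scale.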

Summary

• GeoSPARQL Defines:
  – Basic vocabulary, Query functions, Entailment component
• Based on existing OGC/ISO standards
  – WKT, GML, Simple Features, ISO 19107
• Uses SPARQL’s built-in extensibility framework
• Modular specification
  – Allows flexibility in implementations
  – Easy to extend
• Accommodates qualitative and quantitative systems
  – Same query specification for qualitative (core + topology vocabulary) and quantitative (all components, incl. query rewrite)

GEOSPARQL DEMO WITH ORACLE DATABASE SEMANTIC TECHNOLOGIES

THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS. THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE.

Demo Setup

US Census Polygon Data (California Cities, Counties, School Districts)
• 6 k triples
• 2 k (Multi)Polygons (NAD83 Long Lat)
• Avg 206 points per polygon
• Max 7707 points per polygon

Linked GeoData Relevant Nodes Dataset
• 70 m triples
• 5.9 m points (WGS84 Long Lat)

GeoSPARQL Prototype *

Step 1: Load RDF from Linked GeoData

1) Load into staging table
2) Replace virtrdf:Geometry with ogc:wktLiteral
3) Bulk Load into RDF Store

* Equivalent functionality is available in Oracle Database 11g Release 2 using proprietary datatypes and extension functions.

Step 2: Generate and Load RDF from Census Data

1) Load .shp file into Oracle Spatial
2) Convert to RDF using SDO_GEOMETRY.GET_WKT()
3) Bulk load into RDF Store

Step 3: Create spatial index with native SDO_GEOMETRY object type (normalize to common SRS)

exec sem_apis.add_datatype_index( 'http://www.opengis.net/geosparql#wktLiteral', options=>'TOLERANCE=1 SRID=8307 DIMENSIONS=((LONGITUDE,-180,180) (LATITUDE,-90,90))', parallel=>4);

Map Creation

Map Viewer

GeoSPARQL Prototype *

    create table lgd_restaurants as
    select l, g
    from table(sem_match('SELECT ?l, ?g WHERE {…}', …));

1) Create geometry tables from sem_match GeoSPARQL queries
2) Create Map Viewer layers from geometry tables

Thanks to all members of the GeoSPARQL SWG!

Questions?

Q1: Open Street Map restaurants near Hilton Hotel

    SELECT ?x ?l ?g
    WHERE { ?x rdf:type lgd:Restaurant .
            ?x rdfs:label ?l .
            ?x geo:asWKT ?g
            FILTER (geof:distance(?g, "POINT(-122.41 37.7858)"^^geo:wktLiteral, uom:KM) <= 0.5) }
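What the distance filter does for longitude-latitude points can be approximated with the haversine formula on a spherical Earth. This is a sketch under that assumption. A GeoSPARQL implementation may use an ellipsoidal model, and the function names, the radius constant and the sample points in the usage note are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the valid range.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def within_km(points, center, radius_km):
    """Keep names of (name, lon, lat) points within radius_km of center."""
    c_lon, c_lat = center
    return [name for name, lon, lat in points
            if haversine_km(lon, lat, c_lon, c_lat) <= radius_km]
```

A call such as `within_km([("near", -122.411, 37.786), ("far", -122.30, 37.80)], (-122.41, 37.7858), 0.5)` keeps only the first point, mirroring the `<= 0.5` filter in Q1.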

Q2: Open Street Map hotels within 1 KM of Hilton Hotel

    SELECT ?x ?l ?g
    WHERE { ?x rdf:type lgd:TourismHotel .
            ?x rdfs:label ?l .
            ?x geo:asWKT ?g
            FILTER (geof:distance(?g, "POINT(-122.41 37.7858)"^^geo:wktLiteral, uom:KM) <= 1) }

Q3: 20 closest Open Street Map hotels to Hilton Hotel

    SELECT ?x ?l ?g
    WHERE { ?x rdf:type lgd:TourismHotel .
            ?x rdfs:label ?l .
            ?x geo:asWKT ?g
            FILTER (geof:distance(?g, "POINT(-122.41 37.7858)"^^geo:wktLiteral, uom:KM) <= 1) }
    ORDER BY ASC(geof:distance(?g, "POINT(-122.41 37.7858)"^^geo:wktLiteral, uom:KM))
    LIMIT 21

Q4: Open Street Map restaurants inside a query window

    SELECT ?x ?l ?g
    WHERE { ?x rdf:type lgd:Restaurant .
            ?x rdfs:label ?l .
            ?x geo:asWKT ?g
            FILTER (geof:sfWithin(?g, "POLYGON((-122.25 37.46, -123.27 37.46, -123.27 37.60, -122.25 37.60, -122.25 37.46))"^^geo:wktLiteral)) }
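For point data, the window test in Q4 reduces to a point-in-polygon check. Below is a minimal planar sketch using even-odd ray casting. It ignores holes and the exact boundary semantics of sfWithin, and the function name `point_in_ring` is hypothetical.

```python
def point_in_ring(x, y, ring):
    """Even-odd ray casting: True if (x, y) lies inside the closed ring.

    ring is a list of (x, y) vertices; repeating the first vertex at the
    end, as WKT does, is allowed. Points exactly on the boundary are not
    handled specially.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross
        # the ray cast from the point towards increasing x.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# The query window from Q4, in longitude-latitude order.
Q4_WINDOW = [(-122.25, 37.46), (-123.27, 37.46), (-123.27, 37.60),
             (-122.25, 37.60), (-122.25, 37.46)]
```

A spatial index would first narrow the candidates by bounding box, then apply an exact test like this one to the survivors.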

Q5: What US Census areas contain SFO airport

    SELECT ?x ?l ?t
    WHERE { ?x rdf:type ?t .
            ?x dc:title ?l .
            ?x geo:asWKT ?g
            FILTER (geof:sfContains(?g, "POINT(-122.2230 37.3708)"^^geo:wktLiteral)) }

Q6: Pairs of US Census cities and Open Street Map parks

    SELECT ?t ?l
    WHERE { ?sf rdf:type ucs:city .
            ?sf dc:title ?t .
            ?sf geo:asWKT ?sgeo .
            ?x rdf:type lgd:Park .
            ?x rdfs:label ?l .
            ?x geo:asWKT ?g
            FILTER ( geof:sfWithin(?g, "POLYGON((-124.40959 32.534156, -114.13443 32.534156, -114.13443 42.009518, -124.40959 42.009518, -124.40959 32.534156))"^^geo:wktLiteral)
                     && geof:sfContains(?sgeo, ?g) ) }
    LIMIT 100
