Undefined 0 (0) 1 1 IOS Press

GeoSPARQL: Enabling a Geospatial Semantic Web

Robert Battle, Dave Kolas Knowledge Engineering Group, Raytheon BBN Technologies 1300 N 17th Street, Suite 400, Arlington, VA 22209, USA E-mail: {rbattle,dkolas}@bbn.com

Abstract. As the amount of Linked Open Data on the web increases, so does the amount of data with an inherent spatial context. Without spatial reasoning, however, the value of this spatial context is limited. Over the past decade there have been several vocabularies and query languages that attempt to exploit this knowledge and enable spatial reasoning. These attempts provide varying levels of support for fundamental geospatial concepts. In this paper, we look at the overall state of geospatial data in the Semantic Web, with a focus on the upcoming OGC standard GeoSPARQL. GeoSPARQL attempts to unify data access for the geospatial Semantic Web. We first describe the motivation for GeoSPARQL, then the current state of the art in industry and research, followed by an example use case, and the implementation of GeoSPARQL in the Parliament triple store.

Keywords: GeoSPARQL, SPARQL, RDF, geospatial, geospatial data, query language, geospatial query language, geospatial index

1. Introduction In this paper, we discuss an emerging standard, - SPARQL [24] from the Open Geospatial Consortium Geospatial data is increasingly being made avail- (OGC)1. This standard aims to address the issues of able on the Web in the form of datasets described us- geospatial data representation and access. It provides ing the Resource Description Framework (RDF). The a common representation of geospatial data described principles of Linked Open Data, detailed in [5], en- using RDF, and the ability to query and filter on the courage a set of best practices for publishing and con- relationships between geospatial entities. First, we in- necting structured data on the Web. Linked Open Data troduce some geospatial concepts that are critical to promotes the use of the SPARQL Protocol and RDF understanding some of the design choices for Geo- Query Language (SPARQL) and RDF to query and SPARQL. We then describe topological relationships model data. While this is useful for querying for re- that are important to understand when designing a lan- lationships that are explicitly represented in data, im- guage for querying between spatial entities. Next, we plicit relationships, such as geospatial relationships, describe the motivation for GeoSPARQL, the current cannot easily be queried. For instance, datasets may state of the art in modeling and querying geospatial exist that describe monuments and parks, but being data in the Semantic Web, and we introduce Geo- able to link these datasets based on their undeclared SPARQL. A discussion of the Parliament2 triple store, relationships is difficult. The ability to answer a mean- ingful query, like "What parks are within 3km of the Washington Monument?", depends on how the data is 1Dave Kolas is a co-chair for the GeoSPARQL Standards Work- represented, whether all of the resources are related to ing Group, and Robert Battle has worked on the Parliament imple- mentation and provided feedback to the development of the stan- the Washington Monument, and if that relationship is dard. explicit. 2http://parliament.semwebcentral.org

0000-0000/0-1900/$00.00 c 0 – IOS Press and the authors. All rights reserved 2 its GeoSPARQL spatial index, and a use-case that il- system uses a spherical surface to determine locations. lustrates a simple example of data integration with In such a system, a point is defined by angles measured GeoSPARQL follows. from the center of the Earth to a point on the surface. These are also known as latitudes (horizontal) and lon- gitudes (vertical). A Cartesian coordinate system is a 2. Geospatial Concepts flat coordinate system on the surface. It enables quick and accurate measurements over small distances and is Some basic understanding of geospatial concepts is useful for applications such as surveying. required for discussion of GeoSPARQL. The follow- An ellipsoid defines an approximation for the center ing sections define some of the terms used throughout and shape of the Earth. A datum defines the position the rest of this paper. of an ellipsoid relative to the center of the earth. This provides a frame of reference for measuring locations 2.1. Features and Geometries and, for local datums, allows for accurate locations to be defined for the valid area of the datum. WGS84 is a Features and geometries are two fundamental con- datum that is widely used by GPS devices that approxi- cepts of geospatial science. A feature is simply any en- mates the entire world. Geographic coordinate systems tity in the real world with some spatial location. This use an Earth based datum that transforms an ellipsoid could be a park, an airport, a monument, a restaurant, into a representation of the Earth. etc. A feature can have a spatial location that cannot In order to create a map of the Earth, it must be pro- be precisely defined, such as a swamp or a mountain jected from a curved surface onto the plane. This pro- range. A geometry is any geometric shape, such as a jection will distort the surface in some fashion which point, polygon, or line, and is used as a representation will mean that the coordinates for some locations are of a feature’s spatial location. For instance, Reagan more accurate than those in another. Some projections National Airport is a geospatial feature because it is an will preserve area, so the size of all objects is relative, entity that has a specific location in the world. It has a while others preserve angles, and others try to do both. geometry that is a point with coordinates 38.852222, - A coordinate system projected onto a plane enables 3 77.037778 (in the WGS84 datum). Geometries can be faster performance, as Cartesian calculations require measured at varying resolutions, from a simple point fewer resources than Spherical calculations. Computa- in the center of a feature to a complex, precise mea- tions across the plane, however, are inaccurate when surement of a feature’s entire border. Spatial data typi- they deal with large areas as the curvature of the Earth cally separates features from geometries, although that is not taken into account. is not always the case. The combination of these elements defines a CRS. One common source of well defined coordinate refer- 2.2. Coordinate Reference Systems ence systems is the European Petroleum Survey Group (EPSG)4. An important part of the metadata associated with a geometry is its coordinate reference system (CRS) (al- 2.3. Topological Relationships ternatively known as its ). The elements of a coordinate reference system provide con- text for the coordinates that define a geometry in order All spatial entities are inherently related to some to accurately describe their position and establish rela- other spatial entity. Whether two entities intersect tionships between sets of coordinates. There are four somehow or are thousands of miles apart, the relation- parts that make up a CRS: a coordinate system, an el- ship that they share can be described and evaluated. lipsoid, a datum, and a projection. In [28], Randell et al. describe an interval logic for A coordinate system describes a location relative to reasoning about space using a simple ontology that de- some center. A geocentric coordinate system places the fines functions and relations for expressing and rea- center at the center of the Earth and uses standard X, soning over spatial regions. This logic is referred to as Y, Z ordinates. A geographic (or geodetic) coordinate Region Connection Calculus (RCC). A subset of RCC, RCC8, defines eight mutually exhaustive pairwise dis-

3http://en.wikipedia.org/wiki/World_ Geodetic_System 4http://www.epsg-registry.org 3

Table 1 Geospatial reasoning is critical in a large number , Egenhofer and RCC8 relations equivalence of application domains (emergency response, trans- Simple Features Egenhofer RCC8 portation planning, hydrology, land use, etc.). Users in equals equal EQ these domains have long utilized relational databases disjoint disjoint DC with spatial extensions [9]. These spatially extended intersects ¬disjoint ¬DC databases have given the combination of efficient, sta- touches meet EC ble storage and retrieval of data with geospatial calcu- within inside + coveredBy NTPP + TPP lation and indexing. This allows questions like "Which contains contains + covers NTPPi + TPPi students live within 2km of the school they attend?" to overlaps overlap PO be answered efficiently. Within the last decade, RDF storage solutions have joint relations which can be used to imply the rest of become increasingly popular. These knowledge bases, the relations in RCC. These eight base relations are: sometimes called triple stores, are capable of better handling several types of problems at which relational 1. DC(x, y) (x is disconnected from y) databases struggle or are not intended to perform: 2. x = y (x is identical with y) queries with many joins across entities [32], queries 3. PO(x,y) (x partially overlaps y) with variable properties [32], and ontological inference 4. EC(x,y) (x is externally connected with y) on datasets. These features lend themselves towards 5. TPP(x,y) (x is a tangential proper part of y) problems that involve data exploration, linkage across 6. NTPP(x,y) (x is a non-tangential proper part of datasets, and abstraction from low level data. y) Because of RDF stores’ ability to do inference and 7. TPPi(x,y) (y is a tangential proper part of x) easily link data sets, they have been of growing inter- 8. NTPPi(x,y) (y is a non-tangential proper part of est to the geospatial data community. Often geospa- x) tial domains have complicated type hierarchies which cannot be fully expressed in current geospatial infor- The same set of eight geospatial topological rela- mation systems. For instance, a river is both a water- tions is described with different names by Egenhofer in way and a transportation route. Also, geospatial do- [8], and includes the capacity to describe the relation- main problems often require marrying multiple data ship between different dimensioned geometries. This sources together to solve a particular problem. In emer- model was later generalized in the Nine Intersection gency response scenarios, population data, transporta- Model [11]. The Open Geospatial Consortium Sim- tion data, and even realtime police and fire data must ple Feature Access Common Architecture specifica- be combined to deliver a timely result. Combining data tion [25] uses the Nine Intersection Model introduced sources on the web is useful to consumers as well; by Egenhofer to describe spatial relations for use in ge- geospatial data about points of interest combined with ographic access systems. Table 1 illustrates the equiv- hotel information and travel route information could alence between all of these spatial relations. lead to significantly more sophisticated travel planning than currently exists. As such, it was inevitable that those people inter- 3. Motivation for GeoSPARQL ested in expressing geospatial data on the Semantic Web would want to combine the spatial indexing and The Open Geospatial Consortium is a non-profit calculation of spatial databases with the inferential standards organization focused on geospatial data. The power and data linkage of RDF triple stores [16]. This OGC is composed of members of industry, academic has been done by many groups in varying ways for institutions, and government organizations. By stan- varying purposes, as will be discussed later. dardizing GeoSPARQL within the OGC, we seek to To provide geospatial reasoning and querying in leverage the experience of its members to ensure that a triple store, the implementors must define both an geospatial Semantic Web data is represented in a con- ontology for representing spatial objects and query sistent logical way. With the input and acceptance of predicates for retrieving these spatial objects. How- all of both knowledge base vendors and data produc- ever, each organization that has attempted this task ers and consumers, GeoSPARQL has the potential to has approached it in a slightly different way. As a re- unify geospatial RDF data access. sult, spatial RDF data that would be properly indexed 4 and queryable in one implementation would simply be 4. Geospatial RDF: State of the Art and Related treated as plain RDF data in another implementation. Work This is the primary problem which the standard- ization of GeoSPARQL attempts to solve. The Geo- Over the past decade, there have been many dif- SPARQL language defines both a small ontology for ferent attempts to create a geospatial RDF standard. representing features and geometries and a number of Several different organizations, including the W3C, re- SPARQL query predicates and functions. All of these search groups, and triple store vendors have created are derived from other OGC standards so that they are their own ontologies and strategies for representing well grounded and understood. Using the new standard and querying geospatial data. should ensure two things: (1) if a data provider uses In 2003, a W3C Semantic Web Interest Group cre- the spatial ontology in combination with an ontology ated the Basic Geo Vocabulary [31] which provided a way to represent WGS84 points in RDF. This work of their domain, that data can be properly indexed and was extended from 2005 through 2007 by the W3C queried in spatial RDF stores; and (2) compliant RDF Geospatial Incubator Group [21] to follow the GeoRSS triple stores should be able to properly process the ma- [23] feature model to allow for the description of jority of spatial RDF data. points, lines, rectangles, and polygon geometries and Aside from providing the ability to perform spa- their associated features. This group produced the tial queries, the small ontology portion of the Geo- GeoOWL ontology5 which provides a detailed and SPARQL specification is intended to provide an in- flexible model for representing geospatial concepts. terchange format for geospatial data in a wide variety These ontologies were produced as products from their of use cases. The ontology is meant to be attached to respective groups within the W3C as the result of col- other ontologies for various domains, providing only laborations across university and industry partners. the bare spatial aspects. It is intended to be simple There are published datasets that use the W3C vo- enough to cover the most light-weight uses, and scale cabularies [3] and a variety of triple stores that support in complexity for complex use cases. GeoSPARQL data represented by these ontologies [17,22]. However, goes a long way to solving one of the research issues the associated working groups never moved beyond posed by Egenhofer in [10], namely that "we need a the incubator state and the respective ontologies never plausible canonical form in which to pose geospatial became official W3C recommendations. These ontolo- data queries". gies also only work with data in the WGS84 datum. GeoSPARQL is intended to inter-operate with both In order to be valid, data in any other CRS must be quantitative and qualitative spatial reasoning systems. projected which can lead to inaccuracy in the data. A quantitative spatial reasoning system involves con- Support for spatial data in triple stores is mixed. Several vendors support spatial data, but not all ven- crete geometries for features. With these concrete ge- dors support the same representation of data or share ometries present, distances and topological relations the same support for relational queries. Some triple can be explicitly calculated. Qualitative geospatial rea- stores use the aforementioned W3C ontologies, while soning systems allow RCC type topological inferences others have invented their own. for features where the geometries are either unknown Parliament, the high performance [18] triple store or cannot be made concrete [13]. For example, if there from Raytheon BBN Technologies, provided the first are assertions that a monument is inside a park, and geospatial index for semantic web data in 2007 [16, the park is inside a city, a qualitative reasoning sys- 17]. This index supports data in the GeoOWL ontology tem should be able to infer that the monument is in and introduces ontologies for querying spatial data via the city through transitivity. Hybrid versions of these RDF properties that correspond to the RCC spatial re- systems are also possible, where some features have lations and OGC Simple Features relations. However, concrete geometries and others have abstract geome- data in the W3C Basic Geo vocabulary is unsupported. tries. By sharing a set of terms for topological rela- tions, GeoSPARQL allows conclusions from quantita- 5http://www.w3.org/2005/Incubator/geo/ tive applications to be used by qualitative systems and XGR-geo-20071023/W3C_XGR_Geo_files/geo_2007. a single query language for both types of reasoning. owl 5

Ontotext’s OWLIM-SE6 triple store can index point focus on describing data and metadata such as the CRS data represented in the W3C Basic Geo Vocabulary for geometries allows SPARQL-ST to operate with [22]. The only spatial relationship that can be queried data of different systems which is something that is is whether a point is contained within a circle or poly- lacking from many vocabularies such as GeoOWL and gon. The ability to query for relationships between the Basic Geo vocabulary. This increased flexibility higher order geometries such as lines and polygons is with data is hindered, however, by the proposed query not supported. syntax which deviates from the standard SPARQL lan- OpenLink Virtuoso7 also has support for the W3C guage. Any data that is in the SPARQL-ST format can Basic Geo Vocabulary. A SPARQL function is pro- only be accessed by a SPARQL-ST system. vided to convert a pair of latitude and longitude prop- Taking a simpler approach, Zhai et al., in [34] and erty values into a point geometry. A special literal [33], discuss the need for adding topological predicates datatype, virtrdf:Geometry, is also provided for to SPARQL. The OGC Simple Features relations and a indexing point literals. Support for testing intersection subset of geometries are used as the basis for their on- and containment relationships is provided via property tologies. Unfortunately, the relations have to be specif- functions. Through a combination of these relations ically encoded in RDF and the data cannot support and their negations, most of the Egenhofer relations multiple coordinate reference systems. can be tested. However, some relations, such as testing Another approach described by Koubarakis and for overlap, cannot be supported in Virtuoso. Kyzirakos in [19] is based on research from the con- Other triple stores take a completely different ap- straint database community. Constraint databases are proach. As described in [12], Franz AllegroGraph8 de- a promising technology for integrating spatial data fines a custom datatype to represent geospatial data [4]. The authors propose to enrich the Semantic Web and map it to a "strip" of space in the index that con- with spatial and temporal data by extending RDF and tains the data. A modified SPARQL syntax provides a SPARQL with constraints. The proposed extension to new GEO operator where the geospatial aspect of the RDF, stRDF, uses typed literals to describe a semi- query is defined. linear point set. The query language, stSPARQL, ex- OpenSahara9 provides a service for adding exter- tends SPARQL to includes additional operators for nal indexing and geospatial querying capabilities to querying RCC relationships and introduces a new syn- any triple store with a Sesame10 Sail layer. The im- tax for specifying spatial variables. Support for mul- plentation is a wrapper for the PostgreSQL11 database tiple coordinate reference systems is not discussed. with PostGIS spatial extensions12 and as such, Open- Compatibility with existing data is also a problem as Sahara supports all of the geometries and relations de- not all data is represented as a semi-linear point set. fined in the OGC Simple Features Access [25]. Lit- stSPARQL is intentionally not compatible with OGC eral datatypes are introduced for Well-Known Text standards as the authors believe that the semi-linear (WKT), Well Known-Binary (WKB), and their com- point set can be used to describe different geometries pressed forms while the spatial relations that PostGIS without forcing a hierarchy of datatypes on users. In supports are implemented as SPARQL filter functions. [20], stSPARQL and stRDF are described with an ad- In addition to vendor supported options, there has ditional context of GIS applications. The authors con- been significant community and research interest in- trast stRDF and stSPARQL with prior work (such as representing and querying geospatial data in the last GeoOWL) and note that it is hard to find information few years. Perry proposed an extension to SPARQL, about geometries when you do not know what type of SPARQL-ST in [26]. This introduces a modified geometries will be in the answer set and what prop- SPARQL syntax for posing spatial queries to data that erties to ask about. This is a problem with represen- is modeled in an upper ontology based on GeoRSS. A tations like GeoOWL where different geometries have different properties. The semi-linear point set avoids this by encapsulating the representation into a single 6http://www.ontotext.com/owlim/editions 7http://www.openlinksw.com datatype. 8http://www.franz.com/agraph/allegrograph The NeoGeo Vocabulary [29] is a vocabulary that 9http://www.opensahara.com arose from the NeoGeo community13, the 10http://www.openrdf.org 11http://www.postgresql.org 12http://postgis.refractions.net 13http://sites.google.com/site/neogswvocs/ 6 community, and several universities. They recognized 5. Introduction to GeoSPARQL the need for a well formed standard representation for geospatial data and provided one that is based on As illustrated above, many groups have created on- the Geography Markup Language (GML) Simple Fea- tologies and query predicates to make indexing and tures Profile14. To describe spatial relations, an ontol- query of geospatial data possible. However, since there ogy based on RCC8 is also provided. A limitation of are many of these and they all differ slightly, data that the NeoGeo approach is the need to represent each co- can be spatially queried in one knowledge base may ordinate as a resource. In particular, polygons and lines not be able to be spatially queried in another. Geo- are represented with an RDF collection of Basic Geo SPARQL provides a standard for geospatial RDF data points. While this does allow points to be shared across insertion and query, which covers the use cases of the geometries, it significantly increases the verbosity of other previous approaches. the data. Unfortunately, this extra verbosity does not The GeoSPARQL specification attempts to enable a result in particular gains in expressive power, since wide range of geospatial query use cases, from sim- each point in a polygon provides little value in isola- ple points of interest knowledge bases to detailed au- tion and RDF list contents are difficult to query for in thoritative geospatial data sources for transportation. SPARQL 1.0. Moreover this prevents geometry liter- Moreover, use of GeoSPARQL for both of these types als from being compared easily in SPARQL filter func- of data should enable the data sets to be easily used tions. The ability to use content negotiation to retrieve together. In order to achieve this goal, different con- other formats for geometries mitigates these issues for formance classes are provided. This means that a sim- some applications. A nice thing about this approach is ple knowledge base implementation intended for sim- that it does not require any complex literals to parse, ple use cases need not implement all of the more ad- and thus may be attractive to Semantic Web practition- vanced reasoning capabilities of GeoSPARQL, such as ers not familiar with geospatial data formats. quantitative reasoning or query rewriting. In [15], Hu and Du compose a three level hierar- There are also several sets of terminology for the chical spatiotemporal model: a meta level for abstract topological relationships between geometries. Rather space-time knowledge, a schema level for well-known than mandate that all implementations use the same models in spatial and temporal reasoning (e.g., RCC, set of terminology, each implementation can choose Allen time[1]), and an instantiations level that pro- which sets of terms to support. This is discussed fur- vides mappings and formal descriptions of the various ther in the section on GeoSPARQL relationships. ground spatiotemporal statements in the Linked Data GeoSPARQL attempts to address the problems with clouds. This meta approach provides a convenient way the disparate and incompatible implementations for to abstract out spatial knowledge from its underlying representing and querying spatial data. It achieves this representation. Mappings, however, have to be defined by defining an ontology that closely follows the exist- for each dataset at the instantiations level. ing standards work from the OGC with regard to spa- A complete implementation of integrating spatial tial indexing in relational databases. data and queries into an RDF triple store is described The GeoSPARQL specification contains three main by Brodt et al. in [7]. By typing WKT string repre- components: sentations of geometry literals with a spatial datatype, 1. The definition of a vocabulary to represent fea- the triple store is able to efficiently store and query tures, geometries, and their relationships. data. Once again, the OGC Simple Features relations 2. A set of domain-specific, spatial functions for are used as the basis for posing queries for the spatial use in SPARQL queries. relation. These relations are mapped to SPARQL filter 3. A set of query transformation rules. functions which allow for direct comparison between spatial literals. This implementation is similar to the 5.1. GeoSPARQL Ontology approach taken by Parliament and OpenSahara in the way that the SPARQL language itself does not need to The ontology for representing features and geome- be modified in order to query for the spatial relations tries is fundamental to being able to build and query between entities. spatial data. The ontology is based on the OGC’s Sim- ple Features model, with some adaptations for RDF. 14http://www.ogcnetwork.net/gml-sf Note that prefix definitions are omitted from exam- 7 ples for clarity; a definition of all of the prefixes used Features and geo:Geometrys, if GeoSPARQL’s is at the end of the paper in listing 15. The ontol- query rewrite rules are supported (discussed in the ogy includes a class geo:SpatialObject, with next section). The properties can be expressed using two primary subclasses, geo:Feature and geo: three distinct vocabularies: the OGC’s Simple Fea- Geometry. These classes are meant to be connected tures, Egenhofer’s 9-intersection model, and RCC8. to an ontology representing a domain of interest. Fea- Which of these vocabularies is supported can be de- tures can connect to their geometries via the geo: pendent on the triple store implementation, though it is hasGeometry property. likely that implementations will support all three. The For example, an airport is a geo:Feature. It is Simple Features topological relations include equals, a conceptual thing that exists in the real world in a disjoint, intersects, touches, within, contains, overlaps, particular place. In the real world, it has a location and crosses. that corresponds to the coordinates of all of the infi- The filter functions provide two different types nite points along the border of the airport area. This of functionality. First, there are operator functions real-world location has to be measured and estimated which take multiple geometries as predicates and pro- in some way. It is possible to do this measurement at duce either a new geometry or another datatype as various resolutions, each of which may serve well for a result. An example of this is the function ogcf: different purposes. A representation of the real world intersection. This function takes two geometries location which has been measured becomes a geo: and returns a geometry that is their spatial intersec- Geometry. Thus the airport may have several geo: tion. Other functions like ogcf:distance produce Geometrys, ranging from a single point in the cen- an xsd:double as a result. The second type of func- ter of the airport to an extremely detailed polygon that tionality is boolean topological tests of geometries. closely follows the airport’s outside border. A geome- These come in the same three vocabulary sets as the try that will function for most purposes within a dataset topological binary properties: simple features topo- can be specified as the geo:defaultGeometry. logical relations, Egenhofer relations, and RCC8 re- GeoSPARQL includes two different ways to repre- lations. These functions are partially redundant with sent geometry literals and their associated type hier- the topological binary properties; however, the topol- archies: WKT and GML. An implementor of a spa- ogy functions take the geometry literals as parameters, tial triple store may choose to support either or both while the binary properties relate geo:Geometry of these representations. GeoSPARQL provides differ- and geo:Feature entities. This means that quanti- ent OWL classes for the geometry hierarchies associ- tative and qualitative applications can both make use ated with both of these geometry representations. This of the binary properties, but only quantitative applica- provides classes for many different geometry types tions can make use of the topology functions. Also, such as point, polygon, curve, arc, and multicurve. comparisons to concrete geometries provided in the The geo:asWKT and geo:asGML properties link query can only be made via the functions. An example the geometry entities to the geometry literal represen- of the topological functions is ogcf:intersects, tations. Values for these properties use the geo-sf: which returns true if two geometries intersect. WKTLiteral and geo-gml:GMLLiteral data types respectively. 5.3. Query Transformation Rules

5.2. GeoSPARQL Relationships The query rewrite rules allow for an additional layer of abstraction in SPARQL queries. While only con- GeoSPARQL also includes a standard way to ask for crete geometry entities can be quantitatively com- topological relationships, such as overlaps, between pared, it nonetheless sometimes makes sense to discuss spatial entities. These come in the form of binary prop- whether two features have a particular topological re- erties between the entities and geospatial filter func- lationship. This is represented in the natural language tions. question, "Is Reagan National Airport within Washing- The topological binary properties can be used in ton, DC?". Although Reagan National is referred to as SPARQL query triple patterns like a normal prop- a Washington DC airport, it is actually across the Po- erty. Primarily they are used between objects of the tomac river in Virginia. In GeoSPARQL, feature to fea- geo:Geometry type. However, they can also be ture and feature to geometry topological relations are used between geo:Features, or between geo: achieved by the combination of the use of the geo: 8 defaultGeometry property and the query rewrite rules. If a geo:Feature is used as the subject or ob- Listing 2: Example Ontology ject of a topological relation, the query is automatically rewritten to compare the geo:Geometry linked as a ex:Restaurant a owl:Class; default, thus removing the abstraction for processing. rdfs:subClassOf ex:Service . Listing 1 shows a query with a relationship between ex:Park a owl:Class; geo:Feature objects before and after rewrite. rdfs:subClassOf ex:Attraction . ex:Museum a owl:Class; rdfs:subClassOf ex:Attraction . Listing 1: Query Rewrite Example ex:Monument a owl:Class; #Before rdfs:subClassOf ex:Attraction . ASK { ex:Service a owl:Class; ex:DCA a geo:Feature; rdfs:subClassOf ex: geo:within ex:WashingtonDC . PointOfInterest . ex:WashingtonDC a geo:Feature . ex:Attraction a owl:Class; } rdfs:subClassOf ex: PointOfInterest . #After ex:PointOfInterest a owl:Class; ASK { rdfs:subClassOf geo:Feature . ex:DCA a geo:Feature; geo:defaultGeometry ?g1 . ex:WashingtonDC a geo:Feature ; geo:defaultGeometry ?g2 . Listing 3: Washington Monument ?g1 geo:within ?g2 . } ex:WashingtonMonument a ex:Monument; rdfs:label "Washington Monument"; The goal of this feature is to provide a more intu- geo:hasGeometry ex:WMPoint . itive approach to geospatial querying for use cases that ex:WMPoint a geo:Point; do not require many different geometries, while still geo:asWKT "POINT(-77.03524 maintaining a concrete definition of this intuitive un- 38.889468)"^^geo-sf:WKTLiteral derstanding. Compliant GeoSPARQL triple stores are . not required to implement this feature.

longitude latitude order (CRS:84), a record for the 6. Using GeoSPARQL Washington Monument would look like listing 3. If a coordinate reference system other than CRS:84 is Consider an example using a points of interest ontol- desired, that can be included within the WKTLiteral. ogy in listing 2. We seek to represent points of interest Listing 4 expresses the same point in WGS84 with lat- of various types (Monuments, Parks, Restaurants, Mu- itude longitude order. seums, etc.). These types of landmarks are represented in a class hierarchy with a ex:PointOfInterest Listing 4: Point in WGS84 datum class at the top. These classes of course may include many non-spatial attributes, but only a label is included " POINT(38.889468 GeoSPARQL, and thus give its classes a geospatial ref- -77.03524)^^"geo-sf:WKTLiteral erence, is to make ex:PointOfInterest a sub- class of geo:Feature. This may of course result in While this representation may seem verbose, and the the class having two parent classes, but this is compat- literal string is no longer standard WKT, it has the ad- ible with RDFS and OWL reasoning. vantage of encoding the CRS information directly into If compliance with WKT is chosen, and latitudes the literal. This is all of the data needed to define a and longitudes are expressed in WGS84 datum in a geo:Geometry; without the CRS, another property 9

Listing 5: Example Query 1 Listing 6: Example Query 2 SELECT ?m ?p SELECT ?m ?p WHERE { WHERE { ?m a ex:Monument ; ?p a ex:Park . geo:hasGeometry ?mgeo . ?m a ex:Monument ; ?p a ex:Park ; geo:within ?p . geo:hasGeometry ?pgeo . } ?mgeo geo:within ?pgeo . }

Listing 7: Example Query 3 would need to be added onto the geo:Geometry in- stance, and that property would need to be read and SELECT ?a passed into filter functions as well. While an argument WHERE { ?a a ex:Attraction; can be made that it produces a "cleaner" model if the geo:hasGeometry ?ageo . CRS was associated with a geometry via a separate FILTER(geof:within(?ageo, predicate, GeoSPARQL focused on producing a com- "POLYGON(( pact representations that could contain the entire de- -77.089005 38.913574, scription of a geometry within a single literal. For the -77.029953 38.913574, case of WKT, this necessitates adding the CRS specifi- -77.029953 38.886321, cation to the WKT literal string. For the GML confor- -77.089005 38.886321, mance class, the CRS information is already encoded -77.089005 38.913574 in the GML string so no changes are required. ))"^^geo-sf:WKTLiteral)) Now we will look at a few example GeoSPARQL } queries using this data. One potential query over this dataset would be, "Which monuments are contained within a park?" This query requires a topological com- This query is shown in listing 7. Note that the bound- parison between the geometries of the monuments and ing box is expressed as a polygon. the geometries of parks. We show it in listing 5 using Queries looking for entities within a particular dis- the binary topology property geo:within. The two tance of either other entities or a current location are geo:hasGeometry entities have type statements, extremely useful as well. "Which parks are within properties to tie them to their geometries, and then the 3km of the Washington Monument?" can be easily ex- geo:within function to tie them together. pressed in GeoSPARQL. We assume the same URI for If the knowledge base being used supported the the Washington Monument in the data example above. query rewriting rules, and the data set included de- We need to retrieve the two geometries and use the fault geometries, the first query could be rewritten even function geof:distance to calculate the distance more simply using a feature-to-feature topological re- between them. A standard SPARQL less than func- lationship. This method is demonstrated in listing 6. tion is then applied. This query is shown in listing 8. Spatial user interfaces often need to look for entities These example queries require relatively little in terms of a particular type that fall within an explicit bounding of non-spatial constraints, but serve to illustrate some box. Consider the query, "What attractions are within basic functionality with GeoSPARQL. With a more the bounding box defined by (-77.089005, 38.913574) complex ontology, queries could include more compli- and (-77.029953, 38.886321)?" Because we need to cated thematic elements as well. specify an explicit geometry in the query, we need to compare to it using the topological filter functions as 6.1. Exploiting Geospatial Data opposed to the binary properties. We have the attrac- tion entity and its attached geometry, and the geometry Storing geospatial data just as RDF triples does not is compared with the filter function geof:within. allow for the spatial exploitation of that data. In order 10

queried [16,17]. We are currently implementing Geo- Listing 8: Example Query 4 SPARQL based on the public candidate draft standard, and we describe this implementation in the following SELECT ?p section. WHERE { ?p a ex:Park ; 6.2.1. Index Specification geo:hasGeometry ?pgeo . The index interfaces in the Parliament API include ?pgeo geo:asWKT ?pwkt . methods for building a record from data, adding and removing records, finding a record by URI, and finding ex:WashingtonMonument records by value. Existing indexes include the afore- geo:hasGeometry ?wgeo . mentioned GeoOWL spatial index, a temporal index ?wgeo geo:asWKT ?wwkt . (for indexing OWL Time15), and a basic numeric in- dex (for optimizing range queries on numeric property FILTER(geof:distance(?pwkt, ?wwkt, values). units:m) < 3000) The Parliament triple store is built with support for } Jena’s RDF Application Programming Interface (API) and ARQ SPARQL query engine16. An implementa- tion of Jena’s graph interface17 provides access to the base graph store with support for adding, removing, to be able to efficiently query for the relationships be- 18 tween spatial entities, the data must be indexed. This and finding triples. By attaching a listener to the allows only those resources that match the spatial com- graph, the addition and removal of triples can be de- ponent of a query to be retrieved, rather than spatially tected and forwarded to any associated indexes. Parlia- filtering all bindings that match a given query. The rel- ment’s GeoSPARQL index listens for triples that con- geo:asWKT geo:asGML ative performance advantages of using a spatial index tain the or predicates. Any versus spatially filtering a result set is discussed by triple that is added to the graph is checked to see if it Brodt et al. in [7]. matches. At this point, the index can create a record for the geometry that is represented in the object of the 6.2. Parliament and GeoSPARQL triple and insert it into the index. For instance, when adding the triples in listing 3 to Parliament, the index In order to take advantage of data represented in will generate a single record that contains a reference ex:WMPoint GeoSPARQL, a SPARQL endpoint needs to under- to the resource and its WKT value. stand the GeoSPARQL ontology and provide support In order to query for data in an index, we have for one (or more) of the conformance classes. The Par- extended the ARQ query engine to support property liament triple store provides one such implementation functions that can access indexes. When ARQ parses (based on a draft version of the specification) that al- a SPARQL query, it detects the different operators in lows GeoSPARQL data to be indexed and provides a the query. A particularly useful feature of ARQ is the query engine that supports GeoSPARQL queries. support for property functions. Instead of matching a Parliament has a modular architecture that enables triple in a graph, property functions can execute cus- indexes to be built on top of the storage engine. One of tom code in the context of the SPARQL query. Con- these indexes is a spatial index based on a standard R- sider the query in listing 9. It will yield a result set ?x tree implementation [14]. In a similar approach to [7], with a single binding for by concatenating all of the the spatial index is integrated in a way that provides na- arguments together to form the literal "Hello World" tive support for spatial relational queries and efficient instead of attempting to match the triple pattern. Par- storage of the data. The general goal for this index is to liament implements the GeoSPARQL spatial relations, geo:intersects split SPARQL queries with geospatial information into such as , as property functions. multiple parts, allowing for an optimized query plan between the spatial components of the query and the 15http://www.w3.org/TR/owl-time/ 16 components with non-spatial triples to be executed. http://www.openjena.org 17http://openjena.org/javadoc/com/hp/hpl/ Before the emergence of GeoSPARQL, Parliament jena/graph/Graph.html indexed data represented in GeoOWL and allowed 18http://openjena.org/javadoc/com/hp/hpl/ RCC8 and OGC Simple Features relations to be jena/graph/GraphListener.html 11

Listing 9: Property Function Query Listing 11: Example Query 5 SELECT ?x SELECT ?x WHERE { WHERE { ?x apf:concat( "Hello", " ", " GRAPH { World") . } geo:hasGeometry ?geo1 . ?geo1 geo:asWKT ?wkt1 . ?x geo:hasGeometry ?geo2 . ?geo2 geo:asWKT ?wkt2 .

Listing 10: Optimization Example Query BIND (geof:distance(?wkt1, ?wkt2, SELECT ?m units:m) as ?distance) . WHERE { FILTER (?distance < 10000) ?m a ex:Monument ; } geo:hasGeometry ?mgeo . } ?mgeo geo:within ex: NationalMallGeometry . } example, the index will look up the bounding box for ex:NationalMallGeometry and calculate how many items it contains. For basic graph patterns, Par- 6.3. Optimizing Query Execution liament keeps statistics on the triples it contains and can quickly estimate how many matches are in the By using an index, a query can be optimized such store for a given triple pattern. Sub patterns containing that the most selective part is executed first. Con- basic graph patterns use these statistics to estimate how sider the query in listing 10. This query is asking for many triples will be bound by the pattern. After each all monuments within the National Mall. Parliament’s sub pattern has an estimate, they are ordered in ascend- query optimizer splits the query into blocks for execu- ing order executed accordingly. In this case, if there tion based on how the variables in the query are used were 500 monuments with geometries, but only 100 and what special operations occur in the query. In this geometries within the bounding box, the index prop- instance, since the predicate, geo:within, is an in- erty function would be executed first. If, however, there dex property function, the triple containing the predi- were 500 geometries within the bounding box, but only cate is considered as one query block. The rest of the 10 monuments exist in the triple store, the basic graph query is a simple basic graph pattern containing two pattern would execute first and the spatial relationship triples describing ?m. The basic graph pattern is ana- would be tested for each of the 10 results. lyzed and partitioned so that no variable crosses par- In addition to optimizing pattern order, optimiz- titions. For this query, this generates two partitions. ing spatial operations can provide significant per- When the query is executed, the Parliament query op- formance benefits. Consider the query in listing 11. timizer has two choices: (1) it can execute the spatial This query is deceptive in that it appears to be ask- part first and then match to the graph pattern, or (2) ing a simple geospatial question: "What are all the it can execute the graph pattern first, then execute the geometries within 10km of the feature described by spatial operation. The optimizer will decide what to ?". do based on which path estimates it will provide the However, there is no way for Parliament’s query op- fewest result bindings. timizer to know a priori what the distances between In this example, there are two sub patterns: the in- geometries are. While some implementations of Geo- dex property function, and the basic graph pattern for SPARQL may be optimized to handle cases like this, in ?m. Each sub pattern estimates how many results that Parliament this query would not use the spatial index it will be able to provide. For operations within the at all; the geometries for all parks would be found be- spatial index, a bounding box query can be performed fore checking their distance. Table 2 shows that there to estimate how many results will be returned. In this are nine features matching this query. Listing 12 shows 12

of structured data available with some sort of geospa- Listing 12: Example Query 5 - Optimized tial context attached. However, the vast majority of this geospatial context cannot be utilized for spatial queries SELECT ?x because the hosting SPARQL endpoints cannot per- WHERE { form them. In the following section, we discuss four GRAPH { datasets that are representative of the disparate types of data that can be integrated together via there spatial geo:hasGeometry ?geo1 . ?geo1 geo:asWKT ?wkt1 . context. We then demonstrate this integration on two BIND (geof:buffer(?wkt1, 10000, datasets using Parliament. units:m) as ?buff) . 6.4.1. Geospatial Data Sets ?x geo:hasGeometry ?geo2 . GeoNames19 provides information for over eight [ a geo:Geometry ; million geospatial features. The data is exposed via an geo:asWKT ?buff ] geo:sf- RDF webservice20 that exposes information on a per contains ?geo2 . resource basis. The geospatial aspect is represented us- } ing the W3C Basic Geo vocabulary. There is no abil- } ity, however, to perform any type of SPARQL query to retrieve data. Another significant source of geospatial data is DB- Table 2 pedia21. As described in [6], data from Wikipedia22 Example Query 5 Results is extracted into RDF. Many of these extracted enti- x ties are geospatial in nature (cities, counties, countries, http://sws.geonames.org/4199542/ landmarks, etc. . . ) and many of these entities already http://sws.geonames.org/7242246/ contain some geospatial location information. The data http://sws.geonames.org/4183291/ also contains owl:sameAs links between the DBpe- http://sws.geonames.org/4201877/ dia and GeoNames resources. However, without any http://sws.geonames.org/4224307/ spatial computation predicates, this geospatial infor- http://sws.geonames.org/4192596/ mation can only be retrieved "as is." A query like http://sws.geonames.org/4212826/ "Show all cities within 50 miles of Arlington, VA with http://sws.geonames.org/4192776/ a population of at least 100,000 people in which at http://sws.geonames.org/4194405/ least one famous person was born" is not possible, even a more version of this query that is Parliament can op- though the data exists to support it. timize. This version buffers the geometry for the for The linked data community has released the Linked- by GeoData data set [2]. This data set is a spatial knowl- 23 10000 meters and then uses that buffer to test the edge base, derived from Open Street Map and is geo:sf-contains relationship. Parliament can linked to DBpedia and GeoNames resources. It con- take this query and run the spatial component against tains over 200 million triples describing the nodes and the spatial index in order to reduce the amount of re- paths from OpenStreetMap. The data set is accessible sults that need to match the rest of the query. Run- via SPARQL endpoints running on the OpenLink Vir- ning the un-optimized query on our Parliament Geo- tuoso platform. A REST interface to LinkedGeoData SPARQL endpoint (described in the next section) takes is also provided. 6.967 seconds. The optimized version, however, runs Another source of geospatial data is the United in 0.047 seconds. A future version of the Parliament States Geological Survey (USGS). The USGS has re- query optimizer will be able to optimize queries like listing 11 automatically. 19http://www.geonames.org 20 6.4. GeoSPARQL and Linked Data http://www.geonames.org/ontology/ documentation.html 21http://www.dbpedia.org As the Semantic Web materializes on the internet in 22http://www.wikipedia.org the form of Linked Data, there is an increasing amount 23http://www.openstreetmap.org 13 leased a SPARQL endpoint24 for triple data derived from The National Map[30], a collaborative effort to Listing 13: GeoNames Conversion Query deliver usable topographic information for the United States. This dataset is much more specific and special- CONSTRUCT { ized than the data that is provided by DBpedia and ?feature a geo:Feature ; GeoNames. It includes geographic names, hydrogra- geo:hasGeometry [ a geo:Point ; phy, boundaries, transportation, structures, and land geo:asWKT ?wkt cover. The group has attempted to follow the forthcom- ]. ing GeoSPARQL specification, though some aspects } of GeoSPARQL have changed slightly since the data WHERE { has been published. Due to a lack of available Geo- ?feature a gn:Feature ; SPARQL triple stores, the published dataset includes wgs84_pos:lat ?lat ; a pre-computation of all of the topological relation- wgs84_pos:long ?long . ships between entities. A query such as "Show all rail BIND (spatial:toWKTPoint(?lat, ? lines that cross rivers" is in fact possible to answer by long) as ?wkt) . looking at the current precomputed data. However, this } means that if a new entity is added, the knowledge base needs to compute everything that the entity is related to and update those entities as well. If this data was not precomputed, the only way to answer the query would As so much data is represented as individual lat- itude and longitude data using the W3C Basic Geo be via indexing the data and querying the relations with vocabulary, including that provided by GeoNames, it a relationship predicate. is desirable to be able to convert it easily into Geo- 6.4.2. Integration in Parliament via GeoSPARQL SPARQL. It is trivial to provide functions that take GeoSPARQL provides the means to link geospa- a latitude and longitude pair and convert them into tial datasets together, resulting in the possibility of a GeoSPARQL point. Parliament provides SPARQL new, meaningful entailments. As it is simply not pos- property functions, spatial:toWKTPoint and sible to pre-compute all of the relations between all of spatial:toGMLPoint which take a latitude, lon- the available geospatial datasets, enriching the existing gitude, and optional spatial reference system identi- datasets with GeoSPARQL representations, and creat- fier as arguments and return a geo:WKTLiteral or ing indexes for the data is one way that data providers geo:GMLLiteral representation of a point. This can share data while providing access to geospatial se- makes it possible to load and index existing spatial data mantics. sets without having to regenerate existing RDF. Af- For the following examples, we have processed a ter loading the GeoNames data into Parliament, it was subset of the GeoNames RDF dataset25 and the USGS aligned with GeoSPARQL using the query in listing GeoSPARQL data for Atlanta, GA. This data is acces- 13. This query explicitly assigns GeoNames features to sible via a Parliament SPARQL endpoint with a Geo- be GeoSPARQL features, creates a new geo:Point SPARQL spatial index26. The queries in listings 11, resource containing the point information, and links 12, and 14 can be executed against this data. The in- the feature to the geometry. dex supports indexing data conforming to WKT and The USGS provides their RDF data in a format GML serialization and GeoSPARQL queries including that is similar to GeoSPARQL. The geospatial data those with Simple Features, Egenhofer, and RCC8 re- from The National Map contains polygon, polyline, lations, the non-topological query functions and com- and point data. The data conforms to an earlier revision mon topological query functions. Query rewriting is of the GeoSPARQL standard and contains features not supported at this time. and geometries, but lacks typed literals for the geo: asWKT and geo:asGML property values. For this data, the constructor functions for the GeoSPARQL lit- 24 http://usgs-ybother.srv.mst.edu:8890/ eral datatypes can be used to create correctly typed val- 25http://download.geonames.org/ ues at query time. The existing spatial relations state- all-geonames-rdf.zip ments in the dataset were ignored when processing the 26http://geosparql.bbn.com data. 14

Table 3 Atlanta Schools Near Rail Lines Results school distance http://sws.geonames.org/4183400/ 44.09248276546987 Listing 14: Atlanta Schools Near Rail Lines Query http://sws.geonames.org/7242795/ 80.33873752510539 SELECT DISTINCT ?school ?distance http://sws.geonames.org/7242287/ 91.9176767544314 WHERE { GRAPH { 6.4.3. GeoSPARQL Data Query # approximate rectangle of Atlanta Once GeoSPARQL data exists for both GeoNames BIND ("POLYGON((-84.445 33.7991, and the USGS and is loaded into a GeoSPARQL -84.445 33.7069,-84.331 capable knowledge base, such as Parliament, inter- 33.7069,-84.331 esting geospatial questions can be posed. Consider 33.7991,-84.445 33.7991))"^^ geo-sf:WKTLiteral AS ?place) . the following query: "What are all the schools near Atlanta, GA that are within 100 meters of a rail- ?school a gn:Feature ; way?" GeoNames provides point data for buildings, in- geo:hasGeometry ?school_geo ; cluding schools, while the USGS data contains poly- gn:featureCode gn:S.SCH . line data for rail lines as well as polygons for dif- ferent regions. Listing 14 shows the GeoSPARQL ?school_geo geo:sf-within [ a geo: formulation for this question. This query takes ad- Geometry; geo:asWKT ?place ] ; vantage of several features of the language. First, geo:asWKT ?school_wkt . the geo:sf-within predicate is used to determine what schools exist within a boundary for Atlanta (as # buffer schools 100m defined by a polygon in the query). Each school geom- BIND (geof:buffer(?school_wkt, etry is then buffered by 100 meters using the geof: 100, units:m) AS ?s_buff) . buffer function. The rail features are retrieved and checked to see if they fall within this buffer. Finally, for # find rail links within buffer all rail line segments that intersect the buffer, the actual ?rail a trans:railFeature ; distance to the school is calculated using the geof: geo:hasGeometry ?rail_geo . distance function. Table 3 displays the school re- sources and the distance to the nearest rail line seg- # only get railroads within Atlanta ment. This query, however, could be made more effec- ?rail_geo geo:sf-within [ a geo: tive if there was an equivalent to the ST_DWithin from Geometry; geo:asWKT ?place ] ; OGC Simple Features Specification [25]. geo:asWKT ?rail_wkt_s . While the sample query assumes everything is con- tained in a single graph, it is not unusual for the data to # convert string to WKT literal exist in several different graphs or endpoints. In fact, it BIND (geo-sf:WKTLiteral(? would be ideal to not have to replicate data and to be rail_wkt_s) AS ?rail_wkt) . able to query it remotely through the dataset provider. FILTER (geof:sf-intersects(? Until the means for query federation, such as the mech- rail_wkt,?s_buff)) . anism discussed in [27], are widely supported, query- BIND (geof:distance(?school_wkt,? ing across remote linked datasets will be difficult. Geo- rail_wkt,units:m) AS ?distance SPARQL queries should be compatible with query fed- ). eration, though there will likely be performance im- } plications. In lieu of query federation, querying across } graphs is possible. The sample data for the above ex- ORDER BY ASC(?distance) amples is actually contained in two different graphs. A union graph that virtually combines both graphs, how- ever, enables a shorter and simpler query. 15

7. Conclusion geo-sf: GeoSPARQL is the genesis of a significant amount geo-gml: interchange within the Semantic Web can take place os: This, by extension, should lead to users’ ability to fi- ose: text available in RDF datasets. osg: GeoSPARQL, support similar functionality or a subset owl: tial applications are. Hopefully when the GeoSPARQL rdfs: unify their implementations, allowing users to consis- spatial: store Parliament to support GeoSPARQL, and we hope time: that others will find it useful both for working on trans: ing the specification. units: wgs84_pos: xsd: ter of Excellence for Geospatial Information Science virtrdf: SPARQL even before the specification has been final- ized. This has allowed us to identify and resolve addi- tional issues with the specification earlier. We would also like to thank the OGC and the other References members of the GeoSPARQL working group for their continued efforts to bring this marriage of the geospa- [1] J. F. Allen. Maintaining knowledge about temporal intervals. tial community and the Semantic Web to life. Communications of the ACM, 26:832–843, November 1983. ISSN 0001-0782. [2] S. Auer, J. Lehmann, and S. Hellmann. LinkedGeoData: Listing 15: RDF Prefixes Adding a Spatial Dimension to the Web of Data. In A. Bern- stein, D. Karger, T. Heath, L. Feigenbaum, D. Maynard, apf: International Semantic Web Conference (ISWC-2009), volume ex: Springer Berlin / Heidelberg, 2009. ISBN 978-3-642-04929-3. gn:

[5] C. Bizer, T. Heath, and T. Berners-Lee. Linked Data – The [19] M. Koubarakis and K. Kyzirakos. Modeling and Querying Story So Far. International Journal on Semantic Web and Metadata in the Semantic Sensor Web: The Model stRDF and Information Systems, Special Issue on Linked Data, 5:1–22, the Query Language stSPARQL. In L. Aroyo, G. Antoniou, 2009. ISSN 1552-6283. E. Hyvönen, A. ten Teije, H. Stuckenschmidt, L. Cabral, and [6] C. Bizer, J. Lehmann, G. Kobilarov, S. Auer, C. Becker, R. Cy- T. Tudorache, editors, Proceedings of the Extended Semantic ganiak, and S. Hellmann. DBpedia – A Crystallization Point Web Conference 2010, volume 6088 of Lecture Notes in Com- for the Web of Data. Journal of Web Semantics: Science, Ser- puter Science, pages 425–439. Springer Berlin / Heidelberg, vices and Agents on the World Wide Web, 7:154–165, 2009. 2010. ISBN 978-3-642-13485-2. [7] A. Brodt, D. Nicklas, and B. Mitschang. Deep integration of [20] K. Kyzirakos, Z. Kaoudi, and M. Koubarakis. Data models and spatial query processing into native RDF triple stores. In Pro- languages for registries in SemsorGrid4Env, 2009. Deliverable ceedings of the 18th SIGSPATIAL International Conference on D3.1, SemSorGrid4Env. Advances in Geographic Information Systems, GIS ’10, pages [21] J. Lieberman, R. Singh, and C. Goad. W3C Geospatial Ontolo- 33–42, New York, NY, USA, 2010. ACM. ISBN 978-1-4503- gies, October 2007. URL http://www.w3.org/2005/ 0428-3. Incubator/geo/XGR-geo-ont/. [Online; accessed 25- [8] M. J. Egenhofer. A formal definition of binary topological re- May-2011]. lationships. In 3rd International Conference, FODO 1989 on [22] Ontotext AD. Geo-spatial indexing in OWLIM, 2011. URL Foundations of Data Organization and Algorithms, pages 457– http://www.ontotext.com/owlim/geo-spatial. 472, New York, NY, USA, 1989. Springer-Verlag New York, [Online; accessed 25-May-2011]. Inc. ISBN 0-387-51295-0. [23] Open Geospatial Consortium. An Introduction to GeoRSS: A [9] M. J. Egenhofer. Spatial SQL: A query and presentation lan- Standards Based Approach for Geo-enabling RSS feeds. White guage. IEEE Transactions on Knowledge and Data Engineer- paper, 2006. URL http://www.opengeospatial. ing, 6:86–95, 1994. org/pt/06-050r3. Document 06-050r3. [10] M. J. Egenhofer. Toward the semantic geospatial web. In [24] Open Geospatial Consortium. OGC GeoSPARQL - A Geo- Proceedings of the 10th ACM international symposium on Ad- graphic Query Language for RDF Data. Open Geospatial Con- vances in geographic information systems, GIS ’02, pages 1–4, sortium, 2011. URL http://www.opengeospatial. New York, NY, USA, 2002. ACM. ISBN 1-58113-591-2. org/standards/requests/80. Document 11-052r3. [11] M. J. Egenhofer and J. R. Herring. Categorizing binary topo- [25] Open Geospatial Consortium. OpenGIS R Implementation logical relations between regions, lines, and points in geo- Standard for Geographic information - Simple feature access graphic databases. Technical report, Department of Surveying - Part 1: Common architecture. Open Geospatial Consortium, Engineering, University of Maine, 1990. 2011. URL http://www.opengeospatial.org/pt/ [12] Franz, Inc. Geospatial support in SPARQL queries, April 06-103r4. Document 06-103r4. 2011. URL http://www.textriples.com/agraph/ [26] M. Perry. A Framework to Support Spatial, Temporal and The- support/documentation/4.2/sparql-geo.html. matic Analytics over Semantic Web Data. PhD thesis, Wright [Online; accessed 25-May-2011]. State University, 2008. [13] R. Grütter and B. Bauer-Messmer. Combining owl [27] E. Prud’hommeaux. SPARQL 1.1 federation ex- with rcc for spatioterminological reasoning on envi- tensions. W3C working draft, W3C, June 2010. ronmental data. In OWLED 2007 Workshop on OWL: http://www.w3.org/TR/sparql11-federated-query/. Experiences and Directions, volume 258, 2007. URL [28] D. A. Randell, Z. Cui, and A. G. Cohn. A spatial logic based http://sunsite.informatik.rwth-aachen.de/ on regions and connection. In Proceedings of the 3rd Interna- Publications/CEUR-WS/Vol-258/paper17.pdf. tional Conference on Knowledge Representation and Reason- [14] A. Guttman. R-trees: A dynamic index structure for spatial ing, 1992. searching. In International Conference on Management of [29] J. M. Salas, A. Harth, B. Norton, L. M. Vilches, A. D. León, Data, pages 47–57. ACM, 1984. J. Goodwin, C. Stadler, S. Anand, and D. Harries. Neo- [15] H. Hu and X. Du. Linking Open Spatiotemporal Data in the Geo Vocabulary: Defining a shared RDF representation for Data Clouds. In J. Yu, S. Greco, P. Lingras, G. Wang, and GeoData. Public draft, NeoGeo, May 2011. URL http: A. Skowron, editors, Rough Set and Knowledge Technology, //geovocab.org/doc/neogeo.html. volume 6401 of Lecture Notes in Computer Science, pages [30] D. Varanka. National topographic modeling, ontology-driven 304–309. Springer Berlin / Heidelberg, 2010. geographic information in the context of the national map. In [16] D. Kolas. Supporting spatial semantics with SPARQL. In First International Workshop on Information Semantics and Proceedings of the Terra Cognita Workshop, collocated with its Implications for Geographical Analysis (ISGA ’08) at GI- the 7th International Semantic Web Conference (ISWC-2008), Science 2008, the 5th International Conference on Geographic 2008. Information Science, September 2008. [17] D. Kolas and T. Self. Spatially augmented knowledgebase. In [31] W3C. W3C Semantic Web Interest Group: Basic Geo (WGS84 Proceedings of the 6th International Semantic Web Conference lat/long) Vocabulary, 2006. URL http://www.w3.org/ (ISWC-2007), volume 4825 of Lecture Notes in Computer Sci- 2003/01/geo/. [Online; accessed 25-May-2011]. ence. Springer, 2007. [32] C. Weiss, P. Karras, and A. Bernstein. Hexastore: sextuple in- [18] D. Kolas, I. Emmons, and M. Dean. Efficient linked-list RDF dexing for semantic web data management. Proc. VLDB En- indexing in Parliament. In 5th International Workshop on Scal- dow., 1:1008–1019, August 2008. ISSN 2150-8097. able Semantic Web Knowledge Base Systems (SSWS 2009), [33] Z. Xiao, L. Huang, and X. Zhai. Spatial information semantic 2009. query based on sparql. In Proceedings of SPIE, volume 7492, October 2009. 17

[34] X. Zhai, L. Huang, and Z. Xiao. Geo-spatial query based on Geoinformatics, pages 1–4, June 2010. extended sparql. In 2010 18th International Conference on