Undefined 0 (0) 1 1 IOS Press

GeoSPARQL: Enabling a Geospatial Semantic Web

Robert Battle, Dave Kolas
Knowledge Engineering Group, Raytheon BBN Technologies
1300 N 17th Street, Suite 400, Arlington, VA 22209, USA
E-mail: {rbattle,dkolas}@bbn.com

Abstract. As the amount of Linked Open Data on the web increases, so does the amount of data with an inherent spatial context. Without spatial reasoning, however, the value of this spatial context is limited. Over the past decade there have been several vocabularies and query languages that attempt to exploit this knowledge and enable spatial reasoning. These attempts provide varying levels of support for fundamental geospatial concepts. In this paper we look at the overall state of geospatial data in the Semantic Web, with a focus on the upcoming OGC standard GeoSPARQL. GeoSPARQL attempts to unify data access for the geospatial Semantic Web. We describe the motivation for GeoSPARQL, the current state of the art in industry and research, an example use case, and the implementation of GeoSPARQL in the Parliament triple store.

Keywords: GeoSPARQL, SPARQL, RDF, geospatial, geospatial data, query language, geospatial query language, geospatial index

1. Introduction

Geospatial data is increasingly being made available on the Web in the form of datasets described using the Resource Description Framework (RDF). The principles of Linked Open Data, detailed in [6], encourage a set of best practices for publishing and connecting structured data on the Web. Linked Open Data promotes the use of the SPARQL Protocol and RDF Query Language (SPARQL) and RDF to query and model data. While this is useful for querying for relationships that are explicitly represented in data, implicit relationships, such as geospatial relationships, cannot easily be queried. For instance, datasets may exist that describe monuments and parks, but being able to link these datasets based on their undeclared relationships is difficult. The ability to answer a meaningful query, like "What parks are within 3km of the Washington Monument?", depends on how the data is represented, whether all of the resources are related to the Washington Monument, and if that relationship is explicit.

In this paper, we discuss an emerging standard, GeoSPARQL [24], from the Open Geospatial Consortium (OGC). This standard aims to address the issues of geospatial data representation and access. It provides a common representation of geospatial data described using RDF, and the ability to query and filter on the relationships between geospatial entities. First, we introduce some geospatial concepts that are critical to understanding some of the design choices for GeoSPARQL. We then describe topological relationships that are important to understand when designing a language for querying between spatial entities. Next, we describe the motivation for GeoSPARQL, the current state of the art in modeling and querying geospatial data in the Semantic Web, and we introduce GeoSPARQL. A discussion of the Parliament¹ triple store, its GeoSPARQL spatial index, and a use case that illustrates a simple example of data integration with GeoSPARQL follows.

¹ http://parliament.semwebcentral.org

0000-0000/0-1900/$00.00 © 0 – IOS Press and the authors. All rights reserved

Dave Kolas is a co-chair for the GeoSPARQL Standards Working Group, and Robert Battle has worked on the Parliament implementation and provided feedback to the development of the standard.

2. Geospatial Concepts

Some basic understanding of geospatial concepts is required for discussion of GeoSPARQL. The following sections define some of the terms used throughout the rest of this paper.

2.1. Features and Geometries

Features and geometries are two fundamental concepts of geospatial science. A feature is simply any entity in the real world with some spatial location. This could be a park, an airport, a monument, a restaurant, etc. A feature can have a spatial location that cannot be precisely defined, such as a swamp or a mountain range. A geometry is any geometric shape, such as a point, polygon, or line, and is used as a representation of a feature’s spatial location. For instance, Reagan National Airport is a geospatial feature because it is an entity that has a specific location in the world. It has a geometry that is a point with coordinates 38.852222, -77.037778 (in the WGS84² datum). Geometries can be measured at varying resolutions, from a simple point in the center of a feature to a complex, precise measurement of a feature’s entire border. Spatial data typically separates features from geometries, although that is not always the case.

2.2. Coordinate Reference Systems

An important part of the metadata associated with a geometry is its coordinate reference system (CRS) (alternatively known as its spatial reference system). The elements of a coordinate reference system provide context for the coordinates that define a geometry in order to accurately describe their position and establish relationships between sets of coordinates. There are four parts that make up a CRS: a coordinate system, an ellipsoid, a datum, and a projection.

A coordinate system describes a location relative to some center. A geocentric coordinate system places the center at the center of the Earth and uses standard X, Y, Z ordinates. A geographic (or geodetic) coordinate system uses a spherical surface to determine locations. In such a system, a point is defined by angles measured from the center of the Earth to a point on the surface. These are also known as latitudes (horizontal) and longitudes (vertical). A Cartesian coordinate system is a flat coordinate system on the surface. It enables quick and accurate measurements over small distances and is useful for applications such as surveying.

An ellipsoid defines an approximation for the center and shape of the Earth. A datum defines the position of an ellipsoid relative to the center of the Earth. This provides a frame of reference for measuring locations and, for local datums, allows for accurate locations to be defined for the valid area of the datum. WGS84 is a datum that is widely used by GPS devices and that approximates the entire world. Geographic coordinate systems use an Earth-based datum that transforms an ellipsoid into a representation of the Earth.

In order to create a map of the Earth, it must be projected from a curved surface onto the plane. This projection will distort the surface in some fashion, which means that the coordinates for some locations are more accurate than those for others. Some projections will preserve area, so the size of all objects is relative, while others preserve angles, and others try to do both. A coordinate system projected onto a plane enables faster performance, as Cartesian calculations require fewer resources than spherical calculations. Computations across the plane, however, are inaccurate when they deal with large areas, as the curvature of the Earth is not taken into account.

The combination of these elements defines a CRS. One common source of well-defined coordinate reference systems is the European Petroleum Survey Group (EPSG)³.

² http://en.wikipedia.org/wiki/World_Geodetic_System
³ http://www.epsg-registry.org

2.3. Topological Relationships

All spatial entities are inherently related to some other spatial entity. Whether two entities intersect somehow or are thousands of miles apart, the relationship that they share can be described and evaluated.

In [28], Randell et al. describe an interval logic for reasoning about space using a simple ontology that defines functions and relations for expressing and reasoning over spatial regions. This logic is referred to as Region Connection Calculus (RCC). A subset of RCC,

RCC8, defines eight jointly exhaustive and pairwise disjoint relations which can be used to imply the rest of the relations in RCC. These eight base relations are:

1. DC(x,y) (x is disconnected from y)
2. x = y (x is identical with y; written EQ in Table 1)
3. PO(x,y) (x partially overlaps y)
4. EC(x,y) (x is externally connected with y)
5. TPP(x,y) (x is a tangential proper part of y)
6. NTPP(x,y) (x is a non-tangential proper part of y)
7. TPPi(x,y) (y is a tangential proper part of x)
8. NTPPi(x,y) (y is a non-tangential proper part of x)

The same set of eight geospatial topological relations is described with different names by Egenhofer in [8], and includes the capacity to describe the relationship between different dimensioned geometries. This model was later generalized in the Nine Intersection Model [11]. The Open Geospatial Consortium Simple Feature Access Common Architecture specification [25] uses the Nine Intersection Model introduced by Egenhofer to describe spatial relations for use in geographic access systems. Table 1 illustrates the equivalence between all of these spatial relations.

Table 1
Simple Features, Egenhofer and RCC8 relations equivalence

Simple Features   Egenhofer            RCC8
equals            equal                EQ
disjoint          disjoint             DC
intersects        ¬disjoint            ¬DC
touches           meet                 EC
within            inside + coveredBy   NTPP + TPP
contains          contains + covers    NTPPi + TPPi
overlaps          overlap              PO

3. Motivation for GeoSPARQL

The Open Geospatial Consortium is a non-profit standards organization focused on geospatial data. The OGC is composed of members of industry, academic institutions, and government organizations. By standardizing GeoSPARQL within the OGC, we seek to leverage the experience of its members to ensure that geospatial Semantic Web data is represented in a consistent logical way. With the input and acceptance of knowledge base vendors as well as data producers and consumers, GeoSPARQL has the potential to unify geospatial RDF data access.

Geospatial reasoning is critical in a large number of application domains (emergency response, transportation planning, hydrology, land use, etc.). Users in these domains have long utilized relational databases with spatial extensions [9]. These spatially extended databases combine efficient, stable storage and retrieval of data with geospatial calculation and indexing. This allows questions like "Which students live within 2km of the school they attend?" to be answered efficiently.

Within the last decade, RDF storage solutions have become increasingly popular. These knowledge bases, sometimes called triple stores, are capable of better handling several types of problems at which relational databases struggle or are not intended to perform: queries with many joins across entities [32], queries with variable properties [32], and ontological inference on datasets. These features lend themselves towards problems that involve data exploration, linkage across datasets, and abstraction from low-level data.

Because of RDF stores’ ability to do inference and easily link data sets, they have been of growing interest to the geospatial data community. Often geospatial domains have complicated type hierarchies which cannot be fully expressed in current geospatial information systems. For instance, a river is both a waterway and a transportation route. Also, geospatial domain problems often require marrying multiple data sources together to solve a particular problem. In emergency response scenarios, population data, transportation data, and even realtime police and fire data must be combined to deliver a timely result. Combining data sources on the web is useful to consumers as well; geospatial data about points of interest combined with hotel information and travel route information could lead to significantly more sophisticated travel planning than currently exists.

As such, it was inevitable that those people interested in expressing geospatial data on the Semantic Web would want to combine the spatial indexing and calculation of spatial databases with the inferential power and data linkage of RDF triple stores [16]. This has been done by many groups in varying ways for varying purposes, as will be discussed later.

To provide geospatial reasoning and querying in a triple store, the implementors must define both an ontology for representing spatial objects and query predicates for retrieving these spatial objects. However, each organization that has attempted this task has approached it in a slightly different way. As a result, spatial RDF data that would be properly indexed and queryable in one implementation would simply be treated as plain RDF data in another implementation.

This is the primary problem which the standardization of GeoSPARQL attempts to solve. The GeoSPARQL language defines both a small ontology for representing features and geometries and a number of SPARQL query predicates and functions. All of these are derived from other OGC standards so that they are well grounded and understood. Using the new standard should ensure two things: (1) if a data provider uses the spatial ontology in combination with an ontology of their domain, that data can be properly indexed and queried in spatial RDF stores; and (2) compliant RDF triple stores should be able to properly process the majority of spatial RDF data.

Aside from providing the ability to perform spatial queries, the small ontology portion of the GeoSPARQL specification is intended to provide an interchange format for geospatial data in a wide variety of use cases. The ontology is meant to be attached to other ontologies for various domains, providing only the bare spatial aspects. It is intended to be simple enough to cover the most light-weight uses, and scale in complexity for complex use cases. GeoSPARQL goes a long way to solving one of the research issues posed by Egenhofer in [10], namely that "we need a plausible canonical form in which to pose geospatial data queries".

GeoSPARQL is intended to inter-operate with both quantitative and qualitative spatial reasoning systems. A quantitative spatial reasoning system involves concrete geometries for features. With these concrete geometries present, distances and topological relations can be explicitly calculated. Qualitative geospatial reasoning systems allow RCC type topological inferences for features where the geometries are either unknown or cannot be made concrete [13]. For example, if there are assertions that a monument is inside a park, and the park is inside a city, a qualitative reasoning system should be able to infer that the monument is in the city through transitivity. Hybrid versions of these systems are also possible, where some features have concrete geometries and others have abstract geometries. By sharing a set of terms for topological relations, GeoSPARQL allows conclusions from quantitative applications to be used by qualitative systems and a single query language for both types of reasoning.

4. Geospatial RDF: State of the Art and Related Work

Over the past decade, there have been many different attempts to create a geospatial RDF standard. Several different organizations, including the W3C, research groups, and triple store vendors have created their own ontologies and strategies for representing and querying geospatial data.

In 2003, a W3C Semantic Web Interest Group created the Basic Geo Vocabulary [31] which provided a way to represent WGS84 points in RDF. This work was extended from 2005 through 2007 by the W3C Geospatial Incubator Group [21] to follow the GeoRSS [23] feature model to allow for the description of points, lines, rectangles, and polygon geometries and their associated features. This group produced the GeoOWL ontology⁴ which provides a detailed and flexible model for representing geospatial concepts. These ontologies were produced as products from their respective groups within the W3C as the result of collaborations across university and industry partners.

⁴ http://www.w3.org/2005/Incubator/geo/XGR-geo-20071023/W3C_XGR_Geo_files/geo_2007.owl

There are published datasets that use the W3C vocabularies [4] and a variety of triple stores that support data represented by these ontologies [17,22]. However, the associated working groups never moved beyond the incubator state and the respective ontologies never became official W3C recommendations. These ontologies also only work with data in the WGS84 datum. In order to be valid, data in any other CRS must be projected, which can lead to inaccuracy in the data.

Support for spatial data in triple stores is mixed. Several vendors support spatial data, but not all vendors support the same representation of data or share the same support for relational queries. Some triple stores use the aforementioned W3C ontologies, while others have invented their own.

Parliament, the high performance [18] triple store from Raytheon BBN Technologies, provided the first geospatial index for semantic web data in 2007 [16,17]. This index supports data in the GeoOWL ontology and introduces ontologies for querying spatial data via RDF properties that correspond to the RCC spatial relations and OGC Simple Features relations. However, data in the W3C Basic Geo vocabulary is unsupported.
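To make the correspondence between the RCC and Simple Features relation vocabularies more concrete, the following minimal Python sketch encodes Table 1 as a lookup and reproduces the monument, park, and city inference described in Section 3. It is an illustration only and does not reflect how Parliament or any other triple store implements these relations; the names `RCC8`, `SF_TO_RCC8`, and `infer_within` are hypothetical.

```python
# The eight RCC8 base relations listed in Section 2.3.
RCC8 = {"DC", "EQ", "PO", "EC", "TPP", "NTPP", "TPPi", "NTPPi"}

# Table 1 as a lookup: Simple Features relation -> set of RCC8 base relations.
SF_TO_RCC8 = {
    "equals": {"EQ"},
    "disjoint": {"DC"},
    "touches": {"EC"},
    "within": {"NTPP", "TPP"},
    "contains": {"NTPPi", "TPPi"},
    "overlaps": {"PO"},
}
# "intersects" is the negation of "disjoint": every base relation except DC.
SF_TO_RCC8["intersects"] = RCC8 - SF_TO_RCC8["disjoint"]


def infer_within(asserted):
    """Return the transitive closure of asserted (inner, outer) 'within' pairs."""
    closure = set(asserted)
    changed = True
    while changed:
        changed = False
        for inner, middle in list(closure):
            for start, outer in list(closure):
                if middle == start and (inner, outer) not in closure:
                    closure.add((inner, outer))
                    changed = True
    return closure


inferred = infer_within({("monument", "park"), ("park", "city")})
print(("monument", "city") in inferred)  # True
```

The closure is computed by naive iteration, which is adequate for a handful of assertions; a qualitative reasoner over a real dataset would need a more efficient strategy.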

Ontotext’s OWLIM-SE⁵ triple store can index point data represented in the W3C Basic Geo Vocabulary [22]. The only spatial relationship that can be queried is whether a point is contained within a circle or polygon. The ability to query for relationships between higher order geometries such as lines and polygons is not supported.

OpenLink Virtuoso⁶ also has support for the W3C Basic Geo Vocabulary. A SPARQL function is provided to convert a pair of latitude and longitude property values into a point geometry. A special literal datatype, virtrdf:Geometry, is also provided for indexing point literals. Support for testing intersection and containment relationships is provided via property functions. Through a combination of these relations and their negations, most of the Egenhofer relations can be tested. However, some relations, such as testing for overlap, cannot be supported in Virtuoso.

Other triple stores take a completely different approach. As described in [12], Franz AllegroGraph⁷ defines a custom datatype to represent geospatial data and map it to a "strip" of space in the index that contains the data. A modified SPARQL syntax provides a new GEO operator where the geospatial aspect of the query is defined.

OpenSahara⁸ provides a service for adding external indexing and geospatial querying capabilities to any triple store with a Sesame⁹ Sail layer. The implementation is a wrapper for the PostgreSQL¹⁰ database with PostGIS spatial extensions¹¹ and as such, OpenSahara supports all of the geometries and relations defined in the OGC Simple Features Access [25]. Literal datatypes are introduced for Well-Known Text (WKT), Well-Known Binary (WKB), and their compressed forms, while the spatial relations that PostGIS supports are implemented as SPARQL filter functions.

In addition to vendor supported options, there has been significant community and research interest in representing and querying geospatial data in the last few years. Perry proposed an extension to SPARQL, SPARQL-ST, in [26]. This introduces a modified SPARQL syntax for posing spatial queries to data that is modeled in an upper ontology based on GeoRSS. A focus on describing data and metadata such as the CRS for geometries allows SPARQL-ST to operate with data in different reference systems, which is something that is lacking from many vocabularies such as GeoOWL and the Basic Geo vocabulary. This increased flexibility with data is hindered, however, by the proposed query syntax which deviates from the standard SPARQL language. Any data that is in the SPARQL-ST format can only be accessed by a SPARQL-ST system.

Taking a simpler approach, Zhai et al., in [34] and [33], discuss the need for adding topological predicates to SPARQL. The OGC Simple Features relations and a subset of geometries are used as the basis for their ontologies. Unfortunately, the relations have to be specifically encoded in RDF and the data cannot support multiple coordinate reference systems.

Another approach described by Koubarakis and Kyzirakos in [19] is based on research from the constraint database community. Constraint databases are a promising technology for integrating spatial data [5]. The authors propose to enrich the Semantic Web with spatial and temporal data by extending RDF and SPARQL with constraints. The proposed extension to RDF, stRDF, uses typed literals to describe a semi-linear point set. The query language, stSPARQL, extends SPARQL to include additional operators for querying RCC relationships and introduces a new syntax for specifying spatial variables. Support for multiple coordinate reference systems is not discussed. Compatibility with existing data is also a problem as not all data is represented as a semi-linear point set. stSPARQL is intentionally not compatible with OGC standards as the authors believe that the semi-linear point set can be used to describe different geometries without forcing a hierarchy of datatypes on users. In [20], stSPARQL and stRDF are described with an additional context of GIS applications. The authors contrast stRDF and stSPARQL with prior work (such as GeoOWL) and note that it is hard to find information about geometries when you do not know what type of geometries will be in the answer set and what properties to ask about. This is a problem with representations like GeoOWL where different geometries have different properties. The semi-linear point set avoids this by encapsulating the representation into a single datatype.

The NeoGeo Vocabulary [29] is a vocabulary that arose from the NeoGeo community¹², the community, and several universities. They recognized the need for a well formed standard representation for geospatial data and provided one that is based on the Geography Markup Language (GML) Simple Features Profile¹³. To describe spatial relations, an ontology based on RCC8 is also provided. A limitation of the NeoGeo approach is the need to represent each coordinate as a resource. In particular, polygons and lines are represented with an RDF collection of Basic Geo points. While this does allow points to be shared across geometries, it significantly increases the verbosity of the data. Unfortunately this extra verbosity does not result in particular gains in expressive power, since each point in a polygon provides little value in isolation and RDF list contents are difficult to query for in SPARQL. Moreover, this prevents geometry literals from being compared easily in SPARQL filter functions.

In [15], Hu and Du compose a three level hierarchical spatiotemporal model: a meta level for abstract space-time knowledge, a schema level for well-known models in spatial and temporal reasoning (e.g., RCC, Allen time [1]), and an instantiations level that provides mappings and formal descriptions of the various ground spatiotemporal statements in the Linked Data clouds. This meta approach provides a convenient way to abstract out spatial knowledge from its underlying representation. Mappings, however, have to be defined for each dataset at the instantiations level.

A complete implementation of integrating spatial data and queries into an RDF triple store is described by Brodt et al. in [7]. By typing WKT string representations of geometry literals with a spatial datatype, the triple store is able to efficiently store and query data. Once again, the OGC Simple Features relations are used as the basis for posing queries for the spatial relation. These relations are mapped to SPARQL filter functions which allow for direct comparison between spatial literals. This implementation is similar to the approach taken by Parliament and OpenSahara in the way that the SPARQL language itself does not need to be modified in order to query for the spatial relations between entities.

⁵ http://www.ontotext.com/owlim/editions
⁶ http://www.openlinksw.com
⁷ http://www.franz.com/agraph/allegrograph
⁸ http://www.opensahara.com
⁹ http://www.openrdf.org
¹⁰ http://www.postgresql.org
¹¹ http://postgis.refractions.net
¹² http://sites.google.com/site/neogswvocs/
¹³ http://www.ogcnetwork.net/gml-sf

5. Introduction to GeoSPARQL

As illustrated above, many groups have created ontologies and query predicates to make indexing and query of geospatial data possible. However, since there are many of these and they all differ slightly, data that can be spatially queried in one knowledge base may not be able to be spatially queried in another. GeoSPARQL provides a standard for geospatial RDF data insertion and query, which covers the use cases of the other previous approaches.

The GeoSPARQL specification attempts to enable a wide range of geospatial query use cases, from simple points of interest knowledge bases to detailed authoritative geospatial data sources for transportation. Moreover, use of GeoSPARQL for both of these types of data should enable the data sets to be easily used together. In order to achieve this goal, different conformance classes are provided. This means that a simple knowledge base implementation intended for simple use cases need not implement all of the more advanced reasoning capabilities of GeoSPARQL, such as quantitative reasoning or query rewriting.

There are also several sets of terminology for the topological relationships between geometries. Rather than mandate that all implementations use the same set of terminology, each implementation can choose which sets of terms to support. This is discussed further in the section on GeoSPARQL relationships.

GeoSPARQL attempts to address the problems with the disparate and incompatible implementations for representing and querying spatial data. It achieves this by defining an ontology that closely follows the existing standards work from the OGC with regard to spatial indexing in relational databases.

The GeoSPARQL specification contains three main components:

1. The definition of a vocabulary to represent features, geometries, and their relationships
2. A set of domain-specific, spatial functions for use in SPARQL queries
3. A set of query transformation rules

5.1. GeoSPARQL Ontology

The ontology for representing features and geometries is fundamental to being able to build and query spatial data. The ontology is based on the OGC’s Simple Features model, with some adaptations for RDF. Note that prefix definitions are omitted from examples for clarity; a definition of all of the prefixes used is at the end of the paper in Listing 14. The ontology includes a class geo:SpatialObject, with two primary subclasses, geo:Feature and geo:Geometry. These classes are meant to be connected to an ontology representing a domain of interest. Features can connect to their geometries via the geo:hasGeometry property.

For example, an airport is a geo:Feature. It is a conceptual thing that exists in the real world in a particular place. In the real world, it has a geometry that corresponds to the coordinates of all of the points along the border of the airport area. This real-world geometry has to be measured and estimated in some way. It is possible to do this measurement at various resolutions, each of which may serve well for different purposes. A representation of the real world geometry which has been measured becomes a geo:Geometry. Thus the airport may have several geo:Geometry instances, ranging from a single point in the center of the airport to an extremely detailed polygon that closely follows the airport’s outside border. A geometry that will function for most purposes within a dataset can be specified as the geo:defaultGeometry.

GeoSPARQL includes two different ways to represent geometry literals and their associated type hierarchies: WKT and GML. An implementor of a spatial triple store may choose to support either or both of these representations. GeoSPARQL provides different OWL classes for the geometry hierarchies associated with both of these geometry representations. This provides classes for many different geometry types such as point, polygon, curve, arc, and multicurve. The geo:asWKT and geo:asGML properties link the geometry entities to the geometry literal representations. Values for these properties use the geo-sf:WKTLiteral and geo-gml:GMLLiteral data types respectively.

5.2. GeoSPARQL Relationships

GeoSPARQL also includes a standard way to ask for topological relationships, such as overlaps, between spatial entities. These come in the form of binary properties between the entities and geospatial filter functions.

The topological binary properties can be used in SPARQL query triple patterns like a normal property. Primarily they are used between objects of the Geometry type. However, they can also be used between Features, or between Features and Geometries, if GeoSPARQL’s query rewrite rules are supported (discussed in the next section). The properties can be expressed using three distinct vocabularies: the OGC’s Simple Features, Egenhofer’s 9-intersection model, and RCC8. Which of these vocabularies is supported can be dependent on the triple store implementation, though it is likely that implementations will support all three. The Simple Features topological relations include equals, disjoint, intersects, touches, within, contains, overlaps, and crosses.

The filter functions provide two different types of functionality. First, there are operator functions which take multiple geometries as arguments and produce either a new geometry or another datatype as a result. An example of this is the function ogcf:intersection. This function takes two geometries and returns a geometry that is their spatial intersection. Other functions like ogcf:distance produce an xsd:double as a result. The second type of functionality is boolean topological tests of geometries. These come in the same three vocabulary sets as the topological binary properties: Simple Features topological relations, Egenhofer relations, and RCC8 relations. These functions are partially redundant with the topological binary properties; however, the topology functions take the geometry literals as parameters, while the binary properties relate Geometry and Feature entities. This means that quantitative and qualitative applications can both make use of the binary properties, but only quantitative applications can make use of the topology functions. Also, comparisons to concrete geometries provided in the query can only be made via the functions. An example of the topological functions is ogcf:intersects, which returns true if two geometries intersect.

5.3. Query Transformation Rules

The query rewrite rules allow for an additional layer of abstraction in SPARQL queries. While only concrete Geometry entities can be quantitatively compared, it nonetheless sometimes makes sense to discuss whether two features have a particular topological relationship. This is represented in the natural language question, "Is Reagan National Airport within Washington, DC?". Although Reagan National is referred to as a Washington DC airport, it is actually across the Potomac river in Virginia. In GeoSPARQL, Feature to Feature and Feature to Geometry topological relations are achieved by the combination of the use of the geo:defaultGeometry property and the query rewrite rules. If a feature is used as the subject or object of a topological relation, the query is automatically rewritten to compare the Geometry linked as a default, thus removing the abstraction for processing.
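The rewrite step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the rewrite procedure defined by the GeoSPARQL specification or used by Parliament: triple patterns are modeled as plain string tuples, the set of known features is passed in explicitly, and the names `rewrite`, `TOPOLOGICAL_PROPERTIES`, and `DEFAULT_GEOMETRY` are hypothetical. The property names follow the draft vocabulary used in this paper’s listings.

```python
from itertools import count

# Property names follow the draft vocabulary used in this paper's listings.
DEFAULT_GEOMETRY = "geo:defaultGeometry"
TOPOLOGICAL_PROPERTIES = {"geo:within"}  # hypothetical subset, for illustration


def rewrite(patterns, features):
    """Rewrite topological triple patterns so they compare default geometries.

    patterns: list of (subject, predicate, object) strings.
    features: set of terms known to be geo:Feature instances.
    """
    fresh = count(1)  # numbers the generated ?g variables
    rewritten = []
    for subject, predicate, obj in patterns:
        if predicate not in TOPOLOGICAL_PROPERTIES:
            # Non-topological patterns pass through unchanged.
            rewritten.append((subject, predicate, obj))
            continue
        ends = []
        for term in (subject, obj):
            if term in features:
                # Replace the feature with a variable bound to its default geometry.
                var = f"?g{next(fresh)}"
                rewritten.append((term, DEFAULT_GEOMETRY, var))
                ends.append(var)
            else:
                ends.append(term)
        rewritten.append((ends[0], predicate, ends[1]))
    return rewritten


query = [("ex:DCA", "geo:within", "ex:WashingtonDC")]
for triple in rewrite(query, {"ex:DCA", "ex:WashingtonDC"}):
    print(" ".join(triple), ".")
```

Applied to the feature-to-feature pattern from Listing 1, the sketch yields the two geo:defaultGeometry patterns and the geometry-to-geometry comparison that appear in the rewritten query. Terms that are not known features are left in place, so a pattern that already relates two geometries is returned unchanged.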

Listing 1 shows a query with a relationship between Feature objects before and after rewrite. Listing 2: Example Ontology ex:Restaurant a owl:Class; Listing 1: Query Rewrite Example rdfs:subClassOf ex:Service . ex:Park a owl:Class; #Before rdfs:subClassOf ex:Attraction . ASK { ex:Museum a owl:Class; ex:DCA a geo:Feature; rdfs:subClassOf ex:Attraction . geo:within ex:WashingtonDC . ex:Monument a owl:Class; ex:WashingtonDC a geo:Feature . rdfs:subClassOf ex:Attraction . } ex:Service a owl:Class; rdfs:subClassOf ex: #After PointOfInterest . ASK { ex:Attraction a owl:Class; ex:DCA a geo:Feature; rdfs:subClassOf ex: geo:defaultGeometry ?g1 . PointOfInterest . ex:WashingtonDC a geo:Feature ; ex:PointOfInterest a owl:Class; geo:defaultGeometry ?g2 . rdfs:subClassOf geo:Feature . ?g1 geo:within ?g2 . }

The goal of this feature is to provide a more intuitive approach to geospatial querying for use cases that do not require many different geometries, while still maintaining a concrete definition of this intuitive understanding. Compliant GeoSPARQL triple stores are not required to implement this feature.

5.4. Using GeoSPARQL

Consider an example using a points of interest ontology in listing 2. We seek to represent points of interest of various types (Monuments, Parks, Restaurants, Museums, etc.). These types of landmarks are represented in a hierarchy with the ex:PointOfInterest class at the top. These classes of course may include many non-spatial attributes, but only a label is included here. All that is required to link this ontology with GeoSPARQL, and thus give its classes a geospatial reference, is to make ex:PointOfInterest a subclass of geo:Feature.

If compliance with WKT is chosen, and latitudes and longitudes are expressed in the WGS84 datum in longitude latitude order (CRS:84), a record for the Washington Monument would look like listing 3. If a coordinate reference system other than CRS:84 is desired, that can be included within the WKTLiteral. Listing 4 expresses the same point in WGS84 with latitude longitude order.

Listing 3: Washington Monument

    ex:WashingtonMonument a ex:Monument;
        rdfs:label "Washington Monument";
        geo:hasGeometry ex:WMPoint .
    ex:WMPoint a geo:Point;
        geo:asWKT "POINT(-77.03524 38.889468)"^^geo-sf:WKTLiteral .

Listing 4: Point in WGS84 datum

    "[...] POINT(38.889468 -77.03524)"^^geo-sf:WKTLiteral

While this representation may seem verbose, and the literal string is no longer standard WKT, it has the advantage of encoding the CRS information directly into the literal. This is all of the data needed to define the Geometry; without the CRS, another property would need to be added onto the Geometry, which would increase storage requirements and make sharing data more cumbersome.

Now we will look at a few example GeoSPARQL queries using this data. One potential query over this dataset would be, "Which monuments are contained within a park?" This query requires a topological comparison between the geometries of the monuments and the geometries of parks. We show it in listing 5 using the binary topology property geo:within. The two entities have type statements, geo:hasGeometry properties to tie them to their geometries, and then the geo:within property to tie them together.

Listing 5: Example Query 1

    SELECT ?m ?p
    WHERE {
        ?m a ex:Monument ;
            geo:hasGeometry ?mgeo .
        ?p a ex:Park ;
            geo:hasGeometry ?pgeo .
        ?mgeo geo:within ?pgeo .
    }

If the knowledge base being used supported the query rewriting rules, and the data set included default geometries, the first query could be rewritten even more simply using a feature-to-feature topological relationship. This method is demonstrated in listing 6.

Listing 6: Example Query 2

    SELECT ?m ?p
    WHERE {
        ?p a ex:Park .
        ?m a ex:Monument ;
            geo:within ?p .
    }

Spatial user interfaces often need to look for entities of a particular type that fall within an explicit bounding box. Consider the query, "What attractions are within the bounding box defined by (-77.089005, 38.913574) and (-77.029953, 38.886321)?" Because we need to specify an explicit geometry in the query, we need to compare to it using the topological filter functions as opposed to the binary properties. We have the attraction entity and its attached geometry, and the geometry is compared with the filter function geof:within. This query is shown in listing 7. Note that the bounding box is expressed as a Polygon.

Listing 7: Example Query 3

    SELECT ?a
    WHERE {
        ?a a ex:Attraction;
            geo:hasGeometry ?ageo .
        FILTER(geof:within(?ageo,
            "POLYGON((
                -77.089005 38.913574,
                -77.029953 38.913574,
                -77.029953 38.886321,
                -77.089005 38.886321,
                -77.089005 38.913574
            ))"^^geo-sf:WKTLiteral))
    }

Queries looking for entities within a particular distance of either other entities or a current location are extremely useful as well. "Which parks are within 3km of the Washington Monument?" can be easily expressed in GeoSPARQL. We assume the same URI for the Washington Monument as in the data example above. We need to retrieve the two geometries and use the function geof:distance to calculate the distance between them. A standard SPARQL less-than comparison is then applied. This query is shown in listing 8.

Listing 8: Example Query 4

    SELECT ?p
    WHERE {
        ?p a ex:Park ;
            geo:hasGeometry ?pgeo .
        ex:WashingtonMonument
            geo:hasGeometry ?wgeo .
        FILTER(geof:distance(?pgeo, ?wgeo, units:m) < 3000)
    }

These example queries require relatively little in terms of non-spatial constraints, but serve to illustrate some basic functionality with GeoSPARQL. With a more complex ontology, queries could include more complicated thematic elements as well.

6. Spatial Indexing of RDF Data in Parliament

Storing geospatial data just as RDF triples does not allow for the spatial exploitation of that data. In order to be able to efficiently query for the relationships between spatial entities, the data must be indexed. This allows only those resources that match the spatial component of a query to be retrieved, rather than spatially filtering all bindings that match a given query. The relative performance advantages of using a spatial index versus spatially filtering a result set are discussed in [7]. Parliament uses a spatial index not only to find the data that matches a spatial query, but also to decide whether the spatial part of a query is more selective than the non-spatial part. This can be used to optimize how a query is executed and is discussed below.

6.1. Enabling GeoSPARQL

Parliament has a modular architecture that enables indexes to be built on top of the storage engine. Parliament already includes a spatial index based on a standard R-tree implementation [14]. In a similar approach to [7], the spatial index is integrated in a way that provides native support for spatial relational queries and efficient storage of the data. The general goal for this index is to split SPARQL queries with geospatial information into multiple parts, allowing for an optimized query plan between the spatial components of the query and the components with non-spatial triples to be executed.

Before the emergence of GeoSPARQL, Parliament indexed data represented in GeoOWL and allowed RCC8 and OGC Simple Features relations to be queried [16,17]. We are currently implementing GeoSPARQL based on the public candidate draft standard, and we describe this implementation in the following section.

6.2. Index Specification

The index interfaces in the Parliament API include methods for building a record from data, adding and removing records, finding a record by URI, and finding records by value. Existing indexes include the aforementioned GeoOWL spatial index, a temporal index (for indexing OWL Time¹⁴), and a basic numeric index (for optimizing range queries on numeric property values).

The Parliament triple store is built with support for Jena's RDF API and ARQ SPARQL query engine¹⁵. An implementation of Jena's graph interface¹⁶ provides access to the base graph store with support for adding, removing, and finding triples. By attaching a listener¹⁷ to the graph, the addition and removal of triples can be detected and forwarded to any associated indexes. Parliament's GeoSPARQL index listens for triples that contain the geo:asWKT or geo:asGML predicates. Any triple that is added to the graph is checked to see if it matches. At this point, the index can create a record for the geometry that is represented in the object of the triple and insert it into the index. For instance, when adding the triples in listing 3 to Parliament, the index will generate a single record that contains a reference to the ex:WMPoint resource and its WKT value.

In order to query for data in an index, we have extended the ARQ query engine to support property functions that can access indexes. When ARQ parses a SPARQL query, it detects the different operators in the query. A particularly useful feature of ARQ is the support for property functions. Instead of matching a triple in a graph, property functions can execute custom code in the context of the SPARQL query. Consider the query in listing 9. It will yield a result set with a single binding for ?x by concatenating all of the arguments together to form the literal "Hello World" instead of attempting to match the triple pattern.

Listing 9: Property Function Query

    SELECT ?x
    WHERE {
        ?x apf:concat("Hello", " ", "World") .
    }

Parliament utilizes property functions to define the relations that can be queried via SPARQL. The GeoSPARQL spatial relations, such as geo:intersects, are implemented as property functions.

¹⁴ http://www.w3.org/TR/owl-time/
¹⁵ http://www.openjena.org
¹⁶ http://openjena.org/javadoc/com/hp/hpl/jena/graph/Graph.
¹⁷ http://openjena.org/javadoc/com/hp/hpl/jena/graph/GraphListener.html

6.3. Optimizing Query Execution

By using an index, a query can be optimized such that the most selective part is executed first. Consider the query in listing 10. This query is asking for all monuments within the National Mall. Parliament's query optimizer splits the query into blocks for execution based on how the variables in the query are used and what special operations occur in the query. In this instance, since the predicate, geo:within, is an index property function, the triple containing the predicate is considered as one query block. The rest of the query is a simple basic graph pattern containing two triples describing ?m. The basic graph pattern is analyzed and partitioned so that no variable crosses partitions. For this query, this generates two partitions. When the query is executed, the Parliament query optimizer has two choices: (1) it can execute the spatial part first and then match to the graph pattern, or (2) it can execute the graph pattern first, then execute the spatial operation. The optimizer decides between them based on which path it estimates will produce the fewest result bindings.

Listing 10: Optimization Example Query

    SELECT ?m
    WHERE {
        ?m a ex:Monument ;
            geo:hasGeometry ?mgeo .
        ?mgeo geo:within ex:NationalMallGeometry .
    }

In this example, there are two sub patterns: the index property function, and the basic graph pattern for ?m. Each sub pattern estimates how many results it will be able to provide. For operations within the spatial index, a bounding box query can be performed to estimate how many results will be returned. In this example, the index will look up the bounding box for ex:NationalMallGeometry and calculate how many items it contains. For basic graph patterns, Parliament keeps statistics on the triples it contains and can quickly estimate how many matches are in the store for a given triple pattern. Sub patterns containing basic graph patterns use these statistics to estimate how many triples will be bound by the pattern. After each sub pattern has an estimate, they are sorted in ascending order of that estimate and executed accordingly. In this case, if there were 500 monuments with geometries, but only 100 geometries within the bounding box, the index property function would be executed first. If, however, there were 500 geometries within the bounding box, but only 10 monuments in the triple store, the basic graph pattern would execute first and the spatial relationship would be tested for each of the 10 results.

Consider again the query in listing 8. This query is deceptive in that it appears to be asking a simple geospatial question: "What are all the parks within 3km of the Washington Monument?". However, there is no way for Parliament's query optimizer to know a priori what the distances between geometries are. While some implementations of GeoSPARQL may be optimized to handle cases like this, in Parliament this query would not use the spatial index at all; the geometries for all parks would be found before checking their distance. Listing 11 shows a more efficient version of this query. This version buffers the geometry for the Washington Monument by 3000 meters and then uses that buffer to test the geo:within relationship. Parliament can take this query and run the spatial component against the spatial index in order to reduce the number of results that need to match the rest of the query. A future version of the Parliament query optimizer will be able to optimize queries like listing 8 automatically.

Listing 11: Example Query 4 - Optimized

    SELECT ?p
    WHERE {
        ?p a ex:Park .
        ex:WashingtonMonument geo:hasGeometry ?wgeo .
        LET (?buff := geof:buffer(?wgeo, 3000, units:m)) .
        ?p geo:within [
            a geo:Point ;
            geo:asWKT ?buff
        ].
    }

7. GeoSPARQL and Linked Data

As the Semantic Web materializes on the internet in the form of Linked Data, there is an increasing amount of structured data available with some sort of geospatial context attached. However, the vast majority of this geospatial context cannot be utilized for spatial queries because the hosting SPARQL endpoints cannot perform them. In the following section, we discuss four datasets that are representative of the disparate types of data that can be integrated together via their spatial context. We then demonstrate this integration on two datasets using Parliament.

7.1. Geospatial Data Sets

GeoNames¹⁸ provides information for over eight million geospatial features. The data is exposed via an RDF webservice¹⁹ that exposes information on a per resource basis. The geospatial aspect is represented using the W3C Basic Geo vocabulary. There is no ability, however, to perform any type of SPARQL query to retrieve data.

Another significant source of geospatial data is DBpedia²⁰. As described in [2], data from Wikipedia²¹ is extracted into RDF. Many of these extracted entities are geospatial in nature (cities, counties, countries, landmarks, etc.) and many of these entities already contain some geospatial location information. In fact, DBpedia contains information from GeoNames and goes so far as to include the latitude and longitude information for many entities. It also provides owl:sameAs links between the DBpedia and GeoNames resources. However, without any spatial computation predicates, this geospatial information can only be retrieved "as is." A query like "Show all cities within 50 miles of Arlington, VA with a population of at least 100,000 people in which at least one famous person was born" is not possible, even though the data exists to support it.

The linked data community has released the LinkedGeoData data set [3]. This data set is a spatial knowledge base, derived from OpenStreetMap²² and linked to DBpedia and GeoNames resources. It contains over 200 million triples describing the nodes and paths from OpenStreetMap. The data set is accessible via SPARQL endpoints running on the OpenLink Virtuoso platform with all the benefits and limitations that this platform provides.

Another source of geospatial data is the United States Geological Survey (USGS). The USGS has released a SPARQL endpoint²³ for triple data derived from The National Map [30], a collaborative effort to deliver usable topographic information for the United States. This dataset is much more specific and specialized than the data that is provided by DBpedia and GeoNames. It includes geographic names, hydrography, boundaries, transportation, structures, and land cover. The group has attempted to follow the forthcoming GeoSPARQL specification, though some aspects of GeoSPARQL have changed slightly since the data was published. Due to a lack of available GeoSPARQL triple stores, the published dataset includes a pre-computation of all of the topological relationships between entities. A query such as "Show all rail lines that cross rivers" is in fact possible to answer by looking at the current precomputed data. However, this means that if a new entity is added, the knowledge base needs to compute everything that the entity is related to and update those entities as well. If this data was not precomputed, the only way to answer the query would be via indexing the data and querying the relations with a relationship predicate.

¹⁸ http://www.geonames.org
¹⁹ http://www.geonames.org/ontology/documentation.html
²⁰ http://www.dbpedia.org
²¹ http://www.wikipedia.org
²² http://www.openstreetmap.org
²³ http://usgs-ybother.srv.mst.edu:8890/

7.2. Integration in Parliament via GeoSPARQL

GeoSPARQL provides the means to link geospatial datasets together and extract more meaning from the relations between the data. As it is simply not possible to pre-compute all of the relations between all of the available geospatial datasets, enriching the existing datasets with GeoSPARQL representations and creating indexes for the data is one way that data providers can share data while providing access to geospatial semantics.

For the following examples, we have processed a subset of the GeoNames RDF dataset²⁴ and the USGS GeoSPARQL data for Atlanta, GA. This data is accessible via a Parliament SPARQL endpoint with a GeoSPARQL spatial index²⁵. The GeoSPARQL index supports indexing data conforming to WKT and GML serialization. The supported GeoSPARQL queries include those with Simple Features, Egenhofer, and RCC8 relations, the non-topological query functions and common topological query functions. Query rewriting is not supported at this time.

²⁴ http://download.geonames.org/all-geonames-rdf.zip
²⁵ http://geosparql.bbn.com

As so much data is represented as individual latitude and longitude data using the W3C Basic Geo vocabulary, including that provided by GeoNames, it is desirable to be able to convert it easily into GeoSPARQL. It is trivial to provide functions that take a latitude and longitude pair and convert them into a GeoSPARQL point. Parliament provides SPARQL property functions, spatial:toWKTPoint and spatial:toGMLPoint, which take a latitude, longitude, and optional spatial reference system identifier as arguments and return a geo:WKTLiteral or geo:GMLLiteral representation of a point. This makes it possible to load and index existing spatial data sets without having to regenerate existing RDF. After loading the GeoNames data into Parliament, it was aligned with GeoSPARQL using the query in listing 12. This query explicitly assigns GeoNames features to be GeoSPARQL features, creates a new geo:Point resource containing the point information, and links the feature to the geometry.

Listing 12: GeoNames Conversion Query

    CONSTRUCT {
        ?feature a geo:Feature ;
            geo:hasGeometry [
                a geo:Point ;
                geo:asWKT ?wkt
            ].
    }
    WHERE {
        ?feature a gn:Feature ;
            wgs84_pos:lat ?lat ;
            wgs84_pos:long ?long .
        LET (?wkt := spatial:toWKTPoint(?lat, ?long)) .
    }

The USGS provides their RDF data in a format that is similar to GeoSPARQL. The geospatial data from The National Map contains polygon, polyline, and point data. The data conforms to an earlier revision of the GeoSPARQL standard and contains features and geometries, but lacks typed literals for the geo:asWKT and geo:asGML property values. For this data, the constructor functions for the GeoSPARQL literal datatypes can be used to create correctly typed values at query time. The existing spatial relations statements in the dataset were ignored when processing the data.

7.2.1. GeoSPARQL Data Query

Once GeoSPARQL data exists for both GeoNames and the USGS and is loaded into a GeoSPARQL capable knowledge base, such as Parliament, interesting geospatial questions can be posed. Consider the following query: "What are all the schools in Georgia that are within 50 meters of a railway?" GeoNames provides point data for buildings, including schools, while the USGS data contains polyline data for rail lines as well as polygons for different regions. Listing 13 shows the GeoSPARQL formulation for this question. This query takes advantage of several features of the language. First, the geo:within predicate is used to determine what schools exist within the boundary of Georgia (as defined by a specific USGS resource for Georgia). Each school geometry is then buffered by 50 meters using the geof:buffer function. The rail features are retrieved and checked to see if they fall within this buffer. Finally, for all rail line segments that intersect the buffer, the actual distance to the school is calculated using the geof:distance function. Table 2 displays the school resources and the distance to the nearest rail line segment.

Table 2: Schools in Georgia within 50m of a Rail Line

    school                              distance (m)
    http://sws.geonames.org/4183400/    13.87504515671802
    http://sws.geonames.org/7146774/    39.795020615173165
    http://sws.geonames.org/4183400/    44.09248276546987
    http://sws.geonames.org/7146435/    46.6898774805717

While the sample query assumes everything is contained in a single graph, it is not unusual for the data to exist in several different graphs or endpoints. In fact, it would be ideal to not have to replicate data and to be able to query it remotely through the dataset provider. Until the means for query federation, such as the mechanism discussed in [27], are widely supported, querying across remote linked datasets will be difficult. GeoSPARQL queries should be compatible with query federation, though there will likely be performance implications. In lieu of query federation, querying across graphs is possible. The sample data for the above examples is actually contained in two different graphs. A union graph that virtually combines both graphs, however, enables a shorter and simpler query.

8. Conclusion

GeoSPARQL is the culmination of a significant amount of previous work on combining RDF and OWL with geospatial data. Its creation means that geospatial data interchange within the Semantic Web can take place

[...] This, by extension, should lead to users' ability to fi[...] text available in RDF datasets. [...] GeoSPARQL, support similar functionality or a subset [...]tial applications are. Hopefully when the GeoSPARQL [...] unify their implementations, allowing users to consistently exchange and process geospatial data.

We have worked to update our open source triple store Parliament to support GeoSPARQL, and we hope that others will find it useful both for working on geospatial Semantic Web applications and understanding the specification.

9. Acknowledgments

We would like to thank the people at the Center of Excellence for Geospatial Information Science (CEGIS) at USGS for their efforts to support GeoSPARQL even before the specification has been finalized. This has allowed us to identify and resolve additional issues with the specification earlier.

We would also like to thank the OGC and the other members of the GeoSPARQL working group for their continued efforts to bring this marriage of the geospatial community and the Semantic Web to life.

Listing 13: Georgia Schools Near Rail Lines Query

    SELECT DISTINCT ?school ?distance
    WHERE {
        GRAPH [...] {
            # get Georgia geometry
            gu:_2403126 geo:hasGeometry ?ga_geo .

            # get schools within Georgia
            ?school a geo:Feature ;
                geo:hasGeometry ?school_geo ;
                gn:featureCode gn:S.SCH .

            ?school_geo geo:within ?ga_geo ;
                geo:asWKT ?school_wkt .

            # buffer schools 50m
            LET (?s_buff := geof:buffer(?school_wkt, 50, units:m)) .

            # find rail links within buffer
            ?rail a trans:railFeature ;
                geo:hasGeometry ?rail_geo .
            [...]
            LET (?rail_wkt := geo-sf:WKTLiteral [...]
            [...] , ?s_buff)) .
            LET (?distance := geof:distance(? [...]
            [...] .
    [...]
    ORDER BY ASC(?distance)

Listing 14: RDF Prefixes

    pf:
    ex:
    gn:
    gu:
    geo:
    geof:
    geo-sf:
    geo-gml:
    os:
    ose:
    osg:
    owl:
    rdfs:
    spatial:
    time:
    trans:
    units:
    wgs84_pos:
    xsd:
    virtrdf:

References

[1] J. F. Allen. Maintaining knowledge about temporal intervals. Communications of the ACM, 26:832–843, November 1983. ISSN 0001-0782.
[2] S. Auer and J. Lehmann. What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content. In E. Franconi, M. Kifer, and W. May, editors, Proceedings of the European Semantic Web Conference (ESWC-2007), volume 4519 of Lecture Notes in Computer Science, pages 503–517. Springer Berlin / Heidelberg, 2007. ISBN 978-3-540-72666-1.
[3] S. Auer, J. Lehmann, and S. Hellmann. LinkedGeoData: Adding a Spatial Dimension to the Web of Data. In A. Bernstein, D. Karger, T. Heath, L. Feigenbaum, D. Maynard, E. Motta, and K. Thirunarayan, editors, Proceedings of the 8th International Semantic Web Conference (ISWC-2009), volume 5823 of Lecture Notes in Computer Science, pages 731–746. Springer Berlin / Heidelberg, 2009. ISBN 978-3-642-04929-3.
[4] C. Becker and C. Bizer. Exploring the geospatial semantic web with DBpedia Mobile. Web Semantics: Science, Services and Agents on the World Wide Web, 7(4):278–286, 2009. ISSN 1570-8268. URL http://www.sciencedirect.com/science/article/pii/S1570826809000468. Semantic Web challenge 2008.
[5] A. Belussi, E. Bertino, and B. Catania. Manipulating spatial data in constraint databases. In M. Scholl and A. Voisard, editors, Advances in Spatial Databases, volume 1262 of Lecture Notes in Computer Science, pages 113–141. Springer Berlin / Heidelberg, 1997. ISBN 978-3-540-63238-2.
[6] C. Bizer, T. Heath, and T. Berners-Lee. Linked Data – The Story So Far. International Journal on Semantic Web and Information Systems, Special Issue on Linked Data, 5:1–22, 2009. ISSN 1552-6283.
[7] A. Brodt, D. Nicklas, and B. Mitschang. Deep integration of spatial query processing into native RDF triple stores. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS '10, pages [...] 0428-3.
[...] 472, New York, NY, USA, 1989. Springer-Verlag New York, Inc. ISBN 0-387-51295-0.
[...] guage. IEEE Transactions on Knowledge and Data Engineer[...]
[...] Proceedings of the 10th ACM international symposium on Ad[...] New York, NY, USA, 2002. ACM. ISBN 1-58113-591-2.
[...] logical relations between regions, lines, and points in geographic databases. Technical report, Department of Surveying [...]
[12] Franz, Inc. Geospatial support in SPARQL queries, April 2011. URL http://www.textriples.com/agraph/support/documentation/4.2/sparql-geo.html. [Online; accessed 25-May-2011].
[13] R. Grütter and B. Bauer-Messmer. Combining owl with rcc for spatioterminological reasoning on environmental data. In OWLED 2007 Workshop on OWL: Experiences and Directions, volume 258, 2007. URL http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-258/paper17.pdf.
[14] A. Guttman. R-trees: A dynamic index structure for spatial searching. In International Conference on Management of Data, pages 47–57. ACM, 1984.
[15] H. Hu and X. Du. Linking Open Spatiotemporal Data in the Data Clouds. In J. Yu, S. Greco, P. Lingras, G. Wang, and A. Skowron, editors, Rough Set and Knowledge Technology, volume 6401 of Lecture Notes in Computer Science, pages 304–309. Springer Berlin / Heidelberg, 2010.
[16] D. Kolas. Supporting spatial semantics with SPARQL. In Proceedings of the Terra Cognita Workshop, collocated with the 7th International Semantic Web Conference (ISWC-2008), 2008.
[17] D. Kolas and T. Self. Spatially augmented knowledgebase. In Proceedings of the 6th International Semantic Web Conference (ISWC-2007), volume 4825 of Lecture Notes in Computer Science. Springer, 2007.
[18] D. Kolas, I. Emmons, and M. Dean. Efficient linked-list RDF indexing in Parliament. In 5th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2009), 2009.
[19] M. Koubarakis and K. Kyzirakos. Modeling and Querying Metadata in the Semantic Sensor Web: The Model stRDF and the Query Language stSPARQL. In L. Aroyo, G. Antoniou, E. Hyvönen, A. ten Teije, H. Stuckenschmidt, L. Cabral, and T. Tudorache, editors, Proceedings of the Extended Semantic Web Conference 2010, volume 6088 of Lecture Notes in Computer Science, pages 425–439. Springer Berlin / Heidelberg, 2010. ISBN 978-3-642-13485-2.
[20] K. Kyzirakos, Z. Kaoudi, and M. Koubarakis. Data models and languages for registries in SemsorGrid4Env, 2009. Deliverable D3.1, SemSorGrid4Env.
[21] J. Lieberman, R. Singh, and C. Goad. W3C Geospatial Ontologies, October 2007. URL http://www.w3.org/2005/Incubator/geo/XGR-geo-ont/. [Online; accessed 25-May-2011].
[22] Ontotext AD. Geo-spatial indexing in OWLIM, 2011. URL http://www.ontotext.com/owlim/geo-spatial. [Online; accessed 25-May-2011].
[23] Open Geospatial Consortium. An Introduction to GeoRSS: A Standards Based Approach for Geo-enabling RSS feeds. White paper, 2006. URL http://www.opengeospatial.org/pt/06-050r3. Document 06-050r3.
[24] Open Geospatial Consortium. OGC GeoSPARQL - A Geographic Query Language for RDF Data. Open Geospatial Consortium, 2011. URL http://www.opengeospatial.org/standards/requests/80. Document 11-052r3.
[25] Open Geospatial Consortium. OpenGIS® Implementation Standard for Geographic information - Simple feature access - Part 1: Common architecture. Open Geospatial Consortium, 2011. URL http://www.opengeospatial.org/pt/06-103r4. Document 06-103r4.
[26] M. Perry. A Framework to Support Spatial, Temporal and Thematic Analytics over Semantic Web Data. PhD thesis, Wright State University, 2008.
[27] E. Prud'hommeaux. SPARQL 1.1 federation extensions. W3C working draft, W3C, June 2010. URL http://www.w3.org/TR/sparql11-federated-query/.
[28] D. A. Randell, Z. Cui, and A. G. Cohn. A spatial logic based on regions and connection. In Proceedings of the 3rd International Conference on Knowledge Representation and Reasoning, 1992.
[29] J. M. Salas, A. Harth, B. Norton, L. M. Vilches, A. D. León, J. Goodwin, C. Stadler, S. Anand, and D. Harries. NeoGeo Vocabulary: Defining a shared RDF representation for GeoData. Public draft, NeoGeo, May 2011. URL http://geovocab.org/doc/neogeo.html.
[30] D. Varanka. National topographic modeling, ontology-driven geographic information in the context of the national map. In First International Workshop on Information Semantics and its Implications for Geographical Analysis (ISGA '08) at GIScience 2008, the 5th International Conference on Geographic Information Science, September 2008.
[31] W3C. W3C Semantic Web Interest Group: Basic Geo (WGS84 lat/long) Vocabulary, 2006. URL http://www.w3.org/2003/01/geo/. [Online; accessed 25-May-2011].
[32] C. Weiss, P. Karras, and A. Bernstein. Hexastore: sextuple indexing for semantic web data management. Proc. VLDB Endow., 1:1008–1019, August 2008. ISSN 2150-8097.
[33] Z. Xiao, L. Huang, and X. Zhai. Spatial information semantic query based on sparql. In Proceedings of SPIE, volume 7492, October 2009.
[34] X. Zhai, L. Huang, and Z. Xiao. Geo-spatial query based on extended sparql. In 2010 18th International Conference on Geoinformatics, pages 1–4, June 2010.