GraphDB Free Documentation Release 6.6


Ontotext, Dec 14, 2017

CONTENTS

1 General
  1.1 About GraphDB
  1.2 Architecture & Components
    1.2.1 Architecture
      1.2.1.1 Sesame
      1.2.1.2 The SAIL API
    1.2.2 GraphDB components
      1.2.2.1 Engine
      1.2.2.2 Connectors
      1.2.2.3 Workbench
  1.3 GraphDB Free
    1.3.1 Comparison of GraphDB Free and GraphDB SE
  1.4 Connectors
  1.5 Workbench
    1.5.1 Requirements
    1.5.2 How to use it
2 Quick Start Guide
  2.1 Starting the database
  2.2 Creating locations and repositories
  2.3 Loading data
    2.3.1 Supported file formats
    2.3.2 Loading data through the GraphDB Workbench
      2.3.2.1 To load a local file:
      2.3.2.2 To load a database server file:
      2.3.2.3 Other ways of loading data:
    2.3.3 Loading data through SPARQL or Sesame API
    2.3.4 Loading data through the GraphDB LoadRDF tool
  2.4 Querying data
    2.4.1 Querying data through the GraphDB Workbench
    2.4.2 Querying data programmatically
      2.4.2.1 Execute the example query with a HTTP GET request:
      2.4.2.2 Execute the example query with a POST operation:
    2.4.3 Supported export/download formats
  2.5 Additional resources
3 Installation
  3.1 Distribution Package
  3.2 Requirements
    3.2.1 Minimum requirements
    3.2.2 Production environment
      3.2.2.1 Hardware
      3.2.2.2 Software
      3.2.2.3 Control & management
    3.2.3 Licensing
  3.3 Installing GraphDB Free
    3.3.1 Install GraphDB on a laptop/desktop computer
    3.3.2 Installing GraphDB on a server
  3.4 Upgrading GraphDB Free
    3.4.1 Procedure
    3.4.2 Upgrade scenarios
      3.4.2.1 The storage/ folder format is not different (for minor versions)
      3.4.2.2 The storage/ folder format is different but GraphDB can auto-upgrade it (for minor changes in the format)
      3.4.2.3 The storage/ folder format is different but the changes cannot be auto-upgraded (for major versions)
    3.4.3 Upgrade from/to specific versions
      3.4.3.1 Upgrade from 6.2
      3.4.3.2 Upgrade from 6.1
      3.4.3.3 Upgrade from 5.4 to 6.2+
      3.4.3.4 Upgrade from 5.x
4 Administration
  4.1 Administration Tasks
  4.2 Administration Tools
    4.2.1 Through the Workbench
    4.2.2 Through the JMX interface
    4.2.3 GraphDB configurator (spreadsheet)
  4.3 Creating a Repository
    4.3.1 Creating locations and repositories
  4.4 Configuring a Repository
    4.4.1 Steps
    4.4.2 A repository configuration template - how it works
    4.4.3 Sample configuration
    4.4.4 Configuration parameters
      4.4.4.1 Debugging options
    4.4.5 Configuring memory
      4.4.5.1 Configuring the JVM, the application and the GraphDB workspace memory
      4.4.5.2 Configuring the memory for storing entities
      4.4.5.3 Configuring the Cache memory
  4.5 Sizing Guidelines
    4.5.1 Entry-level deployment
    4.5.2 Mid-range deployment
    4.5.3 Enterprise deployment
  4.6 Disk Space Requirements
    4.6.1 GraphDB disk space requirements for loading a dataset
    4.6.2 GraphDB disk space requirements per statement
  4.7 Configuring the Entity Pool
  4.8 Starting & Shutting down GraphDB
    4.8.1 Starting GraphDB in the embedded Tomcat
    4.8.2 Shutting down GraphDB in the embedded Tomcat
    4.8.3 How to start/stop via non-embedded Tomcat (or other application server)
    4.8.4 Shutting down a repository instance from the JMX interface
  4.9 Managing Repositories
    4.9.1 Changing repository parameters
    4.9.2 Renaming a repository
  4.10 Access Rights and Security
    4.10.1 Using the GraphDB Workbench
    4.10.2 Using an HTTP authentication and the Sesame server’s deployment descriptor
      4.10.2.1 Operations
      4.10.2.2 Security constraints and roles
      4.10.2.3 User accounts
    4.10.3 Programmatic authentication
  4.11 Backing up and Recovering a Repository
    4.11.1 Backing up a repository
    4.11.2 Restoring a repository
  4.12 Query and Transaction Monitoring
    4.12.1 Query monitoring and termination
      4.12.1.1 Preventing long running queries
    4.12.2 Terminating a transaction
  4.13 Diagnosing and Reporting Critical Errors
    4.13.1 Logs
    4.13.2 Report script
      4.13.2.1 Requirements
      4.13.2.2 Example
5 Usage
  5.1 Workbench User Guide
    5.1.1 Installation, start-up and shut down
    5.1.2 Checking your setup
    5.1.3 Using the Workbench
      5.1.3.1 Managing locations
      5.1.3.2 Managing repositories
    5.1.4 Loading data into a repository
      5.1.4.1 Import settings
      5.1.4.2 Four ways to import data
    5.1.5 Executing queries
    5.1.6 Exporting data
    5.1.7 Viewing and editing resources
    5.1.8 Namespace management
    5.1.9 Context view
    5.1.10 Connector management
    5.1.11 Users and access management
      5.1.11.1 Free access
    5.1.12 REST API
    5.1.13 Configuration properties
  5.2 Using GraphDB with the Sesame API
    5.2.1 Sesame Application Programming Interface (API)
      5.2.1.1 Using the Sesame API to access a local GraphDB repository
      5.2.1.2 Using the Sesame API to access a remote GraphDB repository
    5.2.2 Managing repositories with the Sesame Workbench
    5.2.3 SPARQL endpoint
    5.2.4 Graph Store HTTP Protocol
  5.3 Using GraphDB with Jena
    5.3.1 Installing GraphDB with Jena
  5.4 GraphDB Connectors
    5.4.1 Lucene GraphDB Connector
      5.4.1.1 Overview and features

Recommended publications
  • Mapping Spatiotemporal Data to RDF: a SPARQL Endpoint for Brussels
    International Journal of Geo-Information Article Mapping Spatiotemporal Data to RDF: A SPARQL Endpoint for Brussels Alejandro Vaisman 1, * and Kevin Chentout 2 1 Instituto Tecnológico de Buenos Aires, Buenos Aires 1424, Argentina 2 Sopra Banking Software, Avenue de Tevuren 226, B-1150 Brussels, Belgium * Correspondence: [email protected]; Tel.: +54-11-3457-4864 Received: 20 June 2019; Accepted: 7 August 2019; Published: 10 August 2019 Abstract: This paper describes how a platform for publishing and querying linked open data for the Brussels Capital region in Belgium is built. Data are provided as relational tables or XML documents and are mapped into the RDF data model using R2RML, a standard language that allows defining customized mappings from relational databases to RDF datasets. In this work, data are spatiotemporal in nature; therefore, R2RML must be adapted to allow producing spatiotemporal Linked Open Data. Data generated in this way are used to populate a SPARQL endpoint, where queries are submitted and the result can be displayed on a map. This endpoint is implemented using Strabon, a spatiotemporal RDF triple store built by extending the RDF store Sesame. The first part of the paper describes how R2RML is adapted to allow producing spatial RDF data and to support XML data sources. These techniques are then used to map data about cultural events and public transport in Brussels into RDF. Spatial data are stored in the form of stRDF triples, the format required by Strabon. In addition, the endpoint is enriched with external data obtained from the Linked Open Data Cloud, from sites like DBpedia, Geonames, and LinkedGeoData, to provide context for analysis.
  • Ontotext Platform Documentation Release 3.4 Ontotext
    Ontotext Platform Documentation Release 3.4. Ontotext, Apr 16, 2021. CONTENTS: 1 Overview; 1.1 30,000ft; 1.2 Layered View; 1.2.1 Application Layer; 1.2.1.1 Ontotext Platform Workbench; 1.2.1.2 GraphDB Workbench; 1.2.2 Service Layer; 1.2.2.1 Semantic Objects (GraphQL); 1.2.2.2 GraphQL Federation (Apollo GraphQL Federation); 1.2.2.3 Text Analytics Service; 1.2.2.4 Annotation Service; 1.2.3 Data Layer; 1.2.3.1 Graph Database (GraphDB); 1.2.3.2 Semantic Object Schema Storage (MongoDB); 1.2.3.3 Semantic Objects for MongoDB; 1.2.3.4 Semantic Object for Elasticsearch; 1.2.4 Authentication and Authorization; 1.2.4.1 FusionAuth; 1.2.4.2 Semantic Objects RBAC; 1.2.5 Kubernetes; 1.2.5.1 Ingress and GW; 1.2.6 Operation Layer; 1.2.6.1 Health Checking; 1.2.6.2 Telegraf
  • Semantics Developer's Guide
    MarkLogic Server Semantic Graph Developer’s Guide. MarkLogic 10, May, 2019. Last Revised: 10.0-8, October, 2021. Copyright © 2021 MarkLogic Corporation. All rights reserved. Table of Contents: 1.0 Introduction to Semantic Graphs in MarkLogic; 1.1 Terminology; 1.2 Linked Open Data; 1.3 RDF Implementation in MarkLogic; 1.3.1 Using RDF in MarkLogic; 1.3.1.1 Storing RDF Triples in MarkLogic; 1.3.1.2 Querying Triples; 1.3.2 RDF Data Model; 1.3.3 Blank Node Identifiers; 1.3.4 RDF Datatypes; 1.3.5 IRIs and Prefixes; 1.3.5.1 IRIs
  • Data Platforms Map from 451 Research
    [Map graphic: data platforms landscape, January 2016. Recoverable labels include a relational zone and a non-relational zone, with key entries "General purpose" and "Specialist analytic"; the positions of the individual product names are not recoverable from the extracted text.]
  • RDF Query Languages Need Support for Graph Properties
    RDF Query Languages Need Support for Graph Properties Renzo Angles1, Claudio Gutierrez1, and Jonathan Hayes1,2 1 Dept. of Computer Science, Universidad de Chile 2 Dept. of Computer Science, Technische Universität Darmstadt, Germany {rangles,cgutierr,jhayes}@dcc.uchile.cl Abstract. This short paper discusses the need to include into RDF query languages the ability to directly query graph properties from RDF data. We study the support that current RDF query languages give to these features, to conclude that they are currently not supported. We propose a set of basic graph properties that should be added to RDF query languages and provide evidence for this view. 1 Introduction One of the main features of the Resource Description Framework (RDF) is its ability to interconnect information resources, resulting in a graph-like structure for which connectivity is a central notion [GLMB98]. As we will argue, basic concepts of graph theory such as degree, path, and diameter play an important role for applications that involve RDF querying. Considering the fact that the data model influences the set of operations that should be provided by a query language [HBEV04], it follows the need for graph operations support in RDF query languages. For example, the query “all relatives of degree 1 of Alice”, submitted to a genealogy database, amounts to retrieving the nodes adjacent to a resource. The query “are suspects A and B related?”, submitted to a police database, asks for any path connecting these resources in the (RDF) graph that is stored in this database. The query “what is the Erdős number of Alberto Mendelzon”, submitted to (a RDF version of) DBLP, asks simply for the length of the shortest path between the nodes representing Erdős and Mendelzon.
  • Open Web Ontobud: an Open Source RDF4J Frontend
    Open Web Ontobud: An Open Source RDF4J Frontend Francisco José Moreira Oliveira University of Minho, Braga, Portugal [email protected] José Carlos Ramalho Department of Informatics, University of Minho, Braga, Portugal [email protected] Abstract Nowadays, we deal with increasing volumes of data. A few years ago, data was isolated, which did not allow communication or sharing between datasets. We live in a world where everything is connected, and our data mimics this. Data model focus changed from a square structure like the relational model to a model centered on the relations. Knowledge graphs are the new paradigm to represent and manage this new kind of information structure. Along with this new paradigm, a new kind of database emerged to support the new needs, graph databases! Although there is an increasing interest in this field, only a few native solutions are available. Most of these are commercial, and the ones that are open source have poor interfaces, and for that, they are a little distant from end-users. In this article, we introduce Ontobud, and discuss its design and development. A Web application that intends to improve the interface for one of the most interesting frameworks in this area: RDF4J. RDF4J is a Java framework to deal with RDF triples storage and management. Open Web Ontobud is an open source RDF4J web frontend, created to reduce the gap between end users and the RDF4J backend. We have created a web interface that enables users with a basic knowledge of OWL and SPARQL to explore ontologies and extract information from them.
  • A Performance Study of RDF Stores for Linked Sensor Data
    Semantic Web 1 (0) 1–5, IOS Press. A Performance Study of RDF Stores for Linked Sensor Data. Hoan Nguyen Mau Quoc a,*, Martin Serrano b, Han Nguyen Mau c, John G. Breslin d, Danh Le Phuoc e. a Insight Centre for Data Analytics, National University of Ireland Galway, Ireland. E-mail: [email protected] b Insight Centre for Data Analytics, National University of Ireland Galway, Ireland. E-mail: [email protected] c Information Technology Department, Hue University, Viet Nam. E-mail: [email protected] d Confirm Centre for Smart Manufacturing and Insight Centre for Data Analytics, National University of Ireland Galway, Ireland. E-mail: [email protected] e Open Distributed Systems, Technical University of Berlin, Germany. E-mail: [email protected] Abstract. The ever-increasing amount of Internet of Things (IoT) data emanating from sensor and mobile devices is creating new capabilities and unprecedented economic opportunity for individuals, organisations and states.
  • Storage, Indexing, Query Processing, And
    Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 23 May 2020 doi:10.20944/preprints202005.0360.v1 Storage, Indexing, Query Processing, and Benchmarking in Centralized and Distributed RDF Engines: A Survey Waqas Ali Department of Computer Science and Engineering, School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, Shanghai, China [email protected] Muhammad Saleem Agile Knowledge and Semantic Web (AKWS), University of Leipzig, Leipzig, Germany [email protected] Bin Yao Department of Computer Science and Engineering, School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, Shanghai, China [email protected] Axel-Cyrille Ngonga Ngomo University of Paderborn, Paderborn, Germany [email protected] ABSTRACT The recent advancements of the Semantic Web and Linked Data have changed the working of the traditional web. There is a huge adoption of the Resource Description Framework (RDF) format for saving of web-based data. This massive adoption has paved the way for the development of various centralized and distributed RDF processing engines. These engines employ different mechanisms to implement key components of the query processing engines such as data storage, indexing, language support, and query execution. All these components govern how queries are executed and can have a substantial effect on the query runtime. For example, the storage of RDF data in various ways significantly affects the data storage space required and the query runtime performance. The type of indexing approach used in RDF engines is key for fast data lookup. The type of the underlying querying language (e.g., SPARQL or SQL) used for query execution is a key optimization component of the RDF storage solutions.
  • Benchmarking RDF Query Engines: the LDBC Semantic Publishing Benchmark
    Benchmarking RDF Query Engines: The LDBC Semantic Publishing Benchmark V. Kotsev1, N. Minadakis2, V. Papakonstantinou2, O. Erling3, I. Fundulaki2, and A. Kiryakov1 1 Ontotext, Bulgaria 2 Institute of Computer Science-FORTH, Greece 3 OpenLink Software, Netherlands Abstract. The Linked Data paradigm which is now the prominent enabler for sharing huge volumes of data by means of Semantic Web technologies, has created novel challenges for non-relational data management technologies such as RDF and graph database systems. Benchmarking, which is an important factor in the development of research on RDF and graph data management technologies, must address these challenges. In this paper we present the Semantic Publishing Benchmark (SPB) developed in the context of the Linked Data Benchmark Council (LDBC) EU project. It is based on the scenario of the BBC media organisation which makes heavy use of Linked Data Technologies such as RDF and SPARQL. In SPB a large number of aggregation agents provide the heavy query workload, while at the same time a steady stream of editorial agents execute a number of update operations. In this paper we describe the benchmark’s schema, data generator, workload and report the results of experiments conducted using SPB for the Virtuoso and GraphDB RDF engines. Keywords: RDF, Linked Data, Benchmarking, Graph Databases 1 Introduction Non-relational data management is emerging as a critical need in the era of a new data economy where heterogeneous, schema-less, and complexly structured data from a number of domains are published in RDF. In this new environment where the Linked Data paradigm is now the prominent enabler for sharing huge volumes of data, several data management challenges are present and which RDF and graph database technologies are called to tackle.
  • GraphDB Free Documentation Release 8.6
    GraphDB Free Documentation Release 8.6. Ontotext, Sep 24, 2018. CONTENTS: 1 General; 1.1 About GraphDB; 1.2 Architecture & components; 1.2.1 Architecture; 1.2.1.1 RDF4J; 1.2.1.2 The Sail API; 1.2.2 Components; 1.2.2.1 Engine; 1.2.2.2 Connectors; 1.2.2.3 Workbench; 1.3 GraphDB Free; 1.3.1 Comparison of GraphDB Free and GraphDB SE; 1.4 Connectors; 1.5 Workbench; 2 Quick start guide; 2.1 Run GraphDB as a desktop installation; 2.1.1 On Windows; 2.1.2 On Mac OS; 2.1.3 On Linux; 2.1.4 Configuring GraphDB; 2.1.5 Stopping GraphDB; 2.2 Run GraphDB as a stand-alone server; 2.2.1 Running GraphDB; 2.2.1.1 Options; 2.2.2 Configuring GraphDB; 2.2.2.1 Paths and network settings
  • Web and Semantic Web Query Languages: a Survey
    Web and Semantic Web Query Languages: A Survey James Bailey1, François Bry2, Tim Furche2, and Sebastian Schaffert2 1 NICTA Victoria Laboratory Department of Computer Science and Software Engineering The University of Melbourne, Victoria 3010, Australia http://www.cs.mu.oz.au/~jbailey/ 2 Institute for Informatics, University of Munich, Oettingenstraße 67, 80538 München, Germany http://pms.ifi.lmu.de/ Abstract. A number of techniques have been developed to facilitate powerful data retrieval on the Web and Semantic Web. Three categories of Web query languages can be distinguished, according to the format of the data they can retrieve: XML, RDF and Topic Maps. This article introduces the spectrum of languages falling into these categories and summarises their salient aspects. The languages are introduced using common sample data and query types. Key aspects of the query languages considered are stressed in a conclusion. 1 Introduction The Semantic Web Vision A major endeavour in current Web research is the so-called Semantic Web, a term coined by W3C founder Tim Berners-Lee in a Scientific American article describing the future of the Web [37]. The Semantic Web aims at enriching Web data (that is usually represented in (X)HTML or other XML formats) by meta-data and (meta-)data processing specifying the “meaning” of such data and allowing Web based systems to take advantage of “intelligent” reasoning capabilities. To quote Berners-Lee et al. [37]: “The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.” The Semantic Web meta-data added to today’s Web can be seen as advanced semantic indices, making the Web into something rather like an encyclopedia.
  • XML-Based RDF Data Management for Efficient Query Processing
    XML-Based RDF Data Management for Efficient Query Processing Mo Zhou, Indiana University, USA, [email protected]; Yuqing Wu, Indiana University, USA, [email protected] Abstract The Semantic Web, which represents a web of knowledge, offers new opportunities to search for knowledge and information. To harvest such search power requires robust and scalable data repositories that can store RDF data and support efficient evaluation of SPARQL queries. Most of the existing RDF storage techniques rely on relation model and relational database technologies for these tasks. They either keep the RDF data as triples, or decompose it into multiple relations. The mis-match between the graph model of the RDF data and the rigid 2D tables of relational model jeopardizes the scalability of such repositories and frequently renders a repository inefficient for some types of data and queries. We propose to decompose RDF graph into a forest of semantically correlated XML trees, store them in an XML repository and rewrite [...] SPARQL [17] is a W3C recommended RDF query language. A SPARQL query contains a collection of triples with variables called simple access patterns which form graph patterns for describing query requirements. For example, the SPARQL query SELECT ?t WHERE {?p type Person. ?r ?x ?t. ?p name ?n. ?p write ?r} retrieves all properties of reviews written by a person whose name is known. The needs to develop applications on the Semantic Web and support search in RDF graphs call for RDF repositories to be reliable, robust and efficient in answering SPARQL queries. As in the context of RDB and XML, the selection of storage models is critical to a data repository as it is the dominating factor to determine how to evaluate queries and how the system behaves when it scales up.