RISIS Linked Data Course Overview of the Course: Friday
Day 2
RISIS Linked Data Course

Overview of the Course: Friday

9:00-9:15    Coffee
9:15-9:45    Introduction & Reflection
10:30-11:30  SPARQL Query Language
11:30-11:45  Coffee
11:45-12:30  SPARQL Hands-on
12:30-13:30  Lunch
13:30-14:00  Linksets & Lenses for Aligning Datasets
14:00-14:30  Linked Data Publishing
14:30-14:45  Coffee
14:45-16:00  Data Publishing Hands-On
16:00-17:00  Actual Research

SPARQL Query Language
Rinke Hoekstra, [email protected]

Questions
• How can we query Linked Data?
• What is the syntax of SPARQL?
• How can we use SPARQL?

Goals
• Know the syntax of SPARQL
• Understand how to query Linked Data on the Web
• Know about standard vocabularies in Linked Data

Deep Breath by Tomas A

RDF - Where is it?
• As separate files, e.g. as .ttl, .rdf, .nt, etc.
• Integrated with Web pages (RDFa/Microdata)
• Accessible through content negotiation
    curl -L -H "Accept: text/turtle" "http://dbpedia.org/resource/Inside_Out_(2015_film)"
    curl -L -H "Accept: text/html" "http://dbpedia.org/resource/Inside_Out_(2015_film)"
• In RDF-specific databases called triple stores
  • A standard HTTP REST API for querying using the SPARQL language
  • This API is called a SPARQL endpoint

SPARQL - The SPARQL Protocol and RDF Query Language
• The SPARQL Protocol specifies how to query triple stores over HTTP
• The SPARQL Query Language specifies a syntax for writing queries
• There are six types of queries: SELECT, CONSTRUCT, INSERT, DELETE, ASK, DESCRIBE
http://www.w3.org/TR/sparql11-query

SPARQL - Query Syntax
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?city WHERE {
      ?city dbo:areaCode "020".
    } LIMIT 10
PREFIX: the namespace prefixes used in the SPARQL query
SELECT: the entities (variables) you want to return
WHERE: the (sub)graph you want to get information from ...
including additional constraints on results (using operators)

SPARQL - Graph Pattern
• The WHERE clause specifies a graph pattern
  • … that should be matched
  • … can match multiple times
• A graph pattern is an RDF graph with some nodes & edges as variables
[Diagram: a graph pattern in which variable nodes (?) are connected by rdf:type to dbo:Country, by dbo:capital to another variable node, and to the literal "020"^^xsd:string]

SPARQL - Triple Patterns
• A graph pattern consists of multiple triple patterns
• A triple pattern is a triple with zero or more variables
    ?x dbo:capital dbpedia:Amsterdam .
    ?x dbo:capital ?y .
    ?x dbo:areaCode "020".
    ?x ?p ?y
Example data:
    dbpedia:Netherlands rdf:type dbo:Country ;
        dbo:capital dbpedia:Amsterdam .
    dbpedia:Amsterdam dbo:areaCode "020".
[Diagram: dbpedia:The_Netherlands has rdf:type dbo:Country and dbo:capital dbpedia:Amsterdam; dbpedia:Amsterdam has dbo:areaCode "020"^^xsd:string]

SPARQL - Triple patterns form a conjunction
Every triple pattern should match
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?x WHERE {
      ?x dbo:capital ?y .
      ?y dbo:areaCode "020".
    } LIMIT 10
[Diagram: the pattern matched against data for dbpedia:The_Netherlands, dbpedia:Amsterdam and dbpedia:Westminster, the latter two both with dbo:areaCode "020"^^xsd:string; one candidate is marked ✖ as not matching]

SPARQL - But we can also use disjunctions
At least one of the graph patterns should match
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?y WHERE {
      { ?y dbo:areaCode "020". }
      UNION
      { ?y dbo:areaCode "010". }
    } LIMIT 10
[Diagram: dbpedia:Amsterdam has dbo:areaCode "020"^^xsd:string; dbpedia:Rotterdam has dbo:areaCode "010"^^xsd:string]

SPARQL - Optional graphs
A part of the graph pattern is optional, and does not need to match
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?x ?y WHERE {
      ?y dbo:areaCode "020".
      OPTIONAL { ?x dbo:capital ?y .
      }
    } LIMIT 10
[Diagram: dbpedia:Amsterdam (dbo:capital of dbpedia:The_Netherlands, which has rdf:type dbo:Country) and dbpedia:Westminster, each with dbo:areaCode "020"^^xsd:string]

Filtering query results
Tests in the FILTER clause must evaluate to true for each matching subgraph
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?y WHERE {
      ?y dbo:areaCode "020".
      ?y dbo:populationTotal ?x .
      FILTER (?x > 500000)
    }
[Diagram: dbpedia:Amsterdam (dbo:areaCode "020"^^xsd:string, dbo:populationTotal "790654"^^xsd:integer) passes the filter; dbpedia:Borehamwood (dbo:areaCode "020"^^xsd:string, dbo:populationTotal "28546"^^xsd:integer) is rejected (✖)]

SPARQL - Solution modifiers
Sorting using ORDER BY
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
      ?place dbo:areaCode "020";
             rdfs:label ?label .
    } ORDER BY DESC (?label)
DISTINCT results
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?name WHERE {
      ?country rdf:type dbo:Country ;
               dbo:capital ?ams .
      ?ams dbo:country dbpedia:Netherlands ;
           rdfs:label ?name .
      FILTER (lang(?name) = "en" )
    }
LIMITing the number of results
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
      ?y dbo:areaCode "020";
         rdfs:label ?label .
      FILTER (lang(?label) = "en" )
    } ORDER BY DESC(?label) LIMIT 10

SPARQL - Property Paths
Useful when you:
• Are not interested in intermediate variable bindings
• Want to walk transitively over the same property relation
• Are interested in alternatives without having to write a UNION
Both queries below use the same prefixes:
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
With a UNION:
    SELECT DISTINCT ?name WHERE {
      ?country rdf:type dbo:Country ;
               dbo:capital ?city .
      { ?city rdfs:label ?name . }
      UNION
      { ?city skos:prefLabel ?name . }
      FILTER (lang(?name) = "en" )
    }
With a property path:
    SELECT DISTINCT ?name WHERE {
      ?country rdf:type dbo:Country ;
               dbo:capital/(rdfs:label|skos:prefLabel) ?name .
      FILTER (lang(?name) = "en" )
    }

• SELECT returns a table with variable bindings
• CONSTRUCT returns an RDF graph
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX ex: <http://example.com/vocab/>
    CONSTRUCT { ?y ex:kengetal "020". }
    WHERE { ?y dbo:areaCode "020". }
• ASK returns true or false
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    ASK WHERE { dbr:Amsterdam dbo:areaCode "020". }
• DESCRIBE returns an RDF graph
    PREFIX dbr: <http://dbpedia.org/resource/>
    DESCRIBE dbr:Amsterdam
• INSERT is like CONSTRUCT, but inserts the graph into the triple store (SPARQL Update)
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX ex: <http://example.com/vocab/>
    INSERT { ?y ex:kengetal "020". }
    WHERE { ?y dbo:areaCode "020". }

SPARQL - More features
• Named Graphs provide context for statements they contain
    GRAPH <http://example.com> { … }
• Negation
    MINUS
• Functions
    IF, BOUND, EXISTS, NOT EXISTS, IN, NOT IN
    &&, ||, =, <, >, >=, <=
    isIRI, isBlank, isLiteral, isNumeric, str, lang, datatype
    STRLEN, SUBSTR, CONTAINS, CONCAT, REGEX
    now, year, month, day, hours, minutes, seconds
    … and more
• Aggregates
    COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT and SAMPLE
• Grouping and group aggregates
    GROUP BY and HAVING
• Provide data inline
    VALUES
http://www.w3.org/TR/sparql11-query
http://yasgui.org

SPARQL - Querying the Linked Data Cloud
• DBPedia
• LOD Cache: DBPedia + Geonames + NY Times + MusicBrainz + ...
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    SELECT DISTINCT ?x ?y WHERE {
      ?x dbo:areaCode "020".
      ?y owl:sameAs ?x .
      FILTER (?x != ?y)
    }
• Same query, different results!

SPARQL - Summary
• Linked Data is usually stored in triple stores
• SPARQL is the query language for the Web of Linked Data
• Queries are sent to SPARQL endpoints over HTTP
• Queries describe graph patterns with variables
• Graph patterns match (parts of) the RDF graphs stored in the triple store
• Results are returned as a table with variable bindings (JSON, XML, TSV, CSV…)

Linked (Open) Data Lifecycle
[Diagram: lifecycle stages: Extraction, Storage/Querying, Authoring, Interlinking, Enrichment, Quality Analysis, Evolution, Exploration]
http://stack.linkeddata.org/
Ali Khalili, Designing Linked Data Applications

RDF - Vocabularies
• The RDF vocabulary and RDFS: reserved terms needed for the data model
• Friend of a Friend (FOAF): persons and relations between persons
• Dublin Core (DC and DCTerms): bibliographic attributes (author, title, etc.)
• DBPedia Ontology: DBPedia is at the heart of the Web of Data
• Geonames: locations
• Data Cube (QB): statistical data
• Simple Knowledge Organization System (SKOS): concept hierarchies and mappings
• Web Ontology Language (OWL): formal constraints on class membership

RDF - The RDF Vocabulary
Namespace: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
Usual prefix: rdf (NB: needs to be declared, like all others)
Important elements
    rdf:type … links a resource to a type
    rdfs:Resource … is the type of all resources (NB: defined in the RDFS namespace, not the rdf namespace)
    rdf:Property … is the type of all predicates
Examples
    geo:Amsterdam rdf:type rdfs:Resource .
    geo:containedIn a rdf:Property .
    rdf:type a rdf:Property .
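As a rough illustration of how a triple pattern such as "?p rdf:type rdf:Property" is matched against statements like the examples above, here is a minimal Python sketch. It is not how a triple store is implemented: prefixed names are kept as plain strings instead of full IRIs, and the names TRIPLES, is_variable and match_pattern are made up for this example.

```python
# Prefixed names are kept as plain strings; a real store would expand them to full IRIs.
RDF_TYPE = "rdf:type"

# Statements modelled on the slide examples ("a" in Turtle is shorthand for rdf:type).
TRIPLES = [
    ("geo:Amsterdam", RDF_TYPE, "rdfs:Resource"),
    ("geo:containedIn", RDF_TYPE, "rdf:Property"),
    (RDF_TYPE, RDF_TYPE, "rdf:Property"),
]


def is_variable(term):
    """A term starting with '?' is treated as a variable, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triples):
    """Return one dict of variable bindings per triple that matches the pattern."""
    solutions = []
    for triple in triples:
        bindings = {}
        for p_term, t_term in zip(pattern, triple):
            if is_variable(p_term):
                # A variable used twice must bind to the same value both times.
                if bindings.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                # A constant in the pattern must equal the term in the triple.
                break
        else:
            solutions.append(bindings)
    return solutions


# Which resources are declared to be properties?
properties = match_pattern(("?p", RDF_TYPE, "rdf:Property"), TRIPLES)
print([b["?p"] for b in properties])  # ['geo:containedIn', 'rdf:type']
```

Each solution is a dictionary of variable bindings, which corresponds to one row of a SELECT result table.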

RDFS - The RDF Schema Vocabulary
Namespace: <http://www.w3.org/2000/01/rdf-schema#>
Usual prefix: rdfs (NB: needs to be declared, like all others)
Important elements
    rdfs:label … links a resource to its name
    rdfs:Class … is the type of all "types"
    rdfs:subClassOf … inheritance over types
    rdfs:subPropertyOf … inheritance over properties
Examples
    geo:Amsterdam rdfs:label "Amsterdam"@nl .
    geo:containedIn rdfs:subPropertyOf