The Web of Linked Data

WebDB 2010, June 6th, 2010, Indianapolis, USA
The Web of Linked Data
A global public dataspace on the Web
Christian Bizer, Freie Universität Berlin
Christian Bizer: The Web of Linked Data (6/6/2010)

Outline
1. Foundations of Dataspaces and Linked Data
   Where do they overlap?
2. The Web of Linked Data
   What data is out there?
3. Linked Data Applications
   What is being done with the data?
4. Remarks on
   - Identity
   - Self-descriptive Data
   - Pay-as-you-go Integration

The Dataspace Vision
Alternative to classic data integration systems in order to cope with the growing number of data sources.
Properties of dataspaces:
- may contain any kind of data (structured, semi-structured, unstructured)
- require no upfront investment into a global schema
- provide for data-coexistence
- give best-effort answers to queries
- rely on pay-as-you-go data integration
Franklin, M., Halevy, A., and Maier, D.: From Databases to Dataspaces: A New Abstraction for Information Management, SIGMOD Rec. 2005.

Dataspace Architecture
[Figure: dataspace architecture diagram]
Source: Franklin et al.: From Databases to Dataspaces, SIGMOD Rec. 2005.

Linked Data Principles
Set of best practices for publishing structured data on the Web in accordance with the general architecture of the Web.
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful RDF information.
4. Include RDF statements that link to other URIs so that they can discover related things.
Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006

Architecture of the classic Web
Single global information space
[Figure: Web browsers and search engines on top of HTML documents in sources A, B, and C, retrieved over HTTP and connected by hyperlinks]
Small set of simple standards
1. HTML as document format
2. HTTP URLs as globally unique IDs and retrieval mechanism
3. Hyperlinks to connect everything

Web 2.0 APIs and Mashups
No single global dataspace
[Figure: a mashup built on top of separate Web APIs over sources A, B, C, and D]
Shortcomings
1. APIs have proprietary interfaces
2. Mashups are based on a fixed set of data sources
3. You cannot set hyperlinks between data items within different APIs

Web APIs slice the Web into Walled Gardens
Image: Bob Jagensdorf, http://flickr.com/photos/darwinbell/, CC-BY

Linked Data
Extend the Web with a single global dataspace
1. by using RDF to publish structured data on the Web
2. by setting links between data items within different data sources
[Figure: RDF data in sources A, B, C, D, and E connected by links]

The RDF Data Model
[Figure: RDF graph with the following statements]
pd:cygri rdf:type foaf:Person
pd:cygri foaf:name "Richard Cyganiak"
pd:cygri foaf:based_near dbpedia:Berlin
Flexible graph-based data model.

Entities are identified with HTTP URIs
[Figure: the same RDF graph]
HTTP URIs take the role of global primary keys.
pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri
dbpedia:Berlin = http://dbpedia.org/resource/Berlin

Resolving URIs over the Web
[Figure: the same RDF graph, extended with the following statements]
dbpedia:Berlin dp:population 3.405.259
dbpedia:Berlin skos:subject dp:Cities_in_Germany
The HTTP protocol brings together identification and retrieval again.

Following Links deeper into the Web
[Figure: the graph from the previous slide, extended with further resources reached by following links]
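
The graph on the preceding slides can be written down as plain triples, and following a link amounts to taking the object of one triple and looking it up as a subject in turn. The sketch below is only an illustration under stated assumptions: the in-memory TRIPLES list stands in for what would really be HTTP lookups of each URI, the function names (expand, describe, follow_links) are hypothetical, and only the two prefix expansions shown on the slides are included.

```python
from collections import deque

# Prefix expansions given on the slides. Other prefixes (rdf, foaf, skos, dp)
# are kept as compact names because the slides do not expand them.
PREFIXES = {
    "pd": "http://richard.cyganiak.de/foaf.rdf#",
    "dbpedia": "http://dbpedia.org/resource/",
}

# Triples read off the slide figures as (subject, predicate, object).
# Literal values are written in double quotes; everything else is a resource.
TRIPLES = [
    ("pd:cygri", "rdf:type", "foaf:Person"),
    ("pd:cygri", "foaf:name", '"Richard Cyganiak"'),
    ("pd:cygri", "foaf:based_near", "dbpedia:Berlin"),
    ("dbpedia:Berlin", "dp:population", '"3405259"'),
    ("dbpedia:Berlin", "skos:subject", "dp:Cities_in_Germany"),
]


def expand(name):
    """Expand a compact name to a full HTTP URI when its prefix is known."""
    prefix, _, local = name.partition(":")
    if prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return name


def is_resource(term):
    """Literals are quoted in this sketch; resources are not."""
    return not term.startswith('"')


def describe(subject):
    """Hypothetical stand-in for an HTTP lookup: statements about a subject."""
    return [(p, o) for s, p, o in TRIPLES if s == subject]


def follow_links(start, max_hops):
    """Breadth-first walk over resource-valued objects, up to max_hops links."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        node, hops = queue.popleft()
        visited.append(node)
        if hops == max_hops:
            continue
        for _predicate, obj in describe(node):
            if is_resource(obj) and obj not in seen:
                seen.add(obj)
                queue.append((obj, hops + 1))
    return visited


if __name__ == "__main__":
    for resource in follow_links("pd:cygri", 2):
        print(expand(resource))
```

Starting from pd:cygri, one hop reaches dbpedia:Berlin and a second hop reaches dp:Cities_in_Germany, which mirrors how the figures grow from slide to slide.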
Resources reached by following links in the figure: dbpedia:Hamburg and dbpedia:Muenchen, both connected to dp:Cities_in_Germany via skos:subject.

The Disco – Hyperdata Browser
[Screenshots: Disco browser]

Properties of the Web of Linked Data
- Global, distributed dataspace built on a simple set of standards: RDF, URIs, HTTP
- Entities are connected by links, creating a global data graph that spans data sources and enables the discovery of new data sources.
- Provides for data-coexistence
  - Everyone can publish data to the Web of Linked Data
  - Everyone can express their personal view on things
  - Everybody can use the schemata that they like for this

2. Linked Data Deployment on the Web
Is this real?
[Figure: RDF data in sources A, B, C, D, and E connected by links]

W3C Linking Open Data Project
Grassroots community effort to
- publish existing open license datasets as Linked Data on the Web
- interlink things between different data sources

LOD Datasets on the Web: May 2007
- Over 500 million RDF triples
- Around 120,000 RDF links between data sources

LOD Datasets on the Web: September 2008
[Figure: LOD dataset diagram]

LOD Datasets on the Web: July 2009
- Over 13.1 billion RDF triples
- Over 142 million RDF links between data sources

DBpedia – An Interlinking Hub in the Web of Data
[Figure: DBpedia and the datasets linked to it]

DBpedia
- community effort to extract structured information from Wikipedia
- provides data about 3.4 million things
  - 312,000 persons
  - 140,000 organizations
  - 413,000 places
  - 94,000 music albums
  - 49,000 films
  - 146,000 species
  - …
- provides identifiers for many common things, for example http://dbpedia.org/resource/Calgary
- overlaps with many other data sources on the Web

The LOD effort is losing track with the diagram :-)

Uptake in Life Sciences
- W3C Linking Open Drug Data Effort
- Bio2RDF Project
- Allen Brain Atlas

Uptake in the Libraries Community
Institutions publishing Linked Data
- Library of Congress (subject headings)
- German National Library (PND dataset and subject headings)
- Swedish National Library (Libris catalog)
- Hungarian National Library (OPAC and Digital Library)
- German Central Library of Economics (subject headings)
Workshop: Semantic Web in Bibliotheken (SWIB09), Cologne, 24 and 25 November 2009, http://www.swib09.de/
W3C Library Linked Data Incubator Group
Open Archives Object Reuse and Exchange (OAI-ORE) Standard

Uptake in the Media Industry
Publishers publish data as RDF/XML and/or embed data into HTML using RDFa.

The Structural Continuum
The Web of Linked Data is interwoven with the classic Web.
- Unstructured data: HTML
- Semi-structured data: RDFa embedded into HTML
- Structured data: RDF/XML
Services using named entity recognition to annotate texts with Linked Data URIs
- Open Calais (Thomson Reuters) for news
- Zemanta (startup) for blog posts

3. Linked Data Applications
What can I do with this?
[Figure: Linked Data browsers, Linked Data mashups, and search engines on top of things in sources A, B, C, D, and E connected by typed links]

Linked Data Browsers
Provide for navigating between data sources in order to explore the dataspace.
- Tabulator Browser (MIT, USA)
- Marbles (FU Berlin, DE)
- OpenLink RDF Browser (OpenLink, UK)
- Zitgist RDF Browser (Zitgist, USA)
- Disco Hyperdata Browser (FU Berlin, DE)
- Fenfire (DERI, Ireland)

DBpedia Mobile
- Displays DBpedia data on a map
- Provides for navigating into other data sources

Web of Data Search Engines
Crawl the dataspace and provide best-effort query answers over crawled data.
- Falcons (IWS, China)
- Sig.ma (DERI, Ireland)
- Swoogle (UMBC, USA)
- VisiNav (DERI, Ireland)
- Watson (Open University, UK)

What are the big players doing?
Yahoo! and Google have started to crawl Linked Data in its RDFa serialization as well as Microformats.
Yahoo!
- provides access to crawled data through the Yahoo BOSS API
- is using the data within Yahoo Search Monkey to make search results more useful and visually appealing
Google
- uses crawled RDF data for its Social Graph API
- uses crawled data to enhance search result snippets for reviews and people

Yahoo! Search Monkey
[Screenshot: Yahoo! Search Monkey]

Facebook’s Open Graph Protocol
Facebook imports RDFa data from external web sites.
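
To make the last point concrete: Open Graph data is embedded in a page as meta elements whose property attribute carries an og: name and whose content attribute carries the value. The sketch below reads such elements with Python's standard html.parser module. It is a minimal illustration, not a full RDFa processor and not a description of how Facebook processes pages; the sample page, its example.org URL, and the names OpenGraphParser and extract_open_graph are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical sample page carrying three Open Graph properties.
SAMPLE_PAGE = """
<html>
<head>
<title>Example film page</title>
<meta property="og:title" content="Example Film">
<meta property="og:type" content="movie">
<meta property="og:url" content="http://example.org/films/1">
<meta name="description" content="Not an Open Graph property">
</head>
<body></body>
</html>
"""


class OpenGraphParser(HTMLParser):
    """Collects og:* properties from meta elements into a dictionary."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property")
        content = attributes.get("content")
        # Keep only meta elements that carry an og: property and a value.
        if prop and prop.startswith("og:") and content is not None:
            self.properties[prop] = content


def extract_open_graph(html):
    """Return the og:* property/value pairs found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


if __name__ == "__main__":
    for name, value in extract_open_graph(SAMPLE_PAGE).items():
        print(name, "=", value)
```

The ordinary description meta element is skipped because it has no property attribute, which is the distinction a consumer of embedded data relies on.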
