The Web of Linked Data


WebDB 2010, June 6th, 2010, Indianapolis, USA

The Web of Linked Data
A global public dataspace on the Web

Christian Bizer
Freie Universität Berlin

Christian Bizer: The Web of Linked Data (6/6/2010)

Outline

1. Foundations of Dataspaces and Linked Data
   Where do they overlap?
2. The Web of Linked Data
   What data is out there?
3. Linked Data Applications
   What is being done with the data?
4. Remarks on
   • Identity
   • Self-descriptive Data
   • Pay-as-you-go Integration

The Dataspace Vision

An alternative to classic data integration systems, intended to cope with the growing number of data sources.

Properties of dataspaces:
  • may contain any kind of data (structured, semi-structured, unstructured)
  • require no upfront investment into a global schema
  • provide for data-coexistence
  • give best-effort answers to queries
  • rely on pay-as-you-go data integration

Franklin, M., Halevy, A., and Maier, D.: From Databases to Dataspaces: A New Abstraction for Information Management. SIGMOD Rec., 2005.

Dataspace Architecture

[Figure: dataspace architecture diagram. Source: Franklin et al.: From Databases to Dataspaces, SIGMOD Rec., 2005.]

Linked Data Principles

A set of best practices for publishing structured data on the Web in accordance with the general architecture of the Web.

1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful RDF information.
4. Include RDF statements that link to other URIs so that they can discover related things.

Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006

Architecture of the classic Web

Single global information space

Small set of simple standards:
1. HTML as document format
2. HTTP URLs as globally unique IDs and retrieval mechanism
3. Hyperlinks to connect everything

[Figure: Web browsers and search engines retrieve HTML documents from servers A, B, and C over HTTP; the documents are connected by hyperlinks.]

Web 2.0 APIs and Mashups

No single global dataspace

Shortcomings:
1. APIs have proprietary interfaces
2. Mashups are based on a fixed set of data sources
3. You cannot set hyperlinks between data items within different APIs

[Figure: a mashup built on four separate Web APIs over data sources A, B, C, and D.]

Web APIs slice the Web into Walled Gardens

Image: Bob Jagensdorf, http://flickr.com/photos/darwinbell/, CC-BY

Linked Data

Extend the Web with a single global dataspace
1. by using RDF to publish structured data on the Web
2. by setting links between data items within different data sources

[Figure: RDF data in data sources A, B, C, D, and E, connected by links.]

The RDF Data Model

[Figure: RDF graph with the following statements.]
  pd:cygri  rdf:type         foaf:Person
  pd:cygri  foaf:name        "Richard Cyganiak"
  pd:cygri  foaf:based_near  dbpedia:Berlin

Flexible graph-based data model.

Entities are identified with HTTP URIs

HTTP URIs take the role of global primary keys.

  pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri
  dbpedia:Berlin = http://dbpedia.org/resource/Berlin

Resolving URIs over the Web

[Figure: looking up dbpedia:Berlin adds further statements to the graph.]
  dbpedia:Berlin  dp:population  3.405.259
  dbpedia:Berlin  skos:subject   dp:Cities_in_Germany

The HTTP protocol brings together identification and retrieval again.

Following Links deeper into the Web

[Figure: following dp:Cities_in_Germany adds further cities to the graph.]
  dbpedia:Hamburg   skos:subject  dp:Cities_in_Germany
  dbpedia:Muenchen  skos:subject  dp:Cities_in_Germany

The Disco – Hyperdata Browser

Properties of the Web of Linked Data

  • Global, distributed dataspace built on a simple set of standards: RDF, URIs, HTTP
  • Entities are connected by links, creating a global data graph that spans data sources and enables the discovery of new data sources.
  • Provides for data-coexistence:
    • Everyone can publish data to the Web of Linked Data
    • Everyone can express their personal view on things
    • Everybody can use the schemata that they like for this

2. Linked Data Deployment on the Web

Is this real?

[Figure: RDF data in data sources A, B, C, D, and E, connected by links.]

W3C Linking Open Data Project

Grassroots community effort to
  • publish existing open license datasets as Linked Data on the Web
  • interlink things between different data sources

LOD Datasets on the Web: May 2007

  • Over 500 million RDF triples
  • Around 120,000 RDF links between data sources

LOD Datasets on the Web: September 2008

LOD Datasets on the Web: July 2009

  • Over 13.1 billion RDF triples
  • Over 142 million RDF links between data sources

DBpedia – An Interlinking Hub in the Web of Data

DBpedia

  • community effort to extract structured information from Wikipedia
  • provides data about 3.4 million things:
    312,000 persons, 140,000 organizations, 413,000 places, 94,000 music albums, 49,000 films, 146,000 species, …
  • provides identifiers for many common things, e.g. http://dbpedia.org/resource/Calgary
  • overlaps with many other data sources on the Web

The LOD effort is losing track with the diagram :-)

Uptake in Life Sciences

  • W3C Linking Open Drug Data Effort
  • Bio2RDF Project
  • Allen Brain Atlas

Uptake in the Libraries Community

Institutions publishing Linked Data:
  • Library of Congress (subject headings)
  • German National Library (PND dataset and subject headings)
  • Swedish National Library (Libris catalog)
  • Hungarian National Library (OPAC and Digital Library)
  • German Central Library of Economics (subject headings)

Workshop: Semantic Web in Bibliotheken (SWIB09), Cologne, November 24 and 25, 2009, http://www.swib09.de/

W3C Library Linked Data Incubator Group

Open Archives Object Reuse and Exchange (OAI-ORE) Standard

Uptake in the Media Industry

Publish data as RDF/XML and/or embed data into HTML using RDFa.

The Structural Continuum

The Web of Linked Data is interwoven with the classic Web.
  • Unstructured data: HTML
  • Semi-structured data: RDFa embedded into HTML
  • Structured data: RDF/XML

Services using named entity recognition to annotate texts with Linked Data URIs:
  • Open Calais (Thomson Reuters) for news
  • Zemanta (startup) for blog posts

3. Linked Data Applications

What can I do with this?

[Figure: Linked Data browsers, Linked Data mashups, and search engines operate on things in data sources A, B, C, D, and E, connected by typed links.]

Linked Data Browsers

Provide for navigating between data sources in order to explore the dataspace.
  • Tabulator Browser (MIT, USA)
  • Marbles (FU Berlin, DE)
  • OpenLink RDF Browser (OpenLink, UK)
  • Zitgist RDF Browser (Zitgist, USA)
  • Disco Hyperdata Browser (FU Berlin, DE)
  • Fenfire (DERI, Ireland)

DBpedia Mobile

  • Displays DBpedia data on a map
  • Provides for navigating into other data sources

Web of Data Search Engines

Crawl the dataspace and provide best-effort query answers over crawled data.
  • Falcons (IWS, China)
  • Sig.ma (DERI, Ireland)
  • Swoogle (UMBC, USA)
  • VisiNav (DERI, Ireland)
  • Watson (Open University, UK)

What are the big players doing?

Yahoo! and Google have started to crawl Linked Data in its RDFa serialization as well as Microformats.

Yahoo!
  • provides access to crawled data through the Yahoo BOSS API
  • is using the data within Yahoo Search Monkey to make search results more useful and visually appealing

Google
  • uses crawled RDF data for its Social Graph API
  • uses crawled data to enhance search result snippets for reviews and people

Yahoo! Search Monkey

Facebook’s Open Graph Protocol

Facebook imports RDFa data from external web sites.
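
The RDF Data Model and "Entities are identified with HTTP URIs" slides of this deck can be illustrated with a small sketch. It is not taken from the talk: it models the three statements about pd:cygri as plain Python tuples, expands the prefixed names into full HTTP URIs, and lists the URIs a Linked Data client could look up next. The names Literal, PREFIXES, expand, describe, and uris_to_follow are hypothetical, the pd namespace is inferred from the URI shown on the slide, and no HTTP request is made.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain literal value (as opposed to a URI). Hypothetical helper type."""
    value: str


# Prefix table. The rdf and foaf namespaces are the standard ones; the pd
# namespace is an assumption inferred from the slide's
# pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "pd": "http://richard.cyganiak.de/foaf.rdf#",
    "dbpedia": "http://dbpedia.org/resource/",
}

# The three statements from the RDF Data Model slide as
# (subject, predicate, object) tuples.
TRIPLES = [
    ("pd:cygri", "rdf:type", "foaf:Person"),
    ("pd:cygri", "foaf:name", Literal("Richard Cyganiak")),
    ("pd:cygri", "foaf:based_near", "dbpedia:Berlin"),
]


def expand(name: str) -> str:
    """Expand a prefixed name such as 'dbpedia:Berlin' into a full URI."""
    prefix, _, local = name.partition(":")
    return PREFIXES[prefix] + local


def describe(subject: str):
    """Return (predicate URI, object) pairs stated about the subject."""
    pairs = []
    for s, p, o in TRIPLES:
        if s == subject:
            obj = o if isinstance(o, Literal) else expand(o)
            pairs.append((expand(p), obj))
    return pairs


def uris_to_follow(subject: str):
    """Return the object URIs a client could dereference next."""
    return [o for _, o in describe(subject) if not isinstance(o, Literal)]


if __name__ == "__main__":
    print(expand("pd:cygri"))
    for uri in uris_to_follow("pd:cygri"):
        print("could look up:", uri)
```

A real client would dereference each returned URI over HTTP and merge the RDF it receives into the graph, which is the step the "Resolving URIs over the Web" slide depicts; that part is left out here to keep the sketch self-contained.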