Consuming Linked Open Data Via Standard Web Widgets

The approved original version of this diploma or master thesis is available at the main library of the Vienna University of Technology (http://www.ub.tuwien.ac.at, English: http://www.ub.tuwien.ac.at/eng).

Consuming Linked Open Data via Standard Web Widgets

MASTER’S THESIS

submitted in partial fulfillment of the requirements for the degree of Diplom-Ingenieurin in Business Informatics

by Irina Pershina
Registration Number 1127738

to the Faculty of Informatics at the Vienna University of Technology

Advisor: o.Univ.-Prof. Dipl.-Ing. Dr.techn. A Min Tjoa
Assistance: Univ.Ass. Dipl.-Ing. Dr.rer.soc.oec. Amin Anjomshoaa

Vienna, 23.04.2014
(Signature of Author) (Signature of Advisor)

Technische Universität Wien, A-1040 Wien, Karlsplatz 13, Tel. +43-1-58801-0, www.tuwien.ac.at

Declaration of Authorship

Irina Pershina
Kohlgasse 49/15, 1050

I hereby declare that I have written this thesis independently, that I have fully specified all sources and aids used, and that I have clearly marked, with a reference to the source, all parts of the thesis (including tables, maps, and figures) that are taken from other works or from the Internet, whether verbatim or in substance.

(Place, Date) (Signature of Author)

Acknowledgements

I would like to express my great appreciation to my supervisors, Univ.-Prof. A Min Tjoa and Univ.Ass. Dr. Amin Anjomshoaa, for their valuable and constructive suggestions and useful critiques during the process of writing this master's thesis. Their willingness to give their time so generously has been very much appreciated.

I thank my family, who have always supported me at every stage of my life. I am especially grateful to my parents, who supported me emotionally and financially.

I would also like to thank my colleagues Peter Wetz, Dat Trinh Tuan, and Lam Ba Do from the Linked Data Lab, and Lucas Gerrand and Raffael Prätterhoffer from the Business Informatics Master's program, for their listening, patience, and support during the last ten months. I greatly enjoyed the collaboration and knowledge exchange with them.

Abstract

The Semantic Web describes a concept for storing, sharing, and retrieving information on the Web by adding machine-readable meta information that conveys the meaning of the data. Linked Open Data is publicly available structured data that is stored and modelled according to Semantic Web standards and interlinked with other Open Data. The Linked Open Data cloud comprises Linked Data sources and has grown significantly in recent years. Complementary to this, mashups allow non-professional users to access, consume, and analyze data from various sources. The basic component of a mashup is a widget that can access certain datasets, process data, and provide additional functionality. Mashups can partly handle Linked Data consumption for knowledge workers. The main challenges are finding the appropriate widget as the number of available widgets increases, categorizing and finding widgets with similar functionality, and adding provenance information to widgets.
The primary purpose of this thesis is to design a semantic model for a mashup platform that enables (i) publishing of widget information on the Linked Open Data Cloud, (ii) widget discovery, (iii) widget composition, and (iv) smart data consumption based on the semantic model. Additionally, the semantic model should supply provenance information about the origin and authenticity of data in order to increase trust in data resources. In the course of this research, existing approaches to Semantic Web Service description were compared, and Semantic Web Service description techniques were evaluated for their applicability to Web Widgets. Requirements for the semantic model are derived from a literature review and complemented with requirements for mashup systems. Finally, the semantic widget model is implemented in a mashup prototype to demonstrate its usability.

Kurzfassung (German abstract, in English translation)

The Semantic Web describes a concept for storing, exchanging, and retrieving information on the Web by adding machine-readable meta information. The goal is to give data a meaning. Linked Open Data is part of this concept. It refers to data that is modelled and stored according to the Resource Description Framework specification. In addition, this data is publicly available and interlinked. The Linked Open Data Cloud contains all significant Linked Data sources and has been growing steadily in recent years. Complementary to this, mashups give non-expert users access to the consumption and analysis of Linked Data. The basic component of a mashup is a widget, which can access certain datasets, process data, and provide additional functions. To date, mashups can only partly solve the problems that knowledge workers face when using Linked Data. The biggest challenges are finding suitable widgets as the number of available widgets grows, categorizing and finding widgets with similar functionality, and adding information about the origin and trustworthiness of data.

The main purpose of this master's thesis is the development of a semantic model for a mashup platform. The model enables (i) automatic publishing of widget information to the Linked Open Data Cloud, (ii) widget discovery, (iii) widget composition, and (iv) smart use of data based on semantic models. In addition, the model should contain information about the origin of data.

During my research I evaluated similarities and differences between Web Widgets and Semantic Web Services, compared existing approaches to Semantic Web Service description, and evaluated Semantic Web Service description techniques that are relevant for application in the area of Web Widgets. Requirements for the model are derived from the existing literature and complemented with the requirements for mashup systems. Finally, the semantic widget model is implemented in a prototype to demonstrate its practical usability.

Contents

1 Introduction
1.1 Motivation
1.2 Problem Statement
1.3 Structure of the Thesis
2 Web of Data
2.1 Web 2.0
2.2 Web 3.0
2.3 Resource Description Framework (RDF)
2.4 Web Ontology Language (OWL)
2.5 SPARQL. Query Language for RDF
2.6 Linked Open Data
2.7 Overview of Linked Data Endpoints
2.7.1 DBpedia
2.7.2 Linked Movie Data Base
2.8 Widgets & Mashups
2.9 Schema.org
2.10 Semantic Web Services
3 State of the Art
3.1 Applications
3.1.1 Overview of existing applications
3.1.2 Yahoo! Pipes
3.1.3 DERI Pipes
3.1.4 BIO2RDF
3.1.5 LOD2
3.2 Semantic Description Approaches
3.2.1 Web Services Description Language (WSDL)
3.2.2 Semantic Annotation for Web Services Description Language (SAWSDL)
3.2.3 Semantic Markup for Web Services (OWL-S)
3.2.4 Web Service Modeling Ontology (WSMO)
3.2.5 WSMO-Lite
3.2.6 RESTDesc semantic description
3.2.7 SA-REST
3.2.8 EXPRESS
3.2.9 Linked Open Services (LOS)
3.2.10 Linked Data Services (LIDS)
3.2.11 Data-Fu
3.2.12 Karma
3.2.13 RDB to RDF Mapping Language (R2RML)
3.3 Summary
4 Solution
4.1 Definition of requirements
4.2 Use and Extension of Karma Approach
4.3 Widget Model
4.4 DCAT
4.5 Provenance
5 Results and Evaluation
5.1 Resulting Semantic Model
5.2 Semantic Model Use cases
5.2.1 Publishing examples
5.2.2 Discovery examples
5.2.3 Composition examples
5.2.4 Smart Data Consumption
5.3 Result evaluation
6 Conclusion and Future Work
6.1 Research Summary
6.2 Research Limitation
6.3 Future Work
7 Appendix
7.1 Acronyms
7.2 Widget Semantic Model
7.3 Semantic Models in TopBraid Composer
Bibliography

CHAPTER 1
Introduction

1.1 Motivation

The Web is a phenomenon that has changed modern communication and enterprise networks. The idea was originally conceived 25 years ago by Tim Berners-Lee and Robert Cailliau [15]. The main goals of the project were to provide a protocol for requesting and exchanging information over networks, a method of reading information, search mechanisms, and a collection of documents [15]. The documents were connected by lists of references, so-called hyperlinks, to other text sources on the Internet.
In general, the Web is based on the following technologies:
• documents written in Hypertext Markup Language (HTML), the language that “was primarily designed as a language for semantically describing scientific documents” [67],
• Uniform Resource Locators (URLs), references to resources that consist of “a naming scheme specifier followed by a string whose format is a function of the naming scheme”, and Uniform Resource Identifiers (URIs), each “a compact sequence of characters that identify an abstract or physical resource” [1], i.e. the names of Web resources,
• Hypertext Transfer Protocol (HTTP), a protocol for “distributed, collaborative, hypermedia information systems” [64].

With the widespread use of the Web came the next stage in its evolution, the so-called “Read-Write” Web, or Web 2.0, in which users can distribute information as well as read it.
