Master Thesis
August 22, 2012

Tagging Methods for Linked Media Data

David Oggier
of Vevey VD, Switzerland
Student-ID: 05-219-209

Advisor: Thomas Scharrenbach, PhD

Prof. Abraham Bernstein, PhD
Department of Informatics
University of Zurich
http://www.ifi.uzh.ch/ddis

Acknowledgements

I would like to thank Professor Abraham Bernstein for the opportunity to write my Master’s Thesis with the DDIS group of the University of Zurich. I am also very thankful to my advisor Thomas Scharrenbach for his great support and advice, and to Jörg-Uwe Kietz for filling in during his absence. Many thanks to the whole IT-OSE team at Hessischer Rundfunk for their help, and in particular to Patric Kabus and Haiko Emmel for making this thesis possible. Last but not least, I would like to thank my family and friends for supporting me. Special thanks to my proofreaders Amir and Christoph.

Abstract

In this thesis, a method is presented to tag the media metadata of a broadcasting company with Linked Data concepts. Specifically, a controlled vocabulary in the form of a thesaurus is used as an intermediary between broadcast metadata and Linked Data vocabularies. A method to link this metadata with appropriate thesaurus entries and an algorithm to align the latter with Linked Data concepts are presented and evaluated. Furthermore, it is investigated whether applying faceted search to the resulting semantically enhanced data benefits user queries.

Zusammenfassung

This thesis presents a procedure for linking the media metadata of a broadcasting company with Linked Data concepts. A controlled vocabulary in the form of a thesaurus is used to establish the connection between the content descriptions of broadcast items and Linked Data datasets. The procedure comprises a method for annotating these content descriptions with thesaurus concepts, as well as an algorithm that links these thesaurus concepts with Linked Data resources. Furthermore, it is investigated how this linking of content descriptions with semantic data can be exploited by a faceted search, and whether such a search is advantageous compared with conventional search methods.

List of Figures

Figure 1: Semantic Web standards stack (Obitko 2007)
Figure 2: Example RDF graph
Figure 3: The Linked Data cloud
Figure 4: Screenshot of Faceted Wikipedia Search
Figure 5: The FESADdigital search form
Figure 6: The FESADred search page
Figure 7: Overview of the implementation steps
Figure 8: Workflow of the matching for a particular thesaurus-DBpedia concept pair
Figure 9: Screenshot of the faceted search client
Figure 10: Aggregate precision values
Figure 11: Aggregate recall values

List of Tables

Table 1: General alignment statistics
Table 2: Frequencies of match types
Table 3: Alignment quality results
Table 4: Keyword extraction quality results

List of Listings

Listing 1: Sample RDF/XML statement
Listing 2: Sample RDF/N3 statement
Listing 3: Sample SPARQL query
Listing 4: Sample SKOS concept definition

Contents

Acknowledgements
Abstract
Zusammenfassung
List of Figures
List of Tables
List of Listings
Contents
1. Introduction
  1.1 Motivation
  1.2 Thesis Goals
  1.3 Thesis Outline
2. Background
  2.1 The Semantic Web
    2.1.1 Semantic Web Goals
    2.1.2 Resource Description Framework
    2.1.3 RDF Schema
    2.1.4 SPARQL
    2.1.5 Web Ontology Language
    2.1.6 Simple Knowledge Organization System
  2.2 Linked Data
    2.2.1 Linked Data Principles
    2.2.2 The Linked Data Cloud
    2.2.3 Faceted Search
  2.3 Ontology Learning and Matching
    2.3.1 Ontology Learning
    2.3.2 Ontology Alignment
3. Linking Media Metadata
  3.1 Media Metadata
    3.1.1 Metadata Generation and Consumption Lifecycle
    3.1.2 Content Description Metadata
    3.1.3 Current FESAD Access Points
  3.2 Implementation Details
    3.2.1 Implementation Overview
    3.2.2 Thesaurus Conversion
    3.2.3 Selection of Linked Data Vocabulary
    3.2.4 Thesaurus Alignment
    3.2.5 Metadata Conversion
    3.2.6 Tagging of Media Data
    3.2.7 Faceted Search Client
4. Evaluation
  4.1 Alignment Evaluation