RDFa vs. Microformats

DERI – DIGITAL ENTERPRISE RESEARCH INSTITUTE

RDFA VS. MICROFORMATS
A COMPARISON OF INLINE METADATA FORMATS IN (X)HTML

Alexander Graf¹

DERI TECHNICAL REPORT 2007-04-10, APRIL 2007

DERI Galway, University Road, Galway, Ireland, www.deri.ie
DERI Innsbruck, Technikerstrasse 21a, Innsbruck, Austria, www.deri.at
DERI Korea, Yeonggun-Dong, Chongno-Gu, Seoul, Korea, korea.deri.org
DERI Stanford, Serra Mall, Stanford, USA, www.deri.us

Abstract. Most Web pages contain inherently structured and significant data such as contact details for various people, dates and addresses of events, descriptive elements for photos and a lot more. As it is, this data is expressed in a way that is easily understandable for humans but incredibly hard for machines to detect and interpret. Once content publishers gain the ability to express this data more completely and tools are developed that are able to understand the semantics, a whole new set of possibilities on the internet becomes available to the end user. New forms of web content, meaningful to computers, will unleash a revolution of the internet. A true Semantic Web might still lie in the future but that’s no reason not to start using the core ideas from which it is being formed. There are several attempts to combine the principles of the Semantic Web, as envisioned by Tim Berners-Lee, with currently established technologies such as (X)HTML. This growth of semantics in the existing Web is swiftly advancing the state of the art for all Semantic Web processes. By enhancing existing Web documents with semantics we allow machines to categorise and handle information, so it can be used in a much more practical way, yet keep the principles of the existing web and still code for humans first.
This paper reviews, analyzes and compares RDFa and microformats, two of the current technologies for inline metadata in (X)HTML, and aims to give an overview of what possibilities are currently available for annotating existing data in Web sites.

Keywords: semantic web, microformats, rdfa, xhtml, inline metadata, erdf.

¹ Digital Enterprise Research Institute Innsbruck, University of Innsbruck, Technikerstraße 21a, A-6020 Innsbruck, Austria. E-mail: [email protected]

Copyright © 2007 by the authors

Contents

1 Introduction
2 Common Principles
  2.1 Visible Metadata
  2.2 DRY principle
3 RDFa
  3.1 Benefits
  3.2 Drawbacks
4 Microformats
  4.1 Benefits
  4.2 Drawbacks
5 Side by Side Comparison
6 Discussion and Conclusions

1 Introduction

“The goal of the Semantic Web initiative is as broad as that of the Web: to create a universal medium for the exchange of data. It is envisaged to smoothly interconnect personal information management, enterprise application integration, and the global sharing of commercial, scientific and cultural data. Facilities to put machine-understandable data on the Web are quickly becoming a high priority for many organizations, individuals and communities.” [1]

This vision of the Semantic Web consists essentially of a distributed knowledge system based on RDF, a markup format that provides a way to express logical statements in serialized formats like XML. It derives from Tim Berners-Lee’s vision of the World Wide Web as a universal medium for knowledge exchange. This new Semantic Web would be fundamentally different from the Web of today, mainly because it will be designed for machines first and humans second. Additionally, the Semantic Web requires that we take a step away from the Web that we all know, throw away current practices and formats and embrace a new Web.
While this isn’t exactly bad, it’s not a step that is to be taken lightly and it’s certainly not a step that can be taken quickly.

At the moment we already have a Web which is viewable by humans in its native form and yet can be used as a first step to a Semantic Web. By re-using existing data and allowing the expression of semantics in Web pages, we can provide machines with information already being published on the Web as (X)HTML. Those “Real World Semantics” are seeing widespread adoption by companies, bloggers and other “real people” on the internet beyond academic institutions. Recently the term “Lowercase Semantic Web” was coined for this type of mark-up, where the goals of the semantic web are achieved without dependence on the standards that are part of the wider Semantic Web initiative but can still work together with the “Uppercase Semantic Web” which comprises those standards.

Several technologies that aim to enhance (X)HTML with semantic information have surfaced over time and are struggling for public acceptance, the most important ones being RDFa and microformats. Both RDFa and microformats share the same goal, yet are fundamentally different in that they approach the problem from different directions, and deserve a closer inspection.

2 Common Principles

While RDFa and microformats are very different, they share several core principles. For example, both technologies support plain literals, are well formed and have no negative effect on browser behaviour. They also both follow the Principle of Least Astonishment, which states that, when several elements of an interface are ambiguous, the behaviour that least surprises the human user should apply as it will usually be the correct one. The principle of Visible Metadata and the DRY Principle are two more features that are equally available in both approaches.

2.1 Visible Metadata

Previously there were several attempts to annotate HTML documents with metadata.
Ranging from <meta> tags in the head of a document to embedded RDF in HTML comments, those attempts had in common that the metadata was invisible to the human reader of the document. Hidden metadata is often abused for search engine placement or other gain that only benefits the author of the document, not the user.

By making metadata available and completely visible, a consumer can easily know whether to trust the author and can be sure that all data is actually relevant to the human reader as well as machines. This principle also assists the document author in keeping the metadata up-to-date. Metadata that is hidden away can be easily forgotten and go stale, whereas visible inaccuracies would soon be discovered by humans and could thus be fixed.

2.2 DRY principle

DRY stands for Don’t Repeat Yourself and describes another important process philosophy used in the RDFa and microformats approaches. Also known as Once and Only Once or Single Point of Truth, the core principle, which was first mentioned in Andy Hunt and Dave Thomas’s book The Pragmatic Programmer, aims to reduce redundancy in computing.

“DRY says that every piece of system knowledge should have one authoritative, unambiguous representation. Every piece of knowledge in the development of something should have a single representation. [...] Given all this knowledge, why should you find one way to represent each feature? The obvious answer is, if you have more than one way to express the same thing, at some point the two or three different representations will most likely fall out of step with each other. Even if they don’t, you’re guaranteeing yourself the headache of maintaining them in parallel whenever a change occurs. And change will occur.” [3]

Often we maintain separate RDF documents along with their HTML equivalents and have to update both resources on a regular basis.
If the DRY principle is applied, a modification of any metadata in the system has to be done in only one place, expressed for both humans and machines.

3 RDFa

RDFa, developed and proposed by the W3C, is a set of rules that can be used as a module for XHTML 2. It reuses attributes from the standard XHTML meta and link elements and applies them to all other XHTML elements, so that one can annotate XHTML markup with semantic information. With a simple mapping it is possible to extract RDF triples from an RDFa-annotated document.

“RDFa is a syntax for expressing this structured data in XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don’t repeat themselves. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.” [4]

The ultimate goal of RDFa is to make any RDF structure representable in pure XHTML. Unlike the microformats approach, this allows an author to use a predefined set of rules to mark up just about anything. Since the underlying abstract representation is pure RDF, publishers can build their own vocabulary and extend other vocabularies with maximum interoperability. The structure expressed with RDFa is closely tied to the actual data, so the rendered elements can be copied and pasted along with their relevant RDF structure.

However, there are also problems with RDFa. Not only does it require XHTML 2, it also requires a new form of URIs, called CURIEs.

3.1 Benefits

• Publishers are independent and each website is allowed to use its own standards
• Because of Self Containment, the RDF triples are separated from the (X)HTML content
• Modularity of the schema makes attributes reusable
• Follows several well-working microformats principles
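
The mapping from RDFa attributes to RDF triples mentioned in Section 3 can be illustrated with a small Python sketch. It is not the W3C processing model and covers only a hand-picked subset: the about, property, content, rel and href attributes, plus CURIE prefixes declared with xmlns:. The class name TinyRDFaExtractor, the function extract_triples, the example.org URIs and the sample markup are all hypothetical and exist only for this illustration. The sketch further assumes that prefix declarations apply document-wide, that an element carrying property contains plain text only, and that every element is explicitly closed.

```python
from html.parser import HTMLParser


class TinyRDFaExtractor(HTMLParser):
    """Collects (subject, predicate, object) triples from a small RDFa subset.

    Hypothetical helper for illustration; real RDFa processing has many more rules.
    """

    def __init__(self, base):
        super().__init__()
        self.prefixes = {}       # CURIE prefix -> namespace URI (from xmlns:* attributes)
        self.subjects = [base]   # stack of subjects; the innermost "about" wins
        self.pending = None      # (subject, predicate) waiting for its text literal
        self.text = []           # text chunks collected for the pending literal
        self.triples = []

    def expand(self, curie):
        """Turn a CURIE such as dc:title into a full URI if the prefix is known."""
        prefix, sep, reference = curie.partition(":")
        if sep and prefix in self.prefixes:
            return self.prefixes[prefix] + reference
        return curie

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # Simplification: prefixes are registered globally, not scoped per element.
        for name, value in a.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        # Every element pushes a subject so that end tags can pop symmetrically.
        subject = a.get("about", self.subjects[-1])
        self.subjects.append(subject)
        if "property" in a:
            predicate = self.expand(a["property"])
            if "content" in a:
                # An explicit content attribute overrides the rendered text.
                self.triples.append((subject, predicate, a["content"]))
            else:
                # Otherwise the rendered text becomes the literal (DRY principle).
                self.pending = (subject, predicate)
                self.text = []
        if "rel" in a and "href" in a:
            self.triples.append((subject, self.expand(a["rel"]), a["href"]))

    def handle_data(self, data):
        if self.pending is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if self.pending is not None:
            subject, predicate = self.pending
            self.triples.append((subject, predicate, "".join(self.text).strip()))
            self.pending = None
        if len(self.subjects) > 1:
            self.subjects.pop()


def extract_triples(markup, base="http://example.org/"):
    """Parse the markup and return the list of extracted triples."""
    parser = TinyRDFaExtractor(base)
    parser.feed(markup)
    parser.close()
    return parser.triples


# Hypothetical sample page; the subject and license URIs are placeholders.
SAMPLE = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:cc="http://creativecommons.org/ns#"
     about="http://example.org/report">
  <h1 property="dc:title">RDFa vs. Microformats</h1>
  <span property="dc:creator">Alexander Graf</span>
  <span property="dc:language" content="en"></span>
  <a rel="cc:license" href="http://example.org/license">License terms</a>
</div>
"""

if __name__ == "__main__":
    for triple in extract_triples(SAMPLE):
        print(triple)
```

The title and creator literals are taken directly from the text a human reader sees, which is the DRY principle of Section 2.2 in practice: the value is written once and serves both audiences. The expand method shows why CURIEs matter to RDFa, since the short form dc:title stands for a full predicate URI.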