A Semantics-based User Interface Model for Content Annotation, Authoring and Exploration

DISSERTATION submitted to the Faculty of Mathematics and Computer Science of Universität Leipzig

for the attainment of the academic degree

Doktor-Ingenieur (Dr. Ing.)

in the field of Computer Science

submitted

by M.Sc. Ali Khalili

born on 26 June 1984 in Karaj, Iran

Acceptance of the dissertation was recommended by:

1. Prof. Dr. Klaus-Peter Fähnrich, Universität Leipzig
2. Prof. Dr. Roberto García, Universitat de Lleida

The academic degree is conferred upon passing the defense on 26 January 2015 with the overall grade magna cum laude.

Bibliographic Data

Title: A Semantics-based User Interface Model for Content Annotation, Authoring and Exploration
Author: Ali Khalili
Institution: Universität Leipzig, Fakultät für Mathematik und Informatik
Statistical Information: 182 pages, 78 figures, 11 tables, 165 literature references

Abstract

The Semantic Web and Linked Data movements, which aim to create, publish and interconnect machine-readable information, have gained traction in recent years. However, the majority of information is still contained in and exchanged using unstructured documents, such as Web pages, text documents, images and videos. This cannot be expected to change, since text, images and videos are the natural way in which humans interact with information. Semantic structuring of content, on the other hand, provides a wide range of advantages over unstructured information. Semantically enriched documents facilitate information search and retrieval, presentation, integration, reusability, interoperability and personalization. Looking at the life-cycle of semantic content on the Web of Data, we see considerable progress on the backend side in storing structured content and in linking data and schemata. Nevertheless, from our point of view, the least developed aspect of the semantic content life-cycle is currently the user-friendly manual and semi-automatic creation of rich semantic content. In this thesis, we propose a semantics-based user interface model that aims to reduce the complexity of the underlying technologies for the semantic enrichment of content by Web users. By surveying existing tools and approaches for semantic content authoring, we extracted a set of guidelines for designing efficient and effective semantic authoring user interfaces. We applied these guidelines to devise a semantics-based user interface model called WYSIWYM (What You See Is What You Mean), which enables integrated authoring, visualization and exploration of unstructured and (semi-)structured content. To assess the applicability of our proposed WYSIWYM model, we incorporated the model into four real-world use cases comprising two general and two domain-specific applications.
These use cases address four aspects of the WYSIWYM implementation: 1) its integration into existing user interfaces, 2) its use for lightweight text analytics to incentivize users, 3) the crowdsourcing of semi-structured e-learning content, and 4) its use for authoring semantic medical prescriptions.

Publications

This thesis is based on the following conference and journal publications, of which I am an author or contributor. References to the relevant publications are included in the respective chapters and sections.

Conference Publications, peer-reviewed

• conTEXT – Lightweight Text Analytics using Linked Data, In proceedings of the 11th Extended Semantic Web Conference (ESWC 2014) [Khalili et al., 2014]. * 1st Prize of the AI Mashup Challenge 2014.

• WYSIWYM Authoring of Structured Content Based on Schema.org, In proceedings of the 14th International Conference on Web Information Systems Engineering (WISE 2013) [Khalili and Auer, 2013b].

• Semantic Medical Prescriptions – Towards Intelligent and Interoperable Medical Prescriptions, In proceedings of the IEEE 7th Internat