Towards an Ontology Design Pattern Quality Model

Linköping Studies in Science and Technology Licentiate Thesis No. 1606 Towards an Ontology Design Pattern Quality Model by Karl Hammar Department of Computer and Information Science Linköping University SE-581 83 Linköping, Sweden Linköping 2013 This is a Swedish Licentiate’s Thesis Swedish postgraduate education leads to a doctor’s degree and/or a licentiate’s degree. A doctor’s degree comprises 240 ECTS credits (4 year of full-time studies). A licentiate’s degree comprises 120 ECTS credits. Copyright c 2013 Karl Hammar ISBN 978-91-7519-570-4 ISSN 0280–7971 Printed by LiU Tryck 2013 URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-93370 Towards an Ontology Design Pattern Quality Model by Karl Hammar September 2013 ISBN 978-91-7519-570-4 Linköping Studies in Science and Technology Licentiate Thesis No. 1606 ISSN 0280–7971 LiU–Tek–Lic–2013:40 ABSTRACT The use of semantic technologies and Semantic Web ontologies in particular have enabled many recent developments in information integration, search engines, and reasoning over formalised knowledge. Ontology Design Patterns have been proposed to be useful in simplifying the development of Semantic Web ontologies by codifying and reusing modelling best practices. This thesis investigates the quality of Ontology Design Patterns. The main contribution of the thesis is a theoretically grounded and partially empirically evaluated quality model for such patterns including a set of quality characteristics, indicators, measurement methods and recommendations. The quality model is based on established theory on information system quality, conceptual model quality, and ontology evaluation. It has been tested in a case study setting and in two experiments. The main findings of this thesis are that the quality of Ontology Design Patterns can be identified, formalised and measured, and furthermore, that these qualities interact in such a way that ontology engineers using patterns need to make tradeoffs regarding which qualities they wish to prioritise. The developed model may aid them in making these choices. This work has been supported by Jönköping University. Department of Computer and Information Science Linköping University SE-581 83 Linköping, Sweden Acknowledgements There are several individuals and organizations that have enabled and supported this research project in different ways, to which I extend my deepest gratitude. The supervision team consisted of Kurt Sandkuhl, Vladimir Tarasov, Eva Blomqvist,andHenrik Eriksson. This large and geographically dis- tributed team have complemented one another splendidly throughout the process and have when needed provided guidance regarding everything from research methods and writing style to academic social protocol and how to arrange workshops and other events. I’ve especially appreciated their hands- off approach which has enabled me to explore different directions and learn from many mistakes. My colleagues at the Department of Computer and Electrical Engineer- ing at Jönköping University have made the last couple of years working in research not only interesting but also fun. Christer Thörn and Ulf Seigerroth are however worthy of particular mention in that they have not only been great coworkers but have also contributed directly to the success of this work, the former by reviewing and commenting, and the latter by keeping bosses off my back and ensuring I’ve had time to get it done. Fredrik R. Krohnman has provided encouragement and many good discussions, keeping up my interest in computer technologies and research, and motivating me to continue on the academic path. A special thank you goes to my wife Jenny, who has supported me through good and bad, and has had to put up with many a late night spent working, without any complaints. I love you. Karl Hammar Jönköping, June 2013 v vi Contents 1 Introduction 1 1.1MotivatingFactors........................ 1 1.2 Research Questions . ..................... 3 1.3Contributions........................... 4 1.4ThesisOutline.......................... 5 2 Ontologies and Ontology Design Patterns 7 2.1KnowledgeModellingandOntologies.............. 7 2.1.1 Data,Information,andKnowledge........... 7 2.1.2 Terminological and Assertional Knowledge ....... 9 2.1.3 OntologyComponents.................. 10 2.1.4 RDF,RDFS,andOWL................. 13 2.2OntologyApplications...................... 16 2.2.1 OntologyTypes...................... 17 2.2.2 LinkedData....................... 18 2.2.3 SemanticSearch..................... 20 2.2.4 ReasoningTasks..................... 22 2.3OntologyDevelopment...................... 25 2.3.1 METHONTOLOGY................... 25 2.3.2 On-To-Knowledge.................... 26 2.3.3 DILIGENT........................ 27 2.3.4 OntologyDevelopment101............... 28 2.4OntologyDesignPatterns.................... 29 2.4.1 ODPTypologies..................... 32 2.4.2 ODP-basedOntologyConstruction........... 34 2.5TheStateofODPResearch................... 38 2.5.1 Results.......................... 39 2.5.2 AnalysisandDiscussion................. 40 3 Evaluation and Quality Frameworks 45 3.1RelatedQualityFrameworks.................. 45 3.1.1 MAPPER ......................... 45 3.1.2 ConceptualModelQuality................ 46 3.1.3 EntityRelationshipModelQuality........... 48 vii Contents 3.1.4 InformationSystemQuality............... 50 3.1.5 PatternQuality...................... 53 3.2OntologyEvaluation....................... 54 3.2.1 O2 andoQual....................... 54 3.2.2 ONTOMETRIC..................... 55 3.2.3 OntoClean........................ 56 3.2.4 TerminologicalCycleEffects............... 58 3.2.5 ODPDocumentationTemplateEffects......... 58 4 Research Method 59 4.1 A Perspective on Methods in the Computing Disciplines . 59 4.1.1 SystematicLiteratureReview.............. 61 4.1.2 CaseStudies....................... 63 4.1.3 Interviews......................... 64 4.1.4 Experimentation..................... 65 4.2 Description of the Research Process .............. 66 4.2.1 ODPLiteratureStudy.................. 66 4.2.2 InitialQualityModelDevelopment........... 71 4.2.3 KnowledgeFusionCaseStudy............. 75 4.2.4 Learnability and Usability Evaluations ......... 78 4.2.5 PerformanceIndicatorEvaluation........... 81 5 Initial Quality Model 85 5.1QualityMetamodelDevelopment................ 85 5.2QualityModelDevelopment................... 87 5.2.1 ISO 25010 Adaptation .................. 88 5.2.2 Thörn’sQualities..................... 90 5.2.3 Reuse of ER Model Quality Research ......... 91 5.2.4 Reuse of Established Ontology Quality Research . 93 5.3EmpiricalPre-studies...................... 95 5.3.1 ODPDocumentationStructureInterviews....... 96 5.3.2 ODPUsageExperimentandSurvey.......... 97 5.4TheDevelopedInitialQualityModel.............. 98 5.4.1 QualityCharacteristics................. 98 5.4.2 IndicatorsandEffects..................100 6 Quality Model Evaluations 107 6.1TheKnowledgeFusionCaseStudy...............107 6.1.1 CaseCharacterisation..................107 6.1.2 Data............................109 6.1.3 Findings..........................110 6.2 Learnability and Usability Evaluations .............114 6.2.1 Results..........................115 6.2.2 Findings..........................117 6.3PerformanceIndicatorEvaluation................119 6.3.1 LiteratureStudy.....................119 viii Contents 6.3.2 IndicatorVarianceinODPRepositories........122 6.3.3 Results..........................128 7 Refined ODP Quality Model 129 7.1Metamodel............................129 7.2QualityCharacteristics......................129 7.3IndicatorsandEffects......................132 7.3.1 UpdatedIndicators....................133 7.3.2 NewIndicators......................136 8 Conclusions and Future Work 139 8.1SummaryofContributions....................139 8.2 Research Questions Revisited ..................141 8.3FutureWork...........................142 Bibliography 145 List of Figures 157 List of Tables 159 ix Contents x Chapter 1 Introduction This licentiate thesis concerns the development of a quality model for On- tology Design Patterns, with a goal towards simplifying selection and use of such patterns for non-expert ontology engineers. In the following sections the background and motivation for the licentiate project are discussed, the research questions introduced, contributions of the work summarised, and structure of the remainder of the thesis briefly described. 1.1 Motivating Factors The work presented in this thesis falls within the Information Logistics research area, which is concerned with various types of problems pertaining to information provision and information flow. To name but a few: in the academic domain, where researchers seeking to find unexplored areas for study need structured information about what kind of research is being done and where; in product development, where keeping track of different sets of pos- sibly conflicting requirements is necessary to avoid costly product failures; and in disaster management and recovery, where the need for rapid and cor- rect information on which to base decisions is crucial in preventing injuries or even saving lives. In order to explore and solve information logistics problems three different perspectives are commonly employed [1]: • The information demand perspective, in which researchers study what individual or group has need of which information, in which presenta- tion format, in a certain context or situation. This can involve factors such as the task for which information is to be used, the effect of ge- ographical

Towards an Ontology Design Pattern Quality Model

Realising the Potential of Algal Biomass Production Through Semantic Web and Linked Data

From Cataloguing Cards to Semantic Web 1

Download Slides

Knowledge Extraction for Hybrid Question Answering

Datatone: Managing Ambiguity in Natural Language Interfaces for Data Visualization Tong Gao1, Mira Dontcheva2, Eytan Adar1, Zhicheng Liu2, Karrie Karahalios3

Directions in AI Research and Applications at Siemens Corporate

A Framework for Ontology-Based Library Data Generation, Access and Exploitation

Ontologies to Interpret Remote Sensing Images: Why Do We Need Them?

Question Answering

Ques+On Answering

Using a Natural Language Understanding System to Generate Semantic Web Content

Using Distributional Semantics for Automatic Taxonomy Induction