
Creating Ontology-Based Metadata by Annotation for the Semantic Web


Dissertation approved by the Faculty of Economics (Fakultät für Wirtschaftswissenschaften) of the Universität Fridericiana zu Karlsruhe for the award of the academic degree of Doktor der Wirtschaftswissenschaften (Dr. rer. pol.).

Creating Ontology-Based Metadata by Annotation for the Semantic Web

by Dipl.-Ing. (FH), Dipl.-Inf.wiss. Siegfried Handschuh

Date of oral examination (Tag der mündlichen Prüfung): 16 February 2005. Referent: Prof. Dr. Rudi Studer. Korreferent: Prof. Dr. Christof Weinhardt.

To my family.

Acknowledgements

I would like to thank the people who supported me during the process of researching and writing this thesis. First of all, I would like to express my gratitude to my advisor, Prof. Dr. Rudi Studer, for his support, patience, and encouragement throughout my PhD studies at the Institute AIFB at the University of Karlsruhe. I am also grateful to Prof. Dr. Christof Weinhardt on my dissertation committee, and to Prof. Dr. Andreas Oberweis, who served on the examination committee. My work profited very much from working with Prof. Dr. Steffen Staab. I learned a lot from him; he inspired me with his ideas and significantly improved my scientific skills, e.g. by teaching me how to successfully write papers. He and Prof. Dr. Gerd Stumme gave me invaluable advice on organizing research workshops. My thanks also go to all my colleagues in Karlsruhe at the Institute AIFB, the FZI and Ontoprise GmbH for providing a very fruitful and stimulating working atmosphere, in particular to Philipp Cimiano, Nenad and Ljiljana Stojanovic, Sudhir Agarwal and Alexander Mädche. I am especially indebted to Dr. Stefan Decker for taking me on board his OntoAgents project and the DARPA DAML program, which funded my work on semantic annotation. Tanja Sollazzo, Guylaine Kepeden, Wolf Winkler, Mika Maier-Collin and Kai Kühn, who wrote their diploma theses under my supervision, contributed to the annotation framework. Furthermore, I would like to thank Leo Meyer and Matthias Braun for their implementation work. My gratitude goes to Dr. Simon Beck for improving my time management skills in the final stage of my thesis writing, and to my sister-in-law, Caroline, for proofreading drafts of this thesis; any remaining errors are of course mine. I thank my friends Esther Stäheli and Sven Ruby for their devotion and for putting up with me over a number of years; and Jorge Gonzalez for his advice and Julien Tane for the lively discussions. My family, my parents Philomena and Karl, receive my deepest gratitude and love for the many years of support and for always believing in me. Last, but not least, I would like to thank my wife Patricia for her patience, love and understanding, and our adorable daughter Ella for being a part of our lives from now on.


Contents

Acknowledgements

I. Foundations

1. Introduction
   1.1. Motivation & Problem Description
   1.2. Research Questions
   1.3. Approaches
   1.4. Reader's Guide

2. Metadata and Ontology Languages
   2.1. Metadata
        2.1.1. Metadata Standards
        2.1.2. Metalanguages
   2.2. XML
   2.3. XML Pointer Language (XPointer)
        2.3.1. XPath Basics
        2.3.2. XPointer Standards
        2.3.3. Referencing Text Passages with XPointers
   2.4. RDF and RDFS
        2.4.1. The RDF Data Model
        2.4.2. RDF Schema
        2.4.3. RDF Syntax
        2.4.4. Notation 3
   2.5. Ontologies
        2.5.1. Ontology Definition
        2.5.2. Classification
   2.6. Ontology Languages

3. Semantic Annotation for the Web
   3.1. The Semantic Web
   3.2. Infrastructure for the Semantic Web – The Information Foodchain
   3.3. Processes for the Semantic Web – Knowledge Process and Knowledge Meta Process
        3.3.1. Knowledge Meta Process
        3.3.2. Knowledge Process
   3.4. Semantic Annotation
        3.4.1. Terminology
        3.4.2. The Semantics of Semantic Annotation
        3.4.3. Layering of Annotation
   3.5. Summary

II. Metadata for the Semantic Web

4. Annotation and Authoring Framework
   4.1. Introduction
   4.2. Case Studies for CREAM
        4.2.1. KA2 Initiative and KA2 Portal
        4.2.2. TIME2Research Portal
        4.2.3. Authors' Annotations of Paper Abstracts at ISWC
   4.3. Requirements for CREAM
   4.4. Design of CREAM
        4.4.1. CREAM Modules
        4.4.2. Architecture of CREAM
   4.5. Meta Ontology
        4.5.1. Label
        4.5.2. Default Pointing
        4.5.3. Property Mode
        4.5.4. Further Meta Ontology Descriptions
   4.6. Modes of Interaction
        4.6.1. Annotation by Typing
        4.6.2. Annotation by Markup
        4.6.3. Annotation by Authoring
   4.7. Conclusion

5. Semi-Automatic Annotation
   5.1. Information Extraction
        5.1.1. Process of Ontology-based Information Extraction
        5.1.2. Amilcare
        5.1.3. Synthesizing S-CREAM
        5.1.4. Discourse Representation (DR)
        5.1.5. Conclusion
   5.2. The Self-Annotating Web
        5.2.1. The Process of PANKOW
        5.2.2. Pattern-based Categorization of Candidate Proper Nouns
        5.2.3. Integration into CREAM
        5.2.4. Conclusion

6. Deep Annotation
   6.1. Introduction
   6.2. Use Cases for Deep Annotation
   6.3. The Process of Deep Annotation
   6.4. Architecture
   6.5. Server-Side Markup
        6.5.1. Requirements
        6.5.2. Representation
        6.5.3. Query Representation
        6.5.4. Result Representation
   6.6. Annotation
        6.6.1. Annotation Process
        6.6.2. Creating Generic Instances of Classes
        6.6.3. Creating Generic Attribute Instances
        6.6.4. Creating Generic Relationship Instances
   6.7. Mapping and Querying
        6.7.1. Investigating Mappings
        6.7.2. Querying the Database
   6.8. Conclusion

7. Application
   7.1. Linguistic Annotation
        7.1.1. The Ontology-based Linguistic Annotation Framework
        7.1.2. Annotating Anaphoric Relations
        7.1.3. CREAM and OntoMat
        7.1.4. Conclusion
   7.2. Service Annotation
        7.2.1. Use Case
        7.2.2. Overview of the Complete Process of CREAM-Service
        7.2.3. Semantic Web Page Markup for Web Services
        7.2.4. Browsing and Deep Annotation
        7.2.5. Conclusion

III. Evaluation

8. Evaluation of Manual Annotation
   8.1. Introduction
   8.2. Evaluation Setting
        8.2.1. General Setting
        8.2.2. Semantic Annotation Categories
   8.3. Formal Definition of Evaluation Setting
   8.4. Evaluation Measures
        8.4.1. Perfect Agreement — Agreement Precision & Agreement Recall
        8.4.2. Sliding Agreements
   8.5. Example of an Evaluation
   8.6. Cross-Evaluation Results
        8.6.1. Basic Statistics
        8.6.2. Perfect Agreement: Agreement-Precision & Agreement-Recall
        8.6.3. Sliding Agreement
        8.6.4. Overall Results
   8.7. Conclusion

9. Evaluation of Semi-Automatic Annotation
   9.1. Introduction
   9.2. Method
        9.2.1. General Setting
        9.2.2. The Domain Ontology
        9.2.3. Training of the Annotation
        9.2.4. Experimental Design
        9.2.5. Application of a Statistical Test
        9.2.6. Test Procedure
   9.3. Results
        9.3.1. Time Measurement
        9.3.2. Results of Group A
        9.3.3. Results of Group B
        9.3.4. Statistical Results
        9.3.5. Summary
   9.4. Discussion
   9.5. Conclusion

IV. Related Work & Conclusions

10. Comparison with Related Work
    10.1. Related Work for the Basic Framework
          10.1.1. Knowledge Markup in the Semantic Web
          10.1.2. Comparison with Knowledge Acquisition Frameworks
          10.1.3. Comparison with Annotation Frameworks
          10.1.4. Comparison with Authoring Frameworks
    10.2. Related Work for the Extended Framework
          10.2.1. Related Work for Deep Annotation
          10.2.2. Related Work for Pattern-based Annotation
    10.3. Related Work for Applications of CREAM
          10.3.1. Comparison with Service Annotation
          10.3.2. Comparison with Linguistic Annotation
    10.4. Related Work for Evaluation

11. Conclusion and Future Work
    11.1. Contributions
    11.2. Insights into Semantic Annotation
    11.3. Open Questions
    11.4. Future Research
    11.5. Summary

V. Appendix

A. Metadata Standards

B. Glossary

Bibliography


List of Figures

2.1. Indexing of points and nodes
2.2. RDF statement
2.3. RDF reification
2.4. RDF Layering
2.5. Different kinds of ontologies and their relationships

3.1. Semantic Web Layer Cake
3.2. Information Foodchain for the Semantic Web
3.3. Two orthogonal processes with feedback loops
3.4. The knowledge process
3.5. Annotation example

4.1. Architecture of CREAM
4.2. SWOBIS template
4.3. Annotation example
4.4. Annotation by Authoring example

5.1. The Process of Information Extraction
5.2. Annotation example
5.3. Two Ways to the Target: Manual and Automatic Annotation
5.4. The Process of PANKOW
5.5. Screenshot of CREAM with PANKOW plugin in interactive mode

6.1. The Process of Deep Annotation
6.2. An Architecture for Deep Annotation
6.3. Screenshot of Providing Deep Annotation with OntoMat-Annotizer
6.4. Mapping between Server Database (left window) and Client Ontology (right window)
6.5. Querying: Persons with first names starting with letter 'S'
6.6. Querying: F-Logic Query

7.1. The ontology underlying the annotation scheme
7.2. The hierarchical organization of the semantic relations
7.3. Annotation Tool Screenshot
7.4. Sequence Diagram for the Use Case
7.5. The Complete Process of OntoMat-Service
7.6. Screenshot of OntoMat-Service-Surfer annotating vendor service
7.7. Mapping between Client Ontology (left window) and Vendor Ontology (right window)
7.8. Mapping between Client Ontology (left window) and Insurer's Ontology (right window)

8.1. Example Evaluation
8.2. Perfect Agreement
8.3. Sliding Agreement

9.1. Mean values divided into annotation modes and groups
9.2. Mean values grouped by annotation modes
9.3. Instance identification of group A
9.4. Instance-concept relationship of group A
9.5. Instance-attribute relationship of group A
9.6. Instance-instance relationship of group A
9.7. Sliding agreement: IdMA of group A
9.8. Sliding agreement: RIAA of group A
9.9. Instance identification of group B
9.10. Instance-concept relationship of group B
9.11. Instance-attribute relationship of group B
9.12. Instance-instance relationship of group B
9.13. Sliding agreement: IdMA of group B
9.14. Sliding agreement: RIAA of group B

List of Tables

4.1. Design Rationale — Linking Requirements with CREAM Modules

5.1. Comparison of Output: Amilcare versus manual OntoMat
5.2. Template Strategy
5.3. Ordering Strategy

6.1. Principal situation

8.1. Basic statistics computed for the generated Semantic Annotation knowledge bases
8.2. Perfect Agreement with P computed for id(I)
8.3. Perfect Agreement with P computed for AC
8.4. Perfect Agreement with P computed for AA
8.5. Perfect Agreement with P computed for AR
8.6. Sliding Agreement Measures: IdMA
8.7. Sliding Agreement Measures: RIAA

9.1. Average annotation times
9.2. Results of the t-test for the annotation time
9.3. Basic statistics for group A manual and semi-automatic annotation
9.4. Basic statistics for group B
9.5. Results of the t-test over the annotation time
9.6. Percentage change of the mean value for the inter-annotator agreement

A.1. Dublin Core Elements


Part I.

Foundations

“. . . whoever controls the vocabulary, controls the knowledge” — George Orwell


1. Introduction

1.1. Motivation & Problem Description

Like all truly great ideas, Tim Berners-Lee's principal idea of the Semantic Web may be easily summarized: when computers not only retrieve, but also understand what data is available on the Web, we will have a new kind of Web and new types of intelligent applications in the Web. In the foreseeable future, however, machines will be too dumb to understand what people have put on the Web. Therefore, let us put computer-understandable data next to human-understandable data. Then, the computers will be smarter. In order to make the vision of the Semantic Web come true, we need a number of building blocks, some of them elaborated on in recent writings [Fensel et al., 2002; Hendler and Horrocks, 2002; Fensel et al., 2003; DeRoure and Iyengar, 2002; Chen et al., 2003; Constantopoulos et al., 2000; Staab et al., 2001a; Frank et al., 2002a]. For instance, we need standardized languages to describe semantic data, i.e. data that is as computer-understandable as it is semantically self-describing, and we need programmes and protocols to actually exchange and understand semantic data. As a primus inter pares, however, we need semantic data. The work in this thesis is about providing semantic data, a process often referred to as "Semantic Annotation" because it frequently involves the embellishment of existing data, e.g. plain text that is only understandable to humans, with semantic metadata that describes, e.g., the text. Understandably, Semantic Annotation is now one of the core challenges for building the Semantic Web. Since Semantic Annotation is the key notion of this thesis, we shall define it in more detail:

Definition 1.1.1 The term Semantic Annotation describes a process as well as the outcome of that process1. Hence it describes i) the process of adding semantic data or metadata to content, given an agreed ontology, and ii) the semantic data or metadata itself as the result of this process.

1cf. the term "drawing"
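For a first intuition, consider a minimal sketch (the ex: names are hypothetical and chosen for illustration only): the text fragment "Rudi Studer" on a Web page might be annotated with the metadata statement

(ex:RudiStuder, rdf:type, ex:Researcher)

which says that the text passage denotes an instance of the ontology class Researcher. Both the act of attaching this statement to the page and the statement itself are Semantic Annotation in the sense of Definition 1.1.1.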


1.2. Research Questions

The creation of Semantic Annotation raises a series of research questions. The most important ones, which will be addressed by the work described in this thesis, are outlined in the following. Initially, the following question must be asked:

• How can the creation of Semantic Annotation be supported? How can the complexity of creating Semantic Annotation be reduced?

Thus the general question of how to create Semantic Annotation is investigated, and possible solutions are described in this thesis. The existing Web can be seen as a distributed information system consisting of HTML pages. The first step is the extension of purely syntactic information, e.g. HTML documents, with semantics. The challenge here is a knowledge capturing problem:

• How may one turn existing syntactic resources into interlinked knowledge structures that represent the underlying information?

However, in order to provide metadata about the content of a Web page, the author must first create the content and secondly annotate it in an additional, a-posteriori annotation step. The question here is:

• How to combine the authoring of a Web page and the creation of Semantic Annotation describing its content?

Nevertheless, providing plenty of semantic metadata by annotation, i.e. conceptual markup of text passages, is a laborious task. Therefore:

• How can one increase the efficiency of the production of metadata? How can the annotation process be automated or semi-automated?

Today, a large percentage of Web pages are dynamic. Annotating every single dynamic Web page generated from a database would be tedious. Here, the question is:

• How to annotate the database so that it is reusable for site-specific Semantic Web purposes?


Evaluating metadata creation, or knowledge capturing more generally, is not that well researched. It lacks generic methods and concrete evaluation and comparison measures (compared to recall/precision in information retrieval) that may be applied for evaluating Semantic Annotation (viz. ontology population). Furthermore, the measurement of inter-annotator agreement can be used to compare human performance with semi-automatic approaches. Therefore, the two questions of

• How can the metadata creation results between humans be evaluated? How can the inter-annotator agreement be formally compared?

• How do humans perform in metadata creation compared to information extraction techniques? will be addressed and results will be provided in the evaluation chapter of this thesis.

1.3. Approaches

The core of this thesis is the development of an annotation framework, viz. CREAM (CREAting Metadata for the Semantic Web), to support the creation of metadata. CREAM comprises methods for:

• Manual annotation: The transformation of existing syntactic resources (viz. textual documents) into interlinked knowledge structures that represent relevant underlying information (Sections 4.6.1 and 4.6.2).

• Authoring of documents: In addition to the annotation of existing documents, the authoring mode lets authors create metadata — almost for free — while putting together the content of a document (Section 4.6.3).

• Semi-automatic annotation: Efficient semi-automatic annotation based on information extraction that is trained to handle structurally and/or linguistically similar documents (Section 5.1). Another approach to semi-automatic annotation is the self-annotating Web. The principal idea of the self-annotating Web is that it uses globally available Web data and structure to semantically annotate – or at least facilitate the annotation of – local resources (Section 5.2).

• Deep annotation: Deep annotation results in a semantic mapping to the underlying database if the database owner cooperates in the Semantic Web and allows for direct access to the database (Chapter 6).


1.4. Reader’s Guide

Every chapter is preceded by a brief introductory paragraph which explains how the work presented in it fits into the overall structure of the thesis. The thesis is divided into four main parts: Foundations (I), Metadata for the Semantic Web (II), Evaluation (III) and Related Work & Conclusions (IV). These four main parts are organized as follows:

Part I — Foundations:

• Chapter 2 introduces metadata and ontology languages. It introduces metadata languages, i.e. the eXtensible Markup Language (XML), which can be used for modeling structures, the XPointer scheme for addressing parts of a document, the Resource Description Framework (RDF) to express statements, and RDF Schema to define a vocabulary for RDF. Finally, it presents OWL as a fully-fledged stack of ontology languages.

• Chapter 3 is about the vision of the Semantic Web. It shows the Semantic Web Layer Cake based on the metadata languages introduced before. It presents how to apply these technologies in order to provide Semantic Annotation. It presents different points of view on ontology and metadata creation, viz. the knowledge meta process and the knowledge process, as well as the information foodchain for the Semantic Web. It defines terms that are used with regard to metadata creation, shows the different semantics of Semantic Annotation, gives an insight into first steps towards a methodology for Semantic Annotation and presents a layering of annotation. It concludes with a comprehensive annotation model.

Part II — Metadata for the Semantic Web:

• Chapter 4 presents a comprehensive annotation and authoring framework, CREAM, that allows for the creation of semantic metadata about static and dynamic Web pages, i.e. for Semantic Annotation of the Shallow and Deep Web. CREAM supports the manual and the semi-automatic annotation of static Web pages, the authoring of new Web pages with the simultaneous creation of metadata, and the deep annotation of Web pages defined dynamically by database queries. This chapter first describes case studies from which we took a major part of our experiences for guiding the development of CREAM, and details some of the requirements for Semantic Annotation that we derived from these case studies. From the requirements elaborated previously, we derive the design and the modules of CREAM. We specify how the meta ontology may modularize the ontology description from the way the ontology is used in CREAM. Finally, we explain the major modes of interaction within the annotation tool OntoMat, our reference implementation of CREAM.


• Chapter 5 describes the techniques developed for semi-automatic annotation. These techniques are realized as extensions, viz. plugins, of our annotation framework. The first extension – S-CREAM (Semi-automatic CREAtion of Metadata) – allows for the creation of metadata and is trainable for a specific domain. The implementation of S-CREAM in the tool OntoMat supports the semi-automatic annotation of Web pages. This semi-automatic annotation is based on the information extraction component Amilcare: with the help of Amilcare, OntoMat extracts knowledge structures from Web pages through the use of extraction rules. These rules are the result of a learning cycle based on previously annotated pages. The second extension – PANKOW (Pattern-based Annotation through Knowledge On the Web) – focuses on a method that combines the use of linguistic patterns to identify instances with the use of the WWW as a large corpus, accessed via a search engine.

• Chapter 6 portrays an extension of the CREAM framework for metadata creation when Web pages are generated from a database and the database owner cooperatively participates in the Semantic Web. In order to create metadata, the framework combines the presentation layer with the data description layer — in contrast to "conventional" annotation, which remains at the presentation layer. Therefore, we refer to the framework as deep annotation2. This chapter describes the building blocks of deep annotation. Firstly, it elaborates the use cases. Then it continues with a description of the overall process, in which the major requirements are identified: i) server-side markup, ii) utilization of the information by annotation, iii) creation and exploitation of mappings, and iv) querying of the serving database.

• In Chapter 7 we demonstrate the flexibility of the annotation framework by using it for linguistic annotation and service annotation. The first section shows how the basic framework can easily be exploited for linguistic annotation, while the second section shows how the deep annotation framework can also be applied to the annotation and composition of Web services.

Part III — Evaluation:

• Chapter 8 deals with the evaluation of hands-on experiences, exploring an annotation experiment with human subjects.

2The term “deep annotation” was coined by Carol Goble in the Semantic Web Workshop of WWW 2002.


It describes an empirical evaluation study of ontology-based semantic annotation. Based on a given ontology and a set of documents, we analyze in this chapter the inter-annotator agreement between different humans. The evaluation study uses several standard measures and two original measures. The latter take into account a notion of sliding agreement between metadata, exploiting semantic background knowledge provided by the ontology.

• Chapter 9 investigates the approach of semi-automatic annotation. Based on the evaluation study for manual annotation introduced in the previous chapter, we adapt the measures to evaluate the similarity between the annotations produced by different users when applying the manual annotation process in comparison to the semi-automatic annotation process. The evaluation confirms that semi-automatic annotation is faster and produces more homogeneous metadata than manual annotation. The chapter describes the principal methods of evaluation and presents the results and a discussion.

Part IV — Related Work & Conclusions:

• Chapter 10 deals with the difficult task of giving an overview of related work on Semantic Annotation. Semantic Annotation as we have presented it in this thesis is a cross-sectional enterprise. Much work in a number of disciplines, like knowledge acquisition, computational linguistics, information retrieval and information integration, has researched and applied techniques for solving part of the overall problem of Semantic Annotation for the Semantic Web. This chapter gives an overview of related work from a number of different communities.

• Chapter 11 concludes with a short summary of the methodological and technical results and sketches ideas for further research. It explains the main contributions of the work described in this thesis and lists a number of insights gained from doing this research. Additionally, unsolved questions and further research issues are defined.

2. Metadata and Ontology Languages

This chapter provides a basic introduction to metadata (Section 2.1) and ontology languages. In the following sections, different metadata languages are introduced, i.e. the eXtensible Markup Language (XML) (Section 2.2), which can be used for modeling structures, the XPointer scheme (Section 2.3) for addressing parts of a document, the Resource Description Framework (RDF) (Section 2.4) to express statements, and RDF Schema (Section 2.4.2) to define a vocabulary for RDF. Furthermore, a formal definition of an ontology (Section 2.5.1) is presented as it is used in this thesis. Finally, OWL (Section 2.6) is introduced as a fully-fledged stack of ontology languages.

2.1. Metadata

One way of extending the existing structure of the WWW with semantics, and thus enabling the Semantic Web, is by adding metadata. Metadata is a combined word from "meta" (Greek: between, after, later) and "data" and stands for "data about data", i.e. data that describes other data, viz. data that identifies and describes an information object. In the Semantic Web environment, metadata is understood as "data describing Web resources" [Manola and Miller, 2004]. The Digital Library Federation (DLF)1, a coalition of 15 major research libraries in the USA, defines three types of metadata which can apply to objects in a digital library:

• Descriptive Metadata: This metadata describes the information object, relating to what the object contains or is about.

• Administrative Metadata: It indicates the who, what, where and how aspects associated with the object's creation and preservation.

• Structural Metadata: It ties each object to others to make up logical units (for example, information that relates individual images of pages from a book to the others that make up the book itself).

1http://www.diglib.org/dlfhomepage.htm


In general, only descriptive metadata is visible to the users of a system, who search and browse it to find and assess the value of items in the collection. Administrative metadata is usually only used by those who maintain the collection, and structural metadata is generally used by the interface which compiles individual digital objects into more meaningful units (such as journal volumes). In addition, metadata can be classified along the following dimensions:

• Formality: With regard to formality, metadata may range from very in- formal descriptions of documents, e.g. free text summaries of books, up to completely formal descriptions.

• Containment: In the context of containment, described data can be in- cluded in metadata, e.g. the describing tags are added to the existing code, while ”others may be stored completely independently from the document they describe, such as a bibliography database that classifies the documents it refers to, but does not contain them”.

• Capturing of Information Content (cf. [Kashyap and Sheth, 1996]): Kashyap and Sheth classify:

– Content Independent Metadata: Examples of this type of metadata are location and modification date.

– Content Dependent Metadata: Examples are the size of a document or the max-colors of an image. Content dependent metadata is further categorized:

  ∗ Direct Content-based Metadata: A popular example of this is a full-text index based on the text of the documents.

  ∗ Content-descriptive Metadata: This type describes the content without direct utilization of the content of an information object. An example of this type is a textual annotation describing the contents of an image. This type of metadata comes in two varieties:

    · Domain-Independent Metadata: Examples are HTML/SGML document type definitions.

    · Domain Specific Metadata: This metadata is specific to the application or subject domain of information. Examples are relief and land-cover from the GIS domain, and area and population from the Census domain.

The metadata under discussion in this thesis is, according to the above classification, descriptive, formal, domain-specific metadata, based on ontologies, for documents available on the Web. The metadata may be contained in the Web document or stored externally.


2.1.1. Metadata Standards

A well-known use case for metadata is a library, where classification systems such as the Dewey Decimal Classification2 or the machine-readable MARC standard3 enable easy access to literature. The MARC formats are standards for the representation and communication of bibliographic and related information in machine-readable form. An example of a widely known and used metadata standard is Dublin Core4. It was invented for the format- and content-based processing of network resources, mainly the WWW. Its roots are in the online library community; its first workshop was held in 1995 by the OCLC (Online Computer Library Center) and the NCSA (National Center for Supercomputing Applications) in Dublin, Ohio, giving the initiative its name. Dublin Core comprises fifteen core elements, which can be used for the metatags in an HTML document; see e.g. [Kunze, 1999] for a description of the use of HTML with Dublin Core. Using Dublin Core it is possible to do a content-based document classification, as for instance known from the classical library process (e.g. identification of author, title, subject, language, source etc.). The Dublin Core elements can be grouped into three element classes, describing content, intellectual property and instantiation [Hillmann, 2001] (see Table A.1). Other types of metadata schema are MPEG-75, EAD (Encoded Archival Description)6, IEEE LOM (Learning Objects Metadata)7, ADL SCORM8 and RSS (RDF Site Summary)9.
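For illustration, a minimal sketch of how a few Dublin Core elements might appear as metatags in the head of an HTML document (the concrete values are illustrative; see [Kunze, 1999] for the conventions):

<head>
  <meta name="DC.title" content="Creating Ontology-Based Metadata by Annotation">
  <meta name="DC.creator" content="Siegfried Handschuh">
  <meta name="DC.subject" content="Semantic Annotation">
  <meta name="DC.language" content="en">
</head>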

2.1.2. Metalanguages

Metadata needs to be expressed in a standardized way, i.e. the syntax of metadata is given by metalanguages. Metalanguages, languages for describing languages, can be used either as "underlying representation languages for ontologies" [Fensel, 2001, abstract] or as representation languages themselves. The representation languages, also called ontology languages, represent the resources, viz. the knowledge available on the Web, in the taxonomy of an ontology (explained later in Section 2.5).

2http://www.oclc.org/dewey/ 3http://www.loc.gov/marc/gg 4http://dublincore.org 5http://xml.coverpages.org/mpeg7.html 6http://www.loc.gov/ead 7http://ltsc.ieee.org/wg12/ 8http://www.adlnet.org/ 9http://web.resource.org/rss


2.2. XML

In 1989, Tim Berners-Lee and Vinton Cerf created, with the HyperText Markup Language (HTML), the basic technology for the current WWW. The success of the Web was brought about by the simplicity of HTML for describing and rendering documents in the WWW. However, the disadvantages of HTML (cf. [Fensel, 2001, p.52]) became obvious with the enormous growth of the Web:

• Missing semantics inhibits automatic information retrieval and processing. The lack of structured data leads to strong limitations and makes it difficult for software agents to provide services.

• A customization of tags for different applications is not possible. Therefore, it can be stated that HTML may be regarded as a "layout representation language" rather than a technology to support knowledge management, communication, and electronic commerce.

To overcome these shortcomings of HTML, the W3C introduced an additional standard for defining the data structure of Web documents: the eXtensible Markup Language (XML). XML, a "child" of the more complex Standard Generalized Markup Language (SGML), is a tag-based language for describing tree structures with a linear syntax. Providing seven different means for presenting information [Bray et al., 2004], the metalanguage has the following advantages:

• XML tags define the structure of the data.

• XML provides arbitrary trees (graphs) as data structures.

• XML allows the definition of application-specific tags and can therefore be customized for the exchange of specific data structures.

Listing 2.1 shows an example of an XML document. It defines application-specific tags such as <person> and <country>, ordered in a hierarchical structure. Note that the structure of an XML document is not the same as the structure it represents: the object model of a document that represents a business object is a model of the document, not of the business object. In order to model the arbitrary trees, various XML schemes exist which define a grammar for XML documents, e.g. the Document Type Definition (DTD) and XML Schema10. Listing 2.2 shows the DTD and Listing 2.3 the XML Schema for the example in Listing 2.1. The XML schemes resemble ontologies because they define tags, their nesting, and attributes for tags [Fensel, 2001, p.72].

10http://www.w3.org/XML/Schema

The schemes, however, lack any notion of inheritance, define the order in which tags appear in a document, and cannot express hierarchies in the manner required by ontologies. Another strength of XML is the use of the eXtensible Style Language11 (XSL), which defines how a browser should render XML documents in a more expressive way than Cascading Style Sheets12 (CSSs) do for HTML documents. That is to say, XSL can be used to create different pages from the same data sources and thus to "realize dynamically changing pages according to user preferences or contexts" [Fensel, 2001, p.57]. This is especially important for one-to-one marketing strategies in the WWW. However, XSL has more powerful uses than merely the creation of layouts. The disadvantage that XML does not provide standard DTDs, i.e. that different users can define different tags for the same concept, is overcome by XSL: XSL can be used to translate XML documents using DTD1 into XML documents using DTD2 and thus facilitates the exchange of data between different users (see the sketch after Listing 2.3). Concerning the weaknesses of XML, the use of different DTDs, solved by mapping, has already been mentioned. Another problem is inherent to XML: it allows users to create their own tags, e.g. <person>, which seem to carry some semantics. From a computational perspective, however, these tags carry as much semantics as a layout tag in HTML [Decker, 2004].

A computer does not by itself attempt to interpret the meaning of labels [Collier, 2001, p.2]. For example, it has no idea what a person is and how the concept "person" is related to a concept "Web page", e.g. that the person is the creator of a Web page. Thus, it can be concluded that XML enables every user to enrich their Web sites with self-defined, invisible, machine-readable tags, but XML only structures the information and does not tell us anything about what the structure means.

<ontoweb>
  <person>
    <name>Steffen Staab</name>
    <country>Germany</country>
  </person>
  <person>
    <name>Stefan Decker</name>
    <country>Ireland</country>
  </person>
</ontoweb>

Listing 2.1: Example for XML

11http://www.w3.org/Style/XSL/ 12http://www.w3.org/Style/CSS/


<!DOCTYPE ontoweb [
  <!ELEMENT ontoweb (person*)>
  <!ELEMENT person (name, country)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT country (#PCDATA)>
]>

Listing 2.2: Example for XML DTD

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="ontoweb">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="person" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="name" type="xsd:string"/>
              <xsd:element name="country" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Listing 2.3: Example for XML Schema
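To illustrate the DTD-mapping use of XSL discussed above, here is a minimal sketch of a stylesheet that renames one tag into another; the element names writer and author are hypothetical stand-ins for the vocabularies of DTD1 and DTD2:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- rename the (hypothetical) DTD1 element writer to the DTD2 element author -->
  <xsl:template match="writer">
    <author><xsl:apply-templates select="@*|node()"/></author>
  </xsl:template>
  <!-- identity template: copy all other nodes and attributes unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>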

2.3. XML Pointer Language (XPointer)

As mentioned above in Section 2.1, metadata can be contained within the data or stored completely independently from the data, e.g. a Web document. For the latter, one needs a reference from the metadata to the document. Hence, we describe here the parts of the XPointer language that are useful for the Semantic Annotation of a document, viz. the addressing of strings and nodes. XPointers are able to reference a defined subset of an XML document: single points, strings, nodes and arbitrary combinations of these. The XPointer [DeRose et al., 2001] design is factored into a basic framework, the XPointer Framework [Grosso et al., 2002], and three additional schemes: XPointer element() [Grosso et al., 2003], for addressing elements by their positions in the document tree; XPointer xmlns() [DeRose et al., 2003], for binding namespace prefixes to namespace names; and XPointer xpointer() [DeRose et al., 2002], for full XPath-based addressing. The framework and the schemes define the syntax and the semantics of XPointers. They extend the XPath recommendation (see next section), which describes a mechanism to address a node in an XML document. An XPointer identifies fragments of an XML document (e.g. external parsed entities13).

The target XML document is called the resource. The identified fragment is called a subresource. A subresource is the content of a location-set. A location-set consists of nodes (XPath nodes), points, and ranges. Possible nodes are the root node, element nodes, attribute nodes, processing instruction nodes, comment nodes, text nodes and namespace nodes. Ranges are a part of a document between a start and an end point; they can cover several text nodes. An XPointer processor is a software component that interprets XPointers and localizes the corresponding fragment.

2.3.1. XPath Basics

The XPath standard is extensive and enables unique expressions to reference nodes. For a complete description of this standard see [Clark and DeRose, 1999]. To demonstrate the simplest form of addressing we take the following example:

<ontoweb>
  <person>
    <name>Steffen Staab</name>
    <country>Germany</country>
  </person>
  <person>
    <name>Stefan Decker</name>
    <country>Ireland</country>
  </person>
</ontoweb>

The simplest form of an XPath resembles a path expression as known from file systems. The sequence of nodes along the hierarchy to the target node is separated by slashes. The expression /ontoweb/person[2]/country selects the element of type country with the content Ireland. The example shows that one references the specific contextual position of an element with square brackets [ ]; here, the second person element is referenced. If a location path begins with a double slash, all nodes which comply with the stated conditions are located, regardless of their position in the document. Hence, the expression //person/country references all country elements that are child elements of a person element. The reader may note that an XPath expression might also refer to more than one node.

13An external parsed entity need not be well-formed. In particular, it may lack a single root element.


2.3.2. XPointer Standards

The syntactic scheme of an XPointer is:

URI of XML document + "#" + XPointer

The XPointer can consist of several parts, which are separated by white space. An XPointer part follows a syntax scheme that is defined in the XPointer Framework, the XPointer element() scheme, or the XPointer xpointer() scheme. The basic forms of these schemes are as follows. All XPointer processors must be able to support the XPointer Framework; support for the other schemes is optional. The XPointer Framework defines shorthand pointers, which have the basic form:

URI of XML document + "#" + Name

Name refers to an element that has an ID attribute with the value Name. This attribute has to be declared in the DTD or XML schema corresponding to the XML document. Given is the following example document:

<personlist>
  <person id="56">Steffen</person>
  <person id="57">Rudi</person>
</personlist>

When the attribute id is declared in the DTD or XML schema, then the expression

http://www.ontoweb.org/personlist.xml#57

references the second person element. The XPointer element() scheme defines a simple syntax to access elements via the tree structure of the document. For example,

http://www.ontoweb.org/personlist.xml#element(/1/5/3)

locates the third child element of the fifth child element of the first root node (external entities can have several root nodes). The XPointer xpointer() scheme provides the most comprehensive possibilities for referencing parts of an XML document. One can use all XPath expressions to identify a node. Additionally, the scheme defines how to address points and ranges inside nodes and beyond the borders of nodes (see Section 2.3.3). For example,

...#xpointer(/person/researcher/attribute::name[@area="annotation"])

references the attribute name of all researcher elements that are children of the person element and whose attribute area has the value annotation.


The XPointer xmlns() scheme defines a syntax for XPointer parts which allows a namespace to be defined for the following XPointer parts. To access the elements in the following document,

<c:organization xmlns:c="http://www.ontoweb.org/org"
                xmlns:p="http://www.ontoweb.org/person">
  <p:name>Steffen Staab</p:name>
</c:organization>

the xpointer() function has to use the qualified element names. This is done with the help of the xmlns() scheme as follows:

xmlns(c=http://www.ontoweb.org/org) xmlns(p=http://www.ontoweb.org/person)
xpointer(/c:organization/p:name)

The XPointer Framework offers the possibility of custom XPointer schemes:

...#mySchema(anyString)

However, one then also has to develop the XPointer processor for that custom scheme. Custom schemes enable interested groups to develop their own scheme for specific applications. Such a scheme exists for SVG (Scalable Vector Graphics)14, for example:

MyDrawing.svg#svgView(viewBox(0,200,1000,1000))

The reader may note that one could also envisage a custom scheme for Semantic Annotation.

2.3.3. Referencing Text Passages with XPointers

One requirement for annotation is the referencing of single words or text passages in a document (e.g. an XHTML document). For this purpose one can use the extensions to XPath's basic node types, viz. point and, in particular, range. A point represents a location in a document which it is possible to reference. A point can lie between two characters of a text node, and before and after any node of a document. A point has no dimension, i.e. it does not even contain a single character. A point is referenced by a container node, in which the point is located, and an index that indicates the position of the point inside the node. The index is zero-based. Figure 2.1 (from [DeRose et al., 2002]) shows the points and nodes of the following document and the corresponding indexing.

<p>hello, <em>big</em> world</p>

With the XPointer node test point() one can directly reference points in a document. The expression

14http://www.w3.org/TR/SVG/linking.html#SVGFragmentIdentifiers


Figure 2.1.: Indexing of points and nodes

xpointer(/p/em/point()[2])

references the point directly before the letter 'g': the expression /p/em defines the node that includes the point, and point()[2] the point at the location with index 2 inside the context node (cf. Figure 2.1). A range is defined by a start point and an end point; it represents the XML structure and the content between the two given points. The string value of a range is the concatenation of all characters which lie inside a text node and are located between the start and the end point. The XPointer xpointer() scheme offers several functions. The two most important functions for the purpose of annotation are the range-to() and the string-range() functions. The range-to() function returns, for a given start and end point, the corresponding range. This function is called as:

context-location/range-to(end-location)


The start point of the resulting range is the start point of the context location; the end point of the range is the end point of the end location. For instance

<html>
  <body>
    <p>Today there will be a talk given by Tim <em>Berners-Lee</em> in room 244.</p>
  </body>
</html>

The expression

/html/body/p/range-to(/html/body/p)

returns a range with the start point immediately before the p element and the end point immediately after the p element. The expression

/html/body/p/point()[36]/range-to(/html/body/p/point()[41])

returns a range with a start point immediately before the word Tim and the end point immediately after the em element. It therefore represents the document part

Tim <em>Berners-Lee</em>

and its string value is Tim Berners-Lee. The range-to() function is suitable to reference a range when one knows the coordinates (container node and index) of the start and end points. In its simplest form the string-range() function is called with a location-set and a search string:

string-range(locset, string)

As noted before, a location-set may consist of nodes (XPath nodes), points and ranges. For our purpose, only the cases are relevant where locset is a node to which an XPath expression refers. The string value of this node is searched for the given string; the resulting start and end points are the result of the function. For instance, the expression

string-range(/html/body, "Tim Berners-Lee")

results, for the above document, in a range with a start point immediately before Tim (i.e. container node /html/body/p, index 36) and the end point immediately after Lee (i.e. container node /html/body/p/em, index 11). The referenced document part of the range is

Tim <em>Berners-Lee</em>

The string value is Tim Berners-Lee.


It is possible that the search string occurs more than once in the node; in this case the function returns a range for every occurrence. Should the search string be an empty string, it will be found before the first character, between every two characters, and after the last character in the node. Additional parameters are possible for string-range():

string-range(locset, string, number1, number2)

whereby

number1 - the start index of the range, relative to the index of the found string. The reader may note that the string-range() function uses 1-based indexing.

number2 - the number of characters in the range.

In the above example, the expression

string-range(/, "Tim Berners-Lee", 5, 11)

leads to a range with a start point immediately before Berners and an end point immediately after Lee. The referenced document part and the string value are each Berners-Lee. In conclusion, one can reference passages of text with ranges. The important XPointer functions for this purpose are the range-to() function and the string-range() function.
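Putting the pieces together for annotation: external metadata could reference the passage above with a full XPointer of the following form (the document URL is illustrative):

http://www.example.org/talk.html#xpointer(string-range(/html/body/p, "Tim Berners-Lee"))

This is exactly the kind of reference that stand-off Semantic Annotation needs in order to point from a metadata statement back into the text of a Web document.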

2.4. RDF and RDFS

The Resource Description Framework (RDF) [Manola and Miller, 2004] provides a means for adding metadata annotations to Web resources. RDF is a semantic data model and an attempt to address the aforementioned semantic limitations of XML. This data model consists of nodes connected by labeled arcs, where the nodes represent resources and the arcs represent properties of these resources. RDF Schema (RDFS) [Brickley and Guha, 2004], on the other hand, is used to define the syntax and semantics of subsequent language layers (and even its own).

2.4.1. The RDF Data Model

A typical RDF statement consists of three elements, synonymous with the subject, predicate, and object in a simple sentence. The basic element of RDF is the triple: a resource (the subject) is linked to another resource (the object) through an arc labeled with a third resource (the predicate).

Therefore, a single triple is a statement that a subject (e.g. a person, a car, a Web site) stands in a specific relation (e.g. "is brother of", "is driven by", "is authored by") to an object (e.g. a person, a Web site). The object of a statement can be another resource, identified by a URI (Uniform Resource Identifier), a literal or a datatype value15. A URI16 can be an HTML document identified by a Uniform Resource Locator (URL), a special form of a URI, such as http://www.w3.org/Overview.html, a specific HTML or XML element within a document [Kahan et al., 2001, p.4], or an object that is not directly accessible via the Web, such as a printed book [Manola and Miller, 2004]. With URIs the markup content can be uniquely defined. An example of a single triple is the following statement: "Siegfried cooperates with Steffen". The subject and the object of the statement need a URI to be identified, e.g. http://www.aifb.de/sha.html#sha to identify Siegfried and http://www.aifb.de/sst.html#sst to identify Steffen. The predicate also needs a URI, e.g. http://annotation.org/iswc#cooperates for cooperates. The different parts of the statement correspond to the RDF triple schema as follows:

Subject (Resource):    http://www.aifb.de/sha.html#sha
Predicate (Property):  http://annotation.org/iswc#cooperates
Object (Resource):     http://www.aifb.de/sst.html#sst

The corresponding graph is shown in Figure 2.2.

Figure 2.2.: RDF statement

In the following, the URIs are abbreviated by using the XML namespace syntax. So instead of writing http://annotation.org/iswc#cooperates, the namespace form iswc:cooperates is used, with the assumption that the substitution of the namespace prefix iswc with http://annotation.org/iswc# is defined (e.g. see Listing 2.4). A very useful feature for the annotation approach is the reification mechanism. RDF reification allows statements to be made about statements. This can be used to provide annotation meta-metadata17, such as a creator, date of annotation, or confidence factor.

15RDF has adopted the XML Schema model of datatypes
16For a clarification of the relationship between URIs, URLs, and URNs see: http://www.w3.org/TR/uri-clarification/
17Meta-metadata, because it is metadata of the statement and the statement is metadata of the annotated document.

The RDF data model offers the predefined resource rdf:Statement and the properties rdf:subject, rdf:predicate, and rdf:object to reify a statement as a resource. Other properties, such as the date of annotation, may then be attached to it. Figure 2.3 illustrates a reified statement.

Figure 2.3.: RDF reification
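As a sketch at the triple level, the reified form of the cooperates statement from Figure 2.2 might look as follows, with one piece of meta-metadata attached (the ann: namespace, the resource name statement-1 and the date value are assumptions for illustration):

ann:statement-1  rdf:type       rdf:Statement
ann:statement-1  rdf:subject    http://www.aifb.de/sha.html#sha
ann:statement-1  rdf:predicate  iswc:cooperates
ann:statement-1  rdf:object     http://www.aifb.de/sst.html#sst
ann:statement-1  ann:date       "2004-05-17"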

In addition to these core primitives, RDF defines three types of containers to represent collections of resources or literals: one distinguishes (i) bags, which are unordered lists, (ii) sequences, which are ordered lists, and (iii) alternatives, which are lists from which a property can take only one value.

2.4.2. RDF Schema

As shown above, the RDF data model is a simple model for describing interrelationships among resources in terms of named properties and values [Brickley and Guha, 2004]. The RDF data model, however, provides no mechanism for declaring these properties, nor does it provide any mechanism for defining the relationships between these properties and other resources. That is the role of RDF Schema (RDFS), which describes how to use RDF to describe RDF vocabularies, thereby giving specific meaning to RDF statements. RDFS basically allows "classes, attributes (property types), value ranges and cardinality constraints for property types" to be defined [Decker et al., 1999, p.9].


By sharing RDF schemes, the re-usability of metadata definitions can be supported [Manola and Miller, 2004]. RDF(S) serves as a lightweight semantic layer. RDFS can also be used directly to describe a lightweight ontology [Fensel, 2001, p.60/61]. The result is only lightweight because RDFS provides rather limited expressive power; this is due to the fact that RDFS lacks a standard for describing logical axioms to model definitions or complex relationships [Staab et al., 2000b, p.5]. Illustrated here is the way in which an ontology can be modeled in RDF(S), by presenting a sample ontology (see Figure 2.4) in the abstract data model.

Figure 2.4.: RDF Layering

The most general class is rdfs:Resource. It has two subclasses, namely rdfs:Class and rdf:Property18 (cf. Figure 2.4). When specifying a domain-specific schema for RDF(S), the classes and properties defined in this schema become instances of these two resources. The resource rdfs:Class denotes the set of all classes in an object-oriented sense. That means that classes like iswc:Person or iswc:Project are instances of the meta-class rdfs:Class. The same holds true for properties, i.e. each property defined in an application-specific RDF Schema is an instance of rdf:Property, e.g. iswc:worksFor.

18Note that Property belongs to the rdf namespace and not to the rdfs namespace.

RDF Schema defines the special property rdfs:subClassOf that establishes the subclass relationship between classes. Since rdfs:subClassOf is transitive, definitions are inherited by the more specific classes from the more general classes; resources that are instances of a class are automatically instances of all superclasses of this class. Similar to rdfs:subClassOf, which defines a hierarchy of classes, another special type of relation, rdfs:subPropertyOf, defines a hierarchy of properties (e.g. one may express that leaderOf is an rdfs:subPropertyOf of worksFor). RDF Schema allows domain and range restrictions to be associated with properties. For instance, these restrictions allow the definition that "a person works for a project". As depicted in the middle layer of Figure 2.4, the domain-specific classes iswc:Person, iswc:Researcher, and iswc:Project are defined as instances of rdfs:Class. In the same way, domain-specific properties are defined as instances of rdf:Property, i.e. iswc:worksFor, iswc:firstName, and iswc:lastName. Additionally, RDF Schema defines more modeling primitives not shown in the figure: the property rdfs:label allows a human-readable form of a name to be defined, and the property rdfs:comment enables comments.

2.4.3. RDF Syntax

RDF(S)19 is an XML application, i.e. one can serialize RDF(S) in XML, and it is thus a layer above XML in the Semantic Web architecture. Therefore, all metadata represented in RDF(S) can also be represented in XML. The RDF recommendation suggests two standard ways to serialize RDF data in XML: the abbreviated syntax and the standard syntax. Listing 2.4 shows, in abbreviated syntax, the XML serialization of an RDF data model, and Listing 2.5 shows the XML serialization of the corresponding RDFS vocabulary. Another format in which RDF(S) can be serialized is "Notation 3" (see Section 2.4.4).

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:iswc="http://annotation.org/iswc#">
  <iswc:Person rdf:ID="Steffen" iswc:firstname="Steffen"
               iswc:lastname="Staab" iswc:country="Germany"/>
  <iswc:Person rdf:ID="Stefan">
    <iswc:firstname>Stefan</iswc:firstname>
    <iswc:lastname>Decker</iswc:lastname>
    <iswc:country>Ireland</iswc:country>
  </iswc:Person>
</rdf:RDF>

Listing 2.4: XML serialization of RDF

19The term RDF(S) will be used instead of “both RDF and RDFS”


<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="Person"/>
  <rdfs:Class rdf:ID="Professor">
    <rdfs:subClassOf rdf:resource="#Person"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="firstname">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
  </rdf:Property>
  <rdf:Property rdf:ID="lastname">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
  </rdf:Property>
  <rdf:Property rdf:ID="country">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
  </rdf:Property>
</rdf:RDF>

Listing 2.5: XML serialization of RDFS

2.4.4. Notation 3

Notation 3, or "N3", is a shorthand alternative serialization of RDF, designed with human readability in mind, i.e. N3 is much more compact and readable than the XML serialization of RDF. The format is being developed by Tim Berners-Lee, with input from Dan Connolly and others. According to the Notation 3 specification20, it was created as an experiment in optimizing the "expression of data and logic in the same language". A subset of N3 is N-Triples21, an extremely constrained language which uses hardly any syntactic sugar at all: it is optimized for reading by scripts and for comparison using text tools. N-Triples contains no formulae, multiple objects, multiline literals, blank bNodes, or QNames. It breaks an RDF graph into separate triples, one on each line. N-Triples generated from the same RDF graph always come out the same, which makes it an effective way of validating the processing of an RDF document. N-Triples is easy to parse and is an excellent lowest common denominator for serializations; therefore it is used by the W3C in the RDF Test Cases document. Listing 2.6 is the N3 equivalent of Listing 2.4, just as Listing 2.7 is the N3 equivalent of Listing 2.5.

20http://www.w3.org/DesignIssues/Notation3 21http://www.w3.org/2001/sw/RDFCore/ntriples/


:Stefan a :Person;
  :firstname "Stefan";
  :lastname "Decker";
  :country "Ireland".

:Steffen a :Person;
  :firstname "Steffen";
  :lastname "Staab";
  :country "Germany".

Listing 2.6: Example for data in N3

:Person a rdfs:Class.

:Professor a rdfs:Class;
  rdfs:subClassOf :Person.

:firstname rdfs:domain :Person; rdfs:range rdfs:Literal.
:lastname rdfs:domain :Person; rdfs:range rdfs:Literal.
:country rdfs:domain :Person; rdfs:range rdfs:Literal.

Listing 2.7: Example for schema in N3

The examples demonstrate that Notation 3, in comparison with an XML serialization, allows a more compact and human-readable notation of RDF graphs. Hence, it will often be used in the later chapters of this thesis.

Excursus: N3 and Rules. In addition to the RDF expressiveness, N3 also provides rules. A simple rule might say something like (in some central-heating vocabulary) “If the thermostat temperature is high, then the heating system power is zero”, or
{ :thermo :temp :high } => { :heating :power "0" }.

The curly brackets here enclose a set of statements, called a formula; all formulae are enclosed by curly brackets. Apart from the fact that its subject and object are formulae, the thermostat example shown above is just a single statement. The => used here is a special predicate that means “implies” and is used to link formulae. It is actually shorthand for the URI log:implies, i.e. http://www.w3.org/2000/10/swap/log#implies. When two formulae are linked with log:implies, the result is a rule; all rules are therefore just a different kind of statement. Formulae take us beyond what we can represent in the current RDF/XML: these rules are not part of standard RDF syntax. A formula can be seen as a collection of statements. For example,
{ :thermo :temp :high. :heating :power "1". } => { :heating :power "0" }.
is another instance of a valid N3 rule. In fact, a formula can be more than just a collection of statements: it may also contain variables. Variables start with a “?”. An example of a rule with variables is:


{ ?x rdfs:subClassOf ?a. ?a rdfs:subClassOf ?b } => { ?x rdfs:subClassOf ?b }.

The rule says: if x is a subclass of a, and a is a subclass of b, then x is a subclass of b.
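In full N3, the universal quantification that the ?-variables abbreviate can also be written explicitly; a sketch assuming the SWAP-style quantifier syntax:

@forAll :x, :a, :b.
{ :x rdfs:subClassOf :a. :a rdfs:subClassOf :b } => { :x rdfs:subClassOf :b }.

Both formulations express the same transitivity rule over the subclass hierarchy.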

2.5. Ontologies

2.5.1. Ontology Definition

There are different definitions in the literature of what an ontology should be, some of which are discussed in [Guarino97]. The most widely accepted definition originates with [Gruber94]: “An ontology is an explicit, formal specification of a shared conceptualization of a domain of interest”. ‘Formal’ refers to the fact that the ontology should be machine-readable. ‘Shared’ reflects the notion that an ontology captures consensual knowledge, that is, it is not private to some individual, but accepted by a group. The reference to ‘a domain of interest’ indicates that one is not concerned with modeling the whole world, but rather with modeling just the parts that are relevant to the task at hand. Obviously, this is still far removed from a precise mathematical definition. One reason for this is that the definition should cover all different kinds of ontologies, and should not be tied to a particular method of knowledge representation. However, as we want to stress structural aspects here, we present a definition of an ontological model that is used in this thesis and is valid as the smallest common denominator for current ontology languages such as OWL or RDF(S). The core “ingredients” of such an ontology are a set of concepts, a set of properties, and the relationships between the elements of these two sets. In detail, an ontology is defined as follows: Definition 2.5.1 A core ontology is a structure

O := (C, ≤C, R, ≤R, A) consisting of

• three disjoint sets C, R and A, whose elements are called concepts, relations and attributes, respectively,

• a partial order ≤C on C, called concept hierarchy or taxonomy,

• a partial order ≤R on R, called relation hierarchy.

We furthermore have two functions, domain: R ∪ A → C and range: R → C.


Relations are considered to be binary.
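For illustration, consider a small invented example of such a structure: C = {Person, Researcher, Project} with Researcher ≤C Person, R = {worksFor} with domain(worksFor) = Person and range(worksFor) = Project, and A = {name} with domain(name) = Person. This fixes the vocabulary and its hierarchy; concrete instances are only introduced by a knowledge base, as defined below.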

Definition 2.5.2 If c1 ≤C c2, for c1, c2 ∈ C, then c1 is a subconcept of c2, and c2 is a superconcept of c1. If r1 ≤R r2, for r1, r2 ∈ R, then r1 is a subrelation of r2, and r2 is a superrelation of r1.

If c1 ≤C c2 and c1 ≠ c2, we write c1 <C c2. If c1 <C c2 and there is no c3 ∈ C with c1 <C c3 <C c2, then c1 is a direct subconcept of c2 (direct subrelations are defined analogously). Definition 2.5.3 A knowledge base is a structure

KB := (CKB, RKB, AKB, I, ιC, ιR, ιA) consisting of

• three sets CKB, RKB and AKB, whose elements are called concepts, relations and attributes, respectively,

• a set I whose elements are called instances or objects,

• a function ιC : CKB → P(I), called concept instantiation,

• a function ιR : RKB → P(I × I), called relation instantiation,

• a function ιA : AKB → P(I × STRING), called attribute instantiation, where STRING denotes the set of all strings.

We assume that each instance i is uniquely identified by some string id(i), called the identifier of the instance i. The set of all identifiers is id(I). Given a knowledge base as above, we define the class assignments AC = {(i, c)|i ∈ ιC (c)} ⊆ I × CKB, the relation assignments AR = {(i, r, j)|(i, j) ∈ ιR(r)} ⊆ I × RKB × I and the attribute assignments AA = {(i, a, s)|(i, s) ∈ ιA(a)} ⊆ I × AKB × STRING.

Conversely, given AC, AR and AA one can obtain ιC , ιR and ιA.

When a knowledge base is given, we can derive the extensions of the concepts and relations of the ontology, based on the concept instantiation and the relation instantiation. Definition 2.5.4 Let KB := (CKB, RKB, AKB, I, ιC, ιR, ιA) be a knowledge base. The extension [[c]]KB ⊆ I of a concept c ∈ CKB is recursively defined by the following rules:


• [[c]]KB ← ιC(c)

• [[c]]KB ← [[c]]KB ∪ [[c′]]KB, for c′ ≤C c.

The extension [[r]]KB ⊆ I × I of a relation r ∈ RKB is recursively defined by the following rules:

• [[r]]KB ← ιR(r)

• [[r]]KB ← [[r]]KB ∪ [[r′]]KB, for r′ ≤R r.

If the reference to the knowledge base is clear from the context, we also write [[c]] and [[r]] instead of [[c]]KB and [[r]]KB. The following definition tells us whether a knowledge base is consistent with an ontology. Definition 2.5.5 A knowledge base KB = (CKB, RKB, AKB, I, ιC, ιR, ιA) is consistent with an ontology O, if all of the following hold:

• CKB ⊆ C,

• RKB ⊆ R,

• AKB ⊆ A,

• [[r]] ⊆ [[dom(r)]] × [[range(r)]] for all r ∈ RKB,

• ιA(a) ⊆ [[dom(a)]] × STRING for all a ∈ AKB.

Ontologies formalize the intensional aspects of a domain. The extensional part is provided by a knowledge base, which contains assertions about instances of the concepts and relations. Defining and instantiating a knowledge base is called ontology population. The reader may note that Semantic Annotation does not just enrich a document with semantic metadata; it also instantiates a knowledge base, viz. it populates an ontology.
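To illustrate the interplay of these definitions with the toy ontology introduced after Definition 2.5.1 (all names invented): take CKB = {Person, Researcher, Project}, RKB = {worksFor}, I = {s, p}, ιC(Researcher) = {s}, ιC(Project) = {p}, ιC(Person) = ∅ and ιR(worksFor) = {(s, p)}. Then [[Researcher]] = {s} and, by the recursive rule for subconcepts, also [[Person]] = {s}. This knowledge base is consistent with the ontology, since [[worksFor]] = {(s, p)} ⊆ [[Person]] × [[Project]].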

2.5.2. Classification

The ontology and knowledge base definitions introduced above provide structures for instantiating ontologies and knowledge bases. These structures enable the development of different types of ontologies. Different classification systems for ontologies have been developed [van Heijst, 1995; Guarino, 1998; Jasper and Uschold, 1999]. A classification system that uses the subject of conceptualization as its main criterion has been introduced by Guarino (cf. [Guarino, 1998]). He suggests developing different kinds of ontologies according to their level of generality, as shown in Figure 2.5. Thus, different kinds of ontologies may be distinguished as follows:


Figure 2.5.: Different kinds of ontologies and their relationships

• Top-level ontologies describe very general concepts like space, time, and events, which are independent of a particular problem or domain. Such unified top-level ontologies aim at serving large communities of users and applications. Recently, these kinds of ontologies have also been introduced under the name foundational ontologies.

• Domain ontologies describe the vocabulary related to a specific domain (such as the researcher community or tourism), e.g. by specializing concepts introduced in a top-level ontology.

• Task ontologies describe the vocabulary related to a generic task or activity (e.g. bargaining or selling), e.g. by specializing concepts introduced in a top-level ontology.

• Application ontologies are the most specific ontologies. Concepts in application ontologies often correspond to roles played by domain entities while performing a certain activity, i.e. application ontologies are a specialization of domain and task ontologies. They form a base for implementing applications with a concrete domain and scope.

In this work, the term ontology refers to domain ontologies and, in particular, to application ontologies.

2.6. Ontology Languages

In the previous section, the general appearance of an ontology, the conceptualization, has been illustrated. To share this conceptualization, a formal knowledge

representation of the ontology and, further, of the knowledge base, viz. the metadata, is needed; it is expressed by an ontology (modeling) language. A couple of representation mechanisms have been developed that allow for the formal representation of ontologies. The most common ontology languages for knowledge representation, also called representation languages, can be differentiated into the following three categories of logics [Fensel, 2001, p.62]:

• Enriched first-order predicate logic languages, e.g. CycL [Lenat and Guha, 1990], Knowledge Interchange Format (KIF) [Genesereth and Fikes, 1992] and Conceptual Graphs (CGs) [Sowa, 1984]

• Frame-based languages, e.g. Ontolingua [Farquhar et al., 1996], Frame Logic [Kifer et al., 1995], and Simple HTML Ontology Extensions (SHOE) [Heflin and Hendler, 2000a]

• Description Logics, e.g. LOOM [MacGregor, 1991].

The above-mentioned ontology languages, in their current status, are not suitable for performing Web-based tasks: they do not combine the advantages of the different categories of logics, do not perform inferencing services well, and limit the volume of the knowledge base. According to Tim Berners-Lee, “the challenge of the Semantic Web [. . . ] is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge-representation system to be exported onto the Web” [Berners-Lee et al., 2001]. Therefore, extensive research is being conducted to develop such languages.

DAML+OIL. A number of research groups in the USA and Europe had already identified the need for a more powerful ontology language. This led to a joint initiative to define a richer language, called DAML+OIL. DAML+OIL combines the DARPA22 Agent Markup Language (DAML) with the Ontology Inference Layer (OIL), the European language, and thus offers “expressive constructs aimed at facilitating agent interaction on the Web” [Noy and McGuinness, 2001]. DAML+OIL in turn was taken as the starting point for the W3C Web Ontology Working Group in defining OWL, the language that is the recommended ontology language of the Semantic Web and is expected to be broadly accepted.

OWL. OWL is syntactically layered on top of RDF. Therefore, the official syntax of OWL is the syntax of RDF. However, OWL extends RDF with additional vocabulary that can be interpreted as OWL ontologies when used to form particular RDF graphs. Consequently, OWL ontologies can be encoded in normal

22Defense Advanced Research Projects Agency


<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.org/ns#">
  <ex:Person rdf:about="#Stefan">
    <ex:firstname>Stefan</ex:firstname>
    <ex:lastname>Decker</ex:lastname>
    <ex:country>Ireland</ex:country>
  </ex:Person>
</rdf:RDF>
Listing 2.8: Example for OWL Instances

RDF/XML documents and parsed into ordinary RDF graphs. The additional vocabulary defined in OWL renders OWL into an ontology language. Several subsets of OWL (cf. [Antoniou and van Harmelen, 2004, p.70]) were defined in the standard to accommodate various interest groups and to allow language features that are not needed for certain applications to be safely ignored.

• OWL Full: OWL Full contains all constructors of the OWL language and allows the arbitrary combination of those constructors. It also allows those constructors to be combined with RDF and RDF Schema. OWL Full is fully upward compatible (syntactically and semantically) with RDF. Hence, any legal RDF document is also a legal OWL Full document. However, this compatibility leads to “uncontrollable computational properties”[Antoniou and van Harmelen, 2004, p.70] and brings with it the disadvantage that OWL Full is “undecidable, dashing any hope of complete (let alone efficient) reasoning support”[Antoniou and van Harmelen, 2004, p.70].

• OWL DL: OWL DL is a subset of OWL. In order to regain computational efficiency it restricts the way in which the constructors of OWL and RDF can be used. As a drawback one loses full RDF compatibility. A legal OWL DL document will still be a legal RDF document but not necessarily vice versa. “An RDF document will in general have to be extended in some ways and restricted in others before it is a legal OWL DL document”[Antoniou and van Harmelen, 2004, p.71].

• OWL Lite: OWL Lite is a subset of OWL DL and thus has restricted expressivity. Among other things, it excludes disjointness, enumerated classes and cardinalities greater than one. “The advantage of this is a language that is both easier to grasp (for the user) and easier to implement (for tool builders)”[Antoniou and van Harmelen, 2004, p.71]. A small sketch of such an excluded constructor follows below.
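As an illustration (a hedged N3 sketch with invented class names), the following disjointness axiom is expressible in OWL DL and OWL Full, but not in OWL Lite:

:Person a owl:Class.
:Project a owl:Class; owl:disjointWith :Person.

A reasoner may use such an axiom, e.g., to flag any resource typed as both Person and Project as inconsistent.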

Listing 2.8 shows that instances of classes – in OWL called individuals – are declared as in RDF (cf. Listing 2.4). Note that OWL, like RDF(S), can be


<!DOCTYPE rdf:RDF [
  <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="Person"/>
  <owl:DatatypeProperty rdf:ID="firstname">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="&xsd;string"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:ID="lastname">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="&xsd;string"/>
  </owl:DatatypeProperty>
  <owl:DatatypeProperty rdf:ID="country">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="&xsd;string"/>
  </owl:DatatypeProperty>
</rdf:RDF>
Listing 2.9: Example for OWL Ontology

serialized in XML and N3. The ontology corresponding to the instance data can be found in Listing 2.9.
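As an illustration of the latter, a hedged N3 sketch of an OWL ontology mirroring the schema of Listing 2.7 (the owl:, rdfs: and xsd: prefixes are assumed to be bound to the usual W3C namespaces):

:Person a owl:Class.
:Professor a owl:Class; rdfs:subClassOf :Person.
:firstname a owl:DatatypeProperty; rdfs:domain :Person; rdfs:range xsd:string.
:lastname a owl:DatatypeProperty; rdfs:domain :Person; rdfs:range xsd:string.
:country a owl:DatatypeProperty; rdfs:domain :Person; rdfs:range xsd:string.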


3. Semantic Annotation for the Web

This chapter is about the vision of the Semantic Web (Section 3.1). It shows the Semantic Web Layer Cake based on the metadata languages introduced before, and it demonstrates how to apply these technologies in order to provide Semantic Annotation. To this end, it presents the necessary infrastructure (Section 3.2) and the underlying processes (Section 3.3). Subsequently, Section 3.4 defines i) the terminology (Section 3.4.1), ii) the semantics (Section 3.4.2) and iii) the content (Section 3.4.3) of the markup that we aim to create for the Semantic Web.

3.1. The Semantic Web

The Semantic Web aims at machine-processable information. The step from the current Web to the Semantic Web is the step from the manual processing of information to the automatic processing of information. This step is comparable to the step from the manual processing of goods to the machine processing of goods at the beginning of the industrial revolution. Hence, the Semantic Web can be seen as the dawn of the informational revolution. The Semantic Web enables automated intelligent services such as information brokers, search agents, information filters etc. The Semantic Web, consisting of machine-processable information, will be enabled by further levels of interoperability. Technology and standards need to be defined not only for the syntactic representation of documents (like HTML), but also for their semantic content. Semantic interoperability is facilitated by recent W3C standardization efforts, notably XML/XML Schema (cf. Section 2.2), RDF/RDF Schema (cf. Section 2.4) and OWL (cf. Section 2.6). The technology stack envisioned by the W3C is depicted in Figure 3.1. As is apparent, XML as well as XML Schema form the second layer above URIs and Unicode. The third layer is RDF and RDFS. The next layer is the ontology language. On top of the ontology language, there is a need for a language to express logic, so that information can be inferred and better put into relation. Once there is logic, it makes sense to use it to prove things. The proof layer enables everyone to write logic statements, and an agent can follow these Semantic


Figure 3.1.: Semantic Web Layer Cake

“links” to construct proofs, so that the validity of a statement, especially an inferred statement, can be checked. The proof layer combined with digital signatures will lead to trust. Consequently, ontologies and ontology-based metadata are the basic ingredients of the Semantic Web layer cake. An important question is therefore how to create and use ontologies and ontology-based metadata.

3.2. Infrastructure for the Semantic Web – The Information Foodchain

Stefan Decker defines in his thesis [Decker, 2002], with the information foodchain for the Semantic Web, an infrastructure for the Semantic Web and presents applications for ontology engineering and metadata creation, among others. The foodchain (Figure 3.2) starts with the construction of an ontology – preferably using a support tool, an ontology construction tool1. The ontology is the foundation for a set of data items. The next part of the information foodchain is a tool to support the task of structuring HTML pages. A Web page annotation tool (cf. Chapter 4) provides the means for browsing an ontology and for selecting appropriate terms of the ontology and mapping them to sections of

1For example, the Ontology Engineering environment, OntoEdit (cf. [Sure et al., 2002b])


Figure 3.2.: Information Foodchain for the Semantic Web

a Web page. The Web page annotation process creates a set of annotated Web pages, which are available to an automated agent to achieve its task. Of course, the annotation process itself has a human component: although the effort to generate the annotation of a Web page is an order of magnitude lower than the creation of the Web page itself, there has to be some incentive to expend the extra effort. The incentive for the creation of the annotation (which is metadata for the Web page) becomes visible in a community Web portal, which presents a community of interest distributed on the Web to the outside world in a concise manner. The data collected from the annotated Web pages simplifies to a significant extent the task of maintaining a community Web portal, because changes are incorporated automatically, without any manual work. An automated agent itself needs several sub-components: an important task of the agent is the integration of data from several distributed information sources. Because of the need to describe the relationships between the data in a declarative way (otherwise the agent has to be programmed for every new task), an agent needs

an inference engine for the evaluation of rules and queries. The inference engine is coupled with a metadata repository – the memory of the agent, where retrieved information is cached. Furthermore, if an automated agent browses the Web, it will usually encounter data formulated in unknown ontologies. Therefore, it needs the facility to relate unknown ontologies to ontologies with which it is already familiar. This facility is an Ontology Articulation Toolkit for information mediation.

3.3. Processes for the Semantic Web – Knowledge Process and Knowledge Meta Process

While the information foodchain presents the necessary tools and infrastructure, this section presents the underlying processes of ontology and metadata creation. Two central processes are distinguished in the thesis of York Sure [Sure, 2003], based on the duality of ontology and metadata: the development of an ontology, named the knowledge meta process, and the subsequent creation of a knowledge base, named the knowledge process (see Figure 3.3).

Figure 3.3.: Two orthogonal processes with feedback loops

The knowledge meta process comprises all aspects that are necessary for the creation of an ontology as well as its extension and adaptation. The knowledge process describes in particular the steps for the creation and processing of ontology-based metadata.


3.3.1. Knowledge Meta Process

The knowledge meta process is devoted to the modeling of ontologies. The process can be regarded as a form of reverse-engineering, because the structure underlying the resources needs to be derived from the Web resources with the help of domain experts. But this derivation is not exact: the result, the ontology, and also the process, the steps to an ontology, are variable. Considering the result: since each ontology designer has a certain application in mind for which he designs his ontology, and has his own understanding of the considered domain, a multitude of ontologies can be created for one domain. The dependence on an application, however, should not be so strong that it limits the re-usability of an ontology [Noy and McGuinness, 2001, p.4]. With regard to the process, “no single correct ontology-design methodology” exists [Noy and McGuinness, 2001, p.4] which describes the steps for designing an ontology. Proposals for methodologies, however, have been developed by various researchers, such as the Buchanan methodology, the Uschold methodology [Uschold1996], and Methontology by L´opez et al. [Lopez1999]. The methodologies have in common approximately the following four steps: specification, conceptualization/refinement, implementation, and evaluation. Further steps which can be added are a feasibility study (preceding the four steps mentioned above) and a maintenance phase (following the four steps mentioned above) [Staab et al., 2001c, p.2]. The feasibility study should support the decision whether the creation of an ontology is useful in a certain domain of knowledge. The maintenance phase is important so that ontologies keep track of changes in the real world and evolve with time.

3.3.2. Knowledge Process

In the knowledge process (Figure 3.4), which is orthogonal to the knowledge meta process, metadata is created to describe the relevant information of some resources with the use of an ontology. The resulting metadata consists of instances of specific concepts connected by properties (attributes and relations) and axioms. The output of the metadata creation process is the knowledge base, viz. a collection of this metadata in a specified formal representation language. In the event that Web resources are described, the knowledge base or a part of it can be embedded in an existing Web page description to provide semantic information for intelligent agents in the WWW. The knowledge base can be extended by inferring “new” facts on the basis of defined axioms. This creation process is the topic of this thesis. The reader may note that there exists no methodology for the design of metadata as there is for ontology design.


Figure 3.4.: The knowledge process

3.4. Semantic Annotation

In the following we define the terminology, the semantics and the content of metadata creation as envisioned in this thesis.

3.4.1. Terminology

The terminology used in this thesis has been elaborated because many of the terms that are used with regard to metadata creation tools carry several ambiguous connotations that imply conceptually important decisions.

• Ontology: An ontology is a formal, explicit specification of a shared conceptualization of a domain of interest (cf. Section 2.5). In our case, an ontology is defined in RDF(S) or OWL. Hence, an ontology is constituted by statements expressing definitions of OWL classes – RDF(S) resources, respectively – and properties ([OWL Reference, 2004], [Brickley and Guha, 2004]).

• Annotations: An annotation in our context is a set of instantiations attached to an HTML document. We distinguish (i) instantiations of OWL classes, (ii) instantiated properties from one class instance to a datatype instance — henceforth called attribute instance (of the class instance), and


(iii) instantiated properties from one class instance to another class instance — henceforth called relationship instance. Class instances have unique URIs, e.g. ’http://www.aifb.uni-karlsruhe.de/WBS/sst/#Steffen’. They frequently come with attribute instances, such as a human-readable label like ‘Steffen’.

• Metadata: Metadata are data about data. In our context the annotations are metadata about the HTML documents.

• Relational Metadata: We use the term relational metadata to denote annotations that contain relationship instances. Often, the term “annotation” is used to mean something like “private or shared note”, “comment” or “Dublin Core metadata”. This alternative meaning of annotation may be emulated in our approach by modeling such notes with attribute instances. For instance, a comment note “I like this paper” would be related to the URL of the paper via an attribute instance ‘hasComment’. In contrast, relational metadata also contain statements like ‘Siegfried cooperates with Steffen’, i.e. relational metadata contain relationships between class instances rather than only textual notes, as the following sketch illustrates.
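A hedged N3 sketch of this difference (the property names hasComment and swrc:cooperateWith as well as the paper URL are illustrative, not taken from the actual SWRC schema):

<http://www.example.org/paper.pdf> :hasComment "I like this paper".
<http://www.aifb.uni-karlsruhe.de/WBS/sha/#Siegfried> swrc:cooperateWith <http://www.aifb.uni-karlsruhe.de/WBS/sst/#Steffen>.

The first statement merely attaches a textual note to a document, whereas the second relates two class instances and can therefore be combined with other relational metadata in semantic queries.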

Figure 3.5 illustrates our use of the terms “ontology”, “annotation” and “relational metadata”. It depicts some part of the SWRC2 (Semantic Web Research Community) ontology. Furthermore it shows two homepages, viz. pages about Siegfried and Steffen (http://www.aifb.uni-karlsruhe.de/WBS/sha and http://www.aifb.uni-karlsruhe.de/WBS/sst, respectively) with annotations given in an XML serialization of RDF facts. For the two persons there are instances denoted by corresponding URIs (http://www.aifb.uni-karlsruhe.de/WBS/sst/#Steffen and http://www.aifb.uni-karlsruhe.de/WBS/sha/#Siegfried). The swrc:name of http://www.aifb.uni-karlsruhe.de/WBS/sha/#Siegfried is “Siegfried Handschuh”. In addition, there is a relationship instance between the two persons, viz. they cooperate. This cooperation information ‘spans’ the two pages.

3.4.2. The Semantics of Semantic Annotation

There are two players in the annotation game: the annotation provider and the annotation consumer. The key is, as Bechhofer et al. point out [Bechhofer and Goble, 2001], that consumer and provider share underlying assumptions about the annotation. Part of this assumption is the common ontology, but part of it is how the terms of the ontology are to be used.

2http://ontobroker.semanticweb.org/ontos/swrc.html


Figure 3.5.: Annotation example.

In the literature about Semantic Annotation it is evident that different assumptions exist about the nature and the scope of Semantic Annotation. Bechhofer et al. [Bechhofer and Goble, 2001] differentiate between:

• Decoration: The idea of creating a kind of user comment about Web pages. This is the view that Annotea has adopted for its annotations (cf. [Kahan et al., 2001]).

• Linking: To annotate a document with further links. The provision of dynamic linking as annotation is used by the COHSE project (cf. Section 3.1.1 in [Bechhofer and Goble, 2001]). Annotation within COHSE can be seen as a mechanism that allows the user to specify possible link anchors within a document, with the anchor being associated with a conceptual description.


• Instance Identification: We make the assertion that there is some resource on the Web, such that it is an instance of a concept, and the identifier of the instance, viz. the URI, identifies the resource. This means the instance about which the annotation is being made can be directly accessed via the given URI.

• Instance Reference: We make the assertion that there is some individual in the world, such that it is an instance of the concept, and the identifier of the instance, viz. the URI, identifies not the individual itself, but a reference in a document to the real-world individual. This is the semantics of annotation that is used by our annotation framework.

• Aboutness: A user expresses that a particular resource (i.e. a Web page) is about a certain concept, without being an instance of that concept.

• Pertinence: A user expresses that a particular resource (i.e. a Web page) gives further useful information about a concept.

Our viewpoint of annotation is based on instance reference (cf. Section 3.4.1), but we can emulate most of the other annotation types with our framework.

3.4.3. Layering of Annotation

As shown before, there exist different notions of the semantics of Semantic Annotation. Additionally, [Rinaldi et al., 2003] present a layering of annotation which reflects different aspects of the content to be represented:

• Structural Annotation: Used to define the physical structure of the document, its organization into head and body, into sections, paragraphs and sentences.

• Linguistic Annotation: Associated with a short span of text (smaller than a sentence), identifying lexical units. They could also be referred to as Textual Annotations or Lexical Annotations. This corresponds to the grammatical structure in [Euzenat, 2002].

• Semantic Annotation: Corresponds to our view of Semantic Annotation. Similar to the representation of the logical structure of the document in [Euzenat, 2002].

As mentioned before, our focal point lies on Semantic Annotation. However, we aim to present a generic annotation model that is able to deal with most of the semantics of Semantic Annotation, as well as with the aspects or layers of annotation (cf. Section 7.1).


3.5. Summary

The Semantic Web aims at machine-processable information. Fundamental to the vision of the Semantic Web is the use of a formal markup language for the annotation of Web resources, based on technologies facilitated by recent W3C standardization efforts, such as RDF(S) and OWL. One possible infrastructure to realize the Semantic Web is given by the information foodchain. This foodchain envisions an ontology construction tool for the creation of ontologies and a Web page annotation tool for formal markup. From a process point of view, the knowledge meta process comprises all aspects that are necessary for the creation of an ontology, and the knowledge process describes the metadata creation and processing. Hence, the conception and implementation of an annotation framework that feeds the information foodchain and thereby supports the knowledge process is the topic of this thesis. Therefore, we have defined the goal of the annotation process, which is the creation of relational metadata using the semantics of an instance reference and representing the logical structure of a Web page.

Part II.

Metadata for the Semantic Web

“The Web is about links; the Semantic Web is about the relationships implicit in those links.” — Dan Brickley


4. Annotation and Authoring Framework

This chapter presents an annotation and authoring framework, CREAM, which allows for the creation of semantic metadata about static and dynamic Web pages, i.e. for Semantic Annotation of the Shallow and the Deep Web. CREAM supports the manual and the semi-automatic annotation of static Web pages, the authoring of new Web pages with the simultaneous creation of metadata, and the deep annotation of Web pages defined dynamically by database queries. In the following we first describe the case studies from which we took a major part of our experiences for guiding the development of CREAM (Section 4.2). Then, we describe in detail some of the requirements that were derived from the case studies (Section 4.3). In Section 4.4 we derive the design of CREAM from the requirements previously elaborated. In Section 4.5, we specify how the meta ontology may modularize the ontology description from the way the ontology is used in CREAM. In Section 4.6, we explain the major modes of interaction with OntoMat, our implementation of CREAM. References: This chapter is mainly based on [Handschuh et al., 2001], [Handschuh and Staab, 2002] and [Handschuh and Staab, 2003].

4.1. Introduction

The Semantic Web builds on metadata describing the contents of Web pages. In particular, the Semantic Web requires relational metadata, i.e. metadata that describe how resource descriptions instantiate class definitions and how they are semantically interlinked by properties. We have carried out several case studies that build on this idea of the Semantic Web in order to provide intelligent applications that make knowledge about researchers, about companies and markets, and about research papers accessible by semantic means (cf. Section 4.2). An important cornerstone of the case studies — and many other scenarios in the Semantic Web — is a method and a mechanism that let the user easily and comprehensively contribute relational metadata on which the Semantic Web can be built. For this purpose, we have developed our Semantic Annotation framework, CREAM — CREAting Metadata for the Semantic Web — which is presented here. CREAM is designed to enable the easy and comfortable creation of semantic metadata.


CREAM allows for the a posteriori annotation of existing resources. A posteriori creation of metadata involves the consideration of an item (e.g. a document) by an agent (possibly, but not necessarily, a human agent) and its description with metadata. (i) The process may include the identification of elements already existing within the item. For instance, one may consider an HTML page and identify its author by a footer appearing within the page. (ii) One may classify the item as belonging to a category like Business, though neither the term itself nor a synonym or hyponym appears in the item. A posteriori creation of metadata comes with a major drawback: in order to provide metadata about the contents of a Web page, the author must first create the content and second annotate the content in the additional, a posteriori annotation step. To avoid this overhead work, we propose that, (iii), an author has the possibility to easily combine the authoring of a Web page and the creation of relational metadata describing its content. Scrutinizing the differences between the modes of interaction (i), (ii), and (iii), we found that the major problems one must deal with in an annotation framework are identical. In fact, we found it preferable to hide the border between annotation (i.e. (i) and (ii)) and authoring (i.e. (iii)) as far as possible within CREAM. Some questions, however, were triggered by the need to modularize the ontology that defines the structure of the relational metadata from its use for annotation or authoring. For this purpose, we introduce here a meta ontology that describes how the annotation and authoring modes of OntoMat interact with classes and properties of the ontology proper. Thereby we modularize the ontology parts needed in the metadata creation process from the ones relevant for the targeted content description. For the CREAM framework we have also provided a reference implementation, viz. OntoMat, which is freely available for download.1

4.2. Case Studies for CREAM

There are many scenarios in which Semantic Annotation may prove beneficial. Below, we describe three case studies that we have performed and that have guided our development of CREAM. The case studies have in common that they require the generation of metadata given an HTML document from which a human could identify relevant metadata entities. During the studies, we repeatedly encountered several principal problems. Some of the problems were mostly syntactic, viz. people would easily make syntactic errors, e.g. closing XML parentheses incorrectly or not at all. In the further course, however, we found that people violated semantic constraints, too, e.g. they would

1http://annotation.semanticweb.org

give a string instead of an object identifier. Also, human annotators would violate pragmatic constraints, e.g. have wrong assumptions about the existence of object descriptions in the Semantic Web and, thus, make errors when referring to people.

4.2.1. KA2 Initiative and KA2 Portal

The origin of our work facing the challenge of creating relational metadata dates back to the start of the seminal KA2 initiative [Benjamins et al., 1999], i.e. the initiative for providing semantic markup on HTML pages for the knowledge acquisition community and its presentation in a Web portal [Staab et al., 2000a]. The KA2 portal2 provides a view onto the knowledge of the knowledge acquisition community. Besides the semantic retrieval provided by the original KA2 initiative, it offers comprehensive means for navigating and querying the knowledge base and also includes guidelines for building such a knowledge portal. The potential users provide knowledge, e.g. by annotating their Web pages in a decentralized manner. The knowledge is collected at the portal by crawling and presented in a variety of ways. Metadata for KA2 was initially contributed by manually editing HTML pages with an ASCII editor. The syntactic errors that arose from the difficulty of this manual task soon became obvious. However, we still had the wrong intuition that a simple tool that would take care of syntax issues would resolve the problem. Thus, OntoPad was developed in order to facilitate editing [Fensel et al., 1999; Decker, 2002]. However, it was soon found that the semantic and pragmatic concerns are as important as the syntactic issues and could hardly be dealt with by the simple tools that were available then.

4.2.2. TIME2Research Portal

In a second case study, we used Semantic Annotation in order to provide knowledge about the TIME (telecommunication, IT, multimedia, e-business) markets in the TIME2Research portal [Staab and Maedche, 2001]. The principal idea of the TIME2Research portal is that business analysts review news tickers, business plans and business reports. A considerable part of their work requires the comparison and aggregation of similar or related data, which may be done by semantic queries like “Which companies provide B2B solutions?”, when the knowledge is semantically available. Therefore, we created a knowledge portal for business analysts that let the analysts provide Semantic Annotation for incoming HTML, Word and Excel documents. Then, they were able to browse through the metadata and the documents and they could perform semantic searches looking for individual as well as semantically aggregated information provided by annotation.
2http://ka2portal.aifb.uni-karlsruhe.de


For the case study we provided OntoAnnotate3. OntoAnnotate gave the human annotator an easy-to-use interface. He was allowed to highlight some piece of HTML text, Word text or an Excel cell and then drag’n’drop it onto a corresponding — possibly instantiated — concept, attribute or relation from the ontology. Through the control exercised by ontology guidance and the browsing of existing facts, most of the semantic and pragmatic problems were dealt with. A substantial part of this thesis is about this part of OntoAnnotate, which is also realized within CREAM. In the TIME2Research case study we found that people need the possibility to easily create documents and their metadata in one step, rather than produce a report and create the Semantic Annotation a posteriori. The reason is that in a commercial setting people are even more resistant to overhead work than in a closed-community, academic initiative like KA2.

4.2.3. Authors’ Annotations of Paper Abstracts at ISWC

The most recent case study involved the markup of papers for the International Semantic Web Conferences — ISWC-20024, ISWC-20035, and ISWC-20046. “Eating their own dog food”, authors of accepted papers were required to annotate the title pages and abstracts of their papers in order to contribute to the actual creation of the Semantic Web and to gather experience in actually providing Semantic Annotation. The authors could, but were not forced to, use the OntoMat annotation tool and a specifically created ISWC ontology. The results from the ISWC abstracts annotation study complemented some of the problematic experiences we gained from KA2. Without a tool, even highly trained authors of highly valued papers produced markup that obviously was not XML (not to mention RDF), as not all parentheses matched correctly. Others who did not use a tool reused existing RDF code — copying the syntactic errors that the original code contained. Given the code produced by OntoMat, some third parties actually took advantage of the metadata to build applications on top. M. Frank et al. produced a spreadsheet-like summary of some of the core data exploiting their WebScripter approach [Frank et al., 2002b]. C. Fillies and F. Weichhardt used their visualization methods in order to create a graphical depiction of the underlying metadata [Fillies and Weichhardt, 2002].7

3OntoAnnotate is now a commercial tool available from Ontoprise GmbH. 4http://annotation.semanticweb.org/iswc/documents.html 5http://annotation.semanticweb.org/iswc2003/ 6http://annotation.semanticweb.org/iswc2004/ 7Cf. also http://www.semtalk.com/pub/sardinia.htm.


4.3. Requirements for CREAM

Given the problems with syntax, semantics and pragmatics experienced in the case studies described above, we can now list a more precise set of requirements. The principal requirements apply to a posteriori annotation as well as to the integration of Web page authoring with metadata creation, as follows:

• Consistency: Semantic structures should adhere to a given ontology in order to allow for better sharing of knowledge. For example, the use of an attribute should be avoided, where the ontology requires a concept instance.

• Proper Reference: Identifiers of instances, e.g. of persons, institutes or companies, should be unique. For instance, the metadata generated in the KA2 case study contained three different identifiers for our colleague Dieter Fensel. Thus, knowledge about him could not be gathered using a straightforward query.8

• Avoid Redundancy: The provision of decentralized knowledge should be possible. However, when annotators collaborate, it should be possible for them to identify (parts of) sources that have already been annotated and to reuse previously captured knowledge in order to avoid laborious redundant annotations.

• Relational Metadata: Like HTML information, which is spread across the Web but related by HTML links, knowledge markup may be distributed, but it should be semantically related. Current annotation tools tend to generate template-like metadata, which is hardly connected, if at all. For example, annotation environments often support Dublin Core [dub, 2001; Dublin Core Metadata Template, 2001; Klarity, 2001], providing a means to state, e.g., the names of the authors of a document, but not their IDs9. Thus, the only possibility to query for all publications of a certain person is to query an attribute such as fullname — which is very unsatisfactory in the case of names like “John Smith”.

• Maintenance: Knowledge markup needs to be maintained, and an annotation tool should support this maintenance task. In the remainder of this thesis we will provide some infrastructure support for the task. In particular, we provide a baseline document management system to enable the handling of changing documents, facts, and ontologies in the future.

8The reader may see similar effects in bibliography databases. E.g., query for James (Jim) Hendler at the — otherwise excellent — DBLP: http://www.informatik.uni-trier.de/ ˜ley/db/. 9In the Web context one typically uses the term ‘URI’ (uniform resource identifier) to speak of ‘unique identifier’.


However, this needs to be further explored (e.g., along the lines elaborated for robust linking in [Phelps and Wilensky, 2000] and Ontology Evolution [Stojanovic et al., 2002a]).

• Ease of Use: It is obvious that an annotation environment should be easy to use in order to be really useful. However, this objective is not easily achievable, because metadata creation involves the intricate navigation of semantic structures, e.g. taxonomies, properties and concepts.

• Efficiency: The effort required to produce metadata is a considerable restrictive factor. The more efficiently a tool supports metadata creation, the more metadata users tend to produce. This requirement is related to ease of use. It also depends on the automation of the metadata creation process, e.g. on the preprocessing of the document.

• Multiple Ontologies: HTML documents in the Semantic Web may contain information that is related to different ontologies. Therefore the annotation framework should cater for concurrent annotation with multiple ontologies.

Our framework, CREAM, which is presented here, targets a comprehensive solution for metadata creation during Web page authoring and a posteriori annotation. This objective is pursued by combining advanced mechanisms for inferencing, fact crawling, document management, meta ontology definitions, metadata re-recognition, content generation, and information extraction. These components are explained in the subsequent sections.

4.4. Design of CREAM

4.4.1. CREAM Modules

The requirements and considerations from Sections 1 to 3.4.1 feed into the design rationale of CREAM. The design rationale links the requirements with the CREAM modules. This results in an N:M mapping (neither functional nor injective). An overview of the matrix is given in Table 4.1, page 58.

• Document Editor: The document editor may be conceptually — though not practically — divided into a viewing component and a component for generating content:

– Viewer: The document viewer visualizes the document contents. The metadata creator may easily provide new metadata by selecting pieces of text and aligning them with parts of the ontology. The document


viewer should support various formats10 (HTML, PDF, XML, etc.). For some formats the following component for content generation may not be available.
– Content Generation: The editor also allows the conventional authoring of documents. In addition, instances already available may be dragged from a visualization of the content of the annotation inference server and dropped into the document. Thereby, some piece of text and/or a link is produced, taking into account the information from the meta ontology (cf. Section 4.5). The newly generated content is already annotated, and the meta ontology guides the construction of further information, e.g. further XPointers are attached to instances; a sketch is given below.
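As an illustrative sketch of such attached pointers (an N3 triple in which the property name annotatedIn and the XPointer expression are invented for the example):

<http://www.aifb.uni-karlsruhe.de/WBS/sst/#Steffen> :annotatedIn <http://www.aifb.uni-karlsruhe.de/WBS/sst/index.html#xpointer(/html/body/p[1])>.

Each newly authored occurrence of the instance would add a further pointer of this kind.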

• Ontology Guidance and Fact Browser: The framework needs to be guided by the ontology. In order to allow for the sharing of knowledge, newly created annotations must be consistent with a community’s ontology. If metadata creators instantiate arbitrary classes and properties, the semantics of these properties remains void. Of course, the framework must be able to adapt to multiple ontologies in order to reflect different foci of the metadata creators. In the case of concurrent annotation with multiple ontologies there is an ontology guidance/fact browser for each ontology. Furthermore, the ontology guidance and the browser for facts already given are important in order to steer metadata creators towards the creation of relational metadata. We have done some preliminary experiments and found that subjects have more problems with creating relationship instances than with creating attribute instances (cf. [Staab et al., 2001b]). Without the ontology they would miss even more cues for assigning relationships between class instances. Both the ontology guidance/fact browser and the document editor/viewer should be easy to use: drag’n’drop helps to avoid syntax errors and typographical errors, and a good visualization of the ontology can help to correctly choose the most appropriate class for instances.

• Crawler: The creation of relational metadata must take place within the Semantic Web. During metadata creation, subjects must be aware of which entities already exist in their part of the Semantic Web. This is only possible if a crawler makes relevant entities immediately available. Thus, metadata creators may look for a proper reference, i.e. decide whether an entity already has a URI (e.g. whether the entity named “Dieter Fensel” or “D. Fensel” has already been identified by some other metadata creators), and only in this way metadata creators may recognize whether properties have already been instantiated (e.g. whether “Dieter Fensel” has already been

10The current prototypical implementation of CREAM focuses on HTML/XHTML


linked to his publications). As a consequence of metadata creators’ awareness, relational metadata may be created, because class instances become related rather than only flat templates being filled. A crawler answers the requirements of proper reference and the avoidance of redundancy. For example, a researcher might crawl the homepages of his community before he starts to annotate his homepage, in order to reuse and reference existing metadata. However, a crawler reduces but does not solve the problems, since it is obviously not possible to crawl the whole Web. We have built an OWL/RDF Crawler11, a basic tool that gathers interconnected fragments of OWL and RDF from the Web and builds a local knowledge base from this data. In general, OWL/RDF data may appear in Web documents in several ways. We distinguish between (i) pure OWL/RDF (files that have an extension like “*.owl” or “*.rdf”), (ii) OWL/RDF embedded in HTML and (iii) OWL/RDF embedded in XML. Our RDF Crawler relies on the Jena API12, which can deal with the different embeddings of RDF described above. One general problem of crawling is the applied filtering mechanism: baseline document crawlers are typically restricted by a predefined depth value. Assuming that there is an unlimited amount of interrelated information on the Web (hopefully this will soon hold for OWL/RDF data as well), at some point the OWL/RDF fact gathering by the OWL/RDF Crawler should stop. We have implemented a baseline approach for filtering: at the very start of the crawling process and at every subsequent step we maintain a queue of all the URIs we want to analyze. We process them in breadth-first-search fashion, keeping track of those we have already visited. When the search goes too deep, or the crawler has received a sufficient quantity of data (measured as the number of links visited, the total Web traffic or the amount of OWL/RDF data obtained), it may quit.

• Annotation Inference Server: Relational metadata, proper referencing and the avoidance of redundant annotation require querying for instances, i.e. querying whether and, if so, which instances exist. For this purpose, as well as for checking consistency, we provide an annotation inference server in our framework. The annotation inference server reasons on crawled and newly created instances and on the ontology. It also serves the ontological guidance and fact browser, because it allows querying for existing classes, instances and properties. The annotation inference server supports multiple ontologies. It distinguishes them by different namespaces. CREAM is based on the model,

11RDF Crawler is freely available for download at: http://projects.semwebcentral.org/projects/owlcrawler/. 12http://jena.sourceforge.net/


view, controller paradigm. The annotation inference server constitutes the model, whereas the ontology/fact browser constitutes the view. Every ontology in the annotation server may have a corresponding ontology/fact browser as a view. We use Ontobroker’s [Decker et al., 1999] underlying F-Logic [Kifer et al., 1995] based inference engine SilRI [Decker et al., 1998] as the annotation inference server. The F-Logic inference engine combines ordering-independent reasoning in a high-level logical language with a well-founded semantics, including multiple namespaces [F-Logic Tutorial, 2004]. However, other schemes like the DAML+OIL iFACT reasoner [Horrocks, 1998; Broekstra et al., 2001] or a fact-serving peer [Nejdl et al., 2002] might be exploited, too.

• Document Management: Two distinct scenarios exist. Firstly, when annotating one’s own Web pages, one knows when they change and can install organizational routines that keep track of annotations and changes of annotations (e.g. on the annotation inference server). In the second scenario, one may want to annotate external resources (cf., e.g., [Nejdl et al., 2001]). Here, redundancy of annotation efforts must be avoided; it is not sufficient to ask whether instances exist at the annotation inference server. When an annotator decides to capture knowledge from a Web page, he does not want to query for all the single instances that he considers relevant on this page; rather, he wants information on whether and how this Web page has been annotated previously. Considering the dynamics of HTML pages on the Web, it is desirable to store foreign Web pages one has annotated together with their annotations. Foreign documents that cannot be modified may be remotely annotated by using XPointer (cf. Section 2.3, [Goble et al., 2001]) as an addressing mechanism. When the foreign Web page changes, the old annotations may still be valid or they may become invalid. The annotator must decide based on the old annotations and on the changes to the Web page. A future goal of document management in our framework is the semi-automatic maintenance of annotations on foreign Web pages. When only few parts of a document change, pattern matching may propose revisions of old annotations. In our current implementation we use a straightforward file-system based document management approach. OntoMat uses the URI to detect the re-encounter of previously annotated documents and highlights annotations in the old document for the user. Then the user may decide to ignore or even delete the old annotations and create new metadata, he may augment existing data, or he may just be satisfied with what has been previously annotated.


In order to recognize that a document has been annotated before, but now appears under a different URI, OntoMat computes the similarity with existing documents by simple information retrieval methods, e.g. comparison of the word vector of a page. If a similarity is discovered during this process, this is indicated to the user, so that he can check for congruency.

• Metadata Re-recognition & Information Extraction: Even with sophisticated tools it is laborious to provide Semantic Annotations. A major goal, therefore, is semi-automatic metadata creation, taking advantage of information extraction techniques to propose annotations to metadata creators and, thus, to facilitate the metadata creation task. Concerning our environment we envisage the following major techniques:

1. First, metadata re-recognition compares existing metadata literals with newly typed or existing text. Thus, the mention of the name “Siegfried Handschuh” in a document triggers the proposal that the URI http://www.aifb.uni-karlsruhe.de/WBS/sha/#Siegfried is co-referenced at this point.
2. “Wrappers” may be learned from given markup in order to automatically annotate similarly structured pages (cf., e.g., [Kushmerick, 2000]).
3. Message-extraction-like systems may be used to recognize named entities, propose co-references, and extract relationships from texts (cf., e.g., [MUC7, 1998; Vargas-Vera et al., 2001]).
4. Linguistic patterns may be used to categorize instances with regard to a given ontology. This approach combines the idea of using linguistic patterns to identify certain ontological relations with using the Web as a big corpus to overcome data sparseness problems. The PANKOW approach (Pattern-based Annotation through Knowledge on the Web) is explained in Section 5.2.

Parts of this component have been realized by using the Amilcare information extraction system (cf. Section 5.1.2)13, but it is not yet available in the download version of OntoMat.

• Deep Annotation Module: The Deep Annotation Module allows the annotation of the hidden Web, viz. Web pages that are generated from a database. A detailed description is given in Section 6.

• Meta Ontology: The purpose of the meta ontology is the separation of ontology design and use. It is specifically explained in Section 4.5.

13http://www.dcs.shef.ac.uk/˜fabio/Amilcare.html


Besides the requirements that constitute single modules, one may identify functions that cross module boundaries:

• Storage: CREAM supports three different methods of storage: i) the annotations are stored inside the document, which resides in the document management component; ii) they are stored in a separate file; or iii) alternatively or simultaneously, they are stored in the annotation inference server.

• Replication: We provide a simple replication mechanism by crawling annotations into our annotation inference server. Then inferencing can be used to rule out formal inconsistencies.

4.4.2. Architecture of CREAM

The architecture of CREAM is depicted in Figure 4.1. The design of the CREAM framework pursues the idea of flexibility and openness. Therefore, OntoMat, the implementation of the framework, comprises a plug-in structure, which is flexible with regard to adding or replacing modules. The document viewer/editor and the ontology guidance/fact browser together constitute the major part of the graphical user interface. Thanks to the plug-in structure they can be replaced by alternative viewers. Further capabilities are provided through plug-ins that establish connections, e.g. one might provide a plug-in for a connection to a commercial document management system. The core OntoMat already comes with some basic functionalities. For instance, one may work without a plug-in for an annotation inference server, because the core OntoMat provides a simple means of navigating the taxonomy per se. The core OntoMat, which is downloadable, consists of the ontology guidance and fact browser, a document viewer/editor, and an internal memory data structure for the ontology and metadata. However, one only gets the fully-fledged semantic capabilities (e.g. Datalog reasoning or subsumption reasoning) when one uses a plug-in connection to a corresponding annotation inference server.

4.5. Meta Ontology

A meta ontology is needed to describe how classes, attributes and relationships from the domain ontology should be used by the CREAM environment. Hence, it describes what role the properties in the domain ontology have in the annotation

Table 4.1.: Design Rationale — Linking Requirements with CREAM Modules. [Matrix relating the requirements (General Problems: Consistency, Proper Reference, Avoid Redundancy, Relational Metadata; further: Maintenance, Ease of use, Efficiency, Multiple Ontologies) to the modules Document Editor/Viewer, Content Generation, Ontology Guidance, Replication/Crawler, Storage (Annotation Inference Server, Document Management), Metadata Re-recognition, General Information Extraction, and Meta Ontology.]


Figure 4.1.: Architecture of CREAM.

system. In particular, we have recognized the urgent need for the meta ontology characterizations detailed in Sections 4.5.1 to 4.5.3.

The meta ontology is given with the annotation system. The annotator must establish the connection of an ontology with the meta ontology. The effort for the annotator is reduced by default settings for this connection, e.g. the name attribute is by default the label of the instance. The task of connecting the ontology with the meta ontology is supported by the user interface, e.g. the annotator can change the role of the properties in the Ontology and Fact Browser.

Thus, the ontology describes what the semantic data should look like, and the meta ontology connected to the ontology describes how the ontology is used by the annotation environment to actually create semantic data. This modularization into ontology and meta ontology fulfills the requirement that it should be possible to define a new ontology, or to reuse an existing one, quite independently from the way it is used to create metadata by Web page authoring

and annotation. The reader may note that the descriptions of how the meta ontology influences the interaction with the ontology, which are given in this section, depend to some extent on the modes of interaction described in Section 4.6 — and vice versa.

4.5.1. Label

The specification of RDFS [Brickley and Guha, 2004] provides rdfs:label as a human-readable version of a resource name. Analogously, one wants to assign a human-readable name to an instance even if it instantiates a class from a given ontology that does not use the property rdfs:label per se. For instance, assume that (part of) the RDF(S) ontology definition is as follows:

:ssn a rdf:Property;
    rdfs:comment "Social Security Number";
    rdfs:range xml-schema:Integer;
    rdfs:domain :Person.

:fullname a rdf:Property;
    rdfs:comment "Last Name, First Name, Middle Name";
    rdfs:range rdf-schema:Literal;
    rdfs:domain :Person.

Then, one might want to state that the property fullname rather than the property ssn takes the role of rdfs:label for class Person. We may link the meta ontology (the relevant piece here is rdfs:label) with the ontology proper by:

:label1 a meta_onto:Label;
    meta_onto:about_concept :Person;
    meta_onto:about_attribute :fullname.

Now, the authoring and annotation environment may exploit this additional piece of information at (at least) two points of interaction:

1. Instance Generation: When a new instance is about to be created and some piece of text has been chosen to represent the name, CREAM creates a new URI and automatically assigns this piece of text to the attribute recorded as rdfs:label, i.e. here fullname.

2. Content Generation: When an instance is selected for generating content of a Web page, the generated text is produced from the rdfs:label attribute. By this method we may produce text that is meaningful to humans with little interaction, because the author need not specify the attribute that should be used, but may just refer to the instance.
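As a minimal sketch of this label mechanism, the following Python fragment (using rdflib) creates a new instance and fills the attribute that a simplified meta-ontology mapping records as playing the rdfs:label role. The namespaces and the dictionary encoding of the meta_onto:Label statement are hypothetical simplifications, not the actual CREAM data structures.

    import uuid
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF

    ONTO = Namespace("http://example.org/onto#")    # hypothetical domain ontology
    INST = Namespace("http://example.org/inst#")    # hypothetical instance namespace

    # Simplified meta-ontology linkage: fullname plays the rdfs:label role
    # for Person (cf. the meta_onto:Label statement above).
    label_attribute = {ONTO.Person: ONTO.fullname}

    def create_instance(graph, cls, selected_text):
        """Mint a new URI for `cls` and assign the selected text to the
        attribute recorded as its label."""
        uri = INST["id-" + uuid.uuid4().hex[:8]]
        graph.add((uri, RDF.type, cls))
        graph.add((uri, label_attribute[cls], Literal(selected_text)))
        return uri

    g = Graph()
    create_instance(g, ONTO.Person, "Doe, Jane, Marie")
    print(g.serialize(format="turtle"))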


One may note that another person, e.g. from administration, could choose instead:

:label1 a meta_onto:Label;
    meta_onto:about_concept :Person;
    meta_onto:about_attribute :ssn.

Thus, this person would create Web page content referring to social security numbers when authoring with existing instances, and he would create new instances of Person from given social security numbers rather than from names. The reader may note from this small example that:

• The difference between more or less metadata creation effort is often just one click.14 For example, without the meta ontology description the following steps would have been necessary for instance creation: creating an instance with an appropriate label, filling the attribute “fullname”, and filling the attribute “ssn”.

• The connection between ontology and meta ontology is not an objective one. Rather their linkage depends on the way an ontology is used in a particular metadata creation scenario.

Statements analogous to these two observations hold for the following meta ontology descriptions.

4.5.2. Default Pointing

For many instances that are to be created it is desirable to point to the Web page from which they “originated”. Analogously to the way that rdfs:label is used, we use three types of definitions in order to specify the default pointing behavior for class instances, exploiting the XPointer working draft [DeRose et al., 2001]. Consider the meta ontology properties cream:uniqueDPointer, cream:autoDPointer, and cream:autoUniqueDPointer. If a property is declared an rdfs:subPropertyOf one of these, the following interactions take place during annotation or authoring:

1. Instance Generation: When a new instance is generated and a property of that instance is of type cream:autoDPointer or cream:autoUniqueDPointer, an XPointer to the current (part of the) Web page will automatically be added into the corresponding slot of the instance.

14We conjecture that the one click difference may distinguish between success and failure of a tool.


2. Content Generation: When an instance is used for generating Web page content, the attribute containing the XPointer is offered for link generation. When the attribute is of type cream:uniqueDPointer or cream:autoUniqueDPointer, indicating uniqueness, a link with text corresponding to rdfs:label and HRef corresponding to the XPointer will be generated automatically.

For instance, one may model in the ontology that a Person comes with properties hasHomepage and fullname, and in the instantiation of the meta ontology that hasHomepage is a subproperty of cream:uniqueDPointer and fullname a subproperty of rdfs:label. During annotation of people's homepages, the label and pointer mechanisms automate, (i), the generation of unique IDs with reasonable labels, (ii), the creation of pointers to people's homepages, and, (iii), the correct linking between people mentioned on the different homepages. As with rdfs:label, the linkage between meta ontology and ontology proper may depend on the actual usage scenario.
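A sketch of how this linkage could be stated and exploited, again in Python with rdflib; the cream: and swrc: namespace URIs are hypothetical placeholders:

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDFS

    CREAM = Namespace("http://example.org/cream#")  # hypothetical meta-ontology ns
    SWRC = Namespace("http://example.org/swrc#")    # hypothetical domain ns

    g = Graph()
    # Meta-ontology linkage for the scenario described above.
    g.add((SWRC.fullname, RDFS.subPropertyOf, RDFS.label))
    g.add((SWRC.hasHomepage, RDFS.subPropertyOf, CREAM.uniqueDPointer))

    def fill_default_pointers(graph, instance, xpointer):
        """On instance generation, fill every auto default-pointer property
        with an XPointer to the current (part of the) Web page."""
        for target in (CREAM.autoDPointer, CREAM.autoUniqueDPointer):
            for prop in graph.subjects(RDFS.subPropertyOf, target):
                graph.add((instance, prop, Literal(xpointer)))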

4.5.3. Property Mode

The property mode distinguishes between different roles, i.e. different ways in which the property should be treated by the metadata creation environment:

1. Reference: In order to describe an object, metadata may simply point to a particular place in a resource, e.g. a piece of text or a piece of multimedia. For instance, one may point to a particular place at http://www.whitehouse.gov in order to refer to the current U.S. president. Even when the presidency changes the metadata may remain up-to-date. References are particularly apt to point to parts of multimedia, e.g. to a part of a scalable vector graphic. The reader may note that every default pointer is a reference, but not vice versa. For example, assume a property hasName which is a reference – it points to a particular place, e.g. a name on the Web page – but is not modeled as a default pointer; therefore the pointer is neither automatically created nor considered unique.

2. Quotation: In order to describe an object, metadata may copy an excerpt out of a resource. In contrast to the mode “reference”, a quotation does not change when the corresponding resource changes. A copy of the string “Bill Clinton” as president of the United States in 1999 remains unchanged even if its original source at http://www.whitehouse.gov changes or is abandoned.


3. Unlinked Fact: An unlinked fact describes an object, but does not in any way stem from, or depend on, a resource. Unlinked facts are typical for comments. They are also very apt to be combined with references in order to elucidate the meaning or name of a graphic or piece of multimedia. For instance, there may be a reference pointing to the picture “Guernica” (http://www.grnica.swinternet.co.uk/guernica.jpg) and attributes that are specified to be unlinked facts. The unlinked-fact attributes may be filled by someone who knows Picasso's paintings, e.g. with specifications like “Guernica” or “Spanish Civil War”.

The meaning of the property mode may slightly overlap with the definition of the range of a property, e.g. an unlinked fact is typically only used with an attribute that has a literal as its range. The reason is that a pointer may be used as a URI (e.g. [Goble et al., 2001]) and a URI should typically not appear in a literal (though this is not prohibited). We separate the two aspects because not every URI is a pointer, and it sometimes makes sense to specify the value of a literal by a pointer. Thus, the definition of a property as reference, quotation or unlinked fact may be considered orthogonal to the range of that property being a literal or a resource.
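A minimal sketch of how a metadata creation environment might act on the three modes (function and constant names hypothetical):

    REFERENCE, QUOTATION, UNLINKED_FACT = "reference", "quotation", "unlinked fact"

    def attribute_value(mode, xpointer=None, selected_text=None, typed_text=None):
        """Determine what is stored in an attribute slot, depending on the
        property mode declared in the meta ontology."""
        if mode == REFERENCE:
            return xpointer        # points into the resource; follows its changes
        if mode == QUOTATION:
            return selected_text   # copied excerpt; frozen at annotation time
        if mode == UNLINKED_FACT:
            return typed_text      # free text, independent of any resource
        raise ValueError("unknown property mode: " + mode)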

4.5.4. Further Meta Ontology Descriptions

In concluding this section, we ask the reader to note that the list of possibly useful meta ontology descriptions sketched here is by no means exhaustive. Rather, we envision (and partially support) the use of meta ontology descriptions for purposes such as:

• Knowledge acquisition from templates: For example, we describe software tools with metadata in SWOBIS (http://tools.semanticweb.org/) (cf. Figure 4.2). For each instance, there are a number of attributes required to specify a software tool. The meta ontology allows the definition of attribute instances as being required attribute instances. This information is used to automatically generate a template-like interface for OntoMat — one that is similar in its structure to a Dublin Core template. This approach is akin to the way that Protege allows the construction of knowledge acquisition interfaces [Noy et al., 2000].

• Authoring of dynamic ontology and metadata-based Web pages (also cf. OntoWebber [Jin et al., 2001]).

• Provisioning of metametadata, e.g. author, date, time, and location of an annotation. Though this may appear trivial at first sight, this objective easily clashes with several other requirements, e.g. ease of use of metadata


generation and usage. Eventually, it needs a rather elaborate meta ontology, containing not only static, but also dynamic definitions, i.e. rules. For example, to describe that a person A created an instance X, a second person B created an instance Y, and a third person C created a relationship instance between X and Y, it is necessary to reify the relationship instance. In order to directly use the relationship instance, it should be translated by a rule into an unreified relationship instance.

Figure 4.2.: SWOBIS template.

4.6. Modes of Interaction

The metadata creation process in OntoMat is actually supported by three types of interaction with the tool (also cf. Figure 4.1):

1. Annotation by Typing Statements: This involves working almost exclu- sively within the ontology guidance/fact browser.

2. Annotation by Markup: This mostly involves the reuse of data from the document editor/viewer in the ontology guidance/fact browser.


3. Annotation by Authoring Web Pages: This mostly involves the reuse of data from the fact browser in the document editor.

In order to clarify the different roles of the three types of interaction, we describe here how they differ for generating three types of metadata:

1. Generating instances of classes
2. Generating attribute instances
3. Generating relationship instances

4.6.1. Annotation by Typing

Annotation by typing is almost purely based on the ontology guidance/fact browser (cf. Section 4.4) and the generated templates (cf. Section 4.5.4). Basically, the more experienced user navigates the ontology and browses the facts, while the less experienced user should use templates instead. The user generates metadata (class instances, attribute instances, relationship instances) that are completely independent from the Web page currently being viewed. The specification of the rdfs:label property allows for the creation (or re-discovery) of instances by typing, where the URI is given and the rdfs:label property is filled with the typed text. The specification of a default pointer by the meta ontology may associate newly created instances with the currently marked passage in the text. In addition, the user may drag and drop instances that are already in the knowledge base in order to create new relationship instances (cf. arrow #0 in Figure 4.3).

4.6.2. Annotation by Markup

The basic idea of annotation by markup is the usage of marked-up content in the document editor/viewer for instance generation.

1. Generating class instances: When the user drags a marked-up piece of content onto a particular concept from the ontology, a new class instance is generated. If the class definition comes with a meta ontology description of an rdfs:label, a new URI is generated and the corresponding property is assigned the marked-up text (cf. arrow #1 in Figure 4.3). For instance, marking “Siegfried Handschuh” and dropping this piece of text on the concept PhDStudent creates a new URI, instantiates this URI as belonging to PhDStudent, and assigns “Siegfried Handschuh” to the swrc:name slot of the new URI. In addition, default pointers may be provided.


2. Generating attribute instances: In order to generate an attribute instance, the user simply drops the marked-up content into the corresponding table entry (cf. arrow #2 in Figure 4.3). Depending on whether the attribute is specified as reference or quotation, the corresponding XPointer or the content itself is filled into the attribute.

3. Generating relationship instances: In order to generate a relationship instance, the user simply drops the marked-up content onto the relation of a pre-selected instance (cf. arrow #3 in Figure 4.3). As in class instance generation, a new instance is generated and connected with the pre-selected instance.


Figure 4.3.: Annotation example.

4.6.3. Annotation by Authoring

The third major process is authoring Web pages and metadata together. There are two modes of authoring: (i), authoring by using ontology guidance and fact

browser for content generation and, (ii), authoring with the help of metadata re-recognition or — more generally — information extraction. As far as authoring is concerned, we have only implemented (i) so far. However, we wish to point out that even very simple information extraction mechanisms, i.e. metadata re-recognition (cf. Section 4.4), may help the author to produce consistent metadata.

Authoring with Content Generation. By inverting the process of markup (cf. Figure 4.1), we may reuse existing instance descriptions, like labels or other attributes:

1. Class instances: Dropping class instances from the fact browser into the document creates text according to their labels and — if possible — links (cf. arrow #1 in Figure 4.4).

2. Attribute instances: Dropping attribute instances from the fact browser into the document (cf. arrow #2 in Figure 4.4) generates the corresponding text (for quotations or unlinked facts) or even linked text (for references).

3. Relationship instances: Dropping relationship instances from the fact browser into the document generates simple “sentences”. For instance, the dropping of the relationship cooperatesWith between the instances corresponding to Rudi and Steffen triggers the creation of a small piece of text (cf. arrow #3 in Figure 4.4). The text corresponds to the instance labels plus the label of the relationship (if available), e.g. “Rudi Studer cooperates with Steffen Staab”. Typically, this piece of text will require further editing.

Further mechanisms, like the creation of lists or tables from selected concepts (e.g. all Persons), still need to be explored.
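The following sketch illustrates the inversion for class instances (Python with rdflib; the namespaces are hypothetical, and for simplicity the label is read directly from rdfs:label rather than via the meta-ontology indirection):

    from rdflib import Graph, Namespace
    from rdflib.namespace import RDFS

    SWRC = Namespace("http://example.org/swrc#")  # hypothetical domain namespace

    def render_instance(graph, instance):
        """Content generation for a dropped instance: emit its label, linked
        via the unique default pointer if one is available."""
        label = graph.value(instance, RDFS.label)
        href = graph.value(instance, SWRC.hasHomepage)  # unique default pointer
        if href is not None:
            return '<a href="%s">%s</a>' % (href, label)
        return str(label)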

4.7. Conclusion

CREAM is a comprehensive framework for creating semantic metadata, relational metadata in particular — the foundation of the future Semantic Web. CREAM builds on comprehensive experience we have collected in several case studies on creating metadata for the Semantic Web. CREAM supports three modes of interaction for a posteriori annotation and for creating metadata while authoring a Web page. In order to avoid problems with syntax, semantics and pragmatics, CREAM employs a rich set of modules including inference services, crawler, document management system, ontology guidance/fact browser, and document editors/viewers. Process issues of the annotation/authoring task are modularized from content descriptions by a meta ontology.



Figure 4.4.: Annotation by Authoring example.

OntoMat is the reference implementation of the CREAM framework. It is Java-based and provides a plug-in interface for extensions for further applications.

5. Semi-Automatic Annotation

This chapter describes techniques developed for semi-automatic annotation. These techniques are realized as extensions, viz. plug-ins, of our annotation framework CREAM. The first extension – S-CREAM (Semi-automatic CREAtion of Metadata) – allows for the creation of metadata and is trainable for a specific domain (Section 5.1). The second extension – PANKOW (Pattern-based Annotation through Knowledge on the Web) – focuses on a method that combines the use of linguistic patterns to identify instances with the use of the WWW as a large corpus via a search engine (Section 5.2).

The remainder of Section 5.1 presents, firstly, the information extraction scenario (Section 5.1.1). Secondly, Amilcare, the information extraction component (Section 5.1.2), is described. Thirdly, the focus is on the problems of integration (Sections 5.1.3–5.1.4). Finally, a concluding discussion follows in Section 5.1.5.

The structure of Section 5.2 is as follows: Section 5.2.1 describes the principal procedure of PANKOW. Section 5.2.2 details the core algorithmic approach to categorizing instances from text. Then, we present the integration into CREAM/OntoMat (Section 5.2.3).

References: Section 5.1 is mainly based on [Handschuh et al., 2002] and Section 5.2 is mainly based on [Cimiano et al., 2004].

5.1. Information Extraction

The Semantic Web builds on metadata describing the contents of Web pages. In particular, the Semantic Web requires relational metadata, i.e. metadata that describe how resource descriptions instantiate class definitions and how they are semantically interlinked by properties. In order to support the construction of relational metadata, an annotation and authoring framework (CREAM) and a tool (OntoMat) that implements this framework are provided in Chapter 4.

Nevertheless, providing plenty of relational metadata by annotation, i.e. conceptual markup of text passages, remains a laborious task. The architecture description in Section 4.4.1 presented the idea that wrappers and information extraction components could be used to facilitate this work; this section describes a fully-fledged integration that deals with all the conceptual difficulties. Therefore, the CREAM framework is extended to S-CREAM (Semi-automatic


CREAtion of Metadata), an annotation framework that integrates a learnable information extraction component (viz. Amilcare [Ciravegna, 2001a]). Amilcare is a system that is able to learn its information extraction rules from manually marked-up input. S-CREAM aligns conceptual markup, which defines relational metadata (such as provided through OntoMat), with semantic and indicative tagging (such as produced by Amilcare). There are two major types of problems that had to be solved for this purpose:

1. When comparing the desired relational metadata from manual markup with the semantic tagging provided by information extraction systems, one recognizes that the output of such systems is underspecified for the purposes of the Semantic Web. In particular, the nesting of relationships between different types of concept instances is undefined and, hence, more comprehensive graph structures may not be produced (further elaborated in Section 5.1.3). To overcome this problem, a new processing component, viz. a lightweight module for discourse representation (Section 5.1.4), is introduced.

2. Semantic tags do not correspond one-to-one to the conceptual description (Sections 5.1.1 and 5.1.4):

• Semantic tags may have to be turned into various kinds of conceptual markup, e.g. concept instances, attribute instances, or relationship instances.

• For successful learning, Amilcare sometimes needs further indicative tags (e.g. syntactic tags) that do not correspond to any entity in a given ontology, but that may only be exploited within the learning cycle.

5.1.1. Process of ontology-based Information Extraction

This section gives a general overview of the information extraction process. The process can be divided into a setup step, a training phase with three steps, and an annotation phase with two steps (depicted in Figure 5.1):

Step 1: Project Definition

The first step is the project definition. A domain ontology is the basis for the annotation of different types of documents. Likewise, a certain kind of document can be annotated with reference to different ontologies. Therefore, a project defines the combination of a domain ontology (e.g. the GETESS ontology about tourism)


Figure 5.1.: The Process of Information Extraction

with a corpus. In this case a corpus is a collection of similar documents, for example hotel homepages (http://all-in-all.de). In addition, the mapping of the ontology to the Amilcare tags has to be defined, as follows:

• Concepts: Concepts are mapped to tags using the name of the concept, e.g. the concept with the name “Hotel” results in a <hotel> tag.

• Hierarchy: The concepts of the ontology are hierarchically structured. OntoMat offers to map a concept to multiple tags in order to emulate the different levels of conceptualization, e.g. the concept “Hotel” may be mapped to the tags <hotel>, <accommodation>, and <thing>.

• Attributes: The mapping of attributes to tags is a tradeoff between specific and general naming. A specific naming simplifies the mapping back to the ontology, but it results in more complex extraction rules; these rules are less general and less robust. For example, a specific naming of the attribute “phone” would result in tags like <hotel-phone> (i.e. one tag per concept), in comparison to the general tag <phone>. Therefore, the user must decide for each attribute whether the naming is adequate, as it influences the learning results.
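A sketch of how such a project-definition mapping could be represented (Python; the tag names follow the examples above, the data structure itself is hypothetical):

    # One tag per level of conceptualization for each concept.
    concept_tags = {
        "Hotel": ["hotel", "accommodation", "thing"],
    }

    # Per-attribute choice between specific and general tag naming.
    attribute_tags = {
        ("Hotel", "phone"): "phone",         # general: robust rules, harder mapping
        # ("Hotel", "phone"): "hotel-phone"  # specific: easy mapping, brittle rules
    }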


Step 2: Annotation

The user can create the training corpus with OntoMat. The user has to choose a proper sample of the corpus and annotate these documents. The documents are annotated by OntoMat with RDF facts. These facts are linked by an XPointer description to the annotated text part.

Alternatively, if enough annotated documents already exist on the Web, the user can start a crawl of Web pages with OntoMat and collect the documents. The crawl can in this case be limited to documents which are annotated with the desired ontology. If necessary, the ontology subset and the mapping to the Amilcare tags must be re-adjusted according to the annotations found in the crawled documents. Subsequently, the desired type of document must still be checked manually.

Step 3: Ontology to Amilcare tags mapping

Because Amilcare needs XML-tagged files as a corpus, the RDF annotations are transformed into corresponding XML tags according to the mapping established in the project definition. Only these tags are used for training. Other tags, like HTML tags, are used as contextual information.
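A sketch of this transformation step, assuming the XPointer descriptions have already been resolved to (start, end) character offsets in the document text (names hypothetical):

    def to_training_document(text, annotations):
        """Wrap each annotated span in the XML tag chosen in the project
        definition; `annotations` is a list of (start, end, tag) triples."""
        parts, cursor = [], 0
        for start, end, tag in sorted(annotations):
            parts.append(text[cursor:start])
            parts.append("<%s>%s</%s>" % (tag, text[start:end], tag))
            cursor = end
        parts.append(text[cursor:])
        return "".join(parts)

    print(to_training_document("Hotel Zwei Linden in Dobbertin",
                               [(6, 17, "hotel"), (21, 30, "city")]))
    # -> Hotel <hotel>Zwei Linden</hotel> in <city>Dobbertin</city>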

Step 4: Learning

The learning phase is executed by Amilcare, which is embedded as a plug-in into OntoMat. Amilcare processes each document of the corpus and generates extraction rules as described in Section 5.1.2. After the training, Amilcare stores the annotation rules in a file which belongs to the project.

Step 5: Shallow Information Extraction

Now it is possible to use the induced rules for semi-automatic annotation. Based on the rules, the Amilcare plug-in produces XML annotation results (cf. A1 in Figure 5.3).

Step 6: Semi-automatic Annotation

As shown in Figure 5.3, a mapping (A2) is done from the flat markup to the conceptual markup in order to create new RDF facts (A3). This mapping is undertaken by the discourse representation (cf. Section 5.1.4). It results in several automatically generated proposals for the RDF annotation of the document. The user can interact with these annotation proposals using three

different automation methods: (i) highlighting the annotation candidates, (ii) interactive suggestion for each annotation, or (iii) a first fully automatic annotation of the document with subsequent refinement.

• Highlighting mode: The user opens a document he would like to annotate in the OntoMat document editor. The highlighting mode then marks all annotation candidates with a colored underline. The user can decide whether or not he wishes to use each hint for an annotation.

• Interactive mode: This mode is also designed for individual document processing. The interactive suggestion is a step-by-step process. Each possible annotation candidate is suggested to the user, who can refuse, accept or change the suggestion in a dialog window.

• Automatic mode: The fully automatic approach is useful if there are a number of documents that need to be annotated, so that the task can be performed in batch mode. All selected documents are annotated automatically. The user can later refine all the annotations in the document editor.

The result is an RDF-annotated document.
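A sketch of dispatching the generated proposals over the three automation methods (names hypothetical; the interactive dialog is reduced to a callback):

    def process_proposals(proposals, mode, accept=None):
        """Dispatch annotation proposals according to the automation method."""
        if mode == "highlight":
            # Only mark candidates; the user annotates manually from the hints.
            return [("highlight", p) for p in proposals]
        if mode == "interactive":
            # Step-by-step confirmation of each candidate.
            return [p for p in proposals if accept(p)]
        if mode == "automatic":
            # Batch mode: apply everything, refine later in the editor.
            return list(proposals)
        raise ValueError("unknown mode: " + mode)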

5.1.2. Amilcare

Amilcare is a tool for adaptive Information Extraction (IE) from text, designed for supporting active annotation of documents for Knowledge Management (KM). It performs IE by enriching texts with XML annotations, i.e. the system marks the extracted information with XML annotations. The only knowledge required for porting Amilcare to new applications or domains is the ability to manually annotate the information to be extracted in a training corpus. No knowledge of Human Language Technology is necessary.

Adaptation starts with the definition of a tagset for annotation. Then users have to manually annotate a corpus for training the learner. As will later be explained in detail, OntoMat may also be used as the annotation interface to annotate texts in a user-friendly manner. In this case, OntoMat provides the user annotations as XML tags to train the learner. Amilcare's learner induces rules that are able to reproduce the manual annotations.

Amilcare can operate in two modes: training, used to adapt it to a new application, and extraction, used to actually annotate texts. In both modes, Amilcare first of all preprocesses texts using Annie, the shallow IE system included in the Gate package ([Maynard et al., 2002], www.gate.ac.uk). Annie performs text tokenization (segmenting texts into words), sentence splitting (identifying

sentences), part-of-speech tagging (lexical disambiguation), gazetteer lookup (dictionary lookup) and named entity recognition (recognition of people and organization names, dates, etc.).

When operating in training mode, Amilcare induces rules for information extraction. The learner is based on (LP)2, a covering algorithm for supervised learning of IE rules based on Lazy-NLP [Ciravegna, 2001a] [Ciravegna, 2001c]. This is a wrapper induction methodology [Kushmerick, 1997] that, unlike other wrapper induction approaches, uses linguistic information in the rule generalization process. The learner starts by inducing wrapper-like rules that make no use of linguistic information, where rules are sets of conjunctive conditions on adjacent words. Then the linguistic information provided by Annie is used in order to generalize rules: conditions on words are substituted with conditions on the linguistic information (e.g. conditions matching either the lexical category, or the class provided by the gazetteer, etc. [Ciravegna, 2001c]). All the generalizations are tested in parallel by using a variant of the AQ algorithm [Mickalski et al., 1986] and the best k generalizations are kept for IE. The idea is that the linguistic-based generalization is used only when the use of NLP information is reliable or effective. The measure of reliability here is not linguistic correctness (immeasurable by incompetent users), but effectiveness in extracting information using linguistic information as opposed to using shallower approaches. Lazy-NLP-based learners learn which is the best strategy for each information/context separately. For example, they may decide that using the result of a part-of-speech tagger is the best strategy for recognizing the location in holiday advertisements, but not for spotting the hotel address. This strategy is quite effective for analyzing documents with mixed genres – quite a common situation in Web documents [Ciravegna, 2001b].

The learner induces two types of rules: tagging rules and correction rules. A tagging rule is composed of a left-hand side, containing a pattern of conditions on a connected sequence of words, and a right-hand side that is an action inserting an XML tag in the texts. Each rule inserts a single XML tag, e.g. <hotel>. This makes the approach different from many adaptive IE algorithms, whose rules recognize whole pieces of information, i.e. they insert both <hotel> and </hotel>, or even multiple slots. Correction rules shift misplaced annotations (inserted by tagging rules) to the correct position. They are learnt from the mistakes made in attempting to re-annotate the training corpus using the induced tagging rules. Correction rules are identical to tagging rules, but (1) their patterns also match the tags inserted by the tagging rules and (2) their actions shift misplaced tags rather than adding new ones. The output of the training phase is a collection of rules for IE that are associated with the specific scenario.

When working in extraction mode, Amilcare receives as input a (collection of) text(s) with the associated scenario (including the rules induced during the training phase). It preprocesses the text(s) by using Annie, subsequently applies its

rules, and returns the original text with the added annotations. The Gate annotation schema is used for annotation [Maynard et al., 2002].

Amilcare is designed to accommodate the needs of different user types. While naive users can build new applications without delving into the complexity of Human Language Technology, IE experts are provided with a number of facilities for tuning the final application. Induced rules can be inspected, monitored and edited to obtain some additional accuracy, if needed. The interface also allows balancing precision (P) and recall (R). The system is run on an annotated unseen corpus and users are presented with statistics on accuracy, together with details on correct matches and mistakes (using the MUC scorer [Douthat, 1998] and an internal tool). Retuning the P&R balance does not generally require major retraining. Facilities for inspecting the effect of different P&R balances are provided.

5.1.3. Synthesizing S-CREAM

To synthesize S-CREAM from the existing frameworks CREAM and Amilcare, let us consider their core processes in terms of input and output, as well as the process of the as yet undefined S-CREAM. Figure 5.3 surveys the three processes.

The first process is indicated by a circled A1. It is information extraction, e.g. provided by Amilcare [Ciravegna, 2001a], which digests a document and produces either an XML-tagged document or a list of XML-tagged text snippets (cf. Table 5.1(a)).

The second process is indicated by a circled M. It is manual annotation and authoring of metadata, which turns a document into relational metadata that corresponds to the given ontology (as described in detail in Section 4.6). For instance, an annotator may use OntoMat to describe how the relationships listed in Table 5.1(b) appear on the homepage of hotel “Zwei Linden” (cf. Figure 5.2).

The obvious questions that emerge at this point are: Is the result of Table 5.1(a) equivalent to the one in Table 5.1(b)? How can Table 5.1(a) be turned into the result of Table 5.1(b)? The latter is a requirement for the Semantic Web.

The “Semantic Web answer” to this is: The difference between Table 5.1(a) and Table 5.1(b) is analogous to the difference between a very particular serialization of data in XML and an RDF structure. This means that, assuming a very particular serialization of information on Web pages, the Amilcare tags can be specified so precisely1 that indeed Table 5.1(a) can rather easily be mapped into Table 5.1(b). The only requirement may be a very precise specification of tags, e.g. “43,46” may need to be tagged as <double-room-price>43,46</double-room-price> in order to cope with its relation to a double room of a hotel.

1We abstract here from the problem of correctly tagging a piece of text.


[Figure content: an ontology layer (Thing with subclasses Region, accommodation and Room; City a subclass of Region; Hotel a subclass of accommodation; DoubleRoom and SingleRoom subclasses of Room; a Rate class with attributes price and currency; relations has_room, has_rate and located_at) above a metadata layer: the instance ZweiLinden, with name “Zwei Linden” and phone “038736/42472”, is located_at Dobbertin and has_room doubleroom1 and singleroom1, whose rates rate1 and rate2 carry the prices 25,66 EUR and 43,46 / 46,02 EUR; the metadata is grounded in the document.]

Figure 5.2.: Annotation example


[Figure content: two ways from a document to the target knowledge structure (ontology: Thing with region and accommodation, Hotel and City, relation located_at; facts: Zwei Linden located_at Dobbertin): directly via manual annotation (M), or via information extraction producing tagged IE output (A1), a mapping through the discourse representation (A2), and the generation of the target facts (A3).]

Figure 5.3.: Two Ways to the Target: Manual and Automatic Annotation

(a) Amilcare (XML-tagged snippets):

<hotel>Zwei Linden</hotel>
<city>Dobbertin</city>
<single_room>Single room</single_room>
<price>25,66</price>
<currency>EUR</currency>
<double_room>Double room</double_room>
<price>43,46</price>
<price>46,02</price>
<currency>EUR</currency>
...

(b) OntoMat (relational metadata):

:Zwei_Linden a :Hotel.
:Zwei_Linden :name "Zwei Linden".
:Zwei_Linden :located_at :Dobbertin.
:Dobbertin a :City.
:Dobbertin :name "Dobbertin".
:Zwei_Linden :has_room :Single_room_1.
:Single_room_1 a :Single_Room.
:Single_room_1 :has_rate :Rate2.
:Rate2 a :Rate.
:Rate2 :price "25,66".
:Rate2 :currency "EUR".
:Zwei_Linden :has_room :Double_room_3.
:Double_room_3 a :Double_Room.
:Double_room_3 :has_rate :Rate4.
:Rate4 a :Rate.
:Rate4 :price "43,46".
:Rate4 :price "46,02".
:Rate4 :currency "EUR".
...

Table 5.1.: Comparison of Output: Amilcare versus manual OntoMat


The “Natural Language Analysis answer” to the above questions is: Learnable information extraction approaches like Amilcare do not have an explicit discourse model for relating tagged entities — at least for now. Their implicit discourse model is that each tag corresponds to a place in a template2 and every document (or document analogon) corresponds to exactly one template. This is fine as long as the discourse structures in the text are simple enough to be mapped into the template and from the template into the target RDF structure. In practice, however, the assumption that the underlying graph structures/discourse structures are quite similar often does not hold. Then the direct mapping from XML-tagged output to the target RDF structure becomes awkward and difficult to accomplish.

The third process given in Figure 5.3 is indicated by the composition of A1, A2 and A3. It bridges from the tagged output of the information extraction system to the target graph structures via an explicit discourse representation. Our discourse representation is based on a very lightweight version of Centering [Grosz and Sidner, 1986; Strube and Hahn, 1999] and is explained in the next section.

5.1.4. Discourse Representation (DR)

The principal task of discourse representation is to describe coherence between different sentences. The core idea is that, during the interpretation of a text (or, more generally, a document), there is always a logical description (e.g., an RDF(S) graph) of the content that has been read so far. The current sentence updates this logical description by:

1. Introducing new discourse referents: i.e. introducing new entities, e.g., finding the term ‘Hotel & Inn “Zwei Linden” ’ to denote a new object.

2. Resolving anaphora: i.e. describing denotational equivalence between different entities in the text, e.g. ‘Hotel & Inn “Zwei Linden” ’ and ‘Country inn’ refer to the same object.

3. Establishing new logical relationships: i.e. relating the two objects referred to by ‘Hotel & Inn “Zwei Linden” ’ and ‘Dobbertin’ via locatedAt.

The problem with information extraction output is that it is not clear what constitutes a new discourse entity. Though information extraction may provide some typing (e.g. <city>Dobbertin</city>), it does not describe whether this constitutes an attribute value (of another entity) or an entity of its own. Neither

2A template is like a single tuple in an unnormalized relational database table, where all or several entries may have null values.

do information extraction systems like Amilcare treat coherence between different pieces of tagged text.

Grosz & Sidner [Grosz and Sidner, 1986] devised centering as a theory of text structure that separates text into segments that are coherent with each other. The principal idea of the centering model is to express fixed constraints as well as “soft” rules which guide the reference resolution process. The fixed constraints denote what objects, if any, are available for resolving anaphora and establishing new logical inter-sentential relationships, while soft rules give a preference ordering to these possible antecedents. The main data structure of the centering model is a list of backward-looking centers, Cb(Uk), for each utterance Uk. The backward-looking centers Cb(Uk) constitute a ranked list of what is available and what is preferred for resolving anaphora and for establishing new logical relationships with previous sentences.

The centering model allows for relating a given entity in utterance Uk to one of the backward-looking centers Cb(Uk−1). For instance, when reading “The chef of the restaurant” in Figure 5.2, the centering model allows relationships with “Country inn”, but not with “Dobbertin”.

The drawback of the centering model is that, firstly, it has only been devised for full text and not for semi-structured text such as appears in Figure 5.2 and, secondly, it often needs more syntactic information than shallow information extraction can provide. Therefore, we use only an extremely lightweight, “degraded” version of centering, where we formulate the rules on an ad hoc basis as needed by the annotation task. The underlying idea of the degrading is that S-CREAM is intended to work in restricted, albeit adaptable, domains. It is not even necessary to have a complete model, because we analyze only a very small part of the text. For instance, we analyze only the part about hotels with rooms, prices, addresses and hotel facilities.

We specify the discourse model by logical rules, the effects of which we illustrate in the following paragraphs. Here, we use the same inferencing mechanisms that we have already exploited for supporting annotation (cf. Section 4.4.1). As our baseline model, we assume the “single template strategy”, viz. only one type of tag, e.g. <hotel>, is determined to really introduce a new discourse referent. Every other pair of tag name and tag value is attached to this entity as an attribute filled by the tag value. For example, “Zwei Linden” is recognized as an instance of Hotel, and every other entity (like “Dobbertin”, etc.) is attached to this instance, resulting in a very shallow discourse representation by logical facts, illustrated in Table 5.2(a).3 This is probably the most shallow discourse representation possible, because it does not include ordering constraints or other

3Results have been selected to be comparable with Table 5.1.


(a) Discourse Representation:

dr:Zwei_Linden a :Hotel.
dr:Zwei_Linden dr:name "Zwei Linden".
dr:Zwei_Linden dr:city "Dobbertin".
dr:Zwei_Linden dr:single_room "single room".
dr:Zwei_Linden dr:price "25,66".
dr:Zwei_Linden dr:currency "EUR".
dr:Zwei_Linden dr:double_room "double room".
dr:Zwei_Linden dr:price "43,46".
dr:Zwei_Linden dr:price "46,02".
dr:Zwei_Linden dr:currency "EUR".

(b) Target Graph Structure:

:Zwei_Linden a :Hotel.
:Zwei_Linden :name "Zwei Linden".
:Zwei_Linden :located_at :Dobbertin.
:Dobbertin a :City.
:Dobbertin :name "Dobbertin".
:Zwei_Linden :has_room :Single_room1.
:Single_room1 a :Single_Room.
:Zwei_Linden :has_room :Double_room1.
:Double_room1 a :Double_Room.

Table 5.2.: Template Strategy

soft constraints. However, it is already adequate to map some of the relations in the discourse namespace (“dr:”) to relations in the target space, thus resulting in Table 5.2(b). Given this restricted tag set, however, not every relation can be detected. For more complex models, we may also include ordering information, e.g. simply by augmenting the discourse representation tuples given in Table 5.2 with numbers (this may be modeled by using lists in N3, see Section 2.4.4), and a set of rules that maps the discourse representation into the target structure, integrating:

• rules to only attach instances where they are allowed to become attached (e.g., see first rule head of Listing 5.2, where prices are only attached where they are allowed).

• rules to attach tag values to the nearest preceding, conceptually possible entity. The nearest preceding entity is determined by exploiting the ordering information, e.g. as given in Table 5.3 (thus, prices for single and double rooms may be distinguished).

• rules to create a new complex object when two simple ones are adjacent, e.g. to create a rate when it finds an adjacent number and currency (e.g., see the second rule head of Listing 5.2).


Further information that could be included is, e.g., adjacency information. Thus, one may produce Table 5.1(b) out of the numbered discourse representation in Table 5.3.

Listing 5.1 is a simplified example of an N3 rule that creates instances given the information in Table 5.3. The rule says: if the predicate p is in the list of concepts (e.g. city, single room or double room), then instantiate the corresponding instance y. As a side effect it generates a proper instance name ?instance and a proper concept name ?concept using the cream:instanceName function and the cream:conceptName function, respectively.

Listing 5.2 combines the complex object creation with the attachment to the nearest preceding, conceptually possible entity. The rule says: if a price precedes a currency, where the minimal distance is ensured by the function cream:minBefore, then create an instance with a unique instance name (cream:uniqueInstanceName) of type Rate and fill its attributes with the price and the currency. Further, the rule says: if a single room entity ?room precedes the currency in the discourse structure, then attach the corresponding rate object to the instance ?ui_room of that room.

The strategy that we follow here is to keep simple things simple and complex tasks possible. The experienced user will be able to handcraft logical rules in order to refine the discourse model to his needs. The standard user will only exploit the simple template strategy. When the resulting graph structures are simple enough to allow for the latter strategy and a simple mapping, the mapping can also be defined by directly aligning relevant concepts and relations by drag and drop, whilst, in the general case, one must write logical rules.

dr:Zwei_Linden a :Hotel.
dr:Zwei_Linden dr:city (2 "Dobbertin").
dr:Zwei_Linden dr:single_room (3 "single room").
dr:Zwei_Linden dr:price (4 "25,66").
dr:Zwei_Linden dr:currency (5 "EUR").
dr:Zwei_Linden dr:double_room (6 "double room").
dr:Zwei_Linden dr:price (7 "43,46").
dr:Zwei_Linden dr:price (8 "46,02").
dr:Zwei_Linden dr:currency (9 "EUR").

Table 5.3.: Ordering Strategy

5.1.5. Conclusion

S-CREAM bridges the gap between the output of an information extraction system, viz. Amilcare, and the annotation one would yield when doing manual


{ ?hotel ?p (?n ?y).
  ?p list:in (dr:city dr:single_room dr:double_room).
  ?y cream:instanceName ?instance.
  ?p cream:conceptName ?concept }
=>
{ ?instance a ?concept } .

Listing 5.1: Example: rule to create instances

{ ?hotel dr:single_room (?n0 ?room).
  ?hotel dr:price (?n1 ?price).
  ?hotel dr:currency (?n2 ?currency).
  ?n1 cream:minBefore ?n2.
  "Rate" cream:uniqueInstanceName ?ui_rate.
  ?n0 cream:minBefore ?n2.
  ?room cream:uniqueInstanceName ?ui_room. }
=>
{ # attach rate instance at the room where it is allowed to be attached
  ?ui_room :rate ?ui_rate.
  # create complex object rate
  ?ui_rate a :Rate.
  ?ui_rate :price ?price.
  ?ui_rate :currency ?currency. } .

Listing 5.2: Example: create rate and attach it to room

ontology-based annotation on the same document. The bridging is done by applying a simple discourse representation, which is specified by logical rules. Using the concept of discourse representation is a first novel, useful and necessary step towards generating relational markup for a large set of documents.

While this is a first approach, there are also some unsettled points. Firstly, the process to generate the discourse representation is not automatic but relies on hand-crafted rules. In the ideal case, the information extraction system would be ontology-aware and would therefore learn the discourse structure of the given documents as well. Extending the information extraction system in this way was not possible for us, because we could use Amilcare only as a black-box component. We therefore envisage further research to specify and implement such an ontology-aware information extraction component. Secondly, to use information extraction one needs a rather large number of previously annotated documents, which is only worthwhile if one has to annotate a group of similar documents.

To our knowledge, nobody else has applied discourse representation to semi-automatic annotation and thereby offered a solution for the bridging problem to date. So, even with the above-mentioned manual work, we present the most advanced solution for that problem.


We demonstrate the usefulness and efficiency of the approach in an evaluation, in which we compared the annotation task with S-CREAM to the one with CREAM; it is described in detail in Section 9. In conclusion, S-CREAM is a unique framework that deals with all the conceptual difficulties of full-fledged information extraction to support semi-automatic annotation.

5.2. The Self-Annotating Web

In the previous section we presented an approach that extracts knowledge structures from Web pages with the help of an information extraction component, through the use of information extraction rules. These rules are the result of a learning cycle based on already annotated pages. While this approach is useful and efficient (as shown in the evaluation in Section 9), it also has some drawbacks:

• Manual definition of an information extraction system is a laborious task requiring a lot of time and expert know-how ([Appelt et al., 1993]); and

• Learning of extraction rules requires a large number of annotated examples.

To overcome these drawbacks, we propose in this section a different paradigm: the self-annotating Web. The principal idea of the self-annotating Web is that it uses globally available Web data and structures to semantically annotate — or at least facilitate the annotation of — local resources. Initial blueprints for this paradigm are found in works such as the following:

• Some researchers use explicit, linguistically motivated natural-language de- scriptions to propose semantic relationships ([Charniak and Berland, 1999; Googlism, 2003; Hearst, 1992; Maedche and Staab, 2001]).

• Others use the Web to cope with data sparseness problems in tasks that require statistics about possible semantic relationships ([Agirre et al., 2000; Grefenstette, 1999; Keller et al., 2002; Markert et al., 2003]).

• In [Flake et al., 2002; Glover et al., 2002], the Web structure itself is used to determine a focus for harvesting data. Thus, specialized semantic relationships, such as recommendations coming from a particular Web community, can be derived.

Taking a first step towards this vision, we propose an original method called PANKOW (Pattern-based Annotation through Knowledge On the Web),


Figure 5.4.: The Process of PANKOW

which employs an unsupervised, pattern-based approach to categorize instances with regard to a given ontology. The approach is novel in combining the idea of using linguistic patterns to identify certain ontological relations with the idea of using the Web as a big corpus to overcome data sparseness problems. It is unsupervised as it does not rely on any training data annotated by hand, and it is pattern-based in the sense that it makes use of linguistically motivated regular expressions to identify instance-concept relations in text. The driving principle behind PANKOW is one of “disambiguation by maximal evidence” in the sense that, for a given instance, it proposes the concept with the maximal evidence derived from Web statistics. The approach thus bootstraps Semantic Annotations as it queries the Web for relevant explicit natural-language descriptions of appropriate ontological relations.

PANKOW has been conceived for our annotation framework CREAM [Handschuh and Staab, 2002] and has been implemented in OntoMat4 using queries to the Google Web service API. The automatic annotation produced by PANKOW has been evaluated against Semantic Annotations produced by two independent human subjects.

5.2.1. The Process of PANKOW

This section gives a general overview of the process of PANKOW, whereas Section 5.2.2 explains the concrete methods and Section 5.2.3 the implementation details. The process consists of four steps (depicted in Figure 5.4):

4annotation.semanticweb.org/tools/ontomat


Input: A Web page. In our implementation, we assume that Web pages are handled individually in the CREAM/OntoMat framework ([Handschuh et al., 2001]), though actually batch processing of a whole Web site would be possible.

Step 1: The system scans the Web page for phrases in the HTML text that might be categorized as instances of the ontology. Candidate phrases are proper nouns, such as ‘Nelson Mandela’, ‘South Africa’, or ‘Victoria Falls’. We use a part-of-speech tagger (cf. Section 5.2.3) to find such candidate proper nouns. Thus, we end up with a

Result 1: set of candidate proper nouns

Step 2: The system iterates through the candidate proper nouns. It uses the approach described in Section 5.2.2, introducing all candidate proper nouns and all candidate ontology concepts into linguistic patterns to derive hypothesis phrases. For instance, the candidate proper noun ‘South Africa’ and the concepts Country and Hotel are composed into patterns, resulting in hypothesis phrases like ‘South Africa is a country’ and ‘South Africa is a hotel’.

Result 2: Set of hypothesis phrases.

Step 3: Then, Google is queried for the hypothesis phrases through its Web service API (Section 5.2.2). The API delivers as its results

Result 3: the number of hits for each hypothesis phrase.

Step 4: The system sums up the query results to a total for each instance-concept pair. Then the system categorizes the candidate proper nouns into their highest ranked concepts (cf. Section 5.2.2). Hence, it annotates a piece of text as describing an instance of that concept. Thus we have

Result 4: an ontologically annotated Web page.

In principle, the query results of Step 3 could be investigated further. For instance, it could make sense to restrict the number of hits for hypothesis phrases to the ones that occur in Web pages with topics closely related to the topic of the current Web page, e.g. as measured in terms of the cosine distance of the documents. However, without direct access to the Google databases we have considered this step too inefficient for use in automatic annotation and hence ignore it in the following.
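The core of Steps 2–4 can be condensed into the following Python sketch. Here `hit_count` stands in for the Google Web service query, and the pattern templates anticipate the patterns detailed in Section 5.2.2:

    # Linguistic patterns as format strings over instance (i) and concept (c).
    PATTERNS = [
        "{c}s such as {i}",           # HEARST1
        "such {c}s as {i}",           # HEARST2
        "{i} and other {c}s",         # HEARST4 (one alternative)
        "the {i} {c}",                # DEFINITE1
        "{i}, a {c}",                 # APPOSITION
        "{i} is a {c}",               # COPULA
    ]

    def categorize(instance, concepts, hit_count):
        """Return the concept with maximal aggregated Web evidence (count_b)."""
        def count_b(concept):
            return sum(hit_count(p.format(i=instance, c=concept))
                       for p in PATTERNS)
        return max(concepts, key=count_b)

    # Usage with a stubbed hit counter:
    # categorize("South Africa", ["country", "hotel"], my_google_hit_count)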


5.2.2. Pattern-based Categorization of Candidate Proper Nouns

There is some history of applying linguistic patterns to identify ontological relationships between entities referred to in a text. For instance, Hearst [Hearst, 1992] as well as Charniak and Berland [Charniak and Berland, 1999] make use of such a pattern-based approach to discover taxonomic and part-of relations from text, respectively. Hahn and Schnattinger [Hahn and Schnattinger, 1998] also make use of such patterns and incrementally established background knowledge to predict the correct ontological class for unknown named entities appearing in a text.

The core idea of any such pattern-based approach is that one may justify an ontological relationship with reasonable accuracy when one recognizes certain specific idiomatic/syntactic/semantic relationships. It is crucial to the pattern-based approach that these relationships may be spotted very easily, because they can typically be specified through simple and efficiently processable regular expressions.

In the following, we first present the set of patterns; in a second step we describe the procedure to actually search for them; and finally we explain how we use the information conveyed by them for the actual classification of instances.

Patterns for Generating Hypothesis Phrases

In the following we describe the patterns we exploit and give a corresponding example from the data set.

Hearst Patterns

The first four patterns have been used by Hearst to identify isa-relationships between the concepts referred to by two terms in a text. However, they can also be used to categorize a candidate proper noun into an ontology. Since the entities denoted by candidate proper nouns are typically modeled as instances of an ontology, we also describe the problem more conveniently as the instantiation of a concept from a given ontology. Correspondingly, we formulate our patterns using the variable <INSTANCE> to refer to a candidate proper noun, as the name of an ontology instance, and the variable <CONCEPT> to refer to the name of a concept from the given ontology. The patterns reused from Hearst are:

HEARST1: <CONCEPT>s such as <INSTANCE>

HEARST2: such <CONCEPT>s as <INSTANCE>


HEARST3: <CONCEPT>s, (especially|including) <INSTANCE>

HEARST4: <INSTANCE> (and|or) other <CONCEPT>s

The above patterns would match the following expressions (in this order): hotels such as Ritz; such hotels as Hilton; presidents, especially George Washington; and the Eiffel Tower and other sights in Paris.

Definites

The next patterns are about definites, i.e. noun phrases introduced by the definite determiner ‘the’. Frequently, definites actually refer to some entity previously mentioned in the text. In this sense, a phrase like ‘the hotel’ does not stand for itself, but points as a so-called anaphora to a unique hotel occurring in the preceding text. Nevertheless, it has also been shown that in common texts more than 50% of all definite expressions are non-referring, i.e. they exhibit sufficient descriptive content to enable the reader to uniquely determine the entity referred to from the global context ([Poesio and Vieira, 1998]). For example, the definite description ‘the Hilton hotel’ has sufficient descriptive power to uniquely pick out the corresponding real-world entity for most readers. One may deduce that ‘Hilton’ is the name of a real-world entity of type Hotel to which the above expression refers. Consequently, we apply the following two patterns to categorize candidate proper nouns by definite expressions:

DEFINITE1: the <INSTANCE> <CONCEPT>

DEFINITE2: the <CONCEPT> <INSTANCE>

The first and the second pattern would, e.g., match the expressions ‘the Hilton hotel’ and ‘the hotel Hilton’, respectively.

Apposition and Copula

The following pattern makes use of the fact that certain entities appearing in a text are further described in terms of an apposition, as in ‘Excelsior, a hotel in the center of Nancy’. The pattern capturing this intuition looks as follows:

APPOSITION: <INSTANCE>, a <CONCEPT>

Probably the most explicit way of expressing that a certain entity is an instance of a certain concept is the verb ‘to be’, as for example in ‘The Excelsior is a hotel in the center of Nancy’. The general pattern is:

COPULA: <INSTANCE> is a <CONCEPT>
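To make the pattern instantiation concrete, the following minimal Python sketch shows how the patterns above could be turned into hypothesis phrases; all function and variable names are our own illustration, not part of the actual PANKOW implementation:

# Instantiate the linguistic patterns with a candidate proper noun (INSTANCE)
# and a concept label (CONCEPT). Patterns with an alternation such as
# (especially|including) contribute one template per variant.
PATTERNS = {
    "HEARST1":    ["{c}s such as {i}"],
    "HEARST2":    ["such {c}s as {i}"],
    "HEARST3":    ["{c}s, especially {i}", "{c}s, including {i}"],
    "HEARST4":    ["{i} and other {c}s", "{i} or other {c}s"],
    "DEFINITE1":  ["the {i} {c}"],
    "DEFINITE2":  ["the {c} {i}"],
    "APPOSITION": ["{i}, a {c}"],
    "COPULA":     ["{i} is a {c}"],
}

def hypothesis_phrases(instance, concept):
    """Yield (pattern name, phrase) pairs for one instance/concept pair."""
    for name, templates in PATTERNS.items():
        for template in templates:
            yield name, template.format(i=instance, c=concept)

# e.g. hypothesis_phrases("Ritz", "hotel") yields, among others,
# ("HEARST1", "hotels such as Ritz") and ("COPULA", "Ritz is a hotel")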


Finding Patterns

Having defined these patterns, one could now try to recognize them in a corpus and propose the corresponding instance-concept relationships. However, it is well known that the above patterns are rare, so one needs a sufficiently big corpus to find a significant number of matches. Thus, PANKOW resorts to the biggest corpus available: the Web. In fact, several researchers have shown that using the Web as a corpus is an effective way of addressing the typical data sparseness problem one encounters when working with corpora (compare [Grefenstette, 1999], [Keller et al., 2002], [Markert et al., 2003], [Resnik and Smith, 2003]). We subscribe to the principal idea of Markert et al. [Markert et al., 2003] of exploiting the Google API. As in their approach, rather than actually downloading Web pages for further processing, we simply take the number of Web pages in which a certain pattern appears as an indicator for the strength of the pattern. Given a candidate proper noun that we want to tag or annotate with the appropriate concept, we instantiate the above patterns with each concept from the given ontology into hypothesis phrases. For each hypothesis phrase, we query the Google API for the number of documents that contain it. The function count(i,c,p) models this query:

count : I × C × P → ℕ    (5.1)

Thereby, i, c and p are typed variables and shorthand for an instance, a concept and a pattern. Correspondingly, I, C and P stand for the sets of all candidate proper nouns, of all concepts from a given ontology and of all patterns, respectively.

Categorizing Candidate Noun Phrases

We have explored three versions for determining the best categorization.

1. Baseline: The simplest version just adds up the numbers of documents with hits for all hypothesis phrases resulting from one instance/concept pair.

count_b(i, c) := Σ_{p∈P} count(i, c, p)    (5.2)

This baseline count_b proved to be effective, and the empirical results presented subsequently report on this method.


2. Interactive Selection: In an annotation scenario, it is not always necessary to uniquely categorize a candidate proper noun. Rather, it is very easy and effective to present the top-ranked instance/concept pairs to the manual annotator and let him decide according to the actual context. This is currently implemented in CREAM/OntoMat based on the validity indicated by count_b (also cf. Section 5.2.3).

In the first two approaches, one may return as the result of PANKOW the set of pairs R_x (x ∈ {b, ~w}) where, for a given i ∈ I, c ∈ C maximizes the strength aggregated from the individual patterns:

R_x := {(i, c_i) | i ∈ I, c_i := argmax_{c∈C} count_x(i, c)}    (5.3)

and in the last approach, we return the best n matches for each proper noun, resulting in R_x^n (x ∈ {b, ~w}, n ∈ ℕ)⁵:

R_x^n := {(i, c_i) | i ∈ I : c_{i,j} ∈ C ∧ {c_{i,1}, …, c_{i,|C|}} = C ∧    (5.4)
         count_x(i, c_{i,1}) ≥ count_x(i, c_{i,2}) ≥ … ≥ count_x(i, c_{i,|C|}) ∧
         (c_i = c_{i,1} ∨ … ∨ c_i = c_{i,n})}

For our evaluation we will use a characterization that does not accept a classification for every candidate proper noun, as R_x does, but only for those that appear strong enough. Thus, we introduce a threshold θ as follows:

R_{x,θ} := {(i, c_i) | i ∈ I, c_i := argmax_{c∈C} count_x(i, c) ∧ count_x(i, c_i) ≥ θ}    (5.5)
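The following sketch illustrates the baseline strategy of Equations 5.2 and 5.5, reusing hypothesis_phrases from the sketch above; web_hits is a placeholder for the call to the search engine API, whose details we omit here:

def web_hits(phrase):
    # Placeholder: in PANKOW this is a query against the Google API that
    # returns the number of Web pages containing the exact phrase.
    raise NotImplementedError

def count_b(instance, concept):
    """Equation 5.2: sum of hit counts over all instantiated patterns."""
    return sum(web_hits(phrase)
               for _, phrase in hypothesis_phrases(instance, concept))

def categorize(instance, concepts, theta):
    """Equation 5.5: argmax over concepts, accepted only above threshold."""
    strength = {c: count_b(instance, c) for c in concepts}
    best = max(strength, key=strength.get)
    return (instance, best) if strength[best] >= theta else None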

5.2.3. Integration into CREAM

We have integrated PANKOW into the CREAM framework (cf. Chapter 4), extending the CREAM implementation OntoMat by a plugin. The plugin has access to the ontology structure and to the document management system of OntoMat, and it utilizes the Google API to access its Web service. The PANKOW plugin implements the process described in Section 5.2.1, starting with the scanning for candidate proper nouns by means of a POS tagger. We experimented with two POS taggers: one was QTag⁶ and the other was TreeTagger ([Schmid, 1994]). The advantage of QTag is that it is implemented in Java and therefore easier to integrate, whereas TreeTagger produced somewhat better results in our experiments.

⁵Obviously, R_x^1 = R_x.
⁶http://web.bham.ac.uk/o.mason/software/tagger/index.html


In addition, we use a heuristic to achieve higher precision for the candidate recognition and thereby to reduce the number of queries. The heuristic considers the intersection of the POS tagger categorization with a simple capitalized-words approach, which consists in interpreting sequences of capitalized words as proper noun candidates.⁷ For the capitalized-words approach we consider only words that do not follow a period. Given the example of the Lonely Planet Web page about Nigeria⁸, the POS tagger proposes proper nouns such as “Guinea”, “Niger”, “Cameroon”, “Benin”, and “Nigeria”. For this concrete example, the capitalized-words approach proposes basically the same proper nouns as the POS tagger. In general, however, the capitalized-words heuristic reduces tagging errors produced by the POS tagger. While our heuristic approach is practical, it has some problems with compound words such as “Côte d’Ivoire” and might need some fine-tuning. OntoMat supports two modes of interaction with PANKOW: (i) fully automatic annotation and (ii) interactive semi-automatic annotation. In the fully automatic mode, all categorizations with strength above a user-defined θ, viz. R_{b,θ}, are used to annotate the Web content. In the interactive mode, the system proposes the top five concepts to the user for each instance candidate, i.e. R_b^5. The user can then disambiguate and resolve ambiguities. The screenshot in Figure 5.5 shows the user interface. In the lower left corner of the screenshot you can see the progress dialog for the Google queries. The dialog shows the extracted candidate proper nouns and logs the query results for the hypothesis phrases. Also shown is the interactive dialog for disambiguation, e.g. the choice to assign “Niger” as an instance to one of the concepts “river”, “state”, “coast”, “country” or “region”. The number in brackets behind each concept name gives the number of Web hits.
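A minimal sketch of the capitalized-words heuristic described above follows; tokenization and punctuation handling are simplified, and all names are our own:

def capitalized_candidates(text):
    """Sequences of capitalized words; a capitalized word directly
    following a period is not considered (it is likely just sentence case)."""
    candidates, current, prev = [], [], ""
    for token in text.split():
        if token[0].isupper() and not prev.endswith("."):
            current.append(token.strip(".,;:()"))
        else:
            if current:
                candidates.append(" ".join(current))
            current = []
        prev = token
    if current:
        candidates.append(" ".join(current))
    return set(candidates)

def candidate_proper_nouns(text, pos_tagger_proper_nouns):
    """Intersection of the POS tagger categorization with the
    capitalized-words heuristic."""
    return capitalized_candidates(text) & set(pos_tagger_proper_nouns)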

5.2.4. Conclusion

We have described PANKOW, a novel approach towards the self-annotating Web. It overcomes the burden of laborious manual annotation, and it requires neither the manual definition of an information extraction system nor its training on manually provided examples. It uses the implicit wisdom contained in the Web to propose annotations derived from counting Google hits of instantiated linguistic patterns. The results produced are comparable to those of state-of-the-art systems, while our approach is comparatively simple, effortless and intuitive to use for annotating the Web. While we consider PANKOW a valid step towards the self-annotating Web,

⁷This heuristic works especially well for English, where typically only proper nouns appear capitalized.
⁸http://www.lonelyplanet.com/destinations/africa/nigeria/environment.htm


Figure 5.5.: Screenshot of CREAM with PANKOW plugin in interactive mode

we are well aware that the effectiveness, efficiency and range of PANKOW need to and can be improved. With regard to effectiveness, we have mentioned initially that [Flake et al., 2002; Glover et al., 2002] used the Web structure itself to determine a focus for harvesting data. In this line, by determining such a focus we could achieve a more domain-specific disambiguation than in our current approach. Such a focus could, for example, be determined by crawling only similar documents from the Web, as in the approach of Agirre et al. ([Agirre et al., 2000]). For instance, our annotators tagged ‘Niger’ as a country, while PANKOW found the other meaning of ‘Niger’, viz. the river ‘Niger’. With regard to efficiency, we thank Google for their help and support with their Web service API. However, the self-annotating Web is not possible with the kind of implementation that we provided. The reason is that we have issued an extremely large number of queries against the Google API, to an extent that would not scale to thousands or even millions of Web pages in a reasonable

time-scale. However, we envision an optimized indexing scheme (e.g., normalizing the various forms of ‘to be’ in order to recognize ‘George Washington was a man’) and an API that would reduce this number to acceptable load levels. An interesting direction for further research would also be to learn the weights of the different patterns by machine-learning techniques. Furthermore, in order to reduce the number of queries sent to the Google Web service API, a more intelligent strategy should be devised that takes into account the ontological hierarchy between concepts. With regard to range, we have only covered the relationship between instances and their concepts, but not other relationships between instances, such as is_located_in(Eiffel Tower, Paris). Our first step in this direction will be a tighter integration of PANKOW with Amilcare [Ciravegna, 2001a], such that instance data from PANKOW is used to train Amilcare, as has been done for Armadillo [Ciravegna et al., 2003]. Overall, however, this remains extremely challenging work for the future.

6. Deep Annotation

This Chapter portrays an extension of the CREAM framework for metadata creation in the case where Web pages are generated from a database and the database owner cooperatively participates in the Semantic Web. In order to create metadata, the framework combines the presentation layer with the data description layer, in contrast to “conventional” annotation, which remains at the presentation layer. Therefore, we refer to the framework as deep annotation. Firstly, we elaborate on the use cases of deep annotation in order to illustrate its possible scope (Section 6.2). We continue with a description of the overall process in Section 6.3. Section 6.4 details the architecture that supports the process and identifies three major components that must be provided:

1. A server-side Web page markup that defines the relationship between the database and the Web page content (cf. Section 6.5)

2. An annotation tool to actually let the user utilize information proper, information structures and information context for creating mappings (cf. Section 6.6).

3. Components that let the user investigate the constructed mappings (cf. Section 6.7), and query the serving database.

References: This chapter is mainly based on [Handschuh et al., 2003].

6.1. Introduction

One of the core challenges of the Semantic Web is the creation of metadata by mass collaboration, i.e. by combining semantic content created by a large number of people. To attain this objective, we presented in Chapter 4 a basic framework for the manual and/or semi-automatic creation of metadata from existing information. These approaches, however, as well as older ones that provide metadata, e.g. for search in digital libraries, build on the assumption that the information sources under consideration are static, e.g. given as static HTML pages or as books in a library (cf. [A,B] in Table 6.1).


Nowadays, however, a large percentage of Web pages are not static documents; on the contrary, the majority of Web pages are dynamic.¹ For dynamic Web pages (e.g. ones that are generated from a database containing a catalogue of books) it does not seem useful to manually annotate every single page. Rather, one wants to “annotate the database” in order to reuse it for one’s own Semantic Web purposes. For this objective, approaches have been conceived that allow for the construction of wrappers, either by explicit definition of HTML or XML queries [Sahuguet and Azavant, 2001] or by learning such definitions from examples [Kushmerick, 2000; Ciravegna, 2001a]. Thus, it has been possible to manually create metadata for a set of structurally similar Web pages. The wrapper approaches come with the advantage that they do not require cooperation by the owner of the database. Their shortcoming, however, is that the correct scraping of metadata depends to a large extent on the data layout rather than on the structures underlying the data (cf. [C] in Table 6.1). While for many Web sites the assumption of non-cooperativity may remain valid, we assume that many Web sites will in fact participate in the Semantic Web and will support the sharing of information. Such Web sites may present their information as HTML pages for viewing by the user, but they may also be willing to describe the structure of their information on the very same Web pages. Thus, they give their users the possibility to utilize

1. information proper,

2. information structures, and

3. information context.

A user may then exploit these three in order to create mappings into his own information structures (e.g., his ontology), which may be a lot easier than if the information a user receives is restricted to information structures [Noy and Musen, 2000] and/or information proper alone [Doan et al., 2002]. We define “deep annotation” [D] as an annotation process that utilizes information proper, information structures and information context in order to derive mappings between information structures. The mappings may then be exploited by the same or another user to query the database underlying a Web site in

¹It is not possible to give a general ratio of dynamic to static Web pages, because a single Web site may use a simple algorithm to produce an infinite number of, probably not very interesting, Web pages. Estimates based on Web pages actually crawled by existing search engines suggest, however, that dynamic Web pages outnumber static ones by 100 to 1.

order to retrieve semantic data, combining the capabilities of conventional annotation and databases. Table 6.1 summarizes the different settings just laid out.

Web site   cooperative owner                            uncooperative owner
static     Embedded or remote metadata by               Remote metadata by
           conventional annotation [A]                  conventional annotation [B]
dynamic    Deep annotation with server- or              Wrapper construction,
           client-side mapping rules [D]                remote metadata [C]

Table 6.1.: Principal situation

Thereby, Table 6.1 further distinguishes between two scenarios for static Web sites. In the one scenario [B], the annotator is not allowed to change the static information, but he can create the metadata and retain it remotely from the source it belongs to (e.g., by XPointer). In the other scenario [A], he is free to choose between embedding the metadata created in the annotation process into the information proper (e.g., via the <meta> tag of an HTML page) or keeping it remote.² For deep annotation, the two choices boil down to storing the created mappings either within the database of the server or externally.

6.2. Use Cases for Deep Annotation

Deep annotation is relevant for the large and fast-growing number of Web sites that have cooperation as their aim, for instance:

Scientific databases. They are frequently built to foster cooperation among researchers. Medline, Swissprot, or EMBL are just a few examples that can be found on the Web. In the bioinformatics community alone it is currently estimated that more than 500 large databases are freely accessible. Such databases are frequently hard to understand, and it is often difficult to evaluate whether a database table named “species” is equivalent to a table named “organism” in another database. Exploiting the information proper found in concrete tuples may help. But whether the “leech” considered as an entry of an “organism” is actually the animal or the plant may be much easier to tell from the context in which it is presented than from the concrete database entry, which may resolve to “plant” or “animal” only via several joins.³

Supply Chains. Car manufacturers frequently “outsource” the problem of providing mappings to their databases for tenders and orders by providing a portal

²Cf. Chapter 4 on those two possibilities.
³Concrete examples are typically not as easy to understand as the leech example!

with only HTML pages to their suppliers. Suppliers then either need to retype information or to use wrappers in order to replicate information. If the manufacturers provided information structures on their portal, a one-time deep annotation process could easily create the mapping for new suppliers.

Syndication. Besides direct access to HTML pages of news stories or market research reports, etc., commercial information providers frequently provide syndication services. The integration of such syndication services into the portal of a customer is typically an expensive manual programming effort that could be reduced by a deep annotation process which defines the content mappings.

For the remainder of this chapter we will focus on the following use case:

Community Web Portal (cf. [Staab et al., 2000a]). It serves the information needs of a community on the Web with possibilities for contributing and accessing information by community members. A recent example that is also based on Semantic Web technology is the OntoWeb portal⁴ [Studer et al., 2002]. The interesting aspect of such portals lies in the sharing of information, and some of them are even designed to deliver semantic information back to their community as well as to the outside world.⁵ The primary objective of a community setting up a portal will remain to offer access for its users. However, given the appropriate tools, the community could easily produce information content, information structures and information context for its members for deep annotation. The way in which this process operates is described in the following.

6.3. The Process of Deep Annotation

The process of deep annotation consists of the following four steps (depicted in Figure 6.1):

Input: A Web site⁶ driven by an underlying relational database.

Step 1: The database owner produces server-side Web page markup according to the information structures of the database (described in detail in Section 6.5).

Result: Web site with server-side markup.

Step 2: The annotator produces client-side annotations conforming to the client ontology and the server-side markup (Section 6.6).

⁴http://www.ontoweb.org
⁵Cf., e.g., [Stojanovic et al., 2001] for an example producing RDF from database content.
⁶Cf. Section 6.8 on other information sources.



Figure 6.1.: The Process of Deep Annotation

Result: Mapping rules between database schema and client ontology

Step 3: The annotator publishes the client ontology (if not previously completed) and the mapping rules derived from annotations (Section 6.7).

Result: The annotator’s ontology and mapping rules are available on the Web

Step 4: The querying party loads the second party’s ontology and mapping rules and uses them to query the database via the Web service API (Sections 6.7.1 and 6.7.2).

Result: Results retrieved from database by querying party

Obviously, in this process one single person may be the database owner and/or the annotator and/or the querying party. To align this with our running example of the community Web portal, the annotator might annotate an organization entry from ontoweb.org according to his own ontology. Subsequently, he may use the ontology and mapping to instantiate his own syndication services by regularly querying for all recent entries whose titles match his list of topics.

6.4. Architecture

Our architecture for deep annotation consists of three major pillars corresponding to the three different roles (database owner, annotator, querying party) as described in the process.


Database and Web Site Provider. At the Web site, we assume that there is an underlying database (cf. Figure 6.2) and a server-side scripting environment, like Zope, JSP or ASP, used to create dynamic Web pages. Furthermore, the Web site may also provide a Web service API to third parties who want to query the database directly.

Figure 6.2.: An Architecture for Deep Annotation

Annotator. The annotator uses an extended version of OntoMat-Annotizer in order to manually create relational metadata, corresponding to a given client ontology, for some Web pages. The extended OntoMat-Annotizer takes into account problems that may arise from the generic annotations required by deep annotation (see Section 6.6). With the help of OntoMat-Annotizer, we create mapping rules from such annotations that are later exploited by an inference engine.

Querying Party. The querying party uses a corresponding tool to visualize the client ontology, to compile a query from the client ontology and to investigate the mapping. In our case, we use OntoEdit [Sure et al., 2002a] for those three purposes. In particular, OntoEdit also allows for the investigation, debugging and change of given mapping rules. To that extent, OntoEdit integrates and exploits the Ontobroker [Fensel et al., 1999] inference engine (see Figure 6.2).


6.5. Server-Side Web Page Markup

The goal of the mapping process is to allow interested parties to gain access to the source data. Hence, the content of the underlying database is not materialized, as proposed in [Stojanovic et al., 2002c]. Instead, we provide pointers to the underlying data sources in the annotations, e.g. we specify which database columns provide the data for certain attributes of instances. Thus, the capabilities of conventional annotation and databases are combined.

6.5.1. Requirements

All required information has to be published, so that an interested party can use it to retrieve the data from the underlying database. This information has to specify (i) which database is used as a data source and how this database can be accessed, (ii) which query is used to retrieve data from the database, and (iii) which elements of the query result are used to create the dynamic Web page. These three components are detailed in the remainder of this section.
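Anticipating the representations detailed in the following subsections, the published information could be summarized as in this purely illustrative Python sketch; all key names and values are our own and do not reproduce the vocabulary of the actual deep annotation ontology:

# Hypothetical summary of the server-side information published for deep
# annotation: (i) database and access service, (ii) query, (iii) column
# groups whose key columns and prefixes make result rows referencable.
published_markup = {
    "database": {
        "id": "ontoweb_db",
        "accessService": "http://www.ontoweb.org/db-access",  # anonymous read access
    },
    "queries": {
        "q1": {
            "source": "ontoweb_db",
            "sql": "SELECT OrganizationID, Orgname FROM Organization",
            "column_groups": [
                {"key_columns": ["OrganizationID"],
                 "prefix": "http://www.ontoweb.org/org/"},
            ],
        },
    },
}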

6.5.2. Database Representation

The database representation is specified using a dedicated deep annotation ontology, which is instantiated to describe the physical structure of that part of the database which may facilitate the understanding of the query results. Thereby, the structure of all tables/views involved in a query can be published. For example, such a representation is part of the HTML head of the Web page presented in Figure 6.3.

The property accessService represents the link to a service which allows anonymous database access; consequently, additional security measures can be implemented in the service. Usually, anonymous users should only have read access to public information. As we rely on a Web service to host the database access, we avoid protocol issues (database connections are usually made via sockets on proprietary ports).

6.5.3. Query Representation

Additionally, the query itself, which is used to retrieve the data from a particular source, is placed in the header of the page. It contains the intended SQL query, is associated with a name as a means to distinguish between queries, and operates on a particular data source.

The structure of the query result must be published by means of column groups. Each column group must have at least one identifier, which is used in the annotation process to distinguish between individual instances and to detect their equivalence. Since database keys are only local to the respective table, whereas the Semantic Web has a global space of identifiers, appropriate prefixes have to be established. The prefix also ensures that the equality of instance data generated from multiple queries can be detected, provided that the Web application maintainer chooses the same prefix for each occurrence of that id in a query. Eventually, database keys are translated to instance identifiers (cf. Section 6.7.2) via the following pattern:

<prefix>[key_i-name=key_i-value]

For example: http://www.ontoweb.org/person/id=1
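A minimal sketch of this key-to-identifier translation follows; the function name and signature are ours, and how multiple key columns are concatenated is our assumption:

def instance_identifier(prefix, key_columns, row):
    """Translate a database key into a global instance identifier
    according to the pattern <prefix>[key-name=key-value]."""
    return prefix + "&".join(f"{name}={row[name]}" for name in key_columns)

# e.g. instance_identifier("http://www.ontoweb.org/person/", ["id"], {"id": 1})
# yields "http://www.ontoweb.org/person/id=1"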

6.5.4. Result Representation

Whenever parts of the query results are used in the dynamically generated Web page, the generated content is surrounded by a tag which carries information about which column of the tuple delivered by a query represents the used value. In order to maintain compatibility with HTML, we used the <span> tag as this information carrier. The actual information is represented in attributes of <span>:


In the page of Figure 6.3, for instance, values such as “AIFB”, “Karlsruhe” and “Steffen” are each wrapped in such a <span> tag whose attributes reference the corresponding query and result column.

Such span tags are then interpreted by the annotation tool and are used in the mapping process.
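The following sketch indicates how a server-side script might produce such markup; the attribute names are illustrative, since we do not reproduce the exact attribute set of the implementation here:

import html

def span_markup(value, query_name, column):
    # Wrap a query-result value in a <span> tag that references the query
    # (published in the page head, cf. Section 6.5.3) and the result column.
    return (f'<span queryname="{query_name}" column="{column}">'
            f'{html.escape(str(value))}</span>')

# e.g. span_markup("AIFB", "q1", "Orgname") yields
# <span queryname="q1" column="Orgname">AIFB</span>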

6.6. Annotation

An annotation in our context is a set of instantiations related to an ontology and referring to an HTML document. We distinguish (i) instantiations of OWL classes, (ii) instantiated properties from one class instance to a datatype instance, henceforth called attribute instances (of the class instance), and (iii) instantiated properties from one class instance to another class instance, henceforth called relationship instances. In addition, for deep annotation one must distinguish between generic and literal annotations. In a literal annotation, the annotated piece of text stands for itself. In a generic annotation, an annotated piece of text that corresponds to a database field is only considered a place holder, i.e. a variable must be generated for such an annotation, and this variable may take part in multiple relationships, allowing for the description of general mapping rules. For example, a concept Institute in the client ontology may correspond to one generic annotation for the Organization identifier in the database. Applying the above terminology, we will refer to generic annotations in detail as generic class instances, generic attribute instances, and generic relationship instances.

6.6.1. Annotation Process

An annotation process of server-side markup (generic annotation) is supported by the user interface as follows:

1. The user opens a server-side marked-up Web page in the browser.


2. The server-side markup is handled specially by the browser, e.g. it provides graphical icons on the page wherever markup is present, so that the user can easily identify values which come from a database.

3. The user can select one of the server-side markups to either create a new generic instance and map its database field to a generic attribute, or map a database field to a generic attribute of an existing generic instance.

4. The database information necessary to query the database in a later step is stored along with the generic instance.

The reader may note that a literal annotation is still performed when the user drags a marked-up piece of content that is not server-side markup.

6.6.2. Creating Generic Instances of Classes

When the user drags a server-side markup onto a particular concept of the ontology, a new generic class instance is generated (cf. arrow #1 in Figure 6.3). The application displays a dialog for the selection of the instance name and of the attributes to which the database value may be mapped. Attributes which resemble the column name are preselected (cf. dialog #1a in Figure 6.3). If the user clicks OK, database concept and instance checks are performed and the new generic instance is created. Generic instances appear with a database symbol in their icon. Each generic instance stores the information about the database query and the unique identifier pattern. This information is resolved from the markup: a server-side markup contains the reference to the query, the column, and the value. The identifier pattern is obtained from the reference to the query description and the appropriate column group (cf. Section 6.5.3). The markup used to create the instance thus defines the identifier pattern for the generic instance. The identifier pattern will be used when instances are generated from the database (cf. Section 6.7.2). For example, the server-side markup “AIFB” is selected and dropped onto the concept Institute. The content of the markup is ‘AIFB’. This creates a new generic instance with a reference to the query q1 (cf. Section 6.5.3). The dialog-based choice of the instance name “AIFB” maps the generic attribute name to the database column “Orgname”. This defines the identifier pattern of the generic instance as “http://www.ontoweb.org/org/OrganizationID=$OrganizationID”, where OrganizationID is the name of the database column in query q1 that holds the database key.


Figure 6.3.: Screenshot of Providing Deep Annotation with OntoMat-Annotizer

6.6.3. Creating Generic Attribute Instances

In order to create a generic attribute instance, the user simply drops the server-side markup into the corresponding table entry (cf. arrow #2 in Figure 6.3). Generic attributes which are mapped to database table columns also show a special icon, and their value appears in italics. Such generic attributes cannot be modified, but their value can be deleted. When the generic attribute is filled, the following steps are performed by the system:

1. Checking database definition integrity.

2. All attributes of the selected generic instance (except the generic attribute to be pasted to) are examined. For each attribute, one of the following conditions must hold:


• The attribute is empty, or
• the attribute does not hold server-side markup, or
• the attribute holds markup whose database name and query id are the same as those of the current selection. This must be checked to ensure that result fields come from the same database and the same query (see the sketch after this list); if it were not checked, non-matching information (e.g. publication titles and countries) could be queried for.

3. The generic attribute contains the information given by the markup, i.e. which column of the resulting tuple delivered by a query represents the value.
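The compatibility check of step 2 can be sketched as follows; the data structures and names are our own simplification of the implementation:

def compatible(attribute_markups, new_markup):
    """True iff every other attribute of the instance is empty or holds no
    server-side markup (represented here as None), or holds markup from the
    same database and the same query as the markup to be pasted."""
    for markup in attribute_markups:
        if markup is None:
            continue
        if (markup["database"], markup["query_id"]) != \
           (new_markup["database"], new_markup["query_id"]):
            return False
    return True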

6.6.4. Creating Generic Relationship Instances

In order to create a generic relationship instance, the user simply drops the selected server-side markup onto the relation of a pre-selected instance (cf. arrow #3 in Figure 6.3). As in Section 6.6.2, a new generic instance is generated. In addition, the new generic instance is connected with the preselected generic instance.

6.7. Mapping and Querying

The results of the annotation are mapping rules between the database and the client ontology. The annotator publishes⁷ the client ontology and the mapping rules derived from the annotations. This enables third parties (querying parties) to access and query the database on the basis of the semantics defined in the ontology. The user of this mapping description might be a machine agent or a human user.

6.7.1. Investigating Mappings

The querying party uses a corresponding tool to visualize the client ontology, to investigate the mapping and to compile a query from the client ontology. In our case, we used the OntoEdit plugins OntoMap and OntoQuery. OntoMap visualizes the database query, the structure of the client ontology, and the mapping between them (cf. Figure 6.4). The user can check and change the mapping and also create additional mappings.

⁷Here, we used the Ontobroker OXML format to publish the mapping rules.


Figure 6.4.: Mapping between Server Database (left window) and Client Ontology (right window)

6.7.2. Querying the Database

OntoQuery provides a query-by-example user interface. A query is created by clicking on a concept and selecting the relevant attributes and relationships. The underlying Ontobroker system transforms the ontological query into a corresponding SQL query, using the mapping descriptions, which are internally represented as F-Logic axioms. The SQL query is sent as an RPC call to the Web service, where it is answered in the form of a set of records. These records are transformed back into an ontological representation. This task is executed automatically, so that no interaction with the user is necessary. For example, a query is created by selecting the concept Person. In the dialog, the search can be restricted to instances of Person whose names start with the letter “S” (cf. Figure 6.5). This ontological query is expressed as an F-Logic query and then evaluated by Ontobroker using the mapping axioms (cf. Figure 6.6). The data migration is executed in two separate steps. In the first step, all the required concept instances are created without considering relationships or attributes. The instances are stored together with their identifiers. The identifier is translated from the database keys by using the identifier pattern


Figure 6.5.: Querying: Persons with first names starting with letter ‘S’

(see Section 6.5.3). For example, the instance with the name “AIFB” of the concept Institute, which is a subconcept of Organization, has the identifier “http://www.ontoweb.org/org/OrganizationID=3”. After the creation of all instances, we begin computing the values of the instance relationships and attributes. The way the values are assigned is determined by the mapping rules. Since the values of an attribute or a relationship have to be computed from both the relational database and the ontology, we generate two queries per attribute/relationship: one SQL query and one Ontobroker query. Each query is invoked with an instance key value (the corresponding database key in SQL queries) as a parameter and returns the value of the attribute/relationship. Note that the database communication takes place through bind variables. The corresponding SQL query is generated and, if this is the first call, cached. A second call tries to use the same database cursor if still available, without parsing the respective SQL statement; otherwise, it finds an unused cursor and retrieves the results. In this way, efficient access methods for relations and


Figure 6.6.: Querying: F-Logic Query

database rules can be maintained throughout the session.
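The two-step data migration just described can be summarized by the following sketch; the data structures are our own abstraction of the Ontobroker/SQL interplay:

def migrate(rows_by_concept, identifier, attribute_queries):
    # Step 1: create all required concept instances, keyed by the
    # identifiers translated from the database keys.
    instances = {}
    for concept, rows in rows_by_concept.items():
        for row in rows:
            instances[identifier(concept, row)] = {"type": concept}
    # Step 2: fill attributes and relationships; one (cacheable) query per
    # attribute, invoked with the instance key as a bind-variable parameter.
    for inst_id, instance in instances.items():
        for attr, query in attribute_queries.get(instance["type"], {}).items():
            instance[attr] = query(inst_id)
    return instances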

6.8. Conclusion

In this Chapter we have described deep annotation, an original framework for providing Semantic Annotation for large sets of data. Deep annotation leaves semantic data where it can be handled best, viz. in database systems. Thus, deep annotation provides a means for mapping and reusing dynamic data in the Semantic Web with tools that are comparatively simple and intuitive to use. To reach this objective, we have defined a deep annotation process and architecture. We have incorporated means for server-side markup that allow the user to define semantic mappings by using OntoMat. An ontology and mapping editor and an inference engine are then used to investigate and exploit the resulting descriptions. Thus, we have provided a complete framework and its prototype implementation for deep annotation. For the future, there is a long list of open issues concerning deep annotation, from the more mundane, though important, ones (top) to the far-reaching ones (bottom):


1. Granularity: So far we have only considered atomic database fields. For instance, one may find a string “Proceedings of the Eleventh International World Wide Web Conference, WWW2002, Honolulu, Hawaii, USA, 7-11 May 2002.” as the title of a book, whereas one might rather be interested in separating this field into title, location and date.

2. Automatic derivation of server-side Web page markup: A content management system like Zope could provide means for automatically deriving server-side Web page markup for deep annotation. Thus, the database provider could be freed from any additional workload, while still allowing for participation in the Semantic Web. Some steps in this direction are currently being pursued in the KAON CMS, which is based on Zope.⁸

3. Other information structures: For now, we have built our deep annotation process on SQL and relational databases. Future schemes could exploit XQuery⁹ or an ontology-based query language.

4. Interlinkage: In the future, deep annotations may even link to each other, creating a dynamically interconnected Semantic Web that allows translation between different servers.

5. Opening the possibility to directly query the database certainly creates problems, such as new possibilities for denial-of-service attacks. In fact, some queries, e.g. ones that involve too many joins over large tables, may prove hazardous. Nevertheless, we see this rather as a challenge to be solved by clever schemes for allocating CPU processing time (with the possibility that a query is not answered because the time allotted for one query to one user is used up) than as a complete “no go”.

We believe that these options make deep annotation a rather intriguing scheme on which a considerable part of the Semantic Web might be built.

⁸See http://kaon.aifb.uni-karlsruhe.de/Members/rvo/kaon portal
⁹http://www.w3.org/TR/xquery/


7. Application

In this Chapter we present two applications of our annotation framework: linguistic annotation (Section 7.1) and service annotation (Section 7.2). We show how the basic framework can easily be exploited for linguistic markup. Furthermore, we show how to apply the concept of deep annotation to the annotation and composition of Web services. The structure of Section 7.1 is as follows: Section 7.1.1 presents the ontology-based framework for linguistic annotation, and Section 7.1.2 shows how the framework can be applied to the annotation of anaphoric relations. Section 7.1.3 presents the features of CREAM that are useful for the application of linguistic annotation. Finally, Section 7.1.4 gives some concluding remarks about using CREAM for linguistic annotation. The structure of Section 7.2 is as follows: We first describe a simple use case for CREAM-Service (CREAM applied to Web services) (cf. Section 7.2.1), including a detailed WSDL description of a Web service used for the running example. In Section 7.2.2, we describe the process that allows Web services to be turned into a service Web and that lets a user surfing the Web with OntoMat-Service-Surfer exploit the very same tool to aggregate and invoke Web services. The first step of this process, i.e. advertising Web services in a form that combines presentation for human and machine agent consumption, is sketched in Section 7.2.3. The second step of this process, i.e. using browsing and semantic deep annotation to tie together conceptual descriptions, is detailed in Section 7.2.4. References: Section 7.1 is mainly based on [Cimiano and Handschuh, 2003] and Section 7.2 is mainly based on [Agarwal et al., 2003].

7.1. Linguistic Annotation

Linguistic annotation (cf. Section 3.4.3) is crucial for the development and evaluation of natural language processing (NLP) tools. In particular, machine-learning based approaches to part-of-speech tagging, word sense disambiguation, information extraction or anaphora resolution, just to name a few, rely on corpora annotated with the corresponding phenomenon in order to be trained and tested. In this Section, we argue that linguistic annotation can to some extent be considered a special case of Semantic Annotation with regard to an ontology. Part-of-Speech


(POS) annotation, for example, can be seen as the task of choosing the appropriate tag for a word from an ontology of word categories (compare for example the Penn Treebank POS tagset as described in [Marcus et al., 1993]). The annotation of word senses such as used by machine-learning based word sense disambiguation (WSD) tools corresponds to the task of selecting the correct semantic class or concept for a word from an underlying ontology such as WordNet [Resnik, 1997]. Annotation by template filling, such as that used to train machine-learning based information extraction (IE) systems (cf. Section 5.1.2), can be seen as the task of finding and marking all the attributes of a given ontological concept in a text. An ontological concept in this sense can be a launching event, a management succession event or a person, together with attributes such as name, affiliation, position, etc. The annotation of anaphoric or bridging relations is actually the task of identifying the semantic relation between two linguistic expressions representing certain ontological concepts. Most linguistic annotation tools make use of schemes specifying what can actually be annotated. These schemes can in fact be understood as a formal representation of the conceptualization which underlies the annotation task. Ontologies are formal specifications of a conceptualization (cf. Section 2.5), so it seems the logical next step to formalize annotation schemes as ontologies and to make use of our annotation framework (cf. Chapter 4) and the corresponding tool OntoMat for the purpose of linguistic annotation.

7.1.1. The Ontology-based linguistic annotation framework

An ontology is a structure O := (C, ≤C , R, ≤R, A) as defined in Definition 2.5.1 (page 27). Our framework basically offers three ways of annotating a text with regard to an ontology:

• a linguistic expression appearing in a text can be annotated as an instance of a certain ontological concept c ∈ C

• a linguistic expression in a text can be annotated as an attribute instance of some other linguistic expression previously annotated as a certain concept c ∈ C

• the semantic relation between two linguistic expressions, respectively annotated as instances of two concepts c1, c2 ∈ C, can be annotated as an instance of a relation r ∈ R.

The advantages of an ontology-based linguistic annotation framework as described above are the following:


• The formalization of the annotation scheme as an ontology, as well as the use of standard formalisms such as RDF (cf. Section 2.4) or OWL (cf. Section 2.6) to encode it, allows the scheme to be reused across different annotation tools. This meets the interoperability requirement mentioned in [Ide, 2002].

• The specification of the annotation task, i.e. the annotation scheme, can be performed in an arbitrary ontology development environment and thus becomes completely independent of the annotation tool actually used.

• The ontology-based linguistic annotation model offers the kind of flexibility mentioned in [Ide, 2002] in the sense that it is general enough to be applied to a broad variety of annotation tasks.

• The fact that annotation is performed with respect to an ontological hierarchy offers annotators the possibility of choosing the appropriate level of annotation detail, so that they are never forced to overspecify, i.e. to annotate more specifically than they actually feel comfortable with.

In addition, hierarchical linguistic annotation offers further possibilities regarding the computation of the agreement between different annotators as well as the evaluation of a system against a certain annotation. In this sense, instead of measuring only the categorial agreement between annotators with the kappa statistic [Carletta, 1996], or the performance of a system in terms of precision/recall, we could take into account the hierarchical organization of the categories or concepts by making use of measures considering the “hierarchical distance” between two concepts (cf. Section 8.4.2). Furthermore, the use of an ontology-based and thus more semantic framework for linguistic annotation has two further, very interesting properties. On the one hand, the use of an ontology helps to constrain the possible relations between two concepts, thus reducing the number of errors in the annotation process. For example, when annotating Coreference relations in a text, it seems obvious that an event and an entity will never corefer, and in fact such an erroneous annotation can be avoided if the underlying ontological model actually forbids it (see below). Furthermore, by using axioms stating, for example, that Coreference is reflexive, symmetric and transitive, and thus represents an equivalence relation, the evaluation of systems becomes much easier and more straightforward when using an inference machine such as the annotation inference server module of CREAM (cf. Sections 7.1.3 and 4.4.1). If an annotator, for example, annotates the coreferences Coreference(A,B) and Coreference(B,C), a system’s answer such as Coreference(A,C) will actually be deemed correct due to the fact that Coreference is defined as a transitive relation within the ontology.
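To illustrate this evaluation argument, the following sketch computes the reflexive-symmetric-transitive closure of an annotated Coreference relation; it merely stands in for what the inference server derives from the axioms, and all names are ours:

def closure(pairs, domain=()):
    """Reflexive, symmetric and transitive closure of a relation given as
    a set of pairs, i.e. an equivalence relation such as Coreference."""
    rel = set(pairs) | {(b, a) for a, b in pairs} | {(x, x) for x in domain}
    changed = True
    while changed:
        new = {(a, d) for a, b in rel for c, d in rel if b == c}
        changed = not new <= rel
        rel |= new
    return rel

gold = closure({("A", "B"), ("B", "C")})
assert ("A", "C") in gold  # the system answer Coreference(A,C) is deemed correct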


7.1.2. Annotating anaphoric relations

Before showing how our framework can be applied to the annotation of anaphoric relations in written texts, the assumptions underlying our model need to be explained. Firstly, we aim at a more semantic annotation of anaphoric relations than, for example, described in [Müller and Strube, 2001], because we think that such a model can to some extent be subsumed by the one we propose. In fact, we will interpret the term anaphoric in a much wider sense, in line with [Krahmer and Piwek, 2000] and [van Deemter and Kibble, 2000]. They argue, for example, that coreference is not a necessary property of anaphora, as proposed in [Müller and Strube, 2001]. So annotating the relation between two expressions as anaphoric will correspond to the most general relation in our hierarchy. In particular, Identity or Coreference will only be a special type of anaphoric relation in our model (compare Figure 7.2). On the other hand, bridging will be defined in our framework, in line with [Asher and Lascarides, 1999], as “the inference that two objects or events that are introduced in a text are related in a particular way that isn’t explicitly stated”. Thus, Coreference or Identity can represent an anaphoric relation or, more specifically, a bridging reference, depending on whether or not the identity relation is explicit. Consider the following minimal pair:

(7.1) John bought a car yesterday. The car was in a good state.

(7.2) John bought a car yesterday. The vehicle was in a good state.

In example (7.1), the anaphoric relation is explicit due to the matching heads of the NPs a car and The car. In (7.2), the anaphoric or bridging relation is not explicit, as the world knowledge that cars are vehicles is needed to resolve the reference. In the semantics-based model for the annotation of anaphoric relations we propose in this Chapter, both examples will in fact be annotated as instances of the Coreference or Identity relation. Consequently, we will completely omit the concept bridging reference in the ontology underpinning the annotation. In fact, we claim that the classification of an anaphora as a bridging reference, direct anaphora, pronominal anaphora, etc., as pursued in [Müller and Strube, 2001], can be seen as a byproduct of a more semantic classification as proposed here, if additional grammatical information provided by the annotators is available. This grammatical information can be added to the concepts depicted in Figure 7.1 in the form of attributes specifying the grammatical form of the expression, i.e. whether it is for example a noun, an NP, a pronoun, a verb or a VP, as well as information about its head, gender or tense. The semantic classification proposed here, together with the grammatical information modeled as attributes of a concept, will then yield a classification as envisioned by [Müller and


Strube, 2001]. For example, if two expressions are annotated as coreferring, this semantic relation can be further classified as pronominal anaphora if the referring expression is a pronoun, as direct anaphora if the heads of the expressions match, or as a bridging reference otherwise. On the other hand, all the Non-Identity relations modeled in the ontology underlying the annotation task will lead to a classification as a bridging reference (compare Figure 7.2). However, it should be mentioned that we do not aim at such a “grammatical” classification of anaphoric relations. We envision a task as in [Asher and Lascarides, 1999], where bridging reference resolution corresponds to the task of finding the discourse referent serving as antecedent as well as the semantic relation between this discourse referent and the one of the referring expression. In our model, an expression can be the antecedent of more than one referring expression, an assumption which seems to be commonly shared by many annotation schemes. However, in our model a certain expression can also refer to more than one antecedent. [Poesio and Reyle, 2001], for instance, show that the antecedent of a referring expression can in fact be ambiguous in a way that does not affect the overall interpretation of the expression or sentence. Furthermore, [Poesio and Reyle, 2001] argue that it is not clear whether the addressees of an utterance actually are aware of all the possible antecedents of a certain referring expression, whether they underspecify the antecedent of a referring expression in case the overall interpretation is not affected, or whether they just choose one of the possible antecedents without being aware of the other ones. In any case, a model for the annotation of anaphoric or bridging relations should not a priori exclude that referring expressions can have more than one antecedent. Consequently, the annotation of the semantic relation between a referring expression and an antecedent can take place neither at the antecedent nor at the referring expression, as in [Müller and Strube, 2001], but in a functional way, i.e. at a virtual edge between them. The ontology underlying our annotation scheme is depicted schematically in Figure 7.1. We distinguish two types of eventualities, events and states, and model the discourse relations described in [Lascarides and Asher, 1991] as semantic relations between them. In addition, we distinguish three types of (meta-)entities: sets of entities, intensional entities [van Deemter and Kibble, 2000] and (real-world) entities, together with the potential relations such as member_of, part_of, etc. between them as well as to other types: an entity, for example, can play a certain thematic role in some event (compare Figure 7.1). With such a concept hierarchy as well as semantic relations with a precisely defined signature, we can, for example, overcome annotation problems of intensionality and predication as discussed in [van Deemter and Kibble, 2000]. In order to profit from the benefits of a hierarchical annotation, we also define a hierarchy of the semantic relations (see Figure 7.2). Thus, if annotators, for example, feel that there is an anaphoric relation between two linguistic expressions but cannot specify the type of relation, they can choose the most general relation in the



Figure 7.1.: The ontology underlying the annotation scheme


Figure 7.2.: The hierarchical organization of the semantic relations.

hierarchy, i.e. anaphoric relation. As mentioned in Section 7.1.1, the idea is that annotators are never forced to overspecify and can annotate at the hierarchical level they feel comfortable with.


7.1.3. CREAM and OntoMat

The CREAM framework and its concrete implementation OntoMat are described in detail in Chapter 4. The framework itself was developed for the creation of ontology-based annotations in the context of the Semantic Web. However, with an appropriate ontology one can also take advantage of the framework and use it for linguistic annotation. In the subsequent section we briefly highlight the features that are relevant to this new purpose (for an extensive description of the CREAM modules cf. Section 4.4.1).

CREAM Features

User Interface. OntoMat’s document viewer visualizes the document contents. The Ontology and Fact Browser is the visual interface to the ontology and the annotated facts. Both the Ontology and Fact Browser and the document editor/viewer are intuitive to use: drag’n’drop helps to avoid syntax and typographical errors, and a good visualization of the ontology helps the annotators to correctly choose the most appropriate class for an instance (compare Figure 7.3).

Annotation. An annotation in our context is a set of instantiations of classes, relationships and attributes. These instances are not directly embedded into the text, but point to appropriate fragments of the document. The link between the annotation and the document is established by using XPointer (cf. Section 2.3) as an addressing mechanism. This has some advantages with regard to the flexibility of annotation, as it allows (i) multiple annotation, (ii) nested annotation and (iii) overlapping annotation of text segments.

Annotation Inference Server. The annotation inference server reasons on the instances and on the ontology. In doing so, it also takes into account the axioms modeled within the ontology and can thus be used in the evaluation of a certain system, as described in Section 4.4.1.

Storage. CREAM supports different ways of storing the annotations. This flexibility is provided by the XPointer technique, which allows the separation of the annotations from the document. Hence, the annotations can be stored together with the document. Alternatively or simultaneously, it is also possible to store them remotely, either in a separate file or in the annotation inference server.


Annotating anaphoric relations

The ontology described in Section 7.1.2 is available in the form of DAML+OIL¹ classes and properties, in OWL, as pure RDF-Schema and in F-Logic. In the following, we explain briefly how OntoMat can be used for the creation of instances consistent with the ontology in question. Figure 7.3 shows the screen for navigating the ontology and creating annotations in OntoMat. The right pane displays the document and the left pane shows the ontological structures contained in the ontology, namely classes, attributes and relations. In addition, the left pane shows the current Semantic Annotation knowledge base, i.e. the existing class instances, attribute instances and relationship instances created during the Semantic Annotation.

Figure 7.3.: Annotation Tool Screenshot

First of all, the user browses a document by entering the URL of the Web document that he would like to annotate. Then he loads the corresponding ontology into the ontology browser. He selects a text fragment by highlighting it. There are two ways in which the

¹http://annotation.semanticweb.org/ontologies/AnaphOnto.daml

text fragment may be annotated: as an instance or as a relation. In the case of an instance, the user selects in the ontology the class into which the text fragment fits, e.g. for the expression ”a car” in Example 7.1 he would select the class entity. By clicking on the class, the annotation is created, and the text fragment will be displayed as an instance of the selected class in the ontology browser. The relationships between the created instances can then be specified, e.g. the entity The car can be annotated as coreferring with the preceding entity a car, as described in Section 7.1.1. For this purpose, when the user selects a certain class instance as well as a corresponding semantic relation from the ontology, OntoMat already presents the possible target class instances according to the range restrictions of the chosen relation. Hereby, erroneous annotations of relations are avoided (compare Section 7.1.1). Furthermore, literal attributes can be assigned to every created instance by typing them into the related attribute field. The choice of the predefined attributes depends on the class to which the instance belongs. Thus, instances of a certain concept can be annotated with grammatical information about how they are linguistically expressed, i.e. through an NP, a noun, a pronoun, a verb, etc. (compare Section 7.1.2).

7.1.4. Conclusion

We have argued that many linguistic annotation tasks can be seen as a special case of Semantic Annotation with regard to an ontology, and have proposed a novel ontology-based framework for this purpose. Furthermore, we have applied our framework to the annotation of anaphoric relations in written texts. For this purpose we have proposed a relatively complex annotation scheme for anaphoric relations, in which we have deliberately abstracted from important issues such as inter-annotator agreement. In fact, the main contribution of this Section is to show that relatively complex annotation schemes such as the one proposed can be modeled in our ontology-based framework in a straightforward manner. The main benefits of the approach presented here are that the annotation can be performed at different levels of detail with regard to a given taxonomy, and that the possible relations between two different concepts are constrained by the underlying ontology, which could make the annotation less error-prone. In addition, we have shown how the modeling of axioms within the ontology can actually make the evaluation of a system more straightforward. The most important advantage is that, by specifying the annotation scheme in the form of an ontology and adhering to standards such as RDF or OWL, it can be easily exchanged between different parties and can also be developed independently of the annotation tool used, which meets the interoperability requirement mentioned in [Ide, 2002]. In addition, our framework is flexible enough to be applied to various annotation tasks, which is also a requirement mentioned in [Ide, 2002].


7.2. Service Annotation

The Stencil Group defines Web services as: loosely coupled, reusable software components that semantically encapsulate discrete functionality and are distributed and programmatically accessible over standard protocols. Though this definition captures the broad understanding of what Web services are, it raises the question what Web services have to do with the Web. Even if HTTP is used as a transport protocol and XML/SOAP to carry some syntax, this appears to be a rather random decision rather than a deeply meaningful design. We believe that it makes sense to actually integrate the strengths of the conventional World Wide Web, viz. lightweight access to information in a highly-distributed setting, with the strengths of Web services, viz. execution of functionality by lightweight protocols in a highly-distributed setting. To seamlessly integrate the two aspects we envision a service Web that uses XHTML/XML/RDF to transport information, a Web service framework to invoke operations, and a framework, CREAM-Service, to bind the two aspects together. CREAM-Service offers an infrastructure, OntoMat-Service-Surfer, that allows

• for seamlessly browsing conventional Web pages, including XHTML advertisements for Web services;

• for direct, manual invocation of an advertised Web service as a one-off use of the service;

• for tying Web service advertisements to each other when browsing them;

• for tying Web service advertisements to one’s own conceptualization of the Web space when browsing them; and

• for invoking such aggregated Web services.

For these objectives, we build on existing technologies like RDF (cf., Section 2.4), ontologies (cf., Section 2.5) or WSDL [W3C, 2003]. To integrate the Web and Web services into the service Web, we make specific use of a new type of Semantic Annotation (cf. Chapter 4), namely deep annotation (cf. Chapter 6).

7.2.1. Use Case

A typical use case supported by CREAM-Service is the following (adapted from a larger scenario in [Maedche and Staab, 2003]): An employee in a small enterprise needs a new laptop. In order to buy one he defines the characteristics of the laptop like processor speed, disk size, etc. Based on the configuration of the

laptop he collects offers from laptop vendors. When he receives an offer he also solicits insurance terms from a third party. Once the most reasonable laptop and the best insurance contract terms are determined, the employee purchases the laptop and closes the service contract. In our scenario, we assume a laptop vendor and an insurer offering Web services with two operations each, i.e. getLaptopOffer / buyLaptop and getInsuranceTerms / closeServiceContract, respectively. The sequence of operations that must be executed by the customer is depicted in Figure 7.4.

Figure 7.4.: Sequence Diagram for the Use Case

The laptop vendor and the insurer, being Web service providers, describe their Web services with WSDL documents. In Listing 7.1, we show how a conventional WSDL document of the laptop vendor located at http://laptop-vendor.de/laptop.wsdl might look.2 The WSDL document describes:

• Data type definitions in the XML element types. They are only sketched in Listing 7.1, as they correspond to the laptop vendor's ontology depicted in N3 (cf., Section 2.4.4) in Listing 7.3. Thereby, we assume the definitions given in Listing 7.2. In our running example, i.e. the WSDL document of the laptop vendor, we describe the class Laptop.

• Messages that a service sends and/or receives and that constitute the Web service operations in the XML element portType.

2The single idiosyncrasy we have here is that the WSDL document employs RDFS in order to describe the data structures instead of the more common XML Schema — though actually WSDL does not require XML Schema and it allows RDFS.


Listing 7.1: Web Service Description of Laptop Vendor
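Only the fragments Laptop, diskSpace and price survive from the original markup of Listing 7.1. A minimal WSDL 1.1 sketch that is consistent with these fragments and with the surrounding description (every name not mentioned in the text is an assumption) could look like this:

<definitions name="LaptopVendorService"
             targetNamespace="http://laptop-vendor.de/laptop.wsdl"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://laptop-vendor.de/laptop.wsdl"
             xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <types>
    <!-- RDFS instead of XML Schema, cf. footnote 2 and the vendor ontology in Listing 7.3 -->
    <rdfs:Class rdf:ID="Laptop"/>
    <rdf:Property rdf:ID="diskSpace"/>
    <rdf:Property rdf:ID="price"/>
  </types>
  <message name="getOffersRequest">
    <part name="processorSpeed"/>
    <part name="diskSpace"/>
  </message>
  <message name="getOffersResponse">
    <part name="laptopOffers"/>
  </message>
  <portType name="LaptopVendorPortType">
    <operation name="getLaptopOffer">
      <input message="tns:getOffersRequest"/>
      <output message="tns:getOffersResponse"/>
    </operation>
  </portType>
</definitions>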

For instance, our running example specifies 'getOffersRequest', which a potential customer would send to the laptop vendor to solicit an offer. getOffersRequest must be provided with two arguments, namely processor speed and disk size. It returns a set of laptop offers with properties as specified in the vendor ontology (cf. the WSDL document in Listing 7.1 and the vendor ontology in Listing 7.3).

WSDL provides a naming convention for URIs such that each conceptual element (e.g., types, portType, etc.) of a WSDL document can be uniquely referenced. Such a URI consists of a targetNamespace pointing to the location of the WSDL

document and of element names within the WSDL document. For example, the URI http://laptop.wsdl/laptop/#part(getOffersRequest/diskSpace) refers to the second part (diskSpace) of the message getOffersRequest of the WSDL document in Listing 7.1 (cf. [W3C, 2003] for further specifications). The Web service description of the insurer looks similar. We here only mention that the insurer provides the operations getInsuranceTerms and closeServiceContract. getInsuranceTerms requires a description of a Laptop (according to the insurer's ontology in Listing 7.4) and a timePeriod, for which the contract is supposed to run. getInsuranceTerms returns the set of insurance terms available.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix : <#>.
# "a" is the N3 shorthand for rdf:type.
Listing 7.2: N3 shortcuts

:Laptop a rdfs:Class.
:price rdfs:domain :Laptop; rdfs:range rdfs:Literal.
:diskSpace rdfs:domain :Laptop; rdfs:range rdfs:Literal.
:processorSpeed rdfs:domain :Laptop; rdfs:range rdfs:Literal.
:laptopID rdfs:domain :Laptop; rdfs:range rdfs:Literal.

:Offer a rdfs:Class.
:laptops rdfs:domain :Offer; rdfs:range :Laptop.

:Sale a rdfs:Class.
:laptop rdfs:domain :Sale; rdfs:range :Laptop.
:creditCardNumber rdfs:domain :Sale; rdfs:range rdfs:Literal.
:customerReceipt rdfs:domain :Sale; rdfs:range rdfs:Literal.
Listing 7.3: Ontology of the laptop vendor

:Laptop a rdfs:Class.
:id rdfs:domain :Laptop; rdfs:range rdfs:Literal.

:ContractTerms a rdfs:Class.
:laptop rdfs:domain :ContractTerms; rdfs:range :Laptop.
:timePeriod rdfs:domain :ContractTerms; rdfs:range rdfs:Literal.
:price rdfs:domain :ContractTerms; rdfs:range rdfs:Literal.
Listing 7.4: Ontology of the insurance company

In the remainder of this section, we assume that the customer has the plan depicted in Figure 7.4. However, in our running example, we will mostly focus on the first two steps to illustrate our framework.


7.2.2. Overview of the Complete Process of CREAM-Service

Figure 7.5 shows the complete process of our framework, CREAM-Service. First, the figure consists of process steps, which are illustrated by a circle representing the step and a person icon representing the logical role of the person who executes the step, viz. service provider, annotating service Web surfer, and user invoking a Web service. The latter two roles typically coincide. Second, the figure comprises information that is used by a person or by OntoMat-Service-Surfer in a process step.

Figure 7.5.: The Complete Process of OntoMat-Service

The four main steps run as follows:

Init: CREAM-Service starts with a common WSDL Web service description provided by the service provider (e.g., Listing 7.1). Obviously, the WSDL document is primarily intended for use by a machine agent or a software engineer who has experience with Web services. It is not adequate for presentation to a user who is 'only' an expert in a domain.

Web Service Presentation (Step 1): In the first step, the Web service provider makes the Web service presentation readable as a nicely formatted (X)HTML document — possibly including advertisements, cross-links to other HTML pages or services, or other items that make the Web page attractive to the potential customer (cf. Section 7.2.3 for details). Thereby, it is important that the understandable, but informal, description of the Web service is implicitly annotated in order to relate the textual descriptions to their corresponding semantic descriptions in the WSDL document. Step 1 is a manual step that may be supported by tools such as the WSDL Documentation Generator from http://www.xmlspy.com. However, we would not assume that tools like the WSDL Documentation Generator are sufficient to generate an amenable presentation, as they still produce rather rigid and technically oriented descriptions.


Result 1: Human understandable Web page that advertises the Web service and embeds/refers to machine understandable Web service descriptions (WSDL + ontology).

Deep Annotation (Step 2): At the client side, a potential user of the Web service browses the Web page. OntoMat-Service-Surfer shows the Web page like a conventional browser. In addition, OntoMat-Service-Surfer highlights human-understandable items (e.g. text phrases) that are associated with an underlying machine-understandable semantics. The logical role of the user here is one of an annotator/surfer. He can decide to just view the page and proceed directly to step 4 (described below). Alternatively, he can decide to map some of the terminology used in the Web page of the Web service to his own terminology (or to the terminology of someone else). For the latter purpose, he loads an ontology into OntoMat-Service-Surfer (if it is not already pre-loaded). Then he aligns terminology mentioned in the Web page by drag'n'drop-ping it onto the ontology loaded into OntoMat-Service-Surfer. OntoMat-Service-Surfer generates mapping rules from these annotations that bridge between the ontology of the service provider and the ontology loaded into OntoMat-Service-Surfer (cf. Section 7.2.4 for details). Typically, the user will map to more than one Web service, i.e. often he will map to different ontologies.

Result 2: Sets of mapping rules between Web service ontologies and the pre-loaded ontology.

Web Service Planning (Step 3): At the client side, a user might view the Web services as well as their annotations that yield mapping rules. The third logical role here is that of a service planner and invocator (this logical role is shared between the third and the fourth step). For this purpose, the user decides to select

• a set of Web service operations he wants to use, and

• a set of mapping rules he wants to use.

The reader may note that very frequently the roles of an annotator/surfer and a service invocator will just coincide. Hence, the two selections just mentioned will take place implicitly — just by the Web service pages the user has browsed and the annotations that the service invocator has performed in step 2 of the CREAM-Service process. Once the two selections have been performed im- or explicitly, a module for Web service planning will compute logically possible Web service flows. For


this objective, Web service planning may employ a rich set of knowledge: goals, pre-conditions of Web services, post-conditions of Web services, previous similar cases, etc. In the current version of CREAM-Service we just exploit the pre- and post-conditions derived from mapping one Web service output to another Web service input via the customer ontology. The Web service description in the associated WSDL document describes what types are required for the input of a Web service and what types appear in the output of a Web service. Since data that flows from one Web service to the next can only proceed if the types are compatible, OntoMat-Service-Surfer can compute a restricted set of possible Web service flows. Though in general this model may be too weak to compute complex flows, it is quite sufficient and straightforward to use with a small number of selected and semantically aligned Web services — such as an end user or prototype builder will use.
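As an informal illustration of this type-based chaining in our running example (the arrow notation is shorthand, not tool syntax, and it assumes that the insurer's Laptop has been mapped to client:Computer as well, cf. Figure 7.8):

$$\mathit{getLaptopOffer}.\mathit{out} : \mathit{vendor{:}Laptop} \;\mapsto\; \mathit{client{:}Computer}$$
$$\mathit{getInsuranceTerms}.\mathit{in} : \mathit{insurer{:}Laptop} \;\mapsto\; \mathit{client{:}Computer}$$

Since both message types meet in client:Computer, the flow getLaptopOffer → getInsuranceTerms is type-compatible and can be proposed by the planner.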

Result 3: Sets of possible Web service flows.

Web Service Invocation (Step 4): The final user, i.e. the invocator, can select one such flow from the list or modify one of them, if none fits his needs. Obviously, he can always create a new flow on his own. Once the user has a flow that fulfills his current needs, he invokes the flow. During the execution, the transformation of the data from one ontology to another happens automatically via the mapping rules. The user achieves his goal with the completion of the invocation of the Web service flow.

7.2.3. Semantic Web Page Markup for Web Services

In this section we show how a Web service provider can semantically annotate the Web pages describing his Web services. Such a combined presentation allows for improved ways to find the Web services (e.g., by a combined syntactic/semantic search engine), and it enables a user to understand the functionality of a Web service and to define mapping rules between the ontology used in the Web service description and the client's ontology. The basic idea is that a conventional HTML page about the Web service and the Web service parameters is extended by URIs referring to conceptual elements of the corresponding WSDL documents. To carry these two pieces of information, we use the attributes wsdlLocation and elementURI inside span tags. In Listing 7.5, we show how such a Web service advertisement (HTML page) for the laptop vendor service might look. When such an HTML page is opened in OntoMat-Service-Surfer, the span tags are interpreted and the elements between <span> and </span> are highlighted to support the annotation step described in the next section.


<html><head><title>Laptop Vendor Service</title></head>
<body>
<h1 align="center">Laptop Vendor Service</h1>
<h2>getLaptopOffers</h2>
This service delivers the top offers of the laptops available in the city.
We have the largest archive of the laptop offers for the city. So, the
possibility that you find your desired laptop at a reasonable price is very
high. Just try it and get convinced from our great offers.
<ul>
<li>
<span wsdlLocation="http://laptop-vendor.de/laptop.wsdl"
      elementURI="http://laptop.wsdl/laptop/#part(getOffersRequest/processorSpeed)">
<b>Processor speed</b>
</span>
Specifies the speed of the processor. Please use only the units "MHz" and
"GHz". For example, "2GHz", "1.4GHz" and "1600MHz" are valid whereas "1800"
or "170000KHz" are invalid.
</li>
<li>
<span wsdlLocation="http://laptop-vendor.de/laptop.wsdl"
      elementURI="http://laptop.wsdl/laptop/#part(getOffersRequest/diskSpace)">
<b>Disk space</b>
</span>
Specifies the disk space. Please use only the units "GB" and "MB". For
example, "20GB", "30.5GB" are valid whereas "40" or "25000KB" are invalid.
</li>
<li>
<span wsdlLocation="http://laptop-vendor.de/laptop.wsdl"
      elementURI="http://laptop-vendor.wsdl/laptop/#part(getOffersResponse/laptopOffers)">
<b>Top Offers</b>
</span>
This is the list of the most reasonable offers available in the city that
fulfill your requirements.
</li>
</ul>
...
</body></html>
Listing 7.5: Web Service Description embedded in HTML Page

7.2.4. Browsing and Deep Annotation

In the following, we describe the second main step of the CREAM-Service process. This step consists of browsing Web pages about Web services with OntoMat-Service-Surfer. Thereby, the user may annotate (cf. Chapter 6) these Web pages, generating mapping rules between a client ontology and the ontologies referred to in the WSDL documents as a 'side effect' of annotation. We call this action 'deep annotation' as its purpose is not to provide semantic annotation about the surface of what is being annotated, which would be the Web page, but about the semantic structures in the background, i.e. the WSDL elements.3 Thus, this step is about Web service discovery by browsing and using information retrieval engines like Google, as well as about reconciling semantic heterogeneity between different Web services, as described in the WSDL documents and the Web service ontologies they embed or refer to.

3Chapter 6 goes into detail on using deep annotation as the basis for database integration.

Service Browsing

With OntoMat-Service-Surfer the user can surf the service Web, i.e. he can browse the Web pages of Web service advertisements, and OntoMat-Service-Surfer highlights Semantic Annotations added by the Web service provider. OntoMat-Service-Surfer indicates semantically annotated Web service elements, e.g. input parameters, by graphical icons on the Web page. Thus, the user may easily identify relevant terminology that needs to be aligned with his own ontology.

Deep Annotation

The user selects an ontology to be used for annotation and loads it into OntoMat-Service-Surfer. The user annotates the Web service by drag'n'dropping highlighted items from the Web page into the ontology browser of OntoMat-Service-Surfer.
In doing so, he could extend the Web page with metadata if he has write access; primarily, however, he establishes mappings between concepts, relations and attributes from the ontology used by the Web service provider and his client ontology (cf. Chapter 6). In the following we describe the deep annotation of the vendor Web service shown in Figure 7.6. The Web page advertising the Web service describes the getLaptopOffer operation and constitutes the context for the usage of the vendor ontology. The aim of the annotator is to translate the terminology used in the description of getLaptopOffer (cf. the WSDL document in Listing 7.1 and the vendor ontology in Listing 7.3) into his client ontology (Listing 7.6). By drag'n'drop, one generates a graph of instances, relations between instances and attribute values of instances in the browser that visualizes the client ontology (cf. the left pane depicted in Figure 7.6).

When performing a drag'n'drop, one will create a literal instance if one drops

1. an instance of the vendor ontology onto a concept in the client ontology, or

2. a literal value onto a concept of the client ontology, or

3. an attribute value of an instance onto an attribute in the client ontology.

For instance, dropping 'IBM' onto the concept company would create a corresponding literal instance in the client ontology; dropping '7MB' onto a size attribute of a selected instance creates a corresponding attribute value for this selected instance in the client ontology.

:Product a rdfs:Class.
:id rdfs:domain :Product; rdfs:range rdfs:Literal.

:HardDisk a :Product.
:diskSize rdfs:domain :HardDisk; rdfs:range rdfs:Literal.
:Computer a :Product.
:hasHDD rdfs:domain :Computer; rdfs:range :HardDisk.
:price rdfs:domain :Computer; rdfs:range rdfs:Literal.
:cpuSpeed rdfs:domain :Computer; rdfs:range rdfs:Literal.

:Agent a rdfs:Class.
:Company a :Agent.
:creditCardNumber rdfs:domain :Company; rdfs:range rdfs:Literal.

:Purchase a rdfs:Class.
:hasBuyer rdfs:domain :Purchase; rdfs:range :Agent.
:hasObject rdfs:domain :Purchase; rdfs:range :Product.

:Insurance a rdfs:Class.
:hasObject rdfs:domain :Insurance; rdfs:range :Product.
:price rdfs:domain :Insurance; rdfs:range rdfs:Literal.
:timePeriod rdfs:domain :Insurance; rdfs:range rdfs:Literal.
Listing 7.6: Ontology of the client

When performing a drag'n'drop, one will create a generic instance if one drops

• a concept A from the vendor ontology onto a client ontology concept B.

A generic instance is just a variable that states that concept A in the vendor ontology corresponds to concept B in the client ontology.4 Thus, one may augment the client ontology (represented in RDF by a graph G) by a graph G∆ of new and different types of instances.5 Each subgraph of G∆ of non-separable, newly created instances and values in the client ontology corresponds to a mapping rule. For instance, one may (i) drag'n'drop 'processorSpeed' (from the vendor ontology) onto cpuSpeed (from the client ontology), which belongs to Computer (again in the client ontology). Thereby, (ii), a generic instance is created for Computer with value Laptop (as cpuSpeed belongs to Computer and processorSpeed belongs to Laptop). The corresponding interpretation in first-order logic is:

FORALL X,Y (instanceOf(X,client:Computer) AND client:cpuSpeed(X,Y))
    <- (instanceOf(X,vendor:Laptop) AND vendor:processorSpeed(X,Y)).

4Corresponding generalizations exist for attributes and relationships.
5The newly populated ontology would then be G′ := G ∪ G∆.
Figure 7.6.: Screenshot of OntoMat-Service-Surfer annotating vendor service

One may trace the drag'n'drop actions just described in Figure 7.6, where action 1 picks up 'Processor Speed' with its underlying Web service parameter processorSpeed (cf. the markup elementURI="http://laptop.wsdl/laptop/#part(getOffersRequest/processorSpeed)" in Listing 7.5). It is dropped onto the attribute that comes closest in his client ontology, viz. the aforementioned cpuSpeed, and generates the consequences just mentioned. Similarly, the second text item "Disk Space", being annotated with the input parameter diskSpace, is handled in action 2. This time, however, the annotator must also create a hasHDD relationship between the generic instance hardisk1 and the generic instance computer1 to build a larger graph representing a mapping rule with two generic attribute values (on cpuSpeed and diskSpace). Finally, the annotator maps the output parameters in action 3 (cf. Figure 7.6).

Figure 7.7.: Mapping between Client Ontology (left window) and Vendor Ontology (right window)

Investigating and Modifying Mapping Rules

The results of deep annotation are mapping rules between the client ontology and each service ontology. The annotator may publish the client ontology and the mapping rules derived from the annotations. This enables third parties (in particular the logical roles that follow in the CREAM-Service process) to execute the services on the basis of the semantics defined in the client ontology. We use F-Logic to define the mapping rules. F-Logic is a deductive, object-oriented database language that combines the declarative semantics and expressiveness of deductive database languages with the rich data modeling capabilities supported by the object-oriented model [Kifer et al., 1995].6 However, the annotator does not have to write F-Logic rules. They are generated automatically by OntoMat-Service-Surfer. Figure 7.7 and Figure 7.8 give the reader an intuition of how such automatically generated mapping rules look when visualized with the OntoEdit plugin OntoMap (cf., Section 6.7). Figure 7.7 shows the mapping from the company ontology to the vendor ontology, which is a result of the annotation effort indicated in Figure 7.6. The result for the corresponding mapping of the insurer's ontology is depicted in Figure 7.8.

6Thus, in our implementation the aforementioned exemplary mapping rule looks slightly different from the depicted first-order logic formulation. Since the first-order presentation is conceptually close enough, we have decided not to distract the reader with another syntax.

Figure 7.8.: Mapping between Client Ontology (left window) and Insurer's Ontology (right window)
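To give a concrete impression of such a generated rule, the mapping resulting from actions 1 and 2 together (cf. Figure 7.7) corresponds roughly to the following rule in the first-order notation used above. This is only a sketch: the function term hd(X) stands in for the generic hard-disk instance (hardisk1 in Figure 7.6), and the F-Logic actually generated by the tool differs syntactically (cf. footnote 6).

FORALL X,Y,Z ( instanceOf(X,client:Computer) AND client:cpuSpeed(X,Y)
           AND instanceOf(hd(X),client:HardDisk) AND client:hasHDD(X,hd(X))
           AND client:diskSize(hd(X),Z) )
        <- ( instanceOf(X,vendor:Laptop) AND vendor:processorSpeed(X,Y)
           AND vendor:diskSpace(X,Z) ).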
7.2.5. Conclusion

In this Section we have described CREAM-Service (an application of the deep annotation framework presented in Chapter 6) to tie together the World Wide Web and Web services into a Service Web. Germane to CREAM-Service is its blending of browsing the Web, aggregating conceptual descriptions and Web services, and then investigating and invoking them from one platform. We have also presented OntoMat-Service-Surfer, a tool that constitutes a prototype implementation of CREAM-Service. Currently, our prototype understands WSDL with RDF(S) for Web service descriptions, but its flexible architecture allows easy integration of more powerful Web service description languages like DAML-S [Ankolekar et al., 2002]. Clearly, one must be aware of what CREAM-Service and OntoMat-Service-Surfer can do and what they cannot do. CREAM-Service is not intended to cater to businesses that want to establish complex Web service connections with intricate interactions. For this objective, the integration by Semantic Annotation may provide a quick first prototype, but Semantic Annotation cannot provide arbitrarily complex mapping rules or arbitrarily complex workflows. On the other hand, CREAM-Service allows exactly for easily building a prototype Web service integration, and it allows users with domain knowledge (e.g. consultants doing ERP configuration) to participate in the Service Web — without much programming. CREAM-Service opens up many interesting questions that need to be solved in the future, such as

• how to automate the way that Web services are presented to the world;

• how to characterize the boundaries of what functionality can be aggregated and executed; and

• how to annotate mappings between ontologies (semi-)automatically [Patil et al., 2004].

Eventually, CREAM-Service and OntoMat-Service-Surfer, in conjunction with their counterparts in Semantic Annotation (cf. Chapter 4) and deep annotation (cf. Chapter 6), open up the possibility to bring Web pages, databases and Web services into one coherent framework and thus progress the Semantic Web to a large Web of data and services.

Part III.

Evaluation

“Designing Annotation Before It’s Needed.” — Frank Nack

8. Evaluation of Manual Annotation

This Chapter deals with the evaluation of hands-on experiences exploring an annotation experiment with human subjects. It describes an empirical evaluation study (Section 8.2 and Section 8.3) of ontology-based Semantic Annotation. Based on a given ontology and a set of documents, we analyze in this chapter the inter-annotator agreement between different humans. The evaluation study uses several standard and two original measures (Section 8.4). The latter take into account a notion of sliding agreement between metadata – exploiting semantic background knowledge provided by the ontology. Section 8.5 introduces a simple sample evaluation and Section 8.6 presents the overall cross-evaluation results.

References: This chapter is partly based on [Staab et al., 2001b].

8.1. Introduction

In the previous sections we presented our comprehensive annotation environment. In this section we focus on the human annotators. The human factor is easily underestimated, but is extremely critical for the manual creation of metadata. There is a lack of experience in creating semantically interlinked metadata for Web pages. It is not clear how human annotators perform overall and, hence, it is unclear what can be assumed as a baseline. Though there are corresponding investigations for merely indexing documents, e.g. in library science [Leonard, 1977], a corresponding, more detailed investigation of the assignment of interlinked metadata that takes advantage of the structure of RDF is lacking.
In the following we deal with an evaluation of hands-on experiences, exploring an experiment with human subjects. We describe an empirical evaluation study of ontology-based Semantic Annotation. Based on a given ontology and a set of documents, we have analyzed the agreement between different users. Our objective was to find out about inter-annotator agreement and to come up with some measures about what can be expected from Semantic Annotation as an input for machine processing.

8.2. Evaluation Setting

8.2.1. General Setting

We have evaluated our environment for ontology-based Semantic Annotation with human subjects performing Semantic Annotation. In order to determine their inter-annotator agreement, we have undertaken the following experiment: Nine subjects who were undergraduate students in industrial engineering annotated 15 Web pages1 of our institute as part of fulfilling their requirements in a seminar on the "Semantic Web". The domain expertise of the subjects was very sparse. Some of them had some very minor knowledge about the topics and about semantics from introductory courses in computer science, but no prior knowledge of ontologies and Semantic Annotation. Before doing the actual annotations, subjects received 30 minutes of training. It took about 15 minutes to explain to them the overall goal of Semantic Annotation and to teach them the basic meaning of the Semantic Web Research Community (SWRC) ontology2. The rest of the time was used to acquaint them with the annotation tool. All in all, we thus expected that the overall achievements could not score very high compared to an expert annotator.

8.2.2. Semantic Annotation Categories

Individual annotation of the 15 test pages led for each annotator to a set of RDF (see Section 2.4) annotated HTML files. From these files we extracted the corresponding annotations as ground facts. Referring to our definition of annotation in Section 3.4.1, we define four semantic categories: i) instance identification, ii) instance-class relationship, iii) instance-attribute relationship and iv) instance-instance relationship. According to these four semantic categories, we distinguished between four different evaluation categories, which are motivated by the varying difficulties they exhibit for the human annotator:

1. The first one only considers the instance identification. An annotator may choose to use a string as an identifier to denote a new instance. These instance identifiers play a role similar to that of primary keys in databases. In analogy, we rely on the unique name assumption: two different identifiers are assumed to denote different instances. In our example, subjects would, e.g., identify an instance with identifier RudiStuder based on the identifier proposals given by the tool.

1The Web pages are available at http://ontobroker.semanticweb.org/annotation/SemAnn/
2The SWRC ontology models the Semantic Web Research Community, its researchers, topics, publications, tools and properties between them. A detailed description of the SWRC ontology and the ontology itself is available at http://ontobroker.semanticweb.org/ontos/swrc.html
2. The second category includes all instance–class relationships, such as instance-of(RudiStuder, FullProfessor), which means that RudiStuder belongs to the set of FullProfessors or — precisely speaking — that the string "RudiStuder" is a unique descriptor for an instance of FullProfessor.

3. The third one comprises all instance–attribute relationships, such as surname-of(RudiStuder, Studer), which means that the surname of the entity identified by RudiStuder is Studer.

4. The last category is constituted by instance–instance relationships. It includes the relations between two distinct instances, such as married-to(RudiStuder, IreneStuder) or heads(RudiStuder, KMResearchGroup), with their obvious interpretations.

As we will also see in our evaluation in the following, the first category reaches good values based on the proposals by our tool. Instance–class assignment is very difficult and results in very low inter-annotator agreement. Attributing seems comparatively easy, whereas recognizing instance–instance relationships appears to be the hardest, as it requires elaborate thinking about the denotation of two distinct instances at a rather abstract level.

8.3. Formal Definition of Evaluation Setting

The comparison and evaluation of ontology-based Semantic Annotation is not well researched (cf. Section 10 for a detailed comparison of existing work). To the best of our knowledge, no established measure existed on which we could build. In our Semantic Annotation scenario we distinguish two different types of measures: on the one hand, we adopt the well-known measures of precision and recall from the information retrieval community. Whereas these measures denote perfect agreement, we additionally define new measures for sliding agreement that take into account string similarity and the sliding scale of the given conceptual structures, and that compute an inter-annotator accuracy between two annotated document sets. For the Semantic Annotation scenario we distinguish between the ontology O (cf. Definition 2.5.1) and the knowledge base KB generated on top of it (cf. Definition 2.5.3). The ontology acts as the conceptual backbone for generating Semantic Annotations. In our setting we had subjects Sn that generated annotations for a set of documents. The results produced by each of the subjects are defined as knowledge bases.

Definition 8.3.1 (Semantic Annotation Knowledge Base)
The Semantic Annotation Knowledge Base of subject Sn (KBn) is a knowledge base as defined in Definition 2.5.3 (page 28). Recall that a knowledge base is a structure $KB := (C_{KB}, R_{KB}, A_{KB}, I, \iota_C, \iota_R, \iota_A)$. The knowledge base KBn consists of a set of instances In that are uniquely identified by their corresponding set of identifiers id(In). The identifier id(in) of an instance in is a string. Each instance in is assigned to one class cn by the class assignments ACn of subject Sn. The instance-attribute relationships are described by the attribute assignments AAn and the instance-instance relationships by the relation assignments ARn.

Note that in our current setting ACn has been restricted by the annotation tool to be functional, i.e. every instance could only be assigned to one class.

8.4. Evaluation Measures

8.4.1. Perfect Agreement — Agreement Precision & Agreement Recall

Precision and recall are known from their definition on the document level.
Definition 8.4.1 (Precision, Recall)
Precision is the proportion of the retrieved set that is relevant:

$$\mathit{precision} = \frac{|\mathit{relevant} \cap \mathit{retrieved}|}{|\mathit{retrieved}|}$$

Recall is the proportion of all relevant documents in the collection that is included in the retrieved set:

$$\mathit{recall} = \frac{|\mathit{relevant} \cap \mathit{retrieved}|}{|\mathit{relevant}|}$$

We adopted these two measures for computing the degree to which annotators agree on a set of documents with regard to each of the four Semantic Annotation categories. Formally, this agreement is computed from the overlap of the elements of two subjects' Semantic Annotation Knowledge Bases KBn, KBm. The general notions of Agreement Precision (AP) and Agreement Recall (AR) are defined based on a pair of sets Qn, Qm:

Definition 8.4.2 (Agreement-Precision, Agreement-Recall)
Agreement-Precision is defined as:

$$P(Q_n, Q_m) := \frac{|Q_n \cap Q_m|}{|Q_n|}$$

and Agreement-Recall is defined as:

$$R(Q_n, Q_m) := \frac{|Q_n \cap Q_m|}{|Q_m|}$$

Agreement precision and agreement recall are duals of each other, i.e. P(Qn,Qm) = R(Qm,Qn) and R(Qn,Qm) = P(Qm,Qn). Hence, in the following we will only refer to agreement precision, but we will evaluate agreement precision in "both directions". This means that, when cross-evaluating annotation results, we will evaluate how precisely subject Sn agrees with subject Sm and vice versa. Since agreement recall is related to agreement precision in this way, this will also yield all agreement recall numbers. Now, agreement precision for each of the four categories can simply be defined by specifying Qn and Qm:

1. P for Instance Identification: Qn := id(In); Qm := id(Im)

2. P for Class Assignments: Qn := ACn; Qm := ACm

3. P for Attribute Assignments: Qn := AAn; Qm := AAm

4. P for Relation Assignments: Qn := ARn; Qm := ARm

These measures gave us first ideas about the human inter-annotator agreement reachable in our evaluation study using the annotation tool. However, the problem is that they lack a sense for the sliding scale of adequacy prevalent in our hierarchical structures. This became especially obvious when comparing the sets of instance-class relationships AC (cf. Section 8.5). To evaluate the quality of this kind of Semantic Annotation, we also wanted to add some bonus to annotations that almost fitted an annotation in another knowledge base and, then, to compare annotation schemes on this basis.

8.4.2. Sliding Agreements

In this section we introduce the measures we used to compute the sliding agreements for instance identification and for class assignment to instances.

Sliding Agreement for Instance Identification

In order to compare instances on a string level, one needs a method for comparing and classifying the strings that represent the instance identifiers in the ontology. One method for judging the similarity between two strings is the edit distance formulated by Levenshtein [Levenshtein, 1966]. This is a similarity measure based on the minimum number of token insertions, deletions, and substitutions required to transform one string into another, computed using a dynamic programming algorithm. For example, calculating the edit distance between the two instance identifiers RudiStuder and Rudi Studer yields leven(RudiStuder, Rudi Studer) = 1.
In order to compare two Semantic Annotation knowledge bases KBn, KBm on a normalized scale of [0, 1], with 1 for a perfect match and near zero for a bad match according to the Levenshtein measure, we introduce the averaged Identifier Matching Accuracy (IdMA) as follows:

$$\mathit{IdMA}(I_n, I_m) = \frac{1}{|I_n|} \sum_{i_k \in I_n} \mathit{IdMA}(i_k, I_m) \;\in\; [0, 1]$$

$$\mathit{IdMA}(i_k, I_m) = \max_{i_l \in I_m} \frac{1}{1 + \mathit{leven}(\mathit{id}(i_k), \mathit{id}(i_l))}$$

Sliding Agreement for Class Assignments – Relative Inter-Annotator Agreement

Our new evaluation measure should reflect the distance between the annotations of one annotator and the annotations of another annotator. The so-called upwards cotopy [Maedche and Zacharias, 2002] is the underlying measure used to compute the semantic distance in a concept hierarchy:

$$\uparrow\! c := \{\, d \in C \mid c \leq_C d \,\}$$

The semantic characteristics of $\leq_C$ are utilized: the attention is restricted to the superconcepts of a given concept c. Based on the definition of the upwards cotopy (↑c), the concept match accuracy (CMA) is defined:

$$\mathit{CMA}(c_n, c_m) := \frac{|{\uparrow} c_n \,\cap\, {\uparrow} c_m|}{|{\uparrow} c_n \,\cup\, {\uparrow} c_m|}$$

Based on the concept matching accuracy defined above we introduce the instance matching accuracy IMA. Given two instance-class relationships (in, cn) and (im, cm), IMA is calculated as:

$$\mathit{IMA}((i_n, c_n), (i_m, c_m)) := \mathit{CMA}(c_n, c_m) \;\in\; [0, 1]$$

where in, im ∈ I and cn, cm ∈ C. The relative inter-annotator agreement RIAA is based on a weighted IMA. RIAA is the averaged accuracy with which the instance–class annotations of an annotator match their best counterparts contained in another Semantic Annotation knowledge base:

$$\mathit{RIAA}(AC_n, AC_m) = \frac{1}{|AC_n|} \sum_{(i_n, c_n) \in AC_n} \mathit{RIAA}((i_n, c_n), AC_m)$$

$$\mathit{RIAA}((i_n, c_n), AC_m) = \max_{(i_m, c_m) \in AC_m} \mathit{IMA}((i_n, c_n), (i_m, c_m))$$

8.5. Example of an Evaluation

Figure 8.1 depicts an example scenario. In the upper part of the figure, parts of the SWRC ontology are depicted as the conceptual backbone for Semantic Annotation. In the left part of the figure we see some annotations made by Annotator 1, in the right part some made by Annotator 2. They have produced two different Semantic Annotation knowledge bases, KB1 and KB2, based on the SWRC ontology and the given example Web page.

Figure 8.1.: Example Evaluation (SWRC ontology and the two Semantic Annotation knowledge bases SAKB1 and SAKB2 produced by Annotator 1 and Annotator 2 for an example Web page).

In the example scenario we can see that the instance identifiers AIFB, SteffenStaab and PAKM2000 match directly. This results in an agreement-precision for instance identifiers computed as P(I1, I2) = 3/6 = 0.5. We also see that the instance identifier Alexander Maedche and the instance identifier AlexanderMaedche are only similar. Their similarity computes to 0.5, leading to an overall IdMA(I1, I2) of 3.5/6 = 0.58. The sliding measure IdMA reflects the fact that there are identifiers that match nearly perfectly. Looking at the instance-class relationships delivers much worse results. One counts only one directly matching instance-class relationship, namely instance-of(AIFB, Institute). So we get a P(AC1, AC2) of 1/6 = 0.17. The sliding agreement for instance-class relationships recognizes that there are more near hits, namely instance-of(SteffenStaab, Employee) with instance-of(SteffenStaab, AssistantProfessor) and instance-of(PAKM2000, Conference) with instance-of(PAKM2000, Event). According to the measure defined above, we calculate a RIAA of 0.34.
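To see where such a value comes from, consider the near hit instance-of(SteffenStaab, Employee) versus instance-of(SteffenStaab, AssistantProfessor). Assume, for illustration only, that in the SWRC concept hierarchy AssistantProfessor is a direct subconcept of Employee, which in turn is a subconcept of Person (the actual SWRC hierarchy may differ, so the exact number is illustrative). Then:

$$\uparrow\! \mathit{Employee} = \{\mathit{Employee}, \mathit{Person}\}, \qquad \uparrow\! \mathit{AssistantProfessor} = \{\mathit{AssistantProfessor}, \mathit{Employee}, \mathit{Person}\}$$

$$\mathit{CMA}(\mathit{Employee}, \mathit{AssistantProfessor}) = \frac{|\{\mathit{Employee}, \mathit{Person}\}|}{|\{\mathit{AssistantProfessor}, \mathit{Employee}, \mathit{Person}\}|} = \frac{2}{3} \approx 0.67$$

While the strict agreement-precision counts this pair as a complete miss, it contributes such a partial score to the averaged RIAA.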
Additionally, we count 0 matching instance-attribute relationships and 0 matching instance-instance relationships, i.e. we obtain a P(AA1, AA2) of 0 and a P(AR1, AR2) of 0, respectively.

8.6. Cross-Evaluation Results

As already mentioned, our evaluation is based on the following input parameters: We selected 15 Web documents describing actual persons, events, research projects and organizations from our institute. The ontology given to the subjects was the SWRC vocabulary in its current version 2000-10-09, containing 55 classes and 157 attributes and relations. The annotations have been stored in RDF on the Web pages. We extracted the annotations from these Web pages as ground facts. Our cross-evaluation scenario can be divided into three parts. First, we present some basic statistics we calculated from the Semantic Annotation knowledge bases. Subsequently, the measures computed using agreement-precision and agreement-recall are explained and interpreted. Additionally, we use IdMA and RIAA to compute the sliding agreement.

8.6.1. Basic statistics

Table 8.1 shows the basic statistics we obtained in the two phases by counting each Semantic Annotation category of the Semantic Annotation knowledge bases. KB0 is the Semantic Annotation knowledge base that has been generated by an expert annotator. It will serve as the gold standard in our evaluation framework. One may see that the instance identifiers with their corresponding instance–class relationships have an average of 106 elements, with a low standard deviation of 25 elements. The standard deviation of the instance–attribute and instance–instance relationships is higher, at approx. 50 elements. Some of our students (KB5, KB8) have outperformed the gold standard with respect to the basic statistics.

Subject              |In| and |ACn|   |AAn|   |ARn|
KB0 (Gold)           123              230     197
KB1 (anso)           110              203     108
KB2 (eryi)           115              162     132
KB3 (hela)           81               157     17
KB4 (makr)           71               121     32
KB5 (mama)           152              292     175
KB6 (mari)           97               150     68
KB7 (midu)           80               137     60
KB8 (stse)           126              226     87
KB9 (taso)           104              173     129
mean                 106              186     101
standard deviation   25               53      59

Table 8.1.: Basic statistics computed for the generated Semantic Annotation knowledge bases

In the following we will see what agreement measures are computed based on these 10 given Semantic Annotation knowledge bases.

8.6.2. Perfect Agreement: Agreement-Precision & Agreement-Recall

Tables 8.2, 8.3, 8.4 and 8.5 list all measures of perfect agreement that we computed in our evaluation study. The highest value for agreement-precision of instance identification (Table 8.2) we obtained was 0.8, by comparing the Semantic Annotation knowledge bases of subject 7 with subject 2.
This high value could be obtained due to the tool's strategy for generating and proposing instance identifiers to the users. Agreement-precision of instance–class relationships (Table 8.3) scores much worse; the highest value, namely 0.46, was reached by comparing subject 4 with subject 2. Analyzing the agreement-precision of instance–attribute relationships (Table 8.4) resulted in a maximum value of 0.6, obtained by comparing subject 2 with subject 5. The best value for the agreement-precision of instance–instance relationships (Table 8.5) was 0.29, achieved by comparing subject 3 with subject 2. Figure 8.2 shows agreement-recall vs. agreement-precision diagrams for matching instance identifiers (id(i)), matching instance–class relationships (AC), matching instance–attribute relationships (AA) and matching instance–instance relationships (AR). Each point in the diagrams represents one comparison. We can see from the diagrams that the values obtained for instance identifier agreements range between 0.21 and 0.8. The comparison results for instance–class relationships range between 0.1 and 0.46. Instance–attribute relationships score between 0.09 and 0.6. Instance–instance relationships score very badly, ranging between 0 and 0.29 with an average of 0.07.

Subj.  0     1     2     3     4     5     6     7     8     9
0      1     0.44  0.49  0.21  0.24  0.48  0.24  0.25  0.32  0.41
1      0.49  1     0.73  0.46  0.37  0.68  0.43  0.57  0.45  0.71
2      0.52  0.7   1     0.43  0.49  0.78  0.46  0.56  0.5   0.66
3      0.32  0.63  0.6   1     0.33  0.51  0.44  0.42  0.38  0.58
4      0.42  0.58  0.79  0.38  1     0.73  0.54  0.51  0.48  0.59
5      0.39  0.49  0.59  0.27  0.34  1     0.32  0.39  0.44  0.49
6      0.31  0.48  0.55  0.37  0.39  0.49  1     0.39  0.36  0.46
7      0.39  0.79  0.8   0.43  0.45  0.75  0.48  1     0.46  0.76
8      0.31  0.39  0.45  0.25  0.27  0.53  0.28  0.29  1     0.36
9      0.49  0.75  0.73  0.45  0.4   0.72  0.43  0.59  0.43  1

Table 8.2.: Perfect Agreement with P computed for id(I)

8.6.3. Sliding Agreement

We also computed the sliding agreement measures defined above. Table 8.6 contains IdMA(In, Im) and Table 8.7 contains RIAA(ACn, ACm), each computed for two Semantic Annotation knowledge bases. The values obtained for the identifier matching accuracy did not greatly outperform the values we obtained by computing agreement-precision for identifiers. The average value improved only from 0.49 to 0.54. The largest value we obtained was 0.82, by comparing the knowledge bases of subject 4 with subject 2. The values obtained for the relative inter-annotator agreement scored clearly better than the corresponding agreement-precision values for instance–class relationships. Here, the average improved from 0.26 to 0.37. The best value of 0.65 was reached by the comparison of the Semantic Annotation knowledge bases generated by subject 7 and subject 2.

Subj.  0     1     2     3     4     5     6     7     8     9
0      1     0.37  0.37  0.2   0.17  0.22  0.16  0.16  0.14  0.18
1      0.41  1     0.45  0.28  0.19  0.34  0.25  0.29  0.15  0.32
2      0.39  0.43  1     0.27  0.29  0.28  0.23  0.31  0.2   0.26
3      0.3   0.38  0.38  1     0.23  0.19  0.27  0.25  0.16  0.25
4      0.3   0.3   0.46  0.27  1     0.42  0.31  0.25  0.25  0.42
5      0.18  0.24  0.21  0.1   0.2   1     0.15  0.19  0.14  0.22
6      0.21  0.28  0.28  0.23  0.23  0.24  1     0.29  0.15  0.25
7      0.25  0.4   0.45  0.25  0.23  0.36  0.35  1     0.15  0.31
8      0.13  0.13  0.18  0.1   0.14  0.17  0.12  0.1   1     0.16
9      0.21  0.34  0.29  0.19  0.29  0.33  0.23  0.24  0.19  1

Table 8.3.: Perfect Agreement with P computed for AC

Figure 8.3 depicts the obtained results graphically. On the left side of Figure 8.3, the resulting comparison values for the identifier matching accuracy are shown.
IdMA ranges between 0.32 and 0.82 with an average of 0.54. As shown in the right part of Figure 8.3, the results obtained for computing RIAA range between 0.19 and 0.65 with an average of 0.37.

8.6.4. Overall results

Due to circumstances in our setting, like the lack of domain knowledge and no prior experience with the ontologies or with the tool, we believe that this result ranks among the baseline worst cases that will be found in typical Semantic Annotation settings. Our conjecture is that further training may considerably improve inter-annotator agreement — though we do not expect any numbers for agreement-precision and agreement-recall that range in the vicinity of 100%. Our evaluation case study comes with several limitations that became obvious during the annotation experiments. Firstly, the sequence of Web pages was given. This may lead to some unwanted similarities of perception. Secondly, the subjects were non-experts, but a large part of the domain was about common knowledge. This fact may lead to better results than in more specific domains without common knowledge.

Subj.  0     1     2     3     4     5     6     7     8     9
0      1     0.18  0.19  0.12  0.09  0.2   0.11  0.11  0.17  0.18
1      0.21  1     0.46  0.33  0.25  0.5   0.31  0.37  0.33  0.5
2      0.27  0.57  1     0.41  0.41  0.6   0.44  0.49  0.43  0.54
3      0.17  0.43  0.42  1     0.26  0.39  0.34  0.25  0.29  0.38
4      0.17  0.42  0.55  0.34  1     0.5   0.42  0.4   0.32  0.48
5      0.15  0.35  0.33  0.21  0.21  1     0.22  0.27  0.27  0.32
6      0.17  0.42  0.48  0.35  0.34  0.42  1     0.33  0.36  0.39
7      0.19  0.55  0.58  0.29  0.35  0.57  0.36  1     0.38  0.5
8      0.17  0.29  0.31  0.2   0.17  0.35  0.24  0.23  1     0.23
9      0.24  0.58  0.51  0.34  0.34  0.53  0.34  0.4   0.3   1

Table 8.4.: Perfect Agreement with P computed for AA

8.7. Conclusion

This section presents an evaluation of the creation of metadata by annotating Web pages. Starting from our ontology-based annotation environment, we have collected experiences in an actual evaluation study. The results provide a baseline that one may consider for further research on automatic annotation tools. The evaluation study we have described was performed using several standard and two original measures. The latter take into account a notion of sliding agreement between metadata – exploiting the semantic background knowledge provided by the ontology. Future work can start from current studies that have looked at the feasibility of automatically building knowledge bases from the Web. In our future work, we want to integrate such methods into an even more comprehensive annotation environment – including, e.g., the learning of ontologies from Web documents and (semi-)automatic ontology-based Semantic Annotation. The general task of knowledge maintenance, including evolving ontologies and Semantic Annotation knowledge bases, remains a topic for much further research in the near future.

Subj.  0     1     2     3     4     5     6     7     8     9
0      1     0     0.05  0     0     0.05  0.01  0     0     0
1      0     1     0.13  0.03  0.01  0.21  0.09  0.09  0.02  0.23
2      0.07  0.11  1     0.04  0.02  0.22  0.08  0.11  0.02  0.09
3      0     0.18  0.29  1     0     0     0     0     0     0.12
4      0     0.03  0.06  0     1     0.03  0.06  0.06  0.19  0.28
5      0.06  0.13  0.17  0     0.01  1     0.07  0.07  0.02  0.11
6      0.01  0.15  0.15  0     0.03  0.18  1     0.06  0.03  0.13
7      0     0.17  0.25  0     0.03  0.22  0.07  1     0.05  0.05
8      0     0.02  0.02  0     0.07  0.05  0.02  0.03  1     0.02
9      0     0.19  0.09  0.02  0.07  0.16  0.07  0.02  0.02  1

Table 8.5.: Perfect Agreement with P computed for AR

Subj.  0     1     2     3     4     5     6     7     8     9
0      1     0.52  0.58  0.32  0.34  0.57  0.34  0.35  0.44  0.5
1      0.54  1     0.76  0.51  0.43  0.71  0.48  0.62  0.5   0.74
2      0.58  0.73  1     0.48  0.53  0.8   0.52  0.6   0.55  0.7
3      0.4   0.66  0.64  1     0.39  0.55  0.5   0.47  0.44  0.61
4      0.5   0.63  0.82  0.44  1     0.77  0.58  0.56  0.54  0.64
5      0.45  0.54  0.62  0.33  0.39  1     0.37  0.45  0.49  0.54
6      0.39  0.54  0.6   0.44  0.45  0.55  1     0.45  0.43  0.52
7      0.46  0.82  0.82  0.47  0.5   0.78  0.52  1     0.52  0.79
8      0.4   0.45  0.52  0.32  0.35  0.59  0.35  0.37  1     0.43
9      0.55  0.78  0.76  0.5   0.46  0.75  0.49  0.63  0.49  1

Table 8.6.: Sliding Agreement Measures: IdMA

Figure 8.2.: Perfect Agreement

Subj.  0     1     2     3     4     5     6     7     8     9
0      1     0.41  0.42  0.2   0.19  0.34  0.21  0.2   0.24  0.31
1      0.46  1     0.59  0.36  0.27  0.49  0.34  0.44  0.29  0.51
2      0.45  0.56  1     0.35  0.38  0.53  0.36  0.45  0.37  0.46
3      0.31  0.49  0.5   1     0.28  0.31  0.35  0.33  0.28  0.41
4      0.34  0.42  0.61  0.32  1     0.57  0.4   0.35  0.39  0.52
5      0.28  0.35  0.4   0.17  0.26  1     0.22  0.27  0.31  0.35
6      0.26  0.39  0.43  0.29  0.3   0.35  1     0.33  0.26  0.35
7      0.3   0.6   0.65  0.34  0.31  0.52  0.4   1     0.32  0.51
8      0.24  0.26  0.34  0.18  0.22  0.37  0.2   0.2   1     0.26
9      0.36  0.54  0.51  0.32  0.35  0.52  0.32  0.4   0.31  1

Table 8.7.: Sliding Agreement Measures: RIAA

Figure 8.3.: Sliding Agreement

9. Evaluation of Semi-Automatic Annotation

This Chapter investigates the approach of semi-automatic annotation. Based on the evaluation study for manual annotation introduced in the previous Chapter, we adapt the measures to evaluate the similarity between the annotations produced by different users applying the process of manual annotation in comparison to applying the semi-automatic annotation process. The evaluation confirms that semi-automatic annotation is faster and produces more homogeneous metadata than manual annotation. The chapter describes the principal methods of evaluation (Section 9.2), presents the results (Section 9.3) and closes with a discussion (Section 9.4).

9.1. Introduction

In Section 5.1 we presented S-CREAM, an extension of CREAM that integrates a learnable information extraction component. S-CREAM allows semi-automatic annotation of documents. After a training phase, S-CREAM processes a document and identifies instances of concepts, of attributes, and of relations. These instances are proposed to the user for annotation. Subsequently, the user supplements the annotation manually with the help of the tool. Our assumption is that the S-CREAM approach reduces the time and the workload for the annotator in comparison to purely manual annotation. In order to test this, we present in this chapter an evaluation of the semi-automatic annotation approach. Based on the evaluation study for manual annotation introduced in Chapter 8, we adapt the metrics to our extended setting here. Hence, we evaluate the similarity between the annotations produced by different users applying the process of manual annotation in comparison to applying the semi-automatic annotation process. We conducted an evaluation with 16 participants. The participants were divided into two groups: group A and group B. Group A undertook first a manual annotation and then a semi-automatic annotation. The sequence of annotation was reversed for group B. The results of the evaluation are promising. The time for annotation decreases for group A by about 33% and the inter-annotator agreement increases for both groups by between 14.57% and 96.33%.
Hence, the evaluation showed that our hypothesis was correct and that semi-automatic annotation is faster and produces more homogeneous metadata than manual annotation. The structure of this chapter is as follows: Section 9.2 describes the principal method of the evaluation. In Section 9.3 we present the results of our evaluation. Subsequently, we discuss the results and conclude.

9.2. Method

9.2.1. General Setting

The general rule for random samples is: the smaller the sample, the less information it gives about the underlying population. In general, one should not consider fewer than 10 to 15 human subjects (cf. [Clauß and Ebner, 1995]). Therefore we chose 16 human subjects for the evaluation. We decided on an even number in order to divide them into two equal groups. The subjects were students without any prior knowledge of ontologies and Semantic Annotation. The participants were between 21 and 31 years of age; eight of them were female. Each participant annotated ten Web pages, five using manual and five using semi-automatic annotation. For the purpose of training the subjects we chose two additional pages. The twelve pages were randomly chosen from the hotel Web site1, considering the following criteria:

• To ensure variety for the training of the information extraction system, each city in which a hotel is located occurs no more than once.

• To reduce the variance between the document types, we only consider Web pages about hotels; no boarding houses, summer cottages, or villas.

• The Web page should have information about the hotel room prices. The Web page should not be too short or too simple and should not consist only of pure text. Hotel room prices are usually presented in a structured form, e.g. a table. A mixture of text and structural information is typical for documents on the Web and should therefore be present in the sample.

• Finally, we ensured that the annotation tool is capable of a proper and fast rendering of the chosen Web page.

1http://www.all-in-all.de

9.2.2. The Domain Ontology

We chose the GETESS ontology as a basis for the evaluation. GETESS is an extensive tourism ontology with 1042 concepts, 162 relations and an average depth of 5.7. The ontology was developed at the Institute AIFB for the GETESS project (German Text Exploitation and Search System). Based on this we developed a smaller, simpler and easier to use ontology for the experiment. We applied the following steps: i) clarification of the goals and ii) development of the ontology.

Clarification of the Goals. Noy and McGuinness [Noy and McGuinness, 2001] suggest starting the development of an ontology by defining its domain and scope. That is, several basic questions need to be answered:

1. What are we going to use the ontology for? The ontology should be used in the evaluation by students with very sparse knowledge about annotation. The ontology should be small and simple to ensure a good overview. On the other hand, it should contain all concepts and relations necessary to ensure a reasonable annotation of the Web pages. An annotation expert pruned the ontology after a sample annotation of some pages. He reduced the ontology to about 1/5 of its size, i.e. about 200 concepts with an average depth of 3.5.

2. What is the domain that the ontology will cover? We chose hotel Web pages, in particular Web pages with information about the hotel, rooms, configuration, price, address, and the features of the hotel.
3. For what types of questions should the information in the ontology provide answers? In our case useful questions are:

• In which hotel can I stay overnight for less than 30 Euro?
• Which hotels are in the city of Dresden?
• How many beds are in the hotel “Seelust” in Goldberg?
• Which hotels in Dresden have a conference room for 50 persons?

Development of the Ontology. The goal is to reuse the GETESS ontology as much as possible. The approach is to simplify the ontology: deleting unnecessary concepts, adding missing concepts and relations, and reorganizing the hierarchy of the ontology.

1. Simplifying the ontology: We reorganized the ontology in such a way that each concept belongs to only one superconcept, because single inheritance is easier to understand for the inexperienced annotator. Furthermore, we looked for classes that have only one direct subclass, which occurs often in the GETESS ontology and is an indicator of a modeling problem or an incomplete ontology. Each such single subclass was moved to the same hierarchy level as its former superclass, and unnecessary subclasses were deleted. Concepts and relations not necessary for the annotation task were deleted as well.

2. Adding new concepts and relations: The GETESS ontology is a general tourism ontology. It considers virtually all essential concepts. However, some relationships for modeling the information about a hotel in a proper way were missing. It was possible to replace some deleted concepts by relations. For example, the subconcepts “cable TV” and “color TV” of the class “TV” were deleted, but the concept “TV” received the boolean attributes “is color TV” and “is cable TV”. Thus, we were able to reduce the number of concepts and the depth of the ontology.

3. Testing the ontology: The creation of an ontology is an iterative process. To support this process, annotation is a very suitable means to test the ontology, i.e. to see whether concepts or relationships are missing from the ontology.

The pruned GETESS ontology used for the evaluation, called the GETESS EVS ontology, consists of 187 concepts, 150 relations and an average depth of 2.83.

9.2.3. Training of the Annotation

As pointed out in Sections 5.1 and 5.1.3, there is the problem of the input and output of Amilcare, the information extraction component: the S-CREAM framework is ontology-based, whereas Amilcare expects flat tags. To ease the task of mapping from the flat tags to the ontological concepts and attributes, we used a labeling scheme for the tags to train Amilcare. The tags for instances start with the prefix “i_” and the tags for attributes with the prefix “a_”. This helps S-CREAM to distinguish between instances and attributes. For example, <i_hotel> and <i_lounge> lead to the creation of a new instance of Hotel and a new instance of Lounge. The tag <a_hotel_name> stands for the attribute name of an instance of the concept Hotel, and <a_address_city> is the value of the attribute city for an instance of Address. S-CREAM uses a rule-based discourse model (cf. Section 5.1.4) to create the instances, attributes, and relationships. For example, the discourse model states that an instance of Room following an instance of Hotel is connected by the relationship “has room” to the Hotel instance.

The training corpus, i.e. the Web pages for the training of Amilcare, was annotated after the development of the ontology according to the above tag labeling scheme.
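To make the tag-to-ontology mapping and the discourse rule concrete, the following minimal sketch shows how flat extraction tags could be turned into instances, attributes and relations. It is an illustration only, not the S-CREAM implementation: the tag names and the single discourse rule are taken from the examples above, while all function and variable names are hypothetical.

    # Illustrative sketch only -- not the S-CREAM implementation.
    def map_flat_tags(tagged_spans):
        """tagged_spans: list of (tag, text) pairs in document order."""
        instances = []    # (concept, {attribute: value})
        relations = []    # (subject index, relation, object index)
        last_seen = {}    # concept -> index of its most recent instance

        for tag, text in tagged_spans:
            if tag.startswith("i_"):          # instance tag, e.g. "i_hotel"
                concept = tag[2:]
                instances.append((concept, {}))
                last_seen[concept] = len(instances) - 1
                # Discourse rule from the text: a Room following a Hotel
                # is connected to that Hotel via "has room".
                if concept == "room" and "hotel" in last_seen:
                    relations.append((last_seen["hotel"], "has room",
                                      last_seen["room"]))
            elif tag.startswith("a_"):        # attribute tag, e.g. "a_hotel_name"
                concept, attribute = tag[2:].split("_", 1)
                if concept in last_seen:
                    instances[last_seen[concept]][1][attribute] = text
        return instances, relations

    spans = [("i_hotel", "Hotel Seelust"), ("a_hotel_name", "Seelust"),
             ("i_room", "Single room"), ("a_room_price", "30 Euro")]
    print(map_flat_tags(spans))

In the actual framework the discourse model is rule-based and considerably richer; the point of the sketch is only the division of labor between the “i_”/“a_” prefixes and the subsequent relation inference.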
For the training we used eleven Web pages. The selection of these samples followed the same rationale as the selection for the user annotation. The training corpus, the test corpus and the evaluation corpus are disjoint.

9.2.4. Experimental Design

The goal is to observe whether the annotation process can be improved and accelerated by the application of information extraction. Hence, we need two random samples to compare. One possibility is that two independent groups work with the tool in the two annotation modes. In such a case, however, one may observe larger differences that relate to differences between the annotators rather than between the annotation modes [Siegel, 2001]. The human factor is very important for the annotation. Because it is nearly impossible to keep the motivation and the intelligence of two persons on the same level, it is advisable to use the same people for both test runs of the annotation modes. To compensate for a possible learning effect, the whole group is randomly divided into two groups. The first group, called group A, has eight members and annotated first in manual mode and later in semi-automatic mode. The second group, called group B, also has eight members and uses the modes in the reverse order. This is called “two dependent samples” according to [Siegel, 2001; Clauß and Ebner, 1995].

In each annotation cycle five pages are annotated, i.e. each participant annotates five pages manually and five pages semi-automatically. For each annotation mode there is a list of Web pages (for manual annotation: http://annotation.semanticweb.org/evaluation/july2002/Cream.html; for semi-automatic annotation: http://annotation.semanticweb.org/evaluation/july2002/Scream.html). Each list contains six pages, because the first page is for the training of the annotators. The annotation of the first page is not considered in the evaluation.

9.2.5. Application of a statistical test

The formulation of the null hypothesis is the first step in the decision process, and it is usually formulated to be rejected. The null hypothesis states that there is no difference between the samples. If the null hypothesis is rejected, then the alternative hypothesis H1 can be accepted. In our case H1 states: semi-automatic annotation using information extraction is faster and more homogeneous than manual annotation. In contrast, H0 states: there is no difference between manual and semi-automatic annotation. In our evaluation we compare two dependent samples, i.e. they have been created by the same persons at different points in time in different annotation modes.

9.2.6. Test procedure

Before doing the actual annotations, subjects received 30 minutes of training. They were given a presentation about annotation and metadata. Group B, starting with semi-automatic annotation, was also given an introduction to information extraction in the first annotation session. Group A was given this introduction in the second annotation session. In the test phase the groups learned to work with the annotation tool and to use the support of the trained information extraction component. The subjects annotated one hotel Web page as a warm-up task. This took about 20 to 30 minutes. The goals were to give them a better understanding of the annotation task, to give them the opportunity to ask questions, and to let them become familiar with the ontology. After the warm-up each subject annotated five Web pages.
Each subject was given the task of annotating only facts about the hotel, address, rooms, room inventory and hotel facilities. In principle, annotation is not a restricted task, and annotating everything that fits the ontology is not a negative outcome. However, we restricted the annotation task to ensure that the annotators accomplished the task on time (the complete results of the annotation can be found at http://annotation.semanticweb.org/evaluation/july2002/Evaluation.html). The tool recorded the annotation time for each Web page. To obtain a correct time measurement, we asked the subjects not to pause on a hotel page, but only on the overview page. After the evaluation the annotators filled out a questionnaire, in which they could describe their impressions of the tool, problems, suggestions for improvement and additional remarks.

9.3. Results

In this section we describe the results of the evaluation. As mentioned before, the 16 human subjects were divided into group A and group B. An annotation expert did the same annotation task to provide a gold standard for the evaluation. Both groups are compared to the gold standard.

The results of the evaluation consist of the following parts: i) a time measurement, ii) basic statistics about the annotation, iii) the calculation of the perfect inter-annotator agreement and the sliding agreement, and finally iv) the statistical test to verify the significance. The members of the two groups worked under different conditions. Therefore the presentation of the results is subdivided by group to provide better insights.

9.3.1. Time Measurement

The annotation tool recorded, for both annotation modes, the annotation time for each Web page in the test sample. The times were added up for each participant to a total for all five Web pages. A grouping of the values is shown in Figure 9.1. The values are presented in minutes. The mean values are shown in Table 9.1. The exact values for each participant can be found in the Appendix.

Figure 9.1.: Mean values divided into annotation modes and groups.

• For group A, the annotation time is reduced by the application of semi-automatic annotation by about 33%, from 122.94 minutes to 82.54 minutes. We have two interpretations for this:

1. Semi-automatic annotation does indeed make the annotation process faster.

2. Usually the first pages are annotated more slowly because the inexperienced human subject is still in the learning phase.

• For group B, there is nearly no difference: 148.47 minutes for manual annotation and 148.40 minutes for semi-automatic annotation. On initial assessment, this is not what we had expected. However, we interpret it as the learning phase of the inexperienced users being compensated by a possible time gain from the semi-automatic annotation.

• Apparently, group A was faster than group B. Also, group B shows a greater variance of results. The question whether group A was more gifted or whether group B was more hard-working on the annotation can be answered with the help of the basic statistics relating to the different groups (Tables 9.3 and 9.4). However, these findings are of small impact for the evaluation because of the dependent samples.
We are only interested in the difference between the annotation modes.

Figure 9.2 shows an overall comparison of the annotation modes. It shows clearly that the annotation process is faster in the semi-automatic mode. Because the complete data set is used, all learning effects and idiosyncrasies of the human subjects have been balanced out. The average time of the manual annotation mode is 135.70 minutes and that of the semi-automatic mode is 115.47 minutes, which results in an average improvement of 15%. When the gold standard is included, the average time likewise improves by about 15%, from 116.51 to 99.29 minutes.

Figure 9.2.: Mean values grouped by annotation modes

                                      manual   semi     mean
group A                               122.94   82.54    102.74
group B                               148.47   148.40   148.44
gold standard                         58.93    50.75    54.84
mean value (without gold standard)    135.70   115.47   125.59
mean value (with gold standard)       116.51   99.29    107.90

Table 9.1.: Average annotation times.

The difference between the two groups and the gold standard is quite considerable: on average the gold standard is 2.3 times faster than the other participants. This is because the gold standard annotator had developed the ontology and was familiar with the Web pages. In addition, he followed the annotation instructions more strictly; other participants annotated more information on the Web pages than was necessary. The disparity is shown by the basic statistics.

It is difficult to specify a general duration for the annotation of a Web page. There are a number of factors to be considered: the size and structure of the Web pages, the domain, and the complexity of the ontology. However, in the assigned evaluation task, an annotation expert (the gold standard) needed about 60 minutes for the manual annotation of five Web pages and achieved a time saving of 17% with semi-automatic annotation. A layman needed, after a short introduction, on average 135 minutes and achieved a time saving of 15%.

1. variable   2. variable   t      tα;n−1   PR     n    significance
manual        semi          2.93   1.74     0.01   17   yes
A, manual     A, semi       8.66   1.86     0      9    yes
B, manual     B, semi       0.11   1.86     0.91        no

Table 9.2.: Results of the t-test for the annotation time.

We use the t-test to prove the significance of the evaluation. Table 9.2 gives a summary of all tests for the annotation time. The symbols used have the following meaning:

• Variable 1 and variable 2 are compared by the test.

• The test value is t; it is compared with the tabular value of tα;n−1.

• The PR value is the probability that the measured pair difference occurred by chance. For example, a PR of 0.01 means a significance of 99%.

• The number of valid pairs is n.

• The variable name consists of the group and the annotation mode. The first row of the table gives the summary over both groups.

The criterion is: if |t| < tα;n−1, then H0 is accepted; if |t| ≥ tα;n−1, then H0 is rejected. The results of the test confirm our assumption, i.e. the results of group A and the total results are significant. With respect to the annotation time we can accept our hypothesis, viz. semi-automatic annotation is faster than manual annotation. In the following we will consider the homogeneity of the results.
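The decision rule can be illustrated with a short computational sketch. The per-subject totals below are invented (the actual per-participant values are listed in the Appendix); the sketch only shows how a paired, one-sided t-test at α = 0.05 with n−1 degrees of freedom reproduces the criterion above, here using SciPy. For n = 9 it yields the tabular value of about 1.86 from Table 9.2.

    # Illustration with hypothetical per-subject totals in minutes.
    from scipy import stats

    manual = [130, 118, 125, 140, 110, 122, 119, 128, 59]   # 8 subjects + gold standard
    semi   = [ 85,  80,  90,  78,  82,  86,  79,  81, 51]

    t_stat, _ = stats.ttest_rel(manual, semi)    # paired test for dependent samples
    alpha, n = 0.05, len(manual)
    t_crit = stats.t.ppf(1 - alpha, df=n - 1)    # one-sided t_{alpha;n-1}, ~1.86 for n = 9

    # Criterion from the text: accept H0 if |t| < t_crit, reject it otherwise.
    print("reject H0" if abs(t_stat) >= t_crit else "accept H0")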
9.3.2. Results of Group A

Basic Statistic. Table 9.3 shows the basic statistics of group A for both annotation modes. The user with number 0 represents the gold standard. The results are comparable, taking into account that the same user did not annotate the same Web pages in both annotation modes – because he would already know the annotations – but rather similar pages. For manual annotation, the average number of instances is 54.44 with a standard deviation of 8.33. For semi-automatic annotation it is 50.33 instances with a standard deviation of 3.84. Some annotators produced more instances than the gold standard. User 7A did not create any instance-instance relationships in manual mode, and created only two in semi-automatic mode.

                     manual                      semi
user       |In|,|ACn|  |AAn|  |ARn|    |In|,|ACn|  |AAn|  |ARn|
0              55        62     40         47        48     33
1A             54        29     42         56        36     48
2A             42        69     15         54        38     38
3A             48        70     30         45        66     15
4A             68        22     27         51        22     35
5A             60        89     32         54        34     34
6A             48        52     36         50        56     42
7A             51        89      0         46        46      2
8A             64        59     47         50        49     51
total         490       541    269        453       395    298
mean        54.44     60.11  29.88      50.33     43.88  33.11
std. dev.    8.33     23.28  14.6        3.84     13.02  15.59

Table 9.3.: Basic statistics for group A, manual and semi-automatic annotation.

Perfect-Interannotator Agreement. Figure 9.3 shows the results for the instance identification of group A.

Figure 9.3.: Instance identification of group A

• The values are rather low for manual annotation; most of them are below 0.4. The variance is rather high, with values between 0.08 and 0.52.

• Semi-automatic annotation yields clearly higher results; most of the values are above 0.4. The variance is lower, with values between 0.31 and 0.57.

The graphical results for the instance-concept relationships of group A are shown in Figure 9.4.

Figure 9.4.: Instance-concept relationship of group A

• The values for manual annotation are rather high with a median variance. The values are between 0.33 and 0.84.

• The values for semi-automatic annotation are also high, and the variance is low. The minimum and maximum are 0.5 and 0.85.

The results for the instance-attribute relationship are shown in Figure 9.5.

Figure 9.5.: Instance-attribute relationship of group A

• The values for manual annotation are low, ranging from 0 to 0.48, and they have a slight variance.

• The values for semi-automatic annotation are higher, ranging from 0.03 to 0.8 with a high variance.

The results for the instance-instance relationship are shown in Figure 9.6.

Figure 9.6.: Instance-instance relationship of group A

• The values for manual annotation are rather low; a number of data points are below the 0.4 line, with a high variance. The values range between 0 and 0.7.

• The values for semi-automatic annotation are mostly over 0.4, ranging between 0 and 0.73. The variance is also high.

Sliding Agreement. Figure 9.7 shows the IdMA of group A.

Figure 9.7.: Sliding agreement: IdMA of group A

• The values for manual annotation have the higher variance and are lower, ranging between 0.08 and 0.64; most of them are below 0.4.

• The values for semi-automatic annotation are higher, with a slight variance. The values are between 0.32 and 0.58.
Nearly all of them are around 0.4.

Figure 9.8 shows the RIAA of group A:

Figure 9.8.: Sliding agreement: RIAA of group A

• Similar to the IdMA values, manual annotation has the higher variance and lower RIAA values, ranging from 0.07 to 0.64. Most of the values are below 0.4.

• Semi-automatic annotation has the higher values, from 0.32 to 0.57, and the lower variance. Most of the points are above 0.4.

In conclusion, the values for the agreement recall are for the most part higher when working in semi-automatic annotation mode. Also, the variance is lower and the annotation therefore more homogeneous in the case of the instance classification, the instance-concept relationships and the sliding agreement.

9.3.3. Results of Group B

Basic Statistic. Table 9.4 shows the basic statistics for group B for both annotation modes. A comparison with Table 9.3 shows that group B systematically produced more annotations than group A. The average number of instances for manual annotation is 74.33 with a standard deviation of 14.88. For semi-automatic annotation it is 60.88 instances with a standard deviation of 11.12. Nearly all annotators surpass the gold standard with regard to the number of instances.

                     manual                      semi
user       |In|,|ACn|  |AAn|  |ARn|    |In|,|ACn|  |AAn|  |ARn|
0              55        62     40         47        48     33
1B             60        44     25         61        52     50
2B             88        73     51         53        37     40
3B             74        71     46         63        61     47
4B             69        67     51         62        61     40
5B             63        56     33         45        44     17
6B             98       116     42         79        87     55
7B             71        82     52         66        75     49
8B             91        79     65         72        78     48
total         669       650    405        548       543    379
mean        74.33     72.22     45      60.88     60.33  42.11
std. dev.   14.88     20.16  11.72      11.12     16.85  11.49

Table 9.4.: Basic statistics for group B

Perfect-Interannotator Agreement. Figure 9.9 shows the results for the instance identification.

• The manual annotation is quite homogeneous, with values between 0.15 and 0.51.

• Semi-automatic annotation yields higher results, with a higher variance. The values range from 0.18 to 0.66.

The graphical results for the instance-concept relationships are shown in Figure 9.10.

• The values for manual annotation are rather high with a median variance. The values are between 0.41 and 0.87.

• The values for semi-automatic annotation are higher; the minimum and maximum are 0.37 and 0.89.

The results for the instance-attribute relationship are shown in Figure 9.11.

• The values for manual annotation are low; most of them are below 0.4. The variance is high and the values range between 0 and 0.59.

• The values for semi-automatic annotation are higher; most of them are over 0.4. The variance is higher still and the values are between 0 and 0.81.

The results for the instance-instance relationship are shown in Figure 9.12.

• The values for manual annotation are rather low; a number of data points are below the 0.4 line, with a high variance. The values range between 0.04 and 0.65.

• The values for semi-automatic annotation are mostly over 0.3, ranging between 0.17 and 0.76. The variance is also high.

In summary, for the perfect inter-annotator agreement the semi-automatic annotation results show more conformity and more variance.
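The pairwise agreement values reported above are recall-based and asymmetric. For the significance tests in Section 9.3.4 below they are combined into a symmetric score via the f-measure, with P(n,m) = R(m,n). The following minimal sketch illustrates this combination; the two input values are taken from Table 8.7 (RIAA), everything else is illustrative.

    # Symmetric agreement between annotators n and m as the f-measure
    # (harmonic mean) of R(n,m) and P(n,m) = R(m,n).
    def f_agreement(r_nm: float, r_mn: float) -> float:
        if r_nm + r_mn == 0:
            return 0.0
        return 2 * r_nm * r_mn / (r_nm + r_mn)

    # Example values from Table 8.7: R(0,1) = 0.41 and R(1,0) = 0.46.
    print(round(f_agreement(0.41, 0.46), 3))   # -> 0.434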
Figure 9.9.: Instance identification of group B

Figure 9.10.: Instance-concept relationship of group B

Figure 9.11.: Instance-attribute relationship of group B

Figure 9.12.: Instance-instance relationship of group B

Sliding Agreement. Figure 9.13 shows the IdMA of group B.

Figure 9.13.: Sliding agreement: IdMA of group B

• The values for manual annotation are between 0.18 and 0.57.

• Semi-automatic annotation shows more values above 0.4. The values are between 0.22 and 0.68.

Figure 9.14 shows the RIAA of group B.

Figure 9.14.: Sliding agreement: RIAA of group B

• The values for manual annotation range between 0.18 and 0.53.

• Semi-automatic annotation has, similar to the IdMA, higher values and a greater variance. The values range between 0.18 and 0.57.

9.3.4. Statistical Results

We used the t-test on the samples to verify their significance. Table 9.5 gives an overview. The meaning of the symbols is the same as in Table 9.2. We used the f-measure for the test: the inter-annotator agreement between annotator n and annotator m is the f-measure of R(n,m) and P(n,m), where P(n,m) = R(m,n). For the aggregated test, we compared the inter-annotator agreement between two annotators n and m in the manual mode with that of two annotators using the semi-automatic mode of the tool.

1. Variable   2. Variable         t        tα;n−1   PR     n    Significance
instance identification
manual        semi-automatic     -9.3      1.67     0      72   yes
A, manual     A, semi-automatic  -13.02    1.69     0      36   yes
B, manual     B, semi-automatic  -3.91     1.69     0      36   yes
instance-concept
manual        semi-automatic     -5.93     1.67     0      72   yes
A, manual     A, semi-automatic  -8.97     1.69     0      36   yes
B, manual     B, semi-automatic  -0.88     1.69     0.38   36   no
instance-attribute
manual        semi-automatic     -8.96     1.67     0      72   yes
A, manual     A, semi-automatic  -10.76    1.69     0      36   yes
B, manual     B, semi-automatic  -3.82     1.69     0      36   yes
instance-instance
manual        semi-automatic     -4.79     1.67     0      72   yes
A, manual     A, semi-automatic  -5.12     1.69     0      36   yes
B, manual     B, semi-automatic  -1.89     1.69     0.07   36   yes
IdMA
manual        semi-automatic     -8.58     1.67     0      72   yes
A, manual     A, semi-automatic  -12.85    1.69     0      36   yes
B, manual     B, semi-automatic  -3.12     1.69     0      36   yes
RIAA
manual        semi-automatic     -8.36     1.67     0      72   yes
A, manual     A, semi-automatic  -12.03    1.69     0      36   yes
B, manual     B, semi-automatic  -2.88     1.69     0.01   36   yes

Table 9.5.: Results of the t-test for the inter-annotator agreement

The table shows the following results:

• For the tests evaluating the individual groups, all differences are significant with the exception of the instance-concept relationship for group B. In this case the results for manual and semi-automatic annotation are the same.

• For the tests evaluating the overall results of both groups, all differences are significant. For this reason we are able to accept the second part of our research hypothesis: semi-automatic annotation produces more homogeneous annotation than manual annotation.

Percentage. We calculated the mean value for each annotation mode to quantify the evaluation results. Subsequently, we calculated the percentage increase of the inter-annotator agreement using semi-automatic annotation (see Table 9.6).
Metrics                   manual   semi-automatic   Change
instance identification   0.28     0.39             40.3%
instance-concept          0.58     0.67             14.58%
instance-attribute        0.19     0.38             96.33%
instance-instance         0.26     0.34             29.82%
IdMA                      0.32     0.43             34.19%
RIAA                      0.31     0.41             33.61%

Table 9.6.: Percentage change of the mean value for the inter-annotator agreement

The table shows that the increases range from 14.57% for the instance-concept relationship to 96.33% for the instance-attribute relationship. The minimal value for the instance-concept relationship corresponds to the non-significance of this value in the t-test.

Analysis of the Questionnaire. In addition to the evaluation, the annotators also completed a questionnaire. The results are presented in the following:

• General Questions:

– Q: The concept of Semantic Annotation is . . . ? A: Ranging from 1 (very easy to understand) to 6 (very hard to understand). Mean value: 2.25, max: 4, min: 1.

– Q: How do you evaluate your previous knowledge in the area of Semantic Annotation? A: Ranging from 1 (very good) to 6 (very poor).

– Q: How well did you cope with the ontology? A: Ranging from 1 (very well) to 6 (very badly).

• Manual Annotation:

– Q: How do you like the user interface?
– Q: How understandable is the layout of the user interface?
– Q: How clear is the information presented?
– Q: How self-explanatory are the symbols?
– Q: How did you like the handling of the tool?
– Q: How was the overall impression of the tool?

• Semi-automatic Annotation:

– Q: How do you like the user interface?
– Q: How understandable is the layout of the user interface?
– Q: How clear is the information presented?
– Q: How self-explanatory are the symbols?
– Q: How did you like the handling of the tool?
– Q: What was your overall impression of the tool?

• General Impression:

– Q: Which annotation mode would you prefer for the task?

In conclusion, the questionnaire shows the following:

• The participants had little previous knowledge about semantic annotation. However, they found the idea easy to understand.

• There were some suggestions on ways to improve the ontology, but in general the participants found it easy to use.

• The user interface for the manual annotation mode made a good impression on the participants. Its usage was problem-free.

• Similarly, the user interface for semi-automatic annotation made a good impression. The marks for this interface are higher than those for manual annotation.

• Nearly all participants preferred semi-automatic annotation, four of them without reservations.

• Three participants preferred the manual annotation mode. The main reason was the difficulty of annotating systematically in the semi-automatic mode in the way they liked.

9.3.5. Summary

The evaluation proved the advantage of semi-automatic annotation with respect to annotation time and homogeneity. However, some results of group B, whose participants annotated initially in the semi-automatic annotation mode, show no difference. Presumably, the training effect was offset by the positive impact of the semi-automatic annotation. In general the participants of the evaluation were happy with the improvement brought by semi-automatic annotation. However, there were some participants in group B who had some doubts.
Especially for group B it was not easy to learn all at once the interaction with the tool, the concept of Semantic Annotation, and the handling of the information extraction necessary for semi-automatic annotation. Hence, we can deduce that one first has to understand the concept of Semantic Annotation and face its inherent problems in order to appreciate and profit from the assistance of semi-automatic annotation.

The comments of the participants with regard to the user interface reflect the development status of the annotation tool at the moment of the evaluation. Some of the suggested improvements have been implemented in the meantime. Issues of human-computer interaction are important when dealing with manual and semi-automatic annotation, but these issues have not been the main focus of this thesis and are therefore a possible target for future work.

9.4. Discussion

The results of the evaluation allow us to accept our research hypothesis, viz.: semi-automatic annotation is faster and more homogeneous than manual annotation. We are satisfied with this overall result, but in the following paragraphs we discuss possible improvements to the evaluation and the annotation tools.

Evaluation. It is important to find standardized methods and metrics for the evaluation of annotation frameworks. Hence, we presented such an approach in this work. Our experience with the actual evaluation also shows some of the unsettled points, especially soft factors, when measuring the inter-annotator agreement:

• Similarity of Web pages. One problem is the similarity of the Web pages in our use case; e.g. every page contained the concepts Hotel, Address, SingleRoom, and so on. Very similar Web pages facilitate the training of the information extraction and therefore help to improve instance recognition by the semi-automatic annotation. But they also ease the task for the annotator, so that he can easily spot the necessary pieces of text on the Web page. In this case the support of the semi-automatic annotation might not be considered as useful by the user as in a scenario with greater variance among the Web pages.

• Instance Names. In the hotel use case presented here, many instances have the same names as the concepts of the ontology; e.g. the text “TV” in the description of a hotel room indicates an unspecific instance of the concept TV that is related to the instance of the concept Room. This simplifies the information extraction and therefore the annotation task, which leads to a higher value for the instance-concept relationship metric than in the home page use case.

• Guidelines. We gave the annotators some guidelines for the annotation, i.e. which parts of the page to annotate. This was done in order to avoid too much being annotated, meaning too much time would be needed, and also to avoid too little of a page being annotated and too little data for an evaluation being produced. Some participants followed these guidelines. Others did not and annotated everything that was represented in the ontology. Although we cannot measure the influence of the guidelines on the evaluation results, it is clear that they improved the general level of inter-annotator agreement compared with free annotation.

• Relational Metadata. An important concept of the CREAM framework is the creation of relational metadata not only on one Web page but over several Web pages, i.e. the inter-connection of Web pages with metadata.
However, in the hotel use case there were not enough pages with interrelated information. Also, the focus on annotating one page eases the annotation task for the inexperienced user. Nevertheless, future evaluations should also consider the aspect of relational metadata between Web pages.

The Annotation Tool OntoMat. Clearly the annotation tool is the most important factor when it comes to the improvement of the annotation task. Specifically, we found the following aspects during the evaluation:

• The hotel use case shows the need to annotate alternatives. For example, there is the description that a room might have either a bathtub or a shower. Here the annotation tool did not allow a logical “OR” relationship between the instances to be modeled.

• Support for inverse relationships is important. At the time of the evaluation the annotation tool did not support this. Therefore, one had to model the relationship “has room” for a hotel and “in hotel” for a room. A support facility for inverse relationships would automatically create the inverse.

• The automatic creation of “anonymous” instance names was only partially useful in the evaluation. It just numbered the instances, e.g. “room1”, “room2”, which made it difficult for the annotators to distinguish rooms of different hotels, i.e. from different Web pages, in one session. Hence, the created name, or rather the representation of the instance in the annotation tool, should convey information about the selected text, the Web page of origin and the related concept.

• The description of hotel Web pages requires a lot of “anonymous” instances of concepts such as Bath, Shower, SingleRoom, ParkingPlace. Most of these instances do not have attributes, and it is very inefficient to create them again and again. A practical idea might be to reuse these instances. However, this would result in an identity problem, e.g. the shower of room X would then be the same as the shower of room Y. This shows the need for a special handling of such “anonymous” instances in an ontology.

• In a lot of cases, rooms of different hotels have the same or similar furnishings. Therefore it should be possible to copy such room instances from one hotel to another and change the appropriate attributes.

• In the original version of the GETESS ontology – not used in the evaluation – there is a concept Currency, with the subconcepts Euro, US Dollar, Yen, and so on. The annotation tool so far only allows relationships between concept instances, so that a relationship to a concept is not possible.

The Information Extraction Component Amilcare. The success of semi-automatic annotation depends on the information extraction component. The results of the information extraction are acceptable, but could be improved. During the evaluation Amilcare was able to find about half of the instances and a third of the attributes. There are some possibilities for improving this:

• At the time of the evaluation it was not possible to edit the extraction rules of Amilcare to enable fine-tuning. In the meantime this has become possible. However, it demands expert knowledge, because the rules are not intuitive to understand and change.

• Amilcare works with positive and negative examples. This requires that every occurrence of an instance on the Web page be annotated for the training.
This leads to additional requirements that we did not fulfill with our paradigm of Semantic Annotation, because i) for Semantic Annotation a fact is represented and annotated only once, and ii) information extraction considers the whole Web page source, not the rendered Web page; i.e. comments, scripts, meta tags etc. hidden in the Web page, as well as the title tag, also needed to be marked up for Amilcare. Otherwise they would be handled as negative examples. This full-source annotation is not supported by our annotation tool.

In addition, there is always information which is more difficult to extract using Amilcare, e.g. the sentence “we are happy to organize your business meeting with up to 60 persons”. Amilcare did not extract the information that this concerns a conference room with the appropriate capacity, because sentences like this appear seldom and in varying forms in the corpus, so that Amilcare is not able to create a rule for such cases.

GATE. Amilcare is based on GATE (General Architecture for Text Engineering, http://gate.ac.uk), a tool for language engineering developed by the University of Sheffield. The task of GATE is the preprocessing of the text, for example tokenization or sentence identification. GATE also allows the markup of documents. However, it supports only flat annotation without the basis of an ontology. The kind of annotation generated by GATE is the input that Amilcare needs for the training. We considered using GATE to create the annotation for the training. However, at the time of the evaluation the GATE annotation tool was not able to deal with HTML files without destroying the HTML tags. Hence, we extended our annotation tool to transform the ontology-based Semantic Annotation into the flat annotation needed by Amilcare and GATE.

9.5. Conclusion

The evaluation shows that semi-automatic annotation with the help of information extraction accelerates the annotation task and increases the inter-annotator agreement. In general the annotators were pleased with the proposals of the tool and therefore gave better marks to the semi-automatic annotation mode. The evaluation also shows the potential for improving automatization in terms of i) the user interface of the annotation tool, ii) the ontology guidance of the annotation tool, and iii) the training of the information extraction component. However, the evaluation also shows the limits of this approach, especially the need for a big corpus of similar documents for the training of the information extraction component, as we saw in the hotel use case.

Part IV.

Related Work & Conclusions

“All Life Is Problem Solving.” — Karl R. Popper

10. Comparison with Related Work

Semantic Annotation as we have presented it in this thesis is a cross-sectional enterprise (just like the Semantic Web overall!). Therefore there are a number of communities that have contributed towards achieving the objective of Semantic Annotation. For our annotation framework we distinguish the related work according to i) the basic framework CREAM (Section 10.1), ii) the extended framework (Section 10.2) and iii) special annotation applications (Section 10.3).

10.1. Related Work for the Basic Framework

The basic framework of CREAM can be compared along four dimensions: Firstly, it is a framework for markup in the Semantic Web.
Secondly, it may be considered as a particular knowledge acquisition framework that is to some extent similar to Protégé-2000 [Eriksson et al., 1999]. Thirdly, it is certainly an annotation framework, though with a different focus than ones like Annotea [Kahan et al., 2001]. And finally, it is an authoring framework with an emphasis on metadata creation.

10.1.1. Knowledge Markup in the Semantic Web

We know of three major early systems that use knowledge markup intensively in the Semantic Web, viz. SHOE [Heflin and Hendler, 2000b], Ontobroker [Decker et al., 1999], and WebKB [Martin and Eklund, 1999]. All three of them rely on markup in HTML pages, and all started by providing manual markup via editors. However, our experiences (cf. [Erdmann et al., 2000]) have shown that text-editing knowledge markup yields extremely poor results, viz. syntactic mistakes, incorrect references, and all the problems outlined in the scenario section.

The SHOE Knowledge Annotator is a Java program that allows users to mark up Web pages with the SHOE ontology. The SHOE system [Luke et al., 1997] defines additional tags that can be embedded in the body of HTML pages. The SHOE Knowledge Annotator is a little helper (like our earlier OntoPad [Fensel et al., 1999], [Decker et al., 1999]) rather than a fully-fledged annotation environment. WebKB [Martin and Eklund, 1999] uses conceptual graphs for representing the semantic content of Web documents. It embeds conceptual graph statements into HTML pages. Essentially it offers a Web-based template-like interface like the knowledge acquisition frameworks described in the following section.

A more recent contribution is the RDF annotator SMORE (http://www.mindswap.org/~aditkal/editor.shtml). SMORE allows markup of images and emails as well as HTML and text. A tool with similar characteristics to SMORE is the Open Ontology Forge (OOF) [Collier et al., 2004]. OOF is seen by its creators as an ontology editor that supports annotation, taking it a step further towards an integrated environment to handle documents, ontologies and annotations.

The next group of markup tools that we will discuss is semi-automatic: they have automatic components but assume intervention by the user in the annotation process. In this line, the tools most similar to OntoMat are the system from The Open University [Lei et al., 2002] and the corresponding MnM annotation tool [Vargas-Vera et al., 2002]. MnM also uses the Amilcare information extraction system. It allows the semi-automatic population of an ontology with metadata. So far, its authors have not dealt with relational metadata or authoring concerns. Another weakness is that MnM is restricted to either marking up the slots for a single concept at a time or marking up all the concepts on a single hierarchical level of a single ontology (but not their slots).

AeroSWARM (http://ubot.lockheedmartin.com/ubot/hotdaml/aeroswarm.html) is an automatic tool for annotation using OWL ontologies based on the DAML annotator AeroDAML [Kogut and Holmes, 2001]. It has both a client-server version and a Web-enabled demonstrator in which the user enters a URI and the system automatically returns a file of annotations on another Web page. To view these in context the user would have to save the RDF to an annotation server and view the results in an annotation-friendly browser such as Amaya. AeroDAML is therefore not in itself an annotation environment.
SemTag is another example of a tool which focuses only on automatic markup [Dill et al., 2003]. It is based on IBM's text analysis platform Seeker and uses similarity functions to recognize entities which occur in contexts similar to marked-up examples. The key problem of large-scale automatic markup is identified as ambiguity, e.g. identical strings such as “Niger”, which can refer to different things, a river or a country. A Taxonomy Based Disambiguation (TBD) algorithm is proposed to tackle this problem. SemTag is proposed as a bootstrapping solution to get a semantically tagged collection off the ground. AeroSWARM and SemTag, as the most large-scale automatic markup systems, focus on class instantiation, a goal comparable to our PANKOW approach. Hence, they have also not dealt with the creation of relational metadata.

KIM [Popov et al., 2003] uses information extraction techniques to build a large knowledge base of annotations. The annotations in KIM are metadata in the form of named entities (people, places etc.) which are defined in the KIMO ontology and identified mainly by reference to extremely large gazetteers. This is restrictive, and it would be a significant research challenge to extend the KIM methodology to domain-specific ontologies. However, named entities are a class of metadata with broad usage, and the KIM platform is well placed to showcase the kinds of retrieval and data analysis services that can be provided over large knowledge bases of annotations.

10.1.2. Comparison with Knowledge Acquisition Frameworks

The CREAM framework allows for creating class and property instances and for populating HTML pages with them. Thus, it targets a roughly similar objective to the instance acquisition phase in the Protégé-2000 framework [Eriksson et al., 1999] (the latter needs to be distinguished from the ontology editing capabilities of Protégé). The obvious difference between CREAM and Protégé is that the latter does not (and was not intended to) support the particular Web setting, viz. managing and displaying Web pages — not to mention Web page authoring. From Protégé we have adopted the principle of a meta ontology that allows a distinction between different ways that classes and properties are treated. We have named Protégé-2000 only as one example of existing knowledge acquisition frameworks.

Another recent knowledge acquisition framework with particular applications in mind is TRELLIS [Gil and Ratnakar, 2002]. It is designed to support argument analysis in decision making scenarios. It demonstrates the additional support that can be given to users when an annotation environment is designed for a specific purpose. For example, annotations in TRELLIS are in the form of free-text statements. This presents a problem, since statements about the same thing can be phrased differently and consequently not matched up by the user. Therefore a component called ACE has been built which helps users to formulate statements in ways which are consistent with terms in the ontology [Blythe and Gil, 2004]. The annotations in TRELLIS can be output as RDF. However, perhaps because it is designed as a tool for analyzing a wide range of document formats, the authors do not discuss whether it is possible to anchor annotations to a particular part of a text.
10.1.3. Comparison with Annotation Frameworks

Types of Annotations

There are several types of annotation, for example (i) creating a kind of user comment about Web pages (cf. [Kahan et al., 2001]), (ii) annotating a document with further links (cf. Section 2.2 in [Bechhofer et al., 2002]), or (iii) semantic annotation. Furthermore, we can distinguish Semantic Annotation into instance creation and aboutness (cf. Section 5 in [Bechhofer et al., 2002]). Our viewpoint of annotation lies with instance creation (see Section 3.4.1), but we can emulate most of the other annotation types with our framework:

• User comment: For instance, a user attaches a note like “An interesting research topic!” to the term “Semantic Web” on a Web page. In CREAM we can design a corresponding ontology that allows the comment (an unlinked fact) “An interesting research topic!” to be typed into an attribute instance belonging to an instance of the class Comment, with a unique XPointer at “Semantic Web” (see the sketch at the end of this subsection).

• Link annotation: A user attaches a URL to a Web page. In CREAM we can design a corresponding ontology that allows the insertion of the URL (as a reference) into an attribute instance belonging to an instance of the class URL.

• Aboutness: A user expresses that a particular resource (i.e. a Web page) is about a certain concept, but is not an instance of a concept. In CREAM we can introduce a meta-concept into the corresponding ontology; the instances of this meta-concept are all concepts of the ontology. Thereby the user interface supports the creation of an aboutness annotation. For instance, a user inserts the URL of a page about students (e.g. http://www.student.de) as a reference into an attribute instance belonging to an instance of the class AboutStatement. Then he creates a relationship instance between the AboutStatement and the concept “student” of the ontology.

The term annotation is also used in the area of Natural Language Processing (NLP) with a different meaning (cf. Chapters 7.1 and 10.3.2). The levels of linguistic annotation comprise lemma, morphosyntactic, syntactic, semantic and discourse annotation [Leech, 1997]. There is also some preliminary work attempting the combination of semantic Web page annotation with linguistic annotation [de Cea et al., 2002].

Annotation Tools

There are a lot of — even commercial — annotation tools like ThirdVoice (http://www.thirdvoice.com), Yawas [Denoue and Vignollet, 2000], CritLink [Yee, 1998] and Annotea (Amaya) [Kahan et al., 2001]. These tools all share the idea of creating a kind of user comment about Web pages. The term “annotation” in these frameworks is understood as a remark assigned to an existing document.

Annotea actually goes one step further. It allows an RDF schema to be used as a kind of template that is filled in by the annotator. For instance, Annotea users may use a schema for Dublin Core and fill the author slot of a particular document with a name. This annotation, however, is again restricted to attribute instances. The user may also decide to use complex RDF descriptions instead of simple strings for filling such a template. However, he then has no further support from Amaya that helps him provide syntactically correct statements with proper references.
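To illustrate the first emulation above (the user comment), the following sketch shows what such an annotation could look like as RDF, using the rdflib Python library purely for illustration. The namespace, class and property names are hypothetical (they are not the vocabulary used by OntoMat), and an XPointer expression of the kind introduced in Section 2.3 anchors the comment to the phrase in the page.

    # Hypothetical comment ontology and annotation URIs -- illustration only.
    from rdflib import Graph, Literal, Namespace, RDF, URIRef

    ANN = Namespace("http://example.org/comment-ontology#")
    g = Graph()

    comment = URIRef("http://example.org/annotations#c1")
    # An XPointer (cf. Section 2.3) anchoring the comment to the phrase
    # "Semantic Web" inside the annotated page.
    target = URIRef("http://example.org/page.html"
                    "#xpointer(string-range(//p[1],'Semantic%20Web'))")

    g.add((comment, RDF.type, ANN.Comment))
    g.add((comment, ANN.text, Literal("An interesting research topic!")))
    g.add((comment, ANN.annotates, target))

    print(g.serialize(format="turtle"))

In CREAM proper, the class Comment and its attribute would come from the designed comment ontology, and the XPointer would be generated by the annotation tool rather than written by hand.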
10.1.4. Comparison with Authoring Frameworks

An approach related to CREAM authoring is the Briefing Associate of Teknowledge [Tallis et al., 2001]. The tool is an extension of Microsoft PowerPoint. It pursues the idea of producing PowerPoint documents with the metadata coding as a by-product of the document composition. For each concept and relation in the ontology, an instantiation button is added to the PowerPoint toolbar. Clicking on one of these buttons allows the author to insert an annotated graphical element into his presentation. Thus, a graphic element in the presentation corresponds to an instance of a concept, and arrows between the elements correspond to relationship instances. In order to be able to use an ontology in PowerPoint one must have assigned graphic symbols to the concepts and relations, which is done initially with the visual-annotation ontology editor (again a kind of meta ontology assignment). The Briefing Associate is available for PowerPoint documents. In contrast, CREAM does not provide graphic support like the Briefing Associate, but it supports both parts of the annotation process, i.e. it permits the simultaneous creation of documents and metadata and, in addition, the annotation of existing documents. However, the Briefing Associate has shown very interesting ideas that may be of future value to CREAM.

The authoring of hypertexts and the authoring with concepts are topics in the COHSE project [Goble et al., 2001]. COHSE allows for the automatic generation of metadata descriptions by analysing the content of a Web page and comparing the tokens with concept names described in a lexicon. It supports ontology reasoning, but it does not support the creation of relational metadata. It is unclear to what extent COHSE considers the synchronous production of document and metadata by the author.

Somewhat similar to COHSE concerning the metadata generation, Klarity [Klarity, 2001] automatically fills Dublin Core fields, taking advantage of statistical methods to allocate values based on the current document.

10.2. Related Work for the Extended Framework

In addition, the extensions of the basic CREAM framework for deep annotation and pattern-based annotation require a comparison with the work of further communities.

10.2.1. Related Work for Deep Annotation

A number of communities have contributed towards reaching the objective of deep annotation. So far, we have identified the communities for information integration, mapping frameworks, wrapper construction, and annotation, which we discuss in turn in this section.

Information Integration

The core idea of information integration lies in providing an algebra that may be used to translate information proper between different information structures. The underlying algebras are used to provide compositionality of translations as well as a sound basis for query optimization (cf., e.g., a commercial system as described in [Papakonstantinou and Vassalos, 2002] with many references to previous work, a lot of the latter based on principal ideas issued in [Wiederhold, 1993]).

Unlike [Papakonstantinou and Vassalos, 2002], our objective has not been the provision of a flexible, scalable integration platform per se.
Rather, the purpose of deep annotation lies in providing a flexible framework for creating the translation descriptions that may then be exploited by an integration platform like EXIP (or Nimble, Tsimmis, Infomaster, Garlic, etc.). Thus, we have more in common with the approaches for creating mappings for the purpose of information integration described next.

Mapping and Merging Frameworks

Approaches for mapping and/or merging ontologies and/or database schemata may be divided mainly into the following three categories: discovery [Rahm and Bernstein, 2001; Cohen, 1998; Doan et al., 2002; Bergamaschi et al., 2001; Noy and Musen, 2000; McGuinness et al., 2000], mapping representation [Madhavan et al., 2001; Bergamaschi et al., 2001; Mitra et al., 2000; Park et al., 1997] and execution [Critchlow et al., 1998; Mitra et al., 2000]. In the overall area, closest to our own approach is [Maedche et al., 2002], as it handles — like we do — the complete mapping process involving the three process steps just listed (actually, it also takes care of some further issues like evolution).

What makes deep annotation different from all these approaches is that they all depend on lexical agreement of parts of the two ontologies/database schemata for the initial discovery of overlaps between different ontologies/schemata. Deep annotation only depends on the user understanding the presentation — the information within an information context — developed for him/her anyway. Concerning the mapping representation and execution, we follow a standard approach exploiting Datalog, which gives us many possibilities for investigating, adapting and executing mappings, as described in Section 6.7.

Wrapper Construction

Methods for wrapper construction achieve many objectives that we pursue with our approach of deep annotation. They have been designed to allow for the construction of wrappers by explicit definition of HTML or XML queries [Sahuguet and Azavant, 2001] or by learning such definitions from examples [Kushmerick, 2000; Ciravegna, 2001a]. Thus, it has been possible to manually create metadata for a set of structurally similar Web pages. The wrapper approaches come with the advantage that they do not require cooperation by the owner of the database. However, their disadvantage is that the correct scraping of metadata depends to a large extent on the data layout rather than on the structures underlying the data.

Furthermore, when definitions are given explicitly [Sahuguet and Azavant, 2001], the user must directly cope with querying by layout constraints, and when definitions are learned, the user must annotate multiple Web pages in order to derive correct definitions. Also, these approaches do not map to ontologies. They typically map to lower-level representations, e.g. nested string lists in [Sahuguet and Azavant, 2001], from which the conceptual descriptions must be extracted, which is a non-trivial task. In fact, we have integrated a wrapper learning method, viz. Amilcare [Ciravegna, 2001a], into our OntoMat-Annotizer. How to bridge between wrapper construction and annotation is described in detail in Section 5.1.

Annotation

Finally, we need to consider annotation proper as part of deep annotation. In the latter, we “inherit” the principal annotation mechanism for creating relational metadata as elaborated above.
10.2.2. Related Work for Pattern-based Annotation

In Chapter 5.2 we presented a novel paradigm, the self-annotating Web, as well as an original method, PANKOW, which makes use of globally available structures as well as statistical information to annotate Web resources. Though there are initial blueprints for this paradigm, to our knowledge there has been no explicit formulation of it, nor a concrete application of it such as the one presented in Chapter 5.2.

On the other hand, there is quite a lot of work related to the use of linguistic patterns to discover certain ontological relations from text. Hearst [Hearst, 1992] and Charniak [Charniak and Berland, 1999], for example, make use of a related approach to discover taxonomic and part-of relations from text, respectively.

Concerning the task of learning the correct class or ontological concept for an unknown entity, there is some related work, especially in the computational linguistics community. The aim of the Named Entity Task as defined in the MUC conference series [Hirschman and Chinchor, 1997] is to assign the categories ORGANIZATION, PERSON and LOCATION. Other researchers have considered more complex versions of this task, such as Hahn and Schnattinger [Hahn and Schnattinger, 1998], Alfonseca and Manandhar [Alfonseca and Manandhar, 2002] or Fleischman and Hovy [Fleischman and Hovy, 2002].

Hahn and Schnattinger [Hahn and Schnattinger, 1998], when encountering an unknown word in a text, create a hypothesis space for each concept to which the word could belong. These initial hypothesis spaces are then iteratively refined on the basis of evidence extracted from the linguistic context in which the unknown word appears. In their approach, evidence is formalized in the form of quality labels attached to each hypothesis space. At the end, the hypothesis space with maximal evidence with regard to the qualification calculus used is chosen as the correct ontological concept for the word in question. Their approach is very closely related to ours, and in fact they use similar patterns to identify instances in the text. However, the approaches cannot be directly compared. On the one hand, they tackle categorization into an even larger number of concepts than we do, and hence our task would be easier. On the other hand, they evaluate their approach under clean-room conditions, as they assume accurately identified syntactic and semantic relationships and an elaborate ontology structure, while our evaluation is based on very noisy input — rendering our task harder than theirs.

Alfonseca and Manandhar [Alfonseca and Manandhar, 2002] have also addressed the problem of assigning the correct ontological class to unknown words. Their system is based on the distributional hypothesis, i.e. that words are similar to the extent to which they share linguistic contexts. Based on this premise, they adopt a vector-space model and exploit certain syntactic dependencies as features of the vector representing a certain word. The unknown word is then assigned to the category corresponding to the most similar vector. However, it is important to mention that it is not clear from their paper whether they actually evaluate their system on the 1200 synsets/concepts or only on a smaller subset of these.

Fleischman and Hovy [Fleischman and Hovy, 2002] address the classification of named entities into fine-grained categories.
Fleischman and Hovy [Fleischman and Hovy, 2002] address the classification of named entities into fine-grained categories. In particular, they categorize named entities denoting persons into the following 8 categories: athlete, politician/government, clergy, businessperson, entertainer/artist, lawyer, doctor/scientist, police. Given this categorization task, they present an experiment in which they examine 5 different machine learning algorithms: C4.5, a feed-forward neural network, k-nearest Neighbors, a Support Vector Machine and a Naive Bayes classifier. As features for the classifiers they make use of the frequencies of certain N-grams preceding and following the instance in question, as well as topic signature features which are complemented with synonymy and hyperonymy information from WordNet. Fleischman and Hovy's results are certainly very high in comparison to ours – and also to those of Hahn and Schnattinger [Hahn and Schnattinger, 1998] and Alfonseca and Manandhar [Alfonseca and Manandhar, 2002] – but, on the other hand, though they address a more complex task than the MUC Named Entity Task, they are still quite some way from the number of categories we consider here.

10.3. Related Work for Applications of CREAM

The annotation applications for the CREAM framework presented in this thesis are service annotation (10.3.1) and linguistic annotation (10.3.2).

10.3.1. Comparison with Service Annotation

In Chapter 7.2 we provide an original application of the CREAM framework, CREAM-Service, to embed the process of Web service discovery (here: by browsing Web pages and retrieving Web pages from search engines like Google), composition (here: by deep annotation and reasoning over logically possible configurations), and invocation (here: by OntoMat-Service-Surfer, and the mapping to a client ontology). The consideration of semantic heterogeneity is germane to CREAM-Service; it offers semantic translations as one of its core modules.

CREAM-Service does not aim at Web service discovery, composition and invocation that is intelligent in the sense that it completely automates the task that typically the user is supposed to do. Rather, it provides an interface, OntoMat-Service-Surfer, that supports the intelligence of the user and guides him to add semantic information to the extent that only few logically valid paths remain to be chosen from by the user.

To fully pursue such an objective, one needs a large set of different modules. We have built on our existing experience and tool framework for Semantic Annotation, CREAM, and for logical reasoning [Decker et al., 1999]. We have not yet dealt with the issue of Web service flow execution and monitoring that is certainly needed to complement our current version of CREAM-Service. Frameworks that facilitate the building of Web service flows come closest to our approach. A number of software systems are available to facilitate manual composition of programs, and more recently Web services. Such programs, which include a diversity of workflow tools [van der Aalst, 1999; Ellis and Nutt, 1993], and more recently service composition aids such as BizTalk Orchestration [Lowe, 2001], enable a software engineer to manually specify a composition of programs to perform some task — though they typically neglect the aspect of semantic heterogeneity that is core to CREAM-Service5.

5 BizTalk even allows for XML-based (non-semantic) translations of data.

Web Services Invocation Framework (WSIF) [Apache, 2004] is an open source framework to execute any Web service that can be described by a WSDL document. However, it does not support the execution of a flow of Web services.
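The kind of reasoning over logically possible configurations employed in CREAM-Service can be illustrated by a deliberately small sketch in Python. The service names and concept types are invented, and a real matchmaker would reason over ontological subsumption rather than string equality of types:

    # Each service is described by the concept it consumes and the concept
    # it produces (a drastic simplification of a semantic service description).
    services = {
        "GeoCoder":     ("Address", "Coordinates"),
        "MapRenderer":  ("Coordinates", "MapImage"),
        "WeatherByPos": ("Coordinates", "Forecast"),
    }

    def valid_paths(start, goal, chain=()):
        """Enumerate service chains leading from concept start to concept goal."""
        if start == goal:
            yield chain
            return
        for name, (consumes, produces) in services.items():
            if consumes == start and name not in chain:
                yield from valid_paths(produces, goal, chain + (name,))

    print(list(valid_paths("Address", "Forecast")))
    # [('GeoCoder', 'WeatherByPos')]

In OntoMat-Service-Surfer it is then up to the user, not the machine, to choose among the few remaining valid paths.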
Some technologies have been proposed that use some form of semantic markup of Web services in order to automatically compose Web services to perform some desired task (e.g. [Narayanan and McIlraith, 2002; Benjamins et al., 1998; McIlraith and Son, 2002]). In [Narayanan and McIlraith, 2002], the authors use situation calculus for representing Web service descriptions and Petri nets for describing the execution behaviors of Web services. In [Benjamins et al., 1998], the authors present an architecture of intelligent brokers that offer problem solving methods that can be configured and used by the users according to their needs. In [McIlraith and Son, 2002], the authors propose an extended version of Golog for formalizing the provision of high-level generic procedures and the customization of constraints. In [Ponnekanti and Fox, 2002], the authors propose a rule-based expert system to automatically compose Web services from existing Web services.

On the one hand, recent experiences from advanced projects like IBrow have shown that automatic composition techniques cannot yet be carried over to an open-world setting. In such a scenario one needs to tightly integrate the user of a Web service — as we do in CREAM-Service. On the other hand, CREAM-Service can obviously be extended in the future to consider more types of automatic semantic matchmaking, service discovery [Paolucci et al., 2002; Sycara et al., 1999] and configuration of Web services in the Web service planning phase.

10.3.2. Comparison with Linguistic Annotation

There is a vast number of frameworks and tools developed for the purpose of linguistic annotation. However, in Chapter 7.1 we focus on the discussion of frameworks for the annotation of anaphoric or discourse relations in written texts.

In the annotation scheme proposed by [Müller and Strube, 2001] in the context of their annotation tool MMAX, and in contrast to the one proposed in Chapter 7.1, anaphoric relations are restricted to coreferring expressions, while bridging relations are restricted to non-coreferring ones. In line with [Krahmer and Piwek, 2000] and [van Deemter and Kibble, 2000], this is, in our view, too strict a definition of anaphora, so that we propose a more relation-based classification of anaphoric and bridging relations. Furthermore, in [Müller and Strube, 2001], anaphoric relations are further differentiated according to the lexical items taking part in the relation. We have shown that, under the assumption that the corresponding grammatical information is provided by the annotators, such a classification can be seen as a byproduct of a more semantic one such as the one outlined in Chapter 7.1. In addition, [Müller and Strube, 2001] propose to specify antecedence with regard to equivalence classes rather than with regard to particular antecedents. However, this has the disadvantage that the information about the actual antecedent which has been selected by an annotator is lost. Thus, in our annotation proposal, the fact that the Coreference relation forms equivalence classes is modeled by an underlying axiom system which can be exploited in the evaluation of a system against the annotation standard.
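A minimal sketch in Python may illustrate this point: the annotator's individual antecedent choices are retained as such, while the equivalence classes follow from closing the Coreference relation under the usual axioms (symmetry and transitivity). The mentions and links are invented:

    def equivalence_classes(links):
        """Union-find closure over annotated (anaphor, antecedent) pairs."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        for anaphor, antecedent in links:
            parent[find(anaphor)] = find(antecedent)

        classes = {}
        for mention in parent:
            classes.setdefault(find(mention), set()).add(mention)
        return list(classes.values())

    links = [("he", "Mr. Smith"), ("the director", "Mr. Smith")]
    print([sorted(c) for c in equivalence_classes(links)])
    # [['Mr. Smith', 'he', 'the director']]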
The annotation scheme proposed by Poesio et al. [Poesio and Vieira, 1998] is a product of a corpus-based analysis of definite description (DD) use showing that more than 50% of the DDs in their corpus are discourse-new or unfamiliar. Thus, in Poesio et al.'s annotation scheme, definite descriptions are also explicitly annotated as discourse-new.

The MUC coreference scheme [Hirschman and Chinchor, 1997] is restricted to the annotation of coreference relations, where coreference is also defined as an equivalence relation. Though this annotation scheme may seem quite simple, we agree with [Hirschman and Chinchor, 1997] that it is complex enough when taking into account the agreement of the annotators on a task. In fact, it has been shown that the agreement of subjects annotating bridging [Poesio and Vieira, 1998] or discourse [Cimiano, 2003] relations can be too low for tentative conclusions to be drawn [Carletta, 1996]. The motivation of the MUC coreference scheme was thus to develop an annotation scheme leading to good agreement. Our motivation, on the other hand, is to show how our ontology-based framework can be applied to the annotation of anaphoric relations in written texts, and from this perspective the MUC coreference annotation scheme would in fact have been too restricted to actually show all the advantages of our approach.

The UCREL [Fligelstone, 1992] and DRAMA [Passoneau, 1996] annotation schemes are more closely related to ours than the schemata above in the sense that they also provide a rich set of particular bridging relations that can be annotated. However, in contrast to the ontology-based framework presented here, these bridging relations are not constrained with regard to the conceptual types of their arguments, so that erroneous annotations cannot be avoided (a minimal illustration of such type constraints is given at the end of this subsection).

The coreference annotation scheme proposed within the MATE Workbench project consists of a core as well as an extended scheme [Davies et al., 1998]. The core scheme is in principle identical to the MUC coreference scheme and is restricted to the annotation of coreference in the sense of [van Deemter and Kibble, 2000]. The extended scheme also allows the annotation of bound anaphors, of the relationship between a function and its values, of different set, part and possession relations, of instantiation relations, as well as of event relations. The MATE scheme is related to our ontology-based annotation scheme in the sense that relations are also annotated as triples via the link-tag [Davies et al., 1998]. As in our framework, the MATE scheme also allows ambiguities of reference to be marked up. However, in contrast to the MATE scheme, our framework has no means to specify a preference order on these ambiguous antecedents. On the other hand, the MATE scheme also includes a reasonable and complete taxonomy of markables, as well as some features relevant for the annotation of coreference in dialogues, such as the treatment of hesitations, disfluencies and misunderstandings.
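The benefit of constraining bridging relations by the conceptual types of their arguments, as raised above in connection with UCREL and DRAMA, can be sketched in a few lines of Python; the mini-ontology and the relation signature are hypothetical:

    # Hypothetical concept hierarchy: concept -> direct superconcept.
    subclass_of = {"Hotel": "Building", "Lobby": "Room", "Mayor": "Person"}

    def is_a(concept, ancestor):
        while concept is not None:
            if concept == ancestor:
                return True
            concept = subclass_of.get(concept)
        return False

    # Relation signatures: relation -> (domain, range).
    signatures = {"has_part": ("Building", "Room")}

    def admissible(relation, arg1_type, arg2_type):
        """Accept a bridging annotation only if it respects the signature."""
        domain, range_ = signatures[relation]
        return is_a(arg1_type, domain) and is_a(arg2_type, range_)

    print(admissible("has_part", "Hotel", "Lobby"))  # True:  "the hotel ... the lobby"
    print(admissible("has_part", "Hotel", "Mayor"))  # False: rejected annotation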
10.4. Related Work for Evaluation

Our evaluation of inter-annotator agreement is an investigation corresponding to studies of consistent inter-linking of hypertexts or of inter-indexing in library science. On a rough view, there is an analogy between indexing a document and identifying ontology objects for documents. There is also an analogy between creating links between hypertext nodes and creating relations between ontology objects. However, the goals of each approach are not quite comparable, and the ontology structures are more complex than hypertext links and indices.

The survey [Leonard, 1977] describes the measurement of the extent to which agreement exists among different indexers on the sets of index terms to be assigned to individual documents. The study shows that there is a low level of agreement between the sets of index terms assigned to a document by different indexers. Even the levels of consistency identifiable in the work of a single indexer on a collection of documents are often comparably low. The results from the measurement of inter-linker consistency in hypertext databases, as shown in [Ellis et al., 1994], are similar. That work describes an experiment in which the degree of similarity is measured between a number of hypertext databases that share a common set of nodes but whose link-sets have been manually created by different people. As a result, the inter-linker consistency is low and varying. The results of inter-indexing and inter-linking studies are consistent with our principal conclusion that high levels of agreement are rarely achieved.

A lot of work on evaluating information extraction systems has been done in the Message Understanding Conferences (MUC). In [Lehnert et al., 1994] it is described how the basic evaluation text corpus has been developed in a distributed manner. All contributing sites generated template representations for some specified segment of the 1300 texts. Pairwise combinations of sites were expected to compare overlapping portions of their results and work out any differences that emerged. The authors note that it takes an experienced researcher three days to cover 100 texts and produce good-quality template representations for these texts. However, the question of how quality is measured remains open in their paper. The authors state that their "estimate also finesses the fact that two people will seldom agree on the complete representation which can then be compared, discussed and adjusted as needed."

[Noy et al., 2000] describes an empirical evaluation of a knowledge acquisition tool with the target of building domain knowledge bases. Military experts, who had no experience in knowledge acquisition or computer science in general, were taken as subjects. Evaluation criteria are defined along several dimensions, namely the knowledge-acquisition rate, the ability to find errors, the quality of knowledge entries, the error-recovery rate, the retention of skills and the subjective opinion. The results document the ability of these subjects to work on a complex knowledge-entry task and highlight the importance of an effective user interface enhancing the knowledge-acquisition process.
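The agreement levels discussed throughout this section are commonly quantified by chance-corrected coefficients such as the kappa statistic [Carletta, 1996], kappa = (P(A) - P(E)) / (1 - P(E)), where P(A) is the observed agreement and P(E) the agreement expected by chance. A minimal computation for two annotators labelling the same items (the labels are invented):

    from collections import Counter

    def cohen_kappa(labels1, labels2):
        n = len(labels1)
        # Observed agreement P(A).
        observed = sum(a == b for a, b in zip(labels1, labels2)) / n
        # Chance agreement P(E) from the annotators' label distributions.
        c1, c2 = Counter(labels1), Counter(labels2)
        expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
        return (observed - expected) / (1 - expected)

    a = ["Person", "Person", "Organization", "Person", "Location"]
    b = ["Person", "Organization", "Organization", "Person", "Person"]
    print(round(cohen_kappa(a, b), 3))  # 0.286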
11. Conclusion and Future Work

This thesis began with the observation that Semantic Annotation is a prerequisite for the Semantic Web. The thesis shows how the problem of metadata creation can be tackled by means of a comprehensive annotation framework. The dissertation discusses the requirements for the design, application and evaluation of an ontology-based annotation framework, as well as its prototypical implementation. The concluding chapter is divided into four sections describing the contributions of this thesis in more detail, followed by the insights gained into Semantic Annotation, the open questions and topics for future research.

11.1. Contributions

The contributions made by this thesis fall into the following categories:

• Identification of the annotation problem and definition of the requirements. Given the problems with the syntax, semantics and pragmatics of annotation, we identified the requirements of: consistency, proper reference, avoidance of redundancy, relational metadata, maintenance, ease of use and efficiency.

• Design of a comprehensive and pioneering annotation framework that reduces the complexity of Semantic Annotation for the annotator. The framework employs a comprehensive set of modules including inference services, crawler, document management system, ontology guidance/fact browser, and document editors/viewers. Process issues pertaining to the annotation/authoring task are separated from content descriptions by a meta ontology. The framework has been prototypically implemented in the open source project OntoMat1, hosted by the DARPA DAML program. OntoMat is the reference implementation of the CREAM framework. It is Java-based and provides a plug-in interface for extensions for further applications. It has been used in several cases, e.g. the annotation of paper abstracts for the International Semantic Web Conferences (ISWC 2002, 2003, 2004) by each of the authors. It is also in use on classroom machines in an obligatory Semantic Web course2 for informatics students in Prague, in which some 250 people enroll every year.

1 http://projects.semwebcentral.org/projects/ontomat/
2 http://nb.vse.cz/~svatek/modz.htm

The annotation framework comprises methods especially for:

– Manual Annotation: The transformation of existing syntactic resources, viz. documents, into relational knowledge structures which represent the underlying information.
– Authoring of Documents: Authoring lets users create metadata — almost for free — while putting together the content of a page.
– Semi-automatic Annotation: Semi-automatic annotation based on information extraction.
– Deep Annotation: Considers Web pages which are generated from a database, by annotating the underlying database.

• Demonstration of the flexibility of the annotation framework by its application to linguistic annotation and service annotation. The thesis shows how to exploit the basic framework easily for linguistic annotation. Furthermore, it shows how to apply the deep annotation framework also to the annotation and composition of Web services.

• The thesis shows how to evaluate Semantic Annotation using several different measures applied within a gold standard and inter-annotator setting. It provides a case study for evaluating human annotation in comparison with semi-automatic annotation, thereby evaluating the annotation framework.

11.2. Insights into Semantic Annotation

In addition to the major contributions listed above, this research provides several additional insights into Semantic Annotation. This section lists the most salient perspectives.

• Comprehensive Annotation Format: Semantic metadata are, simply expressed, facts that are related to a domain ontology. Though this may appear trivial at first, it easily conflicts with several other requirements. We also need a meta ontology describing how the domain ontology should be used by the annotation framework. Furthermore, there is the requirement for remote storage of annotations, which leads to the need for a robust referencing scheme, viz. XPointer. Also, there is the need for the provision of metametadata, e.g. author, date, time and location of annotation. In addition, different requirements exist for different semantics of Semantic Annotation, as well as the need to express different aspects of the content in metadata, viz. a layering of the annotation (e.g. Structural Annotation, Lexical Annotation, Semantic Annotation). A small sketch of these ingredients follows after this list.
• Automatization: Automatization is vital to ease the knowledge acquisition bottleneck. To achieve this, the integration of knowledge extraction technologies into the annotation environment has been undertaken. This is used to semi-automatically identify entities in text that are instances of a particular class, as well as relations between the classes. As the evaluation showed, HCI implications are also important here, so that a semi-automated tool can be used effectively by Web users without expertise in natural language processing methods.

• User-centric design: Annotation is a potential knowledge acquisition bottleneck, as discussed above. To ease the constriction, annotation has to be carried out by people who are not specialist annotators. Facilitating the annotation task is especially important for the success of the Semantic Web. The annotation interfaces must, therefore, bridge the gap between formal descriptions of knowledge and Web users understanding their domains of interest. A good approach is therefore a semantic authoring environment, so that the environment in which users annotate documents is the same as the one in which they create, read and edit them.
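The ingredients listed under "Comprehensive Annotation Format" above can be summarized in a small sketch; the URIs, property names and the concrete XPointer are hypothetical and serve only to show the pieces side by side:

    # One annotation: a fact relating a text passage to the domain ontology,
    # a robust XPointer reference into the document, and metametadata about
    # the annotation act itself.
    annotation = {
        "subject":  "http://example.org/people#SiegfriedHandschuh",
        "type":     "http://example.org/onto#Researcher",
        # XPointer selecting the annotated passage within the document:
        "pointer":  ("http://example.org/page.html"
                     "#xpointer(string-range(//p[2],'Siegfried Handschuh'))"),
        # Metametadata:
        "meta": {"author": "annotator17", "date": "2004-11-02"},
    }

    for key, value in annotation.items():
        print(key, ":", value)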
11.3. Open Questions

In a topic like Semantic Annotation, which is novel and relatively unexplored, it is natural that it raises new questions while answering old ones. Hence, the general problem of metadata creation remains interesting. In this section, the open questions that are not answered are identified:

• Firstly, the question of scalability to more and larger dimensions, like "what happens if there are 100,000 people known in your annotation inference server?". Even for the evaluation we had to prune the ontology in order to make it feasible for the annotation task.

• Secondly, Semantic Annotation takes place within the Semantic Web. For the proper creation of relational metadata we need unique identifiers of persons, institutes or companies. While crawling of existing metadata helps to reduce this problem, it is not solved and possibly may never be.

• Thirdly, we are still in the early stages with respect to providing methodological guidelines for the purposes of Semantic Annotation.

• Fourthly, and probably most important for the Semantic Web: how to create incentives for annotation?

11.4. Future Research

A number of future research topics and challenges have already been listed at the end of each chapter, e.g.

• Improvement of semi-automatic Annotation: The process to generate the discourse representation is not automatic but relies on hand-crafted rules. In the ideal scenario, the information extraction system would be ontology-aware and would therefore learn the discourse structure of the given documents as well. To extend the information extraction system in this way was not possible for us, because we were able to use Amilcare only as a black-box component. We therefore envisage further research to specify and implement this type of ontology-aware information extraction component. In addition, to use information extraction one needs a rather large amount of previously annotated documents, which is only useful if one has to annotate a group of similar documents. As regards relational metadata, we have only covered the relationships within one document, but not relationships which point towards another document, viz. semantically enriched hyperlinks. The semi-automatic annotation of interrelated documents is therefore a topic for further research.

• Improvement of the Self-Annotating Web: The self-annotating Web is not possible with the kind of implementation that we provided. The reason for this is that we have issued an extremely large number of queries against the Google API — to an extent that would not scale towards thousands or even millions of Web pages in a reasonable time-frame. However, we envision an optimized indexing scheme and API that would reduce this number to acceptable load levels. Furthermore, in order to reduce the number of queries sent to the Google Web service API, a more intelligent strategy should be devised which takes into account the ontological hierarchy between concepts.

• Improvement of Deep Annotation: Granularity and automatic derivation of server-side Web page markup. Granularity: so far we have only considered atomic database fields. For instance, one may find a string "Proceedings of the Eleventh International World Wide Web Conference, WWW2002, Honolulu, Hawaii, USA, 7-11 May 2002." as the title of a book, whereas one might rather be interested in separating this field into title, location and date (a rough sketch of such a split follows after this list). Automatic derivation of server-side Web page markup: a content management system like Zope could provide the means for automatically deriving server-side Web page markup for deep annotation.

• Ontology Evolution and Annotation: There exists a tight interlinkage between evolving ontologies and the Semantic Annotation of documents. In a realistic Semantic Web scenario, incoming information that is to be annotated does not only require more annotating, but also continuous adaptation to new semantic terminology and relationships. Stojanovic et al. (cf. [Stojanovic et al., 2002b]) give first insights into this topic. However, further research must elaborate in greater detail how ontology re-visioning may influence Semantic Annotation and the appropriate strategies to cope with it.

• Multimedia Annotation: Multimedia annotation requires a different approach to content and semantics, e.g. one has to deal with the video or audio content given in multimedia standards. One challenge here is to combine multimedia standards (like MPEG-7, MPEG-21, SMIL) with recently established Semantic Web technologies (RDF, OWL) to describe explicit relational metadata with formal semantics. The semantic description needs to represent aspects of time and space. Another research challenge is the semi-automatic annotation of multimedia content, which needs a transformation from relatively easily extractable low-level audiovisual features (e.g. color, shapes, movements) into high-level semantic concepts (e.g. goalkeeper, score a goal) represented by a domain ontology (e.g. a football ontology). This research question is currently tackled by the 6th Framework EU IST project "aceMedia"3.

3 http://www.acemedia.org/
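As a rough sketch of the granularity problem named in the deep-annotation item above, the example string can be split into title, location and date with a pattern that is, of course, purely illustrative and tied to this one field:

    import re

    field = ("Proceedings of the Eleventh International World Wide Web "
             "Conference, WWW2002, Honolulu, Hawaii, USA, 7-11 May 2002.")

    match = re.match(r"(?P<title>.+?, WWW\d{4}), (?P<location>.+), "
                     r"(?P<date>[\d\-]+ \w+ \d{4})\.$", field)
    if match:
        print(match.groupdict())
    # {'title': 'Proceedings of the Eleventh International World Wide Web
    #  Conference, WWW2002', 'location': 'Honolulu, Hawaii, USA',
    #  'date': '7-11 May 2002'}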
11.5. Summary

Documents created by Semantic Annotation bring the advantages of semantic search and interoperability. These benefits, however, come at the cost of an increased authoring effort. In this thesis we have, therefore, presented a comprehensive framework which supports users in dealing with the documents, the ontologies and the annotations that link documents to ontologies.

Future research challenges include further improvements to automatic annotation components, such as relation extraction, and developing support systems for ontology evolution. There are also important human-computer interaction challenges inherent in building integrated systems of this complexity.

Part V. Appendix

A. Metadata Standards

Element     | Category              | Description
Coverage    | Content               | Scope of the content of the resource, e.g. location, time period or jurisdiction
Description | Content               | Content of the resource
Relation    | Content               | Reference to a related resource
Source      | Content               | Reference to a resource from which the present resource is derived
Subject     | Content               | Topic of the resource
Title       | Content               | Name given to the resource
Type        | Content               | Nature of the content of the resource
Contributor | Intellectual Property | Name of institutions and persons contributing to the resource
Creator     | Intellectual Property | Name of the creator of the resource
Publisher   | Intellectual Property | Name of the entity responsible for making the resource available
Rights      | Intellectual Property | Information about rights held in and over the resource
Date        | Instantiation         | Date associated with an event in the life cycle of the resource; ISO 8601 is recommended for the encoding
Format      | Instantiation         | Physical or digital manifestation of the resource; recommended practice is to follow http://www.isi.edu/in-notes/iana/assignments/media-types/media-types
Identifier  | Instantiation         | Unambiguous reference to the resource within a given context, e.g. a URI
Language    | Instantiation         | Language of the resource, according to RFC 1766 (obsoleted by RFC 3066)

Table A.1.: Dublin Core Elements
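Following [Kunze, 1999], such elements are typically embedded in the head of an HTML page; the following sketch prints the corresponding tags for a few illustrative values:

    # Illustrative Dublin Core values for this thesis.
    dc = {
        "DC.title":    "Creating Ontology-Based Metadata by Annotation",
        "DC.creator":  "Siegfried Handschuh",
        "DC.language": "en",
        "DC.date":     "2005-02-16",
    }

    print('<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">')
    for name, content in dc.items():
        print('<meta name="%s" content="%s">' % (name, content))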
B. Glossary

an alphabetical list of technical terms in some specialized field of knowledge; usually published as an appendix to a text on that field

• Annotation: a comment or instruction (usually added); "his notes were appended at the end of the article"; "he added a short notation to the address on the envelope". WordNet (r) 2.0

• HTML: A markup language that is a subset of SGML and is used to create hypertext and hypermedia documents on the World Wide Web incorporating text, graphics, sound, video, and hyperlinks. Merriam-Webster Online

• Markup: In computerised document preparation, a method of adding information to the text indicating the logical components of a document, or instructions for layout of the text on the page or other information which can be interpreted by some automatic system. The Free On-line Dictionary of Computing

• Metadata: Metadata is data about data. A good example is a library catalog card, which contains data about the nature and location of a book: it is data about the data in the book referred to by the card.

• Ontology:

– Artificial intelligence: (From philosophy) An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them. For AI systems, what "exists" is that which can be represented. When the knowledge about a domain is represented in a declarative language, the set of objects that can be represented is called the universe of discourse. We can describe the ontology of a program by defining a set of representational terms. Definitions associate the names of entities in the universe of discourse (e.g. classes, relations, functions or other objects) with human-readable text describing what the names mean, and formal axioms that constrain the interpretation and well-formed use of these terms. Formally, an ontology is the statement of a logical theory. A set of agents that share the same ontology will be able to communicate about a domain of discourse without necessarily operating on a globally shared theory. We say that an agent commits to an ontology if its observable actions are consistent with the definitions in the ontology. The idea of ontological commitment is based on the Knowledge-Level perspective. The Free On-line Dictionary of Computing

– Information science: The hierarchical structuring of knowledge about things by subcategorising them according to their essential (or at least relevant and/or cognitive) qualities. See subject index. This is an extension of the previous senses of "ontology" (above) which has become common in discussions about the difficulty of maintaining subject indices. The Free On-line Dictionary of Computing

• Relational Metadata: Metadata that not only describe single, isolated entities, but form relational knowledge structures, i.e. they instantiate classes of an ontology as well as the relationships between those instances.

• Semantic: In general, semantics (from the Greek semantikos, or "significant meaning," derived from "sema," sign) refers to the study of meaning, in some sense of that term. Semantics is often opposed to syntax, in which case the former pertains to what something means while the latter pertains to the formal structure/patterns in which something is expressed (e.g. written or spoken). From Wikipedia, the free encyclopedia

• Semantic Annotation: The term Semantic Annotation describes a process as well as the outcome of the process (cf. the term "drawing"). Hence it describes i) the process of addition of semantic information or metadata to the content, given an agreed ontology, and ii) the semantic metadata itself as a result of this process.

• Semantic Web:

– The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. [Berners-Lee et al., 2001]

– A term coined by Tim Berners-Lee which views the future Web as a web of data, like a global database. The infrastructure of the Semantic Web would allow machines as well as humans to make deductions and organize information. The architectural components include semantics (meaning of the elements), structure (organization of the elements), and syntax (communication). From http://www.w3.org/DesignIssues/Semantic.html

• Software Agent: In computer science, a software agent is a piece of autonomous or semi-autonomous, proactive and reactive computer software. Many individual communicative software agents may form a multi-agent system. From Wikipedia, the free encyclopedia

• URI: A Uniform Resource Identifier (URI) is an Internet protocol element consisting of a short string of characters that conform to a certain syntax.
The string indicates a name or address that can be used to refer to an abstract or physical resource. URIs are a superset of the more commonly known Uniform Resource Locators used for website addressing. A URI can be classified as a locator, a name, or both. URLs are the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location"). From Wikipedia, the free encyclopedia

• URL: A Uniform Resource Locator (URL, pronounced "earl"), or Web address, is a standardized address for some resource (such as a document or image) on the Internet. URLs are classified by the "scheme", which typically identifies the network protocol used to retrieve the resource over a computer network. From Wikipedia, the free encyclopedia

• World Wide Web: (the "Web" or "WWW" for short) is a hypertext system that operates over the Internet. Hypertext is browsed using a program called a Web browser, which retrieves pieces of information (called "documents" or "Web pages") from Web servers (or "Web sites") and displays them on your screen. You can then follow hyperlinks on each page to other documents or even send information back to the server to interact with it. The act of following hyperlinks is often called "surfing" the Web. From Wikipedia, the free encyclopedia

Bibliography

[Agarwal et al., 2003] S. Agarwal, S. Handschuh, and S. Staab. Surfing the Service Web. In Fensel et al. [2003], pages 211–226.

[Agirre et al., 2000] E. Agirre, O. Ansa, E. Hovy, and D. Martinez. Enriching Very Large Ontologies Using the WWW. In Proceedings of the First Workshop on Ontology Learning OL'2000, Berlin, Germany, August 25, 2000. Held in conjunction with the 14th European Conference on Artificial Intelligence ECAI'2000, Berlin, Germany.

[Alfonseca and Manandhar, 2002] E. Alfonseca and S. Manandhar. Extending a Lexical Ontology by a Combination of Distributional Semantics Signatures. In Gómez-Pérez and Benjamins [2002], pages 1–7.

[Ankolekar et al., 2002] A. Ankolekar, M. Burstein, J. R. Hobbs, O. Lassila, D. McDermott, D. Martin, S. A. McIlraith, S. Narayanan, M. Paolucci, and T. Payne. DAML-S: Web Service Description for the Semantic Web. In Hendler and Horrocks [2002].

[Antoniou and van Harmelen, 2004] G. Antoniou and F. van Harmelen. Web Ontology Language: OWL. In S. Staab and R. Studer, editors, Handbook on Ontologies, Handbooks in Information Systems, pages 67–92. Springer, 2004.

[Apache, 2004] Apache. Web Services Invocation Framework, 2004.

[Appelt et al., 1993] D. Appelt, J. Hobbs, J. Bear, D. Israel, M. Kameyama, and M. Tyson. FASTUS: A Finite State Processor for Information Extraction from Real World Text. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI), 1993.

[Asher and Lascarides, 1999] N. Asher and A. Lascarides. Bridging. Journal of Semantics, 15:83–113, 1999.

[Bechhofer and Goble, 2001] S. Bechhofer and C. Goble. Towards Annotation using DAML+OIL. In Proceedings of the Knowledge Markup and Semantic Annotation Workshop 2001 (at K-CAP 2001), pages 13–20, Victoria, BC, Canada, October 2001.
[Bechhofer et al., 2002] S. Bechhofer, L. Carr, C. Goble, S. Kampa, and T. Miles-Board. The Semantics of Semantic Annotation. In ODBASE: First International Conference on Ontologies, DataBases, and Applications of Semantics for Large Scale Information Systems, pages 1152–1167, Irvine, California, October 2002.

[Benjamins et al., 1998] V. R. Benjamins, E. Plaza, E. Motta, D. Fensel, R. Studer, B. Wielinga, G. Schreiber, and Z. Zdrahal. IBROW3 - An intelligent brokering service for knowledge-component reuse on the World Wide Web. In Proc. 11th Banff Knowledge Acquisition for Knowledge-Based System Workshop (KAW98), 1998.

[Benjamins et al., 1999] V. R. Benjamins, D. Fensel, and S. Decker. KA2: Building Ontologies for the Internet: A Midterm Report. International Journal of Human Computer Studies, 51(3):687–713, 1999.

[Bergamaschi et al., 2001] S. Bergamaschi, S. Castano, D. Beneventano, and M. Vincini. Semantic Integration of Heterogeneous Information Sources. In Special Issue on Intelligent Information Integration, Data & Knowledge Engineering, volume 36, pages 215–249. Elsevier Science B.V., 2001.

[Berners-Lee et al., 2001] T. Berners-Lee, J. Hendler, and O. Lassila. The Semantic Web. Scientific American, pages 30–37, May 2001.

[Blythe and Gil, 2004] J. Blythe and Y. Gil. Incremental Formalization of Document Annotations through Ontology-Based Paraphrasing. In Feldman et al. [2004], pages 17–22.

[Bray et al., 2004] T. Bray, J. Paoli, and C.M. Sperberg-McQueen. Extensible Markup Language (XML) 1.0. Technical report, W3C, 2004. http://www.w3.org/TR/REC-xml.

[Brickley and Guha, 2004] D. Brickley and R.V. Guha. RDF Vocabulary Description Language 1.0: RDF Schema. Technical report, W3C, Feb. 2004. W3C Working Draft. http://www.w3.org/TR/rdf-schema/.

[Broekstra et al., 2001] J. Broekstra, M. Klein, S. Decker, D. Fensel, F. van Harmelen, and I. Horrocks. Enabling Knowledge Representation on the Web by Extending RDF Schema. In Proceedings of WWW 2001, pages 467–478. ACM Press, 2001.

[Carletta, 1996] J. Carletta. Assessing Agreement on Classification Tasks: The Kappa Statistic. Computational Linguistics, 22(2):249–254, 1996.

[Charniak and Berland, 1999] E. Charniak and M. Berland. Finding Parts in Very Large Corpora. In Proceedings of the 37th Annual Meeting of the ACL, pages 57–64, 1999.

[Chen et al., 2003] Y.-F.R. Chen, L. Kovács, and S. Lawrence, editors. WWW-2003 — Proceedings of the 12th International Conference on the World Wide Web. ACM Press, 2003. Budapest, Hungary, May 2003.

[Cimiano and Handschuh, 2003] P. Cimiano and S. Handschuh. Ontology-based Linguistic Annotation. In Proceedings of the Workshop on Linguistic Annotation at the ACL, 2003.

[Cimiano et al., 2004] P. Cimiano, S. Handschuh, and S. Staab. Towards the Self-Annotating Web. In Feldman et al. [2004], pages 462–471.

[Cimiano, 2003] P. Cimiano. Ontology-driven discourse analysis in GenIE. In Proceedings of the 8th International Conference on Applications of Natural Language to Information Systems, pages 77–90, 2003.

[Ciravegna et al., 2003] F. Ciravegna, A. Dingli, D. Guthrie, and Y. Wilks. Integrating Information to Bootstrap Information Extraction from Web Sites. In IJCAI 2003 Workshop on Information Integration on the Web, Workshop in Conjunction with the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), Acapulco, Mexico, August 9-15, pages 9–14, 2003.
[Ciravegna, 2001a] F. Ciravegna. Adaptive Information Extraction from Text by Rule Induction and Generalisation. In Bernhard Nebel, editor, Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01), pages 1251–1256, San Francisco, CA, August 4-10, 2001. Morgan Kaufmann Publishers, Inc.

[Ciravegna, 2001b] F. Ciravegna. Challenges in Information Extraction from Text for Knowledge Management. IEEE Intelligent Systems and their Applications, 16(6):88–90, 2001.

[Ciravegna, 2001c] F. Ciravegna. (LP)2, an Adaptive Algorithm for Information Extraction from Web-Related Texts. In Proceedings of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining held in Conjunction with the 17th International Joint Conference on Artificial Intelligence (IJCAI), Seattle, USA, August 2001.

[Clark and DeRose, 1999] J. Clark and S. DeRose. XML Path Language (XPath), W3C Recommendation 16 November 1999, 1999. http://www.w3.org/TR/xpath.

[Clauß and Ebner, 1995] G. Clauß and H. Ebner. Grundlagen der Statistik für Psychologen, Pädagogen und Soziologen. Harri Deutsch Verlag, January 1995.

[Cohen, 1998] W. Cohen. The WHIRL Approach to Data Integration. IEEE Intelligent Systems, pages 1320–1324, 1998.

[Collier et al., 2004] N. Collier, A. Kawazoe, A. A. Kitamoto, T. Wattarujeekrit, T. Y. Mizuta, and A. Mullen. Integrating Deep and Shallow Semantic Structures in Open Ontology Forge. Special Interest Group on Semantic Web and Ontology. JSAI (Japanese Society for Artificial Intelligence), 2004.

[Collier, 2001] N. Collier. Machine Learning for Information Extraction from XML Marked-up Text on the Semantic Web. In Proceedings of the Semantic Web Workshop at the Tenth International Conference on the World Wide Web (WWW'10), pages 22–36, Hong Kong, May 2001.

[Constantopoulos et al., 2000] P. Constantopoulos, V. Christophides, and D. Plexousakis, editors. SemWeb 2000 — Proceedings of the First International Workshop on the Semantic Web, 2000. Workshop at ECDL-2000, Lisbon, Portugal, September 2000.

[Critchlow et al., 1998] T. Critchlow, M. Ganesh, and R. Musick. Automatic Generation of Warehouse Mediators Using an Ontology Engine. In Proceedings of the 5th International Workshop on Knowledge Representation meets Databases (KRDB'98), pages 8.1–8.8, 1998.

[Davies et al., 1998] S. Davies, M. Poesio, F. Bruneseaux, and L. Romary. Annotating Coreference in Dialogues: Proposal for a Scheme for MATE, 1998. http://www.cogsci.ed.ac.uk/~poesio/anno_manual.html.

[de Cea et al., 2002] G. Aguado de Cea, I. Álvarez de Mon, A. Gómez-Pérez, A. Pareja-Lora, and R. Plaza-Arteche. A Semantic Web Page Linguistic Annotation Model. In Semantic Web Meets Language Resources. Papers from the AAAI Workshop, 2002. http://www.cs.vassar.edu/~ide/events/ParejaLora.pdf.

[Decker et al., 1998] S. Decker, D. Brickley, J. Saarela, and J. Angele. A Query and Inference Service for RDF. In Proceedings of the W3C Query Language Workshop (QL-98), Boston, MA, December 3-4, 1998. http://www.w3.org/TandS/QL/QL98/.

[Decker et al., 1999] S. Decker, M. Erdmann, D. Fensel, and R. Studer. Ontobroker: Ontology Based Access to Distributed and Semi-Structured Information. In R. Meersman et al., editors, Database Semantics: Semantic Issues in Multimedia Systems, pages 351–369. Kluwer Academic Publisher, 1999.

[Decker, 2002] S. Decker. Semantic Web Methods for Knowledge Management. PhD thesis, University of Karlsruhe, 2002.
[Decker, 2004] S. Decker. Semantic Web Community Portal. Web page, 2004. http://www.semanticweb.org.

[Denoue and Vignollet, 2000] L. Denoue and L. Vignollet. An Annotation Tool for Web Browsers and its Applications to Information Retrieval. In Proceedings of RIAO 2000, Paris, April 2000. http://www.univ-savoie.fr/labos/syscom/Laurent.Denoue/riao2000.doc.

[DeRose et al., 2001] S. DeRose, R. Daniel Jr., E. Maler, J. Marsh, and N. Walsh. XML Pointer Language (XPointer). Technical report, W3C, 2001. Working Draft 16 August 2002.

[DeRose et al., 2002] S. DeRose, E. Maler, and R. Daniel Jr. XPointer xpointer() Scheme, 2002. http://www.w3.org/TR/xptr-xpointer/.

[DeRose et al., 2003] S. DeRose, R. Daniel Jr., E. Maler, and J. Marsh. XPointer xmlns() Scheme, 2003. http://www.w3.org/TR/xptr-xmlns/.

[DeRoure and Iyengar, 2002] D. DeRoure and A. Iyengar, editors. WWW-2002 — Proceedings of the 11th International Conference on the World Wide Web. ACM Press, 2002. Honolulu, Hawaii, May 2002.

[Dill et al., 2003] S. Dill, N. Eiron, D. Gibson, D. Gruhl, R. Guha, A. Jhingran, T. Kanungo, S. Rajagopalan, A. Tomkins, J.A. Tomlin, and J.Y. Zien. SemTag and Seeker: Bootstrapping the Semantic Web via Automated Semantic Annotation. In Proceedings of the 12th International Conference on World Wide Web, pages 178–186. ACM Press, 2003.

[Doan et al., 2002] A. Doan, J. Madhavan, P. Domingos, and A. Halevy. Learning to map between ontologies on the semantic web. In DeRoure and Iyengar [2002], pages 662–673. Honolulu, Hawaii, May 2002.

[Douthat, 1998] A. Douthat. The Message Understanding Conference Scoring Software User's Manual. In 7th Message Understanding Conference Proceedings, MUC-7, 1998. http://www.itl.nist.gov/iaui/894.02/related_projects/muc/.

[dub, 2001] Dublin Core Metadata Initiative, April 2001. http://purl.oclc.org/dc/.

[Dublin Core Metadata Template, 2001] Dublin Core Metadata Template, 2001. http://www.ub2.lu.se/metadata/DC_creator.html.

[Ellis and Nutt, 1993] C.A. Ellis and G.J. Nutt. Modelling and Enactment of Workflow Systems. In Application and Theory of Petri Nets, LNCS 691, 1993.

[Ellis et al., 1994] D. Ellis, J. Furner-Hines, and P. Willett. On the Measurement of Inter-Linker Consistency and Retrieval Effectiveness in Hypertext Databases. In Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 51–60, Berlin, 1994. Springer.

[Erdmann et al., 2000] M. Erdmann, A. Maedche, H.-P. Schnurr, and S. Staab. From Manual to Semi-Automatic Semantic Annotation: About Ontology-Based Text Annotation Tools. In P. Buitelaar & K. Hasida (eds.), Proceedings of the COLING 2000 Workshop on Semantic Annotation and Intelligent Content, Luxembourg, August 2000.

[Eriksson et al., 1999] H. Eriksson, R. Fergerson, Y. Shahar, and M. Musen. Automatic Generation of Ontology Editors. In Proceedings of the 12th International Workshop on Knowledge Acquisition, Modelling and Management (KAW'99), Banff, Canada, October 1999.

[Euzenat, 2002] J. Euzenat. Eight Questions about Semantic Web Annotations. IEEE Intelligent Systems, 17(2):55–62, Mar/Apr 2002.

[F-Logic Tutorial, 2004] How to Write F-Logic Programs. A Tutorial for the Language F-Logic (covers OntoBroker Version 3.7.x–3.8.x), 2004. http://www.ontoprise.de/documents/tutorial_flogic.pdf.
[Farquhar et al., 1996] A. Farquhar, R. Fikes, and J. Rice. The Ontolingua Server: A Tool for Collaborative Ontology Construction. In Proceedings of the 10th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop (KAW'96), Banff, Canada, November 1996.

[Feldman et al., 2004] S.I. Feldman, M. Uretsky, M. Najork, and C.E. Wills, editors. Proceedings of the 13th International Conference on World Wide Web, WWW 2004, New York, NY, USA, May 17-20, 2004. ACM, 2004.

[Fensel et al., 1999] D. Fensel, J. Angele, S. Decker, M. Erdmann, H.-P. Schnurr, S. Staab, R. Studer, and A. Witt. On2broker: Semantic-Based Access to Information Sources at the WWW. In Proceedings of the World Conference on the WWW and Internet (WebNet 99), Honolulu, Hawaii, USA, pages 366–371, 1999.

[Fensel et al., 2002] D. Fensel, H. Lieberman, J. Hendler, and W. Wahlster, editors. Spinning the Semantic Web. MIT Press, Cambridge, MA, USA, 2002.

[Fensel et al., 2003] D. Fensel, K. P. Sycara, and J. Mylopoulos, editors. ISWC-2003 — Proceedings of the Second International Semantic Web Conference, LNCS 2870. Springer, 2003.

[Fensel, 2001] D. Fensel. Ontologies: Silver Bullet for Knowledge Management and Electronic Commerce. Springer, Berlin - Heidelberg - New York, 2001.

[Fillies and Weichhardt, 2002] C. Fillies and F. Weichhardt. Graphische Entwicklung und Nutzung von Ontologien mit SemTalk in MS Office. In R. Tolksdorf and R. Eckstein, editors, XML Technologien für das Semantic Web — XSW 2002, GI-Edition — Lecture Notes in Informatics (LNI), P-14. Bonner Köllen Verlag, 2002. Workshop, 24.-25. Juni 2002, Berlin.

[Flake et al., 2002] G.W. Flake, S. Lawrence, C.L. Giles, and F.M. Coetzee. Self-Organization and Identification of Web Communities. IEEE Computer, 35(3):66–70, March 2002.

[Fleischman and Hovy, 2002] M. Fleischman and E. Hovy. Fine Grained Classification of Named Entities. In Proceedings of the Conference on Computational Linguistics, Taipei, Taiwan, August 2002.

[Fligelstone, 1992] S. Fligelstone. Developing a Scheme for Annotating Text to Show Anaphoric Relations. In G. Leitner, editor, New Directions in Corpus Linguistics, pages 153–170. Mouton de Gruyter, 1992.

[Frank et al., 2002a] M. Frank, N. Fridman-Noy, and S. Staab, editors. Proceedings of the International Workshop on the Semantic Web, CEUR Workshop Proceedings Vol. 55. http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-55/, 2002. Workshop at WWW-2002, Honolulu, Hawaii, USA, May 2002.

[Frank et al., 2002b] M. Frank, P. Szekely, R. Neches, B. Yan, and J. Lopez. WebScripter: World-Wide Grass-Roots Ontology Translation via Implicit End-User Alignment. In M. Frank, N. Noy, and S. Staab, editors, Proceedings of the Semantic Web Workshop 2002 (at WWW-2002), CEUR Proceedings, Volume 55, http://www.ceur-ws.org, pages 22–28, 2002.

[Genesereth and Fikes, 1992] M.R. Genesereth and R.E. Fikes. Knowledge Interchange Format Version 3.0 Reference Manual. Report Logic 92-1, Stanford Logic Group, June 1992. http://logic.stanford.edu/sharing/papers/kif.ps.

[Gil and Ratnakar, 2002] Y. Gil and V. Ratnakar. Trellis: An Interactive Tool for Capturing Information Analysis and Decision Making. In Gómez-Pérez and Benjamins [2002], pages 37–42.

[Glover et al., 2002] E.J. Glover, K. Tsioutsiouliklis, S. Lawrence, D.M. Pennock, and G.W. Flake. Using Web Structure for Classifying and Describing Web Pages. In Proceedings of the Eleventh International Conference on World Wide Web, pages 562–569. ACM Press, 2002.
[Goble et al., 2001] C. Goble, S. Bechhofer, L. Carr, D. De Roure, and W. Hall. Conceptual Open Hypermedia = The Semantic Web? In S. Staab, S. Decker, D. Fensel, and A. Sheth, editors, The Second International Workshop on the Semantic Web, CEUR Proceedings, Volume 40, http://www.ceur-ws.org, pages 44–50, Hong Kong, May 2001.

[Gómez-Pérez and Benjamins, 2002] A. Gómez-Pérez and V. R. Benjamins, editors. Knowledge Engineering and Knowledge Management. Ontologies and the Semantic Web, 13th International Conference, EKAW 2002, Siguenza, Spain, October 1-4, 2002, Proceedings, volume 2473 of Lecture Notes in Computer Science. Springer, 2002.

[Googlism, 2003] Googlism, 2003. http://www.googlism.com.

[Grefenstette, 1999] G. Grefenstette. The WWW as a Resource for Example-Based MT Tasks. In Proceedings of ASLIB'99 Translating and the Computer 21, 1999.

[Grosso et al., 2002] P. Grosso, E. Maler, J. Marsh, and N. Walsh. XPointer Framework, W3C Recommendation 25 March 2003, 2002. http://www.w3.org/TR/2003/REC-xptr-framework-20030325/.

[Grosso et al., 2003] P. Grosso, E. Maler, J. Marsh, and N. Walsh. XPointer element() Scheme, 2003. http://www.w3.org/TR/xptr-element/.

[Grosz and Sidner, 1986] B.J. Grosz and C.L. Sidner. Attention, Intentions, and the Structure of Discourse. Computational Linguistics, 12(3):175–204, 1986.

[Guarino, 1998] N. Guarino. Formal Ontology and Information Systems. 1998.

[Hahn and Schnattinger, 1998] U. Hahn and K. Schnattinger. Towards Text Knowledge Engineering. In AAAI'98/IAAI'98 — Proceedings of the 15th National Conference on Artificial Intelligence and the 10th Conference on Innovative Applications of Artificial Intelligence, pages 524–531, 1998.

[Handschuh and Staab, 2002] S. Handschuh and S. Staab. Authoring and Annotation of Web Pages in CREAM. In DeRoure and Iyengar [2002], pages 462–473. Honolulu, Hawaii, May 2002.

[Handschuh and Staab, 2003] S. Handschuh and S. Staab, editors. Annotation for the Semantic Web. IOS Press, 2003.

[Handschuh et al., 2001] S. Handschuh, S. Staab, and A. Maedche. CREAM — Creating Relational Metadata with a Component-Based, Ontology-Driven Annotation Framework. In Proceedings of K-CAP 2001, pages 76–83. ACM Press, 2001.

[Handschuh et al., 2002] S. Handschuh, S. Staab, and F. Ciravegna. S-CREAM — Semi-Automatic CREAtion of Metadata. In Gómez-Pérez and Benjamins [2002], pages 358–372.

[Handschuh et al., 2003] S. Handschuh, S. Staab, and R. Volz. On Deep Annotation. In Chen et al. [2003], pages 431–438. Budapest, Hungary, May 2003.

[Hearst, 1992] M.A. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora. In Proceedings of the 14th International Conference on Computational Linguistics, 1992.

[Heflin and Hendler, 2000a] J. Heflin and J. Hendler. Dynamic Ontologies on the Web. In AAAI-2000 — Proceedings of the National Conference on Artificial Intelligence, Austin, TX, USA, August 2000.

[Heflin and Hendler, 2000b] J. Heflin and J. Hendler. Searching the Web with SHOE. In Artificial Intelligence for Web Search. Papers from the AAAI Workshop, WS-00-01, pages 35–40. AAAI Press, 2000.
[Hendler and Horrocks, 2002] J. Hendler and I. Horrocks, editors. ISWC-2002 — Proceedings of the First International Semantic Web Conference, LNCS 2342. Springer, 2002.

[Hillmann, 2001] D. Hillmann. Using Dublin Core, 2001. http://dublincore.org/documents/2001/04/12/usageguide/.

[Hirschman and Chinchor, 1997] L. Hirschman and N. Chinchor. MUC-7 Coreference Task Definition. In Proceedings of the 7th Message Understanding Conference (MUC-7), 1997.

[Horrocks, 1998] I. Horrocks. Using an Expressive Description Logic: FaCT or Fiction? In Proceedings of the Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR'98), Trento, Italy, June 2-5, 1998, pages 636–649. Morgan Kaufmann, 1998.

[Ide, 2002] N. Ide. Linguistic Annotation Framework. Technical Report ISO TC37/SC4/WG1 N11, ISO TC37 SC4 Language Resource Management, August 2002.

[Jasper and Uschold, 1999] R. Jasper and M. Uschold. A Framework for Understanding and Classifying Ontology Applications. Banff, Canada, 1999. http://sern.ucalgary.ca/KSI/KAW/KAW99/papers.html.

[Jin et al., 2001] Y. Jin, S. Decker, and G. Wiederhold. OntoWebber: Model-Driven Ontology-Based Web Site Management. In Semantic Web Working Symposium (SWWS), Stanford, California, USA, August 2001.

[Kahan et al., 2001] J. Kahan, M. Koivunen, E. Prud'Hommeaux, and R. Swick. Annotea: An Open RDF Infrastructure for Shared Web Annotations. In Proceedings of the Tenth International World Wide Web Conference, WWW 10, Hong Kong, China, May 1-5, 2001, pages 623–632. ACM Press, 2001.

[Kashyap and Sheth, 1996] V. Kashyap and A. Sheth. Semantic Heterogeneity in Global Information Systems: The Role of Metadata, Context and Ontologies. In M. Papazoglou and G. Schlageter, editors, Cooperative Information Systems: Current Trends and Directions. Academic Press, 1996.

[Keller et al., 2002] F. Keller, M. Lapata, and O. Ourioupina. Using the Web to Overcome Data Sparseness. In Proceedings of EMNLP-02, pages 230–237, 2002.

[Kifer et al., 1995] M. Kifer, G. Lausen, and J. Wu. Logical foundations of object-oriented and frame-based languages. Journal of the ACM, 42(4):741–843, 1995.

[Klarity, 2001] Klarity — Automatic Generation of Metadata Based on Concepts within the Document, 2001. Klarity White Paper. http://www.klarity.com.au.

[Kogut and Holmes, 2001] P. Kogut and W. Holmes. AeroDAML: Applying Information Extraction to Generate DAML Annotations from Web Pages. 2001.

[Krahmer and Piwek, 2000] E. Krahmer and P. Piwek. Varieties of Anaphora: Introduction. In ESSLLI 2000 Reader. 2000.

[Kunze, 1999] J. Kunze. Encoding Dublin Core Metadata in HTML, 1999. http://www.ietf.org/rfc/rfc2731.txt.

[Kushmerick, 1997] N. Kushmerick. Wrapper induction for information extraction. In Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI), 1997.

[Kushmerick, 2000] N. Kushmerick. Wrapper Induction: Efficiency and Expressiveness. Artificial Intelligence, 118(1-2):15–68, 2000.

[Lascarides and Asher, 1991] A. Lascarides and N. Asher. Discourse Relations and Defeasible Knowledge. In Meeting of the Association for Computational Linguistics, pages 55–62, 1991.

[Leech, 1997] G. Leech. Introducing Corpus Annotation. In R. Garside, G. Leech, and A.M. McEnery, editors, Corpus Annotation: Linguistic Information from Computer Text Corpora, 1997.
[Lehnert et al., 1994] W.G. Lehnert, C. Cardie, D. Fisher, J. McCarthy, E. Riloff, and S. Soderland. Evaluating an Information Extraction System. Journal of Integrated Computer-Aided Engineering, 1(6), 1994.

[Lei et al., 2002] Y. Lei, E. Motta, and J. Domingue. An Ontology-Driven Approach to Web Site Generation and Maintenance. In Gómez-Pérez and Benjamins [2002].

[Lenat and Guha, 1990] D.B. Lenat and R.V. Guha. Building Large Knowledge-Based Systems. Representation and Inference in the CYC Project. Addison-Wesley, Reading, Massachusetts, 1990.

[Leonard, 1977] L. Leonard. Inter-Indexer Consistency Studies, 1954-1975: A Review of the Literature and Summary of the Study Results. Graduate School of Library Science, University of Illinois. Occasional Papers No. 131, 1977.

[Levenshtein, 1966] V.I. Levenshtein. Binary Codes Capable of Correcting Deletions, Insertions, and Reversals. Cybernetics and Control Theory, 10(8):707–710, 1966.

[Lowe, 2001] D. Lowe et al. BizTalk(TM) Server: The Complete Reference. November 2001.

[Luke et al., 1997] S. Luke, L. Spector, D. Rager, and J. Hendler. Ontology-Based Web Agents. In Proceedings of the First International Conference on Autonomous Agents, Marina del Rey, CA, USA, February 5-8, 1997, pages 59–66, 1997.

[MacGregor, 1991] R. MacGregor. Inside the LOOM description classifier. SIGART Bulletin, 2(3):88–92, 1991.

[Madhavan et al., 2001] J. Madhavan, P. A. Bernstein, and E. Rahm. Generic Schema Matching with Cupid. In Proceedings of the 27th International Conference on Very Large Databases, pages 49–58, 2001.

[Maedche and Staab, 2001] A. Maedche and S. Staab. Ontology Learning for the Semantic Web. IEEE Intelligent Systems, 16(2):72–79, March/April 2001.

[Maedche and Staab, 2003] A. Maedche and S. Staab. Services on the Move — Towards P2P-Enabled Semantic Web Services. In Proceedings of the 10th International Conference on Information Technology and Travel & Tourism, ENTER 2003, Helsinki, Finland, 29th-31st January 2003. Springer, 2003.

[Maedche and Zacharias, 2002] A. Maedche and V. Zacharias. Clustering Ontology-based Metadata in the Semantic Web. In Proceedings of the Joint Conferences 13th European Conference on Machine Learning (ECML'02) and 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'02). Springer, 2002.

[Maedche et al., 2002] A. Maedche, B. Motik, N. Silva, and R. Volz. MAFRA - A Mapping Framework for Distributed Ontologies. In Proceedings of EKAW 2002, LNCS 2473, pages 235–250. Springer, 2002.

[Manola and Miller, 2004] F. Manola and E. Miller. RDF Primer, Feb. 2004. http://www.w3.org/TR/rdf-primer/.

[Marcus et al., 1993] M. Marcus, B. Santorini, and M.A. Marcinkiewicz. Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330, 1993.

[Markert et al., 2003] K. Markert, N. Modjeska, and M. Nissim. Using the Web for Nominal Anaphora Resolution. In EACL Workshop on the Computational Treatment of Anaphora, 2003.

[Martin and Eklund, 1999] P. Martin and P. Eklund. Embedding Knowledge in Web Documents. In Proceedings of the 8th Int. World Wide Web Conf. (WWW'8), Toronto, May 1999, pages 1403–1419. Elsevier Science B.V., 1999.
[McGuinness et al., 2000] D. McGuinness, R. Fikes, J. Rice, and S. Wilder. The Chimaera Ontology Environment. In Proc. of AAAI-2000, pages 1123–1124, 2000.

[McIlraith and Son, 2002] S. McIlraith and T. Son. Adapting Golog for composition of semantic Web services. In Proc. of the 8th International Conference on Principles of Knowledge Representation and Reasoning, 2002.

[Mickalski et al., 1986] R.S. Mickalski, I. Mozetic, J. Hong, and H. Lavrack. The Multi Purpose Incremental Learning System AQ15 and its Testing Application to three Medical Domains. In Proceedings of the 5th National Conference on Artificial Intelligence, Philadelphia, USA, 1986.

[Mitra et al., 2000] P. Mitra, G. Wiederhold, and M. Kersten. A Graph-Oriented Model for Articulation of Ontology Interdependencies. In Proceedings of the Conference on Extending Database Technology (EDBT 2000), Konstanz, Germany, 2000.

[MUC7, 1998] MUC-7 — Proceedings of the 7th Message Understanding Conference, 1998. http://www.muc.saic.com/.

[Müller and Strube, 2001] C. Müller and M. Strube. Annotating Anaphoric and Bridging Relations with MMAX. In Proceedings of the 2nd SIGdial Workshop on Discourse and Dialogue, pages 90–95, 2001.

[Narayanan and McIlraith, 2002] S. Narayanan and S. McIlraith. Simulation, Verification and Automated Composition of Web Services. In DeRoure and Iyengar [2002]. Honolulu, Hawaii, May 2002.

[Nejdl et al., 2001] W. Nejdl, B. Wolf, S. Staab, and J. Tane. EDUTELLA: Searching and Annotating Resources within an RDF-based P2P Network. Technical report, Learning Lab Lower Saxony / Institute AIFB, 2001.

[Nejdl et al., 2002] W. Nejdl, B. Wolf, C. Qu, S. Decker, M. Sintek, A. Naeve, M. Nilsson, M. Palmer, and T. Risch. EDUTELLA: A P2P Networking Infrastructure Based on RDF. In DeRoure and Iyengar [2002], pages 604–615. Honolulu, Hawaii, May 2002.

[Noy and McGuinness, 2001] N.F. Noy and D.L. McGuinness. Ontology Development 101: A Guide to Creating Your First Ontology. Technical Report SMI-2001-0880, Stanford Medical Informatics, 2001.

[Noy and Musen, 2000] N.F. Noy and M.A. Musen. PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment. In Proc. of AAAI-2000, pages 450–455, 2000.

[Noy et al., 2000] N.F. Noy, W.E. Grosso, and M.A. Musen. Knowledge-Acquisition Interfaces for Domain Experts: An Empirical Evaluation of Protégé-2000. In Proceedings of the 12th International Conference on Software Engineering and Knowledge Engineering, Chicago, USA, July 5-7, 2000.

[OWL Reference, 2004] OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref, 2004.

[Paolucci et al., 2002] M. Paolucci, T. Kawamura, T. Payne, and K. Sycara. Semantic Matching of Web Services Capabilities. In First Int. Semantic Web Conf., 2002.

[Papakonstantinou and Vassalos, 2002] Y. Papakonstantinou and V. Vassalos. Architecture and Implementation of an XQuery-based Information Integration Platform. IEEE Data Engineering Bulletin, 25(1):18–26, 2002.

[Park et al., 1997] J. Y. Park, J. H. Gennari, and M. A. Musen. Mappings for Reuse in Knowledge-based Systems. Technical Report SMI-97-0697, Stanford University, 1997.

[Passoneau, 1996] R. Passoneau. DRAMA: Instructions for Applying Reference Annotation for Multiple Applications, 1996.

[Patil et al., 2004] A. Patil, S. Oundhakar, A. Sheth, and K. Verma. METEOR-S Web Services Annotation Framework. In Feldman et al. [2004].
[Phelps and Wilensky, 2000] T. Phelps and R. Wilensky. Robust Intra-Document Locations. Proceedings of WWW9 / Computer Networks, 33(1-6):105–118, 2000.

[Poesio and Reyle, 2001] M. Poesio and U. Reyle. Underspecification in Anaphoric Reference. In Proceedings of the IWCS-4, 2001.

[Poesio and Vieira, 1998] M. Poesio and R. Vieira. A Corpus-Based Investigation of Definite Description Use. Computational Linguistics, 24(2):183–216, 1998.

[Ponnekanti and Fox, 2002] S. R. Ponnekanti and A. Fox. SWORD: A Developer Toolkit for Building Composite Web Services. In DeRoure and Iyengar [2002]. Honolulu, Hawaii, May 2002.

[Popov et al., 2003] B. Popov, A. Kiryakov, D. Ognyanoff, D. Manov, A. Kirilov, and M. Goranov. Towards Semantic Web Information Extraction. In Fensel et al. [2003].

[Rahm and Bernstein, 2001] E. Rahm and P. Bernstein. A survey of approaches to automatic schema matching. VLDB Journal, 10(4):334–350, 2001.

[Resnik and Smith, 2003] P. Resnik and N. Smith. The Web as a Parallel Corpus. Computational Linguistics, 29(3), 2003.

[Resnik, 1997] P. Resnik. Selectional Preference and Sense Disambiguation. In Proceedings of the ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How?, 1997.

[Rinaldi et al., 2003] F. Rinaldi, J. Dowdall, M. Hess, J. Ellman, G. P. Zarri, A. Persidis, L. Bernard, and H. Karanikas. Multilayer annotations in Parmenides. In Proceedings of the Knowledge Markup and Semantic Annotation Workshop, Sanibel, Florida, USA, October 26, 2003, pages 33–40, 2003.

[Sahuguet and Azavant, 2001] A. Sahuguet and F. Azavant. Building intelligent Web applications using lightweight wrappers. Data and Knowledge Engineering, 3(36):283–316, 2001.

[Schmid, 1994] H. Schmid. Probabilistic Part-of-Speech Tagging Using Decision Trees. In Proceedings of the International Conference on New Methods in Language Processing, 1994.

[Siegel, 2001] S. Siegel. Nichtparametrische statistische Methoden (Nonparametric Statistical Methods). Klotz, Eschborn, 2001.

[Sowa, 1984] J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.

[Staab and Maedche, 2001] S. Staab and A. Maedche. Knowledge Portals — Ontologies at Work. AI Magazine, 22(2):63–75, Summer 2001.

[Staab et al., 2000a] S. Staab, J. Angele, S. Decker, M. Erdmann, A. Hotho, A. Maedche, H.-P. Schnurr, R. Studer, and Y. Sure. Semantic Community Web Portals. Proceedings of WWW9 / Computer Networks, 33(1-6):473–491, 2000.

[Staab et al., 2000b] S. Staab, M. Erdmann, A. Maedche, and S. Decker. An Extensible Approach for Modeling Ontologies in RDF(S). In Proceedings of the ECDL-2000 Workshop "Semantic Web: Models, Architectures and Management", Lisbon, September 21, 2000.

[Staab et al., 2001a] S. Staab, S. Decker, D. Fensel, and A. Sheth, editors. SemWeb 2001 — Proceedings of the Second International Workshop on the Semantic Web, CEUR Workshop Proceedings Vol. 40. http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-40/, 2001. Workshop at WWW-2001, Hong Kong, China, May 1, 2001.

[Staab et al., 2001b] S. Staab, A. Maedche, and S. Handschuh. Creating Metadata for the Semantic Web: An Annotation Framework and the Human Factor. Technical Report 412, Institute AIFB, University of Karlsruhe, 2001.

[Staab et al., 2001c] S. Staab, H.-P. Schnurr, R. Studer, and Y. Sure. Knowledge Processes and Ontologies. IEEE Intelligent Systems, 16(1), 2001.
[Stojanovic et al., 2001] N. Stojanovic, A. Maedche, S. Staab, R. Studer, and Y. Sure. SEAL: A Framework for Developing SEmantic PortALs. In Proceedings of K-CAP 2001, pages 155–162. ACM Press, 2001.

[Stojanovic et al., 2002a] L. Stojanovic, A. Maedche, B. Motik, and N. Stojanovic. User-Driven Ontology Evolution Management. In Gómez-Pérez and Benjamins [2002], pages 285–300.

[Stojanovic et al., 2002b] L. Stojanovic, N. Stojanovic, and S. Handschuh. Evolution of Metadata in Ontology-based Knowledge Management Systems. In Experience Management 2002, German Workshop on Experience Management, Berlin, March 7-8, 2002.

[Stojanovic et al., 2002c] L. Stojanovic, N. Stojanovic, and R. Volz. Migrating Data-Intensive Web Sites into the Semantic Web. In Proceedings of the ACM Symposium on Applied Computing SAC-02, Madrid, 2002, pages 1100–1107. ACM Press, 2002.

[Strube and Hahn, 1999] M. Strube and U. Hahn. Functional Centering — Grounding Referential Coherence in Information Structure. Computational Linguistics, 25(3):309–344, 1999.

[Studer et al., 2002] R. Studer, Y. Sure, and R. Volz. Managing User Focused Access to Distributed Knowledge. Journal of Universal Computer Science (J.UCS), 8(6):662–672, 2002.

[Sure et al., 2002a] Y. Sure, J. Angele, and S. Staab. Guiding Ontology Development by Methodology and Inferencing. In K. Aberer and L. Liu, editors, ODBASE-2002 — Ontologies, Databases and Applications of Semantics, Irvine, CA, USA, Oct. 29-31, 2002, LNCS, pages 1205–1222. Springer, 2002.

[Sure et al., 2002b] Y. Sure, M. Erdmann, J. Angele, S. Staab, R. Studer, and D. Wenke. OntoEdit: Collaborative Ontology Development for the Semantic Web. In Hendler and Horrocks [2002], pages 221–235.

[Sure, 2003] Y. Sure. Methodology, Tools and Case Studies for Ontology based Knowledge Management. PhD thesis, University of Karlsruhe, 2003.

[Sycara et al., 1999] K. P. Sycara, M. Klusch, S. Widoff, and J. Lu. Dynamic Service Matchmaking Among Agents in Open Information Environments. SIGMOD Record, 28(1):47–53, 1999.

[Tallis et al., 2001] M. Tallis, N. Goldman, and R. Balzer. The Briefing Associate: A Role for COTS Applications in the Semantic Web. In Semantic Web Working Symposium (SWWS), Stanford, California, USA, August 2001.

[van Deemter and Kibble, 2000] K. van Deemter and R. Kibble. On Coreferring: Coreference in MUC and Related Annotation Schemes. Computational Linguistics, 26(4), 2000.

[van der Aalst, 1999] W.M.P. van der Aalst. Woflan: A Petri-net-based workflow analyzer. Systems Analysis - Modelling - Simulation, 35(3):345–357, 1999.

[van Heijst, 1995] G. van Heijst. The Role of Ontologies in Knowledge Engineering. PhD thesis, Universiteit van Amsterdam, 1995.

[Vargas-Vera et al., 2001] M. Vargas-Vera, E. Motta, J. Domingue, S. Buckingham Shum, and M. Lanzoni. Knowledge Extraction by Using an Ontology-Based Annotation Tool. In Proceedings of the Knowledge Markup and Semantic Annotation Workshop 2001 (at K-CAP 2001), pages 5–12, Victoria, BC, Canada, October 2001.

[Vargas-Vera et al., 2002] M. Vargas-Vera, E. Motta, J. Domingue, M. Lanzoni, A. Stutt, and F. Ciravegna. MnM: Ontology Driven Semi-Automatic and Automatic Support for Semantic Markup. In Gómez-Pérez and Benjamins [2002], pages 379–391.
[W3C, 2003] W3C. Web Services Description Language (WSDL) Version 1.2, March 2003.

[Wiederhold, 1993] G. Wiederhold. Intelligent integration of information. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 434–437, 1993.

[Yee, 1998] K.-P. Yee. CritLink: Better Hyperlinks for the WWW, 1998. http://crit.org/~ping/ht98.html.
href="/doc/3441225/semantic-web-meets-internet-of-things-iot-and-web-of-things-wot">Semantic Web Meets Internet of Things (Iot) and Web of Things (Wot)</a></li> <li><a href="/doc/3560834/semantic-web-applications">Semantic Web Applications</a></li> <li><a href="/doc/3561682/a-comparison-of-knowledge-extraction-tools-for-the-semantic-web">A Comparison of Knowledge Extraction Tools for the Semantic Web</a></li> <li><a href="/doc/3682462/the-callimachus-project-rdfa-as-a-web-template-language">The Callimachus Project: Rdfa As a Web Template Language</a></li> <li><a href="/doc/4063530/layering-the-semantic-web-problems-and-directions-%C2%BE-peter-f">Layering the Semantic Web: Problems and Directions ¾ Peter F</a></li> <li><a href="/doc/4144838/information-extraction-meets-the-semantic-web-a-survey">Information Extraction Meets the Semantic Web: a Survey</a></li> <li><a href="/doc/4413916/where-are-the-semantics-in-the-semantic-web">Where Are the Semantics in the Semantic Web?</a></li> <li><a href="/doc/4429726/dbpedia-a-large-scale-multilingual-knowledge-base-extracted-from-wikipedia">Dbpedia - a Large-Scale, Multilingual Knowledge Base Extracted from Wikipedia</a></li> <li><a href="/doc/4439240/machine-learning-in-the-internet-of-things-a-semantic-enhanced-approach">Machine Learning in the Internet of Things: a Semantic-Enhanced Approach</a></li> <li><a href="/doc/4520933/a-semantic-web-knowledge-base-system-that-supports-large-scale-data-integration">A Semantic Web Knowledge Base System That Supports Large Scale Data Integration</a></li> <li><a href="/doc/4553888/semantic-web-and-the-libraries-an-overview">Semantic Web and the Libraries: an Overview</a></li> <li><a href="/doc/4768880/framework-for-the-semantic-web-an-rdf-tutorial">Framework for the Semantic Web: an RDF Tutorial</a></li> <li><a href="/doc/5257677/knowledge-representation-formalism-for-building-semantic-web-ontologies">Knowledge Representation Formalism for Building Semantic Web Ontologies</a></li> <li><a href="/doc/5309480/dbpedia-a-nucleus-for-a-web-of-open-data">Dbpedia: a Nucleus for a Web of Open Data</a></li> <li><a href="/doc/5535190/the-xml-and-semantic-web-worlds-technologies-interoperability-and-integration">The XML and Semantic Web Worlds: Technologies, Interoperability and Integration</a></li> <li><a href="/doc/5746969/anatomy-of-a-semantic-web-enabled-knowledge-based-recommender-system">Anatomy of a Semantic Web-Enabled Knowledge-Based Recommender System</a></li> <li><a href="/doc/5834369/semantic-web-technologies-for-explainable-machine-learning-models-a-literature-review">Semantic Web Technologies for Explainable Machine Learning Models: a Literature Review</a></li> <li><a href="/doc/5840684/semantic-web">Semantic Web</a></li> <li><a href="/doc/5960049/creating-ontology-based-metadata-by-annotation-for-the-semantic-web">Creating Ontology-Based Metadata by Annotation for the Semantic Web</a></li> <li><a href="/doc/6189289/semantic-web-learning-from-machine-learning">Semantic Web: Learning from Machine Learning</a></li> <li><a href="/doc/6329521/rdfa-editor-for-ontological-annotation">Rdfa Editor for Ontological Annotation</a></li> <li><a href="/doc/6364502/potential-advantages-of-semantic-web-for-internet-commerce">Potential Advantages of Semantic Web for Internet Commerce</a></li> <li><a href="/doc/6414058/ontology-engineering-and-knowledge-extraction-for-cross-lingual-retrieval">Ontology Engineering and Knowledge Extraction for Cross-Lingual Retrieval</a></li> <li><a 
href="/doc/6985719/e-resource-management-and-the-semantic-web-applications-of-rdf-for-e-resource-discovery">E-Resource Management and the Semantic Web: Applications of RDF for E-Resource Discovery</a></li> <li><a href="/doc/7022608/guideline-semantic-web-guideline">Guideline Semantic Web Guideline</a></li> <li><a href="/doc/7153046/bringing-relational-databases-into-the-semantic-web-a-survey">Bringing Relational Databases Into the Semantic Web: a Survey</a></li> <li><a href="/doc/7561919/the-evolution-of-the-concept-of-semantic-web-in-the-context-of-wikipedia-an-exploratory-approach-to-study-the-collective-concep">The Evolution of the Concept of Semantic Web in the Context of Wikipedia: an Exploratory Approach to Study the Collective Concep</a></li> <li><a href="/doc/7637279/rdfa-vs-microformats">Rdfa Vs. Microformats</a></li> <li><a href="/doc/7642492/extending-the-linked-data-api-with-rdfa">Extending the Linked Data API with Rdfa</a></li> <li><a href="/doc/7713486/the-semantic-web-on-the-respective-roles-of-xml-and-rdf">The Semantic Web-On the Respective Roles of XML and RDF</a></li> <li><a href="/doc/7751798/ontologies-on-the-semantic-web">Ontologies on the Semantic Web</a></li> <li><a href="/doc/7933884/web-information-extraction-for-the-creation-of-metadata-in-semantic">Web Information Extraction for the Creation of Metadata in Semantic</a></li> <li><a href="/doc/7944954/introducing-semantic-ai-ingredients-for-a-sustainable-enterprise-ai-strategy-table-of-contents">Introducing Semantic AI Ingredients for a Sustainable Enterprise AI Strategy Table of Contents</a></li> <li><a href="/doc/8039814/semantic-web-the-future-of-the-web">Semantic Web - the Future of the Web</a></li> <li><a href="/doc/8086916/semantic-web-and-to-the-web-of-linked-data">Semantic Web and to the Web of Linked Data</a></li> <li><a href="/doc/8115245/foundations-of-semantic-web-databases">Foundations of Semantic Web Databases</a></li> <li><a href="/doc/8141889/metadata-and-semantic">Metadata and Semantic</a></li> <li><a href="/doc/8283580/understanding-semantic-web-and-ontologies-theory-and-applications">Understanding Semantic Web and Ontologies: Theory and Applications</a></li> <li><a href="/doc/8914184/building-effective-ontology-for-semantic-web-a-discussion-based-on-practical-examples">Building Effective Ontology for Semantic Web: a Discussion Based on Practical Examples</a></li> <li><a href="/doc/8955854/the-semantic-web-the-roles-of-xml-and-rdf">The Semantic Web: the Roles of XML and RDF</a></li> <li><a href="/doc/8978496/semantic-web-architecture">Semantic Web Architecture</a></li> <li><a href="/doc/9079535/bringing-relational-databases-into-the-semantic-web-a-survey">Bringing Relational Databases Into the Semantic Web: a Survey</a></li> <li><a href="/doc/9103050/semantic-knowledge-extraction-from-research-documents">Semantic Knowledge Extraction from Research Documents</a></li> <li><a href="/doc/9262967/dbpedia-a-large-scale-multilingual-knowledge-base-extracted-from-wikipedia">Dbpedia – a Large-Scale, Multilingual Knowledge Base Extracted from Wikipedia</a></li> <li><a href="/doc/9503195/xhtml-with-rdfa-as-a-semantic-document-format-for-ccts-modelled-documents-and-its-application-for-social-services">XHTML with Rdfa As a Semantic Document Format for CCTS Modelled Documents and Its Application for Social Services</a></li> <li><a href="/doc/9596973/view-selection-in-semantic-web-databases">View Selection in Semantic Web Databases ∗</a></li> <li><a 
href="/doc/9663866/a-prototype-for-knowledge-extraction-from-semantic-web-based-on-ontological-components-construction">A Prototype for Knowledge Extraction from Semantic Web Based on Ontological Components Construction</a></li> <li><a href="/doc/9712641/relation-discovery-on-the-dbpedia-semantic-web">Relation Discovery on the Dbpedia Semantic Web</a></li> <li><a href="/doc/9727707/towards-a-semantic-web-of-relational-databases-a-practical-semantic-toolkit-and-an-in-use-case-from-traditional-chinese-medicine">Towards a Semantic Web of Relational Databases: a Practical Semantic Toolkit and an In-Use Case from Traditional Chinese Medicine</a></li> <li><a href="/doc/10129259/natural-language-interaction-with-semantic-web-knowledge-bases-and-lod">Natural Language Interaction with Semantic Web Knowledge Bases and LOD</a></li> <li><a href="/doc/10148429/what-happened-to-the-semantic-web-peter-mika-sr">WHAT HAPPENED to the SEMANTIC WEB? Peter Mika, Sr</a></li> <li><a href="/doc/10234792/classification-of-things-in-dbpedia-using-deep-neural">Classification of Things in Dbpedia Using Deep Neural</a></li> <li><a href="/doc/10351513/semantic-web">Semantic Web</a></li> <li><a href="/doc/10464727/knowledge-extraction-for-semantic-web-issn-2321-9939">Knowledge Extraction for Semantic Web| ISSN: 2321-9939</a></li> <li><a href="/doc/10862889/metadata-semantics-ontologies">Metadata, Semantics, Ontologies</a></li> <li><a href="/doc/10915664/a-semantic-web-knowledge-base">A Semantic Web Knowledge Base</a></li> <li><a href="/doc/11006074/d-rdf-dynamic-resource-description-framework-kamna-jain-iowa-state-university">D-RDF: Dynamic Resource Description Framework Kamna Jain Iowa State University</a></li> <li><a href="/doc/11202224/the-semantic-web-resource-description-framework">The Semantic Web Resource Description Framework</a></li> <li><a href="/doc/11395675/querying-the-semantic-web-sparql">Querying the Semantic Web (SPARQL)</a></li> <li><a href="/doc/11435924/introduction-to-the-semantic-web-databases">Introduction to the Semantic Web Databases</a></li> <li><a href="/doc/11487651/application-of-artificial-intelligence-methods-for-the-semantic-web">Application of Artificial Intelligence Methods for the Semantic Web</a></li> <li><a href="/doc/11523643/personal-knowledge-mapping-with-semantic-web-technologies">Personal Knowledge Mapping with Semantic Web Technologies</a></li> <li><a href="/doc/11739846/skos-and-the-semantic-web-knowledge-organization-metadata-and-interoperability">SKOS and the Semantic Web: Knowledge Organization, Metadata, and Interoperability</a></li> <li><a href="/doc/12090433/raising-semantics-awareness-in-geospatial-metadata-management">Raising Semantics-Awareness in Geospatial Metadata Management</a></li> <li><a href="/doc/12194709/ontology-guided-information-extraction-from-unstructured-text">Ontology Guided Information Extraction from Unstructured Text</a></li> <li><a href="/doc/12228739/semantic-web-languages-rdf-vs-soap-serialisation">Semantic Web Languages: RDF Vs. 
SOAP Serialisation</a></li> <li><a href="/doc/12594970/developing-semantic-web-applications-using-the-oracle">Developing Semantic Web Applications Using the Oracle</a></li> <li><a href="/doc/12624358/scientific-american-feature-article-the-semantic-web-may-2001">Scientific American: Feature Article: the Semantic Web: May 2001</a></li> <li><a href="/doc/12973626/semantic-web-languages-strengths-and-weakness">Semantic Web Languages - Strengths and Weakness</a></li> <li><a href="/doc/13473365/resource-description-framework-rdf">Resource Description Framework (RDF)</a></li> </ul> </div> <div class="adcenter"> <script async src="https://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script> <ins class="adsbygoogle" style="display:block" data-ad-client="ca-pub-8519364510543070" data-ad-slot="2975948327" data-ad-format="auto" data-full-width-responsive="true"></ins> <script>(adsbygoogle = window.adsbygoogle || []).push({});</script> </div> </div> </div> </div> <script type="text/javascript"> var s =1; if (s==1) { $("body").removeClass("reading-mode"); $(".viewer").css("height", "auto"); } else { $("body").addClass("reading-mode"); } </script> <script src='//platform-api.sharethis.com/js/sharethis.js#property=5abe65211243c10013440c77&product=sticky-share-buttons' async='async'></script> <footer> <div class="container-fluid text-center"> <a href="#Top" data-toggle="tooltip" title="TO TOP"> <span class="glyphicon glyphicon-chevron-up"></span> </a><br><br><span> © 2022 Docslib.org </span> <ul class="visible-lg-inline"> <li><a href="/contact">Contact Us</a></li> </ul> </div> <!-- Default Statcounter code for Docslib.org https://www.docslib.org --> <script> var sc_project = 11552861; var sc_invisible = 1; var sc_security = "b956b151"; </script> <script src="https://www.statcounter.com/counter/counter.js" async></script> <noscript><div class="statcounter"><a title="Web Analytics" href="http://statcounter.com/" target="_blank"><img class="statcounter" src="//c.statcounter.com/11552861/0/b956b151/1/" alt="Web Analytics"></a></div></noscript> <!-- End of Statcounter Code --> </footer> <script src="https://stackpath.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> </body> </html>