MPEG-7-Aligned Spatiotemporal Video Annotation and Scene
Total Page:16
File Type:pdf, Size:1020Kb
MPEG-7-Aligned Spatiotemporal Video Annotation and Scene Interpretation via an Ontological Framework for Intelligent Applications in Medical Image Sequence and Video Analysis by Leslie Frank Sikos Thesis Submitted to Flinders University for the degree of Doctor of Philosophy College of Science and Engineering 5 March 2018 Contents Preface ............................................................................................................................................ VI List of Figures .............................................................................................................................. VIII List of Tables .................................................................................................................................. IX List of Listings .................................................................................................................................. X Declaration .................................................................................................................................... XII Acknowledgements ..................................................................................................................... XIII Chapter 1 Introduction and Motivation ......................................................................................... 1 1.1 The Limitations of Video Metadata.............................................................................................. 1 1.2 The Limitations of Feature Descriptors: the Semantic Gap ...................................................... 2 1.2.1 Common Visual Descriptors .................................................................................................. 4 1.2.2 Common Audio Descriptors .................................................................................................. 5 1.2.3 Common Spatiotemporal Feature Descriptors, Feature Aggregates, and Feature Statistics .................................................................................................................................... 7 1.3 Machine Learning Approaches for Multimedia Understanding .............................................. 7 1.4 Motivation: Multimedia Semantics in the Medical Domain ..................................................... 9 1.5 Summary ....................................................................................................................................... 12 Chapter 2 Medical Video Semantics ............................................................................................. 13 2.1 Challenges in Medical Video Interpretation ............................................................................ 13 2.2 From Video Metadata to Medical Video Semantics ................................................................ 16 2.3 Summary ....................................................................................................................................... 19 Chapter 3 Formal Knowledge Representation .............................................................................. 20 3.1 Structured Data ............................................................................................................................ 20 I Contents 3.2 Controlled Vocabularies and Ontologies: from RDFS to OWL 2 .......................................... 21 3.2.1 Modeling with RDFS ............................................................................................................ 22 3.2.2 Modeling with OWL ............................................................................................................. 23 3.2.2.1 Class Declarations ......................................................................................................... 26 3.2.2.2 Property Declarations ................................................................................................... 27 3.2.2.3 Individual Declarations ................................................................................................ 33 3.2.3 Serialization ........................................................................................................................... 35 3.3 Description Logics: Formal Grounding for OWL Ontologies ............................................... 38 3.4 A Hybrid DL-Based Formalism for Video Event Representation ......................................... 39 3.4.1 Rationale ................................................................................................................................. 40 3.4.2 Development.......................................................................................................................... 47 3.4.3 Importance in Spatiotemporal Reasoning ......................................................................... 49 3.5 Summary ........................................................................................................................................ 49 Chapter 4 Ontology Implementations .......................................................................................... 50 4.1 Common Vocabularies and Ontologies for Video Representation ....................................... 50 4.2 Common Medical Ontologies..................................................................................................... 53 4.2.1 Biomedical Ontologies for Representing Background Knowledge ................................. 55 4.2.2 Vocabularies and Ontologies for the Semantic Annotation of Medical Multimedia ............................................................................................................................. 56 4.3 Structured Video Data Deployment .......................................................................................... 57 4.3.1 Lightweight Video Annotations .......................................................................................... 57 4.3.2 Video Representations in Graph Databases ...................................................................... 59 4.3.3 Semantic Enrichment of Videos with Linked Data .......................................................... 61 4.4 Summary ....................................................................................................................................... 62 Chapter 5 Ontology-Based Structured Video Annotation .......................................................... 63 5.1 Semantic Video Annotation ........................................................................................................ 63 5.1.1 Feature Extraction for Concept Mapping .......................................................................... 63 5.1.2 Knowledge Representation of Video Scenes ..................................................................... 64 II Contents 5.1.3 Ontology-Based Video Indexing and Retrieval ................................................................ 65 5.1.4 Primary Application Areas .................................................................................................. 68 5.2 Structured Video Annotation Tools .......................................................................................... 68 5.2.1 A Retrospective Survey of Semantic Video Annotation Tools ....................................... 69 5.2.2 State-of-the-Art Structured Video Annotation Tools ...................................................... 73 5.2.3 Comparison of Structured Video Annotation Tools ........................................................ 74 5.3 Standards Alignment ................................................................................................................... 77 5.4 Linked Data Support ................................................................................................................... 77 5.5 Ontology Use ................................................................................................................................ 78 5.6 Spatiotemporal Annotation Support ......................................................................................... 80 5.7 Summary ....................................................................................................................................... 81 Chapter 6 Video Ontology Engineering ........................................................................................ 82 6.1 The Video Ontology (VidOnt) ................................................................................................... 82 6.1.1 Rationale ................................................................................................................................. 83 6.1.2 Formal Grounding ................................................................................................................ 84 6.1.2.1 Terminological Knowledge .......................................................................................... 85 6.1.2.2 Assertional Knowledge ................................................................................................. 86 6.1.2.3 Role Box .......................................................................................................................... 87 6.1.2.4 Increasing Expressivity ................................................................................................. 88 6.1.3 Evaluation ............................................................................................................................... 89 6.1.3.1 Case Study 1: Professional Video Production ..........................................................