
Deep integration of the OWL ontology language into Ruby using metaprogramming

Inaugural dissertation for the attainment of the doctoral degree of the Faculty of Mathematics and Natural Sciences of Heinrich-Heine-Universität Düsseldorf, submitted by

Dominic Mainz

December 2008

From the Institut für Informatik of Heinrich-Heine-Universität Düsseldorf

Printed with the permission of the Faculty of Mathematics and Natural Sciences of Heinrich-Heine-Universität Düsseldorf

Referee: Prof. Dr. Arndt von Haeseler

Co-referee: Prof. Dr. Martin Lercher

Date of the oral examination: 21 January 2009

Publications

Parts of this thesis have been published in the following conference proceedings:

1. Dominic Mainz, Katrin Weller, Jürgen Mainz. (2008) Semantic Image Annotation and Retrieval with IKEN. In: International Semantic Web Conference (Posters & Demos).

2. Dominic Mainz, Ingo Paulsen, Indra Mainz, Katrin Weller, Jochen Kohl, Arndt von Haeseler. (2008) Knowledge Acquisition Focused Cooperative Development of Bio-Ontologies – A Case Study with BIO2ME. In: M. Elloumi, J. Küng, M. Linial, R.F. Murphy, K. Schneider, T. Cristian (eds.) Bioinformatics Research and Development, 258-272, Springer. (ISBN 978-3-540-70598-7)

Other publications:

1. Indra Mainz, Katrin Weller, Ingo Paulsen, Dominic Mainz, Jochen Kohl, Arndt von Haeseler. (2008) Ontoverse: Collaborative Ontology Engineering for the Life Sciences. Inform. Wiss. Praxis, 2, 91-99.

2. Zoulfa El Jerroudi, Stefan Weinbrenner, Dominic Mainz, Katrin Weller. (2008) Ontoverse: Kollaborative Ontologieentwicklung mit interaktiver visueller Unterstützung. In: Mensch & Computer.

3. Ingo Paulsen, Dominic Mainz, Katrin Weller, Indra Mainz, Jochen Kohl, Arndt von Haeseler. (2007) Ontoverse: Collaborative Knowledge Management in the Life Sciences Network. In: Proceedings of the eScience Conference 2007, Max Planck Digital Library, ID 316588.0.

4. Katrin Weller, Dominic Mainz, Indra Mainz, Ingo Paulsen. (2007) Wissenschaft 2.0? Social Software im Einsatz für die Wissenschaft. In: Marlies Ockenfeld (ed.) Information in Wissenschaft, Bildung und Wirtschaft, 29. Online-Tagung der DGI, 59. Jahrestagung der DGI, Proceedings, DGI, S. 121-136.

5. Katrin Weller, Indra Mainz, Ingo Paulsen, Dominic Mainz. (2007) Semantisches und vernetztes Wissensmanagement für Forschung und Wissenschaft. To appear in: WissKom 2007, Wissenschaftskommunikation der Zukunft, 4. Konferenz der Zentralbibliothek im Forschungszentrum Jülich, Proceedings.

Acknowledgements

My special thanks go to Prof. Dr. Arndt von Haeseler for entrusting me with this interesting topic and for his constant readiness for constructive as well as critical discussions, which accompanied and motivated me throughout this work. I am very grateful for his generous support in making the Ontoverse project and my participation in conferences possible. I warmly thank Prof. Dr. Martin Lercher for his interest in this work and for taking on the role of co-referee. I sincerely thank my working group for the wonderful years, in particular Ingo, Katrin and Jochen. I thank my dear family and my friends for being there: my parents, my sisters, the Ahmadinejads, Deniz, Timo, Nicki, Julia, Paola, Ilija, Tatjana, Elena, Jürgen, Birgit and all the remaining Mainzes and Schourens ... Thank you, Nahal.

Contents

1. Introduction
   1.1 Data Explosion in the Life Sciences and Multimedia Content Management
   1.2 Ontologies and Semantic Applications
       1.2.1 Deep Integration
   1.3 Thesis Outline

2. Background
   2.1 The Semantic Web and its Concepts
       2.1.1 Ontologies
       2.1.2 The Resource Description Framework (RDF)
       2.1.3 The Resource Description Framework Schema (RDFS)
       2.1.4 Web Ontology Language (OWL)
       2.1.5 Rules
   2.2 Description Logic
   2.3 The ONTOVERSE Project
   2.4 Bio-ontologies
       2.4.1 Open Biomedical Ontologies (OBO)
   2.5 Multimedia Content and the Semantic Web
       2.5.1 Ontology-Based Multimedia Content Indexing
       2.5.2 Ontology-Based Multimedia Content Retrieval
   2.6 The PELLET Reasoner
   2.7 Semantic Web Frameworks
       2.7.1 JENA2
       2.7.2 OWL API
       2.7.3 ACTIVERDF
   2.8 The Dynamic Programming Language RUBY
       2.8.1 RUBY Classes, Objects, and Variables
       2.8.2 RUBY Modules
       2.8.3 Reflection and Metaprogramming
   2.9 Deep Integration

3. DEEP SEMANTICS
   3.1 Consistency Safeness
   3.2 Architecture
       3.2.1 Implemented RUBY classes and modules
   3.3 The Abstract Syntax of OWL LITE and its contribution to DEEP SEMANTICS
   3.4 Director: Coordinating Data Workflow and Deep Integration Process
   3.5 Triple Parser: Mapping OWL Triples to an Abstract Syntax Based Representation in RUBY
       3.5.1 The Triple Parsing Process
   3.6 Deep Integration Builder
       3.6.1 Conversion of the OWL Ontology Definition into a RUBY Module
       3.6.2 Conversion of OWL Properties into RUBY Objects
       3.6.3 Conversion of OWL Classes into RUBY Classes
       3.6.4 TBox Deep Integration – Assembling of the Deep Integrated RUBY Properties and Classes into a consistent RUBY Representation of the Ontology
       3.6.5 ABox Deep Integration – Conversion of Instances into RUBY Objects
   3.7 Utilization of the Deep Integrated Ontology
       3.7.1 Using DEEP SEMANTICS to convert an Ontology into a RUBY Representation
       3.7.2 Working with OWL Classes, Properties and Instances in DEEP SEMANTICS
       3.7.3 XPERIMENTR – A Simple Semantic Application using DEEP SEMANTICS
   3.8 Comparison of DEEP SEMANTICS with other Semantic Web Frameworks
       3.8.1 DEEP SEMANTICS versus OWL API
       3.8.2 DEEP SEMANTICS versus JENA2
       3.8.3 DEEP SEMANTICS versus ACTIVERDF
   3.9 Discussion
       3.9.1 Programming Complexity
       3.9.2 Runtime and Main Memory Complexity
       3.9.3 Handling Multiple Ontologies in DEEP SEMANTICS
       3.9.4 Conclusions and Outlook

4. DEEP SEMANTICS in Action: IKEN and the BIO2ME
   4.1 IKEN
       4.1.1 The IKEN ontology
       4.1.2 The IKEN Architecture
       4.1.3 The IKEN Graphical User Interface
   4.2 The BIO2ME Ontology and Information System
       4.2.1 BIO2ME Ontology
       4.2.2 BIO2ME Information System
   4.3 Discussion
       4.3.1 Semantic Image Management with IKEN
       4.3.2 Experiences applying DEEP SEMANTICS
       4.3.3 Conclusions and Outlook

5. Summary

6. Zusammenfassung

Bibliography

Appendix
   6.1 XPERIMENTR Ontology: Classes, Properties and Instances
   6.2 Abstract syntax of OWL LITE
   6.3 Reference Test Implementations
       6.3.1 Test 1: list all classes of the ontology
       6.3.2 Test 2: find all instances matching a certain search term

List of Figures

1.1 Growth of GENBANK sequence entries.
1.2 Chronological development of the number of databases listed by the Nucleic Acids Research online Molecular Biology Database Collection.
2.1 Internet as a maze 1.
2.2 Internet as a maze 2.
2.3 The Semantic Web layers.
2.4 Schematic representation of the constructs of an ontology.
2.5 Screenshot of the RDF/XML serialization of a simple RDF Schema for the description of resources related to animals and plants.
3.1 Conversion of an OWL class Program into a RUBY class.
3.2 Top-level architectural diagram of DEEP SEMANTICS including input and output values.
3.3 UML class diagram of the DEEP SEMANTICS main classes.
3.4 UML class diagram of the DEEP SEMANTICS helper classes.
3.5 UML class diagram of DEEP SEMANTICS custom datatypes.
3.6 Simplified UML class diagram extended with meta-programming symbols.
3.7 Simplified UML class diagram for the Ontology construct extended with meta-programming symbols.
3.8 Simplified UML class diagram for OWL class constructs extended with meta-programming symbols.
3.9 Simplified UML class diagram for the OWL restriction constructs extended with meta-programming symbols.
3.10 Simplified UML class diagram for the OWL datatype property constructs extended with meta-programming symbols.
3.11 Simplified UML class diagram for the OWL object property constructs extended with meta-programming symbols.
3.12 Activity diagram 1: overview.
3.13 Activity diagram 2: Director.
3.14 Activity diagram 3: TripleParser.
3.15 Activity diagram 4: Adapter.
3.16 Activity diagram 5: triple parsing and ontology conversion.
3.17 Activity diagram 6: parsing details.
3.18 Activity diagram 7: TBox parsing.
3.19 Activity diagram 8: property parsing.
3.20 Activity diagram 9: datatype property parsing.
3.21 Activity diagram 10: object property parsing.
3.22 Activity diagram 11: class parsing.
3.23 Activity diagram 12: ABox parsing.
3.24 Activity diagram 13: instance parsing.
3.25 Activity diagram 14: TBox deep integration.
3.26 Activity diagram 15: ontology module creation.
3.27 Activity diagram 16: property object creation.
3.28 Activity diagram 17: RUBY ontology class creation.
3.29 Activity diagram 18: deep integration details 1.
3.30 Schematic illustration of the deep integration of an example property.
3.31 Activity diagram 19: deep integration details 2.
3.32 Activity diagram 20: Ghost creation.
3.33 Activity diagram 21: using an existing Ghost class.
4.1 Screenshot of the novel semantics-enabled web interface of the IKEN application.
4.2 Extract of the IKEN class hierarchy and a sample photo annotation.
4.3 Ontology data pre-processing for IKEN.
4.4 IKEN architecture and implementation set-up.
4.5 IKEN details view in the annotation interface.
4.6 IKEN refinement view in the annotation interface.
4.7 IKEN search interface.
6.1 EBNF of the abstract syntax of OWL LITE.
6.2 Extended UML class diagram: EBNF of OWL LITE constructs in relation to their DEEP SEMANTICS classes and modules counterparts.

List of Tables

3.1 Types of global property constraints that have to be considered by DEEP SEMANTICS
3.2 Types of local property constraints that have to be considered by DEEP SEMANTICS
3.3 Runtime and main memory complexity comparison between DEEP SEMANTICS and OWL API
3.4 Runtime and main memory complexity comparison between DEEP SEMANTICS and JENA2
3.5 Runtime and main memory complexity comparison between DEEP SEMANTICS and ACTIVERDF

Listings

3.1 A concluding setter method example
3.2 Using DEEP SEMANTICS to create a functional model of IKEN
3.3 Including DEEP SEMANTICS into custom RUBY code
3.4 Adding the namespace of the ontology and creating a functional ontology model
3.5 Printing out the labels of class Protein
3.6 Printing out the labels of every Protein instance
3.7 Creating a Protein and a BiologicalFunction
3.8 Adding information about its molecular function to insulin
3.9 Trying to add a second molecular function to the instance insulin
3.10 Access of a property RUBY object in DEEP SEMANTICS
3.11 Using DEEP SEMANTICS to prepare the ontology knowledge base of XPERIMENTR
3.12 Accepting and processing user input
3.13 Implementation of the information retrieval method retrieveInformation(): search for matching classes segment
3.14 Implementation of the information retrieval method retrieveInformation(): search for matching laboratory material instances
3.15 Implementation of the information retrieval method retrieveInformation(): search for matching laboratory equipment instances
3.16 Implementation of the information retrieval method retrieveInformation(): search for matching Protocol instances
3.17 Implementation of the XPERIMENTR method findProtocolsByExperimentMaterials()
3.18 Implementation of the XPERIMENTR method findProtocolsByExecTime()
3.19 XPERIMENTR implementation using OWL API: print out the class hierarchy of the ontology
3.20 XPERIMENTR implementation using OWL API: printHierarchy() methods
3.21 XPERIMENTR implementation using DEEP SEMANTICS: print out the class hierarchy of the ontology
3.22 XPERIMENTR implementation using OWL API: information retrieval implementation for instances of ontology class Protocol
3.23 XPERIMENTR implementation using OWL API: findInstancesByLabel() method
3.24 DEEP SEMANTICS internal implementation of the method find_instances_by_label(label)
3.25 XPERIMENTR implementation using JENA2: print out the class hierarchy of the ontology
3.26 XPERIMENTR implementation using JENA2: printHierarchy() methods
3.27 XPERIMENTR implementation using JENA2: source code for the retrieval of protocols that can be executed in a specified duration
3.28 KITCHEN MENTOR implementation using ACTIVERDF: retrieving a recipe for a set of included materials
3.29 Consistency problems using setters in ACTIVERDF
6.1 Test 1 implementation using OWL API
6.2 Test 1 implementation using JENA2
6.3 Test 1 implementation using DEEP SEMANTICS
6.4 Test 2 implementation using OWL API
6.5 Test 2 implementation using JENA2
6.6 Test 2 implementation using ACTIVERDF
6.7 Test 2 implementation using DEEP SEMANTICS

1. Introduction

Many scientific and industrial areas today face a tremendous increase in the amount of produced data and information. In recent years this development has created a large demand for new knowledge management strategies. One particularly promising approach is the modeling of formal knowledge representations called ontologies (Gruber, 1993). Ontologies provide means for incorporating some degree of information about the context and semantics of processed content into computer programs. Furthermore, ontologies represent one of the central components of the Semantic Web (Hendler, 2001). The Semantic Web primarily differs from the current World Wide Web by extending it with a semantic layer of machine-processable metadata, which enables computers and people to interact with each other and exchange data in a meaningful way. The concepts and technologies constituting the Semantic Web are described in more detail in Section 2.1. This thesis deals with the development of a novel Semantic Web framework. The design of an ontology in the field of image management and of a Web application using this ontology as knowledge base are also introduced in this work.

1.1 Data Explosion in the Life Sciences and Multimedia Content Management

The advancement of the life sciences disciplines is strongly related to the efficient management and retrieval of already collected knowledge, information and data (Sahoo et al., 2006). Since the advent of modern high-throughput and laboratory automation technologies, data production about living systems has exploded. Figure 1.1 shows, for example, the exponential growth of the number of nucleotide sequence entries in GENBANK (Benson et al., 2004). GENBANK offers open access to an annotated collection of all publicly available nucleotide and protein sequences. From December 1994 to August 2008, that is, in less than 14 years, the number of stored sequences increased 390-fold from 237,775 to 92,740,599, doubling approximately every 18 months.

Figure 1.1: Growth of GENBANK sequence entries. The figure shows the chronological development of the number of stored sequence entries in the GENBANK database. The graphic is based on GENBANK statistic reports that are periodically published by the National Center for Biotechnology Information of the United States of America.

The development of even faster and cheaper sequencing techniques will continue this exponential growth in the foreseeable future.

A consequence of the explosion of stored nucleotide sequences and other experimentally produced biomedical data (e.g. molecular structures and gene expression data) is the need to analyze these data in order to extend biomedical knowledge and gain scientific insight into the configurations and processes of living systems. These analysis activities in turn result, among other things, in the growth of primary and secondary databases, as shown in Figure 1.2. The figure shows the chronological development of the number of databases listed in a collection of biomedical databases, the Nucleic Acids Research online Molecular Biology Database Collection. This collection is a public repository which currently lists more than 1,000 freely available databases, with contents ranging from molecular structures to human genes and diseases, just to name a few. While this collection does not exhaust the number of available databases, it is still an indicator of a general development: a growth that yielded a fivefold increase in the number of available databases from 2000 to 2008.

The data-explosion-driven increase in biomedical databases leads to new problems in the life sciences: the appropriate management, interchange and indexing of the stored content. Scientists today face three main problems when using biomedical databases:

Figure 1.2: Chronological development of the number of databases listed by the Nucleic Acids Research online Molecular Biology Database Collection. The values used for the figure were published in Baxevanis (2000, 2001, 2002, 2003) and Galperin (2004, 2005, 2006, 2007, 2008).

• Different use of terminology: the lack of a standardized terminology complicates the discussion and reproducibility of scientific data. This leads to the use of synonyms and to misspellings of concept, organism or gene names in the diverse databases, which makes it even more difficult to retrieve all information about a certain knowledge entity, like for example a gene or protein.

• Inconsistent terminologies: the existing scientific terminologies are based on natural language and can therefore be interpreted differently. As an example, the fundamental biological term 'gene' is not clearly defined and is still discussed with different biological meanings (Pesole, 2008; Stevens et al., 2000). A gene may be defined as 'the coding region of DNA', as a 'DNA fragment that can be transcribed and translated into a protein', or as a 'DNA region of biological interest with a name and that carries a genetic trait or phenotype'.

• Navigation in knowledge space: a third challenge is the navigation in the knowledge space spanned by the diverse databases and scientific publications. The sequence of a particular gene may have been stored in one database, while corresponding information about its expression characteristics may be stored in a second one, and additional details about the functionality of the encoded protein even in a third database. It is extremely difficult for biologists to deal with all this scattered information. The continuously growing amount of data, databases and biological knowledge only compounds the situation.

Another important field that benefits from the application of semantic technologies is the broad area of multimedia content management.

Similar to the situation in the life sciences, the amount of digital multimedia information that is accessible via the Internet is growing every day. New devices like digital still cameras, multimedia cellphones and digital video cameras have hit the mass market in recent years and have led to an explosion of multimedia content shared on the Internet. Popular examples of social multimedia sharing Web applications are YOUTUBE (Cheng et al., 2007) for short video broadcasting and the image exchange platform FLICKR (van Zwol, 2007). These applications provide only insufficient search functionality. A semantics-based search for the keyword "animal", on the other hand, could for example return pictures showing a dog, based on a knowledge model which contains the relation that a dog is an animal. Just as scientists in the life sciences have a growing demand for knowledge management support as a consequence of the biological data explosion, the acquisition, processing and distribution of multimedia content has raised a demand among diverse user groups, ranging from private users and medical professionals to employees in the media industry, for more sophisticated semantic multimedia content management solutions (Ahmad, 2007). One of the currently most promising approaches is the bridging of the semantic gap (Hare et al., 2006) through the use of ontologies. The semantic gap between the set of facts a human can identify in a picture, for example, and the quality of the corresponding annotation data can be narrowed by moving from mere term-based indexing of content to content annotations based on a supporting ontology (Hollink et al., 2003). As a result of these developments and challenges, bioinformaticians and life sciences researchers as well as professionals working with multimedia content have identified the need to create systems that add and apply the knowledge in the minds of domain experts to the processed content. In both fields this knowledge is nowadays captured and made accessible to both computers and humans with the use of ontologies.

1.2 Ontologies and Semantic Applications

Ontologies are represented using specialized ontology languages that provide inference and integrity rules, that is, rules that are used to make implicit knowledge in an ontology explicit as well as to guarantee an ontology's logical validity. Currently, the most important and most widely used ontology language is the Web Ontology Language (OWL; Bechhofer et al., 2004). Ontologies are usable in a wide range of applications. Jasper & Uschold (1999) describe typical ontology application scenarios, which comprise:

1. Ontology as Specification: An ontology is used as a formal basis for software specification and documentation in order to improve specification quality and reliability and to foster knowledge reuse.

2. Common Access to Information: In this scenario the ontology primarily functions as a means to provide a shared understanding of the terms and their interrelations in a certain domain, a controlled vocabulary.

This vocabulary is then used to support collaborations between persons and computer applications, respectively. A prominent example of this kind of ontology application scenario is the GENE ONTOLOGY (Ashburner et al., 2000).

3. Ontology-Based Search: An ontology is used for searching a repository for desired content, e.g. publications (Delfs et al., 2004), websites (Esmaili & Abolhassani, 2006), images (Ahmad, 2007) or sequence entries (Lu et al., 2006). This semantic information retrieval approach offers faster access to relevant information resources (Finin et al., 2005).

In addition to these ontology application scenarios, further examples of specific application types are semantics-based database integration (Cheung et al., 2007) and ontology-based information extraction (Hu et al., 2004).

1.2.1 Deep Integration

The widespread adoption of the Semantic Web largely depends on the support of logic reasoners like RACERPRO (Haarslev & Möller, 2003), FACT++ (Tsarkov & Horrocks, 2006) and PELLET (Parsia & Sirin, 2004), and of specialized frameworks like OWL API (Horridge & Bechhofer, 2007), JENA2 (Carroll et al., 2004) and ACTIVERDF (Oren & Delbru, 2006), to name some prominent ones. Developers of Semantic Web applications use these specialized frameworks to convert, retrieve and edit ontology entities. Semantic Web frameworks make ontologies accessible to software applications. However, current frameworks implemented in JAVA like OWL API and JENA2 provide quite complex application programming interfaces (APIs), which constitute an important obstacle with respect to their adoption in the general Web application development community. Vrandecic (2005) discusses scripting languages and dynamic programming languages, respectively, as a promising approach to address the aforementioned API complexity challenges of JAVA-based frameworks. Dynamic programming languages like RUBY (Thomas et al., 2004) or PYTHON (Van Rossum, 2003) use implicitly declared data types, automatic memory management (garbage collection; Soman & Krintz, 2007) and metaprogramming capabilities to achieve a higher level of programming and more rapid application development (Ousterhout, 1998). Babik & Hluchy (2006) introduce an approach for the deep integration of PYTHON with OWL, offering a more intuitive mapping of OWL into the programming context than classic APIs. The deep integration paradigm, which aims at reproducing the semantics of OWL in a dynamic programming language, introduced the idea of importing an ontology directly into the programming context in such a way that its classes can be used directly like other classes of the language, thereby avoiding a complicated API.
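The following minimal RUBY sketch illustrates the deep integration idea in general terms. It is not DEEP SEMANTICS code; the ontology class name Protein and its label attribute are invented for illustration. Metaprogramming (here Class.new and const_set) turns an ontology class into an ordinary RUBY class that application code can use without an intermediate API:

    # Illustrative sketch only: build a RUBY class for a hypothetical OWL
    # class "Protein" at runtime, as a deep integration framework might do.
    module OntologyModel
      protein = Class.new do
        attr_accessor :label # stands in for an rdfs:label annotation

        def initialize(label)
          @label = label
        end
      end
      # Register the dynamically built class under the ontology class's name.
      const_set(:Protein, protein)
    end

    # The generated class is now used like any other RUBY class.
    insulin = OntologyModel::Protein.new("Insulin")
    puts insulin.label # => Insulin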

Oren & Delbru (2006) describe the above-mentioned Semantic Web framework ACTIVERDF, which is based on the deep integration of RUBY with the Resource Description Framework Schema (RDFS; RDF, 2004). While ACTIVERDF provides object-oriented access to Resource Description Framework (RDF) data (Klyne & Carroll, 2004) including its modification, and considerably reduces programming complexity (see also the corresponding comparisons in Section 3.8), it is not sufficient for editing OWL ontologies. OWL extends RDFS (Lacy, 2005) with new language constructs, of which logical constraints are of particular importance with respect to ACTIVERDF's current shortcomings.

For example, logical constraints can be used to restrict the number (cardinality restriction) and type of entities (universal quantification and existential quantification) that can be interrelated in an ontology. Disregarding logical constraints leads to inconsistencies in the ontology, which in turn has the effect that the ontology can no longer be processed using a Semantic Web reasoner. As ACTIVERDF does not enforce any OWL constraints, processed ontologies might become inconsistent.
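To sketch the consistency-safe behaviour a deeply integrated framework can provide, the following RUBY fragment refuses an assignment that would violate an assumed owl:maxCardinality of 1 on a property. It does not reproduce the actual DEEP SEMANTICS implementation; the class, method and error message are invented, loosely following the insulin example used later in Chapter 3:

    # Illustrative sketch: a setter-like method that guards an assumed
    # owl:maxCardinality = 1 constraint instead of silently violating it.
    class Protein
      MAX_MOLECULAR_FUNCTIONS = 1 # assumed cardinality constraint from the TBox

      def initialize
        @molecular_functions = []
      end

      def add_molecular_function(function)
        if @molecular_functions.size >= MAX_MOLECULAR_FUNCTIONS
          raise "Cardinality violation: at most #{MAX_MOLECULAR_FUNCTIONS} " \
                "molecular function(s) allowed"
        end
        @molecular_functions << function
      end
    end

    insulin = Protein.new
    insulin.add_molecular_function("hormone activity") # accepted
    insulin.add_molecular_function("enzyme activity")  # raises an error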

1.3 Thesis Outline

The tremendous increase in the amount of accessible data and information in the life sciences and in multimedia content management has further increased the demand for semantic technologies. Current Semantic Web frameworks offer complex APIs that need significantly more lines of code to accomplish the same functionality as frameworks based on dynamic programming languages.

The dynamic programming language RUBY has gained wide distribution among World Wide Web developers. Especially since the release of the rapid World Wide Web development framework RUBY ON RAILS (Thomas et al., 2005), an increasing number of Semantic Web applications (Oren et al., 2007) and life sciences related websites (Goble & Roure, 2007; Roure & Goble, 2007) have been developed in this language.

As described before, ACTIVERDF constitutes a first generation of Semantic Web frameworks for RUBY ON RAILS and RUBY developers, which however is not sufficient for editing ontologies in the most prominent ontology language, OWL. To further foster the broad use of semantic technologies, especially those making use of OWL, this thesis introduces DEEP SEMANTICS, a second-generation Semantic Web framework that supports logical constraints and can be used to safely modify OWL ontologies. Chapter 2 contains background information about the technologies and areas of research that are needed to follow the discussion of the results of this thesis. The Semantic Web and its concepts are described, as well as the Semantic Web frameworks and tools that were used. Furthermore, this chapter introduces two relevant application domains of semantic technologies: the Semantic Web for the life sciences (Section 2.4) and multimedia content and the Semantic Web (Section 2.5). Chapter 2 ends with an introduction to the dynamic programming language RUBY, whose metaprogramming capabilities are crucial for the realization of this work. The ensuing two chapters present and discuss the results of this thesis. Chapter 3 gives a detailed description of the newly developed Semantic Web framework DEEP SEMANTICS. This is the most important chapter, especially with respect to new scientific insights on how to utilize metaprogramming for the deep integration of OWL into RUBY. Section 3.2 describes the overall architecture of the framework, while Sections 3.4, 3.5 and 3.6 cover implementation details, including solutions for specific challenges that were solved in this work. To give the reader a detailed impression of the particular benefits of the framework, Section 3.7 describes how to program with ontologies converted by DEEP SEMANTICS in RUBY. Section 3.8 contains a comparison of DEEP SEMANTICS with other Semantic Web frameworks with respect to programming complexity and runtime characteristics. The chapter ends with a discussion of the experiences, challenges and various features of DEEP SEMANTICS. Chapter 4 describes the development of a web-based semantic image management application called IKEN. This application is a self-developed proof of concept showing the practical benefits of DEEP SEMANTICS. The chapter presents the developed ontology, the designed architecture and a new kind of semantic user interface. Additionally, Section 4.2 describes the BIO2ME ontology and information system, which was developed by Mainz (2008) utilizing the framework introduced in this dissertation.

This dissertation is related to two research projects: ONTOVERSE and IKEN. While the original idea for the development of DEEP SEMANTICS originated in the work for the ONTOVERSE project (for details see Section 2.3), the results presented in Chapters 3 and 4 can be considered independent of this project. In summary, the work covers the following subjects:

• DEEP SEMANTICS (introduced in Chapter 3): the main part of this dissertation, completely designed and implemented by myself – including the XPERIMENTR example (see Subsection 3.7.3).

• IKEN (introduced in Chapter 4): a proof of concept for the DEEP SEMANTICS framework. While conceptual work, especially regarding overall requirement specifications, marketing and business model elaboration, was done in collaboration with VARION GmbH, the actual implementation and the design of the semantic application as well as the underlying ontology were developed solely by myself during the work on this thesis. The implementation of OWL DL support was postponed in favour of the development of IKEN. This decision is well grounded in the focus on the best possible usability features, which could be tested and improved during IKEN's development.

• BIO2ME information system (described in Section 4.2.2): DEEP SEMANTICS was used by Mainz (2008) for the development of this system. In the context of this dissertation the BIO2ME information system functions as an external reference application, whose experience reports are discussed in Section 4.3.

2. Background

2.1 The Semantic Web and its Concepts

The term Semantic Web (Kanellopoulos & Kotsiantis, 2007) was coined in 2001 by Tim Berners-Lee (Hendler, 2001), the inventor of the World Wide Web (in the following abbreviated as WWW, Web or Internet) and current director of the World Wide Web Consortium (W3C). The main idea of the Semantic Web is to enhance the current Web with a semantic layer of machine-processable metadata that enables computers to interact and exchange data in a meaningful way. This metadata is organized in formal models of concepts and their relations, called ontologies.

The Semantic Web is frequently described as a web of data, while the current Web can be seen as a web of documents that are linked to other related web pages. In contrast, a web of data is characterized by the additional availability of metadata that describes and connects certain kinds of data (like address data, document descriptions or image content annotations). This metadata layer can then be used to foster the exchange of data between applications as well as to record how the data relate to real world objects.

From a user-related point of view, the Semantic Web enables a new kind of navigation in knowledge spaces accessible via the Internet. Figure 2.1 shows a typical search process using current Internet navigation possibilities. A user enters a query into a search engine. In response, she gets a list of hyperlinks with some additional information about the linked resources, for example websites, web interfaces of biomedical databases or publication databases, as well as possible Intranet resources. From that point on she has to click through this list of links until one of the visited websites contains the needed information. Instead of just looking at the results listed by the search engine, she can also follow additional links she encounters on the visited websites. In this respect, the Internet is like an information maze where the user has to search through different paths of hyperlinks, often returning to previously visited websites, until she finds, more or less by chance, the desired information.

Figure 2.1: Internet as a maze 1. The Web until version 2.0 is characterized by keyword-based searching that often resembles an attempt to find the right path through a maze – in this case the path to the searched information.

In comparison to current Web search processes, semantic technologies can offer some improvements. Figure 2.2 schematically shows how semantic technologies can convert the search process into a knowledge navigation process. Every information source in this example is semantically annotated and provides metadata about its content. Based upon this metadata it is possible to develop applications that deliver context-sensitive interfaces which enable users to navigate over different knowledge sources. In Figure 2.2 this kind of knowledge navigation application is called a Semantic Web Guide. The interaction with a Semantic Web Guide starts once again with a search query of the user, and the response is typically a list of possible sources of interest, too. The big difference of the semantics-based approach is that the Semantic Web Guide can use the metadata of the connected resources to provide the user with information about the relation between the presented search results and the initial search query. The Semantic Web Guide offers the user the same results as in Figure 2.1, but this time the user can directly see the semantic nature of each relation (indicated by the Relation Type labels) and hyperlink, respectively.

The Semantic Web is composed of layers of increasingly specialized technologies. Figure 2.3 shows this layered approach, including the following technologies: the Resource Description Framework (described in Subsection 2.1.2), the Resource Description Framework Schema (described in Subsection 2.1.3), the Web Ontology Language (described in detail in Subsection 2.1.4), and rules (described in Subsection 2.1.5).

Figure 2.2: Internet as a maze 2. The Semantic Web as an enabling technology to get from a searchable Web to a navigable knowledge space.


2.1.1 Ontologies

The philosophical discipline of ontology is the study of what kinds of things exist – what entities or 'things' there are in the universe (Blackburn, 2007). Computer science has borrowed the term, where it is now used in the narrower sense of a special kind of knowledge representation model in the subfield of artificial intelligence. Ontologies consist of concepts, the relations between them and instances of the concepts. Additionally, they can make use of logical axioms and restrictions. Ontology languages provide inference and integrity rules for ontologies, that is, rules that are used to make implicit knowledge stored in the ontology explicit as well as to guarantee its logical validity. The most cited definition of ontology in computer science is the following one by Gruber: 'the specification of conceptualisations, used to help programs and humans share knowledge' (Guarino et al., 1993). Figure 2.4 schematically shows the most important building blocks of an ontology. Ontologies are used for different purposes. Some of the most important application cases are:

Figure 2.3: The Semantic Web layers. A layered approach to the Semantic Web. The included technologies are: URI: a URI is a compact string of characters that identifies an entity accessible on the Internet; the Uniform Resource Locator (URL) of a website is an example of a URI. Unicode: an industry standard which provides a unified system for representing textual data. XML: the Extensible Markup Language is a general-purpose specification for creating custom markup languages; XML is recommended by the World Wide Web Consortium and, with regard to the Semantic Web, is used for the serialization of RDF, RDFS, OWL and SWRL, for example. Namespaces: abstract containers that are used for the grouping of URIs. RDF core: the Resource Description Framework (see Subsection 2.1.2). RDF Schema: the Resource Description Framework Schema (see Subsection 2.1.3). DLP: Description Logic Programs or Description Logic Programming. OWL: the Web Ontology Language (see Subsection 2.1.4). Rules: while an ontology describes a set of objects in a machine-readable way, rules describe how to infer new information from an ontology (see Subsection 2.1.5). SPARQL: a query language for RDF; SPARQL allows a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns. Logic framework: for example OWL API (see Subsection 2.7.2), JENA2 (see Subsection 2.7.1) and DEEP SEMANTICS (see Chapter 3). Proof: the valid derivation of the correctness (verification) or incorrectness (falsification) of a statement from true premises. Signature: an electronic signature links data with electronic information, which enables the identification of the signer and allows checking the integrity of the signed electronic information. Encryption: the process by which readable text (or another kind of information, typically converted into a string of 0's and 1's) is converted with the help of an encoding procedure into an 'illegible' (not simply interpretable) character sequence. Trust: trust in the semantic applications, in the Semantic Web and in all of its actors is the last and most abstract layer. Source: Tim Berners-Lee. Web for real people, 2005. URL: http://www.w3.org/2005/Talks/0511-keynote-tbl/

• A knowledge base for a community – context and definitions in an ontology make meaning explicit and support knowledge sharing.

• A controlled vocabulary for indexing/annotating data, enabling semantic search, data integration and knowledge access.

• A knowledge base for a computer program (an expert system for example). The Semantic Web and its Concepts 13

Figure 2.4: Schematic representation of the constructs of an ontology. The example is based on the BIO2ME ontology, which is further described in Section 4.2. Rectangles symbolize ontology classes and concepts, respectively. Green circles represent instances of the classes they are connected with. Solid arrows between classes indicate their hierarchical relations, pointing to the subclasses. The dotted arrows represent custom-defined object property relations between classes.

2.1.2 The Resource Description Framework (RDF)

The Resource Description Framework (RDF) is a general-purpose language for representing information or metadata (data about data) about Web resources. It emerged from the Semantic Web effort of the World Wide Web Consortium and allows multiple metadata schemes to be read by humans as well as parsed by machines. The RDF data model is constructed out of statements in the form of subject-predicate-object expressions, called triples in RDF terminology. An example of such a triple is the statement that the resource insulin is of type Protein, with insulin as subject, rdf:type as predicate and Protein as object. RDF offers a simple but useful semantic model based on directed graph structures, in which both nodes and edges are marked with unique designators. RDF exists in two common serialization formats: an XML format and the so-called Notation 3 (N3). A resource in the sense of RDF is everything that can be identified by a Uniform Resource Identifier (URI). URIs can be used, for example, to name web pages, images or abstract concepts. RDF is a modeling language for defining statements about these resources and the interrelations among them.
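To make the triple structure concrete, the following short RUBY sketch models a statement as a plain subject-predicate-object value object. It is illustrative only; the example URIs are invented:

    # Illustrative sketch: an RDF statement as a subject-predicate-object triple.
    Triple = Struct.new(:subject, :predicate, :object)

    statement = Triple.new(
      "http://example.org/insulin",                      # subject
      "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", # predicate (rdf:type)
      "http://example.org/Protein"                       # object
    )

    puts [statement.subject, statement.predicate, statement.object].join(" ")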

2.1.3 The Resource Description Framework Schema (RDFS)

RDF Schema (RDFS) is RDF's vocabulary description language, according to the W3C Semantic Web Activity Statement:

"RDF's vocabulary description language, RDF Schema, is a semantic extension of RDF. It provides mechanisms for describing groups of related resources and the relationships between these resources. RDF Schema vocabulary descriptions are written in RDF using the terms described in this document. These resources are used to determine characteristics of other resources, such as the domains and ranges of properties." From the "view" of a computer system, the URI designators introduced by the user are simply character strings without any meaning beyond that. Additional semantic assertions therefore have to be added in order to enable computer systems to draw conclusions that are based on this type of human background knowledge. By means of RDFS it is possible to specify such schema knowledge with the terms used in a vocabulary. RDFS thereby helps users to define the proper use of a vocabulary composed in RDF. Although an RDF Schema isn't required for valid RDF, it does help to prevent confusion when sharing a vocabulary. The possibility of modeling schematic knowledge makes RDFS an ontology language with which it is possible to describe a whole number of semantic dependencies occurring in a domain. RDFS also has its boundaries; thus it is called a lightweight ontology language. For more demanding use cases, more expressive ontology languages like OWL are necessary, which, however, come along with longer inference runtimes. RDFS consists of definitions about what classes and properties are and how they can be further described. RDFS classes can be considered as groups of RDF resources. The members of a class are known as instances of this class. Although being groups of resources, classes are themselves resources, too.

2.1.4 Web Ontology Language (OWL)

The Web Ontology Language (OWL) is a set of knowledge representation languages administered by the W3C. OWL has been specifically designed to be used for Semantic Web applications that need support for sharing knowledge and meaning between machines and humans. OWL is built on RDF and RDFS and adds more vocabulary for describing properties and classes, like cardinality constraints (for example "at least one"), disjointness between classes or characteristics of properties (for example transitivity). OWL is available in three different expressiveness levels: OWL LITE, OWL DL and OWL FULL.

OWL FULL

OWL FULL is the most expressive sublanguage of OWL. OWL FULL provides all the language constructs of OWL DL and OWL LITE but drops their restrictions and ignores decidability issues. It permits classes to be treated simultaneously as both collections of instances and individuals. In addition, a given datatype property can be specified as being inverse-functional, thus enabling, for example, the specification of a string as a unique key.

OWL DL

OWL DL puts certain restrictions on the available OWL constructs: The Semantic Web and its Concepts 15

Figure 2.5: Screenshot of the RDF/XML serialization of a simple RDF Schema for the description of resources related to animals and plants.

• OWL DL requires a pairwise separation between classes, datatypes, datatype properties, object properties, annotation properties, ontology properties, individuals, data values and the built-in vocabulary. This means that, for example, a class cannot be at the same time an individual or a property.

• Because of the disjointness of datatype properties and object properties, the property characteristics "inverse of", "inverse functional", "symmetric", and "transitive" cannot be specified for datatype properties.

• Cardinality constraints cannot be asserted for transitive properties, their inverse properties or any of their superproperties.

• As the sets of object properties, datatype properties, annotation properties, and ontology properties have to be mutually disjoint, an annotation property cannot be at the same time a datatype or object property.

OWL LITE

OWL LITE has the weakest expressivity of the three sublanguages of OWL. It primarily supports the construction of hierarchies of classes and properties as well as simple constraints. For example, while it supports cardinality constraints, it only permits cardinality values of 0 or 1. OWL LITE imposes additional restrictions on the available OWL constructs:

• OWL LITE requires a pairwise separation between classes, datatypes, datatype proper- ties, object properties, annotation properties, ontology properties, individuals, data values and the built-in vocabulary. This means that, for example, a class cannot be at the same time an individual or a property.

• Because of the disjointness of datatype properties and object properties, the property characteristics inverse of, inverse functional, symmetric and transitive cannot be specified for datatype properties.

• Cardinality constraints cannot be asserted for transitive properties, their inverse properties or any of their superproperties.

• As the sets of object properties, datatype properties, annotation properties and ontology properties have to be mutually disjoint, an annotation property cannot be at the same time a datatype or object property.

• OWL LITE forbids the use of enumerations (owl:oneOf), unions (owl:unionOf), complements (owl:complementOf), value restrictions (owl:hasValue) and disjointness between classes (owl:disjointWith).

2.1.5 Rules

Rules play an important role in artificial intelligence expert system shells and in knowledge-based systems, respectively, as a means to infer new facts from a set of known facts. Ontology languages such as RDFS and OWL are primarily designed to describe the concepts in a knowledge domain. They offer language constructs to describe classes (concepts) and properties (attributes and relationships, respectively), as well as constructs to capture certain restrictions on the use of properties and to define complex classes using set operators, for example. Although ontology languages have some built-in inference rules to deduce new facts on the basis of hierarchical is-a relations ("If A is a B and B is a C, then A is a C."), they do not allow the explicit definition of custom rules in order to synthesize new facts from those stored in the knowledge base. To overcome this shortcoming of ontology languages, rule languages like SWRL were designed to specify data transformation rules.

SWRL

The Semantic Web Rule Language (SWRL) is a W3C proposal for a rule language that is based on a combination of OWL (DL and LITE) with the Unary/Binary Datalog RuleML sublanguages of the Rule Markup Language (Horrocks et al., 2003b). SWRL rules can express knowledge which is not expressible in OWL and extend the set of OWL axioms to include Horn-like rules. SWRL rules take the form of an implication between an antecedent (body) and a consequent (head). An example of a SWRL rule is the following:

hasParent(?x, ?y) AND hasBrother(?y, ?z) ⇒ hasUncle(?x, ?z)

Explanation: if x has parent y and y has brother z, then x has uncle z.
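As a hedged illustration of what such a Horn-like rule computes, the following plain RUBY fragment applies the uncle rule to toy fact sets. It is not SWRL machinery, and all names are invented:

    # Illustrative sketch: applying the uncle rule above to toy fact sets.
    has_parent  = { "anna" => ["bert"] } # hasParent(anna, bert)
    has_brother = { "bert" => ["carl"] } # hasBrother(bert, carl)

    has_uncle = Hash.new { |hash, key| hash[key] = [] }
    has_parent.each do |x, parents|
      parents.each do |y|
        (has_brother[y] || []).each { |z| has_uncle[x] << z }
      end
    end

    p has_uncle # => {"anna"=>["carl"]}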

2.2 Description Logic

Description logics (Baader et al., 2003) are a family of knowledge representation languages. Most description logics are subsets of first-order logic; contrary to full first-order logic, however, they are decidable. This makes it possible to reason over a description logic knowledge base and infer new knowledge from the available knowledge. Usually a description logic knowledge base is formally divided into a terminological box (TBox) and an assertional box (ABox). The TBox contains the knowledge about the concepts of a domain, including the possible relations between the concepts – the terminological knowledge. The ABox, in contrast, contains the knowledge about the entities or instances of these concepts as well as their relations among themselves, and represents the status of the modeled world. Description logic is of great importance for ontology languages like OWL DL and therefore for the Semantic Web.
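The following toy RUBY sketch makes the TBox/ABox separation concrete; the ontology entities are invented for illustration:

    # Illustrative sketch: terminological knowledge (TBox) versus assertional
    # knowledge (ABox) for a tiny, made-up biomedical domain.
    tbox = {
      classes:     ["Macromolecule", "Protein", "BiologicalFunction"],
      subclass_of: { "Protein" => "Macromolecule" },
      properties:  { "hasFunction" => { domain: "Protein",
                                        range:  "BiologicalFunction" } }
    }

    abox = {
      instances: { "insulin"                => "Protein",
                   "blood_sugar_regulation" => "BiologicalFunction" },
      relations: [["insulin", "hasFunction", "blood_sugar_regulation"]]
    }

    # The TBox states what can be said; the ABox states what is the case.
    puts tbox[:subclass_of]["Protein"] # => Macromolecule
    puts abox[:instances]["insulin"]   # => Protein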

2.3 The ONTOVERSE Project

The ONTOVERSE consortium consists of several scientific and economic project partners. These partners jointly develop a web-based platform for collaborative ontology engineering and management in the life sciences. The ONTOVERSE project aims to establish a platform that provides tools for designing ontologies and helps scientists to build social networks. It therefore comprises support for collaborative ontology engineering (a collaborative ontology editor), an ontology-based publication management system and solutions for knowledge exchange in scientific communities.

The idea for DEEP SEMANTICS originated within the development of ONTOVERSE, a cooperative project within the promotional focus of the German Federal Ministry of Education and Research on eScience and knowledge management. The central objective of that project, of which I was one of the initiators, is the development of a new, internet-based application for cooperative and interdisciplinary ontology building.

Originally, DEEP SEMANTICS was intended to be used in cooperation with the rapid web development framework RUBY ON RAILS for the implementation of an ontology editing interface supporting the insertion of automatically extracted concepts and instances into the ontology. During the project new requirements and technical results arose that gave a new direction to the development of DEEP SEMANTICS: instead of supporting both schema and facts processing (more precisely, TBox and ABox editing – see Section 2.2), DEEP SEMANTICS's design priority was put solely on facts manipulation. The reasons for that decision were:

1. While the design principles used (deep integration and metaprogramming) are ideal for the development of systems that are based on static concept schemata and intended for facts manipulation, additional schema editing capabilities cannot benefit from them.

2. Concept schema manipulation requires a fundamentally different implementation approach than support for facts manipulation does. Therefore, two different types of implementation would have been required, counteracting the idea of an easy-to-use framework.

3. The version of DEEP SEMANTICS realized in this work supports facts editing, which gained the highest priority in order to enable the development of IKEN and of another application called BIO2ME, which exemplifies the advantages of DEEP SEMANTICS for application development in bioinformatics.

2.4 Bio-ontologies

Soldatova & King (2007) discuss the formalization of scientific knowledge, particularly ontology engineering for biological applications. Bodenreider & Stevens (2006) review current trends and future directions of bio-ontologies and their applications within biomedicine. Key points of the aforementioned publications are the following:

• Use of ontologies within biomedical domains is already mainstream.

• The necessity of ontologies in order to be able to compute with the knowledge components in biology and medical research is recognized. This also becomes apparent from the growing list of specialized bio-ontology centers like the German Institute for Formal Ontology and Medical Information Science (IFOMIS) 1 or the National Center for Biomedical Ontology 2, which is one of the National Centers for Biomedical Computing in the United States of America.

• While particularly in biology ontologies are often used as controlled vocabularies for describing data (Avraham et al., 2008), there is a tendency to use increasingly complex formalisms (Stevens et al., 2007).

1 http://www.ifomis.org
2 http://bioontology.org

2.4.1 Open Biomedical Ontologies (OBO)

The Open Biomedical Ontologies (OBO) (Smith et al., 2007) is an umbrella consortium that provides ontologies for shared use across different biomedical domains. In November 2008, 54 ontologies were available via the OBO website. The OBO Foundry has the goal of creating a suite of orthogonal – that means complementary – interoperable reference ontologies in the biomedical domain. OBO ontologies are serialized in the OBO format or in OWL, with OBO ontologies being mappable to OWL (Golbreich et al., 2007).

One of the most important ontologies listed by OBO is the GENE ONTOLOGY (Ashburner et al., 2000). The GENE ONTOLOGY currently 3 contains 27,769 terms ordered in three sub-ontologies for gene products: biological processes, molecular functions, and cellular components. As a controlled annotation vocabulary the GENE ONTOLOGY has become a de facto standard for many biomedical databases. However, it has to be mentioned that, beside its great success story and huge merit for bioinformatics and the life sciences, the logical complexity of the GENE ONTOLOGY is rather low. The GENE ONTOLOGY uses only is-a, part-of, regulates, positively_regulates and negatively_regulates relations to connect concepts. It makes no use of the sophisticated logical constructs (for example constraints on properties, or set operators and axioms, respectively) provided by state-of-the-art ontology languages like OWL.

2.5 Multimedia Content and the Semantic Web

The production and consumption of multimedia content is steadily growing, which makes semantics-based multimedia content indexing and retrieval essential for effective multimedia management. The next two subsections introduce ontology-based multimedia content indexing and retrieval to give some background information for Chapter 4.

2.5.1 Ontology-Based Multimedia Content Indexing

Semantic content annotation is the basis for semantic content retrieval (Halaschek-Wiener et al., 2005). Annotating (or indexing) is the process of adding content-descriptive keywords to multimedia content (for example pictures, videos, music or documents). If no underlying vocabulary is provided (as in social tagging systems), the semantics of keywords – their meanings and connections – are not represented. An ontology can capture these semantics, for example words that have the same meaning (synonyms), concepts that are parts of others (meronymy, part-of relation), sub-concepts (hyponymy, is-a relation) or various self-defined relations (for example isLocatedIn, hasColour).

3 Release of December 1st, 2008

2.5.2 Ontology-Based Multimedia Content Retrieval

Semantic image retrieval promises a more precise search for documents than retrieval based on non-semantic keywords or full texts. A traditional keyword-based system searching for "blue car" would retrieve all images that are labeled with "blue" as well as with "car"; the result would include a picture of a blue bicycle next to a red car. Yet an underlying ontology can capture the fact that blue is a color which may be a property of several objects such as cars.
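A minimal RUBY sketch of this point (a toy example with invented data, not IKEN code): when annotations record which object carries which colour, a query for a blue car no longer matches an image of a blue bicycle next to a red car.

    # Illustrative sketch: structured, ontology-style annotations versus flat tags.
    images = [
      { id: 1, objects: [{ type: "car", colour: "blue" }] },
      { id: 2, objects: [{ type: "car",     colour: "red"  },
                         { type: "bicycle", colour: "blue" }] }
    ]

    # Flat keyword search: image 2 matches because both tags occur somewhere.
    def keyword_search(images, *tags)
      images.select do |img|
        values = img[:objects].flat_map { |o| [o[:type], o[:colour]] }
        tags.all? { |tag| values.include?(tag) }
      end
    end

    # Semantics-aware search: the colour must belong to the searched object.
    def semantic_search(images, type, colour)
      images.select do |img|
        img[:objects].any? { |o| o[:type] == type && o[:colour] == colour }
      end
    end

    p keyword_search(images, "blue", "car").map { |img| img[:id] }  # => [1, 2]
    p semantic_search(images, "car", "blue").map { |img| img[:id] } # => [1]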

2.6 The PELLET Reasoner

PELLET (Parsia & Sirin, 2004) is an open source JAVA OWL DL inference engine that can be used standalone or combined with frameworks like JENA2 and OWL API. PELLET implements tableau algorithms that manipulate description logic expressions. Inference engines are primarily used to determine whether an ontology is logically consistent and/or to infer all implicit facts included in a consistent ontology. In this thesis PELLET was used in the realization of the semantic image management application IKEN, described in Chapter 4, to check consistency before an ontology is used with DEEP SEMANTICS. Other OWL reasoners that could have been used are, for example, FACT++ (Tsarkov & Horrocks, 2006) and RACER (Haarslev & Möller, 2001).

2.7 Semantic Web Frameworks

The three Semantic Web frameworks described in the following subsections are used as references for a comparison with DEEP SEMANTICS in Chapter 3 (see Section 3.8).

2.7.1 JENA2

JENA2 is a JAVA framework (Carroll et al., 2004) for the development of Semantic Web applications. JENA2 is an open source project, originally developed at the HP Labs Semantic Web Programme (for further details of this programme see: http://www.hpl.hp.com/semweb/). It supports a wide range of Semantic Web technologies like RDF, RDFS, SPARQL and OWL. JENA2 offers inference functionalities and can easily be used together with external reasoners like PELLET. JENA2 transforms an OWL/RDF ontology into an object-oriented abstract data model that enables programmatic access to ontology constructs via an API – which lowers its usability for developers who are not experts in ontologies and semantic technologies. To give a first insight into the API, the following list comprises some of its most important classes and JAVA interfaces:

• ModelFactory: Provides methods for reading ontology data and creating InfModel and OntModel objects.

• InfModel: JENA2 representation object for an RDF/RDFS model including inferred triples (implicit facts are made explicit using inference mechanisms).

• OntModel: This JAVA interface provides convenience methods for accessing OWL language constructs. OntModel can be used together with base or inferred models.

• OntClass: Ontology classes are converted into instances of OntClass.

• OntProperty: OWL datatype properties and object properties are translated into instances of OntProperty.

2.7.2 OWL API

The OWL API (Bechhofer et al., 2003; Horridge & Bechhofer, 2007) is an open source JAVA framework for the processing of OWL. Similar to the JENA2 framework, ontology constructs are accessible via an API. This in turn – also analogous to JENA2 – has the disadvantage that software developers have to deal with this API in addition to OWL. One of the main distinguishing factors between OWL API and JENA2 is their conceptual processing paradigms. While JENA2 manages the ontology as RDF graphs, OWL API is oriented towards the OWL abstract syntax (Peter et al., 2004). OWL API is found at the core of many Semantic Web tools, for example in version 4.0 of the ontology editor PROTÉGÉ (Knublauch et al., 2004) and the Semantic Web editor SWOOP (Kalyanpur et al., 2006), and supports fundamental tasks such as reading, saving, and manipulating ontologies. For inference support, OWL API currently offers access to the reasoners PELLET and FACT++. Important classes and JAVA interfaces are:

• OWLOntologyManager: This interface is the central access point of the API and is used to load, create and access ontologies.

• OWLOntology: Comprises a set of axioms making up the ontology. These axioms then reference the ontological building blocks: classes, properties and instances.

• OWLClassAxiom: Represents an OWL class axiom as defined in the abstract syntax (Peter et al., 2004) – see also Section 3.3 for details about the structure of the OWL abstract syntax.

• OWLClass: While the previous interface models class axioms, OWLClass represents actual OWL ontology classes. It can be used, for example, to access a class's subclasses, instances and disjoint classes.

2.7.3 ACTIVERDF

Oren & Delbru (2006) describe the Semantic Web framework ACTIVERDF which is implemented in RUBY. This framework utilizes RUBY's metaprogramming capabilities in order to avoid the implementation of an abstract and – with respect to software development – complicating API. Instead, an ontology's classes, properties and instances are transformed into RUBY

classes, RUBY instance methods and RUBY instances. Thereby, this framework provides object-oriented access to RDF data, considerably reducing programming complexity. Although OWL ontologies can be processed as plain RDF data, specific OWL language features, like logical constraints, are not supported. This restricts the possible application areas of ACTIVERDF significantly.
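The following minimal usage sketch illustrates this object-oriented access style; it is based on the examples given by Oren & Delbru (2006), with the data source options and the FOAF resource chosen for illustration only (the exact option keys may vary between ACTIVERDF versions):

    require 'active_rdf'

    # Register a data source and a namespace abbreviation.
    ConnectionPool.add_data_source(:type => :rdflite, :location => 'people.db')
    Namespace.register(:foaf, 'http://xmlns.com/foaf/0.1/')

    # Turn the RDF Schema classes found in the store into RUBY classes.
    ObjectManager.construct_classes

    # RDF resources now behave like ordinary RUBY objects;
    # RDF properties appear as instance methods.
    alice = FOAF::Person.new('http://example.org/alice')
    alice.name = 'Alice'
    puts alice.name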

2.8 The Dynamic Programming Language RUBY

DEEP SEMANTICS as well as large parts of the IKEN project were developed in the dynamic programming language RUBY. In contrast to compiled programming languages, dynamic programming languages are not compiled into machine code but interpreted at runtime. This offers possibilities to modify the code base during runtime, enabling reflection and metaprogramming. RUBY is a multi-paradigm programming language allowing procedural, object-oriented and functional programming.

The creator of the programming language RUBY, Yukihiro Matsumoto, stated⁴ that RUBY is designed for programmer productivity and ease-of-use, making it well suited for the implementation of the easy to apply Semantic Web framework developed in this dissertation. RUBY includes aspects of the languages PERL (Christiansen & Torkington, 2003), SMALLTALK (Goldberg & Robson, 1989), EIFFEL (Meyer, 1992), ADA (International, 1995) and LISP (Graham, 1995). RUBY's characteristic that everything is an object, for example, is inherited from SMALLTALK, while its syntax largely resembles that of PERL. Further language features are, amongst others, automatic garbage collection, a large standard library and operator overloading: operators like "+", "=", or "==" can have different implementations depending on the types of their arguments. The following subsections give some background information for the implementation details of DEEP SEMANTICS covered in the next chapter.
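As a minimal illustration of operator overloading (the example is mine and unrelated to DEEP SEMANTICS), the following class defines its own "+" and "==" operators:

    class Vector2D
      attr_reader :x, :y

      def initialize(x, y)
        @x, @y = x, y
      end

      # "+" is an ordinary method in RUBY and can be defined per class.
      def +(other)
        Vector2D.new(@x + other.x, @y + other.y)
      end

      def ==(other)
        other.is_a?(Vector2D) && @x == other.x && @y == other.y
      end
    end

    sum = Vector2D.new(1, 2) + Vector2D.new(3, 4)
    puts sum == Vector2D.new(4, 6)   # => true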

2.8.1 RUBY Classes, Objects, and Variables

RUBY is a completely object-oriented language. Everything is an object: instances of a particular class are objects, and these classes are themselves objects of type Class. RUBY classes can always be extended by adding new methods or modifying existing ones – even at runtime, which is described in Subsection 2.8.3. Variables hold references to objects, not the objects themselves. This was useful for the implementation of DEEP SEMANTICS because it reduces the amount of required main memory, which otherwise could grow exponentially, depending on the relations defined in a particular ontology. Variables can be defined in different scopes, ranging from local variables – having the most limited scope – over instance variables and class variables to global variables (which are accessible from everywhere in the program).
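The following self-contained example (my own, for illustration) shows how an existing class can be reopened and extended at runtime:

    class Tool; end

    tool = Tool.new

    # Reopen the class at runtime and add a new instance method.
    class Tool
      def description
        "a generic tool"
      end
    end

    # The previously created instance immediately responds to the new method.
    puts tool.description   # => "a generic tool"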

⁴ http://www.artima.com/intv/ruby4.html

2.8.2 RUBY Modules

In object-oriented languages, inheritance allows programmers to create a class that is a specialization of another class. Though RUBY does not support multiple inheritance, classes can import modules, so-called mixins. Modules have been important for carrying over the multiple inheritance provided by OWL LITE into DEEP SEMANTICS (see Chapter 3). Modules can be used to group methods, classes and constants. They provide a namespace, which is helpful for avoiding name clashes. Modules cannot be instantiated directly but have to be included by a class first.
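A minimal mixin example (mine, for illustration) shows how a class can combine the functionality of several modules:

    module Storable
      def store
        "storing #{self.class.name}"
      end
    end

    module Printable
      def print_out
        "printing #{self.class.name}"
      end
    end

    # A class can mix in several modules, emulating multiple inheritance.
    class Report
      include Storable
      include Printable
    end

    report = Report.new
    puts report.store       # => "storing Report"
    puts report.print_out   # => "printing Report"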

2.8.3 Reflection and Metaprogramming

Metaprogramming means that program code is produced by other program code. RUBY offers mechanisms for metaprogramming, for example by providing the methods module_eval and define_method. A weakened form of metaprogramming is reflection. Reflection means that a program knows its own structure and can inspect it at runtime; values can be modified, but the program structure remains fixed. Without metaprogramming, deep integration of OWL into RUBY, or similar dynamic programming languages, is not possible. Therefore, metaprogramming is one of the central concepts of the next chapter.
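The following sketch (my own simplified illustration, not DEEP SEMANTICS code) uses define_method to generate a getter/setter pair at runtime, similar in spirit to what a deep integration builder has to do for every ontology property; the property name and value are hypothetical:

    class Thing; end

    property_name = 'isUsedInProgram'   # hypothetical ontology property

    Thing.class_eval do
      # Generate a getter for the property.
      define_method(property_name) do
        instance_variable_get("@#{property_name}")
      end

      # Generate the corresponding setter.
      define_method("#{property_name}=") do |value|
        instance_variable_set("@#{property_name}", value)
      end
    end

    thing = Thing.new
    thing.isUsedInProgram = 'BLAST'
    puts thing.isUsedInProgram   # => "BLAST"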

2.9 Deep Integration

The process of converting an OWL ontology into functional code is called deep integration. In the context of the Semantic Web the term deep integration was coined by Obie Fernandez (Fernandez, 2005). It represents the idea of an integration of OWL with a dynamically typed language like RUBY or PYTHON that goes beyond conventional OWL libraries like OWL API or JENA2 for the JAVA programming language. The main idea of deep integration is to integrate OWL ontologies into a RUBY library (in the case of DEEP SEMANTICS) in such a way that the integrated ontology exists at the same level of implementation as other functional libraries.

Instead of using a library like JENA2 to deal with OWL via a dedicated API, the deep integration process of DEEP SEMANTICS extends RUBY's set of available libraries by making the ontology, including its logic, available as natural RUBY objects. This approach enables RUBY developers to use the logic programming paradigm (Grosof, 2003) through the utilization of Description Logics. Thus, developers are able to use OWL constructs in addition to normally defined classes.
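A toy version of this idea (my own illustration, not actual DEEP SEMANTICS code) shows how an ontology class name can be turned into a real RUBY class that lives inside a module acting as the ontology's namespace:

    module MiniOntology
      # Create a RUBY class for the given ontology class name and
      # register it as a constant inside this module.
      def self.integrate_class(local_name)
        const_set(local_name, Class.new)
      end
    end

    MiniOntology.integrate_class('Program')

    # The ontology class is now a first-class RUBY citizen.
    program = MiniOntology::Program.new
    puts program.class   # => MiniOntology::Program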

3

DEEP SEMANTICS

The development of the DEEP SEMANTICS framework constitutes the main part of this thesis and is described in detail in this chapter. DEEP SEMANTICS has been designed to foster the use of semantic technologies by providing a framework that converts a given OWL ontology into a functional RUBY code representation. The architecture of the DEEP SEMANTICS framework has been prepared to support conversion of the OWL expressivity levels OWL LITE and OWL DL. The current version 0.9 of DEEP SEMANTICS, however, works exclusively with OWL LITE ontologies, as the implementation of OWL DL support is still in alpha state.

The deep integration approach as implemented in DEEP SEMANTICS is exemplified by Figure 3.1, which schematizes the conversion of an OWL class Program into a deep-integrated RUBY representation of this class. The local name of the OWL class is used as the name of the corresponding RUBY class. For the class Program a subClassOf relation to the OWL class Tool is stated. The deep integration process has to consider this superclass-subclass relation in such a way that all attributes as well as their associated setter and getter methods are inherited by the subclass. Program inherits the getter methods isPredecessorOf, isSuccessorOf and isUsedInProgram from Tool.
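Written out by hand, the result of this conversion could look as follows; the real classes are meta-programmed at runtime and carry additional bookkeeping, so this is only a schematic sketch of Figure 3.1:

    # Simplified, hand-written equivalent of the generated classes.
    class Thing; end

    class Tool < Thing
      # In the generated code these accessors are produced via
      # metaprogramming and enforce the ontology's constraints.
      attr_accessor :isPredecessorOf, :isSuccessorOf, :isUsedInProgram
    end

    # Program inherits all attributes and accessor methods from Tool.
    class Program < Tool
    end

    blast = Program.new
    blast.isUsedInProgram = 'some pipeline'   # value is illustrative
    puts blast.isUsedInProgram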

3.1 Consistency Safeness

DEEP SEMANTICS is consistency safe. In this thesis consistency safeness is defined as follows: an OWL ontology processing software is consistency safe if it is not able to generate logical inconsistencies during runtime. Consistency safeness is an important feature for ontology-based applications that modify the facts of an ontology, e.g. systems that use ontologies for semantic annotation (like IKEN described in Chapter 4) or for semantic decision support.
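At the level of a single setter method, consistency safeness can be illustrated as follows (my own simplified sketch, not the actual DEEP SEMANTICS implementation; the class and property names are hypothetical): the setter checks the relevant constraint before modifying the fact base and rejects any update that would introduce an inconsistency.

    class FunctionalPropertyError < StandardError; end

    class Person
      # hasBirthPlace is assumed to be functional:
      # at most one value may be stored at any time.
      def hasBirthPlace
        @hasBirthPlace
      end

      def hasBirthPlace=(value)
        if !@hasBirthPlace.nil? && @hasBirthPlace != value
          raise FunctionalPropertyError,
                'hasBirthPlace is functional and already has a value'
        end
        @hasBirthPlace = value
      end
    end

    p = Person.new
    p.hasBirthPlace = 'Duesseldorf'
    p.hasBirthPlace = 'Cologne' rescue puts 'inconsistent update rejected'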

DEEP SEMANTICS is the first consistency safe Semantic Web framework for OWL LITE using deep integration. The currently most widely used frameworks, JENA2 and OWL API, avoid the problem of consistency safeness by providing reasoning capabilities that detect inconsistencies.

Figure 3.1: Conversion of an OWL class Program into a RUBY class. Schematic illustration of the conversion of an OWL class (1) into an object-oriented RUBY representation (2). The RUBY class is shown here in UML notation. The URI definition of the OWL class as well as its superclass dependency is marked by a blue border. The corresponding conversion is indicated by a blue arrow. Attributes and instance methods that have been inherited by superclass Tool are shown in bold letters. OWL and UML components marked in orange relate to OWL properties, for which class Program is defined as OWL domain.

No other consistency safe framework has been published yet. Frameworks like JENA2 and OWL API aim at supporting any kind of TBox modification; therefore they cannot be consistency safe per definition.

Figure 3.2: Top-level architectural diagram of DEEP SEMANTICS including input and output values. The director controls the deep integration process, in which ontology triples consisting of subjects (S), predicates (P) and objects (O) are mapped to RUBY code. Source adapters are used to access the ontology triples from a specified resource, while the triple parser is responsible for the correct conversion of the triples into RUBY objects. The deep integration builder integrates these RUBY objects into a functional ontology model.

3.2 Architecture

DEEP SEMANTICS' architecture is based on the Builder design pattern (Freeman et al., 2004). Using this pattern, one separates the construction of a complex object from its representation so that the same construction process can create different representations. The decision to use this pattern was motivated by the initial intention to use DEEP SEMANTICS both for the development of an ontology editor (as part of the ONTOVERSE project) and as a framework for rapid Semantic Web development (i.e. its current use case). While its application for the ontology editor in ONTOVERSE was not followed up, DEEP SEMANTICS has been extended to a point where it can be successfully applied in Semantic Web development projects like BIO2ME or IKEN. The Builder pattern based architecture is in this context important for future maintenance and enhancement, especially with regard to a potential transition of DEEP SEMANTICS from a one-man development project to an open source project (see the discussion chapter on the future of DEEP SEMANTICS).

Figure 3.2 gives an overview of the architecture of DEEP SEMANTICS. It comprises the main building blocks of the framework: the Source Adapter, the Director, the Triple Parser and the Deep Integration Builder. As indicated by the circled numbers in the figure, the processing pipeline basically comprises five steps that are briefly described in the following (for a more detailed description see Sections 3.4 to 3.6):

1. Input for DEEP SEMANTICS is always an OWL ontology in N-Triples format. This ontology can be stored in a file or a database table.

2. Director coordinates the data flow in the processing pipeline. At first it instructs the Source Adapter to read in the ontology triples, depending on the type of source (file or database). The Source Adapter returns an object triples_set of class TripleSet containing normalized serializations of the read triples.

3. Then Director passes the set of triples to the Triple Parser module. Triple Parser converts the triples into objects of type TemplateValues. These template value objects are collected and returned to the Director according to their affiliation to the TBox or ABox of the ontology.

4. In the next step Director pipes these template values into the selected Deep Integration Builder – Figure 3.2 shows the Deep Integration Builder for OWL LITE. The Deep Integration Builder takes the template values and converts the corresponding ontology definition, ontology classes, properties and instances – at first all TBox constructs (classes and property axioms) and then the ABox instances. After every ontology construct is converted, the final deep integration step is triggered: the assembly of the functional ontology model in RUBY.

5. A functional ontology model in RUBY is the final result of the DEEP SEMANTICS pro- cessing pipeline. This functional ontology model can then be applied in RUBY programs to access and operate on integrated ontologies.

As mentioned above, this architecture was designed to allow different implementations of its subparts without having to re-implement the whole system, and to be able to provide alternative versions for every subpart. For the current version of DEEP SEMANTICS this means that two different Source Adapter implementations are available – for N-Triples files and for N-Triples stored in a database table. An additional version of the Source Adapter could be, for example, an adapter for RDF/XML serialized ontologies.

Concerning the Deep Integration Builder, a version for OWL DL ontologies is already prepared up to the parsing of the corresponding triple sets. What is missing is the conversion of ontology classes and properties into RUBY classes and objects, respectively. The final implementation of a Deep Integration Builder for OWL DL is, however, not a subject of this thesis, but will be discussed in Section 3.9.

Figure 3.3: UML class diagram of the DEEP SEMANTICS main classes. Green rectangles indicate RUBY classes while blue ones represent RUBY modules. The central class in DEEP SEMANTICS is Director. This class is related to SourceAdapter, TripleParser, OntologyInformation and DeepIntegrationBuilderOWLLite as well as DeepSemantics.

3.2.1 Implemented RUBY classes and modules

While Figure 3.2 shows a top-level view of the architecture, this subsection covers the details of the RUBY classes and modules DEEP SEMANTICS consists of. Figure 3.3 is a UML class diagram of the main classes of DEEP SEMANTICS. The pivotal class is DeepSemantics. Its main function is to generate an object of type Director using the method set_director(director_options). Setting up the Director, one can specify which kind of ontology adapter should be used, what the access path to the ontology is and which kind of builder should be used. As indicated by the arrow connecting Director and DeepSemantics, class Director is compositionally integrated into class DeepSemantics – the term "compositionally" refers to a composition relation in UML class diagrams between a container class and a contained class. As it is the main functionality of Director to coordinate the flow and processing steps of ontology data, this class compositionally integrates exactly one instance of each of the following classes: SourceAdapter, TripleParser, OntologyInformation and DeepIntegrationBuilderOWLLite. As described before, instances of SourceAdapter are used to access the respective ontology data source. SourceAdapter currently has two subclasses, FileAdapter and DBAdapter. OntologyInformation is required to save the corresponding adapter, builder, source path and triple parser for one ontology. It supports the application of DEEP SEMANTICS together with two or more ontologies. The task of TripleParser is to transform ontology triples into template values. To accomplish this task it uses the modules AvailableOntologyProperties and AvailableAnnotationProperties

Figure 3.4: UML class diagram of the DEEP SEMANTICS helper classes. Classes Thing, Ghost, Re- striction, DatatypeProperty and ObjectProperty directly relate to corresponding ontology constructs. Green rectangles indicate RUBY classes while blue ones represent RUBY modules.

that provide lists of the ontology and annotation properties that have to be considered. DeepIntegrationBuilderOWLLite is a subclass of Builder. Its task is to use the template values to build up a functional model of the ontology in RUBY. The module OntologyTemplate is utilized by DeepIntegrationBuilderOWLLite to assemble the ontology model. The class diagram in Figure 3.4 models the helper classes I designed for DEEP SEMANTICS. Beside the classes AvailableOntologyProperties and AvailableAnnotationProperties already mentioned before, the helper classes and modules are directly related to the converted ontology classes, datatype properties and object properties as well as namespaces. Converted ontology classes all inherit from class Thing. This makes it easy to check whether a RUBY object belongs to the ontology or not. Module ClassTemplate, used to generate the ontology RUBY classes, makes use of Thing. Likewise, module GhostTemplate generates so-called Ghost classes. Class Ghost is used for classes that are not explicitly defined in the ontology but have to be created for the deep integration of instances that belong to more than one type of class, where these classes additionally are not in a hierarchical relation to each other. A Ghost class is logically equal to the intersection of these classes concerning constraints and inheritance. Module OntologyTemplate is of special importance as it is directly used to generate the RUBY module that represents both the namespace and the enclosing construct of the functional ontology model. Unlike ontology classes, datatype properties and object properties are converted into RUBY objects of type DatatypeProperty and ObjectProperty, respectively. These conversions are controlled by the module PropertyTemplate. One crucial processing step during deep integration is the consideration of restrictions and constraints, respectively,

Figure 3.5: UML class diagram of DEEP SEMANTICS custom datatypes. Of special importance are all classes that include the module TemplateValues. They are used to save the specific values for the diverse ontology constructs. Green rectangles indicate RUBY classes while blue ones represent RUBY modules.

for the generation of appropriate ontology RUBY classes. Module RestrictionTemplate coordinates the generation of RUBY instances of Restriction that are used as base material for the corresponding code generation (in particular for the generation of instance setter methods). The last module shown in Figure 3.4 that has not yet been described here is Namespaces. Namespaces stores every namespace that is used in the processed ontology. While a new namespace can be added to the framework using the method add_namespace, per default the namespaces "http://www.w3.org/1999/02/22-rdf-syntax-ns#", "http://www.w3.org/2000/01/rdf-schema#", "http://www.w3.org/2001/XMLSchema#" and "http://www.w3.org/2002/07/owl#" are stored in Namespaces.

Like similar sophisticated software frameworks, DEEP SEMANTICS requires a variety of custom datatypes. Figure 3.5 shows the UML class diagram of these custom datatypes. Of special importance are all classes that include the module TemplateValues. They are used to save the specific values for the diverse ontology constructs. These values are used together with corresponding templates like PropertyTemplate or ClassTemplate to produce their deep-integrated counterparts (ontology classes and properties in RUBY). In the following, the classes that include TemplateValues are listed:

• OntologyValues: Predominantly stores information about the URI of the ontology as well as its annotations.

• ClassValues: Objects of this type store information about the corresponding class's URI, annotations and superclasses, as well as a boolean value that is TRUE if the class is deprecated.

• EquivalentClassesValues: This class includes information about a set of equivalent classes.

• DisjointClassesValues: Analogous to EquivalentClassesValues but storing a set of disjoint classes.

• SubClassOfValues: Besides the subclass-superclass relation information stored directly as part of ClassValues objects, OWL allows the definition of stand-alone subclass-of statements. These are considered in DEEP SEMANTICS using SubClassOfValues.

• RestrictionValues: This class records information about a restriction like the type of the restriction (for example "all-values-from") and the property the restriction acts on.

• DatatypeValues: Primarily stores information about the URI of the datatype as well as its annotations.

• DatatypePropertyValues: Objects of this type store information about the corresponding property's URI, annotations, ranges, domains and super-properties, as well as boolean values that are TRUE if the property is deprecated or functional.

• ObjectPropertyValues: Similar to DatatypePropertyValues above; ObjectPropertyValues additionally records whether the object property is inverse-functional, transitive or symmetric, and whether it has an inverse property.

• EquivalentPropertiesValues: Analogous to EquivalentClassesValues but storing a set of equivalent datatype properties or object properties.

• SubPropertyOfValues: Like SubClassOfValues but for sub-property/super-property rela- tions.

• IndividualValues: IndividualValues is the first ABox related class in this list. It stores the URI of the described instance or individual, respectively, as well as its related annotations, type affiliations and property/value pairs (attributes).

• DifferentIndividualsValues: Also part of the ABox, DifferentIndividualsValues stores a set of individuals that are explicitly stated as being different.

• SameIndividualsValues: Analogous to DifferentIndividualsValues but stores a set of equal individuals.

Furthermore, Figure 3.5 comprises the module SPOTriple as well as the class Triple which includes this module. SPOTriple includes regular expressions for the identification of blank nodes, literals and URI nodes as well as the methods hasBNodeSubject? and hasBNodeObject?. The more specialized class Triple adds the methods hasLiteralObject? and normalize. Objects of type Triple are related to TripleSet. TripleSet contains methods to retrieve certain triples (for example find_by_S(subject) to fetch all triples that include the subject passed as parameter).
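The retrieval idea behind these methods can be re-implemented in a few lines (minimal stand-ins of my own; the actual DEEP SEMANTICS classes are richer, providing normalization and B-node tests):

    Triple = Struct.new(:subject, :predicate, :object)

    class TripleSet
      def initialize(triples)
        @triples = triples
      end

      # Fetch all triples whose subject matches the passed value.
      def find_by_S(subject)
        @triples.select { |t| t.subject == subject }
      end
    end

    set = TripleSet.new([
      Triple.new('ex:Program', 'rdfs:subClassOf', 'ex:Tool'),
      Triple.new('ex:Tool', 'rdf:type', 'owl:Class')
    ])
    puts set.find_by_S('ex:Program').size   # => 1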

3.3 The Abstract Syntax of OWL LITE and its contribution to DEEP SEMANTICS

Theoretical foundations of OWL comprise definitions of its abstract syntax and a mapping of RDF triples to this abstract syntax (Peter et al., 2004). In my thesis I used this abstract syntax in combination with the mentioned mappings as a theoretical fundament for the development of DEEP SEMANTICS. Figure 6.1 on page 140 in the appendix shows the complete abstract syntax of OWL LITE. A list of the corresponding mapping rules can be found in Peter et al. (2004).

Figure 3.6 illustrates the relations of OWL LITE constructs in Extended Backus-Naur Form (EBNF), as defined in the official abstract syntax, to the static and dynamically generated classes and modules I have designed for DEEP SEMANTICS (a non-simplified version of Figure 3.6 can be found on page 141 in the appendix as Figure 6.2).

Figure 3.6: Simplified UML class diagram extended with meta-programming symbols. Illustrated are the relations of OWL LITE constructs in Extended Backus-Naur Form (EBNF), as defined in the official abstract syntax, to the static and dynamically generated classes and modules of DEEP SEMANTICS.

’Ontology()’ is the enclosing construct of every OWL LITE ontology. Correspondingly, its DEEP SEMANTICS counterpart is a RUBY module that comprises all class, property and instance ontology constructs. As indicated in Figure 3.6 by a blue rectangle and the symbol "?OntologyLocalName?", the RUBY module representing the ontology construct is dynamically generated by DEEP SEMANTICS. One can see that the EBNF construct ’Ontology(’ [ ontologyID ] directive ’)’ is modeled including zero or more so-called "directives" (that means ontology annotations, class axioms, property axioms or instances). Analogously, the module "?OntologyLocalName?" comprises zero or more objects of type Restriction, DatatypeProperty and ObjectProperty. Additionally, "?OntologyLocalName?" includes, and acts as the namespace of, zero or more dynamically generated classes, indicated by the symbols ?ClassLocalName? and ?GhostName? in Figure 6.2. ?ClassLocalName? indicates that the corresponding meta-programmed RUBY classes are named like the local name of the equivalent ontology class (that means like the local name part of the class URI – indicated in the related EBNF fragment as classID). While ?ClassLocalName? RUBY classes directly relate to their equivalent ontology classes, ?GhostName? classes don’t have such direct mappings. I have added ?GhostName? classes – I call this type of classes Ghost classes – to DEEP SEMANTICS to be able to generate instances that belong directly to more than one ontology class (i.e. multiple inheritance). In RUBY each object is an instance of exactly one RUBY class. While this RUBY class can itself inherit from one superclass and include zero or more RUBY modules, a RUBY object cannot be declared to be an instance of more than one RUBY class. Therefore every ontology instance that belongs to a set of more than one distinct ontology class belongs to a Ghost class specifically meta-programmed by DEEP SEMANTICS. Figure 3.7 shows the relations between the abstract syntax of the ’Ontology()’ construct and its realization in DEEP SEMANTICS. The ’Ontology()’ construct can have an optional identifier in the form of a URI reference and zero or more directive extensions. The local name part of the ontology’s URI is used by DEEP SEMANTICS as the name of the automatically generated RUBY module (indicated by the symbol "?OntologyLocalName?"). A directive can be an ontology annotation (for example the ontology properties owl:imports, owl:priorVersion, owl:backwardCompatibleWith and owl:incompatibleWith), an axiom or a fact. Facts are ontology individuals as well as statements about instance equality or disjointness. In DEEP SEMANTICS I consider facts during the generation of ontology instances by

Figure 3.7: Simplified UML class diagram for the Ontology construct extended with meta-programming symbols. EBNF of the OWL LITE ontology construct in relation to its DEEP SEMANTICS class and module counterparts.

instantiating the corresponding ?ClassLocalName? classes or Ghost classes after the deep integration process. As all the figures containing the mappings between the abstract syntax and DEEP SEMANTICS (e.g. Figure 6.2 and Figure 3.7) relate to RUBY classes or modules, and a fact basically relates to a RUBY instance, facts are not covered in more detail in this section but later in Subsection 3.6.5 on page 64.

’Annotation()’ directives in EBNF are considered in DEEP SEMANTICS using the modules AvailableOntologyProperties and AvailableAnnotationProperties. By calling add_annotation_property(property_id) or add_ontology_property(property_id) one can add custom annotation properties to the functional ontology model. The EBNF nonterminal axiom represents either class axioms or property axioms. Figure 3.7 indicates the modality of the relation between an "?OntologyLocalName?" module and the axiom related ?ClassLocalName?, ?GhostName?, ObjectProperty, DatatypeProperty and Restriction RUBY classes by displaying the corresponding module variables ("@@classes", "@@ghost_classes", "@@object_properties", "@@datatype_properties" and "@@restrictions").

The following list recapitulates the DEEP SEMANTICS modules and classes in focus, as included in Figure 3.7:

• «module» ?OntologyLocalName?: Directly maps to the EBNF nonterminal ontology in the abstract syntax. The module is dynamically generated via meta-programming by

DEEP SEMANTICS. The module's variables are @@uri_value (saves the URI of the ontology), @@ontologies_annotations (the annotations concerning the ontology), @@initial_instances (an array comprising all instances that have been initially generated during start-up), @@instances (all instances in the ontology, including initial instances and newly generated ones), @@classes (an array recording all ontology classes), @@ghost_classes (an array recording all Ghost classes), @@object_properties (an array recording all object properties), @@datatype_properties (an array recording all datatype properties) and @@restrictions (all objects of type Restriction). Additionally, the module includes a variety of module methods, of which some are representatively described here (a usage sketch follows this list):

– listClasses: The method returns all ontology classes. Similar methods exist for properties and instances.
– find_classes_by_label(label): Returns all classes that have the rdfs:label passed as argument.
– listObjectPropertiesByLevel: This method returns a hash whose keys indicate the object property hierarchy level. The hash values are arrays recording the object properties belonging to the respective hierarchy level.

• «module» AvailableOntologyProperties: This module includes a list of all valid ontology annotation properties in the module variable @@ontology_properties_list. This list can be extended using add_ontology_property(property_id) or retrieved using ontology_properties().

• «module» AvailableAnnotationProperties: Analogous to AvailableOntologyProperties, this module includes a list of all valid annotation properties (except ontology annotation properties) in the module variable @@annotation_properties_list. This list can be extended using add_annotation_property(property_id) or retrieved using annotation_properties().
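The announced usage sketch: assuming an ontology whose generated module is named Bio2Me (the module name is hypothetical, the method names are those listed above), the functional ontology model can be queried as follows; the sketch presupposes a completed deep integration and does not run standalone:

    # List all ontology classes of the integrated ontology.
    Bio2Me.listClasses.each do |ontology_class|
      puts ontology_class.local_name
    end

    # Fetch all classes carrying the rdfs:label "Tool" (label illustrative).
    tool_classes = Bio2Me.find_classes_by_label('Tool')

    # Group object properties by their hierarchy level.
    Bio2Me.listObjectPropertiesByLevel.each do |level, properties|
      puts "level #{level}: #{properties.size} object properties"
    end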

Figure 3.8 contains the simplified UML class diagram (extended with meta-programming symbols) for OWL class constructs in DEEP SEMANTICS as well as the EBNF of OWL LITE class constructs. ?ClassLocalName? ontology classes are dynamically generated during the deep integration process. The EBNF building block ’Class()’ defines a named OWL class and consists of a URI reference as its ID, an optional flag ’Deprecated’ defining whether this definition is deprecated, a nonterminal modality statement, zero or more annotation constructs as well as zero or more superclass relations (indicated by the nonterminal super). As indicated by "?ClassLocalName?", the name of the RUBY class representing the ontology class is derived from the local name part of the class URI.

In OWL LITE class axioms are used to state that a class is either exactly equivalent to (modality ’complete’) or a subclass of (modality ’partial’) a collection of superclasses and OWL LITE restrictions. Both URI-identifiable OWL classes and OWL LITE restrictions can be used as superclasses in OWL LITE.

The abstract syntax of class axioms is conceptually incorporated into the DEEP SEMANTICS implementation as follows: concerning the mapping of OWL abstract syntax onto RUBY code,

Figure 3.8: Simplified UML class diagram for OWL class constructs extended with meta-programming symbols. EBNF of OWL LITE class constructs in relation to its DEEP SEMANTICS class counterpart.

the modality of a class axiom has no effect whatsoever. In the case of the superclasses being URI-identifiable OWL classes, a reference to the corresponding class object is saved in a RUBY class variable (indicated by the @@super_classes relation in Figure 3.8). If the superclass is a restriction (valid restrictions in OWL LITE are: all-values-from, some-values-from, minimum cardinality, maximum cardinality and cardinality), then DEEP SEMANTICS alters the property associated setter methods for this restriction. If, for example, an object property has a maximum cardinality of one, then the implementation of the corresponding instance setter method has to be extended in such a way that it is guaranteed that at no time more than one value for this property is stored. Restrictions are also called B-Node classes in OWL, with "B" being an abbreviation for "blank". B-Node classes are classes without a URI. Therefore, superclass relations to B-Nodes are saved in the class variable @@bnode_super_classes of "?ClassLocalName?". ’EquivalentClasses()’ axioms in EBNF state that two or more URI-identifiable OWL classes are equivalent. Concerning the implementation of DEEP SEMANTICS, ’EquivalentClasses()’ axioms are used to generate exactly one RUBY module that represents the functional extension of all stated equivalent classes. This module is then included in every RUBY class

representation that is part of the equivalent classes set. In the deep-integrated model of the ontology these equivalent classes relate to each other using @@equivalent_classes, as indicated in Figure 3.8. Equivalent B-Node classes constitute a special case: although the EBNF of the abstract syntax implies that only named ontology classes can be elements of ’EquivalentClasses()’ sets, restrictions are actually allowed as equivalent classes (in Figure 3.8 this is indicated by the @@equivalent_bnode_classes class variable).

OWL LITE does not support the definition of disjointness between ontology classes. However, DEEP SEMANTICS does, because firstly the inferred models produced by reasoners like PELLET (Parsia & Sirin, 2004) (which are used as input for DEEP SEMANTICS) include explicit class disjointness statements, and secondly disjoint class information is crucial to enable DEEP SEMANTICS to be consistency safe. I define consistency safeness as the following particular property of an ontology editing framework: an ontology editing framework is consistency safe if it is not possible to produce any logical inconsistencies as defined by the corresponding ontology language. "?ClassLocalName?" classes provide information about, and relations to, disjoint classes using @@disjoint_classes and @@bnode_disjoint_classes (see Figure 3.8). Recapitulating UML class "?ClassLocalName?":

• Instance variables:

– @instance_annotations: An array storing the valid annotation properties. Per default these are at least "rdfs_label" and "rdfs_comment", and for IKEN additionally "iken_primaryLabel" and "iken_isInitialIndividual".
– @local_name: Saves the local name of the instance.

• Class variables:

– @@classes_annotations: Variable holding all used annotation properties for the current OWL class.
– @@local_name: Saves the local name of the class.
– @@namespace: The namespace the class belongs to.
– @@direct_instances: An array which contains only those instances that have been created with the initialize() method of this class.
– @@initial_instances: An array which contains the initial instances of this class – that is, those instances that were in the ontology before the deep integration and the start of DEEP SEMANTICS, respectively.
– @@instances: An array which contains all instances of the class.
– @@level: Indicates the class's hierarchy level (direct subclasses of owl:Thing are of level zero).
– @@sub_classes: An array containing all subclasses of the class.
– @@super_classes: An array containing all superclasses of the class.

– @@equivalent_classes: An array containing all classes that are equivalent to the current class.
– @@disjoint_classes: An array containing all classes that are disjoint with the current class.
– @@bnode_disjoint_classes: An array containing all B-Node classes that are disjoint with the current class.
– @@equivalent_bnode_classes: An array containing all B-Node classes that are equivalent to the current class.
– @@bnode_super_classes: An array containing all B-Node superclasses of this class.

• Instance methods:

– local_name: Returns the value of @local_name.
– types: Returns an array with one element (self.class) to be consistent with the GhostTemplate types method.

• An abstract of the most important class methods:

– self.is_ghost?: Returns FALSE for "?ClassLocalName?" classes.
– self.instances: Returns @@instances – that is, all instances of the class.
– self.add_instance(new_instance): A setter method that can be used to add a new instance to @@instances.
– self.local_name: Returns the value of @@local_name.
– self.super_classes: A getter method that returns @@super_classes.
– self.sub_classes: A getter method that returns @@sub_classes.
– self.find_instances_by_label(label): A convenience method that determines all instances of the class that have the passed label stored as rdfs:label.
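Analogously, at the class level the converted class Program from Figure 3.1 could be queried as follows (again presupposing a loaded functional ontology model; the label is illustrative):

    puts Program.local_name        # => "Program"
    puts Program.is_ghost?         # => false
    puts Program.instances.size    # number of known Program instances

    # Convenience lookup via rdfs:label.
    blast_programs = Program.find_instances_by_label('BLAST')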

A ’Datatype()’ axiom defines how to describe custom datatypes in OWL. However, as DEEP SEMANTICS currently does not support the definition of new datatypes, this construct is not further described in this context. While the abstract syntax of restrictions, as shown in Figure 3.9 in EBNF, is quite complex, the related DEEP SEMANTICS class is not. It simply contains the RUBY instance variables @local_name, @on_property and @restriction_type. As was the case for the already described ontology constructs (e.g. ?ClassLocalName?), the @local_name variable of Restriction relates to the local name part of the class URI. The variable @on_property saves the reference to the datatype property or object property on which the restriction is defined. In the related EBNF description, datavaluedPropertyID and individualvaluedPropertyID model the value stored in @on_property. Variable @restriction_type is a hash whose keys can be one of the strings "all_values_from",

Figure 3.9: Simplified UML class diagram for the OWL restriction constructs extended with meta-programming symbols. EBNF of OWL LITE restriction constructs in relation to its DEEP SEMANTICS class counterpart.

"some_values_from", "min_cardinality", "max_cardinality" and "cardinality". Like defined in the abstract syntax the values of "all_values_from" and "some_values_from", respectively, can be either a named ontology class (in the case of object properties) or a datatype (in the case of datatype properties). The valid values for the keys respectively restriction types "min_cardinality", "max_cardinality" and "cardinality" are only zero and one.

EBNF property axioms define one of the following constructs: ’DatatypeProperty()’, ’ObjectProperty()’, ’AnnotationProperty()’, ’OntologyProperty()’, ’EquivalentProperties()’ or ’SubPropertyOf()’. Figure 3.10 shows the relation of the ’DatatypeProperty()’, ’EquivalentProperties()’ and ’SubPropertyOf()’ abstract syntax to the DEEP SEMANTICS class DatatypeProperty. Each OWL datatype property (and object property) is converted into an instance of DatatypeProperty (respectively ObjectProperty).

’EquivalentProperties()’ defines sets of equivalent properties – in this case datatype properties. These equivalent property sets are considered in DEEP SEMANTICS using the instance variable @equivalent_properties in the RUBY class DatatypeProperty. In the same way, the set ’SubPropertyOf()’ of super-properties is incorporated into DatatypeProperty with @super_properties.

The instance variable @sub_properties stores all sub-properties of the current datatype property. It serves as a convenience instance variable and is computed by DeepIntegrationBuilderOWLLite. Details about DatatypeProperty:

Figure 3.10: Simplified UML class diagram for the OWL datatype property constructs extended with meta-programming symbols. EBNF of OWL LITE datatype property constructs in relation to its DEEP SEMANTICS class counterpart.

• Instance variables:

– @property_id: A DEEP SEMANTICS internal integer value uniquely identifying every instance of DatatypeProperty.
– @local_name: Saves the local name of the instance.
– @deprecated: Boolean value that indicates whether the property is deprecated.
– @functional: Boolean value that indicates whether the property is functional.
– @annotations: Either boolean false if there are no annotations defined for the current property, or an array holding all used annotation properties for the current OWL datatype property.
– @level: Indicates the property's hierarchy level.
– @super_properties: An array containing all super-properties.
– @sub_properties: An array containing all sub-properties.
– @equivalent_properties: An array containing all equivalent properties.
– @domain: All ontology classes that can be used as domain of this property, stored in an array.

– @range: All ontology classes that can be used as range of this property stored in an array.

• Instance methods:

– local_name: Returns the local name of the property.
– level: Returns the value of @level.
– deprecated?: Returns TRUE if the property is defined as deprecated.
– functional?: Return value is the boolean @functional.
– super_properties: Gives the array @super_properties back.
– sub_properties: Returns @sub_properties.
– equivalent_properties: Return value is @equivalent_properties.
– domain: Gives the array @domain back.
– range: Gives the array @range back.
– add_sub_property(new_sub_property): A setter method that can be used to add a new datatype property to @sub_properties.
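A short inspection sketch using these accessors (dp stands for some DatatypeProperty instance taken from the functional ontology model; how it is obtained is not shown here):

    # dp: an instance of DatatypeProperty from the functional ontology model.
    if dp.functional? && !dp.deprecated?
      puts "#{dp.local_name} (hierarchy level #{dp.level})"
      dp.domain.each { |domain_class| puts "  domain: #{domain_class.local_name}" }
      dp.sub_properties.each { |sub| puts "  sub-property: #{sub.local_name}" }
    end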

Similar to the datatype property modeling just described, Figure 3.11 shows the influence of the OWL LITE abstract syntax on the DEEP SEMANTICS ObjectProperty class implementation. Like the other ontology constructs, ’ObjectProperty()’ instances can be stated as being ’Deprecated’ and can have zero or more annotations (symbolized by the nonterminal annotation in EBNF and by @annotations in the class diagram, respectively). Multiple relations can exist to super-properties (@super_properties), sub-properties (@sub_properties), equivalent properties (@equivalent_properties), domain (@domain) and range (@range) classes. Figure 3.11 differs from Figure 3.10 primarily in respect of the additional global constraints that can be defined on object properties. These are, in EBNF notation, [’inverseOf(’ individualvaluedPropertyID ’)’], ’Symmetric’, ’InverseFunctional’, ’Functional’ ’InverseFunctional’ and ’Transitive’. As defined in the abstract syntax of OWL LITE, object properties can have at most one inverse property [’inverseOf(’ individualvaluedPropertyID ’)’]. An example of an inverse property: persons own cars, cars are owned by persons (i.e. "own" is the inverse property of "owned by"). A ’Symmetric’ property P is a property for which it holds that if an instance X is related to an instance Y over P, then Y is related over P to X, too. For example: if "isBrotherOf" is a symmetric object property and the statement "Jacob" "isBrotherOf" "Wilhelm" holds, then it also holds that "Wilhelm" "isBrotherOf" "Jacob". Additionally, object properties can be defined as being ’InverseFunctional’ or both ’Functional’ and ’InverseFunctional’. If an object property is declared to be inverse-functional, then the object of a property statement uniquely determines the subject individual. As indicated by the EBNF definition in Figure 3.11, an object property can only be either functional, inverse-functional, both functional and inverse-functional, or transitive. An object property cannot be, for example, functional and transitive, as this could lead to logical inconsistencies. Details about ObjectProperty:

Figure 3.11: Simplified UML class diagram for the OWL object property constructs extended with meta-programming symbols. EBNF of OWL LITE object property constructs in relation to its DEEP SEMANTICS class counterpart.

• Instance variables:

– @property_id: A DEEP SEMANTICS internal integer value uniquely identifying every instance of ObjectProperty.
– @local_name: Saves the local name of the instance.
– @deprecated: Boolean value that indicates whether the property is deprecated.
– @functional: Boolean value that indicates whether the property is functional.
– @inverse_functional: Boolean value that indicates whether the property is inverse-functional.
– @transitive: If the corresponding OWL object property is defined transitive, then the value of @transitive is TRUE.
– @symmetric: Boolean value that indicates whether the property is symmetric.
– @inverse_of: Stores either no or exactly one inverse property of the current property.

– @annotations: Either boolean false if there are no annotations defined for the current property, or an array holding all used annotation properties for the current OWL object property.
– @level: Indicates the property's hierarchy level.
– @super_properties: An array containing all super-properties.
– @sub_properties: An array containing all sub-properties.
– @equivalent_properties: An array containing all equivalent properties.
– @domain: All ontology classes that can be used as domain of this property, stored in an array.
– @range: All ontology classes that can be used as range of this property, stored in an array.

• Instance methods:

– local_name: Returns the local name of the property.
– level: Returns the value of @level.
– deprecated?: Returns TRUE if the property is defined as deprecated.
– functional?: Return value is the boolean @functional.
– inverse_functional?: Gives back the boolean @inverse_functional.
– transitive?: Returns TRUE if the property is defined as transitive.
– symmetric?: Returns the boolean value of @symmetric.
– super_properties: Gives the array @super_properties back.
– sub_properties: Returns @sub_properties.
– equivalent_properties: Return value is @equivalent_properties.
– domain: Gives the array @domain back.
– range: Gives the array @range back.
– add_sub_property(new_sub_property): A setter method that can be used to add new object properties to @sub_properties.

This section gave a detailed overview of the OWL LITE abstract syntax as the foundation of the DEEP SEMANTICS class architecture. It was shown how ontology definitions are converted to RUBY modules, ontology classes to RUBY classes, and OWL properties and property restrictions to RUBY objects. The next three sections describe the DEEP SEMANTICS workflows involved in the deep integrating ontology conversion process.

Figure 3.12: Activity diagram 1: overview. Creating a deep-integrated RUBY model of an input OWL LITE ontology (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

3.4 Director: Coordinating Data Workflow and Deep Integration Process

To coordinate the flow and processing steps of ontology data is the main functionality of the DEEP SEMANTICS Director. Figure 3.12 illustrates the process of Director instance generation. Before the Director is instantiated, one can add zero or more namespaces as well as their abbreviations to the DEEP SEMANTICS Namespaces module. In a next step an instance of DeepSemantics (in Figure 3.12 indicated as deep_semantics) is created that functions primarily as an interface for setting up the Director. Using deep_semantics.set_director and passing parameters defining the used source adapter, ontology path and ontology model builder, one can then instantiate the variable @director. Figure 3.13 details that @director creates instances of TripleParser and Builder using the parameters passed. This step is further described in Figure 3.14, showing that first @triple_parser is instantiated. Then @director creates an instance of OntologyInformation named @ontology_information storing information about the current ontology. The next processing step in Figure 3.14 then depends on whether the chosen builder is for OWL LITE or OWL DL. For the following activity diagrams we assume that @director creates an instance of DeepIntegrationBuilderOWLLite saved in builder. Next @director registers the used builder in @ontology_information. After processing activity 3, the next director action shown in Figure 3.13 is the creation of an instance of Adapter and the import of the ontology triples using this adapter. Activity diagram 4 in Figure 3.15 illustrates that @ontology_information is used by @director to decide whether an instance of DBAdapter or an instance of FileAdapter has to be created and saved in the variable adapter. Next @director registers the used adapter using @ontology_information. After that @director invokes the get_triples() method of adapter and returns the variable triples_set which contains a record of all ontology triples.
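A sketch of this setup step (the option hash keys are my assumptions; the text above only states that the source adapter, the ontology path and the builder can be specified via set_director):

    require 'deep_semantics'   # assumed library entry point

    deep_semantics = DeepSemantics.new

    # Option keys are illustrative, not the documented API.
    @director = deep_semantics.set_director(
      :adapter => :file,                   # FileAdapter (or :db for DBAdapter)
      :source  => 'ontologies/bio2me.nt',  # N-Triples serialization
      :builder => :owl_lite                # DeepIntegrationBuilderOWLLite
    )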

Figure 3.13: Activity diagram 2: Director. Creation of a Director instance that triggers the parsing and the deep integration process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

Figure 3.14: Activity diagram 3: TripleParser. Creation of TripleParser and Builder instances (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

We get back to activity diagram 2 (Figure 3.13) and see that the next action is @director triggering the parsing of the triples and the deep integration process. Activity diagram 5 in Figure 3.16 comprises the details of this action: if triples_set contains no triple, then this processing step is finished – as is the complete ontology conversion process. Otherwise @director invokes the parse_triples_set() method of @triple_parser, passing triples_set as

Figure 3.15: Activity diagram 4: Adapter. Creation of an Adapter and import of the ontology triples (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

Figure 3.16: Activity diagram 5: triple parsing and ontology conversion. Parsing the ontology triples and deep integration (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

input – the corresponding activity diagram describing the parsing process is further detailed in Section 3.5, starting with activity diagram 6. The method parse_triples_set() returns the two arrays new_abox_template_values and new_tbox_template_values.

Figure 3.17: Activity diagram 6: parsing details. Details of the ontology triples parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

After the ontology triples are parsed into pre-processed template values, first the TBox is deep-integrated. If at least one TBox template value exists, @director invokes the processTBox() method of builder, passing the new TBox template values as input (see activity diagram 14 in Section 3.6, starting on page 55). With the TBox being converted, and if at least one ABox template value exists, @director invokes the processABox() method of builder, passing the new ABox template values as input (see activity diagram 19 in Section 3.6). These deep integration actions are described in detail in Section 3.6.

3.5 Triple Parser: Mapping OWL Triples to an Abstract Syntax Based Representation in RUBY

DEEP SEMANTICS' class TripleParser, respectively a RUBY instance of it, is responsible for mapping OWL triples to an abstract syntax based representation in RUBY. Figure 3.17 shows the two top-level steps of the ontology triples parsing process: a) @triple_parser invokes the method parse_tbox() to pre-process the TBox triples (activity diagram 7) and b) @triple_parser then invokes the method parse_abox() to pre-process the ABox triples (activity diagram 12). To sum up, TripleParser reads the ontology triples in triples_set, parses them and returns the two variables new_tbox_template_values and new_abox_template_values storing the pre-processed values for the later deep integration. The details of the triple parsing process are described in the next subsection.

3.5.1 The Triple Parsing Process

The triple parsing process comprises all processing steps that are needed to convert a set of input ontology triples into a set of pre-processed template values. These template values can then be used to deep-integrate the functional ontology model. The parsing process itself is complex: activity diagram 7 (as shown in Figure 3.18), which constitutes the pipeline of the primary parsing step, comprises eleven actions alone – with two of these actions (actions 4 and 8) being further described in activity diagrams 8 (Figure 3.19) and 11 (Figure 3.22). In the following list all eleven actions are described in order of their invocation:

Figure 3.18: Activity diagram 7: TBox parsing. Details for the TBox parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

1. @triple_parser invokes the delete_swrl_triples() method of triples_set. This action is required to delete possible SWRL related triples in the ontology, as they are not needed for DEEP SEMANTICS but would extend the computing time of triple parsing.

2. @triple_parser deletes all triples with owl:Nothing as subject and/or object using the methods find_by_S() and find_by_O() of triples_set. Comment: triples including owl:Nothing do not encode any utilizable information with regard to the functional RUBY ontology model. As was the case for the SWRL triples in the previous action, these triples only extend the required computing time without having any functional relevance for DEEP SEMANTICS. The methods find_by_S() and find_by_O() are convenience methods that all exploit the fundamental method findTriple(). findTriple() implements the functionality to fetch all triples from triples_set that match an arbitrary combination of subject, predicate and object patterns. For example, find_by_S() can be used to pass a subject pattern with predicate and object values being irrelevant.

3. @triple_parser deletes all triples with owl:Thing as subject using the method find_by_S() of triples_set. Triples with owl:Thing as subject are not utilizable for DEEP SEMANTICS and are therefore omitted.

4. @triple_parser invokes the method parse_all_properties(). This method is used to parse all datatype property and object property triples. Details are shown in activity diagram 8 in Figure 3.19.

Figure 3.19: Activity diagram 8: property parsing. The property parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

5. @triple_parser invokes the method parse_equivalent_properties_triples(). Triples defining equivalence between datatype properties and triples defining equivalence between object properties are parsed and integrated into the corresponding template value objects.

6. @triple_parser invokes the method parse_sub_property_triples(). Triples defining sub-property relations are parsed and integrated into the corresponding template value objects.

7. @triple_parser invokes the method parse_ontology_triples(). This method fetches and parses all ontology construct related triples using find_by_PO(’rdf:type’, ’owl:Ontology’) – that means all triples with predicate rdf:type and object owl:Ontology are considered for this step.

8. @triple_parser invokes the method parse_all_classes(). This method parses all OWL URI classes, all OWL classes with blank nodes as subject, as well as all restrictions (which are also ontology classes). Details are shown in activity diagram 11 as illustrated in Figure 3.22.

9. @triple_parser invokes the method parse_disjoint_classes_triples(). Triples defining disjointness between ontology classes are parsed and integrated into the corresponding class template value objects.

10. @triple_parser invokes the method parse_equivalent_classes_triples(). Analogous to the previous action, triples defining equivalence between ontology classes are parsed and integrated into the corresponding class template value objects.

11. @triple_parser invokes the method parse_sub_class_of_triples(). Triples defining subclass relations are parsed and integrated into the corresponding template value objects.
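The sketch announced in action 2 shows how the find_by_* convenience methods could be layered on top of findTriple(). This is a minimal illustration under the assumption that nil serves as a wildcard pattern; the Triple struct and the method bodies are simplified stand-ins for the actual DEEP SEMANTICS implementation.

# Illustrative sketch of TripleSet's find methods; nil acts as a wildcard.
Triple = Struct.new(:subject, :predicate, :object)

class TripleSet
  def initialize(triples = [])
    @triples = triples
  end

  # Fundamental method: fetch all triples matching an arbitrary
  # combination of subject, predicate and object patterns.
  def findTriple(s, p, o)
    @triples.select do |t|
      (s.nil? || t.subject == s) &&
      (p.nil? || t.predicate == p) &&
      (o.nil? || t.object == o)
    end
  end

  def find_by_S(s); findTriple(s, nil, nil); end
  def find_by_O(o); findTriple(nil, nil, o); end
  def find_by_P(p); findTriple(nil, p, nil); end
  def find_by_PO(p, o); findTriple(nil, p, o); end
  def find_by_SPO(s, p, o); findTriple(s, p, o); end

  def delete(triple); @triples.delete(triple); end
end

# Example use, corresponding to action 2 above:
# set.find_by_S('owl:Nothing').each { |t| set.delete(t) }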

Property parsing as shown in Figure 3.19 includes two distinct actions. Firstly, @triple_parser invokes its method parse_all_datatype_property_triples() to parse all datatype property related triples. Secondly, @triple_parser invokes the method parse_all_object_property_triples() to parse, analogously, all object property related triples. Activity diagram 9 in Figure 3.20 illustrates the detailed datatype property parsing process. The first action in this activity diagram states the following: @triple_parser fetches all triples that

Figure 3.20: Activity diagram 9: datatype property parsing. The datatype property parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

have a predicate rdf:type and an object owl:DatatypeProperty using the method find_by_PO() of triples_set. The found triples are temporarily stored in the variable all_found_triples and further processed as described in the next action: @triple_parser takes and deletes the first triple from all_found_triples and invokes the method parse_datatype_property_triples() with this triple as parameter – parse_datatype_property_triples(), for example, identifies triples defining whether the particular datatype property is a functional property by calling find_by_SPO(datatype_property.block_id, ’rdf:type’, ’owl:FunctionalProperty’). The returned datatype_property variable contains the parsed datatype property template value object, which is then added to the array @new_tbox_template_values by @triple_parser. If no more triples are left in all_found_triples, datatype property parsing is completed. Otherwise @triple_parser takes and deletes the next triple from all_found_triples. Figure 3.21 shows the details of the object property parsing process, which resembles the datatype property one:

• @triple_parser fetches all triples that have a predicate rdf:type and an object owl:ObjectProperty using the method find_by_PO() of triples_set and records the found triples in all_found_triples.

Figure 3.21: Activity diagram 10: object property parsing. The object property parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

• @triple_parser takes and deletes the first triple from all_found_triples and invokes the method parse_object_property_triples() with this triple as parameter. The returned object_property variable contains the parsed object property template value object.

• @triple_parser adds object_property to the array @new_tbox_template_values.

• If no more triples are left in all_found_triples, object property parsing is completed. Otherwise @triple_parser takes and deletes the next triple from all_found_triples, as illustrated in Figure 3.21.
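Both property parsing processes follow the same fetch-and-parse loop, which the following sketch condenses for the object property case. The loop structure follows the diagrams above; parse_object_property_triples() stands in for the per-property parsing just described.

# Sketch of the fetch-and-parse loop shared by datatype and object
# property parsing (shown here for object properties).
def parse_all_object_property_triples
  all_found_triples = @triples_set.find_by_PO('rdf:type', 'owl:ObjectProperty')
  until all_found_triples.empty?
    triple = all_found_triples.shift  # take and delete the first triple
    object_property = parse_object_property_triples(triple)
    @new_tbox_template_values << object_property
  end
end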

The details of ontology class parsing are illustrated in Figure 3.22. The shown activity diagram starts with @triple_parser fetching all triples that have a predicate rdf:type and an object owl:Class using the method find_by_PO() of triples_set, returning all_found_triples. The next action displays @triple_parser taking and deleting the first triple from all_found_triples. If the triple has a B-Node as subject, @triple_parser invokes the method parse_restriction_triples() with this triple as parameter. The parsing output value restriction – an instance of the DEEP SEMANTICS class RestrictionValues – is then added by @triple_parser to the @new_tbox_template_values array. If, on the other hand, the triple does not have a B-Node as subject, @triple_parser invokes the method parse_class_triples() with this triple as parameter. In this case the output is stored in klass.

Figure 3.22: Activity diagram 11: class parsing. The class parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

Figure 3.23: Activity diagram 12: ABox parsing. The ABox parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

Like the restriction above, the variable klass is also added to @new_tbox_template_values. In either case – whether a URI class or a restriction has just been parsed – the process is re-iterated until all_found_triples is empty.
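As a minimal sketch, the B-Node branching of class parsing could look as follows; blank_node? is an assumed helper that tests whether a triple's subject is a blank node.

# Sketch of the URI class / restriction branch in parse_all_classes().
def parse_all_classes
  all_found_triples = @triples_set.find_by_PO('rdf:type', 'owl:Class')
  until all_found_triples.empty?
    triple = all_found_triples.shift
    if blank_node?(triple.subject)
      restriction = parse_restriction_triples(triple)  # RestrictionValues
      @new_tbox_template_values << restriction
    else
      klass = parse_class_triples(triple)              # ClassValues
      @new_tbox_template_values << klass
    end
  end
end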

Figure 3.24: Activity diagram 13: instance parsing. The OWL individuals parsing process (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

After considering the class triples, TBox parsing is complete. Looking back at Figure 3.17, we see that the next process action states that @triple_parser now invokes the method parse_abox(). ABox parsing is covered by activity diagram 12, which is illustrated in Figure 3.23. It contains three primary actions:

1. @triple_parser invokes the method parse_same_individual_triples(). The built-in OWL property owl:sameAs links an individual to an individual, indicating that both individuals are identical. This method parses the corresponding triples and generates RUBY instances of SameIndividualValues, with each of these instances representing a set of mutually identical instances.

2. @triple_parser invokes the method parse_different_individuals_triples(). An owl:differentFrom statement indicates that two URI references refer to different individuals. Analogous to the previous processing of identical individuals, this method parses triples defining sets of mutually distinct individuals. The method generates RUBY instances of DifferentIndividualsValues, with each of these instances representing a set of mutually distinct instances.

3. @triple_parser invokes the method parse_individual_triples(). An individual has to be an rdf:type of either an OWL class or an OWL restriction. The method parse_individual_triples() parses all individual related triples in triples_set – except the ones already considered in the two previous actions. Because this parsing step is quite complex, it is illustrated in its own activity diagram in Figure 3.24 and described in the following:

(a) At this point of the parsing process only type definitions for OWL instances are still left in triples_set. @triple_parser fetches all remaining triples that have a predicate rdf:type using the method find_by_P() of triples_set and stores them in a new instance of TripleSet named instance_types_triple_set.
(b) @triple_parser takes an object values of type ClassValues or RestrictionValues from @new_tbox_template_values.
(c) @triple_parser fetches all triples that have rdf:type as predicate and the URI of values as object using the method find_by_PO() of triples_set. The fetched triples are stored in all_found_triples.
(d) @triple_parser processes each triple of all_found_triples and generates the corresponding IndividualValues objects, including annotations of these individuals as well as property and value pairs (facts about the individual and instance, respectively).
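Steps (a) to (d) can be condensed into the following sketch. The template value classes are those named in the text; build_individual_values() is a hypothetical helper standing in for the generation of IndividualValues objects.

# Sketch of per-class individual parsing (steps a-d above).
def parse_individual_triples
  # (a) only rdf:type definitions for OWL instances are left at this point
  instance_types_triple_set = TripleSet.new(@triples_set.find_by_P('rdf:type'))

  @new_tbox_template_values.each do |values|
    # (b) consider only class and restriction template values
    next unless values.kind_of?(ClassValues) || values.kind_of?(RestrictionValues)

    # (c) all individuals typed with this class's URI
    all_found_triples = instance_types_triple_set.find_by_PO('rdf:type', values.uri)

    # (d) generate one IndividualValues object per typing triple
    all_found_triples.each do |triple|
      @new_abox_template_values << build_individual_values(triple)
    end
  end
end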

At this point of the DEEP SEMANTICS processing pipeline the ontology triples have been read in using an instance of SourceAdapter and parsed into ABox and TBox related instances of TemplateValues using an instance of TripleParser. The next and last major conversion step is the deep integration of the OWL ontology into the RUBY functional ontology model, described in detail in the following Section 3.6.

3.6 Deep Integration Builder

The Deep Integration Builder takes the parsed template values of the corresponding ontology definition, ontology classes, properties and instances, and converts these into a consistent functional ontology model in RUBY. The currently available implementation of the Deep Integration Builder is the RUBY class DeepIntegrationBuilderOWLLite. DeepIntegrationBuilderOWLLite first converts the TBox ontology classes and property axioms, deep integrates these into a functional model and then generates the ABox instances. Figure 3.25 illustrates the activity diagram of the ontology TBox deep integration. This activity diagram reads in the variable template_values, comprises thirteen actions and, after the execution of these actions, returns the functional TBox @ontology_model. The following list details the thirteen actions:

1. The builder takes the ontology_values object being part of template_values and invokes the method newOntology() with this object as parameter. The details for this action are described in Subsection 3.6.1.

2. The builder takes all DatatypePropertyValues and ObjectPropertyValues objects from template_values and for every object iteratively invokes either the method newDatatypeProperty() or newObjectProperty() with the respective object as parameter. The details for this action are described in Subsection 3.6.2.

Figure 3.25: Activity diagram 14: TBox deep integration. The deep integration of the ontology TBox activity (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

3. Next, builder takes all EquivalentPropertyValues objects from template_values and for every object iteratively invokes either the method alterDatatypeProperties() or alterObjectProperties() with the respective object as parameter. As the ontology properties have already been converted into RUBY objects in the previous action, this action extends the @equivalent_properties variable (see Figure 3.10 for the RUBY class DatatypeProperty and Figure 3.11 for the RUBY class ObjectProperty) in case the corresponding property is stored in an EquivalentPropertyValues set.

4. In the fourth action, builder takes all ClassValues objects from template_values and for every object iteratively invokes the method newClass() with the object as parameter. The details for this action are described in Subsection 3.6.3 (including the consideration of Restriction objects).

5. Next, builder takes all RestrictionValues objects from template_values and for every object iteratively invokes the method newRestriction() with the object as parameter. The generated Restriction objects (see also Figure 3.8) are important for the meta-programming of the RUBY ontology classes described in the subsequent action.

6. Here builder takes all IntersectionValues objects from template_values and for every object iteratively invokes the method newIntersection() with the object as parameter. In OWL LITE, multiple domain and range assertions for a property constitute intersections of the included ontology classes.

7. In the seventh action, builder takes all SequenceValues objects from template_values and for every object iteratively invokes the method newSequence() with the object as parameter. In OWL LITE, sequences can be part of DifferentIndividuals constructs.

8. Now builder takes all EquivalentClassesValues and DisjointClassesValues objects from template_values and for every object iteratively invokes the method alterClasses() with the object as parameter. alterClasses() extends, as the case may be, the @@equivalent_classes, @@equivalent_bnode_classes, @@disjoint_classes and/or @@bnode_disjoint_classes variables (see also Figure 3.8 for details of the dynamically generated ?ClassLocalName? RUBY classes).

9. Next, builder invokes the method determine_level_of_classes(). This method computes for each ontology class its level in the class hierarchy – starting at zero for root classes. The level value of a class can be useful, for example, for printing out all ontology classes by level.

10. builder invokes the method determine_level_of_properties(). Analogous to determine_level_of_classes(), this method computes for each property the level in its hierarchy – starting at zero for root properties.

11. Next, builder invokes the method transformClassHierarchyReferencesForDILite(). This method transforms sub-class-of and equivalent-class reference strings into their corresponding RUBY constructs.

12. In the twelfth action, builder invokes the method transformPropertyDomainAndRangeReferences(). Like the previous method, this one transforms the representations of references to domain and range values from strings into corresponding RUBY constructs.

13. Finally, builder invokes the method deep_integrate(). This method assembles the previously converted RUBY ontology classes and properties, and produces an integrated functional ontology model. The details for this action are described in Subsection 3.6.4.
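Condensed into code, the thirteen actions could be orchestrated as sketched below. This is purely illustrative: every helper corresponds to one action above, and the template_values accessors (property_values, class_values, etc.) are assumptions about its structure.

# Illustrative skeleton of the TBox deep integration pipeline.
def processTBox(template_values)
  newOntology(template_values.ontology_values)                          # 1
  template_values.property_values.each do |pv|                          # 2
    pv.datatype_property? ? newDatatypeProperty(pv) : newObjectProperty(pv)
  end
  template_values.equivalent_property_values.each do |v|                # 3
    v.datatype_property? ? alterDatatypeProperties(v) : alterObjectProperties(v)
  end
  template_values.class_values.each        { |v| newClass(v) }          # 4
  template_values.restriction_values.each  { |v| newRestriction(v) }    # 5
  template_values.intersection_values.each { |v| newIntersection(v) }   # 6
  template_values.sequence_values.each     { |v| newSequence(v) }       # 7
  template_values.class_axiom_values.each  { |v| alterClasses(v) }      # 8
  determine_level_of_classes                                            # 9
  determine_level_of_properties                                         # 10
  transformClassHierarchyReferencesForDILite                            # 11
  transformPropertyDomainAndRangeReferences                             # 12
  deep_integrate                                                        # 13
  @ontology_model
end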

3.6.1 Conversion of the OWL Ontology Definition into a RUBY Module

Figure 3.26 shows the activity diagram that illustrates the conversion of the ontology definition into a RUBY module. This RUBY module is dynamically generated and has the structure shown in Figure 3.7. The first action of Figure 3.26 comprises builder invoking the method buildOntology() of the module OntologyTemplate (see also Figure 3.3) with the object ontology_values as parameter. OntologyTemplate then takes the ontology construct parameters saved in ontology_values and generates a string named ontology_model_string representing the code for the ontology model RUBY module. builder invokes the RUBY method module_eval() with the string ontology_model_string as parameter to convert the string into a RUBY module having the structure shown as ?OntologyLocalName? in Figure 3.7. The return value of module_eval() is @ontology_model, holding the not yet completely integrated ontology model.

Figure 3.26: Activity diagram 15: ontology module creation. Conversion of the OWL ontology definitions into a RUBY module (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).
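The following reduced sketch illustrates the string-template metaprogramming this subsection describes. The generated module body is a minimal assumption – the real OntologyTemplate emits a far richer structure – and the trailing constant reference merely makes module_eval() return the freshly created module.

# Sketch: generate RUBY module source from ontology values, then
# evaluate it into a live module via module_eval().
ontology_values = { :local_name => 'Example',
                    :uri => 'http://www.ontoverse.org/example.owl#' }

ontology_model_string = <<-RUBY_CODE
  module #{ontology_values[:local_name]}
    @uri = '#{ontology_values[:uri]}'
    def self.uri
      @uri
    end
  end
  #{ontology_values[:local_name]}
RUBY_CODE

@ontology_model = Object.module_eval(ontology_model_string)
puts @ontology_model.uri  # => http://www.ontoverse.org/example.owl#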

3.6.2 Conversion of OWL Properties into RUBY Objects

Figure 3.27 shows the conversion of ontology properties – both datatype properties and object properties – into RUBY objects. Illustrated is activity diagram 16, which starts with a condition branching the process flow depending on the type of property.

Conversion of Datatype Properties

For datatype properties, builder invokes the method newDatatypeProperty() of @ontology_model with the object datatype_property_values as parameter. Inside newDatatypeProperty() the method PropertyTemplate.buildDatatypeProperty() is called, which produces a RUBY instance of DatatypeProperty temporarily stored in the variable new_datatype_property (see also Figure 3.10). Next, @ontology_model adds new_datatype_property to its array @@datatype_properties.

At the end of datatype property conversion, the corresponding DatatypeProperty instance includes variables and methods handling the local name of the property, the level of the property in the hierarchy, and functionality to check whether the property is deprecated or functional. Additional variables save the super-properties, sub-properties and equivalent properties of the described property. The domains and ranges can be retrieved as well, using the provided methods domain and range.

Figure 3.27: Activity diagram 16: property object creation. Conversion of the OWL properties into RUBY objects (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

Conversion of Object Properties

If the processed template values correspond to an object property, builder invokes the method newObjectProperty() of @ontology_model with the object object_property_values as parameter. Next, @ontology_model invokes the method buildObjectProperty() of PropertyTemplate with object_property_values as parameter. The generated new_object_property object is added by @ontology_model to its array @@object_properties. Similar to the result of datatype property conversion, the corresponding ObjectProperty instance includes the same variables and methods. However, ObjectProperty instances additionally include the variables @inverse_functional, @transitive, @symmetric and @inverse_of. The values of these variables can be retrieved with the related accessor methods (see also Figure 3.11).
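A condensed, hypothetical sketch of these two property classes is given below. The attribute sets follow Figures 3.10 and 3.11; the inheritance between the two classes is used here only to avoid repetition and is not implied by the framework.

# Sketch of the RUBY property objects built by PropertyTemplate.
class DatatypeProperty
  attr_accessor :local_name, :level, :deprecated, :functional,
                :super_properties, :sub_properties, :equivalent_properties,
                :domain, :range

  def initialize(local_name)
    @local_name = local_name
    @super_properties = []
    @sub_properties = []
    @equivalent_properties = []
    @domain = []
    @range = []
  end
end

# Object properties carry the same state plus the OWL LITE global
# restriction flags named above.
class ObjectProperty < DatatypeProperty
  attr_accessor :inverse_functional, :transitive, :symmetric, :inverse_of
end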

3.6.3 Conversion of OWL Classes into RUBY Classes

Activity diagram 17, illustrated in Figure 3.28, describes the conversion of named OWL classes into RUBY classes. Firstly, builder invokes the method newClass() of @ontology_model with the object class_values as parameter. Secondly, @ontology_model invokes the method buildClass() of ClassTemplate with class_values as parameter. Thirdly, the newly created class

Figure 3.28: Activity diagram 17: RUBY ontology class creation. Conversion of the OWL classes into RUBY classes (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

is stored in new_class and added by @ontology_model to its array @@classes.

This conversion produces a meta-programmed RUBY class representing an ontology class. In Figure 3.8 such a class is modeled with the extended UML class diagram ?ClassLocalName?. At this point of processing, the RUBY ontology class comprises, for example, class and instance variables for local names as well as the class variable @@instances, which – in combination with getter and setter methods – allows access to all instances of the ontology class.
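As with the ontology module, class generation can be pictured as a string template evaluated at runtime. The following sketch is a strongly reduced assumption about what ClassTemplate.buildClass() emits; only the local name handling and the @@instances registry named above are shown.

# Sketch: dynamically generate a RUBY ontology class with an
# @@instances registry, then evaluate it via module_eval().
local_name = 'Protein'

class_model_string = <<-RUBY_CODE
  class #{local_name}
    @@local_name = '#{local_name}'
    @@instances  = []

    def self.local_name; @@local_name; end
    def self.instances;  @@instances;  end

    attr_reader :local_name

    def initialize(local_name)
      @local_name = local_name
      @@instances << self  # register every new individual
    end
  end
  #{local_name}
RUBY_CODE

new_class = Object.module_eval(class_model_string)
new_class.new('insulin')
puts new_class.instances.size  # => 1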

3.6.4 TBox Deep Integration – Assembling the Deep Integrated RUBY Properties and Classes into a Consistent RUBY Representation of the Ontology

Figure 3.29 illustrates the processing steps for the assembly and deep integration of ontology classes and properties into the functional ontology model in RUBY. This last step of the TBox deep integration is relatively complex, including twelve actions as well as four conditions and one fork and join pair. The following list describes the sequence of actions in activity diagram 18:

1. builder invokes the method process_datatype_properties_inheritance(). @ontology_model uses its method listDatatypePropertiesByLevel() to iterate over datatype properties by hierarchy level. For each datatype property the following three methods are called:

(a) process_domain_inheritance(datatype_property): The sub-property inherits all domains of the parent property. If a domain of a parent property is a superclass of an existing domain of the sub-property, this class does not have to be added to the sub-property’s domain (a sketch of this rule follows after this list). Domain inheritance works identically for datatype and object properties.
(b) process_range_inheritance(datatype_property): The sub-property inherits all ranges of the parent property. As ranges for datatype properties are datatypes

Figure 3.29: Activity diagram 18: deep integration details 1. Assembling and deep integration of the classes and properties into an ontology model in RUBY (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

and the definition of multiple ranges for datatype properties can very easily lead to problems (e. g. "What is the intersection of the datatypes integer and boolean?"), ontology designers should be very careful about which ranges they define.
(c) process_global_restriction_inheritance(datatype_property): The sub-property inherits functional property membership from its super-property. Note that if a super-property is not defined to be functional but the sub-property is, then there is no inheritance – i. e. the child does not inherit the absence of a feature from its parent.

2. builder invokes the method process_object_properties_inheritance(). This method does basically the same for object properties as process_datatype_properties_inheritance() does for datatype properties. However, there are two important differences:

(a) process_range_inheritance(object_property): For object properties, range inheritance is important. The sub-property inherits all ranges of the parent property. If a range of a parent property is a superclass of an existing range of the sub-property, this class does not have to be added to the range of the sub-property.
(b) process_global_restriction_inheritance(object_property): For object properties, global restriction inheritance additionally comprises the consideration of inverse-functional statements.

3. builder invokes the method process_bnode_superclass_inheritance(). While inherited URI superclasses are made explicit by the PELLET reasoner, the same is not the case for B-Node superclass inheritance. This method therefore determines all B-Node superclasses – that is, restrictions – of every URI superclass of a class and adds them to @@bnode_super_classes_values of this particular class.

4. In the fourth action, builder invokes the method listClassesByLevel() of @ontology_model and assigns the result to a temporary hash variable classes_by_level_hash.

5. Next, builder takes and deletes the first entry from classes_by_level_hash. The return values are level (the key of the hash), indicating the level of the classes stored in the array classes_array (the value of the hash).

6. In the sixth action, builder takes and deletes the first entry klass from classes_array. Which action is carried out next depends on whether the ontology class modeled in klass has recorded equivalent classes or not.

7. If klass does not contain equivalent classes statements, then builder sets module_name to "Module#{klass.local_name}". The successive processing step is action 9.

8. Otherwise builder generates the module_name string of the module that represents the equivalent classes set in RUBY. If a constant named like the string saved in module_name does not already exist in the module @ontology_model, then the successive processing step is action 9, otherwise 12.

9. builder creates in @ontology_model a new constant named module_name pointing to the newly generated module new_module.

10. Progressing further, builder invokes the method deep_integrate_datatype_properties_for_class(), passing the parameters klass and new_module. This processing step is described in detail in "Deep integration of properties into related ontology classes in RUBY" below.

11. Progressing from the previous action, builder now invokes its method deep_integrate_object_properties_for_class(), passing klass and new_module as parameters. This processing step is also described in detail in "Deep integration of properties into related ontology classes in RUBY" below.

12. If a constant named like the string saved in module_name already exists after action 8, then builder adds the module that represents the classes set to klass as a RUBY include.
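The sketch promised in action 1 (a) shows one way the domain inheritance rule could be implemented; the super_properties, domain and super_classes accessors are assumptions based on the property and class objects described earlier.

# Sketch of the domain inheritance rule: a sub-property inherits every
# parent domain that is not a superclass of one of its own domains.
def process_domain_inheritance(property)
  property.super_properties.each do |super_property|
    super_property.domain.each do |inherited_domain|
      # The more specific class wins: skip the parent domain if it is a
      # superclass of a domain the sub-property already declares.
      covered = property.domain.any? do |own_domain|
        own_domain.super_classes.include?(inherited_domain)
      end
      unless covered || property.domain.include?(inherited_domain)
        property.domain << inherited_domain
      end
    end
  end
end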

Summarizing the actions in Figure 3.29, one can record how the assembly and deep integration of the converted classes and properties into a consistent and functional ontology model in RUBY works. However, as the consideration of property domain statements requires – regarding the source code implementation – meta-programming of the corresponding ontology class implementations, this deep integration of properties is described in detail next.

Figure 3.30: Schematic illustration of the deep integration of an example property. The namespace abbreviations are: xmls := http://www.w3.org/2001/XMLSchema#; rdf := http://www.w3.org/1999/02/22-rdf-syntax-ns#; rdfs := http://www.w3.org/2000/01/rdf-schema#; owl := http://www.w3.org/2002/07/owl#

Deep integration of properties into related ontology classes in RUBY

Property deep integration into the relevant ontology classes involves the generation of property related instance variables, setter and getter methods. Figure 3.30 shows an example of such a deep integration process from BIO2ME. On the left side (1) the class (Program) and property (hasLicenseType) defining triples are shown – the relevant namespaces are described in the caption. As displayed by a blue arrow, these triples are mapped to the RUBY implementation of the ontology class Program – here shown as its UML class diagram representation (2). The next blue arrow indicates the mapping of this UML class diagram onto its complying RUBY source code, dynamically generated by DEEP SEMANTICS using meta-programming. This implementation comprises the related instance variable @hasLicenseType_values, the setter method hasLicenseType=(new_value) and the getter method hasLicenseType(). The implementation of getter methods is simple: they only return the value of the corresponding attribute variable. For isUsedInProgram from BIO2ME – as an additional example to the one shown in Figure 3.30 – the code generated by DEEP SEMANTICS is the following:

def isUsedInProgram
  return @isUsedInProgram_values
end

In contrast, setter methods can be considerably more complex, as cardinality and type constraints can be involved (compare the example in Figure 3.30, which for instance contains a maximum cardinality on the property hasLicenseType). However, this complexity is required to ensure that DEEP SEMANTICS ABox operations do not lead to logical inconsistencies (that means DEEP SEMANTICS is consistency safe). It is important to mention that some additional implementation complexity is omitted because DEEP SEMANTICS does not support additional inference functionality. Instead it should be used in tandem with a specialized OWL reasoner like PELLET. Chapter 4 describes the IKEN semantic application, which is an example of the cooperative application of DEEP SEMANTICS with PELLET.

DEEP SEMANTICS considers any maximum or absolute cardinality and some-values-from as well as all-values-from restrictions. Minimum cardinality restrictions lead to the fact that the corresponding property cannot be used with the restriction-related ontology class anymore. Additionally, it integrates the global cardinality induced by functional property declarations and inverse functional restrictions. The described adjustment to the practical needs of Semantic Web developers – no inference support and consistency safeness – leads to the inclusion of the OWL logic constraints specified in Table 3.1 and Table 3.2 on pages 88 and 89, respectively. Concluding this subsection, the following listing shows an example of the meta-programmed source code of a setter method for a functional and inverse functional object property with an additional local all-values-from restriction:

Listing 3.1: A concluding setter method example
1 def isMarriedTo=(new_value)
2   if new_value.kind_of?(Thing)
3     if !new_value.used_already_as_value?("isMarriedTo")
4       if (new_value.types.include?(Person) || new_value.class.super_classes.include?(Person)) && (new_value.types.include?(Adult) || new_value.class.super_classes.include?(Adult))
5         if @isMarriedTo_values
6           if (@isMarriedTo_values.size < 1)
7             @isMarriedTo_values << new_value
8             new_value.used_already_as_value("isMarriedTo")
9           else
10             @isMarriedTo_values[0].used_not_already_as_value("isMarriedTo")
11             @isMarriedTo_values[0] = new_value
12             new_value.used_already_as_value("isMarriedTo")
13           end
14         else
15           @isMarriedTo_values = Array.new
16           @isMarriedTo_values << new_value
17           new_value.used_already_as_value("isMarriedTo")
18         end
19       else
20         puts "#{new_value.local_name} cannot be used as value of isMarriedTo!"
21       end
22     else
23       puts "#{new_value.local_name} has already been used and cannot be used again with an inverse functional property!"
24     end
25   end
26 end

3.6.5 ABox Deep Integration – Conversion of Instances into RUBY Ob- jects

Figure 3.31 illustrates the conversion of all ontology instances into RUBY objects – instances of the RUBY classes dynamically generated during TBox deep integration. Conversion of the ABox constitutes the last step of DEEP SEMANTICS’ meta-programming-based deep integration pipeline. Activity diagram 19 in Figure 3.31 starts with builder invoking the method remove_superclass_types_and_transform_references() with template_values as parameter and overwriting template_values with the method’s return value. This method call reduces the number of type definitions in IndividualValues objects to include only direct types – not inherited superclasses.

Figure 3.31: Activity diagram 19: deep integration details 2. Conversion of the OWL instances into RUBY objects (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

The next action describes builder invoking the method transform_values_references() with template_values as parameter. Next, builder takes and deletes the first IndividualValues object from template_values and stores it in individual_value. If the IndividualValues object stored in individual_value has more than one related ontology class in its types variable, and all of these classes are distinct, then the related instance belongs to a Ghost class. If the IndividualValues object belongs to only one ontology class (or has multiple but mutually equal type-classes), then builder creates a new OWL instance object using the method new() of the first type-class – the corresponding ontology instance is therewith deep-integrated.

If an instance has multiple types, builder determines the RUBY modules and names of all distinct type-classes as well as the classes themselves. The return variables are intersected_names, intersected_modules and intersected_classes. Progressing from the previous action, builder creates names for the required module (module_name) and the corresponding required Ghost class (ghost_name). If a constant called module_name exists in @ontology_model, then builder fetches the corresponding Ghost class (described in activity diagram 21, shown in Figure 3.33 on page 67); otherwise builder creates a new Ghost class (described in activity diagram 20, shown in Figure 3.32).

Figure 3.32 illustrates the creation of a new Ghost class and an instance of it. The six actions building up activity diagram 20 are:

Figure 3.32: Activity diagram 20: Ghost creation. Creation of a new Ghost class followed by the initialization of an instance of it (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

1. builder invokes the method newGhostClass() of @ontology_model with ghost_name as parameter. The return value new_ghost references a meta-programmed RUBY class.

2. builder saves references to each intersected class in new_ghost.intersected_classes.

3. Next builder creates a new RUBY module in @ontology_model using the built-in method const_set() – the new module is temporarily stored in new_module.

4. Following up builder deep integrates the intersected modules into new_module using deep_integrate_ghost_class().

5. Then builder adds new_module to new_ghost using RUBY’s built-in include.

6. In the last step, builder creates the new ontology instance RUBY object using the method new() of new_ghost.

Fetching an existing Ghost class and then creating an instance of it is shown in Figure 3.33. builder invokes the method getGhostClass() of @ontology_model with ghost_name as parameter. The result is stored in ghost, which builder then uses to create the new OWL instance object via its new() method.
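Both paths – creating a Ghost class and fetching an existing one – can be condensed into the following hypothetical sketch. const_defined?, Class.new, Module.new and const_set are the actual RUBY built-ins involved; the method name ghost_class_for() and the merging shortcut are assumptions.

# Sketch of Ghost class handling for multi-typed individuals.
def ghost_class_for(ghost_name, intersected_classes, intersected_modules)
  # Reuse an existing Ghost class if one was created before (diagram 21).
  return @ontology_model.const_get(ghost_name) if @ontology_model.const_defined?(ghost_name)

  # Otherwise create a new Ghost class (diagram 20).
  new_ghost = Class.new do
    class << self
      attr_accessor :intersected_classes
    end
  end
  new_ghost.intersected_classes = intersected_classes

  # Merge the behaviour of all type modules into one module and mix it in.
  new_module = Module.new
  intersected_modules.each { |m| new_module.send(:include, m) }
  new_ghost.send(:include, new_module)

  @ontology_model.const_set(ghost_name, new_ghost)
end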

In this chapter we have so far learned about the DEEP SEMANTICS architecture, its conceptual relation to the OWL LITE abstract syntax, and the details of how a set of RDF triples is read, parsed, converted and finally assembled into a functional ontology model. The remaining sections describe how a deep-integrated ontology can be utilized (Section 3.7) – including an example semantic application implemented from scratch – and how DEEP SEMANTICS compares to other Semantic Web frameworks.

Figure 3.33: Activity diagram 21: using an existing Ghost class. Fetching of an existing Ghost class followed by the creation of an instance of this Ghost (actions are indicated as rounded rectangles in blue; variables as not rounded ones in green).

3.7 Utilization of the Deep Integrated Ontology

3.7.1 Using DEEP SEMANTICS to convert an Ontology into a RUBY Representation

Exploiting the benefits of DEEP SEMANTICS is straightforward. Assuming that an ontology is available, the programmer in charge can obtain a functional representation of this ontology using DEEP SEMANTICS with the following lines of RUBY code:

Listing 3.2: Using DEEP SEMANTICS to create a functional model of IKEN
1 require File.join(’/DeepSemantics/active_semantics.rb’)
2 require File.join(’/DeepSemantics/Helper/namespaces.rb’)
3
4 Namespaces.add_namespace(’http://www.i-ken.de/iken3.owl#’, ’iken:’)
5
6 $deep_semantics = DeepSemantics.instance
7 $iken = $deep_semantics.set_director({’adapter’ => ’FileAdapter’, ’ontology_source’ => ’/iken_inferred3.nt’, ’builder’ => ’DeepIntegrationBuilderOWLLite’})

3.7.2 Working with OWL Classes, Properties and Instances in DEEP SEMANTICS

First, we include the required DEEP SEMANTICS files in our example script:

Listing 3.3: Including DEEP SEMANTICS into custom RUBY code
1 require File.join(’/DeepSemantics/active_semantics.rb’)
2 require File.join(’/DeepSemantics/Helper/namespaces.rb’)

Then we add the namespace of our example ontology to the module Namespaces (line 1 of listing 3.4) and create an instance of DeepSemantics that we assign to the variable $deep_semantics. Furthermore, we specify the director settings with set_director() as follows (line 4): 1) use a file adapter to read the ontology triples from an N-Triple file that 2) can be accessed by the DEEP SEMANTICS director over the declared path, and 3) choose the DeepIntegrationBuilderOWLLite builder to be used for the creation of the ontology model. We get back the converted functional ontology representation in RUBY and save this model in the variable example:

Listing 3.4: Adding the namespace of the ontology and creating a functional ontology model

1 Namespaces.add_namespace(’http://www.ontoverse.org/example.owl#’, ’example:’)
2
3 $deep_semantics = DeepSemantics.instance
4 example = $deep_semantics.set_director({’adapter’ => ’FileAdapter’, ’ontology_source’ => ’/example_inferred3.nt’, ’builder’ => ’DeepIntegrationBuilderOWLLite’})

Now that we can use the functional ontology model, we access the ontology concept Protein and print out its rdfs:label assertions. Of special interest here is that we can access the concept Protein just like any other RUBY class – ontology classes and RUBY classes are functionally equal at this point. First we print out the labels array and then, using this array in a loop, we print every single label on its own line:

Listing 3.5: Printing out the labels of class Protein

1 puts "1. Print out all rdfs:label values for Protein as array:" 2 puts example::Protein.rdfs_label 3 puts # This produces just an empty line in the output. 4 puts "2. Get the rdfs:label values array of Protein and print out each rdfs:label element separately:" 5 example::Protein.rdfs_label.each do |label| 6 puts label 7 end

We get the following output:

> 1. Print out all rdfs:label values for Protein as array:
> ["protein", "Protein", "albumine"]
>
> 2. Get the rdfs:label values array of Protein and print out each rdfs:label element separately:
> protein
> Protein
> albumine

Next we want to see all labels of every instance of Protein. For this kind of task, DEEP SEMANTICS offers the convenient method instances() for every ontology class. The corresponding listing is very straightforward:

Listing 3.6: Printing out the labels of every Protein instance

1 puts "Print for every protein instance all labels:" 2 example::Protein.instances.each do |protein_instance| 3 protein_instance.rdfs_label.each do |label| 4 puts label 5 end 6 puts 7 puts "--next protein--" 8 puts 9 end

The output we get looks like this:

> Print for every protein instance all labels:
> hemoglobin
> haemoglobin
>
> --next protein--
>
> myoglobin
>
> --next protein--
>
> DNA polymerase
>
> --next protein--
>
> collagen
>
> --next protein--
>
> actin
>

Now what about extending the ABox? Using DEEP SEMANTICS this is quite easy and does not require any workarounds. As with any other RUBY class, we use the constructor method new() of Protein and BiologicalFunction to create new instances of these classes. The ontology class Protein is the domain of the datatype property hasMolecularWeight. DEEP SEMANTICS has incorporated this property into the code of Protein as a RUBY instance method. We use the corresponding setter to add the relevant molecular data to our new instance insulin. The code looks like this:

Listing 3.7: Creating a Protein and a BiologicalFunction
1 insulin = example::Protein.new("insulin")
2 insulin.hasMolecularWeight = "5808" # in Dalton
3
4 glucoseRegulation = example::BiologicalFunction.new("GlucoseRegulation")

With the following statements we fetch the instance with the name "insulin" that we have just created and add the instance glucoseRegulation to it as the biological function of insulin, using the setter method hasBiologicalFunction=().

Listing 3.8: Adding information about its biological function to insulin
1 insulin = example.get_instance("insulin")
2
3 insulin.hasBiologicalFunction = glucoseRegulation

The next listing shows how to access information about the biological function of insulin and what happens if we want to add another biological function to insulin, given that hasBiologicalFunction is a functional property:

Listing 3.9: Trying to add a second biological function to the instance insulin
1 puts "1. Insulin has the biological function:"
2 puts insulin.hasBiologicalFunction
3
4 puts # This produces just an empty line in the output.
5
6 glucoseTransportation = example::BiologicalFunction.new("GlucoseTransportation")
7 insulin.hasBiologicalFunction = glucoseTransportation
8
9 puts "2. Insulin has the biological function:"
10 puts insulin.hasBiologicalFunction

The output of listing 3.9:

> 1. Insulin has the biological function:
> #
>
> 2. Insulin has the biological function:
> #

DEEP SEMANTICS has overridden the saved value of hasBiologicalFunction for insulin because hasBiologicalFunction is a functional object property and can therefore have only one value – in DEEP SEMANTICS this is always the latest added one. An interesting side note: the example of listing 3.9 demonstrates that using ontology languages does not prevent the emergence of semantic errors with respect to human interpretation capabilities. That insulin has the biological function of regulating the concentration of glucose in the blood, and not of transporting glucose, has not been explicitly modeled in this example and therefore cannot be known by the system. A solution could be, for example, to create the BiologicalFunction subclasses GlucoseTransportation and GlucoseRegulation, the Protein subclass InsulinFamily, and to constrain the range of hasBiologicalFunction with all-values-from to instances of the class GlucoseRegulation when used with instances of InsulinFamily. Then we could create an instance of GlucoseRegulation modeling glucoseRegulation and an instance of InsulinFamily modeling insulin. The assertion of glucoseTransportation as being the biological function of insulin would then not be allowed by DEEP SEMANTICS, as it would lead to an inconsistency (the value of hasBiologicalFunction has to be an instance of GlucoseRegulation).
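Replaying the example under this hypothetical remodelling, the generated setter would reject the second assertion instead of overriding the stored value. The classes InsulinFamily and GlucoseRegulation with the constrained range do not exist in the example ontology as shipped; the snippet only illustrates the expected behaviour of a DEEP SEMANTICS setter under an all-values-from restriction (compare listing 3.1), reusing the variables of listings 3.4 and 3.9.

# Hypothetical replay after the remodelling sketched above.
insulin = example::InsulinFamily.new("insulin")
insulin.hasBiologicalFunction = example::GlucoseRegulation.new("GlucoseRegulation")

# glucoseTransportation is no instance of GlucoseRegulation, so the
# consistency-safe setter refuses the assignment:
insulin.hasBiologicalFunction = glucoseTransportation
# > GlucoseTransportation cannot be used as value of hasBiologicalFunction!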

Ontology properties are stored in DEEP SEMANTICS as objects of the DEEP SEMANTICS classes DatatypeProperty and ObjectProperty. One can access these properties at runtime, for example, as shown in listing 3.10:

Listing 3.10: Access of a property RUBY object in DEEP SEMANTICS
1 hasBiologicalFunction_property = example.send("hasBiologicalFunction")
2
3 puts "Domain of the hasBiologicalFunction property is:"
4 puts hasBiologicalFunction_property.domain

The following is the output of listing 3.10:

> Domain of the hasBiologicalFunction property is:
> DeepIntegrationBuilderOWLLite::Example::Protein

3.7.3 XPERIMENTR – A Simple Semantic Application using DEEP SEMANTICS

In this section we will get an impression of how to use DEEP SEMANTICS to build a simple semantic application. This application, called XPERIMENTR, can be operated via the command line. The domain of the ontology we use as the knowledge basis of the application is laboratory experiment planning. More precisely, the domain ontology covers protocols, laboratory materials and equipment. The XPERIMENTR application can be used to retrieve information about concepts and instances in the domain as well as to determine which protocols can be executed with a given set of laboratory materials. The XPERIMENTR ontology as well as the corresponding application were developed during this thesis. Information about protocols, laboratory equipment etc. was taken from the Science 2.0 (Yoder & Shneiderman, 2008) website OpenWetWare 1. Stated execution times for protocols are approximate estimates and do not originate from OpenWetWare.

The XPERIMENTR ontology

The detailed list of classes, local restrictions, properties and instances of the XPERIMENTR ontology is given in the appendix in Section 6.1 on page 135. The central concept of the XPERIMENTR ontology is Protocol. Covered types of protocols are currently restricted to buffers, growth media, in vivo and in vitro experiments. Included laboratory material concepts are Antibiotic, Buffer, Chemical, Enzyme, Medium (growth media, for example for bacterial cell cultures) and Organism. Furthermore, the classes Person and LaboratoryEquipment are included. A future version could extend the classes, for example with a concept for bioinformatics protocols or with further subclass specifications of the protocol types already covered.

The XPERIMENTR ontology makes frequent use of the OWL LITE possibility to define inverse properties. The benefit of this is that a reasoner can use these statements to deduce relations that therefore do not have to be stated explicitly. An example of an inverse property inference is:

isExpertOf(Indra, KnightColonyPCR)

⇒ relatedExpert(KnightColonyPCR, Indra)

The used datatype properties are all constrained by the global restriction "functional property", as it would only add complexity without a substantial quality gain to allow, for example, more than one list of procedures (property "hasProcedure"). The ontology contains 57 instances. The included instances comprise laboratory equipment like Bunsen burners, micropipettes, incubators or PCR machines. Additionally, the ontology contains instances of the class LaboratoryMaterial – for example acrylamide, TAE buffer, EDTA and

1 http://openwetware.org/wiki/Main_Page

proteinase K. The instances are related to each other using the defined object and datatype properties. The in vitro protocol for SDS-PAGE, for example, is related to the laboratory material acrylamide using the object property hasRequiredMaterial.

The XPERIMENTR Application

The implementation of the XPERIMENTR example is intended to give an introduction to coding with deep-integrated ontologies as provided by DEEP SEMANTICS. Furthermore, the described source code will be used as reference material for the framework comparison in Section 3.8 on page 81. The implemented functionalities include the retrieval of information about a concept or instance in the ontology, the search for protocols which include a list of user defined laboratory materials, and the search for protocols that can be executed in a given space of time. The implementation is described in detail in the following.

Listing 3.11: Using DEEP SEMANTICS to prepare the ontology knowledge base of XPERIMENTR
1 require File.join(File.dirname(__FILE__), ’active_semantics.rb’)
2 require File.join(File.dirname(__FILE__), ’Helper/namespaces.rb’)
3
4 Namespaces.add_namespace(’http://www.ontoverse.org/xperimentr.owl#’, ’xperimentr:’)
5
6 $deep_semantics = DeepSemantics.instance
7 xperimentr = $deep_semantics.set_director({’adapter’ => ’FileAdapter’, ’ontology_source’ => ’xperimentr_inferred.nt’, ’builder’ => ’DeepIntegrationBuilderOWLLite’})
8
9 puts "Welcome to Xperimentr your wetlab advisor. You may:"
10 puts "1. Enter \"?\" to retrieve information about a certain term."
11 puts "2. Enter \"find\" to find an experiment protocol for the materials you have available."
12 puts "3. Enter \"time\" to find an experiment protocol by execution time."
13 puts "4. Enter \"quit\" to exit."

Listing 3.11 shows the code for the preparation of the ontology knowledge base using DEEP SEMANTICS as well as our welcome message for the user, which is the following:

> "Welcome to Xperimentr your wetlab advisor. You may:" > 1. Enter "?" to retrieve information about a certain term. > 2. Enter "find" to find an experiment protocol for the laboratory materials you have available. > 3. Enter "time" to find an experiment protocol by execution time. > 4. Enter "quit" to exit.

The following listing 3.12 shows the basic control of the user input processing workflow. With the function gets one can read the latest user input from the command line. This input is then assigned to the variable line. Depending on whether the user input is "?", "find", "time" or "quit", the corresponding code block is executed. If line does not match any of the compare strings, an error message is printed.

Listing 3.12: Accepting and processing user input
1 while line = gets
2   case line

3 when "?\n" 4 # Call method for concept and instance information retrieval 5 retrieveInformation() 6 when "find\n" 7 # Call method to determine materials fitting protocols 8 findProtocolsByExperimentMaterials() 9 when "time\n" 10 # Call method to fetch protocols by execution time threshold 11 findProtocolsByExecTime() 12 when "quit\n" 13 break 14 else 15 puts 16 puts 17 puts "We are sorry but your input could not be processed. Please, try again:" 18 puts "1. Enter \"?\" to retrieve information about a certain term." 19 puts "2. Enter \"find\" to find an experiment protocol for the materials you have available. " 20 puts "3. Enter \"time\" to find an experiment protocol by execution time." 21 puts "4. Enter \"quit\" to exit." 22 puts 23 end

If the user enters "?\n", the controller calls the method retrieveInformation(), which implements the functionality to search the ontology for information about a user requested search term. Listings 3.13, 3.14, 3.15 and 3.16 contain the implementation code of retrieveInformation().

Listing 3.13: Implementation of the information retrieval method retrieveInformation(): search for matching classes segment
1 puts "Please enter the term you want to get information about:"
2
3 term = gets.chop # Cut off the newline character
4
5 # First we look up any existing classes with the search term as label.
6 class_hits = xperimentr.find_classes_by_label(term)
7
8 if class_hits.size > 0
9   puts "Some information about #{term}:"
10   class_hits.each do |klass_hit|
11     puts klass_hit.rdfs_comment[0]
12     klass_hit.super_classes.each do |super_klass|
13       puts "#{term} is a #{super_klass.rdfs_label[0]}."
14     end
15   end
16 end

Listing 3.13 shows the implementation of the user input request as well as the search for a class that has an rdfs:label matching the search term. To find these classes we use the DEEP SEMANTICS method find_classes_by_label(term) (line 6 in listing 3.13), provided by the functional ontology model xperimentr, passing the current search term. The return value of find_classes_by_label(term) is an array (class_hits) of classes which satisfy the query. If the size of this array is greater than 0 (line 8 in listing 3.13), the application prints out some information about each of these class hits using the first rdfs:comment (line 11 in listing 3.13) and possibly available superclass names (lines 12 to 14 in the listing).

Listing 3.14: Implementation of the information retrieval method retrieveInformation(): search for matching laboratory material instances

1 # Then we look up any existing instances with the search term as label. As we have three root classes in our ontology we can search for the term in the labels of the instances of each of these root classes.
2
3 laboratory_material_hits = xperimentr::LaboratoryMaterial.find_instances_by_label(term)
4
5 if laboratory_material_hits.size > 0
6   puts "Some information about the laboratory material #{term}:"
7   laboratory_material_hits.each do |laboratory_material_hit|
8     if laboratory_material_hit.rdfs_comment.size > 0
9       puts laboratory_material_hit.rdfs_comment[0]
10     end
11     laboratory_material_hit.types.each do |type|
12       puts "#{term[0,1].capitalize+term[1,term.length]} is a kind of #{type.rdfs_label[0]}."
13       if type.super_classes.size > 0
14         type.super_classes.each do |superclass|
15           puts "#{term[0,1].capitalize+term[1,term.length]} is a kind of #{superclass.rdfs_label[0]}."
16         end
17       end
18     end
19     puts "#{term[0,1].capitalize+term[1,term.length]} is used for the following protocols:"
20     laboratory_material_hit.isUsedForProtocol.each do |protocol|
21       puts protocol.rdfs_label[0]
22     end
23   end
24 end

By implementing the code of listing 3.14, XPERIMENTR provides the functionality to search for instances of the class LaboratoryMaterial. This is accomplished by calling the method find_instances_by_label(term) of the DEEP SEMANTICS integrated representation of the ontology class. The RUBY implementation of LaboratoryMaterial is accessible via the ontology’s namespace xperimentr (line 3 in the listing). The return value is then assigned to the variable laboratory_material_hits. In the case that the size of the array laboratory_material_hits is greater than 0, the system prints out detailed information stored in the ontology about the found laboratory materials (lines 5 to 24). With the method call laboratory_material_hit.types in line 11 one can fetch all the corresponding classes the instance belongs to. For the instance ProteinaseK, for example, the return value of types is an array containing the ontology classes Enzyme and LaboratoryMaterial. With type.super_classes (line 14) the superclasses of each type are retrieved and then printed out (line 15). The method call laboratory_material_hit.isUsedForProtocol() in line 20 returns an array of instances of the class Protocol for which the laboratory material is required. Each of these returned protocols, respectively their first rdfs:label entries, is then printed out (line 21).

Listing 3.15: Implementation of the information retrieval method retrieveInformation(): search for matching laboratory equipment instances
1 laboratory_equipment_hits = xperimentr::LaboratoryEquipment.find_instances_by_label(term)
2
3 if laboratory_equipment_hits.size > 0
4   puts "Some information about the laboratory equipment #{term}:"
5   laboratory_equipment_hits.each do |laboratory_equipment_hit|
6     if laboratory_equipment_hit.rdfs_comment.size > 0
7       puts laboratory_equipment_hit.rdfs_comment[0]
8     end
9     laboratory_equipment_hit.types.each do |type|
10       puts "#{term[0,1].capitalize+term[1,term.length]} is a kind of #{type.rdfs_label[0]}."

11       if type.super_classes.size > 0
12         type.super_classes.each do |superclass|
13           puts "#{term[0,1].capitalize+term[1,term.length]} is a kind of #{superclass.rdfs_label[0]}."
14         end
15       end
16     end
17     puts "#{term[0,1].capitalize+term[1,term.length]} is required for protocols:"
18     laboratory_equipment_hit.isRequiredForProtocol.each do |protocol|
19       puts protocol.rdfs_label[0]
20     end
21   end
22 end

The code of listing 3.15 implements the information retrieval workflow for laboratory equipment instances. This code is largely similar to the implementation shown in listing 3.14. Enabled by the transformations performed by DEEP SEMANTICS, the searched instances of laboratory equipment are retrieved with xperimentr::LaboratoryEquipment.find_instances_by_label(term) (line 1). Further direct utilization of DEEP SEMANTICS can be seen, for example, in line 9, where the class types of the retrieved laboratory equipment instances are fetched using laboratory_equipment_hit.types(), and in line 18, where the protocols are retrieved that require the corresponding laboratory equipment (laboratory_equipment_hit.isRequiredForProtocol()).

Listing 3.16: Implementation of the information retrieval method retrieveInformation(): search for matching Protocol instances
1 protocol_hits = xperimentr::Protocol.find_instances_by_label(term)
2
3 if protocol_hits.size > 0
4   puts "Some information about the protocol #{term}:"
5   protocol_hits.each do |protocol_hit|
6     if protocol_hit.rdfs_comment.size > 0
7       puts protocol_hit.rdfs_comment[0]
8     end
9     protocol_hit.types.each do |type|
10       puts "#{term[0,1].capitalize+term[1,term.length]} is a kind of #{type.rdfs_label[0]}."
11       if type.super_classes.size > 0
12         type.super_classes.each do |superclass|
13           puts "#{term[0,1].capitalize+term[1,term.length]} is a kind of #{superclass.rdfs_label[0]}."
14         end
15       end
16     end
17     if protocol_hit.hasRequiredLaboratoryEquipment && (protocol_hit.hasRequiredLaboratoryEquipment.size > 0)
18       puts "#{term[0,1].capitalize+term[1,term.length]} has required laboratory equipment:"
19       protocol_hit.hasRequiredLaboratoryEquipment.each do |equipment|
20         puts equipment.rdfs_label[0]
21       end
22     end
23     puts "#{term[0,1].capitalize+term[1,term.length]} requires material:"
24     if protocol_hit.hasRequiredMaterial && (protocol_hit.hasRequiredMaterial.size > 0)
25       protocol_hit.hasRequiredMaterial.each do |material|
26         puts material.rdfs_label[0]
27       end
28     end
29     puts "#{term[0,1].capitalize+term[1,term.length]} has the following procedure:"
30     puts protocol_hit.hasProcedure[0]
31     if protocol_hit.hasRequiredExecutionTime && (protocol_hit.hasRequiredExecutionTime.size > 0)

32       puts "#{term[0,1].capitalize+term[1,term.length]} has required execution time in minutes: #{protocol_hit.hasRequiredExecutionTime}"
33     end
34     puts "The following experts are related to #{term[0,1].capitalize+term[1,term.length]}:"
35     protocol_hit.relatedExpert.each do |expert|
36       puts expert.rdfs_label[0]
37     end
38   end
39 end

The last code segment of the information retrieval method implementation is shown in listing 3.16. Analogous to listings 3.14 and 3.15, the DEEP SEMANTICS transformed methods xperimentr::Protocol.find_instances_by_label(term) (line 1) and protocol_hit.types() (line 9) are used to process the related ontology constructs. Additional uses of DEEP SEMANTICS in this listing are:

• Line 17: The method protocol_hit.hasRequiredLaboratoryEquipment() is used to fetch all required laboratory equipment instances for the selected protocol from the ontology.

• Line 24: In the ontology every protocol is related to zero or more laboratory materials using the object property hasRequiredMaterial.

• Line 30: With protocol_hit.hasProcedure[0] a string representing the experiment's procedure is accessed. The return value is an array and, because we have defined hasProcedure as a functional property (that means it has at most one value), one can retrieve the unique property value with "hasProcedure[0]".

• Line 31: The datatype property hasRequiredExecutionTime is used to save the experiment's execution time in minutes. Using the DEEP SEMANTICS generated method protocol_hit.hasRequiredExecutionTime() one can retrieve this execution time.
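To make the behavior of these generated accessors explicit, the following sketch shows what a getter and setter pair for a functional datatype property could look like. It is an illustration only, not DEEP SEMANTICS's actual generated source; the value-array naming follows the meta-programming code shown in Table 3.2.

  # Illustrative sketch: accessors as they might be generated for a functional
  # datatype property (at most one value per instance; values kept in an array).
  class Protocol
    def hasRequiredExecutionTime
      @hasRequiredExecutionTime_values ||= []
    end

    def hasRequiredExecutionTime=(new_value)
      @hasRequiredExecutionTime_values ||= []
      if @hasRequiredExecutionTime_values.empty?
        @hasRequiredExecutionTime_values << new_value   # first value is appended
      else
        @hasRequiredExecutionTime_values[0] = new_value # functional: overwrite
      end
    end
  end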

Listing 3.17: Implementation of the XPERIMENTR method findProtocolsByExperimentMaterials()
1  materials = Array.new
2  found_protocols = xperimentr::Protocol.instances
3
4  puts "Please enter the next laboratory material or \"end\" to stop:"
5
6  while material_input = gets
7    case material_input
8    when "end\n"
9      break
10   else
11     material_input = material_input.chop
12     if xperimentr::LaboratoryMaterial.find_instances_by_label(material_input).size > 0 # If a matching material exists in the ontology
13       materials << material_input
14       new_found_protocols = Array.new
15       found_protocols.each do |found_protocol|
16         if found_protocol.hasRequiredMaterial && (found_protocol.hasRequiredMaterial.size > 0)
17           found_protocol.hasRequiredMaterial.each do |material|
18             if material.rdfs_label.include?(material_input)
19               new_found_protocols << found_protocol
20             end
21           end
22         end
23       end
24       found_protocols = new_found_protocols
25       puts "You have entered the following materials:"
26       materials.each do |material|
27         puts material
28       end
29       puts "The following protocols include these materials:"
30       found_protocols.each do |found_protocol|
31         puts found_protocol.rdfs_label[0]
32       end
33     else
34       puts "#{material_input} is not a known laboratory material."
35     end
36     puts "Please enter the next material or \"end\" to stop:"
37   end
38 end

The second interaction mode offers the possibility to search for protocols that include a set of entered laboratory materials. Listing 3.17 shows the code of method findProtocolsByExperimentMaterials() that implements this function. The array materials (line 1) saves the materials entered by the user during a search session. The statement "found_protocols = xperimentr::Protocol.instances" in the next line utilizes DEEP SEMANTICS to retrieve all instances of the class Protocol and stores the return values in found_protocols. From lines 6 to 38 a while loop is used to capture user-entered laboratory materials that are then used to determine appropriate protocols. The user command "end" tells the application to leave the current while loop. If the word entered by the user is not equal to "end", then the application tests whether or not this word can be associated with a known LaboratoryMaterial instance using "if xperimentr::LaboratoryMaterial.find_instances_by_label(material_input).size > 0" in line 12. If the input term can be mapped to at least one instance of LaboratoryMaterial, then a) we save this term in the array materials (line 13), b) we look up whether any protocol meets the required material conditions (lines 14 to 24) and c) we print out the set of entered laboratory materials (lines 25 to 28) as well as all matching protocols, or their first rdfs:label entries, respectively (lines 29 to 32).

Listing 3.18: Implementation of the XPERIMENTR method findProtocolsByExecTime()
1  puts "Please enter the maximum execution time in minutes acceptable for you:"
2
3  time = gets.chop
4
5  puts "Protocols that can be executed in #{time} minutes:"
6  xperimentr::Protocol.instances.each do |protocol|
7    if protocol.hasRequiredExecutionTime && (protocol.hasRequiredExecutionTime.size > 0)
8      if protocol.hasRequiredExecutionTime[0].to_i < time.to_i
9        puts "#{protocol.rdfs_label[0]} can be prepared in #{protocol.hasRequiredExecutionTime[0].to_i} minutes."
10     end
11   end
12 end

If the user enters the command "time" in the main program control loop, the method findProtocolsByExecTime() is called. The code of this method is shown in listing 3.18. The processing steps of this method are:

• Line 3: Capturing the maximum execution time in minutes entered by a user.

• Lines 6 to 12: Iterate over all instances of class Protocol in order to . . .

– Line 7: ... to check with "protocol.hasRequiredExecutionTime && (protocol.hasRequiredExecutionTime.size > 0)" if the protocol has an execution time value and . . .
– Line 8: ... to test if an existing execution time value is less than the user-given threshold (if this test returns "TRUE", the protocol and its execution time are printed out).

Summing up this subsection, DEEP SEMANTICS has been successfully applied for a variety of ontology processing tasks. Deep-integrated RUBY representations of ontology classes were generated using DEEP SEMANTICS (see listing 3.11). These transformed ontology classes were then applied to access their corresponding instances (for example using laboratory_material_hits = xperimentr::LaboratoryMaterial.find_instances_by_label(term) at line 3 of listing 3.14). Besides applying deep-integrated classes, working with ontology instances was important for the realization of listings 3.14 to 3.18. DEEP SEMANTICS provides ontology instance objects with methods to access related object and datatype properties (for example the access of the deep-integrated datatype property protocol.hasRequiredExecutionTime at line 7 of listing 3.18 or the use of the integrated object property found_protocol.hasRequiredMaterial at line 16 of listing 3.17).

Additionally, DEEP SEMANTICS provides convenient methods to access ontology constructs using label matching. The method call xperimentr::Protocol.instances (line 2 in listing 3.17), for example, returns all instances of the ontology class Protocol. Likewise, ontology classes themselves were retrieved by matching a given search string with the class rdfs:label values (such as xperimentr.find_classes_by_label(term) in line 6 of listing 3.13).
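The following fragment condenses these access patterns into one place. It is a usage sketch against the deep-integrated xperimentr module of the listings above; only method names that appear in those listings are used, and the search term is made up for illustration.

  term = "SDS-PAGE"

  # Class lookup by rdfs:label matching (cf. listing 3.13).
  matching_classes = xperimentr.find_classes_by_label(term)

  # All instances of a deep-integrated ontology class (cf. listing 3.17).
  all_protocols = xperimentr::Protocol.instances

  # Instance lookup by rdfs:label matching (cf. listings 3.14 to 3.16).
  protocol_hits = xperimentr::Protocol.find_instances_by_label(term)

  # Generated property accessors on the retrieved instances (cf. listing 3.18).
  protocol_hits.each do |protocol|
    puts protocol.rdfs_label[0]
    puts protocol.hasRequiredExecutionTime[0] if protocol.hasRequiredExecutionTime
  end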

XPERIMENTR in action

In the previous subsection we learned how XPERIMENTR has been implemented using DEEP SEMANTICS together with its required ontology knowledge base. In this subsection a typical user session with the XPERIMENTR application is described. Each session starts with a welcome message from the application:

> Welcome to Xperimentr your wetlab advisor. You may:
> 1. Enter "?" to retrieve information about a certain term.
> 2. Enter "find" to find an experiment protocol for the materials you have available.
> 3. Enter "time" to find an experiment protocol by execution time.
> 4. Enter "quit" to exit.

We choose to retrieve some information about the term "SDS-PAGE" and enter "?". The next message of the system is:

> Please enter the term you want to get information about:

We enter "SDS-PAGE" and the system responses information stored in the ontology about the corresponding instance SDSPAGE:

> Some information about the protocol SDS-PAGE:
> "The pH of the separating gel in standard SDS-PAGE (a.k.a. Laemmli buffer system) is roughly 8-9 which is conducive to the deamination and alkylation of proteins, as well as reoxidation of reduced cysteines during electrophoresis. What this means is that your protein will form disulfide crosslinks during the stacking event because the protein migrates into the gel away from the reducing reagent in the sample buffer, and gets focused to a high concentration."
>
> SDS-PAGE is a kind of in vitro protocol.
> SDS-PAGE is a kind of protocol.
>
> SDS-PAGE requires material:
> sodium bisulfite
> 5x high-MW running buffer
> 5x low-MW running buffer
> acrylamide
> 3.5X bis-Tris gel buffer
>
> SDS-PAGE has the following procedure:
> Resolving: Mix: 1/3.5 vol. of 3.5X bis-Tris gel buffer, acrylamide to 8% (30:2.0) or 12-15% (30:0.8), and water to final volume. I make 3.75 mLs for each Bio-Rad Protein gel, and use 3.5 mLs per gel. . . .
> SDS-PAGE has required execution time in minutes: 333
> The following experts are related to SDS-PAGE:
> Ilija
> Nahal
> Deniz
> Tim
> Indra

This informational response is followed by a reminder of what the user can do next:

> What do you want to do next? You may:
> 1. Enter "?" to retrieve information about a certain term.
> 2. Enter "find" to find an experiment protocol for the materials you have available.
> 3. Enter "time" to find an experiment protocol by execution time.
> 4. Enter "quit" to exit.

We enter the command "find" and read the response:

> Please enter the next laboratory material or "end" to stop:

We enter the term "sodium bisulfite" and get the response:

> You have entered the following materials:
> sodium bisulfite
>
> The following protocols include these materials:
> 5X low-MW running buffer protocol
> 5X high-MW running buffer protocol
> SDS-PAGE
>
> Please enter the next material or "end" to stop:

Next we enter the term "acrylamide" and get the response:

> You have entered the following materials:
> sodium bisulfite
> acrylamide
>
> The following protocols include these materials:
> SDS-PAGE
>
> Please enter the next material or "end" to stop:

We enter "end" to stop this session and see once again the four basic selection possibilities of XPERIMENTR. This time we enter the "time" command to find protocols by their execution time. The system prompts:

> Please enter the maximum execution time in minutes acceptable for you:

After entering "240" (this could be useful for example if a biologist wants to know which protocols he can still execute in the residual labour time) we get the information:

> Protocols that can be executed in 240 minutes:
> Blackburn yeast colony PCR can be prepared in 70 minutes.
> mouse tissue lysis for genotyping can be prepared in 88 minutes.
> Knight Colony PCR can be prepared in 200 minutes.
> Affymetrix DNA labelling for gene expression arrays can be prepared in 50 minutes.
> PCR supermix protocol can be prepared in 20 minutes.

Now that we have discussed every functionality of the system, we enter "quit" to leave the program. The following section compares the most commonly used Semantic Web frameworks JENA2, OWL API and ACTIVERDF with DEEP SEMANTICS.

3.8 Comparison of DEEP SEMANTICS with other Semantic Web Frameworks

Specialized ontology processing frameworks form the foundation of the upcoming Semantic Web, and in recent years several such frameworks have been published. Three of them are compared with DEEP SEMANTICS in this section. Two of these frameworks, JENA2 and OWL API, are implemented in JAVA, while the third one (ACTIVERDF) is developed using RUBY.

The JENA2 framework is open source and conceptually centered on the RDF graph (see Subsection 2.7.1 on page 20). While offering similar functionalities, OWL API's architecture follows an axiomatic approach (introduced in Subsection 2.7.2 on page 21) regarding access and modification of ontology constructs. The third framework, ACTIVERDF (described in Subsection 2.7.3 on page 21), is the most similar to DEEP SEMANTICS of the three compared frameworks. DEEP SEMANTICS and ACTIVERDF are both developed in RUBY using the scripting language's metaprogramming features, and neither offers support for TBox modification.

The comparisons are based on the XPERIMENTR implementations (for programming complex- ity comparison) and on two different reference tests, for which solutions were programmed for each framework. Each of these implemented solutions was then executed twice: firstly, us- ing an inferred version of the IKEN ontology (described on page 102) and secondly, using an also inferred version of the BIO2ME ontology (Mainz, 2008). Ontology metrics are: 1) IKEN ontology comprising 366 named classes, 86 object properties, 23 datatype properties and 859 individuals. 2) BIO2ME ontology consisting of 213 named classes, 41 object properties, 29 datatype properties and 343 individuals. The first reference test requires the corresponding solutions to list all classes of the ontology by their hierarchy level. Programming solutions for this type of test were realized for JENA2, OWL API and DEEP SEMANTICS, while ACTIVERDF does not provide necessary function- alities to determine the class hierarchy and therefore could not be used in this test. The second reference test is described as: "find all instances that match one of ten given search terms by using RDFS labels of instances". This test was performed in all four frameworks.
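For illustration, the second reference test can be phrased with DEEP SEMANTICS roughly as follows. It is written against the xperimentr module for notational consistency with the earlier listings, the ten search terms are placeholders (the actual terms are not listed here), and the assumption that a common root class Thing exposes the label search of listing 3.24 is made for brevity.

  # Sketch of reference test 2: find all instances matching one of ten given
  # search terms by their RDFS labels. The terms below are placeholders.
  search_terms = %w[term01 term02 term03 term04 term05
                    term06 term07 term08 term09 term10]

  hits = []
  search_terms.each do |term|
    hits.concat(xperimentr::Thing.find_instances_by_label(term))
  end
  puts "#{hits.size} matching instances found."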

DEEP SEMANTICS was compared to the three other frameworks regarding programming as well as runtime and memory complexity. Programming complexity in this context refers to the number of code lines required for an implementation. The different test implementations can be found in the appendix in Section 6.2. All tests were carried out on an Apple MacBook Pro (Mac OS X version 10.5.5) with a 2.33 GHz Intel Core 2 Duo and 2 GB 667 MHz DDR2 SDRAM, using RUBY in version 1.8.6.

3.8.1 DEEP SEMANTICS versus OWL API

While DEEP SEMANTICS is implemented in RUBY, and OWL API in JAVA, both frameworks are oriented to OWL's abstract syntax. A principal difference is the focus of DEEP SEMANTICS on editing the ABox, in contrast to the ABox and TBox modifications in

OWL API. Summing up, the OWL API cannot be consistency safe in principle, as this would not allow TBox modifications, which are necessary to further extend the set of classes and their interrelations.

Programming Complexity Comparison

In this subsection DEEP SEMANTICS is compared to OWL API with respect to programming complexity or coding complexity, respectively. The comparison was performed using framework-specific implementations of the XPERIMENTR example introduced in Section 3.7.3 on page 71. Listing 3.21 shows the source code required to print the ontology class hierarchy to standard output using DEEP SEMANTICS. Listings 3.19 and 3.20 contain the corresponding implementation based on OWL API. The most obvious difference is the lack of convenient methods like DEEP SEMANTICS's listClassesByLevel in OWL API. This method returns a RUBY hash containing hierarchy levels as keys and the corresponding classes of these levels as related values. Lines 43 to 48 in listing 3.20 display the source code for traversing the class hierarchy using OWL API. As method getSubClasses(this.ontology) returns all subclasses (direct and indirect) including B-Nodes (restrictions), programmers have to implement the following checks: A) that the subclass is not equal to the calling class (!child.equals(clazz)), B) that the subclass is not a B-Node (!child.isAnonymous()) and C) that the subclass is not equal to OWL Nothing (!child.isOWLNothing()). In contrast, in DEEP SEMANTICS only the class method direct_sub_classes has to be called, which returns all direct subclasses excluding any B-Nodes. Another frequently required task is the computation of all direct instances of an ontology class, as shown in lines 22 to 37 of listing 3.20. This is also significantly more complicated when using OWL API. This framework's class OWLClass provides the method getIndividuals(this.ontology), which returns all instances of the corresponding ontology class. However, calling this method returns instances of possibly existing subclasses, too. Consequently, filtering out only direct instances becomes quite complicated and involves processing of subclasses (starting at line 26). Using DEEP SEMANTICS, retrieving all direct instances is provided via the convenient method direct_instances, as shown in line 11 of listing 3.21.
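The thesis text describes listClassesByLevel only by its contract; the following is a sketch of how a method with this contract could be realized on top of direct_sub_classes. The accessor root_classes is an assumption of this sketch and, like the OWL API variant, it makes no attempt to deal sensibly with multiple inheritance.

  # Sketch: breadth-first construction of a {level => [classes]} hash, as a
  # method like listClassesByLevel could be realized. root_classes and
  # direct_sub_classes are assumed framework accessors.
  def listClassesByLevel
    levels = Hash.new { |hash, key| hash[key] = [] }
    queue = root_classes.map { |root| [root, 0] }
    until queue.empty?
      klass, level = queue.shift
      levels[level] << klass
      klass.direct_sub_classes.each { |sub| queue << [sub, level + 1] }
    end
    levels
  end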

Listing 3.19: XPERIMENTR implementation using OWL API: Print out the class hierarchy of the ontology.
1  // Print out all of the classes which are referenced in the
2  // ontology by hierarchy level
3  System.out.println("Known classes:");
4
5  Set<OWLClass> klasses = ontology.getReferencedClasses();
6  for (OWLClass klass : klasses) {
7      try {
8          xperimentr.printHierarchy(ontology, klass);
9      } catch (OWLException e) {
10         System.out.println("The class hierarchy could not be printed: " + e.getMessage());
11     }
12 }

Listing 3.20: XPERIMENTR implementation using OWL API: printHierarchy() methods.

1  /**
2   * Print the class hierarchy for the given ontology from this class
3   * down, assuming this class is at the given level.
4   * Makes no attempt to deal sensibly with multiple inheritance.
5   */
6  public void printHierarchy(OWLOntology ontology, OWLClass clazz) throws OWLException {
7      this.ontology = ontology;
8      printHierarchy( clazz, 0 );
9  }
10
11 /**
12  * Print the class hierarchy from this class down, assuming this class is at
13  * the given level. Makes no attempt to deal sensibly with multiple
14  * inheritance.
15  */
16 public void printHierarchy(OWLClass clazz, int level) throws OWLException {
17     System.out.println("Level: "+level);
18     System.out.println(" "+clazz);
19
20     /* Find this classes instances */
21     System.out.println(" This classes direct instances:");
22     Set<OWLIndividual> instances = clazz.getIndividuals(this.ontology);
23     for (OWLIndividual instance : instances) {
24         Boolean is_direct_instance = true;
25
26         Set<OWLDescription> subclasses = clazz.getSubClasses(this.ontology);
27         for (OWLDescription subclass : subclasses) {
28             if (!subclass.isAnonymous() && !subclass.equals(clazz) && !subclass.isOWLNothing()) {
29                 if (instance.getTypes(this.ontology).contains(subclass)) {
30                     is_direct_instance = false;
31                 }
32             }
33         }
34         if (is_direct_instance) {
35             System.out.println(" "+instance);
36         }
37     }
38     System.out.println();
39     System.out.println();
40     System.out.println("------");
41
42     /* Find the children and recurse */
43     Set<OWLDescription> children = clazz.getSubClasses(this.ontology);
44     for (OWLDescription child : children) {
45         if (!child.equals(clazz) && !child.isAnonymous() && !child.isOWLNothing()) {
46             printHierarchy(child.asOWLClass(), level + 1);
47         }
48     }
49 }

Listing 3.21: XPERIMENTR implementation using DEEP SEMANTICS: Print out the class hierarchy of the ontology.
1  # Print out all of the classes which are referenced in the
2  # ontology by hierarchy level
3  puts "Known classes:"
4
5  xperimentr.listClassesByLevel.each_pair do |level, classes|
6    puts "Level: #{level}"
7    classes.each do |klass|
8      puts " "+klass.local_name
9      puts
10     puts " This classes direct instances:"
11     klass.direct_instances.each do |instance|
12       puts " "+instance.local_name
13     end
14     puts "------"
15     puts
16   end
17 end

Listing 3.16 on page 75 shows the implementation of the information retrieval method retrieveInformation() in the DEEP SEMANTICS implementation of XPERIMENTR. Method find_instances_by_label(term) returns, analogous to the above-mentioned method find_classes_by_label(term) (line 6 of listing 3.13), all instances that contain an RDFS label which matches the provided search term. Listing 3.23 in contrast contains the realization with the OWL API, which is more complicated due to the lack of a convenient method. Method getIndividuals(ontology) in line 3 returns all instances of the corresponding class in the passed ontology and assigns these to variable instances. For each OWL individual stored in instances (line 5) all RDFS labels are iterated (line 6). If the method call annotation.isAnnotationByConstant() (line 7) returns TRUE, meaning the annotation value is not an entity (class, property or individual), the annotation's method getAnnotationValueAsConstant() is invoked (line 8). Additionally, the literal value of variable value is compared with the passed search label (line 9) and, if this comparison returns TRUE, variable instance is added to the OWLIndividual set instance_hits (line 10).

Lines 6 to 8 of listing 3.23, for example, can be substituted by one line in DEEP SEMANTICS by calling method instance.rdfs_label. This method call returns all strings stored as labels of the instance. Additionally, the complete listing can be substituted by calling klass.find_instances_by_label. The OWL API-based listing 3.22 comprises the implementation of the information retrieval functionality related to the discovery of instances of the XPERIMENTR class Protocol. A particularly noteworthy programming issue is the access of property values of OWL individuals. Lines 19 to 27, for example, realize the fetching of values of the object property hasRequiredLaboratoryEquipment for the current instance of Protocol, which is stored in variable protocolHit.

Listing 3.22: XPERIMENTR implementation using OWL API: information retrieval implementation for instances of ontology class Protocol.
1  OWLClass protocolKlass = factory.getOWLClass(URI.create(base + "#Protocol"));
2
3  Set<OWLIndividual> protocolHits = findInstancesByLabel(ontology, protocolKlass, searchTerm);
4
5  if (protocolHits.size() > 0) {
6      System.out.println( "Some information about the protocol "+searchTerm+":" );
7      for (OWLIndividual protocolHit : protocolHits) {
8          if (protocolHit.getAnnotations(ontology, OWLRDFVocabulary.RDFS_COMMENT.getURI()).size() > 0) {
9              OWLAnnotation comment = (OWLAnnotation) protocolHit.getAnnotations(ontology, OWLRDFVocabulary.RDFS_COMMENT.getURI()).toArray()[0];
10             System.out.println( comment.getAnnotationValueAsConstant().getLiteral() );
11         }
12
13         for (OWLDescription type : protocolHit.getTypes(ontology) ) {
14             if (!type.isAnonymous() && !type.isOWLThing() ) {
15                 OWLAnnotation label = (OWLAnnotation) type.asOWLClass().getAnnotations( ontology, OWLRDFVocabulary.RDFS_LABEL.getURI() ).toArray()[0];
16                 System.out.println( searchTerm+" is a kind of "+label.getAnnotationValueAsConstant().getLiteral() );
17             }
18         }
19
20         Map<OWLObjectPropertyExpression, Set<OWLIndividual>> protocolHitObjectPropertyMap = protocolHit.getObjectPropertyValues(ontology);
21         OWLObjectPropertyExpression requiredLaboratoryEquipment = factory.getOWLObjectProperty(URI.create(base + "#hasRequiredLaboratoryEquipment"));
22         if ( protocolHitObjectPropertyMap.containsKey(requiredLaboratoryEquipment) ) {
23             System.out.println(searchTerm+" has required laboratory equipment:");
24             for (OWLIndividual equipment : protocolHitObjectPropertyMap.get(requiredLaboratoryEquipment) ) {
25                 OWLAnnotation equipmentLabel = (OWLAnnotation) equipment.getAnnotations( ontology, OWLRDFVocabulary.RDFS_LABEL.getURI() ).toArray()[0];
26                 System.out.println( equipmentLabel.getAnnotationValueAsConstant().getLiteral() );
27             }
28         }
29
30         OWLObjectPropertyExpression requiredMaterial = factory.getOWLObjectProperty(URI.create(base + "#hasRequiredMaterial"));
31         if ( protocolHitObjectPropertyMap.containsKey(requiredMaterial) ) {
32             System.out.println(searchTerm+" has required material:");
33             for (OWLIndividual material : protocolHitObjectPropertyMap.get(requiredMaterial) ) {
34                 OWLAnnotation materialLabel = (OWLAnnotation) material.getAnnotations( ontology, OWLRDFVocabulary.RDFS_LABEL.getURI() ).toArray()[0];
35                 System.out.println( materialLabel.getAnnotationValueAsConstant().getLiteral() );
36             }
37         }
38
39         Map<OWLDataPropertyExpression, Set<OWLConstant>> protocolHitDatatypePropertyMap = protocolHit.getDataPropertyValues(ontology);
40         OWLDataPropertyExpression procedure = factory.getOWLDataProperty(URI.create(base + "#hasProcedure"));
41         if ( protocolHitDatatypePropertyMap.containsKey(procedure) ) {
42             System.out.println(searchTerm+" has the following procedure:");
43             OWLConstant procedureString = (OWLConstant) protocolHitDatatypePropertyMap.get(procedure).toArray()[0];
44             System.out.println( procedureString.getLiteral() );
45         }
46
47         OWLDataPropertyExpression hasRequiredExecutionTime = factory.getOWLDataProperty(URI.create(base + "#hasRequiredExecutionTime"));
48         if ( protocolHitDatatypePropertyMap.containsKey(hasRequiredExecutionTime) ) {
49             OWLConstant executionTime = (OWLConstant) protocolHitDatatypePropertyMap.get(hasRequiredExecutionTime).toArray()[0];
50             System.out.println( searchTerm+" has required execution time in minutes: "+executionTime.getLiteral() );
51         }
52
53         OWLObjectPropertyExpression relatedExpert = factory.getOWLObjectProperty(URI.create(base + "#relatedExpert"));
54         if ( protocolHitObjectPropertyMap.containsKey(relatedExpert) ) {
55             System.out.println();
56             System.out.println("The following experts are related to "+searchTerm+":");
57             for (OWLIndividual expert : protocolHitObjectPropertyMap.get(relatedExpert) ) {
58                 OWLAnnotation expertLabel = (OWLAnnotation) expert.getAnnotations( ontology, OWLRDFVocabulary.RDFS_LABEL.getURI() ).toArray()[0];
59                 System.out.println( expertLabel.getAnnotationValueAsConstant().getLiteral() );
60             }
61         }
62     }
63 }

Listing 3.23: XPERIMENTR implementation using OWL API: findInstancesByLabel() method.
1  public static Set<OWLIndividual> findInstancesByLabel(OWLOntology ontology, OWLClass klass, String searchLabel) {
2      Set<OWLIndividual> instance_hits = new HashSet<OWLIndividual>();
3      Set<OWLIndividual> instances = klass.getIndividuals(ontology);
4
5      for (OWLIndividual instance : instances) {
6          for (OWLAnnotation annotation : instance.getAnnotations(ontology, OWLRDFVocabulary.RDFS_LABEL.getURI())) {
7              if (annotation.isAnnotationByConstant()) {
8                  OWLConstant value = annotation.getAnnotationValueAsConstant();
9                  if ( value.getLiteral().equals(searchLabel) ) {
10                     instance_hits.add( instance );
11                 }
12             }
13         }
14     }
15
16     return instance_hits;
17 }

Listing 3.24: DEEP SEMANTICS internal implementation of method find_instances_by_label(label).
1 def self.find_instances_by_label(label)
2   results = Array.new
3   self.instances.each do |instance|
4     if instance.rdfs_label.include?(label)
5       results << instance
6     end
7   end
8   return results
9 end

In summary, comparing the programming complexity of both frameworks reveals that implementations based on DEEP SEMANTICS need considerably fewer lines of code. Counting only the functional lines (not comments or empty lines), the comparison metrics are: for the findInstancesByLabel functionality, 15 lines for OWL API (listing 3.23) compared to nine lines for the internal DEEP SEMANTICS realization of find_instances_by_label(label) (see listing 3.24), and one line when using find_instances_by_label(label) in a DEEP SEMANTICS based semantic application. Likewise, the OWL API based implementation of listing 3.22 requires 51 lines, in contrast to 32 lines for the DEEP SEMANTICS-based listing 3.16. The difference in the number of needed lines is even higher when additionally considering the 15 lines of the findInstancesByLabel implementation shown in listing 3.23.

Runtime and Memory Complexity Comparison

The results of the runtime and memory complexity comparison between DEEP SEMANTICS and OWL API are shown in Table 3.3. Noticeable for DEEP SEMANTICS is the immense difference between the complete runtime and the time required for the corresponding operations alone. For example, the test "List classes by level: IKEN ontology" has a complete runtime of 47.080 seconds, while the list operation runtime is 0.021 seconds, meaning that the list operation is more than 1,000 times faster than the complete run. This large difference in runtime is explained by the required deep integration processing steps.

In a direct comparison between DEEP SEMANTICS and OWL API, the complete runtimes are approximately 5 to 10 times longer for DEEP SEMANTICS. However, computing the operations alone is around 20 to 200 times faster with DEEP SEMANTICS. This is especially interesting as in benchmarks2 JAVA is still significantly faster than RUBY. Concerning peak main memory requirements, DEEP SEMANTICS needs around three times more memory. In contrast, the average main memory usage of DEEP SEMANTICS is almost equal to OWL API's peak memory consumption.
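The text does not show how the two timings were separated; a minimal way to measure such a complete-versus-operation split in RUBY is sketched below, using only the standard Benchmark library. The loading call DeepSemantics.load is hypothetical.

  require 'benchmark'

  # Complete runtime: deep integration plus the measured operation.
  complete = Benchmark.realtime do
    @ontology = DeepSemantics.load("iken.owl")  # hypothetical loading call
    @ontology.listClassesByLevel
  end

  # Operation runtime alone, on the already deep-integrated model.
  operation = Benchmark.realtime { @ontology.listClassesByLevel }

  puts "complete: #{complete}s, operation: #{operation}s"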

3.8.2 DEEP SEMANTICS versus JENA2

JENA2 is also developed in JAVA. Analogous to OWL API, JENA2 supports ABox and TBox modifications. As strict consistency safeness would not allow TBox editing, JENA2 is therefore also not consistency safe.

Programming Complexity Comparison

Listing 3.21 on page 83 shows the source code for an output of the class hierarchy of the XPERIMENTR ontology. The easy access to the class hierarchy by calling the method listClassesByLevel in line five is characteristic for the DEEP SEMANTICS framework. A similar example of DEEP SEMANTICS' simplicity is the method direct_instances in line eleven. This method returns all direct instances of the calling ontology class.

Listings 3.25 and 3.26 show the implementation of an analogous functionality with JENA2. Listing 3.25 contains the source code for the iteration through all classes of the ontology. As JENA2 does not offer a direct way to determine the level of a class, one has to check: 1) if the current class is a root class, using klass.isHierarchyRoot(), and 2) with (klass.listSuperClasses(true).toSet().size() > 0), that the current class is not equal to owl:Thing. Listing 3.26 comprises the required code for method printHierarchy(OntClass klass, int level). This method prints out the class hierarchy from the passed class down.

Comparing the DEEP SEMANTICS implementation with the JENA2-based listings 3.25 and 3.26, the major difference is JENA2's lack of the useful convenient function listClassesByLevel. Additionally, implementing printHierarchy(OntClass klass, int level) is comparatively complex,

2 http://shootout.alioth.debian.org/

Table 3.1: Types of global property constraints that have to be considered by DEEP SEMANTICS

Type of Constraint – Comments and Corresponding RUBY Code generated by DEEP SEMANTICS

Functional Property
Comment: A functional property can have only one unique value y for each instance x that uses this property. If e.g. the functional property p() is used with instance x as subject of the statements, and there are instances y1 and y2 used in the statements p(x, y1) and p(x, y2), then y1 and y2 have to be equal or the ontology is inconsistent. To avoid such inconsistencies during runtime, DEEP SEMANTICS checks in the deep integration process if a property or one of its super-properties is functional and, if so, sets the temporary variable max to 1. Afterwards max is used for generating the particular setter method.
Meta-Programming Code: Please read the maximum cardinality description in Table 3.2 for details.

Inverse Functional Property
Comment: An inverse functional property is a property where the object of a property statement uniquely determines the subject's individual. If one states e.g. that p() is an inverse functional property, then this asserts that y can only be the value in a statement using p() for a single instance x. That means that if there would exist instances x1 and x2 and statements p(x1, y) and p(x2, y) under the given conditions, then either x1 and x2 have to be equal instances or the ontology is inconsistent.
Meta-Programming Code: Setter source code extension:

"def #{object_property_object.local_name}=(new_value)
   if new_value.kind_of?(Thing)
     if !new_value.used_already_as_value?("#{object_property_object.local_name}")
       ...
       new_value.used_already_as_value("#{object_property_object.local_name}")
       ...
     else
       puts '#{new_value.local_name} has already been used and cannot be used again with an inverse functional property!'
     end
   end
 end"

Deletion setter method source code extension:

"value.used_not_already_as_value(#{object_property_object.local_name})"

Symmetric Property
Comment: Object properties defined as "symmetric" cannot lead to inconsistency problems during the application of a functional ontology model produced by DEEP SEMANTICS.
Meta-Programming Code: Therefore there are no specific code fragments generated considering the symmetry of an object property.

Transitive Property
Comment: Like "symmetric" constraints, the "transitive" restriction cannot lead to inconsistencies during the application of a functional ontology model produced by DEEP SEMANTICS.
Meta-Programming Code: Therefore there are no specific code fragments generated considering the transitivity of an object property.

Table 3.2: Types of local property constraints that have to be considered by DEEP SEMANTICS

Type of Constraint – Comments and Corresponding RUBY Code generated by DEEP SEMANTICS

All-Values-From Restriction
Comment: An all-values-from restriction constrains the range of allowed values for a property to a particular ontology class or datatype.
Meta-Programming Code (shown only for object properties):

"def #{object_property_object.local_name}=(new_value)
   if new_value.kind_of?(Thing)
     if (new_value.types.include?(#{all_values_from_class1}) || new_value.class.super_classes.include?(#{all_values_from_class1})) && ...
       ...
     else
       puts '#{new_value.local_name} cannot be used as value of #{object_property_object.local_name}!'
     end
   end
 end"

Some-Values-From Restriction
Comment: Some-values-from restrictions on properties state that at least one value of the considered property is an instance of a given ontology class (for object properties) or datatype (for datatype properties).
Meta-Programming Code (shown only for object properties):

"def #{object_property_object.local_name}=(new_value)
   if new_value.kind_of?(Thing)
     if ... || (new_value.types.include?(#{some_value_class1}) || new_value.class.super_classes.include?(#{some_value_class1})) || ...
       ...
     else
       puts '#{new_value.local_name} cannot be used as value of #{object_property_object.local_name}!'
     end
   end
 end"

Minimum Cardinality Restriction
Comment: A minimum cardinality restriction has no effect on DEEP SEMANTICS generated functional models.
Meta-Programming Code: No meta-programming is needed.

Maximum Cardinality Restriction
Comment: A property having a maximum cardinality restriction of 1 can have only one unique value y for each instance x that uses this property. If a property is functional and/or has a maximum cardinality of 1, the helper max is also set to 1.
Meta-Programming Code (for max equals 1):

"def #{object_property_object.local_name}=(new_value)
   if new_value.kind_of?(Thing)
     if (@#{object_property_object.local_name}_values.size < #{max})
       @#{object_property_object.local_name}_values << new_value
     else
       @#{object_property_object.local_name}_values[0] = new_value
     end
   end
 end"

Cardinality Restriction
Comment: A cardinality restriction statement implies that the minimum and maximum cardinality values are equal.
Meta-Programming Code: Please read "maximum cardinality" above.

Table 3.3: Runtime and main memory complexity comparison between DEEP SEMANTICS and OWL API

List classes by level: IKEN ontology
  DEEP SEMANTICS – runtime list operation: 0.021 seconds; runtime complete: 47.080 seconds; peak main memory (average): 77 (23) megabyte
  OWL API – runtime list operation: 4.088 seconds; runtime complete: 9.271 seconds; peak main memory: 28 megabyte

List classes by level: BIO2ME ontology
  DEEP SEMANTICS – runtime list operation: 0.016 seconds; runtime complete: 15.391 seconds; peak main memory (average): 63 (21) megabyte
  OWL API – runtime list operation: 2.604 seconds; runtime complete: 3.480 seconds; peak main memory: 24 megabyte

Find all instances by their RDFS labels: IKEN ontology
  DEEP SEMANTICS – runtime find operation: 0.007 seconds; runtime complete: 47.066 seconds; peak main memory (average): 77 (23) megabyte
  OWL API – runtime find operation: 0.235 seconds; runtime complete: 4.930 seconds; peak main memory: 28 megabyte

Find all instances by their RDFS labels: BIO2ME ontology
  DEEP SEMANTICS – runtime find operation: 0.004 seconds; runtime complete: 15.372 seconds; peak main memory (average): 63 (21) megabyte
  OWL API – runtime find operation: 0.109 seconds; runtime complete: 0.962 seconds; peak main memory: 23 megabyte

as, for example, iterating named subclasses requires six lines of code (lines 23 to 28), while a functionally analogous implementation with DEEP SEMANTICS would require only three lines.
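Those three DEEP SEMANTICS lines are not printed in the text; assuming the direct_sub_classes method discussed above, they could look like this:

  # Sketch: DEEP SEMANTICS counterpart to JENA2's subclass loop (lines 23 to 28
  # of listing 3.26); direct_sub_classes already excludes B-Nodes.
  klass.direct_sub_classes.each do |sub_class|
    printHierarchy(sub_class, level + 1)
  end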

Listing 3.25: XPERIMENTR implementation using JENA2: Print out the class hierarchy of the ontology.
1  // Print out all of the classes which are referenced in the ontology by hierarchy level
2  System.out.println("Known classes:");
3  for (ExtendedIterator i = ontology.listNamedClasses(); i.hasNext(); ) {
4      OntClass klass = (OntClass) i.next();
5      if ( klass.isHierarchyRoot() && (klass.listSuperClasses(true).toSet().size() > 0) ) {
6          //System.out.println( klass.getURI() );
7          xperimentr.printHierarchy(klass, 0);
8      }
9  }

Listing 3.26: XPERIMENTR implementation using JENA2: printHierarchy() methods.
1  /**
2   * Print the class hierarchy from this class down, assuming this class is at
3   * the given level. Makes no attempt to deal sensibly with multiple
4   * inheritance.
5   */
6  public void printHierarchy(OntClass klass, int level) {
7
8      System.out.println("Level: "+level);
9      System.out.println(" "+klass.getLocalName());
10
11     /* Find this classes instances */
12     System.out.println(" This classes direct instances:");
13     for (ExtendedIterator i = klass.listInstances(true); i.hasNext(); ) {
14         Individual instance = (Individual) i.next();
15         System.out.println(" "+instance.getLocalName());
16     }
17
18     System.out.println();
19     System.out.println();
20     System.out.println("------");
21
22     /* Find the children and recurse */
23     for (ExtendedIterator i = klass.listSubClasses(true); i.hasNext(); ) {
24         OntClass subClass = (OntClass) i.next();
25         if ( subClass.isURIResource() && (subClass.listSuperClasses(true).toSet().size() > 0) && (subClass.listSubClasses(true).toSet().size() > 0) ) {
26             printHierarchy(subClass, level + 1);
27         }
28     }
29 }

The DEEP SEMANTICS source code of listing 3.18 on page 77 is comparable in its functionality to the JENA2 code of listing 3.27. While listing 3.18 comprises eight lines of code, the JENA2 version requires 13 lines. Comparing both listings reveals one of the most significant differing features between the deep integration approach of DEEP SEMANTICS and the JAVA-based frameworks: the programmatic handling of ontology classes (for example line six in listing 3.18 in contrast to lines seven to nine in listing 3.27).

Listing 3.27: XPERIMENTR implementation using JENA2: source code for the retrieval of protocols that can be executed in a specified duration.
1  String time = console.readLine();
2  Integer timeInteger = Integer.valueOf( time ).intValue();
3
4  System.out.println();
5  System.out.println("Protocols that can be executed in "+time+" minutes:");
6
7  OntClass protocolKlass = ontology.getOntClass("http://www.ontoverse.org/ontologies/2008/11/xperimentr.owl#Protocol");
8  for (ExtendedIterator i = protocolKlass.listInstances(); i.hasNext(); ) {
9      Individual protocol = (Individual) i.next();
10     if ( protocol.getPropertyValue( ontology.getDatatypeProperty("http://www.ontoverse.org/ontologies/2008/11/xperimentr.owl#hasRequiredExecutionTime") ) != null ) {
11         Literal executionTime = (Literal) protocol.getPropertyValue( ontology.getDatatypeProperty("http://www.ontoverse.org/ontologies/2008/11/xperimentr.owl#hasRequiredExecutionTime"));
12         Integer executionTimeInteger = Integer.valueOf( (String) executionTime.getValue() ).intValue();
13         if (executionTimeInteger <= timeInteger) {
14             System.out.println( protocol.getLabel(null)+" can be prepared in "+executionTimeInteger+" minutes.");
15         }
16     }
17 }

Runtime and Memory Complexity Comparison

Runtime and memory complexity comparisons between DEEP SEMANTICS and JENA2 were carried out for both tests and both ontologies. Table 3.4 shows the corresponding results. As JENA2 is developed in JAVA, the total runtimes are significantly shorter for this framework, approximately 5 to 10 times. However, computing the operations alone is around 100 to 600 times faster with DEEP SEMANTICS.

The peak main memory requirements of JENA2 are higher than those of OWL API (around 43 megabytes in contrast to approximately 26 megabytes), but still less than DEEP SEMANTICS's 77 megabytes (both IKEN ontology application tests) and 63 megabytes (both BIO2ME ontology application tests). In contrast, the average main memory usage of DEEP SEMANTICS is less than JENA2's peak memory consumption.

3.8.3 DEEP SEMANTICS versus ACTIVERDF

The two frameworks compared in this subsection (DEEP SEMANTICS and ACTIVERDF) are both based on RUBY and its metaprogramming features. While ACTIVERDF focuses on RDF, DEEP SEMANTICS additionally supports OWL language constructs (currently in particular OWL LITE). Most important, ACTIVERDF is not consistency safe for RDFS and OWL.

Programming Complexity Comparison

The determination of the ontology's class and property hierarchies is not practicable in ACTIVERDF, because no subclass (sub-property) to superclass (super-property) relations are stored. Hence, an output of the class hierarchy by level could not be implemented.

Listing 3.17 (DEEP SEMANTICS code) on page 76 can be functionally and programmatically compared to listing 3.28 (ACTIVERDF code). Measuring programming complexity by the number of code lines needed to implement equivalent functionality, 33 lines of functional code are

Table 3.4: Runtime and main memory complexity comparison between DEEP SEMANTICS and JENA2

List classes by level: IKEN ontology
  DEEP SEMANTICS – runtime list operation: 0.021 seconds; runtime complete: 47.080 seconds; peak main memory (average): 77 (23) megabyte
  JENA2 – runtime list operation: 2.661 seconds; runtime complete: 4.296 seconds; peak main memory: 44 megabyte

List classes by level: BIO2ME ontology
  DEEP SEMANTICS – runtime list operation: 0.016 seconds; runtime complete: 15.391 seconds; peak main memory (average): 63 (21) megabyte
  JENA2 – runtime list operation: 1.876 seconds; runtime complete: 3.754 seconds; peak main memory: 43 megabyte

Find all instances by their RDFS labels: IKEN ontology
  DEEP SEMANTICS – runtime find operation: 0.007 seconds; runtime complete: 47.066 seconds; peak main memory (average): 77 (23) megabyte
  JENA2 – runtime find operation: 2.425 seconds; runtime complete: 4.176 seconds; peak main memory: 41 megabyte

Find all instances by their RDFS labels: BIO2ME ontology
  DEEP SEMANTICS – runtime find operation: 0.004 seconds; runtime complete: 15.372 seconds; peak main memory (average): 63 (21) megabyte
  JENA2 – runtime find operation: 2.639 seconds; runtime complete: 4.519 seconds; peak main memory: 43 megabyte

required using DEEP SEMANTICS, in contrast to 58 lines when using ACTIVERDF. This difference is a consequence of method find_instances_by_label (line twelve of listing 3.17), which is provided by DEEP SEMANTICS. In ACTIVERDF a similar functionality has to be implemented additionally, as shown in listing 3.28 (lines 13 to 29).

Listing 3.28: XPERIMENTR implementation using ACTIVERDF: retrieving protocols for a set of entered laboratory materials.
1  materials = Array.new
2  found_protocols = XPERIMENTR::Protocol.find_all
3
4  puts "Please enter the next laboratory material or \"end\" to stop:"
5
6  while material_input = gets
7    case material_input
8    when "end\n"
9      break
10   else
11     material_input = material_input.chop
12
13     material_in_ontology = false
14
15     XPERIMENTR::LaboratoryMaterial.find_all.each do |material|
16       if material.label.class != Array
17         if material.label.eql?(material_input)
18           material_in_ontology = true
19           break
20         end
21       else
22         material.label.each do |label|
23           if label.eql?(material_input)
24             material_in_ontology = true
25             break
26           end
27         end
28       end
29     end
30
31     if material_in_ontology
32       materials << material_input
33       new_found_protocols = Array.new
34       found_protocols.each do |found_protocol|
35         if found_protocol.hasRequiredMaterial
36           found_protocol.hasRequiredMaterial.each do |required_material|
37             if required_material.label.class != Array
38               if required_material.label.eql?(material_input)
39                 new_found_protocols << found_protocol
40               end
41             else
42               required_material.label.each do |label|
43                 if label.eql?(material_input)
44                   new_found_protocols << found_protocol
45                   break
46                 end
47               end
48             end
49           end
50         end
51       end
52
53       found_protocols = new_found_protocols
54
55       puts "You have entered the following materials:"
56       materials.each do |material|
57         puts material
58       end
59
60       puts "The following protocols include these materials:"
61       found_protocols.each do |found_protocol|
62         puts found_protocol.label
63       end
64
65     else
66       puts "#{material_input} is not a known laboratory material."
67     end
68     puts "Please enter the next material you have or \"end\" to stop:"
69   end
70 end

Considerably more important than the absence of certain convenient functions is the fact that ACTIVERDF is not consistency safe. Listing 3.29 shows a short example of how the usage of ACTIVERDF can produce inconsistencies. This example is based on the XPERIMENTR ontology (see Section 6.1 in the appendix) and involves the protocol SDS-Page as well as this instance's properties relatedExpert (an object property) and hasRequiredExecutionTime (a datatype property). Of primary importance are lines seven and eight, in which the

corresponding property value assignments are stated. The OWL property range of relatedExpert is the class Person. Therefore, the assignment in line seven should not be allowed. Likewise, the range of hasRequiredExecutionTime is the data type String, which in turn causes line eight to produce an inconsistency.

However, executing listing 3.29 produces the following output:

> sds_page.relatedExpert [, , . . .
> sds_page.hasRequiredExecutionTime 333
>
> sds_page.relatedExpert [, , . . . ,
> sds_page.hasRequiredExecutionTime

As one can see, in ACTIVERDF the ontology instance SDSPAGE is assigned as value to both the object property relatedExpert and the datatype property hasRequiredExecutionTime.

Reasoning on this ACTIVERDF-modified version of the ontology would now generate an error message, as instance SDSPAGE is inconsistent. Using DEEP SEMANTICS, such kinds of modifications are not allowed if they would result in logical inconsistencies. Instead, DEEP SEMANTICS provides warnings stating that the intended operations cannot be conducted, in order to avoid inconsistencies. The statement sds_page.relatedExpert << sds_page in line seven, for example, would produce the warning "An instance of 'Protocol' cannot be added as value to object property 'relatedExpert' because: the global range of 'relatedExpert' is 'Person'!".
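For contrast with listing 3.29 below, the same two assignments against a DEEP SEMANTICS deep-integrated model are sketched here; the lookup call reuses the xperimentr module of the earlier listings, and the rejection behavior follows the warning quoted above.

  # Sketch: the problematic assignments of listing 3.29, repeated against a
  # consistency safe DEEP SEMANTICS model.
  sds_page = xperimentr::Protocol.find_instances_by_label("SDS-PAGE")[0]

  # Rejected by the generated range check; a warning is printed and the
  # property values stay unchanged:
  sds_page.relatedExpert << sds_page

  # Likewise rejected, as the range of hasRequiredExecutionTime is the data
  # type String, not Protocol:
  sds_page.hasRequiredExecutionTime = sds_page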

Listing 3.29: Consistency problems using setter in ACTIVERDF.
1  XPERIMENTR::Protocol.find_all.each do |protocol|
2    if protocol.localname.eql?("SDSPAGE")
3      sds_page = protocol
4      puts "sds_page.relatedExpert "+sds_page.relatedExpert
5      puts "sds_page.hasRequiredExecutionTime "+sds_page.hasRequiredExecutionTime
6
7      sds_page.relatedExpert << sds_page
8      sds_page.hasRequiredExecutionTime = sds_page
9
10     puts
11     puts "sds_page.relatedExpert "+sds_page.relatedExpert
12     puts "sds_page.hasRequiredExecutionTime "+sds_page.hasRequiredExecutionTime
13   end
14 end

Table 3.5: Runtime and main memory complexity comparison between DEEP SEMANTICS and ACTIVERDF

Find all instances by their RDFS labels: IKEN ontology
  DEEP SEMANTICS – runtime find operation: 0.007 seconds; runtime complete: 47.066 seconds; peak main memory (average): 77 (23) megabyte
  ACTIVERDF – runtime find operation: 60.849 seconds; runtime complete: 66.770 seconds; peak main memory: 103 megabyte

Find all instances by their RDFS labels: BIO2ME ontology
  DEEP SEMANTICS – runtime find operation: 0.004 seconds; runtime complete: 15.372 seconds; peak main memory (average): 63 (21) megabyte
  ACTIVERDF – runtime find operation: 33.630 seconds; runtime complete: 34.734 seconds; peak main memory: 31 megabyte

Runtime and Main Memory Complexity Comparison

As mentioned above, the determination of the ontology's class hierarchy is not possible using ACTIVERDF. Therefore, the test "List classes by level" could not be implemented for this framework. Table 3.5 shows the results of the runtime and memory complexity comparison between DEEP SEMANTICS and ACTIVERDF for the remaining test.

Runtime differences for the complete test runs are considerably smaller: according to Table 3.5, the complete DEEP SEMANTICS runs are approximately 1.4 to 2.3 times faster. However, computing the operation alone is around 1,000 to 8,000 times faster with DEEP SEMANTICS.

The peak main memory usage is higher in ACTIVERDF for the IKEN ontology (around 77 megabytes for DEEP SEMANTICS in contrast to approximately 103 megabytes) and lower for the smaller BIO2ME ontology (approximately 63 megabytes for DEEP SEMANTICS in contrast to 31 megabytes). The average main memory usage of DEEP SEMANTICS is in both cases less than ACTIVERDF's peak main memory requirement.

3.9 Discussion

The decision to implement DEEP SEMANTICS in the dynamic programming language RUBY has proved to be effective for the deep integration of OWL ontologies. All aspects and constructs, respectively, of an OWL LITE ontology have been successfully converted into deep-integrated RUBY counterparts. The presented version of DEEP SEMANTICS has the following deep integration features:

1. Conversion of named OWL classes into RUBY classes.

2. Conversion of datatype and object properties into:

(a) RUBY objects of type DatatypeProperty and ObjectProperty, respectively, and
(b) getter and setter instance methods of the deep-integrated ontology classes.

3. Consideration of global and local property restrictions:

(a) The local cardinality restrictions, as well as the global restriction functional, are deep integrated into the corresponding setter instance method implementations. These setters check if the maximum amount of allowed values is already used for the referred property.
(b) The globally acting domain constraint causes the corresponding property to be integrated only into those ontology classes that have been stated in the domain assertion.
(c) The globally acting range constraint and the local allValuesFrom restriction are deep integrated into the corresponding setter instance method implementations as type checks. The type checks lead to the setter's behavior that only those passed values are accepted that belong to the stated ontology classes or data types.
(d) Locally defined someValuesFrom restrictions extend the set of allowed value types.

4. Conversion of ontology individuals into RUBY instances of the converted ontology classes.

These integration features of DEEP SEMANTICS, and in particular those related to global or local property restrictions, make the framework the first consistency safe one for OWL LITE ontologies. Additionally, DEEP SEMANTICS even prevents the generation of inconsistencies whose potential occurrence is hidden in the ontology's structure. One example is the declaration of a local restriction concerning a property P in the context of a class C1. If now C1 is not part of the set of classes stated as domain for P, and if C1 is disjoint with at least one domain class C2, then using P for an instance i of C1 would generate an inconsistency because of the following inference rules (?i, ?j are variables over OWL individuals; C1 and C2 are OWL classes; P represents an object property):

1. type(?i, C1) AND P(?i, ?j) AND domain(P, C2)
   ⇒ type(?i, C2)

2. type(?i, C1) AND type(?i, C2) AND disjointWith(C1, C2)
   ⇒ "The ontology is inconsistent as individual ?i cannot be of type C1 and its disjoint class C2."

DEEP SEMANTICS detects such potential causes of inconsistencies during the deep integration process and makes sure that these inconsistencies cannot be generated using deep-integrated ontology models. In the stated example, object property P would not be integrated as an instance method into the source code of C1.

Another important topic to discuss is DEEP SEMANTICS's current focus on OWL LITE and its resulting lack of OWL DL support. While OWL LITE is the least expressive sublanguage of OWL, it is still expressive enough to capture – using syntactical tricks – all of OWL DL except descriptions containing either individuals (for example hasValue) or cardinalities greater

than 1 (Bechhofer et al., 2004; Horrocks et al., 2003a). From this it follows that the exclusive support for OWL LITE is not critical for practical implementations using DEEP SEMANTICS. However, for the development of semantic applications that utilize existing OWL DL ontologies, a conversion of these into OWL LITE is at least impractical and in the worst case not possible (for example if cardinalities higher than 1 are required).

To sum up, DEEP SEMANTICS's support for OWL LITE already makes it useful for a wide range of possible Semantic Web projects – even ones with higher semantic complexity requirements – while the development of semantic applications based on existing OWL DL ontologies requires an extension of DEEP SEMANTICS towards description logic support (this issue is further discussed in the conclusion in Subsection 3.9.4 below).

3.9.1 Programming Complexity

Section 3.8 contains comparisons between DEEP SEMANTICS and three other Semantic Web frameworks – OWL API, JENA2 and ACTIVERDF. Although there is no applicable formal definition of programming complexity as a basis for these comparisons, the example implementations used show that DEEP SEMANTICS requires significantly fewer lines of code for the same functionality than the other frameworks.

While RUBY is described as an efficient way to achieve solutions with smaller amounts of code than in most other programming languages (Geer, 2006), part of DEEP SEMANTICS's source code efficiency is due to its system design's emphasis on programmer, rather than computer, needs. Instead of solely focussing on runtime and main memory requirements, I designed DEEP SEMANTICS with a focus on usability concerning the requirements of semantic application developers. Related features of DEEP SEMANTICS are the provided convenient methods like find_instances_by_label (line 12 of listing 3.17, page 76) and find_instance_by_local_name(local_name).
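The text does not print find_instance_by_local_name; by analogy with listing 3.24, it could be realized as follows (an illustrative sketch, not the framework's actual source):

  # Sketch, analogous to listing 3.24: return the single instance of this class
  # whose local name matches, or nil if none does.
  def self.find_instance_by_local_name(local_name)
    self.instances.each do |instance|
      return instance if instance.local_name == local_name
    end
    nil
  end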

3.9.2 Runtime and Main Memory Complexity

While DEEP SEMANTICS is very efficient with regard to its source code complexity, the required initial deep integration process makes it the longest running framework compared in Section 3.8 with respect to complete runtimes against the JAVA-based frameworks. In the comparisons with OWL API and JENA2 – for which the results are shown in Table 3.3 and Table 3.4 – the complete runtime is always higher for DEEP SEMANTICS, approximately 5 to 10 times longer; only against ACTIVERDF (Table 3.5) are even the complete runtimes shorter for DEEP SEMANTICS. Although these runtime comparisons suggest that DEEP SEMANTICS is slow in general, a closer look reveals that it was actually faster than all other frameworks when considering the tested operations only. In the most extreme case, test 2 – find all instances matching one of ten given search terms – runs 8,000 times faster with DEEP SEMANTICS than with ACTIVERDF. The operational slowness of ACTIVERDF can easily be explained by its so-called lazy loading of the underlying ontology data from the used triple store. Lazy loading means that the required ontology data is fetched from the triple store just in time, when a search query on these triples is carried out. In contrast, explaining why DEEP SEMANTICS is faster than the JAVA-based frameworks JENA2 and OWL API is more difficult. A straightforward explanation is the computing overhead that results from the more abstract architecture of these frameworks. In other words: the computational cost of the initial deep integration results in a speed gain when using the integrated ontology model, compared to processing the ontology data through an abstract model as in JENA2 and OWL API.

Considering DEEP SEMANTICS's memory complexity, an interesting point is the difference between the average and peak memory usage. For the IKEN ontology the difference is 54 megabyte (77 megabyte peak and 23 megabyte average memory consumption) and for the BIO2ME ontology 42 megabyte (63 megabyte peak and 21 megabyte average memory usage). The peak memory demand is related to the deep integration process, while the resulting functional ontology model requires significantly less memory – about the amount of the average memory consumption.

3.9.3 Handling Multiple Ontologies in DEEP SEMANTICS

DEEP SEMANTICS has fundamental support for usage with multiple ontologies. However, to realize this support it requires appropriate reasoners during pre-processing. Principal problems of current ontology tools with respect to ontology import and referencing have been discussed in the literature (Liebig et al., 2005). Although the vision of the Semantic Web heavily builds on sharing and re-using ontologies, both corresponding functionalities provided by OWL – ontology import and referencing – are not perceived as appropriate for this task. While importing requires that all transitively imported ontologies are also included in reasoning, referencing of ontologies does not come with any semantic implications whatsoever. In consequence, the import process can easily lead to computation and memory complexities during reasoning that render the application of the import functionality impractical, while referencing does not change the ontology semantically (Grau et al., 2004). From this it follows that the application of multiple ontologies currently depends primarily on the used reasoner. However, future enhancements of OWL's multiple ontology support will certainly have to be considered in order to enable DEEP SEMANTICS to become a key technology of the Semantic Web. Furthermore, a straightforward pro-active solution could be the extension of the DEEP SEMANTICS framework towards an integrated network of DEEP SEMANTICS modules, each covering exactly one ontology. Interconnections between these ontologies – for example mutual concept equivalence across ontology boundaries – could then be used to retrieve additional information stored in another ontology and DEEP SEMANTICS module, respectively, whenever needed.

3.9.4 Conclusions and Outlook

DEEP SEMANTICS has proved to be efficient with respect to its programming complexity and operational runtime performance. Its architecture delivers an open approach that supports further extensions in future versions. One such extension will be the implementation of a deep integrating model builder supporting OWL DL, in order to assist developers implementing applications that make use of description logic ontologies. Other useful extensions could be additional adapters, for example to access ontologies stored in specialized triple stores like SESAME (Broekstra et al., 2002) and BOCA (Feigenbaum et al., 2007). As described in Section 3.8 on page 81, the current deep integration process is computationally complex. Starting points for improvements are:

• Optimization of the used triple parsing algorithm, for example by first sorting the set of input triples into TBox- and ABox-related triples and only then starting the parsing process (see the sketch after this list).

• A straightforward approach would be decoupling the deep integration process from the actual application implementation. In particular the TBox – as long as it is not altered – can be pre-processed (that is, deep integrated in advance) and included into the application when needed.
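A minimal Ruby sketch of the first suggestion could look as follows; the predicate list is deliberately abridged and the triple representation is an assumption. The closing comment points at the second suggestion.

    # TBox-related predicates (abridged; a complete list would also cover
    # the remaining RDFS and OWL schema vocabulary).
    TBOX_PREDICATES = %w[
      http://www.w3.org/2000/01/rdf-schema#subClassOf
      http://www.w3.org/2000/01/rdf-schema#subPropertyOf
      http://www.w3.org/2000/01/rdf-schema#domain
      http://www.w3.org/2000/01/rdf-schema#range
    ].freeze

    # triples is assumed to be an array of [subject, predicate, object]
    # arrays; returns the TBox-related and ABox-related subsets.
    def partition_triples(triples)
      triples.partition { |_s, p, _o| TBOX_PREDICATES.include?(p) }
    end

    # tbox_triples, abox_triples = partition_triples(all_triples)
    # The TBox part rarely changes, so it could be deep integrated once and
    # the result reused across application starts (the second suggestion).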

The first recommendation of OWL is already more than four years old. At the time of this writing, initiatives for the definition of OWL 2 (Grau et al., 2008) are underway. In order to support DEEP SEMANTICS’s sustainability, new model builder implementations for OWL 2 should be provided in future versions. Regarding sustainability, I currently intend to continue DEEP SEMANTICS’s further development as an Open Source project.

4 DEEP SEMANTICS in Action: IKEN and the BIO2ME

This chapter covers two reference applications based on DEEP SEMANTICS. The first one, IKEN, provides a novel type of semantics-enabled web-interface developed during this doctoral thesis. Figure 4.1 shows a screenshot of two snippets of this web-interface. The second semantic application using DEEP SEMANTICS for ontology processing is the BIO2ME Information System developed by Mainz (2008). This application is described in Section 4.2 on page 120.

4.1 IKEN

The project IKEN1 is an applied research project lead-managed by VARION GmbH, a Germany-based communication design agency. My area of responsibility was the design and implementation of the prototype application as well as of the supporting ontology. A German description, including a screencast showing the prototype of the IKEN system, can be found on the project’s website.

At the time of this writing, IKEN is a prototype platform for image retrieval enhanced by semantic annotations. IKEN is a proof of concept of how ontologies can support annotation and search in a photo collection. Recently, folksonomies have established themselves as a popular means for indexing large document collections, with FLICKR2 being the most popular example of social tagging in photo collections. Yet, user-generated annotations cause problems regarding inconsistent vocabularies, varieties of synonyms, spelling variants, misspellings and language variants. In the long run, the user-centric approach of social tagging will have to be combined with the richer semantics of ontologies.

1 http://www.i-ken.de
2 http://www.flickr.com/

In summary, the main aims of IKEN are:

• to provide a usable application which makes the benefits of ontologies visible for internet users;

• to enable easily usable functionalities for ontology-based photo annotations;

• to allow browsing a document collection based on domain semantics.

IKEN’s novel type of semantics-enabled web-interface has the following distinguishing features (the numbers relate to the numbering used in Figure 4.1):

1. Terms and their related ontology instances are visualized in an innovative and user-friendly style: ontology instances are represented as blue rectangles and relations between two instances as larger and more grayish blue rectangles. The small green rectangles at the right side of an ontology instance, named "Verfeinern" (Refine), are buttons that open a refinement interface for the respective instance – the refinement interaction is described in the next bullet point. Below the small green button there is also an orange one. This button, named "Details", invokes a pop-up window showing details about the referenced instance, utilizing the related knowledge from the ontology.

2. IKEN’s semantic interface offers a first-of-its-kind context-sensitive annotation and search refinement interface based on the semantics modeled in the IKEN ontology. If the user, for example, clicks on "Verfeinern" (Refine) for the search term "Frau" (Woman), she will see the interface shown in Figure 4.1. Based on the semantics about "Frau" (Woman) modeled in the IKEN ontology, the application offers a list of refinement choices. The user can then choose, for example, the hair color or the age of the woman. Besides offering helpful support during search, this interface also optimizes annotation tasks as it helps the annotator to identify relevant term refinements.

3. The "Details" interface displays information about the search term "T-Shirt". IKEN uses the modeled semantic context to generate a view containing term specific information. For the term "T-Shirt" these are information about synonyms and is-a relations ("T-Shirt is an outerwear. T-Shirt is an apparel.").

The IKEN system is designed to enable users who are not experts in knowledge engineering and ontology modeling to apply semantic annotations and to use semantic information retrieval in image collections. As a first use case it focuses on applications in the tourism sector. This mainly affects the choice of the images included in the database and the ontology in use; the underlying principles can be applied to other and less-specified domains as well.

4.1.1 The IKEN Ontology

The IKEN ontology is modeled in OWL LITE. It was developed from scratch to exactly fit the new system. The ontology comprises three main types of concepts: a) geographical concepts

Figure 4.1: Screenshot of the novel semantics-enabled web-interface of the IKEN application. As IKEN is completely in German, the interface snippets in this screenshot are in German as well. 1) The smaller rectangles highlighted in blue represent ontology instances and the bigger ones represent relations between two instances.

like "Country", "City" and "District" as well as "Street", "Building" and "Park"; b) concepts that directly relate to objects visible in the pictures like "Building", "Vehicle", "Clothing" or "Animal"; c) others are used to capture the sets of individuals that can be used as values of property axioms that themselves define attributes of the visual objects – examples are: "Colour" and "FacialExpression".

The annotations of a photo are stored using instances of the concept "Photo" (see part 2 of Figure 4.2 for an example). For each photo one instance of "Photo" is created. Annotations belonging to a photo are asserted using the object property "hatAnnotation" (has annotation) with its sub-properties "hatGeographischerRaumAnnotation" (has geographical area annotation), "hatEmotionAnnotation" (has emotion annotation) and "hatTourismuskategorieAnnotation" (has tourism category annotation), which all have "Photo" as their domain. The property "hatGeographischerRaumAnnotation" (has geographical area annotation) is functional and has the concept "GeographischerRaum" (Geographic Area) as its range. It is used to unambiguously identify where a certain photo has been taken. With "hatEmotionAnnotation" (has emotion annotation), general emotional features of a photo – like "Friede" (Peace) or "Einsamkeit" (Loneliness) – can be added to the description. Likewise "hatTourismuskategorieAnnotation"

(has tourism category annotation) is used to assign the photo to certain related categories like "Natur" (Nature), "Sport" (Sports) or "Erholung" (Recreation).
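As a hypothetical illustration of this annotation model, the following Ruby sketch shows how such statements might be asserted against the deep-integrated IKEN ontology. Only the ontology terms are taken from the ontology itself; the constructor and accessor forms are assumptions about the generated API.

    # Hypothetical sketch: one "Photo" instance per photo, annotations via
    # the properties described above.
    photo = Photo.create("Photo_1")
    city  = Stadt.find_instance_by_label("Duesseldorf").first

    photo.hatGeographischerRaumAnnotation = city   # functional property
    photo.hatAnnotation << Frau.create("DefaultFrau2008720171423")
    photo.hatTag << "Barack Obama"                 # plain, non-semantic tag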

Metrics of the IKEN ontology (25th of July): 360 concepts/classes, 70 object properties, 6 datatype properties, 222 initial instances, and additionally 8 SWRL rules.

The Class Hierarchy

This is a list of the top-level classes of the IKEN ontology:

• Alter (Age): This concept is used to enable the annotation of the age of people visible in a photo.

• Bauteil (Structural Element): Subclasses of "Bauteil" (Structural Element) are for ex- ample "Wand" (Wall), "Tuer" (Door) and "Treppe" (Staircase). This class is used for the specification of buildings.

• Bauwerk (Building): This concept is used to support the annotation of a "Bauwerk" (Building) that can be seen in a photo. Subclasses are for example "Burg" (Castle), "Turm" (Tower) and "Wohnhaus" (Apartment Building).

• Bewoelkung (Cloudiness): "Bewoelkung" (Cloudiness) is used to enable the annotation of the cloudiness of the visible sky in a photo.

• Emotion: Subclasses of "Emotion" are "NegativeEmotion", "PositiveEmotion" and "NeutraleEmotion" (Neutral Emotion). This concept is used to model the subjective impression a photo has on the person that annotates it.

• Fahrzeug (Vehicle): The concept "Fahrzeug" (Vehicle) is the root class for all kinds of vehicles that can occur in photos related to the tourism domain of IKEN, for example "Auto" (Car), "Bus", "Strassenbahn" (Tram) and "Fahrrad" (Bicycle).

• Farbe (Colour): The declaration of colours of objects that can be seen in photos is very important. Therefore the class "Farbe" (Colour) is used as the root class for more spe- cialized concepts like "Kleidungsfarbe" (Colour of Clothing), "Hautfarbe" (Skin Colour), and "Haarfarbe" (Hair Colour).

• Fortbewegungsart (Type of Movement): This concept currently comprises four instances: "gehen" (walking), "joggend" (jogging), "laufend" (running) and "rennend" (sprinting). It is used as range for the property "bewegtSichFort" (moves).

• Frisur (Hairstyle): This concept currently comprises seven instances: "Glatze" (Bald Head), "Kurzhaarschnitt" (Short Haircut), "Langhaarschnitt" (Long Haircut), "Locken" (Curls), "Pferdeschwanz" (Ponytail), "Pony" (Fringe) and "Zopf" (Pigtail). It is used as range for the property "hatFrisur" (has hairstyle).

• Gegenstand (Artefact): The concept "Gegenstand" (Artefact) comprises all kinds of artificial objects potentially visible in photos that are not part of "Bauteil" (Structural Element), "Bauwerk" (Building), "Fahrzeug" (Vehicle), "Kleidung" (Clothing) or "Moebel" (Piece of Furniture). In particular, it is the superclass of "InDerHandHaltbarerGegenstand" (In the Hand holdable Artefact) and "SchiebbarerGegenstand" (Pushable Artefact), which are important classes for the detailed description of people (e.g. a man holding an umbrella). Examples for "InDerHandHaltbarerGegenstand" (In the Hand holdable Artefact) are "Buch" (Book), "Regenschirm" (Umbrella) and "Glas" (Glass). Examples for "SchiebbarerGegenstand" (Pushable Artefact) are "Kinderwagen" (Perambulator), "Einkaufswagen" (Shopping Trolley) and "Gepaeckwagen" (Baggage Cart).

• GeographischerRaum (Geographic Area): This concept is used as a root for all geographical annotations that can be used to identify the location where a photo has been taken, excluding buildings, which are modeled separately. Subclasses of "GeographischerRaum" (Geographic Area) are for example "Stadt" (Town), "Stadtteil" (District) and "Strasse" (Street). For the annotation process only initial instances of these classes are allowed. The initial instances for "Stadt" (Town) in the current proof-of-concept ontology version are for example Duesseldorf and Krefeld (other city names have to be added in order to be applicable for annotation). Terms like an unspecified "Gruenflaeche" (Green Area) are not included in the definition of "GeographischerRaum" (Geographic Area).

• Gesichtsausdruck (Facial Expression): This concept currently comprises six instances: "Aengstlich" (Anxious), "Grinsen" (Grinning), "Lachen" (Laughing), "Laecheln" (Smiling), "Weinen" (Crying) and "Wuetend" (Angry). It is used as range for the property "hatGesichtsausdruck" (has facial expression).

• Gruenflaeche (Green Area): This concept is used as a root for all green area annotations that cannot be used to identify the location where a photo has been taken – the subclasses are "Rasen" (Lawn), "Weide" (Grazing Land) and "Wiese" (Grassland).

• Himmel (Sky): This concept is used to enable the annotation of the sky visible in a photo. Annotation instances of this concept can be used as subject for "hatBewoelkung" (has cloudiness) statements.

• Kleidung (Clothing): Subclasses of "Kleidung" (Clothing) are for example "Hose" (Trousers), "Mantel" (Coat) and "Herrenanzug" (Gentleman Suit). This class is used for the annotation of the clothing of people visible in a photo.

• Lebewesen (Creature): The concept "Lebewesen" (Creature) comprises all kinds of creatures potentially visible in photos, for example "Mensch" (Human), "Pflanze" (Plant) and the more specific "Hund" (Dog).

• Moebel (Piece of Furniture): "Moebel" (Piece of Furniture) is used to enable the anno- tation of furniture visible in a photo.

• Monat (Month): This concept is used to define the month in which an event takes place – "Monat" (Month) is the range of "findetStattImMonat" (takes place in month).

• Photo: Instances of the concept "Photo" are generated whenever a photo is annotated. Annotation statements are then related to these instances using for example the properties "hatAnnotation" (has annotation), "hatTag" (has tag) and "createdAt".

• Sehenswuerdigkeit (Tourist Feature): This class is used for the specification of points of interest in a geographical area. Subclasses of "Sehenswuerdigkeit" (Tourist Feature) are for example "Baudenkmal" (Monument), "BotanischerGarten" (Botanical Garden) and "Museum". For the annotation process only initial instances of these classes are allowed. The initial instances for "Baudenkmal" (Monument) in the current proof-of-concept ontology version are for example "KaiserWilhelmStandbild" (a statue), "ReiterstandbildJanWellems" (an equestrian sculpture) and "SeidenweberDenkmal" (a statue).

• Sitzgelegenheit (Seating): "Sitzgelegenheit" (Seating) is used to model all kinds of in- stances that can be used as values for the "sitztAuf" (sits on) relation (for example "Woman sits on a picnic blanket.").

• Spiel (Game): "Spiel" (Game) covers all kinds of games that can be played by hu- mans like: "Ballspiel" (Ball Game), "Brettspiel" (Board Game) and "Kartenspiel" (Card Game).

• Tourismuskategorie (Tourism Category): This concept currently comprises six instances: "Erholung" (Recreation), "Freizeit" (Leisure Time), "Gastronomie" (Gastronomy), "Kulturtourismus" (Cultural Tourism), "Natur" (Nature) and "Sport". It is used to sort the annotated photos into certain categories related to the tourism domain.

• Unternehmen (Company): This is the superclass for all kinds of companies and business-related terms that can occur as part of the semantic annotations. For the proof-of-concept data set, only two instances of "Unternehmen" (Company) have been used: the restaurant "StadtwaldhausKrefeld" and its beer garden "BiergartenStadtwaldhausKrefeld".

• Veranstaltung (Event): This class is used for the specification of special events in a geographical area. Subclasses of "Veranstaltung" (Event) are for example "Kirmes" (Fun Fair), "Messe" (Trade Fair) and "Weihnachtsmarkt" (Christmas Market). For the annotation process only initial instances of these classes are allowed. The initial instances for "Weihnachtsmarkt" (Christmas Market) in the current proof-of-concept ontology version are for example "WeihnachtsmarktDuesseldorf" (the Christmas market in Düsseldorf) and "WeihnachtsmarktKrefeld" (the Christmas market in Krefeld).

The Annotation Properties

During the implementation process of the IKEN prototype, new requirements were encountered concerning the annotation capabilities of the IKEN ontology. This led to the definition of four new custom annotation properties:

Figure 4.2: Extract of the IKEN class hierarchy and a sample photo annotation. 1) A screenshot of a part of the IKEN class hierarchy in PROTÉGÉ. 2) Screenshot of the PROTÉGÉ individuals view of a photo instance. It becomes apparent here that the corresponding photo is for example annotated with the instances "DefaultHund2008720171429" (a dog) and "DefaultFrau2008720171423" (a woman).

• primaryLabel: The property "primaryLabel" is an extension of the built-in rdfs:label annotation property. It is used to store the label of a concept, instance or property that should be used by default in the graphical user interface.

• isInitialIndividual: This annotation property is used on individuals only, with value TRUE, to indicate that the individual was part of the initial ontology model as defined by the ontology designer and can be used for the annotation of multiple photos.

• isUsableForSpecificationGUI: "isUsableForSpecificationGUI" is applied for the annotation of object and data type properties. The value of this annotation property can either be TRUE or FALSE. If the value is TRUE then the annotated property can be utilised to further specify an already added annotation term. In Figure 4.6, for example, one can see only those specification options in the refine view for which the corresponding properties have the "isUsableForSpecificationGUI" value TRUE.

• hasSearchSpecificationMode: If "isUsableForSpecificationGUI" has the value TRUE in the annotation of a given object or data type property, then the type or mode of the specification has to be indicated with "hasSearchSpecificationMode". Allowed values of "hasSearchSpecificationMode" are "Update" and "Substitution". If the specification mode is "Update" then the specification process leads to an additional statement about the specified term – for example: 1. A user searches for "a man with blond hair." 2. She clicks on "Verfeinern" (Refine) for the search term "man". 3. She gets a list of properties that are usable for this specification interface and clicks on "Man drives car". 4. The search term "man" has now been updated and the new query is "a man with blond hair and that drives a car". If in contrast the specification mode is "Substitution" then the

specification process leads to an exchange of the old term with a new one – for example: 1. A user searches for "a man with blond hair". 2. She clicks on "Verfeinern" (Refine) for the search term "man". 3. She gets a list of choices that are usable for this specification interface and clicks on "woman" in the context "Other terms that are sub-categories of the concept Human". 4. The search term "man" has now been substituted by "woman" and the new query is "a woman with blond hair". A sketch of the two modes is given below.
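The following Ruby sketch summarizes the two specification modes; the query representation and the accessors used here are assumptions for illustration, not the DEEP SEMANTICS API.

    # query_terms: array of term objects; "term.facts" is a hypothetical
    # container for statements attached to a search term.
    def refine(query_terms, term, property, value)
      case property.hasSearchSpecificationMode
      when "Update"
        term.facts << [property, value]   # keep "man", add "drives a car"
        query_terms
      when "Substitution"
        query_terms.map { |t| t == term ? value : t }  # "man" -> "woman"
      end
    end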

The Object Property Hierarchy

The object properties in the IKEN ontology are predominantly used for the specification of content objects, like "traegtKleidung" (wears clothing), which can be used in the system to annotate the type of clothing of a person in the photo. Besides, object properties are for example applied to relate photo instances to annotation terms with "hatAnnotation" (has annotation) and to describe part-of relations between geographical areas with "liegtIn" (is situated in). The following list of top-level IKEN object properties gives an overview of their purpose and formal realization.

• hatRegelmaessigeVeranstaltung (has regular event): This property connects locations with events (for example "Düsseldorf has regular event Japanese Day."). Its inverse property is "findetStattIn" (takes place in).

– Domain: "Ort" (Location) – Range: "Veranstaltung" (Event) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Substitution

• liegtIn (is situated in): This property connects smaller geographical areas with larger ones (for example "Düsseldorf is situated in Germany."). Its inverse property is "umfasst" (comprises).

– Domain: "GeographischerRaum" (Geographic Area) – Range: "GeographischerRaum" (Geographic Area) – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• umfasst (comprises): This property connects larger geographical areas with smaller ones (for example "Germany comprises Düsseldorf."). Its inverse property is "liegtIn" (is situated in).

– Domain: "GeographischerRaum" (Geographic Area) – Range: "GeographischerRaum" (Geographic Area) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Substitution IKEN 109

• istFarbeVon (is colour of): This property connects a certain colour with some instance that is of this colour (for example "Yellow is the colour of a banana.").

– Domain:"Farbe" (Colour) – Range: OWL Thing – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• traegtKleidung (wears clothing): "traegtKleidung" (wears clothing) connects a person to the clothing he is wearing in the photo (for example "The man wears blue trousers.").

– Domain: "Mensch" (Human) – Range: "Kleidung" (Clothing) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• hatFarbe (has colour): This property connects some instance with a certain colour this instance has (for example "The banana is yellow.").

– Domain: OWL Thing
– Range: "Farbe" (Colour)
– isUsableForSpecificationGUI: False
– hasSearchSpecificationMode: None

• findetStattImMonat (takes place in month): "findetStattImMonat" (takes place in month) connects an event with the month in which this event takes place.

– Domain: "Veranstaltung" (Event) – Range: "Monat" (Month) – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• wirdGefahrenVon (is driven by): This property connects a vehicle with its driver (for example "The black car is driven by a woman wearing a blue T-shirt."). Its inverse property is "faehrt" (drives).

– Domain: "Fahrzeug" (Vehicle) – Range: "Mensch" (Human) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• faehrt (drives): "faehrt" (drives) connects a driver with the vehicle she drives (for example "The woman wearing a blue T-shirt drives a motorbike."). Its inverse property is "wirdGefahrenVon" (is driven by).

– Domain: "Mensch" (Human) – Range: "Fahrzeug" (Vehicle) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• hatAnnotation (has annotation): This is the most important property, as it connects the photo instances with the annotations describing the visible content. "hatAnnotation" (has annotation) comprises three sub-properties: "hatTourismuskategorieAnnotation" (has tourism category annotation), "hatGeographischerRaumAnnotation" (has geographical area annotation) and "hatEmotionAnnotation" (has emotion annotation).

– Domain:"Photo" – Range: OWL Thing – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• stehtNeben (stands beside of): A person can be specified as standing beside a visible object in the photo (for example "The woman stands beside a motorbike.").

– Domain: "Mensch" (Human) – Range: OWL Thing – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• haeltInderHand (holds in hand): "haeltInderHand" (holds in hand) connects a person with an artefact that can be held in the hand (for example "The woman holds a glass in her hand.").

– Domain: "Mensch" (Human) – Range: "InDerHandHaltbarerGegenstand" (In the Hand holdable Artefact) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• hatFrisur (has hairstyle): A person can be specified as having a certain hairstyle (for example "The man has the hairstyle curls.").

– Domain: "Mensch" (Human) – Range: "Frisur" (Hairstyle) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• spielt (plays): A person can be specified as playing a certain game (for example "The boy plays soccer.").

– Domain: "Mensch" (Human) – Range: "Spiel" (Game) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• findetStattIn (takes place in): This property connects events with locations (for example "The Japanese Day takes place in Düsseldorf."). Its inverse property is "hatRegelmaessigeVeranstaltung" (has regular event).

– Domain: "Veranstaltung" (Event) – Range: "Ort" (Location) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Substitution

• sitztAuf (sits on): "sitztAuf" (sits on) connects a person with an object this person is sitting on (for example "The woman sits on a couch.").

– Domain: "Mensch" (Human) – Range: "Sitzgelegenheit" (Seating) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• stehtVor (stands in front of): "stehtVor" (stands in front of) connects a person with an object this person is standing in front of (for example "The man stands in front of a house.").

– Domain: "Mensch" (Human) – Range: OWL Thing – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• schiebt (pushes): A person can be specified as pushing an object (for example "The girl pushes a perambulator.").

– Domain: "Mensch" (Human) – Range: "SchiebbarerGegenstand" (Pushable Artefact) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• hatBewoelkung (has cloudiness): The cloudiness of the sky can be described with "hat- Bewoelkung" (has cloudiness).

– Domain: "Himmel" (Sky) 112 DEEP SEMANTICS in Action: IKEN and the BIO2ME

– Range: "Bewoelkung" (Cloudiness) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• bewegtSichFort (moves): "bewegtSichFort" (moves) connects a person to her type of movement (for example "The woman is running.").

– Domain: "Mensch" (Human) – Range: "Fortbewegungsart" (Type of Movement) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

• hatAlter (has age): A person can be annotated as having a certain age (for example "The man has the age 42.").

– Domain: "Mensch" (Human) – Range: "Alter" (Age) – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• hatGesichtsausdruck (has facial expression): "hatGesichtsausdruck" (has facial ex- pression) connects a person with her facial expression (for example "The woman has facial expression smiling.").

– Domain: "Mensch" (Human) – Range: "Gesichtsausdruck" (Facial Expression) – isUsableForSpecificationGUI: True – hasSearchSpecificationMode: Update

The Data Type Property Hierarchy

Data type properties are applied in the IKEN ontology when the possible values for these properties do not have to be further specified. "hatTag" (has tag), for example, relates photo instances to non-semantic tags. According to their nature, non-semantic tags are not modeled in the ontology and are therefore connected via a data type property. Top-level IKEN data type properties are:

• hatTag (has tag): hatTag (has tag) is applied to all annotation terms that cannot be mapped to the ontology. An example would be "the photo has the tag Barack Obama", as the IKEN ontology does not comprise the term "Barack Obama".

– Domain:"Photo" – Range: String IKEN 113

– isUsableForSpecificationGUI: False
– hasSearchSpecificationMode: None

• istNachtaufnahme (is night photograph): This data type property provides the possibility to annotate whether a photo has been taken at night. However, this property is not supported by the prototype at the moment.

– Domain:"Photo" – Range: Boolean – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• createdAt: "createdAt" is applied to all photo instances and saves the exact time when the related photo has been uploaded into the system.

– Domain:"Photo" – Range: String – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• belongsToImageFileName: Like "createdAt", "belongsToImageFileName" is applied to all photo instances. Using this property, the name of the photo file is related to its photo annotation instance.

– Domain:"Photo" – Range: String – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• istAussenaufnahme (is location shot): The data type property "istAussenaufnahme" (is location shot) provides the possibility to annotate whether a photo has been taken outside. It is not supported by the IKEN prototype at the moment.

– Domain:"Photo" – Range: Boolean – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• belongsToThumbImageFilePath: This data type property links every photo annotation instance to the file path of the thumbnail image of the corresponding photo.

– Domain:"Photo" – Range: String 114 DEEP SEMANTICS in Action: IKEN and the BIO2ME

– isUsableForSpecificationGUI: False
– hasSearchSpecificationMode: None

• istTagaufnahme (is day photograph): Like "istNachtaufnahme" (is night photograph), this data type property is not supported by the prototype at the moment. It will be used in future versions to annotate whether a photo has been taken by day.

– Domain:"Photo" – Range: Boolean – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• belongsToImageFilePath: "belongsToImageFilePath" links every photo annotation in- stance to the path of the corresponding photo file.

– Domain:"Photo" – Range: String – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

• belongsToAnnotationImageFilePath: The annotation image file is a copy of the original photo with reduced resolution which is used in the graphical user interface. "belongsToAnnotationImageFilePath" links every photo annotation instance to the path of this annotation image file.

– Domain:"Photo" – Range: String – isUsableForSpecificationGUI: False – hasSearchSpecificationMode: None

The SWRL Rules

A series of annotation assertions are automatically added to the photos using PELLET with the following SWRL rules:

• Rule 1 – Colour annotation

Photo(?x) ∧ hatAnnotation(?x, ?y) ∧ hatFarbe(?y, ?z) ⇒ hatAnnotation(?x, ?z)

Explanation: If an instance x of "Photo" is annotated with an individual y which has a has colour ("hatFarbe") attribute with colour z as value, then x is also annotated with this colour z.

• Rule 2 – Is situated in annotation 1

Photo(?x) ∧ hatGeographischerRaumAnnotation(?x, ?y) ∧ liegtIn(?y, ?z) ⇒ hatAnnotation(?x, ?z)

Explanation: If an instance x of "Photo" is annotated with an instance y of Geographic Area ("GeographischerRaum") and if y is situated in ("liegtIn") z then x is also annotated with z.

• Rule 3 – Is situated in annotation 2

Photo(?x) ∧ hatAnnotation(?x, ?y) ∧ liegtIn(?y, ?z) ⇒ hatAnnotation(?x, ?z)

Explanation: If an instance x of "Photo" is annotated with an individual y and if y is situated in ("liegtIn") z then x is also annotated with z.

• Rule 4 – Tourism category annotation 1

Photo(?x) ∧ hatAnnotation(?x, ?y) ∧ Sehenswuerdigkeit(?y) ⇒ hatTourismuskategorieAnnotation(?x, Freizeit)

Explanation: If an instance x of "Photo" is annotated with an instance y of Tourist Feature ("Sehenswuerdigkeit") then x is annotated ("hatTourismuskategorieAnnotation") with the tourism category Leisure ("Freizeit").

• Rule 5 – Tourism category annotation 2

Photo(?x) ∧ hatAnnotation(?x, ?y) ∧ Gewaesser(?y) ⇒ hatTourismuskategorieAnnotation(?x, Natur)

Explanation: If an instance x of "Photo" is annotated with an instance y of Body of Water ("Gewaesser") then x is annotated ("hatTourismuskategorieAnnotation") with the tourism category Nature ("Natur").

• Rule 6 – Tourism category annotation 3

Photo(?x) ∧ hatAnnotation(?x, ?y) ∧ Gruenanlage(?y) ⇒ hatTourismuskategorieAnnotation(?x, Natur)

Explanation: If an instance x of "Photo" is annotated with an instance y of Recreation Area ("Gruenanlage") then x is annotated ("hatTourismuskategorieAnnotation") with the tourism category Nature ("Natur").

• Rule 7 – Tourism category annotation 4

Photo(?x) ∧ hatAnnotation(?x, ?y) ∧ Pflanze(?y) ⇒ hatTourismuskategorieAnnotation(?x, Natur)

Explanation: If an instance x of "Photo" is annotated with an instance y of Plant ("Pflanze") then x is annotated ("hatTourismuskategorieAnnotation") with the tourism category Nature ("Natur").

• Rule 8 – Event annotation

Photo(?x) ∧ Veranstaltung(?y) ∧ findetStattImMonat(?y, ?z) ∧ hatAnnotation(?x, ?y) ⇒ hatAnnotation(?x, ?z)

Explanation: If an instance x of "Photo" is annotated with an instance y of Event ("Veranstaltung") which has a takes place in month ("findetStattImMonat") attribute with month z as value, then x is also annotated with this month z.
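Re-expressed as plain Ruby over an array of subject-predicate-object triples, Rule 1 computes the following; this is only a simplification of what PELLET does when evaluating the SWRL rules, and the Photo(?x) class atom is omitted for brevity.

    # triples: array of [subject, predicate, object] arrays.
    def apply_colour_rule(triples)
      inferred = []
      triples.each do |x, p, y|
        next unless p == "hatAnnotation"
        triples.each do |s, q, z|
          # the photo x inherits the colour z of its annotation y
          inferred << [x, "hatAnnotation", z] if s == y && q == "hatFarbe"
        end
      end
      (inferred - triples).uniq    # keep only genuinely new statements
    end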

4.1.2 The IKEN Architecture

The IKEN architecture can be divided into two main parts: 1) the ontology data pre-processing and 2) the architecture of the functional web-application. Figure 4.3 shows the ontology data pre-processing workflow, which can be divided into four major steps:

1. Consistency check: During the development of the IKEN prototype the reasoner PELLET (version 1.5.1) was used for consistency checks and inference. This consistency check of the IKEN ontology has to be executed after every extension or alteration of the ontology – especially the TBox.

2. Inference: The consistent IKEN ontology base model is then further processed with PELLET to infer the implicit constructs in the ontology and thus make them explicit. The result of this pre-processing step is the inferred IKEN ontology model.

3. Serialization: This consistent and inferred ontology version is then serialized into N-Triples format using JENA2, which serves as the input format for the next processing step.

4. Deep integration: In the last pre-processing step the OWL model is transferred from N-Triples into a deep-integrated Ruby representation. A sketch of this pipeline is given below.
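The four steps can be summarized as a hypothetical Ruby driver; every method name below is an assumption standing in for the external tools named above (PELLET for steps 1 and 2, JENA2 for step 3, DEEP SEMANTICS for step 4).

    # Hypothetical driver for the pre-processing workflow of Figure 4.3.
    def preprocess(owl_file)
      raise "ontology is inconsistent" unless Pellet.consistent?(owl_file) # 1
      inferred = Pellet.infer(owl_file)                                    # 2
      ntriples = Jena.serialize_as_ntriples(inferred)                      # 3
      DeepSemantics.deep_integrate(ntriples)   # 4: functional Ruby model
    end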

I used the RUBY ON RAILS web development framework for the implementation of the IKEN prototype. RUBY ON RAILS supports the MVC architectural pattern, which I further extended with Semantic Web capabilities using DEEP SEMANTICS. The MVC pattern consists of three units: Model, View and Controller. The main parts of the semantically extended IKEN architecture, as shown in Figure 4.4, are:

Figure 4.3: Ontology data pre-processing for IKEN. Ontology data pre-processing workflow divided into four major steps: 1) consistency check, 2) inference, 3) serialization and 4) deep integration.

1. Client request: The IKEN application running in the user’s web-browser as a FLEX3 process sends a request to the IKEN web server.

2. Database access: The controller which is responsible for this kind of request fetches the required data for the intended response from the database. It fetches all required data from the database except for the functional ontology data. The database runs on a dedicated database server.

3. DEEP SEMANTICS access: If necessary, the same controller retrieves the required ontological data saved in the IKEN ontology using DEEP SEMANTICS – required ontological data and constructs, respectively, can be for example certain classes or instances. The DEEP SEMANTICS process runs on the application server.

4. Response preparation: The fetched and processed data is passed from the controller to the respective view.

5. Server response: The view in turn combines this dynamic data with the page template; together they constitute the visual appearance of the response that is sent from the web server to the client.

6. Ontology data processing: Initially – that means at the start-up of the IKEN application – the used ontology data is read by the DEEP SEMANTICS process either from the database or from an OWL file. Likewise, new annotation data in the ABox is saved to the database when triggered by the administrator.
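A minimal controller sketch of steps 2 and 3 might look as follows; the ActiveRecord model ImageUpload, the action name and the parameter names are assumptions, while find_instance_by_label is one of DEEP SEMANTICS’s convenient methods (see Section 4.3.2).

    # Hypothetical RUBY ON RAILS controller combining database access (2)
    # and DEEP SEMANTICS access (3); steps 4 and 5 are handled by the view.
    class AnnotationsController < ApplicationController
      def create
        @photo_row = ImageUpload.find(params[:id])               # step 2
        @instance  = Frau.find_instance_by_label(params[:term]).first  # step 3
        render :action => "create"                               # steps 4/5
      end
    end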

Figure 4.4: IKEN architecture and implementation set-up. The architecture of IKEN is basically a semantic extension of the Model-View-Controller architectural pattern. Besides the web server, the implementation set-up consists of an application server running the DEEP SEMANTICS process, which is responsible for the ontology processing, and the database server.

Figure 4.5: IKEN details view in the annotation interface. The details view of the term "Burg Linn". The details about where this castle is situated are: "Krefeld Linn", "Krefeld", "North Rhine-Westphalia" and "Germany".

Figure 4.6: IKEN refinement view in the annotation interface. This screenshot of the system’s refinement view shows the specification possibilities for the semantic tag "Frau" (woman). Currently selected is the possibility to indicate the colour of the woman’s hair.

4.1.3 The IKEN Graphical User Interface

In IKEN the concept interrelations defined in the underlying ontology can be exploited through the annotation user interface. Separate fields have been created for the annotation of the location where the image has been taken and for emotional impressions. The main field captures the content-descriptive annotations. In a first step, concepts can be added to a photo by entering a word – the system will look for a matching ontology concept and add this to the photo. This matching process uses the rdfs:label assertions in the ontology, which also include synonyms and spelling variants. If no matching concept is found, the annotation term is added as a simple tag (in a future version these unrelated tags can be used as suggestions for new ontology concepts). In a next step, details for this concept can be displayed and the annotation may be refined. For example, the concept "Burg Linn" (a castle) has been entered. One may now choose the "Details" button to see how this concept is related to other aspects in the ontology, for example that it is situated in: Deutschland (Germany), Nordrhein-Westfalen (a German state), Krefeld (a city), Krefeld Linn (a district of Krefeld) (see Figure 4.5). Clicking on "Verfeinern" (Refine) opens the option to directly specify the nature of a concept – in the case shown in Figure 4.6, for example, one may choose to specify the "hair colour" of a woman as being "dark brown".
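The matching step just described can be sketched in Ruby as follows. Restricting the search to a single class is a simplification and the photo accessors are assumptions, while find_instance_by_label and the properties hatAnnotation and hatTag come from the framework and the ontology.

    # Map an entered word to an ontology instance via its labels (which
    # include synonyms and spelling variants); fall back to a plain tag.
    def annotate(photo, word)
      matches = Lebewesen.find_instance_by_label(word)
      if matches.nil? || matches.empty?
        photo.hatTag << word                 # no concept found: keep a tag
      else
        photo.hatAnnotation << matches.first # semantic annotation
      end
    end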

An ontology-based retrieval system like IKEN can select only those pictures for which the annotation concepts have been specified to have certain property values. Figure 4.7, for example, shows the search for "a woman (Frau) that wears the piece of clothing (trägt das Kleidungsstück)

Figure 4.7: IKEN search interface. The screenshot shows the search results for the following query: "A woman who wears a red T-shirt".

T-shirt which has the colour red (Rot)" using the IKEN prototype. To make use of the semantic annotations for precise document retrieval in the IKEN user interface, search terms can be added in the same way as annotation concepts. The user may enter several search terms, which will again be replaced by corresponding ontology concepts, and may refine each of them with the available properties. Details regarding the interrelations to other aspects can be shown for every concept. This may offer suggestions for manual query reformulation and refinement, and it constitutes a navigation tool in the domain knowledge space.

4.2 The BIO2ME Ontology and Information System

4.2.1 BIO2ME Ontology

The BioInformatics Ontology for Tools and Methods (BIO2ME) is an ontology about the domain of bioinformatics tools and methods (Mainz, 2006). It was motivated by a study (Wilm et al., 2006) done by members of the ONTOVERSE project. In this work, several multiple sequence alignment tools were compared regarding selected input data. One conclusion of this work was that the most popular and most frequently employed program in this field, CLUSTALW (Chenna et al., 2003), did not necessarily provide the best results for each data set. The example of CLUSTALW exemplifies the current situation in bioinformatics: there exist a variety of programs, packages, databases etc. dealing with various problems like the efficient processing of experimental data, sequence analysis, and protein structure prediction and visualization. The problem for biologists currently is that the domain of bioinformatics methods cannot be surveyed with reasonable effort. Even for experts in a specific domain of bioinformatics it is sometimes difficult to decide which tool fits the given requirements best. Moreover, there is a plethora of programs which incorporate miscellaneous computational, mathematical and biological approaches to solving the same problems.

The BIO2ME ontology is a detailed collection of information about bioinformatics tools, which are categorized according to their application ranges. Supported biological tasks, utilized computational methods, processed data formats and support information of tools are captured, too. The ontology provides the basis for a search for tools that meet a user’s needs and also provides information about certain tools and computational methods. It answers questions such as: Which tools and methods exist that deal with a given problem? Which data output do they provide?

4.2.2 BIO2ME Information System

Based on the BIO2ME ontology, and using DEEP SEMANTICS for ontology processing, an information system has been implemented by Mainz (2008). The BIO2ME information system provides functionalities to search for required bioinformatics tools and methods, using the ontology-defined semantics for query extension as well as query formulation.

Besides semantic information retrieval, the BIO2ME application also supports the extension of the ontology ABox. While this level of functionality does not comprise schema modification, as provided for example by the ontology editors SWOOP (Kalyanpur et al., 2005), PROTÉGÉ (Knublauch et al., 2004) or ONTOVERSE, it does provide fast and consistency-safe ontology instance extension and editing.

4.3 Discussion

In this chapter two semantic applications were described that both use DEEP SEMANTICS as their ontology processing framework. The first one, the IKEN semantic image management application (Mainz et al., 2008), provides functionalities to annotate images based on an underlying ontology. These ontology-based annotations can then be used to offer a semantic search engine over the images. The IKEN system was designed to enable users who are not experts in knowledge engineering and ontology modelling to apply semantic annotations and make use of semantic information retrieval in image collections. In the context of this application, DEEP SEMANTICS was successfully applied to access the TBox of the ontology (as well as initially stored instances) to implement the semantic image annotation and retrieval functionalities.

The second described semantic application is the BIO2ME information system. While this application was not developed during this thesis, it is discussed here as a further reference application using DEEP SEMANTICS for ontology access and manipulation.

4.3.1 Semantic Image Management with IKEN

Besides being a test implementation for DEEP SEMANTICS, the initial aims of IKEN were: A) to provide a usable application which makes the benefits of ontologies visible for internet users; B) to enable easily usable functionalities for ontology-based photo annotations; C) to allow browsing a document collection based on domain semantics. First test runs with a small set of reference users showed that ontology-supported image annotation and retrieval with IKEN can significantly improve the user experience. To empirically support these first results, tests to evaluate precision, recall and usability are currently being prepared. However, present user responses indicate that IKEN’s user interface highlights the advantages of semantic interfaces and therefore successfully accomplishes aim A). Additionally, users reported IKEN to provide more easily usable functionalities for ontology-based photo annotations than PHOTOSTUFF (Halaschek-Wiener et al., 2005). During the first test runs it turned out that trying to describe every visual object in the picture in detail is neither a feasible nor a desirable approach. For the presented version, the concept Person was identified as being most important for the IKEN application. Consequently, a large part of the used properties is related to the class Person, like wearsClothing, hasHairColour and holdsInHand. Furthermore, the applied SWRL rules proved to be beneficial for automatically determining additional facts and annotations, respectively. However, the use of the reasoner PELLET with SWRL rules turned out to be ineffective regarding its runtime, as it extended the required inference runtime fiftyfold, from approximately 30 minutes to more than 24 hours. An important result was the finding that terms used for photo annotations which cannot be mapped to the ontology constitute a valuable resource for ontology extension. Using these tags I was able, for example, to extend the subclasses of the concept Clothing with the classes Coat and Gentleman Suit. Future work on the IKEN system will include functionalities to incorporate unrelated tags into the used ontology.

4.3.2 Experiences applying DEEP SEMANTICS

The primary purpose of the implementation of the semantic image annotation system IKEN, with respect to this thesis, was to build a reference application based on DEEP SEMANTICS. Experiences during the development of IKEN support the results discussed in Subsection 3.9.1 of Chapter 3. Deep-integrated classes and instances were easily accessible and modifiable. For example, the following getter methods for a class’s set of instances were useful in a wide range of implemented functionalities: direct_instances, initial_instances and instances. Similar experiences were reported by the developer of the BIO2ME information system.
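A short usage sketch of these getters; the class Frau stands in for any deep-integrated class, and the comments state presumed semantics.

    direct  = Frau.direct_instances    # instances typed as Frau itself
    initial = Frau.initial_instances   # instances defined by the ontology designer
    women   = Frau.instances           # presumably all instances, including
                                       # those of Frau's subclasses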

The usage of DEEP SEMANTICS for the conversion of an OWL LITE ontology into RUBY classes and objects turned out to require the consideration of some special rules. Ontology class names, for example, have to begin with an uppercase character. This rule accounts for RUBY’s naming conventions for classes. Additionally, RUBY classes – and therefore ontology classes processed with DEEP SEMANTICS – may consist of any combination of letters, numbers and underscores.
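These naming rules amount to the constraints on Ruby constant names and can be checked with a small sketch like the following; the helper itself is hypothetical.

    # True if the name can serve as a deep-integrated Ruby class name:
    # an uppercase letter followed by letters, digits or underscores.
    def valid_deep_semantics_class_name?(name)
      !!(name =~ /\A[A-Z][A-Za-z0-9_]*\z/)
    end

    valid_deep_semantics_class_name?("GeographischerRaum")  # => true
    valid_deep_semantics_class_name?("geo-raum")            # => false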

Furthermore, the following three best practices were defined during the development of IKEN and BIO2ME regarding ontology preparation for DEEP SEMANTICS:

1. Every class, individual and property in the ontology should provide at least one natural language label – this recommendation, among other things, fosters the usefulness of convenient methods like find_instance_by_label(label).

2. Each property should provide a short description that can be displayed in the search form to help the user understand the search field.

3. rdfs:comments should be defined in an HTML-conformant way when they are to be displayed in a browser.

Before the development of both reference applications, convenient methods like find_instance_by_label(label) were not part of DEEP SEMANTICS. These convenient methods were added based on the experiences gathered regarding commonly used ontology processing functionalities. The current version of DEEP SEMANTICS comprises convenient methods like the following (a short usage sketch is given after the list):

• find_instance_by_label(label): returns all instances of a class matching the passed label.

• find_instance_by_local_name(local_name): returns all instances of a class matching the passed local name.

• find_subclasses_by_label(label): returns all subclasses of a class matching the passed label.

• incompleted_object_properties: returns those object properties that can still be used to state additional facts about an instance without overwriting existing values.

• find_classes_by_superclass(superclass): returns those classes in the ontology that have the defined superclass.

• listClassesByLevel and list_classes_by_level, respectively: list classes by their highest level (for example, if a class occurs on levels 2 and 3 of the hierarchy, it is saved for level 3).
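A brief usage sketch of some of these methods; the receiver classes and labels are assumptions, while the method names are the framework’s own.

    Stadt.find_instance_by_label("Duesseldorf")   # instances with this label
    Stadt.find_instance_by_local_name("Krefeld")  # instances by local name
    Kleidung.find_subclasses_by_label("Mantel")   # subclasses by label

    woman = Frau.instances.first
    woman.incompleted_object_properties           # properties that can still be
                                                  # asserted without overwriting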

The discussed convenient methods support DEEP SEMANTICS’s focus on developer requirements and provide a sophisticated usability layer on top of the deep-integrated ontology model. Additional types of convenient or helper methods are discussed next in the conclusions and outlook section.

4.3.3 Conclusions and Outlook

The implementations of both reference applications proved DEEP SEMANTICS to be an easy-to-use Semantic Web framework. Additional helper and convenient methods, respectively, were identified that further extend DEEP SEMANTICS’s potential to become the framework of choice for rapid Semantic Web development within the next few years.

Runtime and memory complexity were unproblematic and did not present any practical obstacles at all. However, the handling of the deep integration process should be optimized in future versions, for example by enabling the deep integration of the TBox to be performed once, including storing the resulting functional model. This would allow the model to be included into an application’s source code without having to convert the related ontology yet another time.

Besides improvements of DEEP SEMANTICS’s core functionalities (as discussed in Section 3.9 of the previous chapter), the implementation experiences described above suggest a future extension of the provided set of convenient methods. Relevant candidates are new methods for fetching instances by their property values.

5 Summary

The emerging Semantic Web is an important component of the Internet of the future. Semantic applications provide encouraging possibilities to master the exponentially growing amount of data and information. Despite its considerable application potential, the idea of the Semantic Web has not yet gained broad acceptance. Two problems are of particular relevance here: existing Semantic Web frameworks are either difficult to learn or cause problems because they do not prevent the generation of logical inconsistencies.

The fundamentally new Semantic Web framework DEEP SEMANTICS developed in this work provides, first, an efficient and quickly learnable approach to editing ontologies, second, integrated consistency checks and, third, fast data processing. Through the algorithmic realization of the so-called deep integration approach, DEEP SEMANTICS makes it possible to convert OWL LITE ontologies into RUBY code. As a result, a functional model of the ontology is generated in RUBY. The editing of ontologies becomes more efficient because fewer lines of source code achieve the same functionality as source code of conventional API-based Semantic Web frameworks. Moreover, access to ontologies using DEEP SEMANTICS is much more intuitive and therefore easier to learn. The integrated consistency check corresponds to an important feature of DEEP SEMANTICS: consistency safeness. DEEP SEMANTICS is the first Semantic Web framework of its kind which guarantees the logical consistency of the modified ontology. As the framework processes only those operations which cannot cause logical inconsistencies, it provides the developer of semantic applications with a security in handling semantic data whose importance should not be underestimated. In addition, DEEP SEMANTICS exhibits excellent runtimes during ontology editing.

Furthermore, DEEP SEMANTICS was successfully applied for the implementation of the semantic image management application IKEN. Using IKEN, images can be annotated and searched on the basis of an ontology. The application thereby offers a novel approach for the utilization of semantic context information in graphical user interfaces. Depending on the entered term, IKEN presents the user with additional information stored in the ontology. During the implementation of IKEN, all three above-mentioned advantages could be confirmed with respect

to DEEP SEMANTICS. Additionally, new suggestions for the extension of DEEP SEMANTICS were motivated, which led to the implementation of novel, efficiency-improving convenience functions.

Furthermore, DEEP SEMANTICS promises to become a valuable building block for semantic applications in bioinformatics. In this thesis this could be discussed on the basis of the results of the BIO2ME project. In summary, the presented results make it appear possible that DEEP SEMANTICS will become an important part of the Semantic Web.

6 Zusammenfassung

The emerging Semantic Web is an important component of the Internet of the future. Semantic applications offer promising possibilities for mastering the exponentially growing amounts of data and information. Despite its considerable application potential, however, the idea of the Semantic Web has not yet gained broad acceptance. Two problems are of particular relevance here: existing Semantic Web frameworks are either hard for developers to learn, or cause problems because they do not prevent the generation of logical inconsistencies. The fundamentally new Semantic Web framework DEEP SEMANTICS developed in this thesis offers, first, an efficient and quickly learnable approach to editing ontologies, second, integrated consistency checks and, third, fast data processing. Through the algorithmic realization of the so-called deep integration approach, DEEP SEMANTICS makes it possible to translate OWL LITE ontologies into RUBY program code. The result is a functional model of the ontology in RUBY. Editing ontologies thus becomes considerably more efficient, since fewer lines of program code achieve the same functionality as program code of conventional API-based Semantic Web frameworks. In addition, access to the ontology data via DEEP SEMANTICS is much more intuitive and therefore easier to learn. The integrated consistency checking relates to an important feature of DEEP SEMANTICS: consistency safeness. DEEP SEMANTICS is the first Semantic Web framework of its kind that ensures the logical consistency of the edited ontology. By executing only those operations that cannot cause logical inconsistencies, the framework gives developers of semantic applications a new security in handling semantic data whose importance should not be underestimated. At the same time, DEEP SEMANTICS shows excellent runtime behavior in ontology data editing.

Furthermore, DEEP SEMANTICS was successfully employed in this thesis for the implementation of the semantic image management application IKEN. With IKEN, images can be annotated and searched on the basis of an ontology. The application thereby offers a

novel approach for the use of semantic context information in graphical user interfaces. Depending on which term is entered into the system, IKEN presents the user with additional information stored in the ontology. With respect to DEEP SEMANTICS, the three advantages listed above were confirmed during the implementation of IKEN. In addition, new suggestions for the extension of DEEP SEMANTICS were obtained, which led to the implementation of new, efficiency-enhancing helper functions.

Moreover, DEEP SEMANTICS promises to become a valuable building block for semantic applications in bioinformatics. This could be discussed in this thesis on the basis of the results of the BIO2ME project. In summary, the results presented in this thesis make it appear possible that DEEP SEMANTICS will become an important part of the Semantic Web.

Bibliography

W3C (2004). RDF Vocabulary Description Language 1.0: RDF Schema. http://www.w3.org/TR/rdf-schema/.
Ahmad, Ashraf M. A. (2007). Multimedia content and the semantic web: Methods, standards and tools: Book Reviews. J. Am. Soc. Inf. Sci. Technol., 58(3), 457–458.
Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M. & Sherlock, G. (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 25(1), 25–29.
Avraham, S., Tung, C. W., Ilic, K., Jaiswal, P., Kellogg, E. A., McCouch, S., Pujar, A., Reiser, L., Rhee, S. Y., Sachs, M. M., Schaeffer, M., Stein, L., Stevens, P., Vincent, L., Zapata, F. & Ware, D. (2008). The Plant Ontology Database: a community resource for plant structure and developmental stages controlled vocabulary and annotations. Nucleic Acids Research, 36(Database issue).
Baader, Franz, Calvanese, Diego, McGuinness, Deborah L., Nardi, Daniele & Patel-Schneider, Peter F. (2003). The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, Cambridge, UK.
Babik, Marian & Hluchy, Ladislav (2006). Deep integration of Python with Web Ontology Language. In Proc. of 2nd Workshop on Scripting for the Semantic Web.
Baxevanis, Andreas D. (2000). The Molecular Biology Database Collection: an online compilation of relevant database resources. Nucl. Acids Res., 28(1), 1–7.
Baxevanis, Andreas D. (2001). The Molecular Biology Database Collection: an updated compilation of biological database resources. Nucl. Acids Res., 29(1), 1–10.
Baxevanis, Andreas D. (2002). The Molecular Biology Database Collection: 2002 update. Nucl. Acids Res., 30(1), 1–12.
Baxevanis, Andreas D. (2003). The Molecular Biology Database Collection: 2003 update. Nucl. Acids Res., 31(1), 1–12.
Bechhofer, S., van Harmelen, F., Hendler, J., Horrocks, I., McGuinness, D. L., Patel-Schneider, P. & Stein, L. A. (2004). OWL Web Ontology Language Reference. W3C Recommendation.
Bechhofer, Sean, Volz, Raphael & Lord, Phillip (2003). Cooking the Semantic Web with the OWL API. In Proc. 2nd International Semantic Web Conference (ISWC), Sanibel Island, FL, USA. Springer, pp. 659–675.

Benson, Dennis A., Karsch-Mizrachi, Ilene, Lipman, David J., Ostell, James & Wheeler, David L. (2004). GenBank: update. Nucleic Acids Res, 32, 23–26. 1.1

Blackburn, S. (2007). The Oxford Dictionary of Philosophy. Oxford University Press. 2.1.1

Bodenreider, Olivier & Stevens, Robert (2006). Bio-ontologies: current trends and future directions. Brief Bioinform, 7, 256–274. 2.4

Broekstra, Jeen, Kampman, Arjohn & van Harmelen, Frank (2002). Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. pp. 54+. 3.9.4

Carroll, Jeremy J., Dickinson, Ian, Dollin, Chris, Seaborne, Andy, Wilkinson, Kevin & Reynolds, Dave (2004). Jena: Implementing the semantic web recommendations. pp. 74–83. 1.2.1, 2.7.1

Cheng, Xu, Dale, Cameron & Liu, Jiangchuan (2007). Understanding the Characteristics of Internet Short Video Sharing: YouTube as a Case Study. 1.1

Chenna, Ramu, Sugawara, Hideaki, Koike, Tadashi, Lopez, Rodrigo, Gibson, Toby J., Higgins, Desmond G. & Thompson, Julie D. (2003). Multiple sequence alignment with the Clustal series of programs. Nucl. Acids Res., 31(13), 3497–3500. 4.2.1

Cheung, Kei-Hoi, Smith, Andrew, Yip, Kevin, Baker, Christopher & Gerstein, Mark (2007). Semantic Web Approach to Database Integration in the Life Sciences. pp. 11–30. 1.2

Christiansen, Tom & Torkington, Nathan (2003). Perl Cookbook. O'Reilly, second edition. 2.8

Delfs, Ralph, Doms, Andreas, Kozlenkov, Alexander & Schroeder, Michael (2004). GoPubMed: ontology-based literature search applied to GeneOntology and PubMed. In Proceedings of the German Bioinformatics Conference. LNBI. Springer, pp. 169–178. 3

Esmaili, K. Sheykh & Abolhassani, H. (2006). A Categorization Scheme for Semantic Web Search Engines. In 4th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA-06). 3

Feigenbaum, Lee, Martin, Sean, Roy, Matthew N., Szekely, Benjamin & Yung, Wing C. (2007). Boca: an open-source RDF store for building Semantic Web applications. Brief Bioinform, 8(3), 195–200. 3.9.4

Fernandez, O. (2005). Deep Integration of Ruby with Semantic Web Ontologies. 2.9

Finin, T., Mayfield, J., Joshi, A., Cost, R. S. & Fink, C. (2005). Information Retrieval and the Semantic Web. p. 113a. 3

Freeman, Elisabeth, Freeman, Eric, Bates, Bert & Sierra, Kathy (2004). Head First Design Patterns. O'Reilly & Associates, Inc. 3.2

Galperin, Michael Y. (2004). The Molecular Biology Database Collection: 2004 update. Nucl. Acids Res., 32(1), D3–22. 1.2

Galperin, Michael Y. (2005). The Molecular Biology Database Collection: 2005 update. Nucl. Acids Res., 33(1), D5–24. 1.2

Galperin, Michael Y. (2006). The Molecular Biology Database Collection: 2006 update. Nucl. Acids Res., 34(1), D3–5. 1.2

Galperin, Michael Y. (2007). The Molecular Biology Database Collection: 2007 update. Nucl. Acids Res., 35(1), D3–4. 1.2

Galperin, Michael Y. (2008). The Molecular Biology Database Collection: 2008 update. Nucl. Acids Res., 36(1), D2–4. 1.2

Geer, D. (2006). Will software developers ride Ruby on Rails to success? Computer, 39(2), 18–20. 3.9.1

Goble, Carole Anne & De Roure, David Charles (2007). myExperiment: social networking for workflow-using e-scientists. In WORKS '07: Proceedings of the 2nd Workshop on Workflows in Support of Large-Scale Science. ACM, New York, NY, USA, pp. 1–2. 1.3

Golbreich, Christine, Horridge, Matthew, Horrocks, Ian, Motik, Boris & Shearer, Rob (2007). OBO and OWL: Leveraging Semantic Web Technologies for the Life Sciences. In ISWC/ASWC, pp. 169–182. 2.4.1

Goldberg, Adele & Robson, David (1989). Smalltalk-80: The Language. Addison-Wesley Series in Computer Science. Addison-Wesley Professional. 2.8

Graham, Paul (1995). ANSI Common LISP. Prentice Hall. 2.8

Grau, Bernardo Cuenca, Horrocks, I., Motik, B., Parsia, B., Patel-Schneider, P. & Sattler, U. (2008). OWL 2: The next step for OWL. Web Semantics: Science, Services and Agents on the World Wide Web. 3.9.4

Grau, Bernardo Cuenca, Parsia, Bijan & Sirin, Evren (2004). Working with multiple ontologies on the semantic web. In International Semantic Web Conference. Springer, pp. 620–634. 3.9.3

Grosof, Benjamin N. (2003). Description logic programs: Combining logic programs with description logic. ACM, pp. 48–57. 2.9

Gruber, Thomas R. (1993). A translation approach to portable ontology specifications. Knowledge Acquisition, 5, 199–220. 1

Gruber, Thomas R. (1993). Toward Principles for the Design of Ontologies Used for Knowledge Sharing. In Guarino, Nicola & Poli, Roberto (eds.), Formal Ontology in Conceptual Analysis and Knowledge Representation. Kluwer Academic Publishers. Substantial revision of a paper presented at the International Workshop on Formal Ontology. 2.1.1

Haarslev, Volker & Möller, Ralf (2001). RACER system description. Springer, pp. 701–705. 2.6

Haarslev, Volker & Möller, Ralf (2003). Racer: A Core Inference Engine for the Semantic Web. In 2nd International Workshop on Evaluation of Ontology-based Tools (EON-2003), Sanibel Island, FL, pp. 27–36. 1.2.1

Halaschek-Wiener, Christian, Golbeck, Jennifer, Schain, Andrew, Grove, Michael, Parsia, Bijan & Hendler, James (2005). PhotoStuff – An Image Annotation Tool for the Semantic Web. In Poster Proceedings of the 4th International Semantic Web Conference (Gil, Yolanda, Motta, Enrico, Benjamins, Richard V. & Musen, Mark A., eds). 4.3.1

Halaschek-Wiener, Christian, Schain, Andrew, Golbeck, Jennifer, Parsia, Bijan & Hendler, Jim (2005). A flexible approach for managing digital images on the semantic web. In Proceedings of the Fifth International Workshop on Knowledge Markup and Semantic Annotation (SemAnnot). Springer, pp. 49–58. 2.5.1

Hare, Jonathon S., Lewis, Paul H., Enser, Peter G. B. & Sandom, Christine J. (2006). Mind the gap: another look at the problem of the semantic gap in image retrieval. Volume 6073. SPIE. 1.1

Hendler, James (2001). The semantic web. Scientific American, 284, 34–43. 1, 2.1

Hollink, Laura, Schreiber, Guus, Wielemaker, Jan & Wielinga, Bob (2003). Semantic annotation of image collections. In Workshop on Knowledge Markup and Semantic Annotation, K-CAP'03. Available at http://www.cs.vu.nl/~guus. 1.1

Horridge, Matthew & Bechhofer, Sean (2007). Igniting the OWL 1.1 Touch Paper: The OWL API. In Proc. OWL-ED 2007, volume 258 of CEUR. 1.2.1, 2.7.2

Horrocks, Ian, Patel-Schneider, Peter F. & van Harmelen, Frank (2003). From SHIQ and RDF to OWL: The Making of a Web Ontology Language. Journal of Web Semantics, 1(1), 7–26. 3.9

Horrocks, Ian, Patel-Schneider, Peter F., Boley, Harold, Tabet, Said, Grosof, Benjamin & Dean, Mike (2003). SWRL: A Semantic Web Rule Language Combining OWL and RuleML. 2.1.5

Hu, Xiaohua, Lin, T. Y., Song, Il Y., Lin, Xia, Yoo, Illhoi, Lechner, Mark & Song, Min (2004). Ontology-Based Scalable and Portable Information Extraction System to Extract Biological Knowledge from Huge Collection of Biomedical Web Documents. In WI '04: Proceedings of the 2004 IEEE/WIC/ACM International Conference on Web Intelligence. IEEE Computer Society, pp. 77–83. 1.2

ISO/IEC (1995). Ada 95 Reference Manual. ISO/IEC 8652:1995. ISO. 2.8

Jasper, Robert & Uschold, Mike (1999). A framework for understanding and classifying ontology applications. pp. 16–21. 1.2

Kalyanpur, Aditya, Parsia, Bijan, Sirin, Evren, Grau, Bernardo Cuenca & Hendler, James (2005). Swoop: A Web Ontology Editing Browser. Journal of Web Semantics, 4. 4.2.2

Kalyanpur, Aditya, Parsia, Bijan, Sirin, Evren, Grau, Bernardo C. & Hendler, James (2006). Swoop: A Web Ontology Editing Browser. Web Semantics: Science, Services and Agents on the World Wide Web, 4(2), 144–153. 2.7.2

Kanellopoulos, Dimitris N. & Kotsiantis, Sotiris B. (2007). Semantic Web: A state of the art survey. 2.1

Klyne, Graham & Carroll, Jeremy J. (2004). Resource Description Framework (RDF): Concepts and Abstract Syntax. W3C Recommendation. 1.2.1

Knublauch, H., Musen, M. & Rector, A. (2004). Editing description logics ontologies with the Protégé OWL plugin. 2.7.2, 4.2.2

Lacy, Lee W. (2005). OWL: Representing Information Using the Web Ontology Language. Trafford Publishing. 1.2.1

Liebig, Thorsten, Luther, Marko, Paolucci, Massimo, Wagner, Matthias & von Henke, Friedrich (2005). Building Applications and Tools for OWL – Experiences and Suggestions. 3.9.3

Lu, Zhiyong, Cohen, K. Bretonnel & Hunter, Lawrence (2006). Finding GeneRIFs via Gene Ontology Annotations. In Pacific Symposium on Biocomputing 2006, Maui. World Scientific Publishing Co. Pte. Ltd, pp. 52–63. 3

Mainz, Dominic, Weller, Katrin & Mainz, Jürgen (2008). Semantic Image Annotation and Retrieval with IKen. In International Semantic Web Conference (Posters & Demos). 4.3

Mainz, Indra (2006). Development of a prototype ontology for bioinformatics tools (in German). Bachelor thesis, Heinrich-Heine-Universität Düsseldorf. 4.2.1

Mainz, Indra (2008). Development and Implementation of Techniques for Ontology Engineering and an Ontology-based Search for Bioinformatics Tools and Methods. 1.3, 3.8, 4, 4.2.2

Meyer, B. (1992). Eiffel: The Language. Prentice Hall, Englewood Cliffs, NJ, USA. 2.8

Oren, Eyal & Delbru, Renaud (2006). ActiveRDF: Object-oriented RDF in Ruby. In Scripting for the Semantic Web (ESWC). 1.2.1, 2.7.3

Oren, E., Haller, A., Hauswirth, M., Heitmann, B., Decker, S. & Mesnage, C. (2007). A flexible integration framework for semantic web 2.0 applications. IEEE Software, 24(5), 64–71. 1.3

Ousterhout, John K. (1998). Scripting: Higher level programming for the 21st century. IEEE Computer, 31, 23–30. 1.2.1

Parsia, Bijan & Sirin, Evren (2004). Pellet: An OWL DL reasoner. In Third International Semantic Web Conference – Poster. 1.2.1, 2.6, 3.3

Pesole, Graziano (2008). What is a gene? An updated operational definition. Gene, 417(1-2), 1–4. 1.1

Patel-Schneider, Peter F., Hayes, Patrick & Horrocks, Ian (2004). OWL Web Ontology Language – Semantics and Abstract Syntax. http://www.w3.org/TR/owl-semantics/. 2.7.2, 3.3, 6.1

Roure, David De & Goble, Carole (2007). myExperiment – a Web 2.0 virtual research environment. In International Workshop on Virtual Research Environments and Collaborative Work Environments. 1.3

Sahoo, Satya S., Thomas, Christopher & Sheth, Amit (2006). Knowledge modeling and its application in life sciences: A tale of two ontologies. In Proceedings of WWW 2006. 1.1

Smith, Barry, Ashburner, Michael, Rosse, Cornelius, Bard, Jonathan, Bug, William, Ceusters, Werner, Goldberg, Louis J. & Eilbeck, Karen (2007). The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nature Biotechnology, 25, 1251–1255. 2.4.1

Soldatova, Larisa & King, Ross (2007). Ontology Engineering for Biological Applications. pp. 121–137. 2.4

Soman, Sunil & Krintz, Chandra (2007). Application-specific garbage collection. J. Syst. Softw., 80(7), 1037–1056. 1.2.1

Stevens, Robert, Aranguren, Mikel, Wolstencroft, Katy, Sattler, Ulrike, Drummond, Nick, Horridge, Matthew & Rector, Alan (2007). Using OWL to model biological knowledge. Int. J. Hum.-Comput. Stud., 65(7), 583–594. 2.4

Stevens, Robert, Goble, Carole A. & Bechhofer, Sean (2000). Ontology-based knowledge representation for bioinformatics. Brief Bioinform, 1(4), 398–414. 1.1

Thomas, Dave, Fowler, Chad & Hunt, Andy (2004). Programming Ruby: The Pragmatic Programmers' Guide, Second Edition. Pragmatic Bookshelf. 1.2.1

Thomas, Dave, Hansson, David, Breedt, Leon, Clark, Mike, Fuchs, Thomas & Schwarz, Andrea (2005). Agile Web Development with Rails (The Facets of Ruby Series). Pragmatic Bookshelf. 1.3

Tsarkov, Dmitry & Horrocks, Ian (2006). FaCT++ description logic reasoner: System description. In Proc. of the Int. Joint Conf. on Automated Reasoning (IJCAR 2006). Springer, pp. 292–297. 1.2.1, 2.6

Van Rossum, G. (2003). The Python Language Reference Manual. Network Theory Ltd. 1.2.1

van Zwol, Roelof (2007). Flickr: Who is Looking? In WI '07: Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence. IEEE Computer Society, Washington, DC, USA, pp. 184–190. 1.1

Vrandecic, Denny (2005). Deep integration of scripting language and semantic web technologies. In Proceedings of the ESWC Workshop on Scripting for the Semantic Web. 1.2.1

Wilm, Andreas, Mainz, Indra & Steger, Gerhard (2006). An enhanced RNA alignment benchmark for sequence alignment programs. Algorithms Mol Biol, 1, 19. 4.2.1

Yoder, Jeremy B. & Shneiderman, Ben (2008). Science 2.0: Not So New? Science, 320(5881), 1290–1291. 3.7.3

Appendix

6.1 XPERIMENTR Ontology: Classes, Properties and Instances

The XPERIMENTR classes:

LaboratoryEquipment ⊑ Thing
LaboratoryElectronicEquipment ⊑ LaboratoryEquipment
LaboratoryGlassware ⊑ LaboratoryEquipment
LaboratoryPlasticEquipment ⊑ LaboratoryEquipment
LaboratoryMaterial ⊑ Thing
Antibiotic ⊑ LaboratoryMaterial
Buffer ⊑ LaboratoryMaterial
Chemical ⊑ LaboratoryMaterial
Enzyme ⊑ LaboratoryMaterial
Medium ⊑ LaboratoryMaterial
Organism ⊑ LaboratoryMaterial
Person ⊑ Thing
Protocol ⊑ Thing
BufferProtocol ⊑ Protocol
InVitroProtocol ⊑ Protocol
InVivoProtocol ⊑ Protocol
MediumProtocol ⊑ Protocol
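Loaded with DEEP SEMANTICS, this class hierarchy becomes directly traversable from Ruby. The following sketch reuses only calls that also appear in the reference test implementations in Section 6.3 (set_director, listClassesByLevel, local_name); the ontology file path is a placeholder:

require File.join(File.dirname(__FILE__), 'active_semantics.rb')

# Load the XPERIMENTR ontology (the N-Triples path is a placeholder).
xperimentr = ActiveSemantics.instance.set_director({
  'adapter'         => 'FileAdapter',
  'ontology_source' => '/path/to/xperimentr.nt',
  'builder'         => 'DeepIntegrationBuilderOWLLite'})

# Walk the hierarchy shown above, grouped by subsumption depth: level 0 holds
# the direct subclasses of Thing, level 1 their subclasses, and so on.
xperimentr.listClassesByLevel.each_pair do |level, classes|
  puts "Level #{level}: " + classes.map { |klass| klass.local_name }.join(', ')
end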

The XPERIMENTR local restrictions:

Buffer ⊑ ∀hasProtocol.BufferProtocol
Medium ⊑ ∀hasProtocol.MediumProtocol
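Read operationally, the first restriction states that every value of hasProtocol on a Buffer individual must be a BufferProtocol. As a plain-Ruby illustration (a minimal sketch with a hypothetical helper; this is not part of the DEEP SEMANTICS API):

# Hypothetical helper: does every filler of `property` belong to `filler_class`?
def satisfies_all_values_from?(individual, property, filler_class)
  Array(individual.send(property)).all? { |value| value.is_a?(filler_class) }
end

# e.g. satisfies_all_values_from?(tae_buffer, :hasProtocol, BufferProtocol)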

The XPERIMENTR object properties:

hasRequiredLaboratoryEquipment ⊑ ObjectProperty
∃hasRequiredLaboratoryEquipment.Thing ⊑ δ(Protocol)
Thing ⊑ ∀δ(hasRequiredLaboratoryEquipment.LaboratoryEquipment)

hasProtocol ⊑ ObjectProperty
∃hasProtocol.Thing ⊑ δ(LaboratoryMaterial)
Thing ⊑ ∀δ(hasProtocol.Protocol)

hasMediumProtocol ⊑ hasProtocol
∃hasMediumProtocol.Thing ⊑ δ(Medium)
Thing ⊑ ∀δ(hasMediumProtocol.MediumProtocol)

hasBufferProtocol ⊑ hasProtocol
∃hasBufferProtocol.Thing ⊑ δ(Buffer)
Thing ⊑ ∀δ(hasBufferProtocol.BufferProtocol)

hasRequiredMaterial ⊑ ObjectProperty
∃hasRequiredMaterial.Thing ⊑ δ(Protocol)
Thing ⊑ ∀δ(hasRequiredMaterial.LaboratoryMaterial)

hasBeenUploadedBy ⊑ ObjectProperty
∃hasBeenUploadedBy.Thing ⊑ δ(Protocol)
Thing ⊑ ∀δ(hasBeenUploadedBy.Person)

isExpertOf ⊑ ObjectProperty
∃isExpertOf.Thing ⊑ δ(Person)
Thing ⊑ ∀δ(isExpertOf.Protocol)

hasUploaded ⊑ isExpertOf
∃hasUploaded.Thing ⊑ δ(Person)
Thing ⊑ ∀δ(hasUploaded.Protocol)

isProtocolOfBuffer ⊑ ObjectProperty
∃isProtocolOfBuffer.Thing ⊑ δ(BufferProtocol)
Thing ⊑ ∀δ(isProtocolOfBuffer.Buffer)

isProtocolOfMedium ⊑ ObjectProperty
∃isProtocolOfMedium.Thing ⊑ δ(MediumProtocol)
Thing ⊑ ∀δ(isProtocolOfMedium.Medium)

isRequiredForProtocol ⊑ ObjectProperty
∃isRequiredForProtocol.Thing ⊑ δ(LaboratoryEquipment)
Thing ⊑ ∀δ(isRequiredForProtocol.Protocol)

isUsedForProtocol ⊑ ObjectProperty
∃isUsedForProtocol.Thing ⊑ δ(LaboratoryMaterial)
Thing ⊑ ∀δ(isUsedForProtocol.Protocol)

relatedExpert ⊑ ObjectProperty
∃relatedExpert.Thing ⊑ δ(Protocol)
Thing ⊑ ∀δ(relatedExpert.Person)

hasRequiredLaboratoryEquipment ≡ isRequiredForProtocol⁻
hasRequiredMaterial ≡ isUsedForProtocol⁻
isExpertOf ≡ relatedExpert⁻
hasUploaded ≡ hasBeenUploadedBy⁻
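The four equivalences declare inverse properties: whenever a protocol hasBeenUploadedBy a person, that person hasUploaded the protocol, and analogously for the other pairs. The following self-contained Ruby illustration uses hypothetical stand-in structs (not the classes generated by DEEP SEMANTICS) to show the constraint the last axiom imposes on the data:

# Hypothetical stand-ins for the generated Person and Protocol classes.
Person   = Struct.new(:hasUploaded)
Protocol = Struct.new(:hasBeenUploadedBy)

nahal = Person.new([])
pcr_supermix_protocol = Protocol.new(nahal)
nahal.hasUploaded << pcr_supermix_protocol

# hasUploaded ≡ hasBeenUploadedBy⁻ demands that both traversal directions agree.
raise "inverse axiom violated" unless
  nahal.hasUploaded.include?(pcr_supermix_protocol) &&
  pcr_supermix_protocol.hasBeenUploadedBy == nahal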

The XPERIMENTR datatype properties:

hasProcedure ⊑ DatatypeProperty
∃hasProcedure.Thing ⊑ δ(Protocol)
Thing ⊑ ≤ 1 hasProcedure

hasRequiredExecutionTime ⊑ DatatypeProperty
∃hasRequiredExecutionTime.Thing ⊑ δ(Protocol)
Thing ⊑ ≤ 1 hasRequiredExecutionTime
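The cardinality axiom Thing ⊑ ≤ 1 hasProcedure makes the property functional: a protocol carries at most one procedure text. A minimal runnable sketch of this check (hypothetical helper, again not DEEP SEMANTICS API):

# Hypothetical check: a functional property admits at most one value per individual.
def functional_value_ok?(values)
  Array(values).size <= 1
end

puts functional_value_ok?(["Mix the reagents and incubate."])   # => true
puts functional_value_ok?(["Mix the reagents.", "Incubate."])   # => false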

The XPERIMENTR instances: Details about the instances, such as property values and annotations, have been omitted because the resulting list would have been too large.

Instances of LaboratoryEquipment:

BunsenBurner : LaboratoryEquipment
LoadingDye6x : LaboratoryEquipment
Micropipette : LaboratoryEquipment
SterileTip : LaboratoryEquipment
Incubator : LaboratoryElectronicEquipment
MicrowaveOven : LaboratoryElectronicEquipment
MotorizedPipette : LaboratoryElectronicEquipment
PCRMachine : LaboratoryElectronicEquipment
UVBox : LaboratoryElectronicEquipment
Vortexer : LaboratoryElectronicEquipment
GlassCultureTubes : LaboratoryGlassware
GlassPipetteTubes : LaboratoryGlassware
PlasticTube : LaboratoryPlasticEquipment
Sterile06mLPlasticTube : LaboratoryPlasticEquipment

Instances of LaboratoryMaterial:

Acrylamide : LaboratoryMaterial
BioPrimeRandomGenomicDNALabelingKit : LaboratoryMaterial
Parafilm : LaboratoryMaterial
PCRSupermix : LaboratoryMaterial
Primer : LaboratoryMaterial
QiagenTaqPolymeraseKit : LaboratoryMaterial
HighMWRunningBuffer5x : Buffer
LowMWRunningBuffer5x : Buffer
TAEBuffer : Buffer
Taq10xbuffer : Buffer
GelBuffer3P5X : Buffer
Acetate : Chemical
EDTA : Chemical
EthidiumBromide : Chemical
NaOH02M : Chemical
SDS : Chemical
SodiumBisulfite : Chemical
Tris : Chemical
ProteinaseK : Enzyme
ABMedium : Medium
GrowthMedium : Medium
YeastColony : Organism

Instances of Person:

Arndt : Person
Barry : Person
Deniz : Person
Dominic : Person
Ilija : Person
Indra : Person
Ingo : Person
Jochen : Person
Nahal : Person
Tim : Person

Instances of Protocol:

PCRSupermixProtocol : Protocol
HighMWRunningBuffer5xProtocol : BufferProtocol
LowMWRunningBuffer5xProtocol : BufferProtocol
TAEBufferProtocol : BufferProtocol
GelBuffer3P5XProtocol : BufferProtocol
AffymetrixDNALabellingForGeneExpressionArrays : InVitroProtocol
AgaroseGelElectrophoresis : InVitroProtocol
SDSPAGE : InVitroProtocol
BacterialCellCulture : InVivoProtocol
BlackburnYeastColonyPCR : InVivoProtocol
KnightColonyPCR : InVivoProtocol
MouseTissueLysisForGenotyping : InVivoProtocol
ABMediumProtocol : MediumProtocol

6.2 Abstract syntax of OWL LITE

Figure 6.1: EBNF of the abstract syntax of OWL LITE. This representation of the abstract syntax of OWL LITE (Patel-Schneider et al., 2004) is specified by means of a version of Extended BNF. Terminals are quoted; non-terminals are bold and not quoted. (Figure not reproduced.)

Figure 6.2: Extended UML class diagram: EBNF of OWL LITE constructs in relation to their DEEP SEMANTICS classes and modules counterparts. (Figure not reproduced.)

6.3 Reference Test Implementations

6.3.1 Test 1: list all classes of the ontology

Listing 6.1: Test 1 implementation using OWL API.

package dissertation.xperimentr;

import org.semanticweb.owl.apibinding.OWLManager;
import org.semanticweb.owl.model.OWLClass;
import org.semanticweb.owl.model.OWLDescription;
import org.semanticweb.owl.model.OWLException;
import org.semanticweb.owl.model.OWLIndividual;
import org.semanticweb.owl.model.OWLOntology;
import org.semanticweb.owl.model.OWLOntologyCreationException;
import org.semanticweb.owl.model.OWLOntologyManager;

import java.net.URI;
import java.util.Date;
import java.util.Set;

public class IKenTestNewClusteringOWLAPI {

    private OWLOntology ontology;

    public IKenTestNewClusteringOWLAPI() {
    }

    public static void main(String[] args) {
        try {
            Date date1 = new Date();
            System.out.println(date1.toString());

            // We first need to obtain an OWLOntologyManager, which manages a set of ontologies.
            OWLOntologyManager manager = OWLManager.createOWLOntologyManager();

            // We load an ontology from a physical URI - in this case the inferred model of the IKEN ontology.
            URI physicalURI = URI.create("file:/Users/dominic/ontologies/iken3/iken_infered3.owl");

            // Now ask the manager to load the ontology.
            OWLOntology ontology = manager.loadOntologyFromPhysicalURI(physicalURI);

            IKenTestNewClusteringOWLAPI iken = new IKenTestNewClusteringOWLAPI();

            // Print out all of the classes which are referenced in the ontology by hierarchy level.
            Date timeA = new Date();
            System.out.println("List classes by level for the IKen ontology start: " + timeA.toString());
            Set<OWLClass> klasses = ontology.getReferencedClasses();
            for (OWLClass klass : klasses) {
                try {
                    iken.printHierarchy(ontology, klass);
                } catch (OWLException e) {
                    System.out.println("The class hierarchy could not be printed: " + e.getMessage());
                }
            }
            Date timeB = new Date();
            System.out.println("List classes by level for the IKen ontology end: " + timeB.toString());
            System.out.println("Required time: " + (timeB.getTime() - timeA.getTime()));

            Date date2 = new Date();
            System.out.println(date2.toString());
            System.out.println((date2.getTime() - date1.getTime()));
        } catch (OWLOntologyCreationException e) {
            e.printStackTrace();
        }
    }

    public void printHierarchy(OWLOntology ontology, OWLClass clazz) throws OWLException {
        this.ontology = ontology;
        printHierarchy(clazz, 0);
    }

    public void printHierarchy(OWLClass clazz, int level) throws OWLException {
        System.out.println("Level: " + level);
        System.out.println("  " + clazz);

        /* Find this class's direct instances. */
        System.out.println("  This class's direct instances:");
        Set<OWLIndividual> instances = clazz.getIndividuals(this.ontology);
        for (OWLIndividual instance : instances) {
            boolean isDirectInstance = true;

            // An instance is direct if it is not also typed by a proper, named subclass.
            Set<OWLDescription> subclasses = clazz.getSubClasses(this.ontology);
            for (OWLDescription subclass : subclasses) {
                if (!subclass.isAnonymous() && !subclass.equals(clazz) && !subclass.isOWLNothing()) {
                    if (instance.getTypes(this.ontology).contains(subclass)) {
                        isDirectInstance = false;
                    }
                }
            }

            if (isDirectInstance) {
                System.out.println("    " + instance);
            }
        }
        System.out.println("------");

        /* Find the children and recurse. */
        Set<OWLDescription> children = clazz.getSubClasses(this.ontology);
        for (OWLDescription child : children) {
            if (!child.equals(clazz) && !child.isAnonymous() && !child.isOWLNothing()) {
                printHierarchy(child.asOWLClass(), level + 1);
            }
        }
    }
}

Listing 6.2: Test 1 implementation using JENA2.

package dissertation.xperimentr;

import java.util.Date;
import org.mindswap.pellet.jena.PelletReasonerFactory;
import com.hp.hpl.jena.ontology.Individual;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

public class IKenTestNewClustering {

    public IKenTestNewClustering() {
    }

    public static void main(String[] args) {
        Date date1 = new Date();
        System.out.println(date1.toString());

        String ontology = "file:/Users/dominic/ontologies/iken3/iken3_01112008.owl";

        // Create an empty ontology model using the Pellet spec.
        OntModel model = ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC);

        // Read the ontology file.
        model.read(ontology);
        IKenTestNewClustering iken = new IKenTestNewClustering();

        // Print out all of the classes which are referenced in the ontology by hierarchy level.
        Date timeA = new Date();
        System.out.println("List classes by level for the IKen ontology start: " + timeA.toString());
        for (ExtendedIterator i = model.listNamedClasses(); i.hasNext(); ) {
            OntClass klass = (OntClass) i.next();
            if (klass.isHierarchyRoot() && (klass.listSuperClasses(true).toSet().size() > 0)) {
                iken.printHierarchy(klass, 0);
            }
        }
        Date timeB = new Date();
        System.out.println("List classes by level for the IKen ontology end: " + timeB.toString());
        System.out.println("Required time: " + (timeB.getTime() - timeA.getTime()));

        Date date2 = new Date();
        System.out.println(date2.toString());
        System.out.println((date2.getTime() - date1.getTime()));
    }

    public void printHierarchy(OntClass klass, int level) {
        System.out.println("Level: " + level);
        System.out.println("  " + klass.getLocalName());

        /* Find this class's direct instances. */
        System.out.println("  This class's direct instances:");
        for (ExtendedIterator i = klass.listInstances(true); i.hasNext(); ) {
            Individual instance = (Individual) i.next();
            System.out.println("    " + instance.getLocalName());
        }
        System.out.println("------");

        /* Find the children and recurse. */
        for (ExtendedIterator i = klass.listSubClasses(true); i.hasNext(); ) {
            OntClass subClass = (OntClass) i.next();
            if (subClass.isURIResource() && (subClass.listSuperClasses(true).toSet().size() > 0)
                    && (subClass.listSubClasses(true).toSet().size() > 0)) {
                printHierarchy(subClass, level + 1);
            }
        }
    }
}

Listing 6.3: Test 1 implementation using DEEP SEMANTICS.

require File.join(File.dirname(__FILE__), 'active_semantics.rb')
require File.join(File.dirname(__FILE__), 'Helper/namespaces.rb')

time1 = Time.now
puts time1

Namespaces.add_namespace('http://www.ontoverse.org/ontologies/2008/3/iken2.owl#', 'iken:')

$active_semantics = ActiveSemantics.instance
iken = $active_semantics.set_director({
  'adapter'         => 'FileAdapter',
  'ontology_source' => '/Users/dominic/ontologies/iken3/inferred_iken3_01112008.nt',
  'builder'         => 'DeepIntegrationBuilderOWLLite'})

timeA = Time.now
puts "List classes by level for the IKen ontology start: #{timeA}"
iken.listClassesByLevel.each_pair do |level, classes|
  puts "Level: #{level}"
  classes.each do |klass|
    puts "  " + klass.local_name
    puts "  This class's direct instances:"
    klass.direct_instances.each do |instance|
      puts "    " + instance.local_name
    end
    puts "------"
  end
end
timeB = Time.now
puts "List classes by level for the IKen ontology end: #{timeB}"
puts "Required time: #{(timeB - timeA)}"

time2 = Time.now
puts time2
puts time2 - time1

6.3.2 Test 2: find all instances matching a certain search term

Listing 6.4: Test 2 implementation using OWL API.

package dissertation.xperimentr;

import org.semanticweb.owl.apibinding.OWLManager;
import org.semanticweb.owl.model.OWLAnnotation;
import org.semanticweb.owl.model.OWLClass;
import org.semanticweb.owl.model.OWLConstant;
import org.semanticweb.owl.model.OWLIndividual;
import org.semanticweb.owl.model.OWLOntology;
import org.semanticweb.owl.model.OWLOntologyCreationException;
import org.semanticweb.owl.model.OWLOntologyManager;
import org.semanticweb.owl.vocab.OWLRDFVocabulary;

import java.net.URI;
import java.util.Date;
import java.util.HashSet;
import java.util.Set;

public class IKenTestNewClustering2OWLAPI {

    public IKenTestNewClustering2OWLAPI() {
    }

    public static void main(String[] args) {
        try {
            Date date1 = new Date();
            System.out.println(date1.toString());

            // We first need to obtain an OWLOntologyManager, which manages a set of ontologies.
            OWLOntologyManager manager = OWLManager.createOWLOntologyManager();

            // We load an ontology from a physical URI - in this case the inferred model of the IKEN ontology.
            URI physicalURI = URI.create("file:/Users/dominic/ontologies/iken3/iken_infered3.owl");

            // Now ask the manager to load the ontology.
            OWLOntology ontology = manager.loadOntologyFromPhysicalURI(physicalURI);

            // Find all instances whose RDFS labels match one of the test search terms.
            Date timeA = new Date();
            System.out.println("Find all instances by their RDFS labels in the IKen ontology - start: " + timeA.toString());
            Set<String> testSearchLabels = new HashSet<String>();

            testSearchLabels.add("bedeckt");
            testSearchLabels.add("wolkenlos");
            testSearchLabels.add("Locken");
            testSearchLabels.add("Pony");
            testSearchLabels.add("Grinsen");
            testSearchLabels.add("Lachen");
            testSearchLabels.add("Juli");
            testSearchLabels.add("Oktober");
            testSearchLabels.add("Natur");
            testSearchLabels.add("Sport");

            for (String searchLabel : testSearchLabels) {
                Set<OWLClass> klasses = ontology.getReferencedClasses();
                for (OWLClass klass : klasses) {
                    Set<OWLIndividual> instanceHits = findInstancesByLabel(ontology, klass, searchLabel);
                    for (OWLIndividual instanceHit : instanceHits) {
                        System.out.println(instanceHit);
                    }
                }
            }
            Date timeB = new Date();
            System.out.println("Find all instances by their RDFS labels in the IKen ontology - end: " + timeB.toString());
            System.out.println("Required time: " + (timeB.getTime() - timeA.getTime()));

            Date date2 = new Date();
            System.out.println(date2.toString());
            System.out.println((date2.getTime() - date1.getTime()));
        } catch (OWLOntologyCreationException e) {
            e.printStackTrace();
        }
    }

    public static Set<OWLIndividual> findInstancesByLabel(OWLOntology ontology, OWLClass klass, String searchLabel) {
        Set<OWLIndividual> instanceHits = new HashSet<OWLIndividual>();
        Set<OWLIndividual> instances = klass.getIndividuals(ontology);

        for (OWLIndividual instance : instances) {
            for (OWLAnnotation annotation : instance.getAnnotations(ontology, OWLRDFVocabulary.RDFS_LABEL.getURI())) {
                if (annotation.isAnnotationByConstant()) {
                    OWLConstant value = annotation.getAnnotationValueAsConstant();
                    if (value.getLiteral().equals(searchLabel)) {
                        instanceHits.add(instance);
                    }
                }
            }
        }
        return instanceHits;
    }
}

Listing 6.5: Test 2 implementation using JENA2.

package dissertation.xperimentr;

import java.util.Date;
import java.util.HashSet;
import java.util.Set;

import org.mindswap.pellet.jena.PelletReasonerFactory;
import com.hp.hpl.jena.ontology.Individual;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;

public class IKenTestNewClustering2 {

    public IKenTestNewClustering2() {
    }

    public static void main(String[] args) {
        Date date1 = new Date();
        System.out.println(date1.toString());

        String ontology = "file:/Users/dominic/ontologies/iken3/iken3_01112008.owl";

        // Create an empty ontology model using the Pellet spec.
        OntModel model = ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC);

        // Read the ontology file.
        model.read(ontology);

        // Find all instances whose RDFS labels match one of the test search terms.
        Date timeA = new Date();
        System.out.println("Find all instances by their RDFS labels in the IKen ontology - start: " + timeA.toString());
        Set<String> testSearchLabels = new HashSet<String>();

        testSearchLabels.add("bedeckt");
        testSearchLabels.add("wolkenlos");
        testSearchLabels.add("Locken");
        testSearchLabels.add("Pony");
        testSearchLabels.add("Grinsen");
        testSearchLabels.add("Lachen");
        testSearchLabels.add("Juli");
        testSearchLabels.add("Oktober");
        testSearchLabels.add("Natur");
        testSearchLabels.add("Sport");

        for (String searchLabel : testSearchLabels) {
            for (ExtendedIterator c = model.listNamedClasses(); c.hasNext(); ) {
                OntClass klass = (OntClass) c.next();
                Set<Individual> instanceHits = findInstancesByLabel(klass, searchLabel);
                for (Individual instanceHit : instanceHits) {
                    System.out.println(instanceHit);
                }
            }
        }

        Date timeB = new Date();
        System.out.println("Find all instances by their RDFS labels in the IKen ontology - end: " + timeB.toString());
        System.out.println("Required time: " + (timeB.getTime() - timeA.getTime()));

        Date date2 = new Date();
        System.out.println(date2.toString());
        System.out.println((date2.getTime() - date1.getTime()));
    }

    public static Set<Individual> findInstancesByLabel(OntClass klass, String searchLabel) {
        Set<Individual> instanceHits = new HashSet<Individual>();

        for (ExtendedIterator i = klass.listInstances(true); i.hasNext(); ) {
            Individual instance = (Individual) i.next();
            for (ExtendedIterator j = instance.listLabels(null); j.hasNext(); ) {
                Literal label = (Literal) j.next();
                if (label.getString().equals(searchLabel)) {
                    instanceHits.add(instance);
                }
            }
        }
        return instanceHits;
    }
}

Listing 6.6: Test 2 implementation using ACTIVERDF.

require 'rubygems'

require 'active_rdf'
require "#{File.dirname(__FILE__)}/sparql"
require "#{File.dirname(__FILE__)}/rdflite"

time1 = Time.now
puts time1

adapter = ConnectionPool.add_data_source :type => :rdflite
adapter.load "iken_inferred3.nt"

# Register the iken namespace for the specified URI.
Namespace.register :iken, 'http://www.ontoverse.org/ontologies/2008/3/iken2.owl#'

# Construct the Ruby modules and classes for the loaded ontology.
ObjectManager.construct_classes

timeA = Time.now
puts "Find all instances by their RDFS labels in IKen ontology - start: #{timeA}"
test_search_labels = ["bedeckt", "wolkenlos", "Locken", "Pony", "Grinsen", "Lachen", "Juli", "Oktober", "Natur", "Sport"]

test_search_labels.each do |search_label|
  IKEN.constants.each do |const|
    IKEN.const_get(const).find_all.each do |instance|
      if instance.label.class != Array
        puts instance if instance.label.eql?(search_label)
      else
        instance.label.each do |label|
          if label.eql?(search_label)
            puts instance
            break
          end
        end
      end
    end
  end
end
timeB = Time.now
puts "Find all instances by their RDFS labels in IKen ontology - end: #{timeB}"
puts "Required time: #{(timeB - timeA)}"

time2 = Time.now
puts time2
puts time2 - time1

Listing 6.7: Test 2 implementation using DEEP SEMANTICS.

require File.join(File.dirname(__FILE__), 'active_semantics.rb')
require File.join(File.dirname(__FILE__), 'Helper/namespaces.rb')

time1 = Time.now
puts time1

Namespaces.add_namespace('http://www.ontoverse.org/ontologies/2008/3/iken2.owl#', 'iken:')

$active_semantics = ActiveSemantics.instance
iken = $active_semantics.set_director({
  'adapter'         => 'FileAdapter',
  'ontology_source' => 'iken_infered3.nt',
  'builder'         => 'DeepIntegrationBuilderOWLLite'})

timeA = Time.now
puts "Find all instances by their RDFS labels in IKen ontology - start: #{timeA}"
test_search_labels = ["bedeckt", "wolkenlos", "Locken", "Pony", "Grinsen", "Lachen", "Juli", "Oktober", "Natur", "Sport"]

test_search_labels.each do |search_label|
  instance_hits = iken.find_instances_by_label(search_label)

  instance_hits.each do |instance_hit|
    puts instance_hit.local_name
  end
end
timeB = Time.now
puts "Find all instances by their RDFS labels in IKen ontology - end: #{timeB}"
puts "Required time: #{(timeB - timeA)}"

time2 = Time.now
puts time2
puts time2 - time1