An Ontology Driven Knowledge Discovery Framework for Dynamic
Total Page:16
File Type:pdf, Size:1020Kb
An Ontology Driven Knowledge Discovery Framework for Dynamic Domains: Methodology, Tools and a Biomedical Case. Paulo Gottgtroy A thesis submitted to Auckland University of Technology in fulfilment of the requirements for the degree of Doctor of Philosophy (PhD) 2010 School of Computing and Mathematical Sciences Primary Supervisor: Prof. Nikola Kasabov Co-supervisor: Prof. Stephen MacDonell i Table of Contents INTRODUCTION ....................................................................................................................... 1 1.1. INTRODUCTION ............................................................................................................. 1 1.2. RESEARCH QUESTIONS .................................................................................................. 5 1.3. RESEARCH METHODOLOGY .......................................................................................... 8 1.3.1. Detailed Research Process ............................................................................... 12 1.4. SCOPE OF THE RESEARCH ............................................................................................ 16 1.4.1. Implementation Environment ........................................................................... 19 1.4.2. Requirements Identification .............................................................................. 23 1.5. SUMMARY .................................................................................................................. 28 CHAPTER 2 RESEARCH LITERATURE REVIEW ....................................................... 31 2.1. INTRODUCTION ........................................................................................................... 31 2.2. ELEMENTS OF THE RESEARCH ..................................................................................... 32 2.2.1. Ontology ........................................................................................................... 32 2.2.2. Ontology Engineering ...................................................................................... 34 2.2.3. Knowledge discovery in databases ................................................................... 37 2.2.4. Knowledge Discovery Life Cycle ...................................................................... 41 2.2.5. Ontology and Knowledge Discovery ................................................................ 45 2.2.6. Domain Knowledge and KDD .......................................................................... 49 2.2.7. Ontologies, Data Models and Data Warehouses ............................................. 52 2.2.8. Ontology and KDD integration ........................................................................ 54 2.2.9. Understanding Change ..................................................................................... 56 2.2.10. Ontology Engineering and Ontology Change .................................................. 59 2.2.11. Flexibility vs. formalism ................................................................................... 61 2.2.12. The Ontology Environment - Protégé Editor.................................................... 65 2.3. SUMMARY .................................................................................................................. 68 ii CHAPTER 3 A NOVEL META-KNOWLEDGE MODEL FOR EVOLVING ONTOLOGY 69 3.1. INTRODUCTION ........................................................................................................... 70 3.2. DEFINITION OF EVOLVING ONTOLOGY ....................................................................... 71 3.3. EVOLVING ONTOLOGY META-KNOWLEDGE ............................................................... 76 3.3.1. EO-Concept_Meta-data ................................................................................... 78 3.3.2. Evolving_Ontology_Meta-Class ....................................................................... 86 3.3.3. Model Integration ............................................................................................. 90 3.3.4. Knowledge Map ................................................................................................ 92 3.4. BIOMEDICAL ONTOLOGY CASE STUDY ON LEUKAEMIA CANCER DATA ...................... 97 3.4.1. The Leukaemia Gene Regulatory Network Map ............................................. 104 3.5. NUTRIGENOMICS EVOLVING ONTOLOGY CASE STUDY............................................... 107 3.6. SUMMARY ................................................................................................................ 109 CHAPTER 4 A METHODOLOGY FOR KNOWLEDGE DISCOVERY FROM EVOLVING ONTOLOGY ................................................................................................................ 112 4.1. INTRODUCTION ......................................................................................................... 113 4.2. ONTOLOGY DRIVEN KNOWLEDGE DISCOVERY - ODKD .......................................... 113 4.2.1. Ontology Preparation ..................................................................................... 121 4.2.2. Ontology Analysis ........................................................................................... 133 4.2.3. Instance Preparation ...................................................................................... 139 4.2.4. Modelling ....................................................................................................... 144 4.2.5. Evaluation ...................................................................................................... 146 4.3. SUMMARY ................................................................................................................ 150 CHAPTER 5 ONTOLOGY DRIVEN KNOWLEDGE DISCOVERY IMPLEMENTATION FRAMEWORK ........................................................................................... 155 5.1. INTRODUCTION ......................................................................................................... 156 5.1.1. Research methodology.................................................................................... 158 5.2. ODKD IMPLEMENTATION FRAMEWORK .................................................................. 162 iii 5.2.1. System Architecture ........................................................................................ 163 5.2.2. Ontology Preparation ..................................................................................... 167 5.2.3. Ontology Analysis ........................................................................................... 176 5.2.4. Instance Preparation ...................................................................................... 186 5.2.5. Modelling ....................................................................................................... 200 5.2.6. Evaluation ...................................................................................................... 201 5.3. SUMMARY ................................................................................................................ 203 CHAPTER 6 BIOMEDICAL APPLICATION ................................................................. 208 6.1. INTRODUCTION ......................................................................................................... 209 6.2. CASE STUDY LIFE CYCLE ........................................................................................... 211 6.3. THE BRAIN GENE ONTOLOGY - BGO ....................................................................... 213 6.3.1. Biomedical Informatics Sources ..................................................................... 215 6.3.2. Brain Gene Ontology Design Criteria ............................................................ 221 6.4. A GENERIC COMPUTATIONAL NEUROGENETIC (CNG) MODEL ................................... 229 6.4.1. Determination of the GN transition matrix W ................................................ 233 6.4.2. Model defined GN corresponding to the desired NN behaviour .................... 235 6.5. A CNG MODEL OF A SPIKING NEURAL NETWORK (SNN) MODEL DESCRIPTION ...... 236 6.5.1. Some preliminary experimental results .......................................................... 240 6.6. A BIOMEDICAL KNOWLEDGE DISCOVERY TOOL – BGO SYSTEM............................. 244 6.6.1. CNGM simulation results ............................................................................... 246 6.6.2. Importing Gene Regulatory Network data ..................................................... 251 6.7. SUMMARY ................................................................................................................ 254 CHAPTER 7 EVALUATION OF RESEARCH ............................................................... 258 7.1. INTRODUCTION ......................................................................................................... 258 7.2. DESIGN SCIENCE ....................................................................................................... 260 7.3. SYSTEM DEVELOPMENT ........................................................................................... 263 7.3.1. Knowledge Representation ............................................................................. 266 7.3.2. Conceptual Framework .................................................................................. 267 7.3.3. Ontology Driven Methodology ....................................................................... 268 iv 7.3.4. Ontology