
An object-oriented data and query model

Joseph Vella

This thesis is submitted for the degree of

Doctor of Philosophy,

in the Department of Computer Science,

University of Sheffield,

March, 2013.

An object-oriented data and query model

Joseph Vella

Abstract

OODBs were an effective technology, with their development peaking around 2000. A reason given for their subsequent neglect is the lack of a declarative and a procedural language offered together. Relevant declarative languages include ODMG OQL and Flora-2, a first-order, object-oriented logic programming system based on F-logic. Few procedural object algebras have been proposed, and ours is one of them; none, however, are offered together with a declarative counterpart.

The characteristics of the algebra proposed here and implemented with Flora-2 are: it is closed; it is typed; its ranges and outputs are homogeneous sets of objects; its operators work on either values or logical identifiers; a result is asserted; and a query expression’s meta details are asserted too. The algebra has ten operators and most have algebraic properties.

A framework was developed too; its object base is loaded with methods that maintain both it and our algebra. One of the framework’s functions is to read an EERM diagram and assert the respective classes and referential constraints. The framework then sifts for non-implementable constructs in the EERM diagram (e.g. n-ary relationships), converts them into implementable ones, and translates the object base design into ODMG ODL. The correctness and completeness of this translation are studied.

The framework implements run-time type checking as Flora-2 lacks this. We develop type checking for methods that are static, of single arity, polymorphic (e.g. overloaded and bounded), and for recursive structures (e.g. lists), through well-known and accepted techniques.

A procedure that converts a subset of OQL into an algebraic expression is given. Once created, the expression is manipulated to produce an optimised expression through a number of query re-writing methods: e.g. semantic re-writing, join reduction, and view materialisation. These techniques are aided by the underlying object base constructs; e.g. the presence of a primary key constraint is used to avoid duplicate elimination over a result set. We show the importance of tight coupling in each translation step from an EERM to an algebraic expression. Also we identify invariant constructs, e.g. a primary key through a select operand, which do not change from a query range to a query result set.

Dedication:

To our Kim Jamie and Kyle Matthew

Acknowledgements:

I am deeply indebted to my supervisors: Anthony JH Simons, Vitezlav Nezval, and Georg Struth. Their support and guidance are much appreciated. Also I have to say how good it feels to be part of the department at the University of Sheffield. Additionally I need to acknowledge and whole-heartedly thank the examiners for the viewpoints and suggestions given.

A great thank you goes to Michael Kifer for patiently guiding me through, if not illuminating, aspects of F-logic and Flora-2.

A thank you goes to the EyeDB community for their occasional help.

I have to say thanks to Dropbox too; it is a life saver.

My employer provided financial support and a two-year study leave; thanks very much.

A special hug and kisses go to Adriana; you are the one that I love!

Table of Contents

Title Page
Abstract
Acknowledgements
Table of Contents ix

1 Introduction 2
1.1 Thesis Statement 4
1.2 Plan of the Thesis 4
2 Conceptual Modelling 8
2.1 ERM and EERM 8
2.1.1 Entities 9
2.1.2 Relationships 11
2.1.3 Other ERM Features 13
2.2 Enhanced ERM (EERM) 15
2.2.1 Entities and Sub-Classes 15
2.2.2 Categories / Union Records 19
2.2.3 Aggregation 20
2.3 How and How not to Use ERM 25
2.3.1 ERMs and CASE tools 26
2.4 Problems with ERM/EERM diagram 27
2.4.1 Connection Traps 28
2.4.2 Many to Many Relationships 29
2.4.3 Are Three Binary Relationships Equivalent to One Ternary Relationship? 30
2.4.4 Formalisation of the EERM 30
2.5 An Example 31
2.6 Summary 33
3 Object-Oriented Paradigm – Object Basics 36
3.1 Encapsulation 38
3.1.1 External Interfaces 38
3.1.2 Data Independence and Encapsulation 39
3.1.3 Encapsulation and Inheritance 39
3.2 Objects and Values 39
3.2.1 Basic Sets 40
3.2.2 Tuple Structured Values 41
3.2.3 Non First Normal Form Structured Values 41
3.2.4 Complex Structured Values 42
3.2.5 What gains? 43
3.3 Object Identity 45
3.3.1 Logical Aspects of Identification 48
3.3.2 Complex Value Structure and Nested Relational with Identifiers 48
3.3.3 Identifiers and Referential Integrity 50
3.3.4 Identification’s Pragmatics 50
3.4 Message Passing 52
3.4.1 The Message Passing and Method Determination Mechanisms 53
3.4.2 Message Passing and Function Calls 54
3.4.3 Method code 55
3.4.4 Concurrency and Message Passing 56
3.4.5 Message Passing in Object-Oriented Databases 56
3.5 Summary 56


4 Object-Oriented Paradigm – Classification and Inheritance 60
4.1 Classes and Classification 60
4.1.1 Class and Generalisation Abstraction 62
4.1.2 Classes and Aggregation Abstraction 65
4.1.3 Prebuilt Class Hierarchy 66
4.1.4 Classes and their Composition 66
4.1.5 Class Implementations and Data Types 68
4.1.6 Classification Critique 68
4.2 Inheritance 69
4.2.1 Inheritance Mechanism 70
4.2.2 What to inherit exactly? 70
4.2.3 Incremental Development through Re-use 73
4.2.4 Single or Multiple Inheritance? 73
4.2.5 Inheritance in Conceptual Design and Implementation Design 74
4.2.6 Other Points with Inheritance 75
4.3 Data Types 75
4.3.1 Data Typing 76
4.3.2 Representation 78
4.3.3 Data 79
4.3.4 Data 80
4.3.5 Object-Oriented Data Typing 81
4.3.6 Advantages of Data Types 86
4.3.7 Inheritance, Sub Typing and Sub Classing are not the same thing 87
4.4 Summary 88
5 Object-Oriented Paradigm – OODB and the ODMG Standard 92
5.1 OODB Schema and Instance 92
5.2 Path Expressions 95
5.2.1 Path Expressions in more Detail 96
5.2.2 Variables in Path Expressions 97
5.2.3 Heterogeneous set of Objects 98
5.2.4 Physical Implementation of Path Expressions 98
5.2.5 Path Expression and other Possibilities 99
5.3 The Object Data Standard: ODMG 3.0 100
5.3.1 The Generic Chapters 101
5.3.2 The ODMG Object and Data Model 102
5.3.3 Critique of ODMG’s Object Model and ODL 114
5.3.4 EyeDB 114
5.3.5 ODL & OQL Interactive Session 115
5.4 Summary 117
6 Deductive & Object-Oriented Databases 120
6.1 Deductive Databases and 120
6.1.1 Logic and Databases 121
6.1.2 Logic Programming 121
6.1.3 Deductive Databases 122
6.1.4 Datalog 124
6.1.5 Queries and Safe Answers 128
6.1.6 Datalog Evaluation: Datalog to Algebra 129
6.1.7 Extending Datalog with Negation 132
6.1.8 Extending Datalog with Functions 134
6.1.9 Datalog and Relational Algebra 135
6.2 Algebras as a target for Declarative Languages 136
6.2.1 Relational Algebra, Nested Relational Algebras & Object Algebras 136
6.3 F-logic 138
6.3.1 F-logic: the Language 139
6.3.2 F-logic's Semantics 142
6.3.3 F-logic and Predicates 143


6.3.4 F-logic's Proof Theory 143
6.3.5 Logic Programming with F-logic 143
6.3.6 F-logic and Typing 144
6.3.7 F-logic and Inheritance 146
6.3.8 F-logic Implementation: Flora-2 149
6.3.9 Flora-2 Session 149
6.4 Summary 151
7 Integrity Constraints 154
7.1 Integrity Constraints Representation 155
7.1.1 Check Constraint 156
7.1.2 Primary Key Constraint 156
7.1.3 Not Null Constraint 156
7.1.4 Referential Constraint 157
7.1.5 Functional Dependency and Multi-Valued Dependency 157
7.1.6 Aggregate Constraint 158
7.1.7 Transitional Constraint 158
7.2 Integrity Constraints and Databases 159
7.2.1 Efficiency 159
7.2.2 Data Modelling Relationships 160
7.3 Converting Integrity Constraints into Queries of Denials 164
7.4 Other Issues 165
7.4.1 Where to “attach” ICs 165
7.4.2 Intra-IC Consistency 165
7.4.3 Redundancy of ICs 166
7.4.4 When to Apply IC Enforcement 167
7.4.5 ICs in Query Processing 167
7.4.6 Other Issues - IC in CASE Tools / Design 169
7.5 Summary 169
8 Framework for an Object Database Design 172
8.1 Basic Object Database Framework 172
8.1.1 Data Requirements of Framework 175
8.1.2 Logic Programming 176
8.2 Schema in F-logic 178
8.2.1 Asserting EERM Constructs into Flora-2 178
8.3 Summary 200
9 Translating an EERM into an ODL Schema 202
9.1 The Problem Definition 202
9.1.1 General Procedure 203
9.1.2 Entities and Weak Entities 204
9.1.3 ISA Relationship 205
9.1.4 Binary Relationships 207
9.1.5 Entity Attributes 209
9.2 Sample Output 213
9.3 Completeness and Correctness 214
9.3.1 The ISA Relationship 215
9.3.2 Classes, Structures & Attributes 216
9.3.3 Binary Relationships 217
9.4 EERM to ODMG ODL Mapping 219
9.4.1 The Problem Definition 219
9.4.2 %odl_schema Output 230
9.4.3 Completeness and Correctness 230
9.4.4 Reverse Engineer an EERM Diagram from an ODL Schema 236
9.5 Summary 237
10 Type Checking in Object Database Framework 240
10.1 Problem Definition 240
10.2 Views and Flora-2 Data-Type Signatures 241
10.2.1 Views for Classifying Methods for Data-Type Checking 242
10.3 Method Signature’s Cardinality 24


10.4 Scalar Method Data Type Checking 248
10.5 Arity Method Data Type Checking 251
10.6 Recursive Data Type Checking 253
10.7 F-Bounded Polymorphism 258
10.8 Data Type Checking & Inference in our Framework 259
10.9 Summary 260
11 Object-Oriented Query Model (I) 262
11.1 Basic Query Modelling 263
11.2 Object Algebra and an Object Framework (I) 264
11.2.1 Our Algebra 265
11.2.2 Object Algebra Constructs and their Realisation in the Framework 266
11.2.3 Union and Difference 267
11.2.4 Project 284
11.2.5 Product 289
11.2.6 Select 297
11.3 Summary 308
12 Object-Oriented Query Model (II) 310
12.1 Object Algebra and an Object Framework (II) 310
12.1.1 Map and Aggregate 310
12.1.2 Nest and Unnest 321
12.1.3 Rename Operator 328
12.2 Cross Algebraic Operators Properties 333
12.3 Summary 333
13 Query Processing and Optimisation 338
13.1 ODMG OQL and our Algebra Translation 338
13.1.1 Query Input and Result (4.3) 338
13.1.2 Dealing with Object Identity (4.4) 339
13.1.3 Path Expressions (4.5) 340
13.1.4 Undefined Values (4.6) 340
13.1.5 Method Invoking (4.7) 341
13.1.6 Polymorphism (4.8) 341
13.1.7 Operator Composition (4.9) 341
13.1.8 Language Definition (4.10) 342
13.1.9 Select Expression - Language Definition (4.10.9) 342
13.1.10 OQL and Algebra 344
13.2 Query Processing and Optimisation in our Framework 345
13.2.1 Query Processing and Optimisation in Deductive Databases 347
13.2.2 Physical Query Processing and Optimisation 347
13.2.3 OODB Query Processing (QP) and Optimisation (QO) 348
13.2.4 Framework support for QP and QO 349
13.2.5 QP and QO – Reduction of Products 350
13.2.6 QP and QO – View Materialisation 351
13.2.7 QP and QO – Simplification of the Select Predicate 353
13.2.8 QP and QO – Duplicate Elimination 354
13.2.9 Query Processing and Optimisation 356
13.3 Summary 356
14 Conclusions 360
14.1 Strengths and Weaknesses 361
14.2 How have the tools fared 362
14.3 Future Additions 362


Appendix – Bibliography

Appendix – GraphVIZ/Dot Specification (afterScott)

Appendix - EyeDB ODL Specifications (afterScott)

Appendix - EyeDB processing of afterScott schema Specifications

Appendix - ODMG Data Dictionary


Chapter 1

Introduction


This thesis is about data modelling and databases. Data modelling deals with the processes and the artefacts required to accurately and consistently describe the data requirements of an application [IBMAC93]. A database is a collection of interrelated or independent data items that are stored together to serve an application’s data requirements [IBMAC93]. The structure of a collection of data is the result of a data modelling process. Databases are managed by a program called a Database Management System (DBMS). The DBMS’s handling of operations over its data is expected to yield significant benefits in terms of data availability, correct sharing, data consistency maintenance, query optimisation, and economy of scale for the end users.

Since the Eighties the dominant data modelling paradigm has been the relational. During the Nineties a richer object-oriented data modelling came into prominence, but faded afterwards; nonetheless object-oriented programming remains mainstream to this day.

During the Noughties the big data and NoSQL movements, at least in terms of visibility, became prevalent, especially where only two of the three targets of consistency, availability and partition tolerance can be met (i.e. Brewer’s CAP theorem). In parallel with the relational model, the logical data model was developed, in which first-order semantics was coupled with logic programming; these systems are called deductive databases.

Within this context it is important to underline two issues. Firstly, within DBMS capabilities there is a trade-off between the expressiveness of the data modelling and the potential for query optimisation. Secondly, specifications tend to lose detail when they are mapped from conceptual to logical data schemas, something we are to address.

We are motivated in applying an object-oriented data modelling language that captures a wider portion of an application’s data requirements when compared to the relational model or the NoSQL models. It is well known that the wider this portion is the more requirements the DBMS can manage. We are also motivated in using a first-order semantics logic-based data model that can encode and infer over object-oriented data modelling. Consequently with one language we can describe the database structures, the queries over them, the integrity constraints over the database, the procedures over the database, and even the actual data.

Another part of our motivation is to ensure that the translations required, for example from the data requirements to the conceptual model, and from the conceptual to the logical model, are seamlessly integrated, rule based and, insofar as this is possible, preserve all information encoded in the specifications. An advantage of this is that encoded specifications remain invariant through translations and therefore are available to later phases. For example the primary key concept of an entity, first encoded in the conceptual design, is found and used to optimise and execute a query modelling instance.

To address this low-attrition requirement we adopt object-oriented friendly features in each translation-target language.

Inspired by these motivations, a number of research questions are addressed in this work. The following are a selection of these:

Firstly, which conceptual modelling representation has many of its constructs easily and consistently converted into a logical data model?

Second, which mix of the many object-oriented themes and variations can be selected to form the logical data modelling language used to build a database?

Third, what needs to be included in an object-oriented query model that has an object-oriented data model as its basis?

Fourth, are the data and query models developed here adequate to build complex logical schemas?

Fifth, can the queries over the object database be specified in declarative and procedural languages? And how do these align to known efforts, for example with ODMG OQL?

Sixth, how effective are the tools and standards used in this research, e.g. EERM, ODMG, EyeDB and F-logic, at achieving the stated goals of describing and translating data and query modelling faithfully?


1.1 Thesis Statement

In the course of this thesis we show:

• Our synthesised object-oriented data and query model is still relevant today and is adequate to define a complex database. Its relevance is justified when it is compared, for example, with NoSQL systems, which can forgo database consistency, or adopt eventual consistency, in favour of data availability and partitioning.

• The cohesion of the data and the query model through object-oriented themes not only makes each model stronger but also enables DBMS-led optimisations and more opportunities for automated software development in certain database-related functions.

• The framework developed here supports the reading and translation of conceptual designs, object-oriented data modelling, query modelling implemented with both declarative and procedural constructs, and the data type checking and inference required by the various models. The framework is an effective tool for the database designer as it integrates and aids many activities.

1.2 Plan of the Thesis

The second chapter surveys the conceptual EERM language. Any implementation details are avoided; for example n-ary relationships are shown as such and not decomposed.

The third and fourth chapters cover the object-oriented paradigm by going through its many themes and variations to synthesise a data-centric data model. In the fifth chapter a definition of an object-oriented database is given. It also includes an overview of ODMG ODL and an ODMG compliant OODBMS, called EyeDB. The latter is used for running examples. Our own critical overview of ODMG ODL is given at the end of the chapter.

Chapter six surveys the very useful developments in Datalog and a logic with object-oriented semantics, called F-logic. Flora-2, a logic-programming implementation of F-logic, is described here too. Chapter seven surveys integrity constraints, an important component of data and query models, and gives their semantics in first-order logic.


The bulk of our research effort is presented in chapters eight to twelve. Chapter eight has a detailed description of a framework built purposely to enable and integrate the many parts that comprise this research body. The chapter also gives encodings of database artefacts, for example classes and ISA relationships. Many of the database features require implementation and enforcement in our framework; these are programmed in Flora-2.

In chapter nine we indicate how an EERM conceptual diagram is read and encoded into our framework; also a working procedure is presented that converts the conceptual encodings into a set of ODMG ODL constructs, specifically EyeDB constructs, after necessary transformations of the EERM diagram are done.

In chapter ten one finds how data type checking and type inference are built into our framework. This module is necessary because Flora-2 does not enforce the data type signatures specified in programs written for it.

In chapters eleven and twelve we develop our procedural algebra; the first chapter has operations that have a relational origin while in the second chapter the operators are based on the features related to object values and functional methods. For each operator one finds examples, the data-type signatures of its input and output, its pre-conditions and post-conditions, and the actual logic-programming based implementation. The algebraic properties are also given.

In chapter thirteen a number of issues relating to query processing and optimisation are described. Special emphasis is placed on techniques that make best use of the artefacts in the underlying data and query model. Furthermore a conversion of declarative OQL queries into an algebraic expression using our operators is given. The procedure is also amended with basic, but effective, improvements.

Chapter fourteen presents a summary and conclusions of our work. Indications of current status and future work possibilities are given too.


Chapter 2

Conceptual Modelling


Stonebraker states “In contrast to data normalisation, the Entity Relationship Modelling became very popular as a database design tool” in [STONE98]. Entity Relationship Modelling is labelled as conceptual because it is devoid of any physical database artefacts (e.g. data indexing methods). It is also evident that a good number of structures and constraints are understood and caught very early in the conceptual design. Entity Relationship modelling is also adopted in many CASE tools as a basis for generating a relational database after data requirements are validated and verified.

2.1 ERM and EERM

The Entity Relationship Model (ERM) introduced by Chen [CHENP76] and formalised by Ng [NGPAU81] is a high-level conceptual data modelling language. Since its introduction, a number of additions to the original graphical language have occurred, with a case in point (and of interest here) being the Enhanced ERM (EERM). Ramifications from the ERM are numerous, with a significant example being the class diagram found in the Unified Modelling Language (UML) [BOOCH99].

Other extensions and revisions followed; some are very specific data models as in spatial and temporal extensions; yet others are closer to a database model rather than remaining abstract as in Chen’s presentation.

The ERM, in the context of application-development analysis and design, focuses on logical units and their constraints. Also in many structured methodologies for application development the ERM is used in supplementary and complementary fashion with functional descriptions and data flow paths. Once an application analysis and design is stable then the first step of development is channelled toward the realisation of the conceptual design by mapping onto DBMS and programming language specifics. The ERM language is data model independent and consequently it does not particularly favour the relational over the object-oriented model; therefore specific procedures to map the high-level conceptual design onto the constructs of any particular data model are needed.

The main constructs of the ERM are the entities and their relationships. The contextual use of an ERM assumes that each entity and relationship has many instances. It is important to note that each construct is adorned with properties and constraints. Later additions to the “basic” ERM introduced other data modelling constructs (e.g. specialisation and aggregation) and their associated constraints’ descriptions.

Most of the artefacts that make up an ERM are depicted graphically, for example rectangles, diamonds and ovals, and the edges between these nodes represent associations between them.

2.1.1 Entities

An entity in an ERM graph is a named construct representing a real or abstract concept in the application domain of discourse. Historically, this denotation has its origin in the area of Artificial Intelligence. For example an EMPLOYEE is an entity in a project description information system. Each entity name in a graph is unique. An entity is presented as a named rectangle in the model’s graph. Each entity represented in the model is expected to have many instances but these instances are not represented in the ERM.

An entity is also described by a number of properties called attributes. Each attribute has a name and, although this is not depicted in the graph (colloquially called a diagram), it is annotated with a value domain. For example, in the case of the EMPLOYEE entity, likely attributes would be NAME, ENO and SALARY, with value domains of string, integer, and real number respectively. Each attribute is depicted as an oval (or a rectangle with softened edges) attached to its entity. Property names have to be unique within an entity. Entity instances are then assigned values for these attributes from their associated domains. The model does not limit which set of basic domains is used.

(Figure: the EMPLOYEE entity drawn as a rectangle with attribute ovals Name, Eno and Salary.)

Attributes come in a number of flavours and with specified constraints. For example an attribute (or set of attributes) whose value(s) uniquely distinguish any instance from all others of the same entity is called the primary key (set). To represent this constraint the respective attribute names are underlined. Each entity can have one primary key set. In the case of EMPLOYEE the primary key is ENO.
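To make the notation concrete, the following is a minimal sketch of how such an entity might eventually surface in ODMG ODL, the schema language used from chapter five onwards; the extent name and the domain choices are our own assumptions and not part of the ERM itself. The key clause is ODL’s counterpart of the underlined primary key attribute.

    class Employee
      ( extent employees
        key    eno )
    {
      attribute long   eno;    // the primary key attribute, underlined in the ERM
      attribute string name;
      attribute double salary;
    };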


Some attributes directly derive their value from other attribute values present in the same instance. For example, given an instance’s DATE_OF_BIRTH value it is easy to compute its current AGE (i.e. without the need to store and keep up to date the respective attribute value). These attributes are the computed attributes and require the designer to draw the oval with a dotted line. Also it is sensible to include a succinct note on how this attribute value is derived.

(Figure: the derived attribute Age drawn with a dotted oval next to Date Of Birth. *) Age is current date less DateOfBirth.)
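ODMG ODL has no construct for computed attributes; a hedged workaround, sketched below with assumed names, is to expose the derived value as an operation rather than as a stored attribute.

    class Employee
      ( extent employees key eno )
    {
      attribute long eno;
      attribute date dateOfBirth;
      // Age is not stored; it is computed as the current date less dateOfBirth
      unsigned short age();
    };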

Another aspect of attributes is whether an attribute is considered to be single valued or multi-valued. In most cases one expects a single value from the domain, as in SALARY and DATE_OF_BIRTH. In some other cases an attribute takes many values from the associated domain. In this case we have a multi-valued attribute and these are represented with double ovals. An example would be a property that describes the SKILLS of an employee, in which we envisage that an employee instance can have zero or many skills.

(Figure: the multi-valued attribute Skills drawn as a double oval.)

Sometimes a sub-set of an entity’s properties tend to have a high affinity and it is advantageous to compose these together into what is called a composite attribute. For example assume we have the following properties, HOUSE, STREET and CITY. These three can form the composite attribute HOME_ADDRESS. The notation used is of a composite attribute attached to its entity and the parts attached to the composite attribute.

(Figure: the composite attribute Home Address attached to its entity, with its parts House, Street and City attached to it.)

Each attribute property could have a number of varieties. For example it is conceivable to have a computed attribute that is multi-valued and composite.
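By way of illustration, the multi-valued SKILLS and the composite HOME_ADDRESS attributes could be rendered with ODMG ODL’s set and struct constructors, as in the sketch below; the identifiers are our own.

    struct HomeAddress {
      string house;
      string street;
      string city;
    };

    class Employee
      ( extent employees key eno )
    {
      attribute long        eno;
      attribute set<string> skills;       // multi-valued: zero or many skills
      attribute HomeAddress homeAddress;  // composite attribute with three parts
    };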


2.1.2 Relationships

The other major part of an ERM is depicting relationships between entities. A relationship is a named association between a number of entities that is governed by rules and sometimes qualified with its own attributes. A relationship instance is a particular case of a relationship and is qualified by the entity instances that it relates. For example, given that the entities TEAM and EMPLOYEE are related by an EMPLOY relationship, and that #1056 and #118 are instances of the TEAM and EMPLOYEE entities, then #1056 EMPLOY #118 is an EMPLOY relationship instance.

(Figure: TEAM related 1–N to EMPLOYEE through EMPLOY, which carries Start Date, and 1–1 through MANAGE; TEAM carries Tno, Name and Description, while EMPLOYEE carries Eno, Name, Date Of Birth, the derived Age, Qualifications and the composite Home Address of House, Street and City.)

Fig. 2.1 – ERM with entities and relationships

As stated in the definition of a relationship its instances must adhere to its rules. There are three types of rules that relate to a relationship definition and these supplement the relationship depiction in an ERM (a diamond with an edge connecting each associated entity – see figure 2.1).

1) The first constraint deals with enumerating the entities of a relationship. Consequently any relationship instance composition must have each respective entity instance coming from the respective entities. The degree of a relationship is the number of entities that define the relationship. If there are two then the relationship is called binary and if there are three then it is called ternary. In the general case of n entities the relationship is called an n-ary. If an entity is repeated in a relationship definition then the relationship has an additional adjective – recursive. The most popular type is the binary relationship. EMPLOY is a binary relationship and its participating entities are TEAM and EMPLOYEE.

2) The second constraint specifies, given a relationship between entities, the count of an entity’s instances that can participate in a relationship instance. The most common cardinalities are the ‘one’ and the ‘many’ times. For a binary relationship there are four possible types; namely the one-to-one (1-1), one-to-many (1-N), many-to-one (N-1), and many-to-many (N-M). For example the EMPLOY relationship in figure 2.1 is a one-to-many as firstly an instance of team can be associated with many instances of employee and secondly an instance of employee is associated with at most one team. In the modelling diagram the cardinalities are marked by including the symbols “1” and “N” on the respective edges of the relationship. In the same figure the MANAGE relationship is a one-to-one.

3) Another important constraint relates to the level of participation of an entity’s instances in a relationship. The options are total and partial. Total participation implies that all instances must participate in at least one relationship instance. For example in the EMPLOY relationship in figure 2.1 it is expected that all EMPLOYEEs are employed with a team but it is not necessary for all teams to have an employee on their books. A partial participation implies that an entity instance may not be present in any relationship instance. Consequently, for any binary relationship, there are four possible cardinality participation constraints; namely total-total, total-partial, partial-total and partial-partial. For a total constraint the edge leading from a relationship to an entity is drawn as a double line. When writing relationships it is sometimes convenient to write 1(p)-N(t) for a 1-N relationship with a partial and total participation as in the EMPLOY relationship. In the case of the MANAGE relationship between TEAM and EMPLOYEE one writes 1(t)-1(p).
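As an indication of where these constraints end up, a binary 1-N relationship such as EMPLOY is typically carried in ODMG ODL by a pair of inverse traversal paths, as in the sketch below (names assumed). The cardinalities are captured by making the ‘many’ side a set and the ‘one’ side single-valued; the total/partial participation qualifiers have no direct ODL counterpart.

    class Team
      ( extent teams key tno )
    {
      attribute long tno;
      // the 'N' side of EMPLOY: a team employs many employees
      relationship set<Employee> employs
        inverse Employee::employedBy;
    };

    class Employee
      ( extent employees key eno )
    {
      attribute long eno;
      // the '1' side of EMPLOY: an employee is employed by at most one team
      relationship Team employedBy
        inverse Team::employs;
    };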

It was earlier indicated that a relationship might have attributes that describe it; consequently each relationship attribute has a domain and the other qualifiers mentioned before for entity attributes. This is easily denoted in an ERM diagram by connecting an attribute to the respective relationship graphic (i.e. a diamond). A typical example of a relationship attribute is the STARTING_DATE of the EMPLOY relationship (see figure 2.1). The context of attributes attached to a relationship rather than an entity is obvious but it is in fact an area that shows a weakness of the modus operandi of modelling through ERMs. For example in a binary 1-N, the relationship attribute could be moved to the many cardinality entity (i.e. EMPLOYEE). This is of course a “legal” arrangement but of questionable style even if the ERM building rules are not compromised. (This and other problems with ERMs are discussed in a later section in this chapter.)

2.1.3 Other ERM Features

What we have seen up to now is a basic description of ERMs. There are still some other points to explain. Two of these are weak entities and ERM notes.

2.1.3.1 Weak Entities and Weak Relationships

It is not always the case that an entity’s instances have an independent existence from all other entities. For example, if for the EMPLOYEE entity we would like to hold details of an employee’s WORK ASSIGNMENTs, then purging an EMPLOYEE instance presumes we are purging all WORK ASSIGNMENT instances related to the same EMPLOYEE too (see figure 2.2). Clearly there is an asymmetric dependence between WORK ASSIGNMENT instances and the respective EMPLOYEE instance. To address this modelling requirement the ERM introduces another structure called a weak entity, whose instances’ existence is tied to a “normal” entity’s instance existence. In this case WORK ASSIGNMENT is a weak entity dependent on the EMPLOYEE entity.

The relationship between a weak entity and its normal entity is also specialised – it is called a weak relationship. The weak relationship is called UNDERTAKE in this example. At this point the entity and relationship constraints (on the weak relationship and the participating entities) need consideration. Firstly it is obvious that the weak entity’s relationship to its associated normal entity must be total. Also the primary key constraint of a weak entity is not scoped over all the instances of the weak entity but rather over all the weak instances related to one normal entity instance. In some literature this case is known as a partial key.


For depicting a weak entity and weak relationships we draw the rectangle and diamond with a double line. If the weak entity has a partial key (rather than a normal primary key) then the underlining is drawn dotted rather than continuous. (See figure 2.2).

(Figure: EMPLOYEE related 1–N through the weak relationship UNDERTAKE to the weak entity WORK ASSIGNMENT; EMPLOYEE carries Eno and Name, while WORK ASSIGNMENT carries the partial key WAno together with Interval and Job Title.)

Fig. 2.2 – ERM with entities, weak entities and weak relationships

The cardinalities acceptable in a weak relationship are only one-to-many and one-to-one. The relationship must have total participation on the weak entity. If a many-to-many weak relationship is required then one is expected to consider other modelling constructs; for example an aggregation relationship (more in a later section). Likewise if a weak entity seems to require a relationship with any other entity then again it is best to consider aggregation.

The introduction of a weak entity and a weak relationship is not only a new type of structural representation. In fact it is the first example in the ERM where we have a transitional action associated with a state change (i.e. a change of the data values).
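For comparison, a hedged ODMG ODL rendition of the WORK ASSIGNMENT weak entity follows (identifiers assumed). Neither the existence dependency nor the partial key is directly expressible in ODL, so both are recorded as comments to be enforced by code.

    class WorkAssignment
      ( extent workAssignments )
    {
      // partial key: waNo is unique only among the assignments of one
      // Employee; an ODL key clause cannot express this scoping
      attribute long   waNo;
      attribute string jobTitle;
      attribute string interval;
      // total participation: every WorkAssignment needs an owner, and
      // purging an Employee implies purging its WorkAssignments too
      relationship Employee undertakenBy
        inverse Employee::undertakes;
    };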

2.1.3.2 ERM “Notes” (or Annotations)

We have already seen the utility of having notes in a diagram (e.g. to explain the computation of a derived attribute). There are also times when notes are required and in fact become a necessity. For example, if some business rule requires a structural configuration constraint that isn’t caught by our ERM modelling, then there is no choice but to write explicit notes for the designers to understand. Unfortunately this may entail translating these notes into programming constructs that are not known to a database schema.


2.2 Enhanced ERM (EERM)

Since its inception the ERM has had a number of additions. Some of these are motivated by the desire of analysts to capture and analyse better the modelling aspects of an application. This demand is more relevant in areas where data modelling is more contrived – for example a CLIENT entity could have a number of attributes that are applicable only to a subset of its instances. These scenarios, although implied to be more sophisticated than plain business applications, became more numerous as information systems support ever more varied and wider sets of requirements. Nonetheless the original ERM context of having each artefact represent many instances still prevails.

A string of additions over the basic ERM are those coming from semantic data modelling [SMITH77 & SMITH77B]; these are collectively called the Enhanced ERM (or sometimes the Extended ERM). The additions that are of interest here are specialisation and generalisation, aggregation and variant records. Unfortunately there isn’t much agreement on the meaning, usage and depiction of these, and consequently we will have to cite the source each construct comes from without implying that it is the only (or best) representation.

2.2.1 Entities and Sub-Classes

If in an application domain an entity seems to exhibit distinct clustering of its instances then it is best to partition the instances into subsets. The basic idea is attributed to Smith and Smith [SMITH77]. A subset has all the properties defined for the class that it is sub-setting and possibly defines more of its own; this is often called sub-classing. Each instance in a subset is also an instance in the original entity set of instances. For example the EMPLOYEE instances can be divided into technical employees, administrative employees, or neither (see figure 2.3). This is an example of sub-classing where TECHNICAL and ADMINISTRATIVE (ADMIN for short) are sub-classes of EMPLOYEE; and EMPLOYEE is loosely called a parent class. Both sub-classes inherit attributes and relationships from their parent entity and are allowed to independently add their own attributes and relationships. Consequently sub-classing allows us to specify relationships concisely; for example a TECHNICAL instance (not just any EMPLOYEE) can use a piece of HARDWARE or SOFTWARE. Also sub-setting allows the entities to be organised into an entity hierarchy.


There are a number of options to this sub-setting. Firstly, is sub-classing partial or total (see figure 2.4)? Secondly, is sub-classing allowed to overlap or not (i.e. disjoint) between sub-classes of an entity (see figure 2.5)? If, for example, we have an EMPLOYEE who can be a TECHNICAL or an ADMIN instance, and there might be EMPLOYEEs who are neither, then our sub-class relation has a partial participation and also has disjoint sub-setting. Thirdly, an instance can be included in a sub-class either by way of its attribute value(s), called a predicate defined instance, or by making it an arbitrary instance of a subset (called an explicit semantic relationship). An example of predicate defined sub-classing follows: let us assume we have two sub-classes, PERMANENT and CONSULTANT, of EMPLOYEE, which has an attribute called STATUS that holds an indicator of the pay structure. In this case each PERMANENT and CONSULTANT instance should have the STATUS attribute value “full-time” and “part-time” respectively.

(Figure: EMPLOYEE connected, through a circle marked “D” and double-lined edge, to its sub-classes TECHNICAL and ADMIN.)

Fig 2.3 – EERM with Employee and its subclasses related through a disjoint and total relationship.
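In ODMG ODL terms the sub-classing of figure 2.3 maps onto the extends construct, as sketched below with assumed attribute names; the disjointness and totality qualifiers have no ODL counterpart and must be policed separately.

    class Technical extends Employee
      ( extent technicals )
    {
      // inherits Eno, Name, etc. from Employee and adds its own attributes
      attribute set<string> platforms;
    };

    class Admin extends Employee
      ( extent admins )
    {
      attribute string grade;
    };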

To draw the sub-class relation between a class and its sub-class we use a circle and a pair of edges to connect the relevant entities. Within the circle we print a “D” for disjointness and an “O” for overlapping – see figures 2.3 and 2.5. Also, on the edge closer to the sub-class we draw the sub-set symbol (i.e. ⊆) with the pointed edges in the direction of the sub-class. If the relation is total then the edge from the circle to the class is drawn in double line. If a sub-class relation is predicate defined then that predicate is written on the main edge (the edge from the class to the circle) and the value that satisfies it on the edge from the circle to the sub-class. As for drawing styles there are two main references: Elmasri [ELMAS10] and Teorey [TEORE05].

(Figure: EMPLOYEE, with primary key Eno and attribute Job Status, connected through a circle marked “D” to PERMANENT and CONSULTANT; the edges carry the Job Status values ‘permanent’ and ‘consultant’.)

Fig 2.4 – EERM with Employee and its subclasses related through a disjoint, total and predicate defined (i.e. Job Status) relationship.

Some references known for presenting sub-classing in EERM (e.g. [HULLR87]) allow a number of sub-setting “relationships” radiating from the same entity. In figure 2.5 we have done this; EMPLOYEE has one theme of subsets based on TECHNICAL and ADMIN and another theme based on the subsets PERMANENT and CONSULTANT. These two themes are independent from each other. This multiplicity of sub-setting does complicate the diagram and the entity hierarchy.

(Figure: EMPLOYEE with two independent sub-setting relationships, one circle marked “O” over PERMANENT and CONSULTANT and one marked “D” over TECHNICAL and ADMIN.)

Fig 2.5 – EERM with multiple sub-setting relationships

There is of course the ever-present problem associated with inheritance – multiple inheritance. This is possible in EERM and the resolution of conflicts is usually based on name overriding in an arbitrary manner. Most references on EERM state that the analyst is free to restrict his diagram to single inheritance, albeit making the design cluttered and repetitive.

As for entities and relationships, the sub-classing construct comes with a series of constraints that all instances involved must adhere to. Some of these constraints are static, while some others are transitional; in references [ELMAS10] and [TEORE05] one finds the following. The static constraints include:

1) There should be no cycles in the entity hierarchy through the sub-classing relation.

2) If sub-classing is restricted to single inheritance then each entity can only be a sub-class of at most one entity.

3) If a sub-classing relation is total then all instances of the parent entity must be an instance of at least one sub-class.

4) If the sub-classing relation is predicate defined then the discriminating value for each sub-class must be non-ambiguous. Also each instance of a sub-class must have its defining attribute value satisfy any explicitly stated value in the diagram.

5) If the sub-classing relation is arbitrary (not defined with a predicate) and disjointness is specified then each instance of a sub-class is an instance of at most one sub-class.

The transitional constraints on the sub-classing relation include:

1) On introducing a new instance to an entity that has a total sub-class relation, by implication the same instance has to be made an instance of one of the sub-classes. Additionally, on introducing a new instance to an entity that has a predicate defined and total sub-class constraint, by implication that instance is also made an instance of the matching sub-classes.

2) On purging an entity instance, all the associations in its sub-classes are also deleted.

3) On changing the values of an instance’s attributes that determine a predicate defined sub-class relation, that instance becomes an instance of the sub-class that matches its new value. Migrating instances from one sub-class to another requires ad hoc procedures because other attribute values might not satisfy the destination class properties.


2.2.2 Categories / Union Records

A useful addition to the ERM is the category construct (as introduced by Elmasri et al [ELMAS85] and also found in NIAM models as an exclusion constraint [VERHE82]). Let us use our EMPLOYEE entity with TECHNICAL, ADMIN and MANAGER sub-classes. If we want to represent a relationship between an employee and a piece of work then we can relate EMPLOYEE and an entity called WORK. If it transpires that only instances from TECH and ADMIN should relate to WORK instances then we have to reconsider the diagram. We could leave the relationship between EMPLOYEE and WORK and write a note that prohibits the participation of MANAGER instances. Or we can create a new entity, if multiple inheritance is allowed, that sub-classes TECH and ADMIN, say called TECH_ADMIN_EMP, and relate this to WORK. Unfortunately this is not semantically correct as the instances of TECH_ADMIN_EMP include all attributes of both TECH and ADMIN. Also, if the two entities are not really relate-able through a sub-class relationship, then creating an entity that inherits from both is generally not good style.

A better solution is to use the category artefact whereby a “new” entity denoted in the diagram represents the union of the instances of a number of entities and this union is a pool of instances that can participate in a relationship. The meaning of this construct is slightly different from the previous relationships as entities participating in the union would generally have different primary key set constraints; one solution is to include details of both entities and instances in the relationship instance details. The notation to represent the category concept uses the circle with a letter “U” inside it – see figure 2.6.

Also the category construct can be qualified as total or partial. We are not too comfortable about introducing a “new” rectangle (as in [ELMAS85]) and really prefer the NIAM exclusive notation (i.e. connect the diamond to the circle, which is connected to each entity participating in the union, and connect the emanating relationships with an arc). There are two reasons for this: firstly, it introduces an entity which is really a “view” of other entities; and secondly, some flexibility is lost as there can be only one cardinality ratio for all entities participating in the union.


(Figure: EMPLOYEE sub-classed (circle “D”) into MANAGER, TECH and ADMIN; TECH and ADMIN are joined through a union circle marked “U” which participates, N–1, in the ON WORK relationship with WORK.)

Fig. 2.6 – An EERM with a category (or EOR) artefact – i.e. arc.

2.2.3 Aggregation

Aggregation builds an instance out of a number of other instances through a sophisticated relationship between the “whole” and its “parts” instances. This relationship is usually, but not necessarily, transitive. Aggregation, sometimes colloquially called the part-of relationship, is at a higher level than entity composition. Although this rich construct has been known for a long time (through Smith and Smith’s work in [SMITH77B]) few diagrams actually use it. Later and significant literature includes Steele’s work on Lisp [STEEL84], the MCC Orion initiative [KIMWO89G], UML [BOOCH00], and Barbier’s [BARBI03]. A telling rendition of aggregation through a ‘bill of material’ diagram for a bicycle is shown in figure 2.7. While the front wheel brake unit is a part of the whole bicycle, the same unit is in itself a whole, at least in terms of spare part sales, as it is made up of other sub-parts.

Fig. 2.7 – Exploded drawing of a bicycle (Source: Raleigh Grifter DL60 Exploded Drawing from 1977 Raleigh Dealer Manual).


We need to have an aggregated relationship which captures: 1) sharing and independence of “parts” by the “whole”; 2) total or partial participation of “part” instances; 3) attributes to each manifestation of the “part” in a transitive relationship; 4) attributes of the whole transitive closure. In an aggregated relationship sharing and independence are orthogonal and consequently we have four possibilities – see figure 2.8.
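At schema level, the structural part of such an aggregation is often approximated by a recursive relationship, as in the hedged ODMG ODL sketch below (our own naming); sharing, independence and the transitive-closure rules of points 1) to 4) would remain diagram notes or enforced code.

    class Part
      ( extent parts key pno )
    {
      attribute long pno;
      // 'whole' to 'parts': a part aggregates other parts ...
      relationship set<Part> hasParts
        inverse Part::partOf;
      // ... and, being sharable here, may itself be part of many wholes
      relationship set<Part> partOf
        inverse Part::hasParts;
    };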

(Figure: the four aggregation patterns, crossing exclusive or sharable parts with independent or dependent parts; exclusive patterns relate the whole E1 to the part E11 through a 1–N relationship R1, while sharable patterns use an N–M relationship.)

Fig. 2.8 – Aggregation in EERM (no transitivity is shown)

As for notation there are two main models, one found in Elmasri [ELMAS10] and the other in Teorey [TEORE05]; the latter is the one adopted here and it uses a special circle with an “A”. In figure 2.8 we give the four combinations of an aggregate relationship through the latter notation, using sharing and independence to qualify the relationship. It is important to note that sharing is connected by a many-to-many relationship. What do the four patterns yield in terms of the aggregation relationship? The following table gives actual examples; see table 2.1.

Independence   Sharing     Example
Independent    Sharable    Cyclocomputer applet that tracks effort in terms of distance and climb.
Independent    Exclusive   A bicycle tyre.
Dependent      Sharable    Frame spray paint.
Dependent      Exclusive   Lock nuts (following advice that lock nuts are to be used only once!)

Table 2.1 – Four aggregation patterns through independence and sharing.
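The sharing column of table 2.1 surfaces in a schema as the cardinality of the part-of traversal path: an exclusive part refers back to a single whole, a sharable part to many. A hedged ODMG ODL sketch, with invented classes, follows.

    class Bicycle
      ( extent bicycles )
    {
      // exclusive parts: each Tyre is fitted to exactly one Bicycle (1-N)
      relationship set<Tyre> tyres
        inverse Tyre::fittedTo;
      // sharable parts: a PaintScheme may be used by many Bicycles (N-M)
      relationship set<PaintScheme> paints
        inverse PaintScheme::usedBy;
    };

    class Tyre
      ( extent allTyres )
    {
      relationship Bicycle fittedTo
        inverse Bicycle::tyres;
    };

    class PaintScheme
      ( extent paintSchemes )
    {
      relationship set<Bicycle> usedBy
        inverse Bicycle::paints;
    };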

An example diagram that can meet some of the data modelling requirements of an aggregation in the manufacturing of books is in figure 2.9. In this case there is no part-of transitive relationship. Another example relationship that describes a typical bill of material composition is given in figure 2.10; it has transitive part-of relationships.

(Figure: BOOK, with computed attributes Total Cost and Total Pages, aggregated (circle “A”) from COVER through En Caps, CHAP. through Has, and CDROM through Inclds; the parts carry attributes such as Cost, H.bound, Pages and TPgs.)

Fig. 2.9 – A book construction EERM with a simple aggregation relationship.


(Figure: a bill of material schema in which PART, with PN# and Cost, and ITEM, with SN# and Basic Cost, are related through ‘is a’, an N–M Assoc. relationship carrying Qty, and an Assem. relationship carrying Join Cost, both through aggregation circles marked “A” and covering physical and logical composition.)

Fig. 2.10 – A bill of material EERM with transitive relationships. Note that the entity ITEM represents tangible parts, while the PART entity is a catalogue description of items.

For an aggregation relationship to become distinctive from a normal relationship one has to consider two aspects. The first has to do with instances of a relationship; specifically each instance of a normal relationship is independent from any other instance but in the case of an aggregation relationship many instances are related through transitive closure. The second aspect is the asymmetric role of an aggregation relationship that contrasts with the symmetrical nature of a normal relationship. Consequently before introducing an aggregated relationship the analyst must carefully consider these two aspects. In practice these distinctions might not offer conclusive arguments in favour of inclusion of an aggregation relationship in a diagram.

There are a number of implicit constraints associated with an aggregation relationship representation in the EERM diagrams. Some examples follow (refer to figure 2.11).

a) If a “whole” instance is deleted and that instance is in a transitive aggregation relationship that has an encoded non-shareable and dependent part-of relation with other instances, then all these “parts” are purged too. Consequently a portion of the graph, the sub-graph rooted at the deleted entity instance, is deleted. In figure 2.11 deleting instance ‘c’ reduces graph A) to C).

b) Assume we have an aggregation relationship between the “whole” and its parts in a transitive aggregation relationship that has an encoded independent part-of relation; then deleting an instance would generate a number of transitive closures. That is, the original graph can break into “smaller graphs”. In the example deleting instance ‘c’ reduces graph A) to D).

(Figure: panels A) to D); A) shows instances ‘a’ to ‘h’ related by part-of edges, B) their transitive closure, and C) and D) the graphs that result from deleting instance ‘c’ under different part-of constraints.)

Fig. 2.11 – A number of instances (named ‘a’ to ‘h’) are related with a part-of relationship – shown as edges in A). The instances’ transitive closure is represented in B). Graphs C) and D) are the result of deleting instance ‘c’ with different part-of constraints.

Some “convoluted” constraint requirements are usually identified during design; for example: the costs of “parts” should not exceed 30; the depth of a transitive closure should be between 3 and 5; the total distance of a travel plan should not exceed 200 units. Most of these have to be included as notes to the EERM; notice that the entities’ computed attributes in figures 2.9 and 2.10 are assigned values computed by tallying the sub-graph.

Some notes of caution have to be mentioned here. Firstly, it is possible to create legal diagrams that do not make intuitive sense. Secondly, some combinations of the above features can be represented as a simpler ERM (e.g. an N(p)-1(t) relationship between a weak entity and its related entity can be represented much more simply as a basic aggregation relationship). As for the former we assume a pre-processing of an EERM can detect these “aberrant” relationships, while for the latter we also assume that the simpler solution is drawn.

Nonetheless the biggest problem is that these advanced constructs require more than malleable structural rules; in fact good transitional data description constructs are required.

2.3 How and How not to Use ERM

ERM is a high-level language and we have seen how its diagrams capture a reasonable level of structures, relationships and constraints. From a software engineering perspective there is yet another important aspect of using ERMs for database analysis and design: how to go about creating the diagrams – i.e. a methodology. If the use of ERM is part of a larger methodology (e.g. SSADM) then it is best to follow the guidelines of that methodology and its representation. If, on the other hand, we are interested in building the database and depicting its role in the overall functionality, then a methodology as presented in Batini et al’s textbook [BATIN92] is a well-known technique.

In general an ERM is used in top-down design methodologies where these are given an informal description of the data requirements and the diagram is built by refining and building on successive versions of the diagram.

The general advantages of using ERM in database design are the following.

1) ERMs are high-level languages that are relatively easy to read by a wide spectrum of users (analysts, designers and selected end-users).

2) ERMs are independent from any particular data model and especially any database physical constructs (e.g. indices, or data placement policies).

3) ERMs describe a high proportion of the data structures, relationships and static constraints that an information system requires. These models are in fact used in describing the conceptual and external schemas of the ANSI/SPARC three-level data architecture. Also some papers use ERMs as a basis for building co-operative information systems (e.g. a loosely coupled multi-database framework); a seminal paper is [HEIMB85].

4) ERMs are also used as a test bed during the analysis and early design phases. The diagrams are useful to verify and validate an information system’s high-level requirements.

5) ERM’s constructs are not fudged or dictated by the target database model; for example, with a relational data model an M-N relationship is broken into two 1-N relationships and a resolving entity.

The main disadvantages of using an ERM approach in database design are the following four.

1) A properly drawn ERM does not guarantee it is devoid of data redundancy. In fact it is highly appropriate to re-design critical areas of the diagram through bottom-up data analysis (especially if the target data model is the relational one, where at least Boyce-Codd Normal Form should be established). Also some patterns in a diagram require the analysts to re-check them (more specific examples, such as connection traps, are given in the following sub-section).

2) For information systems that have a “large” design, the size of the diagram and its interconnections make it hard to read and maintain (e.g. re-draw). Some researchers (a case in point being Elmasri [ELMAS10]) do attempt to “package” the diagram through a partitioning into simpler diagrams. Also some highly data-driven and modular systems (with an endless list of start-up and operational parameters – e.g. the SAP accounting package { WWW.SAP.COM }) make the diagram a “complex” graph. Consequently EERMs do not scale with the size of the design.

3) There is an issue of drawing standardisation. This is made worse by having graphical design tools that are too tightly coupled to a specific DBMS.

4) A query model based on ERM is not available per se. Nonetheless, subsets of ERM diagrams are used to describe the structure of an external schema – a view. An interesting development in UML from OMG is the Object Constraint Language (OCL) [OMGRP06] and Query/View/Transformation (QVT) [OMGRP11].

2.3.1 ERMs and CASE tools

A number of tools are available to the designer who wants to adopt ERM modelling. These provide drawing, and conversion to a database model – mostly relational and a few object-relational. Popular examples include CA Erwin { WWW.ERWIN.COM }, Toad Data Modeller { WWW.QUEST.COM/TOAD-DATA-MODELER }, and DB Designer (which is open source and recently renamed) { WWW.FABFORCE.NET/DBDESIGNER4 } – see figure 2.12. The better ones allow for customisation but mostly ‘improve’ on Chen’s original diagram artefacts. A useful feature is the ability of a tool to read a database data dictionary and reverse engineer an ERM.

Fig. 2.12 – Screen dump from one of the conceptual design CASE tools.

2.4 Problems with ERM/EERM diagram

A perennial problem with ERM is whether a fact in the information system is presented as an entity or as an attribute in another entity. A case in point is an address abstraction.

The design heuristic taught for resolving this issue is checking in the requirements whether an address ever needs to be decomposed into further parts. If it does, then convert the attribute into a related entity. Another aspect to consider when making this decision is the data quality level requirement; for example, does the address postcode need to spatially coincide with that of the Post Office? For example, if our information system does not match on addresses but feeds into another system which does (e.g. a data warehouse [INMON02]) then the address is better represented as an entity.

Another nagging problem is the admission of "null" values. In databases a null occurs mainly in two situations. The first comes about when an instance does not have an applicable value for an attribute (e.g. a company does not have a birthday). The second comes about from the lack of knowledge about an entity's attribute (e.g. a person has a date of birth which is not known to the system). A null's occurrence fudges the meaning of instances, as the presence of a null in an attribute's value is ambiguous. It is best to

“design out” the possibility of null assignment in parts of the ERM where computations and comparisons are intense.

There is another level of problems with ER diagrams. These are patterns of entities and relationships that, although structurally correct, do not completely convey the meaning of the domain of discourse – such patterns fall under the generic term of connection traps. Also some constraints are either too specific or too general, and consequently the constraints of the application are not exactly right.

2.4.1 Connection Traps

Assume we have two entities that are connected through a sequence of relationships (i.e. more than one). If any instance of these two entities is found not to be in an adjoining relationship instance when it is meant to be present, then we have a connection trap.

Let us develop an example. We assume that a team hires many employees, an employee works on a succession of projects, a project has many employees, and it is understood that a team must control a project. A diagram that represents this has three entities, namely

TEAM, EMPLOYEE and PROJECT, and two relationships namely a 1(p)-N(t) between TEAM and

EMPLOYEE, and an N(p)-M(p) between EMPLOYEE and PROJECT – see figure 2.13. A possible connection trap that we need to investigate is whether all instances of PROJECTs are related to a TEAM as expected. Clearly we have a problem since some instances of PROJECT might not have any EMPLOYEE assigned to them (there is a partial-partial relationship between EMPLOYEE and PROJECT) and consequently cannot relate to any TEAM instance.

This is typically called a chasm trap and comes about when a sequence of relationships has

a partial segment. One solution requires us to introduce an explicit relationship between

TEAM and PROJECT and a note that limits this N(p)-M(p) to relationship instances implied by the new TEAM and PROJECT relationship (i.e. it is a subset).

Fig. 2.13 – ERM with a chasm trap (left) and fan trap (right)

Another type of connection trap is the fan trap; this occurs when two one-to-many relationships emanate from one entity and an implicit relationship between the latter two is expected. For example let us assume we have INSTITUTEs that launch a number of

TEAMs, and INSTITUTEs hire a number of EMPLOYEEs – see Fig 2.13. It is of course impossible to determine which EMPLOYEEs work on which TEAM. To rectify this problem we need to re-order the two 1-N relationships, and specifically state that an INSTITUTE launches many TEAMs, and each TEAM hires many EMPLOYEEs.

2.4.2 Many-to-Many Relationships

It is inevitable that one opts for a many-to-many relationship when the constraints and shared instances seem to overwhelm a relationship design. Unfortunately this may create a problem as in reality not all relationship instances, although possible in the design, are permissible in an application domain. A simple example is the N-M between students and courses, as in reality there is a generic plan of courses for each degree and only from a "limited" selection does a student choose some units from the whole range (i.e. a university-wide selection). Another representation of this is the clustering of the relationship instances given a number of students following the same degree programme. To solve this too-generic N-M one should add notes after exhausting other design possibilities. For example, add another entity between courses and students, such as modules, that represents the selection of courses a student is allowed to choose from.


2.4.3 Are Three Binary Relationships Equivalent to One Ternary Relationship?

What is a ternary relationship? It is a relationship between three entities and it is used to capture an association between these three entities' instances. The participation level in a ternary relationship is 'one' when an instance is related to one combination of the other two entities' instances; otherwise it is a 'many' participation. Consequently if there is an association between instances from three entities then no three pairs of binary associations can give the same meaning. See figure 2.14 for the difference between the meaning of a ternary and binary relationships.

Fig. 2.14 – A ternary relationship is not, in general, reducible to three binary ones.

Any ERM design verification should include a thorough investigation of the necessity of any ternary (or n-ary) relationship. It is preferable, implementation-wise, to have three binary relationships rather than a ternary one, but we have just seen that this is not a question of preference but of data requirements.

The necessity of a ternary (or n-ary) relationship is a problem because some CASE tools and data models only express binary relationships. In these cases the n-ary relationship is coerced into an aggregated entity with each of its parts denoting one of the n-ary relationships. Furthermore depending on the cardinality of the ternary a set of functional dependencies (a form of integrity constraints) needs to be defined and enforced.

2.4.4 Formalisation of the EERM

In some references, for example in Ng [NGPAU81] and Vossen [VOSSE91], the structure of entities, relationships and a number of constraints (even those found in EERM) are represented through a formal model using sets, Cartesian products, and integrity constraints (i.e. first-order predicate logic denial queries).

With formalisation of ERM and including functional dependencies (a form of integrity constraints) one, for example, can easily show that some ternary relationships cannot be represented as three binary relationships. It is sufficient to show that if the combined

primary key sets of the participating entities determine a relationship attribute then there are no three binary relationships that can do so.
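The following is a minimal formal sketch of that argument (our notation, not from the cited references: E1, E2, E3 are hypothetical entities with primary key sets K1, K2, K3, and A is an attribute of the ternary relationship R):

\[
R \subseteq E_1 \times E_2 \times E_3 , \qquad K_1\,K_2\,K_3 \rightarrow A .
\]

Each candidate binary replacement R12 ⊆ E1 × E2, R13 ⊆ E1 × E3 and R23 ⊆ E2 × E3 hosts at most two of the three key sets, so none of them (nor their join) can express, let alone enforce, the dependency K1 K2 K3 → A; hence a ternary relationship carrying such a dependency is irreducible.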

Over and above its importance in expressing the exact meaning of a diagram, formalisation also serves another purpose of great relevance here. These formal structures become our basis for the transformation into another database data model, carrying the conceptual structures and constraints onto the database model.

2.5 An Example

A hypothetical database to represent most of the data model and query model examples discussed in this study is presented in this section. This running example describes an organisation entrusted with the categorisation and dissemination of Research and

Development efforts. The organisation’s scope is in the fields of Information Systems and

Telematics being undertaken by a number of research organisations. Both governmental agencies and private organisations are sponsors of these research efforts. This example is from Bertino’s exposition found in [BERTI91] and what follows is an informal listing of the system requirements, as simple textual statements.

A number of projects are currently active and for each project, its unique name, aim and projected total cost are of interest. Also each project assignment undertakes research that has a hardware or software profile, but some projects have aspects from both areas. For hardware projects the details of interest include the devices and the functionality required by the assignment. As for software projects the main interest is in their targeted computer environments and the associated tool-set required during their development.

A number of teams participate in a project’s realisation and each team implements a part of the overall project for a predetermined fee. The team’s properties of interest include its name (for identification purposes) and its employees. Each team needs to know the total budget derived from the addition of all their participating projects’ budget allotments.

A person is an employee of only one team and an employee’s pay terms are either on a permanent or on a consultative basis (e.g. per hour basis). A team’s employees have a hierarchic placing and (therefore) some employees manage others. Each employee has a distinctive name over the whole scope of the database.


Teams belong to a research institution (in most cases they are part of a university) and receive funds from "outside" sources. Government and industry are the typical outside sources of funding. Each of these sponsors can finance more than one project and more than one team. In each funding instance it is pertinent to determine the exact commitment of a sponsor to each project, categorised by the team's share.

Details of interest about government and companies are their names, main activities and their contact addresses. Some of the address locations, for a number of Government Departments, Institutes and Companies, are the same.

The above requirements are drawn into the EERM given in figure 2.15.

Fig. 2.15 – An example EERM (notes are explained in the following).

Notes to EERM:

(1) The budget value for Project, Sponsor and Team instances is computed by summing all instances' budget contributions given in their respective allocation relationship.

(2) Use the Address instances as part of an aggregation. Relate as independent and shareable.

(3) These are really 'class attributes'; consequently all instances would have the same value.

(4) Should this relationship be changed to an overlap type then the multi-inheritance and entity 'S/W & H/W' are redundant.


2.6 Summary

The assertion that Chen's ERM is a popular top-down design language is widely supported; the main justifications being its high-level nature and independence of any database data model. A significant branching from it is found in part of UML. This chapter surveys conceptual modelling through Chen's and other later works in the area.

Diagrams are composed of entities, relationships, and constraints. Later additions to ERM, including sub-classing, categories and aggregation, are called EERM. The introduction of these constructs not only requires malleable structures but also effective mechanisms to describe how the underlying data changes; for example changes to a part-of hierarchy require detailed data-changing procedures.

The automated conversion of an EERM into an object-oriented data model is a priority in this study. The conversion needs to have a high degree of correctness and completeness. The effects of incorrectness are obvious; on the other hand the effects of incompleteness would imply that the 'missing' artefacts need to be hard-wired and consequently disconnected from DBMS control.

The following chapters deal with a definition of an object-oriented database and data model, and present an object-oriented database standard. Chapter nine, where our original work is presented, shows how an EERM model is translated into a standard object-oriented database language. The translation includes entities, attributes, various types of relationships, and constraints (e.g. referential, and primary key set).


Chapter 3

Object-Oriented Paradigm: Object Basics

3 – Object-Oriented Paradigm – Object Basics

The object-oriented paradigm pioneers are Simula [DAHLO66] and Smalltalk [GOLDB88], and an early seminal paper is that of Stefik and Bobrow [STEFI86]. An underlying theme, which attracts attention, is the meshing of structure and behaviour together in an object, and the availability of a development environment that further aids design and development through a number of features and mechanisms. Currently the most popular programming language is arguably the object-oriented Java [GOSLI05].

Object-oriented databases (or object databases) are repositories whose schema is defined through object-oriented modelling, which has its basis in the object-oriented paradigm. Here the data modelling language is no longer abstract; it is the target language of the conceptual design surveyed in the previous chapter. In this translation the data model must implement as much as possible of the design found in the diagram.

A standard proposed by the Object Database Management Group (ODMG) [CATTE00] has its own data and query model, and descriptions for latching an object database to object-oriented programming language environments; e.g. coupling to Java. The effort has good support and has passed through a sequence of revisions.

Nonetheless, no "common" object-oriented data model exists in the sense of the relational data model. There seem to be two main reasons for this deficiency. Firstly the basic themes of the object-oriented paradigm are data abstraction through objects, identification, object value, classification, inheritance, and type polymorphism. Although the feel of these themes is quickly grasped, their combination hides a subtle variety of options.

Furthermore the options and implementation issues available within the paradigm are staggering. Wegner in [WEGNE89A] identified some 128 design options based on seven paradigm features and states that "some are more interesting than others". Secondly the relational model, together with the relational calculus and algebra, has its formal basis in first order predicate calculus. This logic is understood and accessible. On the other hand, object-oriented themes come from a widespread spectrum of research areas (as varied as computer science, type theory and psychology) – see figure 3.1. Consequently each theme comes with its own theoretical background, requirements, and idiosyncrasies.


This chapter and the subsequent two survey the object-oriented themes and object-oriented data modelling with a strong bias toward the database perspective. The fundamental theme is the object; other themes include identification, classes, inheritance, and data typing. The chapter leads to the definition of an object-oriented database. This definition is then compared with, and used to explain, the object-oriented data model adopted by OMG and ODMG.

Figure 3.1: Spheres of influence on OODB development – programming languages contribute behaviour, inheritance and type systems; artificial intelligence contributes semantic nets; databases contribute data independence and data and query modelling.

In most of the object-oriented data models the object is the major construct. The objects model an information system’s instances. At a shallow level, an object is a capsule of properties that includes the instance’s unique identity, state value and behaviour. An object has a life history: it is created, updated and possibly purged. This locality of object properties is conceptual. Using the consumer and supplier analogy, processing with objects entails consumer objects specifying what a supplier object should do for them but implicitly it is the supplier’s prerogative to attempt to match an appropriate behaviour to the consumer’s requests.

The aim of this chapter is to expose high-level ideas of encapsulation and then to start building a definition of an object. The basis of an object’s value part description is structural and consequently straightforward. Object interaction through message passing invoking methods is also surveyed in this chapter. A graph structure of values is used to depict an object’s state. In the later parts, i.e. object identifier section, this graph structure is redefined to include value sharing through identifiers.


3.1 Encapsulation

In object-oriented programming languages encapsulation is applicable at an object level and it provides feature locality and supports information hiding [MITCH03]. Feature locality implies the collection of an object's identity, state values and behavioural semantics into a single artefact. The encapsulation feature provides for information hiding by having an object declare a number of operations as part of an external interface that is available to its clients, and another set of operations as internal operations that are only for its own use and hidden from any clients. Clearly the external interface partially specifies the behaviour of an object while the internal operations implement that behaviour.

One of the important mechanisms for data abstraction in an object-oriented paradigm is encapsulation. Encapsulation, in turn, develops structures where modularity can yield better software development by defining the external interfaces through which objects interact. Data abstraction has a strong grounding in computing. For example Simula

[DAHLO66] has encapsulation and Parnas’ exposition [PARNA72] on data abstraction shows the advantages of modularity in software development.

Other than locality and information hiding, encapsulation favours:

1) An increase in modularity – programming artefacts loosely described as "autonomous, coherent and organised in a robust architecture" [MEYER96] – yields a dramatic reduction of interdependencies amongst entities.

2) An increase in comprehension capability of developers brought about by examining the external interface of objects.

3) A relatively narrow range of objects is affected by code maintenance brought about in supporting an information system's evolution.

4) A capability for an object’s internal restructuring if its external interface remains intact.

3.1.1 External Interfaces

The external interface is typically described with a set of operations. Each operation has a name and a list of arguments with their corresponding types. Also an indication of the returned object type is given. Note that this technique avoids giving a formal specification of each interface operation; in fact this is only a syntactic rendition of the interface.
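The following is a minimal sketch of these ideas in Python (the BankAccount class and its operation names are hypothetical illustrations of ours, with the leading underscore marking a hidden internal operation):

class BankAccount:
    """External interface: deposit, withdraw, balance; the rest is hidden."""

    def __init__(self, opening_balance=0):
        self._balance = opening_balance       # internal state, not interface

    def deposit(self, amount):                # external interface operation
        self._check(amount)
        self._balance += amount

    def withdraw(self, amount):               # external interface operation
        self._check(amount)
        self._balance -= amount

    def balance(self):                        # external interface operation
        return self._balance

    def _check(self, amount):                 # internal operation, hidden
        if amount <= 0:
            raise ValueError("amount must be positive")

Clients programmed against the three interface operations are unaffected if, say, _check or the balance representation is internally restructured.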


3.1.2 Data Independence and Encapsulation

The virtues of data independence in database modelling are well known. So are, at this point, the virtues of encapsulation for application development. Although data independence and encapsulation are not the same thing they are reconcilable with a compromise. In object-oriented data modelling an object's structure is an important part of its definition. This contrasts with object-oriented programming (especially languages with genericity) where an object's structure is an implementation detail and the public interface provides the main view of the object. A rational compromise is to place an entity's attributes and behaviour as part of an object's external interface and ensure that no data modelling construct assumes anything about an object's internal implementation (e.g. object placement or a property's vicinity to other attributes).

A valid reason for wanting to know the structural make-up of an object is to enable querying objects, especially those objects made up of other sub-objects. The resolution of this dilemma follows the above solution of placing this structural profile into the external interface. Nonetheless these external interfaces specifications must remain free of any physical specifications.

3.1.3 Encapsulation and Inheritance

Encapsulation in systems where objects do not share or export their specification is a straightforward mechanism. When an object imports part of its specification from another object (this is a form of specification inheritance) encapsulation is no longer as robust. In this case the specification supplier has constraints on its external interface through the importing objects' dependencies. This conflicting situation, specifically how to balance internal re-organisation needs with other objects' interpretations, has long troubled the paradigm (Snyder's OOPSLA paper [SNYDE86] is one of the first reports on encapsulation and inheritance in evolving application development). Chapter four on "Classes and Inheritance" considers the consequences and possible solutions to this problem in further detail.

3.2 Objects and Values

In this section we dissect various models that describe object components; e.g. identity and value composition. The build-up of an object's value starts from the simple models and proceeds to the complex model with logical identity. The following sub-sections introduce some basic notation and sets.

3.2.1 Basic Sets

A set is a collection of instances (some of which could be sets) and is denoted by an enumeration of the elements enclosed with curly brackets; namely "{" and "}". Some sets have a name. The null set is a special case of a set (because it is a set with no elements) and it is universal (it is a subset of every set). Null set symbols include "{ }" or ∅.

This subsection is mostly based on Beeri's robust terminology [BERRI90A]. It is assumed that there exists a restricted number of basic domains, for example integers, characters and Booleans. Domains need to have their content well known – i.e. universally known. An atomic value is an element of a basic domain; for example '9' denotes an instance of the integers' domain. Values are fixed, visible, and ever present. Each basic domain, Di, has its own name, DNi. D is the disjoint union of all Di: D = ⋃i Di.

In the database culture sets are expected to be homogeneous. For example the set of salaries has all its elements pertaining to the domain of integers.

Attributes are named characteristics of structures (or objects). There is a countably infinite number of attribute names in set A, and each instance in this set, denoted by ANi, has an association to a domain.

A tuple is a structure that encloses its attributes (we use the square brackets "[" and "]" to enclose them). A tuple's specification includes a set of attribute name and value pairs. If ai ∈ A and n > 0 then [a1:v1, …, an:vn] is an n-tuple instance. The "[]" denotes the null tuple.

Tuple t, for example, with attribute aj has its value denoted by t[aj]; t.aj is a misnomer adopted within the database culture.

For a more flowing description the existence of some other sets is given before being explained. The first is the set of class names, C, that denotes a finite number of class names, CNi. The second is a possibly infinite set of identifiers, O. There is a special identifier, called the null identifier, which points nowhere. Elements of O have a number of notational variances; typical examples are idi and oi.


The enumeration of the set of structured values without object identifiers, val, depends on D and a set of structural composition rules (e.g. tuple constructor). Each set of rules describes a particular state structure. In the next sub-sections an overview of three such rule sets is given.

3.2.2 Tuple Structured Values

Structured values in the relational data model are the tuples, representing instances, which relations (i.e. tables) hold. The tuples are a subset of the Cartesian product of the relation's schema attribute domains. The tuple and set are type constructors and create the structured value schemas. See table 3.1 for the data and type definition of relational values. The relational model's first normal form severely limits the structural profile of values in two ways. First, the only valid use of the type constructors is a set following a tuple application. Second, the attributes may only range over basic domains.

Table 3.1: Tuple Structure Values

Rule Generating Part

The rules that generate the set of relational structure values, valr, on D are:

1) All instances of D are values of valr. Therefore all atomic values are part of valr;

2) If v1, …, vn are n values and a1, …, an are a distinct sequence of attribute names from A then [a1:v1, …, an:vn] is a tuple type structure value, valr.

Typing Part

The world of types is: t ::= ⊥ | D | [ a1:D, …, ak:D ]

Type t's interpretation, ⟦ t ⟧, follows:

1) ⟦ ⊥ ⟧ = ∅ ;

2) ⟦ D ⟧ = D ; and

3) ⟦ [a1:D, …, ak:D] ⟧ = { [a1:v1, …, ak:vk] | vi ∈ ⟦ D ⟧, i = 1 … k }.
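As an illustrative sketch of valr (ours, with invented employee data), a 1NF relation in Python is a set applied once over flat tuples, every attribute ranging over a basic domain:

# stands for the relation SALARY with tuple instances [name:v1, sal:v2]
salary = {
    ("JADE", 35000),
    ("JAKE", 28000),
}
# The 1NF restriction forbids, e.g., nesting a set of skills inside a
# tuple; that relaxation is exactly what NF2 introduces next.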

3.2.3 Non First Normal Form Structured Values

The slackening of the first normal form in relational modelling was Makinouchi's proposition in [MAKIN77]. This work led to a more liberal use of the type constructors. The non-first normal form (NF2) relations, the name given to these new structural profiles, allow the repeated use of the set and tuple constructors (as a whole). NF2 relations, therefore, can have a relation, rather than only a basic domain, as an attribute's domain. The relation's nesting can be of any depth but it is fixed at design time. See table 3.2 for the data and type definition of nested relational values. Also, in many NF2 models, another normal form is in force and many call it the partitioned normal form (PNF) [ROTHM88]. A nested relation is in PNF if all relation schema attributes are functionally dependent on all the basic domain attributes.

Table 3.2: Nested Relational (NF2) Structure Values

Rule Generating Part

The rules that generate the set of complex structure values, valnf, on D are:

1) All values of D are values of valnf. Therefore all atomic values are part of valnf;

2) If v1, …, vn are values and a1, …, an are a sequence of distinct attribute names from A then [a1:v1, …, an:vn] is a nested tuple type structure value, valnf.

Typing Part

The world of types is: t ::= ⊥ | D | [ a1:t, …, ak:t ]

Type t's interpretation, ⟦ t ⟧, follows:

1) ⟦ ⊥ ⟧ = ∅ ;

2) ⟦ D ⟧ = D ; and

3) ⟦ [a1:t1, …, ak:tk] ⟧ = { [a1:v1, …, ak:vk] | vi ∈ ⟦ ti ⟧, i = 1 … k }.

3.2.4 Complex Structured Values

The data model of Abiteboul and Beeri in [ABITE93A] addressed the ordering of the set and tuple constructors by advocating their unrestricted use in any order. Their only restriction on the type constructors' sequence is that the last constructor applied must be a set. The authors named these structures complex values and in table 3.3 one finds their data and type definition. For an example see figure 3.2. The resultant schemas are acyclic – no part of the schema is defined by itself. Finally, it is easily shown that complex values are generalisations of NF2 and relational values. Also one can map a complex type into an NF2 type by converting a string of type constructors to a sequence of set and tuple pairs – see figure 3.3.


3.2.5 What gains?

What are the advantages attained through the adoption of nested values and complex values? Firstly we have untied ourselves from the restriction of the relational model's first normal form; therefore some data access need not require explicit joining of a number of relations. Secondly, although not exemplified here, there are a number of reported algebraic and calculus-based languages that manipulate these complex structures.

Table 3.3: Complex Structure Values

Rule Generating Part

The rules that generate the set of complex structure values, valcs, on D are:

1) All values of D are values of valcs. Therefore all atomic values are part of valcs;

2) If v1, …, vn are n values and a1, …, an are a distinct sequence of attribute names from A then [a1:v1, …, an:vn] is a tuple type complex structure value, valcs;

3) If v1, …, vn are n distinct values then {v1, …, vn} is a set complex structure value, valcs. (Note the implicit prohibition of a heterogeneous set; this is common in databases.)

Typing Part

The world of types is: t ::= ⊥ | D | [ a1:t, …, ak:t ] | { t }

Type t's interpretation, ⟦ t ⟧, follows:

1) ⟦ ⊥ ⟧ = ∅ ;

2) ⟦ D ⟧ = D ;

3) ⟦ [a1:t1, …, ak:tk] ⟧ = { [a1:v1, …, ak:vk] | vi ∈ ⟦ ti ⟧, i = 1 … k } ; and

4) ⟦ { t } ⟧ = { { v1, …, vm } | vi ∈ ⟦ t ⟧, i = 1 … m }.

Unfortunately these complex structures are not friendly to instance sharing; this is addressed by interleaving object identifiers with complex structures in objects.
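To make the constructors concrete, the following is a hedged Python sketch of a valcs value shaped after the TEAM structure of figure 3.2 (the data is abridged and invented; dicts stand for the tuple constructor, and Python tuples stand in for the set constructor since Python sets cannot contain dicts):

team = {                                   # tuple constructor [a:v, ...]
    "ref": "10",
    "desc": "R&D",
    "manager": {"ref": "101", "name": "JADE"},
    "members": (                           # set constructor { ... }
        {"ref": "101", "name": "JADE", "skills": ("DOC", "PLAN", "DESIGN")},
        {"ref": "103", "name": "JAKE", "skills": ("DESIGN", "ANALY")},
    ),
}
# Note the manager JADE duplicated inside members: without identifiers a
# complex value cannot share an instance, only copy it.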


Figure 3.2: Complex Values Relation a) Structure, b) Instance and c) Type Tree (note: attributes in angle brackets, e.g. <members>, take a set of values; in the type tree • denotes the set type constructor, × the tuple type constructor, and the leaves are basic domains such as char).


Figure 3.3: Nested Relational Structure (NF2) Type Equivalent to the Complex Value Structure Type of Figure 3.2 (c). The darkened part of the figure represents the changes introduced to match the structures.

3.3 Object Identity

Entities and instances that are of interest to an information system must be identifiable

and reachable, and consequently identification is part of many conceptual data models. In

the database culture there is a strong notion of keys for identification of instances.

In the object-oriented paradigm a key is called an identifier and each object has one. An

identifier is a handle through which other objects refer to it; the seminal papers on this

theme are [KHOSH86] and [ABITE89B]. Typically this identifier is 1) unique, it distinguishes

its holder from every other object within a collection, 2) immutable throughout the object’s

life span, 3) an internal property of an object and therefore is not usually seen or useable by

end users, and 4) its value has nothing to do with its object properties. Furthermore in object databases Atkinson and Morrison [ATKIN95] assert that other than identification issues one also needs to correctly map application domain instances to corresponding repository objects.

When an object is created, called instantiation, an identifier is created for it. In any object collection there is a huge quantity of un-instantiated objects; for example integers being values of a basic domain. At this point the subtle issue of what is a value and what is an object is reinforced with the help of object identifiers (an argument already introduced in section 3.2). The general consensus documented by Beeri [BERRI90A] is to make the following distinctions: elements of the basic domains are "universal" abstractions and ever present, while elements of an information system's entities are "local" abstractions, created on an ad hoc basis and having a lifetime. A technical refinement to this distinction is that the former carry their own information and therefore identify themselves, while the latter carry an object's information content that includes relationships to values and to other objects represented by identifiers.

The paradigm offers object access methods based on navigation but must not exclude access based on values. This contrasts sharply with the access methods found in most programming languages and in relational databases. The following two examples explain common pitfalls in either access method:

1) In programming languages, a pointer (that is, identification by addressing) facilitates access to a variable's state value, and the address is dependent on the executing environment rather than on the object it identifies. A problem may arise when a number of addresses depict "different" objects when they are in fact identical. For example, a certain employee satisfies a query (return an employee name (a non-distinctive property) who earns more than £35,000) and he is bound to the variable WELL_PAID_EMP; contemporaneously the same employee satisfies another query (return an employee name with a managerial job status and who is working in London) where he is bound to the variable LONDON_MANAGER. These two variables are accessible through different paths (for example by pointer re-direction) and for the programming language to support the equality semantics it needs specialised constructs that can ascertain the equality of variables


WELL_PAID_EMP and LONDON_MANAGER through some pointer chasing mechanism (e.g. dereferencing).

2) The relational model introduces the notion of a value-based key. Furthermore the relational key is unique within the scope of one table. A marked confusion arises with value-based keys because of the differing concepts of value and identity. A serious drawback of this double usage is that the values of the key attributes cannot change without affecting the referential integrity of the database. In some of the world's societies, for example, when a female marries she has to forsake her family's surname for that of her spouse. If the key attribute set contains the family's surname then a change in that value causes an introduction of a second entity (with the new surname) for the same person. To address this in value-based keys, it is a common practice to introduce an "artificial" attribute (e.g. employee number) that has no semantic meaning. Nonetheless this scheme is an effective one, and it ironically takes the role of a value-based pointer due to the attribute's lack of semantic meaning.

In some pre-relational data models the use of pointers for linking one record to another resembles the paradigm's identifiers. This resemblance led to numerous, and rather loud, criticisms of identifiers in early object-oriented data models in that they re-introduce the pointer chasing present in the traditional models. Furthermore Ullman in [ULLMA87] categorically states that "object-identity does not mesh well with declarativeness". Insisting on the conceptual nature of object identifiers and leaving the physical pragmatics as an implementation detail avoids this polemic.

The identification property in an object-oriented data model offers implicit support for object equality and sharing. Through its identifier an object is distinguishable from any other object, and furthermore the identifier is independent of the object's values, location (in main or secondary store), and addressability. By assigning an identifier to an object's state, sharing of objects throughout the collection is possible and is independent of the assigned values.

This sharing introduces the possibility of building cyclic graphs in an object collection.

Specifically a number of objects can share an object (or refer to it) without the latter being replicated.


Three binary predicates, IDENTICAL, SHALLOW_EQUALITY and DEEP_EQUALITY, are useful to test for the equality of two objects and their level of sharing – this takes care of the identification by addressing issue previously presented. The predicate IDENTICAL evaluates to true when the two objects have the same identifier. SHALLOW_EQUALITY implies that the two objects have equal values and objects for their properties. DEEP_EQUALITY is satisfied if each object's state has equal values when all references are recursively "folded out" by the values they represent. Two identical objects are also shallow equal. Two shallow equal objects are also deep equal. But generally two deep equal objects are not shallow equal and two shallow equal objects are not identical.

A formal treatment of deep equality in the presence of cyclic graphs (and consequently the possibility of infinite trees) is found in Abiteboul's paper [ABITE95B]. Distinct copying, as opposed to sharing, operators are also required, such as shallow and deep copying, as first found in Smalltalk-80 [GOLDB88] and in the Common Lisp Object System (CLOS) [BOBRO86]. Nonetheless it is known from [BEERI99] that, in general, identifiers alone in cyclic structures are not sufficient to distinguish two objects.

3.3.1 Logical Aspects of Identification

Modelling identity in a logic-based framework has its own considerations. A thorny problem of identification in logic systems is the generation of new identifiers for derived facts. Chapter six has two sections, i.e. on F-Logic [KIFER95] and Flora-2 [YANGG08], that address this point.

To achieve a rendition of identification, logic systems name a specific interpretation for each basic domain. These basic domains have an implied interpretation rather than a rule definition. Consequently each element of the basic values pertains to interpreted domains and each instantiated object pertains to the un-interpreted domain. Access to values and objects in a logical system is through the ground syntactic terms of the language (specifically the Herbrand Universe).

3.3.2. Complex Value Structure and Nested Relational with Identifiers

Tables 3.4 and 3.5 have the redefinition of a complex and nested relational value with identifiers and figure 3.4 is an example. For a start, an object database schema S is a pair (C, T) where C is a set of classes and T assigns a type to each class in C (see the type definition part in figure 3.4). An instance I of the database schema S is a pair too, with the following assignments: an assignment π of classes to each of the class instances, and an assignment ν between the parts of an object and their value.

Table 3.4: Object Values (Complex Structure Values with Object Identifiers)

Rule Generating Part

The rules that generate the set of object values, obj, on D and O are:

1) All values of D and O are values of obj;

2) If v1, …, vn are n object values and a1, …, an are a distinct sequence of attribute names from A then [a1:v1, …, an:vn] is a tuple type object value, obj;

3) If v1, …, vn are n distinct object values then {v1, …, vn} is a set object value, obj;

4) All references are associated with instances of a class.

Typing Part

The world of types is: t ::= ⊥ | D | C | [ a1:t, …, ak:t ] | { t }

Type t's interpretation, given a certain mapping of identifiers π, ⟦ t ⟧π, follows:

1) ⟦ ⊥ ⟧π = ∅ ;

2) ⟦ D ⟧π = D ;

3) ∀ C, C ∈ C, ⟦ C ⟧π = π(C) ;

4) ⟦ [a1:t1, …, ak:tk] ⟧π = { [a1:v1, …, ak:vk] | vi ∈ ⟦ ti ⟧π, i = 1, …, k } ; and

5) ⟦ { t } ⟧π = { { v1, …, vm } | vi ∈ ⟦ t ⟧π, i = 1, …, m }.

Note: In 5 of the above a type determines the structure of the values.

An object, o, up to this point, is a triplet of its identifier, value (either complex or nested relational) and instantiating class:

o = (id, obj, class)

Remark: we write o.id for object o's identifier (and similarly o.obj and o.class)

The meaning of equality, shallow and deep equality is expressed with the following relationships:

Remark: assume we want to relate two objects o1 and o2

Remark: object equality

iff ( o1.id == o2.id ) then object_equality( o1, o2 ) holds.

Remark: shallow equality

iff( o1.id != o2.id & o1.obj == o2.obj ) then shallow_equality( o1, o2 ) holds.

Remark: deep equality

Remark: assume the presence of a function, obj_to_val, that recursively replaces all ids in o.obj with basic values

iff ( o1.id != o2.id & o1.obj != o2.obj & obj_to_val( o1.obj ) == obj_to_val( o2.obj ) ) then deep_equality( o1, o2 ) holds.
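A minimal executable sketch of the three predicates follows (our illustration: an object is a dict {'id': …, 'obj': …}, store maps each identifier to its value, the hypothetical Ref class marks a reference, and the reference graph is assumed acyclic – the cyclic case needs the treatment in [ABITE95B]):

from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    oid: str                       # logical identifier being referenced

def identical(o1, o2):
    return o1["id"] == o2["id"]    # same identifier

def shallow_equal(o1, o2):
    return o1["obj"] == o2["obj"]  # equal values; references match as-is

def obj_to_val(value, store):
    # recursively "fold out" every reference into the value it denotes
    if isinstance(value, Ref):
        return obj_to_val(store[value.oid], store)
    if isinstance(value, dict):
        return {a: obj_to_val(v, store) for a, v in value.items()}
    return value                   # a basic value identifies itself

def deep_equal(o1, o2, store):
    return obj_to_val(o1["obj"], store) == obj_to_val(o2["obj"], store)

With this reading, identical objects are shallow equal and shallow equal objects are deep equal, but not vice versa.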


3.3.3 Identifiers and Referential Integrity

Given the ability to discriminate objects based on their conceptual identifier (and for simplicity we disallow synonyms), postulates on the consequent requirements for a referentially consistent collection of objects follow. These two assertions are due to

Khoshafian and Copeland [KHOSH86] and must hold for an object collection. The unique identifier assumption maintains the uniqueness of the object identifiers in a collection, while the no dangling identifier specifies that there can be no reference to an object that does not exist.

Table 3.5: Nested Relational (NF2) Structure Values with Object Identifiers

Rule Generating Part

The rules that generate the set of complex structure values, objnf, on D and O are:

1) All values of D and O are values of objnf. Therefore all atomic values are part of objnf;

2) If v1, …, vn are n values and a1, …, an are a distinct sequence of attribute names from A then [a1:v1, …, an:vn] is a nested tuple type structure value, objnf.

Typing Part

The world of types is: t ::= ⊥ | D | C | [ a1:t, …, ak:t ]

Type t's interpretation, given a certain mapping of identifiers π, ⟦ t ⟧π, follows:

1) ⟦ ⊥ ⟧π = ∅ ;

2) ⟦ D ⟧π = D ;

3) ∀ C, C ∈ C, ⟦ C ⟧π = π(C) ;

4) ⟦ [a1:t1, …, ak:tk] ⟧π = { [a1:v1, …, ak:vk] | vi ∈ ⟦ ti ⟧π, i = 1, …, k }.

These two assumptions are integrated into the object values with identity by introducing the following integrity constraint (integrity constraints are explained in detail in chapter seven):

Remark: all classes have their instance identifiers in the collection and each value part of each object respects its value structure (including identifiers, which must be in the collection)

∀ C ∈ C, ∀ o ∈ π(C) : ν(o) ∈ ⟦ T(C) ⟧π

3.3.4 Identification's Pragmatics

The conceptual object identifier has a physical, or a system level, counterpart. At this latter level the identifier is an address to a memory storage space. The principal engineering issues for physical identifiers are their generation and the efficiency of operations they participate in. These operations are basically the retrieval of the

referenced object and possibly retrieval of other objects referenced in a target object; consequently a good part of a DBMS's performance depends on efficient implementation of the referencing mechanisms vis-a-vis expensive disk access requirements. Two common mechanisms follow.

In the first mechanism a function that guarantees the uniqueness of the spawned identifier bestows an identifier on an object at instantiation. The next step involves placing the object's state into a storage space slot. Finally, an index entry is added that associates the identifier with its physical location. Some object DBMSs refine this by augmenting the spawned identifier with type information (for example the class that instantiated the object). Typically, with extensible hashing techniques, an object access takes one or two disk page reads. POSTGRES [STONE90A] is one of the earlier implementations to adopt this scheme.

In the second mechanism the identifier is actually a physical address of the object in the storage space. It is a structure of values (for example a volume and disk page). There are two main problems here: this regime loses the conceptual nature previously advocated, and an object that has to move to another storage location must leave, at its original address, a pointer to the new location. This has two effects on DBMS performance, whose severity depends on the frequency of this redirection: 1) access is no longer via a single data access path but two; 2) storage page fragmentation becomes more difficult to control. The ONTOS [ANDRE90] implementation advocated this structural scheme.
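A hedged Python sketch of the first, index-based, mechanism follows (the names are ours; the point is that relocating an object updates one index entry instead of leaving the forwarding pointer that burdens the second mechanism):

import itertools

_next_oid = itertools.count(1)     # uniqueness-guaranteeing generator
oid_index = {}                     # oid -> (volume, page) storage slot

def instantiate(volume, page):
    oid = next(_next_oid)          # spawn a fresh logical identifier
    oid_index[oid] = (volume, page)
    return oid

def relocate(oid, volume, page):
    oid_index[oid] = (volume, page)  # a single access path is preserved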

A final point about the implementation details of an object's identifier concerns the preservation of a collection's referential consistency. In a strong identity regime any operation over the collection must preserve the assumptions at all costs. For example, an object stays in existence for as long as any reference to it is present: that is, no explicit purging operation is available and hence garbage collection must be present. Strong identity can cause problems associated with logical pinning too (inaccessible objects continue to reside in a collection).

Goldberg and Robson [GOLDB88] suggest ignoring these rules to allow a level of weak identity (with explicit object destruction being supported); this is acceptable only if an adequate exception-handling procedure exists to support the situation when an object reference is no longer satisfied by an object in the collection (called a dangling reference).

Figure 3.4: Nested Relational Structures with Identifiers Example

The database schema is:

S1 = ( { TEAM, PERSON }, T )

Remark: object identifiers are in italic and arbitrarily start with the letter O

Remark: classes are written in bold (e.g. PERSON)

T ( TEAM ) = [ DESC : CHAR, MANAGER : PERSON, MEMBERS: { [ MEMBER:PERSON ] } ] T ( PERSON ) = [ NAME : CHAR, SKILLS : { [ SKILL:CHAR ] }, PREV_TEAM_MEMB : { [ PREV_TEAM:{ [ MEMBER:PERSON ] } ] } ]

A database instance IS1 that is based on S1 is:

IS1 = ( π, ν )

Remark: π maps each class to its extent in object identifiers

π ( TEAM ) = { O10 , O20 }

π ( PERSON ) = { O101 , O103 , O105 , O106 , O109 , O110 , O113 }

Remark: ν is a partial function from an object identifier to its value (in obj)

ν ( O10 ) = [ DESC : 'R&D' , MANAGER : O101 , MEMBERS : { [ MEMBER: O101 ] , [ MEMBER: O103 ] } ]

ν ( O20 ) = [ DESC : 'SND' , MANAGER : O110 , MEMBERS : { [ MEMBER: O110 ] , [ MEMBER: O113 ] } ]

ν ( O101 ) = [ NAME : 'JADE' , SKILLS : { [ SKILL : 'DOC' ] , [ SKILL : 'DESIGN' ] } , PREV_TEAM_MEMB : { [ PREV_TEAM : { [ MEMBER: O101 ] , [ MEMBER: O105 ] } ] , [ PREV_TEAM : { [ MEMBER: O101 ] , [ MEMBER: O109 ] } ] } ]

etc.

3.4 Message Passing

The inter-object communication aspect of most object-oriented systems is taken care of through a message passing metaphor. This metaphor is not exclusive to object-orientation even though many see the influence of early object-orientation on the pioneering Actor [HEWIT73] model. A communication starts by identifying an object and passing to it a message together with a number of arguments. The receiving object becomes active and reacts by executing an operation it associates with this message; this operation is called a

OOP – Object Basics - Page [ 52 ] Object-Oriented Data and Query Models method. It is imperative to emphasise that the operation an object executes is part of its own definition and not a part of the message transmission.

For example the message expression (2 PLUS 4) reads as follows: ask the object denoted by "two", an integer, to execute its plus operation with object "four" as an argument. For this message, object "two" binds to its method PLUS and pops the object "six". If the message expression is part of an assignment, SUM <- (2 PLUS 4) is an example, then the variable SUM is assigned to the object "six" after executing the expression.
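Python's operator protocol gives a close analogue of this reading (an illustrative sketch of ours, not Smalltalk semantics):

total = (2).__add__(4)   # ask receiver 2 to execute its add method
                         # with 4 as the argument; it pops the object 6
assert total == 2 + 4    # the infix form is sugar for this message send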

Message passing provides for blocked asynchronous inter-object communication. The most common mechanism is for the receiving object to pop the expression's resultant object on to the environment. An object collection is computationally passive; that is, no object is executing a method. A collection's environment, on the other hand, implies that there is a computational process taking place; it is the usual case that there is at least one active object at a point in time. The asynchronicity aspect of message passing comes about by the transfer of the current object status from one object to another in a given environment. In the last example, when "two" is the current active object it is executing the message PLUS and the assignment is blocked. On "two" finishing the PLUS computation, the object SUM becomes active while "two" recedes into being an inactive object.

3.4.1 The Message Passing and Method Determination Mechanisms

The message-passing paradigm hides a number of elaborate and subtle mechanisms. The first mechanism accepts, or rejects, messages sent to an object. The messages that an object accepts are those included in its external interface. To each external interface there is a separate mechanism that attaches a method (or an attribute) definition to it. (The latter mechanism enforces a partial ordering of methods from local and inherited definitions based on their vicinity to the receiving object – details of which are in the "classes and inheritance" chapter.) The closest method overrides all other applicable methods found in this partial ordering. The lack of prior knowledge, even by the receiving object, of which method to execute for satisfying a message leads to a late binding practice. A mismatch between a message and an object's external interface compels the receiving object to return a 'message not understood' error message to the sending object.


A method’s computation sometimes entails invoking other methods attached to the current active object; that is an object sends a message to itself. A pseudo name for the current active object, like self in Smalltalk, realises this mechanism. The self name appears in a method’s specification. An effective example of the self pseudo name’s use is the following

Smalltalk [GOLDB88] implementation of the factorial function:

factorial
    self = 0 ifTrue: [^1].
    self < 0
        ifTrue: [self error: 'factorial invalid']
        ifFalse: [^self * (self - 1) factorial]

Another pseudo name is super (again using Smalltalk’s terminology). The use of super in method implementation forces the receiving object to bind not with the most local method but the next one along the partial ordering of methods associated with the receiving object

(this is explained further in the “classes and inheritance” chapter). The most common use of super is when a method definition involves an overridden method. This is an important and effective code re-use facility. It is sometimes known as incremental behaviour specification.

The determination of which method to execute when an object receives a message depends on two mechanisms: firstly on the external interface and secondly on the method look-up through the local and inherited method implementation. These two mechanisms effectively isolate the scope of the methods relative to a receiving object.

It is common that a method name is ubiquitous in objects but each invocation entails separate semantics. This practice is called method name overloading. A textbook example of overloading is the PRINT method. Most objects require a different implementation and consequently the name PRINT is overloaded.

Method name overloading occurs throughout the collection while method name overriding is a selection mechanism applicable within the scope of the receiving object’s own and inherited definitions.
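The following Python sketch (hypothetical classes of ours) separates the two notions: details is overloaded across unrelated classes, while Manager overrides Employee's version and reuses it incrementally through super, the analogue of the super pseudo name above:

class Employee:
    def details(self):
        return "employee"

class Manager(Employee):
    def details(self):                          # overrides Employee.details
        return "managing " + super().details()  # incremental behaviour

class Team:
    def details(self):                          # same name, unrelated class:
        return "team"                           # overloading in the collection

print(Manager().details())   # late binding picks the closest method:
                             # prints "managing employee"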

3.4.2 Message Passing and Function Calls

Structurally the syntax for function calls and message passing is easily reconcilable: a function call takes the form FUNC(ARG1, ..., ARGN); while a message passing takes the form RECEIVER_OID MSG ARG1 ... ARGN.


Assume the presence of a set of method names, M. A method expression for Mn follows, where Mn is a member of M, Cn is the class with which the method is associated, and Tn are types for arguments and return type.

Mn :Cn Tn1 Tn2 ...Tnn  Tn

The semantic contrasts between these two include late binding and method name overloading. Also message passing requires the presence of a privileged and ever-present argument, which is the receiving object.

There are two important issues here – look-up performance and execution safety. For late binding it is imperative that the look-up operation is lightweight and comparable to looking up a function; the works by Odersky and Wadler have received attention – i.e. in [ODERS97]. Also with late binding the compiler can catch many improper message calls and arguments, but not all of them. The 'many' is becoming 'most' in just-in-time compilers developed from techniques introduced in Self by Smith and Ungar [SMITH95]. This is revisited in the data typing chapter (i.e. Classes and Inheritance).

3.4.3 Method Code

In object databases a method implementation is often coded with an object-oriented language. Although novel, the use of methods in a data model has drastic consequences. For example, if a method's code changes the underlying state of the database, and its computation depends on the state the method is seeing, then the method invocation becomes potentially unsafe in the sense that it might never end. Method implementations that do not change the underlying database state are said to be side-effect free. Some method implementations that do change the state are not necessarily unsafe. Deciding which methods are side-effect free is generally undecidable (this follows from the requirement that the decision process would need to simulate the semantics of the method's implementation).


3.4.4 Concurrency and Message Passing

A high-level characterisation of concurrency in computing is a number of independent activities executing and correctly sharing a computational environment. These independent activities require management and intercommunication.

The object-oriented paradigm’s message passing combines two things: the first is activity transfer between the recipient and the sender; the second is limited synchronisation. These alone do not provide a basis for a concurrent framework. The fact that Simula and

Smalltalk simulate concurrent systems is not the point. Nonetheless, an object's feature locality helps with realising concurrency because of the reduction in inter-object communication. This feature translates into a simplified placement and separation of objects at an implementation level.

3.4.5 Message Passing in Object-Oriented Databases

The Orion object-oriented database has message passing for both object definition and manipulation. The example is adapted from a paper describing Orion [KIMWO90D].

Remark: the following examples are based on the Orion OODB prototype

Remark: Orion's message passing syntax is ( selector receiver [ arg1 arg2 arg3 … ] )

Remark: where selector is the message and receiver is the object whose class definition is required (note Orion's syntax is not consistent here!)

( make-class TEAM
  :attributes (( desc :domain STRING )
               ( manager :domain PERSON )
               ( members :domain ( set-of PERSON ))))

Remark: object / instance definition

( make TEAM :desc 'R&D' :manager ( make PERSON :name 'Jade' ) )

Remark: return a set of team objects with Jade as its manager

( select TEAM ( manager name = 'Jade' ))

3.5 Summary

This chapter's main aims are two: to enumerate the better and more adequate sources that describe basic object-oriented features; and to accentuate and build bridges between these basic features and database design requirements. The basics covered in this chapter are encapsulation, object identity, object values with identity, and message passing. These are the basis for other object-oriented themes and variations described in the next two chapters (i.e. four and five).


Encapsulation offers the object-oriented paradigm's adopters feature locality and information hiding. We have also reported on how encapsulation and data independence overlap, and on possible conflicts with the inheritance mechanism due to dependences found within inheriting objects.

Herein we gave a progression on how to build an object's value: we started with tuple values, continued with nested and complex values, and finally augmented these with logical identifiers to enable better sharing of values between objects. Object identity, at the logical level, is a unique and immutable identifier. Also we are able to compare objects' values at various levels: identifier, shallow, and deep. A formal treatment of an object is given.
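The three comparison levels can be sketched in Java (our illustration; the class is hypothetical): identifier equality compares logical identifiers, shallow equality compares values and embedded identifiers one level down, and deep equality recurses through the references.

import java.util.Objects;

class Node {
    Integer value; Node ref;   // a value plus a reference to another object
    Node(Integer value, Node ref) { this.value = value; this.ref = ref; }

    static boolean identifierEq(Node a, Node b) {   // same logical identifier
        return a == b;
    }
    static boolean shallowEq(Node a, Node b) {      // equal values, identical references
        return Objects.equals(a.value, b.value) && a.ref == b.ref;
    }
    static boolean deepEq(Node a, Node b) {         // equal values all the way down
        if (a == b) return true;
        if (a == null || b == null) return false;
        return Objects.equals(a.value, b.value) && deepEq(a.ref, b.ref);
    }
}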

In chapter four these features are used to describe more involved object-oriented themes like classification; classification is a very important database modelling theme. Also the principles of inheritance and its pragmatics are given. Finally, data type checking and inference in an object-oriented environment is covered too.


4 Object-Oriented Paradigm – Classification and Inheritance

The concepts of object and identification are used to build the other more involved concepts of the object-oriented paradigm. In this chapter the main concepts are classes and various modes of classification for data modelling, inheritance, and data type checking and inference in object-oriented environments.

Classification is fundamental for data modelling and data typing. Also inheritance, attached to classes, aids in data modelling and application development. The need for a flexible data typing regime is a must, and many of the advanced features had the paradigm as a test bed. Research up to the year 2000 is adequate to cover our data type requirements.

Once these features are synthesised and focused for database design this chapter leads to a description of an object-oriented schema (found in the next chapter).

4.1 Classes and Classification

Other than identity and state, an object has an association with the artefact that created it; the latter is typically called a class while the association is called an instance-of relationship. Classes collate, or compose in data modelling terminology, the structural and behavioural properties of the objects that are their instances. The following is an example of a class definition in Orion [KIMWO90D] for noble gases; consequently all of its instances have, for example, an atomic weight property. In this example it is possible to see how some 'attributes' can denote values by a reference to other objects (e.g. supplier refers to class SUPPLIER and consequently objects are assigned an identifier from its instances).

Remark: a class definition in Orion's OODB prototype syntax

(make-class NOBLE_ELEMENT
  :superclasses ( ELEMENT )
  :attributes ((atomic_weight :domain REAL)
               (state_at_15c  :domain STRING)
               (supplier      :domain SUPPLIER))
  :methods ( purchase_order clear_stock_level ))

In a wide variety of scientific systems a classification is the division of a domain into partitions and forms a basic structural representation of the subject domain. In most cases the instance-of relationship is semantic in nature and few follow a rule to determine membership. Mendeleev's periodic table for the chemical properties of elements is an excellent example of a lateral classification; every element is a member of one element class (see figure 4.1). For example, in the case of the periodic table, Helium and Argon are instances of the NOBLE_ELEMENT class. The collection of all instances of a class is called its extent (e.g. the extent of NOBLE_ELEMENT includes Helium, Argon, etc.). Classification is also a basis on which a number of other conceptual abstractions, like the specialisation and aggregated composition mentioned in the conceptual design chapter, are built.

Figure 4.1 – Mendeleev's periodic table (1869) – the "?" named objects were then unknown but predicted elements. Source: HTTP://EN.WIKIPEDIA.ORG/WIKI/DMITRI_MENDELEEV

A basic question arises: is a class artefact an object too? We believe that this reification is of benefit to the paradigm and to database culture. Accepting classes on a par with instances (sometimes referred to as object homogeneity) has the following effects: classes have their own encapsulation; they have relationships with other like objects (e.g. other classes); and they have a relationship with their own extents and act as the interface to their instances. Some sources call artefacts that can instantiate classes 'meta-classes'. Attributes that are part of a class composition or represent a collective property of all its instances are called class attributes. In some systems static methods have a similar nature.


It is pertinent to state, at this point, that the ERM's entities and relationships are reconcilable with the paradigm's class artefact. So are the EERM sub-setting and aggregation artefacts.

4.1.1 Class and Generalisation Abstraction

Another classification comes from the biological sciences, where a classification of the animal kingdom is given through a hierarchic representation; its origin is attributed to Linnaeus (HTTP://EN.WIKIPEDIA.ORG/WIKI/CARL_LINNAEUS).

Through this class hierarchy we can also compare instances, if comparable, of two sets. If these sets are converted into class abstractions then a relationship between classes is built. This is the ISA relationship on classes. We say that the class PRIMATE ISA MAMMAL and the class HUMAN ISA PRIMATE. The ISA relationship between classes is in general reflexive, transitive, and anti-symmetric, and consequently a partial order between classes arises. A very common graphical representation of these classes and the ISA relationship is the class hierarchy (see figures 4.3 and 4.4).

There is an important principle associated with the modelling of the ISA relationship. Basically any object is an instance of any ancestral class; this is the basis for what is called the principle of substitutability. It is obvious that the extent of NON-HUMAN_PRIMATES is repeated (or absorbed) in the extent of PRIMATES and similarly in the extent of MAMMALS. Some literature tries to differentiate "direct" instances-of from those acquired through generalisation. An extent that includes all instances from any descendant class is called the deep extent (see figure 4.2). A definition of the deep extent follows: the deep extent of a class is the union of its own extent with the deep extent of every one of its subclasses.

deepExtent(c) = extent(c) ∪ ⋃ { deepExtent(s) | s ∈ subclasses(c) }
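Read operationally, the definition is a straightforward recursion over the subclass graph; a minimal Java sketch follows (our illustration, assuming the acyclic ISA links required later in this chapter).

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Clazz {
    final Set<Object> extent = new HashSet<>();     // direct instances only
    final List<Clazz> subclasses = new ArrayList<>();

    // deepExtent(c) = extent(c) ∪ ⋃ { deepExtent(s) | s ∈ subclasses(c) }
    Set<Object> deepExtent() {
        Set<Object> result = new HashSet<>(extent);
        for (Clazz sub : subclasses)
            result.addAll(sub.deepExtent());
        return result;
    }
}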

Other common terms include: a PRIMATE is a specialisation of a MAMMAL and a MAMMAL is a generalisation of a PRIMATE [SMITH77]. A class that specialises another is called a descendant class and a class that generalises another is called an ancestral class.


It is also important to study the meaning of and restrictions on the instance-of relationship in the presence of an ISA hierarchy. In our biological system we have stated that the instances of PRIMATES and CETACEANS are disjoint. There also exists an overlapping ISA relationship that allows instances to participate in a number of sibling classes. In effect objects are then able to associate with a number of classes.

Figure 4.2 – ISA and instance-of relationships. Classes are letters and instances are numeric. Class B ISA A; instances 3 & 4 are class D's extent; and 3, 4 & 5 are class B's deep extent.

We have tacitly implied that, given an animal instance, we can classify it by finding the "right" leaf in the hierarchy. There are design issues, as the following two cases expose. For example, which classes cannot instantiate objects? These are called abstract classes. (In some systems this term denotes a class that provides an interface but no implementation.) And do all instances of a class have to be an instance of some descendant class too? If affirmative we have a total ISA relationship, otherwise it is a partial ISA – exactly like an EERM sub-setting relationship.


Figure 4.3: Classes and a Generalisation Relationship

Figure 4.4: Class with properties in a hierarchy

Another interesting issue deals with the qualification of the ISA relationship. Let us assume we have a simple class called PERSON whose basic properties include NAME and ADDRESS. The MALE and FEMALE classes have a property of interest, GENDER, and arbitrarily we rule that male instances have a "male" value whereas female instances have a "female" value (note we can easily enforce this with integrity constraints – more in chapter seven). Although we indicated that the instance-of relationship is of a semantic nature, this is an example of rule-based instantiation. We need to quickly state that if one swaps the gender's value that would imply an object migration from one class to the other – this is not typically an easy database operation and, not surprisingly, it is not implicitly available in most OODBMSs.

4.1.2 Classes and Aggregation Abstraction

Although, as seen in the structured-object explanation, an object composition allows hierarchic composition of values, there are cases where an object is composed of a number of other objects through their identifiers. Also some of the "sub-parts" could themselves be aggregated objects.

Figure 4.5: A Class Composition Hierarchy

Therefore in an aggregated object, represented in a class, each aggregation relationship takes one of four forms [KIMWO89G], according to the sharing and independence dimensions discussed in chapter two.

A class definition example that includes an aggregated relationship follows.


Remark: a fragment of a class definition with an aggregation relationship in Orion's OODB prototype syntax

(make-class VEHICLE
  :superclasses nil
  :attributes ( ...
    ( engine_type :domain ENGINE
                  :composite true
                  :exclusive false
                  :dependent true ) )
  :methods ( ... ))

Another interesting and useful graph (i.e. in terms of data and query modelling) is the superimposition of the ISA relationship and aspects of the aggregation relationships in one graph; this is called the Class Composition Hierarchy [KIMWO90D] and an example is given in figure 4.5. Also the part of chapter five that deals with path expressions has some relevance here. The structure of the graph follows that of the class hierarchy, but what is added is another type of edge that denotes a relationship between one class and another through composition and cardinality (i.e. set or single value).

4.1.3 Prebuilt Class Hierarchy

From the very earliest object database implementations, each had provided a "pre-built" class hierarchy that had an array of class implementations, with methods and attributes. These are useful "libraries" of code that give software development a tangible boost to productivity through code reuse and differential development of the same library. It is instructive to attribute the in-built class hierarchy pragmatics, together with code development tools, to the Smalltalk project [GOLDB88]. Its design and tools are still valid today.

In figure 4.6 we have an example class hierarchy that includes both user-defined classes, for example AMPHIBEAN and its descendants, and prebuilt classes, for example COLLECTION and its descendants. The pre-built part of the class hierarchy is not usually drawn in an EERM.

4.1.4 Classes and their composition

We have presented classes as an abstraction for a composition of properties and as a partitioning agent for a domain's objects. From a data-modelling context these properties take a variety of forms, including attributes, methods, relationships, class attributes and integrity constraints. Due to the database context these properties form part of a class' external interface, and consequently the control of which properties an end user can read, update, delete or run falls in the non-trivial realm of database authorisation modelling. An early articulation came from [CHAMB81] – which is not of any direct interest here.

Figure 4.6 – Class hierarchy including pre-built classes. Source: McSweeney, J.; WWW.JMCSWEENEY.CO.UK/COMPUTING/M206/INDEX.PHP

A novel feature of the object-oriented paradigm is the use of methods as part of an object's composition. The more sophisticated class models also use an exception mechanism that invokes earmarked actions on an "error" event during a method's execution but does not return control to the method – i.e. non-resumable semantics. An early adopter of this exception model was C++ [STROU94]. Also, although unrelated to error handling, annotations on a method's blocking protocol (an excellent example is CORBA's oneway IDL method qualifier – see [BAKER97]) can also form part of a class definition.

An important part of a data model is the representation of relationships between objects. In a class structure this requires its own constructs, which include details of the relationship that can adequately implement referential semantics and two-way traversal of a relationship for n-way relationships (i.e. n >= 2). Also the "higher" forms of relationship, like ISA and aggregation, have to be accommodated too.

Finally we have to include the provision of integrity constraints that are associated with the extent of a class and condition its behaviour. For example it is a common requirement to have an object's attribute take a value that is unique across its class extent. Like an attribute, integrity constraints are applicable either at an object or at a class-extent level.
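As a sketch of a class-extent level constraint (our illustration; a real OODBMS would enforce this declaratively, and all names here are hypothetical), the following Java fragment rejects an update that would duplicate an attribute value across the extent.

import java.util.HashSet;
import java.util.Set;

class EmployeeClass {
    // attribute values already in use across the class extent
    private final Set<String> payrollNos = new HashSet<>();

    // the attribute must take a unique value when compared to the class extent
    void assignPayrollNo(String payrollNo) {
        if (!payrollNos.add(payrollNo))
            throw new IllegalStateException("uniqueness constraint violated: " + payrollNo);
    }
}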

4.1.5 Class Implementations and data types

Are classes and data types the same? In a class definition there are ample data-type details; for example in attributes, relationships, and methods. The main motivations for data types are expression correctness in terms of data usage and the possibility of converting one data type to another if certain rules are respected. In fact it is advantageous to force a relationship between each class and a data type that describes it. We want to enforce:

∀ c ∈ Classes . ∃ t ∈ Types : typeOf(c) = t

The interaction between classes and data types runs along two lines: each class has a data type, and there could be many classes having the same data type; and some data types might not have a class associated with them. (A forthcoming section on Inheritance and Data Types shows the relationship between classes, data types and inheritance in more detail.) Clearly we want to have classes and data types as two separate, yet related, artefacts.

4.1.6 Classification Critique

If classes and classifications are present in a data model then many aspects of database practice are assimilated; some are not. Consequently we have to acknowledge that choosing classes for classification modelling is a meta-design decision.

Before delving into class-based inheritance and data typing it is important to re-iterate the pros and cons of classification and classes, because design choices need to be made on objective grounds.

One of the most important advantages is the partitioning of an application domain's objects. Not only can we classify, but also, in general, querying (e.g. extracting) objects associated with a class extent is computationally easier than querying the whole database.


Through classification "higher" data-modelling constructs are possible. Also the class itself becomes a player in the data model: firstly it is a repository of behaviour, constraints and structural definitions, and secondly it is able to hold aggregate data about its extent.

Although the notation, definition and implementation of a class is a straightforward matter, conceiving a classification that meets the requirements of an application domain is another matter. Other than being "difficult", a classification is sometimes reduced to a match of political will and motives; for example the classification of wines. Also the design of a class presumes that the properties depicted in its definition are sufficient to describe and classify objects into their extents. What are we to do when the values of some object properties are unknown? What are we to do when a property, described in a class, is not applicable to an instance?

Re-classification of a class hierarchy is not a trivial matter, as changes have to be made to the classes involved and to the instances found in the affected extents and deep extents. Likewise, at an object level, moving an object from one class to another is a non-trivial task.

Another level of design difficulty with class hierarchies arises when manipulating them as a whole. For example merging two class hierarchies is a major re-design event (for an early record of a class hierarchy re-design see [BERLI90] and [COOKW92]). Making any two class hierarchies co-operate over two diverse computational environments is also a design milestone requiring different and difficult methodological procedures [BAKER97].

4.2 Inheritance

Inheritance is a mechanism for building objects by sharing the properties of other artefacts; this too is attributed to Simula [DALHO66]. This mechanism is almost synonymous with the paradigm, and many of the positive features cited in its favour are possible through inheritance.

Yet it is obvious that inheritance means different things to different systems. For example in Artificial Intelligence inheritance is used for knowledge representation, reasoning and inference. In semantic data modelling it enables the ordering of entities by an ISA relationship. In programming languages it helps in structuring and composing new data structures, and offers better opportunities for code sharing and code re-use – that is, better productivity in development.


4.2.1 Inheritance Mechanism

Inheritance in object-oriented systems typically builds an object out of the properties specified locally for it and out of the properties found in the other objects from which it is inheriting. Any property adopted is the one closest to the inheriting object, which can be defined locally or along the inheritance path; technically this selection is called over-riding. Also, some systems prohibit the redefinition of properties.

If a class system is in use then it is typical that the class ISA relationship determines the inheritance path and the encapsulation directive determines what is inheritable. The properties are, for example, attributes, relationships, and integrity constraints. Mixing the class concept and inheritance raises an issue; see the next section.

The inherited properties are "modified", or transformed, to the local scope of the inheriting object at run time – a late binding operation. Specifically the transformation process needs to do at least two things. Firstly, it needs to check whether an inherited property has been modified correctly (correctness is defined later). And secondly, any inherited property with a "self" reference needs to have these references converted to the inheriting object (this is an important transformation as it distinguishes inheritance as a sharing mechanism rather than a copying mechanism).
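The effect of converting "self" references is visible in any language with late binding; in the Java sketch below (our illustration; the classes are hypothetical) the inherited charge() body is shared, not copied, so its this.fee() call re-binds to the inheriting object's fee().

class Account {
    double fee() { return 1.0; }
    double charge(double amount) { return amount + this.fee(); }  // "self" reference
}

class FreeAccount extends Account {
    @Override double fee() { return 0.0; }   // over-rides the inherited fee
}

// new FreeAccount().charge(10.0) yields 10.0, not 11.0: the single shared
// charge() body sees the inheritor's fee(), i.e. sharing rather than copying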

4.2.2 What to inherit exactly?

Sub-classing down a class hierarchy implies that any subclass instance can be used when an ancestral class instance is required – this is called substitutability. Therefore any incremental modification, done through sub-classing and inheritance, needs to conserve substitutability. Wegner [WEGNE89B] categorises incremental modification into four kinds of sharing, each having its own set of constraints (or correctness) that influence what inheritance does. These are:

1) Behavioural compatibility, where the inherited attributes and behaviour have type conformance and the same semantics – this is the strictest form of sharing. Behaviour is specifiable with a language that includes a signature and a semantic interpretation. The signature is a collection of methods and their data typing. Behavioural inheritance compatibility has a number of themes: subset subtype, isomorphically embedded subtype, and object-oriented subtype. Nonetheless each variant is bound to return the same object when properties (defined for both the parent and the child) are invoked with the same arguments on either the parent's or the child's operations.

An example of the subset subtype is the set of "integers from 0 to 100" in relation to the set of "integers". A decisive problem to note here is that the restricted set is not closed for all its properties: the result of 50 plus 60 is outside the subset subtype's range (see the sketch after this list).

An example of the isomorphically embedded subtype is the set of "integers" in relation to the set of "real numbers" with the addition and multiplication operations. A constraint applicable here is that corresponding operations on corresponding arguments of the subtype must be defined in the subtype whenever they are defined in the supertype, and must yield corresponding results. If the operations on the set "integers" are extended with division then we lose this compatibility.

An example of the object-oriented subtype is an EMPLOYEE in relation to a PERSON. In this case the subtype has to have its operations use the same domains for their arguments as those of its parent.

2) Signature compatibility, where the inherited properties may be extended horizontally by adding new attributes or extended vertically by constraining existing attributes. The inherited attributes and behaviour must be type conformant, but the inherited semantics is undefined and therefore may differ.

Here the signature is an approximation to the semantics. These subtype relationships are then checked at compile time for data type compatibility. Not all type-compatible incremental modifications of the signatures are semantics preserving. For example, in a vertical signature incremental modification of PERSON, whose properties are NAME (domain string) and AGE (domain integers from 0 to 120), to EMPLOYEE (where AGE gets restricted to 16 to 65), some expressions' type compatibility has to be validated at run time rather than at compile time. Furthermore, setting the AGE of a PERSON (who is also an EMPLOYEE) to 10 breaks our signature constraint and also breaks the principle of substitutability associated with the ISA relationship.

3) Name compatibility (e.g. implementation models exemplified by Smalltalk), where only the implementation is shared, through the names of the properties. The inherited semantics can differ, and type conformance of signatures (if present) is not a necessary condition.

4) Cancellation (e.g. Artificial Intelligence models), where only some implementation is shared and some of the properties may be purged by the inheritor. This mechanism typically relaxes constraints down the hierarchy (in contrast to the previous three) – technically it makes the ISA relationship symmetrical. This is the weakest form of sharing. It is also to be noted that cancellation has an effect on deductive systems, in the sense that any argument that could be derived for a certain ancestor could become false in descendants that have cancelled properties on which the original deduction was based. In this circumstance we have what is technically called non-monotonic reasoning (more in chapter six).
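The subset-subtype closure problem in item 1 can be made concrete with a small Java sketch (our illustration; the class is hypothetical): a value type restricted to 0..100 is not closed under plus, so it cannot behave like its supertype.

class BoundedInt {
    final int value;                       // restricted to the subset 0..100

    BoundedInt(int value) {
        if (value < 0 || value > 100)
            throw new IllegalArgumentException("outside 0..100");
        this.value = value;
    }

    // plus is conceptually inherited from "integer" but is not closed here:
    // new BoundedInt(50).plus(new BoundedInt(60)) fails at run time
    BoundedInt plus(BoundedInt other) {
        return new BoundedInt(this.value + other.value);
    }
}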

If we base our inheritance mechanism on a strict compatibility basis (i.e. behavioural compatibility, but not signature, name, or cancellation compatibility) then the inheritance relationship is transitive and anti-symmetric. Consequently a partial order is established. This builds dependence between an object and any other object that inherits from it (already alluded to in the encapsulation section). A very important semantic constraint on any interpretation of inheritance is that the ordering does not contain any cycles (i.e. we do not want an object to inherit from itself). Our intention is to have the inheritance relation form a directed and acyclic graph. For an inheritance mechanism based on signature or name compatibility to yield a partial order we need to restrict its use.

After considering the inheritance mechanism it is worth delving into its reputed advantages. The more important of these are in conceptual modelling (which we have already presented as the ISA relationship) and in the possibility of re-use (especially code re-use).


4.2.3 Incremental development through re-use

The reputed advantages of code re-use through inheritance are mainly three [MEYER96].

First is the possibility of building a system with a high degree of reliability – given that the sub-components are reliable. Second is the economic benefit from a reduction in the development phase of an application program. Third is the federated build-up of an object system by the incorporation of object definitions available from multiple developers.

Over and above sharing, an inheritance mechanism with late binding allows an artefact's development to be based on "small and incremental" changes rather than "destructive development" (or overwriting). In contrast to inheritance-based incremental development is the "copy and paste" development mode. In this latter case any abstraction is brought down to the level of the copied items (e.g. lines of code) and, even more serious, the copied item isn't necessarily correct in its new context.

4.2.4 Single or Multiple Inheritance?

If the inheritance mechanism allows an object to inherit from many objects that are not related via an ISA relationship then we have multiple inheritance; otherwise we have single inheritance. Multiple inheritance offers more possibilities for incremental modification. Unfortunately it does introduce added difficulties both at conceptual and at technical levels.

At a conceptual level, allowing for multiple inheritance shatters the meaning of conceptual specialisation, especially since this mechanism doesn't imply any bias towards a particular ancestor. The often-cited example is the ARRAYED_STACK class from Meyer's [MEYER96] use of multiple inheritance. This class inherits from both ARRAY and STACK, with each class having clear behaviour and not related through an ISA relationship. Unfortunately the behaviour of the ARRAYED_STACK is not clear, because it is easy to find examples where a property inherited from a parent class is not at all applicable to it. For example with the ARRAY class we associate INDEX access, which does not make much sense for any specialisation of the STACK class.

Inheriting properties through multiple inheritance potentially leads to name collisions. These collisions occur when the dynamic binding name look-up procedure finds more than one property that it should execute – see figure 4.7. In many cases the inheritance mechanism attempts to resolve these name clashes either by annotations written by the designer (in Orion [KIMWO90D] the order of the class names in the inheritance list of a class definition determines a priority) or by using a superclass ordering mechanism (in CLOS the system linearises the directed acyclic graph and then chooses the property closest to the calling object [STEEL84]).

Figure 4.7: Different Occurrences of Name Collision in Multiple Inheritance

Can all name collisions be resolved? Indeed no, as can be shown in figure 4.7 with workings given in Simons [SIMON95]. In this situation a different property P is part of classes C2 and C3. Also, in C2 and C3, methods R and S depend on their respective P. When C4 uses method R it expects C2's version of P, while when S is used it expects P from C3.
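Java's default methods reproduce the ambiguous case of figure 4.7 directly; in the sketch below (our illustration; the interfaces are hypothetical) the compiler refuses the ambiguity and the designer must resolve it explicitly, much like Orion's class-order priority annotation.

interface Persistent {
    default String describe() { return "persistent"; }
}

interface Versioned {
    default String describe() { return "versioned"; }
}

// Without the override below the class is rejected at compile time:
// "class Stamp inherits unrelated defaults for describe()"
class Stamp implements Persistent, Versioned {
    @Override
    public String describe() {
        return Persistent.super.describe();   // explicit, designer-chosen resolution
    }
}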

4.2.5 Inheritance in Conceptual Design and Implementation Design

We have seen how an inheritance mechanism can force a hierarchical relationship between classes. In this scenario the best inheritance mechanism compatibility would be strict inheritance (i.e. behavioural), since through it we can safely make inferences on the relationships. But incremental modifications based on strict inheritance are not as flexible as Smalltalk's class-based inheritance. Consequently it is common for object-oriented models to have non-strict inheritance even though the meaning in parts of the class hierarchy might be obscure or problematic.

To complicate matters, inheritance use, or modality, comes in two flavours. At a conceptual level we have the ISA relationship, and at an implementation level there is an emphasis on re-use of properties down the hierarchy driven by convenience rather than compatibility (refer to Brachman [BRACH83] and America [AMERI90B]). These inconsistencies are amplified when a designer tries to integrate a "shrink pack" class library with the application's own class requirements. Clearly a demarcation must occur between the conceptual ISA relationship and the development of classes through the inheritance of properties to facilitate incremental development (typically called sub-classing).

4.2.6 Other Points with Inheritance

Is there any other mechanism for combining classes with inheritance? One alternative is a system with delegation-based inheritance where we have objects but no classes (Lieberman [LIEBE86]); the mechanism is based on constructing objects (called prototypes) out of existing ones – this is the cloning process. In effect each object could have its own distinct set of properties. Delegation is common in programming environments but not in databases (at least those with a schema).

4.3 Data Types

When developing software artefacts our intention is to write these in a readable, reliable and efficient manner. Data types go a substantial way to help us achieve these intentions.

A shallow description of a data type is a set of values and a number of functions and procedures that dictate how any element is manipulated. Typically how an actual data type implementation represents its state, functions and procedures is completely independent of its data type specification, so long as it does not break its own specification. A data type specification might have a number of possible implementations.

Any operation on, and expression involving, a data item has to conform to the typing specifications of the attributes and functions of the data types involved. Before the operation or the expression is executed it has to be type checked against its typing specifications for any data type incompatibilities; if incompatibilities occur then there are data type errors. To check for this we need to be able to associate a data type with each expression and function call; this is inductively built from the data specification and also from the data-type annotations given to intermediate results (e.g. the definition of a variable is followed by its data type). We say that every computational expression in a data typed environment has a type expression, and type checking is the process of evaluating the expression's data-type compatibility. If an expression's type check is successful then it is type safe. A type safe program must have all its expressions type safe.

C and C++ source code is not type safe when type casts and pointer arithmetic are present. On the other hand Lisp and Smalltalk have type safe expressions.

There are many reasons for a value, or an intermediate result, not to have a binding to a specific data type expression when an evaluation of the expression is required. Consequently one must attempt to garner data type details from partial data type specifications and data type bindings in the current expression. The process of using scant data type expressions from an expression, together with basic data type axioms, to derive a full data type for the expression is called data type inference. For example, in the following it is obvious from the result type of the SQL query that the variable NUMBER_PROJ has to be an integer, whatever the data type of table PROJECT is.

SELECT count(*) INTO number_proj FROM project;

In any data model we expect a minimum number of basic data types, and these are typically the integer, the float, the character string, and the Boolean. Also, as seen in the earlier chapters, a reference to an object is useful for specifying sharing semantics.

With each basic data type a number of applicable methods are defined. An important part of the data type system is the capability of constructing other data types, for example through the tuple data type constructor. The disjoint union allows the construction of a data type whose components are mutually exclusive (i.e. in an instance of the domain only one value is present). More sophisticated type constructors allow us to specify collections of values coming from the same data type; a specific example is the set of integers. The more expressive data type systems allow the creation of data type expressions based on polymorphic data types (a later section explains these). Some data type systems allow the specification of a data type to be given in terms of itself (i.e. recursive data types); an example of a recursive type is a list of integers.
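The constructors just listed can all be expressed in a modern typed language; the Java sketch below (our illustration; every name is hypothetical) shows a tuple, a disjoint union, a set collection, and a recursive list type.

import java.util.Set;

class Constructors {
    // tuple constructor: heterogeneous components grouped together
    record Address(String street, String city) {}

    // disjoint union: a value is exactly one of the variants
    sealed interface Contact permits Email, Phone {}
    record Email(String address) implements Contact {}
    record Phone(String number) implements Contact {}

    // collection constructor: a homogeneous set of integers
    static final Set<Integer> PRIMES = Set.of(2, 3, 5, 7);

    // recursive data type: a list of integers defined in terms of itself
    record IntList(int head, IntList tail) {}   // a null tail ends the list
}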

4.3.1 Data Typing

If our data type system allows an artefact to take a value from one data type domain associated with it then our system is called monomorphic. The artefact could be a variable, an object, or an expression. Many programming languages are mostly monomorphic (e.g. Pascal). If our data type system allows a variable to take a value from a range of data types then our type system is polymorphic. A good example is the ML [MILNE90] programming language data type system.

4.3.1.1 Static Data Type Checking

The two common modes of type checking are static and dynamic checking. Whatever the type checking mode, the core aim is common – prevent data type errors at run time.

In static type checking data-type errors are prevented by a number of measures. First, all objects, functions, procedures and variables have to be defined and annotated with typing information (sometimes redundantly through declarations). Given these data-type annotations, every expression's data type is understood through a "static" investigation of the environment (i.e. the program's details) during program compilation. In this mode all data-type annotations are ironed out of the compiled version of the program; consequently the machine code generated is devoid of any type checking code. No code is compiled if any data type error is flagged.

Many literature sources assert that static checking forces programmers to adopt disciplined code-writing behaviour. Also the generated machine code is optimisable and efficient.

Static type checking can be onerous when compiling a large source-code project (this is exacerbated when the compiler cannot work out a clear order in the data type declarations [SCHUS11]). Static typing is a conservative typing technique in the sense that it prohibits a programmer from writing a general-purpose program by leaving some of its coding with data type ambiguities that are resolvable at run time – for example, a generically typed sorting algorithm. This implies that a program will not be compiled even when it could have run without problems. Another problematic situation is the correctness of the design and implementation (in a compiler) of a static type system, when it is well known that these systems are complex and "immutable" once rolled out. A no-holds-barred and very one-sided piece was written by Ousterhout [OUSTE98].

4.3.1.2 Dynamic Data Type Checking

Dynamically typed expressions imply that data-type errors are reported at run time when they are not avoided. Consequently the compiled program needs to retain type details and introduce type-checking code for the run-time data type checking of any unresolved data-type expressions. Dynamic typing is useful as it allows a programming language to be used to write a generically typed sort. Another possibility with dynamic typing is the sharing and importing of persistent data from other environments [ABADI91].

There are issues with dynamic data typing; the most obvious is the performance of its compiled code. Lately some authors have stated that "just in time" compilation improves dynamically typed code performance. Nonetheless adequate performance is already found in the more mature implementations like Lisp and Smalltalk. Another issue is the lack of knowledge about a program during its coding. This manifests itself in debugging, documentation, and IDE environments (e.g. how to help code completion).

One of the first dynamically typed languages was Lisp, while Smalltalk is the earliest and best example from the object paradigm stable. Not many other programming languages offer wide support for dynamic type checking. The reasons are mainly two: first, they make up for this through other mechanisms (e.g. ML relies on data type polymorphism); and second, not many programming languages deal adequately with a persistent environment's peculiarities (one can refer to the paper of Atkinson et al. [ATKIN87] on persistent programming).

Many programming languages have developed type systems with a balance of static and dynamic typing [MEIJE04]. Nonetheless static typing will always be required, and conservative, as identifying programs that produce run-time type errors is undecidable. This follows straight from the halting problem.

4.3.2 Data Type Representation

Independent of when and how expressions are type checked is the requirement to represent the type system. The two main streams are the algebraic approach (for example in the spirit of Guttag [GUTTA80]) and the higher-order functional approach (for example as in Fun [CARDE85]). In the algebraic approach we need to enumerate the data as sets of values called ideals and distinctly represent the functions (including their names and arguments' data types – called the signatures) that manipulate the data. Other than the functions' signatures we need to express their semantics. The following example (from [DANFO88]) represents a simple algebra for a numbers modulo 2 system.


Remark: modulo 2 algebraic system (from DANFO88)

Remark: signatures

integer     : type;
zero, one   : function() result integer;
plus, times : function(integer, integer) result integer;

Remark: semantics

integer    == {0,1};
zero()     == 0;
one()      == 1;
plus(x,y)  == if (x==0) then y; else if (y==0) then x; else 0.
times(x,y) == if (x==0 or y==0) then 0; else 1.

Although the previous is a common Abstract Data Type (ADT) representation and is an effective tool for abstraction, it suffers from not having a concept of state. These systems work by requiring the programmer to provide a state (e.g. an integer variable), invoke the operations of the above modulo 2 ADT with an integer as an argument, and then wait for the result of the operation. The separation of the state from its associated ADT is sometimes scorned; Parnas [PARNA72] advocates a collective representation of state and ADT.
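The contrast can be sketched in Java (our illustration; both classes are hypothetical): the first class is the stateless algebraic style where the caller carries the state, while the second packages state and operations together in the manner Parnas advocates.

final class Mod2 {
    // stateless ADT style: pure functions over caller-held state in {0,1}
    static int zero() { return 0; }
    static int one()  { return 1; }
    static int plus(int x, int y)  { return (x + y) % 2; }
    static int times(int x, int y) { return (x * y) % 2; }
}

final class Mod2Register {
    // object style: the state lives together with its operations
    private int state = 0;

    void plus(int y) { state = Mod2.plus(state, Math.floorMod(y, 2)); }
    int  value()     { return state; }
}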

4.3.3 Data Type Inference

We have previously stated that each expression's data type is built inductively from its components' data-type annotations. What if some data-type annotations are missing? It is likely a type checker cannot check such an expression. Nonetheless the level of type safeness can be increased if type checking invokes a process of deducing the data type of expressions from the type declarations found in the program, the context of the expression, and the axioms of the type system. In such circumstances, and in the right programming frameworks, data type inference is an effective development tool. During the bottom-up type checking of an expression, if a variable has no associated data type then a fresh data-type variable is introduced. Consequently, for a successful data type inference process, the system has to come up with an instantiation for each data type variable that does not contradict the other data typing constraints, either given or implied.

For example, in the following SQL statement, if our knowledge of the SYSDATE function includes its return type then it determines the variable's (i.e. TODAY's) data type. Therefore we don't really need to annotate the variable as a date.

SELECT sysdate INTO today FROM dual;


If we know precious little about the SYSDATE function then we cannot infer TODAY's data type. Again we point at ML [MILNE90] and emphasise its flexible type system, whose type checking has a type inference component based on first-order logic assertions and deduction. A number of programming languages are introducing type inference in their respective revisions.
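Java's local-variable type inference (the var construct, available since Java 10) is one such revision; a minimal sketch (our illustration):

import java.util.ArrayList;

class InferenceDemo {
    void demo() {
        var count = 42;                       // inferred as int
        var names = new ArrayList<String>();  // inferred as ArrayList<String>
        names.add("Jade");
        // var mystery;                       // rejected: no initialiser, nothing to infer from
        System.out.println(count + " " + names);
    }
}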

We have implied, rather naively, that type checking and type inference are embedded in an effective procedure. In fact some extensions to a basic type system can make both checking and inference undecidable. Such extensions are required for aspects of the object-oriented paradigm – some of Palsberg's questions remain open [PALSB96].

4.3.4 Data Type Theory

At this point an introduction to the semantics of a type theory that caters for abstraction and universal polymorphism in the object-oriented paradigm is opportune.

V, the set of all values, is Scott's [SCOTT76] universe of all values built from basic domains, records, disjoint unions and function spaces. This set's elements are partially ordered. A type can then be described by a set of elements from V. Not all subsets of V are acceptable or useful types; those that are, we call ideals. Some of these ideals are the familiar data types found in programming languages. Set inclusion is proposed as an ordering relation over the ideals, thus forming a lattice structure. Therefore the type hierarchy generated is graphically analogous to the class hierarchy presented earlier. The apex and bottom nodes of the lattice are V and the set with the least element of V (the empty set) respectively.

The notion of a value having a type is interpreted as membership in the corresponding ideal. The selection of basic domains and type constructors influences important characteristics of a programming language's data-type sub-system. A monomorphic type system does not allow a value to be a member of more than one type (i.e. ideal). On the other hand a type system that allows a value to be a member of different types is a polymorphic type system.


4.3.5 Object-Oriented Data Typing

Two evident data-type characteristics of the object-oriented paradigm are sub-typing and polymorphism; actually these two are related. The overlying principle is that of data type substitutability: can an object, of a different and perhaps related type, be used instead of the expected one without a type error being flagged? There are many shades of substitutability; Liskov in [LISKO87] and [LISKO99] claims that substitutability is meant to maintain semantic properties rather than the weaker subtyping constraint given through properties' signatures.

With our notion of sets as types we can map the idea of subsets to sub-types. The expression that a data type B is a subtype of data type A (written B ⊑ A) implies that the data-type set of B is in a sub-set relation to the data-type set of A. This is a very elementary notion but applicable for explaining sub-range types and some subtyping manifestations.

Yet another example, extensively used by Cardelli [CARDE84], is type inclusion between record structures with functions; these represent attributes and methods. Incidentally this representation has since been used by many others to study "inheritance" between records. A record type A is a sub-type of a record type B if A has all the attributes of B, and possibly more, and the common attributes are in a sub-type relation (the Ai denote attribute names, and the Ti and Ui are data types):

if Ti ⊑ Ui for i = 1 .. n then A = {A1:T1, … , An:Tn, … , Am:Tm} ⊑ B = {A1:U1, … , An:Un}

In terms of substitutability based on attribute signatures it is evident that A can take the place of B. Every type is a subtype of itself: A ⊑ A.

Subtyping for a function is:

if S′ ⊑ S and T ⊑ T′ then f: S → T ⊑ f: S′ → T′

Two important characteristics to note are the contra-variance of the domain (the sub-typing between S′ and S runs against the direction of the function subtyping) and the covariance of the range (the sub-typing between T and T′ runs with it).

One of the basic characteristics of object-oriented systems is data polymorphism. Data type polymorphism comes in many disguises, and one of the first researchers to classify it was Strachey [STRAC67]. Strachey divided polymorphism into two sections, the first called universal polymorphism (actually Strachey called this parametric) and the second called ad hoc polymorphism. Later Cardelli and Wegner in [CARDE85] refined this classification by adding polymorphism with subtyping.

4.3.5.1 Universal polymorphism

In universal polymorphism we have a method whose code is applicable over a collection of data types that have some common characteristics. This category is divided into two overlapping sections, namely parametric and inclusion polymorphism. In parametric polymorphism a function definition is adorned with a type variable that has to be instantiated with a data type during the data-type checking of an expression involving this function. In the example below the swap function arguments are assigned the type FLOAT when applied to the variables HIGH and LOW.

Remark: parametric polymorphism example using c++ templates

template <typename DT>
void swap(DT& left, DT& right)
{
    DT temp = left;
    left = right;
    right = temp;
}

Remark: in use, i.e. declaration and invocation

float high, low;
swap(high, low);

Universal quantification implements data type polymorphism at a second-order level. During data-type checking and inference the procedure for binding parameters to actual values needs to be done at two levels of abstraction: the first for data types and the second for values.

In the case of a recursive data type it is understood that parametric types are not expanded by a macro replacement scheme (as that leads to an infinite expansion). Problems exist when trying to establish whether two parametric recursive type definitions are of the same type. Typically one solves these problems by placing syntactic restrictions on the parametric type definition.

4.3.5.2 Existentially quantified polymorphism

Type expressions that are existentially quantified assume the availability of a type that satisfies them. There may be many types that satisfy an existentially typed expression, and therefore the interesting solutions are those expressions that are somewhat restricted by other explicit data-type annotations in the expression being solved. Consequently the existential quantification alters the whole body of a data type. Existential type expressions are useful, especially if they contain enough structure. Most ADTs are considered existentially quantified [MITCH88]. These types manage to hide some structure of an object instance but still show enough structure to allow manipulation of objects through the operations these data types themselves provide.

Remark: existentially quantified types example

Remark: declaration of a module with an attribute and a function with Boolean as its range

EQDType = ∃dt { attr : dt; f : dt → Boolean }

Remark: a possible implementation of EQDType follows

stringDT = { attr : string; f : string → Boolean }

Some data-type expressions can use both universal and existential quantification for pronounced flexibility. Cardelli and Wegner [CARDE85] give an example of a parameterised stack whose data type expression includes existentially and universally quantified type variables.

4.3.5.3 Universal Bounded polymorphism

Bounded quantification in a data-type expression allows data type variables to be replaced with explicit sub-types of the parameterised data types. Inheritance is modelled by explicit parametric specialisation of types to the sub-type for which the semantics will be evaluated. The earlier presentation of sub-types based on subsets has opened a variety of interpretations, dependent on the type constructor used in the data-type expression.

The "for all" capability of universally quantified type variables in a generic function might actually be more than we want to bargain for. In some cases we want to restrict the quantification to a restricted set of subsets (e.g. to those that are sub-types of a given data type). Bounded universal quantification helps us express these data type expressions. Not only is this a useful specification technique but it makes our type system even more expressive. When we apply a function on a bounded universally quantified data type expression the "extra" data type characteristics are not stripped off.


Remark: bounded quantification in java
Remark: source Angelika Langer's Java Generics FAQ at
Remark: www.angelikalanger.com

class ClassI {}

class ClassJ<DT extends ClassI> {   // datatype DT is ClassI or any descendant
    public DT identity(DT whatever) { return whatever; }
}

Similarly, existentially quantified type parameters in expressions can be bounded with sub-type relations. In some systems with bounded quantification, type checking has been shown to be undecidable [PIERC92].

4.3.5.4 F-Bounded polymorphism

Unfortunately there is a lurking problem within bounded quantification over type expressions with recursive types. In Canning et al. [CANNI89] this problem, together with its solution, was investigated. Namely, the authors showed how perfectly common object-oriented expressions couldn't be typed with the data-type theory presented in Cardelli and Wegner [CARDE85]. The expressiveness of bounded quantification drastically diminishes in the presence of recursive data types, because some methods are not "inherited" over the data-type quantification owing to subtype violations. The solution presented by Canning et al. entails the introduction of another sub-type relationship in the data type expression so as to relate the necessary methods through the sub-type relationship. Their method is called F-bounded polymorphism and an example follows.

Remark: f-bounded quantification as in Canning et al. [CANNI89]

integer ⊑ { …, lte: integer → Boolean, … }   # note integer is recursive
porder = { lte: porder → Boolean }            # porder is recursive too

min: ∀dt ⊑ porder . (dt, dt) → dt             # min is a method for porder

We claim: if integer ⊑ porder then { …, lte: integer → Boolean, … } ⊑ { lte: porder → Boolean }

In which case the subtype relationship between the lte functions gives:
i)  Range:  Boolean ⊑ Boolean   # no problem
ii) Domain: porder ⊑ integer    # required by argument contravariance – a contradiction

Therefore either the two types are the same or the original claim is wrong.

The F-bounded solution redefines porder as a type function and instantiates it at integer:

fb-porder[dt] = { lte: dt → Boolean }         # fb-porder is no longer recursive
fb-porder[integer] = { lte: integer → Boolean }

Hence: integer ⊑ fb-porder[integer]

and the new function min works as expected:

min: ∀dt ⊑ fb-porder[dt] . (dt, dt) → dt

Compilers for Java's later versions [GOSLI05] support F-bounded quantification.

Remark: F-bounded quantification in java
Remark: source Angelika Langer's Java Generics FAQ at
Remark: www.angelikalanger.com

class ClassI<DT> {}

class ClassJ<DT extends ClassI<DT>> {
    public DT identity(DT whatever) { return whatever; }
}
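The same F-bounded pattern is what makes Java's Comparable usable generically; in the sketch below (our illustration) the bound DT extends Comparable<DT> relates the comparison method to the type variable, exactly as fb-porder[dt] does above.

class FBound {
    // ∀dt ⊑ Comparable[dt] . (dt, dt) → dt -- a generic min via an F-bound
    static <DT extends Comparable<DT>> DT min(DT a, DT b) {
        return a.compareTo(b) <= 0 ? a : b;
    }
}

// FBound.min(3, 7) type checks because Integer implements Comparable<Integer>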

Serendipitously, F-bounded quantification indicates another important observation about the relationship between inheritance and sub-typing: two types that satisfy the F-bound might not be in an inheritance relationship [CANNI89].

In inclusion polymorphism a function's data type variable can be satisfied by a data type and by other data types somehow related to it. The introduction of inclusion polymorphism addresses some of the pragmatic notions of object-orientation, namely inheritance and sub-typing.

Remark: inclusion polymorphism using c++ classes in humans.h

#include <iostream>

class person {
public: virtual void title() = 0;
};

class employee : public person {
public: void title() { std::cout << "employee! \n"; }
};

class consultant : public person {
public: void title() { std::cout << "consultant! \n"; }
};

Remark: in sillyapp.cpp

#include "humans.h"

void print_title(person *someone) { someone->title(); }

int main() {
    employee empl;
    consultant cons;
    print_title(&empl);   // prints employee!
    print_title(&cons);   // prints consultant!
}


4.3.5.5 Ad hoc polymorphism In ad hoc polymorphism a function works on a number of data types but the function’s code is typically re-written for each data type. The data types on which an ad hoc polymorphic function operates don’t in general need to have any common characteristics. The behavioural meaning of an ad hoc polymorphic function is usually different for each data type instantiation it can be invoked on. This “apparent” polymorphism manifests itself into two main disguises (that aren’t entirely distinct).

The first form is called overloading, and this has been previously described. The second form is called coercion, and this involves explicit procedures for forcing a data value of a certain domain into a value of another domain that is expected by the function. Cardelli and Wegner warn the reader that “ALGOL68 is well known for its baroque coercion scheme” [CARDE85]. In C and C++ [STROU92] explicit type coercion is done, for example, with an (int) cast or with static_cast.
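The following Java sketch (the class AdHoc is hypothetical) shows the two guises side by side: describe is overloaded with a separately written body per data type, while coercion converts a value between domains either implicitly (widening) or explicitly (a cast).

class AdHoc {
    // Overloading: same name, a separately written body for each data type.
    static String describe(int i)    { return "int: " + i; }
    static String describe(double d) { return "double: " + d; }

    public static void main(String[] args) {
        System.out.println(describe(42));     // selects the int version
        System.out.println(describe(3.14));   // selects the double version

        // Coercion: the int literal 7 is implicitly widened to a double,
        // while an explicit cast forces a double into an int.
        double widened = 7;                   // implicit coercion
        int narrowed = (int) 3.99;            // explicit coercion, yields 3
        System.out.println(widened + " " + narrowed);
    }
}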

4.3.6 Advantages of data types

What are the advantages of having a well-embedded and well-grounded type system in the object-oriented paradigm? Danforth and Tomlinson in [DANFO88] enumerate the following:

1) Types provide a uniform framework with which to understand objects.

2) Within declarative languages (where denotations are the means of guiding computation and few explicit type expressions are given) it is advantageous to have a type theory capable of explaining and representing the meanings of expressions.

3) A high-level typing theory aids the construction of efficient, useful and understandable software.

4) Inter-object communication trends can be invaluable in achieving locality of access in parallel systems and can also be useful in load balancing. Typing systems could help compiler optimisation.

5) Type-secure modules could be achieved by dynamic or static type checking (see later definitions).

What are the disadvantages of a type system? Firstly, while all programming languages have the same power, as each is Turing complete, what counts as compilable source in each language is affected by its type system. Secondly, the opportunities presented by object-oriented polymorphism, their resolution, and their eventual pruning took time to develop, to be accepted, and to be implemented.

4.3.7 Inheritance, Sub-Typing and Sub-Classing are not the same thing

We have just argued that sub-typing is one of the mechanisms found in the object-oriented paradigm that realises data-type polymorphism. We have also seen how a certain interpretation of sub-typing can partially order data types into a hierarchy. Also, the data-type system aims at making type checking of expressions reliable and efficient.

When we considered classes, as compositions of properties with a semantic relationship (i.e. instance-of) to their instantiations, a basic hint given was that each class implements a data type. After our exposition of data types it is obvious that each class has an association with a data type. Classes too have been organised in a hierarchy, through the class ISA relationship. We must reiterate that the underlying motivation for this hierarchy is code re-use. It was made apparent that specialisation (i.e. a conceptual design feature) and pragmatic re-use machinations are not entirely compatible.

One of the first literature sources to affirm this disparity was Brachman in [BRACH83]. As for prototypical research projects, the first that practised what it preached, namely that inheritance and “sub-typing” are different, was the POOL language ([AMERI87] and [AMERI90B]). At a later time pragmatic work (e.g. Porter in [PORTE92]) and theoretical work on F-bounded polymorphism showed that sub-typing and inheritance were not always in harmony.

If we had to survey some of the current object-oriented programming systems it would transpire that these systems adopted a single-hierarchy system (either type or class based – and these are not exactly in line with our definitions here) in which inheritance is controlled by sub-typing rules. The current stable of object-oriented data-type systems offers strong and effective facilities, as exemplified in Self (Smith and Ungar [SMITH95]), Java (Gosling [GOSLI05]), Scala (Odersky et al. [ODERS10]), C# (Hejlsberg et al. [HEJLS10]), and Links (Cooper et al. [COOPE06]).


Given that we accept the need for the two hierarchies, what would be the major hurdles?

Basically, the complexity of the system increases further. This hurdle was significant for the paradigm at an early stage [BERLI90] but is nowadays attenuated by much better language design and education. Also, the fact that we have separated the two semantic structures does not mean that their developers and users would not “abuse” them, thereby constricting the pros of the dual system. On the other hand, the best advantage of de-coupling the two hierarchies (but keeping a class associated with a data type) is the resolution of many sticky problems when coercing classes onto types or types onto classes.

Figure 4.8: De-coupling Sub-classing and Sub-typing

4.4 Summary

If one adopts the class theme in an object-oriented database then a number of data modelling possibilities arise. These include classification of the object collection, composition of a class’s properties, specialisation and generalisation through the ISA relationship between classes, and the build-up of involved structures through aggregation.

Classes are also used to implement data types through, for example, the declaration of their properties’ data type signatures.

The inheritance mechanism is an effective technique for sharing properties to achieve artefact re-use. For example, it is used to re-use code through incremental code build-up.

Data-type systems are concerned with the correct application of data to expressions. Data-type checking models include static and dynamic systems. In many object-oriented systems dynamic typing is employed and it is coupled with data-type inference to work out the type correctness of expressions at runtime. Also subtyping, a specific category of data-type polymorphism, is used to improve data-typing specification and implementation.

Furthermore the paradigm offers a variety of other data-type polymorphisms which, given that known type transformations like F-bounded polymorphism are used, allow the designer to build more robust data-type regimes to support the class hierarchy implementation.

After these last two chapters on object-oriented themes and variations, the next step is to complete the survey with an emphasis on databases and the ODMG standard.


Chapter 5

5 – Object-Oriented Paradigm – OODB and the ODMG Standard

The first major research efforts in object databases started in the late eighties; these include Postgres [STONE90A], Orion [KIMWO90C], Iris (OpenDB) [KENTW91A] and O2 [DEUXO91, ATKIN92], and each had its own data model. It was natural that the database research community and the products’ suppliers wanted to harmonise the object database diversity. The first major effort toward this harmony came from Atkinson et al. [ATKIN90], in which the authors enumerate important themes and features.

There are advantages to modelling through object orientation: for example, objects include details of state and behaviour, and the state also includes more details of an object’s relationships with other objects.

In this chapter we build a definition of a class-based, object-oriented database, and it has its basis in the previous chapters (i.e. three and four) on object-oriented themes. Also, a subsection presents collective knowledge of path expressions. The final section is a detailed study, exposition, and critical overview of the ODMG data and query model. (Further details are found in Appendix – ODMG Data Dictionary). Also in the final section an OODBMS, EyeDB, which has a good coverage of the ODMG standard, is presented. EyeDB also offers OODBMS features not included in the ODMG standard.

In coming chapters EyeDB is used as a test OODBMS to translate an EERM diagram into a schema specification that has both structures and constraints.

5.1 OODB schema and instance

In this chapter we tie in classes, types and objects with object databases through schemas.

An object database schema is split into two parts: one structural and the other behavioural.

With the structural schema we define each class’s collection of objects that satisfies its constraints, including the ISA relationship.

The formalism here, and in earlier sections (i.e. 3.2-3.4, 4.1.1, and 4.1.5), follows Kanellakis [KANNE92], Abiteboul [ABITE93A], and Straube [STRAU90A]. Well-known textbooks include Delobel et al. [DELOB95], Lausen et al. [LAUSE97], and Abiteboul et al. [ABITE95].


We can assume that we have a class that is arbitrarily set to be the ancestor of all classes – variously called the ‘root class’ or ‘object’. Correspondingly we also assume the presence of a type ANY.

As for the behavioural schema we indicate the implementation and name resolution requirements.

The structural schema is SS = ( C, ISA, Type ).

The structural schema is a triplet of classes, the ISA relationship between classes, and a mapping called Type from a class to its data-type definition. Type is based on the typing depicted earlier, e.g. figure 4.8, but it reflects any sub-typing specified and the fact that a type is assigned to each class:

Type : C → T

for all c1, c2 ∈ C: c1 ISA c2 implies Type(c1) ≤ Type(c2)

An instance, I, of the structural schema just given, is a pair of an extent mapping (π) and object states (i.e. obj):

I(SS) = ( π, obj )

where π must ensure that extents respect the ISA relationship, and obj must assign to each object a state that is legal for its class’s type:

for all c1, c2 ∈ C: c1 ISA c2 implies π(c1) ⊆ π(c2)

for all c ∈ C and all o ∈ π(c): V(o) ∈ ‖ Type(c) ‖

Also note that V(o) is not v(o), in that V(o) includes the inherited attributes, for class Cn, and the values assigned to these.
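To make the extent condition concrete, a minimal Java sketch follows (the class and object names are hypothetical); it checks that whenever c1 ISA c2 holds, the extent of c1 is contained in that of c2.

import java.util.*;

// A sketch of the structural-schema instance condition:
// c1 ISA c2 implies pi(c1) is a subset of pi(c2).
class StructuralInstanceCheck {
    public static void main(String[] args) {
        // ISA as direct sub-class -> super-class pairs
        Map<String, String> isa = Map.of("Student", "Person", "Lecturer", "Person");

        // pi maps each class to its extent of object identifiers
        Map<String, Set<String>> pi = Map.of(
            "Person",   Set.of("o1", "o2", "o3"),
            "Student",  Set.of("o1"),
            "Lecturer", Set.of("o2"));

        boolean ok = isa.entrySet().stream().allMatch(
            e -> pi.get(e.getValue()).containsAll(pi.get(e.getKey())));
        System.out.println("extent containment holds: " + ok);   // true
    }
}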

In figure 5.1 we can appreciate schematically the mappings between classes, types and instances.


Fig. 5.1 – Mapping of Classes, Types, and Objects with Inheritance


The behavioural schema is defined through a set of methods, each of which is assigned to a class. To describe a method, its name (i.e. Mn) and the types of its arguments and of its return value are given. The schema is a triplet of classes, the ISA relationship between classes, and the set of method signatures, S. The return data type can be a set or a scalar.

The behavioural schema is BS = (C, ISA, S).

The signature set (therefore there are no duplicates) is made up of entries that have the following form (slightly different from the previous notation as now we have a class as one of the arguments):

mth : c × t1 × … × tn → t, with c ∈ C and t, t1, …, tn data types

As described in the earlier section on data types (i.e. 4.3), method signatures are expected to uphold contravariance of the argument data types and covariance of the return data type.

Consequently a method, say mth with the above signature, is executed if its receiver is drawn from the extent of its class and its actual arguments from the product of its argument types:

o ∈ π(c) and ( a1, …, an ) ∈ ∏ ‖ ti ‖

Instances of the behavioural schema are expressed as a triplet of class extents, method implementations, and method resolution:

I(BS) = ( π, mthImplement, mthResolution )

For method implementation we have a partial function from signatures to implementations:

mthImplement : S ⇀ Implementations

where the implementation of mth : c × t1 × … × tn → t is a function of the form:

⟦ c ⟧ × ⟦ t1 ⟧ × … × ⟦ tn ⟧ → ⟦ t ⟧

For method resolution we have another partial function from methods and classes to a signature:

mthResolution : M × C ⇀ S

And if a method mth is invoked on an instance of class c and no signature

mthResolution( mth, c )

is found, then any of the following is adequate to resolve the method call:

mthResolution( mth, c′ ) for a class c′ with c ISA c′
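A minimal Java sketch of this resolution strategy follows (the method and class names are hypothetical): a signature is looked up on the receiver’s class and, failing that, the ISA chain is walked towards the root.

import java.util.*;

// A sketch of mthResolution: look for a signature on the receiver's class,
// otherwise walk up the ISA chain until one is found or the root is passed.
class MethodResolution {
    static Map<String, String> isa = Map.of("Student", "Person");   // sub -> super
    static Set<String> signatures = Set.of("Person.print", "Student.enrol");

    static Optional<String> resolve(String method, String cls) {
        for (String c = cls; c != null; c = isa.get(c)) {
            if (signatures.contains(c + "." + method)) {
                return Optional.of(c + "." + method);
            }
        }
        return Optional.empty();    // resolution fails
    }

    public static void main(String[] args) {
        System.out.println(resolve("print", "Student"));  // Optional[Person.print]
        System.out.println(resolve("enrol", "Person"));   // Optional.empty
    }
}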

5.2 Path Expressions

In earlier sections we pointed out that direct and value-based access to an object is possible in the object-oriented paradigm. Given that an object structure is rich and includes implicit inter-object relationships, could we use this structure to access embedded objects?

Similarly we are also interested to know which object is related to which other objects and under which circumstances. Ideally, when we are holding an object’s reference, e.g. a team instance called TEAM_TEN, then through its MANAGED_BY property we can access its office PHONE (see figure 4.5). A common notation uses the dot to specify the following object chasing:

Remark: example i) team_ten.managed_by.phone # e.g. return value is 9922 4455

Remark: example ii) team_ten.managed_by # e.g. return value is #id2233

Remark: example iii) team_ten.managed_by.bonus # e.g. return value is 20000

This notation for denoting objects is called a path expression and is accredited to Zaniolo’s work on GEM [ZANIO83]. The leftmost artefact, i.e. TEAM_TEN, is called the selector and it is usually either a class or an object identifier (or even a named object, as in this case). The rightmost artefact is called the return value of the path expression. A path expression evaluation can return null, one, or many values or identifiers. The intermediate artefacts are attributes that follow a path from the selector to the return value along its class composition hierarchy (see figure 4.5). A path expression is really an implicit join (e.g. implemented through a part-of relationship or a complete referential constraint) and qualified by the instance-of and ISA relationships. For the third example we start from the class TEAM instance named TEAM_TEN, traverse through MANAGED_BY to class MANAGER, and return the manager’s BONUS value.

Path expressions are extensively used in object-oriented databases and are a fundamental requisite of query modelling. Important references include the following: the Orion project [KIMWO90D] and its later additions with XSQL [KIFER92A], Bertino’s [BERTI89], and Frohn’s [FROHN94]. Path expressions are the basis of the XPath [BENED08] language and of many semi-structured data models and implementations. Another interesting addition to UML is its Object Constraint Language; visible contributors to this effort are Warmer and Kleppe [WARNE03].

5.2.1 Path Expressions in more Detail

An object’s properties include attributes, relationships and methods. If we collapse attributes and relationships into methods, as methods that take no parameters, then we have a convenient notation (i.e. methods with typed parameters and a typed result) to express the capabilities of path expressions. We have to be particular about the type constructor of a method’s result type: namely, whether it is a single object (value), in which case we call the method’s result data-type constructor scalar, or whether the method returns a set of objects (values), in which case its result data type is a set.
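A small Java sketch of this reading follows (the classes Team and Manager are hypothetical): once attributes and relationships are rendered as parameterless methods, a path expression such as team_ten.managed_by.phone is simply a chain of typed method calls.

// Attributes and relationships as parameterless methods.
class Manager {
    private final String phone;
    Manager(String phone) { this.phone = phone; }
    String phone() { return phone; }             // attribute as a no-arg method
}

class Team {
    private final Manager managedBy;
    Team(Manager m) { this.managedBy = m; }
    Manager managedBy() { return managedBy; }    // relationship as a no-arg method
}

class PathDemo {
    public static void main(String[] args) {
        Team teamTen = new Team(new Manager("9922 4455"));
        // the scalar path expression team_ten.managed_by.phone
        System.out.println(teamTen.managedBy().phone());   // 9922 4455
    }
}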

5.2.1.1 Scalar Path Expressions

The above examples (i, ii & iii) are of the scalar path expression type. In the following example two path expressions are used to check whether the team manager’s phone number is the same as that of the object being referenced by THE_BOSS:

team_ten.managed_by.phone == the_boss.phone Remark: returns a Boolean

Scalar path expressions can also make use of methods. For example, assume that the EMPLOYEE class has a method called BASIC_SALARY (a partial function) that returns a year’s salary given that the particular year (an integer) is passed as a parameter.

the_boss.basic_salary(2013) Remark: returns 25000

5.2.1.2 Set Path Expressions

A simple example of a set path expression follows (based on figure 4.5); this expression denotes a number of skill objects (instances from the class SKILL) that a particular EMPLOYEE (whose identifier is EMPLOYEE_15) holds. It is worth noting that the expression evaluation includes instances from the deep extent of class SKILL (i.e. the classes ADMIN and TECH skills). Therefore the data type of the returned expression is heterogeneous (albeit associated through the ISA relationship and possibly through data-type compatibility).

employee_15.skills Remark: returns ‘coding’ and ‘testing’

Set path expressions are more varied than scalar path expressions. For example a set path can be compared with another set path or compared with a scalar type. Some examples follow.


Is EMPLOYEE_15 part of TEAM_TEN? The “includes” operator’s structure is “set includes item” and its meaning is true if “item” is found in “set”.

team_ten.members include employee_15 Remark: returns Boolean

Are all of TEAM_TEN’s employees on a SALARY greater than 10000? The “all>” operator’s structure is “set all> item” and its semantics implies that all of the set’s elements are greater in value than item (for example the expression “{3,4,5} all> 2” is true). This is an example of universal quantification.

team_ten.members.salary all> 10000 Remark: returns Boolean

Is there an employee on TEAM_TEN that has the same PHONE number as their manager?

The “any=” operator’s structure is “set any= item” and its meaning is true if there is at least one item that is equal to an element found in the set. This is an example of existential quantification.

team_ten.members.phone any= team_ten.managed_by.phone Remark: returns Boolean

In a set path expression methods have a role too; for example, we may be interested in collecting the salaries for the year 2013 of the EMPLOYEES associated with TEAM_TEN.

team_ten.members.basic_salary(2013) Remark: returns a set of salaries

As a method’s parameters can be any object expression (so long as the data type is compatible), even the year can be substituted by a path expression (in this case a scalar one that returns an integer).

team_ten.members.basic_salary(team_ten.year_started) Remark: returns a set of salaries
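Operationally, a set path expression fans out over a collection and applies the remainder of the path to each element; the following Java sketch (hypothetical classes and a toy salary rule) mirrors the examples above.

import java.util.*;
import java.util.stream.*;

class SetPathDemo {
    record Employee(String name, int baseSalary) {
        int basicSalary(int year) { return baseSalary + (year - 2000); }  // toy rule
    }
    record Team(List<Employee> members, int yearStarted) {}

    public static void main(String[] args) {
        Team teamTen = new Team(
            List.of(new Employee("ann", 20000), new Employee("bob", 15000)), 2005);

        // team_ten.members.basic_salary(2013): fan out and invoke per element
        List<Integer> salaries2013 = teamTen.members().stream()
            .map(e -> e.basicSalary(2013)).collect(Collectors.toList());
        System.out.println(salaries2013);   // [20013, 15013]

        // team_ten.members.salary all> 10000: universal quantification
        boolean allAbove = teamTen.members().stream()
            .allMatch(e -> e.baseSalary() > 10000);
        System.out.println(allAbove);       // true
    }
}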

5.2.2 Variables in Path Expressions Embedding variables in a path expression is a useful tactic. In fact a variable, enclosed in square brackets, can replace the selector (the first object reference of a path) or can be appended to any method invocation along the path’s composition. The following examples help to pragmatically explore the usefulness of variables in path expressions.

In the first example the variable X is bound to the age value of employees having the same age as their manager in the selected team.

team_ten.members.age[X] and team_ten.managed_by.age[X]

The second example will bind a manager object to the variable X only if the manager of the selected team has a salary greater than 20000. For each instantiation of the path expression an object is bound to the variable X.


team_ten.managed_by[X].salary >= 20000

The third example relates employees (through variable X) and their technical skills through variable Y. (Assume ISA is an infix relationship and AND a Boolean expression constructor).

For each instantiation of the path expression objects are bound to the variables X (bound to employee) and Y (bound to skills).

[X].skills[Y] and [X] isa employee and [Y] isa tech

5.2.3 Heterogeneous set of objects

We have already stated that path expressions can return a set of heterogeneous objects through the deep extent relationship. The following expression does build a heterogeneous set (and in general one expects the items not to be in any deep extent relationship).

team_ten.desc

This path expression, if evaluated in an appropriate data-typing regime, returns all those objects reachable from TEAM_TEN that have a property called DESC. In our example it would imply having the combined deep extents of the classes SKILL and TEAM.

5.2.4 Physical Implementation of Path Expressions

A naïve implementation of path expressions at the data-access level would be to convert the expression into a cascade of identifier lookups and also to create an index for each component of a path expression instance. This is expensive not only in terms of storing the excessively high number of possible paths in a class composition hierarchy but also in terms of keeping these indexes up to date. An access based on this naïve model would consume disk accesses in direct proportion to the number of components in a path expression. A number of valid proposals have been made to address the indexing of path expression instantiations, and the most significant are still Bertino’s work on the Orion OODB prototype [BERTI89] and Valduriez’s seminal paper [VALDU87] on join indexes. In Bertino’s work a number of indexes were proposed, and the most popular two were the nested index and the path index.

In the case of a nested index a direct association is built between the selector object and the return object of a path expression. An index entry is made up of two values / object references, with the return object as the leftmost entry. This index is quite efficient for access paths whose expression is similar to the associations found in the index. Maintaining the index up to date, if a large number of updates over an object collection materialise, is a significant effort.

A path index is built by concatenating the selector object with the rest of the path expression except the return value; the return value is used as the other partner of the index association. Updating this index is also an expensive operation, but this structure is more flexible than the nested index as it can match against a larger number of path expressions. Therefore this index has a higher re-use potential.
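The shape of the two indexes can be sketched as follows in Java (the identifiers are hypothetical): a nested index associates the selector directly with the return value, whereas a path index keys the return value by the whole instantiated chain of intermediate objects.

import java.util.*;

class PathIndexDemo {
    public static void main(String[] args) {
        // nested index: selector oid -> return value, e.g. for team.managed_by.phone
        Map<String, String> nestedIndex = new HashMap<>();
        nestedIndex.put("team_ten", "9922 4455");

        // path index: the full instantiated path -> return value
        Map<List<String>, String> pathIndex = new HashMap<>();
        pathIndex.put(List.of("team_ten", "mgr_07"), "9922 4455");

        System.out.println(nestedIndex.get("team_ten"));                   // direct hit
        System.out.println(pathIndex.get(List.of("team_ten", "mgr_07")));  // chain hit
    }
}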

5.2.5 Path Expression and Other Possibilities

The previous example of a path expression able to create a heterogeneous collection of objects that are not generally data-type compatible offered a glimpse of an interesting class of path expressions – namely database schema exploratory expressions.

For example we might want to specify the starting attribute and the last attribute and allow the system to instantiate a minimum path between these two properties. In this case the variable is bound to a sequence of methods rather than to actual instances. In fact we can therefore specialise the sort of the variables in a path expression to a sequence of either object instances, or object properties, or classes (this was introduced in Kifer et al. [KIFER92A]). In this case there are two main techniques in place. The first is the use of pattern matching (similar to regular expressions) to expand the path expression onto a schema’s structures. The second is the use of deduction, based on the semantics and composition of the schema (which include both implicit and explicit knowledge), to satisfy or infer a path expression.

Another interesting shorthand was proposed by Frohn et al. [FROHN94], who claimed improved syntax and semantics of path expressions over those of Kifer et al. The authors propose the possibility of having “branches” (called introducing the second dimension) in a path expression, and consequently path expressions become more concise (compare the two versions of the same path expression below).

Remark: Normal path expression team_ten.members[X] and [X].age > 25 and [X].sex = ‘female’

Remark: Path expression with branches team_ten.members.{age > 25 or sex = ‘female’}


5.3 The Object Data Standard: ODMG 3.0

The Object Data Management Group (ODMG) was formed in 1991. Its participants were then a handful of people, led by Rick Cattell [CATTE94], interested in the development of object-oriented databases and their management systems. Their intentions were focused on the quick development of a set of specifications, loosely branded as a “standard”, that would describe the data modelling and query modelling language primitives of interoperable object-oriented databases. Later, ODMG aligned itself with the standardisation efforts of the Object Management Group (OMG), specifically on the data aspects.

The standard’s first publication came in 1992 and was called “The Object Database Standard: ODMG – 93” [CATTE94]. ODMG justified flouting the “rigorous” procedures adopted by other standards committees on the premise that a specification could address two important aspects of reality at that point in time. The first was so that analysts and designers thinking of using an object-oriented database for a back-end would have peace of mind that the product chosen, if ODMG compliant, would not be too different from the other contenders. The second was so that developers of OODBMSs could concentrate on implementing the specification and improving on specific issues, like performance or data security. Versions 1.1 and 1.2 quickly followed the first publication. In version 2.0 of 1997 the Java binding API was added to the standard. Version 3.0 was published in 2000 [CATTE00] and the then board, by now more numerous and having three types of members, assures us that version 3.0 should be with us for some time. The ODMG board claims on numerous occasions (e.g. on page 1) that theirs is a “de facto standard for this (OODBMS) industry”. Indeed a good number of software producers claim membership of the ODMG.

Also, ODMG’s web site had a page explaining what ODMG compliant and ODMG certified mean, but no indication of what the test suites are. It is not clear either when, and to what breadth, ODMG members and OODBMS producers implemented the ODMG 3.0 specification.

Although ODMG has ceased to exist, the Object Management Group in 2008 pledged to revive the standard with a 4th and next-generation standard to include recent changes and realities of databases and computing. Consequently a group of academics and industry players, called ODBMS.ORG, have also united efforts to carry on and distribute research and products that aid ODMG efforts. These have been attenuated, if not halted, after the financial crisis of 2008.

5.3.1 The Generic Chapters

The ODMG specification is divided into a number of chapters (each with a specific scope). The main ones are the Object Model, the Object Specification Language, the Object Query Language, and a number of Language Binding chapters.

The object model on which ODMG’s data and query models have their basis is a super-set of the Object Management Group’s (OMG) object model [OMGRP08]. The additions to OMG’s Interface Definition Language (IDL) are artefacts that are typically present in databases but not in programming environments. ODMG believed that OMG’s portability claim and industry-wide support made their object model an opportune lowest common denominator. Also, OMG eventually gave credence to part of the ODMG specification, i.e. the Object Query Language.

The object specification language chapter deals with languages that are used to specify, first, the database structures, relationships, and constraints and, second, the uploading and downloading of objects between the object database and the computational environment, which is of no interest here. The first language is the object definition language (ODL) and it must adhere to OMG’s object model [OMGRP08].

The object query language (OQL) chapter presents a declarative language for querying. Along its development, the structure and meaning of the retrieval query constructs have been respectively aligned with the syntax and meaning of SQL’s SELECT statement [ANSIT86].

There are three chapters for language binding; one each for C++, Smalltalk and Java. Each chapter describes how to reconcile the host language’s data model with that of ODMG, write portable code (i.e. a repository is manipulated by any language that adheres to the binding rules), move objects to and from the repository and the programming language environment, and ways to invoke object queries. It also specifies what each binding should provide in terms of persistence (e.g. garbage collection and object locking) and functionality (e.g. transaction models).


5.3.2 The ODMG Object and Data Model

The core object model is used to represent the meaning of objects, object identifiers, object structures, intra-object relationships and methods. The principal constructs of the model are therefore:

 A literal is a structured value compliant with the definition found in the previous sections;

 An object has an identity, in contrast to a literal and a structured value;

 A type has a representation of a structure, of a set of constraints, and of a number of methods. A number of object instances can have the same type. The types specified for the design of an information system are the user-defined types;

 An object’s state value changes within the structure and constraints of the object’s type. The state comprises attribute values and relationship instances;

 A method implements part of an object’s behaviour and it is qualified by the number and the type of its arguments and the type of its result;

 A meta-structure, or database schema, is a collection of object types and other features related to an OODB realisation and OODBMS mechanisms.

5.3.2.1 Types, Inheritance and Sub-typing

The ODMG type has two aspects. The first is the specification of its external characteristics (or properties), which are implementation independent. The specification includes attributes, relationships, methods and exceptions (i.e. error-handler routines). This specification is realised by two language constructs: the INTERFACE and the CLASS. An interface definition predominantly specifies “abstract behaviour” only – i.e. for ODMG it is a method’s signature. For each method we specify its name, its arguments and their respective types, and the return type of the method. A class definition specifies abstract behaviour (as just seen) and “abstract state” – i.e. methods, attributes, relationships, constraints and exceptions. In this context an abstract literal specification describes the state of each basic domain (e.g. Boolean, char, long, float, string) and the construction of values (e.g. collection, enumeration, union). There are structured abstract literals too (for example a ‘date’ data type).


The following is an example interface for HOTELROOM.

interface Hotelroom
( extent Hotelrooms
  keys htl_roomnr, htl_roomnom ): persistent
{
  attribute Unsigned Short htl_roomNr;
  relationship Set<Booking> is_booked_in_booking
    inverse Booking::is_for_hotelroom;
  attribute String htl_roomnom;
  attribute Unsigned Short number_of_beds;
  attribute Unsigned Short rate_per_night;
};

The second aspect of ODMG’s type is its implementation. Each specification can have many and different implementations. For example, the implementation includes a representation which implements the abstract state and a code segment to implement a method’s semantics. It is important to understand the context of the language bindings here. These two aspects of types enable the ODMG to justify their claim that encapsulation is attained through their object model and programming language bindings.

In ODMG’s object model inheritance is implemented through sub-typing. This relationship is regulated by two modalities: the first being that an object is also an instantiation of any ancestral type associated with its type; the second being that characteristics of the sub-type (e.g. methods and attributes) are modifiable, and the introduction of new properties is allowed. The working standard does not provide for the purging of characteristics in the sub-type, so we have assumed that it is prohibited.

The inheritance (or sub-typing) relationship has two guises: the first manifestation is inheritance between interfaces (and classes), and the second is between classes. In the first case it is intended that inheritance affects only the abstract behaviour (i.e. methods) between interfaces. Multiple inheritance is allowed, but any eventual method-name collision is to be resolved by the analyst. It is important to emphasise two points: first, these interfaces are not a complete specification (they could have missing attribute and relationship specifications); and second, interfaces cannot instantiate objects. The second manifestation of inheritance is the mechanism built in with classes, called extends; classes inherit from an interface or another class. Only single inheritance is allowed in the ‘extends’ mode of inheritance. Classes do instantiate objects and implement their interfaces too.


If the designer needs to collect all instances of a type into an extent then she must explicitly associate an extent with the type. Because classes are the constructs that really instantiate objects, we expect to see an extent declaration tied to a class specification. Once an extent has been specified, and therefore the OODBMS is indexing the objects by their type, it is also possible to specify “value-based keys” through the type’s attributes. The key value, or values in the case of a multi-attribute key, has to be unique in the extent it is specified in. Also, there can be a number of keys specified on a single extent.

interface Hotelroom
( extent Hotelrooms
  keys htl_roomnr, htl_roomnom ): persistent
{
  attribute Unsigned Short htl_roomNr;
  …
};

5.3.2.2 Object Database, Object Creation, Objects, and “Collection Objects”

According to the working standard, an object-oriented database is the principal scope for all objects of an information system. In this database all objects are instances of a user-defined type that has been introduced to meet the requirements of the system. Closely associated with a database is ODMG’s introduction of a module as a unit of declaration and specification of the database in which schema and application objects have a scope. A number of exception flags are specified for this module (for example INTEGRITYERROR and DATABASECLOSED).

An interface that is inherited by all other user-defined interfaces is the OBJECT interface. A few of the behaviours it bestows on all inheritors are the COPY (make a copy of the receiver object), the DELETE (purge the object from the collection unless referential integrity problems occur – in which case raise the INTEGRITYERROR handler), and the SAME_AS (to be explained shortly) methods. To create an object one must use the OBJECTFACTORY interface that provides the NEW method for this purpose. Note that the language bindings would need to have their own implementation of the OBJECTFACTORY interface.

The ODMG specification of an object includes three parts. The first deals with identity, the second indicates an object’s lifetime, and the third its value.

All objects have an identifier. This identifier is unique (with respect to the scope of the collection) and immutable. The standard bestows the responsibility of providing the identifiers on the OODBMS. All interfaces and classes, as we have stated previously, respond to the SAME_AS method, which returns a Boolean result on checking the identifier of the receiver object against that of another object (passed as an argument) in the collection.

Relationship instances use these identifiers.

Also, the standard provides for the naming of any object (over and above identifiers). These names are useful database entry points, but their management should not be taken lightly (both from the aspect of OODBMS design and from a DBA aspect). Homonyms are not allowed.

While it is obvious that a database stores objects for a period of time that is independent of the processes that create or access them (i.e. we are very much interested in object persistence), there are scenarios where some objects should be purged when their process ends (the standard calls these transient objects). These transient objects are common in an application front-end and are closely associated with the programming-language environment. The type of lifetime an object has, the standard wants us to understand, is independent of the object’s type.

The standard makes use of the term collection to denote a number of distinct “elements”, and therefore a type generator. There are some structural restrictions on the standard’s collections. The most important restriction is that these “elements” must be of the same type – i.e. collections are homogeneous sets. If the “elements” are literals then the literals must all be of the same type. If the “elements” are user-defined type instances then the objects must be of the same type. If the “elements” are collections of “elements” then the collections of “elements” must all be of the same type.

The specified behaviour of a user-defined collection, which is a specialisation of the OBJECT interface, includes CARDINALITY (returns the number of elements in the receiver collection), CONTAINS_ELEMENT (returns a Boolean flag indicating whether the identifier passed to the receiver is present in the collection), INSERT_ELEMENT, and SELECT_ELEMENT (returns an item from the collection that satisfies the query passed to the receiver collection). Some methods yield dynamic information on the collection’s state (e.g. the CARDINALITY and IS_EMPTY methods).


A useful companion to the collection interface is the iterator interface, which allows access and traversal of a collection’s elements. Other than the obvious methods AT_END, RESET, and GET_ELEMENT, there is an interesting Boolean method called IS_STABLE which indicates whether this iterator’s traversal is to be insulated from traversal processing side-effects over the same collection.
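One plausible reading of IS_STABLE, sketched below in Java with hypothetical names, is that a stable iterator traverses a snapshot of the collection, so inserts performed during the traversal are not observed by it.

import java.util.*;

class StableIteratorDemo {
    public static void main(String[] args) {
        List<String> collection = new ArrayList<>(List.of("a", "b"));

        // a "stable" iterator over a snapshot of the collection
        Iterator<String> stable = new ArrayList<>(collection).iterator();
        collection.add("c");                  // side-effect during traversal

        List<String> seen = new ArrayList<>();
        stable.forEachRemaining(seen::add);
        System.out.println(seen);             // [a, b] - the insert is not observed
    }
}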

Collections can be specialised to a number of other particular collection types (for example set, bag, list, array and dictionary). For example, the set is an un-ordered collection of unique elements. The SET class (note the shift from interface to class here), which is a sub-type of COLLECTION, has a VALUE attribute whose type is parameterised with a type, and a number of common set operations (e.g. CREATE_UNION, CREATE_INTERSECTION, IS_SUBSET_OF, and IS_SUPER_SET_OF). Some methods, for example INSERT_ELEMENT, are incrementally modified to reflect the uniqueness semantics of a set over those of a collection.

5.3.2.3 Literals

The standard specifies a good number of types whose values do not have an identifier; these, as we have seen, are called literals. These types are an exact equivalence of OMG’s IDL own values. The most basic – called atomic, and very close to our basic domains of the previous section – include the following list: long, long long, short, unsigned long, unsigned short, float, double, Boolean, octet, character, string and enumeration. The last is really a type generator, as a definition is required to specify the list of literals making up an enumerated type. For example, to specify a sex type one could define the gender with the literals “male” and “female”.

Another set of literal type generators is the collection of literals (with the same variety as the collections just presented but restricted to literals).

We can also use the record data-type generator (the standard calls these structures). Each structure has a name and a fixed number of components, with each of these having a name and a literal data type. A textbook example is an address with its components street, city and postcode. There are a number of pre-defined structures too: e.g. date and time.


Since literals do not have an identifier, the use of SAME_AS is not applicable. To compare two literals one uses the EQUALS method. The standard lists a number of rules, related to the data type, that give the exact meaning of two literals being equal. This is based on structure and values. An almost verbatim reproduction of a selection of these rules follows.

Two literals, x and y, are considered equivalent if they have the same literal type and:

 are both atomic and contain the same value;

 are both sets, have the same parameter type t, and each element of x is in y and each element of y is in x;

 are both structures of the same type and, for each component j, x.j and y.j are equivalent.
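These rules coincide with the structural equality familiar from programming languages; the Java sketch below (the Address record is hypothetical) exercises each rule in turn.

import java.util.*;

class LiteralEquality {
    record Address(String street, String city, String postcode) {}   // a structure

    public static void main(String[] args) {
        // atomic literals: same value
        System.out.println(Integer.valueOf(7).equals(7));             // true
        // sets: each element of x is in y and each element of y is in x
        System.out.println(Set.of(1, 2, 3).equals(Set.of(3, 2, 1)));  // true
        // structures: component-wise equivalence
        Address a = new Address("High St", "Sheffield", "S1");
        Address b = new Address("High St", "Sheffield", "S1");
        System.out.println(a.equals(b));                              // true
    }
}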

5.3.2.4 Data Typing and Type System

The ODMG claim that their model is “strongly typed”, and the standard explains that consequently all objects and literals have a type and every method has its arguments and result typed too. Two types are equal or compatible according to the following paragraph from the standard.

“Two objects or literals have the same type if and only if they have been declared to be instances of the same named type. Objects or literals that have been declared to be instances of two different types are not of the same type, even if the types in question define the same set of properties and operations. Type compatibility follows the sub-typing relationship defined by the type hierarchy. If TS is a subtype of T, then an object of type TS can be assigned to a variable of type T, but the reverse is not possible. Two atomic literals have the same type if they belong to the same set of literals.”

It is very important to note that significant work has been done to iron out issues with the type system and the type inference required for expressions in this standard, especially for queries and the language binding APIs. Two papers that stand out are Alagic’s work [ALAGI99] and Bierman and Trigoni [BIERM00].


5.3.2.5 State Properties

The attributes and relationships of an object are typically defined through a class construct. Each attribute is associated with a data type, and each relationship has a cardinality qualifier and an indication (through a mention of a class) of which instances can satisfy it. While literal values are “copied” into each attribute’s value slot, relationships and attributes whose values are instances of a class have the instance identifier bestowed on them.

In some examples found throughout the standard one finds use of a readonly attribute. This attribute qualifier is indeed a legal construct according to the standard’s ODL grammar. An exact meaning of this qualifier is found in references on the IDL language (e.g. Baker’s introduction to CORBA in [BAKER97]). A READONLY ATTRIBUTE’s value is only available for access (for example, the language binding implementation will not provide a setting method for the attribute). Nonetheless other methods are able to access and change the value of a read-only attribute.

If an “attribute” is considered to be part of an abstract behaviour then an interface can have both attribute and relationship definitions. One definitely has to respect the standard’s requirement that interfaces must be non-instantiating constructs. The general syntax is:

ATTRIBUTE < TYPE | CLASS > < ATTRIBUTE NAME >;

Relationships are specified with the RELATIONSHIP construct and require some more explaining in terms of the ODMG specification. The general syntax is:

RELATIONSHIP < TYPE CONSTRUCTOR > < RELATIONSHIP NAME >
INVERSE < CLASS NAME >::< RELATIONSHIP NAME >;

Firstly, relationships are only binary. These are more “expressive” than the ERM’s binary ones because the type constructor (which is a singleton or a normal set in the ERM) can be any of ODMG’s collection data types (e.g. a list type constructor implies order). Secondly, each relationship is annotated with backward traversal information (i.e. the standard calls these traversal paths and uses the INVERSE construct to implement them), and therefore each binary relationship is specified in the two classes (or interfaces) that participate in it. The following code constructs show how an ‘employs’ 1:N relationship between team and employee is specified.


Remark: code template to implement the one-to-many relationship between team and employee

CLASS team {
  Remark: other specifications
  RELATIONSHIP SET<employee> employs
    INVERSE employee::employee_of;
};

CLASS employee {
  Remark: other specifications
  RELATIONSHIP team employee_of
    INVERSE team::employs;
};

In ODMG’s relationship construct the many part is resolved by using the set data-type constructor, with the data-type parameter being the class of the objects the relationship is qualified with. The one part is resolved by using the class of the objects the relationship is qualified with. In the case of an M:N relationship each side of the relationship would have a set data-type constructor qualified with the name of the opposite class (in the relationship). It has to be noted that ODMG relationships are binary and take no attributes.

The standard requires the OODBMS to maintain referential integrity for all relationships declared through the RELATIONSHIP construct. If an operation jeopardises referential integrity then the INTEGRITYERROR exception is raised. If there is an attribute definition whose domain is another class’s instances then the OODBMS is not responsible for maintaining its referential integrity; consequently the application developer would need to write the referential-integrity code.
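What such OODBMS-maintained referential integrity amounts to can be sketched in Java (hypothetical classes): forming the team.employs side of the 1:N relationship also sets the inverse employee.employee_of, so the two traversal paths can never disagree.

import java.util.*;

class InverseMaintenanceDemo {
    static class Employee { Team employeeOf; }
    static class Team {
        final Set<Employee> employs = new HashSet<>();
        void addEmployee(Employee e) {
            if (e.employeeOf != null) e.employeeOf.employs.remove(e); // drop old link
            employs.add(e);
            e.employeeOf = this;                                      // maintain inverse
        }
    }

    public static void main(String[] args) {
        Team t1 = new Team(), t2 = new Team();
        Employee e = new Employee();
        t1.addEmployee(e);
        t2.addEmployee(e);   // moving e cleans up the t1 side automatically
        System.out.println(t1.employs.size() + " " + (e.employeeOf == t2)); // 0 true
    }
}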

The OMG’s IDL does not have the concept of a relationship. We have already remarked that the ODMG object model is conformant to OMG’s IDL; therefore the RELATIONSHIP construct is mapped onto a sequence of IDL constructs (i.e. read-only attributes, methods to form and maintain a relationship, and exception flags).

5.3.2.6 Operations & Exceptions

Each type has a range of behavioural aspects associated with it. Each of these is implemented by a method. And each method has a name, a type signature for its arguments and return values, plus an explicit indication of which exceptions the method’s implementation may raise during its execution. Also, each argument may have a qualifier (e.g. IN, OUT and INOUT) that indicates whether the value of the argument is read only, return only, or read and return. The programming-language bindings provide the language through which the implementation details are coded to provide these aspects of the method’s signature. The following examples show two methods. The first method, EXPECTED_BONUS, returns a sum (of type float) for an employee, given his performance as expressed in the ‘level’ argument and other instance details. If the method has insufficient details a named exception is raised, and consequently an exception handler can cater for the occurrence. The second method, RETIRE, requires a termination date, in the ‘end_date’ argument, for the computation of the behaviour. An employment-termination process is an involved procedure and therefore a number of distinct exceptions might be flagged.

Methods that don’t return values are qualified with the VOID keyword.

Remark: part of an interface for employee
Remark: describing methods

CLASS employee (EXTENT employees) {
  Remark: some other properties
  float expected_bonus(IN short level)
    RAISES (bonus_not_applicable);
  VOID retire(IN date end_date)
    RAISES (employee_already_gone, employee_retire_date_incorrect);
};

There is no syntactic distinction between methods that have an effect on a database state and those that don’t.

The following example shows how EMPLOYEE_ALREADY_GONE is defined and adorned with properties (i.e. the employee’s already recorded date of termination). The properties of an exception are a concise approach for passing values between the exception-raising computation and the error handler.

Remark: part of an interface for employee
Remark: defining exceptions

CLASS employee (EXTENT employees) {
  Remark: some other properties
  EXCEPTION employee_already_gone {date when};
};

There is another aspect of methods that needs telling. The ODMG claim that “the object model assumes sequential execution of operations” (i.e. in section 2.7 of [CATTE00]). Consequently the object invoking a method is blocked until the receiving object has completed its execution. Yet a look at the relevant part of ODMG’s ODL grammar reveals the ONEWAY method-definition qualifier. If an object sends a message to a “one-way” method then the calling object is not blocked (according to OMG’s IDL, a one-way method should be understood as having “a best effort semantics of message passing”). Also, such one-way methods have structural restrictions, specifically no return types, no OUT and INOUT argument modes, and no exception-raising capabilities. This explanation is completely missing from ODMG’s object model specification.

5.3.2.7 Data Dictionary (or Metadata Hierarchy)

The application program’s data requirements in a database environment are recorded in the schema definition, as we have seen in the first chapter. The ODMG calls this the metadata and it is stored in the ODL Schema Repository. The ODMG metadata is a superset of OMG’s IDL Interface Repository because of the richer “structures” (e.g. relationships) found in it. The standard specifies the interface of each constituent part of the metadata and provides a terse textual description of the properties and methods. The exact semantics of the operations are missing.

There are thirty-two interfaces with thirty inheritance relationships between them. Most of these inheritance relationships are of the single-inheritance type but four rely on multiple inheritance. The longest depth of the ISA relationship is four (e.g. DICTIONARY / COLLECTION / TYPE / / REPOSITORYOBJECT).

Rather than giving a sequential description of each interface, we present the data dictionary from a different perspective, namely as a description of a database. This description is found in the attached Appendix – ODMG Data Dictionary.

5.3.2.8 Concurrency Control and Transactions

Database updates change its state. In operational databases this is a principal activity. An important abstraction used to describe this activity is the transaction model (a number of related retrievals and updates are grouped as a logical whole), and there are a number of basic principles that guide the meaning and implementation of this model (a common group of principles is ACID – Atomicity, Consistency, Isolation and Durability). Each transaction-model implementation’s correctness, safety, and efficiency are of fundamental importance to all operating databases (our bible here would be Gray and Reuter [GRAYJ92]). Each DBMS (and even transaction-processing monitor) comes with its own implementation variants to cater for different manifestations of data sharing and automated recovery from failure.

The ODMG categorically state that all updates must be made through a transaction. The standard does not specify the type of transaction model (e.g. a nested transaction model) but emphasises that serialisability has to be upheld within the ACID context. If the implementers are interested in distributed transactions then the standard expects the implementer to adhere to the ISO XA standard in respect of this issue.

Transactions are objects for the ODMG standard. To create a transaction a NEW message needs to be sent to the implementation of the TRANSACTIONFACTORY interface (we need to recall that factory interfaces are host programming-language implementations). The transaction becomes an instance of the TRANSACTION implementation and can respond to the usual variety of transaction-processing operations. These include BEGIN, COMMIT and ABORT. As already stated, each update has to be done through a transaction, and therefore these changes are to be made through it; consequently these update operations must first obtain an association with a transaction instance.

Our description of ODMG’s transaction-processing methods has been a high-level one and would appear to leave a wide range of implementation possibilities. This is not exactly so, because the standard states that the object model uses a locking-based approach. The types of locks it requires are the typical “read”, “write” and “upgrade” locks. The standard also gives brief descriptions of implicit and explicit locks. The duration of locks is associated with the transaction termination state (i.e. committed or aborted). The ODMG committee stress that the locks’ semantics and duration are taken from other standards, namely OMG’s Concurrency Control Service and SQL-92’s definition of transaction isolation level 3 [CANNA93] respectively. Unfortunately the link between transaction serialisability and isolation levels is not obvious (see Berenson et al. [BEREN95]).

5.3.2.9 Query Model & Path Expressions

OQL uses the structures and relationships created by the ODL and applies the same type checking to object expressions too.


Path expressions are made from a sequence of attributes and relationships emanating from either a named object or an object identifier. Two types of results can be returned by a path: either a scalar object or a collection of objects. The data type of a path is determined by its last attribute. There is a restriction on the construction of a path expression, namely that a collection type (e.g. set, bag, list) can only appear at the end of a path.

OQL’s query specification follows SQL’s “SELECT FROM WHERE” structure. The FROM clause is made up of a sequence of object collections and has a facility for renaming their instances. Object collections are either a collection path expression or a class. The WHERE clause is a logical predicate that determines which objects from the named collections are passed to the output, that is, the SELECT clause. The WHERE clause allows for nested queries and existential and universal quantification queries. The WHERE clause’s predicate is built from a number of Boolean-returning conditions composed through logical connectives (e.g. AND, OR). The SELECT clause, in its simplest form, returns a bag of structures, and the structure must have at least one attribute. If the query requires other type constructors, like set and list, then the SELECT clause requires additional qualifiers. For example, to return a set of structures the DISTINCT qualifier is added to the SELECT clause. For a list of structures one adds an ORDER BY < attribute name + > qualifier.

Aggregate queries are possible and follow the GROUP BY HAVING syntax found in SQL.
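As a rough illustration of the three result shapes, the Java sketch below (hypothetical data) maps them onto familiar collection operations: a plain SELECT yields a bag in which duplicates are kept, DISTINCT yields a set, and ORDER BY yields a sorted list.

import java.util.*;
import java.util.stream.*;

class SelectShapesDemo {
    public static void main(String[] args) {
        List<Integer> salaries = List.of(20000, 15000, 20000);

        List<Integer> bag = new ArrayList<>(salaries);        // SELECT: a bag
        Set<Integer> distinct = new HashSet<>(salaries);      // SELECT DISTINCT: a set
        List<Integer> ordered = salaries.stream().sorted()    // ORDER BY: a list
            .collect(Collectors.toList());

        System.out.println(bag);        // [20000, 15000, 20000]
        System.out.println(distinct);   // e.g. [20000, 15000]
        System.out.println(ordered);    // [15000, 20000, 20000]
    }
}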

5.3.2.10 Databases

The ODMG standard uses an instance to describe the whole database (i.e. a database instance’s behaviour is described in the DATABASE interface and an instance is created through the NEW operation of the DATABASEFACTORY interface). The database interface is specified by the standard and includes the database administrator’s start-up and shut-down operations (these are called, somewhat confusingly, OPEN and CLOSE). No database access is allowed unless the database is open, and shutdowns are allowed only if no database transaction is in progress. Although admirable, pragmatically this is not usually a good idea. An important set of operations caters for object naming within the database instance (these are BIND, UNBIND and LOOKUP). An important method in the database interface is SCHEMA. The SCHEMA operation associates the database instance with its schema “root” object (i.e. an instance of the module in the data dictionary).


5.3.3 Critique of ODMG’s Object Model and ODL

One basic problem with the standard was that it was a long time in coming. Worse still was the timing of the better version 3 publication (year 2000), as it coincided with a cooling-down period for object-oriented databases.

Another gross disadvantage, and an unfortunate situation, is the standard’s unclear description in a number of parts of the text. This is unfortunate as object-orientation themes did not have an unambiguous nature in either academia or industry. The standard has a large number of missing features (as we have seen in data modelling). This is obvious in other specialised areas too: views, data security, integrity constraints, and triggers are totally consigned to the ‘implementation features and options’ of the vendors.

The standard needs a good polishing up in terms of the transaction model too. There are also some problems in the data typing of expressions which seriously affect the structure and meaning of query expressions, for example.

Ironically the companies that were on the standard committee have not adopted much of the standard other than the programming-language mappings.

5.3.4 EyeDB

Object databases and their management tools started with a number of research prototypes; these include Orion, Iris, Postgres and O2. Currently the stronger commercially available object databases are Actian's Versant {WWW.ACTIAN.COM}, Objectivity {WWW.OBJECTIVITY.COM} and Progress ObjectStore {WWW.OBJECTSTORE.COM}. All of these claim their success is based on adoption in niche application domains and on their performance. Performance rests on writing queries in a programming language (e.g. C++) over their object stores.

In the last ten years a number of store management tool sets for objects have emerged; although they offer persistence, they all lack any adherence to the ODMG data model and language bindings. Examples include: MongoDB {WWW.MONGODB.ORG}, db4o {WWW.DB4O.COM}, Caché {WWW.INTERSYSTEMS.COM/CACHE}, and CouchDB {WWW.COUCHDB.APACHE.ORG}.

Another interesting option, albeit lacking market presence, and the one adopted in this research, is EyeDB; Genethon Laboratories {WWW.GENETHON.FR} were the initial developers. The object database was required for the Genome project, and EyeDB is still very strong in the bioinformatics community. The current version is effectively its third rewrite and is ODMG version 3 compliant [VIERA99] with ODL and OQL. It provides an extensive object model with inheritance, collections, methods, triggers, and constraints. It also offers language bindings to C++ and Java and has a scripting language for coding the object database's methods and triggers.

EyeDB has been available as open source since 2006. Support is available mostly through direct communication with its authors.

5.3.5 ODL & OQL Interactive Session

The following is a sequence of ODL, OQL, and scripting artefacts for an EyeDB-based object collection. The artefacts were executed on the EyeDB CLIs.

Remark Define a class called person
class Person {
  attribute string fname;
  attribute set telno;
  …
  relationship set workson inverse Project::staff;
};

Remark Define a class called student which is a subclass of person
class Student extends Person {
  attribute string stage;
  attribute set results;
  relationship Course* enrolon inverse Course::candidates;
};

Remark Define a class called lecturer which is a subclass of person
class Lecturer extends Person {
  attribute integer salary;
  …
};

Remark Define a class called course
class Course {
  attribute string cname;
  attribute integer cnum;
  …
};

Remark Define an enumerated list for gender domain
enum gender {
  male = 1,
  female = 2
};

Remark Run a script to generate test objects for course
Remark (? is the command prompt)
? for (x in 1000 <= 1999) {
    varname := "course" + string(x);
    nw := " := Course(cname:\"name" + string(x) + "\", cnum:" + string(x) + ");";
    cmd := varname + nw;
    eval (cmd);
  };
? \commit

Remark Extent and deep extent queries
Remark (= is the response prompt and -- starts a remark)
-- Includes instances of the Person class and
-- all its subclasses

? count(select x from Person x);
= 1390

? select x.class.name from Person x where x.fname = "fn 6002";
= bag("Student")


-- Find Person instances that are instances
-- of class Lecturer
? select x.fname from Person x where classof(x) = "Lecturer";
= bag("fn 5025", "fn 5022", …

? count(select x from Person x where classof(x) = "Person");
= 350

? count(select x from Person x where classof(x) = "Lecturer");
= 10

Remark Simple queries to check test objects
-- count is an aggregate function
? count(select Course);
= 1000

-- course10 is a variable that points at an object
? select course10.cname;
= "name10"

-- first is a function that returns the 1st
-- instance of a collection
? first(select Course.cname);
= "name183"

Remark Controlling the output data type

-- return a bag of …
? select Student;
= bag(8831.3.17414:oid, 8803.3.1005572:oid, …

-- return a set of …
? select distinct Student;
= set(8831.3.17414:oid, 8803.3.1005572:oid, …

-- return a list of …
? select s.fname from Student s order by s.fname;
= list(8831.3.17414:oid, 8803.3.1005572:oid, …

-- return a set of structures
? select distinct struct(student:s.fname, level:s.stage) from Student s;
= set(struct(student:"stud 11", level:"s"), …

Remark Selection queries with path expressions
-- conjunction
? select p.fname from Lecturer p where p.addr.town = "town 511" and p.salary > 45000;
= bag("fn 5050", "fn 5037")

-- disjunction
? select p.fname from Lecturer p where p.addr.town = "town 511" or p.salary > 45000;
= bag("fn 5050", "fn 5011", … "fn 5037", "fn 5041")

Remark Projection query
? select bag(s.fname, s.enrolon.cname) from Student s order by s.fname;
= list(bag("fn 5048", "f"), bag("fn 5112", "f"), bag("fn 5260", "f"), …

Remark Implicit join query
-- find which students and persons share
-- the same town but not name (the last
-- condition rids us of spurious instances)
? select struct(cons:s.fname, emp:p.fname, town:p.addr.town)
  from Student s, Person p
  where s.addr.town = p.addr.town and s.fname != p.fname;
= bag(struct(cons: "fn 4013", emp: "fn 4000", town: "town 500"),
      struct(cons: "fn 4006", emp: "fn 4019", town: "town 506"), …


Remark Query over the data dictionary – direct subclasses of class Person
? select struct(cn:x.name, cp:x.parent.name) from class x where x.parent.name = "Person";
= bag(struct(cn: "Student", cp: "Person"), …)

Remark Stored procedure: recursively traverse the class hierarchy for subclasses
-- definition
function subclasses(cn)
{
  clist := bag();
  cclist := bag();
  cmd := "clist := (select x.name" +
         " from class x" +
         " where x.parent.name = \"" + cn + "\");";
  eval cmd;
  if (clist[!] = 0) {
    return clist;
  } else {
    cclist := clist;
    for (c in clist) {
      cclist += subclasses(c);
    }
    return cclist;
  }
};

-- invocation of function
? select subclasses("Person");
= bag("Student", … )

Remark Stored procedure: generate a random digit
-- definition
function rnddigit()
{
  dseq := 0;
  dseq := (select time_stamp::local_time_stamp()).usecs;
  s := string(dseq);
  l := s[!];
  return s[l-1];
};

-- invocation of function
? select rnddigit();
= '9'

5.4 Summary

Interest in object-oriented databases started in the early Nineties with much diversity in the data models, and clearly an alignment effort was required. In this chapter we have synthesised a data model that provides a basis for an object-oriented database with a practical and effective selection of features. Our object-oriented data model has classes, the ISA relationship, data types, and objects composed of a value, an identity and an instance-of relationship. We emphasise that a class's inheritance links and its data-type links are two separate relationships.

A section was dedicated to path expressions; these constructs abbreviate object navigation expressions. Path expressions implement what are called implicit joins over the database schema. Advanced path expressions can be adorned with universal quantification and pattern matching.

The standardisation process by ODMG finished in the early Noughties; its object and query model was a focal point in this chapter. The final version is a good standard. We have given a critical look at its data model and how it is used as a basis for its query model. The main critical points were: it took a long time to produce; its quality took too long to pick up; and although adequately thorough in its object and query model, it lacks precision and completeness in substantial parts (e.g. views, integrity constraints). Also some features, for example relationships, need more and wider capabilities.

EyeDB is an open-source OODBMS with a good level of compliance with ODMG's ODL and OQL. This level of conformance is hard to find and was the main reason for its adoption here. EyeDB is used to load schemas generated by a framework we developed that reads an EERM and creates a sequence of ODL and EyeDB constructs (e.g. integrity constraints not catered for in ODMG) to implement the design.

During the progression of this chapter a number of side references were made to other technologies that have a basis in one of the object-oriented database features presented; e.g. XPath and path expressions.

Nonetheless we still have to resolve a basic requirement made earlier on, i.e. to have a declarative and a procedural language that operates on our synthesised object-oriented data model. In the following two chapters we show how a deductive database caters for the declarative query modelling; furthermore, the same logic can describe an array of structural constraints. Once the logic is in place we can present our framework and an algebra that operates on an object-oriented database.

Chapter 6

Deductive & Object-Oriented Databases (DOODs)


Having defined a class-based object-oriented data model, we need a calculus that can serve as its basis. Since the eighties a large body of work has developed on adding logic programming to the relational model, i.e. Datalog; such databases have been named deductive databases. Through Datalog a number of important results have been established: these include methods of evaluation, query complexity analysis, and algebraic equivalences.

Furthermore various "additions" to Datalog, for example function symbols and logical identifiers, have been proposed.

A logic that has attracted our attention is called F-logic [KIFER95]; it offers first-order semantics, a number of axioms for object-oriented features, and the possibility of logic programming. The implementation of F-logic adopted here is Flora-2 [YANGG08].

Another important requirement for the calculus selected is to represent and enforce database integrity constraints, which are described in chapter seven.

6.1 Deductive Databases and Datalog

It was indicated earlier that the relational data model comes with a number of query languages. One example is the relational algebra, which is a "constructive" equivalent of the relational domain calculus and tuple calculus; either calculus is a declarative language. These two declarative languages are a sub-set of first-order logic, which has been in use for many years.

As regards our synthesised object-oriented data model, we have yet to introduce a "logic" for it. This is an important aspect, as it will enable us to describe and specify 1) database structural and behavioural rules, 2) access patterns to the database (e.g. queries), and 3) programs. There is a real shortage of logics that cater for an object-oriented data model in the way relational calculus caters for the relational model. Despite some work in this area, there is still a need for a query logic that truly captures the flavour of the object-oriented paradigm. Broadly speaking this area came to be known as deductive object-oriented databases (DOODs) – clearly trying to create a synergy between deductive databases and the object-oriented paradigm.


During the nineties there were a reasonable number of proposals, and even prototypes, of these logics but no clear "winner" materialised. Describing the proposals at a meta-level, the spectrum spreads from extending a known model (e.g. relational) with object-oriented features, to the axiomatisation of object-oriented themes into the semantics of the language. Each strategy has its advantages and drawbacks.

The proposal adopted here comes from Kifer, Lausen and Wu and is called F-logic (or Frame Logic) [KIFER95]. F-logic can be placed at one extreme of the above spectrum; i.e. F-logic has object-oriented data modelling features entrenched in its formal semantics. Major influences on F-logic are Ait-Kaci [KACIA94], Maier [MAEIR86] and HiLog [CHENW89]. Parts of F-logic have been implemented in a number of systems, and in particular in two projects: the first is called Florid [MAYWO00], from the University of Freiburg, and the second is called Flora-2, from Stony Brook University. Beyond the database field, F-logic also has overlaps with Artificial Intelligence.

6.1.1 Logic and Databases

In the very first sentence of Gallaire and Minker's seminal compilation on logic and databases [GALLA78] they state that "Mathematical logic has been applied to many different areas, including that of databases". Indeed the authors project this classical reference, based on the then recent development of relational logic, onto database-related problems. In a later compilation by Minker [MINKE87], the editor explains that automated theorem proving not only unifies but also propels databases and logic programming. One of the differences between these two references is the enabling of database design with computational programs (i.e. expressed in logic). Effectively, deductive databases became a reality and have been with us since then.

6.1.2 Logic Programming

The high interest and activity in automated deduction systems culminated in the 1960s with Robinson's [ROBIN65] publication of the resolution rule as a sole rule of inference. Being a single rule of inference, it is well suited to a computerised system.

An excellent example of the workings of the resolution rule is found in a tutorial hosted at { WWW-RCI.RUTGERS.EDU/~CFS/305_HTML/DEDUCTION/RESOLUTIONTP.HTML }, maintained by Charles F. Schmidt. For a given first-order knowledge base it proves, through iterative resolution, that "Marcus hates Caesar".

Shortly afterwards Kowalski [KOWAL74] and Colmerauer [COLME96] introduced logic programming, whose inference system is based on the resolution rule. The first logic-programming prototype was conceived at Marseille by Colmerauer and Roussel; this prototype is the basis of Prolog.

Kowalski described a logic program as a collection of clauses. These clauses are usually a sub-set of first-order predicate calculus sentences (or rules) and have the following form:

A <- B1, B2, … , Bn.

The symbol A is called the head of the rule, and the Bi's are a conjunction of predicates, collectively called the body of the rule. The meaning of the rule is straightforward: if an instantiation of the body's predicates through the rule's variables evaluates to true then, as a consequence, we have established the truth of the head predicate.

Each of these clauses can have a procedural interpretation. A program is invoked through a goal (a rule without a head):

<- G1, G2, … , Gk.

Each clause in the goal (i.e. Gi) is similar to a procedure call. If we are evaluating Gj then we need to find either a predicate (a rule without a body) or a rule with the same name, unify (a process that passes argument values or selects data), and continue evaluating the goal. If the goal predicate Gj is unified with A above, then the 'new' goal is:

<- G1, G2, … , Gj-1, (B1, … , Bn)θ, Gj+1, … , Gk.
Remark: θ is the unification substitution.
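A small worked step, as a sketch, using the grandparent rule that reappears later in this chapter:

Remark: given the rule grandparent(X, Y) <- parent(X, Z), parent(Z, Y).
Remark: and the goal:
<- grandparent(eric, W).
Remark: unifying the goal with the rule head, θ = { X/eric, Y/W }, yields:
<- parent(eric, Z), parent(Z, W).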

The evaluation terminates when the empty goal is produced. Typically the logic programmer specifies what the problem is and the 'in-built' control specification solves the program. Although most logic programming control is through resolution theorem proving, there are some alternatives. The realisation of Kowalski's "Algorithm = Logic + Control" was there for all to see [KOWAL79].

6.1.3 Deductive Databases

Once logic programming became a reality and the relational data model had an established formal foundation, it was inevitable that deductive databases would be built on these two cornerstones.


Furthermore it was immediately apparent that the relational query language could not describe some basic database queries, for example recursive union, whereas logic programming does oblige. Finally some researchers claimed that the use of one language (i.e. for data and query modelling and procedural computation) would address the nagging 'impedance mismatch' problem; Maier's work is credited with introducing the term [MAEIR86].

For the record we need to state the meaning of a deductive database. A deductive database is a collection of clauses; these clauses are first-order predicate calculus formulae. A clause can describe a fact, a rule to derive or compute data, or an integrity constraint. Facts are typically called the database extension (EDB), while derivations through rules are called the database intension (IDB). There is actually nothing different in the structure of logic programs and deductive database clauses; at this point one has to substantiate the differences between deductive databases and logic programming.

6.1.3.1 Differences between DD and LP

These differences are not absolute, but do indicate the relative placing of the two techniques.

• In deductive databases the number of facts heavily outnumbers the rules. This is not in general the case in logic programming. Consequently there is a shift in the optimisation techniques implemented in each respective field (e.g. in databases the management of data is predominant while in logic programming the efficiency of inference takes first priority). Yet another facet of this is that in databases one tends to map different physical implementations (e.g. indexes) to items having different characteristics (e.g. access patterns or integrity constraints).

• In deductive databases the role of integrity constraints is significant. In logic programming these techniques are almost non-existent.

• The structure of the response to goals is different. In deductive systems one expects a set-at-a-time method while in logic programming one expects an instantiation-at-a-time method (i.e. the result of a resolution-based refutation procedure).

• In databases, the goals (or queries) are typically meant to find patterns that actually match, rather than all possible candidates that can be generated to match.

• In logic programming we expect to use certain artifacts, e.g. functions and incomplete data structures, freely, whereas these are not generally allowed in deductive database languages. The implication of this is profound: logic programming can compute more queries than deductive database languages (e.g. Datalog).

• When executing programs over a deductive database the underlying schema also determines what the programs mean. In logic programming the set of constants, functions, and predicates in the program suffices.

6.1.4 Datalog

One of the most researched languages in deductive databases must be Datalog; the term was introduced by Maier and Warren in [MAIER88]. Basically Datalog started as a simplified version of Prolog with database-friendly features. Datalog is a rule-based language over relational predicates (i.e. value based), but functions are not allowed as arguments, which is not the case in Prolog. The semantics of Datalog programs is formalised and well defined.

Since its inception Datalog has been used as a yardstick against which to compare theoretical properties, and also as a reference language in which to state significant results. We therefore need to describe the language, associated evaluation techniques and some results. This exposition is based on Ullman's textbooks [ULLMA90] and Ceri, Gottlob and Tanca's survey [CERIS89]. A good and useful implementation is DES (the Datalog Educational System) by F. Sáenz-Pérez of Universidad Complutense de Madrid { WWW.FDI.UCM.ES/PROFESOR/FERNAN/DES/ }.

6.1.4.1 Datalog – the language

A simple program in Datalog is (the line numbers on the left are not part of Datalog):

1. anc(X, Y) <- parent(X, Y).
2. anc(X, Y) <- anc(X, Z), parent(Z, Y).
3. parent(X, Y) <- father(X, Y).
4. parent(X, Y) <- mother(X, Y).
5. father(eric, mario).
6. mother(alice, simon).
7. mother(alice, mario).


Lines 5, 6 and 7 denote facts, with the last line relating that 'alice is the mother of mario'. Lines 1, 2, 3, and 4 are rules through which other facts can be inferred. Rule 4 says that if 'X is the mother of Y' then 'X is a parent of Y'. Note that the scope of a variable is the rule, and multiple occurrences of a variable in a rule imply that the same constant is substituted at each occurrence.

A query is an expression; the "?-" symbol is used prior to the query as a command prompt.

For example the following tries to find objects together with their mother and their father.

?- mother(X,C), father(Y, C).

A possible response to the query would be X=alice, Y=eric and C=mario.

6.1.4.2 Syntax

The basic elements of Datalog syntax are constants, predicates, variables (starting in uppercase), and terms (i.e. a variable or a constant). Also the usual logical connectives (e.g. 'and', 'or', 'not', and 'implication') and quantifiers (universal and existential) are used.

A Datalog term is a constant or a variable.

A Datalog well-formed formula (wff) is defined inductively as follows:

• If p is an n-ary predicate and t1, … , tn are terms then p(t1, … , tn) is an atomic wff.

• If A and B are wffs then so are the following:
  not A;  A and B;  A or B;  A <- B;  A -> B;  A <-> B.

• If A is a wff and X is a variable then so are the following:
  ∀X (A)   Remark: X is bound within A.
  ∃X (A)   Remark: X is bound within A.

Those wffs that contain no variables (e.g. facts) are called ground wffs. A wff is closed if every variable in the formula is quantified; conversely, if some variable in a wff is not quantified then the wff is not closed.

A clause in Datalog is a closed wff of the following form:

∀X1, … , ∀Xn ( A1 or … or Ak or not B1 or … or not Bl )
Remark: where the Ai and Bi are atomic wffs.

A definite clause is a Datalog clause with one positive atomic wff and zero or more negative atomic wffs (another name for these is Horn clauses). In this structure, called a rule, the positive wff is the head and any negative wffs form the body. A rule with an empty body is sometimes called a unit clause, and a unit clause without any variables is a fact. A definite clause program (or positive Datalog logic program) is a collection of definite clauses.

6.1.4.3 Semantics

It is interesting to note that there are three different, but equivalent, ways to describe the semantics of a definite clause program: the model theoretic, the proof theoretic, and the operational. (Ullman in [ULLMA90] also mentions ad hoc semantics (e.g. as in Prolog) but quickly loses interest in it.) In essence the model theoretic account provides a declarative meaning to a program, and the operational semantics provides a coupling between a program's evaluation (i.e. bottom-up) and deductive databases. These are the two semantic descriptions of interest in this study. For completeness, the proof theoretic interpretation is related to SLD-resolution and top-down program evaluation.

The model theoretic semantics, often called Tarskian semantics, has two main aims: the first is to enumerate the objects (or individuals) comprising the domain of discourse and their interrelationships; and the second is a mapping from a language's symbols to those objects in the discourse. Consequently the semantics of the language depends both on the world one is trying to represent and on how the constants and predicate symbols in the syntax correspond to individuals and properties of the world. This is the interpretation. We are interested in interpretations that make a Datalog program true, in which case the interpretation is called a model of the program. To give formal cladding to these ideas the following steps are required.

Given that L is our language (e.g. Datalog) and P is a positive Datalog program then we initially select a non-empty set of elements U, called the domain of interpretation. Then an interpretation of L is defined as:

1. For each constant in L, an assignment of an element in U.

2. For every predicate p in L, an assignment of a mapping from Un into {TRUE, FALSE} (where n is the arity of p).

It is important to state that for definite programs it suffices to assume that constants represent themselves in the interpretation (an observation due to Löwenheim, Skolem and Herbrand); these particular interpretations are called Herbrand interpretations. The Herbrand universe for L, denoted UL, is the set of all terms that can be built from L (if L is devoid of constants then one introduces a single constant). The Herbrand base of L, denoted BL, is the set of atoms that can be generated by assigning objects of UL to the arguments of predicates found in P. For a Datalog program P, the Herbrand universe, UP, and the Herbrand base, BP, are, respectively, defined as UL and BL of the language L whose constants and predicates are identical with those appearing in P.

It quickly becomes apparent that the number of possible interpretations is astronomical. In our example, with just four predicates (each having two arguments) and four constants, the size of BP is 64 (i.e. 4 * 4 * 4). Since any subset of BP is an interpretation, there are 2^64 Herbrand interpretations.

Let us now consider the ground instances of a rule, e.g. p, in P. If ground(p) represents the ground instances of a rule p (obtained by assigning constants from UP to the variables in p) then the set ground(P) denotes all the program's rule ground instances with constants assigned from UP.
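For instance, one element of ground(P) for rule 4 of the example program is:

Remark: rule 4 with the substitution X=alice, Y=mario
parent(alice, mario) <- mother(alice, mario).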

Given the enumeration of P through ground(P), one is able to check whether the elements of ground(P) hold in an interpretation of P. If a ground rule's body is in the interpretation and so is its head, the ground rule is said to be satisfied; otherwise it is not. An interpretation that satisfies all rules in P is called a model of P. The following interpretations of the program used earlier give a tangible idea of these definitions.

Remark: interpretation one
I1 = { father(eric, mario), mother(alice, simon), mother(alice, mario) }.

Comment: Is I1 a model? Rules 5, 6 and 7 are satisfied. But what about rule 4? Its body is in this interpretation but its head is not. Therefore I1 is not a model of the above P.

Remark: interpretation two
I2 = { father(eric, mario), mother(alice, simon), mother(alice, mario),
       parent(eric, mario), parent(alice, simon), parent(alice, mario),
       anc(eric, mario), anc(alice, simon), anc(alice, mario),
       anc(simon, mario) }.

Comment: Is I2 a model? Rules 5, 6 and 7 are satisfied, and rules 3 and 4 are easily verified too. With some perseverance rules 1 and 2 are also seen to be satisfied by I2. Note that the interpretation item anc(simon, mario) does not contradict any rule or imply that parent(simon, mario) should be present – which in fact it is not. I2 is a model of the above P.

Remark: interpretation three
I3 = I2 – { anc(simon, mario) }.

Remark: I3 is a model of the above P.

Clearly, there can be many interpretations of a program that are models. A useful result is that the intersection of two models is another model. A further observation (resulting from this result) is that there are models from which subtracting any element leaves a set that is no longer a model. For example, taking any element away from interpretation I3 destroys its status as a model of the program; models with this property are called minimal models. Furthermore, if there is a model that is contained in all other models then it is called the least model (for our example program the least model is in fact I3). At this point one can state an important result for positive Datalog programs: every program has a least model (one can follow the proof in [ULLMA90]).

6.1.5 Queries and Safe Answers

Having compiled a program (e.g. a deductive database) whose interpretation is a model, it is natural to expect an interaction with the model. Some of these interactions are called queries. A query allows the system to verify whether, or not, a clause is implied by the model of the program. For a closed query the response required is true or false. For example (using Prolog's style for queries, ?query) the query "? father(eric, mario)." would return true. In other queries, where variables are introduced in the clause, the response includes all the facts that match the query. An example of an open query is "? father(X, Y).", which would return all matches for variables X and Y in the model. Note that variables in queries are implicitly universally quantified and their scope is the query clause.

Queries present us with an intriguing problem related to the finiteness of their answers. Can we have a query that returns an infinite response? The direct and correct answer is yes. The use of built-in predicates can easily make a query answer infinite. Common built-in predicates over the infinite integer domain include equality of integers, greater than, and greater than or equal, and these are not usually represented by finite relations.

Also representation is difficult, as our predicates' arguments are not sorted (e.g. typed to an integer) and therefore the domain of each argument has to be made clear. An example of a query that generates an infinite response is "? X >= 16." (i.e. find instances that are greater than or equal to 16). On the other hand the query "? age(peter, X), X >= 16." does not return an infinite response. Another subtle way to generate infiniteness is to introduce a rule that has a free variable in its head but not in its body. An example from Ullman [ULLMA90] is the rule "loves(X, Y) <- lover(Y)." – all the world loves a lover. The relation loves is infinite if variable X ranges over an infinite world.

One solution is to disallow variables in built-in predicates or rule heads that are not anchored in a program's predicates (i.e. an atomic wff). Queries and rules that have all their variables anchored (or limited) are safe. The standard practice for identifying safe rules is to use syntax-oriented constraints on a program's rules. These constraints are (as Ullman specifies [ULLMA90]):

"1. Any variable that appears as an argument in a predicate of the body is limited;
2. Any variable X that appears in a sub-goal X=a or a=X, where a is a constant, is limited;
3. Variable X is limited if it appears in a sub-goal X=Y or Y=X, where Y is a variable already known to be limited."

6.1.6 Datalog Evaluation: Datalog to Algebra

The model-theoretic semantics provides an understanding of the meaning of a Datalog program. This understanding is closely associated with the declarative meaning; consequently, from a processor's point of view, we would like a mapping from declarative constructs to procedural constructs to enable evaluation of the program. A likely target is the relational algebra. A straightforward algorithm translates a simple sub-set of Datalog programs into algebraic expressions (whose input is an EDB) that evaluate the program.

Let us describe the sub-set of Datalog programs for which we can provide an algebraic evaluation. The limitations are: positive Datalog programs; each variable appearing in a rule is limited; and no intensional predicate (i.e. rule) is recursive. For these types of programs our model theoretic semantics, proof theoretic semantics, and the algebraic computation coincide. Indeed the Datalog programs in this category are simple (note the example above is recursive and therefore not applicable to the following conversion).

The general outline of the evaluation algorithm depends on initially re-ordering the predicates in a program by their evaluation dependencies, which are easily extracted by examining each predicate's sub-goals. For example the evaluation of predicate PARENT (in the above example) depends on having evaluated its sub-goals, i.e. the predicates MOTHER and FATHER. The EDB predicates, like MOTHER and FATHER, do not depend on any other predicate. Some predicates depend on themselves to be evaluated (i.e. are recursive), predicate ANC being one such; this is the reason we restrict this evaluation method to non-recursive programs. Through this dependency relation one can build a graph, called a dependency graph, in which each program-defined predicate is represented as a node and each dependency relation is depicted by a directed edge. The dependency graph of the example program is presented in figure 6.1.

Figure 6.1 – Predicate dependency graph: anc(x,y) depends on parent(x,y), which in turn depends on mother(x,y) and father(x,y); negated sub-goals are adorned with a negation symbol.

Working up the predicate dependency graph of a program, one starts by identifying rules that depend only on predicates that have already been evaluated. For each such predicate one extracts from the program all the rules in which the predicate is the head. For each of these rules we form a relation that is the natural join of the sub-goals' relations (which, from the dependency graph, we know have already been evaluated). During this process one also needs to restrict the extent of the new relation through any constants or built-in predicates found in the sub-goals. The next step is to project, from the natural join created, the attributes required by the rule head. If the predicate being evaluated has a number of rules then one must union the evaluations. Some technical details are left out here, but the interested reader can refer to the whole procedure in Ullman's textbook [ULLMA90] (algorithm 3.1, p. 109, vol. I).

How do we evaluate the query parent(x,y)?

eval(parent(x,y)) == UNION(mother(x,y), father(x,y)).

Assume we have another rule to introduce the predicate grandparent(x,y); the rule and its query evaluation follow:

grandparent(x,y) <- parent(x,z), parent(z,y).

eval(grandparent(x,y)) ==
    PROJECT([1st,4th], SELECT([2nd=3rd], PRODUCT(PARENT,PARENT))).

Remark: 2nd and 3rd represent the second and third columns of the PRODUCT result.

It is relatively easy to show that this algorithm produces the facts that can be proved from the database and that these facts (i.e. IDB and EDB) correspond to the unique minimal model.

6.1.6.1 Recursive Datalog Evaluation

We now need to consider an algorithm to compute the minimal model of a positive and recursive Datalog program. It is not hard to accept that the IDB rules are a basis for a constructive build-up of the IDB relations. This observation, together with our previous evaluation algorithm for non-recursive programs, forms the framework of the required computation. Understandably the crucial point of this framework is that in a recursive rule a predicate is mentioned both in the head and in the body. The technique used to evaluate recursive rules is the fixpoint mapping. In the fixpoint evaluation of a recursive rule we start with an empty relation, execute an assignment from the EDB to the IDB, and then derive new facts for the recursive relation from the new assignments until no new facts are derived through the recursive rule. This technique is usually referred to as the naïve evaluation and is attributed to Chang [CHANG88]. The pseudo code that follows is from [ULLMA90].

remark: let us assume that the Datalog program has k EDB relations R1, …, Rk
remark: and M IDB relations P1, …, PM
FOR I := 1 TO M DO
    Pi := 0;                  -- set each predicate extent to empty
REPEAT
    FOR I := 1 TO M DO Qi := Pi;
    FOR I := 1 TO M DO Pi := eval(pi, R1, …, Rk, Q1, …, Qm);
UNTIL Pi = Qi FOR ALL I, 1 <= I <= M;
OUTPUT Pi's
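A round-by-round sketch of this loop on the earlier ancestor program follows (the grouping of derived facts per round is illustrative):

remark: round 1 -- PARENT filled from the EDB; ANC still empty
PARENT = { (eric,mario), (alice,simon), (alice,mario) }
remark: round 2 -- rule 1 now fires on PARENT
ANC = { (eric,mario), (alice,simon), (alice,mario) }
remark: round 3 -- rule 2 finds no parent(mario,_) nor parent(simon,_);
remark: no new facts are derived, so the fixpoint is reached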

The fixed point of a program, given an EDB, together with the derived IDB relations, forms a model of the program. It is a well-known result that positive Datalog programs have a unique minimal model and this coincides with their unique minimal fixed point. The important bridge at this point is the requirement that the naïve evaluation of a program relative to an EDB does reach a fixed point, and that this fixed point coincides with the minimal model of the program.

To show that the naïve evaluation converges to a fixed point we need to establish that each call to the evaluation round is monotonic (the output is not smaller than the input) and that there is an upper limit on the number of rounds. The relational algebraic operations of 'union', 'select', 'project' and 'product' are monotonic (but the 'diff' operator is non-monotonic). As the algebraic expression built by the evaluation round is based on these operators, the evaluation call per round is monotonic. To establish that there is indeed an upper limit to the number of rounds, one can use the number of constants and predicates in the program to establish the limit's magnitude (each increment's facts have arguments drawn from the program's constants, and the number of predicates and their arities are known). The final point of the proof is establishing that the fixed point reached is indeed the least fixed point.

The naïve evaluation approach is greatly improved by the semi-naïve evaluation, which basically prunes the search space in each round to the newly derived facts (i.e. facts generated in the previous round); Bancilhon is credited with its introduction [BANCI85]. A totally different approach is the magic set approach [BANCI86], which combines bottom-up and top-down evaluation by passing binding information sideways and rewriting the original program to reflect this. Also a control algorithm is required to drive the evaluation: should the result be produced a set at a time or a fact at a time?
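A sketch of the semi-naïve idea for the recursive anc rule follows; the Δ notation is ours and not part of Datalog:

remark: join only the facts that were new in the previous round
Δanc'(X, Y) <- Δanc(X, Z), parent(Z, Y).
remark: then anc := union(anc, Δanc') and Δanc := diff(Δanc', anc)
remark: become the inputs to the next round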

6.1.7 Extending Datalog with Negation

Certain rules (or queries) are not representable through Horn clauses. A particular class of these involves the use of negation on a predicate. Through negation a rule can express inferences on exceptions and on differences (e.g. universal quantification). Strictly speaking a Horn clause does not allow negated predicates in its body. Datalogneg is the language of Datalog programs that allows negation in rule bodies.

A rule with a negated predicate that could be added to the above example is:

true_anc(X, Y) <- anc(X,Y), not parent(X, Y).

An algebraic equivalent would be:

true_anc(X, Y) := diff( anc(X,Y), parent(X,Y)).

Another algebraic expression that evaluates the same rule is:

true_anc(X,Y) := product( anc(X, Y), comp_parent(X,Y) ).
Remark: where comp_parent(X,Y) is the complement of relation parent with
respect to the constants appearing in the discourse (i.e. U); therefore
comp_parent(X, Y) = diff( product(U,U), parent(X, Y) ).

Although complementation seems innocuous, it does create problems for our evaluation algorithm if the complemented relation has an infinite extent. Furthermore not all rules with negated sub-goals can use the complementation approach (note that predicates ANC and PARENT are union compatible and all their variables are restricted). One of the first problems to consider is rules with variables instantiated only in negated sub-goals. An example of such a rule is:

Remark: predicates male and married are EDB relations
bachelor(X) <- male(X), not married(X,Y).

The complementation of married is all pairs of objects that are not married to each other. Projecting the males out of this complement finds males who are not married to some individual, rather than the intended males who are married to no one; hence the rule needs re-writing.

A good fix is to re-write the rule as:

remark: re-writing bachelor
husband(X) <- married(X, Y).
bachelor(X) <- male(X), not husband(X).

An important restriction is therefore to prohibit, in Datalogneg programs, rules with variables appearing only in negated sub-goals. Given the above re-writing technique, this is not an impossible restriction to achieve.

Still, not all is clear with Datalogneg. It is a well-known result that such programs might have a number of minimal fixpoints but no least fixpoint. The choice of which meaning is correct becomes critical. Over the years it has become customary to choose a minimal fixpoint derived from yet another restriction on Datalog programs: Datalogneg programs with stratified negation. One has to point out quickly that the fixpoint chosen from a stratified Datalogneg program is not necessarily a least fixpoint.

What is a stratified Datalogneg program? A program's rule is stratified whenever there is no path (in the program dependency graph) between the head and its negated sub-goal(s). Negated sub-goals in a dependency graph are represented as for any sub-goal, but the edge is adorned with the negation sign. There are a number of algorithms to test for stratification, and some are more interesting as they also produce a number of rule strata. The stack of rule strata is ordered in such a way that a rule with a negated sub-goal is in a higher stratum than its sub-goal's stratum. Consequently, ascending the order of the strata allows our rule evaluation to interpret negated sub-goals as if they were EDBs.

We are now in a position to give a framework for evaluating Datalogneg programs. It is reasonable to assume we are evaluating a program whose rules are safe and stratifiable. We effectively use the strata to guide our evaluation of the program; as each level depends only on the current stratum and the lower ones, the evaluation of each stratum through the naïve evaluation is possible. There is of course the requirement to evaluate a negated sub-goal. Toward this end we define a set of symbols, say DOM, to be the union of the symbols appearing in the EDB relations and the program. If pi is the negated sub-goal (i.e. not pi), we know that pi has been evaluated in a lower stratum, and therefore the n-way product of DOM (where n is the arity of pi) less the tuples of pi is the required complement of pi. This hinges on the fact that the rules are safe; therefore no new constants can be generated and all variables appear in a non-negated sub-goal. The fixpoint reached through this evaluation is called the perfect fixed point.
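To make stratification concrete, here is one valid stratification of the rules met so far in this chapter (a sketch):

remark: stratum 0 (EDB): father, mother, male, married
remark: stratum 1: parent, anc, husband  -- no negated sub-goals
remark: stratum 2: true_anc, bachelor    -- each negated sub-goal
remark:            (parent, husband) sits in a lower stratum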

6.1.8 Extending Datalog with Functions

There are a number of advantages to be had if our Datalog language is allowed to use functions to recursively define terms. For example Datalogfunc allows us to identify objects through their components rather than through explicit constants. As a result our domain of objects need not be finite, nor fixed for each program prior to evaluation. Another use of functions is to build data structures (other than tuples); an excellent example is the list.


The list data structure requires functions and recursive Datalog. Of course, introducing functions has a bearing on the safety of Datalog programs.

The syntax of Datalog has to be extended to accept functions. In a Datalog wff with functions, the terms are now defined as a constant, a variable, or a function application (i.e. f(t1, … , tn) where the ti are terms and f is the function's name), as sketched below.
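For instance, the list structure can be sketched with a function symbol cons; the predicate names here are conventional rather than taken from this thesis:

remark: lists are built from the constant nil and the function cons
list(nil).
list( cons(H, T) ) <- list(T).
remark: membership over such lists is recursive
member(X, cons(X, T)).
member(X, cons(H, T)) <- member(X, T).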

The semantics of Datalog has to be extended with an interpretation for functions: in Datalogfunc there is a mapping for each n-ary function from Un to U.

The evaluation of a Datalogfunc program has its basis in the evaluation of Datalog presented earlier. It is interesting, though, to show how simple Datalogfunc programs offer some difficulties. For example, the following program defines an infinite relation. This program's evaluation never ends, as the fixed point is never quite reached. Nonetheless we can ascertain that this relation is well behaved in terms of the fixpoint evaluation; in fact the naïve evaluation can be "fixed" to allow the instantiation of a solution in finite time (e.g. testing whether a tuple is in the relation, i.e. "?- int(succ(succ( … (succ(0)) … )))").

int(0).
int( succ(X) ) <- int(X).
Remark: succ is a function (from integer to integer) that maps its argument to its successor.

The basic technique used for upgrading the naïve evaluation is "term matching", which is required between a sub-goal's variables and ground atoms. This term matching is used to instantiate objects during the naïve evaluation rounds. In essence the naïve evaluation takes a Datalogfunc program, whose rules are safe, and the EDB predicates. On invoking the algorithm, if the result is finite then the least fixed point is found. If the least fixed point is infinite, the algorithm produces an infinite sequence of approximations with the fixed point as their limit.

6.1.9 Datalog and Relational Algebra

We have seen that the relational algebra is not able to express transitive closure [AHOAL79]. On the other hand Datalogneg can do what the algebra does and more. Yet there are some queries that Datalogneg cannot express, and using Datalogneg+func shrinks this class of queries further.


6.2 Algebras as a Target for Declarative Languages

There has been a long-standing interest, since the relational calculus and algebra, in mapping declarative programs, e.g. Datalog programs, into algebraic constructs. A number of reasons exist: the first is that the algebra can express the semantics of the Datalog program; the second is that after translating a program into an algebraic expression, query rewriting and optimisation become possible. In the following section we describe the evolution of database algebras; furthermore, chapters eleven through thirteen describe this research's effort to supplement declarative queries with algebraic ones.

6.2.1 Relational Algebra, Nested Relational Algebras & Object Algebras

An algebra is a well-known mathematical structure defined through a pair: a set of objects and a set of operators. Every operator has its own set of algebraic properties; these include commutativity, associativity, and distributivity.

In most database retrieval algebras the operators take either one range of objects or two. Database algebras are procedural. The kinds of algebraic properties an operator has affect how it may be manipulated during query rewriting (which is essential in query optimisation).

Codd's relational algebra is considered to be the watershed for database languages [CODDE70, CODDE72]. The algebra has five operators at minimum, works over a relational data model, and is closed as it produces a relational table for each algebraic expression. The five basic operators are union, diff, product, project, and select. A number of other operators, like intersect and division, are compositions of the listed five. The language constructs possible from relational algebra are a sub-set of the computable functions: the algebra is therefore computationally incomplete. The algebra is equipollent to and convertible into relational calculus, which is a declarative language.
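As an illustration of such compositions, the standard identity for intersect is:

intersect(A, B) == diff(A, diff(A, B)).
Remark: division can likewise be composed from the five basic operators.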

After Makinouchi's [MAKIN77] work on un-normalised relations, other algebras based on the nested model started to appear in the mid-eighties. One cluster of papers appertains to Schek and Scholl in [SCHEK85], [SCHEK86] and [SCHOO86]. The data model was nested relational (see figure 3.2 in chapter 3) and the operators supplement the relational ones with nest and unnest. The algebra was closed within the nested relational model. Over this data model, Thomas and Fischer's seminal paper [THOMA86] introduced the algebraic properties of composing nest and unnest operators.

The group led by Roth and Korth (references [ROTHM87] and [ROTHM88]) used the nested relational model too. A good technique found in their operators is the recursive unravelling of nested tuples to retrieve the inner tuples of a nested relation, for example in union and in projection. Also the predicate in their select operator allows for set comparison. The Roth et al. papers include a declarative language.

A pioneering database algebra for the complex object model came from Abiteboul and Beeri [ABITE95]. The data model uses the set and tuple structure constructors without any restriction on their ordering. The database is value based and all sets are homogeneous. The algebra had the power-set operator and a replace operator (i.e. it applies a restructuring or mapping operation on an indicated range of values).

One of the first object algebras was by Shaw and Zdonik [SHAWS90]. The data model included object identity, inheritance, and methods. The operators are influenced by the nested algebras, but an influence from functional programming is also evident. The introduction of functional programming, which is very effective for restructuring objects, requires that some of the algebra's operators are higher order. The operators include: select, image, project, union, difference, flatten, duplicate elimination, coalesce duplicates at a level of nesting, nest, and unnest. These authors have been criticised for having higher-order features in a first-order system; i.e. many of the algebras presented were not higher order [SUBIE98].

An early and thorough proposal was that of Straube and Ozsu [STRAUB90]. In their work there is a calculus, an algebra, and query processing. Their data model has basic object-oriented features but depends on higher-order operations.

An extensive object algebra is the AQUA proposal from Leung et al. [LEUNG93]; the researchers behind it had already contributed to other algebras. AQUA anticipates supporting a wide array of data type constructors over its explicitly stated set and multi-set constructors. AQUA comes with twenty operations; for example, there are separate operators for join, outer join, and tuple join. Furthermore one can pass a function as the join condition between two ranges of objects. It also includes an aggregation operation.

One of the neater object algebras is QAL by Savnik et al. [SAVNI99]. Again its origins are the relational and functional models. A novel introduction is that the apply operator takes a path expression as an argument, used as the query range rather than a class extent. This can eliminate a number of project and unnest operators from a query expression. Also the range of the operators can include the 'data dictionary' collection too. QAL operators include: union, difference, select, tuple, close, apply and 'apply at', nest and unnest, and group. A similar approach, based on a many-sorted algebra, has also been developed by Lellahi and Zamulin in [LELLA01] and [ZAMUL02].

An interesting trend is that some of these algebras have more recently been re-packaged for use with XML databases. Buneman et al.'s SIGMOD paper [BUNEM96] set an early path into this area.

6.3 F-logic

Are there any object-oriented extensions to Datalog? Indeed there are. An excellent example is Abiteboul and Kanellakis's IQL [ABITE89] proposal, whose basic characteristics include Datalog, negation, object creation, and inflationary fixpoint semantics (rather than the perfect model semantics above). In general the connection between logic programming and the object-oriented model was made through these support mechanisms: complex terms (in logic programming) to complex values (in object-orientation); predicates and their extensions to classes and their instances; object identifiers in predicates as labels to object identity; object-identifier-based sharing to object-based sharing; and unification through term generalisation for inheritance. One problem is that the "encoding" blurs the distinction between the paradigm's features and the implemented program.

Yet another approach is to couple (loosely or tightly) a logic programming language and an object-oriented language. An excellent example of this genre is the proposal by Dinn et al. [DINNA95] (and later a working version) for ROCK & ROLL. Here the data model is based on a semantic data model; ROLL allows rule specification while ROCK is a procedural language. This project is a success on a number of points: usage, implementation, and the extension of ROCK & ROLL's capabilities (Active and Spatial additions exist). A criticism usually levelled at this approach is how well its extensions match; there could be an impedance mismatch between the "internal" components.

As anticipated earlier in this section, there is yet another approach to defining deductive object-oriented databases, whereby basic notions of object-orientation are entrenched (or canonised) in the underlying language. From the onset, one must state that a tangible difficulty is to include such features (e.g. object-oriented themes and variations) while remaining within the confines of first-order logic theory (i.e. technically the semantics are first order).

F-logic offers objects, object identity, ISA relationships, methods (scalar or multi-valued), non-monotonic inheritance (structural and behavioural), and typing. Schema and objects in F-logic are homogeneous. The syntax is higher order but the semantics are first order (in fact, for F-logic a model-theoretic semantics and a sound and complete proof theory exist). Another proposal along this path is Orlog [JAMIL92]. The general problems of this approach are: remaining within first-order semantics; evaluation efficiency; and the complexity of using the language (e.g. by application programmers).

6.3.1 F-logic: the Language

The F-logic language's (F-L) alphabet comprises:

• The set of object constructors, F, which play the role of functions. In F-logic, functions of arity 0 are constants, while functions of greater arity denote larger terms built from simpler ones;

• An infinite set of variables, V (we follow Prolog's practice of using an uppercase initial to identify these more easily);

• A number of auxiliary symbols, e.g. ( , ), and the arrows ->, ->>, *->, *->>, => and =>>;

• The customary set of logical connectives and quantifiers: ∨, ∧, ¬, <-, ∀, ∃.


An id-term is a term (as in Datalog) and is composed of functions and variables. The Herbrand universe, U(F), is the set of all ground terms; these ground terms are the logical object identifiers.

In F-L we have molecules (rather than atomic formulae, i.e. collections of facts) with which to build wffs. The structure of molecules follows.

i) An ISA assertion takes the form:

Remark: object denoted by id-term b is a subclass of object denoted by id term a

b :: a.

Remark: object denoted by id-term o is an instance of object denoted by id term a.

o : a.

ii) An object molecule for an object denoted by id-term o takes the form:

Remark: 1: the object will respond to the methods available in the list
Remark: 2: there is no implied order in the list

o[ a semi-colon delimited list of method expressions ].

iii) A number of method expressions are possible.

A scalar (i.e. single headed arrows) and set (i.e. double headed arrows) non-

inheritable data expressions are represented as:

Remark: 1: scalar_method, set_method, Ai, R, Si are id-terms Remark: 2: the Ai are the method's arguments and R is the result for scalar methods, while the result of a set method are Si id-terms

scalar_Method@(A1, …, An) -> R where n >= 0 set_Method@(A1, …. , An) ->> { S1, … , Sm} where n >= 0, m >= 0 An inheritable method allows F-logic to pass a method through the ISA relationship

(i.e. :: and : ). An inheritable data expression are represented as (i.e. add the star to

the single and set symbols):

scalar_Method@(A1, …, An) *-> R where n >= 0 set_Method@(A1, … , An) *->> { S1, … , Sm } where n >= 0 and m>=0

Scalar and set-valued signature expressions are represented as:

Remark 1: the id-terms Ai denote the types of the arguments
Remark 2: the id-terms Ti denote the type of the result, and if m > 1 then the return object has to have all the listed types
scalar_Method@(A1, …, An) => ( T1, …, Tm ) where n >= 0 and m >= 0
set_Method@(A1, …, An) =>> ( T1, …, Tm ) where n >= 0 and m >= 0


After considering the basic structures in an F-molecule it is worth emphasising some other points on how these are built. An object's attributes are given through data expressions: i.e. details on an object denoted by id-term "peter" could be given as:

peter [ name -> "PETER" ; sex -> male ; work -> manager ; tels ->> {2503, 2131} ].

The signature of persons, ISA and instance-of assertions could be given as:

person [ name => string ; sex => sextype ; work => jobtype ; tels =>> int ].
peter : employee.
employee :: person.

For the data type constraints to be satisfied (we shall see later what we actually mean here) we need to add the following instance-of facts:

male:sextype. female:sextype. manager:jobtype. other:jobtype.

If we replace the signature molecule with the following and re-evaluate:

person [ name => string ; status *-> employed ].

Then the following query will confirm that "peter", still being an instance of person, has inherited the employed status:

?- peter [ status -> employed ].

If we fire the following query, notice how F-logic responds (two solutions are offered rather than a single set):

?- peter [ tels ->> N ].
N / 2503
N / 2131

If "peter's” details are changed (and the program is then re-evaluated) and then we fire the same query it returns false. This is because the methods defined for an object over-ride the inherited ones. Also to note is that through the instance relationship the type of molecule changes from an inheritable to a non-inheritable type.

peter [ name -> "PETER" ; status -> unengaged ].

F-molecules are allowed to have a nested structure. For example:

peter [ addr -> petehome [ str -> "main"; city -> "cosmo" ] ].

The following query would bind “cosmo” to variable Y:

?- peter [ addr -> X [ city -> Y ] ].

Clearly this technique goes some way towards implementing complex objects. (In a later sub-section, where F-logic predicates are introduced, one can combine F-molecules and predicates to represent more naturally a wide variety of complex objects).

An interesting example is the following wff data definition (i.e. with id-term "atr" repeated):

o[atr -> a1; atr ->> {a2,b2}; atr *-> a3].


How are rules specified? The following rule derives a new attribute for any person who shares a phone with another person (possibly including herself):

P[shares@(T)->>P2] <- P:person , P[tels->> T] , P2:person , P2[tels->>T].

The commas are used for denoting conjunctions between F-molecules. It is also appropriate to note that id-terms denote either an object (an entity instance) or a method. The context is determined by the placing in the wff and the type of arrow.
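As a small hedged illustration (assuming only the facts about "peter" given earlier in this section, where peter is the sole person loaded), the rule lets us ask who shares a given telephone:

?- peter [ shares@(2503) ->> P2 ].
P2 / peter

Here peter trivially shares the phone with himself; with further person instances loaded, each co-owner of 2503 would be returned as an additional binding for P2.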

F-logic's wffs (or F-formulas) are built in terms of other F-formulas. F-formulas are therefore generated inductively: molecular data expressions are F-formulas; F-formulas combined with the logical connectives are F-formulas; and F-formulas built with quantifiers are F-formulas (e.g. if F is an F-formula and X a variable then ∀X F is an F-formula). A literal is an F-formula or a negated F-formula.

6.3.2 F-logic's Semantics

The standard reference on F-logic [KIFER95] explains the semantic structures through a model-theoretic approach; in the same reference these are called F-structures. An interpretation for the F-logic language is a ten-tuple. The first attribute is the domain of the interpretation, U. The second is an irreflexive partial order on U that captures the subclass relationship (i.e. :: ). The third is the binary relationship that models the instance-of relationship (i.e. : ). There is a relationship between these two structures as it enforces the deep extent relationship (i.e. if a is an instance of b and b ISA c then a is an instance of c). It is important to mention that there are no other restrictions on the instances of the ISA structures. This allows, for example, an object to be an instance of itself (e.g. useful in AI for depicting the typical object). A fundamental point to understand with F-logic is that in the "a is an instance of b" assertion, b is not a subset of U but an element of U denoting a set. The claim that F-logic has a first-order semantics is thus satisfied. The fourth attribute interprets each n-ary object constructor (i.e. the functions) – just like first-order predicate calculus [ENDER72]. The other six attributes of the interpretation of F-logic deal with method expressions (i.e. the different arrows).

While the deep extent relationship is entrenched in the interpretation, the relationship between a method and its signature is not; it is defined elsewhere (i.e. at a meta level through type-correctness rules). It suffices to note, at this point, that methods have a functional form and that the structures for scalar and set method signatures map to the properties of Cardelli's semantics of multiple inheritance [CARDE88].

As in predicate calculus, variable assignment is straightforward. It is defined as a mapping v from the set of variables V to the domain U; this extends to id-terms.

How are F-molecules satisfied? A molecule T[ … ] is true under an interpretation I given a variable assignment v if object v(T) in I has the properties of molecule T[ … ]. This is succinctly represented as: I |=v T[ … ]. Of course this notion is specialised for each type of F-molecule. Also the meaning of F-formulas with logical connectives, for example φ ∧ ψ, is defined as: I |=v φ ∧ ψ if and only if I |=v φ and I |=v ψ. The meaning of F-formulas with quantifiers is: I |=v ∀X φ iff I |=u φ for every u that agrees with v (except possibly on X).

An interpretation I of F-logic is a model of a closed formula φ if and only if I |= φ.

6.3.3 F-logic and Predicates

F-logic can simulate Datalog. One technique reported in the reference paper is to encode an n-ary predicate p as instances of a class p. Specifically:

p(T1, …, Tn)[arg1 -> T1; …; argn -> Tn] and p(T1, …, Tn):p.
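As a brief hedged sketch (the predicate and constants are ours, not from the reference), the Datalog fact parent(tom, bob) and a rule over it would be encoded as:

parent(tom, bob)[arg1 -> tom; arg2 -> bob].
parent(tom, bob):parent.
grandparent(X, Z):grandparent <- parent(X, Y):parent , parent(Y, Z):parent.

The id-term parent(tom, bob) acts as the logical object identifier of the encoded tuple, so the predicate's extension is exactly the set of instances of class parent (the argument molecules of derived tuples would be generated by analogous rules).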

The report also shows how predicates can form part of the interpretation of F-logic programs. Basically we introduce a set of predicates, P, and a meaning for these in the language interpretation (i.e. an eleventh entry). With this introduction Datalog and logic programming (as presented earlier) become a subset of F-logic.

6.3.4 F-logic's Proof Theory

It is important to state that F-logic has a sound and complete proof theory for logical entailment of F-molecules (the satisfiability requirement defined earlier). The theory has a high number of inference rules – twelve, plus an axiom, in all. Logic programming's resolution, factoring and paramodulation are present here too. The higher number of inference rules is due to the number of features entrenched in the semantics of the language.

6.3.5 Logic Programming with F-logic

In a previous example we have shown how F-logic can use rules to derive the value of an object's attributes (e.g. the shares@(T) rule). It is evident then that Horn clauses over F-molecules are allowed. If an F-logic program's rules are Horn clauses then the model intersection property holds (i.e. there is a least model). If the rules are recursive then the evaluation technique shown for Datalog (the use of the fixpoint operator) is also applicable here and gives the same result.

If the derivation of our F-logic methods requires general programs (e.g. the use of negation) then the coincidence between fixpoints, minimal models and satisfiability is lost. If a program is stratified then the program's evaluation yields a perfect model.

Whereas in Datalog it was possible and acceptable to assume that two objects are equal if they are assigned the same constant, it is not possible to entertain this approach in F-logic. The reason is that id-terms are built from function symbols, and the semantics of the language can force two distinct id-terms to denote the same object. For example, if we have two ISA assertions like person :: human and human :: person then person and human are equal; this is because of the class hierarchy's acyclicity semantics.

One suggestion from Kifer et al. is to split the program into two parts. The first part allows inferences about equality (as in the example of person :: human), while the other prohibits inferences on equality unless it is specified explicitly (i.e. that the id-terms person and human denote the same object).

One last point in this section is about queries. In F-logic a query has the form "?- Q", where Q is an F-molecule. As is common in logic programming, the answer to the query, with respect to an evaluated F-program, is the smallest set of F-molecules found in the program's model. There is a second condition in F-logic that needs attention. Let us re-examine a query and its response we had written about:

?- peter [ tels ->> N ].
N / 2503
N / 2131

If we syntactically change the query into the following we do not get "true":

?- peter [ tels ->> { 2503, 2131 } ].

To solve this we must check our model for closure (i.e. under |= ) with respect to satisfiability. Therefore our queries must be closed with respect to |= .

6.3.6 F-logic and Typing

In F-logic the data expressions (i.e. methods) are functional and there is the possibility of passing arguments and returning either an object or a set of objects. It comes as no surprise that arity polymorphism is allowed in F-molecules. An example of arity polymorphism is the method sal in the following F-molecule:

person [ name => string; sal@(1995) -> 500; sal@(1995, manager) -> 350 ].

We have also seen that an F-logic program can specify a method's signature; but we commented (in the semantics section 6.3.2) that typing enforcement is not entrenched in the semantics. To rectify this, F-logic provides "well-typing conditions" at a meta-level.

The first condition is dedicated to non-inheritable methods and states that every such method has to be covered by a signature found in an object (whose context would be that of a class). If we have the molecule a[m -> r] and an assertion a:c then there must be a signature in c that covers m (e.g. c[m => t]). The second deals with inheritable methods (i.e. *-> ) and its mode is different from the first. If we have a molecule c1[m *-> t] and assertions c1::c and a:c1 then the signature that covers method m in object a is found in c1.

A general F-logic program interpretation is typed if these two conditions are satisfied.
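To illustrate (a hedged sketch re-using the earlier person example, with the instance-of facts in the style of male:sextype given before), the data molecule below is well-typed because the signature in person covers the method name used on peter, and the result is of the declared type:

peter : person.
peter [ name -> "PETER" ].
person [ name => string ].
"PETER" : string.

Had person lacked the name signature, the molecule peter[name -> "PETER"] would breach the first condition; had "PETER" not been an instance of string, the covering signature would not be satisfied.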

In general this result is not strong enough for some F-logic programs (i.e. a subset of general F-logic programs) as static type checking is not possible. For example, if in a general F-logic program there is a rule that can generate new sub-class relationships then static type checking is not possible. Another occurrence of typing errors comes from rules that never fire (e.g. type errors in the body of the rule). Although this is lethal from some aspects (Kifer and Wu state that it is undecidable to type check a general F-logic program [KIFER90]), there are some countermeasures. For example, after program evaluation additional rules can do the required static checks (we can use stratification techniques toward this end). Clearly some way to bridge the gap between logic programming and programming languages in terms of type checking is required (this line of research is attributed to Mishra's [MISHR84] work on introducing type inference in Prolog). We can even take some advantage of this situation. First, the typing rules are written by us and can therefore address a range of typing formalisms. Secondly, we can focus type checking onto a subset of the program's interpretation (e.g. type-check those methods that are dependent on type matching to external object definitions).


We typically need to make two checks to establish type correctness. One is type safety, which checks that there is no method without a covering signature in the interpretation. The other is type correctness, which checks that every signature covering a method is satisfied in the interpretation. Remember, if there are a number of result types specified in a method signature (i.e. m => (A, B)) then the resulting object must be in the extent of each type. A set of rules that check simple methods (scalar and without arguments) follows:

Remark: assume these rules are appended to a general F-logic program
type_safe_sm(O, M, R) <- O[M -> R] , O:C , C[M => ( )].
type_unsafe_sm(O, M, R) <- O[M -> R] , not type_safe_sm(O, M, R).
type_incorrect_sm(O, M, R) <- O[M -> R] , O:C , C[M => D] , not R:D.
Remark: after the final re-evaluation fire the following queries
?- type_unsafe_sm(O, M, R).
?- type_incorrect_sm(O, M, R).

The typing modality in F-logic is therefore to apply typing rules to a meaningful (but untyped) program and then check for safety and correctness. The general approach is to consider the F-program's signatures, introduce type inference rules (for untyped methods) and then do type checking by evaluation. Effectively, one writes rules and queries to type check.
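For example, one possible inference rule (our own, illustrative of the approach rather than taken from the reference) derives a missing signature from the data expressions present – every result observed for a method on instances of a class is taken as evidence of a result type:

C[M => T] <- O:C , O[M -> R] , R:T.

After evaluation, the type-checking rules of the previous listing can then run against the inferred signatures.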

6.3.7 F-logic and Inheritance

F-logic supports both structural and behavioural inheritance, through the construct :: and the constructs *-> and *->> respectively. Furthermore, F-logic's semantics has structural inheritance built in, and its proof theory is sound and complete for the same semantics. The problematic part is behavioural inheritance and its modality (e.g. overriding and multiple inheritance).

A good example of overriding (from [KIFER95]) is describing clyde (an elephant!). Consider the following F-Program:

royalElephant::elephant.

clyde:royalElephant.

elephant[ colour *-> "grey" ; group *-> mammal ].

royalElephant[ colour *-> "white"].

Deductive & Object-Oriented Databases - Page [ 146 ] Object-Oriented Data and Query Models

Since clyde is a member of the royalElephant class it inherits the inheritable method (i.e. colour *-> "white") as a data method (i.e. colour -> "white"), and does not take the inheritance from elephant (i.e. colour -> "grey" is overridden). Also to note is that the sub-class royalElephant does not take elephant's colour inheritable method, but does inherit group as an inheritable method (i.e. royalElephant[group *-> mammal] is true).
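A hedged check of this behaviour, after evaluation, would be:

?- clyde [ colour -> X ].
X / "white"
?- clyde [ group -> X ].
X / mammal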

A classic example of multiple inheritance (also included in Kifer et al.'s reference [KIFER95]) is Nixon's Diamond. Consider the following F-program:

nixon:quaker.

nixon:republican.

quaker[ policy *-> pacifist ].

republican[ policy *-> hawk ].

Having an association with both quakers and republicans, our Nixon is spoilt for choice between the two inheritable methods (i.e. policy). But note that once Nixon accepts a policy he excludes the other (i.e. either pacifist or hawk). Therefore the instantiation of the query "?- nixon[policy -> X]" is determined by which inheritance has been taken first.

In both cases our concern that logical satisfiability becomes non-monotonic is a reality. The F-logic language takes a non-deterministic approach (with the choice criterion being fair). The idea is to choose one of the possible worlds, since each possibility is really a canonical model. This mode of inheritance is known as the "credulous" variety. Other, sometimes less ambitious, regimes have also been introduced by Kifer and others.

To integrate inheritance into a deductive object-oriented database, as presented through F-logic, a number of issues have to be understood and controlled. For example, how are inheritance and logical deduction to interact? Furthermore, how are set-valued data methods or ISA assertions to react in the presence of inheritance? The difficulty of these issues stems from an assertion derived through inheritance causing a chain reaction of other deductions. The scope of inheritance (and its effects) is both the source and the recipient of the inheritance. Consider the following simple program:

a:b.


b[attr *->> c].

b[attr *->> d] <- a[attr ->> c].

Through inheritance a[attr ->> c] is derived. Consequently the rule deduces that b[attr *->> d] is true. If we re-write object b's inheritable attr molecule we get: attr *->> {c,d}. The question is: should object a re-inherit from b? F-logic does not undo the first inheritance, but it does not enable the inheritance of derived set-based inheritable methods. The justification for this decision seems to be twofold: it reduces the amount of backtracking during evaluation, and it yields semantic simplicity (albeit procedural). The formal method proven by the authors is based on two steps: in the first we fire the inheritance occurrence from source to recipient objects and disable the inheritable method; while in the second phase we get the state reached through inheritance into a model. The inheritance firing, or triggering, stops when there are no more inheritable methods in the original program. An important point to emphasise is that inheritance triggering starts after the evaluation finds a model for the F-program less the inheritable methods. In practice the Nixon diamond example becomes amenable in this scenario.

It is an interesting exercise to consider how F-logic can mimic the inheritance modality of other systems. One example would be to model user-controlled inheritance. In this scenario the user (rather than F-logic) decides which inheritance to allow to trigger in the circumstance of multiple-inheritance conflicts. A case in point is Eiffel's explicit multiple-inheritance conflict resolution. Let us assume there is an inheritable method, scalar and with one argument, attached to a number of classes (these are c1, …, cn, and say variable Class ranges over these); then we need a rule that parameterises the method for each of these classes.

Class [ method(Class) @ X *-> Y ] <- Class [ method @ X *-> Y ].

If we have an object (e.g. o) that inherits from c1, …, cn then this object will have inherited all the definitions of "method" – but each is different because of the "Class" parameter. Consequently, a programmer who wants to invoke the method attached to the class identified by c3 will need to write a query so:

?- o [ method(c3) @ X -> Y ].


6.3.8 F-logic Implementation: Flora-2

Flora-2 [YANGG08] is a deductive and object-oriented database (DOOD) development framework. It is based on F-logic, Hi-Log [CHENW89] and Transaction Logic [BONNE94] and is coded in XSB Prolog. Hi-Log and Transaction Logic features are available for meta programming and knowledge-base updates respectively.

Flora-2 implements most of F-logic but with some syntactic changes and a number of additions. The most obvious syntactic difference is that, whereas F-logic distinguishes notationally between scalar and set methods, Flora-2 generalises both into set methods with cardinality constraints. Also, some symbols like ";" and "@" have been replaced to align better with the underlying Prolog system. Path expressions are part of F-molecules and F-formulas. Additional facilities over F-logic found in Flora-2 include aggregates (for example, counting the instances that satisfy a query) and more meta-programming facilities that are very useful in database environments. A useful feature Flora-2 has is the generation of id-terms on demand.
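For example (a sketch in the style of the session of section 6.3.9; the cardinality annotations here are our own illustration), where F-logic would declare a scalar method fname -> and a set method telno ->>, Flora-2 expresses both as set methods with cardinality constraints:

person [ fname{1:1}*=>string, telno{0:*}*=>telephone ].

The {1:1} constraint recovers the scalar (single-valued) behaviour, while {0:*} leaves the method set-valued.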

It is also possible to configure the Flora-2 inference engine prior to an evaluation of a program. For example, one has a choice of options for equality maintenance and inheritance semantics.

One can install Flora-2 on Linux and Windows, and both XSB Prolog and Flora-2 are open- source. Plug-ins for Eclipse IDE and EMACS also exist. XSB Prolog, which is accessible from Flora-2 programs, has access to data managed by a DBMS through data connectivity facilities and protocols.

6.3.9 Flora-2 Session

The following is a sequence of interactions with a deductive database within a Flora-2 environment.

Remark: Identify classes as instances of 'class'
person :class. course :class. dept :class. lecturer :class.

Remark: ISA facts
student ::person. lecturer ::person.

Remark: Instances of person, lecturer and student
barbara :person. betty :person. mark :lecturer. mary :student. michael :student.

Remark: Creating enumerated types
sex [enum-> {'m','f'}].
unitgrade [enum-> {'a','b','c','d','f','abs'}].
assessment [enum-> {'assign','test','exam','interview'}].

Remark: Adding attributes to objects
barbara [fname->'barbara', telno->{t123}].
betty [fname->'betty', telno->{t456}, workson->{proj1,proj2,proj21}].
mark [fname->'mark', telno->{t5,t9}, degrees->{'b','d'}, worksat->eeng,
      teach(2000)->intprog, teach(2001)->intprog, teach(2002)->intprog,
      teach(2001)->javaprog, coord->{intprog,softeng},
      prolead->{proj3,proj11,proj21}, workson->{proj3,proj11}].
michael [fname->'michael', telno->{t9012,t3456}, enrolon->bsc_comp, stage->'f'].
michael [result('c')->dsa, result('c')->softeng, result('c')->pdb, result('c')->intprog].

Remark: Declaring object signatures (including Flora-2 cardinality constraints)
lecturer [degrees*=>string, worksat{1:1}*=>dept, coord{1:3}*=>unit,
          prolead{0:*}*=>project, teach(integer){0:*}*=>unit].
person [fname*=>string, telno{0:*}*=>telephone, workson{0:*}*=>project].
student [enrolon{1:1}*=>course, stage*=>string, result(string)*=>unit].

Remark: What are the instances of 'class'?
?- ?Any:class.
?Any = course
?Any = person
…

Remark: Deep extent query
?- michael:?C.
?C = person
?C = student

Remark: Deep extent query
?- ?Who:person.
?Who = barbara
?Who = mark
…

Remark: A query with an and (,)
?- mary[result('a')->?_U1, result('f')->?_U2].
No.

Remark: A query with an or (;)
?- mary[result('a')->?_U1; result('f')->?_U2].
Yes.

Remark: Path expressions – who got an A in a 2-credit unit?
?- ?X.result('a').credits = 2.
?X = susan

Remark: To whom, and with what type of assessment, was an A awarded in a 2-credit unit?
?- ?X.result('a').credits = 2, ?X.result(?_G).ass = ?T.
?X = susan, ?T = assignment
?X = susan, ?T = exam

Remark: Constraint denial – uniqueness
?- ?P1.fname = ?P2.fname, not ?P1 :=: ?P2.

Remark: Functional dependence
?- ?I1:unit[credits->?V, assessment->?A1], ?I2:unit[credits->?V, assessment->?A2],
   \+ ?I1 :=: ?I2, not ?A1 :=: ?A2.



Remark: Invoke XSB Prolog from Flora-2 using call
?- call(sort([a,s,d],?S)@_prolog).
?S = [a, d, s].

Remark: Aggregate query – degrees per person
?- ?P:person, ?C = count{?D|?P[degrees->?D]}.
?P = linda, ?C = 0
?P = lisa, ?C = 3
…

Remark: Aggregate query – years during which a unit was taught
?- ?C = collectset{?Y[?U]|?U[taughtby(?Y)->?L]}.
?C = [2000, 2001, 2002], ?U = ddb, …
?C = [2000, 2001, 2002], ?U = dsa, …
…

6.4 Summary

In this chapter we surveyed the research area of deductive and deductive object-oriented databases. For many years Datalog has been used as a yardstick and model to study interesting problems related to databases, query processing and recursive query processing.

An interesting feature of this research is the use of algebraic languages to evaluate, and optimise, Datalog programs.

A notable occurrence is that there has been, in the last three years, a resurgence of research in Datalog. Furthermore, industrial applications are adopting Datalog-based systems [HUANG11] in the following areas: declarative networking, data integration, and program verification.

Here we have read about F-logic – a logic that is object-oriented friendly. Technically it has a first-order semantics, a complete proof theory, a resolution-based proof procedure, and a higher-order syntax. Flora-2, an F-logic implementation, appeared later. Interestingly, F-logic has gained popularity not only in database theory and deductive databases but also in, for example, the semantic web. F-logic is an ideal target language for data-modelling languages and for executing declarative queries against an F-logic program (i.e. rules and facts). Given current hardware trends, like very high processing throughput and large arrays of main memory, one can propose F-logic as an exciting option as it offers a strong basis for a deductive database.

Unfortunately this option has shortfalls. For example, sharing is a problem (but it is not of interest in this research). Another problem is data size; inevitably there are databases whose size is greater than the main memory available, and therefore secondary storage is needed. Consequently F-logic queries need optimisation, and it is in our interest to find evaluation methods, at least for a subset of F-logic queries, that can be optimised much like relational tuple calculus is converted to relational algebra and then optimised. We present a procedural execution program for a subset of F-logic queries (or ODMG OQL queries) defined on an ODMG ODL data model that is optimisable and for which optimisation is more sensitive to the computational platform available at runtime.


Chapter 7

Integrity Constraints

7 Integrity Constraints

Integrity constraints are rules that any instance of a database structure must uphold. Therefore, integrity constraints are properties of the schema. Integrity constraints also supplement a number of other database-related aspects: for example, state-changing operations over a database have to adhere to and respect these rules.

Here are some common examples of integrity constraints. The first is the basic primary key constraint on a database structure: in a value-based model the constraint requires that the concatenation of a number of attributes' values is unique within any set of instances of the structure. A second example is the check constraint, which deals with what values a structure's attribute can take; a case in point is restricting the grade of a student to one of a list of values (from 0 to 5). In databases, explicit relationships between instances are ubiquitous, and our third example is the referential constraint (or inclusion dependency): each employee instance is related to a department instance. Our fourth example deals with expressing the "business rule" that there can be at most four students with a grade level 5 classification. Finally, our fifth example associates a constraint with a database operation in the following manner: on changing a student's details, their name value must not change – this is an example of a transitional constraint.

Once we have a representation of a database and its integrity constraints, we are in a position to deduce whether a database state violates any integrity constraint.

From a pedagogical perspective it is useful to give a coarse classification of integrity constraints in data modelling (see figure 7.1). The first categorisation is whether the constraint deals with rules that the structure's instances must obey, or whether the constraint is applicable when a database operation is applied in due course and is restricted to the instances affected by the operation. The former are the static constraints, while the latter are the transitional constraints (fifth example above). The operations of interest in the transitional case are the insert, the update, the delete, or any combination of the three.


Integrity Constraints
  Static
    non-aggregate (~): primary key (~), check (~), not null (~), mvd (~), referential (~), sequence of ~
    aggregate (#): count (#), sum (#), others (#), sequence of #
  Transitional (*): insert (*), update (*), delete (*), sequence of *

Figure 7.1: An Integrity Constraints Classification for Data Modelling

The static constraints are neatly divisible into two branches if we classify the integrity rules by whether the test is based on an instance value, or whether the test applies an aggregate function to a collection of instances. The former are the non-aggregate constraints while the latter are the aggregate constraints. Specifically, the fourth example above is an aggregate constraint as it is based on comparing a value against the count of grade level 5 students from a set of instances (i.e. all students).

There is a good variety of non-aggregate static constraints, among which one finds the primary key, the check, and the referential constraints (these are the first, second and third examples given above).

7.1 Integrity Constraints Representation

The leverage of using data-modelling constructs that are representable in a logical language is seen through the capability of describing, and inferring over, data, relationships, queries and constraints. Within this setting we are going to give a representation of integrity constraints through closed formulae of predicate logic.

The non-aggregate static integrity constraints are the easiest to give a formal representation of, and they are also a basis for a presentation of the advantages of being able to depict and reason about databases and constraints. Before we present the logical forms of these constraints we first need some notation and definitions. Let us assume our database is a collection of structures and these are relations – generally depicted as rel(). Each relation is made up of a number of attributes (identified as an offset in the structure, or as Ai) to which a domain dom(Ai) is associated. Also, with each relation we associate a collection of integrity constraints applicable to the relation's instances. A relation's declaration and property, an instance's definition and a set of instances are presented as:

∀x1 … ∀xn ( rel(x1, …, xn) → x1 ∈ dom(A1) ∧ … ∧ xn ∈ dom(An) ).

instance_of_rel ∈ dom(A1) × … × dom(An).

instance_set_of_rel ⊆ dom(A1) × … × dom(An).

7.1.1 Check Constraint

Relation STUDENT is made up of three attributes: NAME, CLASS and DEGREE subject. If we want to "ensure that each student has one classification from a list of values" then we would need to introduce a check constraint on the CLASS attribute (i.e. the second attribute in STUDENT). To represent this constraint we specify:

∀x ∀y ∀z ( student(x, y, z) → ( y = 0 ∨ y = 1 ∨ y = 2 ∨ y = 3 ∨ y = 4 ∨ y = 5 ) ).

7.1.2 Primary Key Constraint

If we take the same relation and we need to specify that "NAME (the first attribute in STUDENT) is the primary key attribute" then we specify:

∀x ∀y ∀z ∀x' ∀y' ∀z' ( student(x, y, z) ∧ student(x', y', z') ∧ x = x' → y = y' ∧ z = z' ).

7.1.3 Not Null Constraint

To enforce the rule that "all students must be assigned to a degree course" (the third attribute of STUDENT is DEGREE) we use the not null constraint. This is specified with the rule:

∀x ∀y ∀z ( student(x, y, z) → ¬ isnull(z) ).

This representation requires clarification, in that checking whether an argument of a predicate is null delves into, and requires, certain implementation details.


7.1.4 Referential Constraint

If we want to enforce the rule that "each student must enrol with at least one department" (the relation DEPT_STUD_LIST is made up of attributes DEPARTMENT name and STUDENT name) then we specify:

∀x ∀y ∀z ( student(x, y, z) → ∃d dept_stud_list(d, x) ).

7.1.5 Functional Dependency and Multi-Valued Dependency

The last type of our generic static constraints is the multi-valued dependency and this, as already mentioned, subsumes functional dependency. Indeed, a primary key constraint is a form of functional dependency – i.e. the key set determines all of the relation's attributes. An example of a functional dependency on the relation student is "each student name and degree determine a single degree classification". The specification of this functional dependency is:

∀x ∀y ∀z ∀x' ∀y' ∀z' ( student(x, y, z) ∧ student(x', y', z') ∧ x = x' ∧ z = z' → y = y' ).

For a presentation of the more general multi-valued dependency it is best to consider an example structure. Suppose we want to represent the fact that a student has a number of contact points and a number of certificates (contact points and certificates in possession are mutually independent). To represent this in a relation, called STUD_C_D, we opt for a predicate with three attributes named NAME, CONTACT and DIPLOMA. For consistency we have to include a tuple for each combination of contact points and certificates held by each student (i.e. introduce redundancy). An integrity constraint specification to support this is:

∀x ∀y ∀z ∀x' ∀y' ∀z' ( stud_c_d(x, y, z) ∧ stud_c_d(x', y', z') ∧ x = x' → stud_c_d(x, y, z') ∧ stud_c_d(x, y', z) ).


7.1.6 Aggregate Constraint

To give a representation of a static aggregate constraint we can use the student relation. A specification of the fourth example is:

∀t ( count( ∃x ∃z student(x, 5, z), t ) → t ≤ 4 ).

This requires some explaining. First we need a digression about predicate logic with function symbols and the satisfiability of such a system: it is a well-known result that satisfiability is then no longer decidable. Therefore our representation avoids using any function symbols. In fact the method of representation above uses the symbol COUNT as an aggregate predicate with the following conditions and meaning (see [DASSK90] for a full discussion and formal justification). Each literal is composable either from first-order literals or from aggregate predicates (e.g. the predicates count, sum, min and max are the typical examples mentioned). The meaning of COUNT(F, t) is that t is the number of different true responses to the query q ← F in a normal database. An aggregate constraint is a closed aggregate formula.
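An operational counterpart of this constraint can be sketched in Flora-2 (a hedged example using its count aggregate, as in the session of section 6.3.9, and assuming student is kept as a three-place predicate); the constraint is violated exactly when the following denial-style query succeeds:

?- ?T = count{?X|student(?X, 5, ?Z)}, ?T > 4.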

7.1.7 Transitional Constraint

Representing transitional constraints also requires a technique to enable harmonisation with the static and aggregate constraints. The introduction or purging of a fact in a database, especially in deductive databases, causes a number of updates coded as a sequence of deletions and insertions. One technique, based on the pioneering paper of Nicolas and Yazdanian [NICOL78], entails the introduction of action predicates adorned with the transitional constraints. For example, to implement the constraint that a student's name cannot change on an update operation we introduce an action predicate called UPD_STUDENT that has six attributes, namely the before and after values of the student's original attributes. This update transitional constraint is specifiable as:

∀x ∀y ∀z ∀x1 ∀y1 ∀z1 ( upd_student(x, y, z, x1, y1, z1) → x = x1 ).
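A hedged operational rendering as a denial (in the style of section 7.3 below, using the Flora-2 inequality test shown in the session of section 6.3.9) would be:

?- upd_student(?X, ?Y, ?Z, ?X1, ?Y1, ?Z1), \+ ?X :=: ?X1.

Any answer to this query names an update that attempts to change a student's name.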

Our direct interest in integrity constraints for data modelling focuses more on static than on transitional constraints, as the latter deal more with transaction processing.


7.2 Integrity Constraints and Databases

The previous examples, although motivated by data modelling, are only a small aspect of how integrity constraints and databases intertwine. In this section we intend to specialise this facet further. The main points are:

i. the huge number of instances relative to the number of constraints makes executing database operations a computationally expensive activity;
ii. the data modelling of relationships is typically more complex than the bare referential constraints shown previously;
iii. the conversion of integrity constraints into database queries of denial;
iv. the scope of an integrity constraint (e.g. is it an attribute, a relation or a schema constraint) and the right place to attach the constraint to.

7.2.1 Efficiency

In a deductive database we expect to have a good number of rules and, for some of the predicates, a huge number of instances. Also, the introduction of a single instance to the database can generate (by applying the present rules) a huge number of logical consequents. Maintaining and enforcing integrity constraints, judging by the procedural equivalent of the declarative representation just given, over a reasonably populated database is a significant computational task. This looks, from a shallow perspective, like an operational problem. In fact, if a predictable multi-dimensional data index is available it goes a long way towards making integrity-constraint enforcement computationally acceptable.

From a more theoretical aspect we can reduce the scope of integrity constraint enforcement to checking only a minimum number of instances (i.e. those instances that are affected by the changes to the database state and the present integrity constraints). This technique is referred to as simplification and is motivated by the fact that if a database is consistent with respect to a set of integrity constraints then, given an update to the database, a set of simplified conditions (derived from the constraints and the update) has to be satisfied. The success of this technique lies in proving that the transformation of the constraints into simplified rules is safe and complete for a given data model's building rules. There are three principal techniques cited in the literature: Nicolas et al. [NICOL78] and Lloyd et al. [LLOYD85] for the theorem view, Kowalski et al. [KOWAL87] for the consistency view, and Aquilino et al. [AQUIL97] for adding new and restrictive rules to the deductive database.
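As a minimal sketch of simplification (the predicate names are ours), an insertion into student need not re-evaluate the check constraint of section 7.1.1 over the whole relation; it suffices to test the inserted tuple:

legal_grade(0). legal_grade(1). legal_grade(2). legal_grade(3). legal_grade(4). legal_grade(5).
check_violation(X, Y, Z) <- inserted_student(X, Y, Z) , not legal_grade(Y).

The simplified condition inspects one tuple instead of the full extent of student, which is where the computational saving originates.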

7.2.2 Data Modelling Relationships

Relationships are an important element of any database. Like any other instances of the database, relationship instances are introduced, updated and purged – therefore a relationship instance has a life-span too. There are a number of relationship types, each with a number of relationship properties, for example a) arity, b) cardinalities and c) participation. The previous exposition of integrity constraints and relationships dealt with referential integrity. Some aspects of relationships are representable through combinations of integrity constraints, including referential constraints.

A basic representation of relationships and their sets of instances follows. Let RLSHP be the name of a relationship, ri one of the relations forming the relationship (the order of the names is used as a placeholder), and ai one of the attributes that appertain to the relationship; also, dompk(ri) reflects the primary key set of relation ri:

∀i1 … ∀im ∀x1 … ∀xn ( rlshp(i1, …, im, x1, …, xn) → i1 ∈ instance_set(r1) ∧ … ∧ im ∈ instance_set(rm) ∧ x1 ∈ dom(A1) ∧ … ∧ xn ∈ dom(An) ).

instance_of_rlshp ∈ dompk(r1) × … × dompk(rm) × dom(A1) × … × dom(An).

instance_set_of_rlshp ⊆ dompk(r1) × … × dompk(rm) × dom(A1) × … × dom(An).

7.2.2.1 Relationship Arity and Attributes

The arity of a relationship reflects the number of relations taking part; in the example it is m. Due to the chosen method of representing relations and relationships, and the varied meaning of relationships, we need to introduce a number of constraints to capture the exact meaning of the relationship under investigation. For example: a) a primary-key constraint on each of the participating relations; b) a primary-key constraint on the relationship relation; c) referential constraints between the relationship and each of the participating relations; and d) other constraints on the relationship attributes – i.e. domain constraints on the n attributes.


Each relationship can have any number of attributes that describe the relationship. Each relationship attribute has to have a domain associated with it, and there might also be some integrity constraints.

In an n-ary relationship there are n cardinalities, one associated with each participating relation. For a binary relationship the possible cardinalities are one-to-many (1-M), many-to-one (M-1), one-to-one (1-1) and many-to-many (M-N). An example of a binary 1-M follows: we have two relations, namely department and student (i.e. DEPT and STUD), and a registration relationship (i.e. REL_STUD_DEPT_REG) between their instances that requires each student instance to be related to one department instance. The structures and integrity constraints for specifying the example relationship follow:

Remark: dept and stud are two relation structures and rel_stud_dept_reg a relationship structure
dept(d, dn). stud(s, sn). rel_stud_dept_reg(s, d, rn).

Remark: primary key (attribute d) constraint for department
∀d ∀dn ∀d' ∀dn' ( dept(d, dn) ∧ dept(d', dn') ∧ d = d' → dn = dn' ).

Remark: primary key (attribute s) constraint for student
∀s ∀sn ∀s' ∀sn' ( stud(s, sn) ∧ stud(s', sn') ∧ s = s' → sn = sn' ).

Remark: primary key constraint for the student & department relationship
∀s ∀d ∀rn ∀s' ∀d' ∀rn' ( rel_stud_dept_reg(s, d, rn) ∧ rel_stud_dept_reg(s', d', rn') ∧ s = s' → d = d' ∧ rn = rn' ).

Remark: referential constraints between the relationship instances and student and department instances
∀s ∀d ∀rn ( rel_stud_dept_reg(s, d, rn) → ∃dn dept(d, dn) ∧ ∃sn stud(s, sn) ).

Remark: the actual relationship constraint that relates student to department instances follows: a student is registered in one department but many students are allowed to register with a department
∀s ∀d ∀rn ∀s' ∀d' ∀rn' ( rel_stud_dept_reg(s, d, rn) ∧ rel_stud_dept_reg(s', d', rn') ∧ s = s' → d = d' ).
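Anticipating the denial form of section 7.3, a violation of this 1-M constraint is located by the query:

← rel_stud_dept_reg(s, d, rn) ∧ rel_stud_dept_reg(s', d', rn') ∧ s = s' ∧ d ≠ d'.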

An example of a binary 1-1 relationship follows: we have the same two relations and a student-representative relationship between their instances, where each department is represented by a single student and one student can represent one department only. The structures and integrity constraints for specifying this example are basically identical to the previous ones, except for the need to add a further constraint to enforce the oneness of the relationship from the department's side:

Integrity Constraints - Page [ 161 ] Object-Oriented Data and Query Models

Remark: primary key constraint (arbitrarily chosen attributes) for the student & department relationship
∀s ∀d ∀rn ∀s' ∀d' ∀rn' ( rel_stud_dept_rep(s, d, rn) ∧ rel_stud_dept_rep(s', d', rn') ∧ s = s' → d = d' ∧ rn = rn' ).

Remark: referential constraints between the relationship instances and student and department instances
∀s ∀d ∀rn ( rel_stud_dept_rep(s, d, rn) → ∃dn dept(d, dn) ∧ ∃sn stud(s, sn) ).

Remark: the actual relationship constraints: a student represents one department and a department is represented by one student
∀s ∀d ∀rn ∀s' ∀d' ∀rn' ( rel_stud_dept_rep(s, d, rn) ∧ rel_stud_dept_rep(s', d', rn') ∧ s = s' → d = d' ).
∀s ∀d ∀rn ∀s' ∀d' ∀rn' ( rel_stud_dept_rep(s, d, rn) ∧ rel_stud_dept_rep(s', d', rn') ∧ d = d' → s = s' ).

As an example of an M-N relationship between the relations student and department we can introduce the relationship of any student being able to register with any department's library. If we re-use the structure and integrity constraints found in the previous 1-M relationship we just need to do the following: a) introduce a relation for the relationship (e.g. REL_STUD_DEPT_LIB) and its relative primary key and referential constraints; b) remove the last constraint, which enforces the 1-M relationship.

To represent a ternary relationship requires more work. An example of a ternary relationship is: "Each student taking a unit sits in only one lecture theatre and weekly time slot, but can be at a different theatre / slot for different units. For a given theatre / slot, a student sits for only one unit. At a particular theatre / slot there can be many students taking the same unit." Basically we have three relations (namely STUD, UNIT and LECT_SLOT) and one relation to enforce this N-1-1 relationship (i.e. REL_U_S_LR_ASG). We need the following constraints: a) a primary key for each relation and candidate keys for the relationship; b) referential constraints from the relationship to each participating relation; and c) integrity constraints that implement the meaning of the relationship.


Remark: unit, stud and lect_slot are three relation structures and rel_u_s_lr_asg a relationship structure
unit(u, un). stud(s, sn). lect_slot(l, lsn). rel_u_s_lr_asg(u, s, l).

Remark: primary key constraints for unit, stud and lect_slot (attributes u, s, l respectively)
∀u ∀un ∀u' ∀un' ( unit(u, un) ∧ unit(u', un') ∧ u = u' → un = un' ).
∀s ∀sn ∀s' ∀sn' ( stud(s, sn) ∧ stud(s', sn') ∧ s = s' → sn = sn' ).
∀l ∀lsn ∀l' ∀lsn' ( lect_slot(l, lsn) ∧ lect_slot(l', lsn') ∧ l = l' → lsn = lsn' ).

Remark: referential constraints between relationship instances and unit, student and lect_slot instances
∀u ∀s ∀l ( rel_u_s_lr_asg(u, s, l) → ∃un unit(u, un) ∧ ∃sn stud(s, sn) ∧ ∃lsn lect_slot(l, lsn) ).

Remark: the actual ternary relationship constraint is expressed through two candidate key constraints
∀u ∀s ∀l ∀u' ∀s' ∀l' ( rel_u_s_lr_asg(u, s, l) ∧ rel_u_s_lr_asg(u', s', l') ∧ u = u' ∧ s = s' → l = l' ).
∀u ∀s ∀l ∀u' ∀s' ∀l' ( rel_u_s_lr_asg(u, s, l) ∧ rel_u_s_lr_asg(u', s', l') ∧ s = s' ∧ l = l' → u = u' ).

Relationships have another important property, which deals with specifying whether each instance of a relationship's participating relation has to take part in a relationship instance or not – total versus partial participation. For a 1-N relationship there are four combinations: a) 1(total)-N(total); b) 1(total)-N(partial); c) 1(partial)-N(total); and d) 1(partial)-N(partial).

In the previous example between the relations student and department, for the registration relationship it is reasonable to envisage that all students have to be registered but not all departments need to have students registered with them. To add the total condition to a relationship we need to introduce the following constraint on the relation that is in total participation: all instances of the student relation have to be registered with at least one department. The constraint in predicate calculus would be:

∀s ∀sn ( stud(s, sn) → ∃d ∃rn rel_stud_dept_reg(s, d, rn) ).

Without any loss of generality this template is applicable to all types of binary relationships. A similar template is also available for ternary, and indeed n-ary, relationships.


7.3 Converting Integrity Constraints into Queries of Denials

To evaluate the static non-aggregate constraints as database queries we need to convert the integrity constraint representation (a closed first-order predicate calculus formula) into a range-restricted denial of the form (note that each Li is a literal and variables are assumed to be universally quantified over the whole expression):

← L1 ∧ L2 ∧ … ∧ Ln.

Examples of the denials for the integrity constraints previously listed follow (these are based on transformation rules given in [LLOYD84]). In the case of the check constraint the integrity rule is converted into a query of denial as so:

← student(x, y, z) ∧ ¬( y = 0 ∨ y = 1 ∨ y = 2 ∨ y = 3 ∨ y = 4 ∨ y = 5 ).

The use of denials is important, for example, during a design process. Assume we have the student relation and some instances that satisfy its basic specification, and we would like to introduce the check constraint. The evaluation of the query (of denial) would determine whether the current state of the relation is already consistent with the specification of the check constraint or otherwise.

For the primary key constraint we have:

← student(x, y, z) ∧ student(x', y', z') ∧ x = x' ∧ ( y ≠ y' ∨ z ≠ z' ).
Remark: which is equivalent to the two queries
← student(x, y, z) ∧ student(x', y', z') ∧ x = x' ∧ y ≠ y'.
← student(x, y, z) ∧ student(x', y', z') ∧ x = x' ∧ z ≠ z'.

As for the functional dependency, conversion into a denial gives:

← student(x, y, z) ∧ student(x', y', z') ∧ x = x' ∧ z = z' ∧ y ≠ y'.

In the case of a constraint specification in which a literal contains free variables, another rule has to be introduced to range-restrict the denial. For example, one of the constraints used to specify a relationship was:

∀s ∀sn ( stud(s, sn) → ∃d ∃rn rel_stud_dept_reg(s, d, rn) ).


The denial of this is:

studs,sn   d  rn rel_stud_dept_reg s , d,rn . Remark: note variable s is free in the second literal. Remark: we need to introduce a rule for this literal, namely

rr_rel_stud_dept_reg(s) ← ∃d ∃rn rel_stud_dept_reg(s, d, rn).
Remark: rewrite the denial as a range-restricted query
← stud(s, sn) ∧ ¬rr_rel_stud_dept_reg(s).

7.4 Other Issues

There are a number of issues regarding the design and implementation of integrity constraints. These include the collective consistency of a set of constraints and the redundancy of a subset of the constraints. Furthermore, integrity constraints, a topic of interest in this research, can provide added knowledge to a query optimisation process. Also of interest in this research is the extraction and encoding of constraints from design models, e.g. EERM.

7.4.1 Where to "attach" ICs

Given the way we have specified integrity constraints (i.e. attached to relations and to relationships, without mentioning objects), although we have specified what we require, we still need to convince ourselves that this is an acceptable representation. Specifically, if we want to uphold the object homogeneity principle (mentioned in earlier chapters three and five) and a modular design methodology, then having integrity-constraint specifications spread over many artifacts does not augur well. Therefore, if object encapsulation is to be respected, then within an object's definition we need to include all of its constraints (static and aggregate, transitional and relationship constraints). This methodology does stretch other important aspects: a) redundancy is on the increase – for example, a binary relationship between two objects requires us to maintain an aspect of the relationship at each end; b) although our representation requires the introduction of another relation for each relationship, it is not an absolute fact that the avoidance of this third relation is computationally cheaper; c) we are maintaining relations as first-class citizens (i.e. objects) but reducing relationships (and their properties) to part of an object's composition.

7.4.2 Intra-IC Consistency

The way we have presented integrity constraints (i.e. a database state must be consistent with a set of the schema's constraints) hides another facet of database consistency: are the current sets of constraints consistent within themselves? In fact, it is reasonable to expect that there exists a database state that satisfies the set of constraints. If no such database state is found, or two constraints contradict each other, then we have a set of unsatisfiable constraints.

Given that we have a logical representation of integrity constraints (through closed first-order rules), the resolution method can be applied to the set of integrity constraints. If the empty clause is resolved then the set of constraints is inconsistent. There are two basic approaches to formalise this satisfiability (one due to Lloyd, while the other, due to Sadri and Kowalski, is more general [SADRI87]). Let D be a database with a consistent completion comp(D). Constraint W is satisfied by D if W is a logical consequence of comp(D); W violates D if W is not a logical consequence of comp(D). If IC is a set of constraints (i.e. of W) then IC is satisfied by D if each constraint in IC is satisfied by D. Note the theory framework is augmented with truth equality predicates.

The following example shows two check integrity constraints in contradiction.

Remark: our example database has a single relation stud with two attributes name (n) and sex (s)
Remark: stud(n, s) – for this example there are no instances!
Remark: there are two integrity constraints on the stud relation structure
∀n ∀s ( stud(n, s) → s = male ).
∀n ∀s ( stud(n, s) → s ≠ male ).
Remark: the denials of the integrity constraints are
← stud(n, s) ∧ s ≠ male.
← stud(n, s) ∧ s = male.
Remark: the completion of the database is
∀n ∀s ¬ stud(n, s).
Remark: apply resolution (SLDNF) to the above expressions (comp(D) ∪ IC)
← stud(n, s) ∧ s ≠ male      ← stud(n, s) ∧ s = male
unifying on stud(n, s) leaves s = male ∧ s ≠ male – a contradiction.

7.4.3 Redundancy of ICs

A set of consistent integrity constraints may still not be optimal (or a minimal set). For example, assume we have the student relation and two check constraints, e.g. STUD(X,Y,Z) → Z=A and STUD(X,Y,Z) → (Z=A OR Z=B); the former subsumes the latter. Basically we would want to avoid having a number of constraints derivable from others. This has a profound effect on the database's operational and computational requirements rather than on the logical consistency of the database state. The general problem is to reduce the original set of constraints into a logically equivalent set. The new set has to have the following properties: a) the original constraints are derivable from the new set; and b) evaluating the new set over the current database state is cheaper.

Although parts of the general problem are easy to solve, the problem itself, we conjecture, is a very hard computational one. Furthermore, note that in general a reduction of constraints is applicable to a particular database state and not to all database states.

7.4.4 When to Apply IC Enforcement

It is expected that the state of a database is subject to frequent changes. This has a number of effects on enforcing the integrity constraints entrenched in the schema. Two effects of special interest are a) the mode and timing of checking a constraint during changes, and b) how computationally efficient enforcement can be.

The first effect deals with whether an integrity constraint is to be checked at the end of a database transaction or throughout the processing of the transaction. For example, if we have a transaction that involves the introduction of a new department instance, a number of new student instances, and some registration relationship instances (some of which involve the new department), and if the referential constraint is to be checked on each instance introduction, then it is apparent that the department's instance introduction has to be taken care of before the related students are introduced. Rather than try to resolve a transaction's dependencies, it is best to reduce the problem of when to check for constraint violation; in this transaction it would be best to check once, at the end of the whole transaction. This is called the deferred approach. Likewise, it is sometimes required that during a transaction some constraints are never broken: a check constraint on a credit limit not being breached is a valid example of an immediate constraint-violation requirement.

7.4.5 ICs in Query Processing

We have already presented the framework in which integrity constraint specification is an integral part of a database schema. Such a specification provides useful input to a) physical storage requirements planning, and b) query processing and optimisation. For example, primary key and referential constraints point at the need for index structures that facilitate the introduction and maintenance of these constraints. As for query processing, we want to use the knowledge about the schema, through structural dependencies, to reduce a query into a simpler (in terms of computational requirements) version of the original.

Let us consider a simple database with one relation, some structural constraints on it, and a query over the same.

Remark: a student (attribute s) from department (d) is enrolled in a unit (u): s_d_tu(s,d,u).
Remark: the primary key set is made of attributes s and u:
    ∀s ∀d ∀u ∀s' ∀d' ∀u' (s_d_tu(s,d,u) ∧ s_d_tu(s',d',u') ∧ s = s' ∧ u = u' → d = d').

Remark: there is a functional dependency by which s determines d:
    ∀s ∀d ∀u ∀s' ∀d' ∀u' (s_d_tu(s,d,u) ∧ s_d_tu(s',d',u') ∧ s = s' → d = d').
Remark: we are interested to compute the following query:
    ← s_d_tu(s,d,u) ∧ s_d_tu(s',d',u') ∧ s = s'.

As the query is currently written, a simplistic implementation is to run a double nested loop over the same relation; in a relational database this is a join between two tables, specifically a 'self join'. If we consider the present functional dependency then we really do not need a join, as the above query reduces into (e.g. a full relation scan):

 s_d_tus,d,u.

This optimisation technique is based on representing the query in tableau form and applying the chase technique over the tableau (the above query's tableau is shown next – see [AHOAV79] and [MAIER79]).

    Step 1                Step 2                Step 3
    T(S, D, U)            T(S, D, U)            T(S, D, U)
    output: b1, b4        output: a2, b4        output: a2, b4
    a1  b1  b2            a1  a2  b2            a1  a2  b4
    a1  b3  b4            a1  a2  b4

In the first step we convert the original query into the standard form: the distinguished variables (those that are used in joins) are denoted by the ai variables, while the non-distinguished variables (or constants) are represented by bi variables. For each introduced literal a row is created; the third literal is absorbed as a join condition between the first two. The variables with the double underscore in the original (rendered here as the output row) represent the query's output variables.

Note that there should be at most one distinguished variable in each column. In the second step we apply the functional dependency (i.e. s determines d) with the chase rule: since the two tableau lines have the same distinguished variable in column s, the chase makes b1 = b3, and therefore b1 = b3 = a2. Since the chase has yielded two rows that are identical in terms of their distinguished variables, the third step reduces these "duplicate" rows into a single line. The third tableau is indeed an equivalent query without any joins.

The chase over the tableau, given a set of dependencies, thus reduces the number of rows (i.e. joins) of a query. Is there a general procedure that guarantees a minimal result? The answer is negative; in fact minimisation is an NP-complete problem. But if we restrict the dependencies to functional ones only, then the chase will yield a tableau which is at most as expensive as the original query. It is important to emphasise that this technique applies to the whole query expression, in contrast with the localised scope of other query optimisation techniques (most of these results are given in, and attributed to, [MAIER83]).

7.4.6 Other Issues – ICs in CASE Tools / Database Design
One would expect to find some of these useful modelling and representation features available in prototypes, and even more so in commercial DBMSs. This is not entirely the case. A case in point is integrity constraints: although the data definition language includes the constructs to specify constraints, there is little or no support for the database administrator or database designer to reason about and optimise the set of integrity constraints attached to a repository. Such an omission affects not only the logical validity of the design but also the performance of the activities over the database.

7.5 Summary

Integrity constraints are an important element in database design. Many of these, especially the static ones, are easy to encode as first-order predicates and to include in denial queries. Their usefulness is not limited to addressing data quality, important as that is; they also aid, for example, query optimisation.

In our study, the extraction of constraints from design diagrams, the inference of constraints from certain design patterns (e.g. n-ary relationships can have an array of functional dependencies), and their encoding in F-logic are issues we address in the forthcoming chapters.


Chapter 8

Framework for an Object Database

8 – Framework for an Object Database Design

In the previous chapters we have synthesised an object-oriented data model, introduced Datalog and logic programming, and given details on database integrity constraints through their specification in first-order logic.

In this chapter we give a specification of, and implement, a framework for an object database that supports the above synthesised data model. In the framework we find both an object base and support structures that maintain the object base. Furthermore, static constraints are specifiable and enforceable by the framework. Toward this end F-logic is used for the design of the framework. F-logic allows describing the structural schema (including integrity constraints), implementing an object's behaviour, and specifying the object base (i.e. the data of the database). Flora-2, a faithful interpretation of F-logic, has strong object-oriented and logic-programming features that are used extensively to build the framework.

There are three aims for this platform. The first aim is the implementation of the object-oriented data model. This entails objects with identity and with their values in NF2 with logical identifiers, classes and object relationships through instance-of, classes and ISA relationships, data-type signatures, methods, and message passing. The second aim is for the framework to read an EERM diagram, validate it, and generate a schema in ODMG ODL (this is the scope of the following chapter). The third aim is to have a basis for the more involved features that an object base requires: e.g. data-type checking and inference, and query modelling.

The chapter uses the data-modelling constructs described in chapters 2 through 5 as a foundation to build an object database. This foundation is also the basis for all of the forthcoming chapters on data and query modelling work.

8.1 Basic Object Database Framework

The framework data requirements are best described top down. There are a few predefined objects to which EERM design artefacts are attached as instances; consequently the framework object base is partitioned through them. These objects' identifiers are: SCHEMA, CLASS, STRUCTURE, BD, DOMAIN, and CONSTRAINT.


The SCHEMA object is composed of a property to hold a schema name and a number of procedures, expressed in declarative Flora-2 [YANGG08]. These procedures take care of mapping schemas for example.

schema[ schemaname -> 'AfterScott' ].

An important object is CLASS, whose instances are the design's classes, implemented with F-logic's instance-of relationship (i.e. ":"). In turn, if an application domain class has instances of its own then these are related to it with an instance-of relationship too.

course :class.      Remark course & dept are classes
dept :class.
math :dept.         Remark math & cs are instances of class dept
cs :dept.

Furthermore, if the application domain's classes have any ISA relationships between them then F-logic's ISA relationship is used (i.e. "::"). As F-logic's ISA allows multiple inheritance, if this is not practised in the application domain design then a further constraint on the ISA between instances of CLASS has to be introduced. It has to be noted that F-logic's semantics implement the directed-graph nature of the ISA relationship.

student ::person.       Remark student ISA person
lecturer ::person.
ptstudent ::student.
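F-logic's semantics propagate instance-of along ISA; a quick check with a hypothetical instance (KIM is not part of the example data):

kim :ptstudent.         Remark a hypothetical part-time student

?- kim:student.         Remark succeeds since ptstudent :: student
Yes.
?- kim:person.          Remark succeeds by transitivity of the ISA relationship
Yes.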

Another object is STRUCTURE, and it is used for building a complex value of an object. As seen in the earlier chapter's description of F-logic, an object's composition is a set of properties, and if a nested structure is required then a property needs to return another object. In this framework the latter object is called a STRUCTURE. A STRUCTURE does have properties and an identity, but it is part of the "holding" object. This nesting is not limited and therefore one can simulate the NF2-with-identifiers structures explained earlier in Chapter 3 (section 3.3.2).

These structures are also useful to implement weak instances, as weak instances actually depend on the controlling object. The following says that JOB is an instance-of STRUCTURE and gives the signature of two of its properties.

job :structure.
job [ jname{1:1} *=> string, jbudget{1:1} *=> float ].

The BD and DOMAIN objects partition the domains that are entertained in a design. In this framework the instances of BD are INTEGER, FLOAT, STRING, and DATE_TIME; these are the basic domains mentioned in previous chapters. Enumerated domains are also entertained: they are instances-of DOMAIN but have an added property called ENUM that holds their elements as a set of strings. In the following example SEX is an instance-of DOMAIN, while the second statement specifies what values are in its enumeration.

sex :domain.
sex [ enum -> {'male','female'} ].

At this point we need to mention properties (i.e. attributes and methods) and their attachment to instances of CLASS through signatures. The following states that object reference DEPT, an instance-of CLASS, has four inheritable attributes, namely DNAME, STAFF, SPONSORS and MAINOFFICE. Their return data types are STRING, LECTURER, COURSE and ADDRESS respectively: the first is a basic domain, the next two are classes, and the last is a structure (i.e. ADDRESS). The braces directive, a slight deviation from F-logic syntax, gives the attribute's cardinality; e.g. "1:1" is one and only one, while "0:*" is any count from zero upwards, which corresponds to a set's cardinality.

dept [ dname{1:1} *=> string,
       staff{0:*} *=> lecturer,
       sponsors{0:*} *=> course,
       mainoffice{0:1} *=> address ].
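A hypothetical instance conforming to the signature above (the identifiers LECT1, LECT2 and ADDR7 are illustrative and not part of the example data):

math [ dname -> 'Mathematics',
       staff -> {lect1, lect2},
       mainoffice -> addr7 ].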

Another object is called CONSTRAINT, and it partitions integrity constraints into a number of generic types: the most obvious are primary key (PKCONSTRAINT), functional dependence (FDCONSTRAINT), and referential integrity (RFCONSTRAINT). All of these are constraints from the application domain, and every generic constraint is expected to have a number of instances that the application domain requires. Any other constraints of the framework, for example enforcing a single-inheritance regime between instances of CLASS, should be implemented as rules instantiating the object CONSTRAINT.

pkconstraint ::constraint.

pkconstraint [classname{1:1} *=> class, attrlist{1:*} *=> string].

pk2:pkconstraint[consname -> 'dept pk', classname -> dept, attrlist -> {dname}].

A useful addition to the framework, recommended by Kifer et al. [KIFER95], is to have a rule that returns an object's identity as a value. This syntactic sugar at times makes expressions more concise. It is also useful in path expressions.

?X[self->?X].
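For example, the following hypothetical query binds the variable to the object itself:

?- math[self -> ?M].
?M = math
Yes.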

In Flora-2 rules and queries any token starting with "?", for example "?X", is a variable, while the token "?-" at the beginning of a line denotes the start of a query, which terminates with a full stop, for example "?- ?X:CLASS.". The token "REMARK" denotes a verbose comment and is delimited by the end of line. The "Yes" denotes successful query evaluation.

A number of 'views' are available in the framework too. For example, the following rule gives instances of CLASS a property, called DIRECTASC, which contains a direct ancestor if one is present. The query then asks which CLASSES have DIRECTASC defined and what the pairs are.

?D[directasc -> ?C] :-                         Remark rule definition for directasc
    ?D::?C, ?D:class, ?C:class,
    if ( ?T:class, ?D::?T, ?T::?C ) then (false) else (true).

?- ?C:class, ?C[directasc->?A].
?C = lecturer,  ?A = person
?C = ptstudent, ?A = student
?C = student,   ?A = person
Yes.

Similar to DIRECTASC is DIRECTDESC, which lists a CLASS instance's direct descendants. This property is useful to define the instantiating class of an object; for example an instance-of PTSTUDENT is also an instance-of STUDENT and PERSON by inference, but an instance-of STUDENT is not a PTSTUDENT. To know the class which created an instance we have an INSTCLASS property, which is defined by the following rule. We reiterate that in F-logic the query A:C implies that A is in the deep extent of C, and the query C2::C1 implies that C2 is a subclass of C1 directly or by transitivity of the ISA relationship.

?I[instclass -> ?C] :- ?I:?C, ?C:class, not ?I:?C.directdesc.
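The rule relies on DIRECTDESC, whose definition is not reproduced here; a sketch symmetric to the DIRECTASC rule above would be:

?C[directdesc -> ?D] :-                        Remark ?D is a direct descendant of ?C
    ?D::?C, ?D:class, ?C:class,
    if ( ?T:class, ?D::?T, ?T::?C ) then (false) else (true).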

?- poca:?C, ?C:class.        Remark object id poca is an instance-of which class?
?C = person
?C = ptstudent
?C = student
Yes.

?- poca[instclass->?C].      Remark poca is an instance-of which instantiating class?
?C = ptstudent
Yes.

8.1.1 Data Requirements of the Framework
The framework objects and their instances are related in a number of ways. The following is a description, and figure 8.1 contains an EERM that depicts these relationships.

Each instance of a class or structure has many attributes, while an attribute must be related to only one instance of a class or a structure. Also, some instances of a class are related by a specialisation relationship, but a class instance can have at most one ancestor. Each attribute has a signature that includes name, cardinalities and return-type details. The return types can be any one of the following: a class, a structure, a basic domain, or an enumerated domain.

A constraint is specialised into primary key (marked PK in figure 8.1), functional dependence (marked FD), referential constraint that is binary and without attributes (marked RC ODMG), and any other referential constraint (marked RC). This EERM ISA relation is disjoint and total.

The primary key constraint instances are related to attributes through a one-to-many relationship (but all attributes must be from the same class or structure).

Either referential constraint is composed of exclusive elements that represent each degree.

For example a binary relationship has two element components whereas a ternary has three.

If the referential constraint is binary and without attributes (i.e. RC ODMG) then each element is related to a class attribute (i.e. the foreign key) and to the class it points to. Each element has properties for its cardinality range, lower and upper limits, and an arbitrary sequential number.

In the case of the other referential constraints (i.e. RC), the elements are related to structures through a many-to-one relationship and carry two additional details: the first is the attribute in the structure that acts as a foreign key, and the second is the class or structure it points to. Also, each element has its cardinality range, lower and upper limits, and an arbitrary sequential number.

In the case of an n-ary referential constraint with attributes the entity (i.e. RC) is related with a weak relationship to many attribute signatures.

The functional dependence (i.e. FD) between attributes is implemented with a many-to-many relationship adorned with an attribute that indicates whether each attribute is on the left or right of a functional dependence.
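The FDCONSTRAINT signature is not spelled out in this section; a sketch of what such a constraint and an instance might look like (the attribute names LHS and RHS are hypothetical):

fdconstraint :: constraint.
fdconstraint [ classname{1:1} *=> class,
               lhs{1:*} *=> string,          Remark attributes on the left of the FD
               rhs{1:*} *=> string ].        Remark attributes on the right

fd1:fdconstraint [ consname -> 'example fd',
                   classname -> dept,
                   lhs -> {dname}, rhs -> {mainoffice} ].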

8.1.2 Logic Programming
Logic programming, based on resolution inference, is available in Flora-2. The following example highlights its use to develop our object-database framework. As an example, an implementation of an integrity-check denial is described; the constraint deals with our insistence that any instance of a class has to be an instance of at most one class. The main part of the denial checks for an object instance (?I) being related (i.e. by ":") to two objects. The latter objects are distinct class instances that are not related through the ISA relationship (i.e. "::"). If such instances exist then the method collects and reports them and finally fails.

Figure 8.1 – High-level Data relationships for Object Database Framework
(EERM diagram relating CLASS, STRUCTURE, ATTRIBUTE (SIGNATURE), DOMAIN, ENUMERATED DOMAIN, and the CONSTRAINT subtypes PK, FD, RC and RC (ODMG), together with their ELEMENT and PART components and HAS/RETURN relationships.)

constraint[%ic_schema_no_mio] :-
    write('-- ic schema no multiple instance-of ')@_prolog,
    writeln('relationships for instances of classes:')@_prolog,
    ?C1:class, ?C2:class, ?C1 != ?C2,
    not(?C1::?C2), not(?C2::?C1),
    if (?I:?C1, ?I:?C2)
    then ( ?L = collectset{?I | ?C1:class, ?C2:class, ?C1 != ?C2,
                                not(?C1::?C2), not(?C2::?C1),
                                ?I:?C1, ?I:?C2 },
           writeln('IC Broken! Problem with: ')@_prolog,
           writeln(?L)@_prolog,
           false )
    else ( writeln('OK!')@_prolog ).

The following session script shows how one can use this method to sift out such instances.

andy:ptstudent.         Remark object andy is an instance of
andy:unit.              Remark ptstudent and unit
javaprog:project.
javaprog:unit.

?- constraint[%ic_schema_no_mio].
-- ic schema no multiple instance-of relationships for instances of classes:
IC Broken! Problem with:
[andy, javaprog]
No.


8.2 Schema in F-logic

An example schema is required to encode into the object database framework. The application domain deals with a university setup that caters for courses and projects. Colloquially we call this schema 'afterScott'. The main requirements follow, and a graphical interpretation is given in figure 8.2.

The main players are lecturers, students and part-time students. These are related with two disjoint and partial ISA relationships. Also each person has a set of communication lines but some communication lines can be unassigned. Each lecturer can have a number of degrees. A project is composed of a number of jobs and each job has a budget. Project instances have a number of relationships: a lecturer can lead a number of projects and each must have one leader; each project can have a number of persons working on it and conversely a person can work on many projects; some projects are a continuation of other projects.

The university is split into departments and each department has a number of lecturers but each lecturer works in one department. A department has an exclusive address. Each department can co-sponsor, with other departments, a number of courses for students to enrol on. Each student can enrol in one course at a time. It is expected that many students enrol on a course. Also a course has a number of units that students are required to take up and a unit is included in many courses. Each unit has one coordinator, a lecturer, but many lecturers can be involved in teaching it from year to year.

Every student's performance is kept for a transcript: that is, a student registers for many units and a unit is taken by many students. For each registration the grade awarded is recorded.

8.2.1 Asserting EERM Constructs into Flora-2
Each EERM diagram is encoded into some syntax along the lines of the artefacts given in figure 8.1 above. Many EERM design tools offer a function to export a diagram into an ASCII file or an XML encoding. The encoded syntax has to be read into the framework and assigned to its structures (e.g. CLASS, STRUCTURE, CONSTRAINT). In this case the encodings are hard-coded into the framework rather than read from some exported file.


Figure 8.2 – High-level example object database design – 'AfterScott'
(EERM diagram: PERSON (with PHONE), specialised into LECTURER and STUDENT, with PT STUDENT under STUDENT; PROJECT with weak entity JOB and relationships LEAD, WORK ON and CONT.; UNIT with weak entity UNITYEAR and relationships COORD, TEACH, VERSION, PART and TAKE (with GRADE); COURSE with ENROLL and SPONSOR; DEPT with WORK, RES and an exclusive ADDRESS.)

What follows is a sequence of EERM encodings into Flora-2 facts and assertions. We start with entities and attributes, then ISA relationships, primary key constraints, weak entities, and finally EERM relationships.

8.2.1.1 Entity template
Every diagram entity is asserted as an instance-of object CLASS. The name of each entity, which is unique in a diagram, is transcribed as its logical identifier. The template for each entity is:

entity_name : class.

8.2.1.2 Attribute's template
As presented earlier, an entity's attributes can be classified by three selectors: simple vs. structured, single vs. multiple, and stored vs. computed. We assume only stored attributes are required; computed ones are taken care of by query constructs.

A simple attribute, in an EERM diagram, has a name and a data type (first template below). If the data type is not a basic domain but an enumerated type, then the latter needs to be defined (second template). It is assumed that an enumerated type is a set of strings. All simple attributes are assigned as inheritable attributes (i.e. "*=>"). In this template it is assumed that the attribute is single valued, i.e. takes one and only one value (denoted by {1:1} for the lower and upper cardinality constraints).

entity_name [ attribute_name{1:1} *=> basic_domain ].

enumerated_domain_name : domain.
enumerated_domain_name [ enum -> {'element1', … , 'elementN'} ].

entity_name [ attribute_name{1:1} *=> enumerated_domain_name ].

If an attribute takes a structured nature then we need to introduce a structure definition for it and assign it to the attribute’s return type. Recall that an ERM diagram only allows one level of nesting in a structure. The following template shows the artefacts required; note that domain is either a basic domain or an enumerated domain (in which case the enumerated domain template is used).

structure_name : structure.
structure_name [ attr_name1{1:1} *=> domain1, … , attr_nameN{1:1} *=> domainN ].

entity_name [ attribute_name{1:1} *=> structure_name ].

It was assumed above that attributes take a single value but it is known that the ERM attributes can take a set of values from a domain. To support this an attribute’s cardinality is changed from “{1:1}” to “{0:*}” with the latter reflecting zero to many values assignable to an attribute. The following templates show the difference between single and multiple valued attributes. This construct is applicable to both simple and structured attributes.

entity_name [ attribute_name{1:1} *=> domain ].

entity_name [ attribute_name{0:*} *=> domain ].

8.2.1.3 ISA template
The ISA relationship holds between entities, but its varieties really constrain what relationships are taken up and what effect it has on the instance-of relationship. The basic ISA relationship template is a quite straightforward assertion in Flora-2.

entity_name1 :: entity_name2.

Since, in our opinion, an EERM entity hierarchy is expressed with single inheritance only, while both EERM and F-logic allow multiple inheritance, instances of CLASS should be checked against taking up multiple inheritance. The multiple-inheritance denial check is attached to the object CONSTRAINT and is embedded in a method called %IC_SCHEMA_NO_MI.

On its invocation the ISA assertions are tested; if any CLASS instance is found to inherit from more than one CLASS instance, the situation is flagged (i.e. the method fails) and any CLASSES with multiple inheritance are listed – this is taken care of by Flora-2's aggregate construct collectset. The pattern that catches multiple inheritance is simple: is there a CLASS (?C) that inherits from two non-ISA-related CLASSES (?P1 and ?P2)? Additional conjuncts are used to address distinctness: to limit the scope, as ?C, ?P1 and ?P2 need to be CLASSES; to attenuate symmetrical instantiations; and to ignore the spurious case where ?P1 and ?P2 are identical. The procedure uses the syntactic sugar implementation of the 'if then else' rule found in Flora-2.

constraint[%ic_schema_no_mi] :-
    writeln('-- ic schema no multiple inheritance between classes:')@_p,
    ?C:class, ?P1:class, ?P2:class,
    ?C::?P1, ?C::?P2, ?P1 != ?P2,
    if ( not(?P1::?P2), not(?P2::?P1) )
    then ( ?L = collectset{?C | ?C:class, ?P1:class, ?P2:class,
                                ?C::?P1, ?C::?P2, ?P1 != ?P2,
                                \+(?P1::?P2), \+(?P2::?P1)},
           writeln('IC Broken! Problem with: ')@_prolog,
           writeln(?L)@_prolog,
           false )
    else ( writeln('OK!')@_prolog ).

To test the procedure one needs to adjust the class ISA assertions with a multiple inheritance class; e.g. say PTSTUDENT inherits from STUDENT and DEPT.

student :: person.
lecturer :: person.
ptstudent :: student.
ptstudent :: dept.

The procedure invocation would output the offending classes with multiple inheritance and then fail.

?- constraint[%ic_schema_no_mi].
-- ic schema no multiple inheritance between classes:
IC Broken! Problem with:
[ptstudent]
No.

The EERM description of ISA includes three varieties: disjoint vs. overlapping, total vs. partial, and semantic vs. predicate defined.

The overlapping ISA, without doubt a useful design construct, requires extensive DBMS support to implement. For example, in the relational data model one implementation would need views with triggers that can update a view's base tables. ODMG is still inadequate in this area, and consequently overlapping ISAs are rejected for the time being. To enforce disjointness between sibling classes, an integrity constraint is implemented with a method called %IC_CLASS_DISTINCT_ISA found in object CLASS. The method works by checking if any two distinct and direct subclasses, say ?C1 and ?C2, of a given class ?C share any instance, i.e. an object ?I. This constraint is really a more specific example of %IC_SCHEMA_NO_MIO, found in object CONSTRAINT and described earlier.

class[%ic_class_distinct_isa(?C)] :-
    ?C1:class, ?C2:class, ?C1 != ?C2,
    ?C1.directasc = ?C, ?C2.directasc = ?C,
    if (?I:?C1, ?I:?C2, !)
    then ( ?L = collectset{?I2 | ?C11:class, ?C22:class, ?C11 != ?C22,
                                 ?C11.directasc = ?C, ?C22.directasc = ?C,
                                 ?I2:?C11, ?I2:?C22 },
           writeln('IC Broken! Problem for class: ')@_prolog,
           writeln(?L)@_prolog,
           false )
    else ( writeln('OK!')@_prolog ).

Another ISA constraint is the total variety. A total ISA implies that the superclass cannot have instances of its own. The generic constraint for the denial of this pattern is implemented in method %IC_CLASS_TOTAL_ISA in object CLASS. The method checks if a given CLASS instance, method argument ?C, has any subclasses (actually this check shouldn't be necessary); if there are instances that have ?C as their instantiating class then the constraint is broken.

class[%ic_class_total_isa(?C)] :-
    if ( ?C:class, ?C1:class, ?C != ?C1, ?C1::?C,
         ?I:?C, ?I.instclass = ?C, ! )
    then ( ?L = collectset{?I2 | ?I2:?C, ?I2.instclass = ?C },
           writeln('IC Broken! ISA implies it must not create objs: ')@_p,
           writeln(?L)@_prolog,
           false )
    else ( writeln('OK!')@_prolog ).

For predicate-defined ISA we need to add a broader template to the ISA assertions. First we need to identify the attribute on which the predicate is defined, and the respective expression (here we assume the predicate has the form ATTRIBUTE = CONSTANT). Secondly, we have to ensure that instances do not change this value; this is implemented with a check constraint. The following constructs show what the templates need to produce for a class PERSON specialised into MALE and FEMALE, with instances of the subclasses determined by predicate GENDER. Of special significance here are the inheritable properties (i.e. "*->"), which imply that any instance-of MALE, for example, would inherit the property GENDER -> 'male'. The last two lines represent the denial queries for the check constraint.

person [ gender{1:1} *=> sex ].

male::person.
female::person.
male [ gender *-> 'male' ].
female [ gender *-> 'female' ].

?- ?I:male, ?I[gender->?S], ?S != 'male'.
?- ?I:female, ?I[gender->?S], ?S != 'female'.


The template, including the ISA assertions presented earlier, follows. Note that object CHECK is a subclass of CONSTRAINT and CH_ENTITY11, for example, is an instance-of the CHECK object. The CHECK object signature follows:

check [ classname{1:1} *=> class, attr{1:1} *=> string, value{1:1} *=> string ].

The CH_ENTITY11 constraint enforces the rule that ENTITY11 has an attribute ATTRNAME whose value must be ‘STRING11’.

entity11 :: entity1.
entity12 :: entity1.

entity1 [ attrname{1:1} *=> string ].

entity11 [ attrname *-> 'string11' ].
entity12 [ attrname *-> 'string12' ].

ch_entity11 : check.
ch_entity11 [ classname->entity11, attr->attrname, value->'string11' ].

ch_entity12 : check.
ch_entity12 [ classname->entity12, attr->attrname, value->'string12' ].

For scrutinising the above check constraint a method of object CHECK, called %CHECK_CONS, is called. The method takes an argument that represents the actual check constraint to be checked. The check constraint implemented here is trivial and is based on string equality. The idea is to take the details from a check instance, e.g. CH_ENTITY11, and see if any object breaks the constraint. The following invocation checks if all TELEPHONE instances are of type 'cell'.

check :: constraint.
constraint [ consname{1:1} *=> string ].
check [ classname{1:1} *=> class, attr{1:1} *=> string, value{1:1} *=> string ].

check[%check_cons(?Ch)] :-
    writeln('-- check cons attr = value ')@_prolog,
    if ( ?I:?Ch.classname, ?I[?Ch.attr->?T], ?T != ?Ch.value, ! )
    then ( ?L = collectset{?I2 | ?I2:?Ch.classname, ?I2[?Ch.attr->?T2],
                                 ?T2 != ?Ch.value },
           writeln('IC Broken! Problem with: ')@_prolog,
           writeln(?Ch)@_prolog,
           writeln(?L)@_prolog,
           false )
    else ( writeln('OK!')@_prolog ).

ch1:check[consname-> 'is cell', classname-> telephone, attr->tptype, value->'cell'].

t789012:telephone [ tpname->'t789012', tptype->'cell' ].
t901234:telephone [ tpname->'t901234', tptype->'xcell' ].

?- check[%check_cons(ch1)].
-- check cons attr = value
IC Broken! Problem with:
ch1
[t901234, t99123456]
No.


Although the ISA assertion templates have been given, what is still required are the details of the ISA varieties read from the EERM designer. An object has been created, called ISAPROPERTY, which is a subclass of CONSTRAINT and whose signature follows. The single-valued attribute PARENTCLASS and the set-based attribute ISACLASS hold the ISA relationship; the next three attributes denote the specific variety of the ISA relationship. The last two attributes are optional and used only to describe a predicate-defined ISA relationship.

isaproperty [ parentclass{1:1} *=> class,
              isaclass{1:*} *=> class,
              disj_over_flag{1:1} *=> string,
              totl_part_flag{1:1} *=> string,
              sema_pred_flag{1:1} *=> string,
              predicate{0:1} *=> string,
              pred_value_list(class){0:*} *=> string ].

An instantiation of the ISAPROPERTY class to implement a partial and semantic ISA relationship anchored at PERSON follows:

isaperson:isaproperty [ parentclass -> person,
                        isaclass -> {lecturer, student},
                        disj_over_flag -> 'disjoint',
                        totl_part_flag -> 'partial',
                        sema_pred_flag -> 'semantic' ].

In the case of a predicate-defined ISA relationship, as exemplified with PERSON, MALE and FEMALE previously, an instantiation of the ISAPROPERTY object would be:

isapersongender:isaproperty [ parentclass -> person,
                              isaclass -> {male, female},
                              disj_over_flag -> 'disjoint',
                              totl_part_flag -> 'total',
                              sema_pred_flag -> 'predicate',
                              predicate -> gender,
                              pred_value_list(male) -> 'm',
                              pred_value_list(female) -> 'f' ].

Another two restrictions are required to maintain a reasonable class hierarchy. Firstly, we can arbitrarily disallow instances of structures to participate in ISA relationships. Secondly, a class instance can only be a parent class of one ISA relationship.

8.2.1.4 Primary Key constraint template
Each entity has a primary key made up from a set of its attributes. Keys are a constraint too, and therefore PKCONSTRAINT ISA CONSTRAINT. The PKCONSTRAINT signature follows:

pkconstraint [ classname{1:1} *=> class, attrlist{1:*} *=> string ].

A definition of a primary key has a name, a class to which it is applicable, and the set of attributes it is made from. These details are readily available in an ERM diagram. The following says that object PK1 is an instance-of PKCONSTRAINT and describes entity COURSE's primary key set as being made up of the single attribute CNAME.


pk1:pkconstraint.
pk1 [ consname -> 'course pk', classname -> course, attrlist -> {cname} ].

To check that the deep extent of a class, e.g. COURSE, has no key duplication, a general method called %PK_CONS_ONE is invoked; it takes a primary key instance identifier as an argument. The method checks whether any two distinct instances of the relevant deep extent, denoted by the path expression ?PK.CLASSNAME, have equal primary key attribute values, i.e. whether ?I1.?A = ?I2.?A where ?A = ?PK.ATTRLIST.

pkconstraint[%pk_cons_one(?Pk)] :-
    writeln('-- pk cons (one) ')@_prolog,
    if ( ?I1:?Pk.classname, ?I2:?Pk.classname, ?I1 != ?I2,
         ?A = ?Pk.attrlist, ?I1.?A = ?I2.?A, ! )
    then ( ?L = collectset{?Key | ?I11:?Pk.classname, ?I21:?Pk.classname,
                                  ?I11 != ?I21, ?A1 = ?Pk.attrlist,
                                  ?Key = ?I11.?A1, ?I11.?A1 = ?I21.?A1 },
           writeln('IC Broken! Problem with: ')@_prolog,
           writeln(?Pk)@_prolog,
           writeln(?L)@_prolog,
           false )
    else ( writeln('OK!')@_prolog ).

A typical runtime session using PK1 above would be:

?- pkconstraint[%pk_cons_one(pk1)].
-- pk cons (one)
OK!
Yes.

Say we pollute the sample data with incorrect data, for example two courses with the same name; then this invocation would respond along the following lines:

?- pkconstraint[%pk_cons_one(pk1)].
-- pk cons (one)
IC Broken! Problem with:
pk1
[BSc Engineering]
No.

Unfortunately %PK_CONS_ONE has a flaw: it provides only for a singleton attribute set, which is not the general case. Another method, %PK_CONS_MANY, is available that takes care of this and follows the structure of a universal quantification query (i.e. three levels and double negation). Basically we are looking to show, for a primary key violation, that it is not the case that two distinct instances differ in some attribute of the key set. The technique of reading and appending path expressions, adopted in the first version, is kept in this method too.


pkconstraint[%pk_cons_many(?Pk)] :-
    writeln('-- pk cons (many) ')@_prolog,
    if (not %pk_cons_many_mid(?Pk))
    then (writeln('Ok ')@_prolog)
    else (writeln('-- problem with (many) ')@_prolog).

%pk_cons_many_mid(?Pk) :-
    ?I1:?Pk.classname, ?I2:?Pk.classname, ?I1 != ?I2,
    not %pk_cons_many_in(?I1, ?I2, ?Pk).

%pk_cons_many_in(?I1, ?I2, ?Pk) :-
    ?A = ?Pk.attrlist, ?I1.?A != ?I2.?A.
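A hypothetical invocation on clean data, mirroring the %PK_CONS_ONE session above:

?- pkconstraint[%pk_cons_many(pk1)].
-- pk cons (many)
Ok
Yes.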

It has to be confirmed that a primary key definition is applicable to a class's deep extent.

In this section we implicitly assumed that a key’s attributes are value based; that is an attribute takes a value from a defined basic domain. But this need not be the case in this data model as an attribute can take an object reference (a logical identifier) that is an instance-of a class rather than a basic domain. In this situation the above routines still implement the primary key constraint even if any of the attributes take an identifier.

Candidate keys, although not depicted in the EERM diagram, are very useful in logical design. Their specification and checking are identical to primary keys, other than their instantiating class, which is CKCONSTRAINT.

8.2.1.5 Weak entities and weak relationships template
Weak entity instances have an existence dependent on a 'normal' entity instance through a weak relationship. The solution adopted here for this data pattern is to relate weak instances by an attribute whose return data type is a tuple structure. In our framework we have a specific object, called STRUCTURE, whose instances are tuple definitions.

Our example schema has some weak entities, such as JOB, UNITYEAR and ADDRESS; JOB is weakly related to entity PROJECT. The following shows the ERM encoding required through instance-of and signature assertions, together with some data examples. Since this weak relationship is 1(p)-N(t) (or 1(t)-N(t)), on the side of PROJECT the attribute JOBS implements a zero-to-many cardinality, while on the side of the JOB structure it takes one, and only one, reference to a PROJECT instance.


project : class.
job : structure.

project [ pname{1:1} *=> string, jobs{0:*} *=> job ].
job [ jname{1:1} *=> string, jbudget{1:1} *=> float, jproj{1:1} *=> project ].

proj1:project [ pname -> 'Project 1', jobs -> { j1, j2 } ].
j1:job [ jname -> 'Job 1', jbudget -> 123.4, jproj -> proj1 ].

The reader will notice that a referential constraint to enforce the weak relationship is still missing. The ERM designer tool needs to export this constraint too, and the framework reads it into a referential constraint whose partial signature is provided below. The RELTYPE attribute is qualified with a 'weak' string. The attribute RELSEQ, which is a string, uniquely identifies EERM relationships. Since a weak relationship is binary, two COMP (i.e. component) attribute values are defined. The single-side COMP attribute takes six arguments and returns a STRUCTURE instance (where the weak entity structure is defined), whereas the many side takes the six arguments too but returns a class. The arguments are: the name of the entity (or weak entity); the attribute that holds the relationship instance; the role name; the lower and upper cardinality of the relationship; and finally a distinctive sequence number. The following are the ISA and referential constraint signatures.

refconstraint::constraint.

refconstraint [ reltype{1:1} *=> string,
                relseq{1:1} *=> string,
                comp(class, string, string, cardinality, cardinality, integer){0:1} *=> class,
                comp(class, string, string, cardinality, cardinality, integer){0:1} *=> structure ].

To encode the weak relationship for JOB the following object in Flora-2 is required.

rc12:refconstraint.
rc12 [ consname -> 'project jobs', relseq -> 'x1' ].
rc12 [ reltype -> 'weak',
       comp(project, jobs, 'hasJobs', 0, '*', 1) -> job,
       comp(job, jproj, 'isPartOf', 1, 1, 2) -> project ].

The next section gives details as to how referential constraints in our framework are checked against the object collection.

To ensure that no weak entity instance lives on its own, a transitional constraint is indicated rather than a static one. In practice this is embedded in a database trigger associated with the entity that controls the weak entity. In the above example the class PROJECT would have a trigger such that, on deletion of a PROJECT instance, a cascade of deletes is executed to purge the weak instances of JOB assigned in the JOBS attribute. ODMG does not directly support triggers; a workable solution is for the designer to add incremental code (i.e. to delete any related JOB instances) to PROJECT's delete method inherited from ODMG's ancestral class.

There is another scenario that needs addressing: since in our framework instances of a STRUCTURE are objects, referential constraints are not sufficient to inhibit 'loose' instances. If a STRUCTURE is not the return type of a method, i.e. not listed in a weak relationship, then it can have instances without breaking any referential constraint. What is required is an integrity constraint that searches for STRUCTURES not mentioned in any relationship and then checks whether they have any instances. The method %IC_STRUCTURE_NOLOOSEEND of object STRUCTURE checks for any instance-of STRUCTURE that is not covered by the return data type of any inheritable attribute (i.e. in a signature of the form ?C[ ?A *=> ?S ]) and yet has instances. It is worth noting the versatility of mixing signature, instance-of, and class-based predicates in a single query.

structure[%ic_structure_nolooseend] :-
    writeln('-- ic weak entity instances cannot live on their own')@_p,
    if ( ?S:structure, ?_IS:?S, not ?_C:class[?_A *=> ?S], ! )
    then ( ?L = collectset{?Ss | ?Ss:structure, ?_ISs:?Ss,
                                 not ?_Cs:class[?_As *=> ?Ss]},
           writeln('IC Broken! Problem with: ')@_prolog,
           writeln(?L)@_prolog,
           false )
    else ( writeln('OK!')@_prolog ).

(Note: variables starting with an underscore, e.g. “?_C” above, denote anonymous variables.)

The invocation of the method that follows derives that a structure called LOOSEEND has instances and yet is not referred to by any inheritable attribute.

?- structure[%ic_structure_nolooseend].
-- ic weak entity instances cannot live on their own
IC Broken! Problem with:
[looseend]
No.

The last issue is the primary key definition of a weak entity. If the STRUCTURE's instances have a primary key set that ensures uniqueness across all of its extent, then a normal primary key constraint definition, as presented in a previous section, is adequate. If that is not the case, then one solution is to append the object identifier of the controlling instance to each weak instance's key. For example, in weak entity JOB the attribute JNAME, although unique within any PROJECT instance, is not unique across all instances of JOB. To rectify this, one appends the JPROJ attribute to the key set – recall that its signature enforces a {1:1} cardinality.


pk8:pkconstraint [ consname -> 'job w pk', classname -> job, attrlist -> {jname, jproj} ].

8.2.1.6 Binary Relationship templates
Relationships are an important part of ERM diagrams. They have many varieties and therefore need a flexible approach to encode and generic methods to enforce. In this subsection we start with a binary relationship without any attributes and then consider the effects of adding attributes to it. In subsequent sections the more involved relationship designs are encoded. We also need to subsume the previously introduced template for a weak relationship.

The signature for a referential constraint has already been given (i.e. in the weak entity section). For a normal relationship, for example the 'has phone' relationship between entities PERSON and TELEPHONE, which is a 1(p)-N(p), the encoding that an ERM tool has to provide would be:

rc1 [ reltype -> 'normal', relseq -> 'x453',
      comp(person, telno, 'hasPhone', 0, '*', 1) -> telephone,
      comp(telephone, tpisof, 'isOf', 0, 1, 2) -> person ].

In this relationship one component has entity PERSON with an attribute TELNO that can take zero to many identifiers of TELEPHONE. The other entity TELEPHONE has an attribute TPISOF that can take zero to one PERSON identifier. Note that the RELTYPE attribute of the referential constraint object RC1 is ‘NORMAL’ and its sequence number is ‘X453’.

Another binary relationship example, this time a 1(p)-N(t), is that a PROJECT must be led by a LECTURER. Its encoding is:

rc3 [ reltype -> 'normal', relseq -> 'x683',
      comp(lecturer, prolead, 'leads', 0, '*', 1) -> project,
      comp(project, leader, 'isLead', 1, 1, 2) -> lecturer ].

For the correctness of relationship instances we need to check two things. The first is that there should be no dangling references, that is, an attribute assigned an identifier that is not instantiated or, worse, not an instance-of the right class or structure. The second is respect of the cardinality constraints mentioned in each of its components.

To check for dangling identifiers in a relationship structure a method called %RF_CONS_DANGLING_ID is called; it takes two arguments, namely a reference to a constraint and its type (e.g. 'normal' or 'weak'). The method reports any instance references which are not of the expected class or structure. To determine this, each component of the relationship is read and three objects are taken from it: the class (?C), the attribute (?A) and the artefact being pointed to (?Rt). If there exists an instance-of ?C, called ?I, such that the path expression ?I.?A is not an instance-of ?Rt then we have a dangling reference and possibly a type mismatch. The following is the generic method:

%rf_cons_dangling_id(?Rc, ?Type) :-
    ?Rc[reltype -> ?Type],
    ?Rc[comp(?C, ?A, ?_1, ?_2, ?_3, ?_4) -> ?Rt],
    ?I:?C, ?I.?A = ?Obj, not ?Obj:?Rt,
    write('-- rf cons dangling, problem with ')@_prolog,
    write(?A)@_prolog,
    write(' assigned ')@_prolog,
    writeln(?Obj)@_prolog,
    false.

Use of the procedure on referentially inconsistent data follows:

barbara:person [fname->'barbara', telno->{t123, 9999}].

?- %rf_cons_dangling_id(rc1, 'normal').
-- rf cons dangling, problem with telno assigned 9999
No.

By assigning the lower and upper limits there are four possible cardinality settings in a COMP definition (see table 8.1). If for each setting we indicate whether a 'not null' or an 'at most one' constraint is required, then each relationship cardinality is really a composition of these two. These conditions are sufficient and necessary.

Cardinality    Not Null    At most one
0:1            –           yes
0:*            –           –
1:1            yes         yes
1:*            yes         –

Table 8.1: Cardinality constraints composition of binary relationships

It quickly transpires that if the lower limit is "1" then 'not null' has to be adhered to. Also, if the upper limit is "1" then the 'at most one' constraint has to be maintained. Therefore to implement "{0:1}" one needs to invoke the 'at most one' constraint, while for "{1:1}" both 'not null' and 'at most one' need to be respected.

In Flora-2 one way to implement the ‘not null’ constraint uses the negation as failure technique; i.e. if an attribute does not have a value assigned in an object then one can assume that the attribute value is null. The generic procedure to check if a relationship attribute is not null, called %RF_CONS_NN, is given here:


%rf_cons_nn(?Rc) :-
    ?Rc[comp(?C, ?A, ?_1, 1, ?_3, ?_4) -> ?_Rt],
    ?I:?C, not ?I[?A -> ?_11],
    write('-- rf cons nn, problem with ')@_prolog,
    write(?I)@_prolog,
    write(' not assigned in ')@_prolog,
    writeln(?A)@_prolog,
    false.

The method finds the referential constraint of interest through argument ?RC; then, if the component's lower limit (the fourth argument of COMP) is equal to '1', any instance (?I) of class (?C) which does not have its attribute (?A) assigned a value is breaking the constraint.
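A hypothetical session, using RC3 from above (a PROJECT must have a LEADER, lower limit '1') and a project whose LEADER attribute was left unassigned:

p77:project [ pname -> 'Project 77' ].      Remark leader not assigned

?- %rf_cons_nn(rc3).
-- rf cons nn, problem with p77 not assigned in leader
No.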

The method to check that an attribute is not assigned more than one value is called %RF_CONS_ONEONLY. To detect attributes assigned more than one value, we first sift the components (i.e. COMP) whose upper limit is set to '1', and then check if ?I.?A takes two distinct values.

%rf_cons_oneonly(?Rc) :-
    ?Rc[comp(?C, ?A, ?_1, ?_2, 1, ?_4) -> ?_Rt],
    ?I:?C, ?I.?A = ?V1, ?I.?A = ?V2, ?V1 != ?V2,
    write('-- rf cons oneonly, problem with ')@_prolog,
    write(?I)@_prolog,
    write(' assigned more than once in ')@_prolog,
    writeln(?A)@_prolog,
    false.
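Again with RC3 (upper limit '1' on LEADER), a hypothetical violation where two leaders were assigned:

p88:project [ pname -> 'Project 88', leader -> {lect1, lect2} ].

?- %rf_cons_oneonly(rc3).
-- rf cons oneonly, problem with p88 assigned more than once in leader
No.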

To run these three checks together a single method, called %RF_CONS, of object REFCONSTRAINT is provided. The method takes two arguments, a reference to a constraint and its type, and then invokes the methods just described (actually without the false goal). The method is applicable to weak relationships too. The following is its code:

refconstraint[%rf_cons(?Rc, ?Type)] :-
    ?Rc[consname -> ?N],
    write('Ref cons name ')@_prolog, write(?N)@_prolog,
    write(' of type ')@_prolog, writeln(?Type)@_prolog,
    %rf_cons_seq(?Rc, ?Type).

%rf_cons_seq(?Rc, ?Type) :- %rf_cons_dangling_id(?Rc, ?Type), false.
%rf_cons_seq(?Rc, ?_)    :- %rf_cons_nn(?Rc), false.
%rf_cons_seq(?Rc, ?_)    :- %rf_cons_oneonly(?Rc), false.

The class of binary relationships with no attributes is distinctive for a practical reason. This class includes 1:1, 1:N and M:N, as well as recursive and weak styles of relationships. Specifically, this class of relationships is the only one that ODMG ODL can implement with its relationship construct.

How can we encode a relationship's attributes in a binary relationship? It is a well-known method that in the cases of 1:N and 1:1 one can move the relationship attributes to one of the entities. In the case of 1:N the attributes are placed on the many-side entity. In the case of 1:1 they can be placed on either entity, unless one component is in a total relationship, in which case they are moved to a total side.

For a binary many-to-many relationship with attributes there is a popular solution, adopted here, in which a new class is created that has a many-to-one referential constraint for each component class and an attribute for each relationship attribute. This new class is sometimes called the resolving class. This technique is found in textbooks such as El Masri [ELMAS10] and Teorey [TEORE05]. Its disadvantages are mainly two: the first is the creation of a class to implement a relationship, especially jarring in binary relationships; the second is that an n-ary relationship is spread over a number of binary relationships emanating from the resolving class.

Assume we have an M(P)-N(P) relationship, called R1, between classes C1 and C2, and that the relationship attribute is called A and is of type integer; then the corresponding encodings for the classes and relationships would be:

C1:class [ c1name{1:1} *=> string, c1_r1s{0:*} *=> C1C2r1 ].
C2:class [ c2name{1:1} *=> string, c2_r1s{0:*} *=> C1C2r1 ].

C1C2r1:class [ c1s{1:1} *=> C1, c2s{1:1} *=> C2, A *=> integer ].

r1_c1:refconstraint.
r1_c1 [ consname -> 'r1 c1 desc' ].
r1_c1 [ reltype -> 'normal', relseq -> 'x633',
        comp(C1, c1_r1s, 'r1 from c1', 0, '*', 1) -> C1C2r1,
        comp(C1C2r1, c1s, 'r1 to c1', 1, 1, 2) -> C1 ].

r1_c2:refconstraint.
r1_c2 [ consname -> 'r1 c2 desc' ].
r1_c2 [ reltype -> 'normal', relseq -> 'x633',
        comp(C2, c2_r1s, 'r1 from c2', 0, '*', 1) -> C1C2r1,
        comp(C1C2r1, c2s, 'r1 to c2', 1, 1, 2) -> C2 ].

It has to be noted that the participation constraints of a binary many-to-many relationship do not affect the participation constraint in the resolving class attributes, i.e. the {1:1} constraints in attributes C1S and C2S of class C1C2R1 are always set so.

Another point to note is that our encoding is using attribute RELSEQ to relate the generated binary relationships with the original relationship. For example the two many-to-one relationships have the same value for RELSEQ i.e. ‘X633’.

What remains to be discussed is the primary key constraint of the relationship's resolving class. Previously, in the primary key encoding, we showed that attributes forming part of the key set can be of type reference to objects of a class. In our encoding here we use this to our advantage by ignoring the respective primary key sets of the component classes when defining the key set for the new class. In the case of a binary many-to-many the key set is composed of both references; i.e. in the case of R1 above the key set for C1C2R1 is composed of C1S and C2S.

pk1:pkconstraint.
pk1 [ consname -> 'C1C2r1 pk', classname -> C1C2r1, attrlist -> {c1s, c2s} ].

Figure 8.3 – Ternary relationship ERM
(Classes C1, C2 and C3, with attributes C1name, C2name and C3name respectively, related by the ternary relationship SUPPLY, which carries attribute A; the cardinalities are M on C1, N on C2 and P on C3.)

8.2.1.7 General n-way Relationship templates (with n > 2)
For any n-ary relationship, with n greater than two, the encoding requires that a class is created to resolve the relationship. The new class will have a referential constraint to each of the participating entities. If the n-ary relationship has attributes then these are assimilated into the resolving class. For example, the ternary relationship R1 in figure 8.3 is an N(P)-M(P)-P(P) and is encoded into four classes, with the fourth (e.g. class C1C2C3R1 in the following code) being the resolving class; it has a many-to-one binary relationship to each of the three participating classes. Note that each referential constraint related to the ternary relationship has the same value for attribute RELSEQ. This ensures one can recompose the n-way relationship from its parts.

C1:class [ c1name{1:1} *=> string, c1_r1s{0:*} *=> C1C2C3r1 ].
C2:class [ c2name{1:1} *=> string, c2_r1s{0:*} *=> C1C2C3r1 ].
C3:class [ c3name{1:1} *=> string, c3_r1s{0:*} *=> C1C2C3r1 ].

C1C2C3r1:class [ c1s{1:1} *=> C1, c2s{1:1} *=> C2, c3s{1:1} *=> C3, A *=> integer ].


r1_c1:refconstraint.
r1_c1 [ consname -> 'r1 c1 desc' ].
r1_c1 [ reltype -> 'normal', relseq -> 'x8933',
        comp(C1, c1_r1s, 'r1 from c1', 0, '*', 1) -> C1C2C3r1,
        comp(C1C2C3r1, c1s, 'r1 to c1', 1, 1, 2) -> C1 ].

r1_c2:refconstraint.
r1_c2 [ consname -> 'r1 c2 desc' ].
r1_c2 [ reltype -> 'normal', relseq -> 'x8933',
        comp(C2, c2_r1s, 'r1 from c2', 0, '*', 1) -> C1C2C3r1,
        comp(C1C2C3r1, c2s, 'r1 to c2', 1, 1, 2) -> C2 ].

r1_c3:refconstraint.
r1_c3 [ consname -> 'r1 c3 desc' ].
r1_c3 [ reltype -> 'normal', relseq -> 'x8933',
        comp(C3, c3_r1s, 'r1 from c3', 0, '*', 1) -> C1C2C3r1,
        comp(C1C2C3r1, c3s, 'r1 to c3', 1, 1, 2) -> C3 ].

The primary key attribute set for the resolving class depends on the relationship's cardinality constraints. Also, as presented in Chapter seven, primary key sets are a specialisation of a functional dependency constraint, as a primary key set of attributes functionally determines all other attributes (i.e. those not part of the primary key set); in such a case the left-hand side of the functional dependency is a candidate key. In the case of a binary N-M relationship, as just explained, the primary key set was composed of both references. Likewise, for a ternary N-M-P the primary key set is made of the three references; the last row of table 8.2 depicts this. But in other cardinality permutations the primary key set does not need all three references. Furthermore, it is possible that more than one candidate key set exists; for example, in a 1-1-1 relationship the resolving class can choose its primary key set from the following candidate keys: {C1, C2}, {C2, C3} and {C1, C3}.

Ternary Relationship        Functional dependencies
(between C1, C2, C3)
1-1-1                       {C1, C2} → C3;  {C1, C3} → C2;  {C2, C3} → C1
1-1-N                       {C1, C3} → C2;  {C2, C3} → C1
1-N-M                       {C2, C3} → C1
N-M-P                       {C1, C2, C3} → ∅

Table 8.2 – Functional dependencies present in a ternary relationship, by cardinalities

What is required is a general pattern that helps us enumerate all candidate keys for any n-ary relationship given its cardinalities and from which we can elect any one as primary. The remaining key sets are actually declared as candidate key constraints.


In the cases of N-M and N-M-P their primary key set is easy to derive, i.e. it includes all references, and there is only one pattern. It is easy to scale up for any n-ary relationship where the primary key set is made up of all entity references; i.e. n attributes.

In the case of an n-ary relationship where at least one of the cardinalities is a '1', a reference to that class is placed on the right-hand side of a functional dependence. On the left-hand side of the dependence we append a reference to each of the remaining participants in the relationship. For example, in a 1-N-M between C1, C2, and C3, C1 is the '1' component and is placed on the right of the expression, whereas classes C2 and C3 are placed on the left-hand side. Therefore the functional dependence built is {C2, C3} → C1. If a relationship has more than one '1' component, a functional dependence is built for each (refer to table 8.2). This rule scales for any n.
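Stated compactly (our formulation of the rule just described), for an n-ary relationship over classes $C_1,\dots,C_n$:

    $\forall i : \mathrm{card}(C_i) = 1 \;\Rightarrow\; \{C_1,\dots,C_n\} \setminus \{C_i\} \rightarrow C_i$

and if no component has cardinality '1', the only candidate key is the full reference set $\{C_1,\dots,C_n\}$.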

The following procedures and code snippets show how, starting from an n-ary description of a relationship (e.g. figure 8.3), to build the resolving class, enumerate its functional dependencies, and assert any one of these as the primary key and the rest as candidate keys. (For simplicity of implementation, when we assert a primary key constraint it is not asserted as a candidate key too.) It needs to be remarked that during an actual exercise the invocation of this transformation has to be exposed to the designer; if it is incorrect or incomplete then the designer has to rectify the original input diagram.

It is reasonable to assume that in the case of an n-ary relationship the encoding requires a more involved reading from the EERM tool. The encoding has a class for every participating entity together with its attributes and primary key. What differs is the reading of the class which resolves the n-ary relationship; the only attributes it should have are the relationship's attributes (e.g. GRADE). Also the n-ary relationship needs to be encoded as n binary relationships between the participating classes and the resolving class; the cardinality and participation constraints on the side of each participating class are unchanged, while every other side, i.e. from resolving to participating class, is assigned the same 'many and total'. Each of these binary relationships must have a common identifier in the RELSEQ attribute (e.g. 'X613'). The following code shows one participating class (STUDENT) and the resolving class (named arbitrarily SUL_TERNARY) for figure 8.3.


student:class [ sname{1:1} *=> string, sul_ternaries{0:*} *=> sul_ternary ].

sul_ternary:class [ t3student_id{1:1} *=> student,
                    t3unit_id{1:1} *=> unit,
                    t3lecturer_id{1:1} *=> lecturer,
                    grade{1:1} *=> integer ].

nmp_givenstud:refconstraint.
nmp_givenstud[ reltype -> 'normal', relseq -> 'x613', consname -> 'given student',
               comp(sul_ternary, t3student_id, 'from student', 1, 1, 1) -> student,
               comp(student, sul_ternaries, 'to sul ternary', 0, '*', 2) -> sul_ternary ].

Once the ERM's n-ary relationship is read, the framework needs to execute two procedures to implement the meaning of the n-ary relationship. The first builds all functional dependencies in the resolving class to implement the relationship, and the second asserts a primary key and any other candidate keys to enforce the functional dependencies just built.

In the framework, the data characteristic of an n-ary relationship is a common value in the RELSEQ attribute across referential constraint instances – the variable ?RS is used. For our example there are three relationship instances with a common RELSEQ value, i.e. 'X613'.

?- ?R:refconstraint[relseq->'x613'].

?R = nmp_givenlect
?R = nmp_givenstud
?R = nmp_givenunit
Yes.

Each of these constraint instances has two components, and across all of them there is a component that points to their resolving class – the variable ?RESCLASS is used. The class that is common across the constraints' components is therefore the resolving class, and to compute it one needs a 'for all' query – universal quantification, here implemented through double negation. The following procedures compute this:

%rel_resolve_class(?Rs, ?ResClass) :-
    ?R:refconstraint,
    ?R[relseq->?Rs, comp(?_11,?_12,?_13,?_14,?_15,?_16)->?ResClass],
    (not %rel_resolve_class_mid(?Rs, ?ResClass)),
    !.

%rel_resolve_class_mid(?Rs, ?OutResClass) :-
    ?R:refconstraint,
    ?R[relseq->?Rs, comp(?_21,?_22,?_23,?_24,?_25,?_26)->?_ResClass],
    (not %rel_resolve_class_in(?Rs, ?OutResClass, ?R)).

%rel_resolve_class_in(?Rs, ?OutResClass, ?MidR) :-
    ?R:refconstraint,
    ?R[relseq->?Rs, comp(?_31,?_32,?_33,?_34,?_35,?_36)->?ResClass],
    ?R = ?MidR,
    ?ResClass = ?OutResClass.

?- %rel_resolve_class('x613', ?Rc).

?Rc = sul_ternary
Yes.

Once the resolving class of a relationship constraint is found, the computation of the functional dependencies needed to implement the n-ary relationship is done by the procedure %REL_RESOLVE_FDS. The procedure takes three input arguments and outputs a list of dependencies (in ?FDELST). The procedure starts by building two lists through the COLLECTSET aggregate function of Flora-2: the first is the list of participating classes in the n-ary relationship (i.e. ?FDALLLST), while the second is the list of classes that have at most a 'one' for an upper participation (i.e. ?FDRIGHTLST). If the second list is empty then the procedure terminates and returns a single fd with all references on the left hand side. The following code explains:

%rel_resolve_fds(?Rs,?ResvClass,?FdSlst,?FdElst) :-
    ?FdAlllst = collectset{?C | ?_R:refconstraint[relseq->?Rs,
                               comp(?ResvClass,?_2,?_3,?_4,?_5,?_6)->?C]},
    ?FdRightlst = collectset{?C | ?_R:refconstraint[relseq->?Rs,
                               comp(?C,?_2,?_3,?_4,1,?_6)->?ResvClass]},
    if (?FdRightlst = [])
    then (?FdElst = [?FdAlllst-[]])
    else (%rel_resolve_fds_gen(?FdAlllst,?FdRightlst,?FdSlst,?FdElst)).

%rel_resolve_fds_gen(?FdAlllst,[?Rh|?FdRightlst],?FdSlst,?FdElst) :-
    %loid_diff(?FdAlllst,[?Rh],[],?Lh),
    ?FdIlst = [?Lh-?Rh|?FdSlst],
    %rel_resolve_fds_gen(?FdAlllst,?FdRightlst,?FdIlst,?FdElst).
%rel_resolve_fds_gen(?_FdAlllst,[],?FdElst,?FdElst).

If the ternary relationship is an N-M-P then there is only one functional dependency, as the following invocation illustrates. The response, bound to variable ?A, says that the LECTURER, STUDENT and UNIT identifiers together determine everything else.

?- %rel_resolve_fds('x613',sul_ternary,[],?A).

?A = [[lecturer, student, unit] - []]

If the second list (i.e. ?FDRIGHTLST) is not empty then procedure %REL_RESOLVE_FDS_GEN is called iteratively for each element of that list. For each element the left hand side of the fd is built by taking its difference from the list ?FDALLLST using the procedure %LOID_DIFF. If the ternary relationship is 1-1-1 then a list of fds results, and the following invocation illustrates this. The response, bound to variable ?A, holds three fds; the first says that the student and lecturer identifiers determine the unit's identifier.

?- %rel_resolve_fds('x613',sul_ternary,[],?A).

?A = [[student, lecturer] - unit,
      [unit, lecturer] - student,
      [unit, student] - lecturer]

Given a non-empty list of fds one needs to convert these into constraints; the procedure %REL_RESOLVE_KEYS takes care of this. The first fd is arbitrarily converted into a primary key and any remaining fds into candidate key constraints. Because the framework considers a constraint to be an object, it is imperative that the generated fds are asserted as objects in the framework's knowledge base. To generate objects from rules, Flora-2 offers a reification mechanism based on Yang and Kifer's work [YANGG03]: rules with un-instantiated placeholders are assigned as values to methods of the generic constraint object (enclosed within ${ and } tokens); when a constraint is required, the placeholders are filled and the rule is fired to create a new object. Other than reification we also need a predicate that, when instantiated, returns a new and unique logical identifier; the directive is called NEWOID{}. The following code snippets illustrate how constraints are generated.

The generic object PKCONSTRAINT, and similarly the candidate key object, has two methods, each of which takes two arguments and whose value is a reified rule. For example, when method CREATE_PK_INSTANCE is invoked with a new identifier and a class identifier, the rule is bound with the arguments and is ready to fire; firing it creates a new PKCONSTRAINT instance with the new identifier and with its CLASSNAME attribute assigned the class identifier provided as an argument. The second method, CREATE_PK_INSTANCE_ATTR, is responsible for assigning the attributes that make up the primary key constraint through the attribute ATTRLIST.

Once the rules are in place, the procedure %REL_RESOLVE_KEYS drives the instantiation by taking the first fd from the list and passing it to %REL_RESOLVE_KEYS_PK, which generates a new identifier, instantiates the rule to fire, and finally fires the rule with the INSERTRULE{…} directive. A similar treatment is applied for candidate keys, with the difference that the procedure iterates until all fds have an object instantiation.

pkconstraint[create_pk_instance(?Loid, ?Cn) ->
    ${ ( ?Loid:pkconstraint[classname->?Cn] :- true ) } ].
pkconstraint[create_pk_instance_attr(?Loid, ?Attr) ->
    ${ ( ?Loid[attrlist->?Attr] :- true ) } ].

ckconstraint[create_ck_instance(?Loid, ?Cn) ->
    ${ ( ?Loid:ckconstraint[classname->?Cn] :- true ) } ].
ckconstraint[create_ck_instance_attr(?Loid, ?Attr) ->
    ${ ( ?Loid[attrlist->?Attr] :- true ) } ].

%rel_resolve_keys(?_Rs,?ResvClass,[?H|?RFdlst]) :-
    %rel_resolve_keys_pk(?ResvClass,?H),
    %rel_resolve_candkeys(?ResvClass,?RFdlst).
%rel_resolve_keys(?_Rs,?_ResvClass,[]).

%rel_resolve_keys_pk(?Cn,?Alst-?_Rh) :-
    newoid{?Loid},
    pkconstraint[create_pk_instance(?Loid,?Cn) -> ?Rules1],
    insertrule { ?Rules1 },
    %rel_resolve_keys_pk_attr(?Loid,?Alst).


%rel_resolve_keys_pk_attr(?Loid,[?H|?Alst]) :-
    pkconstraint[create_pk_instance_attr(?Loid,?H) -> ?Rules1],
    insertrule { ?Rules1 },
    %rel_resolve_keys_pk_attr(?Loid,?Alst).
%rel_resolve_keys_pk_attr(?_Loid,[]).

%rel_resolve_candkeys(?ResvClass,[?H-?_Rh|?RFdlst]) :-
    newoid{?Loid},
    ckconstraint[create_ck_instance(?Loid,?ResvClass) -> ?Rules1],
    insertrule { ?Rules1 },
    %rel_resolve_candkeys_attr(?Loid,?H),
    %rel_resolve_candkeys(?ResvClass,?RFdlst).
%rel_resolve_candkeys(?_ResvClass,[]).

%rel_resolve_candkeys_attr(?Loid,[?H|?Alst]) :-
    ckconstraint[create_ck_instance_attr(?Loid,?H) -> ?Rules1],
    insertrule { ?Rules1 },
    %rel_resolve_candkeys_attr(?Loid,?Alst).
%rel_resolve_candkeys_attr(?_Loid,[]).

The three top level procedures just described are wrapped within a generic object procedure, namely RFCONSTRAINT[%FD_GENERATE(ARG1, ARG2, ARG3)], which follows. A query to check the generated primary and candidate keys is appended to the code below. (Note: identifiers starting with '_#' are generated by calling NEWOID{…}.)

rfconstraint[%fd_generate(?Rs,?ResvClass,?Fdlst)] :-
    %rel_resolve_class(?Rs,?ResvClass),
    %rel_resolve_fds(?Rs,?ResvClass,[],?Fdlst),
    %rel_resolve_keys(?Rs,?ResvClass,?Fdlst).

?- rfconstraint[%fd_generate('x613',?ResvClass,?Fdlst)].

?ResvClass = sul_ternary
?Fdlst = [[student, lecturer] - unit,
          [unit, lecturer] - student,
          [unit, student] - lecturer]

?- ?Pk:pkconstraint[attrlist->?L].

?Pk = _#'10318   ?L = lecturer
?Pk = _#'10318   ?L = student

?- ?Ck:ckconstraint[attrlist->?L].

?Ck = _#'10319   ?L = lecturer
?Ck = _#'10319   ?L = unit
?Ck = _#'10320   ?L = student
?Ck = _#'10320   ?L = unit

For n-ary relationships we have used the ternary as the running example. In fact the structures, procedures and output are independent of degree 3 and apply to any n ≥ 2.
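As an illustration of scaling down to the binary case, consider a hypothetical N-M relationship between classes A and B, resolved through class AB and encoded with RELSEQ 'xnm01' (all names here are invented for the example). The same procedures would yield a single fd whose left hand side holds both references:

?- %rel_resolve_class('xnm01',?Rc), %rel_resolve_fds('xnm01',?Rc,[],?A).

?Rc = ab
?A = [[a, b] - []]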

8.2.1.8 Other relationship templates

In the previous sections a number of relationships have been encoded into structures and their semantics enforced through integrity constraints. Some of these relationships needed a transformation into other structures, and additional constraints to enforce them. In summary, static constraints and structural artefacts are sufficient to encode them.


Other relationships mentioned in chapter two, like aggregation, require more than structures and static constraints to implement them in full. Specifically, aggregation constructs need transitional constraints; these are typically implemented with database triggers. (In fact an earlier paper by the author [VELLA97] had the aggregation design depend on their presence.) In ODMG ODL and OQL no mention of triggers is forthcoming. Consequently aggregation is best left as a post-mapping exercise rather than made part of the database schema generated here; this is the position taken both in this research and in our contribution to ICOODB 2009 [VELLA09].

8.3 Summary

This chapter builds a framework in which an object-oriented database is defined and maintained in terms of its data model. The language used for implementation is Flora-2 and much of the programming is declarative. These procedures are attached to framework objects.

The “meta” structures in our framework include: classes, weak entities, composite object parts, attributes (including functional ones), domains, relationships, and constraints (e.g. primary key, referential, check, not null, and functional dependency). Each artefact is supported, safeguarded, and enforced through rules and methods defined for it. Throughout the programming, the mixing of attribute values and their data type signatures is extensive.

The framework also offers methods to aid database designers in some of their activities. For example there is a method that converts an n-ary relationship instance (with n > 2) into the equivalent binary relationships, a resolving class, and constraints.

The framework is now able to accept a variety of schema designs. In fact in the next chapter we see how we can convert an EERM diagram encoding into our framework artefacts.


Chapter 9

9 Translating an EERM into an ODL Schema

The chapter starts by presenting an encoding of an EERM design into our framework meta structures. That is, the entities, relationships, and constraints encoded in an EERM diagram are asserted as facts in the framework. Together with these design facts the framework has a number of rules that encode and check properties that an object database must satisfy.

Once the EERM diagram is encoded as facts it is mapped into a sequence of ODMG ODL constructs. These constructs, albeit in an ODMG ODL dialect, are directly used by EyeDB to create its own object schema. EyeDB constructs used are classes with attributes and relationships, ISA assertions, and integrity constraints (e.g. primary key and not null).

Through the mapping, other constructs not found in the original EERM are introduced to enable a wider coverage of our mapping to the ODMG data model; for example for non-binary relationships.

Other than the logical encoding of EERM diagrams in Flora-2, we also present a straightforward utility that redraws an EERM in graphical form, since new constructs may be added. It is useful to have a visualisation of the EERM model encoded in our framework. This is similar to the actual EERM but some differences have been introduced; for example an EERM n-ary relationship is converted into n binary relationships with the addition of a resolving class.

A point worth emphasising here is that specifications made in the EERM diagram remain present as different functionalities are applied to the object database; for example, an object query language makes use of these EERM relationship artefacts.

9.1 The Problem Definition

Drawing a graph is a non-trivial computational task, as any projection, even if it correctly represents an underlying mathematically defined graph, can give different clues to its reader. A number of software packages exist that take a graph specification, spruced up with drawing attributes, and output a diagram in raster and vector formats. Vector formats are useful as they allow editing of a generated diagram. Most packages offer command-line and graphical user interfaces.


Two packages that have a good following, active revision process and proper documentation are GraphViz { WWW.GRAPHVIZ.ORG } and yEd { WWW.YWORKS.COM }. For this work GraphViz, a product originating from AT&T Labs Research, has been chosen mainly for its reliability, availability of packages and support, and finally because it is able to meet this project’s requirements.

The idea is to faithfully convert the conceptual design elements into a diagram's artefacts. The mapping has to have a high level of completeness and correctness: completeness in the sense that all elements of the EERM must be present in the output (or accounted for), and correctness implying that no artefact's meaning is changed by the mapping.

The main inputs for the diagramming tool are nodes and edges. Both take a variety of properties that define their structure, name and caption, and formatting style. The package also allows for generic instructions that cover the whole diagram. In our mapping the nodes represent the entities, weak entities, and all types of attributes. Edges represent relationships: the binary ones are undirected edges adorned with role name, participation and cardinality constraints; the ISA ones use directed edges.

Since the framework is a repository for an EERM graph, and a number of checks and constraints are enabled, the procedure is rather straightforward to implement. Two issues make the process slightly tricky. The first concerns inheritance: most representations show only the additional properties rather than all properties including the inherited ones, yet in F-logic, if one asks for an object's properties, one gets both inherited and non-inherited attributes. The second concerns binary relationships: the double-ended nature of a binary relationship makes every relationship be represented twice, albeit from different perspectives.
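For instance, a signature query over a subclass of the running example (a sketch; the bindings returned depend on the actual schema) returns inherited signatures too, which is why the views described below filter them out:

// returns signatures declared in student as well as those inherited from person
?- student:class[?A{?_:?_} *=> ?Dt].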

9.1.1 General Procedure

An object SCHEMA method, called %VIZ_SCHEMA, specifies the drawing of the schema encoding. Prior to calling %VIZ_SCHEMA one has to confirm that all integrity constraints entrenched in the framework are adhered to; otherwise this mapping is vitiated.

A high-level description of the method follows: the first part takes care of listing the header part of the specification file, and the second part enumerates the entities, ISA relationships, attributes, and relationships and converts them into their graphical equivalents. Thirdly, the edges that link the nodes just produced are introduced. In the following description of the translation the procedures have been stripped to their bare essentials for readability.

The procedure %VIZ_HEADER is called by %VIZ_SCHEMA. It starts by writing basic details in a remark, for example the schema name, and then sets graphical properties of the diagram. One of these is the DIGRAPH directive, which asks Dot, one of GraphViz's engines, to render a directed graph; this is required as the ISA relationship is directed. The directive also names the diagram and opens the specification list with a brace token.

%viz_header :-
    write(' // schema name & version: ')@_prolog,
    write(schema.schemaname)@_prolog,
    writeln('digraph FlogicEncode {')@_prolog.

The procedure %VIZ_FOOTER, called at the end of %VIZ_SCHEMA, simply closes the input specification with the matching brace.

The generated code looks as follows (the ellipsis, i.e. …, indicates further text has been omitted for brevity).

/* flogic schema for visualisation */
// schema name & version: AfterScott (alpha)
digraph FlogicSchema {
ranksep=1.25;
…
}

9.1.2 Entities and Weak Entities

The main nodes of the diagram are classes and weak entities, which are instances of CLASS and STRUCTURE in the framework respectively. To print these into a diagram specification two procedures are called in sequence: the first is called %VIZ_CLASS and the second %VIZ_STRUCTURE.

%VIZ_CLASS unifies with the list of classes, i.e. using view ALLCLASSES.CLASSLIST, and recursively calls method %VIZ_CLASS_ENTRY with the list of CLASS instances; each invocation takes one class instance at a time and prints it out as a Dot specification. The output includes the class name and its shape attributes (e.g. SHAPE=BOX). Dot, unless otherwise instructed, includes the class name as the node's caption within the box drawn.


allclasses[classlist->?L] :- ?L = collectset{ ?C | ?C:class }.

%viz_class :- %viz_class_entry(allclasses.classlist).

%viz_class_entry([?C|?Clrest]) :-
    write(?C)@_prolog,
    writeln('[shape=box];')@_prolog,
    %viz_class_entry(?Clrest).
%viz_class_entry([]) :- true.

The next method to be called is %VIZ_STRUCTURE, which is very similar to %VIZ_CLASS. The differences are: first, it uses the list of STRUCTUREs; second, each STRUCTURE is checked for involvement in a 'weak' relationship instance (i.e. ?_RF:REFCONSTRAINT[RELTYPE->'WEAK', COMP(?S,?_,?_,?_,?_,?_)->?_]); and third, the shape differs in that a weak entity is represented with a double-edged box (i.e. SHAPE=BOX, PERIPHERIES=2).

allstructures[structlist->?L] :- ?L = collectset{ ?S | ?S:structure }.

%viz_structure :- %viz_structure_entry(allstructures.structlist).
%viz_structure :- true.

%viz_structure_entry([?S|?Sl]) :-
    if ( ?_Rf:refconstraint[reltype->'weak',comp(?S,?_,?_,?_,?_,?_)->?_] )
    then ( write(?S)@_prolog,
           writeln('[shape=box, peripheries=2];')@_prolog ),
    %viz_structure_entry(?Sl).
%viz_structure_entry([]) :- true.

The generated code looks as follows:

…
/* classes */
course[shape=box,height=.75,width=1.5];
…
/* structures - weak entities */
address[shape=box,height=.75,width=1.4,peripheries=2];
…

9.1.3 ISA Relationship

The next method call from %VIZ_SCHEMA is to %VIZ_ISA, which draws all ISA relationships from a class to its direct descendants. The latter calls procedure %VIZ_ISA_ENTRY twice; the first time with the CLASS list and the second time with the STRUCTURE list. In %VIZ_ISA_ENTRY a list is created of the ISA relationship objects whose parent object is the head of the list passed as an argument (i.e. COLLECTSET { ?I | ?I:ISAPROPERTY, ?I[PARENTCLASS->?C] }). The CLASS object and its list of ISA instances are passed to method %VIZ_ISA_ENTRY_DETAIL. When invoked, this procedure extracts the data of one ISA relationship instance at a time and starts the drawing process. (Actually there is a constraint in the framework that enforces an upper limit of one ISA relationship.)


We recall that the ISA relationship drawing requires a circle to annotate its details (i.e. disjoint versus overlapping, and total versus partial) and two types of edges. The first type of edge is from the parent class to the circle, and the second type is from the circle to each descendant of the parent class. The line drawing from the circle to the descendants is done by procedure %VIZ_ISA_ENTRY_DETAIL_DESC. The edges carry annotations extracted from the relative ISA relationship instance.

Each procedure mentioned above is recursive, except for %VIZ_ISA. Some code follows; within it note that the FWRITE procedure does not really exist but is used to make the listing easier to read.

%viz_isa_entry([?C|?Rl]) :-
    ?Isa = collectset{?I | ?I:isaproperty, ?I[parentclass->?C]},
    %viz_isa_entry_detail(?C,?Isa),
    %viz_isa_entry(?Rl).
%viz_isa_entry([]) :- true.

%viz_isa_entry_detail(?C,[?I|?Rest]) :-
    ?I[disj_over_flag->?doF, totl_part_flag->?tpF, sema_pred_flag->?spF],
    fwrite(?I, '[shape=circle, label="', ?doF, '"];'),
    fwrite(?C, '->', ?I, '[label="', ?tpF, '", dir="both",',
           'arrowhead=none, arrowtail="vee", color="gray:gray"];'),
    ?Ldesc = collectset{?D | ?I[isaclass->?D]},
    %viz_isa_entry_detail_desc(?I,?Ldesc,?spF),
    %viz_isa_entry_detail(?C,?Rest).
%viz_isa_entry_detail(?_C,[]).

%viz_isa_entry_detail_desc(?I,[?D|?Rest],?spF) :-
    fwrite(?I, '->', ?D, '[label="', ?spF, '", arrowhead=none];'),
    %viz_isa_entry_detail_desc(?I,?Rest,?spF).
%viz_isa_entry_detail_desc(?_I,[],?_spFlag).

The generated code looks as follows. Node ISAPERSON represents the ISA relationship instance descriptor (denoted in the EERM as a circle) and is captioned 'disjoint'; node PERSON is the parent class, and nodes LECTURER and STUDENT are its sub-classes connected through ISAPERSON with two types of edges. The first edge is from PERSON to ISAPERSON and is captioned 'partial'; the second type is from ISAPERSON to a sub-class, each captioned 'semantic'.

…
isaperson [shape=circle, height=.5, width=.5, fontsize=12, fixedsize=true,
           label="disjoint"];
person -> isaperson [label="partial", fontsize=10, dir="both",
           arrowhead=none, arrowtail="vee", color="gray:gray", weight=10];
isaperson -> lecturer [label="semantic", arrowhead=none, color="gray:gray",
           fontsize=10];
isaperson -> student [label="semantic", arrowhead=none, color="gray:gray",
           fontsize=10];
…


Since asserting ISA facts in F-logic implies inference of their transitive closure on program evaluation, one needs, for drawing purposes, to weed out this closure. In procedure %VIZ_ISA we choose to enumerate ISA facts through the specific relationship encoding rather than creating a list through the pattern ?C:CLASS, ?CD:CLASS, ?CD::?C.

9.1.4 Binary Relationships

The framework has all EERM relationships encoded as binary, and all referential constraints manifest themselves as one of three types: normal, weak, or structure (for building composite objects). The former two are mapped in this section's code. The called procedure %VIZ_RELATIONSHIP invokes %VIZ_REL_NORMAL twice, first with all CLASS instances and then with all STRUCTURE instances. The method %VIZ_REL_NORMAL is a recursive procedure over the list of object references in its argument.

On each invocation of %VIZ_REL_NORMAL the method builds a list of relationships that involve the current object reference being evaluated. The respective filter for normal relationships follows: ?R:REFCONSTRAINT, ?R[RELTYPE->'NORMAL'], ?R[COMP(?CL,?_,?_,?_,?_,1)->?_]. Also note the '1' value in the last argument of a relationship component: since this argument is a serial number, it ensures that each relationship instance, being double ended, appears only once. The method then calls %VIZ_REL_NORMAL_EDGE with two arguments: the first being the current CLASS instance (or STRUCTURE instance) and the second the list of relationships just computed.

%viz_relationship :-
    %viz_rel_normal(allclasses.classlist),
    %viz_rel_normal(allstructures.structlist).

%viz_rel_normal([?Cl|?Rl]) :-
    ?Rel = collectset{?R | ?R:refconstraint, ?R[reltype->'normal'],
                           ?R[comp(?Cl,?_,?_,?_,?_,1)->?_]},
    %viz_rel_normal_edge(?Cl,?Rel),
    %viz_rel_normal(?Rl).
%viz_rel_normal([]) :- true.

%viz_rel_normal_edge(?Cleft,[?R|?Rest]) :-
    ?R[comp(?Cleft,?_,?Label,?Lmin,?Lmax,1)->?_],     // left side
    %viz_rel_edge_lu(?Lmin,?Lmax,?Aleft),
    ?R[comp(?Cright,?_,?_,?Rmin,?Rmax,2)->?Cleft],    // right side
    %viz_rel_edge_lu(?Rmin,?Rmax,?Aright),
    // drawing
    write(?Cleft)@_prolog, write('->')@_prolog, write(?Cright)@_prolog,
    write('[label="')@_prolog, write(?Label)@_prolog,
    write('",dir="both"')@_prolog,
    write(',arrowtail=')@_prolog, write(?Aleft)@_prolog,
    write(',arrowhead=')@_prolog, write(?Aright)@_prolog,
    writeln('];')@_prolog,
    %viz_rel_normal_edge(?Cleft,?Rest).


%viz_rel_normal_edge(?_,[]):- true.

%viz_rel_edge_lu(0,1,  ?E) :- ?E = '"teeodot"', !.
%viz_rel_edge_lu(0,'*',?E) :- ?E = '"crowodot"', !.
%viz_rel_edge_lu(0,?_, ?E) :- ?E = '"crowodot"', !.
%viz_rel_edge_lu(1,1,  ?E) :- ?E = '"teetee"', !.
%viz_rel_edge_lu(1,'*',?E) :- ?E = '"crowtee"', !.
%viz_rel_edge_lu(1,?_, ?E) :- ?E = '"crowtee"', !.
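For instance, invoking the lookup directly returns the pictogram name that Dot is asked to draw; this follows from the clauses above:

?- %viz_rel_edge_lu(0,'*',?E).

?E = '"crowodot"'
Yes.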

The method %VIZ_REL_NORMAL_EDGE is recursive too and is called for each element of the relationship instances list, i.e. its second argument. For each instance the method needs to draw an edge from the first component of the current relationship instance to the second. (These have been arbitrarily named left and right in the code.) Matching against a relationship instance allows the extraction of other data: namely the lower and upper limits for each component and a caption for the edge drawn; for the left side these are called ?LMIN, ?LMAX, and ?LABEL. There are four possible types of edge ends to draw, and calling method %VIZ_REL_EDGE_LU with the respective lower and upper limits returns the type of edge end for Dot to draw. The pictograms selected from Dot's library are depicted in figure 9.1 and are assigned to the edge properties ARROWHEAD and ARROWTAIL of the edge being drawn.

Figure 9.1 – Depicting cardinality and participation constraints of a relationship.

The generated code looks as follows:

…
/* relationships */
course->student[label="enrolled on", dir="both", arrowtail="crowodot",
                arrowhead="teetee"];
…

The procedure to draw weak relationships, called %VIZ_WEAK_RELATIONSHIP, is almost identical to the previous one. The only difference is in the query that collects the weak relationships into a list; the rest re-uses the procedures used for normal relationships.


%viz_weak_relationship :-
    %viz_rel_weak(allclasses.classlist),
    %viz_rel_weak(allstructures.structlist).

%viz_rel_weak([?Cl|?Rl]) :-
    ?Rel = collectset{?R | ?R:refconstraint, ?R[reltype->'weak'],
                           ?R[comp(?Cl,?_,?_,?_,?_,1)->?_]},
    %viz_rel_normal_edge(?Cl,?Rel),
    %viz_rel_weak(?Rl).
%viz_rel_weak([]) :- true.

9.1.5 Entity Attributes

There are four types of attributes we need to depict; namely single-simple, set-simple, single-structure and set-structure. These attributes in an EERM diagram are found in entities, weak entities, and relationships. In our framework there is a reduction of n-ary relationships into a set of binary ones, and relationships are stripped of any attributes; this leaves only entities and weak entities for drawing. Furthermore, in our framework a structured attribute forming part of an entity is implemented with a binary relationship (of type structure, i.e. RELTYPE->'STRUCTURE').

To draw attributes we partition them into two main groups. The first is earmarked to generate the drawing for a structured attribute and its parts, which is encoded in the framework as a structure instance related through a relationship instance of type structure (i.e. RELTYPE->'STRUCTURE'); this procedure is called %VIZ_STRUCT_RELATIONSHIP. The second group takes care of the other attributes, specifically attributes of entities and weak entities that are not the structured attributes supported by the first procedure; this procedure is called %VIZ_ATTRIBUTE.

On each invocation %VIZ_ATTRIBUTE calls another four procedures to take care of single-simple, set-simple, single-structure, and set-structure attributes. These are named %VIZ_ATTR_ELLIPSE, %VIZ_ATTR_DOUBLE_ELLIPSE, %VIZ_ATTR_STRUCTURE, and %VIZ_ATTR_DOUBLE_STRUCTURE respectively. These procedures have a common thread and differ mainly in their rendering of shapes.

The procedure %VIZ_ATTR_ELLIPSE calls %VIZ_ATTR_ALL_ELLIPSES twice, first with the list of classes and then with the list of structures. In %VIZ_ATTR_ALL_ELLIPSES the head of the list, e.g. a CLASS instance, is used to create a new list of the ATTRIBUTES attached to this CLASS instance.

As stated previously, we expect to have only the ATTRIBUTES that are defined in this CLASS instance, excluding any inherited ones as these are shown in their defining CLASS instance.


To compute this list the framework has a number of views (i.e. rules) that derive these attributes. For simple and single attributes the following views are created:

?C(?A,'attr'):gviz[attrtype -> ?Dt, attrcard -> 'single'] :-
    ?C:class[?A{?_:?Up}*=>?Dt],
    ?Up = 1,
    (?Dt='string' ; ?Dt='integer' ; ?Dt='float' ; ?Dt:domain),
    if ( ?C::?Sc, ?Sc:class[?A{?_:?Up}*=>?Dt] )
    then ( false )
    else ( true ).

?S(?A,'attr'):gviz[attrtype -> ?Dt, attrcard -> 'single'] :-
    ?S:structure[?A{?Low:?Up}*=>?Dt],
    ?Up = 1,
    (?Dt='string' ; ?Dt='integer' ; ?Dt='float' ; ?Dt:domain),
    if ( ?S::?Sc, ?Sc:structure[?A{?Low:?Up}*=>?Dt] )
    then ( false )
    else ( true ).
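As an example of the view once populated (the bindings here assume the running 'afterScott' schema, where COURSE has a single string attribute CNAME, as suggested by the generated output further on):

?- course(?A,'attr'):gviz[attrtype->?Dt, attrcard->?Card].

?A = cname
?Dt = string
?Card = single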

The body of the first rule looks for CLASS instance attributes whose signature matches the following conditions: the upper limit is one, and the return data type is a basic domain. Furthermore we are interested in attributes defined in the current class and not inherited; this is taken care of by the pattern ?C::?SC, ?SC:CLASS[?A{?_:?UP}*=>?DT], which, if instantiated, indicates that the method is defined in an ancestor CLASS. If the body is true the head creates a new object as an instance-of GVIZ with an identifier of ?C(?A,'ATTR'). The object created also has two properties assigned: namely ATTRTYPE and ATTRCARD. The second rule is very similar to the first except that its scope is over instances of STRUCTURE.

The procedure %VIZ_ATTR_ALL_ELLIPSES collects, for the currently instantiated CLASS instance, all of its attributes that have 'ATTR' as part of their identifier in the GVIZ view. The current class instance and this list of attributes are passed to procedure %VIZ_ATTR_ONE_ELLIPSE to draw an oval and an edge for each attribute in the list. The procedure outputs two directives to Dot: the first for the oval as a node, and the second for an edge from the class instance to the attribute (i.e. the oval). This procedure has to figure out whether the current attribute queued for printing is actually part of the class instance's primary key set (i.e. ?_PK:PKCONSTRAINT[CLASSNAME->?E, ATTRLIST->?A]), in which case the oval is printed in bold rather than in plain style. The relevant declarative code follows:

%viz_attr_all_ellipses([?E|?Re]) :-
    ?Cssa = collectset{ ?A | ?E(?A,'attr'):gviz },
    %viz_attr_one_ellipse(?E,?Cssa),
    %viz_attr_all_ellipses(?Re).
%viz_attr_all_ellipses([]) :- true.

%viz_attr_one_ellipse(?E,[?A|?Ra]) :-
    write(?E)@_prolog, write(?A)@_prolog,
    write('[shape=ellipse, label=')@_prolog, write(?A)@_prolog,
    if (?_Pk:pkconstraint[classname->?E, attrlist->?A])
    then ( writeln(',style=bold];')@_prolog )
    else ( writeln('];')@_prolog ),
    write(?E)@_prolog, write('->')@_prolog,
    write(?E)@_prolog, write(?A)@_prolog,
    writeln('[arrowhead=none]; ')@_prolog,
    %viz_attr_one_ellipse(?E,?Ra).
%viz_attr_one_ellipse(?_,[]) :- true.

Typical output follows:

…
/* attributes - single and simple (and part of pk) */
coursecname[shape=ellipse,width=.75,height=.5,label=cname,style=bold];
course->coursecname[arrowhead=none];
…

The procedure %VIZ_ATTR_DOUBLE_ELLIPSE takes care of set-simple attributes and is very similar to the one just described. The first difference is that the upper bound of an attribute (i.e. ?UP) is bound to a value not equal to one, for example '*'. The second difference is that the identifier of the view that relates the relevant set attributes to a class instance is 'SETATTR' rather than 'ATTR'. Thirdly, the shape is an ellipse with a double border, specified with the Dot qualifier PERIPHERIES=2.
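The set-simple view itself is not listed at this point; a minimal sketch, modelled on the single-simple view above and assuming the 'SETATTR' identifier just mentioned and an ATTRCARD value of 'set', might be:

?C(?A,'setattr'):gviz[attrtype -> ?Dt, attrcard -> 'set'] :-
    ?C:class[?A{?_:?Up}*=>?Dt],
    not(?Up = 1),                       // upper bound differs from one, e.g. '*'
    (?Dt='string' ; ?Dt='integer' ; ?Dt='float' ; ?Dt:domain),
    if ( ?C::?Sc, ?Sc:class[?A{?_:?Up}*=>?Dt] )   // exclude inherited attributes
    then ( false )
    else ( true ).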

The procedure catering for single-structure attributes that are not involved in a relationship instance of type structure is called %VIZ_ATTR_STRUCTURE. The views associated with this procedure check for CLASS and STRUCTURE attribute signatures whose upper limit (i.e. ?UP) is set to one and whose return data type is not a basic domain. Furthermore, the body of the rule checks for two cases that inhibit firing the rule head: first, that there is no relationship instance that uses this CLASS and ATTRIBUTE, and second, that there is no ancestral class that has this attribute signature already defined. Once fired, the following rules populate the views ?C(?A,'STRUCTATTR') and ?S(?A,'STRUCTATTR').

?C(?A,'structattr'):gviz[attrtype -> ?Dt, attrcard -> 'single'] :-
    ?C:class[?A{?Low:?Up}*=>?Dt],
    ?Up = 1,
    not(?Dt=?C ; ?Dt='string' ; ?Dt='integer' ; ?Dt='float' ; ?Dt:domain),
    if (not ?_:refconstraint[comp(?C,?A,?_,?_,?_,?_) -> ?_])
    then ( if (?C::?Sc, ?Sc:class[?A{?Low:?Up}*=>?Dt])
           then ( false ) )
    else ( false ).

?S(?A,'structattr'):gviz[attrtype -> ?Dt, attrcard -> 'single'] :-
    ?S:structure[?A{?Low:?Up}*=>?Dt],
    ?Up = 1,
    not(?Dt=?S ; ?Dt='string' ; ?Dt='integer' ; ?Dt='float' ; ?Dt:domain),
    if (not ?_:refconstraint[comp(?S,?_,?_,?_,?_,?_) -> ?_])
    then ( if (?S::?Sc, ?Sc:structure[?A{?Low:?Up}*=>?Dt])
           then ( false ) )
    else ( false ).


For each CLASS and STRUCTURE instance the procedure generates a list of the ATTRIBUTES that need to be attached to it, and a call is made to procedure %VIZ_ATTR_ONE_STRUCTURE with these two as arguments. The latter procedure first builds a node for the attribute, whose shape is an ellipse and whose name is the concatenation of the class and attribute names, and then builds an edge from the class instance to the node just defined. The procedure calls itself until all attributes attached to the CLASS instance have a node and an edge built. The code follows:

%viz_attr_all_structures([?E|?Re]) :-
    ?Cssa = collectset{ ?A | ?E(?A,'structattr'):gviz },
    %viz_attr_one_structure(?E,?Cssa),
    %viz_attr_all_structures(?Re).
%viz_attr_all_structures([]) :- true.

%viz_attr_one_structure(?E,[?A|?Ra]) :-
    write(?E)@_prolog, write(?A)@_prolog,
    write('[shape=ellipse, label=')@_prolog, write(?A)@_prolog,
    if (?_Pk:pkconstraint[classname->?E, attrlist->?A])
    then ( writeln(',style=bold];')@_prolog )
    else ( writeln('];')@_prolog ),
    write(?E)@_prolog, write('->')@_prolog,
    write(?E)@_prolog, write(?A)@_prolog,
    writeln('[arrowhead=none]; ')@_prolog,
    %viz_attr_one_structure(?E,?Ra).
%viz_attr_one_structure(?_,[]) :- true.

The procedure %VIZ_ATTR_DOUBLE_STRUCTURE is similar but applies a double perimeter when the upper limit is not equal to one; the earlier procedure %VIZ_ATTR_DOUBLE_ELLIPSE serves as its basis.

What remains to be described are the composite structure attributes that are also constrained by a relationship instance. The procedure %VIZ_STRUCT_RELATIONSHIP ensures that a composite object is depicted in the diagram. The procedure in turn calls %VIZ_STRUCT_RELATIONSHIP_COMP twice, with a list of all classes and then with a list of all structures. The latter procedure, which recurses on the list of identifiers passed as its argument, computes a list of referential constraint instances that are, first, of type structure (i.e. RELTYPE->'STRUCTURE') and, second, have a relationship component that maps from the procedure's argument to a structure instance (i.e. COMP(?C1, ?_, ?_, ?_, ?_, ?_) -> ?S). For this list of referential identifiers and relative class instances the procedure %VIZ_STRUCT_RELATIONSHIP_COMP_EDGE is called to build the required nodes and edge.

The procedure does three things: first it extracts details from the relative referential constraint instance; second it introduces the structure instance as an ellipse node (rather than a rectangle) with the right number of perimeters (e.g. double for a set); and third it draws an edge from the object just created to its class instance. The components of the structure are taken care of by the other procedures described earlier. The following code applies:

%viz_struct_relationship :-
    %viz_struct_relationship_comp(allclasses.classlist),
    %viz_struct_relationship_comp(allstructures.structlist).

%viz_struct_relationship_comp([?Cl|?Rl]) :-
    ?Rel = collectset{?R | ?R:refconstraint,
                           ?R[reltype->'structure',
                              comp(?Cl,?_,?_,?_,?_,?_)->?S],
                           ?S:structure},
    %viz_struct_relationship_comp_edge(?Cl,?Rel),
    %viz_struct_relationship_comp(?Rl).
%viz_struct_relationship_comp([]) :- true.

%viz_struct_relationship_comp_edge(?Cleft,[?R|?Rest]) :-
    ?R:refconstraint[reltype->'structure',
                     comp(?Cleft,?A,?_,?_,?Up,?_)->?S],
    ?S:structure,
    fwrite(?S, '[shape=ellipse, label=', ?A),
    if (?Up = 1)
    then (writeln(']; ')@_prolog)
    else (writeln(',peripheries=2]; ')@_prolog),
    fwrite(?Cleft, '->', ?S),
    writeln('[arrowhead=none]; ')@_prolog,
    %viz_struct_relationship_comp_edge(?Cleft,?Rest).
%viz_struct_relationship_comp_edge(?_,[]) :- true.

The output in Dot syntax follows:

…
/* composite relationship */
deptrole[shape=ellipse,width=.75,height=.5,label=drole];
dept->deptrole[arrowhead=none];
…

9.2 Sample Output

To generate the diagram one needs to invoke the procedure %VIZ_SCHEMA over some schema, e.g. 'afterScott', and direct its output into GraphViz for Dot to render. The result of %VIZ_SCHEMA is shown in Appendix "GraphVIZ / Dot Specification (afterScott)"; its rendering by Dot is shown in figure 9.2.


Figure 9.2 – Rendering of schema instance

9.3 Completeness and Correctness

The described framework encodes an EERM diagram through a number of assertions that describe the latter's structures and relationships. Furthermore, a number of methods implement integrity constraints and other checks that support both the EERM encoding and the framework itself; some of these have been presented earlier. The method %VIZ_SCHEMA of object SCHEMA generates a graph of the encoding in Dot syntax. This section sketches an assessment of this graphing process.


The assessment needs to address two issues: namely correctness and completeness. Of course, an indicative measure of the mapping's quality is obtained by running the GraphViz tool on the generated specification to see if it is accepted; specifically, if a generated specification does not render then there is a problem. On the other hand, one cannot conclude from a valid drawing that the mapping is complete and correct.

9.3.1 The ISA relationship

For completeness of the ISA relationship mapping we need to show two things: first, that there are no F-logic ISA assertions (i.e. ::) within the framework and the scope of the EERM encoding that are not in the diagram generated; and second, that there is no ISA relationship in the diagram that is not encoded in the framework.

If one fires the query ?- ?C:CLASS, ?CD:CLASS, ?C::?CD on our example schema then one gets the instantiations of the ISA assertions related to CLASS instances. Clearly the second row below is correct but not needed for drawing.

?C = lecturer    ?Cd = person
?C = ptstudent   ?Cd = person
?C = ptstudent   ?Cd = student
?C = student     ?Cd = person

The following query and its result show how to weed out any inferences by transitivity.

?- ?C:class, ?Cd:class, ?Cd::?C,
   \+((?_Ci:class, ?Cd::?_Ci, ?_Ci::?C)).

?C = person    ?Cd = lecturer
?C = student   ?Cd = ptstudent
?C = person    ?Cd = student

On the other hand, the following query and its result show the instances of the constraint ISAPROPERTY, clearly giving the same response as the previous query but from a different data source. (The same check could be expressed with the predicate defined earlier, i.e. ?C:CLASS[DIRECTASC->?C1].)

?- ?_Isa:isaproperty[parentclass->?C, isaclass->?Cd].

?C = person    ?Cd = lecturer
?C = person    ?Cd = student
?C = student   ?Cd = ptstudent

Consequently we need the following two queries not to instantiate.

?- ?_Isa:isaproperty[parentclass->?C, isaclass->?Cd],
   not((?C:class, ?Cd:class, ?Cd::?C)).
No.

?- ?C:class, ?Cd:class, ?Cd::?C,
   \+((?_Ci:class, ?Cd::?_Ci, ?_Ci::?C)),
   not((?_Isa:isaproperty[parentclass->?C, isaclass->?Cd])).
No.


In procedure %VIZ_ISA_ENTRY the list of ISA instances is generated from the ISAPROPERTY instances, and therefore the last two queries safeguard the completeness of the ISA mapping.

There are a number of issues concerning the correctness of the ISA mapping. First, the ISAPROPERTY instances' properties (e.g. PARENTCLASS and DISJ_OVER_FLAG) need to have proper values assigned to them; this was explained in the earlier section "ISA template" (section 8.2.1.3). Second, the drawing should be correct and legible: going through the code it is simple, if tedious, to go through the combinations and explain how each instantiation of argument values produces the relative and unambiguous drawing fragment. Third, the relative nodes and edges are properly connected.

9.3.2 Classes, Structures & Attributes

The lists of classes and structures are available as views in the framework: as ALLCLASSES.CLASSLIST and ALLSTRUCTURES.STRUCTLIST respectively.

The completeness and correctness of class instance drawing is relatively straightforward. The procedure %VIZ_CLASS uses the CLASS list as an argument when it calls %VIZ_CLASS_ENTRY, and therefore it is complete. In terms of correctness, the recursive procedure %VIZ_CLASS_ENTRY just prints the head of the CLASS list as a named box (i.e. a node) on each call until the list is exhausted.

The completeness and correctness of structure instance drawing is straightforward too but comes in two batches. The procedure %VIZ_STRUCTURE does use the STRUCTURE list as an argument when it calls the recursive procedure %VIZ_STRUCTURE_ENTRY, but the latter only prints a double perimeter box if the argument's head is involved in a 'weak' relationship instance. The STRUCTURE instances that are not 'weak' have to be involved in a relationship instance of type 'structure'; these are taken care of in procedure %VIZ_STRUCT_RELATIONSHIP_COMP, printed as an ellipse, and connected to their entity in procedure %VIZ_STRUCT_RELATIONSHIP_COMP_EDGE. To check that all structures are involved in either 'weak' or 'structure' relationship instances the following denial query works.

?- ?S:structure,
   \+((?C:class,
       ?_Rc:refconstraint[(reltype->'structure' ; reltype->'weak'),
                          comp(?C,?_,?_,?_,?_,?_)->?S])).
No.


The query to check that all ‘weak’ and ‘structure’ relationship instances use structure instances follows:

?- ?C:class,
   ?_Rc:refconstraint[(reltype->'structure' ; reltype->'weak'),
                      comp(?C,?_,?_,?_,?_,?_)->?S],
   \+((?S:structure)).
No.

The correctness of drawing the structures involved in 'weak' relationship instances is clear, as the procedure checks the relationship type and, if need be, prints the node details. (The weak relationship instance edge comes later.) The correctness of drawing the composite artefact in a 'structure' relationship instance involves introducing the node as an ellipse together with an edge from the entity to this composite object (i.e. encoded as a structure).

The mapping of attributes is also straightforward, and it is best to enumerate the two sources of attributes: CLASS and STRUCTURE instances. The mapping uses views, described earlier, to enumerate the attributes to draw – for example those that are declared in a class rather than inherited. These include the CLASS and STRUCTURE lists, and instances of object GVIZ. The procedure that caters for attribute drawing is called %VIZ_ATTRIBUTE, and it in turn calls procedures to draw single and multi-valued attributes. For completeness, one invokes a denial query that checks that all CLASS and STRUCTURE instances are present (e.g. as GVIZ instances). Correctness has to deal with the depiction of each attribute and verify that the code partitions according to the signature declarations of CLASS and STRUCTURE instances; correctness therefore also depends on the CLASS and STRUCTURE data signatures. Once a CLASS instance, or a STRUCTURE instance, is associated with a set of attributes, the latter are created as nodes whose properties also determine their rendition (e.g. oval, double oval, bold perimeter for primary key participation), and an edge is built from each attribute to its CLASS instance.
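One such consistency check can be sketched as a denial query (an assumed sketch, not taken from the thesis): it should not instantiate when every GVIZ node identifier refers to a known CLASS or STRUCTURE instance.

?- ?E(?_A,?_Kind):gviz, \+(( ?E:class ; ?E:structure )).
No.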

9.3.3 Binary Relationships

This section considers the completeness and correctness of drawing relationship instances of type 'normal' and 'weak'. We recall that the framework converts, or expects the import of, an n-ary relationship instance (with n ≥ 3) into n binary relationships. This transformation's validity was described in section 8.2.1.7 of chapter 8. Consequently only binary relationships are examined.


All relationship instances are instantiated by object REFCONSTRAINT, and therefore a binary relationship has two components, each arbitrarily assigned a '1' or a '2', i.e. a serial number. Also, for each of its components it must hold that if a component starts at one class instance and leads to a second, then the other component must have these instances inverted. Furthermore, the lower and upper limits in each component must take one of six possible value combinations, without loss of generality. First note that object UDO is a direct ancestor of objects CLASS and STRUCTURE. The first denial query below checks the serial number compliance of the relative relationship instance and that the relationship components go forward and back in terms of CLASS and STRUCTURE instances. The second denial query checks that the lower and upper limits of each relationship instance component take a proper combination of values (i.e. one supported by procedure %VIZ_REL_EDGE_LU, described earlier).

?- ?I::udo.

?I = class
?I = structure
Yes.

?- ?R:refconstraint[reltype->'normal' ; reltype->'weak'],
   \+(( ?R[comp(?C1,?_,?_,?_,?_,1)->?C2,
           comp(?C2,?_,?_,?_,?_,2)->?C1],
        ?C1:udo, ?C2:udo )).
No.

?- ?R:refconstraint[reltype->'normal' ; reltype->'weak'],
   \+(( ?R[comp(?_,?_,?_,?L,?U,?_)->?_], %viz_rel_edge_lu(?L,?U,?_) )).
No.

The check by the first denial ensures that each relationship instance has a component with serial number '1' assigned, and completeness is ensured by enumerating through procedures %VIZ_REL_NORMAL and %VIZ_REL_WEAK. Since the restriction is made on the serial number being '1', a relationship instance is only invoked once, and therefore only one edge is drawn for it.

The correctness of the edge drawing is given by the fact that a relationship instance component's lower and upper limits are what is expected, and an unambiguous drawing is generated for each. Without loss of generality, only one component contributes the caption on an edge. Also the nodes, i.e. entities or weak entities, are present and have already had their properties defined in an earlier part.

Although 'normal' and 'weak' relationship instances are catered for by different procedures, they actually only differ in their data collection part and therefore share common procedures.


9.4 EERM to ODMG ODL mapping

In structured methodologies for system design, the step from logical to physical data design has to have high coverage, if not complete, and be correct. Another aspect is the reversibility of the translation; unfortunately in practice this conversion is mostly unidirectional, i.e. from diagram to schema.

In this research conceptual data design is represented with an EERM model, and it has been indicated earlier that our EERM model is built in and exported from a CASE tool. The logical and physical data design is through an object-oriented data model that specifically uses ODMG ODL constructs.

This section explains an automated procedure that converts an EERM model encoded, verified, and validated in our framework into a schema expressed with ODL constructs. These are then executed by an OODBMS, and an object database is created with a schema instantiated from the conversion's output.

The reasons for automating this conversion are as follows. Firstly, the process is quick, especially when compared to hand coding. Secondly, the conversion is systematic and disciplined, for example in the design patterns it creates and in naming schema artefacts. Thirdly, the conversion is a basis on which the structural conversion can be augmented with transitional artefacts (e.g. triggers and procedures). Fourthly, the conversion can provide comments and indicators that the data designer reads and then addresses. Another advantage of automated conversion is that it is relatively straightforward to cater for minor idiosyncrasies in ODL syntax due to different implementations.

The dialect of ODMG ODL chosen as the target for this conversion is EyeDB's [VIERA99]. EyeDB provides decent compliance with ODMG ODL and has also implemented ODMG OQL. EyeDB has support for and programming interfaces with C++ and Java, and is currently in an open-source development regime.

9.4.1 The Problem Definition

The aim of the conversion is to build a schema with object-oriented data model constructs that implements the EERM diagram, after the latter has been read, validated and mapped into our framework. The conversion has to have a high level of completeness and correctness.


The main construct in the ODL is the class. In an ODL class one implements: the ISA relationships, attributes, binary relationships, constraints (for example unique and not null), and indices (physical constructs). Useful constructs in ODL for this conversion are the forward directive and enumerated type definition.

In our framework, which is a systematic repository for an EERM diagram, a number of checks, conversions, and constraints are enabled. Furthermore our framework builds a number of views (i.e. instantiating rules) that make the conversion easier to manage. Also, as was the case when generating a diagram with GraphViz, one has to bear in mind two things during conversion: inherited properties are excluded from a new class specification, and the referential constraint of a binary relationship has to be implemented differently in each respective class.

Finally, if there is a construct in the EERM diagram that is not convertible into EyeDB ODL, then the conversion has to make it clear whether to continue with an "approximation" or to halt altogether.

9.4.1.1 General Procedure

An object SCHEMA method, called %ODL_SCHEMA, generates the ODL constructs that are faithful to the encoded EERM. Prior to calling this method one has to confirm that all of the integrity constraints entrenched in the framework and the EERM diagram encoding are adhered to for the mapping to start. Also, if applicable, any n-ary relationship with n > 2 needs to be converted, prior to the start of the conversion, with the patterns illustrated in an earlier section, and any relationship attributes homed in an appropriate entity.

A high-level description of the method %ODL_SCHEMA follows. First the conversion starts with basic schema details dumped into the leading part of the output. The second part ensures that the user-defined enumerated types are specified and that all classes are 'forward' referenced (needed, for example, to address the reciprocal nature of any relationship construct). The final part creates the ODL class constructs that implement the EERM's entities, weak entities, ISA relationships, relationship constraints, and other constraints. During the latter phase the conversion makes use of a number of instantiated rules that generate objects, each of which is an instance-of object ODMG.
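Based on this description, a skeleton of %ODL_SCHEMA might look as follows; this is a sketch, and the name of the final part, %ODL_CLASSES, is hypothetical since the thesis describes that phase without naming a single procedure at this point:

%odl_schema :-
    %odl_header,                             // part one: basic schema details
    %odl_enum(alldomains.domainlist),        // part two: enumerated types ...
    %odl_forward(allclasses.classlist),      // ... and forward references
    %odl_forward(allstructures.structlist),
    %odl_classes.                            // final part: class constructs (hypothetical name)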


There are two techniques for applying EyeDB ODL constructs to build the object base. A rather straightforward one is to compose all constructs and then submit them as a whole (i.e. batch mode). Another technique is to build classes incrementally in the following sequence: classes, attributes, conversion of the relevant attributes to relationships, and constraints introduced one at a time. The ODMG ODL standard does not cater for these types of constructs – in SQL these would be the ALTER TABLE and ALTER CONSTRAINT constructs. Although EyeDB does allow incremental build-up in its ODL processing, though not by using the like of an ALTER command, it has occasional issues. Consequently the batch mode had to be used.

When the procedure runs successfully its output is directed into a text file. EyeDB's ODL utility, called EYEDBODL, is invoked and takes as input the name of the database to build the schema in and the schema specification file just created by %ODL_SCHEMA. The EyeDB tool to create a database is called EYEDBADMIN. The following script, with most output stripped, illustrates this on a Linux system. (Also, Appendix "EyeDB ODL Specifications (afterScott)" and "EyeDB processing of afterScott schema Specifications" have a copy of the full ODL file specification and the output of the EYEDBODL utility run on it.)

j@a1:~/.../eyedb/afterscott$ eyedbadmin database create afterscott
…
Done
j@a1:~/.../eyedb/afterscott$ eyedbodl -d afterscott -u ascottschema.odl
Updating 'ascottschema' schema in database afterscott...
Adding class person
…
Done

The leading part of the conversion process is taken care of by a procedure called %ODL_HEADER. It just reads schema details from the framework title objects and prints them. Its code follows:

%odl_header :-
    write('// schema name & version: ')@_prolog,
    write(schema.schemaname)@_prolog,
    writeln(schema.schemaversion)@_prolog.

9.4.1.2 Domains and Forward references

ODMG ODL allows for enumerated type definitions through the ENUM directive. Our framework allows for such artefacts; these are called domains and are instances of the object DOMAIN. The following is an example of a domain instance in our framework:

sex:domain [ enum -> {'male','female'} ].


There are two restrictions to cater for: firstly, our framework bestows a string data type on each domain instance enumeration; secondly, EyeDB requires that all strings are unique across all domain instances. In our framework there is a view, called ALLDOMAINS.DOMAINLIST, that holds a list of all DOMAIN instances. The following code explains its instantiation.

alldomains[domainlist->?L] :- ?L = collectset{ ?D | ?D:domain }.
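For example, assuming SEX is the only DOMAIN instance loaded, the view would evaluate as follows (output format approximated):

?- alldomains[domainlist -> ?L].
?L = [sex]
Yes.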

The outermost method calls %ODL_ENUM with ALLDOMAINS.DOMAINLIST as an argument. The latter procedure recursively builds an ENUM directive for each DOMAIN instance in the argument list. The enumerated type’s content is built by a COLLECTSET predicate (and unified with the variable ?EL) and is individually printed by calling another recursive procedure, %ODL_ENUM_BODY, on this list of values (i.e. in ?EL). This procedure needs to build a comma-delimited list of the values that comprise the ENUM instance, which explains the three rules required to define %ODL_ENUM_BODY rather than the usual two (i.e. the last entry does not require a comma).

%odl_enum([?Head|?List]) :-
    write(' enum ')@_prolog, write(?Head)@_prolog, writeln(' { ')@_prolog,
    ?El = collectset{ ?I | ?I = ?Head.enum },
    %odl_enum_body(?El),
    writeln(' }; ')@_prolog,
    %odl_enum(?List).
%odl_enum([]) :- true.

%odl_enum_body([?Enum|[]]) :- !, writeln(?Enum)@_prolog.
%odl_enum_body([?Enum|?List]) :-
    write(?Enum)@_prolog, writeln(',')@_prolog,
    %odl_enum_body(?List).
%odl_enum_body([]) :- true.

The procedure generates the following ODL code for the SEX DOMAIN above.

enum sex { female, male };

The ODMG standard specification allows for the pre-definition of classes and interfaces; that is, only their name is defined. This allows a class specification to have references to another class that has not, at that point in time, been fully defined to the compiler. These are called ‘forward declarations’ in the ODL specifications. The outermost method calls the recursive %ODL_FORWARD_REF procedure with a list of CLASS instances to build their forward declarations (see below). Actually the procedure is called twice: the first time with all CLASSES and the second time with all STRUCTURES. We recall that the latter list has both weak entities and structures used to build composite objects. Although ODMG ODL has its own ‘structure’ construct to build composite objects, EyeDB builds these through references to other classes, so the code to generate is identical (more detail in the contiguous sections).

%odl_forward_ref([?Cl|?Rl]) :-
    write('class ')@_prolog, write(?Cl)@_prolog, writeln(';')@_prolog,
    %odl_forward_ref(?Rl).
%odl_forward_ref([]) :- true.

The following is part of the forward declarations generated.

class course;
class dept;
class lecturer;

9.4.1.3 ODL Class Head

The method %ODL_SCHEMA calls the recursive procedure %ODL_PRINT_CLASSES twice to print the ODL class constructs, with the procedure’s sole argument being a list of EERM artefacts. In the first case it is the list of entities found in ALLCLASSES.CLASSLIST, and in the second case it is the list ALLSTRUCTURES.STRUCTLIST, where sifting of the weak entities and composite structures is possible. The procedure is invoked on a list of entities, and for each item of this list it calls procedure %ODL_PRINT_CLASS_HEAD with the entity instance as its sole argument.

In %ODL_PRINT_CLASS_HEAD a simple decision is taken on whether the current item being processed has a direct ancestor or not (i.e. is the path expression ?HEAD.DIRECTASC[] defined?). If it is, then the single and direct ancestor is included with ODL’s EXTENDS directive. In either case the procedure calls %ODL_PRINT_CLASS_BODY with the current item as the sole argument to print the class properties (e.g. attributes and relationships) within the mandatory brace tokens. The following is the procedure’s declarative code:

%odl_print_class_head(?Head) :-
    if ( ?Head.directasc[] )
    then ( write('class ')@_prolog, write(?Head)@_prolog,
           write(' extends ')@_prolog, write(?Head.directasc)@_prolog,
           writeln(' { ')@_prolog,
           %odl_print_class_body(?Head),
           writeln(' } ;')@_prolog )
    else ( write('class ')@_prolog, write(?Head)@_prolog,
           writeln(' { ')@_prolog,
           %odl_print_class_body(?Head),
           writeln(' } ;')@_prolog ).

It is to be noted that ODMG ODL does not offer distinct artefacts for entities and weak entities; consequently the ODL CLASS artefact is used for both. ODMG ODL does provide the STRUCTURE artefact, which is useful to build composite objects, but EyeDB ODL does not support this syntax and instead uses the CLASS artefact for these too (the next subsection explains exactly how). As regards inheritance, ODMG ODL only supports single inheritance between its CLASSES. Furthermore the ISA constraint implementation in ODMG ODL is limited to disjoint, semantic and partial. Therefore all other combinations are not supported directly; triggers can address some variants but these are not found in the ODMG ODL standard.

Examples of the code generated by %ODL_PRINT_CLASS_HEAD follow (but without any output from %ODL_PRINT_CLASS_BODY):

class job { … };
class lecturer extends person { … };

9.4.1.4 ODL Class body – Attributes

The procedure %ODL_PRINT_CLASS_BODY takes one EERM entity at a time as an argument (it is the same one as that of the calling procedure – ?HEAD). The procedure then systematically sifts for its simple & single attributes, simple & set attributes, structure & single attributes, structure & set attributes, the ‘one’ side of binary relationships, the ‘many’ side of binary relationships, primary key constraints, and ‘not null’ constraints. The procedure’s code follows; it is to be noted that the second argument of each call passes a keyword for each type of class property to generate (e.g. ‘SETATTR’ for simple & set). The procedures for finding and directing the printing of a class’s properties are kept as straightforward as possible. In fact each of these procedures depends heavily on a view specifically built for it. In this section only the first four procedures are of interest.

%odl_print_class_body(?Head) :-
    %odl_print_class_simple_single_attr(?Head,'attr'),
    %odl_print_class_simple_set_attr(?Head,'setattr'),
    %odl_print_class_structure_single_attr(?Head,'structattr'),
    %odl_print_class_structure_set_attr(?Head,'setstructattr'),
    %odl_print_class_oneside_relationship(?Head,'onesiderel'),
    %odl_print_class_manyside_relationship(?Head,'manysiderel'),
    %odl_print_class_pk(?Head),
    %odl_print_class_notnull_constraint(?Head,'notnull').

The procedure %ODL_PRINT_CLASS_SIMPLE_SINGLE_ATTR takes two arguments. The first is the EERM entity and the second the type of property being sifted; i.e. ?PT unifies with ‘ATTR’. The procedure then builds, through the COLLECTSET aggregate predicate, a list of the properties of the entity passed as the first argument that are simple & single. This list and the two arguments of the calling procedure are passed on to a recursive procedure named %ODL_PCSA_ATTRLIST, which for each item on the simple & single list prints an ODMG ODL attribute directive. (Please recall that the FWRITE predicate is shorthand for a list of WRITE predicates.) Before the data type of an attribute is printed, a procedure named %DT_MAP is called to map the entity’s attribute data type and cardinality annotations into the closest data type in ODMG (and EyeDB) ODL. Further below there is an example attribute – ODL class DEPARTMENT and attribute DNAME.

%odl_print_class_simple_single_attr(?Head,?Pt) :-
    ?Cssa = collectset{ ?A | ?Head(?A,?Pt):odmg },
    %odl_pcsa_attrlist(?Head,?Pt,?Cssa).
%odl_print_class_simple_single_attr(?_,?_) :- true.

%odl_pcsa_attrlist(?Head,?Pt,[?Ha|?Rlist]) :-
    ?Head(?Ha,?Pt):odmg[attrtype->?Dt, attrcard->?Ac],
    %dt_map(?Dt,?Ac,?Dtm),
    fwrite('attribute ', ?Dtm, ?Ha, ';'),
    %odl_pcsa_attrlist(?Head,?Pt,?Rlist).
%odl_pcsa_attrlist(?_,?_,[]) :- true.

%dt_map(integer, single, int)        :- !, true.
%dt_map(integer, set,    int)        :- !, true.
%dt_map(float,   single, double)     :- !, true.
%dt_map(float,   set,    double)     :- !, true.
%dt_map(string,  single, string)     :- !, true.
%dt_map(string,  set,    'char[50]') :- !, true.
%dt_map(?Dt1, ?_Card, ?Dt2) :- ?Dt1 = ?Dt2, !, true.
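As an illustration, an interactive call of %DT_MAP would behave roughly as follows (output format approximated):

?- %dt_map(string, set, ?Dtm).
?Dtm = char[50]
Yes.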

In the case of attributes that are simple & set, the procedures, whose head is %ODL_PRINT_CLASS_SIMPLE_SET_ATTR, are very similar to those for simple & single. The views have instances whose identifier contains ‘SETATTR’. The main difference is that the output pattern in ODL, to implement the set part, requires “SET < DATA TYPE > ATTRIBUTE_NAME;”. This is aided by the data type mapping procedure %DT_MAP explained earlier. Further below there is an example attribute – ODL class LECTURER and attribute DEGREES.

It is important to specify the views that support these procedures. For each class attribute its data type signature is used to extract the required details. For the ‘ATTR’ and ‘SETATTR’ types of ODMG instances (the rule definitions follow) the query of each view is similar, and their distinction is based on the upper limit of an attribute; i.e. if it is ‘1’ then the attribute forms part of the ‘ATTR’ view, otherwise it is in the ‘SETATTR’ view. In either view two further tests are made: first, the data type of the attribute must be a basic data type or an instance-of DOMAIN (e.g. an enumerated type); second, the attribute must not be inherited from an ancestral entity. When the procedure is invoked on instances of ALLSTRUCTURES.STRUCTLIST similar views exist for STRUCTURES; i.e. these are defined with instances of STRUCTURE rather than CLASS.


?C(?A,'attr'):odmg[attrtype->?Dt, attrcard->'single'] :-   // single attribute
    ?C:class[?A{?Low:?Up}*=>?Dt],
    ?Up=1, (?Dt='string';?Dt='integer';?Dt='float';?Dt:domain),
    if ( ?C::?Sc, ?Sc:class[?A{?Low:?Up}*=>?Dt] )
    then ( false ) else ( true ).

?C(?A,'setattr'):odmg[attrtype->?Dt, attrcard->'set'] :-   // set attribute
    ?C:class[?A{?Low:?Up}*=>?Dt],
    ?Up!=1, (?Dt='string';?Dt='integer';?Dt='float';?Dt:domain),
    if ( ?C::?Sc, ?Sc:class[?A{?Low:?Up}*=>?Dt] )
    then ( false ) else ( true ).

?C(?A,'structattr'):odmg[attrtype->?Dt, attrcard->'single'] :-   // single str
    ?C:class[?A{?Low:?Up}*=>?Dt],
    ?Up=1, not(?Dt=?C;?Dt='string';?Dt='integer';?Dt='float';?Dt:domain),
    if ( ?_:refconstraint[reltype->'structure',
                          comp(?C,?A,?_,?_,?_,?_) -> ?S], ?S:structure )
    then ( if ( ?C::?Sc, ?Sc:class[?A{?Low:?Up}*=>?Dt] )
           then ( false ) else ( true ) )
    else ( false ).

?C(?A,'setstructattr'):odmg[attrtype->?Dt, attrcard->'set'] :-   // set str
    ?C:class[?A{?Low:?Up}*=>?Dt],
    ?Up!=1, not(?Dt=?C;?Dt='string';?Dt='integer';?Dt='float';?Dt:domain),
    if ( ?_:refconstraint[reltype->'structure',
                          comp(?C,?A,?_,?_,?_,?_) -> ?S], ?S:structure )
    then ( if ( ?C::?Sc, ?Sc:class[?A{?Low:?Up}*=>?Dt] )
           then ( false ) else ( true ) )
    else ( false ).

The views for STRUCTATTR and SETSTRUCTATTR are slightly more involved (their rule definitions are found above). Apart from checking that an attribute’s return data type is neither a basic type nor an instance-of DOMAIN, the rule instantiation demands that there is a REFCONSTRAINT instance of type structure, as in ?_:REFCONSTRAINT[RELTYPE -> ’STRUCTURE’], that relates the composition between a CLASS attribute and a STRUCTURE (taking the role of a composite object part). If the attribute appertains to a STRUCTURE then the views are slightly different, as the return type of a REFCONSTRAINT composition is either a CLASS or a STRUCTURE instance (i.e. ?_:REFCONSTRAINT[COMP(?S,?A,?_,?_,?_,1) -> ?SR], (?SR:STRUCTURE ; ?SR:CLASS)).

To print the entity’s structure attributes, whether set or singleton, the procedure collects these for every CLASS and STRUCTURE and then recursively processes them to print the respective ODL attribute directive. The code for set & structure is given next:

%odl_print_class_structure_set_attr(?Head,?Pt) :-
    ?Cssa = collectset{ ?A | ?Head(?A,?Pt):odmg },
    %odl_pcssa3_attrlist(?Head,?Pt,?Cssa).
%odl_print_class_structure_set_attr(?_,?_) :- true.

%odl_pcssa3_attrlist(?Head,?Pt,[?Ha|?Rlist]) :-
    ?Head(?Ha,?Pt):odmg[attrtype->?Dt, attrcard->?Ac],
    fwrite(' attribute ', ?Ac, ' < ', ?Dt, '* > ', ?Ha, '; '),
    %odl_pcssa3_attrlist(?Head,?Pt,?Rlist).
%odl_pcssa3_attrlist(?_,?_,[]) :- true.


An example of the code generated for these cases is found below in class DEPARTMENT and attribute DROLE. As remarked earlier, EyeDB does not support the structures of ODMG ODL but implements them with classes. In EyeDB we convert an ODMG “structure” into a class, say class CS, and when this CS is required for an attribute’s type, called ACS in class CM, we specify the attribute as a literal set of CS object identifiers. This implies that although instances of CS have identity, the set of CS in attribute ACS of class CM is inline (i.e. stored as a literal value). A schematic sketch follows, before the actual generated example.
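Schematically, and using the hypothetical names CS, CM, and ACS from the paragraph above (the attribute P1 is also hypothetical), the EyeDB pattern would read:

// hypothetical illustration of the EyeDB structure-as-class pattern
class cs { attribute string p1; };
class cm { attribute set<cs *> acs; };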

class dept {
    attribute string dname;
    attribute set < deptrole * > drole;
    …
};
class lecturer extends person {
    attribute set<char[50]> degrees;
    …
};

9.4.1.5 ODL Class body – Relationships

To generate a binary relationship construct in ODMG requires two definitions: one on each participating CLASS instance. Also each participation of these instances has only two options in terms of cardinality: either ‘one’ or ‘many’. Along these lines, two views are defined to support a class’s participation in binary relationships. To determine whether a participation is of type ‘one’ or ‘many’ one checks the attribute’s upper limit in its data type signature. For an attribute to be included in a relationship view it needs to be covered by an instance-of REFCONSTRAINT whose type is either ‘normal’ or ‘weak’.

?C(?A,'onesiderel'):odmg
    [relcard->'one', reltoclass->?Dt, reltoattr->?A2,
     relref(?Line)->?Rc, reltype->?Rt] :-
    ?C:class[?A{?_:?Up}*=>?Dt], ?Up=1,
    ?Rc:refconstraint[reltype->?Rt, (?Rt='normal';?Rt='weak'),
                      comp(?C,?A,?_,?_,?_,?Line)->?Dt,
                      comp(?Dt,?A2,?_,?_,?_,?_)->?C],
    ?A!=?A2.

?C(?A,'manysiderel'):odmg
    [relcard->'many', reltoclass->?Dt, reltoattr->?A2,
     relref(?Line)->?Rc, reltype->?Rt] :-
    ?C:class[?A{?_:?Up}*=>?Dt], ?Up!=1,
    ?Rc:refconstraint[reltype->?Rt, (?Rt='normal';?Rt='weak'),
                      comp(?C,?A,?_,?_,?_,?Line)->?Dt,
                      comp(?Dt,?A2,?_,?_,?_,?_)->?C],
    ?A!=?A2.

The two heading procedures are applied to every CLASS and STRUCTURE instance. For example, when a CLASS instance (unified to ?HEAD) and a view type (‘MANYSIDEREL’ unified to ?PT) are passed to procedure %ODL_PRINT_CLASS_MANYSIDE_RELATIONSHIP, it creates a list of ODMG instances of type ‘MANYSIDEREL’ through the respective view. The same procedure then calls a recursive procedure, called %ODL_PCOMR_RELLIST, with three arguments: namely the CLASS instance, the component cardinality unified to ?PT, and the list of relationship instances associated with this class and cardinality. In %ODL_PCOMR_RELLIST details from the ODMG view instance are extracted, and consequently the syntactic structure of the RELATIONSHIP and its INVERSE construct are printed.

%odl_print_class_oneside_relationship(?Head,?Pt) :-
    ?Cssa = collectset{ ?A | ?Head(?A,?Pt):odmg },
    %odl_pcosr_rellist(?Head,?Pt,?Cssa).
%odl_print_class_oneside_relationship(?_,?_) :- true.

%odl_pcosr_rellist(?Head,?Pt,[?Ha|?Rlist]) :-
    ?Head(?Ha,?Pt):odmg[reltoclass->?C2, reltoattr->?A2, relref(?Line)->?Rc],
    fwrite('relationship ', ?C2, ' * ', ?Ha),
    fwrite('inverse ', ?C2, '::', ?A2, '; '),
    %odl_pcosr_rellist(?Head,?Pt,?Rlist).
%odl_pcosr_rellist(?_,?_,[]) :- true.

%odl_print_class_manyside_relationship(?Head,?Pt) :-
    ?Cssa = collectset{ ?A | ?Head(?A,?Pt):odmg },
    %odl_pcomr_rellist(?Head,?Pt,?Cssa).
%odl_print_class_manyside_relationship(?_,?_) :- true.

%odl_pcomr_rellist(?Head,?Pt,[?Ha|?Rlist]) :-
    ?Head(?Ha,?Pt):odmg[reltoclass->?C2, reltoattr->?A2, relref(?Line)->?Rc],
    fwrite('relationship set < ', ?C2, ' * > ', ?Ha),
    fwrite('inverse ', ?C2, '::', ?A2, '; '),
    %odl_pcomr_rellist(?Head,?Pt,?Rlist).
%odl_pcomr_rellist(?_,?_,[]) :- true.

Examples of the code generated by the procedures that produce binary relationship constructs follow:

class dept {
    …
    relationship address * mainoffice inverse address::adept;
    relationship set < course * > sponsors inverse course::sponsorer;
};

9.4.1.6 ODL Class body – Constraints

The framework supports a number of constraints, and in this section we address two that are directly supported by ODMG and EyeDB ODL, namely the primary key set and the ‘not null’ constraint.

To print the primary key set of a class, a procedure called %ODL_PRINT_CLASS_PK is invoked for every class and structure in the framework. The invocation builds a list of the attributes that compose the primary key set of the class being processed and calls the recursive procedure %ODL_PRINT_CLASS_PK_LINE to print this list in the comma-delimited format required by the ODMG ODL syntax. Unfortunately EyeDB only accepts a primary key set with a single attribute. Also the constraint, implemented as UNIQUE in EyeDB, requires a corresponding physical index definition to be effective. The following procedure is tailored for the EyeDB reality but requires minimal changes to properly define primary key constraints with a set of attributes.

%odl_print_class_pk(?C) :-
    ?Pkl = collectset{ ?A | ?Pk:pkconstraint, ?Pk.classname=?C,
                            ?Pk.attrlist=?A },
    %odl_print_class_pk_line(?Pkl).

%odl_print_class_pk_line([?Pk|?_L]) :-
    write(' constraint on ')@_prolog, write(?Pk)@_prolog,
    write('; ')@_prolog,
    write('index on ')@_prolog, write(?Pk)@_prolog, writeln(';')@_prolog.
%odl_print_class_pk_line([]) :- true.

Examples of the code generated by %ODL_PRINT_CLASS_PK follow:

class course {
    …
    constraint on cname; index on cname;
    constraint on sponsorer;
};

The framework supports candidate key constraints too. The conversion is almost identical to the procedure used for the primary key, except that the collection aggregate looks for instances of CKCONSTRAINT. In EyeDB a candidate key has to be defined with the same construct as the primary key set.
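A minimal sketch of this variant, assuming a hypothetical name %ODL_PRINT_CLASS_CK and reusing %ODL_PRINT_CLASS_PK_LINE, could read:

// sketch only: candidate keys collected from ckconstraint instances
%odl_print_class_ck(?C) :-
    ?Ckl = collectset{ ?A | ?Ck:ckconstraint, ?Ck.classname=?C,
                            ?Ck.attrlist=?A },
    %odl_print_class_pk_line(?Ckl).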

In the example above there is an instance-of the ‘not null’ constraint. To generate the respective ‘not null’ constraints in each ODMG class we need to populate two views (one for CLASSES and one for STRUCTURES). The rule checks each attribute signature for a lower cardinality limit set to ‘1’ and ensures that the attribute is not inherited. This applies to all attributes – even those taking part in referential constraints. The rule for CLASSES follows:

?C(?A,'notnull'):odmg[ consname -> 'notnull' ] :-   // not null constraints
    ?C:class[?A{?Low:?Up}*=>?Dt], ?Low=1,
    if ( ?C::?Sc, ?Sc:class[?A{?Low:?Up}*=>?Dt] )
    then ( false ) else ( true ).

To print the ‘not null’ constraint the procedure %ODL_PRINT_CLASS_NOTNULL_CONSTRAINT is called; it creates a list of attributes, from the view just defined, that require the ‘not null’ constraint in the class instance passed as an argument. The procedure then calls a recursive procedure, called %ODL_PCNNC_CONSLIST, to print the ODMG ODL NOTNULL constraint.

%odl_print_class_notnull_constraint(?Head,?Pt) :-
    ?Cssa = collectset{ ?A | ?Head(?A,?Pt):odmg },
    %odl_pcnnc_conslist(?Head,?Pt,?Cssa).
%odl_print_class_notnull_constraint(?_,?_) :- true.

%odl_pcnnc_conslist(?Head,?Pt,[?Ha|?Rlist]) :-
    ?Head(?Ha,?Pt):odmg,
    write(' constraint on ')@_prolog, write(?Ha)@_prolog,
    writeln('; ')@_prolog,
    %odl_pcnnc_conslist(?Head,?Pt,?Rlist).
%odl_pcnnc_conslist(?_,?_,[]) :- true.


9.4.2 %odl_schema Output

Once the %ODL_SCHEMA method of object SCHEMA is invoked, the EyeDB ODL specifications are generated. These specifications are based on what the framework has read in and encoded, and in turn on the transformations and checks carried out on the EERM model. The specifications are written into a text file and passed, as a script, to the EyeDB ODL program.

The actual output (i.e. the specification) for our non-trivial test EERM schema is found in Appendix “EyeDB ODL Specifications (afterScott)”. In Appendix “EyeDB processing of afterScott schema Specifications” one finds the verbose comments of EYEDBODL on processing this specification file.

9.4.3 Completeness and Correctness

The framework developed here encodes an EERM diagram through a number of facts and rules that describe the diagram’s structures and relationships. Furthermore a number of procedures implement integrity constraints and other checks that support both the EERM encoding and the framework itself; some of these have been presented earlier. The method %ODL_SCHEMA of object SCHEMA generates a script with EyeDB ODL constructs. This section sketches an evaluation of this conversion process.

The evaluation needs to address two issues: namely correctness and completeness. An indicative measure of conversion quality is running the EyeDB parser on the generated ODL syntax: if the syntax does not pass then there surely is a problem; on the other hand, if it compiles, that is still insufficient to state that the conversion is either complete or correct.

9.4.3.1 Converting the domains, entities, and weak entities

The completeness and correctness of the list of all domain instances defined in the framework is self-evident from the query used to build it. Nonetheless there are two correctness issues that need discussion, other than the correctness of the procedures themselves. The first is the fact that each DOMAIN instance has a set of values, and these values have to be of string data type; therefore the data typing checks, discussed in the next section, have to validate this. The second, which is more a requirement of the target environment (i.e. EyeDB ODL), is that an element of a DOMAIN instance must be distinct over all domain instances. This check requires the following denial query (i.e. there are no two elements that are the same if the domain instances are different) to fail:

?- ?D1:domain, ?D1.enum=?I1,
   ?D2:domain, ?D2.enum=?I2,
   (?D1!=?D2, ?I1=?I2).

The forward reference directive is a rather simple procedure that converts two lists of artefacts (i.e. CLASS and STRUCTURE instances) into ODL class definitions devoid of any details (e.g. attributes and constraints). The procedure is recursive and traverses the list passed, printing a keyword (i.e. CLASS) and the current head of the list. It is important to make a point about treating the CLASS and STRUCTURE lists with the same ODL artefact – i.e. class. Inevitably, once the EERM’s entities and weak entities are defined with the same construct, their distinction is lost on conversion. That is, one cannot state from its ODL definition whether a CLASS is an entity or a weak entity. (This is discussed in a following section on reverse engineering an ODL schema.)

To generate a complete CLASS definition that comprises ISA relationships, attributes, binary relationships, and constraints, the procedure starts in a similar way to the forward declaration just discussed. We know that for each item on the CLASS and STRUCTURE lists an ODL class has to be generated. To do this each list is passed as an argument to a simple recursive procedure, %ODL_PRINT_CLASSES, which on each invocation invokes another procedure to systematically add the ODL class properties for the head item of its argument list, relating to the relevant ODMG objects. Clearly the union of these two lists covers all of our EERM artefacts. Also, given the design choices we have accepted, the conversion is correct.

9.4.3.2 Converting the ISA relationship

The first argument concerns the modality of inheritance in our encoded framework model of an EERM diagram and in the ODMG ODL CLASS construct. Since both implement single inheritance there is no issue, and therefore ODMG ODL correctly specifies single inheritance. What is an issue is the fact that the ODMG inheritance mechanism is of type disjoint, partial and semantic.

If an ISA constraint specified in the EERM model is not encoded in this mode, then coercion into a mode acceptable to ODMG ODL is not tenable. The following denial, if successful, implies non-completeness:

?- ?Isa:isaproperty,
   \+(( ?Isa[disj_over_flag -> 'disjoint',
             totl_part_flag -> 'partial',
             sema_pred_flag -> 'semantic'] )).
No.


What can be done to improve the conversion and leave ‘hand coding’ as a last resort? Some aspects are addressed by improving the ODMG ISA mechanism and also by introducing a wider range of constraints (e.g. a CHECK constraint). Also, as stated earlier, a trigger specification language would be useful to implement the ISA specification details. At this point any ISA constraint that is not disjoint, partial and semantic is reported during conversion.

For completeness of the ISA relationship conversion we need to show two things: first, that there are no F-logic ISA assertions (i.e. ::) in the framework and the scope of the EERM encoding that are not in the generated CLASSES; second, that there is no ISA relationship in the conversion that is not encoded in the framework. Consequently we need the following two queries not to instantiate.

?- ?_Isa:isaproperty[ parentclass -> ?C, isaclass -> ?Cd],
   not((?C:class, ?Cd:class, ?Cd::?C)).
No.

?- ?C:class, ?Cd:class, ?Cd::?C,
   \+((?_Ci:class, ?Cd::?_Ci, ?_Ci::?C)),
   not((?_Isa:isaproperty[ parentclass -> ?C, isaclass -> ?Cd])).
No.

In procedure %ODL_PRINT_CLASS_HEAD each instance-of CLASS or STRUCTURE is checked for whether its direct ancestor is defined, so as to introduce the ODL ISA assertion (i.e. ?HEAD.DIRECTASC[]). Therefore one needs to ascertain that all ISA assertions are indeed caught by this filter.

?- ?C:class, ?Cd:class, ?Cd::?C,
   \+((?_Ci:class, ?Cd::?_Ci, ?_Ci::?C)),
   \+((?Cd.directasc=?C)).
No.

9.4.3.3 Converting the attributes

The two groups of procedures that print a class attribute, one for singletons and the other for ‘set of’, are mostly identical to each other; they differ only in their output and in which views they read from. The outer procedures take two arguments that limit the scope of which ODMG objects are of interest, and then build the relevant list to be recursively processed and converted into ODL attributes.

For the completeness property we need to investigate two data sources. The first is the familiar CLASS and STRUCTURE list of instances, which has been argued to be correct. The second is the list of relative ODMG objects per CLASS instance. The rule that generates the ODMG objects for singleton and non-inherited attributes of a CLASS instance follows (the others having been listed earlier):


?C(?A,'attr'):odmg[attrtype->?Dt, attrcard->'single'] :-   // single attribute
    ?C:class[?A{?Low:?Up}*=>?Dt],
    ?Up=1, (?Dt='string';?Dt='integer';?Dt='float';?Dt:domain),
    if ( ?C::?Sc, ?Sc:class[?A{?Low:?Up}*=>?Dt] )
    then ( false ) else ( true ).

What remains to be seen is that there are no instances of ODMG that are not assumed by the above rule. The following query checks for this:

?- ?C(?A,'attr'):odmg, ?C:class[ ?A{?_L:?U}*=>?Dt],
   \+((?U=1, (?Dt='string'; ?Dt='integer'; ?Dt='float'; ?Dt:domain) )).
No.

A similar check is done for every CLASS and STRUCTURE instance and for the qualifiers ‘ATTR’ and ‘SETATTR’.

The correctness of the generated code requires a last check, namely of the mapping from the EERM annotated data type of an attribute to an equivalent data type in ODMG ODL. This mapping is taken care of by procedure %DT_MAP. If the EERM annotation is data-type checked for appropriate values then this part of the conversion is complete and correct too.

Another two sets of procedures take care of single & structure and set & structure attributes. These two groups are very similar to those just described, but each has its own view and code generation. In particular, the view definition checks for a REFCONSTRAINT instance of type structure. The completeness tests just described are applicable here too. As for correctness, the distinctive part is the code generation and this requires explanation. Firstly, the composite object building has to be done through the EyeDB ODL CLASS construct, implying that the EERM structure is converted to a CLASS. Secondly, although the composite part is considered as ‘part-of’ the holding object, and consequently the former is a literal (i.e. with no need of an object identifier), object identifiers are still bestowed. This extra specification is an idiosyncratic part of EyeDB ODL.

Finally, a query similar to one presented earlier checks that there are no ODMG instances (of type ‘STRUCTATTR’ and ‘SETSTRUCTATTR’) that are not covered by the respective rules firing.

9.4.3.4 Converting the relationships

This section considers the completeness and correctness of converting relationship instances of type ‘normal’ and ‘weak’; instances of type ‘structure’ are taken care of by the structured-attribute conversion. We recall that the framework converts (or expects the import to convert) an n-ary relationship instance (with n >= 3) into n binary relationships. Consequently only binary relationships are examined.

All relationship instances are instances of object REFCONSTRAINT, and therefore a binary relationship has to have two components. To build a complete list of relationship instances we use the aggregate COLLECTSET predicate, as sketched below.
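A sketch of such a collection, restricted to the ‘normal’ and ‘weak’ types, might read as follows (an illustration, not necessarily the exact query used):

?- ?Rels = collectset{ ?Rc | ?Rc:refconstraint[reltype->?Rt],
                             (?Rt='normal' ; ?Rt='weak') }.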

For each of a relationship instance’s components it must hold that if a component starts at a class instance then the second component must return to it. Furthermore, the lower and upper limits in each component must have one of six possible values, without loss of generality. The denial queries required here are identical to those used for drawing the relationships with GraphViz.

We need to invoke a check that verifies that it is not the case that an ODMG instance escapes the relative view; e.g. ?C(?A,’ONESIDEREL’):ODMG. This check is similar to one presented in the previous section, and is sketched below.
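By analogy with the attribute check above, such a denial could be sketched as follows (an illustration only, not the exact query used):

?- ?C:class[?A{?_:?Up}*=>?_Dt], ?Up=1,
   ?_Rc:refconstraint[reltype->?Rt, (?Rt='normal' ; ?Rt='weak'),
                      comp(?C,?A,?_,?_,?_,?_)->?_],
   \+(( ?C(?A,'onesiderel'):odmg )).
No.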

The correctness of the conversion is given by the fact that a relationship instance component’s lower and upper limits are what would be expected, and that an unambiguous code fragment is generated for each. Although ‘normal’ and ‘weak’ relationship instances are catered for by different groups of procedures, they actually differ only in their data collection part and therefore share common procedures.

?C(?A,'onesiderel'):odmg
    [relcard->'one', reltoclass->?Dt, reltoattr->?A2,
     relref(?Line)->?Rc, reltype->?Rt] :-
    ?C:class[?A{?_:?Up}*=>?Dt], ?Up=1,
    ?Rc:refconstraint[reltype->?Rt, (?Rt='normal';?Rt='weak'),
                      comp(?C,?A,?_,?_,?_,?Line)->?Dt,
                      comp(?Dt,?A2,?_,?_,?_,?_)->?C],
    ?A!=?A2.

9.4.3.5 Converting the constraints

The framework supports a number of constraints (the following query lists these).

?- ?Cl = collectset{ ?C | ?C::constraint }.
?Cl = [check, ckconstraint, fdconstraint, isaproperty, notnull,
       oneonly, pkconstraint, refconstraint]
Yes.

EyeDB and ODMG ODL do not cover all of these, in particular CHECK, FDCONSTRAINT, and ONEONLY. Consequently completeness is lost. In this section we consider the completeness and correctness properties of two constraints, i.e. ‘not null’ and ‘primary key’, that are supported by EyeDB ODL but have not yet been discussed.

The ‘not null’ constraint is easy to check for completeness. The relative views hold attributes whose definitions have the lower cardinality limit set to ‘1’; this applies to attributes that implement binary relationships too. One can then check that any attribute with the lower cardinality set to ‘1’ is indeed implied by the respective view (a sketch of such a denial follows). The procedure that generates the ‘not null’ constraint is quite simple, and the code it generates is trivial to check for correctness.
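Such a check could be sketched with a denial of the following form (an illustration, assuming the NOTNULL view defined earlier):

?- ?C:class[?A{?Low:?_}*=>?_Dt], ?Low=1,
   \+(( ?C::?Sc, ?Sc:class[?A{?Low:?_}*=>?_Dt] )),   // skip inherited
   \+(( ?C(?A,'notnull'):odmg )).
No.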

EERM construct                 | Framework                                                    | EyeDB ODL                                                                           | EERM-ODL map issues
-------------------------------|--------------------------------------------------------------|-------------------------------------------------------------------------------------|--------------------
Entity                         | class                                                        | class                                                                               |
Weak entity                    | structure                                                    | class                                                                               |
Relationship (n=2) w/o attr    | refconstraint (i.e. normal, weak) & classes                  | class relationship and inverse construct, and not null constraint                   |
Relationship (n=2) + attr      | refconstraint (i.e. normal, weak) & class                    | class relationship and inverse construct, class attribute, and not null constraint  | With resolving class introduced for many to many
Relationship (n>2)             | refconstraint is converted to class & n binary relationships | Not supported                                                                       |
Attribute – simple & single    | class / structure                                            | class attribute                                                                     |
Attribute – simple & set       | class / structure                                            | class attribute                                                                     |
Attribute – structure & simple | class / structure, refconstraint to class (i.e. structure)   | class attribute whose type is an identifier (that implements the structure)        |
Attribute – structure & set    | class / structure, refconstraint to class (i.e. structure)   | class attribute whose type is set of identifiers (that implements the structure)   |
Possible notes                 | domain                                                       | enumerated types                                                                    |
Primary key                    | pkconstraint                                                 | class unique constraint                                                             | Partial (single attribute)
n/a                            | ckconstraint                                                 | class unique constraint                                                             | Partial (single attribute)
n/a                            | fdconstraint                                                 | class unique constraints                                                            | Partial (simple and single attribute)
ISA relationship               | ISA constraint                                               | class with constraint                                                               | Only disjoint, semantic and partial are implemented
Check constraint               |                                                              | Not directly supported                                                              |
Triggers                       | trigger                                                      | Not supported by ODMG ODL                                                           | n/a
Basic data types               | basic data type                                              |                                                                                     |

Table 9.1 – Summary of EERM, framework, and EyeDB ODL objects

The primary key constraint is relatively easy to check for completeness and correctness. If a constraint exists for a class instance then its participating attributes are collected from the relative constraint construct, and the ODL UNIQUE directive is built by a recursive procedure. Unfortunately, in EyeDB ODL the UNIQUE construct takes one and only one attribute; consequently the conversion is not complete. The correctness, limited to a singleton, is straightforward to verify as the syntax is simple – it uses the UNIQUE construct. A further requirement is that a physical artefact, i.e. an index, is built with every UNIQUE constraint.

Candidate key set constraints, of which one can assert that a primary-key constraint is a specialization, have the same arguments as the primary-key set. On top of that there is an issue in EyeDB’s implementation regarding correctness: there is no distinction between primary and candidate key definitions in the class definition.

9.4.3.6 Conversion summary

In table 9.1 one finds a summary of the objects, constructs, and the conversion from each to its corresponding artefact. It is important, at this point, to repeat the importance of having a high level of completeness, because this attenuates any coding required to address missing design artefacts.

9.4.4 Reverse Engineer an EERM Diagram from an ODL Schema

What about converting an ODL schema into an EERM diagram? It is an important question, or better still a pressing requirement from data designers, as a number of activities greatly benefit from this reverse engineering. For example, in database integration, i.e. between as yet independent data sources, this reverse engineering aids in quickly building a prototype schema. Unfortunately there are two issues. First, it is not given that if an EERM is mapped into a schema and then immediately reverse engineered, the resulting EERM has anything to do with the original EERM. (We are not referring to layout problems, as perfidious as these can be.) This is mostly to do with assumptions taken in the conversion, and the fact that some target constructs are overloaded. The second problem is that the schema in an active database would have a large number of artefacts that aid software development and data management. For example, the introduction of lookup tables raises the data quality of some attributes’ assigned values.

In this study it is important to ask what is ‘lost in translation’, and thereby evaluate another aspect of the conversion. The previous table (i.e. 9.1) gives immediate indications.


One genre of loss comes from relationships with an order greater than two, where a relationship is converted into n binary relationships. While converting relationships there is another lossy technique, which occurs when relationship attributes are lumped with other class attributes. Frankly there is little to be done about these patterns, and in fact many types of data models suffer from this.

Another general type of loss appertains to ODMG and EyeDB ODL. For example, an EERM’s entities and weak entities are both converted into class constructs. In EyeDB matters are more pronounced, as ODMG’s ODL structures are implemented with classes too, which further fudges the distinction between weak entity and composite object. Another example is the ISA relationship implementation in ODMG ODL, which accepts none of the variety found in EERM diagrams.

There are some other issues, reported earlier, with the use of the EyeDB ODL unique construct to implement the primary key, candidate key, and uniqueness constraints. Consequently there is no clue in the EyeDB data dictionary as to a constraint’s real provenance.

At this point we assert that attenuating the conversion and translation lossiness requires ODMG ODL to acquire better and wider constructs (e.g. triggers with incremental coding). Also, conceptual design constructs read from an EERM diagram should be maintained within the object base data dictionary.

9.5 Summary

This chapter has shown how this framework can accommodate a wide variety of EERM designs created, for example, in a CASE tool. The EERM constructs are an input to the framework. The framework allows a transformation of valid EERM artefacts into constructs that DBMSs are capable of handling. For example, an n-ary relationship is converted into a resolving class instance, n binary relationships, and a set of constraints (e.g. primary and candidate keys). Another important feature is that additional artefacts are added to supplement the EERM model when it is read in; two examples are functional dependencies and candidate keys. Some issues with EERM do hinder its fuller utility and two need addressing: first, some constructs require better, wider and more accepted notation and meaning (e.g. aggregation); second, transitional constraints, not really part of traditional ERM, are a must for more involved modelling.


At this point the framework is capable of aiding database design. For example, an EERM encoded, or read from an ASCII file, is converted into an ODMG ODL schema. In this study we have tailored the output to the ODMG dialect offered by EyeDB, but it is easy to convert to other ODMG ODL syntax flavours. The conversion has a wide coverage and handles non-trivial examples, as shown with the schema on which it was tested. Furthermore, the conversion is aided by an extensive range of pre-conversion tests. Nonetheless there are some issues due to EyeDB idiosyncrasies and limitations, but an acceptable solution for meeting most of them was presented. Two issues with ODMG are its rather lame relationship construct, and the absence of triggers to implement transitional constraints. In the case of the relationships shortfall, the translation and framework offer a lot. In the case of triggers, we believe substantial progress needs to be made in their definition, maintenance, and management before automating the process of their coding.

A further functionality possible through our framework is to redraw an EERM that has been read. The graphing package is sent a specification file that includes design artefacts, and in turn it produces an image. This feature is most useful when an original EERM design is worked on in the framework, as in improving it or when integrating two designs (i.e. during a data-integration exercise).

For both mappings, apart from a rigorousness and detail that is not available in the database design literature for object-oriented databases, we presented issues of correctness and completeness. We have also shown the inadequacy of reverse engineering an ODMG ODL schema back to the original EERM model.

This work had its start in Vella [VELLA97], then based on an early version of ODMG ODL; the latest publication is [VELLA09]. Similar work in standard texts, such as El Masri [ELMAS10] and Teorey [TEORE05], is more general. Another advantage of this work is its integration with other features of the framework; query modelling is dependent on structures read from an EERM model.


Chapter 10

Type Checking in Object Database Framework

10 Type Checking in Object Database Framework

In chapter four we have asserted, through literature reviews, that data typing and data type checking help develop readable, reliable, and efficient software. The object-oriented paradigm is strongly associated with a number of sophisticated data-typing features, and most of its adopters expect to have these. We have also reported how a mix of typing modalities, specifications, and checking have been tested for the object-oriented paradigm.

In F-logic one can associate data-type signatures with objects, and there exist a number of data-typing axioms. More interestingly, the type signatures, applicable to inheritable and non-inheritable methods, are part of the inference process; e.g. data-typing rules and assertions interact with other object assertions like those of ISA and instance-of.

In Flora-2 data-type signature specifications are accepted, but no implicit type checking is available and some type-inference rules are not evaluated; furthermore some F-logic axioms are not entrenched in its semantics – an example being a method’s argument contra-variance and co-variance. As regards type checking, Flora-2 nonetheless has strong potential; we have already seen how F-logic and Flora-2 higher-order syntax mixes data and structure in query expressions. In fact the framework adopted here builds run-time type-checking procedures to implement basic data checking and inference.

In this section we start by describing a number of views that aid the type-checking process. The first type check has to do with the cardinality constraints of methods. The next section deals with data-type checking of the methods of user-defined classes (e.g. classes and structures) and their instances. The later sections discuss object polymorphism type checking and F-bounded polymorphism.

10.1 Problem Definition

What are the data typing and inference requirements of our framework? Firstly, we need to check that the data expressions of our object base are proper in terms of the data signatures provided. Furthermore, because of the nature of F-logic, we expect some form of overloading and parametric polymorphism. Within this requirement we have to include a check to ensure that all object assertions, especially instances of user-defined objects, are covered by at least one signature. Secondly, we need to accommodate recursive data types and also recursive data structures (e.g. lists of objects). Thirdly, we need the framework to work out the intricacies of data specification and typing in the presence of recursive types and subtyping realities.

10.2 Views and Flora-2 Data-Type Signatures

In our framework the most common data signature used for CLASS and STRUCTURE instances is of the type *=>; these are called inheritable methods. In the following script the instance GRAND has a method signature for FNAME. Instance GO1, related to GRAND with the instance-of relationship (i.e. :), inherits the signature as => and FNAME takes a string through the -> assignment. Methods denoted by => are called non-inheritable methods. (The same applies to object PO1, but inheriting from the CLASS instance PARENT.)

The class PARENT, related to GRAND with an ISA relationship (i.e. ::), inherits all the data signatures of the latter. Consequently any instance-of CLASS and STRUCTURE inherits its data signatures through ISA, and these remain denoted with *=>. Any CLASS method denoted by => is not inherited through the ISA relationship but is passed on to its instances – e.g. these are useful for the class attributes described in earlier chapters.

grand : class[ fname{1:1} *=> string].

go1:grand [fname -> "g1name"^^_string].

parent::grand. child::parent.

?- parent[ fname *=> string ].
Yes.

parent[ livesat{1:1} *=> string ].

?- parent[ ?M *=> ?Rt ].
?M = fname    ?Rt = string
?M = livesat  ?Rt = string
Yes.

po1:parent[fname -> "p1name"^^_string, livesat -> "p1home"^^_string].

?- po1[ ?M => ?Dt ].
?M = fname    ?Dt = string
?M = livesat  ?Dt = string
Yes.

Another feature of a Flora-2 data signature is the cardinality constraints of a method. For example, in the method FNAME above the cardinality constraints specify a lower and upper constraint of ‘1’ (one and only one) – i.e. {1:1}. Both limits take any integer, with the lower having to be less than or equal to the upper. Cardinality constraints in Flora-2 are more flexible than in F-logic, as the latter only allows singleton or set cardinality.

In F-logic one can also specify methods with arguments together with their data signature. The following script gives a simple example where a CLASS instance called CC has a method, called CMTH5, that takes one argument of type PARENT and returns an instance-of type PARENT. The effects of the instance-of and ISA relationships are identical to those for methods without arguments (as presented earlier). Methods without an argument type in their signature are called scalar methods, and those with arguments, as is CMTH5, are called arity methods.

cc:class.
cc[ cmth5(parent) *=> parent ].

cco1:cc[ cmth5(po1) -> co1 ].
cco2:cc[ cmth5(go1) -> po1 ].

F-logic allows data signatures to be overloaded too. In the following example, the method OVERLOAD has two distinct signatures specified. Methods with arguments can also be overloaded.

c1:class[ overload *=> integer, overload *=> string].

?- c1[overload *=> ?Dt].
?Dt = integer
?Dt = string
Yes.

10.2.1 Views for Classifying Methods for Data-Type Checking

The first set of views enumerates the scalar methods: methods that come without an argument in their signature. These capture either of the inheritable and non-inheritable methods for instances of CLASSES and STRUCTURES – collectively instances of UDO. The following two rules build the SCALARMETHOD and INHSCALARMETHOD properties of each user-defined instance (i.e. instances of CLASS or STRUCTURE).

?C[scalarmethod->?M] :- ?C:udo[?M =>?_ ], not compound(?M)@_prolog. ?C[inhscalarmethod->?M] :- ?C:udo[?M *=>?_ ], not compound(?M)@_prolog.

The predicate COMPOUND returns true if the term (denoted by ?M) is a composite term; in this case that would imply that the method comes with an argument. The partitioning is controlled by the difference in the data signature for inheritable (*=>) and non-inheritable (=>) methods.

The following query lists the inheritable methods of CLASS instance PARENT.

?- ?C=parent, ?C[inhscalarmethod->?L].
?C = parent  ?L = fname
?C = parent  ?L = livesat
Yes.


Each of these methods is complemented with another property wherein the methods are gathered into a list (named SCALARMETHODS and INHSCALARMETHODS). For example:

?C[inhscalarmethods->?Mlst] :-
    ?C:udo,
    ?Mlst = collectset{ ?M | ?C[?M*=>?_], not compound(?M)@_prolog }.

?- ?C=parent, ?C[inhscalarmethods->?L].
?C = parent  ?L = [fname, livesat]
Yes.

The second set of views enumerates the arity methods for instances of UDO: methods that come with an argument in their signature. These capture either of the inheritable and non-inheritable methods. The following two rules build the ARITYMETHOD and INHARITYMETHOD properties of each user-defined instance (i.e. CLASS or STRUCTURE).

?C[aritymethod->?Mth]    :- ?C:udo[?M=>?_],  compound(?M)@_prolog, ?M=?Mth(?_).
?C[inharitymethod->?Mth] :- ?C:udo[?M*=>?_], compound(?M)@_prolog, ?M=?Mth(?_).

The predicate COMPOUND must be true for the method term ?M. Also note that the method name assigned in the view is stripped of its argument data type (i.e. ?MTH rather than ?M). The partitioning is controlled by the different data signatures for inheritable (*=>) and non-inheritable (=>) methods. The following query checks whether CMTH5 is an arity and inheritable method of CLASS instance CC.

?- cc[ inharitymethod -> cmth5 ].
Yes.

Each method is complemented with another property wherein the methods are included in a list (named ARITYMETHODS and INHARITYMETHODS).

For each instance-of UDO there are collective views that gather its inheritable and non-inheritable methods; these views are called INHMETHODS and METHODS and the property is set based.

?C[methods -> ?M] :- ?C:udo[scalarmethod -> ?M]; ?C:udo[aritymethod -> ?M]. ?C[inhmethods -> ?M] :- ?C:udo[inhscalarmethod -> ?M]; ?C:udo[inharitymethod -> ?M].
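For example, with the CLASS instance CC defined earlier, one would expect the following (output format approximated):

?- cc[inhmethods -> ?M].
?M = cmth5
Yes.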

The third set of views enumerates the overloaded methods for instances of UDO: methods that come with more than one return data type signature or more than one argument data type. These capture either of the inheritable and non-inheritable methods. The following four rules build the respective properties of each user-defined instance (i.e. CLASS or STRUCTURE). The rules identify methods (i.e. ?M), through the previously defined views, that have for example a different return data type.


?C[nameoverloadedmethods->?M] :-
    ?C:udo[methods->?M], ?C[?M=>?D1], ?C[?M=>?D2], ?D1!=?D2.
?C[nameoverloadedmethods->?M] :-
    ?C:udo[aritymethod->?M], ?M=?Mth(?_Pdt),
    ?C[?Mth(?Pdt1)*=>?_], ?C[?Mth(?Pdt2)*=>?_], ?Pdt1!=?Pdt2.
?C[nonnameoverloadedmethods->?M] :-
    ?C:udo[methods->?M], not ?C[nameoverloadedmethods->?M].

?C[inhnameoverloadedmethods->?M] :-
    ?C:udo[inhmethods->?M], ?C[?M*=>?D1], ?C[?M*=>?D2], ?D1!=?D2.
?C[inhnameoverloadedmethods->?M] :-
    ?C:udo[inharitymethod->?M], ?M=?Mth(?_Pdt),
    ?C[?Mth(?Pdt1)*=>?_], ?C[?Mth(?Pdt2)*=>?_], ?Pdt1!=?Pdt2.
?C[inhnonnameoverloadedmethods->?M] :-
    ?C:udo[inhmethods->?M], not ?C[inhnameoverloadedmethods->?M].

The following query checks if method OVERLOAD of CLASS instance C1 is listed as overloaded.

?- ?C=c1, ?C[inhnameoverloadedmethods->?M].
?C = c1  ?M = overload
Yes.

The last set of views addresses a method’s argument and return data type contra-variance and co-variance. This is all the more necessary as Flora-2 does not support contra-variance and co-variance for methods. The two rules create objects whose identifiers are AS(…) and whose properties include CLASS and METHOD.

as(?C,?M,?Pdt2,?Rdt,"noninh") [class->?C, method->?M, pdt->?Pdt2, rdt->?Rdt, inhtype->"noninh"] :- ?C:udo[aritymethods->?Mlst], member(?M,?Mlst)@_prolog(basics), ?C[?M(?Pdt)=>?Rdt], (?Pdt2::?Pdt;?Pdt2=?Pdt).

as(?C,?M,?Pdt2,?Rdt,"inh") [class->?C, method->?M, pdt->?Pdt2, rdt->?Rdt, inhtype->"inh"] :- ?C:udo[inharitymethods->?Mlst], member(?M,?Mlst)@_prolog(basics), ?C[?M(?Pdt)*=>?Rdt], (?Pdt2::?Pdt;?Pdt2=?Pdt).

An object is created for each arity method of a user-defined object instance. Furthermore, if the return data type, i.e. ?RDT, is a descendant of a CLASS instance then an AS(…) object is created for each of its ancestors, with the ?RDT property replaced through the ISA inference. For example, if the return type ?RDT is CLASS PARENT then another AS(…) object is created for CLASS GRAND. In the case when the argument data type is an ancestor of other CLASS instances, an AS(…) object needs to be created for each of its descendants. For example, if an argument data type is CLASS instance PARENT then another AS(…) object needs to be instantiated for argument data type CHILD. The following query shows this view’s evaluation.


?- ?C=cc, ?M=cmth5,
   as(?C,?M,?Pdt2,?Rdt,"inh")
       [class->?C, method->?M, pdt->?Pdt2, rdt->?Rdt, inhtype->"inh"].
?C = cc  ?M = cmth5  ?Pdt2 = child   ?Rdt = grand
?C = cc  ?M = cmth5  ?Pdt2 = child   ?Rdt = parent
?C = cc  ?M = cmth5  ?Pdt2 = parent  ?Rdt = grand
?C = cc  ?M = cmth5  ?Pdt2 = parent  ?Rdt = parent
Yes.

It is important to check that any method of a defined object (i.e. a class or structure instance) with a defined signature is included in one of the views just defined. In fact the union of ?C[METHODS->?M] and ?C[INHMETHODS->?M] covers all of the properties addressed by the views. The following denial query checks whether there exists a method, of a user-defined object, that is not included in a view.

?- ?C:udo[ ?M*=>?_Dt ; ?M=>?_Dt ],
   not ( ?C[methods->?M ; inhmethods->?M]
       ; (?M=?Ms(?_P), ?C[methods->?Ms ; inhmethods->?Ms]) ).
No.

The converse denial – that is, is there a method included in a view’s list for which no signature is present in the user-defined object instances – follows:

?- ?C:udo[methods->?M ; inhmethods->?M],
   not ( ?C[ ?M*=>?_Dt ; ?M(?_P)*=>?_Dt, ?M=>?_Dt ; ?M(?_P)=>?_Dt ] ).
No.

This method-name coverage needs to be complete because a number of data typing and inference techniques and procedures use these views repeatedly.

10.3 Method Signature’s Cardinality

In earlier parts of our framework the cardinality constraints attached to properties in CLASS and STRUCTURE definitions were used extensively; in one area they were used to guide and enforce inter-object relationships. In this section we present ways to check whether an object instance’s attributes and methods respect the data-type signature and the cardinality constraints associated with it.

Flora-2 does not implicitly check and enforce these, although it does offer some methods to aid in doing so explicitly at runtime; these are found in its typecheck module. Also the manual makes two important points [YANGG08]: firstly, “it is theoretically impossible to have a query that will flag a violation of a cardinality constraint if and only if one violation exists and will terminate”; secondly, the relative Flora-2 methods “may trigger run-time errors if there are rules that use non-logical features or certain built-ins in their bodies” (an example of built-ins being arithmetic comparisons over integers). The first point was presented in our earlier literature review on data type checking.

How can we check a property’s cardinality limits? A generic procedure called %TYPE_ERROR_CHECK is invoked with two arguments: the first is the CLASS or STRUCTURE instance and the second the type of check required. The idea behind the procedure is to check, for each method definition, whether there are any object instances that have an assignment outside the cardinality limits. The following invocation checks a selection of methods applicable to class instance PERSON for cardinality issues, specifically in inheritable methods. No issues are actually reported, and the FALSE result aids in traversing all possible values.

?- %type_error_check(person,"cardinality const.s – inherit. scalar mthds"). No.

The workings of the procedure start by extracting the list of inheritable methods from the appropriate view (e.g. ?C[INHSCALARMETHODS->?M]). The cardinality checks are actually done in the recursive procedure %CC_INHSCALAR, which takes four arguments. The first and second arguments are the CLASS instance and the respective method list, while the last two are the start and return lists of objects with cardinality outside the permitted range. According to the content of the return list, the %TYPE_ERROR_CHECK invocation prints results.

%type_error_check(?C,"cardinality constraints – inherit. scalar methods"):- ?C[inhscalarmethods->?Mlst], %cc_inhscalar(?C, ?Mlst,[], ?Broken), if not ( ?Broken = [] ) then ( write('cardinality constraints - )@_prolog, write('inheritable scalar methods - broken list')@_prolog, write(?C)@_prolog, write(' - ')@_prolog, writeln(?Mlst)@_prolog, writeln(?Broken)@_prolog, false ) else ( writeln('nothing broken!?')@_prolog, false ).

%cc_inhscalar(?C,[?M|?Mlst],?Ilst,?Flst) :-
   ?Lst = collectset{?I | Cardinality[%_check(?O[?M{?L:?H}=>?_])]@_typecheck,
                          ?O:?C, ?C:udo, ?I=[?C,?O,?M,'n/a',?L,?H]},
   if (?Lst=[])
   then (%cc_inhscalar(?C,?Mlst,?Ilst,?Flst))
   else (%cc_inhscalar(?C,?Mlst,[?Lst|?Ilst],?Flst)).
%cc_inhscalar(?_C,[],?Flst,?Flst).

The recursion of procedure %CC_INHSCALAR is controlled by the second argument, specifically the method list. On each invocation the respective method name is passed to the Flora-2 type-check module to search for any instances which are outside the permitted range; the type-check method is called %_CHECK and takes a signature as input, which for inheritable methods is ?O[?M{?L:?H}=>?_]. The return values from %_CHECK are further filtered for instances of user-defined objects and are assigned to a list of tuples that is appended to the third argument of the procedure. Each tuple appended to the list has the details of the specific infringement found.

Recursion stops when no more methods are left (i.e. in the second argument) and the final list is unified with the fourth argument.

The following is a simple example of the cardinality issues that are captured. Assume we have another CLASS instance, called CHILD, which is ISA related to PARENT (given earlier in this section). Also note the CHILD instances' data assignments (i.e. CO1, CO2, and CO3).

child:class. child::parent.

child [schoolat{1:1} *=> string, kids{1:1} *=> integer ].

co1:child [fname->"c1name", livesat->"c1home", schoolat->"c1school"]. co2:child [fname->{"c2name","c2n"}, livesat->"c2h", schoolat->"c2s", kids->2]. co3:child [fname->"c3name", livesat->"c3h", schoolat->"c3s", kids->0].

On invoking the procedure on CLASS instance CHILD one gets the following report of cardinality issues. The list of CHILD inheritable methods is printed and then the lists of instances outside the acceptable values. The first list says method KIDS in instances CO1 and CO2 has its lower limit broken (the higher limit being 'ok'), and the second list says that the upper limit of method FNAME is broken in instance CO2. These are easily verified against the above instances.

?- %type_error_check(child,"cardinality constraints–inherit. scalar mthds").
cardinality constraints - inheritable scalar methods - broken list
child - [fname,kids,livesat,schoolat]
[[[child,co1,kids,n/a,1,ok], [child,co2,kids,n/a,1,ok]],
 [[child,co2,fname,n/a,ok,1]]]
No.

There are three other procedures to cater for cardinality constraints: namely for inheritable methods with an argument, and for non-inheritable methods with and without arguments. Other than the view used, there is only one main difference between them: the argument of the %_CHECK procedure invocation, which matches the methods and cardinalities of interest (a sketch of one variant follows).
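For illustration only, a minimal sketch of the inheritable arity-method variant follows; the procedure name %CC_INHARITY and the exact %_CHECK template are our assumptions, the substantive change from %CC_INHSCALAR being the signature passed to %_CHECK:

// Hypothetical sketch: as %cc_inhscalar, but matching methods with one argument.
%cc_inharity(?C,[?M|?Mlst],?Ilst,?Flst) :-
   ?Lst = collectset{?I | Cardinality[%_check(?O[?M(?_P){?L:?H}=>?_])]@_typecheck,
                          ?O:?C, ?C:udo, ?I=[?C,?O,?M,'arg',?L,?H]},
   if (?Lst=[])
   then (%cc_inharity(?C,?Mlst,?Ilst,?Flst))
   else (%cc_inharity(?C,?Mlst,[?Lst|?Ilst],?Flst)).
%cc_inharity(?_C,[],?Flst,?Flst).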

There are some technical issues to discuss. First and foremost, cardinality checking comes at a significant computational cost. Second, there is an issue with checking the lower cardinality constraint of methods with an argument: the check reports an error when there is none. Flora-2's maintainers confirmed this, and at this point there seems to be little that can be done to address it through the current %_CHECK procedure implementation (pers. comm. Kifer, M [2013]). Yet another issue is that the evaluation of %_CHECK is disrupted when recursive data assignments are found (e.g. lists of CLASS instances).

10.4 Scalar Method Data Type Checking

In F-logic method signatures and data expressions are enforced through its typing constraints. An important result is that well-typing of an F-logic program is undecidable. In this logical language type checking therefore takes on more the role of data-type coverage.

In Flora-2 type denotations are read, but types are not implicitly checked on loading and evaluation. As in the case of method cardinality, run-time checks for well-typedness are possible if the program is equipped with data-type checking procedures that are invoked at run time. Although Flora-2 offers a procedure to test type correctness, this framework opted to implement its own procedures and object structures (e.g. rules for views). The main reason for opting out is to favour flexibility, coverage, and control of the data-typing regime that we implement; for example, Flora-2 does not type check input restriction and relaxation of methods.
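To illustrate the point with a small hypothetical example (the attribute AGE and instance P9 are ours and not part of the framework's test data): the signature and fact below load into Flora-2 without complaint, and the mistyped assignment surfaces only when a run-time checking procedure is invoked.

// hypothetical example - loads silently despite the type mismatch
person[ age{1:1} *=> integer ].
p9:person[ age -> "ninety" ].   // a string where an integer is expected;
                                // flagged only by a run-time type check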

The type checking of a signature against objects is done through five high-level procedures: the first two are presented in this section, another two data-type check methods with an argument, and the last is dedicated to type checking a list of objects (i.e. a polymorphic and recursive data structure).

The high-level call to %TYPE_ERROR_CHECK is supplemented with two arguments: the first is the user-defined object to type check and the second is the string that qualifies which check to run (i.e. inheritable scalar methods). From the following script it transpires that CLASS instance CHILD has three attributes of interest, namely FNAME, LIVESAT, and SCHOOLAT, all of which are strings. For each CHILD instance and attribute the list of values assigned is listed. It transpires that all values are of the expected type.

?- ?C=child,
   %type_error_check(?C,"case – inhe. scalar methods with result poly.").
UDO / inh scalar method / data type list / result data list:
child
fname
[string]
[ _datatype(_string(c1name),_string), … _datatype(_string(c4name),_string) ]
*
UDO / inh scalar method / data type list / result data list:
child
livesat
[string]
[ _datatype(_string(c1home),_string), … _datatype(_string(c4home),_string) ]
*
UDO / inh scalar method / data type list / result data list:
child
schoolat
[string]
[ _datatype(_string(c1school),_string), … _datatype(_string(c4school),_string) ]
*
?C = child
Yes.

If the following CHILD fact is added and the same type check is invoked then a type check error is expected as attribute SCHOOLAT is assigned an integer.

co9:child [fname->"c9name"^^_string, livesat->"c9home"^^_string, schoolat->"9"^^_integer].

In fact this transpires in the following sanitised output. At this point semantic action is required to rectify the data-type error detected.

?- ?C=child,
   %type_error_check(?C,"case – inhe. scalar methods with result poly.").
…
UDO / inh scalar method / data type list / result data list:
child
schoolat
[string]
[ _datatype(_string(c1school),_string), … ,
  _datatype(_datatype(_integer,[57]),_integer) ]
*
--- data type error - inh scalar - class / scalar method / result data
child
schoolat
_datatype(_datatype(_integer,[57]),_integer)
*
?C = child
Yes.

The procedure %TYPE_ERROR_CHECK for an inheritable scalar method needs to compare two sets. The first set comprises the possible data types that the method returns (there could be more than one incomparable data type because signatures allow method overloading of the result type). The second set comprises the objects that are return values of the method being investigated (but only objects that are instances of the current class and not of any of its subclasses). The comparison is a universal quantification: there must be no element in the second set that is not an instance-of some element in the first set; otherwise we have a data-type error between the method and the data value. The only added feature is to ensure that we favourably compare objects that are instances of the current class or any of its super-classes.



The procedure starts by retrieving all inheritable and scalar attributes of the CLASS instance through the appropriate view and then calls the recursive procedure %TEC_INHSCALARPOLY with two arguments, namely the CLASS instance and the method list retrieved. %TEC_INHSCALARPOLY terminates when all methods are processed. In %TEC_INHSCALARPOLY two lists are built: one of the instances that match the method and the other of the return data types of the same method. These two lists are passed to another recursive procedure, called %TEC_INHSCALARPOLY_DT_O, to compare the two sets. The comparison is done with another recursive procedure, called %TEC_INHSCALARPOLY_DT_O_CHECK, that compares an object with the list of possible data types. The comparison is broken down into two cases: the first is for basic domains and requires mapping a basic domain into a Flora-2 domain and checking for membership; the second case caters for output relaxation of the instance-of. These cases are encoded in procedure %TEC_OUTPUTRELAX. The respective procedures follow:

%type_error_check(?C,"case - inher scalar methods with result poly.") :- ?C[inhscalarmethods->?Mlst], %tec_inhscalarpoly(?C,?Mlst).

%tec_inhscalarpoly(?C,[?M|?Mlst]) :-
   ?Rdtlst  = collectset{?Rdt | ?C[?M*=>?Rdt], not compound(?M)@_prolog},
   ?Ordalst = collectset{?Ro  | ?O:?C[?M->?Ro], not compound(?M)@_prolog,
                                if (?Sc::?C) then (not ?O:?Sc[?M->?Ro]) },
   writeln('UDO / inh scalar method / data type lst / result data lst: '),
   writeln(?C)@_prolog, writeln(?M)@_prolog,
   writeln(?Rdtlst)@_prolog, writeln(?Ordalst)@_prolog,
   writeln('*')@_prolog,
   %tec_inhscalarpoly_dt_o(?C,?M,?Rdtlst,?Ordalst),
   %tec_inhscalarpoly(?C,?Mlst).
%tec_inhscalarpoly(?_,[]).

%tec_inhscalarpoly_dt_o(?C,?M,?Rdtlst,[?Orda|?Ordalst]) :-
   %tec_inhscalarpoly_dt_o_check(?C,?M,?Rdtlst,?Orda),
   %tec_inhscalarpoly_dt_o(?C,?M,?Rdtlst,?Ordalst).
%tec_inhscalarpoly_dt_o(?_,?_,?_,[]).

%tec_inhscalarpoly_dt_o_check(?C,?M,[?Rdt|?Rdtlst],?Od) :-
   if ( %tec_outputrelax(?Od,?Rdt) )
   then ( true )
   else ( %tec_inhscalarpoly_dt_o_check(?C,?M,?Rdtlst,?Od) ).

%tec_inhscalarpoly_dt_o_check(?C,?M,[],?Od) :-
   ?Odte = collectset{?O | ?O:?C[?M->?Od], ?Sc::?C, not ?O:?Sc[?M->?Od] },
   w('--- data type err. - inh scalar - class/scalar mth/result data'),
   writeln(?C)@_prolog, writeln(?Odte)@_prolog,
   writeln(?M)@_prolog, writeln(?Od)@_prolog,
   writeln('*')@_prolog.

%tec_outputrelax(?O,?U) :-
   ?U:udo, ?O:?U,
   if ( ?_Sc:udo, ?_Sc::?U, ?O:?_Sc ) then (false) else (!).
%tec_outputrelax(?O,?U) :-
   not ?U:udo, dt(?U,?Rdt1), ?O:?Rdt1, !.



As an example with output relaxation polymorphism for the result data type, consider the following addition to the CLASS instance CHILD. The attribute FAVINLAW is denoted by a PARENT data type. The following objects have been assigned to the CHILD class and a type check invoked on it. The following scripts show the execution trail of the type check; specifically note the type check complaining about instance CO1 where an instance-of PERSON or GRAND is expected.

child [favinlaw{1:1}*=>parent ].

co1:child [favinlaw->go1].
co2:child [favinlaw->po1].
co3:child [favinlaw->po2].
co4:child [favinlaw->co1].

?- ?C=child,
   %type_error_check(?C,"case – inhe. scalar methods with result poly.").
UDO / inh scalar method / data type list / result data list:
child
favinlaw
[grand,parent]
[co1,go1,po1,po2]
*
--- data type error - inh scalar - class / scalar method / result data
child
[]
favinlaw
co1
*
…
?C = child
Yes.

To data-type check scalar and non-inheritable methods a very similar procedure to the above exists; an illustrative sketch follows.
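The sketch below is a plausible rendering of that non-inheritable variant, given for illustration only; the view name SCALARMETHODS and the procedure name %TEC_SCALARPOLY are our assumptions, the substantive change being the use of => signatures in place of *=>:

// Hypothetical sketch of the non-inheritable counterpart of %tec_inhscalarpoly.
%type_error_check(?C,"case - non-inher scalar methods with result poly.") :-
   ?C[scalarmethods->?Mlst],           // assumed non-inheritable methods view
   %tec_scalarpoly(?C,?Mlst).

%tec_scalarpoly(?C,[?M|?Mlst]) :-
   ?Rdtlst  = collectset{?Rdt | ?C[?M=>?Rdt], not compound(?M)@_prolog},
   ?Ordalst = collectset{?Ro  | ?O:?C[?M->?Ro], not compound(?M)@_prolog,
                                if (?Sc::?C) then (not ?O:?Sc[?M->?Ro])},
   %tec_inhscalarpoly_dt_o(?C,?M,?Rdtlst,?Ordalst),   // comparison re-used
   %tec_scalarpoly(?C,?Mlst).
%tec_scalarpoly(?_,[]).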

10.5 Arity Method Data Type Checking

In this section we deal with methods that return data and also take an argument (without much loss of generality we limit the number of arguments to one). The procedure to type check such a method is very similar to the one adopted for scalar methods. A basic difference is that we need to type check the argument too, rather than just the return type. Also, the argument data type expected is substitutable by any subclass of the argument, i.e. input restriction.

The %TYPE_ERROR_CHECK procedure takes two arguments: the first is the class instance and the second is a string that identifies the check required. Once invoked the procedure retrieves a list of inheritable arity methods and calls the recursive procedure %TEC_INHARITYPOLY with this list of methods. In this procedure two lists are generated: the first is all possible type signatures that satisfy the current method, considering argument data-type restriction and return data-type relaxation (note that this is available to us in view AS(…)); the second is a list of tuples (where each tuple holds the argument and return value for the current method definition). These two lists are passed to another recursive procedure, called %TEC_INHARITYPOLY_DT_O, to run a universal quantification between the two sets; namely, there is no data tuple that is not an instance-of an argument and return data-type tuple. If that is not the case we have established a type error and it is reported. To determine whether two tuples from the respective lists equate there are two procedures that cater for this: the first, called %TEC_INHARITYPOLY_DT_O_CHECK, scans each data-item pair over every data-type pair, and the second does the actual comparison for each part of each pair.

%type_error_check(?C,"case–inher. mths with parameter and result poly") :- ?C[inharitymethods->?Mlst], %tec_inharitypoly(?C,?Mlst).

%tec_inharitypoly(?C,[?M|?Mlst]) :-
   ?Rpdtlst  = collectset{?I1 | as(?C,?M,?Pdt1,?Rdt1,"inh")[class->?C],
                                ?I1=[?Pdt1,?Rdt1]},
   ?Orpdalst = collectset{?I2 | ?O:?C[?M(?Po)->?Ro],
                                if (?Sc::?C) then (not ?O:?Sc[?M(?Po)->?Ro]),
                                ?I2=[?Po,?Ro]},
   w('class / inher arity method / result & para. '),
   w('data type list / result and parameter data list '),
   writeln(?C)@_prolog, writeln(?M)@_prolog,
   writeln(?Rpdtlst)@_prolog, writeln(?Orpdalst)@_prolog,
   writeln('*')@_prolog,
   %tec_inharitypoly_dt_o(?C,?M,?Rpdtlst,?Orpdalst),
   %tec_inharitypoly(?C,?Mlst).
%tec_inharitypoly(?_,[]).

%tec_inharitypoly_dt_o(?C,?M,?Rpdtlst,[?Orpda|?Orpdalst]) :-
   %tec_inharitypoly_dt_o_check(?C,?M,?Rpdtlst,?Orpda),
   %tec_inharitypoly_dt_o(?C,?M,?Rpdtlst,?Orpdalst).
%tec_inharitypoly_dt_o(?_,?_,?_,[]).

%tec_inharitypoly_dt_o_check(?C,?M,[?Rpdt|?Rpdtlst],[?Op,?Od]) :-
   ?Rpdt=[?Pdt,?Rdt], dt(?Pdt,?Pdt1), dt(?Rdt,?Rdt1),
   if ( %tec_instanceofonly(?Op,?Pdt1), %tec_instanceofonly(?Od,?Rdt1) )
   then ( true )
   else ( %tec_inharitypoly_dt_o_check(?C,?M,?Rpdtlst,[?Op,?Od]) ).

%tec_inharitypoly_dt_o_check(?C,?M,[],[?Op,?Od]) :-
   ?Odte = collectset{?O | ?O:?C[?M(?Op)->?Od], ?Sc::?C,
                           not ?O:?Sc[?M(?Op)->?Od] },
   w('--- data type error - inher poly arity '),
   w(' - class / method / parameter data / result data'),
   writeln(?C)@_prolog, writeln(?Odte)@_prolog,
   writeln(?M)@_prolog, writeln(?Op)@_prolog,
   writeln(?Od)@_prolog, writeln('*')@_prolog.

%tec_instanceofonly(?I,?C) :-
   if ( ?C:udo )
   then ( ?I:?C, if ( ?Sc::?C, ?I:?Sc ) then ( false ) )
   else ( ?I:?C ).

As an example consider a CLASS instance called CC with one inheritable arity method, called CMTH5, defined. A number of instances of CLASS CC are defined too. On running the type check on CC it transpires that two type errors have been detected and therefore need addressing.

cc:class[ cmth5(parent){1:1}*=>parent ].
cco1:cc[ cmth5(po1)->co1 ].
cco2:cc[ cmth5(go1)->po1 ].
cco3:cc[ cmth5(po2)->po3 ].
cco4:cc[ cmth5(co1)->go1 ].

?- ?C=cc,
   %type_error_check(?C,"case - inher mths with param. and result poly.").
…
class / inher arity method / result & parameter data type list /
result and parameter data list
cc
cmth5
[[child,grand],[child,parent],[parent,grand],[parent,parent]]
[[co1,go1],[go1,po1],[po1,co1],[po2,po3]]
*
--- data type error - inher poly arity - class / method / parameter data / result data
cc
[]
cmth5
go1
po1
*
--- data type error - inher poly arity - class / method / parameter data / result data
cc
[]
cmth5
po1
co1
*
?C = cc
Yes.

A similar procedure for arity methods, but non-inheritable, is also available; the basic difference is the view that is used to generate the methods to type check.

10.6 Recursive Data Type Checking

The list data structure has been used extensively in our framework. In F-logic lists are also used to create parameterised families of classes by bounded polymorphism; Flora-2 supports these too. What follows are the general definitions required to integrate lists into our framework by having the possibility of creating a list of objects as a CLASS instance. (Most of the signature definitions are adopted from the F-logic paper [KIFER95] and the Flora-2 manual [YANGG08].) A logical program that adopts the following definitions has to ensure that evaluation does not enter into infiniteness, i.e. an evaluation that never terminates. Clearly this is a concern here too, as we want type checking to terminate.

The following rules help structure lists. If ?H represents an instance-of an object (?C) and there exists a list of objects of ?C, then their concatenation is also a list of objects ?C. The null list, i.e. [], is by definition a list of objects ?C. Also, if two CLASS instances are related through the ISA relationship then so are the lists of the respective CLASSES.

[]:list(?C).
[?H|?Rlst]:list(?C) :- ?H:?C, ?Rlst:list(?C).
list(?T)::list(?S) :- ?T:udo, ?S:udo, ?T::?S.
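For instance, assuming the instance-of assertions PO1:PARENT and PO2:PARENT hold (as in the examples below), the rules above suffice to derive list membership through ground queries:

?- [po1,po2]:list(parent).   // derived by the second rule above
Yes.
?- []:list(parent).          // the null list is a list of any class
Yes.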

The following are lists of objects that are instances of CLASS. For example a list of integers is an instance-of a CLASS (e.g. LIST(INTEGER)).

list(integer):class. list(integer)[ on{1:1} *=> boolean ].

[1,2,3]:list(integer).  [1,2,3][on->true].
[4,5]:list(integer).    [4,5][on->true].

Examples of lists of the user-defined object PARENT follow. Each list instance has in turn its instances (e.g. [PO1, PO2, PO3] is an instance-of LIST(PARENT)). Note the logical identifier of a list is its content; i.e. the identifier of the first list is [PO1].

list(parent):class. list(parent)[on{1:1}*=>boolean].

[po1]:list(parent).          [po1][on->true].
[po2]:list(parent).          [po2][on->true].
[po1,po2,po3]:list(parent).  [po1,po2,po3][on->true].
[po1,go1]:list(parent).      [po1,go1][on->true].
[po1,co1]:list(parent).      [po1,co1][on->true].

The following is an example of a list of lists of the CLASS instance PARENT.

list(list(parent)):class. list(list(parent))[on{1:1}*=>boolean].

[[po1]]:list(list(parent)).            [[po1]][on->true].
[[po1],[po1,co1]]:list(list(parent)).  [[po1],[po1,co1]][on->true].
[[po1],[po1,go1]]:list(list(parent)).  [[po1],[po1,go1]][on->true].

Since the query ?I:LIST(PARENT) is not computable due to infiniteness, one can try to stop evaluation from entering an infinite loop. There are a number of techniques: first, one can control the engine that runs Flora-2 inference to stop on such evaluation patterns (i.e. XSB Prolog evaluation is tweaked); secondly, one can ensure that each query avoids this evaluation branch by instantiating a list instance to a variable with a standard expression that does not trigger the spiralling behaviour (e.g. using ?I[ON->TRUE], where this ground molecule has to be found in each list instance). For the second case the query becomes ?I[ON->TRUE], ?I:LIST(PARENT). At this point of the research the latter is preferred. In the near future this is to be revisited, as dedicated control of XSB evaluation for Flora-2 programs has been incorporated in a 2012 concurrent versioning system update of the XSB Prolog distribution. (The XSB and Flora-2 CVS repositories are on SourceForge – { FLORA.SOURCEFORGE.NET }.)
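The guard idiom can be packaged into a small helper; the following is our illustrative sketch only and is not part of the framework:

// Hypothetical helper: enumerate only asserted list instances, using the
// ground on->true molecule to guard against the infinite derivation.
%assertedlists(?C,?Lsts) :-
   ?Lsts = collectset{?I | ?I[on->true], ?I:list(?C)}.

// e.g. ?- %assertedlists(parent,?Lsts).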

One can attach a signature and logic to the polymorphic LIST(?) object, and through the instance-of relationship both signature and logic are inherited. Basic methods for lists include FIRST, REST, and APPEND. These and other definitions follow:

list(?T)[ first*=>?T,
          rest*=>list(?T),
          append(list(?T))*=>list(?T),
          lteinl(list(?T))*=>boolean,
          mininl(list(?T))*=>list(?T) ].

?Lst[first->?H, rest->?R] :-            // head and tail of list
   ?Lst:list(?_T), ?Lst=[?H|?R].

?Lst[append(?Taillst)->?Newlst] :-      // append a list to a list
   ?Taillst:list(?T), ?Lst:list(?T),
   ?Lst[_append(?Taillst)->?Newlst]@_basetype.

?Lst[lteinl(?Plst)->?BFlag] :-          // less than in length of list
   ?Lst:list(?T), ?Plst:list(?T),
   ?Lst[_length -> ?Slst]@_basetype, ?Plst[_length -> ?Splst]@_basetype,
   if ( ?Splst > ?Slst ) then ( ?BFlag=true ) else ( ?BFlag=false ).

?Lst[mininl(?Plst)->?Result] :-         // which is the minimal (shortest) list
   ?Lst:list(?T), ?Plst:list(?T),
   if ( ?Lst[lteinl(?Plst)->true] )
   then ( ?Result=?Lst ) else ( ?Result=?Plst ).

The following is an example of using the polymorphic methods (e.g. FIRST and MININL) against lists of PARENT.

?- ?I[on->true], ?I:list(parent), ?I[first->?F].
?I = [co1, co2]  ?F = co1
?I = [po1]       ?F = po1
...
Yes.

?- ?I1[on->true], ?I1:list(parent), ?I2[on->true], ?I2:list(parent),
   ?I1[ mininl(?I2)->?Minlst ].
?I1 = [co1, co2]  ?I2 = [co1, co2]  ?Minlst = [co1, co2]
?I1 = [co1, co2]  ?I2 = [po1]       ?Minlst = [po1]
...
Yes.



In our type checking of scalar and arity methods the basic check was that an object is an instance-of one of the types indicated. Consequently lists, if properly asserted through an instance-of along the lines just presented, can be type checked. But there lies a caveat: the assertion that an object is an instance-of a LIST(?C) object does not imply in Flora-2, together with our LIST(?C) signatures, that all of its constituents are indeed instances-of ?C. To type check an expression which includes a list therefore requires not only the checks mentioned earlier but also a check on the content of the specific LIST(?C) instance. For this purpose another type-error check procedure is introduced. Lists being recursive structures in definition and content forces such a procedure to cope with lists of lists recursively. The following script shows an invocation on the lists presented earlier on LIST(PARENT) and LIST(LIST(PARENT)). Note the type check's flagging of infringements.

?- ?L=list(parent),
   %type_error_check(?L,"case - list(?T) instances are of type ?T (or deep extent): ").
case - list(?T) instances are of type ?T (or deep extent):
flapply(list,parent)
[[co1,co2],[po1],[po1,co1],[po1,go1],[po1,po2,po3],[po2]]
tec_listof true
--- data type error - case - list(?T) instances are of type ?T (or deep extent):
flapply(list,parent)
parent
[po1,go1]
go1
*
?L = list(parent)
Yes.

?- %type_error_check(list(list(parent)),"case - list(?T) instances are of type ?T (or deep extent): ").
case - list(?T) instances are of type ?T (or deep extent):
tec_listof false
flapply(list,parent)
[[[po1]],[[po1],[po1,co1]],[[po1],[po1,go1]]]
--- tec_listof_recurse >>>>
…
--- data type error - case - list(?T) instances are of type ?T (or deep extent):
flapply(list,parent)
parent
[po1,go1]
go1
*
Yes.

The type-error check for testing the content of any recursive list is similar to those presented earlier, except for the possibility that a list's elements are themselves lists (possibly of greater depth). The procedure invocation requires two arguments: the first is a list-based structure (?L) and the second the customary string to identify the task. The first task of the procedure is to check that the first argument is indeed a list; if so, it builds a list of objects that are instance-of the list structure (i.e. ?L) and then passes ?L and the list of its instances to procedure %TEC_LISTOF. This procedure has to check whether ?L, when unified with LIST(?DT), is defined through another list. If that is not the case and ?DT is either a basic domain or an instance-of CLASS (but not of a LIST(?C) nature), then procedure %TEC_LISTOF_DT is called with three arguments: ?L, ?DT, and the LIST(?) instances. A universal quantification query checks that each LIST(?) instance element is an instance-of ?DT. The remaining procedures are similar to those of the previous type-error checks from this point on.

If in the initial check in %TEC_LISTOF ?L unifies with LIST(LIST(?_DDT)), then basically we need to call another recursive branch of %TEC_LISTOF through procedure %TEC_LISTOF_RECURSE. The latter has to be recursive to ensure that the depth of lists within a list is catered for.

%type_error_check(?L,"case - list(?T) inst are of type ?T (or deep extent): ") :-
   w('case - list(?T) instances are of type ?T (or deep extent): '),
   if ( ?L = list(?_Dt) )
   then ( ?Ilst = collectset{?I | ?I[on->true], ?I:?L, ?L:udo},
          %tec_listof(?L,?Ilst) )
   else ( w('--- data type error - arg. provided is not a list(?T) type!'),
          writeln(?L)@_prolog, fail ).

%tec_listof(?L,?Ilst) :-
   if ( ?L = list(?Dt), (?Dt:udo; ?Dt=integer; ?Dt=string),
        not ?Dt=list(?_Ddt) )
   then ( writeln(?L)@_prolog, writeln(?Ilst)@_prolog,
          %tec_listof_dt(?L,?Dt,?Ilst) )
   else ( if ( ?L = list(?Dt2), ?Dt2=list(?_Ddt) )
          then ( writeln(?Dt2)@_prolog, writeln(?Ilst)@_prolog,
                 %tec_listof_recurse(?Dt2,?Ilst) )
          else ( w('--- data type error '),
                 w('- arg. provided is not a list(?T)'),
                 writeln(?L)@_prolog, fail ) ).

%tec_listof_recurse(?Dt,[?Hl|?RIlst]) :-
   writeln('--- tec_listof_recurse >>>>')@_prolog,
   writeln(?Dt)@_prolog, writeln(?Hl)@_prolog,
   %tec_listof(?Dt,?Hl),
   %tec_listof_recurse(?Dt,?RIlst).
%tec_listof_recurse(?_Dt,[]) :- !.

%tec_listof_dt(?L,?Dt,[?Lst|?Ilst]) :-
   %tec_listdt_dt_o(?L,?Dt,?Lst,?Lst),
   %tec_listof_dt(?L,?Dt,?Ilst).
%tec_listof_dt(?_L,?_Dt,[]) :- !.

%tec_listdt_dt_o(?L,?Dt,?OLst,[?H|?Lst]) :-
   if ( %tec_listoutputrelax(?H,?Dt) )
   then ( %tec_listdt_dt_o(?L,?Dt,?OLst,?Lst) )
   else ( w('--- data type error - case - list(?T)'),
          writeln(?L)@_prolog, writeln(?Dt)@_prolog,
          writeln(?OLst)@_prolog, writeln(?H)@_prolog,
          writeln('*')@_prolog ).
%tec_listdt_dt_o(?_L,?_Dt,?_Lst,[]) :- !.

10.7 F-Bounded Polymorphism

In the preceding sections we have seen how objects having a data-type signature and ISA relationship assertions are type checked at run time, and how the checking can identify a good range of data-typing errors. The type checking dealt with attributes' data types, arity methods with variant and contra-variant data types, and recursive structures (i.e. lists).

In our literature review it was stated that a subset of universally quantified polymorphism, i.e. bounded quantification, does not type check properly in the presence of recursive types. In the same review a solution was indicated; it is called F-bounded quantification.

To check whether any user-defined object instance has a type signature with a self-reference one uses the following query. The pattern looks for any UDO instances that have a signature component which is the same UDO instance; the query is restricted to inheritable methods only.

?- ?Clst = collectset{?C | ?C:udo, (?C[?M*=>?C]; ?C[?M(?C)*=>?Rt])}.
?Clst = [po, list(integer), list(parent), list(pt), list(list(parent))]
…
Yes.

If one wants to identify the recursive methods of classes (point (PT) and partial order (PO)) then the following queries oblige:

?- ?C=list(pt), ?C[?M*=>?C; ?M(?C)*=>?Rt; ?M(?Pt)*=>?C].
…
?C = list(pt)  ?M = append  ?Rt = ?_h5642   ?Pt = list(pt)
?C = list(pt)  ?M = append  ?Rt = list(pt)  ?Pt = ?_h5594
?C = list(pt)  ?M = rest    ?Rt = ?_h5499   ?Pt = ?_h5504
…
Yes.

?- ?C=po, ?C[?M*=>?C; ?M(?C)*=>?Rt; ?M(?Pt)*=>?C].
?C = po  ?M = leqinl   ?Rt = Boolean  ?Pt = ?_h5165
?C = po  ?M = min      ?Rt = ?_h5204  ?Pt = po
?C = po  ?M = min      ?Rt = po       ?Pt = ?_h5187
?C = po  ?M = min(po)  ?Rt = ?_h5138  ?Pt = ?_h5143
Yes.

If CLASS instance partial order (PO) has a sub-class called number (NB), then we expect the latter's LEQINL method to have LEQINL(NB)*=>BOOLEAN rather than LEQINL(PO)*=>BOOLEAN, to fit the contra-variance between these functional methods. To get this pattern the CLASS instances PO and NB need to be re-cast as follows. First create a parametric class instance with a data signature (i.e. FB_PO(?T)), then create a parametric class instance with a data type instantiating the ?T variable for each type required (e.g. FB_PO(PO)). When the substituted types are in an ISA relationship this is reflected in the data signature of each respective parametric class instance. Consequently, is the method LEQINL in the right subtype order?

fb_po(?T)[leqinl(?T)*=>Boolean].

fb_po(po):class. fb_po(nb):class.

po:class. nb:class.

nb::po.

?- fb_po(?T)[?M*=>?Rt].
?T = ?_h2514  ?M = leqinl(?_h2514)  ?Rt = Boolean
Yes.

?- fb_po(po)[?M*=>?Rt].
?M = leqinl(po)  ?Rt = boolean
Yes.

?- fb_po(nb)[?M*=>?Rt].
?M = leqinl(nb)  ?Rt = boolean
Yes.

At this point, where the designer needs to revise the class structure (and possibly migrate objects from one CLASS instance to another), the following query helps to establish which artefacts depend on a “deprecated” class:

?- ?Deprecated=list(pt),
   ?C[?M*=>?Deprecated; ?M(?Deprecated)*=>?_Dt].
…
?Deprecated = list(pt)  ?C = fb_po(list(pt))  ?M = leqinl
Yes.

10.8 Data Type Checking & Inference in our Framework

In the previous sections we have been type checking the state of our object base. Before that we presented a number of integrity-constraint specifications and enforcement procedures that maintain a consistent object base. Nonetheless there are still possibilities that type errors are present. There are three main sources: first, our type system is neither complete nor entirely adequate; second, the framework procedures themselves can develop type errors; and thirdly, there may be rules in the object base that have not fired because they are not satisfied by the current object state.

Another important activity undertaken here is the building of new data types through type inference, and the identification and application of fixes to known data types (e.g. the F-bounded fix in recursive data types). This has to be done post translation, as EERM diagrams do not indicate constructs that map to parametric classes.

Nonetheless, to attenuate the problems with type checking, it must be noted that the rules and procedures make extensive use of meta-programming, which in turn type-restricts the firing of rules. Despite its expense, run-time type checking and data-type inference is a useful technique for developing software even if strong typing is not adopted.

10.9 Summary

F-logic allows its objects to be associated with a variety of data-type signatures, and its evaluation is affected by type inference and its relationships (e.g. ISA and instance-of). But we have seen how Flora-2 does not implement this type checking and inference in its evaluation.

In the context of deductive systems we develop type checking that caters for methods that are static, have arity, and are polymorphic (e.g. overloaded and bounded), and for recursive structures (e.g. lists). All type checking and type inference is done through coding in Flora-2.

Flora-2 procedures type check a user-defined object instance's attributes, methods with one argument, and recursive data types (e.g. lists). These checks are based on making sure an object is covered by a type signature, and are done at run time. Any type errors found are listed and explained. Other than the specific type-checking procedures, much of the code presented in earlier sections makes extensive use of signature expressions when executing rules and procedures; i.e. it is less prone to data-type errors because objects are filtered. This also makes writing software somewhat more disciplined.

The type checking also covers property overloading, parametric classes, and bounded polymorphism. The framework also identifies signatures that require the F-bounded polymorphism transformation and suggests possible solutions. Also, views are available to enumerate other object signatures that depend on a class needing this transformation.

The type system implemented is not complete: firstly, the scope of the data-type checking is instances of classes and structures; secondly, only single-argument arity methods are checked. This regime, as expected, is computationally expensive. Also some expressions can easily take an evaluation into an infinite process, e.g. queries related to lists.



Chapter 11

Object-Oriented Query Model

(I)

11 Object-Oriented Query Model (I)

This chapter and the next introduce query modelling over our object database and its supporting framework. Query modelling is the design of language constructs that specify retrieval over an object base. A query specification includes a range, properties that any output must satisfy, and the structure of its output. It then needs to be processed and results made available.

Query modelling aspects presented in this chapter and the next include: instantiation of queries and their results; introduction of an object algebra as a query language to manipulate an object base; working out the data type of a query result; and a logic programming implementation of the algebra’s semantics.

Our object algebra has the following characteristics: firstly, it is totally procedural, in contrast to the declarative constructs presented so far; secondly, its inputs are objects, including schema objects like data-type signatures, and its outputs are also objects (i.e. with logical identifier, value, and data-type signature); thirdly, it is value and identifier based; fourthly, its semantics are given in Flora-2 with important parts in declarative constructs. Another interesting characteristic is that the operators, unlike in other algebras proposed, do not overlap much in their functionality.

The central theme here is our object algebra and its close coupling with data-modelling constructs. What justification exists for our algebra when a declarative system is present and workable? Firstly, database algebras are procedural and are optimisable. Secondly, with each algebraic query one can determine the correctness of the query and associate each query with a subset of declarative queries. Thirdly, if the size of the data is greater than available RAM then procedural techniques based on storage access are required.

Furthermore, an algebra aids our understanding of complex queries, and is an excellent teaching tool in itself.

There are a good number of algebraic operators and many have their origin in the relational algebra. These include ‘union’, ‘difference’, ‘product’, ‘project’, and ‘select’; their implementation and integration in our data model necessitate a wider semantics due to the data model's rich constructs. Other operators include ‘nest’, ‘unnest’, ‘rename’, ‘aggregate’, and ‘map’ (described in the next chapter). Our algebra does have a basis in other proposals, see section 6.2 in chapter six, but has a number of additions and characteristics which make it a distinct effort.

11.1 Basic Query Modelling

The traditional scope of query modelling is the retrieval, update, creation, and purging of objects from a database. Here we focus on the retrieval from an object base; nonetheless retrieval is part of other aspects of query modelling. For example it is a common operation to retrieve a set of objects and then purge these.

Retrieval of objects in an object-oriented data model is based both on navigation (i.e. logical identifier traversal) and on matching the values that make up an object's value. Also, the retrieval specifications have a high-level nature. Another important characteristic is that the query constructs cover a good portion of known query expression types; for a start, a query model should express all the queries of Codd's relational language – technically called Codd completeness.

Another of Codd's influences on query modelling is having both a declarative and a procedural language. Examples of declarative queries, found in this study, include F-logic and ODMG OQL queries. Ideally these languages should firstly be equipollent and, secondly, there should be a straightforward conversion from the declarative to the procedural language. Once a procedural construct is at hand one has a basis for query optimisation.

A query language is closed when a query's output can be the input of another query. The relational model is a closed query language; open query languages do not allow the output of the first query to be the input of a second. There are practical advantages to closed languages: firstly, one can write a cascade of queries; secondly, one can build dependences in composite query evaluation. There is another issue concerning the output of a query: what data-type signature should it take up? There are some possibilities: the output could be a set of identifiers; or a relation (i.e. like the relational model); or a new object structure built from the input objects and the type of query operator. The output could also be a homogeneous or heterogeneous collection.

If the queries themselves and the objects created from processing a query are objects in the framework too, then another level of object homogeneity has been achieved. Furthermore, a query language should integrate with object-oriented themes such as instance-of, integrity constraints, and data-type signatures. When integrating with an object-oriented paradigm a difficult issue is dealing with method invocations that change the state of an object during the processing of a query construct.

11.2 Object Algebra and an Object Framework (I)

This section presents an algebra we have purposely developed for our framework. We want this algebra to be the target of a subset of OQL queries, and also to be able to query process and optimise the algebraic expressions. The set of operators is collectively unique and tightly coupled with the underlying data model. An important consideration is that these algebraic operators are implemented with declarative constructs to describe their semantics. Another unique feature of the operators here is their capability to carry integrity constraints from the operator's ranges to the query output, where applicable.

We start with a short requirements specification for an object algebra to implement a query model; the algebra thus complements the framework and the object-database data modelling found in the previous chapters.

The algebraic operators are to handle objects and the object-oriented themes found in the framework as much as possible. They have to allow both value-based and logical-identifier based access methods. A number of the operators have a relational data model origin, some build object relationships, and some unpack an object into its composite parts. In terms of ranges (of objects) for the algebraic operands, these can be classes, structures, and query instances.

In terms of expressibility the algebra is to be at least Codd complete and offer a wide overlap with ODMG OQL. Furthermore, the object algebra is closed, in the sense that its output creates objects (with identity, instance-of, and composite structure).



In this exercise method invocations that change the state of an object during the processing of a query construct are not acceptable. Since we are dealing with retrieval it is assumed that such methods are not within the scope of the study. Also, the processing of a cascade of queries assumes that the underlying object base remains unchanged.

The algebra too requires its development through a logic programming language, i.e. Flora-2. The object algebra is to have run-time and interactive capabilities in aspects of query specification, query processing, and data-type inference of the results.

11.2.1 Our Algebra

The algebra comes with ten operators: union, difference, project, product, rename, select, map, aggregate, nest, and unnest. The operators work on an object-oriented collection.

They have been described as value-based because the operands are cognisant of an object's state value, class instantiation, and logical identifier. The operators have a common template for implementation: in general it is pre-condition satisfaction, instantiation of the query instance and its details, instantiation of the query instance's data-type signature, implementation of the semantics, instantiation of the result as an instance-of the query at hand, and post-check satisfaction (e.g. duplicate elimination); a skeleton of this template is sketched below. The framework supports the algebra with an array of methods that are shared across the operators' implementations, and objects associated with each algebraic operator (e.g. the object identified by WVF for the map operator).
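The template can be pictured with the following illustrative skeleton; %OP, OPF, OP(…) and the helper names are placeholders of ours, with %UV of section 11.2.3.2 being a concrete instance:

// Illustrative skeleton of the common operator template (names hypothetical).
algebra[%op(?C1,?C2,?Param)] :-
   %precondition(?C1,?C2,?Mode),                      // e.g. union compatibility
   opf[algqueryinstance(?C1,?C2,?Param,?Mode)->?R1],  // query instance + details
   insertrule { ?R1 },
   opf[algquerydt(?C1,?C2,?Param,?Mode)->?R2],        // query data-type signature
   insertrule { ?R2 },
   %opdataloid(?C1,?C2,?Param),                       // result identifiers
   %opdata(?C1,?C2,?Param),                           // result instances/values
   %postcheck(op(?C1,?C2,?Param)).                    // e.g. duplicate elimination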

It has to be noted that the algebraic operators are tightly coupled with the framework, the data-model encoding, and a schema definition. For example, the range of an operand is either the deep extent of a class instance or limited to its immediate extent. As another example, the data-type checking and inference follows that which was done in an earlier chapter. An important observation is that the operators are “shallow” in that they consider the top-most structure of a range; the exceptions are the select and map operators.

The algebraic objects make extensive use of the framework as each query expression has an instance which is adorned by the query’s details.

The input of an algebraic expression, as regards its range, is technically any extent of schema artefacts. In our framework the most evident ranges are instances of CLASS, STRUCTURE, and ALGQUERY. In chapter ten we have seen how parameterised classes of lists were instances of CLASS or STRUCTURE; consequently a collection of lists is a range to these queries (albeit this requires more control in our coding when enumerating a query's range extent). The operators only work on the properties of ranges that have a data-type signature; therefore method overloading is catered for. The map operator requires a “side” input of the specification of the function to apply; the end user has the possibility of supplying it at run time. No operator is meant to invoke methods that change the input range while the query is being evaluated.

The output of an algebraic operand is a set of objects whose state structure is a tuple and which is covered by data-type signatures. The tuple's data-type signatures are derived from the operand's input ranges. Also, the query output includes an assertion that the query is an instance of the ALGQUERY object, and an assertion that the query results are instances of the query instance. As the outputs of an expression are objects, and part of the framework, our algebra has the property of closure.

11.2.2 Object Algebra Constructs and their Realisation in the Framework

A dedicated object, identified as ALGEBRA, holds a number of procedures related to our object algebra; for example each operator has a dedicated procedure attached to it. Also, for each algebraic operator an object is instantiated that contains the reified rules required by that operator. These reified rules cover query instantiation, the query's data-type signature, generation of logical identifiers for a query's instances, and materialising the data for the query's instances. For example the procedure for the operator union by value, %UV, is specified in ALGEBRA[%UV(?C1,?C2,?PARAM)] and takes three arguments – denoted by variables ?C1, ?C2, and ?PARAM. The reified rules associated with it are found in object

UVF with template UVF[ MTH(…) -> ${ … } ].

Each algebraic query execution generates its objects (i.e. those that satisfy it). At a meta level each query is made an instance-of ALGQUERY, much like classes were instances of CLASS and STRUCTURE in chapter eight. Each query needs an identifier whose composition follows the template OPERATOR([ARG]+). For example the assertion UV(CLASS1,CLASS2,"DEEPEXTENT"):ALGQUERY states that we have an algebraic query based on union by value between classes 1 and 2; the third argument is a parameter.

Instances-of UV(…), i.e. the query's results, are identified by the following assertion:

iuv(class1,class2,"deepextent",objectN,1):uv(class1,class2,"deepextent").

Also each UV(…) has a data signature associated with it. This is inferred from the class 1 and 2 instances. Consequently it is expected that the inputs are covered by a data-type signature and are data-type checked. Three such constructs are instances-of CLASS, STRUCTURE, and ALGQUERY. The following scripts show three queries: the first interrogates the object base for instances of ALGQUERY; the second works out how many instances have been created for each ALGQUERY instance; and the third shows the instances of one particular ALGQUERY instance.

?- ?Q:algquery.
?Q = uv(person,postgrad,"extent")
?Q = uv(student,exstudent,"deepextent")
?Q = uv(student,exstudent,"extent")
Yes.

?- ?Qi = count{?I[?Q] | ?Q:algquery, ?I:?Q}.
?Qi = 5   ?Q = uv(person,postgrad,"extent")
?Qi = 9   ?Q = uv(student,exstudent,"extent")
?Qi = 10  ?Q = uv(student,exstudent,"deepextent")
Yes.

?- ?I:uv(person,postgrad,"extent").
?I = iuv(person,postgrad,"extent",dri,1)
…
?I = iuv(person,postgrad,"extent",susan,2)
Yes.
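Because a query instance is itself a range, a cascade of queries can in principle be written; the following is a hypothetical example of closure (the second invocation's operands are plausible but not taken from the framework's test runs):

?- algebra[%uv(student,exstudent,"extent")].
?- algebra[%uv(uv(student,exstudent,"extent"),postgrad,"extent")].
// the first query's instance uv(student,exstudent,"extent") is the
// range of the second invocation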

Database algebras are sometimes classified by the number of ranges they have in their input. In our algebra the operators are either unary (i.e. take a single range) or binary (i.e. take two ranges). Some operators take the name of a function as an argument, but by convention these are ignored when classifying by number of ranges.

A large number of methods are used across all operators. For example, since the operators are value based, duplicate detection and elimination is required; consequently the methods %DUPLICATES(…) and %DELDUPLICATE(…) of object ALGEBRA are accessible across the framework.

11.2.3 Union and Difference

Union and difference are binary operators and are set theoretic in meaning. In this framework we insist that the operands are union compatible; i.e. the operators take homogeneous and comparable lists of objects. The output of the operators is also data-type homogeneous and is inferred from the operands' data-type signatures. Also, these two operators are adequate to derive other set operators, for example intersection (see the sketch below).
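For instance, intersection is derivable through the identity A ∩ B = A − (A − B); assuming a difference-by-value procedure named %DV analogous to %UV (the name is our assumption), an intersection by value could in principle be composed as:

// Hypothetical composition, relying on closure: A intersect B = A - (A - B).
?- algebra[%dv(student,exstudent,"extent")],
   algebra[%dv(student,dv(student,exstudent,"extent"),"extent")].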

Each of these operators has algebraic properties that are given in each respective sub-section.

11.2.3.1 Union Compatible

The idea of union compatibility in the relational data model is that two operands in an algebraic operation have the same arity and their domains match one to one between the operands' attributes. Attribute names are indispensable. Not all algebraic operators require their operands to be union compatible.

In our object-oriented data and query model we have to develop this to cater for operands that are in a sub-class relation (i.e. ISA), and for the possibility that attributes take arguments. Also property names are used but generality is not lost as a renaming algebraic operation is available in this query model (see later section).

The object ALGEBRA has a procedure called %UNIONCOMP that takes two arguments and returns a result string. The procedure takes care of four cases through the invocation of procedure %UC. The first deals with the spurious case where the two input arguments are identical. The second and third cases deal with the arguments being in an explicit ISA relationship. For the fourth case another procedure is called, named %UCT, which checks that there is no method data signature in the first argument that is not in the second argument, and conversely that there is no method in the second argument that is not in the first. The methods' data-type compatibility is checked on name, cardinality constraints, and the data types of a method's argument and return. This implementation of union compatibility between two object data signatures does not determine a proper implicit subtype relationship between two operands.

algebra[%unioncomp(?C1,?C2,?Text)] :-
   if (%uc(?C1,?C2,?Text))
   then ( writeln('union compatible')@_prolog, true )
   else ( writeln('not union compatible')@_prolog, false ).

%uc(?C1,?C2,?Text) :- ?C1=?C2,  ?Text="UC: Self", !.
%uc(?C1,?C2,?Text) :- ?C1::?C2, ?Text="UC: $1 ISA $2 related", !.
%uc(?C1,?C2,?Text) :- ?C2::?C1, ?Text="UC: $2 ISA $1 related", !.



%uc(?C1,?C2,?Text) :-
   if (%uct(?C1,?C2), %uct(?C2,?C1))
   then (false)
   else (?Text="UC: all properties match", true).

%uct(?C1,?C2) :- ?C1[?M{?B:?T}*=>?D], \+ ?C2[?M{?B:?T}*=>?D].
%uct(?C1,?C2) :- ?C2[?M{?B:?T}*=>?D], \+ ?C1[?M{?B:?T}*=>?D].
%uct(?C1,?C2) :- ?C1[?M{?B:?T}=>?D],  \+ ?C2[?M{?B:?T}=>?D].
%uct(?C1,?C2) :- ?C2[?M{?B:?T}=>?D],  \+ ?C1[?M{?B:?T}=>?D].

The following interactive script shows the use of union compatibility. The first call checks union compatibility of a class with itself. The second call confirms that STUDENT (i.e. the second argument) ISA PERSON. The third check confirms that EXSTUDENT and STUDENT are union compatible even though they are not in an ISA relationship.

?- algebra[%unioncomp(person,person,?T)].
union compatible
?T = "UC: Self"
Yes.

?- student::person.
Yes.

?- algebra[%unioncomp(person,student,?T)].
union compatible
?T = "UC: $2 ISA $1 related"
Yes.

?- exstudent::student.
No.
?- student::exstudent.
No.

?- algebra[%unioncomp(exstudent,student,?T)].
union compatible
?T = "UC: all properties match"
Yes.

11.2.3.2 Union

The meaning of the binary union algebraic operator for union compatible arguments is straightforward: any object that is an instance-of either argument is included in the result. Also, the data-type signature of the output is determined by the data-type signature relationship of the input arguments. There are two issues specific to our data and query model: the first is whether the extent or the deep extent of the arguments is required; the second is the demand that this operator is value based and produces a set (i.e. it requires duplicate checking and elimination).

In the first example a spurious union between instances of CLASS PERSON is requested. The operator is conditioned to query only the extent of CLASS PERSON. The verbose output confirms the union compatibility, indicates correctly that duplicates are present, and confirms their deletion from the output. The query's meta instance (i.e. instance-of ALGQUERY) and the query result are shown in the script below (e.g. the instances-of UV(PERSON,PERSON,"EXTENT") and the verbose output of the algebraic operation are shown too).

?- algebra[%uv(person,person,"extent")].
union compatible
UC: ok
UC Self
algquery instance created
signatures created
data loid and move done
duplicate list [flapply(iuv,person,person,joe,1),
                flapply(iuv,person,person,jv,1),
                flapply(iuv,person,person,jv,1)]
duplicate deletion done
Yes.

?- ?Q:algquery, ?I:?Q.
?Q = uv(person,person,"extent")  ?I = iuv(person,person,"extent",dri,1)
?Q = uv(person,person,"extent")  ?I = iuv(person,person,"extent",j,1)
?Q = uv(person,person,"extent")  ?I = iuv(person,person,"extent",p1,1)
?Q = uv(person,person,"extent")  ?I = iuv(person,person,"extent",p2,1)
Yes.

?- ?I:person, not ?I:student.
?I = dri
?I = j
?I = joe
?I = jv
?I = p1
?I = p2
Yes.

In the following example a union between CLASS instances STUDENT and EXSTUDENT is requested. The operation is adorned with a deep-extent directive. The verbose output notes that the arguments are union compatible as their properties match, and no duplicates are detected on computation. The result includes all instances coming from CLASSES EXSTUDENT and STUDENT (and POSTGRAD by ISA inference).

?- algebra[%uv(student,exstudent,"deepextent")].
union compatible
UC: ok
UC: all properties match
algquery instance created
signatures created
data loid and move done
duplicate list []
Yes.

?- ?Q:algquery, ?I:?Q.
?Q = uv(student,exstudent,"deep…")  ?I = iuv(student,exstudent,"deep…",mary,1)
…
?Q = uv(student,exstudent,"deep…")  ?I = iuv(student,exstudent,"deep…",susan,1)
Yes.

?- ?I:postgrad.
?I = susan
Yes.

For the union operator implementation two reified rules are used. These are attached to an object identified with UVF.



The property ALGQUERYINSTANCE takes care to have a parameterised version and thus asserts that a UV(…) object is an instance-of ALGQUERY. The method takes four arguments: the first two are the collections on which the union is to be executed; the third contains an indicator stating whether a deep extent or an extent is required for the result; and the fourth contains the mode of union compatibility, if indeed the first two arguments are compatible.

uvf [ algqueryinstance(?C1,?C2,?Param1,?Text) ->
      ${ ( uv(?C1,?C2,?Param1):algquery[ uctext->?Text ] :- true ) } ].

Therefore for the above second example this reification invocation produces the following fact:

uv(student,exstudent,"deepextent"):algquery[uctext->"UC: all prop. match"].

The second reification concerns the data-type signature of the result; this is actually related to the union compatibility result. For example, if the two arguments are the same then the data signature of the result is the same too. Also, if all properties match then the result data-type signature is again the same. If the two arguments are in an ISA relationship then the data-type signature is that of the ancestral one. In all cases the deep-extent directive does not affect these inferences; i.e. the data-type signature of the output is that of the arguments and not of their descendants.

The UVF object uses the ALGQUERYDT property and it takes four parameters: the first two denote the collections to union, the third the extent selection, and the fourth the union compatible string. ALGQUERYDT has four instantiations, each related to a different union compatible result string. In each instantiation four rules are inserted into the framework to cater for inheritable (i.e. *=>) and non-inheritable (i.e. =>) methods, and arity and non-arity (i.e. NOT COMPOUND(?M)) methods. Each rule has the same pattern but each fires for a particular data signature type. If there is a method signature in an argument class then it needs to be asserted in the output data-type signature (i.e. of the UV(…) object). Actually there are two generic reification patterns: the first deals with ‘self’ and ‘all properties match’, and the second deals with the ISA relationship (i.e. whether C1::C2 or C2::C1).



uvf[ algquerydt(?C1,?C2,?Param1,"UC: Self") ->
     ${ (uv(?C1,?C2,?Param1)[?M{?B:?T}*=>?D] :-
            ?C1[?M{?B:?T}*=>?D], not compound(?M)@_prolog),
        (uv(?C1,?C2,?Param1)[?M{?B:?T}=>?D] :-
            ?C1[?M{?B:?T}=>?D], not compound(?M)@_prolog),
        (uv(?C1,?C2,?Param1)[?M(?P){?B:?T}*=>?D] :- ?C1[?M(?P){?B:?T}*=>?D]),
        (uv(?C1,?C2,?Param1)[?M(?P){?B:?T}=>?D]  :- ?C1[?M(?P){?B:?T}=>?D]) },

     algquerydt(?C1,?C2,?Param1,"UC: $1 ISA $2 related") ->
     ${ (uv(?C1,?C2,?Param1)[?M{?B:?T}*=>?D] :-
            ?C2[?M{?B:?T}*=>?D], not compound(?M)@_prolog),
        (uv(?C1,?C2,?Param1)[?M{?B:?T}=>?D] :-
            ?C2[?M{?B:?T}=>?D], not compound(?M)@_prolog),
        (uv(?C1,?C2,?Param1)[?M(?P){?B:?T}*=>?D] :- ?C2[?M(?P){?B:?T}*=>?D]),
        (uv(?C1,?C2,?Param1)[?M(?P){?B:?T}=>?D]  :- ?C2[?M(?P){?B:?T}=>?D]) },

     algquerydt(?C1,?C2,?Param1,"UC: $2 ISA $1 related") -> ${ … },

     algquerydt(?C1,?C2,?Param1,"UC: all properties match") -> ${ … } ].

In the following query example, where classes have union compatibility on a common data type signature, the data-type signature of the operation’s result is shown.

?- uv(student,exstudent,"deepextent")[?M*=>?Dt; ?M=>?Dt].
?M = enrolon         ?Dt = course
?M = fname           ?Dt = string
?M = stage           ?Dt = string
?M = telno           ?Dt = integer
?M = result(string)  ?Dt = unit
Yes.

The reification of ALGQUERYDT for the above query creates the following rules that in turn create the data type signatures:

uv(student,exstudent,"deepextent")[?M{?B:?T}*=>?D] :-
    student[?M{?B:?T}*=>?D], not compound(?M)@_prolog.
uv(student,exstudent,"deepextent")[?M{?B:?T}=>?D] :-
    student[?M{?B:?T}=>?D], not compound(?M)@_prolog.
uv(student,exstudent,"deepextent")[?M(?P){?B:?T}*=>?D] :-
    student[?M(?P){?B:?T}*=>?D].
uv(student,exstudent,"deepextent")[?M(?P){?B:?T}=>?D] :-
    student[?M(?P){?B:?T}=>?D].

The actual evaluation of union by value is done through procedure %UV in object ALGEBRA.

The procedure %UV takes three arguments: the first two are the classes whose instances are to be unioned (i.e. unified to variables ?C1 and ?C2) and the third is an indicator of the extent required (i.e. unified to variable ?DEEP). The first task is to determine whether the input operands are union compatible by calling procedure %UNIONCOMP. If successful, its third argument, which denotes the mode of union compatibility, is unified to variable ?TEXT. The next step is to create a union by value instance (i.e. UV(…) as an instance-of ALGQUERY); this is done through reification of ALGQUERYINSTANCE (as explained earlier).


The rules are then inserted into the object base (i.e. with the Flora-2 INSERTRULE predicate). To the object just created, i.e. UV(…), the data-type signature is assigned through the reification of the rules found in the property ALGQUERYDT, qualified by the mode of union compatibility (i.e. ?TEXT). The rules created need to be inserted into the object base too. After instantiating the query’s meta structures it is time to materialise the instances of the query, and this is done in three steps. The first ensures that each output object has an identifier and is an instance-of the query instance (i.e. IUV(…):UV(…)); procedure %UVDATALOID is called for this. The second ensures that every object instance-of UV(…) just created has its properties assigned values from the arguments’ instances; procedure %UVDATA covers this. The last step requires that the result is checked for value duplicates, and if any are present then the extra instances are deleted from the framework object base; this is done by the ALGEBRA procedures %DUPLICATES and %DELDUPLICATE (explained in section 11.2.3.4 below).

algebra[%uv(?C1,?C2,?Deep)] :-
    algebra[%unioncomp(?C1,?C2,?Text)],
    writeln('UC: ok')@_prolog, writeln(?Text)@_prolog,
    uvf[ algqueryinstance(?C1,?C2,?Deep,?Text) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    uvf[ algquerydt(?C1,?C2,?Deep,?Text) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    %uvdataloid(?C1,?C2,?Deep),
    writeln('data loid done')@_prolog,
    %uvdata(?C1,?C2,?Deep),
    writeln('move done')@_prolog,
    algebra[%duplicates(uv(?C1,?C2,?Deep),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst,uv(?C1,?C2,?Deep))],
    writeln('duplicate deletion done')@_prolog.
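For instance, the union of the two classes used in the earlier examples, taken over their deep extents, is invoked as follows (an illustrative call mirroring the scripts above):

?- algebra[%uv(student,exstudent,"deepextent")].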

The procedure %UVDATALOID, listed below, takes three arguments which are unified in the calling procedure %UV. Basically the procedure needs to create an identifier for each instance-of related to the procedure’s first two arguments. For this the COLLECTSET predicate is used. Because the operator has control over what extent to include in the output, this is reflected in the qualifier inside each COLLECTSET invocation. Since F-Logic asserts the deep extent by inference, the ‘extent’ directive needs to weed out instances of any of the class’s sub-classes. Once the required COLLECTSET is bound for each CLASS instance, each set is passed on to its respective procedure to create the IUV(…) identifiers; these are called


%UVINSERTLEFT and %UVINSERTRIGHT. These procedures are very similar to each other and differ only in the composition of the logical identifier created.

%uvdataloid(?C1,?C2,?Deep) :-
    if (?Deep="extent")
    then (?Ileftlst=collectset{?Ileft| ?Ileft:?C1,
              if (?Sc::?C1, ?Ileft:?Sc) then (false) else (true)})
    else (?Ileftlst=collectset{?Ileft| ?Ileft:?C1}),
    %uvinsertleft(?Ileftlst,?C1,?C2,?Deep),
    if (?Deep="extent")
    then (?Irightlst=collectset{?Iright| ?Iright:?C2,
              if (?Sc::?C2, ?Iright:?Sc) then (false) else (true)})
    else (?Irightlst=collectset{?Iright| ?Iright:?C2}),
    %uvinsertright(?Irightlst,?C1,?C2,?Deep).

%uvinsertleft([?H|?Ileft],?C1,?C2,?Deep) :-
    insert { iuv(?C1,?C2,?Deep,?H,1):uv(?C1,?C2,?Deep) },
    %uvinsertleft(?Ileft,?C1,?C2,?Deep).
%uvinsertleft([],?_,?_,?_).

%uvinsertright([?H|?Iright],?C1,?C2,?Deep) :-
    insert { iuv(?C1,?C2,?Deep,?H,2):uv(?C1,?C2,?Deep) },
    %uvinsertright(?Iright,?C1,?C2,?Deep).
%uvinsertright([],?_,?_,?_).

Procedure %UVINSERTLEFT (and %UVINSERTRIGHT) is recursive and takes four arguments, all of which are instantiated in %UVDATALOID, i.e. the calling procedure. The first argument, which drives the recursion, is the list of objects to create an identifier for. On each procedure invocation an object instance-of assertion is inserted into the object base (i.e. IUV(…):UV(…)). The instance identifier IUV(…) is a composition of the class instances, the extent indicator, the head of the list (i.e. the first argument), and an arbitrary number. The latter is the only difference between %UVINSERTLEFT and %UVINSERTRIGHT, with values ‘1’ and ‘2’ respectively. An example logical identifier is:

iuv(student,exstudent,"deepextent",mary,1)

The procedure %UVDATA also has three arguments, unified in procedure %UV: the class instances and the extent directive. The main idea of this procedure is to ‘move’ properties from either argument’s instances to the result collection. Since there are four kinds of properties of interest (two related to inheritance and two to arity) and two arguments (arbitrarily mapped to ‘1’ and ‘2’), each of the eight cases is collected and an appropriate recursive procedure is invoked; these are called %UVINSERTPMETHOD and %UVINSERTSMETHOD and deal with arity and non-arity methods respectively. The collection predicate for inheritable but non-arity methods is unified to variable ?LEFTISMLST; it loops over all objects that have their fifth parameter set to ‘1’ (i.e. an instance of ?C1) and are an instance-of the union operation being computed, loops over all inheritable but non-arity methods (i.e. ?M*=>?D, NOT COMPOUND(?M)), and finally relates the actual instance-of ?C1 (i.e. ?I) to its property values, which are bound to ?R. The ?WL structure is used to build a tuple, and it is actually this tuple that COLLECTSET aggregates.

%uvdata(?C1,?C2,?Deep) :-
    ?LeftIsmlst=collectset{ ?Wl |
        iuv(?C1,?C2,?Deep,?I,1)[],
        uv(?C1,?C2,?Deep)[?M*=>?D], not compound(?M)@_prolog,
        ?I:?C1[?M->?R], dt(?D,?Dt), ?R:?Dt,
        ?Wl=f(?C1,?C2,?Deep,?I,1,?M,?R) },
    %uvinsertsmethod(?LeftIsmlst),
    … ,
    ?LeftIpmlst=collectset{ ?Yl |
        iuv(?C1,?C2,?Deep,?I,1)[],
        uv(?C1,?C2,?Deep)[?M(?Pd)*=>?D],
        ?I:?C1[?M(?Pr)->?R], dt(?D,?Dt), ?R:?Dt,
        dt(?Pd,?Pdt), ?Pr:?Pdt,
        ?Yl=f(?C1,?C2,?Deep,?I,1,?M,?Pr,?R) },
    %uvinsertpmethod(?LeftIpmlst),
    … .

%uvinsertsmethod([?H|?Rlst]) :-
    ?H=f(?C1,?C2,?Deep,?I,?N,?M,?R),
    insert{ iuv(?C1,?C2,?Deep,?I,?N)[?M->?R] },
    %uvinsertsmethod(?Rlst).
%uvinsertsmethod([]).

%uvinsertpmethod([?H|?Rlst]) :-
    ?H=f(?C1,?C2,?Deep,?I,?N,?M,?Pr,?R),
    insert{ iuv(?C1,?C2,?Deep,?I,?N)[?M(?Pr)->?R] },
    %uvinsertpmethod(?Rlst).
%uvinsertpmethod([]).

The procedures %UVINSERTPMETHOD and %UVINSERTSMETHOD are identical except that the former asserts methods with arity. Each invocation is passed a list of tuples – those just created with COLLECTSET and built into the ?WL tuple – which is iteratively traversed to assert the proper object property, i.e. IUV(?C1,?C2,?DEEP,?I,?N)[?M->?R].

Properties of the union operator

An important property of this union operator is that it is value based. As a consequence there is a possibility that the operator is lossy; for example, processing a self union can produce an output that is not equal to the input. (In a previous example we have seen that a number of person objects, with different identifiers, had common property values, and consequently these duplicates had to be eliminated.)
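To illustrate, assume two person objects p1 and p2 that agree on every property value (these objects also appear in later scripts). A self union then keeps only one value-based representative; the following is our illustration, not a listing from the thesis:

?- algebra[%uv(person,person,"extent")].
// with p1 and p2 value-equal, only one of the four identifiers
//   iuv(person,person,"extent",p1,1), iuv(person,person,"extent",p1,2),
//   iuv(person,person,"extent",p2,1), iuv(person,person,"extent",p2,2)
// survives duplicate elimination, so the output is smaller than the input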

Every relational operator has a number of algebraic properties: for example, union is commutative, associative, and idempotent. In our algebra the union is commutative, idempotent, and mostly associative. These properties give an algebra a number of equivalence transformation rules that are based on structure. We first describe commutativity, then associativity, and briefly state the idempotence property.

The structural equivalence of union by commutativity states that the order of the operands does not influence the result (where Ei is a range – e.g. a CLASS instance):

E1 ∪ E2 ⇔ E2 ∪ E1

Commutativity implies, in our framework, that if the following is computed:

?- algebra[%uv(student,exstudent,"extent")].

then the following is implied (and therefore there is no need to re-compute it):

?- algebra[%uv(exstudent,student,"extent")].

We propose a solution through F-Logic’s (and Flora-2’s) notion of identity (i.e. the :=: operator). The Flora-2 manual [YANGG08] makes it amply clear that the implementation of this congruence axiom is limited due to its computational costs; nonetheless what is available is adequate for the requirements of this project. Basically we want to assert that the following two logical identifiers represent the same object and therefore each has the same extent (i.e. result).

?- uv(exstudent,student,"extent"):=:uv(student,exstudent,"extent").
Yes
?- ?I:uv(student,exstudent,"extent").
?I = iuv(exstudent,student,"extent",mary,2)
…
?I = iuv(exstudent,student,"extent",william,1)
Yes
?- ?I:uv(exstudent,student,"extent").
?I = iuv(exstudent,student,"extent",mary,2)
…
?I = iuv(exstudent,student,"extent",william,1)
Yes.

To assert this equality we need a reified rule and an additional step in the union evaluation. The rule is attached to UVF.ALGQUERYCOMMU and states that if there is a UV(C1,C2,"…") object which is an instance-of ALGQUERY then there is an identical object reference with its classes juxtaposed, i.e. UV(C2,C1,"…").


uvf[ algquerycommu(?C1,?C2,?Param1) -> ${
       ( uv(?C1,?C2,?Param1):=:uv(?C2,?C1,?Param1) :-
             uv(?C1,?C2,?Param1):algquery[uctext->?_Text] ) } ].

The code of the union by value procedure then needs alterations: basically a pre-test and a post-action. The pre-test checks whether the union query is already an instance-of ALGQUERY, i.e. already computed. The post-action entails the materialisation of the reified rule in UVF.ALGQUERYCOMMU and inserting the rule into the object base.
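The following sketch shows one way to wrap %UV accordingly; the guard, the helper name %uvbody (standing for the original body listed above), and the final insertrule step are our illustration and not a listing from the thesis:

// A minimal sketch, assuming the original %uv body is renamed to %uvbody:
algebra[%uv(?C1,?C2,?Deep)] :-
    if ( uv(?C1,?C2,?Deep):algquery )          // pre-test: already computed?
    then ( true )                              // reuse the materialised result
    else ( algebra[%uvbody(?C1,?C2,?Deep)],    // evaluate as listed earlier
           uvf[ algquerycommu(?C1,?C2,?Deep) -> ?Rules3 ],
           insertrule { ?Rules3 } ).           // post-action: commutativity rule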

The structural equivalence of union by associativity states that the order of performing two unions in cascade does not influence the result:

(E1 ∪ E2) ∪ E3 ⇔ E1 ∪ (E2 ∪ E3)

Within this framework union associativity has limitations. The above identity works if the union compatibility mode is ‘self’ or ‘all properties match’ in either union. If the mode is an ISA relationship there are issues: the main one is that although each pair can be in an ISA relationship, it might not be the case in both expressions. (Specifically, if E1::E2 and E3::E2 then E1 and E3 are not ISA related.) An improvement is possible if the union compatibility check confirms implicit ISA relationships (as indicated earlier). Also, each union must be over the same extent for the associativity to hold.

The reified rule follows:

uvf[ algqueryassoc(?C1,?C2,?Param1) -> ${
       ( uv(uv(?C1,?C11,?Param1),?C2,?Param1) :=:
         uv(?C1,uv(?C11,?C2,?Param1),?Param1) :-
             uv(uv(?C1,?C11,?Param1),?C2,?Param1):algquery,
             uv(uv(?C1,?C11,?Param1),?C2,?Param1)[uctext->"UC: Self";
                                                  uctext->"All match"] ) } ].

The following script indicates an example:

?- algebra[%uv(person,operson,"extent")].
Yes.
?- algebra[%uv(yperson,uv(person,operson,"extent"),"extent")].
Yes.
?- ?Q:algquery.
?Q = uv(person,operson,"extent")
?Q = uv(yperson,uv(person,operson,"extent"),"extent")
Yes.
?- ?Q:algquery,?I:?Q.
?Q = uv(person,operson,"extent")
?I = iuv(person,operson,"extent",dri,1)
…
?Q = uv(yperson,uv(person,operson,"extent"),"extent")
?I = iuv(yperson,uv(person,operson,"…"),"extent",iuv(person,operson,"…",p2,1),2)
Yes
?- uv(?C11,?C3,?Param1):algquery[uctext->"UC: Self";
                                 uctext->"UC: all properties match"],
   ?C11=uv(?C1,?C2,?Param1).
?C11 = uv(person,operson,"extent")
?C3 = yperson
?Param1 = "extent"
?C1 = person
?C2 = operson
Yes.

The union by value operator is idempotent in the sense that repeated application of union on the same operands does not change the output.
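For instance, re-evaluating the same union is expected to leave the materialised result unchanged (an illustrative expectation, not a listing from the thesis):

?- algebra[%uv(student,exstudent,"extent")].
Yes.
?- algebra[%uv(student,exstudent,"extent")].
// the second run re-derives the same uv(…) instances; duplicate elimination
// leaves the extent of uv(student,exstudent,"extent") exactly as before
Yes.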

11.2.3.3 Difference

The meaning of the binary difference algebraic operator, for union compatible arguments, is that any object that is an instance-of the first argument but not of the second is included in the result. The operator is not commutative. The data-type signature of the output is determined by a data-type signature relationship of the input arguments. Set difference is colloquially known as minus. There are two issues specific to our data and query model: the first is whether the extent or the deep extent of the arguments is required; the second is the demand that this operator is value based and produces a set (i.e. it requires a value-based comparison and a duplicate check with elimination).

An example of value-based difference is finding value-based instances of CLASS PERSON that are not instances of CLASS STUDENT; the example expects ISA inference. The verbose output confirms union compatibility, and the query related instantiations proceed very much like those of the union operation. The operator then takes a turn of its own because it needs to remove instances: first it removes instances that are common, by value, with the second operand’s range, and secondly any duplicate objects, by value, present in the result’s computation. The latter duplication check is identical to the one found in union by value.

?- algebra[%mv(person,student,"deepextent")].
union compatible
UC: ok
UC: $2 ISA $1 related
algquery instance created
signatures created
data loid done
data move done
?C1lst […]
?C2lst […]
Diff to purge list: […]
Diff deletion done
duplicate list[…]
duplicate deletion done
Yes.
?- ?Q:algquery,?I:?Q.
?Q = mv(person,student,"deepextent") ?I = imv(person,student,"deepextent",dri,1)
?Q = mv(person,student,"deepextent") ?I = imv(person,student,"deepextent",j,1)
?Q = mv(person,student,"deepextent") ?I = imv(person,student,"deepextent",p1,1)
?Q = mv(person,student,"deepextent") ?I = imv(person,student,"deepextent",p2,1)
Yes.

The second part of the script shows that difference by value is an instance-of ALGQUERY object, identified by a template of the MV(ARG1,ARG2,PARAM1) structure, and that its instances are identified by the IMV(…) template. Again this is similar to union by value except for the functors.

The difference operator uses reified rules too. The first two, ALGQUERYINSTANCE and ALGQUERYDT, deal with ALGQUERY instantiation and data-type signature creation. The following code snippet shows that these rules are properties of object MVF, i.e. the supporting object for the difference operator. (Note there are actually four instantiations of ALGQUERYDT, each related to one positive union compatibility determination.) The methods that are assigned reified rules take arguments with an identical meaning to those of the union by value presented in an earlier section.

mvf[ algqueryinstance(?C1,?C2,?Param1,?Text) ->
       ${ ( mv(?C1,?C2,?Param1):algquery[uctext->?Text] :- true ) },

     algquerydt(?C1,?C2,?Param1,"UC: all properties match") -> ${
       ( mv(?C1,?C2,?Param1)[?M{?B:?T}*=>?D] :-
             ?C1[?M{?B:?T}*=>?D], not compound(?M)@_prolog),
       ( mv(?C1,?C2,?Param1)[?M{?B:?T}=>?D] :-
             ?C1[?M{?B:?T}=>?D], not compound(?M)@_prolog),
       ( mv(?C1,?C2,?Param1)[?M(?P){?B:?T}*=>?D] :- ?C1[?M(?P){?B:?T}*=>?D]),
       ( mv(?C1,?C2,?Param1)[?M(?P){?B:?T}=>?D] :- ?C1[?M(?P){?B:?T}=>?D] ) } ].

The following query shows the data signature of the example difference query just presented.

?- ?Q:algquery[?M*=>?Dt;?M=>?Dt].
?Q = mv(person,student,"deepextent") ?M = fname ?Dt = string
?Q = mv(person,student,"deepextent") ?M = telno ?Dt = integer
…
Yes.

The implementation of difference by value, ALGEBRA[%MV(…)], is similar to union. First union compatibility is checked and its mode determined; then the query instance and output signature are created through the procedure’s arguments and rule reification. The procedure %MVDATALOID is a one-sided copy of %UVDATALOID, as only the left-hand side instances, i.e. of ?C1, need instantiating; basically it creates an instance-of assertion for the query object in the form of IMV(…):MV(…). Likewise, procedure %MVDATA is a one-sided copy of %UVDATA; it copies property values from instances of CLASS ?C1 to instances of MV(…).
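Although %MVDATALOID is not listed, a minimal sketch can be reconstructed by halving %UVDATALOID; the body below, including the helper name %mvinsertleft, is our hypothetical reconstruction:

// Hypothetical reconstruction: only the left operand's instances get loids.
%mvdataloid(?C1,?C2,?Deep) :-
    if (?Deep="extent")
    then (?Ileftlst=collectset{?Ileft| ?Ileft:?C1,
              if (?Sc::?C1, ?Ileft:?Sc) then (false) else (true)})
    else (?Ileftlst=collectset{?Ileft| ?Ileft:?C1}),
    %mvinsertleft(?Ileftlst,?C1,?C2,?Deep).

%mvinsertleft([?H|?Ileft],?C1,?C2,?Deep) :-
    insert { imv(?C1,?C2,?Deep,?H,1):mv(?C1,?C2,?Deep) },
    %mvinsertleft(?Ileft,?C1,?C2,?Deep).
%mvinsertleft([],?_,?_,?_).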

algebra[%mv(?C1,?C2,?Deep)] :-
    algebra[%unioncomp(?C1,?C2,?Text)],
    writeln('UC: ok')@_prolog, writeln(?Text)@_prolog,
    mvf[ algqueryinstance(?C1,?C2,?Deep,?Text) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    mvf[ algquerydt(?C1,?C2,?Deep,?Text) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    %mvdataloid(?C1,?C2,?Deep),
    writeln('data loid done')@_prolog,
    %mvdata(?C1,?C2,?Deep),
    writeln('data move done')@_prolog,
    %mvdatadiffpurge(?C1,?C2,?Deep,[],?Difflst),
    write('Diff to purge list: ')@_prolog, writeln(?Difflst)@_prolog,
    algebra[%delduplicate(?Difflst,mv(?C1,?C2,?Deep))],
    writeln('Diff deletion done')@_prolog,
    algebra[%duplicates(mv(?C1,?C2,?Deep),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst,mv(?C1,?C2,?Deep))],
    writeln('duplicate deletion done')@_prolog.

The coding takes a distinct slant in procedure %MVDATADIFFPURGE, which is responsible for finding and purging instances moved to MV(…) that are also in ?C2 – i.e. that should not be in the operation’s output. The procedure takes five arguments: the first three are the two operands and the ISA inference indicator; the last two are the start and end lists of instances-of MV(…) that need retracting. The calling procedure, ALGEBRA[%MV(…)], binds the first four arguments. On invocation the procedure creates two lists reflecting instance-of relationships: the first of the IMV(…):MV(…) just created – ?C1LST, and the second of the instances-of ?C2 (i.e. the second operand) – ?C2LST. This procedure calls %MINUSLOID, which takes five arguments: the first is the class instance ?C1, the next two are the lists just created, and the last two are the start and end lists of instances to be retracted. Procedure %MINUSLOID is recursive and iterates over the list of ?C1 instances; on each invocation it calls another recursive procedure, %MINUSLOIDINNER, which checks the current instance of ?C1 against all instances of ?C2 for value duplication. The duplicate check is done by procedure %DUPL_CHK_COMP_DIFF (part of the ALGEBRA[%DUPLICATES(…)] machinery), which compares two instances by value. (The explanation of ALGEBRA[%DUPLICATES(…)] is forthcoming.) If there is duplication then comparison with the remaining ?C2 instances halts and the duplicate object (i.e. an instance of ?C1) is pushed onto the duplicate list. Once the duplicate list is built in %MINUSLOID, it is passed back to the calling ALGEBRA[%MV(…)] procedure to continue the final steps. Of course the next step is to retract the instances identified in %MVDATADIFFPURGE. The last step of the operator ensures that there are no duplicates in the result set – basically it is possible that there are duplicates among the instances of ?C1 too.

%mvdatadiffpurge(?C1,?C2,?Deep,?Sduplst,?Duplst) :-
    ?C1lst=collectset{?Ii| imv(?C1,?C2,?Deep,?I,1):mv(?C1,?C2,?Deep),
                           ?Ii=imv(?C1,?C2,?Deep,?I,1)},
    if ( ?Deep = "extent" )
    then (?C2lst=collectset{?H| ?H:?C2,
              if (?Sc::?C2, ?H:?Sc) then (false) else (true)})
    else (?C2lst=collectset{?H| ?H:?C2}),
    write('?C1lst ')@_prolog, writeln(?C1lst)@_prolog,
    write('?C2lst ')@_prolog, writeln(?C2lst)@_prolog,
    %minusloid(?C1,?C1lst,?C2lst,?Sduplst,?Duplst).

%minusloid(?C1,[?I|?C1lst],?C2lst,?Sduplst,?Duplst) :-
    %minusloidinner(?C1,?I,?C2lst,?Result),
    if (?Result=[])
    then (%minusloid(?C1,?C1lst,?C2lst,?Sduplst,?Duplst))
    else (%minusloid(?C1,?C1lst,?C2lst,[?I|?Sduplst],?Duplst)).
%minusloid(?_,[],?_,?Duplst,?Duplst) :- !.

%minusloidinner(?C1,?I,[?H|?C2lst],?Result) :-
    if ( \+ %dupl_chk_comp_diff(?C1,?I,?H) )
    then ( ?Result=[?I] )
    else ( %minusloidinner(?C1,?I,?C2lst,?Result) ).
%minusloidinner(?_,?_,[],[]) :- !.

Properties of the difference operator

Difference by value is neither commutative nor associative. Nonetheless it is an important operator for expressing queries. It is also a non-monotonic operator, and at a computational level it is expensive to compute.
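An illustrative consequence of non-commutativity, using the schema above (STUDENT ISA PERSON): the two orderings below denote different queries, and the second is expected to be empty since every student, by value, also appears in person’s deep extent (our illustration, not a listing from the thesis):

?- algebra[%mv(person,student,"deepextent")].   // persons that are not students
?- algebra[%mv(student,person,"deepextent")].   // expected to be empty by value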

11.2.3.4 Duplicate Elimination

The high-level requirements for duplicate identification and purging are: given a CLASS instance (or STRUCTURE, or ALGQUERY), do any of its instances have value equivalence; and any duplicate instances need to be purged (leaving one representative).

The procedure %DUPLICATES, attached to object ALGEBRA, takes two arguments. The first is a CLASS instance (i.e. ?C) bound in the calling procedure, and the second is a list of objects (?DUPLST) that are duplicates; this is its output. The procedure starts by checking that the CLASS instance is a user-defined object or an ALGQUERY object, creates a list of all its instances, and then calls the recursive procedure %DUPL_CHK, which takes four arguments. This procedure iterates over the list of instances, i.e. its second argument, and invokes procedure %DUPL_CHK_COMP to check whether any instance-of the first argument has a value-based copy of the head of the list (i.e. ?I). Each invocation of %DUPL_CHK_COMP returns a list of duplicates, which is appended to %DUPL_CHK’s third argument. When the recursion of %DUPL_CHK ends, because the list of instances is exhausted, its third and fourth arguments are unified.

Procedure %DUPL_CHK_COMP needs to determine whether all properties match in value. The pattern to determine this is a double negative: it is not the case that two instances have a different value in a property. Note that ?I is compared not with all instances of the class but only with what is left of the list; observe how the head and rest of the list in %DUPL_CHK’s second argument are passed separately as %DUPL_CHK_COMP’s second and third arguments. For the actual check %DUPL_CHK_COMP calls procedure %DUPL_CHK_COMP_DIFF. The latter has four versions to take care of the four different data-type signatures of interest.

algebra[%duplicates(?C,?Duplst)] :-
    (?C:class ; ?C:structure ; ?C:algquery),
    ?Ilst=collectset{?I| ?I:?C},
    %dupl_chk(?C,?Ilst,[],?Duplst).

%dupl_chk(?C,[?I|?Rlst],?Blst,?Flst) :-
    %dupl_chk_comp(?C,?I,?Rlst,[],?Dlsti),
    ?Blst[_append(?Dlsti)->?Ilst]@_basetype,
    %dupl_chk(?C,?Rlst,?Ilst,?Flst).
%dupl_chk(?_,[],?Flst,?Flst) :- !.

%dupl_chk_comp(?C,?I,[?H|?Rlst],?Blst,?Dlsti) :-
    if ( \+ %dupl_chk_comp_diff(?C,?I,?H) )
    then ( ?Blst[_append([?H])->?Ilst]@_basetype,
           %dupl_chk_comp(?C,?I,?Rlst,?Ilst,?Dlsti) )
    else ( %dupl_chk_comp(?C,?I,?Rlst,?Blst,?Dlsti) ).
%dupl_chk_comp(?_,?_,[],?Dlsti,?Dlsti) :- !.

%dupl_chk_comp_diff(?C,?I,?H) :-
    ?C[?M*=>?D], not compound(?M)@_prolog,
    ?I[?M->?R1], ?H[?M->?R2], ?R1 != ?R2,
    (not ?I[?M->?R2] ; not ?H[?M->?R1]).
%dupl_chk_comp_diff(?C,?I,?H) :-
    ?C[?M=>?D], not compound(?M)@_prolog,
    ?I[?M->?R1], ?H[?M->?R2], ?R1 != ?R2,
    (not ?I[?M->?R2] ; not ?H[?M->?R1]).
%dupl_chk_comp_diff(?C,?I,?H) :-
    ?C[?M(?P)*=>?D], ?I[?M(?P1)->?R1], ?H[?M(?P2)->?R2],
    ( ?R1 != ?R2 ; ?P1 != ?P2 ),
    (not ?I[?M(?P2)->?R2] ; not ?H[?M(?P1)->?R1]).
%dupl_chk_comp_diff(?C,?I,?H) :-
    ?C[?M(?P)=>?D], ?I[?M(?P1)->?R1], ?H[?M(?P2)->?R2],
    ( ?R1 != ?R2 ; ?P1 != ?P2 ),
    (not ?I[?M(?P2)->?R2] ; not ?H[?M(?P1)->?R1]).

This procedure is computationally expensive even though the coding tries to minimise comparisons. In later sections techniques are shown that decrease its computation cost or avoid its evaluation altogether.


The purging of duplicates, once identified, is straightforward. The procedure %DELDUPLICATE is attached to object ALGEBRA and takes two arguments. The first argument is a list of objects to delete and the second is the CLASS instance. The procedure calls a recursive procedure, %PROC_DELDUPLICATE, to iterate over the list of objects; each invocation deletes the head of the list by retracting the instance-of assertion (i.e. ?H:?C). The retraction is done through the Flora-2 object base operation DELETEALL{…}.

algebra[%delduplicate(?Duplst,?C)] :- %proc_delduplicate(?Duplst,?C).

%proc_delduplicate([?H|?Duplst],?C) :-
    deleteall{?H:?C},
    %proc_delduplicate(?Duplst,?C).
%proc_delduplicate([],?_).
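Taken together, identification and purging compose as follows (an illustrative call sequence):

?- algebra[%duplicates(person,?Duplst)],
   algebra[%delduplicate(?Duplst,person)].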

11.2.3.5 Intersection

The meaning of the binary intersection algebraic operator, for union compatible classes, insists that any object included in the result must have an instance-of in both arguments. Also, the data-type signature of the output is determined by a data-type signature relationship of its input arguments. There are two issues specific to our data and query model: the first is whether the extent or the deep extent of the arguments is required; the second is the demand that this operator is value based and produces a set (i.e. it requires a duplicate check and elimination).

It is also well known that intersection can be derived from difference and union, or from difference alone. The following identity holds (where Ei is, for example, a CLASS instance):

E1 ∩ E2 ⇔ E1 − (E1 − E2)

Intersection, like union by value, is commutative and associative. Also, as discussed for the union operator, intersection by value is lossy.
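The identity suggests that intersection need not be implemented as a primitive; a derived macro along the following lines would suffice (the name %iv and its body are our hypothetical sketch, mirroring the script below):

// Hypothetical derived operator: E1 ∩ E2 = E1 − (E1 − E2), via %mv.
algebra[%iv(?C1,?C2,?Deep)] :-
    algebra[%mv(?C1,?C2,?Deep)],                 // computes E1 − E2
    algebra[%mv(?C1,mv(?C1,?C2,?Deep),?Deep)].   // computes E1 − (E1 − E2)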

In our framework this works as follows: objects STUDENT and POSTGRAD are instances-of CLASS and furthermore POSTGRAD is ISA related to STUDENT. The intersection of their deep extents is equal to POSTGRAD’s deep extent. The following script shows that the intersection of these deep extents does indeed produce POSTGRAD’s deep extent (i.e. derived from object SUSAN). Note the output’s data signature and its property assignments.


?- algebra[%mv(student,postgrad,"deepextent")].
Yes.
?- ?Q:algquery.
?Q = mv(exstudent,student,"deepextent")
…
Yes.
?- ?I:mv(student,postgrad,"deepextent").
?I = imv(student,postgrad,"deepextent",mary,1)
…
?I = imv(student,postgrad,"deepextent",robert,1)
Yes.
?- algebra[%mv(student,mv(student,postgrad,"deepextent"),"deepextent")].
?- ?I:mv(student,mv(student,postgrad,"deepextent"),"deepextent").
?I = imv(student,mv(student,postgrad,"deepextent"),"deepextent",susan,1)
Yes.
?- ?I:postgrad.
?I = susan
Yes.
?- mv(student,mv(student,postgrad,"deepextent"),"deepextent")[?M*=>?Dt;?M=>?Dt].
?M = enrolon          ?Dt = course
?M = fname            ?Dt = string
?M = stage            ?Dt = string
?M = telno            ?Dt = integer
?M = result(string)   ?Dt = unit
Yes.
?- ?I:mv(student,mv(student,postgrad,"deepextent"),"deepextent")[?M->?R].
?I = imv(…) ?M = enrolon ?R = bsc_eng
?I = imv(…) ?M = fname ?R = "susan"^^_string
?I = imv(…) ?M = stage ?R = "s"^^_string
…
Yes.

11.2.4 Project

The unary project by value operator produces a set of objects from a range whose properties are the ones passed to the operator. The operator takes an indicator as to whether the deep extent or the extent of a range is required. Other than ignoring non-listed properties of the target class instance, the operator has to ensure the output is devoid of duplicates by value.

The project operator is known as a constructor operator in database algebras as it builds a structure from the input structure, albeit with fewer properties. Also, the data-type signature of the output is determined by a data-type signature relationship of the input argument and the list of properties to project.

A basic example that projects by value on two attributes follows: project properties FNAME and TELNO from CLASS instance PERSON’s deep extent. The invocation of the operator, called ALGEBRA[%PV(…)], shows it takes three arguments: the class range, the ISA indicator, and the list of properties to project. The script shows that none of the projected properties are undefined for CLASS PERSON, and finally that the duplicate test has passed. The query instance and the query’s instances-of are also shown. In the last query the projected values are shown (with the output heavily sanitised).

?- algebra[%pv(person,"deepextent",[fname,telno])].
method not found list: []
Methods exist: ok
algquery instance created
signatures created
data loid and move done
duplicate list[…]
duplicate deletion done
Yes.
?- ?Q:algquery,?I:?Q.
?Q = pv(person,"deepextent",_#'17253) ?I = ipv(person,"deepextent",_#'17253,dri)
?Q = pv(person,"deepextent",_#'17253) ?I = ipv(person,"deepextent",_#'17253,j)
…
?Q = pv(person,"deepextent",_#'17253) ?I = ipv(person,"deepextent",_#'17253,susan)
Yes.
?- ?_Q:algquery,?I:?_Q[?M->?Rt].
?I = ipv(person,"deepextent",_#'17253,dri) ?M = fname ?Rt = "dri"^^_string
?I = ipv(person,"deepextent",_#'17253,dri) ?M = telno ?Rt = "238751"^^_integer
?I = ipv(person,"deepextent",_#'17253,j)   ?M = fname ?Rt = "joe"^^_string
?I = ipv(person,"deepextent",_#'17253,j)   ?M = telno ?Rt = "224502"^^_integer
…
Yes.

The following example shows project by value on properties that include a method with an argument: project the FNAME and RESULT(…) properties for all instances-of STUDENT (but not its deep extent). The three queries in the following script depict: first the project by value operation, second the data signature of the result instance-of query, and third the output of the query (i.e. sanitised).

?- algebra[%pv(student,"extent",[fname,result(string)])].
method not found list: []
Methods exist: ok
…
Yes.
?- ?Q:algquery[?M*=>?Rt;?M=>?Rt].
?Q = pv(student,"extent",_#'17254) ?M = fname ?Rt = string
?Q = pv(student,"extent",_#'17254) ?M = result(string) ?Rt = unit
…
Yes
?- ?Q:algquery,?I:?Q[?M->?Rt].
?Q = pv(…) ?I = ipv(…) ?M = fname ?Rt = "mary"^^_string
?Q = pv(…) ?I = ipv(…) ?M = result("a"^^_string) ?Rt = idb
?Q = pv(…) ?I = ipv(…) ?M = result("b"^^_string) ?Rt = javaprog
?Q = pv(…) ?I = ipv(…) ?M = result("b"^^_string) ?Rt = pdb
?Q = pv(…) ?I = ipv(…) ?M = result("c"^^_string) ?Rt = dsa

Project by value also uses two reified rules; these are attached to an ad hoc object with identifier PVF. In property ALGQUERYINSTANCE the reified rule creates the ALGQUERY instance in the form of PV(…):ALGQUERY. The method takes four arguments: the first is the class instance, the second the ISA indicator, the third the list of properties to project, and the fourth a unique identifier (to be explained later). Note that the projected list is added to property PROJECTION of the object PV(…) just created.

pvf[ algqueryinstance(?C1,?Param1,?Mthlst,?Tag) ->
     ${ ( pv(?C1,?Param1,?Tag):algquery[projection->?Mthlst] :- true ) } ].

The second reification concerns the data-type signature of the result and is attached to property ALGQUERYDT. The data type of a projection is determined by the method list to project. The invocation requires four arguments: the first is the CLASS range, the second the ISA indicator, the third a property list, and the fourth a unique identifier. Once an instantiation of a reified rule is requested it builds a rule that, for each property in the list, copies its data-type signature from the ?C1 instance to a PV(…) instance. The rule uses a procedure called %LMEMBER to recursively iterate over the method list. Actually four rules are created for the four possible data signatures of interest here (i.e. ?M*=>?DT, ?M(?A)*=>?DT, ?M=>?DT, and ?M(?A)=>?DT).

pvf[ algquerydt(?C1,?Param1,?Mthlst,?Tag) -> ${
       (pv(?C1,?Param1,?Tag)[?M{?B:?T}*=>?D] :-
            pv(?C1,?Param1,?Tag):algquery, %lmember(?M,?Mthlst),
            ?C1[?M{?B:?T}*=>?D], not compound(?M)@_prolog),
       … } ].
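The helper %LMEMBER is plain list membership; the thesis does not list it, so the following is our minimal reconstruction:

// Hypothetical reconstruction of the list-membership helper.
%lmember(?M,[?M|?_]).
%lmember(?M,[?_|?T]) :- %lmember(?M,?T).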

The actual evaluation of project by value is done through procedure %PV in object ALGEBRA.

The procedure %PV takes three arguments: the first is the class instance from which to project (i.e. unified to variable ?C1), the second is an indicator of the extent required (i.e. unified to variable ?DEEP), and the third is the list of properties to project. The first task is to ascertain that the methods in the property list are indeed found in the class instance’s data-type signature. An ad hoc ALGEBRA method, ALGEBRA[%METHODSEXIST(…)], is responsible for this – see section 11.2.4.1 for its explanation. This method is called with the CLASS instance, i.e. ?C1, and the method list, i.e. ?USMTHLST; if all methods have a data-type signature in the CLASS instance then an empty list is returned in the third argument, otherwise the returned list holds the offending methods without a data-type signature. The next actions are to sort the method list and to use a Flora-2 meta-level construct, called newoid, to generate a unique identifier – these are required, together with the operator invocation, in the two reification rules ALGQUERYINSTANCE and ALGQUERYDT of object PVF. These rules are instantiated and inserted into the object base. The next action is to create the query object’s instance-of assertions, which is done by calling procedure %PVDATALOID(…); the assertions are of the form IPV(…):PV(…). Once an identifier is created for an instance, the next step is to assign values to the projected properties – procedure %PVDATA(…) takes care of this and is explained shortly. The last step requires that any value duplicates in the result set are identified and retracted: as usual this is done by the ALGEBRA methods %DUPLICATES(…) and %DELDUPLICATE(…).

algebra[%pv(?C1,?Deep,?Usmthlst)] :-
    algebra[%methodsexist(?C1,?Usmthlst,?Nomthlst)],
    if ( ?Nomthlst = [] )
    then ( writeln('Methods exist: ok')@_prolog )
    else ( write('Methods list: problem; missing attributes - ')@_prolog,
           writeln(?Nomthlst)@_prolog,
           fail ),
    call(sort(?Usmthlst,?Mthlst)@_prolog),
    newoid{?Tag},
    pvf[ algqueryinstance(?C1,?Deep,?Mthlst,?Tag) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    pvf[ algquerydt(?C1,?Deep,?Mthlst,?Tag) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    %pvdataloid(?C1,?Deep,?Mthlst,?Tag),
    %pvdata(?C1,?Deep,?Mthlst,?Tag),
    writeln('data loid and move done')@_prolog,
    algebra[%duplicates(pv(?C1,?Deep,?Tag),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst,pv(?C1,?Deep,?Tag))],
    writeln('duplicate deletion done')@_prolog.

The procedure %PVDATA is called with its four arguments unified by the calling procedure. In %PVDATA a list of tuples is created for each of the four types of data-type signature considered in the framework, relative to the arguments passed to the procedure. The aggregate COLLECTSET predicate uses the %LMEMBER predicate to ensure that only methods in the projection list are considered. Once a list of tuples, e.g. ?WL, is created and aggregated, it is passed to the recursive procedure %PVINSERTSMETHOD(…) to assert arity-less method values.

%pvdata(?C1,?Deep,?Mthlst,?Tag) :-
    ?LeftIsmlst=collectset{ ?Wl |
        pv(?C1,?Deep,?Tag):algquery, %lmember(?M,?Mthlst),
        ipv(?C1,?Deep,?Tag,?I)[],
        pv(?C1,?Deep,?Tag)[?M*=>?D], not compound(?M)@_prolog,
        ?I:?C1[?M->?R], dt(?D,?Dt), ?R:?Dt,
        ?Wl=f(?C1,?Deep,?Tag,?I,?M,?R) },
    %pvinsertsmethod(?LeftIsmlst),
    … .


%pvinsertsmethod([?H|?Rlst]) :-
    ?H=f(?C1,?Deep,?Tag,?I,?M,?R),
    insert{ ipv(?C1,?Deep,?Tag,?I)[?M->?R] },
    %pvinsertsmethod(?Rlst).
%pvinsertsmethod([]).

Properties of the project operator

One of the few relevant properties is that projection by value is idempotent. Another property is that successive projections on the same class are equal to the last projection; of course each successive projected method list must be a subset of the previous one (Π is the customary symbol for projection and E denotes a range – e.g. a CLASS instance):

Π_L1(Π_L2( … Π_Ln(E) … )) ⇔ Π_L1(E),   provided L1 ⊆ L2 ⊆ … ⊆ Ln

11.2.4.1 Check existence of methods in a data type signature

The procedure to check whether a list of methods exists is found in object ALGEBRA and is called %METHODSEXIST. The procedure takes three arguments: the first two are the class instance and the method list to check, while the third is a list of the methods that have not been found in the current class instance. While the first two arguments are unified by the calling procedure, the third is returned by procedure %METHODSEXIST. To iterate over the methods in the list the procedure calls a recursive procedure, %PROC_METHODSEXIST, with the first three arguments unified and a fourth for the returned result. The first two correspond to the class instance and the method list to check. A simple conjunctive test checks whether a data-type signature for the method exists; if not it is appended to the not-found list. Once all methods are tested the recursion terminates and the result is unified and passed back to the calling procedures.

algebra[%methodsexist(?C1,?Mthlst,?Nomthlst)] :-
    %proc_methodsexist(?C1,?Mthlst,[],?Nomthlst),
    write('method not found list: ')@_prolog,
    writeln(?Nomthlst)@_prolog.

%proc_methodsexist(?C1,[?M|?Mthlst],?S,?Nomthlst) :-
    if (?C1[?M*=>?_] ; ?C1[?M=>?_])
    then (%proc_methodsexist(?C1,?Mthlst,?S,?Nomthlst))
    else (%proc_methodsexist(?C1,?Mthlst,[?M|?S],?Nomthlst)).
%proc_methodsexist(?C1,[],?Nomthlst,?Nomthlst) :- !.

The following script shows an invocation of the procedure with an arbitrary list of methods that are checked for presence in CLASS instance STUDENT. The script shows that method RESULT is not present but RESULT(STRING) is.


?- ?C=student, ?C[?M*=>?Dt; ?M=>?Dt].
?C = student ?M = fname ?Dt = string
…
?C = student ?M = result(string) ?Dt = unit
Yes.
?- ?C=student, ?Mthlst=[fname, result, result(string)],
   algebra[%methodsexist(?C,?Mthlst,?Nogood)].
method not found list: [result]
?C = student
?Mthlst = [fname, result, result(string)]
?Nogood = [result]
Yes.

11.2.5 Product

Product by value is based on the set-theoretic operator (i.e. Cartesian product); it is a binary operation between two classes, and it is used to build other structures. For product by value to work there must be no property name in common between its two input classes. The result’s structure is a concatenation of both arguments’ respective structures. The result’s content is the pairing of every instance of one argument with every instance of the other. There are two issues specific to our data and query model: the first is whether the extent or the deep extent of the arguments is required; the second is the demand that this operator is value based and produces a set (i.e. it requires a duplicate check and elimination). Another particular issue is self product, which is the basis of self join, and its evaluation: as explained, this is not allowed unless one operand has its properties renamed to avoid a naming clash in the result.

In the following example a product by value is executed for CLASS instances UNIT and STUDENT. The operator is asked to consider the deep extent. The verbose output confirms that no property, by name and data type, is found in both CLASS instances, and indicates that no duplicates by value are present either. The query’s meta instance (i.e. instance-of ALGQUERY) and the query result are shown in the script below (e.g. instances-of XV(UNIT,STUDENT,"DEEPEXTENT")). In the middle part of the script the data-type signature of the result is indicated. Also, the aggregate queries at the end of the script, given there are no duplicates, confirm that the output size, i.e. the cardinality of the output, is equal to the product of the cardinalities of the arguments.

?- algebra[%xv(unit,student,"deepextent")].
Nameclash: none / ok
algquery instance created
signatures created
data loid and move done
duplicate list[]
duplicate deletion done
Yes.
?- ?Q:algquery,?Q[?M*=>?Dt;?M=>?Dt].
?Q = xv(unit,student,"deepextent") ?M = applicableto ?Dt = course
?Q = xv(unit,student,"deepextent") ?M = assessment ?Dt = string
?Q = xv(unit,student,"deepextent") ?M = coordby ?Dt = person
?Q = xv(unit,student,"deepextent") ?M = credits ?Dt = integer
?Q = xv(unit,student,"deepextent") ?M = enrolon ?Dt = course
?Q = xv(unit,student,"deepextent") ?M = fname ?Dt = string
…
Yes.
?- ?_Q:algquery,?I:?_Q, ?I[uname->?U,fname->?N].
?I = ixv(…) ?U = "group work"^^_string ?N = "mary"^^_string
?I = ixv(…) ?U = "group work"^^_string ?N = "michael"^^_string
?I = ixv(…) ?U = "group work"^^_string ?N = "nancy"^^_string
…
?I = ixv(…) ?U = "dist db"^^_string ?N = "nancy"^^_string
…
72 solution(s) in 0.0000 seconds
Yes.
?- ?Q:algquery,?Icnt=count{?I|?I:?Q}.
?Q = xv(unit,student,"deepextent") ?Icnt = 72 ?I = ?_h3814
Yes.
?- ?C=student,?Ic=count{?I|?I:?C}.
?C = student ?Ic = 8 ?I = ?_h3687
Yes.
?- ?C=unit,?Ic=count{?I|?I:?C}.
?C = unit ?Ic = 9 ?I = ?_h3675
Yes.

The following script is based on an example of a self product on CLASS instance COURSE. The verbose output notes that the signatures created for the result are cognisant of self product, and that the duplication check is executed but finds no occurrences. The first query lists the instances, while the next two compare the data-type signature of COURSE and of its self product (at least for inheritable methods). In the data-type signature of the product result note that the method names have been renamed to avoid any name conflict. The last query shows part of the output.

?- algebra[%xv(course,course,"deepextent")].
Nameclash: none / ok
algquery instance created
signatures created (for self product)
data loid and move done
duplicate list[]
duplicate deletion done
Yes.
?- ?Q:algquery,?I:?Q.
?Q = xv(…) ?I = ixv(course,course,"deepextent",ba_english,ba_english)
?Q = xv(…) ?I = ixv(course,course,"deepextent",ba_english,bsc_comp)
?Q = xv(…) ?I = ixv(course,course,"deepextent",ba_english,bsc_eng)
…
Yes.
?- course[?M*=>?Rt].
?M = cdesc ?Rt = string
?M = cname ?Rt = string
Yes.
?- ?Q:algquery,?Q[?M*=>?Dt].
?Q = xv(course,course,"deepextent") ?M = lf_cdesc ?Dt = string
?Q = xv(course,course,"deepextent") ?M = lf_cname ?Dt = string
?Q = xv(course,course,"deepextent") ?M = rt_cdesc ?Dt = string
?Q = xv(course,course,"deepextent") ?M = rt_cname ?Dt = string
Yes.


?- ?_Q:algquery,?I:?_Q[?M->?R].
?I = ixv(course,course,"deepextent",ba_english,ba_english) ?M = lf_cdesc ?R = "desc english"^^_string
?I = ixv(course,course,"deepextent",ba_english,ba_english) ?M = lf_cname ?R = "english"^^_string
?I = ixv(course,course,"deepextent",ba_english,bsc_comp) ?M = lf_cdesc ?R = "desc english"^^_string
?I = ixv(course,course,"deepextent",ba_english,bsc_comp) ?M = lf_cname ?R = "english"^^_string
…
Yes.

For the product operator implementation two types of reified rules are needed. These are attached to an object identified as XVF. The property ALGQUERYINSTANCE has a parameterised version and thus allows the assertion that an object XV(…) is an instance-of ALGQUERY. The method takes three arguments: the first two are the collections on which the product is to be executed, and the third is an indicator stating whether a deep extent or an extent is required for the result.

xvf[ algqueryinstance(?C1,?C2,?Deep) ->
     ${ ( xv(?C1,?C2,?Deep):algquery :- true ) } ].

The second reification concerns the data-type signature of the result. As stated earlier, the result’s data-type signature is a concatenation of the two classes’ own data-type signatures. For clarity let us call these classes the left and right arguments. We have two cases: the first when the left and right are distinct in data-type signatures; and the second when they are identical, as in self product (at the logical-identifier level). In the implementation here we have opted for two sets of rules that correspond to the cases just noted.

For the first case we need a set of rules to copy from the right data-type signature, and another set to copy from the left. We have previously indicated that we are interested in four types of signature in each class declaration; consequently we need eight rules to cover the product by value result data-type signatures. The method ALGQUERYDT takes three arguments, which are instantiated by the calling procedure: the left class, the right class, and the ISA indicator. For each argument CLASS instance and method signature of interest the rules copy the method signature from the class instance and assert it for the result’s data type. The following fragment depicts the reified rules that

cater for the left class. For example, all inheritable methods without arity of the left class, i.e. ?C1, are included in the product’s objects, i.e. XV(…).

xvf[ algquerydt(?C1,?C2,?Deep) -> ${
       (xv(?C1,?C2,?Deep)[?M{?B:?T}*=>?D] :-
            ?C1[?M{?B:?T}*=>?D], not compound(?M)@_prolog),
       (xv(?C1,?C2,?Deep)[?M{?B:?T}=>?D] :-
            ?C1[?M{?B:?T}=>?D], not compound(?M)@_prolog),
       (xv(?C1,?C2,?Deep)[?M(?P){?B:?T}*=>?D] :- ?C1[?M(?P){?B:?T}*=>?D]),
       (xv(?C1,?C2,?Deep)[?M(?P){?B:?T}=>?D] :- ?C1[?M(?P){?B:?T}=>?D]),
       … ].

The reified rules for the second case are similar to the first except that we need to rename the properties to avoid a name clash; again there are eight rules to instantiate. We have implemented the renaming through procedure ADDPREFIX(…): properties of the left class have ‘LF_’ prefixed to their name, and properties of the right class have ‘RT_’ prefixed to theirs.

xvf[ algquerydtself(?C1,?C2,?Deep) -> ${
       (xv(?C1,?C2,?Deep)[?Mp{?B:?T}*=>?D] :-
            ?C1[?M{?B:?T}*=>?D], not compound(?M)@_prolog,
            addprefix('lf_',?M,?Mp)),
       (xv(?C1,?C2,?Deep)[?Mp{?B:?T}*=>?D] :-
            ?C2[?M{?B:?T}*=>?D], not compound(?M)@_prolog,
            addprefix('rt_',?M,?Mp)),
       … ].
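ADDPREFIX itself is not listed; assuming it simply concatenates atoms, a one-line delegation to the underlying Prolog engine would do (our hypothetical sketch):

// Hypothetical helper: build ?Mp by prefixing atom ?Prefix to name ?M.
addprefix(?Prefix,?M,?Mp) :- atom_concat(?Prefix,?M,?Mp)@_prolog.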

The actual evaluation of product by value is done through procedure %XV in object ALGEBRA. The procedure %XV takes three arguments: the first two are the CLASS instances to operate on (i.e. unified to variables ?C1 and ?C2), and the third is an indicator of the extent required (i.e. unified to variable ?DEEP). The first task is to determine whether the input operands have distinct property names by calling procedure %NAMECLASH of object ALGEBRA. If a property name clash exists then the operator fails; for clarity, procedure %NAMECLASHLIST is called to get the offending properties in a list and output them just before failing. (%NAMECLASH and %NAMECLASHLIST of ALGEBRA are explained in section 11.2.5.1.) The implementation continues by calling, passing arguments to, and instantiating the two reified rules in object XVF for the query instance (i.e. ALGQUERYINSTANCE) and the query instance data-type signature (i.e. ALGQUERYDT). The next job is to create the extent of the current query instance (i.e. the IXV(…):XV(…) instance-of assertions), and to build each instance’s property values. As in the previous operators’ implementations, two appropriate procedures, called %XVDATALOID and %XVDATA, are invoked. Once the query’s instance values are instantiated, a duplicate by value check on the result is done and, if need be, duplicates are eliminated.

algebra[%xv(?C1,?C2,?Deep)] :-
    if ( ?C1!=?C2, algebra[%nameclash(?C1,?C2)] )
    then ( algebra[%nameclashlist(?C1,?C2,?Lc1c2smdmd)],
           write('Nameclash: problem - ')@_prolog,
           writeln(?Lc1c2smdmd)@_prolog,
           fail )
    else ( writeln('Nameclash: none / ok')@_prolog ),
    xvf[ algqueryinstance(?C1,?C2,?Deep) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    if ( ?C1!=?C2 )
    then ( xvf[ algquerydt(?C1,?C2,?Deep) -> ?Rules2 ],
           insertrule { ?Rules2 },
           writeln('signatures created')@_prolog )
    else ( xvf[ algquerydtself(?C1,?C2,?Deep) -> ?Rules22 ],
           insertrule { ?Rules22 },
           writeln('signatures created (for self product)')@_prolog ),
    %xvdataloid(?C1,?C2,?Deep),
    %xvdata(?C1,?C2,?Deep),
    writeln('data loid and move done')@_prolog,
    algebra[%duplicates(xv(?C1,?C2,?Deep),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst,xv(?C1,?C2,?Deep))],
    writeln('duplicate deletion done')@_prolog.

The procedure %XVDATALOID, listed below, takes three arguments which are unified in the calling procedure %XV of object ALGEBRA. Basically the procedure needs to create an identifier for each pairing of the instances-of its first two arguments. For this the COLLECTSET predicate is used; the left and right instance lists are built in the same way, each respecting the extent directive. Because the operator has control over what extent to include in the output, i.e. ?DEEP, this is reflected in the qualifier inside each COLLECTSET invocation. Once the required COLLECTSET is bound for each CLASS instance, the two sets are passed on to procedure %XVINSERT to create the IXV(…) identifiers and instance-of assertions. %XVINSERT recurses over the list of the left class’s instances and, on each invocation, calls another recursive procedure, %XVINSERTINNER, to iterate over the instances-of in the right list; there the relative facts are asserted through an INSERT{…} directive.

%xvdataloid(?C1,?C2,?Deep) :-
    if ( ?Deep = "extent" )
    then ( ?Ileftlst=collectset{?Ileft| ?Ileft:?C1,
               if (?Sc::?C1, ?Ileft:?Sc) then (false) else (true)})
    else ( ?Ileftlst=collectset{?Ileft| ?Ileft:?C1}),
    if ( ?Deep = "extent" )
    then ( ?Irightlst=collectset{?Iright| ?Iright:?C2,
               if (?Sc::?C2, ?Iright:?Sc) then (false) else (true)})
    else ( ?Irightlst=collectset{?Iright| ?Iright:?C2}),
    %xvinsert(?Ileftlst,?Irightlst,?C1,?C2,?Deep).

%xvinsert([?I|?Ileftlst],?Irightlst,?C1,?C2,?Deep) :-
    %xvinsertinner(?I,?Irightlst,?C1,?C2,?Deep),
    %xvinsert(?Ileftlst,?Irightlst,?C1,?C2,?Deep).
%xvinsert([],?_,?_,?_,?_).

%xvinsertinner(?I,[?H|?Irightlst],?C1,?C2,?Deep) :-
    insert { ixv(?C1,?C2,?Deep,?I,?H):xv(?C1,?C2,?Deep) },
    %xvinsertinner(?I,?Irightlst,?C1,?C2,?Deep).
%xvinsertinner(?_,[],?_,?_,?_).

The procedure %XVDATA, parts of which are listed hereunder, is long but straightforward. (There are two versions, corresponding to self products and other products.) In sequence it creates eight lists of IXV(…) instances. Each list is relative to the two classes and the four types of methods of interest. In each list a collection of tuples is created, e.g. ?WL in ?LEFTISMLST, containing the details to copy from source to target. Each collection is composed from the relative data-type signatures and also caters for name overloading, as a method’s data-type signature and data expression must match. Once a list is created it is passed on to procedure %XVINSERTSMETHOD, which iteratively inserts each method property value through an INSERT{…} directive. There is another method, %XVINSERTPMETHOD, that caters for copying values of methods with a parameter.

%xvdata(?C1,?C2,?Deep) :-
    ?LeftIsmlst=collectset{ ?Wl |
        ixv(?C1,?C2,?Deep,?I,?H)[],
        xv(?C1,?C2,?Deep)[?M*=>?D], not compound(?M)@_prolog,
        ?I:?C1[?M->?R], dt(?D,?Dt), ?R:?Dt,
        ?Wl=f(?C1,?C2,?Deep,?I,?H,?M,?R) },
    %xvinsertsmethod(?LeftIsmlst),
    … ,
    ?LeftIpmlst=collectset{ ?Yl |
        ixv(?C1,?C2,?Deep,?I,?H)[],
        xv(?C1,?C2,?Deep)[?M(?Pd)*=>?D],
        ?I:?C1[?M(?Pr)->?R], dt(?D,?Dt), ?R:?Dt,
        dt(?Pd,?Pdt), ?Pr:?Pdt,
        ?Yl=f(?C1,?C2,?Deep,?I,?H,?M,?Pr,?R) },
    %xvinsertpmethod(?LeftIpmlst),
    … .

%xvinsertsmethod([?H|?Rlst]) :-
    ?H=f(?C1,?C2,?Deep,?I,?N,?M,?R),
    insert{ ixv(?C1,?C2,?Deep,?I,?N)[?M->?R] },
    %xvinsertsmethod(?Rlst).
%xvinsertsmethod([]).

%xvinsertpmethod([?H|?Rlst]) :-
    ?H=f(?C1,?C2,?Deep,?I,?N,?M,?Pr,?R),
    insert{ ixv(?C1,?C2,?Deep,?I,?N)[?M(?Pr)->?R] },
    %xvinsertpmethod(?Rlst).
%xvinsertpmethod([]).


Properties of the product operator

An important property of this product operator is that it is value based. The algebraic properties of product on its own are commutativity and associativity.

From commutativity of product we have that the order of operands does not affect the result:

E1 × E2 ⇔ E2 × E1

Therefore if the following query is computed:

?- algebra[%xv(unit,student,"deepextent")].

Then it is result equivalent to:

?- algebra[%xv(student,unit,"deepextent")].

The technique based on the equality of query instances used in the union by value section (i.e. 11.2.3.2) is applicable here too. To assert this equality we need a reified rule and an additional step in the product evaluation. The rule is attached to XVF.ALGQUERYCOMMU and states that if there is an XV(C1,C2,…) object which is an instance-of ALGQUERY then there is an identical object reference with its classes juxtaposed, i.e. XV(C2,C1,…).

xvf[ algquerycommu(?C1,?C2,?Param1) -> ${
       ( xv(?C1,?C2,?Param1):=:xv(?C2,?C1,?Param1) :-
             xv(?C1,?C2,?Param1):algquery ) } ].

The code of the product by value procedure then needs alterations: basically a pre-test and a post-action. The pre-test checks whether the product query is already an instance-of ALGQUERY, i.e. already computed, while the post-action entails the materialisation of the reified rule in XVF.ALGQUERYCOMMU and inserting the rule into the object base (exactly as sketched for union by value earlier).

The structural equivalence of product by associativity states that the order of performing two products in cascade does not influence the result:

(E1 × E2) × E3 ⇔ E1 × (E2 × E3)

The reified rule for implementing product by value associativity follows:

xvf[ algqueryassoc(?C1,?C2,?Param1) -> ${
       ( xv(xv(?C1,?C11,?Param1),?C2,?Param1) :=:
         xv(?C1,xv(?C11,?C2,?Param1),?Param1) :-
             xv(xv(?C1,?C11,?Param1),?C2,?Param1):algquery,
             xv(xv(?C1,?C11,?Param1),?C2,?Param1) ) } ].

11.2.5.1 Property Name clash

It is sometimes necessary for two CLASSES’ sets of properties to be distinct at the name level. Specifically, we have operations that presume that two CLASS instances, for example, do not have a property in common by name. Procedure %NAMECLASH of object ALGEBRA, which takes two arguments, succeeds if such a commonly named property indeed exists in both arguments. The procedure calls another procedure, %NC, that succeeds if a data-type signature coincides in both argument structures. There are four %NC rules, each dealing with one mode of data-type signature. It is important to note that name overlap tolerates data-type signature name overloading; i.e. for a method to name clash it must have the same name and return data type. The code of the mentioned procedures follows.

algebra[%nameclash(?C1,?C2)] :-
    if (%nc(?C1,?C2))
    then ( writeln('name clash')@_prolog, true )
    else ( writeln('no name clash')@_prolog, false ).

%nc(?C1,?C2) :- ?C1[?M{?Low:?High}*=>?D],    ?C2[?M{?Low:?High}*=>?D],    !.
%nc(?C1,?C2) :- ?C1[?M(?P){?Low:?High}*=>?D], ?C2[?M(?P){?Low:?High}*=>?D], !.
%nc(?C1,?C2) :- ?C1[?M{?Low:?High}=>?D],     ?C2[?M{?Low:?High}=>?D],     !.
%nc(?C1,?C2) :- ?C1[?M(?P){?Lo:?Hi}=>?D],    ?C2[?M(?P){?Lo:?Hi}=>?D],    !.

If there is indeed a name clash between two CLASS instances and a list of the specific clashes is required, then one can call procedure %NAMECLASHLIST in object ALGEBRA with the two CLASS instances as arguments. The name clashes list, called ?LC1C2SMDMD, is created in two steps: the first deals with inheritable methods and the second with non-inheritable methods. For inheritable methods a list of pairs, the method name and its return type, is built through a COLLECTBAG predicate. This list, together with the two class instances, is passed to procedure %MDOVERLAPSTAR, which recursively checks each element in the list for data-type signature overlap between the classes; if there is overlap the property name is appended. After the inheritable list is checked a similar process is run on non-inheritable properties. Any overlap is appended to the result. The final list is then returned by the main procedure %NAMECLASHLIST of object ALGEBRA in its third argument.

algebra[%nameclashlist(?C1,?C2,?Lc1c2smdmd)] :-
    ?L11=collectbag{?M|?C1[?M*=>?D]},
    ?L12=collectbag{?D|?C1[?M*=>?D]},
    %merge2list(?L11,?L12,[],?Lc1smd),
    %mdoverlapstar(?C1,?C2,?Lc1smd,[],?Lc1c2smd),
    ?L21=collectbag{?M|?C1[?M=>?D]},
    ?L22=collectbag{?D|?C1[?M=>?D]},
    %merge2list(?L21,?L22,[],?Lc1md),
    %mdoverlap(?C1,?C2,?Lc1md,[],?Lc1c2md),
    ?Lc1c2smd[_append(?Lc1c2md)->?Lc1c2smdmd]@_basetype,
    true.


%mdoverlapstar(?C1,?C2,[?M,?D|?R],?I,?F) :-
    if ( ?C1[?M{?Low:?High}*=>?D], ?C2[?M{?Low:?High}*=>?D] )
    then ( %mdoverlapstar(?C1,?C2,?R,[?M,?D|?I],?F) )
    else ( %mdoverlapstar(?C1,?C2,?R,?I,?F) ).
%mdoverlapstar(?_, ?_, [], ?I, ?F) :- ?I=?F.

%mdoverlap(?C1,?C2,[?M,?D|?R],?I,?F) :-
    if ( ?C1[?M{?Low:?High}=>?D], ?C2[?M{?Low:?High}=>?D] )
    then ( %mdoverlap(?C1,?C2,?R,[?M,?D|?I],?F) )
    else ( %mdoverlap(?C1,?C2,?R,?I,?F) ).
%mdoverlap(?_, ?_, [], ?I, ?F) :- ?I=?F.
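A hedged example of retrieving the clash list; the exact pairing and ordering of the returned list is our assumption, based on how %MDOVERLAPSTAR accumulates its [?M,?D|…] elements:

?- algebra[%nameclashlist(student,person,?L)].
?L = [fname, string, telno, integer]
Yes.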

11.2.6 Select

The select by value operator is unary and its output data-type signatures are identical to its input's. Associated with the select operator's application are a single class instance and a predicate. An instance of the input is part of the query instance, i.e. the output, if the predicate evaluates to true for it. It is important to note that the predicate is evaluated on every instance-of the input argument, and each evaluation's result is independent of any other instance's evaluation. There are two issues specific to our data and query model: the first is whether the extent or deep extent of the arguments is required; the second is that this operator is value based and produces a set (i.e. it requires duplicate checking and elimination).

The following are examples of using the select by value operator, identified by %SV in object ALGEBRA. The first query sifts for PERSON instances whose FNAME property is equal to the string “MARY” (string “MARY” is not the identifier MARY). The first two arguments of the operator list the range and the deep extent requirement. The third argument is the predicate encoded as a functor; the predicate is the path expression FNAME concatenated to PERSON being equal to a string atom whose value is “MARY”. The result data-type signature and the result extent, i.e. the single object identified as MARY together with some of its property values, is given.

?- algebra[%sv(person,"deepextent",eqop(ipe,fname,a(string),"mary"^^_str))].
algquery instance created
signatures created
parsing: flapply(eqop,ipe,fname,flapply(a,string),_datatype(_string(mary),_string))
parsed term (%sbvp_comp_predicate / single_var) : flapply(eqop,ipe,fname,flapply(a,string),_datatype(_string(mary),_string))
data typing: flapply(eqop,ipe,fname,flapply(a,string),_datatype(_string(mary),_string))
data type ok.
filter evaluation
evaluation results (from) to:
[dri,j,joe,jv,mary,michael,nancy,p1,p2,patricia,paul,richard,robert,susan]
[mary]
data loid and selection done
data copying done
duplicate list[]
duplicate deletion done
Yes.

?- ?Q:algquery.
?Q = sv(person,"deepextent",_#'18673)
Yes.

?- ?Q:algquery[?M*=>?Dt].
?Q = sv(person,"deepextent",_#'18673) ?M = fname ?Dt = string
?Q = sv(person,"deepextent",_#'18673) ?M = telno ?Dt = integer
Yes.

?- ?Q:algquery,?I:?Q.
?Q = sv(person,"deepextent",_#'18673) ?I = isv(person,"deepextent",_#'18673,mary)
Yes.

?- ?Q:algquery,?I:?Q[?M->?R].
?Q = sv(…,_#'18673) ?I = isv(…,mary) ?M = fname ?R = "mary"^^_string
?Q = sv(…,_#'18673) ?I = isv(…,mary) ?M = telno ?R = "1234"^^_integer
?Q = sv(…,_#'18673) ?I = isv(…,mary) ?M = telno ?R = "5678"^^_integer
Yes.

The following example shows how the select predicate uses a logical identifier, i.e. object identifier MARY, to filter the CLASS STUDENT instance-of collection. Also note the predicate is unified to a property called SELECTION of the ALGQUERY instance. (Note: all ISV(…) identifiers are identical in the following script and therefore the properties appertain to the same object.)

?- algebra[%sv(student,"deepextent",eqop(loid,mary,loid,self))].
…
Yes.

?- ?_Q:algquery,?I:?_Q,?I[?M->?R].
?I = isv(…) ?M = enrolon ?R = bsc_eng
?I = isv(…) ?M = fname ?R = "mary"^^_string
?I = isv(…) ?M = stage ?R = "g"^^_string
?I = isv(…) ?M = telno ?R = "1234"^^_integer
?I = isv(…) ?M = telno ?R = "5678"^^_integer
?I = isv(…) ?M = result("a"^^_string) ?R = idb
?I = isv(…) ?M = result("b"^^_string) ?R = javaprog
?I = isv(…) ?M = result("b"^^_string) ?R = pdb
?I = isv(…) ?M = result("c"^^_string) ?R = dsa
Yes.

?- ?Q:algquery,?Q[?M->?R].
?Q = sv(…) ?M = selection ?R = eqop(loid,mary,loid,self)
Yes.

The following select by value example shows a predicate over STUDENT instances that is a conjunction of conditions: i.e. STUDENT's FNAME is equal to the string “MARY” and the instance identifier is MARY.

?- algebra[%sv(student,"deepextent",
       and( eqop(ipe,fname,a(string),"mary"^^_string),
            eqop(loid,mary,loid,self) ) )].
…
Yes.

?- ?_Q:algquery,?I:?_Q,?I[?M->?R].
?I = isv(…) ?M = enrolon ?R = bsc_eng
?I = isv(…) ?M = fname ?R = "mary"^^_string
…
Yes.


The following is a very common use of select by value in queries and it involves doing a product by value and then a select on its result; specifically it is called an equi-join. The first operation is a product between CLASS instances UNIT and COURSE. Both classes have a colour property, i.e. CCOLOUR and UCOLOUR respectively, which is assigned strings. The second operation is a select by value whose predicate checks that these two attributes have the same colour value. The example result shows that the “OLIVE” colour is a common value in the instances named “ENGLISH” and “FINAL YEAR PROJECT (FYP)”. The third query shows two tuples from the product that satisfy the select predicate.

?- algebra[%xv(unit,course,"extent")].
?- algebra[%sv(xv(unit,course,"extent"),"extent",
       eqop(ipe,ccolour,ipe,ucolour))].
…
Yes.

?- ?_Q:algquery,?_Q=sv(?_1,?_2,?_3),
   ?I:?_Q[cname->?Cn,uname->?Un,ucolour->?Uc].
?I=isv(…) ?Cn="computing"^^_s ?Un="dist db"^^_s ?Uc = "crimson"^^_s
?I=isv(…) ?Cn="english"^^_s ?Un="final year project"^^_s ?Uc = "olive"^^_s
Yes.

A join between two ranges whose predicate is based not on equality but on any other comparison is called a theta join.
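For instance, replacing the equality functor with the greater-than functor (GTOP, which %SBVP_DD_COMPARISON admits for strings) yields a theta join; this invocation is our own illustration rather than a listing from the test script:

?- algebra[%xv(unit,course,"extent")].
?- algebra[%sv(xv(unit,course,"extent"),"extent",
       gtop(ipe,ccolour,ipe,ucolour))].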

For the select operator implementation two reified rules are used. These are attached to an object identified with SVF.

The property ALGQUERYINSTANCE takes care of having a parameterised version and thus asserts that an SV(…) object is an instance-of ALGQUERY. The method takes four arguments: the first is the collection on which the select predicate is to be executed. The second contains an indicator stating whether a deep extent or an extent is required for the result. The third unifies with the encoded predicate. The fourth holds the select by value instance identifier that is uniquely generated for each query instance.

svf[ algqueryinstance(?C1,?Deep,?Filterlst,?Tag) ->
     ${ ( sv(?C1,?Deep,?Tag):algquery [selection->?Filterlst] :- true ) } ].

The second reification concerns the data-type signature of the result; the output data-type signatures actually remain intact. The SVF object uses the ALGQUERYDT property, which takes three parameters: the same arguments as ALGQUERYINSTANCE except the encoded predicate. ALGQUERYDT has four instantiations, each related to a different property mode: to cater for inheritable (i.e. *=>) and non-inheritable (i.e. =>) methods, and arity and non-arity (i.e. NOT COMPOUND(?M)) methods. Each rule has the same pattern but each fires for a particular data-type signature.

svf[ algquerydt(?C1,?Deep,?Tag) ->
     ${ ( sv(?C1,?Deep,?Tag)[?M{?B:?T}*=>?D] :-
            sv(?C1,?Deep,?Tag):algquery,
            ?C1[?M{?B:?T}*=>?D],
            not compound(?M)@_prolog ),
        … } ].

The object ALGEBRA has procedure %SV to implement select by value, and its invocation requires three variables to be unified. The arguments are the class instance, the extent indicator, and the filtering predicate. This operator has the longest source code listing because of the parser required to read and interpret the predicate.

Procedure %SV starts by generating a logical identifier, by invoking the NEWOID{…} predicate, and unifying it with variable ?TAG. The two reified rules are invoked, reified, and inserted into the object base – these take care of the instance-of assertion (i.e. ISV(…):SV(…)) and the data-type signatures of the query instance class. At this point the extent, or deep extent, of the class instance unified to the first argument is created; i.e. ?EXTLST. The focal part of the implementation is the predicate parser. The procedure that starts the recursive descent parsing is called %SBVP and takes six arguments: the first three are identical to the calling procedure %SV's, the fourth is ?TAG, the fifth is the extent just generated, and the sixth, ?RESULTLST, is the list of objects from the provided extent that satisfy the predicate. (The procedure is explained in detail lower down the text.) Once parsing and evaluation are complete, ?RESULTLST is passed to %SVDATALOID to create the instance-of assertions and then call procedure %SVDATA to move the relative object values from ?C1 instance-of objects to ISV(…). Both procedures %SVDATALOID and %SVDATA follow the same implementation patterns as the other operators. The operator ends by checking for value duplicates and purges any present.

algebra[%sv(?C1,?Deep,?Filter)] :-
    newoid{?Tag},
    svf[ algqueryinstance(?C1,?Deep,?Filter,?Tag) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    svf[ algquerydt(?C1,?Deep,?Tag) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    if (?Deep="extent")
    then (?Extlst=collectset{?Ileft|?Ileft:?C1,
                             if (?Sc::?C1, ?Ileft:?Sc)
                             then (false)
                             else (true)})
    else (?Extlst=collectset{?Ileft|?Ileft:?C1}),
    %sbvp(?C1,?Deep,?Filter,?Tag,?Extlst,?Resultlst),
    %svdataloid(?C1,?Deep,?Tag,?Resultlst),
    writeln('data loid and selection done')@_prolog,
    %svdata(?C1,?Deep,?Tag),
    writeln('data copying done')@_prolog,
    algebra[%duplicates(sv(?C1,?Deep,?Tag),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst,sv(?C1,?Deep,?Tag))],
    writeln('duplicate deletion done')@_prolog.

The parser and evaluator form the most complex set of procedures in our algebra: they entail parsing, data-type checking, evaluation, and the collation of results. To keep them as simple as possible, the parser and evaluator were designed in a recursive descent parsing style that unfolds the grammar with successive recursive procedures. There is a procedure for each production rule, and the evaluation actions are interspersed with and within the production rules. The grammar implemented is given in the following listing (based on basic BNF but truncated in depth). The predicate, of which some examples have been given, is a tree structure and Prolog's functors are used. The grammar shows that a predicate is built from a number of conjuncts woven together with conjunction and disjunction. Every conjunct, conjunction and disjunction can be negated too. In turn, conjuncts are broken down into comparison, ISA and instance-of, and “in” predicates. The keywords in capitals and within inverted commas are functor symbols.

search_cond ::== boolean_term | boolean_term 'OR' search_cond

boolean_term ::== boolean_factor | boolean_factor 'AND' boolean_term

boolean_factor ::== 'NOT' boolean_primary | boolean_primary

boolean_primary ::== predicate | '(' search_cond ')'

predicate ::== isa_predicate | exists_predicate | in_predicate | comp_predicate

exists_predicate ::== 'EXISTS' '(' simple_subquery ')'

in_predicate ::== simple_var 'IN' '(' simple_subquery ')' | simple_var 'NOT IN' '(' simple_subquery ')' | simple_var 'IN' '(' atom_list ')' | simple_var 'NOT IN' '(' atom_list ')'

atom_list ::== atom | atom ',' atom_list

atom ::== value | identifier


comp_predicate ::== single_var comparison single_var | multi_var comparison multi_var | single_var comparison simple_subquery | multi_var comparison simple_subquery

comparison ::== '=' | '!=' | '>' | '>=' | '<' | '<='

simple_subquery ::== 'SELECT' select_vars 'FROM' expressions

single_var ::== identifier | path_expression rooted at class | atom

multi_var ::== path_expression rooted at class | atom_list
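To make the correspondence between the grammar and the functor encoding concrete, the following hedged sketch (our own composition, reusing only functors shown elsewhere in this section) encodes the predicate “not (fname = "mary" or self = mary)”:

// search_cond -> boolean_factor ('NOT') -> boolean_primary -> '(' search_cond ')'
not( or( eqop(ipe,fname,a(string),"mary"^^_string),
         eqop(loid,mary,loid,self) ) )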

The parsing and evaluation routine, %SBVP, takes six arguments, and all except the last are unified by the calling routine. The evaluation parses the filter and, for each of its functors, there is a production rule that matches and executes the relative evaluation to filter the instances in argument ?EXTLST; instances that satisfy the predicate are appended to list ?RESULTLST.

The topmost parsing rule is the ‘search condition’, and a procedure named %SBVP_SEARCH_COND implements it with two rules: a Boolean term, or a Boolean term in disjunction with a search condition (a recursive call). In the latter case the predicate root must have an ‘or’ functor, i.e. OR(TERM1,TERM2), with two terms within its brackets. Note the calls to %SBVP_BOOLEAN_TERM and %SBVP_SEARCH_COND with the respective terms and the appending of their results. Prior to succeeding, the appended list is passed through duplicate elimination – the procedure %LOID_UNIQ eliminates by logical identifier.

%sbvp(?C1,?Deep,?Term,?Tag,?Extlst,?Resultlst) :-
    write('parsing: ')@_prolog, writeln(?Term)@_prolog,
    %sbvp_search_cond(?C1,?Term,?Tag,?Extlst,?Resultlst).

%sbvp_search_cond(?C1,?Term,?Tag,?Extlst,?Resultlst) :-
    %sbvp_boolean_term(?C1,?Term,?Tag,?Extlst,?Resultlst).
%sbvp_search_cond(?C1,?Term,?Tag,?Extlst,?Resultlst) :-
    ?Term=or(?Interm1,?Interm2),
    %sbvp_boolean_term(?C1,?Interm1,?Tag,?Extlst,?Resultlst1),
    %sbvp_search_cond(?C1,?Interm2,?Tag,?Extlst,?Resultlst2),
    ?Resultlst1[_append(?Resultlst2)->?New]@_basetype,
    %loid_uniq(?New,[],?Resultlst).

The procedure %SBVP_BOOLEAN_TERM (on each of its calls a new ?EXTLST is unified) has two rules: a Boolean factor, or a conjunction of a Boolean factor and a Boolean term. A conjunction is identified by the AND(…) functor. In the %SBVP_BOOLEAN_TERM rule that caters for the AND(TERM1,TERM2) functor, the procedures %SBVP_BOOLEAN_FACTOR and %SBVP_BOOLEAN_TERM (a recursive call) are called and their respective results are conjoined by logical identifier. The %LOID_AND procedure creates list ?NEW, which is then passed for duplicate elimination by logical identifier.

%sbvp_boolean_term(?C1,?Term,?Tag,?Extlst,?Resultlst) :-
    %sbvp_boolean_factor(?C1,?Term,?Tag,?Extlst,?Resultlst).
%sbvp_boolean_term(?C1,?Term,?Tag,?Extlst,?Resultlst) :-
    ?Term=and(?Interm1,?Interm2),
    %sbvp_boolean_factor(?C1,?Interm1,?Tag,?Extlst,?Resultlst1),
    %sbvp_boolean_term(?C1,?Interm2,?Tag,?Extlst,?Resultlst2),
    %loid_and(?Resultlst1,?Resultlst2,[],?New),
    %loid_uniq(?New,[],?Resultlst).

Negation is structured as a NOT(…) functor. If the predicate, or a part of it, unifies with it, then %SBVP_BOOLEAN_PRIMARY is called and its result is subtracted from ?EXTLST. The logical identifier based %LOID_DIFF procedure takes care of this.

%sbvp_boolean_factor(?C1,?Term,?Tag,?Extlst,?Resultlst) :-
    ?Term=not(?Interm),
    %sbvp_boolean_primary(?C1,?Interm,?Tag,?Extlst,?NotResultlst),
    %loid_diff(?Extlst,?NotResultlst,[],?Resultlst).
%sbvp_boolean_factor(?C1,?Term,?Tag,?Extlst,?Resultlst) :-
    %sbvp_boolean_primary(?C1,?Term,?Tag,?Extlst,?Resultlst).

A fundamental building block of a selection filter is the comparison operator. It comes in two generic productions: the first is when a single value is compared with a single value (subject to comparison data-type conformance); the second is when a value is compared with a set of values (again subject to data-type conformance). The equality and not-equal operators are applicable to all basic data domains and to logical identifiers, and in our predicates they are denoted by the functors EQOP(…) and NEQOP(…) respectively. The following example shows a conjunction of two comparisons of single values. The procedure that takes care of the comparison operators is called %SBVP_COMP_PREDICATE and each of its many rules unifies with a different version of the term.

and( eqop(ipe,fname,a(string),"mary"^^_string),
     eqop(loid,mary,loid,self) )

The right rule is chosen according to the first line of the procedure, where the term ?TERM is unified with, for example, the EQOP(…) functor; if ?COMP is an acceptable functor, checked with procedure %SBVP_COMPARISON(…), then %SBVP_SINGLE_VAR is called for the left and right hand sides of the comparison. Procedure %SBVP_SINGLE_VAR is intended to return the upper cardinality and the data type of the left and right hand side expressions. The keyword IPE in functors indicates a path expression that is built by appending the expression in ?ARG2 to the CLASS instance ?C1. For the first EQOP functor example above it works out as: STUDENT.FNAME. The procedure returns upper cardinality ‘1’ and the string data type for this case. %SBVP_SINGLE_VAR handles the various types of methods (the usual four types), an ‘atom’ that is an element of a basic data type, and logical identifiers. The next procedure, %SBVP_DD_COMPARISON, evaluates whether the comparison operator between the left and right data types is executable. For example, equality and not-equal comparisons work for all domains and identifiers; greater-than-or-equal, on the other hand, does not work when comparing logical identifiers. The next sequence in the procedure checks the data typing of the left and right hand sides of the comparison: that the cardinality is singular and that the data types conform. If all is well in parsing and data typing, the last procedure call, to %SBVP_COMP_PREDICATE_EVAL_SINGLE, is made so as to evaluate the comparison on every instance. The actual comparison at instance level is done in procedure %SBVP_COMP_PREDICATE_EVAL_SINGLE_CASE.

%sbvp_comp_predicate(?C1,?Term,?Tag,?Extlst,?Resultlst) :-
    ?Term=?Comp(?Arg1,?Arg2,?Arg3,?Arg4),
    %sbvp_comparison(?Comp),
    %sbvp_single_var(?C1,?Arg1,?Arg2,?U1,?Dt1),
    %sbvp_single_var(?C1,?Arg3,?Arg4,?U2,?Dt2),
    %sbvp_dd_comparison(?Comp,?Dt1,?CompFlag),
    write('parsed term (%sbvp_comp_predicate / single_var) : ')@_prolog,
    writeln(?Term)@_prolog,
    write('data typing: ')@_prolog, writeln(?Term)@_prolog,
    if ( ?U1=1, ?U2=1, ?Dt1=?Dt2, ?CompFlag='OK' )
    then ( writeln('data type ok.')@_prolog )
    else ( writeln('data type error!')@_prolog,
           writeln('- ?U1 ?U2 ?Dt1 ?Dt2 ?CompFlag ')@_prolog,
           writeln(?U1)@_prolog, writeln(?U2)@_prolog,
           writeln(?Dt1)@_prolog, writeln(?Dt2)@_prolog,
           writeln(?CompFlag)@_prolog,
           false ),
    writeln('filter evaluation')@_prolog,
    %sbvp_comp_predicate_eval_single(?C1,?Term,?Tag,?Extlst,[],?Resultlst),
    write('evaluation results (from) to: ')@_prolog,
    writeln(?Extlst)@_prolog, writeln(?Resultlst)@_prolog.

%sbvp_comparison(eqop) :- true. …

%sbvp_single_var(?C1,?Key,?Arg,?U,?Dt) :-
    not compound(?Arg)@_prolog, ?Key=ipe,
    ?C1[?Arg{?_:?U}*=>?Dt1], dt(?Dt1,?Dt).
%sbvp_single_var(?C1,?Key,?Arg,?U,?Dt) :-
    compound(?Arg)@_prolog, ?Arg=?M(?P), ?Key=ipe,
    ?C1[?M(?Pt1){?_:?U}*=>?Dt1],
    dt(?Pt1,?Pt), ?P:?Pt, dt(?Dt1,?Dt).
%sbvp_single_var(?C1,?Key,?I,?U,?Dt) :-
    ?Key=loid, ?I=self, ?U=1, ?Dt=?C1.

%sbvp_dd_comparison(eqop, ?_,'OK') :- true,!.
%sbvp_dd_comparison(neqop,?_,'OK') :- true,!.
%sbvp_dd_comparison(?Comp,?Dt1,?Flag) :-
    if (?Comp=gtop;?Comp=gteop;?Comp=ltop;?Comp=lteop)
    then ( if ( ?Dt1=_integer;?Dt1=_decimal;?Dt1=_string )
           then ( ?Flag='OK', true )
           else ( ?Flag='ERR', false ) )
    else ( ?Flag='ERR', false, ! ).

%sbvp_comp_predicate_eval_single (?C1,?Term,?Tag,[?I|?Extlst],?SResultlst,?FResultlst) :- … .

%sbvp_comp_predicate_eval_single_case(?C1,eqop,?Arg1,?Arg2,?Arg3,?Arg4,?I) :-
    %sbvp_cpesc_v(?Arg1,?Arg2,?I,?Left),
    %sbvp_cpesc_v(?Arg3,?Arg4,?I,?Right),
    if ( ?Left=?Right ) then ( true ) else ( false ).

The above example deals with simple comparison; but the parser also deals with the parsing and evaluation of predicates that contain singleton-to-set comparisons. For example, the implementation of universal and existential quantification over set-valued properties follows:

// reification rules for equality, and greater and less than comparisons

equery [test -> ${ ( totest(?Head,?Value) :- ?Head = ?Value ) }].
gtquery[test -> ${ ( totest(?Head,?Value) :- ?Head @>= ?Value ) }].
ltquery[test -> ${ ( totest(?Head,?Value) :- ?Head @=< ?Value ) }].
…

// universal quantification and existential quantification routines

forall([], ?_Query, ?_Value) :- true.
forall([?Head|?List], ?Query, ?Value) :-
    if ( ?Query[test -> ?Rules], insertrule{?Rules}, totest(?Head,?Value) )
    then ( forall(?List,?Query,?Value) )
    else ( false ).

forsome([], ?_Query, ?_Value) :- false.
forsome([?Head|?List], ?Query, ?Value) :-
    if ( ?Query[test -> ?Rule], insertrule{?Rule}, totest(?Head,?Value) )
    then ( true )
    else ( forsome(?List,?Query,?Value) ).

The first sample of reification above, attached to instances such as EQUERY, deals with simple comparison based on identifier equality or basic domain comparison. Universal and existential quantification are catered for by rules whose heads are denoted by FORALL and FORSOME respectively. The following script shows a high-level use of these generalised quantification operations. The first query's purpose is to display the sample data, for each STUDENT instance, of their GRADES – note that the conversion of a set into a list does not change the generality of the solution. The second query fires for STUDENT instances whose collated GRADES satisfy a universal quantification requiring that all of the GRADES are the string ‘C’. The third query, similar to the second, asks for STUDENT instances whose GRADES are all greater than or equal to the string ‘C’. (Note: string ‘D’ is greater than string ‘C’.) The fourth query sifts for STUDENT instances that have at least one GRADE ‘C’.

?- ?S:student, ?Sglst=collectset{?W|?S[result(?W)->?G]}.
?S = mary ?Sglst = ["a"^^_string, "b"^^_string, "c"^^_string]
?S = michael ?Sglst = ["c"^^_string]
?S = nancy ?Sglst = ["b"^^_string, "d"^^_string]
?S = patricia ?Sglst = ["a"^^_string, "c"^^_string, "f"^^_string]
?S = paul ?Sglst = ["b"^^_string]
?S = richard ?Sglst = ["d"^^_string, "f"^^_string]
?S = robert ?Sglst = ["c"^^_string]
?S = susan ?Sglst = ["a"^^_string]
Yes.

?- ?S:student, ?Sglst=collectset{?W|?S[result(?W)->?G]},
   forall(?Sglst,equery,"c"^^_string).
?S = michael ?Sglst = ["c"^^_string]
?S = robert ?Sglst = ["c"^^_string]
Yes.

?- ?S:student, ?Sglst=collectset{?W|?S[result(?W)->?G]},
   forall(?Sglst,gtquery,"c"^^_string).
?S = michael ?Sglst = ["c"^^_string]
?S = richard ?Sglst = ["d"^^_string, "f"^^_string]
?S = robert ?Sglst = ["c"^^_string]
Yes.

?- ?S:student, ?Sglst=collectset{?W|?S[result(?W)->?G]},
   forsome(?Sglst,equery,"c"^^_string).
?S = mary ?Sglst = ["a"^^_string, "b"^^_string, "c"^^_string]
?S = michael ?Sglst = ["c"^^_string]
?S = patricia ?Sglst = ["a"^^_string, "c"^^_string, "f"^^_string]
?S = robert ?Sglst = ["c"^^_string]
Yes.

The parsing and evaluation coding of predicates in a select by value statement is the longest and most demanding – demanding in terms of coding innovation (albeit this is a qualitative measure). Nonetheless, even if it is kept somewhat limited in range when compared to ODMG OQL predicates, the coding and implementation through Flora-2 gives no indication that it cannot suffice.

Properties of the Select operator

As with most of the operators, select by value is lossy. Every relational operator has a number of algebraic properties: for example, select by value is commutative and idempotent.

σp1( σp2( class ) ) ⇔ σp2( σp1( class ) )

Also a sequence of select predicates is equal to their conjunction.

σp1∧p2( class ) ⇔ σp1( σp2( class ) )
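A hedged illustration in terms of the %SV operator (our own composition; the nested invocation assumes, as in the equi-join example above, that a query instance can itself be used as a range):

// one select whose predicate is the conjunction ...
?- algebra[%sv(student,"deepextent",
       and( eqop(ipe,fname,a(string),"mary"^^_string),
            eqop(loid,mary,loid,self) ) )].
// ... is value-equivalent to a cascade of two selects
?- algebra[%sv(student,"deepextent", eqop(ipe,fname,a(string),"mary"^^_string))],
   ?Q:algquery, ?Q=sv(student,"deepextent",?_Tag),
   algebra[%sv(?Q,"extent", eqop(loid,mary,loid,self))].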


We can easily follow the pattern used earlier for commutativity by introducing appropriate reified rules, but at this point it is best to note the differences. Firstly, the commutativity previously encountered was between the arguments of binary operators; in this case it is commutativity among the conjunctive parts of the predicate. Secondly, it is a tenable assumption that the conjuncts in a predicate number more than a few (i.e. the more numerous the properties in the query extent, the more conjuncts to expect). If, for the sake of simplicity, a selection predicate combines its predicates only in conjunctions, then the selection predicate is much like a binary tree with AND() in the root and inner nodes. The number of permutations of such a binary tree, through conjunction commutativity and the cascading of select operations, grows very aggressively with the number of predicates. These counts are within the family of numbers called Catalan numbers (the formula is sketched after the product list below): for example, seven predicates have 429 permutations and thirteen predicates have 742900 permutations. An associahedron for four predicates is given in figure 12.1 below. In view of this it is best to leave the re-ordering of predicates to a later part of query optimisation rather than the initial structural equivalence part. Re-ordering in the later parts of query optimisation is common in DBMSs with algebraic-like access plans:

Oracle { WWW.ORACLE.COM/US/PRODUCTS/DATABASE/OVERVIEW/INDEX.HTML };
DB2 { WWW-01.IBM.COM/SOFTWARE/DATA/DB2 };
MS SQL Server { WWW.MICROSOFT.COM/EN-US/SQLSERVER/PRODUCT-INFO.ASPX }; and
PostgreSQL { WWW.POSTGRESQL.ORG }.
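For reference, the counts quoted above follow the Catalan numbers; a sketch of the formula in LaTeX:

C_n = \frac{1}{n+1}\binom{2n}{n}, \qquad C_7 = 429, \qquad C_{13} = 742900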

Select by value has a number of equivalence relationships with other operators too.


Figure 12.1: The different permutations of 4 predicates – diagram is called an associahedron and is credited to David Eppstein { WWW.ICS.UCI.EDU/~EPPSTEIN }.

11.3 Summary

In this chapter we provided details of our query model, some of its algebraic operators, and how these are integrated with and aided by our framework. In the next chapter we continue with the other five operators; consequently we defer a summary of the operators to the next chapter (section 12.3).

Chapter 12

Object-Oriented Query Model (II)

12 Object-Oriented Query Model (II)

This chapter continues from the previous one in describing the algebraic operators that form part of our query model. The operators in this chapter are not really relational in origin, although such operators have been advocated for the relational model; in fact their origin lies more in the functional and nested relational models.

12.1 Object Algebra and an Object Framework (II)

The five operators described here are represented, and have been implemented, in a similar manner to that used in the previous chapter. The interaction with the underlying framework is similar too.

12.1.1 Map and Aggregate

The operators introduced so far do not offer any possibility of computing functions on the objects in the collections. Neither is it possible to restructure the query instance's data-type signature beyond the input ranges. The unary operators map and aggregate address this deficiency. The map operator applies a function to each object's property value and returns another value; for example, a map function on CLASS STUDENT converts each FNAME property value into its upper case. The aggregate operator, on the other hand, collects the object instances that share a common value for a property and applies a function collectively to this partition; for example, an aggregate function on CLASS STUDENT counts, for each distinct STAGE property value, the instances in that STAGE partition. In our implementation projection is included in the operator's implementation.

12.1.1.1 Map

The unary map algebraic operator applies a function, passed in as an argument, to an indicated object property of every instance-of a class. The data type of the function's domain must be compatible with the data type of the property it is being applied to. Path expression traversal is considered a function. There are two issues specific to our data and query model: the first is whether the extent or deep extent of the arguments is required; the second is that this operator is value based and produces a set (i.e. it requires duplicate checking and elimination).

In the first two queries of the following script we map a string into its reverse and then leave the output the same as the input: the mapping functions are called REV and IDENTITY respectively and are defined through rules in the framework. The third query shows the data-type signature of properties FNAME and TELNO in CLASS STUDENT. The fourth query shows the application of the algebraic map operator on two properties of STUDENT's deep extent, FNAME and TELNO, with the REV and IDENTITY mappings. The data-type signature of the query instance shows two attributes named IDENTITYTELNO and REVFNAME; clearly the respective mapping functions are not in data-type error. The last query shows the data output of the query instance.

?- ?Msg="Hello world"^^_string, rev(?Msg,?Rmsg).
?Msg = "Hello world"^^_string ?Rmsg = "dlrow olleH"^^_string
Yes.

?- ?Msg="Hello world"^^_string, identity(?Msg,?Imsg).
?Msg = "Hello world"^^_string ?Imsg = "Hello world"^^_string
Yes.

?- student[fname*=>?Dt1;telno*=>?Dt1].
?Dt1 = integer
?Dt1 = string
Yes.

?- algebra[%wv(student,"deepextent",[fname,telno],[rev,identity])].
method not found list: []
Methods exist: ok
map not found list: []
Maps exist: ok
Method dtype & Maps dtype match: OK
algquery instance created
signatures created
data loid and move done
duplicate deletion done
Yes.

?- ?Q:algquery,?Q[?M*=>?Dt].
?Q = wv(student,"deepextent",_#'19684) ?M = identitytelno ?Dt = integer
?Q = wv(student,"deepextent",_#'19684) ?M = revfname ?Dt = string
Yes.

?- ?_Q:algquery,?I:?_Q[?M->?V].
?I = iwv(…,mary) ?M = identitytelno ?V = "1234"^^_integer
?I = iwv(…,mary) ?M = identitytelno ?V = "5678"^^_integer
?I = iwv(…,mary) ?M = revfname ?V = "yram"^^_string
?I = iwv(…,michael) ?M = identitytelno ?V = "3456"^^_integer
…
Yes.

In a later part of this section another example of using the map operator to traverse path expressions is given.

For the map operator’s implementation two reified rules are used. These are attached to an object identified with WVF.

The property ALGQUERYINSTANCE takes care of having a parameterised version and thus asserts that a WV(…) object is an instance-of ALGQUERY. The method takes five arguments: the first two are the collection on which the mapping is to be executed and an indicator stating whether a deep extent or an extent is required for the result. The third and fourth contain two equi-numerous lists – one with the property names to map and the other with the functions to apply. The fifth argument is a unique identifier. Note that the instantiated object WV(…) has two properties, called PROPERTY and MAP, that unify with the list of properties and the functions applied.

wvf[ algqueryinstance(?C1,?Deep,?Usmthlst,?Usmaplst,?Tag) ->
     ${ ( wv(?C1,?Deep,?Tag):algquery
            [property->?Usmthlst, map->?Usmaplst] :- true ) } ].

Before we describe the second reification definition it is important to describe the necessary details appertaining to mapping functions. We need to encode their name, their being an instance-of MAP, their data-type signature, and the implementation of the mapping function. An object MAP is defined in the framework and has three data-type signatures: the first records the mapping's SOURCE data type, the second the mapping's TARGET data type, and the third the mapping's FUNCTION name. A mapping function is defined as an instance of object MAP. The mapping functions used previously are in fact instances-of MAP; for example, object REV has the same function name. The implementation of REV is found in predicate MAPF with REV as its first argument.

map[ source*=>object, target*=>object, function*=>string ].

rev:map     [ source->string,  target->string,  function->rev ].
identity:map[ source->_object, target->_object, function->identity ].
incone:map  [ source->integer, target->integer, function->incone ].

mapf(rev, ?S, ?R)      :- ?S[_reverse->?R]@_basetype.
mapf(identity, ?V, ?R) :- ?V=?R.
mapf(incone, ?V, ?R)   :- ?R is ?V+1.
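Following the same pattern, a hedged sketch of the upper-case mapping mentioned at the start of this section; the instance name UPPER and the basetype method _TOUPPER are our assumptions – substitute whichever string primitive the engine provides:

// hypothetical MAP instance and implementation for upper-casing strings
upper:map [ source->string, target->string, function->upper ].
mapf(upper, ?S, ?R) :- ?S[_toUpper->?R]@_basetype.   // assumes a _toUpper basetype method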

The second reification concerns the building of the rules that create the data-type signature of the result. The data-type signature of the query result is mainly determined by the target data types of the involved mapping functions. Also required is the creation of a name for each new property: in this case it was elected that the MAP instance name is made a prefix of the original property name. So, for example, applying REV to FNAME produces the name REVFNAME. The data-type reification starts by sifting for the relative ALGQUERY instance-of, i.e. WV(…). For each property in the method list bound to ?USMTHLST we search for all non-compound inheritable methods and bind each to ?M. For this method we find its corresponding mapping function, i.e. ?MAPM, in the list bound to variable ?USMAPLST. Details of the mapping are found in the relative instance-of MAP; i.e. the TARGET property holds the data type of the mapping result. The last action is the creation of a new name for the mapped property. At this point the data-type signature derived from ?M can be asserted. There are, as usual, four rules based on the property's mode (e.g. inheritable or not) and arity.

wvf[ algquerydt(?C1,?Deep,?Usmthlst,?Usmaplst,?Tag) ->
     ${ ( wv(?C1,?Deep,?Tag)[?Nm{?B:?T}*=>?Md] :-
            wv(?C1,?Deep,?Tag):algquery,
            %lmember(?M,?Usmthlst),
            ?C1[?M{?B:?T}*=>?D],
            not compound(?M)@_prolog,
            %lookup121(?Usmthlst,?Usmaplst,?M,?Mapm),
            ?Mapm:map[target->?Md2],
            if (?Md2=_object) then (?Md=?D) else (?Md=?Md2),
            addprefix(?Mapm,?M,?Nm) ),
        … } ].

The actual evaluation of mapping by value is done through procedure %WV in the object ALGEBRA. The procedure %WV takes four arguments: the first two are the class instance and an indicator of the extent required (i.e. unified to variable ?DEEP). The last two arguments are equi-numerous lists of method names and mapping functions. The first part of the implementation deals with pre-checks: for example, that both the method and the mapping lists have correct content – i.e. the properties and map instances exist. A significant check, in procedure %WVDATACHECKMAPLST, ensures that the data type of a method is compatible with the source data type of the mapping function being applied over it. The procedure is recursive and uses the data-type signatures from the CLASS extent of the query and the relative MAP instance-of. The actual processing starts by creating a unique identifier, unified with ?TAG, and the reifications of the query instance and the query's data-type signature. The next procedure call, to %WVDATALOID, is very similar to the previous operators' implementations. The implementation of %WVDATA, on the other hand, is more involved but follows the sequence of logical steps that create the map query's result. Once the instances are created, the query instance collection is checked for duplicates and, if need be, duplicate copies are purged.

algebra[%wv(?C1,?Deep,?Usmthlst,?Usmaplst)] :-
    algebra[%methodsexist(?C1,?Usmthlst,?Nomthlst)],
    if ( ?Nomthlst = [] ) … ,
    algebra[%mapsexist(?Usmaplst,?Nomaplst)],
    if ( ?Nomaplst = [] ) … ,
    if ( not %wvdatacheckmaplst(?C1,?Usmthlst,?Usmaplst,?_Comment) )
    then (writeln('Method dtype & Maps dtype match: Error!')@_prolog, fail)
    else (writeln('Method dtype & Maps dtype match: OK')@_prolog),
    newoid{?Tag},
    wvf[ algqueryinstance(?C1,?Deep,?Usmthlst,?Usmaplst,?Tag) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    wvf[ algquerydt(?C1,?Deep,?Usmthlst,?Usmaplst,?Tag) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    %wvdataloid(?C1,?Deep,?Tag),
    %wvdata(?C1,?Deep,?Usmthlst,?Usmaplst,?Tag),
    writeln('data loid and move done')@_prolog,
    algebra[%duplicates(wv(?C1,?Deep,?Tag),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst,wv(?C1,?Deep,?Tag))],
    writeln('duplicate deletion done')@_prolog.

The procedure %WVDATA, called from the main procedure %WV, moves data from the query instance source, i.e. bound to ?C1, to the instances-of WV(…). For each property in the method list bound to ?USMTHLST we search for all non-compound inheritable methods and bind them to ?M. For this method we find its corresponding mapping function, i.e. ?MAPM, in the list bound to variable ?USMAPLST, through a procedure call to %LOOKUP121. The next action is the creation of a new name for the mapped property – i.e. bound to ?NM. The final part is the application of the mapping function to the original values, where the original identifier, bound to ?I, is part of the query instance-of identifier. The actual mapping is done through procedure MAPF, which, given the mapping function and an original value, returns the mapping – i.e. bound to ?V. This data is passed to the usual procedure to insert the assertion. This procedure has to be repeated for the other three types of methods, e.g. inheritable or non-inheritable.

%wvdata(?C1,?Deep,?Usmthlst,?Usmaplst,?Tag) :-
    ?LeftIsmlst=
        collectset{ ?Wl |
            wv(?C1,?Deep,?Tag):algquery,
            %lmember(?M,?Usmthlst),
            ?C1[?M{?B:?T}*=>?_D],
            not compound(?M)@_prolog,
            %lookup121(?Usmthlst,?Usmaplst,?M,?Mapm),
            ?Mapm:map,
            addprefix(?Mapm,?M,?Nm),
            iwv(?C1,?Deep,?Tag,?I)[],
            ?I:?C1[?M->?R],
            mapf(?Mapm,?R,?V),
            ?Wl=f(?C1,?Deep,?Tag,?I,?Nm,?V) },
    %wvinsertsmethod(?LeftIsmlst),
    … .

%wvinsertsmethod([?H|?Rlst]) :-
    ?H=f(?C1,?Deep,?Tag,?I,?M,?R),
    insert{ iwv(?C1,?Deep,?Tag,?I)[?M->?R] },
    %wvinsertsmethod(?Rlst).
%wvinsertsmethod([]).

Map and path expressions

The map operator is useful for implementing expressions involving path expressions too. For example, if a range contains a property whose value is a logical identifier, it is advantageous to be able to traverse to it rather than join the two ranges. In the following example we are interested in retrieving a student's course name (i.e. CNAME) through the ENROLON property. The first query works out the data-type signature of the path expression (i.e. STUDENT.ENROLON.CNAME). The second query shows the results of applying the same path expression to STUDENT's instances.


The third query shows how the same result as the second query can be achieved through a user-defined map function, e.g. called STUDCNAME.

?- student[enrolon *=> ?Dt1[cname *=> ?Fdt]].
?Dt1 = course ?Fdt = string
Yes.

?- ?I:student, ?I.enrolon.cname=?R.
?I = mary ?R = "engineering"^^_string
…
Yes.

?- ?I:student, mapf(studcname,?I.enrolon,?R).
?I = mary ?R = "engineering"^^_string
…
Yes.

Consequently if we want to output the student’s name and the name of the course she is following then we map FNAME and ENROLON with IDENTITY and STUDCNAME for STUDENT’S instances of interest.

?- algebra[%wv(student,"deepextent",[fname,enrolon],[identity,studcname])].

The map function STUDCNAME definition follows:

studcname:map[ source->integer, target->string, function->studcname ].

mapf(studcname, ?V, ?R) :- ?V:course, ?V[cname->?R].

In our framework there are data-typing routines to verify a path expression's existence in our schema and to work out its full data-type characteristics. These should be used in the definition of any mapping function, over and above the data-type checks built into the data-type construction of the mapping operator. Also, it is expected that when asserting an instance-of MAP its SOURCE and TARGET properties are worked out.

Properties of the Map operator

The main property of mapping is functional composition: if g and f are mappings then their composition is equal to their sequential application (class denotes a range – e.g. a CLASS instance).

(g ∘ f)(class) ⇔ g(f(class))

In general composition is associative but not commutative.
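As a hedged sketch of composition in the MAP encoding (our own composition, reusing REV and the hypothetical UPPER from above):

// hypothetical composed mapping: upper-case first, then reverse
revupper:map[ source->string, target->string, function->revupper ].
mapf(revupper, ?S, ?R) :- mapf(upper, ?S, ?T), mapf(rev, ?T, ?R).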

12.1.1.2 Aggregate

The unary aggregate operator first partitions the extent on an indicated property's distinct values and then applies an aggregate function to a specified property of every partition. The aggregate functions supported are sum, minimum, maximum, and count; all except count expect the function to be applied to a numeric property. The output data type of all aggregate functions is numeric. There is one issue specific to our data and query model: i.e. whether the extent or deep extent of the arguments is required.

The following examples are based on a class called TRAN (for transaction) whose properties' data-type signatures are listed in the first example query. Properties QTY, SALE and COST are used as input to aggregate functions, and properties MONTH, YEAR and UNITCHG are used to partition the extent. (Note this is an arbitrary selection for the example queries.)

The second example query below shows a simple aggregate operation over the deep extent of class TRAN. The partitioning is requested on a single property, i.e. YEAR. That is, the query deep extent is ‘converted’ into partitions based on the unique values of the YEAR property. For example, if TRAN's YEAR property projection yields three distinct year values then a partition is built for each of these. On each of these partitions the operator applies the aggregate function maximum, i.e. MAX, to all of the partition's SALE property values. The interesting bit is the data-type signature of the query instance: the third query shows that it has two data-type signatures. There is a data-type signature for the partitioning attribute, in this case YEAR, and another, called MAXSALE, that holds the result of applying the MAX aggregate function to property SALE. The last query shows the query instance extent on test data.

?- tran[?M*=>?Dt].
?M = cost ?Dt = integer
?M = month ?Dt = integer
?M = qty ?Dt = integer
?M = sale ?Dt = integer
?M = unitchg ?Dt = unit
?M = year ?Dt = integer
Yes.

?- algebra[%gv(tran,"deepextent",[year],[sale],[max])].
Aggregate arguments match: ok
method not found list: []
Methods exist: ok
method not found list: []
Agg. Methods exist: ok
aggregate not found list: []
Aggregates exist: ok
algquery instance created
signatures created
aggregate loid and data ok.
Yes.

?- ?Q:algquery[?M*=>?Dt].
?Q = gv(tran,"deepextent",…) ?M = maxsale ?Dt = double
?Q = gv(tran,"deepextent",…) ?M = year ?Dt = integer

?- ?_Q:algquery,?I:?_Q[year->?Y,maxsale->?M].
?I=igv(…) ?Y = 2010 ?M = 400
?I=igv(…) ?Y = 2011 ?M = 550
?I=igv(…) ?Y = 2012 ?M = 550
Yes.


The following example is a more ambitious use of the aggregate operator. The partitioning request is now a combination of three properties, i.e. the projection of YEAR, MONTH, and UNITCHG, and two aggregates are requested on each partition, namely SUM on each of the properties COST and SALE. The query for the data-type signature of the query instance shows that it has five signatures: three for the partitioning properties and one for each aggregate function. The last query shows the values assigned to property SUMCOST; e.g. the partition for 2010, JANUARY (i.e. 1), and IDB has a summed cost of 10.

?- algebra[%gv(tran,"deepextent",[year,month,unitchg],[cost,sale],[sum,sum])].
Aggregate arguments match: ok
method not found list: []
Methods exist: ok
method not found list: []
Agg. Methods exist: ok
aggregate not found list: []
Aggregates exist: ok
algquery instance created
signatures created
signatures created
aggregate loid and data ok.

?- ?Q:algquery,?I:?Q.
?Q = gv(tran,"deepextent",…) ?I = igv(tran,"deepextent",…,2010,1,idb)
?Q = gv(tran,"deepextent",…) ?I = igv(tran,"deepextent",…,2010,2,pdb)
…

?- ?Q:algquery,?Q[?M*=>?Dt].
?Q = gv(tran,"deepextent",…) ?M = month ?Dt = integer
?Q = gv(tran,"deepextent",…) ?M = sumcost ?Dt = double
?Q = gv(tran,"deepextent",…) ?M = sumsale ?Dt = double
?Q = gv(tran,"deepextent",…) ?M = unitchg ?Dt = unit
?Q = gv(tran,"deepextent",…) ?M = year ?Dt = integer
Yes.

?- ?Q:algquery,?I:?Q[sumcost->?A].
?Q = gv(tran,"deepextent",…) ?I = igv(…,2010,1,idb) ?A = 10
?Q = gv(tran,"deepextent",…) ?I = igv(…,2010,2,pdb) ?A = 10
?Q = gv(tran,"deepextent",…) ?I = igv(…,2010,3,ddb) ?A = 20
?Q = gv(tran,"deepextent",…) ?I = igv(…,2010,4,dsa) ?A = 20
?Q = gv(tran,"deepextent",…) ?I = igv(…,2010,5,javaprog) ?A = 40
?Q = gv(tran,"deepextent",…) ?I = igv(…,2010,6,softeng) ?A = 40
…
Yes.

For the aggregate operator's implementation two reified rules are used. These are attached to an object identified with GVF.

The property ALGQUERYINSTANCE takes care of having a parameterised version and thus asserts that a GV(…) object is an instance-of ALGQUERY. The method takes six arguments: the first two are the collection on which the aggregation is to be executed and an indicator stating whether a deep extent or an extent is required for the input. The third parameter is a list of property names, of the CLASS instance unified in the first argument, on which partitioning is to be done. The fourth and fifth contain two equi-numerous lists – one with the property names on which an aggregate is to be executed, and the other with the aggregate functions to apply. The sixth argument is a unique identifier. Note that the instantiated object GV(…) has three properties that unify with the list of properties to partition with, the list of properties to apply the aggregates to, and the aggregate functions applied.

gvf[ algqueryinstance(?C1,?Deep,?Usmthlst,?Aggmthlst,?Aggfunclst,?Tag) ->
     ${ ( gv(?C1,?Deep,?Tag):algquery
            [groupby -> ?Usmthlst,
             aggoutput -> ?Aggmthlst,
             funcaggout -> ?Aggfunclst] :- true ) } ].

The second reification concerns the building of the rules that create the data-type signature of the result. The data-type signature of the query result is mainly determined by the partitioning properties and the target data types of the involved aggregate functions. Also required is the creation of a name for each new property: in this case it was elected that the aggregate function name is made a prefix of the original property name. The data-type reification starts by sifting for the relative ALGQUERY instance-of, i.e. GV(…). For each partitioning property in the method list bound to ?USMTHLST we search for all non-compound inheritable methods and bind each to ?M. The rule simply copies their data-type signatures. In the case of the aggregate properties we start as for the partitioning properties but then investigate the aggregating function to work out its data type and a new property name. For this method we find its corresponding aggregate function, i.e. ?MAGGF, in the list bound to variable ?AGGFUNCLST. The last action is the creation of a new name for the aggregated property. At this point the data-type signature derived from ?M can be asserted; note that its data type is DOUBLE as all the aggregate functions allowed return a DOUBLE. There are, as usual, four rules based on the property's mode (e.g. inheritable or not) in two pairs; i.e. for the partitioning and the aggregated methods.

gvf[ algquerydt(?C1,?Deep,?Usmthlst,?Aggmthlst,?Aggfunclst,?Tag) ->
     ${ // group by attributes
        ( gv(?C1,?Deep,?Tag)[?M{?B:?T}*=>?D] :-
            gv(?C1,?Deep,?Tag):algquery,
            %lmember(?M,?Usmthlst),
            ?C1[?M{?B:?T}*=>?D],
            not compound(?M)@_prolog ),
        …,
        // aggregate functions attributes (all numeric ie double)
        ( gv(?C1,?Deep,?Tag)[?Maggr{1:1}*=>double] :-
            gv(?C1,?Deep,?Tag):algquery,
            %lmember(?M,?Aggmthlst),
            ?C1[?M*=>?_D],
            not compound(?M)@_prolog,
            %lookup121(?Aggmthlst,?Aggfunclst,?M,?Maggf),
            addprefix(?Maggf,?M,?Maggr) ),
        … } ].

The actual evaluation of aggregation by value is done through procedure %GV in object ALGEBRA. The procedure %GV takes five arguments: the first two are the class instance and an indicator of the extent required (i.e. unified to variable ?DEEP). The third is a list of properties, from the class instance bound in the first argument, on which partitioning is to be done. The last two arguments are equi-numerous lists of method names and aggregating functions. The first part of the implementation deals with pre-checks: a) there needs to be at least one partitioning property; b) for every property to aggregate there needs to be one, and only one, aggregating function; c) the list of partitioning properties is made up of methods present in the class being aggregated and they are single valued – the ALGEBRA object has procedure %SINGLEMETHODSEXIST to check this; d) the properties being aggregated must likewise be present and singular (not sets); e) the functions listed in the aggregate function list are instances of the ones allowed. After the checks succeed a unique identifier is generated and the two reification rules are instantiated and inserted into the object base. These rules assert a query instance (i.e. GV(…):ALGQUERY[…]) and the data-type signature of the query instance. Since the output of the aggregate operator is not 1-1 with the extent instances, the instance-of assertion and the actual copying are done through a single procedure called %GVAGGLOIDDATA. The procedure takes the same arguments as the operator plus the unique identifier generated (i.e. bound to ?TAG).

algebra[%gv(?C1,?Deep,?Usmthlst,?Aggmthlst,?Aggfunclst)] :-
    ?Nigb=count{?Igb | %lmember(?Igb,?Usmthlst)},
    ?Niae=count{?Iae | %lmember(?Iae,?Aggmthlst)},
    ?Niaf=count{?Iaf | %lmember(?Iaf,?Aggfunclst)},
    if ( ( ?Nigb > 0 ), ( ?Niae=?Niaf ) ) … ,
    algebra[%singlemethodsexist(?C1,?Usmthlst,?Nousmthlst)],
    if ( ?Nousmthlst = [] ) … ,
    algebra[%singlemethodsexist(?C1,?Aggmthlst,?Noaggmthlst)],
    if ( ?Noaggmthlst = [] ) … ,
    algebra[%aggsexist(?Aggfunclst,?Noaggfunclst)],
    if ( ?Noaggfunclst = [] ) … ,
    newoid{?Tag},
    gvf[ algqueryinstance(?C1,?Deep,?Usmthlst,?Aggmthlst,?Aggfunclst,?Tag) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    gvf[ algquerydt(?C1,?Deep,?Usmthlst,?Aggmthlst,?Aggfunclst,?Tag) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    // generate aggregate loid and data from partitioning
    if ?Deep="deepextent" then (?Deepc='deepextent') else (?Deepc='extent'),
    %gvaggloiddata(?C1,?Deepc,?Usmthlst,?Aggmthlst,?Aggfunclst,?Tag),
    writeln('data loid and aggregate done')@_prolog.

Procedure %GVAGGLOIDDATA is tedious, if straightforward. For every element of the aggregating function list, the procedure needs to build a rule that implements the aggregation function over the partitioning properties provided. The procedure builds the syntax of the rule along the lines shown in the following template.


?Agg = aggfunction { property_to_agg [ partitioning_methods+ ] |
                     ?I:class_extent,
                     ?I[property_to_agg, partitioning_methods+] }

To help with the string construction the procedure uses three other procedures for concatenation (e.g. SC(..) with various arities, ML2SC, and MVL2SC). Rather than using reification to instantiate and update the object base, one has to pass the rule, in string format, to the underlying engine to parse and then evaluate. The reason is that coding the various combinations of list inputs with reification is not possible. Flora-2 has a special predicate in module parse that reads a string and, after parsing it, evaluates it in the object base, i.e. %READALL(…)@_PARSE. In procedure INSERTRULEBYPARSEANDEVAL(…) this %READALL(…) predicate is called.

%gvaggloiddata(?C1,?Deep,?Usmthlst,[?Hmth|?Aggmthlst],[?Hf|?Aggfunclst],?Tag) :-
    sc('"',?Deep,'"',?Deepe),
    ?T0='igv(',
    ?L1=[?C1,?Deepe,?Tag], l2sc(?L1,?T0,?T1), sc(?T1,',',?T2),
    ml2sc(?Usmthlst,?T2,?T3),
    ?T4='):gv(', sc(?T3,?T4,?T5),
    ?L2=[?C1,?Deepe,?Tag], l2sc(?L2,?T5,?T6),
    ?T7=')[', sc(?T6,?T7,?T8),
    sc(?T8,?Hf,?Hmth,?T9),
    sc(?T9,'->?Agg] :- ?Agg = ',?Hf,?T10),
    sc(?T10,' {?',?Hmth,?T11),
    sc(?T11,'[',?T12),
    ml2sc(?Usmthlst,?T12,?T13), sc(?T13,']|?I:',?C1,',?I[',?T14),
    mvl2sc(?Usmthlst,?T14,?T15),
    sc(?T15,',',?Hmth,'->?',?Hmth,?T16),
    sc(?T16,']}.',?T17),
    writeln('')@_prolog, writeln(?T17)@_prolog,
    insertrulebyparseandeval( ?T17 ),
    %gvaggloiddata(?C1,?Deep,?Usmthlst,?Aggmthlst,?Aggfunclst,?Tag).
%gvaggloiddata(?_C1,?_Deep,?_Usmthlst,[],[],?_Tag).

The following listing shows a rule constructed for one case of an aggregating function.

// Example query:
//   algebra[%gv(tran,"deepextent",
//               [year,month,unitchg],[cost,sale],[sum,sum])].
// Rule generated for cost by year, month & unitchg and the sum aggregate:

igv(tran,"deepextent", …,?year,?month,?unitchg):
    gv(tran,"deepextent", … )[sumcost->?Agg] :-
        ?Agg = sum {?cost[?year,?month,?unitchg]|
                    ?I:tran,
                    ?I[year->?year, month->?month,
                       unitchg->?unitchg, cost->?cost] }.


Properties of Map and Aggregate operators

An interesting property of the sequential application of the same aggregate operator, given that the partitioning property lists are in a sequential subset relationship too, is that the sequence is equal to its last application (and vice versa).

∑G1( ∑G2( ∑G3( class ) ) ) ⇔ ∑G1( class ), given partitioning lists G1 ⊆ G2 ⊆ G3

Another useful observation is that some aggregate functions are derivable from other aggregates. For example, the aggregate average is equal to the aggregate sum divided by the aggregate count.
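In LaTeX notation, for a partitioning list G and property P (a sketch of the observation above):

\mathit{avg}_{G}(P) = \frac{\mathit{sum}_{G}(P)}{\mathit{count}_{G}(P)}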

12.1.2 Nest and Unnest

The nest and unnest by value operators are capable of building composite objects and of flattening out a level of composition from target CLASS instances. These restructuring operators are particular to, and required in the presence of, data models capable of building composite structures. These operators have their origin in the introduction of non-first normal form relational models. In our implementation projection is included in the operator's implementation.

12.1.2.1 Nest

The unary nest by value algebraic operator partitions the class instances according to a list of properties, which is passed as an argument, and collects the distinct values of another property into a new set-based property. Therefore from every partition of the partitioning properties a set of values of the nesting property is built. While the list of partitioning properties and the nesting property have a single-cardinality data-type requirement, the cardinality of the nested attribute of the output is a set. Consequently the data-type signature of a nest by value operator has the same signatures for the partitioning attributes but changes the name and cardinality (i.e. to a set) of the nested property. The one point specific to our data and query model that needs addressing is whether the extent or deep extent of the arguments is required.

The following example partitions CLASS TRAN's deep extent by property YEAR's distinct values and nests the SALE property. That is, for every distinct YEAR value, all values of SALE coming from objects that have the same YEAR instance value are collated into a set, which is renamed to NESTSALE. The third query shows the data-type signature of the query instance: while the YEAR signature is identical to TRAN's, the property NESTSALE has a different name and upper cardinality from the property SALE of TRAN. The last query shows some of the set element values of NESTSALE for different partitions.

?- algebra[%nv(tran,"deepextent",[year],sale)].
Nesting arguments exist: ok
method not found list: []
Nesting methods exist: ok
method not found list: []
Attr to nest exist: ok
algquery instance created
signatures created
data loid and aggregate done
Yes.

?- ?Q:algquery.
?Q = nv(tran,"deepextent",555)
Yes.

?- ?Q:algquery[?M{?L:?U}*=>?Dt;?M{?L:?U}=>?Dt].
?Q = nv(tran,"deepextent",555) ?M = nestsale ?Dt = integer ?L = 0 ?U = *
?Q = nv(tran,"deepextent",555) ?M = year ?Dt = integer ?L = 1 ?U = 1
Yes.

?- ?I:nv(tran,"deepextent",555)[?M->?D].
?I = inv(tran,"deepextent",555,2010) ?M = nestsale ?D = 100
?I = inv(tran,"deepextent",555,2010) ?M = nestsale ?D = 200
…
?I = inv(tran,"deepextent",555,2012) ?M = nestsale ?D = 250
?I = inv(tran,"deepextent",555,2012) ?M = nestsale ?D = 350
?I = inv(tran,"deepextent",555,2012) ?M = nestsale ?D = 450
?I = inv(tran,"deepextent",555,2012) ?M = nestsale ?D = 550
?I = inv(tran,"deepextent",555,2012) ?M = year ?D = 2012
Yes.

For the nest operator implementation two reified rules are used. These are attached to an object identified with NVF.

The property ALGQUERYINSTANCE holds a parameterised rule which asserts that an NV(…) object is an instance-of ALGQUERY. The method takes five arguments: the first two are the collection on which the nesting is to be executed and an indicator stating whether a deep extent or an extent is required for the result. The third is a list of property names to partition on. The fourth argument is bound to the class property which is to be nested. The fifth argument is a unique identifier. Note that the instantiated NV(…) object has two properties that unify with the list of partitioning properties and the property to be nested.

nvf[algqueryinstance(?C1,?Deep,?Usmthlst,?Nestattr,?Tag) ->
    ${ ( nv(?C1,?Deep,?Tag):algquery[nestby->?Usmthlst, nestatr->?Nestattr] :- true ) }].

The second reification concerns the building of rules that create the data type signature of the nesting result. The data type signature of the query result is determined by the list of partitioning properties and the property to be nested. Also required is the creation of a name for the nested property; for example nested property SALE is renamed to NESTSALE. The data type reification starts by sifting for the relevant ALGQUERY instance, i.e. NV(…). For each property in the method list bound to ?USMTHLST we search for all non-compound inheritable methods and bind each to ?M. This property's data-type signature is instantiated on the query instance. In the case of the nested attribute we need to do two things once the nested property is unified with variable ?M (this is the second reification instance): the first is the renaming, and the second is setting the upper bound from 1 to '*' (i.e. many). There are eight rules based on the property's mode (e.g. inheritable or not) and on whether it is part of the partitioning set or is the nesting attribute.

nvf[algquerydt(?C1,?Deep,?Usmthlst,?Nestattr,?Tag) ->
    ${ // nest by attributes
       ( nv(?C1,?Deep,?Tag)[?M{?B:?T}*=>?D] :-
             nv(?C1,?Deep,?Tag):algquery,
             %lmember(?M,?Usmthlst),
             ?C1[?M{?B:?T}*=>?D],
             not compound(?M)@_prolog ),
       … ,
       // nested attribute
       ( nv(?C1,?Deep,?Tag)[?Maggr{0:*}*=>?D] :-
             nv(?C1,?Deep,?Tag):algquery,
             ?Nestattr=?M,
             ?C1[?M*=>?D],
             not compound(?M)@_prolog,
             addprefix('nest',?M,?Maggr) ),
       … }].

The actual evaluation of nesting by value is done through procedure %NV in object ALGEBRA. The procedure %NV takes four arguments: the first two are the class's instances and an indicator of the extent required (i.e. unified with variable ?DEEP). The third argument is a list of properties on which to nest. The final argument is bound to the property to be nested. The first part of the implementation deals with pre-checks: for example, the partitioning properties list contains at least one property, and every property exists and has singular cardinality. Another check is done on the nested attribute – again it must exist, and it too must have singular cardinality. Once the checks are confirmed an identifier is generated and unified with ?TAG. The procedure then creates the query instance and the query instance's data-type signature; as usual these are handled by reifying the rules and inserting them into the object base. The instantiation of the query instance objects is slightly different here: first, as in the aggregate operator, the insertion of a fact into the object base is done through the Flora-2 parse module; and second, the fact instantiation is split into two parts (i.e. one deals with the partitioning values and the other with the nested property). Two separate procedures are called, namely %NVLOIDDATA11 and %NVLOIDDATA12, in sequence.

algebra[%nv(?C1,?Deep,?Usmthlst,?Nestattr)] :-
    ?Nigb = count{?Igb | %lmember(?Igb, ?Usmthlst)},
    if ( ?Nigb > 0 ) … ,
    algebra[%singlemethodsexist(?C1,?Usmthlst,?Nousmthlst)],
    if ( ?Nousmthlst = [] ) … ,
    algebra[%singlemethodsexist(?C1,[?Nestattr],?Nonestattr)],
    newoid{?Tag},                                  // get a tag
    nvf[ algqueryinstance(?C1,?Deep,?Usmthlst,?Nestattr,?Tag) -> ?Rules1 ],
    insertrule { ?Rules1 },                        // create query instance
    writeln('algquery instance created')@_prolog,
    nvf[ algquerydt(?C1,?Deep,?Usmthlst,?Nestattr,?Tag) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    // generate aggregate loid and data from nesting
    if ?Deep="deepextent" then (?Deepc='deepextent') else (?Deepc='extent'),
    %nvloiddata11(?C1,?Deepc,?Usmthlst,?Nestattr,?Tag),
    %nvloiddata12(?C1,?Deepc,?Usmthlst,?Nestattr,?Tag),
    writeln('data loid and aggregate done')@_prolog.

The two procedures share a common logic: a string is built incrementally from the arguments provided and forms a rule. Once this rule is built it is parsed and inserted into the object base for evaluation. Both procedures take the same arguments as the calling procedure.

%nvloiddata11(?C1,?Deep,?Usmthlst,?_Nestattr,?Tag) :-
    sc('"',?Deep,'"',?Deepe),
    ?T0='inv(',
    ?L1=[?C1,?Deepe,?Tag], l2sc(?L1,?T0,?T1),
    sc(?T1,',',?T2),
    ml2sc(?Usmthlst,?T2,?T3),
    ?T4='):nv(', sc(?T3,?T4,?T5),
    ?L2=[?C1,?Deepe,?Tag], l2sc(?L2,?T5,?T6),
    ?T7=')[', sc(?T6,?T7,?T8),
    mvl2sc(?Usmthlst,?T8,?T9),
    sc(?T9,'] :- ?I:',?C1,' , ?I[',?T10),
    mvl2sc(?Usmthlst,?T10,?T11),
    sc(?T11,'].',?T12),
    %insertrulebyparseandeval(?T12).

In the following listing the rules generated for a query are given:

// Example query
// algebra[%nv(tran,"deepextent",[year],sale)].

// Two rules generated for nesting sale by year

inv(tran,"deepextent",…,?year):nv(tran,"deepextent",…)[year -> ?year] :- ?I:tran , ?I[year -> ?year].

inv(tran,"deepextent",…,?year)[nestsale -> ?sale] :- ?I:tran , ?I[year -> ?year, sale -> ?sale].

12.1.2.2 Unnest The unary unnest operator unfolds a set-valued property into a single-valued property; each element of the set of values is 'joined' to the values of the list of properties passed as arguments to the operator. It needs to be stated that while the unnesting property has set cardinality, the other properties have single cardinality. In simple terms it undoes what the nest operator does; but more on this later. Consequently the data-type signature of an unnest by value operand keeps the signatures of the repeating attributes but changes the name and cardinality (i.e. to single cardinality) of the unnested property. A point specific to our data and query model that needs addressing is whether the extent or the deep extent of the argument is required. Also a post-evaluation duplicate check needs to be run and any duplicates eliminated.

The following script shows an unnesting example. The first query shows two attribute values of the object identified as MARY: FNAME is a singular property and takes the string "mary", while TELNO is a set-valued property and takes the two numbers 1234 and 5678. The second query fires the unnest operator on the set-valued property TELNO with property FNAME as the repeating value in the output objects. The verbose output lists the evaluation progress, including comments from the pre and post checks. The next query extracts the data type signature of the query instance created for the unnesting by value operation; note the renaming of the unnested attribute – i.e. UNTELNO. The next query shows the properties of the query instance: i.e. the query arguments. The result of the operation is shown in the last query (i.e. sanitised output); note that there are two instances of OV(…) that hold the string "mary", i.e. the repeating property, with the telephone numbers 1234 and 5678 respectively. That these are two separate instances is indicated by the functors of the logical identifiers of the relative instances – e.g. IOV(…).

?- mary[fname->?F, telno->?T].
?F = "mary"^^_string ?T = "1234"^^_integer
?F = "mary"^^_string ?T = "5678"^^_integer
Yes.

?- algebra[%ov(student,"deepextent",[fname],telno)].
Unnesting arguments exist: ok method not found list: []
Unnesting methods exist: ok method not found list: []
Attr. to unnest exist: ok
algquery instance created
signatures created
data loid and unnesting done
duplicate list[]
duplicate deletion done
Yes.

?- ?Q:algquery[?M*=>?Dt;?M=>?Dt].
?Q = ov(student,"deepextent",…) ?M = fname ?Dt = string
?Q = ov(student,"deepextent",159) ?M = untelno ?Dt = integer
Yes.

?- ?Q:algquery[?M->?V].
?Q = ov(student,"deepextent",159) ?M = unnestatr ?V = telno
?Q = ov(student,"deepextent",159) ?M = unnestby ?V = [fname]
Yes.

Object-Oriented Query Model (II) - Page [ 325 ]


?- ?I:ov(…)[?M->?V].
?I = iov(…,"mary"^^_s,"1234"^^_i) ?M = fname ?V = "mary"^^_s
?I = iov(…,"mary"^^_s,"1234"^^_i) ?M = untelno ?V = "1234"^^_i
?I = iov(…,"mary"^^_s,"5678"^^_i) ?M = fname ?V = "mary"^^_s
?I = iov(…,"mary"^^_s,"5678"^^_i) ?M = untelno ?V = "5678"^^_i
?I = iov(…,"michael"^^_s,"3456"^^_i) ?M = fname ?V = "michael"^^_s
?I = iov(…,"michael"^^_s,"3456"^^_i) ?M = untelno ?V = "3456"^^_i
?I = iov(…,"michael"^^_s,"9012"^^_i) ?M = fname ?V = "michael"^^_s
?I = iov(…,"michael"^^_s,"9012"^^_i) ?M = untelno ?V = "9012"^^_i
…
Yes.

For the unnest operator implementation two reified rules are used. These are attached to an object identified with OVF.

The property ALGQUERYINSTANCE holds a parameterised rule which asserts that an OV(…) object is an instance-of ALGQUERY. The method takes five arguments: the first two are the collection on which the unnesting is to be executed and an indicator stating whether a deep extent or an extent is required for the result. The third contains a list of property names to repeat. The fourth argument is bound to the class property which is to be unnested. The fifth argument is a unique identifier. Note that the instantiated OV(…) object has two properties that unify with the list of repeating properties and the property to be unnested.

ovf[algqueryinstance(?C1,?Deep,?Usmthlst,?Unnestattr,?Tag) ->
    ${ ( ov(?C1,?Deep,?Tag):algquery[nestby->?Usmthlst, nestatr->?Unnestattr] :- true ) }].

The second reification concerns the building of rules that create the data-type signature of the unnesting result. The data type signature of the query result is determined by the list of repeating properties and the property to be unnested. Also required is the renaming of the unnested property; for example unnested property TELNO is renamed to UNTELNO. The data type reification starts by sifting for the relevant ALGQUERY instance, i.e. OV(…). For each property in the method list bound to ?USMTHLST we search for all non-compound inheritable methods and bind each to ?M. This property's data-type signature is instantiated on the query instance. In the case of the unnesting attribute we need to do two things once the unnested property is unified with variable ?M (this is the second reification instance): the first is the renaming, and the second is setting the upper bound from '*' to 1 (i.e. one). There are, as usual, four rules based on the property's mode (e.g. inheritable or not) for the repeating properties and four for the unnesting attribute.



ovf[algquerydt(?C1,?Deep,?Usmthlst,?Unnestattr,?Tag) ->
    ${ // unnest by attributes
       ( ov(?C1,?Deep,?Tag)[?M{?B:?T}*=>?D] :-
             ov(?C1,?Deep,?Tag):algquery,
             %lmember(?M,?Usmthlst),
             ?C1[?M{?B:?T}*=>?D],
             not compound(?M)@_prolog ),
       … ,
       // unnested attribute (upper bound set to 1, as described above)
       ( ov(?C1,?Deep,?Tag)[?Maggr{0:1}*=>?D] :-
             ov(?C1,?Deep,?Tag):algquery,
             ?Unnestattr=?M,
             ?C1[?M=>?D],
             not compound(?M)@_prolog,
             addprefix('un',?M,?Maggr) ),
       … }].

The actual evaluation of unnesting by value is done through procedure %OV in object ALGEBRA. The procedure %OV takes four arguments: the first two are the class's instances and an indicator of the extent required (i.e. unified with variable ?DEEP). The third argument is a list of properties on which to repeat the unnesting property. The fourth and final argument is bound to the property to be unnested. The first part of the implementation deals with pre-checks: for example, the repeating properties list contains at least one property, and every property exists and has singular cardinality. Another check is done on the unnesting attribute through method %SETMETHODSEXIST of object ALGEBRA – again it must exist, and it must have set cardinality. Once the checks are confirmed an identifier is generated and unified with ?TAG. The procedure then creates the query instance and the query instance's data type signature; these are handled by reifying the rules and inserting them into the object base. The instantiation of the query instance objects is slightly different from the nest operator's: the rule generating the instances is constructed as a string and then the string is parsed and evaluated into the object base through the parse module. Procedure %OVLOIDDATA is called. A post check for duplicates is made.

algebra[%ov(?C1,?Deep,?Usmthlst,?Unnestattr)] :-
    ?Nigb = count{?Igb | %lmember(?Igb, ?Usmthlst)},
    if ( ?Nigb > 0 ) … ,
    algebra[%singlemethodsexist(?C1,?Usmthlst,?Nousmthlst)],
    if ( ?Nousmthlst = [] ) … ,
    algebra[%setmethodsexist(?C1,[?Unnestattr],?Nounnestattr)],
    if ( ?Nounnestattr = [] ) … ,
    newoid{?Tag},
    ovf[ algqueryinstance(?C1,?Deep,?Usmthlst,?Unnestattr,?Tag) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    ovf[ algquerydt(?C1,?Deep,?Usmthlst,?Unnestattr,?Tag) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    // generate loid and data from unnesting
    if ?Deep="deepextent" then (?Deepc='deepextent') else (?Deepc='extent'),
    %ovloiddata(?C1,?Deepc,?Usmthlst,?Unnestattr,?Tag),
    writeln('data loid and unnesting done')@_prolog,
    algebra[%duplicates( ov(?C1,?Deep,?Tag),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst, ov(?C1,?Deep,?Tag))].



Calling procedure %OVLOIDDATA produces a string which is a rule: for this example the rule says that if there exists an instance-of STUDENT whose property values of FNAME and TELNO unify with variables ?FNAME and ?TELNO, then an IOV(…) object is created as an instance-of OV(…), with properties FNAME and UNTELNO unified to variables ?FNAME and ?TELNO. Note how the rule takes care of building the identifier, especially that of IOV(…), by including the values of the two variables ?FNAME and ?TELNO in its functor.

// Example query
// algebra[%ov(student,"deepextent",[fname],telno)].

// A rule generated for unnesting telno and repeating fname

iov(student,"deepextent",159,?fname,?telno):ov(student,"deepextent",159) [fname->?fname, untelno->?telno] :- ?I:student, ?I[fname->?fname,telno->?telno].

12.1.2.3 Properties of Nest and Unnest operators The application of nesting to an unnesting, and of unnesting to a nesting, has well-known properties; these are attributed to many sources but are mainly credited to Roth, Korth & Silberschatz in [ROTHM88], and Thomas and Fischer in [THOMA86].

In general unnest (denoted by μ) is the inverse of nest (denoted by ν):

μ_B( ν_B(class) ) = class

In general it is not the case that nest is the inverse of unnest:

ν_B( μ_B(class) ) ≠ class

('class' denotes a range – e.g. a CLASS instance – and B the property nested or unnested on).

For nesting to be the inverse of unnest the grouping properties must functionally determine the nesting property.
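A small worked example, in our own set notation and using the MARY telephone data from above, makes the asymmetry concrete: two partitions that share their FNAME value are merged by an unnest followed by a nest, so the original class is not recovered.

μ_telno( { ⟨mary,{1234}⟩, ⟨mary,{5678}⟩ } ) = { ⟨mary,1234⟩, ⟨mary,5678⟩ }
ν_telno( { ⟨mary,1234⟩, ⟨mary,5678⟩ } ) = { ⟨mary,{1234,5678}⟩ } ≠ the original

Here FNAME does not functionally determine the nested TELNO set in the original, so the condition just stated is violated.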

12.1.3 Rename Operator The unary rename operator changes the names of a class instance's properties. The rename operator, for example, can make two non-union-compatible arguments compatible when only their property names differ. In the following script we show how to "join" together the names of units and students (i.e. UNAME and FNAME respectively) in one query instance output. In the first two queries each expression projects out the relative properties UNAME and FNAME. Then a union by value is attempted but fails, as the input arguments are not union compatible because the property names do not match. The rename operator is called twice and renames each attribute of the projected output query instance results into RNAME. At this point the union by value is executed and succeeds. Some of the last query's output is included.

?- algebra[%pv(student,"extent",[fname])].
Yes.

?- algebra[%pv(unit,"extent",[uname])].
Yes.

?- ?Q:algquery, ?I:?Q.
?Q = pv(student,"extent",_#'18219) ?I = ipv(student,"extent",_#'18219,mary)
?Q = pv(student,"extent",_#'18219) ?I = ipv(student,"extent",_#'18219,michael)
?Q = pv(student,"extent",_#'18219) ?I = ipv(student,"extent",_#'18219,nancy)
…
?Q = pv(unit,"extent",_#'18220) ?I = ipv(unit,"extent",_#'18220,fyp)
Yes.

?- ?Rt=pv(unit,"extent",?_1), ?Lf=pv(student,"extent",?_2),
   algebra[%uv(?Lf,?Rt,"extent")].
not union compatible
No.

?- ?Rt=pv(unit,"extent",?_1), algebra[%rv(?Rt,"extent",[uname],[rname])].
Yes.

?- ?Lf=pv(student,"extent",?_2), algebra[%rv(?Lf,"extent",[fname],[rname])].
Yes.

?- ?Lf:algquery, ?Lf=rv(pv(student,"extent",?_1),"extent",?_2),
   ?Rt:algquery, ?Rt=rv(pv(unit,"extent",?_3),"extent",?_4),
   algebra[%uv(?Lf,?Rt,"extent")].
union compatible UC: ok
…
Yes.

?- ?_Q:algquery, ?_Q=uv(?_1,?_2,?_3), ?_I:?_Q, ?_I[rname->?N].
?N = "data struct & algo"^^_string
?N = "dist db"^^_string
…
?N = "robert"^^_string
?N = "soft eng"^^_string
Yes.

For the rename operator implementation two reified rules are used. These are attached to an object identified with RVF.

The property ALGQUERYINSTANCE holds a parameterised rule which asserts that an RV(…) object is an instance-of ALGQUERY. The method takes five arguments: the first is the collection on which the rename is to be executed, e.g. a class name. The second is an indicator stating whether a deep extent or an extent is required for the result. The third and fourth contain the lists of methods to rename (i.e. from old to new names – henceforth called the 'from' and 'to' lists). The fifth is an auto-generated logical identifier. Besides creating the instance-of assertion, the reified rule instantiates two properties that are assigned the 'from' and 'to' lists (i.e. RENAMEFROM and RENAMETO in RV(…)).

rvf[algqueryinstance(?C1,?Deep,?Fmthlst,?Tmthlst,?Tag) ->
    ${ ( rv(?C1,?Deep,?Tag):algquery[renamefrom->?Fmthlst, renameto->?Tmthlst] :- true ) }].

It has just been stated that the 'from' and 'to' lists are the two arguments that govern the renaming operation. A simple predicate is required to map a property name in the 'from' list to the corresponding name in the 'to' list. The predicate is called %LOOKUP121 and takes four arguments: the first two are the 'from' and 'to' lists, the third is a name in the 'from' list, and the fourth is the item in the 'to' list that occurs in the same position as the third argument. The following shows how the procedure unifies 'c' from the 'to' list with the item '3' in the 'from' list.

?- ?Source=[1,2,3,4], ?Dest=[a,b,c,d], %lookup121(?Source,?Dest,3,?M).
?Source = [1, 2, 3, 4]
?Dest = [a, b, c, d]
?M = c
Yes.
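The text does not reproduce %LOOKUP121's definition; a minimal sketch of such a positional lookup, in the style of the framework's other list predicates, could be as follows (hypothetical code, not the actual implementation).

// Walk the 'from' and 'to' lists in step; when the sought name matches the
// head of the 'from' list, return the corresponding head of the 'to' list.
%lookup121([?F|?_Fs], [?T|?_Ts], ?F, ?T) :- !.
%lookup121([?_F|?Fs], [?_T|?Ts], ?M, ?R) :-
    %lookup121(?Fs, ?Ts, ?M, ?R).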

The second reification concerns the data type signature of the result. Actually the data type signature remains intact except for the renaming of some properties' names. The RVF object uses the ALGQUERYDT property, which takes five parameters: the same set as ALGQUERYINSTANCE.

ALGQUERYDT has four instantiations, each catering for a different property mode: inheritable (i.e. *=>) and non-inheritable (i.e. =>) methods, and compound and non-compound (i.e. NOT COMPOUND(?M)) methods. Each rule has the same pattern but fires for a particular data signature type. If a method signature exists in the argument class then it needs to be asserted in the output data type signature (i.e. of the RV(…) object). What is particular here is that if the property being evaluated, i.e. ?M, is a member of the 'from' list, i.e. ?FMTHLST, then its corresponding 'to' list entry is looked up through predicate %LOOKUP121.

rvf[algquerydt(?C1,?Deep,?Fmthlst,?Tmthlst,?Tag) ->
    ${ ( rv(?C1,?Deep,?Tag)[?Rm{?B:?T}*=>?D] :-
             rv(?C1,?Deep,?Tag):algquery,
             ?C1[?M{?B:?T}*=>?D],
             not compound(?M)@_prolog,
             if ( %lmember(?M,?Fmthlst) )
               then ( %lookup121(?Fmthlst,?Tmthlst,?M,?Rm) )
               else ( ?M=?Rm ) ),
       … }].

The object ALGEBRA has procedure %RV to implement rename by value; this procedure also takes four arguments. The actual implementation of the rename operation is long; this is due to a number of preconditions to be tested for the proper application of the 'from' and 'to' renaming – these are the third and fourth arguments of %RV. The first test is whether the two lists have the same number of elements. The second test ensures that all elements in the 'from' list are actually methods found in the class instance, which is the first argument of the procedure; the method used is the same one used in the project by value operator: %METHODSEXIST of object ALGEBRA. The third check ensures that no name in the 'to' list is already found in the class instance – again using procedure %METHODSEXIST. The fourth and final pre-check ensures that if a property to be renamed is compound then only the name is changed, i.e. the argument data type is not. An appropriate procedure, called %RVNOMTHPARAMCHANGE and explained later, is invoked with the 'from' and 'to' lists; it fails if there is such an issue. After the pre-checks the %RV procedure generates a unique logical identifier, which is unified with ?TAG. The two reified rules are invoked, reified, and inserted into the object base – these take care of the instance-of assertion (i.e. IRV(…):RV(…)) and the data type signatures of the query instance class. As is common across most operators, the next actions call procedures %RVDATALOID and %RVDATA so as to create the query extent and transfer data onto the query instance collection. The operator ends by checking for duplicates and purging any present.

algebra[%rv(?C1,?Deep,?Usfmthlst,?Ustmthlst)] :-
    ?Nfmth = count{?Ifmth | %lmember(?Ifmth, ?Usfmthlst)},
    ?Ntmth = count{?Itmth | %lmember(?Itmth, ?Ustmthlst)},
    if ( ?Nfmth = ?Ntmth )
      then ( writeln('From and To Methods list number of elements: ok')@_prolog )
      else ( writeln('From and To Methods list have diff number of ele.')@_prolog,
             fail ),
    algebra[%methodsexist(?C1,?Usfmthlst,?Nofmthlst)],
    if ( ?Nofmthlst = [] )
      then ( writeln('From Methods exist: ok')@_prolog )
      else ( write('From Methods list: problem; missing attributes - ')@_prolog,
             writeln(?Nofmthlst)@_prolog, fail ),
    algebra[%methodsexist(?C1,?Ustmthlst,?Notmthlst)],
    if ( ?Notmthlst = ?Ustmthlst )
      then ( writeln('To Methods have no clash: ok')@_prolog )
      else ( write('To Methods list: problem clashing attributes - ')@_prolog,
             writeln(?Ustmthlst)@_prolog, fail ),
    %rvnomthparamchange(?Usfmthlst,?Ustmthlst),
    newoid{?Tag},
    rvf[ algqueryinstance(?C1,?Deep,?Usfmthlst,?Ustmthlst,?Tag) -> ?Rules1 ],
    insertrule { ?Rules1 },
    writeln('algquery instance created')@_prolog,
    rvf[ algquerydt(?C1,?Deep,?Usfmthlst,?Ustmthlst,?Tag) -> ?Rules2 ],
    insertrule { ?Rules2 },
    writeln('signatures created')@_prolog,
    %rvdataloid(?C1,?Deep,?Usfmthlst,?Ustmthlst,?Tag),
    %rvdata(?C1,?Deep,?Usfmthlst,?Ustmthlst,?Tag),
    writeln('data loid and move done')@_prolog,
    algebra[%duplicates(rv(?C1,?Deep,?Tag),?Duplst)],
    write('duplicate list')@_prolog, writeln(?Duplst)@_prolog,
    algebra[%delduplicate(?Duplst,rv(?C1,?Deep,?Tag))],
    writeln('duplicate deletion done')@_prolog.

The procedure %RVDATALOID, called from %RV, builds a list of objects that need to be instantiated as instances of RV(…) – these are instances-of ?C1. The list is passed to the recursive procedure %RVINSERTLEFT, which iteratively inserts clauses of the form IRV(…):RV(…) into the object base.

%rvdataloid(?C1,?Deep,?Fmthlst,?Tmthlst,?Tag) :-
    if (?Deep="extent")
      then (?Ileftlst = collectset{?Ileft | ?Ileft:?C1,
                          if (?Sc::?C1, ?Ileft:?Sc) then (false) else (true)})
      else (?Ileftlst = collectset{?Ileft | ?Ileft:?C1}),
    %rvinsertleft(?Ileftlst,?C1,?Deep,?Fmthlst,?Tmthlst,?Tag).



The %RVDATA procedure, although similar to other operators' implementations (e.g. four sections dealing with each type of method signature), has some differences. The basic difference is that if the method being copied into is in the 'to' list, then the corresponding 'from' list method is sought and the data copied from class instance ?C1. Once the collection is built it is passed on to procedure %RVINSERTSMETHOD (or %RVINSERTPMETHOD for compound, i.e. parameterised, properties) for it to assert the properties' values.

%rvdata(?C1,?Deep,?Fmthlst,?Tmthlst,?Tag) :-
    ?LeftIsmlst = collectset{ ?Wl |
        rv(?C1,?Deep,?Tag):algquery,
        rv(?C1,?Deep,?Tag)[?M*=>?D],
        not compound(?M)@_prolog,
        irv(?C1,?Deep,?Tag,?I)[],
        if ( %lmember(?M,?Tmthlst) )
          then ( %lookup121(?Fmthlst,?Tmthlst,?Oldm,?M),
                 ?I:?C1[?Oldm->?R], dt(?D,?Dt), ?R:?Dt,
                 ?Wl=f(?C1,?Deep,?Tag,?I,?M,?R) )
          else ( ?I:?C1[?M->?R], dt(?D,?Dt), ?R:?Dt,
                 ?Wl=f(?C1,?Deep,?Tag,?I,?M,?R) ) },
    %rvinsertsmethod(?LeftIsmlst),
    … .

%rvinsertsmethod([?H|?Rlst]) :-
    ?H = f(?C1,?Deep,?Tag,?I,?M,?R),
    insert{ irv(?C1,?Deep,?Tag,?I)[?M->?R] },
    %rvinsertsmethod(?Rlst).
%rvinsertsmethod([]).

The procedure %RVNOMTHPARAMCHANGE is called with the two lists unified to it, i.e. the 'from' and 'to' lists. The procedure recurses over the items of both lists, one pair at a time, and for each pair does three checks. The first check applies if both items are compound – i.e. methods with an argument – and entails ensuring that both arguments are identical; if they are not, the procedure fails. The second and third checks fail the procedure if one item is a compound term whilst the other is not.

%rvnomthparamchange([?Fmth|?Fmthlst],[?Tmth|?Tmthlst]) :-
    if ( compound(?Fmth)@_prolog, compound(?Tmth)@_prolog,
         ?Tmth=?_T(?Pt), ?Fmth=?_F(?Pf) )
      then ( if ( ?Pt!=?Pf )
               then ( wf('method''s parameters dont match in from and to lst.'),
                      fail ) )
      else ( %rvnomthparamchange(?Fmthlst,?Tmthlst) ),
    if ( compound(?Fmth)@_prolog, not compound(?Tmth)@_prolog )
      then ( wf('method''s parameters dont match in from and to lists.'), fail )
      else ( %rvnomthparamchange(?Fmthlst,?Tmthlst) ),
    if ( not compound(?Fmth)@_prolog, compound(?Tmth)@_prolog )
      then ( wf('method''s parameters dont match in from and to lists.'), fail )
      else ( %rvnomthparamchange(?Fmthlst,?Tmthlst) ).
%rvnomthparamchange([],[]) :- !.



Properties of the rename operator

The operator is lossy, as a duplicate scan and removal is specified. Also a sequence of renames on the same property is equivalent to the last rename.

12.2 Cross Algebraic Operators Properties

The algebraic operators have properties that hold across other operators. (Many advanced database textbooks cover these – e.g. Silberschatz et al. [SILBE10] – and they are applicable here.) For example a select operand over a product can be combined into a theta join (i.e. a join based not on equality but on any other comparison). Conversely a select predicate made up of a number of conjunctions can be disengaged into a cascade of selects.

σ_θ( E1 × E2 ) ⇔ E1 ⋈_θ E2

and

σ_{p1∧p2}( E ) ⇔ σ_{p1}( σ_{p2}( E ) )

The following are some of the more useful transformations (one is written out after the list):

 Natural joins are associative;

 Theta joins are also associative (under certain structural limitations);

 Selection operand is distributive over theta joins;

 Projection operand is distributive over theta joins.
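For instance, taking the third transformation, the distributivity of selection over a theta join can be written out in our notation as follows; the side condition on the predicate is the standard one.

σ_p( E1 ⋈_θ E2 ) = σ_p(E1) ⋈_θ E2,   provided predicate p refers only to properties of E1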

As we have previously stated, there are a number of techniques that the framework adopts to implement these: the first is reification and instantiation of rules; the second is the technique of equality over logical identifiers; and the third is inserting rules into the object base through the Flora-2 parsing module. Furthermore some transformations are better left for later parts of query optimisation because, for example, the number of permutations generated is prohibitive, and pruning the search space while transforming is therefore inevitable.

12.3 Summary

This chapter built a query model based on an algebra specially developed for our object-database framework. The query model is procedural, closed, typed, and at least Codd complete. The algebra has ten operators which are relational and nested relational in origin, except for one which has a functional origin.



Every operator implemented is sensitive to the underlying object-oriented data model. Each operator takes a collection of objects as input; in our framework this is a class, structure or a query instance. Furthermore each algebraic operator makes extensive use of the underlying meta data available in the framework; e.g. to work out the data type of the result. It needs to be emphasised that operators are value based and yet able to work with logical identifiers.

Each operator has its algebraic properties; these include commutativity, associativity, and distributivity. Through identity equality most of these are implemented in our framework. Furthermore these operators, when composed, derive other known algebraic operators; e.g. intersection and join.

The output of an algebraic expression is an object collection whose objects are made instances-of an object that represents the query invocation; i.e. the query instance itself is an instance of the ALGQUERY object. In the query instance complete details of the original query are kept. As stated earlier, the query instance collections are then available as operands to other query expressions. (Of course a query instance extent is valid only while the range to which it was applied remains current.)

The implementation of all algebraic operands is in Flora-2. Many important parts of an operator’s implementation are declarative; for example working out the data type of the result, and working out the extent. A good number of methods supplement the algebraic operations.

Flora-2 features, some of which are advanced, have been used extensively. These features include identity with functors, identity equality, rule reifications, rule parsing from generated strings, and inserting and retracting facts from the object base. Some underlying XSB Prolog predicates have been used too. Other methods developed in the object-oriented data modelling chapters are used here too; for example the data-type inference and checking methods.

As regards comparing this algebra to the relational algebra in terms of which queries can be specified, we assert that ours is a superset; clearly the relational algebra has no equivalents of aggregate and map. Nonetheless our algebra is not a complete programming language; a notably missing operand is a fix-point for recursion.

A possible drawback here could be the shallowness of most operands over the object structures (two operations are an exception – i.e. select and map, which both accept path expressions that navigate deeply into structures); this is actually a consequence of our data model's object constructor choice. For example, when we project on a class extent, only the properties visible at the level of that class are seen. If a property is a composition of another class, or structure, then only its identifier is projected out. There are two contrasting issues. On one side, if operators are shallow they are easy to implement, unambiguous, and each operator's scope does not overlap with another's. On the other side, for example in other algebras, some non-shallow projections replace a query based on select, product, and project.

A drawback of our algebra is that it does not handle multi-dimensional arrays and multi-sets (e.g. bags). Also each set or collection is typed: i.e. no collection is heterogeneous in terms of instance-of. There is no reason, other than the time required, why the framework, algebra and Flora-2 system cannot offer support for multi-dimensional arrays and multi-sets. A similar problem is the implementation of the operands' result properties: while the procedures implying associativity and commutativity have been implemented in many cases, in some cases they have not; a justification, for most cases, was given in terms of the number of permutations possible. Another limitation, inherited from the data-type checking and inference module, is that we only consider methods with one argument. It is possible to include methods with any number of arguments, but our coding would need a different approach; nonetheless it is doable.

A qualitative issue that many rightly raise about programming languages is their readability and writeability. We do not think it is overly complex to read the algebraic query constructs and results; but criticism levelled at the more involved identifier structures (e.g. identifiers with involved functors) is fair. Clearly a decent integrated development environment with folding and unpacking over the framework's object base is required.

We have many times shown a query composed from other queries, and implied that each query has been materialised – i.e. asserted as instances of the query instance. It must be stated that no query instance maintains its currency after evaluation; i.e. it is assumed that the object base, especially the parts the query instances have been computed from, has not changed. If changes are made, a re-evaluation of the queries is required if they are still in demand. As we shall see in the next chapters, the assumption that an object base's state does not change except through query processing is neither prohibitive nor pointless in some circumstances.

The framework and algebraic operators have been implemented with Flora-2. Flora-2 implements logic programming with F-logic, with the latter's distinctive higher-order syntax, first order semantics, and object-oriented semantics ingrained in its inference. Flora-2's advanced features, like rule reification, parsing strings into rules and evaluating them in the object base, and equality of logical identifiers, have contributed immensely to the implementation of the algebraic operators. Many of the implementations use high level and declarative constructs. Nonetheless it is reasonable to find parts of the code base that are procedural and in need of a code refactoring exercise.

This chapter and the previous one merge a number of threads developed in this study: firstly, the integration of a conceptual design encoding into the framework; secondly, the development of type checking and type inference; and finally, query language constructs that work over the framework. The next chapter shows how the algebra is used as a procedural target of a real declarative object query language, and shows that the design and query model can aid query processing and optimisation.



Chapter 13

Query Processing and Optimisation

13 Query Processing and Optimisation

In this chapter a qualitative structural translation of a subset of ODMG OQL constructs into the algebra developed earlier is illustrated. The subset includes: constructs whose output constructor is a set; constructs that restructure the output through packing and unpacking of object database structures; constructs that include a sequence of target sets (of objects); a selection of filter constructs; and group by queries.

There are a number of ODMG OQL features that are not implemented. These include the bag structure for output results, some filter conditions, the having clause in the group by, and the sorting clause. These shortcomings are surveyed in dedicated sections. The list is long, mainly for two reasons: the first is the time required to implement; the second is the fact that the ODMG query model has an extensive list of features.

The latter half of this chapter shows how algebraic expressions, possibly encodings of OQL queries, are optimised. Various optimisation techniques are presented, most of which rely heavily on the underlying object-oriented data model and framework facilities. These techniques include join reductions, semantic optimisation, view materialisation, and avoiding the duplicate detection and elimination process during query execution.

13.1 ODMG OQL and our Algebra translation

The OQL construct is based on the "select from where" structure popular in SQL. With our algebra a canonical translation of this is a composition of a sequence of products, possible renamings, a selection, and finally a projection.

The considerations and translations follow the sequence of the ODMG OQL standard's text (i.e. chapter four of [CATTE90]). Each section indicates the key features missing in our algebra and gives brief indications of how one can address them. Also we have named the following sub-sections as they are named in the standard's chapter four.

13.1.1 Query Input and Result (4.3) The main OQL input range is a class. In our algebra the input ranges are supported by CLASS, STRUCTURE, and ALGQUERY instances. The output of the following OQL query has exactly the output structure our algebra provides:


SELECT DISTINCT STRUCT(a: x.age, s: x.sex)
FROM Persons x
WHERE x.name = "Pat";

The algebra query would be a select on CLASS instance PERSONS followed by a rename of properties AGE and SEX into A and S respectively.
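Using the abbreviated operator notation adopted later in this chapter (section 13.1.9), the translation could be sketched as follows; eqop(name,"Pat") is our shorthand for the equality predicate, and the projection step makes explicit that only the two required properties survive.

// Sketch: select Persons where name = "Pat", project age and sex,
// then rename the projected properties to a and s.
rv(pv(sv(persons, eqop(name,"Pat")), [age,sex]), [age,sex], [a,s])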

Table 13.1 lists items mentioned in the relative OQL section but not implemented.

OQL Feature | Severity | Possible resolution | Implementation effort
Concept of a named object | Low | Identifier equality in F-logic between object identifier and name, and check against homonyms. | Low
Output of a multi-set | High | Introduce a multi-set data type constructor and its semantics; also have conversion to and from sets. | Very high
Output not a set of structures | Medium | Have to introduce the notion of values that are distinct from objects. | Very high

Table 13.1: Query input and result

13.1.2 Dealing with Object Identity (4.4) Like the standard our algebraic operators can access by identifier or by property value.

This section deals with creating objects through query processing; if their nature is transient rather than persistent, the algebra's query instances suffice. The OQL standard also gives examples of how to access objects in the object database: by a property equal to a literal; by an object's logical identifier; and by projecting an object's property value. All of these are supported by our algebra.

Table 13.2 lists an item mentioned in the relative OQL section but not implemented.

OQL Feature | Severity | Possible resolution | Implementation effort
Creating objects | Low | Assert persistent objects, if not affecting the current query, and re-evaluate the object base; e.g. clear any query instances and their extents. | Low

Table 13.2: Dealing with object identity


13.1.3 Path Expressions (4.5) In ODMG OQL path expressions must traverse singular properties. In our algebra two operators depend heavily on path expressions: namely select and map. For example in the map operand section we indicated that the complete data type signature, i.e. including cardinality, is computed for any path expression. It is anticipated that if the OQL select list has path expressions, a very common design pattern, then the algebraic map operator is used rather than the algebraic project.

Table 13.3 lists an item mentioned in the relative OQL section but not implemented in our algebra:

OQL Feature | Severity | Possible resolution | Implementation effort
andthen and orelse | Low | These Boolean operators are somewhat out of the loop in query rewriting as their use, for example, drops the commutativity properties of Boolean conjunction. Consequently one may expect their usefulness in 'hand coded' programs that access the object base. | High

Table 13.3: Path expressions

13.1.4 Undefined Values (4.6) The standard has a specially defined literal called UNDEFINED. Also a predicate called IS_DEFINED(…) is explained. In our algebra there is a similar predicate and it is used in the algebraic select operator's predicate. In our algebraic map operator we expect the user-defined function to tackle null values. For example Oracle and PostgreSQL have functions that replace a null with an expression's result; these are called NVL() and COALESCE() respectively.
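To illustrate, a map function in the spirit of NVL could be sketched in Flora-2 as follows; the predicate name nvl and the treatment of an absent property value as 'null' are our assumptions, not the framework's actual null handling.

// Hypothetical: unify ?V with ?I's value of method ?M if one is defined,
// otherwise with the supplied default.
nvl(?I, ?M, ?Default, ?V) :-
    if ( ?I[?M -> ?W] ) then ( ?V = ?W ) else ( ?V = ?Default ).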

Table 13.4 lists values mentioned in the relative OQL section but not implemented.

OQL Feature | Severity | Possible resolution | Implementation effort
UNDEFINED literal for all basic domains | Medium | Need to revisit and rewrite the data-typing rules of our object collection. | High
Nulls in aggregate functions | Low | Redesign and/or replace Flora-2 aggregate primitives. | High

Table 13.4: Undefined values

13.1.5 Method Invoking (4.7) OQL expects to call a property, i.e. a method, with or without arguments. In our data model and query model properties with an argument are supported too, but note that only one argument is supported. We can state that this limitation is really one of coding effort rather than impossibility.

13.1.6 Polymorphism (4.8) The standard asserts that OQL is a typed query language. Also it is expected that polymorphic collections are indicated as ranges to query expressions.

In our framework the data-type checking and inference supports polymorphism and the object collection state is checked for type errors. Each operator uses type checking not only to type the output but also when evaluating the operand’s semantics.

In this section reference is made to the extent and deep extent of a query range; in OQL the default is the deep extent. Furthermore the standard introduces a predicate, called the class indicator, that checks at runtime the instance-of relationship within a deep extent. For example, given STUDENT ISA PERSON, if one sets the range of a query to PERSON's deep extent one can then check the instantiating class of each object. In our algebra, specifically in the select operator, there is a predicate that checks for the instantiating class; it is also possible to have it in a map function. But there is a difference in our algebra: the methods in the scope of the extent and the deep extent are identical; i.e. STUDENT's own (i.e. not inherited) methods are not visible through the range. If these are required one can work around the issue by rewriting the query; here the use of our algebraic operators union and difference is essential and useful.

13.1.7 Operator Composition (4.9) The standard claims that OQL is a purely functional language. It introduces the construct DEFINE to name a query; defined names can then form part of another query's range.

In our framework algebraic queries are all automatically named and made an instance of object ALGQUERY.


13.1.8 Language Definition (4.10) The main purpose of this section is to ensure that any expression, i.e. query, in OQL is type checked and the output's data type is worked out. OQL uses many type constructors: set, bag, list and array. The section starts by defining type compatibility. It continues by giving typing rules for queries, named query definitions, and atomic literals. The next part is substantial and deals with the data typing of constructor artefacts (e.g. constructing sets, bags, lists); it is followed by atomic expressions (e.g. binary, string, and object identity expressions). There is a section devoted to universal and existential quantification over collections; this is followed by aggregate functions.

In our framework's data typing and inference there are similar rules, but there is no support for the bag and array type constructors (as already indicated earlier). In our algebra there is no provision to construct artefacts other than those created through query processing. Ideally this is rectified; an indication of the solution is given in other algebras, where a tuple constructor addresses some of these shortcomings. As for aggregates, the set of functions available matches.

13.1.9 Select Expression–Language Definition (4.10.9) In OQL the general structure of a retrieval query follows:

select [distinct] f(x1, x2, ..., xn, xn+1, xn+2, ..., xn+p)
from x1 in e1(xn+1, xn+2, ..., xn+p),
     x2 in e2(x1, xn+1, xn+2, ..., xn+p),
     x3 in e3(x1, x2, xn+1, xn+2, ..., xn+p),
     ...
     xn in en(x1, x2, ..., xn-1, xn+1, xn+2, ..., xn+p)
[where p(x1, x2, ..., xn, xn+1, xn+2, ..., xn+p)]

Each expression ei in the from clause is a collection or set type. The predicate P(…) in the where clause is a Boolean expression. The data type of the output is determined by the return type of F(…); if the return type is struct(t) then the output is a collection of struct(t).

To compute the above OQL expression in our algebra there are three canonical steps. Some steps have alternatives, and these are identified by the query's design pattern.

The first step does a binary product through the from clause entries (i.e. e1 to en) to generate a collection of objects. There are alternatives to running a sequence of products: one option deals with singular path expressions in select operands; another deals with set path expressions by using the unnest algebraic operator. For the first alternative to match, a subset of the expressions e1 to en must convert into a singular path expression (singular because OQL recognises only single-result path expressions). For example if e1, e2, and e3 connect into a singular path expression then their product is replaced by a single expression with a select based on the path expression. For the second alternative to become effective, a subset of e1 to en must convert into a set path expression. It is recommended that in these cases the product operator is replaced by an unnesting.

The second step sifts this collection for objects that fit the query pattern: the Boolean predicate P(…) is applied to every object found in the first step and only those that satisfy it are output. If the deep extent is indicated in the query expression then certain operations between class instances related through ISA relationships require the use of the union and difference operands.

In the third step every object for output is passed to function F(…). If the DISTINCT keyword is indicated, and we expect so because of our query model, then the output has to have duplicates purged and the data-type constructor is formally a set rather than a collection. The structure of function F(…) is realised by a combination of the project, map, nest, and unnest operands.

Therefore the canonical translation, or better still the naïve translation, of the OQL structure above with DISTINCT mandatory follows. First the product operator is repeatedly called to create the Cartesian product of all expressions; the operands must be instances of objects CLASS, STRUCTURE and ALGQUERY. Second, the query instance of the last product done in the first step is used as an input operand to a select operation with a predicate that is a translation of P(…) – our algebra's predicates cover those of OQL widely, but not completely. The output is then determined by project, mapping and nesting operations. It needs to be noted that our map operand's functions take one argument at a time, which could be a tuple structure. Henceforth we use a very short syntactic representation of our operators: for example rather than ALGEBRA[%SV(C1,"DEEPEXTENT",EQOP(…))] we write SV(C1, EQOP(…)).

wv(sv(xv(xv(xv(...xv(en-1, en), ..., e3), e2), e1), p(…)),
   [x1, …, xn+p],
   [f1, …, fn+p]).
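To make the pattern concrete, a two-range query over the earlier STUDENT and UNIT classes might translate as below; the joining property UNITID on STUDENT and the property UID on UNIT are hypothetical, introduced only for illustration, and f1 and f2 stand for the output functions realising the struct labels.

// OQL (hypothetical schema):
//   select distinct struct(s: x.fname, u: y.uname)
//   from x in students, y in units
//   where x.unitid = y.uid
wv(sv(xv(student, unit), eqop(unitid, uid)), [fname, uname], [f1, f2])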

Following the where clause in an OQL select statement is the optional sorting directive, i.e. ORDER BY. In our algebra this is not implemented; it is left as a query instance manipulation feature.


In OQL aggregates are done through the addition of the GROUP BY … [HAVING …] construct, placed after the where clause and before the sorting directive. The variables yi determine the partitioning, and function FF(…) can output the partitioning instances and aggregate functions over distinct attribute values in PARTITION; PARTITION is a set of attributes. The optional HAVING clause sifts partitions according to the satisfaction of predicate H(…); it is expected that the H(…) predicate has an aggregate comparison, e.g. consider only partitions that have more than five instances (COUNT(*)>5). In OQL the general structure of an aggregate retrieval query follows:

select [distinct] ff(y1, y2, ..., yn-1, yn, partition)
from x1 in e1(x1),
     x2 in e2(x1, x2),
     ...
     xm in em(x1, x2, ..., xm-1, xm)
[where p(x1, x2, ..., xm-1, xm)]
[group by y1: g1(x1, ..., xm),
          y2: g2(x1, ..., xm),
          ...,
          yn-1: gn-1(x1, ..., xm),
          yn: gn(x1, ..., xm)
 [having h(y1, y2, ..., yn-1, yn, partition)]]

In our algebra the GROUP BY is implemented using the aggregate operator (i.e. GV) and possibly mapping (i.e. WV), and is very close to ODMG's specification. However the optional HAVING clause is not implemented, as already stated; the aggregate filtering needs the select operator's parser and evaluator to be developed further to take care of aggregate comparisons.

The following is a canonical translation for OQL's GROUP BY queries; properties g1 to gk are the grouping attributes and are a subset of x1 to xn+p, and the aggfi are aggregate functions (i.e. sum, min, max, count, and avg).

gv(sv(xv(xv(xv(...xv(en-1, en), ..., e3), e2), e1), p(…)),
   [g1, …, gk],
   [aggf1, …, aggfk]).
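Instantiating this pattern with the chapter's running TRAN example: a single-range OQL query grouping on year and summing sale needs no product or select, so the translation collapses to the aggregate operator alone.

// Abbreviated form of the translation:
//   gv(tran, [year], [sum])
// Full form, cf. the aggregate operator's invocation in chapter 12:
algebra[%gv(tran, "deepextent", [year], [sale], [sum])].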

13.1.10 OQL and Algebra A common use of database algebras, at least didactically, is as a target language for declarative query languages. The ODMG's OQL is an object-oriented and declarative query language whose representation is based on the "select from where" structure.

In this section we showed qualitatively two things: firstly, how and how much our query and data model compares with and covers ODMG's OQL; and secondly, a canonical conversion from OQL constructs into algebraic expressions composed from our operators. Our algebraic expressions are therefore closed, and at least Codd complete.


In many cases the translation is clean and effective. In other cases our handling of a concept or technique subsumes that of OQL; for example in path expressions. Clearly the algebra developed here makes an adequate and useful contribution to the procedural study of OQL queries.

The list of shortfalls and non-coverage is wide; this is expected as the OQL standard has had a long evolution and is feature rich. In our defence, even production systems do not implement the full OQL standard; for example the Spartan EyeDB, excellent as its implementation is, has shortcomings. The most significant missing features in our algebra are: no multiset data type constructor; no creation of objects during query processing other than those related to object ALGQUERY and its instances; and no HAVING filtering clause for the aggregate operator. We repeat that these are due more to time constraints than to inability.

Also some issues need better treatment. These include: better treatment of the perfidious null; a rethink of the late binding for deep extent queries; a revisit of the ODMG typing rules once multiset and array constructors are available (with their adjustment as in Bierman and Trigoni [BIERM00], and Alagic [ALAGI00]); and finally, the select operator's parser should be one-to-one with OQL's grammar.

13.2 Query Processing and Optimisation in our Framework

In the relational data model, SQL constructs formed through the "select from where" structure are declarative queries over a relational model. These queries are based on the select, project and join operations (viz. SPJ) of relational algebra. Declarative queries have to be translated into a procedural program, or access plan, which is then computed and executed (see figure 13.1).

This translation must be correct and complete, and the translation is also expected to generate a good access plan for execution. The effort put into the translation process has to be marginal.

During the translation and execution of access plans, called query processing, there are a number of aspects that make up its context. One aspect is the hardware available in terms of CPU, RAM and disk arrays. Hard disks have two relevant constraints: access is paged and very slow (approximately 1500 times slower than direct access in RAM), and a disk provides a limited transfer throughput. Another aspect is the logical and physical design. For example, single item access based on a key is strongly facilitated if the key is indexed; otherwise an expensive serial scan is required to find the item.


In reality the expectation of generating the best access plan for any declarative SPJ query is impossible to satisfy all the time. Finding a general solution to SPJ query optimisation is not only complicated by contextual aspects but has been shown to be NP-complete by Ibaraki and Kameda [IBARK84]. Yet query processing and optimisation is still done, with the main justification being that the difference between a bad plan and a good one (even if not an optimal one) can be dramatic in terms of time and resources [IOANN96]. It is important to set a single optimisation goal as an objective, such as query running time or reducing hard disk reads and writes.

[Figure 13.1: Query processing & optimisation schematic diagram. At compile time a declarative query is decomposed, against the conceptual schema, into an algebraic query, which is optimised with the aid of database statistics into an access plan; at run time the access plan is executed with the query parameters to produce the result.]

How is optimisation of declarative queries attempted? The seminal paper is Selinger's [SELIN78] on the System R optimiser. In it she introduced the technique of creating a search space of query plans, drastically pruning it for the more promising ones by applying basic heuristics, and then, after estimating each remaining query plan's cost, choosing the one with the least cost. The space pruning heuristics include: look only at left-deep plans for joins (for an N-way join this reduces the enumeration from N! to N·2^(N-1)); and choose sort-merge join over nested loops when the result maintains an "interesting order" (i.e. it eliminates the sorting of a subsequent sort-merge join). An intermediate step in System R optimisation prior to costing is also heuristically driven: it moves selections down and projections up the join processing tree. It has to be repeated that the chosen plan is not guaranteed to be optimal, as the optimal plan could have been excluded from the search space – neither is it possible to claim that no false negatives are pruned out.
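A quick calculation, assuming the enumeration counts quoted above, shows how fast the saving grows with N:

N = 10:   N! = 3,628,800   versus   N·2^(N-1) = 10·2^9 = 5,120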

For costing plans, the System R optimiser maintains discontinuous statistics on the state of the database; for example, the number of tuples per relation, and the cardinality and selectivity of attribute values, feed its costing formula. To attenuate optimisation costs, System R opts to compile and save a query once, and then run it many times while it remains valid. There are now other statistics gathering regimes: use of histograms rather than counts, more frequent collection, and sample rather than total coverage. It is worth mentioning that an early paper by Kumar [KUMAR87] says "an optimal plan for most queries is very insensitive to selectivity inaccuracies".

The System R compiler came to be called the Cost Based Optimiser (CBO) and it introduced an important new interdependence between the logical and physical perspectives of database activities. For teaching purposes it makes sense to use relational algebra operands as query plan primitives, but in the actual implementation of System R basic file and index routines are used instead. CBOs are the dominant RDBMS compiler and can be found in MS SQL Server, Oracle and PostgreSQL.

13.2.1 Query Processing and Optimisation in Deductive Databases

In deductive systems the predominant technique for query optimisation is based on 'magic sets' rewriting [BANCI86]. In Flora-2 it is implemented with tabling, which the authors claim is better suited for optimisation on Prolog engines [YANGG08]. Another 'optimisation' is using main memory space to hold the object base; with the resurgence of main-memory databases and the availability of impressive amounts of SSD devices this is no longer considered far-fetched; on the contrary, it is becoming common.
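To give a feel for what tabling buys a deductive engine, the following Python analogy (not Flora-2 code; the edge relation is invented) memoises the answers of a recursive reachability subgoal so that cyclic data does not send evaluation into an infinite loop; a real tabling engine computes a proper fixpoint, which this single pass only approximates.

    # Tabling analogy: answers to reachable(Node) are memoised, so each
    # subgoal is expanded once even over the a<->b cycle below.
    EDGES = {"a": ["b"], "b": ["c", "a"], "c": []}
    TABLE = {}

    def reachable(node):
        if node in TABLE:            # subgoal already tabled: reuse the answers
            return TABLE[node]
        TABLE[node] = set()          # claim the subgoal so the cycle terminates
        for nxt in EDGES.get(node, []):
            TABLE[node] |= {nxt} | reachable(nxt)
        return TABLE[node]

    print(reachable("a"))            # {'a', 'b', 'c'}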

13.2.2 Physical Query Processing and Optimisation

Today a wide array of hardware devices is available for a DBMS to manage and optimise. These include the traditional CPUs, RAM, hard disk arrays and tape. Other devices that are gaining prominence are solid state disks (SSDs) and multi-core CPUs. For example, SSDs have comparable read and write performance to disks but are 100 times faster in seeking [AGRAW08] and [GRAEF09]. Given that a conceptual design of a database has a huge number of possible physical implementations, it is inevitable that some optimisation is required to utilise the resources available.

Currently the database designer has to add physical indicators through high-level definition commands – for example, that the class instance person is to build a B+ tree index [KIMWO89] and [COMER79] on the 'family name' attribute and a hash-based index on the 'address of residence' attribute. Once these decisions are made and implemented a number of consequences are fixed, and they leave the query optimiser with fewer options even though an index is advantageous in many cases. Also, any index created carries a cost to be kept up to date.

During access plan design the RDBMS uses operator constructs that are seemingly not our algebraic operands; these are actually physically supported operators (for example, B+ tree and hash index access). For example, the algebraic select operand corresponds to some thirty different physical select operators in Oracle. These are typically of an iterative nature; examples include 'get next record' and 'restart' [GRAEF09]. It is just as important to define 'physical' functions that are available but have no algebraic equivalent; these include disk-based sorting and temporary result storage and retrieval.
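The iterative nature of these physical operators can be sketched as follows; the operator and data names are invented, and this is a toy rendition of the pull-based 'get next record' discipline rather than any vendor's implementation.

    # Each physical operator exposes a 'get next record' interface and
    # pulls records from its input on demand (generators model this well).
    def table_scan(rows):
        for row in rows:              # physical access; here simply a list
            yield row

    def phys_select(child, pred):
        for row in child:             # 'get next record' from the input
            if pred(row):
                yield row             # pass only qualifying records upward

    professors = [{"name": "Ann", "rank": "full professor"},
                  {"name": "Bob", "rank": "lecturer"}]
    plan = phys_select(table_scan(professors),
                       lambda r: r["rank"] == "full professor")
    for record in plan:               # the consumer drives execution
        print(record["name"])         # -> Ann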

13.2.3 OODB Query Processing (QP) and Optimisation (QO)

There are issues in which object-oriented query processing differs from relational query processing. These arise from the different data types available, complex object structures, the class hierarchy, and method invocation during query processing.

Given that an object database schema is richer than a relational schema, there is scope for more semantic optimisation (i.e. rewriting a query into a form that is easier to compute but returns the same answer [HAMMA80]). The use of semantic optimisation can also aid better access plan generation.

Nonetheless there are common approaches. For example, algebraic transformation (as seen earlier with our object algebra) and CBO are both a good basis for building a query processor and optimiser for an object-oriented database. The first optimisers originate in Orion [KIMWO90C] and O2 [ATKIN92]. Orion's approach was to reduce a query to a logical graph in which sibling leaf vertices are identified, processed (e.g. reduced) and then replaced, leading to a unique evaluation of the graph. Two heuristics were used: one preferring indexes and the second, used if no index was found, opting for any object clustering on disk.

Indexing on objects and convenient object placement are useful techniques available to a query processor in attempting efficient execution of queries. In OODBs indexing and object placement (the fixed type is called clustering and the temporary type is called an interesting order in System R) have more varied application, as we need to index on the instance-of and ISA relationships (e.g. a class hierarchy index) and on object composition (e.g. path expressions). Also, logical identifiers are unidirectional. What to index and how much each index costs are extremely important decisions, and the two main references for costing these structures in OODBs are Bertino [BERTI00] and Gardarin [GARDA95].

Class hierarchy indexes are usually implemented with B+ trees – these are robust and widely known artefacts that allow direct access and subsequent sequential scans on key values (e.g. range queries). An even faster direct mechanism for indexing an object (but without subsequent sequential scans, and hence no range queries) does exist and is based on external hashing. In the case of indexes for path expressions a number of structures have been advocated for speeding up their evaluation – mainly by Bertino [BERTI94] – and these are called the nested, path and multi index.
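The idea behind a path index can be sketched in a few lines; the object base layout and names below are invented, and the structure is a simplification of Bertino's proposals. The value at the end of the path student.takes.taught_by.rank is precomputed per student, so at query time no logical identifiers are chased.

    # Toy path index: map the value at the end of a path expression to
    # the loids of the objects at the start of the path.
    from collections import defaultdict

    OB = {  # object base keyed by logical identifier (loid)
        "s1": {"takes": ["c1"]}, "s2": {"takes": ["c2"]},
        "c1": {"taught_by": "p1"}, "c2": {"taught_by": "p2"},
        "p1": {"rank": "full professor"}, "p2": {"rank": "lecturer"},
    }

    def build_path_index(student_loids):
        idx = defaultdict(set)
        for loid in student_loids:
            for course in OB[loid]["takes"]:        # multi-valued hop
                prof = OB[course]["taught_by"]      # single-valued hop
                idx[OB[prof]["rank"]].add(loid)     # end value -> start loids
        return idx

    idx = build_path_index(["s1", "s2"])
    print(idx["full professor"])                    # {'s1'}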

13.2.4 Framework Support for QP and QO

There are a number of features in our framework that supplement the query processing and optimisation of algebraic queries. Clearly the algebraic operands, their supporting procedures and the object base are fundamental. While executing queries we have utilised the data type signature of the underlying ranges found in the schema encoding, these being CLASS, STRUCTURE or ALGQUERY instances. Part of an algebraic operator's execution includes adding properties to the query instance that are derivable from the ranges' details; for example, the query instance's data-type signature.

There are other features that our framework can provide to enable better query processing and execution. Firstly, there are a number of properties attached to ranges, for example a class instance's primary-key constraint set, which can be moved to the query instance. The idea of moving the integrity constraints is not their enforcement at the query instance level but to give the query optimiser an insight into the data properties of the query instance's extent. For example, if the range of a select query has a primary key set defined on it then there is no need to run the duplicate detection and removal effort. Not all constraints can be moved to a query instance.

A second support is to introduce features into our framework that are useful for query processing.

For example, we have mentioned earlier that CBOs rely on database statistics to prune the search space. This is really required when we have a long sequence of product operations and long-winded select predicates. The database statistics can either be part of a range definition, for example part of a class instance definition, or part of a built-in statistical data dictionary.

13.2.5 QP and QO – Reduction of Products

One of the most often cited advantages expected of query optimisation in object-oriented queries is the reduction in the cost of joins. For example, consider the following query construct in OQL which returns the set of names of students taught by a full professor:

select distinct x.sname
from students as x, x.takes as y, y.taught_by as z
where z.rank = "full professor";

A canonical translation of this into our algebra follows (note the abbreviated naming of our operands); it is characterised by two products, three selects and a project. Clearly this is not what is anticipated.

pv( sv( sv( xv( sv( xv( student, section ),
                    epop( ipe, takes, loid, section, loid ) ),
                professor ),
            epop( ipe, taught_by, loid, professor, loid ) ),
        eqop( ipe, rank, string, "full professor", string ) ),
    [sname] ).

How can we reduce the number of products and selects? An effective technique is to encode parts of the OQL query into what is called a connection graph, where each "from" expression and "where" conjunct is depicted as a node. The graph's edges represent the bindings between the nodes or to literals (e.g. in the where clause). During the building of the connection graph additional data from the underlying framework is extracted and shown as shaded labels (e.g. implicit class names and a property's cardinality constraint). The following build-up represents the connection graph for the above OQL query.

It is relatively easy to identify any leading node (one that has no incoming edge), and in this case it is STUDENT. If all upper limits of the connections are '1' except the last, then following the path to an end node (e.g. STRING="…") generates the following path expression: X.TAKES.TAUGHT_BY.RANK = STRING("…"). Now we can re-write the original OQL as:

select distinct x.sname
from students as x
where x.takes.taught_by.rank = "full professor";


This has now been reduced to a select, albeit with logical identifier chasing, and a project.

[Figure 13.2 shows the connection graph built in two stages: first the nodes (student, section, professor, and the string literal "full p"), then the edges labelled with the bindings x, y and z, the properties takes, taught_by and rank, and the cardinality constraints 0:* (student to section) and 1:1 (section to professor).]

Figure 13.2: Building a connection graph for an OQL query

If in a path there is an internal node with an upper limit of more than '1', as is actually the case in figure 13.2, then the path needs to be broken into two. (This is a limitation of OCL's use of paths – see section 13.1.3.) The OQL select statement follows:

select distinct x.sname
from students as x, x.takes as z
where z.taught_by.rank = "full professor";

The connection graph can tell us a number of things: for example, if the graph is made of two disconnected networks then there really is a Cartesian product; if there is a directed loop in a path expression then recursion is possible. Since our algebra does not support recursion, there are issues to resolve prior to executing such a query.
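These checks lend themselves to a compact implementation; the following Python sketch uses an invented edge representation carrying the upper cardinality bound of each hop, and is illustrative rather than the framework's actual code.

    # Connection-graph checks: leading nodes have no incoming edge; a
    # chain is collapsed into a path expression, and a '*' upper bound on
    # an internal node marks where the path must be broken (figure 13.2).
    EDGES = [("student", "takes", "section", "*"),        # 0:* upper bound
             ("section", "taught_by", "professor", "1")]  # 1:1 upper bound

    def leading_nodes(edges):
        sources = {s for s, _, _, _ in edges}
        targets = {t for _, _, t, _ in edges}
        return sources - targets

    def collapse(edges, start):
        out = {s: (p, t, u) for s, p, t, u in edges}
        path, node = [], start
        while node in out:
            prop, node, upper = out[node]
            path.append(prop)
            if upper == "*" and node in out:     # internal node, bound > 1
                print("break the path after:", ".".join(path))
        return ".".join(path)

    print(leading_nodes(EDGES))          # {'student'}
    print(collapse(EDGES, "student"))    # takes.taught_by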

Another level of path expression handling from a connection graph is the consideration of the physical indexes present. For example, if an index exists for STUDENT.TAKES.TAUGHT_BY then during query optimisation the path expression is broken into a conjunction of two: X.TAKES.TAUGHT_BY AS Y AND Y.RANK.

13.2.6 QP and QO – View Materialisation

Materialised views are data views and query results that are stored [VALDU87]. Furthermore, current DBMSs have facilities to keep a materialised view up to date or to completely refresh it on demand. Dependencies can exist between materialised views, as a view can be defined in terms of other views. In that case query processing does not need to re-compute the dependent views as these are already stored; consequently query processing time and disk bandwidth are drastically reduced. Query optimisation by materialised views is popular in OLAP and data warehousing queries. The main issue of course is what, when, where and how to materialise views [HARIN96].


In our paper, Vella and Musu [VELLA00], we presented a break-even analysis between materialising and re-querying over a sample of data warehouse queries.

In our framework query results are materialised. It is natural then to think about using this aggressive optimisation technique when optimising our algebraic queries.

A popular OLAP query is rollup, i.e. applying consecutive summaries onto a data cube. The dimensions of a data cube are the partitioning attributes of the aggregate operator. The content of each cell in the cube is the result of the aggregating function. Consider the following rather simple cube whose dimensions are customer, product, and region. The content of each cell is the sum of all sales relative to CUSTOMER, PRODUCT, and REGION recorded in the previous financial year. Assume we had a large number of transactions and that, as is typical, the cube is sparse (i.e. not all combinations exist). The cube is built by the following aggregate query:

gv(sales,"deepextent",[customer,product,region],[amount],[sum]).
-- remark: the algquery instance is identified by gv(sales,"deepextent",8811)

Assume a rollup is requested and the following is submitted:

gv(sales,"deepextent",[product,region],[amount],[sum]).

The query instance does compute the proper answer, but all the source data, i.e. in SALES, is re-visited. This is not necessary at all because the following query, whose range is changed to the result of the first query, computes the same answer:

gv( gv(sales,"deepextent",8811), "deepextent", [product,region], [sumamount], [sum]).

To realise this optimisation technique the aggregate operator needs to do the following: first, find an instance of ALGQUERY whose identifier functor unifies with GV(SALES,"DEEPEXTENT",?_); second, if one is found, check whether the current partitioning list of properties is a subset of that instance's GROUPBY partitioning list; third, if so, check whether the attribute and aggregating function match (i.e. examine the properties AGGOUT and FUNCAGGOUT). If all three checks are positive then enough is at hand to rewrite the second query above into the third.
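The three checks are mechanical, as the following Python sketch shows; the record layout standing in for an ALGQUERY instance and all names are invented for illustration.

    # Rewriting a rollup against a materialised aggregate: the three
    # checks described above, over a stand-in for an ALGQUERY instance.
    MATERIALISED = {
        "id": ("sales", "deepextent", 8811),
        "groupby": {"customer", "product", "region"},
        "aggout": "amount",
        "funcaggout": {"sum"},
    }

    def rewrite_rollup(target, extent, groupby, attr, funcs):
        m = MATERIALISED
        if m["id"][:2] != (target, extent):       # 1. identifier unifies?
            return None
        if not set(groupby) <= m["groupby"]:      # 2. partitions a subset?
            return None
        if attr != m["aggout"] or not set(funcs) <= m["funcaggout"]:
            return None                           # 3. attribute/function match?
        return ("gv", m["id"], groupby, attr, funcs)  # range is the stored view

    print(rewrite_rollup("sales", "deepextent",
                         ["product", "region"], "amount", ["sum"]))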

Therefore on the first query we can use the result to compute seven other partitions; i.e. from {a,b,c} we can materialise the partitions {a,b}, {a,c}, {b,c}, {a}, {b}, {c}, { }. If view materialisation is structured it is best to stop at the first level (i.e. the two-element sets), and compute the second (i.e. the one-element sets) from the first. The same procedure applies for the third level.

In our previous paper [VELLA00] it was found that widening the aggregate function list is generally useful and gives the materialised view higher reusability. In our previous example, if other than the aggregate function sum we compute in parallel count, max, and min, then the materialised view can work out all of these and the average too (as the average is derived from the sum divided by the count).

There are other view materialisation techniques and all are relatively easy to implement. One design problem is to ascertain whether queries are monotonically increasing; when they are not, view materialisation might not be possible or might not be the best indicated technique.

13.2.7 QP and QO – Simplification of the Select Predicate

In the presentation of the select operator we commented on the structure of its predicate. We also delayed the manipulation of the predicate even when we stated its properties, e.g. the commutativity of conjuncts. The justification is that the number of possible equivalent rewrites expands quickly. In fact the basic technique is to write the predicate in a standard form (conjunctive normal form (CNF) is preferred), and then the predicate tree is traversed and certain patterns, if present, are developed; that is, these patterns become unmovable while the others can be juxtaposed. The pruning of the search space can also be aided by database statistics (e.g. conjuncts with high selectivity are given priority to compute first) and by knowledge of physical artefacts (e.g. the presence of an index that covers a conjunct).

While traversing the predicate tree, which by now is in CNF, one can test for patterns that do not need access to the object collection to compute. For example, if the property SNAME is a single-valued string then the conjunction of the two conjuncts SNAME='JACK' and SNAME='RICK' is always false.

Another optimisation method which can be applied at this point falls under semantic optimisation. In our framework we have constraints attached to CLASS and STRUCTURE instances. Two examples are the not null and the check constraints. In the case of a select, both of these constraint instances related to the select range remain invariant and can be copied to the query instance, much as we copy the data-type signature.


Say SNAME has a not null constraint; then any part of the predicate of the form SNAME IS NOT NULL is always true. On the contrary, a conjunct of the form SNAME IS NULL is always false. In either case the predicate can be simplified before evaluating any data.

Similarly, say we have a check constraint on SNAME that insists that the string has to be upper case; then any comparison of SNAME with a string that is not in upper case is false. Again there is no need to look at any of the target collection.
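A minimal sketch of this constraint-driven simplification follows; the predicate encoding and constraint sets are invented, and only the three patterns just discussed are covered.

    # Simplify CNF conjuncts against schema constraints before touching data.
    NOT_NULL = {"sname"}       # assumed: sname carries a NOT NULL constraint
    UPPERCASE = {"sname"}      # assumed: a check constraint forces upper case

    def simplify(conjuncts):
        """Conjuncts are (kind, property, literal) triples in CNF."""
        eq = {}
        for kind, prop, lit in conjuncts:
            if kind == "is_null" and prop in NOT_NULL:
                return "FALSE"              # contradicts NOT NULL
            if kind == "eq":
                if prop in UPPERCASE and lit != lit.upper():
                    return "FALSE"          # can never pass the check constraint
                if prop in eq and eq[prop] != lit:
                    return "FALSE"          # e.g. SNAME='JACK' AND SNAME='RICK'
                eq[prop] = lit
        # tautologies such as SNAME IS NOT NULL can simply be dropped
        return [c for c in conjuncts
                if not (c[0] == "is_not_null" and c[1] in NOT_NULL)]

    print(simplify([("eq", "sname", "JACK"), ("eq", "sname", "RICK")]))  # FALSE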

The following table, i.e. table 13.5, describes which constraints can be copied from range collections to query instances for each algebraic operator, and how. In some cases the constraint can be asserted even when not present in the algebraic operator's operand; e.g. the not null constraint in the aggregate operator.

Algebraic Operator      | Not null constraint (pre / action / post condition) | Check constraint, row level (pre / action / post condition)
------------------------|-----------------------------------------------------|------------------------------------------------------------
union                   | if in both classes / ok / none                      | if in both classes / ok / none
diff                    | if in both classes / ok / none                      | if in both classes / ok / none
product                 | none / ok / none                                    | none / ok / none
project                 | if in project list / ok / none                      | if in project list / ok / none
select                  | none / ok / none                                    | none / ok / none
aggregate (on grouping) | none / ok / if not present assert for each attr     | none / ok / none
aggregate (result)      | none / none / assert                                | none / none / none
map (1 to 1)            | none / ok / none                                    | none / none / none
nest (on grouping)      | none / none / assert for each attr                  | none / ok / none
nest (result)           | none / none / assert for each attr                  | none / none / none
unnest (on repeating)   | none / ok / if not present assert for each attr     | none / ok / none
unnest (result)         | none / none / assert for each attr                  | none / none / none

Table 13.5: Moving not null and check constraints to ALGQUERY instances

13.2.8 QP and QO – Duplicate Elimination

During the description of the algebraic operators the duplicate detection and removal procedure was required in many operators, such as the project and select operators. Duplicate detection and removal is a computationally expensive operation. This need not always be the case.

Consider a select operation over a class instance STUDENT which has a primary key set on the property STUDENT_ID. The select predicate does not change the value of any property, and therefore any result of the select operation still respects the primary key constraint. Consequently there is no need to run duplicate detection and removal on the select result set if a primary key set is present.

To introduce this optimisation into our framework the following additions are needed: first, the copying of integrity constraints, according to table 13.6 below for each algebraic operator, from the target class instance into the relative query instance; this could be done with the instantiation of reification rules. Second, a check on whether the duplicate procedures need calling; this could be done in the operator logic or in the duplicate procedures themselves (a sketch of such a guard is given below, after table 13.6).

Algebraic Operator      | Primary key & candidate key constraint (pre / action / post condition)
------------------------|------------------------------------------------------------------------
union                   | none / none / none
diff                    | none / none / none
product                 | if exists in both targets / none / concatenate both pks to assert new pk
project                 | if in project list / ok / none
select                  | none / ok / none
aggregate (on grouping) | none / none / assert pk of all attr
aggregate (result)      | none / none / none
map (1 to 1)            | if map is 1:1 / ok / none
nest (on grouping)      | none / none / assert pk of all attr
nest (result)           | none / none / none
unnest (on repeating)   | none / none / assert pk of all attr
unnest (result)         | none / none / none

Table 13.6: Moving primary key and candidate key constraints to ALGQUERY instances

In table 13.6 the pre-condition, action, and post-condition for copying the primary key set and candidate key are given. It is important to note that the union and difference operators remain impervious to this optimisation. It can be remarked, though, that if the union compatibility is based on self and sub-type and a primary key set is present in the parent class then duplicate elimination avoidance is, in some cases, possible.

More elaborate semantic optimisation is possible in the presence of a functional dependency. An example in section 7.4.5 of the integrity constraints chapter shows how a query predicate demanding a product and a select operation is reduced, in the presence of a proper functional dependency, to a select operation only.

13.2.9 Query Processing and Optimisation

Query optimisation still follows the seminal work of Selinger et al. [SELIN78], with cost-based optimisation and search space pruning as its distinctive features. We have indicated works and studies that extend Selinger's work into object-oriented databases too, and we have shown how our algebraic expressions can be built around this model. Keeping the details of a query in a query instance makes some sophisticated optimisation techniques easier to apply and less likely to be missed; for example, in materialising aggregates, keeping the partitioning and aggregate function details makes the technique easy to identify and use. Furthermore, we have shown that our framework not only integrates with best practices in query optimisation but is also capable of implementing specialised optimisation techniques: these include view materialisation, join reduction, and semantic optimisation. Another important aspect was the recognition of physical artefacts, for example index systems, by the framework, through which optimisation can be enhanced and improved.

In this section we also gave details on how to move constraints from schema features into query instances. The tables, i.e. 13.5 and 13.6, are detailed and offer an interesting array of possibilities for optimisation.

Implementations of these routines are all in Flora-2 and form part of our framework. Furthermore, we assert that the computational cost of our query optimisations is indeed marginal as they involve only the schema rather than the actual object ranges of the query expression.

13.3 Summary

An important declarative query language for object databases is ODMG OQL. The last published version, i.e. version 3, is based on a long history and has many features. In this chapter we gave a qualitative study of how OQL declarative constructs can be converted into an expression composed of our algebraic operators through a procedure we outlined. Although the procedure is naive, some translation tricks are introduced to make it more effective. Also, a comparison of ODMG's query model and ours was undertaken; the results are positive. Nonetheless a number of ODMG features are missing; these omissions are mostly due to time constraints rather than any difficulty of implementing them in our query model. The most notable missing features are data type constructors, e.g. the multi-set, and the OQL having clause in group by.

Once a mapping from declarative OQL to our procedural constructs is made, our algebraic expressions can be manipulated to find an expectedly lower-cost access method. This query optimisation is essential, and it is expected that traditional methods are implementable in our framework. In this section we also showed examples of aggressive and specialised techniques in query optimisation such as semantic optimisation and view materialisation. Also, the use of data model constructs like integrity constraints to address computationally expensive operations was explained. In fact this is possible when a query instance inherits not only the data type signature but also the constraints found in its operands.

This chapter has shown the importance of integrating more tightly the translation from conceptual schema design right up to query processing and optimisation. For example, data type signatures inferred in the schema mapping are used in query optimisation. Another example is that the primary-key declaration made at the EERM level is still in use during query optimisation. The push for object homogeneity helps organise development and the dependencies between a system's components, even if these are at a schema level. Also, the technique of recording at a meta level the details of the operations that converted or created an object is effective and makes application development easier to manage. An example is query optimisation with view materialisation, which uses the details of the original query to compute other, related queries with a lighter footprint.

Chapter 14

Conclusions

14 – Conclusions

The efficacy of our goal to tightly couple conceptual design, object-oriented data modelling and object-oriented query modelling is best seen in chapter thirteen, i.e. in query processing and optimisation. Firstly, a declarative query is translated into an algebraic equivalent. Secondly, although the generic optimisation examples are in join reduction, view materialisation and semantic rewriting, in each case there is an important input from the data modelling constructs during the query optimisation phase. Thirdly, our decision to materialise a query into an object, with its result set an instance of the query object created, allows the query's instances to retain any applicable data modelling constructs from the target set, e.g. the primary key constraint. Therefore the algebra not only has closure but, if the result set is in turn involved in another query, the optimisation can still consider any of the 'inherited' constraints.

The query model's effective dependence on the data model is a result of the careful and viable design decisions taken when surveying database technology and object-oriented themes and variations. Furthermore, the query model has strong object-oriented features too: e.g. a query is an object and has a class role too, a query has a data type signature for its extent, and a query operates on either the deep or shallow extent of its target set. Finally, the query model has both declarative and procedural languages.

The pervasive framework developed with Flora-2 integrates conceptual design, data and query modelling, and data typing and inference. The framework is not only indispensable for translating from one 'language' into another, but also very good at aiding development. For example, when translating from the conceptual to the logical model a framework method determines how best to convert a conceptual n-ary relationship into its proper set of constraints in the logical model. Finally, the framework was used to make good any shortages in other structures or tools. For example, Flora-2 allows data type specifications in a program but these are not type checked on a program's evaluation; consequently data type checking and inference had to be coded and invoked in the framework.


Development of the framework with Flora-2 has another important facet: each method developed, e.g. the algebraic operators, is coded in declarative constructs. This effectively makes the semantic description stronger.

14.1 Strengths and Weaknesses

The strength of this work, in summary, lies in the use of the object-oriented paradigm to construct a tightly-coupled data and query model for an object-oriented database. A query is expressed in either a declarative or a procedural form; the latter uses our specially constructed algebra. Our framework manages, controls, and is able to improve aspects of the development of an object-oriented database. Furthermore, we see our framework as a basis for other DBMS-related facilities; for example, cost-based query optimisation.

There are a number of weaknesses that need addressing and the following points elucidate a few:

Firstly, a query's result set is currently of a homogeneous data type even if the query's target set is the deep extent of a class. To take up heterogeneous result sets, all of our algebraic operators would need to be revised, and it is expected that extensive run-time data type checking and inference would need to be implemented. Most of these techniques have been developed and verified only in very recent years, i.e. post 2005.

A related second weakness deals with the limited range of available data type constructors, i.e. tuple, set and list, in our data and query model. The techniques used here, for example F-bounded polymorphism in type checking lists and ISA, are well known and had been developed and proven effective by the year 2000. An example of a missing data type constructor is the bag (i.e. multi-set). Again, this entails a major rewrite of all algebraic operators to handle the different semantics of each data type constructor.

The third weakness is found in the query optimiser; as no cost-based model is implemented it is impossible to give quantitative cost estimates of query plans.

The fourth weakness is the total absence of a transaction model.

A fifth weakness, related to the EERM, is that its more advanced constructs require extensive support by transitional constraints. The research community, in conceptual design and DBMS facilities, should address this too.

A sixth weakness is the tenable perception that materialising the query result is of no effective benefit, indeed pointless, because of its currency and the computationally heavy copying of result sets. As much as we accept this criticism, we repeat our argument that for some domains of discourse this technique is useful.

14.2 How have the tools fared

The simplest tool used was GraphVIZ. The package is flexible and neat; it also has a gentle learning curve. There was no requirement of ours that GraphVIZ did not cater for.

EyeDB, an OODBMS, was a pleasant surprise and it does work. Even though its implementation is not 100% compliant with the ODMG standard it still offers a good spectrum of ODMG and OODB facilities. It does, however, lack some of the auxiliary tools one expects.

Flora-2 is a very good system. Its user interface, other than its CLI, is in Emacs. We feel that this project, if proof was ever needed, shows that F-logic has an interesting and effective set of features and that the Flora-2 implementation offers designers and developers facilities that enable large-scale projects to be realised. An extensive list of features has been used to implement our framework: identity equality, reification of rules and their evaluation, parsing a string and evaluating it on the knowledge base, path expressions, complex identifier structures, and updating the knowledge base. Nonetheless it does have a learning curve.

14.3 Future Additions

There are many points that with hindsight could have been done better. The following is a list of what the next efforts should be or what has to be addressed.

 EERM drawing has to resolve some diagrammatic issues (e.g. aggregation);
 EERM encoding in the framework should add some heuristic checks, for example to attempt flagging connection traps;
 For the EERM to ODL mapping to achieve higher completeness we need to encode the transitional actions and constraints associated with sophisticated design requirements (e.g. aggregation) in the framework, and to have facilities in ODL (e.g. triggers and view update mechanisms);
 The framework needs sophisticated methods to check consistency and redundancy within the integrity constraints collection;
 The framework needs a rewrite of the procedures involving path expressions;
 Better and cleaner use of 'self' and 'null' in our framework;
 EyeDB – addition of data dictionary views (otherwise the automated conversion of OQL queries to algebra is hampered);
 EyeDB – addition of the equivalent of SQL's "ALTER TABLE" commands;
 Data type check the schema objects too;
 The algebra should handle and output heterogeneous sets, and then bags of objects, and consequently use the better data typing rules which have surfaced in the last ten years;
 The algebra needs to align closer to more OQL features (e.g. the select predicate and the OQL where clause);
 QP-QO needs an abstraction of physical database operations and artefacts (e.g. indexes for path expressions);
 QP-QO in the framework needs to build an object database statistics profile which is consequently used as a basis for cost-based optimisation;
 QP-QO simpler techniques that are known to be popular and cost effective should be introduced;
 The framework could be adapted to 'export' to other data modelling languages, including the graph and document based models found in some NoSQL systems.

Appendix – Bibliography

ABADI91 Abadi, M.; Cardelli, L.; Pierce, B.; and Plotkin, G.; "Dynamic Typing in a Statically Typed Language", ACM Transactions on Programming Languages and Systems, V(13)2, 1991, pp.237-268

ABITE89 Abiteboul, S.; Kanellakis, P., C "Object Identity as a Query Language Primitive", Proceed. of the 1989 ACM SIGMOD, Oregon, SIGMOD REC, Vol. 18, No. 2; 1989; p.p.159-173

ABITE89B Abiteboul, S.; Kanellakis, P., C.; "Object Identity as a Query Language Primitive", ACM SIGMOD1989, Oregon, SIGMOD REC, Vol. 18, No. 2, 1989, Ed. Clifford, J.; Lindsay, B.; Mair, D.; pp.159-173

ABITE93A Abiteboul, S.; Beeri, C.; "On the Power of Languages for the Manipulation of Complex Objects", I.N.R.I.A., Cedex France and Department of Computer Science, The Hebrew University, Israel, Technical Report Feb., 1993 Second Edition, p1-76

ABITE95 Abiteboul, S.; Hull, R.; Vianu, V.; "Foundations of Databases", Addison-Wesley, 1995, pp. 685

ABITE95B Abiteboul, S.; den Bussche, J., V.; "Deep Equality Revisited", Proceed. of Conf. on Deductive and Object-Oriented Databases - DOODs,(3rd), 1995, Singapore, 1995, pp.213-228

AGRAW08 Agrawal, R.; et al, (2008), "The Claremont report on database research", ACM SIGMOD Record, Volume 37, Issue 3 (September 2008) Pages 9-19

AHOAL79 Aho, A., V.; Ullman, J., D.; "Universality of Data Retrieval Languages", Conf. Rec. of the Sixth Annual ACM Symposium on Principles of Programming Languages, San Antonio, Texas, 1979, pp. 110-120

AHOAV79 Aho, A, V; Beeri, C; Ullman, J, D; The theory of joins in relational databases, ACM Tran. On Database Systems, v4(3), 1979, pp. 297-314

ALAGI99 Alagic, S.; "Type-Checking OQL Queries In the ODMG Type Systems", (2000), ACM Transactions on Database Systems, Vol. 24, No. 3, September 1999, Pages 319-360

AMERI87 America, P.; "Inheritance and Sub-typing in a Parallel Object-Oriented Language", ECOOP'87, France, LNCS 276, Springer-Verlag, 1987, pp.234-242

AMERI90B America, P; "Designing an object-oriented programming language with behavioural subtyping", Foundation of Object-Oriented Languages, Rex School/Workshop, The Netherlands,1990, SV-LNCS#489, pp.60-90

ANDRE90 Andrews, T; Harris, C; Duhl, J; The Ontos Object Database, TR, Ontologic Inc, 1990

ANSIT86 ANSI, "SQL Standard - INCITS/ISO/IEC 9075-86 ", ANSI, 1986


AQUIL97 Aquilino,D.; Asirelli, P.; Renso, C.; Turini, F.; "Applying Restriction Constraints to Deductive Databases", Ann. Math. Artif. Intell. V19(1- 2):, 1997, pp.3-25

ATKIN87 Atkinson, M; Buneman, P; Peter, O; "Types and Persistence in Database Programming Languages", ACM Computing Surveys, V(19)2, 1987, pp.105-190

ATKIN90 Atkinson, M.; DeWitt, D.; Maier, D.; Bancilhom, F.; Dittrich, K.; and Zdonik, S.; "The Object-Oriented Database System Manifesto", Proceed. of Conf. on Deductive and Object-Oriented Databases - DOODs,(1st), 1989, pp.223-240

ATKIN92 Atkinson, M., P.; Bancilhon, F.; DeWitt, D., J.; Dittrich, K., R.; Maier, D.; Zdonik, S., D.; “The Object-Oriented Database System Manifesto”, ‘Building an Object-Oriented Database System, The Story of O2’, 1992, pp. 3-20

ATKIN95 Atkinson, M.; Morrison, R.; “Orthogonally persistent object systems”, The VLDB Journal, V(4) # 3, 1995, pp. 319-402

BAKER97 Baker, Sean; "CORBA: Distributed Objects Using Orbix", ACM Press, Addison-Wesley, 1997

BANCI85 Bancilhon, F.; "A note on performance of rule based systems", Technical Report MCC, 1998

BANCI86 Bancilhon, F.; Maier, D.; Sagiv, Y.; Ullman, J., D.; "Magic sets and other strange ways to implement logic programs", ACM PODs, 1986, pp. 1-15

BARBI03 Barbier, F.; Henderson-Sellers, B.; "A Survey of UML's Aggregation and Composition Relationships", L'OBJET,V5(3/4); 1999

BATIN92 Batini, C.; Ceri, S.; Navathe, S.; "Conceptual database Design", Benjamin Cammings; 1992, pp. 470

BEERI90A Beeri, C.; "A Formal Approach to Object-Oriented Databases", Data & Knowledge Engineering, 5 (1990), North-Holland, pp.353-382

BEERI99 Beeri, C.; Thalheim, B.; "Identification as a primitive of database models",in "Fundamentals of information systems" Ed. Polle, T.; Ripke, T.; Schewe, K.-D., Kluwer, 1999

BENED08 Benedikt, M.; Koch, C.; "XPath Leashed", ACM Computing Surveys, V 41(1), 2008

BEREN95 Berenson, H.; Bernstein, P.; Gray, J.; Melton, J.; O'Neil, E.; O'Neil, P.; "A Critique of ANSI SQL Isolation Levels ";SIGMOD, 1995

BERLI90 Berlin, L.; "When Objects Collide: Experiences with Reusing Multiple Class Hierarchies", ACM - ECOOP/OOPSLA'90 Proceed., pp.181-193

BERTI00 Bertino, E.; Catania B.; Filippone, A.; "An index allocation tool for object-oriented database systems", Software Practice and Experience, V30, 2000; p.p. 973-1003


BERTI89 Bertino, E.; Won Kim; "Indexing Techniques for Queries on Nested Objects", IEEE Trans on Knowledge and Data Engineering, V(1)2, 1989, pp.196-214

BERTI91 Bertino, E., Martino, L.; "Object-Oriented Database Management Systems: Concepts and Issues", IEEE Computer, April, 1991, pp.33-47

BERTI94 Bertino E.; "Index configuration in object-oriented databases", VLDB Journal, 1994, V3(3), 1994; p.p. 355-399.

BIERM00 Bierman, G.; Trigoni, N.; (2000), " Towards a Formal Type System for ODMG OQL", University of Cambridge, Computer Laboratory, TR 497

BOBRO86 Bobrow, D., G.; Kahn, K.; Kiczales, G.; Masinter, L.; Stefik, M.; Zdybel, F.; "Common Loops: Merging Lisp with object-oriented Programming", OOPSLA'86 Conf. Proceed., ACM SIGPLAN, N21.11, 1986, pp.17-29

BONNE94 Bonner, A., J.; Kifer, M.; "An overview of transaction logic. Theoretical Computer Science", V133, 1994, p.p. 205-265

BOOCH00 Booch, G.; Jacobson, I.; Rumbaugh, J.; "OMG Unified Modeling Language Specification"; OMG consortium; Version 1.4; 2000

BOOCH99 Booch, G.; Rumbaugh, J.; Jacobson, I.; "Unified Modeling Language User Guide", Addison-Wesley, 1999

BUNEM96 Buneman, P.; Davidson, S., B.; Hillebrand, G., G.; Suciu, D.; “A Query Language and Optimization Techniques for Unstructured Data”; SIGMOD Conference 1996, pp. 505-516

BRACH83 Brachman, R.; "What is-a is and isn't?", IEEE Computer, V16(10), 1983, pp.80-93

CANNA93 Cannan, S., J.; Otten, G., A., M.; "SQL - The Standard Handbook; based on the new SQL standard (ISO 9075:1992(E))"; McGraw-Hill Book Co, 1993

CANNI89 Canning, P.; Cook, W.; Hill, W.; Olthoff, W.; Mitchell, J., C.; "F-Bounded Polymorphism for Object-Oriented Programming", Proceed. of Functional Programming, Languages and Computer Architecture, 4th Conf., England, 1989, pp.273-280

CARDE84 Cardelli, L.; "A Semantics of Multiple Inheritance", Semantics of Data Types, International Symposium, Sophia-Antipolis, France, June 27- 29, 1984, Proceeding Ed. G. Kahn, D.B. MacQueen, G. Plotkin; Springer-Verlag, LNCS#173, 1984

CARDE85 Cardelli, L., Wegner, P.; "On Understanding Types, Data Abstraction, and Polymorphism", ACM Computer Surveys, V17(4), 1985, pp.471- 522

CARDE88 Cardelli, L.; "A Semantics of Multiple Inheritance", Information and Computation, vol. 76, 1, 1988, p.p.138-164

CATTE00 Cattell, R., G., G.; Barry, D., K.; Ed., (2000), The Object Database Standard: ODMG-3,0, Morgan-Kaufmann Pubs., 2000

CATTE94 Cattell, R., G., G.; Ed., (1994), The Object Database Standard: ODMG- 93, Morgan-Kaufmann Pubs., 1994


CERIS89 Ceri, S.; Gottlob, G.; Tanca, L.; "What You Always Wanted to Know About Datalog (And Never Dared to Ask)", IEEE Transactions On Knowledge And Data Engineering, 1989, Vol. 1, No. 1, p.p.146-166

CHAMB81 Chamberlin, D., D.; Astrahan, M., A.; Blasgen, M., W.; Gray, J.; et al , "A History and Evaluation of System R",. Communication of the ACM 24(10), pp. 632-646, 1981

CHANG88 Chang, C.; "On the evaluation of queries containing derived relations in relational databases", In Gallaire et al. eds, Advances in Database Theory,v1,Plenum Press, 1981,pp.235-260

CHENP76 Chen, P.; "The Entity Relationship Model - Toward a Unified View of Data", Trans. on Database Systems, V(1).1, 1976, pp. 9-36

CHENW89 Chen, W.; Kifer, M.; Warren, D.,S.; "HiLog: A First-Order Semantics for Higher-Order Programming Constructs", 2nd International Workshop on Databases Programming Languages, 1989, p.p. 1090- 1114

CODDE70 Codd, E., F.; "A Relational Model of Data for Large Shared Data Banks", Comms of the ACM, V13 (6), 1970, pp. 377-387

CODDE72 Codd, E., F.; "Relational Completeness of Data Base Sublanguages", In: R. Rustin (ed.): Database Systems: Prentice Hall, 1972, pp. 65-98

COLME96 Colmerauer, A.; Roussel, Ph.; "The birth of Prolog", in History of programming languages, 1996, ACM / Addison Wesley

COMER79 Comer, D.; (1979), "Ubiquitous B-Tree", ACM Computing Surveys, V 11(2), p.p.121-137

COOKW92 Cook, W., R.; "Interfaces and Specifications for the Smalltalk-80 Collection Classes", OOPSLA'92 Conf. Proceed., Canada, ACM SIGPLAN, N27(10), pp.1-15.

COOPE06 Cooper, E.; Lindley, S.; Wadler, P.; Yallop, J.; “Links: web programming without tiers”, in the proceedings of Formal Methods for Components and objects 2006, LNCS 4709, pp. 266-296.

DAHLO66 Dahl, O.-L.; Nygaard, K.; "SIMULA - An Algol-based Simulation Language", Communications of the ACM, v9, 1966, n9, pages 671-678

DANFO88 Danforth, S.; Tomlinson, C.; "Type Theories and Object-Oriented Programming", ACM Computer Surveys, V(2)1, 1988, pp.29-72

DASSK90 Das, S, K; Willians, M, H; 'Extending integrity maintenance capability in deductive databases', In Proceed. Of the UK ALP-90 Conference, England, 1990, pp.75-111.

DELOB95 Delobel, C.; Lécluse, C.; Richard, P.; "Databases - from relational to object-oriented systems"; International Thomson, 1995, pp. 382

DEUXO91 Deux, O.;"The O2 system", Cattell, R., G., G.; Ed. "Next-Generation Database Systems", Comms. of the ACM, 1991, V(34)10, pp. 30-33


DINNA95 Dinn, A.; Paton, N., W.; Williams, M., H.; Alvaro A. A.; Barja, F., M., L.; "The Implementation of a Deductive Query", in Deductive and Object-Oriented Databases, Fourth International Conference, DOOD'95, 1995

ELMAS10 Elmasri, R.; Navathe, S., B.; "Fundamentals of Database Systems", Addison-Wesley, 2010, 6th edition, pp. 1200

ELMAS85 Elmasri, R.; Weeldreyer, J., A.; Hevner, A., R.; "The Category Concept: An Extension to the Entity-Relationship Model"; Data Knowledge Engineering; V1(1); 1985, pp. 75-116

ENDER72 Enderton, H., B.; A Mathematical Introduction to Logic, Academic Press, 1972

FROHN94 Frohn, J.; Lausen, G.; Uphoff, H.; "Access to Objects by Path Expressions and Rules"; VLDB, 1994; pp. 273-284

GALLA78 Gallaire, H.; Minker, J.; (Ed.s); "Logic and Data Bases (Symposium on Logic and Data Bases)", Plenum Press, 1978

GARDA95 Gardarin, G., Gruser, J., R.; Tang, Z., H.; (1995), "A cost model for clustered object-oriented databases", Proceed. of the VLDB, p.p. 323- 334.

GOLDB88 Goldberg, A.; Robson, D.; "Smalltalk-80: the language and its implementation", Addison-Wesley, 1988

GOSLI05 Gosling, J.; Joy, B.; Steele, G., L.; Bracha, G.; "The Java Language Specification", 3rd ed., Addison-Wesley, 2005

GRAEF09 Graefe, G.; (2009), "The Five-Minute Rule 20 Years Later: Revisiting Gray and Putzolu's famous rule in the age of Flash", Comms. of the ACM, V52(7), p.p. 48-59

GRAYJ92 Gray, J.; Reuter, A.; "Transaction Processing: Concepts and Techniques", Morgan Kaufmann. 1992

GUTTA80 Guttag, J.; "Notes on Type Abstraction", IEEE Tran. On Software Engineering, 1980, V6(1), pp.12-23

HAMMA80 Hammer, M., M.; Zdonik, S., B.; "Knowledge based query processing", In Proc. of VLDB 1980; pp. 137-147

HARIN96 Harinarayan, V.; Rajaraman,A.; Ullman, J., D.; "Implementing data cubes efficiently";. In: Proc. of SIGMOD, 1996; pp. 205-216,

HEIMB85 Heimbigner, D.; McLeod, D.; "A Federated Architecture for Information Management",ACM Trans. Information Systems, V3(3), 1985, pp. 253-278

HEJLS10 Hejlsberg, A.; Torgersen, M.; Wiltamuth, S.; Golde, P.; "The C# Programming Language", 4th edition, Addison-Wesley, 2010, pp. 864

HEWIT73 Hewitt, C.; Bishop, P.; Steiger, R.; "A Universal Modular ACTOR Formalism for Artificial Intelligence", IJCAI 1973, pp. 235-245

HUANG11 Huang, S., S.; Green, T., J.; Loo, B., T.; "Datalog and Emerging Applications: An Interactive Tutorial", ACM SIGMOD'11; 2011 ,


HULLR87 Hull, R.; King, R.; "Semantic Database Modelling: Survey, Applications, and Research Issues"; ACM Computing Surveys, V.19( 3), 1987, pp. 201-260

IBARK84 Ibaraki, T.; Kameda, T.; "On the Optimal Nesting Order for Computing N-Relational Joins", ACM Transactions on Database Systems, V9(3), 1984, p.p. 482-502

IBMAC93 IBM, "IBM Dictionary of Computing", McGraw-Hill, 1992, pp. 758, (Available online)

INMON02 Inmon, W., J.; "Building the Data Warehouse"; Wiley, 2002, pp. 356

IOANN96 Ioannidis, Y. I.; (1996), "Query optimization", ACM Computing Surveys, V28(1), pp. 121-123

JAMIL92 Jamil, M., H.; Lakshmanan, L., V., S.; "ORLOG: A Logic For Semantic Object-Oriented Models", in Proc. 1st Int. Conference on Knowledge and Information Management, 1992, p.p. 584-592

KACIA94 Ait-Kaci, H.; Podelski, A.; Smolka, G.; "A Feature Constraint System For Logic Programming With Entailment", Theoretical Computer Science, 1994, Vol.122, No.1-2, p.p.263-283

KANNE92 Kanellakis, P., C.; Lécluse, C.; Richard, P.; "Introduction to the Data Model. Building an Object-Oriented Database System", in "Introduction to Object-Oriented Database Systems", Morgan Kaufmann,1992, pp. 61-76

KENTW91A Kent, W; "Important Features of IRIS OSQL", Computer Standards & Interfaces, 1991,V(13)1-3, pp.201-206

KHOSH86 Khoshafian, S., N; and Copeland, G., P.; "Object Identity", ACM - OOPSLA'86 Proceed., pp.406-415

KIFER90 Kifer, M.; Wu, J.; (1990), "A First-Order Theory of Types and Polymorphism in Logic Programming", Department of Computer Science, SUNY at Stony Brook, TR 90/23

KIFER92A Kifer, M.; Won Kim; Sagiv, Y.; "Querying Object-Oriented Databases (Extended version)", ACM SIGMOD Conf. on Management of Data, San Diego, CA, 1992, pp.393-402

KIFER95 Kifer, M.; Lausen, G.; Wu, J.; "Logical Foundations of Object-Oriented and Frame-Based Languages", Journal of the ACM, May, 1995, v.38 n.3, p.p. 619-649

KIMWO89 Kim, W.; Kim , K.-C.; Dale, A.; (1989), "Indexing techniques for object- oriented databases", in Object-oriented concepts, databases, and applications, ACM Press, New York, NY, 1989

KIMWO89G Kim W.; Ballou, N.; Chou, H.-T.; Garza, J., F.; "Features of the ORION Object-Oriented Database System"; in Won K.; Lochovsky, F., H.; Ed.s, "Object-Oriented Concepts, Databases and Applications", ACM Press, Addison-Wesley, 1989; p.251-282

KIMWO90C Won Kim; Garza, J.F.; Ballou, N.; Woelk, D.; "Architecture of the ORION next-generation database system", IEEE Transactions on Knowledge and Data Engineering, V(2)1, March 1990, pp. 109-24


KIMWO90D Won Kim; "Introduction to object-oriented databases", Massachusetts Institute of Technology, 1990

KOWAL74 Kowalski, R.; "Predicate Logic as Programming Language", in Proceedings IFIP Congress, Stockholm, p.p. 569-574, 1974

KOWAL79 Kowalski, R., A.; "Algorithm = Logic + Control", Communication of the ACM, v22(7), 1979, pp.424-435

KOWAL87 Kowalski, R; Sadri, F; Soper, P; "Integrity Checking in Deductive Databases:, Proc. Of the 13th VLDB, England, 1987, pp. 61-69

KUMAR87 Kumar, A.; Stonebraker, M.; (1987), "The effect of Join Selectivities on optimal nesting order", SIGMOD Record, V16(1), p.p. 28-42

LAUSE97 Lausen, G.; Vossen, G.; "Models and Languages of Object-Oriented Databases"; Addison-Wesley, 1997, pp. 218

LELLA01 Lellahi, K.;. Zamulin, A.; "Object-Oriented Database as a Dynamic System With Implicit State", Proceed. 5th ADBIS Conference, Vilnus, Lithuania, 2001, LNCS 2151; pp. 239-252

LEUNG93 Leung, T., W.; Mitchell, G.; Subramania, B.; Vance, B.; Vanderberg, S., L.; Zdonik, S., B.; "The AQUA Data Model and Algebra"; Technical Report No. CS-93-09, Brown University; 1993.

LIEBE86 Lieberman, H.; "Using Prototypical Objects to Implement Shared Behaviour in Object-Oriented Systems", OOPSLA'86 Conf. Proceed., USA, ACM SIGPLAN N21(11),1986, pp.214-223

LISKO87 Liskov, B.; "Data Abstraction and Hierarchy", OOPSLA 1987 Addendum to the Proceedings, p.17-34

LISKO99 Liskov, B.; Wing, J.; "A behavioral notion of subtyping", ACM Transactions on Programming Languages and Systems (TOPLAS), V16/6, November 1994, pp. 1811 - 1841

LLOYD84 Lloyd, J, W; Topor, R, W; "Making logic more expressive", J. of Logic Programming, v1(3), 1984, pp. 225-240.

LLOYD85 Lloyd, J, W; Topor, R, W; "A Basis for Deductive Database Systems", J. of Logic Programming, v2(1), 1985, pp. 93-109.

MAEIR86 Maier, D.; "A logic of objects", in Workshop on Foundations of Deductive and Logic Programming, Washington D.C., 1986, p.p. 6-26

MAEIR88 Maier, D.; Warren, D., S.; Computing with Logic, 1988,The Benjamin/Cummings Publishing Company, Inc

MAIER79 Maier, D; Mendelzon, A, O; Sagiv, Y; Testing implication of data dependencies, ACM Tran. On Database Systems, v4(4), 1979, pp. 455- 469

MAIER83 Maier, D; The theory of relational databases, Pitman Pub. Ltd, 1983

MAKIN77 Makinouchi. A.; "A Consideration of Normal Form on Not-necessarily Normalized Relations in the Relational Data Model", Proceedings of the International Conference on Very Large Databases, 1977, pp. 447- 453


MAYWO00 May, W.; "FLORID User Manual", version 3.0, University of Freiburg, Germany, 2000, pp. 36

MEIJE04 Meijer, E.; Drayton. P.; Static typing where possible, dynamic typing when needed: The end of the cold war between programming languages, In OOPSLA'04 Workshop on Revival of Dynamic Languages, 2004.

MEYER96 Meyer, B.; "Object-Oriented Software Construction" ; Prentice Hall; second edition, 1996

MILNE90 Milner, R,; Tofte, M.; Harper, R.; "The Definition of Standard ML", MIT Press, 1990

MINKE87 Minker, J.; (Ed.); "Foundations of deductive databases and logic programming", Morgan Kaufmann Publ. Inc, 1987

MISHR84 Mishra, P.; "Towards a Theory of Types in Prolog", International Symposium on Logic Programming,: 1984, p.p. 289-298

MITCH03 Mitchell, J., C.; "Concepts in programming languages", Cambridge University Press, 2003

MITCH88 Mitchell, J., C.; Plotkin, G., D.; "Abstract Types Have Existential Type", ACM Transactions on Programming Languages and Systems, Vol. 10, No. 3, July 1988, pp. 470-502

NGPAU81 Ng, P.; "Further Analysis of the Entity-Relationship Approach to Database Design", Trans. Of Software Engineering, V7(1),1981, pp. 85- 99

NICOL78 Nicolas, J, M; Yazdanian, K; 'Integrity checking in deductive databases', in Logic and Databases, Eds. Gallaire, H; and Minker, J; 1978, pp. 325-344

ODERS97 Odersky, M.; Wadler, P.; "Pizza into Java: Translating theory into practice",. Proc. 24th ACM Symposium on Principles of Programming Languages, Paris, France, January 1997

ODERS10 Odersky, M.; Spoon, M.; Venners, B.; “Programming in Scala”; 2nd edition, artima; 2010, pp. 883

OMGRP06 Object Management Group; "Object Constraint Language OMG"; Available Specification Version 2.0, 2006

OMGRP08 Common Object Request Broker Architecture (CORBA) Specification, Version 3.1 Part 1: CORBA Interfaces, Object Management Group, 2008

OMGRP11 Object Management Group; "Meta Object Facility (MOF) 2.0 Query/View/Transformation (QVT)"; Available Specification Version 1.1; 2011

OUSTE98 Ousterhout. J., K.; Scripting: Higher-Level Programming for the 21st Century, Computer, 31(3), pp. 23-30, 1998

PALSB96 Palsberg, J.; Type Inference for Objects, Computing Surveys, v 28(2) , June 1996, pp. 358-359


PARNA72 Parnas, D.; On the criteria to be used in decomposing systems into modules, Communications of the ACM, v15,1972, pp. 1053-1058

PIERC92 Pierce, B, C; Bounded Quantification is un-decidable, 19th Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, New Mexico, 1992, pp.305-315

PORTE92 Porter, H.,H.; "Separating the subtype hierarchy from the inheritance of implementation", Journal of Object-Oriented Programming, V(4).9, 1992, pp.20-29

ROBIN65 Robinson, J., A.; "A machine-oriented logic based on the resolution principle", Journal of the ACM, 12(1), 1965

ROTHM87 Roth, M.; Korth, H.; Batory, D.; "SQL/NF: A Query Language for 1NF Relational Databases"; Information. Systems, vo. 12(1), 1987; pp. 99- 114.

ROTHM88 Roth, M., A.; Korth, H., F.; Silberschatz, A.; "Extended Algebra and Calculus for Nested Relational Databases"; ACM Trans. Database Syst.; vol. 13(4); 1988; pp.389-417

SADRI87 Sadri, F.; Kowalski, R.; "An application of general purpose theorem prover to database integrity", in Minker (ed) Foundations of Deductive Databases and Logic Programming, Morgan Kaufmann Publishers, 1987

SAVNI99 Savnik, I.; Tari, T.; Mohoric, T.; "QAL: A Query Algebra of Complex Objects"; Data & Knowledge Eng. Journal, vo. 30(1), 1999; pp.57-94.

SCHEK85 Schek, H., J.; "Towards a Basic Relational NF2 Algebra Processor"; Proceed. of the Int. Conf. FODO, Kyoto, Japan; 1985; pp.173-182

SCHEK86 Schek, H., J.; Scholl, M.; "The Relational Model with Relation-Valued Attributes"; Information Systems; vol. 11(2); 1986; pp. 137-147

SCHOO86 Scholl, M., H.; "Theoretical Foundations of Algebraic Optimization Utilizing Unnormalized Relations", Proceed. International Conference on Database Theory, Rome, Italy; 1986; pp. 380-396

SCHUS11 Schuster, W.; Interview with Rob Pike; "Rob Pike on Google Go: Concurrency, Type System, Memory Management and GC", InfoQ, on Feb 25, 2011 (http://www.infoq.com/interviews/pike-google-go)

SCOTT76 Scott, D., S.; "Data Types as Lattices", SIAM J. Computing, V 5(3), 1976, pp. 522-587

SELIN78 Selinger, P., G.; Astrahan, M., M.; Chamberlin, D., D.; Lorie, R., A.; Price, T., G.; "Access Path Selection in a Relational Database Management System", Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data, 1979, pp. 23-34

SHAWS90 Shaw, G.; M.; Zdonik, S. B.; "A Query Algebra for Object Oriented Databases"; Proceed. of the 6th Intern. Conf. on Data Engineering, Los Angeles, USA; 1990

SILBE10 Silberschatz, A.; Korth, H.; Sudarshan, S.; "Database System Concepts"; McGraw-Hill; 2010; pp. 1376


SIMON95 Simons, A., J., H.; "A Language with Class: The Theory of Classification Exemplified in an Object-Oriented Language"; Ph.D. thesis, Uni. Of Sheffield; 1995

SMITH77 Smith, J., M.; Smith, D., C., P.; "Database Abstractions: Aggregation and Generalization"; ACM Transactions on Database Systems; V 2(2);1977, pp. 105-133

SMITH77B Smith, J., M.; Smith, D., C., P.; "Database Abstractions: Aggregation"; Comm.s ACM; vo. 20, 1977; pp. 405-413.

SMITH95 Smith, R., B.; Ungar, D.; “Programming as an experience: the inspiration of Self”, ECOOP’95 Conference Proceedings, Aarhus, Denmark, August, 1995

SNYDE86 Snyder, A.; "Encapsulation and Inheritance in Object-Oriented Programming Languages", ACM - OOPSLA'86 Proceed., pp.38-45

STEEL84 Steel, G., L.; "Common Lisp the language"; Digital Press; 1984, pp.465

STEFI86 Stefik, M.; Bobrow, D., G.; "Object-Oriented Programming: Themes and Variations", AI Magazine, January, 1986, pp.40-62

STONE90A Stonebraker, M.; Rowe, L., A.; Hirohama, M.; "The Implementation of POSTGRES", IEEE Transactions on Knowledge and Data Engineering, vol. 2(1), 1990, pp. 125-142

STONE98 Stonebraker, M.; Hellerstein, J., M.; "What Goes Around Comes Around", in Stonebraker, M.; Hellerstein, J. (eds), "Readings in Database Systems", Morgan Kaufmann Publishers, 1998

STRAC67 Strachey, C.; "Fundamental Concepts in Programming Languages", Lecture Notes for International Summer School in Computer Programming, Denmark, 1967

STRAU90 Straube, D.; Ozsu, M., T.; "Queries and query processing in object-oriented database systems", ACM Transactions on Information Systems, vol. 8(4), 1990, pp. 387-430

STRAU90A Straube, D., D.; Ozsu, M., T.; "Queries and Query Processing in Object-Oriented Database Systems", ACM Transactions on Information Systems, vol. 8(4), October 1990, pp. 387-430

STROU92 Stroustrup, B.; "The C++ Programming Language", Addison-Wesley Publishing Co., Second Edition, 1992

STROU94 Stroustrup, B.; "The Design and Evolution of C++"; Addison-Wesley; 1994, pp. 480

SUBIE98 Subieta, K.; Leszczytowski, J.; "A Critique of Object Algebras", Technical Report, Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland, 1998

TEORE05 Teorey, T., J.; Lightstone, S., S.; Nadeau, T.; "Database Modeling and Design: Logical Design"; Morgan Kaufmann, 4th Edition, 2005, pp. 296

THOMA86 Thomas, S., J.; Fischer P., C.; "Nested Relational Structures"; Advances in Computing Research; vol. 3; 1986; pp. 269-307


ULLMA87 Ullman, J., D.; "Database Theory: Past and Future", ACM PODS; 1987, pp. 1-10

ULLMA90 Ullman, J., D.; Principles of Database and Knowledge-Base Systems, Vol. I and II, W.H. Freeman & Co., 1990

VALDU87 Valduriez, P.; "Join Indices", ACM Transactions on Database Systems, vol. 12(2), 1987, pp. 218-246

VELLA00 Vella, J.; Musu, J.; "Materialised View Performance in a Data Warehouse Environment", XXXIVth International Conference MOSIS/ISM 2000, Czech Republic, 2000, pp. 160-170

VELLA09 Vella, J.; "Converting EERM into ODMG's ODL Constructs", ICOODB Conference, Switzerland, 2009, tutorial

VELLA97 Vella, J.; "Converting EERM into an Object-Oriented Data Model", XXIIth Proceedings of the Association of Simula Users, France, 1997, pp. 1-10

VERHE82 Verheijen, G.; van Bekkum, J.; "NIAM: An Information Analysis Method", Information Systems Design Methodologies, 1982, pp.32-55

VIERA99 Viara, E.; Barillot, E.; Vaysseix, G.; "The EyeDB OODBMS", in Proceedings of the International Database Engineering and Applications Symposium, 1999, pp. 390-402

VOSSE91 Vossen, G.; "Data models, database languages and database management systems"; Addison-Wesley, 1991, pp. 541

WARNE03 Warmer, J., B.; Kleppe, A., G.; "The Object Constraint Language: Getting Your Models Ready for MDA", Addison-Wesley Professional, 2003, pp. 206

WEGNE89A Wegner, P.; "Learning the Language", March 1989, pp. 245-253

WEGNE89B Wegner, P.; Zdonik, S.,B.; "Models of Inheritance", Proceed. of 2nd Workshop Databases Programming Languages, 1989, pp.248-255

YANGG03 Yang, G.; Kifer, M.; "Reasoning about anonymous resources and meta statements on the Semantic Web", Journal on Data Semantics, LNCS vol. 2800, 2003, pp. 69-98

YANGG08 Yang, G.; Kifer, M.; Wan, H.; Zhao, C.; "Flora-2: User's Manual", SRI International & DCS at State University of NY at Stony Brook, 2008, pp. 160

ZAMUL02 Zamulin, A.; "An Object Algebra for the ODMG Standard", Proceed. 6th ADBIS Conference, Bratislava, Slovakia, 2002, LNCS 2435, pp. 291-304

ZANIO83 Zaniolo, C.; "The Database Language GEM"; SIGMOD Conference, 1983; pp. 207-218


Appendix – GraphVIZ/Dot Specification (afterScott)

/* flogic schema for visualisation */
// schema name & version: afterScott (alpha)

/* header */
digraph flogicencode {
    ranksep=1.25;

    /* classes */
    course[shape=box,height=.75,width=1.5];
    dept[shape=box,height=.75,width=1.5];
    lecturer[shape=box,height=.75,width=1.5];
    person[shape=box,height=.75,width=1.5];
    project[shape=box,height=.75,width=1.5];
    ptstudent[shape=box,height=.75,width=1.5];
    student[shape=box,height=.75,width=1.5];
    studunitgrade[shape=box,height=.75,width=1.5];
    telephone[shape=box,height=.75,width=1.5];
    unit[shape=box,height=.75,width=1.5];

    /* structures - weak entities */
    address[shape=box,height=.75,width=1.4,peripheries=2];
    job[shape=box,height=.75,width=1.4,peripheries=2];
    unityear[shape=box,height=.75,width=1.4,peripheries=2];

    /* isa graph */
    isaperson[shape=circle,height=.5,width=.5,fontsize=12,fixedsize=true,label="disjoint"];
    person->isaperson[label="partial",fontsize=10,dir="both",arrowhead=none,arrowtail="vee",color="gray:gray",weight=10];
    isaperson->lecturer[label="semantic",arrowhead=none,color="gray:gray",fontsize=10];
    isaperson->student[label="semantic",arrowhead=none,color="gray:gray",fontsize=10];
    isastudent[shape=circle,height=.5,width=.5,fontsize=12,fixedsize=true,label="disjoint"];
    student->isastudent[label="partial",fontsize=10,dir="both",arrowhead=none,arrowtail="vee",color="gray:gray",weight=10];
    isastudent->ptstudent[label="semantic",arrowhead=none,color="gray:gray",fontsize=10];

    /* attributes */
    coursecname[shape=ellipse,width=.75,height=.5,label=cname,style=bold];
    course->coursecname[arrowhead=none];
    deptdname[shape=ellipse,width=.75,height=.5,label=dname,style=bold];
    dept->deptdname[arrowhead=none];
    personfname[shape=ellipse,width=.75,height=.5,label=fname,style=bold];
    person->personfname[arrowhead=none];
    persongender[shape=ellipse,width=.75,height=.5,label=gender];
    person->persongender[arrowhead=none];
    projectpname[shape=ellipse,width=.75,height=.5,label=pname,style=bold];
    project->projectpname[arrowhead=none];
    ptstudentloadratio[shape=ellipse,width=.75,height=.5,label=loadratio];
    ptstudent->ptstudentloadratio[arrowhead=none];
    studentstage[shape=ellipse,width=.75,height=.5,label=stage];
    student->studentstage[arrowhead=none];
    studunitgradegrade[shape=ellipse,width=.75,height=.5,label=grade];
    studunitgrade->studunitgradegrade[arrowhead=none];
    telephonetpname[shape=ellipse,width=.75,height=.5,label=tpname,style=bold];
    telephone->telephonetpname[arrowhead=none];
    telephonetpstate[shape=ellipse,width=.75,height=.5,label=tpstate];
    telephone->telephonetpstate[arrowhead=none];
    telephonetptype[shape=ellipse,width=.75,height=.5,label=tptype];
    telephone->telephonetptype[arrowhead=none];
    unitassesstype[shape=ellipse,width=.75,height=.5,label=assesstype];
    unit->unitassesstype[arrowhead=none];
    unitcredits[shape=ellipse,width=.75,height=.5,label=credits];
    unit->unitcredits[arrowhead=none];
    unituname[shape=ellipse,width=.75,height=.5,label=uname,style=bold];
    unit->unituname[arrowhead=none];
    addressaline1[shape=ellipse,width=.75,height=.5,label=aline1];
    address->addressaline1[arrowhead=none];
    addressaline2[shape=ellipse,width=.75,height=.5,label=aline2];
    address->addressaline2[arrowhead=none];
    addressaserno[shape=ellipse,width=.75,height=.5,label=aserno,style=bold];
    address->addressaserno[arrowhead=none];
    deptrolefirstrole[shape=ellipse,width=.75,height=.5,label=firstrole];
    deptrole->deptrolefirstrole[arrowhead=none];
    deptroleotherrole[shape=ellipse,width=.75,height=.5,label=otherrole];
    deptrole->deptroleotherrole[arrowhead=none];
    jobjbudget[shape=ellipse,width=.75,height=.5,label=jbudget];
    job->jobjbudget[arrowhead=none];
    jobjname[shape=ellipse,width=.75,height=.5,label=jname,style=bold];
    job->jobjname[arrowhead=none];
    unityearusemester[shape=ellipse,width=.75,height=.5,label=usemester];
    unityear->unityearusemester[arrowhead=none];
    unityearuserno[shape=ellipse,width=.75,height=.5,label=userno,style=bold];
    unityear->unityearuserno[arrowhead=none];
    unityearuyear[shape=ellipse,width=.75,height=.5,label=uyear];
    unityear->unityearuyear[arrowhead=none];
    lecturerdegrees[shape=ellipse,width=.75,height=.5,peripheries=2,label=degrees];
    lecturer->lecturerdegrees[arrowhead=none];

    /* composite relationship */
    deptrole[shape=ellipse,width=.75,height=.5,label=drole,peripheries=2];
    dept->deptrole[arrowhead=none];

    /* relationships */
    course->student[label="enrolled on",dir="both",arrowtail="crowodot",arrowhead="teetee"];
    course->dept[label="sponsorer",dir="both",arrowtail="crowtee",arrowhead="crowodot"];
    course->unit[label="requires",dir="both",arrowtail="crowodot",arrowhead="crowodot"];
    dept->lecturer[label="employs",dir="both",arrowtail="crowodot",arrowhead="teetee"];
    lecturer->unityear[label="teach",dir="both",arrowtail="crowodot",arrowhead="teetee"];
    lecturer->project[label="leads",dir="both",arrowtail="crowodot",arrowhead="teetee"];
    lecturer->unit[label="coordinates",dir="both",arrowtail="crowodot",arrowhead="teetee"];
    person->telephone[label="hasphone",dir="both",arrowtail="crowodot",arrowhead="teeodot"];
    person->project[label="workson",dir="both",arrowtail="crowodot",arrowhead="crowodot"];
    project->project[label="continof",dir="both",arrowtail="teeodot",arrowhead="crowodot"];
    student->studunitgrade[label="took unit",dir="both",arrowtail="crowodot",arrowhead="teetee"];
    unityear->studunitgrade[label="took student",dir="both",arrowtail="crowodot",arrowhead="teetee"];

    /* weak relationships */
    dept->address[label="locates at",dir="both",arrowtail="teeodot",arrowhead="teetee"];
    project->job[label="hasjobs",dir="both",arrowtail="crowodot",arrowhead="teetee"];
    unit->unityear[label="version of",dir="both",arrowtail="teeodot",arrowhead="teetee"];
}

Appendix - EyeDB ODL Specifications (afterScott)

// schema

// enumerated types (domains)
enum assessment { assign, exam, interview, test };
enum degree { bat, doc, master, postdoc };
enum load { fourth, half, third, twelfth };
enum sex { female, male };
enum unitgrade { a, abs, b, c, d, f };

// forward references
class course;
class dept;
class lecturer;
class person;
class project;
class ptstudent;
class student;
class studunitgrade;
class telephone;
class unit;
class address;
class deptrole;
class job;
class unityear;

// structures (weak entities)
class address {
    attribute string aline1;
    attribute string aline2;
    attribute int aserno;
    // rc11 2
    relationship dept * adept inverse dept::mainoffice;
    // pri key is [aserno]
    constraint on aserno;
    index on aserno;
    constraint on adept;
    constraint on aline1;
    constraint on aline2;
    constraint on aserno;
};

class deptrole {
    attribute string firstrole;
    attribute string otherrole;
    // pri key is []
    constraint on firstrole;
    constraint on otherrole;
};

class job {
    attribute double jbudget;
    attribute string jname;
    // rc12 2
    relationship project * jproj inverse project::jobs;
    // pri key is [jname]
    constraint on jname;
    index on jname;
    constraint on jbudget;
    constraint on jname;
    constraint on jproj;
};

class unityear {
    attribute int usemester;
    attribute int userno;
    attribute int uyear;
    // rc10 2
    relationship lecturer * ulecturer inverse lecturer::teaches;
    // rc14 2
    relationship unit * uunit inverse unit::uversion;
    // rc72 1
    relationship set<studunitgrade *> ustudgrades inverse studunitgrade::gradeunit;
    // pri key is [userno]
    constraint on userno;
    index on userno;
    constraint on ulecturer;
    constraint on usemester;
    constraint on userno;
    constraint on uunit;
    constraint on uyear;
};

// classes
class course {
    attribute string cname;
    // rc8 1
    relationship set<unit *> require inverse unit::applicableto;
    // rc6 1
    relationship set<dept *> sponsorer inverse dept::sponsors;
    // rc13 1
    relationship set<student *> students inverse student::enrolon;
    // pri key is [cname]
    constraint on cname;
    index on cname;
    constraint on cname;
    constraint on sponsorer;
};

class dept {
    attribute string dname;
    attribute set<deptrole *> drole;
    // rc11 1
    relationship address * mainoffice inverse address::adept;
    // rc6 2
    relationship set<course *> sponsors inverse course::sponsorer;
    // rc5 1
    relationship set<lecturer *> staff inverse lecturer::worksat;
    // pri key is [dname]
    constraint on dname;
    index on dname;
    constraint on dname;
};

class person {
    attribute string fname;
    attribute sex gender;
    // rc1 1
    relationship set<telephone *> telno inverse telephone::tpisof;
    // rc2 1
    relationship set<project *> workson inverse project::staff;
    // pri key is [fname]
    constraint on fname;
    index on fname;
    constraint on fname;
    constraint on gender;
};

class lecturer extends person {
    attribute set<degree> degrees;
    // rc5 2
    relationship dept * worksat inverse dept::staff;
    // rc9 1
    relationship set<unit *> coord inverse unit::coordby;
    // rc3 1
    relationship set<project *> prolead inverse project::leader;
    // rc10 1
    relationship set<unityear *> teaches inverse unityear::ulecturer;
    // pri key is []
    constraint on coord;
    constraint on worksat;
};

class project {
    attribute string pname;
    // rc4 1
    relationship project * continof inverse project::carriesonin;
    // rc3 2
    relationship lecturer * leader inverse lecturer::prolead;
    // rc4 2
    relationship set<project *> carriesonin inverse project::continof;
    // rc12 1
    relationship set<job *> jobs inverse job::jproj;
    // rc2 2
    relationship set<person *> staff inverse person::workson;
    // pri key is [pname]
    constraint on pname;
    index on pname;
    constraint on leader;
    constraint on pname;
};

class student extends person {
    attribute string stage;
    // rc13 2
    relationship course * enrolon inverse course::students;
    // rc71 1
    relationship set<studunitgrade *> unitgrades inverse studunitgrade::gradestudent;
    // pri key is []
    constraint on enrolon;
};

class ptstudent extends student {
    attribute load loadratio;
    // pri key is []
    constraint on loadratio;
};

class studunitgrade {
    attribute unitgrade grade;
    // rc71 2
    relationship student * gradestudent inverse student::unitgrades;
    // rc72 2
    relationship unityear * gradeunit inverse unityear::ustudgrades;
    // pri key is [gradestudent,gradeunit]
    constraint on gradestudent;
    index on gradestudent;
    constraint on grade;
    constraint on gradestudent;
};

class telephone {
    attribute string tpname;
    attribute string tpstate;
    attribute string tptype;
    // rc1 2
    relationship person * tpisof inverse person::telno;
    // pri key is [tpname]
    constraint on tpname;
    index on tpname;
    constraint on tpname;
    constraint on tptype;
};

class unit {
    attribute assessment assesstype;
    attribute int credits;
    attribute string uname;
    // rc9 2
    relationship lecturer * coordby inverse lecturer::coord;
    // rc8 2
    relationship set<course *> applicableto inverse course::require;
    // rc14 1
    relationship set<unityear *> uversion inverse unityear::uunit;
    // pri key is [uname]
    constraint on uname;
    index on uname;
    constraint on assesstype;
    constraint on coordby;
    constraint on credits;
    constraint on uname;
};

Appendix - EyeDB processing of afterScott schema Specifications

joseph@aspireone:~/workbench/eyedb/fufu$ cp $HOME/Dropbox/scott_2.odl .
joseph@aspireone:~/workbench/eyedb/fufu$ sudo eyedbadmin database delete fufu
[sudo] password for joseph:
joseph@aspireone:~/workbench/eyedb/fufu$ sudo eyedbadmin database create fufu
joseph@aspireone:~/workbench/eyedb/fufu$ sudo eyedbodl -d fufu -u scott_2.odl
updating 'scott_2' schema in database fufu...
adding class course
adding class dept
adding class lecturer
adding class person
adding class project
adding class ptstudent
adding class student
adding class studunitgrade
adding class telephone
adding class unit
adding class address
adding class deptrole
adding class job
adding class unityear
adding attribute address::aline1
adding attribute address::aline2
adding attribute address::aserno
adding attribute address::adept
adding attribute deptrole::firstrole
adding attribute deptrole::otherrole
adding attribute job::jbudget
adding attribute job::jname
adding attribute job::jproj
adding attribute unityear::usemester
adding attribute unityear::userno
adding attribute unityear::uyear
adding attribute unityear::ulecturer
adding attribute unityear::uunit
adding attribute unityear::ustudgrades
adding attribute course::cname
adding attribute course::require
adding attribute course::sponsorer
adding attribute course::students
adding attribute dept::dname
adding attribute dept::drole
adding attribute dept::mainoffice
adding attribute dept::sponsors
adding attribute dept::staff
adding attribute person::fname
adding attribute person::gender
adding attribute person::telno
adding attribute person::workson
adding attribute lecturer::fname
adding attribute lecturer::gender
adding attribute lecturer::telno
adding attribute lecturer::workson
adding attribute lecturer::degrees
adding attribute lecturer::worksat
adding attribute lecturer::coord
adding attribute lecturer::prolead
adding attribute lecturer::teaches
adding attribute project::pname
adding attribute project::continof
adding attribute project::leader
adding attribute project::carriesonin
adding attribute project::jobs
adding attribute project::staff
adding attribute student::fname
adding attribute student::gender
adding attribute student::telno
adding attribute student::workson
adding attribute student::stage
adding attribute student::enrolon
adding attribute student::unitgrades
adding attribute ptstudent::fname
adding attribute ptstudent::gender
adding attribute ptstudent::telno
adding attribute ptstudent::workson
adding attribute ptstudent::stage
adding attribute ptstudent::enrolon
adding attribute ptstudent::unitgrades
adding attribute ptstudent::loadratio
adding attribute studunitgrade::grade
adding attribute studunitgrade::gradestudent
adding attribute studunitgrade::gradeunit
adding attribute telephone::tpname
adding attribute telephone::tpstate
adding attribute telephone::tptype
adding attribute telephone::tpisof
adding attribute unit::assesstype
adding attribute unit::credits
adding attribute unit::uname
adding attribute unit::coordby
adding attribute unit::applicableto
adding attribute unit::uversion
creating btreeindex 'index on address.aserno' on class 'address'...
creating hashindex 'index on job.jname' on class 'job'...
creating btreeindex 'index on unityear.userno' on class 'unityear'...
creating hashindex 'index on course.cname' on class 'course'...
creating hashindex 'index on dept.dname' on class 'dept'...
creating hashindex 'index on person.fname' on class 'person'...
creating hashindex 'index on project.pname' on class 'project'...
creating hashindex 'index on studunitgrade.gradestudent' on class 'studunitgrade'...
creating hashindex 'index on telephone.tpname' on class 'telephone'...
creating hashindex 'index on unit.uname' on class 'unit'...
creating unique_constraint 'constraint on address.aserno' on class 'address'...
creating notnull_constraint 'constraint on address.adept' on class 'address'...
creating notnull_constraint 'constraint on address.aline1' on class 'address'...
creating notnull_constraint 'constraint on address.aline2' on class 'address'...
creating notnull_constraint 'constraint on address.aserno' on class 'address'...
creating notnull_constraint 'constraint on deptrole.firstrole' on class 'deptrole'...
creating notnull_constraint 'constraint on deptrole.otherrole' on class 'deptrole'...
creating unique_constraint 'constraint on job.jname' on class 'job'...
creating notnull_constraint 'constraint on job.jbudget' on class 'job'...
creating notnull_constraint 'constraint on job.jname' on class 'job'...
creating notnull_constraint 'constraint on job.jproj' on class 'job'...
creating unique_constraint 'constraint on unityear.userno' on class 'unityear'...
creating notnull_constraint 'constraint on unityear.ulecturer' on class 'unityear'...
creating notnull_constraint 'constraint on unityear.usemester' on class 'unityear'...
creating notnull_constraint 'constraint on unityear.userno' on class 'unityear'...
creating notnull_constraint 'constraint on unityear.uunit' on class 'unityear'...
creating notnull_constraint 'constraint on unityear.uyear' on class 'unityear'...
creating unique_constraint 'constraint on course.cname' on class 'course'...
creating notnull_constraint 'constraint on course.cname' on class 'course'...
creating notnull_constraint 'constraint on course.sponsorer' on class 'course'...
creating unique_constraint 'constraint on dept.dname' on class 'dept'...
creating notnull_constraint 'constraint on dept.dname' on class 'dept'...
creating unique_constraint 'constraint on person.fname' on class 'person'...
creating notnull_constraint 'constraint on person.fname' on class 'person'...
creating notnull_constraint 'constraint on person.gender' on class 'person'...
creating notnull_constraint 'constraint on lecturer.coord' on class 'lecturer'...
creating notnull_constraint 'constraint on lecturer.worksat' on class 'lecturer'...
creating unique_constraint 'constraint on project.pname' on class 'project'...
creating notnull_constraint 'constraint on project.leader' on class 'project'...
creating notnull_constraint 'constraint on project.pname' on class 'project'...
creating notnull_constraint 'constraint on student.enrolon' on class 'student'...
creating notnull_constraint 'constraint on ptstudent.loadratio' on class 'ptstudent'...
creating unique_constraint 'constraint on studunitgrade.gradestudent' on class 'studunitgrade'...
creating notnull_constraint 'constraint on studunitgrade.grade' on class 'studunitgrade'...
creating notnull_constraint 'constraint on studunitgrade.gradestudent' on class 'studunitgrade'...
creating unique_constraint 'constraint on telephone.tpname' on class 'telephone'...
creating notnull_constraint 'constraint on telephone.tpname' on class 'telephone'...
creating notnull_constraint 'constraint on telephone.tptype' on class 'telephone'...
creating unique_constraint 'constraint on unit.uname' on class 'unit'...
creating notnull_constraint 'constraint on unit.assesstype' on class 'unit'...
creating notnull_constraint 'constraint on unit.coordby' on class 'unit'...
creating notnull_constraint 'constraint on unit.credits' on class 'unit'...
creating notnull_constraint 'constraint on unit.uname' on class 'unit'...
creating one-to-one relationship address::adept <-> dept::mainoffice
creating one-to-many relationship job::jproj <-> project::jobs
creating one-to-many relationship unityear::ulecturer <-> lecturer::teaches
creating one-to-many relationship unityear::uunit <-> unit::uversion
creating many-to-one relationship unityear::ustudgrades <-> studunitgrade::gradeunit
creating many-to-many relationship course::require <-> unit::applicableto
creating many-to-many relationship course::sponsorer <-> dept::sponsors
creating many-to-one relationship course::students <-> student::enrolon
creating one-to-one relationship dept::mainoffice <-> address::adept
creating many-to-many relationship dept::sponsors <-> course::sponsorer
creating many-to-one relationship dept::staff <-> lecturer::worksat
creating many-to-one relationship person::telno <-> telephone::tpisof
creating many-to-many relationship person::workson <-> project::staff
creating many-to-one relationship person::telno <-> telephone::tpisof
creating many-to-many relationship person::workson <-> project::staff
creating one-to-many relationship lecturer::worksat <-> dept::staff
creating many-to-one relationship lecturer::coord <-> unit::coordby
creating many-to-one relationship lecturer::prolead <-> project::leader
creating many-to-one relationship lecturer::teaches <-> unityear::ulecturer
creating one-to-many relationship project::continof <-> project::carriesonin
creating one-to-many relationship project::leader <-> lecturer::prolead
creating many-to-one relationship project::carriesonin <-> project::continof
creating many-to-one relationship project::jobs <-> job::jproj
creating many-to-many relationship project::staff <-> person::workson
creating many-to-one relationship person::telno <-> telephone::tpisof
creating many-to-many relationship person::workson <-> project::staff
creating one-to-many relationship student::enrolon <-> course::students
creating many-to-one relationship student::unitgrades <-> studunitgrade::gradestudent
creating many-to-one relationship person::telno <-> telephone::tpisof
creating many-to-many relationship person::workson <-> project::staff
creating one-to-many relationship student::enrolon <-> course::students
creating many-to-one relationship student::unitgrades <-> studunitgrade::gradestudent
creating one-to-many relationship studunitgrade::gradestudent <-> student::unitgrades
creating one-to-many relationship studunitgrade::gradeunit <-> unityear::ustudgrades
creating one-to-many relationship telephone::tpisof <-> person::telno
creating one-to-many relationship unit::coordby <-> lecturer::coord
creating many-to-many relationship unit::applicableto <-> course::require
creating many-to-one relationship unit::uversion <-> unityear::uunit

done

Appendix - ODMG Data Dictionary

As we saw in the first chapter, the data requirements of application programs in a database environment are recorded in the schema definition. The ODMG calls this the metadata, and it is stored in the ODL Schema Repository. The ODMG metadata is a superset of OMG's IDL Interface Repository because of the richer "structures" (e.g. relationships) found in it. The standard specifies the interface of each constituent part of the metadata and provides a terse textual description of the properties and methods; the exact semantics of the operations are missing.

There are thirty-two interfaces with thirty inheritance relationships between them. Most of these inheritance relationships are single inheritance, but four rely on multiple inheritance. The deepest ISA chain has length four (e.g. Dictionary / Collection / Type / MetaObject / RepositoryObject).
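
For concreteness, that chain can be sketched in ODL/IDL-style declarations (bodies elided; this only illustrates the inheritance path just described, not the standard's full definitions):

    interface RepositoryObject { };
    interface MetaObject : RepositoryObject { };
    interface Type : MetaObject { };
    interface Collection : Type { };
    interface Dictionary : Collection { };   // four inheritance steps from RepositoryObject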

A focal point is the MetaObject interface, which has two descriptive attributes, name and comment. From the meta-object a number of other specialised interfaces emanate, namely Exception, Constant, Property (which itself gives rise to the interfaces Attribute and Relationship), and Type. The Type interface is specialised into three further interfaces, namely TypeDefinition, PrimitiveType, and Collection.
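
A minimal sketch of this part of the hierarchy, using only the members named above (the specialised interfaces' own members are elided):

    interface MetaObject : RepositoryObject {
        attribute string name;
        attribute string comment;
    };
    interface Exception : MetaObject { };
    interface Constant  : MetaObject { };
    interface Property  : MetaObject { };   // specialised into Attribute and Relationship
    interface Type      : MetaObject { };   // specialised into TypeDefinition, PrimitiveType, Collection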

In the case of Exceptions, the interface specifies an N-M relationship with the Operations that can actually raise them (note that although relationships do flag exceptions, these are really encoded into the 'from' and 'to' methods), and a 1-1 relationship to a Structure data type definition for the information being passed out by the exception handler.
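
The Exception interface might therefore be sketched as follows; the role names raised_by and info are illustrative assumptions, as only the exceptions relationship on Operation is named in the description:

    interface Exception : MetaObject {
        // N-M: the operations that can actually raise this exception
        relationship set<Operation> raised_by inverse Operation::exceptions;
        // 1-1: the structure passed out by the exception handler
        relationship Structure info;
    };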

The Constant interface provides a look-up structure for statically associating values with names. The value is either a literal, a reference to another constant, or a constant expression, and in each case we relate to the respective interface (i.e. Literal, ConstOperand, and Expression respectively). Each constant has a type and a value.
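
In sketch form (the member names type and value follow the description above):

    interface Constant : MetaObject {
        relationship Type type;       // the constant's data type
        relationship Operand value;   // a Literal, ConstOperand, or Expression
    };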

The Property interface is divided (in totality) into the Attribute and Relationship interfaces. Each property instance is related to a data Type in an M-1 relationship. As some attribute instances are read-only, this is shown through a Boolean method called is_read_only. In the case of relationship instances we need to keep details on the role of the relationship (i.e. get_cardinality) and the back traversal path (a 1-1 recursive relationship called traversal).
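
A sketch of the property branch; is_read_only, get_cardinality and traversal are the names given above, while the return types are assumptions:

    interface Property : MetaObject {
        relationship Type type;                  // M-1 to the property's data type
    };
    interface Attribute : Property {
        boolean is_read_only();                  // some attributes are read-only
    };
    interface Relationship : Property {
        long get_cardinality();                  // role/cardinality of the relationship
        relationship Relationship traversal;     // 1-1 recursive back-traversal path
    };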

The data Type interface holds information on data types and a range of 1-N relationships to interfaces that require a data type quantification (i.e. Collection, Dictionary, Specifier, Union, Operation, Property, Constant and TypeDefinition). Because the standard allows the introduction of new names for defined data types (i.e. instances of Type), a specialised interface of Type called TypeDefinition is introduced; the TypeDefinition interface has an N-1 relationship with the Type interface. Another specialisation of the data type interface is the PrimitiveType interface. This interface adds a read-only attribute called primitive_kind whose value is chosen from an enumerated list of primitive types recognised by the standard (e.g. boolean, char, date, etc.). An important specialisation of data type is the Collection interface. The collection interface has a read-only attribute (i.e. collection_kind) that takes a value from a list of collection types allowed by the standard (e.g. list, array, bag, set, etc.), and an association to a data type representing the collection item's data type. Two methods are specified to cater for the sizing and ordering properties of the collection. The Dictionary interface is a specialisation of the Collection interface.
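
The two specialisations can be sketched as below; primitive_kind and collection_kind are named above, whereas element_type and the names of the sizing and ordering methods are assumptions:

    interface PrimitiveType : Type {
        readonly attribute PrimitiveKind primitive_kind;    // boolean, char, date, ...
    };
    interface Collection : Type {
        readonly attribute CollectionKind collection_kind;  // list, array, bag, set, ...
        relationship Type element_type;                     // the collection item's data type
        boolean is_ordered();
        long size();
    };
    interface Dictionary : Collection { };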

We have seen that ODMG allows us to "scope" our definitions of objects and names. To record these notions in the data dictionary the Scope interface is introduced. The operation bind takes a MetaObject instance and a name to establish the given object's scope in the repository. A sub-class of Scope is DefiningScope. This has an important 1-N relationship, called defines, that enumerates which objects are in the same scope. There is also an array of methods to create, add and remove different RepositoryObjects (more on this interface later). Examples of such methods include create_collection, add_union, and remove_constant.
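
A sketch of the scoping interfaces; bind, defines, create_collection, add_union and remove_constant are named above, but the parameter list of bind and the target interface of defines are assumptions:

    interface Scope {
        void bind(in MetaObject obj, in string name);   // establish obj's scope
    };
    interface DefiningScope : Scope {
        relationship set<MetaObject> defines;           // 1-N: the objects in this scope
        // plus a family of factory methods, e.g.
        // create_collection, add_union, remove_constant, ...
    };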

We have already stated that some interfaces rely on multiple inheritance; one example is the Operation interface, which inherits from Scope and MetaObject. An operation is (like an interface's attribute) a MetaObject, but owing to the effects of dynamic dispatching an operation is required to have a scope (thus the need to inherit from Scope). An operation instance needs to hold information about the data types of its parameters, the data type of its result, and the list of exceptions that it can raise. This information is recorded through three relationships called signature (to interface Parameter, 1-N), result (to interface Type, N-1), and exceptions (to interface Exception, N-M).
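
In sketch form:

    interface Operation : MetaObject, Scope {       // multiple inheritance
        relationship set<Parameter> signature;      // 1-N: parameter data types
        relationship Type result;                   // N-1: result data type
        relationship set<Exception> exceptions;     // N-M: exceptions it may raise
    };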

Other important MetaObjects are the Interface (itself) and the Class. Class inherits from Interface, while Interface inherits (through the multiple inheritance mechanism) from Type and DefiningScope. The interface's meta description (i.e. its interface!) includes a recursive N-M relationship to depict the inheritance graph (implemented through the two relationships inherits and derives). Also, a number of methods are available to add and remove attributes, relationships and methods from an interface; these are called add_attribute, add_relationship and add_operation (and their 'remove' counterparts). In the Class interface we find a recursive N-1 relationship to implement the class extends graph (through the relationships extender and extensions) and attributes to record the extents and keys associated with the class's instances. In the case of the keys attribute it should really be a relationship to the Attribute interface!
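
The two interfaces can be sketched as follows, using the relationship names given above (the extents and keys attributes are omitted, as their types are not described):

    interface Interface : Type, DefiningScope {
        relationship set<Interface> inherits inverse Interface::derives;   // recursive N-M
        relationship set<Interface> derives inverse Interface::inherits;
        // add_attribute / add_relationship / add_operation, plus 'remove' counterparts
    };
    interface Class : Interface {
        relationship Class extender inverse Class::extensions;             // recursive N-1
        relationship set<Class> extensions inverse Class::extender;
        // plus attributes recording the class's extents and keys
    };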

The ODMG object model allows the designer to introduce new data types composed from other data types. The generic interface for these is the ScopedType, and it inherits from Type and Scope. Three possible ScopedTypes are catered for here: data structures, enumerated types and union data types (through the Structure, Enumeration, and Union interfaces respectively). The speciality of Structure is the list of typed attributes that compose it. The speciality of the Enumeration interface is the list of constants that comprise a particular instance (through the 1-N elements relationship to the Constant interface).
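
In sketch form (the members follow the description above; the name of Structure's attribute list is assumed):

    interface ScopedType : Type, Scope { };
    interface Structure : ScopedType {
        relationship set<Member> members;       // its list of typed attributes
    };
    interface Enumeration : ScopedType {
        relationship set<Constant> elements;    // 1-N elements to Constant
    };
    interface Union : ScopedType { };           // cases described via UnionCase specifiers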

A module, and its specialisation the repository, is a collection of meta-objects within the context of a defining scope. Note that the data dictionary is an instance of the Repository interface. This interface has a number of important methods for creating and removing modules (themselves), interfaces and classes.

In some contexts a data type requires naming, for example the name of an argument in a method. The generic class for this is the Specifier, a subclass of RepositoryObject, which has the following sub-classes: Member, UnionCase and Parameter. Each specifier has an N-1 relationship to the Type interface. Interesting to note is the case of Parameters, where we need to annotate each argument with its argument-passing mode (i.e. in, out or inout).
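
A sketch of the specifier branch; the spelling of the passing-mode enumeration is an assumption (in/out/inout themselves are reserved words in IDL-style syntax):

    enum ParameterMode { in_mode, out_mode, inout_mode };
    interface Specifier : RepositoryObject {
        relationship Type type;           // N-1 to the named data type
    };
    interface Member    : Specifier { };
    interface UnionCase : Specifier { };
    interface Parameter : Specifier {
        attribute ParameterMode mode;     // in, out or inout
    };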

The RepositoryObject interface has another ramification starting with the Operand interface. Operand is an interface whose instances comprise all the constant values in the object collection. These constant values can be literals (the Literal interface inherits from Operand), constant operands (the ConstOperand interface inherits from Operand), or constant expressions (the Expression interface inherits from Operand).
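
In sketch form:

    interface Operand : RepositoryObject { };
    interface Literal      : Operand { };   // constant values
    interface ConstOperand : Operand { };   // references to other constants
    interface Expression   : Operand { };   // constant expressions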

One of the first interfaces described was the MetaObject (which inherits from the RepositoryObject interface), yet that description was incomplete: each MetaObject instance is, in fact, related to a DefiningScope instance through the N-1 relationship definedIn.

What's Missing From ODMG's Metadata

The most obvious missing "item" is detail; for example, each of the methods mentioned is given only a short textual phrase and its signature.

Other significant items found in a data dictionary but not present in ODMG's proposal include: logical views, users, security profiles, and data constraints. We reiterate that this is a significant shortfall that cannot be attributed to implementation dependency.
