Introduction: OO principles • Abstraction in OO Languages and DB 3 Object Oriented Systems – hide representation, encapsulation – use interface operations 3.1 Introduction class Angel { private double angel; // radians 3.2 The ODMG data model public double getDegree() { 3.3 OQL in a nutshell return java.lang.math.PI * angel /180;} 3.4 Persistence Architectures public double getRad(){return angel;} public void set(double val){…} 3.5 The Object Protocol Model public void setDegree(double val) { … } …} – Operation interface not forced by most PL or OODB – Database terminology: "behavior" of entities if they OO DB model: see Kemper / Eickler; define operations Cattell et al: The Object Data Standard: ODMG 3.0, Acedemic Press, 2000 OPM : papers 1,2 HS / bioDBS04-3-OODB 4

Data models: the spectrum Introduction: OO principles • More structure – Rigid data models: relational, object-oriented • Inheritance – Database conformant to schema – Semantics of query q: subset of database – Type extension / Specialization / Generalization – No texts, images, ... (oríginally) We are here today class publication { • Semi structured DB / XML private String title; – Schema more flexible – if any private String[] authors; – Many schema items public print(); – Text plays a big role …} – Semantics of queries: substructure of DB • Information Retrieval class book extends publication { – Data model: objects are sequences of terms private String isbn; – No modeling restrictions (natural language!?) private int printing;... – Semantics of query q: DB entries ordered by similarity to q public print(){...} ... } (Ranking) • Natural Language – type polymorphism and dynamic binding – VERY difficult to process automatically (set_of publication) mySet; ... not Java! Ö not really an option for each d in mySet { d.print(); ...} Less structure HS / bioDBS04-3-OODB 2 HS / bioDBS04-3-OODB 5

3.1 Introduction: Why OO Databases Motivation for Objects in databases • Data modeling: What is appropriate for bio data? What is missing in the Relational Model? – No clear answer… • Type constructors – flexibility, ease of use, few restrictions…. – Tuple types • Object orientation: much more flexibility than – Compound structures, recursive structures relational systems class Project { Set_Of Sequence sequences; • The essence of Object orientation Project project_parent;...} – data abstraction – inheritance • Object Identity • OO: a programming language paradigm - Identification of "objects" in RDB by unique value … adapted for databases - May change over time? - Frequently artificial keys as a substitute

HS / bioDBS04-3-OODB 3 HS / bioDBS04-3-OODB 6

1 Motivation for Objects 3.2 The ODMG Data Model

• Data abstraction • Environment of the OO Database Model • Encapsulation of state C++ Java (*) • Hiding of representation Object- • "Behaviour" application DBMS • Type extensions and inheritance application • Part of the EER Model model • Emulated by relations Smalltalk • Type versus extent ? application

Compatibility required: oodb Model and oo language models Defined bindings: C++, Java, Smalltalk, C#

HS / bioDBS04-3-OODB 7 drawing by FJ Kemper HS / bioDBS04-3-OODB 10

Object oriented Programming Languages… ODMG Type System

• … have all the desired features • Types – Atomic literal types • "Impedance mismatch" • Primitive types – View of data and operations on data integer, float,.. character, char strings, in DBS and in PL differs heavily boolean, enum Literal – Collection literal types Types • set, bag • list, array of any (single) type • dictionary – structures (record, C / C# struct) struct conditions { float humitiy; float pressure ; float temperature;}; • Missing: persistence, concurrency, indexing set

Object oriented Programming Environment ODMG Type System Persistent PLang as a substitute for DBS? • Class Types • Data model tightly bound to OO paradigm – Define "state and behaviour " of objects of – … but data live independent of any programming model this type – Data model should be suited for all kinds of programming models – Behaviour: operations (methods, functions) – Simplistic RDM a good compromise? – State : attributes and relationships • No common OO data model • Class types can be instantiated: new... – Eiffel, Java, C++, C#, Smalltalk,… class Experiment { – Standardization needed! attribute struct {float humitiy; float pressure ; • Object Data Management Group (ODMG) * float temperature;} cond; data model (1993) attribute time t ; attribute People experimenter ; // an object • Competing approach: extension of the RDM by OO attribute set concepts experimentType } * Different from Object Management Group (OMG) for Corba standardizationHS / bioDBS04-3-OODB 9 HS / bioDBS04-3-OODB 12

2 ODMG Data Model ODMG Data Model

• Relationships • Extents – Associate one object of class C1 to one or more of class C2 – Class type may be associated with an extent • Only binary relationships, recursive relationships (C1 == C2) – Extent of class type T: set of all objects of T allowed class People (extent ThePeople class type and its extent relationship People experimenter key name, first_name must be named differently inverse Experiment::makes; ) {.... • Relationships preserve integrity ("foreign keys") attribute string name;... • in contrast to attributes: attribute People experimenter; } – 1:1, 1:N , M:N relationships •Keys – implementation of relationships encapsulated by – class type may have a key of one or more of its public operations for drop and form members attributes – values must be unique HS / bioDBS04-3-OODB 13 HS / bioDBS04-3-OODB 16

Relationships ODMG Data Model • Inheritance • M:N relationships – Single inheritance supported by classes ("extends") class People{ – Inheritance of state and behaviour ... relationship set makesExperiments class People {.... inverse Experiment::experimenter; attribute string name;... } void writeReport(in: RType typOfRep )raises (noSuchType);) class Experiment{ }; ... relationship set experimenter class Scientist extends People inverse People::makesExperiments; {... } attribute set fields; ... }; - collection types different from set can be used: define enum subject (chem,bio,phys,bioInf) bag, list, array

HS / bioDBS04-3-OODB 14 HS / bioDBS04-3-OODB 17

ODMG Data Model: Operations ODMG Data Model

• Behaviour: Operations • Inheritance and extents – Method specifications (syntax ) with in, out, inout – Subset relationship of extents to be maintained by parameters system class People – Syntax identical to Corba interface definition (extent ThePeople) – exceptions {.... attribute string name;... – Implementation in binding language void writeReport(in: RType typOfRep )raises class Experiment{.... (noSuchType);); attribute string name;... }; void writeReport(in: RType typOfRep )raises class Scientist extends People (noSuchType); (extent Scientists // specification as comment. {... // implementation: binding language attribute set fields; ... ; ... }; } HS / bioDBS04-3-OODB 15 For each valid DB state Scientists ⊆ ThePeople HS / bioDBS04-3-OODB 18

3 ODMG Data Model OQL path expressions

• Multiple interitance by interfaces (sic!) • Navigation through object graph using predefined relations Employees Students isOfferedBy worksIn Course Prof Department is-a No way to express by class inheritance "Find name of department which offers the course "Topology" Tutors Select x.isOfferedBy.worksIn.name EmpIF from course x where x.name = "Topology" )

Employees Students Only allowed for n:1 / 1: 1 relationships Example: exactly one prof for course, exactly one dep for prof implements interface inherits Tutors HS / bioDBS04-3-OODB 19 HS / bioDBS04-3-OODB 22

3.3 ODMG / Object Query Language (OQL) OQL path expressions

• Language characteristics • Navigation through relationship sets – Query language with SQL flavour hasProf teaches – Declarative access to objects from application Department Professor Course

"Find professors from all departments with their offered courses, define bioScientists as Defines an extent provided the course is an introduction and more than 1 is offered" select x Extent is from Scientists x queried select * where bio in x.fields ) from departments x, x.hasProfs as y, Explicit iterator variables y,z – No updates: use methods which manipulate state y.offers as z – Means to traverse the object graph where z.name like "%ntroduction%" ) – OQL may invoke methods programmed in the language of the binding, e.g. Java, C++ Result type: bag

HS / bioDBS04-3-OODB 20 HS / bioDBS04-3-OODB 23

OQL typed result sets OQL methods in queries and subselects

• Simple queries and result sets "Find employees x with higher salary than 10K and for each x those who are paid higher" ThePeople // collection of all people objects

select x // has type bag select struct (n: x.name, from Scientists x hiPaid : (select y from employees where bio in x.fields where y.sal.monthlyPayment > x.sal. monthlyPayment )) select distinct x.name // has type set from employees x, from Scientists x where x.sal.monthlyPayment > 10000 ) where set (math,bio) in x.fields Result type:bag )> select distinct struct (a: x.name, b:x.fields) from Scientists x // Type: set y are equivalent notations // b:set}> HS / bioDBS04-3-OODB 21 HS / bioDBS04-3-OODB 24

4 OQL collection expressions OQL Polymorphism • Quantifiers • Polymorphic type system of ODL for all x in e1:e2 – Late binding to carry out operations on polymorphic exists x in e1: e2 collections are expression, if e1 denotes a collection and – Single dispatch and e2 is an expression of type boolean select struct (n: e.name, e: p.sal) // assumed: operation sal select x.name from People e from Experiments x sal operation is dispatched according to dynamic type of where for all y in x.experimenter: math in x.fields an People object. • Aggregation People max, min, sum,avg may be applied to collection e of numeric type Assistant Scientist count (select x.Experimenters from Experiments x ) count can be applied to any collection type object HS / bioDBS04-3-OODB 25 HS / bioDBS04-3-OODB 28

OQL Partitioning OQL

• Partitioning by means of group by • Type specifiers • Groups may be referred to by partition variables … as in Java and other oo languages find average salary of all departments with more than 10 emps OQL: select deptName, select ( (Scientist) e).subject avg(select p.sal.monthlyPayment from partition p) from People e from employees e where e.sal.monthlyPayment > 4000 group by deptName: e.dep.name Emp Dep // if only scientists earn more than 4000 having count (partition) > 10 • Constructing objects

Type of result of group by operator: People(name: "Abel", id: 23, sal: type( group by f1:e1, …fk:ek) = Salary(1000,0,0,0,0), fields: set(bio,math), set )> makesExperiments: set()) and type( e ) = t , type ( partition) = B, where B is type of from-clause i i HS / bioDBS04-3-OODB 26 HS / bioDBS04-3-OODB 29

OQL : more language features OQL language design

• Set expressions • OQL is a purely functional language union, except, intersect employees except profs • SQL expressions • Indexed expressions: – Ad hoc - in contrast to relational algebra – e of type list(t) or array(t), e' of type integer, then e[e'] is of type t and x[i] denotes the i+1 – th element of the indexed • OQL expressions: collection x – Typed – Freely composed by operators • Joins? - as long as type system is respected – – Necessary at all? – => Orthogonality of the language – Yes, if not all relationships anticipated in the schema select p from Prof p, Student s where p.name = s.name HS / bioDBS04-3-OODB 27 HS / bioDBS04-3-OODB 30

5 OO database systems Persistence Framework • Features •Systems – Implicit persistence: – Poet, Ontos, O2, ObjectStore, ... - framework automatically makes business objects – Do not implement all features of ODMG standard persistent – decreasing importance (?) - no action taken by application program • Examples: Enterprise Java Beans, Java Data Objects (container managed) • Significance – Explicit persistence – Useful for modeling complex domains - application program indicates when to save – Object relational systems more important • Most common in Persistence frameworks – Object relational Mapping architectures use – DB access generated.. Standard RDBS • dynamically: flexible • at compile time: simpler, better performance • during system start up

HS / bioDBS04-3-OODB 31 (example: Hibernate framework) HS / bioDBS04-3-OODB 34

3.4 Persistent objects 3.5 OPM • Design principles • Object Protocol Model – Keep language clean from DB access code – data model for genomic data (1993) – "persistence manager" between client code and DB – subset of oo data models – DB specific code generated for use by the persistent • no inheritance, set operations, explicit relationships, .... mgr – "protocol classes" • intended for modeling the flow of data / control in experiments DB access code • useful for modeling steps and their interrelationships e.g. Object- of experiments -Relational mapping, … – Implemented on top of relational (or oo) DBS Database layer any type allowed: Client space Persistence layer -Relational -Object oriented -Object relational HS / bioDBS04-3-OODB 32 HS / bioDBS04-3-OODB 35

Persistence Framework Protocols • Mapping between DB data and objects defined in a Meta • Experiments Data Repository (XML document, database, …) – process chain / graph of activities each having • Basic functionality: create, read, update, write, transaction • input support • attributes of the activity (what, when, ... dependent on activity • Enhanced: fault tolerance, mapping generation, caching etc. • output

– relational / object oriented modeling? – OPM designed to model data and data flow, not operations – basic innovation: interconnection of activites

HS / bioDBS04-3-OODB 33 HS / bioDBS04-3-OODB 36 Figure © S. Amber

6 Example Summary • Object models much more flexible than relational • OO programming languages have (different) underlying models • OO DB model standardized (Definition ODL, Query OQL) • Mapping to language model essential • Significance decreasing with object-relational mapping frameworks • ... and with XML? cf. Chen, Markowitz: the Object-Protocol Model, TR LBL-32738, Lawrence Berkeley lab

HS / bioDBS04-3-OODB 37 HS / bioDBS04-3-OODB 40

Example: Definition of a Protocal

HS / bioDBS04-3-OODB 38

Example (cont)

HS / bioDBS04-3-OODB 39

7