Object-Oriented Databases Commercial OODBMS: Part 2

• Versant for Java
• OODBMS Architectures, Revisited and Defended

October 31, 2008 Michael Grossniklaus – Department of Computer Science – [email protected]

Versant

• Company founded in 1988
• Object Database Management Systems
  - highly scalable and distributed object-oriented architecture
  - patented caching algorithm
• (, C++ and Java)
  - market leader in object databases
  - current version 7.0.1.3
  - available for many platforms
  - high availability option and tools
• Versant FastObjects .NET (Microsoft .NET Framework 2.0)
  - taken over from the merger with Poet in 2004
  - current version 10.0
  - 5.5 MB memory footprint

Worldwide Installations

• Telecommunications
  - Alcatel-Lucent, AT&T, Ericsson, Siemens, Nortel, France Telecom, Verizon, Samsung, Keymile, NEC
• Defense
  - BAE Systems, ESA, Lockheed Martin, FGM, Qinetiq, Raytheon, Northrop Grumman, Thales
• Financial services
  - BNP/Paribas, JP Morgan, AMEX, ING Barings, LCH Clearnet
• Transportation
  - British Airways, Sabre Group, Air France, GE Transportation, Qantas, Amadeus
• Other
  - Biomerieux, Factiva, EDS, Quantel, Oracle, Ovid

Versant Object Database Architecture

[Figure: layered architecture, from top to bottom]
  Versant C Interface | Versant C++ Interface | Versant Java Interface | Other Interfaces | Tools, etc.
  Versant Object Manager
  Versant Network Layer
  Versant Network Layer
  Versant Server
  Virtual System Layer
  Raw Devices, File Systems, RAID, SAN, NAS

Versant Dual Cache Architecture

[Figure: dual cache architecture]
  Versant Client: User Interface, Application Logic, Versant Object Manager, Object Cache, Front End Profile
  Versant Server: Versant Storage Manager, Page Cache, Back End Profile
  Storage: Logical Log File, Physical Log File, Roll-Forward Log, Database Volumes

Versant Multi-Threaded Architecture

[Figure: multi-threaded architecture]
  Client Process (several shown): Client Threads, Session Objects, Object Caches
  Server Process: Server Threads, Page Cache, Lock Table
  Log Buffer Thread: asynchronous I/O of non-commit buffer writes
  Background Page Flusher: writes modified pages to disk

Java Versant Interface (JVI)

• Provide easy-to-use storage of persistent Java objects
  - pure Java syntax and semantics
  - instances of nearly all classes can be stored and accessed
  - works seamlessly with the Java garbage collector
  - multiple threads can work in shared or independent transactions
• Client-server architecture
  - provides access to the Versant object database
  - client libraries cache objects for faster access and navigation
  - database queries are executed on the server
• Support for Java Development Kit
  - Version 6.0.5 supports JDK 1.3
  - Version 7.0.1 supports JDK 1.4 and 1.5

JVI Architecture

[Figure: JVI architecture]
  Java Client: JVM with Java Object Caches, JNI, Versant Object Manager with Object Caches
  TCP/IP
  Versant Server

JVI Layers

• Fundamental Layer
  - database-centric
  - objects manipulated indirectly through handles
  - package com.versant.fund
• Transparent Layer
  - language-centric
  - layered on top of fundamental binding
  - package com.versant.trans
• ODMG Layer
  - language-centric
  - ODMG 2.0 database and transaction model, ODMG collections
  - layered on top of transparent binding
  - package com.versant.odmg

Application Development with Versant

• Develop Java classes
  - make code “persistence aware”
  - sessions, transactions and concurrency
• Compile Java classes to generate byte-code
  - persistence behaviour inherited from base class com.versant.trans.Persistent
• Create configuration file for enhancer program
  - specify the persistence category for each Java class
• Run enhancer to make byte-code changes
• Create database
• Run application

First and Second Class Objects

• First Class Objects (FCO)
  - can be saved and retrieved independently as standalone objects
  - have Logical Object Identifiers (LOID)
  - can be the subject of queries
  - changes to existing instances are saved automatically
  - references are always valid
  - fields marked with transient are not saved in the database
• Second Class Objects (SCO)
  - can be saved only as part of an FCO
  - cannot be the subject of queries
  - if an SCO does not have a corresponding Versant attribute type, it is stored as a serialized Java byte stream
  - fields marked with transient are not saved in the database
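
The last two SCO points can be illustrated with standard Java serialization. The sketch below is only an analogy under stated assumptions: the `Address` class and its fields are hypothetical, and the byte array produced by `java.io` stands in for the byte stream stored with the owning FCO. It does not show Versant's actual storage format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical second class object: it has no identity of its own.
class Address implements Serializable {
    private static final long serialVersionUID = 1L;
    String city;
    transient String displayLabel; // transient, so it is left out of the byte stream

    Address(String city, String displayLabel) {
        this.city = city;
        this.displayLabel = displayLabel;
    }
}

class ScoSerializationSketch {
    // Serialize the embedded object into a byte array, as a stand-in for the stored byte stream.
    static byte[] toBytes(Address address) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(address);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object from the byte array; transient fields come back as null.
    static Address fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Address) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Address restored = fromBytes(toBytes(new Address("Zurich", "Zurich (CH)")));
        System.out.println(restored.city + " / " + restored.displayLabel); // prints "Zurich / null"
    }
}
```

The transient field is lost in the round trip, which mirrors the rule that transient fields are not saved in the database.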

Persistence Categories (FCO)

• Persistent always (p)
  - becomes persistent at object instantiation itself
• Persistence capable (c)
  - new instances are initially transient, but may become persistent
  - makeRoot(), makePersistent() or persistence by reachability
  - object is automatically marked dirty when modified
• Superclass of a “p” or “c” class must also be “p” or “c”
  - unless the superclass is Object
  - note that this rule is recursive
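
Persistence by reachability means that a transient object becomes persistent once a persistent object refers to it. A minimal sketch of the underlying idea, assuming a hypothetical `Node` class in place of real persistence-capable classes: starting from a root, follow references and collect everything that can be reached. This is plain Java graph traversal, not the algorithm Versant uses internally.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical persistence-capable object with references to other objects.
class Node {
    final String name;
    final List<Node> references = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

class ReachabilitySketch {
    // Collect every object reachable from the root by following references.
    static Set<Node> reachableFrom(Node root) {
        Set<Node> reached = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (reached.add(current)) { // first visit only, so cycles terminate
                for (Node next : current.references) {
                    pending.push(next);
                }
            }
        }
        return reached;
    }
}
```

Objects outside the returned set would stay transient in this model, because nothing persistent refers to them.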

Persistence Categories (SCO)

• Transparent dirty owner (d)
  - changes to object automatically mark its owner object as dirty
  - used for serialized collections
• Persistence aware (a)
  - can directly modify attributes of an FCO
  - will automatically mark that FCO as dirty upon any changes
• Not persistent (n)
  - no byte code enhancement
  - cannot directly access the fields of a persistent object
  - access to such fields will throw an IllegalAccessError
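
The “transparent dirty owner” idea can be sketched as a collection that reports every modification to its owner. The classes `OwnerStub` and `DirtyOwnerList` are hypothetical and hand-written here for illustration. In Versant the equivalent behaviour is added by the byte-code enhancer, whose generated code is not shown in the slides.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a first class object that tracks its dirty state.
class OwnerStub {
    boolean dirty;

    void markDirty() {
        dirty = true;
    }
}

// A list that marks its owner dirty whenever its contents change.
class DirtyOwnerList<E> extends AbstractList<E> {
    private final List<E> items = new ArrayList<>();
    private final OwnerStub owner;

    DirtyOwnerList(OwnerStub owner) {
        this.owner = owner;
    }

    @Override
    public E get(int index) {
        return items.get(index);
    }

    @Override
    public int size() {
        return items.size();
    }

    @Override
    public void add(int index, E element) {
        items.add(index, element);
        owner.markDirty();
    }

    @Override
    public E set(int index, E element) {
        E previous = items.set(index, element);
        owner.markDirty();
        return previous;
    }

    @Override
    public E remove(int index) {
        E removed = items.remove(index);
        owner.markDirty();
        return removed;
    }
}
```

Read-only calls such as `get()` and `size()` leave the owner clean, so only real modifications would cause the owner to be written back on commit.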

Connecting to a Database

• Applications perform database operations in sessions
  - access to databases, methods, data types and persistent objects
  - must be closed before application terminates
  - one or more sessions can be open at the same time
• In each JVI layer, a session implementation exists
• Client session elements
  - object cache
  - cached object descriptor table
• Server session elements
  - associated with each connected database is a page cache for recently accessed pages
  - server page cache is in shared memory of the machine containing the connected database

Transaction Model

• Upon starting a session, Versant is always in a transaction
  - a new transaction is started automatically after commit() or rollback()
  - endSession() commits the last transaction
• Transactions have the following characteristics
  - atomic, consistent, independent, durable
  - coordinated: objects are locked for coordination with other users
  - distributed: two-phase commit for working with multiple databases
  - ever-present: application code is always in a transaction
• Committing units of work
  - commit(): releases locks and flushes cache
  - checkpointCommit(): retains locks and retains cached objects
  - commitAndRetain(): releases locks and retains cached objects
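
The difference between the three commit variants lies only in what happens to locks and cached objects. The toy model below encodes exactly the three rules listed above. The class `ToySession` and its fields are hypothetical and unrelated to the real JVI session classes. Writing changes to the database is deliberately left out.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of a session: it tracks only held locks and cached object ids.
class ToySession {
    final Set<String> locks = new HashSet<>();
    final Set<String> cache = new HashSet<>();

    // Accessing an object locks it and places it in the client cache.
    void access(String objectId) {
        locks.add(objectId);
        cache.add(objectId);
    }

    // commit(): releases locks and flushes the cache.
    void commit() {
        locks.clear();
        cache.clear();
    }

    // checkpointCommit(): retains locks and retains cached objects.
    void checkpointCommit() {
        // nothing to release in this model
    }

    // commitAndRetain(): releases locks and retains cached objects.
    void commitAndRetain() {
        locks.clear();
    }
}
```

Under this model, `checkpointCommit()` suits long units of work that continue on the same objects, while `commitAndRetain()` lets other users proceed and still avoids reloading the cache.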

Example

    // use the transparent layer
    TransSession session = new TransSession("publications");

    // find a previously defined root
    Set<?> pubs = (Set<?>) session.findRoot("pubs");

    // create a new author, assuming that the Author class is either "p" or "c"
    Author moira = new Author("Moira C. Norrie");
    for (Object pub : pubs) {
        Publication p = (Publication) pub;
        p.addAuthor(moira);
    }

    // commit the changes
    session.commit();

    // end the session
    session.endSession();

Updating Objects

• First Class Objects
  - changes to first class objects are automatically applied to the database upon commit
  - database objects are modified transparently
  - values of basic types are copied to database
• Second Class Objects
  - Transparent Dirty Owner: changes to objects are automatically applied to the database upon commit
  - Persistent Aware and Not Persistent Aware: modification of objects requires explicitly marking the owner object as dirty using method TransSession.dirtyObject()
  - the reason for this is that second class objects are serialised into the first class object that contains them

Deleting Objects

• First Class Objects
  - delete explicitly with TransSession.deleteObject() and TransSession.groupDeleteObjects()
  - these methods refer to database objects; Java instances will be garbage collected by the JVM
  - JVM calls finalize() upon garbage collection, not deletion
• Second Class Objects
  - deleted implicitly by setting reference to null
  - memory will be garbage collected by the JVM
  - upon commit, the containing first class object will not serialise the second class object

Versant Query Language (VQL)

• VQL 6
  - VQL 6 queries are a subset of OQL as specified by ODMG 2.0
  - no sorting, no extensions for new capabilities, limited API
  - as of Versant 7.0, VQL 6 queries are deprecated
• VQL 7
  - support for complex expressions
  - support for server-side sorting
  - improved indexing capabilities
• VQL queries are specified as a query string that is compiled and executed on the database server
• Queries can be parameterised
  - parameter starts with a $ followed by characters, digits or underscore
  - parameters are bound to values using the bind() method
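
The parameter syntax can be made concrete with a small helper that lists the parameter names occurring in a query string. This is a sketch under assumptions: “characters” is read as ASCII letters, the class `VqlParameterSketch` is hypothetical, and the helper is not part of the Versant API. It also does not skip `$` signs inside string literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VqlParameterSketch {
    // '$' followed by one or more letters, digits or underscores.
    private static final Pattern PARAMETER = Pattern.compile("\\$([A-Za-z0-9_]+)");

    // Return the parameter names in order of appearance, without the leading '$'.
    static List<String> parameterNames(String queryString) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PARAMETER.matcher(queryString);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // prints "[name]"
        System.out.println(parameterNames("select selfoid from Author where name = $name"));
    }
}
```

Each name found this way corresponds to one `bind()` call that has to be made before the query is executed.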

VQL 7 Example

    // create a new publication, assuming the Publication class is "p"
    Publication pub = new Publication("Web 2.0 Survey");

    // find authors Stefania Leone and Moira C. Norrie
    String queryString = "select selfoid from Author where name = $name";
    Query query = new Query(session, queryString);
    query.bind("name", "Stefania Leone");
    QueryResult result = query.execute();
    Object author = result.next();
    if (author != null) {
        pub.addAuthor((Author) author);
    }
    query.bind("name", "Moira C. Norrie");
    result = query.execute();
    author = result.next();
    if (author != null) {
        pub.addAuthor((Author) author);
    }

VQL 7 Example

Note: Currently, collections can neither contain strings nor be parameters. Hence, this example is not possible.

    // find all publications by Moira C. Norrie and Michael Grossniklaus
    String queryString = "select selfoid " +
                         "from Publication " +
                         "where Publication::authors subset_of $authors";

    // precompile the query on the server
    Query query = new Query(session, queryString);

    // bind query to set of already existing author objects
    query.bind("authors", new Object[] { moira, michael });
    QueryResult result = query.execute();

    // print out the names of the publications
    for (Object pub = result.next(); pub != null; pub = result.next()) {
        Publication p = (Publication) pub;
        System.out.println(p.getTitle());
    }

JVI Client Cache Loader

• Versant uses a client-side object cache
  - contains query results and objects accessed through navigation
  - server tracks which objects are cached by the clients
• Automatic object loading through closure
  - given a starting point, closure is defined as the identification and retrieval of related objects relative to the starting point
  - each time an object is dereferenced, the object manager decides if closure is required and will then locate and load the related object
• The JVI Client Cache Loader API can be used to control how and when objects are loaded
  - each dereference consists of network RPC, object lookup and I/O
  - efficiency can be improved by loading multiple objects at once
  - however, this introduces vendor-specific code into domain classes

Breadth, Depth and Path Loading

[Figure: object graph illustrating breadth, depth (Level 1, Level 2, Level 3) and path loading]

• Client closure helper classes provide two simple API calls
  - groupReadObjects() and getClosure() in class Loader
• Load policies provide control outside of application code
  - policies to control the loading of objects specified in XML file
  - XML “compiled” by Versant PolicyMaker utility
  - load() in class Loader loads objects based on the specified policy
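
Why loading several objects at once helps can be shown with a toy model of depth-limited closure loading. Everything here is an assumption made for illustration: the class `ClosureSketch`, the in-memory `references` map that plays the role of the server, and the `roundTrips` counter. The sketch fetches one whole level of the object graph per simulated network call and does not use the Versant `Loader` class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ClosureSketch {
    // Hypothetical server-side object graph: object id -> ids of referenced objects.
    final Map<String, List<String>> references = new HashMap<>();
    int roundTrips = 0;

    // One simulated network call that fetches a group of objects
    // and returns the ids they refer to.
    private List<String> groupRead(List<String> ids) {
        roundTrips++;
        List<String> referenced = new ArrayList<>();
        for (String id : ids) {
            referenced.addAll(references.getOrDefault(id, Collections.<String>emptyList()));
        }
        return referenced;
    }

    // Load the root and everything within maxDepth levels, using one call per level.
    Set<String> loadClosure(String root, int maxDepth) {
        Set<String> loaded = new LinkedHashSet<>();
        List<String> level = new ArrayList<>(Arrays.asList(root));
        for (int depth = 0; depth <= maxDepth && !level.isEmpty(); depth++) {
            List<String> toFetch = new ArrayList<>();
            for (String id : level) {
                if (loaded.add(id)) { // skip objects that are already cached
                    toFetch.add(id);
                }
            }
            if (toFetch.isEmpty()) {
                break;
            }
            level = groupRead(toFetch);
        }
        return loaded;
    }
}
```

For a publication that refers to two authors, loading to depth 1 takes two simulated calls in this model, whereas dereferencing the three objects one by one would take three. The gap widens as levels get broader.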

OODBMS Architectures

• RDBMS architectures are very similar
  - server-centric, index-based, relational algebra execution engine
  - performance and scalability numbers vary by small percentages
• OODBMS architectures vary considerably and exhibit wildly different characteristics
  - performance and scalability numbers may vary by orders of magnitude
• OODBMS architectures and their impact on expectations of early adopters can be seen as one of the factors for the, initially, limited success of OODBMS
• It is important to consider application characteristics and understand which OODBMS architecture is best suited

The Major Factors

• Key distinguishing implementation differences lead to vastly different runtime characteristics
• Primary areas impacting on performance include
  - core architecture
  - concurrency model
  - network model
  - query implementation
  - identity management
• Other feature functionality may also have an impact depending on application characteristics

Core Architecture

• Core architecture determines the following aspects
  - caching
  - query processing
  - transaction management
  - object life cycle management, i.e. tracking of new, dirty and deleted objects
• Three architectural variants are popular in OODBMS
  - container-based
  - page-based
  - object-based
• Name of the architecture reflects both
  - unit of transfer in network calls
  - lowest level of locking granularity

Container-Based Architecture

[Figure: container-based architecture. Labels: Server, Locks, Disk, NFS, Clients with Container Cache and Query, Remote Query, NFS Requests, NFS Pages]

• Ships disk segments (containers) across the network
• Client libraries implement database functionality
  - container caching, query processing, object life cycle management and transactions
• All objects must reside inside a container
• Container model is layered over application domain model

Page-Based Architecture

[Figure: page-based architecture. Labels: Server, Lock Coordinator, Page Server, Disk, Clients with Page Cache, Swap and Query, Page Requests, Page Responses]

• Ships pages of disk across the network
  - pages get address translated into virtual memory of an application
• Client libraries implement database functionality
  - page caching, query processing, object life cycle management and transactions
• Typically, object placement strategies are implemented

Object-Based Architecture

[Figure: object-based architecture. Labels: Object Server with Query and Page Cache, Disk, Clients with Object Cache, Object Requests, Objects]

• Ships objects across the network
• Caching and behaviour in both client and server
  - server: page cache, indexes, locks, queries and transactions
  - client libraries: object caching, local locking and object life cycles
• No object placement strategies have to be implemented

Concurrency Model

• The three core architectures have differing concurrency models to provide transaction isolation
  - container concurrency
  - page concurrency
  - object concurrency
• All implementations of these models
  - are tightly coupled with network characteristics
  - can cache locks locally at the client across transaction boundaries
  - are likely to require lock coordination and cache consistency operations for updates
• In container and page-based systems, object placement and locking are tightly coupled
• In object-based systems, these issues are orthogonal

Container Concurrency

• Separate lock server coordinates concurrent access to data in same container
• Locking algorithm
  - clients request lock before caching a container
  - locks are generally released at transaction boundaries
  - write request will establish a queue if there are already read requests
  - subsequent read and write requests inserted in queue
  - after write request has been filled, other requests are filled
  - queue disappears if no further write requests
  - clients caching an updated container must refresh it before read
• As containers hold many objects, possibility of false waits and deadlocks

Page Concurrency

• Page server coordinates concurrent access
  - tracks which applications are accessing each page
  - grants permission to lock pages locally at the client
• Locking algorithm
  - clients request lock before caching a page
  - write request causes server to use lock callbacks
  - clients either release lock and give up permission to cache page, or request blocks on each objecting client for a specified timeout
  - clients that have released lock will refresh page on next access
• As locks and permissions are taken out at the page level, false waits and deadlocks can occur

Object Concurrency

• Server maintains queues at object level to control concurrency of access to the same object
• Locking algorithm
  - clients request lock before caching object locally
  - locks are generally released at transaction boundaries
  - write request will establish queue, if read locks already exist
  - all subsequent requests are inserted into queue
  - after write request has been filled, other requests are filled
  - queue disappears if no further write requests
  - clients caching an updated object must refresh it before read
• Since locking is done at object level, no false waits or deadlocks can occur
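
A false wait arises when two transactions touch different objects that happen to share a lock unit. The sketch below makes this visible with a toy lock table whose granularity is configurable. The class `GranularityLockTable`, the object ids of the form `container/object` and the write-lock-only model are all assumptions for illustration. Queues, read locks and lock release are omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class GranularityLockTable {
    // Maps an object id to the unit that actually gets locked (container, page or object).
    private final Function<String, String> lockUnitOf;
    private final Map<String, String> writerOfUnit = new HashMap<>();

    GranularityLockTable(Function<String, String> lockUnitOf) {
        this.lockUnitOf = lockUnitOf;
    }

    // Returns true if the write lock is granted, false if the transaction would have to wait.
    boolean tryWriteLock(String transaction, String objectId) {
        String unit = lockUnitOf.apply(objectId);
        String holder = writerOfUnit.get(unit);
        if (holder != null && !holder.equals(transaction)) {
            return false;
        }
        writerOfUnit.put(unit, transaction);
        return true;
    }

    public static void main(String[] args) {
        GranularityLockTable byContainer =
                new GranularityLockTable(id -> id.substring(0, id.indexOf('/')));
        GranularityLockTable byObject = new GranularityLockTable(id -> id);

        byContainer.tryWriteLock("t1", "c1/o1");
        byObject.tryWriteLock("t1", "c1/o1");

        // t2 wants a different object in the same container.
        System.out.println("container locking grants t2: " + byContainer.tryWriteLock("t2", "c1/o2"));
        System.out.println("object locking grants t2: " + byObject.tryWriteLock("t2", "c1/o2"));
    }
}
```

With container granularity the second transaction has to wait although it never touches the first transaction's object. With object granularity both proceed, and only a request for the very same object is refused.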

Network Model

• Core architectures imply three different network models
  - container network model
  - page network model
  - object network model
• In a distributed system, different network models affect
  - bandwidth utilisation
  - performance
• Network model is closely tied to locking model
  - information transfer occurs upon lock, if not already cached locally
  - lowest granularity level of network transfer is equal to lowest granularity level of locking

Network Models

• Container network model
  - object request translated to container request
  - entire container transferred, even if not all objects are accessed
  - locks are placed on the whole container
• Page network model
  - object request translated to page request
  - entire page transferred, even if not all objects are accessed
  - locks are placed on the whole page
• Object network model
  - unit of transfer is an object or collection of objects
  - object or collection request may include a depth request
  - locks are placed on one object or all objects within a collection

Query Implementation

• OODBMS focus on supporting seamless navigation between related objects using language constructs
  - native support of navigational access is a key advantage of OODBMS over the RDBMS complex join concept
  - relationships are a static part of the system rather than computed at runtime
• Querying data in OODBMS consists of two steps
  - access first level objects of a use case
  - use navigation to access related objects
• Query implementation impacts on
  - where query execution takes place
  - flexibility of what can be queried
  - indexing capabilities

Container Query

• Query processing at the client
  - client may be another remote client
  - client is, however, a process separate from the NFS page server
  - “opposite” architecture compared to a relational database
• Query processing
  - all objects involved in the query must be identified by the database or container they reside in (which also includes potential indexes) and loaded into the client process for query execution
  - query returns the containers holding the result objects
• From network and locking perspective, result may contain objects that did not actually satisfy the query predicate

Page Query

• Query processing at the client
• Query processing
  - all pages containing objects involved in the query are loaded
  - query returns references to objects that satisfy the query predicate
  - pages containing these objects have, implicitly, already been loaded across the network and translated into the client memory
• Indexed query processing
  - all index pages relevant to the query are loaded
  - query returns references to objects that satisfy the query predicate
  - pages containing these objects may have to be loaded across the network and translated into the client memory
• From network and locking perspective, result may contain objects that did not actually satisfy the query predicate

Object Query

• Query execution engine runs within database server
  - any object is reachable via query, even if it has no relationships
  - indexes for object attributes maintained on server
• Query processing
  - query statement sent to server from client
  - query executed on server using optimiser and indexes
  - result set of objects satisfying the query predicate returned to client
• Only query statement and objects satisfying the query are transferred across the network
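
The effect of query placement on network traffic can be estimated with a toy count of shipped objects. The model below is an assumption-laden sketch: stored data is a list of pages holding integers, the class `QueryShippingSketch` is hypothetical, and indexes are ignored. It contrasts an unindexed page query, where every page travels to the client, with an object query, where the server filters first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class QueryShippingSketch {
    // Unindexed page query: every page is shipped and the client evaluates the predicate,
    // so the number of shipped objects equals the number of stored objects.
    static int objectsShippedByPageQuery(List<List<Integer>> pages) {
        int shipped = 0;
        for (List<Integer> page : pages) {
            shipped += page.size();
        }
        return shipped;
    }

    // Object query: the server evaluates the predicate and ships only the matches.
    static List<Integer> objectQuery(List<List<Integer>> pages, Predicate<Integer> predicate) {
        List<Integer> result = new ArrayList<>();
        for (List<Integer> page : pages) {
            for (Integer value : page) {
                if (predicate.test(value)) {
                    result.add(value);
                }
            }
        }
        return result;
    }
}
```

For two pages of three objects each and a predicate matched by two objects, the page query ships six objects in this model and the object query ships two.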

Identity Management

• OODBMS use object identity to establish uniqueness and to implement relationships
• Identity management impacts on
  - long-term operational behaviour
  - flexibility and data scalability
• Physical identity
  - unique identifier is dependent on the physical location
  - mutable, reusable, immobile and rigid
  - dereferencing very fast
• Logical identity
  - unique identifier is independent of the physical location
  - immutable, never reused, mobile and flexible
  - logical references need to be translated into physical references
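
The translation step that logical identity requires can be sketched as a lookup table from logical identifiers to physical locations. The class `LogicalIdentitySketch`, the string format of the locations and the in-memory map are assumptions for illustration. The slides do not describe how Versant implements this mapping.

```java
import java.util.HashMap;
import java.util.Map;

class LogicalIdentitySketch {
    // Logical identifier -> current physical location of the object.
    private final Map<Long, String> locationOfId = new HashMap<>();
    private long nextId = 1;

    // Store an object at a physical location and hand out a fresh identifier.
    long store(String physicalLocation) {
        long id = nextId++; // identifiers only grow, so none is ever reused
        locationOfId.put(id, physicalLocation);
        return id;
    }

    // Relocate the object; the identifier held by other objects stays valid.
    void move(long id, String newPhysicalLocation) {
        locationOfId.put(id, newPhysicalLocation);
    }

    // The extra step compared to physical identity: translate before accessing.
    String dereference(long id) {
        return locationOfId.get(id);
    }
}
```

The lookup in `dereference()` is the price paid for mobility. With physical identity the identifier itself would be the location, which is faster to follow but breaks when the object moves.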

OODBMS Architectures Revealed

• Objectivity/DB
  - container-based architecture
  - physical identity
• ObjectStore Enterprise
  - page-based architecture (queries can be executed on the server)
  - physical identity
• Versant Object Database
  - object-based architecture
  - logical identity
•
  - object-based architecture
  - physical identity

Literature

• Versant Object Database
  - http://www.versant.com/
• Robert Greene: OODBMS Architectures
  - http://www.odbms.org/experts.html#article9
• Adrian Marriott: OODBMS Architectures Revisited
  - http://www.odbms.org/experts.html#article11
• Robert Greene: OODBMS Architectures Defended
  - http://www.odbms.org/experts.html#article12

Next Week

Storage and Indexing

• Type Hierarchy Indexing
• Aggregation Path Indexing
• Collection Operations