CS317 File and Database Systems
Total Page:16
File Type:pdf, Size:1020Kb
CS317 File and Database Systems http://dilbert.com/strips/comic/1995-10-11/ Lecture 14 – OODBMS and UML Concepts November 9, 2015 Sam Siewert Reminders PLEASE FILL OUT COURSE EVALUATIONS ON CANVAS [5 points bonus on Assignment #6] Assignment #6, DBMS Project of Your Interest – POSTED – Work with your Chosen Team – Self-Directed – Autonomy, Mastery, Purpose and Life-long Learning and Requires Some Research on Your Part Assignment #4 & #5 Grading In Progress Assignment #6 Assessed with Final Grading Sam Siewert 2 Interdisciplinary Nature of DBMS CS332 – C++ & Java Final Lecture – Week 14 SE300/310 – OOA/OOD/OOP File Operating Systems Systems Programming Security Languages DBMS (SQL, OOP) Storage Networking ? (SAN, NAS, DAS) (Clusters, DR, Analytics Data Client/Server) Big MySQL Connectors CS332 – “R” C/C++, Java, … Sam Siewert 3 OODBMS and UML Concepts CHAPTER 27 Sam Siewert 4 Next Generation Database Systems First Generation DBMS: Network and Hierarchical – Required complex programs for even simple queries. – Minimal data independence. – No widely accepted theoretical foundation. Second Generation DBMS: Relational DBMS – Helped overcome these problems. Third Generation DBMS: OODBMS and ORDBMS. [NoSQL] Pearson Education © 2014 5 History of Data Models Pearson Education © 2014 6 Object-Oriented Data Model No one agreed object data model. One definition: Object-Oriented Data Model (OODM) – Data model that captures semantics of objects supported in object-oriented programming. Object-Oriented Database (OODB) – Persistent and sharable collection of objects defined by an ODM. Object-Oriented DBMS (OODBMS) – Manager of an ODB. OMG [Object Management Group], CORBA Pearson Education © 2014 7 Object-Oriented Data Model Zdonik and Maier present a threshold model that an OODBMS must, at a minimum, satisfy: – It must provide database functionality. DBMS – It must support object identity. Name space – It must provide encapsulation. – It must support objects with complex state. Distributed Object Stores – E.g. Ceph and Amplidata, OODBMS, Object-Relational Mapping Systems Pearson Education © 2014 8 Object-Oriented Data Model Khoshafian and Abnous define OODBMS as: – OO = ADTs + Inheritance + Object identity – OODBMS = OO + Database capabilities. Parsaye et al. gives: 1.High-level query language with query optimization. 2.Support for persistence, atomic transactions: concurrency and recovery control. 3.Support for complex object storage, indexes, and access methods. – OODBMS = OO system + (1), (2), and (3). Pearson Education © 2014 9 Commercial OODBMSs GemStone/S from Gemstone Systems Inc., Objectivity/DB from Objectivity Inc., ObjectStore from Progress Software Corp., Versant Object Database db40 and FastObjects from Versant Corp. Amplidata? - Object Store Open Source Object Stores and NoSQL ( R DBMS) – Ceph, Mongo DB NoSQL Pearson Education © 2014 10 Recall Object Definition – From Week 4 Instance of a Class [Hierarchy from Abstract that Can’t Be instantiated to Concrete Sub-classes] Classes Define Public and Private Data and Methods to Operate on that Data Sub-classes Inherit from Parent Classes and Refine Data Abstraction and Methods OOPS – Java, C++, …, back to Smalltalk and Lisp CLOS Encapsulation and Abstraction is the Goal Implementation Hiding and Interface Defintion Each OOP Has Variations on Support [E.g. Multiple Inheritance, Abstract Classes and Methods, Polymorphism [Parametric, Ad-hoc, Operator Overloading] E.g. Oracle’s Java Object Tutorial Sam Siewert 11 UML Example OOP – CS225, OOA/OOD – SE310 UML – Standardized OO Analysis and Design CASE Tools (Like MySQL Workbench) for C++, Java Sam Siewert 12 UML Class Compared to EER Schema EER – Entities, Attributes, Relationships (Schema) – Specifically Missing Operations on Entities – Does Include Inheritance and Aggregation UML Class Diagram (Specifies C++, Java, … OOP Class Hierarchy for Abstract and Concrete Classes) – Encapsulates Operations (that can be inherited) along with Attributes in Class Hierarchy – Includes Aggregation, Relationships and Hierarchy (like EER) Sam Siewert 13 Origins of the Object-Oriented Data Model Pearson Education © 2014 14 Persistent Programming Languages (PPLs) Language that provides users with ability to (transparently) preserve data across successive executions of a program, and even allows such data to be used by many different programs. • In contrast, database programming language (e.g. SQL) differs by its incorporation of features beyond persistence, such as transaction management, concurrency control, and recovery. Pearson Education © 2014 15 Persistent Programming Languages (PPLs) PPLs eliminate impedance mismatch [interface between application and DBMS] by extending programming language with database capabilities. – In PPL, language’s type system provides data model, containing rich structuring mechanisms. In some PPLs procedures are ‘first class’ objects and are treated like any other object in language. – Procedures are assignable, may be result of expressions, other procedures or blocks, and may be elements of constructor types. – Procedures can be used to implement ADTs. Pearson Education © 2014 16 Persistent Programming Languages (PPLs) PPL also maintains same data representation in memory as in persistent store. – Overcomes difficulty and overhead of mapping between the two representations. Addition of (transparent) persistence into a PPL is important enhancement to IDE, and integration of two paradigms provides more functionality and semantics. Pearson Education © 2014 17 Alternative Strategies for Developing an OODBMS Extend existing OO programming language. – GemStone extended Smalltalk. Provide extensible OODBMS library. – Approach taken by Ontos, Versant, and ObjectStore. Embed OODB language constructs in a conventional host language. – Approach taken by O2,which has extensions for C. – O2 Merged into Infromix, which was Acquired by IBM – http://www-01.ibm.com/software/data/informix/ Pearson Education © 2014 18 Single-Level v. Two-Level Storage Model With a traditional DBMS, programmer has to: 1. Decide when to read and update objects. 2. Write code to translate between application’s object model and the data model of the DBMS. 3. Perform additional type-checking when object is read back from database, to guarantee object will conform to its original type. Pearson Education © 2014 19 Single-Level v. Two-Level Storage Model Difficulties occur because conventional DBMSs have two-level storage model: storage model in memory, and database storage model on disk. In contrast, OODBMS gives illusion of single-level storage model, with similar representation in both memory and in database stored on disk. – Requires clever management of representation of objects in memory and on disk (called “pointer swizzling”). Pearson Education © 2014 20 Two-Level Storage Model for RDBMS Pearson Education © 2014 21 Single-Level Storage Model for OODBMS Pearson Education © 2014 22 No Swizzling Easiest implementation is not to do any swizzling. Objects faulted into memory, and handle passed to application containing object’s OID. OID is used every time the object is accessed. System must maintain some type of lookup table - Resident Object Table (ROT) - so that object’s virtual memory pointer can be located and then used to access object. Inefficient if same objects are accessed repeatedly. Acceptable if objects only accessed once. Pearson Education © 2014 23 Resident Object Table (ROT) Pearson Education © 2014 24 Object Referencing Need to distinguish between resident and non- resident objects. Most techniques variations of edge marking or node marking. Edge marking marks every object pointer with a tag bit: – if bit set, reference is to memory pointer; – else, still pointing to OID and needs to be swizzled when object it refers to is faulted into. Pearson Education © 2014 25 Object Referencing Node marking requires that all object references are immediately converted to virtual memory pointers when object is faulted into memory. First approach is software-based technique but second can be implemented using software or hardware-based techniques. Pearson Education © 2014 26 Pointer Swizzling Techniques The action of converting object identifiers (OIDs) to main memory pointers. • Aim is to optimize access to objects. • Should be able to locate any referenced objects on secondary storage using their OIDs. • Once objects have been read into cache, want to record that objects are now in memory to prevent them from being retrieved again. • Could hold lookup table that maps OIDs to memory pointers (e.g. using hashing). • Pointer swizzling attempts to provide a more efficient strategy by storing memory pointers in the place of referenced OIDs, and vice versa when the object is written back to disk. Pearson Education © 2014 27 Interesting to Ponder… How Will Persistent Main Memory Impact? – E.g. Phase Change Memory, PCM – Memristor – Nand Programming Time is Too Slow for Persistent Memory Solution How Will SSD and Nand Flash I/O Cards Impact? – Fusion IO, Micron, Virident [Purchased by WD], Intel, etc. – Persistent PCI Express [NVM Express] Block Device – SSD – SAS/SATA Device with Nand Internal Storage – Emulates HDD with Faster Random Access Sam Siewert 28 Accessing an Object with a RDBMS Pearson Education © 2014 29 Accessing an Object with an OODBMS Pearson Education © 2014 30 Persistent Schemes Consider three persistent schemes: – Checkpointing. – Serialization. – Explicit Paging. Note, persistence can also be applied to (object) code and to the