Introduction to the Relational Model
Total Page:16
File Type:pdf, Size:1020Kb
Today… § Relational Data Model- Definitions CS 2451 • High level concepts • We discuss how these are implemented in SQL Database Systems: More details when we cover SQL Relational Data Model § Front end tools – HTML/CSS/… § Next class - start with formal query languages before moving to SQL http://www.seas.gwu.edu/~bhagiweb/cs2541 Spring 2020 Instructor: Dr. Bhagi Narahari & R. Leontie Based on slides © Ramakrishnan&Gerhke, R. Lawrence 1 2 Database “People” – i.e., roles Some Terminology § There are several types of database ‘personnel’: § Database: • Database administrator (DBA) - responsible for installing, • A collection of related data. maintaining, and configuring the DBMS software. § Data: • Known facts that can be recorded and have an implicit meaning. o gets paid the big bucks ! § Mini-world: • Data administrator (DA) - responsible for organizational policies on data creation, security, and planning. • Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university. • Gets to be on the hot seat if data leaks happen.. § Database Management System (DBMS): • Database designer - defines and implements a schema for a • A software package/ system to facilitate the creation and maintenance database and associated applications. of a computerized database. Logical/Conceptual database designer - interacts with users to Example: MySQL, Oracle, MongoDB YOU determine data requirements, constraints, and business rules. § Database System: Physical database designer - implements the logical design for a • The DBMS software together with the data and (usually) the data model on a DBMS. Defines indexes, security, and constraints. applications • DBMS developer - writes the DBMS software code. • Application developer - writes code that uses the DBMS. • User - uses the database directly or through applications. 3 4 1 How will your database system be architected: Recent Developments Three-Tier Client-Server Architecture (Recent in my timeline J ) n Social Networks started capturing a lot of information about Tier 1: Client (Web/mobile) people and their communications (tweets, photos, videos..) •User Interface (using HTML/CSS) n - Facebook, Twitter, Linked-In,…. n All of the above constitutes data Tier 2: Application Server •Business logic – written using PHP n Search Engines: Google, Bing, Yahoo : collect their own •Data processing logic repository of web pages for searching purposes Database § New Technologies emerging to manage vast amounts of Tier 3: Database Server data generated on the web: •Data storage/management •Using MySQL • Big Data storage systems involving large clusters • NOSQL (Not Only SQL) systems • A large amount of data now resides on the “cloud” which means it is in huge data centers using thousands of machines. 5 6 Data Models..more definitions More terminology..Levels of Data Models § Data Model: § Conceptual (high-level, semantic) data models: • A formal framework to describe the data and structure, • Provide concepts that are close to the way many users constraints, of a database, and perceive data. • the operations for manipulating these structures § Logical level: • Constructs used to define the database structure • Provide concepts used by DBMS implementations • *In earlier definition we mixed conceptual and logical Data types, format (table),…ex: name is a char string § Physical (low-level, internal) data models: § Constraints: • details of how data is stored in the DBMS/computer. • Constraints specify some restrictions on valid data; these • These are usually specified in an ad-hoc manner through constraints must be enforced at all times..ex: GWID is unique DBMS design and administration manuals § Data Model Operations: § Self-Describing Data Models: • These operations are used for specifying database retrievals and • Combine the description of data with the data values. updates by referring to the constructs of the data model. Examples include JSON/XML, key-value stores and some Ex: find student name with GWID= abcd NOSQL systems. 7 8 2 Conceptual Data Model: The Entity-Relationship ( ER) Model Example: ER Design for mini-banner: § Provide database design that is easy to interpret by a wide Entities = Professors, Students, Courses class of users Relationship between entities • Not just database/CS experts fid PROFESSOR name • You want to provide a design to a “client” using a representation they can understand “Who’s taking what, and what grade do they § What’s a natural way to provide an easy to interpret expect?” representation…. Teaches • Fill in the blanks: A ________ is worth a thousand words STUDENT COURSE § “Visual” representation of the data, how it interacts, Takes constraints, etc. • And can be automatically mapped to a data model (relational) sid name exp-grade cid subj semester § ER-Model is one such data model One picture provides info on what your system stores and models • We will return to it after we cover the relational model UML anyone ? 9 10 Some more terminology….. Schemas versus Instances Database Schema for a COMPANY § Similar to types and variables in programming languages Database § Database Schema: structure of the database • The description of a database. • Includes descriptions of the database structure, data types, and the constraints on the database. • Schema Diagram: illustrative display of a database schema. § Database instance/state: actual data (content) stored in a database at a particular moment in time • Initial state: when database is loaded • Valid state: A state that satisfies the structures and constraints of the database… job of DBMS to ensure valid entries • The database schema changes very infrequently. Preferably never • The database state changes every time the database is updated. 11 12 3 Example of a Database Schema Populated database state/instance for COMPANY 13 14 History of Data Models Example Instance & Schema § Network Model STUDENT Takes COURSE § Hierarchical Model sid name sid exp-grade cid cid subj sem § Relational Model 1 Ross 1 A 550-0103 550-0103 DB S13 2 Lee 1 A 700-1003 700-1003 AI S13 § Object-oriented Data Models 3 Emily 3 C 500-0103 500-0103 Arch F12 • Object-Relational Models § NoSQL/ Big Data technologies PROFESSOR Teaches • Schema-less designs fid name fid cid § Our focus now: relational schema – set 1 Wood 1 550-0103 of tables 2 Heller 2 700-1003 § Can have other kinds of 8 Narahari 8 500-0103 schemas – XML, object, … Slide 2- 16 15 16 4 History of Data Models Next: Relational Model Concepts § Network Model: § The relational Model of Data is based on the concept of a • Network of records Relation • The first network DBMS implemented by Honeywell in 1964-65 • The strength of the relational approach to data management • Can model complex relationships between records i.e., graphs! comes from the formal foundation provided by the theory of § Hierarchical Model: relations • First implemented by IBM and Rockwell Intl. 1965 (?) § start with review of essentials of the formal relational model • Hierarchical network of records (trees like) • Abstract view of an Org Chart – models hierarchy in the data § In practice, there is a standard model based on SQL § Query language: COBOL • Hear about the Y2K bug/problem ? § Note: There are several important differences between the § Disadvantages: formal model and the practical model • Database contains complex set of pointers that thread through sets • And, there are variations in SQL features provided on different DBMS of records systems Little scope for “query optimization”, Procedural nature of processing Oracle SQL, MySQL, MS-SQL,… Do you like working with pointers ? J 17 18 Why Did It Take So Many Years to Implement Relational Model Concepts Relational Databases? § The model was introduced by Dr. E.F. Codd of IBM Research in 1970 § Codd’s original work: 1969-70 "A Relational Model for Large Shared Data Banks," Communications § Earliest relational database research: ~1976 of the ACM, June 1970 § Commercial Relational DBMSs: ~mid 1980s Describes the data minimally and mathematically – A relation describes an association between data items – tuples with § Widespread deployment” mid-1990’s attributes § Why the gap? Top 10 reasons… Uses standard mathematical (logical) operations over the data – 1. “You could do the same thing in other ways” relational algebra or relational calculus 2. “Nobody wants to write math formulas” § It contributed: data independence, query languages, query 3. “Why would I turn my data into tables?” optimization 4. “It won’t perform well” § The above paper caused a major revolution in the field of 5. … database management and earned Dr. Codd the coveted § What do you think? ACM Turing Award 19 20 5 Relational Model Definitions Example of a Relation § A relation is a table with columns and rows. § An attribute is a named column of a relation. § A tuple is a row of a relation. § A domain is a set of allowable values for one or more attributes. § The degree of a relation is the number of attributes it contains. § The cardinality of a relation is the number of tuples it contains. § A relational database is a collection of normalized relations with distinct relation names. Degree= 7, Cardinality = 5 21 22 Relational Model: Formal Definition Relation Schemas and Instances § Formally, a table is a relation over K sets (domains) § A relation schema is a definition of a single relation. • R Í A1 × A2 …. × AK Subset of the cartesian product of the K domains § A database schema is a set of relation schemas (modeling a particular domain). • Tuple= (t1,t2,…,tK), where ti Î Ai § A relation instance denoted r(R) over a relation schema R(A1, A2, …, § A database is a collection of relations An) is subset of the Cartesian product of the domains of all attributes in § Theoretically: a relation is a set of tuples; no tuple can occur the relation schema more than once § r(R) Í dom(A1) ´ dom(A2) ´ … ´ dom(An) • Real systems may allow duplicates for efficiency or other reasons – • i.e., a set of n-tuples <d1, d2, ..., dn> where each di is an element of we’ll ignore this for now dom(Ai) or is null.