INFO2120 – INFO2820 – COMP5138 Systems Week 3: The Relational Data Model (Kifer/Bernstein/Lewis - Chapter 3; Ramakrishnan/Gehrke - Chapter 3)

Dr. Uwe Röhm School of Information Technologies

Outline

! Introduction to the Relational Data Model ! Creating Schemas using SQL ! Mapping E-R Diagrams to Relational Schemas ! Levels of Abstraction / View Definitions

Based on slides from Kifer/Bernstein/Lewis (2006) “Database Systems” and from Ramakrishnan/Gehrke (2003) “Database Management Systems”, and including material from Fekete and Röhm. 03-2 How can one system support such diverse applications?

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-3

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-4 Relational Data Model

! The was first proposed by Dr. E.F. ‘Ted’ Codd of IBM in 1970 in: "A Relational Model for Large Shared Data Banks”, Communications of the ACM, June 1970. ! This paper caused a major revolution in the field of database management and earned Ted Codd the coveted ACM Turing Award in 1981. Photo of Edgar F. Codd" ! The relational model of data is based on the mathematical concept of . ! Studied in Discrete Mathematics

! The strength of the relational approach to data management comes from its simple way of structuring data, based on a formal foundation provided by the theory of relations.

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-5

Being an IT professional and not knowing the relational data model is like practising medicine without a license.

Chris Date

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-6 Data Model vs. Schema

! Data model: a collection of concepts for describing data ! Structure of the data ! Operations on the data ! Constraints on the data

! Schema: a description of a particular collection of data at some abstraction level, using a given data model

! Relational data model is the most widely used model today ! Main concept: relation, basically a with rows and columns ! Every relation has a schema, which describes the columns, or fields

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-7

Definition of Relation

! Informal Definition: A relation is a named, two-dimensional table of data ! Table consists of rows (record) and columns (attribute or field)

! Example: Attributes (also: columns, fields)

Student sid name login gender address

5312666 Jones ajon1121@cs m 123 Main St Tuples 5366668 Smith smith@mail" m 45 George (rows, records) 5309650" Jin" ojin4536@it" f" 19 City Rd"

Conventions: we try to follow a general convention that relation names begin with a capital letter, while attribute names begin with a lower-case letter INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-8 Some Remarks

! Not all tables qualify as a relation. ! Requirements: ! Every relation must have a unique name. ! Attributes (columns) in tables must have unique names. => The order of the columns is irrelevant. ! All tuples in a relation have the same structure; constructed from the same set of attributes ! Every attribute value is atomic (not multivalued, not composite). ! Every is unique (cant have two rows with exactly the same values for all their fields) ! The order of the rows is immaterial

! The restriction of atomic attributes is also known as (1NF).

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-9

Example

! Is this a correct relational table? id name Name gender address phones 1234 Peter Pan M Neverland 0403 567123

5658 Dan Murphy M Alexandria 02 67831122 0431 567312 9876 Jin Jiao F Jkdsafas sdf asdjf st 8877 Sarah Sandwoman F Glebe 02 8789 8876

1234 Peter Pan M Neverland 0403 567123

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-10 Formal Definition of a Relation

! A Relation is a mathematical concept based on the ideas of sets. ! Relation R

Given sets D1, D2, …, Dn, a relation R is a subset of D1 x D2 x…x Dn Thus, a relation is a set of n-tuples (a1, a2, …, an) where ai ∈ Di ! Example: If customer-name = {Jones, Smith, Curry, Lindsay} customer-street = {Main, North, Park} customer-city = {Sydney, Brisbane, Perth} then R = { (Jones, Main, Sydney), (Smith, North, Perth), (Curry, North, Brisbane), (Lindsay, Park, Sydney) } is a relation over customer-name x customer-street x customer-city

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-11

Relation Schema vs. Relation Instance

! A relation R has a relation schema: ! specifies name of relation, and name and data type of each attribute.

! A1, A2, …, An are attributes

! R = ( A1, A2, …, An ) is a relation schema ! e.g. Students(sid: string, name: string, login: string, addr: string, gender: char)

! A relation instance: a set of tuples (table) for a schema

! D1, D2, …, Dn are the domains ! each attribute corresponds to one domain:

dom(Ai) = Di , 1 <= i <= n

! R ⊆ D1 x D2 x…x Dn ! #rows = cardinality, #fields = degree (or arity) of a relation

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-12 Theory vs. Technology

! A relational DBMS supports data in a form close to, but not exactly, matching the mathematical relation ! RDBMS allows null ‘values’ for unknown information ! RDBMS allows duplicate rows ! The formal relational model does not allow for duplicates

! plus some more differences which we will see later (e.g. RDBMS support an order of tuples or attributes)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-13

The Special NULL ‘Value’

! RDBMS allows a special entry NULL in a to represent facts that are not relevant, or not yet known ! Eg a new employee has not yet been allocated to a department ! Eg salary, hired may not be meaningful for adjunct lecturers

lname fname salary birth hired

Jones Peter 35000 1970 1998

Smith Susan null 1983 null

Smith Alan 35000 1975 2000

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-14 Pro and Con of NULL

! Pro: NULL is useful because using an ordinary value with special meaning does not always work ! Eg if salary=-1 is used for “unknown” in the previous example, then averages won’t be sensible

! Con: NULL causes complications in the definition of many operations ! We shall ignore the effect of null values in our main presentation and consider their effects later

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-15

Creating and Deleting Relations in SQL

! Creation of tables (relations): CREATE TABLE name ( list-of-columns ) ! Example: Create the Students relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. CREATE TABLE Student ( sid INTEGER, name VARCHAR(20), login VARCHAR(20), gender CHAR, address VARCHAR(50) );

! Deletion of tables (relations): DROP TABLE name ! the schema information and the tuples are deleted. ! Example: Destroy the Students relation DROP TABLE Students

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-16 Base Datatypes of SQL

Base Datatypes Description Example Values SMALLINT INTEGER Integer values 1704, 4070 BIGINT DECIMAL(p,q) Fixed-point numbers with precision p and q 1003.44, 160139.9 NUMERIC(p,q) decimal places FLOAT(p) REAL floating point numbers with precision p 1.5E-4, 10E20 DOUBLE PRECISION CHAR(q) alphanumerical character string types VARCHAR(q) of fixed size q respectively ‚The quick brown fix jumps…’, ’INFO2120’ CLOB(q) of variable length of up to q chars BLOB(r) binary string of size r B’01101’, X’9E’ DATE date DATE ’1997-06-19’, DATE ’2001-08-23’ TIME time TIME ’20:30:45’, TIME ’00:15:30’ TIMESTAMP timestamp TIMESTAMP ’2002-08-23 14:15:00’

INTERVAL time interval INTERVAL ’11:15’ HOUR TO MINUTE

(cf. Türker, ORDBMS 2004/2005)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-17

Create Table Example

Student Enrolled UnitOfStudy sid name sid ucode semester ucode title credit_pts

CREATE TABLE Student ( sid INTEGER, name VARCHAR(20) ); CREATE TABLE UnitOfStudy ( ucode CHAR(8), title VARCHAR(30), creditPoints INTEGER ); CREATE TABLE Enrolled ( sid INTEGER, ucode CHAR(8), semester VARCHAR );

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-18 SQL Schemas & Table Modifications

! Database Servers typically shared by multiple users ! We want separate schemas per user (naming: schema.tablename) ! CREATE SCHEMA … command ! E.g. http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_6014.htm or http://www.postgresql.org/docs/8.3/static/sql-createschema.html ! If not provided, automatic schema by user name (cf. Oracle)

! Several base data types available in SQL ! E.g. INTEGER, REAL, CHAR, VARCHAR, DATE, … ! but each system has its specialities such as specific BLOB types or value range restrictions ! E.g. Oracle calls a string for historical reasons VARCHAR2 ! cf. online documentation

! Existing schemas can be changed ALTER TABLE name ADD COLUMN … | ADD CONSTRAINT…| … ! Huge variety of vendor-specific options; cf. online documentation

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-19

Modifying Relations using SQL

! Insertion of new data into a table / relation ! Syntax: INSERT INTO table [“(”list-of-columns“)”] VALUES “(“ list-of-expression “)” ! Example: INSERT INTO Student (sid, name) VALUES (12345678, ‘Smith’)

! Updating of tuples in a table / relation ! Syntax: UPDATE table SET column“=“expression {“,”column“=“expression} [ WHERE search_condition ] ! Example: UPDATE Student SET address = ‘4711 Water Street’ WHERE sid = 123456789

! Deleting of tuples from a table / relation More details on ! Syntax: those in the SQL chapter. DELETE FROM table [ WHERE search_condition ] ! Example: DELETE FROM Student WHERE name = ‘Smith’

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-20 Relational Database

! Data Structure: A relational database is a set of relations (tables) with tuples (rows) and fields (columns) - a simple and consistent structure. ! The union of the corresponding relational schemata is the relational database schema.

! Data Manipulation: Powerful operators to manipulate the data stored in relations.

! Data Integrity: facilities to specify a variety of rules to maintain the integrity of data when it is manipulated.

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-21

Integrity Constraints

! Integrity Constraint (IC): condition that must be true for any instance of the database; e.g., domain constraints. ! ICs can be declared in the schema ! They are specified when schema is defined. ! All declared ICs are checked whenever relations are modified. ! A legal instance of a relation is one that satisfies all specified ICs. ! If ICs are declared, DBMS will not allow illegal instances. ! If the DBMS checks ICs, stored data is more faithful to real-world meaning. ! Avoids data entry errors, too!

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-22 Non-Null Columns

! One domain constraint is to insist that no value in a given column can be null ! The value can’t be unknown; The concept can’t be inapplicable

! In SQL CREATE TABLE Student ( sid INTEGER NOT NULL, name VARCHAR(20), login VARCHAR(20) NOT NULL, gender CHAR, birthdate DATE )

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-23

ICs to avoid Duplicate Rows

! In a SQL-based RDBMS, it is possible to insert a row where every attribute has the same value as an existing row ! The table will then contain two identical rows ! This isn’t possible for a mathematical relation, which is a set of n-tuples

lname fname salary birth hired

Jones Peter 35000 1970 1998

Smith Susan 75000 1983 2006 Identical rows Smith Alan 35000 1975 2000

Jones Peter 35000 1970 1998

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-24 Duplicate Rows are bad

! Duplicating information ! waste of storage ! Huge danger of inconsistencies if we miss duplicates during updates

! Need for a mechanism to avoid duplicated data…

=> Key Constraints

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-25

Relational Keys

! Primary keys are unique, minimal identifiers in a relation. ! Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are unique. ! There may be several candidate keys to choose from ! If we just say key, we typically mean

! Foreign keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation (on the one side of the relationship) ! Must refer to a candidate key of the parent relation ! Like a `logical pointer

! Keys can be simple (single attribute) or composite (multiple attributes) ! Keys usually are used as indices to speed up the response to user queries (more on this later in the semester)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-26 Example: Relational Keys

Primary key identifies each tuple of a relation. Composite Primary Key consisting of more than one attribute.

Student Enroll Units_of_study sid name sid ucode semester ucode title credit_pts

31013 John 31013 I2005I2120 2005S1 I2120 DB Intro 4

Foreign key is a (set of) attribute(s) in one relation that `refers’ to a tuple in another relation (like a `logical pointer’).

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-27

Relational Keys in more Detail

! A set of fields is a key for a relation if : ! 1. No two distinct tuples can have same values in all key fields, and ! 2. This is not true for any subset of the key. ! Part 2 false? A . ! If theres >1 key for a relation, we call them each a candidate key, and one of the keys is chosen (by DBA) to be the primary key.

! E.g., sid is a key for Student. ! What about name? ! And the set {sid, gender}? This is a superkey.

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-28 Summary of Key Constraints in SQL

! Primary keys and foreign keys can be specified as part of the SQL CREATE TABLE statement: ! The PRIMARY KEY clause lists attributes that comprise the primary key. ! The clause lists the attributes that comprise the foreign key and the name of the relation referenced by the foreign key. ! The UNIQUE clause lists attributes that comprise a candidate key. ! By default, a foreign key references the primary key attributes of the referenced table FOREIGN KEY (sid) REFERENCES Student ! Reference columns in the referenced table can be explicitly specified ! but must be declared as primary or candidate keys FOREIGN KEY (lecturer) REFERENCES Lecturer(empid)

! Tip: Name them using CONSTRAINT clauses CONSTRAINT Student_PK PRIMARY KEY (sid)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-29

Create Table Example with PKs/FKs

Student Enrolled Unit_of_Study sid name sid ucode semester ucode title credit_pts

CREATE TABLE Student ( sid INTEGER, … , CONSTRAINT Student_PK PRIMARY KEY (sid) ); CREATE TABLE UoS ( ucode CHAR(8), … , CONSTRAINT UoS_PK PRIMARY KEY (ucode) ); CREATE TABLE Enrolled ( sid INTEGER, ucode CHAR(8), semester VARCHAR, CONSTRAINT Enrolled_FK1 FOREIGN KEY (sid) REFERENCES Student, CONSTRAINT Enrolled_FK2 FOREIGN KEY (ucode) REFERENCES UoS, CONSTRAINT Enrolled_PK PRIMARY KEY (sid,ucode) );

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-30 Naming Constraints

Tip: give a name to constraints (this makes error messages clearer)

CREATE TABLE Student ( sid INTEGER, … , CONSTRAINT Student_PK PRIMARY KEY (sid) ); CREATE TABLE UoS ( cid CHAR(8), … , CONSTRAINT UoS_PK PRIMARY KEY (cid) ); CREATE TABLE Enrolled ( sid INTEGER, cid CHAR(8), CONSTRAINT Enroll_FK1 FOREIGN KEY (sid) REFERENCES Student, CONSTRAINT Enroll_FK2 FOREIGN KEY (cid) REFERENCES UoS, CONSTRAINT Enroll_PK PRIMARY KEY (sid,cid) );

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-31

Choosing the Correct Key Constraints

! Careful: Used carelessly, an IC can prevent the storage of database instances that arise in practice! ! Example: vs. CREATE TABLE Enrolled ( CREATE TABLE Enrolled ( sid INTEGER, sid INTEGER, cid CHAR(8), cid CHAR(8), grade CHAR(2), grade CHAR(2), PRIMARY KEY (sid), PRIMARY KEY (sid,cid) ) UNIQUE (cid, grade) )

! For a given student and course, ! Students can take only one course there is a single grade. and receive a single grade for that course; further, no two students in a course receive the same grade.

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-32 Keys and NULLs

! PRIMARY KEY ! Must be unique and do not allow NULL values ! UNIQUE (candidate key) ! Possibly many candidate keys (specified using UNIQUE) ! According to the ANSI standards SQL:92, SQL:1999, and SQL:2003, a UNIQUE constraint should disallow duplicate non-NULL values, but allow multiple NULL values. ! Many DBMS (e.g. Oracle or SQL Server) implement only a crippled version of this, allowing a single NULL but disallowing multiple NULL values…. ! FOREIGN KEY ! By default allows nulls ! If there must be a parent tuple, then must combine with NOT NULL constraint

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-33

Foreign Keys &

! Referential Integrity: for each tuple in the referring relation whose foreign key value is α, there must be a tuple in the referred relation with a candidate key that also has value α ! e.g. sid is a foreign key referring to Student: Enrolled(sid: integer, ucode: string, semester: string) ! If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references

Q: Can you name a data model w/o referential integrity?

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-34 Enforcing Referential Integrity

! Consider Student and Enrolled; sid in Enrolled is a foreign key that references Student.

! What should be done if an Enrolled tuple with a non-existent student sid is inserted? (Reject it!)

! What should be done if a Student tuple is deleted? Choices: ! Also delete all Enrolled tuples that refer to it. ! Disallow deletion of a Student tuple that is referred to. ! Set sid in Enrolled tuples that refer to it to a default sid. ! (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown or `inapplicable.)

! Similar if primary key of Student tuple is updated.

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-35

Referential Integrity in SQL

! SQL/92, SQL:1999 and SQL: CREATE TABLE UnitOfStudy 2003 support all 4 options on ( uosCode CHAR(8), deletes and updates. title VARCHAR(80), ! Default is NO ACTION (delete/ update is rejected) credits INTEGER, ! CASCADE (also delete all tuples taughtBy INTEGER DEFAULT 1, that refer to deleted tuple) PRIMARY KEY (uosCode), ! SET NULL / SET DEFAULT (sets FOREIGN KEY (taughtBy) foreign key value of referencing REFERENCES Professor tuple) ON UPDATE CASCADE ON DELETE SET DEFAULT ) UnitOfStudy uosCode title credits taughtBy

Professor empid name

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-36 This Weeks Agenda

! Introduction

! Creating Relational Database Schemas using SQL

! Mapping E-R Models to Relations

! Levels of Abstractions / Views

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-37

Correspondence with E-R Model

! Relations (tables) correspond to entity types (entity sets) and to many-to-many relationship types/sets.

! Rows correspond with entities and with many-to-many relationships.

! Columns correspond with attributes or one-to-many relationships.

! Note: The word relation (in relational database) is NOT the same as the word relationship (in E-R model).

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-38 Mapping E-R Diagrams into Relations

! Mapping rules for ! Strong Entities ! Weak Entities ! Relationships ! One-to-many, Many-to-many, One-to-one ! Unary Relationships ! Ternary Relationships ! ISA Hierarchies

! We will concentrate in the lecture on typical examples…

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-39

1. Mapping Regular Entities to Relations

! Each entity type becomes a relation ! Simple attributes E-R attributes map directly onto the relation ! Composite attributes Composite attributes are flattened out by creating a separate field for each component attribute => We use only their simple, component attributes ! Multi-valued attribute Becomes a separate relation with a foreign key taken from the superior entity

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-40 Example: Mapping Entity Types

! Employee entity type with composite/multi-valued attributes

street empId name city address Employee skills state

zipcode

Employee Skills empId name street city state zipcode empId skill

PK-/FK reference between Employee table and table for multi-valued Skills attribute. INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-41

2. Mapping of Weak Entity Types

! Weak Entity Types ! become a separate relation with a foreign key taken from the superior entity ! primary key composed of: ! Partial key (discriminator) of weak entity ! Primary key of identifying relation (strong entity) ! Mapping of attributes of weak entity as shown before

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-42 Example: Mapping of Weak Entity

! Weak entity set Dependent with composite partial key

given empIid name family Employee Policy Dependent name birthdate

NOTE: the domain constraint for the foreign Foreign key key of a weak entity should not allow NULL Employees Dependent empId name empId given family birthdate

Composite primary key

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-43

3. Mapping of Relationship Types

! Many-to-Many - Create a new relation with the primary keys of the two entity types as its primary key ! One-to-Many - Primary key on the one side becomes a foreign key on the many side ! Participation Constraint: total side becomes NOT NULL ! One-to-One - Primary key on the mandatory side becomes a foreign key on the optional side

! Relationship attributes - become fields of either the dependent, respectively new relation

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-44 Example: Mapping of Many-to-Many Relationship Types ! Many-to-many relationship between Student & UnitOfStudy

sid ucode

name Student enroll UnitOfStudy title birthdate country credit_ semester grade points

Student Enroll UnitOfStudy

sid name bdate cntry sid ucode semester grade ucode title credit_pts

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-45

Example: Mapping of One-to-Many Relationships ! Key Constraint: One-to-many relationship type

did ssn Department WorksIn Employee dname name since

Department Employee

did dname ssn name did since

! Participation Constraint: NOT NULL on foreign key

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-46 Further Examples

manager wife husband Employee manages Person Married_to

Employee Person

ssn name manager name married_to

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-47

4. Mapping of ISA-Hierarchies

! Standard way (works always): ! Distinct relations for the superclass and for each subclass ! Superclass attributes (including key and possible relationships) go into superclass relation ! Subclass attributes go into each sub-relation; primary key of superclass relation also becomes primary key of subclass relation ! 1:1 relationship established between superclass and each subclass, with superclass as primary table ! I.e. primary keys of subclass relations become also a foreign key referencing the superclass relation ! Special case in presence of a total covering constraint: ! Distinct relations for each subclass with primary key of superclass ! All superclass attributes are included in each subclass relation ! No foreign-keys needed

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-48 Example: Mapping of ISA-Hierarchy

! Table for superclass name street city ! Primary key + common attributes ! Separate tables for subclasses Person ! Primary key of superclass (= foreign key) + specific attributes ISA

salary Credit_rating

Employee Customer

Person

name street city

Employee Customer name salary name credit_rating

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-49

This Weeks Agenda

! Introduction

! Creating Relational Database Schemas using SQL

! Mapping E-R Models to Relations

! Levels of Abstraction / Views

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-50 Levels of Abstraction

! It is convenient to organise data View 1 View 2 View n at different levels of abstraction. ! External Schemas (views) describe how users see the data Logical Schema ! The Logical Schema (also: Conceptual Schema) defines logical structure ! The Physical Schema describes Physical Schema the files and indexes used

! Schemas are defined using DDL

! Data is modified and queried using DB DML

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-51

Example: University Database

! Logical schema: ! Student(sid:integer, name:string, login:string, bdate: date, sex:char) ! UoS(ucode:string, cname:string, credits: integer) ! Enrolled(sid:integer, ucode :string, semester: integer, grade:string)

! Physical Schema: ! describes details of how data is stored: ! E.g. relations stored as unordered files ! Index on first column of Students ! non-database applications typically have only on a physical schema

! External Schema (view) ! UoS_info(ucode :string, enrolment: integer)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-52 Data Independence

! Applications insulated from how is structured and stored

! Logical data independence: Protection from changes in logical structure of data ! by relying on external schemas using views

! Physical data independence: Protection from changes in physical structure of data ! Physical schema can be changed without changing the application ! The DBMS maps from conceptual to physical schema automatically and transparently

" One of the most important benefits of using a DBMS!

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-53

Relational Views

! A view is a virtual relation, but we store a definition, rather than a set of tuples. ! The contents of a view is computed when it is used within an SQL statement ! Mechanism to hide data from the view of certain users. ! Views can be used to present necessary information (or a summary), while hiding details in underlying relation(s). ! Users can operate on the view as if it were a stored table ! DBMS calculates the value whenever it is needed

! Syntax: CREATE VIEW name AS ! where ! is any legal query expression (can even combine multiple relations)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-54 View Examples

! A view on the students showing their age. CREATE VIEW ageStudents AS SELECT sid, name, extract(year from sysdate) - extract(year from birthdate) AS age FROM Student

! A view on the female students enrolled in 2009sem1 CREATE VIEW FemaleStudents2009 (name, grade) AS SELECT S.name, E.grade FROM Student S, Enrolled E WHERE S.sid = E.sid AND S.gender = FAND E.semester = 2012sem1

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-55

Updates on Views

! Create a view of the enrolled relation, hiding the grade attribute CREATE VIEW Enrolled_Students AS SELECT sid AS student, uos_code FROM Enrolled

! Add a new tuple to view enrolled_students INSERT INTO Enrolled_Students VALUES (200421567,‘INFO2120’)

This insertion means the insertion of the tuple (200421567, ‘INFO2120’, null, null) into the enrolled relation

! Updates on more complex views are difficult or impossible to translate, and hence are disallowed. ! Updatable Views: SQL-92 allows updates only on simple views (SELECT- FROM-WHERE without aggregates or distinct) defined on a single relation ! SQL:1999 allows more, but most implementations dont support this yet

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-56 Updating Views

! Question: Since views look like tables to users, can they be updated? ! Answer: Yes – a view update changes the underlying base table to produce the requested change to the view

CREATE VIEW CsReg (StudId, CrsCode, Semester) AS SELECT T.StudId, T. CrsCode, T.Semester FROM Transcript T WHERE T.CrsCode LIKE ‘CS%’ AND T.Semester=‘S2000’

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-57

Updating Views - Problem 1

INSERT INTO CsReg (StudId, CrsCode, Semester) VALUES (1111, ‘CSE305’, ‘S2000’)

! Question: What value should be placed in attributes of underlying table that have been projected out (e.g., Grade)? ! Answer: NULL (assuming null allowed in the missing attribute) or DEFAULT

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-58 Updating Views - Problem 2

INSERT INTO CsReg (StudId, CrsCode, Semester) VALUES (1111, ‘ECO105’, ‘S2000’)

! Problem: New tuple will not be visible in view ! Solution: Allow insertion (assuming the WITH CHECK OPTION clause has not been appended to the CREATE VIEW statement)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-59

Updating Views - Problem 3

! Update to a view might not uniquely specify the change to the base table(s) that results in the desired modification of the view (ambiguity)

! Example: CREATE VIEW ProfDept (PrName, DeName) AS SELECT P.Name, D.Name FROM Professor P, Department D WHERE P.DeptId = D.DeptId

! Can we do the following? DELETE FROM ProfDept WHERE PrName=‘Smith’ AND DeName=‘CS’

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-60 Updating Views - Problem 3 (cont’d)

! Tuple can be deleted from ProfDept by: ! Deleting row for Smith from Professor (but this is inappropriate if he is still at the University) ! Deleting row for CS from Department (not what is intended) ! Updating row for Smith in Professor by setting DeptId to null (seems like a good idea, but how would the computer know?)

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-61

You should now be able to:

! The Relational Model ! Design a relational schema for a simple use case ! Map an Entity-Relationship diagram to a relational schema ! Identify candidate and primary keys for a relational schema ! Explain the basic rules and restrictions of the relational data model ! Explain the difference between candidate, primary and foreign keys ! Create and modify a relational database schema using SQL ! including domain types, NULL constraints and PKs/FKs

! Levels of Abstraction ! Explain the different levels of database abstractions, and know what role those play for physical and logical program-data independence ! Define an external schema using SQL views

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-62 References

! Kifer/Bernstein/Lewis (2nd edition) ! Chapter 3 ! Ramakrishnan/Gehrke (3rd edition - the ‘Cow’ book) ! Chapter 3.1-3.4 and 3.6-3.7, plus Chapter 1.5 ! Our running example is from this book; it starts with conceptual E-R modelling, which we do later ! Ullman/Widom (3rd edition) ! Chapter 2.1 - 2.3, Section 7.1 and Chapter 8.1-8.2 ! views and foreign keys come later, instead relational algebra is introduced very early on; also briefly compares RDM with XML, ! Silberschatz/Korth/Sudarshan (5th edition - ‘sailing boat’) ! Chapter 2.1 - 2.2; Chapter 3.1-3.2 and 3.9 ! starts with relational algebra early on which we do next week ! Elmasri/Navathe (5th edition) ! Chapter 2.1-2.3; Chapter 5; Section 8.8 ! talks first more about system architectures and conceptual modeling

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-63

Next Week

! The Structured Query Language ! Relational Algebra ! Foundations of Declarative Querying ! Relational Algebra - a formal query language for the relational data model

! Readings: ! Kifer/Bernstein/Lewis book, Chapter 5 ! Or alternatively (if you prefer those books): ! Ramakrishnan/Gehrke, Chapter 5 & Section 4.2 ! Ullman/Widom, Chapter 6

INFO2120/2820 & COMP5138 "Database Systems I" - 2012 (U. Röhm) 03-64 Administrativa

! Tutorials and Weekly Homework run as usual ! In addition: SQL Challenge starting this week too ! Deadline for both Hwk and Challenge: Monday next week, end-of-day ! Electronic submission in ! SQL Challenge itself or ! eLearning (menu item ‘Homework’, then ‘Week 3: Relational Model’)

! Practical Assignment, Part 1, will start this week too ! To be notified by email and announced on Piazza forums ! Important: Form teams of 2-3 students

! Check regularly our Piazza forums for tips and hints!

INFO2120/2820 & COMP5138 "Database Systems I" - 2013 (U. Röhm) 03-65