The Relational Model

Why Study the Relational Model?

❖ Most widely used model.
  – Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
❖ "Legacy systems" in older models
  – E.g., IBM's IMS
❖ Recent competitor: object-oriented model
  – ObjectStore, Versant, Ontos, O2
  – A synthesis emerging: object-relational model
    ◆ Informix Universal Server, UniSQL, Oracle, DB2


Relational Database: Definitions

❖ Relational database: a set of relations
❖ Relation: made up of 2 parts:
  – Instance: a table, with rows and columns. #Rows = cardinality, #fields = degree / arity.
  – Schema: specifies name of relation, plus name and type of each column.
    ◆ E.g., Students(sid: string, name: string, login: string, age: integer, gpa: real)
❖ Can think of a relation as a set of rows or tuples (i.e., all rows are distinct).

Example Instance of Students

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

❖ Cardinality = 3, degree = 5, all rows distinct
❖ Do all columns in a relation instance have to be distinct?


Creating Relations in SQL

❖ Creates the Students relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified.

    CREATE TABLE Students
        (sid CHAR(20),
         name CHAR(20),
         login CHAR(10),
         age INTEGER,
         gpa REAL)

❖ As another example, the Enrolled table holds information about courses that students take.

    CREATE TABLE Enrolled
        (sid CHAR(20),
         cid CHAR(20),
         grade CHAR(2))

Integrity Constraints (ICs)

❖ IC: condition that must be true for any instance of the database; e.g., domain constraints.
  – ICs are specified when schema is defined.
  – ICs are checked when relations are modified.
❖ A legal instance of a relation is one that satisfies all specified ICs.
  – DBMS should not allow illegal instances.
❖ If the DBMS checks ICs, stored data is more faithful to real-world meaning.
  – Avoids many data entry errors, too!

Primary Key Constraints

❖ A set of fields is a superkey for a relation if:
  – No two distinct tuples have the same values in all fields of the superkey
❖ A superkey is a (candidate) key if:
  – No proper subset of it is a superkey
❖ If there's >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key.
❖ E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey.

Primary and Candidate Keys in SQL

❖ Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key.
❖ "For a given student and course, there is a single grade."

    CREATE TABLE Enrolled
        (sid CHAR(20),
         cid CHAR(20),
         grade CHAR(2),
         PRIMARY KEY (sid, cid))

  vs. "Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade."

    CREATE TABLE Enrolled
        (sid CHAR(20),
         cid CHAR(20),
         grade CHAR(2),
         PRIMARY KEY (sid),
         UNIQUE (cid, grade))

❖ Used carelessly, an IC can prevent the storage of database instances that arise in practice!
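The difference between the two Enrolled declarations can be tried out directly. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the helper name count_accepted is hypothetical, and the rows are the Enrolled instance used on the foreign key slides. It counts how many of those rows each schema accepts.

```python
import sqlite3

# The two alternative Enrolled schemas from the slide.
KEY_SID_CID = """
CREATE TABLE Enrolled (
    sid   CHAR(20),
    cid   CHAR(20),
    grade CHAR(2),
    PRIMARY KEY (sid, cid))
"""
KEY_SID_UNIQUE_CID_GRADE = """
CREATE TABLE Enrolled (
    sid   CHAR(20),
    cid   CHAR(20),
    grade CHAR(2),
    PRIMARY KEY (sid),
    UNIQUE (cid, grade))
"""

# Enrolled instance taken from the foreign key slides.
ROWS = [
    ("53666", "Carnatic101", "C"),
    ("53666", "Reggae203", "B"),
    ("53650", "Topology112", "A"),
    ("53666", "History105", "B"),
]


def count_accepted(schema_sql, rows):
    """Insert rows one at a time and return how many the DBMS accepts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    accepted = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", row)
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # row violates a key constraint and is rejected
    conn.close()
    return accepted


accepted_sid_cid = count_accepted(KEY_SID_CID, ROWS)
accepted_sid_only = count_accepted(KEY_SID_UNIQUE_CID_GRADE, ROWS)
print(accepted_sid_cid, accepted_sid_only)
```

Under the (sid, cid) key every row is stored. Under the stricter declaration, the second and fourth rows are rejected because student 53666 already has a row, which is the "used carelessly" case the slide warns about.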

Foreign Keys, Referential Integrity

❖ Foreign key: Set of fields in one relation that is used to 'refer' to a tuple in another (or the same) relation. (Must correspond to primary key of the second relation.) Like a 'logical pointer'.
❖ E.g., sid is a foreign key referring to Students:
  – Enrolled(sid: string, cid: string, grade: string)
  – If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.
  – Can you name a data model w/o referential integrity?
    ◆ Links in HTML!

Foreign Keys in SQL

❖ Only students listed in the Students relation should be allowed to enroll for courses.

    CREATE TABLE Enrolled
        (sid CHAR(20), cid CHAR(20), grade CHAR(2),
         PRIMARY KEY (sid, cid),
         FOREIGN KEY (sid) REFERENCES Students)

Enrolled

sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B

Students

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8


Enforcing Referential Integrity

❖ Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students.
❖ What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!)
❖ What should be done if a Students tuple is deleted?
  – Also delete all Enrolled tuples that refer to it.
  – Disallow deletion of a Students tuple that is referred to.
  – Set sid in Enrolled tuples that refer to it to a default sid.
  – (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, meaning 'unknown' or 'inapplicable'.)
❖ Similar if primary key of Students tuple is updated.

Referential Integrity in SQL/92

❖ SQL/92 supports all 4 options on deletes and updates.
  – Default is NO ACTION (delete/update is rejected)
  – CASCADE (also delete all tuples that refer to deleted tuple)
  – SET NULL / SET DEFAULT (sets foreign key value of referencing tuple)

    CREATE TABLE Enrolled
        (sid CHAR(20),
         cid CHAR(20),
         grade CHAR(2),
         PRIMARY KEY (sid, cid),
         FOREIGN KEY (sid)
             REFERENCES Students
             ON DELETE CASCADE
             ON UPDATE SET DEFAULT)
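Two of the behaviours listed above, rejecting a dangling insert and cascading a delete, can be observed with a small experiment. This is a sketch under stated assumptions: SQLite via Python's sqlite3 module stands in for the DBMS, Students is given PRIMARY KEY (sid) so that it can be referenced (the earlier CREATE TABLE Students slide does not declare one), only two Students columns are kept for brevity, and the sid '99999' is a made-up value. SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Assumption: Students declares sid as its primary key so it can be referenced.
conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20))")
conn.execute("""
    CREATE TABLE Enrolled (
        sid CHAR(20), cid CHAR(20), grade CHAR(2),
        PRIMARY KEY (sid, cid),
        FOREIGN KEY (sid) REFERENCES Students ON DELETE CASCADE)
""")
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'Carnatic101', 'C')")

# An Enrolled tuple with a non-existent student id is rejected.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('99999', 'Reggae203', 'B')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

# Deleting the student also deletes the Enrolled tuples that refer to it.
conn.execute("DELETE FROM Students WHERE sid = '53666'")
remaining = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]

print(dangling_rejected, remaining)
```

Replacing ON DELETE CASCADE with ON DELETE NO ACTION in this sketch should make the DELETE itself fail instead, since the student is still referred to.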

Where do ICs Come From?

❖ ICs are based upon the semantics of the real-world enterprise that is being described in the database relations.
❖ We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance.
  – An IC is a statement about all possible instances!
  – From the example, we know name is not a key, but the assertion that sid is a key is given to us.
❖ Key and foreign key ICs are the most common; more general ICs supported too.

Logical DB Design: ER to Relational

❖ Entity sets to tables.

[ER diagram: entity set Employees with attributes ssn, name, lot]

    CREATE TABLE Employees
        (ssn CHAR(11),
         name CHAR(20),
         lot INTEGER,
         PRIMARY KEY (ssn))
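The point made on the "Where do ICs Come From?" slide, that an instance can refute a key but never establish one, can be illustrated with a few lines of Python. This is a minimal sketch; the function name refutes_superkey is hypothetical, and the data is the Students instance from the earlier slide.

```python
# Students instance from the "Example Instance of Students" slide.
STUDENTS = [
    {"sid": "53666", "name": "Jones", "login": "jones@cs", "age": 18, "gpa": 3.4},
    {"sid": "53688", "name": "Smith", "login": "smith@eecs", "age": 18, "gpa": 3.2},
    {"sid": "53650", "name": "Smith", "login": "smith@math", "age": 19, "gpa": 3.8},
]


def refutes_superkey(instance, fields):
    """Return True if two tuples of the instance agree on all given fields.

    True means the instance proves that `fields` is NOT a superkey.
    False only means this particular instance shows no violation.
    """
    seen = set()
    for row in instance:
        values = tuple(row[f] for f in fields)
        if values in seen:
            return True
        seen.add(values)
    return False


name_refuted = refutes_superkey(STUDENTS, ["name"])  # two tuples share 'Smith'
sid_refuted = refutes_superkey(STUDENTS, ["sid"])
gpa_refuted = refutes_superkey(STUDENTS, ["gpa"])
print(name_refuted, sid_refuted, gpa_refuted)
```

The instance refutes name as a key. It refutes neither sid nor gpa, yet only sid is asserted to be a key: nothing in the data distinguishes the two, which is why the constraint has to come from the semantics of the enterprise.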

Relationship Sets to Tables

❖ In translating a relationship set to a relation, attributes of the relation must include:
  – Keys for each participating entity set (as foreign keys).
    ◆ This set of attributes forms a superkey for the relation.
  – All descriptive attributes.

    CREATE TABLE Works_In(
        ssn CHAR(11),
        did INTEGER,
        since DATE,
        PRIMARY KEY (ssn, did),
        FOREIGN KEY (ssn) REFERENCES Employees,
        FOREIGN KEY (did) REFERENCES Departments)

Translating ER Diagrams with Key Constraints

❖ Map relationship to a table:
  – Note that did is the key now!
  – Separate tables for Employees and Departments.

    CREATE TABLE Manages(
        ssn CHAR(11),
        did INTEGER,
        since DATE,
        PRIMARY KEY (did),
        FOREIGN KEY (ssn) REFERENCES Employees,
        FOREIGN KEY (did) REFERENCES Departments)

❖ Since each department has a unique manager, we could instead combine Manages and Departments.

    CREATE TABLE Dept_Mgr(
        did INTEGER,
        dname CHAR(20),
        budget REAL,
        ssn CHAR(11),
        since DATE,
        PRIMARY KEY (did),
        FOREIGN KEY (ssn) REFERENCES Employees)


Participation Constraints in SQL

❖ We can capture participation constraints involving one entity set in a binary relationship, but little else (without resorting to CHECK constraints).

    CREATE TABLE Dept_Mgr(
        did INTEGER,
        dname CHAR(20),
        budget REAL,
        ssn CHAR(11) NOT NULL,
        since DATE,
        PRIMARY KEY (did),
        FOREIGN KEY (ssn) REFERENCES Employees
            ON DELETE NO ACTION)

Translating Weak Entity Sets

❖ Weak entity set and identifying relationship set are translated into a single table.
  – When the owner entity is deleted, all owned weak entities must also be deleted.

    CREATE TABLE Dep_Policy (
        pname CHAR(20),
        age INTEGER,
        cost REAL,
        ssn CHAR(11) NOT NULL,
        PRIMARY KEY (pname, ssn),
        FOREIGN KEY (ssn) REFERENCES Employees
            ON DELETE CASCADE)
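The two tables above behave differently when an employee is removed. The sketch below assumes SQLite via Python's sqlite3 module, with foreign key checking switched on. It shows NOT NULL forcing every department to have a manager, NO ACTION blocking the deletion of a manager, and CASCADE removing the dependents of a deleted employee. The employee, department, and dependent values are invented for illustration, and is_rejected is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute(
    "CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, "
    "PRIMARY KEY (ssn))"
)
conn.execute("""
    CREATE TABLE Dept_Mgr (
        did INTEGER, dname CHAR(20), budget REAL,
        ssn CHAR(11) NOT NULL, since DATE,
        PRIMARY KEY (did),
        FOREIGN KEY (ssn) REFERENCES Employees ON DELETE NO ACTION)
""")
conn.execute("""
    CREATE TABLE Dep_Policy (
        pname CHAR(20), age INTEGER, cost REAL,
        ssn CHAR(11) NOT NULL,
        PRIMARY KEY (pname, ssn),
        FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE)
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Employees VALUES ('111-11-1111', 'Ann', 12)")
conn.execute("INSERT INTO Employees VALUES ('222-22-2222', 'Bob', 7)")
conn.execute(
    "INSERT INTO Dept_Mgr VALUES (1, 'Toys', 1000.0, '111-11-1111', '2000-01-01')"
)
conn.execute("INSERT INTO Dep_Policy VALUES ('Joe', 5, 50.0, '222-22-2222')")


def is_rejected(sql):
    """Run one statement; return True if an integrity constraint rejects it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Total participation: a department without a manager violates NOT NULL.
no_manager_rejected = is_rejected(
    "INSERT INTO Dept_Mgr VALUES (2, 'Books', 500.0, NULL, NULL)"
)
# NO ACTION: an employee who still manages a department cannot be deleted.
manager_delete_rejected = is_rejected(
    "DELETE FROM Employees WHERE ssn = '111-11-1111'"
)
# CASCADE: deleting the owner also deletes the owned weak entities.
conn.execute("DELETE FROM Employees WHERE ssn = '222-22-2222'")
policies_left = conn.execute("SELECT COUNT(*) FROM Dep_Policy").fetchone()[0]

print(no_manager_rejected, manager_delete_rejected, policies_left)
```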

Relational Query Languages

❖ A major strength of the relational model: supports simple, powerful querying of data.
❖ Queries can be written intuitively, and the DBMS is responsible for efficient evaluation.
  – The key: precise semantics for relational queries.
  – Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.

The SQL Query Language

❖ Developed by IBM (System R) in the 1970s
❖ Need for a standard since it is used by many vendors
❖ Standards:
  – SQL-86
  – SQL-89 (minor revision)
  – SQL-92 (major revision, current standard)
  – SQL-99 (major extensions)


The SQL Query Language

❖ To find all 18 year old students, we can write:

    SELECT *
    FROM Students
    WHERE age=18

sid    name   login     age  gpa
53666  Jones  jones@cs  18   3.4
53688  Smith  smith@ee  18   3.2

❖ To find just names and logins, replace the first line:

    SELECT name, login

Querying Multiple Relations

❖ What does the following query compute?

    SELECT S.name, E.cid
    FROM Students S, Enrolled E
    WHERE S.sid=E.sid AND E.grade='A'

Given the following instance of Enrolled (is this possible if the DBMS ensures referential integrity?):

sid    cid          grade
53831  Carnatic101  C
53831  Reggae203    B
53650  Topology112  A
53666  History105   B

we get:

S.name  E.cid
Smith   Topology112
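Both queries can be checked against the slide's data. The following is a sketch, assuming SQLite via Python's sqlite3 module and the three-row Students instance from the earlier slide (which spells the login smith@eecs). No foreign key is declared here, which is what allows the Enrolled rows for the unknown student 53831 to be stored at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid CHAR(20), name CHAR(20), login CHAR(10), "
    "age INTEGER, gpa REAL)"
)
conn.execute("CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))")

conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("53666", "Jones", "jones@cs", 18, 3.4),
        ("53688", "Smith", "smith@eecs", 18, 3.2),
        ("53650", "Smith", "smith@math", 19, 3.8),
    ],
)
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [
        ("53831", "Carnatic101", "C"),
        ("53831", "Reggae203", "B"),
        ("53650", "Topology112", "A"),
        ("53666", "History105", "B"),
    ],
)

# Names and logins of all 18 year old students (ordered only to make the
# result deterministic; the slide's query has no ORDER BY).
eighteen = conn.execute(
    "SELECT name, login FROM Students WHERE age = 18 ORDER BY sid"
).fetchall()

# Names of students and the courses in which they received an A.
joined = conn.execute(
    "SELECT S.name, E.cid FROM Students S, Enrolled E "
    "WHERE S.sid = E.sid AND E.grade = 'A'"
).fetchall()

print(eighteen)
print(joined)
```

The join returns the single pair (Smith, Topology112), matching the slide. The rows for 53831 never contribute because no Students tuple has that sid.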


Adding and Deleting Tuples

❖ Can insert a single tuple using:

    INSERT INTO Students (sid, name, login, age, gpa)
    VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)

❖ Can delete all tuples satisfying some condition (e.g., name = Smith):

    DELETE FROM Students
    WHERE name = 'Smith'

☛ Powerful variants of these commands are available; more later!

Destroying and Altering Relations

    DROP TABLE Students

❖ Destroys the relation Students. The schema information and the tuples are deleted.

    ALTER TABLE Students
        ADD COLUMN firstYear INTEGER

❖ The schema of Students is altered by adding a new field; every tuple in the current instance is extended with a null in the new field.

Relational Model: Summary

❖ A tabular representation of data.
❖ Simple and intuitive, currently the most widely used.
❖ Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations.
  – Two important ICs: primary and foreign keys
  – In addition, we always have domain constraints.
❖ Guidelines to translate ER to relational model
❖ Powerful and natural query languages exist.
