Management Systems

Chapter 3 Part 1

The

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

Why Study the Relational Model?

 Most widely used model. . Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.  Simple data representation and it can express complex queries.  Recent competitor: object-oriented model . ObjectStore, Versant, Ontos . A synthesis emerging: object-relational model • Informix Universal Server, UniSQL, O2, Oracle, DB2

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 2 : Definitions

 Relational database: a set of relations with distinct names.  Relation: made up of 2 parts: . Instance : a table, with rows and columns. #Rows = cardinality, #fields = degree. . Schema : specifies name of relation, plus name and domain/type of each column. • E.G. Students(sid: string, name: string, login: string, age: integer, gpa: real).  Can think of a relation as a set of rows or tuples (i.e., all rows are distinct).

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 3

Example Instance of Students Relation

sid name login dob gpa 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8

 Cardinality = 3, degree = 5, all rows distinct  The order of fields does not matter if the fields are named.  A relation is a set of rows, so the order of rows is not important. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 4 Relational Query Languages

 A major strength of the relational model: supports simple, powerful querying of data.  Queries can be written intuitively, and the DBMS is responsible for efficient evaluation. . The key: precise semantics for relational queries. . Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 5

The SQL

 Developed by IBM (system R) in the 1970s  Need for a standard since it is used by many vendors  Standards: . SQL-86 . SQL-89 (minor revision) . SQL-92 (major revision) . SQL-99 (major extensions, current standard)

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 6 Creating Relations in SQL

 Creates the Students relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified.

CREATE TABLE Students string type with (sid CHAR(20), different length name CHAR(30), login CHAR(20), DOB DATE, gpa REAL);

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 7

Creating Relations in SQL

 As another example, the Course table holds information about courses that students take.

CREATE TABLE Course (cid CHAR(20), cname CHAR (20), description CHAR( 100) DEFAULT ‘Course Description’ );

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 8 Creating Relations in SQL

 As another example, the Enrolled table holds information about courses that students take.

CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(10));

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 9

Adding Tuples

 We can insert a single tuple into the Students table as follows:

INSERT INTO Students (sid, name, login, DOB, gpa) VALUES (‘53666’, ‘Jones’, ‘jones@cs’, ‘2000-09-01’, 3.4);

sid name login DOB gpa 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 10 Deleting Tuples

 We can delete all tuples satisfying some condition (e.g., name = Smith):

DELETE S FROM Students S WHERE S.name = ‘Smith’;

sid name login dob gpa 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8 Powerful variants of these commands are available; more later!

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 11

Updating Tuples

 We can modify the column values in an existing row: Old value UPDATE Students S SET S.gpa = S.gpa -1 WHERE S.sid = 53688;

sid name login Dob gpa 2.2 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 12 Updating Tuples

 Where clause is applied first and determines which rows to be modified.

UPDATE Students S SET S.gpa = S.gpa - 0.1 WHERE S.gpa >= 3.3;

sid nam e login age gpa 3.3 53666 Jones jones@cs 18 3.4 53688 Sm ith sm ith@ eecs 18 3.2 53650 Sm ith sm ith@ m ath 19 3.8 3.7

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 13

Integrity Constraints (ICs)

 An IC is a condition specified on a database schema and restricts the data that can be stored in an instance of the database, such as domain constraints. . ICs are specified when schema is defined. . ICs are checked when relations are modified.  A legal instance of a relation is one that satisfies all specified ICs. . DBMS should not allow illegal instances.  If the DBMS checks ICs, stored data is more faithful to real-world meaning. . Avoids data entry errors, too!

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 14 Primary Key Constraints

 A set of fields is a candidate key for a relation if : 1. No two distinct tuples can have same values in all key fields, and 2. No subset of the set of fields in a key is a unique identifier for a tuple. . Part 2 false? A superkey (a set of fields containing a key). . If there’s >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key.  E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey.  Every relation is guaranteed to have a key.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 15

Primary and Candidate Keys in SQL

 UNIQUE: Candidate keys (you could have many)  PRIMARY KEY: one of candidate keys is chosen as the primary key.

CREATE TABLE Students (sid CHAR(20), name CHAR(30), login CHAR(20), age INTEGER, gpa REAL, UNIQUE (name, age), PRIMARY KEY (sid));

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 16 Name a constraint

 We could name a constraint in a table. If the constraint is violated, the constraint name is returned and can be used to identify the error. . CONSTRAINT constraint-name CREATE TABLE Students (sid CHAR(20), name CHAR(30), login CHAR(20), dob DATE, gpa REAL, UNIQUE (name, dob), CONSTRAINT StudentsKey PRIMARY KEY (sid)); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 17

Primary and Candidate Keys in SQL

 “For a given student and course, there is a single grade.” vs. CREATE TABLE Enrolled “Students can take only one (sid CHAR(20) course, and receive a single grade cid CHAR(20), for that course; further, no two grade CHAR(2), students in a course receive the PRIMARY KEY (sid, cid) ); same grade.” CREATE TABLE Enrolled  Used carelessly, an IC can prevent (sid CHAR(20) the storage of database instances cid CHAR(20), that arise in practice! grade CHAR(2), PRIMARY KEY (sid), UNIQUE (cid, grade) ); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 18 Foreign Keys, Referential Integrity

 Foreign key : Set of fields in one relation that is used to `refer’ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer’.  E.g. sid is a foreign key referring to Students: . Enrolled(sid: string, cid: string, grade: string) . If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 19

Foreign Keys in SQL

 Only students listed in the Students relation should be allowed to enroll for courses. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid, cid), FOREIGN KEY (sid) REFERENCES Students(sid) ); Enrolled sid cid grade Students 53666 Carnatic101 C sid name login age gpa 53666 Reggae203 B 53666 Jones jones@cs 18 3.4 53650 Topology112 A 53688 Smith smith@eecs 18 3.2 53666 History105 B 53650 Smith smith@math 19 3.8

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 20 Enforcing Integrity Constraints

 ICs are specified when a relation is created and enforced when a relation is modified.  Violating the primary key constraint, the transaction will be rejected. . Inserting a tuple with an existing sid. INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, ‘Mike’, ‘mike@ee’, 17, 3.4); . Inserting a tuple with primary key as null. INSERT INTO Students (sid, name, login, age, gpa) VALUES (null, ‘Mike’, ‘mike@ee’, 17, 3.4); . Updating a tuple with an existing sid. UPDATE Students S SET S.sid = 50000 WHERE S.sid=53688;

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 21

Enforcing Integrity Constraints

 Deletion of Enrolled tuples do not violate referential integrity, but insertions could. . Inserting a tuple with an un-exist sid in Students.

INSERT INTO Enrolled (sid, cid, grade) VALUES (51111, ‘Hindi101’, ‘B’);

 Insertion of Students tuples do not violate referential integrity, but deletions could.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 22 Ways to handle foreign key violations

 If an Enrolled row with un-existing sid is inserted, it is rejected.  If a Students row is deleted/updated, . Option 1: Delete/Update all Enrolled rows that refer to the deleted sid in Students (CASCADE). Both are affected . Option 2: Reject the deletion/updating of the Students row if an Enrolled row refers to it (NO ACTION ). [The default action for SQL]. None is affected. . Option 3: Set the sid of Enrolled to some existing (default) sid value in Students for every involved Enrolled row (SET NULL / SET DEFAULT ). Both are affected.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 23

Referential Integrity in SQL

 When a Students row CREATE TABLE Enrolled is deleted, all (sid CHAR(20), Enrolled cid CHAR(20), rows that refer to it are grade CHAR(2), to be deleted as well. PRIMARY KEY (sid,cid), FOREIGN KEY (sid)  When a Students sid is REFERENCES Students (sid) modified, the update is ON DELETE CASCADE ON UPDATE NO ACTION ); to be rejected if an Enrolled row refers to the modified Students row.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 24 The SQL Query Language

 A relational database query is a question about the data, and the answer consists of a new relation containing the result.  To find information about all 17 years old students : SELECT * sid name login Dob gpa FROM Students S 53666 Jones jones@cs 2000-09-01 3.4 WHERE year(S.dob)=2000; 53650 Smith smith@math 2000-10-10 3.8  To find just names and logins:

SELECT S.name, S.login name login FROM Students S Jones jones@cs WHERE year(S.dob)=2000; Smith smith@ee

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 25

Destroying and Altering Relations

 Destroys the relation Students. The schema information and the tuples are deleted.

DROP TABLE Students;

 The schema of Students is altered by adding a new field; every tuple in the current instance is extended with a null value in the new field.

ALTER TABLE Students ADD COLUMN firstYear: integer;

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 26 Views

 A view is just a relation, but we store a definition, rather than a set of tuples. CREATE VIEW YoungActiveStudents (name, grade) AS SELECT S.name, E.grade FROM Students S, Enrolled E WHERE S.sid = E.sid and S.age<21;  Views can be dropped using the DROP VIEW command. . How to handle DROP TABLE if there’s a view on the table? • View becomes invalid if its base table is dropped.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 27

Views and Security

 Views can be used to present necessary information (or a summary), while hiding details in underlying relation(s). . Given YoungStudents, but not Students or Enrolled, we can find students s who have are enrolled, but not the cid’s of the courses they are enrolled in.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 28