Database Management Systems Chapter 3 Part 1 Why Study The
Total Page:16
File Type:pdf, Size:1020Kb
Database Management Systems Chapter 3 Part 1 The Relational Model Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. Simple data representation and it can express complex queries. Recent competitor: object-oriented model . ObjectStore, Versant, Ontos . A synthesis emerging: object-relational model • Informix Universal Server, UniSQL, O2, Oracle, DB2 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 2 Relational Database: Definitions Relational database: a set of relations with distinct names. Relation: made up of 2 parts: . Instance : a table, with rows and columns. #Rows = cardinality, #fields = degree. Schema : specifies name of relation, plus name and domain/type of each column. • E.G. Students(sid: string, name: string, login: string, age: integer, gpa: real). Can think of a relation as a set of rows or tuples (i.e., all rows are distinct). Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 3 Example Instance of Students Relation sid name login dob gpa 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8 Cardinality = 3, degree = 5, all rows distinct The order of fields does not matter if the fields are named. A relation is a set of rows, so the order of rows is not important. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 4 Relational Query Languages A major strength of the relational model: supports simple, powerful querying of data. Queries can be written intuitively, and the DBMS is responsible for efficient evaluation. The key: precise semantics for relational queries. Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 5 The SQL Query Language Developed by IBM (system R) in the 1970s Need for a standard since it is used by many vendors Standards: . SQL-86 . SQL-89 (minor revision) . SQL-92 (major revision) . SQL-99 (major extensions, current standard) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 6 Creating Relations in SQL Creates the Students relation. Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. CREATE TABLE Students string type with (sid CHAR(20), different length name CHAR(30), login CHAR(20), DOB DATE, gpa REAL); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 7 Creating Relations in SQL As another example, the Course table holds information about courses that students take. CREATE TABLE Course (cid CHAR(20), cname CHAR (20), description CHAR( 100) DEFAULT ‘Course Description’ ); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 8 Creating Relations in SQL As another example, the Enrolled table holds information about courses that students take. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(10)); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 9 Adding Tuples We can insert a single tuple into the Students table as follows: INSERT INTO Students (sid, name, login, DOB, gpa) VALUES (‘53666’, ‘Jones’, ‘jones@cs’, ‘2000-09-01’, 3.4); sid name login DOB gpa 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 10 Deleting Tuples We can delete all tuples satisfying some condition (e.g., name = Smith): DELETE S FROM Students S WHERE S.name = ‘Smith’; sid name login dob gpa 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8 Powerful variants of these commands are available; more later! Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 11 Updating Tuples We can modify the column values in an existing row: Old value UPDATE Students S SET S.gpa = S.gpa -1 WHERE S.sid = 53688; sid name login Dob gpa 2.2 53666 Jones jones@cs 2000-09-01 3.4 53688 Smith smith@eecs 2001-08-01 3.2 53650 Smith smith@math 2000-10-10 3.8 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 12 Updating Tuples Where clause is applied first and determines which rows to be modified. UPDATE Students S SET S.gpa = S.gpa - 0.1 WHERE S.gpa >= 3.3; sid nam e login age gpa 3.3 53666 Jones jones@cs 18 3.4 53688 Sm ith sm ith@ eecs 18 3.2 53650 Sm ith sm ith@ m ath 19 3.8 3.7 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 13 Integrity Constraints (ICs) An IC is a condition specified on a database schema and restricts the data that can be stored in an instance of the database, such as domain constraints. ICs are specified when schema is defined. ICs are checked when relations are modified. A legal instance of a relation is one that satisfies all specified ICs. DBMS should not allow illegal instances. If the DBMS checks ICs, stored data is more faithful to real-world meaning. Avoids data entry errors, too! Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 14 Primary Key Constraints A set of fields is a candidate key for a relation if : 1. No two distinct tuples can have same values in all key fields, and 2. No subset of the set of fields in a key is a unique identifier for a tuple. Part 2 false? A superkey (a set of fields containing a key). If there’s >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key. E.g., sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey. Every relation is guaranteed to have a key. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 15 Primary and Candidate Keys in SQL UNIQUE: Candidate keys (you could have many) PRIMARY KEY: one of candidate keys is chosen as the primary key. CREATE TABLE Students (sid CHAR(20), name CHAR(30), login CHAR(20), age INTEGER, gpa REAL, UNIQUE (name, age), PRIMARY KEY (sid)); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 16 Name a constraint We could name a constraint in a table. If the constraint is violated, the constraint name is returned and can be used to identify the error. CONSTRAINT constraint-name CREATE TABLE Students (sid CHAR(20), name CHAR(30), login CHAR(20), dob DATE, gpa REAL, UNIQUE (name, dob), CONSTRAINT StudentsKey PRIMARY KEY (sid)); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 17 Primary and Candidate Keys in SQL “For a given student and course, there is a single grade.” vs. CREATE TABLE Enrolled “Students can take only one (sid CHAR(20) course, and receive a single grade cid CHAR(20), for that course; further, no two grade CHAR(2), students in a course receive the PRIMARY KEY (sid, cid) ); same grade.” CREATE TABLE Enrolled Used carelessly, an IC can prevent (sid CHAR(20) the storage of database instances cid CHAR(20), that arise in practice! grade CHAR(2), PRIMARY KEY (sid), UNIQUE (cid, grade) ); Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 18 Foreign Keys, Referential Integrity Foreign key : Set of fields in one relation that is used to `refer’ to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer’. E.g. sid is a foreign key referring to Students: . Enrolled(sid: string, cid: string, grade: string) . If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 19 Foreign Keys in SQL Only students listed in the Students relation should be allowed to enroll for courses. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid, cid), FOREIGN KEY (sid) REFERENCES Students(sid) ); Enrolled sid cid grade Students 53666 Carnatic101 C sid name login age gpa 53666 Reggae203 B 53666 Jones jones@cs 18 3.4 53650 Topology112 A 53688 Smith smith@eecs 18 3.2 53666 History105 B 53650 Smith smith@math 19 3.8 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 20 Enforcing Integrity Constraints ICs are specified when a relation is created and enforced when a relation is modified. Violating the primary key constraint, the transaction will be rejected. Inserting a tuple with an existing sid. INSERT INTO Students (sid, name, login, age, gpa) VALUES (53688, ‘Mike’, ‘mike@ee’, 17, 3.4); . Inserting a tuple with primary key as null. INSERT INTO Students (sid, name, login, age, gpa) VALUES (null, ‘Mike’, ‘mike@ee’, 17, 3.4); . Updating a tuple with an existing sid. UPDATE Students S SET S.sid = 50000 WHERE S.sid=53688; Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 21 Enforcing Integrity Constraints Deletion of Enrolled tuples do not violate referential integrity, but insertions could. Inserting a tuple with an un-exist sid in Students. INSERT INTO Enrolled (sid, cid, grade) VALUES (51111, ‘Hindi101’, ‘B’); Insertion of Students tuples do not violate referential integrity, but deletions could. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 22 Ways to handle foreign key violations If an Enrolled row with un-existing sid is inserted, it is rejected. If a Students row is deleted/updated, . Option 1: Delete/Update all Enrolled rows that refer to the deleted sid in Students (CASCADE). Both are affected . Option 2: Reject the deletion/updating of the Students row if an Enrolled row refers to it (NO ACTION ).