6 Slides Per Page In
Total Page:16
File Type:pdf, Size:1020Kb
Summary of Chapter 6 ■ Domain Constraints CMPT 354 ■ Referencial Integrity Database Systems and Structures ■ Assertions ■ Triggers Osmar R. Zaïane ■ Functional Dependencies Summer 1998 CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 Contents Chapter 7 Objectives ■ Functional Dependencies (review) ■ Pitfalls in Relational Database Design Database Design ■ Decomposition ■ Normalization ■ Normalization using Functional Dependencies ■ Understand problems associated with Normalization using Multivalued Dependencies redundant information; Learn the purpose of normalization. CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 Functional Dependencies Closure of a set of Functional Dependencies Functional dependencies play an important role in database design. Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F. They describe relationships between attributes. The set of all functional dependencies logically implied by F is the +. They require that the value for a certain set of attributes determines closure of F denoted by F uniquely the value for another set of attributes. F+ can be found by applying Armstrong’s Axioms: α ⊆ β ⊆ Let R and R reflexifity if β ⊆ α , then α → β α → β γα → γβ α → β holds on R if augmentation if , then transitivity if α → β and β → γ , then α → γ ∀ ∈ α α β β t1, t2 r where r(R), if t1 [ ] = t2 [ ] then t1 [ ]=t2 [ ] these rules are sound and complete. CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 1 Closure of a set of Functional Dependencies Closure of Attribute Sets We can further simplify the computation of F+ by using additional The closure of an attribute set α under F (denoted by α+) is the set of rules: attributes that are functionally determined by α under F. α → β is in F+ ⇔⊆βα+ union if α → β holds and α → γ ,then holds α → βγ α → γ + decomposition if holds, then α → β and hold Algorithm to compute α : pseudotransitivity if α → β and γβ → δ holds, then α γ → δ holds result := α; these rules can be inferred from Armstrong’s axioms. while (changes to result) do for each β → γ in F do begin ∪ γ if β ⊆ result then result:=result ; end CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 Canonical Cover Database Design ■ Pitfalls in Relational Database Design To compute a canonical cover for F: ● Repetition of information ● Inability to represent certain information repeat ● Loss of information use the union rule to replace any dependencies in F α →β α→βα→ββ 11 and 12 with 112 α → β ■ To find desirable collection of relation schemas we should follow these find a functional dependency with an goals: extraneous attribute either in α or in β α → β ● avoid redundant data if an extraneous attribute is found, delete it from ● represent relationships among attributes until F does not change ● facilitate the checking of updates for violation of database integrity constraints CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 Ship(S#, sname, status, city, P#, pname, colour, weight, qty, date) Consider a database containing shipments of parts from suppliers. Problems?? Ship(S#, sname, status, city, P#, pname, colour, weight, qty, date) INSERT We can not enter a new suplier in the DB until the supplier ships a part. S# sname status city P# pname colour weight qty date We can not enter a new part until it is shiped by one supplier. S1 Smith 20 London P1 Nut Red 12 200 980620 S1 Smith 20 London P1 Nut Red 12 700 980625 S2 Jones 10 Paris P3 Screw Blue 17 400 980620 DELETE S2 Jones 10 Paris P5 Bolt Green 17 300 980620 If we delete the only tuple of a particular supplier, we destroy not only S2 Jones 10 Paris P2 Screw Red 14 200 980621 the shipment information but also the information that the supplier is S3 Clark 20 Rome P1 Nut Red 12 300 980612 from a particular city. (example S5 and P1) S3 Clark 20 Rome P6 Cog Red 19 600 980612 S4 Blake 30 Athens P4 Cam Blue 12 200 980619 UDATE S4 Blake 30 Athens P1 Nut Red 12 300 980619 S4 Blake 30 Athens P3 Screw Blue 17 100 980620 The city of a supplier may appear many times in Ship. If we change the S5 Alex 10 Paris P1 Nut Red 12 250 980626 value of the city in one tuple but not all the other, it creates inconsistency in the database. CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 2 The solution is to decompose Ship into three relations Shipment, Supplier(S#, sname, status, city) , pname, colour, weight) city → status Supplier, and Parts Parts(P# Supplier(S#, sname, status, city) Shipment(S#, P#, qty, date) Problems?? Parts(P#, pname, colour, weight) INSERT Shipment(S#, P#, qty, date) We can not enter the fact that a particular city has a particular status S# sname status city until we have a supplier from that city S1 Smith 20 London S2 Jones 10 Paris S# P# qty date DELETE S3 Clark 20 Rome S1 P1 200 980620 If we delete the only tuple with a particuar city, we destroy not only S4 Blake 30 Athens S1 P1 700 980625 the upplier information but also the information that that city has that S5 Alex 10 Paris S2 P3 400 980620 S2 P5 300 980620 particular status. P# pname colour weight S2 P2 200 980621 P1 Nut Red 12 S3 P1 300 980612 UDATE P2 Screw Red 14 S3 P6 600 980612 P3 Screw Blue 17 The city of a supplier may appear many times in Supplier (ther relation S4 P4 200 980619 P4 Cam Blue 12 still contains redundancy). If we change the status for Paris, we have to S4 P1 300 980619 P5 Bolt Green 17 S4 P3 100 980620 change it for all tuples his Paris as the city or we face inconsistency. P6 Cog Red 19 S5 P1 250 980626 CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 The solution is to decompose Supplier into 2 relations Supplier and Status Relation Decomposition Supplier(S#, sname, city) Status(city, status) Parts(P#, pname, colour, weight) Shipment(S#, P#, qty, date) In order to solve the previous problems we dcomposed the S# sname city relations into smaller relations. S1 Smith London city status S2 Jones Paris London 20 S# P# qty date Paris 10 S3 Clark Rome S1 P1 200 980620 Relation decomposition can be dangerous we we can loose Rome 20 S4 Blake Athens S1 P1 700 980625 information. Athens 30 S5 Alex Paris S2 P3 400 980620 S2 P5 300 980620 Loss-less join decomposition P# pname colour weight S2 P2 200 980621 P1 Nut Red 12 S3 P1 300 980612 P2 Screw Red 14 S3 P6 600 980612 If we decompose a relation R into smaller relations, the join of P3 Screw Blue 17 S4 P4 200 980619 those relations should return R, not more, not less. P4 Cam Blue 12 S4 P1 300 980619 P5 Bolt Green 17 S4 P3 100 980620 P6 Cog Red 19 S5 P1 250 980626 CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 Normalization First Normal Form Normilization is a technique for producing a set of relations An unnormalized form: a table that contains one or more with desirable properties, given the data requirments of an repeating groups enterprise. First Normal Form: a relation in which the intersection of each row and column contains one and only one value. 5NF 4NF BCNF 3NF 2NF 1NF CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 3 Second Normal Form Third Normal Form Transitive dependency: A condition where A, B and C are attributes of a relation such that if A →→ B and B C, then C is transitively dependent on A via B. Second Normal Form: a relation that is in first normal form and every non-primary key attribute is fully functionally dependent Third Normal Form: a relation that is in first and second normal on the primary key. form, and in which no non-primary key attribute is transitively dependent on the primary key. CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 Boyce/Codd Normal Form Designing a Student Database -1 1- Students take courses stud-ID 2- Students typically take more Determinant: an attribute or a group of attributes on which stud-name than one course some other attribute is fully functionally dependent. coop-dept 3- Students can fail courses and coop-advisor can repeat the same course in course different semesters => Students Boyce/Codd Normal Form: a relation is in BCNF if and only if credit repeated for each can take the same course more every determinant is a candidate key. semester course taken than once. grade 4- Students are assigned a grade course-room repeated for each for each course they take. instructor time a particular course is attempted instructor-office The data that needs to be retained has repeating groups Procedure: Remove repeating groups by adding extra rows to hold the CMPT354 - Ch7 - summer98 CMPT354 - Ch7 - summer98 repeated attributes. => Add redundancy. Designing a Student Database -2 Designing a Student Database -3 Student_course(stud-ID, stud-name, coop-dept, coop-advisor, course, credit, 1- stud-name, coop-dept, coop-advisor are dependent only semester, grade, course-room, instructor, instructor-office) upon stud-ID 2- credit is only dependent upon course and is independent of which semester it is offered and which student is taking it. 1NF: Tables should have no repeating groups 3- course-room, instructor and instructor-office only depend upon the course and the semester and are independent of which student is taking the course. 4- Only grade is dependent upon all 3 parts of the original key. •Redundancy •Insert anomalies In 1NF there are non-key attributes which depend •Delete anomalies on only part of the key.