(BCNF) Decomposition of a Relation Scheme Example (Same
Total Page:16
File Type:pdf, Size:1020Kb
Functional Dependencies (Review) • A functional dependency X → Y holds over relation Schema Refinement schema R if, for every allowable instance r of R: and Normalization t1 ∈ r, t2 ∈ r, π (t1) = π (t2) πX πX implies Y (t1) = Y (t2) (where t1 and t2 are tuples;X and Y are sets of attributes) CS 186, Fall 2002, Lecture 6 R&G Chapter 15 • In other words: X → Y means Given any two tuples in r, if the X values are the same, then the Y values must also be the same. (but not vice versa) Nobody realizes that some people • Can read “→” as “determines” expend tremendous energy merely to be normal. Albert Camus Normal Forms Boyce-Codd Normal Form (BCNF) • Back to schema refinement… • Reln R with FDs F is in BCNF if, for all X → A in F+ • Q1: is any refinement is needed??! –A ∈ X (called a trivial FD), or • If a relation is in a normal form (BCNF, 3NF etc.): – X is a superkey for R. – we know that certain problems are avoided/minimized. • In other words: “R is in BCNF if the only non-trivial FDs – helps decide whether decomposing a relation is useful. over R are key constraints.” • Role of FDs in detecting redundancy: • If R in BCNF, then every field of every tuple records – Consider a relation R with 3 attributes, ABC. information that cannot be inferred using FDs alone. • No (non-trivial) FDs hold: There is no redundancy here. – Say we know FD X → A holds this example relation: X Y A •Given A → B: If A is not a key, then several tuples could have the same A value, and if so, they’ll all have the same B value! • Can you guess the value of the x y1 a • 1st Normal Form – all attributes are atomic missing attribute? x y2 ? st nd rd • 1 ⊃2 (of historical interest) ⊃ 3 ⊃ Boyce-Codd ⊃ … •Yes, so relation is not in BCNF Decomposition of a Relation Scheme Example (same as before) S N L R W H • If a relation is not in a desired normal form, it can be 123-22-3666 Attishoo 48 8 10 40 decomposed into multiple relations that each are in that normal form. 231-31-5368 Smiley 22 8 10 30 131-24-3650 Smethurst 35 5 7 30 • Suppose that relation R contains attributes A1 ... An. A Hourly_Emps decomposition of R consists of replacing R by two or more 434-26-3751 Guldu 35 5 7 32 relations such that: 612-67-4134 Madayan 35 8 10 40 – Each new relation scheme contains a subset of the attributes of R, and •SNLRWH hasFDs S → SNLRWH and R → W – Every attribute of R appears as an attribute of at least one of the new relations. • Q: Is this relation in BCNF? No, The second FD causes a violation; W values repeatedly associated with R values. 1 Decomposing a Relation Problems with Decompositions • Easiest fix is to create a relation RW to store these associations, and to remove W from the main • There are three potential problems to consider: schema: 1) May be impossible to reconstruct the original relation! S N L R H (Lossiness) 123-22-3666 Attishoo 48 8 40 • Fortunately, not in the SNLRWH example. R W 231-31-5368 Smiley 22 8 30 2) Dependency checking may require joins. 810 131-24-3650 Smethurst 35 5 30 • Fortunately, not in the SNLRWH example. 57 434-26-3751 Guldu 35 5 32 3) Some queries become more expensive. 612-67-4134 Madayan 35 8 40 Wages • e.g., How much does Guldu earn? Hourly_Emps2 Tradeoff: Must consider these issues vs. •Q: Are both of these relations are now in BCNF? redundancy. •Decompositions should be used only when needed. –Q: potential problems of decomposition? An Aside – Natural Join Lossless Decomposition (example) S N L R H • Natural Join is a fundamental operator of relational 123-22-3666 Attishoo 48 8 40 algebra R W 231-31-5368 Smiley 22 8 30 ba 810 131-24-3650 Smethurst 35 5 30 ba • Semantics of R ba S are: 57 – Compute Cartesian Product of R and S 434-26-3751 Guldu 35 5 32 – Select out those tuples where the common attributes of R 612-67-4134 Madayan 35 8 40 and S have the same values – Keep all unique attributes of these tuples plus one copy of S N L R W H each of the common attributes. 123-22-3666 Attishoo 48 8 10 40 • More on this in the next lecture 231-31-5368 Smiley 22 8 10 30 = 131-24-3650 Smethurst 35 5 7 30 434-26-3751 Guldu 35 5 7 32 612-67-4134 Madayan 35 8 10 40 Lossy Decomposition (example) Lossless Join Decompositions A B C A B B C 1 2 3 12 2 3 • Decomposition of R into X and Y is lossless-join w.r.t. 4 5 6 a set of FDs F if, for every instance r that satisfies F: 45 5 6 π π 7 2 8 X(r) Y (r) = r 2 8 72 ⊆ π π • It is always true that r X (r) Y (r) A → B; C → B – In general, the other direction does not hold! If it does, the decomposition is lossless-join. A B C A B B C • Definition extended to decomposition into 3 or more 123 relations in a straightforward way. 12 2 3 456 ba 5 6 = • It is essential that all decompositions used to deal with 45 728 redundancy be lossless! (Avoids Problem #1) 2 8 72 1 2 8 7 2 3 2 More on Lossless Decomposition Lossless Decomposition (example) • The decomposition of R into X and Y is lossless with respect to F if and only if the A B C A C B C closure of F contains: 1 2 3 1 3 2 3 4 5 6 5 6 X ∩ Y → X, or 4 6 7 2 8 7 8 2 8 X ∩ Y → Y in example: decomposing ABC into AB and BC is A → B; C → B lossy, because intersection (i.e., “B”) is not a key of either resulting relation. A C B C A B C 1 3 1 2 3 • Useful result: If W → Z holds over R and W ∩ Z is 2 3 4 6 ba 4 5 6 empty, then decomposition of R into R-Z and WZ is 5 6 = 7 8 7 2 8 2 8 loss-less. But, now we can’t check A → B without doing a join! Dependency Preserving Decomposition Dependency Preserving Decompositions (Contd.) • Dependency preserving decomposition (Intuitive): • Decomposition of R into X and Y is dependency – If R is decomposed into X, Y and Z, and we ∪ + + preserving if (FX FY ) = F enforce the FDs that hold individually on X, on Y – i.e., if we consider only dependencies in the closure F + that and on Z, then all FDs that were given to hold can be checked in X without considering Y, and in Y without on R must also hold. (Avoids Problem #2 on considering X, these imply all dependencies in F +. our list.) • Important to consider F + in this definition: → → → • Projection of set of FDs F : If R is decomposed into –ABC, A B, B C, C A, decomposed into AB and BC. – Is this dependency preserving? Is C → A preserved????? X and Y the projection of F on X (denoted FX ) is the set of FDs U → V in F+ (closure of F , not just F ) such •note: F+ contains F ∪ {A → C, B → A, C → B}, so… that all of the attributes U, V are in X. (same holds → → → → for Y of course) •FAB contains A B and B A; FBC contains B C and C B •So, (FAB ∪ FBC)+ contains C → A Decomposition into BCNF • Consider relation R with FDs F. If X → Y violates BCNF, BCNF and Dependency Preservation decompose R into R - Y and XY (guaranteed to be loss-less). – Repeated application of this idea will give us a collection of relations • In general, there may not be a dependency preserving that are in BCNF; lossless join decomposition, and guaranteed to decomposition into BCNF. terminate. – e.g., CSZ, CS → Z, Z → C → → → – e.g., CSJDPQV, key C, JP C, SD P, J S – Can’t decompose while preserving 1st FD; not in BCNF. – {contractid, supplierid, projectid,deptid,partid, qty, value} • Similarly, decomposition of CSJDPQV into SDP, JS and –To deal with SD → P, decompose into SDP, CSJDQV. CJDQV is not dependency preserving (w.r.t. the FDs –To deal with J → S, decompose CSJDQV into JS and CJDQV JP → C, SD → P and J → S). – So we end up with: SDP, JS, and CJDQV • {contractid, supplierid, projectid,deptid,partid, qty, value} • Note: several dependencies may cause violation of BCNF. The – However, it is a lossless join decomposition. order in which we ``deal with’’ them could lead to very different sets of relations! – In this case, adding JPC to the collection of relations gives us a dependency preserving decomposition. • but JPC tuples are stored only for checking the f.d. (Redundancy!) 3 Third Normal Form (3NF) What Does 3NF Achieve? → • Reln R with FDs F is in 3NF if, for all X → A in F+ • If 3NF violated by X A, one of the following holds: – X is a subset of some key K (“partial dependency”) A ∈ X (called a trivial FD), or • We store (X, A) pairs redundantly. X is a superkey of R, or • e.g.