WHAT DOES BOYCE-CODD NORMAL FORM DO?+
Philip A. Bernstein Nathan Goodman
Aiken Computation Laboratory Harvard University Cambridge, MA 02138
Abstract The benefits of Boyce Codd normal form (BCNF) are well understood at an intuitive level. BCNF Normalization research has concentrated on defining and its precursor, third normal form (3NF), were normal forms for database schemas and developing introduced by Codd to eliminate "anomalous" up- efficient algorithms for attaining these normal date* behavior of relations. BCNF accomplishes forms. It has never been proved that normal forms this by "placing independent relationships into are good, i.e. that normal forms are beneficial to independent relations." (A more precise character- database users. This paper considers one of the ization appears later.) BCNF is widely recommended earliest normal forms (Boyce Codd normal form in practice for this purpose [Date, Mar], and there [Cod21) whose benefits are intuitively understood. is little doubt that BCNF is a good normal form in We formalize these benefits and attempt to prove many practical situations. that the normal form attains them. Instead we prove the opposite: Boyce-Codd normal fomn fails This paper formalizes the intuitive benefits to meet its goals except in trivia2 eases. This of BCNF and attempts to prove that these benefits counterintuitive result is a consequence of the are attained. The attempt fails. Instead we prove "universal relation assumption" upon which normal- that BCNF does not meet its goals, except in trivial ization theory rests. Normalization theory will eases. This result is quite disturbing because it remain an isolated theoretical area, divorced from contradicts BCNF's intuitive appeal and empirically database practice, until this assumption is observed benefits. Either the theory is wrong, or circumvented. database practitioners are making a serious error by using BCNF. Our opinion is that the theory is at fault--that normalization theory, as currently constituted, is inadequate to characterize the 1. Introduction practical use and benefits of normal forms. The source of inadequacy can be traced to the "universal Normalization is a schema design process. The relation assumption" upon which normalization theory input is a schema (i.e., database description) is based. that describes an application; the output is another schema that describes the same application, but in On the surface, this paper is an analysis of a "better" way. Normalization theory provides Boyce Codd normal form. Cur deeper purpose is to mathematical tools for defining schemas, for draw attention to a missing link in normalization characterizing the application that a schema re- theory. For normalization theory to be well presents, and for specifying the form a "good" grounded in the database problems it intends to schema must satisfy. Most normalization research solve, it must be possible to use the theory to has focused on defining good forms for schemas prove the benefits of normal forms. Unfortunately, (MoMnal forms) and developing efficient algorithms it appears that normalization theory, as currently for attaining these normal forms. However, normal- formulated, cannot do this.. Until this missing ization theory does not provide tools for explaining link is filled in, normalization theory will be why normal forms are good. Presumably, normal forms incapable of explaining what Boyce Codd normal help database users in some way, but these benefits form does. have never been characterized within the formal framework of normalization theory. This paper attempts such a characterization for one particular normal form, namely Boyce Codd normal form.
t This work was supported by the National Science Foundation under Grant Number MCS-77-05314 and by * the Advanced Research Projects Agency of the The word "update" is used generically to denote Department of Defense, Contract Number N00039-78- insertions, deletions, or replacements (i.e., G-0020. modification) of tuples in a relation.
w1534-7/80/0000-0245$00.75 0 1980 IEEE 245 2. Terminology D
In formalizing relational databases we distin- 2.2 Functional Dependencies guish the time-invariant description of a relation VS. its contents at a particular moment. The A functional dependency (abbr. FD) is a state- description of a relation is called a relation ment of the form f:X+Y, where f is the name of schema and consists of a set of attributes, and a the FD, and X,Y are sets of attributes. f is set of functional dependencies. R=
Figure 2.3. Inference Rules for Relation Schema: as defined in Fig. 2. Functional Dependencies. Relation: R
E# JC D# M# CT Within the confines of a single relation schema, F+ is precisely the set of FDS implied by 1 a x 11 g F: i.e., fE F+ iff f holds in every state of 2 c x 11g R in which F holds [Arm]. This fact is called 3 a y 12 n the coi??pZeteness property of FD-closure [BFH]. 4 b x 11 g 5 b ~12 n FDS also enjoy the uniqueness property within 6 c y 12 n a single relation schema, which states that syn- 7 a z 13 n tactically identical FDs are semantically equi- 8 c z 13 n valent as well [Bern].
Figure 2.2. An Example Relation. 2.3 The Universal Relation Assumption: Schema Equivalence Notationally, states of relation schema R Intuitively, two schemas are equivalent if are represented by R with possible superscript. they represent the same external application. For Elements of R (called tuples) are denoted r relation schemas this notion may be formalized in with possible superscript. If Xc_ATTR, we use terms of consistent states: relation schema to denote the projection of r on the r [xl R=
246 to substitute a "reduced" set of FDs for the ones is precisely representable by the set REP(D) = supplied by the user. Without this substitution, CUIgDEDOMQ), U= *;=l D(RiIl. Two schemas -D and many schemas cannot be normalized due to redun- dancies in the way FDs are expressed in the D’ represent the same external application, i.e., schema [Bern]. are equivalent under the UR-assumption if REP(D) = REP(D'). But REP(D) =DOM(U=
$ = f.mm? = <{C~URSE,PROF,TA,DEPT~, A special case of schema equivalence arises {COURSE+PROF; COURSE+TA; frequently in this paper. Let U=(ATTR,F> be a COURSE+DEPT; PROF'DEPT; (universal) relation schema, and-let D={Ri = TA+DEPT]>j
Figure 2.4. Completeness and Uniqueness Do Not The following facts are well known: Hold in Multi-Relation Databases. (i) J and P are total functions.
This problem is quite serious. It means that (ii) J(P(U)) 2U. the definition of schema equivalence used to con- (iii) P(J(D)) = D. struct a normalized schema g is not semantically (iv) UCU' implies P(U)
Let g= {Ri=(ATTRi,Fi>li=l,...,n) be a data- A stronger form of representation is used by some authors [Codll, [Risl. DRepd-represents g if base schema and let D be a database state for D. g RepZ-represents c and for all UEDOM(U_I, u satisfies the universal relation asswnption (0; is UR-consistent) if there exists a relation state J(P(U11 =U. Repl-representation implies that DOMQ) and DCM(Uj are isomorphic, with J and U for schema v=(ATTR,@ SUCK that UIATTRiI=D(Ri) P the respective-isomorphisms. Hence every con- D is COnSiStsnt if it is UR- for i=l,...,n. sistent state of u can be exactly reconstructed D(Ri)fDOM(Ri) for i=l,...,n. consistent, and from the corresponding state of g. DOM(D) denotes the set of all consistent states of i-i-' 2.5 Boyce Codd Normal Form By requiring that database states satisfy the UR-assumption, we can attain the desired notion of Let R=
247 if X'CX, because it holds in every state of a In this section we formalize this motivation, relation independent of F. and prove that BCNF relation sckemas attain it. In Sections 4 and 5 we will consider multi-relation 5 is in Boyce Co& Normal Form (abbr. BCNF) database schemas. if for all non-trivial FDs X+Y in F+, X is a superkey. In extending this notion to database schemas, we must be conscious of the UR-assumption. 3.1 Update Operations We say that Ri=
Let -R=ZATTR,F> be a schema. The update 3. Goals of Boyce Codd Normal Form operations we define on R are + (insertion), - (deletion), and & (replacement). Third normal form and later Boyce Codd normal form were introduced to eliminate anomalous update If REDOM@), then behavior. Let us quote from [Cod11 the motivation RU {r), if consistent +(R,r) = for these normal forms. R , otherwise
Looking at a sample instantaneous -(R,r) = R-{r) tabulation of R (Figure 2.2) the un- desirable properties of the R schema if consistent become immediately apparent. -We observe, , otherwise. for example, that, if the manager of department y should change, more than one tuple has to be updated. The actual 3.2 Insertion and Deletion Anomalies number of tuples to be updated can, and usually will, change with time. A Intuitively, R exhibits an insertion anomaly similar remarkapplies if department x if for some states R and tuples r, the operation is switched from government work (con- + (R,r) affects one collection of FDs, while for tract type g) to non-government work c-her states and tuples the insertion affects a (contract type n). different collection of FDs. The goal of normal- ization (with respect to insertions) is to eliminate Deletion of the tuple for an employee such anomalies. has two possible consequences: deletion of the corresponding department infor- In formalizing this notion a technical problem mation if his tuple is the sole one re- If R=@, maining just prior to deletion, and non- surfaces. the insertion of any tuple must affect every FD, while if R#@, the insertion deletion of the department information of any tuple already in R does not affect any FD. otherwise. Consequently, every schema exhibits insertion ano- malies in this trivial sense (unless ATTR=@). A Inserting an employee tuple has the complementary similar problem arises with deletions: if R={r}, problem: if the employee is the first member of a the deletion of r must affect every FD, while department, we must simultaneously insert new M# for any R, the deletion of a tuple not in R cannot and CT information for that department; inserting affect any FD. the second (third, etc.) employee into that depart- ment has no such requirement. A second problem arises with respect to trivial FDs. Since every attribute functionally determines Abstracting from the above discussion, we find itself, the "collection of FD?." affected by an up- p- to be an undesirable schema because the precise date will in general vary by these trivial FDs at effects of a given update operation cannot be pre- least. We therefore choose to discard trivial FDs dicted simply by examining the schema. One must in this context. examine the state of E to see the effects that a given update produces. When all of the effects of Formally, we say that +(R,r) affects f:X+Y an update can be determined by examining the schema if R[XUYl# (+(R,r)) [XUYI, and we define alone, we say that the update is syntacticaZZy pre- dietabLe (more precise definitions will appear Affect(+g) = {fEF+lf is non-trivial, and for some later). R#@, and some r, +(R,r) affects f1 In state R for example, replacing department y's manager by another value implies updating a NoAffect(+FJ) = {fE F+(f is non-trivial, and for some (syntactically) unpredictable number of tuples. R#@, and some r, +ULr) #R, Inserting an employee into (a new) department v yet +(R,r) does not affect f} requires creating new department-related data, while inserting an employee into (an existing) R is free of insertion m20VZaZieS iff Affect(+R) n department x does not have this effect. soAffect ~$5. The definition of deletion an% alies is similar and appears in Figure 3.1.
248 Affect(-g) = {fEF+lf is non-trivial, and for some leaving all other values unchanged. This replace- R and r, R#Ir}, yet -(R,r) ment was judged to be anomalous because an unpre- affects f1. dictable nwnber of tuples might have to be replaced to achieve this effect. If we scrutinize this NoAffect ={fEF+(f is non-trivial, and for some motivation, however, difficulties emerge. R and r, Rf {r} and -(R,r) #R, yet -(R,r) does not affect f). The operation used in the example is not a single-tuple replacement of the type defined in -R is free of dSlSi%On anOf&iSS iff Section 3.1. Instead, it attempts to modify a set Affect(-E) nNoAffect(-g) =@. of tuples that satisfy a boolean qualification, "D# = y" in this case. Arbitrarily.many tuples may Figure 3.1. Definition of Deletion Anomaly. satisfy an arbitrary qualification. So, nonpre- dictability of this sort can hardly be called "anomalous*'. Apparently, Codd did not have such It is conjectured in [Cod21 that BCNF schemas general operations in mind. are the only ones that can avoid insertion and deletion anomalies. Now that we have formalized The operation that seems to be intended is the these anomalies, we can prove this conjecture true. following. Let R=
only if: We prove that if R is not in BCNF, then NoAffect(+l+i)#@. If E is not in BCNF, 4. Update Operations Under UR-Assumption then there exists a non-trivial FD f:X+Y in F+ for which X is not a superkey. We will construct In the following two sections we attempt to an insertion that does not affect f. extend Theorem 1 to multi-relation database schemas. This attempt will fail, because of the be any consistent state containing Let R' UR-assumption. Instead, we shall prove that BCNF such that two (or more) tuples r and r' database schemas are not free of insertion and r[Xl =r'tXl. Such a state must exist by the com- deletion anomalies except in trivial cases. pleteness of Ff: since X+ATTR$?F+, there must exist a state R' in which F+ holds, but 4.1 Problems in Preserving UR-Consistency X+ATTR does not hold [Theorem 3, BFHI; for X+ATTR not to hold in R', distinct tuples r Recall the example of Section 3. R is not in and r' must be present in R' with r[X) =r'[Xl. BCNF and therefore exhibits update anomglies; the remedy recommended in [Cod11 is to decompose R Let R=R'-{r). R#@, +(R,r)=R'#R, yet into two relation schemas, Rl and R2, which-are Hnece, NoAffect #PI, +(R,r) does not affect f. in BCNF (see Fig. 4.1). No&e that the database as desired. schema D= (R1,R2} Rep4-represents 3 therefore g and E have equivalent representational power, (ii) The result for deletions follows by 0 provided the database system only permits UR- similar proof. consistent states of g to occur.
This leaves us the problem of designing update 3.3 Replacement Anomalies operations that preserve UR-consistency. For example, to insert
249 deleted too. In these two cases, the requirement 4.2 Update Operations that Preserve UR-Consistency of UR-consistency dictates a "natural" semantics for the update. However, in other cases, the Let D= {P,l ,...,%I be a database schema. choice of a reasonable semantics is less con- The update-operations defined on 2 are named strained. +Eit -Ri# and 'Ri for i=l,...,n. (We include the schema name, _Ri, as part of the operation name For example, suppose we want to insert for notational clarity.) If DEDOM@), then <9, d, v> into R~. One way to preserve UR- consistency is to insert <9, d, v> into Rl and +R. (D,r. denotes insertion of ri into D(Ri),
250 into a state that does not perform the insertion. COnSiStent states of a schema, but it is technically (A more precise statement is given in Appendix II.) better to quantify over reachabZe states only; in light of Theorem II.1 (see Apendix II), though, The SetR~tiCS of +Ri, -Fit and IRi can HOW this distinction has no impact. Second, since we be defined. Let DEDOM@). +Ri(D,ri) =u?z~ state may assume that D is in BCNF, we may interpret that nonextraneously performs +Ri (D,ri) a -3 and Affect and NoAffect as sets of relation s&emus & Ri have analogous definitions.. In general, instead of FDs: e.g., for any Ri, R. ED, +Ri (Dtri) and &FLi(D,ritrj) have many possible Ej EAffect(+Ei) iff every non-trivia '-1 FE in meanings and we treat these operations as non- F'[ATTRjI EAffect(+fji), because all external effects deterministic functions. By contrast, -zi (D.ri) is uniquely specified by this definition, a fact of +Ri are themselves insertions. Figure 5,l we prove in Appendix II. presents the definitions of Affect, NoAffect, and anomalies as used in this section.
.Database Schema: Affect(+zR) = (Ej EDI for some D other than the g= $=<{E#,JC,D#},{E#+JC,D#}>, empty state, for some tuple R2=<{D#,M#,CT},{D#'M#,CT;M#'D#,CT}>) ri, and for some meaning, +Ri(D,ri) affects Rj' Database: NoAffect = CRj ECj for some D other than the E# JC D# empty state, for some tuple ri, and for some meaning, lax +Ei(D,ri) #D, yet +Ri(D,ri) 2 c x does not affect Rj} 3 a Y 4 b x R1 = 5 b y 8 R2 = Affect(-Ri) = {Ej EDI for some D and ri, D(Ei)# (r.1, yet -Ri(D,ri) affects 6 c Y R. j 7 a 2 -1 8 c z l-----l NoAffect(-Ri) = iEj EDI for some D and ri, D(Ri)# are in BCNF and g Rep4-represents tril, ad -Ri (Dtri) #D, 9 and 52 f\l of Fig. 2.1. yet -Ei (D, ri) does not - affect Rjl. Figure 4.1. A Normalized Database Schema. D is free of insertion anomalies iff for all Ri E 2, Affect(+Ri) n NoAffect =!a 5. BCNF Does Very Little D is free of deZetion anomaZies iff for all _RiQ, Affect(-Ei) n NoAffect(-_Ri) =pI Having developed an update semantics that preserves UR-consistency, we are ready to study Figure 5.1. Definitions of Anomalies for UR There are two normalization in this context. Environment. issues: (1) characterizing database schemas that avoid insertion and deletion anomalies: and (2) the schema design question--characterizing A graphic representation of D helps relation schemas U for which there exists a data- characterize the effects of insertions. g, yet avoids in- Let G(c) base schema g that represents be an undirected graph whose Vertex set equals g, sertion and deletion anomalies. and whose e&e set contains (Rip Rj) iff ATTRinATTRj 10. 5.1 Database Schemas that Avoid Anomalies Lemma 1. (i) Affect(+R.) =tR.EDlR. and R. Throughout this section let D= {Rl,...,s}, are connected by a path in c&)).-' - -I -1 and U=(ATTR,F>, such that g Rep2-represents --u I? D is to avoid update anomalies it is (ii) NoAffect(+Ri) = {zj EgjATTRj +ATTRiEF+}. necessary for D to be in BCNF: otherwise the "internal effe&" of updates would exhibit anon- Proof. See Appendix 11.4. 0 alies as proved in Section 3. However, we shall prove that BCNF is not sufficient to prevent anom- Lemma 2. (i) Affect(-Ri) =Affect(+R?) alies caused by the "external effects" of these operations. (ii) NoAffect(-Ri) =NoAffeCt(+Ri) .
The definitions of insertion and deletion Proof. See Appendix II.4. anomalies require technical changes in the UR- environment. First, the definitions of Affect and NoAffect given in Section 3 quantify over all
251 Combining these results we obtain a precise (a) u = <{E#,D#,M#,PR~J},IE#'D#;M#~PRoJ}> characterization of database schemas that are free of insertion and deletion anomalies under the UR- g = IR~=<{E#,D#},{E#~D#}>, assumption. R2=<{M#,PROJ},{M#+PROJ}>} Theorem 3. D is free of insertion anomalies (or equivalently ;teletion anomalies) iff D is in (b) g = <(:SUPPLIER,PART,CITY},{SUPPLIER-+CITY~> BCNF, and J@OP all Ri, Rj, if ATTRin ATTRj #@ -D = {_Rl=<{SU~PLIE~,CI~~l,{~UP~~I~~+~~~~}> then ATTRi+ATTRjEF' and ATTRj~ATTRiEF'. R2=
5.2 Attainability of Normalized Schemas Proof. If c is in BCNF there is nothing to prove. To prove the converse, observe that if -D The remaining issue is one of schema design. Rep4-represents 2, the J operator must denote a Let U=
252 APPENDIX I since
253 If we adopt meaning (1) now, the deletion of The resulting state is
1. delete
2 D2:R2 R2 = 1 2. delete <2,13,ti from 22; to implement this deletion the database sysfem must delete ltiull,null,z> from Rl, ?,13,n> Observe that the deletion of
D: Rl = 1 v ( null 1 null1 CT [ VP operation: delete
I.3 Another Interpretation of Delete If the user says to insert
254 post expenses against which accounts. The FD A#+D# is present in R4 because the "chart of accounts" normally mirrors the organizational hierarchy; R5 has no FDs because each employee may generally charge to multiple accounts and vice versa.
Consider the state
E# JC D# 1 D1 R1= 1 a x ,...,R = :1/ 4 'r8 c z-f Suppose the user subsequently determines that de- partment 'v' has vice-president 'AGNEW'. TO L place this fact in the database, the usual pro- cedure is and suppose two new employees are hired into depart- ment '2' with job-codes 'd'. If we adopt the I. Find the tuple that describes depart- policy suggested in I.4 of inserting unique null R3 ment v's vice-president. I.e., values whenever a null value is needed, the resul- ting state is (i) restrict R2 by D#= 'v' (ii) joint the result of (i) with R3 IE# (iii) project OntO ATTR 3 = {CT,VP}. 2 114 D2:R2= ,R II. Replace the VP-projection of all tUpleS 1 5 9 null found in (I) by 'AGNEW'. 10 null'a5 =11 8 Ii 2 Applying this procedure to D , we obtain
D# 1 M# 1 CT 2 I The values of 4 and R5 are bizarre--it is highly unlikely that the hiring of every new employee D3.R3 would require the establishment of a new account. '1 However, the alternative of letting all new hires share one null account number suffers the problems described in I.4 and is wrong, tto.
I.6 Conclusion
Null values are an escape clause from the UR- assumption; they attempt to circumvent UR-consistency in cases where it seems most unrealistic. Since the UR-assumption is central to currently developed This state asserts, erroneously, that AGNEW is vice- database theories, it is not surprising that attempts president of department 'u'. to circumvent it are so troublesome. To avoid this error, the retrieval step (part I) of the procedure must be cognizant of null join values and must insert new null values as it goes. APPENDIX II It would certainly be simpler for the insert operator to insert these extra nulls in the first Properties of Updates Under UR-Assumption place.
I.5 Maximizing the Use of Null Values--A Problem II.1 Update Maps
On the other hand there are cases where maxi- Properties 3 and 4 of Section 4.2 are defined mizing the use of null values is also wrong. Let in terms of sequences of updates operating on in- us add the following relation schemas to D: dividual relations. This section formalizes this concept, refining it into the notion of "update Rq = <{A#,D#},{A#+D#}>, map". and =
255 W(D) = D' such that (ii) there exists a deletion map from D to D' iff D 2 D'; and
, for i#j (iii) there exists a replacement map from D D$i) to D' always. D'$) =
+(D(R.) ,rj), for i=j. II.2 Reachability -3 A database state is reackabZe.if it can be attained by applying a sequence of update operations to an initially empty database state. Reachability is complementary to consistency. By assumption, the consistent states of a schema are the states that A sequence of insertion commands, R+= 256 reachable from the empty state, by induction on Consider the database of Figure 11.2, and the -IuI' I. replacement. &Rl(D, Let U=U'-{u} for any UEU', and let ul: (al bl cl dxl exl>, where dxl, exl D= P(U). D is reachable from the empty state by are variables induction hypothesis, while D'=P(U') is reach- able from D by Lemma 11.1. Thus D' is reach- u2: = <{A,B,C),{A'BC)>, D = = <{A,B,C},CBC'A}>, D = {El - {R-1 R2 = <{B,D},{B+D)>, R2 = <{A,D,E),{A'D,DE->A}>, R3 = <{C,D).IC-+D1>) R3 = 257 II.4 Proofs of Lemmas 1 and 2 is also not affected by these meanings for the insert-on, establishing NoAffect(-Ri) 5 0 LEMMA 1. (i) Affect(+Ri) = I_Rj EDj_Ri and fcj NoAffect(+Ri). are COYZYZeCted by U path in G(D).- (ii) NoAffect(+Ri) = {Rj6glATTRj+ATTRieF+]. References Proof. (i) We first prove that if Ri and Rj are not connected, Rj gAffect(+$). Let D 1-W A.V. Aho, C. Beeri, and J.D. Ullman, "The be any non-empty consistent state, and consider an Theory of Joins in Relational Databases," arbitrary (defined) insertion, +Ri (D,r) . Let D' ACM Trans. on Database Syst., 4,3 (Sept., be any meaning of that insertion, let R, be the 1979), 297-314. insertion map from D to D', and let fii = {<+Rk,rk>EfiIRi and zk are connected). P-1 M.A. Arbib, and E.G. Manes, Arrows, Structures, Observe chat Ri(D) performs the given insertion, and Functors--The Categorical Imperative, and R.cR consequently R. =a,.. R. BAffect(+Ri) Academic Press, New York, 1975. since iRT cf&tains no comma,: of the-3form <+zj,rj> by constiuction. [Arm] W.W. Armstrong, "Dependency Structures of Database Relationships," PPOC. IFIP 74 We now prove that if Ri and -Rj are con- (19741, 580-583. nected, Rj EAffect(+R.). Let D be any non-empty consistent state, and -1 et u be a "thoroughly [BBGI C. Beeri, P.A. Bernstein, and N. Goodman, distinct" tuple relative to D, meaning that for "A Sophisticate's Introduction to Database all AEATTR, uIA1 e J(D)tA]. Let R+= {<+Rj, Normalization Theory," Proc. 4th Int'Z Conf. U[ATTRjI>(Ri,Rj are connected]. R+(D) performs on Very Large Databases (1978), 113-124. the insertion +Ri(D,U[ATTRi]) and affects every Rj connected to Fir while for all fl'cfi. R'(D) [BFHI C. Beeri, R. Fagin, and J.H. Howard, "A Com- does not perform the insertion. plete Axiomatization for Functional and Multivalued Dependencies ,II &OC. ACM SIGMOD (ii) If ATTRj-+ATTRiEF', every +Ri (D,ri) Conf. (Aug. 1977), 47-61. that has any effect must certainly affect R,; otherwise J(+Ri(D,ri)) would be inconsist&?t. [Bern] P.A. Bernstein, "Synthesizing Third Normal This establishes that NoAffect(+gi) 5 {Rj 6 Form Relations from Functional Dependencies," D(ATTRj~ATTRiBF"}. ACM Trans. on Database Sys., 4 (Dec. 1976), 277-298. TO prove inclusion in the opposite direction, suppose ATTRj 'ATTRiBF+. We will construct an [BGI P.A. Bernstein, and N. Goodman, "What Does insertion that does not affect Rj- By the Boyce-Codd Normal Form Do?" Tech. Rept. completeness property of FDs, there exists a con- 07-79, Center for Research in Computing sistent state U' in which ATTR' +ATTRi does Technology, Harvard University (May 1979). not hold. For this FD not to ho1 A , U' must con- tain distinct tuples u and u' such that [BMSU] C. Beeri, A. Mendelzon, Y. Sagiv, and J.D. U[ATTRj] =U'[ATTRj], yet U[ATTRi] fU'[ATTRiI. Let Ullman, "Equivalence of Relational Database U=U'-{u), let D=P(U), and consider the insertion Schemes," Proc. ACM Symp. on Theory of +l?i(D,U[ATTRiI)- This insertion has an effect and Computing (May 1979). P(U') performs it. Thus there exists a state that nonextraneously performs the insertion such that [Cod11 E.F. Codd, "Further Normalization of the D LEMMA 2. (i) Affect(-Ri) =Affect(+Ri) [Cod21 E.F. Codd, "Recent Investigations in Rela- tional Data Base Systems," Proc. IFIP 74 (ii) NoAffect(-Ri) =NoAffect(+R+). (1974), 1017-1021. Proof. Observe that -R. (+R.(D,r.),ri) =D [Date] C.J. Date, Introduction to Database Systems, whenever +Ei(D,ri) #D. Thisies&blishes Addison-Wesley, Reading, MA (1977). Affect(-Ri) ?Affect(+Ri) and NoAffect(-58) 2, Affect(-Ri). DBI U, Dayal, and P.A. Bernstein, "The Updata- bility of Relational Views," PPOC. 4th Ii'lt'Z To prove inclusion in the opposite direction, Conf. on Very Large Databases (1978). let -Ri (D,ri) be any deletion that has an effect, and let D' =-R, (3 r.) Observe that for all [DC] C. Delobel, and R.C. Casey, "Decomposition uEJ(D) -J(D'), -i P;JiD') U {U)) is a meaning of of a Data Base and the Theory of Boolean +Ri(D,ri). Hence any Rj affected by the deletion Switching Functions," IBM J. Of Res. and is also affected by one of these meanings of the Dev., 17, 5 (Sept. 1972), 370-386. insertion, establishing Affect(-l?i) C_ Affect(+Ri). Conversely, any Ej not affected by the deletion 258 [Fag11 R. Fagin, "Multivalued Dependencies and a New Normal Form for Relational Databases," ACM Trans. on Database Systems, 2, 3 (Sept. 1977), 262-278. Fag21 R. Fagin, "Normal Forms and Relational Database Operators," *OC. ACM SIGMOD conf. (May 1979). [HLYI P. Honeyman, R.E. Ladner, and M. Yannakakis, "Testing the Universal Instance Assumption," Inf. Proc. Letters, 10, 1 (Feb. 12, 1980), 14-19. [Marl J. Martin, Computer Database Organization, Prentice Hall, Englewood Cliffs, NJ (1975). [Ris] J. Rissanen, "Independent Components of Relations, II ACM Trans. on Database Sys., 2, 4 (Dec. 1977), 317-325. 259 I