Referential Integrity Revisited: an Object-Oriented Perspective
Total Page:16
File Type:pdf, Size:1020Kb
REFERENTIAL INTEGRITY REVISITED~AN OBJECT-ORIENTED PERSPECTIVE* VICTOR M. ~KOWITZ Lawrence Berkeley Laboratory Computer Science Research and Development Department 1 Cyclotron Road, Berkeley, CA 94720 ABSTRACT represented by relation tuples, while object connec- Referential integrity underlies the relational tions are represented by references between tuples. representation of objeceoriented structures. The con- Such references are enforced in relational databases cept of referential integrity in relational databases is by referential integrity constraints [2]. There are two hindered by the confusion surrounding both the con- approaches to the specification of these constraints in cept itself and its implementation by relational data- relational databases. In the Universal Relation (UR) base management systems (RDBMS). Most of this approach [ll], referential integrity constraints are confusion is caused by the diversity of relational implied by associating different relations with com- representations for object-oriented structures. We mon attributes; the referential integrity meaning of examine the relationship between these representa- relations sharing common attributes is defined by a tions and the structure of referential integrity con- set of rules, called UR assumptions. The UR assump- straints, and show that the controversial structures tions make the description of object structures either do not occur or can be avoided in the rela- extremely difficult and not entirely reliable, mainly tional representations of object-oriented structures. becaise they require excessively complex attribute Referential integrity is not supported uniformly name assignments. by RDBMS products. Thus, referential integrity con- A different approach to the specification of straints can be specified in some RDBMSs non- referential integrity constraints is ‘to associate expli- procedurally (declaratively) , while in other RDBMSs citly a foreign-key in one relation with the primary- they must be specified procedurally. Moreover, some key of another relation [5]. Such constraints are a RDBMSs do not allow the specification of certain special case of inclusion dependencies 141. Every referential integrity constraints. We discuss the explicit referential integrity constraint is usually referential integrity capabilities provided by three associated with a referential integrity rule which representative RDBMSs, DB2, SYBASE, and INGRES. defines the behavior of the relations involved in the constraint under insertion, deletion, and update. I. INTRODUCTION Explicit referential integrity constraints are easier to The database design process involves specifying the specify and understand than the implicit referential objects and object connections relevant to the data- integrity constraints of the UR approach, because base application. In relational databases objects are they are closer to the way users describe object struc- tures. However, the referential integrity concept is * This work was supported by the Office of Health and En- still surrounded by confusion, as illustrated by the vironmental Research Program and the Applied Mathematical Sci- successive modifications of the original definition of ences Research Program, of the Office of Energy Research, U.S. [2] (e.g. see [3], [5]). Thus, certain referential Department of Energy, under Contract DE-AC03-76SF00098. integrity structures have unclear semantics, and therefore must be ‘treated with caution’ [6]. Obvi- ously, a non technical user cannot be expected to manage the complexities of such a task. In this paper we examine the characteristics of referential integrity constraints involved in the rela- tional representation of object-oriented structures. We show that the controversial structures discussed in [6] can be avoided without any effect on the capa- bility of relational schemas to represent object struc- Proceedings of the 16th VLDB Conference tures. We explore the characteristics of referential Brisbane. Australia 1990 578 integrity constraints in the context of relational sche- rules in INGRES) for specifying such constraints pro- mas representing Eztended Entity-Relationship (EER) cedurally. We discuss in this paper the referential object structures. We have selected the EER model integrity capabilities of DB2, SYBASE, and INGRES. because of its widespread use in designing relational We show that some of the restrictions imposed by databases [19]. However, our results apply to any DB2 are too stringent. We compare the SYBASE and object-oriented data model that supports generaliza- INGRES mechanisms for specifying referential tion and aggregation [g]. integrity constraints, and discuss their limitations. We have shown in [15] that an EER schema The paper is organized as follows. In section 2 can be represented by a Boyce-Codd Normal Form we briefly review the relational and EER concepts (BCNF’) relational schema of the form (R, F IJ I ), used in this paper, and the relational representation where R denotes a set of relation-schemes, and F and of EER schemas. In section 3 we examine two contr- I denote sets of key and inclusion dependencies, oversial foreign-key structures in the context of rela- respectively. Informally, relation-schemes represent tional schemas representing EER object structures. object-sets, and inclusion dependencies represent the The semantics of referential integrity rules in the existence dependencies inherent to object connec- context of relational schemes representing EER sche- tions. The inclusion dependencies in these schemas mas, is explored in section 4. In section 5 we exam- are key-based, that is, are referential integrity con- ine the effect of merging relations on the structure of straints, and relation-schemes correspond to either referential integrity constraints. The referential unique or multiple (embedded) object-sets. In [12] we integrity capabilities of DB2, SYBASE, and INGRES have shown that the mapping process involved in are examined in section 6. We conclude with a sum- representing EER schemas by relational schemas, can mary. The procedures for mapping EER schemas be expressed as the composition of (i) the mapping of into relational schemas, and for merging relation- EER schemas into a relational schemas, where every schemes in relational schemas are given in the appen- relation-scheme corresponds to a unique EER object- dix. set, followed by (ii) relation-scheme mergings, that result in relation-schemes representing multiple II. PRELIMINARY DEFINITIONS object-sets. In this section we review briefly the relational and We examine the structure of referential Extended Entity-Relationship (EER) concepts used in integrity constraints involved in relational schemas this paper, and the representation of EER object whose relation-schemes represent unique. EER structures using relational constructs. object-sets. We show that in such schemas referential integrity constraints can be associated with one out 2.1 Relational Concepts. of four possible referential integrity rules. Next, we We use letters from the beginning of the alpha- examine the effect of merging on the structure of bet to denote attributes and letters from the end of referential integrity constraints, and show that merg- the alphabet to denote sets of attributes. We denote ing entails associating some referential integrity con- by t a tuple and by t[ w] the sub-tuple of t straints with an additional, fifth, referential integrity corresponding to the attributes of W. rule. In contrast, seven referential integrity rules are A relational schema is a pair (R,A), where R is defined in [3] and [5]; we show that the two extra a set of relation-schemes and A is a set of dependen- rules are not needed for representing EER object cies over R. We consider relational schemas with structures. A = F u I, where F and I denote sets of functional Currently, several relational database manage- and inclusion dependencies, respectively. A relation- ment systems (RDBMS), notably IBM’s DB2, SYBASE, scheme is a named set of attributes, R,(Xi), where Ri and INGRES, provide support for referential is the relation-scheme name and Xi denotes the set of integrity. Interestingly, these systems provide attributes. Every attribute is assigned a domain, and different capabilities for specifying referential every relation-scheme, Ri(Xi), is assigned a relation integrity constraints. Thus, DB2 [7] allows non- (value), ri. Th e p ro 3‘ec t iota of such a relation, r;, on procedural (declarative) specifications of referential a subset of GUI, W, is denoted zw(ri), and is equal to integrity constraints, but with certain restrictions on {t[W] 1 t E ri>. ‘I’ wo attributes are said to be com- the allowed structures. Conversely, SYBASE [18] patible if they are associated with the same domain, and INGRES [8] do not support declarative and attribute sets X and Y are said to be compatible specifications of referential integrity constraints, and iff there exists .a one-to-one correspondence of compa- provide instead mechanisms (triggers in SYBASE and tible attributes between X and Y. 579 Let Ri(Xi) b e a relation-scheme associated An EER schema can be represented as an acy- with relation ri. A junctional dependency over Ri is clic directed graph, called EER diagram: entity-sets, a statement of the form Ri: Y-Z, where Y and Z relationship-sets, and attributes, are represented by are subsets of Xi; Ri: Y-+Z is satisfied by ri iff for rectangle, diamond, and ellipse shaped vertices, any two tuples of ri, t and t ‘, t[ Y] =