Semantics of Time-Varying Attributes and Their Use
Semantics of Time-Varying Attributes and Their Use for Temporal Database Design

Christian S. Jensen, Richard T. Snodgrass

1 Department of Mathematics and Computer Science, Aalborg University, Fredrik Bajers Vej E, DK Aalborg, DENMARK, csj@iesd.auc.dk
2 Department of Computer Science, University of Arizona, Tucson, AZ, USA, rts@cs.arizona.edu

Abstract. Based on a systematic study of the semantics of temporal attributes of entities, this paper provides new guidelines for the design of temporal relational databases. The notions of observation and update patterns of an attribute capture when the attribute changes value and when the changes are recorded in the database. A lifespan describes when an attribute has a value. And derivation functions describe how the values of an attribute for all times within its lifespan are computed from stored values. The implications for temporal database design of the semantics that may be captured using these concepts are formulated as schema decomposition rules.

Introduction

Designing appropriate database schemas is crucial to the effective use of relational database technology, and an extensive theory has been developed that specifies what is a good database schema and how to go about designing such a schema. The relation structures provided by temporal data models, e.g., the recent TSQL2 model, provide built-in support for representing the temporal aspects of data. With such new relation structures, the existing theory for relational database design no longer applies. Thus, to make effective use of temporal database technology, a new theory for temporal database design must be developed.

We have previously extended and generalized conventional normalization concepts to temporal databases. But the resulting concepts are still limited in scope and do not fully account for the time-varying nature of data. Thus, additional concepts are needed in order to fully capture and exploit the time-varying nature of data during database design. This paper proposes concepts that capture the time-related semantics of attributes and uses these as a foundation for developing guidelines for the design of temporal databases. The properties of time-varying attributes are captured by describing their lifespans, their time patterns, and their derivation functions. Design rules subsequently allow the database designer to use the properties for view, logical, and physical schema design.

The paper is structured as follows. Section 2 first reviews the temporal data model used in the paper. It then argues that the properties of attributes are relative to the objects they describe and then introduces surrogates for representing real-world objects in the model. The following subsections address in turn different aspects of time-varying attributes, namely, lifespans, time patterns, and derivation functions. Section 3 is devoted to the implications of the attribute semantics for logical schema, physical schema, and view design. The final section summarizes and points to opportunities for further research.

Capturing the Semantics of Time-Varying Attributes

This section provides concepts that allow the database designer to capture more precisely and concisely than hitherto the time-varying nature of attributes in temporal relations. The temporal data model employed in the paper is first described. Then a suite of concepts for capturing the temporal semantics of attributes are introduced.

A Conceptual Data Model

We describe briefly the relation structures of the Bitemporal Conceptual Data Model (BCDM), described more completely elsewhere, which is the data model of TSQL2 and which is used in this paper.

We adopt a linear, discrete, bounded model of time, with a time line composed of chronons. The schema of a bitemporal conceptual relation, $R$, consists of an arbitrary number, e.g., $n$, of explicit attributes and an implicit timestamp attribute, $T$, defined on the domain of sets of bitemporal chronons. A bitemporal chronon $c^b = (c^t, c^v)$ is an ordered pair of a transaction-time chronon $c^t$ and a valid-time chronon $c^v$. A tuple $x = (a_1, \ldots, a_n \mid t^b)$ in a relation instance $r$ of schema $R$ thus consists of $n$ attribute values associated with a bitemporal timestamp value.

An arbitrary subset of the domain of valid times is associated with each tuple, meaning that the information recorded by the tuple is true in the modeled reality during each valid-time chronon in the subset. Each individual valid-time chronon of a single tuple has associated a subset of the domain of transaction times, meaning that the information, valid during the particular chronon, is current in the relation during each of the transaction-time chronons in the subset. Any subset of transaction times less than the current time may be associated with a valid time. Notice that while the definition of a bitemporal chronon is symmetric, this explanation is asymmetric, reflecting the different semantics of transaction and valid time.

We have thus seen that a tuple has associated a set of so-called bitemporal chronons in the two-dimensional space spanned by transaction time and valid time. Such a set is termed a bitemporal element. We assume that a domain of surrogate values is available for representing real-world objects in the database.

Example. Consider the relation instance, empDep, shown next.

EName   Dept   T
Bob     Ship   { ... }
Bob     Load   { ... }

The relation shows the employment information for an employee, Bob, and two departments, Ship and Load, contained in two tuples. In the timestamps, we assume that the chronons correspond to days and that the period of interest is some given month in a given year, e.g., July. Throughout, we use integers as timestamp components. The reader may informally think of these integers as dates, e.g., an integer in a timestamp represents the corresponding date in July. The current time is assumed to be [...].

Valid-time relations and transaction-time relations are special cases of bitemporal relations that support only valid time and transaction time, respectively. For clarity, we use the term snapshot relation for a conventional relation, which supports neither valid time nor transaction time.

This completes the description of the objects in the bitemporal conceptual data model: relations of tuples timestamped with temporal elements. An associated algebra and user-level query language are defined elsewhere.

Using Surrogates

An attribute is seen in the context of a particular real-world entity. Thus, when we talk about a property, e.g., the frequency of change, of an attribute, that property is only meaningful when the attribute is associated with a particular entity. As an example, the frequency of change of a salary attribute with respect to a specific employee in a company may reasonably be expected to be relatively regular, and there will only be at most one salary for the employee at each point in time. In contrast, if the salary is with respect to a department, a significantly different pattern of change may be expected. There will generally be many salaries associated with a department at a single point in time. Hence, it is essential to identify the reference object when discussing the semantics of an attribute.

We employ surrogates for representing real-world entities in the database. In this regard, we follow the approach adopted in, e.g., the TEER model by Elmasri. Surrogates do not vary over time in the sense that two entities identified by identical surrogates are the same entity, and two entities identified by different surrogates are different entities. We assume the presence of surrogate attributes throughout logical design. At the conclusion of logical design, surrogate attributes may be either retained, replaced by regular key attributes, or eliminated.

Lifespans of Individual Time-Varying Attributes

In database design, one is interested in the interactions among the attributes of the relation schemas that make up the database. Here, we provide a basis for relating the lifespans of attributes. Intuitively, the lifespan of an attribute for a specific object is all the times
when the object has a value, distinct from $\bot_i$ (inapplicable null), for the attribute. Note that lifespans concern valid time, i.e., are about the times when there exist some valid values.

To more precisely define lifespans, we first define an algebraic selection operator on a temporal relation. Define a relation schema $R = (A_1, \ldots, A_n \mid T)$, and let $r$ be an instance of this schema. Let $P$ be a predicate defined on the $A_i$. The selection $P$ on $r$, $\sigma^B_P(r)$, is defined by

$\sigma^B_P(r) = \{ z \mid z \in r \wedge P(z[A_1, \ldots, A_n]) \}$.

It follows that $\sigma^B_P(r)$ simply performs the familiar snapshot selection, with the addition that each selected tuple carries along its timestamp $T$.

Next, we define an auxiliary function vte that takes as argument a valid-time relation $r$ and returns the valid-time element defined by

$vte(r) = \{ c^v \mid \exists s \, (s \in r \wedge c^v \in s[T]) \}$.

The result valid-time element is thus the union of all valid timestamps of the tuples in an argument valid-time relation.

Definition. Let a relation schema $R = (S, A_1, \ldots, A_n \mid T)$ be given, where $S$ is surrogate valued, and let $r$ be an instance of $R$. The lifespan for an attribute $A_i$, $1 \le i \le n$, with respect to a value $s$ of $S$ in $r$ is denoted $ls(r, A_i, s)$ and is defined by

$ls(r, A_i, s) = vte(\sigma^B_{S = s \wedge A_i \neq \bot_i}(r))$.

Lifespans are important because attributes are guaranteed to not have any inapplicable null value during their lifespans. Assume that we are given a relation schema empDep = (EmpS, EName, Dept) that records the names and departments of employees (represented by the surrogate attribute EmpS). If employees always have a name