Temp oral Semantic Assumptions and Their Use in Database Query
Evaluation
Claudio Bettini X. Sean Wang Sushil Ja jo dia
Technical Rep ort: ISSE-TR-95-105
Department of Information and Software Systems Engineering
George Mason University
June 13, 1995
Abstract
Temp oral data explicitly stored in a temp oral database are often asso ciated with certain seman-
tic assumptions. Each assumption can b e viewed as a way of deriving implicit information from the
explicitly stored data. Rather than leaving the task of deriving p ossibly in nite implicit data to
application programs, as is the case currently, it is desirable that this b e handled by the database
management systems. Toachieve this, this pap er formalizes and studies twotyp es of semantic as-
sumptions: p oint-based and interval-based. The p oint-based assumptions include those assumptions
that use interp olation metho ds, while the interval-based assumptions include those that involve dif-
ferent temp oral typ es time granularities. In order to incorp orate semantic assumptions into query
evaluation, this pap er intro duces a translation pro cedure that converts a user query into a system
query such that the answer of this system query over the explicit data is the same as that of the user
query over the explicit and the implicit data. The pap er also investigates the niteness safety of
user queries and system queries.
1 Intro duction
A prominent feature of temp oral information is its richness in semantics asso ciated with its temp oral
domain. When querying a temp oral database, a user naturally assumes some \usual" semantics on
stored temp oral data. For instance, she exp ects that her bank account balance p ersists, i.e., the balance
stays the same unless a transaction|dep osit, withdrawal or accrual of interest|is p erformed. Therefore,
if she wishes to nd the balance at a particular time and no balance amount is stored for that time, she
lo oks for the balance of the last transaction that was p erformed b efore the time in question. Semantic
assumptions may also involve di erent temp oral typ es time granularities. For example, when the user
A preliminary version of this pap er app eared in the 1995 Pro ceedings of the ACM SIGMOD International Conference
on Management of data. This work was supp orted in part by an ARPA grant, administered by the Oce of Naval Research
under grantnumb er N0014-92-J-4038. Work of Bettini was p erformed in part while visiting George Mason University.Work
of Wang was also supp orted by the NSF grant IRI-9409769. Bettini is with the Dipartimento di Scienze dell'Informazione,
University of Milano, 20135 Milano, Italy, and Wang and Ja jo dia are with the Department of Information and Software
Systems Engineering, George Mason University,Fairfax, VA 22030{4444. Emails: [email protected] and fxywang,
asks for the account holder of a certain account in a particular month and account holders are stored
in terms of days, she naturally assumes that the answer is someone who is the account holder of that
account on all the days within that month.
Researchers have long realized such richness of semantics in temp oral data [CT85, SS87, Tan87] and
provided various op erators in di erent query languages for the users to co de the semantic assumptions
into queries [SS87,Tan87]. For the aforementioned balance query, the user would use a \last" or
similar op erator in her query to retrieve the appropriate balance.
We argue, however, that temp oral semantics should b e an integral part of temp oral databases
and users should not b e burdened with having to incorp orate these semantics in their queries. In the
account balance example, the user should b e able to query the database directly ab out the balance at
any particular time, instead of having to co de the semantics of the balance into the query. The system
should b e able to answer the query appropriately according to the semantics. To provide such an ability,
the system has to
include a precise formalization of the temp oral semantics, and
evaluate queries according to these semantics.
In this pap er, we provide a framework to incorp orate temp oral semantics into temp oral databases and
investigate a metho d to evaluate queries with semantics.
Consider the bank account example in more detail. Assume the temp oral relation ACCOUNTS records
accountnumb ers AcctNo, account holder AccHol, account balance Balance, annual interest rate
AnIntRate, and the asso ciated time Time, i.e., the time when the rep orted values b ecome true. We
can assume that new tuples are added to this relation when an event o ccurs such as the op ening of
an account, a dep osit, a withdrawal, or a change in interest rates. The values of Time are timestamps
consisting of the date month/day/year concatenated with the time of the day up to seconds. Here
and in the rest of the pap er timestamps represent valid time; extensions to include transaction time and
other temp oral dimensions are not considered.
An instance of the ACCOUNTS relation is shown in Figure 1.
AcctNo AccHol Balance AnIntRate Time
1001 J.Smith 1000 3.00 3/3/93:09:01:00
1001 J.Smith 3500 4.00 3/4/93:10:01:55
1500 A.Brady 2000 3.00 3/4/93:11:00:00
2034 T.Ford 500 2.50 3/4/93:12:19:03
1500 A.Brady 1500 3.00 3/4/93:18:00:00
1001 J.Smith 4000 4.00 7/3/93:09:00:00
Figure 1: An instance of the relation ACCOUNTS
Since we store new tuples only when some of the attribute values change, the relation do es not
contain an explicit tuple for every second of eachday for an account. However, if asked ab out the
balance of Mr. Smith's account at no on of March 4, 1993, we could answer without hesitation that the
balance was $3,500. This is b ecause we assume that nothing has changed for that account since the last
transaction b efore that time. Here wesay that the attribute Balance satis es a semantic assumption
called persistence for each account. Another example of p ersistence is given by the attribute AccHol
that is intuitively p ersistent for each account. 2
Other assumptions apply when di erent temp oral typ es or granularities are considered. For
example, the attribute AnIntRate can b e classi ed as liquid, meaning that it satis es the following
prop erties:
It is downward-hereditary: If the value of AnIntRate is 3.00 for a month, we can assume that its
value is 3.00 for eachday of that month.
It is upward-hereditary: If the value of AnIntRate is 3.00 for eachday of a month, we can assume
that its value is 3.00 for that month.
The terms \liquid" and \downward/upward hereditary" are b orrowed from the temp oral reasoning
community [Sho87].
Note that not all attributes are p ersistent and/or liquid. For example, consider a relation with
the attributes hAcctNo, Amount, secondi that stores the time and the amount of each dep osit or
withdrawal to a certain account. The Amount attribute is obviously neither p ersistent nor liquid.
The aforementioned semantic assumptions, namely p ersistence and liquidity, are among the most
common ones that have b een identi ed by the temp oral reasoning community. There are other semantic
assumptions that arise quite naturally in databases. For example, if a database stores annual salaries
of the employees, the monthly salary can easily b e determined by dividing the annual salary by 12.
Contribution of the pap er
Recognizing and formalizing that a group of attributes satis es some semantic assumption is very useful
in answering queries ab out values of attributes at times of same or di erent temp oral typ es for which
avalue is not explicitly stored. One of the contributions of this pap er is to formalize the notion of
semantic assumptions for the purp ose of query evaluation.
In this pap er, each semantic assumption is viewed as a function for deriving implicit information
from explicit information by applying one or more information generation pro cedures. Such generation
pro cedures are formally captured by our notion of assumption metho ds. Intuitively, an assumption
metho d derives implicit information using some sp eci c rule. For example, \p ersistence" can b e regarded
as one such rule that uses the latest stored value as the currentvalue for an attribute. In Figure 1,
by applying the p ersistence metho d on the Balance attribute, we obtain $1,000 to b e the balance at
the time 3/3/93:09:01:01, 3/3/93:09:01:02, 3/3/93:09:01:03, 3/3/93:09:01:04, and so on. The function
for a semantic assumption that uses p ossibly di erent metho ds to generate information for di erent
attributes, is then a \comp osition" of the rules of the appropriate assumption metho ds.
DB the database that contains, in addition to DB , the We denote by DB the database and by
data generated by applying all semantic assumptions. A user query on a database DB is viewed as a
query on DB . If the query asks for the balance of account 1001 at no on of March 4, 1993, it retrieves
values from DB , and gives $3,500 as the answer.
However, it is usually impractical for the database to store and manage DB . Indeed, consider the
attribute AnIntRate in Figure 1. Since AnIntRate is liquid, it is p ossible to derivevalues for AnIntRate
for each calendar unit. Not only an enormous e ort is required to manage all these derived values, but
this e ort maybewasted since users may not b e interested in knowing the interest rate for all p ossible
calendar units. A similar problem exists for the Balance attribute.
Our solution is to translate the user query into a query that incorp orates the semantic assumptions.
As shown in Figure 2, a user query Q is viewed as a query on DB . Instead of using the semantic
0
DB ,we use them to translate the query Q into another query Q such that the assumptions to build 3
0
new query evaluated on the stored database Q [DB ] is the same as the answer of the user query Q
DB Q[DB ]. In formula, we require that over
DB
H
H
H
H
H
H
0
Q : system query
H
H
H
H
Semantic
H
H
assumptions
H
H
H
H
H
H
H
H
H
? - Hj
DB Q: user query Query result
0 0
Figure 2: Do es there exist a Q such that Q [DB ]=Q[ DB ]?
0
Q [DB ]=Q[ DB ]:
0
Obviously, whether there always exists a Q for each Q largely dep ends on the languages used. In this
F
pap er, we use a single language MQL for the semantic assumptions, for user queries Q,aswell as for
F
0
system queries Q .MQL is an extension of the query language intro duced in [WJS93] as a general query
F
language on temp oral databases. We show that MQL is a reasonably p owerful language for sp ecifying
0
semantic assumptions. Furthermore, we show that the aforementioned system query Q always exists if
F F
all semantic assumptions are sp eci ed byMQL and all user queries are in MQL .We provide automatic
0
derivation of Q .
Another contribution of the pap er is in its analysis of safety issues of the temp oral database query
F
language MQL .Persistence, and other semantic assumptions, naturally gives rise to in nite information.
For example, the p ersistence of the Balance attribute dictates that the relation in Figure 1 generates
in nitely many tuples: For account 1001, there will b e one tuple for each second after 7/3/93:09:00:00.
There are an in nite numb er of seconds. Such in niteness, called eventual uniformity in this pap er, has
b een accommo dated by researchers by using sp eci c data mo dels e.g., the tuple timestamp [1; 1]of
[Sno84]. However, it is not analyzed formally in a general setting. In this pap er, we study the safety
requirement of query languages that yields only eventually uniform results and other general safety
criteria.
In summary, the contribution of the pap er is three-fold: 1 Semantic assumptions are formalized
for the purp ose of database query evaluation; 2 A paradigm for query evaluation is intro duced to
incorp orate semantic assumptions; and 3 Safety issues are analyzed in this more general setting.
Related research
As mentioned earlier, some of our terms used for semantic assumptions are b orrowed from the temp oral
reasoning community. In particular, the term liquidity is from [Sho87], while the term persistence has
b een extensively used in temp oral reasoning for planning e.g., [AKPT91]. The notion of p ersistence
has also b een used in the context of temp oral databases in [DM87] to make predictions ab out unknown 4
facts in the database; however, the emphasis of [DM87] is on consistency maintenance of a logical
database containing uncertain temp oral information ab out events. Logical databases and uncertain
information are b eyond the scop e of this pap er. Instead, we consider general classes of assumptions
including p ersistence and assumptions involving multiple granularities. Our fo cus is on relational-like
databases and query evaluation on these databases.
Various semantic assumptions in temp oral database settings were p erhaps recognized rst by Clif-
ford and Warren [CW83]. The earliest systematic study was however p erformed by Segev and Shoshani
[SS87]. They recognized various prop erties of time sequences, such as \stepwise constant" and \contin-
uous", and provided a numb er of functions to b e used in user query languages to accommo date these
prop erties. A larger set of sophisticated op erators for the same purp ose were prop osed byTansel in
[Tan87]. By using these op erators, many semantic assumptions can b e co ded into user queries. These
works di er in two resp ects from the current one. First, in this pap er, we formalize the notion of seman-
tic assumption, while in these earlier studies, no general de nition is given. Second, we assume that the
user queries the database with the assumptions built in, instead of having to co de them into queries.
Thus, in this study, the query language for the user is a simple extension of the relational calculus the
extension is for incorp orating the temp oral dimension, not for semantic assumptions.
Other researchers [CT85, WJL91, CI94] also recognized the imp ortance of semantic assumptions
in temp oral databases. Cli ord [CT85] p ointed out the use of interp olation in temp oral databases;
however, the query evaluation was not formalized. Cli ord and Isakowitz [CI94] dealt with formalizing
the semantics of variables that many temp oral data mo dels employ to denote various intuitive semantic
assumptions. The work clari ed manyvague, although intuitive, notions. However, [CI94] did not
address how user queries are evaluated on databases with suchvariables.
Regarding the safety of temp oral queries, the problem is sometimes avoided by assuming a nite time
domain and the safety problem reduces to that of standard relational query languages. A recent pap er
by Cli ord, Croker and Tuzhilin [CCT94] mentioned a syntactic safety requirement based on [Ull88 ] for
a logic based temp oral query language. However, it is not clear what prop erties this syntactic restriction
leads to.
Organization
The rest of the pap er is divided into 8 sections. Section 2 de nes our temp oral data mo del. In Section 3,
weintro duce the notion of a general class of semantic assumption: p oint-based semantic assumptions.
Persistence is a sp ecial case of the p oint-based assumptions. Several imp ortant prop erties of p oint-
based assumptions, and the p ersistence assumption in particular, are investigated. Query evaluation
with resp ect to p oint-based assumptions is discussed in Section 4. Interval-based assumptions that
include the liquidity assumption are formalized in Section 5 and query evaluation with resp ect to such
assumptions is studied in Section 6. In Section 7, the ab ovetwo kinds of assumptions are combined in
evaluating queries. Safety issues are discussed in Section 8. Finally, the pap er is concluded with some
remarks on the results of the pap er and on future research directions in Section 9.
2 Data Mo del
The data mo del used in this pap er is based on that intro duced in [WJS93, WBBJ94] as a uni ed interface
for accessing di erent underlying temp oral information systems. It is b elieved that the concepts and
the results of this pap er are readily translated to other temp oral data mo dels. 5
2.1 Temp oral typ es
We start with de ning temp oral typ es that mo del typical and atypical calendar units. We assume
there is an underlying notion of absolute time, represented by the set N of all p ositiveintegers.
De nition Let I b e the set of all intervals on N , i.e., I = f[i; j ] j i; j 2N and i j g[f[i; 1] j i 2
N N
1
Ng. A temporal type is a mapping from the set of the p ositiveintegers the time ticks to the set
I [ f;g i.e., all intervals on N plus the empty set such that for each p ositiveinteger i, all following
N
conditions are satis ed:
1 if i= [k; l] and i +1= [m; n], then m = l +1.
2 i= ; implies i +1= ;.
3 there exists j such that j = [k; l] with k i l .
Condition 1 states that the mapping must b e monotonic and that the intervals corresp onding to
consecutive ticks should not have a gap b etween them. Condition 2 disallows a temp oral typ e to map
a certain time tick to the empty set unless it maps all subsequent time ticks to the empty set. Condition
3 requires that each p ositiveinteger must b e contained in the interval corresp onding to a tickofthe
temp oral typ e. A tick i of typ e is said to b e empty if i= ;. One particular consequence of the
ab ove three conditions is that the last non-empty tick if it exists must b e mapp ed to an interval of
the form [i; 1].
Typical calendar units suchasday, month, week and year can b e de ned as temp oral typ es. As
an example, supp ose the underlying time is measured in terms of seconds. Then the calendar unit day
assuming it starts on the rst day of 1900 is a mapping such that day1 is the set of all the seconds
that comprise the rst day of 1990, and day2 maps to all the seconds of the second day of 1990, and
so on.
There is a natural \ ner-than" relation among temp oral typ es. The temp oral typ e is said to b e
ner than the temp oral typ e if for each tickof, the corresp onding time interval is \entirely covered"
by the time interval of some tickof .Thus, for example, day is ner than week and month is ner than
year.Formally,wehave:
De nition Let and b e temp oral typ es. Then is said to b e ner than , denoted ,if
1 2 1 2 1 2
for each i, there exists j such that i j .
1 2
It is easily seen that is a partial order. However, it is not a total order since, for example, week
and month are incomparable i.e., week is not ner than month, and month is not ner than week. In
fact, the set of all temp oral typ es is a complete lattice with resp ect to the ner-than relation [WJS93].
Another imp ortant relation regarding temp oral typ es involves time ticks. For example, wewould
liketosay that a particular month is within a particular year. For this purp ose, we assume there is a
binary interpreted predicate IntSec for each pair of temp oral typ es and :
;
De nition For temp oral typ es and , let IntSec b e the binary predicate on p ositiveintegers
;
such that IntSec i; j is TRUE if i \ j 6= ;, and IntSec i; j is FALSE otherwise.
; ;
1
An interval [i; j ][i; 1], resp. is viewed as the set of all integers k such that i k j k i, resp.. 6
In other words, IntSec i; j is TRUE i the intersection of the corresp onding absolute time
;
intervals of tick i of and tick j of is not empty.Thus, for example, IntSec i; j is true i
month;year
the month i falls within the year j .
2
It is imp ortant to emphasize that a realistic system can treat in nite temp oral typ es and their
corresp onding IntSec relations only if these in nite typ es have nite representations. Various p erio dical
descriptions, e.g., [KSW90, NS92], are p ossible but outside the scop e of this pap er.
2.2 Temp oral mo dule schemes and temp oral mo dules
The temp oral mo dules were presented in [WJS93, WBBJ94] as a uni ed interface for accessing di erent
underlying temp oral information systems. They can b e viewed as \abstract temp oral databases" [Cho94]
or a \conceptual data mo del" [JSS93]. It is b elieved that the concepts and the results of this pap er are
readily translated in terms of other temp oral data mo dels.
We assume that there is a set of attributes and a set of values called the data domain. Each nite
set R of attributes is called a relation scheme. A relation scheme R = fA ;:::;A g is usually written as
1 n
hA ;:::;A i.For relation scheme R, let TupR denote the set of all mappings, called tuples, from R
1 n
to the data domain. A tuple of relation scheme hA ;:::;A i is usually written as ha ;:::;a i, where
1 n 1 n
a = i for each1 i n.
i
De nition A temporal module scheme is a pair R; where R is a relation scheme and a temp oral
typ e. A temporal module is a triple R; ; ' where R; is a temp oral mo dule scheme and ' is a
TupR
mapping, called a time windowing function, from N to 2 such that 'i= ; if i= ; for each
S
i, and 'i is a nite set.
i1
Intuitively, the time windowing function ' in a temp oral mo dule R; ; ' gives the tuples facts
that hold at non-empty time tick i of temp oral typ e . This is a generalization of many temp oral mo dels
in the literature.
A temporal database is a nite set of temp oral mo dules. Throughout this pap er, we assume a xed
set of temp oral mo dule schemes, which is the database scheme. A temp oral database thus is only a
di erent instantiation of the windowing functions of the xed temp oral mo dule schemes. Furthermore,
each temp oral scheme is assigned a unique name. For each temp oral mo dule scheme M,we shall use R
M
and to denote the relation scheme and temp oral typ e, resp ectively.We also use ' to denote the
M M
\current" instantiation of the windowing function of scheme M.We use M as the name of the current
instantiation of M.For convenience, in temp oral mo dule examples, instead of the p ositiveintegers we will
sometimes use an equivalent domain. For instance, the set of expressions of the form 3/3/93:09:01:00
month/day/year:hour:minute:second will serveassuch a domain.
Example 1 We view the temp oral relation ACCOUNTS given in the intro duction as a temp oral mo dule
with ACCOUNTS, second, where ACCOUNTS = hAcctNo, AccHol, Balance, AnIntRatei, as its scheme.
The relation in Figure 1 corresp onds to the time windowing function ' de ned as follows:
'3/3/93:09:01:00=fh1001, J.Smith, 1000, 3.00ig
'3/4/93:10:01:55=fh1001, J.Smith, 3500, 4.00ig
'3/4/93:11:00:00=fh1500, A.Brady, 2000, 3.00ig
'3/4/93:12:19:03=fh2034, T.Ford, 500, 2.50ig
'3/4/93:18:00:00=fh1500, A.Brady, 1500, 3.00ig
'7/3/93:09:00:00=fh1001, J.Smith, 4000, 4.00ig
2
A temp oral typ e is said to b e in nite if it has an in nite numb er of non-empty ticks. 7
Note that the ab ove windowing function only re ects the explicit data stored in the relation. We will
use semantic assumptions in Section 3 to give implicit data.
2
Finally,we de ne the pro ject and select op erations on temp oral mo dules that will b e used in later
T 0
sections. Let M b e a temp oral mo dule and W R . Then M is the temp oral mo dule W; ;'
M M
W
0
such that ' i= 'i for each i 1. Note that is the standard pro ject op eration. Similarly,
W W
T 0 0
M is the temp oral mo dule R ; ;' such that ' i= 'i for each i 1. Note that is
M M F F
F
the standard select op eration.
F
2.3 The query language MQL
We sp ecify the query language on temp oral mo dules using a multi-sorted rst-order logic: One sort is
a generic non-temp oral data domain, while the other sorts are temp oral ones corresp onding to the
temp oral typ es that we consider. Each temp oral typ e we use gives rise to a distinct temp oral sort.
We shall use \temp oral typ e" and \temp oral sort" interchangeably when no confusion arises. Also,
we allow certain scalar functions [EMHJ93], suchas+; ;=; , on the data domain. These functions
are not intended as functions interpreted in the logic, but rather as external calls to the system with
F
xed interpretations. The formulas of this multi-sorted logic are called MQL formulas and the query
F
language we build using the logic is called MQL Multi-sorted Query Language with F unctions.
The range of each data variable is the data domain and the range of each temp oral sort variable is
F
N . The data terms of MQL formulas are of the form x or f x ;:::;x , where each x isavariable
0 1 k 0
name or a constant of the non-temp oral data sort, each x 1 i k is a term of the data sort,
i
and f is a function of arity k .Thus, functions can b e nested and are only for the data sort. Atomic
F
MQL formulas are of the following four typ es: a x = x where x and x are data terms; b
1 2 1 2
IntSec t ;t , where t and t are variables or constants resp ectively of sorts and ; c t ; 1 2 1 2 1 2 t = t , where t and t are variables or constants of the same temp oral sort; and d Mx ;:::;x ;t 1 2 1 2 1 n where M is a temp oral mo dule scheme name of the database with n b eing the arityof R , each x is M i F a non-temp oral variable name or constant and t a temp oral variable or constant. An MQL formula is formed by b o olean connectives and the existential and universal quanti ers in a usual way.Asa syntactic sugar, wechange the quanti cation of temp oral sorts to the form 9t : and 8t : , where t is of sort . This syntactic sugar allows us to tell the sorts of b ounded variables from the formula itself. The predicates IntSec are interpreted as in the previous subsection, < is interpreted as the integer order, and = is the standard equality. F query is of the form An MQL fx :::x ;t: j x ;:::;x ;tg 1 k 1 k where x :::x are variable names or constants of the non-temp oral sort, t is a temp oral variable of 1 k F typ e , and x ;:::;x ;tisanMQL formula whose only free variables are among x ;:::;x ;t. The 1 k 1 k formula can obviously contain temp oral variables of sorts di erent from , provided they are b ounded. F The following is an example MQL query on the ACCOUNTS temp oral mo dule: fx; t : month j9y; z; w 9s : second ACCOUNTSx; y; z; w; s ^ z 2000 ^ IntSec t; sg month;second This query asks for the months in which an account has a balance over $2,000 for at least one second. F An MQL query is correctly typed if for all subformula Mx ;:::;x ;t app earing in it, the temp oral 1 n sort of t is , namely the temp oral typ e of M. The ab ove example query is correctly typ ed since the M F temp oral typ e of the temp oral mo dule ACCOUNTS is second. With each correctly typ ed MQL query Q = fx ;:::;x ;t: j g,we asso ciate an answer, denoted Q[DB ], in the form of a temp oral mo dule 1 k 8 R; ; ', where R is a relation scheme of arity k and ' is given as follows: ha ;:::;a i is in 'mif 1 k and only if is TRUE when the free variables x , :::, x and t are substituted with the constants 1 k a , :::, a and m. Note that the truth value of the atomic formula My ;:::;y ;sis TRUE i 1 k 1 n hy ;:::;y i is in ' s. If we consider the aforementioned example query on a DB containing the 1 n M temp oral mo dule ACCOUNTS of Example 1, its answer is the temp oral mo dule AcctNo; month;' with '3=93 = f1001; 1500g, '7=93 = f1001g, and 'i= ; for each i corresp onding to a month di erent from March and July of 1993. If a query is not correctly typ ed, there is no answer asso ciated with the query. In Section 6, we will showhow to answer incorrectly typ ed queries by using interval-based semantic assumptions. F F If an MQL formula only has a single temp oral sort, we call it a at-MQL formula. Alternatively, F F we view at-MQL as a 2-sorted fragmentofMQL , where all temp oral variables and temp oral constants are of the same temp oral typ e. Since there is a single typ e , the only predicate of the form IntSec is F IntSec , and IntSec is equivalent to =. Hence, in a at-MQL formula, we assume that IntSec ; ; F F F predicates do not app ear. A at-MQL query is an MQL query in which the formula is a at-MQL for- F mula. If a at-MQL query is correctly typ ed, all temp oral constants and variables app earing in the query have the same sort and wemay omit the syntactic sugar on the quanti ers on temp oral variables and the typ e declaration of the free temp oral variable when their typ e is clear from the context. The F following is an example of at-MQL query. fx; t j9y; z; wACCOUNTSx; y; z; w; t ^ z<1000g Assuming this query is correctly typ ed, then the typ e of t is whichissecond. This query ACCOUNTS asks for the time when an account has a balance lower than $1,000. 3 Point-Based Assumptions We call p oint-based assumptions those semantic assumptions that can b e used to derive information at certain ticks of time based on the information explicitly given at di erent ticks of the same temp oral typ e. Such derivation can b e done in a varietyofways. One way is to assume that the values of certain attributes p ersist in time unless they are explicitly changed. Another way is to assume that a missing value is taken as the average of the last and next explicitly given values. Still another wayistotake the sum of the last three values, etc. In this section, we give a general notion of p oint-based assumptions that uses, in principle, anyinterp olation function to derive information from explicit values. In the next subsection, we illustrate p oint-based assumptions by using p ersistence as an example. The discussion of the p ersistence assumption also motivates the syntax and the semantics of general p oint-based assumptions. 3.1 An example: Persistence assumption A p oint-based assumption that is widely used in practice and in the literature is the p ersistence assump- per sis tion. With P Y , we denote the assumption of the attributes XY b eing p ersistent with resp ect X to the attributes X . This intuitively means that if wehave explicit values for X and Y at a certain tick of time, these values will p ersist in time until we nd a tick at whichwehave explicit values that are the same for the attributes X but di erent for Y . More formally, let M =R; ; ' b e a temp oral mo dule per sis and P Y b e a p ersistence assumption. Then, wesay that a tuple t on XY is derived by this X 00 00 assumption for tick i if there exists j 0 0 2 for all k , j information pro jected on XY in the original temp oral mo dule. Hence, a p ersistence assumption on a temp oral mo dule gives all the original information pro jected on XY plus all the derived information. Consider our running example of the ACCOUNTS temp oral mo dule. The assumption per sis P AccHol; Balance; AnIntRate AcctNo says that the values of these four attributes p ersist in time until a di erentvalue for one of the attributes AccHol, Balance,orAnIntRate with resp ect to the same AcctNo is found. Note that this is a reasonable assumption for the ACCOUNTS temp oral mo dule. In the temp oral database literature a notion similar to p ersistence is found when the value of a tuple is given for an interval [1;uc] where uc is a short hand for \until changed" [WJL91, CI94]. However, the notion of until changed is not well formalized. For example, it is not clear which attributes must actually change to b e quali ed as \changed". Besides having a clear semantics, p ersistence assumptions are more p owerful, since more than one p ersistence assumption can b e sp eci ed on the same set of attributes. In the next subsection, we formally de ne the syntax and the semantics of p oint-based assumptions, of which the p ersistence assumption is an example. 3.2 Syntax and semantics of p oint-based assumptions A p oint-based semantic assumption relies on the use of certain metho ds called assumption methods to derive implicit values from explicit ones. A metho d is used to deriveavalue of an attribute by meth \interp olating" values of the same attribute at di erent ticks of time. The expression P Y X denotes the assumption using metho d meth to derive implicit values of attributes XY with resp ect to attributes X .We assume there is a xed set of assumption metho ds. Wenow give the syntax of semantic assumptions. De nition Let X , Y , :::, Y b e pair-wise disjoint sets of attributes i.e., X \ Y = ; and Y \ Y = ; 1 n i i j meth meth 1 n for each i 6= j and meth , :::, meth assumption metho ds. Then P Y ::: Y is called a 1 n X n 1 point-based assumption. The attribute set X is called the base of the assumption. The requirement that the sets X , Y , :::, Y b e pair-wise disjoint in the ab ove de nition intuitively 1 n says that it is not p ossible to use di erent assumption metho ds for the same attribute. Before de ning the semantics of our semantic assumptions, we need to intro duce the semantics of assumption metho ds. De nition Semantics of assumption metho ds For each assumption metho d meth, there is an asso ciated mapping meth from two sets of attributes and a temp oral mo dule to a temp oral mo dule such that, for all attribute sets X and Y and temp oral mo dule M =R; ; ', with XY R, the following conditions are satis ed: 1. methX; Y ; M is a temp oral mo dule on XY; ; 0 0 2. if methX; Y ; M = XY;;' , then 'i ' i for each p ositiveinteger i; XY T T methX; Y ; M ; and M = 3. for each set Z of attributes with Z Y , methX; Z; XZ XZ T T methX; Y ; M . M = 4. for eachvalue x on X , methX; Y ; X =x X =x 10 The rst condition in the ab ove de nition sp eci es the scheme of the derived mo dules. The second condition requires that each derived mo dule contains all the original tuples. That is, a semantic metho d can only b e used to add tuples but not to change or delete original tuples. The third condition in the de nition requires that each metho d derives values for a certain attribute indep endently from values of other attributes. The nal condition requires that each metho d derives values for a certain attribute by considering only tuples in the original mo dule having the same values for X . Although the semantics of an assumption metho d is given as a mapping, it is usually describ ed by a syntactic expression in practice. Any formal language that is suciently expressive can b e used for F this purp ose. In Section 4 we give a formalism based on MQL to sp ecify the semantics of assumption metho ds. An example of a metho d that is di erent from p ersistence is avg. By applying avg to a set of numeric attributes Y with base attributes X ,we derive the implicit tuples on XY where the value for each A 2 Y is taken as the average b etween the values of A in the last and next explicitly stored tuples that have the same values for X . Example 2 Consider a temp oral mo dule RISKS that stores, for each account holder who received a loan from the bank, a measure of the p ossible risk for the bank: RISKS =hAccHol; EstRiski; day;' where '2=4=93 = fhJ.Smith, 2i; hT.Ford, 4ig, '6=4=93 = fhJ.Smith, 4i; hT.Ford, 3ig, and ' is empty for all other days. Then av g fAccHolg; fEstRiskg; RISKS is the temp oral mo dule that contains the original information as well as the following: For eachdaybetween 2/4/93 and 6/4/93, the EstRisk values for customers J.Smith and T.Ford were resp ectively 3 and 3:5. 2 Having de ned the semantics of assumption metho ds, we can now give the semantics of semantic assumptions. De nition Semantics of p oint-based assumptions meth meth 1 k For each p oint-based assumption P = P Y :::Y , there is an asso ciated partial map- X 1 k ping f from temp oral mo dules to temp oral mo dules such that for each temp oral mo dule M = P R; ; ' with XY :::Y R,ifmeth X; Y ;M= XY ;;' for each j =1;:::;k, then f M = 1 k j j j j P 0 0 XY :::Y ;;' where ' i= ' i ./ ::: ./' i for each i 1. 1 k 1 k meth meth k 1 :::Y combines via natural join the implicit in- Thus, the semantic assumption P Y X 1 k formation on attributes XY , :::, XY supplied by the metho ds meth , :::, meth , resp ectively. 1 k 1 k A consequence of the ab ove de nition is the fact that in general, the semantics asso ciated with meth meth meth P Y Y is not always the same as that of P Y Y . Indeed, consider the temp o- X X 1 2 1 2 ral mo dule M =hA; B ; C i;;', where '1 = fh1; 2; 3i; h1; 3; 4ig, and the two semantic assump- per sis per sis per sis tions P = P B C , P = P BC . Let f M = hA; B ; C i;;' and f M = 1 A 2 A P 1 P 1 2 hA; B ; C i;;' . It is easily seen that h1; 2; 4i is in ' 2, but not in ' 2. 2 1 2 We consider this as natural since it is likely that the reason that the two sets Y and Y of 1 2 attributes are written separately is b ecause their values are derived indep endently. However, the base X of a p oint-based assumption is often a temp oral key i.e., there do not exist two distinct tuples that b elong to the same tickyet have the same X values, in which case the semantics of meth meth meth and P Y Y are the same. Indeed, by condition 3 of assumption metho ds se- Y P Y X 1 2 X 2 1 mantics, the temp oral mo dules obtained by methX; Y ;M and methX; Y ;M are the pro jections of 1 2 0 M = methX; Y Y ;M. Since X is a temp oral key, the join of the two pro jections, i.e., the semantics 1 2 0 meth meth meth is the same mo dule M i.e., the semantics of P Y Y . Y of P Y X 1 2 X 2 1 11 3.3 Prop erties of temp oral mo dules with assumptions Given a temp oral mo dule M and a set of assumptions , wewould like to know if there is a new temp oral mo dule that mo dels all the information implied by M under . Clearly, all the tuples in f M , for P each P in , should b e present; however, the new mo dule should not include any extraneous tuples. We call such a mo dule minimal closure of M under . In order to formalize this notion, we rst de ne a subsumption relation among temp oral mo dules. 0 0 De nition Given two mo dules M =R; ; ' and M =R; ; ' , wesay that M is subsumedby 0 0 0 M , denoted M M , if for each p ositiveinteger i, 'i ' i. The subsumption is said to b e strict, 0 0 0 denoted M M ,if M M and M 6= M . De nition Let M b e a temp oral mo dule and P a p oint-based assumption with W as the set of all the attributes that app ear in P . A temp oral mo dule M is called a closureof M under P if M is on 1 1 T the same temp oral scheme of M , M M , and f M M . M is called a minimal closureof 1 P 1 1 W M under P if it is a closure of M under P and there do es not exist closure M of M under P such that 2 0 M M . A temp oral mo dule M is said to b e a closureof M under a set ofpoint-based assumptions 2 1 if it is a closure under each assumption P in , and is said to b e a minimal closureof M under , if 00 00 0 there do es not exist closure M of M under such that M M . Notice that the minimal closure of a certain mo dule under a set of p oint-based assumptions may not b e unique. For example, given M =AB C ; ; ' with '1 = a; b; c and 'i=; for each i> 1, and per sis the assumption P = P B , we can nd as many minimal closures of M under the assumption P , A as the numberofvalues that attribute C can take. Indeed, for eachvalue x that C can take, the mo dule M =AB C ; ; ' with ' 1 = a; b; c and ' i= a; b; x for each i> 1, is a minimal closure of M x x x x under P . Here we give a sucient condition for the existence of the unique minimal closure. Prop osition 1 Let b e a set of p oint-based assumptions and R; a temp oral scheme. If for each meth meth 1 k in ,wehave XY Y = R, then each temp oral mo dule on R; assumption P Y ::: Y 1 k X 1 k has a unique minimal closure under . Furthermore, if the minimal closure of a temp oral mo dule M under is R; ; ', and the minimal closure of M under each assumption P in isR; ; ' , then P S 'i= ' i. P P 2 The opp osite is not necessarily true even if wehave only p ersistence assumptions. Consider for example the temp oral mo dule M =AB C ; ; ' with '1 = 1; 1; 1 and 'i= ; for all other ticks. per sis per sis per sis Supp ose wehavetwo assumptions, P C and P B . Since P B do es not include AB C C all the attributes in M , it violates the condition of the ab ove prop osition. However, M do es have the 0 0 0 unique minimal closure M =AB C ; ; ' , where ' i= 1; 1; 1 for all i. We use minimal closures to develop two useful notions. The rst notion deals with the question whether two temp oral mo dules are equivalent under assumptions. If there are no semantic assumptions, two temp oral mo dules are equivalent if and only if they have the same schemes, same temp oral typ es and same windowing functions, i.e., they are identical temp oral mo dules. However, when semantic assumptions are present, two temp oral mo dules may not b e identical, yet they may contain the same information. To see this, consider the temp oral scheme hAccHol,Addressi, month and the two mo dules M and M on this scheme. For the rst month, ' 1 = ' 1 =fJ.Smith, 16 Fifth Av.,NYg, 1 2 M M 1 2 while for the second month ' 2 =fA.Brady, 5 Old St.,NYg and ' 2 = ' 1 [ ' 2. These 1 M M M 2 1 1 mo dules are not identical, since ' 2 6= ' 2. However, they are equivalent under the assumption M M 1 2 per sis P Address , since they have identical minimal closures under that assumption. AccHol 12 0 0 0 De nition Given two mo dules M =R; ; ' and M =R; ; ' , wesay that M is equivalent to M under assumptions if they have identical sets of minimal closures. Since there may b e di erent mo dules that are equivalent under a set of assumptions, it is interesting to see if there is a minimum temp oral mo dule with resp ect to the subsumption relation among these equivalent mo dules. In practice, wemaycho ose to use such a minimum mo dule to save storage space. De nition Minimal Representation 0 Let b e a set of p oint-based assumptions and M a temp oral mo dule. A temp oral mo dule M is said to 0 be a minimal representation of M under ifM is equivalenttoM under , and there is no temp oral 00 00 00 0 mo dule M such that M is equivalenttoM and M M . The second notion we develop deals with implicit assumptions. It is p ossible to derive implicit as- sumptions from a given set of explicit ones. For example, given the assumption per sis per sis P AccHol; Balance; AnIntRate we can derive P Balance . Intuitively, given AcctNo AcctNo an assumption P and a set of assumptions, we will say that P is implied by if, for any mo dule, any implicit information that can b e derived from P can also b e derived from . Formally, De nition Implied Assumption A p oint-based assumption P is implied by a set of assumptions we write j= P if given an arbitrary mo dule M, each closure of M under is also a closure of M under P . Prop osition 2 The following derivation rule is sound for all meth , :::, meth : 1 k meth meth meth meth 1 k 1 k ;:::;W ifW Y for each j with 1 j k . ;:::;Y implies P W P Y j j X X 1 1 k k An interesting prop erty of p ersistence assumption is that there exists a sound and complete deriva- tion rule for p ersistence assumptions. Theorem 1 The following derivation rule is b oth sound and complete for p ersistence assumptions: per sis per sis If P Y , then P W for all Z X and W X n Z [ Y . X Z 4 Query Evaluation with Point-Based Assumptions In this section, we assume that each temp oral mo dule M in the temp oral database is asso ciated with a set