Temp oral Semantic Assumptions and Their Use in Query



Evaluation

Claudio Bettini X. Sean Wang Sushil Ja jo dia

Technical Rep ort: ISSE-TR-95-105

Department of Information and Software Systems Engineering

George Mason University

June 13, 1995

Abstract

Temp oral data explicitly stored in a temp oral database are often asso ciated with certain seman-

tic assumptions. Each assumption can b e viewed as a way of deriving implicit information from the

explicitly stored data. Rather than leaving the task of deriving p ossibly in nite implicit data to

application programs, as is the case currently, it is desirable that this b e handled by the database

management systems. Toachieve this, this pap er formalizes and studies twotyp es of semantic as-

sumptions: p oint-based and interval-based. The p oint-based assumptions include those assumptions

that use interp olation metho ds, while the interval-based assumptions include those that involve dif-

ferent temp oral typ es time granularities. In order to incorp orate semantic assumptions into query

evaluation, this pap er intro duces a translation pro cedure that converts a user query into a system

query such that the answer of this system query over the explicit data is the same as that of the user

query over the explicit and the implicit data. The pap er also investigates the niteness safety of

user queries and system queries.

1 Intro duction

A prominent feature of temp oral information is its richness in semantics asso ciated with its temp oral

domain. When querying a temp oral database, a user naturally assumes some \usual" semantics on

stored temp oral data. For instance, she exp ects that her bank account balance p ersists, i.e., the balance

stays the same unless a transaction|dep osit, withdrawal or accrual of interest|is p erformed. Therefore,

if she wishes to nd the balance at a particular time and no balance amount is stored for that time, she

lo oks for the balance of the last transaction that was p erformed b efore the time in question. Semantic

assumptions may also involve di erent temp oral typ es time granularities. For example, when the user



A preliminary version of this pap er app eared in the 1995 Pro ceedings of the ACM SIGMOD International Conference

on Management of data. This work was supp orted in part by an ARPA grant, administered by the Oce of Naval Research

under grantnumb er N0014-92-J-4038. Work of Bettini was p erformed in part while visiting George Mason University.Work

of Wang was also supp orted by the NSF grant IRI-9409769. Bettini is with the Dipartimento di Scienze dell'Informazione,

University of Milano, 20135 Milano, Italy, and Wang and Ja jo dia are with the Department of Information and Software

Systems Engineering, George Mason University,Fairfax, VA 22030{4444. Emails: [email protected] and fxywang,

[email protected]. 1

asks for the account holder of a certain account in a particular month and account holders are stored

in terms of days, she naturally assumes that the answer is someone who is the account holder of that

account on all the days within that month.

Researchers have long realized such richness of semantics in temp oral data [CT85, SS87, Tan87] and

provided various op erators in di erent query languages for the users to co de the semantic assumptions

into queries [SS87,Tan87]. For the aforementioned balance query, the user would use a \last" or

similar op erator in her query to retrieve the appropriate balance.

We argue, however, that temp oral semantics should b e an integral part of temp oral

and users should not b e burdened with having to incorp orate these semantics in their queries. In the

account balance example, the user should b e able to query the database directly ab out the balance at

any particular time, instead of having to co de the semantics of the balance into the query. The system

should b e able to answer the query appropriately according to the semantics. To provide such an ability,

the system has to

 include a precise formalization of the temp oral semantics, and

 evaluate queries according to these semantics.

In this pap er, we provide a framework to incorp orate temp oral semantics into temp oral databases and

investigate a metho d to evaluate queries with semantics.

Consider the bank account example in more detail. Assume the temp oral ACCOUNTS records

accountnumb ers AcctNo, account holder AccHol, account balance Balance, annual interest rate

AnIntRate, and the asso ciated time Time, i.e., the time when the rep orted values b ecome true. We

can assume that new tuples are added to this relation when an event o ccurs such as the op ening of

an account, a dep osit, a withdrawal, or a change in interest rates. The values of Time are timestamps

consisting of the date month/day/year concatenated with the time of the day up to seconds. Here

and in the rest of the pap er timestamps represent valid time; extensions to include transaction time and

other temp oral dimensions are not considered.

An instance of the ACCOUNTS relation is shown in Figure 1.

AcctNo AccHol Balance AnIntRate Time

1001 J.Smith 1000 3.00 3/3/93:09:01:00

1001 J.Smith 3500 4.00 3/4/93:10:01:55

1500 A.Brady 2000 3.00 3/4/93:11:00:00

2034 T.Ford 500 2.50 3/4/93:12:19:03

1500 A.Brady 1500 3.00 3/4/93:18:00:00

1001 J.Smith 4000 4.00 7/3/93:09:00:00

Figure 1: An instance of the relation ACCOUNTS

Since we store new tuples only when some of the attribute values change, the relation do es not

contain an explicit tuple for every second of eachday for an account. However, if asked ab out the

balance of Mr. Smith's account at no on of March 4, 1993, we could answer without hesitation that the

balance was $3,500. This is b ecause we assume that nothing has changed for that account since the last

transaction b efore that time. Here wesay that the attribute Balance satis es a semantic assumption

called persistence for each account. Another example of p ersistence is given by the attribute AccHol

that is intuitively p ersistent for each account. 2

Other assumptions apply when di erent temp oral typ es or granularities are considered. For

example, the attribute AnIntRate can b e classi ed as liquid, meaning that it satis es the following

prop erties:

It is downward-hereditary: If the value of AnIntRate is 3.00 for a month, we can assume that its

value is 3.00 for eachday of that month.

It is upward-hereditary: If the value of AnIntRate is 3.00 for eachday of a month, we can assume

that its value is 3.00 for that month.

The terms \liquid" and \downward/upward hereditary" are b orrowed from the temp oral reasoning

community [Sho87].

Note that not all attributes are p ersistent and/or liquid. For example, consider a relation with

the attributes hAcctNo, Amount, secondi that stores the time and the amount of each dep osit or

withdrawal to a certain account. The Amount attribute is obviously neither p ersistent nor liquid.

The aforementioned semantic assumptions, namely p ersistence and liquidity, are among the most

common ones that have b een identi ed by the temp oral reasoning community. There are other semantic

assumptions that arise quite naturally in databases. For example, if a database stores annual salaries

of the employees, the monthly salary can easily b e determined by dividing the annual salary by 12.

Contribution of the pap er

Recognizing and formalizing that a group of attributes satis es some semantic assumption is very useful

in answering queries ab out values of attributes at times of same or di erent temp oral typ es for which

avalue is not explicitly stored. One of the contributions of this pap er is to formalize the notion of

semantic assumptions for the purp ose of query evaluation.

In this pap er, each semantic assumption is viewed as a function for deriving implicit information

from explicit information by applying one or more information generation pro cedures. Such generation

pro cedures are formally captured by our notion of assumption metho ds. Intuitively, an assumption

metho d derives implicit information using some sp eci c rule. For example, \p ersistence" can b e regarded

as one such rule that uses the latest stored value as the currentvalue for an attribute. In Figure 1,

by applying the p ersistence metho d on the Balance attribute, we obtain $1,000 to b e the balance at

the time 3/3/93:09:01:01, 3/3/93:09:01:02, 3/3/93:09:01:03, 3/3/93:09:01:04, and so on. The function

for a semantic assumption that uses p ossibly di erent metho ds to generate information for di erent

attributes, is then a \comp osition" of the rules of the appropriate assumption metho ds.

DB the database that contains, in addition to DB , the We denote by DB the database and by

data generated by applying all semantic assumptions. A user query on a database DB is viewed as a

query on DB . If the query asks for the balance of account 1001 at no on of March 4, 1993, it retrieves

values from DB , and gives $3,500 as the answer.

However, it is usually impractical for the database to store and manage DB . Indeed, consider the

attribute AnIntRate in Figure 1. Since AnIntRate is liquid, it is p ossible to derivevalues for AnIntRate

for each calendar unit. Not only an enormous e ort is required to manage all these derived values, but

this e ort maybewasted since users may not b e interested in knowing the interest rate for all p ossible

calendar units. A similar problem exists for the Balance attribute.

Our solution is to translate the user query into a query that incorp orates the semantic assumptions.

As shown in Figure 2, a user query Q is viewed as a query on DB . Instead of using the semantic

0

DB ,we use them to translate the query Q into another query Q such that the assumptions to build 3

0

new query evaluated on the stored database Q [DB ] is the same as the answer of the user query Q

DB Q[DB ]. In formula, we require that over

DB

H

H

H

H

H

H

0

Q : system query

H

H

H

H

Semantic

H

H

assumptions

H

H

H

H

H

H

H

H

H

? - Hj

DB Q: user query Query result

0 0

Figure 2: Do es there exist a Q such that Q [DB ]=Q[ DB ]?

0

Q [DB ]=Q[ DB ]:

0

Obviously, whether there always exists a Q for each Q largely dep ends on the languages used. In this

F

pap er, we use a single language MQL for the semantic assumptions, for user queries Q,aswell as for

F

0

system queries Q .MQL is an extension of the intro duced in [WJS93] as a general query

F

language on temp oral databases. We show that MQL is a reasonably p owerful language for sp ecifying

0

semantic assumptions. Furthermore, we show that the aforementioned system query Q always exists if

F F

all semantic assumptions are sp eci ed byMQL and all user queries are in MQL .We provide automatic

0

derivation of Q .

Another contribution of the pap er is in its analysis of safety issues of the temp oral database query

F

language MQL .Persistence, and other semantic assumptions, naturally gives rise to in nite information.

For example, the p ersistence of the Balance attribute dictates that the relation in Figure 1 generates

in nitely many tuples: For account 1001, there will b e one tuple for each second after 7/3/93:09:00:00.

There are an in nite numb er of seconds. Such in niteness, called eventual uniformity in this pap er, has

b een accommo dated by researchers by using sp eci c data mo dels e.g., the tuple timestamp [1; 1]of

[Sno84]. However, it is not analyzed formally in a general setting. In this pap er, we study the safety

requirement of query languages that yields only eventually uniform results and other general safety

criteria.

In summary, the contribution of the pap er is three-fold: 1 Semantic assumptions are formalized

for the purp ose of database query evaluation; 2 A paradigm for query evaluation is intro duced to

incorp orate semantic assumptions; and 3 Safety issues are analyzed in this more general setting.

Related research

As mentioned earlier, some of our terms used for semantic assumptions are b orrowed from the temp oral

reasoning community. In particular, the term liquidity is from [Sho87], while the term persistence has

b een extensively used in temp oral reasoning for planning e.g., [AKPT91]. The notion of p ersistence

has also b een used in the context of temp oral databases in [DM87] to make predictions ab out unknown 4

facts in the database; however, the emphasis of [DM87] is on consistency maintenance of a logical

database containing uncertain temp oral information ab out events. Logical databases and uncertain

information are b eyond the scop e of this pap er. Instead, we consider general classes of assumptions

including p ersistence and assumptions involving multiple granularities. Our fo cus is on relational-like

databases and query evaluation on these databases.

Various semantic assumptions in temp oral database settings were p erhaps recognized rst by Clif-

ford and Warren [CW83]. The earliest systematic study was however p erformed by Segev and Shoshani

[SS87]. They recognized various prop erties of time sequences, such as \stepwise constant" and \contin-

uous", and provided a numb er of functions to b e used in user query languages to accommo date these

prop erties. A larger set of sophisticated op erators for the same purp ose were prop osed byTansel in

[Tan87]. By using these op erators, many semantic assumptions can b e co ded into user queries. These

works di er in two resp ects from the current one. First, in this pap er, we formalize the notion of seman-

tic assumption, while in these earlier studies, no general de nition is given. Second, we assume that the

user queries the database with the assumptions built in, instead of having to co de them into queries.

Thus, in this study, the query language for the user is a simple extension of the relational calculus the

extension is for incorp orating the temp oral dimension, not for semantic assumptions.

Other researchers [CT85, WJL91, CI94] also recognized the imp ortance of semantic assumptions

in temp oral databases. Cli ord [CT85] p ointed out the use of interp olation in temp oral databases;

however, the query evaluation was not formalized. Cli ord and Isakowitz [CI94] dealt with formalizing

the semantics of variables that many temp oral data mo dels employ to denote various intuitive semantic

assumptions. The work clari ed manyvague, although intuitive, notions. However, [CI94] did not

address how user queries are evaluated on databases with suchvariables.

Regarding the safety of temp oral queries, the problem is sometimes avoided by assuming a nite time

domain and the safety problem reduces to that of standard relational query languages. A recent pap er

by Cli ord, Croker and Tuzhilin [CCT94] mentioned a syntactic safety requirement based on [Ull88 ] for

a logic based temp oral query language. However, it is not clear what prop erties this syntactic restriction

leads to.

Organization

The rest of the pap er is divided into 8 sections. Section 2 de nes our temp oral data mo del. In Section 3,

weintro duce the notion of a general class of semantic assumption: p oint-based semantic assumptions.

Persistence is a sp ecial case of the p oint-based assumptions. Several imp ortant prop erties of p oint-

based assumptions, and the p ersistence assumption in particular, are investigated. Query evaluation

with resp ect to p oint-based assumptions is discussed in Section 4. Interval-based assumptions that

include the liquidity assumption are formalized in Section 5 and query evaluation with resp ect to such

assumptions is studied in Section 6. In Section 7, the ab ovetwo kinds of assumptions are combined in

evaluating queries. Safety issues are discussed in Section 8. Finally, the pap er is concluded with some

remarks on the results of the pap er and on future research directions in Section 9.

2 Data Mo del

The data mo del used in this pap er is based on that intro duced in [WJS93, WBBJ94] as a uni ed interface

for accessing di erent underlying temp oral information systems. It is b elieved that the concepts and

the results of this pap er are readily translated to other temp oral data mo dels. 5

2.1 Temp oral typ es

We start with de ning temp oral typ es that mo del typical and atypical calendar units. We assume

there is an underlying notion of absolute time, represented by the set N of all p ositiveintegers.

De nition Let I b e the set of all intervals on N , i.e., I = f[i; j ] j i; j 2N and i  j g[f[i; 1] j i 2

N N

1

Ng. A temporal type is a mapping  from the set of the p ositiveintegers the time ticks to the set

I [ f;g i.e., all intervals on N plus the empty set such that for each p ositiveinteger i, all following

N

conditions are satis ed:

1 if i= [k; l] and i +1= [m; n], then m = l +1.

2 i= ; implies i +1= ;.

3 there exists j such that j = [k; l] with k  i  l .

Condition 1 states that the mapping must b e monotonic and that the intervals corresp onding to

consecutive ticks should not have a gap b etween them. Condition 2 disallows a temp oral typ e to map

a certain time tick to the empty set unless it maps all subsequent time ticks to the empty set. Condition

3 requires that each p ositiveinteger must b e contained in the interval corresp onding to a tickofthe

temp oral typ e. A tick i of typ e  is said to b e empty if i= ;. One particular consequence of the

ab ove three conditions is that the last non-empty tick if it exists must b e mapp ed to an interval of

the form [i; 1].

Typical calendar units suchasday, month, week and year can b e de ned as temp oral typ es. As

an example, supp ose the underlying time is measured in terms of seconds. Then the calendar unit day

assuming it starts on the rst day of 1900 is a mapping such that day1 is the set of all the seconds

that comprise the rst day of 1990, and day2 maps to all the seconds of the second day of 1990, and

so on.

There is a natural \ ner-than" relation among temp oral typ es. The temp oral typ e  is said to b e

ner than the temp oral typ e  if for each tickof, the corresp onding time interval is \entirely covered"

by the time interval of some tickof .Thus, for example, day is ner than week and month is ner than

year.Formally,wehave:

De nition Let  and  b e temp oral typ es. Then  is said to b e ner than  , denoted    ,if

1 2 1 2 1 2

for each i, there exists j such that  i  j .

1 2

It is easily seen that  is a partial order. However, it is not a total order since, for example, week

and month are incomparable i.e., week is not ner than month, and month is not ner than week. In

fact, the set of all temp oral typ es is a complete lattice with resp ect to the ner-than relation [WJS93].

Another imp ortant relation regarding temp oral typ es involves time ticks. For example, wewould

liketosay that a particular month is within a particular year. For this purp ose, we assume there is a

binary interpreted predicate IntSec for each pair of temp oral typ es  and  :

;

De nition For temp oral typ es  and  , let IntSec b e the binary predicate on p ositiveintegers

;

such that IntSec i; j is TRUE if i \  j  6= ;, and IntSec i; j is FALSE otherwise.

; ;

1

An interval [i; j ][i; 1], resp. is viewed as the set of all integers k such that i  k  j k  i, resp.. 6

In other words, IntSec i; j is TRUE i the intersection of the corresp onding absolute time

;

intervals of tick i of  and tick j of  is not empty.Thus, for example, IntSec i; j  is true i

month;year

the month i falls within the year j .

2

It is imp ortant to emphasize that a realistic system can treat in nite temp oral typ es and their

corresp onding IntSec relations only if these in nite typ es have nite representations. Various p erio dical

descriptions, e.g., [KSW90, NS92], are p ossible but outside the scop e of this pap er.

2.2 Temp oral mo dule schemes and temp oral mo dules

The temp oral mo dules were presented in [WJS93, WBBJ94] as a uni ed interface for accessing di erent

underlying temp oral information systems. They can b e viewed as \abstract temp oral databases" [Cho94]

or a \conceptual data mo del" [JSS93]. It is b elieved that the concepts and the results of this pap er are

readily translated in terms of other temp oral data mo dels.

We assume that there is a set of attributes and a set of values called the data domain. Each nite

set R of attributes is called a relation scheme. A relation scheme R = fA ;:::;A g is usually written as

1 n

hA ;:::;A i.For relation scheme R, let TupR denote the set of all mappings, called tuples, from R

1 n

to the data domain. A tuple  of relation scheme hA ;:::;A i is usually written as ha ;:::;a i, where

1 n 1 n

a =  i for each1 i  n.

i

De nition A temporal module scheme is a pair R;  where R is a relation scheme and  a temp oral

typ e. A temporal module is a triple R; ; ' where R;  is a temp oral mo dule scheme and ' is a

TupR

mapping, called a time windowing function, from N to 2 such that 'i= ; if i= ; for each

S

i, and 'i is a nite set.

i1

Intuitively, the time windowing function ' in a temp oral mo dule R; ; ' gives the tuples facts

that hold at non-empty time tick i of temp oral typ e . This is a generalization of many temp oral mo dels

in the literature.

A temporal database is a nite set of temp oral mo dules. Throughout this pap er, we assume a xed

set of temp oral mo dule schemes, which is the database scheme. A temp oral database thus is only a

di erent instantiation of the windowing functions of the xed temp oral mo dule schemes. Furthermore,

each temp oral scheme is assigned a unique name. For each temp oral mo dule scheme M,we shall use R

M

and  to denote the relation scheme and temp oral typ e, resp ectively.We also use ' to denote the

M M

\current" instantiation of the windowing function of scheme M.We use M as the name of the current

instantiation of M.For convenience, in temp oral mo dule examples, instead of the p ositiveintegers we will

sometimes use an equivalent domain. For instance, the set of expressions of the form 3/3/93:09:01:00

month/day/year:hour:minute:second will serveassuch a domain.

Example 1 We the temp oral relation ACCOUNTS given in the intro duction as a temp oral mo dule

with ACCOUNTS, second, where ACCOUNTS = hAcctNo, AccHol, Balance, AnIntRatei, as its scheme.

The relation in Figure 1 corresp onds to the time windowing function ' de ned as follows:

'3/3/93:09:01:00=fh1001, J.Smith, 1000, 3.00ig

'3/4/93:10:01:55=fh1001, J.Smith, 3500, 4.00ig

'3/4/93:11:00:00=fh1500, A.Brady, 2000, 3.00ig

'3/4/93:12:19:03=fh2034, T.Ford, 500, 2.50ig

'3/4/93:18:00:00=fh1500, A.Brady, 1500, 3.00ig

'7/3/93:09:00:00=fh1001, J.Smith, 4000, 4.00ig

2

A temp oral typ e is said to b e in nite if it has an in nite numb er of non-empty ticks. 7

Note that the ab ove windowing function only re ects the explicit data stored in the relation. We will

use semantic assumptions in Section 3 to give implicit data.

2

Finally,we de ne the pro ject and select op erations on temp oral mo dules that will b e used in later

T 0

sections. Let M b e a temp oral mo dule and W R . Then  M  is the temp oral mo dule W; ;' 

M M

W

0

such that ' i=  'i for each i  1. Note that  is the standard pro ject op eration. Similarly,

W W

T 0 0

 M  is the temp oral mo dule R ; ;'  such that ' i=  'i for each i  1. Note that  is

M M F F

F

the standard select op eration.

F

2.3 The query language MQL

We sp ecify the query language on temp oral mo dules using a multi-sorted rst-order logic: One sort is

a generic non-temp oral data domain, while the other sorts are temp oral ones corresp onding to the

temp oral typ es that we consider. Each temp oral typ e we use gives rise to a distinct temp oral sort.

We shall use \temp oral typ e" and \temp oral sort" interchangeably when no confusion arises. Also,

we allow certain scalar functions [EMHJ93], suchas+; ;=; , on the data domain. These functions

are not intended as functions interpreted in the logic, but rather as external calls to the system with

F

xed interpretations. The formulas of this multi-sorted logic are called MQL formulas and the query

F

language we build using the logic is called MQL Multi-sorted Query Language with F unctions.

The range of each data variable is the data domain and the range of each temp oral sort variable is

F

N . The data terms of MQL formulas are of the form x or f x ;:::;x , where each x isavariable

0 1 k 0

name or a constant of the non-temp oral data sort, each x 1  i  k  is a term of the data sort,

i

and f is a function of arity k .Thus, functions can b e nested and are only for the data sort. Atomic

F

MQL formulas are of the following four typ es: a x = x where x and x are data terms; b

1 2 1 2

IntSec t ;t , where t and t are variables or constants resp ectively of sorts  and  ; c t

; 1 2 1 2 1 2

t = t , where t and t are variables or constants of the same temp oral sort; and d Mx ;:::;x ;t

1 2 1 2 1 n

where M is a temp oral mo dule scheme name of the database with n b eing the arityof R , each x is

M i

F

a non-temp oral variable name or constant and t a temp oral variable or constant. An MQL formula

is formed by b o olean connectives and the existential and universal quanti ers in a usual way.Asa

syntactic sugar, wechange the quanti cation of temp oral sorts to the form 9t :  and 8t : , where t is of

sort . This syntactic sugar allows us to tell the sorts of b ounded variables from the formula itself. The

predicates IntSec are interpreted as in the previous subsection, < is interpreted as the integer order,

and = is the standard equality.

F

query is of the form An MQL

fx :::x ;t:  j  x ;:::;x ;tg

1 k 1 k

where x :::x are variable names or constants of the non-temp oral sort, t is a temp oral variable of

1 k

F

typ e  , and  x ;:::;x ;tisanMQL formula whose only free variables are among x ;:::;x ;t. The

1 k 1 k

formula  can obviously contain temp oral variables of sorts di erent from  , provided they are b ounded.

F

The following is an example MQL query on the ACCOUNTS temp oral mo dule:

fx; t : month j9y; z; w 9s : second ACCOUNTSx; y; z; w; s ^ z  2000 ^ IntSec t; sg

month;second

This query asks for the months in which an account has a balance over $2,000 for at least one second.

F

An MQL query is correctly typed if for all subformula Mx ;:::;x ;t app earing in it, the temp oral

1 n

sort of t is  , namely the temp oral typ e of M. The ab ove example query is correctly typ ed since the

M

F

temp oral typ e of the temp oral mo dule ACCOUNTS is second. With each correctly typ ed MQL query

Q = fx ;:::;x ;t:  j  g,we asso ciate an answer, denoted Q[DB ], in the form of a temp oral mo dule

1 k 8

R; ; ', where R is a relation scheme of arity k and ' is given as follows: ha ;:::;a i is in 'mif

1 k

and only if  is TRUE when the free variables x , :::, x and t are substituted with the constants

1 k

a , :::, a and m. Note that the truth value of the atomic formula My ;:::;y ;sis TRUE i

1 k 1 n

hy ;:::;y i is in ' s. If we consider the aforementioned example query on a DB containing the

1 n M

temp oral mo dule ACCOUNTS of Example 1, its answer is the temp oral mo dule AcctNo; month;' with

'3=93 = f1001; 1500g, '7=93 = f1001g, and 'i= ; for each i corresp onding to a month di erent

from March and July of 1993.

If a query is not correctly typ ed, there is no answer asso ciated with the query. In Section 6, we will

showhow to answer incorrectly typ ed queries by using interval-based semantic assumptions.

F F

If an MQL formula only has a single temp oral sort, we call it a at-MQL formula. Alternatively,

F F

we view at-MQL as a 2-sorted fragmentofMQL , where all temp oral variables and temp oral constants

are of the same temp oral typ e. Since there is a single typ e , the only predicate of the form IntSec is

F

IntSec , and IntSec is equivalent to =. Hence, in a at-MQL formula, we assume that IntSec

; ;

F F F

predicates do not app ear. A at-MQL query is an MQL query in which the formula is a at-MQL for-

F

mula. If a at-MQL query is correctly typ ed, all temp oral constants and variables app earing in the

query have the same sort and wemay omit the syntactic sugar on the quanti ers on temp oral variables

and the typ e declaration of the free temp oral variable when their typ e is clear from the context. The

F

following is an example of at-MQL query.

fx; t j9y; z; wACCOUNTSx; y; z; w; t ^ z<1000g

Assuming this query is correctly typ ed, then the typ e of t is  whichissecond. This query

ACCOUNTS

asks for the time when an account has a balance lower than $1,000.

3 Point-Based Assumptions

We call p oint-based assumptions those semantic assumptions that can b e used to derive information at

certain ticks of time based on the information explicitly given at di erent ticks of the same temp oral

typ e. Such derivation can b e done in a varietyofways. One way is to assume that the values of certain

attributes p ersist in time unless they are explicitly changed. Another way is to assume that a missing

value is taken as the average of the last and next explicitly given values. Still another wayistotake the

sum of the last three values, etc. In this section, we give a general notion of p oint-based assumptions

that uses, in principle, anyinterp olation function to derive information from explicit values.

In the next subsection, we illustrate p oint-based assumptions by using p ersistence as an example.

The discussion of the p ersistence assumption also motivates the syntax and the semantics of general

p oint-based assumptions.

3.1 An example: Persistence assumption

A p oint-based assumption that is widely used in practice and in the literature is the p ersistence assump-

per sis

tion. With P Y , we denote the assumption of the attributes XY b eing p ersistent with resp ect

X

to the attributes X . This intuitively means that if wehave explicit values for X and Y at a certain tick

of time, these values will p ersist in time until we nd a tick at whichwehave explicit values that are the

same for the attributes X but di erent for Y . More formally, let M =R; ; ' b e a temp oral mo dule

per sis

and P Y  b e a p ersistence assumption. Then, wesay that a tuple t on XY is derived by this

X

00 00

assumption for tick i if there exists j

0 0

2 for all k , j

information pro jected on XY  in the original temp oral mo dule. Hence, a p ersistence assumption on a

temp oral mo dule gives all the original information pro jected on XY  plus all the derived information.

Consider our running example of the ACCOUNTS temp oral mo dule. The assumption

per sis

P AccHol; Balance; AnIntRate 

AcctNo

says that the values of these four attributes p ersist in time until a di erentvalue for one of the attributes

AccHol, Balance,orAnIntRate with resp ect to the same AcctNo is found. Note that this is a reasonable

assumption for the ACCOUNTS temp oral mo dule.

In the temp oral database literature a notion similar to p ersistence is found when the value of a tuple

is given for an interval [1;uc] where uc is a short hand for \until changed" [WJL91, CI94]. However,

the notion of until changed is not well formalized. For example, it is not clear which attributes must

actually change to b e quali ed as \changed". Besides having a clear semantics, p ersistence assumptions

are more p owerful, since more than one p ersistence assumption can b e sp eci ed on the same set of

attributes.

In the next subsection, we formally de ne the syntax and the semantics of p oint-based assumptions,

of which the p ersistence assumption is an example.

3.2 Syntax and semantics of p oint-based assumptions

A p oint-based semantic assumption relies on the use of certain metho ds called assumption methods

to derive implicit values from explicit ones. A metho d is used to deriveavalue of an attribute by

meth

\interp olating" values of the same attribute at di erent ticks of time. The expression P Y 

X

denotes the assumption using metho d meth to derive implicit values of attributes XY with resp ect to

attributes X .We assume there is a xed set of assumption metho ds.

Wenow give the syntax of semantic assumptions.

De nition Let X , Y , :::, Y b e pair-wise disjoint sets of attributes i.e., X \ Y = ; and Y \ Y = ;

1 n i i j

meth

meth

1

n

for each i 6= j  and meth , :::, meth assumption metho ds. Then P Y ::: Y  is called a

1 n X

n

1

point-based assumption. The attribute set X is called the base of the assumption.

The requirement that the sets X , Y , :::, Y b e pair-wise disjoint in the ab ove de nition intuitively

1 n

says that it is not p ossible to use di erent assumption metho ds for the same attribute.

Before de ning the semantics of our semantic assumptions, we need to intro duce the semantics of

assumption metho ds.

De nition Semantics of assumption metho ds

For each assumption metho d meth, there is an asso ciated mapping meth from two sets of attributes

and a temp oral mo dule to a temp oral mo dule such that, for all attribute sets X and Y and temp oral

mo dule M =R; ; ', with XY R, the following conditions are satis ed:

1. methX; Y ; M  is a temp oral mo dule on XY; ;

0 0

2. if methX; Y ; M = XY;;' , then  'i ' i for each p ositiveinteger i;

XY

T T

methX; Y ; M ; and M  =  3. for each set Z of attributes with Z Y , methX; Z; 

XZ XZ

T T

methX; Y ; M . M  =  4. for eachvalue x on X , methX; Y ; 

X =x X =x 10

The rst condition in the ab ove de nition sp eci es the scheme of the derived mo dules. The second

condition requires that each derived mo dule contains all the original tuples. That is, a semantic metho d

can only b e used to add tuples but not to change or delete original tuples. The third condition in the

de nition requires that each metho d derives values for a certain attribute indep endently from values of

other attributes. The nal condition requires that each metho d derives values for a certain attribute

by considering only tuples in the original mo dule having the same values for X .

Although the semantics of an assumption metho d is given as a mapping, it is usually describ ed by

a syntactic expression in practice. Any formal language that is suciently expressive can b e used for

F

this purp ose. In Section 4 we give a formalism based on MQL to sp ecify the semantics of assumption

metho ds.

An example of a metho d that is di erent from p ersistence is avg. By applying avg to a set of

numeric attributes Y with base attributes X ,we derive the implicit tuples on XY where the value for

each A 2 Y is taken as the average b etween the values of A in the last and next explicitly stored tuples

that have the same values for X .

Example 2 Consider a temp oral mo dule RISKS that stores, for each account holder who received a

loan from the bank, a measure of the p ossible risk for the bank: RISKS =hAccHol; EstRiski; day;'

where '2=4=93 = fhJ.Smith, 2i; hT.Ford, 4ig, '6=4=93 = fhJ.Smith, 4i; hT.Ford, 3ig, and ' is empty

for all other days. Then av g fAccHolg; fEstRiskg; RISKS is the temp oral mo dule that contains the

original information as well as the following: For eachdaybetween 2/4/93 and 6/4/93, the EstRisk

values for customers J.Smith and T.Ford were resp ectively 3 and 3:5.

2

Having de ned the semantics of assumption metho ds, we can now give the semantics of semantic

assumptions.

De nition Semantics of p oint-based assumptions

meth

meth

1 k

For each p oint-based assumption P = P Y :::Y , there is an asso ciated partial map-

X

1

k

ping f from temp oral mo dules to temp oral mo dules such that for each temp oral mo dule M =

P

R; ; ' with XY :::Y R,ifmeth X; Y ;M= XY ;;'  for each j =1;:::;k, then f M =

1 k j j j j P

0 0

XY :::Y ;;'  where ' i= ' i ./ ::: ./' i for each i  1.

1 k 1 k

meth

meth

k 1

:::Y  combines via natural join the implicit in- Thus, the semantic assumption P Y

X

1

k

formation on attributes XY , :::, XY supplied by the metho ds meth , :::, meth , resp ectively.

1 k 1 k

A consequence of the ab ove de nition is the fact that in general, the semantics asso ciated with

meth meth meth

P Y Y is not always the same as that of P Y Y  . Indeed, consider the temp o-

X X 1 2

1 2

ral mo dule M =hA; B ; C i;;', where '1 = fh1; 2; 3i; h1; 3; 4ig, and the two semantic assump-

per sis per sis per sis

tions P = P B C , P = P BC  . Let f M = hA; B ; C i;;'  and f M =

1 A 2 A P 1 P

1 2

hA; B ; C i;;' . It is easily seen that h1; 2; 4i is in ' 2, but not in ' 2.

2 1 2

We consider this as natural since it is likely that the reason that the two sets Y and Y of

1 2

attributes are written separately is b ecause their values are derived indep endently. However, the

base X of a p oint-based assumption is often a temp oral key i.e., there do not exist two distinct

tuples that b elong to the same tickyet have the same X values, in which case the semantics of

meth meth meth

 and P Y Y   are the same. Indeed, by condition 3 of assumption metho ds se- Y P Y

X 1 2 X

2 1

mantics, the temp oral mo dules obtained by methX; Y ;M and methX; Y ;M are the pro jections of

1 2

0

M = methX; Y Y ;M. Since X is a temp oral key, the join of the two pro jections, i.e., the semantics

1 2

0 meth meth meth

 is the same mo dule M i.e., the semantics of P Y Y  . Y of P Y

X 1 2 X

2 1 11

3.3 Prop erties of temp oral mo dules with assumptions

Given a temp oral mo dule M and a set of assumptions , wewould like to know if there is a new temp oral

mo dule that mo dels all the information implied by M under . Clearly, all the tuples in f M , for

P

each P in , should b e present; however, the new mo dule should not include any extraneous tuples. We

call such a mo dule minimal closure of M under . In order to formalize this notion, we rst de ne a

subsumption relation among temp oral mo dules.

0 0

De nition Given two mo dules M =R; ; ' and M =R; ; ' , wesay that M is subsumedby

0 0 0

M , denoted M  M , if for each p ositiveinteger i, 'i ' i. The subsumption is said to b e strict,

0 0 0

denoted M  M ,if M  M and M 6= M .

De nition Let M b e a temp oral mo dule and P a p oint-based assumption with W as the set of all

the attributes that app ear in P . A temp oral mo dule M is called a closureof M under P if M is on

1 1

T

the same temp oral scheme of M , M  M , and f M    M . M is called a minimal closureof

1 P 1 1

W

M under P if it is a closure of M under P and there do es not exist closure M of M under P such that

2

0

M  M . A temp oral mo dule M is said to b e a closureof M under a set ofpoint-based assumptions

2 1

if it is a closure under each assumption P in , and is said to b e a minimal closureof M under , if

00 00 0

there do es not exist closure M of M under such that M  M .

Notice that the minimal closure of a certain mo dule under a set of p oint-based assumptions may not

b e unique. For example, given M =AB C ; ; ' with '1 = a; b; c and 'i=; for each i> 1, and

per sis

the assumption P = P B , we can nd as many minimal closures of M under the assumption P ,

A

as the numberofvalues that attribute C can take. Indeed, for eachvalue x that C can take, the mo dule

M =AB C ; ; '  with ' 1 = a; b; c and ' i= a; b; x for each i> 1, is a minimal closure of M

x x x x

under P . Here we give a sucient condition for the existence of the unique minimal closure.

Prop osition 1 Let b e a set of p oint-based assumptions and R;  a temp oral scheme. If for each

meth

meth

1 k

in,wehave XY Y = R, then each temp oral mo dule on R;  assumption P Y ::: Y

1 k X

1

k

has a unique minimal closure under . Furthermore, if the minimal closure of a temp oral mo dule M

under is R; ; ', and the minimal closure of M under each assumption P inisR; ; ' , then

P

S

'i= ' i.

P

P 2

The opp osite is not necessarily true even if wehave only p ersistence assumptions. Consider for

example the temp oral mo dule M =AB C ; ; ' with '1 = 1; 1; 1 and 'i= ; for all other ticks.

per sis per sis per sis

Supp ose wehavetwo assumptions, P C  and P B . Since P B  do es not include

AB C C

all the attributes in M , it violates the condition of the ab ove prop osition. However, M do es have the

0 0 0

unique minimal closure M =AB C ; ; ' , where ' i= 1; 1; 1 for all i.

We use minimal closures to develop two useful notions. The rst notion deals with the question

whether two temp oral mo dules are equivalent under assumptions. If there are no semantic assumptions,

two temp oral mo dules are equivalent if and only if they have the same schemes, same temp oral typ es

and same windowing functions, i.e., they are identical temp oral mo dules. However, when semantic

assumptions are present, two temp oral mo dules may not b e identical, yet they may contain the same

information. To see this, consider the temp oral scheme hAccHol,Addressi, month and the two mo dules

M and M on this scheme. For the rst month, ' 1 = ' 1 =fJ.Smith, 16 Fifth Av.,NYg,

1 2 M M

1 2

while for the second month ' 2 =fA.Brady, 5 Old St.,NYg and ' 2 = ' 1 [ ' 2. These

1 M M M

2 1 1

mo dules are not identical, since ' 2 6= ' 2. However, they are equivalent under the assumption

M M

1 2

per sis

P Address , since they have identical minimal closures under that assumption.

AccHol 12

0 0 0

De nition Given two mo dules M =R; ; ' and M =R; ; ' , wesay that M is equivalent to M

under assumptions if they have identical sets of minimal closures.

Since there may b e di erent mo dules that are equivalent under a set of assumptions, it is interesting

to see if there is a minimum temp oral mo dule with resp ect to the subsumption relation among these

equivalent mo dules. In practice, wemaycho ose to use such a minimum mo dule to save storage space.

De nition Minimal Representation

0

Let b e a set of p oint-based assumptions and M a temp oral mo dule. A temp oral mo dule M is said to

0

be a minimal representation of M under ifM is equivalenttoM under , and there is no temp oral

00 00 00 0

mo dule M such that M is equivalenttoM and M  M .

The second notion we develop deals with implicit assumptions. It is p ossible to derive implicit as-

sumptions from a given set of explicit ones. For example, given the assumption

per sis per sis

P AccHol; Balance; AnIntRate we can derive P Balance . Intuitively, given

AcctNo AcctNo

an assumption P and a set of assumptions, we will say that P is implied by if, for any mo dule, any

implicit information that can b e derived from P can also b e derived from . Formally,

De nition Implied Assumption

A p oint-based assumption P is implied by a set of assumptions we write j= P  if given an arbitrary

mo dule M, each closure of M under is also a closure of M under P .

Prop osition 2 The following derivation rule is sound for all meth , :::, meth :

1 k

meth meth

meth meth

1 k 1 k

;:::;W ifW Y for each j with 1  j  k . ;:::;Y  implies P W  P Y

j j X X

1 1

k k

An interesting prop erty of p ersistence assumption is that there exists a sound and complete deriva-

tion rule for p ersistence assumptions.

Theorem 1 The following derivation rule is b oth sound and complete for p ersistence assumptions:

per sis per sis

 If P Y , then P W  for all Z X and W X n Z  [ Y .

X Z

4 Query Evaluation with Point-Based Assumptions

In this section, we assume that each temp oral mo dule M in the temp oral database is asso ciated with a set

of p oint-based semantic assumptions. We use to denote the collection of all semantic assumptions.

M

We restrict ourselves to temp oral mo dules having a unique minimal closure under semantic assumptions.

In light of Prop osition 1, we assume that each semantic assumption in involves all the attributes in

M

R .

M

We rst formally de ne the notion of query answer with p oint-based semantic assumptions.

F

De nition Let Q b e a correctly typ ed MQL query and supp ose the temp oral mo dules in the database

DB are M , :::, M . Then, the answer of Q on DB with assumptions , denoted Q[DB ; ], is Q[ DB ],

1 m

DB is the database with the temp oral mo dules M , :::, M and each M is the minimal closure where

1 m i

of M under the assumptions .

i M

i 13

That is, the answer of a query with assumptions is the answer of the original query on all the minimal

closures of the temp oral mo dules in the database. In other words, we rst accommo date the assumptions,

via minimal closures, into the temp oral mo dules and then use the minimal closures as the temp oral

database to answer the query. Note that the query in question must b e correctly typ ed since p oint-based

semantic assumptions do not change the temp oral typ es.

The ab ove de nition captures the purp ose of using semantic assumptions in answering queries.

However, as mentioned in the intro duction, instead of changing the temp oral mo dules in the database

to answer queries, wechange the query to encompass all the semantics such that the new query on the

original database will give the intended answer under assumptions. Formally,

0

De nition Given a query Q and semantic assumptions , a query Q is said to b e a -captured version

0

of Q if Q [DB ]= Q[DB ; ] for all instantiations of DB .

If an assumption-captured version of a query is found, in order to answer the query with assumptions,

an existing system can simply evaluate this assumption-captured version in a usual way using \standard"

+

techniques [TCG 93]. That is, the underlying query evaluation programs of an existing system need

not change to accommo date semantic assumptions in its database.

In order to formalize the pro cess that obtains a -captured version, we limit the set of assumptions

F

3

to those that can b e expressed in at-MQL formulas. In particular, for each temp oral mo dule scheme

meth

meth

1 k

M =R;  and assumption P = P Y ::: Y  with R = XY :::Y we require the existence

X 1 k

1

k

F

M M

of a formula in MQL such that a only one predicate M app ears p ossibly several times in and

P P

M

predicate M has arity kRk + 1 and b for each mo dule M on M, f M = fx ;:::;x ;t j g. It is clear

P 1 n

P

M

that by identifying for each temp oral scheme M,wehave identi ed the mapping f .

P

P

In the remainder of this section, we rst showhow assumption metho ds can b e describ ed by syntactic

expressions from whichwe can derive the formulas corresp onding to the assumptions. In particular,

F

we use a slight extension of at-MQL to syntactically sp ecify assumption metho ds, and then we give

a pro cedure on how to derive the aforementioned formulas for semantic assumptions from these

syntactic sp eci cations.

The syntactic sp eci cation of assumption metho ds need to b e parametric with resp ect to the sets of

F

attributes X and Y and to the mo dule scheme M.For this purp ose we extend at-MQL as follows: The

number n of data attributes of the predicate M and corresp onding variables is left as a parameter, and

M

the symb ol Ind , where V is a set of attributes, can b e used to indicate the set of indices p ositions of

V

4

attributes in V app earing in M when the \parameterized" scheme is instantiated. We call a formula in

this extension a formula template.

F

In this pap er, we require that all assumption metho ds b e sp eci ed by such an extension of at-MQL .

F

Such a requirement is not overly restrictive. Indeed, at-MQL isavery expressive language, and we

can sp ecify many metho ds that are useful in practice. As an example, the formula template for the

p ersistence metho d is as follows:

M

w ;:::;w ;t= Mw ;:::;w ;t _

per sis 1 n 1 n

P W 

V

0 0 0

9t ;v ;:::;v  t

1 n 1 n

M

00 0 00 00

 z = v  ^ 8t ;z ;:::;z t

i i 1 n 1 n

V

M

 w = v  8i 2 Ind

i i

VW

M

where P is a We shall see b elowhow the ab ove template is used to derive the required formula

P

3

This restriction can b e relaxed if -captured version can b e in terms of other languages.

4 M

= f2; 3; 5g. For example, assume M =hA; B ; C; D; E i; and V = fB; C; Eg. Then Ind

V 14

p ersistence assumption and M is a temp oral mo dule scheme. However, it is quite clear that the ab ove

formula template corresp onds to our intuitive de nition of p ersistence.

As another example, we formally de ne the avg assumption metho d by giving the following formula

template:

M

w ;:::;w ;t= Mw ;:::;w ;t _

av g

1 n 1 n

P W 

V

9t ;t ;v ;:::;v ;u ;:::;u t

1 2 1 n 1 n 1 1 n 1

M

00 00 00

^ 8t ;z ;:::;z  t

1 n 1 1 n i i

V

^ t

2 1 n 2

M

00 00 00

^ 8t ;z ;:::;z  t  t

1 n 2 1 n i i

V

M

^ 8i 2 Ind  w = v = u

i i i

V

v +u

M

i i

 ^ 8i 2 Ind  w =

i

W

2

It is easily seen that the mapping given by the ab ove template corresp ond to the intuition b ehind the

average metho d.

M

Wenow provide a pro cedure to obtain the formula for each given temp oral scheme M and

P

assumption P provided that each assumption metho d app earing in P is sp eci ed by a formula template.

meth

meth

1 k

Transformation 1 Given the p oint-base assumption P = P Y :::Y  and temp oral mo dule

X

1

k

M

scheme M =R; , the following steps output the formula x ;:::;x ;t:

1 n

P

M 1 1 M k k

1. Derive the formulas w ;:::;w ;t;:::; w ;:::;w ;t from the formula

meth meth

1 n 1 n

1

k

P Y 

P Y 

X

X

1

k

template of metho ds as follows:

 the free variable symb ols are substituted with the ones given here;

 V and W are instantiated with X and Y and  with the given value of ;

 the numberofvariables n is set to kRk;

M

 v = ::: = w " obtained after instantiating V with  each expression of the form \8i 2 Ind

i i

Z

X and W with Y and p ossibly renaming the variables, is written out as a conjunction of a

M

nite number of v = ::: = w for each l in Ind .

l l

Z

M

2. Output the following formula as x ;:::;x ;t:

1 n

meth

meth

1 k

P Y ::: Y 

X

1

k

1 1 M k k 1 1 k k M

9w ;:::;w w ;:::;w ;t ^ w ;:::;w ;t ^ ::: ^9w ;:::;w

meth meth

1 n 1 n 1 n 1 n

1

k

P Y 

 P Y

X

X

1

k

M M M

1 k 1 k

8i 2 Ind x = w = ::: = w x = w  ^ 8i 2 Ind x = w  ^ ::: ^ 8i 2 Ind 

i i i

X Y Y

i i i i

1

k

It is easily seen that the ab ove formula is equivalent to the natural join op eration required in the

de nition of assumption semantics. Hence, the following holds:

M

Prop osition 3 Transformation 1 correctly identi es the formula for a p oint-based assumption P

P

and temp oral mo dule scheme M.

av g per sis

As an example, consider the mapping for P = P BD EF  . Given a mo dule M on

AC

scheme M=hA; B ; C; D; E ; F i;, this assumption intuitively says that values for attributes B and D

should b e derived with resp ect to AC using the av g average metho d while values of attributes E and F

M M

= f2; 4g, = f1; 3g, Ind should simply p ersist with resp ect to AC . In this example n = 6 and Ind

BD AC 15

M

M

and Ind = f5; 6g. The following is the formula de ning . In the formula, the temp oral variable

EF

P

t is of sort .

1 1 M 1 1 M

w ;:::;w ;t ^ x ;:::;x ;t = 9w ;:::;w

av g

1 6

1 6 P 1 6

P BD 

AC

2 2 M 2 2

9w ;:::;w w ;:::;w ;t ^

per sis

1 6 1 6

P EF 

AC

1 2 1 2

x = w = w ^ x = w = w ^

1 3

1 1 3 3

1 1 2 2

x = w ^ x = w ^ x = w ^ x = w

2 4 5 6

2 4 5 6

where

2 2 2 2 0 0 2 2 0 M

w ;:::;w ;t= Mw ;:::;w ;t _ 9t  t

per sis

1 6 1 6 1 6

P EF 

AC

00 0 00 00 2 2

^ 8t ;z ;:::;z t

1 6 1 6 1 3

1 3

M 1 1

and w ;:::;w ;t is similarly derived.

av g

1 6

P BD 

AC

We are now ready to give the pro cess by whichwe obtain -captured versions of given queries.

F

Algorithm 1 Let Q b e a correctly-typ ed MQL query and the collection of p oint-based assumptions.

For each Mx ;:::;x ;t app earing in Q, where each x is a data variable or constant and t is a temp oral

1 n i

variable or constantoftyp e  , substitute Mx ;:::;x ;t with

1 n

M M

x ;:::;x ;t _ ::: _ x ;:::;x ;t

1 n 1 n

P P

1

k

where P , :::, P are all the assumptions in .

1 k M 2

M

Note that in the ab ove replacement, we assume that each free variable in is substituted by the

P

j

corresp onding variable or constantin Mx ;:::;x ;tinQ. By a simple induction on the structure of

1 n

the input query,wehave:

Theorem 2 The query obtained by Algorithm 1 is a -captured version of Q.

F

Example 3 Let us consider the at-MQL query given earlier:

Q = fx; t j9y; z; w ACCOUNTSx; y ; z ; w; t ^ z<1000g.IfDB = fACCOUNTSg and = fP g where

1

per sis 0

P = P AccHol; Balance; AnIntRate g, then Algorithm 1 gives as output the query Q =

1 AcctNo

ACCOUNTS ACCOUNTS

x; y ; z ; w; t with the x; y; z; w; t ^ z<1000g. Substituting fx; t j9y; z; w 

P P

1 1

F

corresp onding at-MQL formula that sp eci es the semantics of the p ersistence metho d, we can easily

0

verify that the answer Q [DB ] is the mo dule AcctNo; second;' with 'i= ; for each i b efore

3/4/94:12:19:03 and 'i= f2034g for each i at and after 3/4/94:12:19:03. Note that this answer is

equal to what wewould obtain by computing the closure of ACCOUNTS with resp ect to P and then

1

evaluating Q on this closure. Indeed, the minimal closure would contain all the tuples derived by the

p ersistence, in particular those with resp ect to Mr.Ford's accountnumb er 2034.

2

5 Interval-Based Assumptions

Interval-based assumptions are those that can b e used to derive information for a certain tickofone

temp oral typ e from information at ticks of a di erent temp oral typ e. The word interval indicates the fact

that these \source" ticks must b e intervals having a certain relationship containmentorintersection

with resp ect to the interval of the \target" tick for which the value is b eing derived. 16

5.1 An example: Liquidity



With I A we denote the assumption of the attribute A b eing downward hereditary with resp ect

X

to the attributes in X . This intuitively means that if wehave an explicit value for the attribute A

with resp ect to certain values for X at a certain tickoftyp e , then for each tickofany other typ e

that is covered by it, A has that same value with resp ect to the same values for X . Referring to our



running example, it is reasonable to assume I AccHol ; intuitively, if an accountAcctNo has

AcctNo

a corresp onding account holder name AccHol in a second, then it has the same account holder name

"

in any millisecond, microsecond, and so on within that second. Similarly, with I A we denote the

X

assumption of the attribute A b eing upward hereditary with resp ect to X .Intuitively,ifwehave the

same value for the attribute A with resp ect to the same X at contiguous ticks, that value is also the

value of A for the same X for each tickofany other typ e that is the union of some of these ticks. In

our example, if an account has the same account holder name in each second of a month, we know that

l

the account has that account holder in that particular month. With I A we denote the assumption

X

of the attribute A b eing liquid with resp ect to X ; i.e., it is b oth downward and upward hereditary.

l

Referring to our example, we can reasonably make the assumption I AccHol; AnIntRate .

AcctNo

Liquidity,aswell as upward/downward heredity, can b e used to answer queries that are not correctly

typ ed. Consider the query:

fy; t: month j9x; z ; w ACCOUNTSx; y; z; w; t ^ x = 1001g

This query is not correctly typ ed since it asks for the account holder name of account 1001 in months,

while the temp oral typ e of ACCOUNTS is in seconds. However, we can answer the query if there are

assumptions, such as the upward heredity assumption, for all the attributes in ACCOUNTS to convert the

information to a coarser typ e. Note that the example query could b e used to see if there has b een a

change of name during a month, in which case the answer would b e empty.

Wenow formalize the notion of a general interval-based assumption.

5.2 Syntax and semantics

As in p oint-based assumptions, an interval-based assumption relies on the use of certain \conversion"

metho ds. For example, the liquidity assumption uses a \constant" function to derivevalues. We assume

that there is a xed set of conversion metho ds.

De nition Let X , Y , :::, Y b e pair-wise disjoint attribute sets, and conv , :::, conv conversion

1 n 1 n

conv

conv

1

n

 Y metho ds. Then I Y  is called an interval-based assumption.

X

n

1

In Subsection 5.1 wehave illustrated three conversion metho ds. There are many others that can

b e useful. As an example, consider a mo dule of typ e  and assume that i is a tickofatyp e  such

that there is a sequence of ticks j ;:::;j of typ e , all contained in tick i of  .We can de ne two

1 k

conversion metho ds|we call them rst and last, resp ectively|that derive the value for tick i of 

using the values at tick j and j of , resp ectively. In our running example, it is reasonable to make

1 k

the assumption that last can b e used to convert the Balance attribute with resp ect to the attribute

AcctNo. This means that if a query on the ACCOUNTS mo dule asks for the balance of an account for

a month, the balance at the last second of that month for that account is returned. Note that even

though we do not have explicit values for the last second of each month, we can still derive them using

the p ersistence assumption. Similarly, considering all other attributes in ACCOUNTS, the following is a

l last

reasonable interval-based assumption: I AccHol; AnIntRate ; Balance .

AcctNo 17

Analogous to p oint-based assumptions, we de ne the semantics of conversion metho ds b efore giving

the semantics of interval-based semantic assumptions. Given a temp oral mo dule M =R; ; ' and a

0 0

set S of integers, we use the notation M j to denote the temp oral mo dule R; ; '  where ' is de ned

S

0 0

as follows: For each p ositiveinteger i, ' i= 'iif i \ S 6= ;, and ' i= ; otherwise. That is,

M j is a temp oral mo dule obtained bykeeping only the values of the windowing function at the ticks

S

whichintersect with the absolute time set S .

De nition Semantics of conversion metho ds

The semantics of a conversion metho d conv is given by a mapping conv  from two sets of attributes, X

and Y , a temp oral mo dule M =R; ; ' with XY R, and a temp oral typ e  such that the following

conditions are satis ed:

1. conv X; Y ; M ;   is a temp oral mo dule on XY; ;

T

2. conv X; Y ; M ; =  M ;

XY

T T

3. for each subset Z of Y , conv X; Z;  M ;=  conv X; Y ; M ;  ;

XZ XZ

T T

conv X; Y ; M ;  ; and M ;=  4. for eachvalue x of X , conv X; Y ; 

X =x X =x

5. for each p ositiveinteger j , conv X; Y ; M j ;= conv X; Y ; M ;  j .

 j   j 

The rst condition says that for a given temp oral mo dule and a target temp oral typ e, the interval-based

assumption I gives a temp oral mo dule with all the attributes in I as the relation scheme of the output

mo dule and the target temp oral typ e as the temp oral typ e of the output mo dule. The second condition

requires that if the target temp oral typ e coincides with the temp oral typ e of the input mo dule, then

the output mo dule should simply b e a pro jection of the input mo dule. The third condition requires

that each metho d derives values for a certain attribute indep endently from values of other attributes.

The fourth condition requires that each metho d derives values for a certain attribute considering only

tuples in the original mo dule having the same value for X . The nal condition says that the value at

each tick j in the target typ e  dep ends only on values at ticks that intersect with j .

If a mo dule M =R; ; ' has a non-empty set of asso ciated interval-based assumptions, values for

mo dules in di erent temp oral typ es can b e implied by these assumptions. The way of deriving these

values is sp eci ed by the semantics of the semantic assumptions:

De nition Semantics of interval-based assumptions

conv

conv

1 k

, there is an asso ciated partial mapping For eachinterval-based assumption I = I Y :::Y

X

1

k

f from temp oral mo dules to temp oral mo dules such that, for each temp oral mo dule M =R; ; '

I

with XY :::Y R,ifconv X; Y ;M;= XY ;;'  for each j =1;:::;k, then f M;  =

1 k j j j j I

0 0

XY :::Y ;;'  where ' i= ' i ./ ::: ./' i for each i  1.

1 k 1 k

Hence, analogous to p oint-based assumptions, the values that are derived according to di erent

conversion metho ds within one assumption are combined via the natural join. Notice that the mapping

for an interval-based assumption de ned this way satis es all the conditions required for the mappings

of conversion metho ds.

6 Query Evaluation with Interval-Based Assumptions

The notion of query answering with p oint-based assumptions in Section 4 can b e extended to that with

interval-based assumptions. As in Section 4, we assume there is a set of interval-based assumptions

M

for each M in the database, and denote the collection of all interval-based assumptions by. 18

As for p oint-based assumptions, we restrict ourselves to assumptions involving all the attributes

5

app earing in the mo dules on which they are applied. We also limit interval-based assumptions for

F

whichwe can evaluate queries to those that can b e expressed in MQL formulas. In particular, for

conv

conv

1 k

each temp oral mo dule scheme M =R; , temp oral typ e  and assumption I = I Y ::: Y 

X

1

k

M;

F

in MQL such that a among the with R = XY :::Y we require the existence of a formula

1 k

I

predicates in the formula, only one predicate symbol M,involving b oth data and temp oral sorts, app ears

M;

and predicate M has arity kRk + 1, and b if M is a mo dule on M then p ossibly several times in

I

M;

f M;  = fx ;:::;x ;t:  j g where n = kRk.

I 1 n

I

Analogous to assumption metho ds, conversion metho ds can b e de ned through formula templates

such that the formulas of assumption involving those metho d can b e easily derived from the templates.

The formula template needs to b e parametric with resp ect to the mo dule scheme R; , the target typ e

 and to the sets of attributes X and Y .For this purp ose the same simple extension considered for

M

F

at-MQL can b e used here, allowing n to remain a parameter, and the symb ol Ind where V is a set of

V

attributes can b e used to indicate the set of indices p ositions of attributes in V app earing in R when

the scheme is instantiated. The restriction to this language for the de nition of conversion metho ds is

F

reasonable, since MQL can b e used to express many practically interesting interval-based assumptions.

For example, the liquidity conversion metho d is sp eci ed by the following formula template:

M;

x ;:::;x ;t=

1 n

l

I W 

V

M

0 0 0

8t : IntSec t ;t  9w ;:::;w Mw ;:::;w ;t  ^ 8j 2 Ind x = w 

; 1 n 1 n j j

VW

M;

We shall see b elowhow the ab ove template is used to derive the required formula , where I is

I

l

a liquidity assumption of the form I Y . However, it should b e intuitively clear that this formula

X

template gives the mappings for the liquidity conversion.

The conversion last can b e sp eci ed by the formula template:

M;

x ;:::;x ;t=

1 n

last

I W 

V

0 0 00 00 0 00

9t : IntSec t ;t ^ 8t : t >t :IntSec t ;t

; ;

0

^ 8s :  s>t:IntSec t ;s

;

M

0

^ 9w ;:::;w Mw ;:::;w ;t  ^ 8j 2 Ind  x = w 

1 n 1 n j j

VW

Once wehave the de nition of conversion metho ds in the form of the formula templates, for any

M;

such that semantic assumption I involving the de ned metho ds, we can easily obtain the formula

I

M;

f M;  = fx ;:::;x ;t:  j g for each temp oral mo dule M on M and typ e  .

I 1 n

I

M;

Similar to what wehave done for p oint based assumptions, the formula for an assumption

I

conv

conv

k 1

::: Y , a target typ e  , and a temp oral mo dule scheme M =R;  such that I = I Y

X

1

k

XY :::Y = R with kRk = n can b e obtained by the following transformation.

1 k

conv

conv

1 k

Transformation 2 Given an interval based assumption I Y :::Y , temp oral mo dule scheme

X

1

k

M;

M and target temp oral typ e  , the following steps output the formula x ;:::;x ;t:

1 n

I

M; M;

1 1 k k

1. Derive the formulas w ;:::;w ;t;:::; w ;:::;w ;t from the semantics

conv conv

1

n k n

1 1

I Y 

I Y 

X

X

1

k

of conversion metho ds as follows:

 the free variable symb ols are substituted with the ones given here;

 V and W are instantiated with X and Y , while  and  with their given values;

 the numberofvariables n is set to kRk;

5

A notion of minimal closure analogous to that for p oint-based assumptions can b e easily de ned. 19

M

 each expression of the form \8i 2 Ind  v = ::: = w " obtained after instantiating V and

i i

Z

W with X and Y and p ossibly renaming the variables, is written out as a conjunction of a

M

nite number of v = ::: = w for each l in Ind .

l l

Z

M;

x ;:::;x ;t: 2. Output the following as

1 n

I

M; M;

1 1 1 1 k k k k

9w ;:::;w w ;:::;w ;t ^ ::: ^9w ;:::;w w ;:::;w ;t ^

conv conv

1

k

1 n 1 n 1 n 1 n

I Y 

I Y 

X

X

1

k

M 1 k M 1 M k

8i 2 Ind x = w = ::: = w  ^ 8i 2 Ind x = w  ^ ::: ^ 8i 2 Ind x = w 

i i i

X i i Y i Y i

1

k

It is easily seen that the ab ove formula is the equivalent in our logic of the natural join op eration

required in the de nition of interval-based assumption semantics. Hence, the following holds:

Prop osition 4 Transformation 2 identi es the mapping for an interval-based assumption.

The purp ose of interval-based assumptions is to derive information in terms of a di erent temp oral

typ e. For instance, monthly information can b e derived from daily information. The advantage is that

the user query do es not have to b e correctly typ ed for the system to answer. Indeed, supp ose a temp oral

mo dule in the database contains accountinterest rate information in terms of days. If the user wants

to know the interest rate at a particular hour, she may query the database using hour as the temp oral

typ e for the query. The system will then use the liquidity assumption to answer the query.

However, there are cases where assumptions are not \designed" for certain information derivation.

For example, if only downward heredity is assumed on a temp oral mo dule M, information in terms of

any temp oral typ e that is coarser than  is not derivable. Such incorrectly typ ed queries cannot b e

M

answered. Belowwe formalize this intuition.

De nition Typ e reachability

M;

F

Let I b e an interval-based assumption and is its asso ciated MQL formula. Supp ose  and  are

I

two temp oral typ es. Then a type  can reach type  via I if there exists a temp oral mo dule M =R; ; '

M;

such that the answer of the query fx ;:::;x ;t:  j g, where x , :::, x are all the free variables in

1 n 1 n

I

M;

, is not empty.

I

In other words, typ e  cannot reachtyp e  via I if the answer of the query given in the de nition is

empty indep endently from the mo dule M .Thus, no matter what is the given information whichisin

terms of , no information in terms of  can b e derived by using I .

We can now de ne the notion of doable queries , which are incorrectly typ ed queries that can b e

answered in the presence of interval-based assumptions.

De nition Doable query

Wesay that a query is doable wrt the interval-based assumptions  if for each subformula Mx ;:::;x ;t

1 n

app earing in the query, where t is of typ e  ,  can b e reached from  via some interval-based assump-

M

tions in .

M

F

The doability of queries is decidable if all conversion metho ds are sp eci ed byMQL formula tem-

plates that use no arithmetic op erations and with the assumption that the IntSec predicate is precom-

puted into a nite with a constant lo okup time. Unfortunately, the complexity of the decision

pro cedure even under these simplifying assumptions is rather high.

Prop osition 5 Let b e a set of interval-based assumptions whose conversion metho ds are sp eci ed by

F

MQL formula template with the ab ove simpli cation and assumption. Then, the problem of determining

if a query is doable under is decidable, but the complexity of such a decision pro cedure is at least

cn

2

2 where n is the length of the query and c is a constant. 20

In terms of practical systems, we require that the system designer gives explicit information ab out

interval-based assumptions. We identify b elowtwo main kinds of interval-based assumptions: upward

and downward assumptions.

De nition Wesay that an assumption I is upward downward, resp. if for eachtyp e  coarser  ner,

resp. than ,typ e  can b e reached from  via I .

F

Clearly, the liquidity assumption is b oth upward and downward. It is easy to de ne in MQL the

twovariants of liquidity so that one works as upward heredity and the other as downward heredity. Each

one has only one of the two prop erties de ned ab ove. If our interval-based assumptions are all upward

and/or downward, doability is easily checked.

Similar to the p oint-based assumptions, wemay de ne query answers with interval-based assump-

tions and assumption-captured version of queries:

F

De nition Let b e a set of interval-based assumptions and Q aMQL query wrt . Let V b e the set

of all the temp oral typ es of the variables and constants app earing in Q, and for each M in DB and  in

i

S

' j . Then the answer of Q with V , let M =R ; ;' , where for each j  1, ' j =

i; i i i i

f M ; 

I 2

i

I

M

i

assumptions denoted by Q[DB ; ] is Q[DB ], where

 DB = fM j M 2 DB and  2Vg, and

i; i

 Q is the the query obtained bychanging each M x ;:::;x ;ttoM x ;:::;x ;tin Q, where 

i 1 k i; 1 k

is the typ e of t.

0 0

A query Q is called -captured version of Q if Q [DB ]=Q[DB ; ].

That is, to answer a query with assumptions , one needs to a obtain, via the given interval-based

assumptions, the temp oral mo dules in terms of the temp oral typ e used in the query, and b change the

o ccurrences of the temp oral mo dule predicates to address the \new" temp oral mo dules. The altered

query is obviously correctly-typ ed, and it is then answered in a usual wayover the altered database. A

-captured version is a query which will obtain, without changing the database, the correct answer of

the original query with assumptions.

F

Algorithm 2 Given an MQL query Q on a temp oral database with interval assumptions, Q is trans-

0

formed in a query Q on the original mo dules by substituting each o ccurrence of an atom Mx ;:::;x ;t,

1 n

where each x is a data variable or constant and t isavariable or constantoftyp e  , with

i

M; M;

x ;:::;x ;t _ ::: _ x ;:::;x ;t

1 n 1 n

I I

1

k

where I ;:::;I are all the interval assumptions in .

1 k M 2

Theorem 3 Given a query Q and a set of interval-based assumptions as inputs to Algorithm 2, the

query obtained by the algorithm is a -captured version of Q.

7 Combining Point-Based and Interval-Based Assumptions

If b oth p oint-based and interval-based assumptions are presentin , the interval-based assumptions

M

must b e applied on the minimal closure of M with resp ect to the p oint-based assumptions. This is quite 21

intuitive, since values at ticks in di erent temp oral typ es can b e derived by conversion using values not

present in the original mo dule, but may b e implied by p oint-based assumptions. For example, supp ose

the mo dule contains the value of an attribute for the rst day of the year and no value for the other

days. Moreover, supp ose we know that that attribute p ersists and that it is liquid. If we ask what is

its value for the whole year, we can answer only by applying rst p ersistence to derive that value for

eachday of the year, and then by applying the liquidity to derive the value for the whole year.

The answer to a query in the presence of b oth p oint-based and interval-based assumptions is de ned

by combining the de nition of the query answering with p oint-based assumptions and interval-based

assumptions. That is, rst, the closures of the temp oral mo dules are obtained by p oint-based assump-

tions. Then the temp oral mo dules are changed, byinterval-based assumptions, into mo dules in terms

of the typ es requested by the given query. The formal de nition of answer can b e given as follows:

Q[DB ] where Q[DB ; [ ]=

I P

S

T

 DB = fM j M = M ;; M is the minimal closure of M 2 DB and  2Vg, and f 

i; i; i i i I

I 2

M

i

 Q is the the query obtained bychanging each M x ;:::;x ;ttoM x ;:::;x ;tin Q, where 

i 1 k i; 1 k

is the typ e of t.

The notion of assumption-captured version can b e de ned easily.To obtain assumption-captured

versions, one only needs to apply Algorithm 1 rst and then Algorithm 2.

F

Theorem 4 Let b e a set of p oint-based and/or interval-based assumptions and Q an MQL query.

Then, Algorithm 1 and 2, applied in this order, yield a -captured version of Q.

Example 4 Let DB =fACCOUNTS, RISKSg, where ACCOUNTS and RISKS are temp oral mo dules of Ex-

ampls 1 and 2 resp ectively,=fP ;P ;I ;I g where

1 2 1 2

per sis

P = P AccHol; Balance; AnIntRate ,

1 AcctNo

av g

P = P EstRisk ,

2 AccHol

l last

I = I AccHol; AnIntRate ; Balance , and

1 AcctNo

l

I = I EstRisk .

2 AccHol

Consider the query

Q = fx; t : month j9y; z; w; r ACCOUNTSx; y; z; w; t ^ z<1000 ^ RISKSy; r;t ^ r>3g

This query asks for accounts having a balance under $1,000 and such that the estimated risk for their

account holder is greater than 3. The query asks for these accountnumb ers in terms of months. This

means that only the accounts such that those conditions are veri ed on an interval corresp onding to

a month would b e returned in the answer. The answer Q[DB ; ] can b e obtained applying rst the

p oint-based assumption P and P resp ectively on the two mo dules in DB . Then, applying the interval-

1 2

based assumption I and I resp ectively to the resulting mo dules, they will b e converted in terms of

1 2

typ e month. The application of P and P ensures that there will b e at least one tuple resp ectively for

1 2

each second in ACCOUNTS and for eachdayin RISKS. Ab out interval-based assumptions, since Balance

is converted using last and the other attributes in ACCOUNTS are liquid, the resulting MONTHLY-ACCOUNTS

has the following windowing function:

'i= f1001; J:Smith; 3500; 4:00; 1500; A:Brady; 1500; 3:00; 2034; T:Ford; 500; 2:50g

for April of 1993 i  June of 1993 and

'i= f1001; J:Smith; 4000; 4:00; 1500; A:Brady; 1500; 3:00; 2034; T:Ford; 500; 2:50g

for i  July of 1993. Similarly, the resulting MONTHLY-RISKS has windowing function

0

' j = fSmith; 3; Ford; 3:5g for March of 1993 j  May of 1993, and empty for other ticks. 22

The answer can now b e computed evaluating Q on this new database and in this case it is a mo dule

whose windowing function has the only value 2034 for ticks corresp onding to April and May of 1993.

Theorem 4 says that we don't need to compute this new database, but we can simply mo dify the query,

yielding the same answer.

2

8 Safety

F

As in the traditional relational calculus, we require that MQL queries b e \domain indep endent" [Ull88 ].

F

Since scalar functions are used in MQL queries, the notion of domain indep endence needs to b e extended.

F

Here we adopt the notion of embedded domain independence of [EMHJ93]. Intuitively,anMQL query is

emb edded domain indep endent if its answer only dep ends on the embedded domain, where the emb edded

domain consists of a the values that app ear in the database, called active domain, and b the values

from a b ounded numb er of function applications on the active domain.

De nition Given a set of assumptions, a query Q is embedded domain independent wrt if the

non-temp oral values in the answer of Q with assumptions are from the emb edded domain.

F

Emb edded domain indep endence for MQL is undecidable. Wethus follow [Ull88 ] to give a syntactic

F F

6

restriction on allowable MQL queries. An MQL formula is said to b e safe if it satis es the four

conditions given on page 153 of [Ull88] mo di ed as follows: First, we ignore the temp oral variables

app earing in the formula. Variables and functions of the di erent sorts are in fact strictly separated.

That is, the four conditions [Ull88, p.153] only apply to non-temp oral variables. Second, wehaveto

takeinto account the fact that we allow mathematical functions. Sp eci cally, for condition 3 of [Ull88,

p.153], we add the fact that a data variable x is also limited if it is assigned to the result of a function

applied on limited variables, i.e., if x = f y ;:::;y  is a subformula, then x is limited when y , :::,

1 m 1

y are all limited.

m

F

An MQL query fx ;:::;x ;t: jg is said to b e safe if the formula  is safe.

1 k

F

It is easily seen that a safe MQL query is emb edded domain indep endent. This in particular

means that if the set of data non-temp oral values app earing in the database is nite, then the set of

non-temp oral values app earing in the answer of a safe query is also nite.

Consider now the answer of a query with a set of semantic assumptions. A safe query do es not

guarantee emb edded domain indep endence unless the assumptions satisfy certain conditions.

De nition A p oint-based resp. interval-based semantic assumption P resp. I  is said to b e safe if

M;

M

is safe for any temp oral mo dule scheme M resp. is safe for any temp oral mo dule scheme M and

P

I

typ e  .

Persistence and heredity assumptions are easily seen as safe p oint-based and interval-based assump-

F

tions, resp ectively. Since a semantic assumption can b e represented as an MQL query, its safety implies

emb edded domain indep endence .

The following prop osition formally states that the condition on the safety of assumptions is sucient.

Prop osition 6 If Q is a safe query and a set of safe assumptions, then Q is emb edded domain

indep endent with resp ect to .

6

A more elab orate criterion called \emb edded allowed" app eared in [EMHJ93]. 23

We conclude, by the following prop osition, that to ensure the emb edded domain indep endence of

the query that we are evaluating, it is sucienttocheck the safety of the original query if all the

assumptions are assumed to b e safe.

Prop osition 7 Let Q b e a query and a set of assumptions. Supp ose each assumption in , as well

0

as Q, is safe. Then Q , obtained by the transformation from Q according to , is emb edded domain

indep endent.

0

Proof. By Theorem 4, Q [DB ]= Q[DB ; ]. By Prop osition 6, Q is emb edded domain indep endent wrt

0

. Hence, Q is emb edded domain indep endent. 2

Unlike the traditional relational databases, however, the niteness of non-temp oral values app ear-

ing in an answer do es not guarantee the \ niteness" of the answer. Many natural semantic assumptions

lead to in nite answers. In Example 3, the answer to the query Q = fx; t j9y; z; w ACCOUNTSx; y; z; w; t^

z<1000g, asking for the accounts and the times that have a balance under $1,000, is in nite. Indeed,

the windowing function ' of the resulting mo dule is non-empty'i= f2034g for an in nite number

of ticks. Note that the numb er of di erent non-temp oral values in the answer is nite.

The ab ove in nite answer, however, shows a particular characteristics, namely, for each time i which

is after a certain xed time t , the value set of tuples asso ciated with i 'i is always the same.

max

We call such an in nite temp oral mo dule \eventually uniform."

De nition Wesay that a mo dule is eventual ly uniform if after a certain tick its windowing function

always gives the same value. Formally, a temp oral mo dule M =R; ; 'iseventually uniform if there

exist t such that 'i= 't  for each i> t .

max max max

Note that a representation equivalenttoaneventually uniform mo dule is used in almost all temp oral

extensions of relational databases. A typical way of representing the eventual uniformity is asso ciating

to a value/tuple the interval [c , 1], where c is a time constant.

1 1

F

In this section, we show that every at-MQL query gives eventually uniform answers if the temp oral

mo dules in the database are all eventually uniform and each assumption is expressed using a at-

F F

MQL formula. We start, however, with a more general notion and result for MQL queries.

De nition Wesay that a temp oral mo dule is 1st-order nitely partitioned 1fp-module for short if

there always exists a of its ticks into a nite numb er of sets such that its windowing function

always gives the same nite set of tuples within these sets. Moreover each such set of ticks can b e

sp eci ed by a rst-order formula with only temp oral variables and the predicates IntSec and <.

F

Theorem 5 The answer to an emb edded-domain indep endentMQL query on a database with as-

sumptions is always 1st-order nitely partitioned if each mo dule in the database is 1st-order nitely

partitioned.

F

Wenow show that the at-MQL language enjoys a b etter prop erty regarding query answers.

F

Theorem 6 Let b e a set of assumptions expressed in terms of at-MQL formulas. The answer

F

to an emb edded-domain indep endent query in at-MQL on a database with assumptions is always

eventually uniform if each temp oral mo dule in the database is eventually uniform. 24

Moreover, the value t of the tick after which the answer is always the same can b e computed for

max

each query Q and input mo dules M ;:::;M with assumptions . Indeed, the whole pro cess describ ed in

1 k

the pro of of Theorem 6 is a syntactical pro cess from which t can b e derived.

max

F

To conclude this section, we mention that Theorem 6 do es not hold for MQL queries. Consider

a temp oral mo dule R; day;', where R = hNameOnDutyi, which sp eci es the p erson on duty for each

day. The p ersistence assumption is assumed on the mo dule, i.e., if no one is sp eci ed to b e on dutyon

a sp eci c day, the last p erson on duty should continue. Now consider the following query:

fx; i : dayjMx; i ^9j : monthIntSec j; i ^:9i : dayIntSec j; i  ^ i

month;day 2 month;day 2 2

This query asks the list of p ersons and the days who are on duty on rst day of month. Clearly, the

result mo dule is usually not eventually uniform. Indeed, the result mo dule alternates a non-empty tick

i.e., 'i 6= ; with a sequence of empty ticks i.e., 'i= ;, up to the in nite.

From the results of this section, we conclude that when a temp oral database involves only one

temp oral typ e, then we need only eventually uniform temp oral mo dules. This justi es the choice

researchers have made in most temp oral database literature. On the other hand, if more than one typ es

are involved, then eventual uniformity do es not seem to b e enough. Wemayhave to resort to rst-order

nite partitioning. Ultimate p erio dicity such as [KSW90] may b e a go o d candidate as a sp ecial case

of rst-order nite partitioning.

9 Conclusion

This pap er intro duced the notion of semantic assumptions for temp oral databases. Semantic assump-

tions allow a compact representation of p otentially in nite temp oral data; they can b e used to compute

values not explicitly given and to convert data from one temp oral typ e to another. While some of

these assumptions have b een extensively used in the literature on temp oral databases, they were not

adequately formalized. This pap er gives such a formalization considering twovery general classes of

assumptions: a p oint-based assumptions used on temp oral databases with a single temp oral typ e; and

b interval-based assumptions that are sp eci c for databases allowing di erent temp oral typ es.

A temp oral database with a set of assumptions is equivalent to a larger sometimes in nite database

where values implied by assumptions are made explicit. We de ned as the minimal closure of the

database the minimal of these databases with explicit values. If each assumption involves all the

attributes of the scheme to which it is applied, then this minimal closure is unique. This condition can

b e seen as a limitation, but it can b e relaxed for the purp ose of answering queries. One option is to

intro duce \don't care" values into the query language. For example, the query language could allowto

talk ab out part of a temp oral mo dule: if M is a temp oral mo dule on a scheme with three attributes, the

formula could contain Mx; ;y;t, where  is a don't care. In this case, the \don't care" may not have

avalue. This can b e a straightforward extension of our work.

F

Another interesting issue concerns the expressiveness of the query language. Even though MQL is

apowerful language, there are certain assumptions that cannot b e sp eci ed. An example of derivation

metho d that cannot b e sp eci ed is weighted sum, where a value is computed as the weighted sum of a

F

set of values using as weights the ticknumb ers that attach to the values of the set. Indeed, MQL do es

not allow terms of temp oral sorts as arguments of functions.

F

There are also interval-based assumptions that cannot b e expressed in MQL . Consider a summation

conversion metho d deriving an attribute value at a tick of time as the sum of the values at ticks of ner

temp oral typ es contained in it. It is imp ossible to express this metho d without having a b ound on the 25

numb er of ticks contained in each coarser tick. Hence, to represent this kind of metho ds we require an

additional constraint on the set of typ es used in the database:

Given an arbitrary typ e  in the set, it exists a number k b ound such that each tickof

 contains at most k ticks of each other typ e in the set.

In the case of summation this condition ensures that there is a b ound to the number of values that will

b e summed. Observe that the ab ove requirement is satis ed for many sets of typ es of practical utility.

F

There are other interval-based assumptions that cannot b e expressed in MQL .For example, con-

sider a conversion function saying that values in terms of months can b e derived from values in terms

of years by dividing it by 12. The e ect of an assumption using this conversion should b e limited to

F

months and years, but there is no way to do so using MQL .To takeinto account this kind of conversion

function it is necessary to intro duce the notion of an interval-based assumption restricted to a subset

of the temp oral typ es of the database. Extending the expressiveness of our query language along this

direction should not a ect the results rep orted in the pap er.

In addition to the formalization of semantic assumptions, a ma jor contribution of the pap er is a

simple metho d to transform a query on a temp oral database with assumptions into a new query on the

same database but without assumptions, so that it is not necessary to compute the minimal closure

of each mo dule in the database to obtain the answer. The same algorithms can probably b e used to

check constraint satisfaction on temp oral databases with assumptions. For example, dynamic integrity

constraints are sp eci ed in [Cho92] using Temp oral Logic TL; it is easy to show that every formula

F

in TL can b e translated into an equivalent formula in MQL .

References

[AKPT91] J. F. Allen, H. Kautz, R. Pelavin, and J. Tenenb erg. Reasoning about Plans. Morgan-

Kaufman, 1991.

[CCT94] James Cli ord, Alb ert Croker, and Alexander Tuzhilin. On completeness of historical re-

lational query languages. ACM Transactions on Database Systems, 191:64{116, March

1994.

[Cho92] J. Chomicki. History-less checking of dynamic integrity constraints. In F. Golshani, editor,

Proceedings of the International Conference on Data Engineering,volume 8, pages 557{564,

Los Alamitos, CA, February 1992. IEEE Computer So ciety Press.

[Cho94] Jan Chomicki. Temp oral query languages: A survey. In D.M.Gabbay and H.J. Ohlbach,

editors, Temporal Logic, First International Conf.,volume 827 of LNAI, pages 506{534,

Bonn, Germany, 1994. Springer-Verlag.

[CI94] J. Cli ord and T. Isakowitz. On the semantics of bitemp oral variable databases. In

M. Jarke, J. Bub enko, and K. Je ery, editors, Proceedings of 4th International Conference

on Extending Database Technology, pages 215{230, March 1994.

[Co o72] C.D. Co op er. Theorem proving in arithmetic without multiplication. Machine Intel ligence,

7:91{99, 1972.

[CT85] J. Cli ord and A.U. Tansel. On an algebra for historical relational databases: Two views. In

S. Navathe, editor, Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 247{265,

Austin, TX, May 1985. ACM. 26

[CW83] J. Cli ord and D.S. Warren. Formal semantics for time in databases. ACM Transactions on

Database Systems, 82:214{254, June 1983.

[DM87] T. Dean and D. V. McDermott. Temp oral data base management. Arti cial Intel ligence

Journal, 32:1{55, 1987.

[EMHJ93] M. Escobar-Molano, R. Hull, and D. Jacobs. Safety and translation of calculus queries with

scalar functions. In Proc. ACM Symp. on Principles of Database Systems, pages 253{264,

Washington, DC, May 1993.

[FR79] J. Ferrante and C. W. Racko . The computational complexity of logical theories. Number

718 in Lecture Notes in Mathematics. Spinger-Verlag, 1979.

[GHR94] D. M. Gabbay, I. Ho dkinson, and M. Reynolds. Temporal Logic: Mathematical founda-

tions and computational aspects,volume 1 of Oxford Science Publications. Clarendon Press,

Oxford, 1994.

[JSS93] C. S. Jensen, M. D. So o, and R. T. Sno dgrass. Unifying temp oral data mo dels via a con-

ceptual mo del. Technical Rep ort TR 93-31, Department of Computer Science, Universityof

Arizona, Tucson, AZ, 1993.

[KSW90] F. Kabanza, J-M. Stevenne, and P.Wolp er. Handling in nite temp oral data. In Proc. ACM

Symp. on Principles of Database Systems, Nashiville, Tennessee, April 1990. ACM.

[NS92] M. Niezette and J. Stevenne. An ecient symb olic representation of p erio dic time. In

First International Conference on Information and Know ledge Management, Baltimore, MD,

Novemb er 1992.

[Sho87] Yoav Shoham. Temp oral logics in AI: Semantical and ontological considerations. Arti cial

Intel ligence Journal, 33:89{104, February 1987.

[Sno84] R. Sno dgrass. The temp oral query language TQuel. In Proc. ACM Symp. on Principles of

Database Systems, pages 204{212, Waterlo o, Ontario, Canada, April 1984.

[SS87] A. Segev and A. Shoshani. Logical mo deling of temp oral data. In U. Dayal and I. Traiger,

editors, Proceedings of the ACM SIGMOD Annual Conference on Management of Data,

pages 454{466, San Francisco, CA, May 1987. ACM, ACM Press.

[Tan87] A.U. Tansel. A statistical interface for historical relational databases. In Proceedings of the

International Conference on Data Engineering, pages 538{546, Los Angeles, CA, February

1987. IEEE Computer So ciety, IEEE Computer So ciety Press.

+

[TCG 93] A. Tansel, J. Cli ord, S. Gadia, S. Ja jo dia, A. Segev, and R. Sno dgrass, editors. Temporal

databases: theory, design, and implementation. Benjamin/Cummings, 1993.

[Ull88] J. D. Ullman. Priciples of Database and Know ledge-base systems. Computer Science Press,

1988.

[WBBJ94] X. Wang, C. Bettini, A. Bro dsky, and S. Ja jo dia. Logical design for temp oral databases

with multiple temp oral typ es. Technical Rep ort ISSE-TR-94-111, ISSE Department, George

Mason University, 1994. 27

[WJL91] G. Wiederhold, S. Ja jo dia, and W. Litwin. Dealing with granularity of time in temp o-

ral databases. In Proc. 3rd Nordic Conf. on Advanced Information Systems Engineering,

Trondheim, Norway,May 1991.

[WJS93] X. Wang, S. Ja jo dia, and V.S. Subrahmanian. Temp oral mo dules: An approachtoward fed-

erated temp oral data bases. In Proceedings of 1993 ACM SIGMOD International Conference

on the Management of Data,Washington, D.C., 1993. 28

App endix

A.1 Pro of of Prop osition 1

Proof. Let M b e a temp oral mo dule on R; . Since each assumption P ininvolves all the attributes

in R, f M  is a temp oral mo dule on R; by de nition. Assume f M = R; ; '  for each P in

P P P

S

and let M =R; ; ', where ' is de ned as follows: For each p ositiveinteger i, 'i= ' i. It is

P

P 2

M is a minimal closure of M under . Indeed, it is clear that M is a closure of M under easily shown that

. It is also clear that M is a minimal closure of M under since, if a tuple t is dropp ed from 'i for

some i, then ' i 6 'i for some P 2 and M is no longer a closure. Wenow show that this minimal

P

closure is unique. Supp ose by contradiction that there exists a temp oral mo dule M =R; ; '  such

1 1

M and M is a minimal closure of M under . By de nition, f M   M for each P 2 . that M 6=

1 P 1 1

S

Hence, ' i ' i for each P 2 and ' i ' i for each p ositiveinteger i. By the

P 1 P 1

P 2

de nition of M ab ove, it follows that 'i ' i and we conclude that M  M . This contradicts to

1 1

the fact that M 6= M and M is a minimal closure of M under . 2

1 1

A.2 Pro of of Prop osition 2

meth meth

meth meth

k k

1 1

 such that W Y for  and P = P W :::W Proof. Let P = P Y :::Y

j j 2 X 1 X

1 1

k k

each1  j  k . We show that P j= P . Let M =R; ; ' b e a temp oral mo dule and M =

1 2

' b e a closure of M under P . We only need to show that M is a closure of M under P . R; ;

1 2

T 0

Since M is a closure of P ,wehave f M    M . Let f M = R; ; ' , and for each

1 P P

1 1

XY Y

1

k

Y Y Y 0

1  j  k , let meth X; Y ;M= XY ;;' . By de nition, ' i ./  ./' i= ' i and

j j j

j 1

k

00 0

'i for each p ositiveinteger i.Furthermore, let f M = R; ; '  and meth X; W ;M= ' i

P j j

2

W 00 W W

XW ;;'  for each1 j  k . By de nition, ' i= ' i ./  ./' i for each p ositiveinteger

j

j 1

k

M is a closure of M under P , it suces now to show i. Let i b e a p ositiveinteger. To show that

2

00 T

that ' i  'i. By de nition again, meth X; W ;M=  meth X; W ;M =

XW W j j j j

1

k XW

j

T 00 T Y W

meth X; W ; i. Hence, ' i= M  =  i=  ' meth X; Y ;M. Therefore, '

j j XW j j

j j j

XW XW

j j

Y Y 00

 ' i ./  ./ ' i. Since Y , :::, Y are pairwise disjoint, it is clear now that ' i=

XW XW 1 k

1

1

k k

Y Y 0

'i. 2  ' i ./  ./' i =  ' i  

XW W XW W XW W

1 1 1 1

k k k

k

A.3 Pro of of Theorem 1

Proof.

Soundness

per sis per sis

Let Z X .We only need to prove that P V  is implied by P Y  where V =X n Z  [ Y .

Z X

per sis

[Indeed, if this is the case, by Prop osition 2, we know that P W , where W V , is implied by

Z

per sis

P Y .] In order to show this, let M =R; ; ' b e an arbitrary mo dule and M a closure of M

X

per sis T

 under P Y . Let V =X n Z  [ Y . By de nition, wehave f M    M , and

per sis

X

P Y 

XY

X

T

we need to prove that f per sis M    M . By the sp eci cation of the p ersistence semantics,

P V 

ZV

Z

0 0

f M = XY;;'  where, for each i  1, ' i=  'i [  S where S = ft j9j j<

per sis

XY XY Z Z

P V 

Z

i; t 2 'j  and 8k; t j

per sis

j k k k j

P Y 

X

00 00

XY;;'  where, for each i  1, ' i=  'i [  S and S is de ned as S with X in place

XY XY X X Z

of Z . Since Z X , it is easily seen that t [Z ] 6= t [Z ] implies t [X ] 6= t [X ] and hence, S S .

k j k j Z X

T

 Therefore, f M   f M    M .

per sis per sis

P V  P Y 

XY

Z X

Completeness 29

per sis

Let P = P W  b e a p ersistence assumption and a set of p ersistence assumptions. Assume that

Z

per sis

for each P Y  in , either Z 6 X ,or Z X but W 6 X n Z  [ Y .To prove the completeness,

X

we only need to show that there exists a temp oral mo dule M such that a closure of M under is not a

closure of M under P . Let R b e a relation scheme including all attributes app earing in the assumptions

that we are considering, and  an arbitrary typ e with at least two non-empty ticks. We de ne the

mo dule M =R; ; ' with '1 = f1;:::;1g, 'i= ; for each i  3 and '2 de ned as follows: For

per sis

each assumption P Y  2 , if Z 6 X , '2 includes the tuple t with t[X ] = 1 and t[R n X ]= 0.

X

0 per sis

0 0

For each assumption P = P Y  in , assume f M = XY;;' . Wenow give a minimal

X

P P

M =R; ; ' b e the temp oral mo dule on R;  whose windowing function closure of M under . Let

' is de ned as follows: '1 = '1 and for each i  2,

0 0 0 per sis 0

0

* i and t [R n XY ]= 0g. 'i='i [ft j9P = P Y  2 ; t [XY ] 2 '

X P

0

0

M is a closure of M since M  M and f M  for each P = M     It is easy to verify that

XY

P

per sis

P Y  in . Let f M = ZW;;'  and t b e the tuple on ZW with t[A] = 1 for each A 2 ZW .

X P P

0

It is easily seen that t is in ' 2. Indeed, by the construction of M , for each tuple t in '2, there

P

0 per sis 0 0

exists P = P Y  in with Z 6 X such that t [A]= 1 if A 2 X and t [A] = 0 otherwise. Since

X

0 0

t[A] = 1 for each A 2 Z ,itisnow clear by such a construction that t [Z ] 6= t[Z ] for each t 2 '2.

per sis

Hence, by p ersistence, t is derived from the tuple 1;:::;1 in '1 by the p ersistence P = P W ,

Z

i.e., t is in ' 2.

P

per sis

Wenow prove that M is not a closure of M under P W by showing that t 62  '2. In

Z ZW

0 per sis

order to do this, since t 62  '2, by *, we need only to show that for each P = P Y  2 ,

ZW X

0 0 0 per sis 0

0

'2 such that t [XY ] 2 ' 2 and t [ZW ]= t. Let P = P Y bein there do es not exist t 2

X

P

. Two cases arise: a Z 6 X and b Z X but W 6 X n Z  [ Y . Consider a. Note that by the

0

construction of '2 and the p ersistence assumption, ' 2 =  '2, i.e., no tuple is derived from

XY

P

the tuple 1;:::;1 in '1, and hence, we know that t 62  '2. Consider b. In this case, there

ZW

0

0

exists an attribute A 2 W such that A 2 R n XY . Let t be in ' 2. This tuple is either already

P

0

in '2 in this case, t [ZW ] 6= t as we reasoned ab ove, or it is derived from the tuple 1;:::;1 in

0 0 0

'1 by P . In this latter case, we know that t [XY ]=1;:::;1 and t [R n XY ]= 0;:::;0. Since

0

t[ZW ]=1;:::;1 and there exists A 2 W such that A 2 R n XY , it follows that t [ZW ] 6= t. 2

A.4 Pro of of Theorem 2

Proof. Let Q b e a correctly-typ ed query on a database DB with a collection of p oint-based assump-

F

0

tions expressed in flat MQL , and Q the query obtained by Algorithm 1 with Q and as inputs.

0

In order to prove the theorem, we show that Q [DB ]=Q[DB ; ]. Supp ose DB = fM ;:::;M g.

1 k

Q[DB ]= Q[M ;:::;M ], where each M is the minimal We know that, by de nition, Q[DB ; ] =

1 k i

F

closure of M . The query Q,asany query in MQL , has the form fx ;:::;x ;t:  j x ;:::;x ;tg.

i 1 n 1 n

However, for simplicity of the pro of, we generalize the query to have more than one free temp oral

variable. Hence, Q has the general form: fx ;:::;x ;t :  ;:::;t :  j x ;:::;x ;t ;:::;t g.

1 n 1 1 m m 1 n 1 m

We use induction on the structure of the formula . Since the transformation is only concerned

with atomic formulas of the typ e M , the case when = M x ;:::;x ;t where M is an ar-

1 1 n 1

bitrary mo dule in DB is the crucial step of the pro of. In this case, to evaluate Q on the clo-

sure of DB , i.e. the closure of M , it means considering M  instead of M  as a predicate in

1 1 1

0

Q. Hence, Q[DB ]= fx ;:::;x ;t j M x ;:::;x ;tg = M . From the algorithm wehave: Q =

1 n 1 1 n 1

M M

fx ;:::;x ;t j x ;:::;x ;tg where P , :::, P are all the assumptions x ;:::;x ;t _ ::: _

1 n 1 n 1 l 1 n

P P

1

l

0

Q[DB ]= Q [DB ] in this case. By de nition, in containing all the attributes in M.Wenow show that

M is the minimal mo dule having the same temp oral mo dule scheme as M such that f M for M  

1 1 P 1 1

i

F

each1 i  l .We know that the assumption mappings can b e expressed in flat MQL ; for example, 30

M

1

f M =fx ;:::;x ;t j x ;:::;x ;tg. By Prop osition 1, M is the temp oral mo dule whose

P 1 1 n 1 n 1

1

P

1

M M

windowing function is given by ' t=fx ;:::;x j x ;:::;x ;t _ ::: _ x ;:::;x ;tg

M 1 n 1 n 1 n

P P

1

1 l

for each time t, since the logical or _ corresp onds to the union op eration. From this fact it is clear

0

M and this concludes the that the evaluation of Q on DB gives exactly the same tuples presentin

1

pro of of the case when the atomic formula is M . Other cases of atomic formulas regard comparisons

1

among variables: x

i j i j i j i j ; i j

formulas, no transformation is done and the answer is indep endent from DB . Hence, the theorem holds

in these cases. Given that the theorem holds when is an atomic formula, we can easily apply the

induction step to complete the induction. However, the induction is trivial since the transformation

do es not a ect other constructs of a given formula. Wethus omit the induction step and conclude that

the theorem holds. 2

A.5 Pro of of Prop osition 5

Proof. To prove the prop osition, we rst argue that it is decidable if a typ e  can b e reached from

conv

typ e  via the interval-based assumption I = I Y , where the semantics of conv is sp eci ed bya

X

F

MQL formula template. Indeed, by condition 3 in the de nition of the semantics for conversion metho ds,

0 conv

atyp e  can b e reached from typ e  via I if and only if it can b e reached via I = I A , where A is a

X

single attribute in Y .Furthermore, by condition 4 in the same de nition, the typ e  can b e reached from

conv

typ e  via I A  if and only if there exists a temp oral mo dule M =XA;;' with the condition

X

M;

that t[X ] is a xed constant for all tuples t in the mo dule, and such that fx ;:::;x ;t:  j g is

conv

1 k

I A 

X

M;

not empty.For each predicate Mx ;:::;x ;t app earing in ,wemay simply replace it with

1 n conv

I A 

X

M t ^ x = a ^ ^ x = a assuming that the constant for X is a ;:::;a . Here, without

x 1 1 n1 n1 1 n1

n

loss of generality,we assume that A is the last attribute in M. Furthermore, replace any quanti er 9x

0 0 0 0 00 0 00

and 8x by 9M and 8M , and replace x = y by 8t M t  M t . Note that t

x x x y

0

are time variables, will b e unchanged in the formula. Call suchchanged formula , which is in monadic

0

second-order logic. It is easily seen that the formula is valid i the original is valid. It is known

[GHR94, page 558] that the validity of the formulas in this logic is decidable since the linear order we

M;

use is a discrete integer order. Hence, it is decidable if   can b e satis ed by a temp oral

conv

I A 

X

M;

g is always empty or not. Thus we conclude mo dule, i.e., whether the query fx ;:::;x ;t:  j

conv

1 k

I A 

X

conv

that, for a given arbitrary assumption I of the form I Y  and typ es  and  , it is decidable if 

X

can b e reached from  via I .

It is now easily seen that, for a given arbitrary assumption I and typ es  and  , it is decidable

if  can b e reached from  via I . [The reason is that the semantics of an interval-based assumption

that involve more than one conversion metho d, is the join of the semantics of each conversion metho d.]

From this, we conclude that the problem of whether a query is doable is decidable.

The lower b ound of the complexity of the decision pro cedure for the ab ove monadic second-order

cn

2

logic formula is 2 according to [FR79, page 3]. 2

A.6 Pro of of Theorem 3

Proof. Let Q b e a query on a database DB with a collection of interval-based assumptions expressed

F

0

in MQL , and Q the query obtained by Algorithm 2 with Q and as inputs. In order to prove the

0

theorem, we show that Q [DB ]=Q[DB ; ]. Supp ose DB = fM ;:::;M g. Let  ;:::; b e all

1 k 1 r

the temp oral typ es app earing in the database DB and assumptions in and in the query Q. Let 31

g, where each M is the union of the mo dules f M ;  ;:::;M ;:::;M DB = fM ;:::;M

i; I i j k; k; 1; 1;

r r

j 1 1

Q b e the query for eachinterval-based assumption I 2 involving all the attributes in M . Let

i

obtained bychanging each M x ;:::;x ;ttoM x ;:::;x ;tin Q, where  is the typ e of t. By

i 1 k i; 1 k

0

de nition, Q[DB ; ] = Q[DB ]. We only need to show that Q [DB ]= Q[DB ].

F

The query Q,asany query in MQL , has the form fx ;:::;x ;t :  j x ;:::;x ;tg.However,

1 n 1 n

for simplicity of the pro of, we generalize the query to have more than one free temp oral variable.

Hence, Q has the general form: fx ;:::;x ;t :  ;:::;t :  j x ;:::;x ;t ;:::;t g. We use

1 n 1 1 m m 1 n 1 m

induction on the structure of the formula . The induction pro of di ers from that of Theorem 2 only

for the case when = M x ;:::;x ;t :   where M is an arbitrary mo dule in DB . In this case we

1 k

have Q = fx ;:::;x ;t :  j M x ;:::;x ;tg that, evaluated on DB gives Q[DB ]= M .From the

1 k  1 k 

M; M;

0

x ;:::;x ;tg where I , algorithm wehave: Q = fx ;:::;x ;t :  j x ;:::;x ;t _ ::: _

1 k 1 1 k 1 k

I I

1

l

:::, I are the interval-based assumptions in containing all the attributes in M.Wenow show that

l

0

Q[DB ]= Q [DB ] in this case. By de nition, M is the mo dule having the same attributes and temp oral



T T

typ e as M and de ned as f M; [ ::: [ f M;. We know that the assumption mappings can b e

I I

1

l

M;

F

x ;:::;x ;tg. Hence, M is such expressed in MQL ; for example, f M;= fx ;:::;x ;t:  j

1 k  I 1 k

1

I

1

M; M;

that its windowing function 't= fx ;:::;x j x ;:::;x ;t _ ::: _ x ;:::;x ;tg for each

1 k 1 k 1 k

I I

1

l

0

time t.From this fact it is clear that the evaluation of Q on DB gives exactly the same tuples present

in M and this concludes the pro of of this case. The rest of the induction pro of is trivial as in the pro of



of Theorem 2 and thus omitted.

2

A.7 Pro of of Prop osition 6

Proof. If the query Q is on a generic database M ;:::;M the only predicates app earing in Q will b e

1 k

M  for any1 i  k . Each predicate instance corresp onds to the closure of a mo dule in the database

i

accordingly with the p oint-based assumptions or it corresp onds to a mo dule derived accordingly to

intervalor interval + p oint-based assumptions. Since the assumptions are assumed to b e safe, the

values in the minimal closures of these mo dules will b e either from the database or from function

application on values of the database. Hence, the syntactic conditions of safetyofQ and ensure that

the values resulting from the query are either from the database or from a b ounded numb er of function

applications on values in the database, and we can conclude that Q is emb edded domain indep endent

wrt . 2

A.8 Pro of of Theorem 5

Proof. Given a general query Q on a set of mo dules M ;:::;M with assumptions , by Theorem 4 we

1 k

0 0

know that its answer is the same as that of the query Q on M ;:::;M without assumptions, where Q

1 k

is obtained by the transformations of Algorithms 1 and 2 using assumptions . Hence, without loss of

generality,we will only show that for each query on a database without assumptions, its answer is

1st-order nitely partitioned.

For each 1fp-mo dule, since the numb er of tuples in a temp oral mo dule are assumed nite i.e., the

S

set 'i is nite, M can b e represented by the following formula

i1

 w ;:::;w =w = a ^ ::: ^ w = a ^ t _ :::_ w = a ^ ::: ^ w = a ^ t;

M 1 n 1 11 n 1n 1 1 r 1 n rn r

F

where each is a MQL formula involving only temp oral variables and temp oral predicates IntSec

j

and <. Each subformula in the disjunction identi es an n-tuple, and the formula t identi es the

j 32

set of ticks at which that tuple is valid. For example, the subformula w = a ^ w = b ^ 1 <

1 2

t ^ t< 4 _ t> 8 represents the tuple a; bin M as present at each tickbetween 1 and 4, and at

each tick greater than 8. Note that the number r of subformulas in the ab ove disjunction equals the

numb er of di erent tuples in a mo dule. Hence, has a nite length for each 1fp-mo dule M.

M

0 0

Let Q = fx :::x ;t j  x ;:::;x ;tg. Let  x ;:::;x ;t b e the formula obtained from   by sub-

1 m 1 m 1 m

0 0

stituting each predicate Mw ;:::;w ;t  app earing in it by the corresp onding formula  w ;:::;w ;t .

1 n M 1 n

0

All the terms in  are of one of the forms x  x , where 2 f=; 6=; +; ; ;=g and t  t where

i j l m

0 00

2 f=; 6=;<;>g. It is always p ossible to transform  into a formula  having the following structure:

00 00 00

 x ;:::;x ;t=  x ;:::;x  ^ t _ ::: _  x ;:::;x  ^ t:

1 k 1 k 1 1 k s

1 s

00

Each  x ;:::;x  is a rst-order formula on the data domain with free variables x ;:::;x and

1 k 1 k

j

involving only terms x  x . It intuitively identi es a set of tuples in the answer. Each tis

i j j

a rst-order formula on the temp oral domain with the only free variable t and involving only terms

00

t  t .Itintuitively identi es the set of ticks at which the tuples sp eci ed in  are in the answer.

l m

j

This representation of the answer satis es the de nition of 1st-order nitely partitioned mo dule. Indeed,

0

there are a nite numb er of sets of ticks, each identi ed by a 1st-order formula t, forming a partition

j

of the target typ e and such that the value of the windowing function is the same for each set. 2

A.9 Pro of of Theorem 6

F

Proof. From the pro of of Theorem 5, the answer of a MQL query takes the form:

00 00 00

 x ;:::;x ;t=  x ;:::;x  ^ t _ ::: _  x ;:::;x  ^ t:

1 k 1 k 1 1 k s

1 s

Since now each t only involves one typ e of temp oral variables and no IntSec predicate is used, we

i

0

may apply quanti er elimination pro cedures [Co o72] to reduce each to : a conjunction of temp oral

i

i

0

terms with t as the only free variable. intuitively identi es a nite set of intervals on time at which

i

00

=9y>0 x + y = x ^ x = 10 and the tuples must b e in the answer mo dule. For example, if 

1 2 2

j

0

t= t>8, the answer mo dule will contain at each tick after 8, all the p ossible tuples having 10 as

j

value of the second attribute and a value less then 10 for the rst attribute. Take t as the greatest

max

00 0

.For each tick j after t , the formula  identi es the same set temp oral constant app earing in all

max

i

of tuples as for tick t . Hence, the answer is eventually uniform, and this concludes the pro of. 2

max 33