Ensuring Consistency in Multidatabases by Preserving Two-Level Serializability

Sharad Mehrotra

W. Springfield Avenue, University of Illinois, Urbana, IL

Rajeev Rastogi, Henry F. Korth, Abraham Silberschatz

Lucent Technologies, Mountain Avenue, Murray Hill, NJ

Association for Computing Machinery, Inc., Broadway, New York, NY, USA


The concept of serializability has been the traditionally accepted correctness criterion in database systems. However, in multidatabase systems (MDBSs), ensuring global serializability is a difficult task. The difficulty arises due to the heterogeneity of the concurrency control protocols used by the participating local database management systems (DBMSs), and the desire to preserve the autonomy of the local DBMSs. In general, solutions to the global serializability problem result in executions with a low degree of concurrency. The alternative, relaxed serializability, may result in data inconsistency.

In this paper, we introduce a systematic approach to relaxing the serializability requirement in MDBS environments. Our approach exploits the structure of the integrity constraints and the nature of transaction programs to ensure consistency without requiring executions to be serializable. We develop a simple yet powerful classification of MDBSs based on the nature of integrity constraints and transaction programs. For each of the identified models, we show how consistency can be preserved by ensuring that executions are two-level serializable (LSR). LSR is a correctness criterion for MDBS environments weaker than serializability. What makes our approach interesting is that, unlike global serializability, ensuring LSR in MDBS environments is relatively simple, and protocols to ensure LSR permit a high degree of concurrency. Furthermore, we believe the range of models we consider covers many practical MDBS environments to which the results of this paper can be applied to preserve database consistency.

General Terms: Database consistency, Multidatabase Systems, Beyond serializability

Additional Key Words and Phrases: Concurrency control, heterogeneous database integration

Much of this work was done at the University of Texas at Austin with support from TARP under Grant ARP, the NSF under Grants IRI and IRI, and grants from the IBM and HP corporations.

This is a preliminary release of an article accepted by ACM Transactions on Database Systems. The definitive version is currently in production at ACM and, when released, will supersede this version.


INTRODUCTION

Databases are usually constructed to support a single enterprise. However, for many new application domains, there is a need to extend the database environment to include a broader range of users and to include several distinct databases within a common framework. These needs develop from the integration of departmental information systems within a corporation, from corporate mergers and acquisitions, from cooperative ventures involving independent corporations, etc.

Although current network technology allows one physically to support such integration, serious problems exist at the database system level:

The databases to be integrated may run on distinct database management systems. As a result, the data models (relational, object-oriented, hierarchical, etc.) may differ, and the application programs that access the databases may be written in distinct and possibly incompatible languages.

The data itself may be in distinct formats on each system. These distinctions may be in data types, physical data representation, or data semantics (units of measure, language, national or corporate conventions, etc.).

The organizations whose databases are being integrated may maintain a significant degree of autonomy. This may limit the degree of central control that can be imposed on the integrated system.

Since each database system environment represents an enormous investment in application development, it is usually not economically feasible for all of the databases to be converted to a single database management system. Issues of autonomy also inhibit such conversions.

A multidatabase system (MDBS) is a software system running on top of the individual database management systems (DBMSs) that manage the various databases to be integrated. The job of the MDBS is to present users with the illusion of a single, unified database environment, and to hide, to the extent possible, the fact that the environment consists of independent, geographically distributed sites, each running its own DBMS.

This paper focuses on one of the many issues in MDBS design: transaction management. Ideally, the MDBS should preserve the usual transactional properties of atomicity, consistency, isolation, and durability (Gray and Reuter) at the global level. Achieving this, however, is difficult because of the following two characteristics of MDBS environments:

Heterogeneity: Each local DBMS may follow different concurrency control and recovery algorithms.

Autonomy: It is not practically feasible to modify the underlying local DBMS software to facilitate integration.

Database consistency is traditionally ensured by requiring that the concurrent execution of transactions be serializable, that is, equivalent to a nonconcurrent execution (Papadimitriou). The problem of ensuring global serializability in an MDBS environment has been studied extensively (Breitbart and Silberschatz; Breitbart et al.; Georgakopoulos et al.; Batra et al.; Du et al.; Mehrotra et al.; Pu; Raz). A necessary condition for maintaining global serializability is that all global transactions (transactions that access data at multiple DBMSs) are serialized in the same order at all the sites at which they execute. The MDBS is limited in its power to ensure this property, since pre-existing applications in the local DBMSs can generate transactions that run entirely within that local DBMS. These transactions may generate indirect conflicts among global transactions, conflicts of which the MDBS is unaware. One way to guarantee the property is to make the pessimistic assumption that any two global transactions that execute at a common local DBMS conflict. This, however, results in low concurrency.

One way to overcome the problem of low concurrency is to relax the serializability requirement for MDBS environments. Numerous such approaches have been proposed (Wu et al.; Du and Elmagarmid; Rastogi et al.; Rastogi et al.; Mehrotra et al. a) and are discussed in the section that covers related work. Relaxing the serializability requirement, however, may result in a loss of database consistency, which for many database applications cannot be tolerated.

In this paper, we introduce a systematic approach to relaxing the serializability requirement in MDBS environments without jeopardizing database consistency. We develop a simple yet powerful classification for MDBSs based on the structure of the integrity constraints and the nature of the transaction programs present in the system. For each of the developed models, we show how consistency can be preserved by ensuring that executions are two-level serializable (LSR). Two-level serializability is a correctness criterion for MDBS environments, introduced in Mehrotra et al. a, that is weaker than serializability. What makes our approach interesting is that, unlike global serializability, ensuring that executions are LSR in MDBS environments is relatively simple, and protocols for ensuring LSR allow a high degree of concurrency. Furthermore, we believe that the range of models for which we show LSR executions preserve database consistency covers many practical MDBS environments to which the results of this paper can be applied.

The remainder of the paper is organized as follows. In the Preliminaries section, we establish the groundwork for our study: we formalize the notion of database consistency and develop the transaction and schedule model used in the rest of the paper. In the MDBS Model section, we describe the MDBS model to which our work is applicable. In the Two-Level Serializability section, we discuss the LSR correctness criterion for MDBS environments and mechanisms that can be used to ensure schedules are LSR. In the Correctness of LSR Schedules section, we develop a spectrum of MDBS models, based on the structure of the integrity constraints and the nature of transaction programs, for which we show LSR schedules preserve database consistency. We then present a financial application that can be captured by one of our MDBS models and thus demonstrates the utility of LSR. We next discuss related and previous work. Finally, we offer concluding remarks.

PRELIMINARIES

In a database system where serializability is ensured through some concurrency control scheme, database consistency can be maintained by simply requiring that each transaction individually maintain consistency. In such systems, the transaction manager need not be concerned with the specific nature of the consistency constraints. In contrast, for us to be able to relax the serializability requirements while maintaining consistency, we first need to establish what constitutes a consistent database state. In this section, we formalize the notion of database consistency and develop a model for transactions and schedules that is used in the remainder of the paper.

Database Consistency

A database consists of a set $D$ of data items. For each data item $d \in D$, $Dom(d)$ denotes the domain of $d$. For simplicity, we consider $Dom(d)$ to consist of numeric and string constants, although our results can be extended to richer domains (e.g., sets, lists). A database state maps every data item $d$ to a value $v$, where $v \in Dom(d)$. Thus, a database state can be expressed as a set of ordered pairs of data items in $D$ and their values: $DS = \{(d, v) : d \in D \text{ and } v \in Dom(d)\}$. $DS$ has the property that if $(d, v_1) \in DS$ and $(d, v_2) \in DS$, then $v_1 = v_2$. This specifies that each data item has a unique value.

Integrity constraints, denoted by $IC$, in a database distinguish inconsistent database states from consistent ones. One way to formalize the notion of integrity constraints is to consider them as a subset of all the possible database states; a database state is consistent if it belongs to that subset (Papadimitriou). An equivalent formulation is to consider integrity constraints as a conjunction of first-order logic formulae over a language consisting of the following:

Numerical and string constants (e.g., Jim).

Functions over numeric and string constants (e.g., max).

Comparison operators.

A set of variables (the data items in $D$).

Let $IC$ be the integrity constraints over the database system, and let $DS$ be a database state. $DS$ is consistent, denoted by $DS \models IC$, if the formula resulting from the replacement of every occurrence of each data item $d \in D$ in $IC$ by its value in $DS$ is satisfiable. For example, consider a database consisting of data items $a$ and $b$ and an integrity constraint $IC$ relating $a$ and $b$. A database state $DS_1$ whose values for $a$ and $b$ satisfy $IC$ is consistent, whereas a database state $DS_2$ whose values violate $IC$ is not consistent.
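The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a database state is modeled as a dictionary (so each data item has a unique value by construction), an integrity constraint is modeled as a predicate over states, and the constraint `a > b` together with all concrete values is hypothetical, not taken from the paper.

```python
from typing import Callable, Dict, Union

Value = Union[int, str]
DatabaseState = Dict[str, Value]            # each data item maps to exactly one value
Constraint = Callable[[DatabaseState], bool]


def is_consistent(ds: DatabaseState, ic: Constraint) -> bool:
    """Return True if the state satisfies the constraint (DS |= IC)."""
    return ic(ds)


# Assumed constraint, for illustration only: a must be strictly greater than b.
def ic_a_gt_b(ds: DatabaseState) -> bool:
    return ds["a"] > ds["b"]


ds1 = {"a": 5, "b": 3}   # hypothetical values that satisfy the constraint
ds2 = {"a": 1, "b": 3}   # hypothetical values that violate the constraint
```

With these assumed values, `is_consistent(ds1, ic_a_gt_b)` holds and `is_consistent(ds2, ic_a_gt_b)` does not.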

Finally, we associate the notion of consistency with a restriction of the database state $DS$ to data items in $d \subseteq D$, that is, with $DS^{d}$. We define $DS^{d}$ to be consistent if there exists a consistent database state $DS_1$ such that $DS_1^{d} = DS^{d}$. Note that even though $DS$ is not consistent, certain of its restrictions may be consistent. For example, consider a database consisting of data items $a$ and $b$ and an integrity constraint $IC$ relating $a$ and $b$, and let $DS$ be a database state that violates $IC$. Its restrictions $DS^{\{a\}}$ and $DS^{\{b\}}$ can nevertheless be consistent, provided each of the restrictions can be extended to a consistent database state. The following two lemmas relate the consistency of restrictions of a database state to the consistency of the database state.
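Consistency of a restriction can be checked by searching for a consistent extension. The sketch below assumes small finite domains so that a brute-force search is feasible; the constraint `a < b`, the domain `range(4)`, and the state values are hypothetical choices made only for illustration.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Sequence

State = Dict[str, int]


def restrict(ds: State, items: Iterable[str]) -> State:
    """Return the restriction of ds to the given data items."""
    wanted = set(items)
    return {d: v for d, v in ds.items() if d in wanted}


def restriction_is_consistent(partial: State,
                              all_items: Sequence[str],
                              domain: Iterable[int],
                              ic: Callable[[State], bool]) -> bool:
    """True if some assignment to the remaining items yields a consistent state."""
    free = [d for d in all_items if d not in partial]
    for values in product(list(domain), repeat=len(free)):
        candidate = {**partial, **dict(zip(free, values))}
        if ic(candidate):
            return True
    return False


ITEMS = ["a", "b"]
DOMAIN = range(4)          # assumed finite domain for both items


def ic(ds: State) -> bool:  # assumed constraint: a < b
    return ds["a"] < ds["b"]


ds = {"a": 2, "b": 1}       # violates the assumed constraint
```

Here `ds` itself is inconsistent, yet both `restrict(ds, ["a"])` and `restrict(ds, ["b"])` are consistent, because each can be extended to a state satisfying the assumed constraint.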

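Lemma 1. Let $IC = C_1 \wedge C_2 \wedge \cdots \wedge C_l$, where $IC$, $C_e$ are defined over data items in $D$, $d_e$ respectively, such that $d_e \cap d_f = \emptyset$ for all $e \neq f$. Let $d'_e \subseteq d_e$ and let $DS$ be a database state. Then $\bigcup_{e=1}^{l} DS^{d'_e}$ is consistent iff for all $e$, $e = 1, \ldots, l$, $DS^{d'_e}$ is consistent.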

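Proof.

($\Rightarrow$) If $\bigcup_{e=1}^{l} DS^{d'_e}$ is consistent, then for all $e$, $e = 1, \ldots, l$, $DS^{d'_e}$ is consistent. This follows directly from the definition of database consistency.

($\Leftarrow$) We now prove that if for all $e$, $e = 1, \ldots, l$, $DS^{d'_e}$ is consistent, then $\bigcup_{e=1}^{l} DS^{d'_e}$ is consistent. Since the $DS^{d'_e}$ are consistent, there exist consistent database states $DS_e$, $e = 1, \ldots, l$, such that $DS_e^{d'_e} = DS^{d'_e}$. Let $\widehat{DS}$ be a database state such that $\widehat{DS}^{d_e} = DS_e^{d_e}$, $e = 1, \ldots, l$ (such a $\widehat{DS}$ exists since $d_e \cap d_f = \emptyset$, $e \neq f$). Since $\widehat{DS}^{d_e} = DS_e^{d_e}$, $DS_e \models C_e$, and $C_e$ is defined only over data items in $d_e$, we have $\widehat{DS} \models C_e$, $e = 1, \ldots, l$. Thus $\widehat{DS} \models C_1 \wedge C_2 \wedge \cdots \wedge C_l$. Also, $\bigcup_{e=1}^{l} DS^{d'_e} \subseteq \widehat{DS}$. Thus there exists a consistent database state $\widehat{DS}$ whose restriction to $\bigcup_{e=1}^{l} d'_e$ equals $\bigcup_{e=1}^{l} DS^{d'_e}$. Hence $\bigcup_{e=1}^{l} DS^{d'_e}$ is consistent by the definition of database consistency.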

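Note that it is essential for the data items over which the conjuncts are defined to be disjoint if Lemma 1 is to hold. We show this by the following example. Let $IC = C_1 \wedge C_2$, where $C_1$ is a constraint over $a$ and $b$ and $C_2$ is a constraint over $c$ and $b$; thus $d_1 = \{a, b\}$ and $d_2 = \{c, b\}$. Let $d'_1 = \{a\}$ and $d'_2 = \{c\}$, and let $DS^{d'_1}$ and $DS^{d'_2}$ assign values to $a$ and $c$ such that each restriction is consistent on its own but no single value of $b$ satisfies both conjuncts. Thus, even though $DS^{d'_1}$ and $DS^{d'_2}$ are consistent, $DS^{d'_1} \cup DS^{d'_2}$ is inconsistent. Since $d_1 \cap d_2 \neq \emptyset$, the combined database state constructed in the proof of Lemma 1 may not exist.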
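The role of disjointness can be seen in a small brute-force experiment. The sketch below is an assumption-laden illustration, not the paper's own example: the conjuncts `a == b` and `c == b`, the domain `range(3)`, and the chosen values are all hypothetical. Both conjuncts mention `b`, so the item sets overlap.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Sequence

State = Dict[str, int]


def extendable(partial: State,
               all_items: Sequence[str],
               domain: Iterable[int],
               ic: Callable[[State], bool]) -> bool:
    """True if the partial state can be extended to a state satisfying ic."""
    free = [d for d in all_items if d not in partial]
    for values in product(list(domain), repeat=len(free)):
        if ic({**partial, **dict(zip(free, values))}):
            return True
    return False


ITEMS = ["a", "b", "c"]
DOMAIN = range(3)                      # assumed finite domain


def ic(ds: State) -> bool:             # assumed conjuncts sharing item b
    return ds["a"] == ds["b"] and ds["c"] == ds["b"]


part_a = {"a": 1}                      # restriction to {a}
part_c = {"c": 2}                      # restriction to {c}
union = {**part_a, **part_c}           # no value of b can equal both 1 and 2
```

Each restriction is consistent in isolation, but their union is not, which is exactly the situation Lemma 1 excludes by requiring disjoint item sets.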



In the following lemma, we show that if the restriction of a database state to the data items in every conjunct is consistent, then the database state itself must be consistent. In contrast to Lemma 1, the sets of data items over which the conjuncts are defined are not required to be disjoint.

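Lemma 2. Let $IC = C_1 \wedge C_2 \wedge \cdots \wedge C_l$, where $IC$, $C_e$ are defined over data items in $D$, $d_e$ respectively. Let $DS$ be a database state. $DS^{d_k}$ is consistent for all $k$, $k = 1, \ldots, l$, iff $DS$ is consistent.

Proof.

($\Leftarrow$) Follows directly from the definition of consistency.

($\Rightarrow$) Since $DS^{d_k}$ is consistent and $C_k$ is defined only over data items in $d_k$, $DS \models C_k$ for all $k$, $k = 1, \ldots, l$. Thus $DS \models C_1 \wedge C_2 \wedge \cdots \wedge C_l$. Hence $DS$ is consistent.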

It must be pointed out that the results presented in the paper do not necessarily require integrity constraints to be restricted to first-order logic formulae; the results would hold for integrity constraints expressed in an arbitrary language as long as Lemmas 1 and 2 hold for the language.

Transactions and Schedules

A transaction $T_i$ is a sequence of operations resulting from the execution of a transaction program $P_i$. A transaction program is usually written in a high-level programming language with assignments, loops, conditional statements, and other complex control structures. In practice, transaction programs access the database through embedded SQL. For simplicity, in our examples we use explicit read and write operations on data items. However, our results apply to any language in which transaction programs may be written. Execution of a transaction program is deterministic; that is, execution of a transaction program from the same database state always results in the same transaction. However, execution of a transaction program starting at different database states may result in different transactions. Formally, a transaction $T_i = (O_{T_i}, \prec_{T_i})$, where $O_{T_i} = \{o_1, o_2, \ldots, o_n\}$ is a set of operations and $\prec_{T_i}$ is a total order on $O_{T_i}$. An operation $o_i$ is a tuple $(action(o_i), entity(o_i), value(o_i))$. $action(o_i)$ denotes an operation type, which is either a read ($r$) or a write ($w$) operation. $entity(o_i)$ is the data item on which the operation is performed. If the operation is a read operation, $value(o_i)$ is the value returned by the read operation for the data item read. For a write operation, $value(o_i)$ is the value assigned to the data item by the write operation. For simplicity of the exposition, we assume that in each transaction a database item is read at most once and written at most once, and that no database item is read after it is written.

A schedule consists of a sequence of operations resulting from the concurrent execution of a set of transaction programs. A schedule $S = (\tau_S, \prec_S)$ is a finite set $\tau_S$ of transactions together with a total order $\prec_S$ on all operations of the transactions. Also, if for two operations $o_1$, $o_2$ in $S$ and some transaction $T_i$ we have $o_1 \prec_{T_i} o_2$, then $o_1 \prec_S o_2$.

T  S 

We use the notation $\{DS_1\}\, P_i\, \{DS_2\}$ to denote the fact that when a transaction program $P_i$ executes from a database state $DS_1$, it results in a database state $DS_2$. Similar notation is also used for schedules, transactions, and/or any arbitrary sequence of operations. For example, $\{DS_1\}\, S\, \{DS_2\}$ denotes that the execution of the schedule $S$ from the initial database state $DS_1$ results in the database state $DS_2$. Note that since operations have values associated with them, execution of a sequence of operations is possible only from certain legal database states. A database state $DS$ is legal with respect to operation $o_i$, denoted by $legal(DS, o_i)$, if it is possible to execute $o_i$ from $DS$. Thus, $legal(DS, o_i)$ if

either $action(o_i) = w$, or

if $action(o_i) = r$, then $(entity(o_i), value(o_i)) \in DS$.

A database state $DS$ is legal with respect to a sequence of operations $seq = o_1 o_2 \cdots o_p$ if it is possible to execute $seq$ from $DS$; that is, $legal(DS, seq)$ if

$legal(DS, o_1)$, and

if $p > 1$, then $legal(DS_1, o_2 \cdots o_p)$, where $\{DS\}\, o_1\, \{DS_1\}$.

Execution of a sequence of operations $seq$ from a database state which is not legal with respect to $seq$ is undefined. Therefore, when we write $\{DS_1\}\, seq\, \{DS_2\}$, we assume implicitly that the database state $DS_1$ is legal with respect to $seq$.
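The notions of an operation, a legal state, and the execution of a sequence can be sketched as follows. The names `Op`, `legal_op`, and `execute` are hypothetical helpers introduced only for this illustration, and returning `None` is an assumed way of signalling that execution from an illegal state is undefined.

```python
from typing import Dict, List, NamedTuple, Optional

State = Dict[str, int]


class Op(NamedTuple):
    action: str   # "r" for read, "w" for write
    entity: str   # data item the operation touches
    value: int    # value read or written


def legal_op(ds: State, op: Op) -> bool:
    """A write is always legal; a read is legal only if it returns the stored value."""
    return op.action == "w" or ds.get(op.entity) == op.value


def apply_op(ds: State, op: Op) -> State:
    """Return the state after op; reads leave the state unchanged."""
    new_state = dict(ds)
    if op.action == "w":
        new_state[op.entity] = op.value
    return new_state


def execute(ds: State, seq: List[Op]) -> Optional[State]:
    """Execute seq from ds, or return None if ds is not legal with respect to seq."""
    for op in seq:
        if not legal_op(ds, op):
            return None
        ds = apply_op(ds, op)
    return ds


# Hypothetical transaction for the program "b := a", run from a state where a = 1.
copy_a_to_b = [Op("r", "a", 1), Op("w", "b", 1)]
```

Running `execute({"a": 1, "b": 0}, copy_a_to_b)` yields a state in which `b` equals 1, whereas starting from a state with `a = 2` the sequence is not legal, because the recorded read value no longer matches.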

Insert and delete operations can be handled by the read/write model as follows. A valid bit is associated with each data item, and every insert/delete is then simply a write that sets/resets the valid bit for the item. Furthermore, phantoms can be handled using next-key locking (Mohan).

In distributed systems, schedules are generally defined to be a set of operations with a partial order defined on them. We consider schedules to be a set of operations with a total order defined on them. This total order can be assumed to be an arbitrary total order which is consistent with the original partial order.



Correctness of Transactions and Schedules

We assume that each transaction program, when executed in isolation, preserves database consistency. That is, for all database states $DS_1$ such that $\{DS_1\}\, P_i\, \{DS_2\}$, if $DS_1$ is consistent, then $DS_2$ is also consistent. Also, a pair of operations $o_i$ and $o_j$ in a schedule $S$ are said to conflict if $entity(o_i) = entity(o_j)$ and either $action(o_i) = w$ or $action(o_j) = w$. The serialization graph for a schedule $S$ consists of nodes corresponding to transactions in $S$ and an edge from $T_i$ to $T_j$ if and only if, for conflicting operations $o_i \in T_i$ and $o_j \in T_j$, $o_i \prec_S o_j$. A schedule $S$ is said to be serializable if its serialization graph is acyclic (Bernstein et al.).
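The serialization-graph test can be sketched directly from these definitions. In the sketch below a schedule is assumed to be a list of `(transaction, action, entity)` triples in execution order; operation values are omitted because they do not affect conflicts. The function names and the sample schedules are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Op = Tuple[str, str, str]   # (transaction id, action "r"/"w", entity)


def serialization_graph(schedule: List[Op]) -> Dict[str, Set[str]]:
    """Add an edge Ti -> Tj when an operation of Ti precedes a conflicting one of Tj."""
    graph: Dict[str, Set[str]] = {txn: set() for txn, _, _ in schedule}
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "w" in (ai, aj):
                graph[ti].add(tj)
    return graph


def is_acyclic(graph: Dict[str, Set[str]]) -> bool:
    """Depth-first search with three colors; a gray successor means a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for succ in graph[node]:
            if color[succ] == GRAY:
                return False
            if color[succ] == WHITE and not visit(succ):
                return False
        color[node] = BLACK
        return True

    return all(color[node] != WHITE or visit(node) for node in graph)


def is_serializable(schedule: List[Op]) -> bool:
    return is_acyclic(serialization_graph(schedule))


serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
interleaved = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
```

The first schedule has the single edge T1 to T2 and is serializable; the second has edges in both directions between T1 and T2 and is not.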

Traditionally, database consistency is guaranteed implicitly by ensuring that the schedule resulting from the concurrent execution of transaction programs is serializable. Since each transaction program maps a consistent database state to another consistent state, a serial (and hence a serializable) schedule preserves database consistency. Serializability is not only a sufficient condition for the preservation of database consistency, but is also necessary if arbitrary integrity constraints may be present and only syntactic or structural information about transactions (that is, the set of read and write operations that constitute transactions) is known (Kung and Papadimitriou). That is, given a nonserializable schedule $S$ containing transactions $T_1, T_2, \ldots, T_n$, it is always possible to construct a set of transaction programs $P_1, P_2, \ldots, P_n$ and associate a set of integrity constraints $IC$ such that, for some initial consistent database state $DS_1$, the concurrent execution of $P_1, P_2, \ldots, P_n$ results in schedule $S$ such that $\{DS_1\}\, S\, \{DS_2\}$, where $DS_2$ is inconsistent.

The necessity of serializability for preserving database consistency in the presence of arbitrary integrity constraints and transaction programs suggests that the key to relaxing the serializability requirement without jeopardizing consistency is to exploit either knowledge of the semantics of transaction programs or information about the nature of integrity constraints over the database. In a later section, we show how such information can be used to ensure consistency without requiring schedules to be serializable in MDBS environments.

Strong Correctness

So far, we have only considered the requirement that concurrent executions preserve database consistency. Preservation of consistency by itself may not be a sufficient consistency guarantee for applications. For example, it alone does not prevent read-only transactions from seeing an inconsistent state of the database. Executions that preserve database consistency but may result in transactions seeing an inconsistent state of the database may still be undesirable. To prevent such executions, we define the notion of strong correctness that, in addition to requiring schedules to preserve integrity constraints, also requires transactions in a schedule to read consistent data values.

In order to formally develop the notion of strong correctness, we need to introduce some notation. Let $seq = o_1 o_2 \cdots o_p$ be a sequence of operations. $read(seq)$ denotes the database state seen as a result of the read operations in $seq$. Formally,

$read(seq) = \{(y, z) : o \in seq,\ y = entity(o),\ z = value(o),\ action(o) = r\}$.

Definition. A schedule $S$ is strongly correct iff

for all consistent database states $DS_1$, if $\{DS_1\}\, S\, \{DS_2\}$, then $DS_2$ is consistent, and

for all transactions $t \in \tau_S$, $read(t)$ is consistent.

The mechanisms for ensuring correctness in MDBS environments developed in this paper not only ensure that database consistency is preserved, but also ensure that executions are strongly correct.
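The second condition of strong correctness concerns the view a transaction obtains through its reads. A minimal sketch, assuming operations are `(action, entity, value)` triples and assuming a hypothetical replication constraint `a == b`, might look as follows; the helper `view_is_consistent` is specific to that assumed constraint and is not a general consistency checker.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str, int]   # (action, entity, value)


def read_state(seq: List[Op]) -> Dict[str, int]:
    """Collect the (entity, value) pairs returned by the read operations in seq."""
    return {entity: value for action, entity, value in seq if action == "r"}


def view_is_consistent(view: Dict[str, int]) -> bool:
    """Assumed constraint a == b: a view is inconsistent only if it saw both
    items with different values; a view of one item can always be extended."""
    if "a" in view and "b" in view:
        return view["a"] == view["b"]
    return True


# Hypothetical read-only transaction that observes a and b at different moments.
torn_reads = [("r", "a", 1), ("r", "b", 2)]
clean_reads = [("r", "a", 1), ("r", "b", 1), ("w", "c", 5)]
```

A schedule containing the first transaction would not be strongly correct under the assumed constraint, even if the final database state were consistent.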

MDBS MODEL

A multidatabase system (MDBS) consists of a heterogeneous set of autonomous pre-existing local database management systems $DBMS_1, DBMS_2, \ldots, DBMS_m$ located at sites $s_1, s_2, \ldots, s_m$ respectively. The data associated with $DBMS_i$ is denoted by $D_i$. We denote the set of all the data items in the MDBS by $D$; thus $D = \bigcup_{i=1}^{m} D_i$. We assume that the local databases are disjoint, that is, $D_i \cap D_j = \emptyset$, $i \neq j$.

In an MDBS environment, transactions are of two types:

Local transactions: The set of transactions, denoted by $\tau_L$, that execute at a single site. All applications existing at a local site before integration generate local transactions.

Global transactions: The set of transactions, denoted by $\tau_G$, that may execute at more than one site. Global transactions are normally generated by applications developed after the integration has been performed.

The MDBS software consists of a global transaction manager (GTM) built on top of the existing databases. While local transactions execute directly under the control of the local DBMSs, the execution of global transactions is controlled by the GTM. The GTM forwards the operations belonging to the global transactions to the appropriate local DBMS for execution and orchestrates the commitment of global transactions. We assume that the interface between the GTM and the local DBMSs provides for operations to be submitted by the GTM to the DBMSs, and for the DBMSs to acknowledge the completion of operations to the GTM. Furthermore, the local DBMS considers the operations of a global transaction as part of a single subtransaction. The local DBMS cannot differentiate between subtransactions of the global transaction and other concurrently executing local transactions. Each local DBMS $DBMS_i$ ensures serializability of schedules consisting of local transactions and subtransactions of global transactions at its site. The GTM, on the other hand, controls the order in which it submits the operations belonging to the global transactions for execution to the local DBMSs in such a manner that database consistency is not jeopardized.

TWO-LEVEL SERIALIZABILITY

In an MDBS environment, since local transactions execute outside the control of the GTM, the GTM, without taking any additional steps, cannot determine the order in which transactions are serialized at the local DBMSs. Such information is necessary for the GTM to ensure global serializability, since to ensure global serializability the GTM needs to ensure that subtransactions of global transactions are serialized the same way at all the local DBMSs. In case no knowledge of the concurrency control protocol followed by a local DBMS is available, a mechanism that the GTM can use to determine the serialization order among global transactions at the local DBMS is to force every global transaction that executes at the local DBMS to conflict directly (Georgakopoulos), for example, by writing on a common data item. The order in which the conflicting operations execute determines the serialization order among the global transactions. Unfortunately, forcing any two global transactions to conflict directly results in low concurrency.

A better alternative exists if the concurrency control mechanism followed by the local DBMS is known and serializes transactions in the execution order of certain serialization events (Elmagarmid and Du; Mehrotra et al.). In this approach, global serializability can be ensured by ensuring that global subtransactions execute their serialization events in the same order at all sites. Note that even this approach essentially considers two global transactions that execute at a common local DBMS to conflict, and thus results in low concurrency. In practice, a number of local DBMSs generate rigorous schedules (Breitbart et al.). In case every local schedule is rigorous, the local schedulers ensure global serializability and the GTM does not need to perform any concurrency control related actions.

Due to the low concurrency resulting from schemes that ensure global serializability, numerous correctness criteria that are simpler to implement in MDBS environments, but may allow certain nonserializable executions, have been proposed. One such criterion is the notion of two-level serializability (LSR), which was introduced in Mehrotra et al. a. For a sequence of operations $seq$, we denote by

$seq^{d}$ the subsequence of $seq$ consisting of all operations $o_i$ such that $entity(o_i) \in d$, where $d$ is a set of data items;

$seq^{\tau}$ the subsequence of $seq$ consisting of all operations $o_i$ such that, for some transaction $T_j \in \tau$, $o_i \in T_j$, where $\tau$ is a set of transactions.

Formally, the LSR correctness criterion is defined as follows.

Definition. A global schedule $S$ is LSR if

$S^{\tau_G}$ is serializable, and

for all $i$, $i = 1, \ldots, m$, $S^{D_i}$ is serializable,

where $S^{D_i}$ denotes the subsequence of $S$ consisting of the projection of $S$ on data items in $D_i$, and $S^{\tau_G}$ denotes the subsequence of $S$ consisting of the projection of $S$ to transactions in $\tau_G$, the set of global transactions.

Notice that the set of serializable schedules is a proper subset of the set of LSR schedules. An example of an LSR schedule that is not serializable is given below.

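Example. Consider an MDBS consisting of two sites $s_1$ and $s_2$. Let $D_1 = \{a, b\}$ and $D_2 = \{c, d\}$. Consider the following two global transaction programs:

$P_1$: $a := c$

$P_2$: $d := b$

Let $P_3$ and $P_4$ be local transaction programs executing at sites $s_1$ and $s_2$ respectively:

$P_3$: $b := a$

$P_4$: $c := d$

Consider the local schedules at sites $s_1$ and $s_2$ resulting from the execution of $P_1$, $P_2$, $P_3$, and $P_4$ from an initial database state over $\{a, b, c, d\}$. In the schedules, an operation $o = r_i(x, v)$ denotes an operation belonging to transaction $T_i$ such that $action(o)$ is read, $entity(o) = x$, and $value(o) = v$. Similarly, $o = w_i(x, v)$ denotes an operation belonging to transaction $T_i$ such that $action(o)$ is write, $entity(o) = x$, and $value(o) = v$. With the operation values omitted, the schedules are

$S_1$: $r_3(a)\ w_1(a)\ r_2(b)\ w_3(b)$

$S_2$: $r_1(c)\ r_4(d)\ w_4(c)\ w_2(d)$

The above schedule is LSR but not serializable: the global transactions $T_1$ and $T_2$ do not conflict directly and each local schedule is serializable, yet site $s_1$ serializes $T_2$ before $T_3$ before $T_1$ while site $s_2$ serializes $T_1$ before $T_4$ before $T_2$.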
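The LSR test can be sketched as two kinds of projection followed by an ordinary serializability check. The code below is an illustration under assumptions: operations are `(transaction, action, entity)` triples, the attribution of each operation to a transaction is inferred from the four programs of the example (global programs for a and d, local programs for b and c), and the single interleaving chosen for the global schedule is one hypothetical total order compatible with both local schedules.

```python
from typing import Dict, List, Set, Tuple

Op = Tuple[str, str, str]   # (transaction id, action "r"/"w", entity)


def is_serializable(schedule: List[Op]) -> bool:
    """Build the serialization graph and test it for cycles (Kahn's algorithm)."""
    graph: Dict[str, Set[str]] = {t: set() for t, _, _ in schedule}
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "w" in (ai, aj):
                graph[ti].add(tj)
    indegree = {n: 0 for n in graph}
    for targets in graph.values():
        for t in targets:
            indegree[t] += 1
    ready = [n for n, deg in indegree.items() if deg == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for t in graph[node]:
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)
    return removed == len(graph)


def is_lsr(schedule: List[Op], global_txns: Set[str],
           site_items: Dict[str, Set[str]]) -> bool:
    """LSR: the projection on global transactions and on each site is serializable."""
    if not is_serializable([op for op in schedule if op[0] in global_txns]):
        return False
    return all(
        is_serializable([op for op in schedule if op[2] in items])
        for items in site_items.values()
    )


SITES = {"s1": {"a", "b"}, "s2": {"c", "d"}}
GLOBAL = {"T1", "T2"}
# Assumed interleaving of the two local schedules of the example.
S = [("T1", "r", "c"), ("T3", "r", "a"), ("T1", "w", "a"), ("T2", "r", "b"),
     ("T3", "w", "b"), ("T4", "r", "d"), ("T4", "w", "c"), ("T2", "w", "d")]
```

For this schedule `is_lsr(S, GLOBAL, SITES)` holds while `is_serializable(S)` does not, since the local transactions close a cycle through the two global transactions.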

Unlike the case of ensuring serializability, ensuring that schedules are LSR in MDBS environments is relatively simple. In order to ensure that a global schedule $S$ is LSR, the GTM only needs to ensure that $S^{\tau_G}$ (the projection of $S$ onto the global transactions) is serializable, since the local DBMS at each site $s_i$ ensures the serializability of $S^{D_i}$. Since the global transactions execute under the control of the GTM, the GTM has knowledge of the data items (e.g., tuples, relations) accessed by them, and ensuring serializability of $S^{\tau_G}$ is straightforward. The GTM can follow any protocol for ensuring serializability in centralized DBMSs in order to ensure that $S^{\tau_G}$ is serializable. For example, the GTM could follow a two-phase locking (2PL) protocol in which the GTM maintains locks for every data item accessed by global transactions (referred to as global locks). Global transactions acquire and release global locks in accordance with the 2PL protocol, thereby ensuring serializability of $S^{\tau_G}$.
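A global lock table of the kind described above might be sketched as follows. This is a simplified illustration, not the paper's protocol: it assumes strict two-phase locking (all locks are released together when the transaction finishes), it reports a conflict by returning `False` instead of queuing the request, and the class and method names are hypothetical.

```python
from typing import Dict, Set


class GlobalLockManager:
    """Toy global lock table for global transactions (shared "S" / exclusive "X")."""

    def __init__(self) -> None:
        self._holders: Dict[str, Set[str]] = {}   # item -> transactions holding a lock
        self._mode: Dict[str, str] = {}           # item -> "S" or "X"
        self._finished: Set[str] = set()          # transactions past their lock point

    def acquire(self, txn: str, item: str, mode: str) -> bool:
        """Try to lock item; return False if another transaction holds a conflicting lock."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        if txn in self._finished:
            raise RuntimeError("2PL violation: lock requested after release")
        others = self._holders.get(item, set()) - {txn}
        current = self._mode.get(item)
        if others and (mode == "X" or current == "X"):
            return False
        self._holders.setdefault(item, set()).add(txn)
        self._mode[item] = "X" if "X" in (mode, current) else "S"
        return True

    def release_all(self, txn: str) -> None:
        """Release every lock of txn at once (shrinking phase at commit or abort)."""
        self._finished.add(txn)
        for item in list(self._holders):
            self._holders[item].discard(txn)
            if not self._holders[item]:
                del self._holders[item]
                del self._mode[item]
```

Only global transactions consult this table; local transactions remain under the control of their local DBMS, which matches the division of labor described in the text.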

Other protocols for ensuring LSR appear in Mehrotra et al.; M. Ouzzani, M. and N.L.; Mehrotra et al. a; Mehrotra et al. b. Unlike the case of serializability, in order to ensure that schedules are LSR, the GTM does not need to force conflicts between global transactions that execute at the same local DBMS. This permits LSR to be ensured while allowing a high degree of concurrency. Thus, the GTM protocols to ensure LSR are easily implementable, permit a high degree of concurrency, and do not violate the local autonomy of sites. However, since LSR schedules may be nonserializable, it follows from our earlier discussion of the necessity of serializability that they may not preserve database consistency in the presence of arbitrary integrity constraints and transaction programs. The following example illustrates an LSR schedule that violates database consistency.

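Example. Consider an MDBS consisting of two sites $s_1$ and $s_2$. Let $D_1 = \{b, c, e\}$ and $D_2 = \{a\}$, and let $IC$ be an integrity constraint over the data items $a$, $b$, $c$, and $e$. Consider the following two global transaction programs: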

P b

if a then c

P c



a

Let $P_3$ be a local transaction program executing at site $s_1$:



P if b then e c



else e

Consider the local schedules at sites $s_1$ and $s_2$ resulting from the execution of $P_1$, $P_2$, and $P_3$ from an initial database state over $\{a, b, c, e\}$:

 

S w b r b r c w c w e

   

S w a r a

 

The final database state resulting from the execution of the above schedule, a state over the data items $a$, $b$, $c$, and $e$, is inconsistent even though the schedule is LSR.

Conditions under which LSR schedules preserve strong correctness are the topic of the following section.

CORRECTNESS OF LSR SCHEDULES

The key to ensuring that LSR schedules preserve database consistency is to exploit the nature of integrity constraints and transaction programs that execute in the system. Therefore, before we develop conditions under which LSR schedules preserve correctness, let us first examine the nature of integrity constraints in MDBS environments more closely.

In an MDBS environment, each local DBMS defines certain local consistency constraints among the various local data items. The integration of the various DBMSs into an MDBS results in the introduction of certain global intersite constraints which were not present prior to integration. The observation that intersite integrity constraints are introduced as a result of integration enables us to partition the set of data items at a site, $D_i$, into local data items $LD_i$ and global data items $GD_i$ such that $LD_i \cap GD_i = \emptyset$ and $D_i = LD_i \cup GD_i$. Furthermore, if there is an integrity constraint between $d_i \in D_i$ and $d_j \in D_j$, $i \neq j$, then $d_i \in GD_i$ and $d_j \in GD_j$. The set of all the global data items is $GD = \bigcup_{i=1}^{m} GD_i$, and the set of all the local data items is $LD = \bigcup_{i=1}^{m} LD_i$. Data items at different sites between which integrity constraints are introduced as a result of the integration are thus in $GD$. For example, all replicated data items are global (we model replicated data items as distinct data items at different sites with an equality constraint between them).

Note that the definition of local and global data items suggests that there be no integrity constraints between local and global data items at a site, as that could lead to an intersite integrity constraint involving local data items. While it is generally true that the presence of integrity constraints between local and global data items at a site may lead to an intersite integrity constraint involving local data items, it is not necessary. For example, consider a database consisting of two

sites s and s Leta LD b GD and c GD Let IC ab cb

 

Note that there is an integrity constraint between local and global data items, but there is no intersite integrity constraint involving the local data item a. On the other hand, if IC a b b c, then an intersite integrity constraint a c involving item a results. The integrity constraint IC is closed under inference.

Partitioning of data items at each site into local and global data allows us to define a hierarchy of MDBS models under which LSR schedules can be shown to preserve database consistency. These models are based on restricting the local and global transactions' read and write operations on the various data items. The models considered range from a very restrictive model, in which global transactions access only global data and local transactions access only local data, to a very general model in which only minimal restrictions are imposed on the data accesses of transactions. It is the responsibility of the database designers and application programmers to partition the database into local and global data, depending on the integrity constraints. Furthermore, application programmers must ensure that local and global transaction programs adhere to the restrictions on data item accesses that are imposed by the model.

Note that in an MDBS, since preexisting local applications were unaware of the intersite integrity constraints introduced after the integration, database consistency may not be preserved if local transactions were to write global data items. For example, the resulting database state may be inconsistent if local transactions wrote replicated data items. Therefore, in all the models considered, we will assume that local transactions do not write global data items.

The most restrictive of our models, to which we refer as the trivial model, turns out to be restrictive to the point of impracticality. It is presented only to show one extreme. The remaining models that we discuss provide tradeoffs between restrictions on data access on one hand and restrictions on transaction programs and integrity constraints on the other.

In general, it is not possible to claim that one model is better than another. Rather, different models will best serve different real-world MDBS environments. Our goal in this section is to identify some bounds on the range of possible models. In a later section, we perform an analysis of a specific financial application and show how it can be captured by one of our models.

Before we discuss the models and show that LSR schedules preserve database consistency, we first develop some notation that is used in proving executions are strongly correct.

Notation

Let seq be a sequence of operations. The following notation, defined for seq, will be used in the remainder of the section (note that seq may correspond to a transaction, a schedule, or simply an arbitrary sequence of operations).

RS(seq) denotes the set of data items read by operations in seq:
$RS(seq) = \{ y : o_i \in seq \wedge y = entity(o_i) \wedge action(o_i) = r \}$

WS(seq) denotes the set of data items written by operations in seq:
$WS(seq) = \{ y : o_i \in seq \wedge y = entity(o_i) \wedge action(o_i) = w \}$

struct(seq) denotes the structure associated with seq, which is derived from seq by ignoring the values associated with the operations in seq. Thus, every operation $o_i$ in struct(seq) is a tuple $(action(o_i), entity(o_i))$.

We denote by $\tau_w(d, S)$ the set of transactions in $S$ that have at least one write operation on some data item in $d$. Formally,
$\tau_w(d, S) = \{ T_i : T_i \in \tau_S \wedge WS(T_i) \cap d \neq \emptyset \}$
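The notation above maps directly onto small set computations. The following Python sketch is illustrative only: the `Op` record, the field names, and the function name `writers` (standing in for the set of transactions that write into d) are assumptions introduced here, not part of the paper.

```python
from typing import Any, Iterable, List, NamedTuple, Set, Tuple


class Op(NamedTuple):
    action: str  # "r" for read, "w" for write
    txn: str     # identifier of the issuing transaction, e.g. "T1"
    entity: str  # data item accessed
    value: Any   # value read or written


def RS(seq: Iterable[Op]) -> Set[str]:
    """Data items read by operations in seq."""
    return {o.entity for o in seq if o.action == "r"}


def WS(seq: Iterable[Op]) -> Set[str]:
    """Data items written by operations in seq."""
    return {o.entity for o in seq if o.action == "w"}


def struct(seq: Iterable[Op]) -> List[Tuple[str, str]]:
    """Structure of seq: the operations with their values ignored."""
    return [(o.action, o.entity) for o in seq]


def writers(d: Set[str], schedule: Iterable[Op]) -> Set[str]:
    """Transactions with at least one write on some item in d."""
    return {o.txn for o in schedule if o.action == "w" and o.entity in d}


# A small hypothetical schedule with made-up values.
S = [
    Op("r", "T1", "a", 1),
    Op("w", "T1", "b", 2),
    Op("r", "T2", "b", 2),
    Op("w", "T2", "c", 3),
]
```

For this schedule, `RS(S)` is `{"a", "b"}`, `WS(S)` is `{"b", "c"}`, and `writers({"c"}, S)` is `{"T2"}`.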



Furthermore, we associate the notion of a state with a transaction. The state associated with the transaction is a possible state of the data items that the transaction may have seen. The state seen by the transaction is an abstract notion and may never have been physically realized in a schedule.

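Definition. Let $S$ be a schedule, $d \subseteq D$, and $\tau$ be a set of transactions such that $\tau_w(d, S) \subseteq \tau \subseteq \tau_S$ and $(S^{\tau})^{d}$ is serializable. Let $T_1, T_2, \ldots, T_n$ be a serialization order of transactions in $(S^{\tau})^{d}$, and $DS_1$ be a database state such that $legal(DS_1, S)$. The state of the database before the execution of each transaction, with respect to data items in $d$, is defined as follows:

$$state(T_i, d, S, DS_1) = \begin{cases} DS_1^{d} & \text{if } i = 1 \\ state(T_{i-1}, d, S, DS_1)^{d - WS(T_{i-1}^{d})} \cup write(T_{i-1}^{d}) & \text{if } i > 1 \end{cases}$$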

The state seen by $T_i$ is the same as that seen by $T_{i-1}$, except for those data items written by $T_{i-1}^{d}$. Thus, on $d - WS(T_{i-1}^{d})$ we take the state seen by $T_{i-1}$, while on $WS(T_{i-1}^{d})$ we take the newly written values from $T_{i-1}$. In the above definition, $state(T_{i-1}, d, S, DS_1)^{d - WS(T_{i-1}^{d})}$ denotes the restriction of $state(T_{i-1}, d, S, DS_1)$ to the data items in $d - WS(T_{i-1}^{d})$. Therefore, $state(T_i, d, S, DS_1)$ is the state of the database with respect to data items in $d$ as seen by $T_i$. Note that from the definition of state and serializability, it follows that

$read(T_i^{d}) \subseteq state(T_i, d, S, DS_1)$, and

if $\{DS_1\}\, S\, \{DS_2\}$, then $\{state(T_n, d, S, DS_1)\}\, T_n^{d}\, \{DS_2^{d}\}$.

Example Consider the following schedule resulting from the execution of

transaction programs P and P from database state DS fa b c g



S r a r a w b w c

 

S is serializable with serialization order T T or T T With serialization order

 

T T



stateT fa b cgSDS fa b c g



However with serialization order T T



c gFurthermore stateT fa b cgSDS fa b



fabcg fabcg

r eadT fa g Note that readT stateT fa b cgSDS for



 

either serialization order
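To make the dependence of the state on the chosen serialization order concrete, here is a minimal Python sketch of the state definition. The function name `state_seen`, the tuple encoding of operations, and all data values are hypothetical; the schedule has the same shape as the one in the example above (both transactions read a, one writes b, the other writes c).

```python
def state_seen(order, i, d, ds):
    """State of the items in d as seen by the i-th transaction (1-based)
    of the serialization order `order`, starting from database state ds.

    Each transaction is a list of (action, item, value) tuples.
    """
    state = {x: v for x, v in ds.items() if x in d}  # restriction of ds to d
    for txn in order[: i - 1]:
        for action, item, value in txn:
            if action == "w" and item in d:
                state[item] = value  # newly written values replace old ones
    return state


# Hypothetical values for illustration only.
ds1 = {"a": 0, "b": 0, "c": 0}
t1 = [("r", "a", 0), ("w", "b", 5)]
t2 = [("r", "a", 0), ("w", "c", 7)]
d = {"a", "b", "c"}

seen_by_t2 = state_seen([t1, t2], 2, d, ds1)  # t1 serialized first
seen_by_t1 = state_seen([t2, t1], 2, d, ds1)  # t2 serialized first
```

Under the first order, the second transaction sees the write on b; under the other order, the first transaction sees the write on c. In both cases the value each transaction read for a is contained in the state it is said to have seen.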

We next discuss various MDBS models based on the data accesses of transactions and identify conditions under which LSR schedules preserve strong correctness. We begin with a highly restrictive model, which we call the trivial model, and then proceed to less restrictive and more practical models.

The Trivial Model

Consider a model of MDBS applications with the following restrictions on transactions:

local transactions only read and write local data items, and
global transactions only read and write global data items.

For this model, we can establish the following theorem.

Theorem. Let S be a LSR schedule in the trivial model. Then S is serializable and thus is strongly correct.

Proof. As a result of the model, there is no edge in the serialization graph of $S$ between nodes corresponding to transactions $T_1$, $T_2$ such that $T_1 \in \tau_G$ and $T_2 \in \tau_L$. Since $S^{G}$ is serializable, the serialization graph of $S^{G}$ is acyclic. Also, for all $i$, $1 \leq i \leq m$, $S^{D_i}$ is serializable, and local transactions at site $s_i$ do not conflict with local transactions at site $s_j$, $i \neq j$. As a result, the serialization graph of $S^{L}$ is acyclic. Thus, the serialization graph of $S$ is acyclic and $S$ is serializable.

The Global Read ($G_r$) Model

Consider a model of MDBS applications with the following restrictions on transactions:

local transactions only read and write local data items, and
global transactions, in addition to reading and writing global data items, also read local data items.

In the $G_r$ model, LSR schedules may not preserve database consistency, as is illustrated by the following example.

Example Consider an MDBS consisting of two sites s and s Let D



fa b eg D fcg LD fag and GD fb c egLetIC a b c



b e Note that a b is an integrity constraintbetween

lo cal data item a and global data item b Consider the following global transaction

programs

P if a then e

else e

c

P if a then b



else b

c

Let P b e a lo cal transaction program executing at site s



P a



Consider the lo cal schedules at s and s resulting from the execution of P P

 

and P from the database state fa c b e g



S r a w e w a r a w b

  

S w c w c

 

The state resulting from the ab ove LSR schedule is fa c b e g

which is inconsistent



In the example above, note that there is an integrity constraint between a, which is a local data item, and b, which is a global data item. If we further restrict the model and disallow such integrity constraints between local and global data items, LSR schedules can be shown to be strongly correct. Before we prove that LSR schedules preserve strong correctness, we first need to establish conditions under which transactions and schedules, respectively, leave the database in a consistent state after execution.

Lemma. Let $S$ be a schedule consisting of a transaction $T_i$ which results from the execution of a transaction program $P_i$ (note that $S = T_i$). Let $DS_1$ be a database state such that $\{DS_1\}\, T_i\, \{DS_2\}$, and let $d \subseteq D$. If $DS_1^{d} \cup read(T_i)$ is a consistent database state, then $DS_2^{d \cup WS(T_i)}$ is consistent.

Proof. Let $DS_3$ be a consistent database state such that $DS_3^{d \cup RS(T_i)} = DS_1^{d} \cup read(T_i)$. Let $\{DS_3\}\, P_i\, \{DS_4\}$. Let $T_i'$ be the transaction and $S'$ be the schedule resulting from the execution of $P_i$ from $DS_3$ (note that $S' = T_i'$). Since $DS_3^{RS(T_i)} = read(T_i)$, $read(T_i') = read(T_i)$. Since writes are a function of the reads before them ($T_i$ and $T_i'$ result from the execution of the same transaction program $P_i$), we have that $T_i' = T_i$. Since $DS_3^{d} = DS_1^{d}$ and $T_i' = T_i$, $DS_4^{d \cup WS(T_i)} = DS_2^{d \cup WS(T_i)}$. Since $P_i$ is a correct transaction program, $DS_4$ is consistent, and the lemma has been proven.

d

Note that the lemma requires $DS_1^{d} \cup read(T_i)$ to be consistent in order for it to be shown that $DS_2^{d \cup WS(T_i)}$ is consistent. Consistency of $DS_1^{d}$ and $read(T_i)$ alone does not ensure consistency of $DS_2^{d \cup WS(T_i)}$, since $DS_1^{d} \cup read(T_i)$ may not be consistent. That is, even if $DS_1^{d}$ and $read(T_i)$ are consistent, a consistent state $DS_3$ such that $DS_3^{d \cup RS(T_i)} = DS_1^{d} \cup read(T_i)$ may not exist. We illustrate this in the following example.

Example Consider a database with data items D fa b cg Let IC a

b b c Consider the following transaction program P

P a c

Let d fa bg and DS fa b c g Consider the execution of

transaction program P from DS that results in the following transaction

T r c w a

The database state DS resulting from the execution of P from DS is



DS fa b c g



d

Thus even though DS and r eadT are consistent their union

d

DS readT fa b c g

is inconsistent and as a result

d

g fa b DS



is inconsistent



In the lemma above, we specified conditions required to ensure that execution of a single transaction preserves consistency of a set of data items. We now use it to develop conditions under which schedules for a set of transactions preserve consistency of a set of data items.

Lemma. Let $S$ be a schedule, $d \subseteq D$, $\tau$ be a set of transactions in $S$ such that $\tau_w(d, S) \subseteq \tau$, and $DS_1$ be a database state such that $\{DS_1\}\, S\, \{DS_2\}$. If

$(S^{\tau})^{d}$ is serializable (let $T_1, T_2, \ldots, T_n$ be a serialization order of transactions in $(S^{\tau})^{d}$),
for all $T_j \in \tau_w(d, S)$, $state(T_j, d, S, DS_1) \cup read(T_j)$ is consistent if it is the case that $state(T_j, d, S, DS_1)$ is consistent, and
$DS_1^{d}$ is consistent,

then $DS_2^{d}$ is consistent and $state(T_i, d, S, DS_1)$ is consistent for all $i$, $1 \leq i \leq n$.

Proof. The proof is by induction on $i$.

Basis ($i = 1$): $state(T_1, d, S, DS_1) = DS_1^{d}$, which is given to be consistent.

Induction: Assume true for $i = m$, $m < n$; that is, $state(T_m, d, S, DS_1)$ is consistent. We need to show that $state(T_{m+1}, d, S, DS_1)$ is consistent. Consider the following two cases.

Case 1: $T_m \notin \tau_w(d, S)$. Since $T_m \notin \tau_w(d, S)$, we have $state(T_{m+1}, d, S, DS_1) = state(T_m, d, S, DS_1)$. By IH, since $state(T_m, d, S, DS_1)$ is consistent, it follows that $state(T_{m+1}, d, S, DS_1)$ is consistent.

Case 2: $T_m \in \tau_w(d, S)$. Since $state(T_m, d, S, DS_1)$ is consistent, it follows by the second hypothesis that $state(T_m, d, S, DS_1) \cup read(T_m)$ is consistent. By the single-transaction lemma above, we can conclude that $state(T_{m+1}, d, S, DS_1)$ is consistent.

Thus, $state(T_i, d, S, DS_1)$ is consistent for all $i$, $1 \leq i \leq n$. In particular, $state(T_n, d, S, DS_1)$ is consistent. Thus, by the single-transaction lemma, using a similar argument as above, $DS_2^{d}$ is consistent.

Using the above lemma, we can establish that LSR schedules, under appropriate restrictions, preserve strong correctness of schedules in the $G_r$ model.

Theorem. Let S be a LSR schedule in the $G_r$ model. If no integrity constraints are present between local and global data items, then S is strongly correct.

Pro of Let DS b e a consistent database state such that legalDS S Let

fDS gS fDS g In order to showthatS is a strongly correct we need to show



that DS is consistentandr eadT for all T is consistent As no integrity

 i i S

constraints are presentbetween lo cal and global data items the integrity constraints

L L G L

can b e viewed as IC C C C where C is a conjunct dened over

m e

G

data items in LD and C is a conjunct dened over data items in GD As a result

e

of the mo del LD LD when e f andLD GD

e f e

L L L

Wenow use Lemma to showthat C C C are preserved by S and global

 m

transactions read consistent lo cal data items Since only lo cal transactions at site

D

S k

s write lo cal data items at site s LD S Since S S and S

k k w k L S



LD

S k

is serializable S is serializable Let T T T b e a serialization order of

 n

LD

LD

k

S k

transactions in S By Lemma since DS is consistent DS is consis

tent Since lo cal transactions at site s read only data items in LD RS T LD

k k i k

where T LD S Thus for every transaction T LD S since

i w k i w k

readT stateT LD SDS if stateT LD SDS isconsistent then

i i k i k

LD

k

state is con T LD SDS r eadT is consistent Thus by Lemma DS

i k i



sistent and stateT LD SDS for all i i n is consistent Since

i k

LD LD

k k

is consistent for all T T stateT LD SDS readT readT

i i S i k

i i

Since lo cal transactions access only lo cal data items at a single site readT for

i

all T T is consistent

i i L

G

C can nowbeshown to b e preserved by S using Lemma as follows Since only

G

global transactions write on global data items GD S Since S is serial

w G

GD GD

G

izable S is serializable Since DS is consistent by Lemma DS is con

LD

k

is consistent T sistent As shown ab ove for all k m r eadT

i G

i

m

S

LD

m

LD k

k k

Since LD LD e f by Lemma r eadT readT

e f

i i

k 

T is consistent Th us since GD LD for all e e mand

i G e

GD

stateT GDSDS by Lemma if for some T GD S readT

i i w

i

stateT GD S DS is consistent then stateT GDSDS r eadT iscon

i i i

GD

sistent Thus by Lemma DS is consistent and stateT GDSDS for

i



GD

all T T is consistent Since readT stateT GD S DS and

i i G i

i

m

LD

k

i

readT T is consistent by Lemma r eadT for all T T

i G i i i G

i

is consistent

LD

GD k

Thus DS and DS for all k k m is consistent Hence by





Lemma DS is consistent Thus S is strongly correct



The Local Read ($L_r$) Model

Consider a model of MDBS applications with the following restrictions on transactions:

local transactions read and write local data items and also read global data items, and
global transactions only read and write global data items.

As in the $G_r$ model, LSR schedules may not always preserve database consistency in the $L_r$ model. To see this, consider the execution in Example in Section. In that example, the local transaction program reads inconsistent data values. The reason for the reads being inconsistent is that a global transaction program leaves the database at a site in an inconsistent state, which is then seen by the local transaction program. This can be overcome by requiring the execution of global transaction programs from a consistent local database state to leave the local database in a consistent state (other local database states might be inconsistent). This is stated more formally below.

Definition. Transaction program $P_i$ is Local Database Preserving (LDP) if for all database states $DS_1$ and for all $k$, $1 \leq k \leq m$, if $DS_1^{D_k}$ is consistent and $\{DS_1\}\, P_i\, \{DS_2\}$, then $DS_2^{D_k}$ is consistent.
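For small finite domains, the LDP property can be checked by brute force. The sketch below is an illustration under several assumptions that are not in the paper: a transaction program is modeled as a Python function from a state dictionary to a new state dictionary, local consistency at each site is given as a predicate on the restriction of the state to that site, and the names `is_ldp`, `p_ok`, and `p_bad` are hypothetical.

```python
from itertools import product


def is_ldp(program, sites, local_consistent, domain):
    """Brute-force LDP check over all states with values from `domain`.

    sites: {k: [items at site k]}
    local_consistent: {k: predicate on the state restricted to site k}
    """
    items = [x for k in sorted(sites) for x in sites[k]]
    for values in product(domain, repeat=len(items)):
        ds1 = dict(zip(items, values))
        ds2 = program(dict(ds1))  # run the program on a copy of the state
        for k, d_k in sites.items():
            before = {x: ds1[x] for x in d_k}
            after = {x: ds2[x] for x in d_k}
            if local_consistent[k](before) and not local_consistent[k](after):
                return False  # a consistent local state was made inconsistent
    return True


# Hypothetical database: a, b at site 1 with local constraint a == b; c at site 2.
sites = {1: ["a", "b"], 2: ["c"]}
local_consistent = {1: lambda s: s["a"] == s["b"], 2: lambda s: True}


def p_ok(ds):
    ds["a"], ds["b"] = 1, 1  # always restores a == b at site 1
    return ds


def p_bad(ds):
    ds["a"] = 1
    if ds["c"] > 0:          # what is written at site 1 depends on site 2
        ds["b"] = 1
    return ds
```

Here `is_ldp(p_ok, sites, local_consistent, (0, 1))` holds, while `p_bad` fails the check: from the state where all items are 0, it leaves site 1 with a = 1 and b = 0.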



Another example of a LSR schedule that does not preserve database consistency in the $L_r$ model is given below.

Example Consider an MDBS consisting of two sites s and s Let D



fa b eg D fcg LD fbg and GD fa c egLetIC c a e



c e a b Note that no integrity constraints are present

between lo cal and global data items Consider the following two global transaction

programs

P a

c

P if c then e



else e

Let P b e a lo cal transaction program executing at site s



P temp a



if e then temp

b temp

Consider the lo cal schedules at sites s and s resulting from the execution of



P P and P from the database state fa e c b g

 

s r a w a w e r e w b

   

s w c r c

 

The final database state resulting from the execution of the above schedule is

fa e c b g

which is inconsistent

In the example above, the local transaction program $P_3$ reads inconsistent data values. The reason for the reads being inconsistent is that the value written by the global transaction program $P_2$ for the data item e depended on the value of data item c, which is located at another site. As the global transaction programs can access data at multiple sites, the behavior of a global transaction at a site may depend on the database state seen by it at a different site. One way to remove such inconsistent reads is by restricting the global transaction programs.

A restriction similar to LDP, absence of value dependency, was imposed on transaction programs in Du and Elmagarmid. Informally, a global transaction program has value dependencies if the operations performed by it at one site depend on the operations it has performed at another site.

Definition. Transaction program $P_i$ has no value dependency if for all pairs $(DS_1, DS_2)$ of database states and for all $k$, $1 \leq k \leq m$, if $DS_1^{D_k} = DS_2^{D_k}$, then $T_1^{D_k} = T_2^{D_k}$, where $T_1$ and $T_2$ are transactions resulting from the execution of $P_i$ from $DS_1$ and $DS_2$, respectively.
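The definition can be exercised on a funds-transfer program and a balance-sum program by enumerating a small state space. This is a sketch under assumptions that are not from the paper: programs are modeled as functions returning the list of operations they perform, the transfer amount is fixed at 1, the sufficient-funds test uses `>=`, and the names `transfer`, `total`, and `has_no_value_dependency` are hypothetical.

```python
from itertools import product


def transfer(ds, amt=1):
    """Move amt from act1 (site 1) to act2 (site 2) if funds suffice."""
    ops = [("r", "act1", ds["act1"])]
    if ds["act1"] >= amt:
        ops.append(("w", "act1", ds["act1"] - amt))
        ops.append(("r", "act2", ds["act2"]))
        ops.append(("w", "act2", ds["act2"] + amt))
    return ops


def total(ds):
    """Read both balances (the sum itself is not a database operation)."""
    return [("r", "act1", ds["act1"]), ("r", "act2", ds["act2"])]


def has_no_value_dependency(program, sites, domain):
    """True if, whenever two states agree on a site's items, the program
    performs identical operations at that site from both states."""
    items = [x for k in sorted(sites) for x in sites[k]]
    states = [dict(zip(items, v)) for v in product(domain, repeat=len(items))]
    for d_k in sites.values():
        def at_site(ops):
            return [o for o in ops if o[1] in d_k]
        for ds1 in states:
            for ds2 in states:
                agree = all(ds1[x] == ds2[x] for x in d_k)
                if agree and at_site(program(ds1)) != at_site(program(ds2)):
                    return False
    return True


sites = {1: ["act1"], 2: ["act2"]}
```

With balances drawn from `(0, 1, 2)`, `total` passes the check, while `transfer` fails it: two states with the same `act2` but different `act1` lead to different operations at site 2.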

Consider the following two global transaction programs in an MDBS that integrates databases belonging to two banks located at sites $s_1$ and $s_2$.

$P_1$: if ($act_1 \geq amt$) then begin
    $act_1 := act_1 - amt$;
    $act_2 := act_2 + amt$
end

$P_2$: return($act_1 + act_2$)

Transaction program $P_1$ transfers money amt from account $act_1$ at site $s_1$ to account $act_2$ at site $s_2$ and has value dependencies, since the write on $act_2$ depends on the value returned by the read operation on $act_1$. However, $P_2$, which returns the sum of the balances in $act_1$ and $act_2$, has no value dependency. The following lemma relates the notions of value dependencies and LDP (the proof is straightforward and can be found in Mehrotra et al b).

Lemma. If a transaction program $P_i$ has no value dependencies, then $P_i$ is LDP.

Since local transaction programs execute at a single site, they have no value dependencies. The following corollary trivially follows.

Corollary. Every local transaction program is LDP.

In the $L_r$ model, LSR schedules may not preserve database consistency, as was shown in the example above. However, if we restrict the transaction programs to be LDP, then LSR schedules can be shown to be strongly correct. We first show, using a simple induction argument, that a serializable local schedule resulting from the execution of LDP programs leaves the local database in a consistent state if the local database was initially consistent.

Lemma. Let $S$ be a schedule and $DS_1$ be a database state such that $legal(DS_1, S)$ and $\{DS_1\}\, S\, \{DS_2\}$. If, for some $k$, $1 \leq k \leq m$,

$S^{D_k}$ is serializable (let $T_1, \ldots, T_n$ be a serialization order of transactions in $S^{D_k}$),
transaction programs are LDP, and
$DS_1^{D_k}$ is consistent,

then $DS_2^{D_k}$ is consistent and $state(T_i, D_k, S, DS_1)$ is consistent for all $i$, $1 \leq i \leq n$.

Proof. We begin by showing that $state(T_i, D_k, S, DS_1)$, for all $i$, $1 \leq i \leq n$, is consistent. The proof is by induction on $i$.

Basis ($i = 1$): $state(T_1, D_k, S, DS_1) = DS_1^{D_k}$, which is given to be consistent.

Induction: Assume true for $i = r$, $r < n$; that is, $state(T_r, D_k, S, DS_1)$ is consistent. We need to show the above true for $i = r + 1$. Let $DS_3$ be a database state such that $DS_3^{D_k} = state(T_r, D_k, S, DS_1)$ and $legal(DS_3, T_r)$. By IH, $DS_3^{D_k}$ is consistent. Let $\{DS_3\}\, T_r\, \{DS_4\}$. Since transaction programs are LDP, $DS_4^{D_k}$ is consistent. As $DS_4^{D_k} = state(T_{r+1}, D_k, S, DS_1)$, $state(T_{r+1}, D_k, S, DS_1)$ is consistent.

As shown above, $state(T_n, D_k, S, DS_1)$ is consistent. Thus, using a similar argument as above, $DS_2^{D_k}$ is consistent.

The above lemma states that a schedule $S$ resulting from the execution of LDP programs, such that for all $k$, $1 \leq k \leq m$, $S^{D_k}$ is serializable, leaves all the local databases in a consistent state if they were initially consistent. However, consistency of all the local database states does not necessarily imply the consistency of the global database state. Global database consistency can, however, be ensured by ensuring that executions are LSR, as is shown in the following theorem.

Theorem. Let S be a LSR schedule in the $L_r$ model. If all transaction programs are LDP, then S is strongly correct.

Pro of Let DS b e a consistent database state such that legalDS S Let

fDS gS fDS g In order to showthat S is strongly correct we need to show



that DS is consistentandr eadT for all T is consistent The integrity

 i i S

G

constraints can b e viewed as IC C C C where C is a conjunct

m e

G

dened over data items in D and C is a conjunct dened over data items in

e

GD We rst showthat C C C are preserved by S if transaction programs

 m

D

k D

k

are LDPSince DS is consistent by Lemma DS is consistent Since S is

D

k

serializable and transaction programs are LDPby Lemma DS is consistent



and stateT D SDS for all T T is consistent Furthermore since lo cal

i k i i S

transactions access data items at a single site r eadT stateT D SDS

i i k

T Thus readT for all T T is consistent

i L i i i L

G

C can now b e shown to b e preserved by S using Lemma as follows Since

G

only global transactions write global data items GD S Since S

w G

GD

G

is serializable S is serializable Since global transactions read only data

items in GD RS T GD where T GD S Thus for every transaction

i i w

T GD S since r eadT stateT GDSDS if stateT GDSDS is

i w i i i

consistent then stateT GD S DS r eadT is consistent By Lemma since

i i

GD GD

DS is consistent DS is consistent Thus by Lemma DS is consistentand



stateT GD S DS for all T T is consistent Since global transactions

i i i G

only read data items in GD r eadT stateT GDSDS T Thus

i i i G

readT for all T T is consistent

i i i G

D

GD

k

Thus DS and DS k m is consistent Hence by for all k





Lemma DS is consistent Thus S is strongly correct



Since a transaction program with no value dependencies is LDP, the theorem above also holds if transaction programs have no value dependencies. However, if transaction programs have value dependencies, then schedules may not preserve database consistency, as illustrated by the example given above for the $L_r$ model. It must be noted that the theorem holds even if integrity constraints are present between local and global data items.

The Global Read-Write ($G_{rw}$) Model

Consider a model of MDBS applications with the following restrictions on transactions:

local transactions only read and write local data items, and
global transactions, in addition to reading and writing global data items, also read and write local data items.

The $G_{rw}$ model is more general than the $G_r$ model. Thus, as shown by the example given for the $G_r$ model, in the $G_{rw}$ model LSR schedules may not preserve database consistency if integrity constraints are present between local and global data items. However, in contrast to the $G_r$ model, absence of integrity constraints between local and global data items does not ensure database consistency, as is demonstrated by the following example.

Example Consider an MDBS consisting of two sites s and s Let D



fa b eg D fcg LD fa b c egandGD fgLetIC a b



c e Thus no integrity constraints are presentbetween lo cal and

global data items Consider the following two global transaction programs

P if a then c b

else c

P e c



Let P b e a lo cal transaction program executing at site s



P a



if e then b

Consider the lo cal schedules at sites s and s resulting from the execution of



P P and P from the database state fa b c e g

 

S w a r a r b w e r e

  

S w c r c

 

The final database state resulting from the execution of the above schedule is

fa b c e g

which is inconsistent

In the example above, database inconsistency results from the fact that the structure of the local transaction program $P_3$ changes based on the value returned by the read operation on data item e. To avoid such inconsistencies, we need to restrict transaction programs to be fixed-structured. A transaction program is fixed-structured if its execution from every database state results in transactions with the same structure. This is defined more formally as follows.

Definition. Transaction program $P_i$ has fixed structure if for all pairs $(DS_1, DS_2)$ of database states, $struct(T_1) = struct(T_2)$, where $T_1$ and $T_2$ are transactions resulting from the execution of $P_i$ from $DS_1$ and $DS_2$, respectively.

Example. Consider the following transaction programs $P_1$ and $P_2$. While $P_1$ is a fixed-structure transaction program, $P_2$ is not.

P if x then y P if x then y



else y else z
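The difference between the two programs can be checked mechanically by comparing transaction structures across a small set of start states. This is a sketch only: the conditions and written values inside `p1` and `p2` are made up (the example's own constants are not reproduced here), and the helper name `has_fixed_structure` is hypothetical.

```python
from itertools import product


def p1(ds):
    """Writes y in both branches, so the structure never changes."""
    ops = [("r", "x", ds["x"])]
    ops.append(("w", "y", 1 if ds["x"] > 0 else 0))
    return ops


def p2(ds):
    """Writes y in one branch and z in the other."""
    ops = [("r", "x", ds["x"])]
    if ds["x"] > 0:
        ops.append(("w", "y", 1))
    else:
        ops.append(("w", "z", 1))
    return ops


def has_fixed_structure(program, items, domain):
    """True if every start state yields the same (action, item) sequence."""
    structures = set()
    for values in product(domain, repeat=len(items)):
        ops = program(dict(zip(items, values)))
        structures.add(tuple((action, item) for action, item, _ in ops))
    return len(structures) <= 1


items = ["x", "y", "z"]
```

Over the domain `(0, 1)`, `p1` always produces the structure read x, write y, whereas `p2` produces two different structures depending on x.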

If we restrict every transaction program to have fixed structure, then LSR schedules can be shown to preserve database consistency. In order to prove that LSR schedules preserve database consistency in case each of the transactions is fixed-structured, we will need a result about predicatewise serializable (PWSR) schedules developed in Rastogi et al. Before proceeding, we first define the notion of PWSR schedules, introduced in Korth et al, and then state the property of PWSR schedules which we will require in establishing our result.

Definition. Let $IC = C_1 \wedge C_2 \wedge \cdots \wedge C_l$, where $IC$, $C_i$ are defined over data items in $D$, $d_i$, respectively. A schedule $S$ is PWSR if $S^{d_i}$ is serializable for all $i$, $1 \leq i \leq l$.
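The PWSR condition can be tested by projecting a schedule onto the data items of each conjunct and checking each projection separately. The sketch below assumes conflict serializability (acyclicity of the serialization graph) as the notion of serializability; the function names and the tuple encoding of operations are hypothetical.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (action, txn, item), with action "r" or "w"."""
    txns = {t for _, t, _ in schedule}
    edges = {t: set() for t in txns}
    for i, (a1, t1, x1) in enumerate(schedule):
        for a2, t2, x2 in schedule[i + 1:]:
            # Two operations conflict if they come from different
            # transactions, touch the same item, and one is a write.
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges[t1].add(t2)
    # Acyclicity check: repeatedly remove nodes without incoming edges.
    remaining = set(txns)
    while remaining:
        sources = {t for t in remaining
                   if not any(t in edges[u] for u in remaining)}
        if not sources:
            return False  # every remaining node lies on or behind a cycle
        remaining -= sources
    return True


def is_pwsr(schedule, conjunct_items):
    """True if the projection onto each conjunct's items is serializable."""
    return all(
        is_conflict_serializable([op for op in schedule if op[2] in d])
        for d in conjunct_items
    )


# Hypothetical schedule: T1 precedes T2 on a, but T2 precedes T1 on b.
S = [("w", "T1", "a"), ("r", "T2", "a"), ("w", "T2", "b"), ("r", "T1", "b")]
```

This schedule is not serializable as a whole, yet it is PWSR when the constraint splits into one conjunct over `{a}` and another over `{b}`, since each projection involves conflicts in only one direction.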

Theorem (Rastogi et al). Let $IC = C_1 \wedge C_2 \wedge \cdots \wedge C_l$, where $IC$, $C_i$ are defined over data items in $D$, $d_i$, respectively, such that $d_i \cap d_j = \emptyset$, $i \neq j$. Let $S$ be a schedule consisting of transactions resulting from the execution of fixed-structure transaction programs. If $S$ is a PWSR schedule, then it is strongly correct.

Using the above theorem, we can establish that LSR schedules in the $G_{rw}$ model preserve strong correctness if the transaction programs are fixed-structured.

Theorem. Let $S$ be a LSR schedule in the $G_{rw}$ model resulting from the execution of fixed-structured transaction programs. If no integrity constraints are present between local and global data items, then $S$ is strongly correct.

Proof. As no integrity constraints are present between local and global data items, the integrity constraints can be viewed as $IC = C_1^{L} \wedge C_2^{L} \wedge \cdots \wedge C_m^{L} \wedge C^{G}$, where $C_e^{L}$ is a conjunct defined over data items in $LD_e$ and $C^{G}$ is a conjunct defined over data items in $GD$. As a result of the model, $LD_e \cap LD_f = \emptyset$, $e \neq f$, and $LD_e \cap GD = \emptyset$. Since only global transactions access global data items and $S^{G}$ is serializable, $S^{GD}$ is serializable. Since $S^{D_k}$ is serializable, $S^{LD_k}$ is serializable for all $k$, $1 \leq k \leq m$. Thus, $S$ is a PWSR schedule. By the theorem of Rastogi et al stated above, $S$ is strongly correct.

Note that the theorem above requires even local transaction programs to have fixed structure if database consistency is to be preserved. In the example given above for the $G_{rw}$ model, only the local transaction program did not have fixed structure, and that resulted in the loss of database consistency. One way to relax the requirement of transactions to be fixed-structured without jeopardizing strong correctness in the $G_{rw}$ model is by utilizing another property of PWSR schedules, proved in Rastogi et al, which is stated below.

Theorem (Rastogi et al). Let $IC = C_1 \wedge C_2 \wedge \cdots \wedge C_l$, where $IC$, $C_i$ are defined over data items in $D$, $d_i$, respectively, such that $d_i \cap d_j = \emptyset$, $i \neq j$. Let $S$ be a schedule. If $S$ is PWSR and ACA, then it is strongly correct, where $S$ is ACA (avoids cascading aborts; Bernstein et al) if whenever $T_i$ reads a data item written by $T_j$, then $T_j$ commits before $T_i$ reads the data item.
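The ACA condition in the theorem can be checked in a single pass over a schedule that records commits. This is a simplified sketch: it assumes a read obtains its value from the most recent write on the item, it ignores aborts, and the encoding of a commit as `("c", txn, None)` and the name `is_aca` are hypothetical.

```python
def is_aca(schedule):
    """schedule: list of (action, txn, item); action is "r", "w", or "c"."""
    last_writer = {}   # item -> transaction that most recently wrote it
    committed = set()
    for action, txn, item in schedule:
        if action == "c":
            committed.add(txn)
        elif action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            # Reading from another transaction that has not committed
            # yet violates ACA.
            if writer is not None and writer != txn and writer not in committed:
                return False
    return True


dirty_read = [("w", "T1", "a"), ("r", "T2", "a"), ("c", "T1", None), ("c", "T2", None)]
clean_read = [("w", "T1", "a"), ("c", "T1", None), ("r", "T2", "a"), ("c", "T2", None)]
```

In `dirty_read`, T2 reads a before T1 commits, so the schedule is not ACA; moving the commit of T1 ahead of the read, as in `clean_read`, satisfies the condition.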



Using the theorem above, it is quite simple to show that in the $G_{rw}$ model, if every local schedule is ACA and there are no integrity constraints between local and global data items, then LSR schedules are strongly correct.

Another way of ensuring LSR schedules are strongly correct, without requiring transaction programs to be fixed-structured, is to restrict the transaction programs to be LDP. In the next subsection, it is shown that under this restriction, LSR schedules are strongly correct in the $G_{rw}L_r$ model, which is more general than the $G_{rw}$ model.

The Global Read-Write and Local Read ($G_{rw}L_r$) Model

Consider a model of MDBS applications with the following restrictions on transactions:

local transactions read and write local data items and also read global data items, and
global transactions read and write global and local data items.
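The access restrictions of the five models discussed in this section can be summarized as a small lookup table. The sketch below is one possible encoding, not the authors' formulation: the table `MODELS`, the plain-text model names, and the function `conforms` are hypothetical, with "L" standing for local data items and "G" for global data items.

```python
# For each model: (local reads, local writes, global reads, global writes),
# each given as the set of item kinds the transaction type may access.
MODELS = {
    "trivial":  ({"L"},      {"L"}, {"G"},      {"G"}),
    "G_r":      ({"L"},      {"L"}, {"G", "L"}, {"G"}),
    "L_r":      ({"L", "G"}, {"L"}, {"G"},      {"G"}),
    "G_rw":     ({"L"},      {"L"}, {"G", "L"}, {"G", "L"}),
    "G_rw L_r": ({"L", "G"}, {"L"}, {"G", "L"}, {"G", "L"}),
}


def conforms(model, is_global_txn, ops, global_items):
    """Check a transaction's (action, item) operations against a model."""
    local_r, local_w, global_r, global_w = MODELS[model]
    allowed_r, allowed_w = (global_r, global_w) if is_global_txn else (local_r, local_w)
    for action, item in ops:
        kind = "G" if item in global_items else "L"
        allowed = allowed_r if action == "r" else allowed_w
        if kind not in allowed:
            return False
    return True


# Hypothetical data: b is global, a is local.
global_items = {"b"}
global_reads_local = [("r", "a"), ("w", "b")]
local_reads_global = [("r", "b"), ("w", "a")]
```

A global transaction that reads the local item a is rejected in the trivial and $L_r$ models but accepted in the $G_r$, $G_{rw}$, and $G_{rw}L_r$ models. In every model, a local transaction that writes a global item is rejected, matching the assumption made earlier in this section.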

The $G_{rw}L_r$ model is more general than any of the models considered so far. Since the $G_{rw}L_r$ model is more general than the $G_r$ model, the presence of integrity constraints between local and global data items may result in the violation of database consistency, as was illustrated by the example given for the $G_r$ model. Similarly, as shown by the examples given for the $L_r$ model, database consistency may be violated if transaction programs are not LDP. The following theorem states conditions under which LSR schedules preserve database consistency in the $G_{rw}L_r$ model.

Theorem. Let S be a LSR schedule in the $G_{rw}L_r$ model. If all transaction programs are LDP and no integrity constraints are present between local and global data items, then S is strongly correct.

Pro of Let DS b e a consistent database state such that legalDS S Let

fDS gS fDS g In order to prove that S is strongly correct weneedtoshow



that DS is consistentandreadT for all T is consistent Since no integrity

 i i S

constraints are presentbetween lo cal and global data items the integrity constraints

L L G L

can b e viewed as IC C C C where C is a conjunct dened over

m e

G

data items in LD and C is a conjunct dened over data items in GD

e

We use the fact that transaction programs are LDP to prove that conjuncts

L L L

C are preserved by S and global transactions read consistentlocal C C

n



D

k

data items Since DS is consistent by Lemma DS is consistent Since

D

k

S is serializable let T T T b e a serialization order of transactions in

 n

D

D k

k

S and transaction programs are LDPbyLemmaDS is consistentand



stateT D SDS for all i i n is consistent Since LD D and

i k e e

D LD D

k k k

stateT D SDS DS is consistent DS is consistent Since r eadT

i k

 

i

D LD

k k

readT is consistent for all T T SinceLD D readT for all

i i S e e

i i

T T is consistent Also since lo cal transactions only access lo cal data items

i i G

at a single site readT for all T T is consistent

i i i L

G

WenowshowthatC is preserved by S Since only global transactions write on

GD

G G

global data items GD S Since S is serializable S is serializable

w G

GD

y Lemma DS is consistent As shown ab ove for Since DS is consistent b

LD

k

all k m readT is consistent T Since LD LD when

i G e f i



[Figure: Investment bank example. Branch DBMSs connected by a wide area network. Global data: global customer account information. Local data: local customer account information, bank earnings (via commissions), local financial news, stock/option quotes.]

m

S

LD

m

LD k

k k

readT r eadT is consistent Thus since e f by Lemma

i i

k 

GD

GD LD for all e e mandr eadT stateT GDSDS

e i

i

by Lemma if for some T GD S stateT GDSDS is consistent

i w i

GD

then stateT GDSDS r eadT is consistent Thus by Lemma DS

i i



is consistentandstateT GD S DS for all T T is consistent Since

i i i G

m

LD

k

GD

k

T isconsistent readT stateT GD S DS andr eadT

i G i

i i

by Lemma r eadT for all T T is consistent

i i i G

LD

GD k

Thus DS and DS for all k k m is consistent Hence by





Lemma DS is consistent Thus S is strongly correct



AN EXAMPLE APPLICATION

In the previous section, we identified a range of MDBS models for which LSR schedules preserve database consistency. Before a specific real-world MDBS application benefits from the increased concurrency offered by the LSR approach, the application domain needs to be analysed with respect to the nature of transaction programs and integrity constraints to map it to one of the developed models. In this section, we perform such an analysis for a financial application and show how LSR results in increased concurrency in the discussed application.

Consider a multinational investment bank with branches all over the world. Each branch has a database that stores information for customers with accounts at the branch. The information for customer account $i$ typically comprises the stocks, bonds, and cash held in the account; the amount of money deposited by the customer into the account, denoted by $d_i$; the percentage of money deposited that is invested in stocks, denoted by $s_i$; the percentage of money deposited that is invested in bonds, denoted by $b_i$; and the percentage of money deposited that is cash, denoted by $c_i$.

A customer may have accounts at one or more branches. Assuming that a customer has
accounts $1, \ldots, m$ at different branches, the customer can specify
$\mathit{max}_s$, which is the limit on the percentage of deposits in the accounts
that can be invested in stocks (similar limits can be specified by a customer for
bonds and cash). Thus, the following constraint holds for the customer accounts
$1, \ldots, m$:

$$\frac{d_1 s_1 + \cdots + d_m s_m}{d_1 + \cdots + d_m} \le \mathit{max}_s$$

This leads to a natural partition of customer data into local and global data.
Account data for customers with accounts at multiple branches (referred to as global
customers) is global data, since integrity constraints are present between global
customer accounts at different branches. Account data for customers with accounts at
a single branch (referred to as local customers) is local data.
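The stock-limit constraint for a customer can be illustrated with a minimal Python
sketch. The `Account` class, the function names, and the sample figures are
hypothetical and not part of the application described here. The sketch also assumes
that $s_i$ is stored as a fraction between 0 and 1 and that an empty portfolio
satisfies the limit trivially.

```python
from dataclasses import dataclass


@dataclass
class Account:
    deposited: float       # d_i: money deposited into account i
    stock_fraction: float  # s_i: share of d_i invested in stocks (assumed in 0..1)


def stock_exposure(accounts):
    """Deposit-weighted stock share: sum(d_i * s_i) / sum(d_i)."""
    total = sum(a.deposited for a in accounts)
    if total == 0:
        return 0.0  # assumption: no deposits means no exposure
    return sum(a.deposited * a.stock_fraction for a in accounts) / total


def satisfies_limit(accounts, max_s):
    """True if the customer's overall stock share does not exceed max_s."""
    return stock_exposure(accounts) <= max_s


# Hypothetical global customer with accounts at two branches.
accounts = [Account(1000.0, 0.5), Account(3000.0, 0.25)]
exposure = stock_exposure(accounts)  # (500 + 750) / 4000
```

Because the check needs $d_i$ and $s_i$ from every branch where the customer holds an
account, evaluating it for a global customer requires reads at multiple sites.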

In addition to customer account data, a database at a branch also contains
information about stock quotes, interest rates, the levels of various indexes, news
reports about various companies (e.g., mergers, takeovers, new products, new orders,
earnings reports), analysts' ratings (e.g., upgrades, downgrades), and data from the
government about unemployment numbers, housing starts, inflation, price and
manufacturing indices, etc. This information is received from news agencies, is
updated locally in real time, and is part of local data. In addition, every time
the branch buys or sells stocks for a customer, it charges the customer a commission.
These commissions constitute the branch's earnings, and earnings, a local data item
in the branch's database, keeps track of them (see the figure for the investment bank
example). Note that there are no integrity constraints between local and global data
items.

As mentioned before, transactions that execute at a single site are local
transactions, while those that execute at multiple sites are global transactions.
Thus, a transaction that deposits money into a local customer account $i$ at a branch
is a local transaction. It invests portions of the newly deposited money into stocks
and bonds after consulting the news database (e.g., stock/option quotes, analyst
ratings for companies, interest rates, inflation, etc.) and without violating the
constraints (such as $s_i \le \mathit{max}_s$). It also adds the commissions from the
trades to earnings in the branch's database. A transaction that deposits money into a
global customer account at a branch (let $1, \ldots, m$ be the accounts for the
customer), on the other hand, is a global transaction: in order to ensure that the
constraint

$$\frac{d_1 s_1 + \cdots + d_m s_m}{d_1 + \cdots + d_m} \le \mathit{max}_s$$

is not violated, the transaction has to read the values of
$d_1, s_1, \ldots, d_m, s_m$ for accounts $1, \ldots, m$ from the different branch
databases. Thus, global transactions read and write both global and local data items
(a global deposit transaction reads the local news database and also increments
earnings for the branch to account for the commissions).

Thus, the above environment fits the $G_{rw}$ model, in which local transactions read
and write only local data, while global transactions read and write local as well as
global data items. Since no integrity constraints are present between local and
global data items, and local and global deposit transactions can be written to have
fixed structure, it follows from the theorem established for this model in the
previous section that 2LSR schedules can result in higher concurrency without
violating database consistency.
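As a rough illustration of the mapping step, the following Python sketch checks only
the data-access part of the model: a transaction executing at a single site is
treated as local and must not touch global data, while a transaction executing at
several sites is treated as global and is unrestricted. The dictionary layout and the
item names are hypothetical. The further requirements discussed above (fixed-structure
programs, no constraints between local and global data) are not checked.

```python
def classify(sites):
    """A transaction is local if it executes at one site, global otherwise."""
    return "local" if len(sites) == 1 else "global"


def respects_access_rules(txn, global_items):
    """Local transactions may access only local data; global ones may access both."""
    if classify(txn["sites"]) == "global":
        return True
    accessed = txn["reads"] | txn["writes"]
    return accessed.isdisjoint(global_items)


# Hypothetical data items: accounts of a global customer at two branches.
global_items = {"acct_g@NY", "acct_g@London"}

local_deposit = {
    "sites": {"NY"},
    "reads": {"news@NY", "acct_l@NY"},
    "writes": {"acct_l@NY", "earnings@NY"},
}
global_deposit = {
    "sites": {"NY", "London"},
    "reads": {"acct_g@NY", "acct_g@London", "news@NY"},
    "writes": {"acct_g@NY", "earnings@NY"},
}
bad_local = {"sites": {"NY"}, "reads": {"acct_g@NY"}, "writes": set()}
```

Here `local_deposit` and `global_deposit` pass the check, whereas `bad_local` does not,
because a single-site transaction reads a global customer account.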

RELATED AND PREVIOUS WORK

Research on concurrency control in MDBSs has been done along two complementary
directions. Significant work has gone into developing techniques to ensure global
serializability. These techniques include [Breitbart and Silberschatz; Breitbart
et al.; Georgakopoulos et al.; Batra et al.; Du et al.; Mehrotra et al.; Pu; Raz;
Breitbart et al.; Breitbart and Silberschatz]. As mentioned previously, since the
MDBS software is unaware of the execution of local transactions, schemes that ensure
serializability allow only a low degree of concurrency.

Extensive research has also been done to relax the serializability requirement [Du
and Elmagarmid; Wu et al.; Mehrotra et al. a; Mehrotra et al.; Rastogi et al.;
Rastogi et al.]. Numerous correctness criteria similar to two-level serializability
that permit certain nonserializable executions have been proposed. For example, the
correctness criterion local serializability (LSR) only requires each local schedule
(that is, the schedules at the local DBMSs) to be serializable and does not impose
any restrictions on the global transactions. An alternative mechanism, referred to as
quasi-serializability (QSR), was introduced in [Du and Elmagarmid]. A schedule $S$ is
QSR if each of the local schedules is serializable and $S$ is conflict equivalent to
a schedule in which global transactions are executed serially. It is not too
difficult to see that the following relationship holds between the classes of
schedules described above, where CSR denotes the set of conflict serializable
schedules:

$$\mathrm{CSR} \subset \mathrm{QSR} \subset \mathrm{2LSR} \subset \mathrm{LSR}$$

Each of the containments in the relationship above is proper. For example, an
earlier example gives a 2LSR schedule that is not QSR. As with 2LSR, ensuring LSR
and QSR in MDBS environments is relatively straightforward. For example, to ensure
that schedules are LSR, the GTM only needs to forward the global transaction
operations to the local DBMSs and does not need to perform any concurrency control.
It can be ensured that schedules are QSR by following the altruistic locking
protocol developed in [Tal and Alonso].

As with 2LSR, since QSR and LSR schedules allow nonserializable executions, they
preserve database consistency only under appropriate restrictions on the nature of
the schedules. Restrictions under which LSR schedules preserve database consistency
have been identified in [Rastogi et al.; Breitbart et al.]. Furthermore, the authors
of [Du and Elmagarmid] claimed that QSR schedules preserve database consistency under
the following assumptions:

(1) There are no integrity constraints between data items at different sites, except
equality constraints, which model replicated data.

(2) Transaction programs do not have value dependencies.

(3) Local transactions are not permitted to write replicated data.

Since every QSR schedule is also 2LSR, their result follows from our theorem, which
proves that database consistency is ensured by 2LSR (and hence also QSR) schedules
under more general conditions.

Besides the above schemes, which exploit the nature of integrity constraints in an
MDBS environment to relax the serializability requirement, another approach to
relaxing serializability is to exploit the semantics of transactions [Garcia-Molina
and Salem; Alonso et al.; Veijalainen; Garcia-Molina; Buchmann et al.; Farrag and
Ozsu; Rastogi et al.]. These schemes consider a global transaction to consist of a
number of subtransactions, each of which has an associated type. It is assumed that
the application developer specifies a priori the various subtransaction types, along
with the set of interleavings of the subtransactions that do not result in a loss of
database consistency. A transaction manager uses this specification to permit only
acceptable interleavings of the transactions. Schemes in this category differ from
one another in the mechanism they employ to specify the interleavings and in the
algorithm that the transaction manager uses to ensure that undesirable interleavings
are not permitted.



Finally, another approach that has been proposed for relaxing the serializability
requirement is to tolerate a bounded degree of inconsistency due to nonserializable
executions [Pu and Leff; Wu et al.; Wong and Agrawal]. These schemes relax the
serializability requirement in that they permit transactions to interleave as long
as the degree of inconsistency introduced by the interleaving is bounded. Schemes in
this category differ from each other in the mechanism they use to quantify the degree
of inconsistency. For example, in [Pu and Leff], the degree of inconsistency is
quantified as the number of conflicts involving a read-only query that, if not
present, would render the schedule serializable.

CONCLUSION

In this paper, we adopt a correctness criterion for MDBS applications that is weaker
than serializability, which we refer to as two-level serializability (2LSR). The
motivation for abandoning serializability as the correctness criterion is the low
degree of concurrency that results from protocols for ensuring serializability in
MDBSs. Two-level serializability requires only the projection of the global schedule
on the set of global transactions to be serializable and each of the local schedules
to be serializable. As a result, protocols for ensuring that schedules are 2LSR have
the advantages of being simple, allowing a high degree of concurrency, and not
violating the local autonomy of sites.
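The two conditions in this definition can be sketched as a small Python check. The
sketch assumes that "serializable" is read as conflict serializable, tested through
acyclicity of the conflict graph. The tuple layout `(transaction, operation, item,
site)`, the function names, and the sample schedule are hypothetical.

```python
from itertools import combinations


def conflict_edges(ops):
    """Edges Ti -> Tj for each pair of conflicting operations, in schedule order."""
    edges = set()
    for (t1, a1, x1, _), (t2, a2, x2, _) in combinations(ops, 2):
        if t1 != t2 and x1 == x2 and "w" in (a1, a2):
            edges.add((t1, t2))
    return edges


def is_acyclic(edges):
    """Repeatedly strip nodes without incoming edges; a stuck remainder is a cycle."""
    nodes = {n for e in edges for n in e}
    edges = set(edges)
    while nodes:
        sources = {n for n in nodes if all(dst != n for _, dst in edges)}
        if not sources:
            return False
        nodes -= sources
        edges = {(u, v) for u, v in edges if u not in sources}
    return True


def is_csr(ops):
    return is_acyclic(conflict_edges(ops))


def is_2lsr(schedule, global_txns):
    """Each local schedule and the projection on global transactions must be CSR."""
    sites = {site for _, _, _, site in schedule}
    local_ok = all(is_csr([op for op in schedule if op[3] == s]) for s in sites)
    global_ok = is_csr([op for op in schedule if op[0] in global_txns])
    return local_ok and global_ok


# Global transactions G1, G2 and local transactions L1 (site s1), L2 (site s2).
schedule = [
    ("G1", "w", "a", "s1"), ("L1", "r", "a", "s1"),
    ("L1", "r", "b", "s1"), ("G2", "w", "b", "s1"),
    ("G2", "w", "c", "s2"), ("L2", "r", "c", "s2"),
    ("L2", "r", "d", "s2"), ("G1", "w", "d", "s2"),
]
```

In this sample, the local transactions induce the order G1 before G2 at site s1 and
G2 before G1 at site s2, so the schedule as a whole is not conflict serializable. It
is nevertheless 2LSR under the assumptions above, since each local schedule is
serializable and G1 and G2 have no direct conflicts.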

However, 2LSR schedules preserve database consistency only in certain restricted
MDBS applications. In many MDBS applications, it is known which data items are
involved in intersite integrity constraints (e.g., replicated data). We capture this
knowledge in our MDBS model by partitioning data items into two disjoint sets: global
and local data items. A data item is a global data item if there is an integrity
constraint between it and a data item at a different site. This knowledge allows us
to prove, in certain cases, that 2LSR executions preserve database consistency even
though they are nonserializable. We identified models for several of these cases,
each of which involves different restrictions on a transaction's read and write
operations. The models provide a range of options to the designer of an MDBS. We have
characterized the relative power of our models both in terms of concurrency and in
terms of the restrictions imposed on transactions.

Our characterization of the tradeoffs between restrictions on data access and
restrictions on transaction programs and integrity constraints offers guidance to
designers of MDBS applications. If the required restrictions are deemed too onerous
for a specific application, the designer may choose to ignore certain integrity
constraints so that greater data-access freedom is possible. The ignored constraints
must be managed by external means (audit transactions and/or human intervention), as
was the case prior to the integration of local databases into a multidatabase. Even
in the case of such compromises, our approach helps to bound inconsistency and ease
its resolution at the user level.

The use of our models in specific application contexts, as well as design aids for
application developers in assessing the tradeoffs among the models, are interesting
avenues for further study. Another important area for further research is the
automatic or semiautomatic preservation of integrity constraints that cannot be
formalized as a predicate over the set of all data items.



REFERENCES

Alonso, R., Garcia-Molina, H., and Salem, K. Concurrency control and recovery for global procedures in federated database systems. Data Engineering (Sept.).

Batra, R., Georgakopoulos, D., and Rusinkiewicz, M. A decentralized deadlock-free concurrency control method for multidatabase transactions. In Proceedings of the Twelfth International Conference on Distributed Computing Systems (Yokohama, Japan).

Bernstein, P. A., Hadzilacos, V., and Goodman, N. Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading, MA.

Breitbart, Y., Georgakopoulos, D., Rusinkiewicz, M., and Silberschatz, A. On rigorous transaction scheduling. IEEE Transactions on Software Engineering.

Breitbart, Y., Garcia-Molina, H., and Silberschatz, A. Overview of multidatabase transaction management. VLDB Journal.

Breitbart, Y. and Silberschatz, A. Multidatabase update issues. In Proceedings of ACM-SIGMOD International Conference on Management of Data (Chicago).

Breitbart, Y. and Silberschatz, A. Strong recoverability in multidatabase systems. In Proceedings of the Second International Workshop on Research Issues on Data Engineering: Transaction and Query Processing (Mission Palms, Arizona, February).

Breitbart, Y., Silberschatz, A., and Thompson, G. R. Reliable transaction management in a multidatabase system. In Proceedings of ACM-SIGMOD International Conference on Management of Data (Atlantic City, New Jersey).

Buchmann, A., Ozsu, M. T., Hornick, M., Georgakopoulos, D., and Manola, F. A. A transaction model for active distributed object systems. In A. K. Elmagarmid, Ed., Advanced Transaction Models for New Applications. Morgan-Kaufmann.

Du, W., Elmagarmid, A., Leu, Y., and Ostermann, S. Effects of local autonomy on global concurrency control in heterogeneous distributed database systems. In Proceedings of the Second International Conference on Data and Knowledge Systems for Manufacturing and Engineering.

Du, W. and Elmagarmid, A. K. Quasi serializability: a correctness criterion for global concurrency control in InterBase. In Proceedings of the Fifteenth International Conference on Very Large Databases (Amsterdam).

Elmagarmid, A. and Du, W. A paradigm for concurrency control in heterogeneous distributed database systems. In Proceedings of the Sixth International Conference on Data Engineering (Los Angeles).

Farrag, A. A. and Ozsu, M. T. Using semantic knowledge of transactions to increase concurrency. ACM Transactions on Database Systems (Dec.).

Garcia-Molina, H. Using semantic knowledge for transaction processing in a distributed database. ACM Transactions on Database Systems (June).

Garcia-Molina, H. and Salem, K. Sagas. In Proceedings of ACM-SIGMOD International Conference on Management of Data (San Francisco).

Georgakopoulos, D. Multidatabase recoverability and recovery. In Proceedings of the Seventh International Conference on Data Engineering (Kobe, Japan).

Georgakopoulos, D., Rusinkiewicz, M., and Sheth, A. On serializability of multidatabase transactions through forced local conflicts. In Proceedings of the Seventh International Conference on Data Engineering (Kobe, Japan).

Gray, J. and Reuter, A. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, San Mateo, California.

Korth, H. F., Kim, W., and Bancilhon, F. On long duration CAD transactions. Information Sciences.

Kung, D. and Papadimitriou, C. Optimality theory of databases. In Proceedings of ACM-SIGMOD International Conference on Management of Data (Ann Arbor).

M. Ouzzani, M. A. A., and N. L. B. A top-down approach for two-level serializability. In VLDB (Santiago de Chile, Chile).

Mehrotra, S., Rastogi, R., Breitbart, Y., Korth, H. F., and Silberschatz, A. The concurrency control problem in multidatabases: Characteristics and solutions. In Proceedings of ACM-SIGMOD International Conference on Management of Data (San Diego, California).

Mehrotra, S., Rastogi, R., Korth, H. F., and Silberschatz, A. b. Maintaining database consistency in heterogeneous distributed database systems. Technical Report TR, Department of Computer Science, University of Texas at Austin.

Mehrotra, S., Rastogi, R., Korth, H. F., and Silberschatz, A. a. Nonserializable executions in heterogeneous distributed database systems. In Proceedings of the First International Conference on Parallel and Distributed Information Systems (Miami Beach, Florida).

Mehrotra, S., Rastogi, R., Korth, H. F., and Silberschatz, A. Relaxing serializability in multidatabase systems. In Proceedings of the Second International Workshop on Research Issues on Data Engineering: Transaction and Query Processing (Mission Palms, Arizona, February).

Mohan, C. ARIES/KVL: A key-value locking method for concurrency control of multiaction transactions operating on B-tree indexes. In Proceedings of the Conference on Very Large Databases (Brisbane, March). Morgan Kaufmann.

Papadimitriou, C. The Theory of Database Concurrency Control. Computer Science Press, Rockville, Maryland.

Pu, C. Superdatabases for composition of heterogeneous databases. In Proceedings of the Fourth International Conference on Data Engineering (Los Angeles).

Pu, C. and Leff, A. Replica control in distributed systems: An asynchronous approach. In Proceedings of ACM-SIGMOD International Conference on Management of Data (Denver, Colorado, May).

Rastogi, R., Korth, H. F., and Silberschatz, A. Exploiting transaction semantics in multidatabase systems. Technical Report TR, Department of Computer Science, University of Texas at Austin.

Rastogi, R., Mehrotra, S., Breitbart, Y., Korth, H. F., and Silberschatz, A. On correctness of nonserializable executions. In Proceedings of the Twelfth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (Washington, DC).

Rastogi, R., Mehrotra, S., Korth, H. F., and Silberschatz, A. Transcending the serializability requirement. In Data Engineering Bulletin.

Raz, Y. The principle of atomic or guaranteeing serializability in a heterogeneous environment of multiple autonomous resource managers using atomic commitment. In Proceedings of the Eighteenth International Conference on Very Large Databases.

Tal, A. and Alonso, R. Integration of commit protocols in heterogeneous databases. In Proceedings of the International Conference on Information and Knowledge Management (Baltimore, Maryland, Nov.).

Veijalainen, J. Transaction Concepts in Autonomous Database Environments. R. Oldenbourg Verlag, Munich.

Wong, M. H. and Agrawal, D. Tolerating bounded inconsistency for increasing concurrency in database systems. In Proceedings of the Eleventh ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (San Diego, California, June).

Wu, K., Yu, P., and Pu, C. Divergence control for epsilon-serializability. In Proceedings of the Eighth International Conference on Data Engineering (Mission Palms, Arizona).