Ensuring Consistency in Multidatabases by Preserving Two-Level Serializability

Sharad Mehrotra
W. Springfield Avenue, University of Illinois, Urbana, IL

Rajeev Rastogi, Henry F. Korth, Abraham Silberschatz
Lucent Technologies, Mountain Avenue, Murray Hill, NJ

Association for Computing Machinery, Inc., Broadway, New York, NY, USA

The concept of serializability has been the traditionally accepted correctness criterion in database systems. However, in multidatabase systems (MDBSs), ensuring global serializability is a difficult task. The difficulty arises due to the heterogeneity of the concurrency control protocols used by the participating local database management systems (DBMSs) and the desire to preserve the autonomy of the local DBMSs. In general, solutions to the global serializability problem result in executions with a low degree of concurrency. The alternative, relaxed serializability, may result in data inconsistency. In this paper, we introduce a systematic approach to relaxing the serializability requirement in MDBS environments. Our approach exploits the structure of the integrity constraints and the nature of transaction programs to ensure consistency without requiring executions to be serializable. We develop a simple yet powerful classification of MDBSs based on the nature of integrity constraints and transaction programs. For each of the identified models, we show how consistency can be preserved by ensuring that executions are two-level serializable (2LSR). 2LSR is a correctness criterion for MDBS environments weaker than serializability. What makes our approach interesting is that, unlike global serializability, ensuring 2LSR in MDBS environments is relatively simple, and protocols to ensure 2LSR permit a high degree of concurrency. Furthermore, we believe the range of models we consider covers many practical MDBS environments to which the results of this paper can be applied to preserve database consistency.

General Terms: Database consistency, Multidatabase Systems, Beyond serializability
Additional Key Words and Phrases: Concurrency control, heterogeneous database integration

Much of this work was done at the University of Texas at Austin with support from TARP under Grant ARP, the NSF under Grants IRI and IRI, and grants from the IBM and HP corporations.

This is a preliminary release of an article accepted by ACM Transactions on Database Systems. The definitive version is currently in production at ACM and, when released, will supersede this version.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM Inc., Broadway, New York, NY, USA, or permissions@acm.org.

INTRODUCTION

Databases are usually constructed to support a single enterprise. However, for many new application domains, there is a need to extend the database environment to include a broader range of users and to include several distinct databases within a common framework. These needs develop from the integration of departmental information systems within a corporation, from corporate mergers and acquisitions, from cooperative ventures involving independent corporations, etc. Although current network technology allows one physically to support such integration, serious problems exist at the database system level.

The databases to be integrated may run on distinct database management systems. As a result, the data models (relational, object-oriented, hierarchical, etc.) may differ, and the application
programs that access the databases may be written in distinct and possibly incompatible languages. The data itself may be in distinct formats on each system. These distinctions may be in data types, physical data representation, or data semantics (units of measure, language, national or corporate conventions, etc.). The organizations whose databases are being integrated may maintain a significant degree of autonomy. This may limit the degree of central control that can be imposed on the integrated system. Since each database system environment represents an enormous investment in application development, it is usually not economically feasible for all of the databases to be converted to a single database management system. Issues of autonomy also inhibit such conversions.

A multidatabase system (MDBS) is a software system running on top of the individual database management systems (DBMSs) that manage the various databases to be integrated. The job of the MDBS is to present users with the illusion of a single unified database environment and to hide, to the extent possible, the fact that the environment consists of independent, geographically distributed sites, each running its own DBMS.

This paper focuses on one of the many issues in MDBS design: transaction management. Ideally, the MDBS should preserve the usual transactional properties of atomicity, consistency, isolation, and durability [Gray and Reuter] at the global level. Achieving this, however, is difficult because of the following two characteristics of MDBS environments:

Heterogeneity: Each local DBMS may follow different concurrency control and recovery algorithms.

Autonomy: It is not practically feasible to modify the underlying local DBMS software to facilitate integration.

Database consistency is traditionally ensured by requiring that the concurrent execution of transactions be serializable, that is, equivalent to a nonconcurrent execution [Papadimitriou]. The problem of ensuring global serializability in an MDBS environment has
been studied extensively [Breitbart and Silberschatz; Breitbart et al.; Georgakopoulos et al.; Batra et al.; Du et al.; Mehrotra et al.; Pu; Raz]. A necessary condition for maintaining global serializability is that all global transactions (transactions that access data at multiple DBMSs) are serialized in the same order at all the sites at which they execute. The MDBS is limited in its power to ensure this property, since preexisting applications in the local DBMSs can generate transactions that run entirely within that local DBMS. These local transactions may generate indirect conflicts among global transactions, conflicts of which the MDBS is unaware. One way to guarantee the property is to make the pessimistic assumption that any two global transactions that execute at a common local DBMS conflict. This, however, results in low concurrency.

One way to overcome the problem of low concurrency is to relax the serializability requirement for MDBS environments. Numerous such approaches have been proposed [Wu et al.; Du and Elmagarmid; Rastogi et al.; Rastogi et al.; Mehrotra et al. a] and are discussed in the section that covers related work. Relaxing the serializability requirement, however, may result in a loss of database consistency, which for many database applications cannot be tolerated.

In this paper, we introduce a systematic approach to relaxing the serializability requirement in MDBS environments without jeopardizing database consistency. We develop a simple yet powerful classification for MDBSs based on the structure of the integrity constraints and the nature of the transaction programs present in the system. For each of the developed models, we show how consistency can be preserved by ensuring that executions are two-level serializable (2LSR). Two-level serializability is a correctness criterion for MDBS environments, introduced in [Mehrotra et al. a], that is weaker than serializability. What makes our approach interesting is that, unlike global serializability, ensuring that executions are 2LSR in MDBS environments is relatively simple and
protocols for ensuring 2LSR allow a high degree of concurrency. Furthermore, we believe that the range of models for which we show 2LSR executions preserve database consistency covers many practical MDBS environments to which the results of this paper can be applied.

The remainder of the paper is organized as follows. We first establish the preliminaries for our work: we formalize the notion of database consistency and develop the transaction and schedule model used in the rest of the paper. We then describe the MDBS model to which our work is applicable. Next, we discuss the 2LSR correctness criterion for MDBS environments and mechanisms that can be used to ensure schedules are 2LSR. We then develop a spectrum of MDBS models, based on the structure of the integrity constraints and the nature of transaction programs, for which we show 2LSR schedules preserve database consistency. We follow this with a financial application that can be captured by one of our MDBS models and thus demonstrates the utility of 2LSR. We then discuss related and previous work, and we close with concluding remarks.

PRELIMINARIES

In a database system where serializability is ensured through some concurrency control scheme, database consistency can be maintained by simply requiring that each transaction individually maintain consistency. In such systems, the transaction manager need not be concerned with the specific nature of the consistency constraints. In contrast, for us to be able to relax the serializability requirements while maintaining consistency, we