Guide to Design, Implementation and Management of Distributed Databases
Total Page:16
File Type:pdf, Size:1020Kb
NATL INST OF STAND 4 TECH H.I C NIST Special Publication 500-185 A111D3 MTfiDfi2 Computer Systems Guide to Design, Technology Implementation and Management U.S. DEPARTMENT OF COMMERCE National Institute of of Distributed Databases Standards and Technology Elizabeth N. Fong Nisr Charles L. Sheppard Kathryn A. Harvill NIST I PUBLICATIONS I 100 .U57 500-185 1991 C.2 NIST Special Publication 500-185 ^/c 5oo-n Guide to Design, Implementation and Management of Distributed Databases Elizabeth N. Fong Charles L. Sheppard Kathryn A. Harvill Computer Systems Laboratory National Institute of Standards and Technology Gaithersburg, MD 20899 February 1991 U.S. DEPARTMENT OF COMMERCE Robert A. Mosbacher, Secretary NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY John W. Lyons, Director Reports on Computer Systems Technology The National institute of Standards and Technology (NIST) has a unique responsibility for computer systems technology within the Federal government, NIST's Computer Systems Laboratory (CSL) devel- ops standards and guidelines, provides technical assistance, and conducts research for computers and related telecommunications systems to achieve more effective utilization of Federal Information technol- ogy resources. CSL's responsibilities include development of technical, management, physical, and ad- ministrative standards and guidelines for the cost-effective security and privacy of sensitive unclassified information processed in Federal computers. CSL assists agencies in developing security plans and in improving computer security awareness training. This Special Publication 500 series reports CSL re- search and guidelines to Federal agencies as well as to organizations in industry, government, and academia. National Institute of Standards and Technology Special Publication 500-185 Natl. Inst. Stand. Technol. Spec. Publ. 500-185, 59 pages (Feb. 1991) CODEN: NSPUE2 U.S. GOVERNMENT PRINTING OFFICE WASHINGTON: 1991 For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402 TABLE OF CONTENTS ABSTRACT v ACKNOWLEDGMENTS V 1.0 INTRODUCTION 1 1.1 The Promises and Realities of Distributed DBMS . 1 1.2 Motivation 2 1.3 Purpose and Scope 3 1.4 Organization of the Guide 4 1.5 Disclaimer 6 2.0 DEVELOPMENT PHASES TO THE DISTRIBUTED DATABASE .... 7 2.1 Planning Phase 7 2.2 Design Phase 8 2.3 Installation and Implementation Phase 8 2 . 4 Support and Maintenance Phase 8 2.5 Detail Development Phase 8 3.0 CORPORATION STRATEGY PLANNING 10 3.1 Summary of Tasks 11 4.0 OVERALL DESIGN OF DISTRIBUTED DATABASE STRATEGY .... 12 4.1 Types of Distributed Environment 12 4.2 Selecting an Architecture for a Distributed Environment 15 4.3 Considerations for Standards 17 4.4 Summary of Tasks 18 5.0 DETAILED DESIGN FOR DISTRIBUTED DATABASE ENVIRONMENT . 2 0 5.1 Hardware Design 2 0 5.2 Software Design 2 0 5.3 Communication Network Design 21 5.4 Summary of Tasks 2 2 6.0 INSTALLATION OF DISTRIBUTED DATABASE ENVIRONMENT .... 25 6.1 Installation of Hardware 25 6.2 Installation of the Network Communication Software 2 6 6.3 Installation of the DBMS Communication Software . 27 6.4 Installation of the DDBMS and Other Support Software 2 7 6.5 Summary of Tasks 28 7.0 DETAILED DESIGN FOR DISTRIBUTED DATABASE APPLICATION . 29 7.1 Application Database Architectural Alternatives . 29 7.2 How to Distribute the Databases 29 7.3 The Role of the Data Dictionary/Directory for DDBMS 31 iii 7.4 Schema and Data Update Control 32 7.5 Data Administration Functions 33 7.6 Summary of Tasks 34 8.0 APPLICATION DEVELOPMENT ; . 36 8.1 Define Schemas 36 8.2 Populate Database 36 8.3 Application Program Construction 36 8.4 Multi-site Requests Processing . 37 8.5 Testing and Verifying Correctness 37 8.6 Summary of Tasks 37 9.0 SUPPORT FOR THE DISTRIBUTED DATABASE 39 9.1 Tuning for Better Performance 39 9.2 Backup, Recovery and Security Protection 39 9.3 User Training and Resolving Problems 40 9.4 Summary of Tasks 4 0 10.0 CONCLUDING REMARKS 41 10.1 Critical Success Factors 41 10.2 Future Issues 41 11.0 REFERENCES 45 APPENDIX A - DESCRIPTION OF DISTRIBUTED ENVIRONMENT .... 47 APPENDIX B - DESCRIPTION OF THE DISTRIBUTED APPLICATION ... 49 iv . ABSTRACT For an organization to operate in a distributed database environment, there are two related but distinct tasks that must be accomplished. First, the distributed database environment must be established. Then, a distributed database application can be designed and installed within the environment. This guide de- scribes both of these activities based on a development life-cycle phase framework. This guide provides practical information and identifies skills needed for systems designers, application developers, database and data administrators who are interested in the effective planning, design, installation, and support for a distributed database environment. In addition, this guide in- structs system analysts and application developers with a step- by-step procedure for the design, implementation and management of a distributed DBMS application. This guide also notes that truly heterogeneous distributed database technology is still a research consideration. Commercial products are making progress in this direction, but usually with many update and concurrency restrictions and often with severe performance penalties. The scope of this guide is based on experiences gained in the development of a distributed DBMS environment using two off-the- shelf homogeneous distributed database management systems. In support of the successful installation of the distributed database environment, a demonstration distributed application was designed and installed. Keywords: databases; DDBMS; distributed application; dis- tributed database management systems; distributed environment; life-cycle phases. ACKNOWLEDGMENTS The technical work for this report was done through hands-on experimentation with two distributed database management systems. These distributed database management systems were provided to us by the system vendors as part of cooperative projects with the Computer Systems Laboratory (CSL) . The cooperative projects were performed in the NIST/CSL Database Laboratory. We acknowledge the contributions of Joseph Collica, and Bruce Rosen of NIST who assisted us with careful review of this paper and suggested important improvements v . 1 . 0 INTRODUCTION This guide provides practical information and identifies skills needed for systems designers, application developers, data and database administrators, who are interested in the effective planning, design, installation, and support for a distributed database management system (DDBMS) environment. In addition, this guide provides system analysts and application developers with a step-by-step procedure for the design, implementation and manage- ment of a DDBMS application. A distributed DBMS application is a software system which runs in the distributed database environ- ment. The procedure is based on a development life-cycle phase framework. At each phase in the life cycle, this guide describes the nature of the tasks, who should do these tasks, and some general guidance on accomplishing each task. 1.1 The Promises and Realities of Distributed DBMS A distributed database environment consists of a collection of sites or nodes, connected together by a communication network. Each node has its own hardware, central processor and software which may, or may not, include a database management system (DBMS) In the ultimate distributed database computing environment, a user will be able to access data residing anywhere in the com- puter network, without regard to differences among computers, operating systems, data manipulation languages, or file structures [CERI84]. Data that is actually distributed across multiple remote computers will appear to the user as if it resided on the user's own computer. This scenario is functionally limited using today's distributed database technology. True distributed database technology is still a research consideration. The functional limitations are generally in the following areas: o transaction management, o standard protocols for establishing a remote connection, and o independence of network topology. Transaction management capabilities are essential to maintaining reliable and accurate databases. In some cases, today's dis- tributed database software places the responsibility of managing transactions on the application program. In other cases, transac- tions are committed or rolled-back at each location independently which means that it is not possible to create a single distributed transaction. For example, multiple site updates require multiple transactions 1 ,. In today's distributed database technology, different gateway software has to be used and installed to connect nodes using dif- ferent DDBMS software. Thus, connectivity among heterogeneous DDBMS nodes is not readily available (i.e., available only through selected vendor markets) In some instances, DDBMS software is tied to a single network operating system. This limits the design alternatives for the DDBMS environment to the products of a single vendor. It is ad- visable to select a product that supports more than one network operating system. This will increase the possibility of success- fully integrating the DDBMS software into existing computer environments. In reality, distributed databases encompass a wide spectrum of possibilities including: o Remote terminal access to centralized DBMS (e.g., airline reservation system) o Remote terminal access to different