IJCST Vol. 2, Issue 3, September 2011
ISSN : 2229-4333 (Print) | ISSN : 0976-8491 (Online)

Transaction Processing and Management in Distributed Systems

1Gunjan Verma, 2Vineeta Verma, 3Komal Singhal, 4Megha Maheshwari
1Meerut Institute of Engineering & Technology, Meerut
2Sardar Vallabhbhai Patel University of Agriculture & Technology, Meerut
3Meerut Institute of Engineering & Technology, Meerut
4Meerut Institute of Engineering & Technology, Meerut

Abstract
Distributed database systems (DDBS) face many different problems when databases are accessed in a distributed, multiuser, and replicated environment. Access control and transaction management require different processes to handle and monitor data access and updates in a DDBS. Multi-tier client/server networks make a DDBS a better solution for accessing and controlling databases. Some leading database management systems use the two-phase commit technique to maintain a consistent state for the database. The objective of this paper is to explain transaction processing and management in a DDBS and how this technique can be implemented.

Keywords
Transaction, commit, nodes, DB2, Commit phase.

1. Introduction to Distributed Database System (DDBS)
A distributed database is a database under the control of a central database management system (DBMS) in which the storage devices are not all attached to a common CPU. It may be stored on multiple computers located in the same physical location, or it may be dispersed over a network of interconnected computers. Collections of data can be distributed across multiple physical locations. A distributed database system (DDBS) is a system whose data is distributed and replicated over several locations. Data may be distributed over a network using horizontal and vertical fragmentation, which are similar to the selection and projection operations, respectively, in Structured Query Language (SQL). A DDBS shares the access control and transaction management problems of a centralized database, such as concurrent user access control and deadlock detection and resolution. However, a DDBS must also cope with additional problems. Access control and transaction management in a DDBS need different methods to monitor data access and updates to distributed and replicated databases. IBM DB2, a database management system (DBMS), employs the two-phase commit technique to maintain a consistent state for the databases. This paper shows the implementation of two-phase commit for transaction management in a DDBMS and how DB2 implements this technique.

Distributed Database Security
For distributed database systems, the database supports all of the security features that are available in a non-distributed database environment, including:
• Password authentication for users and roles
• Some types of external authentication for users and roles, including:
o Kerberos version 5 for connected user links
o DCE for connected user links

2. Properties of Transactions
A transaction has four properties that lead to the consistency and reliability of a distributed database. These are Atomicity, Consistency, Isolation, and Durability.

Atomicity: Atomicity requires that database modifications follow an "all or nothing" rule. Each transaction is said to be atomic. If one part of the transaction fails, the entire transaction fails and the database state is left unchanged. It is critical that the database management system maintain the atomic nature of transactions in spite of any application, DBMS, operating system, or hardware failure. An atomic transaction cannot be subdivided and must be processed in its entirety or not at all. Atomicity means that users do not have to worry about the effect of incomplete transactions. Transactions can fail for several kinds of reasons:
1. Hardware failure: A disk drive fails, preventing some of the transaction's database changes from taking effect.
2. System failure: The user loses their connection to the application before providing all necessary information.
3. Database failure: For example, the database runs out of room to hold additional data.
4. Application failure: The application attempts to post data that violates a rule that the database itself enforces, such as attempting to insert a duplicate value into a column that must be unique.

Consistency: Referring to correctness, this property deals with maintaining consistent data in a database system. Consistency falls under the subject of concurrency control. For example, "dirty data" is data that has been modified by a transaction that has not yet committed. Thus, the job of concurrency control is to prevent transactions from reading or updating "dirty data."

Isolation: According to this property, each transaction should see a consistent database at all times. Consequently, no other transaction can read or update data that is being modified by an uncommitted transaction.

Durability: Durability is the ability of the DBMS to recover committed transaction updates after any kind of system failure (hardware or software). Durability is the DBMS's guarantee that once the user has been notified of a transaction's success, the transaction will not be lost, the transaction's data changes will survive system failure, and all integrity constraints have been satisfied, so the DBMS will not need to reverse the transaction. Many DBMSs implement durability by writing transactions into a transaction log that can be reprocessed to recreate the system state right before any later failure. A transaction is deemed committed only after it is entered in the log. Durability does not imply a permanent state of the database. A subsequent transaction may modify data changed by a prior transaction without violating the durability principle.

3. Transaction Processing in a Distributed System
A transaction is a logical unit of work constituted by one or more SQL statements executed by a single user. A transaction begins with the user's first executable SQL statement and ends when it is committed or rolled back by that user. A remote transaction contains only statements that access a single remote node. A distributed transaction contains statements that access more than one node: it includes one or more statements that, individually or as a group, update data on two or more distinct nodes of a distributed database.

4. Two-Phase Commit of Transactions in a Distributed Database System
In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC) is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort (roll back) the transaction (it is a specialized type of consensus protocol). The protocol achieves its goal even in many cases of temporary system failure (involving process, network node, or communication failures), and is thus widely used.[1-3] However, it is not resilient to all possible failure configurations, and in rare cases intervention by a user (e.g., a system administrator) is needed to remedy the outcome. To accommodate recovery from failure (automatic in most cases), the protocol's participants log the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Many protocol variants exist that differ primarily in their logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures comprise a substantial portion of the protocol because of the many possible failure scenarios that the protocol must consider and support. In a "normal execution" of any single distributed transaction, i.e., when no failure occurs, which is typically the most frequent situation, the protocol comprises two phases:
1. The commit-request phase (or voting phase), in which a coordinator process attempts to prepare all the transaction's participating processes (named participants, cohorts, or workers) to take the necessary steps for either committing or aborting the transaction and to vote either "Yes": commit (if the participant's local portion of the transaction has ended properly) or "No": abort (if a problem has been detected with the local portion), and
2. The commit phase, in which, based on the votes of the cohorts, the coordinator decides whether to commit (only if all have voted "Yes") or abort the transaction (otherwise), and notifies all the cohorts of the result. The cohorts then carry out the needed actions (commit or abort) with their local transactional resources (also called recoverable resources; e.g., database data) and their respective portions of the transaction's other output (if applicable).

Commit-request phase
1. The coordinator sends a query-to-commit message to all cohorts and waits until it has received a reply from all cohorts.
2. The cohorts execute the transaction up to the point where they will be asked to commit. They each write an entry to their undo log and an entry to their redo log.
3. Each cohort replies with an agreement message (the cohort votes Yes to commit) if the cohort's actions succeeded, or an abort message (the cohort votes No, not to commit) if the cohort experiences a failure that will make it impossible to commit.

Commit phase
Success
If the coordinator received an agreement message from all cohorts during the commit-request phase:
1. The coordinator sends a commit message to all the cohorts.
2. Each cohort completes the operation and releases all the locks and resources held during the transaction.
3. Each cohort sends an acknowledgment to the coordinator.
4. The coordinator completes the transaction when all acknowledgments have been received.

Failure
If any cohort votes No during the commit-request phase (or the coordinator's timeout expires):
1. The coordinator sends a rollback message to all the cohorts.
2. Each cohort undoes the transaction using the undo log and releases the resources and locks held during the transaction.
3. Each cohort sends an acknowledgment to the coordinator.
4. The coordinator undoes the transaction when all acknowledgments have been received.

5. Two-Phase Commit: DB2 Database Management System
The DB2 database is a distributed database management system that employs two-phase commit to achieve and maintain data reliability. The following sections explain DB2's two-phase commit implementation procedures.

How Sessions Are Maintained between Nodes
In each transaction, DB2 constructs a session tree for the participating nodes. The session tree describes the relations between the nodes participating in any given transaction. Each node plays one or more of the following roles:
1. Client: A client is a node that references data from another node.
2. Database Server: A server is a node that is referenced by another node because it has needed data. A database server is a server that supports a local database.
3. Global Coordinator: The global coordinator is the node that initiated the transaction and is thus the root of the session tree. The global coordinator performs the following operations:
• As the root of the session tree, it sends all the SQL statements, procedure calls, etc., to the referenced nodes.
• It instructs all the nodes, except the COMMIT point site, to PREPARE.
• If all sites PREPARE successfully, it instructs the COMMIT point site to initiate the commit phase.
• If one or more of the nodes send an abort message, it instructs all nodes to perform a rollback.
4. Local Coordinator: A local coordinator is a node that must reference data on another node in order to complete its part. The local coordinator carries out the following functions (DB2):
• Receiving and relaying status information among the local nodes
• Passing queries to those nodes
• Receiving queries from those nodes and passing them on to other nodes
• Returning the results of the queries to the nodes that initiated them
5. Commit Point Site: Before a COMMIT point site can be designated, the COMMIT point strength of each node must be determined. The COMMIT point strength of each node of the distributed database system is defined when the initial connection is made between the nodes. The COMMIT point site has to be a reliable node because it has to take care of all the messages. When the global coordinator initiates a transaction, it checks the direct references to see which one is going to act as the COMMIT point site. The COMMIT point site cannot be a read-only site. If multiple nodes have the same COMMIT point strength, the global coordinator selects one of them. In case of a rollback, the PREPARE and COMMIT phases are not needed, and thus a COMMIT point site is not selected. A transaction is considered to be committed once the COMMIT point site commits locally.
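The selection of a COMMIT point site described above could be sketched in Python roughly as follows. The `Node` type, its fields, and `select_commit_point_site` are hypothetical names introduced only for illustration; they are not DB2 identifiers. The sketch also assumes that the directly referenced node with the highest COMMIT point strength is chosen and that ties go to the first node listed, neither of which is spelled out in the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a node directly referenced by the global coordinator."""
    name: str
    commit_point_strength: int  # defined when the connection between nodes is made
    read_only: bool = False     # a read-only site cannot be the COMMIT point site


def select_commit_point_site(direct_references: Sequence[Node]) -> Optional[Node]:
    """Pick a COMMIT point site among the direct references, or None if none qualifies."""
    # Read-only sites are excluded, as stated in the text.
    candidates = [node for node in direct_references if not node.read_only]
    if not candidates:
        return None
    # Assumption: the highest strength wins. max() keeps the first of several
    # equally strong nodes, which stands in for "selects one of them".
    return max(candidates, key=lambda node: node.commit_point_strength)


if __name__ == "__main__":
    nodes = [
        Node("sales", 100),
        Node("hq", 200, read_only=True),
        Node("warehouse", 150),
        Node("backup", 150),
    ]
    site = select_commit_point_site(nodes)
    print(site.name if site else "no eligible site")
```

Keeping the eligibility filter separate from the strength comparison makes it easy to swap in a different tie-breaking rule if the actual system uses one.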
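As a rough illustration of the commit-request and commit phases of Section 4, the following Python sketch models a coordinator and its cohorts in memory. The names `Cohort`, `Vote`, `two_phase_commit`, and the `can_commit` flag are hypothetical and do not come from DB2 or any library. Networking, coordinator timeouts, locks, and persistent logs are deliberately left out, so this is a sketch under those assumptions and not DB2's implementation.

```python
from dataclasses import dataclass, field
from enum import Enum

_MISSING = object()  # marks keys that did not exist before the transaction


class Vote(Enum):
    YES = "yes"
    NO = "no"


@dataclass
class Cohort:
    """Hypothetical in-memory participant; a real cohort would be a remote node."""
    name: str
    data: dict = field(default_factory=dict)
    can_commit: bool = True  # set to False to simulate a local failure
    undo_log: dict = field(default_factory=dict)
    redo_log: dict = field(default_factory=dict)

    def prepare(self, updates: dict) -> Vote:
        # Commit-request phase: execute up to the commit point and write log entries.
        if not self.can_commit:
            return Vote.NO
        self.undo_log = {key: self.data.get(key, _MISSING) for key in updates}
        self.redo_log = dict(updates)
        self.data.update(updates)
        return Vote.YES

    def commit(self) -> bool:
        # Commit phase (success): make the changes final and drop the logs.
        self.undo_log.clear()
        self.redo_log.clear()
        return True  # acknowledgment to the coordinator

    def rollback(self) -> bool:
        # Commit phase (failure): restore old values from the undo log.
        for key, old in self.undo_log.items():
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self.undo_log.clear()
        self.redo_log.clear()
        return True  # acknowledgment to the coordinator


def two_phase_commit(cohorts: list, updates: dict) -> bool:
    """Return True if every cohort committed, False if the transaction was rolled back."""
    # Phase 1: collect a vote from every cohort.
    votes = [cohort.prepare(updates.get(cohort.name, {})) for cohort in cohorts]
    # Phase 2: commit only if all voted Yes; otherwise roll everyone back.
    if all(vote is Vote.YES for vote in votes):
        return all(cohort.commit() for cohort in cohorts)
    for cohort in cohorts:
        cohort.rollback()
    return False


if __name__ == "__main__":
    a = Cohort("node_a", data={"x": 1})
    b = Cohort("node_b", data={"y": 2})
    print(two_phase_commit([a, b], {"node_a": {"x": 10}, "node_b": {"y": 20}}), a.data, b.data)
    b.can_commit = False  # node_b will now vote No
    print(two_phase_commit([a, b], {"node_a": {"x": 99}, "node_b": {"y": 99}}), a.data, b.data)
```

In the second call, `node_a` has already applied its tentative update when `node_b` votes No, so the rollback path is what restores `node_a` to its earlier state; this mirrors the "all or nothing" rule of atomicity across nodes.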

6. CONCLUSIONS
Transaction processing and management is not a new concept in distributed database management systems (DDBMS). Handling transaction commit in a distributed database management system is complex, but once the commit point for each transaction is determined, the transaction becomes easier to handle in a distributed environment, although it was very difficult to obtain information on DB2's implementation. Many organizations do not implement distributed databases because of their complexity; they simply resort to centralized databases. However, with global organizations and multi-tier network architectures, distributed implementation becomes a necessity for organizations that must manage distributed databases. With DB2, we try to commit the transaction safely.

REFERENCES
1. D. Agrawal, A.J. Bernstein, P. Gupta, S. Sengupta, "Distributed Optimistic Concurrency Control with Reduced Rollback," Distributed Computing, vol. 2, no. 1, pp. 45-59, 1987.
2. R. Agrawal, M.J. Carey, L.W. McVoy, "The Performance of Alternative Strategies for Dealing with Deadlocks in Database Management Systems," IEEE Trans. Software Eng., vol. 13, no. 12, pp. 1348-1363, Dec. 1987.
3. T. Connolly, C. Begg, A. Strachan, Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 1997.
4. C. Mohan, B. Lindsay, R. Obermarck, "Transaction Management in the R* Distributed Database Management System," ACM Transactions on Database Systems, vol. 11, no. 4, Dec. 1986.
5. P.A. Bernstein, V. Hadzilacos, N. Goodman, Concurrency Control and Recovery in Database Systems, Addison-Wesley, 1987.

International Journal of Computer Science and Technology www.ijcst.com