Testing Database Transaction Concurrency
Yuetang Deng Phyllis Frankl Zhongqiang Chen
Department of Computer and Information Science
Technical Report
TR -CIS-2003-03 10/20/2003
Testing Database Transaction Concurrency1
Yuetang Deng Phyllis Frankl Zhongqiang Chen Department of Computer and Information Science Polytechnic University Brooklyn, NY 11201 USA {ytdeng, phyllis, zchen}@cis.poly.edu
Abstract with transactions, comprising groups of related queries. This paper focuses on concurrency problems associated Database application programs are often designed to with multiple simultaneously running instances of an be executed concurrently by many users. By grouping application program. related database queries into transactions, DBMS Many database applications are designed to support systems can guarantee that each transaction satisfies the concurrent users. For example, in an airline reservation well-known ACID properties: Atomicity, Consistency, system, many clients can simultaneously purchase tickets. Isolation, and Durability. However, if a database To avoid simultaneous updates of the same data item and application is decomposed into transactions in an other potential concurrency problems, database incorrect manner, the application may fail when executed applications employ program units called transactions, concurrently due to potential offline concurrency consisting of a group of queries that access, and possibly problems. update, data items. The DBMS schedules query This paper presents a dataflow analysis technique for executions so as to insure data integrity at transaction identifying schedules of transaction execution aimed at boundaries while trying to utilize resources efficiently. revealing concurrency faults of this nature, along with Although the DBMS employs sophisticated techniques for controlling the DBMS or the application mechanisms to assure that transactions satisfy the ACID so that execution of transaction sequences follows properties – Atomicity, Consistency, Isolation, and generated schedules. The techniques have been integrated Durability – an application can still have concurrency- into AGENDA, a tool set for testing relational database related faults if the application programmer erroneously application programs. Preliminary empirical evaluation placed queries in separate transactions when they should is presented. have been executed as a unit. The focus of this paper is on detecting faults of this nature. 1. Introduction The rest of the paper is organized as follows. Section 2 introduces the background, including the AGENDA tool Database and transaction processing systems occupy a set for database application testing. Section 3 identifies central position in our information-based society. It is some typical transaction concurrency problems. Our essential that these systems function correctly and provide approach to test for concurrency problems is presented in acceptable performance. Substantial effort has been Section 4. Section 5 discusses the problems with devoted to insuring that the algorithms and data structures transaction atomicity/durability in database applications used by Database Management Systems (DBMSs) work and our solution. Section 6 presents preliminary empirical efficiently and protect the integrity of the data. However, evaluation. Section 7 describes the related work. Section relatively little attention has been given to developing 8 concludes with our contributions and discusses future systematic techniques for assuring the correctness of the work. related database application programs. Given the critical role these systems play in modern society, there is clearly 2. Background a need for new approaches to assess the quality of the database application programs. 2.1. Database Transaction To address this need, we have developed a systematic, partially-automatable approach and a tool set based on A transaction is a means by which an application this approach to testing database applications [4,5,6] programmer can package together a sequence of database Initially, we focused on problems associated with operations so that the database system can provide a individual queries. In [8] we explore problems dealing number of guarantees, known as the ACID properties of a
1 Supported in part by NSF grant CCR-9988354 transaction [11]. Atomicity ensures that the set of updates AGENDA takes as input the database schema of the contained in a transaction must succeed or fail as a unit. database on which the application runs, the application Consistency means that a transaction will transform the source code, and “sample value files”, containing some database from one consistent state to another. The suggested values for attributes. The tester interactively isolation property guarantees that partial effects of selects test heuristics and provides information about incomplete transactions should not be visible to other expected behaviors of test cases. Using this information, transactions. Durability means that effects of a committed AGENDA populates the database, generates inputs to the transaction are permanent and must not be lost because of application, executes the application on those inputs and later failure. Therefore, transactions can be used to solve checks some aspects of correctness of the resulting problems such as creation of inconsistent results, database state and the application output. concurrent execution errors within transactions and lost This approach is loosely based on the Category- results due to system crash [16]. Partition method [17]: the user supplies suggested values A database application program typically consists of for attributes, optionally partitioned into groups, which SQL statements embedded in a source language, such as we call data groups. The data are provided in the sample C or Java. Ends of transactions are delimited by the value files. The tool then produces meaningful keyword COMMIT (or ROLLBACK). Once a transaction combinations of these values in order to fill database begins execution, the DBMS assures that no other tables and provide input parameters to the application transaction executes a statement that interferes with this program. Data groups are used to distinguish values that transaction (e.g. by updating data being accessed by the are expected to result in different application behaviors. transaction) until the transaction commits or rolls back. For example, in a payroll system, different categories of The system can initiate a rollback when it enters a state in employees may be treated differently. Additional which ACID properties may be violated, either due to information about data groups, such as probability for overly optimistic scheduling decisions or to external selecting a group, can also be provided via sample value events such as system failure. When a rollback occurs, the files. system is restored to the state before execution of the Using these data groups and guided by heuristics, transaction. Once a transaction commits, all of the AGENDA produces a collection of test templates changes it has made to the database state become representing abstract test cases. The tester then provides permanent. information about the expected behavior of the In this paper, we consider application programs written application on tests represented by each test template. For in a high-level language such as C, with SQL statements example, the tester might specify that the application interspersed. The SQL statements may involve host should increase the salaries of employees in the “faculty” variables, which are variables declared in the host group by 10% and should not change any other attributes. program (host variables are preceded by colons). Multiple In order to control the explosion in the number of test clients may execute an application program concurrently. templates and to force the generation of particular kinds In this case, each client has its own host variables. The of templates, the tester selects heuristics. Finally, only “variable” shared by concurrently executing clients AGENDA instantiates the templates with specific test is the database. In this paper we do not consider stored values, executes the test cases and checks that the output procedures or shared variables arising through multi- and new database state are consistent with the expected threading within an application instance. behavior indicated by the tester. The architecture of AGENDA is shown in Figure 1. 2.2. AGENDA Tool Set The first component (AGENDA Parser) extracts relevant information from the application’s database In our previous work [4,5], we have discussed issues schema, the application queries, and tester-supplied arising in testing database systems, presented an approach sample value files, and then makes this information for testing database applications, and described available to the other four components by creating an AGENDA, a set of tools based on this approach. In internal database called AGENDA DB to store the testing such applications, the states of the database before extracted information. The AGENDA DB is used and/or and after the application’s execution play an important modified by the remaining four components. AGENDA role, along with the user’s input and the system output. A Parser is a modified version of from the open-source framework for testing database applications was DBMS Postgresql parser [18]. introduced. A complete tool set based on the framework The second component (State Generator) uses the has been prototyped. The components of this system are: database schema along with information from the tester’s AGENDA Parser, State Generator, Input Generator, State sample value files indicating useful values (optionally Validator, and Output Validator. partitioned into different groups of data) for attributes, and populates the database tables with data satisfying the integrity constraints. It retrieves the information about the execute concurrently, which could not occur if the application’s tables, attributes, constraints, and sample instances executed serially. Transaction mechanisms in a data from the AGENDA DB and generates an initial DB DBMS provide protection for the shared resource (i.e., state for the application. We refer to the generated DB as the database). However, designers must use them the application DB. Heuristics are used to guide the carefully to assure correct behavior. generation of both the application DB state and The transaction manager of a DBMS schedules the application inputs. execution of queries from concurrent instances. It assures that no concurrency failures occur, provided that the application programmer uses transactions appropriately. Without intervention by the transaction manager, the following phenomena are possible for two concurrent database transactions Ti and Tj (executed concurrently by two instances)[2, 12].
P0 (Lost update/Dirty Write): The update to a data element committed by Ti is ignored by transaction Tj, which writes the same data element based on the original value. P1 (Dirty Read): Ti modifies a data element. Tj reads that data element before Ti commits. If Ti rolls back, Tj will use an uncommitted value which does not exist in the database. P2 (Non-repeatable Read): Ti reads a database element Figure 1: Architecture of the AGENDA tool set twice, once before transaction Tj updates it and once after transaction Tj has updated it and commits. These two The third component (Input Generator) generates input read operations return different values for the same data data to be supplied to the application. The input data are element. created by using information generated by the AGENDA P3 (Phantom): Ti reads a set of tuples based on some Parser and State Generator components, along with specified conditions. Tj then executes and may insert one information derived from parsing the SQL statements in or more tuples that satisfy the same conditions as Ti. the application program and information that is useful for When Ti repeats the reading, it will obtain a different set checking the test results. of tuples. The difference between P2 and P3 is that new The fourth component (State Validator) investigates tuples appear in P3 while the same tuple with different how the state of the application DB changes during values may be read in P2. execution of a test. It automatically constructs a log table and triggers/rules for each table in the application DBMS systems allow transactions to run at four schema. The State Validator uses these triggers/rules to isolation levels from least stringent (highest concurrency) capture the changes in the application DB automatically, to most stringent (lowest concurrency) [15]: READ and uses database integrity constraint techniques to check UNCOMMITTED (level 0), READ COMMITTED (level whether the application DB state has changed in the right 1), REPEATABLE READ (level 2), SERIALIZABLE way. (level 3). P0 is impossible at any isolation level while P1, The fifth component (Output Validator) stores the P2, and P3 may happen at levels 0, 1 and 2. When tuples satisfying the application query and constraints in transactions are executed at the SERIALIZABLE level by the log tables. It uses integrity constraint techniques to default, the DBMS transaction manager assures that none check that the preconditions and post-conditions hold for of the four phenomena can occur. It schedules execution the output returned by the application. of the queries in the concurrent transactions in such a way that the result of the concurrent executions is the same as 3. Transaction concurrency problems that of a sequential execution of transactions in some certain order. Concurrency is one of the trickiest aspects of software However, phenomena analogous to P0, P1, P2, and P3 development. Even if an application runs correctly for all may occur in an application even if all transactions run at inputs in an isolated environment, it may have incorrect the SERIALIZABLE level. In this case, these phenomena behavior when running concurrently with other instances occur between transactions instead of within a single of the same application. We define concurrency failure to transaction and they are classified as offline concurrency be a failure that occurs when two or more instances problems [10], which occur when data are manipulated across multiple database transactions. A buggy Figure 2: A sample database application system transaction design/implementation may group queries into illustrating a potential offline concurrency error. transactions in an incorrect manner. In the worst case, it If two clients A and B run chooseSeat() may put each query into a separate transaction. These simultaneously, the following schedule could occur: incorrect designs and/or implementations of database client A runs T1, client B runs T1, client A runs T2, client applications may cause concurrency failures. B runs T2. After execution of this schedule, avail is Concurrency control mechanisms in DBMS systems decreased by 1 instead of 2. It implies phenomenon P0. cannot avoid these problems. Figure 3 shows a partial implementation of a A common scenario in a database application is that warehouse application. The sum of attribute ol_amount in the application selects some data from a database, the table customer represents the total balance for one processes the data, and updates the database with new order. In line 1, this value is stored in the host variable data. Consistency constraints can be violated if more than ol_total. In line 2, c_balance is increased by ol_total one instance tries to modify the same data element indicating that this order is delivered. The application concurrently. To avoid interference from other instances, erroneously commits between line 1 and line 2, thus related operations should be grouped into a transaction. allowing another concurrent application instance to The entire application can be implemented as a modify the database between the executions of these transaction to avoid concurrency problems, but this would lines. prevent any concurrency since all instances would essentially be forced to execute sequentially. To achieve Example 2: A warehouse application some degree of concurrency, an application is composed CREATE TABLE order_line of a set of transactions. There is a close relation between (o_id INT, ol_amount MONEY ); the concurrency and the number of transactions in the CREATE TABLE customer application. The incorrect manipulation of this relation to (c_id INT, c_balance MONEY); over-optimize efficiency easily introduces concurrency Delivery ( ) { faults into the system. The examples below illustrate 1) SELECT SUM(ol_amount) INTO :ol_total offline concurrency problems. FROM order_line WHERE c_id = :c_id; 3 A partial and buggy implementation of an airline 2’) COMMIT; // Transaction T 3 4 reservation system is shown in Figure 2. The attribute // Erroneously ends T and begins T avail in table Flights represents the number of seats still 2) UPDATE customer SET c_balance = c_balance + :ol_total where c_id = :c_id; available on a given flight. In line 5, the number of seats 4 available on a given flight is stored in the host variable 3) COMMIT; // T avail and is decremented by 1 in line 6 indicating that } one seat was booked for the flight. The application New_order ( ) { …. INSERT INTO order_line VALUES erroneously commits between lines 5 and 6, thus allowing 5 another concurrent application instance to modify the (:c_id, :amount); // T database between executions of these lines. }
Example 1: Ticket booking system Figure 3: Partial implementation of TPC-C CREATE TABLE Flights(fltNum INT, avail INT); warehouse application. 1) BEGIN DECLARE SECTION; 2) int flight; If two clients, A and B, run Delivery( ) with the same 3) int avail; c_id simultaneously, then the following schedule could 4) END DECLARE SECTION; cause the database to be inconsistent between the 3 5) void chooseSeat( ) { order_line and customer tables: client A runs T , client B 5 4 EXEC SQL SELECT avail INTO :avail runs T , client A runs T . This schedule implies FROM Flights WHERE fltNum = :flight phenomenon P0. We describe technique for analyzing 5’) COMMIT; //Transaction T1 and testing application programs for potential offline // Erroneously ends T1 and begins T2 concurrency faults in the next section. In this paper, we 6) if (avail > 0) { only consider concurrent transactions running at the EXEC SQL UPDATE Flights SET SERIALIZABLE, the most stringent level. All avail =:avail -1 WHERE fltNum = :flight ; concurrency problems at a higher isolation level could 7) COMMIT; // Transaction T2 also appear at a lower isolation levels. Additional faults } could occur if the application programmer erroneously assigns a less stringent level than needed.
4. Testing for concurrency failures application instance A can follow a path containing Ti k followed by T . Testing concurrent systems is notoriously difficult due To define legal schedules, we introduce a transaction to non-determinism and synchronization problems. control flow graph (TCFG) based on the application Multiple executions of the same test may have different source code. Each node represents a host language interleavings and generate different results, making it statement or an embedded SQL statement. There is a particularly difficult to diagnose and debug faults that directed edge from node i to node k if the execution of i have been detected. In addition to the problems that can can be followed immediately by the execution of k. All occur in sequential programs, concurrent programs have the consecutive non-transaction nodes (those nodes their unique problems: data access control and consisting of host language statements only) can be synchronization. In a database application, there is collapsed together to reduce the size of the TCFG as long usually no direct synchronization between different as the control structure of the application is not changed. application instances. Concurrency problems may occur The TCFG is constructed by using source code analysis when two application instances simultaneously access tools [8]. The flow information inside a transaction is and/or update the same database element (the same only helpful in testing transaction consistency. In this attribute of the same table) directly or indirectly through paper, the TCFG is further simplified by collapsing all host variables storing local copies of the same database SQL statements of one transaction into one single element. Based on the above observations, we propose transaction node. the following procedure to test transaction concurrency. Each transaction node in the TCFG is associated with 1. Find schedules that could potentially produce the set of database elements and the set of host variables concurrency failures. A schedule can be represented by the transaction accesses. This information can be obtained i j k i j k the form S =
4.1. Failure-prone transaction schedules Table 3: table xact_parameter for example 2, U stands for “USE”; D stands for “DEFINE”. The simplest schedule which may induce XactId Param Op Tname Aname T3 ol_total D order_line ol_amount concurrency failure is one that involves only three 3 transactions and has the form of
Table 4: access patterns and their problems. For each node (transaction) Tk, GEN[Tk] contain a Transactions (Ti, Tk) in one instance and Tj in tuple for each data element elm accessed by Tk. For each another access the same database element. data element elm that is updated in node Tk , KILL[Tk] i T R W R R R W W W includes all tuples involving a data element, elm, that is Tj R R R W W W W R defined or written in Tk. IN[Tk] and OUT[Tk] can be Tk R R W R W R W W calculated iteratively by using a work-list algorithm Conflict N N N Y Y Y Y Y derived from the method for computing the Reaching Pheno- P2 P0 P0 P0 P1 Definitions [1], which finds all the tuples reaching node mena P3 Tk by solving the following equations:
k i Table 5: Access patterns and their problems. IN[T ] = ∪Ti ∈ predecessors(Tk) ( OUT [T ] ). Transaction (Ti, Tk) belongs to an instance, a host OUT[Tk] =( IN [Tk] – KILL [Tk] ) ∪ GEN [Tk] variable is defined/used in Ti and Tk. Tj in another instance accesses the database element associated After the algorithm converges, IN[Tk] and GEN[Tk] with the host variable in Ti. can be used to derive pairs of transactions accessing the Ti U D U U U D D D same data element in the same execution path. If a tuple Tj R R R W W W W R
4.2. Generation of interesting and legal schedules k Gen_XactPair( T ) { For each tuple
Many static and dynamic analysis techniques have 8. Conclusions and future work been proposed for the difficult problem of detecting concurrency related faults. These techniques must be To test database applications, we have proposed a framework and have designed and implemented a tool set to partially automate the testing process. In this paper, we [5] D. Chays, Y. Deng, et al. “An AGENDA for Testing extend our previous work to test transaction concurrency Relational Database Application”, Journal of Software at the isolation level of SERIALIZABLE and transaction Testing, Verification and Reliability, to appear. atomicity/durability. We have identified the potential [6] D. Chays, Y. Deng, “Demonstration of AGENDA offline concurrency problems in database applications. A Tool for Testing Database Applications”, International data flow analysis technique is used to identify interesting Conference on Software Engineering, Portland, Oregon, and legal schedules, which may reveal concurrency USA, 2003. errors. Two approaches are suggested to execute a given [7] J. Corbett, M. Dwyer, et al. “Bandera: Extracting schedule. The instrumentation method is also used to test finite-state models from Java Source Code”, International atomicity/durability. Preliminary empirical evaluation Conference on Software Engineering, Limerick, Ireland, based on the TPC-C benchmark is presented and 2000. demonstrates our approach’s effectiveness and efficiency. [8] Y. Deng, D. Chays and P. Frankl, “Testing database Testing database applications involves testing that transaction consistency”, Technical Report, CIS transactions lead to consistent database states and Department, Polytechnic University, 2003. transform the database space in manner consistent with [9] O. Edelstein, E. Farchi, et al. “Multithreaded Java the application’s specification and that no failures occur program test generation”, IBM Systems Journal, Vol. 41, in concurrent executions that would not occur in serial 2002. executions. This paper focuses on the last of those issues. [10] M. Fowler, Patterns of Enterprise Application Techniques for checking consistency are discussed in a Architecture, Addison-Wesley & Benjamin Cummings, related paper [8]. Future work includes techniques to 2003. expose potential problems associated with “dirty read”, [11] H. Garcia-monlina, J. Ullman, J. Widom, Database “non-repeatable read”, and “phantom”, for transactions Systems: the Complete Book, Prentice Hall, 2000. running at isolation levels lower than SERIALIZABLE. [12] J. Gray, A. Reuter, Transaction Processing: Concept We also plan more extensive empirical evaluation. and Techniques, Morgan Kaufmann Publishers Inc., 1993. Acknowledgments [13] G. Holzmann, “The Spin Model Checker”, IEEE Transaction on Software Engineering, May 1997. The authors are grateful for the comments and [14] G. Hwang, K. Tai, and T. Huang, “Reachability suggestions from anonymous reviewers. We would thank testing: an approach to testing concurrent software”, David Chays, Gleb Naumovich, Torsten Suel, and Alex Inernational. Journal on Software Engineering and Delis for helpful discussions. Knowledge Engineering, Vol. 5, No. 4, 1995, 493-510. [15] J. Melton, A. Simon, SQL: 1999 Understanding References Relational language components, Morgan Kaufmann Publishers Inc., 2002. [16] P. O’Neil, Database: Principles, Programming, and [1] A. Aho, R. Sethi, and J. Ullman, Compipler: Performance, Morgan Kaufmann Publishers, 2000. Principles, Techniques, and Tools, Addison-Wesley, [17] T. Ostrand and M. Balcer. “The category-partition 1988. method for specifying and generating functional tests”. [2] H. Berenson, et al. “A critique of ANSI SQL Communication of ACM, Vol. 31 no. 6, 1988. Isolation Levels” ACM Special Interest Group on [18] PostgreSQL Global Development Team, Management of Data Conference, San Jose, CA USA PostgreSQL, http://www.postgresql.org, 2002. 1995. [19] R. Stevens, Advanced programming in the UNIX [3] D. Bruening, J. Chapin, “Systematic testing of environment, Addison-Wesley, 1993. multithreaded programs”, MIT/LCS Technical Memo, [20] TPC-Benchmark C. Transaction Processing Council. LCS-TM 607, April 2000. http://www.tpc.org, 2002. [4] D. Chays, P. Frankl, et al. “A Framework for Testing
Database Application” ACM International Symposium on Software Testing and Analysis, Portland, Oregon, 2000.