
Testing Transaction Concurrency

Yuetang Deng, Phyllis Frankl, Zhongqiang Chen
Department of Computer and Information Science, Polytechnic University, Brooklyn, NY 11201, USA
{ytdeng, phyllis, zchen}@cis.poly.edu

Abstract

Database application programs are often designed to be executed concurrently by many clients. By grouping related database queries into transactions, DBMS systems can guarantee that each transaction satisfies the well-known ACID properties: Atomicity, Consistency, Isolation, and Durability. However, if a database application is decomposed into transactions in an incorrect manner, the application may fail when executed concurrently due to potential offline concurrency problems.

This paper presents a dataflow analysis technique for identifying schedules of transaction execution aimed at revealing concurrency faults of this nature, along with techniques for controlling the DBMS or the application to force execution to follow these transaction sequences. The techniques have been integrated into AGENDA, a tool set for testing database application programs. Preliminary empirical evaluation is presented.

1. Introduction

Database systems occupy a central position in our information-based society. It is essential that these systems function correctly and provide acceptable performance. Substantial effort has been devoted to insuring that the algorithms and data structures used by Database Management Systems (DBMSs) work efficiently and protect the integrity of the data. However, relatively little attention has been given to developing systematic techniques for assuring the correctness of the related database application programs. Given the critical role these systems play in modern society, there is clearly a need for new approaches to assessing the quality of database application programs.

To address this need, we have developed a systematic, partially-automatable approach to testing database applications, and a tool set based on this approach [1,5,11]. Initially, we focused on problems associated with individual queries. In [19] we explore problems dealing with transactions, comprising groups of related queries. This paper focuses on concurrency problems associated with multiple clients simultaneously running an application program.

Many database applications are designed to support concurrent users. For example, in an airline reservation system, many clients can simultaneously purchase tickets. To avoid simultaneous updates of the same item and other potential concurrency problems, database applications employ program units called transactions, consisting of a group of queries that access, and possibly modify, data items. The DBMS schedules query executions so as to insure consistency at transaction boundaries (while also trying to utilize resources efficiently).

Although the DBMS employs sophisticated mechanisms to assure that transactions satisfy the ACID properties (Atomicity, Consistency, Isolation, and Durability), an application can still have concurrency-related faults if the application programmer erroneously placed queries in separate transactions when they should have been executed as a unit. The focus of this paper is on detecting faults of this nature.

The rest of the paper is organized as follows. Section 2 introduces the background and a tool set called AGENDA for database application testing. Section 3 identifies some typical transaction concurrency problems. Section 4 presents our approach to testing for concurrency problems. Section 5 discusses problems with transaction atomicity/durability in database applications and our solution. Section 6 presents a preliminary empirical evaluation. Section 7 describes related work. Section 8 concludes with our contributions and discusses future work.

2. Background

2.1. Database Transaction

A transaction is a means by which an application programmer can package together a sequence of database operations so that the database system can provide a number of guarantees, known as the ACID properties of a transaction [4]. Atomicity ensures that the set of updates contained in a transaction must succeed or fail as a unit. Consistency means that a transaction will transform the database from one consistent state to another. The isolation property guarantees that partial effects of incomplete transactions are not visible to other transactions. Durability means that the effects of a committed transaction are permanent and must not be lost because of later failure. Transactions can therefore be used to solve problems such as creation of inconsistent results, concurrent execution errors, and lost results due to system crash [17].

A database application program typically consists of SQL statements embedded in a source language, such as C or Java. Transactions are delimited by the keywords BEGIN and COMMIT (or ROLLBACK). Once a transaction's BEGIN is executed, the DBMS assures that no concurrently executing application executes a statement that interferes with this transaction (e.g., by updating data that this transaction is using) until the transaction's COMMIT is executed or until the transaction is rolled back. The system can initiate a rollback when it enters a state in which ACID properties may be violated, either due to overly optimistic scheduling decisions or to external events such as system failure. When a rollback occurs, the system is restored to the state before execution of the transaction's BEGIN. Once a transaction commits, all of the changes it has made to the database state become permanent.

Although the ACID properties are related to each other, testing each of the properties may expose problems in different aspects of the database application. To test transactions in the application, all the ACID properties of transactions need to be addressed. In this paper, we focus on testing transaction concurrency (isolation) and atomicity/durability.

2.2. Agenda Tool Set

AGENDA takes as input the database schema of the database on which the application runs, the application source code, and "sample value files" containing some suggested values for attributes. The tester interactively selects test heuristics and provides information about expected behaviors of test cases. Using this information, AGENDA populates the database, generates inputs to the application, executes the application on those inputs, and checks some aspects of correctness of the resulting database state and the application output.

This approach is loosely based on the Category-Partition method [16]: the user supplies suggested values for attributes, partitioned into groups, which we call data groups. This data is provided in the sample value files. The tool then produces meaningful combinations of these values in order to fill database tables and provide input parameters to the application program. Data groups are used to distinguish values that are expected to result in different application behaviors. For example, in a payroll system, different categories of employees may be treated differently. Additional information about data groups, such as the probability of selecting a value from the list of values that follows, can also be provided via the sample value files.

Using these data groups and guided by heuristics selected by the tester, AGENDA produces a collection of test templates representing abstract test cases. The tester then provides information about the expected behavior of the application on tests represented by each test template. For example, the tester might specify that the application should increase the salaries of employees in the "faculty" group by 10% and should not change any other attributes. In order to control the explosion in the number of test templates and to force the generation of particular kinds of templates, the tester selects heuristics. Finally, AGENDA instantiates the templates with specific test values, executes the test cases, and checks that the output and new database state are consistent with the expected behavior indicated by the tester.
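The combination step above can be sketched in Python. This is a simplified sketch of ours, not AGENDA's actual algorithm; the sample values and attribute names are hypothetical, standing in for the contents of a sample value file.

```python
from itertools import product

# Simplified sketch (ours, not AGENDA's algorithm): each attribute's sample
# values are partitioned into named data groups, one abstract template is
# produced per combination of groups, and a template is instantiated by
# picking a concrete value from each chosen group.
sample_values = {                      # hypothetical sample value file
    "category": {"faculty": ["prof"], "staff": ["clerk"]},
    "salary":   {"low": [30000], "high": [90000]},
}

def templates(samples):
    """Yield one abstract test template per combination of data groups."""
    attrs = sorted(samples)
    group_choices = [sorted(samples[a]) for a in attrs]
    for combo in product(*group_choices):
        yield dict(zip(attrs, combo))

def instantiate(samples, template):
    """Turn a template into a concrete test case (first value per group)."""
    return {a: samples[a][g][0] for a, g in template.items()}

tmpls = list(templates(sample_values))
assert len(tmpls) == 4                 # 2 data groups x 2 data groups
case = instantiate(sample_values, tmpls[0])
assert case == {"category": "prof", "salary": 90000}
```

In practice the number of combinations explodes, which is why the tester-selected heuristics mentioned above are needed to prune the template set.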
In our previous work [1,11], we have discussed issues arising in testing database systems, presented an approach to testing database applications, and described AGENDA, a set of tools based on this approach. In testing such applications, the states of the database before and after the application's execution play an important role, along with the user's input and the system output. A framework for testing database applications was introduced, and a complete tool set based on the framework has been prototyped. The components of this system are: Agenda Parser, State Generator, Input Generator, State Validator, and Output Validator. The architecture of AGENDA is shown in Figure 1.

The first component (Agenda Parser) extracts relevant information from the application's database schema, the application queries, and the tester-supplied sample value files, and makes this information available to the other four components. It does this by creating an internal database called Agenda DB to store the extracted information. The Agenda DB is used and/or modified by the remaining four components. Agenda Parser is a modified version of the parser of the open-source DBMS PostgreSQL [9].

The second component (State Generator) uses the database schema, along with information from the tester's sample value files indicating useful values (optionally partitioned into different data groups) for attributes, and populates the database tables with data satisfying the integrity constraints. It retrieves the information about the application's tables, attributes, constraints, and sample data from the Agenda DB and generates an initial DB state for the application. We refer to the generated DB as the application DB. Heuristics are used to guide the generation of both the application DB state and the application inputs.

The third component (Input Generator) generates input data to be supplied to the application. The data are created by using information that is generated by the Agenda Parser and State Generator components, along with information derived from parsing the SQL statements in the application program and information that is useful for checking the test results.

The fourth component (State Validator) investigates how the state of the application DB changes during execution of a test. It automatically constructs a log table and triggers/rules for each table in the application schema. The State Validator uses these triggers/rules to capture the changes in the application DB automatically, and uses database integrity constraint techniques to check that the application DB state changed in the right way.

The fifth component (Output Validator) stores the tuples that satisfy the application's query and constraints in the log tables. It uses integrity constraint techniques to check that the preconditions and post-conditions hold for the output returned by the application.

Figure 1: Architecture of the AGENDA tool set.

In this paper, we consider application programs written in embedded SQL, i.e., a high-level language, such as C, with SQL statements interspersed. The SQL statements may involve host variables, which are variables declared in the host program; within the SQL statements, host variables are preceded by colons. Multiple clients may execute an application program concurrently. In this case, each client has its own host variables. The only "variable" shared by concurrently executing clients is the database itself.

3. Transaction concurrency problems

Concurrency is one of the trickiest aspects of software development. Even if an application runs correctly for all inputs in an isolated environment, it may behave incorrectly when running concurrently with other instances of the same application. Transaction mechanisms in a DBMS provide protection for the shared resource (i.e., the database). However, designers must use them carefully to assure correct behavior. A common scenario in a database application is that the application selects some data from the database, processes the data, and updates the database with new data. Consistency constraints can be violated if more than one instance tries to modify the same data element concurrently. To avoid interference from other instances, related operations should be grouped into a transaction. To improve efficiency, operations on different tables or on different tuples of the same table can proceed concurrently.

The entire application could be implemented as a single transaction to avoid concurrency problems, but this would deny any concurrency, since all instances would essentially be forced to execute sequentially. To achieve some degree of concurrency, an application is composed of a set of transactions. There is a close relation between the concurrency and the number of transactions in the application. Manipulating this relation incorrectly in order to over-optimize efficiency easily introduces concurrency faults into the system. This kind of concurrency problem, which occurs when data are manipulated across multiple database transactions, is termed an offline concurrency problem [14].

A partial and buggy implementation of an airline reservation system is shown in Figure 2. The attribute avail in table Flights represents the number of seats still available on a given flight. In line 5, the number of seats available on a given flight is stored in the host variable avail, which is decremented by 1 in line 6, indicating that one seat is booked for the flight. The application erroneously commits between lines 5 and 6, thus allowing another concurrent application instance to modify the database between the executions of these lines.
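The effect of committing between the read and the update can be simulated outside a real multi-client setting. The following is a sketch of ours (not part of the paper's tool set), using Python's sqlite3 with a single connection standing in for both clients, since only the interleaving of the committed reads and writes matters here.

```python
import sqlite3

# Sketch of the offline lost update: each client copies avail into a
# "host variable" in one committed transaction (T1) and writes the
# decremented copy back in a later one (T2).
db = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
db.execute("CREATE TABLE Flights (fltNum INT, avail INT)")
db.execute("INSERT INTO Flights VALUES (100, 10)")

def t1_read():
    # T1: EXEC SQL SELECT avail INTO :avail ...; COMMIT
    return db.execute("SELECT avail FROM Flights WHERE fltNum = 100").fetchone()[0]

def t2_write(avail):
    # T2: EXEC SQL UPDATE Flights SET avail = :avail - 1 ...; COMMIT
    db.execute("UPDATE Flights SET avail = ? WHERE fltNum = 100", (avail - 1,))

# Interleaving <T1_A, T1_B, T2_A, T2_B>: both clients read avail = 10,
a = t1_read()
b = t1_read()
t2_write(a)
t2_write(b)
# so two bookings decrement avail only once: 10 -> 9 instead of 10 -> 8.
assert db.execute("SELECT avail FROM Flights").fetchone()[0] == 9
```

Had the read and the update shared one transaction, a serializable DBMS would have forced the two clients' transactions to appear in some sequential order, yielding avail = 8.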
Example 1: Ticket booking system

CREATE TABLE Flights(fltNum INT, avail INT);

1) BEGIN DECLARE SECTION;
2)   int flight;
3)   int avail;
4) END DECLARE SECTION;
5) void chooseSeat( ) {
     EXEC SQL SELECT avail INTO :avail
       FROM Flights WHERE fltNum = :flight;
5')  COMMIT;                                  // Transaction T1, BUG!
6)   if (avail > 0) {
       EXEC SQL UPDATE Flights
         SET avail = :avail - 1 WHERE fltNum = :flight;
7)     COMMIT;                                // Transaction T2
     }
   }

Figure 2: A sample database application illustrating a potential offline concurrency error.

If two clients A and B run chooseSeat( ) simultaneously, the schedule <T1A, T1B, T2A, T2B> may occur. After the execution of this schedule, avail is decreased by 1 instead of 2.

Figure 3 shows a partial implementation of a warehouse application. The sum of the attribute ol_amount in the table order_line represents the total balance for one order. In line 1, this value is stored in the host variable ol_total. In line 2, c_balance is increased by ol_total, indicating that this order is delivered. The application erroneously commits between line 1 and line 2, thus allowing another concurrent application instance to modify the database between the executions of these lines.

Example 2: A warehouse application

CREATE TABLE order_line (o_id INT, ol_amount MONEY);
CREATE TABLE customer (c_id INT, c_balance MONEY);

Delivery( ) {
1)  SELECT SUM(ol_amount) INTO :ol_total
      FROM order_line WHERE c_id = :c_id;
2') COMMIT;                                   // T3, BUG
2)  UPDATE customer SET c_balance = c_balance + :ol_total
      WHERE c_id = :c_id;
3)  COMMIT;                                   // T4
}
New_order( ) {
  ...
  INSERT INTO order_line VALUES (:c_id, :amount);   // T5
}

Figure 3: Partial implementation of TPC-C warehouse application.

If two clients, A and B, run Delivery( ) with the same c_id simultaneously, then the schedule <T3, T5, T4> could cause the database to be inconsistent between the order_line and customer tables.

3.1. Transaction Isolation Levels and Phenomena

More generally, isolation can be defined in terms of phenomena that can occur during the execution of concurrent database transactions. The following phenomena are possible for two concurrent database transactions Ti and Tj [2, 3]:

a) P0 (Lost Update / Dirty Write): The update to a data element committed by Ti is ignored by transaction Tj, which writes the same data element based on the original value.

b) P1 (Dirty Read): Ti modifies a data element. Tj reads that data element before Ti commits. If Ti rolls back, Tj will use an uncommitted value which does not exist in the database.

c) P2 (Non-repeatable Read): Ti reads a data element before and after Tj updates and commits. The two read operations return different values for the same data element.

d) P3 (Phantom): Ti reads a set of tuples based on some specified conditions. Tj then executes and may insert one or more tuples which satisfy the same conditions. When Ti repeats the previous operation, it obtains a different set of tuples. The difference between P2 and P3 is that new tuples appear in P3, while in P2 different values may be read for the same tuple at different times.

DBMS systems allow transactions to run at four isolation levels, from least stringent (highest concurrency) to most stringent (lowest concurrency) [6]: READ UNCOMMITTED (level 0), READ COMMITTED (level 1), REPEATABLE READ (level 2), and SERIALIZABLE (level 3). P0 is impossible at any isolation level, while P1, P2, and P3 may happen at levels 0, 1, and 2. When a transaction is executed at the SERIALIZABLE level, the DBMS transaction manager assures that none of the phenomena can occur: it allows transaction concurrency only if the result of the concurrent execution is the same as that of a sequential execution of the transactions in some order. In this paper, we only consider concurrent transactions running at the SERIALIZABLE level.

3.2. Concurrency at SERIALIZABLE level

A buggy transaction design/implementation may group queries into transactions in an incorrect manner. In the worst case, it may put each query into a separate transaction. Such incorrect designs/implementations may introduce concurrency problems and cause the database to be in inconsistent states. Concurrency control mechanisms in DBMS systems cannot avoid these concurrency problems, since they only solve online concurrency. All the phenomena P0, P1, P2, and P3 may occur in the applications even if all transactions run at the SERIALIZABLE level. Of course, these phenomena occur between transactions instead of within one transaction, and they are classified as offline concurrency problems.

A concurrency error (or failure) in a database application is defined as one that occurs in concurrent execution of application instances but does not occur in sequential execution. For instance, dirty read, non-repeatable read, and phantom could result in concurrency errors. Examples 1 and 2 show that concurrency problems can occur even if transactions run at the SERIALIZABLE isolation level. All concurrency problems which appear at a higher isolation level could also appear at a lower isolation level.

4. Methodology for transaction concurrency test

Testing concurrent systems is notoriously difficult due to non-determinism and synchronization problems. Multiple executions of the same test may have different interleavings and generate different results. Thus, it is difficult even to diagnose and debug faults that have been detected. In addition to the problems that can occur in sequential programs, concurrent programs have their unique problems: data access control and synchronization. In a database application, there is usually no direct synchronization between different application instances. Data access control and synchronization between different application instances happen when they try to access the same data element. Furthermore, it is intuitive that concurrency problems may occur when two application instances simultaneously access and/or update the same database element (the same attribute of the same table), directly or through host variables that store local copies of the same database element. Based on the above observations, we propose the following procedure to test transaction concurrency:

1. Find interesting and legal schedules (or partial schedules). A schedule is interesting if it could potentially produce concurrency errors, and it is legal if it is a feasible path in the application execution. A schedule can be represented in the form S = <TmA, TiB, TkA>, where Tm, Ti, and Tk are transactions (not necessarily distinct), and A and B are two different application instances. Schedule S is interesting if TmA, TiB, and TkA access the same data element.

2. Generate test cases to execute the given transactions for each generated schedule. Transactions in a schedule usually access the same data element (table and attribute). To guarantee that all transactions in a schedule access the same tuple in the same database table, the test case generator should populate the inputs to these transactions properly.

3. Execute the test cases in such a way that the interleaving of the executions of the application instances conforms to the specified schedule. This can be achieved by two methods: modification of the concurrency control engine in the DBMS backend, and instrumentation of the application source code.

4.1. Transaction control flow graph

The simplest schedule which may induce a concurrency error is one that involves only three transactions and has the form <TmA, TiB, TkA>. A naïve approach would generate all possible triples of the form <Tm, Ti, Tk>. However, many of them are not interesting, since the transactions access totally different data elements. It is more efficient to generate only transaction sequences that access the same data element, directly or through host variables. Note that a sequence is legal only if application instance A can follow a path containing Tm followed by Tk. To achieve these goals, a flow-sensitive data flow analysis technique is used to construct schedules.

Data flow analyses (DFAs) are widely used in static program analysis [18]. In a DFA, a graph is constructed for the control and/or data flow in the program. Each node represents a statement or a block of statements. Control and data information are associated with each node and propagated along edges; the propagated information is usually iteratively re-computed by the DFA algorithm.

DFAs can be used to find interesting and legal schedules. First, we construct a transaction control flow graph (TCFG) based on the application source code. In the TCFG, each node represents a transaction or a group of host language statements. There is a directed edge from node m to node k if the execution of m can be followed immediately by the execution of k. All consecutive non-transaction nodes (those consisting of host language statements only) can be collapsed together to reduce the size of the TCFG, as long as the control structure of the application is not changed. The TCFG can be constructed by using source code analysis tools [19].

To figure out the set of data elements and the set of host variables a transaction accesses, we can use the Agenda tool set introduced in [1]. When the SQL queries in a transaction are parsed by Agenda Parser, information about all data elements to be accessed (reads or writes) is stored in the table xact_read_write, and information about host variables to be accessed (defines or uses) and their associated tables and attributes is stored in the table xact_parameter. For instance, table xact_read_write for Example 1 and table xact_parameter for Example 2 are given in Tables 2 and 3.

Table 2: Table xact_read_write for Example 1

XactId  Op     Tname    Aname
T1      READ   Flights  avail
T1      READ   Flights  fltNum
T2      READ   Flights  fltNum
T2      WRITE  Flights  avail

Table 3: Table xact_parameter for Example 2

XactId  Param      Op   Tname       Aname
T3      :ol_total  DEF  order_line  ol_amount
T3      :c_id      USE  order_line  c_id
T4      :ol_total  USE  customer    c_balance
T4      :c_id      USE  customer    c_id
T5      :c_id      USE  customer    c_id
T5      :amount    USE  customer    c_balance

As illustrated in Example 1, if two transactions Tm and Tk in the same instance access the same data element on some path of the TCFG, there is a potential concurrency error: a transaction Ti in another instance (Ti could be Tm or Tk!) accesses the same data element, and their operations are not compatible (i.e., at least one operation is WRITE).

More generally, we can characterize anomalous access patterns of this nature by considering the flow of data elements (DB table and attribute) between transactions. Suppose that transactions Ti, Tm, and Tk all access the same data element, and Tm appears before Tk on some execution of client A. Consider what happens when client B executes transaction Ti at an arbitrary point between the executions of Tm and Tk.

Similarly, as illustrated in Example 2, if two transactions Tm and Tk define/use the same host variable on some path of the TCFG, and a transaction Ti in another application instance writes the data element associated with this host variable, then Ti may change the data element, so the host variable is no longer consistent with the associated data element. Again, a concurrency error may occur. In Example 2, in Tk client A uses a host variable (defined in Tm) whose value is out of date, because client B updated the data element from which the host variable was defined before client A used the host variable. More generally, Tk could use a host variable that depends on the host variable defined in Tm through some chain of definitions and uses.

By considering all possible interleavings between a transaction Ti in one instance and transactions (Tm, Tk) from another instance, we list all the possible access patterns and the possible errors associated with each pattern in Table 4 and Table 5.

Table 4: Access patterns and their problems. Transactions (Tm, Tk) in one instance and Ti in another access the same data element. R stands for "READ"; W stands for "WRITE".

Tm          R  W  R  R       R   W   W   W
Ti          R  R  R  W       W   W   W   R
Tk          R  R  W  R       W   R   W   W
Error?      N  N  N  Y       Y   Y   Y   Y
Phenomena            P2, P3  P0  P0  P0  P1

Table 5: Access patterns and their problems. Transactions (Tm, Tk) belong to one instance; a host variable is defined/used in Tm and Tk. Ti in another instance accesses the data element associated with the host variable in Tm. U stands for "USE"; D stands for "DEFINE".

Tm          U  D  U  U             U             D             D       D
Ti          R  R  R  W             W             W             W       R
Tk          U  U  D  U             D             U             D       D
Error?      N  N  N  Y             Y             Y             Y       N
Phenomena            P0,P1,P2,P3   P0,P1,P2,P3   P0,P1,P2,P3   P2,P3

4.2. Generation of interesting and legal schedules

After we construct the TCFG and find the sets of data elements and host variables that each node accesses, we can generate interesting and legal schedules as follows.
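As a building block for this generation, Table 4 can be encoded as a lookup from the operation triple (Tm, Ti, Tk) to the phenomena it may reveal. The following is a sketch of ours, transcribing the table directly rather than implementing the paper's tool.

```python
# Table 4 as data: for a data element accessed by T_m and T_k (client A)
# and by T_i (client B), which interleavings <T_m^A, T_i^B, T_k^A> are
# interesting, and which phenomena can they exhibit?
TABLE4 = {
    ("R", "R", "R"): None,          # all readers: no error
    ("W", "R", "R"): None,
    ("R", "R", "W"): None,
    ("R", "W", "R"): ["P2", "P3"],  # interposed writer: non-repeatable read
    ("R", "W", "W"): ["P0"],        # lost update
    ("W", "W", "R"): ["P0"],
    ("W", "W", "W"): ["P0"],
    ("W", "R", "W"): ["P1"],        # reader between writes: dirty read
}

def interesting(tm_op, ti_op, tk_op):
    """True if the access pattern may reveal a concurrency error."""
    return TABLE4[(tm_op, ti_op, tk_op)] is not None

# Example 1: T1 reads Flights.avail and T2 writes it; an interposed
# writer T2_B gives the triple (R, W, W), a potential lost update.
assert interesting("R", "W", "W")
assert not interesting("R", "R", "R")
```

Note that, as the table shows, a pattern is error-free exactly when Ti only reads and Tm, Tk do not both write; the schedule generator only needs to emit triples for the remaining patterns.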

First, DFA algorithm is used to propagate data information Table 4: access patterns and their problems. in the TCFG. Recall that each node in the TCFG Transactions (Tm, Tk) in one instance and Ti in corresponds to a transaction, and the DFA information another access the same data element. R stands for with each node is a set of tuples consisting of data “READ”; W stands for “WRITE”. elements read/written and host variables defined/used by the node (transaction). There are two kinds of tuples with If exists tuple (Tj, READ/WRITE, Tname, each node in the TCFG. One is of the form (xact_id, Aname) in GEN_DB[Tj] operation, table_name, attr_name) which indicates that Add (Ti, Tj, Tname, Aname) to DBPair transaction xact_id applies the operation (READ or WRITE) to data element (table_name, attr_name). The GEN_HVPair( Tj ) other is of the form (xact_id, operation, p) which means For each tuple ( Ti, DEF/USE , p) in IN_HV[i] that transaction xact_id accesses parameter p via the If exist tuple ( Tj , DEF/USE , p’) in GEN_HV[i] operation (DEF or USE). AND the def of p in Ti reaches this use of p’ The DFA information for each node can be represented Add (Ti, Tj, p) to HVPair by two groups of sets: (IN_DB, OUT_DB, GEN_DB, KILL_DB) and (IN_HV, OUT_HV, GEN_HV, KILL_HV). Figure 5: GEN_DBPair and GEN_HVPair The former group represents data element flow information, while the latter represents host variable flow After DBPair and HVPair sets are generated, we use information. For each node (transaction) Tj, GEN_DB[Tj] two heuristics to generate interesting schedules: and GEN_HV[Tj] contain a tuple for each data element and For each DBPair/HVPair , find all (partial) j m each host variable accessed by T, respectively. For each schedules (or randomly find one (partial) schedule) such that the access pattern in node Tj, KILL_DB[Tjj] includes all tuples that re-write the matches one of patterns in Table 4 or Table 5 which may same data element. 
Similarly, KILL_HV[Tj] includes all reveal concurrency error. tuples that re-define the same host variable by node j. For instance, by applying the above algorithm to IN_DB[Tj], IN_HV[Tj], OUT_DB[Tj], and OUT_HV[Tj] can example 1, and are added to DBPair because both T1 and T2 access derived from the method for computing the Reaching the same data items and . Definitions [18], which finds all the tuples reaching node Tj Both transactions read data element by solving the following equations: only, so no schedule is generated. For data element , T1 reads it and T2 writes it. So 2 schedules j i 1 1 2 1 2 2 IN_DB[T] = È (OUT_DB[T]). are generated: and . Based // Union over all predecessors Ti of Tj on the same reasoning, < T3, T4, c_id> and < T3, T4, OUT_DB[Tj] = IN_DB[Tj] – KILL_DB[Tj] È ol_amount> are added to HVPair for example 2 and j 3 5 4 GEN_DB[T] schedule is generated. Notice that there are no pairs HVPair < T3, T5, c_id> and < T4, T5, c_id> IN_HV[Tj] = È (OUT_HV[Ti]) because T3 (or T4) and T5 do not appear in the same // Union over all predecessors Ti of Tj execution path. It is clear that, by using our proposed OUT_HV[Tj] = IN_HV[Tj] –KILL_HV[Tj] È flow-sensitive analysis techniques, the number of GEN_HV[Tj] generated schedules can be reduced dramatically.

After that, IN_DB, GEN_DB, IN_HV, and GEN_HV can be 4.3. Generation and execution of test cases for used to derive pairs of transactions (denoted as DBPair) schedule s accessing the same data element and pairs of transactions (denoted as HVPair) accessing the same data element To enforce the generated schedule (or partial through host variables in the same execution path. If data schedule), test cases must be generated in such a way that element (table_name, attr_name) is in a node’s IN_DB set transactions in the given schedule manipulate the same and GEN_DB set, then a tuple is added to the set DBPair indicating that documentation of the application design and transactions xact1 and xact2 access the same data implementation, testers can get some ideas about each element (table_name, attr_name). Similarly, if a host transaction. For instance, for each test template, testers variable defined in transaction xact1 reaches xact2 which can figure out which transactions will be executed. uses this host variable, then a tuple is added to HVPair. The procedures for appropriate test templates and then instantiate them. computing DBPair and HVPair sets are given in Figure 5. 1 2 In example 1, the test template for schedule involves the executions of T1 and T2 . For a test case j A GEN_DBPair( T ) for this test template, input to T1 , T2 , and T2 must be i A A B For each tuple (T, READ/WRITE, Tname, generated appropriately such that the executions of these Aname) in IN_DB[Tj] transactions indeed follow the given schedule. To make instrumentation tool finds the starting points of all sure that transactions operate on the same tuple in the transactions involved in the schedule, and adds a function database, some parameters (host variables) in the call to wait_for_my_turn(). This function will access the preconditions of these transactions must be instantiated scheduling information in the shared memory, and check if with the same values. 
it is the current transaction’s turn to execute. If not, the Generally speaking, for a test case for a database transaction yields its execution and keeps on waiting. application, the DBMS scheduler will execute the test Otherwise, the instance will execute until the end of the according to its own scheduling policy, and the schedule current transaction. The above procedure is repeated until chosen by the DBMS may not be the one in which we are all instances are finished. Similar techniques discussed in interested. To guarantee that the test is executed the previous method can be used to handle infeasible or according to the desired schedule, we suggest two non-executable schedules. methods, modification of the DBMS transaction manager In the first method, application instances are and instrumentation of the application. Both methods use independent of each other and they can run on different shared memory and semaphores for synchronization and machines. The second method is independent of the coordination of different application instances. Shared DBMS systems and has better portability and flexibility. In memory is used since it is the fastest form of IPC (Inter- section 6, we will compare their overhead and efficiency. process Communication) and shared data does not need to be copied between processes [12]. 5. Testing transaction atomicity/durability In a DBMS system, the transaction manager monitors transaction executions, guarantees transaction ACID The property of atomicity/durability is ensured by properties, and recovers from failure [3]. To modify the DBMS systems . The atomicity property requires that we transaction manager so that it will execute our tests execute a transaction to completion. If a transaction fails according to our policy, we consider two scenarios. 
First, the schedule is a complete one, i.e., the execution order of all transactions is specified. We modify the transaction manager so that it reads information about the desired schedule (e.g., from a configuration file) into its shared memory. A semaphore is used to control access by all DBMS back-ends. Based on the connection identifier (i.e., process identifier) and the transaction identifier, the transaction manager can determine whether the current application instance is the desired one. If not, the manager just schedules it in the normal way. Otherwise, the manager will not schedule the instance until it is the instance's turn. The second scenario is that only a partial schedule is generated, i.e., the execution order of only a subset of the transactions is specified. Different transactions may contain the same query. To distinguish transactions, we assign a static identifier (xid) to each transaction by adding the following statement to the head of each transaction:

    Select * from xact_table where id = xid

By xid, the transaction manager can identify whether or not the current transaction should be regulated and subjected to the specified schedule.
If the specified schedule is infeasible or non-executable for the given test case, the transaction manager will raise an exception and abort the test after waiting for some given time.

The second way to execute a schedule is to add control code to the application source. Similar to the first method, we read information about the desired schedule into shared memory from a configuration file. The shared memory can be accessed by all instances; however, the access is made mutually exclusive via semaphores. The instrumentation tool finds the starting points of all transactions involved in the schedule and adds a call to the function wait_for_my_turn(). This function accesses the scheduling information in the shared memory and checks whether it is the current transaction's turn to execute. If not, the transaction yields its execution and keeps waiting; otherwise, the instance executes until the end of the current transaction. The above procedure is repeated until all instances are finished. Techniques similar to those discussed for the previous method can be used to handle infeasible or non-executable schedules.

In the first method, application instances are independent of each other and can run on different machines. The second method is independent of the DBMS and has better portability and flexibility. In section 6, we compare their overhead and efficiency.

5. Testing transaction atomicity/durability

The atomicity and durability properties are ensured by the DBMS. The atomicity property requires that a transaction execute to completion; if a transaction fails to complete for some reason (e.g., a system crash), the DBMS must undo any effect the transaction imposed on the database. However, if a logical transaction is chopped into two or more transactions due to a buggy design or implementation, then the DBMS cannot recover if the application fails after the first transaction commits. Again, this is an offline concurrency problem which cannot be solved by the DBMS itself. Example 3 is such a buggy implementation of balance transfer in a bank application.

Example 3: balance transfer

CREATE TABLE checking (acctNum INT, balance INT);
CREATE TABLE saving (acctNum INT, balance INT);

Transfer ( ) {
    SELECT balance INTO :out_balance FROM saving
        WHERE acctNum = :in_acct1;
    if (:out_balance > :amount) {
        UPDATE saving SET balance = balance - :amount
            WHERE acctNum = :in_acct1;
        COMMIT;    //T6, bug
        UPDATE checking SET balance = balance + :amount
            WHERE acctNum = :in_acct2;
        COMMIT;    //T7
    }
}

Figure 6: Partial C program for bank transfer.

To test for atomicity/durability problems, we instrument the database application source code in the following way. For each transaction in the test, we find its end point (commit or rollback) and insert code which sends a signal to AGENDA's State Validator and then suspends on the pause() function. After verifying that the database is in a consistent state, the State Validator sends a signal back to the application to resume its execution. To verify database states, the State Validator needs to know the preconditions and post-conditions of the transactions. For instance, in example 3, the precondition and post-condition are that the sum of the two accounts involved does not change. This knowledge is then converted into a check constraint in the log tables.
6. Preliminary evaluation

We have implemented the proposed methods and measured some aspects of their performance on the TPC-C benchmark. All experiments were performed on a Sun ULTRA 10 workstation with a 440 MHz CPU and 384 MB of main memory. The TPC-C application is implemented in the C programming language with embedded SQL; the instrumentation is implemented in Perl. TPC Benchmark C (TPC-C) is the standard benchmark for online transaction processing (OLTP). It is a mixture of read-only and update-intensive transactions that simulate the activities found in a complex OLTP environment [7]. The TPC-C application models a wholesale supplier managing orders and stocks. Experiments based on TPC-C are used to test transaction consistency in our related work [19]. The five OLTP transactions are new-order, payment, order-status, delivery, and stock-level.

To evaluate the time overhead of the proposed methods, we chop each original transaction arbitrarily into 2 transactions. This introduces potential concurrency faults. We run the TPC-C application in three different ways and record the corresponding elapsed CPU time:

S0: The original five transactions are executed without any instrumentation and without scheduling enforcement.
S1: The desired schedule is enforced by the DBMS transaction manager.
S2: The transactions are instrumented in the application source code according to the desired schedule.

We run S0, S1, and S2 separately 5 times each and record their total elapsed times in the first three rows of Table 6. The unit of time is seconds. The last two rows of Table 6 show the overhead for situations S1 and S2, defined as OHi = (Si - S0)/S0 * 100% (i = 1 or 2).

Table 6: Overhead based on TPC-C transactions

        new_order   order-status   delivery
S0      7.980       1.049          5.465
S1      8.946       1.123          6.277
S2      16.186      3.954          14.428
OH1     12%         7%             15%
OH2     102%        277%           264%

As we see from Table 6, the overhead for S1 is much smaller than that for S2, indicating that modification of the DBMS transaction manager is more efficient than instrumentation of the application source code in terms of running time.

In the TPC-C application, there are 34 queries. To evaluate the number of schedules our proposed methods will generate, we designed the following experiment. Suppose that each single query is considered a transaction, and the data and control flows of the original program are kept unchanged. Then we can compute the total number of schedules consisting of 3 transactions for the following 4 situations; the results are given in Table 7.

W1: any 3 transactions.
W2: 2 of the 3 transactions in the same application instance access the same data element or host variable.
W3: 2 of the 3 transactions in the same application instance are generated from DBPair/HVPair.
W4: 2 of the 3 transactions in the same application instance are generated from DBPair/HVPair, and only those schedules are considered which match the patterns in table 4 or table 5 that may reveal concurrency errors.

Table 7: Schedule analysis

Situation   W1      W2    W3    W4
Schedules   39304   916   127   11

As we can see from Table 7, by taking into account information about the data elements accessed, the number of schedules to consider can be reduced significantly (W2/W1 = 2.33%). Similarly, if we also consider data flow information, we can further reduce the number of schedules (W3/W2 = 13.43%). Finally, our proposed method also considers the access patterns; the number of interesting schedules is then the smallest (W4/W3 = 8.66%).
7. Related work

The issue of test suite adequacy for database-driven applications is addressed in [21], where a dataflow-based family of test adequacy metrics is proposed to assess the quality of test suites, with a database interaction control flow graph as a unified representation.

Model checking tools [15] systematically explore the state spaces of concurrent/reactive software systems. They have been shown to be effective in verifying many types of properties, including checking for assertion violations [15, 20]. However, model checking depends on there being a manageable number of program states in the model that represents the specified program, so it cannot be directly applied to testing a database application implementation of the kind considered in this paper.

The instrumentation technique is widely used in white-box testing. ConTest [8] is a tool for detecting synchronization faults in multithreaded Java programs. The program under test is instrumented with sleep(), yield(), and priority() primitives at points of shared memory accesses and synchronization events. At run time, based on random or coverage-based decisions, ConTest determines whether a seeded primitive is to be executed. A replay algorithm facilitates debugging by saving the orders of shared memory accesses and synchronization events. This kind of instrumentation technique is integrated into our tools for testing database applications.

Race conditions are the major concurrency errors that occur in concurrent systems. ExitBlock [10] is a practical testing algorithm that systematically and deterministically finds program errors resulting from unintended timing dependencies. ExitBlock executes a program, or a portion of a program, on a given input multiple times, enumerating meaningful schedules in order to cover all program behaviors. In database applications, however, a race condition occurs when two or more application instances try to access the same data element in the database; this can be controlled by the transaction manager of the DBMS. The major concern in database applications is instead how to group queries into transactions so that a balance between efficiency and robustness is achieved.

Flow analysis techniques are used in many fields such as compilation, code analysis, and reverse engineering. Reachability testing [13] systematically executes all possible orderings of operations on shared variables in a program consisting of multiple processes; it parallelizes the executions of different schedules to speed up testing. Reachability testing requires explicit declarations of which variables are shared. Flow analysis techniques can be used to generate flow-sensitive schedules in database applications.
8. Conclusions and future work

To test database applications, we have proposed a framework and have designed and implemented a tool set to partially automate the testing process. In this paper, we extend our previous work to test transaction concurrency and transaction atomicity/durability at the SERIALIZABLE isolation level. We have identified the potential offline concurrency problems in database applications. A data flow analysis technique is used to identify interesting and legal schedules which may reveal concurrency errors. Two approaches are suggested to execute a given schedule. The instrumentation method is also used to test atomicity/durability. A preliminary empirical evaluation based on the TPC-C benchmark is presented and demonstrates our approach's effectiveness and efficiency.

In database applications, we believe that unit testing involves testing transaction consistency, integration testing involves testing transaction concurrency, and system testing involves testing atomicity/durability. To expose potential problems associated with "dirty read", "non-repeatable read", and "phantom", we plan to extend our tools to control schedules at the query level. We also plan to refine our schedule generation algorithm to avoid deadlock.

9. Acknowledgments

Thanks to David Chays, Gleb Naumovich, Torsten Suel, and Alex Delis for helpful discussions.

10. References

[1] D. Chays, P. Frankl, et al., "A Framework for Testing Database Applications", ACM International Symposium on Software Testing and Analysis, Portland, Oregon, 2000.
[2] H. Berenson, et al., "A Critique of ANSI SQL Isolation Levels", SIGMOD '95, San Jose, CA, USA, 1995.
[3] J. Gray and A. Reuter, Transaction Processing: Concepts and Techniques, Morgan Kaufmann Publishers Inc., 1993.
[4] H. Garcia-Molina, J.D. Ullman, and J. Widom, Database Systems: The Complete Book, Prentice Hall, 2000.
[5] D. Chays and Y. Deng, "Demonstration of AGENDA Tool for Testing Database Applications", International Conference on Software Engineering, Portland, Oregon, USA, 2003.
[6] J. Melton and A.R. Simon, SQL:1999: Understanding Relational Language Components, Morgan Kaufmann Publishers Inc., 2002.
[7] TPC Benchmark C, Transaction Processing Performance Council, http://www.tpc.org, 2002.
[8] O. Edelstein, E. Farchi, et al., "Multithreaded Java program test generation", IBM Systems Journal, Vol. 41, 2002.
[9] PostgreSQL Global Development Team, PostgreSQL, http://www.postgresql.org, 2002.
[10] D. Bruening and J. Chapin, "Systematic testing of multithreaded programs", MIT/LCS Technical Memo LCS-TM-607, April 2000.
[11] D. Chays, Y. Deng, et al., "An AGENDA for Testing Relational Database Applications", Technical Report TR-CIS-2002-04, CIS Department, Polytechnic University.
[12] W.R. Stevens, Advanced Programming in the UNIX Environment, Addison-Wesley, 1993.
[13] G.H. Hwang, K.C. Tai, and T.L. Huang, "Reachability testing: an approach to testing concurrent software", International Journal of Software Engineering and Knowledge Engineering, Vol. 5, No. 4, 1995, pp. 493-510.
[14] M. Fowler, Patterns of Enterprise Application Architecture, Addison-Wesley, 2003.
[15] G. Holzmann, "The Model Checker SPIN", IEEE Transactions on Software Engineering, May 1997.
[16] T.J. Ostrand and M.J. Balcer, "The category-partition method for specifying and generating functional tests", Communications of the ACM, Vol. 31, No. 6, 1988.
[17] P. O'Neil, Database: Principles, Programming, and Performance, Morgan Kaufmann Publishers, 2000.
[18] A.V. Aho, R. Sethi, and J.D. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1988.
[19] Y. Deng, D. Chays, and P. Frankl, "Testing database transaction consistency", Technical Report, CIS Department, Polytechnic University, 2003.
[20] H. Hong, S. Cha, I. Sokolsky, and H. Ural, "Data Flow Testing as Model Checking", 25th International Conference on Software Engineering, Portland, Oregon, 3-10 May 2003.
[21] G. Kapfhammer and M. Soffa, "A Family of Test Adequacy Criteria for Database-Driven Applications", ESEC/FSE '03, Helsinki, Finland, September 2003.