
Test-Driven Development of Relational Databases

Scott W. Ambler, IBM

Implementing test-driven database design involves database refactoring, regression testing, and continuous integration. Although technically straightforward, TDDD involves cultural challenges that slow its adoption.

In test-first development (TFD), developers formulate and implement a detailed design iteratively, one test at a time.1,2 Test-driven development (also called test-driven design) combines TFD with refactoring,3 wherein developers make small changes (refactorings) to improve code design without changing the code's semantics. When developers decide to use TDD to implement a new feature, they must first ask whether the current design is the easiest possible design to enable the feature's addition. If so, they implement the feature via TFD. If not, they first refactor the code, then use TFD. In the behavior-driven development4 approach to TDD, the naming conventions and terminology make the developer's primary goal explicit: to specify, rather than validate, the system.

Test-driven database development (TDDD) applies TDD to database development. In TDDD, developers implement critical behaviors (including business rules about data invariants, such as "The only valid colors are red, blue, and purple") and business functionality within relational databases. There's nothing special about relational databases: we specify database behaviors via database tests in the same way we implement application-code behaviors via developer tests. TDDD is not a stand-alone activity. It's best viewed as simply the database aspects of TDD. That is, you should develop your database schema in lock-step with your application code, because sometimes you'll write a developer test that specifies application code behavior, while other times it will specify database behavior.

TDDD is important for several reasons. First, all of application TDD's benefits apply to TDDD: you can take small, safe steps; refactoring lets you maintain high-quality design throughout the life cycle; regression testing lets you detect defects earlier in the life cycle; and, because TDDD gives you an executable system specification, you're motivated to keep it up to date (unlike with traditional design documentation). Moreover, with TDD, your database development efforts effectively dovetail into your overall application development efforts. The IT community has long suffered because of the cultural mismatch between application developers and data professionals.5 This mismatch results largely from differing philosophies and ways of working.

Modern methodologies, including the Rational Unified Process, Extreme Programming, and the Dynamic Systems Development Method, work in an evolutionary (iterative and incremental) manner. It therefore behooves us to find techniques that let data professionals work in an evolutionary manner with their application developer counterparts. TDDD is one such technique.

Extending TDD to database development
To extend TDD to database development, we need database equivalents of regression testing, refactoring, and continuous integration.

Database regression testing
In database regression testing,6,7 you regularly validate the database by running a comprehensive test suite, ideally whenever you change the database itself or access the database in a new way. As the threat boundaries8 (dashed lines) in figure 1 show, you must test both the database interface and the database itself.

[Figure 1. Testing a relational database. Developers must test the database both internally and at its interface, the equivalent of clear-box testing and black-box testing, respectively. Clear-box (internal) concerns include stored procedures and functions, triggers, views, constraints, existing data quality, and referential integrity and data consistency; black-box (interface) concerns include the data values being persisted and retrieved.]

Interface testing. We use database interface tests to specify how systems will access the database. From the database's viewpoint, these are effectively black-box tests. Such a test might specify a portion of a hard-coded SQL statement that you might submit to the database. It might also specify that

- an SQL statement returns an expected result,
- a view produces an expected result,
- SQL statement calculations return the expected results,
- a particular user ID can't access specific information, or
- a particular user ID can access specific information.

Interface tests are fairly straightforward to implement because good tools exist for many platforms, especially the xUnit framework. Commercial tools are also viable options. My advice: use the same tools that you're using for application regression testing.
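
To make this concrete, here is a rough sketch of the kind of statement such an interface test might submit and assert on. The Account table, its columns, and the expected values are assumptions made purely for illustration, not part of the article's example.

    -- Hypothetical black-box interface test: an xUnit-style test would submit this
    -- statement through the application's data-access path and assert on the result.
    -- For a known test customer (CustomerID = 74), the expectation might be exactly
    -- one row with NumberOfAccounts = 2 and TotalBalance = 5000.00.
    SELECT COUNT(*)     AS NumberOfAccounts,
           SUM(Balance) AS TotalBalance
    FROM   Account
    WHERE  CustomerID = 74;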

Internal testing. Internal database tests specify both behaviors that the database implements and data validity. There's nothing special about database code: if you can write tests for Java code, surely you can write tests for procedural language extensions to SQL (PL/SQL) code. According to current rhetoric, data is a corporate asset; shouldn't you therefore specify what kind of data you intend to store, along with the data's invariants? From the database's viewpoint, these are effectively clear-box tests. Such tests might specify

- some of a stored procedure's behavior,
- a validity check for a parameter being passed to a stored procedure,
- a validity check for the value that a stored procedure returns,
- a stored procedure to return an expected error code,
- a column invariant,
- a referential integrity rule, and
- a column default.

Tools for internal database testing are a bit harder to come by, but that's changing. For example, tools such as Quest's Qute validate Oracle PL/SQL code, and Microsoft's Visual Studio Team System for Database Professionals includes testing tools for SQL Server.
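
As a sketch of the last few items, a column invariant or a referential integrity rule can be expressed as a query that must return no rows when the rule holds. The Account and Customer tables and their columns are assumptions made for illustration only.

    -- Hypothetical clear-box tests expressed as queries that a test harness runs;
    -- each query's result set must be empty for the test to pass.

    -- Column invariant: account balances are never negative.
    SELECT AccountID, Balance
    FROM   Account
    WHERE  Balance < 0;

    -- Referential integrity rule: every account belongs to an existing customer.
    SELECT a.AccountID
    FROM   Account a
    LEFT JOIN Customer c ON c.CustomerID = a.CustomerID
    WHERE  c.CustomerID IS NULL;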

Testing example. The following example illustrates regression testing. Assume that we're building a banking system and have two features to implement: Retrieve Account from Database and Save Account to Database. Let's also assume that this is the first time we've run into the Account concept and that we're implementing these two features. For account retrieval, we'd run several database tests (one at a time), including shouldExistOneOrMoreAccountsForCustomer, shouldExistOneOrMoreCustomersForAccount, shouldHaveUniqueAccountID, and shouldHavePositiveAccountBalance. For saving an account into the database, we'd add two new tests: shouldBeAssignedUniqueAccountIDOnCreation and shouldBeAssignedZeroBalanceOnCreation. I'm using the BDD "should" naming conventions4 here, but the "traditional" TDD naming convention (starting tests with "test") would also work fine.

With a TDDD approach, we'd write each of these tests one at a time, then evolve the database schema to fulfill the test-specified functionality. Consider the test shouldHavePositiveAccountBalance. If this is the first time we've considered the account-balance concept, we'd add the Account.Balance column. Depending on our database design standards, implementing such a data-oriented business rule could motivate us to also add a column constraint or an update trigger that checks the value of Account.Balance. Regardless of the implementation strategy, it's critical to validate that your database properly supports the business rule.
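
To satisfy a test such as shouldHavePositiveAccountBalance, the corresponding schema change might look roughly like the following. This is a minimal sketch in generic SQL; the column type, default, and constraint name are assumptions for illustration, not the article's prescribed implementation.

    -- Evolve the schema to fulfill shouldHavePositiveAccountBalance (illustrative only).
    ALTER TABLE Account
        ADD Balance DECIMAL(12,2) DEFAULT 0 NOT NULL;

    -- Capture the business rule as a column constraint so the database itself
    -- rejects negative balances (an update trigger is an alternative strategy).
    ALTER TABLE Account
        ADD CONSTRAINT CheckPositiveAccountBalance CHECK (Balance >= 0);
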
There's no magic to identifying a database test: the advice presented in this special issue's other articles, as well as in TDD books, is pertinent to TDDD.1,2 The primary difference in TDDD is that we must recognize how important it is to validate both the functionality implemented by, and the data contained within, the database. A database is nothing special; like any other system component, it must be validated.

Database refactoring
Refactoring is a disciplined approach to restructuring code: you make small changes to improve the code's design without changing its behavioral semantics; that is, you neither remove nor add behavior.3,9 Refactoring lets you improve your code over time in a safe, evolutionary manner. Similarly, in database refactoring, you make a simple change to a database that improves its design while retaining both its behavioral and informational semantics. Databases include structural aspects (such as table and view definitions), functional aspects (such as stored procedures and triggers), and informational aspects (the database data). There are many examples of database refactoring, including dropping a view that's no longer used, moving a column to a more appropriate table, renaming a column, splitting a multiple-use column, and parameterizing a database method to consolidate several similar ones.

Conceptually, a database refactoring is more difficult than a code refactoring because you must preserve informational semantics along with behavioral semantics. Preserving informational semantics implies that if you change the values of a column's data, the change shouldn't materially affect that information's clients. For example, assume that a phone number is stored in character-based format. If you change the value from "(416) 555-1212" to "4165551212", you haven't materially changed the information's semantics, as long as all that column's clients can still work with the new format. In contrast, changing the value to "(416) 555-5555" clearly changes the informational semantics. Similarly, to preserve behavioral semantics, you must keep the same black-box functionality. The implication? You must rework any source code that works with changed aspects of your database schema to maintain the existing functionality.
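
As a sketch of a data-quality refactoring that preserves informational semantics, the phone number standardization described above could be implemented with DML alone. The Customer.PhoneNumber column is an assumption made for this example.

    -- Data-quality refactoring (illustrative): standardize phone numbers by
    -- stripping formatting characters, so "(416) 555-1212" becomes "4165551212".
    -- The information's meaning is unchanged as long as every client of the
    -- column can work with the new format.
    UPDATE Customer
    SET    PhoneNumber = REPLACE(REPLACE(REPLACE(REPLACE(PhoneNumber, '(', ''),
                                                 ')', ''),
                                         '-', ''),
                                 ' ', '');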

Process overview. Figure 2 offers an overview of the database refactoring process using a UML activity diagram. The recommended approach is to first implement the database refactoring in a separate developer's "sandbox," where you can perform the work in isolation before promoting it to shared environments, such as your team integration sandbox. Better yet, do this early work in pairs, where at least one person has agile database development skills and, more importantly, knowledge of the target database.5

[Figure 2. Database refactoring. Developers evolve the database schema in their own sandboxes before sharing the change with their teammates.]

The first step in refactoring is to verify that you need to refactor the schema and then, if so, to choose the right refactoring. For example, sometimes people suggest renaming a column but don't realize that the existing name conforms to corporate naming conventions. Or perhaps the real problem isn't the column's name, but rather that it's in the wrong table and therefore appears poorly named in that table's context. In other cases, refactoring is simply not cost effective.

A possible second step here is to deprecate the target schema's existing portion. This is an optional step; some production databases are accessed by dozens, if not hundreds, of systems running on different platforms, written in different languages, and released into production on different schedules. In this situation, you must support a transition window, which might last several years in some organizations, during which the schema's existing and refactored versions run in parallel. In simpler situations, you can update and release both the database schema and the accessing systems in parallel, so you don't need to support a transition window. The point is, you can do database refactoring in many situations, from small systems to highly complex ones.

In the next step, you write the code (or generate it using a database refactoring tool) to make the change to the database. For database refactorings that change the table structure, you'd create Data Definition Language (DDL) code to add the new schema and Data Manipulation Language (DML) code to manipulate the data. To keep the data synchronized, you need to add scaffolding code, typically a trigger, although in some instances an updatable view can work, too. If you're doing a data-quality refactoring, you might only need to write DML code to address the quality issue. Similarly, if you're doing database-method refactorings, you might only have to rework stored procedure code. In parallel with this, you must test your work, taking a test-first approach to design and running your regression test suite whenever you integrate your system (more on these topics later). As you find bugs, you'll need to fix them, iterating back and forth between testing and coding as required.

Once you've completed the refactoring, you should put your work under configuration management (CM) control and announce the refactoring both to your team and to the organization as a whole. Because databases are shared resources, database refactoring is often a challenge to organizations in that they must promote an effective means of communicating database changes among teams.

Structural refactoring example. To show how refactoring works, we'll work through a simple structural refactoring, moving a column from one table to another. Let's say our example environment has more than 100 systems accessing a mission-critical database that we can't afford to break. Figure 3a shows a portion of a bank database's original schema. The Customer table has a Balance column that belongs in the Account table. As a result of this design flaw, the company has been coding workarounds for years, making it difficult to support the business.

Figure 3b shows the transitional database schema, which we need because the situation is somewhat complex. We've selected a two-year transition window to ensure that the various development teams can refactor and test their code to work with the new schema. As the transitional image shows, we've added the new schema (the Account.Balance column) as well as scaffolding code, the SynchronizeCustomerBalance and SynchronizeAccountBalance triggers, to keep the two columns synchronized. We must assume that at any point during the transition, each of the systems accessing this database will access one but not the other column. In other words, the database must be responsible for keeping itself consistent.
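
The scaffolding might look roughly like the following sketch. Trigger syntax varies by database product, and the bodies are deliberately simplified (they handle only updates and omit recursion guards), so treat this as an illustration of the idea rather than the article's production code.

    -- Transitional scaffolding (illustrative): keep Customer.Balance and
    -- Account.Balance in step while both versions of the schema are live.

    CREATE TRIGGER SynchronizeAccountBalance
    AFTER UPDATE OF Balance ON Customer
    FOR EACH ROW
    BEGIN
        -- Push a change made through the old column into the new one.
        UPDATE Account
        SET    Balance = NEW.Balance
        WHERE  Account.CustomerID = NEW.CustomerID;
    END;

    CREATE TRIGGER SynchronizeCustomerBalance
    AFTER UPDATE OF Balance ON Account
    FOR EACH ROW
    BEGIN
        -- Push a change made through the new column back into the old one.
        UPDATE Customer
        SET    Balance = NEW.Balance
        WHERE  Customer.CustomerID = NEW.CustomerID;
    END;
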
As figure 3b shows, we've marked the scaffolding code and the original schema (Customer.Balance) with their intended drop date. We're basically following the same strategy that Sun Microsystems uses to evolve the Java Development Kit (JDK): add the new functionality and deprecate the old functionality to indicate that people should no longer use it. The difference here is that our drop date tells people how long they have to refactor their code.

[Figure 3. A simple structural refactoring. (a) The original database schema's Balance column is in the wrong table. (b) During the transition period, the database must support both the original and new schema versions; the synchronization triggers and the original Customer.Balance column are annotated with a drop date of June 14, 2009. (c) Once developers successfully update and deploy the systems that accessed the original column, we remove the original schema and transition scaffolding.]

Naturally, we're not going to automatically drop the old schema and scaffolding code on that date; we'll ensure that developers have successfully updated and deployed the accessing applications first. Figure 3c shows the resulting database schema when the process is complete.

Continuous database integration
In continuous database integration, team members regularly integrate changes to their own database instances, including structural, functional, and informational changes, as well as those related to automated system build and regression testing.10 Ideally, they'd do this at least daily. In any case, whenever someone submits a database change, everyone else working with that database should download the change from the CM system and apply it to their database instance as soon as possible. Better yet, such changes should be automated, in the same way tools such as Build Forge or CruiseControl automatically download application code changes and rebuild your system on your machine.

There are several best practices for continuous database integration:

1. Automate the build. Your database build process should detect whether it needs to apply any changes to the schema, then apply those changes and run the test suite. The database build process is part of your overall system build process. Currently, database build tools are a relatively new concept, but open source developers are working on ActiveRecordMigration in Ruby on Rails and dbdeploy.

2. Put everything under version control. Data models, data scripts, test data, and similar artifacts are all important work products, and you should manage them appropriately. My philosophy is simple: if it's worth creating, it's worth putting under version control.

3. Give developers their own database copies. Developers should first work on application code and database schema changes in their own sandboxes.11 Once they've fully tested changes and deemed them safe, they should apply the changes to the project integration environment. This reduces the chance that individual developers will break the build, because they first do their work within their own independent environments.

4. Automate database creation. There are several reasons to create a database copy: you might need to rebuild it from scratch for comprehensive testing, new people might join your team and require proper environment setup, or you might need to install the database on a workstation to demo the system for someone. Creating a new database instance is something that you're likely to do often, so you should strive to automate the task. Similarly, you should be able to drop the database easily.

5. Refactor each database individually. Individual refactoring lets you apply schema changes one at a time, which in turn lets you start from any database schema version and evolve it forward to any other.

6. Adopt a consistent identification strategy. Common identification strategies include using an incremental integer number (1, 2, 3, 4, ...), a date/time stamp, or a release number (v2.3.1.5). All three strategies are fine, but I prefer the integer strategy because it's simple. It's important that you be able to apply database changes in the proper order, particularly if you want to automate database creation and schema update.

7. Bundle database changes when needed. To implement a single feature, a developer might need to make several database schema changes. When deploying your system from your project integration environment to your preproduction testing environment, you'll likely need to bundle up several schema changes. For example, assume that the current database schema version is 1701, and you need to apply three new database refactorings (1702, 1703, and 1704). Once you've applied all three, in order, the database schema's new version would be 1704.

8. Ensure that the database knows its version number. Your database update scripts should help developers easily determine the current database version. For example, you should have an UpdateSchema script that takes as a parameter a database version, such as 1704 or 2007-06-14-23:45:00, and automatically runs the right change scripts to update to that version from the current one (see the sketch after this list).
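
One simple way to let the database report its own version is a version table to which every change script appends a row; an UpdateSchema script can then compare that table against the requested target version. The following sketch is purely illustrative, and the SchemaVersion table and the 1704 change script are assumptions made for this example.

    -- Illustrative version tracking: the update script reads this table to
    -- decide which numbered change scripts still need to run, in order.
    CREATE TABLE SchemaVersion (
        VersionNumber INTEGER   NOT NULL,
        AppliedOn     TIMESTAMP NOT NULL
    );

    -- Each numbered change script ends by recording that it has been applied.
    -- For example, hypothetical change script 1704:
    ALTER TABLE Account
        ADD LastReviewedOn DATE;      -- the actual refactoring (hypothetical)

    INSERT INTO SchemaVersion (VersionNumber, AppliedOn)
    VALUES (1704, CURRENT_TIMESTAMP);
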
Continuous system integration includes the system's automated build and regression testing, and continuous database integration is one part of that overall effort. In other words, it isn't enough to just integrate your application code; you also need to integrate your database code.

Adopting TDDD
TDDD is an integrated part of the overall development process, not a standalone activity that data professionals perform in parallel with application TDD. Although from a technical viewpoint TDDD is relatively straightforward, we must overcome several challenges to its wholesale adoption throughout the IT community.

Cultural divisions
The cultural divisions between the development and data communities are the most significant barrier to adoption.5 The development community is clearly moving toward evolutionary development, while the data community seems firmly rooted in serial development. Developers embraced TDD years ago, yet it remains a relatively new concept for data professionals. A notable exception here is in the data warehousing and business intelligence community, where thought leaders promote evolutionary work processes (although practitioners, to their peril, often fail to heed this good advice). Unfortunately, this same community seems to believe that, because they work in an evolutionary manner, they don't need to test.6 We saw this same attitude within the object development community in the early 1990s.

A dearth of tools
As I touched on earlier, our second challenge is the lack of tooling. We're starting to see interesting tools for database refactoring, testing, and continuous integration emerge in the marketplace. Because databases are shared resources, we also need tools that let us communicate schema changes among all interested teams. We also need improvements to database modeling tools; we should, for example, be able to indicate a deprecated database element's drop date (as in figure 3b). Good database modeling tools should add value to design visualization. Better yet, we might one day see modeling tools that can define database tests. Finally, we need to rework databases themselves. For example, shouldn't the database throw a warning whenever someone accesses a deprecated database element, just as Sun's JDK does with deprecated Java code?

Lack of familiarity
In my experience, agile software developers seem to take to the TDDD idea rather quickly. Presumably, this is because they've already adopted TDD, so extending it to include database development makes sense to them. Traditional developers, and traditional data professionals in particular, are often intrigued by the TDDD idea, but they struggle because it's so different from what they're used to. My hope is that we can overcome this challenge through training, education, mentoring, and a bit of patience.

Clearly, we can extend TDD into the database development realm, where implementing a more evolutionary methodology would better align the database with other modern methods and applications. All of the techniques required for TDDD adoption already exist and, more importantly, we're seeing nascent tool support for them. I expect that within the next two to three years, we'll see a surge in database development tools to support evolutionary, if not agile, database development techniques.

About the author
Scott W. Ambler is the practice leader for agile development with IBM Rational and a senior contributing editor with Dr. Dobb's Journal. His research interests include evolutionary database development and effective data management practices. He originated the Agile Modeling method, the Agile Data method, and the Enterprise Unified Process and is an active contributor to the Open Unified Process. He received his MIS in computer-supported cooperative work from the University of Toronto. He is coauthor of several books, including Refactoring Databases (Addison-Wesley, 2006). Contact him at [email protected]; www-306.ibm.com/software/rational/bios/ambler.html.

References
1. D. Astels, Test Driven Development: A Practical Guide, Prentice Hall, 2003.
2. K. Beck, Test Driven Development: By Example, Addison-Wesley, 2003.
3. M. Fowler, Refactoring: Improving the Design of Existing Code, Addison-Wesley Longman, 1999.
4. Behavior Driven Development, Mar. 2007; www.behaviour-driven.org.
5. S.W. Ambler, Agile Database Techniques: Effective Strategies for the Agile Software Developer, John Wiley & Sons, 2003.
6. S.W. Ambler, Database Regression Testing, 2006; www.agiledata.org/essays/databaseTesting.html.
7. S.A. Becker and A. Berkemeyer, "The Application of a Data-Field Software Testing Technique to Uncover Errors in Database Systems," Proc. 20th Pacific Northwest Software Quality Conf., PNSQC/Pacific Agenda, 1999, pp. 173-183.
8. F. Swiderski and W. Snyder, Threat Modeling, Microsoft Press, 2004.
9. S.W. Ambler and P. Sadalage, Refactoring Databases: Evolutionary Database Design, Addison-Wesley, 2006.
10. M. Fowler and P. Sadalage, "Evolutionary Database Design," 2003; www.martinfowler.com/articles/evodb.html.
11. S.W. Ambler, "Development Sandboxes: An Agile Best Practice," 2002-2006; www.agiledata.org/essays/sandboxes.html.

