A Comprehensive Framework for Testing Database-Centric Software Applications
Gregory M. Kapfhammer
Department of Computer Science, University of Pittsburgh
PhD Dissertation Defense April 19, 2007
PhD Dissertation Defense, University of Pittsburgh, April 19, 2007

Dissertation Committee
Director: Dr. Mary Lou Soffa (University of Virginia)
Dr. Panos Chrysanthis (University of Pittsburgh)
Dr. Bruce Childers (University of Pittsburgh)
Dr. Jeffrey Voas (SAIC)
With support and encouragement from countless individuals!
Motivation
The Risks Digest, Volume 22, Issue 64, 2003
Jeppesen reports airspace boundary problems
About 350 airspace boundaries contained in Jeppesen NavData are incorrect, the FAA has warned. The error occurred at Jeppesen after a software upgrade when information was pulled from a database containing 20,000 airspace boundaries worldwide for the March NavData update, which takes effect March 20.
Important Point: Practically all use of databases occurs from within application programs [Silberschatz et al., 2006, pg. 311].
Research Contributions
Comprehensive framework that tests a program’s interaction with the complex state and structure of a database
Database interaction fault model
Database-aware representations
Test adequacy
Test coverage monitoring
Regression testing
Worst-case analysis of the algorithms and empirical evaluation with six case study applications
Traditional Software Testing

[Figure: program P, compiled to byte code, maps an input to an output (e.g., "Final Result: 45") while running on an execution environment consisting of a virtual machine, graphical interface, database, file system, and operating system]
Defects (e.g., bugs, faults, errors) can exist in program P and in all aspects of P's execution environment
Testing Environment Interactions

[Figure: the same program P and execution environment, now highlighting P's interactions with the virtual machine, database, operating system, and file system]
Defects can also exist in P's interaction with its environment
Focus on Database Interactions
[Figure: method m of program P submits select, insert, update, and delete statements to databases D1 through De]
Program P can view and/or modify the state of the database
Types of Applications
Database-Centric Applications

Classified by interaction approach (Embedded vs. Interface) and program location (Inside DBMS vs. Outside DBMS)
Testing framework relevant to all types of applications

Current tool support focuses on Interface-Outside applications
Example: Java application that submits SQL Strings to HSQLDB relational database using JDBC drivers
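As a hedged illustration only, the kind of database interaction described above can be sketched in Python with the standard sqlite3 module rather than the Java/JDBC/HSQLDB stack the tool support actually targets; the UserInfo table and its contents are hypothetical, not taken from the case study applications.

```python
import sqlite3

# Hypothetical schema and data, standing in for the Java/JDBC/HSQLDB
# setup described on the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserInfo (id INTEGER PRIMARY KEY, name TEXT, locked INTEGER)")

# Each statement below is a database interaction point that a
# database-aware testing framework would treat as part of the
# program's observable behavior.
conn.execute("INSERT INTO UserInfo VALUES (1, 'alice', 0)")          # modifies UserInfo
conn.execute("UPDATE UserInfo SET locked = 1 WHERE id = 1")          # modifies UserInfo
rows = conn.execute("SELECT name, locked FROM UserInfo").fetchall()  # views UserInfo
print(rows)  # → [('alice', 1)]
```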
Research Contributions
Database Interaction Fault Model
Test Adequacy Criteria
Test Coverage Monitoring
Regression Testing
  Reduction
  Prioritization
Database Interaction Faults: (1-v)
P uses update or insert to incorrectly modify items within the database

Commission fault that violates expected database validity

Database-aware adequacy criteria can support fault isolation

[Figure: method m's insert/update turning the expected "before" database state into an incorrect actual "after" state]
Database Interaction Faults: (1-c)
P uses delete to remove incorrect items from the database

Commission fault that violates expected database completeness

Database-aware adequacy criteria can support fault isolation

[Figure: method m's delete turning the expected "before" database state into an incorrect actual "after" state]
Data Flow-Based Test Adequacy
node n3 (in method mi): delete from R where A > 100 -- define(R)
node n6: select * from R -- use(R)
The intraprocedural database interaction association ⟨n3, n6, R⟩ exists within method mi
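As a rough sketch only (the dissertation's analysis operates on a database-aware ICFG, not on raw SQL strings), the defining or using character of an interaction can be read off the statement's verb:

```python
# Hedged sketch: classify a SQL statement as defining (writing) or
# using (reading) the relation it touches. Real data flow analysis
# would also track the attributes, records, and values involved.
def interaction_kind(sql: str) -> str:
    verb = sql.strip().split()[0].upper()
    if verb in ("INSERT", "UPDATE", "DELETE"):
        return "define"
    if verb == "SELECT":
        return "use"
    return "other"

print(interaction_kind("delete from R where A > 100"))  # node n3: define
print(interaction_kind("select * from R"))              # node n6: use
```

Pairing a define at n3 with a later use at n6 on the same relation R yields the association ⟨n3, n6, R⟩.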
Research Contributions
Database Interaction Fault Model
Test Adequacy Criteria
Test Coverage Monitoring
Regression Testing
  Reduction
  Prioritization
Test Adequacy Component
[Figure: the test adequacy component takes program P and criterion C; an ICFG constructor and a database-aware representation constructor combine the ICFG with the database interaction entities to form the DI-ICFG, which a data flow analyzer turns into database interaction associations]
Process: Create a database-aware representation and perform data flow analysis

Purpose: Identify the database interaction associations (i.e., the test requirements)
Database-Aware Representation
Database interaction graphs (DIGs) are placed before the interaction point

Multiple DIGs can be integrated into a single CFG

Analyze the interaction in a control-flow sensitive fashion

[Figure: CFG for lockAccount with DIGs Gr1 and Gr2 inserted before the interaction point; the code builds qu_lck = "UPDATE UserInfo ..." + temp1 + ";", creates update_lock = m_connect.createStatement(), runs result_lock = update_lock.executeUpdate(qu_lck), and sets completed = true if result_lock == 1]
Data Flow Time Overhead
[Chart: data flow analysis time (sec) for the TM application by database granularity]

Granularity:  P      P+D    P+R    P+Rc   P+A    P+Av
Time (sec):   20.50  20.74  20.77  20.78  20.94  21.06
2.7% increase in time overhead from P to P + Av (TM)
Research Contributions
Database Interaction Fault Model
Test Adequacy Criteria
Test Coverage Monitoring
Regression Testing
  Reduction
  Prioritization
Database-Aware Coverage Monitoring
[Figure: instrumentation takes the program, test suite, and adequacy criteria and produces an instrumented program and instrumented test suite; test suite execution yields coverage results, which the adequacy calculation combines with the test requirements to produce adequacy measurements]
Purpose: Record how the program interacts with the database during test suite execution
Database-Aware Instrumentation
Test coverage monitoring instrumentation is classified by interaction location (Program vs. Test Suite) and interaction type (Defining, Using, or Defining-Using)
Efficiently monitor coverage without changing the behavior of the program under test

Record coverage information in a database interaction calling context tree (DI-CCT)

Configuring the Coverage Monitor
Configuration of the Test Coverage Monitor:

Instrumentation: static vs. dynamic; source code vs. bytecode; traditional vs. database-aware
Tree format: binary vs. XML; standard vs. compressed
Tree type: CCT vs. DCT (database-aware: DI-CCT vs. DI-DCT)
Interaction level: database, relation, attribute, record, or attribute value
Flexible and efficient approach that fully supports both traditional and database-centric applications
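A minimal sketch of the DI-CCT idea with made-up method and entity names: each calling context is a path of method calls, and a database interaction is recorded as a leaf naming the entity that is defined or used. This is an illustrative data structure, not the dissertation's implementation.

```python
class CCTNode:
    """One node of a (database interaction) calling context tree."""
    def __init__(self, label):
        self.label = label
        self.children = {}

    def child(self, label):
        # Reuse the existing child for this label so repeated calls in
        # the same context share one node, as in a CCT.
        return self.children.setdefault(label, CCTNode(label))

def paths(node, prefix=()):
    """Yield every root-to-leaf calling context in the tree."""
    prefix = prefix + (node.label,)
    if not node.children:
        yield prefix
    for c in node.children.values():
        yield from paths(c, prefix)

# Hypothetical contexts: main calls lockAccount, which updates UserInfo;
# main also calls getBalance, which reads Account.
root = CCTNode("main")
root.child("lockAccount").child("define(UserInfo)")
root.child("getBalance").child("use(Account)")

for p in paths(root):
    print(" -> ".join(p))
```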
Static Instrumentation: Time
[Chart: static instrumentation time (sec) by application, with per-application times of 4.391 to 5.583 seconds for FF, PI, RM, ST, TM, and GB, and 8.687 seconds for all applications combined]
Attach probes to all of the applications in less than nine seconds

Static approach is less flexible than dynamic instrumentation

Static Instrumentation: Space
[Chart: size (bytes) of application GB before (A) and after (As) static instrumentation under the ZIP, GZIP, and PACK compression techniques; reported sizes include 75782, 68456, 68475, 25084, 19419, and 9034 bytes]
Increase in bytecode size may be large (space vs. time trade-off)
Static vs. Dynamic Instrumentation
[Chart: test coverage monitoring (TCM) time (sec) for application GB by instrumentation technique and tree type; approximate values as read from the chart]

Technique:   Norm   Sta-CCT  Sta-DCT  Dyn-CCT  Dyn-DCT
Time (sec):  6.939  7.626    8.026    11.084   11.435
Static is faster than dynamic / CCT is faster than DCT

The coverage monitor is both efficient and effective
Size of the Instrumented Applications
Compr Tech   Before Instr (bytes)   After Instr (bytes)
None         29275                  887609
Zip          15623                  41351
Gzip         10624                  35594
Pack         5699                   34497
Average static size across all case study applications

Compress the bytecodes with general purpose techniques

Specialized compressor nicely reduces space overhead
Database Interaction Levels
CCT Interaction Level   TCM Time (sec)   Percent Increase (%)
Program                 7.44             12.39
Database                7.51             13.44
Relation                7.56             14.20
Attribute               8.91             34.59
Record                  8.90             34.44
Attribute Value         10.14            53.17
Static instrumentation supports efficient monitoring

53% increase in testing time at finest level of interaction
Research Contributions
Database Interaction Fault Model
Test Adequacy Criteria
Test Coverage Monitoring
Regression Testing
  Reduction
  Prioritization
Database-Aware Regression Testing
[Figure: regression testing loop; the original test suite and program, together with the coverage report and any program or database changes, feed reduction or prioritization, and the modified test suite is then executed to produce testing results]
Version specific vs. general approach

Use paths in the coverage tree as a test requirement

Prioritization re-orders the test suite so that it is more effective

Reduction identifies a smaller suite that still covers all of the requirements
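A hedged sketch of greedy, coverage-based prioritization: unlike reduction, it orders the entire suite, repeatedly picking the test that covers the most requirements still uncovered. The toy test-to-requirement mapping is invented for illustration; the dissertation's techniques work over coverage tree paths.

```python
def prioritize(coverage):
    """Greedily order tests so each pick covers the most requirements
    that remain uncovered; ties break alphabetically."""
    remaining = set().union(*coverage.values())
    tests, order = dict(coverage), []
    while tests:
        best = max(sorted(tests), key=lambda t: len(tests[t] & remaining))
        order.append(best)
        remaining -= tests.pop(best)
    return order

# Hypothetical coverage: which requirements each test exercises.
coverage = {
    "T1": {"R1", "R2"},
    "T2": {"R2", "R3", "R4"},
    "T3": {"R1", "R5"},
}
print(prioritize(coverage))  # → ['T2', 'T3', 'T1']
```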
Configuring the Regression Tester
Configuration of the Regression Tester:

Technique: reduction vs. prioritization; greedy (overlap-aware or not overlap-aware), reverse, or random
Test cost: unit vs. actual; cost-coverage ratio
Requirements: traditional (data flow) vs. database-aware (coverage tree path: unique, dominant, super path, or containing path)
Type of tree: DCT vs. CCT
Regression testing techniques can be used in the version specific model

Research Contributions
Database Interaction Fault Model
Test Adequacy Criteria
Test Coverage Monitoring
Regression Testing
  Reduction
  Prioritization
Finding the Overlap in Coverage

[Figure: bipartite graph connecting test requirements R1 through R7 to the tests T1 through T12 that cover them]
Rj → Ti means that requirement Rj is covered by test Ti
T = ⟨T2, T3, T6, T9⟩ covers all of the test requirements
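The reduction idea behind this slide is essentially greedy set cover; a minimal sketch with an invented coverage mapping (the slide's actual requirement-to-test graph is not reproduced here):

```python
def reduce_suite(coverage):
    """Keep greedily choosing the test that covers the most uncovered
    requirements until every requirement is covered; ties break
    alphabetically."""
    uncovered = set().union(*coverage.values())
    kept = []
    while uncovered:
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        kept.append(best)
        uncovered -= coverage[best]
    return kept

# Hypothetical mapping in which T5 is redundant: everything it covers
# is already covered by T2 and T3.
coverage = {
    "T2": {"R1", "R2", "R3"},
    "T3": {"R4", "R5"},
    "T5": {"R1", "R4"},
    "T6": {"R6"},
    "T9": {"R7"},
}
print(reduce_suite(coverage))  # → ['T2', 'T3', 'T6', 'T9']
```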
Reducing the Size of the Test Suite
App (size)   Rel          Attr           Rec             Attr Value    All
RM (13)      (7, .462)    (7, .462)      (10, .300)      (9, .308)     (8.25, .365)
FF (16)      (7, .563)    (7, .563)      (11, .313)      (11, .313)    (9, .438)
PI (15)      (6, .600)    (6, .600)      (8, .700)       (7, .533)     (6.75, .550)
ST (25)      (5, .800)    (5, .760)      (11, .560)      (10, .600)    (7.75, .690)
TM (27)      (14, .481)   (14, .481)     (15, .449)      (14, .481)    (14.25, .472)
GB (51)      (33, .352)   (33, .352)     (33, .352)      (32, .373)    (32.75, .358)
All (24.5)   (12, .510)   (12.17, .503)  (14.667, .401)  (13.83, .435)

Each cell: (reduced suite size, reduction factor)
Reduction factor for test suite size varies from .352 to .8
Reducing the Testing Time
[Chart: time reduction factor for application GB at the Rc and Av interaction levels under GRO, GRC, GRV, GRR, RVR, and RAR; reported factors include 0.94, 0.81, 0.78-0.79, 0.39, 0.36, and 0.06]
GRO reduces test execution time even though it removes few tests
Preserving Requirement Coverage
[Chart: coverage preservation factor for application GB at the Rc and Av interaction levels under GRO, GRC, GRV, GRR, RVR, and RAR; GRO preserves all coverage (factor 1.0) while the others range from 0.09 to 0.99]
GRO guarantees coverage preservation - others do not
Research Contributions
Database Interaction Fault Model
Test Adequacy Criteria
Test Coverage Monitoring
Regression Testing
  Reduction
  Prioritization
Cumulative Coverage of a Test Suite
[Figure: cumulative coverage κ(T, t) over testing time t. Coverage steps up as each test finishes: Cover(Π(T1)) after T1, Cover(∪_{i=1}^{n-1} Π(Ti)) after Tn-1, and Cover(Π(T)) when all n tests are done at time t(n). The metric is the area under this covered-test-requirements curve]
Calculate the coverage effectiveness of a prioritization: Actual / Ideal

Improving Coverage Effectiveness
[Chart: coverage effectiveness for application GB at the Rc and Av interaction levels under GPO, GPC, GPV, GPR, RVP, ORP, and RAP; most techniques score between 0.84 and 0.94, while the worst orderings score 0.22]
GPO is the best choice - the original ordering is poor
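Coverage effectiveness compares areas under cumulative-coverage step curves, the actual ordering against an ideal one. A hedged sketch of that calculation (the test completion times and coverage counts below are invented):

```python
def area(times, covered, horizon):
    """Area under a step function: coverage jumps to covered[i]
    when the test finishing at times[i] completes."""
    total, prev_t, prev_c = 0.0, 0.0, 0
    for t, c in zip(times, covered):
        total += prev_c * (t - prev_t)
        prev_t, prev_c = t, c
    total += prev_c * (horizon - prev_t)
    return total

# Hypothetical run: 6 requirements total, observed over 5 time units.
actual = area([1, 2, 4], [2, 3, 6], horizon=5)  # slow early coverage
ideal = area([1, 2, 4], [6, 6, 6], horizon=5)   # all coverage up front
print(round(actual / ideal, 3))  # → 0.583
```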
Conclusions
Practically all use of databases occurs from within application programs - must test these interactions!

Incorporate the state and structure of the database during all testing phases

Fault model, database-aware representations, test adequacy, test coverage monitoring, regression testing

Experimental results suggest that the techniques are efficient and effective for the chosen case study applications

A new perspective on software testing: it is important to test a program's interaction with the execution environment
Future Work
Avoiding database server restarts during test suite execution

Time-aware regression testing

Input simplification and fault localization during debugging

Reverse engineering and mutation testing

New environmental factors: eXtensible markup language (XML) databases, distributed hash tables, tuple spaces

Further empirical studies with additional database-centric applications and traditional programs

Further Resources
Gregory M. Kapfhammer and Mary Lou Soffa. A Family of Test Adequacy Criteria for Database-Driven Applications. In Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, 2003. (Distinguished Paper Award)
Gregory M. Kapfhammer. The Computer Science Handbook, chapter 105: Software Testing. CRC Press, Boca Raton, FL, Second Edition, June 2004.
Gregory M. Kapfhammer, Mary Lou Soffa, and Daniel Mossé. Testing in Resource-Constrained Execution Environments. In Proceedings of the 20th ACM/IEEE International Conference on Automated Software Engineering, Long Beach, California, November, 2005.
Kristen R. Walcott, Mary Lou Soffa, Gregory M. Kapfhammer, and Robert S. Roos. Time-Aware Test Suite Prioritization. In Proceedings of the ACM SIGSOFT/SIGPLAN International Symposium on Software Testing and Analysis, Portland, Maine, June, 2006.
Gregory M. Kapfhammer and Mary Lou Soffa. A Test Adequacy Framework with Database Interaction Awareness. ACM Transactions on Software Engineering and Methodology. Invited Paper Under Review.