TESTING STRATEGIES
M. Weintraub and F. Tip
Thanks go to Andreas Zeller for allowing incorporation of his materials.

TESTING
Testing: a procedure intended to establish the quality, performance, or reliability of something, esp. before it is taken into widespread use.
SOFTWARE TESTING
Software Testing is the process of exercising a program with the specific intent of finding errors prior to delivery to the end user.
TECHNIQUES FOR EVALUATING SOFTWARE
Testing (dynamic verification)
Program Analysis (static or dynamic)
Inspections (static verification)
Proofs (static verification)

THE CURSE OF TESTING
Dijkstra’s Curse: Testing can show the presence but not the absence of errors (optimistic inaccuracy).

(Diagram: a handful of test runs sampled from the infinitely many possible runs)

STATIC ANALYSIS
Zeller’s Corollary: Static analysis can confirm the absence but not the presence of errors (pessimistic inaccuracy).

We cannot tell whether this condition ever holds (halting problem), so static checking for matching lock/unlock is necessarily inaccurate:

    if ( .... ) {
        ...
        lock(S);
    }
    ...
    if ( ... ) {
        ...
        unlock(S);
    }
COMBINING METHODS
(Diagram: proofs cover abstractions of properties, test runs cover individual executions, and parts of the infinitely many possible runs remain unverified)

WHY IS SOFTWARE VERIFICATION HARD?
1. Many different quality requirements
2. Evolving (and deteriorating) structure
3. Inherent non-linearity
4. Uneven distribution of faults
If an elevator can safely carry a load of 1000 kg, it can also safely carry any smaller load
In contrast, if a procedure correctly sorts a set of 256 elements, it may fail on a set of 255 or 53 elements, as well as on 257 or 1023.
TRADE-OFFS
We can be inaccurate (optimistic or pessimistic)
or we can simplify properties…
but you cannot have it all!
RECALL THE WATERFALL MODEL (1968)
Communication: project initiation, requirements gathering
Planning: estimating, scheduling, tracking
Modeling: analysis, design
Construction: code, test
Deployment: delivery, support, feedback

TESTING FALLS TOWARD THE END…
WE TEND TO ASSUME OUR CODE IS DONE WHEN
We built it!
BUT, SHALL WE DEPLOY IT?
Is it really ready for release?
SO LET’S FOCUS ON THE CONSTRUCTION STEP
Waterfall Model (1968): Construction (code, test), followed by Deployment (delivery, support, feedback)

V & V: VALIDATION & VERIFICATION
Validation: Ensuring that software has been built according to customer requirements. Are we building the right product?
Verification: Ensuring that software correctly implements a specific function. Are we building the product right?
VALIDATION AND VERIFICATION
(Diagram: validation relates the system to the actual requirements; verification relates the system to the SW specs)

Validation: involves usability testing, user feedback, & product trials
Verification: includes testing, code inspections, static analysis, proofs
VALIDATION
if a user presses a request button at floor i, an available elevator must arrive at floor i soon
not verifiable, but may be validated
VERIFICATION
if a user presses a request button at floor i, an available elevator must arrive at floor i within 30 seconds
verifiable
CORE QUESTIONS
1. When does V&V start? When is it done?
2. Which techniques should be applied?
3. How do we know a product is ready?
4. How can we control the quality of successive releases?
5. How can we improve development?
WHEN DOES V&V START? WHEN IS IT DONE?

WATERFALL SEPARATES CODING AND TESTING INTO TWO ACTIVITIES
Code → Test

FIRST CODE, THEN TEST
1. Developers of the software should do no testing at all
2. Software should be “tossed over a wall” to strangers who will test it mercilessly
3. Testers should get involved with the project only when testing is about to begin
V MODEL
(V-model diagram relating development phases to test levels, e.g., module test)

From: Pezze + Young, “Software Testing and Analysis”

UNIT TESTS
Aims to uncover errors at module boundaries
Typically written by the programmer
Should be completely automatic (→ regression testing)
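For example, a complete unit test for a small module might look like this in JUnit 5. The Account class is a hypothetical example nested inside the test so the sketch is self-contained; once such tests run without manual steps, they double as regression tests.

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

class AccountTest {

    // Hypothetical module under test, nested here so the sketch compiles on its own.
    static class Account {
        private int balance;
        Account(int initialBalance) { this.balance = initialBalance; }
        void withdraw(int amount) {
            if (amount > balance) throw new IllegalArgumentException("insufficient funds");
            balance -= amount;
        }
        int balance() { return balance; }
    }

    @Test
    void withdrawReducesBalance() {
        Account account = new Account(100);
        account.withdraw(40);
        assertEquals(60, account.balance());
    }

    @Test
    void withdrawingMoreThanTheBalanceIsRejected() {
        Account account = new Account(100);
        assertThrows(IllegalArgumentException.class, () -> account.withdraw(200));
    }
}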
TESTING COMPONENTS: STUBS AND DRIVERS
A driver exercises a module’s functions
A stub simulates not-yet-ready modules
Frequently realized as mock objects

(Diagram: a driver on top calls the module under test, which calls stubs in place of its dependencies)
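A minimal sketch of both roles, assuming a hypothetical OrderProcessor that depends on a PaymentGateway module which is not yet ready: the JUnit test method acts as the driver, and the hand-written stub stands in for the missing module (a mock-object library such as Mockito could play the same role).

import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class OrderProcessorTest {

    // Interface of the not-yet-ready module.
    interface PaymentGateway {
        boolean charge(String customerId, int amountCents);
    }

    // Module under test; it depends on the missing PaymentGateway.
    static class OrderProcessor {
        private final PaymentGateway gateway;
        OrderProcessor(PaymentGateway gateway) { this.gateway = gateway; }
        boolean placeOrder(String customerId, int amountCents) {
            return amountCents > 0 && gateway.charge(customerId, amountCents);
        }
    }

    // Stub: simulates the missing module with canned behavior.
    static class AlwaysApprovingGateway implements PaymentGateway {
        @Override
        public boolean charge(String customerId, int amountCents) {
            return true; // pretend every charge succeeds
        }
    }

    // Driver: exercises the module's functions through its public interface.
    @Test
    void orderIsPlacedWhenPaymentSucceeds() {
        OrderProcessor processor = new OrderProcessor(new AlwaysApprovingGateway());
        assertTrue(processor.placeOrder("customer-42", 1999));
    }
}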
PUTTING THE PIECES TOGETHER: INTEGRATION TESTS
General idea: Constructing software while conducting tests
Options:
1. Big Bang
2. Incremental Construction
BIG BANG APPROACH
All components are combined in advance
The entire program is tested as a whole
Chaos results!
For every failure, the entire program must be taken into account.
TOP-DOWN INTEGRATION
Top module is tested with stubs (and then used as a driver)
Stubs are replaced one at a time (“depth first”)
As new modules are integrated, tests are re-run
Allows for early demonstration of capability

(Diagram: top module A with stubs B, C, D below it)
BOTTOM-UP INTEGRATION
Bottom modules are implemented first and combined into clusters
Drivers are replaced one at a time
Removes the need for complex stubs

(Diagram: drivers on top of clusters containing modules C, D, E, F)
SANDWICH INTEGRATION
Combines bottom-up and top-down integration
Top modules are tested with stubs, bottom modules with drivers
Combines the best of the two approaches

(Diagram: top module A over stubs, drivers over the bottom clusters C, D, E, F)
ONE DIFFERENCE FROM UNIT TESTING: EMERGENT BEHAVIOR
Some behaviors are only clear when components are put together.
This has to be tested too, although it can be very hard to plan in advance!
(It’s almost always discovered after the fact…)
Causes test suites/cases to be refactored.
NEXT CONCERN: WHO OWNS RESPONSIBILITY FOR TESTING THE SOFTWARE?
Developer: understands the system, but may test gently if driven by delivery and may miss things because of familiarity.
Independent Tester: must learn about the system and faces a steep learning curve, but likely won’t be gentle if driven by quality.
WEINBERG’S LAW
A developer is unsuited to test his or her code.
From Gerald Weinberg, “The Psychology of Computer Programming”

Theory: As humans want to be honest with themselves, developers are blindfolded with respect to their own mistakes.
Cynical Theory: People are not entirely honest and can take shortcuts.
So, how might you change the conditions to avoid these pitfalls?
MAKE EVERYONE A TESTER!
✓ Experienced Outsiders and Clients: good for finding gaps missed by developers, especially domain-specific items
✓ Inexperienced Users: good for illuminating other, perhaps unintended uses/errors
✓ Mother Nature: crashes tend to happen during an important client/customer demo…
SPECIAL KINDS OF SYSTEM TESTING
Recovery Testing: Forces the software to fail in a variety of ways and verifies that recovery is properly performed
Security Testing: Verifies that protection mechanisms built into a system will, in fact, protect it from unauthorized access/disclosure/modification/destruction
Stress Testing: Executes a system in a manner that demands resources in abnormal quantity, frequency, or volume
Performance Testing: Tests the run-time performance of software within the context of an integrated system
PERFORMANCE TESTING
Measures a system’s capacity to process a specific load over a specific time-span, usually:
▪ number of concurrent users
▪ specific number of concurrent transactions
▪ other options: throughput, server response time, server request round-trip time, …
Involves defining and running operational profiles that reflect expected use
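As a rough illustration of such a profile (a fixed number of concurrent users, each issuing a fixed number of requests), here is a small Java sketch that measures throughput and worst-case response time. The handleRequest method is a placeholder assumption; a real performance test would drive the deployed system, typically with a dedicated load-testing tool.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SimpleLoadTest {

    // Placeholder for the operation under test, e.g., an HTTP call into the system.
    static void handleRequest() throws InterruptedException {
        Thread.sleep(20); // simulate 20 ms of server-side work
    }

    public static void main(String[] args) throws Exception {
        int concurrentUsers = 50;   // operational profile: concurrent users
        int requestsPerUser = 20;   // transactions issued by each user

        ExecutorService pool = Executors.newFixedThreadPool(concurrentUsers);
        List<Future<Long>> results = new ArrayList<>();

        long start = System.nanoTime();
        for (int u = 0; u < concurrentUsers; u++) {
            results.add(pool.submit((Callable<Long>) () -> {
                long worst = 0;
                for (int r = 0; r < requestsPerUser; r++) {
                    long t0 = System.nanoTime();
                    handleRequest();
                    worst = Math.max(worst, System.nanoTime() - t0);
                }
                return worst; // worst response time seen by this simulated user
            }));
        }

        long worstOverall = 0;
        for (Future<Long> f : results) {
            worstOverall = Math.max(worstOverall, f.get());
        }
        pool.shutdown();

        double elapsedSec = (System.nanoTime() - start) / 1e9;
        long totalRequests = (long) concurrentUsers * requestsPerUser;
        System.out.printf("Throughput: %.1f requests/s%n", totalRequests / elapsedSec);
        System.out.printf("Worst response time: %.1f ms%n", worstOverall / 1e6);
    }
}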
TYPES OF PERFORMANCE TESTING
Load: Aims to assess compliance with non-functional requirements
Stress: Identifies system capacity limits
Spike: Testing involving rapid swings in load
Endurance (aka Soak): Continuous operation at a given load
SECURITY TESTING
Confidentiality: Information protection from unauthorized access or disclosure
Integrity: Information protection from unauthorized modification or destruction
Availability: System protection from unauthorized disruption
ACCEPTANCE TESTING
Acceptance testing checks whether contractual requirements are met
Typically incremental: alpha test at the production (developer’s) site, beta test at the user’s site
Work is over when acceptance testing is done
WHICH TECHNIQUES SHOULD BE APPLIED?
TECHNIQUES FOR EVALUATING SOFTWARE

Testing (dynamic verification)
Program Analysis (static or dynamic)
Inspections (static verification)
Proofs (static verification)
HOW DO WE KNOW WHEN A PRODUCT IS READY?
Let the customer test it :-)
We’re out of time…
Relative to a theoretically sound and experimentally validated statistical model, we have done sufficient testing to say with 95% confidence that the probability of 1,000 CPU hours of failure-free operation is ≥ 0.995.
HOW CAN WE CONTROL THE QUALITY OF SUCCESSIVE RELEASES?
REGRESSION TESTS
Set up automated tests using, e.g., JUnit
Ideally, run regression tests after each change
If running the tests takes too long:
▪ Prioritize and run a subset (see the sketch below)
▪ Apply regression test selection to determine which tests are impacted by a set of changes
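One common way to support a prioritized subset is to tag tests and filter on the tags when running the suite, as in this JUnit 5 sketch; the class, method, and tag names are illustrative, and the nested Pricing class only keeps the example self-contained.

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;

class PricingRegressionTest {

    @Test
    @Tag("smoke")    // fast, high-priority subset run after every change
    void basePriceIsUnchanged() {
        assertEquals(1000, Pricing.basePriceCents("standard"));
    }

    @Test
    @Tag("nightly")  // slower or lower-priority checks run less frequently
    void discountRulesStillApply() {
        assertEquals(900, Pricing.priceWithDiscountCents("standard", 10));
    }

    // Hypothetical module under regression test.
    static class Pricing {
        static int basePriceCents(String plan) { return "standard".equals(plan) ? 1000 : 0; }
        static int priceWithDiscountCents(String plan, int percent) {
            return basePriceCents(plan) * (100 - percent) / 100;
        }
    }
}

The build tool can then select the subset, for example via Maven Surefire’s -Dgroups=smoke or Gradle’s includeTags("smoke").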
COLLECTING DATA
REMEMBER PARETO’S LAW
Approximately 80% of defects come from 20% of modules
HOW CAN WE IMPROVE DEVELOPMENT?
STRATEGIC ISSUES
▪ Specify requirements in a quantifiable manner
▪ State testing objectives explicitly
▪ Understand the users of the software and develop a profile for each user category
▪ Develop a testing plan that emphasizes “rapid cycle testing”
▪ Build “robust” software that is designed to test itself
▪ Use effective formal technical reviews as a filter prior to testing
▪ Conduct formal technical reviews to assess the test strategy and test cases themselves
▪ Develop a continuous improvement approach for the testing process
TEST-DRIVEN DEVELOPMENT
writing tests before you implement functionality involves extra effort, but…
… it forces you to think about the problem you are trying to solve more concretely and formulate a solution more quickly
…and you will regain the time spent on unit tests by catching problems early and reduce time spent later on debugging
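A minimal sketch of that rhythm, using a hypothetical leap-year utility: the test is written first (and fails), then just enough code is added to make it pass before refactoring.

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class LeapYearTest {

    // Step 1: write the test first; it fails until isLeapYear is implemented.
    @Test
    void centuriesAreLeapYearsOnlyWhenDivisibleBy400() {
        assertTrue(LeapYear.isLeapYear(2000));
        assertFalse(LeapYear.isLeapYear(1900));
        assertTrue(LeapYear.isLeapYear(2024));
        assertFalse(LeapYear.isLeapYear(2023));
    }

    // Step 2: add just enough implementation to make the test pass, then refactor.
    static class LeapYear {
        static boolean isLeapYear(int year) {
            return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
        }
    }
}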
TESTING: RECOMMENDATIONS
1. Write tests that cover a partition of the input space, and that cover specific features (see the sketch after this list)
2. Achieve good code coverage
3. Create an automated, fast-running test suite, and use it all the time
4. Have tests that cover your system at different levels of functionality
5. Set up your tests so that, when a failure occurs, it pinpoints the issue without requiring much further debugging
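For recommendation 1, a parameterized JUnit 5 test makes the input partitions explicit. The classify function below is a hypothetical example: its input space splits into negative, zero, and positive integers, and the test covers each partition plus the boundary values.

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

class SignClassifierTest {

    // Hypothetical function under test.
    static String classify(int n) {
        if (n < 0) return "negative";
        if (n == 0) return "zero";
        return "positive";
    }

    // One representative from each partition of the input space, plus boundary values.
    @ParameterizedTest
    @CsvSource({
            "-2147483648,negative", // boundary: Integer.MIN_VALUE
            "-1,negative",
            "0,zero",
            "1,positive",
            "2147483647,positive"   // boundary: Integer.MAX_VALUE
    })
    void classifiesEachPartition(int input, String expected) {
        assertEquals(expected, classify(input));
    }
}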