<<

TESTING STRATEGIES

M. Weintraub and F. Tip

Thanks go to Andreas Zeller for allowing incorporation of his materials

Testing: a procedure intended to establish the quality, performance, or reliability of something, esp. before it is taken into widespread use.

TESTING

Software Testing is the process of exercising a program with the specific intent of finding errors prior to delivery to the end user.

TECHNIQUES FOR EVALUATING SOFTWARE

Testing (dynamic verification)
Program Analysis (static or dynamic)
Inspections (static verification)
Proofs (static verification)

THE CURSE OF TESTING

Dijkstra's Curse: Testing can show the presence but not the absence of errors → optimistic inaccuracy

[Diagram: a handful of test runs sampled from the ∞ possible runs]

STATIC ANALYSIS

Zeller's Corollary: Static analysis can confirm the absence but not the presence of errors → pessimistic inaccuracy. Because of the halting problem, static checking for a matching lock/unlock pair is necessarily inaccurate:

    if ( ... ) {        <- we cannot tell whether this condition ever holds
        ...
        lock(S);
    }
    ...
    if ( ... ) {
        ...
        unlock(S);
    }
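The pessimism can be made concrete with a toy path-insensitive checker (illustrative Python of my own, not from the slides): it treats each guard as independently true or false, so it flags lock/unlock mismatches even when the two conditions always agree at runtime.

```python
from itertools import product

def lock_states(guarded_ops):
    """Enumerate the lock balance over all syntactic paths, treating every
    guard as independently true or false -- the pessimistic assumption a
    simple static checker makes, since it cannot decide the conditions."""
    states = set()
    for choices in product([True, False], repeat=len(guarded_ops)):
        held = 0
        for taken, op in zip(choices, guarded_ops):
            if taken:
                held += 1 if op == "lock" else -1
        states.add(held)
    return states

# The slide's shape: `if (c1) lock(S); ... if (c2) unlock(S);`
# Even if c1 and c2 always agree at runtime (making the code correct),
# the checker also explores the mismatched paths and reports them.
print(sorted(lock_states(["lock", "unlock"])))  # [-1, 0, 1]
```

A balance of 1 means "locked but never unlocked" and -1 "unlocked without locking"; a path-insensitive tool reports both, even though only the balanced paths may be feasible.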

COMBINING METHODS

[Diagram: proofs verify abstractions of the ∞ possible runs; test runs cover individual runs; some properties remain unverified]

WHY IS SOFTWARE VERIFICATION HARD?

1. Many different quality requirements
2. Evolving (and deteriorating) structure
3. Inherent non-linearity
4. Uneven distribution of faults


If an elevator can safely carry a load of 1000 kg, it can also safely carry any smaller load


In contrast, if a procedure correctly sorts a set of 256 elements, it may still fail on a set of 255 or 53 elements, as well as on 257 or 1023.

TRADE-OFFS

We can be inaccurate (optimistic or pessimistic)

or we can simplify properties…

but we cannot have it all!

RECALL THE WATERFALL MODEL (1968)

Communication: project initiation, requirements gathering
Planning: estimating, scheduling, tracking
Modeling: analysis, design
Construction: code, test
Deployment: delivery, support, feedback

TESTING FALLS TOWARD THE END…


WE TEND TO ASSUME OUR CODE IS DONE WHEN

We built it!

BUT, SHALL WE DEPLOY IT?

Is it really ready for release?

SO LET'S FOCUS ON THE CONSTRUCTION STEP

Construction: code, test (from the Waterfall Model, 1968)

V & V: VALIDATION & VERIFICATION

Validation: ensuring that software has been built according to customer requirements ("Are we building the right product?")

Verification: ensuring that software correctly implements a specific function ("Are we building the product right?")

VALIDATION AND VERIFICATION

[Diagram: actual requirements ↔ SW specs ↔ system]

Validation: involves testing, user feedback, & product trials
Verification: includes testing, code inspections, static analysis, proofs

VALIDATION

if a user presses a request button at floor i, an available elevator must arrive at floor i soon

not verifiable, but may be validated

VERIFICATION

if a user presses a request button at floor i, an available elevator must arrive at floor i within 30 seconds

verifiable

CORE QUESTIONS

1. When does V&V start? When is it done?

2. Which techniques should be applied?

3. How do we know a product is ready?

4. How can we control the quality of successive releases?

5. How can we improve development?

WHEN DOES V&V START? WHEN IS IT DONE?

WATERFALL SEPARATES CODING AND TESTING INTO TWO ACTIVITIES

[Diagram: Code → Test]

FIRST CODE, THEN TEST

1. Developers of software should do no testing at all

2. Software should be “tossed over a wall” to strangers who will test it mercilessly

3. Testers should get involved with the project only when testing is about to begin

V MODEL

[V-model diagram: development phases paired with test levels, e.g. module test]

From: Pezzè + Young, "Software Testing and Analysis"

UNIT TESTS

Aim to uncover errors at module boundaries

Typically written by the developers themselves

Should be completely automatic (→ regression testing)
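A minimal automated unit test might look like this in Python's unittest (the JUnit analog); the `clamp` function stands in for a hypothetical module under test:

```python
import unittest

def clamp(x, lo, hi):
    """Hypothetical module under test: clamp x into the range [lo, hi]."""
    return max(lo, min(x, hi))

class ClampTest(unittest.TestCase):
    def test_inside_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_outside_range(self):
        self.assertEqual(clamp(-3, 0, 10), 0)   # below the range
        self.assertEqual(clamp(42, 0, 10), 10)  # above the range

if __name__ == "__main__":
    # Completely automatic: the runner executes every test, no human needed.
    unittest.main(argv=["tests"], exit=False)
```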

TESTING COMPONENTS: STUBS AND DRIVERS

A driver exercises a module's functions

A stub simulates not-yet-ready modules; frequently realized as mock objects

[Diagram: a driver above the module under test, stubs below]
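As a sketch (all names hypothetical): the driver is the test code exercising `checkout`, while the stub stands in for a payment module that is not ready yet.

```python
class PaymentServiceStub:
    """Stub: simulates the not-yet-ready payment module with canned replies."""
    def charge(self, amount):
        return {"status": "ok", "amount": amount}

def checkout(cart_total, payment_service):
    """Module under test; depends only on the payment service's interface."""
    result = payment_service.charge(cart_total)
    return result["status"] == "ok"

# Driver: exercises the module's function with the stub plugged in.
if __name__ == "__main__":
    assert checkout(99.50, PaymentServiceStub())
    print("checkout exercised against stub")
```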

PUTTING THE PIECES TOGETHER: INTEGRATION TESTS

General idea: Constructing software while conducting tests

Options: 1. Big Bang 2. Incremental Construction

BIG BANG APPROACH

All components are combined in advance

The entire program is tested as a whole

Chaos results!

For every failure, the entire program must be taken into account.

TOP-DOWN INTEGRATION

Top module is tested with stubs (and then used as driver)

Stubs are replaced one at a time ("depth first")

As new modules are integrated, tests are re-run

Allows for early demonstration of capability

[Diagram: module A at the top; stubs for B, C, D replaced incrementally]

BOTTOM-UP INTEGRATION

Bottom modules are implemented first and combined into clusters

Drivers are replaced one at a time

Removes the need for complex stubs

[Diagram: drivers above bottom-level modules, e.g. E and F]

SANDWICH INTEGRATION

Combines bottom-up and top-down integration: top modules are tested with stubs, bottom modules with drivers

Combines the best of the two approaches

[Diagram: stubs below top module A; drivers above bottom modules D, E, F]

ONE DIFFERENCE FROM UNIT TESTING: EMERGENT BEHAVIOR

Some behaviors are only clear when components are put together.

This has to be tested too, although it can be very hard to plan in advance!

(It’s almost always discovered after the fact…)

Causes test suites/cases to be refactored.

NEXT CONCERN: WHO OWNS RESPONSIBILITY FOR TESTING THE SOFTWARE?

Developer: understands the system; may test gently if driven by delivery; may miss things because of familiarity

Independent Tester: must learn about the system; likely won't be gentle if driven by quality, but faces a steep learning curve

WEINBERG'S LAW

A developer is unsuited to test his or her code.

From Gerald Weinberg, "The Psychology of Computer Programming"

Theory: As humans want to be honest with themselves, developers are blindfolded with respect to their own mistakes.

Cynical Theory: People are not entirely honest and can take shortcuts.

So, how might you change the conditions to avoid these pitfalls?

MAKE EVERYONE A TESTER!

✓ Experienced Outsiders and Clients: good for finding gaps missed by developers, especially domain-specific items

✓ Inexperienced Users: good for illuminating other, perhaps unintended uses/errors

✓ Mother Nature: crashes tend to happen during an important client/customer demo…


SPECIAL KINDS OF SYSTEM TESTING

Recovery Testing: forces the software to fail in a variety of ways and verifies that recovery is properly performed

Security Testing: verifies that protection mechanisms built into a system will, in fact, protect it from unauthorized access/disclosure/modification/destruction

Stress Testing: executes a system in a manner that demands resources in abnormal quantity, frequency, or volume

Performance Testing: tests the run-time performance of software within the context of an integrated system

PERFORMANCE TESTING

Measures a system’s capacity to process a specific load over a specific time-span, usually: ▪ number of concurrent users ▪ specific number of concurrent transactions ▪ other options: throughput, server response time, server request round-trip time, …

Involves defining and running operational profiles that reflect expected use
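An operational-profile-driven load test could be sketched like this (the profile, the 1 ms service time, and all names are illustrative assumptions):

```python
import random
import time

OPERATIONAL_PROFILE = {   # expected mix of operations, by relative frequency
    "browse": 0.70,
    "search": 0.25,
    "checkout": 0.05,
}

def handle(op):
    """Stand-in for the system under test."""
    time.sleep(0.001)     # simulated service time
    return op

def run_load(n_requests):
    """Fire requests drawn from the operational profile; return throughput."""
    ops, weights = zip(*OPERATIONAL_PROFILE.items())
    start = time.perf_counter()
    for _ in range(n_requests):
        handle(random.choices(ops, weights=weights)[0])
    elapsed = time.perf_counter() - start
    return n_requests / elapsed   # requests per second

print(f"throughput: {run_load(200):.0f} req/s")
```

A real harness would fire requests concurrently and also record response times, not just throughput.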

TYPES OF PERFORMANCE TESTING

Load: aims to assess compliance with non-functional requirements
Stress: identifies system capacity limits
Spike: testing involving rapid swings in load
Endurance (aka Soak): continuous operation at a given load

SECURITY TESTING

Confidentiality: information protection from unauthorized access or disclosure
Integrity: information protection from unauthorized modification or destruction
Availability: system protection from unauthorized disruption


ACCEPTANCE TESTING

Acceptance testing checks whether contractual requirements are met

Typically incremental: alpha test at the producer's site, beta test at the user's site

Work is over when acceptance testing is done

WHICH TECHNIQUES SHOULD BE APPLIED?

TECHNIQUES FOR EVALUATING SOFTWARE

Testing (dynamic verification)
Program Analysis (static or dynamic)
Inspections (static verification)
Proofs (static verification)

HOW DO WE KNOW WHEN A PRODUCT IS READY?

Let the customer test it :-)

We’re out of time…

Relative to a theoretically sound and experimentally validated statistical model, we have done sufficient testing to say with 95% confidence that the probability of 1,000 CPU hours of failure-free operation is ≥ 0.995.

HOW CAN WE CONTROL THE QUALITY OF SUCCESSIVE RELEASES?

REGRESSION TESTS

Set up automated tests using, e.g., JUnit

Ideally, run regression tests after each change

If running the tests takes too long:
1. Prioritize and run a subset
2. Apply regression test selection to determine tests that are impacted by a set of changes
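Regression test selection reduces to intersecting each test's covered modules with the changed set; the coverage map and names below are hypothetical:

```python
# Hypothetical map from each test to the modules it covers.
TEST_COVERAGE = {
    "test_login":    {"auth", "session"},
    "test_search":   {"index", "query"},
    "test_checkout": {"cart", "payment", "session"},
}

def select_tests(changed_modules):
    """Return only the tests impacted by the changed modules."""
    return sorted(
        name for name, covered in TEST_COVERAGE.items()
        if covered & set(changed_modules)
    )

print(select_tests({"session"}))  # ['test_checkout', 'test_login']
print(select_tests({"query"}))    # ['test_search']
```

In practice the coverage map comes from instrumentation of a previous run, and it must be refreshed as the code evolves.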

COLLECTING DATA

REMEMBER PARETO'S LAW

Approximately 80% of defects come from 20% of modules
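With per-module defect counts in hand (the counts below are invented for illustration), checking the Pareto concentration takes a few lines:

```python
# Hypothetical defect counts per module.
defects = {"mod_a": 41, "mod_b": 3, "mod_c": 2, "mod_d": 39, "mod_e": 5,
           "mod_f": 1, "mod_g": 4, "mod_h": 2, "mod_i": 2, "mod_j": 1}

ranked = sorted(defects.values(), reverse=True)
top_20pct = ranked[: max(1, len(ranked) // 5)]  # counts of the top 20% of modules
share = sum(top_20pct) / sum(ranked)
print(f"top 20% of modules hold {share:.0%} of defects")  # 80% here
```

Tracking this ratio release over release shows whether fault-prone modules are being cleaned up or merely accumulating.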

HOW CAN WE IMPROVE DEVELOPMENT?

STRATEGIC ISSUES

Specify requirements in a quantifiable manner
State testing objectives explicitly
Understand the users of the software and develop a profile for each user category
Develop a testing plan that emphasizes "rapid cycle testing"
Build "robust" software that is designed to test itself
Use effective formal technical reviews as a filter prior to testing
Conduct formal technical reviews to assess the test strategy and the test cases themselves
Develop a continuous improvement approach for the testing process

TEST-DRIVEN DEVELOPMENT

writing tests before you implement functionality involves extra effort, but…

… it forces you to think about the problem you are trying to solve more concretely and formulate a solution more quickly

…and you will regain the time spent on unit tests by catching problems early and reduce time spent later on
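A minimal TDD illustration (hypothetical example, using Python's unittest as the JUnit analog): the tests are written first and drive the implementation below them.

```python
import unittest

# Step 1 (written first): tests that specify the desired behavior.
class WordCountTest(unittest.TestCase):
    def test_counts_repeated_words(self):
        self.assertEqual(word_count("to be or not to be"),
                         {"to": 2, "be": 2, "or": 1, "not": 1})

    def test_empty_string(self):
        self.assertEqual(word_count(""), {})

# Step 2: the simplest implementation that makes the tests pass.
def word_count(text):
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

if __name__ == "__main__":
    unittest.main(argv=["tests"], exit=False)
```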

TESTING: RECOMMENDATIONS

1. Write tests that cover a partition of the input space, and that cover specific features

2. Achieve good code coverage

3. Create an automated, fast running test suite, and use it all the time

4. Have tests at different levels that cover your system's functionality

5. Set up your tests so that, when a failure occurs, it pinpoints the issue so that it does not require much further debugging
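Recommendation 1 in miniature (hypothetical function): one test per partition of the input space, plus the boundary between them.

```python
def absolute(x):
    """Hypothetical function under test."""
    return x if x >= 0 else -x

# Partitions of the input space: negatives, the zero boundary, positives.
assert absolute(-7) == 7    # negative partition
assert absolute(0) == 0     # boundary value
assert absolute(12) == 12   # positive partition
print("one representative per partition covered")
```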
