Empirical Evaluation of the Effectiveness and Reliability of Software Testing Adequacy Criteria and Reference Test Systems
Mark Jason Hadley

PhD
University of York
Department of Computer Science
September 2013

Abstract

This PhD thesis reports the results of experiments conducted to investigate the effectiveness and reliability of ‘adequacy criteria’ - the criteria used by testers to determine when to stop testing. The research reported here is concerned with the empirical determination of the effectiveness and reliability both of test sets that satisfy major general structural code coverage criteria and of test sets crafted by experts for testing specific applications.

We use automated test data generation and subset extraction techniques to generate multiple test sets satisfying widely used coverage criteria (statement, branch and MC/DC coverage). The results show that confidence in the reliability of such criteria is misplaced.

We also consider the fault-finding capabilities of three test suites created by the international community to assure implementations of the Data Encryption Standard (a block cipher). We do this by means of mutation analysis. The results show that not all of these suites are mutation adequate, but they are generally highly effective. The block cipher implementations are also seen to be highly ‘testable’ (i.e. they do not mask faults).

Contents

Abstract
Table of Tables
Table of Figures
Acknowledgements
Author's declaration
1. Introduction
   1.1 Background and Motivation
   1.2 Research Objectives
   1.3 The Aims and Goals of the Thesis
   1.4 Overview and Structure of the Thesis
2. Literature Survey
   2.1 Fundamental Concepts
      2.1.1 Testing – the English Usage
      2.1.2 Software Testing – a Technical Usage
      2.1.3 Faults and Failures
      2.1.4 Faults, Failures and Testing
      2.1.5 Terminology in the Literature
      2.1.6 What is Tested and Where?
      2.1.7 Phases of Dynamic Testing: Unit, Integration, System and Acceptance Testing
   2.2 Characteristics of Testing
      2.2.1 Introduction
      2.2.2 Static v Dynamic
      2.2.3 Functional v Structural
      2.2.4 Formal v Informal
      2.2.5 Manual v Automated
      2.2.6 Testing v Debugging
   2.3 Software Standards, Process Models and Metrics
      2.3.1 Introduction
      2.3.2 Software Standards
      2.3.4 Software Testing Metrics
      2.3.5 Software Projects Continue to Fail
   2.4 Model Testing, Reliability Growth Models and Fault Adequacy Techniques
      2.4.1 Introduction
      2.4.2 Model Based Testing
      2.4.3 Reliability Growth Models
      2.4.4 Fault Based Adequacy Techniques
      2.4.5 Execution, Infection and Propagation (PIE)
   2.5 What do Empirical and Case Studies tell us about Software Testing?
   2.6 When Can We Stop Testing?
      2.6.1 Introduction
      2.6.2 Can We Stop Testing Now?
   2.7 Test Set Sub-setting via Heuristic Search
   2.8 Summary
3. The Research Test Framework and its Components
   3.1 Framework
      3.1.1 Overview of the Framework
      3.1.2 Code-Coverage
      3.1.3 Test Set Generation
      3.1.4 JUnit and the Framework
      3.1.5 Test Subset Extraction
         3.1.5.1 Genetic Algorithms
      3.1.6 Mutation
      3.1.7 Mutation Score Generation
   3.2 Testing Framework Refinement
      3.2.1 Overview of the Framework Refinement
      3.2.2 Test Case Generation and Execution
      3.2.3 Mutant Testing and Mutant Score Generation
   3.3 Framework Summary
4. Testing Code Coverage Criteria
   4.1 Numerical Programs Under Test and Their Program Properties
   4.2 Mutations Injected and Results
      4.2.1 Full Test Set Results
      4.2.2 Rationale for Different MS in Numerical Recipes
      4.2.3 Test Subset Results
      4.2.4 Results from Miscellaneous and Sorting Algorithms
   4.3 Program Properties
   4.4 Conclusions
5. Testing Encryption Algorithms
   5.1 Programs under Test, Program Properties and Test Vectors
   5.2 Mutation Results
   5.4 Conclusions