Software Reliability via Run-Time Result-Checking

Hal Wasserman

University of California, Berkeley

and

Manuel Blum

City University of Hong Kong and

University of California, Berkeley

We review the field of result-checking, discussing simple checkers and self-correctors. We argue that such checkers could profitably be incorporated in software as an aid to efficient debugging and enhanced reliability. We consider how to modify traditional checking methodologies to make them more appropriate for use in real-time, real-number computer systems. In particular, we suggest that checkers should be allowed to use stored randomness: i.e., that they should be allowed to generate, preprocess, and store random bits prior to run-time, and then to use this repeatedly in a series of run-time checks. In a case study of checking a general real-number linear transformation (for example, a Fourier Transform), we present a simple checker which uses stored randomness, and a self-corrector which is particularly efficient if stored randomness is employed.

Categories and Subject Descriptors: D.2.5 [Software Engineering]: Testing and Debugging; F.2.1 [Analysis of Algorithms]: Numerical Algorithms--Computation of transforms; F.3.1 [Logics and Meanings of Programs]: Specifying and Verifying and Reasoning about Programs

General Terms: Reliability, Algorithms, Verification

Additional Key Words and Phrases: Built-in testing, concurrent error detection, debugging, fault tolerance, Fourier Transform, result-checking, self-correcting

This work was supported in part by a MICRO grant from Hughes Aircraft Company and the State of California, by NSF Grant CCR92-01092, and by NDSEG Fellowship DAAH04-93-G-0267.

A preliminary version of this paper appears as: "Program result-checking: a theory of testing meets a test of theory," Proc. 35th IEEE Symp. Foundations of Computer Science (1994), pp. 382-392.

Name: Hal Wasserman
Affiliation: Computer Science Division, University of California at Berkeley
Address: Berkeley, CA 94720, [email protected], http.cs.berkeley.edu/halw

Name: Manuel Blum
Affiliation: Computer Science Department, City University of Hong Kong
Address: 83 Tat Chee Ave., Kowloon Tong, Kowloon, Hong Kong, [email protected]
Affiliation: Computer Science Division, University of California at Berkeley
Address: Berkeley, CA 94720, [email protected], http.cs.berkeley.edu/blum

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works, requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept, ACM Inc., 1515 Broadway, New York, NY 10036 USA, fax +1 (212) 869-0481, or [email protected].


1. RESULT-CHECKING AND ITS APPLICATIONS

1.1 Assuring Software Reliability

Methodologies for assuring software reliability form an important part of the technology of programming. Yet the problem of efficiently identifying software bugs remains a difficult one, and one to which no perfect solution is likely to be found.

Software is generally debugged via testing suites: one runs a program on a variety of carefully selected inputs, identifying a bug whenever the program fails to perform correctly. This approach leaves two important questions incompletely answered.

First, how do we know whether or not the program's performance is correct? Generally, some sort of an oracle is used here: our program's output may be compared to the output of an older, slower program believed to be more reliable, or the programmers may subject the output to a painstaking (and likely subjective and incomplete) examination by eye.

Second, given that a test suite feeds a program only selected inputs out of the often enormous space of all possibilities, how do we assure that every bug in the code will be evidenced? Indeed, particular combinations of circumstances leading to a failure may well go untested. Furthermore, if the testing suite fails to accurately simulate the input distribution which the program will encounter in its lifetime, a supposedly debugged program may in fact fail quite frequently.

One alternative to testing is formal verification, a methodology in which mathematical claims about the behavior of a program are stated and proved. Thus it is possible to demonstrate once and for all that the program must behave correctly on all possible inputs. Unfortunately, constructing such proofs for even simple programs has proved unexpectedly difficult. Moreover, many programmers would likely find it unpleasant to have to formalize their expectations of program behavior into mathematical theorems.

Another alternative is fault tolerance. According to this methodology, the reliability of critical software may be enhanced by having several groups of programmers create separate versions. At run-time, all of the versions are executed, and their outputs are compared. The gross inefficiency of this approach is evident, in terms of programming manpower as well as either increased run-time or additional parallel-hardware requirements. Moreover, the method fails if common misconceptions among the several development groups result in corresponding errors [Butler and Finelli 1993].

We find more convincing inspiration in the field of communications, where error-detecting and error-correcting codes allow for the identification of arbitrary run-time transmission errors. Such codes are pragmatic and are backed by mathematical guarantees of reliability.

We wish to provide such run-time error-identification in a more general computational context. This motivates us to review the field of result-checking.

1.2 Simple Checkers

It is a matter of theoretical curiosity that, for certain computations, the time required to carry out the computation is asymptotically greater than the time required, given a tentative answer, to determine whether or not the answer is correct.


As an easy example, consider the following task: given as input a composite integer c, output any non-trivial factor d of c. Carrying out this computation is currently believed to be difficult, and yet, given I/O pair ⟨c, d⟩, it takes just one division to determine whether or not d is a correct output on input c.
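The factoring check described above can be sketched in a few lines; the function name and interface below are illustrative assumptions, not from the paper:

```python
def check_factor(c, d):
    """Simple checker for the factoring task: given I/O pair (c, d),
    accept iff d is a non-trivial factor of the composite input c.
    A single divisibility test suffices -- far cheaper than factoring."""
    return 1 < d < c and c % d == 0
```

Note that the checker runs in time polynomial in the length of c, while no comparably fast factoring algorithm is known.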

These ideas have been formalized into the concept of a simple checker [Blum 1988]. Let f be a function with smallest possible computation time T(n), where n is input length (or, if a strong lower bound cannot be determined, we informally set T(n) equal to the smallest known computation time for f). Then a simple checker for f is an algorithm (generally randomized) with I/O specifications as follows:

—Input: I/O pair ⟨x, y⟩.

—Correct output: If y = f(x), accept; otherwise, reject.

—Reliability: For all ⟨x, y⟩: on input ⟨x, y⟩ the checker must return correct output with probability (over internal randomization) ≥ p_c, for p_c a constant close to 1.

—"Little-o rule": The checker is limited to time o(T(n)).

The "little-o rule" is important in that it requires a simple checker to be efficient, and also in that it forces the checker to determine whether or not y = f(x) by means other than simply recomputing f(x); hence the checker must be in some manner different from the program which it checks. We will see in Section 1.5 that this gives rise to certain hopes that checkers are indeed "independent" of the programs they check, and so may more reliably identify program errors.[1]

For an example of simple checking, consider a sorting task: input ~x = (x_1, ..., x_n) is an array of integers; output ~y = (y_1, ..., y_n) should be ~x sorted in increasing order. Completing this task, at least via a general comparison-based sort, is known to require time Ω(n log n). Thus, if we limit our checker to time O(n), this will suffice to satisfy the little-o rule.

Given ⟨~x, ~y⟩, to check correctness we must first verify that the elements of ~y are in increasing order. This may easily be done in time O(n). But we must also check that ~y is a permutation of ~x. It might be convenient here to modify our sorter's I/O specifications, requiring that each element of ~y have attached a pointer to its original position in ~x. Any natural sorting program could easily be altered to maintain such pointers, and once they are available we can readily complete our check in time O(n).

But what if ~y cannot be augmented with such pointers? Similarly, what if ~x and ~y are only available on-line from sequential storage, so that O(1)-time pointer-dereferencing is not possible? Then we may still employ a randomized method due to [Wegman and Carter 1981; Blum 1988], which requires only one pass through each of ~x and ~y.

[1] Variants of the little-o rule have also been considered: for example, requiring checkers to use less space than the programs they check, or to be implementable with smaller-depth circuits. For example, a rather trivial result is that, if the successive configurations of an arbitrary Turing Machine computation are available, then the computation may be checked by a very shallow circuit. Unfortunately, it is only the mechanical operation of the Turing Machine which is being checked here, not the correctness of the Turing Machine's program. Hence we return to the traditional little-o rule, which has generally proved to be the best route to creative, non-trivial checkers.


We randomly select a deterministic hash function h from a suitably defined set of possibilities,[2] and we compare h(x_1) + ... + h(x_n) with h(y_1) + ... + h(y_n). If ~y is indeed a permutation of ~x, the two values must be equal; conversely, it is readily proven that, if ~y is not a permutation of ~x, the two values will differ with probability ≥ 1/2. Thus, if we accept only ⟨~x, ~y⟩ pairs which pass t such trials, then our checker has probability of error ≤ (1/2)^t, which may be made arbitrarily small.
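The sorting check just described can be sketched as follows. The hash family actually required is unusual (see [Blum 1988]); as an illustrative stand-in, this sketch fingerprints each multiset with a random-point evaluation of the polynomial prod(r - x_i) mod a large prime, a standard one-pass technique with comparable error guarantees. All names and parameters here are assumptions for illustration:

```python
import random

def passes_permutation_check(xs, ys, p=(1 << 61) - 1):
    """One randomized trial of a multiset-equality check ("is ys a
    permutation of xs?"), in one pass over each array.

    Pick a uniform-random point r mod the prime p and compare
    prod(r - x_i) with prod(r - y_i) (mod p).  If ys is a permutation
    of xs the two products always agree; otherwise (for integer entries
    smaller than p) they differ with probability >= 1 - n/p over the
    choice of r."""
    if len(xs) != len(ys):
        return False
    r = random.randrange(p)
    fx = fy = 1
    for x, y in zip(xs, ys):
        fx = fx * ((r - x) % p) % p
        fy = fy * ((r - y) % p) % p
    return fx == fy

def check_sorted_output(xs, ys, trials=20):
    """Simple checker for sorting: verify that ys is in increasing order
    and (with high probability) a permutation of xs, in time O(n)."""
    in_order = all(ys[i] <= ys[i + 1] for i in range(len(ys) - 1))
    return in_order and all(passes_permutation_check(xs, ys)
                            for _ in range(trials))
```

Both passes touch each array element once, so the checker respects the O(n) budget of the little-o rule.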

A final note: our definition of simple checking is not new, but has perhaps not been clearly expressed in the checking literature. In particular, simple checking has been defined only implicitly, as the most efficient case of complex checking (Section 1.3). Here our particular interest in very fast checking has led us to bring out this distinction.

1.3 Self-Correctors

Again let f be a function with smallest (known) computation time T(n). Let D be a well-defined probability distribution on inputs to f, and let P be a program such that, if x is chosen D-randomly, P(x) equals f(x) with probability of error limited to a small value p. In the current paper, D will always be the uniform-random distribution; so we are requiring simply that P computes f correctly on most inputs.

That P indeed has this property may be determined (with high probability) by testing P on ≈ 1/p D-random inputs and using some sort of an oracle, or a simple checker, to determine the correctness of each output. We envision this testing stage as being completed prior to run-time. Another possibility is the use of a self-tester [Blum et al. 1993], which can give such assurances efficiently, at run-time, and with little reliance on any outside oracle.

A self-corrector for f [Blum et al. 1993; Lipton 1991] is then an algorithm (generally randomized) with I/O specifications as follows:

—Input: x (an input to f), along with P, a program known to compute f with probability of error on D-random inputs limited to a small value p. The corrector is allowed to call P repeatedly, using it as a subroutine.

—Correct output: f(x).

—Reliability: For all ⟨x, P⟩: on input ⟨x, P⟩ the corrector must return correct output with probability (over internal randomization) ≥ p_c, for p_c a constant close to 1.

—Time-bound: The corrector's time-bound, including subroutine calls to P, must be limited to a constant multiple of P's time-bound; the corrector's time-bound, counting each subroutine call to P as just one step, must be o(T(n)).

As an example [Blum et al. 1993], consider a task of multiplication over a finite field: input is w, x ∈ F; output is product wx ∈ F. We assume addition over F to be quicker and more reliable than multiplication.

[2] A rather unusual set of hash functions is required; moreover, the time required to calculate h, and hence to complete a check, is debatable. Refer to [Blum 1988] for details.

Assume we know from testing that program P computes multiplication correctly for all but a 1/100 fraction of possible ⟨w, x⟩ inputs. (Note that this knowledge is in itself only a very weak assurance of the reliability of P, as that 1/100 fraction of "difficult inputs" might well appear far more than 1/100 of the time in the life of the program.) Then a self-corrector may be specified as follows:

Algorithm 1. On input ⟨w, x⟩,

—Generate uniform-random r_1, r_2 ∈ F.

—Call P four times to calculate P(w − r_1, x − r_2), P(w − r_1, r_2), P(r_1, x − r_2), and P(r_1, r_2).

—Return, as our corrected value for wx,

    y_c := P(w − r_1, x − r_2) + P(w − r_1, r_2) + P(r_1, x − r_2) + P(r_1, r_2).

Why does this work? Note that each of the four calls to P is on a pair of inputs uniform-randomly distributed over F^2, and so will return the correct answer with probability of error at most 1/100. Thus, all four return values are likely to be correct: the probability of error is at most 4/100. And if all the return values are correct, then

    y_c = P(w − r_1, x − r_2) + P(w − r_1, r_2) + P(r_1, x − r_2) + P(r_1, r_2)
        = (w − r_1)(x − r_2) + (w − r_1)r_2 + r_1(x − r_2) + r_1 r_2
        = [(w − r_1) + r_1][(x − r_2) + r_2]
        = wx.

Note that corrected output y_c is superior to unmodified output P(w, x) in that there are no "difficult inputs": for any ⟨w, x⟩, each time we compute a corrected value y_c for wx, the chance of error (over the choice of r_1, r_2) is ≤ 4/100. Moreover, observe that, if we compute several corrected outputs y_c^(1), ..., y_c^(t) (using new random values of r_1, r_2 for each y_c^(i)) and pick the majority answer as our final output, we thereby make the chance of error arbitrarily small. The price we pay for this enhanced reliability is a multiplication of P's usual run-time by a factor of 4t. This performance loss will be further considered below.
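Algorithm 1, combined with the majority vote just described, can be sketched as follows. The concrete field F = Z/10007 and the untrusted-program interface `prog(a, b)` are assumptions for this sketch, not details fixed by the paper:

```python
import random
from collections import Counter

# Assumption for this sketch: F is the field of integers mod a small prime.
P_MOD = 10_007

def correct_product(prog, w, x, trials=5):
    """Self-corrector for multiplication over F (Algorithm 1 plus a
    majority vote).

    `prog(a, b)` is an untrusted program claimed to compute a*b in F
    correctly on most uniform-random input pairs.  Each trial calls prog
    on four uniform-random pairs and combines the answers using only
    (assumed reliable) field additions; the majority answer over the
    independent trials is returned."""
    votes = Counter()
    for _ in range(trials):
        r1 = random.randrange(P_MOD)
        r2 = random.randrange(P_MOD)
        y_c = (prog((w - r1) % P_MOD, (x - r2) % P_MOD)
               + prog((w - r1) % P_MOD, r2)
               + prog(r1, (x - r2) % P_MOD)
               + prog(r1, r2)) % P_MOD
        votes[y_c] += 1
    return votes.most_common(1)[0][0]
```

Even if prog fails on some fixed "difficult" pair, the corrector almost never queries that pair, since all four calls are at uniform-random locations.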

A final note on self-correcting. Above we speak of needing an assurance that P is correct on all but a small fraction of inputs. Such language implies an assumption that P's behavior is fixed (except for possible randomization) prior to the self-correcting process: i.e., that P is not an adversary which can adapt in response to the corrector. This assumption is necessary, as an adaptable adversary could easily fool a corrector such as the above.[3] A checker, in contrast, is required to be secure against even an adaptable adversary.

One may also define a complex checker [Blum 1988], which outputs an accept/reject correctness-check similar to that of a simple checker, but which, like a self-corrector, is allowed to poll P at several locations (and so, like a self-corrector, tends to be less efficient than a simple checker). In general, a complex checker seeks to determine the correctness of P(x) by comparing P(x) to several of P's other outputs. More formally, a complex checker for f is defined as follows:

[3] [Gemmell et al. 1991] however demonstrates that self-testing/correcting may be possible even when a program has limited adversarial power.


—Input: x (an input to f), along with P, a program which may compute f on all, some, or no inputs. The checker is allowed to call P repeatedly, using it as a subroutine.

—Correct output: If P is correct both on x and on all the other outputs sampled by the checker, must accept. If P(x) is incorrect, must reject.

—Reliability: For all ⟨x, P⟩: on input ⟨x, P⟩ the checker must return correct output with probability (over internal randomization) ≥ p_c, for p_c a constant close to 1.

—Time-bound: The checker's time-bound, including subroutine calls to P, must be limited to a constant multiple of P's time-bound; the checker's time-bound, counting each subroutine call to P as just one step, must be o(T(n)) (where T(n) is again the smallest possible time to compute f).

Observe a necessary weakness of this definition: in the case that P(x) is correct but at least one of P's other outputs sampled by the checker is incorrect, the checker may either accept or reject. This weakness is necessary (if we are to allow algorithms which go beyond simple checking) because, in the case that P returns only junk answers, we cannot expect the checker to figure out whether or not P(x) is correct. In spite of this weakness, the complex checker tells us what we need to know: if the checker accepts, we know that P(x) is correct; if the checker rejects, we know that P failed on at least one of a small, known set of inputs.

An example complex checker (for graph isomorphism) is described in Section 1.7. For more information on result-checking, refer to the annotated bibliography.

1.4 Debugging via Checkers

We will argue here that embedding checkers in software products may be a substantial aid in debugging those products and assuring their reliability. It is our hope that result-checking may form the basis for a debugging methodology more rigorous than the testing suite and more pragmatic than verification.

Throughout the process of software testing, embedded checkers may be used to identify incorrect output. They will thereby serve as an efficient alternative to a conventional testing oracle. Furthermore, even after software is put into use, leaving the checkers in the code will allow for the effective detection of lingering bugs. We envision that, each time a checker finds a bug, an informative output will be written to a file; this file will then be periodically reviewed by software-maintenance engineers. Another possibility is an immediate warning message, so that the user of a critical system will not be duped into accepting erroneous output.

While it could be argued that buggy output would be noticed by the user in any case, an automatic system for identifying errors should in fact reduce the probability that bugs will be ignored or forgotten. Moreover, a checker can catch an error in a program component whose effects may not be readily apparent to the user. Thus a checker might well identify a bug in a critical system before it goes on to cause a catastrophic failure. Moreover, checkers may trigger automatic self-correction: potentially a method for the creation of extremely reliable systems.

By writing a checker, one group of programmers may assure themselves that another group's code is not undermining theirs by passing along erroneous output. Similarly, the software engineer who creates the specifications for a program component can write a checker to verify that the programmers have truly understood and provided the functionality he required. Since checkers depend only on the I/O specifications of a computational task, one may check even a program component whose internals are not available for examination, such as a library of utilities running on a remote machine. Thus checkers facilitate the discovery of those misunderstandings at the "seams" of programs which so often underlie a software failure. Evidently, all of this applies well to object-oriented paradigms.

Unlike verification, checking reveals incorrect output originating in any cause, whether due to software bugs, hardware bugs, or transient run-time errors. Moreover, since a checker concerns itself only with the I/O specifications of a computational task, the introduction of a potentially unreliable new algorithm for solving an old task may not require substantial change to the associated checker. And so, as a program is modified throughout its lifetime, often without an adequate repeat of the testing process, its checkers will continue to guard against errors.

1.5 Checker-Based Debugging Concerns

—Performance loss: Simple checkers are inherently efficient: due to the little-o rule, a simple checker should not significantly slow the program P that it checks. Self-correctors, on the other hand, require multiple calls to P, and so multiply execution-time by a small constant (or, alternatively, require additional parallel hardware). In a real-time computer system, such a slowdown may not be acceptable.

Hence it is desirable to check a given program by implementing both a simple checker and a self-corrector. The simple checker may then be run on each output, while only in the (hopefully rare) case that the output is incorrect need the self-corrector be run as well; hence the self-corrector is kept off the common-case path. Alternatively, in Section 1.6 we will consider a change to our definitions which will allow for certain self-correcting algorithms to run at real-time speeds.

—Real-number issues (see [Gemmell et al. 1991; Ar et al. 1993]): Traditional result-checking algorithms, such as the example self-corrector from Section 1.3, often rely on the orderly properties of finite fields. In many programming tasks, we are more likely to encounter real numbers, i.e., approximate reals represented by a fixed number of bits.

Such numbers will be limited to a legal subdomain of ℝ. In Section 2.4, we will see how to modify a traditional self-correcting methodology to account for this. Moreover, limited precision will likely result in round-off errors within P. We must then determine how much arithmetic error may allowably occur within P, and must check correctness only up to the limit of this error-delta. In Section 2, we will see that limited-precision calculations, while a substantial challenge, are by no means fatal to our checking project.

—Testing checkers: A checker, like any software component, must be carefully debugged. Otherwise, a degenerate checker C may indiscriminately report program P to be correct, thereby giving a false sense of assurance.

The problem of constructing a suitable testing suite for a checker is non-trivial. Evidently testing a checker on correct program output is insufficient: we must also test the checker's ability to recognize bugs, and not just flagrant, random bugs, but various and subtle ones. One possibility would be to employ a mutation model of software error: i.e., to test C on the output from mutated variants of P.


—Buggy checkers: What if we nevertheless fail to debug our checker? Indeed, it may be objected that result-checking begs the question of software reliability: for checking may fail due to the checker's code being as buggy as the program's. To this objection we respond with three arguments: arguments which are, however, heuristic rather than rigorous.

First, checkers can often be simpler than the programs they check, and so, presumably, less likely to have bugs. For example, modern computers may use intricate, arithmetically unstable algorithms to compute matrix multiplication in time, say, O(n^2.38), rather than the standard O(n^3). And yet Freivalds's O(n^2)-time checker for matrix multiplication [Freivalds 1979] uses only the simplest of multiplication algorithms, and so is seemingly likely to be bug-free.
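Freivalds's check can be sketched as follows; representing matrices as Python lists-of-lists and drawing the random vector from {0, 1}^n are standard choices for exposition, not details from the paper:

```python
import random

def freivalds_check(A, B, C, trials=20):
    """Freivalds's O(n^2)-per-trial randomized check that C = A*B for
    n x n matrices [Freivalds 1979].  Each trial picks a random 0/1
    vector r and compares A*(B*r) with C*r using only simple
    matrix-vector products; if C != A*B, a single trial catches the
    error with probability >= 1/2."""
    n = len(A)

    def matvec(M, v):
        # Plain inner-product matrix-vector multiply: simple and easy to trust.
        return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        if matvec(A, matvec(B, r)) != matvec(C, r):
            return False  # C is certainly not A*B
    return True  # C = A*B except with probability <= (1/2)**trials
```

The checker never multiplies two matrices together, so it stays within the o(n^2.38) budget while relying only on the most elementary arithmetic.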

Second, observe that one of the following four conditions must hold each time a (possibly buggy) simple checker C checks the output of a program P:

(I). P(x) is correct; C correctly accepts it.
(II). P(x) is incorrect; C correctly rejects it.
(III). "False alarm": P(x) is correct; C incorrectly rejects it.
(IV). "Missed bug": P(x) is incorrect; C incorrectly accepts it.

Only a "missed bug" is a truly bad outcome: for a "false alarm," while annoying, at least draws our attention to a bug in C.

To reduce the likelihood of missed bugs, it suffices to rule out a strong correlation between x values at which P fails and x values at which C(x, P(x)) fails. It is our heuristic contention that such a correlation is unlikely. Indeed, recall that one effect of the "little-o rule" is that C doesn't have sufficient time to merely reduplicate the computation performed by P. Thus we claim (heuristically) that C must be doing something essentially different from what P does, and so, if buggy, may reasonably be expected to make different errors. But then we would expect few correlated errors; moreover, we would expect more uncorrelated than correlated errors, so that a given bug could well be identified and fixed before it goes on to generate any correlated errors.[4]

Finally, say that correlated errors nonetheless occur. This might in particular be expected due to faults in hardware or software components on which both P and C depend. We argue that, even in this case, a "missed bug" may not result. Indeed, consider our checker for a sorting program (Section 1.2), which determines whether or not ~y is a permutation of ~x by comparing h(x_1) + ... + h(x_n) with h(y_1) + ... + h(y_n): assuming that the sorting program has failed, h(x_1) + ... + h(x_n) and h(y_1) + ... + h(y_n) will not match (with probability ≥ 1/2). If the checker simultaneously fails, it may miscalculate h(x_1) + ... + h(x_n) and/or h(y_1) + ... + h(y_n); but, we argue, it seemingly is not likely to fail in exactly such a manner as to get values for the two sums which happen to match each other.

[4] It could be argued that the little-o rule only prohibits C from duplicating the most time-consuming of P's components; P's simpler components may be trivially duplicated in C, and so, if buggy, might produce correlated errors. To this argument we respond that C's primary function is to check the central algorithm of P; smaller components needed as subroutines by both P and C should, if necessary, be checked separately.

As an illustration of a related issue, consider a standard max-flow program, which proceeds by successively adjusting flows until it finds a configuration which achieves optimal flow across a particular cut. A checker for this program need do little more than verify that optimal flow is indeed achieved across the cut, which is what the program itself does in its final stage. Hence checking this sort of program seems trivial and ineffective. Our response is simply that such a program indeed requires no separate checker, because it is naturally self-checking.

Many checkers proceed in this way, calculating two quantities and declaring an error if they do not match. Degenerate checker behavior seems unlikely to yield matching quantities; and so we argue that checkers, even when buggy themselves, have a natural resistance to the bad case of missing a program bug.

1.6 Stored Randomness

We here introduce an extension to traditional checker definitions: we suggest that randomized checkers, rather than having to generate fresh random bits for each check, should be allowed to use preprocessed stored randomness. That is, prior to run-time, while the program is idle, or during boot-up, or even during software development, we generate random bits and do some (perhaps lengthy) preprocessing; these random bits, together with preprocessed values dependent on them, form one or more random packages. These packages are stored; then, at run-time, each check requires such a package as an additional input.[5]

Stored randomness has several potential advantages. First, it allows for much or all of a checker's randomness to be generated before run-time. This could be useful particularly in doing very quick checks of low-level functionalities such as microprocessor arithmetic (as in [Blum and Wasserman 1996]): in this context, having to generate random bits at run-time could be an inconvenience. Second, we shall see that stored randomness allows for the use of checking algorithms which would otherwise require too much computation at run-time: by paying a time-cost of Ω(T(N)) in preprocessing, we "cheat" some work out of the way, so that each run-time check can then be completed in time o(T(N)).

For a trivial example of stored randomness, we can reconsider the multiplication self-corrector of Section 1.3. If in a preprocessing stage r_1, r_2 are generated, P(r_1, r_2) is calculated, and package R = ⟨r_1, r_2, P(r_1, r_2)⟩ is stored, this leaves only three calls to P at run-time, rather than four. However, this improvement is not particularly impressive. Stored randomness will be further explicated through more substantial examples: in Sections 2.1 and 2.5, we will present a simple checker which is possible only if one allows stored randomness, and a self-corrector which, if one allows stored randomness, can run at real-time speeds.
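The trivial stored-randomness variant above can be sketched as a small change to the Section 1.3 self-corrector. As before, the field F = Z/10007 and the function names are assumptions for illustration; the sketch also assumes, as the text envisions, that preprocessing happens at a time when P can be trusted or verified:

```python
import random

# Assumption for this sketch: F is the field of integers mod a small prime.
P_MOD = 10_007

def make_package(prog):
    """Preprocessing (before run-time): generate the random pair and
    precompute prog on it, forming package R = <r1, r2, prog(r1, r2)>."""
    r1 = random.randrange(P_MOD)
    r2 = random.randrange(P_MOD)
    return (r1, r2, prog(r1, r2))

def correct_product_stored(prog, w, x, package):
    """Run-time self-correction of wx over F using a stored random
    package: only three calls to prog remain at run-time, since the
    fourth value P(r1, r2) was precomputed."""
    r1, r2, p_r1_r2 = package
    return (prog((w - r1) % P_MOD, (x - r2) % P_MOD)
            + prog((w - r1) % P_MOD, r2)
            + prog(r1, (x - r2) % P_MOD)
            + p_r1_r2) % P_MOD
```

The same package may be reused across a series of run-time checks, which is exactly the "stale randomness" trade-off that Sections 2.1 and 2.5 analyze.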

How much randomness must be stored? Let N be our fixed input length, and let ℓ be the number of checks we expect to carry out at run-time. One approach would be to preprocess ℓ packages. Then each check would have its own "fresh" randomness, which would certainly be good from the point of view of reliability. However, we might expect ℓ to be ≫ N or even unknown during preprocessing; so this trivial approach could have too high a cost in preprocessing time and storage space.

An opposite approach would be to generate a single random package and use it for every check. But would the repeated use of a "stale" random package reduce the reliability of our checker? Sections 2.1 and 2.5 initially choose this approach; and Lemmas 4 and 10 argue that the consequent reduction in reliability is not fatal. Moreover, in Sections 2.1 and 2.5 we go on to suggest an intermediate approach, which requires a number of stored packages that is O(N) and independent of ℓ. Lemmas 5 and 11 argue that the resulting stored-random checkers are then highly reliable.

[5] This method requires input length N to be fixed prior to run-time; equivalently, new preprocessing stages will be required as input length increases. This use of stored information is analogous to that in the P/poly model of computation.

1.7 Variations on the Checking Paradigm

It may still be objected that most programming tasks are less amenable to checking than are the clean mathematical problems with which theoreticians usually deal. Nevertheless, it is our experience that many such tasks may be subjected to interesting checks. The following is a list of checking ideas which may suggest to the reader how definitions can be generalized to make checking more suitable as a tool for software debugging.

—Partial checks: One may find it sufficient to check only certain aspects of a computational output. Programmers may focus on properties they think particularly likely to reveal bugs, or particularly critical to the success of the system. Certain checkers might only be capable of identifying outputs incorrect by a very large error-delta. Even just checking inputs and outputs separately to verify that they fall in a legal domain may prove useful.

- Timing checks: One might wish to monitor the execution-time of a software component; an unexpectedly large (or small) time could then reveal a bug. For instance, a programmer's expectation that a particular subroutine should take time ≤ 10n² on input of variable length n could be checked against actual performance.
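A timing check of this sort can be sketched as a small wrapper. This is only an illustration: the wall-clock budget here is a hypothetical stand-in for the programmer's "≤ 10n² steps" estimate, and `subroutine` is a toy component.

```python
import time

def check_timing(f, x, budget_seconds):
    # Run f on x; flag a potential bug if execution exceeds the stated budget.
    start = time.perf_counter()
    result = f(x)
    elapsed = time.perf_counter() - start
    return result, elapsed <= budget_seconds

def subroutine(xs):
    # Toy stand-in for the monitored software component.
    return sorted(xs)

n = 1000
result, within_budget = check_timing(subroutine, list(range(n, 0, -1)), 1.0)
```

In a deployed system one would log or trigger a fallback rather than merely return the flag.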

- Checking via interactive proofs: There exist interactive proofs of correctness for certain mathematical computations; and many such interactive proofs may equivalently be regarded as complex checkers. For example, it follows from the IP method of [Shamir 1992] that, if program P claims to solve a PSPACE-complete problem, then there is a checker for P which requires polynomial time plus a polynomial number of calls to P.

Similarly, the following interactive-proof method, due to [Goldreich et al. 1991], may equivalently be regarded as a complex checker. Say that P claims to decide graph isomorphism. If P says that A ≅ B, we can check this by running P on successively reduced versions of A and B, thereby forcing P to reveal a particular isomorphism. Less trivially, if P claims that A ≇ B, we repeat several times the following check: generate C, a random isomorphism either of A (with probability 1/2) or of B (with probability 1/2); then ask P to tell us which one of A or B is isomorphic to C. If A ≅ B, P can only guess which of the two we permuted.

For more on interactive proof, refer to the bibliography.
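The non-isomorphism check above can be simulated in a few lines. This is a minimal sketch: the adjacency-set graph representation, the number of rounds, and the toy program P (which distinguishes the two example graphs by edge count) are all illustrative choices, not part of the protocol itself.

```python
import random

def permute(adj, perm):
    # Relabel a graph (dict of node -> neighbor set) by the bijection perm.
    return {perm[u]: {perm[v] for v in nbrs} for u, nbrs in adj.items()}

def random_isomorph(adj):
    nodes = list(adj)
    perm = dict(zip(nodes, random.sample(nodes, len(nodes))))
    return permute(adj, perm)

def check_non_isomorphism(A, B, P_which, trials=20):
    # P has claimed A and B are non-isomorphic.  Each round, hand P a random
    # isomorphic copy of one of them; a sound P can say which graph it came
    # from, while a guessing P survives each round with probability 1/2.
    for _ in range(trials):
        source = random.choice(["A", "B"])
        C = random_isomorph(A if source == "A" else B)
        if P_which(C) != source:
            return False    # caught: the non-isomorphism claim is rejected
    return True

A = {0: {1}, 1: {0, 2}, 2: {1}}          # path on 3 vertices
B = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}    # triangle

def P_which(C):
    # Toy P: the two example graphs differ in edge count.
    edges = sum(len(nbrs) for nbrs in C.values()) // 2
    return "A" if edges == 2 else "B"

print(check_non_isomorphism(A, B, P_which))   # True: the claim survives
```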

- Changing I/O specifications: In Section 1.2, we saw how an easy augmentation to the output of a sorting program could make the program easier to check. Similarly, consider the problem of checking a (log n)-time binary-search task. As traditionally stated (see [Bentley 1986, p. 35], where it features in a relevant discussion of the difficulty of writing correct programs), binary search has as input a key k and a numerical array a[1], ..., a[n], where a[1] ≤ ... ≤ a[n], and as output i such that a[i] = k, or 0 if k is not in the array.

Note that the problem as stated is uncheckable: for, if the output is 0, it will take Ω(log n) steps to confirm that k is indeed not in the array. But say that we modify the output specification for our binary-search task to read: output i such that a[i] = k, or, if k is not in the array, ⟨j, j + 1⟩ such that a[j] < k < a[j + 1]. Any natural binary-search program can easily be modified to give such output, and completing an O(1)-time check is then straightforward.
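The augmented specification and its O(1)-time check can be sketched as follows (0-based indexing, with j = -1 and j + 1 = n standing in for the boundary cases; the function names are illustrative):

```python
import bisect

def bsearch_with_witness(a, k):
    # Binary search over sorted list a, with the augmented output spec:
    # ('found', i) with a[i] == k, or ('absent', (j, j+1)) meaning
    # a[j] < k < a[j+1] (j == -1 or j+1 == len(a) at the boundaries).
    i = bisect.bisect_left(a, k)
    if i < len(a) and a[i] == k:
        return ('found', i)
    return ('absent', (i - 1, i))

def check(a, k, answer):
    # O(1)-time check of the augmented output: a constant number of
    # array accesses and comparisons.
    tag, val = answer
    if tag == 'found':
        return 0 <= val < len(a) and a[val] == k
    j, j1 = val
    below_ok = j == -1 or a[j] < k
    above_ok = j1 == len(a) or a[j1] > k
    return j1 == j + 1 and below_ok and above_ok
```

Note that the check also rejects a malformed witness, e.g. a 'found' answer pointing at the wrong cell.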

- Weak or occasional checks [Blum, A. 1994]: Certain pseudo-checkers have only a small probability of noticing a bug. For example, if a search program, given key k and array a[1], ..., a[n], claims that k does not occur in a, we could check this by selecting an element of a at random and verifying that it is not equal to k. This weak checker has probability 1/n of identifying an incorrect claim.

Alternatively, to save time, a lengthy check might be employed only on occasional I/O pairs. Given that many bugs cause frequent errors over the lifetime of a program, such weak or occasional checks may well be useful.
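The one-random-probe check can be sketched directly; the point of the toy run below is that a program whose bug causes frequent wrong claims is caught with high probability over many runs, even though each individual check succeeds only with probability 1/n.

```python
import random

def weak_absence_check(a, k, claim_absent):
    # Audit a claim that k does not occur in a by probing one random cell.
    # Exposes a false claim with probability 1/n per invocation.
    if not claim_absent:
        return True    # this weak check only audits 'absent' claims
    return random.choice(a) != k

# A searcher that (wrongly) always answers 'absent' on this array is
# caught quickly over repeated runs.
a = [7] * 10
caught = any(not weak_absence_check(a, 7, claim_absent=True)
             for _ in range(50))
```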

- Batch checks [Rubinfeld 1992]: For certain computational tasks, if one stores up a number of I/O pairs, one may check them all at once more quickly than one could have checked them separately. While it may be too late to correct output after completing such a retroactive check, this method can, when applicable, provide a very time-efficient identification of bugs.

- Inefficient auto-correcting: When a checker identifies a program error, it could correct by, for example, loading and running an older, slower, but hopefully more reliable version of the program. This should be necessary only on rare occasions, so the overall loss of speed should not be unreasonable. For example, a trivial O(n³)-time matrix-multiplication program can be kept in reserve in case a more sophisticated O(n^2.38)-time program fails.
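For matrix multiplication this pattern can be sketched with Freivalds' classical randomized product check as the trigger (the check itself is not from this paper; the always-wrong "fast" routine below merely simulates a buggy optimized multiplier):

```python
import random

def freivalds_check(A, B, C, trials=16):
    # Freivalds' randomized test of C == A*B: O(n^2) work per trial,
    # and a wrong product escapes each trial with probability <= 1/2.
    n = len(A)
    for _ in range(trials):
        r = [random.choice((0, 1)) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False
    return True

def naive_mul(A, B):
    # The slow but trusted O(n^3) reserve program.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def multiply_autocorrected(fast_mul, A, B):
    # Trust the fast routine; fall back to the reserve when the check fails.
    C = fast_mul(A, B)
    return C if freivalds_check(A, B, C) else naive_mul(A, B)

broken_fast = lambda A, B: [[0] * len(A) for _ in A]  # simulated buggy routine
result = multiply_autocorrected(broken_fast, [[1, 2], [3, 4]], [[1, 2], [3, 4]])
```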

2. CASE STUDY: CHECKING REAL-NUMBER LINEAR TRANSFORMATIONS

We will now consider the problem of checking and correcting a general linear transformation, defined as follows: input to program P is n-length vector ~x of real numbers; correct output is n-length vector ~y = ~xA, where A is a fixed n × n matrix of real coefficients. Extensions of our results to more general cases (e.g., ~y of different length than ~x, or components complex rather than real) are straightforward. However, we do require n and A to be fixed prior to run-time.

Evidently, any such transformation may be computed in O(n²) arithmetic operations. We assume that, for non-trivial transformations A, the smallest possible time to compute the transformation may range from Θ(n) to Θ(n²). Our simple checker will take time O(n) (as will our self-corrector, not counting one or more calls to P), so the little-o rule is satisfied for all but the Θ(n)-time transformations.⁶ For example, the Fourier Transform is linear and may be computed in time Θ(n log n) via the Fast Fourier Transform; if one makes the reasonable assumption that an O(n)-time Fourier Transform is not possible, then our algorithms qualify as checkers for this transform.

Assume now that program P is intended to carry out the ~x ↦ ~xA transformation on a computer with a limited-accuracy, fixed-point representation of real numbers. P's I/O specifications might then be formulated as follows:

[Footnote 6: In the more general case that ~y has length m, any non-trivial linear transformation will have smallest possible computation-time on range Θ(max{m, n}) to Θ(mn). It is easily verified that our simple checker and our self-corrector each take time O(max{m, n}); and so, once again, the little-o rule is satisfied for all but the fastest of transformations.]

- Input: ~x = (x_1, ..., x_n), where each x_i is a fixed-point real number and so is limited to a legal subdomain of ℝ: say, to domain [-1, 1].

- Output: P(~x) = ~y = (y_1, ..., y_n), where each y_i is a fixed-point real.

- Let ε ∈ ℝ⁺ be a constant; let ~δ = (δ_1, ..., δ_n) be the "error vector" ~y − ~xA. Then:
  - I/O pair ⟨~x, ~y⟩ is definitely correct iff, for all i, |δ_i| ≤ ε.
  - I/O pair ⟨~x, ~y⟩ is definitely incorrect iff there exists i such that |δ_i| ≥ 6ε√n.

Note that, due to arithmetic round-off errors within P, a small error-delta is to be regarded as acceptable: specifically, we model P's expected error as a flat error of at most ε in each component of ~y. Also note that we will only require the simple checker of Section 2.1 to accept I/O which is definitely correct and to reject I/O which is definitely incorrect. It is seemingly unavoidable that there is an intermediate region of I/O pairs ⟨~x, ~y⟩ which could allowably be either accepted or rejected. However, this intermediate region currently appears to be altogether too large (i.e., condition "exists i such that |δ_i| ≥ 6ε√n" is too strong). It seems that we should be able to do better; and, in Section 2.2, we shall suggest an improvement.

2.1 A Simple Checker

Motivation: To develop a simple checker for P, we will take the approach of generating a randomized vector ~r and trying to determine whether or not ~y ≈ ~xA by calculating whether or not ~y · ~r ≈ (~xA) · ~r. This method is a variant of that in [Ar et al. 1993, Section 4.1.1].

To facilitate the calculation of (~xA) · ~r, we will employ the method of stored randomness (Section 1.6): by generating ~r and preprocessing it together with A prior to run-time, we will then be able to complete the calculation of (~xA) · ~r with just O(n) run-time arithmetic operations. Thus we will achieve the strong result of checking a Θ(n²)-time computation in time O(n).

Algorithm 2. Our checker's preprocessing stage, to be completed prior to run-time, is specified as follows:

- For k := 1 to 10:
  - Generate and store ~r^(k), an n-length vector of form (±1, ±1, ..., ±1), where each ±1 is chosen positive or negative with independent 50/50 probability.
  - Calculate and store ~v^(k) := ~r^(k) Aᵀ (where Aᵀ denotes the transpose of A).⁷

Then, to check an I/O pair ⟨~x, ~y⟩ at run-time, we employ this O(n)-time check:

- For k := 1 to 10:
  - Calculate D^(k) := ~y · ~r^(k) − ~x · ~v^(k).
  - If |D^(k)| ≥ 6ε√n, reject.
- If all 10 tests are passed, accept.

[Footnote 7: Since this calculation need be done only a few times, and when time is not at a premium, its correctness can be assured, e.g., by employing simple brute-force methods, or by checking the result by hand.]
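For concreteness, Algorithm 2 can be sketched in plain Python (the matrix A, the value of ε, and the list-based arithmetic are illustrative; a production checker would use the platform's fixed-point arithmetic):

```python
import random

def preprocess(A, trials=10):
    # Preprocessing stage: for each k, store a random +/-1 vector r^(k) and
    # v^(k) = r^(k) A^T, so that at run-time x . v^(k) equals (x A) . r^(k).
    n = len(A)
    package = []
    for _ in range(trials):
        r = [random.choice((-1.0, 1.0)) for _ in range(n)]
        v = [sum(A[i][j] * r[j] for j in range(n)) for i in range(n)]
        package.append((r, v))
    return package

def check(x, y, package, eps):
    # Run-time O(n) check: reject iff some |D| = |y.r - x.v| >= 6*eps*sqrt(n).
    n = len(x)
    threshold = 6.0 * eps * n ** 0.5
    for r, v in package:
        D = sum(yi * ri for yi, ri in zip(y, r)) - \
            sum(xi * vi for xi, vi in zip(x, v))
        if abs(D) >= threshold:
            return False
    return True

A = [[1.0, 2.0], [3.0, 4.0]]
pkg = preprocess(A)
x = [0.5, -0.25]
good_y = [-0.25, 0.0]   # exactly x A
bad_y = [2.0, 0.0]      # grossly wrong output
```

Each run-time check costs two length-n dot products per stored pair, independent of the cost of computing ~xA itself.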


Why does this work? First note that

  D^(k) = ~y · ~r^(k) − ~x · ~v^(k)
        = ~y · ~r^(k) − ~x · (~r^(k) Aᵀ)
        = ~y · ~r^(k) − (~xA) · ~r^(k)
        = (~y − ~xA) · ~r^(k)
        = ~δ · ~r^(k)
        = ±δ_1 ± ... ± δ_n,

where each ± is positive or negative with independent 50/50 probability.⁸

Lemma 1. For any definitely correct ⟨~x, ~y⟩,

  Pr over ~r^(k) [ |D^(k)| ≥ 6ε√n ] ≤ 1/10,000,000.

Proof. Each |δ_i| ≤ ε. Thus, by a Chernoff Bound [Alon et al. 1992, Theorem A.16],

  Pr [ |±δ_1 ± ... ± δ_n| ≥ 6ε√n ] ≤ 2e^(−(6ε√n)²/2nε²) = 2e^(−18) < 1/10,000,000.

Lemma 2. For any definitely incorrect ⟨~x, ~y⟩,

  Pr over ~r^(k) [ |D^(k)| < 6ε√n ] ≤ 1/2.

Proof. We know there exists i such that |δ_i| ≥ 6ε√n. Fix all components of ~r^(k) except for r_i^(k). Then observe that, for at least one of the two possible values ±1 of r_i^(k), |D^(k)| = |±δ_1 ± ... + r_i^(k) δ_i + ... ± δ_n| must be ≥ 6ε√n. Thus |D^(k)| < 6ε√n for at most 1/2 of the equally likely choices of random vector ~r^(k).

Lemma 3. (I). For any definitely correct ⟨~x, ~y⟩,

  Pr over ~r^(1), ..., ~r^(10) [checker mistakenly rejects] ≤ 1/1,000,000.

(II). For any definitely incorrect ⟨~x, ~y⟩,

  Pr over ~r^(1), ..., ~r^(10) [checker mistakenly accepts] ≤ 2^(−10) < 1/1,000.

Proof. (I). By Lemma 1, the probability of a mistaken reject for each ~r^(k) is ≤ 1/10,000,000. Thus, the probability of a mistaken reject on any of the 10 tries is ≤ 10 · 1/10,000,000 = 1/1,000,000.

(II). By Lemma 2, the probability of failing to reject at each ~r^(k) is ≤ 1/2. Thus, the probability of rejecting on none of the 10 independent tries is ≤ 2^(−10).

[Footnote 8: We have employed a linear-algebra identity: the dot-product of two vectors is equivalently their matrix-product once the second vector (i.e., row-matrix) has been transposed; thus ~x · (~r^(k) Aᵀ) = ~x (~r^(k) Aᵀ)ᵀ = ~x A (~r^(k))ᵀ = (~xA) · ~r^(k).]

While Lemma 3 embodies our intuition of the checker's correctness, it does not deal with the implications of using the same stored "random package" R = ⟨~r^(1), ..., ~r^(10), ~v^(1), ..., ~v^(10)⟩ in each run-time check. The following claims suggest that this use of stored randomness does not seriously undermine the reliability of the checker:

Lemma 4. Assume that we preprocess random package R and use this package to check ℓ runs of P. Then, for probabilities over the choice of R (where all results except (II) are independent of the value of ℓ):

(I). Assume that a bug causes at least one definitely incorrect I/O pair. Then the probability that our checker will miss this bug is ≤ 1/1,000.

(II). If ℓ ≤ 10,000, then the probability that any definitely correct I/O pair will be mistakenly rejected is ≤ 1/100.

(III). The probability that ≥ 1/2 of all definitely incorrect I/O pairs will be mistakenly accepted is ≤ 1/500.

(IV). The probability that ≥ 1/10,000 of all definitely correct I/O pairs will be mistakenly rejected is ≤ 1/100.

Proof. (I) and (II) follow readily from Lemma 3. To prove (III), note a consequence of Lemma 3 (II): if we select R randomly and ⟨~x, ~y⟩ uniform-randomly from the list of all definitely incorrect I/O pairs out of our ℓ runs, then the probability that our checker accepts given input ⟨~x, ~y⟩ and randomization R is ≤ 1/1,000. But if (III) were false, then this same probability would, on the contrary, be > (1/500) · (1/2) = 1/1,000. (IV) is proved analogously.

While Lemma 4 gives a reasonably strong assurance, it is not entirely satisfactory. Indeed, when stored randomness is used as described above, there is no run-time randomization, so that if, for example, the checker fails on an input ⟨~x, ~y⟩ which occurs repeatedly in the ℓ runs, then it will fail on each repetition. This has several potentially bad consequences. First, if P is regarded as an adversary, it can defeat the checker by learning its "bad inputs." Second, observe that if fresh randomness could be used for each check, then each check would have independent probability of error ≤ 1/1,000; hence the probability of many checker errors during ℓ runs of P would be exponentially small. With stored randomness, the probability of many errors is small (per Lemma 4 (III, IV)) but not exponentially small. Note that frequent errors could result in system failure, if checker errors trigger lengthy self-correcting procedures and if we are dealing with a soft real-time system in which average run-time must be kept low.

Hence we now introduce a more sophisticated approach to stored randomness, one which allows stored-random checkers to be as reliable as checkers using fresh random bits.

Recall that R = ⟨~r^(1), ..., ~r^(10), ~v^(1), ..., ~v^(10)⟩ is the preprocessed "random package" needed by our checker, and that ⟨~x, ~y⟩ is an I/O pair to be checked. Also recall that we are assuming input length to be fixed before run-time: specifically, let us assume that the value of ⟨~x, ~y⟩ can be completely represented by N bits, so that there are at most 2^N possible values for ⟨~x, ~y⟩.

Algorithm 3. At preprocessing time, we create, not just a single random package R, but 4N random packages R*_1, ..., R*_4N. Then, at run-time, we pick a random R*_u ∈ {R*_1, ..., R*_4N}, and use R*_u to check ⟨~x, ~y⟩. For a smaller probability of error, we can repeat this check with several more R*_u ∈ {R*_1, ..., R*_4N} and return the majority answer.
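Algorithm 3 is independent of the underlying checker, so it can be sketched schematically; here `make_package` and `one_check` are placeholders for whatever preprocessing and per-package check the base checker provides (for Algorithm 2, a package would bundle the ~r and ~v vectors), and the toy instantiation at the bottom is purely illustrative.

```python
import random

def preprocess_packages(make_package, N, copies=4):
    # Algorithm 3 preprocessing: draw 4N independent random packages.
    return [make_package() for _ in range(copies * N)]

def stored_random_check(x, y, packages, one_check, t=5):
    # Run-time: check <x, y> with t randomly drawn packages and return the
    # majority verdict; only O(t log N) fresh random bits are consumed.
    votes = sum(one_check(x, y, random.choice(packages)) for _ in range(t))
    return votes > t // 2

# Toy instantiation: packages are opaque random seeds, and the per-package
# check simply compares x and y.
packages = preprocess_packages(lambda: random.getrandbits(32), N=8)
equal = lambda x, y, pkg: x == y
```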

The following lemma argues that our improved stored-random checker has an

independent probability of error at each run-time check which can b e made arbi-

trarily small. Also observe that P cannot fo ol the checker even if P is an adversary

 

whichknows the values of R ;:::;R .

1

4N

5N  

Lemma 5. With probability of failure  2 , R ;:::;R wil l be such that, for

1

4N

1

 

all h~x; ~y i,atmost of R ;:::;R arebad for checking h~x; ~y i. (Hence, checking

1

4N

4

  

with t values of R 2 fR ;:::;R g and taking the majority answer results in a

u

1

i

4N

run-time error-probability which is exponential ly smal l in t.)

Proof. Analogous to the standard proof that BPP ⊆ P/poly:

Recall from Lemma 3 that, for any given ⟨~x, ~y⟩, the probability (over the random choice of a package R) that the checker fails given input ⟨~x, ~y⟩ and randomization R is ≤ 2^(−10).

Hence, for any given ⟨~x, ~y⟩, the probability (over the random choice of packages R_1, ..., R_N) that all of R_1, ..., R_N are bad for checking ⟨~x, ~y⟩ is ≤ 2^(−10N).

Hence the probability (over the random choice of R*_1, ..., R*_4N) that there exists any ⟨~x, ~y⟩ and any size-N subset R_1, ..., R_N of R*_1, ..., R*_4N such that all of R_1, ..., R_N are bad for checking ⟨~x, ~y⟩ is

  ≤ 2^N · (4N choose N) · 2^(−10N) ≤ 2^N · 2^(4N) · 2^(−10N) = 2^(−5N).

This is in fact a general method allowing any simple checker to make use of stored randomness without compromising its reliability. Only O(log N) bits of run-time randomness are then needed for each check.

2.2 A Stronger Simple Checker

Recall that ~δ = (δ_1, ..., δ_n) is the "error vector" ~y − ~xA. Also recall that, while the full value of vector ~δ is not calculable in time O(n), we can use stored, preprocessed ~r^(k) and ~v^(k) to efficiently calculate the randomized scalar value ~r^(k) · ~δ.

The assurance that our simple checker will reject instances for which any |δ_i| ≥ 6ε√n is not entirely satisfactory: it leaves too large a gap between instances which are reliably accepted and those which are reliably rejected. A stronger method would be to approximate

  |~δ| = √(δ_1² + ... + δ_n²)

and accept or reject by comparing this value to a suitable threshold. This method is particularly appropriate in that engineers often use this root-mean-square measure to represent overall error in calculations such as Fourier Transforms.

What we need then is an efficient method for approximating |~δ|. The lemmas below prove that it is indeed possible to efficiently approximate |~δ|².

Lemma 6. Let ~r^(1), ..., ~r^(m) be n-length vectors, each component ±1 with independent 50/50 probability, and consider the randomized quantity

  X = (1/m) Σ_{k=1..m} [ ~r^(k) · ~δ ]².

For any positive α and p: if m ≥ 2/(α²p), then, with probability greater than 1 − p,

  (1 − α)|~δ|² ≤ X ≤ (1 + α)|~δ|².

Proof. Observe that

  [~r^(k) · ~δ]² = [ Σ_{i=1..n} r_i^(k) δ_i ]²
               = Σ_{i=1..n} δ_i² + Σ_{i=1..n} Σ_{j≠i} r_i^(k) r_j^(k) δ_i δ_j
               = |~δ|² + 2 Σ_{i=1..n−1} Σ_{j=i+1..n} r_i^(k) r_j^(k) δ_i δ_j.

Hence

  X = |~δ|² + (2/m) Σ_{k=1..m} Σ_{i=1..n−1} Σ_{j=i+1..n} r_i^(k) r_j^(k) δ_i δ_j
    = |~δ|² + (2/m) R,

where R := Σ_{k=1..m} Σ_{i=1..n−1} Σ_{j=i+1..n} r_i^(k) r_j^(k) δ_i δ_j is the randomized "error term" (times 2/m) whose size we wish to bound.

Note that R is a sum of mn(n−1)/2 pairwise independent quantities. Thus:

  Var[R] = Σ_{k=1..m} Σ_{i=1..n−1} Σ_{j=i+1..n} Var[ r_i^(k) r_j^(k) δ_i δ_j ]
         = Σ_{k=1..m} Σ_{i=1..n−1} Σ_{j=i+1..n} δ_i² δ_j²
         ≤ Σ_{k=1..m} (1/2) Σ_{i=1..n} Σ_{j=1..n} δ_i² δ_j²
         = Σ_{k=1..m} (1/2) [ |~δ|² · |~δ|² ]
         = m|~δ|⁴/2.

Thus the variance of "error term" (2/m)R is ≤ (2/m)² · m|~δ|⁴/2 = 2|~δ|⁴/m. Since we are assuming m ≥ 2/(α²p), this is ≤ pα²|~δ|⁴.

But then the probability that |(2/m)R| > α|~δ|² must be less than p. This follows by Chebyshev's Inequality: i.e., if the contrary were true, then we would have Var[(2/m)R] > pα²|~δ|⁴, which contradicts what we have proved above.


Lemma 7. Recall once again that, using preprocessed ⟨~r^(k), ~v^(k)⟩, we can calculate ~r^(k) · ~δ in time O(n). Then: for any positive α and p, we can, in time O(n (1/α)² log(1/p)), calculate a value Y such that, with probability greater than 1 − p,

  (1 − α)|~δ|² ≤ Y ≤ (1 + α)|~δ|².

Proof. Let t := ⌈6 ln(1/p)⌉ and m := ⌈8/α²⌉. We repeat 2t times the calculation of a randomized X-value as described in Lemma 6, using preprocessed ⟨~r^(1), ..., ~r^(2tm), ~v^(1), ..., ~v^(2tm)⟩. We then return the median of the X-values X_1, ..., X_2t as our value for Y.

Why does this work? By Lemma 6, each X_i approximates |~δ|² to within the desired range with independent probability of error less than 1/4. And the median of the X-values can be out of range only if at least t of the 2t X-values are out of range. By a Chernoff Bound, the probability of this is ≤ e^(−t/6) ≤ p.
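The median-of-means estimator of Lemmas 6 and 7 can be sketched as follows. Note the hedges: the constants t and m follow the reconstructed choices in the proof above, and for illustration the dot product ~r · ~δ is supplied directly as a callable, whereas in the checker it would be computed in O(n) time as ~y · ~r − ~x · ~v from the stored package.

```python
import math
import random
import statistics

def estimate_sq_norm(dot_with_delta, n, alpha=0.5, p=0.05):
    # Median of 2t runs of the Lemma-6 estimator X = (1/m) sum_k (r^(k).delta)^2.
    t = max(1, math.ceil(6 * math.log(1 / p)))
    m = math.ceil(8 / alpha ** 2)
    xs = []
    for _ in range(2 * t):
        total = 0.0
        for _ in range(m):
            r = [random.choice((-1.0, 1.0)) for _ in range(n)]
            total += dot_with_delta(r) ** 2
        xs.append(total / m)
    return statistics.median(xs)

delta = [0.3, -0.1, 0.2, 0.0, 0.4]      # illustrative error vector; |delta|^2 = 0.30
est = estimate_sq_norm(lambda r: sum(ri * di for ri, di in zip(r, delta)),
                       n=len(delta))
```

With the default parameters the estimate lands within a factor (1 ± 0.5) of 0.30 with probability at least 0.95.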

2.3 Simple Checker: Variants

- A fuller statement of our algorithm would include consideration of arithmetic round-off error in the checker itself as well as in P. However, under the reasonable assumption (for a fixed-point system) that addition and negation do not generate round-off errors, our checker could generate error only in the n real-number multiplications needed to calculate ~x · ~v^(k). It is our heuristic expectation that such error would not be large enough to significantly affect the validity of our checker.

- Evidently, different values could be used for the constants in our algorithm of Section 2.1. Our goal was to particularize a pragmatic checker: one which satisfies reasonably well each of the following desirable conditions: small gap between the definitions of definitely correct and definitely incorrect; small chance of a mistaken accept; very small chance of a mistaken reject; and small run-time.

To generalize: let our definition of definitely incorrect be that there exists |δ_i| ≥ cε√n. Let our algorithm calculate t values of D^(k) and reject iff any |D^(k)| ≥ cε√n. (We used c = 6, t = 10 above.) Then the error bound in Lemma 1 is 2e^(−c²/2). The error bound in Lemma 3 (I) is 2te^(−c²/2), and that in Lemma 3 (II) is (1/2)^t. Lemma 4 (III) generalizes: for any q ∈ (0, 1], the probability that an at-least-q portion of all definitely incorrect I/O pairs will be mistakenly accepted is ≤ β/q, where β is the error bound from Lemma 3 (II). Lemma 4 (IV) generalizes analogously. Lemma 5 applies to any simple checker which has probability of error ≤ 2^(−10) (and any checker can have its error-bound reduced to this level by repeating the check a small constant number of times).
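The generalized bounds can be tabulated directly; for the paper's choice c = 6, t = 10 the following sketch reproduces the constants of Lemma 3:

```python
import math

def checker_bounds(c, t):
    # Generalized error bounds from Section 2.3:
    # mistaken reject <= 2 t e^(-c^2/2); mistaken accept <= (1/2)^t.
    return 2 * t * math.exp(-c * c / 2), 0.5 ** t

reject_bound, accept_bound = checker_bounds(c=6, t=10)
```

This makes the trade-off explicit: raising c shrinks the mistaken-reject bound but widens the gap between definitely correct and definitely incorrect, while raising t shrinks the mistaken-accept bound at the cost of run-time.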

2.4 A Self-Corrector

Motivation: To develop a self-corrector for P, we start with the traditional approach of adding a random displacement to the location at which we must poll P. However, to prevent the components of our input-vectors from going out of legal domain [-1, 1], we here find it necessary to poll P at a location which is a weighted average of desired input ~x and a random input ~r. If we weight the average so that ~r dominates, the resulting vector is near-uniform random, allowing us to prove the reliability of our self-corrector. This method may be compared to those employed in [Gemmell et al. 1991].


Assume that we have tested P on a large number of random input vectors: i.e., vectors (of fixed length n) generated by choosing each component uniform-randomly from the set of fixed-point real numbers on legal domain [-1, 1]. Assume we have verified that P is correct on these random input vectors (± an allowable error-delta); a version of our simple checker might be employed to facilitate this verification. Through such a testing stage (or by use of a self-tester [Ergün 1995]), we can satisfy ourselves that (with high probability) the fraction of inputs on which P returns incorrect output is very small: say, ≤ 1/10,000,000.

Once we have this assurance, we can employ the following self-corrector for P, whose time-bound is two calls to P plus O(n):

Algorithm 4. On input ~x,

- Generate random input vector ~r.
- Call P to calculate P(~r).
- Call P to calculate P((1/n)~x + (1 − 1/n)~r). (Note that this is possible because, for any legal input vectors ~x and ~r, (1/n)~x + (1 − 1/n)~r will also be legal, i.e., each component will be in legal domain [-1, 1].)
- Return, as our corrected value for P(~x),

  ~y_c := n · P((1/n)~x + (1 − 1/n)~r) − (n − 1) · P(~r).
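Algorithm 4 can be sketched as follows, assuming P is given as a callable; the stand-in program `apply_A` and the 2 × 2 matrix are illustrative, and floating-point arithmetic is used in place of the paper's fixed-point model.

```python
import random

def self_correct(P, x):
    # Algorithm 4: one call to P at a random point, one at the weighted
    # average (1/n) x + (1 - 1/n) r, then recombine by linearity.
    n = len(x)
    r = [random.uniform(-1.0, 1.0) for _ in range(n)]
    blended = [xi / n + (1.0 - 1.0 / n) * ri for xi, ri in zip(x, r)]
    P_blend, P_r = P(blended), P(r)
    return [n * b - (n - 1) * pr for b, pr in zip(P_blend, P_r)]

def apply_A(v):
    # A correct stand-in program for the fixed linear map v -> vA.
    A = [[1.0, 2.0], [3.0, 4.0]]
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(v))]

y_c = self_correct(apply_A, [0.5, -0.25])   # approximately [-0.25, 0.0]
```

When P is correct at both polled points, linearity makes the recombination cancel the random displacement exactly (up to round-off).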

Lemma 8. For any ~x:

  Pr over ~r [ P((1/n)~x + (1 − 1/n)~r) is incorrect ] ≤ 3/10,000,000.

Proof. Fix ~x. As ~r varies over all legal input vectors, each component of vector (1/n)~x + (1 − 1/n)~r varies over a (1 − 1/n) fraction of legal domain [-1, 1]. Thus, since ~r is a random input vector, the value of (1/n)~x + (1 − 1/n)~r is distributed uniform-randomly over a "neighborhood"⁹ of input vectors whose size, as a fraction of the space of all legal input vectors, is (1 − 1/n)^n ≥ 1/e > 1/3.

So the probability of hitting an incorrect output at P((1/n)~x + (1 − 1/n)~r) is at most three times the probability of hitting an incorrect output at a truly uniform-random location, which we know to be ≤ 1/10,000,000.

Lemma 9. For any ~x: ~y_c will (approximately) equal desired output ~xA, with probability of error (over the choice of ~r) ≤ 4/10,000,000.

Proof. Based on testing and on Lemma 8, we know that P((1/n)~x + (1 − 1/n)~r) and P(~r) are both likely to be correct: chance of error is ≤ 3/10,000,000 + 1/10,000,000 = 4/10,000,000. And if they are indeed both correct, then (barring problems of round-off error, which will be discussed below),

  ~y_c = n · P((1/n)~x + (1 − 1/n)~r) − (n − 1) · P(~r)
       = n · ((1/n)~x + (1 − 1/n)~r) A − (n − 1) ~r A
       = n · (1/n) ~xA + n (1 − 1/n) ~rA − (n − 1) ~rA
       = ~xA.

[Footnote 9: Arithmetic rounding must be done with care to assure this uniformity.]

Unfortunately, our self-corrector reduces the arithmetic precision of P. For, in calculating

  ~y_c := n · P((1/n)~x + (1 − 1/n)~r) − (n − 1) · P(~r),

log n bits of information are lost when we divide ~x by n (assuming fixed-point arithmetic). Moreover, the final multiplications by n and (n − 1) will magnify the ±ε errors in the values returned by P. It would seem, then, that the self-corrected P may have poly(n) times the arithmetic error of the original P. A more conclusive error-analysis would require knowledge of the internals of P.

Thus we will have greater confidence in this corrector if our system allows for the use of more bits of arithmetic precision than are seemingly required for the ordinary operation of P. In particular, O(log n) bits of additional precision might be expected to compensate for a poly(n)-factor increase in error. However, a more detailed error-analysis, requiring knowledge of the internals of P, would be needed to determine whether or not this is in fact true.

Rather than delving further into error-analyses, we merely note that questions of the round-off errors which checkers and correctors should expect and will produce are important to real-number checking, and may be of interest to the numerical analysis community.

2.5 A Faster Self-Corrector

Another problem with our self-corrector is that it slows execution speed by a factor of at least 2. We can fix this problem by employing the method of stored randomness (Section 1.6):

Algorithm 5. Prior to run-time, we generate and store a random input vector ~r. We also calculate and store (n − 1)P(~r). By repeatedly using these stored values, we then need only one run-time call to P to calculate

  ~y_c := n · P((1/n)~x + (1 − 1/n)~r) − (n − 1) · P(~r).

Observe that our modified self-corrector, like our simple checker, adds only O(n) arithmetic operations to the execution-time of P. Assuming that P is an ω(n)-time program, the resulting loss of performance should be negligible.

Proceeding as in Lemma 4, we can readily derive the following results, which suggest that the use of stored randomness does not seriously undermine the reliability of our corrector:

Lemma 10. Assume that we generate "random package" R = ⟨~r, (n − 1)P(~r)⟩ and use this stored package to self-correct ℓ runs of P. Then, for probabilities over the choice of R:

(I). If ℓ ≤ 10,000, then the probability that any corrected output will be erroneous is ≤ 4/1,000.

(II). For any ℓ, the probability that ≥ 1/10,000 of the corrected outputs will be erroneous is ≤ 4/1,000.

Lemma 10, like Lemma 4, is not entirely satisfactory. As in Section 2.1, we can derive a stronger result by using a slightly more sophisticated method:

Fix input length: assume that N bits suffice to represent any legal input ~x. Also fix the behavior of P: i.e., assume (as is usual for a self-corrector) that P is not able to adapt in response to the corrector.

Algorithm 6. At preprocessing time, we generate, not one random package R = ⟨~r, (n − 1)P(~r)⟩, but 4N packages R*_1, ..., R*_4N. Then, for each run-time self-correction, we use a randomly chosen package R*_u ∈ {R*_1, ..., R*_4N}.

The following lemma argues that each run-time self-correction will then have a small independent probability of error. Proof is analogous to that of Lemma 5.

Lemma 11. With probability of failure ≤ 2^(−5N), R*_1, ..., R*_4N will be such that, for all ~x, at most 1/4 of R*_1, ..., R*_4N are bad for correcting P(~x). Moreover, by increasing the number of stored packages we can reduce error-bounds 2^(−5N) and 1/4 as desired.

This is a general method allowing any self-corrector to take advantage of stored randomness without compromising its reliability. However, a difference between this result and Lemma 5 should be noted: here P is not allowed to be an adversary which can adapt to the corrector. This seems a necessary requirement, as a program knowing the values of stored packages R*_1, ..., R*_4N could readily make use of this knowledge to fool the corrector.

2.6 Self-Corrector: Variants

- To generalize: let p be an upper bound on the portion of input vectors ~x for which P(~x) is incorrect. (We used p = 1/10,000,000 above.) Then the error bound in Lemma 8 is ep, and that in Lemma 9 is (e + 1)p. Lemma 10 (II) generalizes: for any q ∈ (0, 1], the probability that an at-least-q portion of the corrected outputs will be erroneous is ≤ (e + 1)p/q. Lemma 11 applies to any self-corrector which has probability of error ≤ 2^(−10).

- Having calculated corrected output ~y_c as described above, we could then employ a version of our simple checker to verify the correctness of ~y_c. This modified form of P then provides both checking and self-correcting, while adding only O(n) arithmetic operations to P's usual execution-time.

- The checking situation would be somewhat different if our computer used floating-point, rather than fixed-point, representations. Our model of P's arithmetic errors as a flat ±ε in each component of output would need modification, and the problem of designing expressions such as (1/n)~x + (1 − 1/n)~r (i.e., randomized expressions near-uniformly distributed over the set of legal inputs) would be more complex. In [Blum and Wasserman 1996], we deal with floating-point arithmetic by checking only operations on normalized mantissas (for we regard such operations as the most arduous and error-prone portion of microprocessor arithmetic); we thereby reduce, essentially, to the fixed-point case.

3. CONCLUSION

We have stated our belief that result-checking may prove a valuable tool for software debugging and for the creation of extremely reliable systems. Continuing work toward this goal advances on two fronts. First there is theoretical research: work which adapts checking methodologies to make them more easily applicable. This can include: making checking algorithms more efficient; extending checking to work with limited-accuracy real numbers; and developing philosophical/mathematical analyses of the potential reliability of checkers.

The current paper specifies an efficient linear-function checker (Sections 2.1-2.3) and corrector (Sections 2.4-2.6), using stored randomness as a "cheat" to save time. These algorithms also deal with issues of checking real-number computations, extending the work of [Gemmell et al. 1991; Ar et al. 1993]. Moreover, Section 1.5 gives some preliminary result-checking "philosophy." The reader may also refer to [Blum and Wasserman 1996], which considers the checking of symbolic-manipulation libraries and of microprocessor arithmetic.

Future theoretical researchmight b e exp ected to derivechecking algorithms with

powers analogous to those of the current pap er, but applicable to broader classes

of functions. Also interesting is the topic of storedrandomness |and, in general, of

saving time bycheating work into a prepro cessing stage. The metho ds of Sections

2.1 and 2.5 seem to b e the only checking algorithms so far to make essential use of

prepro cessing. More examples will p erhaps b e forthcoming.

Result-checking theory must b e joined with applied research. With Hughes Air-

craft we are conducting pilot studies of checker-based debugging [Bo ettcher and

Mellema 1995]. In particular, Dr. David Shreve of Hughes has implemented the

simple checker of Section 2.1 as a means of testing a Fourier Transform used in

radar software. Through such studies, it must b e determined how e ective result-

checkers are at agging bugs, and whether creating checkers is an ecientuseof

programming man-hours. As mentioned in Section 1.5, the problem of debugging

checkers is also intriguing.

Finally, an extensive topic for further study, bridging theory and application, is the creation of partial checkers for "messy" functionalities commonly found in real-world software. Applied result-checking will ultimately depend on the development of a rich, general family of heuristics for software checking.

The following bibliography of checking literature is only partial. It is not intended as a comprehensive survey.

REFERENCES

Alon, N., Spencer, J. H., and Erdős, P. 1992. The Probabilistic Method. John Wiley and Sons. Cited for a Chernoff Bound in Section 2.1.

Ar, S., Blum, M., Codenotti, B., and Gemmell, P. 1993. Checking approximate computations over the reals. In Proc. 25th ACM Symp. Theory of Computing (1993), pp. 786–795. Extends checking methodologies to computations on limited-accuracy real numbers. Examples: matrix multiplication, inversion, and determinant; solving systems of linear equations.

Arora, S. and Safra, S. 1992. Probabilistic checking of proofs: a new characterization of NP. In Proc. 33rd IEEE Symp. Foundations of Computer Science (1992), pp. 2–13. Equates NP languages with those in the interactive-proof class PCP(log n, √(log n)).

Babai, L. and Fortnow, L. 1991. Arithmetization: a new method in structural complexity theory. Computational Complexity 1, 1, 41–46. Preliminary version: "A characterization of #P by arithmetic straight line programs," Proc. 31st IEEE Symp. Foundations of Computer Science (1990), pp. 26–34. Further develops a technique from [Babai, Fortnow, and Lund] of translating Boolean formulae into multivariate polynomials; hence allows for polynomial concepts to be applied to the study of certain complexity classes.

Babai, L., Fortnow, L., Levin, L., and Szegedy, M. 1991. Checking computations in polylogarithmic time. In Proc. 23rd ACM Symp. Theory of Computing (1991), pp. 21–31. A variant of [Babai, Fortnow, and Lund] with lower time-bounds, this paper introduces an unusual type of very fast checker for NP computations. Such checkers could be regarded as "hardware checkers," in that they ensure that the hardware follows instructions correctly, but don't ensure that the software is correct.

Babai, L., Fortnow, L., and Lund, C. 1991. Non-deterministic exponential time has two-prover interactive protocols. Computational Complexity 1, 1, 3–40. Preliminary version: Proc. 31st IEEE Symp. Foundations of Computer Science (1990), pp. 16–25. Proves that NEXP = 2IP, and hence that NEXP-complete problems have complex checkers.

Beaver, D. and Feigenbaum, J. 1990. Hiding instances in multioracle queries. In Proc. 7th Symp. Theoretical Aspects of Computer Science (1990), pp. 37–48. Considers the question: can one get one or more servers to compute f(x) without revealing the value of x to any server? Also proves that polynomials can be self-corrected.

Beigel, R. and Feigenbaum, J. 1992. On being incoherent without being very hard. Computational Complexity 2, 1, 1–17. Response to questions of [Blum and Kannan 1995; Yao 1990], including a proof that all NP-complete languages are coherent.

Bentley, J. 1986. Programming Pearls. Addison-Wesley. A discussion here of the difficulty of writing a correct binary-search program is cited in Section 1.7.

Blum, A. 1994. Personal communication. Idea of weak checking (Section 1.7).

Blum, M. 1988. Designing programs to check their work. Technical Report TR-88-009, Int'l Computer Science Institute. Formal introduction of simple and complex checking (though anticipated by, e.g., [Freivalds 1979]). Considers checking of graph isomorphism, sorting, and several group-theoretic computations. See also [Blum and Kannan 1995].

Blum, M., Evans, W., Gemmell, P., Kannan, S., and Naor, M. 1994. Checking the correctness of memories. Algorithmica 12, 2/3 (Aug./Sept.), 225–244. Introduces data-structure checking. Demonstrates that, given a small, secure database, one may check the correctness of a large, adversarial database.

Blum, M. and Kannan, S. 1995. Designing programs that check their work. Journal of the ACM 42, 1 (Jan.), 269–291. Preliminary version: Proc. 21st ACM Symp. Theory of Computing (1989), pp. 86–97. Closely related to [Blum 1988].

Blum, M., Luby, M., and Rubinfeld, R. 1993. Self-testing/correcting with applications to numerical problems. Journal of Computer and System Sciences 47, 3, 549–595. Preliminary version: Proc. 22nd ACM Symp. Theory of Computing (1990), pp. 73–83. Introduces self-testing and self-correcting, and gives applications to a variety of fundamental mathematical computations.
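The linearity self-test underlying this line of work can be conveyed by a small sketch. This is a simplified illustration over the integers mod m, not the paper's full construction (which works over general groups and analyzes error probabilities carefully); the sample programs are hypothetical:

```python
import random

def self_test_linearity(P, n_trials=100, modulus=2**31 - 1):
    """BLR-style self-test: a program P claiming to compute an additive
    (linear) function mod m should satisfy P(x + y) == P(x) + P(y)
    on random pairs (x, y). Returns the observed failure rate."""
    failures = 0
    for _ in range(n_trials):
        x = random.randrange(modulus)
        y = random.randrange(modulus)
        if P((x + y) % modulus) % modulus != (P(x) + P(y)) % modulus:
            failures += 1
    return failures / n_trials

# A genuinely linear function passes every trial...
assert self_test_linearity(lambda x: (7 * x) % (2**31 - 1)) == 0.0
# ...while an affine (hence non-linear) one fails every trial.
assert self_test_linearity(lambda x: x + 1) == 1.0
```

The point of such a test is that it never needs a second, trusted implementation: the program is checked against its own outputs at random points.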

Blum, M. and Wasserman, H. 1996. Reflections on the Pentium division bug. IEEE Transactions on Computers 45, 4 (April), 385–393. A companion piece to the current paper; argues that checking and correcting may be used to enhance the reliability of microprocessor arithmetic.

Boettcher, C. and Mellema, D. J. 1995. Program checkers: practical applications to real-time software. In Test Facility Working Group Conf. (1995). Progress report on a pilot study with Manuel Blum intended to test the usefulness of result-checking as a software development tool.

Butler, R. W. and Finelli, G. B. 1993. The infeasibility of quantifying the reliability of life-critical real-time software. IEEE Transactions on Software Engineering 19, 1 (Jan.), 3–12. Demonstrates inherent limitations of conventional software testing and of fault tolerance.

Cleve, R. and Luby, M. 1990. A note on self-testing/correcting methods for trigonometric functions. Technical Report 90-032, Int'l Computer Science Institute. Applies the methods of [Blum et al. 1993] to trigonometric functions.



Ergün, F. 1995. Testing multivariate linear functions: overcoming the generator bottleneck. In Proc. 27th ACM Symp. Theory of Computing (1995), pp. 407–416. Extends self-testing to functions (e.g., the Fourier Transform) for which traditional self-testers prove inefficient. See also [Ravikumar and Sivakumar 1995].

Fortnow, L. and Sipser, M. 1988. Are there interactive protocols for co-NP languages? Information Processing Letters 28, 5 (Aug.), 249–251. Suggests that co-NP may not be contained in IP. [Lund et al. 1992] later proved the contrary.

Freivalds, R. 1979. Fast probabilistic algorithms. In Mathematical Foundations of Computer Science, Number 74 in Lecture Notes in Computer Science, pp. 57–69. Springer-Verlag. Includes a simple checker for matrix multiplication.
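Freivalds' matrix-multiplication checker is simple enough to sketch directly. The version below (integer matrices, random 0/1 vectors) is one common presentation of the idea:

```python
import random

def freivalds_check(A, B, C, trials=20):
    """Probabilistic check that A*B == C for n-by-n integer matrices.
    Each trial compares A(Br) with Cr for a random 0/1 vector r,
    costing O(n^2) per trial instead of the O(n^3) of recomputation;
    a wrong C survives each trial with probability at most 1/2."""
    n = len(A)
    def matvec(M, v):
        return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        if matvec(A, matvec(B, r)) != matvec(C, r):
            return False   # a single mismatch proves C is wrong
    return True

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert freivalds_check(A, B, [[19, 22], [43, 50]])       # correct product
assert not freivalds_check(A, B, [[19, 22], [43, 51]])   # wrong entry caught
```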

Gemmell, P., Lipton, R. J., Rubinfeld, R., Sudan, M., and Wigderson, A. 1991. Self-testing/correcting for polynomials and for approximate functions. In Proc. 23rd ACM Symp. Theory of Computing (1991), pp. 32–42. Extends self-testing/correcting of polynomials to several difficult cases: e.g., for domains other than finite fields, or for programs capable of (limited) adversarial adaptation. Also commences the extension of traditional checking methodologies to computations on limited-accuracy reals.

Gemmell, P. and Sudan, M. 1992. Highly resilient correctors for polynomials. Information Processing Letters 43, 4 (Sept.), 169–174. Gives near-optimal self-correctors for programs which claim to compute multivariate polynomials. As long as a program is correct on a 1/2 + ε fraction of inputs (for real ε > 0), self-correcting is possible.

Goldreich, O., Micali, S., and Wigderson, A. 1991. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. Journal of the ACM 38, 3 (July), 691–729. Preliminary version: Proc. 27th IEEE Symp. Foundations of Computer Science (1986), pp. 174–187. Includes interactive-proof algorithms for problems including graph isomorphism.

Kannan, S. 1990. Program Result-Checking with Applications. Ph.D. thesis, Computer Science Division, University of California, Berkeley. Includes checkers for group-theory and linear-algebra problems.

Lipton, R. J. 1990. Efficient checking of computations. In Proc. 7th Symp. Theoretical Aspects of Computer Science, Number 415 in Lecture Notes in Computer Science, pp. 207–215. Springer-Verlag. Proves that logspace suffices to check many computations.

Lipton, R. J. 1991. New directions in testing. In Distributed Computing and Cryptography, Volume 2 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pp. 191–202. American Mathematical Society. Independently paralleling [Blum et al. 1993], introduces what are essentially self-correctors (though under a different name). Proves that #P-complete problems have self-correctors.

Lund, C., Fortnow, L., Karloff, H., and Nisan, N. 1992. Algebraic methods for interactive proof systems. Journal of the ACM 39, 4 (Oct.), 859–868. Preliminary version: Proc. 31st IEEE Symp. Foundations of Computer Science (1990), pp. 2–10. Methodology for constructing an interactive proof system for any language in the polynomial-time hierarchy. Applications to [Shamir; Babai, Fortnow, and Lund].

Micali, S. 1992. CS proofs and error-detecting computation. Technical report, MIT Lab for Computer Science. Also: "CS proofs," Technical Report TM-510, MIT Lab for Computer Science, 1994. Gives result-checkers for NP-complete problems, subject to the assumptions that we have available a random oracle which can serve as a cryptographically secure hash-function, and that the program being checked has insufficient time to find collisions in this hash-function.

Nisan, N. 1989. Co-SAT has multi-prover interactive proofs. E-mail announcement. Initiated events leading to [Lund, Fortnow, Karloff, and Nisan; Shamir; Babai, Fortnow, and Lund].


Ravikumar, S. and Sivakumar, D. 1995. On self-testing without the generator bottleneck. In Proc. 15th Conf. Foundations of Software Technology and Theoretical Computer Science, Number 1026 in Lecture Notes in Computer Science, pp. 248–262. Springer-Verlag. Extends the results of [Ergün 1995].

Rubinfeld, R. 1990. A Mathematical Theory of Self-Checking, Self-Testing, and Self-Correcting Programs. Ph.D. thesis, Computer Science Division, University of California, Berkeley. Closely related to [Blum et al. 1993].

Rubinfeld, R. 1992. Batch checking with applications to linear functions. Information Processing Letters 42, 2 (May), 77–80. Introduces batch checking: i.e., collecting I/O pairs and checking them simultaneously. For certain functions it is possible for this combined check to be more efficient than separate checks.

Rubinfeld, R. 1994. On the robustness of functional equations. In Proc. 35th IEEE Symp. Foundations of Computer Science (1994), pp. 288–299. Shows that functions satisfying any of a broad class of functional equations (e.g., f(x + y) = f(x) + f(y)) may be self-tested and self-corrected. Hence specifies testers and correctors for many naturally occurring functions, such as tan x, 1/(1 + cot x), and cosh x.

Rubinfeld, R. and Sudan, M. 1992. Self-testing polynomial functions efficiently and over rational domains. In Proc. 3rd ACM-SIAM Symp. Discrete Algorithms (1992), pp. 23–32. Extends checking methodologies from finite fields to integer and rational domains.

Rubinfeld, R. and Sudan, M. 1996. Robust characterizations of polynomials with applications to program testing. SIAM Journal on Computing 25, 2 (April), 252–271. Demonstrates that polynomials have properties which allow them to be self-tested and self-corrected. See also [Rubinfeld 1994].

Schwartz, J. T. 1980. Fast probabilistic algorithms for verification of polynomial identities. Journal of the ACM 27, 4 (Oct.), 701–717. A fundamental result (credit for which is also given to Zippel and DeMillo–Lipton): to determine (with high probability) whether two polynomials are identical, it generally suffices to check their equality at a random location. Applications include: checking multiset equality; proving that two straight-line arithmetic programs compute the same function.
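The random-evaluation principle can be sketched concretely. The following toy version tests univariate polynomials given as black boxes; the degree bound, domain size, and example polynomials are illustrative choices:

```python
import random

def probably_equal(p, q, degree_bound, trials=10):
    """Schwartz-Zippel-style identity test: two polynomials of degree <= d
    that are not identical agree at a point drawn uniformly from a set of
    size S with probability at most d/S, so agreement at a few random
    points implies identity with high probability."""
    S = 100 * degree_bound      # sample domain much larger than the degree
    for _ in range(trials):
        x = random.randint(1, S)
        if p(x) != q(x):
            return False        # a single disagreement is a proof of difference
    return True

# (x + 1)^2 versus its expansion: the same polynomial, degree 2.
assert probably_equal(lambda x: (x + 1) ** 2, lambda x: x * x + 2 * x + 1, 2)
# (x + 1)^2 versus x^2 + 1: they differ by 2x, nonzero on the whole domain.
assert not probably_equal(lambda x: (x + 1) ** 2, lambda x: x * x + 1, 2)
```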

Shamir, A. 1992. IP = PSPACE. Journal of the ACM 39, 4 (Oct.), 869–877. Preliminary version: Proc. 31st IEEE Symp. Foundations of Computer Science (1990), pp. 11–15. It follows from this result that a program P which claims to solve a PSPACE-complete problem may be checked in polynomial time plus a polynomial number of calls to P.

Vainstein, F. S. 1991. Error detection and correction in numerical computations by algebraic methods. In Proc. 9th Symp. Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, Number 539 in Lecture Notes in Computer Science, pp. 456–464. Springer-Verlag. See [Vainstein 1993].

Vainstein, F. S. 1993. Algebraic Methods in Hardware/Software Testing. Ph.D. thesis, EECS Department, Boston University. Uses the theory of algebraic and transcendental field-extensions to design partial complex checkers for all functions that may be constructed from x, e^x, sin(ax + b), and cos(ax + b) using operators + and fractional exponentiation.

Valiant, L. G. 1979. The complexity of computing the permanent. Theoretical Computer Science 8, 2 (April), 189–201. Defines #P-completeness and proves that computing the permanent of a matrix is #P-complete. See also [Lipton 1991].

Wegman, M. N. and Carter, J. L. 1981. New hash functions and their use in authentication and set equality. Journal of Computer and System Sciences 22, 3 (June), 265–279. Includes an idea for a simple check of multiset equality (completed in [Blum 1988]).

Yao, A. C. 1990. Coherent functions and program checkers. In Proc. 22nd ACM Symp. Theory of Computing (1990), pp. 84–94. Function f is coherent iff on input (x, y) one can determine whether or not f(x) = y via a BPP^f algorithm which is not allowed to query f at x. Author proves the existence of incoherent (and thus uncheckable) functions in EXP. See also [Beigel and Feigenbaum 1992].