Symbolic Execution and Program Testing

James C. King
IBM Thomas J. Watson Research Center

Programming Languages, B. Wegbreit, Editor
Communications of the ACM, July 1976, Volume 19, Number 7

This paper describes the symbolic execution of programs. Instead of supplying the normal inputs to a program (e.g. numbers) one supplies symbols representing arbitrary values. The execution proceeds as in a normal execution except that values may be symbolic formulas over the input symbols. The difficult, yet interesting issues arise during the symbolic execution of conditional branch type statements. A particular system called EFFIGY which provides symbolic execution for program testing and debugging is also described. It interpretively executes programs written in a simple PL/I style programming language. It includes many standard debugging features, the ability to manage and to prove things about symbolic expressions, a simple program testing manager, and a program verifier. A brief discussion of the relationship between symbolic execution and program proving is also included.

Key Words and Phrases: symbolic execution, program testing, program debugging, program proving, program verification, symbolic interpretation

CR Categories: 4.13, 5.21, 5.24

Copyright © 1976, Association for Computing Machinery, Inc. General permission to republish, but not for profit, all or part of this material is granted provided that ACM's copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery.

Author's address: IBM Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, N.Y. 10598.

1. Introduction

The large-scale production of reliable programs is one of the fundamental requirements for applying computers to today's challenging problems. Several techniques are used in practice; others are the focus of current research. The work reported in this paper is directed at assuring that a program meets its requirements even when formal specifications are not given. The current technology in this area is basically a testing technology. That is, some small sample of the data that a program is expected to handle is presented to the program. If the program is judged to produce correct results for the sample, it is assumed to be correct. Much current work [11] focuses on the question of how to choose this sample.

Recent work on proving the correctness of programs by formal analysis [5] shows great promise and appears to be the ultimate technique for producing reliable programs. However, the practical accomplishments in this area fall short of a tool for routine use. Fundamental problems in reducing the theory to practice are not likely to be solved in the immediate future.

Program testing and program proving can be considered as extreme alternatives. While testing, a programmer can be assured that sample test runs work correctly by carefully checking the results. The correct execution for inputs not in the sample is still in doubt. Alternatively, in program proving the programmer formally proves that the program meets its specification for all executions without being required to execute the program at all. To do this he gives a precise specification of the correct program behavior and then follows a formal proof procedure to show that the program and the specification are consistent. The confidence in this method hinges on the care and accuracy employed in both the creation of the specification and in the construction of the proof steps, as well as on the attention to machine-dependent issues such as overflow, rounding, etc.

This paper describes a practical approach between these two extremes. From one simple view, it is an enhanced testing technique. Instead of executing a program on a set of sample inputs, a program is "symbolically" executed for a set of classes of inputs. That is, each symbolic execution result may be equivalent to a large number of normal test cases. These results can be checked against the programmer's expectations for correctness either formally or informally.

The class of inputs characterized by each symbolic execution is determined by the dependence of the program's control flow on its inputs. If the control flow of the program is completely independent of the input variables, a single symbolic execution will suffice to check all possible executions of the program. If the control flow of the program is dependent on the inputs, one must resort to a case analysis. Often the set of input classes needed to exhaust all possible cases is practically infinite, so this is still basically a testing methodology. However, the input classes are determined only by those inputs involved in the control flow, and symbolic testing promises to provide better results more easily than normal testing for most programs.

2. Symbolic Execution

The symbolic execution of a program is described in this section in an ideal sense, and then, in Section 6, a particular practical system which has been built (an approximation to the ideal) is discussed. The term ideal is used for several reasons:

1. The assumption is made that programs deal only with integers and, in fact, only with integers having arbitrary magnitude. Machine register overflows are not considered.
2. The "execution tree" (defined later) resulting from the symbolic execution of many (most) programs is infinite.
3. The symbolic execution of IF statements requires theorem proving which, even for modest programming languages, is mechanically impossible.

Nonetheless, the discussion of the ideal does provide a standard against which real computer systems for symbolic execution can be measured.

Each programming language has an execution semantics describing the data objects which program variables may represent, how statements written in the language manipulate data objects, and how control flows through the statements of a program. One can also define an alternative "symbolic execution" semantics for a programming language where the real data objects need not be used but can be represented by arbitrary symbols. Symbolic execution is a natural extension of normal execution, providing the normal computations as a special case. Computational definitions for the basic operators of the language are extended to accept symbolic inputs and produce symbolic formulas as output.

Let us consider a simple programming language. Let the program variables be exclusively of type "signed integer". Include simple assignment statements, IF statements (with THEN and ELSE clauses), GO-TO's to labels, and some means for obtaining inputs (e.g. procedure parameters, global variables, read operations). Restrict the arithmetic expressions to the basic integer operators of addition (+), subtraction (-), and multiplication (×). Restrict the Boolean expressions (used in IF statements) to the simple test of whether an arithmetic expression is non-negative (i.e. {arith.expr.} ≥ 0).

The symbolic execution of programs in this simple [...] language syntax nor the individual programs written in the language are changed. The only opportunity to introduce symbolic data objects (symbols representing integers) is as inputs to the program. For simplicity, let us suppose that each time a new input value for the program is required, it is supplied symbolically from the list of symbols {α₁, α₂, α₃, ...}. Program inputs are eventually assigned as values to program variables (e.g. by procedure parameters, global variables, or read statements). Thus, to handle symbolic inputs, we allow the values of variables to be αᵢ's as well as signed integer constants.

The evaluation rules for arithmetic expressions used in assignment and IF statements must be extended to handle symbolic values. The expressions formed in the usual way by the integers, a set of indeterminate symbols {α₁, α₂, ...}, parentheses, and the operations +, -, and × are the integer polynomials (integer valued, integer coefficients) over those symbols. By allowing program variables to assume integer polynomials over the αᵢ's as values, the symbolic execution of assignment statements follows naturally. The expression on the right-hand side of the statement is evaluated, possibly substituting polynomial expressions for variables. The result is a polynomial (an integer is the trivial case) which is then assigned as the new value of the variable on the left-hand side of the assignment statement.

The GO-TO's to labels function exactly as in normal executions by unconditionally transferring control from the GO-TO statement to the statement associated with the corresponding label.

The "state" of a program execution usually includes the values of program variables and a statement counter (denoting the statement currently being executed). The definition of the symbolic execution of the IF statement requires that a "path condition" (pc) also be included in the execution state. pc is a Boolean expression over the symbolic inputs {αᵢ}. It never contains program variables, and for our simple language, is a conjoined list of expressions of the form R ≥ 0 or ¬(R ≥ 0), where R is a polynomial over {αᵢ}. For example:

{α₁ ≥ 0 ∧ α₁ + 2 × α₂ ≥ 0 ∧ ¬(α₃ ≥ 0)}.

As will be seen, pc is the accumulator of properties which the inputs must satisfy in order for an execution to follow the particular associated path. Each symbolic execution begins with pc initialized to true. As assumptions about the inputs are made, in order to choose between alternative paths through the program as presented by IF statements, those assumptions are added (conjoined) to pc.

The symbolic execution of an IF statement begins in a fashion similar to its normal execution: the evaluation of the associated Boolean expression by replacing variables by their values.
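The ingredients described in this section (integer polynomials over the input symbols as variable values, assignment by evaluating the right-hand side, and a path condition that starts as true and has assumptions conjoined to it) can be illustrated with a small sketch. It is not taken from the paper or from EFFIGY. The dictionary representation of polynomials, the nested-tuple expression format, and all names (`sym`, `evaluate`, `State`, `assign`, `assume`) are assumptions made only for illustration, and no attempt is made to decide whether a path condition is satisfiable.

```python
from collections import defaultdict

# Assumed representation: a polynomial is a dict mapping a monomial
# (sorted tuple of input-symbol names) to a nonzero integer coefficient.

def const(c):
    """Polynomial for the integer constant c."""
    return {(): c} if c != 0 else {}

def sym(name):
    """Polynomial for one symbolic input, e.g. 'a1'."""
    return {(name,): 1}

def add(p, q):
    out = defaultdict(int)
    for poly in (p, q):
        for mono, coeff in poly.items():
            out[mono] += coeff
    return {m: c for m, c in out.items() if c != 0}

def sub(p, q):
    return add(p, {m: -c for m, c in q.items()})

def mul(p, q):
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(sorted(m1 + m2))] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

OPS = {'+': add, '-': sub, '*': mul}

def evaluate(expr, env):
    """Evaluate an expression to a polynomial over the input symbols.

    expr is an int, a program variable name, or a tuple (op, left, right).
    Program variables are replaced by their current (polynomial) values.
    """
    if isinstance(expr, int):
        return const(expr)
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    return OPS[op](evaluate(left, env), evaluate(right, env))

class State:
    """Execution state: variable values plus the path condition pc."""

    def __init__(self):
        self.env = {}  # program variable -> polynomial
        self.pc = []   # conjunction of (R, holds); the empty list means true

    def assign(self, var, expr):
        self.env[var] = evaluate(expr, self.env)

    def assume(self, expr, holds):
        """Conjoin 'R >= 0' (holds=True) or 'not (R >= 0)' (holds=False)."""
        self.pc.append((evaluate(expr, self.env), holds))

# Usage: X and Y receive the symbolic inputs a1 and a2.
state = State()
state.env['X'] = sym('a1')
state.env['Y'] = sym('a2')
state.assign('Z', ('+', 'X', ('*', 2, 'Y')))  # Z becomes a1 + 2*a2
state.assume('Z', True)                       # pc: a1 + 2*a2 >= 0
state.assume(('-', 'X', 'Y'), False)          # pc: ... and not (a1 - a2 >= 0)
```

Because every value is stored as a polynomial over the inputs, the path condition built this way mentions only input symbols and never program variables, which matches the property of pc stated above.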
