Synthesising Interprocedural Bit-Precise Termination Proofs
Hong-Yi Chen, Cristina David, Daniel Kroening, Peter Schrammel and Björn Wachter
Department of Computer Science, University of Oxford

Abstract: Proving program termination is key to guaranteeing the absence of undesirable behaviour, such as hanging programs and even security vulnerabilities such as denial-of-service attacks. To make termination checks scale to large systems, interprocedural termination analysis seems essential. This is a largely unexplored area of research in termination analysis, where most effort has focussed on difficult single-procedure problems. We present a modular termination analysis for C programs using template-based interprocedural summarisation. Our analysis combines a context-sensitive, over-approximating forward analysis with the inference of under-approximating preconditions for termination. Bit-precise termination arguments are synthesised over lexicographic linear ranking function templates. Our experimental results show that our tool 2LS outperforms state-of-the-art alternatives, and demonstrate the clear advantage of interprocedural reasoning over monolithic analysis in terms of efficiency, while retaining comparable precision.

I. INTRODUCTION

Termination bugs can compromise safety-critical software systems by making them unresponsive, e.g., termination bugs can be exploited in denial-of-service attacks [1]. Termination guarantees are therefore instrumental for software reliability. Termination provers, static analysis tools that aim to construct a termination proof for a given input program, have made tremendous progress. They enable fully automatic proofs for complex loops that may require linear lexicographic (e.g. [2], [3]) or non-linear termination arguments (e.g. [4]). However, there remain major practical challenges in analysing real-world code.

First of all, as observed by [5], most approaches in the literature are specialised to linear arithmetic over unbounded mathematical integers. Although unbounded arithmetic may reflect the intuitively expected program behaviour, the program actually executes over bounded machine integers. The semantics of C allows unsigned integers to wrap around when they over/underflow. Hence, arithmetic on $k$-bit-wide unsigned integers must be performed modulo $2^k$. According to the C standards, over/underflows of signed integers are undefined behaviour, but in practice they also wrap around on most architectures. Thus, accurate termination analysis requires a bit-precise analysis of program semantics. Tools must be configurable with architectural specifications such as the width of data types and endianness. The following examples illustrate that termination behaviour on machine integers can be completely different from that on mathematical integers. For example, the following code:

    void foo1(unsigned n) { for (unsigned x = 0; x <= n; x++); }

does terminate with mathematical integers, but does not terminate with machine integers if n equals the largest unsigned integer. On the other hand, the following code:

    void foo2(unsigned x) { while (x >= 10) x++; }

does not terminate with mathematical integers, but terminates with machine integers because unsigned machine integers wrap around.

A second challenge is to make termination analysis scale to larger programs. The yearly Software Verification Competition (SV-COMP) [6] includes a division in termination analysis, which gives a representative picture of the state of the art. The SV-COMP'15 termination benchmarks contain challenging termination problems on smaller programs with at most 453 instructions (average 53), containing at most 7 functions (average 3) and 4 loops (average 1).

In this paper, we present a technique that we have successfully run on programs that are an order of magnitude larger, containing up to 5000 instructions. Larger instances require different algorithmic techniques to scale, e.g., modular interprocedural analysis rather than monolithic analysis. This poses several conceptual and practical challenges that do not arise in monolithic termination analysers. For example, when proving termination of a program, a possible approach is to try to prove that all procedures in the program terminate universally, i.e., in any possible calling context. However, this criterion is too optimistic, as termination of individual procedures often depends on the calling context, i.e., procedures terminate conditionally only in specific calling contexts.

Hence, an interprocedural analysis strategy is to verify universal program termination in a top-down manner by proving termination of each procedure relative to its calling contexts, and propagating upwards which calling contexts guarantee termination of the procedure. It is too difficult to determine these contexts precisely; analysers thus compute preconditions for termination. A sufficient precondition identifies those pre-states in which the procedure will definitely terminate, and is thus suitable for proving termination. By contrast, a necessary precondition identifies the pre-states in which the procedure may terminate. Its negation characterises those states in which the procedure will not terminate, which is useful for proving nontermination.

In this paper we focus on the computation of sufficient preconditions. Preconditions enable information reuse, and thus scalability, as it is frequently possible to avoid repeated analysis of parts of the code base, e.g.
libraries whose procedures are called multiple times or did not undergo modifications between successive analysis runs.

Contributions:
1) We propose an algorithm for interprocedural termination analysis. The approach is based on a template-based static analysis using SAT solving. It combines context-sensitive, summary-based interprocedural analysis with the inference of preconditions for termination based on template abstractions. We focus on non-recursive programs, which cover a large portion of software written, especially in domains such as embedded systems.
2) We provide an implementation of the approach in 2LS, a static analysis tool for C programs. Our instantiation of the algorithm uses template polyhedra and lexicographic, linear ranking function templates. The analysis is bit-precise and relies purely on SAT-solving techniques.
3) We report the results of an experimental evaluation on 597 procedural SV-COMP benchmarks, totalling 1.6 million lines of code, that demonstrates the scalability and applicability of the approach to programs with thousands of lines of code.

II. PRELIMINARIES

In this section, we introduce basic notions of interprocedural and termination analysis.

Program model and notation. We assume that programs are given in terms of acyclic call graphs,¹ where individual procedures $f$ are given in terms of symbolic input/output transition systems. Formally, the input/output transition system of a procedure $f$ is a triple $(\mathit{Init}_f, \mathit{Trans}_f, \mathit{Out}_f)$, where $\mathit{Trans}_f(\boldsymbol{x}, \boldsymbol{x}')$ is the transition relation; the input relation $\mathit{Init}_f(\boldsymbol{x}^{in}, \boldsymbol{x})$ defines the initial states of the transition system and relates them to the inputs $\boldsymbol{x}^{in}$; the output relation $\mathit{Out}_f(\boldsymbol{x}, \boldsymbol{x}^{out})$ connects the transition system to the outputs $\boldsymbol{x}^{out}$ of the procedure. Inputs are procedure parameters, global variables, and memory objects that are read by $f$. Outputs are return values, and potential side effects such as global variables and memory objects written by $f$. Internal states $\boldsymbol{x}$ are commonly the values of variables at the loop heads in $f$. These relations are given as first-order logic formulae resulting from the logical encoding of the program semantics. Fig. 2 shows the encoding of the two procedures in Fig. 1 into such formulae. The inputs $\boldsymbol{x}^{in}$ of $f$ are $(z)$ and the outputs $\boldsymbol{x}^{out}$ consist of the return value, denoted $(r_f)$. The transition relation of $h$ encodes the loop over the internal state variables $(x, y)$.

    unsigned f(unsigned z) {
      unsigned w = 0;
      if (z > 0) w = h(z);
      return w;
    }

    unsigned h(unsigned y) {
      unsigned x;
      for (x = 0; x < 10; x += y);
      return x;
    }

Fig. 1. Example.

$$\begin{array}{lcl}
\mathit{Init}_f((z), (w', z', g')) & \equiv & (w' = 0 \wedge z' = z \wedge g') \\
\mathit{Trans}_f((w, z, g), (w', z', g')) & \equiv & (g \wedge h_0((z), (w_1)) \wedge w' = (z > 0 \,?\, w_1 : w) \wedge \neg g') \\
\mathit{Out}_f((w, z, g), (r_f)) & \equiv & (r_f = w) \wedge \neg g \\
\mathit{Init}_h((y), (x', y')) & \equiv & (x' = 0 \wedge y' = y) \\
\mathit{Trans}_h((x, y), (x', y')) & \equiv & (x' = x + y \wedge x < 10 \wedge y' = y) \\
\mathit{Out}_h((x, y), (r_h)) & \equiv & (r_h = x \wedge \neg(x < 10))
\end{array}$$

Fig. 2. Encoding of Example 1.

Moreover, we write $\boldsymbol{x}$ and $x$ with the understanding that the former is a vector, whereas the latter is a scalar.

Each call to a procedure $h$ at call site $i$ in a procedure $f$ is modelled by a placeholder predicate $h_i(\boldsymbol{x}^{p\_in}_i, \boldsymbol{x}^{p\_out}_i)$ occurring in the formula $\mathit{Trans}_f$ for $f$. The placeholder predicate ranges over intermediate variables representing its actual input and output parameters $\boldsymbol{x}^{p\_in}_i$ and $\boldsymbol{x}^{p\_out}_i$, respectively. Placeholder predicates evaluate to true, which corresponds to havocking procedure calls. In procedure $f$ in Fig. 2, the placeholder for the procedure call to $h$ is $h_0((z), (w_1))$ with the actual input and output parameters $z$ and $w_1$, respectively. A full description of the program encoding is given in the extended version of our paper [7].

Basic concepts. Moving on to interprocedural analysis, we introduce formal notation for the basic concepts below:

Definition 1 (Invariants, Summaries, Calling Contexts). For a procedure given by $(\mathit{Init}, \mathit{Trans}, \mathit{Out})$ we define:

• An invariant is a predicate $\mathit{Inv}$ such that:
$$\forall \boldsymbol{x}^{in}, \boldsymbol{x}, \boldsymbol{x}' : \big(\mathit{Init}(\boldsymbol{x}^{in}, \boldsymbol{x}) \Longrightarrow \mathit{Inv}(\boldsymbol{x})\big) \wedge \big(\mathit{Inv}(\boldsymbol{x}) \wedge \mathit{Trans}(\boldsymbol{x}, \boldsymbol{x}') \Longrightarrow \mathit{Inv}(\boldsymbol{x}')\big)$$

• Given an invariant $\mathit{Inv}$, a summary is a predicate $\mathit{Sum}$ such that:
$$\forall \boldsymbol{x}^{in}, \boldsymbol{x}, \boldsymbol{x}', \boldsymbol{x}^{out} : \mathit{Init}(\boldsymbol{x}^{in}, \boldsymbol{x}) \wedge \mathit{Inv}(\boldsymbol{x}') \wedge \mathit{Out}(\boldsymbol{x}', \boldsymbol{x}^{out}) \Longrightarrow \mathit{Sum}(\boldsymbol{x}^{in}, \boldsymbol{x}^{out})$$

• Given an invariant $\mathit{Inv}$, the calling context for a procedure call $h$ at call site $i$ in the given procedure is a predicate $\mathit{CallCtx}_{h_i}$ such that²
$$\forall \boldsymbol{x}, \boldsymbol{x}', \boldsymbol{x}^{p\_in}_i, \boldsymbol{x}^{p\_out}_i : \mathit{Inv}(\boldsymbol{x}) \wedge \mathit{Trans}(\boldsymbol{x}, \boldsymbol{x}') \Longrightarrow \mathit{CallCtx}_{h_i}(\boldsymbol{x}^{p\_in}_i, \boldsymbol{x}^{p\_out}_i)$$