Journal of Network and Computer Applications 59 (2016) 109–116
Termination analysis with recursive calling graphs
Teng Long (a, b; corresponding author), Wenhui Zhang (b)

a School of Information Engineering, China University of Geosciences (Beijing), 29#, Xueyuan Road, Haidian District, Beijing, PR China
b State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, 4#, South Fourth Street of Zhongguancun, Haidian District, Beijing, PR China

Article info

Available online 16 July 2015

Keywords: Termination analysis; Green software programs; Size-change termination principle

Abstract

As one of the significant aspects of green software systems, termination analysis is related to the optimization of resource utilization. The size-change termination principle, first proposed by Lee, Jones and Ben-Amram in 2001, is an effective method for automatic termination analysis. Because of its abstracted constructs (size-change graphs), the principle ignores the conditions and return values of function calls. In this paper, we devise a new construct that includes these ignored features, in order to extend the set of real-life programs that are size-change terminating. The main contribution of our paper is twofold: firstly, it supports the analysis of functions in which the returned values are relevant to termination. Secondly, it gains more accuracy for oscillating value change in termination analysis.

© 2015 Elsevier Ltd. All rights reserved.
1. Introduction

With the development of information and communication technologies, electronic devices have become increasingly popular, raising serious environmental concerns for society. To address these issues, there is a critical need to consider greenness in the whole life cycle of software systems.

A guarantee of correct behavior of software systems is significant, since it can greatly lighten network loads and avoid wasting resources. Termination is one of the properties that describe the correct behavior of a software system. Moreover, termination analysis (Kuwahara et al., 2014; Heizmann et al., 2010; Codish et al., 2010, 2012; Farzan et al., 2015) is a much-studied topic. In real-life software systems, methods to derive termination properties from recursive and mutually recursive functions are useful in program analysis.

The size-change termination (SCT) principle, presented in Lee et al. (2001), is simple but surprisingly rich enough to capture the progress of many real-life programs. Size-change graphs (SCGs) are essential auxiliary constructs to support SCT. More importantly, the relations between input and output variables of a function call are restricted to the forms $x > y'$ or $x \geq y'$.¹ SCT operates over the variables whose "size" is well-founded, based on the structure of the function calls in the program. Once a combination of the relations has been found, termination follows if at least one variable is guaranteed to decrease. The termination analysis is independent of the order of the variables, which is decided in SCT. Therefore, it is one of the approaches that have been successfully applied to a large class of programs for termination analysis. There are some attempts (Ben-Amram and Genaim, 2014, 2013; Ben-Amram, 2009; Cook et al., 2013) to relax the well-founded restriction. However, the downside is that the SCG can only express transitions involving decreases in well-founded partial orders. In fact, there are still many situations with oscillating value change, in which the traditional SCT approach is not applicable.

We present a technique for deriving program termination properties from size-change information, by constructing a recursive calling graph (RCG, defined in Section 3) and a set of extended size-change graphs (ESCGs, defined in Section 3) for the program. The former describes the relation between function calls and function returns based on locations in the program. The latter approximate the oscillating size change at each location in the program. The strongly connected components in the RCG are an approximation of all sequences of function calls that could be idempotent.² Our paper presents a technique which could help streamline the application of green software programs. The main contribution of our paper with recursive calling graphs is twofold: firstly, it supports the analysis of recursive and mutually recursive functions in which return values are relevant to termination. Secondly, the approach extends the set of programs that are size-change terminating.

The paper is organized as follows: in Section 2, we introduce the syntax of the language used in this paper and the definitions that are related to the size-change termination principle. We propose the definitions of auxiliary constructs and do termination analysis in Section 3. Section 3 outlines how to record the returned values, and how to do a more precise analysis. An algorithm is proposed in Section 4. Section 5 illustrates the application of the approach on an example with self-recursive function calls. Related work is discussed in Section 6. We end with concluding remarks in Section 7.

Corresponding author: T. Long. E-mail addresses: [email protected] (T. Long), [email protected] (W. Zhang).
¹ The names for input and output variables could be different.
² Idempotent graphs describe the self-recursive function calls.
http://dx.doi.org/10.1016/j.jnca.2015.06.019
1084-8045

2. Preliminaries

In this section, we list the syntax of the language used in this paper and the definitions of the size-change termination principle as in Lee et al. (2001).

2.1. Language

The language used in this paper is a simple first-order call-by-value functional language, defined in Table 1. An expression can be a variable, a constant, a conditional choice based on the equational theory (if..then..else), an application of a primitive operator, or a function call. Function definitions bind a function name and its parameters to an expression.

Table 1. The language syntax.

  $x \in \mathit{Var}$ (Variables)
  $op \in \mathit{Prim}$ (Primitive operators)
  $f \in \mathit{FName}$ (Function names)
  $c \in \mathit{Const}$ (Constants)
  $e \in \mathit{Exp}$ (Expressions)
      $e ::= \text{if } e_0 \text{ then } e_1 \text{ else } e_2 \mid x \mid c \mid e_1 \; op \; e_2 \mid f(e_1, \ldots, e_n)$
  $d \in \mathit{Def}$ (Definitions)
      $d ::= f(x_1, \ldots, x_n) = e$

2.2. Size-change termination principle

Definition 1 (Size-change graph (SCG)). Suppose functions $f$ and $g$ are defined in program $P$. A size-change graph $G : f \to g$ for $P$ is a set of labeled arcs $x \xrightarrow{>} y$ or $x \xrightarrow{\geq} y$ where $x \in Variables(f)$, $y \in Variables(g)$.

Functions $f$ and $g$ (the caller and callee) are respectively called the source and the target of $G$. $G$ is an abstraction of a call transition from $f$ to $g$. The arcs in the graph serve as abstract transitions.

Definition 2 (Closure). The closure of a set of size-change graphs $\mathcal{G}$ is the smallest set $cl(\mathcal{G})$ such that

  $\mathcal{G} \subseteq cl(\mathcal{G})$;
  if $G_1 : f \to f'$ and $G_2 : f' \to f''$ are in $cl(\mathcal{G})$, then $G_1; G_2 \in cl(\mathcal{G})$.

Here $G_1; G_2 : f \to f''$ has the arc set $E$ defined below:

  $E = \{x \xrightarrow{>} z \mid \exists y, w.\ x \xrightarrow{>} y \xrightarrow{w} z \text{ or } x \xrightarrow{w} y \xrightarrow{>} z,\ w \in \{>, \geq\}\} \cup \{x \xrightarrow{\geq} z \mid \exists y.\ x \xrightarrow{\geq} y \xrightarrow{\geq} z\}$

where, in each two-step path, the first arc is an arc of $G_1$ and the second is an arc of $G_2$.

A size-change graph $G$ is idempotent if $G; G = G$. A self-recursive function call can be expressed by an idempotent size-change graph.

Theorem 1 (Size-change termination principle). Program $P$ is SCT terminating iff every idempotent $G$ in $cl(\mathcal{G})$ has an arc $z \xrightarrow{>} z$.

The SCT is described in Lee et al. (2001) as follows: a program terminates on all inputs if every infinite call sequence (following program control flow) would cause an infinite descent in some data values.

A closure set of size-change graphs (sets of abstract transitions) is used to express the set of all the possible call sequences. All of the idempotent graphs in the closure set stand for the infinite call sequences. If a descent arc "$>$" can be found, it satisfies the "infinite descent" in the principle.

3. Termination analysis

The termination analysis for programs with nested self-recursive functions over the integer domain is complex, because of the complicated "size change" situation. The possible cases are as follows:

1. The value of variables in the program declines all the time.
2. The value of variables in the program increases in each function call.
3. The value of variables in the program is oscillating, such as in McCarthy's 91 function.

There are only two arc forms in size-change graphs, $x > y'$ and $x \geq y'$; therefore, SCT can deal only with the first case. In our work, additional variables and a new auxiliary construct are used to address the latter two cases.

3.1. Additional variables

In this work, additional variables consist of counter variables and returned variables. The former are used to record the number of times some function is called or returns, while the latter are used to record the returned value.

The returned values are always ignored in available approaches, so functions in which the returned values are relevant to termination cannot be analyzed correctly. Adding extra variables to the program is the first step to extend the range of programs for termination analysis.

A set of additional variables $V_{add}$ consists of a set of returned variables and a set of counter variables. After augmenting the program with additional variables, the set of variables $V$ in the program is augmented to $V \cup V_{add}$.

3.2. Definitions of auxiliary constructs

We construct the recursive calling graph (RCG) to describe the control flow of the augmented program, which can express not only the function calls but also the function returns. The transitions in an RCG are described as ESCGs, which are obtained by transforming the arc types from $x > y'$ or $x \geq y'$ to $x \xrightarrow{\delta} y'$, where $\delta \in \mathbb{Z}$ is the changed value in the transition.

The value changes recorded in ESCGs must be size reducing. In other words, the corresponding code cannot be revisited, and the recursive calling graph cannot result in any infinite cycle of the program.

3.2.1. Extended size-change graph

Definition 3 (Extended size-change graph (ESCG)). Suppose functions $f$ and $g$ are defined in program $P$. Location $a$ is the place in program $P$ where the values of the variables of function $g$ change. An extended size-change graph $B_a : f \to g$ for $P$ is a set of labeled arcs $x \xrightarrow{\delta} y$ where $x \in Variables(f)$, $y \in Variables(g)$, and $\delta \in \mathbb{Z}$ is the changed value for the function call.

For an ESCG $B_a \in \mathcal{B}$ with $B_a : f \to g$, functions $f$ and $g$ are respectively called the source and the target of $B_a$ ($source(B_a) = f$, $target(B_a) = g$).
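Before the extension is developed further, the classical test of Section 2.2 (Definition 2 and Theorem 1) can be illustrated with a minimal Python sketch. The tuple encoding of a graph, the function names `compose`, `closure` and `is_sct`, and the two example programs are assumptions made for illustration and do not come from the paper. The sketch also assumes the normalisation of Lee et al. (2001), in which only the strict arc is kept when both a strict and a non-strict arc between the same pair of variables would arise.

```python
from itertools import product

# A size-change graph is encoded (hypothetically) as
# (source, target, frozenset of arcs), each arc being (x, label, y)
# with label ">" (strict decrease) or ">=" (non-increase).

def compose(g1, g2):
    """Compose G1: f -> f' with G2: f' -> f'' as in Definition 2."""
    (f, mid1, arcs1), (mid2, h, arcs2) = g1, g2
    assert mid1 == mid2, "target of G1 must be the source of G2"
    strict = {}
    for (x, r1, y1), (y2, r2, z) in product(arcs1, arcs2):
        if y1 != y2:
            continue  # the two arcs do not form a path x -> y -> z
        # A path is strict as soon as one of its two arcs is strict.
        strict[(x, z)] = strict.get((x, z), False) or ">" in (r1, r2)
    arcs = frozenset((x, ">" if s else ">=", z) for (x, z), s in strict.items())
    return (f, h, arcs)

def closure(graphs):
    """Smallest set containing `graphs` and closed under composition."""
    result = set(graphs)
    changed = True
    while changed:
        changed = False
        for g1, g2 in product(list(result), repeat=2):
            if g1[1] == g2[0]:
                g = compose(g1, g2)
                if g not in result:
                    result.add(g)
                    changed = True
    return result

def is_sct(graphs):
    """Theorem 1: every idempotent graph in the closure has an arc z -> z labeled '>'."""
    for g in closure(graphs):
        source, target, arcs = g
        if source == target and compose(g, g) == g:
            if not any(x == z and r == ">" for x, r, z in arcs):
                return False
    return True

# Ackermann-style calls: ack(m-1, ...) and ack(m, n-1).
G1 = ("ack", "ack", frozenset({("m", ">", "m")}))
G2 = ("ack", "ack", frozenset({("m", ">=", "m"), ("n", ">", "n")}))
# A call loop(x) = loop(x) that never decreases anything.
LOOP = ("loop", "loop", frozenset({("x", ">=", "x")}))

print(is_sct([G1, G2]), is_sct([LOOP]))
```

The fixed-point loop in `closure` terminates because only finitely many graphs exist over a finite set of functions and variables. The second example shows why the two classical arc labels cannot express the increasing or oscillating cases listed in Section 3.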
Definition 4 (E-Multipath). An e-multipath $M$ is a transition sequence $B_1, B_2, B_3, \ldots$ such that $target(B_i) = source(B_{i+1})$ for $i = 1, 2, \ldots$. An e-thread is a connected path of arcs in $M$ that starts from some $B_t$, $t \geq 1$:

  $th = z_{i_t} \xrightarrow{\delta_t} z_{i_{t+1}} \xrightarrow{\delta_{t+1}} z_{i_{t+2}} \xrightarrow{\delta_{t+2}} \cdots$

with each $\delta_{t+j} \in \mathbb{Z}$, $z_{i_{t+j}} \in V$, $j \in \mathbb{N}$.

Definition 5 (Composition). The composition of two transitions $B : f \to g$ and $B' : g \to h$ is $B; B' : f \to h$ with the arc set $E$ defined below. Notation: we write $x \xrightarrow{\delta_1} y \xrightarrow{\delta_2} z$ if $\delta_1, \delta_2$ are the changed values of variables in each function, and $x \xrightarrow{\delta_1} y$ and $y \xrightarrow{\delta_2} z$ are respectively arcs of $B$ and $B'$. Then

  $E = \{x \xrightarrow{\delta_1 + \delta_2} z \mid x \xrightarrow{\delta_1} y \text{ in } B,\ y \xrightarrow{\delta_2} z \text{ in } B'\}$

For nested functions, if there are an inner function $f_{in}$ and an outer function $f_{out}$, we can use the composition of ESCGs to express the function call in nested functions. There are ESCGs $B_1$ and $B_2$ associated with $f_{in}$ and $f_{out}$ respectively, and variables $r \in V_{add}$, $x \in Variables(f_{in})$ and $y \in Variables(f_{out})$. There is an arc $x \xrightarrow{\delta_1} r$ in $B_1$ and an arc $r \xrightarrow{\delta_2} y$ in $B_2$, where $\delta_1 \in \mathbb{Z}$ and $\delta_2 = 0$. We use $B = B_1; B_2$ to show the connection between the returned values and the termination of programs.

Definition 6 (M-Transition). An m-transition is a transition $B : f \to g$ which is the composition of all the transitions in a multipath $M : B_1, B_2, \ldots, B_n$, where $source(B_1) = f$, $target(B_n) = g$.

Assume an arc $x \xrightarrow{E} y'$ in m-transition $B$ comes from $x \xrightarrow{\delta_1} z_1 \xrightarrow{\delta_2} z_2 \cdots \xrightarrow{\delta_n} y'$. We should consider the changed value in arcs $z_i \xrightarrow{\Delta_i} z_i$ and $z_j \xrightarrow{\Delta_j} z_j$, which express the self-recursive function calls. All the possible self-recursive calls in the multipath should be taken into account by multiplying the changed value by the number of visits at the corresponding locations in the program.

Definition 7 (Tendency transition). Let $B : f \to g$ be an m-transition for a multipath $M : B_1, B_2, \ldots, B_n$, where $source(B_1) = f$, $target(B_n) = g$. We define $B$ with integer labeled arcs $x \xrightarrow{\delta} y$, $\delta \in \mathbb{Z}$, and the tendency transition $B^T$ of $B$ with symbol labeled arcs $x \xrightarrow{R} y$, $R \in \{+, -, =\}$.

Function $Tend$ is defined as $B \mapsto B^T$:

  $Tend(B) = Tend(f \to g) = \{x \xrightarrow{+} y' \mid x \xrightarrow{E} y',\ E > 0,\ x \in Variables(f),\ y' \in Variables(g)\}$
    $\cup\ \{x \xrightarrow{-} y' \mid x \xrightarrow{E} y',\ E < 0,\ x \in Variables(f),\ y' \in Variables(g)\}$
    $\cup\ \{x \xrightarrow{=} y' \mid x \xrightarrow{E} y',\ E = 0,\ x \in Variables(f),\ y' \in Variables(g)\}$

Function $Tend$ transforms a transition $B$ with integer labeled arcs into the tendency transition $B^T$ with symbol labeled arcs that express the ascent or descent in some data value. We compose many transitions in an e-multipath. The result of the composition is regarded as a new transition which describes the whole changed value after all the function calls. A transition $B$ is e-idempotent if $Tend(B) = Tend(B; B)$.

Definition 8 (Deterministic). Given an m-transition $B$, if there is no arc in the tendency transition $B^T$ of $B$, transition $B$ is non-deterministic. Otherwise, it is deterministic.

3.2.2. Recursive calling graph

Definition 9 (Recursive calling graph). The recursive calling graph (RCG) for a program is $G = (P, \mathcal{B}, L)$, where $P$ is the set of program points, $\mathcal{B}$ is the set of transitions, and

  $L = \{E(v) \mid v \in Variables(p),\ p \in P\}$

is a set of labels (an expression $E(v)$ is related to some variable $v$) that express the conditions of function calls.

Suppose positions $p_1, p_2 \in P$ are defined in the program. There are transitions $B_a, B_b \in \mathcal{B}$: the ESCG $B_a : f \to g$ with labeled arcs $x \xrightarrow{\delta_1} y'$ and $v \xrightarrow{\delta_2} u'$, and the ESCG $B_b : g \to f$ with labeled arcs $y \xrightarrow{\delta_3} x'$ and $u \xrightarrow{\delta_4} v'$, where $\delta_1, \delta_2, \delta_3, \delta_4 \in \mathbb{Z}$; $x, v \in Variables(f)$; $y, u \in Variables(g)$.

An example of an RCG is in Fig. 1. Nodes in the RCG are the program points. Program point 1 describes function $f$ with calling condition $L_1$. By contrast, program point 2 describes function $g$ with calling condition $L_2$. Blocks in the RCG are transitions in the program. There are transitions between the points with conditions $L_1$ and $L_2$ respectively as follows:

  $1 \to 2$: $f(x, v) \to g(y, u)$, $L_1$
  $2 \to 1$: $g(y, u) \to f(x, v)$, $L_2$

The blocks $f(x, v) \to g(y, u)$ at location $a$ and $g(y, u) \to f(x, v)$ at location $b$ are described as ESCGs.

All the calling conditions can be expressed in the form $lg \geq rg$ or $lg > rg$, where $lg, rg \in V \cup const$ ($const$ is a set of constants). We define $size(lg, rg) = abs(lg - rg)$, where $abs(E)$ is the absolute value of expression $E$.

For example, under condition $n > 0$, there is $size(n, 0) = abs(n)$, while under condition $100 \geq n$, there is $size(100, n) = abs(100 - n)$.

3.3. Termination analysis with RCG

Lemma 1. Given an RCG $G = (P, \mathcal{B}, L)$ for a program, every strongly connected component in $G$ has an e-idempotent m-transition.

Proof. Every strongly connected component $C$ in $G$ can be expressed as a sequence of nodes $P_i, P_{i+1}, \ldots, P_N, P_i$, where each $P_j \in P$. We can get an e-multipath $M$ from $C$: $f \to g_1 \to g_2 \to \cdots \to g_N \to f$, where $f \in P_i$, $g_j \in P_{i+j}$, $j \in [0..N]$. The m-transition of $M$ is $B : f \to f'$. An arc $x \xrightarrow{\delta} x'$, where $x, x' \in Variables(f)$, can be constructed in $B$.
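The composition of Definition 5, the function Tend of Definition 7, the e-idempotence condition and Definition 8 can be illustrated with a small Python sketch. The tuple encoding of an ESCG, the function names and the two example transitions (`DEC` and `SWAP`) are hypothetical and chosen only for illustration. The sketch does not model the multiplication by visit counts described after Definition 6.

```python
from itertools import product

# An ESCG is encoded (hypothetically) as (source, target, frozenset of arcs),
# each arc being (x, delta, y) with an integer changed value delta.

def compose_escg(b1, b2):
    """Definition 5: arcs x -(d1 + d2)-> z for x -d1-> y in B and y -d2-> z in B'."""
    (f, mid1, arcs1), (mid2, h, arcs2) = b1, b2
    assert mid1 == mid2, "target of B must be the source of B'"
    arcs = frozenset(
        (x, d1 + d2, z)
        for (x, d1, y1), (y2, d2, z) in product(arcs1, arcs2)
        if y1 == y2
    )
    return (f, h, arcs)

def tend(b):
    """Definition 7: replace each integer label by its tendency '+', '-' or '='."""
    source, target, arcs = b
    def symbol(delta):
        return "+" if delta > 0 else "-" if delta < 0 else "="
    return (source, target, frozenset((x, symbol(d), y) for x, d, y in arcs))

def is_e_idempotent(b):
    """B is e-idempotent if Tend(B) = Tend(B; B)."""
    return tend(b) == tend(compose_escg(b, b))

def is_deterministic(b):
    """Definition 8: B is deterministic iff its tendency transition has an arc."""
    return len(tend(b)[2]) > 0

# x decreases by 1 on each self-call while a counter c increases by 1.
DEC = ("f", "f", frozenset({("x", -1, "x"), ("c", 1, "c")}))
# x and y are exchanged, with opposite changes of value.
SWAP = ("f", "f", frozenset({("x", 1, "y"), ("y", -1, "x")}))

print(is_e_idempotent(DEC), is_e_idempotent(SWAP))
```

In this sketch `DEC` keeps the same tendencies after composing with itself, whereas composing `SWAP` with itself yields arcs labeled 0 from each variable to itself, so its tendency transition changes and it is not e-idempotent.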