Processing an Intermediate Representation Written in Lisp
Ricardo Peña, Santiago Saavedra, Jaime Sánchez-Hernández
Complutense University of Madrid, Spain∗

In the context of designing the verification platform CAVI-ART, we needed to decide on a textual format for our intermediate representation of programs. After considering several options, we chose S-expressions for that textual representation, and Common Lisp for processing it in order to obtain the verification conditions. In this paper, we discuss the benefits of this decision. S-expressions are homoiconic, i.e. they can be regarded both as data and as code. We exploit this duality, and make extensive use of the facilities of the Common Lisp environment to process these textual representations in different ways. In particular, using a common compilation scheme, we show that program execution and verification condition generation can be seen as two instantiations of the same generic process.

1 Introduction

Building a verification platform such as CAVI-ART [7] is a challenging endeavour from many points of view. Some of the tasks require intensive research that can advance the state of the art, as is the case, for instance, of automatic invariant synthesis. Others are not so challenging, but require a lot of pondering and discussion before arriving at a decision, in order to ensure that all the requirements have been taken into account. A bad or hasty decision may have a profound influence on the rest of the activities, making them more difficult, longer, or less efficient. This is the case of the problem presented in this paper: deciding a textual representation for an otherwise internal representation of the platform.

The CAVI-ART platform accepts programs written in a variety of languages, both imperative and functional, and it could potentially be extended to more.
A carefully chosen intermediate representation common to all of them ensures that most of the platform activities can be performed in a language-independent way. The CAVI-ART Intermediate Representation (IR in what follows), introduced in [7], is an internal programming language into which all the source languages supported by CAVI-ART are transformed. The language-independent activities of the platform include termination analysis, invariant synthesis, verification condition generation, verification condition proving, test case generation, and others. As we will show, the IR is the confluence point of all the tools: each tool produces, consumes, or transforms the IR in some way. As the tools are implemented in a variety of languages, we decided that they should communicate with each other through files. This makes the tools more independent of each other, and also gives visibility to the intermediate results.

So, we arrived at the point of deciding the format of these files containing IR-transformed programs. As we will see, it has not been an easy decision, given the somewhat contradictory requirements we have for these files. The final decision has been to represent the IR programs as Common Lisp S-expressions [5]. This satisfies all our requirements and also gives us a new possibility for free: that of being able to execute the IR. Additionally, the Common Lisp environment provides a set of facilities, such as macro-expansion, evaluation while compiling, and others, that can make the processing of the IR files a very systematic and uniform task, regardless of whether we want to execute them or to process them in any other way.

The plan of the paper is as follows. In Sec. 2 we explain the different uses of the IR and all the requirements that these uses impose on the IR files. In Sec. 3 we justify the decision of using Lisp S-expressions. In Sec. 4 we introduce the facilities of the Common Lisp environment and present a common scheme for all the kinds of processing one may wish to apply to IR files; as two instantiations of this scheme, we show how to execute an IR file, and how to extract the verification conditions for proving the program correct. A running example is used throughout the paper, and in Sec. 5 we show the resulting goals of the second instantiation of the scheme. Finally, Sec. 6 concludes.

∗ Work partially funded by the Spanish Ministry of Economy and Competitiveness, under the grant TIN2013-44742-C4-3-R.
Contribution to: PROLE 2016. © R. Peña, S. Saavedra, J. Sánchez-Hernández

\[
\begin{array}{rcll}
a & ::= & c & \{\text{constant}\} \\
  & \mid & x & \{\text{variable}\} \\
be & ::= & a & \{\text{atomic expression}\} \\
  & \mid & f\ \overline{a_i} & \{\text{function/primitive operator application}\} \\
  & \mid & \langle \overline{a_i} \rangle & \{\text{tuple construction}\} \\
  & \mid & C\ \overline{a_i} & \{\text{constructor application}\} \\
e & ::= & be & \{\text{binding expression}\} \\
  & \mid & \mathbf{let}\ \langle \overline{x_i :: \tau_i} \rangle = be\ \mathbf{in}\ e & \{\text{sequential let; the left part of the binding can be a tuple}\} \\
  & \mid & \mathbf{letfun}\ \overline{def_i}\ \mathbf{in}\ e & \{\text{recursive let for function definitions}\} \\
  & \mid & \mathbf{case}\ a\ \mathbf{of}\ \overline{alt_i}\,[;\ \_ \to e] & \{\text{case distinction with optional default branch}\} \\
def & ::= & f\ (\overline{x_i :: \tau_i}) :: \overline{y_i :: \tau_i} = e & \{\text{function definition; output results are named}\} \\
alt & ::= & C\ \overline{x_i :: \tau_i} \to e & \{\text{case branch}\} \\
\tau & ::= & \alpha & \{\text{type variable}\} \\
  & \mid & T\ \overline{\tau_i} & \{\text{type constructor application}\}
\end{array}
\]

Figure 1: CAVI-ART IR syntax

2 Contents and Uses of the Intermediate Representation

The IR abstract syntax is reproduced in Fig. 1, where the vector notation $\overline{z}$ is an abbreviation for $z_1, \ldots, z_n$. One of its aims was to achieve minimality, as it was clear to us that the more IR constructions there were, the longer, more difficult, and more error-prone the tasks cited above would be. In this sense, iteration and recursion are unified into a single construction, letfun, which allows sets of mutually recursive functions to be defined. The imperative assignments are translated into sequential let expressions, and an SSA¹ transformation ensures that no variable is assigned more than once.
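The translation of imperative assignments into sequential let expressions under SSA renaming can be sketched as follows. This is only an illustrative Python sketch under stated assumptions, not the CAVI-ART implementation: the function name to_ssa_lets, the tuple encoding ("let", name, rhs, body), and the numbering scheme for fresh names are hypothetical, the type annotations of the IR are omitted for brevity, and generated names are assumed not to clash with existing variables.

```python
from typing import Dict, List, Tuple, Union

Atom = Union[str, int]                  # a variable name or a literal constant
Assign = Tuple[str, str, List[Atom]]    # target := op(args), where args are atoms


def to_ssa_lets(assigns: List[Assign], result: str):
    """Turn straight-line assignments into nested single-assignment lets.

    Hypothetical encoding: ("let", name, (op, arg, ...), body).
    """
    version: Dict[str, int] = {}        # number of assignments seen per variable
    current: Dict[str, str] = {}        # source variable -> its latest SSA name

    def rename(a: Atom) -> Atom:
        # Literals and never-assigned variables (e.g. parameters) stay unchanged.
        return current.get(a, a) if isinstance(a, str) else a

    bindings = []
    for target, op, args in assigns:
        # Rename the right-hand side first, so it reads the previous versions.
        rhs = (op, *[rename(a) for a in args])
        version[target] = version.get(target, 0) + 1
        fresh = f"{target}{version[target]}"
        current[target] = fresh
        bindings.append((fresh, rhs))

    # Build the nested lets from the inside out: the last assignment is innermost.
    body = rename(result)
    for name, rhs in reversed(bindings):
        body = ("let", name, rhs, body)
    return body


if __name__ == "__main__":
    # p = p - 1; p = p + 2; return p
    print(to_ssa_lets([("p", "-", ["p", 1]), ("p", "+", ["p", 2])], "p"))
```

Each source variable gets a new name at every assignment, so the resulting chain of lets binds every name exactly once, which is the single-assignment property mentioned above.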
All conditional statements, of both imperative and functional languages, are transformed into case expressions. The rest of the IR consists of function and constructor applications, tuples, and atoms. The IR is a functional language, so there is nothing like destructive update, output function arguments, or arguments passed by reference. When a function returns more than one result, it returns a tuple. An atom is either a literal constant or a variable, and atoms are mandatory in several places of the IR, for instance as the actual arguments of an application. These restrictions simplify the activities related to verification condition generation and proving. For a broader motivation of the decisions leading to the IR, see [7].

¹ Static Single Assignment. See, for instance, [1, Chap. 19].

(a) Original C++ version

    void qsort(int v[], int i, int j) {
      // Pre: 0 <= i <= j < length(v)
      int p;
      if (i < j) {
        partition(v, i, j, p);
        qsort(v, i, p-1);
        qsort(v, p+1, j);
      }
      // Post: sorted_sub(vres, i, j+1) &&
      //       permut_sub(v, vres, i, j+1)
    }

    void quickSort(int v[], int n) {
      // Pre: n = length(v)
      qsort(v, 0, n-1);
      // Post: sorted(vres) &&
      //       permut_all(v, vres)
    }

(b) Transformed to the CAVI-ART IR

    quicksort (v :: array int, n :: int) :: (vres :: array int) =
      {Q : n = length(v)}
      {R : sorted(vres) ∧ permut_all(v, vres)}
      letfun
        qsort (v :: array int, i :: int, j :: int) :: (vres :: array int) =
          {Q1 : 0 ≤ i ≤ j < length(v)}
          {R1 : sorted_sub(vres, i, j + 1) ∧ permut_sub(v, vres, i, j + 1)}
          let (b :: bool) = <(i, j) in
          case b of
            (true :: bool) → f1(v, i, j)
            (false :: bool) → v
        f1 (v :: array int, i :: int, j :: int) :: (vres :: array int) =
          let (v1 :: array int, p :: int) = partition(v, i, j) in
          let (p1 :: int) = -(p, 1) in
          let (v2 :: array int) = qsort(v1, i, p1) in
          let (p2 :: int) = +(p, 1) in
          qsort(v2, p2, j)
      in
        let (n1 :: int) = -(n, 1) in
        qsort(v, 0, n1)

Figure 2: The quickSort algorithm we will use as running example

We will use the quicksort algorithm as a running example in this paper. Fig. 2a shows a source version in C++. Once transformed into the IR, it takes the form shown in Fig. 2b. Preconditions and postconditions are logic formulas which may use atomic predicates introduced in externally axiomatized theories. In the example, we use the predicates sorted, permut for checking the corresponding properties on the full length of vectors, and sorted_sub, permut_sub for reasoning about subvectors. These predicates are part of Why3 [3], which is the underlying platform that CAVI-ART uses for discharging formulas.

The IR is strongly typed in a polymorphic Hindley–Milner-like type system. After the transformation from the source language (either by translating the user-given types, or by inferring them), the IR is assumed to be annotated with types. In Fig. 1 it can be seen that types are included at all the defining occurrences of variables, i.e. in the formal arguments and results of function definitions, in the let-bound variables, and in the case-bound variables introduced by pattern-matching. The syntax of types is also shown in Fig.