<<

Lemma Functions for Frama-: C Programs as Proofs

Grigoriy Volkov Mikhail Mandrykin Denis Efremov Faculty of Science Software Engineering Department Faculty of National Research University Ivannikov Institute for System Programming of the National Research University Higher School of Economics Russian Academy of Sciences Higher School of Economics Moscow, Russia Moscow, Russia Moscow, Russia [email protected] [email protected] [email protected]

Abstract—This paper describes the development of to additionally specify loop invariants and termination an auto-active verification technique in the Frama-C expression (loop variants) in case the under framework. We outline the lemma functions method analysis contains loops. Then the verification instrument and present the corresponding ACSL extension, its implementation in Frama-C, and evaluation on a set generates verification conditions (VCs), which can be of string-manipulating functions from the Linux kernel. separated into categories: safety and behavioral ones. We illustrate the benefits our approach can bring con- Safety VCs are responsible for runtime error checks for cerning the effort required to prove lemmas, compared known (modeled) types, while behavioral VCs represent to the approach based on interactive provers such as the functional correctness checks. All VCs need to be . Current limitations of the method and its imple- mentation are discussed. discharged in order for the function to be considered fully Index Terms—formal verification, deductive verifi- proved (totally correct). This can be achieved either by cation, Frama-C, auto-active verification, lemma func- manually proving all VCs discharged with an interactive tions, Linux kernel prover (i. e., Coq, Isabelle/HOL or PVS) or with the help of automatic provers. I. Introduction It is well known that most problems related to verifying the conformance between some code in basically any Deductive verification is one of the most “heavyweight” practical programming language and its formal specification static techniques of formal reasoning about software prop- in any sufficiently expressive specification language are erties. It requires manual annotation of source code with undecidable in general. The situation when an automatic formal specifications and manual or semi-automatic proof prover cannot discharge a verification condition is common. of conformance between code and specifications. Deductive This may be due to various reasons: limited time or verification is applied in areas where the cost of error can memory resources for the prover, an error in the code, be too high. The main benefit of this approach is that an incorrect formal specification, an incomplete formal it can mathematically guarantee that code is free from specification, an incapacity of the prover to finish the proof. particular error classes. The specifications debugging is hard. Despite the modern In the toolsets based on the Hoare logic (such as Frama- advances []–[] in the field, the whole situation is still C[], VeriFast [], Dafny [], GNATprove []), in order to poor when specifications become complex. verify the correctness of a function, the verification engineer As it can be seen the whole process of deductive verifica-

arXiv:1811.05879v1 [cs.SE] 14 Nov 2018 is required to define a precondition — a predicate that tion requires a significant manual effort from a verification should hold in order for the following code to be functionally engineer. This implies that practically all verification correct and contain no runtime errors (i. e., models of tools have some capabilities for user interaction beyond undefined behavior that are known to the verification tool), simply supplying the program and its specification. This a postcondition — a predicate that should hold after a additional user interaction required to eventually prove function execution and describe what this code should act the correspondence between the code and its specification like to be functionally correct. Pre- and postconditions is usually called the proof overhead. This overhead can form the function’s contract. Frame-conditions or effects significantly depend on the way the user interaction is are sometimes viewed as a separate entity in the contract, accomplished in a particular verification tool. Even though but typically they present a type of postcondition that these ways can generally be very diverse, for simplicity we restrict the memory state of a program. A formal contract dare to speculatively group them into the following three is required to reason about the function’s correctness. major categories: Given a formal contract, an engineer can try to prove that ∙ External provers. The vast majority of verification a function conforms to its specification. To do this one needs tools support this approach to user interaction at some point, be it as a part of a standard workflow or the project is to develop the correct and detailed ACSL [] as a last resort. The essence is that the verification specifications and fully prove these functions with the conditions (VCs) produced by the tool are translated Frama-C framework. Thus, evaluating the toolset (Frama- into the native language of some interactive proof C + AstraVer Plugin (a fork of Jessie) + Why + Solvers) assistant, e. g., Coq, Isabelle/HOL or PVS. Then the on the real code (not artificial examples) and highlighting resulting theorems are proved using the capabilities of C idioms that are hard to describe with ACSL specification, the prover, sometimes with some extensions provided not possible to model with the deductive verification plugin, by the verification tool, e. g., custom tactics. This or just giving the right direction for the tools development. approach is often both simple and efficient, but for ACSL is a Behavioral Interface Specification Language some use cases, it has significant drawbacks such as (BISL) [] implemented in Frama-C. At the same time, the ones we describe in this paper. many higher-level formalisms for efficient specification ∙ Interactive proof editors. Some verification tools such and verification of heap-manipulating or parallel programs as WP plugin for Frama-C, Why verification plat- such as dynamic frames, separation logic, permissions, form or KIV [] verification system go further and rely/guarantee reasoning (e. g., ownership methodology), implement their own custom interactive proof manage- specification refinement (e. g., function contract inclusion) ment facilities such as tactics, transformations, and or auto-active verification are not included in ACSL. strategies. This is a hard yet powerful approach with Therefore ACSL is not tightly bound to any particular its own advantages and shortcomings. One of its major verification methodology and support for its features can disadvantages especially relevant to our particular case significantly vary in tools. is a high implementation overhead, which prevents tentative adoption and quick preliminary evaluation A. Verification Approach of the results. Specifications development is an iterative process. It ∙ Auto-active proofs. One of the possible techniques cannot be broadly described in the general case. Every to reduce and automate the manual work and make requires its own approach []–[] with separate deductive verification more suitable for such meth- axiomatizations fitting each particular case and implemen- ods as continuous verification [ ]–[] is auto-active tation. A common way to write formal specifications is verification. The general idea is to express the proofs to factor out the most complex logical statements into in the same input language as the program and its theorems and lemmas so that the provers can discharge specification themselves relying on the Curry-Howard the rest of the verification conditions automatically. The correspondence between the programs and the proofs. theorems and lemmas are then proved manually with an This approach is quite widely applied in particular interactive prover. The main benefits of this approach are in such verification tools as Dafny [], VeriFast [], the following: GNATprove [], and Why []. This is the approach ∙ Lemmas are formulated in a more general way so that we pursue and evaluate in this paper. that they contain less implementation context specific In this paper, we present our experience with integra- details (e. g., the assertion in a particular function). tion of support for auto-active verification in ACSL [], This eases the process of their manual proving and Frama-C, and AstraVer deductive verification tool and allows one to reuse them in multiple proofs of VCs applying this approach to our existing set of specified and other lemmas. string-manipulating functions from the Linux kernel. We ∙ In this case, lemmas are “aside from code”, thus demonstrate that integration of auto-active proofs to ACSL making formal specifications more maintainable and and Frama-C is both beneficial and promising approach to tolerable to small code changes because they rarely automated deductive verification that significantly reduces provoke changes in general statements. manual proof overhead. The VerKer project adheres to this approach in the This paper is organized as follows. In Section II we give ACSL specifications. The project currently contains  an overview of the VerKer project and its specifications fully proved functions from Linux kernel library folder, and describe the problems we are solving by applying auto-  functions among them are string related (i. e., have active verification to this project. Next, in Section III we the str* prefix and perform operations on strings), and present the implementation of the technique in the Frama-  are memory related (i. e., have mem* prefix). Only C framework and the AstraVer plugin. In Sections IV andV functions (see Table I) contain in its formal specification a we describe the obtained result and discuss future work on logical function with lemmas formulated on it. The formal open issues. contracts are written in a way to prove correspondence between logical functions and C functions. The lemmas are II. VerKer Project excessive, and provers sometimes can automatically prove the correctness of a lemma based on other lemmas. The VerKer project [] consists of standard library In the paper [] it is claimed that the functions are functions taken from the Linux kernel. The main idea of fully-proved but lemmas are not proved (only automatic Function Number of Proved Unproved provers, e. g., SMT-solvers since the VCs should remain Name Lemmas Automatically check_bytes   manageable by different tools that apply different strlen    tactics or decision procedures. In particular, while skip_spaces    SMT solvers efficiently handle large ground (quantifier- strchr    strchrnul    free) formulas modulo theories, most interactive tools strspn    are much more suitable to handle smaller formulas in strcspn    higher-order logic. strnlen    strpbrk    ∙ External proofs introduce additional problems of proof Total    location and sharing when the same lemma is included or imported in several source code files. TABLE I Lemmas in the VerKer project checks for contradiction was performed). The proving of III. Auto-active Proofs all lemmas in the project considered as a future work in the paper.

B. Shortcomings and limitations According to its original definition in [], auto-active verification is any verification technique where user input Thus we can found the following limitations of the is supplied before VC generation. The user input can approach based on external provers: be given in various forms including but not limited to: ∙ The use of external interactive proof assistants or auxiliary annotations such as verifier pragmas, triggers, proof editors requires the users to get accustomed to ad-hoc lemma instantiations, ghost code, and lemma at least two different specification representations, cor- functions. Among those, lemma functions are a primary tool responding languages and tools. This makes it harder intended to convey proofs of user-defined auxiliary lemmas. for the users to employ their skills acquired during The proofs are expressed as ordinary pure (i. e., without the process of specification within one verification side-effects) total (i. e., always successfully terminating) environment (e. g., the deductive verification plugin) imperative functions. Though the execution of pure total later in the process of proof within the other one imperative code does not produce any observable effects (e. g., the Coq proof assistant). In particular, the range with respect to the programming language semantics, when of supported language constructs can significantly dif- interpreted by the verification tool the instructions of the fer. Many ACSL constructs such as pointer dereference code can produce a substantial effect on its internal state do not have direct counterparts in the language of the or the internal state of the automatic theorem prover. In interactive prover. particular, through the process of VCs generation, the ∙ The reproduction of external proofs requires the use of proof-carrying code can determine the fragment of current external tools that work on their own representations proof context that is explicitly known to the verifier or the — yet the translations performed by the verification prover versus the fragment that is only known implicitly as tool in order to obtain the external representations can a set of derivable inferences. As most of the logic relevant cause various surprising instabilities, where the proof to deductive verification is generally undecidable, neither suddenly fails after small supposedly irrelevant changes the verifier nor the prover can ever explore all propositions are made to program source code or specifications. derivable in the current context. The code of the lemma This is especially well demonstrated by the pervasive functions thus serves as a guide for the verification tool problem of generating unique, stable and predictable that efficiently indicates the relevant parts of the search names for various program and specification entities space. One of the most significant cases where verification such as variables and lemmas. For instance, when tools need such a guide is with inductive proofs as they a lemma or variable is renamed, the proofs have to have very limited support in automatic provers. be adjusted accordingly, yet the relation between the names used in the original program and the names used in the proofs can be quite complex. Consider the following inductive proof expressed as a ∙ The external proofs are kept separately from the lemma function: specifications and are therefore not directly available to the reader of the verified program. This can make it harder to estimate the effort needed for modification of logic char * strchrnul(char * 푠, char 푐) = certain parts of the program or specification or to keep *푠 ≡ 푐 ? 푠 : the proofs in sync with the program and specification. *푠 ≡ ′∖0′ ? 푠 : ∙ The use of external interactive provers prevents an ap- 푠푡푟푐ℎ푟푛푢푙(푠 + 1, 푐); plication of various encodings optimized for automatic ... /*@ ghost lemma function into ACSL in particular. We first present @ /@ lemma our existing approach and current results that we obtained @@ requires 푣푎푙푖푑_푠푡푟(푠); by using it, and then we discuss some of its limitations and @@ decreases 푠푡푟푙푒푛(푠); possible directions of future developments. @@ ensures 푠 ≤ 푠푡푟푐ℎ푟푛푢푙(푠, 푐) ≤ 푠푡푟 + 푠푡푟푙푒푛(푠); @ @/ A. Integrating Lemma Functions Into ACSL/Frama-C @ void 푠푡푟푐ℎ푟푛푢푙_푖푛_푟푎푛푔푒(char * 푠, char 푐) In our fork of the Frama-C platform developed as @{ part of the AstraVer toolset, we implemented relatively @ if (*푠 ! = ′∖0′ && *푠 ! = 푐) simple support for lemma functions as a special case of @ 푠푡푟푐ℎ푟푛푢푙_푖푛_푟푎푛푔푒(푠 + 1, 푐); total pure ghost functions. Both ACSL specification and @} its implementation in mainline Frama-C provide support @*/ for ghost functions. Meanwhile, the AstraVer plugin is The lemma and its proof are expressed within a single able to generate VCs to check ACSL assigns, allocates, recursive function definition 푠푡푟푐ℎ푟푛푢푙_푖푛_푟푎푛푔푒. The decreases and terminates clauses of function contracts. induction here is simple as it exactly follows the recursive So the only missing part to provide basic support for lemma definition of the logic function 푠푡푟푐ℎ푛푢푙. The pre-condition functions was the automatic generation of lemmas and their of the function requires the parameter string 푠 to be a implicit importing in all the code functions located further valid C string so that it has a well-defined finite length. in the source code. Then the if statement in the function body splits between More concretely, in order to allow lemma functions to the base and inductive cases. In the base cases *푠 == ′∖0′ be used in a similar way to regular ACSL lemmas (which and *푠 == 푐 the post-condition trivially follows from the can only be written in logic), we introduced a new lemma definition of the function 푠푡푟푐ℎ푟푛푢푙. In the inductive case, keyword for C function specifications. which corresponds to the body of the if statement, the Our modified version of Frama-C processes it in the post-condition follows from the inductive assumption for following way: strings of a smaller length and an additional instance ∙ For a function contract of the form of a lemma about the length of the string 푠 + 1 that /@ requires 푅(푝1, . . . , 푝푛, 푔1, . . . , 푔푛); is implicitly present in the proof context (this lemma is @ ensures 퐸(푝1, . . . , 푝푛, 푔1, . . . , 푔푛, ∖푟푒푠푢푙푡); actually also proved by a lemma function that is provided @/, earlier in the code). The recursion is well-founded since in where 푝1, . . . , 푝푛 are function parameters, 푔1, . . . , 푔푛 the only recursive call the value of the variant 푠푡푟푙푒푛(푠) are global variables and the special variable ∖푟푒푠푢푙푡 de- specified in the decreases clause is strictly smaller than notes the return value of the function, a corresponding that in the caller function. With the AstraVer plugin, the axiom is generated, that is, an axiom of the form correctness of the function 푠푡푟푐ℎ푟푛푢푙_푖푛_푟푎푛푔푒 can be ∀typeof(푝1) 푝1,..., typeof(푝푛) 푝푛, verified automatically by discharging all the VCs using an typeof(푔1) 푔1,..., typeof(푔푛) 푔푛; SMT solver such as Alt-Ergo or CVC. The automatic ∃typeof(∖푟푒푠푢푙푡) ∖푟푒푠푢푙푡; proofs require no induction and only a few quantifier 푅(푝1, . . . , 푝푛, 푔1, . . . , 푔푛) =⇒ instantiations, while all the necessary terms are explicitly 퐸(푝1, . . . , 푝푛, 푔1, . . . , 푔푛, ∖푟푒푠푢푙푡); present in the VCs. So in either solver, the verification ∙ All logic definitions such as functions, predicates, ax- takes far less than a second. ioms, and lemmas in ACSL are grouped into axiomatic Now since the function 푠푡푟푐ℎ푟푛푢푙_푖푛_푟푎푛푔푒 is proved blocks. The generated axiom is placed in a new ax- correct for arbitrary values of its parameters, and since iomatic block. AstraVer implements on-demand import the function is pure and total, the statement about its of all the definitions from axiomatic blocks based on correctness with respect to the contract can be expressed occurrences of symbols defined in the axiomatic. As as the following lemma: the axiom name cannot be directly used in an ACSL lemma 푠푡푟푐ℎ푟푛푢푙_푖푛_푟푎푛푔푒 : specification, the axiom is not included in any proof ∀char * 푠, char 푐; context until some other symbol in this new axiomatic 푣푎푙푖푑_푠푡푟(푠) =⇒ is defined and later used in another specification (code 푠 ≤ 푠푡푟푐ℎ푟푛푢푙(푠, 푐) ≤ 푠푡푟 + 푠푡푟푙푒푛(푠); function or another axiomatic block). In particular, This lemma can be automatically generated from the this approach removes the axiom from the context of function contract and used in the proofs later in the the lemma function itself, preventing the axiom from program code, both explicitly by the user by calling trivially proving the function’s verification conditions. the function 푠푡푟푐ℎ푟푛푢푙_푖푛_푟푎푛푔푒 with some concrete ∙ A dummy identically true predicate with a unique arguments or implicitly by the SMT solver by instantiating name is generated, and its definition is added into the the lemma with some concrete terms. new axiomatic block. This is the basic idea of lemma functions and their use. ∙ A precondition requiring the dummy predicate from Now let us consider our approach to the integration of the new axiomatic block is inserted into the contracts of all function located after the currently processed Function Number of Number of Name Ghost Lemma lemma function, which causes the definitions from the Functions Functions axiomatic block (including the generated axiom) to strlen   be imported into the proof context of these functions. skip_spaces   strchr  As the dummy predicate is generated by the tool and strchrnul  has a unique name, the user cannot access it and thus strspn   create any extra dependencies disrupting the ordering strcspn  strnlen   of lemma proofs according to their code locations. strpbrk  ∙ The lemma function itself gets some additional clauses Total  inserted into its contract specification: TABLE II – assigns ∖푛표푡ℎ푖푛푔; allocates ∖푛표푡ℎ푖푛푔 — to en- Lemma functions sure the function is pure (i. e., it does not touch any global state); custom tactics could potentially result in a much smaller – terminates ∖푡푟푢푒; – to prevent abrupt termination number. (i. e., calls to exit(), abort() etc.) — loop and recursion termination is already required by default However, in the context of our project, the use of lemma in the AstraVer plugin. functions became a decisive choice that enabled complete, An alternative approach enforcing the topological or- practical, stable and future-proof verification of auxiliary dering of lemma functions has been considered: explicit lemmas with minimal implementation effort. The ability to removal of the axiom from the context of the functions instantiate lemmas by explicitly calling the corresponding located before the lemma body. This is possible to achieve lemma functions and the availability of all general C using the meta remove_prop pragma in the generated constructs in the proof code provided the verification WhyML model. However, this would have required Frama- engineer with sufficient control. By these means in our C to track the connection between the function and the examples we were able to reduce the verification time of axiom explicitly, and the AstraVer plugin to use that each lemma function (cumulative time of the solvers on all connection to generate the meta statement in the WhyML function’s VCs) to a few seconds. output. A minimal implementation that fully translates lemma functions into existing specification entities was chosen in the end. V. Open Issues and Future Work

IV. Results A. Logic types and expressions in code Even though at the previous stage of out project [] we ended up with  unproved lemmas, we did not attempt to Unlike many specification languages originally developed match our lemma function proofs at this stage exactly with with auto-active verification in mind, ACSL maintains a the lemmas used at the previous stage. Instead, the goal was clear distinction between code and specification languages. to ensure all the code can be verified without resorting to In particular, C and logic types as well as C and logic any additional assumptions by completely replacing all the expressions are represented entirely separately within the lemmas involved with a minimum required set of lemma Frama-C kernel and are only allowed in their respective functions. Thus we ended up with  lemma functions positions. Therefore lemma functions cannot be directly required to fully prove the VCs for some code functions as used to prove the lemmas with quantification over logic presented in Table II. types. This limitation can be overcome in individual cases of Unfortunately, a thorough comparison of the proof quantified logic variables ranging over the domains bounded overhead between the interactive and auto-active proofs by C values. turned out to be very hard. The current support for interactive proofs that existed in AstraVer plugin was quite For universal quantification of the form basic, which resulted in considerably large manual proofs. The reduction of these proofs seems only possible with very ∀ 푣; 푣 ≤ 푣 < 푣 =⇒ 푃 (푣), significant implementation effort including implementation L 1 2 of relevant custom tactics and optimizing the VC generation for manual proofs. For the reference, we provide an where 푣1 and 푣2 are C values of type C a proof can be illustrative comparison of an interactive auto-active proof of accomplished through the use of an auxiliary lemma a single lemma in Figure  as it can be performed with the current implementation. This example shows the reduction aux: ∀ C 푣; 푣1 ≤ 푣 < 푣2 =⇒ 푃 (푣), in proof overhead of more than tenfold, yet a different more optimized manual proof augmented with appropriate which can then be generalized using a dummy proof of the ⎫  Theorem Strchr_skipped : ⎪  ∀ voidP_str___alloc_table_at_L: alloc_table voidP, ⎪ ∀ charP_charM_str___at_L : map (pointer voidP) Int.t, ⎪  ⎪  ∀ str__ : pointer voidP, ⎪  ∀ c__ : Int . t, ⎪ ∀ i__ : Uint. t, ⎪  ⎪  valid_str ⎪  str__ ⎪ voidP_str___alloc_table_at_L ⎪ ⎪  charP_charM_str___at_L ∧ ⎪  strchr ⎪

⎪   str__ ⎪  c__ ⎪ ⎪  charP_charM_str___at_L ≠ (null : pointer voidP) ∧ ⎪  Uint. infix_lseq (Uint. of_int %Z) i__ ∧ ⎪  (Uint. to_int i__ < sub_pointer ⎬⎪  (strchr  str__ ⎪  c__ ⎪  charP_charM_str___at_L) ⎪ str__))%Z → ⎪  ⎪

 get ⎪ Generated by Why ⎪  charP_charM_str___at_L ⎪  (shift str__ (Uint.to_int i__)) ≠ c__. ⎪  (* Why intros ⎪ ⎪  voidP_str___alloc_table_at_L ⎪  charP_charM_str___at_L ⎪  str__ c__ i__ ⎪ ⎪  (h,(h ,(h ,(h ,h)))).*) ⎪  intros ⎪  /*@ ghost  voidP_str___alloc_table_at_L ⎪  @ /@ lemma ⎪  charP_charM_str___at_L ⎪  @@ requires 푣푎푙푖푑_푠푡푟(푠푡푟);  str_ c__ i_ ⎭⎪  @@ requires 푠푡푟푐ℎ푟(푠푡푟, 푐) ≠ ∖푛푢푙푙;  (h,( h,(h,(h,h)))).  @@ requires 0 ≤ 푖 < 푠푡푟푐ℎ푟(푠푡푟, 푐) − 푠푡푟;   @@ decreases 푖;  DefinitionP (i : int ) := ⎫  @@ ensures 푠푡푟[푖] ≠ 푐;  ∀ a : alloc_table voidP, ⎪  ∀ m : map (pointer voidP) Int . t, ⎪  @ @/ ∀ ⎪ @ void strchr_skipped(char ∗ s t r , char c,  s : pointer voidP, ⎪  ∀ c : Int . t, ⎪  @ s i z e _ t i )  valid_strsam ∧ ⎪  @{ ̸ ∧ ⎪  strchrscm =(null : pointer voidP) ⎪  @ if (i> && ∗ s t r !=’\ ’ && ∗ s t r != c )  ( ≤ i )%Z ∧ ⎪  @ strchr_skipped(str +  , c , i −  );  (i < sub_pointer (strchrscm ) s)%Z → ⎪ ⎪  @}  getm (shiftsi ) ≠ c. ⎪  specialize Wf_Z.natlike_rec with (P := P) as Ind. ⎪  @∗/ ⎪  assert (P ) as Base. ⎪  unfoldP . ⎫ ⎪ ⎪ ⎪  introsamsc Valid . ⎪ ⎪  unfold not. ⎪ ⎪ ⎬ ⎪ ... lines ⎪ ⎪ ⎪  rewrite First in Valid . ⎪  ⎪  rewrite Sub_pointer_self in Valid . ⎭⎪ ⎬⎪  omega. user proof ∀ → →  assert ( z : int ,( < z)%Z P (Z.predz ) Pz ) as⎫ Step. ⎪  introsz Gt Pred. ⎪ unfoldP . ⎪ ⎪  ⎪ ⎪ lines of  introsamsc Valid . ⎬⎪ ⎪

⎪ 

lines ⎪ ... ⎪ ⎪ ⎪  apply Strchr_same_block with ⎪  ⎪ ⎪ ⎪  (voidP_s___alloc_table_at_L := a). ⎭ ⎪  tauto. ⎪  assert (∀ z : int ,( ≤ z)%Z → Pz ) as Lemma. ⎪ ⎪  apply Ind. ,: tauto. ⎪  unfoldP in Lemma . ⎪  apply Lemma with (a := voidP_str___alloc_table_at_L). ⎪ ⎫ ⎪  unfold Uint. infix_lseq inh . ⎪  rewrite Uint. Of_int inh . ⎬⎪ ⎪ ⎪ ... lines ⎪ ⎪  unfold Uint. in_bounds. omega. ⎭⎪  ⎪  tauto. ⎭  Qed. Fig. . Proofs of lemma strchr_skipped. Coq proof on the left. Auto-active proof with lemma function on the right. form grouped into axiomatic blocks. Ultimately, instead of /@ lemma choosing between the declaration and the definition points @ ensures ∀ L 푣; 푣1 ≤ 푣 < 푣2 =⇒ 푃 (푣); to be used for topological ordering, we propose including @/ the lemma function definitions (bodies) into the axiomatic void gen() { blocks directly. The ordering then can be retained as in /@ loop invariant 푣1 ≤ 푖 ≤ 푣2; the current implementation. However, this would be a @ loop invariant ∀ C 푗; 푣1 ≤ 푗 < 푖 =⇒ 푃 (푗); significant change to the ACSL specification. @/ for (C i = v; i < v; i++) C. Proof reuse and higher-order logic 푎푢푥(푖); Careful consideration of the resulting auto-active proofs } from our study reveals that many of them are very For logic expressions 푙(푢) ranging over the C values similar to each other and represent instances of some more 푣 ≤ 푙(푢) < 푣 , where 푢 are C expressions, a proxy C 1 2 general proof patterns. In general, it is natural to express function can be introduced: such more general lemmas and their proofs using higher- /*@ ensures ∖푟푒푠푢푙푡 ≡ 푙(푢); order logic (HOL). However, the support for higher-order @ assigns ∖푛표푡ℎ푖푛푔; reasoning in SMT-based deductive verification tools such @*/ as Jessie or AstraVer is usually very limited due to the C 푝푟표푥푦(푢); inherent incompleteness of decision procedures for HOLs. An alternative solution to direct support for HOL would be Note that 푙(푢) may contain logic constructs that are thus a stratified solution, where the core specification language encapsulated by the proxy. In general, proxies should does not support any HOL constructs, but a separate be introduced carefully as their post-conditions are not higher-level language such as the language of modules does checked and may introduce inconsistencies, but in some provide constructs that can be used to express and prove cases, the proxy can be given an implementation (definition) higher-order propositions. This approach is already used that satisfies the corresponding specification. In our proofs, in WhyML specification language. Let us illustrate this we only introduced proxies with definitions. approach on two example lemmas from our study: Ultimately, the proper support for logic types and logic char * strchr(char * 푠, char 푐) = expressions in the ghost code should be added to ACSL *푠 ≡ 푐 ? 푠 : specification and implemented in Frama-C to fully support *푠 ≡ ′∖0′ ?(char*)∖푛푢푙푙 : 푠푡푟푐ℎ푟(푠 + 1, 푐); auto-active proofs with lemma functions. lemma strchr_skipped : B. Lemma functions and axiomatics ∀char * 푠, 푐, size_t 푖; valid_str(푠) ∧ Normally, lemmas in ACSL are grouped into axiomatic strchr(푠, 푐) ̸≡ ∖푛푢푙푙 ∧ blocks. However, axiomatic blocks cannot include code. 0 ≤ 푖 < strchr(푠, 푐) − 푠 =⇒ Therefore, lemma functions cannot be directly grouped 푠[푖] ̸≡ 푐; into axiomatic blocks. Moreover, allowing lemma functions ... to be declared as lemmas in axiomatic blocks introduces a logic char * strchrnul(char * 푠, char 푐) = problem with ordering of proofs. To be well-founded, the *푠 ≡ 푐 ? 푠 : lemma functions should be split into strongly connected *푠 ≡ ′∖0′ ? 푠 : 푠푡푟푐ℎ푟푛푢푙(푠 + 1, 푐); components and topologically ordered. However, unlike lemma strchrnul_skipped : code functions that can only be called explicitly, lemmas ∀char * 푠, 푐, size_t 푖; can be instantiated implicitly, provided they are preset in valid_str(푠) ∧ the proof context. When each lemma function has just a 0 ≤ 푖 < strchrnul(푠, 푐) − 푠 =⇒ single definition, the definition order of lemma functions 푠[푖] ̸≡ 푐; can serve as their required topological order, with an only Currently, the lemmas have to be proved as two separate additional restriction that mutually recursive functions lemma functions. Though there is a single generic function should be somehow declared simultaneously. However, when that suits both cases, its definition makes use of higher- lemma functions are allowed to have separate declarations order features : and definitions, the ordering of lemma functions becomes logic char * strchrcomb(boolean 푐표푛푑(char * 푠, char 푐), generally unspecified. In the current implementation, the lemma functions are allowed to have separate declarations, char * 푑푓푡(char * 푠), but they are later grouped into strongly connected com- char * 푠, char 푐) = ponents, and the definition of the first function in the 푐표푛푑(푠, 푐)? 푠 : *푠 ≡ ′∖0′ ? 푑푓푡(푠, 푐): 푠푡푟푐ℎ푟푐표푚푏(푠 + 1, 푐); component is then used for topological ordering. This Within the stratified approach, this function should be essentially simulates simultaneous definitions. However, this placed in a separate module alongside the functions cor- approach is completely ignorant of the lemma declarations responding to its parameters, auxiliary definitions, and appropriate lemmas: [] F. Rothenberger, “Integration and analysis of alternative SMT predicate 푐표푛푑(char * 푠, char 푐) reads * 푠; solvers for software verification,” Master’s thesis, ETH-Zürich, . logic char * 푑푓푡(char * 푠); [] G. Petiot, N. Kosmatov, B. Botella, A. Giorgetti, and predicate 푑푓푡_푠푎푚푒 = ∀char * 푠; 푑푓푡(푠) ≡ 푠; J. Julliand, “How testing helps to diagnose proof failures,” predicate 푑푓푡_푛푢푙푙 = ∀char * 푠; 푑푓푡(푠) ≡ ∖푛푢푙푙; Formal Aspects of Computing, Jun . [Online]. Available: https://doi.org/./s--- logic char * 푠푡푟푐ℎ푟푐표푚푏(char * 푠, char 푐) = [] M. Hentschel, R. Bubel, and R. Hähnle, “The symbolic execution 푐표푛푑(푠, 푐)? 푠 : debugger (SED): a platform for interactive symbolic execution, *푠 ≡ ′∖0′ ? 푑푓푡(푠): debugging, verification and more,” International Journal on Software Tools for Technology Transfer, . [Online]. Available: 푠푡푟푐ℎ푟푐표푚푏(푠 + 1, 푐); https://doi.org/./s -- - lemma 푠푡푟푐ℎ푟푐표푚푏_푠푘푖푝푝푒푑 : [] W. Reif, “The KIV-approach to software verification,” in KO- ∀char * 푠, 푐, size_t 푖; RSO: Methods, Languages, and Tools for the Construction of Correct Software. Springer,  , pp.  –. 푣푎푙푖푑_푠푡푟(푠) ∧ [ ] C. Calcagno, D. Distefano, J. Dubreil, D. Gabi, P. Hooimeijer, (푑푓푡_푠푎푚푒 ∨ 푑푓푡_푛푢푙푙 ∧ 푠푡푟푐ℎ푟푐표푚푏(푠, 푐) ̸≡ ∖푛푢푙푙) ∧ M. Luca, P. O’Hearn, I. Papakonstantinou, J. Purbrick, and 0 ≤ 푖 < strchrcomb(푠, 푐) − 푠 =⇒ D. Rodriguez, “Moving fast with software verification,” in NASA Formal Methods, K. Havelund, G. Holzmann, and R. Joshi, Eds. !푐표푛푑(푠 + 푖, 푐); Cham: Springer International Publishing, , pp. –. Then the lemmas can be proved in the general case as usual, [] A. Chudnov, N. Collins, B. Cook, J. Dodds, B. Huffman, e. g., by using lemma-functions. The resulting module can C. MacCárthaigh, S. Magill, E. Mertens, E. Mullen, S. Tasiran, be instantiated by providing concrete definitions for logic A. Tomb, and E. Westbrook, “Continuous formal verification of Amazon sn,” in Computer Aided Verification, H. Chockler functions 푐표푛푑 and 푑푓푡 to obtain both logic functions 푠푡푟푐ℎ푟 and G. Weissenbacher, Eds. Cham: Springer International and 푠푡푟푐ℎ푟푛푢푙. Upon instantiation, their corresponding Publishing, , pp. –. lemmas do not have to be proven, and they can be safely [] P. W. O’Hearn, “Continuous reasoning: Scaling the impact of formal methods,” in Proceedings of the rd Annual ACM/IEEE instantiated directly into axioms. Symposium on Logic in Computer Science, ser. LICS ’. New York, NY, USA: ACM, , pp. –. [Online]. Available: VI. Conclusion http://doi.acm.org/./ .  [] C. Dross and Y. Moy, “Auto-active proof of red-black trees in In this paper we propose the implementation of auto- SPARK,” in NASA Formal Methods, C. Barrett, M. Davies, and T. Kahsai, Eds. Cham: Springer International Publishing, , active proofs technique in the Frama-C framework and pp. –. AstraVer plugin, evaluate our implementation on VerKer [] J.-C. Filliâtre and A. Paskevich, “Why — where programs project specifications by proving all string-related lemmas, meet provers,” in Proceedings of the nd European Symposium on Programming, ser. Lecture Notes in Computer Science, and demonstrate the benefits of the approach by comparing M. Felleisen and P. Gardner, Eds., vol.  . Springer, Mar. it with manual lemma proofs in Coq. , pp. –. The results of the work are publicly available. The [] P. Baudin, P. Cuoq, J.-C. Filliâtre, C. Marché, B. Monate, Y. Moy, and V. Prevosto, “ACSL: ANSI/ISO C Specification source code with specifications and verification protocols Language,” CEA LIST and INRIA, Tech. Rep. ., March . are available as a part of the VerKer project. The proofs [] D. Efremov, M. Mandrykin, and A. Khoroshilov, “Deductive are easy reproducible since there is a configuration of a Verification of Unmodified Linux Kernel Library Functions,” in Leveraging Applications of Formal Methods, Verification and continuous integration system. The modified toolset is Validation. Verification, T. Margaria and B. Steffen, Eds. Cham: available on the AstraVer project page. Springer International Publishing, , pp. –. The obtained results show the approach is very promising [] J. Hatcliff, G. T. Leavens, K. R. M. Leino, P. Müller, and M. Parkinson, “Behavioral interface specification languages,” and should be adopted in the mainline of AstraVer toolset. ACM Comput. Surv., vol. , no. , pp. :–:, Jun. . Auto-active verification in Frama-C ease the work of a ver- [Online]. Available: http://doi.acm.org/./. ification engineer, make specifications more maintainable [] J.-C. Filliâtre, “Verifying two lines of C with Why: an exercise in program verification,” in Proceedings of the th and potentially can lower the cost of deductive verification. International Conference on Verified Software: Theories, Tools, Experiments, ser. VSTTE’. Berlin, Heidelberg: References Springer-Verlag, , pp. – . [Online]. Available: http: //dx.doi.org/./ ----_ [] F. Kirchner, N. Kosmatov, V. Prevosto, J. Signoles, and [] A. Tafat and C. Marché, “Binary heaps formally verified in B. Yakobowski, “Frama-C: A software analysis perspective,” Why,” INRIA, Research Report , Oct. , http://hal. Formal Aspects of Computing, vol. , no. , pp. – , May inria.fr/inria-/en/. . [Online]. Available: https://doi.org/./s-- [ ] A. Blanchard, N. Kosmatov, and F. Loulergue, “Ghosts for - Lists: A Critical Module of Contiki Verified in Frama-C,” [] B. Jacobs, J. Smans, P. Philippaerts, F. Vogels, W. Penninckx, in Tenth NASA Formal Methods Symposium - NFM , and F. Piessens, “VeriFast: A powerful, sound, predictable, fast Newport News, United States, Apr. . [Online]. Available: verifier for C and Java,” in NASA Formal Methods Symposium. https://hal.inria.fr/hal- Springer, , pp. –. [] J. Burghardt, R. Clausecker, J. Gerlach, and H. Pohl, “ACSL [] K. R. M. Leino, “Dafny: An automatic program verifier for By Example,” Fraunhofer Institute for Open Communication functional correctness,” in International Conference on Logic for Systems, Tech. Rep., . Programming Artificial Intelligence and Reasoning. Springer, [] K. R. M. Leino and M. Moskal, “Usable auto-active verification,” , pp. –. . [] B. Carré and J. Garnsworthy, “SPARK — an annotated ada subset for safety-critical programming,” in Proceedings of the conference on TRI-ADA’ . ACM,  , pp.  –.