Interpolation: Theory and Applications

Vijay D’Silva Google Inc., San Francisco

UC Berkeley 2016

Wednesday, March 30, 16 1 A Brief History of Interpolation

2 Verification with Interpolants

3 Interpolant Construction

4 Further Reading and Research

Wednesday, March 30, 16 Craig Interpolants

For two formulae A and B such that A implies B B,aCraig interpolant is a formula I such that 1. A implies I, and I 2. I implies B, and 3. the non-logical symbols in I occur in A and in B. A

P (P = Q)=Q = R = Q ^ ) ) ) )

Wednesday, March 30, 16 Reverse Interpolants or “Interpolants”

For a contradiction A B,areverse interpolant is ^ a formula I such that B 1. A implies I, and 2. I B is a contradiction, and ^ I 3. the non-logical symbols in I occur in A and in B.

Note: In classical logic, A = B is valid exactly ) A if A B is a contradiction. ^¬ In the verification literature, “interpolant” usually means “reverse interpolant.”

Wednesday, March 30, 16

International Business Machines Corporation 2050 Rt 52 Hopewell Junction, NY 12533 845-892-5262

October 7, 2008

Dear Andreas,

I would like to congratulate Cadence Research Labs on their 15th Anniversary. In these 15 years, Cadence Research Labs has worked at several frontiers of Electronic Design Automation. They focus on hard problems that when solved significantly push the state of the art forward. They found novel solutions to system, synthesis and formal verification problems.

Formal verification is the process of exhaustively validating that a logic entity behaves correctly. In contrast to testing-based approaches, which may expose flaws though generally cannot yield a proof of correctness, the exhaustiveness of formal verification ensures that no flaw will be left unexposed. Formal verification is thus a critical technology in many domains, being essential to safety-critical applications and to enable increased quality and reduced development costs of hardware and software systems. The benefits of formal verification come at a substantial "cost": its exhaustiveness implies that it generally requires computational resources which grow exponentially with respect to the size of the entity being analyzed. Cadence Research Labs has had a fundamental role in the research and development of leading-edge formal International Business Machines Corporation verification technologies, which have been critical to increasing the scalability2050 Rtand 52 applicability of formal Hopewell Junction, NY 12533 verification techniques to an industrially relevant level. 845-892-5262

October 7, 2008

CRL made important contributions in satisfiability checking technologies and model checking algorithms. Dear Andreas, Satisfiability checking is arguably one of the most fundamental algorithms in computer-aided design, with I would like to congratulate Cadence Research Labs on their 15th Anniversary. In these 15 years, Cadence pervasive application Researchdomains Labs has including worked at several verificatio frontiers of Electronicn. Members Design Automation. of TheyCadence focus on hard Research labs are world- recognized experts in problemsthe field that when of solved high-performance significantly push the state ofsatisfiability the art forward. They solvers, found novel solutions and collectivelyto have developed a system, synthesis and formal verification problems. set of solvers including MiniSAT, BerkMin, and Forklift which have won numerous competitions, been Formal verification is the process of exhaustively validating that a logic entity behaves correctly. In downloaded and used contrastin thousands to testing-based approaches,of applications, which may expose a ndflaws have though generally integrated cannot yield novel a proof oftricks and ideas which have correctness, the exhaustiveness of formal verification ensures that no flaw will be left unexposed. Formal become the basis of countlessverification is thus other a critical solvers. technology in many domains, being essential to safety-critical applications and to enable increased quality and reduced development costs of hardware and software systems. The benefits of formal verification come at a substantial "cost": its exhaustiveness implies that it generally requires Model checking algorithmscomputational are resources widely which growused exponentially for verif with yingrespect tohardware the size of the entity and being software analyzed. models. CRL has Cadence Research Labs has had a fundamental role in the research and development of leading-edge formal pioneered numerous fundamentalverification technologies, ideas which haveand been algorithms critical to increasing to the this scalability field, and applicabilityincluding of formal "interpolation" as a verification techniques to an industrially relevant level. satisfiability-based proof method which is often dramatically faster and more scalable than prior proof CRL made important contributions in satisfiability checking technologies and model checking algorithms. techniques. CBL researchersSatisfiability checkinginvented is arguably numerous one of the most novel fundamental methods algorithms in tocomputer-aided automatically design, with reduce the domain of a pervasive application domains including verification. Members of Cadence Research labs are world- verification problem throughrecognized experts "abstracting" in the field of high-performance it based satisfiability upon unsatisfiability solvers, and collectively have proofs. developed aThese techniques have set of solvers including MiniSAT, BerkMin, and Forklift which have won numerous competitions, been substantially increaseddownloaded the scalability and used in thousands of formal of applications, verification and have integrated of novel complex tricks and ideas hardware which have designs. become the basis of countless other solvers.

CRL researchers haveModel not checking only algorithmsused logic are widely optimizati used for verifyingons hardware to andspeed software up models. formal CRL has verification algorithms, but are pioneered numerous fundamental ideas and algorithms to this field, including "interpolation" as a now also applying themsatisfiability-based to sequential proof method optimization. which is often dramatically Sequential faster and more synthesis scalable than prior has proo longf been a holy grail in logic techniques. CBL researchers invented numerous novel methods to automatically reduce the domain of a optimization. A large verificationpart of problem the design through "abstracting" space it rema based uponins unsatisfiability untapped proofs. unless These techniques one canhave reliably and effectively substantially increased the scalability of formal verification of complex hardware designs. optimize and verify in the sequential domain. Recent progress from CRL shows that there is some promise we can tap into this someCRL researchers time inhave the not only not used too logic distaoptimizatintons future. to speed up formal verification algorithms, but are now also applying them to sequential optimization. Sequential synthesis has long been a holy grail in logic optimization. A large part of the design space remains untapped unless one can reliably and effectively optimize and verify in the sequential domain. Recent progress from CRL shows that there is some promise we can tap into this some time in the not too distant future. Leon Leon

Leon Stok Leon Stok Director, Director, Electronic Design Automation IBM Corporation Electronic Design Automation

Wednesday, MarchIBM 30, Corporation16

1 A Brief History of Interpolation

2 Verification with Interpolants

3 Interpolant Construction

4 Further Reading and Research

Wednesday, March 30, 16 Craig’s Interpolation Lemma (1957)

William Craig in 1988 http://sophos.berkeley.edu/interpolations/

Wednesday, March 30, 16 Craig’s Interpolation Lemma (1957)

Lemma. First-order logic has the interpolation property. That is, if A = B is valid, then a ) Craig interpolant exists.

William Craig in 1988 http://sophos.berkeley.edu/interpolations/

Wednesday, March 30, 16 A High-Level View of the Proof

“The intuitive idea for Craig’s proof of the Interpolation Theorem rests on the completeness theorem for FOL, in the form of the equivalence of validity with provability in a suitable system of axioms and rules of inference. By “suitable” here is meant one in which there is a notion of a direct proof for which if Phi implies Psi is provable then there is a direct proof of Psi from Phi. One would expect that in such a proof, the relation symbols of Phi that are in Psi would not disappear in the middle. Such systems were devised by Herbrand (1930) and Gentzen (1934); Hilbert-style systems enlarged by the axioms and rules of the epsilon-calculus can also serve this purpose. Feferman, Harmonious Logic: Craig’s Interpolation Theorem and Its Descendants, 2008

Wednesday, March 30, 16 Another Perspective on the Interpolation Theorem

“Important results have many things to say. At first sight, the Interpolation Theorem of Craig (1957) seems a rather technical result for connoisseurs inside logical meta-theory. But over the past decades, its broader importance has become clear from many angles. In this paper, I discuss my own current favourite views of interpolation: no attempt is made at being fair or representative. First, I discuss the entanglement of inference and vocabulary that is crucial to interpolation. Next, I move to the role of interpolants in facilitating generalized inference across different models. Then I raise the perhaps surprising issue of ‘what is the right formulation of Craig’s Theorem?’, high-lighting the existence of non-trivial options in formulating meta-theorems. ... Finally, I discuss the ‘end of history’. Craig’s Theorem is about the last significant property of first-order logic that has come to light. Is there something deeper going on here, and if so, can we prove it?” -- Johan van Benthem, The Many Faces of Interpolation, 2008

Wednesday, March 30, 16 Interpolation In

1957 1960 1970 1980 1990 2000 2010

• Simpler proofs of known properties: Beth definability, Robinson’s theorem. • Interpolant structure: Lyndon Interpolation theorems (1959). • Preservation under homomorphisms (connections to finite-).

• Many-sorted and Infinitary logics: Feferman ’68, ’74, Lopez-Escobar ’65, Barwise ’69, Stern ’75, Otto ’00. • Model theoretic characterizations: Makowsky ’85 for a survey. • Amalgamation and algebraic characterization

• Guarded fragment: Hoogland, Marx, Otto ’00 • Modal and fixed point logics: Ten Cate ’05, • Uniform interpolation: Pitt ’92, Visser ’96, d’Agostino, Hollenberg ’00

Wednesday, March 30, 16 Interpolation In Complexity and

1957 1960 1970 1980 1990 2000 2010

1971 1971, Cook. The Complexity of Theorem Proving Procedures

Proof Content. For any lan- guage A NP and n N, one 2 2 can construct in polynomial time a formula

Fn(x1,...,xn,y1,...,yp(n))

in propositional logic such that for all x 0, 1 n 2 { } x A y.F (x, y)=true 2 () 9 n

Wednesday, March 30, 16 Interpolation and Complexity Theory

1957 1960 1970 1980 1990 2000 2010

1971 1971, Cook. The Complexity of Theorem Proving Procedures

1982 Mundici, NP and Craig’s Interpolation Theorem (pub. 1984)

1983 Mundici, A Lower bound for the complexity of Craig’s Interpolants in Sentential Logic

Theorem. (Mundici, 1982) At least one of the following is true. 1. P = NP. 2. NP = coNP. 6 3. For F and G in propositional logic, such that F = G, an ) interpolant is not computable in time polynomial in the size of F and G.

Wednesday, March 30, 16 Interpolation and (Proof) Complexity Theory

1957 1960 1970 1980 1990 2000 2010

Jan Krajíček, Interpolation theorems, lower bounds for proof systems, and independence results for bounded arithmetic. 1997

Pudlák, Lower Bounds for Resolution and Cutting Plane Proofs and Monotone Computations

A proof system has feasible interpolation if, whenever there is a short ` refutation of A B, the interpolant is computable in polynomial time in ^ the size of the proof.

Lemma If there is a resolution refutation of size n for a formula A B, ^ there is an interpolant of circuit size 3n that is computable in time n.

Wednesday, March 30, 16 Interpolation and (Proof) Complexity Theory

260 A. Carbonel Annals of Pure and Applied Logic 83 (1997) 24%299 1957 1960 1970 1980 1990 2000 2010 It is easy to see that it describes the inner proof

D+D 1997D--+CvD Carbone, Interpolants, Cut Elimination and Flow Graphs for the

where we will refer to D --f C V D as inner sequent of A V D + A, C V D. On the other hand, not all subgraphs of II induce a proof. For instance, consider the subgraph

A*VE9D+D D,AvE -+A,D

D,dvE/+A!CvD Combinatorial description of how information Notice that a given proof may contain more than one inner proof. Consider the following very simple proof: flows in a proof. Interpolants eliminate certain B+B C-+C flows and preserve others. The “relevant” in- A-+A BvC+B,C A,B v C + B,A A C formation is preserved. and the two logical flows for it

By*7

A-+A 4=sT/!Y A,BvC + B,A AC

describing respectively the inner proofs

Wednesday, March 30, 16 A+A c+c BAB C+C and A,C-+AAC BVC+C,B ’

The idea of inner proof is a basic intuition for both the proof of the Craig Inter- polation Theorem and the result on the Cut Elimination Theorem we want to present. Given a proof of A -+ B we will ‘extract’ from its cut-free form a proof of A + C (C -+ B) by keeping the logical paths linked to A (B) and forgetting the ones only linked to B (A). Second, we show that if we can ‘extract’ (by keeping and forgetting paths properly) a proof from the cut-free form of l7 of a certain inner sequent, then we can ‘extract’ a proof directly from II of the same inner sequent. Interpolants in Automated Reasoning

1957 1960 1970 1980 1990 2000 2010

1995 Huang, Constructing Craig Interpolation Formulas. (OTTER)

Amir, McIlraith, Partition-Based Logical Reasoning. 2001

2003 McMillan, Interpolation and SAT-Based Model Checking.

2004 Henziger, Jhala,Majumdar,McMillan, Abstractions from Proofs

2005 McMillan, An Interpolating Theorem Prover

Wednesday, March 30, 16 1 A Brief History of Interpolation

2 Verification with Interpolants

3 Interpolant Construction

4 Further Reading and Research

Wednesday, March 30, 16 Systems Analysis with SAT/SMT Solvers

• Bounded Model Checking and System Property Symbolic Execution generate formulae encoding bounded executions. Constraint Generation • Can we generate invariants? Formula • Can we explore deeper executions Solver without running out of memory?

• Can we avoid exploring redundant UNSAT SAT system behaviours? Satisfying Assignment

Wednesday, March 30, 16 A Simple Binary Counter

Initial State def J(a) def=(a2 a1 a0) J(a) =(a2 a1 a0) ^ ^^ ^ ¿ ¿ def a a a2a1a0 def 2 1 0 TT((aa,,aa00)) =(=(a2 a2a1 aa120 a10a)0 a0 ) ¿ ¿ ^ ^) )^ 2^^ 1 ^ (a2 a1 a0 a0 ) ¡ (a2 a12 1a0 a0 ) ¡ a2a1a0 ¡ ^ ) ^ 2^ 1 a a a Transition (a2 a1^ a20 )a10 ) ^ ^ ¬ 2 1 0 ¡ ^ ) ^ ^ (a2 a1 a20 a10 ) ¬ Relation (a2 a1 a20 a10 ) √ a2a1a0 ^ ^) )^ ^^ ^ ¬ (a20(a2a10 aa120 a10 a20a20 aa1010)) a0 √ a2a1a0 ^ ^_ ^) _ ^^ )^ def ¬ F (a) = a0 (a(0a1 aa02) a0 a0 a0 a0 ) a0 ^ 2 ^_ 1 _ 2 ^ 1 _ 2 ^ 1 ) 0 def F (a) = a0 (a1 a2) Fig. 2. A transition systemReachability implementing Condition^ a binary_ counter.

a2a1a10 [a10 ] a1 [ ] a2 [ ] a2a1a20 [a20 ] Fig. 2. A transition? system implementing? a binary counter.

a2a10 [a10 ] a1a20 [a20 ] a2a1a10 [a10 ] a1 [ ] a2 [ ] a2a1a20 [a20 ] ? a0 [a0?] a0 a0 a0 [a0 a0 a0 ] 2 2 2 1 0 2 _ 1 _ 0

Wednesday, March 30, 16 a2a10 [a10 ] a1a20 [a20 ] a10 [a10 ] a10 a0 [a20 a10 a20 a0] J(a)=a2 a1 ^ _ ^ ^ T (a, a0)=(a2 a1 a10 ) _ _ ^ a a[0a [a0a] a ] aa0 a[0 a] 0 [a0 a0 a0 ] (a a a0 ) 0 220 2 10 0 20 1 0 2 1 0 2 _ 1 _ 2 ^ ^ ^ > _ _ (a0 a0 a0 ) 2 _ 1 _ 0 F (a0)=a0 ⇤ [a20 a10 a0] a10 [a10 ] a10 a0 ^[a20 ^ a10 a20 a0] J(a)=a2 a1 ^ _ ^ Fig. 3. Refutation^ with McMillan’s interpolant of J(a) T (a, a0) and F (a0). The figure ^ T (a, a0shows)=( aa contradictory2 a1 a10 ) subset of the clauses of a CNF encoding of the formulae. _ _ ^ a [a a a ] a [ ] (a a a0 ) 0 20 10 0 0 2 _ 1 _ 2 ^ ^ ^ > (a0 a0 a0 ) 4.1 Labelling2 _ 1 _ Functions0 and Interpolation F (a0)=a0 ⇤ [a20 a10 a0] Definition 5 (Labelling Function). Let ( , , , ) be the lattice^ below,^ where = , a, b, ab is a set of symbols and ,S vandu t are defined by the Hasse Fig. 3. RefutationS {? with} McMillan’s interpolantv u of J(at) T (a, a0) and F (a0). The figure diagram. A labelling function LR : VR Lit for a^ refutation R over a set shows a contradictoryof literals Lit satisfies subset that of for the all clausesv V⇥ ofand a!t CNFS Lit: encoding of the formulae. 2 R 2 ab 1. LR(v, t)= i↵ t/`R(v) ? +2 ab 2. L (v, t)=L (v ,t) L (v,t) for an internal vertex R R t R 4.1 Labellingv and literal Functionst `R(v). and Interpolation 2 ? Definition 5 (Labelling Function). Let ( , , , ) be the lattice below, where Due to condition (2) above, the labelling function for literals at internal vertices = , a, b, ab is a set of symbols and ,S vandu t are defined by the Hasse S {? is completely} determined by the labels of literalsv u at initialt vertices. A variable x diagram.is AA-locallabellingin a pair function (A, B)ifxL Var(: VA) Var(LitB), B-localforif x a refutationVar(B) Var(AR), over a set 2R R\ 2 \ of literalslocalLitif itsatisfies is either of that these, for and allsharedv otherwise.V⇥ and!t S Lit: 2 R 2 ab 1. LR(v, t)= i↵ t/`R(v) ? +2 ab 2. L (v, t)=L (v ,t) L (v,t) for an internal vertex R R t R v and literal t `R(v). 2 ?

Due to condition (2) above, the labelling function for literals at internal vertices is completely determined by the labels of literals at initial vertices. A variable x is A-local in a pair (A, B)ifx Var(A) Var(B), B-local if x Var(B) Var(A), local if it is either of these, and2 shared \otherwise. 2 \ def J(a) =(a2 a1 a0) ¿ ^ ^ a a a def 2 1 0 T (a, a0) =(a2 a1 a0 a0 ) ¿ ^ ) 2 ^ 1 ^ (a2 a1 a20 a10 ) ¡ a2a1a0 ¡ ^ ) ^ ^ (a2 a1 a0 a0 ) ¬ ^ ) 2 ^ 1 ^ (a2 a1 a20 a10 ) √ a2a1a0 ^ ) ^ ^ ¬ (a0 a0 a0 a0 a0 a0 ) a0 2 ^ 1 _ 2 ^ 1 _ 2 ^ 1 ) 0 def F (a) = a0 (a1 a2) A Refutation and Its Interpolant^ _ Fig. 2. A transition system implementing a binary counter.

a a a0 [a0 ] a [ ] a [ ] a a a0 [a0 ] 2 1 1 1 1 ? 2 ? 2 1 2 2

a2a10 [a10 ] a1a20 [a20 ]

a0 [a0 ] a0 a0 a0 [a0 a0 a0 ] 2 2 2 1 0 2 _ 1 _ 0

a10 [a10 ] a10 a0 [a20 a10 a20 a0] J(a)=a2 a1 ^ _ ^ ^ T (a, a0)=(a2 a1 a10 ) _ _ ^ a [a a a ] a [ ] (a a a0 ) 0 20 10 0 0 2 _ 1 _ 2 ^ ^ ^ > (a0 a0 a0 ) 2 _ 1 _ 0 F (a0)=a0 [a0 a0 a0 ] 0 ⇤ 2 ^ 1 ^ 0

def Fig. 3. RefutationJ(a) with=( McMillan’sa2 a1 a0) interpolant of J(a) T (a, a0) and F (a0). The figure ¿ ^ ^ a a a def ^ shows2 1 a0 contradictoryT (a, a0) =( subseta2 a1 of thea0 clausesa0 ) of a CNF¿ encoding of the formulae. The interpolant,^ ) 2 ^ in1 ^this case is the image. In general, (a2 a1 a20 a10 ) ¡ a2a1a0 ¡ interpolants^ ) for^ an ^appropriately constructed formula are (a2 a1 a0 a0 ) ¬ overapproximations^ ) 2 ^ 1 ^ of images. (a2 a1 a20 a10 ) √ 4.1a2a1 Labellinga0 Functions^ ) and^ Interpolation^ ¬ (a0 a0 a0 a0 a0 a0 ) a0 2 ^ 1 _ 2 ^ 1 _ 2 ^ 1 ) 0 def Definition 5F ((Labellinga) = a0 (a Function).1 a2) Let ( , , , ) be the lattice below, where Wednesday, March 30, 16 ^ _ = , a, b, ab is a set of symbols and ,S vandu t are defined by the Hasse S {? } v u t Fig.diagram. 2. A transition A labelling system implementing function a binaryL : counter.V Lit for a refutation R over a set R R ⇥ ! S a a a0of[a0 literals] aLit[ ] satisfiesa [ ] thata fora a0 all[a0 ] v VR and t Lit: 2 1 1 1 1 ? 2 ? 2 1 2 2 2 2 ab 1.a2La10R[(av,10 ] t)= i↵ t/a`1aR20 ([va20)] ? +2 ab 2. LR(v, t)=LR(v ,t) LR(v,t) for an internal vertex a20 [a20 ] t a20 a10 a0 [a20 a10 a0] v and literal t `R(v). _ _ 2 ? a10 [a10 ] a10 a0 [a20 a10 a20 a0] J(a)=a2 a1 ^ _ ^ ^ T (a, a0)=(a2 a1 a10 ) Due_ to_ condition^ a (2)[a above,a a the] labellinga [ function] for literals at internal vertices (a a a0 ) 0 20 10 0 0 2 _ 1 _ 2 ^ ^ ^ > is(a0 completelya0 a0 ) determined by the labels of literals at initial vertices. A variable x 2 _ 1 _ 0 F (a0)=a0 [a a a ] is0A-local in a pair (A, B)if⇤x 20 Var(10 A0) Var(B), B-local if x Var(B) Var(A), 2 ^ ^ \ 2 \ Fig. 3. Refutationlocal withif McMillan’s it is either interpolant of these, of J(a) andT (a,shareda0) and F (otherwise.a0). The figure ^ shows a contradictory subset of the clauses of a CNF encoding of the formulae.

4.1 Labelling Functions and Interpolation

Definition 5 (Labelling Function). Let ( , , , ) be the lattice below, where = , a, b, ab is a set of symbols and ,S vandu t are defined by the Hasse S {? } v u t diagram. A labelling function LR : VR Lit for a refutation R over a set of literals Lit satisfies that for all v V⇥ and!t S Lit: 2 R 2 ab 1. LR(v, t)= i↵ t/`R(v) ? +2 ab 2. L (v, t)=L (v ,t) L (v,t) for an internal vertex R R t R v and literal t `R(v). 2 ?

Due to condition (2) above, the labelling function for literals at internal vertices is completely determined by the labels of literals at initial vertices. A variable x is A-local in a pair (A, B)ifx Var(A) Var(B), B-local if x Var(B) Var(A), local if it is either of these, and2 shared \otherwise. 2 \ ReachabilityReachability with with Interpolants Interpolants

Approximate Image

S(x0) I (x1) U(x1) Forward ReachabilityUnsafe Analysis Start Forward Reachability Analysis Start Unsafe Interpolant I (x1)satisfiesthat: Start Unsafe S(x ) T (x , x ) I (x ) (over-approximates the image) 0 ^ 0 1 ) 1 I (x ) U(x ) is UNSAT (contains no unsafe state) 1 ^ 1

6

Algorithm: Compute images till a fixed point is reached.

Algorithm:Image Compute of Qx images: x : Q till(x a) fixedT (x point, x0) is reached. 9 ^ Image of QxFixed: x Point: Q(x on) QT(x():x, xx0), x0. x :[Q(x) T (x, x0) Q(x0)] Wednesday, March 30, 16 9 ^ 8 9 ^ ) Fixed Point on Q(x): x, x . x :[Q(x) T (x, x ) Q(x )] 8 0 9 ^ 0 ) 0 4

4 Interpolation Slogans

• A poor person’s quantifier elimination Program Property

• A separator between two regions of a search space Constraint Generation • A summary of why a bounded- Formula property holds occurs

Solver • An approximate image operator

• A relevance heuristic that articulates UNSAT SAT the core reason for a proof Proofs, Satisfying Interpolants Assignment

Wednesday, March 30, 16 Bounded Execution as a Formula

int x0= i; int y0 = j; int x = i; x1 = y0 + 1 int y = j; y1 = x0 + 1; while (foo()) { x2 = y1 + 1 // Code that does not y2 = x1 + 1; // modify 'x' or 'y'. x = y + 1; x3 = y2 + 1 y = x + 1; y3 = x2 + 1; } if (i = j && if (i = j && x <= 10) x3 <= 10) { assert(y <= 10); if (y3 > 10) Err:// ERROR REACHED }

Wednesday, March 30, 16 Bounded Execution and Interpolants

int x0= i; int y0 = j; x1 = y0 + 1 x = i +2 y = j +2 y1 = x0 + 1; A 2 ^ 2 x2 = y1 + 1 • Symbolic representation of the states y2 = x1 + 1; reachable after two iterations x3 = y2 + 1 • Image computation for program y3 = x2 + 1; statements typically requires quantifier if (i = j && B x3 <= 10) { elimination in that theory. if (y3 > 10) Err:// ERROR REACHED }

Wednesday, March 30, 16 Another Interpolant

int x = i; int y = j; while (foo()) { i = j = x2 y2 // Code that does not )  // modify 'x' or 'y'. • Potential loop invariant x = y + 1; • Invariant computation often requires y = x + 1; } fixed point computation, quantifier if (i = j && x <= 10) elimination, or even both. assert(y <= 10);

Wednesday, March 30, 16 A Space of Interpolants

i = j = i = j = ) ) x2 y2 y2 x2 int x0= i;   int y0 = j; x1 = y0 + 1 x2 = i +2 y1 = x0 + 1; A x2 = y1 + 1 y2 = j +2 y2 = x1 + 1; ^ i = j = ) x3 = y2 + 1 x = y y3 = x2 + 1; 2 2 if (i = j && x3 <= 10) { B if (y3 > 10) Err:// ERROR } • Multiple interpolants exist. • They differ in size, logical strength, symbols, etc. • The ideal one depends on the problem

Wednesday, March 30, 16 Approximation of States with Interpolants

States reaching B error in m steps

States on Function I similar paths summary Approximate k-image States that enter and exit a function A States reachable States on one in n-steps program path

The challenge is to encode these constraints and respect the vocabulary condition

Wednesday, March 30, 16 Abstract ReachabilitySoftware Tree model Construction checking [McMillan, 2006]

L = 0; L=0 do { L!=0 L==0 assert(L==0); lock() L = 1; old = new; ERR L=1 if (*){ L = 0; unlock() old=new new++; } } while (new!=old); new!=old

L!=0

38/98 ERR

Example: McMillan 2006, Graphic Ruemmer ’14

Wednesday, March 30, 16 Sequence Interpolants from Reachability Tree In the example

L=0

L==0

L=1

old=new

new!=old

L!=0

ERR 40/98

Example: McMillan 2006, Graphic Ruemmer ’14

Wednesday, March 30, 16 Abstract Reachability Construction

• Interpolants Decorate Positions on the Reachability Tree

• They denote state that are reached at those points

• A covering check is used to determine if all states at some location have been visited

• More complicated than predicate abstraction or fixed point computation due to non-monotonicity of interpolant construction

Wednesday, March 30, 16 Dynamic System (Circuit, Program, Hybrid System) Analyzer = Constraint Generation Solver = Constraint Solving + Interpolant Construction Property Checking with Interpolants

Wednesday, March 30, 16 Data Types

Conditionals/ Assignments

Loops Path Sharing

Recursion Generalization Boolean Structure

(Relative) Theory Completeness

Combination EUF

Theory Property Checking Quantifiers with Interpolants

Wednesday, March 30, 16 General picture

Generalizations of Classical Interpolants

Wednesday, March 30, 16 58/98 1 A Brief History of Interpolation

2 Verification with Interpolants

3 Interpolant Construction

4 Further Reading and Research

Wednesday, March 30, 16 Dynamic System

Analyzer = Constraint Boolean Structure Generation Theory

Combination EUF

Theory Property Checking Quantifiers with Interpolants

Wednesday, March 30, 16 Flows in Resolution Proofs

A B

x x, y y, z z • Literals flow around in the proof y y • Literals with opposite polarity cancel each other y ⇤ • Propositional interpolant construction can be viewed as controlling initial inputs and y y gating the flows using a circuit ¬ ? ? > > • Want to let shared variables flow through the A-part and restrict flow from the B-part.

Wednesday, March 30, 16 Interpolating Proof Rules

Split rules based on vocabulary

A-Hyp (C A) C [ ` var(`) var(B) ] 2 { | 2 } B-Hyp (C B) C [ ] 2 > C x [I ] x D [I ] A-Res _ 1 _ 2 (x var(A) var(B)) C D [I I ] 2 \ _ 1 _ 2 C x [I ] x D [I ] B-Res _ 1 _ 2 (x var(B)) C D [I I ] 2 _ 1 ^ 2

Annotate formulae with Partial Interpolants

Wednesday, March 30, 16 Applying Interpolating Proof Rules

a a [a ] a a [a ] a a [ ] a a [ ] a [ ] a1a2 [ ] a1a3 [ ] a2a3 [ ] a2a4 [ ] a4 [ ] 1 2 2 1 3 3 2 3 > 2 4 > 4 > ? ? > > >

a a [a a ] a [a ] a [ ] a2a3 [ ] a2 [ ] a2 [ ] 2 3 2 _ 3 2 2 2 > ? ? > a [a a ] a [ ] a [ ] a [ ] 3 3 ^ 2 3 > 3 ? 3 > A-Hyp C [C ] ⇤ [a3 a2] |B ⇤ [a3] ^ B-Hyp C [ ] > C x [I ] x D [I ] A-Res _ 1 _ 2 A =(a1 (a)a2) McMillan’s(a1 a3) a2 System C D [I I ] (b) Symmetric System _ ^ _ ^ _ 1 _ 2 C x [I ] x D [I ] B =(a2 a3) (a2 a4) a4 B-Res _ 1 _ 2 _ ^ _ ^ C D [I1 I2] I = _ ^

Wednesday, March 30, 16 Fig. 1. Refutation yielding di↵erent interpolants for di↵erent systems.

Definition 4 (McMillan’s System). McMillan’s system ItpM maps vertices in an (A, B)-refutation R as to partial interpolants as defined below.

For an initial vertex v with `(v)=C

(A-clause) if C A (B-clause) if C B C[C ] 2 C[T] 2 |B + For an internal vertex v with piv(v)=x, `(v )=C x and `(v)=C x 1 _ 2 _ C x [I ] C x [I ] 1 _ 1 2 _ 2 C C [I ] 1 _ 2 3

def (A-Res) if x/Var(B), I3 = I1 I2 2 def _ (B-Res) if x Var(B), I = I I 2 3 1 ^ 2

See [11] for McMillan’s proof of correctness. Example 1 shows that the inter- polants obtained from ItpM and ItpS are di↵erent and that ItpM is not symmetric.

Example 1. Let A be the formula (a a ) (a a ) a and B be the formula 1 _ 2 ^ 1 _ 3 ^ 2 (a2 a3) (a2 a4) a4. An (A, B)-refutation R is shown in Figure 1. The partial_ interpolants^ _ in^ McMillan’s system are shown in Figure 1(a) and those in the symmetric system in Figure 1(b). We have that Itp (R)=a a and M 3 ^ 2 Itp (R)=a . For the inverse systems, the interpolants are Itp0 (R)=a a S 3 M 2 ^ 3 and Itp0 (R)=a . Observe that Itp (R) Itp (R), Itp (R) Itp0 (R), and S 3 M ) S S , ¬ S Itp0 (R) Itp0 (R). ¬ S ) ¬ M C Example 2 below shows that there are interpolants that cannot be obtained by these systems and that the interpolants from ItpM and ItpS may coincide.

Example 2. Let A be the formula a (a a ) and B be the formula (a a ) a . 1^ 1_ 2 1_ 2 ^ 1 Applying Interpolating Proof Rules

a a [a ] a a [a ] a a [ ] a a [ ] a [ ] a1a2 [ ] a1a3 [ ] a2a3 [ ] a2a4 [ ] a4 [ ] 1 2 2 1 3 3 2 3 > 2 4 > 4 > ? ? > > >

a a [a a ] a [a ] a [ ] a2a3 [ ] a2 [ ] a2 [ ] 2 3 2 _ 3 2 2 2 > ? ? > a [a a ] a [ ] a [ ] a [ ] 3 3 ^ 2 3 > 3 ? 3 > A-Hyp C [C ] ⇤ [a3 a2] |B ⇤ [a3] ^ B-Hyp C [ ] > C x [I ] x D [I ] A-Res _ 1 _ 2 A =(a1 (a)a2) McMillan’s(a1 a3) a2 System C D [I I ] (b) Symmetric System _ ^ _ ^ _ 1 _ 2 C x [I ] x D [I ] B =(a2 a3) (a2 a4) a4 B-Res _ 1 _ 2 _ ^ _ ^ C D [I1 I2] I = a a _ ^ 3 ^ 2 Wednesday, March 30, 16 Fig. 1. Refutation yielding di↵erent interpolants for di↵erent systems.

Definition 4 (McMillan’s System). McMillan’s system ItpM maps vertices in an (A, B)-refutation R as to partial interpolants as defined below.

For an initial vertex v with `(v)=C

(A-clause) if C A (B-clause) if C B C[C ] 2 C[T] 2 |B + For an internal vertex v with piv(v)=x, `(v )=C x and `(v)=C x 1 _ 2 _ C x [I ] C x [I ] 1 _ 1 2 _ 2 C C [I ] 1 _ 2 3

def (A-Res) if x/Var(B), I3 = I1 I2 2 def _ (B-Res) if x Var(B), I = I I 2 3 1 ^ 2

See [11] for McMillan’s proof of correctness. Example 1 shows that the inter- polants obtained from ItpM and ItpS are di↵erent and that ItpM is not symmetric.

Example 1. Let A be the formula (a a ) (a a ) a and B be the formula 1 _ 2 ^ 1 _ 3 ^ 2 (a2 a3) (a2 a4) a4. An (A, B)-refutation R is shown in Figure 1. The partial_ interpolants^ _ in^ McMillan’s system are shown in Figure 1(a) and those in the symmetric system in Figure 1(b). We have that Itp (R)=a a and M 3 ^ 2 Itp (R)=a . For the inverse systems, the interpolants are Itp0 (R)=a a S 3 M 2 ^ 3 and Itp0 (R)=a . Observe that Itp (R) Itp (R), Itp (R) Itp0 (R), and S 3 M ) S S , ¬ S Itp0 (R) Itp0 (R). ¬ S ) ¬ M C Example 2 below shows that there are interpolants that cannot be obtained by these systems and that the interpolants from ItpM and ItpS may coincide.

Example 2. Let A be the formula a (a a ) and B be the formula (a a ) a . 1^ 1_ 2 1_ 2 ^ 1 McMillan’s Interpolation System

A-Hyp (C A) C [ ` var(`) var(B) ] 2 { | 2 } B-Hyp (C B) C [ ] 2 > C x [I ] x D [I ] A-Res _ 1 _ 2 (x var(A) var(B)) C D [I I ] 2 \ _ 1 _ 2 C x [I ] x D [I ] B-Res _ 1 _ 2 (x var(B)) C D [I I ] 2 _ 1 ^ 2

Theorem. The partial interplant labelling the empty clause is an interplant for A and B.

Wednesday, March 30, 16 Program = Control + Data

Analyzer = Constraint Boolean Structure Generation Theory

Combination EUF

Theory Property Checking Quantifiers with Interpolants

Wednesday, March 30, 16 Equality Proofs

f(u, y)=z u = x v = y f(x, v) = z 6 f(u, y)=f(x, v)

f(u, y) = z 6 ⇤

A = u = x f(u, y)=z ^ • Deduced literals may not be in A or in B B = v = y f(x, v) = z • New terms may use non-shared symbols ^ 6 I = f(x, y)=z • Interpolant may be over terms not in the proof

Wednesday, March 30, 16 Coloured Congruence Graphs

f(u, y)=z u = x v = y f(x, v) = z 6 f(x, y)=z

f(x, v)=z ⇤

z = A = u = x f(u, y)=z 6 ^ f(u, y) f(x, v) B = v = y f(x, v) = z ^ 6 f(x, y) I = f(x, y)=z

Wednesday, March 30, 16 Interpolation from Coloured Congruence Graphs

z = 6 f(u, y) f(x, v)

f(x, y)

• Modify graph to be colourable A = u = x f(u, y)=z • Take summaries A-paths by endpoints that ^ are over the shared vocabulary B = v = y f(x, v) = z ^ 6 • Summarize B-paths as premises for A- I = f(x, y)=z summaries

Wednesday, March 30, 16 Interpolation with B-Premises

v = 6 f(z,u) y

f(z,x)

• Mixes A and B reasoning A = v = f(z,u) y = f(z,x) • Endpoints of B-reasoning paths are ^ antecedents of implications for A B = u = x v = y ^ 6 • Implication introduced by combination of I = u = x = v = y congruence and shared reasoning )

Wednesday, March 30, 16 1 A Brief History of Interpolation

2 Verification with Interpolants

3 Interpolant Construction

4 Further Reading and Research

Wednesday, March 30, 16 Program = Control + Data Analyzer = Constraint Generation Interpolation and SMT Solver = Constraint Solving + Interpolant Construction

1957 1960 1970 1980 1990 2000 2010

1995 Huang, Constructing Craig Interpolation Formulas. (OTTER) 2005 to present

Amir, McIlraith, Partition-Based Logical Reasoning. Interpolation for theories: numeric, 2001 bit-vectors, strings, arrays, etc.

Interpolation for equality and theory 2003 McMillan, Interpolation and SAT-Based Model Checking. combinations.

200 Henziger, Jhala,Majumdar,McMillan, Abstractions from Proofs Quantified interpolants.

Sequence, tree and DAG 2005 McMillan, An Interpolating Theorem Prover interpolants.

Wednesday, March 30, 16 Program = Control + Data Analyzer = Constraint Generation Analysis with Interpolants Solver = Constraint Solving + Interpolant Construction

1957 1960 1970 1980 1990 2000 2010

2003 McMillan, Interpolation and SAT-Based Model Checking. 2006 onwards

Henzinger, Jhala, Majumdar, McMillan, Abstraction from Proofs. 2004 Interpolant Loops Sequence

2006 McMillan, Lazy Abstraction with Interpolants

Recursion Tree Interpolant

2009 Vizel, Grumberg, Interpolation-Sequence Based Model Checking

Multiple DAG Interpolant Paths 2010 Heizmann, Hoenicke, Podelski, Nested Interpolants

Albarghouthi, Gurfinkel, Chechik, Whale: An Inteprolation-Based 2012 Frameworks Horn Clauses Algorithm for Inter-Procedural Verification

Wednesday, March 30, 16 Program = Control + Data Analyzer = Constraint Generation Analysis of Interpolants Solver = Constraint Solving + Interpolant Construction

1957 1960 1970 1980 1990 2000 2010

2006 Jhala, McMillan, A Practical and Complete Approach to Predicate Refinement. Theory Independent

2010 D. Kroening, Purandare, Weissenbacher, Interpolant Strength.

Propositional Logic

2012 Rollini, Sery, Sharygina. Leveraging Interpolant Strength in Model Checking.

Alberti, Brutomesso, Ghilardi, Ranise, Sharygina, Lazy Abstraction with 2012 Interpolants for Arrays.

2013 Albarghouthi, McMillan, Beautiful Interpolants.

2013 Ruemmer, Subotic, Exploring Interpolants.

Wednesday, March 30, 16 Further Reading: Propositional Interpolants

1995 Huang, Constructing Craig Interpolation Formulas. (OTTER)

Jan Krajíček, Interpolation theorems, lower bounds for proof systems, and independence results for 1997 bounded arithmetic.

1997 Pudlák, Lower Bounds for Resolution and Cutting Plane Proofs and Monotone Computations

2003 McMillan, Interpolation and SAT-Based Model Checking.

2006 Yorsh, Musuvathi, A Combination Method for Generating Interpolants.

2009 Biere, Bounded Model Checking (in Handbook of Satisfiability).

2010 D. Kroening, Purandare, Weissenbacher. Interpolant Strength.

Wednesday, March 30, 16 Further Reading: Equality Interpolants

1996 Fitting, First-Order Logic and Automated Theorem Proving

2005 McMillan, An Interpolating Theorem Prover

2006 Yorsh, Musuvathi, A Combination Method for Generating Interpolants.

2009 Fuchs, Goel, Grundy, Krstic, Tinelli, Ground Interpolation for the Theory of Equality.

2014 Bonacina, Johansson, Interpolation Systems for Ground Proofs in Automated Reasoning

Wednesday, March 30, 16 Interpolation in Theories

2005 McMillan. Interpolating Theorem Prover LA(Q)

2006 Kapur, Majumdar, Zarba, Interpolation for Data Structures Datatype theories

2007 Rybalchenko, Sofronie-Stokkermans, Constraint Solving for Interpolation LA(Q)

Cimatti, Griggio, Sebastiani, Efficient Interpolant Generation in Satisfiability 2008 LA(Q), DL(Q), UTVPI Modulo Theories

Jain, Clarke, Grumberg, Efficient Craig Interpolation for Linear Diophantine 2008 LDE, LME (dis)Equations and Linear Modular Equations

2009 Cimatti, Griggio, Sebastiani, Interpolant Generation for UTVPI UTVPI

2011 Griggio, Effective Word-Level Interpolation for Software Verification Bit-Vectors

Wednesday, March 30, 16 Interpolation in Theory Combinations

2005 McMillan. Interpolating Theorem Prover LA(Q) over EUF over Bool

2005 Yorsh and Musuvathi, A Combination Method for Generating Interpolants Nelson-Oppen

Cimatti, Griggio, Sebastiani, Efficient Generation of Craig Interpolants in Delayed Theory 2009 Satisfiability Modulo Theories Combination

2009 Goel, Krstic, Tinelli, Ground Interpolation for Combined Theories Proof transformation

Proof Transformation 2012 Kovacs, Voronkov, Playing in the Gray Area of Proofs

Wednesday, March 30, 16