<<

TLA+ Model Checking Made Symbolic

Igor Konnov

Interchain Foundation / Informal Systems

Jure Kukovec Thanh-Hai Tran Swiss non-profit foundation

Supports R&D of applications that are: - secure and scalable - decentralized

Main focus: - the Cosmos Network - Tendermint consensus What is Cosmos?

Cosmos is a decentralized network of independent blockchains

Blockchains are powered by BFT consensus like Tendermint

They communicate over Inter-Blockchain Communication protocol

[https://cosmos.network/ecosystem]

Igor Konnov 3 of 47 Tendermint

Byzantine fault-tolerant State Machine Replication middleware

Consensus protocol adapts DLS & PBFT for blockchains:

- wide area network

- hundreds of validators and thousands of nodes

- communication via gossip

efficient and open source

Igor Konnov 4 of 47 [informal.systems]

Verification-Driven Development of Tendermint:

1. PODC-style specifications in English

2. TLA+ specifications (make English formal / fix it)

- model checking for finding bugs in TLA+ specs

3. Implementation in Rust

- model-based testing of the implementation using TLA+ specs

4. Automated verification of TLA+ specs

Igor Konnov 5 of 47 TLA+ model checking made symbolic Why TLA+?

Rich specification language

TLA+ is used in industry, e.g.,

TLA+ tools maintained at and

- an interactive proof system (TLAPS) - a model checker (TLC), state enumeration

Raft

Paxos (Synod), Egalitarian Paxos, Flexible Paxos

Apache Kafka several bugs found

Igor Konnov 7 of 47 TLA+

First-order with sets (ZFC)

Rich expression syntax:

- operations on sets, functions, tuples, records, sequences

Temporal operators:

- 2 (always), 3 (eventually), ; (leads-to), no Nexttime

Practice: safety properties, 2Invariant

Igor Konnov 8 of 47 APALACHE 0.5.0 Symbolic model checker that works under the assumptions of TLC: Fixed and finite constants (parameters)

Finite sets, function domains and co-domains

TLC’s restrictions on formula structure

Bounded model checking to check safety

As few language restrictions as possible

Technically, Quantifier-free formulas in SMT: QF_UFNIA Unfolding quantified expressions: x S : P as P[c/x] 8 2 c S V2 Igor Konnov 9 of 47 APALACHE 0.5.0 Symbolic model checker that works under the assumptions of TLC: Fixed and finite constants (parameters)

Finite sets, function domains and co-domains

TLC’s restrictions on formula structure

Bounded model checking to check safety

As few language restrictions as possible

Technically, Quantifier-free formulas in SMT: QF_UFNIA Unfolding quantified expressions: x S : P as P[c/x] 8 2 c S V2 Igor Konnov 9 of 47 example from distributed A service for reliable broadcast

one process broadcasts a message bcast

unforgeability: if no correct process received bcast, 000 ...0 then no correct process ever accepts bcast

correctness: if all correct processes received bcast, 111 ...1 then some correct process eventually accepts bcast

relay: if a correct process accepts bcast, 011 ...1 then all correct processes eventually accept bcast

Igor Konnov 11 of 47 Reliable broadcast by Srikanth & Toueg 87

local myval 0, 1 -- did process i receive bcast? ⌥ i 2{ } ⌅ while true do if myvali = 1 and not sent ECHO before then send ECHO to all

if received ECHO from at least n-2t distinct processes and not sent ECHO before then send ECHO to all

if received ECHO from at least n-t distinct processes then accept od

⌃resilience: of n > 3t processes, f t processes are Byzantine ⇧ 

Igor Konnov 12 of 47 How to check its properties?

I read that paper about Byzantine Model Checker

Model the as a threshold automaton

Verify safety and liveness for all n, t, f : n > 3t t f 0 ^ [forsyte.at//bymc]

I have heard this talk by

Let’s write it in TLA+

Run the TLC model checker for fixed parameters

Igor Konnov 13 of 47 Declaration and initialization

EXTENDS Integers , FiniteSets

N =4 12 T =4 3 F =4 3 Corr =4 1..(N F 1) Faulty =4 (N F) .. N

VARIABLES pc , rcvd , sent

Init =4 pc [Corr “V 0“, “V 1“ ] some processes receive the broadcast ^ 2 !{ } sent = no messages sent initially ^ {} rcvd [Corr ] no messages received initially ^ 2 !{} Transition relation

Next =4 p Corr : 9 2 Receive(p) ^ UponV 1(p) ^_ UponNonFaulty(p) _ UponAccept(p) _ UNCHANGED pc, sent _ h i

Receive (p) =4 newMessages SUBSET(sent Faulty) : 9 2 [ rcvd 0 =[rcvd EXCEPT ![self ]=rcvd[p] newMessages] [ Actions

UponV1 ( p ) =4 pc[p]=“V1” ^

pc0 =[pc EXCEPT ![p]=“SE”] sent0 = sent p ^ ^ [{ }

UponNonFaulty (p) =4 pc[p] “V0”, “V1” Cardinality(rcvd 0[p]) >= N 2 T ^ 2{ }^ ⇤

pc0 =[pc EXCEPT ![p]=“SE”] sent0 = sent p ^ ^ [{ }

UponAccept (p) =4 pc[p] “V0”, “V1”, “SE” Cardinality(rcvd 0[p]) >= N T ^ 2{ }^

pc0 =[pc EXCEPT ![p]=“AC”] ^

sent0 = sent (IF pc[p] = “SE” THEN p ELSE ) ^ [ 6 { } {} Safety?

unforgeability: if no correct process received bcast, 000 ...0 then no correct process ever accepts bcast

\* a non-inductive invariant Unforg =4 p Corr : pc[p] = “AC” 8 2 6

\* restricted initial states InitNoBcast =4 Init pc [Corr “V0” ] ^ 2 !{ }

Check that every state reachable from InitNoBcast satisfies Unforg Breaking unforgeability

12 processes, 4 faults n = 3f

APALACHE-MC: a counterexample in 5 minutes

- 12K SMT constants, 34K SMT assertions depth 6

TLC: a counterexample after 2 hrs 21 min

- 600M states depth 6 how does APALACHE work? What is hard about TLA+?

Rich data

sets of sets, functions, records, tuples, sequences

No types

TLA+ is not a programming language

No imperative statements like assignments

TLA+ is not a programming language

No standard control flow

TLA+ is not a programming language

Igor Konnov 21 of 47 Essential steps

Assignments TLA+ Flat TLA+ Reduction SMT & symbolic Types specification specification rules (UF_NIA) transitions

Extracting assignments and symbolic transitions

similar to TLC treat some x ... as assignments 0 2{ } Simple type inference

propagate types at every step x : Int gives us x : Set[Int] { } Bounded model checking

overapproximate data structures and use SMT

Igor Konnov 22 of 47 assignments & symbolic transitions Symbolic transitions [Kukovec, K., Tran, ABZ’18]

Next =4 p Corr : 9 2 Receive(p) ^ UponV 1(p) ^_ UponNonFaulty(p) _ UponAccept(p) _ UNCHANGED pc, sent _ h i

Automatically partitioning Next into four transitions:

p Corr : p Corr : 9 2 9 2 Receive(p) Receive(p) ^ ^ UponV 1(p) UponNonFaulty(p) ^ ^ p Corr : p Corr : 9 2 9 2 Receive(p) Receive(p) ^ ^ UponAccept(p) UNCHANGED pc, sent ^ ^ h i Igor Konnov 24 of 47 Symbolic transitions [Kukovec, K., Tran, ABZ’18]

Next =4 p Corr : 9 2 Receive(p) ^ UponV 1(p) ^_ UponNonFaulty(p) _ UponAccept(p) _ UNCHANGED pc, sent _ h i

Automatically partitioning Next into four transitions:

p Corr : p Corr : 9 2 9 2 Receive(p) Receive(p) ^ ^ UponV 1(p) UponNonFaulty(p) ^ ^ p Corr : p Corr : 9 2 9 2 Receive(p) Receive(p) ^ ^ UponAccept(p) UNCHANGED pc, sent ^ ^ h i Igor Konnov 24 of 47 Types

Igor Konnov 25 of 47 Types: scalars and functions

Basic: constants: Const “a”, “hello” integers: Int -1, 1024 Booleans: Bool FALSE, TRUE

Finite sets: Set[⌧] Set[Set[Int]]

Function-like: functions: ⌧ ⌧ Int Bool 1 ! 2 ! tuples: ⌧ ⌧ Int Bool (Int Int) 1 ⇥···⇥ n ⇥ ⇥ ! records: [Const ⌧ ,...,Const ⌧ ] [“a” Int, “b” Bool] 7! 1 7! n 7! 7! sequences: Seq(⌧) Seq[Int]

Igor Konnov 26 of 47 Simple type inference

Knowing the types at the current state

Compute the types of the expressions and of the primed variables

if X has type Set[Int]

X [X X] has type Int Int 0 2 ! ! y in y X : y > 0 has type Int { 2 } and are polymorphic constructors for sets and sequences {} hi hence, we ask the user to specify the type, e.g., <: Int {} { } records also require type annotations

Igor Konnov 27 of 47 Bounded model checking

Igor Konnov 28 of 47 Old recipe for bounded symbolic computations

Two symbolic transitions that assign values to x

Next =4 A B _

Translate TLA+ expressions to SMT with some · J K state 0 state 1 state 2 ...

Init x i A[i /x] x 0 a A[c /x] x 0 a 7! 0 0 7! 1 1 7! 2 B[i /x] x 0 b B[c /x] x 0 b ... 0 7! 1 1 7! 2 x 0 a , b x 0 c x 0 a , b x 0 c J K J 2{K1 1} 7! 1 J 2{ 2K 2} 7! 2 J K J K J K J K

Igor Konnov 29 of 47 What is ? · J K

Igor Konnov 30 of 47 Our idea

Mimic the semantics implemented by TLC

Compute layout of data structures, constrain contents with SMT

Define operational semantics by reduction rules (for finite models)

trade efficiency for expressivity

Igor Konnov 31 of 47 Static picture of TLA+ values and relations between them

Arena: SMT:

integer sort Int c5

1 2 Boolean sort Bool

c = FALSE 4 c3 name, e.g., "abc", uninterpreted sort

1 2 finite set: - a constant c of uninterpreted sort set⌧ - propositional constants for members c1 = 22 c2 = 4 in c ,c ,...,in c ,c h 1 i h n i Arenas for sets: 1, 2 , 2, 3 {{ } { }}

c6 : Set[Set[Int]] 1 2

c4 : Set[Int] c5 : Set[Int] 1 2 1 2

c1 : Int c2 : Int c3 : Int

SMT defines the contents, e.g., to get 1 , 2 : {{ } { }}

in c ,c in c ,c in c ,c in c ,c h 1 4i ^¬ h 2 4i ^ h 2 5i ^¬ h 3 5i

Igor Konnov 33 of 47 Tuples and records: "a", 3, [b 0, c 3] h 7! 7! i

c : Name Int [b : Int, c : Int] 15 ⇤ ⇤ 3

c14 :[b : Int, c : Int] 1 2 2 1 c11 : Name c12 : Int c13 : Int

Arena and types precisely define the contents of tuples and records

Igor Konnov 34 of 47 Functions and sequences

a function f : ⌧ ⌧ is encoded with its relation: 1 ! 2

x, f [x] : x DOMAIN f {h i 2 }

a sequence is encoded as a triple:

fun, start, end h i

Igor Konnov 35 of 47 Abstract reduction system

A state is e Ar ⌫ : | | | aTLA+ ⌦expression e↵and arena Ar,

a valuation ⌫ : Vars Cells ! [{?} SMT constraints

Reduction rules:

simplify the expression, enrich the arena and add constraints

Igor Konnov 36 of 47 A reduction sequence

c1 1 c1 c3 c5 1 2{{ }} c c 2{ } c1 c4 {}2{{ }} 1 2{{ 2}} 2

Arena: c1, c2, c3, c4, c5 c c , c c , 3 ! 2 4 ! 3

c : U c : Int c : U c : U c5 SMT: 1 SSI 2 3 SI 4 SSI $ in c ,c h 3 4i c2 = 1 in c ,c in c ,c h 2 3i h 3 4i ^ c1 = c3 ... Rewriting a set filter

x 1 , 2 : p(x) x c : p(x) c { 2{ } } { 2 3 } 4

Corresponding arena

c3 : Set[Int] c3 : Set[Int] c4 : Set[Int]

(empty)

c1 : Int c2 : Int c1 : Int c2 : Int

Igor Konnov 38 of 47 SMT constraints for set filter

in c ,c in c ,c c5 = true h 4 1i , h 3 1i ^ in c ,c in c ,c c6 = true h 4 2i , h 3 2i ^

Set filtering x 1 , 2 : p(x) p(1) c5 : Bool { 2{ } }

c3 : Set[Int] c4 : Set[Int] p(2) c6 : Bool

1 2 1 2

c1 : Int c2 : Int

Igor Konnov 39 of 47 Equalities

Integers, Booleans, and string constants

SMT equality (=)

Sets, functions, records, tuples, and sequences

- lazy, define X = Y when needed e.g., X Y Y X ✓ ^ ✓ - avoid redundant constraints

- use locality thanks to arenas, cache equalities

Igor Konnov 40 of 47 + + KERA : a core language of TLA action operators

reduction rules + proofs

Igor Konnov 41 of 47 Experiments

Igor Konnov 42 of 47 Benchmarks

Name Description LCR-n Leader election in rings with n processes Bakery-n Bakery algorithm for of n processes bcastByz-n Reliable broadcast of n processes bcastFolk-n Folklore broadcast of n processes EWD840-n Termination detection in a ring of n processes Paxos-n Paxos consensus for n acceptors with crash faults Prisoners-n Puzzle of n prisoners Raft-n Raft consensus for n processes and crash faults SimpAlloc-c-r Simple resource allocator with c clients and r re- sources Traffic Traffic example TwoPhase-n Two-phase commit with n resource managers

Repository: github.com/tlaplus/Examples/

Igor Konnov 43 of 47 Inductive invariant checking

Wrong invariant candidates Benchmark APALACHE TLC Bakery–5 1 min N/A TwoPhase–7 1 min 3 hrs LCR–5 1 min 2 hrs bcastByz–10 1 min 19 min EWD840–11 1 min 12 min

Inductive invariants Benchmark APALACHE TLC Bakery–5 1 min N/A TwoPhase–7 1 min 3 hrs LCR–5 1 min 2 hrs bcastByz–4 1 min 1 min EWD840–10 1 min 1 min

Igor Konnov 44 of 47 Bounded model checking with invariants

(TO = 24 hours)

Igor Konnov 45 of 47 Symbolic model checker for TLA+ [OOPSLA’19]

Reduction SMT TLA+ rules (Z3)

Distributed algorithms Invariants Fixed parameters, bounded executions Inductive invariants Fixed parameters

forsyte.at/research/apalache/

Igor Konnov 46 of 47 [informal.systems]

TLC does not scale to minimal interesting examples

- Byzantine faults - very non-deterministic environment

Starting to use Apalache

Improvements in the encoding Parallel model checker?

We are hiring!