Steffen Smolka, Praveen Kumar, David Kahn, Nate Foster, Justin Hsu, Dexter Kozen, Alexandra Silva

Scalable Verifcation of Probabilitic Networks

Google Calendar down while preparing this talk! Key Issue: ` Too complex for human reasoning Example: Network Veriﬁcation

Network topology:

SRC DST

Network conﬁg: shortest path Example: Network Veriﬁcation

Network topology:

SRC DST

Veriﬁcation Tool

Network conﬁg: shortest path Example: Network Veriﬁcation

Network topology: “Will my packet get delivered?

SRC DST

Veriﬁcation Tool

Network conﬁg: shortest path Example: Network Veriﬁcation

Network topology: “Will my packet get delivered?

No! SRC DST (+ counterexample)

Veriﬁcation Tool

Network conﬁg: Yes! shortest path (+ formal proof) Example: Network Veriﬁcation

Network topology: “Will my packet get delivered?

SRC DST

Veriﬁcation Tool

Network conﬁg: Yes! shortest path (+ formal proof) Example: Network Veriﬁcation

Network topology: “Will my packet get delivered?

SRC DST

Veriﬁcation Tool Network conﬁg: shortest path

Failure model: links fail independently with probability 1% Example: Network Veriﬁcation

Network topology: “Will my packet get delivered? ? ? SRC DST

Veriﬁcation maybe Tool Network conﬁg: shortest path

Failure model: links fail independently with probability 1% Example: Probabilistic Network Veriﬁcation

Network topology: “Will my packet get delivered?

SRC DST

McNetKAT Network conﬁg: shortest path

Failure model: links fail independently with probability 1% Example: Probabilistic Network Veriﬁcation

Network topology: “Will my packet get delivered? .99 .99 SRC DST

with probability McNetKAT (0.99)2 Network conﬁg: shortest path

Failure model: links fail independently with probability 1% Example: Probabilistic Network Veriﬁcation

Network topology: “Will my packet get delivered?

SRC DST

McNetKAT Network conﬁg: shortest path + detour uniform at random Failure model: ≤ 2 links fail independently with probability 1% Example: Probabilistic Network Veriﬁcation

Network topology: “Will my packet get delivered?

SRC DST

with probability McNetKAT 1 Network conﬁg: shortest path + detour uniform at random Failure model: ≤ 2 links fail independently with probability 1% Example: Probabilistic Network Veriﬁcation

Network topology: “Will my packet get delivered?

SRC DST

with probability McNetKAT 1 Network config: shortest path + detour uniform at random → configuration is Failure model: ≤ 2 links fail independently 2-resilient with probability 1% Example: Infinite Paths

SRC DST

Network config: shortest path + detour uniform at random Failure model: ≤ 2 links fail independently with probability 1% Example: Infinite Paths delivery with prob. 1 despite infinite loop?

SRC DST

Network config: shortest path + detour uniform at random Failure model: ≤ 2 links fail independently with probability 1% Example: Infinite Paths infinite loop has probability 0 ½ ⅓

SRC DST

Network config: shortest path + detour uniform at random Failure model: ≤ 2 links fail independently with probability 1% Example: Infinite Paths infinite loop has probability 0 ½ ⅓

SRC DST

Network conﬁg: ⅓ 1 shortest path + detour uniform at random Failure model: ≤ 2 links fail independently with probability 1% Challenge: How to automate this? Solution

⅓ ⅓ SRC DST

⅓

Packets in probabilistic network undergo random walk ‣ can be modeled as Markov chain ‣ limiting distribution computable in closed form Challenge: State Space

⅓ State Space ⅓ SRC DST Location x Header Values ⅓ Challenge: State Space

⅓ State Space ⅓ SRC DST Location x Header Values ⅓ (~1k) (~ 2160) Challenge: State Space

⅓ State Space ⅓ SRC DST Location x Header Values ⅓ (~1k) (~ 2160)

larger than # particles in the universe Challenge: State Space

⅓ State Space ⅓ SRC DST Location x Header Values ⅓ (~1k) (~ 2160)

But: Markov chain very structured in practice ‣ sparse transition structure ‣ many similar states ‣ analysis can be made tractable using clever data structures Implementation Scalable Verication of Probabilistic Networks∗ Steen Smolka Praveen Kumar David M. Kahn† Nate Foster Cornell University Cornell University Carnegie Mellon University Cornell University Ithaca, NY, USA Ithaca, NY, USA Pittsburgh, PA, USA Ithaca, NY, USA

Justin Hsu† Dexter Kozen Alexandra Silva University of Wisconsin Cornell University University College London Madison, WI, USA Ithaca, NY, USA London, UK

⟦-⟧

Markov chain Scalable Verication of Probabilistic Networks∗ Steen Smolka Praveen Kumar David M. Kahn† Nate Foster Cornell University Cornell University Carnegie Mellon University Cornell University Ithaca, NY, USA Ithaca, NY, USA Pittsburgh, PA, USA Ithaca, NY, USA

Justin Hsu† Dexter Kozen Alexandra Silva University of Wisconsin Cornell University University College London Madison, WI, USA Ithaca, NY, USA London, UK

⟦-⟧

Justin Hsu† Dexter Kozen Alexandra Silva University of Wisconsin Cornell University University College London Madison, WI, USA Ithaca, NY, USA London, UK

Abstract of Probabilistic Networks. In Proceedings of the 40th ACM SIGPLAN This paper presents McNetKAT, a scalable tool for verifying Conference on Programming Language Design and Implementation probabilistic network programs. McNetKAT is based on a (PLDI ’19), June 22–26, 2019, Phoenix, AZ, USA. ACM, New York, NY, USA, 1 page. hps://doi.org/10.1145/3314221.3314639 new semantics for the guarded and history-free fragment of Probabilistic NetKAT in terms of nite-state, absorbing ∗Extended version with appendix. † Markov chains. This view allows the semantics of all pro- Work performed at Cornell University. grams to be computed exactly, enabling construction of an PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA automatic verication tool. Domain-specic optimizations © 2019 Copyright held by the owner/author(s). Publication rights licensed and a parallelizing backend enable McNetKAT to analyze to ACM. networks with thousands of nodes, automatically reasoning This is the author’s version of the work. It is posted here for your personal use. Not for redistribution. The denitive Version of Record was published in about general properties such as probabilistic program equiv- Proceedings of the 40th ACM SIGPLAN Conference on Programming Language alence and renement, as well as networking properties such Design and Implementation (PLDI ’19), June 22–26, 2019, Phoenix, AZ, USA, as resilience to failures. We evaluate McNetKAT’s scalabil- hps://doi.org/10.1145/3314221.3314639. ity using real-world topologies, compare its performance McNetKATPLDI ’19, June architecture 22–26, 2019, Phoenix, AZ, USA S. Smolka, P. Kumar, D. Kahn, N. Foster, J. Hsu, D. Kozen, and A. Silva against state-of-the-art tools, and develop an extended case compute limiting distribution pt=1 study on a recently proposed data center network design. if pt=1 then if port= then ? pt=1 pt=2 pt=3 pt= 1 pt 2 . pt 3 ⇤ 0 5 CCS Concepts • Theory of computation → Automated port 2 0.5 port 3 else if pt=2 then pt=2 ? 1 1 1 reasoning; Program semantics; Random walks and Markov else if port=2 then pt1 pt=1 compile 2 2 else if pt=3 then convert 2 3 chains; • Networks → Network properties; • Software and port1 pt=3 pt=2 6 1 7 pt 1 6 7 → . else 6 7 its engineering Domain specic languages else pt=3 6 1 7 drop 6 7 drop pt= 6 1 7 Keywords Network verication, Probabilistic Programming pt 2 0.5 pt 3 pt 1 drop ⇤ 6 7 6 7 network model symbolic IR 6 sparse matrixSolve 7 ACM Reference Format: Compile Convert 6 7 [ICFP ’15] 4 5 Steen Smolka, Praveen Kumar, David M. Kahn, Nate Foster, Justin = ProbNetKAT program Network model Symbolic IR Sparse matrix [ESOP ’16, POPL ’17] Hsu, Dexter Kozen, and Alexandra Silva. 2019. Scalable Verication Figure 5. Implementation using FDDs and a sparse linear algebra solver. ⟦-⟧ model checking to represent large state spaces compactly. A 5.2 PRISM backend variant called Forwarding Decision Diagrams (FDDs) [45] PRISM is a mature probabilistic model checker that has been was previously⟦-⟧ developed specically for the networking actively developed and improved for the last two decades. domain, but only supported deterministic behavior. In this The tool takes⟦- as⟧ input a Markov chain model specied sym- work, we extended FDDs to probabilistic FDDs. A probabilis- bolically in PRISM’s input language and a property specied tic FDD is a rooted directed acyclic graph that can be un- using a logic such as Probabilistic CTL, and outputs the derstood as a control-ow graph. InteriorMarkov nodes test chain packet probability that the model satises the property. PRISM sup- elds and have outgoing true- and false- branches, which ports various types of models including nite state Markov we visualize by solid lines and dashed lines in Figure 5. Leaf chains, and can thus be used as a backend for reasoning about nodes contain distributions over actions, where an action ProbNetKAT programs using our results from §3 and §4. Ac- is either a set of modications or a special action drop.To cordingly, we implemented a second backend that translates interpret an FDD, we start at the root node with an initial ProbNetKAT to PRISM programs. While the native backend packet and traverse the graph as dictated by the tests until a computes the big step semantics of a program—a costly op- leaf node is reached. Then, we apply each action in the leaf eration that may involve solving linear systems to compute node to the packet. Thus, an FDD represents a function of xed points—the PRISM backend is a purely syntactic trans- type Pk Pk + ? , or equivalently, a stochastic matrix ! D( ) formation; the heavy lifting is done by PRISM itself. over the state space Pk + ? where the ?-row puts all mass A PRISM program consists of a set of bounded variables on ? by convention. Like BDDs, FDDs respect a total order together with a set of transition rules of the form on tests and contain no isomorphic subgraphs or redundant tests, which enables representing sparse matrices compactly. p u + + p u ! 1 · 1 ··· k · k

where is a Boolean predicate over the variables, the pi are probabilities that must sum up to one, and the ui are sequences of variable updates. The predicates are required Dynamic domain reduction. As Figure 5 shows, we do to be mutually exclusive and exhaustive. Such a program not have to represent the state space Pk + ? explicitly even encodes a Markov chain whose state space is given by the when converting into sparse matrix form. In the example, the nite set of variable assignments and whose transitions are state space is represented by symbolic packets pt = 1, pt = 2, dictated by the rules: if is satised under the current as- pt = 3, and pt = , each representing an equivalence class signment and is obtained from by performing update ⇤ i of packets. For example, pt = 1 can represent all packets ui , then the probability of a transition from to i is pi . satisfying .pt = 1, because the program treats all such It is easy to see that any PRISM program can be expressed packets in the same way. The packet pt = represents the in ProbNetKAT, but the reverse direction is slightly tricky: ⇤ set .pt < 1, 2, 3 . The symbol can be thought it requires the introduction of an additional variable akin to { | { }} ⇤ of as a wildcard that ranges over all values not explicitly a program counter to emulate ProbNetKAT’s control ow represented by other symbolic packets. The symbolic packets primitives such as loops and sequences. As an additional are chosen dynamically when converting an FDD to a matrix challenge, we must be economical in our allocation of the by traversing the FDD and determining the set of values program counter, since the performance of model checking appearing in each eld, either in a test or a modication. is very sensitive to the size of the state space. Since FDDs never contain redundant tests or modications, We address this challenge in three steps. First, we translate these sets are typically of manageable size. the ProbNetKAT program to a nite state machine using a

8 Scalable Verication of Probabilistic Networks∗ Steen Smolka Praveen Kumar David M. Kahn† Nate Foster Cornell University Cornell University Carnegie Mellon University Cornell University Ithaca, NY, USA Ithaca, NY, USA Pittsburgh, PA, USA Ithaca, NY, USA

Justin Hsu† Dexter Kozen Alexandra Silva University of Wisconsin Cornell University University College London Madison, WI, USA Ithaca, NY, USA London, UK

Abstract of Probabilistic Networks. In Proceedings of the 40th ACM SIGPLAN This paper presents McNetKAT, a scalable tool for verifying Conference on Programming Language Design and Implementation probabilistic network programs. McNetKAT is based on a (PLDI ’19), June 22–26, 2019, Phoenix, AZ, USA. ACM, New York, NY, USA, 1 page. hps://doi.org/10.1145/3314221.3314639 new semantics for the guarded and history-free fragment of Probabilistic NetKAT in terms of nite-state, absorbing ∗Extended version with appendix. † Markov chains. This view allows the semantics of all pro- Work performed at Cornell University. grams to be computed exactly, enabling construction of an PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA automatic verication tool. Domain-specic optimizations © 2019 Copyright held by the owner/author(s). Publication rights licensed and a parallelizing backend enable McNetKAT to analyze to ACM. networks with thousands of nodes, automatically reasoning This is the author’s version of the work. It is posted here for your personal use. Not for redistribution. The denitive Version of Record was published in about general properties such as probabilistic program equiv- Proceedings of the 40th ACM SIGPLAN Conference on Programming Language alence and renement, as well as networking properties such Design and Implementation (PLDI ’19), June 22–26, 2019, Phoenix, AZ, USA, as resilience to failures. We evaluate McNetKAT’s scalabil- hps://doi.org/10.1145/3314221.3314639. ity using real-world topologies, compare its performance McNetKATPLDI ’19, June architecture 22–26, 2019, Phoenix, AZ, USA S. Smolka, P. Kumar, D. Kahn, N. Foster, J. Hsu, D. Kozen, and A. Silva against state-of-the-art tools, and develop an extended case compute limiting distribution pt=1 study on a recently proposed data center network design. if pt=1 then if port= then ? pt=1 pt=2 pt=3 pt= 1 pt 2 . pt 3 ⇤ 0 5 CCS Concepts • Theory of computation → Automated port 2 0.5 port 3 else if pt=2 then pt=2 ? 1 1 1 reasoning; Program semantics; Random walks and Markov else if port=2 then pt1 pt=1 compile 2 2 else if pt=3 then convert 2 3 chains; • Networks → Network properties; • Software and port1 pt=3 pt=2 6 1 7 pt 1 6 7 → . else 6 7 its engineering Domain specic languages else pt=3 6 1 7 drop 6 7 drop pt= 6 1 7 Keywords Network verication, Probabilistic Programming pt 2 0.5 pt 3 pt 1 drop ⇤ 6 7 6 7 network model symbolic IR 6 sparse matrixSolve 7 ACM Reference Format: Compile Convert 6 7 [ICFP ’15] 4 5 Steen Smolka, Praveen Kumar, David M. Kahn, Nate Foster, Justin = ProbNetKAT program Network model Symbolic IR Sparse matrix [ESOP ’16, POPL ’17] Hsu, Dexter Kozen, and Alexandra Silva. 2019. Scalable Verication Figure 5. Implementation using FDDs and a sparse linear algebra solver. ⟦-⟧ model checkingcan be to represent analyzed large state spaces compactly. A 5.2 PRISM backend variant called Forwarding Decision Diagrams (FDDs) [45…and] PRISM standard is a mature probabilistic tools model checker that has been wasusing previously⟦ -custom⟧ developed speci code…cally for the networking actively developed and improved for the last two decades. domain, but only supported deterministic behavior. In this The tool takes⟦- as⟧ input a Markov chain model specied sym- work, we extended FDDs to probabilistic FDDs. A probabilis- bolically in PRISM’s input language and a property specied tic FDD is a rooted directed acyclic graph that can be un- using a logic such as Probabilistic CTL, and outputs the derstood as a control-ow graph. InteriorMarkov nodes test chain packet probability that the model satises the property. PRISM sup- elds and have outgoing true- and false- branches, which ports various types of models including nite state Markov we visualize by solid lines and dashed lines in Figure 5. Leaf chains, and can thus be used as a backend for reasoning about nodes contain distributions over actions, where an action ProbNetKAT programs using our results from §3 and §4. Ac- is either a set of modications or a special action drop.To cordingly, we implemented a second backend that translates interpret an FDD, we start at the root node with an initial ProbNetKAT to PRISM programs. While the native backend packet and traverse the graph as dictated by the tests until a computes the big step semantics of a program—a costly op- leaf node is reached. Then, we apply each action in the leaf eration that may involve solving linear systems to compute node to the packet. Thus, an FDD represents a function of xed points—the PRISM backend is a purely syntactic trans- type Pk Pk + ? , or equivalently, a stochastic matrix ! D( ) formation; the heavy lifting is done by PRISM itself. over the state space Pk + ? where the ?-row puts all mass A PRISM program consists of a set of bounded variables on ? by convention. Like BDDs, FDDs respect a total order together with a set of transition rules of the form on tests and contain no isomorphic subgraphs or redundant tests, which enables representing sparse matrices compactly. p u + + p u ! 1 · 1 ··· k · k

8 Evaluation Scalability on FatTree with ECMP

103 with failures without failures > 1000 hosts

102

101 Time (seconds)

100 102 103 1umber of switches McNetKAT scales to data-center-size networks. SpeedupPLDI through ’19, June 22–26, parallelization 2019, Phoenix, AZ, USA S. Smolka, P. Kumar, D. Kahn, N. Foster, J. Hsu, D. Kozen, and A. Silva

60 S1 S4k+1 FatTree S 14 H1 S S S S + H2 50 FatTree S 16 0 3 4k 4k 3 S2 pfail S4k+2 pfail 40

30 Figure 9. Chain topology 6SeeduS 20

0 0 20 40 60 80 100 1umber of Fores Often near optimalFigure 8. Speedup speedup. due to Up parallelization. to 40x on 60 cores.

Semantically, this construct is equivalent to a cascade of conditionals; but the native backend compiles it in parallel using a map-reduce-style strategy, using one process per core by default. To evaluate the impact of parallelization, we compiled two representative FatTree models (p = 14 and p = 16) using ECMP routing on an increasing number of cores. With m cores, we used one master machine together with r = Figure 10. Scalability on chain topology. ⌈m/16 − 1⌉ remote machines, adding machines one by one as needed to obtain more physical cores. The results are shown in Figure 8. We see near linear speedup on a single machine, p = 1/1000. For k > 1, the network consists of k diamonds cutting execution time by more than an order of magnitude fail linked together into a chain as shown in Figure 9. Within on our 16-core test machine. Beyond a single machine, the each diamond, switch S forwards packets with equal proba- speedup depends on the complexity of the submodels for 0 bility to switches S and S , which in turn forward to switch each switch—the longer it takes to generate the matrix for 1 2 S . However, S drops the packet if the link to S fails. We each switch, the higher the speedup. For example, with a 3 2 3 analyze the probability that a packet originating at H1 is p = 16 FatTree, we obtained a 30x speedup using 40 cores successfully delivered to H2. Our implementation does not across 3 machines. exploit the regularity of these topologies. Comparison with other tools. Bayonet [15] is a state-of- Figure 10 gives the running time for several tools on the-art tool for analyzing probabilistic networks. Whereas this benchmark: Bayonet, hand-written PRISM, ProbNetKAT McNetKAT has a native backend tailored to the networking with the PRISM backend (PPNK), and ProbNetKAT with the domain and a backend based on a probabilistic model checker, native backend (PNK). Further, we ran the PRISM tools in Bayonet programs are translated to a general-purpose prob- exact and approximate mode, and we ran the ProbNetKAT abilistic language which is then analyzed by the symbolic in- backend on a single machine and on the cluster. Note that ference engine PSI [16]. Bayonet’s approach is more general, both axes in the plot are log-scaled. as it can model queues, state, and multi-packet interactions We see that Bayonet scales to 32 switches in about 25 under an asynchronous scheduling model. It also supports minutes, before hitting the one hour time limit and 64 GB Bayesian inference and parameter synthesis. Moreover, Bay- memory limit at 48 switches. ProbNetKAT answers the same onet is fully symbolic whereas McNetKAT uses a numerical query for 2048 switches in under 10 seconds and scales to linear algebra solver [7] (based on foating point arithmetic) over 65000 switches in about 50 minutes on a single core, to compute limits. or just 2.5 minutes using a cluster of 24 machines. PRISM To evaluate how the performance of these approaches scales similarly to ProbNetKAT, and performs best using the compares, we reproduced an experiment from the Bayonet hand-written model in approximate mode. paper that analyzes the reliability of a simple routing scheme Overall, this experiment shows that for basic network in a family of “chain” topologies indexed by k, as shown in verifcation tasks, ProbNetKAT’s domain-specifc backend Figure 9. based on specialized data structures and an optimized linear- For k = 1, the network consists of four switches organized algebra library [7] can outperform an approach based on a into a diamond, with a single link that fails with probability general-purpose solver.

199 PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA S. Smolka, P. Kumar, D. Kahn, N. Foster, J. Hsu, D. Kozen, and A. Silva

S1 S4k+1

H1 S0 S3 S4k S4k+3 H2

S2 pfail S4k+2 pfail

Figure 9. Chain topology Scalable Verification of Probabilistic Networks PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA S. Smolka, P. Kumar, D. Kahn, N. Foster, J. Hsu, D. Kozen, and A. Silva

103 60 S1 S4k+1 FatTree S 14 3 H1 S S S S + H2 50 FatTree S 16 10 0 3 4k 4k 3 TiPe liPit 3600s S2 pfail S4k+2 pfail 102 40 102 30 Figure 9. Chain topology Figure 8. Speedup due to parallelization. Scalability3RIS0 6SeeduS Parallelization 1 Comparison 101 20 10 TiPe (secRnds) 3RIS0 (#I 0) TiPe (seconds) native Semantically,10 this construct is equivalent to a cascade of native (#I 0) 100 s1 s2 s3 s4 s5 s6 s7 s8 0 conditionals; but the native backend compiles it in parallel 10 4 0 0 1 2 3 4 5 102 103 10using a map-reduce-style0 20 strategy,40 using60 one process80 per100 10 10 10 10 10 10 = 1uPber RI switches 1umber of Fores 1uPber of switcKes Figure 6. A FatTree topology with p 4. core by default. Figure 7. Scalability on a family of data center topologies.To evaluate theFigure impact 8. Speedup of parallelization, due to parallelization. we compiled Thompson-style construction [37]. Each edge is labeled with two representative FatTree models (p = 14 and p = 16) PLDI ’19, JuneScalable 22–26, Verification 2019, Phoenix, of Probabilistic AZ, USA Networks S. Smolka,PLDI P. ’19, Kumar, June D. 22–26, Kahn, PLDI 2019, N. Foster, ’19, Phoenix, June J. 22–26, Hsu, AZ, D. USA2019, Kozen, Phoenix, and S. A. AZ, Smolka, Silva USA P. Kumar, D. Kahn, N. Foster, J. Hsu, D. Kozen, and A. Silva a predicate ϕ, a probability pi , and an update ui , subject to trees that are widely used as topologies in modern data cen-using ECMPSemantically, routing on this an construct increasing is equivalent number of to cores. a cascade With of = the following well-formedness conditions: ters. FatTrees can be specifed in terms of a parametermpcores,conditionals; we used one but master the native machine backend together compiles it with in parallelr using a map-reduce-style strategy, using one process per Figure 10. Scalability on chain topology. 1. For each state, the predicates on its outgoing edges ⌈m/16 − 1⌉ remote machines, adding machines one by one as 1.0 corresponding1.00 to the number of ports on each switch. A core by default.M(F100, fk ) M(F103, fk ) M(F103,5, fk ) compare compare compare form a partition. 1 3 5 2 needed to obtain more physical cores. The results are shown p-ary FatTree connectsC 4p servers using 4p switches. To To evaluatek the≡ impact of≡ parallelization,≡ we compiled k F100 F103 F103,5 in Figure 8. We see! nearteleport linear! speedupteleport on! a=teleport single machine,= F10 F10 teleport 2. For each state and predicate, the probabilities of all route packets,0.95 we used a form of Equal-Cost Multipath Rout- two representative FatTree models (p 14 and p 16) p = 13 /1000. For3,5 k > 1, the network consists of k diamonds cutting execution time by more than an order of magnitude fail 0.9 outgoing edges guarded by that predicate sum to one. ing (ECMP) that randomly maps trafc fows onto shortest using ECMP routing on an increasing number of cores. With linked together into a chain as shown in Figure 9. Within on our 16-corem cores,0 test we machine. used✓✓ one master Beyond machine a single together machine,✓ with ther = 0 ≡≡≡ Intuitively, the state machine encodes the control-fow graph. paths. We measured0.90 the time needed to construct the sto- each diamond,Figure switch 10. ScalabilityS0 forwards on chain packets topology. with equal proba- speedup⌈m depends/161− 1⌉ onremote the✗✓✓ machines, complexity adding of machines the submodels one by one for as 1 < 0.8 ≡≡ This intuition serves as the inspiration for the next transla- chastic matrix representation✗ of the program on a single needed to obtain more physical cores. The results are shown bility to switches S1 and S2, which in turn forward to switch DeliveryAB FatTree,′ F10 ′′ each switch—theResilience longer✗✓✓ it takes to generate the matrix for < ≡≡Latency A 0 2 2 AB Fat7rHH, F100 3r[delLvery] A A tion step, which collapses each basic block of the graph into machine using0.85 two backends (native and PRISM) and under in Figure 8. We see near linear speedup on a single machine, S3. However, S2 drops the packet if the link to S3 fails. We AB FatTree, F103 each switch, the higher✗✗✓ the speedup. For example, with a p =<<1/1000. For k > 1, the≡ networkAB consists Fat7rHH, of kF10diamonds two failure models (no failures and independent failures with cutting3 execution time by more than an order of magnitude 3 analyzefail the probability that a packet originating3 at H1 is a single state. This step is crucial for reducing the state space, p = 16 FatTree, we obtained a 30x speedup using 40 cores Fount 3r[hop ≤ [] 0.7 AB FatTree, F103, 5 ✗✗ ✗ linked<<< together into a chain as shownAB Fat7rHH, in Figure F10 9. Within3, 5 since the state space of the initial automaton is linear in the probability 1/1000). on our4 16-core test machine. Beyond a single machine, the 4 successfully delivered to H2. Our implementation does not 0.80 FatTree, F10 across 3 machines. each diamond, switch S0 forwards packets with equal proba- 3, 5 speedup∞ depends✗✗ on the complexity of the ✗ submodels for ∞ <<

201 201 Summary: McNetKAT

First scalable veriﬁcation tool for probabilistic networks ๏ can reason, e.g., about fault-tolerance

Based on theory of Markov chains ๏ provides solid mathematical foundation ๏ enables computing limits in closed form

Scales thanks to ๏ sparsity, symbolic data structures ๏ parallelization ๏ optimized linear algebra solver Thank you!

Scalable Verifcation of Probabilistic Networks SteCodefen Smolka available:Praveen https://smolka.st/artifacts/mcnetkat/ Kumar David M. Kahn∗ Nate Foster Cornell University Cornell University Carnegie Mellon University Cornell University Ithaca, NY, USA Ithaca, NY, USA Pittsburgh, PA, USA Ithaca, NY, USA Justin Hsu∗ Dexter Kozen Alexandra Silva University of Wisconsin Cornell University University College London Madison, WI, USA Ithaca, NY, USA London, UK Abstract 1 Introduction This paper presents McNetKAT, a scalable tool for verifying Networks are among the most complex and critical com- probabilistic network programs. McNetKAT is based on a puting systems used today. Researchers have long sought new semantics for the guarded and history-free fragment to develop automated techniques for modeling and analyz- of Probabilistic NetKAT in terms of fnite-state, absorbing ing network behavior [40], but only over the last decade Markov chains. This view allows the semantics of all pro- has programming language methodology been brought to grams to be computed exactly, enabling construction of an bear on the problem [5, 6, 28], opening up new avenues automatic verifcation tool. Domain-specifc optimizations for reasoning about networks in a rigorous and principled and a parallelizing backend enable McNetKAT to analyze way [3, 12, 19, 21, 25]. Building on these initial advances, networks with thousands of nodes, automatically reasoning researchers have begun to target more sophisticated net- about general properties such as probabilistic program equiv- works that exhibit richer phenomena. In particular, there is alence and refnement, as well as networking properties such renewed interest in randomization as a tool for designing as resilience to failures. We evaluate McNetKAT’s scalabil- protocols and modeling behaviors that arise in large-scale ity using real-world topologies, compare its performance systems—from uncertainty about the inputs, to expected against state-of-the-art tools, and develop an extended case load, to likelihood of device and link failures. study on a recently proposed data center network design. Although programming languages for describing randomized networks exist [11, 15], support for automated reasoning CCS Concepts • Theory of computation → Automated remains limited. Even basic properties require quantitative reasoning; Program semantics; Random walks and Markov reasoning in the probabilistic setting, and seemingly sim- chains; • Networks → Network properties; • Software and ple programs can generate complex distributions. Whereas its engineering → Domain specifc languages. state-of-the-art tools can easily handle deterministic net- Keywords Network verifcation, Probabilistic Programming works with hundreds of thousands of nodes, probabilistic tools are currently orders of magnitude behind. ACM Reference Format: This paper presents McNetKAT, a new tool for reason- Stefen Smolka, Praveen Kumar, David M. Kahn, Nate Foster, Justin ing about probabilistic network programs written in the Hsu, Dexter Kozen, and Alexandra Silva. 2019. Scalable Verifcation guarded and history-free fragment of Probabilistic NetKAT of Probabilistic Networks. In Proceedings of the 40th ACM SIGPLAN (ProbNetKAT)[3, 11, 12, 35]. ProbNetKAT is an expressive Conference on Programming Language Design and Implementation programming language based on Kleene Algebra with Tests, (PLDI ’19), June 22–26, 2019, Phoenix, AZ, USA. ACM, New York, NY, USA, 14 pages. htps://doi.org/10.1145/3314221.3314639 capable of modeling a variety of probabilistic behaviors and properties including randomized routing [22, 38], uncertainty about demands [30], and failures [17]. The history-free ∗Work performed at Cornell University. fragment restricts the language semantics to input-output behavior rather than tracking paths, and the guarded fragment Permission to make digital or hard copies of all or part of this work for provides conditionals and while loops rather than union and personal or classroom use is granted without fee provided that copies are not made or distributed for proft or commercial advantage and that iteration operators. Although the fragment we consider is a copies bear this notice and the full citation on the frst page. Copyrights restriction of the full language, it is still expressive enough for components of this work owned by others than the author(s) must to encode a wide range of practical networking models. Ex- be honored. Abstracting with credit is permitted. To copy otherwise, or isting deterministic tools, such as Anteater [27], HSA [19], republish, to post on servers or to redistribute to lists, requires prior specifc and Verifow [21], also use guarded and history-free models. permission and/or a fee. Request permissions from [email protected]. PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA To enable automated reasoning, we frst reformulate the © 2019 Copyright held by the owner/author(s). Publication rights licensed semantics of ProbNetKAT in terms of fnite state Markov to ACM. ACM ISBN 978-1-4503-6712-7/19/06...$15.00 htps://doi.org/10.1145/3314221.3314639

190 PLDI ’19, June 22–26, 2019, Phoenix, AZ, USA S. Smolka, P. Kumar, D. Kahn, N. Foster, J. Hsu, D. Kozen, and A. Silva

S1 S4k+1

H1 S0 S3 S4k S4k+3 H2

S2 pfail S4k+2 pfail

Figure 9. Chain topology Comparison vs state of the art

103 TiPe liPit 3600s

102 Figure 8. Speedup due to parallelization. 101 TiPe (seconds) Semantically, this construct is equivalent to a cascade of 0 conditionals; but the native backend compiles it in parallel 10 using a map-reduce-style strategy, using one process per 100 101 102 103 104 105 core by default. 1uPber of switcKes To evaluate the impact of parallelization, we compiled Bayonet 3risP (exact) 31K p = p = 3risP (approx) 331K (exact) 31K (cluster) two representative FatTree models ( 14 and 16) 331K (approx) using ECMP routing on an increasing number of cores. With m cores, we used one master machine together with r = Figure 10. Scalability on chain topology. m 16 1 remote machines, adding machines one by one as d / e needed to obtain more physical cores. The results are shown in Figure 8. We see near linear speedup on a single machine, p = 1 1000. For k > 1, the network consists of k diamonds cutting execution time by more than an order of magnitude fail / linked together into a chain as shown in Figure 9. Within on our 16-core test machine. Beyond a single machine, the each diamond, switch S forwards packets with equal proba- speedup depends on the complexity of the submodels for 0 bility to switches S and S , which in turn forward to switch each switch—the longer it takes to generate the matrix for 1 2 S . However, S drops the packet if the link to S fails. We each switch, the higher the speedup. For example, with a 3 2 3 analyze the probability that a packet originating at H1 is p = 16 FatTree, we obtained a 30x speedup using 40 cores successfully delivered to H2. Our implementation does not across 3 machines. exploit the regularity of these topologies. Comparison with other tools. Bayonet [17] is a state-of- Figure 10 gives the running time for several tools on the-art tool for analyzing probabilistic networks. Whereas this benchmark: Bayonet, hand-written PRISM, ProbNetKAT McNetKAT has a native backend tailored to the networking with the PRISM backend (PPNK), and ProbNetKAT with the domain and a backend based on a probabilistic model checker, native backend (PNK). Further, we ran the PRISM tools in Bayonet programs are translated to a general-purpose prob- exact and approximate mode, and we ran the ProbNetKAT abilistic language which is then analyzed by the symbolic in- backend on a single machine and on the cluster. Note that ference engine PSI [18]. Bayonet’s approach is more general, both axes in the plot are log-scaled. as it can model queues, state, and multi-packet interactions We see that Bayonet scales to 32 switches in about 25 under an asynchronous scheduling model. It also supports minutes, before hitting the one hour time limit and 64 GB Bayesian inference and parameter synthesis. Moreover, Bay- memory limit at 48 switches. ProbNetKAT answers the same onet is fully symbolic whereas McNetKAT uses a numerical query for 2048 switches in under 10 seconds and scales to linear algebra solver [8] (based on oating point arithmetic) over 65000 switches in about 50 minutes on a single core, to compute limits. or just 2.5 minutes using a cluster of 24 machines. PRISM To evaluate how the performance of these approaches scales similarly to ProbNetKAT, and performs best using the compares, we reproduced an experiment from the Bayonet hand-written model in approximate mode. paper that analyzes the reliability of a simple routing scheme Overall, this experiment shows that for basic network in a family of “chain” topologies indexed by k, as shown in verication tasks, ProbNetKAT’s domain-specic backend Figure 9. based on specialized data structures and an optimized linear- For k = 1, the network consists of four switches organized algebra library [8] can outperform an approach based on a into a diamond, with a single link that fails with probability general-purpose solver.