1 GossipSub: A Secure PubSub Protocol for Unstructured, 59 2 60 3 Decentralised P2P Overlays 61 4 62 5 Dimitris Vyzovitis Yiannis Psaras 63 6 64 7 Protocol Labs UCL, UK & Protocol Labs 65 8 [email protected] [email protected] 66 9 ABSTRACT 67 10 68 11 This report is discussing the design choices behind gossipsub, the 69 12 pubsub protocol in use today in the IPFS ecosystem and in partic- 70 13 ular as a message mechanism protocol for IPNS records. We are 71 14 discussing the requirements of the protocol, related works in the 72 15 area, as well as the specific parameters that influence its behaviour. 73 16 74 17 CCS CONCEPTS 75 18 • Computer systems organization → Embedded systems; Re- Figure 1: Topics & Peers 76 19 dundancy; Robotics; • Networks → Network reliability. 77 20 78 21 KEYWORDS 79 22 IPFS, libp2p, pubsub, gossipsub 80 23 81 ACM Reference Format: 24 Dimitris Vyzovitis and Yiannis Psaras. 2019. GossipSub: A Secure PubSub 82 25 Protocol for Unstructured, Decentralised P2P Overlays. In Proceedings of 83 26 Protocol Labs TechRep (PL-TechRep-gossipsub-v0.1-Dec19). Protocol Labs, 84 27 8 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn 85 28 Figure 2: Message Delivered to Subscribers 86 29 1 INTRODUCTION 87 30 Publish/Subscribe systems have traditionally been used to facilitate 88 31 distribution of messages in an asynchronous manner between a set of pub/sub messaging and other operations related to the system 89 32 of publishers and subscribers. Senders (publishers) and receivers on a stable and continuous fashion. 90 33 (subscribers) are not in direct communication, but instead com- Unstructured P2P overlay networks, on the other hand, do not 91 34 municate through the pub/sub system. Subscribers declare their assume any connectivity properties for any of its nodes. That is, 92 35 topics of interest; publishers publish in one of the system’s topics. in unstructured P2P networks, nodes can be of any type (i.e., from 93 36 The pub/sub system then matches the two and delivers new mes- always-on rack servers, to ephemerally connected laptops and mo- 94 37 sages (or more commonly called events) to all subscribers under bile devices) and thus, connect and disconnect at random times. 95 38 a topic. Pub/Sub systems have been extensively used by These random connectivity patterns make it impossible to assign 96 39 applications (see Twitter, RSS Feeds, Facebook), but also by general extra event routing or message caching responsibilities to any node 97 40 purpose Peer-to-Peer (P2P) systems. Not for distribution.in unstructured P2P networks. In turn, designing message propaga- 98 41 By and large, peer-to-peerUnpublished (P2P) networks can be split in two workingtion and guaranteeing draft. reliability of message delivery (that is, that a 99 42 categories: i) structured P2P overlays and ii) unstructured P2P over- message will reach all nodes in the network within a given amount 100 43 lays. In structured P2P overlays (or networks), the network has of time) is very difficult. 101 44 some structure, e.g., it is based on some topological or node hier- For these reasons, unstructured P2P networks very often use 102 45 archy. In such cases, some nodes (often called Super Nodes) can pub/sub protocols that are closer to flooding, or random walks 103 46 be assigned more responsibilities than others, such as for example, on the overlay to propagate event and membership information. 104 47 relay published events to subscribed nodes. Those nodes are also However, naïve flooding introduces a lot of extra traffic inthe 105 48 assumed to be dedicated servers, hence, they can support routing network, while random walks may take extended amounts of time 106 49 before reaching all nodes. 107 50 Permission to make digital or hard copies of all or part of this work for personal or 108 classroomUnpublished use is granted working without fee draft. provided Not that copies for distribution.are not made or distributed While previous work has addressed many different aspects and 51 for profit or commercial advantage and that copies bear this notice and the full citation requirements of pub/sub design for structured P2P networks, little 109 52 on the first page. Copyrights for components of this work owned by others than ACM has been done for unstructured P2P networks. “Gossip" has been 110 must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, 53 111 to post on servers or to redistribute to lists, requires prior specific permission and/or a introduced as a way to limit the number of messages propagated be- 54 fee. Request permissions from [email protected]. tween peers in pub/sub systems, as compared to flooding, where all 112 55 PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE published messages are forwarded to all subscribed peers. In gossip- 113 © 2019 Association for Computing Machinery. 56 114 ACM ISBN 978-x-xxxx-xxxx-x/YY/MM...$15.00 based approaches, peers forward metadata related to messages they 57 https://doi.org/10.1145/nnnnnnn.nnnnnnn have “seen" without forwarding the messages themselves. There 115 58 2019-12-30 05:20. Page 1 of 1–8. 116 PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE Vyzovitis and Psaras

117 have been numerous studies that explored gossip-based systems, paper intends to pursue. In its current version, we are starting by 175 118 but the scalability of large-scale, unstructured pub/sub systems has illustrating the protocol’s main design choices, together with a 176 119 not been addressed thoroughly. conceptual comparison against related works in the area. 177 120 In recent years, there has been significant momentum for design Gossipsub Design Summary: The pubsub protocol proposed 178 121 and deployment of decentralised Internet services and applications. here falls in the category of gossip-based pubsub protocols. Its main 179 122 Among others, such services include distributed and decentralised design features are a connected mesh, complemented by gossip 180 123 storage [8], [7], [12], [34], [2] but also computation systems [3]. functionality and lazy push. Lazy push is a technique used in pub- 181 124 The aim is to replace, or complement traditional centrally man- sub systems, according to which only metadata and not the full 182 125 aged and operated cloud services. End-users are contributing part message, is forwarded to subscribed nodes. In turn, if nodes are in- 183 126 of their resources to the network and get rewarded according to terested in the actual message whose metadata they have received, 184 127 contribution. These emerging systems are distributed in the sense they request the message explicitly. As is the case with many un- 185 128 of geographical spread and decentralised from the point of view of structured P2P networks, the first version of the protocol was closer 186 129 ownership, management and operation. to flood-based pubsub. However, as the IPFS network grew over 187 130 We are, therefore, witnessing a trend towards building P2P over- time, it was clear that flooding is not an efficient approach. Scal- 188 131 lays, where, in most cases, unreliable and non-dedicated end-user ability requirements quickly surfaced and resulted in the design 189 132 devices are active contributors to the network. In the absence of of meshsub, a connected mesh which is complemented by gossip 190 133 central control, messaging in those systems is of utmost impor- and lazy push as core features of the protocol. The combination of 191 134 tance in order to communicate operational processes (e.g., find file those techniques, along with the ability to plug in custom routing 192 135 or execute function), but also propagate management events. heuristics, constitute the main novelty of gossipsub. Simplicity of 193 136 Pub/Sub has seen a surge in usage from distributed applications implementation, as well as being reactive to dynamic network con- 194 137 in the area of decentralised services, such as decentralised chat ditions have been the driving philosophies behind the design of 195 138 and social networks [12], [10], collaborative editing tools without a gossipsub. 196 139 backend server [11], hosting of dynamic website content in unman- 197 140 aged P2P networks [8], storage and synchronisation of evolving 2 BACKGROUND & RELATED WORK 198 141 datasets, to name a few. 2.1 Tradeoffs in PubSub 199 142 In distributed storage systems and in the case of the InterPlan- 200 143 etary File System (IPFS) ecosystem in particular, pub/sub can be General purpose pub/sub messaging systems can prove very useful 201 144 used for several purposes, including content routing, i.e., one of the from several different aspects (from management and operation 202 145 most central and vital functions of the system. IPFS is a content- to performance) in P2P networks and as such come with many 203 146 addressable, distributed P2P storage network with hundreds of tradeoffs [19], [25]. Due to the wide variety of applications building 204 147 thousands of daily users. Users can participate in the network on top of pubsub systems, not all tradeoffs apply to all systems and 205 148 as unreliable nodes, e.g., using laptop devices and with frequent many of them are contradictory to each other. Below, we discuss 206 149 disconnections. The gossip-based pubsub protocol proposed here some of them and, where relevant, note how they influenced the 207 150 (acronymed “gossipsub") was developed with those system and design of gossipsub. 208 151 environment requirements in mind (i.e., unmanaged network and (1) Reliable Delivery. In case of no node downtime, all pub- 209 152 unreliable nodes) and is currently in use to push naming record lished messages should be delivered to all subscriber nodes. 210 153 updates to the decentralised naming system of IPFS (acronymed Pub/Sub systems should be robust against node churn and 211 154 IPNS [9]. should still reach most nodes (i.e., achieve high hit rate). 212 155 Pubsub and in particular the proposed gossipsub protocol will Apart from robustness, fast recovery from churn is also nec- 213 156 also be used in the very near future by emergingNot P2P systems, for distribution.essary. Node churn and disconnected environments are tack- 214 157 such as ETH2 [4] and FilecoinUnpublished [5] as the main routing protocol for workingled in gossipsub draft. by using gossip messaging. This is a hard 215 158 transaction blocks. Systems such as ETH2 and Filecoin are primarily requirement for the protocol’s use in the IPFS network, but 216 159 financial systems and are expected to carry transaction messages also more generally as a feature in libp2p. 217 160 worth millions in monetary value. That said, the security properties (2) Load Balancing. The event message relay load should be 218 161 of gossipsub need to be investigated in detail, which we will do in roughly equally split between nodes. Assuming a scaled 219 162 subsequent versions of this report. up system where nodes might be subscribed to 5K events, 220 163 Previously proposed pubsub systems have proven scalability relaying messages is becoming a heavy task and therefore, 221 164 properties in managed and cloud-based environments. A few pub- the more nodes a node is connected to (in terms of degree), 222 165 sub protocols have been proposed for unstructured P2P overlays the more the relay tasks it will have to carry out. 223 166 and among those even fewer have been tested with more than (3) Scalability. Given the growth of networked systems, both 224 167 10,000 nodes and high rates of churn. IPFS already has hundreds in a cloud environment, but also in decentralised, P2P envi- 225 168 of thousands of daily users and is expected to grow exponentially ronments, the system should be able to scale up to millions 226 169 to multiple millions. ETH1.0 already has more than 16,000 users of nodes1. There have been very few (if any) systems that 227 170 and when ETH2.0 arrives this number is expected to rise by several 228 171 orders of magnitude. 229 1The Bittorent network had tens of millions of active daily nodes, while IPFS today has 172 That said, a thorough evaluation of the protocol is essential hundreds of thousands of nodes and the ETH2 network is expected to have millions of 230 173 before its deployment in those systems. This is the goal that this nodes. 231 174 2019-12-30 05:20. Page 2 of 1–8. 232 GossipSub: A Secure PubSub Protocol for Unstructured, Decentralised P2P Overlays PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE

233 achieved scalability of that order in unmanaged, P2P envi- Closer to our work are systems such as Vitis [29], Tera [13], 291 234 ronments, while still achieving acceptable performance with Rappel [27] and StAN [26], which use gossiping to propagate in- 292 235 today’s standards (i.e., lookup in ms and delivery in less than formation and build on unstructured P2P overlays. Gossip-based 293 236 1sec). protocols distinguish between two types of peering: full message 294 237 (4) Resilience & Resource-Efficiency. It is generally common and metadata peering (see Figs. 3 and 4). When two nodes are in 295 238 in pub/sub systems for a message to be delivered twice to full message peering, messages received by one node are forwarded 296 239 subscriber nodes. Clearly, this increases load on individual to its full message peers. In contrast, when two nodes are metadata 297 240 relay nodes, but also the overall system’s bandwidth require- peers, they exchange only metadata for received messages. Meta- 298 241 ments. On the other hand, redundant delivery can enhance data peers are “aware" of published messages, but do not actually 299 242 the security and resilience properties of the system. Dupli- have the published messages. Extending the concept of meta-data 300 243 cate message delivery has to be kept within certain levels in peering, gossipsub is implementing the previously proposed concept 301 244 order to both achieve resilience, but at the same time avoid of lazy push, according to which peers in the network are notified 302 245 having negative impact in terms of resource-efficiency. of the existence of messages only (e.g., through their message ID), 303 246 but messages are only forwarded if a node explicitly requests the 304 247 newly advertised message. 305 Building broadcast (spanning) trees has been proposed before 248 306 [22], [23] to deal with many of the tradeoffs listed abovee.g., ( load- 249 307 balancing and resource efficiency), but a tree can only scale linearly 250 308 to its length and most importantly it is not robust against churn. 251 309 Gossiping, on the other hand, is inherently more robust against 252 310 node failures and churn, but might suffer from resource efficiency 253 311 and load-balancing. 254 312 Striking the right balance, especially in an unmanaged, unstruc- 255 313 tured P2P overlay has not seen a solution to date. This is the gap that 256 314 the proposed solution is filling, leveraging the simplicity of its design, 257 315 but also a resource-efficient gossip approach. 258 316 The literature in this space is vast. The purpose of this section Figure 3: Full vs. Metadata Peering 259 317 is not to survey the majority of the proposals, but instead to point 260 318 to the most important contributions from which lessons can be 261 In particular, the authors in [29] are arguing that most similar 319 learned and applied to an unstructured, unmanaged, P2P pub/sub 262 systems are building one separate overlay per topic, which results 320 overlay. 263 in nodes being members of an extensive number of overlays. This, 321 264 in turn, according to Vitis [29] increases overhead for relay nodes, 322 265 which might get disincentivised and leave the system. In contrast, 323 266 2.2 Scalable Topic-based Pub/Sub Vitis is dealing with this problem by bounding the number of con- 324 267 Scalability in pub/sub systems was a key requirement in cloud nections per node. This is done by using gossip messages to sample 325 268 environments, as cloud-based systems had to scale to accommodate the topics that other nodes have subscribed to. Then grouping nodes 326 269 demand. Amazon [1] and Google Cloud Messaging [6] both have into the same overlay where topics overlap reduces the number of 327 270 pub/sub protocols in operation, although their operational details overlays needed, while at the same time keeping nodes subscribed 328 271 have not been widely revealed. to the topics of their choice. 329 272 Scribe [18] was one of the very first pub/subNot systems for that pro- distribution.Rappel [27] is targeting low overhead and noise to subscriber 330 273 posed a decentralised multicastUnpublished overlay, on top of the Pastry DHT. workingnodes, that is, receiving draft. messages for topics that the node is not sub- 331 274 DHTs have been used in several pub/sub systems (see Meghdoot scribed to. Rappel achieves that by building a network of “friends 332 275 [21], Bayeux [36]), where in most cases the DHT is used to find overlay" building on interest locality. Rappel also targets fast dis- 333 276 where subscriptions are located and route to them [21], or as a semination of messages by taking into account network locality on 334 277 rendezvous point [18] for a topic. Poldercast [31] is an interesting top of interest locality. 335 278 approach which uses a ring overlay (and additional links as op- While these have been very interesting approaches to gossip- 336 279 timisation). Subscribers to a topic are connected to this ring and based pub/sub for unstructured P2P overlays, none of them has 337 280 messages propagate to all other subscribers. Clearly, the latency to been tested for scalability at massive scales (e.g., millions of nodes), 338 281 inform all nodes is increasing linearly, unless direct links exist and or significant node churn. Rappel [27] is claiming to be robust for 339 282 can be exploited. Dynatops [35] was proposed as a self-configured up to 25% node churn, while Tera [13] leaves this part for future 340 283 topic-based pub/sub system which can deal with short-lived sub- evaluation. 341 284 scriptions. All these systems require a broker network that extends The scale required by systems such as name registry propagation 342 285 across a wide-area network to cover subscribers. Dynamoth [20] in IPNS, or request routing in Filecoin and transaction routing in 343 286 was proposed to reduce latency as a hybrid between the dissinter- ETH2 can be orders of magnitude higher than the systems tested 344 287 mediated pub/sub and the directly-connected client-server models. in the past for unstructured, unmanaged P2P overlays. Node churn 345 288 It is, however, exclusively applicable to cloud-based systems and can also be significant, but we expect that a 30% threshold should 346 289 not to P2P overlays. be acceptable. 347 290 2019-12-30 05:20. Page 3 of 1–8. 348 PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE Vyzovitis and Psaras

349 as a middleware with the interesting feature that it adds and re- 407 350 moves nodes based on their load. Both of these approaches are 408 351 exclusively cloud-based and although they achieve scalability and 409 352 low-latencies, they do not apply to unmanaged P2P networks. In 410 353 fact, most content-based pub/sub systems either target cloud envi- 411 354 ronments, or some broker-based infrastructure (see Elvin, Sienna 412 355 [17], HERMES [28], Gryphon [33]). Privacy-preserving techniques 413 356 to protect subscribers’ information have also been investigated 414 357 extensively in content-based pub/sub systems with some notable 415 358 works, such as [30], [32]. 416 359 417 360 3 GOSSIPSUB PROTOCOL DETAILS 418 361 3.1 FloodSub 419 362 420 363 The initial pubsub protocol in libp2p was floodsub. Floodsub imple- 421 364 ments pubsub in the most basic manner, with two defining aspects: 422 365 i) ambient peer discovery; and ii) flooding, as the most basic routing 423 366 and message propagation protocol 424 367 3.1.1 Ambient Peer Discovery. With ambient peer discovery, the 425 368 peer discovery task is pushed outside the scope of the pubsub pro- 426 369 tocol. Instead, the mechanism for discovering peers is provided for 427 370 by the environment. In practice, and in the case of IPFS, this can be 428 371 embodied by DHT walks, rendezvous points or similar techniques. 429 372 This level of decoupling is possible thanks to the modular design of 430 373 libp2p. Floodsub relies on the ambient connection events produced 431 374 by these discovery mechanisms. Whenever a new peer is connected, 432 375 the protocol checks to see if the peer implements floodsub and/or 433 376 gossipsub, and if so, it sends a ‘hello’ packet that announces the 434 377 topics that it is currently subscribing to. 435 378 This allows the peer to maintain soft overlays for all topics of 436 379 interest. The overlay is maintained by exchanging subscription 437 380 control messages whenever there is a change in the topic list. The 438 381 subscription messages are not propagated further, so each peer 439 382 maintains a topic view of its direct peers only. Whenever a peer 440 Figure 4: Gossip-based Message Delivery 383 disconnects, it is removed from the overlay. 441 384 Ambient peer discovery can be driven by arbitrary external 442 385 means, which allows orthogonal development and no external de- 443 386 444 Gossipsub is building on simplicity of both design and implemen- pendencies for the protocol implementation. 387 445 tation, in order to be able to scale and address the size of network There are a couple of options we are exploring as canonical 388 446 we require. Being responsive to changing networkNot conditions, for such distribution.approaches for the discovery function: 389 447 as node churn in unreliable environmentsUnpublished is another feature that is working• DHT rendezvous draft. using provider records; peers in the topic 390 448 achieved by gossipsub’s design. announce a provider record named after the topic. 391 449 • Rendezvous through known or dynamically discovered ren- 392 450 dezvous points. 393 2.3 Scalable Content-based Pub/Sub 451 394 Content-based pub/sub systems have been extensively studied 3.1.2 Flood Routing. With flooding, routing is almost trivial: for 452 395 in the past, e.g., [17], but also more recently in the context of each incoming message, forward to all known peers in the topic. In 453 396 Information-Centric Networks, e.g., [15], [16]. Generally speak- our implementation, the router maintains a timed cache of previous 454 397 ing, content-based (or sometimes called attribute-based) pub/sub messages, so that seen messages are not forwarded more than once. 455 398 systems can provide finer granularity matching between publish- The protocol also never forwards a message back to the source or 456 399 ers and subscribers, but in order to achieve this they require more the peer that forwarded the message. 457 400 compute resources. Gossipsub is is not building on content-based 458 401 pub/sub, hence, in this subsection we discuss a few approaches for 3.1.3 Retrospective. Evaluating floodsub as a viable pubsub proto- 459 402 completeness. col reveals the following highly desirable properties: 460 403 BlueDove [24] is one of the well-known approaches in this space. • it is straightforward to implement. 461 404 It supports multi-dimensional attributes and is organising over- • it minimizes latency; messages are delivered across minimum 462 405 lay servers in a scalable topology. E-StreamHub [14] is proposed latency paths, modulo overlay connectivity. 463 406 2019-12-30 05:20. Page 4 of 1–8. 464 GossipSub: A Secure PubSub Protocol for Unstructured, Decentralised P2P Overlays PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE

465 • it is highly robust; there is very little maintenance logic or if |peers| > D_high: 523 466 state. select |peers| - D mesh peers at random and remove them 524 467 The problem, however, is that messages do not just follow the from the mesh 525 468 minimum latency paths; they follow all edges, thus creating a flood. sleep t 526 469 The outbound degree of the network is unbounded, which further 527 470 means that peers’ bandwidth is saturated with redundant traffic. The parameters of the meshsub algorithm are: D, which is the tar- 528 471 On the other hand, redundant traffic is increasing the resilience get degree, and two relaxed degree parameters D_low and D_high, 529 472 of the network and unbounded degree is naturally creating a re- representing admissible mesh degree bounds. Their goal is to intro- 530 473 sistance level against Sybil attacks. There are clear benefits that duce elasticity and curtail excessive flapping. 531 474 come with message redundancy that gossipsub attempts to exploit, 532 475 3.2.3 gossipsub: The gossiping mesh router. The meshsub router 533 but at the same time without being prohibitive to the peers’ re- offers a baseline construction with good amplification control prop- 476 sources. This is essential in order to increase scalability and support 534 477 erties, which we augment with gossip about message flow. The 535 decentralisation. gossip is emitted to random subsets of peers that are currently not 478 Similarly, the amplification factor is only bounded by the sumof 536 479 part of the mesh. Using gossip messages we can propagate metadata 537 degrees of all nodes in the overlay, which creates a scaling problem about message flow throughout the network. The metadata canbe 480 for densely connected overlays at large. 538 481 arbitrary, but as a baseline we include the message IDs that the 539 router emitting the gossip has seen in the last few seconds. The 482 3.2 Controlling the flood 540 483 actual messages themselves are cached, so that peers receiving the 541 484 In order to scale pubsub without excessive bandwidth waste or peer gossip can request them for transmission with a control message. 542 485 overload, we need a router that bounds the degree of each peer and The router can use this metadata to improve the mesh and create 543 486 globally controls the amplification factor. epidemic broadcast trees. Beyond that, the metadata can restart 544 message transmission at different points in the overlay to rectify 487 3.2.1 randomsub: A random message router. As a first step, we 545 488 downstream message loss. Alternatively, it can simply jump hops 546 consider the simplest bounded floodsub variant, which we call opportunistically and accelerate message transmission for peers 489 randomsub. In this construction, the router is still stateless, apart 547 490 who are at some distance in the mesh. 548 from a list of known peers in the topic. But instead of forwarding Essentially, gossipsub is a blend of meshsub for data and ran- 491 D 549 messages to all peers, it forwards to a random subset of up to domsub for mesh metadata, realised in the form of lazy push. The 492 D 550 peers, where is the desired degree of the network. combination of those techniques provides a powerful construction 493 The problem with this construction is that the message prop- 551 494 that meets requirements of disconnected mesh, “squatting peers", 552 agation patterns are non-deterministic. This results in extreme responsiveness to network conditions and most importantly scala- 495 message route instability, manifesting as message reordering and 553 496 bility. Gossipsub provides bounded degree and amplification factor 554 varying timing patterns, which is an undesirable property for many with the meshsub construction and augments it using gossip propa- 497 applications. 555 498 gation of metadata with the randomsub technique. 556 499 3.2.2 meshsub: An overlay mesh router. meshsub is improving on 557 500 the randomsub construction by limiting the number of messages in 3.3 The gossipsub Protocol Implementation 558 501 the pubsub network. This is essential in order to reduce bandwidth We proceed to provide a specification of the gossipsub protocol by 559 502 requirements, network and node/router load. However, instead sketching out the router implementation. The router is backwards 560 503 of randomly selecting peers on a per message basis, we form an compatible with floodsub, as it accepts floodsub peers and behaves 561 504 overlay mesh where each peer forwards to a subsetNot of its peers for on a distribution.like floodsub towards them. 562 505 stable basis. We construct a routerUnpublished in this fashion, dubbed meshsub. working draft. 563 506 Each peer maintains its own view of the mesh for each topic, 3.3.1 Control messages. The protocol defines four control mes- 564 507 which is a list of bidirectional, reciprocal links to other peers. That sages: 565 508 is, in steady state, whenever a peer A is in the mesh of peer B, then • GRAFT: graft a mesh link; this notifies the peer that it has 566 509 peer B is also in the mesh of peer A. been added to the local mesh view. 567 510 The overlay is initially constructed in a random fashion, based • PRUNE: prune a mesh link; this notifies the peer that it has 568 511 only on topics. Whenever a peer joins a topic, then it selects D peers been removed from the local mesh view. 569 512 (in the topic) at random and adds them to the mesh, notifying them • IHAVE: gossip; this notifies the peer that the following mes- 570 513 with a control message. When it leaves the topic, it notifies its peers sages were recently seen and are available on request. 571 514 and forgets the mesh for the topic. • IWANT: request transmission of messages announced in an 572 515 The mesh is maintained with the following periodic stabilization IHAVE message. 573 516 algorithm: 574 517 at each peer: 3.3.2 Heartbeat. The router periodically runs a heartbeat proce- 575 518 loop: dure, which is responsible for maintaining the mesh, emitting gos- 576 519 if |peers| < D_low: sip, and shifting the message cache. 577 520 select D - |peers| non-mesh peers at random and add 578 521 them to the mesh 3.3.3 Router state. The router maintains the following state: 579 522 2019-12-30 05:20. Page 5 of 1–8. 580 PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE Vyzovitis and Psaras

581 3.4 Topic Membership 639 582 Topic membership is controlled by two operations supported by 640 583 the router, as part of the local pubsub SDK API: 641 584 642 • On JOIN(topic) the router joins the topic. In order to do 585 643 so, if it already has D peers from the fanout peers of a topic, 586 644 then it adds them to mesh[topic], and notifies them with a 587 645 GRAFT(topic) control message. Otherwise, if there are less 588 646 than D peers (let this number be x) in the fanout for a topic (or 589 647 the topic is not in the fanout), then it still adds them as above 590 648 (if there are any). The algorithm also selects the remain- 591 649 Figure 5: Graft New Peering Nodes ing number of peers (D-x) from peers.gossipsub[topic], 592 650 and adds them to mesh[topic] notifying them with a GRAFT(topic) 593 651 control message. 594 652 • On LEAVE(topic) the router leaves the topic. It notifies the 595 653 peers in mesh[topic] with a PRUNE(topic) message and 596 654 forgets mesh[topic]. 597 655 598 656 599 657 600 658 601 659 602 660 603 661 604 Figure 6: Prune Existing Nodes 662 605 663 606 664 607 665 608 • peers: a set of all known peers; peers.gossipsub denotes 666 609 the gossipsub peers while peers.floodsub denotes the flood- 667 610 sub peers. 668 611 • mesh: the overlay meshes as a map of topics to lists of peers. 669 612 • fanout: the mesh peers to which we are publishing to with- 670 613 out topic membership, as a map of topics to lists of peers. 671 614 • seen: this is the timed message ID cache, which tracks seen 672 615 messages. 673 616 • mcache: a message cache that contains the messages for the 674 617 last few heartbeat ticks. 675 618 The message cache is a data structure that stores windows of mes- 676 619 sage IDs and the corresponding messages. It supports the following 677 620 operations: Not for distribution. 678 621 Figure 7: Graft Full Peering Nodes 679 • mcache.put(m) Unpublished working draft. 622 : adds a message to the current window and 680 623 the cache. 681 • mcache.get(id) Note that the router can publish messages without topic mem- 624 : retrieves a message from the cache by its 682 bership. In order to maintain stable routes in that case, it maintains 625 ID, if it is still present. 683 • mcache.window() a list of peers for each topic it has published in the fanout map. If 626 : retrieves the message IDs for messages 684 the router does not publish any messages of a topic for some time, 627 in the current history window. 685 • mcache.shift() then the fanout peers for that topic are forgotten – in other words, 628 : shifts the current window, discarding mes- 686 this is soft state. 629 sages older than the history length of the cache. 687 Also note that as part of the pubsub API, the peer emits SUB- 630 The seen cache is the flow control mechanism. It tracks the 688 SCRIBE and UNSUBSCRIBE control messages to all its peers when- 631 message IDs of seen messages for the last two minutes. It is separate 689 ever it joins or leaves a topic. This is provided by the the ambient 632 from mcache for implementation reasons in Go (the seen cache 690 peer discovery mechanism and nominally not part of the router. A 633 is inherited from the pubsub framework), but they can also be 691 standalone implementation would have to implement those control 634 the same data structure. Note that the two minute cache interval 692 messages. 635 is non-normative; a router could use a different value, chosen to 693 636 approximate the propagation delay in the overlay with some healthy 3.4.1 Message Processing. Upon receiving a message, the router 694 637 margin. first processes the payload of the message. If it contains avalid 695 638 2019-12-30 05:20. Page 6 of 1–8. 696 GossipSub: A Secure PubSub Protocol for Unstructured, Decentralised P2P Overlays PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE

697 be significant for densely connected peers. Although our choice 755 698 of metadata (i.e., message IDs) is used as a baseline (see Sec. 3.2.3), 756 699 piggybacking is a very elegant way to propagate this type of infor- 757 700 mation without inserting extra traffic in the network and therefore, 758 701 being able to scale to much larger network sizes. 759 702 760 703 4 CONCLUDING REMARKS 761 704 762 We have presented the design choices of the pubsub protocol used 705 763 in the IPFS ecosystem together with reasoning behind those choices. 706 764 We have provided a brief survey of related works in the area of 707 765 pubsub and have conceptually compared the design choices of 708 766 gossipsub with those in related literature. This report should by no 709 767 means be considered as a comprehensive literature survey. Instead, 710 768 we have pointed to survey papers that paint the bigger picture. 711 769 With the imminent deployment of gossipsub as the main routing 712 770 protocol for transaction messages/blocks in the Filecoin network, 713 771 we are currently investigating security measures of gossipsub and 714 772 will update this report once the initial security enhancements are 715 773 designed. 716 774 717 775 718 REFERENCES 776 719 Figure 8: Prune Full Peering Nodes [1] Amazon SND. https://aws.amazon.com/pub-sub-messaging/. 777 [2] Protocol Foundation. https://dat.foundation. 720 [3] Dfinity: The Internet Computer. https://dfinity.org. 778 721 [4] Ethereum 2.0. https://docs.ethhub.io/ethereum-roadmap/ethereum-2.0/eth-2.0- 779 message that has not been previously seen, then it publishes the 722 phases/. 780 message. In particular, [5] filecoin: A decentralised market for storage. 723 [6] Google/Firebase Cloud Messaging. https://firebase.google.com/docs/cloud- 781 724 • It forwards the message to every peer in peers.floodsub[topic], messaging/. 782 [7] IPFS - Content Addressed, Versioned, P2P File System. https://ipfs.io/ipfs/ 725 provided none of these peers are the source of the message. 783 • mesh[topic] QmR7GSQM93Cx5eAg6a6yRzNde1FQv7uL6X1o4k7zrJa3LX/ipfs.draft3.pdf. 726 It forwards the message to every peer in , pro- [8] IPFS: InterPlanetary File System. https://ipfs.io. 784 727 vided none of these peers are the source of the message. [9] Ipns: Interplanetary naming system. https://docs.ipfs.io/guides/concepts/ipns/. 785 [10] Social Network. https://joinmastodon.org. 728 After processing the payload, it then processes the control mes- [11] PeerPad. https://peerpad.net. 786 729 sages in the envelope: [12] Secure Scuttlebutt. https://scuttlebutt.nz/. 787 730 • GRAFT(topic) mesh[topic] [13] Baldoni, R., Beraldi, R., Quema, V., Querzoni, L., and Tucci-Piergiovanni, S. 788 On it adds the peer to if it is Tera: Topic-based event routing for peer-to-peer architectures. In Proceedings of 731 subscribed to the topic. If it is not subscribed, it responds the 2007 Inaugural International Conference on Distributed Event-based Systems 789 732 with a PRUNE(topic) control message. (New York, NY, USA, 2007), DEBS ’07, ACM, pp. 2–13. 790 [14] Barazzutti, R., Felber, P., Fetzer, C., Onica, E., Pineau, J.-F., Pasin, M., Rivière, 733 791 • On PRUNE(topic) it removes the peer from mesh[topic]. E., and Weigert, S. Streamhub: A massively parallel architecture for high- 734 • On IHAVE(ids) it checks the seen set and requests unknown performance content-based publish/subscribe. pp. 63–74. 792 735 messages with an IWANT message. [15] Carzaniga, A., Khazaei, K., Papalini, M., and Wolf, A. L. Is information- 793 centric multi-tree routing feasible? SIGCOMM Comput. Commun. Rev. 43, 4 (Aug. 736 • On IWANT(ids) it forwards all request messagesNot that for are distribution.2013), 3–8. 794 737 present in mcache to theUnpublished requesting peer. working[16] Carzaniga, A., Papalini, draft. M., and Wolf, A. L. Content-based publish/subscribe 795 738 networking and information-centric networking. In Proceedings of the ACM 796 When the router publishes a message that originates from the SIGCOMM Workshop on Information-centric Networking (New York, NY, USA, 739 router itself (at the application layer), then it proceeds similarly to 2011), ICN ’11, ACM, pp. 56–61. 797 740 [17] Carzaniga, A., Rosenblum, D. S., and Wolf, A. L. Design and evaluation of a 798 the payload reaction: wide-area event notification service. ACM Transactions on Computer Systems 19, 741 799 • It forwards the message to every peer in peers.floodsub[topic]. 3 (Aug. 2001), 332–383. 742 800 • If it is subscribed to the topic, then it must have a set of peers [18] Castro, M., Druschel, P., Kermarrec, A. ., and Rowstron, A. I. T. Scribe: a 743 large-scale and decentralized application-level multicast infrastructure. IEEE 801 in mesh[topic], to which the message is forwarded. Journal on Selected Areas in Communications 20, 8 (Oct 2002), 1489–1499. 744 802 • If it is not subscribed to the topic, it then forwards the mes- [19] Eugster, P. T., Felber, P. A., Guerraoui, R., and Kermarrec, A.-M. The many 745 faces of publish/subscribe. ACM Comput. Surv. 35, 2 (June 2003), 114–131. 803 sage to the peers in fanout[topic]. If this set is empty, it 746 [20] Gascon-Samson, J., Garcia, F.-P., Kemme, B., and Kienzle, J. Dynamoth: A 804 chooses D peers from peers.gossipsub[topic] to become scalable pub/sub middleware for latency-constrained applications in the cloud. 747 805 the new fanout[topic] peers and forwards to them. Proceedings - International Conference on Distributed Computing Systems 2015 (07 748 2015), 486–496. 806 [21] Gupta, A., Sahin, O. D., Agrawal, D., and Abbadi, A. E. Meghdoot: 749 3.4.2 Control message piggybacking. 807 Gossip and other control mes- Content-based publish/subscribe over p2p networks. In Proceedings of the 5th 750 sages do not have to be transmitted on their own message. Instead, ACM/IFIP/USENIX International Conference on Middleware (Berlin, Heidelberg, 808 751 they can be coalesced and piggybacked on any other message in the 2004), Middleware ’04, Springer-Verlag, pp. 254–273. 809 [22] Leitao, J., Pereira, J., and Rodrigues, L. Epidemic broadcast trees. In 2007 26th 752 regular flow, for any topic. This can lead to message rate reduction IEEE International Symposium on Reliable Distributed Systems (SRDS 2007) (Oct 810 753 whenever there is some correlated flow between topics, and can 2007), pp. 301–310. 811 754 2019-12-30 05:20. Page 7 of 1–8. 812 PL-TechRep-gossipsub-v0.1-Dec19, Dec 2019, AoE Vyzovitis and Psaras

813 [23] Leitao, J., Pereira, J., and Rodrigues, L. Hyparview: A membership protocol 2011), IPDPS ’11, IEEE Computer Society, pp. 746–757. 871 814 for reliable gossip-based broadcast. In Proceedings of the 37th Annual IEEE/IFIP [30] Rao, W., Chen, L., and Tarkoma, S. Toward efficient filter privacy-aware content- 872 International Conference on Dependable Systems and Networks (Washington, DC, based pub/sub systems. IEEE Transactions on Knowledge and Data Engineering 25, 815 USA, 2007), DSN ’07, IEEE Computer Society, pp. 419–429. 11 (Nov 2013), 2644–2657. 873 816 [24] Li, M., Ye, F., Kim, M., Chen, H., and Lei, H. Bluedove: A scalable and elastic [31] Setty, V., van Steen, M., Vitenberg, R., and Voulgaris, S. Poldercast: Fast, 874 publish/subscribe service. In IEEE IPDPS 2011 (05 2011), pp. 1254–1265. robust, and scalable architecture for p2p topic-based pub/sub. In Proceedings of the 817 875 [25] Malekpour, A., Carzaniga, A., Pedone, F., and Toffetti Carughi, G. End- 13th International Middleware Conference (New York, NY, USA, 2012), Middleware 818 to-end reliability for best-effort content-based publish/subscribe networks. In ’12, Springer-Verlag New York, Inc., pp. 271–291. 876 819 Proceedings of the 5th ACM International Conference on Distributed Event-based [32] Shikfa, A., Önen, M., and Molva, R. Privacy-preserving content-based pub- 877 System (New York, NY, USA, 2011), DEBS ’11, ACM, pp. 207–218. lish/subscribe networks. vol. 297, pp. 270–282. 820 [26] Matos, M., Nunes, A., Oliveira, R., and Pereira, J. Stan: Exploiting shared in- [33] Strom, R., Banavar, G., Chandra, T., Kaplan, M., Miller, K., Mukherjee, B., 878 821 terests without disclosing them in gossip-based publish/subscribe. In Proceedings Sturman, D., and Ward, M. Gryphon: An information flow based approach to 879 822 of the 9th International Conference on Peer-to-peer Systems (Berkeley, CA, USA, message brokering. CoRR cs.DC/9810019 (10 1998). 880 2010), IPTPS’10, USENIX Association, pp. 9–9. [34] Tarr, D., Lavoie, E., Meyer, A., and Tschudin, C. Secure scuttlebutt: An identity- 823 [27] Patel, J., Riviere, E., Gupta, I., and Kermarrec, A.-M. Rappel: Exploiting centric protocol for subjective and decentralized applications. In Proceedings of 881 824 interest and network locality to improve fairness in publish-subscribe systems. the 6th ACM Conference on Information-Centric Networking (New York, NY, USA, 882 Computer Networks 53 (08 2009), 2304–2320. 2019), ICN ’19, ACM, pp. 1–11. 825 883 [28] Pietzuch, P. R., and Bacon, J. M. Hermes: a distributed event-based middle- [35] Zhao, Y., Kim, K., and Venkatasubramanian, N. Dynatops: A dynamic topic- 826 ware architecture. In Proceedings 22nd International Conference on Distributed based publish/subscribe architecture. pp. 75–86. 884 827 Computing Systems Workshops (July 2002), pp. 611–618. [36] Zhuang, S. Q., Zhao, B. Y., Joseph, A. D., Katz, R. H., and Kubiatowicz, J. D. 885 [29] Rahimian, F., Girdzijauskas, S., Payberah, A. H., and Haridi, S. Vitis: A gossip- Bayeux: An architecture for scalable and fault-tolerant wide-area data dissemina- 828 based hybrid overlay for internet-scale publish/subscribe enabling rendezvous tion. In Proceedings of the 11th International Workshop on Network and Operating 886 829 routing in unstructured overlay networks. In Proceedings of the 2011 IEEE In- Systems Support for Digital Audio and Video (New York, NY, USA, 2001), NOSSDAV 887 830 ternational Parallel & Distributed Processing Symposium (Washington, DC, USA, ’01, ACM, pp. 11–20. 888 831 889 832 890 833 891 834 892 835 893 836 894 837 895 838 896 839 897 840 898 841 899 842 900 843 901 844 902 845 903 846 904 847 905 848 906 849 907 850 908 851 909 852 Not for distribution. 910 853 Unpublished working draft. 911 854 912 855 913 856 914 857 915 858 916 859 917 860 918 861 919 862 920 863 921 864 922 865 923 866 924 867 925 868 926 869 927 870 2019-12-30 05:20. Page 8 of 1–8. 928