arXiv:1911.00837v2 [cs.DB] 2 Nov 2020

RCC: Resilient Concurrent Consensus for High-Throughput Secure Transaction Processing

Suyash Gupta, Jelle Hellings, Mohammad Sadoghi
Exploratory Systems Lab, Department of Computer Science
University of California, Davis
Moka Blox LLC

Abstract—Recently, we saw the emergence of consensus-based database systems that promise resilience against failures, strong data provenance, and federated data management in a heterogeneous environment with many independent participants. Typically, these fully-replicated and federated systems are operated on top of a primary-backup consensus protocol, which limits the throughput of these systems to the capabilities of a single replica (the primary).

To push throughput beyond this single-replica limit, we propose concurrent consensus. In concurrent consensus, replicas independently propose transactions, thereby reducing the influence of any single replica on performance. To put this idea in practice, we propose RCC, a paradigm that can turn any primary-backup consensus protocol into a concurrent consensus protocol by running many consensus instances concurrently. RCC is designed with performance in mind and requires minimal coordination between instances. Furthermore, RCC also promises increased resilience against failures. We put the design of RCC to the test by implementing it in RESILIENTDB, our high-performance resilient blockchain fabric, and comparing it with state-of-the-art primary-backup consensus protocols. Our experiments show that RCC achieves up to 2.75× higher throughput than other consensus protocols and can be scaled to 91 replicas.

Index Terms—High-throughput resilient transaction processing, concurrent consensus, limits of primary-backup consensus.

This material is based upon work partially supported by the U.S. Department of Energy, Office of Science, Office of Small Business Innovation Research, under Award Number DE-SC0020455.

A brief announcement of this work was presented at the 33rd International Symposium on Distributed Computing (DISC 2019) [1].

I. INTRODUCTION

Fueled by the emergence of blockchain technology [2], [3], [4], we see a surge in interest in consensus-based data processing frameworks and database systems [4], [5], [6], [7], [8], [9]. This interest can be easily explained: compared to traditional distributed database systems, consensus-based systems can provide strong support for federated data management, data provenance, and more resilience during failures, and can enable data processing in a heterogeneous environment with many independent participants. Consequently, consensus-based systems can prevent disruption of service due to software issues and cyberattacks that compromise part of the system, and can aid in improving the quality of data that is managed by many independent parties, potentially reducing the huge societal costs of cyberattacks and bad data.
At the core of consensus-based systems are consensus protocols that enable independent participants (e.g., different companies) to manage a single common database by continuously replicating a unique sequence of transactions reliably among all participants. By design, these consensus protocols are resilient and can deal with participants that have crash or software failures, are unable to participate due to local network failures, or act malicious due to compromised hardware [10], [11]. As such, resilient consensus protocols can be seen as the fault-tolerant counterparts of classical two-phase and three-phase commit protocols [12], [13], [14]. Most practical systems use consensus protocols that follow the classical primary-backup design of PBFT [15] in which a single replica, the primary, proposes transactions by broadcasting them to all other replicas, after which all replicas exchange state to determine whether the primary correctly proposed the same transactions to all replicas and to deal with failure of the primary. Well-known examples of such protocols are PBFT [15], ZYZZYVA [16], SBFT [17], HOTSTUFF [18], and POE [19], and fully-optimized implementations of these protocols are able to process up-to tens-of-thousands transactions per second [20], [21].

A. The Limitations of Traditional Consensus

Unfortunately, a close look at the design of primary-backup consensus protocols reveals that their design leaves available network resources underutilized, which prevents the maximization of throughput: these protocols optimize for the case in which no faults occur, however, this at the cost of their ability to deal with faults efficiently.

To illustrate this, we consider the maximum throughput by which primaries can replicate transactions. Consider a system with n replicas of which f are faulty and the remaining nf = n − f replicas are non-faulty. As the primary proposes transactions by broadcasting them to all other replicas, the transaction throughput of such a system is mainly determined by the outgoing bandwidth of the primary. Hence, the maximum throughput Tmax of any such protocol is determined by the outgoing bandwidth B of the primary, the number of replicas n, and the size st of transactions: Tmax = B/((n − 1)st).

No practical consensus protocol will be able to achieve this throughput, as dealing with crashes and malicious behavior requires substantial state exchange. Protocols such as ZYZZYVA [16] can come close, however. For PBFT, the minimum amount of state exchange consists of two rounds in which PREPARE and COMMIT messages are exchanged between all replicas (a quadratic amount, see Example III.1 in Section III). Assuming that these messages have size sm, the maximum throughput of PBFT is TPBFT = B/((n − 1)(st + 3sm)). To minimize overhead, typical implementations of PBFT group hundreds of transactions together, assuring that st ≫ sm and, hence, Tmax ≈ TPBFT.

Fig. 1. Maximum throughput of replication in a system with B = 1 Gbit/s, n = 3f + 1, nf = 2f + 1, sm = 1 KiB, and individual transactions of 512 B. On the left, each proposal groups 20 transactions (st = 10 KiB) and, on the right, each proposal groups 400 transactions (st = 200 KiB).

The above not only shows a maximum on throughput, but also that primary-backup consensus protocols such as PBFT and ZYZZYVA severely underutilize resources of non-primary replicas: when st ≫ sm, the primary sends and receives roughly (n − 1)st bytes, whereas all other replicas only send and receive roughly st bytes. The obvious solution would be to use several primaries. Unfortunately, recent protocols such as HOTSTUFF [18], SPINNING [22], and PRIME [23] that regularly switch primaries all require that a switch from a primary happens after all proposals of that primary are processed. Hence, such primary switching does load balance overall resource usage among the replicas, but does not address the underutilization of resources we observe.

B. Our Solution: Towards Resilient Concurrent Consensus

The only way to push the throughput of consensus-based data processing systems beyond the limit Tmax is by better utilizing available resources. In this paper, we propose to do so via concurrent consensus, in which we use many primaries that concurrently propose transactions. We also propose RCC, a paradigm for the realization of concurrent consensus. Our contributions are as follows:
1) First, in Section II, we propose concurrent consensus and show that concurrent consensus can achieve much higher throughput than primary-backup consensus by effectively utilizing all available system resources.
2) Then, in Section III, we propose RCC, a paradigm for turning any primary-backup consensus protocol into a concurrent consensus protocol that is designed for maximizing throughput in all cases, even during malicious activity.
3) Then, in Section IV, we show that RCC can be utilized to make systems more resilient, as it can mitigate the effects of order-based attacks and throttling attacks (which are not prevented by traditional consensus protocols), and can provide better load balancing.
4) Finally, in Section V, we put the design of RCC to the test by implementing it in RESILIENTDB,¹ our high-performance resilient blockchain fabric, and compare RCC with state-of-the-art primary-backup consensus protocols. Our comparison shows that RCC answers the promises of concurrent consensus: it achieves up to 2.75× higher throughput than other consensus protocols, has a peak throughput of 365 ktxn/s, and can be easily scaled to 91 replicas.

¹RESILIENTDB is open-sourced and available at https://resilientdb.com.

II. THE PROMISE OF CONCURRENT CONSENSUS

To deal with the underutilization of resources and the low throughput of primary-backup consensus, we propose concurrent consensus. Specifically, we design for a system that is optimized for high-throughput scenarios in which a plenitude of transactions is available, and we make every replica a concurrent primary that is responsible for proposing and replicating some of these transactions. As we have nf non-faulty replicas, we can expect to always concurrently propose at least nf transactions if sufficient transactions are available.

Such concurrent processing has the potential to drastically improve throughput: in each round, each primary will send out one proposal to all other replicas, and receive nf − 1 proposals from other primaries. Hence, the maximum concurrent throughput is Tcmax = nf·B/((n − 1)st + (nf − 1)st).

In practice, of course, the primaries also need to participate in state exchange to determine the correct operations of all concurrent primaries. If we use PBFT-style state exchange, we end up with a concurrent throughput of TcPBFT = nf·B/((n − 1)(st + 3sm) + (nf − 1)(st + 4(n − 1)sm)). In Figure 1, we have sketched the maximum throughputs Tmax, TPBFT, Tcmax, and TcPBFT. As one can see, concurrent consensus not only promises greatly improved throughput, but also sharply reduces the costs associated with scaling consensus. We remark, however, that these figures provide best-case upper-bounds, as they only focus on bandwidth usage. In practice, replicas are also limited by computational power and available memory buffers that put limits on the number of transactions they can process in parallel and can execute (see Section V-B).
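For concreteness, the following Python sketch evaluates the four throughput bounds derived above under the parameters of Figure 1. The formulas are copied from the text; the function and parameter names are our own, and the sketch is illustrative only.

    # Bandwidth-based throughput bounds of Sections I-A and II (txn/s).
    # Parameters mirror Figure 1: B = 1 Gbit/s, n = 3f + 1, sm = 1 KiB,
    # and proposals of 20 transactions of 512 B each (st = 10 KiB).
    def bounds(n, f, B, st, sm, txn_per_proposal):
        nf = n - f
        t_max   = B / ((n - 1) * st)
        t_pbft  = B / ((n - 1) * (st + 3 * sm))
        t_cmax  = nf * B / ((n - 1) * st + (nf - 1) * st)
        t_cpbft = nf * B / ((n - 1) * (st + 3 * sm)
                            + (nf - 1) * (st + 4 * (n - 1) * sm))
        # The bounds count proposals/s; scale to transactions/s.
        return [t * txn_per_proposal
                for t in (t_max, t_pbft, t_cmax, t_cpbft)]

    if __name__ == "__main__":
        B = 10**9 / 8  # 1 Gbit/s in bytes/s
        for f in (1, 8, 16, 33):  # n = 4, 25, 49, 100
            n = 3 * f + 1
            print(n, [round(t) for t in bounds(n, f, B, 20 * 512, 1024, 20)])

As in Figure 1, the concurrent bounds flatten out as n grows instead of vanishing, which reflects the reduced scaling cost referred to above.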
III. RCC: RESILIENT CONCURRENT CONSENSUS

The idea behind concurrent consensus, as outlined in the previous section, is straightforward: improve overall throughput by using all available resources via concurrency. Designing and implementing a concurrent consensus system that operates correctly, even during crashes and malicious behavior of some replicas, is challenging, however. In this section, we describe how to design correct consensus protocols that deliver on the promises of concurrent consensus. We do so by introducing RCC, a paradigm that can turn any primary-backup consensus protocol into a concurrent consensus protocol. At its basis, RCC makes every replica a primary of a consensus-instance that replicates transactions among all replicas. Furthermore, RCC provides the necessary coordination between these consensus-instances to coordinate execution and deal with faulty primaries. To assure resilience and maximize throughput, we put the following design goals in RCC:
D1) RCC provides consensus among replicas on the client transactions that are to be executed and the order in which they are executed.
D2) Clients can interact with RCC to force execution of their transactions and learn the outcome of execution.
D3) RCC is a design paradigm that can be applied to any primary-backup consensus protocol, turning it into a concurrent consensus protocol.
D4) In RCC, consensus-instances with non-faulty primaries are always able to propose transactions at maximum throughput (with respect to the resources available to any replica), this independent of faulty behavior by any other replica.
D5) In RCC, dealing with faulty primaries does not interfere with the operations of other consensus-instances.
Combined, design goals D4 and D5 imply that instances with non-faulty primaries can propose transactions wait-free: transactions are proposed concurrent to any other activities and without requiring any coordination with other instances.

A. Background on Primary-Backup Consensus and PBFT

Before we present RCC, we provide the necessary background and notation for primary-backup consensus. Typical primary-backup consensus protocols operate in views. Within each view, a primary can propose client transactions, which will then be executed by all non-faulty replicas. To assure that all non-faulty replicas maintain the same state, transactions are required to be deterministic: on identical inputs, execution of a transaction must always produce identical outcomes. To deal with faulty behavior by the primary or by any other replicas during a view, three complimentary mechanisms are used:

Byzantine commit: The primary uses a Byzantine commit algorithm BCA to propose a client transaction T to all replicas. Next, BCA will perform state exchange to determine whether the primary successfully proposed a transaction. If the primary is non-faulty, then all replicas will receive T and determine success. If the primary is faulty and more than f non-faulty replicas do not receive a proposal or receive different proposals than the other replicas, then the state exchange step of BCA will detect this failure of the primary.

Primary replacement: The replicas use a view-change algorithm to replace the primary of the current view v when this primary is detected to be faulty by non-faulty replicas. This view-change algorithm will collect the state of sufficient replicas in view v to determine a correct starting state for the next view v + 1 and assign a new primary that will propose client transactions in view v + 1.

Recovery: A faulty primary can keep up to f non-faulty replicas in the dark without being detected, as f faulty replicas can cover for this malicious behavior. Such behavior is not detected and, consequently, does not trigger a view-change. Via a checkpoint algorithm, the at-most-f non-faulty replicas that are in the dark will learn the proposed client transactions that are successfully proposed to the remaining at-least-nf − f > f non-faulty replicas (that are not in the dark).

Fig. 2. A schematic representation of the preprepare-prepare-commit protocol of PBFT. First, a client c requests transaction T and the primary P proposes T to all replicas via a PREPREPARE message. Next, replicas commit to T via a two-phase message exchange (PREPARE and COMMIT messages). Finally, replicas execute the proposal and inform the client.

Example III.1. Next, we illustrate these mechanisms in PBFT. At the core of PBFT is the preprepare-prepare-commit Byzantine commit algorithm. This algorithm operates in three phases, which are sketched in Figure 2.

First, the current primary chooses a client request of the form ⟨T⟩c, a transaction T signed by client c, and proposes this request as the ρ-th transaction by broadcasting it to all replicas via a PREPREPARE message m. Next, each non-faulty replica R prepares the first proposed ρ-th transaction it receives by broadcasting a PREPARE message for m. If a replica R receives nf PREPARE messages for m from nf distinct replicas, then it has the guarantee that any group of nf replicas will contain a non-faulty replica that has received m. Hence, R has the guarantee that m can be recovered from any group of nf replicas, independent of the behavior of the current primary. With this guarantee, R commits to m by broadcasting a COMMIT message for m. Finally, if a replica R receives nf COMMIT messages for m from nf distinct replicas, then it accepts m. In PBFT, accepted proposals are then executed and the client is informed of the outcome.

Each replica R participating in preprepare-prepare-commit uses an internal timeout value to detect failure: whenever the primary fails to coordinate a round of preprepare-prepare-commit—which should result in R accepting some proposal—R will detect failure of the primary and halt participation in preprepare-prepare-commit. If f + 1 non-faulty replicas detect such a failure and communication is reliable, then they can cooperate to assure that all non-faulty replicas detect the failure. We call this a confirmed failure of preprepare-prepare-commit. In PBFT, confirmed failures trigger a view-change.

Finally, PBFT employs a majority-vote checkpoint protocol that allows replicas that are kept in the dark to learn accepted proposals without help of the primary.
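To make the quorum steps of Example III.1 concrete, the following Python sketch (entirely our own naming; networking, signatures, and view-changes omitted) tracks the PREPARE and COMMIT messages a replica collects for a single proposal m and accepts only after nf messages from distinct replicas.

    # Quorum bookkeeping for one proposal in preprepare-prepare-commit.
    class ProposalState:
        def __init__(self, nf):
            self.nf = nf              # quorum size nf = n - f
            self.prepares = set()     # distinct senders of PREPARE for m
            self.commits = set()      # distinct senders of COMMIT for m
            self.prepared = self.accepted = False

        def on_prepare(self, sender):
            self.prepares.add(sender)
            if not self.prepared and len(self.prepares) >= self.nf:
                # m is now recoverable from any group of nf replicas:
                # the replica commits by broadcasting COMMIT for m.
                self.prepared = True

        def on_commit(self, sender):
            self.commits.add(sender)
            if not self.accepted and len(self.commits) >= self.nf:
                self.accepted = True  # execute m and inform the client

    # With n = 4 and f = 1, a proposal is accepted after nf = 3 COMMITs.
    state = ProposalState(nf=3)
    for replica in ("R1", "R2", "R3"):
        state.on_prepare(replica)
        state.on_commit(replica)
    assert state.prepared and state.accepted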

B. The Design of RCC

We now present RCC in detail. Consider a primary-backup consensus protocol P that utilizes Byzantine commit algorithm BCA (e.g., PBFT with preprepare-prepare-commit). At the core of applying our RCC paradigm to P is running m, 1 ≤ m ≤ n, instances of BCA concurrently, while providing sufficient coordination between the instances to deal with any malicious behavior. To do so, RCC makes BCA concurrent and uses a checkpoint protocol for per-instance recovery of in-the-dark replicas (see Section III-D). Instead of view-changes, RCC uses a novel wait-free mechanism, that does not involve replacing primaries, to deal with detectable primary failures (see Section III-C). RCC requires the following guarantees on BCA:

Assumption. Consider an instance of BCA running in a system with n replicas, n > 3f.
A1) If no failures are detected in round ρ of BCA (the round is successful), then at least nf − f non-faulty replicas have accepted a proposed transaction in round ρ.
A2) If a non-faulty replica accepts a proposed transaction T in round ρ of BCA, then all other non-faulty replicas that accepted a proposed transaction, accepted T.
A3) If a non-faulty replica accepts a transaction T, then T can be recovered from the state of any subset of nf − f non-faulty replicas.
A4) If the primary is non-faulty and communication is reliable, then all non-faulty replicas will accept a proposal in round ρ of BCA.

With minor fine-tuning, these assumptions are met by PBFT, ZYZZYVA, SBFT, HOTSTUFF, and many other primary-backup consensus protocols, meeting design goal D3.

RCC operates in rounds. In each round, RCC replicates m client transactions (or, as discussed in Section I-A, m sets of client transactions), one for each instance. We write Ii to denote the i-th instance of BCA. To enforce that each instance is coordinated by a distinct primary, the i-th replica Pi is assigned as the primary coordinating Ii. Initially, RCC operates with m = n instances. In RCC, instances can fail and be stopped, e.g., when coordinated by malicious primaries or during periods of unreliable communication. Each round ρ of RCC operates in three steps:
1) Concurrent BCA. First, each replica participates in m instances of BCA, in which each instance is proposing a transaction requested by a client among all replicas.
2) Ordering. Then, each replica collects all successfully replicated client transactions and puts them in the same—deterministically determined—order.
3) Execution. Finally, each replica executes the transactions of round ρ in order and informs the clients of the outcome of their requested transactions.
Figure 3 sketches a high-level overview of running m concurrent instances of BCA.

Fig. 3. A high-level overview of RCC running at replica R. Replica R participates in m concurrent instances of BCA (that run independently and continuously output transactions). The instances yield m transactions, which are executed in a deterministic order.

To maximize performance, we want every instance to propose distinct transactions, such that every round results in m distinct transactions. In Section III-E, we delve into the details by which primaries can choose transactions to propose.

To meet design goals D4 and D5, individual BCA instances in RCC can continuously propose and replicate transactions: ordering and execution of the transactions replicated in a round by the m instances is done in parallel to the proposal and replication of transactions for future rounds. Consequently, non-faulty primaries can utilize their entire outgoing network bandwidth for proposing transactions, even if other replicas or primaries are acting malicious.

Let ⟨Ti⟩ci be the transaction Ti requested by ci and proposed by Pi in round ρ. After all m instances complete round ρ, each replica can collect the set of transactions S = {⟨Ti⟩ci | 1 ≤ i ≤ m}. By Assumption A2, all non-faulty replicas will obtain the same set S. Next, all replicas choose an order on S and execute all transactions in that order. For now, we assume that the transaction ⟨Ti⟩ci is executed as the i-th transaction of round ρ. In Section IV, we show that a more advanced ordering-scheme can further improve the resilience of consensus against malicious behavior. As a direct consequence of Assumption A4, we have the following:

Proposition III.2. Consider RCC running in a system with n replicas, n > 3f. If all m instances have non-faulty primaries and communication is reliable, then, in each round, all non-faulty replicas will accept the same set of m transactions and execute these transactions in the same order.

As all non-faulty replicas will execute each transaction ⟨Ti⟩ci ∈ S, there are nf distinct non-faulty replicas that can inform the client of the outcome of execution. As all non-faulty replicas operate deterministically and execute the transactions in the same order, client ci will receive identical outcomes of nf > f replicas, guaranteeing that this outcome is correct.
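The normal-case round structure sketched above can be captured in a few lines. The following Python sketch uses our own naming and assumes instances are numbered 1, ..., m; a round is ordered and executed as soon as all m instances have delivered their round-ρ transaction, while instances keep delivering transactions for later rounds.

    # Ordering-and-execution step of RCC at one replica (normal case).
    class RoundExecutor:
        def __init__(self, m, execute):
            self.m, self.execute = m, execute
            self.pending = {}      # round -> {instance: transaction}
            self.next_round = 0

        def deliver(self, rho, instance, txn):
            # Called when instance I_instance accepts txn in round rho;
            # instances run wait-free and may deliver for future rounds.
            self.pending.setdefault(rho, {})[instance] = txn
            while len(self.pending.get(self.next_round, {})) == self.m:
                outputs = self.pending.pop(self.next_round)
                # <T_i>_{c_i} is executed as the i-th transaction of its round.
                for i in range(1, self.m + 1):
                    self.execute(self.next_round, i, outputs[i])
                self.next_round += 1

    # Usage: three instances deliver their round-0 transactions in any order.
    log = []
    ex = RoundExecutor(3, lambda rho, i, t: log.append((rho, i, t)))
    for i in (2, 3, 1):
        ex.deliver(0, i, f"T{i}")
    assert log == [(0, 1, "T1"), (0, 2, "T2"), (0, 3, "T3")]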
In the above, we described the normal-case operations of RCC. As in normal primary-backup protocols, individual instances in RCC can be subject to both detectable and undetectable failures. Next, we deal with these two types of failures.

C. Dealing with Detectable Failures

Consensus-based systems typically operate in an environment with asynchronous communication: messages can get lost, arrive with arbitrary delays, and in arbitrary order. Consequently, it is impossible to distinguish between, on the one hand, a primary that is malicious and does not send out proposals and, on the other hand, a primary that does send out proposals that get lost in the network. As such, asynchronous consensus protocols can only provide progress in periods of reliable bounded-delay communication during which all messages sent by non-faulty replicas will arrive at their destination within some maximum delay [24], [25].

To be able to deal with failures, RCC assumes that any failure of non-faulty replicas to receive proposals from a primary Pi, 1 ≤ i ≤ m, is due to failure of Pi, and we design the recovery process such that it can also recover from failures due to unreliable communication. Furthermore, in accordance with the wait-free design goals D4 and D5, the recovery process will be designed so that it does not interfere with other BCA instances or other recovery processes. Now assume that primary Pi of Ii, 1 ≤ i ≤ m, fails in round ρ. The recovery process consists of three steps:
1) All non-faulty replicas need to detect failure of Pi.
2) All non-faulty replicas need to reach agreement on the state of Ii: which transactions have been proposed by Pi and have been accepted in the rounds up-to-ρ.
3) To deal with unreliable communication, all non-faulty replicas need to determine the round in which Pi is allowed to resume its operations.

To reach agreement on the state of Ii, we rely on a separate instance of the consensus protocol P that is only used to coordinate agreement on the state of Ii during failure. This coordinating consensus protocol P replicates stop(i; E) operations, in which E is a set of nf FAILURE messages sent by nf distinct replicas from which all accepted proposals in instance Ii can be derived. We notice that P is—itself—an instance of a primary-backup protocol that is coordinated by some primary Li (based on the current view in which the instance of P operates), and we use the standard machinery of P to deal with failures of that leader (see Section III-A). Next, we shall describe how the recovery process is initiated. The details of this protocol can be found in Figure 4.

Recovery request role (used by replica R):
1: event R detects failure of the primary Pi, 1 ≤ i ≤ m, in round ρ do
2:   R halts Ii.
3:   Let P be the state of R in accordance to Assumption A3.
4:   Broadcast FAILURE(i, ρ, P) to all replicas.
5: event R receives f + 1 messages mj = FAILURE(i, ρj, Pj) such that:
     1) these messages are sent by a set S of |S| = f + 1 distinct replicas;
     2) all f + 1 messages are well-formed; and
     3) ρj, 1 ≤ j ≤ f + 1, comes after the round in which Ii started last
   do
6:   R detects failure of Pi (if not yet done so).

Recovery leader role (used by leader Li of P):
7: event Li receives nf messages mj = FAILURE(i, ρj, Pj) such that:
     1) these messages are sent by a set S of |S| = nf distinct replicas;
     2) all nf messages are well-formed; and
     3) ρj, 1 ≤ j ≤ nf, comes after the round in which Ii started last
   do
8:   Propose stop(i; {m1, ..., mnf}) via P.

State recovery role (used by replica R):
9: event R accepts stop(i; E) from Li via P do
10:  Recover the state of Ii using E in accordance to Assumption A3.
11:  Determine the last round ρ for which Ii accepted a proposal.
12:  Set ρ + 2^f′, with f′ the number of accepted stop(i; E′) operations, as the next valid round number for instance Ii.

Fig. 4. The recovery algorithm of RCC.

When a replica R detects failure of instance Ii, 1 ≤ i ≤ m, in round ρ, it broadcasts a message FAILURE(i, ρ, P), in which P is the state of R in accordance to Assumption A3 (Line 1 of Figure 4). To deal with unreliable communication, R will continuously broadcast this FAILURE message with an exponentially-growing delay until it learns how to proceed with Ii. To reduce communication in the normal-case operations of P, one can send the full message FAILURE(i, ρ, P) to only Li, while sending FAILURE(i, ρ) to all other replicas.

If a replica receives f + 1 FAILURE messages from distinct replicas for a certain instance Ii, then it received at least one such message from a non-faulty replica. Hence, it can detect failure of Ii (Line 5 of Figure 4). Finally, if a replica R receives nf FAILURE messages from distinct replicas for a certain instance Ii, then we say there is a confirmed failure, as R has the guarantee that eventually—within at most two message delays—also the primary Li of P will receive nf FAILURE messages (if communication is reliable). Hence, at this point, R sets a timer based on some internal timeout value (that estimates the message delay) and waits on the leader Li to propose a valid stop-operation or for the timer to run out. In the latter case, replica R detects failure of the leader Li and follows the steps of a view-change in P to (try to) replace Li.

When the leader Li receives nf FAILURE messages, it can and must construct a valid stop-operation and reach consensus on this operation (Line 7 of Figure 4). After reaching consensus, each replica can recover to a common state of Ii:

Theorem III.3. Consider RCC running in a system with n replicas. If n > 3f, an instance Ii, 1 ≤ i ≤ m, has a confirmed failure, and the last proposal of Pi accepted by a non-faulty replica was in round ρ, then—whenever communication becomes reliable—the recovery protocol of Figure 4 will assure that all non-faulty replicas will recover the same state, which will include all proposals accepted by non-faulty replicas before-or-at round ρ.

Proof. If communication is reliable and instance Ii has a confirmed failure, then all non-faulty replicas will detect this failure and send FAILURE messages (Line 1 of Figure 4). Hence, all replicas are guaranteed to receive at least nf FAILURE messages, and any replica will be able to construct a well-formed operation stop(i; E). Hence, P will eventually be forced to reach consensus on stop(i; E). Consequently, all non-faulty replicas will conclude on the same state for instance Ii. Now consider a transaction T accepted by non-faulty replica Q in instance Ii. Due to Assumption A3, Q will only accept T if T can be recovered from the state of any set of nf − f non-faulty replicas. As |E| = nf (Line 7 of Figure 4), the set E contains the state of at least nf − f non-faulty replicas. Hence, T must be recoverable from E.

We notice that the recovery algorithm of RCC, as outlined in Figure 4, only affects the capabilities of the BCA instance that is stopped. All other BCA instances can concurrently propose transactions for current and for future rounds. Hence, the recovery algorithm adheres to the wait-free design goals D4 and D5. Furthermore, we reiterate that we have a separate instance of the coordinating consensus protocol for each instance Ii, 1 ≤ i ≤ m. Hence, recovery of several instances can happen concurrently, which minimizes the time it takes to recover from several simultaneous primary failures and, consequently, minimizes the delay before a round can be executed during primary failures.

Confirmed failures not only happen due to malicious behavior. Instances can also fail due to periods of unreliable communication. To deal with this, we eventually restart any stopped instances. To prevent instances coordinated by malicious replicas from continuously causing recovery of their instances, every failure will incur an exponentially growing restart penalty (Line 12 of Figure 4). The exact round in which an instance can resume operations can be determined deterministically from the accepted history of stop-requests. When all instances have failed in a round due to unreliable communication (which can be detected from the history of stop-requests), any instance is allowed to resume operations in the earliest available round (after which all other instances are also required to resume operations).
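The two FAILURE thresholds used above translate directly into code. The following Python sketch (our naming) counts FAILURE messages per instance: f + 1 messages from distinct replicas imply that one was sent by a non-faulty replica (a detected failure), while nf messages constitute a confirmed failure, after which the replica awaits the stop-operation of Li.

    # FAILURE-message thresholds of the recovery algorithm (Figure 4).
    class FailureDetector:
        def __init__(self, f, nf):
            self.f, self.nf = f, nf
            self.reported = {}               # instance -> set of senders

        def on_failure(self, instance, sender):
            senders = self.reported.setdefault(instance, set())
            senders.add(sender)
            if len(senders) == self.f + 1:
                return "detected"            # Lines 5-6 of Figure 4
            if len(senders) == self.nf:
                return "confirmed"           # await stop(i; E) from L_i
            return None

    # With n = 7 and f = 2: the third report detects, the fifth confirms.
    fd = FailureDetector(f=2, nf=5)
    events = [fd.on_failure(1, r) for r in ("R1", "R2", "R3", "R4", "R5")]
    assert events[2] == "detected" and events[4] == "confirmed"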
D. Dealing with Undetectable Failures

As stated in Assumption A1, a malicious primary Pi of a BCA instance Ii is able to keep up to f non-faulty replicas in the dark without being detected. In normal primary-backup protocols, this is not a huge issue: at least nf − f > f non-faulty replicas still accept transactions, and these replicas can execute and reliably inform the client of the outcome of execution. This is not the case in RCC, however:

Example III.4. Consider a system with n = 3f + 1 = 7 replicas. Assume that primaries P1 and P2 are malicious, while all other primaries are non-faulty. We divide the non-faulty replicas into three sets A1, A2, and B with |A1| = |A2| = f and |B| = 1. In round ρ, the malicious primary Pi, i ∈ {1, 2}, proposes transaction ⟨Ti⟩ci to only the non-faulty replicas in Ai ∪ B. This situation is sketched in Figure 5. After all concurrent instances of BCA finish round ρ, we see that the replicas in A1 have accepted ⟨T1⟩c1, the replicas in A2 have accepted ⟨T2⟩c2, and only the replica in B has accepted both ⟨T1⟩c1 and ⟨T2⟩c2. Hence, only the single replica in B can proceed with execution of round ρ. Notice that, due to Assumption A1, we consider all instances as finished successfully. If n ≥ 10 and f ≥ 3, this example attack can be generalized such that also the replica in B is missing at least a single client transaction.

Fig. 5. An attack possible when parallelizing BCA: malicious primaries can prevent non-faulty replicas from learning all client requests in a round, thereby preventing timely round execution. The faulty primary Pi, i ∈ {1, 2}, does so by only letting non-faulty replicas Ai ∪ B participate in instance Ii.

To deal with in-the-dark attacks of Example III.4, we can run a standard checkpoint algorithm for each BCA instance: if the system does not reach confirmed failure of Pi in round ρ, 1 ≤ i ≤ m, then, by Assumptions A1 and A2, at-least-nf − f non-faulty replicas have accepted the same transaction T in round ρ of Ii. Hence, by Assumption A3, a standard checkpoint algorithm (e.g., the one of PBFT or one based on delayed replication [26]) that exchanges the state of these at-least-nf − f non-faulty replicas among all other replicas is sufficient to assure that all non-faulty replicas eventually accept T. We notice that these checkpoint algorithms can be run concurrently with the operations of BCA instances, thereby adhering to our wait-free design goals D4 and D5.

To reduce the cost of checkpoints, typical consensus systems only perform checkpoints after every x-th round for some system-defined constant x. Due to in-the-dark attacks, applying such a strategy to RCC means choosing between execution latency and throughput. Consequently, in RCC we do checkpoints on a dynamic per-need basis: when replica R receives nf − f claims of failure of primaries (via the FAILURE messages of the recovery protocol) in round ρ and R itself finished round ρ for all its instances, then it will participate in any attempt for a checkpoint for round ρ. Hence, if an in-the-dark attack affects more than f distinct non-faulty replicas in round ρ, then a successful checkpoint will be made and all non-faulty replicas recover from the attack, accept all transactions in round ρ, and execute all these transactions.

Using Theorem III.3 to deal with detectable failures and using checkpoint protocols to deal with replicas in-the-dark, we conclude that RCC adheres to design goal D1:

Theorem III.5. Consider RCC running in a system with n replicas. If n > 3f, then RCC provides consensus in periods in which communication is reliable.
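The counting argument of Example III.4 can also be checked mechanically. The following Python sketch simulates the attack with hypothetical replica names and confirms that only the single replica in B obtains both transactions of the round.

    # In-the-dark attack of Example III.4 with n = 3f + 1 = 7 and f = 2.
    A1, A2, B = {"A1a", "A1b"}, {"A2a", "A2b"}, {"B1"}
    received = {r: set() for r in A1 | A2 | B}
    # P_i proposes <T_i> only to the non-faulty replicas in A_i and B.
    for i, group in ((1, A1 | B), (2, A2 | B)):
        for replica in group:
            received[replica].add(f"T{i}")
    # Only replicas that received both transactions can execute the round.
    can_execute = {r for r, txns in received.items() if len(txns) == 2}
    assert can_execute == B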
E. Client Interactions with RCC

To maximize performance, it is important that every instance proposes distinct client transactions, as proposing the same client transaction several times would reduce throughput. We have designed RCC with faulty clients in mind, hence, we do not expect cooperation of clients to assure that they send their transactions to only a single primary.

To be able to do so, the design of RCC is optimized for the case in which there are always many more concurrent clients than replicas in the system. In this setting, we assign every client c to a single primary Pi, 1 ≤ i ≤ m = n, such that only instance Ii can propose client requests of c. For this design to work in all cases, we need to solve two issues, however: we need to deal with situations in which primaries do not receive client requests (e.g., during downtime periods in which only few transactions are requested), and we need to deal with faulty primaries that refuse to propose requests of some clients.

First, if there are less concurrent clients than replicas in the system, e.g., when demand for services is low, then RCC still needs to process client transactions correctly, but it can do so without optimally utilizing resources available, as this would not impact throughput in this case due to the low demands. If a primary Pi, 1 ≤ i ≤ m, does not have transactions to propose in any round ρ and Pi detects that other BCA instances are proposing for round ρ (e.g., as it receives proposals), then Pi proposes a small no-op-request instead.

Second, to deal with a primary Pi, 1 ≤ i ≤ m, that refuses to propose requests of some clients, we take a two-step approach. First, we incentivize malicious primaries to not refuse services, as otherwise they will be detected faulty and lose the ability to propose transactions altogether. To detect failure of Pi, RCC uses standard techniques to enable a client c to force execution of a transaction T. First, c broadcasts ⟨T⟩c to all replicas. Each non-faulty replica R will then forward ⟨T⟩c to the appropriate primary Pi, 1 ≤ i ≤ m. Next, if the primary Pi does not propose any transaction requested by c within a reasonable amount of time, then R detects failure of Pi. Hence, refusal of Pi to propose ⟨T⟩c will lead to primary failure, incentivizing malicious primaries to provide service.

Finally, we need to deal with primaries that are unwilling or incapable of proposing requests of c, e.g., when the primary crashes. To do so, c can request to be reassigned to another instance Ij, 1 ≤ j ≤ m, by broadcasting a request m := SWITCHINSTANCE(c, j) to all replicas. Reassignment is handled by the coordinating consensus protocol P for Ii, that will reach consensus on m. Malicious clients can try to use reassignment to propose transactions in several instances at the same time. To deal with this, we assume that no instance is more than σ rounds behind any other instance (see Section IV). Now, consider the moment at which replica R accepts m and let ρ(m, R) be the maximum round in which any request has been proposed by any instance in which R participates. The primary Pi will stop proposing transactions of c immediately. Any non-faulty replica R will stop accepting transactions of c by Ii after round ρ(m, R) + σ and will start accepting transactions of c by Ij after round ρ(m, R) + 2σ. Finally, Pj will start proposing transactions of c in round ρ(m, Pj) + 3σ.
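Assuming a SWITCHINSTANCE request has been accepted, the round windows above can be computed directly. The helper below uses our own naming and merely restates the three deadlines; it is not part of the protocol description itself.

    # Round windows applied after accepting m = SWITCHINSTANCE(c, j),
    # where rho_m is the maximum round proposed in any instance the
    # replica participates in when it accepts m (Section III-E).
    def reassignment_windows(rho_m, sigma):
        return {
            "old_instance_valid_until":  rho_m + sigma,      # I_i accepts c
            "new_instance_valid_from":   rho_m + 2 * sigma,  # I_j accepts c
            "new_primary_proposes_from": rho_m + 3 * sigma,  # P_j proposes c
        }

    # With rho_m = 100 and sigma = 10, no instance accepts transactions of
    # c in rounds 111-119, so c can never use two instances in one round.
    print(reassignment_windows(100, 10))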
IV. RCC: IMPROVING RESILIENCE OF CONSENSUS

Traditional primary-backup consensus protocols rely heavily on the operations of their primary. Although these protocols are designed to deal with primaries that completely fail to propose client transactions, they are not designed to deal with many other types of malicious behavior.

Example IV.1. Consider a financial service running on a traditional PBFT consensus-based system. In this setting, a malicious primary can affect operations in two malicious ways:
1) Ordering attack. The primary sets the order in which transactions are processed and, hence, can choose an ordering that best fits its own interests. To illustrate this, we consider client transactions of the form:

   transfer(A, B, n, m) := if amount(A) > n then withdraw(A, m); deposit(B, m).

   Let T1 = transfer(Alice, Bob, 500, 200) and T2 = transfer(Bob, Eve, 400, 300). Before processing these transactions, the balance for Alice is 800, for Bob 300, and for Eve 100. In Figure 6, we summarize the results of either first executing T1 or first executing T2. As is clear from the figure, execution of T1 influences the outcome of execution of T2. As primaries choose the ordering of transactions, a malicious primary can choose an ordering whose outcome benefits its own interests, e.g., formulate targeted attacks to affect the execution of the transactions of some clients.
2) Throttling attack. The primary sets the pace at which the system processes transactions. We recall that individual replicas rely on time-outs to detect malicious behavior of the primary. This approach will fail to detect or deal with primaries that throttle throughput by proposing transactions as slow as possible, while preventing failure detection due to time-outs.

             Original | First T1, then T2 | First T2, then T1
   Balance            |   T1        T2    |   T2        T1
   Alice       800    |   600       600   |   800       600
   Bob         300    |   500       200   |   300       500
   Eve         100    |   100       400   |   100       100

Fig. 6. Illustration of the influence of execution order on the outcome: switching around requests affects the transfer of T2.

Besides malicious primaries, also other malicious entities can take advantage of a primary-backup consensus protocol:
3) Targeted attack. As the throughput of a primary-backup system is entirely determined by the primary, attackers can send arbitrary messages to the primary. Even if the primary recognizes that these messages are irrelevant for its operations, it has spent resources (network bandwidth, computational power, and memory) to do so, thereby reducing throughput. Notice that—in the worst case—this can even lead to failure of a non-faulty primary to propose transactions in a timely manner.

Where traditional consensus-based systems fail to deal with these attacks, the concurrent design of RCC can be used to mitigate these attacks.

First, we look at ordering attacks. To mitigate this type of attack, we propose a method to deterministically select a different permutation of the order of execution in every round in such a way that this ordering is practically impossible to predict or influence by faulty replicas. Note that for any sequence S of k = |S| values, there exist k! distinct permutations. We write P(S) to denote these permutations of S. To deterministically select one of these permutations, we construct a function that maps an integer h ∈ {0, ..., k! − 1} to a unique permutation in P(S). Then we discuss how replicas will uniformly pick h. As |P(S)| = k!, we can construct the following bijection fS : {0, ..., k! − 1} → P(S):

   fS(i) = S                     if |S| = 1;
   fS(i) = fS∖S[q](r) ⊕ S[q]     if |S| > 1,

in which q = i div (|S| − 1)! is the quotient and r = i mod (|S| − 1)! is the remainder of integer division by (|S| − 1)!. Using induction on the size of S, we can prove:

Lemma IV.2. fS is a bijection from {0, ..., |S|! − 1} to all possible permutations of S.

Let S be the sequence of all transactions accepted in round ρ, ordered on increasing instance. The replicas uniformly pick h = digest(S) mod (k! − 1), in which digest(S) is a strong cryptographic hash function that maps an arbitrary value v to a numeric digest value in a bounded range such that it is practically impossible to find another value S′, S ≠ S′, with digest(S) = digest(S′). When at least one primary is non-malicious (m > f), the final value h is only known after completion of round ρ and it is practically impossible to predictably influence this value. After selecting h, all replicas execute the transactions in S in the order given by fS(h).
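The bijection fS and the digest-based selection of h translate directly into code. In the Python sketch below (our construction of the details), SHA-256 serves as the strong cryptographic hash and S is serialized naively via repr; both are illustrative assumptions.

    import hashlib
    from math import factorial

    def f(S, i):
        # The bijection f_S of Lemma IV.2: maps i in {0, ..., |S|! - 1}
        # to the permutation f_{S \ S[q]}(r) followed by S[q].
        if len(S) == 1:
            return list(S)
        q, r = divmod(i, factorial(len(S) - 1))
        return f(S[:q] + S[q + 1:], r) + [S[q]]

    def round_order(S):
        # Pick h = digest(S) mod (k! - 1) and order the transactions of
        # the round by f_S(h).
        h = int.from_bytes(hashlib.sha256(repr(S).encode()).digest(), "big")
        return f(S, h % (factorial(len(S)) - 1))

    # Sanity check of Lemma IV.2: all k! indices yield distinct permutations.
    S = ["T1", "T2", "T3", "T4"]
    assert len({tuple(f(S, i)) for i in range(factorial(len(S)))}) == factorial(len(S))
    print(round_order(S))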
To deal with primaries that throttle their instances, non-faulty replicas will detect failure of those instances that lag behind other instances. Specifically, if an instance Ii, 1 ≤ i ≤ m, is σ rounds behind any other instances (for some system-dependent constant σ), then R detects failure of Pi.

Finally, we notice that concurrent consensus and RCC—by design—provide load balancing with respect to the tasks of the primary by spreading the total workload of the system over many primaries. As such, RCC not only improves performance when bounded by the primary bandwidth, but also when performance is bounded by computational power (e.g., due to costly cryptographic primitives), or by message delays. Furthermore, this load balancing reduces the load on any single primary to propose and process a given amount of transactions, dampening the effects of any targeted attacks against the resources of a single primary.

V. EVALUATION OF THE PERFORMANCE OF RCC

In the previous sections, we proposed concurrent consensus and presented the design of RCC, our concurrent consensus paradigm. To show that concurrent consensus not only provides benefits in theory, we study the performance of RCC and the effects of concurrent consensus in a practical setting. To do so, we measure the performance of RCC in RESILIENTDB—our high-performance resilient blockchain fabric—and compare RCC with the well-known primary-backup consensus protocols PBFT, ZYZZYVA, SBFT, and HOTSTUFF. With this study, we aim to answer the following questions:
Q1) What is the performance of RCC: does RCC deliver on the promises of concurrent consensus and provide more throughput than any primary-backup consensus protocol can provide?
Q2) What is the scalability of RCC: does RCC deliver on the promises of concurrent consensus and provide better scalability than primary-backup consensus protocols?
Q3) Does RCC provide sufficient load balancing of primary tasks to improve performance of consensus by offsetting any high costs incurred by the primary?
Q4) How does RCC fare under failures?
Q5) What is the impact of batching client transactions on the performance of RCC?

First, in Section V-A, we describe the experimental setup. Then, in Section V-B, we provide a high-level overview of RESILIENTDB and of its general performance characteristics. Next, in Section V-C, we provide details on the consensus protocols we use in this evaluation. Then, in Section V-D, we present the experiments we performed and the measurements obtained. Finally, in Section V-E, we interpret these measurements and answer the above research questions.

A. Experimental Setup

To be able to study the practical performance of RCC and other consensus protocols, we choose to study these protocols in a full resilient database system. To do so, we implemented RCC in RESILIENTDB. To generate a workload for the protocols, we used the Yahoo Cloud Serving Benchmark [27] provided by the Blockbench macro benchmarks [28]. In the generated workload, each client transaction queries a YCSB table with half a million active records, and 90% of the transactions write and modify records. Prior to the experiments, each replica is initialized with an identical copy of the YCSB table. We perform all experiments in the Google Cloud. Specifically, each replica is deployed on a c2-machine with a 16-core Intel Xeon Cascade Lake CPU running at 3.8 GHz and with 32 GB memory. We use up to 320k clients, deployed on 16 machines.

B. The RESILIENTDB Blockchain Fabric

The RESILIENTDB fabric incorporates secure permissioned blockchain technologies to provide resilient data processing. A detailed description of how RESILIENTDB achieves high-throughput consensus in a practical setting can be found in Gupta et al. [21], [29], [30], [31], [32]. The architecture of RESILIENTDB is optimized for maximizing throughput via multi-threading and pipelining. To further maximize throughput and minimize the overhead of any consensus protocol, RESILIENTDB has built-in support for batching of client transactions.

We typically group 100 txn/batch. In this case, the size of a proposal is 5400 B and of a client reply (for 100 transactions) is 1748 B. The other messages exchanged between replicas during the Byzantine commit algorithm have a size of 250 B.
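As a rough worked example of these message sizes, the sketch below estimates the bytes a primary sends per batch in our PBFT-based deployment. It is a simplification under our own assumptions: it counts one proposal broadcast, one COMMIT broadcast, and one client reply, and ignores authentication overhead and retransmissions.

    # Approximate outgoing bytes at the primary per 100-transaction batch,
    # using the sizes reported above (5400 B proposal, 250 B BCA message,
    # 1748 B client reply).
    def primary_bytes_per_batch(n, proposal=5400, bca_msg=250, reply=1748):
        return (n - 1) * proposal + (n - 1) * bca_msg + reply

    # For n = 16 this is roughly 86 KB per batch: the proposal broadcast
    # dominates the primary's bandwidth, as argued in Section I-A.
    print(primary_bytes_per_batch(16))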
RESILIENTDB supports out-of-order processing of transactions in which primaries can propose future transactions before current transactions are executed. This allows RESILIENTDB to maximize throughput of any primary-backup protocol that supports out-of-order processing (e.g., PBFT, ZYZZYVA, and SBFT) by maximizing bandwidth utilization at the primary.

In RESILIENTDB, each replica maintains a blockchain ledger (a journal) that holds an ordered copy of all executed transactions. The ledger not only stores all transactions, but also proofs of their acceptance by a consensus protocol. As these proofs are built using strong cryptographic primitives, the ledger is immutable and, hence, can be used to provide strong data provenance.

In our experiments, replicas not only perform consensus, but also communicate with clients and execute transactions. In this practical setting, performance is not fully determined by bandwidth usage due to consensus (as outlined in Section I-A), but also by the cost of communicating with clients, of sequential execution of all transactions, of cryptography, and of other steps involved in processing messages and transactions, and by the available memory limitations. To illustrate this, we have measured the effects of client communication, execution, and cryptography on our deployment of RESILIENTDB.

Fig. 7. Characteristics of RESILIENTDB deployed on the Google Cloud. Left, the maximum performance of a single replica that receives client transactions, optionally executes them (Full), and sends replies. Right, the performance of PBFT with n = 16 replicas that uses no cryptography (None), uses ED25519 public-key cryptography (PK), or CMAC-AES message authentication codes (MAC) to authenticate messages.

In Figure 7, left, we present the maximum performance of a single replica that receives client transactions, optionally executes them (Full), and sends replies (without any consensus steps). In this figure, we count the total number of client transactions that are completed during the experiment. As one can see, the system can receive and respond to up-to 551 ktxn/s, but can only execute up-to 217 ktxn/s.

In Figure 7, right, we present the maximum performance of PBFT running on n = 16 replicas as a function of the cryptographic primitives used to provide authenticated communication. Specifically, PBFT can either use digital signatures or message authentication codes. For this comparison, we compare PBFT using: (1) a baseline that does not use any message authentication (None); (2) ED25519 digital signatures for all messages (PK); and (3) CMAC-AES message authentication codes for all messages exchanged between replicas and ED25519 digital signatures for client transactions (MAC). As can be seen from the results, the costs associated with digital signatures are huge, as their usage reduces performance by 86%, whereas message authentication codes only reduce performance by 33%.

C. The Consensus Protocols

We evaluate the performance of RCC by comparing it with a representative sample of efficient practical primary-backup consensus protocols:

PBFT [15]: We use a heavily optimized out-of-order implementation that uses message authentication codes.

RCC: Our RCC implementation follows the design outlined in this paper. We have chosen to turn PBFT into a concurrent consensus protocol. We test with three variants: RCCn runs n concurrent instances, RCCf+1 runs f + 1 concurrent instances (the minimum to provide the benefits outlined in Section IV), and RCC3 runs 3 concurrent instances.

ZYZZYVA [16]: As described in Section I-A, ZYZZYVA has an optimal-case path due to which the performance of ZYZZYVA provides an upper-bound for any primary-backup protocol (when no failures occur). Unfortunately, the failure-handling of ZYZZYVA is costly, making ZYZZYVA unable to deal with any failures efficiently.

SBFT [17]: This protocol uses threshold signatures to minimize communication during the state exchange that is part of its Byzantine commit algorithm. Threshold signatures do not reduce the communication costs for the primary to propose client transactions, which have a major influence on performance in practice (see Section I-A), but can potentially greatly reduce all other communication costs.

HOTSTUFF [18]: As SBFT, HOTSTUFF uses threshold signatures to minimize communication. The state-exchange of HOTSTUFF has an extra phase compared to PBFT. This additional phase simplifies changing views in HOTSTUFF, and enables HOTSTUFF to regularly switch primaries (which limits the influence of any faulty replicas). Due to this design, HOTSTUFF does not support out-of-order processing (see Section I-A). As a consequence, HOTSTUFF is more affected by message delays than by bandwidth. In our implementation, we have used the efficient single-phase event-based variant of HOTSTUFF.

(a) Scalability (No Failures) (b) Scalability (No Failures) (c) Scalability (Single Failure) (d) Scalability (Single Failure) ·105 ·105 10 0 8.0 . ) 3.0 ) 3.0 s s / / 8.0 ) ) 6.0 s s txn 2.0 txn 2.0 6.0 4.0 4.0 1.0 1.0 Latency ( 2 0 Latency ( . 2.0 Throughput ( 0.0 0.0 Throughput ( 0.0 0.0 416 32 64 91 416 32 64 91 416 32 64 91 416 32 64 91 Number of replicas (n) Number of replicas (n) Number of replicas (n) Number of replicas (n)

(e) Batching (Single Failure) (f) Batching (Single Failure) (g) Out-of-ordering disabled (h) Out-of-ordering disabled ·105 ·104 3.0 ) ) 30.0 s s 6.0 / / 0.3 ) ) s s txn txn 2.0 20 0 . 4.0 0.2 1.0 10.0 2 0 Latency ( Latency ( . 0.1 Throughput ( Throughput ( 0 0 0 0 . . 0.0 0.0 1050100 200 400 1050100 200 400 416 32 64 91 416 32 64 91 Batch size Batch size Number of replicas (n) Number of replicas (n) Fig. 8. Evaluating system throughput and average latency incurred by RCC and other consensus protocols. during failure of a single replica while varying the batch size reach. We also notice that ZYZZYVA fails to deal with failures between 10 txn/batch and 400 txn/batch. We use n = 32 ((c) and (d)), in which case its performance plummets, a case replicas. The results can be found in Figure 8, (e) and (f). that the other protocols have no issues dealing with. In the fourth and final experiment, we measure the per- Finally, due to the lack of out-of-order processing capa- formance of the consensus protocols when outgoing primary bilities in HOTSTUFF, HOTSTUFF is uncompetitive to out- bandwidth is not the limiting factor. We do so by disabling out- of-order protocols. When we disable out-of-order processing of-order processing in all protocols that support out-of-order for all other protocols ((g) and (h)), the strength of the processing. This makes the performance of these protocols simple design of HOTSTUFF shows: its event-based single- inherently bounded by the message delay and not by network phase design outperforms all other primary-backup consensus bandwidth. We study this case by varying the number of protocols. Due to the concurrent design of RCC, a non-out- replicas between n = 4 and n = 91 and we use a batch of-order-RCC is still able to greatly outperform HOTSTUFF, size of 100 txn/batch. The results can be found in Figure 8, however, as the non-out-of-order variants of RCC balance (g) and (h). the entire workload over many primaries. Furthermore, as the E. Discussion throughput is not bound by any replica resources in this case From the experiments, a few obvious patterns emerge. First, (and only by network delays), the non-out-of-order variants we see that increasing the batch size ((e) and (f)) increases RCCf+1 and RCCn benefit from increasing the number of performance of all consensus protocols (Q5). This is in line replicas, as this also increases the amount of concurrent with what one can expect (See Section I-A and Section II). As processing (due to increasing the number of instances). the gains beyond 100 txn/batch are small, we have chosen to Summary: RCC implementations achieve up to 2.77×, use 100 txn/batch in all other experiments. 1.53×, 38×, and 82× higher throughput than SBFT, PBFT, Second, we see that the three versions of RCC outperform HOTSTUFF, and ZYZZYVA in single failure experiments. all other protocols, and the performance of RCC with or RCC implementations achieve up to 2×, 1.83×, 33×, and without failures is comparable ((a)–(d)). Furthermore, we see 1.45× higher throughput than SBFT, PBFT, HOTSTUFF, and that adding concurrency by adding more instances improves ZYZZYVA in no failure experiments, respectively. performance, as RCC3 is outperformed by the other RCC versions. On small deployments with n = 4,..., 16 replicas, Based on these observations, we conclude that RCC delivers the strength of RCC is most evident, as our RCC implemen- on the promises of concurrent consensus. 
Second, we see that the three versions of RCC outperform all other protocols, and that the performance of RCC with or without failures is comparable ((a)–(d)). Furthermore, we see that adding concurrency by adding more instances improves performance, as RCC3 is outperformed by the other RCC versions. On small deployments with n = 4, ..., 16 replicas, the strength of RCC is most evident, as our RCC implementations approach the maximum rate at which RESILIENTDB can execute transactions (see Section V-B).

Third, we see that RCC easily outperforms ZYZZYVA, even in the best-case scenario of no failures ((a) and (b)). We also see that ZYZZYVA is, indeed, the fastest primary-backup consensus protocol when no failures happen. This underlines the ability of RCC, and of concurrent consensus in general, to reach throughputs that no primary-backup consensus protocol can reach. We also notice that ZYZZYVA fails to deal with failures ((c) and (d)), in which case its performance plummets, a case that the other protocols have no issues dealing with.

Finally, due to the lack of out-of-order processing capabilities in HOTSTUFF, HOTSTUFF is uncompetitive with out-of-order protocols. When we disable out-of-order processing for all other protocols ((g) and (h)), the strength of the simple design of HOTSTUFF shows: its event-based single-phase design outperforms all other primary-backup consensus protocols. Due to the concurrent design of RCC, a non-out-of-order RCC is still able to greatly outperform HOTSTUFF, as the non-out-of-order variants of RCC balance the entire workload over many primaries. Furthermore, as the throughput is not bound by any replica resources in this case (and only by network delays), the non-out-of-order variants RCCf+1 and RCCn benefit from increasing the number of replicas, as this also increases the amount of concurrent processing (due to the increasing number of instances).

Summary: RCC implementations achieve up to 2.77×, 1.53×, 38×, and 82× higher throughput than SBFT, PBFT, HOTSTUFF, and ZYZZYVA, respectively, in the single-failure experiments, and up to 2×, 1.83×, 33×, and 1.45× higher throughput than SBFT, PBFT, HOTSTUFF, and ZYZZYVA in the no-failure experiments.

Based on these observations, we conclude that RCC delivers on the promises of concurrent consensus. RCC provides more throughput than any primary-backup consensus protocol can provide (Q1). Moreover, RCC provides great scalability if throughput is only bounded by the primaries: as the non-out-of-order results show, the load-balancing capabilities of RCC can even offset inefficiencies in other parts of the consensus protocol (Q2, Q3). Finally, we conclude that RCC can efficiently deal with failures (Q4). Hence, RCC meets the design goals D1–D5 that we set out in Section III.
Fig. 9. Evaluating system throughput and latency attained by three RCC variants: RCC-P, RCC-Z, and RCC-S, when there are no failures. (Panels plot throughput (txn/s) and latency (s) against the number of replicas (n).)

F. Analyzing RCC as a Paradigm

Finally, we experimentally illustrate the ability of RCC to act as a paradigm. To do so, we apply RCC not only to PBFT, but also to ZYZZYVA and SBFT. In Figure 9, we plot the performance of these three variants of RCC: RCC-P (RCC+PBFT), RCC-Z (RCC+ZYZZYVA), and RCC-S (RCC+SBFT). To evaluate the scalability of these protocols, we perform experiments in the optimistic setting with no failures and m = n concurrent instances.

It is evident from these plots that all RCC variants achieve extremely high throughput. As SBFT and ZYZZYVA only require linear communication in the optimistic case, RCC-S and RCC-Z are able to achieve up to 3.33× and 2.78× higher throughputs than RCC-P, respectively.

Notice that RCC-S consistently attains equal or higher throughput than RCC-Z, even though ZYZZYVA scales better than SBFT. This phenomenon is caused by the way RCC-Z interacts with clients. Specifically, like ZYZZYVA, RCC-Z requires its clients to wait for responses from all n replicas. Hence, clients have to wait longer before they can place new transactions, and consequently RCC-Z requires more clients than RCC-S to attain maximum performance. Even if we ran RCC-Z with 5 million clients, the largest amount at our disposal, we would not see its maximum performance. Due to the low single-primary performance of ZYZZYVA, this phenomenon does not prevent ZYZZYVA itself from reaching its maximum performance.
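The paradigm view can be made concrete in a few lines of Python. The sketch below is our own simplified illustration and not the RESILIENTDB implementation: PrimaryBackupInstance is a stand-in for a PBFT-, ZYZZYVA-, or SBFT-style instance, and the round-robin merge is a stand-in for deterministically combining the decisions of concurrent instances into a single executable order.

    # Minimal sketch of concurrent consensus as a paradigm (illustrative
    # stand-ins only; all names are our own, not from RESILIENTDB).
    class PrimaryBackupInstance:
        """One instance of a base primary-backup protocol, led by one primary."""
        def __init__(self, primary_id: int):
            self.primary_id = primary_id
            self.log = []  # transactions this instance has decided, in order

        def propose(self, txn) -> None:
            # The real protocol would broadcast txn and run its agreement
            # phases; here we only record the resulting decision.
            self.log.append(txn)

    class ConcurrentConsensus:
        """Runs m independent base-protocol instances side by side."""
        def __init__(self, make_instance, m: int):
            self.instances = [make_instance(i) for i in range(m)]

        def submit(self, txn, key: int) -> None:
            # Clients spread transactions over the m primaries.
            self.instances[key % len(self.instances)].propose(txn)

        def execution_order(self) -> list:
            # Merge the per-instance logs round-robin so that every replica
            # derives the same global order from the same decided logs.
            order, i = [], 0
            while any(len(inst.log) > i for inst in self.instances):
                order += [inst.log[i] for inst in self.instances if len(inst.log) > i]
                i += 1
            return order

    rcc = ConcurrentConsensus(PrimaryBackupInstance, m=3)
    for k, txn in enumerate(["t0", "t1", "t2", "t3"]):
        rcc.submit(txn, k)
    print(rcc.execution_order())  # ['t0', 't1', 't2', 't3']

The base protocol is a parameter here: instantiating the instances with PBFT-, ZYZZYVA-, or SBFT-style logic changes only what propose does, not the concurrency structure around it, which is why the same paradigm yields RCC-P, RCC-Z, and RCC-S.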
VI. RELATED WORK

In Section I-A, we already discussed well-known primary-backup consensus protocols such as PBFT, ZYZZYVA, and HOTSTUFF, and why these protocols underutilize resources. Furthermore, there is abundant literature on consensus in general and on primary-backup consensus specifically (e.g., [11], [33], [34], [35]). Next, we shall focus on the few works that deal with either improving throughput and scalability or with improving resilience, the two strengths of RCC.

Parallelizing consensus: Several recent consensus designs propose to run several primaries concurrently, e.g., [20], [36], [37], [38]. None of these proposals satisfy all design goals of RCC, however. Specifically, these proposals all fall short with respect to maximizing potential throughput in all cases, as none of them satisfy the wait-free design goals D4 and D5 of RCC.

Example VI.1. The MIRBFT protocol [36] proposes to run concurrent instances of PBFT, in a fashion similar to RCC. The key difference is how MIRBFT deals with failures: MIRBFT operates in global epochs in which a super-primary decides which instances are enabled. During any failure, MIRBFT will switch to a new epoch via a view-change protocol that temporarily shuts down all instances and subsequently reduces throughput to zero. This is in sharp contrast to the wait-free design of RCC, in which failures are handled on a per-instance level. In Figure 10, we illustrate these differences in the failure recovery of RCC and MIRBFT.

Fig. 10. Throughput of RCC versus MIRBFT during instance failures with m = 11 instances. At (a), primary P1 fails. In RCC, all other instances are unaffected, whereas in MIRBFT all replicas need to coordinate recovery. At (b), recovery is finished. In RCC, all instances can resume work, whereas MIRBFT halts an instance due to recovery. At (c), primaries P1 and P2 fail. In RCC, P2 will be recovered at (d) and P1 at (e) (as P1 failed twice, its recovery in RCC takes twice as long). In MIRBFT, recovery is finished at (d), after which MIRBFT operates with only m = 9 instances. At (e) and (f), MIRBFT decides that the system is sufficiently reliable and enables the remaining instances one at a time.

As is clear from the figure, the fully-coordinated approach of MIRBFT results in substantial performance degradation during failure recovery. Hence, MIRBFT does not meet design goals D4 and D5, which sharply limits the throughput of MIRBFT when compared to RCC.
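The difference between wait-free, per-instance recovery and globally coordinated recovery can be modeled in a few lines. The following is our own toy model (one unit of throughput per healthy instance and a fixed recovery period; it is not the actual RCC or MIRBFT protocol logic):

    # Toy model of the two recovery styles during a single primary failure.
    def wait_free_throughput(m: int, t: int, recovery_end: int) -> int:
        # RCC-style: only the failed instance pauses; the rest keep working.
        return m - 1 if t < recovery_end else m

    def global_epoch_throughput(m: int, t: int, recovery_end: int) -> int:
        # MIRBFT-style: an epoch change halts all instances until recovery ends.
        return 0 if t < recovery_end else m

    for t in range(4):
        print(t, wait_free_throughput(11, t, 2), global_epoch_throughput(11, t, 2))
    # t = 0, 1: the wait-free design keeps 10 of the 11 instances running,
    # while the epoch-based design processes nothing; at t = 2, 3 both are
    # back to 11, mirroring the qualitative behavior in Figure 10.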
Reducing malicious behavior: Several works have observed that traditional consensus protocols only address a narrow set of malicious behavior, namely behavior that prevents any progress [20], [22], [23], [39]. Hence, several designs have been proposed to also address behavior that impedes performance without completely preventing progress. One such design is RBFT [20], which uses concurrent primaries not to improve performance, as we propose, but only to mitigate throttling attacks in a way similar to what we described in Section IV. In practice, the design of RBFT results in poor performance at high costs. HOTSTUFF [18], SPINNING [22], and PRIME [23] all propose to minimize the influence of malicious primaries by replacing the primary every round. This avoids the costs of RBFT, while still reducing, but not eliminating, the ability of faulty replicas to severely reduce throughput. Unfortunately, these protocols follow the design of primary-backup consensus protocols and, as discussed in Section II, such designs are unable to achieve throughputs close to those reached by a concurrent consensus protocol such as RCC.

Concurrent consensus via sharding: Several recent works have proposed to speed up consensus-based systems by incorporating sharding, either at the data level (e.g., [5], [6], [7], [40], [41], [42]) or at the consensus level (e.g., [43]). In these approaches, only a small subset of all replicas, those in a single shard, participate in the consensus on any given transaction, thereby reducing the costs to replicate this transaction and enabling concurrent processing in independent shards. As such, sharded designs can promise huge scalability benefits for easily-sharded workloads. To do so, sharded designs utilize a weaker failure model than the fully-replicated model RCC uses, however. Consider, e.g., a sharded system with z shards of n = 3f + 1 replicas each. In this setting, the system can only tolerate the failure of up to f replicas in a single shard, whereas a fully-replicated system using zn replicas could tolerate the failure of any choice of ⌊(zn − 1)/3⌋ replicas. Furthermore, sharded designs typically operate consensus protocols such as PBFT in each shard to order local transactions, which opens the opportunity for concurrent consensus and RCC to achieve even higher performance in these designs.
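To make the resilience gap concrete, the following worked example instantiates the bounds stated above (the chosen values of z and f are our own illustration):

    # Fault tolerance of z shards of n = 3f + 1 replicas each versus full
    # replication over the same z * n replicas.
    z, f = 4, 2
    n = 3 * f + 1                        # 7 replicas per shard, z * n = 28 in total

    per_shard_tolerance = f              # safe only while every shard has at most
                                         # f faulty replicas; f + 1 = 3 failures in
                                         # one shard already compromise that shard
    full_replication = (z * n - 1) // 3  # floor((zn - 1)/3) = 9 failures, anywhere

    print(per_shard_tolerance, full_replication)  # 2 versus 9

Hence, for the same 28 replicas, the sharded deployment is already compromised by three failures that happen to hit one shard, whereas the fully-replicated deployment tolerates any nine failures.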
VII. CONCLUSION

In this paper, we proposed concurrent consensus as a major step toward enabling high-throughput and more scalable consensus-based database systems. We have shown that concurrent consensus is, in theory, able to achieve throughputs that primary-backup consensus systems are unable to achieve. To put the idea of concurrent consensus into practice, we proposed the RCC paradigm, which can be used to make normal primary-backup consensus protocols concurrent. Furthermore, we showed that RCC is capable of making consensus-based systems more resilient to failures by sharply reducing the impact of faulty replicas on the throughput and operations of the system. We have also put the design of the RCC paradigm to the test by implementing it in RESILIENTDB, our high-performance resilient blockchain fabric, and comparing it with state-of-the-art primary-backup consensus protocols. Our experiments show that RCC is able to fulfill the promises of concurrent consensus, as it significantly outperforms other consensus protocols and provides better scalability. As such, we believe that RCC opens the door to the development of new high-throughput resilient database and federated transaction processing systems.

Acknowledgements: We would like to acknowledge Sajjad Rahnama and Patrick J. Liao for their help during the initial stages of this work.

REFERENCES

[1] S. Gupta, J. Hellings, and M. Sadoghi, “Brief announcement: Revisiting consensus protocols through wait-free parallelization,” in 33rd International Symposium on Distributed Computing (DISC 2019), vol. 146. Schloss Dagstuhl, 2019, pp. 44:1–44:3.
[2] M. Herlihy, “Blockchains from a distributed computing perspective,” Commun. ACM, vol. 62, no. 2, pp. 78–85, 2019.
[3] A. Narayanan and J. Clark, “Bitcoin’s academic pedigree,” Commun. ACM, vol. 60, no. 12, pp. 36–45, 2017.
[4] S. Gupta, J. Hellings, and M. Sadoghi, Fault-Tolerant Distributed Transactions on Blockchains, ser. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2020, (to appear).
[5] M. J. Amiri, D. Agrawal, and A. E. Abbadi, “CAPER: A cross-application permissioned blockchain,” Proc. VLDB Endow., vol. 12, no. 11, pp. 1385–1398, 2019.
[6] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis, A. De Caro, D. Enyeart, C. Ferris, G. Laventman, Y. Manevich, S. Muralidharan, C. Murthy, B. Nguyen, M. Sethi, G. Singh, K. Smith, A. Sorniotti, C. Stathakopoulou, M. Vukolić, S. W. Cocco, and J. Yellick, “Hyperledger Fabric: A distributed operating system for permissioned blockchains,” in Proceedings of the Thirteenth EuroSys Conference. ACM, 2018, pp. 30:1–30:15.
[7] M. El-Hindi, C. Binnig, A. Arasu, D. Kossmann, and R. Ramamurthy, “BlockchainDB: A shared database on blockchains,” Proc. VLDB Endow., vol. 12, no. 11, pp. 1597–1609, 2019.
[8] S. Nathan, C. Govindarajan, A. Saraf, M. Sethi, and P. Jayachandran, “Blockchain meets database: Design and implementation of a blockchain relational database,” Proc. VLDB Endow., vol. 12, no. 11, pp. 1539–1552, 2019.
[9] F. Nawab and M. Sadoghi, “Blockplane: A global-scale byzantizing middleware,” in 35th International Conference on Data Engineering (ICDE). IEEE, 2019, pp. 124–135.
[10] L. Lao, Z. Li, S. Hou, B. Xiao, S. Guo, and Y. Yang, “A survey of IoT applications in blockchain systems: Architecture, consensus, and traffic modeling,” ACM Comput. Surv., vol. 53, no. 1, 2020.
[11] C. Cachin and M. Vukolić, “Blockchain consensus protocols in the wild (keynote talk),” in 31st International Symposium on Distributed Computing, vol. 91. Schloss Dagstuhl, 2017, pp. 1:1–1:16.
[12] J. Gray, “Notes on data base operating systems,” in Operating Systems, An Advanced Course. Springer-Verlag, 1978, pp. 393–481.
[13] D. Skeen, “A quorum-based commit protocol,” Cornell University, Tech. Rep., 1982.
[14] S. Gupta and M. Sadoghi, “EasyCommit: A non-blocking two-phase commit protocol,” in Proceedings of the 21st International Conference on Extending Database Technology. Open Proceedings, 2018, pp. 157–168.
[15] M. Castro and B. Liskov, “Practical byzantine fault tolerance and proactive recovery,” ACM Trans. Comput. Syst., vol. 20, no. 4, pp. 398–461, 2002.
[16] R. Kotla, L. Alvisi, M. Dahlin, A. Clement, and E. Wong, “Zyzzyva: Speculative byzantine fault tolerance,” ACM Trans. Comput. Syst., vol. 27, no. 4, pp. 7:1–7:39, 2009.
[17] G. Golan Gueta, I. Abraham, S. Grossman, D. Malkhi, B. Pinkas, M. Reiter, D.-A. Seredinschi, O. Tamir, and A. Tomescu, “SBFT: A scalable and decentralized trust infrastructure,” in 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 2019, pp. 568–580.
[18] M. Yin, D. Malkhi, M. K. Reiter, G. G. Gueta, and I. Abraham, “HotStuff: BFT consensus with linearity and responsiveness,” in Proceedings of the ACM Symposium on Principles of Distributed Computing. ACM, 2019, pp. 347–356.
[19] S. Gupta, J. Hellings, S. Rahnama, and M. Sadoghi, “Proof-of-execution: Reaching consensus through fault-tolerant speculation,” 2019. [Online]. Available: http://arxiv.org/abs/1911.00838
[20] P.-L. Aublin, S. B. Mokhtar, and V. Quéma, “RBFT: Redundant byzantine fault tolerance,” in 2013 IEEE 33rd International Conference on Distributed Computing Systems. IEEE, 2013, pp. 297–306.
[21] S. Gupta, S. Rahnama, and M. Sadoghi, “Permissioned blockchain through the looking glass: Architectural and implementation lessons learned,” in 40th International Conference on Distributed Computing Systems. IEEE, 2020.
[22] G. S. Veronese, M. Correia, A. N. Bessani, and L. C. Lung, “Spin one’s wheels? byzantine fault tolerance with a spinning primary,” in 2009 28th IEEE International Symposium on Reliable Distributed Systems. IEEE, 2009, pp. 135–144.
[23] Y. Amir, B. Coan, J. Kirsch, and J. Lane, “Prime: Byzantine replication under attack,” IEEE Trans. Depend. Secure Comput., vol. 8, no. 4, pp. 564–577, 2011.
[24] M. J. Fischer, N. A. Lynch, and M. S. Paterson, “Impossibility of distributed consensus with one faulty process,” J. ACM, vol. 32, no. 2, pp. 374–382, 1985.
[25] S. Gilbert and N. Lynch, “Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services,” SIGACT News, vol. 33, no. 2, pp. 51–59, 2002.
[26] J. Hellings and M. Sadoghi, “Coordination-free byzantine replication with minimal communication costs,” in 23rd International Conference on Database Theory (ICDT 2020), vol. 155. Schloss Dagstuhl, 2020, pp. 17:1–17:20.
[27] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking cloud serving systems with YCSB,” in Proceedings of the 1st ACM Symposium on Cloud Computing. ACM, 2010, pp. 143–154.
[28] T. T. A. Dinh, J. Wang, G. Chen, R. Liu, B. C. Ooi, and K.-L. Tan, “BLOCKBENCH: A framework for analyzing private blockchains,” in Proceedings of the 2017 ACM International Conference on Management of Data. ACM, 2017, pp. 1085–1100.
[29] S. Rahnama, S. Gupta, T. Qadah, J. Hellings, and M. Sadoghi, “Scalable, resilient and configurable permissioned blockchain fabric,” Proc. VLDB Endow., vol. 13, no. 12, pp. 2893–2896, 2020.
[30] S. Gupta, J. Hellings, S. Rahnama, and M. Sadoghi, “An in-depth look of BFT consensus in blockchain: Challenges and opportunities,” in Proceedings of the 20th International Middleware Conference Tutorials, Middleware. ACM, 2019, pp. 6–10.
[31] ——, “Blockchain consensus unraveled: virtues and limitations,” in Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems. ACM, 2020, pp. 218–221.
[32] ——, “Building high throughput permissioned blockchain fabrics: Challenges and opportunities,” Proc. VLDB Endow., vol. 13, no. 12, pp. 3441–3444, 2020.
[33] C. Berger and H. P. Reiser, “Scaling byzantine consensus: A broad analysis,” in Proceedings of the 2nd Workshop on Scalable and Resilient Infrastructures for Distributed Ledgers. ACM, 2018, pp. 13–18.
[34] T. T. A. Dinh, R. Liu, M. Zhang, G. Chen, B. C. Ooi, and J. Wang, “Untangling blockchain: A data processing view of blockchain systems,” IEEE Trans. Knowl. Data Eng., vol. 30, no. 7, pp. 1366–1385, 2018.
[35] S. Gupta and M. Sadoghi, Blockchain Transaction Processing. Springer International Publishing, 2018, pp. 1–11.
[36] C. Stathakopoulou, T. David, and M. Vukolić, “Mir-BFT: High-throughput BFT for blockchains,” 2019. [Online]. Available: http://arxiv.org/abs/1906.05552
[37] M. Eischer and T. Distler, “Scalable byzantine fault-tolerant state-machine replication on heterogeneous servers,” Computing, vol. 101, pp. 97–118, 2019.
[38] B. Li, W. Xu, M. Z. Abid, T. Distler, and R. Kapitza, “SAREK: Optimistic parallel ordering in byzantine fault tolerance,” in 2016 12th European Dependable Computing Conference (EDCC). IEEE, 2016, pp. 77–88.
[39] A. Clement, E. Wong, L. Alvisi, M. Dahlin, and M. Marchetti, “Making byzantine fault tolerant systems tolerate byzantine faults,” in Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation. USENIX Association, 2009, pp. 153–168.
[40] J. Hellings, D. P. Hughes, J. Primero, and M. Sadoghi, “Cerberus: Minimalistic multi-shard byzantine-resilient transaction processing,” 2020. [Online]. Available: https://arxiv.org/abs/2008.04450
[41] M. J. Amiri, D. Agrawal, and A. El Abbadi, “SharPer: Sharding permissioned blockchains over network clusters,” 2019. [Online]. Available: https://arxiv.org/abs/1910.00765v1
[42] H. Dang, T. T. A. Dinh, D. Loghin, E.-C. Chang, Q. Lin, and B. C. Ooi, “Towards scaling blockchain systems via sharding,” in Proceedings of the 2019 International Conference on Management of Data. ACM, 2019, pp. 123–140.
[43] S. Gupta, S. Rahnama, J. Hellings, and M. Sadoghi, “ResilientDB: Global scale resilient blockchain fabric,” Proc. VLDB Endow., vol. 13, no. 6, pp. 868–883, 2020.