Gosig: Scalable Byzantine Consensus on Adversarial Wide Area Network for Blockchains

Gosig: Scalable Byzantine Consensus on Adversarial Wide Area Network for Blockchains Peilun Li Guosai Wang Xiaoqi Chen Tsinghua University Tsinghua University Princeton University Wei Xu Tsinghua University Abstract longer have to worry about Sybil attacks [21], there are Existing Byzantine fault tolerance (BFT) protocols still some other significant challenges. face significant challenges in the consortium blockchain The blockchain is replicated to each participant, and scenario. On the one hand, we can make little assump- the key problem is how to reach consensus on these repli- tions about the reliability and security of the underlying cas. Comparing to traditional distributed transaction sys- Internet. On the other hand, the applications on consor- tems, the three biggest challenges of a blockchain are: tium blockchains demand a system as scalable as the Bit- 1) Players are from different organizations without coin but providing much higher performance, as well as mutual trust. Failures, even Byzantine failures, are com- provable safety. We present a new BFT protocol, Gosig, mon. Thus, we cannot rely on a small quorum (e.g., that combines crypto-based secret leader selection and Chubby [9] or ZooKeeper [27]) for consensus. Instead, multi-round voting in the protocol layer with implemen- we need Byzantine consensus and allow all players to tation layer optimizations such as gossip-based message participate, i.e., supporting a consensus group with thou- propagation. In particular, Gosig guarantees safety even sands of servers. in a network fully controlled by adversaries, while pro- 2) The system runs on an open Internet. The network viding provable liveness with easy-to-achieve network can drop/delay communication arbitrarily. Even worse, connectivity assumption. On a wide area testbed con- as the addresses of the participants are known to all, an sisting of 140 Amazon EC2 servers spanning 14 cities adversary can easily launch attacks targeting any chosen on five continents, we show that Gosig can achieve over participant at any time, using techniques like distributed 4,000 transactions per second with less than 1 minute denial-of-service (DDoS) [26, 49]. The adversary can transaction confirmation time. strategically choose which players to attack, and victims will remain unavailable until the adversary adaptively 1 Introduction choose to attack others. We call these attacks adaptive attacks, which strongly threatens the special nodes in a The rise of cryptocurrencies, such as Bitcoin [43], in- protocol, such as the leaders. creases the awareness and adoption of the underly- 3) Different from Bitcoin that allows temporary incon- arXiv:1802.01315v1 [cs.DC] 5 Feb 2018 ing blockchain technology. Blockchain offers a dis- sistency (i.e., a fork), people usually expect a permis- tributed ledger that serializes and records the transac- sioned chain to provide more traditional transaction se- tions. Blockchain provides attractive properties such as mantics, i.e., a committed transaction is durable. Also, full decentralization, offline-verifiability, and most im- applications on a permissioned chain [1] require much portantly, scalable Byzantine fault tolerance on the In- higher throughput and lower latency than Bitcoin. ternet. Thus, Blockchain has become popular beyond While there are many Byzantine fault-tolerant (BFT) cryptocurrencies, expanding into different areas such as protocols, most of them do not sufficiently address these payment services, logistics, healthcare, and Internet-of- three challenges. For example, PBFT [13] and its suc- Things (IoT) [8, 54, 48]. cessors [33, 38] and even some Bitcoin variants [23] all While Bitcoin provides a permissionless protocol, depend on a leader or a small quorum to participate mul- where everyone can join, we focus on consortium tiple rounds of communications, and thus they are vul- blockchains (aka permissioned blockchains), where a nerable to adaptive attacks on the leader. Other protocols participant needs offline authentication to join. It is use- that try to avoid faulty leaders by changing leaders in a ful in many commercial applications [25]. While we no fixed order [35, 42, 51,3] cannot avoid adaptive leader 1 attacks, because once the adversaries know who is the 140 nodes, and achieve promising performance results.2 next leader, they can change their target accordingly. There is a new generation of BFT protocols designed 2 Related Work to run over the Internet. To avoid adaptive attacks, Al- gorand [24] hides the leader identity. To improve scal- Bitcoin and its variants. Permissionless public ability, ByzCoin[32] combines Proof of Work (PoW) blockchains like Bitcoin [43], Ethereum [53], PP- with multi-signature-based PBFT. To tolerate arbitrary Coin [31] need proof of work (PoW) or proof of stake network failures, HoneyBadgerBFT[41] adopts asyn- (PoS) to prevent Sybil attacks. They also need incen- chronous atomic broadcast [11] and asynchronous com- tive mechanisms to encourage people to join the public mon subset (ACS) [4]. network to keep the system safe. Other designs [17, 20] Unfortunately, as we will detail in Section 3.3, none of try to avoid chain forking but retain the design of PoW these BFT protocols offer the following properties at the or PoS. We assume consortium blockchains [10, 45, 52], same time: 1) liveness under adaptive attack, 2) scalabil- and mainly focus on the performance and safety of the ity to 10,000s of nodes with low latency (15 seconds in system, instead of the other economic aspects. our simulation) for commitment, and 3) provable safety Byzantine fault tolerance. The most important fea- (i.e., no fork at any time) with arbitrary network fail- ture of a BFT protocol is safety. Unfortunately, many ure. Also, there is no straightforward way to combine open source BFT protocols are not safe [12]. There are the techniques used in these protocols. two major approaches to design provable BFT agree- We present Gosig1, a new BFT protocol for permis- ment protocols. 1) Using multi-round voting: exam- sioned blockchains. Gosig can achieve all three proper- ple systems include PBFT [13] and its successors [33, ties above, and also provide provable liveness with par- 38, 14]; 2) Using leader-less atomic broadcast: Hon- tially synchronous network (details in Section 3.2). eyBadgerBFT [41] and [16, 34]. To prevent malicious Gosig elects different leaders secretly for every block, leaders from affecting the system, Aardvark [15] use and it eliminates the leader’s involvement after it pro- performance metrics to trigger view changes and Spin- poses a block to defend against adaptive attacks on lead- ning [51], Tendermint [35] or others [3, 42] rotates leader ers. At the implementation level, we use gossip-based roles in a round robin manner. However, there methods communications to fully exploit the link redundancy on can not avoid adaptive attacks because the leader role is the Internet while keeping the source safe. Since we known to all in advance, and thus can be muted by at- need to gather signatures during gossip, we adopt asyn- tacks like DDoS right before it becomes a leader. Gosig chronous multi-signature [6, 46] to largely reduce the adopts similar voting mechanism like PBFT to get good network overhead of consensus messages. performance without failure, and keeps safety and live- We evaluate Gosig on an Amazon EC2-based 140- ness under attacks. server testbed that spans 14 cities on five continents. We In order to scale the system, many systems adopt can achieve a throughput of 4,000 tps (transactions per the “hybrid consensus” design [30, 32, 44] that uses a second) with an average transaction confirmation latency Bitcoin-like protocol to select a small quorum, and use of 1 minute. Even when 1/4 of the nodes fail, we can another (hopefully faster) BFT protocol to commit trans- still maintain over 3,500 tps with the same latency. With actions. If adversaries can instantly launch adaptive at- over 100 participants, it effectively doubles the through- tacks on leaders, it is hard for these protocols to main- put and reduces the latency by 80% comparing to Honey- tain liveness. Algorand [24] leverages secret leader elec- BadgerBFT [41], the state-of-the-art protocol that offers tion and quorum member replacement methods to keep the same level of safety guarantee. Also, using simula- liveness. Gosig lets every player participate in the con- tions, we show that Gosig is able to scale to 10K nodes. sensus, but combines similar secret leader selection with In summary, our major contributions are: signature-based voting to prevent such attacks. 1) We propose a new BFT protocol that achieves scal- We use similar methods and adversary models pro- ability, provable safety and resilient to adaptive attack. posed in Algorand [24]. We adopt the idea of multi- 2) We propose a novel method of combining secret round voting from PBFT and HoneyBadgerBFT [41], leader selection, random gossip, multi-round voting, and and the idea of multi-signature from ByzCoin [32]. multi-signature into a single BFT protocol in a compati- Gosig combines these incompatible methods in a coher- ble way. ent protocol and achieves good performance. We com- 3) We provide a real Gosig implementation, evaluate pare the key differences of these protocols in Section 3.3. it on a real-world geographically distributed testbed with Overlay network and gossip. Most BFT protocols and blockchains use broadcast as a communication primitive. 1The name is a combination of Gossip and Signature aggregation, two techniques we use. 2We will opensouce Gosig when the paper is published. 2 To improve broadcast reliability on the Internet, people Here we define safety and liveness of blocks instead often use application-layer overlay networks. We adopt of transactions for simplicity. We rely on gossip mecha- techniques like gossip from reliable multicast [5], prob- nisms to ensure that a transaction will reach most play- abilistic broadcast [22, 29] and other peer-to-peer (P2P) ers, and an honest player will pack the transaction when networks [28, 50].

Gosig: Scalable Byzantine Consensus on Adversarial Wide Area Network for Blockchains

Chapter 1 from Byzantine Consensus to Blockchain Consensus

A Survey of Distributed Consensus Protocols for Blockchain Networks

A MILP Model for a Byzantine Fault Tolerant Blockchain Consensus

Byzantine Consensus in Asynchronous Message-Passing Systems: a Survey

Virginia Gold 212-626-0505 [email protected]

Byzantine Fault Tolerance

Distributed Computing Column 36 Distributed Computing: 2009 Edition

Leaderless Byzantine Fault Tolerance

Modelling and Testing Composite Byzantine-Fault Tolerant Consensus Protocols

Enabling Secure and Resource-Efficient Blockchain Networks with VOLT

Byzantizing Paxos by Refinement

Asynchronous Byzantine Fault Tolerant