BLOCKBENCH: a Framework for Analyzing Private Blockchains
Total Page:16
File Type:pdf, Size:1020Kb
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by ScholarBank@NUS BLOCKBENCH: A Framework for Analyzing Private Blockchains WANG JI (B.E, Harbin Institute of Technology, China) A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE DEPARTMENT OF COMPUTER SCIENCE NATIONAL UNIVERSITY OF SINGAPORE 2017 Supervisor: Professor Ooi Beng Chin Examiners: Associate Professor Khoo Siau Cheng Associate Professor Teo Yong Meng Declaration I hereby declare that this thesis is my original work and it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously. Wang Ji 18 June 2017 Acknowledgements First and the most important, I would like to express my most sincere gratitude to my supervisor, Prof. Beng Chin Ooi for his continuous guidance and support all the way. Prof. Ooi taught me many things, not limited to how to do research, but how to over- come challenges not only in research area but also in life. Without Prof Ooi’s generous support, I would not have been able to complete this thesis. I also appreciate the help from all my colleagues and my friends. Thanks to my se- niors Wei Wang, Qian Lin, Sheng Wang, Jinyang Gao, Zhongle Xie, Hao Zhang, Kaip- ing Zheng, Mo Sha, Wentian Guo, Zhaojing Luo, Yingjun Wu, Chang Yao, Xin Ji, Fei He, Wenfeng Wu and Xiangrui Cai for their continuous help and advice from since I first came here. Special thanks to the excellent work and support from all my co-authors, Anh Dinh, Rui Liu, Qian Lin, Pingcheng Ruan, Sheng Wang and Hao Zhang. I also want to express my gratitude to my parents for their endless love, understand- ing, and support to me. Contents 1 Introduction 1 1.1 Motivations . 1 1.2 Background . 4 1.3 Contributions . 8 1.4 Synopsis . 9 2 Literature Review 10 2.1 System Benchmarks . 10 2.2 Security and Peformance Studies on Public Blockchains . 11 2.3 Summary . 13 3 BLOCKBENCH Design 14 3.1 Blockchain Layers . 14 3.1.1 Consensus . 15 3.1.2 Data model . 17 3.1.3 Execution layer . 18 3.1.4 Application layer . 19 3.1.5 Survey of Blockchain Platforms . 20 3.2 BLOCKBENCH Implementation . 22 3.3 Evaluation Metrics . 24 3.4 Workloads . 26 3.4.1 Macro benchmark workloads . 26 3.4.2 Micro benchmark workloads . 28 i 4 Performance Benchmark 30 4.1 Macro benchmarks . 31 4.1.1 Throughput and latency . 31 4.1.2 Scalability . 35 4.1.3 Fault tolerance and security . 39 4.1.4 Comparision with transactional database . 41 4.2 Micro benchmarks . 42 4.2.1 Execution layer . 42 4.2.2 Data model . 43 4.2.3 Consensus . 48 5 Discussion and Conclusion 49 5.1 Discussion and Future Work . 49 5.2 Conclusion . 52 ii Abstract Blockchain technologies are taking the world by storm. Public blockchains, such as Bitcoin and Ethereum, enable secure peer-to-peer applications like crypto-currency or smart contracts. Their security and performance are well studied. This thesis concerns recent private blockchain systems designed with stronger security (trust) assumption and performance requirement. These systems aim to disrupt applications which have so far been implemented on top of database systems, e.g., banking, finance, and trading. Mul- tiple platforms for private blockchains are under active development and fine-tuning. However, there is an evident lack of a systematic framework with which different sys- tems can be analyzed and compared against each other. Such a framework can be used to assess blockchains’ viability as another distributed data processing platform while helping developers to identify bottlenecks and accordingly improve their platforms. In this thesis, we first describe BLOCKBENCH, the first evaluation framework for an- alyzing private blockchains. Any private blockchain can be integrated to BLOCKBENCH via simple APIs and benchmarked against workloads that are based on real and synthetic smart contracts. BLOCKBENCH measures overall and component-wise performance regarding throughput, latency, scalability, and fault-tolerance. Next, we use BLOCK- BENCH to conduct a comprehensive evaluation of three major private blockchains: Ethereum, Parity and Hyperledger Fabric. The results demonstrate that these systems are still far from displacing current database systems in traditional data processing workloads. Fur- thermore, there are gaps in performance among the three systems which are attributed to the design choices at different layers of the blockchain’s software stack. BLOCKBENCH serves as a fair means of comparison for different platforms and enables a deeper under- standing of different system design choices. iii List of Figures 1.1 Blockchain data structure for Bitcoin. Transactions are kept outside of the block header. 5 1.2 Blockchain software stack on a fully validating node. A non-validating node stores only the block headers. Different blockchain platforms offer different interfaces between the blockchain and application layer. 6 1.3 An example of smart contract, written in Solidity language, for a pyra- mid scheme on Ethereum. 7 3.1 Abstraction layers in blockchain, and the corresponding workloads in BLOCKBENCH................................ 15 3.2 BLOCKBENCH software stack. New workloads are added by implement- ing IWorkloadConnector interface. New blockchain backends are added by implementing IBlockchainConnector . 22 4.1 BLOCKBENCH Peak performance of blockchains with 8 clients and 8 servers. 32 4.2 BLOCKBENCH Performance with varying request rates of blockchains with 8 clients and 8 servers. 33 4.3 Resource utilization. 34 4.4 Client’s request queue, for request rates of 8 tx/s and 512 tx/s. 35 4.5 Block generation rate. 35 4.6 Latency distribution. 36 4.7 Performance scalability (with the same number of clients and servers). 36 4.8 Scalability with Smallbank benchmark. 37 4.9 Queue length at the client. 37 iv 4.10 Performance scalability (with 8 clients). 38 4.11 Failing 4 nodes at 250th second (fixed 8 clients) for 12 and 16 servers. X-12 and X-16 mean running 12 and 16 servers using blockchain X re- spectively. 39 4.12 Blockchain forks caused by attacks that partitions the network in half at 100th second and lasts for 150 seconds. X-total means the total number of blocks generated in blockchain X, X-bc means the total number of blocks that reach consensus in blockchain X. 40 4.13 Performance of the three blockchain systems versus H-Store. 41 4.14 CPUHeavy workload, ‘X’ indicates Out-of-Memory error. 43 4.15 Average write throughput in IOHeavy workload, ‘X’ indicates Out-of- Memory error. 44 4.16 Average read throughput in IOHeavy workload, ‘X’ indicates Out-of- Memory error. 44 4.17 Disk usage in IOHeavy workload, ‘X’ indicates Out-of-Memory error. 44 4.18 Analytics workload Q1 latency. 46 4.19 Analytics workload Q2 latency. 46 4.20 Code snippet from the VersionKVStore smart contract for analytics workload (Q1 and Q2). 47 4.21 Throughput of DoNothing workload. 48 List of Tables 3.1 Comparison of blockchain platforms . 21 3.2 Summary of smart contracts implemented in BLOCKBENCH. Each con- tract has one Solidity version for Parity and Ethereum, and one Golang version for Hyperledger. 26 v Chapter 1 Introduction 1.1 Motivations Blockchain technologies are gaining massive momentum in the last few years, largely due to the success of Bitcoin crypto-currency [46]. A blockchain, also called distributed ledger, is essentially an append-only data structure maintained by a set of nodes which do not fully trust each other. All nodes in a blockchain network agree on an ordered set of blocks, each containing multiple transactions, thus the blockchain can be viewed as a log of ordered transactions. In a database context, blockchain can be viewed as a so- lution to distributed transaction management: nodes keep replicas of the data and agree on an execution order of transactions. However, traditional database systems work in a trusted environment and employ well-known concurrency control techniques [10, 40, 57] to order transactions. Blockchain’s key advantage is that it does not assume nodes trust each other and therefore is designed to achieve Byzantine fault tolerance. In the original design, Bitcoin’s blockchain stores coins as the system states shared by all participants. For this simple application, Bitcoin nodes implement a simple repli- cated state machine model which simply moves coins from one address to another. Since then, blockchain has grown rapidly to support user-defined states and Turing complete state machine models. Ethereum [2] is a well-known example which enables any de- 1 centralized, replicated applications known as smart contracts. More importantly, inter- est from the industry has started to drive the development of new blockchain platforms that are designed for private settings in which participants are authenticated. Blockchain systems in such environments are called private (or permissioned), as opposed to the early systems operating in public (or permissionless) environments where anyone can join and leave. Applications for security trading and settlement [53], asset and financial management [44, 45], banking and insurance [30] are being built and evaluated. These applications are currently supported by enterprise-grade database systems like Oracle and MySQL, but blockchain has the potential to disrupt this status quo because it incurs lower infrastructure and human costs [30]. In particular, blockchain’s immutability and transparency help reduce human errors and the need for manual intervention due to con- flicting data. Blockchain can help streamline business processes by removing duplicate efforts in data governance. Goldman Sachs estimated 6 billion saving in the current cap- ital market [30], and J.P.