Database and Distributed Computing Foundations of Blockchains

Database and Distributed Computing Foundations of Blockchains Sujaya Maiyya Victor Zakhary Mohammad Javad Amiri Divyakant Agrawal Amr El Abbadi Department of Computer Science, University of California, Santa Barbara Santa Barbara, California {victorzakhary,sujaya-maiyya,amiri,agrawal,amr}@cs.ucsb.edu ABSTRACT 1 INTRODUCTION The uprise of Bitcoin and other peer-to-peer cryptocurren- Bitcoin [26] is considered the first successful global scale cies has opened many interesting and challenging problems peer-to-peer cryptocurrency. The Bitcoin protocol explained in cryptography, distributed systems, and databases. The by the mysterious Nakamoto allows financial transactions main underlying data structure is blockchain, a scalable fully to be transacted among participants without the need for a replicated structure that is shared among all participants and trusted third party, e.g., banks, credit card companies, or Pay- guarantees a consistent view of all user transactions by all Pal. Bitcoin eliminates the need for such a trusted third party participants in the system. In this tutorial, we discuss the by replacing it with a distributed ledger that is fully repli- basic protocols used in blockchain, and elaborate on its main cated among all participants in the cryptocurrency system. advantages and limitations. To overcome these limitations, This distributed ledger is referred to as blockchain. we provide the necessary distributed systems background in managing large scale fully replicated ledgers, using Byzan- tine Agreement protocols to solve the consensus problem. Finally, we expound on some of the most recent proposals to design scalable and efficient blockchains in both permissionless and permissioned settings. The focus of the tutorial is on the distributed systems and database aspects of the recent innovations in blockchains. Figure 1: Blockchain is a chain of blocks of transactions linked by hash pointers. KEYWORDS Blockchain, as shown in Figure 1, is a secure linked list Permissionless Blockchain, Permissioned Blockchain, Dis- of blocks containing financial transactions that occur in the tributed Consensus, Byzantine Faults system and linked by hash pointers. The main challenge that ACM Reference Format: Bitcoin addresses is to maintain a consistent view of this Sujaya Maiyya Victor Zakhary Mohammad Javad Amiri, replicated blockchain in a secure and fault-tolerant manner Divyakant Agrawal Amr El Abbadi. 2019. Database and Dis- in a permissionless setting and in the presence of malicious tributed Computing Foundations of Blockchains. In 2019 Interna- participants. Unlike permissioned settings where all the par- tional Conference on Management of Data (SIGMOD ’19), June 30- ticipants in the system are known a priori, a permissionless July 5, 2019, Amsterdam, Netherlands. ACM, New York, NY, USA, setting allows participants to freely join and leave the system 6 pages. https://doi.org/10.1145/3299869.3314030 without maintaining any global knowledge of the number of participants. To address these challenges, Bitcoin builds Permission to make digital or hard copies of all or part of this work for on foundations developed over the last few decades from personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear diverse fields [27], but primarily from the fields of cryp- this notice and the full citation on the first page. Copyrights for components tography [5, 30], distributed systems [8, 20, 21] and data of this work owned by others than ACM must be honored. Abstracting with management [6, 24, 33] as illustrated in Figure 2. Figure 2 credit is permitted. To copy otherwise, or republish, to post on servers or to shows the space of techniques that are used in Bitcoin to redistribute to lists, requires prior specific permission and/or a fee. Request implement a permissionless decentralized payments. Digital permissions from [email protected]. SIGMOD ’19, June 30-July 5, 2019, Amsterdam, Netherlands signatures and hashing are the cryptographic foundations © 2019 Association for Computing Machinery. that are used in Bitcoin to support distributed transactions ACM ISBN 978-1-4503-5643-5/19/06...$15.00 stored in a ledger that is replicated across globally distributed https://doi.org/10.1145/3299869.3314030 sites in the presence of malicious faults. the Bitcoin protocol shortcomings in permissionless, e.g., Bit- coinNG [13], Byzcoin [18], Elastico [22], and Algorand [15], or permissioned, e.g., Tendermint [19], Hyperledger Fabric [4], and Quorum [25], settings using efficient and practical solutions that integrate cryptographic and consensus mech- anisms. In this tutorial, our goal is to present to the database community an in-depth understanding of state-of-the-art solutions for efficient scalable blockchains. We progress towards this goal by starting from a detailed description of the proto- Figure 2: The space of techniques that are used to im- cols and techniques underlying the design of Bitcoin. Since plement permissionless decentralized cryptosystems. most recent innovations in blockchain design depend criti- cally on consensus protocols in malicious settings, we outline the basic foundations of distributed fault-tolerant consensus In spite of its current success, Bitcoin suffers from scala- protocols. This is followed by a discussion of the most re- bility performance limitations represented by the number of cent proposals to solve the blockchain design problem using transactions executed per second, especially when compared scalable fault-tolerant solutions. with commercially successful On Line Transaction Process- ing (OLTP) financial systems, such as credit card companies. Bitcoin uses a notion of miners who need to perform a com- 2 TUTORIAL OUTLINE putationally challenging Proof of Work (PoW) puzzle before 2.1 Background they can add any block of transactions to the replicated blockchain. Since the PoW puzzle is computationally hard, Consensus and Byzatine Agreement (BA) are well studied prob- very few miners can successfully solve the puzzle, and hence lems that were first proposed by Lamport, Shostak and Pease a successful miner can add a block to the blockchain and be in 1982 [21]. Any solution to the BA problem tries to reach guaranteed, with very high probability, to be unique. Many agreement among a well defined set of processes on a single concerns have been raised about the wasted massive energy value. The distributed systems community has extensively requirements to mine one Bitcoin block. In addition, the dif- explored this problem in both synchronous and asynchro- ficulty of Bitcoin’s Proof of Work (PoW) discourages miners nous systems. In an asynchronous system, the Fisher Lynch from mining independently and pushes them to form few and Patterson (FLP) impossibility result states that consen- powerful mining cartels. Concentrating most of the mining sus is not guaranteed to terminate in the presence of even power in few hands, or pools, risks to derail the whole idea a single crash failure [14]. This led to many BA protocols of decentralization. for synchronous systems, where the lower bound requires This mining approach to determine the process eligible that the number of maliciously faulty processes is at most to add a new block to the block chain is in contrast to the one third of the total number of processes. Synchronous BA distributed systems approach, that has been promoting the protocols require multiple rounds of communication and use of Byzantine Agreement or consensus, which is efficient extensive message passing. On the other hand, several effi- and more egalitarian. In fact, consensus protocols such as cient asynchronous BA protocols have been developed based Paxos have been quite successful in recent years in laying on Lamport’s Paxos protocol [20]. The main challenge such the foundations of large global scale data management sys- asynchronous protocols face is that they do not guarantee ter- tem. Unfortunately, Paxos has many limitations, especially mination. However, many systems have been designed that from a global cryptocurrency point of view, including the depend on Paxos, and have been practically successful [7, 9]. requirement of a permissioned setting, and that participants Paxos, however, assumes that processes may only fail by can only fail by crashing. An alternative to Paxos that toler- crashing. In a cryptocurrency setting, processes may act in a ates malicious failures is Practical Byzantine Fault-Tolerance malicious manner. In 1999, Castro and Liskov proposed the (PBFT) [8]. Although it tolerates malicious failures, PBFT still Practical Byzantine Fault Tolerance (PBFT) algorithm [8], requires a permissioned setting, and requires a large num- which is similar to Paxos in that it uses a small number of ber of message exchanges, hence does not scale to the large message rounds and assumes an asynchronous system. How- number of participants expected in modern day permission- ever, it tolerates malicious faults. During the tutorial, we will less cryptocurrencies. Recently, the security and distributed provide a general overview of the consensus problem and systems communities have been aggressively exploring al- a high level description of various BA protocols. In particu- ternative scalable solutions. These systems attempt to solve lar, we will provide a detailed description of PBFT, given its particular relevance to many recent cryptocurrency propos- The difficulty of PoW aims to

Database and Distributed Computing Foundations of Blockchains

The Power of Abstraction

MIT Turing Laureates Propose Creation of School of Computing an Open Letter to President Rafael Reif

On-The-Fly Garbage Collection: an Exercise in Cooperation

Arxiv:1909.05204V3 [Cs.DC] 6 Feb 2020

Mathematical Writing by Donald E. Knuth, Tracy Larrabee, and Paul M

Verification and Specification of Concurrent Programs

The Specification Language TLA+

Using Lamport Clocks to Reason About Relaxed Memory Models

Should Your Specification Language Be Typed?

(LA)TEX Changed the Face of Mathematics: An

A Fast Mutual Exclusion Algorithm

The Greatest Pioneers in Computer Science