Falcondb: Blockchain-Based Collaborative Database

FalconDB: Blockchain-based Collaborative Database Yanqing Peng Min Du Feifei Li University of Utah UC Berkeley University of Utah [email protected] [email protected] [email protected] Raymond Cheng Dawn Song UC Berkeley UC Berkeley [email protected] [email protected] ABSTRACT 1 INTRODUCTION Nowadays an emerging class of applications are based on The growth of the Internet has triggered tremendous op- collaboration over a shared database among different entities. portunities for data cooperation. Many applications could However, the existing solutions on shared database may benefit from a shared database among multiple parties, such require trust on others, as collaborative benchmarking [4], crowdsourcing [43], shar- have high hardware demand that is unaffordable for in- ing economy [5], cooperative scientific computation [14], dividual users, or have relatively low performance. In other and collaborative machine learning[7]. words, there is a trilemma among security, compatibility In such scenarios, a key opponent is a shared database and efficiency. In this paper, we present FalconDB, which management system that allows each party to execute up- enables different parties with limited hardware resources dates and queries to the database, while maintaining a con- to efficiently and securely collaborate on a database. Fal- sistent view among all participants in the network. Many conDB adopts database servers with verification interfaces projects require collaborations among a group of individual accessible to clients and stores the digests for query/update users. As individual participants, they may need to collabo- authentications on a blockchain. Using blockchain as a con- rate through their personal devices with limited storage and sensus platform and a distributed ledger, FalconDB is able computation power. Furthermore, they could be easily com- to work without any trust on each other. Meanwhile, Fal- promised and become malicious. Therefore, it is essential conDB requires only minimal storage cost on each client, to design a shared database that enables individual users to and provides anywhere-available, real-time and concurrent collaborate with each other without any trust. access to the database. As a result, FalconDB overcomes the Traditionally, individual users who have limited compu- disadvantages of previous solutions, and enables individual tational power and local storage typically utilize a central- users to participate in the collaboration with high efficiency, ized server to facilitate collaboration. Such a service enables low storage cost and blockchain-level security guarantees. anywhere-available, real-time, and concurrent access to the database. However, it requires to fully trust the central server, ACM Reference Format: which is expected to correctly execute all requests. One im- Yanqing Peng, Min Du, Feifei Li, Raymond Cheng, and Dawn Song. mediate concern is that a malicious server can fool any client 2020. FalconDB: Blockchain-based Collaborative Database. In Pro- without being detected. Another issue arises from the client ceedings of the 2020 ACM SIGMOD International Conference on Man- side: a client may manipulate the records arbitrarily for its agement of Data (SIGMOD’20), June 14–19, 2020, Portland, OR, USA. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3318464. own interest. A sophisticated mechanism must be adopted 3380594 to prevent clients from issuing undesired updates. The emergence of blockchain techniques brings practi- cal solutions to handle Byzantine failures [26], namely, a Permission to make digital or hard copies of all or part of this work for party could behave arbitrarily. Specifically, a permissioned personal or classroom use is granted without fee provided that copies are not blockchain platform could be viewed as a distributed system, made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components where there is a chain of blocks that keep immutable records. of this work owned by others than ACM must be honored. Abstracting with The blocks of records are agreed by all blockchain nodes credit is permitted. To copy otherwise, or republish, to post on servers or to via some consensus protocol that tolerates Byzantine fail- redistribute to lists, requires prior specific permission and/or a fee. Request ures. Such platforms include Hyperledger [2], Tendermint permissions from [email protected]. [8], HotStuff [47] and BigchainDB [33]. This design does not SIGMOD’20, June 14–19, 2020, Portland, OR, USA require a centralized server, and works in an untrusted en- © 2020 Association for Computing Machinery. ACM ISBN 978-1-4503-6735-6/20/06...$15.00 vironment with up to a small proportion (e.g., 1/3) of nodes https://doi.org/10.1145/3318464.3380594 being malicious. Undesired operations like malicious updates a traceable and tamper-free historical record of charitable could be denied by the system. giving [40]. Although blockchain systems have attracted much atten- Our contribution. We present FalconDB, which is a block- tion, it is still very rare for an individual user to participate chain database that has low hardware requirements (e.g., stor- as a node, due to the high cost. First, blockchain nodes need age, computation, bandwidth) on individual clients, achieves to locally store a full copy of the blocks, which could easily query efficiency as high as a centralized database design, and be hundreds of gigabytes, potentially exhausting local disk ensures security guarantees as strong as blockchain level. In space for individual users. Second, blockchain nodes must FalconDB, a blockchain node could either be a server node maintain network communication with others to promptly that stores the entire database, or a client node that uses receive and validate new blocks, which consumes both net- the database by sending query and update requests to the work bandwidth and computation power. Finally, since there server nodes. To grant the clients with the ability to verify is no longer a service provider to facilitate queries, users if an update or query is correctly executed, the database have to execute queries locally with their personal hardware. contents are stored with authenticated data structures (ADS) Executing complex queries with limited CPU power could on server nodes. The ADS generates a small piece of digest be unacceptably slow. for the current database content, and the clients can then The high cost on blockchain nodes limits the compati- use this digest to authenticate the results returned by the bility of blockchain. The concept of lightweight clients is server. The servers and clients are synchronized under a proposed to alleviate the cost on individual participants [35]. decentralized blockchain network. Each server node is a full Lightweight clients communicate with full nodes to update blockchain node that stores the entire database and block- or query the blockchain database. Hence their capabilities chain blocks, while each client node stores only the block are constrained by the APIs exposed by the full nodes. More- headers. The newest block header contains the digest of the over, query integrity needs to be ensured if the full nodes current database content, which the clients are able to access are not trustworthy [45]. A baseline solution is to leverage for authentication. With this digest, FalconDB could tolerate smart contracts [22]. A user could submit a query to a smart up to 1/3 of the total blockchain nodes being malicious, and contract. Then, all full nodes execute the query and run a is able to proceed even all full nodes are dishonest except Byzantine fault tolerance (BFT) consensus protocol on the one, which is enabled by the authenticated database design. query result. Once a consensus is reached, the result will be Note that, many previous works on outsourced database committed to blockchain as well as revealed to the client. (ODB) [21, 28, 29, 34, 36, 37, 46, 46, 49, 50] have studied the This approach guarantees integrity with a majority of honest problem of query authentication, which validates the results nodes. However, it has limited throughput, high latency, high of the database server with essential digests. Unfortunately gas consumption and introduces privacy concerns. they cannot be directly applied to the collaborative database None of the previous solutions is able to simultaneously scenario as we will show later in Section 2.3.1. achieve security, compatibility and efficiency. This becomes a We summarize our contributions as follows. paramount trilemma when we need all three properties. Con- sider the case of charitable giving, where a large number of donors make donations to multiple non-profit organizations • We propose and implement FalconDB, which to the best of (NGOs). An NGO needs to record each received donation, our knowledge, is the first platform that enables individual while an individual donor may want to audit how an NGO is users to collaborate on a database with strong security spending her money. In fact, such a database already exists guarantee, low storage cost, as well as high efficiency. [13], where a centralized database server decides the execu- • FalconDB leverages a blockchain platform which tolerates tion results of all updates and queries. In such a design, the up to 1/3 participants being malicious, as well as employs a database host is able to hide any possible corruption or mis- temporal

Falcondb: Blockchain-Based Collaborative Database

Failures in DBMS

Blockchain Database for a Cyber Security Learning System

An Introduction to Cloud Databases a Guide for Administrators

Middleware-Based Database Replication: the Gaps Between Theory and Practice

Concurrency Control

Database Technology for Bioinformatics from Information Retrieval to Knowledge Systems

Sundial: Harmonizing Concurrency Control and Caching in a Distributed OLTP Database Management System

What Is a Database? Differences Between the Internet and Library

Data Quality Requirements Analysis and Modeling December 1992 TDQM-92-03 Richard Y

Concurrency Control and Recovery ACID • Transactions • Recovery Transaction Model Concurency Control Recovery

CAS CS 460/660 Introduction to Database Systems Transactions

DBOS: a Proposal for a Data-Centric Operating System