Angela: A Sparse, Distributed, and Highly Concurrent Merkle Tree
Janakirama Kalidhindi, Alex Kazorian, Aneesh Khera, Cibi Pari

Abstract— Merkle trees allow for efficient and authenticated verification of data contents through hierarchical cryptographic hashes. Because the contents of parent nodes in the trees are the hashes of the children nodes, concurrency in merkle trees is a hard problem. The current standard for merkle tree updates is a naive locking of the entire tree, as shown in Google's merkle tree implementation, Trillian. We present Angela, a concurrent and distributed sparse merkle tree implementation. Angela is distributed using Ray onto Amazon EC2 clusters and uses Amazon Aurora to store and retrieve state. Angela is motivated by Google Key Transparency and inspired by its underlying merkle tree, Trillian. Like Trillian, Angela assumes that a large number of its 2^256 leaves are empty and publishes a new root after some amount of time. We compare the performance of our concurrent algorithm against the approach used by Google Trillian in a local setting, for which we see nearly a 2x speedup. We also show that Angela's distribution is scalable through our benchmarks.

I. INTRODUCTION

Merkle trees are simple in their design. For some arbitrary node in the tree, we can derive its value by applying a hash function on the values of all of its children nodes [8]. One can think of a merkle tree as a set. By applying the definition of a node's value, inserting data into a merkle tree can be thought of as a commitment to the data. This commitment allows for verification of the integrity of data received over a network, giving merkle trees increased importance in recent years.

One such example is the Key Transparency project by Google, a public key infrastructure that uses Google Trillian, a sparse merkle tree implementation with transparent storage. Key Transparency uses a merkle tree to pair a user identifier key in a leaf with a public key value [4]. When an update to a public key comes in, the entire tree is locked and updated serially, introducing massive bottlenecks that are exacerbated at the intended scale of Key Transparency due to the high surface area of contention. Google Key Transparency requires the scale of a billion leaves with high query throughput. Any concurrency overhead should not destroy the system's performance when the scale of the merkle tree is very large.

Outside of Key Transparency, this problem is generalizable to other applications that need efficient, authenticated data structures, such as the Global Data Plane file system.

As a result of these needs, we have developed a novel algorithm for applying a batch of updates to a merkle tree concurrently. Run locally on a MacBook Pro with a 2.2GHz Intel Core i7, 16GB of RAM, and an Intel Iris Pro 1536MB, we achieved a 2x speedup over the naive algorithm mentioned earlier. We expanded this algorithm into a distributed merkle tree implementation, Angela, that is able to scale to handle demanding workloads while maintaining its performance. Angela is comprised of an orchestrator node that handles load balancing. The orchestrator is written in Python and leverages Ray to handle distribution [9]; it distributes the work to multiple worker nodes written in Golang for concurrent merkle tree updates.

The rest of the paper is structured as follows. Section 2 lays out our metrics of success and what we consider to be strong and promising results. In Section 3, we discuss the scope of what Angela can do. In Section 4, we provide a fundamental assumption for our merkle tree implementation and a detailed description of how the underlying merkle tree is implemented. Section 5 describes both the naive algorithm and why concurrency is not immediately obvious, following which we describe the concurrent insertion algorithm. Section 6 presents the system architecture for our complete distributed system under Ray. Section 7 describes some of the challenges we faced and the reasons behind the language decisions made along the way. Section 8 shows benchmarking results of the system and discusses the evaluation of Angela overall. Section 9 presents related work, and Sections 10 and 11 present future optimizations and work that we plan to act on, before wrapping up with a conclusion and acknowledgements.

II. METRICS OF SUCCESS

Our first goal was to develop a concurrent algorithm in the non-distributed case that cleanly beats the naive algorithm while maintaining the integrity of the merkle tree. The speedup we expect to see here should respect the number of cores available on the testing hardware, with some latency since the percolation paths are not completely parallelizable. The second goal was to adjust the algorithm for effective distribution on Ray and launch the entire system on a cluster of Amazon EC2 instances while maintaining performance. Our final goal was to measure performance on this distributed system with respect to the number of compute nodes in a cluster. Ideally, we would see the throughput of writes scale with the number of nodes in the cluster.

III. PROJECT OVERVIEW

In this section, we define the project scope of Angela through the client API. The specifics of how Angela handles these requests are detailed further in the Architecture section.

1) insert_leaf(index, data): A client is able to insert a new node into the merkle tree as well as update an existing node. index refers to the key used to insert into the merkle tree, and data refers to the value associated with that key.

2) get_signed_root(): A client is able to request the published root of the merkle tree. This is used in order to verify any read request from Angela.

3) generate_proof(index): A client is able to read a value from the merkle tree. This works both for items that have been inserted into the tree and for those that do not exist in the tree. In either case, we return a Proof object that the client can verify to confirm that an item actually does or does not exist. We use membership proofs for items that exist in the tree and non-membership proofs for items that do not exist [10]. index refers to the key that would have been used to insert into the merkle tree.

4) verify_proof(proof, data, root): A client is able to verify that a proof generated by the server maintaining the merkle tree is valid. This verification happens locally on the client side and is not sent to the server. proof is the value that is returned from generate_proof. data is the value that we are verifying to be correct. root is the digest that is retrieved from get_signed_root().

Similar to CONIKS and Trillian, we fix the values of empty nodes to something related to their depth [7] [5]. This means we can lazily calculate hashes of empty nodes during inserts or when generating proofs. The merkle tree will also be organized as a binary search tree, where most nodes are actually empty. A prefix is assigned to each node in the merkle tree with a 0 appended on a left child and a 1 appended on a right child, resulting in leaf locations being determined by their sorted key values [7]. Writes are batched together and processed in epochs. In our system, we wait for some number of update requests to be queued before processing them; however, it is also possible to use time elapsed as an indicator for applying updates and publishing a new root.

IV. DESIGN

A. Verifiable Random Function Assumption

CONIKS uses a verifiable random function to map user IDs to leaves within the tree [7]. We assume that IDs given to the service have already gone through a verifiable random function. This is a reasonable assumption since such a function could be applied trivially to all inputs, but to reduce the complexity of our implementation, we chose not to. This gives us the interesting property that any query or workload will appear to be random and, therefore, properly load balanced.

B. Size

The merkle tree contains 2^256 leaves. Again, we assume that a large number of nodes in the tree are empty. Moreover, to ensure that IDs are mapped to leaves properly, we assume that the aforementioned verifiable random function has an image of size 2^256.

C. Hash Functions

We use SHA256 as our hash function for Angela, but this is easily replaceable with any other cryptographic hash function.

D. Encoding

Our merkle tree design is similar to CONIKS in that we use a very basic prefix encoding. All nodes on level i have prefixes of length i. To determine the prefix of some node, one can just traverse the tree: going down the left branch appends a 0 to the prefix, and going down the right branch appends a 1.

E. Sparse Merkle Trees

Literature on such sparse representations already exists under the name of sparse merkle trees. Sparsity allows for lazy construction of and queries on our data. Thus, the tree itself will be more space efficient and will theoretically result in easier coordination among the distributed clusters [3]. Figure 1 provides a visual representation of how a sparse tree would look using

its children when updated. Now, consider a scenario in which we are concurrently inserting or updating the value of two leaf nodes.
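The prefix encoding and depth-dependent empty-node values described above can be illustrated with a minimal sketch. This is a toy, not Angela's implementation: the depth is 8 rather than 256, and the names (SparseMerkleTree, default_hashes, TREE_DEPTH) are invented for illustration. The key idea it demonstrates is that an empty subtree's hash depends only on its depth, so it is precomputed once and never expanded.

```python
import hashlib

TREE_DEPTH = 8  # Angela uses depth 256; kept tiny here for illustration


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# Empty-node values depend only on depth, so they are computed once and
# shared by every empty subtree at that level (the lazy evaluation above).
default_hashes = [H(b"empty-leaf")]
for _ in range(TREE_DEPTH):
    h = default_hashes[-1]
    default_hashes.append(H(h + h))
# default_hashes[d] is the value of an empty subtree with d levels below it.


class SparseMerkleTree:
    def __init__(self):
        self.leaves = {}  # bit-string prefix like "0110..." -> leaf hash

    def insert_leaf(self, index: str, data: bytes) -> None:
        assert len(index) == TREE_DEPTH and set(index) <= {"0", "1"}
        self.leaves[index] = H(data)

    def _node(self, prefix: str) -> bytes:
        """Hash of the node addressed by `prefix`: a left child appends 0
        and a right child appends 1, as in the encoding section."""
        depth_below = TREE_DEPTH - len(prefix)
        # Lazy case: no inserted leaf lives under this prefix.
        if not any(k.startswith(prefix) for k in self.leaves):
            return default_hashes[depth_below]
        if depth_below == 0:
            return self.leaves[prefix]
        return H(self._node(prefix + "0") + self._node(prefix + "1"))

    def root(self) -> bytes:
        return self._node("")
```

Only the paths leading to occupied leaves are ever materialized; every other branch resolves immediately to a precomputed default, which is what makes a 2^256-leaf tree tractable.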
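The membership and non-membership proofs from the client API can be sketched the same way: a proof is the list of sibling hashes along the path from a leaf to the root, and verification folds the leaf hash back up and compares against the signed root. This is an illustrative sketch, not Angela's code; in particular, this verify_proof takes a leaf hash and index directly rather than the (proof, data, root) signature above, and DEPTH, EMPTY, and node are invented names.

```python
import hashlib

DEPTH = 4  # toy depth for illustration


def H(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()


# Precomputed defaults for empty subtrees, as in the design above.
EMPTY = [H(b"empty")]
for _ in range(DEPTH):
    EMPTY.append(H(EMPTY[-1] + EMPTY[-1]))


def node(leaves: dict, prefix: str) -> bytes:
    """Hash of the subtree at `prefix`, resolving empty subtrees lazily."""
    d = DEPTH - len(prefix)
    if not any(k.startswith(prefix) for k in leaves):
        return EMPTY[d]
    if d == 0:
        return leaves[prefix]
    return H(node(leaves, prefix + "0") + node(leaves, prefix + "1"))


def generate_proof(leaves: dict, index: str) -> list:
    """Sibling hashes from leaf level up to the root. Works for absent
    leaves too (non-membership): the leaf value is then EMPTY[0]."""
    siblings = []
    for i in range(DEPTH, 0, -1):
        p = index[:i]
        sib = p[:-1] + ("1" if p[-1] == "0" else "0")
        siblings.append(node(leaves, sib))
    return siblings  # ordered leaf-level first


def verify_proof(proof: list, leaf_hash: bytes, index: str, root: bytes) -> bool:
    """Client-side check: fold the leaf hash with the siblings and compare
    against the published root; no server round trip is needed."""
    h = leaf_hash
    for bit, sib in zip(reversed(index), proof):
        h = H(h + sib) if bit == "0" else H(sib + h)
    return h == root
```

A non-membership proof is verified identically, just starting from the default leaf value, which mirrors how a single Proof object can attest that an item does or does not exist.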
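The epoch behavior described above, where queued writes are applied as a batch once enough requests accumulate or enough time elapses, then a new root is published, can be sketched as follows. EpochBatcher, publish, and the callback shape are all hypothetical names for illustration, not Angela's interface.

```python
import time


class EpochBatcher:
    """Queue update requests and apply them as one batch per epoch,
    triggered by a count threshold or by elapsed time (both triggers
    are mentioned in the design above)."""

    def __init__(self, apply_batch, max_batch=4, max_wait_s=60.0):
        self.apply_batch = apply_batch  # callback: list of (index, data) -> new root
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        self.pending = []
        self.epoch_started = time.monotonic()
        self.published_roots = []  # one published root per epoch

    def insert_leaf(self, index, data):
        self.pending.append((index, data))
        if (len(self.pending) >= self.max_batch
                or time.monotonic() - self.epoch_started >= self.max_wait_s):
            self.publish()

    def publish(self):
        # Apply all queued updates at once and record the new signed root.
        if self.pending:
            self.published_roots.append(self.apply_batch(self.pending))
            self.pending = []
        self.epoch_started = time.monotonic()
```

Batching is what creates the opportunity for concurrency in the first place: an entire epoch's updates can be percolated through the tree together rather than locking the tree once per write.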