Jellyfish Merkle Tree
Total Page:16
File Type:pdf, Size:1020Kb
Jellyfish Merkle Tree Zhenhuan Gao, Yuxuan Hu, Qinfan Wu* Abstract. This paper presents Jellyfish Merkle Tree (JMT), a space-and-computation-efficient sparse Merkle tree optimized for Log-Structured Merge-tree (LSM-tree) based key-value storage, which is designed specially for the Diem Blockchain. JMT was inspired by Patricia Merkle Tree (PMT), a sparse Merkle tree structure that powers the widely known Ethereum network. JMT further makes quite a few optimizations in node key, node types and proof format to find the ideal balance between the complexity of data structure, storage, I/O overhead and computation efficiency, to cater to the needs of the Diem Blockchain. JMT has been implemented in Rust, but it is language- independent such that it could be implemented in other programming languages. Also, the JMT structure presented is of great flexibility in implementation details for fitting various practical use cases. 1 Introduction Merkle tree, since introduced by Ralph Merkle [1], has been a de facto solution to productionized account-model blockchain systems. Ethereum [2] and HyperLedger [3], are examples with various versions of Merkle trees. Although Merkle tree fits pretty well as an authenticated key-value store holding a huge amountof data in a tamper-proof way, its performance in terms of computation cost and storage footprint are two major concerns where people have been trying to achieve some enhancements. Among them, the sparseness of Merkle trees [2], [4], [5] has been aptly leveraged to cut off unnecessary complexity in more compact and efficient Merkle tree design. Virtually, a Merkle tree in reality has far fewer leaf nodes at the bottom level, a far cry from the intractable size that can be held by a perfect Merkle tree. If sticking rigorously to the original Merkle tree structure, any design will by no means escape from excessive computation and space overhead. The optimizations offered by the sparseness can also benefit other related data, notably proofs[6]. With sparseness, and its resultant smaller tree sizes, shorter proofs can be properly designed with less CPU time in generation and verification, and less network bandwidth in transmission, benefiting both server and client sides. Based on earlier works in the industry, we present JMT, a space-time win-win optimized sparse Merkle tree designed specially for such blockchain systems as the Diem Blockchain [7], [8]. JMT is built on top of an LSM-tree based key-value storage, featuring version-based key that circumvents heavy I/O brought about by the randomness of a pervading hash-based key. ∗ The authors work at Novi Financial, a subsidiary of Facebook, Inc., and contribute this paper to the Diem Association under a Creative Commons Attribution 4.0 International License. Revised January 12, 2021 1 This mini whitepaper is structured as follows: First, we give a concise retrospection of Merkle trees in practical uses cases where each leaf node is addressable via a key, followed by the variation and evolvement (Section 2). We then present JMT, including node key structure and its implicit benefits, logical data structures of two node types and related operations (Section 3). We then describe the proof format of JMT with the verification algorithm, paired with examples (Section 4). Finally, we conclude and discuss our future works (Section 5). 2 A Retrospection of Addressable Merkle Trees Merkle trees are widely adopted by the blockchain industry as a cryptographically authenticated, deterministic data structure that can be used to store and map between arbitrary binary data. In the family of Merkle trees, an addressable Merkle tree, becomes the de facto standard of authenticated data structure that keeps record of the global blockchain state. We presume the readers of this paper are familiar with the background of Merkle trees and how Merkle trees are used to allow efficient and secure verification of the contents of large data structures with proofs. 2.1 Addressable Merkle Tree (AMT) An authenticated data structure allows a verifier 푉 to hold a short authenticator 푎, which forms a binding commitment to a larger data structure 퐷. An untrusted prover 푃 , which holds 퐷, computes 푓(퐷) → 푟 and returns both 푟 — the result of the computation of some function 푓 on 퐷 — as well as 휋 — a proof of the correct computation of the result — to the verifier. 푉 can run Verify(푎, 푓, 푟, 휋), which returns true if and only if 푓(퐷) = 푟. In the context of the Diem Blockchain, provers are generally validators and verifiers are clients executing read queries. In summary, an addressable Merkle tree is an authenticated data structure as a form of a binary Merkle tree that stores maps. In a Merkle tree of size 2ℎ, the structure 퐷 maps every key 푖 ∈ [0, 2ℎ) to a value 푣푖. The authenticator is formed from the root of a full binary tree created from the values, labeling leaves as hash(푖 ‖ 푣푖) and internal nodes as hash(left ‖ right), where H is a cryptographic hash function. Keys are encoded as the binary paths from the root to each leaf which are referred to as keys in production. 1 The function 푓, which the prover wishes to authenticate, is an inclusion proof that a key-value pair (푘, 푣) is within the map 퐷. a=H(h4||h5) 0 1 h4=H(h0||h1) h5=H(h2||h3) 0 1 0 1 h0=H(0||s0) h1=H(1||s1) h2=H(2||s2) h3=H(3||s3) Figure 1: A Merkle tree storing 퐷 = {0 ∶ s0, …}. If 푓 is a function that gets the third item (shown with a dashed line) then 푟 = s2 and 휋 = [h3, h4] (these nodes are shown with a dotted line). Verify(푎, 푓, 푟, 휋) verifies that 푎 =? hash(h4 ‖ hash(hash(2 ‖ 푟) ‖ h3)). 푃 authenticates lookups for an item 푖 in 퐷 by returning a proof 휋 that consists of the labels of the sibling of each of the ancestors of node 푖. Figure 1 shows an AMT with 2-bit address where the 1 Secure Merkle trees must use different hash functions to hash the leaves and internal nodes to avoid confusion between the two types of nodes. While we have omitted this detail in the example for simplicity, the Diem Core uses a unique hash function to distinguish between different hash function types to avoid attacks based on type confusion [?]. 2 addresses of nodes in binary are 0x00, 0x01, 0x10, 0x11, respectively. The addressing process is also straightforward: Following each bit of the address from MSB to LSB, if it reaches a 0, visit the left child of the current node; otherwise, the right child. Figure 1 also illustrates a lookup for the 3푟푑 item which is surrounded by dotted lines, and the nodes included in 휋 are shown with dashed lines. In this way, the root digest, 푎, efficiently authenticates all the data stored at all the leaf nodes, addressed by their keys, altogether by recursively hashing the concatenation of the hashes of the two children. 2.1.1 Tractable Representations While a full tree with ℎ-bit keys (256 bits in Diem) of size 2ℎ is an intractable representation when ℎ is a large value, Figure 2 shows two optimizations that can be applied to transform a naive binary implementation ( 1 ) into an efficient one for a sparse Merkle tree. First, subtrees that consist entirely of empty nodes are replaced with a placeholder value (□ in 2 ) whose digest is a predefined constant 퐷푑푖푔푒푠푡 as used in the certificate transparency system9 [ ]. This optimization creates a representation of a tractable size without substantially changing the proof generation of the Merkle tree. However, leaves are always stored at the bottom level of the tree, meaning that the structure requires ℎ hashes to be computed on every leaf modification. A second optimization replaces subtrees consisting of exactly one leaf with a single node ( 3 ). In expectation, the depth of any given item is 푂(log 푛), where 푛 is the number of items in the tree. This optimization reduces the number of hashes to be computed when performing operations on the map including proof-related operations. Some works [5] create an identical default digest only for the empty nodes on the same height ℎ and ℎ ℎ−1 ℎ−1 let 퐷푑푒푓푎푢푙푡 = 퐻(퐷푑푒푓푎푢푙푡 ‖ 퐷푑푒푓푎푢푙푡). In this paper, we choose a single default digest 퐷푑푒푓푎푢푙푡 to represent empty subtrees at any level without ambiguity, especially in our design. 1 2 0 1 0 1 0 0 1 0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 A B C A B C 3 0 1 A 0 1 0 1 B C Figure 2: Three versions of a sparse Merkle tree storing 퐷 = {01002 ∶ A, 10002 ∶ B, 10112 ∶ C}. 2.2 Addressable Radix Merkle Tree (ARMT) The same architecture could be extended to a more generalized radix tree since AMT is just a special case with radix 푟 = 2. In turn, an ARMT is the compressed version of the corresponding AMT such that every 푙표푔2(푟) levels of the original AMT are compressed into a single level where each node has maximum 푟 children, though some children may not exist. We will use ARrMT to denote an addressable Merkle tree with radix 푟. Notwithstanding 푟 = 2푘, 푘 ∈ ℕ+, most ARMT designs choose AR16MT in production, i.e., each internal node has up to 16 children. 3 The choice is mostly attributed from the following tradeoff between storage and I/O overhead because an update of a single leaf node can require all the nodes on the path to the root to be updated together.