An Alternative to the Merkle Tree

CRYPTOLOGY EPRINT ARCHIVE, VOL. 00, NO. 00, MONTH 20YY

HEX-BLOOM: An Efficient Method for Authenticity and Integrity Verification in Privacy-preserving Computing

Ripon Patgiri, Senior Member, IEEE, and Malaya Dutta Borah, Member, IEEE

Ripon Patgiri, Department of Computer Science & Engineering, National Institute of Technology Silchar, Cachar-788010, Assam, India. Manuscript received Month DD, 20YY; revised Month DD, 20YY.

Abstract: The Merkle tree is applied in diverse applications, namely, Blockchain, smart grid, IoT, biomedical, financial transactions, etc., to verify authenticity and integrity. The Merkle tree is also used in privacy-preserving computing. However, the Merkle tree is a computationally costly data structure. It uses cryptographic string hash functions to partially verify the data integrity and authenticity of a data block. The verification process creates unnecessary network traffic because it requires partial hash values to verify a particular block. Moreover, the performance of the Merkle tree also depends on the network latency. Therefore, it is not feasible for most applications. To address the above issue, we propose an alternative model to replace the Merkle tree, called HEX-BLOOM, which is implemented using hash, Exclusive-OR, and Bloom Filter. Our proposed model does not depend on the network latency for verification of a data block's authenticity and integrity. HEX-BLOOM uses an approximation model, Bloom Filter. Moreover, it employs a deterministic model for final verification of the correctness. In this article, we show that our proposed model outperforms the state-of-the-art Merkle tree in every aspect.

Index Terms: Merkle tree, Blockchain, Bitcoin, verification, authentication, integrity, privacy, Hash, Security.

I. INTRODUCTION

The Merkle tree [1] is widely used nowadays due to the diverse requirements of security. Recent developments suggest that the Merkle tree is adapted in numerous research domains including privacy-preserving computation [2], Blockchain [3], [4], [5], [6], [7], [8], [2], [9], cryptography [10], [11], agriculture [12], Healthcare [13], [14], financial transactions [15], [16], [17], Smart Grid [18], [19], Cloud Computing [20], [18], Big Data [21], and Wireless networking [22]. Therefore, the Merkle tree has been modified to enhance its performance. Jakobsson et al. [23] present the fractal Merkle tree to improve time and space. Similarly, M. Szydlo [24] enhances Jakobsson's fractal Merkle tree. Buchmann et al. [25] improve the Merkle tree. Moreover, we have already witnessed diverse variants of the Merkle tree [8], [20], [22], [26], [27]. This shows that the Merkle tree is adapted in diverse applications and modified as per the requirements of those applications. Therefore, the Merkle tree has met wide application in diverse domains, demanding an efficient alternative to the Merkle tree that features low space consumption, fewer network accesses, and low time complexity.

The Merkle tree is a time-consuming data structure that wastes computational resources significantly. It is used to verify data blocks' authenticity and integrity. It allows verification of a data block's authenticity and integrity, using the Merkle root, after a particular data block has been downloaded successfully. Verification requires a few hash values, but it does not require the entire Merkle tree. Moreover, the time complexity of the Merkle tree is high. In addition, the Merkle tree (server) requires high memory to store all hash values; however, a user is not required to store the whole Merkle tree. Each block requires a few hash values, which require network access. Cumulatively, this process creates enormous network traffic. Therefore, it degrades the performance of the Merkle tree and increases the network traffic. The Merkle tree's performance depends not only on the time complexity of the data structure but also on the network latency and network traffic. Also, verification of a particular block is costly due to network access. Verifying each block through the Merkle tree over the network slows down the entire process dramatically.

Notably, the partial verification process of the Merkle tree is a disadvantage, and it should be obviated. The verification process of each block creates unnecessary network traffic, which can easily be avoided. Also, the time complexity can be reduced further. Therefore, we propose an alternative model to the Merkle tree to address the above issue. Our proposed model uses Hash, Exclusive-OR and Bloom Filter, HEX-BLOOM for short. It is two-fold: first, LinkedHashX, and second, Bloom Filter. We construct a deterministic model called LinkedHashX to verify the entire process's correctness. LinkedHashX uses a cryptographic hash function and the XOR operation to provide an alternative model to the Merkle tree. LinkedHashX hashes all data blocks and merges the data blocks' hashes into a single data block using the XOR operation to create the LinkedHashX root. The user or creator of LinkedHashX does not maintain the entire process; instead, the LinkedHashX root is maintained for future use. A user needs to reconstruct the LinkedHashX root and compare it with the original root. Secondly, we use Bloom Filter, an approximation data structure, to verify a block's authenticity and integrity. All data blocks are inserted into Bloom Filter during the construction of the LinkedHashX. A user needs to download the LinkedHashX root and the Bloom Filter to verify a data block's authenticity and integrity.

Our key contributions are outlined below:
• HEX-BLOOM uses Bloom Filter to verify a data block's authenticity and integrity in $O(k)$ time complexity for $k$ distinct hash functions. In contrast, the Merkle tree takes $O(\log n)$ time complexity, where $n$ is the total number of nodes of the Merkle tree, and $k < \log n$.
• The total verification time complexity for data authenticity and integrity is $O(kL)$, where $L$ is the total number of blocks, whereas the Merkle tree takes $O(L \log L)$ time complexity and $kL \ll L \log L$.
• The construction cost of HEX-BLOOM is two-fold: firstly, the construction cost of Bloom Filter, and secondly, the construction cost of LinkedHashX. The construction costs of Bloom Filter and LinkedHashX are $O(kL)$ and $O(L)$, respectively, for $L$ data blocks. The Merkle tree takes $O(n)$ and $kL \ll n$.
• The extra space complexity of Bloom Filter is derived in Equation (12). The extra space complexity of LinkedHashX is $O(1)$. Therefore, the total extra space complexity is that of the Bloom Filter, whereas the Merkle tree takes $O(n)$.
• The total communication cost of our proposed model is $O(1)$, whereas a Merkle tree requires $O(L)$ communications for all blocks.
• Moreover, the time complexity of insertion and deletion is $O(k)$, whereas the insertion or deletion time complexity of the Merkle tree is $O(\log L)$ and $k \ll \log L$.

The article is organized as follows: Section II analyzes the advantages and disadvantages of the Merkle tree. Section IV demonstrates the architecture of Bloom Filter and also analyzes the memory consumption and false positive probability. Section V presents the first part of the proposed system, LinkedHashX, and elaborates its working principles. Section VI combines LinkedHashX and Bloom Filter to provide partial verification on data blocks. Finally, Section VIII concludes the proposed system.

II. MERKLE TREE

Most Merkle tree implementations are binary; however, we assume an $m$-ary Merkle tree for generalization. Definitions 1 and 2 define the $m$-ary tree.

Definition 1. The $m$-ary or $m$-way tree is a tree of order $m$ where a) it is a rooted tree, b) each node can have at most $(m - 1)$ keys, c) each node can have at most $m$ children, and d) keys are not ordered.

Definition 2. An $m$-ary Merkle tree has leaf nodes, internal nodes and a Merkle root, which are as follows: a) the direct hash values of data blocks are known as leaf nodes in the Merkle tree, b) the Merkle tree concatenates $m$ hash values and hashes the concatenated value to form a single value, which is called a parent node or an internal node, and c) the Merkle root is the root node of the tree that contains a hash value of its child nodes.

[...] the SHA2 hash function, and these hash values are used to build the Merkle tree. Therefore, there are $L$ leaf nodes in the tree. Theorem 1 shows that an $m$-ary Merkle tree has $I = \frac{L-1}{m-1}$ internal nodes. It shows that there is an overhead of $I$ hash function evaluations. Alternatively, Lemma 1 shows the total cost in terms of the number of leaf nodes. Cryptographic hash functions are slower than non-cryptographic hash functions. The cryptographic hash functions used by the Merkle tree, for instance, SHA2, therefore impact its performance.

A. Construction

Fig. 1: Construction of the conventional binary Merkle Tree.

Figure 1 demonstrates the construction of the Merkle tree. Initially, all blocks are hashed using a cryptographic string hash function, for instance, SHA256. Then, two consecutive hash values are concatenated, and the concatenated value is hashed using the same hash function to form their parent. The same procedure is applied to the subsequent consecutive pairs. This process is repeated level by level until a single node remains, i.e., the Merkle root. Finally, the Merkle root is published and can be distributed to peers. The creator of the Merkle tree maintains the tree, and the peers do not require the entire Merkle tree.

B. Verification

Figure 2 shows the process of verifying a particular block. The Merkle tree verifies the authenticity of a specific block. For instance, a user has downloaded a block and needs to verify the block for its authenticity and integrity. In this case, the user does not require the entire Merkle tree to verify the block. The user requires only a few hash values to verify the authenticity of the block, as shown in the dashed circle for block 13 in Figure 2.
