Sparse Merkle Trees: Definitions and Space-Time Trade-Offs with Applications for Balloon

Rasmus Östersjö

Faculty of Health, Science and Technology
Computer Science
C-Dissertation 15 HP
Advisor: Tobias Pulls
Examiner: Thijs J. Holleboom
Date: June 8, 2016

Sparse Merkle Trees: Definitions and Space-Time Trade-Offs with Applications for Balloon

Rasmus Östersjö

© 2016 The author and Karlstad University

This dissertation is submitted in partial fulfillment of the requirements for the Bachelor’s degree in Computer Science. All material in this dissertation which is not my own work has been identified and no material is included for which a degree has previously been conferred.

Rasmus Östersjö

Approved, June 8, 2016

Advisor: Tobias Pulls

Examiner: Thijs J. Holleboom


Abstract

This dissertation proposes an efficient representation of a sparse Merkle tree (SMT): an authenticated data structure that supports logarithmic insertion, removal, and look-up in a verifiable manner. The proposal is general in the sense that it can be implemented using a variety of underlying non-authenticated data structures, and it allows trading time for space by the use of an abstract model which represents caching strategies. Both theoretical evaluations and performance results from a proof-of-concept implementation are provided, and the proposed SMT is applied to another authenticated data structure referred to as Balloon. The resulting Balloon has preserved efficiency in the expected case, and is improved with respect to worst-case scenarios.

Acknowledgements

First and foremost, I would like to thank my advisor Tobias Pulls for his support throughout the entire project. Not only did he provide invaluable feedback in terms of technical expertise and report writing, but he also invested time to prepare me for future challenges that go far beyond the scope of this dissertation. Secondly, I would like to thank Roel Peeters for his valuable insights. He suggested an improved layout of the dissertation, and shared his thoughts on the subject. Finally, I sincerely thank my brother Victor Östersjö. He supports me in every project that I participate in, be it with proofreading, grammar, or in-depth discussions.

Contents

1 Introduction
  1.1 Motivation and Challenges
  1.2 Expected Outcome and Scope
  1.3 Contribution
  1.4 Roadmap
2 Background
  2.1 Cryptographic Primitives
    2.1.1 Hash Functions
    2.1.2 Digital Signatures
  2.2 Tree-Based Data Structures
    2.2.1 Merkle Tree
    2.2.2 History Tree
    2.2.3 Binary Search Tree
    2.2.4 Heap
    2.2.5 Treap
    2.2.6 Hash Treap
  2.3 Balloon
    2.3.1 Building Blocks
    2.3.2 Updating
    2.3.3 Snapshots
    2.3.4 Algorithms
  2.4 Summary
3 Sparse Merkle Tree
  3.1 Notion
  3.2 Definition
  3.3 Approach
  3.4 Recursive Relationships
  3.5 Relative Information
  3.6 Final Proposal
4 Analysis
  4.1 Space Complexity
  4.2 Time Complexity
  4.3 Performance
    4.3.1 Setup
    4.3.2 Space Benchmarks
    4.3.3 Time Benchmarks
  4.4 Minding the Adversary
  4.5 Summary
5 Applications to Balloon
  5.1 Overview
  5.2 Extension of the Sparse Merkle Tree
  5.3 Integration
    5.3.1 Key Setup
    5.3.2 Update
    5.3.3 Query and Verify
    5.3.4 Pruned Algorithms
  5.4 Evaluation of the New Balloon
6 Conclusion
  6.1 Project Evaluation
  6.2 Future Work
References
A Memory Requirements for the BD-Hybrid Oracle

List of Figures

2.1 Security properties for cryptographic hash functions
2.2 Shapes of binary trees
2.3 A perfect Merkle tree containing four leaves with attributes a–d. The Merkle audit path for the left-most leaf includes H(b) and H(H(c)‖H(d))
2.4 Two history trees representing distinct views. The earlier view is reconstructed from the newer one by “forgetting” circled nodes
2.5 A simplified hash treap with associated values omitted. Dashed edges represent paths P determined by binary searches for keys k, and circled nodes constitute the corresponding (non-)membership proofs
2.6 Balloon viewed in the three party setting
2.7 A pruned hash treap for the non-membership proofs generated by 3 ← H(k1) and 9 ← H(k2). Circled nodes would be redundant without applying the prune operation
3.1 A simplified SMT with values inserted into the leaves with indices H(k1) = 5 and H(k2) = C
3.2 An illustration of a recursive traversal that obtains the root hash. Dashed components need not be visited
3.3 Hashes recorded by the branch, depth, and BD-hybrid oracles. Circled nodes represent branches, and diamonds are recorded by depth-based oracles
4.1 Two branch paths down to the first and the last leaves, respectively
4.2 The sizes of sparse Merkle audit paths as a function of existing key-value pairs
4.3 The memory used by different oracle models as a function of existing key-value pairs
4.4 The time used to query 1000 (non-)membership proofs as a function of oracle model and existing key-value pairs
4.5 The time used to insert 1000 key-value pairs as a function of oracle model and existing key-value pairs
4.6 The time required to verify (non-)membership proofs as a function of existing key-value pairs

List of Tables

3.1 The notation used when describing recursive relationships

1 Introduction

Consider a trusted author who maintains a collection of data on an untrusted server. Further suppose that clients issue queries to the untrusted server of the form “what is the value associated with key k?”. This forms the notion of a (non-)membership query; a kind of query that results in either a retrieved value or false. Regardless of the answer, however, what stops the untrusted server from lying when responding to queries? Existing data could be modified, or even claimed to be non-existing [3]. Therefore clients need proofs, succinct pieces of information, that undoubtedly prove query responses correct with respect to the trust of the author. The field of authenticated data structures and dictionaries proves useful in such settings, combining regular data structures and cryptographic primitives [27].

Dating back to 1987, Merkle [18] pioneered a tree-based data structure that incorporates the use of hash functions. It was intended as an integral part of a system, but today it is widely deployed in a large variety of other applications. These include [19, 21], certificate issuance [14, 15, 16, 22], and authenticated data structures and dictionaries [8, 20, 24]. The idea of a Merkle tree is general in the sense that it can be applied to any tree, thereby allowing regular data structures such as red-black trees [28], AVL-trees [10, 13], and treaps [2, 4] to be transformed into authenticated constructions. Further, these transformations can be implemented persistently [8], meaning that a client can verifiably query past versions of the data structure. This comes at the price of additional storage and/or query time, however. Therefore another type of append-only Merkle tree was proposed [6, 17] which resulted in a naturally persistent construction.

1.1 Motivation and Challenges

Recently, Pulls and Peeters [24] presented an authenticated data structure referred to as Balloon. It is the composition of a hash treap [8, 24] and a history tree [6], both of which are authenticated data structures on their own. This work intends to replace the hash treap with a sparse Merkle tree (SMT) [3, 16], presumably reducing implementation complexity while preserving efficiency in terms of time and space. The main challenge with this seemingly straightforward task is not the replacement of the hash treap, however. Since the proposal of the SMT by Laurie and Kasper [16], there have been no further publications that outline an efficient approach. Different researchers have claimed it to be both inefficient [25] and promising [24], thereby making it an exciting topic to explore.

1.2 Expected Outcome and Scope

With emphasis on preserved efficiency, the goal is to determine if a SMT can replace the hash treap in Balloon. It is considered out of scope to perform a rigorous security evaluation of the new Balloon, but it is likely that similar evaluations as those provided by Pulls and Peeters [24] apply (hash treaps and SMTs are all based on Merkle trees).

1.3 Contribution

This dissertation formalises the SMT and proposes an efficient representation that is based on the use of an abstract oracle model. Five different oracles are considered, and two of these are proven efficient. The proposed SMT is also applied to Balloon, replacing the hash treap with preserved (improved) efficiency in the expected (worst) case.

1.4 Roadmap

The remainder of this dissertation is structured as follows. Section 2 describes relevant background in terms of preliminaries and terminology. Section 3 provides an in-depth study on SMTs. Section 4 analyses the efficiency of the SMT by means of theoretical evaluations and performance results from a proof-of-concept implementation. Section 5 applies the SMT to Balloon. Finally, Section 6 concludes the dissertation, evaluates the project, and presents future work.

2 Background

The purpose of this section is to equip the reader with necessary background. Section 2.1 introduces cryptographic primitives. Section 2.2 describes data structures related to SMTs and Balloon. Section 2.3 presents Balloon. Finally, Section 2.4 provides a summary.

2.1 Cryptographic Primitives

Cryptographic hash functions and digital signatures are the fundamental building blocks when transforming regular data structures into authenticated constructions. The former provides a secure compression mechanism, and the latter the means for authentication.

2.1.1 Hash Functions

Similar to a regular , a cryptographic hash function compresses an arbitrary large message m into a fixed sized digest h. Thereby a larger space of messages are mapped into a smaller space of digests, implying that collisions are inevitable. However, in a cryptographic context, such collisions must be computationally hard to find [1]. Therefore we deal with with computational security, as oppose to information-theoretical security. More generally there are multiple security properties that a cryptographic hash function is expected to conform with. The most common ones include preimage, second preimage, and collision resistance [1]. These are illustrated in Figure 2.1 and interpreted as follows:

– Figure 2.1a: given the digest h ← H(m) for message m, it must be computationally hard to find a preimage m′ generating h without knowledge of m;

– Figure 2.1b: given a fixed preimage m_fixed, it must be computationally hard to find another preimage m ≠ m_fixed such that H(m) = H(m_fixed);

– Figure 2.1c: it must be computationally hard to find any distinct preimages m1 and m2 such that H(m1) = H(m2).

(a) Preimage resistance (b) Second preimage resistance (c) Collision resistance

Figure 2.1: Security properties for cryptographic hash functions.

When being exposed to cryptographic hash functions for the first time, a common mistake is not noticing the difference between second preimage and collision resistance. If we assume there are no internal weaknesses in the hash function H producing digests of N ← |H(·)| bits, an adversary must exhaustively search for a collision when attempting to break second preimage resistance. That requires computing approximately 2^N hashes [1], which makes collision resistance an easier target. Relying on the birthday paradox, an adversary can cache all message-digest pairs and compare newly derived ones to those in the cache. Collisions are expected after computing 2^(N/2) hashes [1].
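To make the gap between 2^N and 2^(N/2) concrete, the sketch below runs a birthday-style collision search against a deliberately truncated hash. The use of Python, SHA-256 as a stand-in for H, the 32-bit truncation, and all function names are assumptions made purely for illustration; no full-length standardized hash function can be attacked this way.

import hashlib
from itertools import count

def truncated_hash(message: bytes, bits: int) -> bytes:
    """Toy hash: keep only the first `bits` bits of SHA-256 so collisions become feasible."""
    digest = hashlib.sha256(message).digest()
    n_bytes = (bits + 7) // 8
    truncated = bytearray(digest[:n_bytes])
    if bits % 8:
        truncated[-1] &= (0xFF << (8 - bits % 8)) & 0xFF  # mask unused trailing bits
    return bytes(truncated)

def birthday_collision(bits: int):
    """Cache message-digest pairs until two distinct messages collide (birthday paradox)."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen and seen[h] != msg:
            return seen[h], msg, i + 1  # colliding messages and the number of hashes computed
        seen[h] = msg

if __name__ == "__main__":
    m1, m2, attempts = birthday_collision(bits=32)
    # With a 32-bit digest, roughly 2^16 (about 65 000) attempts are expected.
    print(f"collision after {attempts} hashes: {m1!r} and {m2!r}")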

2.1.2 Digital Signatures

The notion of a digital signature was first introduced by Diffie and Hellman [9]. It is used for the purpose of authenticity, non-repudiation, and integrity of a message, meaning that the signature binds the signer (or more accurately, the signer’s secret key) to a message that cannot be tampered with. A general digital signature scheme consists of a security parameter λ, a message space M, and three algorithms defined as follows [11]:

– {vk, sk} ←$ KGen(1^λ): On input of a security parameter λ, the key generation algorithm KGen outputs a verification key vk and a signing key sk;

– s ← Sig(sk, m): On input of a signing key sk and a message m, the signing algorithm Sig outputs a signature s for m;

– {true, false} ← Vf(vk, s, m): On input of a verification key vk, a signature s, and a message m, the verification algorithm Vf outputs true if s is a valid signature for m, else false.

As a security notion, a digital signature scheme can be existentially unforgeable under known message attack [11]. In this model the adversary is given the capability to request arbitrary messages and obtain valid signatures for those before attempting to break the scheme. Without access to the signing key sk, the adversary wins if she can generate one valid tuple (s, m) for a message m ∈ M that was not among the requested signatures, with non-negligible probability in the security parameter λ. Otherwise, the adversary loses and the scheme is said to be secure.
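As a purely illustrative instantiation of the {KGen, Sig, Vf} interface, the sketch below uses Ed25519 from the third-party Python package cryptography. The choice of scheme and library is an assumption made here rather than something prescribed by this dissertation, and Ed25519 fixes its security level internally instead of taking 1^λ as input.

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric import ed25519

def kgen():
    """KGen: output a verification key vk and a signing key sk."""
    sk = ed25519.Ed25519PrivateKey.generate()
    return sk.public_key(), sk

def sig(sk: ed25519.Ed25519PrivateKey, m: bytes) -> bytes:
    """Sig: output a signature s for message m under signing key sk."""
    return sk.sign(m)

def vf(vk: ed25519.Ed25519PublicKey, s: bytes, m: bytes) -> bool:
    """Vf: output True if s is a valid signature for m under vk, else False."""
    try:
        vk.verify(s, m)
        return True
    except InvalidSignature:
        return False

vk, sk = kgen()
s = sig(sk, b"snapshot")
assert vf(vk, s, b"snapshot") and not vf(vk, s, b"tampered")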

2.2 Tree-Based Data Structures

In computer science, a tree [12] is an (un)ordered collection of entities, not necessarily unique, that has a hierarchical parent-child relationship defined between pairs of entities. Every tree has a single root node designating the start of the tree, and each descendant is recursively defined as a tree. A node is said to be an ancestor to all its descendants, and a parent to its concrete children. All children that have the same parent are referred to as siblings, and every node without children is referred to as a leaf. Also related to tree terminology is level, height, and depth. Henceforth the root is found at level one, the height is the number of levels in the tree, and the depth of a subtree rooted at a leaf is zero. A binary tree is closely related to trees. Informally it can be viewed as a tree where each node is restricted to at most a left child and a right child, but as pointed out by Knuth [12] that is not entirely correct. For instance, a tree can neither be empty nor distinguish between two children. The informal notion suffices for our purposes, however, and Figure 2.2 proceeds to illustrate three different types of binary trees. For a binary tree of height h to be perfect, the tree must contain exactly 2^h − 1 nodes. For a binary tree to be full, all nodes must have two or no children. Finally, for a binary tree to be complete, it must be filled left-to-right at the lowest level of the tree, and entirely at the level above.

(a) Perfect binary tree (b) Full binary tree (c) Complete binary tree

Figure 2.2: Shapes of binary trees.

2.2.1 Merkle Tree

Storing values at the lowest level of the tree, a Merkle tree [18] is a binary tree that utilises cryptographic hash functions. While leaves compute the hash of their own attributes, parents derive the hash of their children’s hashes concatenated left-to-right. Therefore the hash rooted at a particular subtree is recursively dependent on all its descendants, effectively serving as a succinct summary for that subtree. Using the prevalent feature of subtree summaries, a Merkle tree can prove values to be present by constructing efficient membership proofs. Each proof must include a Merkle audit path [17], and it is verified by recomputing all hashes, bottom up, from the leaf that the proof concerns towards the root. The proof is believed to be valid if the recomputed root hash matches that of the original Merkle tree, but to be convincing it requires a trustworthy root (e.g., signed by a trusted party or published periodically in a newspaper). As a concrete example, consider how to generate a membership proof for the left-most leaf in Figure 2.3. The Merkle audit path is composed of grey hashes, and it is obtained by traversing the tree down to the left-most leaf (dashed edges), requesting the hash of each sibling along the way. Assuming knowledge of the attribute a and the authentic root hash r, the proof is verified by recomputing1 the root hash r′, testing if r′ =? r.

1 Start by computing H(a) and use the proof to learn H(b). Proceed to derive H(H(a)‖H(b)) and use the proof to learn H(H(c)‖H(d)). Finally, derive r′ ← H(H(H(a)‖H(b)) ‖ H(H(c)‖H(d))).

Figure 2.3: A perfect Merkle tree containing four leaves with attributes a–d. The Merkle audit path for the left-most leaf includes H(b) and H(H(c)‖H(d)).

Following from the work by Merkle [18] and Blum et al. [5], the security of a Merkle audit path reduces to the collision resistance of the underlying hash function. In order to yield unambiguous hashes, however, these must be encoded such that the structure of the tree is captured. An invalid encoding could, for instance, replace the hash of a non-existing child with the empty string, thereby making it impossible to distinguish between a (non-)empty subtree. In order to solve such encoding issues, it is common to add constants [8, 17].
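The verification procedure sketched above fits in a few lines of Python. The code below is an illustration only: it assumes SHA-256 in place of H and encodes the audit path as (sibling hash, sibling-is-left) pairs; as just noted, a robust encoding would additionally add constants to distinguish leaves from interior nodes.

import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # SHA-256 stands in for the generic H

def verify_audit_path(leaf_attribute: bytes, path, root: bytes) -> bool:
    """Recompute the root bottom-up from a leaf attribute and its Merkle audit path.
    `path` lists (sibling_hash, sibling_is_left) pairs from the leaf's sibling upwards."""
    current = H(leaf_attribute)
    for sibling_hash, sibling_is_left in path:
        current = H(sibling_hash + current) if sibling_is_left else H(current + sibling_hash)
    return current == root

# The four-leaf tree of Figure 2.3 with attributes a-d.
a, b, c, d = b"a", b"b", b"c", b"d"
root = H(H(H(a) + H(b)) + H(H(c) + H(d)))
# Audit path for the left-most leaf: H(b) and H(H(c)||H(d)), both right siblings.
assert verify_audit_path(a, [(H(b), False), (H(H(c) + H(d)), False)], root)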

2.2.2 History Tree

A history tree [7] is an append-only Merkle tree that stores events left-to-right at the lowest level of the tree. Therefore it is not lexicographically sorted, and unable to generate efficient non-membership proofs. Nevertheless, the history tree is interesting due to its persistent nature, support of efficient membership proofs, and ability to generate incremental proofs. To clarify what makes the history tree naturally persistent and how an incremental proof is generated, consider Figure 2.4. If there is only access to the version 5 view, the version 2 view can be reconstructed by temporarily “forgetting” events. Hence, without explicit versioning, the history tree is persistent in the sense that past versions of the tree can be efficiently reconstructed and queried for membership. It is also possible to show consistency between root hashes for different views, and that requires proving all events in the earlier view present in the newer view. It is achieved by returning just enough information to reconstruct both root hashes (the filled node in Figure 2.4a, along with the circled filled node and the right child of the root in Figure 2.4b), checking if expected roots are obtained.

(a) Version 2 view (b) Version 5 view

Figure 2.4: Two history trees representing distinct views. The earlier view is reconstructed from the newer one by “forgetting” circled nodes.

2.2.3 Binary Search Tree

A binary search tree (BST) [13] is a binary tree that requires the value of each node to be greater (lesser) than the value of its left (right) child. This property, referred to as the BST property, implies a lexicographical order and allows every look-up operation to use a divide-and-conquer technique known as binary search. In the context of BSTs, a binary search refers to the process of traversing the tree starting at the root down to the node holding a particular value v. At each step, the algorithm checks if the value of the current node is lesser, greater, or equal to v. In the two former cases, the binary search continues recursively in the left/right subtree. In the latter case, v was found and the algorithm halts. Halting also occurs when an empty subtree is reached, indicating that v is a non-member.

Because the time required to complete a binary search is bounded by the height of the BST, it is important that the tree structure remains balanced. The paths starting from the root down to the leaves should differ by at most one in length, thereby allowing insertion, removal and look-up in time O(log n). Some of the traditional balancing techniques include AVL [10, 13] and red-black trees [28], and as described in Section 2.2.5, a BST can also be probabilistically balanced.

2.2.4 Heap

Most commonly used in the context of priority queues, a heap [13] is a tree-based data structure associating with each node a priority. At all times, it preserves two important properties:

– The shape property, requiring that the heap is a complete binary tree;

– The heap property, requiring that every node has a lower or equal priority with respect to its parent.

Traditional heap operations are not in the scope of this dissertation, and Section 2.2.5 will merely refer to the heap property when describing the treap.

2.2.5 Treap

Being a probabilistic data structure, a treap is a randomized search tree associating with each entity a key and a randomly selected priority [2, 4]. Treaps enforce the BST property with respect to keys, the heap property with respect to priorities, and are also set-unique. Set-uniqueness ensures that the tree structures of identical collections are equivalent, thereby implying history independence if priorities are assigned deterministically. Because history independence is an appealing property, cryptographic hash functions have previously been used to transform2 treaps into deterministic treaps [8, 24].

2 Simply apply the hash function H to the key k, yielding priority H(k).

When an entity is inserted into or removed from a treap, its location is first determined by a binary search. In the case of insertion, the node is inserted in place and then rotated upwards in the tree until the heap property is fulfilled. In the case of removal, the node is instead rotated downwards and discarded once it is a leaf. As priorities are assigned randomly, this construction yields a probabilistic balance. Referring to the average case, the expected time for key look-up, insertion, and removal is O(log n), and the expected number of rotations during insertion and removal is O(1) [2]. The worst case is O(n), however [8].

2.2.6 Hash Treap

Representing entities with keys k and values v, a hash treap [8, 24] is a lexicographically sorted history independent key-value store combining a regular Merkle tree and a deterministic treap. Yielding a construction where each node is associated with an entity and every (non-)member has a unique position, hash treaps support efficient (non-)membership proofs. These are represented by means of pruned hash treaps, as shown in Figure 2.5.

(a) Membership proof for key k = 15. (b) Non-membership proof for k = 9.

Figure 2.5: A simplified hash treap with associated values omitted. Dashed edges represent paths P determined by binary searches for keys k, and circled nodes constitute the corresponding (non-)membership proofs.

Similar to regular Merkle trees, a proof must necessarily include a Merkle audit path. Due to encoding hashes differently3, however, information regarding key-value pairs is also required when recomputing the root hash. Denoting by P the path starting at the root down to the node x being verified, the minimal (non-)membership proof includes the Merkle audit path, the hash of each child of x, and all (k, v) ∈ P.

2.3 Balloon

Viewed as a persistent authenticated data structure in the three-party setting [27], Balloon [24] is an append-only key-value store. As shown in Figure 2.6, a trusted author maintains a Balloon at an untrusted server. Clients issue (non-)membership queries to the server, and are able to verify the correctness of each query based only on public information and the trust of the author. The construction is persistent due to allowing queries to past versions of the Balloon, and it is notably achieved while keeping memory requirements minimal [24].


Figure 2.6: Balloon viewed in the three party setting.

Balloon relies on the modest assumption of a collision resistant hash function, an existentially unforgeable digital signature scheme under known message attack, and a perfect gossiping mechanism. That is, similar to related work [17, 25], public information referred to as snapshots must be exchanged between clients, auditors and monitors to ensure that the untrusted server is not presenting contradicting views to different clients. Provided this setting, Balloon is provably correct and secure [24] with respect to the authenticated data structure scheme defined by Papamanthou et al. [23].

3 Setting the hash of an empty child to a string of consecutive zeros, the hash of each node is computed as H(k‖v‖LeftHash‖RightHash).

2.3.1 Key Building Blocks

Balloon combines a hash treap and a history tree. The hash treap serves the purpose of proving non-membership, and it also determines if an event is included in the version v Balloon Bv. The latter requires encoding extra information referred to as epochs [24], which are basically intervals of events. That is, the hash treap can store the index i of a leaf in the history tree containing the digest of an event e, and if i is included in the first v epochs then e ∈ Bv. Whenever e ∈ Bv has been proven using the hash treap, the role of the history tree is to provide a Merkle audit path for the event leaf. The history tree also serves the purpose of showing consistency between different versions of the Balloon.

2.3.2 Updating

Inserting an event into the Balloon is generally the process of updating a non-authenticated data structure D, the hash treap, and the history tree. The hash treap and the history tree are collectively denoted by auth(D), and they merely serve the purpose of authentication. Hence, inserting an event e containing key k and value v is a three-step procedure, defined as follows [24] and sketched in code after the list:

1. The key-value pair (k, v) is inserted into D;

2. The digest H(k‖v) is inserted into the history tree at leaf i;

3. The index i is inserted into the hash treap with key H(k).
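The data flow of the three steps can be summarised as follows. HistoryTree and HashTreap are hypothetical, non-authenticated stand-ins invented for this illustration; they show only where each piece of information ends up, not the actual authenticated structures or interfaces defined by Pulls and Peeters [24].

import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class HistoryTree:
    """Hypothetical stand-in: an append-only list of event digests."""
    def __init__(self):
        self.leaves = []
    def append(self, digest: bytes) -> int:
        self.leaves.append(digest)
        return len(self.leaves) - 1            # the index i of the new leaf

class HashTreap:
    """Hypothetical stand-in: a mapping from hashed keys to history tree indices."""
    def __init__(self):
        self.entries = {}
    def insert(self, hashed_key: bytes, index: int) -> None:
        self.entries[hashed_key] = index

class Balloon:
    def __init__(self):
        self.D = {}                            # the non-authenticated data structure D
        self.history_tree = HistoryTree()      # part of auth(D)
        self.hash_treap = HashTreap()          # part of auth(D)

    def insert(self, k: bytes, v: bytes) -> None:
        self.D[k] = v                                # 1. (k, v) is inserted into D
        i = self.history_tree.append(H(k + v))       # 2. H(k || v) is inserted at leaf i
        self.hash_treap.insert(H(k), i)              # 3. the index i is inserted with key H(k)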

12 2.3.3 Snapshots

In the context of Balloon, a snapshot is defined as the root hash of the hash treap, the root hash of the history tree, and a digital signature for those [24]. The signature is issued by the author, and it is used by the untrusted server to prove correct behaviour.

2.3.4 Algorithms

The genkey, setup, refresh, update, query, and verify algorithms are used during setup, initialization and maintenance of the Balloon. They are inherited from the authenticated data structure scheme defined by Papamanthou et al. [23], and Pulls and Peeters [24] extend this scheme by also defining query (Prune), verify (Prune), update*, and refreshVerify. The three former algorithms concern efficiency in terms of space, and the last algorithm enforces publicly verifiable consistency. It is a property that allows anyone to verify whether a set of events was correctly inserted into the Balloon, but because the algorithm itself is neither interesting for the purpose of this dissertation nor the understanding of Balloon it is considered no further.

Using KGen as a subroutine, the genkey algorithm generates the author’s verification key vk and secret key sk. In addition, it defines the author’s public key pk = {vk, Ω} after picking a deterministic function Ω that orders a set of events when inserting into the history tree.

The setup and update algorithms are also intended for the author. Given a set of events they initialise and update the Balloon respectively, generating snapshots using sk.

The refresh algorithm is intended for the untrusted server, and it is closely related to the update algorithm. Given a set of events it updates the Balloon, but due to having no access to the author’s secret key it is unable to generate the next snapshot. Hence, the next snapshot must always be provided as input by the author, thereby allowing the refresh algorithm to generate the new state of the Balloon in a verifiable manner.

The query algorithm is used to query version v of the Balloon for (non-)membership with respect to an event key k. The proof consists of up to three parts:

13 1. A (non-)membership proof in the hash treap for key H(k);

2. The index i in the history tree (if applicable);

3. A membership proof for leaf i in the history tree (if applicable).

If no index with key H(k) is found, the query result is false and the proof of correctness consists of the non-membership proof in the hash treap. If an index with key H(k) is found but it is not contained within the first v versions, the query result is also false. The proof of correctness consists of the membership proof for key H(k) in the hash treap, along with the retrieved index i. Finally, if an index with key H(k) is found and it is also contained within the first v versions, the query result is true. The proof of correctness consists of the membership proof for key H(k) in the hash treap, the retrieved index i, and the membership proof for leaf i in the history tree.

The verify algorithm determines if a query result and its proof of correctness are valid. Given a snapshot it checks if the (non-)membership proof in the hash treap is authentic, if the index i is in the claimed version (if applicable), and if the membership proof in the history tree is authentic (if applicable). It outputs true if the answer is correct with respect to the proof and the snapshot, else false.

The algorithms used to increase efficiency concern pruning of auth(D). Given a set of events u that are non-members, pruned(auth(D), u) denotes the pruned Balloon for u. It includes just enough information to insert all events e ∈ u into auth(D). This is useful because the author can safely discard both D and auth(D) and query for the essential parts in a verifiable manner before generating the next snapshot as follows:

1. Use query (Prune) to obtain non-membership proofs for all e ∈ u, or a proof that at least one e ∈ u is already a member of the Balloon;

2. Verify the correctness of the result returned by query (Prune) using verify (Prune);

3. Let update* build pruned(auth(D), u) based on the output from query (Prune), insert each event e ∈ u into pruned(auth(D), u), and sign the new snapshot using the author’s secret key sk.

The query (Prune) algorithm uses query as a subroutine, and similarly verify (Prune) uses verify. The pruned Balloon for u consists of a pruned hash treap and history tree respectively, and in the former case all non-membership proofs for e ∈ u are included with redundant nodes removed. In the latter case, a single membership proof for the final inserted event suffices (this is due to the append-only nature of the history tree). As an example of a pruned hash treap that refers to Figure 2.5, consider pruning for the two non-membership proofs generated by 3 ← H(k1) and 9 ← H(k2). The result is shown in Figure 2.7, and notably all but one node would have been redundant if two separate proofs were sent from the server to the author. More generally, Pulls and Peeters [24] show that pruning roughly reduces the information sent from the server by a factor of two, and that it is linear in the size of u.

Figure 2.7: A pruned hash treap for the non-membership proofs generated by 3 ← H(k1) and 9 ← H(k2). Circled nodes would be redundant without applying the prune operation.

2.4 Summary

A Merkle tree is the key building block when discussing authenticated data structures and dictionaries. It is a binary tree that incorporates the use of cryptographic hash functions, and its most prevalent feature is that of subtree summaries. Using these, it is possible to generate efficient membership proofs that show particular values as present in the Merkle tree. A proof is composed of a Merkle audit path, verified by reconstructing the root hash, and believed to be valid if it matches a trustworthy root (e.g., one that is digitally signed). History trees and hash treaps are two types of Merkle trees that are used in the context of Balloon. A history tree is naturally persistent in the sense that it can reconstruct past views by “forgetting” events, and it also supports efficient generation of incremental proofs. Unlike a hash treap, however, it is not lexicographically sorted. Therefore a history tree cannot generate efficient non-membership proofs, whereas a hash treap can. This follows from the observation that a treap is probabilistically balanced, and a binary search can determine if an entity is a (non-)member in expected logarithmic time. Hence, after also defining a collection of algorithms that outline how the two authenticated data structures should collaborate, Balloon supports efficient (non-)membership queries to current and past versions of the Balloon.

3 Sparse Merkle Tree

In this section the sparse Merkle tree (SMT) is incrementally described. Section 3.1 begins with a brief overview, introducing the reader to the SMT. Section 3.2 proceeds to formally define the SMT. Section 3.3 describes the rationale behind the approach. Section 3.4 defines recursive relationships, and for completeness the concept of relative information is used and later described in Section 3.5. The final proposal is presented in Section 3.6.

3.1 Notion

As proposed by Bauer [3] and Laurie and Kasper [16], a SMT is a Merkle tree of intractable size. The depth is fixed in advance with respect to the underlying hash function H, meaning there are always 2^|H(·)| leaves. These are referred to left-to-right by indices, and are associated with either default or non-default values. In the latter case the hash of a key determines the index, which implies there is a unique leaf reserved for every conceivable digest H(k). This is the novelty of the SMT, and it allows generation of (non-)membership proofs using regular Merkle audit paths. Considering that the output of a standardized cryptographic hash function is generally 256 bits or more [1], the idea of a tree containing at least 2^257 − 1 nodes does indeed sound crazy4. The key observation, however, is that the SMT is sparse. The large majority of all leaves will be empty, and consequently most nodes rooted at lower levels of the tree derive identical default hashes. Each empty leaf associated with the default value Γ computes h ← H(Γ), all parents whose children are empty leaves derive h′ ← H(h‖h), and so on. These default hashes need not be derived explicitly, and are preferably cached in advance. Figure 3.1 illustrates a simplified SMT that has 2 ← n entities, a cache with default hashes, and a hash function H producing digests of 4 ← |H(·)| bits. When deriving the root hash for the first time, all white nodes that depend on non-empty leaves must be processed. Grey nodes, however, have associated default hashes and can always be requested from the cache. Therefore the SMT has at least one tractable representation that derives root hashes in time O(n). As a brief proof sketch, this result is attained by observing that there are n paths down to the leaves that are non-default (i.e., grey subtrees need not be visited). The problem with naïve approaches that visit all non-empty leaves is inefficiency [25]. What we need is some mechanism that prevents such visits, presumably allowing an efficient (logarithmic) representation of the SMT. In subsequent sections this challenge will be addressed using relative information, an approach that provides non-default hashes in various parts of the SMT. Before exploring that in detail, however, let us formalise the SMT and further discuss the rationale behind the approach taken here.

4 For instance, the Shannon number that describes a conservative lower bound for the complexity of chess is 10^120 ≈ 2^360. Using a machine that operates at one variation per microsecond, Shannon [26] estimates that it takes 10^90 years to determine the first move in a perfect game.

Figure 3.1: A simplified SMT with values inserted into the leaves with indices H(k1) = 5 and H(k2) = C.

3.2 Definition

Let H denote a preimage and collision resistant hash function, Γ a constant value, k a key, and v a value. The SMT can be defined in terms of Definitions 1–3 as follows.

Definition 1 (Sparse Merkle tree). Serving as a history independent key-value store, a SMT is an authenticated data structure based on a perfect Merkle tree of fixed size that has 2^|H(·)| leaves and five associated methods:

18 – r ← SMT.Init(L): On input of a list L containing pairs of distinct keys k and arbitrary values v, the Init method extracts all (k, v) ∈ L, inserts each v into the leaf with index H(k), computes the cache ξ with default hashes, prepares some relative information δ, and outputs the first root hash r;

– r ← SMT.Add(k, v): On input of a key k and an associated value v, the Add method inserts v into the leaf with index H(k), updates the relative information δ, and outputs the new root hash r;

– r ← SMT.Remove(k): On input of a key k, the Remove method resets the leaf with index H(k) to the default value Γ, updates the relative information δ, and outputs the new root hash r;

– {αk, P} ← SMT.Query(k): On input of a key k, the Query method outputs an answer {true, false} ← αk and a proof P proving this claim. In case of membership, αk also contains the associated value v;

– {true, false} ← P.Verify(αk, r): On input of an answer αk and a root hash r, the Verify method outputs true if P proves αk correct with respect to r, else false.

Definition 2 (Computing hashes). An empty leaf computes H(Γ), a non-empty leaf computes H(H(k)‖v), and a parent computes H(LeftHash‖RightHash).

Definition 3 (Sparse Merkle audit path). Excluding the root, denote by P^k the list of nodes down to the leaf with index H(k) and by s(x) the sibling of node x. A sparse Merkle audit path is defined as the minimal proof of (non-)membership for an answer αk, comprising only non-default hashes associated with nodes in P^k_s = {s(x) | x ∈ P^k}.
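Before turning to the approach, the sketch below (in Python) spells out the hashing rules of Definition 2 and the leaf indexing implied by Definition 1. SHA-256 as H and the empty string as Γ are assumptions made only for illustration; the later sketches in this section reuse these helpers.

import hashlib

GAMMA = b""  # the default value; the concrete constant is an assumption made here

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # so N = |H(.)| = 256 and the SMT has 2^256 leaves

def leaf_hash(k: bytes = None, v: bytes = None) -> bytes:
    """Definition 2: an empty leaf computes H(GAMMA), a non-empty leaf computes H(H(k) || v)."""
    return H(GAMMA) if k is None else H(H(k) + v)

def parent_hash(left: bytes, right: bytes) -> bytes:
    """Definition 2: a parent hashes its children's hashes concatenated left-to-right."""
    return H(left + right)

def leaf_index(k: bytes) -> int:
    """The unique leaf reserved for key k sits at index H(k)."""
    return int.from_bytes(H(k), "big")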

3.3 Approach

Unlike the proposal by Bauer [3], the approach taken here is not based on storing a pruned tree structure in memory. Instead a non-authenticated data structure D is used for the purpose of maintaining all key-value pairs inserted into the SMT, and a cache with default hashes is also used to attain a tractable representation. The data structure D supports insertion in time O(C_a), removal in time O(C_r), look-up in time O(C_l), and splitting5 in time O(C_s). Thereby the challenge with this seemingly odd approach is to map intervals of keys (or more accurately, hashes of keys) to their respective subtrees. One way to achieve that has already been proposed by Laurie and Kasper [16], and this dissertation proposes yet another related solution. However, regardless of how mappings are implemented, the recursive nature of deriving the root hash boils down to four cases:

– an empty leaf is reached: apply Definition 2;

– a non-empty leaf is reached: apply Definition 2;

– an empty subtree is reached: request a cached hash;

– a non-empty subtree is reached: ask recursively for the hash of each child, then apply Definition 2.

In order to distinguish between different subtrees, parameters regarding the depth d and the relative position of a node can be used. The base is a bitmap initialised as all zeros containing |H(·)| bits. It refers to the index of the first leaf in a subtree, and combined with the depth it uniquely determines a node in the SMT. As such, the base remains the same on left traversals, but on right traversals the appropriate bit must be flipped to a one. This process can informally be viewed as starting from an empty string, and for each left (right) traversal a zero (one) is appended. Moreover, by setting the d − 1 least significant bits in the base to ones, the index of the last leaf in the left subtree can be determined. Thus, the combination of base, depth, and splitting allows dividing D appropriately for any two subtrees that are rooted at the children of a node.

5 Splitting refers to the ability of dividing D into two data structures containing all key-value pairs that compare less than (greater than or equal to) a split value. Another way to approach splitting is to use references that refer to the start and the end of the data structure, respectively.

A concrete example that derives the root hash is provided in Figure 3.2. A recursive traversal starts at the root with depth d ← 3, base b ← 000₂, and non-authenticated data structure D ← {(k1, v1), (k2, v2)}. All but the first and the last leaves are empty, thereby implying that H(k1) = 000₂ and H(k2) = 111₂. At each left (right) traversal the depth is reduced by one, the base remains the same (is updated), and D is split into new data structures based on the index of the last leaf in the left subtree. A traversal stops when either a (non-)empty leaf or an empty subtree is reached, and the root hash is eventually obtained as the recursion wraps around. The formalised recursion will be presented next.


Figure 3.2: An illustration of a recursive traversal that obtains the root hash. Dashed components need not be visited.

3.4 Recursive Relationships

Recursions 1–4 outline the derivation of a cache with default hashes, subtree root hashes, and sparse Merkle audit paths. The notation in Table 3.1 is used, and the concept of relative information δ is included for completeness. Simply view it as an oracle: a system entity that provides normally unavailable information regarding non-default hashes.

Table 3.1: The notation used when describing recursive relationships.

Notation        Interpretation
D[n]            An ordered data structure containing entities (D_0, D_1, . . . , D_{n−1})
D[a : h]        New ordered data structure containing entities (D_a, . . . , D_{h−1})
D[n] : D′[m]    New ordered data structure containing entities (D_0, . . . , D_{n−1}, D′_0, . . . , D′_{m−1})
D_s^≤           New ordered data structure containing all entities D_i such that true ← D_i ≤ s
D_s^>           New ordered data structure containing all entities D_i such that true ← D_i > s
β^n             An n-bit string containing β . . . β, where β ∈ {0, 1}
a|β_i           Sets the ith left-most bit in a to β ∈ {0, 1}, where i ∈ [0, |a|)
a|β^n           Sets the n right-most bits in a to β ∈ {0, 1}, where n ≤ |a|
a‖b             Concatenates a and b
Λ               The empty string

Recursion 1 (Deriving default hashes). Starting from the predefined constant Γ, the cache ξ[N + 1] with default hashes (ξ_0, . . . , ξ_N) is derived as follows:

    ξ_i = H(Γ)                       if i = 0
    ξ_i = H(ξ_{i−1} ‖ ξ_{i−1})       if i ∈ [1, N]

where N ← |H(·)| is henceforth the size of a digest outputted by the underlying hash function H in bits.
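Reusing H and GAMMA from the sketch in Section 3.2, Recursion 1 translates into a small loop; ξ_i is the hash of any empty subtree of height i, so ξ_N is the root hash of an entirely empty SMT.

def default_hashes(N: int) -> list:
    """Recursion 1: xi[0] = H(GAMMA) and xi[i] = H(xi[i-1] || xi[i-1])."""
    xi = [H(GAMMA)]
    for i in range(1, N + 1):
        xi.append(H(xi[i - 1] + xi[i - 1]))
    return xi

XI = default_hashes(256)  # assumes the SHA-256-based H from the previous sketch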

Recursion 2 (Root hash). Starting from the base b ← 0^N, depth d ← N, and data structure D containing key-value pairs (D_i^k, D_i^v), the root hash is derived as follows:

    RH(D[n], d, b) =
        δ_{d,b},                 if available
        ξ_d,                     if n = 0
        H(b ‖ D_0^v),            if d = 0, n = 1
        SubRH(D[n], d, b),       else

    SubRH(D[n], d, b) = H(RH(D_s^≤, d − 1, b) ‖ RH(D_s^>, d − 1, b|1_j))

where s ← b|1^{d−1} is the split index, j ← (N − d) the bit in the base that must be set to a one on right-traversals, and D_s^* applies to the hash of the keys.
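Continuing the running sketch, Recursion 2 can be written as follows. Here D is modelled as a list of (H(k) interpreted as an integer, leaf hash) pairs sorted by the first component, the base b is an integer, and the relative information δ is a dictionary keyed by (depth, base). The use of bisect as the split operation (Python 3.10+ for the key argument) is an illustrative choice and not this dissertation's proof-of-concept implementation.

from bisect import bisect_right

N = 256  # bits in a digest of H

def root_hash(D, d=N, b=0, delta=None):
    """Recursion 2: derive the hash of the subtree with depth d and base b."""
    if delta is not None and (d, b) in delta:   # delta_{d,b}, if available
        return delta[(d, b)]
    if not D:                                   # empty subtree: default hash
        return XI[d]
    if d == 0:                                  # a single non-empty leaf
        return D[0][1]
    s = b | ((1 << (d - 1)) - 1)                # split index: the last leaf of the left subtree
    cut = bisect_right(D, s, key=lambda e: e[0])
    left = root_hash(D[:cut], d - 1, b, delta)
    right = root_hash(D[cut:], d - 1, b | (1 << (d - 1)), delta)
    return H(left + right)                      # SubRH

# Leaves are (index, leaf hash) pairs sorted by index, as in Definition 2.
items = sorted((leaf_index(k), leaf_hash(k, v)) for k, v in [(b"k1", b"v1"), (b"k2", b"v2")])
r = root_hash(items)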

Recursion 3 (Merkle audit path). Traversing the SMT down to the leaf with index h^k ← H(k), all hashes in the Merkle audit path are grouped into a list as follows:

    MAP(D[n], d, h^k) =
        Λ,                                                     if d = 0
        MAP(D_s^≤, d − 1, h^k) : RH(D_s^>, d − 1, b|1_j),      if d ≠ 0, h^k_j = 0
        MAP(D_s^>, d − 1, h^k) : RH(D_s^≤, d − 1, b),          else

where d, D, D_s^*, j, s are interpreted as in Recursion 2, b ← h^k|0^d is the base, and h^k_j is the jth most significant bit in h^k.

Recursion 4 (Sparse Merkle audit path). Starting from a Merkle audit path M that was returned by Recursion 3, all redundant hashes are removed as follows:

    SMAP(M[n]) =
        Λ,                                if n = 0
        SMAP(M[0 : n − 1]) : Λ,           if n ≠ 0, M_{n−1} = ξ_{n−1}
        SMAP(M[0 : n − 1]) : M_{n−1},     else

where an N-bitmap specifying the (non-)default hashes must also be encoded; setting the dth bit in the map to a one indicates the hash at depth d to be an included non-default hash.
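Recursions 3 and 4 follow the same traversal; continuing the sketch, the returned list M is ordered from the lowest depth upwards, so M[i] is compared against ξ_i exactly as in Recursion 4.

def audit_path(D, hk, d=N, b=0, delta=None):
    """Recursion 3: collect the sibling hashes on the way down to the leaf with index hk."""
    if d == 0:
        return []
    s = b | ((1 << (d - 1)) - 1)
    cut = bisect_right(D, s, key=lambda e: e[0])
    if hk <= s:   # the jth bit of h^k is zero: continue left, record the right sibling
        sibling = root_hash(D[cut:], d - 1, b | (1 << (d - 1)), delta)
        return audit_path(D[:cut], hk, d - 1, b, delta) + [sibling]
    sibling = root_hash(D[:cut], d - 1, b, delta)
    return audit_path(D[cut:], hk, d - 1, b | (1 << (d - 1)), delta) + [sibling]

def sparse_audit_path(M):
    """Recursion 4: drop default hashes and encode an N-bitmap marking the explicit ones."""
    bitmap, hashes = 0, []
    for depth, h in enumerate(M):
        if h != XI[depth]:
            bitmap |= 1 << depth
            hashes.append(h)
    return bitmap, hashes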

3.5 Relative Information

While modelling relative information, it is intuitive to introduce an abstract model. If we proceed to use an oracle abstraction, the aim is to find one or many different representations that provide hashes on request. That is, the purpose of an oracle is to add shortcuts in the SMT, effectively preventing recursive traversals down to the leaves. These models can be implemented as two-dimensional data structures that are indexed by depth and base. They should support querying in time O(1), and they must not become the bottleneck upon updates or when used in the recursions. Now, in order to get the discussion started, consider Oracles 1–2.

Oracle 1 (Memoryless oracle). A memoryless oracle maintains no relative information.

Oracle 2 (Perfect oracle). A perfect oracle records every non-default hash in the SMT.

The memoryless oracle is implicitly chosen if no relative information is maintained. Clearly, it is not very helpful and the perfect oracle seems to be much more promising. Although it puts a great burden on memory, it is the perfect representation in terms of time. That is, using Oracle 2, any non-default hash is provided on request. Therefore Recursion 2 runs in time O(1), and consequently Recursion 3 runs in time O(C_s). Unfortunately, recording every non-default hash is exhausting (imagine being the oracle). Attempting to find a reasonable trade-off between time and space, Oracles 3–5 are defined.

Oracle 3 (Branch oracle). Referring to a node as a branch when each child derives a non-default hash, a branch oracle records every hash rooted at a branch.

Oracle 4 (Depth oracle). Denoting by d the depth that may fluctuate6 over time, a depth oracle records all non-default hashes rooted at the first d levels of the SMT.

Oracle 5 (BD-hybrid oracle). A BD-hybrid oracle combines Oracle 3 and Oracle 4, recording every non-default hash rooted at a branch and at the first d levels of the SMT.

Figure 3.3 depicts the hashes recorded by Oracles 3–5. The depth of the depth-based oracles is set to three, grey (white) nodes derive default (non-default) hashes, circled nodes represent branches, and diamond nodes are recorded by depth-based oracles. The example is particularly interesting due to clumping at the leaves, however. The branch oracle records an essential hash that is not covered by the depth oracle at level four, and the result is a depth oracle that cannot be fully utilised (one grey node at level three).

6 Do not be confused by the depth being variable; it merely refers to the fact that the oracle need not decide statically in advance how many levels to record. That is, the depth of the SMT is always fixed.

Figure 3.3: Hashes recorded by the branch, depth, and BD-hybrid oracles. Circled nodes represent branches, and diamonds are recorded by depth-based oracles.

More generally, it is a good idea to set the depth of a depth-based oracle to ⌈log n⌉ + 1. This follows from the observation that the output of a cryptographic hash function is statistically independent [1], thereby implying that non-empty leaves will be evenly spread out. Thus, there will be roughly one non-empty leaf per subtree at level log n + 1. Further, the sparse part of the SMT starts at levels greater than log n + 1, and recording additional levels would only prevent recursive visits to the leaves in rare cases.

Querying the relative information is trivial when there is access to the base and the depth. Hence, after obtaining some prevalent oracle model, the final piece of the puzzle is to maintain and create it in the first place. Recursion 5 describes how to update the relative information on insertion and removal, and it uses the hash maybe cache (HMC) routine. Given the left and the right child hash, it computes the hash rooted at a node and gives the oracle an opportunity to update the relative information accordingly.

Recursion 5 (Update). Traversing the SMT down to the leaf with index h^k ← H(k) and value v, the update routine derives the new root hash as follows:

    Upd(D[n], d, h^k, v′) =
        H(v′),                                                              if d = 0
        HMC(Upd(D_s^≤, d − 1, h^k, v′), RH(D_s^>, d − 1, b|1_j), d, b),     if d ≠ 0, h^k_j = 0
        HMC(RH(D_s^≤, d − 1, b), Upd(D_s^>, d − 1, h^k, v′), d, b),         else

where b, d, D, D_s^*, h^k_j, j, s are interpreted as in Recursion 3, v′ is either Γ or h^k‖v depending on whether the leaf has been reset or associated with a new value, and HMC(LHash, RHash, d, b) computes the new hash rooted at depth d with base b, also updating the relative information δ_{d,b} according to the oracle model.

The update routine must always be invoked after a key-value pair has been inserted into or removed from the non-authenticated data structure D. It returns the new root hash, and ensures that the relative information is in a correct state. Depending on the oracle model, however, the HMC routine may also have to be integrated into SubRH in Recursion 2. For instance, that is the case when maintaining depth-based oracles in a plug-and-play manner (hashes are cached as they are derived for the first time).
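Continuing the sketch, Recursion 5 and the HMC routine might look as follows for a branch-style oracle. The caching rule shown (record a hash whenever both children are non-default) and the omission of stale-entry invalidation are simplifications made here; in the proposal, the initial adjustments performed by Algorithms 2 and 3 below handle the latter.

def hmc(left, right, d, b, delta):
    """Hash-maybe-cache: compute the hash at (d, b) and let the oracle record it."""
    h = H(left + right)
    if delta is not None and left != XI[d - 1] and right != XI[d - 1]:
        delta[(d, b)] = h                      # both children non-default: a branch
    return h

def update(D, hk, leaf, d=N, b=0, delta=None):
    """Recursion 5: D must already reflect the insertion or removal; `leaf` is the
    new leaf hash, i.e. H(H(k) || v) on insertion or H(GAMMA) on removal."""
    if d == 0:
        return leaf
    s = b | ((1 << (d - 1)) - 1)
    cut = bisect_right(D, s, key=lambda e: e[0])
    if hk <= s:
        left = update(D[:cut], hk, leaf, d - 1, b, delta)
        right = root_hash(D[cut:], d - 1, b | (1 << (d - 1)), delta)
    else:
        left = root_hash(D[:cut], d - 1, b, delta)
        right = update(D[cut:], hk, leaf, d - 1, b | (1 << (d - 1)), delta)
    return hmc(left, right, d, b, delta)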

3.6 Final Proposal

Consider the current view of the SMT. It does not maintain an explicit tree structure, and it uses a non-authenticated data structure to keep track of key-value pairs, succinct recursions to derive (non-)membership proofs, and relative information to increase the overall efficiency. As a final proposal, to clearly state how these components are combined to satisfy Definitions 1–3, Algorithms 1–5 are defined.

Algorithm 1 (r ← Init(L)). On input of a list L containing pairs of distinct keys k and arbitrary values v, the Init algorithm chooses an oracle model, derives the cache ξ with default hashes using Recursion 1, invokes Algorithm 2 for all (k, v) ∈ L to initialise the non-authenticated data structure D, and outputs the first root hash r using Recursion 2.

Algorithm 2 (r ← Add(k, v)). On input of a key-value pair (k, v), the Add algorithm inserts (k, v) into the non-authenticated data structure D (overwriting any previous pair with the same key), gives the oracle an opportunity to perform initial adjustments to the relative information δ, and uses Recursion 5 to refresh δ and output the new root hash r.

Algorithm 3 (r ← Remove(k)). On input of a key k, the Remove algorithm removes the key-value pair (k, v) from the non-authenticated data structure D (if applicable), gives the oracle an opportunity to perform initial adjustments to the relative information δ, and uses Recursion 5 to refresh δ and output the new root hash r.

Algorithm 4 ({αk, P} ← Query(k)). On input of a key k, the Query algorithm determines the answer αk ← k ∈? D, uses Recursions 3 and 4 to obtain the (non-)membership proof P comprising a sparse Merkle audit path and an |H(·)|-bitmap specifying the non-default hashes, appends the associated value v to αk if αk = true, and outputs {αk, P}.

Algorithm 5 ({true, false} ← Verify(αk, r)). Given a proof P, an answer αk, and an authentic root hash r, the Verify algorithm derives the cache with default hashes ξ using Recursion 1, reconstructs the root hash r′ using the knowledge of P, αk, and the associated value v in case of membership, and outputs r′ =? r.

Apart from a minor detail, recomputing the root hash in Algorithm 5 is analogous to recomputing the root hash in a regular Merkle tree (see Section 2.2.1). The one difference is that the |H(·)|-bitmap must always be consulted before using the next hash in the sparse Merkle audit path, indicating whether the next hash is implicit or explicit. Per definition, referring to Recursion 4, the next hash at depth d is implicit and obtained from ξd whenever the dth bit in the map is zero. In all other scenarios, when the dth bit is one, the next hash in the sparse Merkle audit path is used. Note also that the order in which hashes are concatenated need not be provided by the proof; knowledge of the key suffices to derive the index of the leaf in question.
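Closing the running sketch, verification consults the bitmap exactly as described: a zero bit at depth d takes the default hash ξ_d, a one bit consumes the next explicit hash, and the key alone dictates the order of concatenation.

def verify(k, value, bitmap, hashes, root):
    """Algorithm 5, sketched: rebuild the root from a sparse Merkle audit path.
    `value` is None for a claimed non-membership."""
    hk = leaf_index(k)
    current = leaf_hash() if value is None else leaf_hash(k, value)
    it = iter(hashes)
    for depth in range(N):                              # from the leaf towards the root
        sibling = next(it) if (bitmap >> depth) & 1 else XI[depth]
        if (hk >> depth) & 1:                           # our subtree is the right child here
            current = H(sibling + current)
        else:
            current = H(current + sibling)
    return current == root

# End-to-end check against the sketches above (illustrative only).
bm, hs = sparse_audit_path(audit_path(items, leaf_index(b"k1")))
assert verify(b"k1", b"v1", bm, hs, r)                  # membership
bm, hs = sparse_audit_path(audit_path(items, leaf_index(b"absent")))
assert verify(b"absent", None, bm, hs, r)               # non-membership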

4 Analysis

This section analyses the efficiency of the SMT. Section 4.1 proves memory requirements for each oracle model and (non-)membership proofs. Section 4.2 proceeds with a similar analysis concerning time. Section 4.3 presents performance results. Section 4.4 considers how and if an adversary can cause inefficiency. Finally, Section 4.5 provides a summary.

4.1 Space Complexity

Throughout the remainder of Section 4, let n ∈ ℕ denote the number of non-empty leaves in the SMT, d ← (⌈log n⌉ + 1) the depth of all depth-based oracles, and N ← |H(·)| the size of a digest in bits. Further suppose that n is a power of two to simplify the analysis and interpretation of Theorems 1–11.

Theorem 1 (Memoryless oracle). Oracle 1 records no hashes.

Proof. This follows directly from the definition of Oracle 1.

Theorem 2 (Perfect oracle). Oracle 2 records at most n(N − log n + 2) − 1 hashes.

Proof. In order to yield the upper bound, the idea is to have as little overlap as possible between paths of non-default hashes. This is achieved by inserting values such that they are evenly spread out at the leaves, meaning there will be n paths starting at level log n + 1 without overlap. One way to view this scenario is as follows: n paths start at the root down to the leaves, overlapping entirely at the root node. At the children of the root, the overlap is reduced to 50%. Similarly, at the children of the children of the root, the overlap is reduced to 25%. When level log n + 1 is reached the overlap stops, and visually the upper part of the SMT is a perfect binary tree of height log n with overlapping paths. Thus, the upper part of the SMT has 2^(log n) − 1 = n − 1 non-default hashes.

All remaining non-default hashes are found at levels greater than log n. Therefore we consider how many hashes are contained in the remaining n non-overlapping paths that start from level log n + 1. A path from the root contains N + 1 hashes. Consequently, removing the first log n hashes leaves only N + 1 − log n hashes. Thus the lower part of the SMT contains n(N − log n + 1) hashes, and when also considering the hashes from the upper part the total amount is n − 1 + n(N − log n + 1) = n(N − log n + 2) − 1.
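To get a feel for the magnitudes, the closed form of Theorem 2 can be evaluated directly; the digest size N = 256 is an assumption matching common hash functions, and n is taken to be a power of two as above.

from math import log2

def perfect_oracle_bound(n: int, N: int = 256) -> int:
    """Upper bound of Theorem 2 on the number of hashes recorded by the perfect oracle."""
    return n * (N - int(log2(n)) + 2) - 1

for n in (2**10, 2**20):
    print(n, perfect_oracle_bound(n))
# n = 1024    -> 1024 * (256 - 10 + 2) - 1     =     253,951 recorded hashes
# n = 1048576 -> 1048576 * (256 - 20 + 2) - 1  = 249,561,087 recorded hashes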

Theorem 3 (Branch oracle). Oracle 3 records n − 1 hashes.

Proof. Let S(n) be a function that determines how many hashes are recorded by the branch oracle. The claim S(n) = n − 1 is proven correct by induction for all n > 0 as follows. In the base case, there is only a single value associated with leaf x. Hence, there is one path down to x that contains non-default hashes, and all remaining nodes have associated default hashes. Thus, there are no two siblings that both derive non-default hashes, thereby implying there are no branches and S(1) = 0. During the induction step, let n = k and suppose that S(k) = k − 1 is true. When a value is associated with a previously empty leaf y, a new branch is created at the first ancestor shared by y and any of the other k existing non-empty leaves. Because no branches are removed by inserting another value into the SMT, the new branch count will be S(k)+1.

Thus, S(k + 1) = S(k) + 1 = k. 

Theorem 4 (Depth oracle). Oracle 4 records [log n + 1, 2n − 1] hashes.

Proof. The lower bound is determined by considering how n values can be associated with the SMT. On insertion of the first value, log n + 1 non-default hashes must always be recorded due to the inevitable path of non-default hashes down to leaf x. Succeeding values, however, can be inserted such that their leaves share the ancestor of x at level log n + 1. Hence, no additional hashes are recorded and the lower bound is log n + 1.

The upper bound requires non-empty leaves to be evenly spread out, meaning that each subtree rooted at level log n + 1 contains a single non-default value. Visually this implies the depth oracle to be a perfect binary tree of height log n + 1, effectively recording 2^(log n + 1) − 1 = 2n − 1 non-default hashes.

Theorem 5 (BD-hybrid oracle). Oracle 5 records at most 2n − 1 hashes.

Proof sketch. The upper bound for the BD-hybrid can be proven using a greedy algorithm. The idea is to insert values one at a time, and for each insertion the aim is to always have the maximum number of hashes inserted into the relative information. The approach is sufficient because trading the locally best option with one that is not part of the set of greedy solutions is never worthwhile, meaning that a better option never arises due to first placing a value strategically. The full proof can be found in Appendix A.

The lower bound for the BD-hybrid is nontrivial to derive. With reasonable accuracy, however, it is approximated by n. This follows from the observation that all values can be inserted into one subtree at level log n + 1, thereby only utilising the depth oracle on the first insertion.

Theorem 6. A membership proof is expected to contain log n hashes, and a non-membership proof is expected to contain log n + 1 hashes.

Proof. Since the output of a cryptographic hash function is statistically independent [1], each insertion yields a random walk down to some leaf. After n such inserts, the expected case refers to the scenario where each subtree at level log n + 1 contains a single non-empty leaf. Therefore a sparse Merkle audit path includes log n hashes from the first log n + 1 levels (the root is excluded), and in the case of membership that suffices. Let us use a proof by contradiction to show why that is. Suppose that a membership proof includes more than log n hashes. When the traversal down to the non-empty leaf proceeds past level log n + 1, there are log n hashes included in the sparse Merkle audit path. Consequently, one or more of the remaining siblings must be the roots of non-empty subtrees. If that ever occurs, however, a contradiction is attained due to multiple non-empty leaves in the subtree rooted at level log n + 1. Therefore our assumption must be wrong, and log n hashes suffice in the expected case.

In order to show that a non-membership proof requires one more hash, consider the traversal down to the empty leaf x. The initial log n hashes are first included as for membership, and at each level greater than log n + 1 two situations can appear. Either the traversal proceeds towards the non-empty leaf, or it moves into an empty subtree towards x. In the former case the sibling has a default hash and the same situation appears again at the next level. In the latter case the sibling is the root of the non-empty subtree, the hash of the sibling is included, and thereafter the subtree is always empty. As such, a single additional hash must always be included when moving into the empty subtree for the first time.
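For a sense of scale, assuming SHA-256 with 32-byte digests and n = 2^10 key-value pairs (parameters chosen only for illustration): a membership proof is then expected to carry log n = 10 sibling hashes, roughly 320 bytes, and a non-membership proof log n + 1 = 11 hashes, roughly 352 bytes; any additional encoding, such as a bitmap marking which siblings are default, comes on top of this.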

4.2 Time Complexity

It is convenient to analyse time in terms of how many hashes must be computed during generation of (non-)membership proofs. This is the only factor that differs depending on the oracle model, and it is independent of the underlying non-authenticated data structure. In addition, such an analysis carries over to insertion and removal. Disregarding the N + 1 hashes that must change on updates, associating a value with an empty leaf is equivalent to the generation of a non-membership proof. Similarly, updating the value of a non-empty leaf is equivalent to the generation of a membership proof. These observations follow from the fact that Recursions 3 and 5 perform identical recursive traversals whenever the key is the same.

Theorem 7 (Memoryless oracle). Denote by x the leaf being authenticated, P the path down to x, m the total number of non-default hashes, and m′ the number of non-default hashes rooted at nodes in P. Using Oracle 1, generating a (non-)membership proof for x requires m − m′ hashes to be computed.

Proof. Recall that a Merkle audit path contains the minimal amount of hashes required to recompute the root hash. It is constructed by requesting all hashes rooted at siblings of nodes in P, and given these the root hash can be derived by computing the remaining N + 1 hashes (these are rooted at the nodes in P). Thus, none of the m′ hashes need be computed on generation of (non-)membership proofs, but all the other m − m′ hashes must be computed due to the Merkle audit path being a minimal representation and there being no relative information available.

Theorem 8 (Perfect oracle). Using Oracle 2, no hashes are computed on generation of (non-)membership proofs.

Proof. Since all non-default hashes are recorded by the perfect oracle, the hash of an arbitrary node can always be determined by mere table look-up. That is, if the hash rooted at a node is default it is already precomputed, and if it is non-default it is stored in the relative information. In either case the hash is available, implying that no further computation is ever needed.
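The per-level default hashes referred to above can be precomputed once and reused for the lifetime of the SMT. The following C++ sketch illustrates that idea under two assumptions that are not taken from the dissertation: the default leaf is the hash of an empty string, and an internal default hash is the hash of its two (default) children; std::hash is used only as a stand-in for SHA-256.

#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Digest = std::size_t;  // stand-in for a 32-byte SHA-256 digest

// Placeholder hash; a real implementation would call SHA-256 here.
Digest H(const std::string& data) { return std::hash<std::string>{}(data); }
Digest H(Digest left, Digest right) {
    return H(std::to_string(left) + std::to_string(right));
}

// defaults[i] holds the default hash at level i, where level 0 is the root
// and level N is the leaves. Any empty subtree can then be resolved by a
// single table look-up instead of recomputation.
std::vector<Digest> precompute_defaults(std::size_t N) {
    std::vector<Digest> defaults(N + 1);
    defaults[N] = H("");  // assumed default (empty) leaf
    for (std::size_t level = N; level > 0; --level) {
        // both children of a default node are themselves default
        defaults[level - 1] = H(defaults[level], defaults[level]);
    }
    return defaults;
}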

Theorem 9 (Branch oracle). Using Oracle 3, N − log n + 1 hashes are expected to be computed on generation of membership proofs, and 2(N − log n) + 1 hashes are expected to be computed on generation of non-membership proofs.

Proof. Recall that expected behaviour implies non-empty leaves to be evenly spread out. Therefore all branches are formed at the first log n levels, and the analysis of generating (non-)membership proofs can start the traversal down to leaf x at level log n + 1. In the case of membership, the traversal proceeds towards x along a path of non-default hashes. The sibling at level log n + 1 has a non-default hash, and it is not available in the relative information. Therefore it must be computed, requiring a traversal down to the non-empty leaf y and computing N − log n + 1 hashes. These are all rooted at the path down to y, and no additional hashes are computed due to the SMT being sparse. Once the hash at level log n + 1 is obtained, the original traversal continues down to x and all remaining siblings are associated with default hashes. Thus, N − log n + 1 hashes are computed during generation of membership proofs.

In the case of non-membership, the traversal also continues onto level log n + 1. The hash rooted at the sibling is non-default, and is derived by computing N − log n + 1 hashes. Unlike the case of membership, however, the traversal down to the non-empty leaf must deviate from the path of non-default hashes at some point (refer to the discussion in Theorem 6). For the purpose of being conservative, suppose that such deviation appears at level log n + 2. Then the subtree rooted at the sibling will be non-empty, requiring N − log n hashes to be computed (one less than that of level log n + 1). No additional hashes are computed since the remainder of the subtree is empty, and (N − log n + 1) + (N − log n) = 2(N − log n) + 1 thus serves as a conservative bound for the amount of hashes computed on generation of non-membership proofs.

Theorem 10 (Depth oracle). Using Oracle 4, no hashes are expected to be computed on generation of membership proofs, and N − log n hashes are expected to be computed on generation of non-membership proofs.

Proof. The proof is almost identical to the case of a branch oracle. The one difference is that the first log n + 1 levels are recorded, meaning that an additional hash is available in the relative information. Fortunately, this is the one non-default hash that had to be derived in the case of membership, and one of two non-default hashes that had to be derived in the case of non-membership. Thus, adjusting Theorem 9 by subtracting the N − log n + 1 hashes computed at level log n + 1, this theorem is proven correct.

Theorem 11 (BD-hybrid oracle). Using Oracle 5, the same number of hashes as computed by a regular depth oracle is expected on generation of (non-)membership proofs.

Proof. When leaves are evenly spread out, the branch oracle is unable to record any hash that is not already covered by the depth oracle. Thus, the BD-hybrid is expected to behave as a depth oracle. 

In order to motivate the BD-hybrid, note that the worst case of the depth oracle occurs when the lower space bound is achieved. That is, the majority of the SMT must be processed when generating (non-)membership proofs for the non-empty subtree at level log n + 1. With the BD-hybrid, however, the branch oracle records roughly n hashes and the overall efficiency is presumably good. The catch is that the branch oracle also has a worst case. When there are branch paths down to the leaves, as shown in Figure 4.1, generating (non-)membership proofs can require processing up to N leaves (one for each sibling). Fortunately this problem is limited by the fixed height of the SMT, and as n grows large it is also hard to find the right key-value pairs causing such paths. Thereby we conclude that the branch and BD-hybrid oracles are efficient regardless of how key-value pairs are chosen, whereas the depth oracle struggles in the worst case.

Figure 4.1: Two branch paths down to the first and the last leaves, respectively.

4.3 Performance

The SMT was implemented using the C++ programming language and SHA-256 as the underlying hash function. The non-authenticated data structure D is a std::map indexed by the hash of a key, and relative information is implemented as a std::vector containing a separate std::unordered_map for each level of the SMT. Therefore insertion, removal, look-up and splitting⁷ are performed in time O(log n), and relative information is expected to be maintained and queried in time O(1).

⁷ Instead of creating new data structures, splitting refers to the use of iterators that specify the first and the last entity contained in a subtree.
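To make the storage layout above concrete, the following C++ sketch mirrors the description: a std::map for the key-value pairs and one std::unordered_map of cached non-default hashes per level. All names are illustrative rather than taken from the actual implementation, and identifying a node by the bit-prefix of its path from the root is an assumption about the indexing scheme.

#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

using Digest = std::string;  // 32-byte SHA-256 output

struct SparseMerkleTree {
    // Non-authenticated data structure D: values indexed by the hash of their
    // key, giving O(log n) insertion, removal, and look-up.
    std::map<Digest, std::string> D;

    // Relative information: one table of cached non-default hashes per level.
    // A node at level i is identified by the first i bits of its path, so each
    // individual table is expected to be maintained and queried in O(1) time.
    std::vector<std::unordered_map<std::string, Digest>> relative_information;

    explicit SparseMerkleTree(std::size_t height)
        : relative_information(height + 1) {}
};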

4.3.1 Setup

All benchmarks are the average of 20 independent runs using an Intel Core i5-2500 3.3GHz quad core CPU and 2x8GB DDR3 RAM. The execution environment is a VMware virtual machine, and it is allotted 8GB of RAM. Using this setup, randomly generated key-value pairs with unique keys are inserted into the SMT. Keys used to query for (non-)membership are also chosen randomly, and a new SMT is initialised whenever the oracle model changes or the number of existing key-value pairs is updated to the next power of two. Moreover, the depth-based oracles are maintained such that they are always up-to-date.

Benchmarks concerning memory and verification time of (non-)membership proofs are executed as a single program, and for each independent run the average of 1000 queries is computed. The remaining benchmarks are executed as a separate program, which starts by querying for 1000 (non-)membership proofs. Then the memory used by the oracle is logged, and finally 1000 distinct key-value pairs are inserted one at a time into the SMT.

4.3.2 Space Benchmarks

Figure 4.2 shows the average size of a sparse Merkle audit path. The logarithmic curve is included as a reference, and both membership and non-membership proofs appear to require O(log n) space. This is indeed expected considering Theorem 6, but there is no indication whatsoever that a sparse Merkle audit path would be larger in the case of a non-membership proof. It is unknown why this difference could not be measured.

The memory used by each oracle model is shown in Figure 4.3. A depth-based oracle records roughly twice as many hashes as a branch oracle, and a BD-hybrid consistently records more hashes than a regular depth oracle. No data was gathered for the memoryless oracle since it records no hashes, and the perfect oracle was excluded due to excessive memory usage.

Figure 4.2: The sizes of sparse Merkle audit paths as a function of existing key-value pairs.

Figure 4.3: The memory used by different oracle models as a function of existing key-value pairs.

4.3.3 Time Benchmarks

Figure 4.4 shows the time used when querying for (non-)membership proofs. As is evident when observing a seemingly linear growth on a logarithmic x-axis, all of the observed oracles allow queries to run in time O(log n). The perfect oracle provides the shortest query times, but since it is unable to complete the benchmark (hence the drop after 2^17 inserted key-value pairs) it is not of practical interest. Therefore the BD-hybrid proves most efficient when considering the remaining oracles, and a regular depth oracle is almost as fast. The branch oracle is the least efficient, but nevertheless logarithmic in n.

Figure 4.4: The time used to query 1000 (non-)membership proofs as a function of oracle model and existing key-value pairs. (a) Membership proofs; (b) non-membership proofs.

The insertion benchmark is shown in Figure 4.5. It looks very much like Figure 4.4, thereby implying that all observed oracles allow insertion in time O(log n). Due to the necessity of updating underlying data structures, however, extra overhead is added. That is most evident for the perfect oracle, which significantly outgrows the other oracles due to a larger and more frequently updated relative information.

Figure 4.5: The time used to insert 1000 key-value pairs as a function of oracle model and existing key-value pairs.

The final benchmark concerns the time required when verifying a non-membership proof. As clearly depicted in Figure 4.6, it is constant and performed in roughly 200 µs. This is indeed no surprise, considering that N + 1 hashes are computed regardless of how many of those are (non-)default. Thus, a membership proof is also verified in time O(1).

Figure 4.6: The time required to verify (non-)membership proofs as a function of existing key-value pairs.
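The constant verification cost follows directly from the structure of a sparse Merkle audit path: the verifier recomputes N + 1 hashes from the leaf up to the root, substituting a precomputed default hash wherever the sibling is empty. The C++ sketch below illustrates that procedure under several assumptions not fixed by the text: the proof carries a bitmap marking non-default siblings together with the corresponding hashes ordered from the root downwards, the key bits select left or right child at each level, and std::hash again stands in for SHA-256.

#include <bitset>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

constexpr std::size_t N = 256;   // tree height = digest length in bits
using Digest = std::size_t;      // stand-in for a 32-byte SHA-256 digest

Digest H(const std::string& s) { return std::hash<std::string>{}(s); }
Digest H(Digest l, Digest r) { return H(std::to_string(l) + std::to_string(r)); }

struct AuditPath {
    std::bitset<N> non_default;   // bit i set: the sibling at depth i+1 is non-default
    std::vector<Digest> siblings; // supplied non-default sibling hashes, root to leaf
};

// Recompute the root from the leaf upwards. Exactly N + 1 hashes are computed
// (one for the leaf, one per level), regardless of how many siblings are
// default, which is why verification runs in constant time.
bool verify(const std::bitset<N>& key, const std::string& leaf_value,
            const AuditPath& proof,
            const std::vector<Digest>& defaults,  // per-level default hashes
            Digest expected_root) {
    if (proof.siblings.size() != proof.non_default.count()) return false;
    Digest current = H(leaf_value);               // hash 1 of N + 1
    std::size_t next = proof.siblings.size();
    for (std::size_t level = N; level > 0; --level) {
        Digest sibling = proof.non_default[level - 1]
                             ? proof.siblings[--next]   // consume root-to-leaf order backwards
                             : defaults[level];         // precomputed default hash
        current = key[level - 1] ? H(sibling, current)  // key bit 1: we are the right child
                                 : H(current, sibling); // key bit 0: we are the left child
        // one hash per level: N more, N + 1 in total
    }
    return current == expected_root;
}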

4.4 Minding the Adversary

Consider the most powerful adversary, one that is given the capability to insert values into leaves with arbitrary indices (which implies breaking preimage resistance). The three main attacks that can be launched are consuming oracle memory, increasing proof sizes, and increasing query times. The first is not really an attack, however, considering that the oracle model is not chosen by the adversary (achieving the upper space bound is generally a good thing). As such, let us consider only the latter two approaches that are harmful.

Regardless of the oracle model, the adversary is in complete control of which nodes contain (non-)default hashes. Therefore a sparse Merkle audit path need not be sparse, and reduces to a regular Merkle audit path with N hashes. This is significantly larger than the expected case of log n, but nevertheless it is bounded by a fixed constant.

Attacks aimed to increase the query time are more interesting to consider. Since the adversary has complete control over the (non-)default leaves, the worst case for each oracle model can always be triggered. Therefore it is necessary to consider such cases carefully, as done in Section 4.2, and determine whether the oracle model is resilient or not. That is the case for the branch and BD-hybrid oracles, but not the case for the depth oracle. It is not particularly relevant to consider the perfect oracle due to its excessive use of memory, and the memoryless oracle has no worst case.

4.5 Summary

The memoryless and perfect oracles are the most efficient in terms of space and time respectively, but their optimality comes at the price of complete inefficiency in the other extreme of time or space. Thereby the branch, depth, and BD-hybrid oracles are more promising, requiring O(n) space while also supporting insertion, removal⁸, and generation of (non-)membership proofs in time O(log n). Moreover, regardless of the oracle model, (non-)membership proofs are expected to require space O(log n). In the worst case they are bounded by the fixed depth of the SMT, and verification time is always O(1).

This section was concluded by considering the most powerful adversary, one that is given the capability to insert values into leaves with arbitrary indices. Both the branch and the BD-hybrid oracles are resilient against such an adversary, but a regular depth oracle struggles due to inefficiency in its worst case scenario (all non-empty leaves are in the same subtree rooted at level log n + 1). Thus, the branch and BD-hybrid oracles are the preferable models when considering space, time, and security.

⁸ Because removal uses the same update recursion as insertion, no explicit benchmark was shown.

5 Applications to Balloon

This section describes how the SMT is applied to Balloon. Section 5.1 introduces an overview of the approach. Section 5.2 adjusts the current view of the SMT to meet the requirements of Balloon. Section 5.3 describes how the Balloon algorithms are altered. Finally, Section 5.4 evaluates the efficiency of the new Balloon.

5.1 Overview

Recall that Balloon comprises a hash treap, a history tree, and a collection of algorithms. The hash treap is mainly used for the purpose of non-membership proofs, and combining it with the history tree allows Balloon to be an interesting hybrid. Since the objective in this section is to replace the hash treap with a SMT, there will be little focus on the history tree. Instead, let us study how the hash treap is used to understand its replacement. On insertion of a new event e, the digest H(k) serves as a key when inserting index i into the hash treap. If i is inserted into the SMT with key H(k), the exact same operation can be mimicked. The SMT can also mimic the role of the hash treap when Balloon is queried for (non-)membership, and when the outputted proofs are verified. That is, the hash treap uses Merkle audit paths as (non-)membership proofs, and so does the SMT. To allow the trusted author to safely discard auth(D) and query the untrusted server for a pruned Balloon, the hash treap supports pruning. As of now there is no such operation supported by the SMT, and thus it must be defined. Similarly, the author must also be able to use the pruned representation when inserting events to generate the next snapshot. These two operations will be defined next.

5.2 Extension of the Sparse Merkle Tree

Definition 4 extends the current view of the SMT to support pruning. A minor reduction is also added in Definition 5 to attain an append-only representation that suits Balloon.

Definition 4 (Extension of the sparse Merkle tree). Building on Definition 1, the following methods are added:

– SMT.Prune(S, u): On input of a set S containing non-membership proofs P for events u, the Prune method removes all redundant hashes, encodes additional information p to determine which hashes were removed, and outputs the pruned representation P_u^S;

– P_u^S.Root(): On invocation of the Root method, all events e ∈ u are inserted into the pruned representation and the new root hash r is outputted.

Definition 5 (Reduction of the sparse Merkle tree). Building on Definition 4, the Remove method is forbidden and the Add method rejects all attempts to change the value of a non-empty leaf.

Considering that many sparse Merkle audit paths are broken when removing shared non-default hashes, it is essential to encode additional information p that determines how to properly reconstruct them. One approach is to order the non-membership proofs P ∈ S increasingly with respect to the leaves’ indices, and for each proof also encode one more bitmap specifying whether a hash is shared. The first time a shared hash is included in a sparse Merkle audit path, the ith bit is set to one in both bitmaps; a hash that is already included in a previous proof is indicated by setting the ith bit to zero in the original bitmap and to one in the additional bitmap. A search backwards then suffices to find a shared hash, stopping at the first proof where the ith bit is set in both bitmaps.
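The following C++ sketch spells out that look-up rule. It assumes the convention above: each pruned proof stores one hash per set bit in its original bitmap, and a hash that is shared but owned by an earlier proof is found by scanning backwards for the first proof with both bits set at that level. Types and names are illustrative, not taken from the dissertation's implementation.

#include <bitset>
#include <cstddef>
#include <vector>

constexpr std::size_t N = 256;   // tree height = digest length in bits
using Digest = std::size_t;      // stand-in for a 32-byte SHA-256 digest

struct PrunedProof {
    std::bitset<N> included;     // original bitmap: hash at level i stored with this proof
    std::bitset<N> shared;       // additional bitmap: hash at level i is shared
    std::vector<Digest> hashes;  // one stored hash per set bit in `included`
};

// Position of the stored hash for `level` within proof p, assuming the hashes
// are stored in increasing level order.
std::size_t offset(const PrunedProof& p, std::size_t level) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < level; ++i) count += p.included[i];
    return count;
}

// Resolve the sibling hash at `level` for proof j in the ordered set `proofs`.
Digest resolve(const std::vector<PrunedProof>& proofs, std::size_t j,
               std::size_t level, const std::vector<Digest>& defaults) {
    const PrunedProof& p = proofs[j];
    if (p.included[level])                       // stored with this proof
        return p.hashes[offset(p, level)];
    if (p.shared[level])                         // shared: search backwards for its owner
        for (std::size_t k = j; k-- > 0; )
            if (proofs[k].included[level] && proofs[k].shared[level])
                return proofs[k].hashes[offset(proofs[k], level)];
    return defaults[level];                      // otherwise the sibling is default
}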

On implementation of P_u^S.Root(), all proofs P ∈ P_u^S can be “verified” as if they concern membership for the appropriate events’ indices. However, as each root hash is constructed, affected hashes in P_u^S must also be updated. Then the first root hash captures insertion of the first event, the second root hash insertion of the first two events, and so on. Eventually the root hash that captures all insertions is obtained, and it is outputted as the final result.

5.3 Integration

The nine algorithms described in Section 2.3.4 will now be revisited to show how a SMT can replace the hash treap. The new Balloon uses the revised SMT, as per Definitions 4–5.

5.3.1 Key Setup

Considering that the genkey algorithm merely concerns the author’s keys, there is no interaction with the hash treap. Thus it is not altered.

5.3.2 Update

Inserting an event into the Balloon is essentially equivalent to the procedure described in Section 2.3.2. The one difference is that the final step is redefined as follows: the index i is inserted into the SMT with key H(k). Hence, considering that setup, refresh, and update merely use the hash treap via the update procedure, these algorithms remain unaltered.

5.3.3 Query and Verify

Disregarding the role of the history tree, querying for (non-)membership with respect to version v of the Balloon is the process of looking for index i in the hash treap and proving either its existence or absence. Thus, denoting by ℓ the leaf in the SMT with key H(k), the following adjustments are required:

– The first part of the proof consists of a (non-)membership proof for ℓ;

– The query result is false if ℓ is empty;

– The query result is false if ℓ is non-empty and the retrieved index i is included after version v;

– The query result is true if ℓ is non-empty and the retrieved index i is in the first v versions of the Balloon.

In all of these scenarios, a (non-)membership proof in the SMT is generated and returned as the proof of absence or existence of the index i. Considering that hashes are encoded differently for hash treaps and SMTs, the verify algorithm must be redefined to properly check the validity of a (non-)membership proof. For that purpose it uses Algorithm 5, and no further changes are necessary.
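As a compact summary of the query outcome described by the adjustments above, the decision can be reduced to the following C++ sketch. It assumes an illustrative look-up interface that returns the stored index of ℓ when the leaf is non-empty, and that event indices are numbered so that index i belongs to the first v versions exactly when i <= v; neither assumption is taken from the actual algorithms.

#include <cstdint>
#include <optional>

// True exactly when the queried key is a member of the version-v Balloon:
// the SMT leaf must be non-empty and its stored index must not postdate v.
bool query_result(const std::optional<std::uint64_t>& leaf_index, std::uint64_t v) {
    if (!leaf_index) return false;   // leaf is empty: proven non-membership
    return *leaf_index <= v;         // index within the first v versions
}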

5.3.4 Pruned Algorithms

Since query (Prune) and verify (Prune) only rely on subroutines that have already been covered, these algorithms remain unaltered. A minor adjustment to the update* algorithm is necessary, however. Instead of using the operations defined for the hash treap, it must refer to Definition 4 when processing pruned representations.

5.4 Evaluation of the New Balloon

The main issues with the hash treap are implementation complexity and O(n) worst-case behaviour. The former is addressed by the SMT using succinct recursions, and the latter by choosing a suitable oracle model (e.g., the branch or the BD-hybrid). Thereby the efficiency of the new Balloon does not depend on probabilistic balancing, and it can also trade time for space using different oracle models. Moreover, the expected proof size is preserved as O(log n), but in the worst case the SMT offers a tighter bound than that of the hash treap. This follows from the observation that a sparse Merkle audit path cannot include more than a fixed amount of hashes, whereas the hash treap is linearly bounded.

6 Conclusion

This dissertation proposed a general approach that simulates a SMT. It uses the concept of relative information to attain an efficient representation, and it is maintained using an abstract oracle model. The memoryless, perfect, and depth oracles are not practical due to consuming too much time, requiring excessive amounts of memory, and having an unfortunate worst case behaviour, respectively. The branch oracle, however, provides an efficient representation that is also resilient against an adversary with the capability of inserting values into leaves with arbitrary indices. Similarly, the BD-hybrid oracle offers the same resilience while also improving efficiency in terms of time. This comes at the price of consuming roughly twice as much memory, which is a good example of how an abstract oracle model can be used to trade time for space.

Using either a branch or a BD-hybrid oracle, the performance benchmarks show that insertion, removal, and generation of (non-)membership proofs require time O(log n). In order to achieve that, the oracles use space O(n). The expected size of a non-membership proof is O(log n), and in the worst case it is bounded by the depth of the SMT. Proofs are always verified in time O(1), notably regardless of their size.

Apart from formally defining and evaluating the proposed representation of the SMT, this dissertation outlined how to use it in the context of Balloon. The new Balloon has preserved time and space complexities in expected cases, and in worst cases it is improved. As such, a SMT is a viable option when implementing Balloon.

6.1 Project Evaluation

During this project I have had the chance to practice project planning, problem solving, reading academic research papers, and writing a larger document. I used the suggested work flow where one starts to write the dissertation as soon as possible, and in the end that worked out without any major issues. It did necessitate that some of the earlier parts be rewritten later, but considering that writing is generally the process of rewriting that is to be expected.

In terms of how I spent my time, the objective throughout the entire project was to work continuously. The initial time plan was a little bit too optimistic, however. Therefore the work rate was increased during the last two months, which allowed me to complete the project in time. The next time I am in the position of doing a similar project, I expect that my initial time plan will be better. I would also like to try a different work flow, not because the one used here was bad, but because it is generally a good idea to widen your perspective. Personally, I also view it as more natural to postpone the formal write-up until one is deeper into the project.

6.2 Future Work

Currently there is no recursion capable of handling batch inserts. Instead of inserting values one at a time, it would be more efficient to use one recursion that inserts all values. Not only would that reduce the number of reads from and writes to the relative information, but it would also minimize the number of visited nodes. Considering that splitting is a costly operation, this could be a key recursion to add. Similarly, it may be interesting to consider batch removes and batch queries.

The oracles presented thus far serve the main purpose of proving the proposal efficient. The next step is to consider other oracles that trade time versus space more efficiently. These may either be based on entirely new ideas, or build on top of those provided here. For instance, how is time affected by caching every other level in a depth-based oracle? It would also be interesting to consider persistent and/or consistent representations of the SMT. It appears to be an open problem whether any lexicographically sorted data structure supports incremental proofs [24].

References

[1] Elena Andreeva, Bart Mennink, and Bart Preneel. Open problems in hash function security. Designs, Codes and Cryptography, 77(2-3):611–631, 2015.

[2] Cecilia R. Aragon and Raimund G. Seidel. Randomized search trees. In 30th Annual Symposium on Foundations of Computer Science, pages 540–545. IEEE, 1989.

[3] Matthias Bauer. Proofs of zero knowledge. arXiv preprint cs/0406058, 2004.

[4] Guy E. Blelloch and Margaret Reid-Miller. Fast set operations using treaps. In Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, pages 16–26. ACM, 1998.

[5] Manuel Blum, Will Evans, Peter Gemmell, Sampath Kannan, and Moni Naor. Checking the correctness of memories. Algorithmica, 12(2-3):225–244, 1994.

[6] Scott A. Crosby. Efficient tamper-evident data structures for untrusted servers. PhD thesis, Rice University, 2010.

[7] Scott A. Crosby and Dan S. Wallach. Efficient data structures for tamper-evident logging. In USENIX Security Symposium, pages 317–334, 2009.

[8] Scott A. Crosby and Dan S. Wallach. Authenticated dictionaries: Real-world costs and trade-offs. ACM Transactions on Information and System Security (TISSEC), 14(2):17, 2011.

[9] Whitfield Diffie and Martin E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, 22(6):644–654, 1976.

[10] Carla S. Ellis. Concurrent search and insertion in AVL trees. IEEE Transactions on Computers, 100(9):811–817, 1980.

[11] Shafi Goldwasser, Silvio Micali, and Ronald L. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing, 17(2):281–308, 1988.

[12] Donald E. Knuth. The Art of Computer Programming: Fundamental Algorithms, volume 1. Addison-Wesley, 1972.

[13] Donald E. Knuth. The Art of Computer Programming: Sorting and Searching, volume 3. Pearson Education, 1998.

[14] Paul C. Kocher. On certificate revocation and validation. In Financial Cryptography, pages 172–177. Springer, 1998.

[15] Ben Laurie. Certificate transparency. Queue, 12(8):10, 2014.

[16] Ben Laurie and Emilia Kasper. Revocation transparency. Google Research, September 2012.

[17] Ben Laurie, Adam Langley, and Emilia Kasper. Certificate transparency. RFC 6962, June 2013.

[18] Ralph C. Merkle. A digital signature based on a conventional encryption function. In Advances in Cryptology–CRYPTO 1987, pages 369–378. Springer, 1987.

[19] Ian Miers, Christina Garman, Matthew Green, and Aviel D. Rubin. Zerocoin: Anonymous distributed e-cash from bitcoin. In Security and Privacy (SP), 2013 IEEE Symposium on, pages 397–411. IEEE, 2013.

[20] Andrew Miller, Michael Hicks, Jonathan Katz, and Elaine Shi. Authenticated data structures, generically. In ACM SIGPLAN Notices, volume 49, pages 411–423. ACM, 2014.

[21] . Bitcoin: A peer-to-peer electronic cash system, 2008.

[22] Moni Naor and Kobbi Nissim. Certificate revocation and certificate update. IEEE Journal on Selected Areas in Communications, 18(4):561–570, 2000.

[23] Charalampos Papamanthou, Roberto Tamassia, and Nikos Triandopoulos. Optimal verification of operations on dynamic sets. In Advances in Cryptology–CRYPTO 2011, pages 91–110. Springer, 2011.

[24] Tobias Pulls and Roel Peeters. Balloon: A forward-secure append-only persistent authenticated data structure. In Computer Security–ESORICS 2015, pages 622–641. Springer, 2015.

[25] Mark D. Ryan. Enhanced certificate transparency and end-to-end encrypted mail. In NDSS, 2014.

[26] Claude E. Shannon. Programming a computer for playing chess. The London, Edin- burgh, and Dublin Philosophical Magazine and Journal of Science, 41(314):256–275, 1950.

[27] Roberto Tamassia. Authenticated data structures. In Algorithms-ESA 2003, pages 2–5. Springer, 2003.

[28] Ron Wein. Efficient implementation of red-black trees with split and catenate operations. Technical report, Tel-Aviv University, 2005.

A Memory Requirements for the BD-Hybrid Oracle

Theorem 5 (BD-Hybrid Oracle). Oracle 5 records at most 2n − 1 non-default hashes.

Proof. The idea is to use a greedy algorithm that inserts n values into the SMT. On each insertion, the aim is to have the maximum amount of non-default hashes added to the relative information. The algorithm is described below, and it is proven sufficient by considering what happens if one were to deviate from the set of greedy solutions.

The first value is arbitrarily associated with a leaf x, and the depth oracle records d non-default hashes. Referring to the level below the root, the next insertion can be performed either in the same subtree or the opposite subtree with respect to the first value. The locally best solution is to record d − 1 hashes in the empty subtree, as any insertion in the non-empty subtree produces at most d − 2 new hashes. During the next 2^1 insertions, the same greedy approach can record at most d − 2 hashes per insertion, and is achieved by choosing leaves such that the paths deviate from the already recorded hashes after the first two levels. Similarly, the succeeding 2^2 insertions can at most record another d − 3 new hashes per insertion, and so on. Eventually the point where d − log n = 1 is reached, and the remaining n/2 insertions will cause either the depth or the branch oracle to record at most one hash. Hence, the final result is the same regardless of whether the branch or the depth oracle does the recording. Choosing to let the depth oracle do the work implies that the branch oracle records nothing. Therefore one of the solutions provided by the greedy algorithm is equivalent to the upper bound of the depth oracle, and the analysis reduces to Theorem 4.

The final part of the proof is to prove the greedy algorithm sufficient. In other words, it must be shown that it is not possible to record more hashes when also utilising the branch oracle. In order to do just that, consider what happens if the branch oracle records a hash that would not be covered by the depth oracle. Clearly it must be in the sparse part of the SMT, meaning that the branch is formed below level d. Furthermore, it records at most a single hash per insertion. Such manoeuvres may be performed multiple times, and they come at the cost of not utilising the depth oracle. The final result will either be a subset of the greedy solutions, or there will be at least one possible insertion left that would have caused the depth oracle to record multiple hashes. In the latter case, many hashes have been traded for a single hash, which is never worthwhile. As such, the greedy algorithm is optimal and the upper limit for the BD-hybrid oracle is 2n − 1.
