Equihash: Asymmetric Proof-Of-Work Based on the Generalized Birthday Problem (Full Version)
Total Page:16
File Type:pdf, Size:1020Kb
Equihash: Asymmetric Proof-of-Work Based on the Generalized Birthday Problem (Full version) Alex Biryukov Dmitry Khovratovich University of Luxembourg University of Luxembourg [email protected] [email protected] Abstract—Proof-of-work is a central concept in modern is given to “owners” of large botnets, which nowadays often cryptocurrencies and denial-of-service protection tools, but the accommodate hundreds of thousands of machines. For prac- requirement for fast verification so far made it an easy prey for tical DoS protection, this means that the early TLS puzzle GPU-, ASIC-, and botnet-equipped users. The attempts to rely on schemes [9], [20] are no longer effective against the most memory-intensive computations in order to remedy the disparity powerful adversaries. between architectures have resulted in slow or broken schemes. a) Memory-hard computing: In this paper we solve this open problem and show how to In order to remedy the construct an asymmetric proof-of-work (PoW) based on a compu- disparity between the ASICs and regular CPUs, Dwork et al. tationally hard problem, which requires a lot of memory to gen- first suggested memory-bound computations [4], [23], where a erate a proof (called ”memory-hardness” feature) but is instant random array of moderate size is accessed in pseudo-random to verify. Our primary proposal Equihash is a PoW based on the manner to get high bandwidth. In the later work [25] they generalized birthday problem and enhanced Wagner’s algorithm suggested filling this memory with a memory-hard function for it. We introduce the new technique of algorithm binding to (though this term was not used) so that the memory amount can prevent cost amortization and demonstrate that possible parallel be reduced only at the large computational cost to the user1. implementations are constrained by memory bandwidth. Our As memory is a very expensive resource in terms of area and scheme has tunable and steep time-space tradeoffs, which impose the amortized chip cost, ASICs would be only slightly more large computational penalties if less memory is used. efficient than regular x86-based machines. Botnets remain a Our solution is practical and ready to deploy: a reference problem, though on some infected machines the use of GBytes implementation of a proof-of-work requiring 700 MB of RAM of RAM, will be noticeable to users. One can also argue runs in 15 seconds on a 2.1 GHz CPU, increases the computations that the reduced ASIC advantages may provide additional by the factor of 1000 if memory is halved, and presents a proof incentives for botnet creators and thus reduce the security of of just 120 bytes long. an average Internet users [19], [33]. Keywords: Equihash, memory-hard, asymmetric, proof-of- No scheme in [23], [25] has been adapted for the practical work, client puzzle. use as a PoW. Firstly, they are too slow for reasonably large amount of memory, and must use too little memory if I. INTRODUCTION required to run in reasonable time (say, seconds). The first Request of intensive computations as a countermeasure memory-hard candidates [25] were based on superconcentra- against spam was first proposed by Dwork and Naor in [24] tors [47] and similar constructions explored in the theory of and denial of service (DoS) protection in the form of TLS pebbling games on graphs [31]. To fill N blocks in mem- client puzzle by Dean and Stubblefield [21]. Amount of work ory a superconcentrator-based functions make N log N hash is certified by a proof, thus called proof-of-work, which is function calls, essentially hashing the entire memory dozens feasible to get by an ordinary user, but at the same time slows of times. Better performance is achieved by the scrypt func- down multiple requests from the single machine or a botnet. tion [46] and memory-hard constructions among the finalists Perhaps the simplest scheme is Hashcash [9], which requires of the Password Hashing Competition [3], but their time-space a hash function output to have certain number of leading zeros tradeoffs have been explored only recently [5], [6], [17]. and is adapted within the Bitcoin cryptocurrency. Nowadays, For example, a superconcentrator-based function using 68 to earn 25 Bitcoins a miner must make an average of 2 calls N = 225 vertices of 32 bytes each (thus taking over 1 GB to a cryptographic hash function. of RAM) makes O(N log N) or cN with large c calls to the Long before the rise of Bitcoin it was realized [23] that hash function (the best explicit construction mentioned in [26] the dedicated hardware can produce a proof-of-work much makes 44N calls), thus hashing the entire memory dozens of faster and cheaper than a regular desktop or laptop. Thus the time. users equipped with such hardware have an advantage over Secondly, the proposed schemes (including the PHC con- others, which eventually led the Bitcoin mining to concentrate in a few hardware farms of enormous size and high electricity 1The actual amount of memory in [25] was 32 MB, which is not that “hard” consumption. An advantage of the same order of magnitude by modern standards. structions) are symmetric with respect to the memory use. To is no evidence that the proposed cycle-finding algorithm is initialize the protocol in [23], [25], a verifier must use the same optimal, and its amortization properties are unknown. Finally, amount of memory as the prover. This is in contrast with the Andersen’s tradeoff allows to parallelize the computations Bitcoin proof-of-work: whereas it can be checked instantly independently thus reducing the time-memory product and the (thus computational asymmetry) without precomputation, vir- costs on dedicated hardware. tually no memory-hard scheme offers memory asymmetry. Thus the verifier has to be almost as powerful as the prover, and Finally, the scheme called Momentum [40] simply looks may cause DoS attacks by itself. In the cryptocurrency setting, for a collision in 50-bit outputs of the hash function with fast verification is crucial for network connectivity. Even one 26-bit input. The designer did not explore any time-space of the fastest memory-hard constructions, scrypt [46], had to tradeoffs, but apparently they are quite favourable to the be taken with memory parameter of 128 KB to ensure fast attacker: reducing the memory by the factor of q imposes verification in Litecoin. As a result, Litecoin is now mined on only pq penalty on the running time [53] (more details in ASICs with 100x efficiency gain over CPU [39]. Appendix A). Finally, these schemes have not been thoroughly analyzed c) Our contributions: We propose a family of fast, for possible optimizations and amortizations. To prove the memory-asymmetric, optimization/amortization-free, limited work, the schemes should not allow any optimization (which parallelism proofs of work based on hard and well-studied adversaries would be motivated to conceal) nor should the computational problems. First we show that a wide range of computational cost be amortizable over multiple calls [24]. hard problems (including a variety of NP-complete problems) can be adapted as an asymmetric proof-of-work with tunable A reasonably fast and memory-asymmetric schemes would parameters, where the ASIC and botnet protection are deter- become a universal tool and used as an efficient DoS coun- mined by the time-space tradeoffs of the best algorithms. termeasure, spam protection, or a core for a new egalitarian cryptocurrency. Our primary proposal Equihash is the PoW based on b) Recent asymmetric candidates: There have been the generalized birthday problem, which has been explored two notable attempts to solve the symmetry and performance in a number of papers from both theoretical and implemen- problems. The first one by Dziembowski et al. [26] suggests an tation points of view [15], [16], [35], [42], [54]. To make it interactive protocol, called a proof-of-space, where the prover amortization-free, we develop the technique called algorithm first computes a memory-hard function and then a verifier binding by exploiting the fact that Wagner’s algorithm carries requests a subset of memory locations to check whether they its footprint on a solution. have been filled by a proper function. The verification can thus be rather quick. However, the memory-hard core of the scheme In our scheme a user can independently tune time, memory, is based on a stack of superconcentrators and is quite slow: to and time-memory tradeoff parameters. In a concrete setting, fill a 1 GB of memory it needs about 1 minute according our 700 MB-proof is 120 bytes long and can be found in to the performance reports in [45]. The scheme in [26] is 15 seconds on a single-thread laptop with 2.1 GHz CPU. Adversary trying to use 250 MB of memory would pay not amortization-free: producing N proofs costs as much as producing one. As a result, a memory-hard cryptocurrency 1000-fold in computations using the best tradeoff strategy, whereas a memoryless algorithm would require prohibitive Spacecoin [45] built on proofs-of-space requires miners to 75 precommit the space well before the mining process, thus 2 hash function calls. These properties and performance are making the mining process centralized. We also note that unachievable by existing proposals. We have implemented and the time-space tradeoff is explored for these constructions for tested our scheme in several settings, with the code available memory reductions by a logarithmic factor (say, 30 for 1 GB) at request. Equihash can be immediately plugged into a and more, whereas the time increases for smaller reductions cryptocurrency or used as a TLS client puzzle.