Twistedpair: Towards Practical Anonymity in the Bitcoin P2P Network
Total Page:16
File Type:pdf, Size:1020Kb
TwistedPair: Towards Practical Anonymity in the Bitcoin P2P Network Abstract—Recent work has demonstrated significant anonymity vulnerabilities in Bitcoin’s networking stack. In particular, the current mechanism for broadcasting Bitcoin transactions allows third-party observers to link transactions to the IP addresses that originated them. This lays the groundwork for low-cost, large-scale deanonymization attacks. In this Fig. 1: Supernodes can observe flooding metadata to infer work, we present TwistedPair, a first-principles, theoretically- which node was the source of a transaction message (tx). justified defense against large-scale deanonymization attacks. TwistedPair is lightweight, scalable, and completely interoperable with the existing Bitcoin network. We evaluate TwistedPair through experiments on Bitcoin’s mainnet to demonstrate with an overview of Bitcoin’s P2P network, and explain why its interoperability with the current network, as well as low it enables deanonymization attacks. broadcast latency overhead. A. Bitcoin’s P2P Network I. INTRODUCTION Bitcoin nodes are connected over a P2P network of TCP links. This network is used to communicate transactions, the Anonymity is an important property for a financial system, blockchain, and control packets, and it plays a crucial role in especially given the often-sensitive nature of transactions [15]. maintaining the network’s consistency. Each peer is identified Unfortunately, the anonymity protections in Bitcoin and similar by its (IP address, port) combination. Whenever a node gen cryptocurrencies can be fragile. This is largely because Bitcoin erates a transaction, it broadcasts a record of the transaction users are identified by cryptographic pseudonyms (a user can over the P2P network; critically, transaction messages do not have multiple pseudonyms). When a user Alice wishes to include the sender’s IP address—only their pseudonym. Since transfer funds to another user Bob, she generates a transaction the network is not fully-connected, transactions are relayed message that includes Alice’s pseudonym, the quantity of funds according to epidemic flooding [37]. This ensures that all transferred, the prior transaction from which these funds are nodes receive the transaction and can add it to the blockchain. drawn, and a reference to Bob’s pseudonym [35]. The system- Hence, the broadcasting of transactions enables the network to wide sequence of transactions is recorded in a public, append- learn about transactions quickly and reliably. only ledger known as the blockchain. The public nature of the blockchain means that users only remain anonymous as long However, the broadcasting of transactions can also have as their pseudonyms cannot be linked to their true identities. negative anonymity repercussions. The statistical symmetry of Bitcoin’s current broadcasting mechanism spreads content This mandate has proved challenging to uphold in practice. isotropically over the graph; this allows adversarial peers Several vulnerabilities have enabled researchers, law enforce who observe the spreading dynamics of a given transaction ment, and possibly others to partially deanonymize users [12]. to infer the source IP of each transaction. For example, in Publicized attacks have so far included: (1) linking different recent attacks [10], [28], researchers launched a supernode public keys that belong to the same user [31], (2) associating that connected to all nodes in the P2P network (Figure 1), users’ public keys with their IP addresses [10], [28], and in and logged all observed traffic. This allowed the supernode some cases, (3) linking public keys to human identities [21]. to observe the spread of each transaction over the network Such deanonymization exploits tend to be cheap, easy, and over time, and ultimately infer the source IP. Since transaction scalable [31], [10], [28]. messages include the sender’s pseudonym, the supernodes were able to deanonymize users, or link their pseudonyms to Although researchers have traditionally focused on the an IP address [10], [28]. Such deanonymization attacks are privacy implications of the blockchain [36], [41], [31], we problematic because of Bitcoin’s transparency: once a user is are interested in lower-layer vulnerabilities that emerge from deanonymized, her other transactions can often be linked, even Bitcoin’s peer-to-peer (P2P) network. Recent work has demon if she creates fresh pseudonyms for each transaction [31]. strated P2P-layer anonymity vulnerabilities that allow trans actions to to be linked to users’ IP addresses with accu There have been recent proposals for mitigating these racies up to 30% [10], [28]. Understanding how to patch vulnerabilities, included broadcasting protocols that reduce the these vulnerabilities without harming utility remains an open symmetry of epidemic flooding. Bitcoin Core [5], the most question. The goal of our work is to propose a practical, commonly-used Bitcoin implementation, adopted a protocol lightweight modification to Bitcoin’s networking stack that called diffusion, where each node spreads transactions with provides theoretical anonymity guarantees against the types independent, exponential delays to its neighbors on the P2P of attacks demonstrated in [10], [28], and others. We begin graph. Diffusion is still in use today. However, proposed solutions (including diffusion) tend to be heuristic, and recent suggesting that transactions by the same user can be linked, work shows that they do not provide sufficient anonymity even if the user adopts different addresses [31]. In response, protection [19]. Other proposed solutions offer theoretical researchers proposed alternative cryptocurrencies and/or tum anonymity guarantees [13], but do so under idealistic assump blers that provide anonymity at the blockchain level [29], tions that are unlikely to hold in practice. The aim of this [4], [43], [42], [25], [26]. It was not until 2014 that re work is to propose a broadcasting mechanism that (a) provides searchers turned to the P2P network, showing that regardless provable anonymity guarantees under realistic adversarial and of blockchain implementation, users can be deanonymized network assumptions, and (b) does not harm the network’s by network attackers [28], [10], [11], [19]. In some cases broadcasting robustness or latency. We do this by revisiting researchers were able to link transactions to IP addresses the DANDELION system proposed in [13], and redesigning it with accuracies up to 30% [10]. These attacks proceeded by to withstand a variety of practical threats. connecting a supernode to a majority of active Bitcoin server nodes. More recently, [8] demonstrated the serious anonymity B. Contributions and routing risks posed by an AS-level attacker. These papers suggest a need for networking protocols that defend against The main contributions of this paper are threefold: deanonymization attacks. (1) We identify several key idealistic assumptions made by Anonymous communication for P2P/overlay networks has DANDELION [13], and show how anonymity is degraded when been an active research topic for decades. Most work relies on those assumptions are violated. In particular, [13] assumes an two ideas: randomized routing (e.g., onion routing, Chaumian honest-but-curious adversary that has limited knowledge of mixes) and/or dining cryptographer (DC) networks. Systems the P2P graph topology and only observes one transaction that use DC nets [14] are typically designed for broadcast per node. If adversaries are instead malicious and collect communication, which is our application of interest. However, more information over time, we show that they are able to DC nets are known to be inefficient and brittle [24]. Proposed arbitrarily weaken the anonymity guarantees of [13] through systems [23], [49], [16], [48] improved these properties sig a combination of attacks, including side information, graph nificantly, but DC networks never became scalable enough to manipulation, black hole, and intersection attacks. enjoy widespread adoption in practice. (2) We propose a new protocol called TWISTEDPAIR1 that Systems based on randomized routing are generally more subtly changes most of the implementation choices of DAN efficient, but focus on point-to-point communication (though DELION, from the graph topology to the randomization mech these tools can be adapted for broadcast communication). Early anisms for message forwarding. Mathematically, these (rel works like Crowds [40], Tarzan [20], and P5 [45] paved the atively small) algorithmic changes completely change the way for later practical systems, such as Tor [18] and I2P anonymity analysis by exponentially augmenting the problem [50], as well as recent proposals like Drac [17], Pisces [34], state space. Using analytical tools from the theory of Galton- and Vuvuzela [46]. Our work differs from this body of work Watson trees and random processes over graphs, we show that along two principal axes: (1) usage goals, and (2) analysis TWISTEDPAIR offers significant anonymity gains over DAN metrics/results. DELION, both theoretically and in simulation, when adversaries (1) Usage goals. have stronger capabilities. TWISTEDPAIR is currently being Among tools with real-world adoption, Tor considered for integration in Bitcoin Core.2 [18] is the most prominent; privacy-conscious Bitcoin users frequently use it to anonymize transmissions. However, Tor (3) We demonstrate the practical feasibility of TWISTED requires users to route their network traffic through a