1

DSMIX: A Dynamic Self-organizing Mix Anonymous System Renpeng Zou∗, Xixiang Lv∗, ∗School of Cyber Engineering, Xidian University, Xian 710071, China

Abstract—Increasing awareness of privacy-preserving has led There are, however, concerns about the lack of efficiency to a strong focus on anonymous systems protecting anonymity. and security in anonymous systems [6], [7], as explained By studying early schemes, we summarize some intractable below: problems of anonymous systems. Centralization setting is a universal problem since most anonymous system rely on central proxies or presetting nodes to forward and mix messages, which • Centralization. Most of anonymous systems, i.e. compromises users’ privacy in some way. Besides, availability Anonymizer [8], LPWA [9] and cMix [10], rely on central becomes another important factor limiting the development of proxies or preset nodes to hide the address information, anonymous system due to the large requirement of additional ad- then still use these proxies to blind and forward messages. ditional resources (i.e. bandwidth and storage) and high latency. Moreover, existing anonymous systems may suffer from different The centralized anonymous systems bring privacy leakage attacks including abominable Man-in-the-Middle (MitM) attacks, risks to users since the service providers can control Distributed Denial-of-service (DDoS) attacks and so on. In this all proxies and mix node to infer the users’ identities. context, we first come up with a BlockChain-based Mix-Net What’s worse, the public proxies or preset nodes are (BCMN) protocol and theoretically demonstrate its security and easily exposed to attackers. For instance, an attacker can anonymity. Then we construct a concrete dynamic self-organizing BlockChain-based MIX anonymous system (BCMIX). In the launch distributed denial-of-service attacks to block one system, users and mix nodes utilize the blockchain transactions or more proxies and thus crash the system. and their addresses to negotiate keys with each other, which • Availability. Efficiency and additional resources are can resist the MitM attacks. In addition, we design an IP the main factors affecting the availability of anony- sharding algorithm to mitigate Sybil attacks. To evaluate the mous systems. High-latency anonymous systems such BCMIX system, we leverage the distribution of mining pools in the real world to test the system’s performance and ability as OneSwarm [11] and A3 [12], though provide high to resistant attacks. Compared with other systems, BCMIX anonymity, are not well suited for practical utilization provides better resilience to known attacks, while achieving low because of the poor efficiency in terms of intolerable la- latency anonymous communication without significant bandwidth tency [11]. In order to hide the identities of the recipients, or storage resources. multicast/broadcast and peer-to-peer based anonymous Index Terms—Anonymous systems, blockchain, anonymity, systems consume more bandwidth resources to cover the self-organizing, mix network attacks. normal traffics [13]. In spite of effectiveness, users may not be willing to contribute a lot of bandwidth, which I.INTRODUCTION hinders the development of such anonymous systems. In EEPING communication private has become increasing addition, some anonymous systems, such as cMix, adopt K important in an era of mass surveillance and carriers- additional inspection schemes to identify the malicious sponsored attacks. Recently, many events about private data nodes, which would place an additional burden on users leakage in Online Social Networks (OSNs) [1], mobile net- and decrease message utilization. arXiv:2009.11546v1 [cs.CR] 24 Sep 2020 work service (Uber, Didi Chuxing) [2] and telephone commu- • Security. Along the research line about security, anony- nications [3] have occurred frequently. Thus it is often the case mous systems based on different mechanism may suffer that two parties want to communicate anonymously, which from different security issues. Some anonymous systems means to exchange messages while hiding the fact that they based on MIX technologies rely on fixed cascade nodes to are in conversation. mix and forward messages. Such systems are vulnerable In this context, the anonymous communication technology to collusion-tagging attacks which is hard to detect. Re- emerges as a critical topic. Aiming to preserve communica- routing or proxy forwarding based anonymous systems, tion privacy within the shared public network environment, such as [14] and Tarzan [15], deliver messages anonymous communication mainly focus on how to hide the through nodes or proxies randomly selected from clusters, identities or address information of one side or both sides hence an eavesdropper can perform traffic analysis attacks in communications. Since the seminal work by Chaum [4] and destroy the anonymity. As for multicast/broadcast for anonymous communication, more than seventy anonymous based anonymous system, an attacker can masquerade systems have been proposed, based on different anonymous as the benign recipients to intercept the messages. With mechanism [5]. Generally, anonymous systems can be di- respect to the P2P based anonymous systems, an attacker vided into the following sub-types, mix re-encryption, mul- can create multiple identities to launch a Sybil attack, ticast/broadcast, mix multi-layer encryption and peer-to-peer. which allows the attacker to analyze the forwarding 2

path and impersonate the recipient to receive the mes- protocols or other schemes utilized in anonymous systems. In sages. Moreover, in anonymous systems applying the case, we intend to construct an anonymous system which exchange schemes, an attacker can employ Man-in-the- can resist the attacks mentioned above. Middle (MitM) attacks [16] to undermine the security of Solution 3. Through the former schemes we found mix these systems. technology can defend against most kind of attacks on anony- mous systems except bring the centralization problem and tagging attacks. To mitigate the weakness, our first con- A. Solutions and Contributions sideration is combining blockchain and mix technology, as Motivated by these identified limitations, we combine mentioned in Solution 1. Unfortunately, the introduction of blockchain technology with mix network, and design a dy- blockchain raises the Sybil attacks into the anonymous system. namic self-organizing blockchain-based mix anonymous sys- By researching the properties of Sybil attacks we structure tem. We expect to dispose of the following challenges. PoW voting and IP sharding algorithms to mitigate the impact Challenge 1. Designing a decentralized self-organizing of Sybil attacks. For MitM attacks, we design a transaction- anonymous system. The first challenge we seek to address based key exchange scheme which makes use of the Bitcoin’s is the centralization issues. As we mention above, the central transaction propagation mechanism to break the single-channel proxies or preset nodes might pry into users’ private data and control of attackers. The detailed illustration is provided in reveal the true identities. Therefore, we intend to construct a Section V and Section VI. decentralized anonymous system in which the mix nodes are To summarize, the contributions of this paper are as follows. dynamic and self-organizing. • We propose a blockchain-based mix-net protocol Solution 1. When it comes to decentralization, the most (BCMN) whose security and anonymity are demonstrated popular technology is blockchain, an emerging decentralized theoretically. Especially, we elect miners as mix nodes via architecture and distributed computing paradigm underlying special algorithms to avoid the centralized settings. Bitcoin [17] and other . Leveraging the dy- • We construct a dynamic self-organizing blockchain-based namic and decentralized properties of blockchain miners, we mix anonymous system (upon the proposed BCMN proto- devise voting algorithms to elect mix nodes from blockchain col) and discuss how the proposal can satisfy the security miners. This trustless and distributed design not only prevents requirements. privacy leakage from service provides, but also mitigates • We demonstrate the feasibility and effectiveness of the single point failure and DDos attacks. proposed BCMIX by developing the system in an analog Challenge 2. Designing a user friendly anonymous network with the miner distribution data in the real world. system. Another challenge is to construct an anonymous Compared with existing systems, our system performs system with high availability. In fact, users are often reluctant well and provides stronger security. to consume more additional resources (i.e. bandwidth and computing resources) and wait for a long time. Thus we are B. Related Works drove to build a high available anonymous system with less Generally, anonymous communication systems can be di- additional resources. vided into the following sub-types, mix re-encryption, mul- Solution 2. Our proposed solution is inspired by cMix. ticast/broadcast, mix multi-layer encryption and peer-to-peer. Similar to cMix, we also split the time-consuming, compli- Mix re-encryption based anonymous systems [18], [19] lever- cated public key operations with the real time phase. The age technologies to dispose messages and hide difference is that cMix adopts fixed mix nodes with a stable the users’ identities, which can resist traffic analysis attacks. joint public key while our approach leverages the dynamic But with the utilizing of public key cryptography, mix re- blockchain miners to compete for mix nodes, which brings a encryption based anonymous systems is expensive and easy problem that the elected mix nodes (miners) have to negotiate to waste resources. To settle the problem, Chaum et.al [10] a joint public key in each round. To avoid interacting with proposed a anonymous system called cMix in 2017. Through other mix nodes in each round, we firstly propose a revised a precomputation, the core cMix protocol eliminates all expen- additive homomorphism mix-net protocol. Then we combine sive real time public-key operationsat the senders, recipients the protocol with Bitcoin account schemes, such that the and mix nodes, thereby decreasing real time cryptographic elected mix nodes can calculate the system public key in a latency and lowering computational costs for clients. The non-interactive manner. It is worth noting that no additional core real time phase performs only a few fast modular resources are required in BCMIX other than the miners’ multiplications. The authors claim that cMix can detect the computing power for solving Bitcoin puzzles. malicious nodes by utilizing Random Partial Checking (RPC) Challenge 3. Designing a secure anonymous system. The and commitment scheme. most intractable challenge is building a secure anonymous Multicast/broadcast based anonymous communication sys- system. Based on different principles, anonymous systems tems [20], [21] achieve anonymity through one-to-many com- are assailable to disparate attacks including internal attacks munications among hosts. This method is expensive and inef- and external attacks. The internal attacks , i.e. traffic analysis ficient for non-broadcast networks. In the case of large scale attacks, DDos attacks and so on, are caused by the design networks, an attacker can easily masquerade as the recipient to principles of systems while the external attacks, such as intercept the message, which further increases the computation MitM attacks and Sybil attacks, arise from the cryptographic and communication overhead required for authentication. 3

As for mix multi-layer encryption based anonymous sys- Definition 1. Suppose p is a prime and Ep(a, b) is an elliptic tems [22] [14] [23], one or more proxies are selected from curve. For the two points G and Q on the elliptic curve, they the cluster, and forward the messages from the former nodes satisfy Q = kG. It can be proved that it is easier to calculate and then the messages in a confusing order. The technology Q from k and G. However, it is difficult to calculate k from can achieve low-latency communications under the premise of Q and G [29]. ensuring efficiency, but is vulnerable to analyzing attacks such The security of ECC is based on Elliptic Curve Discrete as traffic analysis attacks and sniper attacks [24]. Logarithm Problem (ECDLP) which is consider to be compu- The rapid development of peer-to-peer (P2P) networks tationally infeasible to solve. drives the research of anonymous communication technology in P2P network environment [25]. In P2P based anonymous ECDH Key Exchange. Elliptic Curve Diffie-Hellman (ECDH) key exchange is the elliptic cuive analogue of the systems [26], [27] nodes enjoy anonymous services, and ∗ provide anonymous services for other nodes in their spare classical Diffie-Hellman key exchange operating in Zp. We time. Because the P2P network itself has a high degree of describe two communicating parties, usually called Alice and self-organization and disorder, and the number of members Bob, establish a shared secret key in secure communication is large, the P2P network can also provide a high degree channel as follows. We assume that Alice and Bob use the of anonymity when the attacker has a huge attack resource. same set of domain parameters D := (p, a, b, G, n, h) for their However, because of its openness and anonymity, the attacker computations. can control a large number of zombie nodes to launch witch • Alice generates an ephemeral key pair (kA,QA), i.e. attacks, and can disguise as normal nodes for traffic analysis, he/she generates a random number kA in [1, n − 1] thereby destroying the system’s anonymity without being and then performs a scalar multiplication to get the noticed. corresponding public key QA = kA ·G. Then Alice sends QA to Bob. • Bob generates an ephemeral key pair (k ,Q ) with C. Roadmap B B (QB = kB · G) in the same way and sends QB to Alice. The rest of this paper is organized as follows. In Section • After Alice receives QB, he/she performs a scalar multi- II, we review some preliminaries and propose an attacks plication to obtain the shared secret S = kA · QB. against cMix. In Section III, we illustrate the security model • After Bob receives QA from Alice, he/she obtains the and requirements. Nest, we detail the BCMN protocol and shared secret through computation of S = kB · QA the proposed BCMIX in Section IV. Then we evaluate the The security of the ECDH protocol relies on the intractabil- performance and demonstrate security respectively in Section ity of (computational) Elliptic Curve Diffie-Hellman Problem V and Section VI. Finally, this paper is concluded in Section (ECDHP). That is, given an elliptic curve Ep(a, b), a base VII. point G ∈ E(FP ), and two points QA = kA · G and QB = kB ·G, find the point S = kA ·kB ·G without knowledge II.PRELIMINARIESAND PROPOSED ATTACKS ON CMIX of kA, kB. It is clear that an algorithm for solving a generic In this section we briefly review the relevant notations and ECDLP instance would allow one to solve the ECDHP as well. definitions that are used in this paper. Then we describe some attacks against cMix. B. Verifiable Random Function. A Verifiable Random Function (VRF) [30] is the public-key A. Elliptic Curve based Cryptographic Primitives version of a keyed cryptographic hash. In this application, a In this work, we adopt elliptic curves over prime finite field Prover holds the VRF secret key and uses the VRF hashing to 2 3 Fp. The elliptic curve y = x + ax + b over Fp could be construct a hash-based data structure on the input data. Due represented as Ep(a, b). to the nature of the VRF, only the Prover can answer queries EC-Elgamal. The analog of ElGamal crypto system based about whether or not some data is stored in the data structure. on ECC, which is known as EC-Elgamal, was first introduced Anyone who knows the public VRF key can verify that the in [28]. It consists of the following algorithms. Prover has answered the queries correctly. A VRF is a triplet of K • Setup(1 ). The algorithm takes as input the security algorithms VRF := (Gen, Eval, Vfy) providing the following parameter K, and outputs the elliptic curve Ep(a, b) with functionalities. K base point G. • Gen(1 ). The key generation algorithm is a probabilistic • KeyGen. For the elliptic curve Ep(a, b) with base point algorithm that takes as input the security parameter K R G, pick k←−Fp and compute K = kG. Set PK = K and outputs a key pair (vk, vsk). We say that vsk is the and SK = k. secret key and vk is the verification key. R • Enc(m, P K, r). For the plaintext point m, pick r←−Fp. • Eval(vsk, X). The deterministic algorithm on input the k Compute C1 = rG, C2 = m + rK. Set ciphertext points secret key vsk and X ∈ {0, 1} and outputs a function C = (C1,C2). value Y ∈ Y, where Y is a finite set, and a proof π. We 0 • Dec(C,SK). For the ciphertext C, compute m = C2 − write Vvsk(X) to denote the function value Y computed kC1. by Eval on input (vsk, X). 4

• Ver(vk, X, Y, π). The verification algorithm takes as in- Node 1 Node 2 Node n Node 1 Node 2 Node n put (vk, X, Y, π) and outputs a bit b ∈ {0, 1} indicating whether or not π is a valid proof. Blockchain Basics. We review some basic components of a proof-of-work blockchain [31]. We define a transaction −−−−→ −−−−−→ −−−−→ −−−−−→ tx := (inputs, outputs, sig), where inputs and outputs are the inputs and outputs of a UTXO-based model, sig is the signature signed by the transaction sender. A block is a triple of the form B := (s, x, txs, ctr), (a) Precommunication Phase s ∈ {0, 1}K, x ∈ {0, 1}∗, ctr ∈ , where s is the state of the N Node 1 Node 2 Node n Node 1 Node 2 Node n previous block, x is the data and ctr is the proof of work of the block. A block B is valid iff

validBlockD(B) := H(ctr, T (s, x, txs)) < D. Here, H : {0, 1}∗ → {0, 1}K and T : {0, 1}∗ → {0, 1}K are cryptographic hash functions, and the parameter D ∈ N is the difficulty level of the block. A chain is simply a chain of blocks, that we call C. The (b) Real time Phase rightmost block is called the head of the chain, denoted by Head(C). Any chain C with a head Head(C) := (s, x, ctr) can Fig. 1. Workflow of cMix. be extended to a new longer chain C0 := C||B0 by attaching a block B0 := (s0, x0, ctr0) such that s0 = H(ctr, G(s, x)); the 0 0 head of the new chain C is Head(C) := B . We let C := ε to slot2, with slot2 at least s slots ahead of slot1. Then it holds express a chain C is empty. The function len(C) denotes the that len(C2)−len(C1) ≥ τ ·s, for s ∈ N and 0 < τ ≤ 1, where length of a chain C. τ is the speed coefficient. For a chain C of length n and any q > 0, we denote by Chain Quality.The chain quality property informally states Cpq the chain resulting from removing the q rightmost blocks that the ratio of adversarial blocks in any segment of a chain of C, and analogously we denote by Cqq the chain resulting held by a honest party is no more than a fraction µ, where µ in removing the q leftmost blocks of C; note that if q ≥ n is the fraction of resources controlled by the adversary. then Cpq := ε and Cqq := ε. If C is a prefix of C0 we write C ≺ C0. We also leverage slot which is defined in [ []]. A Definition 3. (Chain Quality). Consider a portion of length slot is the continuous amount of divided time. Each slot slotl `-blocks of a chain possessed by an honest party during any is indexed for l ∈ {1, 2, 3, ···}. We assume that users have a given slot intervals, for ` ∈ N. Then, the ratio of adversarial synchronised clock that indicates the current time down to the blocks in this ` segment of the chain is at most µ, where smallest discrete unit. 0 < µ ≤ 1 is the chain quality coefficient. Blockchain protocol. With the illustrations of the basic Common Prefix. The common prefix property informally components, we describe the blockchain protocol [31] Γ = says that if we take the chains of two honest nodes at different (Γ.KeyGen, Γ.Update, Γ.Validate, Γ.Broadcast) as follows. times slots, the shortest chain is a prefix of the longest chain. • {pk, sk} ← Γ.KeyGen: The algorithm generates the key pair (pk, sk) of the blockchain nodes. Definition 4. (Common Prefix). The chains C1, C2 possessed 0 • {C , ⊥} ← Γ.Update: This algorithm returns a longer by two honest parties at the onset of the slots slot1 < slot2 pk pk and valid chain C in the network (if it exists), otherwise are such that C1  C2, where C1 denotes the chain obtained returns ⊥. by removing the last k blocks from C1, where k ∈ N is the • {0, 1} ← Γ.Validate: The validity check algorithm takes common prefix parameter. as inputs a transaction tx, a block B or a chain C and returns 1 iff the transaction, the block or the chain is valid C. cMix Anonymous System according to a public set of rules. The cMix protocol by Chaum et al. [10] is a new mix-net • Γ.Broadcast: The algorithm takes as inputs some txs and protocol which aims to provide an anoymous communication broadcasts it to all the nodes of the blockchain system. tool for users at large scales. In contrast with existing mix-net The security of a PoW blockchain protocol Γ is charac- systems, cMix provides significant performance and security terized by three properties, namely: Chain Growth, Chain upgrades. Figure 1 briefly describes the workflow of cMix. The Quality and Common P refix [31]. protocol contains two participants: Users := (U , ··· ,U ) Chain growth. The chain property quantifies the number 1 β and mix nodes := (N , ··· ,N ). Each node N holds a of blocks that are added to the blockchain during any given 1 n i tuple of the form (π , r , s ,K ,E(·),D(·)), where π is number of slots. i i,j i,j i,t i a random permutation, ri,j, si, j ∈ G(i 6= j) is the random Definition 2. (Chain Growth). Consider the chains C1, C2 elements in cyclic group G, Ki,t ∈ G denotes the shared group possessed by two honest parties at the onset of two slots slot1, element between node Ni and user Ut, E(·) and D(·)) denotes 5 the encryption and decryption algorithm of Elgamal. The Blockchain detailed description of cMix protocol is provided in Appendix A. We now present a collision tagging attack on cMix protocol. tx tx tx

The attacker have to compromise the last node Nl and any mix

node Ni. To launch the attack, only small changes are needed TLS TLS to the protocol: Senders Mix node 1 Mix node i Mix node n Receivers • Precomputation Phase- Step 3: The mix node Ni

calculate the decryption share D(C~1) with the vector

−1 (1, ··· , t , ··· , 1) and commit to the D(C~1). Miners Miners Miners • Real time Phase- Step 1: The mix node Ni adds tag Node pools (1, ··· , t, ··· , 1) to ~m, and sends ~m to the next mix node. • Real time Phase- Step 3: The last node publish the ~ ~ Fig. 2. Architecture of blockchain-based mix anonymous system. Before output of the mixing step Πh(~m×Rh)×Th. Afterwards, anonymity phase, we elect miners in node pools to serve as mix nodes via all mix nodes release their decryption shares Di(C~1) designed algorithms. During anonymity phase, mix nodes mix and forward and the message component of the ciphertext C~ . The messages from users through Transport Layer Security (TLS). Meanwhile, 2 mix nodes sends special transactions to blockchain for auditing. mix node Ni waits for other mix nodes to publish their decryption first, then Ni collude with Nl to obtain the message package with the tagged messages. After that, • Senders: These entities refers to BCMIX users, who hold ~ Ni publish D(C1). the respective accounts addressU , pkU . Before accessing The mix node Ni and Nl get the location of the tag to the anonymous service, senders send transactions to message in advance, and the mixed messages are the same mix nodes and negotiate the corresponding keys with from the senders’ perspective. Thus the attacker can break the mix nodes. Then in the real time phase, senders blind anonymity of cMix. messages with the shared keys and send the message to the first mix nodes. III.SECURITY MODELAND REQUIREMENTS In this section, we present our proposed system model B. Threat Model for BCMIX and the related security requirements. The com- munication methods among them include transactions and BCMIX assumes authenticated communication channels Transport Layer Security (TLS), where the former is an on- among all mix nodes Therefore, we consider a malicious chain communication (i.e., publishing a transaction using P2P adversary (a.k.a Byzantine), who can delay, drop, eavesdrop, communications) and the latter is an off-chain communication forward, and delete messages between mix nodes, but not (i.e., establishing a secure communication channels among mix modify, replay, or inject new ones, without detection. For nodes). any communication not among mix nodes, we assume the adversary can delay, drop, re-order, eavesdrop, modify, or inject messages at any point of the network. BCMIX accepts A. System Model one message per user per batch, starting the pre-computation There are three entities in our proposed BCMIX, that is, once the batch reaches β messages. Miners, Mix nodes and Senders (see Figure 2). The adversary can also create a lot of accounts and compro- • Miners: These entities validate new transactions and mise an arbitrary numbers of users. In addition, we assume the record them on the global ledger. Simultaneously, the adversary can control more than 50% of the system computing entities compete to solve a difficult mathematical puzzle powers. However, such adversary is not able to read the based on a cryptographic hash algorithm. In BCMIX, contents of the messages. We assume the security of the used miners are eligible to become mix nodes through PoW cryptographic primitives, including a secure hash function and algorithm competition. Miners disclose their addresses in a secure signature scheme. the form of (addressM , pkM ) where addressM is the blockchain address and pkM is the blockchain public key C. Security Requirements deriving the related address. • Mix nodes: These entities are selected from miners According to the existing literature [5], [32], [33], BCMIX through the PoW and VRF algorithm. After being elected needs to satisfy the following fundamental security require- as mix nodes successfully, these entities firstly negotiate ments. keys with senders in the set up phase. Then they execute • Resistance to Sybil Attacks. BCMIX should minimize the precomputation and real time phase to encrypt and the possibility of an attacker being successfully selected mix the messages during the duty period and pass the as multiple mix nodes at the same time. message down. Besides, they should commit to their • Resistance to Collision Tagging Attacks. BCMIX computations and send special transactions to blockchain should prevent collision attackers from performing tag- network for subsequent auditing. ging attacks to link a message to a certain sender. 6

• Sender Anonymity. BCMIX provide sender anonymity TABLE I for users. That is, every mix node performs mixing oper- NOTATIONS OF THE REVISED MIX-NET PROTOCOL ations on messages, destroying the associations between Symbol Description senders and receivers. Thus an attacker can not associate di the secret share for mix node Ni of the secret key d, di ∈ Fp; teh export messages with a certain sender. Qi the public key of mix node Ni, Qi = diG; Q the public key of the system, Q = P Q ; • Resistance to MitM attacks. BCMIX should prevent an i i EQ() EQ(m) = (x · G, m + x · Q), x ∈ Fp; attacker from replacing the shared keys between senders P Dd () the decryption share of mix node Ni, Dd ( xi · G) = i P i i and mix nodes. ( i xi)di · G; • Single Point of Failure Resilience. In case of single ri,a random values (freshly generated for each round) of mix node point of failure (e.g. a mix node is under denial of service Ni for groove a; si,a random values (freshly generated for each round) of mix node attacks or the mix node is crashed), BCMIX should detect Ni for groove a; the failure and guarantee the system keep operating. πi a random permutation of the β grooves used by mix node N ; • Resistance to Other Attacks. BCMIX should resist com- i Π the permutation performed by BCMix through mix node N ; mon attacks on mix-net system such as replay attacks, i i ki,j a group element shared between mix node Ni and the sending traffic-analysis attacks and so on. user for groove j. These values are used as keys to blind messages; ki the vector of derived secret keys shared between mix node IV. PROPOSED BCMIX SYSTEM Ni and all users in a batch, i.e. ki = (ki,1, ··· , ki,β ); Kj the product of all shared keys for the sending user of slot j, In this section, we will present our construction of BCMIX Qn i.e. Kj = i=1 ki,j ; system. We first introduce an additive homomorphism mix-net Mj the message sent by user j. Like other values in the system, protocol and then we propose the blockchain based mix-net these vlaues are group elements; Ri, the product of all local random r values through mix node protocol. Thereafter, we describe the concrete BCMIX system. Qi Ni, Ri = j=1 rj ; S the product of all local random s values through mix node i ( s i = 1 A. Additive Homomorphism Mix-net Protocol N , S = 1 . i i π (S ) × s 1 < i ≤ n To solve the MitM attacks of key agreement process and the i i−1 i dependency on trusted entities, we replace Elgamal in cMix

[10] with EC-Elgamal and propose an additive homomorphism Main block Commit block Key exchange block mix-net protocol ∆ to integrate the mix-net protocol with the blockchain. Our mix-net protocol contains two participants, Senders := (U1, ··· ,Uβ) and mix nodes := (N1, ··· ,Nn), where the mix nodes are selected from the blockchain miners. Mix nodes negotiate keys with senders by means of a special 0 1 i j n new transaction txKE and calculate the system public key through i i i their address pair (addressN , pkN ), where pkM = Qi. (We will describe these two processes in the next part). We suppose a authenticated communication among mix nodes and we denote it as ∆.TLS. The proposed mix-net protocol is a tuple of algorithms (Setup, Precom, RealTime). The notations and Fig. 3. Chain Structure of BCMIX the processes of the protocol ∆ are presented in Table I and Algorithm 1. keys between senders and mix nodes, while a commitment B. Basic Components of the Proposed Blockchain Protocol transaction is released to supervise the behavior of mix nodes. Block and Chain. According to the transaction type We build our blockchain protocol Γ∗ by extending and involved, we define three blocks respectively called main modifying the aforementioned protocol Γ. We first define the block B , key-exchange block B and commitment block basic components in our blockchain protocol. M KE B , where B := (s , s , x, tx s, ctr), B := Transaction. Our blockchain contains three types of trans- COM M ke com N KE (s, x, tx s) and B := (s, x, tx s). Here s is the action, namely normal transaction tx , key-exchange transac- KE COM COM N state of the previous block, s is the state of previous blocks, tion tx and commitment transaction tx . The normal ke KE COM x is the data and ctr is the proof of work of the block. transaction tx is the same definition of the transaction that N A chain C is the form of C := (C , C , C ), where in protocol Γ. We define a key-exchange transaction tx := M KE COM −−−−→ −−−−−→ KE C , C and C respectively represent main chain, key- (KE, pk, inputs, outputs, sig) and a commitment transaction M KE COM exchange chain and commitment chain in Figure 3. We define tx := (COM , commitment, sig), where KE, COM COM a stable main block Bi as the origin of the key-exchange denotes the transaction type, pk is the blockchain public key of M chain C and commitment chain C . the sender, commitment is the commitment value issued by KE COM −−−−→ −−−−−→ mix nodes and inputs, outputs, sig are the same as the above Definition 5. Let two chains C1, C2 are possessed by two pw definition. A key-exchange transaction is used to negotiate honest parties at the onset of the slots slot1 < slot2, if C1  7

p(w+1) w C1 and C1  C1, then we call CM is a stable main chain revokes (∆.Setup, ∆.Precom, ∆.RealTime) and outputs i and BM where i ∈ [1, w] are stable main blocks. a permuted message vector Πn(M). • Verify. Given a message vector M := (m , m , ··· , m ) Note that the B and the B do not contain proof of 1 2 β KE COM and a message vector M := (m , m ), ··· , m ), work, thus an adversary can manipulate the contents of these (1) (2) (β) this algorithm decides whether the following conditions blocks. To avoid the problem we specify that the latest main holds: For ∀i ∈ [1, β], m ∈ M, ∃j ∈ [1, β], m ∈ M, block Blatest always contains the states of Blatest and Blatest. i j M KE COM m = m . If so, the algorithm returns 1. Otherwise the We further give out the definition of valid blocks. i j algorithm returns 0. Definition 6. We say blocks BN , BKE and BCOM are valid • Broadcast: The algorithm takes as inputs some txs and iff address (address, pk) and broadcasts them to all the

• The transactions contained in BN , BKE and BCOM are nodes of the blockchain system. valid; ∗ i Theorem 1. If Γ satisfies (τ, s)-chain growth, then Γ satisfies • For any two main blocks BN := p q i i i j (τ, s)-chain growth. (ske , scom , x , txN s, ctr ) and BN := m n j j j (ske , scom , x , txN s, ctr ), where i < j. p ≥ m Proof. We note that the side chain CKE and CCOM do not and q ≥ n hold; contain proof of work, and the update of CKE, CCOM is • H(ctr, T (ske , scom , x, txs)) < D. independent of the update of CM . Thus we conclude that CM satisfies (τ, s)-chain growth as the chain C in Γ. We now prove Blockchain based mix-net protocol. Below we propose a that C and C satisfy (τ, s)-chain growth. blockchain based mix-net protocol Γ∗ which combines the ba- KE COM Consider the chains C1 , C2 and C1 , C2 at the sic blockchain protocol Γ with the proposed mix-net protocol KE KE COM COM onset of two slots slot , slot , with slot at least s slots ahead ∆. The protocol Γ has copies of all the basic functionalities 1 2 2 of slot , and τ , τ are the speed coefficient of C and exposed by Γ and ∆ through the interfaces described above, 1 KE COM KE C . Since a key exchange block B and a commitment and adds additional algorithms including VRF, IP Sharding in COM KE block B are jointly generated by all mix nodes at the order to resist Sybil attacks. We describe the protocol Γ∗ := COM expected speed τ and τ respectively, we derive that (Setup, Update, Vote, Mix, Verify, Broadcast) as follows. KE COM CKE and CCOM are updated at the expected speed τKE and • Setup. System parameters involved in our construction 2 τCOM . Then we hold the following two equations len(CKE)− are {Ep(a, b), , G, K,H,T }, where Ep(a, b) is a non- 1 2 1 G len(C ) = τKE ·s and len(C )−len(C ) = τCOM ·s. singular elliptic curve, is a cyclic group which con- KE COM COM G Thus we conclude that CKE and CCOM satisfies (τ, s)-chain sists of all points on Ep(a, b), as well as the point growth. In summary, Γ∗ satisfies (τ, s)-chain growth. at infinity O, G is the base point of Ep(a, b), K is ∗ K the security parameter and H : {0, 1} → {0, 1} , Theorem 2. Let H,G be two collision-resistant cryptographic ∗ K T : {0, 1} → {0, 1} are cryptographic hash functions. hash functions. If Γ satisfies (µ, `)-chain quality, then Γ∗ Entities invoke the algorithm to generate the blockchain satisfies (µ, `)-chain quality. key pair (pk, sk) := (qG, q), the address (address, pk) Proof. We emphasize that proof of work is not contained in the and the VRF key pair (vk, vsk). Here q ∈ Fp is selected by entities. key exchange chain CKE and the commitment chain CCOM . • Update. This algorithm first invoke Γ.validate to validate And the security of CKE, CCOM is depend on the security the new transactions, chains and blocks. Then returns of the main chain CM since CM takes the states of CKE and a longer valid chain C := (CM , CKE, CCOM ) in the CCOM as inputs of the proof of work. Suppose an adversary A network (if it exists), otherwise returns ⊥. wants to manipulate contents of CKE and CCOM . According • PoWVote. The algorithm take as inputs the data x, the to the formula H(ctr, T (ske, scom, x, txs)) < D (Definition state s and a difficulty level D0 < D, where D is the sys- 7.), A has to amend the corresponding main block BM . Note tem difficulty level, and then computes H(ctr0,T (s, x)) that the capabilities of the adversary and external environment ∗ where ctr0 is a random string. Thereafter, the algorithm in Γ are exactly the same as that in Γ. Thus we can conclude 0 0 estimate whether Hi(ctr ,T (s, x)) < D . If so the that the main chain CM is the only factor affecting the chain algorithm outputs 1 and add the miner into a set M, quality property. We show below that A has only a negligible ∗ otherwise returns 0. probability of violating chain quality of Γ . i • IPSharding. This algorithm takes as input IP prefix of Let us denote by BM the i-th block of the main chain 1 len(C) miners in set M, and outputs n node pools (Miners in CM at some slot intervals so that CM := BM ··· BM . each node pool have the same IP prefix IP/z where z is From Definition 4. we know that the number of main blocks a integer and z ∈ [0, 32].) generated by A in chain CM are at most µ · len(C). According • VRF. The algorithm takes as inputs IP/z and the current to [31], A can not generate more than µ · len(C) main blocks slot slotl, and outputs a value Y ∈ Y, where Y is a finite with the current computing hash power. Or the adversary A ? set, and a proof π. could try to build an valid candidate block BM to replace a j • Mix. The algorithm take as inputs a sender set validate block BM generated by honest parties in CM , where ? j ? j U := (U1,U2, ··· ,Uβ), a mix node set N := j ∈ [1, len(C)], BM 6= BM and H(BM ) = H(BM ). By the (N1,N2, ··· ,Nn) and a message vector M. Then it collision-resistance property of hash function we can draw a 8

Algorithm 1 The Additive Homomorphism Mix-net Protocol ∆ 1: procedure SETUP PHASE i j i i 2: (Q, Ki, ki,j ) ← Setup(pkU , pkM ): This algorithm is invoked by senders Ui with address (addressU , pkU ) and mix nodes Mj with address j j (addressN , pkN ), where i ∈ [1, β], j ∈ [1, n]. The algorithm takes as inputs the public key of senders and mix nodes and returns a shared key Qn P j P ki,j between a send Ui and a mix node Mj . A slot key Ki = j=1 ki,j and the system public key Q = j pkM = j Qj . 3: procedure PRECOMPUTATION PHASE Precom := (Preprocess, Mix, Postprocess) −1 • Preprocess. The mix nodes generate the fresh r, s, π values and computes the encryption E(ri ). At the same time the mix nodes issues the commitment values of the fresh values COMr,COMs,COMπ. Then they collectively compute the product of the received values by sending the following message to the next mix node: ( EQ(r1) i = 1 EQ(Ri) = EQ(Ri−1) × EQ(ri) 1 < i ≤ n

Eventually, the last mix node sends the final values EQ(Rn) to the first mix node as input for the next step and issues the commitment value COM EQ(Rn). • Mix. The mix node together mix the values and compute the results Πn(Rn) × Sn, under encryption. The mix nodes perform this mixing by having each mix node i send the following message to the next mix node: ( π1(EQ(Rn) × EQ(s1)) i = 1 EQ(Π(Rn) × Si) = πi(EQ(Πi−1(Rn) × Si−1)) × EQ(si) 1 < i ≤ n

As with the first step, the last mix node sends the final encrypted values EQ(Πn(Rn) × Sn) to the first mix node. P • Postprocess. To complete the precomputation, each mix node Ni computes its decryption shares Ddi ( i xi · G), where (x, c) = EQ(Πn(Rn) × Sn), and keep its secret. Then each mix node issues the commitment values COMD of their secret shares. The message parts c are multiplied di with all the decryption shares to retrieve the plaintext values Πn(Rn)×Sn. The last mix node to be used in the real time phase stores the decrypted precomputed values. 4: procedure REALTIME PHASE RealTime := (Preprocess, Mix, Postprocess) −1 5: Each user constructs its message MKj (for slot j) by multiplying the message Mj and it sends it to the first mix node, which collects all messages and combines them to get a vector M × K−1. −1 Qn • Preprocess. Each node Ni sends ki × ri to the next mix node, which uses them to compute M × Rn = M × K × i=1 ki × ri and the last node sends the result to the first node. • Mix. Each node Ni computes πi(Πi−1(M × Rn) × Si−1) × si, where Π0 is the identity permutation and S0 = 1. The last node sends a commitment to its message Πn(M × Rn) × Sn to every other node. −1 • Postprocess. Each node Ni opens its precomputed decryption share for (x, c) = ((Πn(Rn) × Sn) ), while the last node sends its decryption share multiplied by the value in the previous step and the message component: Πn(M × Rn) × Sn × Dn(x) × c. Finally, the permuted message Qn can be decrypt as Πn(M) = Πn(M × Rn) × Sn × i=1 Di(x) × c.

6: Note: The shared key kij in Setup Phase is generated through ECDH protocol. After the mix node set (N1,N2, ··· ,Nn) is selected from miners. Every sender sends txKE s to all mix nodes. Since senders and mix nodes know the public key of each other, they can execute the ECDH protocol and obtain the shared key kij .

Genesis Stable Audit Phase Mix Phase block block

6. After the system outputs The Main Chain 4. Users negotiate keys with mix nodes the permuted message, senders The Commitment Chain through txKE and send to mix nodes. invoke Verify to validate the 5.The mix nodes execute percommunication integrity of the message. The Key Exchange Chain and real time phase, and update related data to The Mix Nodes blockchain through BKE and BCOM. BCOM BKE Precommunication Phase tx 4. KE Real Time Phase

Precommunication The Candidate Miners 5. 6. The Miners Real time

Senders Reservers 3.VRF Voting Vote Phase 1. Miners invoke PoWVote to Pool 1 Pool i Pool n complete as the candidate miners. 2. The candidate miners invoke IPSharding to be classified into 2.IP Sharding different pools. 3. The candidate miners in different pools invoke VRF to 1. PoW Voting complete as the current mix nodes.

Fig. 4. The detailed orchestration of BCMIX. 9 conclusion that the adversary has only a negligible chance Algorithm 2 The IPSharding algorithm. ? ? of producing such a candidate block BM where H(BM ) = Input: The IP address of the candidate miners, where ipi := j ∗ H(BM ). Hence Γ satisfies (µ, `)-chain quality. A1.A2. ∗ .∗. Here A1, A2 and ∗ denote four decimal segments; the system parameter σ; w Theorem 3. Let BM be a stable block which is the genesis Output: k mix node pools; of the chain CKE and CCOM . If Γ satisfies k-common prefix, m m 1: coordinateipm := (A1 ,A2 ); then Γ∗ satisfies k-common prefix. 2: |Num2p,2q | denotes the number of nodes distributed in (2p, 2q); Proof. Note that the chain CKE and CCOM will not fork since they don not contain proof-of-work and are uniquely 3: for j = 0,j <= 2 do generated bt the mix nodes. Hence C and C satisfy 4: for i = 0,i <= 7 do KE COM m i m i+1 the k-common prefix property. We recall that the behaviors 5: if Aj < 2 or Aj > 2 then of the adversary and the honest parties in Γ∗ are exactly the 6: i + +; same as that in Γ. Thus according to the literature [], the main 7: else break; chain satisfies k-common prefix property. This concludes the 8: j + +; proof. 9: Classify the coordinate into k parts such that |Num2p,2q |j <= σ, k ∈ [1, k]; Given the above, the tuple Γ∗ := |Num2p,2q |min 10: return k mix node pools. (Setup, Update, Vote, Mix, Verify) is a secure blockchain protocol which satisfies the properties of (τ, s)-chain growth, (µ, `)-chain quality and k-common prefix. Mix. In this phase, the current mix nodes N := (N1,N2, ··· ,Nn) first negotiate shared keys with users U := C. System Design (U1,U2, ··· ,Uβ). Then the current mix nodes blind and mix With the description of the blockchain-based mix-net pro- messages M := (M1, M2, ··· , Mβ) for users. tocol, we propose the dynamic self-organizing blockchain- To negotiate keys with each other through ECDH protocol, based mix anonymous system in this part. Our BCMIX system the mix nodes and the users do as follows. consists of four phases, namely: System Initialization, Vote, 1) After the current mix nodes are selected, they run Mix, Audit. The orchestration of BCMIX is detailed in Figure Γ∗.Broadcast to disseminate their addresses and the VRF 4. parameters, i.e. (address, pk)||(vk, Y, π), to the partici- System Initialization. This phase initializes the sys- pants in the blockchain network. tem parameters and generate accounts for participants. 2) Upon receiving the addresses and parameters of all The system determine the public system parameters mix nodes, a user Ui first invoke VRF.Ver to ver- {Ep(a, b), G, G, K,H,T } (e.g. secp256k1 of Bitcoin). After ify the received. If the algorithm returns 1, then that the entities (i.e., users and miners) invoke Γ∗Setup to i the user Ui sends transactions txKE to every mix (pki , ski ) := i generate their key pairs (i.e., user key pair u u nodes. Then the mix nodes parse tx as txKE := i i j j j j −−−−→ −−−−−→ KE (quG, qu) and miner key pair (pkm, skm) := (qmG, qm)) (KE, pk, inputs, outputs, sig) and obtain the public key i i and addresses (i.e., user address (addressU , pkU ) and miner of each user. address (addressj , pkj ) respectively). In additon, miners N N 3) Finally the mix nodes Nj and the user Ui multiply their also generate their VRF key pairs, which are used to complete secret key with the public key of each other, and obtain as mix nodes. i j the shared key kij = qU · qN · G. Vote. This phase is executed by the miners to elect the candidate mix nodes for mixing messages. After negotiating keys with users, the mix nodes network blind and mix messages as follows. 1) Firstly, miners invokes Γ∗.PoWVote to solve the puzzle D0. If the algorithm Γ∗.PoWVote returns 1, then the 1) The mix nodes invoke ∆.Precom to compute the parame- ters EQ(Πn(Rn)×Sn) utilized in the real time phase and miner Mi is added to the candidate set M. Otherwise miners re-select inputs s, x to compute H(ctr0,T (s, x)) update the corresponding commitment values COMr, 0 0 COMs, COMπ, COMD to the commitment chain until H(ctr ,T (s, x)) < D . di 2) After a period of certain time, e.g. T , M stops ac- CCOM through the commitment transaction txCOM . −1 cepting new miners. Then the candidates miners run 2) After receiving the blind messages M × K from the Γ∗.IPSharding (Algorithm 2) and join the node pool with users, the mix nodes invoke the algorithm ∆.RealTime the same IP prefix. to mix the received messages and obtain the permuted message Πn(M). 3) After that the candidate miners Mi in each node pool ∗ take the IP/z and current slot slotl as inputs, and invoke Audit. In this phase, the users invoke Γ .Verify to check the ∗ ∗ Γ .VRF to generate a value Yi ∈ Y and the corresponding integrity of the permuted messages. If the algorithm Γ .Verify proof π. Then the candidate miner who holds the smallest returns 1 to all the users, then the permuted messages are value Y in each node pool is elected as the mix node considered integrated. If Γ∗.Verify returns 0 to some users, in the current slot. Finally, the elected mix nodes are then the users ask the current mix nodes to check their cal- networked in the order of Y . culation through the corresponding commitment block BCOM 10

Fig. 5. The probability of the expected number of candidate miners Fig. 6. The results of IPSharding algorithm when σ = 5 (four node poos, i.e. four mix nodes). We denote an IP address as four segments, the x axis is the first segment and the y axis denotes the second segment. We can change the division strategy to control the number of mix nodes. We respectively and identify the malicious mix nodes. The malicious mix node simulate the attackers which hold 53.84% (the red circle), 63.43% (the red will be removed out of BCMIX. circles and the green circle) and 74.80% (the red circles, the green circle and the purple circles) of total computing powers. V. PERFORMANCE EVALUATION This section presents the performance evaluation of see that when fixing the difficulty, the longer the average time BCMIX. We firstly theoretically analyze the number of candi- to generate a block, the higher the probability of generating date miners. Then we describe the implementation of BCMIX multiple blocks simultaneously. and evaluate its performance.

A. The Number of candidate miners B. Implementation In Bitcoin, the difficulty value D, the target target and the In order to show the feasibility of BCMIX, we build a Bit- network hash rate satisfy the following formula: coin network in a desktop computer (equipped with a Ubuntu 32 16.04 LTS, Intel (R) Core (TM) i5-8500 CPU of 3.00GHz and DMax D · 2 D = , Hashratemin = , 16GB RAM). We make use of two Github programs, Bitcoin- targetcurrent T Simulator 2 and cMix 3, to evaluate the proposed BCMIX. We set up the public parameters by utilizing the publicly available where DMax is a large constant, targetcurrent denotes the library for cryptography on the secp256k1 curve 4. current target and Hashratemin represents the minimum hash We apply the Algorithm 2 to Table IV and obtain the rate required to calculate a block of difficulty D within time T . distribution of mining pools in the real world. The results are According to literature [], the process of generating n blocks shown in Figure 6. Combining Figure 5 and Figure 6, we within the average time t will be a Poission distribution with can find that when fixing difficulty D to 1.0 × 1011, setting the expected value λ, the average block generation time to 20s or 30s can obtain a ∞ X λk reasonable number of mix nodes, which is more conducive to P (X ≥ M) = e−λ. k! the implementation of BCMIX. To measure the approximate k=M time cost, we test the functionality of PoWVote, IPSharding, Since the Bitcoin network with Hashratemin mining Key Exchange, Precomputation and RealTime under different power generates one block within average time T , we can number of mix nodes and users for 100 times. The test results conclude that are shown in Table II. We can see that PoWVote spends more time than other algorithms. In order to improve the operating Hashrate Hashrate · T real real efficiency of BCMIX, we elect mix nodes at regular intervals. λ = = 32 . Hashratemin D · 2

Here Hashratereal denotes the mining power of the Bitcoin network in the real world. To better simulate BCMIX in the real world scenario, we leverage the mining power distribution VI.SECURITY ANALYSIS from 1. In Appendix B, Table IV details the IP address and According to the proposed additive homomorphism mix-net mining power of different mining pools. With D = 1 × 1011, protocol and basic components of proof-of-work blockchains, we estimate the probability of simultaneous block generation under different T and report our results in Figure 5. We can 2https://github.com/arthurgervais/Bitcoin-Simulator 3https://github.com/byronknoll/cmix 1https://btc.com/stats/pool?pool mode=year 4https://github.com/bitcoin-core/secp256k1 11

TABLE II TIMECOST (INRM S) FOR DIFFERENT NUMBER OF MIX NODES AND USERS TO EXECUTE THE ALGORITHMS

Time Number Mix Node 3 5 7 9 Alg User 10 50 100 10 50 100 10 50 100 10 50 100 PoWVote 29.62 28.90 30.12 29.77 31.23 29.35 29.00 30.27 32.14 31.36 30.80 29.79 IPSharding 0.65 0.65 0.65 0.93 0.93 0.93 1.46 1.46 1.46 1.88 1.88 1.88 Key Exchange 0.60 0.83 1.07 0.69 1.03 1.47 0.77 1.26 1.86 0.86 1.48 2.26 Precomputation 0.33 0.33 0.33 0.47 0.47 0.47 0.62 0.62 0.62 0.81 0.81 0.81 RealTime 0.03 0.11 0.26 0.06 0.19 0.37 0.12 0.23 0.43 0.16 0.29 0.49 We fix the difficulty D = 1 × 1011 and stipulate the block generation time T = 30s. We can change the division strategy in Figure 6 to control the number of mix nodes.

TABLE III SUMMARY COMPARISON OF ATTACK RESILIENCE AND PERFORMANCE

Attacks Performance Technology Traffic analysis DDos Tagging MitM Sybil Bandwidth Latency BCMIX Blockchain Re-encryption !!!!!!! cMix Mix Re-encryption !%%% \ !! Tor Mix Mutilayer Encryption %% \ % \ !! BAR Multicast/Broadcast !%%% \ !% AnonPubSub Probabilistic Forwarding !%!% \\\ Tarzan Peer-to-peer !!!%%!! DiceMix CoinJoin %%!%%!! !- The system is secure against this attack or the system performs well. %- The system is vulnerable to this attack or the system is not performing well. \ - The system does not involve this attack of the authors did not mention the related performance.

Fig. 7. The relationship between difficulty and probability to launch the Sybil Fig. 8. The relationship between difficulty and probability to launch the attacks. collision tagging attacks.

BCMIX system can satisfy all the security requirements de- Sybil attacks than attackers with lower computing power. scribed in Section III-C. Table III summarizes a comparison And if we set the difficulty D to be very small, the among different anonymous systems. probability that an attacker successfully launching Sybil • Resistance to Sybil Attacks. We indicate that an attacker attacks is almost zero. implement Sybil attacks successfully means the attacker • Resistance to Collision Tagging Attacks. As we men- take control of all mix nodes. To simulate a process of tion in Section II C, an attacker compromise the last mix Sybil attacks, we assume that an IP address of mining node Nl and any mix node Ni to launch collision tagging pools in Table IV represents an identity, and all identities attacks. In this case, we can treat a collision tagging attack created by a mining pool share the mining pool’s com- as a special form of a Sybil attack. We illustrates the puting power equally. Figure 7 shows the requirements relationship between the difficulty D and the probability in terms of difficulty (computing power), in order for an of launching collision tagging attacks in Figure 8. Similar attacker with different computing power to successfully to Sybil attacks, we can set the difficulty D to be very launch Sybil attacks. Figure 7 indicates that, when the small to resist the collision tagging attacks. difficulty D is fixed, an attacker with higher computing • Sender Anonymity. Suppose that an adversary A can power have a higher probability of successfully launching compromise β − 2 users and n − 1 nodes. Let Ux and 12

1 Uy denote any two honest users and let Ni be the 1] − P r[EXPan(∆, A, λ) = 1]| ≤ neglECDHP . Sim- 0 only honest mix node. The sender anonymity property ilarly, we conclude that |P r[EXPan(∆, A, λ) = 1] − 1 guarantees that the adversary cannot distinguish the P r[EXPan(∆, A, λ) = 1]| ≤ neglECDLP . This con- messages from the two honest users. We design the cludes the proof. following experiments for a PPT adversary A. • Resistance to MitM attacks. In BCMIX we leverage the gossip protocol to spread messages. We assume that an EXP b (∆, A, λ) : an attack cannot control all access networks of a blockchain ASetup: node. According to [34], this is reasonable since none A λ (Q,Kx,Ky,Km,m6=x,m6=y) ← Setup(1 ), instances of eclipse attacks have arisen in reality up APrecomputation: to now. Besides, many blockchain communities have

(EQ(RA) ·EQ(rNi )) ← Preprocess(r), fixed this vulnerability [35]. In this case, an adversary EQ(Πn(Rn) · Sn) ← Mix(s, π), Ae attempts to simulate an elected mix node to deceive a

(Πn(Rn) · Sn) ← Postprocess(Ddi ), user Ue. At the same time, other nodes connected with ARealTime: Ue inform Ue the latest elected mix node sets, thus Ue identities Ae as an adversary. (mb ·rNi , m1−b ·rNi ) ← Preprocess(Kx ·mb,Ky ·m1−b), • Single point of failure Resilience. Nodes in BCMIX are πNi (mb · rNi · sNi , m1−b · rNi · sNi ) ← Mix, 0 0 (m , m ) ← Postprocess networking dynamically. Once an elected mix node Ni b 1−b loses the response for a period of time, the next mix node The adversary’s advantage in the experiments is: Ni+1 would inform other nodes that i is crashed. Then 0 1 i |P r[EXPan(∆, A, λ) = 1]−P r[EXPan(∆, A, λ) = 1]|. the other nodes validate the situation of node , and begin a new networking process if i goes down, or identify node Definition 7. (Anonymity). An additive homomorphism Ni+1 as a malicious node if i works normally. mix-net protocol ∆ := (Setup, Precom, RealTime) main- • Resistance to Other Attacks. BCMIX can also resist the tains anonymity if the advantage of the adversary in the following attacks. anonymity game is negligible. a) Replay attacks. An attacker may retransmitting a Theorem 4. If E is a ECDLP-secure additive homo- message m form a previous session. Then the at- morphism encryption scheme, ECDH is a ECDHP-secure tacker compare the mixed message sets with the key exchange protocol and a non-interactive commitment previous message sets which contain m, thus the scheme COM is perfectly-hiding, then BCMIX satisfies attacker can associate the egress messages with anonymity defined in Definition 7. the ingress messages. Since the random values and Proof. (Sketch). We prove the security of BCMIX by permutations are never reused, thus BCMIX resists reduction from the security of the encryption system E. replay attacks. Without loss of generality, we assume that the adversary b) Traffic analysis attacks. In connection-based sys- tems such as Tor, attackers can distinguish between A can compromise β − 2 users and n − 1 nodes. Let Ux two different paths in the free mix network by and Uy denote any two honest users and let Ni be the only honest mix node. counting packages and timing communication. Since In the setup phase, A can control all the shared keys BCMIX is a message-based system which batches and permutes messages during the transmitting pro- except Kx, Ky. In the precomputation phase, A gets cess, attackers can not distinguish and analyze the command of random values RA, SA and random per- blind messages, thus BCMIX resists traffic analysis mutations ΠA except rNi , sNi , πNi , the corresponding attacks. ciphertext E(rNi ) and the decrypt share DNi . In the real time phase, users send blind messages to mix networks in c) Intersection attacks and statistical disclosure attacks. the form of M ×K. The adversary A can parse the blind These attacks utilize information given by observing mix networks where the users can freely choose the messages as (Kx · mb,Ky · m1−b). During the mixing process of real time phase, A decrypt the mixed messages mix node for their messages. Since BCMIX adopts a fixed cascade of mix nodes every round, thus and obtain the mixed messages (mb · rNi · sNi , m1−b · BCMIX is not susceptible to these attacks. rNi · sNi ). Finally, A observe the plaintext messages of 0 0 the form (mb, m1−b). We assume a challenger C assigns the blind messages VII.CONCLUSION (Kx · m0,Ky · m1), where m0 = mb and m1 = In this paper, we achieved a dynamic self-organizing mix m1−b as determined by a random bit b. An adversary anonymous system. With blockchain technology, we elect 0 0 holding (mb, m1−b) predicts the association between mix nodes from public, dynamic blockchain miners. Before 0 0 (Kx · m0,Ky · m1) and (mb, m1−b). In the end, if constructing BCMIX, we proposed BCMN protocol with the the anonymity game adversary A predicts the bit b formal security models. Building on the proposed protocol, correctly, we can infer that A can calculate Kx and we designed a transaction-based key exchange scheme and Ky which break the ECDHP-secure of the underlying proposed our BCMIX. Then we illustrated BCMIX can satisfy 0 key exchange system. Thus |P r[EXPan(∆, A, λ) = the relevant security requirements. After that we demonstrated 13

experimentally that BCMIX is resistant to the attacks proposed IX.APPENDIX B. in this paper. Finally, we evaluated the performance of the The distribution of mining pools in the real world. To prototype with the real world data and compared BCMIX with simulate a process of Sybil attacks, we assume that an IP some latest anonymous systems. The results suggested that address of mining pools in Table IV represents an identity, BCMIX is practical for real world deployment. A follow-on work is to find a solution for recipient and all identities created by a mining pool share the mining anonymity, which would improve the anonymity ability of our pool’s computing power equally. system. We believe that building a bidirectional anonymous system allow us to identity additional features and properties. TABLE IV NOTATIONS OF THE REVISED MIX-NET PROTOCOL

VIII.APPENDIX A. Pool Name IP Address Hashrate Proportion The detailed description of cMix protocol is as follows. 124(EH/s) 100% f2pool 203.107.32.162 21.1048 17.02% Setup phase. The mix nodes establish their decryption share Poolin 47.75.234.12 19.716 15.9% X , and the public key y is computed. Each user A will 117.24.1.243, 117.24.1.239 i j BTC.com 16.1076 12.99% individually establish a symmetric key k with each mix node 117.24.1.238, 117.24.1.242 i,j AntPool 47.94.135.145 13.95 11.25% Ni in the network. The mix nodes draw their random values ViaBTC 116.211.155.211, 123.155.158.10 7.8988 6.37% ~ri and vecti for the n slots. Huobi.pool 47.93.94.105 7.626 6.15% 58COIN&1THash 39.98.72.224 6.4728 5.22% Precomputation phase. The goal in this phase is to perform 104.26.5.102, 104.26.4.102 SlushPool 5.6792 4.58% the public-key operations that is needed in the real time phase. 172.67.74.105 −1 OKExPool 208.43.170.231 4.9724 4.01% Step 1-Preprocessing: Mixnode Ni computes E(~r ), and i unknown 47.93.94.105 4.7244 3.81% send their calculated vector to the network handler. The BTC.TOP 123.56.208.222 3.9804 3.21% −1 h −1 ~ Q BytePool 58.218.215.133 2.2816 1.84% network handler then computes E(Rh ) = i=1 E(~ri ). Binance Pool 13.248.150.68, 76.223.2.151 2.0832 1.68% Step 2-Mixing: Ni(i = 1, ··· , h − 1) computes and sends 104.26.5.32, 104.26.4.32 the following to Ni+1 BitFury 2.0336 1.64% 172.67.70.128 ( π (E(R~ −1)) × E(~t−1) i = 1 Lubian.com 47.56.109.242 1.922 1.55% E(Π (R~ −1) × T~ −1) = 1 h 1 i h i ~ −1 ~ −1 ~−1 104.26.11.113, 104.26.10.113 πi(E(Πi−1(Rh ) × Ti−1)) × E(ti ) 1 < i ≤ h NovaBlock 1.488 1.20% 172.67.70.128 −1 SpiderPool 47.52.126.9 0.62 0.5% Nh finally computes: (C~1, C~2) = E(((R~ h) × T~h) ) = ~ −1 ~ −1 ~−1 ~ WAYI.CN 47.103.164.189 0.5459 0.44% πh(E((Rh ) × Th−1)) × E(th ). Nh sends C1 to the other Bitcoin.com 104.18.26.217, 104.18.27.217 0.4092 0.33% MiningCity 23.218.94.192, 23.32.241.177 0.124 0.1% mix nodes and store C~2 locally for use in the real time phase. OKKONG 47.96.193.193 0.062 0.05% Step 3-Postprocessing: Mixnode Ni use their decryption 104.18.40.151, 172.67.148.229 TATMAS Pool 0.0496 0.04% share Xi to decrypt the vector of random components they 104.1841.151 ~ ~ −Xi BitClub 213.173.105.14 0.0496 0.04% received in the previous step; Di(C1) = C1 . They publish Sigmapool.com 18.156.81.156 0.0372 0.03% a commitment to their calculated decryption share. KanoPool 45.77.7.149 0.0124 0.01% Solo CK 51.81.56.15 0.0124 0.01% Real time phase. In this phase, the senders are involved. Aj −1 constructs a blinded message mj ×Kj . The blnded messages are the input to the protocol, and they are combinded by the network handler to yield the vector ~m × K~ −1. ~ REFERENCES Step 1-Preprocessing: Every mix node Ni calculates ki ×~ri, and sends the resulting vector to the network handler. The [1] J. Sanders and D. Patterson, “Facebook data privacy scandal: A cheat −1 sheet,” 2019. network handler then computes ~m × R~ h = ~m × K~ × Qh ~ i ~ −1 [2] D. R. Hayes, C. Snow, and S. Altuwayjiri, “Geolocation tracking i=1 ki × ~r , hence the K vector is replaced with the and privacy issues associated with the uber mobile application,” in random r values of each mix node. Proceedings of the Conference on Information Systems Applied Research Step 2-Mixing: Ni computes and sends the following to ISSN, vol. 2167, 2017, p. 1508. Ni+1: [3] N. Alexopoulos, A. Kiayias, R. Talviste, and T. Zacharias, “Mcmix: ( Anonymous messaging via secure multiparty computation,” in 26th π1( ~m × R~ h) × ~t1 i = 1 (Πi( ~m × R~ ) × T~i) = {USENIX} Security Symposium ({USENIX} Security 17), 2017, pp. h π (Π ( ~m × R~ ) × T~ ) × ~t 1 < i < h i i−1 h i−1 i 1217–1234. ~ ~ [4] D. L. Chaum, “Untraceable electronic mail, return addresses, and digital Mix node Nh computes Πh(~m × Rh) × Th = πh(Πh−1(~m × pseudonyms,” Communications of the ACM, vol. 24, no. 2, pp. 84–90, R~ h) × T~H−1) × ~th. Nh commits to this vector and sends the 1981. commitment to the remaining mix nodes. [5] T. Lu, Z. Du, and Z. J. Wang, “A survey on measuring anonymity in Step 3-Postprocessing: When mix node N (i = 1, ··· , h−1) anonymous communication systems,” IEEE Access, vol. 7, pp. 70 584– i 70 609, 2019. receive the commitment from Nh, they send their decryption [6] G. Danezis and C. Diaz, “A survey of anonymous communication share Di(C~1) computed in the precomputation phase to the channels,” Technical Report MSR-TR-2008-35, Microsoft Research, network handler. The last mix node computes and send the Tech. Rep., 2008. ~ ~ ~ [7] M. Edman and B. Yener, “On anonymity in an electronic society: following to the network handler: Πh(~m × Rh) × Th × C2 × A survey of anonymous communication systems,” ACM Computing Qh ~ ~ ~ ~ ~ −1 Surveys (CSUR), vol. 42, no. 1, pp. 1–35, 2009. i=1 D(C1) = Πh(~m × Rh) × Th × (Πh(Rh) × Th) = [8] J. Boyan, “The anonymizer-protecting user privacy on the web,” 1997. Πh(~m) [9] E. Gabber, P. B. Gibbons, D. M. Kristol, Y. Matias, and A. Mayer, The network handler outputs Πh(~m), that is a permutation “Consistent, yet anonymous, web access with lpwa,” Communications of the input message. of the ACM, vol. 42, no. 2, pp. 42–47, 1999. 14

[10] D. Chaum, D. Das, F. Javani, A. Kate, A. Krasnova, J. De Ruiter, [35] Y. Marcus, E. Heilman, and S. Goldberg, “Low-resource eclipse attacks and A. T. Sherman, “cmix: Mixing with minimal real-time asymmet- on ethereum’s peer-to-peer network.” IACR Cryptol. ePrint Arch., vol. ric cryptographic operations,” in International Conference on Applied 2018, p. 236, 2018. Cryptography and Network Security. Springer, 2017, pp. 557–578. [11] S. Prusty, B. N. Levine, and M. Liberatore, “Forensic investigation of the oneswarm anonymous filesharing system,” in Proceedings of the 18th ACM conference on Computer and communications security, 2011, pp. 201–214. [12] M. Sherr, A. Mao, W. R. Marczak, W. Zhou, B. T. Loo, and M. A. Blaze, “A3: An extensible platform for application-aware anonymity,” 2010. [13] J. Kong and X. Hong, “Anodr: anonymous on demand routing with untraceable routes for mobile ad-hoc networks,” in Proceedings of the 4th ACM international symposium on Mobile ad hoc networking & computing, 2003, pp. 291–302. [14] P. Syverson, R. Dingledine, and N. Mathewson, “Tor: The secondgen- eration onion router,” in Usenix Security, 2004, pp. 303–320. [15] M. J. Freedman and R. Morris, “Tarzan: A peer-to-peer anonymizing network layer,” in Proceedings of the 9th ACM conference on Computer and communications security, 2002, pp. 193–206. Michael Shell Biography text here. [16] M. Conti, N. Dragoni, and V. Lesyk, “A survey of man in the middle attacks,” IEEE Communications Surveys & Tutorials, vol. 18, no. 3, pp. 2027–2051, 2016. [17] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” PLACE Manubot, Tech. Rep., 2019. PHOTO [18] M. Gomułkiewicz, M. Klonowski, and M. Kutyłowski, “Onions based HERE on universal re-encryption–anonymous communication immune against repetitive attack,” in International Workshop on Information Security Applications. Springer, 2004, pp. 400–410. [19] O. Pereira and R. L. Rivest, “Marked mix-nets,” in International Conference on Financial Cryptography and Data Security. Springer, 2017, pp. 353–369. [20] D. Chaum, “The dining cryptographers problem: Unconditional sender and recipient untraceability,” Journal of cryptology, vol. 1, no. 1, pp. 65–75, 1988. [21] P. Kotzanikolaou, G. Chatzisofroniou, and M. Burmester, “Broadcast anonymous routing (bar): scalable real-time anonymous communica- tion,” International Journal of Information Security, vol. 16, no. 3, pp. 313–326, 2017. [22] C. Egger, J. Schlumberger, C. Kruegel, and G. Vigna, “Practical attacks against the network,” in International workshop on recent advances in intrusion detection. Springer, 2013, pp. 432–451. [23] A. M. Piotrowska, J. Hayes, T. Elahi, S. Meiser, and G. Danezis, “The loopix anonymity system,” in 26th {USENIX} Security Symposium ({USENIX} Security 17), 2017, pp. 1199–1216. [24] R. Jansen, F. Tschorsch, A. Johnson, and B. Scheuermann, “The sniper John Doe Biography text here. attack: Anonymously deanonymizing and disabling the tor network,” Office of Naval Research Arlington VA, Tech. Rep., 2014. [25] T. Chothia and K. Chatzikokolakis, “A survey of anonymous peer-to-peer file-sharing,” in International Conference on Embedded and Ubiquitous Computing. Springer, 2005, pp. 744–755. [26] T. Ruffing, P. Moreno-Sanchez, and A. Kate, “P2p mixing and unlinkable bitcoin transactions.” in NDSS, 2017, pp. 1–15. [27] J. Han and Y. Liu, “Mutual anonymity for mobile p2p systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 19, no. 8, pp. 1009–1019, 2008. [28] N. Koblitz, “Elliptic curve cryptosystems,” Mathematics of computation, vol. 48, no. 177, pp. 203–209, 1987. [29] Y. Luo, X. Ouyang, J. Liu, and L. Cao, “An image encryption method based on elliptic curve elgamal encryption and chaotic systems,” IEEE Access, vol. 7, pp. 38 507–38 522, 2019. [30] T. Jager, “Verifiable random functions from weaker assumptions,” in Theory of Cryptography Conference. Springer, 2015, pp. 121–143. [31] J. Garay, A. Kiayias, and N. Leonardos, “The bitcoin backbone protocol: Jane Doe Biography text here. Analysis and applications,” in Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 2015, pp. 281–310. [32] J. Yu, D. Kozhaya, J. Decouchant, and P. Esteves-Verissimo, “Repucoin: Your reputation is your power,” IEEE Transactions on Computers, vol. 68, no. 8, pp. 1225–1237, 2019. [33] S. Hohenberger, S. Myers, R. Pass et al., “Anonize: A large-scale anonymous survey system,” in 2014 IEEE Symposium on Security and Privacy. IEEE, 2014, pp. 375–389. [34] E. Heilman, A. Kendler, A. Zohar, and S. Goldberg, “Eclipse attacks on bitcoins peer-to-peer network,” in 24th {USENIX} Security Symposium ({USENIX} Security 15), 2015, pp. 129–144.