Towards a First Step to Understand the Cryptocurrency Stealing Attack on Ethereum

Zhen Cheng1,2∗ Xinrui Hou2,4∗ Runhuai Li1,2 Yajin Zhou1,2† Xiapu Luo3 Jinku Li4 and Kui Ren1,2

1Zhejiang University 2Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies 3The Hong Kong Polytechnic University 4Xidian University

Abstract

We performed the first systematic study of a new attack on Ethereum that steals cryptocurrencies. The attack exploits unprotected JSON-RPC endpoints on Ethereum nodes, which attackers can use to transfer the Ether and ERC20 tokens to attacker-controlled accounts. This study aims to shed light on the attack, including the malicious behaviors and profits of attackers. Specifically, we first designed and implemented a honeypot that could capture real attacks in the wild. We then deployed the honeypot and report the results of the data collected over a period of six months. In total, our system captured more than 308 million requests from 1,072 distinct IP addresses. We further grouped attackers into 36 groups with 59 distinct Ethereum accounts. Among them, attackers of 34 groups were stealing the Ether, while the other 2 groups were targeting ERC20 tokens. Further behavior analysis showed that attackers were following a three-step pattern to steal the Ether. Moreover, we observed an interesting type of transaction called a zero gas transaction, which has been leveraged by attackers to steal ERC20 tokens. Finally, we estimated the overall profits of attackers. To engage the whole community, the dataset of captured attacks is released at https://github.com/zjuicsr/eth-honey.

1 Introduction

The Ethereum network [38] has attracted a lot of attention. Users leverage this platform to transfer the Ether, the official cryptocurrency of the network, or to build DApps (decentralized applications) using smart contracts. This, in turn, stimulates the popularity of the Ethereum network.

However, this popularity also attracts another type of user, i.e., attackers. They exploit the insecure settings of Ethereum clients, e.g., Go-Ethereum [7] and Parity [12], to steal cryptocurrencies. These clients, if not properly configured, expose a JSON-RPC endpoint without any authentication mechanism enforced. As a result, attackers can reach them remotely and invoke many privileged methods to manipulate the Ethereum account on behalf of the account holder who is using the client¹. Though we have seen sporadic reports of hackers stealing the Ether [11,15], there is no systematic study of such an attack. In other words, little insight into the attack has been provided.

To this end, we performed a systematic study to understand the cryptocurrency stealing attack on Ethereum. The purpose of our study is to shed light on this attack, including detailed malicious behaviors, attacking strategies, and attackers' profits. Our study is based on real attacks in the wild captured by our system.

Specifically, we designed and implemented a system called Siren. It consists of a honeypot that listens on the default JSON-RPC port, i.e., 8545, and accepts any incoming requests. To make our honeypot a reachable and valuable target, we register it as an Ethereum full node on the Internet and prepare a real Ethereum account that holds Ether. To implement an interactive honeypot, we use a real Ethereum client (Go-Ethereum in our work) as the back-end. We then redirect all incoming RPC requests to the back-end, except for those that may cause damage to our honeypot. Our honeypot logs the information of each request, including the method that the attacker intends to invoke and its parameters. For instance, our honeypot logs the account addresses that attackers intend to transfer the stolen Ether into. We call these accounts malicious accounts. We further crawl transactions from the Ethereum network and analyze them to identify suspicious accounts, which accept Ether from malicious accounts. Though we are unaware of the real owners of these accounts, they are most likely related to attackers, since transactions exist from malicious accounts to them. We then estimate attackers' profits by calculating the income of malicious and suspicious accounts.

∗ Co-first authors with equal contribution.
† Corresponding author.
¹ Note that the RPC interface is intended to be used with proper authentication.

USENIX Association 22nd International Symposium on Research in Attacks, Intrusions and Defenses 47

Findings. We performed a detailed analysis of the data collected in a six-month period². Some findings are as follows.

• Attacks captured. During the six-month period, our system captured 308.66 million RPC requests from 1,072 distinct IP addresses. Among these IP addresses, 9 are considered the main sources of attacks, since they account for around 83.8% of all requests. One particular IP address, 89.144.25.28, sent the most RPC requests, with 101.73 million requests in total. To hide their real IP addresses, attackers were leveraging the Tor network [20] to launch attacks. We also observed that some IP addresses of universities worldwide were probing our honeypot, though none of them were invoking methods to steal cryptocurrencies. Most of these IP addresses are from PlanetLab nodes [13].

• Cryptocurrencies targeted. We grouped attackers based on IP addresses and target Ethereum accounts. These accounts are the ones that attackers transferred the stolen cryptocurrencies into. In total, attackers are grouped into 36 clusters with 59 distinct malicious accounts. Among them, attackers of 34 groups were stealing the Ether, while the other 2 groups were targeting ERC20 tokens.

• Steal the Ether. We observed that attackers were following a three-step pattern to steal the Ether. They first probed potential victims, and then collected the information necessary to construct parameters. After that, they launched the attack, either passively waiting for the account to be unlocked by continuously polling the account state, or actively launching a brute-force attack to crack the user's password and unlock the account.

• Steal ERC20 tokens. Besides the Ether, attackers were also targeting ERC20 tokens. We observed a type of transaction called a zero gas transaction, in which the sender of a transaction does not need to pay any transaction fee. We find that attackers were leveraging this type of transaction to steal tokens from fisher accounts that intended to scam other users' Ether, and exploiting the mechanism to gain numerous bonus tokens.

Contributions. In summary, this paper makes the following main contributions:

• We designed and implemented a system that can capture real attacks that steal cryptocurrencies through unprotected JSON-RPC ports of vulnerable Ethereum nodes.

• We deployed our system and reported attacks observed in a period of six months.

• We reported various findings based on the analysis of the collected data. The dataset is released to the community for further study.

The rest of the paper is structured as follows: we introduce the background information in Section 2 and present the methodology of our system to capture attacks in Section 3. We then analyze the attack in Section 4 and estimate profits in Section 5. We discuss the limitations of our work in Section 6 and related work in Section 7. We conclude our work in Section 8.

² The dataset is released at https://github.com/zjuicsr/eth-honey.

2 Background

In this section, we briefly introduce the necessary background about the Ethereum network [38] to facilitate the understanding of this work.

2.1 Ethereum Clients and the JSON-RPC

An Ethereum node usually runs client software. Several clients exist, e.g., Go-Ethereum and Parity. Both clients support remote procedure calls (RPC) through the standard JSON-RPC API [4]. When these clients are started with a special flag, they listen on a specific port (e.g., 8545) and accept RPC requests from any host without any authentication. After that, functions can be remotely invoked on behalf of the account holder of the client, including privileged ones that send transactions (i.e., transfer cryptocurrencies). Note that, though the Ethereum network is a P2P network, attackers can discover and reach vulnerable Ethereum nodes directly through the HTTP protocol.

2.2 Ethereum Accounts

On the Ethereum platform, there exist two different types of accounts. One is the externally owned account (EOA), and the other is the smart contract account. An EOA can transfer the Ether, the official cryptocurrency in Ethereum, to another account. An EOA can also deploy a smart contract, which in turn creates the other type of account, i.e., the smart contract account. A smart contract is a program that executes exactly as it is set up to by its creator (the smart contract developer). In Ethereum, a smart contract is usually programmed using the Solidity language [14], and executes on a virtual machine called the Ethereum Virtual Machine (EVM) [38]. Both types of accounts are denoted in a hexadecimal format. For instance, the account address 0x6ef57be1168628a2bd6c5788322a41265084408a denotes an (attacker's) EOA, while the address 0x87c9ea70f72ad55a12bc6155a30e047cf2acd798 denotes a smart contract.
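To make the interface described in Sections 2.1 and 2.2 more concrete, the following is a minimal sketch of how a JSON-RPC 2.0 request body for a balance query could be assembled. The helper names `is_address` and `build_request` are hypothetical and belong to no Ethereum client; `eth_getBalance` is a standard method name, and the address is the attacker EOA quoted above. Nothing is sent over the network.

```python
import json
import re

# An Ethereum address is 20 bytes, written as "0x" followed by 40 hex digits.
ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")


def is_address(text):
    """Return True if text has the shape of a hexadecimal account address."""
    return bool(ADDRESS_RE.match(text))


def build_request(method, params, request_id=1):
    """Serialize a JSON-RPC 2.0 request body (hypothetical helper)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    })


eoa = "0x6ef57be1168628a2bd6c5788322a41265084408a"
assert is_address(eoa)

# Body that a client would POST to an exposed endpoint such as port 8545.
body = build_request("eth_getBalance", [eoa, "latest"])
print(body)
```

Because an unprotected endpoint accepts such a body from any host, the same few lines suffice for a legitimate wallet and for an attacker.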

ERC20 tokens [1] are digital tokens designed and used on the Ethereum platform, which can be shared or exchanged for other tokens or real currencies, e.g., US dollars. The community has created standards for issuing a new ERC20 token using a smart contract. For instance, the smart contract should implement a function called transfer() to transfer the token from one account to another, and a balanceOf() function to query the balance of the token. The values of ERC20 tokens vary across tokens and over time. For instance, the market capitalization of the Minereum token [8] was more than 7 million US dollars [9] in August 2017, and was around 40,000 US dollars in March 2019.

2.3 Transactions

Transactions can be used to transfer the Ether, or to invoke functions of a smart contract. Specifically, the to field of a transaction denotes the destination, i.e., an EOA or a smart contract. For a transaction that sends the Ether, the gas and gasPrice fields specify the gas limit and the gas price of the transaction. Listing 1 (Section 4) shows a real transaction that sends the Ether to 0x63710c26a9be484581dcac1aacdd95ef628923ab, a malicious EOA captured by our system. If the transaction is used to invoke a function of a smart contract, then the data field specifies the function to be invoked and its parameters. Note that a function is identified by a function signature, i.e., the first four bytes of the Keccak hash of the canonical expression of the function prototype, which includes the function name and the parameter types. Listings 2 and 3 (Section 4) show the data field, the signature of the invoked function, and its prototype.

Sending a transaction consumes gas, which is the unit that measures the work that needs to be done. It is similar to the liters of fuel consumed when driving a car. The actual cost of sending a transaction (the transaction fee) is calculated as the product of the consumed gas and the current gas price. The gas price is similar to the cost of each liter of fuel paid when filling up a car. The smallest unit of the Ether is the Wei. A Gwei consists of a billion Wei, while an Ether consists of a billion Gwei. The amount of gas consumed in a transaction is accumulated during instruction execution. Since transferring the Ether is a fixed sequence of instructions, the consumed gas is always 21,000.

The transaction fee is paid by the sender to the miner, who is responsible for packing transactions into blocks and executing smart contract instructions. To earn a higher transaction fee, miners tend to pack transactions with a higher gas price. Specifically, the sender of a transaction can specify the gas price in the gasPrice field (Listing 1, Section 4) to boost the chance of the transaction being packed. We have observed a trend of higher gas prices in the transactions that steal the Ether (Figure 3).

There are multiple RPC methods that can be remotely invoked to send (or sign) a transaction on behalf of the account holder, including eth_sendTransaction and eth_signTransaction. Note that the account needs to be unlocked before sending a transaction, which requires entering the user's password. Otherwise, the invocation of these methods will fail. In other words, to successfully steal the Ether from the victim's account, the account must be in the unlocked state. In Section 4.3, we will show that attackers continuously monitor the victim's account until it is unlocked by the user, or launch a brute-force attack using a predefined dictionary of popular passwords.

3 Methodology

Figure 1 shows the architecture of our system. To capture real attacks against Ethereum nodes with unprotected HTTP JSON-RPC endpoints, we design and implement a system called Siren. Our system consists of a honeypot that listens on the default JSON-RPC port, i.e., 8545. Any attempt to connect to this port of our honeypot is recorded. Our system forwards the request to the back-end Ethereum client (②) and returns the results (③). The Ethereum client used in our system is Go-Ethereum³. Our honeypot logs the RPC requests with parameters from the attacker, and then imports them into a database (④). To further estimate the profits of attackers, we crawl transaction records from the Ethereum network (⑤), and identify suspicious accounts that are connected with attackers' accounts. Our system combines the transaction records (⑥) of both malicious and suspicious accounts to generate the final result (⑦).

Figure 1: The overview of our system.

³ In this paper, if not otherwise specified, the Ethereum client we discuss is Go-Ethereum.

3.1 Ethereum Honeypot

To capture real attacks and understand attacking behaviors, we build a honeypot. It can interact with JSON-RPC requests that are invoked with malicious intent, e.g., transferring the Ether to attacker-controlled malicious accounts. Our

honeypot logs the information of the API invocation, including the method name, parameters, etc., for later analysis. Moreover, our system confines the attacker's behavior so that the invoked APIs cannot cause real damage to our honeypot.

To this end, we design a front-end/back-end architecture for our honeypot. Specifically, we implement a front-end that listens on port 8545 and accepts any incoming HTTP JSON-RPC request on this port. We also run a real Ethereum client in the back-end, which operates as a full Ethereum node. This client accepts only local JSON-RPC requests from the front-end, which means that our real Ethereum client is not publicly reachable by attackers. If the invoked API is in a predefined whitelist, our front-end forwards the request to the back-end client, and then forwards the response to the attacker. APIs that might cause a financial loss to our account are strictly forbidden. By doing so, our system protects the Ethereum node from actually being exploited, while at the same time facilitating the logging of requests, since all requests need to go through the front-end.

However, several challenges need to be addressed to make the system effective. For instance, the honeypot should behave like a real Ethereum node. Otherwise, attackers could become aware of the existence of our honeypot and refrain from malicious activities. In the following, we illustrate the ways our system attracts attackers, and further describe how our honeypot works.

Respond to probe requests. The default port number of the HTTP JSON-RPC service of an Ethereum node is 8545. Before launching a real attack, attackers usually send probe requests to check whether this port is actually open. For instance, the attacker invokes the web3_clientVersion method to check whether the target is a valid Ethereum node. The front-end of our honeypot accepts any incoming JSON-RPC request and responds with valid results by relaying responses from the back-end Ethereum client.

Advertise the existence of our Ethereum node. To capture attacks, our system needs to attract attackers. There are two options for this purpose. One option is to passively wait for attackers and respond to their probing requests. However, this strategy is not efficient, since the chance that our honeypot happens to be scanned is low, given the large space of valid IP addresses. The second option is to actively attract attackers. Specifically, to make our Ethereum node (i.e., our honeypot) visible to attackers, we register it on public websites that provide lists of full Ethereum nodes. The original purpose of such a list is to speed up the discovery of Ethereum nodes in the P2P network. However, the list also provides valuable information to attackers, since they can find potential targets without performing a time- and resource-consuming port scan. It turns out that this strategy is really effective. Our honeypot received incoming probe requests shortly after being registered on the list.

Pretend to be a valuable target. The main purpose of the attack is to steal cryptocurrencies. To make the attacker believe that our honeypot is a valuable target, we create a real Ethereum account with the address 0xa33023b7c14638f3391d705c938ac506544b25c3 and transfer some Ether into this account. Since the Ethereum network is public, the amount of Ether inside the account can be obtained by querying the network. Our honeypot returns this account address to attackers if they invoke the eth_accounts method to get the list of accounts owned by our Ethereum node. We also return the real amount of Ether inside this account if the eth_getBalance method is invoked.

Emulate a real transaction. After obtaining the account owned by our Ethereum node and its balance, attackers tend to steal the Ether by transferring it to accounts they control (malicious accounts). For instance, they could invoke the eth_sendTransaction method, which returns the hash value of a newly created transaction. Attackers could check the return value of the method invocation to learn the status of the Ether transfer. To make the attacker believe that the transaction is being processed, while not actually transferring any Ether from our account, we do not execute the eth_sendTransaction method. Instead, we log the parameters of the method invocation and return a randomly generated hash value to the attacker.

Log RPC requests. Our honeypot logs the methods invoked by the attacker, including the method name and parameters, along with the metadata of the attack, such as the IP address and the time. All the data is saved into a log file, which is then imported into a database.

3.2 Data Collection and Analysis

After capturing attacks and malicious account addresses, we estimate the profits gained by attackers. Our system leverages transactions launched from these accounts to find more attacker-controlled accounts. For this purpose, we crawl all transactions from the Ethereum network.

Our system first downloads Ethereum transactions, then imports them into a database. After that, we can conveniently combine the data captured by our honeypot with the transactions from the Ethereum network to generate the final analysis results.

4 Attack Analysis

In this section, we illustrate the data we collected, the grouping process of attackers, and detailed information about attacks that steal the Ether and ERC20 tokens.
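As an illustration of the front-end behavior described in Section 3.1, the sketch below logs every request, relays whitelisted methods to a back-end, answers eth_sendTransaction with a random hash, and rejects everything else. It is not the Siren implementation: the whitelist contents, the `handle_request` and `fake_backend` names, the in-memory log, and the choice of the JSON-RPC "Method not found" error for forbidden calls are all assumptions made for this example.

```python
import json
import secrets
import time

# Assumed whitelist of read-only methods that are safe to relay; the actual
# list used by the honeypot is not given in the text.
WHITELIST = {"web3_clientVersion", "net_version", "eth_accounts",
             "eth_getBalance", "eth_blockNumber"}
# Methods answered with a fake result instead of being executed.
EMULATED = {"eth_sendTransaction"}


def handle_request(raw_body, source_ip, backend, log):
    """Log one JSON-RPC request and decide how to answer it."""
    request = json.loads(raw_body)
    method = request.get("method")
    log.append({"time": time.time(), "ip": source_ip,
                "method": method, "params": request.get("params")})
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    if method in EMULATED:
        # Return a random 32-byte value shaped like a transaction hash.
        reply["result"] = "0x" + secrets.token_hex(32)
    elif method in WHITELIST:
        reply["result"] = backend(method, request.get("params", []))
    else:
        # -32601 is the JSON-RPC 2.0 "Method not found" error code.
        reply["error"] = {"code": -32601, "message": "Method not found"}
    return reply


def fake_backend(method, params):
    """Stand-in for the local Go-Ethereum client (assumption for the demo)."""
    answers = {"eth_accounts": ["0xa33023b7c14638f3391d705c938ac506544b25c3"]}
    return answers.get(method)


log = []
probe = handle_request(
    '{"jsonrpc":"2.0","method":"eth_accounts","params":[],"id":1}',
    "203.0.113.7", fake_backend, log)
steal = handle_request(
    '{"jsonrpc":"2.0","method":"eth_sendTransaction","params":[{}],"id":2}',
    "203.0.113.7", fake_backend, log)
print(probe["result"], steal["result"], len(log))
```

Routing every request through a single function like this is what makes the logging complete: the back-end is never reached without a log entry being written first.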

    // Date: Jul 1 20:44:09 GMT+08:00 2018
    // Source IP: 89.144.25.28
    {
      "jsonrpc": "2.0",
      "method": "eth_sendTransaction",
      "params": [{
        //The account address of our honeypot.
        "from": "0xa33023b7c14638f3391d705c938ac506544b25c3",
        //Attacker's account address.
        "to": "0x63710c26a9be484581dcac1aacdd95ef628923ab",
        "gas": "0x5208",
        "gasPrice": "0x199c82cc00",
        "value": "0x2425f024b7fd000"
      }],
      "id": 739296
    }
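The hexadecimal quantities in the request of Listing 1 can be decoded with a few lines of arithmetic, which also gives a worked instance of the fee formula from Section 2.3 (fee = consumed gas × gas price). This is a sketch: the variable names and the `to_ether` helper are hypothetical, and the fee is computed under the assumption that the full gas limit of a plain Ether transfer is consumed.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18

# Hex-encoded quantities copied from the request in Listing 1.
params = {
    "gas": "0x5208",
    "gasPrice": "0x199c82cc00",
    "value": "0x2425f024b7fd000",
}

gas_limit = int(params["gas"], 16)        # gas units
gas_price = int(params["gasPrice"], 16)   # Wei per gas unit
value = int(params["value"], 16)          # Wei to transfer

# Fee = consumed gas x gas price; a plain Ether transfer consumes 21,000 gas.
fee = gas_limit * gas_price


def to_ether(wei):
    """Convert Wei to Ether without floating-point rounding."""
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


print("gas limit:", gas_limit)
print("gas price (Gwei):", gas_price // WEI_PER_GWEI)
print("value (Ether):", to_ether(value))
print("fee (Ether):", to_ether(fee))
```

Decoded this way, the request specifies a gas limit of 21,000, a gas price of 110 Gwei, and a transfer of 0.162797 Ether, for a fee of 0.00231 Ether.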

Listing 1: The captured attack and the associated account address in the to field.

Figure 2: The number of daily RPC requests and distinct IP addresses captured by our honeypot. The seven red circles mean the data on these days is incomplete, either because our system was accidentally shut down, or because the network was not stable.

4.1 Data Overview

We deployed our system on a virtual machine in Alibaba Cloud, and collected data over a period of six months, i.e., from July 1 to August 31, 2018, and from November 1, 2018 to February 28, 2019. Unfortunately, there are three days (from July 24 to July 26) when the virtual machine was accidentally shut down, and four days (July 14, 15, 27 and 30) when the network was not stable. On these days, the data is either missing or incomplete. Figure 2 shows the number of daily RPC requests and the number of distinct IP addresses issuing these requests.

In total, our system observed 308.66 million RPC requests from 1,072 distinct IP addresses. On average, we received 1.72 million RPC requests each day (excluding the incomplete data). In terms of IP addresses, the average daily number is 62. This number reached its peak value (142) on December 24, 2018.

Among these RPC requests, 9 different IP addresses are the main sources of attacks, given that they contribute most of the RPC requests in our dataset. These 9 IP addresses sent around 258.70 million requests in total, which accounts for around 83.8% of all requests. It is worth noting that the IP address 89.144.25.28 sent the most RPC requests. It sent 101.73 million requests in total, accounting for 33.0% of the requests we received. We believe such aggressive behavior is meant to increase the chance of stealing the Ether, since the time window to transfer the Ether exists only while the user account is unlocked. We also observed that attacks from this IP address ceased on several days, e.g., from August 15 to 17.

RPC requests from universities worldwide. Interestingly, some RPC requests are from IP addresses that belong to universities. Specifically, our honeypot received requests from 66 IP addresses of 39 universities in 13 countries or regions. Among them, 37 IP addresses are from universities in the USA. For instance, two IP addresses (146.57.249.98 and 146.57.249.99) belong to the University of Minnesota. We further used reverse DNS lookups to obtain the domain names associated with these IP addresses. It turns out that all of them are associated with PlanetLab [13]. For instance, the domain names of the previous two IP addresses are planetlab1.dtc.umn.edu and planetlab2.dtc.umn.edu, respectively. Requests from these IP addresses do not perform malicious activities, e.g., transferring the Ether to other accounts. Most of them merely probe our honeypot to collect information, e.g., by invoking eth_getBlockByNumber. Though the exact intention of collecting such information is unknown, we believe these requests are mainly for research purposes.

Abuse of the Tor network and cloud services. Attackers are leveraging the Tor network and cloud services to hide their identities. For instance, some IP addresses belong to popular cloud services, e.g., DigitalOcean. Among the 1,072 distinct IP addresses, 370 are identified as Tor gateways [20] (299 of them performed malicious behaviors, e.g., trying to steal the Ether). All these IP addresses belong to the second group (Section 4.2). They are from 104 different ISPs in 39 countries. Using the Tor network to hide the real IP addresses makes tracing attackers more difficult.

4.2 Grouping Attackers Accounts

After collecting the data, our next step is to group attackers. However, due to the anonymity of the Ethereum network, it is hard to group them based on their identities. In this paper, we group attackers in the following ways.

First, we directly retrieve attackers' Ethereum accounts from the parameters of RPC requests, and group attackers based on these accounts. For instance, the to parameter of the method eth_sendTransaction denotes the destination account of a transaction, i.e., the account that the Ether will be transferred into. Attackers use this method to transfer (steal) the Ether to accounts they control. Listing 1 shows the parameters (in the JSON format) of a captured malicious transaction launched by an attacker. The value of the to

field is 0x63710c26a9be484581dcac1aacdd95ef628923ab, which is the attacker-controlled account address.

In our system, we monitor the to field of the methods eth_sendTransaction and eth_signTransaction, which directly transfer the Ether, and of two other methods, eth_estimateGas and miner_setEtherBase. Note that, though eth_estimateGas only estimates the gas consumption of a transaction and does not actually send it, executing this method on our honeypot is still considered a suspicious action, since it is usually followed by a real transaction. The method miner_setEtherBase is used to change the etherbase (or coinbase) account of a miner. This is the address that is rewarded when a new block is mined by the miner node. By changing this address, the attacker can obtain the reward Ether in place of the miner.

Second, we indirectly retrieve attackers' account addresses, and use them for grouping. This is the case when attackers steal ERC20 tokens by calling the standard transfer() [17] function defined in the ERC20 token standard [1]. These addresses are not in the parameters of the transaction. However, we can obtain them by retrieving the parameters of the smart contract method invocation. In the following, we use a real example to illustrate the steps. Listing 2 shows a captured RPC request that invokes the method eth_sendRawTransaction. We first decode the params field to obtain the to field (its value is 0x1a95b271b0535d15fa49932daba31ba612b52946). It turns out that this address is not an EOA, but the smart contract address of the Minereum token [8], whose peak market capitalization was more than 7 million US dollars in August 2017 [9]. We further decode the data field to retrieve the invoked function in the smart contract and its parameters. The result is shown in Listing 3. We can see that the malicious account address that receives the stolen ERC20 token is 0x0fe07dbd07ba4c1075c1db97806ba3c5b113cee0.

    //The parameters of invoking eth_sendRawTransaction.
    {
      "jsonrpc": "2.0",
      "method": "eth_sendRawTransaction",
      "params": ["0xf8a682125f8082ea60941a95b271b0535d15fa49932daba31ba6
                  12b5294680b844a9059cbb0000000000000000000000000fe07dbd
                  07ba4c1075c1db97806ba3c5b113cee00000000000000000000000
                  00000000000000000000000000000000000bebc2001ca095e64177
                  86f699db2dc195f47662c412bb125b8419b9af030ac237d64c5a92
                  50a0357a79a314eecd583f9be2235fd627d85c9af8fe292f9e47d4
                  fa261efc0487bc"],
      "id": 2
    }
    //The decoded params field of the invocation.
    {
      "nonce": 4703,
      "gasPrice": 0,
      "gasLimit": 60000,
      "from": "0x00a329c0648769a73afac7f9381e08fb43dbea72",
      //This is a smart contract address.
      "to": "0x1a95b271b0535d15fa49932daba31ba612b52946",
      "value": 0,
      "data": "0xa9059cbb0000000000000000000000000fe07dbd07ba4c1075c1db9
               7806ba3c5b113cee00000000000000000000000000000000000000000
               00000000000000000bebc200",
      "v": 28,
      "r": "0x95e6417786f699db2dc195f47662c412bb125b8419b9af030ac237d64c
            5a9250",
      "s": "0x357a79a314eecd583f9be2235fd627d85c9af8fe292f9e47d4fa261efc
            0487bc"
    }

Listing 2: A captured invocation of the method eth_sendRawTransaction and the decoded params field of the parameters.

    //Function prototype.
    Function: transfer(address _to, uint256 _value)
    Method ID: 0xa9059cbb
    _to: 0x0fe07dbd07ba4c1075c1db97806ba3c5b113cee0
    _value: 200000000 (0xbebc200)

Listing 3: The function to be invoked of the smart contract and its parameters. The _to field contains the attacker's account address that the token will be transferred to.

Third, we group the addresses from the previous two steps based on their IP addresses. We recorded the source IP address of each request, and if multiple attacker account addresses are associated with the same IP address, we combine these accounts into one group. However, the use of the Tor network may make this strategy ineffective, since the real IP addresses are unknown. In this case, we further group them based on the parameters used in launching the attacks, e.g., the id field in Listing 1. For instance, requests from the Tor network have the same id field. Based on this observation, we believe those requests were initiated by attackers in the same group, i.e., group 2 (Table 1).

Using the previous strategies, we group attackers into 36 groups. The detailed information of each group is shown in Table 1. Among them, attackers from 34 groups (group 1 to group 34) were stealing the Ether, and the other two groups (groups 35 and 36) were targeting ERC20 tokens. In the following, we present the detailed results of attackers' behaviors.

4.3 The Analysis of Ether Stealing

The attackers from group 1 to group 34 are stealing the Ether. They follow a three-step pattern to perform the attack.

Step 1 - Probing potential victims. The first step of the attack is to locate potential victims that have insecure HTTP JSON-RPC endpoints. Attackers can obtain potential victims by downloading a list of full Ethereum nodes, or by performing a port scan to find machines on which the target port (8545 in our study) is open. Attackers then issue RPC requests to determine whether the found IP address is an Ethereum node, or even a miner node, which is useful for sending a zero gas transaction (we will discuss this type of transaction in Section 4.4).

The most frequently used commands for Ethereum node probing are shown in Table 2. Specifically, net_version is used to identify the client's current network id, in order to check whether the Ethereum node is running on the mainnet or a testnet. As the name indicates, the testnet is used for testing purposes, and the Ether on this network has no value. By invoking this

52 22nd International Symposium on Research in Attacks, Intrusions and Defenses USENIX Association Table 1: The result of grouping attackers.

| # | Addresses | # of IP | # of RPC calls | Max calls per day | First capture date | Last capture date | Days of activity |
|---|---|---|---|---|---|---|---|
| 1 | 0x6a141e661e24c5e13fe651da8fe9b269fec43df0 | 57 | 72,915,681 | 1,207,185 | Jul. 1, 2018 | Feb. 28, 2019 | 178 |
| | 0x6e4cc3e76765bdc711cc7b5cbfc5bbfe473b192e | | | | | | |
| | 0x6ef57be1168628a2bd6c5788322a41265084408a | | | | | | |
| | 0x7097f41f1c1847d52407c629d0e0ae0fdd24fd58 | | | | | | |
| | 0xe511268ccf5c8104ac8f7d01a6e6eaaa88d84ebb | | | | | | |
| 2 | 0x581061c855c24ca63c9296791de0c9a1a5a44fcf | 309 | 363,860 | 167,183 | Jul. 2, 2018 | Feb. 28, 2019 | 166 |
| | 0x5fa38ab891956dd35076e9ad5f9858b2e53b3eb5 | | | | | | |
| | 0x8cacaf0602b707bd9bb00ceeda0fb34b32f39031 | | | | | | |
| | 0xab259c71e4f70422516a8f9953aaba2ca5a585ae | | | | | | |
| | 0xd9ee4d08a86b430544254ff95e32aa6fcc1d3163 | | | | | | |
| | 0x88b7d5887b5737eb4d9f15fcd03a2d62335c0670 | | | | | | |
| | 0xe412f7324492ead5eacf30dcec2240553bf1326a | | | | | | |
| 3 | 0xd6cf5a17625f92cee9c6caa6117e54cbfbceaedf | 14 | 13,315,318 | 365,878 | Jul. 16, 2018 | Feb. 25, 2019 | 78 |
| | 0x21bdc4c2f03e239a59aad7326738d9628378f6af | | | | | | |
| | 0x72b90a784e0a13ba12a9870ff67b68673d73e367 | | | | | | |
| 4 | 0x04d6cb3ed03f82c68c5b2bc5b40c3f766a4d1241 | 1 | 101,731,595 | 4,802,304 | Jul. 1, 2018 | Nov. 5, 2018 | 63 |
| | 0x63710c26a9be484581dcac1aacdd95ef628923ab | | | | | | |
| 5 | 0xb0ec5c6f46124703b92e89b37d650fb9f43b28c2 | 6 | 326,154 | 9,711 | Jul. 2, 2018 | Dec. 3, 2018 | 64 |
| 6 | 0x1a086b35a5961a28bead158792a3ed4b072f00fe | 3 | 6,791,438 | 3,346,904 | Nov. 1, 2018 | Feb. 28, 2019 | 21 |
| | 0x73b4c0725c900f0208bf5febb36856abc520de26 | | | | | | |
| | 0xaff4778d8d05e9595d540d40607c16f677c73cca | | | | | | |
| | 0xec13837d5e4df793e3e33b296bad8c4653a256cb | | | | | | |
| 7 | 0x241946e18b9768cf9c1296119e55461f22b26ada | 1 | 7,750,800 | 118,127 | Jul. 2, 2018 | Feb. 28, 2019 | 151 |
| 8 | 0x8652328b96ff12b20de5fdc67b67812e2b64e2a6 | 2 | 3,569,924 | 281,975 | Jul. 1, 2018 | Jul. 30, 2018 | 28 |
| 9 | 0xff871093e4f1582fb40d7903c722ee422e9026ee | 1 | 3,522 | 1,128 | Jul. 2, 2018 | Aug. 31, 2018 | 27 |
| 10 | 0x6230599f54454c695b5cd882064071fc39e6e562 | 1 | 13 | 13 | Jul. 5, 2018 | Jul. 5, 2018 | 1 |
| 11 | 0x2c5129bdfc6f865e17360c551e1c46815fe21ec8 | 1 | 618 | 618 | Jul. 5, 2018 | Jul. 5, 2018 | 1 |
| 12 | 0xeb29921d8eb0e32b2e7106afca7f53670e4107e5 | 1 | 5 | 5 | Jul. 29, 2018 | Jul. 29, 2018 | 1 |
| 13 | 0xe231c73ab919ec2b9aaeb87bb9f0546aa47581b1 | 1 | 10 | 5 | Jul. 4, 2018 | Jul. 17, 2018 | 5 |
| 14 | 0x5c8404b541881b9999ce89c00970e5e8862f8e88 | 3 | 80 | 46 | Jul. 10, 2018 | Jul. 15, 2018 | 5 |
| 15 | 0x5e87bab71bbea5f068df9bf531065ce40a86ebe4 | 1 | 274 | 274 | Jul. 11, 2018 | Jul. 11, 2018 | 1 |
| 16 | 0x97743cc5a168a59a86cf854cf04259abe736006a | 3 | 235,213 | 71,521 | Jul. 10, 2018 | Jul. 17, 2018 | 8 |
| | 0x9d6d759856bfcabf6f405f308d450b79e16dd4e2 | | | | | | |
| 17 | 0x02a4347035b7ba02d79238855503313ecb817688 | 3 | 11,246,017 | 175,417 | Nov. 13, 2018 | Feb. 28, 2019 | 97 |
| | 0xcb31bea86c3becc1f62652bc8b211fe1bd7f8aed | | | | | | |
| 18 | 0xe128bb377f284d2719298b0d652d65455c941b5b | 1 | 277 | 147 | Nov. 12, 2018 | Nov. 16, 2018 | 4 |
| 19 | 0xb744d5f73d27131099efee0b70062de6f770a102 | 2 | 237,481 | 64,462 | Dec. 18, 2018 | Feb. 14, 2019 | 17 |
| 20 | 0x0e0a930fb51c499b624d6ca56fdd9c95c5bf2e06 | 2 | 59,842 | 37,608 | Aug. 4, 2018 | Feb. 7, 2019 | 38 |
| | 0x2c022e9a0368747692b7bd532c435c7a78dc447d | | | | | | |
| | 0x3334f7f8bcf593794b01089b6ff4dc63fe023dfe | | | | | | |
| | 0x884aa595c10b3331ce551c2d9f905e52e21fa0bb | | | | | | |
| | 0xef462edb8880c4fd0738e4d3e9393660b9c5ac72 | | | | | | |
| 21 | 0x9781d03182264968d430a4f05799725735d9844d | 8 | 38,558 | 13,559 | Aug. 28, 2018 | Aug. 31, 2018 | 4 |
| 22 | 0x98c6428fbca6c0ff97570d822dd607f8a55080e5 | 6 | 270 | 140 | Aug. 2, 2018 | Aug. 5, 2018 | 2 |
| 23 | 0xa0b0209a04398cb61d845148623e68b3eff8f8cb | 1 | 135 | 135 | Jul. 9, 2018 | Jul. 9, 2018 | 1 |
| 24 | 0x21d8976138a2b280d441fd7b12456a1193cb2baf | 1 | 18,597 | 2,285 | Aug. 10, 2018 | Nov. 9, 2018 | 14 |
| 25 | 0xfed69981c21b96ff37fc52f9e19849126624ddfd | 5 | 963 | 825 | Aug. 13, 2018 | Aug. 19, 2018 | 3 |
| 26 | 0x31c3ecd12abe4f767cb446b7326b90b1efc5bbd9 | 3 | 440,962 | 49,995 | Nov. 1, 2018 | Feb. 14, 2019 | 41 |
| 27 | 0x5f622d88cd745ebb8ff2d4d6b707204c65243438 | 1 | 2,782 | 113 | Nov. 1, 2018 | Feb. 28, 2019 | 116 |
| 28 | 0xf2565682d4ce75fcf3b8e28c002dfc408ab44374 | 1 | 9 | 3 | Dec. 22, 2018 | Jan. 11, 2019 | 5 |
| 29 | 0xc97663c1156422e2ad33580adab45cad33cf7698 | 1 | 3,298 | 3,297 | Feb. 3, 2019 | Feb. 10, 2019 | 2 |
| 30 | 0xc6c42a825555fbef74d21b3cb6bfd7074325c348 | 9 | 73,302 | 4,327 | Nov. 4, 2018 | Jan. 6, 2019 | 22 |
| 31 | 0x454d7320d5751de29074a55ac95bbde312dd7615 | 1 | 11 | 11 | Feb. 5, 2019 | Feb. 5, 2019 | 1 |
| 32 | 0x4e25e7e76dbd309a1ab2a663e36ac09615fc81eb | 1 | 24 | 23 | Jan. 15, 2019 | Jan. 16, 2019 | 2 |
| 33 | 0xa6a21375ca42dcc26237f3e861d58f88fe72eab2 | 1 | 256 | 256 | Nov. 27, 2018 | Nov. 27, 2018 | 1 |
| 34 | 0xb703ae04fd78ab3b271177143a6db9e00bdf8d49 | 1 | 1,345 | 72 | Dec. 30, 2018 | Feb. 21, 2019 | 32 |
| 35 | 0x0fe07dbd07ba4c1075c1db97806ba3c5b113cee0 | 11 | 536,612 | 27,062 | Jul. 1, 2018 | Feb. 28, 2019 | 144 |
| 36 | 0xaa75fb2dcac2e3061a44c831baf0d4c2d4f92fd7 | 5 | 26,991 | 9,510 | Jul. 16, 2018 | Nov. 9, 2018 | 11 |
| | 0xffecffe94c3e87987454f2392676ccdb98b926f8 | | | | | | |
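One rough way to read Table 1 is to compare each group's busiest day with its average over active days. The sketch below does this for three rows copied from the table. The helper name `burstiness` is hypothetical, and it assumes that "days of activity" counts the days on which at least one request from the group was captured.

```python
# Rows copied from Table 1: group -> (total RPC calls, max calls per day, days of activity).
TABLE1_SAMPLE = {
    1: (72_915_681, 1_207_185, 178),
    4: (101_731_595, 4_802_304, 63),
    6: (6_791_438, 3_346_904, 21),
}


def burstiness(total_calls, max_calls_per_day, active_days):
    """Return (mean calls per active day, peak-to-mean ratio).

    Assumption: active_days is the number of days with at least one captured call.
    """
    mean_per_day = total_calls / active_days
    return mean_per_day, max_calls_per_day / mean_per_day


if __name__ == "__main__":
    for group, row in TABLE1_SAMPLE.items():
        mean_per_day, ratio = burstiness(*row)
        print(f"group {group}: mean {mean_per_day:,.0f} calls/day, "
              f"peak is {ratio:.1f}x the mean")
```

Under this reading, the peak day of group 6 is roughly ten times its daily mean, while groups 1 and 4 peak at roughly three times their mean. This suggests, but does not prove, that some groups send requests in short bursts rather than at a steady rate.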

method, the attacker could find the right targets running the Ethereum mainnet. The rpc_modules command returns all enabled modules. By probing this information, attackers learn which modules are enabled and can then invoke the APIs inside each module accordingly. Besides these two methods, the other ones shown in Table 2 also serve the purpose of collecting client information.

Table 2: Most used commands for probing.

| Command | # of IP addresses | # of RPC requests |
|---|---|---|
| net_version | 122 | 4,822,620 |
| rpc_modules | 81 | 3,815 |
| web3_clientVersion | 103 | 4,495,312 |
| eth_getBlockByNumber | 325 | 1,190,445 |
| eth_blockNumber | 225 | 27,019,686 |
| eth_getBlockByHash | 214 | 1,633 |

Step 2 - Preparing attacking parameters After locating potential victims, attackers need to prepare the necessary data to launch further attacks. In order to steal the Ether, the attacker needs to send an Ethereum transaction with valid parameters. Specifically, each transaction needs from_address and to_address as the source and destination of the transaction, plus optional parameters including gas, gasPrice, value and nonce. For the attack to succeed, valid values must be prepared for these parameters.

Table 3: Commands used to prepare attacking parameters.

| Command | # of IP addresses | # of RPC requests |
|---|---|---|
| eth_accounts | 615 | 27,040,164 |
| eth_coinbase | 64 | 87,442 |
| personal_listAccounts | 11 | 95 |
| personal_listWallets | 5 | 173,243 |
| eth_gasPrice | 21 | 63,133 |
| eth_getBalance | 493 | 93,585,372 |
| eth_getTransactionCount | 63 | 2,411,504 |

• from_address: The from_address in the transaction is the victim's Ethereum account address. The attacker can obtain this value by invoking the following methods: eth_accounts, eth_coinbase, personal_listAccounts, and personal_listWallets.

• to_address: The to_address in the transaction specifies the destination of the transaction. Attackers set this field to an account under their control.

• value: This is the amount of Ether that will be transferred to the to_address. In order to maximize their income, attackers tend to transfer all the Ether in the victim's account, leaving only a small amount to pay the transaction fee. The method eth_getBalance is used to get the balance of the victim's account.

• gasPrice: The attacker could set a high gasPrice to increase the chance of the transaction being executed (i.e., packed into a block by miners). For instance, the attacker (0x21bdc4c2f03e239a59aad7326738d9628378f6af) tends to use a much higher gasPrice in the transaction to steal Ether. Figure 3 shows the gas price of transactions from attackers and normal users. We will illustrate it later in this section.

Figure 3: The comparison of the gas price in the transactions of attackers and normal users. The typical gas price is 21 Gwei, while the gas price of attackers' transactions is much higher. (x-axis: Group Number, 1 to 8; y-axis: Gas Price (Gwei); series: Normal and Attacker.)

Step 3 - Stealing Ether In order to successfully send a transaction, it needs to be signed using the victim's private key. However, the private key is locked by default, and a password is needed to unlock it. We observed two different behaviors that attackers leverage to solve this problem.

• Continuously polling: Attackers continuously invoke the methods eth_sendTransaction or eth_signTransaction in the background. If a legitimate user wants to send a transaction at the same time, he or she will unlock the account by providing the password. This leaves a small time window in which the attacker's attempt to send a transaction will succeed.

However, in order to successfully launch the attack, there are still two challenges. First, the time window is really small. Attackers have to invoke the method to send a transaction at the very moment when the user is unlocking the account. To increase the chance of a successful attack, the operation to send the transaction should be very frequent. This explains our observation that some attackers repeatedly invoke the previously mentioned methods at a very high frequency, nearly 50

requests per second. Second, since the attacker sends the transaction at the same time as the user (i.e., when the user is unlocking his or her account), the attacker's transaction may fail if the user's transaction is accepted by the miner first and the remaining balance of the account is not sufficient for the attacker's transaction. In order to ensure that the transaction is accepted by miners in a timely fashion, the attacker uses a much higher gasPrice than normal transactions to bribe miners.

Figure 3 shows the gas price of transactions from attackers and normal users. Specifically, we first calculate the average gas price of captured transactions from groups 1 to 8 (the solid line in the figure). Then we calculate the average gas price of transactions of normal users over six months (the other line in the figure). It turns out that the gas price of attackers' transactions is much higher (from 15 times to 4,500 times) than that of a normal transaction. Setting a higher gas price increases the speed at which transactions are packed into a block. This strategy is very effective: we have observed several cases in which the transaction with a higher gasPrice succeeded, while the ones with a lower gasPrice failed [18, 19].

• Brute force cracking: Besides the polling strategy, some attackers leverage a brute force attack to guess the password. Specifically, they try to unlock the account using the passwords in a predefined dictionary. Since the Ethereum client does not limit the number of wrong password attempts during a certain time period, this attack is effective if the victim uses a weak password. For instance, attackers from group 11 leveraged this strategy and tried a dictionary with more than 600 weak passwords, e.g., qwerty123456, margarita and 192837465. Another attacker from group 1 took the same approach, but only tried one password (ppppGoogle). The reason why this specific password was used is unknown. However, we think it may be the default password for some customized Ethereum clients.

Interestingly, after successfully unlocking the account, the attacker sets a relatively long timeout value when invoking personal_unlockAccount. By doing so, the account will not be locked again for a long time period, so that the attacker can perform further attacks much more easily.

4.4 The Analysis of ERC20 Token Stealing

Attackers from two groups (group 35 and 36) are targeting ERC20 tokens. ERC20 is a technical standard used by smart contracts on the Ethereum network to implement exchangeable tokens [1]. An ERC20 token can be viewed as a kind of cryptocurrency that could be sold on some markets, thus becoming a valuable target.

Before illustrating the detailed attacking behaviors, we will first discuss an interesting type of transaction called zero gas transaction, which we observed in our dataset. It exploits the packing strategy of some miners to send transactions without paying any transaction fee. By using this type of transaction, attackers could perform malicious activities to steal ERC20 tokens from addresses with leaked private keys, or exploit the AirDrop mechanism of ERC20 smart contracts to gain extra bonus tokens at nearly zero cost.

Zero gas transaction Sending a transaction usually consumes gas (Section 2.3). The actual cost is calculated as the product of the amount of gas consumed and the current gas price. The amount of gas consumed during a transaction depends on the instructions executed in the Ethereum virtual machine, while the gas price is specified by the user who sends the transaction. If it is not specified, a default gas price will be used.

Interestingly, our honeypot captured many attempts to send transactions with a zero value in the gasPrice field. This drew our attention for further investigation. We wanted to understand whether such transactions could be successful, and the intentions behind sending them. After performing multiple experiments, we found that transactions with a zero gas price received through the P2P network are not accepted by the miner and are discarded as invalid. However, if such a transaction is created and launched on the miner node itself (i.e., the node that successfully mines a new block is the one sending the zero gas transaction), then the transaction will be packed into the block by the miner and accepted by the network.

This explains the existence of such transactions captured by our honeypot. In particular, attackers were launching zero gas transactions on every vulnerable Ethereum node, in the hope that the node is a miner node that successfully mines a new block. Though the chance looks really slim, we found several successful cases in reality, e.g., the first several transactions in block 5899499 [3]. Most of these transactions transfer ERC20 tokens to the address 0x0fe07dbd07ba4c1075c1db97806ba3c5b113cee0, which is a malicious account owned by the attacker in group 35.

Attack I: stealing tokens from fisher accounts The first type of attack leverages zero gas transactions to steal tokens from fisher accounts. In order to understand this attack, we first explain what a fisher account is.

A fisher account is an Ethereum account whose private key has been intentionally leaked on the Internet by some attackers. They also transfer some ERC20 tokens to the account as the bait. Since the private key of the account is leaked, other users could use the private key to transfer out the ERC20 tokens. However, there is one problem in this process. In order to transfer the ERC20 tokens, the account should hold some Ether to pay the transaction fee. As a result, one may transfer some Ether into this account, in the hope of getting the ERC20 tokens. Unfortunately, after transferring the Ether into this

account, the Ether will be transferred out to some accounts immediately by attackers. That is why such an account is called a fisher account. The main purpose of leaking the private key is to lure others into transferring Ether into the fisher account.

For instance, there is a fisher account whose address is 0xa8015df1f65e1f53d491dc1ed35013031ad25034 [2]. The attacker bought 75,000 ICX (an ERC20 token), worth around 66,000 US dollars, as the fishing bait. The fisher then released the private key of that account on the Internet, seemingly by accident. Anyone who transfers Ether into this account in the hope of obtaining the ICX tokens will be trapped and lose the transferred Ether.

Interestingly, by leveraging the zero gas transaction previously discussed, attackers could steal the ERC20 tokens in the fisher account. Specifically, attackers could send transactions with a zero gas price to transfer the ERC20 tokens out of the fisher account. If the transaction is successful, the attackers obtain the ERC20 tokens without any cost.

In our dataset, the user in group 35 (the address is 0x0fe07dbd07ba4c1075c1db97806ba3c5b113cee0) was performing this type of attack. In total, the attacker sent 61,158 RPC requests, stealing 161 different types of ERC20 tokens. We show the detailed information of the top ten ERC20 tokens that this attacker is targeting in Table 4. We observed several different IP addresses (62.75.138.194, 77.180.167.78, 77.180.200.1, 92.231.160.88, 92.231.169.137, 95.216.158.152, etc.) from this attacker.

Table 4: The top ten ERC20 tokens that attackers are targeting.

| ERC20 token addresses | # of RPC requests | Token name |
|---|---|---|
| 0x1a95b271b0535d15fa49932daba31ba612b52946 | 11,788 | MNE |
| 0xee2131b349738090e92991d55f6d09ce17930b92 | 8,998 | DYLC |
| 0x0775c81a273b355e6a5b76e240bf708701f00279 | 8,099 | BUL |
| 0xbdeb4b83251fb146687fa19d1c660f99411eefe3 | 7,735 | SVD |
| 0x0675daa94725a528b05a3a88635c03ea964bfa7e | 7,359 | TKLN |
| 0x87c9ea70f72ad55a12bc6155a30e047cf2acd798 | 7,058 | LEN |
| 0x4c9d5672ae33522240532206ab45508116daf263 | 5,510 | VGS |
| 0x23352036e911a22cfc692b5e2e196692658aded9 | 4,011 | FDZ |
| 0xc56b13ebbcffa67cfb7979b900b736b3fb480d78 | 2,219 | SAT |
| 0x89700d6cd7b77d1f52c29ca776a1eae313320fc5 | 1,708 | PMD |

Attack II: Exploiting the airdrop mechanism Airdrop is a marketing strategy in which token holders receive bonus tokens based on some criteria, e.g., the total amount of tokens they hold. The conditions for sending out bonus tokens depend on the individual token maintainer.

Some attackers leverage the zero gas transaction to obtain free LEN tokens. Specifically, the LEN token has an airdrop strategy: if a new user A sends any amount of LEN tokens to a user B, then both A and B will be rewarded with 18,895 LEN tokens. Hence, the attacker could create a large number of new accounts, and then transfer LEN tokens from them to the attacker's address (address 0xffecffe94c3e87987454f2392676ccdb98b926f8 in group 36). By doing so, the new account receives bonus tokens, which are then transferred to the attacker's account, while at the same time the attacker's account also receives the bonus. We observed many attempts of such transactions using a zero gas price, with 7,058 different source account addresses and one destination address (the attacker's address). Such a transaction does not cost the attacker any transaction fee, and the attacker is rewarded with ERC20 tokens. By using this method, the attacker even became a large holder of this token (2.4%) [16].

5 Transaction Analysis

After capturing malicious accounts and analyzing the detailed behaviors of attackers, we further estimate the profits of attackers. Though we can directly get an estimation by calculating the income of malicious accounts, attackers may use other account addresses that have not been captured by our honeypot. We call these addresses, which are potentially controlled by attackers, suspicious accounts.

In our system, we take the following steps to detect suspicious accounts. The basic idea is that if the attacker transfers Ether from a malicious account to any other account, it is highly possible that the destination account is connected with the attacker. The attacker has no reason to transfer the Ether to an unrelated account. Note that the attacker could transfer the Ether to a cryptocurrency market, where it can be exchanged for other types of cryptocurrencies or real currency. These markets should be removed from the suspicious accounts in our study 4.

4 We obtained the addresses of cryptocurrency markets from the Etherscan website [6].

To this end, we used an idea similar to taint analysis [35] to find suspicious accounts. Specifically, we treat the malicious accounts captured by our system as the taint sources, and propagate the taint tags through the transaction flows until reaching the taint sinks, i.e., the cryptocurrency markets. We also stop this process if the number of accounts traversed reaches a certain threshold. In our study, we use 3 as the threshold. All the accounts on the path from the taint source to the taint sink are considered tainted and suspicious, as long as the endpoint is a cryptocurrency market. Other nodes are marked as unknown, since we do not have further knowledge about whether they are suspicious or not. Figure 4 shows an example of this process to detect the suspicious accounts from the malicious one 0xe511268ccf5c8104ac8f7d01a6e6eaaa88d84ebb. In this figure, the cryptocurrency market nodes are marked with the house symbol, and the original attacker we captured is

Table 5: Our estimation of attackers' profits in Ether and US dollars. The price of one Ether is around 139 US dollars (March, 2019). We remove the addresses with zero profit from the table.

| # | Addresses | Malicious (Ether) | Malicious (USD) | Plus Suspicious (Ether) | Plus Suspicious (USD) |
|---|---|---|---|---|---|
| 1 | 0x6a141e661e24c5e13fe651da8fe9b269fec43df0 | 116.91 | $16,280.23 | 814.45 | $113,412.00 |
| 1 | 0x6e4cc3e76765bdc711cc7b5cbfc5bbfe473b192e | 56.16 | $7,820.34 | 794.67 | $110,657.59 |
| 1 | 0x6ef57be1168628a2bd6c5788322a41265084408a | 37.79 | $5,261.74 | 1,420.06 | $197,743.19 |
| 1 | 0x7097f41f1c1847d52407c629d0e0ae0fdd24fd58 | 281.44 | $39,191.07 | 1,331.74 | $185,444.98 |
| 1 | 0xe511268ccf5c8104ac8f7d01a6e6eaaa88d84ebb | 152.26 | $21,201.86 | 1,332.53 | $185,554.52 |
| 2 | 0x5fa38ab891956dd35076e9ad5f9858b2e53b3eb5 | 48.24 | $6,716.88 | 94.28 | $13,129.14 |
| 2 | 0x8cacaf0602b707bd9bb00ceeda0fb34b32f39031 | 0.00 | $0.14 | 10.66 | $1,483.80 |
| 2 | 0xab259c71e4f70422516a8f9953aaba2ca5a585ae | 2.53 | $351.64 | 4.18 | $581.92 |
| 2 | 0xd9ee4d08a86b430544254ff95e32aa6fcc1d3163 | 54.12 | $7,535.80 | 55.72 | $7,759.26 |
| 2 | 0x88b7d5887b5737eb4d9f15fcd03a2d62335c0670 | 0.24 | $33.41 | 0.24 | $33.41 |
| 2 | 0xe412f7324492ead5eacf30dcec2240553bf1326a | 0.24 | $33.96 | 0.24 | $33.96 |
| 3 | 0xd6cf5a17625f92cee9c6caa6117e54cbfbceaedf | 2,030.19 | $282,704.44 | 2,030.19 | $282,704.44 |
| 3 | 0x21bdc4c2f03e239a59aad7326738d9628378f6af | 357.78 | $49,820.26 | 58,692.91 | $8,172,988.20 |
| 3 | 0x72b90a784e0a13ba12a9870ff67b68673d73e367 | 558.32 | $77,746.63 | 59,298.45 | $8,257,309.40 |
| 4 | 0x04d6cb3ed03f82c68c5b2bc5b40c3f766a4d1241 | 2.38 | $331.13 | 2.38 | $331.13 |
| 4 | 0x63710c26a9be484581dcac1aacdd95ef628923ab | 19.44 | $2,706.79 | 38.88 | $5,413.47 |
| 5 | 0xb0ec5c6f46124703b92e89b37d650fb9f43b28c2 | 0.87 | $120.84 | 1.64 | $227.89 |
| 6 | 0x1a086b35a5961a28bead158792a3ed4b072f00fe | 80.22 | $11,170.51 | 4,821.68 | $671,419.10 |
| 6 | 0x73b4c0725c900f0208bf5febb36856abc520de26 | 1.10 | $153.12 | 1.10 | $153.12 |
| 6 | 0xec13837d5e4df793e3e33b296bad8c4653a256cb | 1.62 | $226.21 | 1.62 | $226.21 |
| 7 | 0x241946e18b9768cf9c1296119e55461f22b26ada | 1.53 | $213.74 | 1.53 | $213.74 |
| 8 | 0x8652328b96ff12b20de5fdc67b67812e2b64e2a6 | 37.75 | $5,256.18 | 1,066.31 | $148,483.43 |
| 9 | 0xff871093e4f1582fb40d7903c722ee422e9026ee | 0.00 | $0.69 | 9.34 | $1,300.01 |
| 11 | 0x2c5129bdfc6f865e17360c551e1c46815fe21ec8 | 113.93 | $15,864.25 | 506.86 | $70,580.16 |
| 15 | 0x5e87bab71bbea5f068df9bf531065ce40a86ebe4 | 0.05 | $6.42 | 0.05 | $6.42 |
| 17 | 0x02a4347035b7ba02d79238855503313ecb817688 | 4.30 | $598.46 | 4.30 | $598.46 |
| 17 | 0xcb31bea86c3becc1f62652bc8b211fe1bd7f8aed | 0.21 | $29.19 | 0.21 | $29.19 |
| 21 | 0x9781d03182264968d430a4f05799725735d9844d | 50.32 | $7,006.89 | 61.47 | $8,560.18 |
| 26 | 0x31c3ecd12abe4f767cb446b7326b90b1efc5bbd9 | 0.10 | $13.23 | 0.10 | $13.23 |
| 28 | 0xf2565682d4ce75fcf3b8e28c002dfc408ab44374 | 173.99 | $24,228.78 | 866.10 | $120,604.60 |
| 30 | 0xc6c42a825555fbef74d21b3cb6bfd7074325c348 | 1.50 | $208.36 | 1.50 | $208.36 |
| 32 | 0x4e25e7e76dbd309a1ab2a663e36ac09615fc81eb | 0.04 | $6.27 | 0.05 | $7.34 |
| 34 | 0xb703ae04fd78ab3b271177143a6db9e00bdf8d49 | 8.02 | $1,116.77 | 8.02 | $1,116.77 |
| Total | | 4,193.58 | $583,956.23 | 133,273.46 | $18,558,328.61 |
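The taint-style propagation that Section 5 uses to separate suspicious accounts from unknown ones can be sketched as follows. This is a minimal sketch under stated assumptions, not the exact implementation used in the study: the transfer graph is a plain dictionary, the names `classify_accounts` and `max_accounts` are hypothetical, the account labels are placeholders, and the threshold of 3 is interpreted as the maximum number of intermediate accounts on a path.

```python
def classify_accounts(transfers, sources, markets, max_accounts=3):
    """Split accounts reachable from malicious sources into suspicious and unknown.

    transfers: dict mapping a sender to the list of accounts it sent Ether to.
    sources:   set of malicious accounts captured by the honeypot (taint sources).
    markets:   set of known cryptocurrency market accounts (taint sinks).
    max_accounts: assumed meaning of the threshold, i.e. the maximum number
                  of intermediate accounts followed on one path.
    """
    suspicious = set()
    reached = set()

    def walk(node, path):
        # path holds the intermediate accounts between the source and node.
        for receiver in transfers.get(node, ()):
            if receiver in markets:
                # The path ends at a market: every account on it is suspicious.
                suspicious.update(path)
            elif (receiver not in sources and receiver not in path
                    and len(path) < max_accounts):
                reached.add(receiver)
                walk(receiver, path + [receiver])

    for source in sources:
        walk(source, [])
    # Accounts that were reached but never lie on a path to a market are unknown.
    return suspicious, reached - suspicious


# Hypothetical toy graph: one short path to a market and one path that is too long.
TRANSFERS = {
    "attacker": ["a", "x"],
    "a": ["b"],
    "b": ["market"],
    "x": ["y"],
    "y": ["z"],
    "z": ["w"],
    "w": ["market"],
}

if __name__ == "__main__":
    suspicious, unknown = classify_accounts(TRANSFERS, {"attacker"}, {"market"})
    print(sorted(suspicious), sorted(unknown))
```

In the toy graph, accounts a and b lie on a path that reaches the market within the threshold, so they are classified as suspicious. The path through x, y and z is cut off before it reaches the market, so those accounts stay unknown.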

marked as a circle. Box nodes without background color are the suspicious accounts we identified, and the others with gray background color are unknown addresses. A line between two nodes denotes transactions between them. We also put the amount of Ether transferred above the line. In total, we identified 113 suspicious addresses and 936 unknown ones.

Figure 4: One example of detecting suspicious accounts through the transaction analysis. The house-shaped nodes are the cryptocurrency markets, and the circle is the malicious account. The boxes without background color are suspicious accounts, while the ones with gray background are unknown accounts.

After that, we estimate attackers' profits. We first calculate the lower bound of the profits by only considering the income of the malicious accounts. Since our honeypot observed their behaviors of stealing Ether, we have high confidence that these malicious accounts belong to attackers. Then we take the income of suspicious addresses into consideration. Since these addresses are not directly captured by our honeypot, we do not have hard evidence that they belong to attackers. However, they may be connected with or controlled by attackers. Table 5 shows the estimated profits. We remove the addresses with zero profit from the table (e.g., addresses in group 10), and we do not count the attackers from groups 35 and 36, since they are targeting ERC20 tokens, whose value is hard to estimate due to the dramatic changes of the token price. It is worth noting that the actual income of attackers is far more than the value shown in the table, since there are many attacks in the wild that were not captured by our system.

6 Discussion

Though we have adopted several ways to make our honeypot an interactive one, cautious attackers can still detect the existence of our honeypot and refrain from malicious activities thereafter. For instance, the attacker can first send a small amount of Ether to a newly generated address and then observe the return value (the transaction hash) of this transaction. Since the transaction to send the Ether in our honeypot does not actually happen, the return value is an invalid one (a randomly generated value). Attackers can also simply send some uncommon commands and observe the return value to detect the honeypot. Nevertheless, it is an open research question to propose more effective countermeasures to improve the honeypot.

In this paper, we take a conservative way to detect suspicious accounts and estimate the profits of attackers. Specifically, we leverage the knowledge of whether an address belongs to a cryptocurrency market and mark the tainted accounts whose Ether eventually flows into cryptocurrency markets as suspicious. However, the mapping between addresses and cryptocurrency markets is incomplete, since these addresses are manually labelled. Some suspicious accounts may be identified as unknown ones, hence introducing false negatives to our work. Moreover, our estimation is based on the attackers' addresses collected by our honeypot (in six months). There do exist attackers missed by our system, and the profits of these attackers are not included in our estimation. We believe the total income of attackers in the wild is much higher than our estimation.

In this paper, attackers are exploiting the unprotected JSON-RPC interface to launch attacks. Though it is simple to fix the problem by changing the configuration of the Ethereum client, we are surprised by the fact that 7% of Ethereum nodes are still vulnerable. Specifically, we performed a port scan of the 15,560 Ethereum public nodes [5] and found that around 1,000 of them are reachable through the RPC port without any authentication 5. This fact demonstrates the severity of this problem, and advocates the need for a better understanding of this issue in the community (the purpose of our work).

5 For ethical reasons, we did not perform any RPC calls that may cause damage to those nodes.

7 Related work

Honeypot Honeypot systems have been widely used to capture and understand attacks by capturing malicious activities [22, 26, 33, 36]. For instance, HoneyD [33] is one of the best-known honeypot projects. It can simulate the network stack of many operating systems and arbitrary routing topologies, thus making it a highly interactive one. Collapsar [26] can manage a large number of interactive honeypots, e.g., honeypot farms. The concept of honeypot has been adopted to detect attacks on IoT devices [23, 25, 29, 32] and mobile devices [39]. Our system works on Ethereum clients, which are different targets from those of previous systems. The general idea of attracting attacks and logging behaviors is similar, but the challenges are different.

Security issues of smart contracts One reason that Ethereum is becoming popular is its support for smart contracts. Developers can use contracts to develop decentralized apps (or Dapps), including lottery games and digital tokens. Since its introduction, security issues of smart contracts have been widely studied by previous researchers. Atzei et al. systematically analyzed the security vulnerabilities of Ethereum smart contracts, and summarized several common pitfalls when programming smart contracts [21]. For instance, since the stack size of the Ethereum virtual machine is limited, attackers could leverage this to hijack the control flow of smart contracts. A system called teEther [28] was proposed to automatically generate exploits to attack vulnerable smart contracts. The evaluation showed that among 38,757 unique smart contracts, 815 of them could be automatically exploited.

To mitigate the threats, some tools to analyze smart contracts [10, 24, 27, 31, 37, 40] or fix vulnerable contracts [34] have been proposed. For instance, Oyente [30] is a system that can automatically detect smart contract vulnerabilities using a symbolic execution engine. Moreover, it can make smart contracts less vulnerable by proposing some new semantics for the Ethereum virtual machine. Sereum [34] automatically fixes the reentrancy vulnerabilities in smart contracts by modifying the Ethereum virtual machine. Erays [40] is a tool to analyze smart contracts without requiring the source code. In particular, it translates the bytecode to high-level code that is readable for manual analysis. Securify [37] automatically proves smart contract behaviors as safe or unsafe. Other similar tools to analyze smart contracts include Mythril [10] and Maian [31].

8 Conclusion

In this paper, we performed a systematic study to understand the cryptocurrency stealing attack on Ethereum. To this end, we first designed and implemented a system that captured real attacks, and further analyzed the attackers' behaviors and estimated their profits. We report our findings in the paper and release the dataset of attacks (https://github.com/zjuicsr/eth-honey) to engage the whole research community.

Acknowledgements The authors would like to thank the anonymous reviewers for their insightful comments that helped improve the presentation of this paper. Special thanks go to Siwei Wu and Quanrun Meng for their constructive suggestions and feedback. This work was partially supported by Zhejiang Key R&D Plan (Grant No. 2019C03133), the National Natural Science Foundation of China under Grant 61872438, the Fundamental Research Funds for the Central Universities, and the Key R&D Program of Shaanxi Province of China under Grant 2019ZDLGY12-06. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of funding agencies.

References

[1] Erc20 token standard, 2018. https://theethereum.wiki/w/index.php/ERC20_Token_Standard.

[2] Ethereum address 0xa8015df1f65e1f53d491dc1ed35013031ad25034, 2018. https://etherscan.io/address/0xa8015df1f65e1f53d491dc1ed35013031ad25034.

[3] Ethereum block 5899499, 2018. https://etherscan.io/block/5899499.

[4] Ethereum json rpc, 2018. https://github.com/ethereum/wiki/wiki/JSON-RPC.

[5] Ethernodes, 2018. http://www.ethernodes.org.

[6] Etherscan, 2018. https://etherscan.io.

[7] Go ethereum, 2018. https://github.com/ethereum/go-ethereum.

[8] Minereum, 2018. https://www.minereum.com/.

[9] Minereum market capital, 2018. https://coinmarketcap.com/currencies/minereum/.

[10] Mythril, 2018. https://github.com/ConsenSys/mythril.

[11] Over $20 million stolen in ethereum by hackers from unsecured nodes, 2018. https://sensorstechforum.com/20-million-stolen-ethereum-hackers-unsecured-eth-nodes/.

[12] Parity, 2018. https://www.parity.io/.

[13] Planetlab, 2018. https://www.planet-lab.org/.

[14] Solidity, 2018. https://solidity.readthedocs.io/.

[15] There's some intense web scans going on for bitcoin and ethereum wallets, 2018. https://www.bleepingcomputer.com/news/security/theres-some-intense-web-scans-going-on-for-bitcoin-and-ethereum-wallets/.

[16] Token learnchain, 2018. https://etherscan.io/token/0x87c9ea70f72ad55a12bc6155a30e047cf2acd798?a=0xffecffe94c3e87987454f2392676ccdb98b926f8.

[17] Transfer token balance, 2018. https://theethereum.wiki/w/index.php/ERC20_Token_Standard#Transfer_Token_Balance.

[18] Ethereum transaction 0x8d95864cc7142ef883148d45697b49be2c30a4275ebbcdbd2684acd809258b6, 2019. https://web.archive.org/web/20190307141313/https://etherscan.io/tx/0x8d95864cc7142ef883148d45697b49be2c30a4275ebbcdbd2684acd809258b64.

USENIX Association, 22nd International Symposium on Research in Attacks, Intrusions and Defenses
