A Blockchain-Based Spatial Data Trading Framework

Hui Liu, Beijing Normal University
WeiPeng Tai, Anhui University of Technology
Yaofei Wang, Beijing Normal University
Shenling Wang ([email protected]), Beijing Normal University

Research

Keywords: spatial data, data trading, data pricing, Blockchain

Posted Date: January 23rd, 2021

DOI: https://doi.org/10.21203/rs.3.rs-150049/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Hui Liu (1,2), WeiPeng Tai (2), Yaofei Wang (1) and Shenling Wang (1*)

*Correspondence: [email protected]
1 School of Artificial Intelligence, Beijing Normal University, Beijing, China
Full list of author information is available at the end of the article

Abstract

With the increasing use of spatial data, the demand for spatial big data and for trading it is growing rapidly, which has promoted the emergence of spatial data markets. However, in conventional data markets, both data buyers and data sellers have to use a centralized trading platform, which might be dishonest. Blockchain is a decentralized, distributed data storage technology that uses traceability and unforgeability to confirm and record each transaction. It can overcome the disadvantages of the centralized data market, but it also introduces security and privacy problems.

To address this issue, we propose a blockchain-based spatial data trading framework with a Trusted Execution Environment to provide a trusted decentralized platform, including data storage, data query, data pricing and secure computing. Based on this framework, a spatial data trading demonstration system was implemented and its feasibility and security were verified.

Introduction

With the increasing popularity of all kinds of sensors and the wide application of mobile positioning technology, the amount of spatial data is increasing rapidly. Spatial data has become a key asset in our economy: using it can bring large economic benefits and help us make better decisions.
To ensure the normal circulation and use of data, many institutions for spatial data sharing and trading have emerged in recent years. In addition to the traditional ways of circulating data, there is also a big data trading market, which facilitates data transactions by matching data demand with data sources.

In a conventional data market [1–4], the data seller sends the data to a centralized trading platform. Although the data trading platform accelerates the sharing and circulation of data, it still has many problems:

1) A dishonest data buyer may resell the seller's data to others after obtaining it, thus damaging the data seller's interests. At present, the data circulating on data trading platforms is source data without any analysis, and such data has the property of "read it and have it". That is to say, after purchasing the data, malicious buyers can cache the source data and resell it to others at a lower price than the data seller.

2) The centralized trading platform itself may be dishonest: it may cache and resell the source data without the permission of the data seller, thus damaging the interests of both parties to the transaction.

3) A centralized data trading platform has low tolerance for faults and attacks. Once a fault occurs or the platform is attacked, the entire platform may be paralyzed, which affects data transactions between buyers and sellers and may even cause leakage or loss of the platform's key data.

In addition, the centralized trading platform lacks an effective communication channel between data buyer and data seller, which leads to low efficiency of data transactions [5].

To avoid the disadvantages of the centralized data market, such as weak data security and privacy, weak data copyright protection, and performance bottlenecks in sharing and circulation, the decentralized data market was born. A decentralized data market architecture removes the single point of failure and the single performance bottleneck, and improves the transparency and credibility of data transactions. However, because a decentralized data market lacks centralized management, its system design and security assurance are more difficult than in a centralized data market. For example, double spending has always been a difficulty in distributed systems [6, 7].

Blockchain is a distributed public ledger first introduced by Nakamoto in 2008 [8]. It maintains a continuously growing list of ordered records called blocks. Each block in the chain is linked to the previous block through a cryptographic hash. All nodes in the network hold the same copy of the digital ledger. Information stored in the blockchain is open to everyone, making the actions of nodes transparent.

Introducing blockchain into the data market enables data sellers to transact with data buyers directly, without relying on any third party, so that sellers keep ownership of their data and the transaction process is open and transparent. This overcomes the disadvantages of the centralized data market, but it also introduces security and privacy problems [9–12], as explained below:

1) Secure storage of data files. In a data trading scenario, the data files to be traded need to be stored properly. First, data files cannot be stored in the blockchain itself. A blockchain system is designed to store small transaction records, not large data files.
Secondly, the procedure for storing data files must not disclose identity information. Finally, it is necessary to ensure that the data buyer can easily obtain the data, while neither the buyer nor the seller can identify the real identity of the other party.

2) Privacy protection. Given the business requirements of blockchain and data transactions, a privacy protection scheme must be designed for the data transaction process, so that the user's real identity is not disclosed and transaction data stays secure while data transactions operate normally.

To address these issues, we propose a blockchain-based spatial data trading framework that takes advantage of the blockchain to build a decentralized platform for data trading, including data storage, data query, data pricing and secure computing.

To tackle the first challenge, we use IPFS (InterPlanetary File System), a distributed file storage system, to store data files. When a file is uploaded to IPFS, it is available to all peers in the IPFS network. The user receives a hash index, which allows the user to retrieve the file later. This index is stored in the smart contract in place of the data itself, reducing the burden on the entire system.

To address the second challenge, we adopt smart contracts to carry out data transactions, and combine cryptographic techniques with the Ethereum account mechanism to solve the privacy protection and data security problems in the transaction.

In our design, data buyers and data sellers conduct transactions directly on the blockchain, which avoids the risks caused by centralized trading platforms.
All payment records and reviews generated by data buyers are faithfully recorded in the blockchain through the consensus protocol, and can be protected securely with a Trusted Execution Environment.

The rest of this paper is organized as follows. Section 1 introduces the background and related work. Section 2 presents the design of the blockchain-based spatial data trading framework. Sections 3 to 5 analyze the system and evaluate its feasibility and security. Finally, we summarize our results and make some concluding remarks.

1 Background Knowledge

Due to its decentralized, immutable and traceable characteristics, blockchain technology has been used in the data trading market in recent years and has attracted great attention from industry. Some companies directly sell the data sets they collect, while others collect personal data from the public and sell it to individual users. There are also some domestic examples of using blockchain technology to build a data market. For example, the Shanghai Data Trading Center [13] uses a consortium chain to store transaction-related information in blockchain nodes to ensure the security, efficiency and credibility of data transactions.

Zyskind [14] used blockchain to protect the privacy of personal data, transforming the blockchain into an automatic access control manager that does not rely on trusted third parties, in order to clarify the ownership of data and ensure that users control their data. However, this work only discusses the storage and sharing of data. CrowdBC [15] is a crowdsensing system built on blockchain. The authors focus on the characteristics and processing methods of image data in transactions. Baig [16] constructed a data market based on blockchain and introduced a trusted intermediary into the transaction between the buyer and the seller.
Although this makes the transaction between the buyer and the seller easier, it also greatly reduces the security of the system. Dai proposed SDTE [17], a blockchain-based data trading ecosystem. In SDTE, buyers cannot directly access the original data they purchased and can only obtain the analysis results of the data, which are generated inside Intel SGX (Software Guard Extensions). However, if there are too many malicious data sellers in a transaction, honest data sellers will not be able to get reasonable compensation.

Spatial data in this paper includes [18] remote sensing, mapping and other raw data, such as low-resolution satellite images, medium-resolution satellite images, high-resolution satellite images, sub-meter high-resolution satellite images, radar data, aerial photogrammetry data, series-scale vector data, terrain data, and other types of original spatial information data. As shown in Figure 1, after obtaining spatial data through various devices such as mobile phones and satellites, the data owners upload it to the cloud storage platform over the network and then sell it to data buyers through the data trading platform.

Figure 1: Spatial data trading system

The blockchain-based spatial data trading system designed in this paper consists of three main components: a smart contract on the Ethereum blockchain, client applications held by system participants, and a point-to-point data transmission network. When a data buyer needs specific data for a computing task, the buyer uses the data query module and informs the smart contract. The system locates qualified data by means of secure computing. After the interaction between the data buyer and the data seller, the data pricing module determines the data to be sold and its price.
After the payment is completed, the system runs the buyer's computing task by means of secure computing and returns the results to the buyer, which completes the transaction.

1.1 Data query

Data buyers usually do not need all of a seller's data, but care about specific data. Buyers can express their data requirements as a logical expression or a mathematical function and store them in the blockchain for sellers to query and evaluate [19, 20].

The query conditions are generally relatively simple, so it is not appropriate to upload the buyer's query requirements directly to the blockchain, which would expose the privacy of buyers and sellers. From the form of a data query, attackers can easily infer the buyer's data requirements and obtain private information about the buyer and the seller. If a location of interest to the buyer is disclosed, the attacker may infer the buyer's range of activities, and the device held by the seller close to that location is also exposed. Therefore, to protect the privacy of system participants, we should query the target data while hiding the query conditions. The most intuitive solution is functional encryption: a buyer with the decryption key can obtain the value of the data query function over the ciphertext data, but learns nothing else about the original plaintext.

Figure 2: Data trading platform

1.2 Data storage

Mainstream public blockchains restrict the number and size of transactions in a block. The limit is the block size (Bitcoin) or the upper bound on the gas consumed in a block (Ethereum). For the data trading market, it is not feasible to store massive data directly on the blockchain.
The recently emerged distributed storage system InterPlanetary File System (IPFS) has been introduced for storing shared data in various other domains such as healthcare, cloud computing, IoT, and agriculture.

IPFS [21–23] is a peer-to-peer, content-addressable, distributed file storage system that uses a swarm of connected computers. When a file is uploaded to IPFS, it is available to all peers in the IPFS network. The uploaded file is divided into chunks, each of which is assigned a unique cryptographic hash. Data added to IPFS is therefore addressed by this unique cryptographic hash, which makes it content addressable. IPFS uses distributed hash tables (DHTs) to find the locations of files. Storing traded data in IPFS and storing only the hashes returned by IPFS in blocks greatly reduces the cost of storage space. In summary, IPFS provides high throughput with a secure storage model that supports concurrent access to data with high storage capacity.

The cloud storage service of the spatial big data trading platform mainly establishes a storage station, which uses computers, the Internet, the Internet of Things and other technologies to manage the daily storage of the products and services traded on the platform and of the information generated during transactions, and can quickly and accurately produce statistical summaries of product and service transaction information. Thanks to the advantages of cloud storage services, such as rapid retrieval, convenient search, high reliability, large storage capacity and good confidentiality, a large amount of data can be saved safely [24].

Although it is impractical to store the complete data in the blockchain, a data digest bound to the specific data can be stored in the blockchain, while the data itself is stored in IPFS.
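
The split described here, with a small digest on chain and the bulk data off chain, can be illustrated with a minimal Python sketch. It is not IPFS itself: real IPFS builds a Merkle DAG and returns an encoded content identifier, whereas this sketch simply hashes fixed-size chunks with SHA-256 and then hashes the concatenated chunk digests. The names `CHUNK_SIZE`, `content_address`, `off_chain_store`, `on_chain_index`, `publish` and `retrieve` are hypothetical stand-ins for the IPFS network and the smart contract storage, not part of the authors' system.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # assumed chunk size for this sketch


def content_address(data, chunk_size=CHUNK_SIZE):
    """Return a hex digest that depends only on the content of `data`."""
    chunk_digests = [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]
    # Hash the list of chunk digests to obtain one short identifier.
    return hashlib.sha256(b"".join(chunk_digests)).hexdigest()


# Hypothetical stand-ins: one dict for the off-chain store (IPFS in the
# paper) and one dict for the index kept by the smart contract.
off_chain_store = {}
on_chain_index = {}


def publish(dataset_id, data):
    """Store the data off chain and only its digest on chain."""
    address = content_address(data)
    off_chain_store[address] = data
    on_chain_index[dataset_id] = address
    return address


def retrieve(dataset_id):
    """Fetch data by its on-chain digest and verify its integrity."""
    address = on_chain_index[dataset_id]
    data = off_chain_store[address]
    if content_address(data) != address:
        raise ValueError("data does not match its on-chain digest")
    return data
```

Because the identifier depends only on the content, a buyer who fetches the file can recompute the digest and compare it with the value recorded on chain, which is what `retrieve` does in this sketch.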
When the data is successfully stored in IPFS, the user receives a hash index, which allows the user to retrieve the file later. This index is stored in the smart contract in place of the data, reducing the burden on the entire system.

1.3 Data pricing

In the data market, the design of the selling form and the setting of prices has always been an active research field. In this paper, we consider the design of the pricing mechanism in a game-theoretic setting. Each data holder has a private valuation of their data, that is, the privacy loss the seller incurs if the data is leaked. The data buyer also has a valuation of the data to be bought, that is, the value of the data to the buyer. Bidders may choose to report their valuation of a piece of data dishonestly. The game-theoretic solution is to design an incentive-compatible mechanism, so that each bidder obtains the highest return when reporting its real valuation. Once bidders report their true valuations, pricing mechanisms become much easier to design.

As a commodity, data has some unique properties [25, 26], which raise additional issues for data pricing. First, the marginal cost of data is extremely low, or there is no marginal cost at all. Marginal cost refers to the cost of producing one more copy of a good after it has been produced. The marginal cost of data is basically zero, so once the data buyer obtains the seller's data, the buyer may resell it. Second, the value of data is not necessarily related to the amount of data. For example, for a person who needs remote sensing images, a pile of face images is of little value. Third, the value of data is difficult to quantify, and it is difficult for data sellers to estimate the value of their data. At the same time, valuations vary greatly among different buyers.
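
As an illustration of incentive compatibility, the sketch below simulates a sealed-bid reverse auction in which the lowest-bidding seller wins and is paid the second-lowest bid, the rule this framework adopts in Section 2. The function names and all numeric valuations are assumptions made for the example, not values from the paper. For one seller with a private valuation of 5, it checks that no alternative bid in a small grid yields a higher utility than bidding 5 truthfully.

```python
def reverse_second_price(bids):
    """Return (winner, payment): lowest bid wins, paid the second-lowest bid.

    `bids` maps seller name -> asking price. Ties are broken by seller name
    so the outcome is deterministic. At least two bids are required.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids.items(), key=lambda item: (item[1], item[0]))
    winner = ranked[0][0]
    payment = ranked[1][1]
    return winner, payment


def seller_utility(seller, true_cost, bid, other_bids):
    """Utility of `seller` when it bids `bid` against fixed `other_bids`."""
    bids = dict(other_bids)
    bids[seller] = bid
    winner, payment = reverse_second_price(bids)
    return payment - true_cost if winner == seller else 0


# Assumed example values: seller "a" privately values its data at 5.
others = {"b": 7, "c": 9}
truthful = seller_utility("a", 5, 5, others)
best_deviation = max(seller_utility("a", 5, bid, others) for bid in range(13))
print(truthful, best_deviation)  # prints: 2 2
```

Bidding below the true valuation cannot raise the payment, which is set by the other sellers, and bidding above it only risks losing a profitable sale. This is why the payment rule, rather than the bid itself, carries the incentive.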

According to these characteristics of data, the trading mode in the data market has also changed. The traditional data market trades the user's data directly, and dishonest buyers can resell the data sets they purchased without the seller's knowledge in order to profit. Many studies have found that most data consumers only need some statistical results or high-level features behind a large amount of data, such as the average value of a data set or training data for machine learning models, rather than the data itself. As a result, data markets can collect data from data sellers and then serve the computing tasks of data consumers. The buyer provides a specific task whose input is the purchased right to use multiple pieces of data and whose output is the result the buyer wants. In this way, the data itself is isolated from the data consumer.

Although the value of the data itself is difficult to quantify, the results of the calculation task are easy to quantify. Since buyers often need data from multiple sellers, it is necessary to distinguish the value of each piece of data in a transaction. When calculating the value of data, the Shapley value can be used to calculate the contribution of a single piece of data [27]. In game theory, the Shapley value is a solution for distributing benefits and costs fairly among multiple participants. The computational complexity of the Shapley value increases exponentially with the number of data items, so approximation algorithms are often used in practice.

1.4 Secure computing

Blockchain is an open, transparent and decentralized data storage technology. All information entering the blockchain system is open, and the execution of all transactions and scripts is transparent. At the same time, the computing power of blockchain smart contracts is very weak.
Due to the limits on block size and gas, a smart contract can perform only a small amount of computation, and at high cost. Therefore, it is not safe to run the trading tasks issued by the buyer directly in the smart contract. Similar problems also exist in data query and data pricing. Nodes in Bitcoin, Ethereum and other public blockchains do not trust each other, and the chains are completely open and transparent, which creates new problems for privacy protection.

Trusted hardware, such as a trusted execution environment (TEE) [28–30], is a common way to implement secure computing. A TEE ensures that the code and data loaded in it are protected in terms of confidentiality and integrity. Users who have TEE hardware are considered secure users. Sellers send their data to the TEE device of a secure user, the buyer's computing task is executed inside the TEE, and the result is returned to the buyer in a secure way. Only trusted applications running in the TEE can access the full functionality of the device's main processor, peripherals, and memory. Hardware isolation protects data and computation from user-installed applications running on the main operating system. The typical hardware technologies supporting TEE implementations are ARM TrustZone and Intel Software Guard Extensions (SGX).

Intel Software Guard Extensions (SGX) [31] provides a widely used TEE implementation for general-purpose computation; in SGX a TEE is known as an enclave. Code running inside an enclave has a protected address space. When data from an enclave moves off the processor to memory, it is transparently encrypted with keys only available to the processor. Thus the operating system, hypervisor and other users cannot access the enclave's memory.
In the enclave, the code and data are measured at the startup stage, and the measurement is signed into an attestation report based on a hardware root of trust. The report can be verified to show that the enclave code logic is unmodified, so users can confirm the security of the enclave and provision secrets into it at runtime. This process is called the remote attestation protocol.

In this paper, we use Intel SGX as the exemplary implementation to build a trusted exchange with a Trusted Execution Environment, assisting in the fair payment of transactions.

2 The Proposed Framework

We have implemented a spatial data trading system based on an Ethereum private chain, including system participants holding desktop application clients, smart contracts in the Ethereum network and a data transmission network. The desktop client is written in JavaScript, the Ethereum smart contract is written in Solidity, and the JavaScript interface of IPFS is used directly for data transmission.

We simplify the system to one buyer trading with multiple potential sellers. The buyer needs only one piece of data. The pricing mechanism is a second-price auction: the data of the seller with the lowest bid is selected for the transaction, and it is paid for at the second-lowest bid. The buyer's computing task is assumed to be simple, so that it can be protected by homomorphic encryption or secure multi-party computation.

Figure 3: Data trading framework based on blockchain

The data transaction process is shown in Figure 3. The data seller adds a new data record to the smart contract, and the data buyer adds an order containing data requirements. The seller gives a quotation after matching eligible data against the conditions in the order. The system runs the pricing mechanism and determines the data for sale.
The data trading platform then computes the buyer's task and delivers the result by means of secure computing.

To describe how the framework works, we introduce the main workflow of the system.

Step 1: Register. To buy and sell data in the data market, the user must first register an Ethereum account; the client also contains a simple Ethereum account management function. The user also needs to register an account in the smart contract. For example, the user can register an account for a sensor device in the smart contract, which requires disclosing the type, model and other information of the sensor device.

Step 2: Add data. After generating data to sell, the data seller encrypts the data, uploads it to IPFS, and registers it under the seller's smart contract account, including the storage address, hash value and registration time of the data. Data buyers are encouraged to set requirements on data registration time when adding orders. Only data sellers who meet these requirements are eligible to bid for the order, which improves the timeliness and reliability of data.

Step 3: Issue orders. A data buyer who wants to buy the right to use specific data in the market adds an order to the smart contract. The order contains the buyer's demand for data (including data type, data selection function, data price limit, data quantity limit, etc.) and the computing task for which the buyer will use the data.

Step 4: The data seller provides a hash of the bid. Data sellers choose one or more pieces of data that meet the buyer's needs and set a package price.

Step 5: The data buyer notifies the smart contract to obtain the real bids of the data sellers. The data buyer notifies the smart contract to stop accepting new participants in the transaction and to begin accepting bid disclosures from users who have already participated.

Step 6: Data sellers announce real bids. Each data seller publishes its bid, which must match the previously submitted hash value.

Step 7: Pricing mechanism. The classic Vickrey-Clarke-Groves (VCG) auction mechanism is used for data selection and price determination. With a single item for sale, VCG coincides with the second-price auction described above. The VCG mechanism guarantees truthfulness, so each bidder has an incentive to report a true valuation of their data.

Step 8: Deliver data. The system uses homomorphic encryption to compute the buyer's task and then sends the encrypted results to the data buyer to complete the order.

3 System analysis

3.1 Privacy analysis

The identity privacy and data privacy of system participants are protected in the system. Identity privacy relies on the anonymity of blockchain accounts. Data privacy is protected by data encryption, the data selection mechanism and secure computing, which prevent attackers from obtaining any additional information about the data from the execution of transactions.

3.2 Pricing analysis

Since data pricing is implemented on the Ethereum blockchain through smart contracts, its computational cost cannot be ignored. To evaluate the data pricing cost of the system, we measure the gas consumption of VCG-based data pricing when a large number of sellers bid in a transaction. Figure 4 shows the relationship between the gas consumption of VCG data pricing and the number of bids received for a data order. We find that when a large number of data sellers bid for the same data order, the data pricing cost is very high. For a data market built on the Ethereum blockchain, this high cost of data pricing is an obstacle to practical use.
Therefore, reducing the cost of data pricing is one of our future research directions.

Figure 4: Relationship between gas consumption and bid quantity

4 Experimental Methods

We have implemented a spatial data trading system based on an Ethereum private chain, smart contracts in the Ethereum network and a data transmission network. All tests of this system were completed on a PC with 4 vcores (3 GHz Intel Xeon Platinum 8124M) and 8 GB RAM. The desktop client is written in JavaScript, the Ethereum smart contract is written in Solidity, and the JavaScript interface of IPFS is used directly for data transmission. The online transaction module of the data transaction system is implemented in Python.

This paper mainly tests system throughput, storage space occupation and fault tolerance. For system throughput, we use concurrency tests to measure the amount of data that can be processed per unit time with different numbers of nodes and different data. For storage space, we measure the disk space required to put data on chain with different numbers of nodes. For fault tolerance, we compare the impact of node failure on overall throughput in different modes.

In a consortium blockchain such as Fabric, node throughput is limited by communication delay, transaction verification time and hash operation time. In a system with $N$ nodes and an average block of $T$ transactions, assuming a node communication delay $t_c$, a transaction verification time $t_v$, and a time $t_h$ for one hash operation, the throughput of the whole system is:

$$\mathrm{TPS} = \frac{T}{N^2 \times t_c + T \times t_v + 2 \times T \times t_h}$$

To test the storage requirement for packaging transactions throughout the system, this paper compares the storage reduction after packaging the chain by constructing the same number of transaction requests. In an $N$-node blockchain system that includes $T$ transactions, assuming that the average space occupied by each transaction is $s_t$ and the space occupied by a hash result is $s_h$, we can estimate the disk space occupied as $S = N \times T \times (s_t + s_h)$.

5 Experimental Evaluation

To test the running speed of the system, we report running time tests of subkey generation, data encryption and Merkle tree generation. Subkey generation and Merkle tree generation are both based on a binary tree, so their running time is closely related to the depth of the tree. We use $L = \lceil \log_2 n \rceil + 1$ to calculate the depth of the tree, where $n$ is the number of transaction data items. The test results are shown in Figure 5.

Figure 5: Running time tests

As the number of transaction data items $n$ increases, the depth $L$ of the tree increases, and the generation time of the final subkey and Merkle tree also increases. The test results in Figure 5 show that when the number of transaction data items is 60000, the depth of the subkey generation tree and the Merkle tree is 17, while at 70000 the depth reaches 18, and this increase in depth raises the running time of subkey generation and Merkle tree generation by about 0.14 s.

As the number of nodes increases, the transaction throughput of the system increases significantly and is higher than that of the Fabric system.
The transaction throughput of the traditional blockchain platform is relatively stable, while the transaction throughput of the blockchain system described in this paper increases rapidly with the number of nodes. The blockchain system described in this paper can effectively improve transaction throughput, as shown in Figure 6.

Figure 6: Transaction throughput tests

To test the storage requirement for packaging transactions throughout the system, this paper compares the storage reduction after packaging the chain by constructing the same number of transaction requests.

Figure 7: Storage tests

As shown in Figure 7, when the number of nodes in a traditional blockchain increases, the total space occupied by transactions increases almost linearly. In the blockchain system described in this paper, however, disk space grows slowly as nodes are added. It can greatly reduce storage costs when large amounts of data are put on chain.

To compare fault tolerance, we measure the success rate of malicious attacks on a traditional centralized system with distributed deployment, a traditional consortium chain system, and the proposed system under different ratios of malicious nodes. For convenience, 120 blockchain nodes are used. As shown in Figure 8, in the traditional centralized system with distributed deployment, as long as there are malicious nodes, the attack success rate is 100%. In the traditional PBFT-based blockchain system [32, 33], if the proportion of malicious nodes is less than 1/3, fault tolerance is achieved, but if it exceeds 1/3, the system is affected. The proposed system can tolerate more node failures and reduces the risk of transaction blocking.
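
The one-third bound quoted for PBFT follows from the standard requirement n >= 3f + 1, where n is the number of nodes and f is the number of Byzantine nodes that can be tolerated. The short calculation below applies that bound to the 120-node setting used in this test. The function names are chosen for this sketch only and are not taken from the authors' implementation, and the sketch says nothing about the tolerance of the proposed system itself.

```python
def pbft_max_faulty(n):
    """Largest f satisfying n >= 3f + 1 for a network of n nodes."""
    if n < 1:
        raise ValueError("n must be a positive number of nodes")
    return (n - 1) // 3


def pbft_tolerates(n, faulty):
    """True if `faulty` Byzantine nodes stay within the PBFT bound."""
    return faulty <= pbft_max_faulty(n)


n = 120
print(pbft_max_faulty(n))     # prints: 39
print(pbft_tolerates(n, 39))  # prints: True
print(pbft_tolerates(n, 40))  # prints: False (40 is exactly one third)
```

With 120 nodes, a PBFT-based chain therefore tolerates at most 39 malicious nodes; at 40, exactly one third, the bound is already violated.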
Figure 8: Fault tolerance tests

6 Results and Discussion

The final test results show that the system described in this paper achieves better throughput, lower storage space occupation and better fault tolerance, and can meet the high-frequency and real-time requirements of data trading.

The “Scalability Triple Difficulty” (often called the scalability trilemma) refers to the unavoidable tension among the scalability [34], decentralization and security of blockchain systems. The proposed system is completely decentralized and does not rely on any trusted third party, so achieving both scalability and security is more challenging. To ensure security, the system proposed in this paper sacrifices scalability. In future deployments of the data trading system, we will further explore the trade-off among scalability, decentralization and security.

7 Conclusion

In this paper, we propose a blockchain-based spatial data trading framework that takes advantage of the blockchain to build a decentralized platform. The platform provides data and platform interfaces, with petabyte-level remote sensing data storage and processing capabilities. Without prior installation, users can obtain integrated services of remote sensing data, information, software and computing resources through the cloud service terminal. Users can use the spatial information infrastructure anytime and anywhere, without purchasing data, software or expensive computer equipment, and can therefore use high-resolution remote sensing information conveniently and economically. This greatly lowers the threshold for using remote sensing in terms of cost and maintenance.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 61772080 and the Fundamental Research Funds for the Central Universities under Grant 2020NTST32.

Abbreviations

IPFS: InterPlanetary File System
SDTE: Secure Blockchain-Based Data Trading Ecosystem
SGX: Software Guard Extensions
DHTs: Distributed Hash Tables
TEE: Trusted Execution Environment
VCG: Vickrey-Clarke-Groves
PBFT: Practical Byzantine Fault Tolerance

Availability of data and materials

Data sharing is not applicable to this paper, because much of the data used is personal privacy data that requires legal protection.

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors’ contributions

Hui Liu designed the trading system and drafted the manuscript. WeiPeng Tai collected the data. Yaofei Wang participated in the design of the smart contract. Shenling Wang carried out the experimental evaluation and performed the statistical analysis. All authors read and approved the final manuscript.

Author details

1 School of Artificial Intelligence, Beijing Normal University, Beijing, China. 2 School of Computer Science and Technology, Anhui University of Technology, Maanshan, China.

References

1. Misura, K., Zagar, M.: Data marketplace for internet of things. In: 2016 International Conference on Smart Systems and Technologies (SST) (2016)
2. Zhao, D., Ye, C., Zhang, B.: Data marketplace and its values in data trading. Library and Information Service (2017)
3. Pang, J.Z.F., Fu, H., Lee, W.I., Wierman, A.: The efficiency of open access in platforms for networked cournot markets. In: IEEE INFOCOM - IEEE Conference on Computer Communications (2017)
4. Ramachandran, G.S., Radhakrishnan, R., Krishnamachari, B.: Towards a decentralized data marketplace for smart cities. In: 2018 IEEE International Smart Cities Conference (ISC2) (2019)
5. Nguyen, D.D., Ali, M.I.: Enabling on-demand decentralized iot collectability marketplace using blockchain and crowdsensing. In: Global IoT Summit (2019)
6. Su, G., Yang, W., Luo, Z., Zhang, Y., Zhu, Y.: Bdtf: A blockchain-based data trading framework with trusted execution environment (2020)
7. Sadiq, A., Javaid, N., Omaji, S., Khalid, A., Imran, M.: Efficient data trading and storage in internet of vehicles using consortium blockchain. In: 16th International Wireless Communications and Mobile Computing Conference (IWCMC), 2020 (2020)
8. Nakamoto, S.: Bitcoin: A peer-to-peer electronic cash system. Online at https://bitco.in/pdf/bitcoin.pdf (2008)
9. Zheng, X., Mukkamala, R.R., Vatrapu, R., Ordieres-Mere, J.: Blockchain-based personal health data sharing system using cloud storage, pp. 1–6 (2018)
10. Ruinian, L., Tianyi, S., Bo, M., Hong, L., Xiuzhen, C., Limin, S.: Blockchain for large-scale internet of things data storage and protection. IEEE Transactions on Services Computing, 1–1 (2018)
11. Jiang, T., Fang, H., Wang, H.: Blockchain-based internet of vehicles: Distributed network architecture and performance analysis. IEEE Internet of Things Journal 6, 4640–4649 (2019)
12. Liu, X., Huang, H., Xiao, F., Ma, Z.: A blockchain-based trust management with conditional privacy-preserving announcement scheme for vanets. IEEE Internet of Things Journal 7(5), 4101–4112 (2020)
13. Wang, J., Zheng, Z., Wu, F., Chen, G.: Blockchain based data marketplace. Big Data 6(03), 25–39 (2020)
14. Zyskind, G., Zekrifa, D.M.S., Alex, P., Nathan, O.: Decentralizing privacy: Using blockchain to protect personal data. In: IEEE Security and Privacy Workshops (2015)
15. Li, M., Weng, J., Yang, A., Lu, W., Zhang, Y., Hou, L., Jia-Nan, L., Xiang, Y., Deng, R.: Crowdbc: A blockchain-based decentralized framework for crowdsourcing. IEEE Transactions on Parallel and Distributed Systems, 1–1 (2018)
16. Baig, F., Wang, F.: Blockchain enabled distributed data management - a vision. In: 2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW) (2019)
17. Dai, W., Dai, C., Choo, K.K.R., Cui, C., Zou, D., Jin, H.: Sdte: A secure blockchain-based data trading ecosystem. IEEE Transactions on Information Forensics and Security 15, 725–737 (2020)
18. Anselin, L., Syabri, I., Kho, Y.: Geoda: An introduction to spatial data analysis. Geographical Analysis (2006)
19. Li, R., Liu, A.X., Wang, A.L., Bruhadeshwar, B.: Fast range query processing with strong privacy protection for cloud computing. In: Proc. of the VLDB Endowment (2014)
20. Kang, J., Yu, R., Huang, X., Wu, M., Maharjan, S., Xie, S., Zhang, Y.: Blockchain for secure and efficient data sharing in vehicular edge computing and networks. IEEE Internet of Things Journal 6(3), 4660–4670 (2019)
21. Chen, Y., Li, H., Li, K., Zhang, J.: An improved p2p file system scheme based on ipfs and blockchain. In: 2017 IEEE International Conference on Big Data (Big Data) (2017)
22. Vimal, S., Srivatsa, S.K.: A new cluster p2p file sharing system based on ipfs and blockchain technology. Journal of Ambient Intelligence and Humanized Computing (2) (2019)
23. Wu, X., Han, Y., Zhang, M., Zhu, S.: Secure Personal Health Records Sharing Based on Blockchain and IPFS (2020)
24. Naz, M., Al-Zahrani, F.A., Khalid, R., Javaid, N., Qamar, A.M., Afzal, M.K., Shafiq, M.: A secure data sharing platform using blockchain and interplanetary file system. Sustainability 11 (2019)
25. Sen, S., Joe-Wong, C., Ha, S., Chiang, M.: Pricing data: A look at past proposals, current plans, and future trends. ACM Computing Surveys (2012)
26. Dziembowski, S., Eckey, L., Faust, S.: Fairswap: How to fairly exchange digital goods. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018, pp. 967–984. ACM (2018). doi:10.1145/3243734.3243857. https://doi.org/10.1145/3243734.3243857
27. Shi, W., Wu, C., Li, Z.: A shapley-value mechanism for bandwidth on demand between datacenters. IEEE Transactions on Cloud Computing 6(1), 19–32 (2018)
28. Eltayieb, N., Elhabob, R., Hassan, A., Li, F.: A blockchain-based attribute-based signcryption scheme to secure data sharing in the cloud. Journal of Systems Architecture 102, 101653 (2019)
29. Feng, Q., He, D., Zeadally, S., Liang, K.: Bpas: Blockchain-assisted privacy-preserving authentication system for vehicular ad-hoc networks. IEEE Transactions on Industrial Informatics PP(99), 1–1 (2019)
30. Jeong, B.G., Youn, T.Y., Jho, N.S., Shin, S.U.: Blockchain-based data sharing and trading model for the connected car. Sensors 20(11), 3141 (2020)
31. Hoekstra, M., Lal, R., Pappachan, P., Phegade, V., Cuvillo, J.D.: Using innovative instructions to create trustworthy software solutions. In: Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy (2013)
32. Tang, J., Cui, Y., Li, Q., Ren, K., Liu, J., Buyya, R.: Ensuring Security and Privacy Preservation for Cloud Data Services. ACM Computing Surveys 49(13) (2016)
33. Wang, G., Nixon, M.: Randchain: Practical scalable decentralized randomness attested by blockchain. In: 2020 IEEE International Conference on Blockchain (Blockchain) (2020). IEEE
34. Kraft, D.: Difficulty control for blockchain-based consensus systems. Peer-to-Peer Networking and Applications (2016)

Figures

Figure 1: Spatial data trading system
Figure 2: Data trading platform
Figure 3: Data trading framework based on blockchain
Figure 4: Relationship between gas consumption and bid quantity
Figure 5: Running time tests
Figure 6: Transaction throughput tests
Figure 7: Storage tests
Figure 8: Fault tolerance tests