INNOVATION ACTION

H2020 GRANT AGREEMENT NUMBER: 825171

WP3 – Vision Materialisation and Technical Infrastructure

D3.5 – Design and implementation of P2P infrastructure

Document Info
Contractual Delivery Date: 31/12/2019
Actual Delivery Date: 31/12/2019
Responsible Beneficiary: INOV
Contributing Beneficiaries: UoG
Dissemination Level: Public
Version: 1.0
Type: Final

This project has received funding from the European Union’s H2020 research and innovation programme under the grant agreement No 825171


DOCUMENT INFORMATION

Document ID: D3.5: Design and implementation of P2P infrastructure
Version Date: 31/12/2019
Total Number of Pages: 36
Abstract: This deliverable describes the EUNOMIA P2P infrastructure, its design, the EUNOMIA P2P APIs and a first implementation of the P2P infrastructure and the corresponding APIs made available for the remaining EUNOMIA modules to start integration (for the 1st phase).
Keywords: P2P design, P2P infrastructure, IPFS

AUTHORS

| Full Name | Beneficiary / Organisation | Role |
|---|---|---|
| INOV INESC INOVAÇÃO | INOV | Overall Editor |
| University of Greenwich | UoG | Contributor |

REVIEWERS

| Full Name | Beneficiary / Organisation | Date |
|---|---|---|
| University of Nicosia | UNIC | 23/12/2019 |

VERSION HISTORY

| Version | Date | Comments |
|---|---|---|
| 0.1 | 13/12/2019 | First internal draft |
| 0.6 | 22/12/2019 | Complete draft for review |
| 0.8 | 29/12/2019 | Final draft following review |
| 1.0 | 31/12/2019 | Final version to be released to the EC |

Type of deliverable PUBLIC Page | ii H2020 Grant Agreement Number: 825171 Document ID: WP3 / D3.5

EXECUTIVE SUMMARY

This deliverable describes the P2P infrastructure implemented during the first phase of EUNOMIA to provide decentralized support for storage, communication and security functions. It starts with a review of the main existing P2P technologies, in which each one is analysed and a set of candidates is selected for use in the project. The selected technologies are characterized, compared and matched against the project requirements, which are extracted from the user requirements (described in D2.4) and the functional and non-functional technical requirements (described in D3.1), extended with additional non-functional support requirements. Finally, it is shown why IPFS was chosen among them as the P2P shared storage solution for EUNOMIA.

Given the selected technology, the anatomy of a EUNOMIA P2P node is also presented, with each layer detailed and explained, including the high-level storage API used by other node components and the low-level P2P components. Besides the description of the low-level P2P components that run on a node, a global view is presented, explaining how nodes connect to each other to form a global shared storage pool.

Limitations and challenges stemming from the selection of technologies, which affect the design and implementation of the security and privacy framework (described in D3.3), are also discussed. Solutions to some of these challenges are already presented in this document, while the others will be addressed over the course of Task 3.5.

@Copyright of EUNOMIA Consortium Page iii

TABLE OF CONTENTS

Document Information
Authors
Reviewers
Version History
Executive Summary
Table of Contents
List of Figures
List of Tables
List of Acronyms and Abbreviations

1. Introduction
1.1 Scope and objectives of the deliverable
1.2 Structure of the deliverable
1.3 Relation to Other Tasks and Deliverables

2. P2P Goals
2.1 EUNOMIA Data requirements
2.2 Goals

3. P2P State of the art
Unstructured P2P
Structured P2P
3.1 Existing technologies
3.2 Comparison matrices
3.3 Selected technologies

4. P2P Infrastructure
4.1 P2P Node
4.2 P2P Network

5. P2P Node API
5.1 Operations
5.2 Overall integration

6. Limitations and challenges
6.1 Known limitations
6.2 Possible solutions

7. Conclusions

8. References

LIST OF FIGURES

Figure 1: Overview of the P2P network architecture.
Figure 2: Anatomy of a P2P EUNOMIA node.
Figure 3: Content based addressing used in IPFS.
Figure 4: P2P Network with 3 blocks with a replication factor of 2.
Figure 5: EUNOMIA nodes.
Figure 6: Storage data flow.


LIST OF TABLES

Table 1: EUNOMIA P2P requirements.
Table 2: Additional technical requirements.
Table 3: P2P Storage functions.
Table 4: Analysed P2P technologies.
Table 5: P2P protocols classification.
Table 6: P2P technologies comparison matrix.
Table 7: Summary of P2P technologies.
Table 8: Storage Server REST API.
Table 9: P2P limitations and challenges.
Table 10: Possible solutions for some of the P2P current issues.


LIST OF ACRONYMS AND ABBREVIATIONS

| Term | Description |
|---|---|
| AAA | Authentication, Authorization and Accounting |
| API | Application Programming Interface |
| CID | IPFS Content Identifier |
| CRUD | Create, Read, Update and Delete |
| DHT | Distributed Hash Table |
| DNS | Domain Name System |
| FUSE | Filesystem in Userspace |
| IPFS | InterPlanetary File System |
| P2P | Peer-to-peer |
| PKI | Public Key Infrastructure |
| REST | Representational State Transfer |
| SSL | Secure Socket Layer |


1. INTRODUCTION

1.1 Scope and objectives of the deliverable

EUNOMIA adopts a peer-to-peer (P2P) software architecture that underpins its decentralized and fully distributed nature and avoids having a single point of failure or a single entity with the capacity to manipulate the information stored, processed and communicated through it. This makes EUNOMIA nodes simultaneously contributors of information from their users and operators of the infrastructure. The P2P infrastructure was designed to support functions such as authorization, selection of nearby (network wide) servers, redundant storage and communication among EUNOMIA nodes.

This document is the main deliverable of Task 3.5 and reflects the design and implementation of the P2P infrastructure. In this task, the modules that will allow each EUNOMIA component to access distributed P2P storage and messaging are being designed and implemented. During the design phase, different P2P implementations were analysed and are being tested, taking into consideration the user requirements (D2.4), technical requirements (D3.1) and use cases (D2.5). EUNOMIA does not aim to create a new P2P technology from scratch, but to select, adapt and integrate existing P2P technologies and to provide on top of them the functionalities required to support the EUNOMIA services.

A new version of this document, planned for the second year of the project, will document the results and final implementation of the EUNOMIA P2P infrastructure, taking into consideration its evolution after the first round of integrated testing.

1.2 Structure of the deliverable

This document is organised as follows:

◼ Section 2 presents the P2P network goals, taking into account the EUNOMIA architectural software components needed to satisfy the defined user requirements and the technical requirements (i.e., functional and non-functional requirements);
◼ Section 3 presents a synthesis of the state of the art in P2P technologies, describing, comparing and characterizing each one, and finally showing why a given P2P technology was selected among the existing ones;
◼ Section 4 describes the anatomy of a P2P EUNOMIA node, including the underlying components, as well as the methods provided to the upper services that need to use P2P storing and sharing functions;
◼ Section 5 describes the high-level functions of the storage server and the methods provided to the upper EUNOMIA layers;
◼ Section 6 shows the current limitations and some of the possible approaches to address them;
◼ Section 7 concludes the document.


1.3 Relation to Other Tasks and Deliverables

This report documents one of the results of WP3 (Vision Materialisation and Technical Infrastructure) and is one of the outputs of the design phase of the peer-to-peer infrastructure (T3.5). It complements the specifications and architecture design (D3.2), focusing mainly on the P2P storage and communication needs. Together with the design and implementation of the security and privacy framework (D3.3) and the design and implementation of the blockchain infrastructure (D3.4), this document specifies the EUNOMIA backend support infrastructure needed to meet the technical requirements and the user requirements already consolidated in the report on user needs and requirements (D2.4). Previous work on tools assessment (D3.1) provides the tools that are checked against the EUNOMIA requirements specified in the aforementioned deliverables. The P2P technology that best fits EUNOMIA is then selected, which allows the final P2P node design and architecture to be specified in this document.

Task 3.5 is dedicated to the setup of a peer-to-peer infrastructure and runs almost until the end of the project. This infrastructure will be slightly adapted during the several iterations of the development and testing phases. Limitations encountered during these iterations can also trigger changes to the design and, consequently, to what is described in this document.


2. P2P GOALS

In this section, the goals of the P2P infrastructure are extracted from the user and technical requirements in the context of the EUNOMIA project. This makes it possible to formulate a set of specific additional requirements related to the storage and sharing of data within the EUNOMIA platform.

2.1 EUNOMIA Data requirements

EUNOMIA P2P data requirements were already expressed in previous documentation, in the form of user, functional and non-functional technical requirements. For easier analysis, they are consolidated here (see Table 1).

Table 1: EUNOMIA P2P requirements.

| ID | Requirement | Requirement Type |
|---|---|---|
| FR-1 | End-User must be able to create an account. | Generic functional requirement. |
| FR-2 | End-User must be able to authenticate. | Generic functional requirement. |
| FR-3 | End-User must be able to view the account page. | Generic functional requirement. |
| FR-4 | End-User must be able to revoke the account. | Generic functional requirement. |
| FR-5 | End-User must be able to maintain the account. | Generic functional requirement. |
| FR-6 | End-User must be able to vote on content trustworthiness of EUNOMIA users’ posts. | Generic functional requirement. |
| FR-7 | End-User should not be able to vote on own post. | Generic functional requirement. |
| FR-8 | End-User must be able to view trustworthiness indicators of EUNOMIA users’ posts. | Generic functional requirement. |
| FR-53 | A USER won’t be able to message another EUNOMIA USER (using EUNOMIA). | Functional requirement. |
| FR-54 | DC should safeguard all your EUNOMIA related account data outside your mobile device. | Functional requirement. |
| FR-55 | EUNOMIA should have a simple algorithm to assign USERS to NODES. | Functional requirement. |
| FR-56 | A USER should be able to connect to any EUNOMIA node in order to access the EUNOMIA platform. | Functional requirement. |
| FR-57 | EUNOMIA off chain storage must store User accounts. | Functional requirement. |
| FR-58 | EUNOMIA off chain storage should store Information cascade related data. | Functional requirement. |
| FR-59 | EUNOMIA off chain storage should store User-view related data of the information cascade (shared among users). | Functional requirement. |
| FR-60 | EUNOMIA off chain storage should store machine learning models. | Functional requirement. |
| FR-61 | EUNOMIA off chain storage should store machine learning training data. | Functional requirement. |
| NFR-21 | P2P storage must be always accessible. | Non-functional. |

Besides the aforementioned requirements, additional non-functional requirements are defined and evaluated to ease the scalability, support and maintenance of the EUNOMIA platform. These additional technical requirements are presented in Table 2.

Table 2: Additional technical requirements.

| ID | Requirement | Requirement Type |
|---|---|---|
| ANFR-01 | P2P should be based on open source technologies. | Non-functional. |
| ANFR-02 | Support should be available from the community. | Non-functional. |
| ANFR-03 | P2P should be based on open standards already established. | Non-functional. |
| ANFR-04 | The selected P2P technology should be mature. | Non-functional. |
| ANFR-05 | Scalability – the P2P should scale well with the number of users. | Non-functional. |
| ANFR-06 | Resilience – the P2P should be able to tolerate a pre-defined number of failed nodes. | Non-functional. |

These requirements capture the general EUNOMIA needs that the technology selected for the P2P infrastructure must support.


2.2 Goals

Taking into account the project needs expressed by its requirements, the P2P infrastructure must implement the following low-level operations, which satisfy or support most of the previously mentioned requirements. Table 3 matches the P2P low-level storage functions with the functional and non-functional requirements.

Table 3: P2P Storage functions.

| P2P Storage Functions | Satisfied Requirements | Notes |
|---|---|---|
| Store arbitrary structured data | FR-1, FR-2, FR-5, FR-6, FR-57, FR-58, FR-59, FR-60, FR-61 | It can store arbitrary data with an arbitrary schema. |
| Retrieve indexed arbitrary data | FR-2, FR-3, FR-5 | It must be possible to retrieve the data stored on the P2P infrastructure, based on an indexing key. |
| Store data for an arbitrary time | FR-2, FR-3, FR-5, FR-6, FR-57, FR-59, FR-60 | Data should be stored for an arbitrary time. |
| Delete arbitrary data | FR-4 | It will allow data deletion given the indexing key. |
| Share the data available to other nodes | FR-2, FR-6, FR-8, FR-54, FR-56, FR-57, FR-58, FR-59, FR-60, FR-61 | Data should automatically be available to other nodes. |
| Must allow other nodes to join in. | ANFR-05 | Must allow the network to be extended. |
| Must allow data to be replicated. | ANFR-06 | Must support a configurable replication factor in order to sustain node failures. |

In the current implementation, these low-level functions are generic enough to handle unforeseen needs that may arise during project development, because the structure of the data stored in the P2P infrastructure is not enforced in any way.
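As an illustration only, the low-level functions listed in Table 3 can be sketched as a small in-memory model. The names used here (`Node`, `P2PStore`, `join`, `store`, `retrieve`, `delete`) and the way replicas are chosen are hypothetical and do not represent the EUNOMIA API or any IPFS interface. The sketch only shows how schema-free storage, key-based retrieval and deletion, node joining and a configurable replication factor fit together.

```python
import hashlib
import json


class Node:
    """One hypothetical peer holding a local key-value store."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}


class P2PStore:
    """Toy in-memory stand-in for the P2P storage layer (illustrative only)."""

    def __init__(self, replication_factor=2):
        self.replication_factor = replication_factor
        self.nodes = []

    def join(self, node):
        # Extending the network: any node may join the pool.
        self.nodes.append(node)

    def _replicas(self, key):
        # Assumption: replicas are chosen deterministically from the key hash.
        start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(self.nodes)
        count = min(self.replication_factor, len(self.nodes))
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(count)]

    def store(self, key, data):
        # Data is serialised as JSON, so no schema is enforced.
        payload = json.dumps(data)
        for node in self._replicas(key):
            node.blocks[key] = payload

    def retrieve(self, key):
        # Any node holding the key may answer the request.
        for node in self.nodes:
            if key in node.blocks:
                return json.loads(node.blocks[key])
        return None

    def delete(self, key):
        # Deletion removes every replica of the key.
        for node in self.nodes:
            node.blocks.pop(key, None)


pool = P2PStore(replication_factor=2)
for name in ("a", "b", "c"):
    pool.join(Node(name))
pool.store("user:42", {"name": "alice"})
pool.nodes.pop(0)  # simulate the failure of one node
print(pool.retrieve("user:42"))  # the surviving replica still serves the data
```

With a replication factor of 2, the two copies live on distinct nodes, so the loss of any single node leaves one copy available.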


3. P2P STATE OF THE ART

Related work

Unlike the more traditional client-server architectures, P2P technology is a distributed application architecture that partitions tasks, workloads and information among peers. In a P2P network, each node generally provides the same functions and services, acting as both a server and a client. The concept is not new: one could argue that the original Internet, ARPANET, as initially designed by DARPA, in which each node could freely send and receive packets of data to and from any other node in the network, was basically a distributed, decentralized P2P network. The need for distributed and decentralized data storage is not new either. One of the first services running over the Internet, the Domain Name System (DNS), has a distributed database that is a good example of data partitioning across a widely distributed set of nodes, although it is generally used in a more hierarchical, client-server style architecture.

The concept of P2P networks as such only really received interest from the market in the early 2000s with the birth of the likes of Napster, Gnutella and Kazaa, which brought P2P technology to the masses in the form of file sharing services, although the research and proposals that made it possible to design and implement this type of structure had been ongoing for a couple of years. Since then, several protocols based on the concept of P2P overlay networks have emerged and have been widely surveyed [1]–[3].

P2P file sharing networks can be classified according to whether a central repository stores an index of all the files: centralized networks (e.g. Napster) have one, whereas in decentralized networks the location of a file is obtained by searching through the network using some search algorithm (e.g. Kazaa). EUNOMIA aims to be a decentralized platform, so centralized P2P networks are discarded from this analysis. Most decentralized P2P networks can be classified by their topology, that is, by how they route requests for data (i.e. search), as structured or unstructured.

Unstructured P2P

Unstructured P2P file sharing networks usually resort to some flooding mechanism [4], broadcasting requests for data to nearby peers in the hope that they reach the node which holds the requested data. Unlike in structured P2P networks, peers leaving or joining (i.e. churn) is not considered a problem.

The serverless network file system presented by Thomas Anderson et al. [5], based on xFS [6], proposes a table in which the filename (index) is used to find the nodes where the file is available (replicated). Later, Ross Anderson proposed the concept of the Eternity Service [7] as a communication channel resistant to denial of service attacks, which would allow data to be stored for a long time using a widely accepted digital currency, still nonexistent at the time, to cover the infrastructure cost.

Inspired by the Eternity Service, Usenet, DNS and the World Wide Web, Freenet [8]–[10] was proposed with the aim of creating a distributed storage network in which anyone with a given key could easily retrieve the data corresponding to that key. It is an unstructured P2P overlay network built to allow anonymous access to and handling of data (publication, replication and retrieval) over wide network infrastructures. Content indexing was based on a proximity function that evaluates which node is more likely to hold a given key, or to know its whereabouts, supported by a controlled flooding mechanism, hashing the descriptive filename

into a key, and local caching of data. The algorithm was based on the small-world phenomenon [11], [12], drawing on the fact that people first connect to nodes they already know, and nowadays it provides an Internet-wide way of accessing data anonymously.

The Gnutella protocol and application [13], [14] is a file sharing system created in the early 2000s by an AOL subsidiary for exchanging files over a P2P architecture, in a similar way to Napster. It allows a node to connect to a set of other nodes and to send messages to multiple nodes in order to locate a specific file. FastTrack [15] is a proprietary protocol widely used by file sharing applications such as KaZaa, in which nodes are organized in a two-tier hierarchy (supernodes and regular nodes) and each node maintains a local resource index which it shares with its associated supernode. Unstructured Multisource Multicast (UMM) [16] addresses lookup by means of multicasting. For that, UMM builds multicast trees using implicit information obtained from flooding the overlay to learn about optimal paths. The widely known BitTorrent protocol [17] uses a tit-for-tat method to seek efficiency and fairness in contributions to the P2P network. Each file must be managed by a tracker node, which keeps track of all the other nodes that may have parts of that file.

Structured P2P

In structured P2P networks, search is implemented through a specific distributed data structure that provides an efficient way to discover the node where data is stored [2]. Distributed hash tables (DHT) [18] are among the data structures most used by structured P2P storage networks, providing efficient ways to locate data from a specific search key. While optimized for search operations, structured P2P networks suffer from peer churn, which causes rebalancing overheads to keep the data stored efficiently. Chord [19] is one of the first distributed scalable lookup protocols.
It uses a DHT and consistent hashing [20], and can perform a lookup in an N-node network with O(log N) messages. Similarly, the Pastry [21] P2P overlay takes network locality into account and exploits prefix routing [22] to reduce the average path length. Coral introduces Distributed Sloppy Hash Tables (DSHT) [23], built as a layer on the Chord lookup service, which let nodes locate nearby copies of a file while avoiding overloading nearby hosts when a key becomes popular. DSHT enabled the introduction of a decentralized clustering algorithm by which nodes can find each other and form clusters of varying network diameters.

Content Addressable Network (CAN) [24] aims at self-organization, fault tolerance, low latency and scalability, and demonstrates them through simulation. It uses a DHT on a virtual d-dimensional coordinate space on a d-torus, dynamically partitioned among all nodes, in which each key is mapped into the coordinate space using a uniform hash function. A node in the CAN learns and maintains the IP addresses of the nodes that hold coordinate zones close to its own in order to support routing. The data lookup path length grows as O(n^{1/d}). Kademlia [25] is another DHT-based P2P overlay, which combines consistent hashing with an XOR-based metric to compute the distance between keys, resulting in O(log n) lookup performance for an n-node network.

Since it is well known that a DHT destroys string locality, the SkipNet overlay network [26] tries to address this by using the string names of nodes as the data record keys, instead of plain hashes,

and forming a doubly linked ring instead of a list, while still achieving O(log n) performance similar to Chord, together with controlled data placement.

P2P Solutions

Several technological solutions implement these P2P storage concepts, such as GNUnet, Resílio, Storj, BitTorrent trackers and clients, las2peer, Barrel, Scuttlebot, TomP2P, IPFS, etc.

GNUnet [27] is a framework for implementing decentralized, anonymous P2P mesh networks. It provides resources for peer discovery, routing, encrypted communications, resource management and content addressing, and includes a DHT based on Kademlia. Resílio Connect [28] is a commercial product that provides a scalable P2P solution for moving and syncing data, while enabling data to be shared across several peers. Storj [29] is a commercial solution that delivers object storage to its customers using blockchain technology; it can be used to store raw information or structured information such as transactions.

Several open-source projects, existing since 2003, provide all the means and software needed to set up a BitTorrent network: the Tracker [30] and several clients such as qBittorrent [31] and Transmission [32]. las2peer [33] is a Java-based open source framework for distributing community services in a peer-to-peer infrastructure, and it is being developed at Aachen University. It provides distributed data storage and communication encryption, and also allows RESTful APIs to be written for integration and interaction with external systems. Barrel [34] is a modern document-oriented database written in Erlang under an open source licence, focusing on data locality (put/match the data next to you) and P2P, allowing master-master replication with configurable redundancy. It provides an HTTP API for external application integration. Scuttlebot [35] is an open source P2P log store.
It can be used as a database, an identity provider and a messaging system. It features global replication, file synchronization and end-to-end encryption. Scuttlebot forms a global cryptographic social network with its peers. Each user is identified by a public key and publishes a log of signed messages, which other users can follow socially. Scuttlebot searches the P2P mesh for new messages and files from followed users and from FoaFs (friends of a friend). The messages and files are stored locally, indefinitely, for applications to read. It also provides an API for external application integration.

TomP2P [36] is a DHT that allows multiple values to be stored for a key. Each peer keeps a table on disk or in memory to store its values. The underlying communication framework uses Java NIO to handle many concurrent connections. TomP2P offers direct and indirect replication mechanisms: in direct replication, peers constantly publish their own content, whereas in indirect replication peers publish content on behalf of others. P2PFS [37] is a file system implemented on top of FUSE using the P2P Kademlia protocol. It is supported by TomP2P, a P2P library and Kademlia DHT implementation that provides a decentralized key-value infrastructure for distributed applications.

XtreemFS [38] is a general-purpose storage system that covers most storage needs in a single deployment. It is open source and requires no special hardware or kernel modules. It can be mounted as a file system on several operating systems, such as Linux, Windows and OS X.


Similar to Freenet, Tahoe-lafs is a free and open decentralized cloud storage system [39]. It distributes data across multiple nodes. Like P2PFS, it provides a transparent file system interface, allowing regular files to be accessed by regular application means, completely transparently to the application that uses it.

Ivy [40] is a multi-user read/write peer-to-peer file system based on Chord. It is suitable for small cooperative groups spread over large geographic areas. An Ivy file system consists solely of a set of logs, one log per participant. Ivy stores its logs in a distributed hash table, and each participant finds data by consulting all logs but performs modifications by appending only to its own log. This arrangement allows Ivy to maintain metadata consistency without locking. Ivy users can choose which other logs to trust, an appropriate arrangement in a semi-open peer-to-peer system. It also supports replication mechanisms, allowing the same block to be stored on different peers.

The InterPlanetary File System (IPFS) [41] is a distributed file system, similar to a BitTorrent swarm exchanging Git objects, that aims to provide a high-throughput content-addressed block storage service. IPFS combines different proven techniques into a cohesive solution. It uses S/Kademlia (an extension of Kademlia [42]) crypto puzzles for generating node identifiers, and a DSHT based on Coral, which minimizes the impact on hosts holding popular keys and allows node clustering. It resorts to self-certifying pathnames from SFS [43], in which the 'filename' certifies the location of the data. A specific object (file or other data structure) is located not by an address such as a Uniform Resource Locator (URL), but by its content (actually the hash of the stored data). This is called content addressing, and one of its side effects is that unwanted duplication is avoided (if two files are the same, they have the same hash value and are therefore the same object).
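The content-addressing idea described above can be illustrated with a minimal sketch. It assumes a plain SHA-256 hex digest as the identifier, which is a simplification: real IPFS content identifiers wrap the hash in a self-describing encoding. The `ContentStore` class and its methods are hypothetical names used only for this example.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key of an object is the hash of its bytes."""

    def __init__(self):
        self.objects = {}

    def add(self, data: bytes) -> str:
        # The identifier is derived from the content, not chosen by the caller.
        cid = hashlib.sha256(data).hexdigest()
        self.objects[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        data = self.objects[cid]
        # The identifier lets the reader verify what it received.
        if hashlib.sha256(data).hexdigest() != cid:
            raise ValueError("content does not match its identifier")
        return data


store = ContentStore()
first = store.add(b"same bytes")
second = store.add(b"same bytes")
print(first == second)      # identical content yields the same identifier
print(len(store.objects))   # so it is stored only once
```

Because the identifier is recomputed on read, a node that returns altered data is detected without trusting that node.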
Following the same approach as the popular version control system Git, Merkle Directed Acyclic Graphs (DAGs) are used to provide distributed version control for all stored objects.

There is plenty of ongoing work bringing together blockchain and P2P technologies, such as the work presented in [44], in which a blockchain is used to provide integrity for the data and metadata stored in a P2P network. Recently, techniques to improve the efficiency of replication mechanisms, such as GLARAS, have been proposed, as well as strategies to improve the response to node churn, such as [45].

In this section, an overview of different P2P protocols and technologies is presented. Each assessed technology is characterized, taking into account the project requirements and needs previously expressed.
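Among the structured lookups surveyed in this section, the XOR-based distance used by Kademlia is simple enough to illustrate in a few lines. The sketch below uses small integers as identifiers and hypothetical function names (`xor_distance`, `closest_nodes`). It shows only the metric and the selection of the nearest nodes to a key, not the routing tables or the iterative lookup of a real implementation.

```python
def xor_distance(a, b):
    """Distance between two identifiers, interpreted as an integer."""
    return a ^ b


def closest_nodes(key, nodes, k=2):
    """Return the k node identifiers closest to the key under the XOR metric."""
    return sorted(nodes, key=lambda node: xor_distance(node, key))[:k]


nodes = [0b0001, 0b1001, 0b1100, 0b0111]
print(closest_nodes(0b1000, nodes))  # -> [9, 12]
```

The metric is symmetric and the distance from an identifier to itself is zero, so every node agrees on which peers are closest to a given key.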

3.1 Existing technologies

Due to their relevance in the field, the technologies listed in Table 4 were analysed.

Table 4: Analysed P2P technologies.

| Name | Description |
|---|---|
| Resílio [28] | Resílio Connect provides a scalable P2P solution for moving and syncing data, while enabling data to be shared across several agents or peers. |
| Storj [29] | Storj delivers object storage to its customers using blockchain technology; it can be used to store raw information or structured information such as transactions. |
| BitTorrent-Tracker and Client [30], [31] | Several open-source projects, existing since 2003, provide all the means and software needed to set up a BitTorrent network. |
| las2peer [33] | las2peer is a Java-based open source framework for distributing community services in a P2P infrastructure. |
| Barrel [34] | Barrel is a modern document-oriented database written in Erlang, focusing on data locality (put/match the data next to you) and P2P. |
| Scuttlebot [35] | Scuttlebot is an open source P2P log store used as a database, identity provider and messaging system. It features global replication, file synchronization and end-to-end encryption. |
| TomP2P [36] | TomP2P is a DHT with additional features, such as storing multiple values for a key. Each peer has a table (either disk-based or memory-based) to store its values. |
| P2PFS [37] | A file system implemented on top of FUSE¹ using the P2P Kademlia protocol. |
| XtreemFS [38] | XtreemFS is a general-purpose storage system that covers most storage needs in a single deployment. |
| Tahoe-lafs [39] | Similar to Freenet, it is a free and open decentralized cloud storage system. |
| IPFS [41] | The InterPlanetary File System (IPFS) is a distributed file system that aims to connect all computing devices with the same system of files, similar to a single BitTorrent swarm exchanging Git objects. |
| Ivy [40] | Ivy is a multi-user read/write peer-to-peer file system based on Chord. |

The next section compares these technologies regarding the maturity level, the used protocols, the efficiency and how they present themselves to the upper layers.

3.2 Comparison matrices

These existing technologies differ on several levels. From the maturity perspective, some are just prototypes while others are full-blown software components that may or may not share protocol implementations. Among the existing implementations, some are offered as final commercial products, while others are open source projects maintained by the community.

¹ Filesystem in Userspace (FUSE) is an interface available in some Unix-like operating systems which allows a non-privileged user to implement a file system without requiring kernel-level access.


These P2P networks are usually supported by one or more of the following known protocols and algorithms, presented in Table 5, where each one is classified according to:

◼ type – the network topology, structured (S) or unstructured (U);
◼ lookup – how the lookup time grows in terms of the number of nodes (N) and other parameters;
◼ churn – how well nodes leaving and joining the network are supported;
◼ security notes – some notes about the security mechanisms they implement.

Table 5: P2P protocols classification.

| Designation | Type | Lookup² | Churn | Security |
|---|---|---|---|---|
| Chord [19] | S | O(log N) | Performance drops and high overhead | N/A |
| CAN³ [24] | S | (d/4)*n^(1/d) | Resilience is a factor of node degree | N/A |
| Pastry [21] | S | O(log_B N) | Robust for new departures | N/A |
| Kademlia [25] | S | O(log N) | Good support through redundancy | Old nodes more trusted |
| SkipNet [26] | S | O(log N) | Multiple alternate paths | Controlled data placement |
| FreeNet [9] | U (UD) | Constrained flooding | Routing redundancy and small-world topology | Anonymity and privacy |
| UMM [16] | U (UD) | Flooding over multicasting trees | Heartbeat monitoring mechanism | N/A |

2 The lookup time is expressed as a function in Big O notation (https://en.wikipedia.org/wiki/Big_O_notation). The variables taken into account are the number of nodes (N), the number of zones (n), the neighbours of a node (d) and the base of the identifier space (B).
3 d is the number of dimensional spaces configured.
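To give a feel for how the lookup expressions in Table 5 scale, the following minimal Python sketch evaluates them for a few network sizes. It is only illustrative: Big O expressions hide constant factors, so the sketch assumes a constant of 1 and a base-2 logarithm for Chord, Kademlia and SkipNet, and the function names and the sample values of d and B are assumptions made for this example.

```python
import math


def log_n_hops(n_nodes: int) -> float:
    """O(log N) growth (Chord, Kademlia, SkipNet), assuming base 2 and constant 1."""
    return math.log2(n_nodes)


def can_hops(n_zones: int, d: int) -> float:
    """CAN expression from Table 5: (d/4) * n^(1/d)."""
    return (d / 4) * n_zones ** (1 / d)


def pastry_hops(n_nodes: int, base: int) -> float:
    """O(log_B N) growth (Pastry), assuming constant 1."""
    return math.log(n_nodes, base)


if __name__ == "__main__":
    # d = 4 and B = 16 are sample values chosen only for illustration.
    for n in (1_000, 100_000, 10_000_000):
        print(
            f"N={n:>10}  log N: {log_n_hops(n):5.1f}  "
            f"CAN(d=4): {can_hops(n, 4):6.1f}  Pastry(B=16): {pastry_hops(n, 16):4.1f}"
        )
```

The logarithmic expressions grow slowly with the network size, whereas the CAN expression depends strongly on the number of dimensions d that is configured.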


Table 6 compares the different P2P technologies (implementations) according to whether they are available as open source projects or as commercial products, and whether the provided APIs can be used as a traditional file system.

Table 6: P2P technologies comparison matrix.

| Name | Type (Product/Framework) | Open source | Commercial | File system based |
|---|---|---|---|---|
| Resilio | Product | NO | YES | NO |
| Storj | Product | NO | YES | NO |
| BitTorrent-Tracker and Client | Framework | YES | NO | NO |
| las2peer | Framework | YES | NO | NO |
| Barrel | Framework | YES | NO | NO |
| Scuttlebot | Framework | YES | NO | NO |
| TomP2P | Framework | YES | NO | NO |
| P2PFS | Framework | YES | NO | YES |
| XtreemFS | Framework | YES | NO | YES |
| Tahoe-LAFS | Framework | YES | NO | YES |
| IPFS | Framework | YES | NO | YES |
| Ivy | Framework | YES | NO | YES |

Table 7 summarises the evaluated P2P technologies and the protocols they implement. The technologies are classified by a forecast adoption difficulty in the context of EUNOMIA. This difficulty estimate results from the analysis of the available information about each technology, including the programming language in which it is implemented, forums and other information identified during our research. The Notes column presents conclusions from testing the different technologies, and the Requirements Met column lists the project requirements each one meets. Some technologies proved to be mere prototypes (very low TRL) without an implementation stable enough to be used in this project, and testing them provided no additional information (N/A).

Table 7: Summary of P2P technologies.

| Designation | Difficulty | Protocols | Language | Notes | Requirements Met |
|---|---|---|---|---|---|
| Resilio | Low (Paid) | BitTorrent | N/A | N/A | N/A |
| Storj | Low (Paid) | Similar to Kademlia | N/A | N/A | N/A |
| las2peer | Low (Open Source) | Custom⁴ | Java | las2peer allows the development of a custom API. It is more oriented to web services, but the required features can be implemented with additional development. | All |
| Scuttlebot | Low (Open Source) | Custom⁵ | Bash & JavaScript | Scuttlebot works as a publish-subscribe app and allows the development of a custom API. | FR 53-56, NFR-21 |
| TomP2P | Low (Open Source) | Similar to Kademlia | Java | Besides the API already provided, TomP2P uses a DHT allowing multiple values for each key, and allows the development of a custom API. | All |
| IPFS | Low (Open Source) | Joins Git, BitTorrent and Kademlia | Go & JavaScript | IPFS has multiple modules, including OrbitDB, which has an available REST API that allows the development of a custom API for the purposes of the project. | All |
| P2PFS | High (Academic) | Similar to Kademlia | Java | P2PFS is an academic project with no support, which uses TomP2P. | N/A |
| XtreemFS | High (Low maturity) | Similar to Freenet | N/A | Difficulties testing and setting up. Poor documentation. | N/A |
| Ivy | High (Low maturity) | Chord | N/A | Ivy has scalability issues, and conflicts may appear in log records which can only be solved with a set of specific tools. The last distribution of Ivy found is from 2003. | N/A |

4 It uses a custom protocol not identified during our review.
5 It uses a custom protocol not identified during our review.

3.3 Selected technologies

The EUNOMIA decentralised architecture has, by definition, no central point: all nodes are equal in terms of the functionality provided, and no single node governs the others. This implies the use of either unstructured peer-to-peer technologies or structured but dynamic ones, i.e. technologies with mechanisms to redefine the structure of the overlay dynamically, as happens with Kademlia’s DHTs. It is also important to use a mature solution with sufficient documentation and community support, and with low complexity in terms of usage, integration and implementation. The following technologies were not selected, for the reasons given:

◼ Resilio and Storj – they are commercial products, which could impact EUNOMIA’s scalability and independence;
◼ Scuttlebot – it was not possible to evaluate its performance due to lack of information, and it does not support the storage of structured data types;
◼ P2PFS – it is a highly academic project with a low TRL;
◼ XtreemFS – it is too immature; we had difficulties setting up a pilot during our testing;
◼ Ivy – it has scalability issues and too little support; we could not find any activity since 2003.


From the information collected in Table 7, and taking into account the additional non-functional requirements, the selected candidates that satisfy all the needed requirements are the following:

◼ las2peer; ◼ TomP2P; ◼ IPFS.

las2peer allows the development of distributed web services. It does not directly provide a data-oriented P2P network; instead, it is a highly reliable and secure platform for creating community information systems and community services. Some of the needed mechanisms can be implemented indirectly on top of it, which is why it is considered here. However, no information was found on the performance of its storage space in terms of lookup time and scalability.

TomP2P is a DHT with additional features, such as storing multiple values for a key. Each peer has a table (either disk-based or memory-based) to store its values. A single value can be queried or updated with a secondary key. The underlying communication framework uses Java NIO to handle many concurrent connections. It is based on a Kademlia-like DHT and is also a potential candidate, but it does not provide any additional modules.

Of these candidates, and given the information that it was possible to collect, IPFS is the best supported, most mature open source option, with a large user base. The underlying technologies used by IPFS, based on BitTorrent and Kademlia, provide good lookup efficiency (O(log N), as listed for Kademlia in Table 5) and respect the distributed characteristics of the EUNOMIA architecture. There is also a large and active community around it, which has developed several auxiliary modules, some of which can be applied to EUNOMIA, namely:

◼ OrbitDB [46] – a document-based database implemented on top of IPFS that can store arbitrary JSON documents;
◼ HTTP API for OrbitDB [47] – an HTTP API developed to allow an easier integration of foreign modules with the database using REST.

The IPFS project contains many other modules that can be eventually used. Another feature that might be useful in the future and it is already used internally by the cluster management component of IPFS, is the Publish-Subscribe mechanism. This communication mechanism is a pattern often used to handle events in large-scale networks. ‘Publishers’ send messages classified by topic or content and ‘subscribers’ receive the messages in the topics they have been subscribed to, all without direct connections between publishers and subscribers. This approach offers greater network scalability and flexibility to implement generic node messages that might be useful for some EUNOMIA features. In this case, a message broker with a publish-subscribe interface would be implemented as part of the Storage Server, providing the underlying messaging functionality to every service running on a EUNOMIA node.
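The publish-subscribe pattern described above can be sketched in a few lines of Python. This is a minimal in-memory illustration of the pattern only, not the IPFS pubsub implementation nor the planned Storage Server broker; the class name `TopicBroker`, the handler signature and the topic name used in the usage example are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic name and the message payload.
Handler = Callable[[str, dict], None]


class TopicBroker:
    """In-memory broker: publishers and subscribers only share topic names."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler for every future message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to all subscribers of the topic.

        Returns the number of handlers that were notified.
        """
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(topic, message)
        return len(handlers)


if __name__ == "__main__":
    broker = TopicBroker()
    received = []
    # "node-events" is a hypothetical topic name.
    broker.subscribe("node-events", lambda topic, msg: received.append(msg))
    broker.publish("node-events", {"event": "node-joined", "node": "node-7"})
    print(received)
```

The publisher never holds a reference to its subscribers, which is the decoupling property that makes the pattern attractive for generic node messages.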

Even though IPFS was selected based on the project requirements, this list can be revisited if, for some unforeseen reason, IPFS cannot cover the functionality required for a robust EUNOMIA implementation during the execution of this task. The upper-level APIs would remain the same, avoiding changes to the implemented services.


4. P2P INFRASTRUCTURE

The EUNOMIA decentralised architecture consists of several nodes interconnected with each other and an application to run on user devices which will communicate with the EUNOMIA Services Nodes. The P2P data storing component is just an additional service that is added horizontally to all the EUNOMIA nodes, responsible for the data storing and sharing.

[Figure: Nodes 0 to n, each running a Storage Service on top of OrbitDB and IPFS, interconnected through the P2P network.]

The topology of the P2P network is completely flat and the connections between the nodes can be arbitrary. The EUNOMIA nodes form a global IPFS cluster, whose members share a specific key, without any imposed structure. The IPFS storage functionality enforces no topology on the EUNOMIA network nodes, and it is assumed that, from the P2P point of view, every node is equal and runs the same functional components. Not requiring a specific topology, in which some nodes would be privileged over others, is aligned with the decentralized nature of EUNOMIA and supports multiple governance models. The next sections describe the different components involved.

4.1 P2P Node

In this section, the P2P node is dissected, from the upper layer to the lower layer protocols. Please note that these components are only related to the P2P storage and communication functionality. Other EUNOMIA services that implement functions besides storage are out of scope of this document. Every EUNOMIA P2P node runs an instance of:

◼ Storage Server – provides an abstraction layer that implements the data storage and sharing functionality for other EUNOMIA services. It exposes a REST interface that is consumed by those services. This includes an abstraction over the functions required to store data and over the functions that ensure the integrity of stored data (the blockchain service);
◼ OrbitDB HTTP API – an adaptor that provides a REST interface to OrbitDB. This layer provides an easy integration with OrbitDB without changing the database storage capabilities and semantics;
◼ OrbitDB – an OrbitDB instance that implements a document-based storage engine on top of IPFS;
◼ IPFS Cluster – a daemon that implements the IPFS cluster functionality and runs the distributed protocol that maintains the data replication factor between the nodes, along with other cluster functions;
◼ IPFS Daemon – a low-level daemon that implements IPFS on a given host.

These components are layered on a given node as illustrated in Figure 2.

[Figure: EUNOMIA Services 1 to n invoke the Storage Server REST API. EUNOMIA layer: Storage Server. P2P storage layer: OrbitDB HTTP API, OrbitDB, IPFS Cluster, IPFS Daemon. Host: internal storage and network resources.]

Figure 2: Anatomy of a P2P EUNOMIA node.

The green components shown above correspond to the EUNOMIA layer, where all the functionality is built in. The blue components show what the P2P infrastructure provides; it is this layer that uses the host resources, represented by the grey boxes:

◼ storage – where replicated and cached data is maintained locally;
◼ network – where IPFS protocols and data are transferred and looked up between the nodes.

The P2P infrastructure will provide EUNOMIA with the following storage mechanisms:

◼ a configurable level of redundancy – data is replicated between the nodes with a configurable replication factor. Given a replication factor of N, IPFS Cluster will ensure that the data is “pinned” on at least N physical nodes;
◼ a shared storage space supported by the local storage devices of every participating node;
◼ binding of the nodes that participate in a given P2P network;
◼ detection of node failures and election of new nodes to pin the shared data, so that the required replication factor is respected;
◼ management of the discovery and connection of new nodes;
◼ communication between nodes using SSL channels;
◼ whitelisting of the nodes participating in the cluster.
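The replication mechanism listed above (pinning on at least N nodes and electing new nodes after a failure) can be illustrated with the following Python sketch. It is a simplified model written for this document and not the allocation algorithm used by IPFS Cluster: the function name `repin`, the "least loaded node first" rule and the node and CID names are all assumptions.

```python
from typing import Dict, Set


def repin(
    allocations: Dict[str, Set[str]], alive: Set[str], factor: int
) -> Dict[str, Set[str]]:
    """Return new pin allocations after some nodes have failed.

    allocations maps a content identifier to the nodes that pinned it.
    Pins on failed nodes are dropped, then the least loaded alive nodes
    are elected until each item is pinned on `factor` nodes (or until no
    alive node is left to elect).
    """
    result: Dict[str, Set[str]] = {}
    load = {node: 0 for node in alive}

    # Keep only the pins that are still held by alive nodes.
    for cid, nodes in allocations.items():
        kept = nodes & alive
        result[cid] = set(kept)
        for node in kept:
            load[node] += 1

    # Elect new nodes for under-replicated items, least loaded first.
    for cid in sorted(result):
        holders = result[cid]
        candidates = sorted(
            (n for n in alive if n not in holders), key=lambda n: (load[n], n)
        )
        for node in candidates:
            if len(holders) >= factor:
                break
            holders.add(node)
            load[node] += 1
    return result


if __name__ == "__main__":
    before = {"cid1": {"node1", "node2"}, "cid2": {"node2", "node3"}}
    # node2 has failed; node4 is available to take over pins.
    after = repin(before, alive={"node1", "node3", "node4"}, factor=2)
    for cid, nodes in sorted(after.items()):
        print(cid, sorted(nodes))
```

In a real deployment the election would also have to consider the free storage space and the reachability of each node, which this sketch ignores.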

4.2 P2P Network

The EUNOMIA decentralised architecture consists of several nodes interconnected with each other, each node being an instance of the P2P node described previously, extended with the rest of the EUNOMIA components. Since there is no predefined topology, the network can take the form of an arbitrary graph with arbitrary connections. IPFS also assumes that nodes can be transient: they can fail and reappear, and the underlying algorithms try to cope with all these scenarios. IPFS uses a gossip protocol [48] to ensure that data is disseminated to the nodes of EUNOMIA so that the replication factor is respected.

In IPFS a given block of data, such as a file, is represented by a content identifier (CID). Contrary to what happens on the web or within a filesystem, where a given content is identified by a URL or a file path, in an IPFS network the content is uniquely identified by the content itself. This scheme is called content-based addressing, and it makes the address of a given block of information independent of where it is stored. When a node requests a given CID, it searches through its part of the DHT to find a node that can provide the requested content. Once a node that has the content is found, the content is copied to the requesting node, where it is cached and served. This is depicted in Figure 3.

Figure 3: Content based addressing used in IPFS.
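The idea of content-based addressing can be sketched in Python as follows. This is an illustrative simplification: the identifier here is a plain SHA-256 hex digest, whereas real IPFS CIDs wrap the hash in a self-describing multihash format, and the DHT lookup is replaced by a linear scan over a list of peers. The names `content_id`, `BlockStore` and `fetch` are assumptions made for this sketch.

```python
import hashlib
from typing import Dict, Iterable


def content_id(data: bytes) -> str:
    """Derive the identifier from the content itself (simplified CID)."""
    return hashlib.sha256(data).hexdigest()


class BlockStore:
    """Local block storage of one node, indexed by content identifier."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._blocks[cid] = data
        return cid

    def has(self, cid: str) -> bool:
        return cid in self._blocks

    def get(self, cid: str) -> bytes:
        return self._blocks[cid]


def fetch(cid: str, local: BlockStore, peers: Iterable[BlockStore]) -> bytes:
    """Return the block for cid, copying it from a peer and caching it if needed."""
    if local.has(cid):
        return local.get(cid)
    for peer in peers:  # stands in for the DHT lookup
        if peer.has(cid):
            data = peer.get(cid)
            if content_id(data) != cid:
                continue  # the content does not match its address, ignore it
            local.put(data)  # cache locally so the block can be served again
            return data
    raise KeyError(cid)


if __name__ == "__main__":
    node_a, node_b = BlockStore(), BlockStore()
    cid = node_b.put(b"a post stored on node B")
    print(fetch(cid, node_a, [node_b]), node_a.has(cid))
```

Because the address is derived from the content, the requesting node can verify what it receives, and any change to the data yields a different address.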

From the point of view of IPFS, the network can be seen as an arbitrary mesh of nodes, storing data blocks with a given CID, like the one described in the following picture:


[Figure: six nodes (Node 1 to Node 6) exchanging gossip messages; each of the blocks CID1, CID2 and CID3 is stored on two nodes.]

Figure 4: P2P network with 3 blocks and a replication factor of 2.

Figure 4 depicts the gossip protocol that runs between the nodes to exchange messages related to cluster management, including guaranteeing that the content is pinned with the correct replication factor. Each node stores one content block, which in this particular case is replicated (pinned) at least twice. Please note that a given node can cache other data blocks as well, such as the ones that are requested at any given moment by a local EUNOMIA service.


5. P2P NODE API

In this section, the P2P node API is described. Note that this refers to the upper API that is presented to the intranode EUNOMIA services. This API encapsulates all the underlying P2P complexity, providing a decoupling layer from the specific P2P technology that is being used. This decoupling will allow other P2P technologies to be reconsidered in the future.

5.1 Operations

EUNOMIA services invoke the Storage Server API when they need to store, retrieve or update arbitrary data represented by an object. An object is defined by a given ID, type and content. The API follows the REST paradigm and allows CRUD operations on objects with arbitrary data structures, as depicted in the following table:

Table 8: Storage Server REST API.

| Operation | Path | Input | Description |
|---|---|---|---|
| GET | /objects | Type – The type of the desired objects. | List all objects of a specified type. |
| POST | /objects | JSON containing arbitrary properties and a mandatory ID and Type. | Create a new object. |
| DELETE | /objects/{ID} | ID – The object ID. | Deletes an existing object with a given ID. |
| GET | /objects/{ID} | ID – The object ID. | Gets an existing object with a given ID. |
| PUT | /objects/{ID} | JSON containing the new properties of the object. | Updates an existing object with new property values. |

Please note that not all arguments are represented in the operations of the table above. Since all the operations are authenticated, they are always invoked passing the EUNOMIA’s access token argument, retrieved from the AAA authentication API, as described in the specifications and architecture design (D3.2). This API represents the current Storage Server API implementation, but can be extended or changed in the future, with additional operations, in order to support needed functionalities not currently implemented in this version of the EUNOMIA platform.
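As an illustration of how a EUNOMIA service might call the operations of Table 8, the following Python sketch builds the corresponding HTTP requests with the standard library. Several details are not specified in this document and are assumptions of the sketch: the base URL, the `type` query parameter name, the lower-case `id` and `type` JSON field names, and the transport of the access token in an `Authorization: Bearer` header.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:5000"  # hypothetical address of the local Storage Server


def build_request(
    method: str,
    path: str,
    token: str,
    query: Optional[dict] = None,
    body: Optional[dict] = None,
) -> Request:
    """Build (but do not send) an authenticated request to the Storage Server."""
    url = BASE_URL + path
    if query:
        url += "?" + urlencode(query)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # Assumption: the EUNOMIA access token travels as a bearer token.
    headers = {"Authorization": f"Bearer {token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


def list_objects(token: str, object_type: str) -> Request:
    return build_request("GET", "/objects", token, query={"type": object_type})


def create_object(token: str, obj: dict) -> Request:
    return build_request("POST", "/objects", token, body=obj)


def get_object(token: str, object_id: str) -> Request:
    return build_request("GET", "/objects/" + quote(object_id, safe=""), token)


def update_object(token: str, object_id: str, properties: dict) -> Request:
    return build_request(
        "PUT", "/objects/" + quote(object_id, safe=""), token, body=properties
    )


def delete_object(token: str, object_id: str) -> Request:
    return build_request("DELETE", "/objects/" + quote(object_id, safe=""), token)


if __name__ == "__main__":
    req = create_object("my-token", {"id": "post-1", "type": "post", "text": "hello"})
    print(req.get_method(), req.full_url, req.data)
```

Each returned request can be sent with `urllib.request.urlopen(req)` once a Storage Server is reachable at the configured address.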

5.2 Overall integration

In the previous sections, the P2P node was described in detail in terms of its low-level components. This section describes how the system is used as a whole. It was decided to have a one-to-one relationship between a P2P node and a EUNOMIA node: the P2P node is just one of the internal components living inside a EUNOMIA node. Arbitrary connections can exist between the nodes and, from the point of view of the functionality provided, the nodes are essentially equal. The global state is maintained in the distributed storage space, formed by the individual local storage of the nodes that constitute the entire P2P network. Every node shares the same distributed storage space, which acts like a global database that is available to all the nodes, highly redundant and dynamic. The digital companion uses the services that are running on a given node, and can connect to an arbitrary node. The following figure illustrates the global high-level architecture:

[Figure: Nodes 1 to 4 around a common shared storage, with the user connected to one of the nodes.]

Figure 5: EUNOMIA nodes.

In Figure 5, the “User” is connected to “Node 2”. If that node becomes unavailable, or if the user connects to another node, the shared storage still contains the needed data, including the cascade information associated with the posts that the user is seeing on the digital companion running on his or her terminal. All this data is redundantly replicated in the common shared storage. The resilience of the storage is a function of the number of available nodes: with more nodes, the network becomes more resilient and its capacity also increases.

[Figure: Digital Companion → EUNOMIA Services → Storage Service → Shared Storage Space.]

Figure 6: Storage data flow.


The data flow is depicted in Figure 6, in which the digital companion running on the device of the user connects to one or more services running in the EUNOMIA nodes, which in turn use the local storage service to access the shared external storage.

6. LIMITATIONS AND CHALLENGES

A P2P architecture poses several limitations and challenges, some inherited from the selected technology (IPFS) and some from the strongly distributed and highly decentralized architecture of EUNOMIA. Some of the most prominent challenges regarding the EUNOMIA architecture are related to the security of the platform as a whole. These security challenges are addressed in the design and implementation of the security and privacy framework (D3.3). This section focuses on the limitations regarding the access to and availability of data, but it also touches on some of the security challenges mentioned in D3.3.

6.1 Known limitations

Table 9 depicts some of the foreseen limitations:

Table 9: P2P limitations and challenges.

| ID | Limitation/Challenge | Description |
|---|---|---|
| L1 | IPFS stores the data in plain text, locally in the nodes. | Without additional mechanisms, the information is stored in plain text in all the nodes belonging to the EUNOMIA network. This can pose a risk if sensitive information is stored. |
| L2 | IPFS does not allow the information to change. | IPFS uses content addressing, which implies that a change in data is, in reality, new data with a newly created address. |
| L3 | IPFS robustness is based on the number of nodes. | The larger the number of nodes that constitute the network, the greater the robustness to node failures. |
| L4 | Without additional mechanisms, data can be viewed and changed by any participating node. | This poses a security risk when a misbehaving or malicious node joins the network. |
| L5 | There is no mechanism in place to garbage collect data that is no longer used. | IPFS does not automatically erase data that is no longer used. |
| L6 | OrbitDB implements a strongly eventually consistent database. | Data consistency between the nodes can be an issue if the functionality implemented is strongly transactional. |
| L7 | There is no central authority to coordinate node provisioning. | This can be a problem taking into account the aforementioned issues. |
| L8 | Storage capacity limitations. | Storage capacity can be an issue if there are not enough nodes in the network. |
| L9 | The cluster secret key is shared among the IPFS peers. | Since the secret key is shared, if this key is compromised it must be changed across the cluster. |

6.2 Possible solutions

This section presents some of the proposed solutions for the aforementioned limitations. These solutions are shown in Table 10. Some of the challenges are security related and are addressed in D3.3.

Table 10: Possible solutions for some of the P2P current issues.

| ID | Solution and comments |
|---|---|
| L1 | IPFS/OrbitDB stores information in clear text. This is often not an issue when the information is public, but it can be an issue if personal data is involved. Therefore, data must be classified according to its privacy level, and a scope for usage must be defined. The Storage Server can then encrypt data before it is stored at the IPFS level. Proper key management mechanisms need to be designed and implemented. |
| L2 | This is addressed by OrbitDB. For every document change, a new block is created, and in a cluster scenario where data replication is involved, conflicts are handled by the use of an append log, that is, an operation-based conflict-free replicated data structure. |
| L3 | The number of nodes is crucial to the robustness of the platform. The strength of the different distributed technologies, not only P2P but also the blockchain, stems from the number of participants (the nodes that verify or store the content), and it is also a concept that is inherent in EUNOMIA. Therefore, the larger the number of users, the more resilient the platform will be. Simulation and testing will be required to define the minimal operational number of users. |
| L4 | This issue is similar to L1, and can be solved by cryptographic means in conjunction with the blockchain. This requires keeping a history of changed data, at least for long enough to ensure the integrity of new data. |
| L5 | Old data can be removed by a garbage collector type of service, which would remove blocks that are no longer used or that violate some usage criteria. This garbage collector can eventually run on every P2P node, but the criteria will depend on the semantics of the data being removed (e.g. personal data) and on the necessity to preserve it for integrity purposes. |
| L6 | Most of the functions of EUNOMIA are not inherently transactional⁶, so it is not predicted that this will be an issue. |
| L7 | This is an open question that touches the ongoing discussions on the governance model. |
| L8 | Storage capacity issues can be mitigated by means of a garbage collector that runs if the total storage capacity becomes an issue. |
| L9 | If the cluster secret key is compromised, it is possible to trigger a secret key change in all the nodes. How this can be implemented will depend on the governance model. |

6 At the current stage, EUNOMIA services do not strongly depend on the order in which operations are executed in the shared storage.
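The append log mentioned for L2, together with the eventual consistency noted for L6, can be illustrated with a small Python sketch of an operation-based replicated log. This is a deliberately simplified model and not the OrbitDB implementation: the `Entry` fields, the use of a per-writer logical clock and the tie-break on the writer name are assumptions chosen so that every node derives the same state from the same set of entries.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Entry:
    """One immutable operation appended to the log."""

    clock: int             # logical clock of the writer (assumption)
    writer: str            # identifier of the writing node
    doc_id: str            # document being changed
    value: Optional[str]   # new content, or None for a deletion


def merge(*logs: Iterable[Entry]) -> FrozenSet[Entry]:
    """Merge replicas by set union: commutative, associative and idempotent."""
    merged = set()
    for log in logs:
        merged.update(log)
    return frozenset(merged)


def materialize(log: Iterable[Entry]) -> Dict[str, str]:
    """Replay the entries in a deterministic order to obtain the documents."""
    state: Dict[str, str] = {}
    for entry in sorted(log, key=lambda e: (e.clock, e.writer)):
        if entry.value is None:
            state.pop(entry.doc_id, None)
        else:
            state[entry.doc_id] = entry.value
    return state


if __name__ == "__main__":
    shared = Entry(1, "node1", "post-1", "first version")
    replica_a = {shared, Entry(2, "node1", "post-1", "edit from node1")}
    replica_b = {shared, Entry(2, "node2", "post-1", "edit from node2")}
    # Both merge orders give the same documents on every node.
    print(materialize(merge(replica_a, replica_b)))
    print(materialize(merge(replica_b, replica_a)))
```

Existing entries are never modified, which matches the constraint of L2, and concurrent edits converge to a single value without coordination. This is sufficient as long as services do not depend on a strict global order of operations.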


7. CONCLUSIONS

In this document, the P2P infrastructure design and current implementation are presented, beginning with an analysis of the state of the art of existing P2P technologies and solutions. These technologies were described and evaluated taking into account a set of attributes relevant to their application within the EUNOMIA project and its goals. The project requirements, analysed and described in the report on user needs and requirements (D2.4), were consolidated in this report where relevant to P2P storage, and cross-matched with the features already implemented by existing technologies. To improve the selection, some non-functional requirements were added to the set in order to cover EUNOMIA’s future support and development. From this set, the technology that maximized the match was selected: IPFS.

In terms of design and implementation, the P2P node anatomy was defined in terms of its layered internal architecture. Its upper layer interface was described in terms of operations and arguments, which are subject to changes that will certainly occur during task development. IPFS is a mature P2P technology with a vibrant and active community and strong online support. It is sponsored by activists and freedom seekers, with a well-established structure within the project. But, as with any technology, it has its flaws and limitations. This document discussed some of these flaws, notably some security-related ones, which will be addressed during the project development. Some of these issues stem from the distributed EUNOMIA architecture and from the P2P technology used. Since this is a work in progress, some changes and new mechanisms will be introduced along the way, in order to mitigate the issues found and to extend the current version of the EUNOMIA implementation so that it reaches the project goals.

New services and components will be implemented in the EUNOMIA node, in order to implement the needed functionality and to deal with the security issues identified in D3.3, some of which can impact the already defined P2P storage components. The P2P infrastructure will be extended to support the EUNOMIA governance model. Furthermore, some of the solutions identified in section 6 will be implemented in conjunction with measures in the Blockchain and the Privacy and Security framework.


8. REFERENCES

[1] B. Pourebrahimi, K. Bertels, and S. Vassiliadis, ‘A Survey of Peer-to-Peer Networks’, p. 8, 2005.
[2] J. Risson and T. Moors, ‘Survey of research towards robust peer-to-peer networks: Search methods’, Comput. Netw., vol. 50, no. 17, pp. 3485–3521, Dec. 2006.
[3] A. Malatras, ‘State-of-the-art survey on P2P overlay networks in pervasive computing environments’, J. Netw. Comput. Appl., vol. 55, pp. 1–23, Sep. 2015.
[4] H. Barjini, M. Othman, H. Ibrahim, and N. I. Udzir, ‘Shortcoming, problems and analytical comparison for flooding-based search techniques in unstructured P2P networks’, Peer-to-Peer Netw. Appl., vol. 5, no. 1, pp. 1–13, Mar. 2012.
[5] T. E. Anderson, M. D. Dahlin, J. M. Neefe, D. A. Patterson, D. S. Roselli, and R. Y. Wang, ‘Serverless Network File Systems’, p. 21, 1995.
[6] R. Y. Wang and T. E. Anderson, ‘xFS: a wide area mass storage file system’, in Proceedings of IEEE 4th Workshop on Workstation Operating Systems. WWOS-III, Napa, CA, USA, 1993, pp. 71–78.
[7] R. Anderson, ‘The eternity service’, in Proceedings of PRAGOCRYPT, 1996, vol. 96, pp. 242–252.
[8] I. Clarke and D. Mellish, ‘A Distributed Decentralised Information Storage and Retrieval System’, University of Edinburgh, Edinburgh, 1999.
[9] I. Clarke, O. Sandberg, B. Wiley, and T. W. Hong, ‘Freenet: A Distributed Anonymous Information Storage and Retrieval System’, in Designing Privacy Enhancing Technologies, vol. 2009, H. Federrath, Ed. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001, pp. 46–66.
[10] I. Clarke, O. Sandberg, M. Toseland, and V. Verendel, ‘Private communication through a network of trusted connections: The dark freenet’, Network, 2010.
[11] L. A. Adamic, ‘The Small World Web’, in Research and Advanced Technology for Digital Libraries, vol. 1696, S. Abiteboul and A.-M. Vercoustre, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1999, pp. 443–452.
[12] J. Kleinberg, ‘The Small-World Phenomenon: An Algorithmic Perspective’, in Proceedings of the 32nd ACM Symp. Theory Comput. (STOC), p. 12, 2000.
[13] M. Ripeanu, ‘Peer-to-peer architecture case study: Gnutella network’, in Proceedings First International Conference on Peer-to-Peer Computing, Linkoping, Sweden, 2002, pp. 99–100.
[14] ‘gnutella_protocol_0.4.pdf’. [Online]. Available: https://courses.cs.washington.edu/courses/cse522/05au/gnutella_protocol_0.4.pdf. [Accessed: 15-Dec-2019].
[15] J. Liang, R. Kumar, and K. W. Ross, ‘The FastTrack overlay: A measurement study’, Comput. Netw., vol. 50, no. 6, pp. 842–858, Apr. 2006.
[16] A. Iamnitchi, ‘UMM: A Dynamically Adaptive, Unstructured, Multicast Overlay’, Serv. Manag. Self-Organ. IP-Based Netw., Dec. 2019.
[17] B. Cohen, ‘Incentives build robustness in BitTorrent’, Workshop Econ. PeertoPeer Syst., vol. 6, Jun. 2003.
[18] R. Devine, ‘Design and implementation of DDH: A distributed dynamic hashing algorithm’, p. 14, 1993.
[19] I. Stoica et al., ‘Chord: a scalable peer-to-peer lookup protocol for internet applications’, IEEEACM Trans. Netw. TON, vol. 11, no. 1, pp. 17–32, 2003.
[20] D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin, ‘Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web’, in Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing, New York, NY, USA, 1997, pp. 654–663.
[21] A. Rowstron and P. Druschel, ‘Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems’, in Middleware 2001, 2001, pp. 329–350.
[22] C. G. Plaxton, R. Rajaraman, and A. W. Richa, ‘Accessing Nearby Copies of Replicated Objects in a Distributed Environment’, p. 40, 1999.
[23] M. J. Freedman and D. Mazières, ‘Sloppy Hashing and Self-Organizing Clusters’, in Peer-to-Peer Systems II, vol. 2735, M. F. Kaashoek and I. Stoica, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 45–55.
[24] S. Ratnasamy, P. Francis, M. Handley, S. Shenker, and R. Karp, ‘A Scalable Content-Addressable Network’, SIGCOMM, pp. 161–72, 2001.
[25] P. Maymounkov and D. Mazières, ‘Kademlia: A Peer-to-Peer Information System Based on the XOR Metric’, in Peer-to-Peer Systems, 2002, pp. 53–65.
[26] N. J. A. Harvey, J. Dunagan, M. Theimer, M. B. Jones, A. Wolman, and S. Saroiu, ‘SkipNet: A Scalable Overlay Network with Practical Locality Properties’, p. 38, 2002.
[27] K. Bennett, C. Grothoff, T. Horozov, I. Patrascu, and T. Stef, ‘Gnunet-a truly anonymous networking infrastructure’, in Proc. Privacy Enhancing Technologies Workshop (PET), 2002.
[28] ‘Resilio: Fastest and Most Reliable Way to Move Data - P2P File Transfer and Synchronization’, 13-Mar-2018. [Online]. Available: https://www.resilio.com/. [Accessed: 13-Mar-2018].
[29] ‘Storj - Decentralized Cloud Storage’, Storj - Decentralized Cloud Storage, 13-Mar-2018. [Online]. Available: https://storj.io. [Accessed: 13-Mar-2018].
[30] bittorrent-tracker: Simple, robust, BitTorrent tracker (client & server) implementation. WebTorrent, 2018.
[31] ‘qBittorrent Official Website’, 14-Mar-2018. [Online]. Available: https://www.qbittorrent.org/. [Accessed: 14-Mar-2018].
[32] ‘Transmission’, 14-Mar-2018. [Online]. Available: https://transmissionbt.com/. [Accessed: 14-Mar-2018].
[33] R. Klamma, D. Renzel, P. D. Lange, and H. Janßen, ‘las2peer – A Primer’, 2016.
[34] ‘barrel - Distributed Database for the modern world’, 14-Mar-2018. [Online]. Available: https://barrel-db.org/. [Accessed: 14-Mar-2018].
[35] ‘Scuttlebot peer-to-peer log store’, 14-Mar-2018. [Online]. Available: https://scuttlebot.io/. [Accessed: 14-Mar-2018].
[36] ‘TomP2P, a P2P-based key-value pair storage library’, 14-Mar-2018. [Online]. Available: https://tomp2p.net/. [Accessed: 14-Mar-2018].
[37] A. F. Campos, p2pfs: Simple P2P file system using FUSE, built on top of Kademlia (tomp2p implementation). 2014.
[38] ‘XtreemFS - Fault-Tolerant Distributed File System’, 13-Mar-2018. [Online]. Available: http://www.xtreemfs.org/. [Accessed: 13-Mar-2018].
[39] ‘Tahoe-LAFS’, 14-Mar-2018. [Online]. Available: https://tahoe-lafs.org/trac/tahoe-lafs. [Accessed: 14-Mar-2018].
[40] A. Muthitacharoen, R. Morris, T. M. Gil, and B. Chen, ‘Ivy: a read/write peer-to-peer file system’, 2002, p. 31.
[41] J. Benet, ‘IPFS-content addressed, versioned, P2P file system’, ArXiv Prepr. ArXiv14073561, 2014.
[42] I. Baumgart and S. Mies, ‘S/Kademlia: A practicable approach towards secure key-based routing’, in 2007 International Conference on Parallel and Distributed Systems, Hsinchu, Taiwan, 2007, pp. 1–8.
[43] D. Mazières and M. F. Kaashoek, ‘Escaping the evils of centralized control with self-certifying pathnames’, in Proceedings of the 8th ACM SIGOPS European workshop on Support for composing distributed applications - EW 8, Sintra, Portugal, 1998, pp. 118–125.
[44] J. Li, J. Wu, and L. Chen, ‘Block-secure: Blockchain based scheme for secure P2P cloud storage’, Inf. Sci., vol. 465, pp. 219–231, Oct. 2018.
[45] X. Qi, M. Qiang, and L. Liu, ‘A balanced strategy to improve data invulnerability in structured P2P system’, Peer-to-Peer Netw. Appl., Aug. 2019.
[46] orbitdb/orbit-db. OrbitDB, 2019.
[47] phillmac, phillmac/orbit-db-http-api-dev. 2019.
[48] ‘Gossip | SpringerLink’. [Online]. Available: https://link.springer.com/chapter/10.1007%2F978-3-642-17348-6_7. [Accessed: 13-Dec-2019].

www.eunomia.social
