On the Design of Autonomic, Decentralized VPNs

David Isaac Wolinsky, Kyungyong Lee, P. Oscar Boykin, Renato Figueiredo University of Florida

Abstract—Decentralized and P2P (peer-to-peer) VPNs (virtual whereas existing decentralized solutions require manual con- private networks) have recently become quite popular for con- figuration of links between peers, which is beyond the scope necting users in small to medium collaborative environments, of Archer’s target users. Current P2P VPN approaches either such as academia, businesses, and homes. In the realm of VPNs, there exist centralized, decentralized, and P2P solutions. Central- lack scalability or proper security components to be useful for ized systems require a single entity to provide and manage VPN VPN approaches. server(s); decentralized approaches allow more than one entity to We began our original foray into user-friendly VPN ap- share the management responsibility for the VPN infrastructure, proaches with IPOP [2]. Previous work on IPOP focused on while existing P2P approaches rely on a centralized infrastructure the routing mechanisms and address allocation with multiple but allow users to bypass it to form direct low-latency, high- throughput links between peers. In this paper, we describe a virtual networks (VNs) sharing a single P2P overlay. Though novel VPN architecture that can claim to be both decentralized a shared has significant drawbacks as misconfigured or ma- and P2P, using methods that lower the entry barrier for VPN licious peers could disable the entire overlay, rendering all deployment compared to other VPN approaches. Our solution VPNs useless, and the system would have to be recreated as extends existing work on IP-over-P2P (IPOP) overlay networks to there exists no methods to remove the peer from the overlay. address challenges of configuration, management, bootstrapping, and security. We present the first implementation and analysis Without a shared overlay, each VPN would need to deploy of a P2P system secured by DTLS (datagram transport layer a P2P infrastructure and security system prior to the VPN security) along with decentralized techniques for revoking user system, at which point, users may reconsider the P2P approach access. and prefer more traditional centralized approaches. To make an easy-to-use, fully decentralized P2P VPN, we I.INTRODUCTION extend the IPOP concept to support bootstrapping from public A Virtual (VPN) provides the illusion of infrastructures and overlays into private P2P overlays whose a local area network (LAN) spanning a wide area network membership is limited to an individual VPN user base. Our (WAN) by creating secure1 communication links amongst work is based upon Castro et al. [3], suggesting that a single participants. Common uses of VPNs include secure access to overlay can be used to bootstrap service overlays. We present enterprise network resources from remote/insecure locations, a practical implementation and evaluation of this concept. connecting distributed resources from multiple sites, and estab- We then consider security in the overlay and present the lishing virtual LANs for multiplayer video games and media first implementation and evaluation of an overlay with secure sharing over the . communiation both between end points in the P2P overlay The architecture described in this paper addresses usage sce- (e.g. VPN nodes) as well as between nodes connected by narios where VPNs are desired but complexity in deployment overlay edges. Security requires a means for peer revocation; and management limits their applicability. These include col- however, current revocation techniques rely on centralized laborative academic environments linking individuals spanning systems such as certificate revocation lists (CRLs). Our pro- multiple institutions, where coordinated configuration of net- posed approach allows revocation using scalable techniques work infrastructure across different sites is often impractical. provided by the P2P overlay itself. We call the completed Similarly, small/medium business (SMB) environments often system and the interface used to administrate it GroupVPN, desire the ability to securely connect desktops and servers a novel decentralized P2P VPN. across distributed sites without incurring the complexity or The rest of this paper is organized as follows. IPOP along management costs of traditional VPNs. Such a VPN could P2P overlays are introduced in Section II. Throughout the be used to enable extended families to share media among paper, there are two techniques used to evaluate our ap- themselves, such as family videos and pictures, where existing proaches, simulation and real system deployments, these are VPNs may be too complicated and where hosting by central- described in Section III. Section IV describes our techniques ized service may be undesirable for privacy reasons. that allow users to create their own private overlays from The model of a VPN for collaboration considered in this pa- a shared public overlay in spit of NATs. Use of security per is motivated from our Archer [1] project. Archer provides protocols has been assumed in many P2P works though a dynamic and decentralized Grid environment for computer without consideration of implementation and overheads, we architecture researchers to share and access voluntary compute investigate implementation issues and overheads of security cycles with each other. Use of centralized systems would in P2P with emphasis on P2P VPNs in Section V. Without limit the scope of Archer and require dedicated administration, revocation, use of security is limited, in decentralized systems, use of centralized revocation methods do not work, we present 1 For the remainder of this paper, unless explicitly stated otherwise, security novel mechanisms for decentralized revocation in Section VI. implies encryption and mutual authentication between peers. The complete system, GroupVPN, is presented in Section VII. punching [7]. When performing hole punching, peers first Section VIII compares and contrasts our work with related obtain mappings of their private IP address and port to their work. Section IX concludes the paper. public IP address and port and then exchange them over a shared medium, in this case the P2P overlay. The peers attempt II. BACKGROUND to simultaneously form connections with each other, tricking This section describes the core organization of IPOP, a struc- NATs and firewalls into allowing inbound connections, be- tured P2P virtual network, including background on structured cause the NAT believed an outbound flow already exists. In overlays, address allocation and discovery, and connectivity. case peers cannot establish direct connectivity, messages can A. P2P Overlays be relayed through the P2P overlay to each other, though with added latency and reduced throughput. The type of P2P overlay chosen for a VPN has an effect This approach enables peers behind NATs and firewalls to on how easy the VPN is to program, deploy, and secure, seamlessly connect to each other, without requiring peers to on its efficiency, and on its scalability. The two primary host their own bootstrap servers. If a peer were to host their infrastructures for P2P overlays are unstructured and structured own bootstrap servers, they first need a public IP address and systems. Unstructured systems use mechanisms such as global bind the application to a port on that system. At which point, knowledge, broadcasts, or stochastic techniques [4] to search they could share the IP, port pair with other peers in the VPN. the overlay. As the system grows, maintaining and searching However, if they were to go offline or their IP address changes this state may not scale. Alternatively, structured approaches the P2P infrastructure, new users would be unable to join.

provide guaranteed search times typically with a lower bound

of , where N is the size of the network. In terms of C. Network Configuration complexity, for small systems, unstructured systems may be In the context of VPNs, structured overlays can handle easier to implement, but as the system grows it may become organization of the network space, address allocation and inefficient. discovery, decentrally through the use of a DHT. Approaches IPOP uses a structured P2P framework named Brunet [5], along these lines have been proposed in [8], [9]. Membership which is based upon Symphony [6], a one-dimensional ring in the VPN includes a matching membership in the structured with a harmonic distribution of shortcuts to far nodes. Struc- overlay, thus all VPN peers have a P2P address. To address tured systems are able to provide bounds on routing and the challenges of having multiple VPNs in the same over- lookup operations by self-organizing into well-defined topolo- lay, each IPOP group has its own namespace, reducing the gies, such as a one-dimensional ring or a hypercube. Links in likelihood of overlap. To enable scalable and decentralized the overlay can be made to guarantee efficient lookup and/or address allocation and discovery, peers store mappings of

routing times (e.g. Brunet automatically creates links between IP address to P2P address into the DHT, typically of the

peers that communicate often to achieve efficiency in IP-over- form . Thus a peer P2P communication). attempting to allocate an address will insert this (key, value) A key component of most structured overlays is support pair into the overlay. The first peer to do this will be the owner for decentralized storage/retrieval of information such as a of the IP address allocation. Therefore the DHT must support distributed hash table (DHT). The DHT builds upon the atomic writes. existence of a P2P address space. All peers in a structured Mechanisms to self-configure the IP address and network system have a unique, uniformly distributed P2P address. A parameters of the local system can be provided by DHCP (de- DHT maps look up values or keys (usually by a hashing centralized host configuration protocol), manually configuring function) into the P2P address space. While there are various the IP address, or the VPN hooking into O/S APIs. Address forms of fault tolerance, a minimalist DHT stores values at the discovery is initiated when an outgoing packet for a remote node whose address is closest to the value’s key. DHTs can peer arrives at the VPN software. At which point, the VPN will be used to coordinate organization and discovery of resources, query the DHT with the IP address to obtain the owner’s P2P making them attractive for self-configuration and organization address and forward the packet to the destination. Discussion in decentralized collaborative environments. As explained in on both these topics is covered in more depth in our previous the next section, IPOP, uses a DHT to coordinate decentralized work [10]. organization. III. EXPERIMENTAL ENVIRONMENT B. Connecting to the VPN Throughout this paper, our quantitative evaluation environ- To connect to IPOP, a peer needs only to connect to an ment uses both real deployments on PlanetLab and simulation. existing Brunet infrastructure. Many IPOP systems can coexist The various evaluations dictate which mechanism we use. sharing a single overlay. The motivation for doing so is that When the perspective of a single node is useful, PlanetLab bootstrapping a P2P system can be challenging, requiring users provides good results, though when attempting to measure to have access to public IP addressable nodes or being able to detailed behavior of the entire system using PlanetLab can configure a router or firewall to enable inbound connections. result in significant noise. A peer connected to IPOP’s P2P infrastructure can take IPOP uses Brunet as the underlying P2P infrastructure for advantage of its support for NAT traversal through hole connectivity. Brunet has been in active development for the DHT Entry for Private Overlay

Overlay Link

Public / Private Overlay Mapping Overlay Communication Public

Private 2) Rendezvous – peer queries DHT for Private overlay peers.

1) Reflection – new peer joins the public overlay

4) Peer is connected to public 3) Relay – Send connection overlay messages through the public overlay Fig. 1. Bootstrapping a private overlay using Brunet past 5 years. Brunet is routinely run on PlanetLab [11] for Bootstrapping a P2P system requires expertise in network experiments and tests. PlanetLab consists of of nearly 1,000 administration. To enable users to bootstrap their own private resources distributed across Earth. In practical applications, overlays, we previously investigated means by which a public though, roughly 40% of the resources are unavailable at any overlay could be used to bootstrap a private overlay. given time and the remaining behave somewhat unpredictably. Our approach for bootstrapping private systems requires PlanetLab deployment takes approximately 15 minutes for an overlay to support methods for peers to discover each all resources to have Brunet installed and connect to the other, relay messages, and obtain their public address mapping overlay and then much more time to observe certain behav- as described in [12]. Examples of other potential bootstrap iors, making regression and verification tests complicated. To overlays include popular and well established P2P systems, address this, we have extended Brunet to support a simu- such as , , and Kademlia. Our initial work lation mode. The simulator inherits all of the Brunet P2P supports bootstrapping from XMPP (Jabber) systems and our overlay logic but uses simulated virtual time based upon own P2P overlay, Brunet. In this paper, we focus on user an event-driven scheduler instead of real time. Furthermore, perceptions of these technologies with emphasis on bootstrap the simulation framework uses a specialized transport layer times and performance overheads, unlike our previous work, to avoid the overhead of using TCP or UDP on the host which verified its utility for bootstrapping private overlays. system, both of which are limited resources and can hamper To bootstrap from an existing Brunet overlay, peers first in-

the ability to simulate large systems. The specialized transport sert their public overlay node address into the key represented

uses datagrams to pass messages between nodes, thus from by and continue to do the node’s perspective, it is very similar to a UDP transport so regularly until they disconnect, so as to not let the entry and can simulate both latency and packet dropping. Latency become stale and disappear. Peers attempting to bootstrap into between all node pairs is set to 100 ms. the private overlay can then query this key and obtain a list Both simulation and real system evaluation provide unique of public overlay nodes that are currently acting as proxies advantages. Simulations allow faster than real time execution into the private overlay. By using the public overlay as a of reasonable sized networks (up to a few thousand) using a transport, similar to UDP or TCP, the private overlay node single resource, while enabling easy debugging. In contrast, forms bootstrapping connections via the public overlay. At deployment on real systems, in particular PlanetLab, presents which point, overlay bootstrapping proceeds as normal. The opportunities to add non-deterministic, dynamic behavior into entire process is represented in Figure 1. the system which can be difficult to replicate, such as network In a small private overlay, there is a possibility that none glitches and long CPU delays on processing. of the members have a public address, making it difficult to provide overlay based NAT traversal. Rather than having a IV. TOWARDS PRIVATE OVERLAYS special case for NAT traversal for private overlays, our model has the private overlay share TCP and UDP sockets with the Many users of IPOP begin by using our shared overlay and, public overlay. This mechanism, referred to as “pathing”, once comfortable, move towards hosting their own infrastruc- allows multiplexing a single UDP socket and listening TCP ture. Some are successful without assistance from us, while socket by many overlays. This is only possible due to the a majority are not. The most common issue preventing users generic transports library of the Brunet P2P overlay, which from hosting their own independent IPOP systems is the result does not differentiate UDP, TCP, or even relayed links. Pathing of network configuration issues. In short, users were able to works as a proxy, intercepting a link creation request from a easily join the shared overlay, but similar attempts to construct local entity, mapping that to a path, and then requesting from their own were hindered. the remote entity a link for that path. The underlying link is then wrapped by pathing and given to the correct overlay For instance, assuming a network size of 512 public and 8 node, resulting in a completely transparent multiplexing of a private, a node should be connected within 87 hops. TCP and UDP sockets, thereby enabling the NAT traversal To evaluate our implementation for GroupVPN, we use both in one overlay to benefit the other. Once a link has been PlanetLab and the simulator. 100 tests were run for various established, the pathing information is irrelevant, limiting the network sizes. Though due to difficulty in controlling network overhead into the system to a single message exchange during sizes in PlanetLab, we set each PlanetLab node to randomly link establishment. decide if it would connect to the private overlay. The network sizes were then used in the simulator and the analytical model. The average public network size for each of these tests was A. Time to Bootstrap a Private Overlay 600. Our results are presented in Figure 2 2. 1.0 This experiment focuses on the overheads in bootstrapping a private overlay using our techniques mentioned in the previous 0.8 section. The time to bootstrap can be derived analytically by considering the minimum steps for a node to join the public 0.6 overlay, obtain private overlay peers from the public overlay CDF DHT, and then connect to the private overlay. In Brunet, peers 0.4 planetlab.68 begin by forming leaf or bootstrapping connections and use planetlab.147 these to communicate with the neighbor or peer in the P2P 0.2 simulator.68 network nearest to their P2P address. The process to form a simulator.147 connection can be done in as few as 4 messages and up to 6, 0.0 10-1 100 101 102 103 if the peers only know each other’s P2P address, which is the Time in seconds case for neighbor connections. Fig. 2. CDF of the time to bootstrap a private overlay node in a private Assuming a peer already has IP address information for overlay of the size stated in the legend using a public overlay consisting of another, the peer initiates connection establishment by sending 600 nodes. Using a 100 ms delay like the simulator results in 9.2 and 9.3 to the remote peer a message expressing the desire for a seconds for the analytical model for private network sizes of 68 and 147, connection. The remote node responds by either rejecting the respectively. request or committing to the connection. In the next exchange, Based upon the results presented in Figure 2, the boot- the initiating peer commits to forming the connection and the strapping time for the implementation performs better than remote peer acknowledges. The two phase commit process the analytical model, due to the simplicity of the analytical is used to handle the complexity that ensues when multiple model and the small network sizes. It is of interest that while simultaneous connection attempts occur in parallel. All these the simulator results tend to be in a well defined range, the PlanetLab results have a few outliers with long bootstrap messages take hop, since they are direct links between peers. When peers only have each other’s P2P address and/or the times. Some of the expected causes for this are churn in the initiating peer is behind a NAT, it may take fifth and sometimes system and state machine timeouts in Brunet, though we have a sixth message. These messages are requests for the remote not considered this in this in much depth in this work. peers IP addresses as well as asking the peer to connect with B. Overhead of Pathing the initiating peer, addressing the case where the remote peer

is behind a NAT and cannot handle inbound messages. These Much like the previous experiment, this verifies that the

messages are routed over the overlay taking hops, pathing technique has negligible overheads for VPN usage.

where is the network size of the public overlay. To determine the overheads, two GroupVPNs are deployed Private overlay bootstrapping follows a similar process, on resources on the same gigabit LAN. To measure latency

though, first, the peer acquires P2P addresses of other partici- and throughput, netperf experiments are run for 30 seconds,

£ pants through the public DHT, an operation taking 5 times each on an unutilized network switch. Other speci- hops. In the private overlay, the leaf connections do not fications of the machine are ignored as the system without

communicate directly; rather, they use the public overlay, pathing is used as the baseline. The results, Table I, indicate

causing some of the hop operations above to take that the use of pathing presents negligible overhead for both

hops. Finally, the finding the nearest remote peer in the private throughput and latency, justifying the use of this approach to

overlay takes , where is the network size transparently deal with NAT and firewall traversal. of the private overlay. Latency (ms) Throughput (Mbit/s) Given this model, each operation takes the following hop Standard .303 225.27

Pathing .308 224.36

counts: public overlay bootstrapping , DHT

TABLE I

£

operations , and private overlay bootstrapping PATHING OVERHEADS

£

. The cumulative operation takes

£ hops. The dominating overhead 2We performed measurements for many more private network sizes, but all

in bootstrapping the private overlay is the time it takes to the results were so similar that it did not introduce anything of interest and

perform overlay operations on the public overlay ( ). are omitted from our plots to improve clarity. V. SECURITYFORTHE OVERLAY AND THE VPN manager handles session establishment, garbage collection of Structured overlays are difficult to secure and a private expired sessions, and revocation of peers. overlay is not secure if it provides no means to limit access to the system. Malicious users can pollute the DHT, send bogus messages, and even prevent the overlay from functioning, rendering the VPN useless. To address this in means that make sense for VPNs and common users, we have employed a public key infrastructure (PKI) to encrypt and authenticate both communication between peers as well as communication across the overlay, called point-to-point (PtP) and end-to-end (EtE) communication, respectively. Use of a PKI motivates from the ability to authenticate with- out a third party, ideal for P2P use, unlike a key distribution centers (KDC) used by other VPNs. A PKI can use either pre- exchange public keys or a certificate authority (CA) to sign public keys, i.e., certificates. Thus peers can exchange keys Fig. 3. An example of the abstraction of senders and receivers using a EtE and certificates without requiring a third-party to be online. secured chat application. Each receiver and sender use the same abstracted The reasons for securing PtP and EtE are different. Secur- model and thus the chat application requires only high-level changes, such as verifying the certificate used is Alice’s and Bob’s, to support security. ing PtP communication prevents unauthorized access to the overlay, as peers must authenticate with each other for every Certificate embed identity of the owner, thus a signed cer- link created. Though once authenticated, a peer can perform tificate states that the signer trusts that the identity is accurate. malicious acts and since the overlay allows for routing over In network systems, the certificate uses the domain name to it, the peer can disguise the origination of the malicious acts. uniquely identify and limit the use of a certificate. When a CA By also employing EtE security, the authenticity of messages signs the certificate, by including the domain name, it ensures transferred through an overlay can be verified. Though EtE that users can trust that a certificate is valid, while used to security by itself, will not prevent unauthorized access into secure traffic to that domain. Communication with another the overlay. By employing both PtP and EtE, overlays can domain using the same certificate will raise a flag and will be secured from uninvited guests from the outside and can result in the user not trusting the certificate. In environments identify malicious users on the inside. Implementing both with NATs, dynamic IP addresses, or portable devices, typical leads to important questions: what mechanisms can be used to of P2P systems, assigning a certificate to a domain name implement both and what are the effects of both on an overlay will be a hassle as it constrains mobility and the type of and to a VPN on an overlay. users in the system. Furthermore, most users are unaware of their IP address and changes to it. Instead, a certificate is A. Implementing Overlay Security signed against the user’s P2P address and unique user name There are various types of PtP links; for example, there as delegated by the CA. The purpose of the former is for are TCP and UDP sockets, and relays across nodes and efficiency of revocation as discussed in Section VI. During overlays. EtE communication is datagram-oriented in IPOP. the formation of PtP links or while parsing EtE messages, Traditional approaches of securing communication such as the two nodes discover each other’s P2P addresses. If the IPsec are not convenient due to complexity, i.e., operating addresses do not match the address on the verified certificate, system specific, portability constraints, and lack of common the communication need not proceed further. APIs. Security protocols that rely on reliable connections, Prior to trusting the security filter, the core software or the such as SSL or TLS are undesirable as well as they would security filter must ensure that the P2P address of the remote require a userpace implementation of reliable streams (akin to entity matches that of the certificate. In our system, we did TCP). As such, we have implemented an abstraction akin to a this by means of a callback, which presents the underlying security filter as presented in Figure 3, which enables nearly sending mechanism, EtE or PtP, and the overlay address stored transparent use of security libraries and protocols. To this in the certificate. The receiver of the callback can attempt to date, we have implemented both a DTLS [13] filter using the cast it into known objects. If successful, it will compare the OpenSSL implementation of DTLS as well as a protocol that overlay address with the sender type. If unsuccessful, it ignores reuses cryptographic libraries provided by .NET that behaves the request. If any callbacks return that the sender does not similarly to IPsec. match the identifier, the session is immediately closed. Thus A security filter has two components: the manager, and the security filter need not understand the sending mechanism individual sessions or filters. While the individual sessions and the sending mechanism need not understand the security could act as filters by themselves, by combining with a filter. manager, they can be configured for a common purpose and The last consideration comes in the case of EtE communi- security credentials. This approach enables the use of security cation that provides an abstraction layer. For example, in the to be transparent to the other components of the system as the case of VPNs, where a P2P packet contains an IP packet and thus a P2P address maps to a VPN IP address, a malicious In the simulation, a new overlay is created and afterward a peer may establish a trusted link, but then hijack another users new node joins, this is repeated 100 times. The cumulative IP session. As such, the application must verify that the IP distribution functions obtained from the different experiments address in the IP packet matches the P2P address of the sender are presented in Figure 5. of the P2P packet. In general, an application address should 1.0 be matched against a P2P address, consider chat programs, for example. 0.8

B. Overheads of Overlay Security 0.6 When applying an additional layer to a P2P system, there CDF 0.4 are overheads in terms of time to connect with the overlay. plab.nosec Other less obvious effects will be throughput, latency, and plab.dtls processing overheads, assuming that the P2P system will be 0.2 sim.nosec used over a wide area network, where the latency and through- sim.dtls 0.0 put limitations between two points will make the overhead 10-1 100 101 102 of security negligible. Though bootstrapping will be affected Time in seconds due to additional round trip messages used for forming secure Fig. 5. Time in seconds for a single node to join a secure (dtls) and insecure connections. (nosec) structured overlay, using both PlanetLab (plab) and the Simulator (sim). 2) Bootstrapping an Overlay: The purpose of this exper- iment is to determine how quickly an overlay using DTLS can bootstrap in comparison to one that does not given that there are no existing participants. Nodes in this evaluation are randomly given information about 5 different nodes in the overlay and then all attempt to connect with each other at the same time. The evaluation completes after the entire overlay has all nodes connected and in their proper position. For each network size, the test is performed 100 times and the average Fig. 4. DTLS handshake result is presented in Figure 6. The DTLS handshake as presented in Figure 4, which 100 consists of 6 messages or 3 round trips. PtP security may very well have an effect on the duration of overlay bootstrapping. 80 There even exists a possibility that with more messages during bootstrap, the probability one drops is higher, which could, in 60 turn, also have an effect, though possibly negligible, on time to connect. To evaluate these concerns, we have employed both 40 simulation and real system experiments. Time in seconds The following experiments use both simulation and Plan- 20 nosec etLab deployment to evaluate time to connect a new node to dtls 0 an existing resource. Then another experiment is performed to 0 50 100 150 200 250 300 evaluate how long it takes to bootstrap various sized overlays Network size if all nodes join at the same time. This experiment is only Fig. 6. Time in seconds for a secure (dtls) and insecure (nosec) structured feasible via simulation as attempting to reproduce in a real overlay to bootstrap, given that all nodes bootstrap simulataneously. system is extremely difficult due to how quickly the operations complete. C. Discussion 1) Adding a Single Node: This experiment determines how Both evaluations show that the overhead in using security long it takes a single node to join an existing overlay with and is practically negligible, when an overlay is small. In the case without DTLS security. The experiment is performed using of adding a single node, it is clear that the simulation and both simulation and PlanetLab. After deploying a set of nodes deployment results agree, as the difference between bootstrap- without security and with security on PlanetLab, the network ping into an overlay with and without security remains nearly is crawled to determine the size of the network. In both cases, the same. Clearly this motivates the use of security if time to the overlay maintained an average size of around 600 nodes. connect is the most pressing question. At which point, we connected a node 1,000, each time using The time to bootstrap a secure overlay was not significantly a new, randomly generated P2P address, thus connecting to more than that of an insecure overlay. What we realized is that a different point in the overlay. The experiment concludes as complex connection handshaking, as implemented in Brunet, soon as the node has connected to the peers in the P2P overlay seems to dominate connection establishment time. For exam- immediately before and after it in the P2P address space. ple, in Brunet, two peers must communicate via the overlay prior to forming a connection, and the system differentiates between bootstrapping connections and overlay connections. Thus even though a peer may have a bootstrapping connection, it will need to go through the entire process to form an overlay connection with a peer. While this may lead to inefficiencies, this simplification keeps the software more maintainable and easier to understand. Fig. 7. Broadcast performing a complete overlay broadcast VI. HANDLING USER REVOCATION To perform a broadcast, each node performs the following Unlike decentralized systems that use shared secrets, in recursive algorithm: which the creator of the overlay becomes powerless to control BROADCAST(start, end, message): malicious users, PKIs enable their creators to effectively RECEIVE(message) remove malicious users. Typical PKIs either use a certificate for in length(connections) do

revocation list (CRL) or online certificate verification protocols

℄ n start ADDRESS(connections )

such as Online Certificate Status Protocol (OCSP). These

if n start 6¾ start, end then approaches are orthogonal to decentralized systems as they continue require a dedicated service provider. If the service provider is end if

offline, an application can only rely on historical information

℄ n end ADDRESS(connections )

to make a decision on whether or not to trust a link. In a

if n end 6¾ start, end then decentralized system, these features can be enhanced so not n end end to rely on a single provider. In this section, we present two end if mechanisms of doing so: storing revocations in the DHT and msg (BROADCAST, n start, n end, message)

performing overlay broadcast based revocations.

℄ SEND(connections , msg) end for A. DHT Revocation with “connections” as a circular list of connections in non- A DHT can be used to provide revocation similar to that decreasing order from the perspective of the node performing of OCSP or CRLs. Revocations, a hash of the certificate and the current recursive, broadcast step. a time stamp signed by the CA, are stored are stored in the In this algorithm, the broadcast initiator uses its own address DHT at the key formed by the hashing of the certificate. In as the start and end, thus the broadcast will span the entire doing so, revocations will be uniformly distributed across the overlay after completing recursive calls at each connected overlay, not relying on any single entity. node. A recursive end, “n end”, must be inside the region The problem with the DHT approach is that it does not between “start” and “end”, thus if the connection following

provide an event notification for members currently commu-

℄ the current sending connection, “connections ”, is not nicating with the peer. While peers could continue to poll in that region, it will only broadcast up to “end” and not the the DHT to determine a revocation, doing so is inefficient. address specified by that connection. Finally, nodes, who have Furthermore, a malicious peer, who has a valid but revoked a connection to the malicious peer, will end the connection certificate could force every member in the overlay to query prior to accidentally forwarding the message to the peer by the DHT, negatively affecting the DHT nodes storing the receiving and acting upon the revocation prior to forwarding revocation. the message. To summarize, the overlay is recursively parti- tioned amongst the nodes at each hop in the broadcast. By B. Broadcast Revocation doing so, all nodes receive the broadcast without receiving Broadcast revocation can be used to address the deficiencies duplicate broadcast messages. of DHT revocation. As a topic of previous research works [14], [15], structured overlays can be used without additional state C. Evaluation of Broadcast to perform efficient broadcasts from any point in the overlay We performed an evaluation on the broadcast using the to the entire overlay. The form of broadcast can be used simulation to determine how quickly peers in the overlay to perform to notify the entire overlay immediately about a would receive the message. The tested network sizes ranged

new revocation. In these papers, analysis and simulations have from 2 to 256 in powers of 2. The tests were evaluations

¾

shown that the approach can be completed in time. were performed 100 times for each network size. The CDF Our modified algorithm as illustrated in Figure 7 utilizes of hops for each node are presented in Figure 8. The results

the organization of a structured system with a circular address make it quite clear that the broadcast can efficiently distribute

space that requires peers be connected to those whose node a revocation much more quickly than time. addresses are the closest to their own, features typical of one-dimensional structured overlays including Chord [16], D. Discussion Pastry [17], and Symphony. Using such an organization, it In contrast to the DHT solution, broadcast revocation occur is possible to do perform a broadcast with no additional state. only once and leave no state behind. Thus the broadcast is not P2P Node 10.0.1.2 is 4) Query overlay for 10.0.5.251 IP P2P VPN at Node W is at Node X DHT Entry 1) Request / obtain Message 6) Form secure, group certificate direct link with peer via overlay

Node X User) Join group, 3) Bootstrap 10.0.5.251 obtain credentials into overlay and P2P information

Node W Welcome 10.0.1.2 to Node Z GroupVPN 10.0.123.248

5) Route IP over link 10.0.123.248 is at Node Z

Fig. 9. Process in bootstrapping a new GroupVPN instance. 1.0 message. That would prevent those from colluding together to prevent each other’s revocations. Furthermore, while not 0.8 emphasized above, revocation in our system revokes by user name and not individual certificates. Combined these two 2 0.6 components limit sybil attacks against broadcast. 4

CDF 16 0.4 VII. MANAGING AND CONFIGURING THE VPN 32 64 While the PKI model applies to P2P overlays, setting up, 0.2 128 deploying, and then maintaining security credentials can easily 256 become too complex to manage, especially for non-experts. 0.0 0 2 4 6 8 10 12 14 16 18 Most PKI-enabled systems require the use of command-line Hop count utilities and lack methods for assisting in the deployment of Fig. 8. Overlay broadcast time CDF. certificates and policing users. In order to facilitate use in real a complete solution, as new peers connected to the overlay systems with non-experts, it is important to have an easy to use or those who missed the broadcast message will be unaware framework. Our solution to this issue is a partially-automated of a revocation. Furthermore, if an overlay is shared by many PKI reliant on a group-based Web interface distributable in VPNs, it may prevent overlay broadcasting or itself may be forms of Joomla add-ons as well as a virtual machine appli- inefficient. ance. In this environment, groups can share a common Web The DHT solution by itself may also not sufficient as site, while each group has their own unique CA. Although this revocations may be lost over time as the entries must have does not preclude other methods of CA interaction, experience their leases renewed in the DHT. To address this condition, has shown that it provides a model that is satisfactory for many each peer maintains a local CRL and the owner of the overlay use cases. can occasionally send updates to the CRL through an out of A group-based Web 2.0 site enables low overhead config- band medium, such as e-mail. A better long term solution may uration of collaborative environments. The roles in a group be the use of a gossip protocols so that peers can share their environment can be divided into administrators and users. lists with each other during bootstrapping phases. Users have the ability to join and create groups; whereas A key assumption in using these is that a Sybil [18], or administrators define network parameters, can accept or deny collusion attack, is difficult in the secured overlay. If a Sybil join requests, remove users, and promote other users to admin- attack is successful, both a DHT and broadcast revocation istrators. By applying this to a VPN, the group environment may be unsuccessful, though peers could fix this problem by provides a simple to use wrapper around PKI, where the obtaining the CRL out of band. In addition, previous work [19] administrators of the group act as the CA and the members has described decentralized techniques to limit the probability have the ability to obtain signed certificates. of such attacks from occurring. In our approach, the use of Elaborating further, when a user joins a group, the admin- central authority to review certificate requests can be used to istrator can enable automatic signing of certificates or require limit a single user from obtaining too many certificates as well prior review; and when peers have overstayed their welcome, as ensuring uniform distribution of that user’s P2P addresses, an administrator can revoke their certificate by removing them further hampering the likelihood of a Sybil attack. The ability from the group. Revocations are handled as described in to automate this is left as future work. Section VI. In the context of GroupVPN systems, a user One way to mitigate sybil attacks using the broadcast ap- revocation list as opposed to a CRL simplifies revocation, since proach is to bundle colluding offenders into a single revocation users and not individual certificates will be revoked. Registered users who create groups become administrators discovery of peers. The key differences between Hamachi and of their own groups. When a user has been accepted into a Campagnol is that Campagnol is free and does not provide group by its administrator, they are able to download VPN a service; users msut deploy their own rendezvous service. configuration data from the Web site. In addition to IP address The authors of Campagnol also state that the current approach range, namespace, and security options, the configuration limits the total number of peers sharing a VPN to 100 so not to data also contains the group website’s address and a shared overload the rendezvous service. The current implementation secret. Once downloaded, configuration data is loaded by does not support a set of rendezvous nodes, though doing so the GroupVPN during its configuration process. The shared would make the approach much more like ours. In addition, secret uniquely identifies the user, so that the Web site can the system relies on traditional distribution of a CRL to handle automatically sign the certificate (or enqueue it form manual revocation. signing, depending on the group’s policy). Certificate requests [23] is a mature, open source VPN, dating back to the consist of a public key and the user’s shared secret and are early part of 2000. Tinc is a decentralized VPN, which requires sent over HTTPS to the Web server. Upon receiving the that users manually organize an overlay and the Tinc uses its signed certificate, peers are able to join the private overlay and own routing protocols to determine optimum paths. In com- GroupVPN, enabling amongst the VPN parison to our approach, Tinc cannot be used to allow peers to peers. The entire bootstrapping process, including address bootstrap private VPNs from an existing VPN, nor does Tinc resolution and communication with a peer, is illustrated in automatically handle churn in the VPN. If a node connecting Figure 9. two separate pieces of the VPN overlay goes offline, the There are many ways of implementing and hosting the Web VPN will be partitioned until a user manually creates a link site. For example, Google offers free hosting of Python web connecting the pieces. Furthermore, Tinc does not form direct applications through Google Apps, an option available if the connections for improved latency and throughput reasons, thus user owns a domain. Alternatively, the user could host the members acting as routes in the overlay incur the price of group site on a public virtual network. In this case, peers acting as packet forwarders. interacting with the GroupVPN would need to connect with The last VPN, we discuss is the most similar like ours, its the public virtual network in order to create an account, get the called [24]. N2N uses unstructured p2p techniques to configuration data, and retrieve a signed certificate, at which form an Ethernet based VPN. While their approach has built- point they could disconnect from it. This does not preclude in NAT traversal, like ours, it requires that users deploy their the use of other social mediums nor a central site dedicated own bootstrap and limits security to a single pre-shared key to the formation of many GroupVPNs. Many GroupVPNs can for the entire VPN, thus users cannot be revoked. Since N2N share a single site, so long as the group members trust the site provides Ethernet, users must provide their own mechanism to host the CA private key. for IP address allocation, while discovery utilizes overlay broadcasting. Thus there are concerns that as systems get VIII. RELATED WORK larger, N2N may not be very efficient. A. VPNs B. P2P Systems Hamachi [20] is a centralized P2P VPN provider. Peers authenticate with the website, use it to discover remote peers, [25], a P2P data sharing service, supports stream and then connect with them. The website itself provides a encryption between peers sharing files. The purpose of Bit- central group management environment, which can be used to Torrent security is to obfuscate packets to prevent traffic control access to the VPN. While the Hamachi protocol claims shaping due to packet sniffing. Thus BitTorrent security uses to support various types of security [21], the implementation a weak stream cipher, RC4, and lacks peer authentication appears to only support the key distribution center (KDC). This as symmetric keys are exchanged through an unauthenticated requires that all peers establish trusted relationship through the Diffie-Hellman process. central website. The Hamachi approach makes it easy for users Skype [26] provides decentralized audio and video commu- to deploy their own services, but places limitations on network nication to over a million concurrent users. While Skype does size, uses a proprietary security stack, and does not allow users not provide documentation detailing the security of its system, to fork off from Hamachi and maintain their own system, if researchers [27], [28] have discovered that Skype supports they so desire. In contrast, our approach presents a completely both EtE and PtP security. Though similar to Hamachi, Skype decoupled environment, which allows peers to start using our uses a KDC and does not let users setup their own systems. shared system to bootstrap private overlays and migrate away As of December 2009, the FreePastry group released an SSL without cost if need be. Our approach also has the benefit enabled FreePastry [17]. Though relatively little is published of not relying on a centralized server to establish VPN links, regarding their security implementation, the use of SSL pre- as the only process that is centralized is the signing of the vents its application for use in the overlay and for overlay certificate. In Hamachi, if the central server goes offline, no links that do not use TCP, such as relays and UDP. Thus new peers can join the VPN. their approach is limited to securing environments that are Campagnol VPN [22] provides similar features to Hamachi: not behind NATs and firewalls that would prevent direct TCP a P2P VPN that relies on a central server for rendezvous or links from forming between peers. IX. CONCLUSIONS for computer architecture research and education,” in CollaborateCom, November 2008. This paper overviews the architecture implementation of [2] A. Ganguly, A. Agrawal, O. P. Boykin, and R. Figueiredo, “IP over GroupVPN, a system that is the first to demonstrate the P2P: Enabling self-configuring virtual IP networks for grid computing,” practical feasibility of using structured overlays as a basis in International Parallel and Distributed Processing Symposium, 2006. [3] M. Castro, P. Druschel, A.-M. Kermarrec, and A. Rowstron, “One ring for easy-to-use, group-oriented, P2P VPNs. Explicitly, we to rule them all: Service discover and binding in structured peer-to-peer have taken common structured overlays and explored orga- overlay networks,” in SIGOPS European Workshop, Sep. 2002. nization, public overlays for connectivity, and private over- [4] M. Castro, M. Costa, and A. Rowstron, “Debunking some myths about structured and unstructured overlays,” in NSDI’05: Proceedings lays for security and then described our GroupVPN which of Symposium on Networked Systems Design & Implementation. binds them the components together to create collaborative [5] P. O. Boykin, J. S. A. Bridgewater, J. S. Kong, K. M. Lozev, B. A. environments for configuration and management of VPNs. Rezaei, and V. P. Roychowdhury, “A symphony conducted by brunet,” http://arxiv.org/abs/0709.4048, 2007. This paper extends upon the IPOP virtual network to support [6] G. S. Manku, M. Bawa, and P. Raghavan, “Symphony: distributed user-friendly approaches for users to create and manage their hashing in a small world,” in USITS, 2003. own virtual private networks. To accomplish this, each IPOP [7] J. Rosenberg, “Interactive connectivity establishment (ICE): A protocol for network address translator (NAT) traversal for offer/answer proto- system bootstraps into its own unique, secure P2P overlay. cols,” http://tools.ietf.org/html/draft-ietf-mmusic-ice-19, October 2008. This approach not only enables secure communications in [8] A. Ganguly, D. Wolinsky, P. Boykin, and R. Figueiredo, “Decentralized IPOP deployments but also enables for more efficient overlay dynamic host configuration in wide-area overlays of virtual worksta- tions,” in International Parallel and Distributed Processing Symposium, multicast and broadcast. March 2007. The use of service overlays significantly improves perfor- [9] I. Stoica, D. Adkins, S. Zhuang, S. Shenker, and S. Surana, “Internet in- mance and maintenance. Peers can easily control member- direction infrastructure,” IEEE/ACM Transactions on Networking, 2004. [10] D. I. Wolinsky, Y. Liu, P. S. Juste, G. Venkatasubramanian, and ship in the overlay and it presents unique opportunities for R. Figueiredo, “On the design of scalable, self-configuring virtual decentralized revocation. A DHT approach allows results to networks,” in IEEE/ACM Supercomputing 2009, November 2009. be stored on the overlay instead of using centralized CRLs [11] B. Chun, D. Culler, T. Roscoe, A. Bavier, L. Peterson, M. Wawrzoniak, and M. Bowman, “Planetlab: an overlay testbed for broad-coverage and broadcast to immediately notify active participants of a services,” SIGCOMM Comput. Commun. Rev., 2003. revocation. Ongoing work include investigating slow bootstrap [12] D. I. Wolinsky, P. St. Juste, P. O. Boykin, and R. Figueiredo, “Addressing times and determining security concerns of the decentralized the P2P bootstrap problem for small overlay networks,” in 10th IEEE International Conference on Peer-to-Peer Computing, 2010. revocation techniques. Furthermore, we plan on investigating [13] E. Rescorla and N. Modadugu. (2006, April) RFC 4347 datagram the use of overlay broadcasting for IP broadcasting and mul- . ticasting, though the current approach places an unfair burden [14] S. El-Ansary, L. Alima, P. Brand, and S. Haridi, “Efficient broadcast in structured p2p networks,” in 2nd International Workshop on Peer-to- on the first few hops of a broadcast. Peer Systems, 2003. Without the functionality of GroupVPN, projects like the [15] V. Vishnevsky, A. Safonov, M. Yakimov, E. Shim, and A. D. Gelman, Grid Appliances and its flagship application, Archer [1], would “Scalable blind search and broadcasting over distributed hash tables,” in Computer Communications, vol. 31, no. 2, 2008. be impractical. Archer consists of over 500 resources from 5 [16] I. Stoica, R. Morris, D. Liben-Nowell, D. R. Karger, M. F. Kaashoek, different universities, including University of Florida, Florida F. Dabek, and H. Balakrishnan, “Chord: a scalable peer-to-peer lookup State University, Northeastern University, University of Min- protocol for internet applications,” vol. 11, no. 1, 2003. [17] A. Rowstron and P. Druschel, “Pastry: Scalable, decentralized object lo- nesota, and University of Texas. In the past year, since Archer cation and routing for large-scale peer-to-peer systems,” in International came online, over 100 unique users have contributed and Conference on Distributed Systems Platforms, November 2001. taken advantage of the voluntary computing cycles. The use of [18] J. R. Douceur, “The sybil attack,” in IPTPS ’01: Revised Papers from the First International Workshop on Peer-to-Peer Systems. Springer-Verlag, existing decentralized VPNs in Archer and the Grid Appliance 2002, pp. 251–260. would be severely limiting. In the case of N2N, at least one [19] M. Castro, P. Druschel, A. Ganesh, A. Rowstron, and D. S. Wallach, peer would have to maintain the bootstrap service and address “Security for structured peer-to-peer overlay networks,” in Symposium on Operating Systems Design and Implementaion (OSDI’02), December allocations for the VPN. Tinc would require users to manually 2002. configure their networking overlays. While Compagnol and [20] LogMeIn. (2009) Hamachi. https://secure.logmein.com/products/ Hamachi require the use of well known centralized peers. hamachi2/. [21] LogMeIn, Inc. (2009) LogMeIn hamachi2 security. The GroupVPN has been used as the virtual network for the [22] F. Bondoux, “Campagnol : distributed vpn over udp/dtls,” http:// Grid Appliance, enabling the creation of decentralized, collab- campagnol.sourceforge.net, 2010. orative environments for computing grids. Recently, grids at La [23] G. Sliepen. (2009, September) tinc. http://www.tinc-vpn.org/. [24] L. Deri and R. Andrews, “N2N: A layer two peer-to-peer vpn,” in AIMS Jolla Institute for Allergy and Immunology and two in Eastern ’08: Proceedings of the 2nd international conference on Autonomous Europe went live using GroupVPN without receiving any Infrastructure, Management and Security, 2008. technical support from us. Researchers at Clemson University [25] (2007, December) Message stream encryption. http://www.azureuswiki. com/index.php/Message Stream Encryption. and Purdue have opted for this approach over centralized [26] S. Limited. Skype. http://www.skype.com. VPNs as the basis of their future distributed compute clusters [27] D. Fabrice. (2005, November) Skype uncovered. http://www.ossir.org/ and have actively tested networks of over 700 nodes. windows/supports/2005/2005-11-07/EADS-CCR Fabrice Skype.pdf. [28] S. Guha, N. Daswani, and R. Jain, “An experimental study of the skype peer-to-peer voip system,” in IPTPS’06, 2006. REFERENCES [1] R. J. Figueiredo, P. O. Boykin, J. A. B. Fortes, T. Li, J. Peir, D. Wolinsky, L. K. John, D. R. Kaeli, D. J. Lilja, S. A. McKee, G. Memik, A. Roy, and G. S. Tyson, “Archer: A community distributed computing infrastructure