Itap: In-Network Traffic Analysis Prevention Using Software-Defined
Total Page:16
File Type:pdf, Size:1020Kb
iTAP: In-network Traffic Analysis Prevention using Software-Defined Networks https://itap.ethz.ch Roland Meier David Gugelmann Laurent Vanbever ETH Zürich ETH Zürich ETH Zürich [email protected] [email protected] [email protected] ABSTRACT CCS Concepts Advances in layer 2 networking technologies have fostered •Security and privacy ! Pseudonymity, anonymity and the deployment of large, geographically distributed LANs. untraceability; Network security; •Networks ! Network Due to their large diameter, such LANs provide many van- privacy and anonymity; Programmable networks; tage points for wiretapping. As an example, Google’s inter- nal network was reportedly tapped by governmental agen- Keywords cies, forcing the Web giant to encrypt its internal traffic. While using encryption certainly helps, eavesdroppers can Anonymous communication; wiretapping; SDN still access traffic metadata which often reveals sensitive information, such as who communicates with whom and 1. INTRODUCTION which are the critical hubs in the infrastructure. Since the Snowden revelations, it is well-known that net- This paper presents iTAP, a system for providing strong work eavesdropping was (and probably still is) performed in anonymity guarantees within a network. iTAP is network- the Internet core, particularly on undersea cables [22]. While based and can be partially deployed. Akin to onion rout- worrying, these threats can be mitigated to a large degree ing, iTAP rewrites packet headers at the network edges by by hiding connection metadata (e.g., using Tor [14]) and by leveraging SDN devices. As large LANs can see millions relying on pervasive encryption (e.g., using VPNs). of flows, the key challenge is to rewrite headers in a way However, network eavesdropping is not limited to the that guarantees strong anonymity while, at the same time, Internet backbone. As enterprise networks become bigger in scaling the control-plane (number of events) and the data- terms of users and physical reach, they, too, become suscep- plane (number of flow rules). iTAP addresses these chal- tible to wiretapping. Indeed the advent of new layer 2 tech- lenges by adopting a hybrid rewriting scheme. Specifically, nologies such as TRILL [7], Shortest Path Bridging [30], or iTAP scales by reusing rewriting rules across distinct flows SDN-based solutions [34] enables network administrators to and by distributing them on multiple switches. As reusing build large LAN zones that can easily span several thousand headers leaks information, iTAP monitors this leakage and devices and users. Due to their large physical diameter, adapts the rewriting rules before any eavesdropper could such networks inevitably exhibit many vantage points for provably de-anonymize any host. wiretapping. Actually, the majority of the attacks is now We implemented iTAP and evaluated it using real network performed by insiders, i.e. malicious insiders or inadvertent traffic traces. We show that iTAP works in practice, on actors [4] that act from within the network, rather than by existing hardware, and that deploying few SDN switches is remote attackers. As an example, Google’s Wide-Area Net- enough to protect a large share of the network traffic. work (WAN) – the private network connecting Google’s data centers – was reportedly tapped by governmental agencies, forcing the Web giant to encrypt its internal traffic [36]. While encrypting internal traffic protects the payload of Permission to make digital or hard copies of all or part of this work for personal connections, an in-network attacker can still monitor and or classroom use is granted without fee provided that copies are not made or analyze the unencrypted packet headers, i.e., the metadata. distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work In particular, the MAC addresses and, in case of SSL/TLS owned by others than ACM must be honored. Abstracting with credit is per- application layer encryption, also the IP addresses of the mitted. To copy otherwise, or republish, to post on servers or to redistribute to source and destination hosts along with the source and des- lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. tination ports used for the communication. SOSR ’17, April 03-04, 2017, Santa Clara, CA, USA By analyzing these unencrypted header fields, an attacker c 2017 ACM. ISBN 978-1-4503-4947-5/17/04. $15.00 can gain useful information about: i) which hosts commu- DOI: http://dx.doi.org/10.1145/3050220.3050232 nicate; ii) the topology of the network; iii) the addresses of lenging. From an anonymity viewpoint, the best rewriting .32 255. ... solution consists in assigning one random ID per TCP/UDP flow. Yet, not only does this require millions of forwarding rules in the data plane, but it also mandates a purely reactive controller which has to be involved in each connection es- .16 tablishment. From a scalability viewpoint, the best solution destination IP destination IP would be to assign one random ID per host and reuse it for all flows pertaining to that host. Doing so would only require two forwarding rules per host at the edge, i.e., a marginal 10.0.0.0 10.0.0.16 10.0.0.32 0.0.0.0 255. ... .255 increase with respect to today’s networks, and could even source IP source IP be done proactively so that the controller does not have to (a) Regular network (b) iTAP-enabled network be involved in any connection establishment. However, this would not provide strong anonymity, e.g., an attacker could Figure 1: Example of information leakage. Observed source still easily count the number of devices in the network. and destination IP addresses without (a) and with iTAP (b). iTAP solves the scalability and privacy problems of these approaches while retaining their strengths by employing a hybrid approach. In particular, iTAP reuses IDs across dis- important hosts driving a lot of traffic (e.g., data storage) or tinct flows so as to maintain the overall number of forward- a lot of flows (e.g., authentication server). An eavesdropper ing rules to an acceptable level. Reusing rules leaks some can use the collected metadata to extract privacy-relevant information out to potential eavesdropper, namely which bits information, or to identify valuable targets during the re- in the IDs identify hosts and where these hosts are con- connaissance phase of an attack. To give some examples: nected. iTAP addresses this problem by continuously moni- First, an attacker, engaged by a competitor, could easily toring the amount of leaked information and by adapting the determine whether a certain person (e.g., the CEO) is cur- rewriting scheme before any attacker could provably (relying rently working or on holidays by monitoring network traffic. on information theory and the notion of the unicity distance) Her absence might indeed be a good occasion to launch a learn enough information to break the used scheme. new product or a campaign against the attacked company. Reusing IDs enables iTAP to maintain O (# hosts) for- 2 Second, an attacker can foresee the future actions of the warding rules in the core and O # hosts at the edge. While company [41]. Indeed, the launch of a new product is often this number of forwarding rules is perfectly manageable for preluded by a significantly higher-level of communication users-facing edge switches (see §9), it can however be a between a broad range of departments. Third, an attacker can problem for Internet-facing edges which act as focus points infer the addresses of important or heavily-used hosts (e.g., for any flow directed to the outside. iTAP addresses this authentication and accounting servers) as well as who is problem by monitoring each edge switch (in terms of flow accredited to use them. In a following step, the attacker could rules or updates per second) and by offloading obfuscation then run a targeted attack against these clients or servers. rules for Internet destinations on other switches before any switch limit is reached. While offloading obfuscation rules This work. We present iTAP, a partially deployable network- reveals the actual Internet destinations on some links, the level system that enables anonymous communication within anonymity of internal hosts is still protected anywhere in the the premises of a network. The key insight is to leverage the network and an attacker cannot know who is communicating programmability offered by Software-Defined Networking with a particular destination. (SDN) to rewrite packet headers at the network edge to varying randomized identifiers. iTAP is useful with as few as Practical results. iTAP works in practice. We implemented two SDN devices and its anonymity guarantees grow linearly a prototype on top of Floodlight [43] and evaluated iTAP with the number of SDN devices from thereafter. using real traffic traces. Our results show that iTAP can With iTAP in place, in-network eavesdroppers only see handle the load of a large network using commodity SDN randomized IDs (which carry no information) as headers and hardware. cannot single out hosts unless they eavesdrop on all links Novelty. Communicating anonymously has been the focus of a path, which is unlikely in practice. As an illustration, of extensive research [12–14, 24]. Yet, we are not aware Fig.1 shows the distribution of source and destination IP of any other work which focuses on anonymizing internal addresses observed by a link-level eavesdropper with and communications without requiring host support. In partic- without iTAP. Without iTAP, it is evident that 18 clients ular, Tor [14] and Hornet [13] both require host support. and six servers are communicating with each other. With Recently, MACSec [5] has been developed to provide link- iTAP, the observed IP addresses are spread over the whole based encryption for traffic beyond L2 (MAC addresses address range and neither the real addresses nor the number are not encrypted).