The vAMP Attack: Taking Control of Cloud Systems via the Unified Packet Parser

Kashyap Thimmaraju (TU Berlin), Bhargava Shastry (TU Berlin), Tobias Fiebig (TU Berlin), Felicitas Hetzelt (TU Berlin), Jean-Pierre Seifert (TU Berlin), Anja Feldmann (TU Berlin), Stefan Schmid (TU Berlin/Aalborg University)

ABSTRACT
Virtual switches are a crucial component of cloud operating systems that interconnect virtual machines in a flexible manner. They implement complex network protocol parsing in the unified packet parser, which parses all supported packet header fields in a single pass, and are commonly co-located with the virtualization layer. We find that this significantly reduces the barrier for low-budget attackers to launch high impact attacks in the cloud. This leads us to introduce the virtual switch attacker model for packet-parsing, in short the vAMP attack. Using OpenStack, a cloud operating system, and Open vSwitch, a virtual switch, we demonstrate how current virtual switch designs cannot withstand vAMP, thereby giving a weak attacker full control of the cloud in a matter of minutes.

CCS CONCEPTS
• Security and privacy → Network security; Distributed systems security;

KEYWORDS
SDN; NFV; Virtual Switches; Data Plane Security; Cloud Security; Open vSwitch; OpenStack

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CCSW’17, October 2017, Dallas, Texas USA
© 2017 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery.

1 INTRODUCTION
Computer networks are becoming increasingly programmable and virtualized. A key enabler for such a paradigm is the virtual switch. It is a piece of software that resides in the server’s virtualization layer (e.g., VMware’s ESXi or Xen’s Dom0), tasked with virtualizing the data plane of virtual machines. Hence, virtual switches are meant to provide network isolation among tenants’ virtual machines [1].

Virtual switches such as Open vSwitch (OvS), Cisco Nexus 1000V, VMware vSwitch, and Microsoft VFP bring new flexibility to data centers in terms of network virtualization [1], e.g., centralized control of all virtual switches, and layer 2 virtual private networks.

Despite their popularity, the security implications of virtual switches have received little attention. In general, while much research has focussed on control plane security, the security of the physical and virtual data plane is often overlooked. This is worrisome as the data plane in general, and packet parsers in particular, process attacker controlled input.

This paper is motivated by two key observations. First, by placing virtual switches in the (edge) servers, the attack surface has advanced one step closer to the attacker. Therefore, associated attacker models may significantly change compared to non-virtualized data planes. Second, unified packet parsing (parsing all the supported protocol fields of a packet, e.g., Ethernet, IP and TCP, in a single pass instead of several independent passes) exacerbates the existing attack surface of integrating and juxtaposing complex network functionality with the virtual switch. In fact, the number of parsed protocols for virtual switches appears to be constantly growing, as depicted in Fig. 1. The increasing complexity of packet parsers lowers the bar for adversaries to attack [2] cloud systems where virtual switches are commonly used.

[Figure 1: The total number of parsed high-level protocols in two popular virtual switches (Open vSwitch and Nexus 1000V) from 2009-2017. Axes: time (Jan-2009 to Jul-2017) versus parsed protocols (0 to 50).]

These observations lead us, in this paper, to introduce a new attacker model for virtual switches. The virtual switch attacker model for packet-parsing, in short the vAMP attack, describes a low-resource attacker with a huge impact.

We demonstrate this attacker model in a case study using OpenStack, a popular cloud operating system, and OvS as the virtual switch. Using an off-the-shelf fuzzer to fuzz the unified packet parser of OvS, which is merely 2-3% of the OvS code base, we uncovered multiple security vulnerabilities.

Exploiting just one of those vulnerabilities enabled us to subsequently compromise the entire OpenStack setup, as illustrated in Fig. 2. An attacker sends a crafted packet from a virtual machine (VM) to the vulnerable packet parser of the virtual switch. This results in taking control of the physical machine due to the co-location of the virtual switch with the hypervisor. Next, she can take control of the hypervisor where the VM running the cloud (and in most cases network) controller is hosted, due to the direct communication channel. From the controller, she can leverage the logically centralized design to, e.g., manipulate essential flow rules to violate network isolation.

[Figure 2: As a consequence of the vAMP attack, a worm can propagate to all the systems by exploiting the virtual switch’s unified packet parser.]

Note that to successfully execute the above attack, it is sufficient for the attacker to rent a VM in the cloud. Besides increasing the attack surface, which can be exploited with low resources, as our case study of such an attack demonstrates, the attack impact can be potentially much larger than usually considered in the context of data planes (hardware/software) [3, 4].

Contributions:
• We identify the increasingly complex, adversary-facing, unified packet parser as a new attack surface introduced by virtual switches.
• We introduce a relevant attacker model for virtual switches, namely, the virtual switch attacker model for packet-parsing (vAMP).
• Using a case study with OpenStack and OvS, we demonstrate that existing virtual switches cannot withstand the vAMP attack.

Ethical Considerations: To avoid disrupting the normal operation of businesses, we verified our findings on our own infrastructure. We have disclosed our findings to the OvS team, who have integrated the fixes. Redhat, Suse, Mirantis, and other stakeholders have applied these fixes in their stable releases. Furthermore, CVE-2016-2074 and CVE-2016-10377 were assigned to our discoveries.

2 UNIFIED PACKET PARSING: A NEW ATTACK SURFACE
The packet parser is typically integrated into the code base of the virtual switch. It is the first component of the virtual switch to process network packets it receives from the physical or virtual network interface. Therefore, it can receive and process attacker controlled packets. The network protocols commonly parsed by virtual switches are from Layer 2 to Layer 4, i.e., from Ethernet up to TCP/UDP. Workload increase and advancements in network virtualization are driving virtual switches to implement middlebox functions such as load balancing, deep packet inspection, and stateful firewalls [5]. This necessitates two major requirements of virtual switches: (i) support the parsing of more or new protocols; (ii) maintain, if not improve, existing forwarding performance.

Unified packet parsing addresses the above requirements. All the supported header fields of a packet are extracted in a single pass instead of multiple passes. This enables the virtual switch and other network functions to operate on packet metadata instead of the actual packet. Packet operations are left to the end of the packet processing pipeline. Naturally, this approach performs better than parsing or re-parsing headers on a need-to basis [6].

Although this centralizes the parsing code, and has the potential to ease verification [7], it is also the most likely piece of code to contain security vulnerabilities [2]. Furthermore, a security vulnerability here can compromise all the dependent mechanisms and policies, e.g., firewalls and policies for network isolation among tenants. Thus, the attack surface of the virtual switch increases with any new protocol that is included in parsing. Recall from Fig. 1 that the number of supported protocols in virtual switch implementations such as OvS and Cisco’s Nexus 1000V is on the rise.

3 ATTACKER MODELS
Having identified the unified packet parser as potentially increasing the attack surface for virtual switches, we now revisit existing attacker models to investigate whether virtual switches are appropriately accounted for. In particular, we look at prior work starting from 2009, which is the year when virtual switches emerged into the virtualization market [1]. Subsequently, motivated by the shortcomings of existing attacker models, we introduce a novel attacker model: the vAMP attack.

3.1 Prior Attacker Models for Virtual Switches
Virtual switches intersect with several areas of network security research: data plane, network virtualization, software defined networking (SDN), and the cloud. Therefore, we conducted a qualitative analysis that includes research we identified as relevant to attacker models for virtual switches. In the following we elaborate on that.

A general attacker model for the data plane was proposed by Chasaki et al. [8] that involves the attacker exploiting the packet processor of routers. Keller et al. [4] proposed an attacker model for router virtualization that assumed the virtualization layer in general can be compromised. Qubes OS [9] also makes a general assumption that the networking stack can be compromised. Similarly, Dhawan et al. [10] assumed that the Software Defined Network (SDN) data plane can be compromised.

Paladi et al. [11] conservatively assumed the Dolev-Yao model for network virtualization in a multi-tenant cloud, and Grobauer et al. [12] observed that virtual networking can be attacked in the cloud without providing a particular attacker model.

Jin et al. [3] accurately identified two threats faced by virtual switches, namely, virtual switches are co-located with the hypervisor, and guest VMs need to interact with the hypervisor. However, they stopped short of providing a concrete threat model, and underestimated the impact of compromising virtual switches. Indeed, at the time, cloud systems were burgeoning. Only recently, Alhebaishi et al. [13] proposed an updated approach to cloud threat modelling. The virtual switch was identified as a component of cloud systems that needs to be protected. However, the severity and multitude of threats that apply to virtual switches were overlooked.

Gonzales et al. [14] and Karmakar et al. [15] accounted for virtual switches and the data plane; however, their work was motivated by strong adversaries. Similarly, Yu et al. [16] assumed a strong adversarial model, with an emphasis on hardware switches, and the defender having sufficiently large resources.

Therefore, what we observe is that previous work has either provided a general attacker model for the data plane, fallen short of providing an accurate attacker model for virtual switches, underestimated the impact of a compromised virtual switch, or assumed strong attackers. We find this to be inadequate given the important role virtual switches play in the cloud, network (function) virtualization, and SDN. Hence, what is needed is an accurate and relevant attacker model for virtual switches, which we describe next.

3.2 The Virtual Switch Attacker Model for Packet-Parsing (vAMP)
Given the shortcomings of the above attacker models, we now present a new attacker model for virtual switch based cloud network setups that use a logically centralized controller. Contrary to prior work, we identify the virtual switch as a critical core component which has to be protected against direct attacks, e.g., malformed packets. Furthermore, our attacker is not supported by a major organization (she is a “Lone Wolf”) nor does she have access to special network vantage points. The attacker’s knowledge of computer programming and code analysis tools is comparable to that of an average software developer. In addition, the attacker controls a computer that can communicate with the cloud under attack.

The attacker’s target is a cloud infrastructure that uses virtual switches for network virtualization. We assume that our attacker has only limited access to the cloud. Specifically, the attacker does not have physical access to any of the machines in the cloud. Regardless of the cloud delivery model and whether the cloud is public or not, we assume the attacker can either rent a single VM, or has already compromised a VM in the cloud, e.g., by exploiting a web-application vulnerability [17].

We assume that the cloud provider follows security best-practices [18]. Hence, at least three isolated networks (physical/virtual) dedicated towards management, tenants/guests, and external traffic exist. Furthermore, we assume that the same software stack is used across all servers in the cloud.

We consider our attacker as successful if the attacker obtains full control of the cloud. This means that the attacker can perform arbitrary computation, create/store arbitrary data, and send/receive arbitrary data to all nodes including the Internet.

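As a minimal sketch of the single-pass parsing described in Section 2, the following Python fragment walks an Ethernet frame once, front to back, and fills a metadata record that later stages could consult instead of re-reading the packet. It is an illustration under stated assumptions, not the OvS implementation: the names `FlowKey` and `parse_once` are hypothetical, only Ethernet, IPv4, and TCP/UDP ports are handled, and every read is preceded by an explicit length check.

```python
import struct
from dataclasses import dataclass
from typing import Optional

ETH_TYPE_IPV4 = 0x0800
IPPROTO_TCP, IPPROTO_UDP = 6, 17


@dataclass
class FlowKey:
    """Hypothetical metadata record, filled once and reused by later stages."""
    eth_dst: bytes
    eth_src: bytes
    eth_type: int
    ip_src: Optional[bytes] = None
    ip_dst: Optional[bytes] = None
    ip_proto: Optional[int] = None
    l4_src: Optional[int] = None
    l4_dst: Optional[int] = None


def parse_once(frame: bytes) -> Optional[FlowKey]:
    """Extract L2-L4 fields in one pass; return None if a header is truncated."""
    if len(frame) < 14:
        return None
    eth_dst, eth_src, eth_type = struct.unpack_from("!6s6sH", frame, 0)
    key = FlowKey(eth_dst, eth_src, eth_type)
    offset = 14
    if eth_type != ETH_TYPE_IPV4:
        return key  # unsupported L3 protocol: keep the L2 fields only
    if len(frame) < offset + 20:
        return None
    version_ihl = frame[offset]
    header_len = (version_ihl & 0x0F) * 4  # IHL is counted in 32-bit words
    if version_ihl >> 4 != 4 or header_len < 20 or len(frame) < offset + header_len:
        return None
    key.ip_proto = frame[offset + 9]
    key.ip_src = frame[offset + 12:offset + 16]
    key.ip_dst = frame[offset + 16:offset + 20]
    offset += header_len
    if key.ip_proto in (IPPROTO_TCP, IPPROTO_UDP):
        if len(frame) < offset + 4:
            return None
        key.l4_src, key.l4_dst = struct.unpack_from("!HH", frame, offset)
    return key
```

Each additional protocol would add another branch with its own length and consistency checks, which is one way to picture why the attack surface grows with every protocol the parser supports.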
4 TAKING CONTROL OF THE CLOUD
Based on our analysis, we conjecture that current virtual switch implementations are not robust to adversaries from our attacker model. Indeed, it is generally known that vulnerabilities exist, and will occur, even in well maintained production software. However, the point we make here is that finding vulnerabilities can be easily accomplished by amateurs, e.g., by fuzzing; moreover, a single vulnerability can have a devastating impact.

For the purpose of our case study, we evaluated the virtual switch Open vSwitch in the context of the cloud operating system OpenStack against our attacker model. We opted for this combination as OpenStack is one of the most prominent cloud systems, with thousands of production deployments in large enterprises and small companies alike. Furthermore, according to the OpenStack Survey 2016 [19], over 60% of OvS deployments are in production use and over one third of 1000+ core clouds surveyed use OvS.

Attack Methodology: We conducted a structured attack targeted at the attack surface we identified, i.e., the unified packet parser. We expected to find vulnerabilities in the unified packet parser code of OvS, which executes in the ovs-vswitchd process. Hence, our next step used an off-the-shelf coverage-guided fuzz tester, namely American Fuzzy Lop (AFL), on OvS’s unified packet parser code. Specifically, for our tests we used AFL version 2.03b and source code of OvS version 2.3.2 recompiled with AFL instrumentation. Following common best practice for fuzzing code, all crashes reported by the fuzzer were triaged to ascertain their root cause.

We identified several vulnerabilities by fuzzing the unified packet parser of OvS; these include one remote-code execution, two denial of service attacks, and one access control list bypass. In this paper we focus on only one of the vulnerabilities we found as it suffices to demonstrate the attack. The vulnerability is a stack buffer overflow in the MPLS parsing code of OvS.

[Figure 3: A visual representation of our ROP chain for ovs-vswitchd to spawn a shell and redirect it to a remote socket address.]

The stack buffer overflow occurs when a large MPLS label stack packet is parsed beyond the defined threshold. As predicted, this attack has its root-cause in the unified packet parser. Indeed, we note that the specification of MPLS (see RFC 3031 [20] and RFC 3032 [21]) does not specify how to parse the whole label stack. Instead, it specifies that when a packet with a label stack arrives at a forwarding component, only the top label must be popped to be used to make a forwarding decision. Yet, OvS parses all labels of the packet even beyond the supported limit and beyond the pre-allocated memory range for that stack. If OvS handled MPLS packets as per the specifications, it would only pop the top label, which has a static, defined size. Thus, there would be no opportunity for a buffer overflow.

The next step was to convert the vulnerability into an exploit that gives us a shell on the remote system. We crafted the exploit by creating a ROP [22] attack hidden in an MPLS packet. By now, ROP attacks are well documented and can be created by an attacker who has explored the literature on implementing ROP attacks. Therefore, we focus on one of the challenges we faced in creating the exploit. As per RFC 3032, MPLS label processing terminates if the S bit is set to 1. Therefore, to obtain a successful ROP chain, i.e., to prevent early termination of our exploit, we had to select appropriate gadgets by customizing Ropgadget [23] and modifying the shell command string accordingly. Figure 3 describes the stack layout of ovs-vswitchd right after the buffer overflow, overlaid with our ROP chain.

With a successful ROP chain, the next step was to create a worm. We need multiple steps to propagate the worm. These are visualized in Figure 2. In Step 1, the worm originates from an attacker-controlled (guest) VM within the cloud and compromises the host operating system (OS) of the server via the vulnerable packet processor of the virtual switch. Once she controls the server, she patches ovs-vswitchd on the compromised host, as otherwise the worm packet cannot be propagated. Instead, the packet would trigger the vulnerability in OvS yet again.

With the server under her control, the remote attacker, in Step 2, propagates the worm to the hypervisor running the controller VM and compromises it via the same vulnerability. The centralized architecture of OpenStack requires the controller to be reachable from all servers via the management network and/or guest network. By gaining access to one server we gain access to these networks and, thus, to the controller. Indeed, the co-location of the data plane and the controller provides the necessary connectivity for the worm to propagate from any of the servers to the controller. Network isolation using VLANs and/or tunnels (GRE, VXLAN, etc.) does not affect the worm once the server is compromised. We note here that the ROP chain had to be slightly modified by including “NOP”s to accommodate the increased packet size due to the use of VLAN tags and VXLAN encapsulation.

With the controller’s server also under the control of the remote attacker, the worm can now taint the remaining uncompromised server(s) (Step 3). Thus, finally, after Step 3, all servers are under the control of the remote attacker. We automated the above steps using a shell script. Our worm exploit could also have been created by an attacker with average programming skills who has some experience with this kind of technique. This is in accordance with our attacker model, which does not require an uncommonly skilled attacker.

Attack Evaluation: Rather than evaluating the attack in the wild, we chose to create a test setup in a lab environment. More specifically, we used the Mirantis 8.0 distribution that ships OpenStack “Liberty” with OvS version 2.3.2. The test setup consisted of a server (the fuel master node) that can configure and deploy other OpenStack nodes (servers) including the OpenStack controller, compute, storage and network. Due to limited resources, we created one controller and one compute node with multiple VMs in addition to the fuel master node using the default Mirantis 8.0 configuration. Virtual switching was handled by OvS.

The attacker was given control of one of the VMs on the compute server and could deploy the worm from there. It took less than 20 seconds until the worm compromised the controller. This means that the attacker has root shell (ovs-vswitchd runs as root) access to the compute node as well as the controller. These 20 seconds include 3 seconds of download time for patching ovs-vswitchd (OvS user-space daemon), the shell script, and the exploit payload. Moreover, we added 12 seconds of sleep time for restarting the patched ovs-vswitchd on the compute node so that attack packets could be forwarded.

Next, we added 60 seconds of sleep time to ensure that the network services on the compromised controller were restored. Since all compute nodes are accessible from the controller, we could compromise them in parallel. This takes less time than compromising the controller, i.e., less than 20 seconds. Hence, we conclude that the compromise of a standard cloud setup can be performed in less than two minutes.

Attack Result: Our evaluation demonstrates how easily an amateur attacker can compromise the virtual switch, and subsequently take control of the entire cloud in a matter of minutes. This can have serious consequences, e.g., amateur attackers can exploit virtual switches to launch ransomware attacks in the cloud. This is a result of complex packet parsing in the unified packet parser, co-locating the virtual switch with the virtualization layer, and incorrect attacker models.

5 CONCLUDING REMARKS
While this paper has focussed on virtual switches, the virtualization trend, and hence the relevance of our attacker model, is more general. In particular, additional network functionality, e.g., middleboxes, is migrated from hardware equipment to commodity servers [5, 6], not only in the cloud but also by other infrastructure providers, e.g., Internet Service Providers [24]. Furthermore, as we move towards realizing SDN 2.0 [25], network virtualization and higher layer protocols (Layer 4-7) will play pivotal roles. Hence, it is imperative for data plane elements to be more robust to attacks, especially in vulnerable code segments such as the unified packet parser.

ACKNOWLEDGMENTS
The authors thank the anonymous reviewers for their valuable feedback and comments. The authors would like to express their gratitude towards the German Bundesamt für Sicherheit in der Informationstechnik for sparking the authors’ interest in SDN security. This work was partially supported by the Helmholtz Research School in Security Technologies scholarship, Danish Villum Foundation project “ReNet”, BMBF (Bundesministerium für Bildung und Forschung) Grant KIS1DSD032 (Project Enzevalos), and by the Leibniz Prize project funds of DFG/German Research Foundation (FKZ FE 570/4-1). We would also like to thank the security team at Open vSwitch for their timely response.

REFERENCES
[1] Ben Pfaff et al. “Extending Networking into the Virtualization Layer”. In: Proc. ACM Workshop on Hot Topics in Networks (HotNETs). 2009.
[2] Len Sassaman et al. “Security Applications of Formal Language Theory”. In: IEEE Systems Journal 7.3 (Sept. 2013).
[3] Xin Jin, Eric Keller, and Jennifer Rexford. “Virtual Switching Without a Hypervisor for a More Secure Cloud”. In: Proc. USENIX Workshop on Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services (HotICE). 2012.
[4] Eric Keller, Ruby B. Lee, and Jennifer Rexford. “Accountability in Hosted Virtual Networks”. In: Proc. ACM Workshop on Virtualized Infrastructure Systems and Architectures. VISA ’09. 2009.
[5] Ethan J. Jackson et al. “Softflow: A middlebox architecture for open vswitch”. In: Usenix Annual Technical Conference (ATC). 2016.
[6] Daniel Firestone. “VFP: A Virtual Switch Platform for Host SDN in the Public Cloud”. In: Proc. Usenix Symposium on Networked Systems Design and Implementation (NSDI). 2017.
[7] Mihai Dobrescu and Katerina Argyraki. “Software Dataplane Verification”. In: Proc. Usenix Symposium on Networked Systems Design and Implementation (NSDI). Apr. 2014.
[8] Danai Chasaki and Tilman Wolf. “Attacks and Defenses in the Data Plane of Networks”. In: Proc. IEEE/IFIP Transactions on Dependable and Secure Computing (DSN) 9.6 (Nov. 2012).
[9] Joanna Rutkowska and Rafal Wojtczuk. “Qubes OS architecture”. In: Invisible Things Lab Tech Rep 54 (2010).
[10] Mohan Dhawan et al. “SPHINX: Detecting Security Attacks in Software-Defined Networks”. In: Proc. Internet Society Symposium on Network and Distributed System Security (NDSS). 2015.
[11] Nicolae Paladi and Christian Gehrmann. “Towards Secure Multi-tenant Virtualized Networks”. In: Proc. IEEE Trustcom/BigDataSE/ISPA. Vol. 1. Aug. 2015.
[12] Bernd Grobauer, Tobias Walloschek, and Elmar Stöcker. “Understanding Cloud Computing Vulnerabilities”. In: IEEE Security & Privacy Magazine 9.2 (Mar. 2011).
[13] Nawaf Alhebaishi et al. “Threat Modeling for Cloud Data Center Infrastructures”. In: Intl. Symposium on Foundations and Practice of Security. Springer. 2016.
[14] Dan Gonzales et al. “Cloud-Trust - a Security Assessment Model for Infrastructure as a Service (IaaS) Clouds”. In: Proc. IEEE Conference on Cloud Computing PP.99 (2017).
[15] Kallol Krishna Karmakar, Vijay Varadharajan, and Uday Tupakula. “Mitigating attacks in Software Defined Network (SDN)”. In: Proc. IEEE Software Defined Systems (SDS). May 2017.
[16] Dongting Yu et al. Security: a Killer App for SDN? Tech. rep. Indiana Uni. at Bloomington, 2014.
[17] A. Costin. “All your cluster-grids are belong to us: Monitoring the (in)security of infrastructure monitoring systems”. In: Proc. IEEE Communications and Network Security (CNS). Sept. 2015.
[18] OpenStack Security Guide. http://docs.openstack.org/security-guide. Accessed: 27-01-2017. 2016.
[19] Heidi J. Tretheway et al. “A snapshot of Openstack users’ attitudes and deployments”. In: Openstack User Survey (Apr 2016).
[20] Eric C. Rosen, Arun Viswanathan, and Ross Callon. Multiprotocol Label Switching Architecture. RFC 3031 (Proposed Standard). Internet Engineering Task Force, Jan. 2001. url: http://www.ietf.org/rfc/rfc3031.txt.
[21] Eric C. Rosen et al. MPLS Label Stack Encoding. RFC 3032 (Proposed Standard). Internet Engineering Task Force, Jan. 2001. url: http://www.ietf.org/rfc/rfc3032.txt.
[22] Ryan Roemer et al. “Return-Oriented Programming: Systems, Languages, and Applications”. In: ACM Trans. on Information and System Security (TISSEC) 15.1 (Mar. 2012).
[23] ROPGadget Tool. https://github.com/JonathanSalwan/ROPgadget/tree/master. Accessed: 02-06-2016.
[24] Sue Marek. ATT’s Stephens: 47% of Network Functions Are Virtualized. https://www.sdxcentral.com/articles/news/atts-stephens-47-network-functions-virtualized/2017/07/. Accessed: 28-07-2017. 2017.
[25] Craig Matsumoto. Time for an SDN Sequel? Scott Shenker Preaches SDN Version 2. https://www.sdxcentral.com/articles/news/scott-shenker-preaches-revised-sdn-sdnv2/2014/10/. Accessed: 27-01-2017. 2014.
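The MPLS issue discussed in Section 4 comes down to walking a label stack without enforcing a depth bound. As a defensive illustration only, the sketch below parses RFC 3032 label stack entries (20-bit label, 3-bit traffic class, 1-bit S flag, 8-bit TTL) and rejects any stack that is truncated or deeper than a fixed limit. The names `parse_mpls_stack` and `make_entry` are hypothetical, and `MAX_LABELS = 3` is an assumed value chosen for the example; it is not the limit used by OvS.

```python
import struct
from typing import List, Optional, Tuple

MAX_LABELS = 3  # assumed bound for illustration; not taken from OvS


def parse_mpls_stack(payload: bytes, max_labels: int = MAX_LABELS
                     ) -> Optional[Tuple[List[int], int]]:
    """Return (labels, bytes_consumed), or None if the stack is truncated
    or holds more than max_labels entries before the bottom-of-stack bit."""
    labels: List[int] = []
    offset = 0
    while True:
        if len(labels) == max_labels:
            return None  # never read past the storage reserved for labels
        if len(payload) < offset + 4:
            return None  # truncated label stack entry
        (entry,) = struct.unpack_from("!I", payload, offset)
        offset += 4
        labels.append(entry >> 12)  # upper 20 bits carry the label value
        if (entry >> 8) & 1:        # S bit set: bottom of stack reached
            return labels, offset


def make_entry(label: int, bottom: int, ttl: int = 64) -> bytes:
    """Build one 32-bit label stack entry (traffic class left at zero)."""
    return struct.pack("!I", (label << 12) | (bottom << 8) | ttl)
```

For instance, `parse_mpls_stack(make_entry(100, 0) + make_entry(200, 1))` yields the two labels and eight consumed bytes, while four entries without the S bit are rejected. Checking the bound before each read, rather than after the loop, is the design choice that keeps the parser inside its reserved storage regardless of what the packet claims.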