1

A Framework and Comparative Analysis of Control Plane Security of SDN and Conventional Networks AbdelRahman Abdou and Paul C. van Oorschot Tao Wan Carleton University, Canada Huawei Canada

Abstract—Software defined networking implements the net- a multitude of network devices within its administrative work control plane in an external entity, rather than in each domain. By providing application programming inter- individual device as in conventional networks. This architectural faces (APIs), SDN makes it possible to develop net- difference implies a different design for control functions neces- sary for essential network properties, e.g., loop prevention and working applications, e.g., traffic engineering [3], thus link redundancy. We explore how such differences redefine the enabling network innovation. In contrast, CN devices are security weaknesses in the SDN control plane and provide a proprietary and closed, making it hard or impossible to framework for comparative analysis which focuses on essential develop innovative network applications. network properties required by typical production networks. The concept of SDN has evolved since the term was This enables analysis of how these properties are delivered by the control planes of SDN and conventional networks, and to originally coined in 2009 [4]. Here we try to clarify the critical compare security risks and mitigations. Despite the architectural properties of SDN from the perspective of network devices. difference, we find similar, but not identical, exposures in control A network device can be pure SDN, non-SDN, or hybrid. plane security if both network paradigms provide the same A pure SDN device implements no control function and is network properties and are analyzed under the same threat fully controlled by an external SDN controller. A non-SDN model. However, defenses vary; SDN cannot depend on edge based filtering to protect its control plane, while this is arguably device implements all of its own control functions and is not the primary defense in conventional networks. Our concrete controlled by any SDN controller. A hybrid SDN device both security analysis suggests that a distributed SDN architecture implements control functions, and is controlled by an SDN that supports fault tolerance and consistency checks is important controller. Based on this classification of network devices, a for SDN control plane security. Our analysis methodology may network can also be one of three types. A pure SDN network be of independent interest for future security analysis of SDN and conventional networks. consists of at least one SDN controller and network devices all of which are fully controlled by the controller. A non- Index Terms—Network security, SDN security, Control plane i.e., security, OpenFlow security SDN network ( CN) consists of network devices all of which implement and run their own control functions with I.INTRODUCTION no controlling external entity. A hybrid network consists of hybrid devices and at least one SDN controller. Software-Defined Networking is a relatively new network In academic work, “SDN" often implies a pure SDN net- architecture in which the control plane is separated from work, such as an OpenFlow network, and many academic each individual network device and instead implemented in SDN security research papers (e.g., [5]) focus primarily on the an external software entity. The external entity has complete security of OpenFlow networks. SDN controllers originating in knowledge of the topology of a network under its control, academic work, such as FloodLight and NOX, also primarily and programs the forwarding tables of each individual device support OpenFlow and control OpenFlow switches which arXiv:1703.06992v3 [cs.NI] 6 Dec 2017 in the network. In contrast in conventional networks (CNs), implement no control functionality (i.e., are pure SDN, rather the control plane, including implementation of e.g., a than hybrid). protocol such as (OSPF) [1], runs In contrast in industry, SDN commonly refers to hybrid inside each network device to learn forwarding tables in a networks consisting primarily of CN devices, augmented with distributed fashion. SDN architectures have two distinguishing open interfaces also allowing external control by an SDN properties of direct interest herein [2]: controller. For example Broadcom, a leading provider of 1) Control and data plane separation. Removing the switch chips, published OpenFlow Data Plane Abstraction control plane from network devices and implementing it (OF-DPA) software [6] to allow switches based on Broad- in an external SDN controller significantly reduces the com chips to be controlled by OpenFlow. Note those CN complexity of network devices, making them simpler devices, although often claimed to support OpenFlow and and cheaper than CN devices whose distributed control which can be controlled using the OpenFlow protocol, do plane functionality is implemented across millions of not actually implement OpenFlow tables and are not true lines of code, defined across hundreds of RFCs. OpenFlow switches. Rather, they use conventional tables such 2) Network programmability. An SDN controller, with as L3 tables and Access Control Lists (ACLs) to emulate complete knowledge of a network’s topology, controls the behavior of OpenFlow tables, which allows packets to be processed beyond destination addresses. As another example, Version: December 7, 2017 OpenDayLight [7] and ONOS [8], two leading open source 2

SDN controllers, can control not only OpenFlow switches but modularized SDN software architecture to facilitate imple- also conventional devices, e.g., using NETCONF [9]. It is clear mentation there. Our finding is supported by detailed security that industrial network practitioners focus more on network analysis. We argue that our methodology for comparative programmability than on the separation of control and data analysis will be of independent interest, to guide future SDN planes. We refrain from speculating on which type of SDN is security analysis in both academia and by practitioners. better, or is the future. The sequel is organized as follows. Section II provides back- We study and compare the control plane security of a pure ground information on CN and SDN architecture. Section III SDN (hereafter referred to as SDN) and a CN. While hybrid outlines fundamental network properties required by typical networks are more popular in the field, there is no clear production networks, as well as the threat model used for our consensus on how to best divide control functions locally analysis. Sections IV and V analyze the security risks of the inside a device and externally into a controller. Further, by control plane of conventional Layer-2 (L2) and Layer-3 (L3) studying the security of both SDN and CNs, we hope that networks respectively. Section VI analyzes security risks in security threats identified in each can be selectively applied to SDN networks. We compare the security risks and mitigation a given hybrid network when its local and external controls of SDN with CN in Section VII. Section VIII reviews related are well defined. work. Section IX concludes. Research on the security of SDN and CNs is in two distinct states. On one hand, the security of CNs has received II.BACKGROUND less academic attention but is well understood by network security practitioners; aside from the area of routing (e.g., Here we provide background on conventional and SDN BGP security [10]) there are relatively few academic papers networks for consistent terminology and later reference. Net- on the control plane security of a CN, security threats are working experts may advance to Section III. well understood by equipment vendors and many security mitigations are built into CN products (e.g., switches, routers). A. Conventional Networks In contrast, SDN security has received considerable academic attention (e.g., [11], [12], [13]), but its progress is considered A CN can be L2 or L3. A network consisting of only L2 slow (at best) by industrial measures. For example, neither of switches as its intermediate systems is called an L2 network. the two leading open source SDN controllers, OpenDaylight Two (or more) L2 networks can be connected, e.g., using an and ONOS, has implemented significant security mitigation. L3 . A network of L3 routers is called an L3 network. These different states of SDN and CN security research have Other than using different types of destination addresses for attracted little attention. We offer the following explanation. forwarding, L2 and L3 networks differ mainly in two aspects: We observe that many papers on SDN security assume a sim- 1) They use different mechanisms in constructing their ple network, ignoring practical properties such as redundancy forwarding tables. L2 devices learn their forwarding and scalability essential to realistic networks—thus excluding tables (i.e., MAC tables) from the data plane. L3 routers security risks faced by important network control functionality. build routing tables from the control plane using routing Further, security threats identified for SDN are not properly protocols. Note: MAC tables map MAC addresses to compared with those in CN. We suggest that the lack of switch ports, not to be confused with ARP tables which academic scrutiny, particularly systematization literature, on map IP addresses to MAC addresses. CN security, may be a significant contributing factor in the 2) They handle unknown packets differently. An unknown considerable academic research on SDN security having failed packet is a packet without any corresponding forwarding to have major impact in industry. rules. An L2 device floods an unknown packet to all We aim to address the gap by a comparative security ports except the receiving one to learn the forwarding assessment of conventional and SDN networks. Rather than rule, while an L3 router drops an unknown packet (and a security analysis of all aspects, we focus on control plane may also notify the packet source, e.g., using ICMP). security, since it is in their control plane architecture that CNs Due to these differences, L2 and L3 networks face different and SDN differ primarily. sets of security risks. Thus, we divide CN into L2 and L3, and We provide a framework consisting of essential network discuss separately in Sections IV and V. properties required by production networks. Using this, we study how those properties are achieved by SDN and CN respectively, and analyze the security risks and mitigations B. Software Defined Networking accordingly. Our finding is that the security threats faced by SDN involves one or more SDN controllers, each control- SDN and CN are comparable in an apples-to-apples compar- ling a number of network elements within its domain via ison, i.e., if despite the architectural differences between the standard protocols (such as OpenFlow). Each controller may two types of networks, they are tasked to provide the same run in multiple instances, each further managing a subset of network properties under the same threat model. However, network elements and backing-up other instances to provide defenses vary in that filtering in the network edge is effective both scalability and high availability. SDN controller instances in CN, but less so in SDN. Further, consistency checks, which also communicate with each other within the same domain, or are required by both networks to defeat inside attacks, can may be federated with controllers in other domains, e.g., to be implemented inside each CN device, but require a highly form a complete view of the network (see Fig. 1). Further, 3

between the pair. Further, it should be possible to use all links to transmit data, instead of only one. • Device Redundancy. This property is often referred to as high availability. A network consisting of two or more devices should remain fully available in the event of the failure of any single device. • Scalability. As a network grows and becomes large, it should remain functional and manageable. Network de- sign should allow growth without significant management Fig. 1: Generic SDN architecture overhead. Besides these properties, networks typically must provide other properties such as QoS and certain services to address there may be a hierarchy of SDN controllers for scalability or user needs. For example, a carrier network may need to multiple layer control. support virtual private networks (VPN) [14] to its customers, An SDN controller usually provides a web based portal e.g., using MPLS. Control protocols required to provide those and APIs for network operators and applications to control network services usually introduce their own security risks. network elements respectively. The interfaces between SDN However, discussion of such control protocols is beyond our controller and network elements are labelled southbound, present scope. and those between SDN controller and SDN applications are northbound. Scope of our Analysis: An SDN controller is an entity B. Threat Models that does not exist in a CN, thus its security requires special We consider two threat models: END-HOST, where only end attention. As noted earlier, we focus on control plane security hosts are assumed vulnerable to compromise or under attacker herein; this area has not received much attention [11], motivat- control, and ALL-ELEMENT, where all network elements are ing us to consider it and herein give a framework for directly assumed vulnerable. Under the END-HOST threat model, SDN comparing control plane security issues with those facing CNs. controllers and the network devices themselves are considered We see analysis of control plane security as an important step trustworthy but end hosts attached to the network are not. contributing to a broad security analysis of SDN, which should In contrast, neither SDN controllers nor network devices are include all SDN components (see Fig.1). considered trustworthy under the ALL-ELEMENT threat model; in other words, attacks could be from hosts, as well as network III.FRAMEWORK devices and SDN controllers. We use these two threat models in Section VII to compare the security risks and defenses of Our framework consists of a set of properties required SDN and CN. by typical production networks, and two threat models (see below) to be applied to both SDN and CNs. IV. L2 NETWORKS

A. Network Properties We now discuss how L2 networks satisfy the network properties of our framework, and analyze security risks and Production networks must provide properties such as loop mitigations associated with each property. free forwarding to allow entities attached to the network to communicate. Here we outline five primary such properties. While these are not specifically related to security, they are A. Basic Forwarding important in security analysis as each may require its own To provide network connectivity, an L2 device uses MAC control functions and introduce unique security risks. Since learning to build its MAC table, which maps switch ports to multiple properties may be provided by a common control MAC addresses, and possibly other information such as VLAN protocol, each property does not necessarily introduce new tags. When a switch receives a frame from one of its physical security risks. ports, it adds a new (or updates an existing) entry in the • Basic Forwarding. A network consisting of a single MAC table mapping the frame’s source MAC address to the switch must establish forwarding information to allow receiving port. Multiple MAC addresses can be associated with attached entities to communicate. a same port number. To forward a frame, a switch looks up its • Loop Free Forwarding. A network consisting of mul- destination MAC address in the table, and forwards the frame tiple devices and links which form physical loops must through the corresponding port. If no entry is found, the switch ensure there is no forwarding loop among network device floods the frame to all ports except the receiving one. The forwarding tables. intended destination, upon receiving the flooded frame, sends • Link Redundancy. If there are multiple links between a response frame enabling the switch to learn the mapping a pair of network devices, the network topology should between receiving port and responding source MAC address. remain unchanged in the event that one or several of such The missing mapping entry is added into the table for future links go down as long as there is one functioning link use. Figure 2 illustrates a MAC table for a single switch. 4

Port # MAC 1 A 2 B 3 C

Fig. 2: MAC table for switch with three devices connected.

Fig. 3: Operation of the spanning tree protocol. 1) Attacks: An L2 MAC table is learned from data plane (including end-user) packets. Thus, it is subject to MAC attacks [15]. A malicious host can send a packet with a path costs, and accordingly compute a spanning tree. A root falsified source MAC address to poison a switch’s MAC table. switch is first elected, typically that with the smallest identifier. Two known attack strategies are as follows. In MAC spoofing, Each non-root switch then determines the root port as the port an attacker sends frames with spoofed source MAC addresses with least-cost path leading to the root switch. Similarly, for matching those of target (victim) hosts, thereby hijacking each link in the network, the end port closer to the least- traffic destined to those victims. If a victim is actively sending cost path is called the designated port. All remaining ports packets to switches, the poisoned MAC table will alternate are called blocked ports. A spanning tree then consists of all between correct and falsified states. A more effective attack, the network switches (one as root) and some network links. MAC Flooding, sends a large number of garbage frames with The links not in the spanning tree are still used to exchange randomly generated source and destination MAC addresses control plane traffic (e.g., BPDU) but only in one direction to to fill up the MAC table. Once the table is full with non- be loop-free. 1) Attacks: existent MAC addresses, legitimate frames will not match any STP uses BPDU, with a multicast destination forwarding entry, resulting in flooding of frames to switch MAC address, to exchange topology information to elect a ports including those connecting to the attacker who can thus root bridge and to establish a spanning tree, assuming all eavesdrop or even hijack virtually all traffic. Any device, BPDUs are trustworthy. Due to the lack of security protection e.g., including end-user devices (outsiders) and network devices ( no default robust authentication [19]), STP is subject to BDPU tampering (insiders), can similarly manipulate a MAC table. BPDU spoofing attacks [20], as well as and BDPU flooding. For example, an attacker could send a spoofed Note that this differs from ARP spoofing (Section V-A1). BPDU packet with a low priority and small MAC address MAC attacks poison the MAC table of a switch using any to result in the lowest bridge identifier among all switches, packet with a spoofed source MAC address. ARP spoofing thus winning the root bridge election. Being the root bridge, poisons the ARP table of a host or router using only ARP- the attacker receives virtually all network traffic within the related packets (e.g., ARP response or gratuitous ARP [16]). STP domain. BDPU tampering could lead to the calculation 2) Defenses: MAC attacks can be mitigated by preventing of incorrect network topology. BDPU flooding could force untrusted devices such as hosts from sending packets with switches to continuously re-calculate topology, resulting in spoofed MAC addresses. One such mitigation mechanism, service disruption. port security [17], allows a switch to bind a port to one or 2) Defenses: An administrator may intervene in root place- several MAC addresses (MAC binding). Port security usually ment, e.g., manually specifying the location of the root switch; also allows limits on the number of MAC addresses to be Cisco’s root guard command [21] facilitates this. Likewise, associated with a switch port (MAC limiting). MAC bind- BPDU filter prevents a host from participating in STP by ing can prevent MAC spoofing, but is static and typically filtering BPDUs in NNIs. requires manual configuration—possibly introducing configu- ration overhead and misconfigurations. MAC limiting appears more practical as it can mitigate MAC flooding attacks and C. Link Redundancy requires only simple configuration, but alone, does not prevent While STP can prevent forwarding loops, it uses a single MAC spoofing. link between a pair of switches even when redundant links exist, resulting in underutilized network bandwidth or even packet loss in the event of link failures. To improve band- B. Loop Free Forwarding width usage and redundancy, L2 protocols may support link In L2 networks with multiple switches and looping links aggregation, grouping multiple links into one virtual link. A (see Fig. 3), forwarding loops can occur when unknown L2 Link Aggregation Group (or LAG), viewed as a single link, frames are flooded and no frame TTL field limits how can be included in a spanning tree, allowing their collective many times a frame is forwarded. Thus for loop prevention, an use for link protection and load balancing. L2 control protocol like the Spanning Tree Protocol (STP) [18] Link aggregation can be configured manually, or estab- is needed. Switch ports connecting other switches and end user lished dynamically by the Link Aggregation Control Protocol hosts are often referred to as Network-to-Network Interfaces (LACP) [22]. LACP transmits LACP Data Unit (LACPDU) (NNIs) and User-to-Network Interfaces (UNIs) respectively. to inform the other end (partner) of its state and its under- In STP, switches exchange Bridge Protocol Data Units standing of partner state. Based on LACPDU, a LAG can be (BPDUs) carrying information about switch identifiers and dynamically created and updated. 5

1) Attacks: A switch running LACP sends to, and receives within the fabric. Each customer edge switch learns all local from its partner, LACPDUs to maintain link aggregation. MAC addresses and some remote MAC addresses; S5 and LACPDUs are typically sent over a point-to-point link, making S6 learn no end-user MAC address. Example network fabric man-in-the-middle tampering difficult, but remaining vulner- implementations are Transparent Interconnection of Lots of able to LACP spoofing attacks because (1) LACP is usually Links (TRILL) [25] and Shortest Path Bridging (SPB)—see implemented in the CPU (vs. data plane), allowing a switch IEEE 802.1aq [26]. to receive LACPDUs from remote entities; and (2) it has no security protection (e.g., peer authentication) [22]. Thus, an external entity (e.g., a host) may send forged LACPDUs to a switch to influence the state of its link aggregation, e.g., to cause link instability or even denial of service. 2) Defenses: Implementing LACP in the data plane may ensure that LACPDUs are only received from a given port and never leave that port, mitigating forged LACPDU injection.

D. Device Redundancy Device level redundancy ensures that the failure of one or Fig. 4: L2 network with network fabric more devices does not result in loss of network connectivity; e.g., Cisco Switch Stacking (CSS) [23] allows a number of 1) Attacks: VLANs are subject to VLAN hopping attacks— switches, usually of the same model, to form a redundancy traffic from one VLAN can be received by another, allowing group. Within a redundancy group, a master is elected dynam- L2 attacks against one VLAN to be launched from a differ- ically, e.g., based on bridge identifiers and/or priority values. ent VLAN. One attack strategy is VLAN double encapsula- If a master fails, a new master is elected to ensure ongoing tion [27]. Another attack strategy is to exploit switch miscon- network connectivity. figuration or VLAN auto negotiation protocols to impersonate 1) Attacks: Electing a switch master by dynamic election another switch. In this way, a malicious host can pretend is subject to election spoofing attacks. An adversary may send to be a switch and the link between the host and a switch falsified messages to become the master. would appear to be a trunk link, allowing the host to send 2) Defenses: The election process should use cryptographic and receive packets with any VLAN tag. Thus, the network methods for origin authentication and message integrity to isolation provided by VLAN is completely broken. exclude unauthorized entities, e.g., a host, from joining. There are multiple means for implementing L2 network fab- Switches within a redundancy group should be connected ric, e.g., TRILL and SPB, all of which use a , via dedicated ports, and the election process should only run typically IS-IS, to automatically discover and maintain the in those ports. Any election message received from other ports network topology inside the fabric. IS-IS is subject to several should be dropped. attacks, such as PDU spoofing, and DoS due to replaying hello messages [28], [25]. E. Scalability 2) Defenses: VLAN-related vulnerabilities may be miti- gated by disabling VLAN auto-negotiation, and configuring Scalability is an important issue with STP and MAC learn- VLAN filtering in UNIs. Note that a packet may contain more ing; every switch must learn all MAC addresses and identifiers than one VLAN tag; all such tags should be filtered if present in the network. While easy in small networks, challenges arise in packets received from UNIs. in larger networks such as a large enterprise data center with IS-IS vulnerabilities may be mitigated by enabling addi- many physical servers and virtual machines, each with several tional IS-IS cryptographic authentication [24], and ignoring MAC addresses. The number of MAC addresses may exceed unauthenticated PDUs. Further, IS-IS messages received from the MAC table capacity of a switch, and increase the delay of UNIs should be dropped. MAC table look ups. To improve scalability, VLANs may be used to divide a V. L3 NETWORKS network into segments, each an isolated L2 broadcast domain. MAC learning then occurs within a VLAN, reducing MAC A. Basic Forwarding table size. L3 routing is used to connect VLANs. L3 devices, namely routers, perform two main tasks: route Another method to improve scalability is to group core learning and packet forwarding [29]. In simple networks where switches into domains called network fabric. In Fig. 4, two or more subnets are connected by a single router, the switches S5, S6 and ports on switches S1–S4 (customer edge is usually manually configured without running switches) form a fabric and run an L2 routing protocol, any routing protocol. A packet sent from one subnet to another e.g., Intermediate System to Intermediate System (IS-IS) [24], has a destination MAC address of the default gateway of to learn the fabric network topology. In Fig. 4, a frame the source host, thus always arrives at the router via one from A to B is encapsulated with an outer header with S1 of its interfaces. A router, upon receiving a packet from one as source address and S4 as destination address. The outer subnet destined to another, performs three actions: removing header usually introduces TTL to prevent indefinite forwarding the packet’s L2 header, looking up the routing table for the 6 next hop, and encapsulating the packet with a new L2 header same) destination address to be routed to their next hops over for forwarding. A next hop in a routing table could be a multiple links of equal cost. Multiple path routing is a local local interface or an IP address. Routing table looking up is decision made within a single router, requiring no interaction recursive until a next hop is a local interface. In this case, it with adjacent or remote routers. Thus, it neither requires further looks up the ARP table associated with that interface control protocols nor appears to introduce new security risks. for the MAC address of the packet’s next hop IP address. If not found, the router uses ARP to obtain the MAC address. D. Device Redundancy 1) Attacks: Since an L3 router uses ARP to resolve the MAC address of a packet’s destination IP, it is subject to ARP If a gateway router goes down, traffic across different cache poisoning attacks [30]. subnets will be unable to reach their destinations, result- 2) Defenses: There are several approaches to address ARP ing in service outage. To improve availability, two or more cache poisoning. Dynamic ARP Inspection (DAI) [31] is a routers often share a common virtual IP address and run a mechanism by which ARP responses are checked against (1) control protocol such as Virtual Router Redundancy Protocol a central DB that binds IP to MAC addresses (this may be (VRRP) [38] to dynamically elect a master as the default populated by listening to DHCP requests and responses in the gateway of a subnet. When a master fails, VRRP dynamically network); or (2) static pre-configured ARP entries. selects another router as the master. VRRP runs over IP with Cryptographic measures can also be used to distribute pre- an IP multicast address as its destination. populated IP-to-MAC-address mapping attestations, such as 1) Attacks: Protocols for high availability routing such as Ticket-based ARP (TARP) [32]. A voting-based protocol, VRRP [38] are subject to spoofing attacks. For example, requiring network consensus before updating ARP entries, has VRRPv3 [38] does not include authentication of VRRP mes- also been proposed by Nam et al. [33]. sages, thus an attacker may send a spoofed VRRP message with the highest priority to become the master of the router B. Loop Free Forwarding cluster. Such an attacker will receive all traffic to and from In networks with multiple routers and redundant physical a subnet. While the previous versions of VRRP [39], [40] do paths, a routing protocol is often used to advertise and learn include message authentication, it was removed from version 3 routing information. Routing protocols are either link state because it could be exploited to result in (malicious) election (e.g., IS-IS, OSPF—Open Shortest Path First [1]) or distance of multiple masters [38]. It is also argued [38] that other vector (e.g., RIP—Routing Information Protocol [34]). attacks, such as ARP spoofing, exist which could result in the 1) Attacks: While attack methods vary among different same attack effect (e.g., becoming the gateway of end hosts). routing protocols, a common attack objective is routing table 2) Defenses: Dropping VRRP messages that arrive from a poisoning—to pollute network topology information and de- host-connected port prevents an attacker sitting at the network rived forwarding tables by advertising or injecting false routes edge from spoofing such messages [38]. Additionally, VRRP through announcements. For example, a malicious router could message includes a TTL set to 255 by default. Upon receiving advertise a malicious Link State Advertisement (LSA) (e.g., a VRRP message, a router validates the TTL field and discards with a false link cost) to influence other routers’ calculation a VRRP message whose TTL is not equal to 255. This limits of routing tables. Such attack is easy to launch but has limited the ability of remote attackers (e.g., outside of a network) from impact since a neighboring router will eventually advertise a spoofing VRRP packets. correct LSA with a fresher sequence number, resulting in the removal of the falsified LSA from being used for routing table E. Scalability calculation. More advanced attacks (cf. [35]) can be launched Within an AS, scalability in L3 networks is provided to increase the effectiveness of routing table poisoning. using routing protocols, supported by hierarchical routing. For 2) Defenses: To mitigate routing table poisoning attacks, example, OSPF allows a large network to be divided into sub- three levels of defenses should be considered. First, routing domains (OSPF areas). Routers within an OSPF area need protocols should only run in NNIs (routing updates received only maintain network topology information of the area they from UNIs should be dropped). This is to prevent an outsider belong to. A backbone OSPF area is used to connect all other (e.g., a host) from participating in routing protocol commu- areas. Thus routing advertisements are limited to within an nication. Second, message origin authentication should be area, reducing the size of routing databases. Between ASes, implemented to prevent a malicious (compromised, previously BGP is used to advertise network reachability information. legitimate) router from impersonating another router. Third, Security risks of routing protocols, and their mitigation, are routing updates should be corroborated when being used to discussed in Section V-B above. calculate routing tables. For example, a link cost advertised by one router should be corroborated with the link cost advertised by the other router on the same link. VI.SOFTWARE DEFINED NETWORKS In SDN, the control plane of a device is implemented in C. Link Redundancy an external entity, as opposed to within the device in a CN. L3 networks usually provide link redundancy through rout- This architectural difference impacts how network properties ing strategies like Equal Cost Multiple Path (ECMP) rout- from our framework are provided. More specifically, in CNs, a ing [36], [37], which allow packets to a common (i.e., the property achieved in the data plane is also considered achieved 7 in the control plane. This does not hold in SDN due to the 2) Defenses: MAC binding (discussed earlier) can mitigate separation of planes. Thus for SDN, we discuss separately how MAC spoofing attacks, albeit requiring static configuration. each network property is provided for the data and control MAC limiting (also discussed earlier) can mitigate MAC planes. We use OpenFlow switches [41] as an example in our flooding attacks, but cannot prevent MAC spoofing. discussion. To mitigate VLAN spoofing, an OpenFlow switch can designate its ports as UNIs and NNIs, and remove VLAN tags in packets received from UNIs. Note since there is only A. Basic Forwarding one switch, NNIs will not receive any traffic but the defense still works in the case of multiple switches. If the port an Here we consider how a single SDN controller, controlling SDN controller connects to cannot be determined, and the a single OpenFlow switch connected with a number of hosts, controller needs to tag traffic for some reason, this defense learns forwarding information. We assume that the controller becomes problematic. has a direct connection with the switch, thus no need to learn To mitigate IP spoofing, OpenFlow controllers could avoid about this control connection. the learning of IP based forwarding rules from the data plane To configure the switch to provide connectivity among by planning IP address assignment and configuring flow tables connected hosts, the controller must learn the mapping be- with IP prefixes, acknowledging that prefix matching might be tween hosts (e.g., their MAC addresses) and switch ports. To slower than precise matching. do so, the controller may configure the switch to forward ARP requests and unknown packets, to the controller. An OpenFlow switch forwards such a packet to a controller using B. Loop Free Forwarding the PACKET_IN message, which includes the switch port Here we consider a single controller controlling a number from which the packet is received. The PACKET_IN message of OpenFlow switches (Fig. 5). provides the controller with information about which hosts connect to which switch ports. If a destination host is also unknown, the controller instructs the switch to flood this packet using a PACKET_OUT message. The response from the destination will also be sent to the controller, allowing the controller to learn the location of both hosts. As a result, the controller can set up flow rules for the pair to communicate. This learning process by a controller, called Host Tracking Service, is equivalent in principle to MAC learning by an L2 switch. It demonstrates how a conventional L2 control Fig. 5: An SDN with multiple switches. function is taken out from a switch, and implemented inside a controller. One difference is that MAC learning, being L2, only To configure forwarding tables on OpenFlow switches, the learns MAC addresses (possibly VLAN IDs). OpenFlow can controller must first learn the network topology using a control learn both L2 addresses (MAC address and possibly VLAN protocol such as Link Layer Discovery Protocol (LLDP) [42]. ID) and IP addresses. We call this process Host Learning, for There are two scenarios to consider. (1) There exists a better comparability with MAC learning. dedicated control network (e.g., a direct link between each 1) Attacks: Host learning by a controller, being based on switch and the controller) such that each OpenFlow switch can information provided by a switch and hosts, is thus subject communicate with the controller, e.g., by establishing a TLS to spoofing attacks (MAC spoofing, IP spoofing, VLAN tag connection with the controller. (2) No such dedicated control spoofing), since a dishonest host or switch can forge such network exists, and the controller must discover all switches information inside a packet. Often called Host Location Hi- and set up proper flow tables so they can begin communicating jacking [11], here we call it Host Profile Poisoning for better with the controller to receive further flows. comparability with MAC table poisoning. Host learning is In case (1), the switch initiates communication with the also subject to flooding attacks—an attacker may generate a controller during a boot-up process. The controller thus obtains large number of packets with arbitrary MAC and IP addresses, information about individual switches under its control without resulting in creating (1) a large number of host profiles inside running any control protocol. However, the controller does the controller, (2) a large number of messages sent to the not know the connectivity between switches, i.e., the network controller, and (3) a large number of flow tables inside a topology. In case (2), the controller does not know which switch. Thus, it is possible to cause DoS or packet interception, switches are under its control and must use a protocol to as unknown packets are also flooded. For example, if the discover them (cf. Section 3.3 of [43]). Here we consider memory allocated for host profiles in a controller is full, case (1), which is also commonly studied in other papers an existing host profile (e.g., the oldest) will be overwritten, (e.g., [11]). resulting in the flooding of a new packet destined to that host. For topology discovery, an OpenFlow controller sends an This resembles an L2 MAC flooding attack. LLDP packet (inside a PACKET_OUT message with output Host-learning messages between switches and the controller port set to ALL) to every switch under its control. A switch may be exploited to cause a message forwarding loop. receiving such a message floods to all of its ports. A switch 8 receiving an LLDP packet from a neighbor switch must mitigate spoofing by hosts or switches. Consistency checks of forward it to the controller, e.g., due to the absence of flow OpenFlow events, if implemented by a controller, can detect rules for processing such a packet, triggering a default rule for falsified events by a host or switch. forwarding unknown packets to the controller, or the existence of an explicit flow rule for forwarding LLDP packets to the D. Device Redundancy controller. This PACKET_IN message to the controller also includes the port number that receives the LLDP packet. For redundancy among a group of OpenFlow switches, a Thus, the controller discovers a link between two switches, controller could monitor the status of each switch, and update and subsequently all links between all switches, allowing switch configurations and flow tables accordingly to allow traf- completion of a complete network topology. fic go through a new switch. For redundancy among a group of 1) Attacks: An OpenFlow controller, discovering network OpenFlow controllers, OpenFlow switches are configured with topology by learning from LLDP packets sent by switches, is multiple controllers. When the current controller goes down, thus subject to LLDP spoofing attacks. An attacker, a switch a switch establishes a connection with a next controller. A or a host, may send falsified LLDP packets to a controller to distributed election protocol can also be used to elect a master contaminate computation of network topology. For example, a controller, often for SDN applications. When the master fails, host may send falsified LLDP packets to insert itself and create another slave replica becomes the master. The result of such non-existent links into the topology. This is a Link Fabrication master election can also be synchronized into the switch’s Attack [11]. LLDP flooding may also force a controller to configuration of controller preference to ensure all switches continuously re-calculate network topology, disrupting service. are also controlled by the same master. For example, ONOS 2) Defenses: To mitigate LLDP spoofing from hosts, an uses ZooKeeper [44], a tool implementing an election protocol OpenFlow switch can designate its ports as NNIs and UNIs, for distributed coordination and election. OpenDaylight uses and reject LLDP packets received from UNIs. This defense Akka and Raft [45] for master election. would work if the port an SDN controller connects to is Further, controller states must also remain synchronized prior known. Otherwise, it will be problematic, since LLDP among controllers. For example, ONOS implements an even- packets to and from the controller might also be filtered. tual consistency model [46], in which a background process Message authenticity and integrity, if implemented by LLDP, updates written objects in all replicas periodically. can effectively mitigate LLDP spoofing by hosts. Consistency 1) Attacks: Beside spoofing attacks against device redun- checks of LLDP packets by a controller can mitigate LLDP dancy (cf. Section VI-C1), an election protocol used by con- spoofing by a host or switch. See Section VII-B for discussion trollers to achieve redundancy is subject to spoofing attacks. on how to mitigate attacks from an SDN controller. First, a non-controller entity (e.g., a host) may join the election process to cause undesirable results. Second, a misbehaving controller may be able to manipulate the master election C. Link Redundancy process (e.g., by manually picking the smallest allowable time We note two options for implementing link aggregation for before candidacy election) to become the next master. two OpenFlow switches with multiple physical links between For example, Akka used by OpenDaylight for controller them. (1) The OpenFlow switches, just as an L2 switch, run clustering employs no default security mechanisms [47], e.g., LACP between them to create a virtual link presented to the for integrity protection of inter-cluster messages. In ONOS, OpenFlow controller during link discovery. (2) An OpenFlow we note no security mechanisms are used to ensure the controller discovers shared links between two switches and integrity of update information. A timestamp may be forged, uses a group table (available in OpenFlow 1.3 and later) to and if replicas are not properly authenticated, an attacker may distribute traffic among a set of switch ports within a group. impersonate one of them and manipulate stored objects. Here the OpenFlow controller must monitor the status of all 2) Defense: If the ports SDN controllers connect to are links and update a corresponding group table upon detection of prior known, election messages if received from other UNIs a link state change. Note, if an OpenFlow switch shares multi- can be be dropped. Otherwise, message origin authentication ple links with a conventional switch or host running LACP, the and integrity need to be implemented for the election protocol OpenFlow switch can be configured to pass LACPDUs from and state replication, e.g., using mutually authenticated TLS the conventional switch to the controller to be processed. We among controllers, to prevent an outsider from participating consider the second case, since a switch implementing LACP in or tampering with the election process. Additional mecha- is considered hybrid rather than pure SDN switch. nisms, e.g., information corroboration, are also needed to miti- 1) Attacks: An SDN controller is subject to message spoof- gate attacks by a misbehaving controller (e.g., compromised by ing attacks. For example, an attacker may send false link up an attacker) participating in election with forged information. and down events to a controller, to manipulate the state of a link group. Such spoofing may be from a switch or host. 2) Defenses: Link state events should only be sent from E. Scalability OpenFlow switches, not hosts. If the port an SDN controller Unlike CNs, which run control protocols such as IS-IS to connects to is prior known, such events if received from implement a network fabric for scalability and other benefits, other UNIs should be dropped. Message authenticity and OpenFlow controllers can configure flow rules in a way to integrity, if implemented by OpenFlow events, can effectively improve scalability. For example, an OpenFlow controller may 9 configure: (1) encapsulation rules at ingress switches, (2) for- respectively, are similar in nature but differ in details. For warding rules based on outer headers at intermediate switches, example, the attack vector of MAC table poisoning is MAC and (3) decapsulation rules at egress switches. In this way, address spoofing, while both MAC and IP addresses could be the controller creates tunnels shared by many individual flows spoofed in host profile poisoning. Since an L3 router with and the intermediate switches only need to be configured with manually configured routing table does not learn forwarding the rules related to tunnels, not individual flow, significantly information from the data plane, it is not subject to IP spoofing reducing the number of rules required. Edge switches (ingress attacks. Another subtle difference is related to the size limit and egress) also only need to be configured with the rules of the MAC table and memory allocated to host profiles when relevant to the end hosts connected to them. In this way, flooding is employed. MAC table size could vary from a few OpenFlow controllers may create a network fabric without thousand to a few million entries, depending on the vendor need of running additional control protocols. and model of a switch; an SDN controller usually has larger Scalability in the SDN control plane can be provided by memory and is thus less vulnerable to such flooding. dividing a network into areas or domains, each controlled A CN is also less vulnerable to DoS attack (than an SDN by one or more controllers. Controllers could be peer-to- network) because the SDN controller itself is a new attack peer or hierarchical. In a peer-to-peer model, area controllers surface, as is the link between a switch and an SDN controller, synchronize states among themselves so that each maintains a which could become a new bottleneck [51]. consistent global view of the network. In a hierarchical model, Defenses for MAC poisoning and host profile poisoning lower level controllers maintain a subset of the global view; are similar. For example, port security could be used in only a top level controller has the global view. a relatively static network to bind switch ports with MAC Distributed controllers need to communicate with each other addresses for both CN and SDN. In a dynamic network where to exchange reachability and state information. There is cur- MAC addresses might change often (e.g., in a data center with rently no standard defined for inter controller communication. server virtualization), static binding is problematic for both A distributed protocol is usually needed. For example, BGP paradigms. is suggested to be the message exchange protocol among We do not discuss the ALL-ELEMENT threat model here SDN controllers [48]. ONOS relies on a distributed databases, since this simple network consists of only one switch and e.g., Cassandra [49] and Distributed Hash Tables (DHTs), for one controller. If the switch or the controller is malicious, the distributing network topology and state information among network would be completely compromised. controllers. 1) Attacks: Vulnerabilities may arise from distributed com- munication in a peer-to-peer model, or from controller-to- B. Loop Free Forwarding controller communication in a hierarchical model. For ex- Conventional L2 and L3 networks use STP and routing ample, if BGP would to be used to exchange information protocols (such as OSPF) for loop free forwarding. SDN uses among controllers, vulnerabilities in BGP could be exploited LLDP for topology discovery. to attack SDN. Vulnerabilities could also arise from the In the END-HOST threat model, an end host can attack both distributed database if it would to be used for synchronization CNs and SDN, resulting in incorrect forwarding tables by among controllers. For example, without proper configuration, exploiting protocols’ vulnerabilities. While the impact from Cassandra may be vulnerable to query injection attacks [50]. such attacks appears comparable, attack techniques will differ 2) Defense: As with any other distributed protocol such since the protocols exploited differ. as an election protocol, communication among controllers, Defenses for CNs can rely on UNI filtering. However, that either peer-to-peer or hierarchical, must provide data origin works in SDN only if the attach points of the SDN controller authentication and message integrity to prevent outsiders from are prior known and remain static. Otherwise, UNI filtering participating in or tampering with the communication. Further, is ineffective and crypto mechanisms are required in SDN to an additional mechanism such as information corroboration prevent outsiders from participating in topology discovery. is needed to mitigate misbehavior by legitimate controllers. In the ALL-ELEMENT threat model, the network (e.g., L2 or Note, unlike in a simple network where the ports to which L3 device and SDN controller itself) could be malicious; risks controllers connect to can be prior known, it is hard to define a and defenses then appear similar for CNs and SDN, albeit communication boundary for distributed controllers to prevent with subtle differences. From a risk perspective, a malicious outsiders from participating in the election process. CN device or an SDN controller may be able to compromise the entire network (e.g., influencing the routing table of any VII.COMPARISON device within the network), acknowledging that a malicious Table I summarizes the control functions, attacks and de- SDN controller appears capable of causing more damage. fenses noted above. Here we compare security risks and From a defense perspective, data origin authentication, mes- defenses of CN and SDN control planes in both the END- sage integrity, and consistency checks are all required by both HOST and ALL-ELEMENT threat models of Section III. CNs and the SDN control plane to counter insider attacks (e.g., to detect and discard false information received from other A. Basic Forwarding legitimate nodes). In CNs, consistency checks can be done MAC table poisoning (against a CN switch) and host profile inside individual devices. In SDN, consistency checks should poisoning (against an SDN controller), the two major threats be done by both switches and controller. First, an OpenFlow 10

CONVENTIONAL NETWORKS NETWORK PROPERTIES SDN LAYER-2 LAYER-3 Control Functions Static routes, Address Basic Forwarding MAC Learning Host Location Learning Resolution Protocol (ARP) Link Layer Discovery Protocol Loop Free Forwarding Spanning Tree Protocol (STP) Routing Protocols (OSPF, RIP) (LLDP) Equal Cost Multiple Path Link Redundancy Link Aggregation (LACP) Controller (ECMP) Virtual Router Redundancy Device Redundancy Switch Stacking Election Protocol Protocol (VRRP) VLAN, Network fabric Routing Protocols (OSPF, Scalability BGP, distributed DB (TRILL, VxLAN) IS-IS) Attacks MAC table poisoning (MAC Basic Forwarding spoofing and MAC ARP table poisoning [30] Host profile poisoning [11] flooding) [15]

BPDU spoofing, tampering and Routing advertisement Loop Free Forwarding Link fabrication [11] flooding [19], [20] spoofing [35] Spoofed link-manipulation Link Redundancy LACPDU spoofing [22] Link DoSing [36] messages (e.g., [22]) Device Redundancy Stacking spoofing VRRP message spoofing [38] Master election manipulation [47]

VLAN hopping [27]; Switch Routing advertisement Scalability impersonation [27]; Routing BGP attacks, distributed DB attacks spoofing [35] advertisement spoofing Defences Dynamic ARP inspection Port Security [17]; MAC (DAI) [31]; Ticket-based ARP MAC binding, host location Basic Forwarding binding and limiting (TARP) [32]; Voting-based validation [11] protocols [33] Root Guard [21]; BPDU UNI filtering of LLDP UNI filtering and consistency Loop Free Forwarding filtering (prevent a host from packets [11], Authentication of check of routing advertisements masquerading as a switch) LLDP and OFDP packets

LACP source port UNI filtering of control messages, Link Redundancy authentication (data plane N/A mutual authentication of control implementation) channel

Run master election process on UNI filtering of VRRP Authenticity and integrity in master Device Redundancy dedicated ports, authenticate messages, TTL checks election devices involved in the process

VLAN filtering on UNIs, and Authenticity and integrity in UNI filtering and consistency Scalability disabling VLAN auto communication among SDN check of routing advertisements negotiation controllers

TABLE I: A summary comparison between Conventional Networks (CNs) and SDN. switch should validate LLDP messages to ensure that it does discovery and route calculation. It appears difficult to contain not contain false information (e.g., a faked link between the damage from a misbehaving controller if it is monolithic. sender and the receiver). Second, an SDN controller should Thus, a controller is better divided into small, independent validate LLDP messages to rule out faked nodes and faked units to minimize the risk from a misbehaving control unit links. However, new defenses are needed to mitigate or reduce and facilitate cross-checking the behavior of each unit. risks of a controller misbehaving in doing network topology If adopted for SDN controllers, a micro service architec- 11 ture [52] can serve this purpose. As an example of such an E. Scalability architecture, the control function providing loop free forward- For scalability in CN and SDN, respectively, routing pro- ing can be implemented in three micro services, the first col- tocols such as OSPF and BGP can be used. They are subject lecting and validating LLDP messages, the second performing to similar attacks. Regarding defenses, the network boundary topology and route calculation, the third updating flow rules can be defined for CN to discard control messages from end in switches. Each micro service runs multiple instances, each hosts in the END-HOST threat model. In SDN this is less of which cross-checks requests and responses from multiple effective since SDN controllers often run insider servers that other service instances. To cross-check behavior of flow rule connect to the edge of a network; SDN controllers require updating services, other types of services such as real time crypto mechanisms to prevent outsiders from participating in flow validation (e.g., VeriFlow [53]) can be implemented. the routing protocols. In the ALL-ELEMENT threat model, to detect inside attacks, CNs and SDNs require message origin authentication and C. Link Redundancy message integrity, as well as consistency checks. A CN uses LACP for link aggregation; an SDN controller Discussion. We save summary observations for Section IX. can monitor link state changes, and update link groups in a switch accordingly. Both are vulnerable to message spoofing VIII.RELATED WORK and tampering attacks. In the END-HOST threat model, rules can be configured on To complement the preceding comparative analysis, we dis- UNIs to filter LACPDU packets for CNs. If the attach points of cuss relevant literature for security in SDN and counterparts in the SDN controller are prior known and static, rules can also be conventional networks (CNs). Broad surveys of SDN security configured to filter link up/down events for SDN. Otherwise, are available elsewhere (e.g., [54], [55], [56]). crypto mechanisms are required by SDN to prevent outsiders from sending link up/down events to the SDN controller. A. Control Plane In the ALL-ELEMENT threat model, message origin au- 1) Security-oriented controllers: As one of the first network thentication and message integrity can be used to address a operating systems, NOX [57] provides greater flexibility to the legitimate switch spoofing a link group member. If a link group management plane. Despite lacking the ability to undertake member itself misbehaves, it falls short for the other member most network functionalities by itself, NOX aims to provide to maintain a correct link group state since the misbehaving sufficient APIs to ease the fulfilment of such functions. Porras end can manipulate packets (e.g., selectively dropping them) et al. [12] proposed SE-Floodlight, a security enhanced system to achieve the same end. based on Floodlight [58]. Network administrators manually In SDN, message origin authentication and integrity, e.g., by assign roles to applications, while SE-Floodlight mediates mutually authenticated TLS, can mitigate a legitimate switch all OpenFlow operations to enforce a role-based permission spoofing another switch by sending the controller faked link model. SE-Floodlight also provides authentication and flow up and down events. Further, the function controlling link conflict resolution services (based on FortNOX [59]), both redundancy can be implemented in micro services, several occurring on the system level independent of the applications, running simultaneously to cross-check each others’ behavior. to enforce privilege separation (cf. also PermOF [60]). Shin et al. [61] identified several reasons for controller weaknesses, including lack of (1) resource control, (2) application sepa- D. Device Redundancy ration, (3) application authentication and authorship, and (4) access control, and presented Rosemary as a (non-monolithic) Both L2 and L3 use an election protocol to exchange micro-NOS to address these shortcomings. messages among a device group to elect a master, thus being 2) Control plane security extensions and APIs: Many se- subject to spoofing attacks. In SDN, an election need not be curity extensions to SDN controllers have been proposed to implemented in switches for data plane redundancy, but is monitor and detect suspicious network behavior. VeriFlow [53] required for controller redundancy. and FlowVisor [62] are two examples. The former provides In the END-HOST threat model, rules can be configured on real time checking and verification of forwarding behavior, NNIs to filter control messages from end hosts. If the attach whereas the latter enables network slicing such that each slice points of the SDN controller are prior known and static, such is typically under a different control domain, thus providing an approach can also be employed for SDN. Otherwise SDN logical separation of multiple controller instances (cf. [63]). requires message origin authentication and message integrity For such multi-slice networks, FlowChecker [64] is an exam- to counter outsider attacks. ple of a tool that checks consistency across multiple slices. In the ALL-ELEMENT threat model, message origin authen- In contrast to VeriFlow, FlowGuard [65] is a firewall for tication and message integrity are required to prevent one SDNs that specifically focuses on conflict resolution. Note that legitimate device from impersonating another. An additional SE-Floodlight resolves conflicts only between flows, whereas mechanism appears required to detect a legitimate device from FlowGuard resolves conflicting network policies in general. participating in an election using false information. A similar To address the source-binding problem within the network, mechanism appears required in SDN. FlowTags [66] enables switches to tag packets for appropriate 12 source binding, avoiding conflict with middleboxes in the attacks [84]; detection in SDN however differs from CNs. Chi network. Kim et al. [67] proposed Kinetic to not only monitor et al. [85] implemented applications that periodically sample network properties, but also enable administrators to take flow rules, and check if a random subset of switches are appropriate control actions in response to network changes, behaving as instructed. and to analyze source of errors in control programs leading 3) DoS attacks: While SDN is considered more vulnerable to the undesired network behavior. Similarly, Flover [68] and to DoS attacks [51] than CNs, mitigation techniques appear NetPlumber [69] are systems to verify that flow policies do to follow non-conventional solutions. For example, Flood- not contradict with desired network security policies. Guard [86] rate-limits packets sent to the controller to protect The proper extent of privileges that should be granted controller bandwidth from being consumed by intentional to an SDN application is unclear. Excessive privileges may PACKET_IN requests from clueless switches. SDNsec [87] constitute a significant weak point if an application becomes allows edge switches to encode whole routes on each ingress compromised or is malicious [70]; too few may not allow packet, thus mitigating DoS by state exhaustion. Flow aggre- sufficient flexibility to run security applications. Fresco [5] gation [88] and flow time-out [89] are also widely regarded is a framework for developing security SDN applications, as good practices to mitigate switch memory exhaustion. providing APIs for developers to access sensitive network 4) MitM attacks: Sphinx [13] aims to counter security resources securely. Similarly, OperationCheckpoint [71] aims threats from within an SDN network (e.g., from network to secure the network against third party applications by switches and end hosts) by building flow graphs to represent a ensuring that critical operations can be executed by trusted closed form of the network topology, and using these to detect applications only. anomalous switch behavior. Another tool, TopoGuard [11], 3) Attack Mitigation: Benton et al. [72] highlight the aims to detect LLDP hijacking and MitM attacks, by validating importance of isolating applications running on top of the the network view seen by the controller, to mitigate host controller and the importance of verifying flow tables, to avoid location hijacking and link fabrication vulnerabilities. erroneous controllers (including errors introduced without a malicious intent). Braga et al. [73] showed how machine IX.DISCUSSIONAND CONCLUDING REMARKS learning can be used to identify traffic involved in a DoS at- tack. BASE [74] was proposed as an anti-spoofing mechanism, Conventional networks and SDN share similar security risks aiming to mitigate DoS. SD-Anti-DDoS [75] is another tool related to the control plane, but differ in defenses needed. We used to clean bloated flow tables after a DoS attack. observe the following from our analysis in Section VII. In the 4) Proposals for control plane scalability: To ensure scala- END-HOST threat model, conventional networks can define a bility and fault tolerance, numerous proposals advocate repli- network boundary to filter control messages from end hosts. cation and distribution of the control plane [76], [77], [78]. This approach is less effective in SDN—its control plane is Onix [79] is a prominent example that abstracts the network implemented in SDN controllers, which are usually connected distribution state to the control plane running on top. Hy- to the edge of the network; their attaching points, similar to perFlow [80] allows multiple separate SDN domains to be end host locations, may change unless there is a dedicated consolidated and controlled from a single point. control network separate from user networks. Thus, SDN largely requires cryptographic protection to prevent outsiders from participating in the control plane. B. Data Plane In the ALL-ELEMENT threat model (Section III-B), both Despite SDN’s promise to ease management and service de- conventional networks and SDN require crypto mechanisms, ployment, it is becoming clear that not all services can simply as well as consistency checks to mitigate insider attacks. They be implemented as applications on the controller. Depending differ in where and how such checks can be implemented. on required levels of network support, some security solutions While straightforward consistency checks might be imple- in the literature propose either low-level modules running in mented within individual CN devices, it is less obvious where controller kernel space, or software running on switches. These and how to do this in SDN. Our analysis suggests that a highly create challenges in managing/updating network elements, as modularized and distributed SDN software architecture may is the case with CNs. The OpenFlow Extension Framework facilitate consistency checks, improving SDN control plane (OFX) [81] leverages control plane centralization to allow security (cf. Section VII-B, including for modularization based dynamic installation of software on switches. on micro service architecture). As noted earlier, current SDN 1) Security services: Distinct from CNs, the centralization controllers (e.g., ONOS, Open DayLight) lack mechanisms to of the control plane in SDNs challenges conventional means handle insider threats. by which gateway services, e.g., firewalls and IDSs, are set Our framework for analysis of control plane security, suit- up. For example, instructing edge switches to forward a copy able for both conventional networks and SDN, is summarized of the traffic to a separate IDS box would consume substantial by Table I and the discussion in Section VII. It allows an delay and bandwidth [82]. FleXam [83] is an OpenFlow apples-to-apples comparison, exploring both security risks and extension to enable a switch to send sample packets (including defenses. An objective of our framework and analysis is to payload) from a specific flow to the controller. help guide further SDN research, and to aid practitioners in 2) Handling compromised switches: Compromising an the design, development, and deployment of software-defined SDN switch enables a wide range of MitM and impersonation networks with stronger robustness and security properties. 13

Future work includes designing SDN controllers that can [21] Cisco, “Safe layer 2 security in-depth version 2,” http://www.cisco.com/ survive in the ALL-ELEMENT threat model, and means for warp/public/cc/so/cuso/epso/sqfr/sfblu_wp.htm#wp998892. [22] IEEE, “Standard for Local and Metropolitan Area Networks–Link Ag- comprehensive testing of security vulnerabilities and defenses gregation,” IEEE Std 802.1AXTM-2014 (Revision of IEEE Std 802.1AX- discussed herein. 2008), pp. 1–320, Dec. 2014. [23] Cisco, “Catalyst 3750-X and 3560-X Switch Software Configuration Guide - Managing Switch Stacks,” "http://www.cisco.com/c/en/ ACKNOWLEDGEMENT us/td/docs/switches/lan/catalyst3750x_3560x/software/release/12- 2_55_se/configuration/guide/3750xscg/swstack.pdf", Nov. 2014. We thank Yapeng Wu and Xingjun Chu for their con- [24] T. Li and R. Atkinson, “IS-IS Cryptographic Authentication,” RFC structive discussion and comments on this paper. The third 5304 (Proposed Standard), IETF, Oct. 2008, updated by RFCs 6233, 6232. [Online]. Available: http://www.ietf.org/rfc/rfc5304.txt author acknowledges funding from the Natural Sciences and [25] R. Perlman, D. Eastlake, D. Dutt, S. Gai, and A. Ghanwani, “Routing Engineering Research Council of Canada (NSERC) for both Bridges (RBridges): Base Protocol Specification,” RFC 6325 (Proposed his Canada Research Chair in Authentication and Computer Standard), IETF, Jul. 2011, updated by RFCs 6327, 6439. [Online]. Available: http://www.ietf.org/rfc/rfc6325.txt Security, and a Discovery Grant. [26] D. Fedyk and M. Seaman, “802.1aq - Shortest Path Bridging,” Mar. 2012. [Online]. Available: http://www.ieee802.org/1/pages/802.1aq.html [27] Cisco Networking Academy, “Routing and Switching Essentials Com- REFERENCES panion Guide,” Cisco Press, Feb. 2014. [28] D. Eastlake, T. Senevirathne, A. Ghanwani, D. Dutt, and A. Banerjee, [1] J. Moy, “OSPF Version 2,” RFC 2328 (Internet Standard), IETF, Apr “Transparent Interconnection of Lots of Links (TRILL) Use of IS-IS,” 1998, updated by RFCs 5709, 6549, 6845, 6860. [Online]. Available: RFC 7176 (Standards Track), IETF, May 2014. [Online]. Available: http://www.ietf.org/rfc/rfc2328.txt http://www.ietf.org/rfc/rfc7176.txt [2] N. Feamster, J. Rexford, and E. Zegura, “The road to SDN: an intellec- [29] J. F. Kurose and K. W. Ross, Computer Networking: A Top-Down tual history of programmable networks,” ACM Computer Communica- Approach, 6th ed. Pearson Higher Ed, 2013, vol. 1. tion Review, vol. 44, no. 2, pp. 87–98, 2014. [30] Y. Volobuev, “Playing redir games with ARP and ICMP,” The BUG- [3] S. Giacalone, D. Ward, J. Drake, A. Atlas, and S. Previdi, “OSPF Traffic TRAQ mailing list, http://www. goth.net/iceburg/tcp/arp.games.html, Engineering (TE) Metric Extensions,” RFC 7471 (Standards Track), 1997. IETF, Mar 2015. [Online]. Available: http://www.ietf.org/rfc/rfc7471.txt [31] C. Systems, “Understanding and Configuring Dynamic ARP Inspection,” [4] K. Greene, “TR10: Software-Defined Networking,” "http: http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst4500/12- //www2.technologyreview.com/news/412194/tr10-software-defined- 2/25ew/configuration/guide/conf/dynarp.html#wp1039231, 2004. networking/", Feb 2009. [32] W. Lootah, W. Enck, and P. McDaniel, “TARP: Ticket-based address [5] S. Shin, P. Porras, V. Yegneswaran, M. Fong, G. Gu, and M. Tyson, resolution protocol,” Computer Networks, vol. 51, no. 15, pp. 4322– “Fresco: Modular composable security services for software defined 4337, 2007. networks,” in NDSS, Feb 2013. [33] S. Y. Nam, D. Kim, J. Kim et al., “Enhanced ARP: preventing [6] Broadcom, “OpenFlow-Data Plane Abstraction Networking Software,” ARP poisoning-based man-in-the-middle attacks,” IEEE Commun. Lett., http://www.broadcom.com/products/ethernet-communication-and- vol. 14, no. 2, pp. 187–189, 2010. switching/switching/of-dpa-software, 2015. [34] C. Hedrick, “Routing Information Protocol,” RFC 1058 (Historic), [7] J. Medved, R. Varga, A. Tkacik, and K. Gray, “OpenDaylight: Towards IETF, Jun 1988, updated by RFCs 1388, 1723. [Online]. Available: a model-driven SDN controller architecture,” in IEEE “A World of http://www.ietf.org/rfc/rfc1058.txt Wireless, Mobile and Multimedia Networks" (WoWMoM), 2014, pp. 1–6. [35] G. Nakibly, E. Menahem, Y. Elovici, and A. Waizel, “Owning the [8] P. Berde, M. Gerola, J. Hart, Y. Higuchi, M. Kobayashi, T. Koide, Routing Table - Part II,” in BlackHat, 2013. B. Lantz, B. O’Connor, P. Radoslavov, W. Snow, and G. Parulkar, [36] D. Thaler and C. Hopps, “Multipath Issues in Unicast and Multicast “ONOS: towards an open, distributed SDN OS,” in ACM HotSDN, 2014, Next-Hop Selection,” RFC 2991 (Informational), IETF, Nov 2000. pp. 1–6. [Online]. Available: http://www.ietf.org/rfc/rfc2991.txt [9] R. Enns, M. Bjorklund, and J. Schoenwaelder, “Network Configuration [37] C. Hopps, “Analysis of an Equal-Cost Multi-Path Algorithm,” Protocol (NETCONF),” RFC 6241 (Standards Track), IETF, Jun 2011. RFC 2992 (Informational), IETF, Nov 2000. [Online]. Available: [Online]. Available: http://www.ietf.org/rfc/rfc6241.txt http://www.ietf.org/rfc/rfc2992.txt [10] P. C. van Oorschot, T. Wan, and E. Kranakis, “On interdomain rout- [38] S. Nadas, “Virtual Router Redundancy Protocol (VRRP) Version 3 ing security and pretty secure BGP (psBGP),” ACM Transactions on for IPv4 and IPv6,” RFC 5798 (Standards Track), IETF, Mar. 2010. Information and System Security (TISSEC), vol. 10, no. 3, 2007. [Online]. Available: http://www.ietf.org/rfc/rfc5798.txt [11] S. Hong, L. Xu, H. Wang, and G. Gu, “Poisoning network visibility in [39] S. Knight, D. Weaver, D. Whipple, R. Hinden, D. Mitzel, P. Hunt, software defined networks: New attacks and countermeasures,” in NDSS, P. Higginson, M. Shand, and A. Lindem, “Virtual Router Redundancy Feb. 2015. Protocol,” RFC 23388 (Standards Track), IETF, Apr 1998. [Online]. [12] P. Porras, S. Cheung, M. Fong, K. Skinner, and V. Yegneswaran, Available: http://www.ietf.org/rfc/rfc2338.txt “Securing the software defined network control layer,” in NDSS, Feb [40] R. Hinden et al., “Virtual Router Redundancy Protocol (VRRP),” 2015. RFC 3768 (Standards Track), IETF, Apr 2004. [Online]. Available: [13] M. Dhawan, R. Poddar, K. Mahaf, and V. Mann, “Sphinx: Detecting http://www.ietf.org/rfc/rfc3768.txt security attacks in software defined networks,” in NDSS, Feb 2015. [41] Open Networking Foundation, “OpenFlow Switch Specification Version [14] A. R. Sharafat, S. Das, G. Parulkar, and N. McKeown, “MPLS- 1.5.0,” "https://www.opennetworking.org/images/stories/downloads/sdn- TE and MPLS VPNs with OpenFlow,” ACM SIGCOMM Computer resources/onf-specifications/openflow/openflow-switch-v1.5.0.noipr. Communication Review, vol. 41, no. 4, pp. 452–453, 2011. pdf", Dec 2014. [15] S. Convery, “Hacking Layer 2: Fun with Ethernet Switches,” ftp://n. [42] IEEE, “Standard for Local and Metropolitan Area Networks–Station and manet.nu/pub/doc/Hacking_Layer_2_Fun_with_Ethernet_Switches.pdf, Media Access Control Connectivity Discovery,” IEEE Std 802.1AB-2009 2002. (Revision of IEEE Std 802.1AB-2005), pp. 1–204, Sep. 2009. [16] K. R. Fall and W. R. Stevens, TCP/IP Illustrated, volume 1: The [43] M. Casado, T. Garnkel, M. Freedman, A. Akella, D. Boneh, N. McKe- protocols. Addison-Wesley, 2011. owon, and S. Shenker, “SANE: A Protection Architecture for Enterprise [17] S. Wilkins, “Switchport Security Concepts and Configuration,” Cisco Networks,” in USENIX Security Symposium, 2006. Press, Jul. 2011. [44] P. Hunt, M. Konar, F. P. Junqueira, and B. Reed, “ZooKeeper: Wait-free [18] R. Perlman, “An Algorithm for Distributed Computation of a Spanning Coordination for Internet-scale Systems,” in USENIX ATC, 2010. Tree in an Extended LAN,” in ACM SIGCOMM, 1985, pp. 44–53. [45] D. Ongaro and J. Ousterhout, “In Search of an Understandable Consen- [19] E. Vyncke and C. Paggen, “Attacking the spanning tree protocol,” http: sus Algorithm,” in USENIX ATC, 2014, pp. 305–319. //www.ciscopress.com/articles/article.asp?p=1016582&seqNum=2. [46] W. Vogels, “Eventually consistent,” Communications of the ACM, [20] T. Kiravuo, M. Sarela, and J. Manner, “A Survey of Ethernet LAN vol. 52, no. 1, pp. 40–44, 2009. Security,” Communications Surveys & Tutorials, IEEE, vol. 15, no. 3, [47] Akka Docs, “Network Security,” http://getakka.net/docs/remoting/ pp. 1477–1491, 2013. security, Akka, 2015. 14

[48] H. Yin, H. Xie, T. Tsou, D. Lopez, P. Aranda, and R. Sidi, “SDNi: [75] Y. Cui, L. Yan, S. Li, H. Xing, W. Pan, J. Zhu, and X. Zheng, “SD-Anti- A Message Exchange Protocol for Software Defined Networks (SDNs) DDoS: Fast and efficient DDoS defense in software-defined networks,” across Multiple Domains,” Internet Draft, Jun 2012. [Online]. Available: Journal of Network and Computer Applications, vol. 68, pp. 65–79, http://tools.ietf.org/id/draft-yin-sdn-sdni-00.txt 2016. [49] A. Lakshman and P. Malik, “Cassandra: a decentralized structured [76] F. Botelho, A. Bessani, F. Ramos, and P. Ferreira, “On the design of storage system,” ACM SIGOPS Operating Systems Review, vol. 44, no. 2, practical fault-tolerant SDN controllers,” in IEEE European Workshop pp. 35–40, 2010. on SDN (EWSDN), 2014, pp. 73–78. [50] A. Ron, A. Shulman-Peleg, and E. Bronshtein, “No SQL, no Injection?” [77] S. Betge-Brezetz, G.-B. Kamga, and M. Tazi, “Trust support for SDN in IEEE S&P workshop on Web 2.0 Security & Privacy, 2015. controllers and virtualized network applications,” in IEEE Network [51] S. Shin and G. Gu, “Attacking software-defined networks: A first Softwarization (NetSoft), 2015. feasibility study,” in ACM HotSDN (extended abstract), 2013, pp. 165– [78] B. Heller, R. Sherwood, and N. McKeown, “The controller placement 166. problem,” in ACM HotSDN, 2012, pp. 7–12. [52] N. Dragoni, S. Giallorenzo, A. Lluch-Lafuente, M. Mazzara, F. Montesi, [79] T. Koponen, M. Casado, N. Gude, J. Stribling, L. Poutievski, M. Zhu, R. Mustafin, and L. Safina, “Microservices: yesterday, today, and R. Ramanathan, Y. Iwata, H. Inoue, T. Hama, and S. Shenker, “Onix: tomorrow,” CoRR, vol. abs/1606.04036, 2016. [Online]. Available: A distributed control platform for large-scale production networks,” in http://arxiv.org/abs/1606.04036 USENIX OSDI, vol. 10, 2010, pp. 351–364. [53] A. Khurshid, W. Zhou, M. Caesar, and P. Godfrey, “Veriflow: Verifying [80] A. Tootoonchian and Y. Ganjali, “HyperFlow: A distributed control network-wide invariants in real time,” ACM SIGCOMM Computer plane for OpenFlow,” in USENIX INM/WREN, 2010. Communication Review, vol. 42, no. 4, pp. 467–472, 2012. [81] J. Sonchack, J. M. Smith, A. J. Aviv, and E. Keller, “Enabling Practi- [54] S. Scott-Hayward, S. Natarajan, and S. Sezer, “A Survey of Security cal Software-defined Networking Security Applications with OFX,” in in Software Defined Networks,” Communications Surveys & Tutorials, NDSS, 2016. IEEE, vol. 18, no. 1, pp. 623–654, 2016. [82] S. A. Mehdi, J. Khalid, and S. A. Khayam, “Revisiting traffic anomaly [55] I. Ahmad, S. Namal, M. Ylianttila, and A. Gurtov, “Security in software detection using software defined networking,” in Recent Advances in defined networks: A survey,” Communications Surveys & Tutorials, Intrusion Detection. Springer, 2011, pp. 161–180. IEEE, vol. 17, no. 4, pp. 2317–2346, 2015. [83] S. Shirali-Shahreza and Y. Ganjali, “Efficient implementation of security [56] I. Alsmadi and D. Xu, “Security of software defined networks: A applications in OpenFlow controller with FleXam,” in IEEE High- survey,” Computers & Security, vol. 53, pp. 79–108, 2015. Performance Interconnects (HOTI), 2013, pp. 49–54. [57] N. Gude, T. Koponen, J. Pettit, B. Pfaff, M. Casado, N. McKeown, and [84] M. Antikainen, T. Aura, and M. Särelä, “Spook in Your Network: S. Shenker, “NOX: Towards an Operating System for Networks,” ACM Attacking an SDN with a Compromised OpenFlow Switch,” in Secure Computer Communication Review, Jul 2008. IT Systems. Springer LNCS, 2014, vol. 8788, pp. 229–244. [58] “Floodlight OpenFlow Controller,” http://www.projectfloodlight.org/ [85] P.-W. Chi, C.-T. Kuo, J.-W. Guo, and C.-L. Lei, “How to detect a floodlight/, 2015. compromised SDN switch,” in IEEE Network Softwarization (NetSoft), [59] P. Porras, S. Shin, V. Yegneswaran, M. Fong, M. Tyson, and G. Gu, 2015. “A Security Enforcement Kernel for OpenFlow Networks,” in ACM [86] H. Wang, L. Xu, and G. Gu, “FloodGuard: A DoS Attack Prevention HotSDN, 2012, pp. 121–126. Extension in Software-Defined Networks,” http://faculty.cse.tamu.edu/ guofei/paper/FloodGuard_DSN15.pdf, Texas A&M University, Tech. [60] X. Wen, Y. Chen, C. Hu, C. Shi, and Y. Wang, “Towards a secure con- Rep., 2015. troller platform for OpenFlow applications,” in ACM HotSDN (extended [87] T. Sasaki, C. Pappas, T. Lee, T. Hoefler, and A. Perrig, “SDNsec: abstract), 2013, pp. 171–172. Forwarding Accountability for the SDN Data Plane,” in IEEE ICCCN, [61] S. Shin, Y. Song, T. Lee, S. Lee, J. Chung, P. Porras, V. Yegneswaran, 2016, pp. 1–10. J. Noh, and B. B. Kang, “Rosemary: A robust, secure, and high [88] R. Kloti, V. Kotronis, and P. Smith, “OpenFlow: A security analysis,” performance network operating system,” in ACM CCS, 2014. in IEEE ICNP, 2013, pp. 1–6. [62] R. Sherwood, G. Gibb, K.-K. Yap, G. Appenzeller, M. Casado, N. McK- [89] R. Kandoi and M. Antikainen, “Denial-of-service attacks in OpenFlow eown, and G. Parulkar, “Can the production network be the testbed?” SDN networks,” in IFIP/IEEE Integrated Network Management (IM), in USENIX OSDI, 2010, pp. 1–14. 2015, pp. 1322–1326. [63] S. Gutz, A. Story, C. Schlesinger, and N. Foster, “Splendid isolation: A slice abstraction for software-defined networks,” in ACM HotSDN, 2012, pp. 79–84. [64] E. Al-Shaer and S. Al-Haj, “FlowChecker: Configuration Analysis and Verification of Federated OpenFlow Infrastructures,” in ACM SafeConfig Workshop, 2010, pp. 37–44. [65] H. Hu, W. Han, G.-J. Ahn, and Z. Zhao, “FlowGuard: Building robust firewalls for software defined networks,” in ACM HotSDN, 2014, pp. 97–102. [66] S. K. Fayazbakhsh, L. Chiang, V. Sekar, M. Yu, and J. C. Mogul, “Enforcing network-wide policies in the presence of dynamic middlebox actions using FlowTags,” in USENIX NSDI, 2014. [67] H. Kim, J. Reich, A. Gupta, M. Shahbaz, N. Feamster, and R. Clark, “Kinetic: Verifiable dynamic network control,” in USENIX NSDI, 2015, pp. 59–72. [68] S. Son, S. Shin, V. Yegneswaran, P. Porras, and G. Gu, “Model checking invariant security properties in OpenFlow,” in IEEE ICC, 2013, pp. 1974–1979. [69] P. Kazemian, M. Chan, H. Zeng, G. Varghese, N. McKeown, and S. Whyte, “Real time network policy checking using header space analysis,” in USENIX NSDI, 2013, pp. 99–111. [70] S. Matsumoto, S. Hitz, and A. Perrig, “Fleet: Defending SDNs from Malicious Administrators,” in ACM HotSDN, Aug. 2014. [71] S. Scott-Hayward, C. Kane, and S. Sezer, “OperationCheckpoint: SDN Application Control,” in IEEE ICNP, 2014, pp. 618–623. [72] K. Benton, L. J. Camp, and C. Small, “OpenFlow vulnerability assess- ment,” in ACM HotSDN (extended abstract), 2013, pp. 151–152. [73] R. Braga, E. Mota, and A. Passito, “Lightweight DDoS flooding attack detection using NOX/OpenFlow,” in IEEE LCN, 2010, pp. 408–415. [74] J. Kwon, D. Seo, M. Kwon, H. Lee, A. Perrig, and H. Kim, “An incrementally deployable anti-spoofing mechanism for software-defined networks,” Computer Communications, 2015.