VFP: A Virtual Switch Platform for Host SDN in the Public Cloud
Daniel Firestone, Microsoft
Abstract

Many modern scalable cloud networking architectures rely on host networking for implementing VM network policy, e.g. tunneling for virtual networks, NAT for load balancing, stateful ACLs, QoS, and more. We present the Virtual Filtering Platform (VFP), a programmable virtual switch that powers Microsoft Azure, a large public cloud, and provides this policy. We define several major goals for a programmable virtual switch based on our operational experiences, including support for multiple independent network controllers, policy based on connections rather than only on packets, efficient caching and classification algorithms for performance, and efficient offload of flow policy to programmable NICs, and demonstrate how VFP achieves these goals. VFP has been deployed on >1M hosts running IaaS and PaaS workloads for over 4 years. We present the design of VFP and its API, its flow language and compiler used for flow processing, performance results, and experiences deploying and using VFP in Azure over several years.

1. Introduction

The rise of public cloud workloads, such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform [13-15], has created a new scale of datacenter computing, with vendors regularly reporting server counts in the millions. These vendors not only have to provide scale and the high density/performance of Virtual Machines (VMs) to customers, but must also provide rich network semantics, such as private virtual networks with customer-supplied address spaces, scalable L4 load balancers, security groups and ACLs, virtual routing tables, bandwidth metering, QoS, and more.

This policy is sufficiently complex that it often cannot economically be implemented at scale in traditional core routers and hardware. Instead, a common approach has been to implement this policy in software on the VM hosts, in the virtual switch (vswitch) connecting VMs to the network, which scales well with the number of servers and allows the physical network to be simple, scalable and very fast. Because this model separates a centralized control plane from a data plane on the host, it is widely considered an example of Software Defined Networking (SDN), in particular host-based SDN.

As a large public cloud provider, Azure has built its cloud network on host-based SDN technologies, using them to implement almost all the virtual networking features we offer. Much of the focus around SDN in recent years has been on building scalable and flexible network controllers and services, which is critical. However, the design of the programmable vswitch is equally important. It has the dual and often conflicting requirements of a highly programmable dataplane and high performance with low overhead, as cloud workloads are cost and performance sensitive.

In this paper, we present the Virtual Filtering Platform, or VFP, our cloud-scale virtual switch that runs on all of our hosts. VFP is so named because it acts as a filtering engine for each virtual NIC of a VM, allowing controllers to program their SDN policy. Our goal is to present both our design and our experiences running VFP in production at scale, and the lessons we learned.

1.1 Related Work

Throughout this paper, we use two motivating examples from the literature and demonstrate how VFP supports their policies and actions. The first is VL2 [2], which can be used to create virtual networks (VNETs) on shared hardware using stateless tunneling between hosts. The second is Ananta [4], a scalable Layer-4 load balancer, which scales by running the load balancing NAT in the vswitch on end hosts, leaving the in-network load balancers stateless and scalable.
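To make these two examples concrete, the sketch below shows roughly how such policies can be phrased as match-action rules: one rule encapsulates packets destined to a VNET address toward the physical host serving it (the VL2 case), and another rewrites a load-balanced VIP:port to a chosen backend (the Ananta case). This is an illustrative sketch only; the rule structure, field names, and addresses are hypothetical and do not reflect VFP's actual API, which is described later in the paper.

```cpp
// Illustrative sketch only: a generic match-action rule structure, not VFP's
// actual VFPAPI. It shows how the two motivating policies could be phrased as
// MAT rules: a VL2-style VNET encap rule and an Ananta-style load-balancer NAT rule.
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct Match {                   // fields to match on; empty/zero means wildcard
    std::string dst_ip;
    uint16_t    dst_port = 0;
};

enum class ActionType { Encap, DnatToBackend };

struct Action {
    ActionType  type;
    std::string target_ip;       // Encap: physical host IP; DNAT: backend (DIP) address
    uint32_t    arg = 0;         // Encap: virtual network ID; DNAT: backend port
};

struct Rule { Match match; Action action; };

// First-match lookup over a single table, as in one MAT stage.
const Rule* Lookup(const std::vector<Rule>& table,
                   const std::string& dst_ip, uint16_t dst_port) {
    for (const auto& r : table) {
        bool ip_ok   = r.match.dst_ip.empty() || r.match.dst_ip == dst_ip;
        bool port_ok = r.match.dst_port == 0  || r.match.dst_port == dst_port;
        if (ip_ok && port_ok) return &r;
    }
    return nullptr;
}

int main() {
    std::vector<Rule> table = {
        // VL2-style policy: tunnel packets for a VNET address to the physical
        // host currently serving that address (addresses are made up).
        {{"10.1.1.5", 0},      {ActionType::Encap, "192.0.2.7", 0x123456}},
        // Ananta-style policy: NAT packets for a load-balanced VIP:port to a
        // chosen backend DIP:port (again, made-up values).
        {{"203.0.113.10", 80}, {ActionType::DnatToBackend, "10.1.1.5", 8080}},
    };
    if (const Rule* r = Lookup(table, "203.0.113.10", 80))
        std::cout << "matched: action " << static_cast<int>(r->action.type)
                  << " -> " << r->action.target_ip << ":" << r->action.arg << "\n";
    return 0;
}
```

The point of the sketch is only that both policies reduce to "match some packet fields, apply a programmable action," which is the framing the rest of the paper builds on.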
In addition, we make references and comparisons to OpenFlow [5], a programmable forwarding plane protocol, and Open vSwitch [1] (OVS), a popular open source vswitch implementing OpenFlow. These are two seminal projects in the SDN space. We point out core design differences from the perspective of a public cloud, showing how our constraints can differ from those of open source projects. It is our goal to share these learnings with the broader community.

2. Design Goals and Rationale

VFP's design has evolved over time based on our experiences running a large public cloud platform. VFP was not our original vswitch, nor were its original functions novel ideas in host networking; VL2 and Ananta had already pioneered such use of vswitches. Originally, we built networking filter drivers on top of Windows's Hyper-V hypervisor for each host function, which we chained together in a vswitch: a stateful firewall driver for ACLs, a tunneling driver for VL2 VNETs, a NAT driver for Ananta load balancing, a QoS driver, and so on. As host networking became our main tool for virtualization policy, we decided to create VFP in 2011, after concluding that building new fixed filter drivers for host networking functions was not scalable or desirable. Instead, we created a single platform based on the Match-Action Table (MAT) model popularized by projects such as OpenFlow [5]. This was the origin of our VFPAPI programming model for VFP clients.

VFP's core design goals were taken from lessons learned in building and running both these filters and the network controllers and agents on top of them.

2.1 Original Goals

The following were founding goals of the VFP project:

1. Provide a programming model allowing for multiple simultaneous, independent network controllers to program network applications, minimizing cross-controller dependencies.

Implementations of OpenFlow and similar MAT models often assume a single distributed network controller that owns programming the switch (possibly taking input from other controllers). Our experience is that this model doesn't fit cloud development of SDN; instead, independent teams often build new network controllers and agents for those applications. This model reduces complex dependencies, scales better, and is more serviceable than adding logic to existing controllers. We needed a design that not only allows controllers to independently create and program flow tables, but also enforces good layering and boundaries between them (e.g. disallowing rules with arbitrary GOTOs to other tables, as in OpenFlow), so that new controllers could be developed to add functionality without old controllers needing to take their behavior into account.
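As a rough illustration of this goal, the sketch below models a vswitch in which each controller creates and programs only its own table, and the platform (not any controller) fixes the order in which packets traverse the tables, so no rule can jump into another controller's table. The class names and rules are hypothetical and are not VFP's actual object model.

```cpp
// Hypothetical illustration of goal 1: each controller programs only the table
// it created, the platform fixes the traversal order, and there is no way for
// a rule to jump into another controller's table.
#include <functional>
#include <iostream>
#include <string>
#include <vector>

struct Packet { std::string dst_ip; bool allowed = true; };

// A table is owned by exactly one controller; process() applies that
// controller's rules and cannot redirect into another controller's table.
struct Table {
    std::string owner;
    std::function<void(Packet&)> process;
};

class Switch {
public:
    // Controllers independently create and program their own tables.
    void AddTable(Table t) { tables_.push_back(std::move(t)); }
    // The platform, not any single controller, decides the traversal order.
    void Forward(Packet& p) const {
        for (const auto& t : tables_) {
            if (!p.allowed) break;          // e.g. dropped by an earlier ACL table
            t.process(p);
        }
    }
private:
    std::vector<Table> tables_;
};

int main() {
    Switch sw;
    // Two controllers built by different teams; neither knows the other's rules.
    sw.AddTable({"acl-controller", [](Packet& p) {
        if (p.dst_ip == "10.0.0.99") p.allowed = false;   // toy ACL rule
    }});
    sw.AddTable({"vnet-controller", [](Packet& p) {
        std::cout << "vnet table processed packet to " << p.dst_ip << "\n";
    }});
    Packet p{"10.1.1.5"};
    sw.Forward(p);
    return 0;
}
```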
2. Provide a MAT programming model capable of using connections as a base primitive, rather than just packets, treating stateful rules as first-class objects.

OpenFlow's original MAT model derives historically from programming switching or routing ASICs, and so assumes that packet classification must be stateless due to the hardware resources available. However, we found that our controllers required policies for connections, not just packets. For example, end users often found it more useful to secure their VMs using stateful Access Control Lists (ACLs), e.g. allowing outbound connections but not inbound, rather than the stateless ACLs used in commercial switches. Controllers also needed NAT (e.g. Ananta) and other stateful policies. Stateful policy is more tractable in soft switches than in ASIC ones, and we believe our MAT model should take advantage of that.

3. Provide a programming model that allows controllers to define their own policy and actions, rather than implementing fixed sets of network policies for predefined scenarios.

Due to limitations of the MAT model provided by OpenFlow, OpenFlow switches such as OVS have added virtualization functionality outside of the MAT model. For example, constructing virtual networks is accomplished via a virtual tunnel endpoint (VTEP) schema [29] in OVSDB [8], rather than via rules specifying which packets to encapsulate (encap) and decapsulate (decap) and how to do so. We prefer instead to base all functionality on the MAT model, trying to push as much logic as possible into the controllers while leaving the core dataplane in the vswitch. For instance, rather than a schema that defines what a VNET is, a VNET can be implemented using programmable encap and decap rules matching appropriate conditions, leaving the definition of a VNET in the controller. We've found this greatly reduces the need to continuously extend the dataplane every time the definition of a VNET changes.

The P4 language [9] attempts a similar goal for switches or vswitches [38], but is very generic, e.g. allowing new headers to be defined on the fly. Since we update our vswitch much more often than we add new packet headers to our network, we prefer the speed of a library of precompiled fast header parsers, and a language structured for stateful, connection-oriented processing.

2.2 Goals Based on Production Learnings

Based on lessons from initial deployments of VFP, we added the following goals for VFPv2, a major update in 2013-14, mostly around serviceability and performance:

4. Provide a serviceability model allowing for frequent deployments and updates without requiring reboots or interrupting VM connectivity for stateful flows, and strong service monitoring.

As our scale grew dramatically (from O(10K) to O(1M) hosts), as more controllers were built on top of VFP, and as more engineers joined us, we found more demand than ever for frequent updates, both features and bug fixes. In Infrastructure as a Service (IaaS) models, we also found customers were not tolerant of downtime for individual VMs during updates. This goal was more challenging to achieve with our complex stateful flow model, which is nontrivial to maintain across updates.

5. Provide very high packet rates, even with a large number of tables and rules, via extensive caching.

Over time we found more and more network controllers being built as the host SDN model became more popular, and soon we had deployments with large numbers of flow tables (10+), each with many rules, reducing performance as packets had to traverse each table. At the same time, VM density on hosts was increasing, pushing us from 1G to 10G to 40G and even faster NICs.
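Goal 5 points at a familiar pattern: pay the cost of traversing every table only on the first packet of a connection, then cache the composed result keyed by the connection's 5-tuple so subsequent packets take a fast path. The sketch below is a generic illustration of that idea under hypothetical names, not VFP's implementation; VFP's actual caching and classification design is covered later in the paper.

```cpp
// Generic sketch (not VFP's implementation) of connection caching: the first
// packet of a connection walks every table; the composed result is then
// cached by 5-tuple so subsequent packets skip the per-table traversal.
#include <cstdint>
#include <functional>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

struct FiveTuple {
    std::string src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t proto;
    bool operator==(const FiveTuple& o) const {
        return src_ip == o.src_ip && dst_ip == o.dst_ip &&
               src_port == o.src_port && dst_port == o.dst_port && proto == o.proto;
    }
};
struct TupleHash {
    size_t operator()(const FiveTuple& t) const {
        return std::hash<std::string>()(t.src_ip + "|" + t.dst_ip) ^
               (size_t(t.src_port) << 1) ^ (size_t(t.dst_port) << 17) ^ t.proto;
    }
};

using Action = std::string;            // e.g. "allow", "encap:...", "dnat:..."
using Table  = std::function<void(const FiveTuple&, std::vector<Action>&)>;

class FlowCache {
public:
    explicit FlowCache(std::vector<Table> tables) : tables_(std::move(tables)) {}
    const std::vector<Action>& Process(const FiveTuple& t) {
        auto it = cache_.find(t);
        if (it != cache_.end()) return it->second;   // fast path: known connection
        std::vector<Action> composed;                // slow path: walk every table once
        for (const auto& table : tables_) table(t, composed);
        return cache_.emplace(t, std::move(composed)).first->second;
    }
private:
    std::vector<Table> tables_;
    std::unordered_map<FiveTuple, std::vector<Action>, TupleHash> cache_;
};

int main() {
    FlowCache fc({
        [](const FiveTuple&, std::vector<Action>& a) { a.push_back("allow"); },
        [](const FiveTuple& t, std::vector<Action>& a) {
            if (t.dst_port == 80) a.push_back("dnat:10.1.1.5:8080");
        },
    });
    FiveTuple t{"198.51.100.1", "203.0.113.10", 54321, 80, 6};
    fc.Process(t);                                   // first packet: full traversal
    std::cout << fc.Process(t).size() << " cached actions\n";  // later packets hit the cache
    return 0;
}
```

The key property the sketch captures is that per-packet cost becomes independent of the number of tables once a connection is established, which is what makes deployments with ten or more flow tables viable at high NIC speeds.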