Automating Virtual Machine Network Profiles

Vivek Kashyap, Arnd Bergmann, Stefan Berger, Gerhard Stenzel, Jens Osterkamp
IBM

Abstract

With the explosion in the use of virtual machines in data-center/cloud environments there is a corresponding requirement for automating the associated network management and administration. The virtual machines share the limited number of network adapters on the system among themselves, but may run workloads with contending network requirements. Furthermore, these workloads may be run on behalf of customers desiring complete isolation of their network traffic. The enforcement of network traffic isolation through access controls (filters) and VLANs on the host adds run-time and administrative overhead. This is further exacerbated when virtual machines are migrated to another physical system, as the corresponding network profiles must be re-enforced on the target system. The physical switches must also be reprogrammed.

This paper describes the Linux enhancements in the kernel, in libvirt and in layer-2 networking that enable offloading of the switching function to the external physical switches while retaining control in the Linux host. The layer-2 network filters and QoS profiles are automatically migrated with the virtual machine to the target system and imposed on the physical switch port without administrative intervention.

We discuss the proposed IEEE standard (802.1Qbg) and its implementation on Linux for automated migration of port profiles when a VM is migrated from one system to another.

1 Introduction

A network adapter is shared across multiple virtual machines (VMs) on a Linux host. To accommodate VM-to-VM and VM-to-external-network communication, a virtual bridge is included in the Linux host. The Linux virtual bridge relays unicast traffic between the virtual and physical interfaces attached to it. It further replicates and forwards broadcast and multicast transmissions received on its port(s).

For network isolation and security, iptables/ebtables rules might be enforced on the system. The more common rules prevent ARP or IP spoofing, block the sending of link-level broadcasts, or allow only a specific set of protocol traffic.
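As an illustration, the kind of per-guest anti-spoofing and protocol-restriction rules described above could be installed roughly as follows. This is a minimal sketch, not the implementation discussed in this paper: the tap device name, MAC and IP address are invented, and a production setup would normally express such rules through libvirt's network filtering support rather than raw ebtables invocations.

    #!/usr/bin/env python3
    # Hedged sketch: per-guest anti-spoofing and protocol filtering with ebtables.
    # The tap device, MAC and IP address below are invented example values.
    import subprocess

    VIF = "vnet0"              # the guest's tap device on the host bridge
    MAC = "52:54:00:12:34:56"  # the guest's only legitimate source MAC
    IP = "192.168.122.10"      # the guest's only legitimate IP address

    rules = [
        # Drop frames from the guest port that spoof another source MAC.
        ["-A", "FORWARD", "-i", VIF, "-s", "!", MAC, "-j", "DROP"],
        # Drop ARP packets from the guest that claim a different sender IP.
        ["-A", "FORWARD", "-i", VIF, "-p", "ARP", "--arp-ip-src", "!", IP, "-j", "DROP"],
        # Allow only IPv4 and ARP from this port and drop everything else.
        ["-A", "FORWARD", "-i", VIF, "-p", "IPv4", "-j", "ACCEPT"],
        ["-A", "FORWARD", "-i", VIF, "-p", "ARP", "-j", "ACCEPT"],
        ["-A", "FORWARD", "-i", VIF, "-j", "DROP"],
    ]

    for rule in rules:
        subprocess.run(["ebtables", *rule], check=True)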
Different workloads running in the VMs might also be specifically restricted within certain limits, based on the workload requirements, priorities and possibly the bandwidth purchased by the user.

For the purposes of this paper, the layer-2 forwarding, multiplexing and filtering (ebtables) functions are collectively considered part of the virtual bridge.

The per-packet processing for the bridging function (packet relaying, replication, evaluation against filter rules, and bandwidth control) imposes a heavy burden that takes away CPU resources which could be utilized more gainfully. This problem is exacerbated as the number of virtual machines supported on a single host grows in the multi-core, large-memory systems used in data-center/cloud environments.

Another problem faced in large deployments is the need to maintain a consistent view of the port profiles. The switch fabric and the embedded switches in the hypervisor (the Linux KVM host) may have different capabilities and may also be under separate administrative control.

This is further exacerbated as the VMs (running important workloads) migrate from one system to another. The filter rules and QoS enforcement must follow the VM and be quickly put in place. Not all policies may be deployable in the hypervisor, and the switch must be informed of the migration.

As the MAC address associated with the virtual interfaces migrates as well, the physical switch at the target does not know whether the MAC, now visible at another port, belongs to a migrated VM or to a different VM using the same MAC. It therefore needs to be informed of the port-profile policy to deploy.

The proposal 'Edge Virtual Bridging' in IEEE [802.1Qbg] addresses these issues by offloading the switching function to the adjacent bridge, i.e. the physical switch in the network.

1.1 Edge Virtual Bridging: IEEE 802.1Qbg

The Edge Virtual Bridging (EVB) proposal defines the protocols, configuration and control required across the physical end station, such as the Linux host, and the adjacent switch.

This section provides an introduction to the IEEE 802.1Qbg proposal; the subsequent sections describe our implementation of it on Linux to configure KVM guests in 'VEPA' mode.

Reflective Relay

The proposed standard extends the adjacent bridge's capabilities to the virtual machines by ensuring that all packets sent by the VMs are first sent to the switch port. The switch then consults its tables and, if the packet is destined to another VM on the same host, sends the packet back to the host. The packet is then forwarded to the destination VM.

This packet flow enables the switch to enforce fabric-wide packet filtering and control policies across all network traffic. Otherwise the policy and rules might need to be coordinated and enforced across both the switch and the hypervisor, leading to the administrative overhead and inefficiencies outlined earlier.

Existing switches do not reflect a packet back on the port on which it was received. Therefore a new mode, referred to as 'reflective relay' mode, is introduced in the Edge Virtual Bridging (EVB) proposal.

Virtual Ethernet Port Aggregator

The component in the physical end-station that works together with the adjacent switch to support EVB is called the "Virtual Ethernet Port Aggregator" or VEPA.

[Figure: Edge switch attached to the L2 net(s) and a port-profile database. A regular vSwitch (VEB) mode allows VM-to-VM traffic with only limited policy enforcement in the host; a vSwitch in VEPA mode forces all VM traffic to the fully capable edge switch for full policy enforcement.]

A VEPA is a very simplified version of an Ethernet bridge that allows multiple downlink ports to communicate with a single uplink port but not with each other. Ethernet frames from one of the downlink ports are sent directly to the uplink, and Ethernet frames arriving at the uplink port are forwarded to just the destination with the matching MAC address, or flooded to all downlink ports in the case of broadcast. A VEPA does not support unknown unicast frames, which are silently dropped.

The VEPA component is therefore able to provide VM-to-VM communication in conjunction with an uplink port that is configured in 'reflective relay' mode.
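On Linux the VEPA function described above is provided by the macvtap/macvlan drivers, and libvirt can place a guest interface directly into VEPA mode. The snippet below is a hedged sketch rather than the paper's exact procedure: it assumes the libvirt Python bindings and a macvtap-capable kernel, and the device name eth0 and guest name guest1 are examples. The adjacent switch port must additionally be in reflective-relay mode for VM-to-VM traffic to flow.

    #!/usr/bin/env python3
    # Sketch: attach a macvtap interface in VEPA mode to a running KVM guest.
    # Assumes the libvirt Python bindings; "eth0" and "guest1" are example names.
    import libvirt

    IFACE_XML = """
    <interface type='direct'>
      <!-- hang the guest port off the physical eth0, in VEPA mode -->
      <source dev='eth0' mode='vepa'/>
      <model type='virtio'/>
    </interface>
    """

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("guest1")

    # Attach to the live guest and persist the change in its definition.
    dom.attachDeviceFlags(IFACE_XML,
                          libvirt.VIR_DOMAIN_AFFECT_LIVE |
                          libvirt.VIR_DOMAIN_AFFECT_CONFIG)
    conn.close()

An equivalent port can also be created by hand with iproute2, for example: ip link add link eth0 name macvtap0 type macvtap mode vepa.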
Dynamic discovery and configuration for VEPA mode

The Link-level Discovery Protocol (LLDP) [LLDP] is used for layer-2 discovery operations. The EVB proposal extends the supported TLVs (configuration messages) to include advertisement of 'reflective relay' capability and setting of the adjacent bridge's port into reflective relay mode. The TLV also includes additional parameters, such as the number of virtual interfaces (VSIs, or Virtual Station Interfaces in EVB parlance) that a station or bridge can support. See [802.1Qbg] for a description of all capabilities and parameters exchanged.

[Figure: EVB TLV exchange between the station (e.g., the hypervisor) and the bridge. (1) The bridge advertises the forwarding modes it can support (standard, RR), the maximum number of VSIs it can handle (# VSIs supported = J, # VSIs configured = 0) and RTE = 15 in an EVB TLV offering its capabilities. (2) The station configures itself from the available capabilities according to local policy and responds with an EVB TLV requesting RR forwarding, K configured VSIs and RTE = 10. (3) The bridge matches its configuration to the limited capabilities advertised by the station, but still advertises its full set of capabilities in a confirmation TLV.]

The bridge periodically advertises its capabilities. The station receives the TLV and, based on its configuration, may respond to configure the port in 'reflective relay' (RR) mode.

VSI Discovery and Configuration

The switching function is offloaded from the end-station by means of VEPA and the dynamic configuration of the switch port into RR mode. This, however, does not fully address the problems outlined above. A mechanism is required to associate the virtual interface (VSI) with a specific profile for filtering, VLAN and bandwidth control.

This is achieved by informing the switch of the MAC address and VLAN ID pair and the profile that must be used, so that the switch can configure the port accordingly. This mechanism also addresses the issue of a 'reincarnated' or reused MAC address, and of a MAC address that appears on a port after a VM migration, since the switch is informed of the exact profile to impose.

The VSI Discovery Protocol (VDP) defined by EVB therefore defines a set of states and commands exchanged between the bridge and the station (see the sketch at the end of this section). These are:

• Pre-Associate: Inform the switch of the VSI and port-profile and of the intention to associate.

• Pre-Associate with Resource Reservation: Same as Pre-Associate, except that resources are also reserved at the switch for a future association.

• Associate: Associate the VSI. This causes all resources to be allocated at the switch and the VSI port profile to be imposed.

• De-Associate: De-associate the VSI from the port and profile.

Edge Control Protocol

The discovery and exchange of bridge capabilities is performed over LLDP. LLDP is an unacknowledged protocol and has limitations on the frequency and number of transmissions.

For VDP, the EVB proposal therefore introduces a new link-level transport, the "Edge Control Protocol" (ECP), which, unlike LLDP, acknowledges transmissions.
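To make the association concrete, the per-VSI information a station hands to the switch, together with the VDP commands listed above, can be modeled as follows. This is an illustrative sketch only, not the on-wire TLV encoding defined in [802.1Qbg]; the field names mirror the manager ID / type ID / type ID version / instance ID tuple that libvirt exposes as the attributes of an 802.1Qbg virtual port, and every concrete value below is invented.

    #!/usr/bin/env python3
    # Illustrative model of a VDP association request; not the 802.1Qbg wire format.
    import uuid
    from dataclasses import dataclass
    from enum import Enum

    class VdpCommand(Enum):
        PRE_ASSOCIATE = "pre-associate"        # announce the VSI and profile, no resources yet
        PRE_ASSOCIATE_RR = "pre-associate-rr"  # as above, but also reserve switch resources
        ASSOCIATE = "associate"                # allocate resources and impose the port profile
        DE_ASSOCIATE = "de-associate"          # release the VSI from the port and profile

    @dataclass
    class VsiRequest:
        command: VdpCommand
        manager_id: int         # which port-profile (VSI type) database to consult
        type_id: int            # which profile within that database
        type_id_version: int
        instance_id: uuid.UUID  # unique identity of this VSI instance
        mac: str                # the (MAC, VLAN) pair the profile is bound to
        vlan: int

    # Example: what a hypervisor might send when a migrated guest resumes on the target host.
    request = VsiRequest(command=VdpCommand.ASSOCIATE, manager_id=1, type_id=1193047,
                         type_id_version=2, instance_id=uuid.uuid4(),
                         mac="52:54:00:12:34:56", vlan=100)
    print(request)

When a guest starts, migrates or stops, the hypervisor issues the corresponding Associate or De-Associate request carrying exactly this information, so the switch can look up the referenced port profile and program its port without administrative intervention.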