White Paper | Cloud Computing | OpenStack*

Network learnings on 66 bare metal nodes

Manjeet Singh Bhatia, Ganesh Mahalingam, Nathaniel Potter, Malini Bhandaru, Isaku Yamahata, Yih Leong Sun
Open Source Technology Center, Intel Corporation

Executive summary
OSIC hosts the world's largest OpenStack* [1] developer cloud, comprising 2,000 nodes, to enable community-wide development and testing of OpenStack features and functionality at an unmatched scale. Cloud computing and Software Defined Networking (SDN) are front and center in today's enterprise and telecommunications landscape. OpenStack, a leading open source cloud operating system, has a following of thousands and over two hundred production deployments [2]. Intel® and Rackspace* founded the OpenStack Innovation Center* (OSIC) [3] to accelerate the development and innovation of OpenStack for enterprises around the globe.

In this white paper, we share our learnings from a grant of 66 nodes for a three-week period in the OSIC developer cloud. The goal of this experiment was to compare the performance of two Layer 2 software switching solutions: Linux bridge (LB) and Open vSwitch (OVS). We share how to deploy a cloud on the OSIC cluster, discuss the OpenStack neutron Modular Layer 2 (ML2) architecture, and present our experimental results. In the course of the work we identified and fixed some deployment bugs. Last, but not least, we also share how to submit an experiment proposal to OSIC.

Problem statement: OpenStack networking performance
Several events occur when a user launches a virtual machine in an OpenStack cloud. Multiple Application Programmer Interface (API) calls are made, including one to create a virtual network interface card (vNIC). Behind the scenes, there are database transactions, messages exchanged between various services over a message queue, plenty of logging, and much more. Each element in this diverse set of components can impact performance and might need to be tuned differently for different cloud usage scenarios.

In general, OpenStack is designed to support various vendor plugins. Its networking component, neutron, is no different: neutron allows users to choose their switching solution. The Modular Layer 2 (ML2) plugin was introduced in neutron in the Havana release to replace the monolithic switching implementations for LB and OVS. ML2 enables different switching solutions to coexist and interoperate in a single OpenStack cloud instance. The purpose of our experiment was to compare the performance of these two switching solutions in various scenarios. Figure 1 shows a high-level design of ML2 with a switch agent.

[Figure 1. Overview of ML2 design: controller nodes 1 through N each run the neutron server and its plugins, including ML2; network nodes run either the Linux bridge agent or the OVS agent.]
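As an illustration (not our exact configuration), the ML2 switching back end is selected through the mechanism_drivers option in neutron's ml2_conf.ini; a deployment chooses LB or OVS by setting this value and running the matching L2 agent on each compute and network node:

    # Illustrative excerpt of /etc/neutron/plugins/ml2/ml2_conf.ini (not our exact settings)
    [ml2]
    type_drivers = flat,vlan,vxlan
    tenant_network_types = vxlan
    # "linuxbridge" selects the LB back end; "openvswitch" selects OVS
    mechanism_drivers = linuxbridge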

Deploying on the OSIC developer cloud
We received 66 bare metal nodes on the OSIC cluster with no operating system installed. The physical connectivity cabling was preprovisioned, and no action was required on our end. We received a document that detailed the switch and network layout.

We used iLO* [4], the remote server management tool from HP*, to manage and build the servers. Each server comes with a unique IP address for iLO management, and we were given the iLO credentials needed to access the servers via a browser or the command line. Each server has multiple network interfaces, two of which can be used for PXE booting [5] in conjunction with your favorite provisioning tool.

Deployment was a two-phase process: first we installed a minimal operating system, and second we deployed the cloud itself on the hosts. Some of the hosts were configured as controller nodes and the rest as compute hosts. The first phase took around 40 minutes, and the second about another 21 minutes, to complete the deployment of OpenStack. Note that the details might change in the future, and the deployment process might become even simpler.

Phase I: Server provisioning
We first manually installed Ubuntu* 14.04 Server as our operating system on one of the hosts, and then used Cobbler* [6] to provision all the others. Other open source tools are available to provision servers, such as bifrost* [7] with ironic* [8].

Phase II: OpenStack Deployment
We used OpenStack kolla [9] to deploy OpenStack on the cluster. kolla was simple to use and, with its vibrant and responsive community, quick to provide bug fixes, which eased our deployment task.

We wrote some scripts and Ansible* [10] playbooks for predeployment work, such as configuring network interfaces on all the servers, installing software dependencies, and injecting ssh public keys. kolla uses Docker* containers and Ansible playbooks to install OpenStack services. For large deployments, we recommend running a Docker registry on the local network that contains all the container images for the services you anticipate running in your cloud.

Our configuration comprised three control nodes, three network nodes, and 58 compute nodes. We reserved one node to serve as the deployment and monitoring host. Please note that this setup is configurable in most deployment tools; it typically varies based on use case and anticipated workload. In a production system, TLS is used for security and REST API calls are made over HTTPS. However, we did not use TLS in our deployment, so all times reported here should only be used for their trends and not their absolute values.

Experiments
Most cloud deployments choose either LB or OVS, with the choice possibly depending on the administrators' prior experience with one or the other. Few, if any, deployments have tried to change switching technologies in their production clouds. We were curious to compare and contrast the two solutions in terms of their ease of deployment, control plane performance (network setup latency), and data path performance (speed of handling packets). We compared performance along these dimensions by varying the number of virtual machines being launched.

Test plan
We spawned 100, 200, 500, and 1000 virtual machines (VMs) for both LB and OVS deployments. All the VMs were on the same network, each VM was created with one virtual network interface, and a single switch port was hooked up per VM. Spinning up a VM involves some communication between nova and neutron to create virtual switch ports, which are then hooked up to the VMs. A small script (a sketch follows) facilitated creating these VMs and measuring the time it took them to become active. We did not mix switching technologies within our deployments.
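The following is a minimal sketch of that kind of launch-and-measure script, not the exact code used in this study. It assumes openstacksdk with a clouds.yaml entry named "osic"; the image, flavor, and network names are hypothetical placeholders, and pinging the VMs' fixed IPs assumes the measuring host can reach the tenant network.

    # vm_launch_timing.py -- a minimal sketch, not the exact script used in this study.
    # Assumes openstacksdk and a clouds.yaml entry named "osic"; the image, flavor,
    # and network names below are hypothetical placeholders.
    import subprocess
    import time

    import openstack

    N_VMS = 100

    conn = openstack.connect(cloud="osic")
    image = conn.compute.find_image("cirros")
    flavor = conn.compute.find_flavor("m1.tiny")
    network = conn.network.find_network("test-net")

    start = time.time()
    servers = [
        conn.compute.create_server(
            name="vm-%03d" % i,
            image_id=image.id,
            flavor_id=flavor.id,
            networks=[{"uuid": network.id}],
        )
        for i in range(N_VMS)
    ]

    # Launch time: the point at which nova reports every VM as ACTIVE.
    servers = [conn.compute.wait_for_server(s, wait=600) for s in servers]
    print("launch time for %d VMs: %.1f s" % (N_VMS, time.time() - start))

    # Ping time: the point at which every VM answers ICMP on its fixed IP.
    def first_ip(server):
        server = conn.compute.get_server(server.id)  # refresh to pick up addresses
        return next(iter(server.addresses.values()))[0]["addr"]

    pending = {first_ip(s) for s in servers}
    while pending:
        for ip in sorted(pending):
            if subprocess.call(["ping", "-c", "1", "-W", "1", ip],
                               stdout=subprocess.DEVNULL) == 0:
                pending.discard(ip)
    print("ping time for %d VMs: %.1f s" % (N_VMS, time.time() - start))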
VM launch results
In table 1, we share the results obtained from our VM spawn tests with LB and OVS. Launch time is measured as the period between the point at which nova's launch API is called and the point at which nova marks the task as accomplished in the database and returns all the VM IDs. However, many other activities happen behind the scenes before one can successfully ping the new VMs; we refer to the period during which these activities occur as the "ping time".

Note that the times indicated refer to the total number of VMs launched in a test. Interestingly, both launch and ping times appear to grow sublinearly, as seen in figure 2.

[Figure 2. VM launch and ping time growth: time in seconds versus the number of VMs launched (100 to 1000), for LB launch, OVS launch, LB ping, and OVS ping.]

The LB and OVS ping times track each other closely across the tests, with one large exception: the 500 VM run with OVS. Further investigation is required to determine the reason for this finding.

Our study of the logs indicates that further performance gains are possible by addressing issues in the use of RabbitMQ* for interprocess communication and by removing database connection bottlenecks with an active-active database.

                TOTAL LAUNCH TIME (SECONDS)        TOTAL PING TIME (SECONDS)
#VMS            LB            OVS                  LB            OVS
100             12            13                   35            43
200             21            21                   53            53
500             49            51                   210           115
1000            61            63                   268           263

Table 1. VM launch results, including both launch time and ping time.

Port create/destroy stress results
We launched and destroyed virtual machines repeatedly over ten runs. There was little difference between OVS and LB in port creation latency. The delete operation, however, exposed some differences: with LB, port cleanup took the same time across runs, while with OVS, performance degraded halfway through our ten runs. This might be a database effect, possibly caused by too many rows in the table.
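For illustration, here is a minimal sketch of a port create/delete stress loop. It is not the harness used in the study (which exercised ports indirectly by launching and deleting VMs); it creates and deletes ports directly against neutron, and it assumes openstacksdk, a clouds.yaml entry named "osic", and a hypothetical network named "test-net".

    # port_stress.py -- a minimal sketch of a create/delete port stress loop, not the
    # exact test harness used in this study.
    import time

    import openstack

    RUNS = 10
    PORTS_PER_RUN = 100

    conn = openstack.connect(cloud="osic")
    network = conn.network.find_network("test-net")

    for run in range(RUNS):
        t0 = time.time()
        ports = [
            conn.network.create_port(network_id=network.id,
                                     name="stress-%d-%d" % (run, i))
            for i in range(PORTS_PER_RUN)
        ]
        create_s = time.time() - t0

        t1 = time.time()
        for port in ports:
            conn.network.delete_port(port)
        delete_s = time.time() - t1

        print("run %d: create %.1f s, delete %.1f s" % (run, create_s, delete_s))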

Discussion
In the following section, we examine issues spanning ease of deployment and experimental results, and propose further work. A quick look at the VM launch times indicates that OpenStack handles them well: launch times grow more slowly than linearly in the number of VMs launched. However, to develop a deeper insight, we analyzed logs, CPU utilization, and the various running processes. See table 2 for a list of the uncovered issues.

Database bottleneck
The number of connections required to handle the volume of API calls was too large to be handled by the CPU, given that the database was configured in active-passive mode. If the database had been running in active-active mode, database connection handling would have been more successful. We had three control nodes running mySQL processes, but only one was actively serving all the database connection requests.

The Cloud Integrated Advanced Orchestrator (ciao) [15] is an open source scheduler for OpenStack that addresses the above issue by using a pull-mode scheduler, which reduces database locking dependencies.

Message queue woes
RabbitMQ was used for message queuing on all three control nodes. The RabbitMQ process consumed a lot of CPU cycles on all three control nodes, sometimes using 90% of the CPU by itself and leaving few resources for other processes. Some Erlang-level optimization of RabbitMQ would definitely save CPU cycles for other processes.

ciao [15] opted for a lightweight messaging protocol to reduce the messaging burden, both by reducing message size and by relaxing the need to ensure persistence.

Neutron-server
In our experiments we used a single network. For each VM, a virtual switch port is created and associated with the network as a tap device. At some points, the neutron-server process used over 90% of the CPU, which in turn led to delays in RabbitMQ message handling and other issues. Can the neutron server process be made more efficient and less chatty?

Conclusions
In the course of deploying the cloud with OVS and LB, we uncovered some issues and chased down their fixes, and in so doing we helped improve OpenStack for the community.

For a novice user, installing bare metal nodes and getting OpenStack up and running is non-trivial. Kolla and Cobbler are certainly good tools for this task, and our team had experience with them.

In the context of our original intent, both OVS and LB performed similarly with regard to port creation latency during a VM launch. However, the time needed to delete ports stays constant with LB, while OVS environments take increasingly longer to delete ports over repeated runs. Another interesting observation is that with LB, the ML2 plugin got out of sync as we continued launching VMs in the 1000 VM spawn test (see table 2).

ISSUE 1
Description: Missing Linux Bridge dependencies in the Kolla deployment.
Fix/patch/thoughts: https://review.openstack.org/#/c/304951/

ISSUE 2
Description: Kolla hardcoded a low limit on database connections; running at scale resulted in errors and timeouts.
Bug report: https://bugs.launchpad.net/kolla/mitaka/+bug/1563643
Fix/patch/thoughts: https://bugs.launchpad.net/kolla/mitaka/+bug/1563643/comments

ISSUE 3
Description: Scheduler retry attempts were capped at a maximum of 3, which resulted in a 40% failure rate under scale testing.
Bug report: https://bugs.launchpad.net/kolla/mitaka/+bug/1563664
Fix/patch/thoughts: Isaku Yamahata suggests exploring the use of queuing theory to identify a sweet spot.

ISSUE 4
Description: During scale testing, virtual machines were occasionally assigned multiple IP addresses despite the default request being for a single IP address.
Bug report: https://bugs.launchpad.net/kolla/mitaka/+bug/1565105
Fix/patch/thoughts: We speculate that the behavior stems from miscommunication between the scheduler and the message queuing service, or may be a bug in the deployment tool.

ISSUE 5
Description: ML2 gets out of sync with Linux Bridge as the number of launched VMs rises.
Bug report: https://bugs.launchpad.net/kolla/+bug/1568202
Fix/patch/thoughts: It could be a bug in the deployment tool not handling dependencies properly, or a bug in neutron.

Table 2. Uncovered issues.
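Two of these issues map to well-known upstream tuning knobs. Purely as an illustration (the values are placeholders, and how a deployment tool such as kolla exposes these settings varies), the nova scheduler retry cap from issue 3 and the database connection ceiling from issue 2 correspond to the following configuration options:

    # nova.conf -- scheduler retry cap (issue 3); the value shown is illustrative
    [DEFAULT]
    scheduler_max_attempts = 10

    # MySQL/MariaDB my.cnf -- connection ceiling (issue 2); the value shown is illustrative
    [mysqld]
    max_connections = 4096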

Future work
Our three-week experimentation period went by too fast, given all that we needed to learn to properly deploy, debug, configure, and support our cloud environment. Now that we are more knowledgeable, we hope to be able to dive right into our experiments when we are granted another allocation. We would like to explore the following items:
» Performance with DPDK (Data Plane Development Kit) [13] enabled OVS switches.
» Sustained load tests to sniff out leaks and other cruft.
» Determining the breaking-point load.
» Launching OpenStack on top of Kubernetes and exploring scaling of API services, particularly horizontal scaling.
» Effects of introducing a Software Defined Networking (SDN) [11] controller such as OpenDaylight* [14].
» Expanding the current study, focused on VMs, to include profiling of container networking solutions.
» Developing additional instrumentation to get deeper insight into port binding and virtual interface (VIF) driver performance.

Appendix: Requesting nodes on the OSIC cluster
The community is invited and encouraged to propose experiments and request an allocation within the OSIC developer cloud. Proposals that are deemed to return the highest value to the community are prioritized and granted allocations. Submit your request online at https://www.osic.org/clusters. There are three cluster sizes that can be allocated to the community based on usage plans and availability:
» Three cabinets, 66 bare metal nodes
» Six cabinets, 132 bare metal nodes
» Eleven cabinets, 242 bare metal nodes

Once you submit a request, you can expect a response in two to three weeks; it might take longer if the information provided in your request is incomplete. Access is typically granted for 21 days. A week prior to your allocation's expiration, you will be notified of the upcoming expiration so that you can back up important files, logs, results, and any code you have on those machines. Visit osic.org to get more details and become part of the OSIC community.

References
1. OpenStack: http://www.openstack.org/
2. User Survey: https://www.openstack.org/assets/survey/April-2016-User-Survey-Report.pdf
3. OSIC: https://osic.org/
4. iLO: https://en.wikipedia.org/wiki/HP_Integrated_Lights-Out
5. PXE: https://en.wikipedia.org/wiki/Preboot_Execution_Environment
6. Cobbler: http://cobbler.github.io/manuals/quickstart/
7. Bifrost: http://docs.openstack.org/developer/bifrost/
8. Ironic: https://wiki.openstack.org/wiki/Ironic
9. Kolla: https://wiki.openstack.org/wiki/Kolla
10. Ansible: https://en.wikipedia.org/wiki/Ansible_(software)
11. Software Defined Networking: https://www.opennetworking.org/sdn-resources/sdn-definition
12. OSIC Cluster Proposal: https://www.osic.org/clusters
13. DPDK: http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/packet-processing-is-enhanced-with-software-from-intel-dpdk.html
14. OpenDaylight: https://www.opendaylight.org/
15. ciao (Cloud Integrated Advanced Orchestrator): https://clearlinux.org/documentation/ciao-cluster-setup.html

* Other names and brands may be claimed as the property of others.