Best Practices for Oversubscription of CPU, Memory and Storage in vSphere Virtual Environments How far can oversubscription be taken safely?

Written by Scott D. Lowe

Abstract organizations have been able to reinvent the modern data One of the benefits of is that it enables center. Whereas data centers of ten years ago tended to be administrators to efficiently share host resources among server-centric places, modern data centers revolve around the different applications. In fact, administrators often needs of line-of-business applications, including ensuring that oversubscribe the physical resources on a host in order to those applications remain highly available and able to survive maximize the number of workloads that can run on a host. But the loss of host servers. how much oversubscription is too much? This paper discusses what oversubscription is and why it is used, explores the pros Virtualization has changed the data center dynamic in many and cons of the practice, and proposes some ideas about the other ways as well. While workloads used to be confined to point at which oversubscription becomes dangerous the hardware on which they were originally installed, in a modern data center, workloads are fluid; they flow from host Introduction to host based on sets of administrator-defined rules as well Virtualization enables data centers to focus on the needs of as in reaction to changes in the host environment. The fluidic business applications, but it can make resource allocation nature of the modern data center has added new challenges to more challenging. resource allocation, but over the years, both free and paid tools One of the great features of virtualization is the ability to run have been introduced to assist administrators in their resource many disparate workloads on a single host server, thereby planning efforts. maximizing the utilization of that host server. In doing so, Sizing resources in a resources than are actually available on virtualized environment the host. For example, suppose a host Virtualization makes hardware sizing has 96 GB of physical RAM. Under the more complex—but offers opportunities right circumstances, an administrator for improving data center efficiency. might assign 128 GB of RAM to all of the The rise of virtualization has also enabled virtual machines running on that host. the use of hardware in ways that were never envisioned even just ten years ago. But just how far can this oversubscription In those days, administrators purchased be taken? The limits depend on a servers sized to support the peak needs number of factors. This paper discusses of a single application, and that sizing oversubscription in general, explores included a projection of the resources its pros and cons, and proposes the application would likely need over some ideas about the point at which Administrators can the life of the server hardware. Because oversubscription becomes dangerous. oversubscribe the many servers were deployed with just a single application, resource planning Resource management and physical resources was relatively simple. In modern data oversubscription on a host in order center environments, which are heavily What is oversubscription? virtualized, resource planning takes on Oversubscription in vSphere refers to maximize new complexity: because a wide array to various methods by which more of I/O patterns will be present on single resources than are available on the the number of pieces of hardware, administrators need physical host can be assigned to the workloads that can insight into how individual applications virtual servers that are supported by interact with the rest of the environment. that host. In general, administrators run on it. In other have the ability to oversubscribe This blending of I/O in a heavily processing, memory and storage words, they can virtualized environment has also created resources in virtual machines. assign to virtual a significant opportunity for efficiency in the data center. Whereas administrators Refusing to oversubscribe resources is the machines, in used to size individual servers based safest choice, but often wastes resources. on the needs of a single application, Different administrators have aggregate, more the mixed nature of I/O in a virtual different opinions on the wisdom of resources than are environment enables sharing of resources oversubscribing physical resources. with different peak needs. As a result, Many administrators prefer to assign actually available on there is opportunity for administrators only those resources that are physically to very efficiently share host resources available to support all of the running the host. among different applications. workloads. This is the safest option as it ensures that, in general, all running Overprovisioning is useful—but how virtual machines will always have the much is too much? resources they need. Even with virtualization, administrators still sometimes overprovision resources However, in days of physical servers, it and size individual virtual machines was not uncommon to find that physical (VMs) to meet peak demands, so there servers rarely made use of all of their are often resources that go unused in resources. From a processor standpoint, a . vSphere provides a utilization averaged only 5–15 percent, number of powerful methods for sharing meaning that there was a whole lot of idle resources with other running room for growth. workloads. In addition, administrators can oversubscribe the physical resources on a host in order to maximize the number of workloads that can run on a host. In other words, they can assign to virtual machines, in aggregate, more

2 Oversubscription maximizes the value In vSphere, a physical CPU (pCPU) of resources—but introduces risk of the refers to: host not having enough resources to • When hyperthreading is not present or service all its VMs. enabled: A single physical CPU core While virtual machines are generally • When hyperthreading is present and more right-sized than their physical enabled: A single logical CPU core counterparts were in the past, there is still room to grow built in, especially Here are two examples: when particular workloads are idle. • If a host has two eight core processors and Many administrators see this as an hyperthreading is either not supported or opportunity to make use of those idle not enabled, that host has sixteen physical resources in order to maximize virtual CPUs (8 cores x 2 processors). machine density on a host. However, • If a host has two eight core processors Oversubscription with oversubscription, administrators are and hyperthreading is enabled, that host basically assigning to virtual machines has thirty-two physical CPUs (8 cores x 2 is an opportunity more resources than are actually processors x 2 threads per core). available on the host. In other words, to maximize VM if all of the virtual machines suddenly How those resources are presented to density on a host. requested access to all of their allocated virtual machines resources, the host would not have In a virtual machine, processors are But it introduces enough resources to service the needs. referred to as virtual CPUs (vCPUs). When an administrator adds vCPUs to a risks: if all of the VMs Resource oversubscription, while it does virtual machine, each of those vCPUs is suddenly request increase virtual machine density, carries assigned to a pCPU, although the actual with it some risks. Once a particular pCPU may not always be the same. access to all of their resource is finally exhausted, if that There must be enough pCPUs available resource happens to be oversubscribed, to support the number of vCPUs allocated resources, stability issues can occur and major assigned to an individual virtual machine performance problems can be or that virtual machine will not boot. the host will not introduced affecting all of the workloads have enough running on the host server. However, that doesn’t mean that administrators are limited to just the resources to service Before we discuss more about number of pCPUs in the host. On the oversubscription, it’s important to contrary, there is no 1:1 ratio between the needs. understand vSphere manages the three the number of vCPUs that can be basic types of resources: assigned to virtual machines and the • Processing resources number of physical CPUs in the host. • Memory resources In fact, as of vSphere 5.0, there is a • Storage resources maximum of 25 vCPUs per physical core, and administrators can allocate up to How vSphere manages 2,048 vCPUs to virtual machines on a processing resources single host. How physical resources are represented on a vSphere host How vSphere manages In vSphere, administrators assign CPUs memory resources to virtual machines in order to support vSphere uses a number of techniques the workload needs of each individual to maximize the use of RAM in a virtual machine. These virtual processing virtual environment: resources are pulled from the host’s • Transparent page sharing (TPS)—In most available physical CPUs. The number of virtual environments, administrators run physical CPUs that are present in hosts is many copies of the same . dependent on a couple factors. In these cases, there is a lot of duplication of memory pages in host memory. Transparent page sharing is basically a

3 form of memory deduplication—vSphere • Swapping to disk—Swapping to disk combines multiple identical memory pages is the ’s last-ditch effort to into just one and frees the remaining retrieve enough physical RAM to satisfy pages up for other uses. TPS has an almost the needs of workloads running on a imperceptible impact on the performance host. Swapping is a by which the of the host. hypervisor moves the least used memory • Memory ballooning—When the VMware pages to disk. Those memory pages are Tools are installed inside a guest virtual still accessible, but when they are required, machine, a memory balloon driver is they must be retrieved from disk. Swapping installed as well. This driver acts as a will noticeably degrade the overall Windows process, and the OS can use its performance of the host. normal techniques to assign idle or unused memory pages to Because vSphere’s other memory The number of the driver. The balloon driver then “pins” management techniques are so good, those pages and reports this back to the swapping usually takes place only on vCPUs that can be hypervisor. If the host becomes low on seriously overcommitted hosts, although physical memory, guest memory pages are swapping can also be caused by resource assigned to VMs is assigned to the balloon driver, and the host pool constraints or memory limits not limited to the can then reclaim these memory pages in configured on a virtual machine. In addition, order to address the needs of other virtual if a VM does not have VMware Tools number of physical machines that may need the RAM. installed or VMware Tools is not running, the ballooning process would get skipped CPUs in the host. In this way, when a particular virtual completely and the system will go straight machine has RAM to spare, it can to swapping. transparently share that RAM with other virtual machines on the same host, It should be noted that neither swapping enabling the host to achieve yet higher nor compression take place unless levels of VM density. Whereas TPS is a there is a memory contention issue on memory deduplication technique, the the host, or in the situations discussed ballooning process brings to RAM a sort of with regard to swapping. In most thin provisioning capability. The ballooning environments, memory contention process does require some processing issues that result in swapping or overhead, which is usually imperceptible compression should be avoided since in the performance of the guest and host. this situation means that the host has However, in extreme cases, ballooning can basically run out of RAM. cause swapping inside the OS. • Memory compression—Introduced How vSphere manages in vSphere 4.1, memory compression storage resources can, in some cases, replace the costly Storage is the third piece of the swapping process. With this technique, resource puzzle in a vSphere rather than memory pages being swapped environment. Storage resources can to disk on a per-VM basis, the memory also be oversubscribed through what pages are compressed and placed into have become very common resource a compression cache on disk. When the allocation techniques. need arises to swap to return to RAM a page that would have been swapped, the Thin provisioning is the most common page is instead retrieved from the cache technique for overprovisioning and uncompressed. While this process is storage resources. less costly than swapping to disk, it does The most common technique for still carry something of a performance hit. overprovisioning storage is a process known as thin provisioning. In many cases, when an administrator allocates storage to a virtual machine, more storage than is absolutely necessary

4 is allocated. After all, it’s reasonable acceptable and often recommended to expect that the virtual machine will method for managing storage capacity. continue to need additional disk space as time goes on. Some storage devices have additional features, such as data compression and Thin provisioning operates as follows: deduplication, that enable additional When an administrator provisions the levels of oversubscription. For the total disk space for the virtual machine, purposes of this paper, however, the the virtual machine is told that it has focus is on the hypervisor, so only thin access to the entirety of the allocated provisioning will be discussed. space. In reality, however, vSphere gives the virtual machine only the space that it Getting insight into resource usage is actually consuming. For instance, if an in your environment administrator allocates 200 GB to a new Now that we have see how resources Swapping to disk virtual machine, but that virtual machine are managed in a vSphere environment, is using only 40 GB, the remaining let’s move on to oversubscribing is the hypervisor’s 160 GB remain available for allocation those resources. In order to determine last-ditch effort to to other virtual machines. As a virtual whether resources are overcommitted, machine requires more space, vSphere the administrator needs a monitoring retrieve enough provides additional chunks to that virtual tool such as the free vOPS™ Server machine, up to the size of the disk that Explorer tool from Dell®. One of its physical RAM to was originally allocated. several utilities is Environment Explorer, satisfy the needs of which provides administrators with a By using thin provisioning, administrators high-level view of resource usage in workloads running can create virtual machines with the environment. As shown in Figure 1, virtual disks of a size that is necessary Environment Explorer shows resource on a host. Swapping in the long term, without having to utilization as a percentage of actual will noticeably immediately commit the total disk physical resources, making the utility space that is necessary to support that a perfect fit for exploring the topic of degrade the overall allocation. In many tests, it has been resource over-commitment. shown that thin provisioning carries with performance of it only a very slight—almost negligible— Now let’s explore some guidelines for the host. performance impact. Accordingly, thin oversubscribing each of the three types provisioning has become a common, of resources: processing, memory and storage.

Figure 1. Environment Explorer (part of the free vOPS Server Explorer tool) provides a high-level view of resource usage.

5 Oversubscribing processing Metrics to watch resources The following metrics will help you The common wisdom maintain a vCPU to pCPU ratio that As mentioned earlier, in vSphere 5, every makes efficient use of resources while physical processor core can support still allowing workloads to run well: up to 25 vCPUs. However, for every • Inside virtual machines additional workload beyond a 1:1 vCPU • CPU Utilization—When average CPU to pCPU ratio, the vSphere hypervisor usage remains high, it’s time to add an needs to invoke processor additional vCPU to the VM. in order to distribute processor time • On the host: to virtual machines that need it. For • CPU Ready—CPU Ready is, by far, the example, if an administrator has created most important gauge of overall host a vCPU to pCPU ratio of 5:1, then each health standpoint with regard to CPU. By using thin processor is supporting five vCPUs. CPU Ready indicates the length of time that a VM is waiting for enough physical provisioning, Experts in the field disagree about the processors to become available in order administrators can proper rules of thumb when it comes to meet its demands. For example, if a to vCPU to pCPU ratio. However, there VM is allocated four vCPUs, this metric create VMs with are two items on which just about will show the length of time that the VM everyone agrees: waited for four corresponding pCPUs to virtual disks of a size • Start with one vCPU per virtual machine. become available at the same time. that is necessary Most experts agree that administrators • CPU Utilization—The overall CPU should create new virtual machines with usage on the host server is also critical in the long term, just one vCPU and add virtual vCPUs because it enables an administrator to as needs dictate. As vCPUs are added, understand just how much work the without having to the virtual machine is tied to requiring host server is doing. immediately commit processor time from the host. Whenever the virtual machine needs to perform an Real-world observations and advice the total disk space operation, it has to wait for a number of physical CPUs equal to the number of Virtual resources that is necessary assigned vCPUs to be available. So, as 4 vCPUs - 200 % of Actual cores to support that administrators add more vCPUs to a virtual 3 GB Memory - 37 % of Physical machine, there is an increased risk of 26.6 GB Storage - 13 % of Provisioned allocation. poorer overall performance. Figure 2. A lab with a vCPU to pCPU • The vCPU to pCPU ration is workload ratio of 2;1 dependent. While 1:1 vCPU to pCPU assignment is sometimes advocated, other Online forums are filled with questions ratios are common. Although vSphere from users requesting insight into 5 supports a ratio of up to 25:1, your acceptable vCPU to pCPU ratios in a ability to achieve a high ratio will depend real-world environment. While some on the kinds of workloads you need to responses continue to advocate for a support. If the host is supporting lots of 1:1 ratio, from a pure density standpoint, virtual machines, each with only meager 1:1 should be considered a worst-case processing needs, the vCPU to pCPU ratio scenario. Figure 2 shows a lab with a could be quite high. If, however, the host ratio of 2:1. is running a number of processor intensive workloads, the ratio may be much smaller.

6 Some respondents indicate that they In addition, Environment Explorer have received guidance that suggests identifies where host processor no more than a 1.5:1 vCPU to pCPU resources are overcommitted, so you’ll ratio, but guidance from industry experts know where perform additional analysis suggests that vSphere real-world to determine if that over-commitment numbers are in the 10:1 to 15:1 range. is causing performance issues. As the Still others indicate that VMware itself has “% of actual cores” metric begins to a recommended ratio range of 6:1 to 8:1. surpass 500 percent, carefully monitor CPU Ready and general workload The Dell white paper, “Demystifying performance to ensure that business CPU Ready (% RDY) as a Performance needs are being met. Metric,” establishes the following vCPU:pCPU guidelines: Oversubscribing memory resources • 1:1 to 3:1 is no problem. The common wisdom Environment • 3:1 to 5:1 may begin to cause Oversubscribing RAM is one of the more performance degradation. controversial resource oversubscription Explorer (part of • 6:1 or greater is often going to cause options. Whereas CPU and storage Dell’s free vOPS a problem. resources are often overcommitted, there seems to be some conservatism Server Explorer In addition, keeping the CPU Ready when it comes to overcommitting RAM. metric at 5 percent or below is tool) provides a considered a best practice. Metrics to watch high-level view of On a host server, administrators need to The actual achievable ratio in a specific watch the amount of RAM actually in use resource usage. environment will depend on a number by virtual machines. As the actual RAM in of factors: use approaches 100 percent, either add • vSphere version—The vSphere CPU additional RAM to the server or migrate scheduler is always being improved. The workloads to hosts that have more newer the version of vSphere, the more available RAM. consolidation that should be possible. • Processor age—Newer processors are Real-world observations and advice much more robust than older ones, so In order to maximize VM density and organizations with newer processors ensure that the environment remains should be able to achieve higher operational, it’s important to monitor processor ratios. the actual memory utilization. That’s • Workload type—Different kinds of why Environment Explorer displays RAM workloads on the host will result in different usage (the “% of physical” metric) using optimal ratios. the amount of RAM actually provisioned to each virtual machine, rather than vScope Explorer, another utility included the amount of RAM actually being in vOPS Server Explorer, can help used by virtual machines once all of you investigate performance metrics, vSphere’s various memory management including CPU Ready, at both the host techniques are taken into consideration. and virtual machine levels, so you can determine whether your vCPU to pCPU ratio is too high.

7 The level of over-commitment possible storage volume running out of space, depends on one primary factor: how the VMs will still think they have available much memory deduplication can take disk space to use but there won’t be any place by virtue of the fact that there are space available. This can cause a serious many similar workloads running on the outages that can result in data loss host. The greater the level of disparity and costly recovery. So if you use thin between running workloads, the less provisioning, be sure to monitor carefully. memory consolidation that can take place and the less density that can be enjoyed. Metrics to watch Here are some observations about what To mitigate the risks associated with others are doing and recommending with thin provisioning, you need to keep a regard to memory over-commitment: close eye on the amount of free space Most experts agree • Many administrators refuse to in a datastore. As a datastore gets low oversubscribe RAM at all. on space, proactively add space to the that administrators • Some administrators prefer to not exceed datastore or use Storage vMotion to 125 percent of physical memory, feeling move one of the virtual machines to should create new that going beyond that limit carries a different datastore that has enough virtual machines unacceptable risk. available capacity to serve the needs of • If every workload on the server is identical, the workloads. with just one vCPU much higher over-commitment levels are possible. Real-world observations and advice and add virtual • Many other administrators simply spot- Thin provisioning is well represented vCPUs as needed. check host memory usage, but don’t in Environment Explorer, although it’s regularly scan for over-commitment levels. displayed only in aggregate. In the example in Figure 2, 26.8 GB of storage Oversubscribing storage resources is currently in use—13 percent of what’s The common wisdom actually provisioned to the three virtual It has become commonplace to machines that are powered on. As the oversubscribe storage resources using “% of provisioned” metric approaches thin provisioning. This technique offers 100 percent, take care to ensure that many benefits; the primary one is additional physical resources are maximizing the use of the organization’s made available. storage capacity. Plus, thin provisioning also helps IT in two ways. First, it enables Conclusion administrators to give a virtual machine Overprovisioning of processing, all of the storage it will ever need memory and storage can help you without having to constantly watch maximize resource utilization in your to see if it needs more space. Second, virtual environment. But you want to thin provisioning can reduce conflicts overprovision in such as way that you can between IT and other teams: application also maintain high performance. Using owners can request all of the storage the techniques and real-world advice they like and storage administrators— presented in this white paper, along with knowing full well that the request is the right tools, you can balance these too high—can simply grant the request needs and overprovision wisely. without worrying about wasting that over-requested storage.

However, thin provisioning also carries with it some challenges. While it can make life easier on a daily basis, it does add some complexity, and if administrators aren’t careful, they can introduce major availability issues. If the storage oversubscription results in the

8 For More Information © 2013 Dell, Inc. ALL RIGHTS RESERVED. This document DELL ASSUMES NO LIABILITY WHATSOEVER AND DISCLAIMS contains proprietary information protected by copyright. No ANY EXPRESS, IMPLIED OR STATUTORY WARRANTY RELATING part of this document may be reproduced or transmitted in TO ITS PRODUCTS INCLUDING, BUT NOT LIMITED TO, THE any form or by any means, electronic or mechanical, including IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR photocopying and recording for any purpose without the A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. IN NO written permission of Dell, Inc. (“Dell”). EVENT SHALL DELL BE LIABLE FOR ANY DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE, SPECIAL OR INCIDENTAL Dell, Dell Software, the Dell Software logo and products—as DAMAGES (INCLUDING, WITHOUT LIMITATION, DAMAGES identified in this document—are registered trademarks of Dell, FOR LOSS OF PROFITS, BUSINESS INTERRUPTION OR LOSS Inc. in the U.S.A. and/or other countries. All other trademarks OF INFORMATION) ARISING OUT OF THE USE OR INABILITY and registered trademarks are property of their respective TO USE THIS DOCUMENT, EVEN IF DELL HAS BEEN ADVISED owners. OF THE POSSIBILITY OF SUCH DAMAGES. Dell makes no representations or warranties with respect to the accuracy or The information in this document is provided in connection completeness of the contents of this document and reserves with Dell products. No license, express or implied, by estoppel the right to make changes to specifications and product or otherwise, to any intellectual property right is granted by descriptions at any time without notice. Dell does not make this document or in connection with the sale of Dell products. any commitment to update the information contained in this EXCEPT AS SET FORTH IN DELL’S TERMS AND CONDITIONS AS document. SPECIFIED IN THE LICENSE AGREEMENT FOR THIS PRODUCT,

About Dell Dell Inc. (NASDAQ: DELL) listens to customers and delivers worldwide innovative technology, business solutions and services they trust and value. For more information, visit www.dell.com.

If you have any questions regarding your potential use of this material, contact:

Dell Software 5 Polaris Way Aliso Viejo, CA 92656 www.dell.com Refer to our Web site for regional and international office information.

Whitepaper-vSphere-Environ-US-KS-2013-03-28