White Paper

EMC INFRASTRUCTURE FOR HIGH PERFORMANCE MICROSOFT AND ORACLE DATABASE SYSTEMS EMC Symmetrix VMAX 40K, EMC VFCache, NEC Express5800/A1080a-E, and VMware vSphere 5

• Simplified storage management with FAST VP
• Accelerated performance with VFCache

EMC Solutions Group

Abstract

This white paper describes an automated storage tiering solution for multiple mission-critical applications virtualized with VMware vSphere® on the EMC® Symmetrix® VMAX® 40K storage platform. With VFCache™ enabled at the host level, read I/O is cached on the server and offloaded from the VMAX storage virtual pool.

November 2012

Copyright © 2012 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

VMware, ESX, vMotion, VMware vCenter, and VMware vSphere are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions.

All trademarks used herein are the property of their respective owners.

Part Number H11035.1

Table of contents

Executive summary
  Business case
  Solution overview
  Key results

Introduction
  Purpose
  Scope
  Audience
  Terminology

Key technology components
  Overview
  EMC VFCache
    Server-side Flash caching for maximum speed
    Write-through caching to the array for total protection
    Application agnostic
    Integration with vSphere
    Minimum impact on system resources
    VFCache active/passive clustering support
  EMC Symmetrix VMAX 40K
  EMC Virtual Provisioning
  EMC FAST VP
  NEC Express5800/A1080a-E
  VMware vSphere 5 components
    VMware vSphere 5
    VMware vCenter Server
    EMC PowerPath/VE
  Oracle Database 11g R2
    Oracle Automatic Storage Management
    Oracle Grid Infrastructure
  Microsoft SQL Server 2012
    SQL Server Failover Clustering

Solution architecture and design
  Overview
  Physical architecture
  Hardware resources
  Software resources
  Storage connectivity
  Storage Virtual Provisioning design
  FAST VP configuration
  Storage design considerations
  Oracle database and workload profile
  Oracle database schema
  Oracle database services
  Oracle LUN configuration
  Microsoft SQL Server workload type
  SQL Server 2012 DSS workload and profile
  SQL Server 2012 DSS LUN configuration
  SQL Server 2012 OLTP workload and profile
  SQL Server 2012 OLTP LUN configuration
  SQL Server 2012 and Windows 2008 R2 settings for DSS and OLTP workloads
    Operating system and SQL Server instance settings
    Database settings
  VMware vSphere configuration
    VMware virtual machine configuration
    SQL Server 2012 clustering on VMware
    EMC Virtual Storage Integrator
    VFCache configuration with VMware

Performance testing processes
  Overview
  Validation
  Application workloads
  Test procedure
  Test scenarios

Three-tier FAST VP without and with VFCache
  Objective
  Test scenarios
  Test result summary
  Three-tier FAST VP performance results

Two-tier FAST VP without and with VFCache
  Overview
  Test scenarios
  Two-tier FAST VP OLTP performance results
  Two-tier VFCache and FAST VP OLTP performance results
  Two-tier FAST VP performance results breakdown

VFCache impact on FAST VP
  Overview
  Three-tier FAST VP behavior with VFCache

VFCache with DSS workload
  Overview
  VFCache as the cache
  VFCache as the DAS store

VFCache with SQL Server failover cluster instance
  Overview
  Microsoft failover clustering (active/passive) support
  Validation

Conclusion
  Summary
  Findings

References
  White papers
  Product documentation
  Other documentation

Executive summary

Business case

As enterprises move their databases and applications to the private cloud, their IT organizations must strive for more efficiency and improved quality of service, including:

• Extending the high performing Flash technology from storage to host to support mixed workloads.

• Moving the workload from the storage array to host-based VFCache™ so that the array can serve more I/Os for other applications.

• Reducing capital expenditures and ongoing costs.

• Maintaining high performance levels and providing predictable performance to deliver the quality of service required in these environments.

It is essential that infrastructure and tools simplify storage management processes and improve performance with a minimum of manual tasks.

Solution overview

EMC® Symmetrix® VMAX® 40K, and associated management tools, have been developed to be the foundation of this infrastructure and to meet real business needs:

• Performance optimization—Optimizing and prioritizing business applications, allowing customers to dynamically allocate resources within a single array.

• Ease of management—Eliminating the need to manually re-tier applications when performance objectives change over time.

• Host-side storage acceleration—Accelerating application performance to extreme levels by keeping hot read-cache data closest to server memory with VFCache.

NEC Express5800/A1080a-E is the base server platform of this solution. Representing the fifth generation of enterprise server architecture from NEC, this line of servers continues NEC’s legacy of developing scalable enterprise servers that offer exceptional configuration flexibility, capacity, reliability, and availability. Paired with VMware vSphere® 5.0, the NEC Express5800/A1080a-E creates an outstanding platform for enterprise virtualization needs.

Key results

Our testing shows that this solution, based on EMC Symmetrix VMAX with Enginuity™ 5876, FAST VP, and EMC VFCache, provides the following performance results:

• VFCache improves OLTP performance by offloading much of the read I/O traffic from the storage array, freeing array resources for other applications.

• VFCache can solidly support an OLTP workload backed by SAN-based central storage, and can improve application performance on two-tier FAST VP.

• Active/passive-hypervisor and physical cluster support means that VFCache can ensure data integrity while accelerating application performance in a highly available environment.

Introduction

Purpose

This white paper describes the design, testing, and validation of an enterprise VMware® infrastructure using the EMC Symmetrix VMAX 40K storage platform with Enginuity 5876 and EMC VFCache as its foundation. This solution demonstrates how VFCache complements Symmetrix FAST VP in providing performance, scalability, and application-specific functionality, using representative application environments that include Microsoft SQL Server and Oracle.

Specifically, this solution:

• Validates that VFCache hardware can be shared by, and serve, multiple applications in a VMware virtualized environment.

• Validates that VFCache can be consolidated with FAST VP enabled on Symmetrix VMAX, and also that recently accessed data within workloads can be effectively offloaded from SAN-based central storage to a VFCache card.

• Validates that VFCache can improve performance while the SAN-based storage has limited spindles to provide excellent response times for the read-intensive workload.

• Demonstrates that data integrity can be guaranteed by adding VFCache to a clustered SQL Server instance.

Scope

This white paper discusses multiple EMC products as well as products from other vendors. Some general configuration and operational procedures are outlined. However, for detailed product installation information, refer to the user documentation provided with those products.

Audience

This white paper is intended for EMC employees, partners, and customers, including IT planners, virtualization architects and administrators, and any other IT professionals involved in evaluating, acquiring, managing, operating, or designing infrastructure that leverages EMC technologies.

Throughout this white paper, we assume that you have some familiarity with the concepts and operations related to enterprise storage and virtualization technologies and their use in information infrastructures.

Terminology

Table 1 defines several terms used in this paper.

Table 1. Terminology

Term Definition

ASM Oracle Automatic Storage Management

DSS Decision Support System (that is, data warehouse)

FAST VP Fully Automated Storage Tiering for Virtual Pools

FC Fibre Channel

FCI Failover Cluster Instance

HBA Host bus adapter

HS Hot swap

IOPS I/Os per second

LUN Logical unit number

NIC Network interface controller

OLTP Online transaction processing

pRDM physical Raw Device Mapping

RAID Redundant array of independent disks

SAN Storage area network

SAS Serial Attached SCSI

SATA Serial Advanced Technology Attachment

SCSI Small Computer System Interface

SLC Single-Level Cell

TDev Thin device

TPS Transactions per second

VSI Virtual Storage Integrator

Key technology components

Overview

This solution used the following key hardware and software components:

• EMC VFCache
• EMC Symmetrix VMAX 40K storage array
• EMC Virtual Provisioning™
• EMC FAST VP
• NEC Express5800/A1080a-E
• VMware vSphere®
• Oracle Database 11g R2 Enterprise Edition
• Microsoft SQL Server 2012 Enterprise Edition

EMC VFCache

EMC VFCache is a server Flash-caching solution that reduces latency and increases throughput to improve application performance by using intelligent caching software and Peripheral Component Interconnect Express (PCIe) Flash technology. A number of VFCache features are highlighted below. For more information, refer to the VFCache Installation and Administration Guide.

Server-side Flash caching for maximum speed

VFCache software caches the most frequently referenced data on the server-based PCIe card, thereby putting the data closer to the application.

VFCache caching optimization automatically adapts to changing workloads by determining which data is most frequently referenced and promoting it to the server Flash card. This means that the “hottest” (that is, the most active) data automatically resides on the PCIe card in the server for faster access.
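The promotion behavior described above can be pictured with a small sketch. This is a hypothetical, simplified model for illustration only (the class and names are invented, not EMC's actual caching algorithm): reference counts are tracked per extent, and the most frequently referenced extents are kept on the server Flash card.

```python
from collections import Counter

class HotDataPromoter:
    """Toy model of frequency-based cache promotion (not EMC's algorithm)."""

    def __init__(self, flash_capacity_extents):
        self.capacity = flash_capacity_extents
        self.refs = Counter()      # extent id -> reference count
        self.flash = set()         # extents currently on the PCIe Flash card

    def access(self, extent):
        self.refs[extent] += 1
        self._rebalance()
        return extent in self.flash  # True == served from server Flash

    def _rebalance(self):
        # Keep the N most frequently referenced extents on the Flash card.
        self.flash = {e for e, _ in self.refs.most_common(self.capacity)}

cache = HotDataPromoter(flash_capacity_extents=2)
for extent in ["a", "a", "a", "b", "b", "c"]:
    cache.access(extent)
# The two hottest extents ("a" and "b") now reside on Flash; "c" does not.
```

As the access pattern shifts, the reference counts shift with it, so the Flash-resident set automatically tracks the current "hottest" data.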

VFCache offloads read traffic from the storage array, which allows the array to allocate greater processing power to other applications. While one application is accelerated with VFCache, the array’s performance for other applications is maintained or even slightly enhanced.

Write-through caching to the array for total protection

VFCache accelerates reads and protects data by using a write-through cache to the storage array, delivering persistent high availability, integrity, and disaster recovery.
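Write-through semantics can be sketched as follows. This is a toy model under invented names, not the VFCache implementation: every write is committed to the array before the cache is populated, so the array always holds the authoritative copy and a server failure never loses committed data.

```python
class WriteThroughCache:
    """Toy write-through cache: the array always holds the authoritative copy."""

    def __init__(self):
        self.cache = {}        # server-side Flash cache (lost on server failure)
        self.array = {}        # SAN array (persistent, protected storage)

    def write(self, lba, data):
        self.array[lba] = data     # write through to the array first
        self.cache[lba] = data     # then populate the cache

    def read(self, lba):
        if lba in self.cache:      # cache hit: served from server Flash
            return self.cache[lba]
        data = self.array[lba]     # cache miss: fetch from the array
        self.cache[lba] = data     # promote for future reads
        return data

c = WriteThroughCache()
c.write(42, b"committed")
assert c.array[42] == b"committed"   # array copy exists immediately
c.cache.clear()                      # simulate a server restart (cache lost)
assert c.read(42) == b"committed"    # data survives on the array
```

Because the array copy is always current, array-based replication and disaster recovery continue to work unchanged underneath the cache.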

Application agnostic

VFCache is transparent to applications; no rewriting, retesting, or recertification is required to deploy VFCache in the environment.

Integration with vSphere

Integration of the VSI plug-in with VMware vCenter simplifies the management and monitoring of VFCache.

Minimum impact on system resources

VFCache does not require a significant amount of host memory or CPU cycles, because the majority of Flash management is performed on the PCIe card itself. Unlike other PCIe caching solutions, VFCache imposes no significant overhead on server resources.

VFCache active/passive clustering support

VFCache clustering support ensures the data integrity of an active/passive clustered application. The VFCache-enabled cluster also accelerates application performance.

EMC Symmetrix VMAX 40K

EMC Symmetrix VMAX 40K with Enginuity version 5876 provided the tiered storage configuration used in the test environment.

Built on the strategy of powerful, trusted, smart storage, this solution incorporated a highly scalable Virtual Matrix Architecture™ that enables Symmetrix VMAX arrays to grow seamlessly and cost-effectively. Symmetrix VMAX supports Flash drives, FC drives, and SATA drives within a single array, as well as an extensive range of RAID types.

The EMC Enginuity operating environment controls all components in the Symmetrix VMAX array. Enginuity 5876 for Symmetrix VMAX offers:

• More efficiency: New zero downtime technology for migrations (technology refreshes) and lower costs with automated tiering.

• More scalability: Up to two times more performance, with the ability to manage up to 10 times more capacity per storage administrator.

• More security: Built-in encryption, RSA-integrated key management, increased value for virtual server and mainframe environments, replication enhancements, and a new electronic licensing model.

EMC Virtual Provisioning

EMC Virtual Provisioning is EMC’s implementation of thin provisioning. It is designed to simplify storage management, improve capacity utilization, and enhance performance. Virtual Provisioning separates the physical storage devices from the storage devices as perceived by host systems, which enables nondisruptive provisioning and more efficient storage use. This solution uses virtually provisioned storage for all deployed applications.

For detailed information on Virtual Provisioning, refer to the EMC Solutions Enabler Symmetrix Array Controls CLI v7.4 Product Guide.
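The core idea of thin provisioning, allocating physical extents only on first write, can be sketched as follows (a hypothetical illustration with invented names, not the Symmetrix implementation):

```python
class ThinPool:
    """Toy thin pool: a thin device (TDev) presents its full size to the
    host, but shared pool extents are allocated only when data is written."""

    def __init__(self, pool_extents):
        self.free = pool_extents           # shared physical extents
        self.allocated = {}                # (tdev, extent_no) -> True

    def write(self, tdev, extent_no):
        key = (tdev, extent_no)
        if key not in self.allocated:      # first write: allocate on demand
            if self.free == 0:
                raise RuntimeError("thin pool exhausted")
            self.free -= 1
            self.allocated[key] = True

pool = ThinPool(pool_extents=100)
# A large TDev is presented to the host, but writing to only three
# extents consumes just three extents of physical pool capacity.
for extent in (0, 1, 2):
    pool.write("tdev_0", extent)
assert pool.free == 97
```

Rewrites to an already-allocated extent consume no additional capacity, which is why hosts can be presented with far more logical capacity than the pool physically contains.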

EMC FAST VP

EMC FAST VP is a feature of Enginuity version 5875 and higher that provides automatic storage tiering at the sub-LUN level. Virtual pools are Virtual Provisioning thin pools.

FAST VP provides support for sub-LUN data movement in thin-provisioned environments. It combines the advantages of Virtual Provisioning with automatic storage tiering at the sub-LUN level to optimize performance and cost, while simplifying storage management and increasing storage efficiency.

FAST VP data movement between tiers is based on performance measurement and user-defined policies, and is executed automatically and nondisruptively.
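A sub-LUN tiering decision of this kind can be sketched as a ranking problem. The sketch below is an invented, simplified planner for illustration only (not EMC's algorithm): extents are ranked by recent I/O activity, and the busiest extents are mapped to the fastest tier a policy allows.

```python
def plan_moves(extent_stats, policy):
    """Toy FAST VP-style planner: rank sub-LUN extents by recent I/O
    activity and fill tiers in order Flash > FC > SATA, honoring the
    per-tier extent quotas given by the policy."""
    ranked = sorted(extent_stats, key=extent_stats.get, reverse=True)
    plan, start = {}, 0
    for tier in ("flash", "fc", "sata"):
        quota = policy.get(tier, 0)            # extents allowed on this tier
        for extent in ranked[start:start + quota]:
            plan[extent] = tier
        start += quota
    return plan

# IOPS observed per extent over the last measurement window
stats = {"e1": 900, "e2": 40, "e3": 700, "e4": 5}
plan = plan_moves(stats, policy={"flash": 1, "fc": 2, "sata": 1})
# The busiest extent lands on Flash, the coldest on SATA.
assert plan == {"e1": "flash", "e3": "fc", "e2": "fc", "e4": "sata"}
```

Re-running the planner on fresh statistics yields the next round of promotions and demotions, which the array would apply nondisruptively in the background.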

NEC Express5800/A1080a-E

The NEC Express5800/A1080a-E has many key design features that are ideal for mixed-workload, large-scale virtualization. With a maximum memory configuration of 2 TB, eight CPU sockets (160 threads), and 14 PCI Express 2.0 slots, consolidating entire database, application, and Web infrastructures onto a single NEC Express5800/A1080a-E is the preferred solution to growing IT needs.

VMware vSphere 5 components

For this solution, the Microsoft SQL Server and Oracle application servers are fully virtualized using VMware vSphere 5. This section describes the virtualization infrastructure, which uses the following components and options:

• VMware vSphere 5.0.1
• VMware vCenter™ Server
• EMC PowerPath®/VE for VMware vSphere version 5.7

VMware vSphere 5

VMware vSphere 5 is a complete, scalable, and powerful virtualization platform, with infrastructure services that transform IT hardware into a high-performance shared computing platform, and application services that help IT organizations deliver the highest levels of availability, security, and scalability.

VMware vCenter Server

VMware vCenter is the centralized management platform for vSphere environments, enabling control and visibility at every level of the virtual infrastructure.

EMC PowerPath/VE

EMC PowerPath/VE for VMware vSphere delivers PowerPath multipathing features to optimize VMware vSphere virtual environments. PowerPath/VE installs as a kernel module on the VMware ESXi™ host and works as a multipathing plug-in (MPP) that provides enhanced path management capabilities to ESXi hosts.

Oracle Database 11g R2

Oracle Database 11g Release 2 Enterprise Edition delivers industry-leading performance, scalability, security, and reliability on a choice of clustered or single servers running Windows, Linux, or UNIX. It provides comprehensive features for transaction processing, business intelligence, and content management applications.

Oracle Automatic Storage Management

Oracle Automatic Storage Management (ASM) is a volume manager and a file system for Oracle databases. ASM is Oracle's recommended storage management solution and provides an alternative to conventional volume managers, file systems, and raw devices.

ASM uses disk groups to store data files. An ASM disk group is a collection of disks that ASM manages as a unit. Within a disk group, ASM exposes a file system interface for Oracle database files. The content of files stored in a disk group is evenly distributed, or striped, across the disks to eliminate hot spots and to provide uniform performance. The performance is comparable to that of raw devices.
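The even distribution of file extents across a disk group can be sketched as a round-robin allocation (a simplified illustration with invented names, not Oracle's actual extent allocator):

```python
def stripe_extents(file_extents, disks):
    """Toy model of ASM striping: allocate a file's extents round-robin
    across every disk in the disk group, so I/O is spread evenly and no
    single spindle becomes a hot spot."""
    layout = {d: [] for d in disks}
    for i in range(file_extents):
        layout[disks[i % len(disks)]].append(i)
    return layout

layout = stripe_extents(file_extents=8, disks=["disk1", "disk2", "disk3", "disk4"])
assert layout["disk1"] == [0, 4]                           # extents interleave
assert all(len(extents) == 2 for extents in layout.values())  # even spread
```

Because every disk receives the same share of extents, a sequential scan or a random-read workload exercises all spindles in parallel rather than concentrating on one.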

Oracle Grid Infrastructure

For this solution, Oracle Grid Infrastructure was installed with the Standalone Server option.

The Oracle Grid Infrastructure for a standalone server provides system support for an Oracle database including volume management, file system, and automatic restart capabilities. If you plan to use Oracle Restart or Oracle ASM, then you must install Oracle Grid Infrastructure before you install and create the database.

Oracle Grid Infrastructure for a standalone server combines Oracle Restart and Oracle ASM into a single set of binaries that is installed in the Oracle Grid Infrastructure home.

Microsoft SQL Server 2012

Microsoft SQL Server 2012 is Microsoft’s database management and analysis system for e-commerce, line-of-business, and data warehousing solutions. By enabling a modern data platform with SQL Server 2012, users get built-in, mission-critical capabilities and breakthrough insights across the organization with familiar analytics tools and enterprise-ready Big Data solutions.

SQL Server Failover Clustering

In SQL Server failover clustering, the operating system and SQL Server work together to provide availability in case of an application failure, hardware failure, or operating system error. Failover clustering provides hardware redundancy through a configuration in which vital, shared resources are automatically transferred from a failing computer to an equally configured server. In active/passive mode, a SQL Server failover cluster runs one instance serving a set of databases.
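The active/passive resource transfer can be sketched as a small state machine (a hypothetical illustration with invented names, not the WSFC implementation): the shared resources move as one unit to a surviving node when the active node fails.

```python
class FailoverCluster:
    """Toy active/passive failover: shared resources (disks, the SQL
    Server instance, its network name) move as a unit to the surviving
    node when the active node fails."""

    def __init__(self, nodes, resources):
        self.nodes = list(nodes)
        self.active = self.nodes[0]       # one node owns the instance
        self.resources = resources        # shared disks, instance, VIP

    def fail_node(self, node):
        if node == self.active:
            survivors = [n for n in self.nodes if n != node]
            if not survivors:
                raise RuntimeError("no surviving node; cluster is down")
            self.active = survivors[0]    # resources transfer as a group
        self.nodes = [n for n in self.nodes if n != node]

cluster = FailoverCluster(["node1", "node2"], ["shared_disk", "sql_instance", "vip"])
cluster.fail_node("node1")
assert cluster.active == "node2"          # instance now runs on the passive node
```

Because the databases live on shared storage rather than on either node, the surviving node sees exactly the data the failed node last committed.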

Solution architecture and design

Overview

EMC solutions are validated architectures that are designed to reflect real-world deployments. This section describes the key components, resources, and overall architecture that make up this solution and its environment.

Physical architecture

Figure 1 depicts the physical architecture for this solution.

Figure 1. Physical architecture diagram

This solution is built on an EMC Symmetrix VMAX 40K array running Enginuity 5876. The array provides a mix of Flash, FC, and SATA drives. FAST VP continually monitors and tunes performance by relocating data across storage tiers, based on access patterns and predefined FAST policies.

We provisioned Microsoft SQL Server 2012 (two OLTP workloads and one Decision Support System (DSS)) and Oracle 11g R2 (OLTP). We also built Microsoft failover clustering on the virtualized environment to validate VFCache and Microsoft Cluster Service (MSCS) consolidation. These applications ran on virtual machines in a VMware vSphere 5 cluster environment on EMC VMAX 40K storage.

Load generation tools drove these applications simultaneously to validate the infrastructure and acceleration function from VFCache. Failover was performed for SQL Server failover clustering to verify VFCache and Windows Server Failover Clustering (WSFC) integration.

The effects of applying the FAST policy are documented in Performance testing processes.

Hardware resources

Table 2 lists the hardware resources used in this solution environment.

Table 2. Hardware resources

Equipment | Quantity | Configuration

EMC Symmetrix VMAX 40K | 1 | Three-engine, 128 GB cache per engine; Enginuity 5876; 33 × 200 GB Flash drives (including 1 HS); 132 × 600 GB 10k FC drives (including 6 HS); 70 × 2 TB 7.2k SATA drives (including 3 HS); 64 × 450 GB 15k FC drives (including 2 HS)

NEC Express5800/A1080a-E | 2 | 8-socket (10 cores/2.40 GHz/30 MB cache); 1 TB RAM; 4 GbE IP ports; 4 × 146 GB 2.5-in. 15k SAS disks; 1 × internal RAID controller; 12 × PCIe x8 slots and 2 × PCIe x16 slots; 2 × dual-port 8 Gb/s HBAs (4 FC ports); 1 × quad-port GbE NIC

SAN | 1 | 8 Gb enterprise-class FC switch

VFCache | 2 | 700 GB SLC EMC VFCache cards

Software resources

Table 3 lists the software resources used in this solution environment.

Table 3. Software resources

Software Version

EMC Symmetrix VMAX Enginuity code 5876

EMC PowerPath/VE for VMware 5.7

EMC Unisphere® for VMAX 1

EMC Solutions Enabler 7.4

VMware vSphere 5 (Enterprise Plus) 5.0.1

Oracle ASMlib 2.0.5

Oracle Database 11g R2 11.2.0.3

Microsoft Windows Server 2008 R2 SP1

Microsoft SQL Server 2012 RTM

Microsoft TPC-E toolkit 1.12.0

Quest Benchmark Factory 5.8.1

Red Hat Enterprise Linux Server 5.7

Swingbench 2.3

VFCache driver 1.5

VFCache software 1.5

Storage connectivity

The application workloads were logically separated using masking views within the VMAX 40K and HBAs. Figure 2 shows the front-end port use for each application. Oracle and SQL Server OLTP workloads were on separate ESXi hosts but shared the same eight front-end ports on the VMAX. The SQL Server OLTP and DSS workloads were on the same ESXi host, but their front-end ports on the VMAX were separate. The purpose was to separate the applications by I/O size. OLTP I/O is typically 8 KB to 64 KB in size, and disk performance is measured in IOPS. For DSS, with large I/O sizes ranging from 8 KB to 256 KB, disk performance is typically measured by throughput (in megabytes per second).
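The distinction between the two metrics follows from simple arithmetic: throughput is the product of IOPS and I/O size. A quick sketch (the figures are illustrative only, not measurements from this solution):

```python
def throughput_mbps(iops, io_size_kb):
    """MB/s = IOPS x I/O size. Shows why OLTP is sized by IOPS while
    DSS, with much larger I/Os, is sized by throughput."""
    return iops * io_size_kb / 1024

# Illustrative figures only (not results from this solution):
oltp = throughput_mbps(iops=10_000, io_size_kb=8)    # small random I/O
dss = throughput_mbps(iops=2_000, io_size_kb=256)    # large sequential I/O
assert round(oltp) == 78      # ~78 MB/s despite 10,000 IOPS
assert round(dss) == 500      # 500 MB/s from only 2,000 IOPS
```

A DSS workload can saturate front-end bandwidth at a fraction of the OLTP I/O rate, which is why the two workload types were kept on separate front-end ports.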


Figure 2. Logical grouping of ports to applications

Storage Virtual Provisioning design

EMC Virtual Provisioning greatly simplifies storage design. We created four thin pools on the array, based on the drive types available. Table 4 shows the thin pool definitions.

Table 4. Thin pool configuration

Thin pool name | Drive size/technology/RPM | RAID protection | No. of data drives | Data device size | No. of data devices | Pool capacity | FAST VP usage

FLASH_3RAID5 | 200 GB Flash | RAID5 3+1 | 32 | 68.8 GB | 64 | 4.2 TB | Oracle/SQL Server OLTP

FC10K_RAID1 | 600 GB FC 10k | RAID1 | 126 | 66 GB | 504 | 32 TB | Oracle/SQL Server OLTP

FC15K_RAID1 | 450 GB FC 15k | RAID1 | 64 | 49.2 GB | 256 | 12.2 TB | DSS

SATA_6RAID6 | 2 TB SATA 7.2k | RAID6 6+2 | 72 | 240 GB | 256 | 60 TB | Oracle/SQL Server OLTP/DSS

For this solution, the Oracle and Microsoft OLTP applications were bound to the FC10K_RAID1 pool. The Microsoft DSS application was bound to the FC15K_RAID1 pool, which was backed by a smaller number of drives.

FAST VP configuration

VMAX administrators can set high-performance policies that use more Flash drive capacity for critical applications, and cost-optimized policies that use more SATA drive capacity for less critical applications.

The ideal FAST VP policy is to specify a 100 percent allocation for each of the tiers included. Such a policy provides the greatest amount of flexibility to an associated storage group, as it allows 100 percent of the storage group’s capacity to be promoted or demoted to any tier within the policy.

However, data warehouse applications tend to issue scan-intensive operations that access large portions of the data at a time, and also commonly perform bulk loading operations. These operations result in larger I/O sizes than OLTP workloads, and they require a storage subsystem that can provide the required throughput. This makes throughput, in megabytes per second (MB/s), the critical metric.

Although Flash storage can provide more than 100 MB/s of throughput, it is generally best suited to serving a small portion of the database’s hot data. Therefore, in this solution, we used a two-tier policy consisting of FC and SATA storage to provide a cost-efficient mix of storage to satisfy the needs of DSS workloads.

Table 5 shows the FAST VP policies used for the application workloads in this solution for Oracle, SQL Server OLTP, and SQL Server DSS.

Table 5. FAST VP policy for Oracle, SQL Server OLTP, and SQL Server DSS

Storage group FAST policy name Flash FC SATA

MSSQL_OLTP MSSQL_OLTP 100 percent 100 percent 100 percent

MSSQL_DSS MSSQL_DSS 0 percent 100 percent 100 percent

Oracle Oracle 100 percent 100 percent 100 percent

Storage design considerations

The design incorporates the following recommended practices for mission-critical database applications with FAST VP:

• Use separate storage volumes for data files and log files.
• Use separate file groups for large databases.
• For ASM, EMC recommends separate ASM disk groups for DATA, REDO, FRA, and TEMP.

• Bind all thin devices to the FC tier.
• Pin log devices and temp files to the FC tier.

Figure 3 shows an overview of how each critical application is configured for FAST VP. In this implementation, only data LUNs are managed by FAST VP. LUNs for OS, temp, and log are pinned to the FC tier, excluding them from FAST VP decisions and movement.

Note: The DSS SQL Server instance’s tempdb is the exception; it is moved out of the FC tier after the DSS tempdb is moved to VFCache.


Figure 3. General view of FAST VP configuration for mission-critical applications

Oracle database and workload profile

The Swingbench Order Entry (SOE) PL/SQL schema was used to deliver the OLTP workloads required by this solution. Swingbench consists of a load generator, a coordinator, and a cluster overview. The software enables a load to be generated, and the transactions and response times to be charted.

Table 6 details the Oracle database and workload profile for this solution.

Table 6. Oracle database and workload profile

Profile characteristic Details

Database size 2 TB

Database version Oracle Database 11g R2 single instance

Storage type Oracle ASM

Oracle system global area (SGA) 24 GB

Workload type OLTP

Workload profile Swingbench Order Entry (TPC-C-like) workload

Database metric Transactions per second (TPS)

Workload read/write ratio 80/20

Swingbench sessions 1,800

Oracle database schema

Two identical schemas, SOE1 and SOE2, were used to deliver the OLTP workloads required by this solution. A Swingbench Order Entry workload was generated and run against schema SOE1. Because the I/O distribution across that schema was completely even and random, the entire database was highly active and sub-LUN skewing was reduced; the second schema, SOE2, therefore remained idle to simulate a more typical environment in which some objects are not highly accessed.

Table 7 lists the tables and indexes for the SOE schema used in this solution (SOE1).

Table 7. SOE schema

Table name Index

CUSTOMERS CUSTOMERS_PK (UNIQUE), CUST_ACCOUNT_MANAGER_IX, CUST_EMAIL_IX, CUST_LNAME_IX, CUST_UPPER_NAME_IX

INVENTORIES INVENTORY_PK (UNIQUE), INV_PRODUCT_IX, INV_WAREHOUSE_IX

ORDERS ORDER_PK (UNIQUE), ORD_CUSTOMER_IX, ORD_ORDER_DATE_IX, ORD_SALES_REP_IX, ORD_STATUS_IX

ORDER_ITEMS ORDER_ITEMS_PK (UNIQUE), ITEM_ORDER_IX, ITEM_PRODUCT_IX

PRODUCT_DESCRIPTIONS PRD_DESC_PK (UNIQUE), PROD_NAME_IX

PRODUCT_INFORMATION PRODUCT_INFORMATION_PK (UNIQUE), PROD_SUPPLIER_IX

WAREHOUSES WAREHOUSES_PK (UNIQUE)

LOGON –

Oracle database services

Database services are entry points to an Oracle database that enable the management of workloads across the cluster. For this solution, each of the test schemas had a corresponding database service mapped to it, as shown in Table 8. This enabled Oracle I/O monitoring to be mapped to each individual service, and hence to each schema.

Table 8. Oracle database services

Schema Service Instance Swingbench sessions

SOE1 SOE1.oracledb.ie orafast 1800

SOE2 SOE2.oracledb.ie orafast 0

Oracle LUN configuration

Table 9 lists the LUN configuration for the Oracle application.

Table 9. Oracle LUN configuration

ASM disk group TDev hyper size Number of TDevs Capacity (GB)

+DATA 64 GB 2 128

+SOE1 64 GB 15 960

+SOE2 64 GB 15 960

+FRA 64 GB 2 128

+TEMP 64 GB 1 64

+REDO 64 GB 1 64

Total (GB) 2304
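The per-group capacities in Table 9 follow directly from the 64 GB TDev size, which a quick check confirms:

```python
# Each ASM disk group is built from 64 GB thin devices (TDevs), so the
# group capacity is simply 64 GB times the TDev count from Table 9.
tdevs = {"+DATA": 2, "+SOE1": 15, "+SOE2": 15, "+FRA": 2, "+TEMP": 1, "+REDO": 1}
capacities = {group: count * 64 for group, count in tdevs.items()}
assert capacities["+SOE1"] == 960
assert sum(capacities.values()) == 2304   # matches the table total (GB)
```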

Microsoft SQL Server workload type

In the test environment, the following two applications generated the different workload patterns running on the Microsoft SQL Server 2012 enterprise-class platform:

• A TPC-E-like application, acting as a typical OLTP application
• A TPC-H-like application, acting as a typical DSS application

SQL Server 2012 DSS workload and profile

The test workload for the SQL Server 2012 DSS application was based on a TPC-H-like workload. The TPC-H-like application models the analysis part of a business environment where trends are computed and refined data is produced to support the making of sound business decisions. In a TPC-H-like application, periodic refresh functions are performed against a DSS database whose content is queried on behalf of, or by, various decision makers. Table 10 details the SQL Server DSS database and workload profile for this solution.

Table 10. SQL Server DSS database and application profile

Profile characteristic | Details
Total SQL Server database capacity | 2 TB
Number of SQL Server instances | 1
Concurrent users | 2
Read/write ratio (user databases) | 100:0 (typical)

SQL Server 2012 DSS LUN configuration

Table 11 shows the Microsoft SQL Server DSS LUNs and how they were used.

Table 11. LUN use for Microsoft SQL Server DSS

Purpose | LUN size (GB) | No. of TDevs | Capacity (GB)
SQL Server DSS virtual machine data store | 128 | 1 | 128
File group 1–8 | 480 | 8 | 3840
Log | 64 | 1 | 64
tempdb log | 32 | 1 | 32
tempdb | 64 | 6 | 384
Total (GB) | | | 4,448

SQL Server 2012 OLTP workload and profile

The test workload for the SQL Server 2012 OLTP application instances was based on a TPC-E-like workload. It was composed of a set of transactional operations that simulate an online stock trading floor, which is latency-sensitive and combines multiple query types, including inserts and updates per application transaction.

The OLTP databases are heavily indexed to support low-latency retrieval of small numbers of rows from data sets that often contain little historical data. These types of database operations induce significant disk head movement and generate classic random I/O patterns. Table 12 details the SQL Server OLTP database and workload profile for this solution.

Table 12. SQL Server OLTP database and application profile

Profile characteristic | Details
Total SQL Server database capacity | 1 TB
Number of SQL Server instances | 2
Number of user databases for each virtual machine | 1 (400 GB, 600 GB)
Concurrent users | Mixed workloads to simulate both a hot and a warm application environment
Read/write ratio | 90:10

SQL Server 2012 OLTP LUN configuration

Table 13 and Table 14 show the LUN use for the Microsoft SQL Server OLTP instances.

Table 13. Microsoft SQL Server LUN use—SQL Server OLTP instance #1

Purpose | LUN size (GB) | No. of TDevs | Capacity (GB)
SQL Server OLTP virtual machine data store | 128 | 1 | 128
SQL1 tpce root | 64 | 1 | 64
SQL1 file group 1–8 | 128 | 8 | 1024
SQL1_Log | 256 | 1 | 256
SQL1 tempdb log | 64 | 1 | 64
SQL1 tempdb | 64 | 4 | 256
Total (GB) | | | 1,792

Table 14. Microsoft SQL Server LUN use—SQL Server OLTP instance #2

Purpose | LUN size (GB) | No. of TDevs | Capacity (GB)
SQL Server OLTP virtual machine data store | 128 | 1 | 128
SQL2 tpce root | 32 | 1 | 32
SQL2 file group 1–8 | 64 | 8 | 512
SQL2_Log | 128 | 1 | 128
SQL2 tempdb log | 64 | 1 | 64
SQL2 tempdb | 64 | 4 | 256
Total (GB) | | | 1,120

SQL Server 2012 and Windows 2008 R2 settings for DSS and OLTP workloads

Operating system and SQL Server instance settings

For the SQL Server 2012 tests we used Windows 2008 R2 as the operating system. Our settings for the DSS and OLTP workloads were:

• Large-page memory support was enabled for the SQL Server instances by enabling the -T834 startup trace flag.
• The Lock pages in memory option was used for the SQL Server instances.
• All data and log LUNs were formatted with a 64 KB allocation unit size.

Database settings

For user databases, we used these settings:

• For the DSS user database, multiple data files—16 data files on 16 thin devices (TDevs).
• For the OLTP user databases, multiple data files—eight data files on eight TDevs.
• Disabled the autogrow option for data files and grew all data files manually.

For tempdb, we used these settings:

• Pre-allocated space and added a single data file per LUN, ensuring all files were the same size.
• Assigned tempdb log files to one of the LUNs dedicated to log files.
• Enabled autogrow. In general, a large growth increment is appropriate for data warehouse workloads; a value equivalent to 10 percent of the initial file size is a reasonable starting point. We followed standard SQL Server best practices for database and tempdb sizing considerations. For more information, see Capacity Planning for tempdb in SQL Server Books Online.
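The tempdb guidance above (one equally sized data file per LUN, with an autogrow increment of roughly 10 percent of the initial file size) can be sketched as a small sizing helper. This is an illustrative sketch only; the helper name and TDev labels are hypothetical, and the 384 GB over six devices mirrors the DSS tempdb layout in Table 11.

```python
# Hypothetical sizing helper: spread tempdb evenly across the data LUNs and
# derive an autogrow step of 10 percent of each file's initial size.

def plan_tempdb_files(total_tempdb_gb, data_luns):
    """Return one equally sized data file per LUN plus its autogrow step."""
    per_file_gb = total_tempdb_gb / len(data_luns)   # equal-size files
    growth_gb = per_file_gb * 0.10                   # 10% of initial size
    return [{"lun": lun, "initial_gb": per_file_gb, "autogrow_gb": growth_gb}
            for lun in data_luns]

# 384 GB of tempdb across six TDevs, as in the DSS configuration above
plan = plan_tempdb_files(384, [f"TDEV{i}" for i in range(1, 7)])
for f in plan:
    print(f["lun"], f["initial_gb"], f["autogrow_gb"])
```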

For the transaction log we used this configuration:

• Created a single transaction log file per database on one of the LUNs assigned to the transaction log space. Log files for different databases were spread across the available LUNs; multiple log files can be used for log growth, as required.

• Enabled the autogrow option for log files.

VMware vSphere configuration

VMware vCenter Server provided a scalable and extensible platform to centrally manage the VMware vSphere environment, providing control and visibility at every level of the virtual infrastructure.

We connected two ESXi 5 hosts to the VMAX 40K array. Host A ran the virtual machine for Oracle and the standby SQL Server FCI virtual machine. Host B ran the virtual machines for SQL Server OLTP and DSS, and the active SQL Server FCI virtual machine.

VMware virtual machine configuration

The clustered SQL Server data LUNs used physical Raw Device Mapping (pRDM) virtual disks. Apart from these shared disks, all virtual machines in this configuration used virtual machine disks (VMDKs) on VMware Virtual Machine File System (VMFS) data store volumes, including the OS and boot LUNs. Each VMFS data store hosted a single VMDK disk, ensuring high performance and zero contention. This practice also preserves the ability to restore at the application level with EMC TimeFinder Clone and Snap on the VMAX 40K array.

Table 15 shows the virtual machine CPU and memory allocation for each application virtual machine.

Table 15. Virtual machine CPU and memory allocation

Application | Virtual machine name | CPU count | Memory size
Oracle | ORACLEDB | 32 | 54,272 MB
Microsoft SQL Server DSS | SQLTPCH01 | 32 | 131,072 MB
Microsoft SQL Server OLTP | SQLTPCE01 | 32 | 16,000 MB
Microsoft SQL Server OLTP | SQLTPCE02 | 32 | 16,000 MB
Domain controller | – | 4 | 4,096 MB

SQL Server 2012 clustering on VMware

In this solution, a clustered SQL Server instance was built across the two ESXi hosts for the VFCache function test. The cluster requires specific hardware and software. The ESXi hosts had the following configuration:

• One physical network adapter dedicated to the VMkernel.
• Shared storage on an FC SAN. In this solution, the two shared disks were provisioned from the VMAX 40K.
• RDM in physical compatibility (pass-through) mode.

Table 16. Microsoft SQL Server Failover Cluster Instance (FCI) LUN use

Purpose | Quantity of LUNs | Capacity (GB)
SQL Server FCI boot LUN data store | 2 | 160
Microsoft Distributed Transaction Coordinator (MSDTC) data store | 1 | 200
Microsoft SQL Server user database store | 2 | 200
Total (TB) | | 0.92

EMC Virtual Storage Integrator

EMC Virtual Storage Integrator (VSI) provides enhanced visibility into the Symmetrix VMAX 40K directly from the vCenter GUI. Figure 4 shows the data store and storage pool information, including virtual pool usage, for the Oracle_SOE1_1 data store.

Figure 4. Data store and storage pool information viewed from VSI

VMAX 40K volumes hosted the VMFS data stores and pRDM disks for this solution. Figure 4 shows the ESXi server and Symmetrix VMAX 40K storage mapping, with details about the VMFS data stores and LUNs. The VSI Storage Viewer feature identifies details about VMFS data stores such as the VMAX storage volumes hosting the data store, the paths to the physical storage, pool usage information, and data store performance statistics.

Figure 5 shows the LUN view from VSI. From here, administrators can identify the Symmetrix device ID for LUNs and data stores, and view user-defined labels if these are set on VMAX LUNs. Administrators can export these listings to CSV files for manipulation with VMware PowerCLI scripts for the rapid provisioning of data stores to the ESXi hosts.

Figure 5. EMC Virtual Storage Integrator LUN view

VFCache configuration with VMware

In a VMware environment, the VFCache card resides in the ESXi server, while the VFCache software is installed on each of the virtual machines that VFCache accelerates. The VFCache VSI plug-in, which resides on the vCenter client, is used to manage VFCache. VFCache can accelerate performance for either RDM or VMFS LUNs in a VMware environment.

The VFCache installation is distributed over various vSphere system components. Figure 6 illustrates the location of these installed components. VFCache software is installed on the guest machines and the VFCache VSI plug-in is installed on a vSphere client.

Figure 6. VFCache in VMware environment

Multiple virtual machines on the same ESXi server can share the performance advantages of a single VFCache card. As shown in Figure 7, the Flash device (VMFS) is carved into virtual disks and presented to the virtual machines. Refer to the VFCache Installation Guide for VMware 1.5 for detailed VFCache VMware configuration.

Figure 7. VFCache in VMware environment

VFCache is integrated with VSI plug-ins to simplify VFCache management and monitoring. Figure 8 shows how VSI is used to manage VFCache in the VMware environment.

Figure 8. VSI VFCache management

We monitored and observed how many I/Os were offloaded by the VFCache card, as shown in Figure 9.

Figure 9. VSI VFCache monitor

Performance testing processes

Overview

This section describes how we tested the applications in the solution environment. Each test is described in more detail in later sections.

Notes

• Benchmark results are highly dependent on workload, specific application requirements, and system design and implementation. Relative system performance will vary as a result of these and other factors. Therefore, this workload should not be used as a substitute for a specific customer application benchmark when critical capacity planning or product evaluation decisions are contemplated.

• The environment was rigorously controlled; results obtained in other operating environments may vary significantly.

• EMC Corporation does not warrant that a user can or will achieve performances similar to these.

Validation

To validate the environment, we deployed all applications and populated them with test data. Each of the applications (Oracle, SQL Server OLTP, and SQL Server DSS) was deployed at the production location, and workloads were driven against all applications running simultaneously on the VMAX 40K storage array.

We used the Unisphere Performance Analyzer module on the VMAX to monitor and gather storage performance data, in addition to application performance monitoring tools.

Application workloads

For each application, we used load generation tools to simulate real-world user interactions, as follows:

• We used a Microsoft TPC-E toolkit on the client virtual machines to generate TPC-E-like loads simultaneously against the SQL Server OLTP databases, emulating warm and hot workloads. The SQL Server OLTP application I/O pattern is typically 8 KB read/write, with a read/write ratio of 90:10.

• We used Quest Benchmark Factory to generate a TPC-H-like load for the SQL Server DSS database. The DSS application I/O pattern is typically 64 KB, with 100 percent read ratio on the data LUNs.

• We generated a Swingbench TPC-C-like order entry workload with 1,800 users and ran it against the Oracle database. The Oracle I/O pattern is 8 KB read and 8 KB write, with a read/write ratio of 80/20 percent, respectively.
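The three I/O profiles above can be approximated with a small synthetic generator. This is an illustrative sketch only, not part of the Swingbench, TPC-E, or Benchmark Factory toolkits; the profile names are hypothetical labels for the mixes described in the text.

```python
# Illustrative synthetic I/O mix: 8 KB at 90:10 read/write for SQL Server
# OLTP, 64 KB pure reads for DSS, and 8 KB at 80:20 for Oracle OLTP.
import random

PROFILES = {
    "sql_oltp": {"io_kb": 8,  "read_pct": 90},
    "dss":      {"io_kb": 64, "read_pct": 100},
    "oracle":   {"io_kb": 8,  "read_pct": 80},
}

def generate_ios(profile, count, seed=42):
    """Return (op, size_kb) pairs drawn from the named workload profile."""
    rng = random.Random(seed)
    p = PROFILES[profile]
    return [("read" if rng.uniform(0, 100) < p["read_pct"] else "write",
             p["io_kb"]) for _ in range(count)]

ios = generate_ios("sql_oltp", 10_000)
reads = sum(1 for op, _ in ios if op == "read")
print(f"read ratio: {reads / len(ios):.0%}")  # close to the stated 90:10
```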

Test procedure

The test procedure was as follows:

1. Baseline test

The baseline performance metrics were measured when the application workloads (Oracle, SQL Server OLTP, and SQL Server DSS) were run together and had stabilized within three hours. We measured each application's performance to ensure it was within predefined KPIs and that all workloads co-existed without a negative impact on each other.

2. Enable VFCache

After the application workloads stabilized, we enabled VFCache on the OLTP workloads and measured the performance acceleration and the workload offloading from three-tier FAST VP storage to VFCache. We enabled a card on each of the SQL Server OLTP and Oracle virtual machines, and configured as much space as possible for all three workloads, according to demand. Enabling VFCache on a DSS workload was also verified, both as a cache and as a local disk store for tempdb.

The minimum space requirement for VFCache is 25 GB. The 700 GB VFCache cards (651 GB of usable space each) were divided by capacity and allocated to the four virtual machines on the two ESXi servers. The detailed allocation is listed in Table 17.

Table 17. VFCache allocation on virtual machines

VFCache allocation per application/virtual machine | ESXi 01 (GB) | ESXi 02 (GB)
Oracle OLTP | 600 | N/A
SQL Server DSS | N/A | 200
SQL Server OLTP 01 | N/A | 200
SQL Server OLTP 02 | N/A | 200
Total | 625 | 625
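The per-card allocations above can be sanity-checked against the two constraints just stated: a 25 GB minimum per cache device and 651 GB of usable space on each 700 GB card. A minimal sketch; the function name is illustrative.

```python
# Hedged sketch: validate a proposed VFCache split for one card against the
# 25 GB minimum cache size and the 651 GB usable capacity per 700 GB card.
MIN_GB, USABLE_GB = 25, 651

def validate_card(allocations_gb):
    """Return True if every slice meets the minimum and the card fits."""
    return (all(a >= MIN_GB for a in allocations_gb)
            and sum(allocations_gb) <= USABLE_GB)

esxi01 = [600]            # Oracle OLTP
esxi02 = [200, 200, 200]  # SQL Server DSS, OLTP 01, OLTP 02
print(validate_card(esxi01), validate_card(esxi02))
```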

Test scenarios

The performance tests contained the following scenarios:
• Three-tier FAST VP without and with VFCache
• Two-tier FAST VP without and with VFCache
• VFCache impact on FAST VP
• VFCache with DSS workload
• VFCache with SQL Server failover cluster instance

Three-tier FAST VP without and with VFCache

Objective

The objective of this test was to validate the solution under normal operating conditions for a typical work day, with FAST VP storage tiering enabled. Tests were run without and with VFCache enabled to evaluate the offloading of heavy workloads from three-tier FAST VP storage to VFCache.

We evaluated all aspects of this solution, including the VMware vSphere server and virtual machine performance, Oracle, SQL Server OLTP, and SQL Server DSS server and client experiences.

Test scenarios

The VFCache offloading test had two scenarios:

• Before enabling VFCache: All OLTP workloads ran on the VMAX with three-tier FAST VP enabled. The storage could support the workload with an excellent application response time; however, the workload on the array was heavy.

• After enabling VFCache on the virtual machines: Read I/O could be offloaded to VFCache. The array still had three-tier FAST VP enabled and could handle other I/O requests.

Test result summary

The test results are summarized as follows:

• Without VFCache enabled, the array received more than 40,000 IOPS from the host side. With VFCache enabled, this number fell to approximately 12,000 IOPS.

• VFCache significantly reduced the IOPS and the back-end adapter utilization of the storage array in the three-tier FAST VP configuration, with no change in OLTP TPS performance. The array was then freed up for other I/O requests.

• The SQL Server OLTP, Oracle database, and LUN response times decreased after VFCache was enabled.

As shown in Figure 10, the IOPS received by the VMAX 40K array front-end adapter fell from approximately 40,000 to 12,000, while the back-end adapter busy percentage decreased from 67 percent to 33 percent. This means that approximately 28,000 IOPS were offloaded by VFCache.
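The offload arithmetic above works out as follows; a trivial sketch with an illustrative helper name.

```python
# Quick check of the reported offload: front-end IOPS fell from roughly
# 40,000 to 12,000 once VFCache was enabled.
def offloaded(before_iops, after_iops):
    """Return the IOPS absorbed by VFCache and the percentage offloaded."""
    delta = before_iops - after_iops
    return delta, 100.0 * delta / before_iops

delta, pct = offloaded(40_000, 12_000)
print(delta, pct)  # 28000 IOPS, 70.0 percent of the original array load
```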

Figure 10. Array workload before and after VFCache was enabled

The test procedure was carried out with the three application workloads (Oracle, SQL Server OLTP, and SQL Server DSS) running together. Figure 11 shows that VFCache reduced the storage device IOPS for the Oracle and SQL Server OLTP data LUNs respectively.

Figure 11. Storage device IOPS before and after VFCache is enabled

Three-tier FAST VP performance results

Table 18 shows detailed results of running the Oracle and SQL Server OLTP workloads on the source array before and after enabling VFCache on the three-tier FAST VP. With Flash, FC, and SATA tiers in the FAST VP pool, after enabling VFCache:

• The storage IOPS fell significantly, from more than 40,000 to approximately 12,000, because VFCache offloaded the IOPS and the SAN-based storage could serve I/Os from other applications. As a result of the decrease in back-end storage utilization, virtual machine and ESXi CPU utilization increased.

• There was no obvious increase in OLTP TPS because the three-tier FAST VP could serve the application with excellent performance.

• Response times decreased, because the VFCache solution cached the most frequently referenced data on the server-based PCIe card, thereby putting the data closer to the application.

Table 18. Detailed results before and after enabling VFCache

Components | Performance | Three tiers configured without VFCache | Three tiers configured with VFCache
VMAX | OLTP IOPS (total) | 40,605 | 12,275
ESXi 01 | Average CPU utilization | 65 percent | 77 percent
ESXi 02 | Average CPU utilization | 42 percent | 43 percent
Oracle | Swingbench TPS | 8,535 | 8,537
Oracle | Average Oracle OLTP database response time (ms) | 4 | 3
Oracle | vCPU utilization | 84.4 percent | 91.4 percent
SQL Server | SQL01 latency (ms, read/write/transfer) | 7/8/7 | 3/5/3
SQL Server | SQL02 latency (ms, read/write/transfer) | 3/4/4 | 2/4/2
SQL Server | SQL01 vCPU utilization | 30 percent | 73 percent
SQL Server | SQL02 vCPU utilization | 66 percent | 81 percent
SQL Server | Transactions/sec | 5,725 | 5,846

Two-tier FAST VP without and with VFCache

Overview

With only FC and SATA tiers in the FAST VP pool, the central storage may become a performance bottleneck, because the limited number of spindles may not provide excellent response times for heavy workloads. With VFCache enabled on the virtual machines, however, this bottleneck can be overcome and the application continues to experience excellent levels of storage latency.

Test scenarios

The VFCache offloading test had two scenarios:

• Before enabling VFCache: All OLTP workloads ran on the VMAX with two-tier FAST VP enabled. The two-tier storage has limited spindles with which to provide excellent response times for read-intensive applications.

• After enabling VFCache on the virtual machines: VFCache can significantly improve application performance.

Two-tier FAST VP OLTP performance results

After disabling the Flash tier in FAST VP, most of the workload was served by the FC tier. FC tier disk utilization was very high, and the SAN-based central storage (FC and SATA) could not support enough IOPS or provide acceptable response times for the I/O-intensive OLTP applications. The test results were as follows:

• The average FC disk IOPS was 142, which in theory is the maximum value for 10K FC disks.

• The maximum FC disk utilization was almost 100 percent.
• The total OLTP IOPS averaged approximately 10,000; the storage could not serve more IOPS.

• For SQL Server OLTP the average disk response time was more than 20 milliseconds, and for Oracle, the application response time was more than 35 milliseconds. These times exceed vendor-recommended limits.

• CPU utilization for the ESXi and the virtual machine was low:

- ESXi CPU utilization was less than 20 percent.
- Virtual machine CPU utilization was less than 5 percent.

Figure 12 shows the FC tier disk heat map. The red color means the disk workload was very high (hot). Utilization reached 100 percent of the total IOPS capacity.

Figure 12. FC tiers disk heat map without VFCache

Two-tier VFCache and FAST VP OLTP performance results

Performance improved after enabling VFCache on the Oracle and SQL Server OLTP virtual machines. The test results were as follows:

• The average FC disk IOPS was 43, approximately 30 percent of the maximum capacity for 10K FC disk IOPS.

• The maximum FC disk utilization was approximately 65 percent.

• For SQL Server OLTP, the average disk response time was no more than 3 milliseconds, and for Oracle, the application response time was no more than 3 milliseconds.

• ESXi and virtual machine CPU utilization increased greatly compared with the results before VFCache was enabled:

- On ESXi-1, CPU utilization increased from 16 percent to 21 percent; on ESXi-2, it increased from 2 percent to 40 percent.

- SQL Server OLTP database CPU utilization increased from less than 2 percent to 60–70 percent; Oracle database CPU utilization increased from 48 percent to 89 percent.

VFCache can provide excellent response times for a read I/O intensive OLTP workload, and reduce the storage workload from the system. As a result, the system was able to handle more OLTP TPS. ESXi and virtual machine CPU usage increased because of the increased SQL Server and Oracle utilization with transactional processing.

Figure 13 shows the FC tier disk heat map. The yellow color means the disk workload was normal compared with that in Figure 12. The previous heavy workload on this tier was removed by VFCache.

Figure 13. FC tiers disk heat map with VFCache

Figure 14 shows that the high Oracle latency and high SQL Server data LUN latency on the two-tier SAN-based storage can be effectively eliminated by adding VFCache to the virtual machines hosting the applications. The average response times fell by approximately six to ten times for SQL Server OLTP and Oracle, respectively.

Figure 14. Oracle and SQL Server latency before and after VFCache

Two-tier FAST VP performance results breakdown

Table 19 shows the detailed performance metrics for storage, ESXi, and virtual machines for the Oracle and SQL Server OLTP workloads on the source array, before and after enabling VFCache on the two-tier FAST VP. With FC and SATA tiers in the FAST VP pool, after enabling VFCache:

• OLTP application TPS and response times significantly improved because VFCache can offload read I/O processing from the storage array, while reducing disk latencies, thus enabling higher transactional throughput. It can address “hot-spots” in the data center and alleviate high utilization of a two-tier FAST VP storage environment.

• CPU usage increased because of the increased SQL Server and Oracle utilization with transactional processing. With VFCache enabled the system was able to handle more SQL Server and Oracle TPS.

Table 19. Performance metrics for storage, ESXi, and virtual machines

Components | Performance | Two tiers configured without VFCache | Two tiers configured with VFCache
VMAX | IOPS | 23,514 | 13,798
ESXi 01 | Average CPU utilization | 16 percent | 21 percent
ESXi 02 | Average CPU utilization | 2 percent | 40 percent
Oracle | Swingbench TPS | 6,653 | 8,590
Oracle | Average Oracle OLTP response time (ms) | 35 | 3
Oracle | vCPU utilization | 47.6 percent | 89 percent
SQL Server | SQL01 latency (ms, read/write/transfer) | 22/4/21 | 3/2/3
SQL Server | SQL02 latency (ms, read/write/transfer) | 21/4/21 | 2/3/2
SQL Server | SQL01 vCPU utilization | 1.20 percent | 69.84 percent
SQL Server | SQL02 vCPU utilization | 1.46 percent | 63.04 percent
SQL Server | Transactions/sec | 2,073 | 6,054

VFCache impact on FAST VP

Overview

In this solution, we evaluated the impact of VFCache on FAST VP to determine whether VFCache offloading affected the existing three-tier FAST VP configuration.

FAST VP moves data between tiers. The ingress and egress tracks record the data moving into or out of each tier. If there are many ingress and egress tracks per second, back-end performance is impacted while the application is running on the corresponding tiers. FAST VP has a quality of service (QoS) setting to control the reallocation rate of data movement. The minimum value of this setting is 10, which corresponds to a maximum movement rate of 1 GB/sec. If the ingress/egress rate of each tier is far less than this value, the impact on the back end is minimal.

In the solution, the ingress/egress tracks of each tier (Flash, FC, and SATA) were monitored when we enabled VFCache on the virtual machines. The purpose was to evaluate whether VFCache causes significant data movement between the FAST VP-controlled storage pools, and whether the Flash tier is released automatically when VFCache is enabled.

Three-tier FAST VP behavior with VFCache

In the three-tier FAST VP implementation, after enabling VFCache, the Flash tier ingress/egress rate was less than 200 tracks per second, as shown in Figure 15, Figure 16, and Table 20. These graphs show that VFCache started taking the load off the Flash tier of the three-tier FAST VP, moving it to the FC and SATA tiers and freeing up these resources for other I/O-intensive applications. This movement was controlled by FAST VP.

Figure 15. Track ingress of three tiers after enabling VFCache on the three-tiered FAST VP


Figure 16. Track egress of three tiers after enabling VFCache on the three tiered FAST VP

Table 20. Average track ingress/egress per second

Metric | FC tier | Flash tier | SATA tier
FAST ingress tracks per second | 78.03 | 1.68 | 55.68
FAST egress tracks per second | 49.55 | 78.30 | 18.95

The maximum ingress or egress data rate between tiers was approximately 200 x 64 KB = 13 MB/sec. This movement rate has minimal impact on the back-end disk workload. Because FAST VP has a long demotion period (demoting cold data from the Flash or FC tier to a lower tier), running the workload with VFCache does not actively demote the data. This means that although the previously frequently accessed data can now be served by VFCache and the workload is freed from the Flash tier, the Flash capacity may still be occupied. The Flash tier in FAST VP can be proactively freed to handle more IOPS.
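The movement-rate estimate above can be reproduced with a one-line conversion (64 KB tracks; the helper name is illustrative), and compared with the 1 GB/sec ceiling implied by the minimum FAST VP QoS setting of 10.

```python
# Convert a FAST VP track-movement rate (64 KB tracks) to MB/sec and
# compare it with the 1 GB/sec cap at the minimum QoS setting.
TRACK_KB = 64

def movement_mb_per_sec(tracks_per_sec):
    """Return the data movement rate in MB/sec for a given track rate."""
    return tracks_per_sec * TRACK_KB * 1024 / 1_000_000

rate = movement_mb_per_sec(200)
print(f"{rate:.1f} MB/sec")  # ~13 MB/sec, far below the 1 GB/sec QoS cap
```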

While VFCache accelerates performance for the SQL Server and Oracle OLTP workloads, FAST VP with a Flash tier remains complementary to VFCache. The central storage can be a cost-effective solution with VFCache: when the read-intensive I/Os are served by VFCache, FAST VP with the Flash tier enabled can handle other I/O requests.

VFCache with DSS workload

Overview

We tested VFCache with a DSS workload both as a cache and as a direct-attached storage (DAS) store for the SQL Server instance tempdb.

VFCache as the cache

We carved a 200 GB VFCache LUN from the 700 GB VFCache card and tested it as a cache device to accelerate the TPC-H-like database data LUNs, as shown in Figure 17.

The Max IO parameter, highlighted in Figure 17, is the maximum cached I/O size for VFCache. When considering VFCache, users can adjust the maximum cached I/O size to match different application I/O patterns. In this solution, it was set to 128 KB for the DSS workload. This means that any I/O smaller than 128 KB is cached, whereas I/O larger than 128 KB bypasses VFCache.
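The Max IO behavior described above can be modeled as a simple admission filter. This is an illustrative sketch, not VFCache's implementation; the handling of an I/O exactly at the limit is an assumption, as is the function name.

```python
# Toy model of the Max IO admission rule: I/Os within the configured
# maximum cached I/O size go to the cache, larger I/Os bypass it.
MAX_IO_KB = 128  # value used for the DSS workload in this solution

def route_io(size_kb, max_io_kb=MAX_IO_KB):
    """Return 'cache' for I/O within the Max IO limit, else 'bypass'.
    Treating an I/O exactly at the limit as cacheable is an assumption."""
    return "cache" if size_kb <= max_io_kb else "bypass"

print(route_io(64), route_io(128), route_io(256))
```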

Figure 17. LUN creation

Figure 18 illustrates the space allocation for DSS from the 700 GB VFCache card.

Figure 18. VFCache allocation for DSS

We performed the baseline SQL Server DSS performance test with the other application workloads (Oracle and SQL Server OLTP) running together with FAST VP enabled. We defined a performance baseline to represent the DSS environment after applying the FAST VP policies, which stabilized the workload as follows:

• The average bandwidth was 650 MB/s.
• The peak bandwidth was more than 1.2 GB/s.

After enabling VFCache, no actual query bandwidth increase was observed, because the VFCache is not large enough to cache the entire data/index set of the 2 TB DSS database.

In this case, caching the DSS workload was not very effective because of the nature of the workload. Another option is to use the split-card functionality of VFCache, where part of the PCIe card is used for caching and the other part is used to store the temporary data structures of the application. The performance benefits of this scenario are shown in VFCache as the DAS store.

VFCache as the DAS store

SQL Server tempdb was heavily used for sorting while the DSS query was running. In this test, a 200 GB VFCache LUN was carved from the 700 GB card and used as the tempdb database data and log store. One TPC-H-like query (Q2 of the 22 TPC-H-like queries) was selected as the DSS query to measure the performance of VFCache as the DAS store for tempdb, with five concurrent executions. Table 21 shows the performance results before and after using VFCache as the DAS store for tempdb.

Table 21. Performance comparison before and after using VFCache

Metric | Without VFCache | With VFCache
Bandwidth (MB/sec) | 270 | 396
Average LUN latency (ms) | 13 | 1
Average read latency (ms) | 13 | 1
Average write latency (ms) | 3 | 1
Maximum LUN latency (ms) | 84 | 20
vCPU utilization | 74.7 percent | 89.4 percent

As the tempdb store for DSS workloads, the VFCache DAS store can:

• Increase the actual query bandwidth from 270 MB/sec to 396 MB/sec
• Reduce the average tempdb data LUN latency from 13 ms to 1 ms
• Reduce the peak tempdb data LUN latency from 84 ms to 20 ms

Because of the faster tempdb I/O for storing the TPC-H-like query intermediate results, the application was able to execute more queries, and virtual machine CPU utilization increased accordingly.

VFCache with SQL Server failover cluster instance

Overview

This section describes VFCache support for a SQL Server FCI in an active/passive Windows clustered environment for OLTP applications.

Microsoft failover clustering (active/passive) support

SQL Server within a SQL Server failover cluster (with multiple cluster nodes) can also be accelerated using VFCache, as the data is first written to the shared storage device, in this case VMAX, and then synchronously to the VFCache device. In the case of a failover, the SQL Server virtual instance is stopped on the active node (if it is accessible), started on the standby node, and continues to write to the shared LUN as in normal operation. If the new node has VFCache, SQL Server I/O caching begins on that node; the VFCache device on the previous (failed-over-from) node no longer receives I/O, as the application has moved.

When SQL Server fails back to the original node, the application again retrieves data from that node's cache device, which may now contain stale data. Configuring the supplied VFCache clustering script ensures that stale data is never retrieved: the script uses Cluster Management events associated with application service start/stop transitions to trigger a purge of the VFCache application cache.
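The stale-data hazard, and why a purge on failback restores correctness, can be shown with a simplified model. The sketch below is conceptual Python, not the actual VFCache implementation; plain dictionaries stand in for the shared LUN and the node-local cache devices:

```python
class WriteThroughCache:
    """Minimal sketch of server-side write-through caching: every write
    goes to shared storage first; reads are served from the local cache
    when possible. Purging after a node change prevents stale reads."""

    def __init__(self, shared_storage: dict):
        self.storage = shared_storage   # stands in for the shared VMAX LUN
        self.cache = {}                 # stands in for the node-local cache device

    def write(self, key, value):
        self.storage[key] = value       # write-through: storage is always current
        self.cache[key] = value

    def read(self, key):
        if key in self.cache:
            return self.cache[key]      # cache hit (may be stale after failover)
        value = self.storage[key]       # cache miss: fetch from shared storage
        self.cache[key] = value
        return value

    def purge(self):
        self.cache.clear()              # what the clustering script triggers

lun = {}
node_a, node_b = WriteThroughCache(lun), WriteThroughCache(lun)
node_a.write("row1", "v1")          # node A active: "row1" cached on A
node_b.write("row1", "v2")          # failover: node B updates the shared LUN
assert node_a.read("row1") == "v1"  # failback WITHOUT a purge: stale (dirty) read
node_a.purge()
assert node_a.read("row1") == "v2"  # purge on failback restores correct data
```

This is exactly the dirty-read scenario the validation test in Table 22 exercises.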

Cluster support is currently provided for clusters configured to operate in active/standby mode, where VFCache is installed and operating on the single active node and on any combination of the standby nodes. Only one application in a cluster can use VFCache. To use VFCache for several applications, you must configure them as separate resources within a single clustered application, so that they fail over between hosts as a single unit.

The following steps and recommendations outline how to configure VFCache for Microsoft Cluster Service:

1. Create the VFCache resource in the SQL Server instance that uses VFCache using Add a resource > Generic Script.

2. Unzip VFCache_State_Control.vbs from the EMC VFCache installation folder and save it into the same folder on both the active and passive nodes of the clustered SQL Server. For example, place it under C:\Program Files\EMC\VFC\VFCache_Cluster_Support_1.5\VFCache_Cluster_Support\Microsoft_Cluster_Service.

3. Bring the resource (Generic Script) online.

4. Set the dependencies:

   • For the VFCache resource, set it to depend on the accelerated shared source LUNs.

   • For the SQL Server service, set it to depend on the VFCache resource.

5. On the passive node, run the script to release all source devices and enable their acquisition by an active node:

   vfcmt set -clustermode passive

For more information, refer to the EMC VFCache Installation and Administration Guide.


Validation

The following steps are used to validate the active/passive Microsoft Cluster Service clustering supported and accelerated by VFCache:

1. On the primary SQL Server node, create a test table with a varchar column and insert 40,000 rows.

2. Set up the VFCache cache and source LUN to accelerate the source LUN storing the database and the test table.

3. Query all data from the SQL Server instance to load the "to-be-stale" data into VFCache.

4. Fail over the SQL Server instance to the passive node and update each row to set the varchar column to a new string.

5. Fail back the SQL Server instance, query the table, and validate it with and without the VFCache clustering script enabled.

Table 22 shows results without and with the VFCache clustering script enabled.

Table 22. Results with and without VFCache clustering script

Item                   No VFCache clustering script                    VFCache clustering script enabled
Dirty read             Yes                                             No
User database status   Suspect; must be restored manually              Healthy
Shared LUN status      File system structure on the disk is corrupt    Healthy
                       and unusable
SQL Server service     Offline because of the disk error               Online

Conclusion

Summary

This EMC solution has shown the implementation of multiple, business-critical applications in a VMware private cloud environment hosted on VMAX 40K storage with VFCache installed on the ESXi server. Each application had different workload characteristics and placed varying demands on the underlying storage. VFCache provides better performance for applications with heavy read I/O:

• With the three-tier FAST VP configuration, VFCache significantly offloads array IOPS, freeing the array for other I/O requests.

• With the two-tier FAST VP configuration, VFCache can improve application performance with excellent response times.

Findings

The key findings of the tests show that:

• VFCache improves OLTP performance by offloading much of the read I/O traffic from the storage array. In this solution, 70 percent of IOPS was offloaded from the storage array to VFCache.

• VFCache solidly supports OLTP workloads. When the SAN-based central storage has limited spindles to support a read-intensive workload, or when the Flash tier is removed from FAST VP for other applications, the impact to FAST VP is minimal. In this solution, the average response time for the Oracle database decreased from 35 ms to 3 ms, almost a 12 times improvement, and the SQL Server database TPS increased from 56 to 162, almost a three times improvement.

• VFCache works well with a failover clustered SQL Server instance and ensures source LUN acceleration while guaranteeing data integrity.
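As a quick sanity check, the improvement factors quoted in the findings can be recomputed directly from the measured values:

```python
# Improvement factors from the raw numbers reported in the findings.
latency_before_ms, latency_after_ms = 35, 3
tps_before, tps_after = 56, 162

latency_speedup = latency_before_ms / latency_after_ms
tps_gain = tps_after / tps_before

print(f"Oracle latency improvement: {latency_speedup:.1f}x")  # 11.7x ("almost 12 times")
print(f"SQL Server TPS gain: {tps_gain:.1f}x")                # 2.9x ("almost three times")
```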

References

White papers

For additional information, see the white papers listed below:

• VFCache Installation and Administration Guide v1.5
• VFCache Installation Guide for VMware v1.5
• VFCache Troubleshooting Guide v1.5
• VFCache Troubleshooting Guide for VMware v1.5
• VFCache VMware VSI Plug-in Guide v1.5
• Implementing Virtual Provisioning on EMC Symmetrix VMAX with Oracle Database 10g and 11g—Applied Technology

• EMC Mission Critical Infrastructure for Microsoft SQL Server 2012
• Provisioning EMC Symmetrix VMAXe Storage for VMware vSphere Environments

• Maximize Operational Efficiency for Oracle RAC with EMC Symmetrix FAST VP (Automated Tiering) and VMware vSphere—An Architectural Overview

• EMC Symmetrix Virtual Provisioning—Applied Technology
• FAST VP Theory and Practices for Planning and Performance—Technical Notes
• Best Practices for Fast, Simple Capacity Allocation with EMC Symmetrix Virtual Provisioning—Technical Notes

• Implementing Fully Automated Storage Tiering for Virtual Pools (FAST VP) for EMC Symmetrix VMAX Series Arrays

• EMC Storage Optimization and High Availability for Microsoft SQL Server 2008 R2

Product documentation

For additional information, see the EMC Solutions Enabler Symmetrix Array Controls CLI Version 7.4 Product Guide.

Other documentation

For additional information, see the documents listed below:

• SQL Server Best Practices
• Oracle Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Linux
• Oracle Real Application Clusters Installation Guide 11g Release 2 (11.2) for Linux

• Oracle Database Installation Guide 11g Release 2 (11.2) for Linux
• Oracle Database Storage Administrator's Guide 11g Release 2 (11.2)
