VPLEX: INTERESTING USE CASES

EMC Proven Professional Knowledge Sharing 2011

Sandeep Choudhary – Infrastructure Management and Storage – [email protected]
Puneet Goyal – Associate Senior Systems Specialist, HCL Comnet – [email protected]

Table of Contents

EXECUTIVE SUMMARY
WHAT'S NEW IN MY GENERATION?
ABSTRACT
INTRODUCTION
ARRAY-BASED REPLICATION WITH VPLEX
USE CASE 1: BASIC ARRAY-BASED CLONES WITH VPLEX
USE CASE 2: ADVANCED ARRAY-BASED CLONES WITH VPLEX
USE CASE 3: BASIC ARRAY-BASED RESTORE AND RECOVERY WITH VPLEX
USE CASE 4: ADVANCED ARRAY-BASED RESTORE WITH VPLEX
CACHE IS KING!!
I/O IMPLEMENTATION
CACHE LAYERING ROLES
CACHE COHERENCE
META-DIRECTORY
HOW A READ IS HANDLED
HOW A WRITE IS HANDLED
WHAT HAPPENS WHEN CLOUDS RAIN?
MANAGING THE RISK
INCREASED DATA SECURITY
IN FUTURE, CDP USING VPLEX
DATA PROTECTION CHALLENGES
PROTECT AND RECOVER
DATA PROTECTION OPTIONS
CONTINUOUS DATA PROTECTION WITH APPLICATION BOOKMARKS
CDP USING VPLEX
WHAT IS THE USE OF SRM IF YOU HAVE A VPLEX?
GLOSSARY

Disclaimer: The views, processes, or methodologies published in this article are those of the authors. They do not necessarily reflect EMC Corporation's views, processes, or methodologies.


EXECUTIVE SUMMARY

For years, users have relied on "physical storage" to meet their information needs. Now, evolving changes such as virtualization and the adoption of Private Cloud computing have placed new demands on how storage and information are managed.

To meet these new requirements, storage must evolve to deliver capabilities that free information from a physical element to a virtualized resource that is fully automated, integrated within the infrastructure, consumed on demand, cost effective and efficient, always on, and secure. The technology enablers needed to deliver this combine unique EMC capabilities such as FAST, federation, and storage virtualization.

The result is a next-generation Private Cloud infrastructure that allows users to:
1. Move thousands of virtual machines (VMs) over thousands of miles.
2. Batch process in low-cost energy locations.
3. Enable boundary/boundary-less workload balancing and relocation.
4. Aggregate separate data centers to form big data centers.
5. Deliver "24 x forever" – and run or recover applications without ever having to restart.

WHAT'S NEW IN MY GENERATION?

Streamline storage refreshes, consolidations, and migrations within, across, and between data centers over distance.

Simplify multi-array allocation, management, and provisioning, and enable information to be "ACCESSED ANYWHERE".

Pool storage capacity to extend the useful life of N-1 storage assets and provide "JUST-IN-TIME" storage services via scale-out.


ABSTRACT

Both VPLEX™ Local and VPLEX Metro deliver significant value to customers. Some use cases are:

For VPLEX Local
1. Data mobility between EMC and non-EMC storage platforms. VPLEX allows users to federate heterogeneous storage arrays and transparently move data across them to simplify and expedite data movement, including ongoing technology refreshes and/or lease rollovers.
2. Simplified management of multi-array storage environments. VPLEX provides simple tools to provision and allocate virtualized storage devices to standardize LUN presentation and management. The ability to pool and aggregate capacity across multiple arrays can also help improve storage utilization.
3. Increased storage resiliency. This allows storage to be mirrored across mixed platforms without requiring host resources. Leveraging this capability can increase protection and continuity for critical applications.

Figure 1 Use cases

For VPLEX Metro
1. In a single data center for moving, mirroring, and managing storage.
2. In multiple data centers for workload relocation, disaster avoidance, and data center maintenance.


For Metro-Plex
1. Mobility and relocations between locations over synchronous distances. In combination with VMware and vMotion over distance, VPLEX Metro allows users to transparently move and relocate virtual machines and their corresponding applications and data over distance. This provides a unique capability allowing users to relocate, share, and balance infrastructure resources between data centers.
2. Distributed and shared data access within, between, and across clusters within synchronous distances. A single copy of data can be accessed by multiple users across two locations. This allows instant access to information in real time, and eliminates the operational overhead and time required to copy and distribute data across locations.
3. Increased resiliency with mirrored volumes within and across locations. VPLEX Metro provides non-stop application availability in the event of a component failure.

Figure 2 Cluster configuration


INTRODUCTION

EMC VPLEX is a storage network-based federation solution that provides nondisruptive, heterogeneous data movement and volume management functionality. VPLEX is an appliance-based solution that connects to SAN Fibre Channel switches. The VPLEX architecture is designed as a highly available solution and, as with all data management products, high availability (HA) is a major component in most deployment strategies.

EMC VPLEX encapsulates traditional physical storage array devices and applies three layers of logical abstraction to them. The logical relationships of each layer are shown in Figure 3.

Extents are the mechanism VPLEX uses to divide storage volumes. Extents may be all or part of the underlying storage volume. EMC VPLEX aggregates extents and applies RAID protection in the device layer. Devices are constructed using one or more extents and can be combined into more complex RAID schemes and device structures as desired. At the top layer of the VPLEX storage structures are virtual volumes. Virtual volumes are created from devices and inherit the size of the underlying device. Virtual volumes are the elements VPLEX exposes to hosts through its front-end (FE) ports. Access to virtual volumes is controlled using storage views. Storage views are comparable to auto-provisioning groups on EMC Symmetrix® or to storage groups on EMC CLARiiON®. They act as logical containers determining host initiator access to VPLEX FE ports and virtual volumes.

Figure 3 EMC VPLEX logical storage structures


ARRAY-BASED REPLICATION WITH VPLEX

Preserving investments in array-based storage replication is critically important in today's IT environment. EMC designs with this goal in mind, and the VPLEX product family is no exception. By following the VPLEX best practices outlined here, array-based replication tools can continue delivering expected functionality and value within your IT infrastructure. As you will see, one of the ways VPLEX preserves array replication technologies is by mapping storage volumes in their entirety (one-to-one mapping) through VPLEX. When done in this fashion, the underlying storage devices are left untouched by VPLEX and the back-end array LUN replication technology continues to function normally.

An important key to each of the following VPLEX use case examples is the read cache invalidation step. Per-volume read cache invalidation happens when VPLEX virtual volumes are removed from storage views. The act of removing a VPLEX virtual volume from a storage view causes the read cache for the corresponding volume to be invalidated (discarded). This is of the utmost importance with array-based replication and restore because the array writes to the back-end storage device outside of the VPLEX I/O path. When this happens, the VPLEX read cache no longer matches the data on the physical disk on the array, which has the potential to cause data corruption. To avoid this scenario, it is critical that each VPLEX virtual volume is removed from the storage view it is a member of while array-based (non-VPLEX I/O path) writes are taking place. Once the replication activities involving writes are complete and the underlying back-end storage is in the desired state, the virtual volume can be added back into a storage view and accessed normally.
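As a minimal sketch of that sequence, using only the commands shown in the use cases below (the storage view, volume, and cluster names are placeholders), the flow looks like this:

/clusters/<cluster_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o virtual_volume_name -f
(wait at least 30 seconds so the per-volume read cache is invalidated)
(run the array-based clone, resynchronization, or restore entirely outside the VPLEX I/O path)
/clusters/<cluster_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#, virtual_volume_name) -f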

USE CASE 1: BASIC ARRAY-BASED CLONES WITH VPLEX

The simplest use case for array-based replication with VPLEX is a clone, or full-disk array-based copy. Clones can be presented through VPLEX to a backup host, a test environment, or even back to the original host. The setup for this use case is illustrated in Figure 4, which covers both the initial presentation of the array-based clones through VPLEX and the subsequent resynchronization of the clones. We assume each array-based clone has a one-to-one storage volume pass-through configuration (device capacity = extent capacity = storage volume capacity) to VPLEX and a RAID 0 (single extent only) VPLEX device geometry.


Figure 4 Basic array-based clones with VPLEX

Step-by-step plan to use an array-based clone of a VPLEX virtual volume:

1. Within the array, identify the source storage volumes from which you want to make array-based clones.

2. Follow array-based cloning procedure(s) to generate the desired array-based clones. Reference your specific array-based replication documentation for exact commands and procedures.

The remaining steps are VPLEX-specific.

3. Confirm that the clone devices are visible to VPLEX. As necessary, perform array LUN masking and SAN zoning for the storage volumes containing clones to the VPLEX back-end ports.

4. Perform one-to-one encapsulation through VPLEX (a consolidated CLI sketch follows step 5): a. Claim storage volumes b. Create extents (using the -appc flag) c. Create devices (single extent RAID 0 geometry) d. Create virtual volumes for each array-based clone

5. Present virtual volumes based on clones to host(s). a. If necessary, create storage view(s). b. Add virtual volumes built from array-based copies to storage view(s). c. If necessary, perform zoning of virtual volumes to hosts following normal zoning procedures.
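As an illustration of step 4, a one-to-one encapsulation from the VPLEX CLI might look like the sketch below. The object names are hypothetical, and the exact command options (including where the application-consistency flag is specified) vary by GeoSynchrony release, so treat this as an outline and consult the CLI guide for your version:

storage-volume claim -d clone_storage_volume_1 --appc
extent create -d clone_storage_volume_1
local-device create --geometry raid-0 --extents extent_clone_storage_volume_1_1 --name device_clone_1
virtual-volume create -r device_clone_1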


To resynchronize array-based clones once they've been presented through VPLEX, follow these steps:
1. Shut down any applications using the VPLEX array-based clones and, if necessary, unmount the associated virtual volumes to be resynchronized.

2. Remove access to the virtual volumes constructed from array-based clones. Be sure to note the LUN number for the virtual volume you plan to remove.
/clusters/<cluster_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o virtual_volume_name -f

3. Wait 30 seconds to ensure that the VPLEX read cache has been invalidated for each virtual volume. This can be done concurrently with step 4.

4. Perform normal array-based resynchronization procedure(s). Identify the source storage volumes within the array you wish to resynchronize, then follow your normal array resynchronization procedure(s) to refresh the desired array-based clones.

5. Confirm the I/O Status of the storage volumes based on array-based clones is "alive" by doing a long listing against the storage-volumes context for your cluster.

For example:
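The listing below is purely illustrative of the kind of output to look for; the actual columns and formatting depend on the GeoSynchrony release. The important field is the I/O status of each storage volume:

/clusters/<cluster_name>/storage-elements/storage-volumes> ll
Name                   Capacity  Use    Vendor  IO Status  Type
---------------------  --------  -----  ------  ---------  -----------
clone_storage_vol_1    100G      used   EMC     alive      traditional
clone_storage_vol_2    100G      used   EMC     alive      traditional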

In addition, confirm VPLEX back-end paths are healthy by issuing the "connectivity validate-be" command from the VPLEX CLI. Ensure that there are no errors or connectivity issues to the back-end storage devices. Resolve any error conditions with the back-end storage before proceeding. Example output showing desired back-end status:
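Again purely as an illustration (the exact wording differs by release), a healthy summary reports no dead or unreachable storage volumes and no missing paths, along the lines of:

VPlexcli:/> connectivity validate-be
Summary
  Cluster cluster-1
    0 storage-volumes which are dead or unreachable
    0 storage-volumes which do not have dual paths
    0 storage-volumes which are not visible from all directors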


6. Restore access to virtual volumes based on clone devices for host(s). Add the virtual volume back to the view, specifying the original LUN number (noted in step 2) using the VPLEX CLI:
/clusters/<cluster_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#, virtual_volume_name) -f

7. Rescan devices and restore paths (for example, powermt restore) on hosts.

8. If necessary, mount devices.

9. Restart applications.

Note: VPLEX does not change the rules for array-based clones. When presenting an array-based clone back to the same host, you will need to confirm that both the host operating system and the logical volume manager support such an operation. If the host OS does not support presenting a clone of a volume back to the same host, VPLEX will not change this situation.

USE CASE 2: ADVANCED ARRAY-BASED CLONES WITH VPLEX

When VPLEX virtual volumes have RAID 1 geometry, the array-based cloning process becomes more complex. For this use case, we assume a RAID 1 device layout for VPLEX consisting of one mirror leg that is an array-based clone and a second mirror leg that is not a clone. Figure 5 illustrates the advanced array-based clone configuration use case. This configuration applies to both local (VPLEX Local) RAID 1 and distributed (VPLEX Metro) RAID 1 devices. Using array-based clones in this type of setup requires a few tweaks to the standard clone creation and clone resynchronization processes. These additional steps are critical to ensure proper mirror synchronization (for the non-clone leg) and to ensure each virtual volume's read cache is properly updated. Figure 5 illustrates the case of an array-based clone that is one leg of a distributed RAID 1 (two-leg) volume.


Figure 5 Advanced array-based clones with RAID 1 VPLEX storage volumes

Prerequisites

This section assumes you are using existing distributed or local RAID 1 VPLEX virtual volumes built from storage volumes that are array-based copies. In addition, the VPLEX virtual volumes must possess both of the following attributes:
• Be composed of devices that have a one-to-one storage volume pass-through configuration to VPLEX (device capacity = extent capacity = storage volume capacity).
• Have a single device with single-extent RAID 1 (two single-extent devices being mirrored) geometry.

Follow these steps to use an array-based copy of a RAID 1 VPLEX virtual volume:
1. Quiesce the application environment following normal local array disk restore preparation procedures. This step is of particular importance as all paths to the VPLEX volumes being restored will become temporarily unavailable during subsequent steps. The goal is to have no read or write activity to the array-based copies where the VPLEX volumes reside during the initial synchronization and subsequent resynchronization processes.

2. Remove host access to the corresponding VPLEX volumes by removing them from all storage views.
a) If the virtual volume is built from a local device and/or is a member of a single storage view, using the VPLEX CLI, run:
/clusters/<cluster_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o virtual_volume_name -f


b) If the virtual volume is built from a distributed device and is a member of storage views in both clusters, using the VPLEX CLI, run:
/clusters/<cluster_1_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o distributed_device_name_vol -f
/clusters/<cluster_2_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o distributed_device_name_vol -f

Note: It is vital to keep track of the VPLEX-assigned LUN number for each virtual volume you plan to resynchronize. You can obtain this information by executing a long listing (ll) in the corresponding storage view context from the VPLEX CLI, or by clicking on the storage view from the VPLEX Management Console.

3. Detach the VPLEX device mirror leg that will not be updated during the array-based replication or resynchronization processes:
a) For distributed RAID 1 devices, turn off logging:
/distributed-storage/distributed-devices> set-log -n -d /distributed-storage/distributed-devices/distributed_device_name_vol
b) Detach the mirror leg:
device detach-mirror -m <mirror_leg_to_detach> -d <raid_1_device> -i -f

4. Perform replication or resynchronization of the array-based copies.

5. Confirm the I/O Status of the storage volumes based on array-based clones is "alive" by doing a long listing against the storage-volumes context for your cluster.

For example:


In addition, confirm VPLEX back-end paths are healthy by issuing the "connectivity validate-be" command from the VPLEX CLI. Ensure that there are no errors or connectivity issues to the back-end storage devices. Resolve any error conditions with the back-end storage before proceeding. Example output showing desired back-end status:

6. Reattach the second mirror leg:
a) Attach the mirror:
device attach-mirror -m <2nd_mirror_leg_to_attach> -d /clusters/local_cluster_name/devices/existing_raid_1_device
b) Turn logging back on for distributed devices:
/distributed-storage/distributed-devices> set-log -d /distributed-storage/distributed-devices/distributed_raid_1_device_1
Note: The device you are attaching is the non-clone mirror leg. It will be overwritten with the data from the clone mirror leg.

7. Restore host access to VPLEX volume(s).

If the virtual volume is built from a local RAID 1 device:
/clusters/<cluster_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#,device_Symm0191_065_1_vol/) -f
If the virtual volume is built from a distributed RAID 1 device:
/clusters/<cluster_1_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#, distributed_device_name_vol) -f
/clusters/<cluster_2_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#, distributed_device_name_vol) -f

The lun# is the previously recorded value from step 2 for each virtual volume.


Note: EMC recommends waiting at least 30 seconds after removing access from a storage view before restoring access. Waiting ensures that the VPLEX cache has been cleared for the volumes. The array-based resynchronization will likely take at least 30 seconds on its own, but if you are scripting, be sure to add a pause prior to performing this step.

Some hosts and applications are sensitive to LUN numbering changes. Use the information you recorded in step 2 to ensure that you use the same LUN numbering when the virtual volume access is restored. You do not need to perform full mirror synchronization prior to restoring access to virtual volumes. VPLEX will synchronize the second mirror leg in the background while using the first mirror leg as necessary to service reads to any unsynchronized blocks. A consolidated sketch of this resynchronization flow, with hypothetical object names, follows step 10.

8. Rescan devices and restore paths (powermt restore) on hosts.

9. Mount devices (if mounts are used).

10. Restart applications.
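The following is a consolidated sketch of the resynchronization flow above for a distributed RAID 1 device, with hypothetical object names (view_site_A, dd_app01, and so on) and the scripted pause mentioned in the note; treat it as an outline rather than exact syntax and adapt the names and context paths to your environment:

/clusters/cluster-1/exports/storage-views> removevirtualvolume -v view_site_A -o dd_app01_vol -f
/clusters/cluster-2/exports/storage-views> removevirtualvolume -v view_site_B -o dd_app01_vol -f
/distributed-storage/distributed-devices> set-log -n -d /distributed-storage/distributed-devices/dd_app01
device detach-mirror -m device_non_clone_leg -d dd_app01 -i -f
(resynchronize the clone leg on the array, then pause at least 30 seconds)
device attach-mirror -m device_non_clone_leg -d dd_app01
/distributed-storage/distributed-devices> set-log -d /distributed-storage/distributed-devices/dd_app01
/clusters/cluster-1/exports/storage-views> addvirtualvolume -v view_site_A/ -o (lun#, dd_app01_vol) -f
/clusters/cluster-2/exports/storage-views> addvirtualvolume -v view_site_B/ -o (lun#, dd_app01_vol) -f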

USE CASE 3: BASIC ARRAY-BASED RESTORE AND RECOVERY WITH VPLEX

The array-based restore process with VPLEX is similar to the previous array-based clone use cases. The primary difference is that the array is now writing data from the clone or "gold copy" back to the primary production or source volume. This use case examines the array-based restore process and assumes the array-based clone or gold copy is accessible from the array containing the source device (restore target). It is further assumed that each source device has a one-to-one storage volume pass-through configuration (device capacity = extent capacity = storage volume capacity) to VPLEX and a RAID 0 (single extent only) device geometry. This is the most basic use case for array-based restore and recovery. Figure 6 illustrates the case where data is being written within a storage array from a clone (copy) or backup media to a storage volume (standard device) that is used by VPLEX.


Figure 6 Basic array-based restore of VPLEX storage volumes

Follow these steps to perform an array-based restore to a VPLEX virtual volume: 1. Shut down any host applications using the VPLEX source devices and, as necessary, unmount the associated virtual volumes you need to restore.

2. Remove host access to the virtual volumes constructed from source devices. Be sure to note the LUN number for the virtual volume you plan to remove.
/clusters/<cluster_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o virtual_volume_name -f

3. Wait 30 seconds to ensure that the VPLEX read cache has been invalidated for each virtual volume. This can be done concurrently with step 4.

4. Perform normal array-based restore and/or recovery procedure(s). Identify the clones or gold copies to restore to the source devices within the array, then follow your normal array restore procedure(s) to refresh the desired source devices.

5. Confirm the I/O Status of the storage volumes based on array-based clones is "alive" by doing a long listing against the storage-volumes context for your cluster.


For example:

In addition, confirm VPLEX back-end paths are healthy by issuing the "connectivity validate-be" command from the VPLEX CLI. Ensure that there are no errors or connectivity issues to the back-end storage devices. Resolve any error conditions with the back-end storage before proceeding.

Example output showing desired back-end status:

6. Restore access to virtual volume(s) based on source devices for host(s): Add the virtual volume back to the view, specifying the original LUN number (noted in step 2) using VPLEX CLI:

/clusters/<cluster_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#, virtual_volume_name) -f

7. Rescan devices and restore paths (for example, powermt restore) on hosts.

8. If necessary, mount devices.

9. Restart applications.


USE CASE 4: ADVANCED ARRAY-BASED RESTORE WITH VPLEX

When VPLEX virtual volumes have RAID 1 geometry, the restore process must take this added complexity into account. This applies to both local (VPLEX Local) RAID 1 and distributed (VPLEX Metro) RAID 1 VPLEX devices. The typical array-based source device restore only restores one of the two mirror legs of a VPLEX RAID 1 device. In order to synchronize the second VPLEX device mirror leg, users need to modify standard array-based restore procedures. These steps are critical to ensure proper synchronization of the second VPLEX device mirror leg (the one that is not part of the array-based restore) and to ensure each virtual volume's read cache is properly updated. Figure 7 illustrates the case when an array-based clone is used to restore to a distributed RAID 1 source volume. This same use case applies to a remote array-based copy being restored to a source volume.

Figure 7 Advanced array-based restore and recovery of RAID 1 VPLEX volumes

Prerequisites

This section assumes users have existing distributed or local RAID 1 VPLEX virtual volumes built from the array source devices being restored. In addition, the VPLEX virtual volumes must possess both of the following attributes:
• The volumes must be composed of devices that have a one-to-one storage volume pass-through VPLEX configuration (device capacity = extent capacity = storage volume capacity).
• The volumes must have a single-extent RAID 1 (two single extents being mirrored) geometry.


Follow the next steps to perform an array-based restore to a RAID 1 VPLEX virtual volume: 1. Quiesce the application environment following normal local array disk restore preparation procedures. This step is of particular importance as all paths to the VPLEX virtual volumes being restored will become temporarily unavailable during subsequent steps. The goal is to have no read or write activity to array-based copies where the VPLEX volumes reside during the initial synchronization and subsequent resynchronization processes.

2. Remove host access to the corresponding VPLEX volumes by removing them from all storage views.
a) If the virtual volume is built from a local device and/or is a member of a single storage view, using the VPLEX CLI, run:
/clusters/<cluster_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o virtual_volume_name -f
b) If the virtual volume is built from a distributed device and is a member of storage views in both clusters, using the VPLEX CLI, run:
/clusters/<cluster_1_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o distributed_device_name_vol -f
/clusters/<cluster_2_name>/exports/storage-views> removevirtualvolume -v storage_view_name -o distributed_device_name_vol -f

Note: It is vital to keep track of the VPLEX-assigned LUN number for each virtual volume you plan to restore. You can obtain this information by executing a long listing (ll) in the corresponding storage view context from the VPLEX CLI or by clicking on the host storage view from the VPLEX Management Console.

3. Detach the VPLEX device mirror leg that will not be restored during the array-based restore process:
a) For distributed RAID 1 devices, turn off logging:
/distributed-storage/distributed-devices> set-log -n -d /distributed-storage/distributed-devices/distributed_device_name_vol
b) Detach the mirror leg:
device detach-mirror -m <mirror_leg_to_detach> -d <raid_1_device> -i -f

4. Perform the array-based clone device to source device restore process.


5. Confirm the I/O Status of the storage volumes based on array-based clones is "alive" by doing a long listing against the storage-volumes context for your cluster.

For example:

In addition, confirm VPLEX back-end paths are healthy by issuing the "connectivity validate-be" command from the VPLEX CLI. Ensure that there are no errors or connectivity issues to the back-end storage devices. Resolve any error conditions with the back-end storage before proceeding.

Example output showing desired back-end status:

6. Reattach the second mirror leg:
a) Attach the mirror:
device attach-mirror -m <2nd_mirror_leg_to_attach> -d /clusters/local_cluster_name/devices/existing_raid_1_device
b) Turn logging back on for distributed devices:
/distributed-storage/distributed-devices> set-log -d /distributed-storage/distributed-devices/distributed_raid_1_device_1

Note: The device you are attaching in this step will be overwritten with the data from the newly restored source device.


7. Restore host access to the VPLEX volume(s).

If the virtual volume is built from a local RAID 1 device:
/clusters/<cluster_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#,device_Symm0191_065_1_vol/) -f

If the virtual volume is built from a distributed RAID 1 device:
/clusters/<cluster_1_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#, distributed_device_name_vol) -f
/clusters/<cluster_2_name>/exports/storage-views> addvirtualvolume -v storage_view_name/ -o (lun#, distributed_device_name_vol) -f

The lun# is the previously recorded value from step 2 for each virtual volume.

Note: EMC recommends waiting at least 30 seconds after removing access from a storage view before restoring access. This is done to ensure that the VPLEX cache has been cleared for the volumes. The array-based restore will likely take at least 30 seconds on its own, but if you are scripting, be sure to add a pause. Some hosts and applications are sensitive to LUN numbering changes. Use the information you recorded in step 2 to ensure the same LUN numbering when you restore the virtual volume access. Full mirror synchronization is not required prior to restoring access to virtual volumes. VPLEX will synchronize the second mirror leg in the background while using the first mirror leg as necessary to service reads to any unsynchronized blocks.

8. Rescan devices and restore paths (powermt restore) on hosts.

9. Mount devices (if mounts are used).

10. Restart applications.


CACHE IS KING!!

It's no secret that one of the main selling points of an enterprise storage array is its ability to cache workloads and subsequently serve the I/O from memory. It's fast. Much faster than disk. This is the main reason we are seeing the move to SSD for high-IOPS workloads, as well as EMC's creation of FAST Cache and NetApp's Flash Cache (formerly known as PAM). How about this: instead of buying a VMAX with tons of cache, you get some CLARiiONs and put a VPLEX in front of them to take advantage of 32GB of cache per director times 2 directors in a single engine, equaling 64GB of cache available to front-end your data. One of the many reasons to purchase an expensive Tier-1 class array such as a VMAX or HDS USP-V is the massive amount of cache you can outfit it with. Will it be possible for new storage architectures to utilize multiple mid-tier/lower-class arrays (instead of a VMAX... CX4s... or some AX4s?!), front-end them with VPLEX, and gain equal performance (distributing the I/O among a bunch of VPLEX FE ports in an active/active manner with tons of cache out front) and reliability (through device mirroring with the VPLEX), but at a much lower price point than a Tier-1 array? Isn't one of the big selling points of an enterprise array TONS of FE ports with active/active access, and TONS of cache with maximum uptime? Granted, I am over-simplifying how a solution like this would be designed compared to just purchasing a $1M Tier-1 array and calling it a day. Time will tell.

I/O IMPLEMENTATION

The VPLEX cluster utilizes a write-through mode whereby all writes are written through the cache to the back-end storage. Writes are completed to the host only after they have been completed to the back-end arrays, maintaining data integrity. This section describes the VPLEX cluster caching layers, roles, and interactions. It gives an overview of how reads and writes are handled within the VPLEX cluster and how distributed cache coherency works.

CACHE LAYERING ROLES

All hardware resources (CPU cycles, I/O ports, and cache memory) are pooled in a VPLEX cluster. Each cluster contributes local storage and cache resources for distributed virtual volumes within a VPLEX cluster. As shown in Figure 8, within the VPLEX cluster, the DM (Data Management) component includes a per-volume caching subsystem that provides the following capabilities:
1. Local Node Cache: I/O management, replacement policies, pre-fetch, and flushing (CCH) capabilities.
2. Distributed Cache (DMG – Directory Manager): cache coherence, volume share group membership (distributed registration), failure recovery mechanics (fault-tolerance), RAID, and replication capabilities.


Figure 8 Cache layer roles and interactions

Nodes that export the same volume form a share group. Share group membership is managed through a distributed registration mechanism. Nodes within a share group collaborate to maintain cache coherence.

CACHE COHERENCE

Cache coherence creates a consistent global view of a volume. Distributed cache coherence is maintained using a directory. There is one directory per user volume, and each directory is split into chunks (4096 directory entries within each). These chunks exist only if they are populated. There is one directory entry per global cache page, with responsibility for:
1. Tracking page owner(s) and remembering the last writer
2. Locking and queuing

META-DIRECTORY

Directory chunks are managed by the meta-directory, which assigns and remembers chunk ownership. These chunks can migrate using Locality-Conscious Directory Migration (LCDM). This meta-directory knowledge is cached across the share group for efficiency.


HOW A READ IS HANDLED

When a host makes a read request, VPLEX first searches its local cache. If the data is found there, it is returned to the host. If the data is not found in local cache, VPLEX searches global cache. Global cache includes all directors that are connected to one another within the VPLEX cluster. When the read is serviced from global cache, a copy is also stored in the local cache of the director from which the request originated.

If a read cannot be serviced from either local cache or global cache, it is read directly from the back-end storage. In this case, both the global and local caches are updated to maintain cache coherency.

Figure 9 How a read is handled

I/O flow of a read miss 1. Read request issued to virtual volume from host. 2. Look up in local cache of ingress director. 3. On miss, look up in global cache. 4. On miss, data read from storage volume into local cache. 5. Data returned from local cache to host.

I/O flow of a local read hit 1. Read request issued to virtual volume from host.


2. Look up in local cache of ingress director. 3. On hit, data returned from local cache to host.

I/O flow of a global read hit 1. Read request issued to virtual volume from host. 2. Look up in local cache of ingress director. 3. On miss, look up in global cache. 4. On hit, data read from owner director into local cache. 5. Data returned from local cache to host.

HOW A WRITE IS HANDLED

All writes are written through cache to the back-end storage. Writes are completed to the host only after they have been completed to the back-end arrays. When performing writes, the VPLEX Data Management (DM) component includes a per-volume caching subsystem that utilizes a subset of the caching capabilities:
1. Local Node Cache: cache data management and back-end I/O interaction.
2. Distributed Cache (DMG – Directory Manager): cache coherence, dirty data protection, and failure recovery mechanics (fault-tolerance).

Figure 10 How a write is handled


I/O flow of a write miss 1. Write request issued to virtual volume from host. 2. Look for prior data in local cache. 3. Look for prior data in global cache. 4. Transfer data to local cache. 5. Data is written through to back-end storage. 6. Write is acknowledged to host.

I/O flow of a write hit 1. Write request issued to virtual volume from host. 2. Look for prior data in local cache. 3. Look for prior data in global cache. 4. Invalidate prior data. 5. Transfer data to local cache. 6. Data is written through to back-end storage. 7. Write is acknowledged to host.


WHAT HAPPENS WHEN CLOUDS RAIN?

There are compelling and widely understood arguments that cloud computing brings efficiencies and savings. There are, however, also widely held misconceptions that cloud computing brings serious risks to business information. We need to set the record straight.

The varied benefits of cloud computing are undoubtedly worth pursuing, and range from energy savings to greater effectiveness and better staff utilisation. But let's be blunt: cost-cutting tops most companies' lists of priorities in these challenging economic times. If you want to attract the managing director's attention, you need to talk about money: making more or spending less.

Having turned from futuristic possibility into increasingly well-established practice, the cost of 'outsourcing to the cloud' is now falling dramatically. It is no longer rare for a company to consider cloud computing rather than in-house data storage, and the chance to save money is playing an increasingly important role in that decision.

With cloud computing, a company is charged for the use of software applications and for data storage, accessed over the Internet, just as it is charged for electricity. Because the company pays only for the resources it uses, operating costs can be reduced. After all, in in-house data centres, 85 to 90 percent of available capacity is typically left idle. Cloud computing can lead to energy savings too, removing from individual companies the costly burden of running a data centre plus generator back-up and uninterruptible power supplies.

MANAGING THE RISK

So cloud computing is on many people's radar this year, not least because of the attractions to budget-conscious and performance-orientated businesses. Which arguments, then, will win over the skeptics?

Realism helps. There are risks to cloud computing, just as there are risks to any IT migration. Managing and reducing those risks to an acceptable level is core to strategic success; first, when thinking about the options presented by cloud computing and second, when actually implementing the process.

The risk management process begins when choosing a service provider. Naturally, you need to be confident your business information will be secure. You need to carry out due diligence on the service provider before you entrust this firm with your vital data. Compliance questions should include looking at ISO 27001 and European Union 'Safe Harbor' certificates, Statement on Auditing Standards (SAS) 70 reports, and business continuity arrangements.


The challenge for procurement professionals is determining which questions to ask, what assurances should be in the contracts, and how much risk is being assumed when a service is moved to the 'cloud'. The key is to know which paths are good for your organisation today and which paths are going to be better tomorrow.

Cloud service providers are not unified in their approaches, their methods, or their technologies. There are, for instance, as many ways to implement virtualisation as there are hardware and software manufacturers. The concept of the cloud, however, matches this diversity very nicely. Arguing about the details of whether you are paying for platform-as-a-service (PaaS) or software-as-a-service (SaaS) seems less important if you can receive both options from a single provider. By using 'Everything-as-a-service' as a model, we can evaluate internal versus external hosted services for just about anything.

INCREASED DATA SECURITY

Cloud computing in 2010 does not necessarily offer weaker data protection than an in-house server or data centre. In fact, cloud computing can help to defend an organisation from IT security threats such as denial-of-service attacks, viruses, and worms (self-propagating pieces of malicious software).

By moving IT functions to a shared external service provider, even the smallest companies benefit from a comprehensive range of the latest security protection systems. Those small (or medium-sized) enterprises would otherwise rarely, if ever, be in a position to buy and implement all those state-of-the-art defense systems independently. The cost, financially and in terms of time and human resources, would simply be too high.

And that question of staff utilisation is an important point for chief information officers. Outsourcing rarely-needed IT tasks and functions allows IT staff to focus on core work. Equally, rather than having an IT team spend valuable time monitoring the market for new products, and then facing the challenges of integrating those products into an organisation, cloud computing enables up-to-date software suites to be painlessly introduced across a company 'from above' by the service provider.

There are a growing number of external security providers catering to the 'cloud' and, because of the nature of networks, security monitoring can actually reside anywhere. Internet service providers are already capable of detecting viruses and worms in transit. A self-diagnosing and self-cleaning 'cloud' might not be far behind. Cloud computing is not a new and frightening idea but an established, positive, and secure IT option for many businesses. Not every cloud brings rain.


IN FUTURE, CDP USING VPLEX

DATA PROTECTION CHALLENGES

Protecting data is the key to a successful business operation. However, implementing real-time application recovery for critical data is not a simple proposition. The first step, even before analyzing data protection solutions, is to understand the current business processes and develop a clear set of objectives and plans that reflect what is required to safeguard against any disaster that could make the data at the primary site unavailable.

An evaluation of the data utilized by your business applications must be completed as part of designing a protection solution. The reason is that if the production volumes go offline due to a disaster and your business processes must be restarted from a recovered image, you need to know how much delay and data loss you can tolerate before you are unable to restart production. According to the U.S. National Archives and Records Administration in Washington, D.C., 93 percent of companies that lost their corporate data for 10 days or more due to a disaster filed for bankruptcy within one year of the disaster. Of those companies, 50 percent filed for bankruptcy immediately. A PricewaterhouseCoopers survey calculated that a single incident of data loss costs businesses an average of $10,000 per hour.

There are two general principles that govern all recovery: recovery point objective (RPO) and recovery time objective (RTO). RPO defines how much data you are willing to lose when you recover data. For example, if you back up daily, the RPO would be 24 hours, which is the maximum amount of data loss that could occur between backup images. RTO defines the amount of time it takes to restart affected business applications from the recovered data. For example, once the data is recovered, it is necessary to restart the business applications based on the recovered data. This usually involves checking the recovered data for consistency, performing applicable log processing, starting the application with the recovered data, and then re-creating any missing data due to your RPO.

When evaluating a data protection solution, it is important to look at all of the capabilities of the solution. One of the first challenges to examine is the overall performance of the data protection solution. For example, will the solution handle highly transactional applications? Does the solution support the different RPO and RTO requirements for the different applications that are used? The recovery time of various solutions will also vary. As an example, recovery from a set of backup tapes may be on the order of hours to days, but recovery from a host-based or array-based snapshot may take less time, on the order of minutes to hours. In most use cases, complete system and application recovery using RecoverPoint will take only minutes.


A second area to evaluate is the costs inherent to the solution. These are not the fixed costs of the solution, but other items such as the additional costs required to manage multiple clone or snapshot images. Also important to evaluate is the cost of data loss and application downtime. If there is a critical business application, such as real-time financial transactions, the business cannot afford to lose any data in the event of a disaster, and it may be very important that the application comes back online in a matter of seconds without any noticeable impact to the end users. For other applications, a delay of a few minutes or hours may be tolerable. RecoverPoint dramatically reduces the amount of disk storage required for data protection and can support a zero RPO with its continuous data protection capability.

It is also important to evaluate the operational management impact of a data protection solution. Choosing the right recovery point is important to reduce the RTO. If you select an application-aware recovery point, you may considerably reduce the data consistency checking required by the application. If multiple point solutions are used, such as in a federated database environment, it is important to choose a product that can ensure consistent RPO and RTO across all the applications. Finally, while some solutions are ideal for data protection, they may offer little in the way of application-aware integration or may be challenged when it comes to supporting a test and development environment. RecoverPoint has application-aware integration with specific integration points for common applications such as Exchange, SharePoint, and SQL Server as well as for Oracle Database environments. A simple stand-alone as well as a complex federated environment can easily be supported with RecoverPoint. Finally, RecoverPoint is also integrated with other EMC software products, including NetWorker® and Replication Manager, and it offers a rich application programming interface that can be used to integrate RecoverPoint into existing customer configurations.

PROTECT AND RECOVER

Companies are driven to develop operational recovery capabilities that can protect their e-mail, business applications, images, and database environments. Using RecoverPoint, customers can reduce their RPO to zero, ensuring no loss of data. With RecoverPoint's instant recovery capability, their RTO can be reduced to seconds, minutes, or hours as compared to the hours, days, or weeks of alternative solutions. RecoverPoint enables application-consistent or crash-consistent recovery points with the granularity of a single write. Finally, RecoverPoint enables the federation of multiple critical applications across any supported storage infrastructure, enabling a true application environment restart.


DATA PROTECTION OPTIONS

Mission-critical applications usually require recovery aligned to the available RPO and RTO. For example, within any customer environment, there may be multiple applications, each with different data protection objectives for RPO and RTO. Common solutions for some of these applications include:
 Daily operational backups for 24-hour operation protection, with weekly full backups for longer-term archive

 Using periodic disk-based snapshots with remote replication to protect data in the event of a disaster at the local site, when the business needs to fail over to a remote location and be up and running in a short timeframe

 Using synchronous or asynchronous replication to enable quick recovery in the event of physical disk loss, particularly in test and development environments

All of these solutions have challenges. The nightly backup may fail, or a data loss may occur 12 hours into the new backup period, in which case any new data created since the last backup is lost, since the system can be rolled back only to the last recovery point.

Data protection options

Disk-based snapshots provide a smaller recovery point window, usually as short as three hours; however, there is still a gap in recovery between snapshots. Synchronous and asynchronous replication ensures there is no recovery window; however, both the production and mirrored data can be impacted by logical corruption.


CONTINUOUS DATA PROTECTION WITH APPLICATION BOOKMARKS

A new approach, continuous data protection (CDP), is picking up interest in any type of environment that has short recovery objectives, including database or messaging applications such as Microsoft SQL Server or Exchange, Oracle, or SAP.

CDP uses a journal-based architecture that captures time-indexed recovery points, taking small aperture snapshots as small as a single write. Using this journal, CDP can ensure data recovery back to any point in time. Users can bookmark recovery points to recover back to specific points in time, such as the close of a quarter or a pre-patch state. It is also possible to create application-aware I/O bookmarks.

CDP USING VPLEX

VPLEX already requires the use of journaling devices for the purpose of logging writes during cluster interconnect failures in VPLEX Metro configurations. It does this so that when cluster communication resumes, it can synchronize the "secondary" volume with just the writes that have occurred since the failure, instead of doing a full resync. So it wouldn't seem to be a large leap to think that RecoverPoint-type replication for the DR1 volumes utilizing the journals would be possible down the road. If this were the case, you could have superior replication scenarios available to you by dropping the traditional block-based replication technologies found in current storage arrays and leveraging a "CDP DR1" volume through the VPLEX.

The following diagram illustrates these various storage objects again:

As you can see from the diagram, VPLEX offers a number of different ways to combine these objects:


 You can slice a single storage volume into multiple extents, then use each of those extents to create a separate device (and then a virtual volume). The VPLEX device might only occupy one of the extents on the storage volume, and other data might occupy other extents on the storage volume. This is the option on the left side of the figure.
 You can combine extents from different storage volumes together into a single device (and then a virtual volume). This is the option on the right side of the figure.
 Finally, you can create a single extent occupying all of a storage volume and use it to create a single device. This is the middle option illustrated above and, as you'll see shortly, is the generally recommended way of doing it.

Now that I've reviewed the storage concepts again, let me delve into the real meat of this post. Since the introduction of EMC VPLEX and its storage federation functionality (which, apparently, I'm not supposed to call storage federation), organizations have another choice in their disaster avoidance/disaster recovery (DA/DR) plans. In addition to utilizing data replication solutions, organizations now have the option of integrating VPLEX's data synchronization functionality into their DA/DR solution. Is there room for VPLEX in an organization's DA/DR planning? The answer is yes! While there has been a great deal of confusion about how, if at all, VPLEX could be used in conjunction with (rather than instead of) existing replication solutions, it is possible and it is fully supported (with a few caveats).

So what would a supported configuration that uses both VPLEX and replication look like? The following diagram graphically depicts a supported solution that integrates both EMC VPLEX and a data replication solution such as SRDF® or RecoverPoint:


The key to a supported solution that combines VPLEX and replication lies in the phrase "single extent/single device". In order to use both replication and VPLEX in the same solution, you must use a 1:1:1 mapping between storage volumes, extents, and devices. In other words, you must use a "single extent/single device" approach, where each storage volume has only a single extent (occupying the entire storage volume), and devices are built from that single extent on a single storage volume.

If you think about it, you can easily see why this is the case. Let's consider a scenario in which you wish to combine VPLEX with RecoverPoint:

 What is the smallest unit of replication for RecoverPoint? A single LUN. Sure, you can replicate multiple LUNs or groups of LUNs, but the smallest unit of replication is a single LUN. You can't replicate part of a LUN.
 What is the smallest unit of federation for VPLEX? An extent, which might be part of a back-end storage volume (a LUN). You can federate multiple extents or groups of extents, but you can't federate a part of an extent.
 How do we get these two units lined up with each other? If RecoverPoint works only on entire LUNs and VPLEX works on extents, then the way to line them up is to ensure that each extent represents an entire LUN. This makes RecoverPoint's basic unit (an entire LUN) the same as VPLEX's basic unit (an entire LUN).

With your replication solution replicating entire LUNs and VPLEX working on entire LUNs, I/O can pass through VPLEX to the back-end array, where the replication solution can pick it up and replicate it off to a third location. This gives organizations the best of both worlds: support for disaster avoidance using VPLEX at synchronous distances, and support for disaster recovery using a replication solution such as RecoverPoint or SRDF at longer distances.


WHAT IS THE USE OF SRM IF YOU HAVE A VPLEX?

There is still a use for SRM. If you have a VPLEX Metro configuration and one of your sites completely goes down, you still need some kind of process to bring up the VMs at your secondary site. Yes, the storage is available. Yes, you do not have to bring the volumes into a read/write state and rescan them into vCenter. But you still need some way to power on the VMs. Since you are not stretching a VMware cluster across two sites (because that is bad practice for many reasons), you cannot rely on VMware HA to restart these virtual machines. SRM can still fill the role of powering up the servers in a consistent, automated, and pre-determined fashion to bring the business back online. Hence EMC is developing an SRM SRA for VPLEX. It all depends on the failure scenario, as there would definitely be some scenarios that would NOT benefit from SRM in the environment as well.

The really cool thing is that even if the VPLEX were purchased for any of the above use cases, you STILL get the capability of doing all the other neat stuff such as data mobility, data migrations, and so on, while being set up for a private cloud down the road. The way EMC has priced the solution, I think it will be attractive to some customers for these use cases. The neat thing about technology is there are always many uses for the same gadget.

If money were no object, or if you needed VPLEX anyway, where does that leave VMware's Site Recovery Manager?

Points to keep in mind:

1. First, and perhaps most importantly, VPLEX and VMware SRM is not an "OR" discussion; it's an "AND" discussion. There is a VPLEX SRA currently planned. There are some issues to work through, such as the current requirement by VMware SRM for two vCenter Server instances, but I'm confident these issues will be resolved.
2. Second, the behavior of VPLEX in the event of an unplanned outage lends itself well to VMware SRM-like behavior. In the event that the VPLEX cluster in Site A (your primary site) loses connectivity to the VPLEX cluster in Site B (your secondary/DR/failover site), a set of rules defined by the user controls which cluster will continue to have read/write access to distributed devices. The other cluster will suspend I/O to distributed devices until a user manually resumes I/O. This makes VPLEX behavior in an unplanned outage act a lot like VMware SRM already, and is probably one of the reasons why a VPLEX SRA is under development. As long as there are steps that can be automated in some programmatic way, there is continued value for VMware SRM.
3. Third, it's important to keep in mind that the requirements for vMotion over distance include spanned Layer 2 VLANs (using something like Overlay Transport Virtualization from Cisco); VMware SRM has no such requirement. Further, VPLEX is currently limited to synchronous distances; VMware SRM is limited only by the underlying replication mechanisms. This means that VMware SRM continues to be a very valid deployment option even in organizations that may also deploy VPLEX.

The more applications you virtualize, the more you will be able to take advantage of products like VMware SRM and VPLEX. So what are you waiting for? Get virtualizing!

There are many solutions to a problem, and many problems to solve. It's all about marrying business requirements with technology. Surely, there are dozens of other non-standard use cases that this could be used for; these are but a few that came to mind for us.


GLOSSARY

A

AccessAnywhere – The breakthrough technology that enables VPLEX clusters to provide access to information between clusters that are separated by distance.

active/active – A cluster with no primary or standby servers, because all servers can run applications and interchangeably act as backup for one another.

active/passive – A powered component that is ready to operate upon the failure of a primary component.

array – A collection of disk drives where user data and parity data may be stored. Devices can consist of some or all of the drives within an array.

asynchronous – Describes objects or events that are not coordinated in time. A process operates independently of other processes, being initiated and left for another task before being acknowledged. For example, a host writes data to the blades and then begins other work while the data is transferred to a local disk and across the WAN asynchronously.

B

bandwidth – The range of transmission frequencies a network can accommodate, expressed as the difference between the highest and lowest frequencies of a transmission cycle. High bandwidth allows fast or high-volume transmissions.

block – The smallest amount of data that can be transferred following SCSI standards, which is traditionally 512 bytes. Virtual volumes are presented to users as a contiguous list of blocks.

C

cache – Temporary storage for recent writes and recently accessed data. Disk data is read through the cache so that subsequent read references are found in the cache.

cache coherency – Managing the cache so data is not lost, corrupted, or overwritten. With multiple processors, data blocks may have several copies, one in the main memory and one in each of the cache memories. Cache coherency propagates the blocks of multiple users throughout the system in a timely fashion, ensuring the data blocks do not have inconsistent versions in the different processors' caches.

cluster – Two or more VPLEX directors forming a single fault-tolerant cluster, deployed as one to four engines.

COM – The intra-cluster communication (Fibre Channel). The communication used for cache coherency and replication traffic.

controller – A device that controls the transfer of data to and from a computer and a peripheral device.

D

director – A CPU module that runs GeoSynchrony, the core VPLEX software. There are two directors in each engine, and each has dedicated resources and is capable of functioning independently.

disk cache – A section of RAM that provides cache between the disk and the CPU. RAM's access time is significantly faster than disk access time; therefore, a disk-caching program enables the computer to operate faster by placing recently accessed data in the disk cache.

distributed device – A RAID 1 device whose mirrors are in geographically separate locations.

E

extent – A slice (range of blocks) of a storage volume.

G

global file system (GFS) – A shared-storage cluster or distributed file system.

M

metadata – Data about data, such as data quality, content, and condition.

metavolume – A storage volume used by the system that contains the metadata for all the virtual volumes managed by the system. There is one metadata storage volume per cluster.

Metro-Plex – Two VPLEX Metro clusters connected within metro (synchronous) distances, approximately 60 miles or 100 kilometers.

miss – An operation where the cache is searched but does not contain the data, so the data instead must be accessed from disk.

P

plex – A single VPLEX cluster.

R

RAID leg – A copy of data, called a mirror, that is located at a user's current location.

remote direct memory access (RDMA) – Allows computers within a network to exchange data using their main memories and without using the processor, cache, or operating system of either computer.


S

storage view – A combination of registered initiators (hosts), front-end ports, and virtual volumes, used to control a host's access to storage.

T

tool command language (TCL) – A scripting language often used for rapid prototypes and scripted applications.

W

write-through mode – A caching technique in which the completion of a write request is communicated only after data is written to disk. This is almost equivalent to non-cached systems, but with data protection.
