Using Disks in Backup Environment and Virtual Tape Library (VTL) Implementation Models Emin ÇALIKLI
Emin ÇALIKLI
Senior Technical Consultant
Gantek Technologies – Turkey
[email protected]

EMC Proven Professional Knowledge Sharing 2009

Table of Contents

Introduction
Do we really know the exact meaning of RTO & RPO?
What do you expect from disks?
I/O pattern difference between OLTP Systems and Backup Systems
Disk Space Utilization Problems on Every Tier
VTL implementation types
Summary
References
Biography

Disclaimer: The views, processes, or methodologies published in this compilation are those of the authors. They do not necessarily reflect EMC Corporation's views, processes, or methodologies.

Introduction

IT professionals have compared storing data on tape and disk for many years, and the demands remain the same. Most believe that disk storage will overtake tape technology and that we will use disks to meet all of our storage requirements. Yet it is difficult to meet long-term data retention needs with disk solutions due to cost. Still, is a disk less expensive than tape?
Of course, it depends on your point of view and your application characteristics. Many learn about backup only when they need a consistent restore. We live in a digital world, and we have to protect critical information in every way possible.

Do we really know the exact meaning of RTO & RPO?

Unless we discover a time machine, it is impossible to predict disasters. If a disaster is possible, it is only a matter of time before it happens. Data protection is a crucial service requiring high-level commitment between the business and IT.

There are two important terms for quantifying the "money loss rate" caused by a disaster: Recovery Point Objective (RPO) and Recovery Time Objective (RTO). RPO represents "data loss tolerance," while RTO answers "how quickly can I return to my last consistent state?" The following picture represents a typical disaster flow.

Usually, people focus on the RPO side and calculate their loss using RPO values. This is not sufficient; for me, RTO is the most important factor in calculating monetary loss. I propose this basic formula:

Amount of Loss = VLD + LTC + OC

- VLD (Value of Lost Data): the cost of data that was already processed when the disaster struck. The loss rate depends on time rather than on the amount of data.
- LTC (Lost Transaction Cost): the cost of transactions that could not be handled during the recovery process (e.g. customer transactions, deal processes).
- OC (Other Costs): related systems' downtime costs, operational costs, hidden costs.

The processes of realizing that a disaster has occurred and determining how to respond make it impossible to recover the system(s) in zero seconds. If the subject is RPO, we can approach zero seconds using synchronous replication, Continuous Data Protection, mirroring, etc. Let's use the following picture to review data protection levels. There are many data protection methods; which one to use depends on your application availability requirements.

What do you expect from disks?
The largest effort is to obtain maximum throughput from the system on every tier. Fast response means fast data delivery; fast data delivery means satisfied customers. No one likes to wait in any type of queue; this is the key consideration for performance management. Disks are the only storage solution that we can use effectively for online applications; memory and Solid State Disk (SSD) devices are expensive and have limited capacity.

Expectations from disks, in order of priority for OLTP systems:

1. Throughput (IOPS)
2. Capacity (GB)
3. Bandwidth (MB/s)

Disks are designed for random-access I/O and, if designed correctly, can also satisfy sequential-access requirements. We usually combine disks to achieve improved performance and a reliable storage environment. Disks are not appropriate devices for long-term data storage, as they are more expensive than tapes; furthermore, disks are very sensitive to physical shock.

If we are talking about tape, our priorities are reversed:

1. Bandwidth (MB/s)
2. Capacity (GB)
3. Throughput (IOPS)

Tapes are not designed for random-access operations; they are designed for sequential access. Backup operations require high bandwidth and sequential access. Tapes are also the correct devices for long-term data retention needs: no one expects rapid file delivery from an archive, so tapes do not need to spin like disks. Tapes are very useful for backup and archive operations due to their cost and performance.

I/O pattern difference between OLTP Systems and Backup Systems

Let's see how Throughput (IOPS) and Bandwidth (MB/s) change as the I/O block size changes. This is a conceptual comparison of throughput and bandwidth; in practice, the graph lines are not as smooth. Still, the illustration tells us a lot about disk usage behavior in every tier of IT.
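The throughput-versus-bandwidth relationship can be sketched with a back-of-the-envelope calculation: delivered bandwidth is roughly IOPS multiplied by block size. The function and the numbers below are my own illustration, not figures from this paper.

```python
def bandwidth_mb_s(iops: float, block_size_kb: float) -> float:
    """Approximate delivered bandwidth (MB/s) as IOPS x block size."""
    return iops * block_size_kb / 1024.0

# Small OLTP-style blocks: many I/Os per second, modest bandwidth.
oltp = bandwidth_mb_s(iops=5000, block_size_kb=8)      # ~39 MB/s
# Large backup-style blocks: far fewer I/Os, much higher bandwidth.
backup = bandwidth_mb_s(iops=200, block_size_kb=1024)  # 200 MB/s
print(f"OLTP: {oltp:.1f} MB/s, Backup: {backup:.1f} MB/s")
```

This is why the priority orders above are mirror images: an OLTP workload is limited by how many small I/Os the disks can complete, while a backup stream is limited by how many megabytes per second they can move.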
A small block size means low bandwidth usage and high throughput; a larger block size means high bandwidth and low throughput. This is why we have to consider I/O requirements when designing disk storage for an OLTP environment. Architects and designers should incorporate the application's I/O characteristics and I/O distribution (the mix of read and write requests) into a successful storage design. Disks can act as I/O-centric devices or capacity-centric devices: some applications require high throughput, while others require high disk capacity. For backup, disks are a good solution for near-term data storage.

Cache is an indispensable component of storage systems, since it improves read and write performance dramatically. However, when the workload is heavy sequential access with large block sizes, cache can become a hindrance to the system; the effect depends on the system's data characteristics and I/O handling performance. If we want to use disks in a backup environment, we have to answer the following questions:

- How are the disks used by the backup software?
- Do I have to use (or configure) the storage cache?
- What is the backup application's block size?
- What are the estimated bandwidth requirements?

We have to remember that backup is not a business requirement; availability and continuity are the business requirements.

Disk Space Utilization Problems on Every Tier

Storage utilization is another problem. It is difficult, and sometimes impossible, to move unused disk space between systems. This situation is called "white space utilization" or "poor utilization." Storage administrators and server administrators often use different terms when talking about disk space utilization: storage administrators usually talk about "LUNs" and are not interested in their contents, while system administrators care about "disk sizes" or "disk performance" and are not interested in where the disk is located or how it is protected.
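The bandwidth question above, and the tape-drive discussion that follows, can be made concrete with a quick estimate of drive saturation and backup windows. All figures below are illustrative assumptions of mine, not vendor specifications or numbers from this paper.

```python
import math

# Illustrative assumptions (not vendor figures):
TAPE_DRIVE_MB_S = 120.0   # sustained rate one tape drive wants to see
CLIENT_MB_S = 15.0        # throughput of a typical slow backup client

def clients_to_saturate(drive_mb_s: float, client_mb_s: float) -> int:
    """How many concurrent clients one drive needs to keep streaming."""
    return math.ceil(drive_mb_s / client_mb_s)

def backup_window_hours(data_gb: float, mb_s: float) -> float:
    """Hours needed to move data_gb at a sustained rate of mb_s."""
    return data_gb * 1024.0 / mb_s / 3600.0

print(clients_to_saturate(TAPE_DRIVE_MB_S, CLIENT_MB_S))   # 8 clients
# 2 TB from one slow client straight to tape vs. restreamed from disk:
print(backup_window_hours(2048, CLIENT_MB_S))      # ~38.8 hours
print(backup_window_hours(2048, TAPE_DRIVE_MB_S))  # ~4.9 hours
```

Under these assumptions, a single slow client leaves the drive idle most of the time, while a disk staging area that can feed the drive at full speed shrinks the tape-writing phase dramatically; this is the motivation for the staging approach discussed next.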
Tapes are still less expensive than disks. Tape drives (the writing heads), however, are very expensive, so we need to use them efficiently during backup operations. The problem is that typical backup clients cannot saturate a tape drive. We need a staging area for low-speed backup clients, and at this point a disk looks like a good choice. So I need an intelligent backup device that solves these problems:

- Poorly utilized tape drive resources
- Long backup windows
- High tape costs
- Tape management and vaulting problems (tape security, theft or lost tapes, damage risk, etc.)

A traditional backup-to-disk-to-tape (B2D2T) approach will not solve all of these problems. I don't want to be busy with volume management, RAID configuration, space management, disk-to-tape operations, and so on; I want to copy the (consistent) data to a secondary (or perhaps tertiary) storage area as soon as possible.

Approximately 12 years ago, IBM developed the first Virtual Tape Library (VTL) [1] for its mainframe systems. A VTL is a hypervisor-like "virtualization layer" that presents disks as a tape library. This new approach helped to improve backup performance and reduce the number of tape drives. Several years later, ISVs (Independent Software Vendors) developed similar VTLs for Open Systems. A VTL (in EMC's naming terminology, an EDL) is also a good solution for reducing RPO/RTO values in backup environments. If you are using a VTL, you gain the following:

- Elimination of tape drive hardware problems
- Elimination of tape library partitioning problems
- Decreased tape drive costs
- Highly utilized tape drive resources (VTL staging or post-processing)
- Reduced backup windows
- Fewer security problems (backup encryption)

Without de-duplication, replicating backup to another