Current Trends in Data Storage Backup and Restoration

February 13, 2003 Tom Coughlin Coughlin Associates www.tomcoughlin.com Outline vStorage Demand Drivers vBackup and Recovery Trends vMajor Trends in Backup

O Storage Hierarchy and Data Lifecycle

O Tape Storage

O Enhanced Backup

O Disk Drive for Backup/Recovery

O Form Factor Changes

O Electrical Interface Development

Information Details v Roughly 8 EB of digital data produced in 2002. v 90% of data on disk is never or seldom accessed after 90 days+ v 90% of digital data is on removable storage* v 80% of digital data is replicated data* v Disk utilization is often as low at 35-45% ^ v Disk storage is the most expensive component in the data center

+Horison Information Services *UC Berkeley ^Gartner/Credit Suisse Need for Storage Administration Data Protection v Provide Business Continuity Even If Data Is:

O Accidentally Erased or Modified

O Maliciously or Accidentally Modified

O Corrupted

O Catastrophically Lost v Maintain an Accurate, Up-to-Date Copy of the Data v Do Not Allow This Copy to Get Modified, Corrupted, or Lost v Use This Copy to Get Back in Business Quickly Disaster recovery Depends upon effective backup and rapid data recovery. Costs of Site Downtime Brokerage $5.6M - $7.3M Credit Card Authorization $2.2M - $3.1M Home Shopping $87k - $140k Airline Reservations $67k - $112k Subway Ticket Sales $56k - $82k Parcel Shipping $24k - $32k ATM $12k - $17k

This is why rapid recovery is critical!

Gartner Group / Dataquest Many Backups are through Networks

SANs connect: v Storage to Servers in the data center

IP connects v Users to Servers on the LAN or Internet

Data Lifecycle (modified from StorageTek)

Capacity Disk Migration Recovery Time vs. Cost (from StorageTek) Tape Applications vLargest single application is in back-up (>75%). Remainder is archive vAbout half of average system price is for the autoloader systems and half is for the drives themselves vMost backup using Veritas or Legato backup software, little NT or Unix. vBiggest growth area is libraries for NAS or SAN systems StorageTek Tape Library Major Backup Tape Formats

AIT

DLT

LTO Tape Benefits

vGood Archival Medium

O Shock Resistance

O Packing Density

O Transportability

vCheap Media Cost Tape Challenges v Sequential Access

O Slow data restoration v Degradation During Long DLT Tapes Needed Term Storage to Back-Up typical High-End NetApp Filer 40 O Re-tensioning, bleed 30 through, … 20 3X v Lack of Scalability with 10 0 Data Growth 1997 2003

O Capacity

O Throughput v Periodic Verification Difficult

O Especially if Offline Tape Capacity Growth Trend vs. Technology

100000 AIT (GB) DDS (GB) 10000 DLT LTO 30% CAGR 1000 60% CAGR 100% CAGR 100 120% CAGR Tape Capacity (GB) Tape Capacity

10

1 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 Tape Market Observations vTape prices tend to be very stable, <5% price erosion on systems per year vAverage drive price is about $5k (S-DLT) vAverage tape price is about $50 (S-DLT) vTechnology changes such as areal density growth and data rate improvements much slower than disk drives (<60% CAGR in Areal Density growth) Enhanced Backup v More than 80% of the cost of backup is operational costs, mostly manpower, to support backup. v Since the core rate of tape technology development is different than disk backup, solutions with tape alone are scaling more slowly than the primary storage. v This leads to a “backup crisis!” v By enhancing traditional tape backup with disk based solutions we can help customers avoid a “backup crisis” and provide enhanced performance improvements as well.

Enhanced Backup Exploit the Advantages of Disks to Protect Data

O Random Access •Fast Data Restoration

O Reliable

O Scalable

O Online Reliability Verification Backup Paradigm Shift Tape Tape

Offsite Offsite Backup Backup Archive Archive

Immediate Immediate Business Business Continuance Disk Continuance

??? Several Levels of Enhanced Backup

Level 3: Continuous Backup with Read-Write Access

Level 2: Changed-Block Backup with Read Access

Level 1: Backup to Disk as Tape Image Enhanced Backup - Level 1 Backup to Disk as Tape Image

O Data on Primary Storage Is Backed up to Nearline Disk Storage Using Traditional Backup Software

O Data on Nearline Storage Is in Proprietary Format

O Nearline Storage Is Backed up to Tape for Archiving Enhanced Backup - Level 1

UNIX Windows Server Server = File-level transfers

Daily Incremental

Network Backup Server

Weekly / Monthly Full

Disk Based Storage Tape Library Fast Data Access Enhanced Backup - Level 1 v Benefits

O Faster Restores From Random-access Disk Storage

O Eliminates the Need for Daily Incremental Backups to Tape

O Integrates Into Your Existing Infrastructure v Challenges

O Lots of Disk is Required for Full and Incremental Backups • One Changed Causes Entire File to be Backed up

O Restore Process Still Requires Human Intervention • Backup Copy Cannot Be Directly Accessed

O Backing up Remote Offices Is Not Practical Using This Approach • Requires a Robust WAN Network Enhanced Backup - Level 2

Changed-Block Backup with Read Access v Data Is Backed up to Nearline Disk Storage

O Only the Initial Backup to Nearline Storage Is a Full Backup

O All Subsequent Backups Transfer Changed Data Only • Only Changed Blocks Are Stored v Backup Data on Nearline Storage Is in File Format

O Can Be Browsed By Users Enhanced Backup (Level 2)

SnapVault NetApp Solaris Windows Storage Server Server

Hourly/Daily Incrementals

Network

Backup Server Only Remote Data Center Changed Weekly / Monthly Full Blocks Backup Server Stored

Tape Library

WAN

Disk Storage System SnapMirror Enhanced Backup (Level 2) v Benefits

O Superior Data Protection • More frequent backups can be done and kept online • Immediate verification of backup data

O Fast Backups and Restores • Shrinks/eliminates the backup window

O Lower Backup Infrastructure costs • Less storage utilized to store backup copies • User initiated file restores v Challenges

O Files Need to Be Restored Before Use • Restore Is Delayed Until a New System or Free Disk Space Can Be Located

O Doesn’t Solve Immediate Business Continuance • Separate Solution Required Enhanced Backup (Level 3) vContinuous Backup with Read-Write Access

O Backup Data on Nearline Storage Can Be Made Write-able in the Event of a Disaster

O Once the Primary Storage Is Available, the Data on the Nearline Storage Can Be Re- synced With the Primary Storage Enhanced Backup (Level 3) 1. Level 2 Backup / Replication 2. Primary Storage down; Target made read/write

Source Target Source Target

Replication Volume Volume Volume Volume (Read/Write) (Read) X (Read/Write)

3. Primary Storage available 4. Level 2 Backup / Replication Reinitiated

Source Target Source Target

Re-Sync Replication Volume Volume Volume Volume (Read) (Read/Write) (Read/Write) (Read) Enhanced Backup (Level 3) vBenefits

O Superior Data Protection • More Frequent Backups Can Be Done and Kept Online • Immediate Verification of Backup Data

O Lower Backup Infrastructure Costs • Less Storage Utilized to Store Backup Copies • User Initiated File Restores

O Solves Backup and Business Continuance Issues •One Solution vChallenges

O New Paradigm Addressing Traditional Backup Pain Points

Backup Level 1 Level 2 Level 3 Traditional Backup Pain Points to Tape Primary Storage impact during backup x 3 3 Backup window shrinking is an issue x 3 3 Restoring data takes a long time x 3 3 Takes a long time to verify backup data x x 3 3 Backups consume a lot of tape media x 3 3 Backups consume a lot of network bandwidth x x 3 3 Backup & restore process fails thereby requiring constant x 3 3 monitoring Restores normally require administrator involvement x x 3 3 Remote backups are not dependable and costly to manage x x 3 3 and administer

xx DoesDoes notnot addressaddress HelpsHelps addressaddress 33 FullyFully addressesaddresses Nearline and Enterprise Drives

Seagate Cheetah Product Caviar Product 73.4 GB, 15,000 RPM, FC/SCSI 200 GB, 7,200 RPM, PATA

Maxtor MaxLine Product Western Digital Raptor Product 320 GB, 5,400 RPM, SATA 36.7 GB, 10,000 RPM, SATA ATA-Based Storage Systems

Quantum DX30 The DX30 separates backup functions from archive functions to optimize the data protection process.

Nexsan ATABeast Nexsan's STK Bladestore product 14 TB for 7 cents a MB uses 5-3.5 inch drives on blade acting as one drive to a fibre channel output Nearline Storage Disk Drive Trends v Increasing storage and lower $/GB

O Currently 60 and 80 GB/3.5 inch disk • Maxtor 320 GB, 4 disk, 5400 RPM • Maxtor, WD 200+ GB 7200 RPM

O Next year 120-160 GB/3.5 inch disk

O Within 2-3 years 1 TB 4-disk drive will happen! v New serial interfaces

O Serial ATA (SATA)

O Serial SCSI (SAS) v Growing use of external drive boxes with USB or 1394 interfaces v New small form factor drives for mobile devices

O 1.8 inch 20+ GB drives and small drive developments External Drives (USB or Firewire) or with small NAS devices on a LAN

Maxtor PS5000 SNAP Storage with one-touch Appliances backup iVDRiVDR IInformationnformation VVersatileersatile DDiskisk forfor RRemovableemovable usageusage — Common HDD platform for PC and Consumer AV usage regardless of products and manufactures — Compact and Removable — Large Capacity and High-Speed Access — Content/Data Protection — Open Standard Possible Backup NAS Device using iVDR drives 1000

900

800

Estimated700 ASP Trends

600 ENTERPRISE PORTABLE 500 DESKTOP

400

Price ($) 300

200

100

0 1990E 1992E 1994E 1996 1998 2000 2002E AREAL DENSITY PROGRESSION (Source: PRC, 2002)160 140 n 120 100 80 60 40 20 Areal Density (Gb/i 0 Q1 2000 Q1 2000 Q2 2000 Q3 2000 Q4 2001 Q1 2001 Q2 2001 Q3 2001 Q4 2002 Q1 2002 Q2 2002 Q3 2002 Q4 TECHNOLOGY PRODUCT

19 SHIPPING PRODUCT AREAL DENSITY PROJECTIONS

Year Areal Density CAGR 95mm Avg. Capacity Per Platter 2000 120% 15 2001 100% 30 2002 90% 60 2003 80% 108 2004 70% 184 2005+ 60% 294

64 Disk Cost Trends 3.5 Inch ATA Network Storage

Drive Capacity2000 and Price/GB5 1800 4.5 Drive 1600 Capacity 4 $/GB 1400 3.5 G 1200 3

1000 2.5 B

800 2 $/G 600 1.5

Drive Capacity ( 400 1 200 0.5 0 0 2001 2002 2003 2004 2005 As low cost disk drive storage decreases in price it offers greater economy to disk to disk backup and the use of disk drives for backup cache.

1000 Tape Drives 100 Tape Drive + 100 Media 10 IDE RAID $/GB 1 Tape Media 0.1 IDE Drives 0.01 1996 1998 2000 2002 2004 2006

Tape Drive + 100 Media IDE Drive Ghetto RAID Comparison of Straw Man DLT Tape vs. IDE Disk Backup System (Note that Tape has 2:1 Compressed capacity vs. disk drive native capacity) Attribute DLT Tape IDE Drive Drive Libray Access Time 60 sec <15 ms (>4000 X faster) Data Rate 6 MB/s >46 MB/s (>7 X faster) Removability Yes Could be (drive (Cartridges) carriers) A. D. CAGR <60% >80% Sequential Random Access Access DATA PROTECTION MARKET OPPORTUNITY v Backup Arrays include Disk Arrays Used in Backup Revenue Forecast in $Billions O Virtual Tape, D2D Backup, Point-in-time Backup, $5.0 Snapshot Backup v Backup Array revenue grows to $5.1B in 2005 offsetting the $4.0 Tape Library Market

O Tape Library growth $3.0 reaches $3.1B in 2005 v Disk usage expands as a secondary data protection device $2.0 relegating tape to an archive role O Tape libraries are the central automated archive $1.0 repository O 60%+ of mainframe data is $0.0 now protected by disk – 2001 2002 2003 2004 2005 Virtual Tape Backup Disk Arrays Tape Libraries

Strategic Research Corp., Nov. 2002 Transition to Smaller Form Factors

v 2.5 inch most popular mobile computer drive form factor. v 1.8 inch mobile computers now appearing, smaller size drives??? v 60-65-mm disks used in 15k RPM enterprise disk drives (although not yet in 2.5 inch form factor box). Cooling issues v For new consumer products size and volume will become important. v Dense server and storage environments favor many more smaller drives. This also gives better performance since the time to data is faster for smaller form factors v New consumer electronics initiatives using smaller form factor disk drives such as the Japaneses iVDR consortium. v In volume 2.5 inch drives should be as inexpensive or less expensive per box compared to 3.5 inch disk drives. Capacity5000 vs. Form Factor (Same Areal Density, 4 Disks)

4000 95 mm high end 65 mm high end 48 mm high end 27 mm high end 3000 2002 95 mm

2000 Capacity (GB)

1000

0 2002 2003 2004 2005 2006 2007 2008 Volumetric Density Comparison

18.0

16.0 65 mm, Enterprise

) 95 mm, Nearline 65 mm Enterprise 14.0 95 mm, Enterprise 2 disk, mobile form factor 12.0

10.0 95 mmNearline 4 disks 8.0

6.0 95 mm Enterprise 6 disks 4.0 Volumetric Density (MB/sq. mm (MB/sq. Density Volumetric

2.0

0.0 2002 2003 2004 2005 2006 2007 2008 Disk Drive Form Factor Changes

100

10

Percentage (%) 1

0.1 2000 2001 2002 2003 2004 2005 2006

<1.8 inch 2.5 inch 3.5 inch 5.25 inch

TimeDrive Interface Migrations

2001 Overall HDD Market Parallel Serial 10% ATA ATA Enterprise ATA is cost-optimized for Desktop non-mission critical applications

2001 Enterprise HDD Parallel Serial Attached Market SCSI SCSI 9% P-SCSI Serial Attached SCSI addresses the performance and Fibre Channel reliability needs of enterprise environments Other

Fibre Serial Attached SCSI & Channel Fibre Channel Fibre Channel continues to pursue long-distance and connectivity solutions associated with SANs Fibre Channel Speeds and Feeds v1 Gigabit per second (100 MB) since 1996

O Physical layer adopted by Gigabit Ethernet v2 Gigabit per second (200 MB) since 1999

O Gigabit Ethernet won’t go there v4 Gigabit per second (400 MB) in 2003

O Only a disk drive interface – not fabrics v10 Gigabit per second (1200 MB) in 2003

O Physical Layer adopted from 10 Gig Ethernet Interface Technology Comparison

SerialSerial Attached Attached SerialSerial ATA ATA SCSISCSI Full-duplexFull-duplex with with Half-duplexHalf-duplex LinkLink Aggregation Aggregation PerformancePerformance 1.51.5 Gb/sec Gb/sec 3.03.0 Gb/sec Gb/sec (3.0(3.0 Gb/sec Gb/sec announced) announced) InternalInternal only only 6m6m external external cable cable ConnectivityConnectivity OneOne device device >128>128 devices devices NoNo peer-to-peer peer-to-peer Peer-to-peerPeer-to-peer

Single-portSingle-port HDDs HDDs Dual-portDual-port HDDs HDDs AvailabilityAvailability Single-hostSingle-host Multi-initiatorMulti-initiator

DriverDriver SoftwareSoftware transparent transparent SoftwareSoftware transparent transparent ModelModel withwith Parallel Parallel ATA ATA withwith Parallel Parallel SCSI SCSI CE Interface Speed Comparison

Serial ATA USB 2.0 1394 Serial ATA Gen 2 Interface 480 Mbps 400 Mbps 1500 Mbps 3000 Mbps speed

Time to Copy 40 sec 33 sec 11 sec 5 sec 2GB File

Download 16 360 sec 300 sec 97 sec 48 sec GB HD (6 min) (5 min) (1.6 min) (0.8 min) Movie

Back-up 1600 sec 1333 sec 427 sec 213 sec 80GB drive (27 min) (22 min) (7.1 min) (3.6 min) General SATA & SAS Timelines 2002 2003 2004 2005 1H 2H 1H 2H 1H 2H 1H 2H

SATA Controllers Dual Mode SATA/SAS Controllers

NAS/Nearline ⇒ Desktop Bridge SATA Demos FCS

SATA 1.0 SATA 2.0 • 1.5 Gb/s @1m cabling • SATA 1.0, plus •P-ATA Features • 3.0 Gb/s @1m cabling • Hot-plug enabled •SATA Command Queuing • Additional features

Server ⇒ Subsystems Spec Proposal Demo Qual SAS to ANSI T10 Units Units FCS

SAS 1.0 •3.0 Gb/s • >9m cabling • Parallel SCSI Features • 128 device addressing • Dual port Enabling Choices For Customers

- OR -

• A “properly designed” backplane can accommodate either SAS or SATA disk drives • SATA/High-Capacity disk drives can be used to enable “near-line” or tape augmentation applications • SAS/High-Performance disk drives can be used to enable “on-line” and performance-oriented applications • Enables OEMS, VARs & Integrators the ability to re-use designs and more easily broaden their product offerings Enabling Choices for Customers: SATA-SAS Subsystem Example

When drives can share a common controller & backplane, system designers & integrators are given more opportunities…

SATA drives with dual port, switched carriers for networked file storage Dual port SAS drives for main stream server applications Add-on JBOD or RAID storage with mixed drive classes

SATA drives integrate disk to disk backup in the server to shorten backup and restore times

SAS SATA drives drives

Conclusions v Data storage continues to grow. More things made digital. v Greater need than ever to preserve our digital assets through backup and archive. v Tremendous financial incentives tied to rapid recovery. v Disk based backup will displace tape in many backup and restoration applications to create Enhanced Backup Storage. v Three phases of Enhanced Backup Storage discussed, each leading to greater automation of backup and restore operations v Changes in disk areal density and interfaces will lead to higher performance and less costly backup storage. v Digital backup and archive remain a major component in data storage growth.