A Primer on Nearline and Archival Storage Solutions

WHITE PAPER

Nearline storage represents the position in the storage hierarchy between online and offline storage. Nearline storage is almost instantaneously accessible through the use of automated robotics-based removable media handling. No human intervention is required.

Explosive Data Growth

The need for nearline storage has been driven by the phenomenal growth of electronic data in recent years. Applications such as e-mail, multimedia, databases and e-commerce are contributing to explosive data growth at the enterprise level. Organizations are now doubling the amount of their data every 12 to 18 months. This ongoing need to store more and more data creates additional inefficiencies, because the storage management infrastructure cannot keep up with the growth in stored data.

Figure 1 – The rise of electronic data growth (data source IDC and SNIA)

Copyright © 2002 KOM NETWORKS Inc., All Rights Reserved 1 http://www.komnetworks.com A Primer on Nearline and Archival Storage Solutions

Concerns and Issues

Backup Issues

The explosive growth in data results in longer backup cycles for the simple reason that there is more data to back up. This is becoming problematic because organizations have smaller windows of opportunity in which to conduct backup operations, owing to increasing access requirements and 24 x 7 operations. If storage requirements are doubling every 12-18 months, then the load on the backup system is also doubling every 12-18 months. Scaling a backup system is not easy at the best of times, and it is even more difficult when the data load is constantly growing. Unless we can either reduce the amount of data that is regularly backed up or increase the time allocated to backup operations, we are heading down a dangerous path where backup jobs are not run regularly. This leads to a low probability of successful backups and an unpredictable ability to recover data.
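The doubling argument can be made concrete with a short projection. The sketch below is illustrative only. It assumes that backup duration scales linearly with data volume and that backup throughput stays fixed. The starting figures (a 6-hour backup in an 8-hour window) are hypothetical and are not taken from this paper.

```python
import math


def projected_backup_hours(initial_hours, months_elapsed, doubling_months):
    """Backup duration after months_elapsed, assuming the data volume (and so
    the backup time) doubles every doubling_months at constant throughput."""
    return initial_hours * 2 ** (months_elapsed / doubling_months)


def months_until_window_exceeded(initial_hours, window_hours, doubling_months):
    """Months until the backup no longer fits in the window (same assumptions)."""
    if initial_hours >= window_hours:
        return 0.0
    return doubling_months * math.log2(window_hours / initial_hours)


# Hypothetical starting point: a 6-hour backup in an 8-hour window.
for doubling in (12, 18):
    months = months_until_window_exceeded(6, 8, doubling)
    print(f"doubling every {doubling} months: window exceeded after {months:.1f} months")
```

Under these assumptions the headroom disappears well within a single doubling period, which is why reducing the volume of data backed up matters more than modest extensions of the window.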

Scenario – Cost of data recovery

XYZ Tax Services generates $10,000,000 per year from preparing tax returns using their own tax software application. 1,000 tax specialists regularly use the software to generate revenue for XYZ. The average hourly wage for an XYZ employee is $15/hour.

Let’s assume an average data recovery time of 4 hours should the system go down. The business operates 100 days per year and a typical outage would result in 20% of the data being lost. Recovery costs average $20,000.

However, this recovery cost does not truly represent the cost of the downtime to the company. There are two other costs to consider: the cost of lost data and the opportunity cost associated with employees being paid for work they are unable to perform.

$$\text{Average Cost of Lost Data} = \frac{\text{Annual Revenue}}{\text{Operating Days}} \times \%\text{ of Data Lost} = \frac{\$10{,}000{,}000}{100} \times 20\% = \$20{,}000$$

$$\text{Opportunity Cost} = \text{Employee Time Lost} = \text{Average Data Recovery Time} \times \text{Average Hourly Rate} \times \text{Number of Employees} = 4 \times \$15 \times 1{,}000 = \$60{,}000$$
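The two formulas above can be checked with a few lines of code. The sketch below uses only the figures given in the XYZ Tax Services scenario. The function names are illustrative, and adding the three components into a single per-outage total is this sketch's reading of the scenario rather than a figure stated in the paper.

```python
def cost_of_lost_data(annual_revenue, operating_days, fraction_lost):
    """Average revenue per operating day multiplied by the share of data lost."""
    return annual_revenue / operating_days * fraction_lost


def opportunity_cost(recovery_hours, hourly_rate, employees):
    """Wages paid while employees cannot work during data recovery."""
    return recovery_hours * hourly_rate * employees


# Figures from the XYZ Tax Services scenario.
recovery_cost = 20_000
lost_data = cost_of_lost_data(10_000_000, 100, 0.20)
idle_time = opportunity_cost(4, 15, 1_000)

# Assumption of this sketch: the three components simply add up.
total_outage_cost = recovery_cost + lost_data + idle_time
print(f"Cost of lost data: ${lost_data:,.0f}")
print(f"Opportunity cost:  ${idle_time:,.0f}")
print(f"Total per outage:  ${total_outage_cost:,.0f}")
```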

Figure 2 - Cost of Outage, % of Total Revenue and Incidents/year for XYZ Tax Service (Data Source SNIA)

As the example above shows, even a few outages a year can have a significant impact on a company’s bottom line.

Business | Industry | Hourly Downtime Cost
Brokerage Operations | Finance | $6,450,000
Credit Card Sales Authorization | Finance | $2,600,000
Home Shopping (TV) | Retail | $113,000
Pay-per-view | Media | $150,000
Catalog Sales | Retail | $90,000
Airline Reservation | Transport | $90,000
Tele-ticket Sales | Media | $69,000
Package Shipping | Transport | $28,000
ATM fees | Finance | $14,500

Source: Fibre Channel Industry Association

Table 1 – Typical hourly downtime cost for businesses

Inactive Data

As the amount of data storage grows, so too does the amount of inactive data. Studies consistently show that only a small portion of data is frequently accessed. A typical file server contains 20% active data and 80% inactive data. This implies a non-optimal use of primary storage resources.

One of the major reasons organizations accumulate significant amounts of inactive data is that they tend to add capacity without considering how data is managed and what gets stored where. Because of the tremendous time pressures they work under, IT administrators often don’t take the time to plan for moving or migrating data that is no longer current or is infrequently accessed.

Figure 3 – 80% of the data in a typical File Server is accessed infrequently

Aged Data

In most instances, the value and relevance of data to an organization decreases over time. “Aged” data may continue to be stored for extended time periods due to taxation, regulatory or other business-related reasons. However, maintaining inactive and aged data on online storage resources is both unproductive and costly. There is a tremendous need to implement a suitable storage strategy to handle most of an organization’s inactive and aged data.

Figure 4 – Value of Data with Time

Storage Cost

Inactive data creates longer backup windows, requiring greater resources and increased management costs. It also results in longer recovery times. Keeping rarely accessed data online can be very expensive for several reasons.

With the rapid decline in prices of online storage such as NAS devices, it may appear that online storage is very inexpensive. The hardware cost alone doesn’t show the complete picture. A comparison based on Total Cost of Ownership (TCO) includes hard costs such as hardware, software and maintenance, and soft costs such as the management effort for monitoring, scheduling, reviewing, managing off-site vaulting, and installing and configuring new hardware. An analysis based on TCO will show that online storage is much more expensive than either nearline or offline storage.

Typically, online storage capacity cannot be scaled without purchasing another online storage device, whereas the capacity of nearline and offline storage devices can be expanded by purchasing additional media at a fraction of the cost.

One also has to consider the costs due to disruption in operations. This occurs all too often when organizations run out of space on their online storage devices because of the explosive growth of their data.

Figure 5 – Storage Management with various storage strategies (panels: Distributed DAS, Centralized DAS, Centralized Virtualized)

Up until now, the majority of organizations have implemented a distributed management strategy for their direct-attached storage (DAS) infrastructure. This is a simple strategy that comes with a high management cost. With this strategy, storage management cost and administrative complexity both grow as storage capacity grows.

The centralization of storage management activities offers improved total cost of management. A centralized DAS infrastructure management strategy has the potential to allow administrators to manage twice the storage capacity managed in a distributed DAS infrastructure using the same resources. This strategy can result in a 40% reduction in management cost.

The pooling of storage (“virtualization”) offers further improvement in the total cost of managing the storage infrastructure. A centralized and virtualized storage environment has the potential to allow administrators to manage eight times more storage capacity while reducing the management cost by 70% compared with a distributed DAS environment.

Overall, the cost of managing storage far exceeds the cost of storage hardware.

A Solution: Nearline Storage

Organizations can save significant amounts of time and money by migrating less frequently accessed data to nearline storage devices such as automated jukeboxes and libraries for removable storage media. The key point to note here is that there is no performance loss – through data migration, the system becomes more efficient and nearline storage offers relatively fast data accessibility. It’s a win-win situation!

Figure 6 – A typical nearline storage solution (components shown: clients, storage server, media library)

User Impact

With most nearline storage solutions, users and applications are unaware of the physical data location, and the mapping of logical storage objects to physical storage objects is transparent to them. The data always appears to be online, and the logical storage object appears to have near-infinite storage capacity. An individual piece of removable storage media, or a group of them, appears as a single logical storage object.
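To illustrate what a logical-to-physical mapping involves, the following minimal sketch resolves a byte offset in a logical volume to a piece of media and an offset on it. It assumes the simplest possible layout, in which the media are concatenated in order. The function name and layout are hypothetical and do not describe how any particular product implements its mapping.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(logical_offset, media_capacities):
    """Map a byte offset in a logical volume to (media index, offset on that media).

    media_capacities lists the capacity in bytes of each piece of media,
    in the order in which they are concatenated (an assumed layout).
    """
    ends = list(accumulate(media_capacities))  # cumulative end offset of each media
    if not ends or not 0 <= logical_offset < ends[-1]:
        raise ValueError("offset outside the logical volume")
    index = bisect_right(ends, logical_offset)
    start = ends[index - 1] if index else 0
    return index, logical_offset - start


# Hypothetical volume made of three small pieces of media.
print(locate(120, [100, 100, 50]))
```

The caller only ever sees the logical offset. Which platter holds the data, and whether it is currently loaded in a drive, is resolved behind that interface.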

Nearline storage solutions allow archived ‘inactive’ data to remain accessible to users and applications while devoting online storage to ‘active’ data. The only concern is the potential delay in accessing data that has been migrated to removable media. This concern can be greatly alleviated by using a removable storage system that provides reasonable access times.

Sometimes, users and applications need to know where to look for archived and migrated data. Administrators must keep this in mind when setting up their storage environments.

Elimination of Inactive Data from Backups

This solution reduces the backup time and in turn management cost by eliminating the repetitive backup of static data. A copy of the static data can be created and removed from a jukebox for off-site storage for disaster recovery purposes. Only new data gets backed up. The elimination of migrated data can reduce backup time by as much as 90%.
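A rough estimate shows how the backup saving relates to the share of data that is migrated. The sketch below assumes that backup time is proportional to the volume of data backed up, which is a simplification made here and not a claim from this paper. The 10-hour full backup is a hypothetical figure.

```python
def backup_hours_after_migration(full_backup_hours, migrated_fraction):
    """Estimated backup duration once migrated (static) data is excluded.

    Assumes backup time is proportional to the volume of data backed up,
    a simplification made for this sketch.
    """
    if not 0.0 <= migrated_fraction <= 1.0:
        raise ValueError("migrated_fraction must be between 0 and 1")
    return full_backup_hours * (1.0 - migrated_fraction)


# Hypothetical 10-hour full backup.
for fraction in (0.80, 0.90):
    hours = backup_hours_after_migration(10, fraction)
    print(f"{fraction:.0%} of data migrated: about {hours:.1f} hours")
```

Under this linear assumption, migrating the 80% inactive share cited for a typical file server would cut backup time by about 80%, and a 90% reduction would correspond to roughly 90% of the data being static.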

Incorporating Infinite Storage Capacity

Storage capacity can be increased and organized by incorporating additional removable media. It appears to users and applications that they have access to near-infinite storage capacity. This is because when a piece of media is full you have the option of storing it offsite and putting a “fresh” piece of media back into the storage device.

A Cost Effective Solution

This solution offers increased media utilization and better data manageability. It focuses on better storage resource utilization for active data and a decreased need to add online storage. Centralized management functionality and elimination of duplication of resources also result in administrative cost savings.

Nearline Storage Applications

Typical applications where a nearline storage approach is extensively used are: Government data archiving, and storage of medical records, financial records and document imaging files.

In many cases it is a legal requirement that data be preserved on storage media that has long-term data integrity (e.g. Magneto-optical, DVD, CD). For example, imagine a busy medical facility and its need to accurately track and store the medical history of its patients. By law, medical records have to be stored for 7 years on approved storage media. Not only is it important to be able to access this information from the patient’s perspective, it can also have tremendous consequences in the case of malpractice lawsuits.

Three key strategies for adopting a nearline storage approach are: Archival Strategy, Data Migration Strategy and Data Preservation Strategy.

Archival Strategy

With an archival strategy there are two separate solutions - one for managing the archiving and retrieval of information and another for managing the secondary storage sub-system. Both solutions must work together seamlessly. By utilizing archival software, the primary archiving concern of needing to know where to look for archived data is no longer an issue. Typically, the archival software populates a database with information about the archived data on a secondary storage device. The management software manages the secondary media in the library and fulfils requests from the archival software for archiving or retrieving the data from specific media.
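The catalog described above, a database recording which piece of media holds each archived file, can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module. The table layout, function names and media label are hypothetical and are not taken from any archival product.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) a catalog of archived files and their media."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS archived_files ("
        " file_path TEXT PRIMARY KEY,"
        " media_label TEXT NOT NULL,"
        " archived_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return db


def record_archive(db, file_path, media_label):
    """Record that file_path now lives on the media identified by media_label."""
    db.execute(
        "INSERT OR REPLACE INTO archived_files (file_path, media_label) VALUES (?, ?)",
        (file_path, media_label),
    )
    db.commit()


def find_media(db, file_path):
    """Return the label of the media holding file_path, or None if not archived."""
    row = db.execute(
        "SELECT media_label FROM archived_files WHERE file_path = ?", (file_path,)
    ).fetchone()
    return row[0] if row else None


catalog = open_catalog()
record_archive(catalog, "/records/2001/scan-0001.tif", "MO-0042")  # hypothetical names
print(find_media(catalog, "/records/2001/scan-0001.tif"))
```

In a real deployment the lookup result would be handed to the media management layer, which loads the named media and retrieves the file.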

Data Migration Strategy

A data migration strategy is important when considering what data should be stored on what storage device. Generally speaking data is migrated from primary to secondary storage devices if it is not considered to be “mission-critical” or it is infrequently accessed. A sound data migration strategy enables organizations to conserve resources on more expensive, faster devices as well as to store files with similar attributes on the same device.

Data migration software running on a server determines what data to migrate to secondary storage based on policies set by an administrator. Policies are set to distribute various types of data, based on accessibility and usefulness, to appropriate storage devices.

The automatic data migration functionality allows active data to remain on online storage while keeping archived data on secondary storage. Secondary storage is accessible to users and applications without human intervention, so the data always appears to be online. This strategy requires a secondary storage system that provides access to data within a reasonable timeframe.

For example, imagine a busy medical facility that needs to store its medical files in a convenient and accessible electronic form. Assume that the facility’s IT administrator has implemented a data migration strategy. The facility’s frequently accessed files might be stored in a RAID tower, while files that are less frequently accessed would be stored in an optical jukebox. A policy could be set such that any medical file that had not been accessed for at least 10 days would automatically be migrated from the RAID tower to the optical jukebox.
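The 10-day policy in this example can be sketched as a scan that selects migration candidates by last access time. The code below is a minimal illustration, not a description of any product's policy engine. It only lists candidate files and leaves the actual move to the jukebox out of scope. It also assumes the file system records access times reliably, which is not the case on volumes where access-time updates are disabled or relaxed.

```python
import os
import time

SECONDS_PER_DAY = 86_400


def files_to_migrate(root, max_idle_days=10, now=None):
    """Yield paths under root whose last access is at least max_idle_days old."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime <= cutoff:
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it
```

A scheduler would call `files_to_migrate` on the primary storage path (a hypothetical mount point such as `/raid/medical`) and pass each result to whatever mechanism performs the migration.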

Figure 7 – File Lifecycle (labels shown: data creation, online storage, cache, nearline storage, offline storage, file disposal)

A data migration strategy allows efficient utilization of existing online storage capacity and dynamic changes in storage capacity without access disruption.

Data Preservation Strategy

One advantage of a nearline storage solution is that a “long-term data preservation” strategy can also be implemented without the need for a separate hardware solution.

Typically, this strategy is used for preserving information that needs to be stored for very long time periods (e.g. 7 to 15 years), such as government records, scientific data, information retained for corporate and business requirements, and regulatory compliance information. Such data can be stored on Write-Once-Read-Many (WORM) media such as WORM MO, CD-R and DVD-R for long-term preservation.

Solutions that are capable of writing to both WORM and rewritable media can use the same library hardware. No additional hardware purchases are required for the two media types.

By implementing a data preservation strategy, data can remain online in the library if frequent access is required. The data can also be exported for offline/offsite storage if it is rarely accessed. This approach allows more secondary storage media to be inserted into the library.

Types of Archival Media | Capacity | Rewritable (RW) / Write-once (WO) | Double-sided | Comments
CD-R | 650-700MB | WO | No | Good for personal data storage, archiving and distribution
CD-RW | 650-700MB | RW; up to 1,000 times | No | Ideal for personal data storage, multimedia presentations, and digital image storage
DVD-RAM | Up to 9.4GB | RW; up to 100,000 times | Yes, Type I and IV only | Fast data access; slow data transfer rate
DVD-RW | 4.7GB | RW; up to 1,000 times | No | Sequential read/write media; designed for video recording and streaming
DVD-R | 4.7GB | WO | No | Compatible with CLV (constant linear velocity – video) and CAV (constant angular velocity – computer data)
5.25” MO Rewritable | 9.1GB | RW; archival life 40 years or more | Yes | Faster than DVD-RAM with comparable capacity; ideal for applications demanding large capacity archiving
5.25” MO WORM | 9.1GB | WO; archival life 40 years or more | Yes | Ideal for audit trails and jukebox use; good solution for archiving financial, legal and medical files where audit trail or development of file is important
12” Optical WORM | 30GB | WO | Yes | Drives can read both sides of disk simultaneously; fastest data access time; high cost

MO – Magneto-Optical; RW – Rewritable; WO – Write Once; WORM – Write Once Read Many

Table 2 - Types of Archival Media
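The capacities in Table 2 translate directly into a media count for a given archive. The sketch below is a rough sizing aid under stated assumptions: it takes the upper capacity figure where the table gives a range, treats 700MB as 0.7GB, and ignores file system overhead. The 500GB archive size is hypothetical.

```python
import math

# Per-media capacities in GB, taken from Table 2 (upper figure where a range is given).
MEDIA_CAPACITY_GB = {
    "CD-R": 0.7,
    "DVD-R": 4.7,
    "DVD-RAM": 9.4,
    "5.25in MO": 9.1,
    "12in Optical WORM": 30.0,
}


def media_needed(archive_gb, media_type):
    """Number of pieces of media required to hold archive_gb of data."""
    return math.ceil(archive_gb / MEDIA_CAPACITY_GB[media_type])


# Hypothetical 500 GB archive.
for media_type in MEDIA_CAPACITY_GB:
    print(f"{media_type}: {media_needed(500, media_type)} pieces of media")
```

A count like this helps judge whether an archive fits within a library's slot capacity or whether some media will need to be exported for offline storage.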

Nearline Storage Management Solutions from KOM NETWORKS

KOM NETWORKS Enterprise Storage Solutions provide comprehensive storage management with unprecedented flexibility, reliability, and availability to your applications.

Figure 8 - KOM Nearline Management Products

Whether you are a financial or insurance organization, or a government or medical institution, with KOM NETWORKS solutions you will maximize your return on investment (ROI), minimize your corporate risks and ensure your business continuity.

All KOM NETWORKS nearline storage solutions offer:

Easy Device Integration and Installation

KOM storage solutions allow for seamless integration of removable storage devices into the host operating system. The installation procedure is quick and easy allowing administrators to begin utilizing the solution within a very short period of time.

Complete Device Control

KOM storage solutions give you complete control over your nearline storage devices, permitting you to set up all parameters to suit your specific situation, including online import and export.

Effective Hardware Management

Nearline storage devices can remain in production: defective components can be taken offline until suitable downtime is scheduled for repairs.

Transparent Disk Management

KOM storage solutions use the administrative tools of the host operating system to provide transparent disk management functions: creating and formatting volumes, and assigning drive letters and share names.

Media Portability

Removable media is treated as a standard magnetic disk, with no need for an external database to access data on the media.

Native File System Support

OPTISTORM™ and OPTISERVER® based solutions do not use a proprietary file system, enabling you to format your removable storage media using the native file system. This extends most native file system features applicable to fixed magnetic disks including long filename support and file-level security to removable optical disks.

OPTISTORM™ for Windows NT/2000

OPTISTORM™ is a proven nearline optical storage software solution for Windows environments. It readily integrates into your current storage environment and ensures that your data is always accessible and secure.

OPTISTORM™ offers:

Drive Name Technology

OPTISTORM™ utilizes KOM’s patented drive name technology. This feature allows a virtually limitless number of storage volumes to be online and accessible, without the restriction of the standard 26 drive letters.

Single and Multiple Surface Storage Volume

OPTISTORM™ gives you the flexibility to choose between creating one large volume spanning several removable optical platters or creating storage volumes comprised of only one surface of removable media.

Flexible Caching

OPTISTORM™ provides sophisticated yet flexible caching functions to optimize read and write performance on nearline storage devices. This allows efficient use of nearline storage devices and timely access to data.

Offline Media Management

This feature enables administrators to keep track of media that have been exported from a nearline storage device and allows for offline media notification and management.

OPTISTORM™ delivers:

High return on investment (ROI) through the flexibility to grow with your organization’s needs.

Support for Windows NT/2000 platforms.

Support for Magneto-Optical (MO) Rewritable and Write-Once-Read-Many (WORM), 12” Optical WORM, CD-ROM, DVD-ROM and DVD-RAM.

Support for native NTFS and FAT file system formats, allowing enforcement of operating system (OS) security and media portability.

Remote administration enabling optical storage management tasks from one central location on the corporate network.

Multiple access methods to data on the removable optical media in the jukeboxes – Drive letters, UNC Pathname, Mount Points (similar to UNIX environment) and Drive Name, a KOM NETWORKS patented technology for accessing data on removable media.

Improved read/write performance with Hierarchical Cache Management (HCM).

KOM NETWORKS patented user access control technology to provide enhanced data protection against user errors, viruses, worms and hackers.

Ability to permanently freeze WORM media, meeting the rigorous requirement of preventing new data from being written “after the fact”, such as creating a ‘fictitious’ past banking transaction after discovering that no transaction took place, or creating winning lottery numbers after the numbers are drawn.

Time travel feature on WORM media for audit trails and for tracking the development lifecycle of file content.

OPTISERVER® for UNIX and OpenVMS

OPTISERVER® for UNIX and OpenVMS are nearline optical storage software solutions for HP-UX, Sun Solaris (SPARC/Intel) and Tru64 UNIX environments, and for Alpha and VAX OpenVMS environments.

OPTISERVER® offers:

Seamless Integration

OPTISERVER® integrates seamlessly into UNIX and OpenVMS platforms. It allows data to be written directly to optical disks, without the need to first write files to magnetic disks.

True Device Driver

OPTISERVER® functions as a true device driver and integrates nearline storage devices into your host system. The users continue to apply standard system utilities to nearline data storage operations.

OPTISERVER® delivers:

High return on investment (ROI) through the flexibility to grow with your organization’s needs.

Support for HP-UX, Tru64 and Sun Solaris (SPARC/Intel) UNIX platforms.

Support for Magneto-Optical (MO) Rewritable and Write-Once-Read-Many (WORM), 12” Optical WORM, CD-ROM, DVD-ROM and DVD-RAM.

Support for native file system formats (UFS, Files-11, etc.), allowing enforcement of operating system (OS) security and media portability.

Ability to permanently freeze WORM media, meeting the rigorous requirement of preventing new data from being written “after the fact”, such as creating a ‘fictitious’ past banking transaction after discovering that no transaction took place, or creating winning lottery numbers after the numbers are drawn.

Time travel feature on WORM media for audit trails and for tracking the development lifecycle of file content.

OPTIWORX™ for Windows 2000

OPTIWORX™ brings a breakthrough storage pooling technology to nearline optical storage devices. With OPTIWORX™ you get a phenomenal storage return on investment through the flexibility to expand or shrink your storage environment, and the ability to implement a long-term data storage strategy. Operational efficiency is improved through automated file migration policies that seamlessly move files between storage devices. OPTIWORX™ operates in Windows 2000 environments.

OPTIWORX™ delivers:

Maximum return on investment (ROI) with efficient utilization of your existing removable storage devices.

Support for Windows 2000 platform.

Effective storage pooling technology to integrate different types of removable optical media and devices into one homogeneous storage repository.

Dynamic expansion and shrinkage of storage volume capacity with no user access interruption.

Retirement of old removable storage devices and addition of new removable storage devices without system downtime.

Remote administration enabling optical storage management tasks from one central location on the corporate network.

KOM NETWORKS Inc.
http://www.komnetworks.com
Phone: (613) 599-7205 Fax: (613) 599-7206
Sales: 888-556-6462 Support: 800-668-1777
Email: [email protected]

The KOM NETWORKS™ logo, OPTISTORM™, OPTISERVER®, OPTIWORX™, SHIELDWORX™ and KOMWORX™, are trademarks of KOM NETWORKS Inc. KOM NETWORKS Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
