Storage Area Network
V M Gunda


Contents

1 What is SAN
   1 Universal Storage
   2 Direct Attached Storage
   3 Network Attached Storage
   4 Storage Area Network
2 What Makes SAN Best
3 SAN Software
4 SAN Software Architecture
5 Applications for SAN
6 Storage Network Architecture
7 Storage System



1 What is SAN

A storage area network (SAN) is any high-performance network whose primary purpose is to enable storage devices to communicate with computer systems and with each other.

Note what this definition does not say:

- It doesn't say that a SAN's only purpose is communication between computers and storage. Many organizations operate perfectly viable SANs that carry occasional administrative and other application traffic.
- It doesn't say that a SAN uses Fibre Channel or any other specific interconnect technology. A growing number of network technologies have architectural and physical properties that make them suitable for use in SANs.
- It doesn't say what kind of storage devices are interconnected. Disk and tape drives, RAID subsystems, robotic libraries, and file servers are all being used productively in SAN environments today. One of the exciting aspects of SAN technology is that it is encouraging the development of new kinds of storage devices that provide new benefits to users. Some of these will undoubtedly fail in the market, but those that succeed will make lasting improvements in the way digital information is stored and processed.

The beauty of SANs is that they connect many storage devices to many servers and place in the administrator's hands the choice of which servers get to access which storage devices. Storage area network technology involves:

- Architecture
- Storage
- Networking
- Software

1 Universal Storage

The problem with the traditional client-server storage model is that if application B wants to access data stored by application A, the data must be copied from application A's storage to application B's storage. This is not only cumbersome but also duplicates the data, which is wasteful. This is where the storage area network (SAN) comes in.



2 Direct Attached Storage

Direct attached storage is the simplest and most commonly used storage model found in most standalone PCs, workstations and servers. A typical DAS configuration consists of a computer that is directly connected to one or several hard disk drives (HDDs) or disk arrays. Standard buses are used between the HDDs and the computers, such as SCSI, ATA, Serial-ATA (SATA), or Fibre Channel (FC). Some of the bus cabling definitions allow for multiple HDDs to be daisy chained together on each host bus adapter (HBA), host channel adapter, or integrated interface controller on the host computer.

[Figure 1: A typical direct attached storage configuration]



The software layers of a DAS system are illustrated in Figure 2. The directly attached storage disk system is managed by the client computer. Software applications access data via file I/O system calls into the Operating System. The file I/O system calls are handled by the File System, which manages the directory data structure and the mapping from files to disk blocks in an abstract logical disk space. The Volume Manager manages the resources located on one or more physical disks in the Disk System and maps accesses to the logical disk block space onto physical volume/cylinder/sector addresses. The disk system device driver ties the Operating System to the Disk Controller or Host Bus Adapter hardware, which is responsible for the transfer of commands and data between the client computer and the disk system.

3 Network Attached Storage

After seeing the consequences of binding storage to individual computers in the DAS model, the benefits of sharing storage resources over the network become obvious. NAS and SAN are two ways of sharing storage over the network. NAS generally refers to storage that is attached directly to a local area network (LAN) through network file system protocols such as NFS and CIFS. The difference between NAS and SAN is that NAS does "file-level I/O" while SAN does "block-level I/O" over the network. For practical reasons, the distinction between block-level access and file-level access is of little importance and can be easily dismissed as an implementation detail.
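To make the file-level versus block-level distinction concrete, here is a minimal sketch in Python on a Linux-style host. The path and device name (/mnt/nas/report.txt, /dev/sdb) are hypothetical, and reading a raw block device ordinarily requires administrator privileges.

    import os

    # File-level I/O: the file system (local or NFS-mounted) resolves the
    # name, the directory structure, and the file-to-block mapping for us.
    with open("/mnt/nas/report.txt", "rb") as f:      # hypothetical NFS mount
        data = f.read(4096)

    # Block-level I/O: the device is addressed directly by block number;
    # any file system structure is invisible at this layer.
    BLOCK_SIZE = 512
    fd = os.open("/dev/sdb", os.O_RDONLY)             # hypothetical SAN/DAS disk
    try:
        os.lseek(fd, 100 * BLOCK_SIZE, os.SEEK_SET)   # seek to block 100
        block = os.read(fd, BLOCK_SIZE)               # read one raw block
    finally:
        os.close(fd)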

Logically, a NAS storage system involves two types of devices: client computer systems and NAS devices. There can be multiple instances of each type in a NAS network. The NAS devices present storage resources onto the LAN that are shared by the client computer systems attached to the LAN. A client application accesses the virtual storage resource without knowing where the resource actually resides. In the client system, application file I/O requests are handled by the client Operating System in the form of system calls, identical to the system calls that would be generated in a DAS system. The difference is in how the system calls are processed by the Operating System. The system calls are intercepted by an I/O redirector layer that determines whether the accessed data is part of a remote file system or the locally attached file system. If the data is part of the DAS system, the system calls are handled by the local file system. If the data is part of a remote file system, the redirector passes the commands to the Network File System Protocol stack, which maps the file access system calls into command messages for accessing the remote file servers in the form of NFS or CIFS messages. These remote file access messages are then passed to the TCP/IP protocol stack, which ensures reliable transport of the messages across the network. The NIC driver ties the TCP/IP stack to the Ethernet Network Interface Card (NIC), which provides the physical interface and media access control function for the LAN.
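Conceptually, the redirector's decision is a simple dispatch on where the referenced data lives. The sketch below is illustrative only; the mount table and function names are invented, and a real redirector operates on file handles inside the operating system rather than on path strings.

    # Conceptual I/O redirector: route a file request either to the local
    # file system or to the network file system protocol stack (NFS/CIFS).
    REMOTE_MOUNTS = {"/mnt/nas": "nfs://filer1/export"}   # hypothetical mount table

    def redirect(path: str) -> str:
        for mount_point, server in REMOTE_MOUNTS.items():
            if path.startswith(mount_point):
                # Remote: map the system call into NFS/CIFS messages and
                # hand them to TCP/IP for reliable transport.
                return f"remote file system ({server})"
        # Local: the local file system and volume manager handle the call.
        return "local file system"

    print(redirect("/mnt/nas/report.txt"))    # -> remote file system (...)
    print(redirect("/home/user/notes.txt"))   # -> local file system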

In the NAS device, the Network Interface Card receives the Ethernet frames carrying the remote file access commands. The NIC driver presents the datagrams to the TCP/IP stack, which recovers the original NFS or CIFS messages sent by the client system. The NFS file access handler processes the remote file commands from the NFS/CIFS messages and maps the commands into file access system calls to the file system of the NAS device. The NAS file system, the volume manager, and the disk system device driver operate in the same way as in a DAS system, translating the file I/O commands into block I/O transfers between the Disk Controller/HBA and the Disk System, which is either part of the NAS device or attached to it externally. It is important to note that the Disk System can be one disk drive, a number of disk drives clustered together in a daisy chain or a loop, an external storage system rack, or even the storage resources presented by a SAN that is connected to the HBA of the NAS device. In all cases, the storage resources attached to the NAS device are accessed via the HBA or Disk Controller with block-level I/O.

There are several techniques for moving data between computers: file transfer and interprocess communication, to name a few. But the real issue is that the information services organization has to acquire and manage the extra resources required both to copy data from Computer A to Computer B and to store it at both sites. There's no business reason for this duplication of effort, other than that one computer needs data that was produced by another computer.

4 Storage Area Network

[Figure: Servers and storage devices attached to a common storage area network]

As the figure shows, all storage devices are connected to a single shared storage network, the storage area network, and all servers attach to that same SAN. With this architecture, any application can access any data that any other application has stored on the shared storage devices.



Benefits of Storage Area Network:

- Reducing the cost of providing today's information services, or enabling new services that contribute positively to overall enterprise goals.
- If all online storage is accessible by all computers, then no extra temporary storage is required to stage data that is produced by one computer and used by others. This can represent a substantial capital cost saving.
- Reducing total enterprise capital cost for information processing without diminishing the service delivered, for example by sharing relatively expensive devices such as tape drives among servers.
- Administrative and operational savings from not having to implement and manage procedures for copying data from place to place. This can greatly reduce the cost of people, the one component cost of providing information services that doesn't go down every year.
- SAN connectivity enables the grouping of computers into cooperative clusters that can recover quickly from equipment or application failures and allow data processing to continue 24 hours a day, every day of the year.



2 What Makes SAN Best

Rule 1: When designing a SAN to access critical enterprise data, make sure the SAN is highly available (i.e., can survive failures of both components in it and components attached to it) and make sure it can grow well beyond anticipated peak performance needs without disruption.

Rule 2: When evaluating SAN implementation options, once the basic capacity, availability, and performance requirements can be met, look for advanced functionality available in the chosen architecture and consider how it might be used to further reduce cost or enhance the information services delivered to users.

Rule 3: Hardware makes SANs possible; software makes SANs happen.



3 SAN Software

System applications: These applications build upon the basic SAN properties to provide a functionally enhanced execution environment for business applications. System applications include

- clustering,
- data replication, and
- direct data copy between devices and the utility functions that use it, and so forth.

Management applications: These applications manage the inherently more complex distributed system environment created by the presence of SANs. Applications that fall into this category include:

- zoning and device discovery,
- storage allocation, and
- RAID subsystem configuration.

SAN Software Capabilities

- Sharing tape drives: Tape drives are expensive devices. What is to keep a computer from (accidentally) writing to a tape while another computer is doing a backup? For two or three computers, an administrator can personally schedule tape drive usage so that this doesn't happen. Beyond that scale, software is required to coordinate two or more applications writing to the same drive, as in the sketch following this list.
- Sharing online storage devices: Sharing the online storage in an enterprise RAID subsystem is similar to sharing tape drives, except that more of it goes on and the requirements for configuration changes are more dynamic. A typical enterprise RAID subsystem makes the online storage capacity of one or more arrays of disks appear to be one or more very large, very fast, or very reliable disks. Related capabilities include:
  - application failover
  - data sharing
  - direct data movement between devices
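Returning to the tape-sharing problem: the sketch below grants one backup job at a time exclusive use of a shared drive. The class and method names are invented for illustration, and it coordinates only within a single process; a real SAN tape-sharing product must enforce reservations across hosts, for example with SCSI reserve/release.

    import threading

    class TapeDriveReservation:
        """Grant exclusive use of one shared tape drive to one job at a time."""

        def __init__(self, drive_name: str):
            self.drive_name = drive_name
            self._lock = threading.Lock()

        def acquire(self, host: str, timeout: float = 60.0) -> bool:
            # Block until the drive is free, or give up after `timeout` seconds.
            granted = self._lock.acquire(timeout=timeout)
            if granted:
                print(f"{host} now owns {self.drive_name}")
            return granted

        def release(self, host: str) -> None:
            print(f"{host} released {self.drive_name}")
            self._lock.release()

    drive = TapeDriveReservation("tape0")
    if drive.acquire("backup-server-a"):
        try:
            pass  # ... write the backup to the tape ...
        finally:
            drive.release("backup-server-a")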



4 SAN Software Architecture

The software architecture of a SAN is essentially the same as the software architecture of a DAS system. The key difference is that the disk controller driver is replaced by either the Fibre Channel protocol stack or the iSCSI/TCP/IP stack, which provides the transport function for block I/O commands to the remote disk system across the SAN. Using Fibre Channel as an example, the block I/O SCSI commands are mapped into Fibre Channel frames at the FC-4 layer (FCP). The FC-2 and FC-1 layers provide the signaling and physical transport of the frames via the HBA driver and the HBA hardware.

Because the abstraction of storage resources is provided at the block level, applications that access data at the block level can work in a SAN environment just as they would in a DAS environment. This property is a key benefit of the SAN model over NAS, as some high-performance applications, such as database management systems, are designed to access data at the block level to improve their performance. Some database management systems even use proprietary file systems that are optimized for database applications. For such environments, it is difficult to use NAS as the storage solution because NAS provides abstraction only at the file-system level, for standard file systems with which the database management system may not be compatible. Such applications, however, have no difficulty migrating to a SAN model, where the proprietary file systems can live on top of the block-level I/O supported by the SAN. In the SAN storage model, the operating system views storage resources as SCSI devices; therefore, the SAN infrastructure can directly replace direct attached storage without significant change to the operating system.
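For example, a block read crosses the SAN as a 10-byte SCSI command descriptor block (CDB) that FCP or iSCSI encapsulates. The sketch below packs a standard READ(10) CDB; it only constructs the command bytes and does not transmit them.

    import struct

    def read10_cdb(lba: int, num_blocks: int) -> bytes:
        """Build a 10-byte SCSI READ(10) command descriptor block."""
        return struct.pack(
            ">BBIBHB",
            0x28,         # operation code: READ(10)
            0,            # flags (RDPROTECT/DPO/FUA cleared)
            lba,          # 32-bit logical block address, big-endian
            0,            # group number
            num_blocks,   # 16-bit transfer length, in blocks
            0,            # control byte
        )

    cdb = read10_cdb(lba=2048, num_blocks=8)   # read 8 blocks starting at LBA 2048
    assert len(cdb) == 10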

5 Applications for SAN

The new applications enabled by SAN technology are actually new ways of organizing, accessing, and managing data. It is not that SAN technology introduces new accounting, transaction processing, Web server, electronic business, or other techniques.



- Application 1: Backup: Electronic business has essentially eliminated computer system idle time and sent developers on a search for less obtrusive solutions to the backup problem. With today's technology, applications and database management systems must usually be made quiescent for an instant in order to get a backup started; this is an area for future development. Backups protect against hardware failures, software malfunctions, and user errors. Volume managers and RAID subsystems are tending to assume more of the role of protecting against hardware failures, but the role of backup in creating point-in-time images of enterprise data remains. SAN-based backup takes several forms:
  - LAN-free backup: Bulk backup data travels over the SAN as SCSI block traffic rather than over the LAN, with the result that backup no longer affects the performance of uninvolved clients and servers.
  - Tape drive sharing: Tape drives are expensive devices. A drive connected directly to the SAN can be accessed by any backup application that needs it.
  - The designated backup server: A designated server schedules and runs backups on behalf of the others.
  - Server-less backup: Instead of copying the data from disk to the backup server and onward, the server issues a command telling the RAID subsystem which bulk data to copy, and the devices move the data directly between themselves (see the conceptual sketch after this list).
  - Off-host NAS backup.
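The server-less backup idea reduces to a single request naming source and destination extents, which the devices then execute themselves (real implementations use the SCSI EXTENDED COPY command for this third-party copy). The dataclass below is a conceptual stand-in with invented field names, not a real command format.

    from dataclasses import dataclass

    @dataclass
    class CopyRequest:
        """A third-party-copy request: 'move these blocks between devices'."""
        src_lun: int      # device holding the data (e.g., a disk array LUN)
        src_lba: int      # first source block
        dst_lun: int      # target device (e.g., a tape or another disk LUN)
        dst_lba: int      # first destination block
        num_blocks: int   # amount of bulk data to move

    # The backup server only issues the request; the bulk data then moves
    # device-to-device over the SAN, never through the server or the LAN.
    request = CopyRequest(src_lun=3, src_lba=0, dst_lun=7, dst_lba=0,
                          num_blocks=1_000_000)
    print(request)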

- Application 2: Highly Available Data: High availability of data is a must for continuous computing. SAN technology can be used to protect against a wider variety of threats to data integrity. RAID subsystem-based protection against hardware failure can be augmented by server-based volume management to create snapshots of operational data for mining and impact-free backup purposes.
  - Mirroring 101: Mirroring is the technique by which data written by a server to one disk is automatically replicated to another, as in the sketch below.
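A minimal sketch of the mirroring idea, using two ordinary files as stand-ins for two disks (paths are illustrative): every logical write is applied to both mirror members, so either copy alone can satisfy reads if the other fails.

    import os

    MIRROR_MEMBERS = ["/tmp/disk0.img", "/tmp/disk1.img"]  # stand-ins for two disks
    BLOCK_SIZE = 512

    def mirrored_write(block_no: int, data: bytes) -> None:
        """Apply one logical write to every member of the mirror."""
        assert len(data) == BLOCK_SIZE
        for member in MIRROR_MEMBERS:
            fd = os.open(member, os.O_WRONLY | os.O_CREAT)
            try:
                os.lseek(fd, block_no * BLOCK_SIZE, os.SEEK_SET)
                os.write(fd, data)
                os.fsync(fd)  # a write completes only when both copies are durable
            finally:
                os.close(fd)

    mirrored_write(0, b"\x00" * BLOCK_SIZE)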

- Application 3: Disaster Recoverability: Data is kept at two different physical locations, so that if one site is lost to a disaster, processing can resume using the copy at the surviving site.

- Application 4: Clusters (Continuous Computing): Cluster technology organizes the resources required to run an application and manages failover from one server to another. Failover may be automatic, as when a server fails and its applications are restarted on an alternate server, or manual, as when an administrator forces an application to migrate from one server to another for maintenance purposes.
- Application 5: Data Replication: Data replication technology can be used to enable recovery from site disasters, for publication or consolidation of data in a distributed enterprise, or for moving data from one server to another. Variations of replication technology are used to replicate volumes, files and directories, and databases.

Replication technology differs from mirroring in that it is designed to work over unreliable connections between primary and secondary sites, and for situations in which the time required to propagate updates to secondary sites cannot be part of application response time.
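A minimal sketch of that behavior, with illustrative names: the primary acknowledges a write as soon as it is queued locally, and a background thread propagates queued updates to the secondary whenever the link allows. A production replicator would also persist the log so queued updates survive a crash.

    import queue
    import threading
    import time

    update_log: queue.Queue = queue.Queue()   # pending updates for the secondary

    def write_primary(block_no: int, data: bytes) -> None:
        # Apply the write locally (omitted) and acknowledge immediately:
        # propagation to the secondary is not part of the response time.
        update_log.put((block_no, data))

    def send_to_secondary(block_no: int, data: bytes) -> None:
        print(f"replicated block {block_no} ({len(data)} bytes)")  # illustrative

    def replicator() -> None:
        # Drain the log toward the secondary site; if the link is down,
        # updates simply wait in the log until it comes back.
        while True:
            block_no, data = update_log.get()
            send_to_secondary(block_no, data)

    threading.Thread(target=replicator, daemon=True).start()
    write_primary(42, b"hello")
    time.sleep(0.1)   # give the background thread a moment to drain the log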

6 Storage Network Architecture

The Volume: Disks are often logically combined by software that uses mirroring, RAID, and striping techniques to improve their net storage capacity, reliability, or I/O performance characteristics. Software that combines disks in this way presents disk-like storage entities to its clients. These disk-like entities are commonly known as volumes, logical units (abbreviated LUNs, for the numbers by which logical units are identified in popular I/O protocols), or virtual disks.
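For example, the volume-level mapping for simple striping fits in a few lines. This sketch (parameter values are illustrative) translates a volume block address into a member disk and a physical block on that disk.

    STRIPE_BLOCKS = 128   # blocks per stripe unit (illustrative)
    NUM_DISKS = 4         # member disks in the striped volume

    def map_volume_block(vol_block: int) -> tuple[int, int]:
        """Map a volume (logical) block to (disk index, physical block)."""
        stripe_unit = vol_block // STRIPE_BLOCKS   # which stripe unit
        disk = stripe_unit % NUM_DISKS             # round-robin across disks
        phys = (stripe_unit // NUM_DISKS) * STRIPE_BLOCKS + vol_block % STRIPE_BLOCKS
        return disk, phys

    assert map_volume_block(0) == (0, 0)
    assert map_volume_block(128) == (1, 0)     # next stripe unit, next disk
    assert map_volume_block(512) == (0, 128)   # wraps back to disk 0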

BASIC SAN MODEL

The figure illustrates the basic SAN data access model for a simple system with a SAN. In this system, the file system runs in the application server. Volume functionality would typically be provided by external RAID controllers attached to the SAN, possibly augmented by a server-based volume manager.



7 Storage System

In order to understand storage systems, it is important to understand disk drives, the most common building block of storage systems, and the disk drive interface technologies used.

Storage devices include RAID disk arrays, Just a Bunch Of Disks (JBODs), tape systems, Network Attached Storage systems, and so on. The types of interfaces provided on these devices include SCSI, Fibre Channel, and Ethernet.



Disk Drive Interfaces:
- ATA
- SATA
- SCSI
- SAS
- Fibre Channel
- JBOD
- RAID

ATA: ATA is the primary internal storage interface for the PC, connecting the host system to peripherals such as hard disk drives, optical drives, and CD-ROM drives.

SATA: Serial ATA is the next-generation internal storage interconnect, designed to replace Ultra ATA. The SATA interface is an evolution of the ATA interface from a parallel bus to a serial bus architecture. The serial bus architecture overcomes the difficult electrical constraints hindering continued speed enhancement of the parallel ATA bus.

SCSI: The Small Computer System Interface (SCSI) defines a universal parallel system interface, called the SCSI bus, for connecting up to eight devices along a single cable. SCSI is an independent and intelligent local I/O bus through which a variety of different devices and one or more controllers can communicate and exchange information independently of the rest of the system.

SAS: To overcome the barriers of parallel bus cabling, Serial Attached SCSI (SAS) is being defined to replace the physical layer of SCSI with serial bus technology. SAS is a new near-cabinet and disk interface technology that leverages the best of the SCSI and Serial ATA interfaces to bring new capabilities, higher levels of availability, and more scalable performance to future generations of servers and storage systems. SAS uses a serial, point-to-point topology to overcome the performance barriers associated with storage systems based on parallel bus architectures.

Fibre Channel: Fibre Channel disk drives typically come with single or dual Fibre Channel Arbitrated Loop (FC-AL) interfaces. The dual FC-AL interface is useful in storage systems for providing redundant cabling, and some drives even allow concurrent access from the two interfaces to increase the bandwidth to the drive. Using the FC-AL protocol, a large number of disk drives can be connected together for large-capacity storage systems; all the devices on the loop share the bandwidth of the loop. To prevent a single point of failure in the physical loop topology, the disk drives in a storage system are typically connected to a central Port Bypass Controller (PBC) in a star topology.

RAID: RAID (Redundant Array of Independent Disks) is a technology that combines multiple small, independent disk drives into an array that looks like a single, big disk drive to the system. Simply putting n disk drives together (as in a JBOD) results in a system with a failure rate that is n times the failure rate of a single disk. That high failure rate makes the simple approach impractical for addressing the high-reliability and large-capacity needs of enterprise storage.
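RAID addresses this by adding redundancy. With parity RAID (RAID 4 and RAID 5), an extra block per stripe holds the XOR of the data blocks, so any single lost block can be rebuilt from the survivors; a minimal sketch:

    from functools import reduce

    def xor_blocks(blocks):
        """Byte-wise XOR of equal-sized blocks."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks in one stripe
    parity = xor_blocks(data)            # stored on a fourth drive

    # If any single drive fails, its block is the XOR of all the survivors,
    # so the n-fold failure rate of a plain JBOD no longer implies data loss.
    rebuilt = xor_blocks([data[0], data[2], parity])
    assert rebuilt == data[1]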
