
RAID is an acronym first defined by David A. Patterson, Garth A. Gibson and Randy Katz at the University of California, Berkeley in 1987 to describe a Redundant Array of Inexpensive Disks,[1] a technology that allowed computer users to achieve high levels of storage reliability from low-cost and less reliable PC-class disk- drive components, via the technique of arranging the devices into arrays for redundancy. More recently, marketers representing industry RAID manufacturers reinvented the term to describe a Redundant Array of Independent Disks as a means of disassociating a "low cost" expectation from RAID technology.[2] "RAID" is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple hard disk drives. The different schemes/architectures are named by the word RAID followed by a number, as in RAID 0, RAID 1, etc. RAID's various designs all involve two key design goals: increased data reliability or increased input/output performance. When multiple physical disks are set up to use RAID technology, they are said to be in a RAID array. This array distributes data across multiple disks, but the array is seen by the computer user and operating system as one single disk. RAID can be set up to serve several different purposes. Purpose and basics Redundancy is achieved by either writing the same data to multiple drives (known as mirroring), or writing extra data (known as parity data) across the array, calculated such that the failure of one (or possibly more, depending on the type of RAID) disks in the array will not result in loss of data. A failed disk may be replaced by a new one, and the lost data reconstructed from the remaining data and the parity data. Organizing disks into a redundant array decreases the usable storage capacity. For instance, a 2-disk RAID 1 array loses half of the total capacity that would have otherwise been available using both disks independently, and a RAID 5 array with several disks loses the capacity of one disk. Other types of RAID arrays are arranged so that they are faster to write to and read from than a single disk. There are various combinations of these approaches giving different trade-offs of protection against data loss, capacity, and speed. RAID levels 0, 1, and 5 are the most commonly found, and cover most requirements. RAID 0 (striped disks) distributes data across several disks in a way that gives improved speed and no lost capacity, but all data on all disks will be lost if any one disk fails. Although such an array has no actual redundancy, it is customary to call it RAID 0. RAID 1 (mirrored settings/disks) duplicates data across every disk in the array, providing full redundancy. Two (or more) disks each store exactly the same data, at the same time, and at all times. Data is not lost as long as one disk survives. Total capacity of the array equals the capacity of the smallest disk in the array. At any given instant, the contents of each disk in the array are identical to that of every other disk in the array. RAID 5 (striped disks with parity) combines three or more disks in a way that protects data against loss of any one disk; the storage capacity of the array is reduced by one disk. RAID 6 (striped disks with dual parity) (less common) can recover from the loss of two disks. RAID 10 (or 1+0) uses both striping and mirroring. "01" or "0+1" is sometimes distinguished from "10" or "1+0": a striped set of mirrored subsets and a mirrored set of striped subsets are both valid, but distinct, configurations. RAID can involve significant computation when reading and writing information. With traditional "real" RAID hardware, a separate controller does this computation. In other cases the operating system or simpler and less expensive controllers require the host computer's processor to do the computing, which reduces the computer's performance on processor-intensive tasks (see "Software RAID" and "Fake RAID" below). Simpler RAID controllers may provide only levels 0 and 1, which require less processing. RAID systems with redundancy continue working without interruption when one (or possibly more, depending on the type of RAID) disks of the array fail, although they are then vulnerable to further failures. When the bad disk is replaced by a new one the array is rebuilt while the system continues to operate normally. Some systems have to be powered down when removing or adding a drive; others support hot swapping, allowing drives to be replaced without powering down. RAID with hot- swapping is often used in high availability systems, where it is important that the system remains running as much of the time as possible. RAID is not a good alternative to backing up data. Data may become damaged or destroyed without harm to the drive(s) on which they are stored. For example, part of the data may be overwritten by a system malfunction; a file may be damaged or deleted by user error or malice and not noticed for days or weeks; and, of course, the entire array is at risk of physical damage. [edit]Principles RAID combines two or more physical hard disks into a single logical unit by using either special hardware or software. Hardware solutions often are designed to present themselves to the attached system as a single hard drive, so that the operating system would be unaware of the technical workings. For example, you might configure a 1TB RAID 5 array using three 500GB hard drives in hardware RAID, the operating system would simply be presented with a "single" 1TB disk. Software solutions are typically implemented in the operating system and would present the RAID drive as a single drive to applications running upon the operating system. There are three key concepts in RAID: mirroring, the copying of data to more than one disk; striping, the splitting of data across more than one disk; and error correction, where redundant data is stored to allow problems to be detected and possibly fixed (known as fault tolerance). Different RAID levels use one or more of these techniques, depending on the system requirements. RAID's main aim can be either to improve reliability and availability of data, ensuring that important data is available more often than not (e.g. a database of customer orders), or merely to improve the access speed to files (e.g. for a system that delivers video on demand TV programs to many viewers). The configuration affects reliability and performance in different ways. The problem with using more disks is that it is more likely that one will fail, but by using error checking the total system can be made more reliable by being able to survive and repair the failure. Basic mirroring can speed up reading data as a system can read different data from both the disks, but it may be slow for writing if the configuration requires that both disks must confirm that the data is correctly written. Striping is often used for performance, where it allows sequences of data to be read from multiple disks at the same time. Error checking typically will slow the system down as data needs to be read from several places and compared. The design of RAID systems is therefore a compromise and understanding the requirements of a system is important. Modern disk arrays typically provide the facility to select the appropriate RAID configuration. [edit]Standard levels Main article: Standard RAID levels A number of standard schemes have evolved which are referred to as levels. There were five RAID levels originally conceived, but many more variations have evolved, notably several nested levels and many non-standard levels (mostly proprietary). Following is a brief summary of the most commonly used RAID levels.[3] Space efficiency is given as amount of storage space available in an array of n disks, in multiples of the capacity of a single drive. For example if an array holds n=5 drives of 250GB and efficiency is n-1 then available space is 4 times 250GB or roughly 1TB. Level Description Minimum # of disks Space Efficiency Image RAID 0 "Striped set without parity" or "Striping". Provides improved performance and additional storage but no redundancy or fault tolerance. Any disk failure destroys the array, which has greater consequences with more disks in the array (at a minimum, catastrophic data loss is twice as severe compared to single drives without RAID). A single disk failure destroys the entire array because when data is written to a RAID 0 drive, the data is broken into fragments. The number of fragments is dictated by the number of disks in the array. The fragments are written to their respective disks simultaneously on the same sector. This allows smaller sections of the entire chunk of data to be read off the drive in parallel, increasing bandwidth. RAID 0 does not implement error checking so any error is unrecoverable. More disks in the array means higher bandwidth, but greater risk of data loss. 2 n RAID 1 'Mirrored set without parity' or 'Mirroring'. Provides fault tolerance from disk errors and failure of all but one of the drives. Increased read performance occurs when using a multi-threaded operating system that supports split seeks, very small performance reduction when writing. Array continues to operate so long as at least one drive is functioning. Using RAID 1 with a separate controller for each disk is sometimes called duplexing. 2 n/2 RAID 2 Hamming code parity.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages4 Page
-
File Size-