RAID

John Williams & Kevin Forkl

Brief History of RAID

● Developed by a team from the University of California, Berkeley
● Led by David A. Patterson
● Project began in 1987
● Team had been working on RISC processors
  ○ Thought: "Processors are going to start getting fast, improving faster than they have in the past. So what are we going to do about I/O?"
● Saw smaller disks as a building block
● Wrote "A Case for Redundant Arrays of Inexpensive Disks"
  ○ Advocated replacing large disks with many smaller ones
● Originally RAID was geared towards performance
  ○ PC community saw it as dependability oriented

The Term RAID

● When first coined, the term RAID stood for Redundant Array of Inexpensive Disks
● The systems were so expensive that the term had to be changed before they could be marketed
● Was changed to also mean Redundant Array of Independent Disks

Striping vs Mirroring

Striping
● Process of writing data across all drives in the array
● Data is written in round-robin fashion
● Data is read back in the same way
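The round-robin placement described above can be sketched in a few lines. This is only an illustration: the function names and the one-block stripe unit are assumptions of the sketch, not part of any particular RAID implementation.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 1:
        raise ValueError("an array needs at least one disk")
    if logical_block < 0:
        raise ValueError("block numbers start at 0")
    # Round-robin: consecutive logical blocks land on consecutive disks.
    return logical_block % num_disks, logical_block // num_disks


def stripe(blocks: list, num_disks: int) -> list[list]:
    """Distribute blocks across the disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disk, _offset = locate_block(index, num_disks)
        disks[disk].append(block)
    return disks


def read_back(disks: list[list], total_blocks: int) -> list:
    """Read the blocks in logical order using the same mapping."""
    positions = (locate_block(i, len(disks)) for i in range(total_blocks))
    return [disks[disk][offset] for disk, offset in positions]
```

For example, `stripe(list("ABCDEFG"), 3)` places A, D, G on the first disk, B, E on the second, and C, F on the third, and `read_back` returns the blocks in their original order.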

Mirroring
● Stores an exact replica of all data on a separate disk or disks
● Using mirroring causes the system to take a performance hit when performing writes
  ○ The more mirrored disks, the worse the performance
● Read operations improve since they can be done in parallel

Software RAID vs Hardware RAID

Software RAID (Best for RAID 0, 1)
● More flexible
● Cheaper
● Uses CPU
● Unprotected at boot

Hardware RAID (Best for RAID 5, 6)
● System independent
  ○ Safe from viruses
  ○ Calculations are done on the RAID card
● Disk hot-plug
● Protection from power loss
● Movable between operating systems

Types of RAID to be covered

● RAID 0
● RAID 1
● RAID 2
● RAID 3
● RAID 4
● RAID 5
● RAID 6
● RAID 1+0
● RAID 0+1

RAID 0

Operating System Concepts (8th ed) by Silberschatz, Galvin and Gagne

● Known as a stripe set or striped volume
● Splits data evenly across two or more disks
● No parity information
● Not one of the original RAID levels
● No data redundancy
● Normally used to increase performance
● Can be created using disks of differing sizes
  ○ Each disk contributes only as much space as the smallest disk, though
  ○ If a 500 GB disk and a 750 GB disk were used:
    ■ Size = 2 * min(500 GB, 750 GB) = 2 * 500 GB = 1000 GB

RAID 0 (Performance)
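The capacity rule for a RAID 0 set built from disks of different sizes can be written as a one-line calculation. The function name is made up for this sketch.

```python
def raid0_capacity_gb(disk_sizes_gb: list[float]) -> float:
    """Usable RAID 0 capacity: every disk contributes as much as the smallest one."""
    if len(disk_sizes_gb) < 2:
        raise ValueError("RAID 0 needs at least two disks")
    return len(disk_sizes_gb) * min(disk_sizes_gb)
```

With the example above, `raid0_capacity_gb([500, 750])` gives 1000, so 250 GB of the larger disk goes unused.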

Picture Source: http://tweakers.net/reviews/515/1/raid-0-hype-or-blessing-pagina-1.html

RAID 1


● Known as mirroring or shadowing
● Uses twice as many disks as a non-redundant array
● When data is written to one disk, the same data must also be written to a redundant disk
● Data is read from the disk with the shorter queuing, seek, and rotational delays
● If one disk fails, the other is used
● Used frequently in database applications
  ○ Availability and transaction time are of higher importance
  ○ Storage efficiency is of less importance

RAID 2

● Stripes data at the bit level
● Uses Hamming code for error correction
● Disks are synchronized by the controller to spin at the same angular orientation
● Extremely high data transfer rates possible
● The read/write-level error correction it used later became a standard firmware feature on hard drives
  ○ RAID 2 then no longer had an advantage over other RAID levels
● No longer used

RAID 3


● Byte-level striping with a dedicated parity disk
● Requires all disks to operate in lockstep
  ○ All spindles are synchronized
  ○ Adds design considerations
  ○ No significant advantage over other RAID levels

RAID 4


● Block-level striping with a dedicated parity disk
● When data is written to an array disk, an algorithm generates recovery information
  ○ Recovery information is written to the parity drive
● If a single disk fails, the algorithm is reversed and the missing data is automatically regenerated from the remaining data and the parity information

RAID 5
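The parity idea shared by RAID 4 and RAID 5 can be sketched with XOR, the usual choice for single-parity arrays: the parity block is the XOR of the data blocks in a stripe, and a lost block is the XOR of everything that survives. The function names below are illustrative only.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    if any(len(block) != len(blocks[0]) for block in blocks):
        raise ValueError("all blocks in a stripe must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks: list[bytes]) -> bytes:
    """Recovery information for one stripe: the XOR of its data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Regenerate the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

Given the stripe `[b"RAID", b"disk", b"four"]`, losing the middle block and calling `rebuild_missing([b"RAID", b"four"], parity)` returns `b"disk"` again, because XOR-ing a value in twice cancels it out.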

● Block-level striping with parity data distributed across all disks
● Low cost of redundancy
● Advantage over RAID 4: parity is not written to a single drive on every write operation
● Allows all disks to be used when servicing read operations
  ○ Parity disk systems do not use the parity disk on reads
● Best small-read and large-write performance of any redundant disk array
● Small writes are less efficient than with mirroring
  ○ Needs to perform read-modify-write operations to update parity

RAID 5 (Performance)
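The read-modify-write cost of a small write can be illustrated with XOR parity. The sketch below assumes plain XOR parity and uses made-up function names; it shows that the new parity can be derived from the old data block, the new data block, and the old parity, without reading the rest of the stripe.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Read-modify-write update: new_parity = old_parity ^ old_data ^ new_data."""
    return xor_bytes(old_parity, xor_bytes(old_data, new_data))


def full_stripe_parity(data_blocks: list[bytes]) -> bytes:
    """Parity recomputed from scratch, used here only to check the shortcut."""
    parity = bytes(len(data_blocks[0]))
    for block in data_blocks:
        parity = xor_bytes(parity, block)
    return parity
```

Updating one block this way takes two reads (old data, old parity) and two writes (new data, new parity), whereas a mirrored pair needs only two writes, which is why small writes favor mirroring.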

● Random Read Performance - Excellent: Better with large stripe sizes. Parity information is unused during normal reads.
● Random Write Performance - Fair: Computing the parity slows down writes; better than RAID 3 and 4 due to the lack of a dedicated parity drive.
● Sequential Read Performance - Good: Better with small stripe sizes.
● Sequential Write Performance - Fair: Better than RAID 1+0.

RAID 6

● Block-level striping with two parity blocks distributed across all disks
  ○ Simply adds another parity block to RAID 5
● Uses Reed-Solomon codes to protect against up to two disk failures using a minimum of two redundant disks

RAID 6 (cont)
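A minimal sketch of how two redundant blocks can recover two lost data blocks, assuming a common P+Q formulation: P is the plain XOR of the data blocks and Q is a weighted sum over the finite field GF(2^8). The field polynomial 0x11D, the generator 2, and all function names are assumptions of this sketch; real controllers may use different codes and layouts. Only the case of two lost data blocks is shown.

```python
# Log/antilog tables for GF(2^8) with polynomial 0x11D and generator 2 (assumed).
EXP = [0] * 510
LOG = [0] * 256
_value = 1
for _power in range(255):
    EXP[_power] = _value
    LOG[_value] = _power
    _value <<= 1
    if _value & 0x100:
        _value ^= 0x11D
for _power in range(255, 510):
    EXP[_power] = EXP[_power - 255]


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_div(a: int, b: int) -> int:
    """Divide a by a non-zero field element b."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 255]


def compute_pq(data_blocks: list[bytes]) -> tuple[bytes, bytes]:
    """P = XOR of all blocks; Q = XOR of (2^index * block) over GF(2^8)."""
    length = len(data_blocks[0])
    p, q = bytearray(length), bytearray(length)
    for index, block in enumerate(data_blocks):
        coeff = EXP[index]
        for pos, byte in enumerate(block):
            p[pos] ^= byte
            q[pos] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(blocks: list, x: int, y: int, p: bytes, q: bytes) -> tuple[bytes, bytes]:
    """Rebuild the data blocks at indices x and y (given as None in blocks)."""
    if x == y:
        raise ValueError("x and y must be different disks")
    # Zero-filled placeholders contribute nothing to P or Q.
    known = [b if b is not None else bytes(len(p)) for b in blocks]
    p_known, q_known = compute_pq(known)
    gx, gy = EXP[x], EXP[y]
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for pos in range(len(p)):
        p_xy = p[pos] ^ p_known[pos]  # equals D_x ^ D_y
        q_xy = q[pos] ^ q_known[pos]  # equals g^x * D_x ^ g^y * D_y
        dx[pos] = gf_div(q_xy ^ gf_mul(gy, p_xy), gx ^ gy)
        dy[pos] = p_xy ^ dx[pos]
    return bytes(dx), bytes(dy)
```

The two parity blocks give two independent equations per byte position, so two unknown bytes can be solved for; a single XOR parity gives only one equation and therefore tolerates only one failure.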

● Random Read Performance: Excellent; Better for larger stripe sizes.

● Random Write Performance: Poor; Dual parity overhead and complexity.

● Sequential Read Performance: Good; Better for smaller stripe sizes.

● Sequential Write Performance: Fair; Slightly better than random write.

RAID 1+0 and RAID 0+1

● Uses both striping and mirroring
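Both levels combine the same two techniques but nest them differently: RAID 1+0 stripes across mirrored pairs, while RAID 0+1 mirrors two striped sets. The sketch below, with an assumed four-disk layout and made-up function names, counts which two-disk failures each arrangement survives.

```python
from itertools import combinations


def survives_raid10(failed: set[int], mirror_pairs: list[tuple[int, ...]]) -> bool:
    """Stripe of mirrors: data survives while every mirror keeps at least one disk."""
    return all(any(disk not in failed for disk in pair) for pair in mirror_pairs)


def survives_raid01(failed: set[int], stripe_sets: list[tuple[int, ...]]) -> bool:
    """Mirror of stripes: data survives while at least one stripe set is fully intact."""
    return any(all(disk not in failed for disk in group) for group in stripe_sets)


def survivable_double_failures(check, groups: list[tuple[int, ...]]) -> list[tuple[int, int]]:
    """List every pair of failed disks that the given layout survives."""
    disks = sorted(disk for group in groups for disk in group)
    return [pair for pair in combinations(disks, 2) if check(set(pair), groups)]


# Assumed layout: disks 0-3 grouped as (0, 1) and (2, 3) in both arrangements.
GROUPS = [(0, 1), (2, 3)]
```

With this layout, `survivable_double_failures(survives_raid10, GROUPS)` returns four of the six possible pairs, while the same call with `survives_raid01` returns two, which is one reason RAID 1+0 is often preferred.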

Common RAID Disk Data Format

● Specification defines a standard data structure describing how data is formatted across the disks in a RAID group
● Allows a basic level of interoperability between different suppliers of RAID technology
● Benefits storage users by enabling data-in-place migration among systems from different vendors
● SNIA (Storage Networking Industry Association) maintains the Common RAID Disk Data Format (DDF) Specification, currently v2.0

Questions?

Works Cited

● Common RAID Disk Data Format
  ○ http://www.snia.org/tech_activities/standards/curr_standards/ddf
● RAID level information
  ○ http://books.google.com/books?id=RM4tahggCVcC&pg=PA6&dq=raid+2+implementation&hl=en#v=onepage&q=raid%202%20implementation&f=false
  ○ http://www.ecs.umass.edu/ece/koren/architecture/Raid/basicRAID.html
  ○ http://www.chicago-data-recovery.com/raid-levels.php
  ○ http://tweakers.net/reviews/515/1/raid-0-hype-or-blessing-pagina-1.html
  ○ http://sqlblog.com/blogs/linchi_shea/archive/2007/02/07/is-raid-5-really-that-bad.aspx
  ○ http://www.pcguide.com/ref/hdd/perf/raid/levels/singleLevel5-c.html
  ○ http://www.pcguide.com/ref/hdd/perf/raid/levels/singleLevel6-c.html
● RAID background information
  ○ http://www.computerworld.com/s/article/87093/The_Story_So_Far