
• Decentralized Services
  ◆ We want to look at decentralization as a path toward survivability.
  ◆ The case study is Survivable Storage Systems.
    – What is survivable storage?
    – Have we seen "flavors" of this concept before?
      » RAID technology can be considered survivable.
      » However, malicious faults were not considered.
    – We want to look at the PASIS project.
      » The basis for the discussion of survivable storage is the PASIS papers.
        ■ http://www.pdl.cmu.edu/Pasis/

CS448/548 Sequence 16

• Decentralization
  – Before discussing survivable storage, we briefly review the concept of RAID and how it plays into "thinking survivable."
    » The basis for the RAID discussion is the 1988 paper by Patterson:
      ■ Patterson, D.A., et al., "A Case for Redundant Arrays of Inexpensive Disks (RAID)", ACM SIGMOD Record, International Conference on Management of Data, Vol. 17, No. 3, pp. 109-116, June 1988.
    » The following material is probably too detailed for our discussion. We will only outline the basic concepts of RAID, as they help build intuition for the performance issues associated with survivable storage.
    » Note that the Patterson paper is very dated; nevertheless, many of the issues it raises are still valid.

• RAID
  ◆ RAID: Redundant Arrays of Inexpensive Disks
  ◆ Motivation
    – Single-chip computers improved in performance by about 40% per year.
    – RAM capacity quadrupled every 2-3 years.
    – Disks (magnetic technology):
      » capacity doubled every 3 years
      » price was cut in half every 3 years
      » raw seek time improved 7% per year
    – Note: the values presented in Patterson's paper are dated.
    – Note: the paper discusses "pure" RAID, not smarter implementations, e.g. with caching.
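As a back-of-the-envelope illustration of why the trend figures above matter, the following sketch projects them forward (the 9-year horizon and the relative units are illustrative, not from the paper):

```python
# Rough projection of the 1988-era disk trends quoted above:
# capacity doubles every 3 years, price halves every 3 years,
# but raw seek time improves only ~7% per year.
def project(years):
    capacity = 2 ** (years / 3)   # relative capacity growth
    price = 0.5 ** (years / 3)    # relative price per drive
    seek = 0.93 ** years          # relative seek time (7%/yr improvement)
    return capacity, price, seek

cap, price, seek = project(9)
print(f"after 9 years: capacity x{cap:.0f}, price x{price:.3f}, seek time x{seek:.2f}")
# Capacity grows 8x while seek time shrinks only to about half:
# mechanical latency, not capacity, becomes the bottleneck.
```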
• RAID
  – Amdahl's Law: effective speedup
    » f = fraction of work in fast mode
    » k = speedup while in fast mode
    » effective speedup S = 1 / ((1 - f) + f/k)
    » Example:
      ■ assume 10% I/O operations (so f = 0.9)
      ■ if the CPU is 10x faster => effective speedup is about 5
      ■ if the CPU is 100x faster => effective speedup is about 10
        (90% of the potential speedup is wasted)

• RAID
  ◆ Motivation
    – Compare the "mainframe mentality" with "today's" possibilities, e.g. cost and configuration.
    – [Figure: mainframe with CPU, memory, and channel vs. small computer with CPU, memory, DMA, controller, and many inexpensive SCSI disks]

• RAID
  – Reliability: bad news!
    – e.g. MTTF_disk = 30,000 h
      » MTTF of a 100-disk array = 300 h (< 2 weeks)
      » MTTF of a 1000-disk array = 30 h
  – Note that these numbers are very dated. Today's drives are much better: MTBF > 300,000 to 800,000 hours.
    – Is that really true, though?
  – Even if we assume a higher MTTF for individual disks, the problem remains.

• RAID
  ◆ RAID Reliability
    – Partition disks into reliability groups with check disks:
      » D = total number of data disks
      » G = # data disks in a group
      » C = # check disks in a group

• RAID
  ◆ Target Systems
    – Different RAID solutions benefit different target system configurations.
    – Supercomputers:
      » large blocks of data, i.e. high data rate
    – Transaction processing:
      » small blocks of data
      » high I/O rate
      » read-modify-write sequences

• RAID
  ◆ 5 RAID levels
    – RAID 1: mirrored disks
    – RAID 2: Hamming code for ECC
    – RAID 3: single check disk per group
    – RAID 4: independent reads/writes
    – RAID 5: no single check disk

• RAID
  ◆ Other RAIDs (derived after the paper)
    – RAID 0
      » employs striping with no redundancy at all
      » its claim to fame is speed alone
      » has the best write performance, but not the best read performance
        ■ why? (other RAIDs can schedule a read on the replica with the shortest expected seek and rotational delay)
    – RAID 6 (P + Q redundancy)
      » uses a Reed-Solomon code to protect against up to 2 disk failures using the bare minimum of 2 redundant disks

  – [Figure slides: RAID 0 (non-redundant), RAID 1 (mirrored), RAID 2 (redundancy through Hamming code), RAID 3 (bit-interleaved parity), RAID 4 (block-level parity), RAID 5 (block-level distributed parity), RAID 6 (dual redundancy)]

• RAID 10
  – RAID 10 is sometimes also called RAID 1+0.
  – source: http://www.illinoisdataservices.com/raid-10-data-recovery.html
  © 2013 A.W. Krings

• RAID 0+1
  – source: http://www.illinoisdataservices.com/raid-10-data-recovery.html

• RAID
  ◆ RAID level 1: Mirrored Disks
    – Most expensive option.
    – Tandem doubles the controllers too.
    – Writes go to both disks.
    – Reads come from one disk.
    – Characteristics:
      » S = slowdown. In synchronous disks, spindles are synchronized so that the corresponding sectors of a group of disks can be accessed simultaneously. For synchronous disks, S = 1.
      » Reads = 2D/S, i.e. concurrent reads are possible
      » Writes = D/S, i.e. no overhead for the concurrent write of the same data
      » Read-Modify-Write = 4D/(3S)
      » Pat88 Table II (pg. 112)

• RAID
  ◆ RAID level 2: Hamming Code
    – DRAM => problem with α-particles
      » solution: e.g. parity for SED, Hamming code for SEC
    – Recall the Hamming code (figure slides): m = data bits, k = parity bits; a single-error-correcting code requires m + k + 1 ≤ 2^k.
    – Same idea, using one disk drive per bit.
    – The smallest accessible unit per disk is one sector.
      » access G sectors, where G = # data disks in a group
    – If an operation on only a portion of a group is needed:
      1) read all data
      2) modify the desired portion
      3) write the full group, including check info
    – Allows soft errors to be corrected "on the fly".
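The speedup and reliability arithmetic from the earlier slides can be checked with a short sketch (the input numbers are the slides' own; the slide rounds the Amdahl results to 5 and 10):

```python
# Amdahl's Law: effective speedup when a fraction f of the work
# runs in fast mode with speedup k.
def amdahl(f, k):
    return 1.0 / ((1.0 - f) + f / k)

# Slide example: 10% I/O, so f = 0.9 of the work benefits from a faster CPU.
print(round(amdahl(0.9, 10), 1))    # ~5.3 (slide rounds to 5)
print(round(amdahl(0.9, 100), 1))   # ~9.2 (slide rounds to 10)

# Array reliability: assuming independent, exponentially distributed
# failures, the MTTF of an N-disk array is roughly MTTF_disk / N.
def array_mttf(mttf_disk, n):
    return mttf_disk / n

print(array_mttf(30_000, 100))      # 300.0 h, i.e. less than 2 weeks
print(array_mttf(30_000, 1000))     # 30.0 h
```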
  – Useful for supercomputers, not useful for transaction processing.
    » e.g. used in the Thinking Machines Connection Machine "Data Vault", with G = 32, C = 8
  – Characteristics:
    » Pat88 Table III (pg. 112)

• RAID
  ◆ RAID level 3: Single Check Disk per Group
    – Parity is SED, not SEC!
    – However, the controller can often detect which disk has failed.
      » the information of the failed disk can be reconstructed
      » extra redundancy on disk, i.e. extra info on sectors etc.
    – If the check disk fails:
      » read the data disks to restore the replacement
    – If a data disk fails:
      » compute parity over the remaining disks and compare with the check disk
      » if the parity bits are equal => data bit = 0
      » otherwise => data bit = 1

• RAID
  – Less overhead, i.e. only one check disk => effective performance increases.
  – The reduction in disks over level 2 decreases maintenance.
  – Performance is the same as level 2; however, effective performance per disk increases due to the smaller number of check disks.
  – Better for supercomputers, not good for transaction processing.
  – Maxtor and Micropolis introduced the first RAID-3 products in 1988.
  – Characteristics:
    » Pat88 Table IV (pg. 113)

• RAID
  ◆ RAID level 4: Independent Reads/Writes
    – Pat88 fig. 3 (pg. 113) compares data locations.
    – Disk interleaving has advantages and disadvantages.
    – Advantage of previous levels:
      » large transfer bandwidth
    – Disadvantages of previous levels:
      » all disks in a group are accessed on each operation (read or write)
      » spindle synchronization
        ■ if none => probably close to worst-case average seek and access times (tracking + rotation)
    – Interleave data on disks at the sector level.
    – Uses one parity disk.

• RAID
  – For small accesses:
    » need to access only 2 disks, i.e. 1 data disk and the parity disk
    » the new parity can be computed from the old parity + old/new data
    » compute: P_new = data_old XOR data_new XOR P_old
    » e.g. small write:
      1) read old data + parity (in parallel)
      2) write new data + parity
    » the bottleneck is the parity disk
    » e.g. small read: only read one drive (data)
  – Characteristics:
    » Pat88 Table V (pg. 114)

• RAID
  ◆ RAID level 5: No Single Check Disk
    – Distributes data and check info across all disks, i.e. there are no dedicated check disks.
    – Supports multiple individual writes per group.
    – Best of both worlds:
      » small read-modify-write
      » large transfer performance
      » 1 more disk per group => increases read performance
    – Characteristics:
      » Pat88 Table VI (pg. 114)

• RAID
  ◆ Patterson Paper
    – discusses all levels as a pure hardware problem
    – refers to software solutions and alternatives, e.g. disk buffering
    – with a transfer buffer the size of a track, spindle synchronization of groups is not necessary
    – improving MTTR by using spares
    – low power consumption allows the use of a UPS
    – relative performance shown in Pat88 fig. 5 (pg. 115)

• RAID
  ◆ Summary
    – Data striping for improved performance
      » distributes data transparently over multiple disks to make them appear as a single fast, large disk
      » improves aggregate I/O performance by allowing multiple I/Os to be serviced in parallel
        ■ independent requests can be serviced in parallel by separate disks
        ■ single multiple-block requests can be serviced by multiple disks acting in coordination
    – Redundancy for improved reliability
      » a large number of disks lowers the overall reliability of the disk array
      » thus redundancy is necessary to tolerate disk failures and allow continuous operation without data loss

3.5.4 Orthogonal RAID (excerpt)
To this point in the paper, we have ignored the issue of how to connect disks to the host computer. In fact, how one does this can drastically affect performance and reliability. Most computers connect multiple disks via some smaller number of strings. A string failure thus causes multiple, simultaneous disk failures. If not properly designed, these multiple failures can cause loss of data. For example, consider the 16-disk array in Figure 7 and two options of how to organize multiple error-correction groups. Option 1 combines each string of four disks into a single error-correction group.
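The parity arithmetic behind levels 3-5, i.e. reconstruction of a failed disk and the small-write update P_new = data_old XOR data_new XOR P_old, can be sketched as follows (the byte values are arbitrary illustrations):

```python
from functools import reduce

# The parity disk holds the XOR of the data disks (per byte position).
def parity(disks):
    return reduce(lambda a, b: a ^ b, disks)

data = [0x0F, 0x33, 0x55, 0xAA]      # 4 data "disks", one byte each
p = parity(data)

# Failed-disk reconstruction (RAID 3/4/5): XOR of the survivors and
# the parity recovers the lost value.
survivors = data[:2] + data[3:]      # suppose disk 2 has failed
reconstructed = parity(survivors + [p])
assert reconstructed == data[2]

# Small write (RAID 4/5): update parity from old parity + old/new data,
# touching only 2 disks instead of reading the whole group.
new_value = 0x77
p_new = data[2] ^ new_value ^ p      # P_new = data_old XOR data_new XOR P_old
data[2] = new_value
assert p_new == parity(data)         # matches a full recomputation
```

This is why the dedicated parity disk of RAID 4 becomes a bottleneck: every small write, wherever its data lands, must update it; RAID 5 spreads the parity blocks across all disks to remove that hot spot.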