Redundant Disk Arrays: Reliable, Parallel Secondary Storage Garth Alan Gibson
(NASA-CR-189962) REDUNDANT DISK ARRAYS: RELIABLE, PARALLEL SECONDARY STORAGE
Ph.D. Thesis (California Univ.) 267 p. CSCL 09B N92-19288 Unclas G3/60 0073849

Report No. UCB/CSD 91/613
December 1990
Computer Science Division (EECS)
University of California, Berkeley
Berkeley, California 94720

Redundant Disk Arrays: Reliable, Parallel Secondary Storage

By Garth Alan Gibson
B.Math. (Hons.) (University of Waterloo) 1983
M.S. (University of California) 1987

DISSERTATION

Submitted in partial satisfaction of the requirements for the degree of
DOCTOR OF PHILOSOPHY in COMPUTER SCIENCE
in the GRADUATE DIVISION of the
UNIVERSITY OF CALIFORNIA at BERKELEY

Abstract

During the past decade, advances in processor and memory technology have given rise to increases in computational performance that far outstrip increases in the performance of secondary storage technology. Coupled with emerging small-disk technology, disk arrays provide the cost, volume, and capacity of current disk subsystems but, by leveraging parallelism, many times their performance. Unfortunately, arrays of small disks may have much higher failure rates than the single large disks they replace. Redundant Arrays of Inexpensive Disks (RAID) use simple redundancy schemes to provide high data reliability. This dissertation investigates the data encoding, performance, and reliability of redundant disk arrays.

Organizing redundant data into a disk array is treated as a coding problem in this dissertation. Among alternatives examined, codes as simple as parity are shown to effectively correct single, self-identifying disk failures.

David A. Patterson

The performance advantages of striping data across multiple disks are reviewed in this dissertation.
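The single-erasure correction that the abstract attributes to parity can be illustrated in a few lines. This is a minimal sketch, not the dissertation's implementation: a parity disk holds the bytewise XOR of the N data disks, so when one disk fails (and, being self-identifying, announces which one), its contents are the XOR of the survivors and the parity.

```python
from functools import reduce

def parity_disk(data_disks):
    """Compute the redundant (N+1st) disk as the bytewise XOR of N data disks."""
    return bytes(reduce(lambda a, b: a ^ b, block) for block in zip(*data_disks))

def reconstruct(surviving_disks, parity):
    """Rebuild the one failed, self-identifying disk from the survivors plus parity."""
    return parity_disk(surviving_disks + [parity])

disks = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
p = parity_disk(disks)   # one parity disk protects all three data disks
lost = disks.pop(1)      # disk 1 fails; the controller knows which disk is gone
assert reconstruct(disks, p) == lost
```

Because XOR is its own inverse, reconstruction is the same operation as encoding, which is why a single parity disk suffices for any number of data disks so long as at most one fails at a time.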
For large transfers this parallelism reduces response time. Striping data also automatically distributes independent, small accesses across disks to increase throughput. This dissertation evaluates the performance lost to the maintenance of redundant data. This loss is negligible for large transfers but can be significant for small writes because of increases in aggregate disk service time.

Because disk arrays include redundancy to protect against the high failure rates caused by large numbers of disk components, it is crucial that disk failures be characterized. This dissertation provides evidence that disk lifetimes can be modeled as exponential random variables.

Building on an exponential model for disk lifetimes, this dissertation presents analytic models for disk-array lifetime, evaluates these against event-driven simulation, and applies them to an example redundant disk array. These models incorporate the effects of independent and dependent disk failures (shared support hardware) as well as the effects of on-line spare disks. For the example redundant disk array, these models show that a 10% overhead for an N+1-parity encoding plus a 10% overhead for on-line spares can provide higher reliability than the 100% overhead of conventional mirrored disks.

Redundant Disk Arrays: Reliable, Parallel Secondary Storage
Copyright © 1991 by Garth Alan Gibson

Dedicated to my grandparents

Ida Maud (Babcock) Stocks
You are my Mnemosyne, my goddess of memory. Your reminiscences have given me a family mythology: "candies down the pea belt," "Billy Bounce," and "United Empire Loyalists." So much of my identity is based on your retrospections.

Herbert Samuel Albert Stocks
Although I never knew you, I am forever influenced by your presence in the warm waters of my favourite place, the Madawaska.
Ruth Eliza (Warren) Gibson
You left me pearls in these words from your poem, Time:
"There is nothing so precious as time
It is neither yours nor mine
To use it well, we have to learn
For yesterday's time will never return."

Henry Walker Gibson
Always a smile, always a tease, always a kind and wise word, it was always so comfortable to be with you, even if you did always win at horseshoes.

Acknowledgements

The time I have spent at Berkeley has been richly rewarding primarily because of the efforts of David Patterson. He has given me guidance, enthusiasm, and wisdom as well as countless hours of time. The challenges that he laid out for me first attracted me to Berkeley, his patient counsel brought me through the darkest days of the development of SPUR, and his keen intuition provoked my research into I/O systems. I have benefited immeasurably from his tutelage.

All three members of my committee, David Patterson, John Ousterhout, and Ronald Wolff, have contributed generously to the research reported in this dissertation. Although Dave propelled me through this research, all might have been in vain without John's insights into systems and Ron's patient coaching on modeling. Of course, all three deserve a special note of thanks for wading through drafts of this dissertation.

This dissertation also benefited substantially from the efforts of two others, Tom Philippi and Barbara Beach. Their careful proofreading, extensive editorial suggestions, and gentle humour made my task much more bearable.

Contributing to the contents of Chapter 3 in this dissertation have been many gracious colleagues. The basic ideas for arrays of small-diameter disks grew out of work I did with David Patterson and Randy Katz. The extension to binary, double-erasure-correcting codes is from collaborative work with Richard Karp, Lisa Hellerstein, David Patterson, and Randy Katz.
David Patterson, Randy Katz, and I also contributed to the RAID measurement experiment carried out by Peter Chen with the assistance of Amdahl Corporation. Peter Chen and Ed Lee also contributed insight into their work on data striping-unit selection and parity placement strategies, respectively.

Thanks go to Thinking Machines Corporation for making available to me the disk failure data reported in Chapter 4. Rick Epstein, Sue Maxwell, and Raymond Mak were instrumental in making this possible. Thanks also go to Sue Maxwell and Joseph Yarmus for reading and improving a draft of this chapter.

The contents of Chapter 5 grew out of research from Martin Schulze's Masters report and his subsequent work with David Patterson, Randy Katz, and myself. In developing the models in this chapter, I relied heavily on the experience and insight of Ronald Wolff.

On a broader note, there have been many people who contributed to my understanding of the material in this dissertation. Early in my research I was fortunate to encounter the vision of Michael Stonebraker, Dieter Gawlick (then with Amdahl, now with DEC), and Jim Gray (then with Tandem, now with DEC). As you will see in Chapter 3, Jim has made a significant and provocative contribution to my understanding of disk array performance. I have also greatly benefited from the expert advice of Mike Mitoma, Jim Brady, and Jai Menon of IBM Almaden Research Center, Dave Gordon of Array Technology, Dave Anderson of Imprimis (now Seagate), and David Tweten of NASA/NAS at Ames.

Here at Berkeley, my research developed in parallel with the RAID project. Many of its members, notably Randy Katz, David Patterson, Ken Lutz, Peter Chen, Ed Lee, Ann Chervenak, Rich Drewes, Ethan Miller, and Martin Schulze, contributed to my understanding of disk arrays. Their efforts in the construction of a prototype, called RAID-I, were important to our collective education.
Together with the RAID group, Berkeley's POSTGRES and Sprite research groups formed the XPRS research environment, which combined database, operating systems, and I/O systems research into a wide learning experience for me. Michael Stonebraker, Margo Seltzer, and Mark Sullivan, members of POSTGRES, had a great influence on my thinking.

I have been involved in and enriched by the Sprite project since it began. Sprite, as an experimental operating system, has provided me with a powerful simulation engine through its transparent process migration and its large number of high-performance workstations. Sprite is the result of the efforts of John Ousterhout, Brent Welch, Mike Nelson, Fred Douglis, Andrew Cherenson, Mendel Rosenblum, Mary Baker, John Hartman, Ken Shirriff, Mike Kupfer, Bob Bruce, and Adam de Boor. Fred and Adam, in particular, are responsible for the massive amount of simulation Sprite transparently provided for me.

Although this research does not include any material directly pertaining to my earlier involvement with the SPUR multiprocessor workstation project, this project and its members were instrumental to my approach to systems research. Thanks to David Patterson, Randy Katz, David Hodges, George Taylor, Mark Hill, David Wood, Susan Eggers, Deog-Kyoon Jeong, Shing Kong, Corinna Lee, Joan Pendleton, Scott Ritchie, Walter Beach, Daebum Lee, Douglas Johnson, and Ken Lutz.

I think that an important, and often overlooked, source of education in graduate school is the interaction between junior and senior students. For me, Mark Hill, Tony DeRose, George Taylor, Susan Eggers, David Wood, Gregg Foster, and Prabhakar Ragde were notable peer mentors. As I have tried to pass on this fine tradition to Peter Chen, Ed Lee, and Ann Chervenak, I have learned anew how educating collaboration with bright new minds can be.

Special thanks go to those who have helped me enjoy life during the preparation of this dissertation. Henry Moreton and the Mystery!
gang gave me a weekly excuse for diversion, and Ania and Tina Upstill and their parents Steve and Krys were my family away from home.

Although distant, my family at home are the root of the strength and conviction that gets me through trying times. Their love for me and confidence in me shield me from all dangers.

My final thanks go to Barbara Beach. Barbara was the one who, many years ago, suggested that I pursue a PhD degree. It was she, more than most, who suffered as I followed through on this suggestion.