Scalable and Manageable Storage Systems


Khalil S. Amiri
December, 2000
CMU-CS-00-178

Department of Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Thesis Committee:
Garth A. Gibson, Chair
Gregory R. Ganger
M. Satyanarayanan
Daniel P. Siewiorek
John Wilkes, HP Laboratories

This research is sponsored by DARPA/ITO, through DARPA Order D306, and issued by Indian Head Division under contract N00174-96-0002. Additional support was provided by generous contributions of the member companies of the Parallel Data Consortium, including 3COM Corporation, Compaq, Data General, EMC, Hewlett-Packard, IBM, Intel, LSI Logic, Novell, Inc., Seagate Technology, StorageTek, Quantum Corporation and Wind River Systems. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies or endorsements, either expressed or implied, of any supporting organization or the US Government.

Keywords: storage, network-attached storage, storage management, distributed disk arrays, concurrency control, recovery, function partitioning.

Abstract

Emerging applications such as data warehousing, multimedia content distribution, electronic commerce and medical and satellite databases have substantial storage requirements that are growing at 3X to 5X per year. Such applications require scalable, highly-available and cost-effective storage systems. Traditional storage systems rely on a central controller (file server, disk array controller) to access storage and copy data between storage devices and clients, which limits their scalability. This dissertation describes an architecture, network-attached secure disks (NASD), that eliminates the single-controller bottleneck, allowing the throughput and bandwidth of an array to scale with increasing capacity up to the largest sizes desired in practice. NASD enables direct access from client to shared storage devices, allowing aggregate bandwidth to scale with the number of nodes.

In a shared storage system, each client acts as its own storage (RAID) controller, performing all the functions required to manage redundancy and access its data. As a result, multiple controllers can be accessing and managing shared storage devices concurrently. Without proper provisions, this concurrency can corrupt redundancy codes and cause hosts to read incorrect data. This dissertation proposes a transactional approach to ensure correctness in highly concurrent storage device arrays. It proposes distributed device-based protocols that exploit trends towards increased device intelligence to ensure correctness while scaling well with system size.

Emerging network-attached storage arrays consist of storage devices with excess cycles in their on-disk controllers, which can be used to execute filesystem function traditionally executed on the host. Programmable storage devices increase the flexibility in partitioning filesystem function between clients and storage devices. However, the heterogeneity in resource availability among servers, clients and network links causes optimal function partitioning to change across sites and with time. This dissertation proposes an automatic approach which allows function partitioning to be changed and optimized at run-time by relying only on the black-box monitoring of functional components and of resource availability in the storage system.
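The hazard the abstract alludes to, concurrent controllers corrupting redundancy codes, can be seen in a small sketch (an editor's illustration, not material from the dissertation; all names and values are hypothetical). Two hosts update different data blocks of the same parity stripe using the classic small-write read-modify-write; because both pre-read the parity before either writes it back, one parity update is lost and the parity left on disk no longer matches the data.

```python
# Illustrative sketch (not from the dissertation): two hosts sharing a
# RAID-5-style stripe corrupt parity when their read-modify-write updates
# interleave without concurrency control. Names and values are hypothetical.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# A tiny "stripe": two data blocks and one parity block, 4 bytes each.
stripe = {
    "D0": bytes([1, 1, 1, 1]),
    "D1": bytes([2, 2, 2, 2]),
}
stripe["P"] = xor(stripe["D0"], stripe["D1"])

def read_modify_write(block, new_data, snapshot):
    """Classic small write: new parity = old parity XOR old data XOR new data.
    `snapshot` is the pre-read this host performed before either host wrote."""
    old_data, old_parity = snapshot
    new_parity = xor(xor(old_parity, old_data), new_data)
    stripe[block] = new_data   # write the new data block
    stripe["P"] = new_parity   # write back parity computed from the stale snapshot

# Both hosts pre-read *before* either writes, an interleaving that
# device-based locking or timestamp ordering would forbid.
snap_a = (stripe["D0"], stripe["P"])
snap_b = (stripe["D1"], stripe["P"])
read_modify_write("D0", bytes([9, 9, 9, 9]), snap_a)   # host A updates D0
read_modify_write("D1", bytes([7, 7, 7, 7]), snap_b)   # host B updates D1

expected = xor(stripe["D0"], stripe["D1"])
print("parity on disk :", stripe["P"].hex())
print("correct parity :", expected.hex())   # differs: a later disk failure
                                            # would reconstruct wrong data
```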
This dissertation is dedicated to my mother, Aicha, and to my late father, Sadok.

Acknowledgements

The writing of a dissertation can be a daunting and taxing experience. Fortunately, I was lucky enough to have enjoyed the support of a large number of colleagues, family members, faculty and friends who made this process almost enjoyable. First and foremost, however, I am grateful to God for blessing me with a wonderful family and for giving me the opportunity and ability to complete my doctorate.

I would like to thank my advisor, Professor Garth Gibson, for his advice and encouragement during my time at Carnegie Mellon. I am also grateful to the members of my thesis committee, Gregory Ganger, Dan Siewiorek, Satya and John Wilkes, for agreeing to serve on the committee and for their valuable guidance and feedback. I am also indebted to John Wilkes for his mentoring and encouragement during my internships at HP Labs.

My officemates, neighbors, and collaborators on the NASD project, Fay Chang, Howard Gobioff, Erik Riedel and David Rochberg, were great colleagues and wonderful friends. I am also grateful to the members of the RAIDFrame and NASD research groups and to the members of the entire Parallel Data Laboratory, who always proved ready to engage in challenging discussions. In particular, I would like to thank Jeff Butler, Bill Courtright, Mark Holland, Eugene Feinberg, Paul Mazaitis, David Nagle, Berend Ozceri, Hugo Patterson, Ted Wong, and Jim Zelenka. In addition, I would like to thank LeAnn, Andy, Charles, Cheryl, Chris, Danner, John, Garth, Marc, Sean, and Steve.

I am grateful to Joan Digney for her careful proofreading of this document, for her valuable comments, and for her words of encouragement. I am also grateful to Patty for her friendship, support, assistance, and for running such a large laboratory so well. I am equally grateful to Karen for helping me with so many tasks and simply for being such a wonderful person.

During my time at Carnegie Mellon, I had several fruitful collaborations with colleagues within and outside the University. In particular, I would like to thank David Petrou, who worked with me on ABACUS, for always proving ready to engage in challenging discussions or to start any design or implementation task we thought was useful. I would also like to thank Richard Golding, who worked with me on the challenging scalability and fault-tolerance problems in storage clusters. Richard was always a source of good ideas, advice and encouragement, in addition to being great company.

I am deeply grateful to the members of the Parallel Data Lab’s Industrial Consortium for helping me understand the practical issues surrounding our research proposals. In particular, I am grateful to Dave Anderson and Chris Mallakapali of Seagate, to Elizabeth Borowsky, then of HP Labs, to Don Cameron of Intel, and to Satish Rege of Quantum.

I cannot imagine my six years in graduate school without the support of many friends in the United States and overseas. I would like to thank in particular Adel, Ferdaws, Gaddour, Karim, Najed, Rita, Sandra, Sean, and Sofiane.

Last but not least, I am infinitely grateful to my mother, Aicha, and my late father, Sadok. They provided me with unconditional love, advice, and support.
My late father and his father before him presented me with admirable role models of wisdom and perseverance. Despite our long-distance relationship, my brothers and sisters, Charfeddine, Amel, Ghazi, Hajer, Nabeel, and Ines, overwhelmed me with more love and support throughout my years in college and graduate school than I could have ever deserved. Many thanks go to my uncles and aunts and their respective families, for their unwavering support.

Khalil Amiri
Pittsburgh, Pennsylvania
December, 2000

Contents

1 Introduction
  1.1 The storage management problem
    1.1.1 The ideal storage system
    1.1.2 The shortcomings of current systems
  1.2 Dissertation research
  1.3 Dissertation road map

2 Background
  2.1 Trends in technology
    2.1.1 Magnetic disks
    2.1.2 Memory
    2.1.3 Processors
    2.1.4 Interconnects
  2.2 Application demands on storage systems
    2.2.1 Flexible capacity
    2.2.2 High availability
    2.2.3 High bandwidth
  2.3 Redundant disk arrays
    2.3.1 RAID level 0
    2.3.2 RAID level 1
    2.3.3 RAID level 5
  2.4 Transactions
    2.4.1 Transactions
    2.4.2 Serializability
    2.4.3 Serializability protocols
    2.4.4 Recovery protocols
  2.5 Summary

3 Network-attached storage devices
  3.1 Trends enabling network-attached storage
    3.1.1 Cost-ineffective storage systems
    3.1.2 I/O-bound large-object applications
    3.1.3 New drive attachment technology
    3.1.4 Convergence of peripheral and interprocessor networks
    3.1.5 Excess of on-drive transistors
  3.2 Two network-attached storage architectures
    3.2.1 Network SCSI
    3.2.2 Network-Attached Secure Disks (NASD)
  3.3 The NASD architecture
    3.3.1 Direct transfer
    3.3.2 Asynchronous oversight
    3.3.3 Object-based interface
  3.4 The Cheops storage service
    3.4.1 Cheops design overview
    3.4.2 Layout protocols
    3.4.3 Storage access protocols
    3.4.4 Implementation
  3.5 Scalable bandwidth on Cheops-NASD
    3.5.1 Evaluation environment
    3.5.2 Raw bandwidth scaling
  3.6 Other scalable storage architectures
    3.6.1 Decoupling control from data transfer
    3.6.2 Network-striping
    3.6.3 Parallel and clustered storage systems
  3.7 Summary

4 Shared storage arrays
  4.1 Ensuring correctness in shared arrays
    4.1.1 Concurrency anomalies in shared arrays
    4.1.2 Failure anomalies in shared arrays
  4.2 System description
    4.2.1 Storage controllers
    4.2.2 Storage managers
  4.3 Storage access and management using BSTs
    4.3.1 Fault-free mode
    4.3.2 Degraded mode
    4.3.3 Migrating mode
    4.3.4 Reconstructing mode
    4.3.5 The structure of BSTs
  4.4 BST properties
    4.4.1 BST Consistency
    4.4.2 BST Durability
    4.4.3 No atomicity
    4.4.4 BST Isolation
  4.5 Serializability protocols for BSTs
    4.5.1 Evaluation environment
Recommended publications
  • VIA RAID Configurations
    VIA RAID configurations

    The motherboard includes a high-performance IDE RAID controller integrated in the VIA VT8237R southbridge chipset. It supports RAID 0, RAID 1 and JBOD with two independent Serial ATA channels.

    RAID 0 (called data striping) optimizes two identical hard disk drives to read and write data in parallel, interleaved stacks. Two hard disks perform the same work as a single drive but at a sustained data transfer rate double that of a single disk alone, thus improving data access and storage. Use of two new identical hard disk drives is required for this setup.

    RAID 1 (called data mirroring) copies and maintains an identical image of data from one drive to a second drive. If one drive fails, the disk array management software directs all applications to the surviving drive, as it contains a complete copy of the data in the other drive. This RAID configuration provides data protection and increases fault tolerance for the entire system. Use two new drives, or use an existing drive and a new drive, for this setup. The new drive must be of the same size or larger than the existing drive.

    JBOD (spanning) stands for Just a Bunch of Disks and refers to hard disk drives that are not yet configured as a RAID set. This configuration combines multiple disks so that they appear as a single volume to the operating system. Spanning does not deliver any advantage over using separate disks independently and does not provide fault tolerance or other RAID performance benefits.

    If you use either the Windows® XP or Windows® 2000 operating system (OS), copy the RAID driver from the support CD to a floppy disk before creating RAID configurations.
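    To make the striping and mirroring layouts in the excerpt above concrete, here is a minimal Python sketch (not taken from the manual; the stripe-unit size and function names are illustrative) mapping a logical block to physical locations for two-disk RAID 0 and RAID 1.

```python
# Minimal sketch of two-disk RAID 0 (striping) and RAID 1 (mirroring)
# address mapping; stripe unit and names are illustrative, not the VIA
# controller's actual parameters.

STRIPE_UNIT = 64  # logical blocks per stripe unit (hypothetical)

def raid0_map(logical_block: int, num_disks: int = 2):
    """Return (disk index, physical block) for a striped array."""
    stripe_unit = logical_block // STRIPE_UNIT       # which stripe unit
    offset      = logical_block %  STRIPE_UNIT       # offset inside it
    disk        = stripe_unit % num_disks            # round-robin across disks
    physical    = (stripe_unit // num_disks) * STRIPE_UNIT + offset
    return disk, physical

def raid1_map(logical_block: int, num_disks: int = 2):
    """A mirrored write goes to the same physical block on every disk."""
    return [(disk, logical_block) for disk in range(num_disks)]

if __name__ == "__main__":
    print(raid0_map(0), raid0_map(64), raid0_map(130))  # alternates between disks
    print(raid1_map(130))                               # both mirror copies
```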
  • D:\Documents and Settings\Steph
    P45D3 Platinum Series MS-7513 (v1.X) Mainboard
    G52-75131X1

    Copyright Notice
    The material in this document is the intellectual property of MICRO-STAR INTERNATIONAL. We take every care in the preparation of this document, but no guarantee is given as to the correctness of its contents. Our products are under continual improvement and we reserve the right to make changes without notice.

    Trademarks
    All trademarks are the properties of their respective owners. NVIDIA, the NVIDIA logo, DualNet, and nForce are registered trademarks or trademarks of NVIDIA Corporation in the United States and/or other countries. AMD, Athlon™, Athlon™ XP, Thoroughbred™, and Duron™ are registered trademarks of AMD Corporation. Intel® and Pentium® are registered trademarks of Intel Corporation. PS/2 and OS®/2 are registered trademarks of International Business Machines Corporation. Windows® 95/98/2000/NT/XP/Vista are registered trademarks of Microsoft Corporation. Netware® is a registered trademark of Novell, Inc. Award® is a registered trademark of Phoenix Technologies Ltd. AMI® is a registered trademark of American Megatrends Inc.

    Revision History
    V1.0, March 2008: First release for PCB 1.X (P45D3 Platinum)

    Technical Support
    If a problem arises with your system and no solution can be obtained from the user’s manual, please contact your place of purchase or local distributor. Alternatively, please try the following help resources for further guidance. Visit the MSI website for FAQ, technical guide, BIOS updates, driver updates, and other information: http://global.msi.com.tw/index.php?func=faqIndex Contact our technical staff at: http://support.msi.com.tw/

    Safety Instructions 1.
  • Designing Disk Arrays for High Data Reliability
    Designing Disk Arrays for High Data Reliability

    Garth A. Gibson, School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh PA 15213
    David A. Patterson, Computer Science Division, Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA 94720

    This research was funded by NSF grant MIP-8715235, NASA/DARPA grant NAG 2-591, a Computer Measurement Group fellowship, and an IBM predoctoral fellowship.

    Proposed running head: Designing Disk Arrays for High Data Reliability. Please forward communication to: Garth A. Gibson, School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh PA 15213-3890, 412-268-5890, FAX 412-681-5739, [email protected]

    ABSTRACT. Redundancy based on a parity encoding has been proposed for insuring that disk arrays provide highly reliable data. Parity-based redundancy will tolerate many independent and dependent disk failures (shared support hardware) without on-line spare disks and many more such failures with on-line spare disks. This paper explores the design of reliable, redundant disk arrays. In the context of a 70 disk strawman array, it presents and applies analytic and simulation models for the time until data is lost. It shows how to balance requirements for high data reliability against the overhead cost of redundant data, on-line spares, and on-site repair personnel in terms of an array's architecture, its component reliabilities, and its repair policies.

    Recent advances in computing speeds can be matched by the I/O performance afforded by parallelism in striped disk arrays [12, 13, 24]. Arrays of small disks further utilize advances in the technology of the magnetic recording industry to provide cost-effective I/O systems based on disk striping [19].
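    The "time until data is lost" models mentioned in the abstract above can be approximated with the standard back-of-the-envelope MTTDL formula for parity-protected arrays. The sketch below uses that textbook approximation with illustrative numbers; it is not the paper's own analytic or simulation model.

```python
# Back-of-the-envelope mean time to data loss (MTTDL) for a parity-protected
# array, assuming independent, exponentially distributed failures. This is
# the standard textbook approximation, not the paper's exact model.

def mttdl_hours(mttf_disk: float, mttr: float, total_disks: int, group_size: int) -> float:
    """Data is lost when a second disk in a parity group fails before the
    first failed disk is repaired."""
    return (mttf_disk ** 2) / (total_disks * (group_size - 1) * mttr)

# Example: a 70-disk array split into 10 parity groups of 7 disks, with a
# disk MTTF of 150,000 hours and a 24-hour repair time (illustrative numbers).
hours = mttdl_hours(mttf_disk=150_000, mttr=24, total_disks=70, group_size=7)
print(f"MTTDL ≈ {hours:,.0f} hours ≈ {hours / 8760:,.0f} years")
```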
  • MegaRAID® 1078-Based SAS RAID Controllers User's Guide
    USER’S GUIDE: MegaRAID® 1078-based SAS RAID Controllers
    February 2007
    80-00157-01 Rev. A

    This document contains proprietary information of LSI Logic Corporation. The information contained herein is not to be used by or disclosed to third parties without the express written permission of an officer of LSI Logic Corporation. LSI Logic products are not intended for use in life-support appliances, devices, or systems. Use of any LSI Logic product in such applications without written consent of the appropriate LSI Logic officer is prohibited.

    Purchase of I2C components of LSI Logic Corporation, or one of its sublicensed Associated Companies, conveys a license under the Philips I2C Patent Rights to use these components in an I2C system, provided that the system conforms to the I2C Standard Specification as defined by Philips.

    Document 80-00157-01 Rev. A, February 2007. This document describes the current versions of the LSI Logic Corporation MegaRAID SAS RAID controllers and will remain the official reference source for all revisions/releases of these products until rescinded by an update. LSI Logic Corporation reserves the right to make changes to any products herein at any time without notice. LSI Logic does not assume any responsibility or liability arising out of the application or use of any product described herein, except as expressly agreed to in writing by LSI Logic; nor does the purchase or use of a product from LSI Logic convey a license under any patent rights, copyrights, trademark rights, or any other of the intellectual property rights of LSI Logic or third parties.
  • Disk Array Data Organizations and RAID
    Guest Lecture for 15-440: Disk Array Data Organizations and RAID
    October 2010, Greg Ganger

    Plan for today: Why have multiple disks? Storage capacity, performance capacity, reliability. The load distribution problem and approaches: disk striping. Fault tolerance: replication, parity-based protection. "RAID" and the Disk Array Matrix. Rebuild.

    Why multi-disk systems? A single storage device may not provide enough storage capacity, performance capacity, or reliability. So, what is the simplest arrangement?

    Just a bunch of disks (JBOD): each disk holds its own data independently (figure: blocks A0-A3, B0-B3, C0-C3, D0-D3, one column per disk). Yes, it's a goofy name; industry really does sell "JBOD enclosures".

    Disk Subsystem Load Balancing: I/O requests are almost never evenly distributed. Some data is requested more than other data, depending on the apps, usage, time, ... What is the right data-to-disk assignment policy? Common approach: fixed data placement. Your data is on disk X, period! For good reasons too: you bought it or you're paying more ... Fancy: dynamic data placement. If some of your files are accessed a lot, the admin (or even system) may separate the "hot" files across multiple disks. In this scenario, entire file systems (or even files) are manually moved by the system admin to specific disks.
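    The parity-based protection and rebuild items in the lecture plan above boil down to XOR arithmetic; the following sketch (an illustration, not lecture code) computes a stripe's parity and regenerates a lost block from the surviving blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# Illustrative stripe: 3 data blocks plus 1 parity block (RAID 4/5 style).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The disk holding data[1] fails; rebuild its contents from survivors + parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
print("rebuilt block:", rebuilt)
```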
  • 6Gb/s MegaRAID SAS RAID Controllers User Guide February 2013
    6Gb/s MegaRAID® SAS RAID Controllers User Guide
    February 2013
    41450-04, Rev. B

    Revision History
    41450-04, Rev. B, February 2013: Updated the environmental conditions for the RAID controllers. Updated this guide to the new template.
    41450-04, Rev. A, August 2012: Added the MegaRAID SAS 9270-8i, SAS 9271-4i, SAS 9271-8i, SAS 9271-8iCC, SAS 9286-8e, SAS 9286CV-8e, and SAS 9286CV-8eCC RAID controllers.
    41450-03, Rev. A, April 2012: Added the MegaRAID SAS 9265CV-8i, SAS 9266-4i, SAS 9266-8i, and SAS 9285CV-8e RAID controllers.
    41450-02, Rev. E, February 2011: Added the MegaRAID SAS 9260CV-4i, SAS 9260CV-8i, SAS 9265-8i, and SAS 9285-8e RAID controllers.
    41450-02, Rev. D, June 2010: Added the MegaRAID SAS 9260-16i, SAS 9280-16i4e, and SAS 9280-24i4e RAID controllers.
    41450-02, Rev. C, April 2010: Put this guide in the new template.
    41450-02, Rev. B, November 2009: Added the MegaRAID SAS 9240-4i, SAS 9240-8i, SAS 9261-8i, and SAS 9280-4i4e RAID controllers.
    41450-02, Rev. A, July 2009: Added the MegaRAID SAS 9260-4i, SAS 9260DE-8i, SAS 9280-8e, and SAS 9280DE-8e RAID controllers.
    41450-01, Rev. A, June 2009: Added the MegaRAID SAS 9260-8i RAID controller.
    41450-00, Rev. A, March 2009: Initial release of this document.

    LSI, the LSI & Design logo, CacheCade, CacheVault, Fusion-MPT, MegaRAID, and SafeStore are trademarks or registered trademarks. All other brand and product names may be trademarks of their respective companies.
  • Identify Storage Technologies and Understand RAID
    LESSON 4.1_4.2: 98-365 Windows Server Administration Fundamentals
    Identify Storage Technologies and Understand RAID

    Lesson Overview. In this lesson, you will learn: local storage options, network storage options, and Redundant Array of Independent Disks (RAID) options.

    Anticipatory Set. List three different RAID configurations. Which of these three bus types has the fastest transfer speed: Parallel ATA (PATA), Serial ATA (SATA), or USB 2.0?

    Local Storage Options. Local storage options can range from a simple single disk to a Redundant Array of Independent Disks (RAID). Local storage options can be broken down into bus types: Serial Advanced Technology Attachment (SATA), Integrated Drive Electronics (IDE, now called Parallel ATA or PATA), Small Computer System Interface (SCSI), and Serial Attached SCSI (SAS).

    SATA drives have taken the place of the traditional PATA drives. SATA has several advantages over PATA: reduced cable bulk and cost, faster and more efficient data transfer, and hot-swapping technology.

    Local Storage Options (continued). SAS drives have taken the place of the traditional SCSI and Ultra SCSI drives in server-class machines. SAS has several
  • RAID-II: Design and Implementation of a Large Scale Disk Array Controller
    NASA-CR-192911
    Report No. UCB/CSD-92-705, October 1992
    Computer Science Division (EECS), University of California, Berkeley, Berkeley, California 94720

    RAID-II: Design and Implementation of a Large Scale Disk Array Controller
    R. H. Katz, P. M. Chen, A. L. Drapeau, E. K. Lee, K. Lutz, E. L. Miller, S. Seshan, D. A. Patterson
    Computer Science Division, Electrical Engineering and Computer Science Department, University of California, Berkeley, Berkeley, CA 94720

    Abstract: We describe the implementation of a large scale disk array controller and subsystem incorporating over 100 high performance 3.5" disk drives. It is designed to provide 40 MB/s sustained performance and 40 GB capacity in three 19" racks. The array controller forms an integral part of a file server that attaches to a Gb/s local area network. The controller implements a high bandwidth interconnect between an interleaved memory, an XOR calculation engine, the network interface (HIPPI), and the disk interfaces (SCSI). The system is now functionally operational, and we are tuning its performance.
  • Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays
    Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays

    Draft copy submitted to the Journal of Distributed and Parallel Databases. A revised copy is published in this journal, vol. 2, no. 3, July 1994.

    Mark Holland, Department of Electrical and Computer Engineering, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213-3890, (412) 268-5237, [email protected]
    Garth A. Gibson, School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213-3890, (412) 268-5890, [email protected]
    Daniel P. Siewiorek, School of Computer Science, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213-3890, (412) 268-2570, [email protected]

    Abstract: The performance of traditional RAID Level 5 arrays is, for many applications, unacceptably poor while one of its constituent disks is non-functional. This paper describes and evaluates mechanisms by which this disk array failure-recovery performance can be improved. The two key issues addressed are the data layout, the mapping by which data and parity blocks are assigned to physical disk blocks in an array, and the reconstruction algorithm, which is the technique used to recover data that is lost when a component disk fails. The data layout techniques this paper investigates are variations on the declustered parity organization, a derivative of RAID Level 5 that allows a system to trade some of its data capacity for improved failure-recovery performance. Parity declustering improves the failure-mode performance of an array significantly, and a parity-declustered architecture is preferable to an equivalent-size multiple-group RAID Level 5 organization in environments where failure-recovery performance is important.
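    The capacity-versus-recovery trade-off described in this abstract can be quantified with the usual declustering-ratio arithmetic; the sketch below (illustrative, not code from the paper) compares parity overhead against the fraction of each surviving disk that must be read during reconstruction.

```python
# Illustrative sketch (standard declustering-ratio arithmetic, not code from
# the paper): trade parity overhead against per-disk reconstruction workload.

def declustering_tradeoff(num_disks: int, group_size: int):
    """For an array of `num_disks` with parity groups of `group_size` units,
    return (fraction of each disk devoted to parity,
            fraction of each surviving disk read during reconstruction)."""
    parity_overhead = 1.0 / group_size
    rebuild_fraction = (group_size - 1) / (num_disks - 1)  # the "declustering ratio"
    return parity_overhead, rebuild_fraction

# Example: a 21-disk array. RAID 5 uses one 21-disk group; declustered parity
# keeps smaller groups (e.g. 5) spread over all 21 disks.
for g in (21, 11, 5):
    overhead, rebuild = declustering_tradeoff(21, g)
    print(f"group size {g:2d}: parity overhead {overhead:.0%}, "
          f"rebuild reads {rebuild:.0%} of each surviving disk")
```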
  • ICP Disk Array Controllers
    ICP Disk Array Controllers

    GDT8546RZ

    GDT8546RZ is a 4-channel PCI RAID controller offering the option of connecting up to four serial ATA drives. Parallel ATA (IDE) disk drives can be linked with an optional accessory. The controller is based on the Serial AT Attachment (SATA) technology, satisfying the demands for inexpensive entry-level RAID solutions. Its compact construction allows GDT8546RZ to be used in low profile systems. The controller is not only suitable for high density and/or entry-level rackmount servers, but also for high-end desktops and workstations.

    Excellence in Controllers

    64-bit/66MHz PCI - Serial ATA Hardware RAID Controller
    ● Supports RAID 0, 1, 4, 5 and 10
    ● Hot Plug, Auto Hot Plug with SAF-TE support
    ● Private or Pool Hot Fix drives
    ● Hardware RAID is totally independent of the host and operating system
    ● Integrated acoustic alarm

    Versatile Connectivity Capabilities
    ● Supports four internal independent SATA 1.5 channels
    ● Data transfer rate up to 150MB/sec. per channel
    ● Connection of one hard drive per channel (point-to-point configuration making cabling redundant)
    ● Cable length

    High Configuration Flexibility
    ● ROM-resident setup (ICP RAID Console) with integrated “Express Setup” function (CTRL+<G>)
    ● Simultaneous operation of several disk arrays
    ● Flexible capacity setting
    ● Online capacity expansion
    ● Online RAID level migration
    ● Complete disk array migration
    ● Controller switch and controller upgrade without reconfiguration
    ● Automatic adaptation to changes in SATA configuration

    Integrated RAID Software
  • 416 Distributed Systems
    416 Distributed Systems: RAID, Feb 16 2018
    Thanks to Greg Ganger and Remzi Arpaci-Dusseau for slides

    Outline
    • Using multiple disks
    • Why have multiple disks?
    • problem and approaches
    • RAID levels and performance

    Motivation: Why use multiple disks?
    • Capacity: more disks allow us to store more data
    • Performance: access multiple disks in parallel; each disk can be working on an independent read or write; overlap seek and rotational positioning time for all
    • Reliability: recover from disk (or single-sector) failures; will need to store multiple copies of data to recover
    • So, what is the simplest arrangement?

    Just a bunch of disks (JBOD): each disk holds its own data independently (figure: blocks A0-A3, B0-B3, C0-C3, D0-D3, one column per disk). Yes, it's a goofy name; industry really does sell "JBOD enclosures".

    Disk Subsystem Load Balancing
    • I/O requests are almost never evenly distributed
    • Some data is requested more than other data; depends on the apps, usage, time, ...
    • What is the right data-to-disk assignment policy?
    • Common approach: fixed data placement. Your data is on disk X, period! For good reasons too: you bought it or you're paying more ...
    • Fancy: dynamic data placement. If some of your files are accessed a lot, the admin (or even system) may separate the "hot" files across multiple disks. In this scenario, entire file systems (or even files) are manually moved by the system admin to specific disks
    • Alternative: disk striping. Stripe all
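    The load-balancing point above is easy to demonstrate with a small simulation (an illustration, not part of the lecture slides; the workload skew and disk counts are made up): fixed whole-file placement concentrates a skewed workload on one disk, while striping the same requests spreads them almost evenly.

```python
import random
from collections import Counter

random.seed(0)
NUM_DISKS = 4
NUM_FILES = 8

# Skewed workload: file 0 is "hot" and receives half of all requests (illustrative).
requests = [0 if random.random() < 0.5 else random.randrange(1, NUM_FILES)
            for _ in range(10_000)]

# Fixed placement: each whole file lives on exactly one disk.
fixed = Counter(file_id % NUM_DISKS for file_id in requests)

# Striping: each request touches one block, and a file's blocks are spread
# round-robin across all disks, so the touched disk is effectively random.
striped = Counter((file_id + random.randrange(NUM_DISKS)) % NUM_DISKS
                  for file_id in requests)

print("fixed placement load per disk  :", sorted(fixed.values(), reverse=True))
print("striped placement load per disk:", sorted(striped.values(), reverse=True))
```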
  • IBM XIV Storage System Host Attachment and Interoperability
    Front cover: IBM XIV Storage System Host Attachment and Interoperability
    Integrate with DB2, VMware ESX, and Microsoft HyperV. Get operating system specifics for host side tuning. Use XIV with IBM i, N series, and ProtecTIER.

    Bertrand Dufrasne, Bruce Allworth, Desire Brival, Mark Kremkus, Markus Oscheka, Thomas Peralto
    ibm.com/redbooks
    International Technical Support Organization
    IBM XIV Storage System Host Attachment and Interoperability
    March 2013, SG24-7904-02

    Note: Before using this information and the product it supports, read the information in “Notices” on page ix.

    Third Edition (March 2013). This edition applies to the IBM XIV Storage System (Machine types 2812-114 and 2810-114) with XIV system software Version 11.1.1.

    © Copyright International Business Machines Corporation 2012, 2013. All rights reserved. Note to U.S. Government Users Restricted Rights: Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

    Contents
    Notices
      Trademarks
    Preface
      The team who wrote this book
      Now you can become a published author, too!
      Comments welcome
      Stay connected to IBM Redbooks
    Summary of changes
      March 2013, Third Edition
    Chapter 1. Host connectivity
      1.1 Overview
        1.1.1 Module, patch panel, and host connectivity
        1.1.2 Host operating system support
        1.1.3 Downloading the entire XIV support matrix by using the SSIC
        1.1.4 Host Attachment Kits
        1.1.5 Fibre Channel versus iSCSI access
      1.2 Fibre Channel connectivity
        1.2.1 Preparation steps
        1.2.2 Fibre Channel configurations