On-Line Data Reconstruction in Redundant Disk Arrays
Recommended publications
-
Disk Array Data Organizations and RAID
Guest Lecture for 15-440: Disk Array Data Organizations and RAID. October 2010, Greg Ganger © 1. Plan for today: Why have multiple disks? (storage capacity, performance capacity, reliability); load distribution problem and approaches (disk striping); fault tolerance (replication, parity-based protection); “RAID” and the Disk Array Matrix; rebuild. Why multi-disk systems? A single storage device may not provide enough storage capacity, performance capacity, reliability. So, what is the simplest arrangement? Just a bunch of disks (JBOD): A0 B0 C0 D0 A1 B1 C1 D1 A2 B2 C2 D2 A3 B3 C3 D3. Yes, it’s a goofy name; industry really does sell “JBOD enclosures”. Disk Subsystem Load Balancing: I/O requests are almost never evenly distributed. Some data is requested more than other data. Depends on the apps, usage, time, … What is the right data-to-disk assignment policy? Common approach: fixed data placement. Your data is on disk X, period! For good reasons too: you bought it or you’re paying more … Fancy: dynamic data placement. If some of your files are accessed a lot, the admin (or even system) may separate the “hot” files across multiple disks. In this scenario, entire file systems (or even files) are manually moved by the system admin to specific disks -
Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays
Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays Draft copy submitted to the Journal of Distributed and Parallel Databases. A revised copy is published in this journal, vol. 2 no. 3, July 1994. Mark Holland Department of Electrical and Computer Engineering Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213-3890 (412) 268-5237 [email protected] Garth A. Gibson School of Computer Science Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213-3890 (412) 268-5890 [email protected] Daniel P. Siewiorek School of Computer Science Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213-3890 (412) 268-2570 [email protected] Abstract The performance of traditional RAID Level 5 arrays is, for many applications, unacceptably poor while one of its constituent disks is non-functional. This paper describes and evaluates mechanisms by which this disk array failure-recovery performance can be improved. The two key issues addressed are the data layout, the mapping by which data and parity blocks are assigned to physical disk blocks in an array, and the reconstruction algorithm, which is the technique used to recover data that is lost when a component disk fails. The data layout techniques this paper investigates are variations on the declustered parity organization, a derivative of RAID Level 5 that allows a system to trade some of its data capacity for improved failure-recovery performance. Parity declustering improves the failure-mode performance of an array significantly, and a parity-declustered architecture is preferable to an equivalent-size multiple-group RAID Level 5 organization in environments where failure-recovery performance is important. -
Data Storage and High-Speed Streaming
FYS3240 PC-based instrumentation and microcontrollers Data storage and high-speed streaming Spring 2013 – Lecture #8 Bekkeng, 8.1.2013 Data streaming • Data written to or read from a hard drive at a sustained rate is often referred to as streaming • Trends in data storage – Ever-increasing amounts of data – Record “everything” and play it back later – Hard drives: faster, bigger, and cheaper – Solid state drives – RAID hardware – PCI Express • PCI Express provides higher, dedicated bandwidth Overview • Hard drive performance and alternatives • File types • RAID • DAQ software design for high-speed acquisition and storage Streaming Data with the PCI Express Bus • A PCI Express device receives dedicated bandwidth (250 MB/s or more). • Data is transferred from onboard device memory (typically less than 512 MB), across a dedicated PCI Express link, across the I/O bus, and into system memory (RAM; 3 GB or more possible). It can then be transferred from system memory, across the I/O bus, onto hard drives (TBs of data). The CPU/DMA-controller is responsible for managing this process. • Peer-to-peer data streaming is also possible between two PCI Express devices. PXI: Streaming to/from Hard Disk Drives RAM – Random Access Memory • SRAM – Static RAM: Each bit stored in a flip-flop • DRAM – Dynamic RAM: Each bit stored in a capacitor (transistor). Has to be refreshed (e.g. each 15 ms) – EDO DRAM – Extended Data Out DRAM. Data available while next bit is being set up – Dual-Ported DRAM (VRAM – Video RAM). Two locations can be accessed at the same time – SDRAM – Synchronous DRAM. -
RAID Technology
RAID Technology Reference and Sources: • Most of the text in this guide has been taken from a copyrighted document of Adaptec, Inc. (www.adaptec.com) • Perceptive Solutions, Inc. RAID stands for Redundant Array of Inexpensive (or sometimes "Independent") Disks. RAID is a method of combining several hard disk drives into one logical unit (two or more disks grouped together to appear as a single device to the host system). RAID technology was developed to address the fault-tolerance and performance limitations of conventional disk storage. It can offer fault tolerance and higher throughput levels than a single hard drive or group of independent hard drives. While arrays were once considered complex and relatively specialized storage solutions, today they are easy to use and essential for a broad spectrum of client/server applications. Redundant Arrays of Inexpensive Disks (RAID) "KILLS - BUGS - DEAD!" -- TV commercial for RAID bug spray There are many applications, particularly in a business environment, where there are needs beyond what can be fulfilled by a single hard disk, regardless of its size, performance or quality level. Many businesses can't afford to have their systems go down for even an hour in the event of a disk failure; they need large storage subsystems with capacities in the terabytes; and they want to be able to insulate themselves from hardware failures to any extent possible. Some people working with multimedia files need fast data transfer exceeding what current drives can deliver, without spending a fortune on specialty drives. These situations require that the traditional "one hard disk per system" model be set aside and a new system employed. -
Memory Systems : Cache, DRAM, Disk
CHAPTER 24 Storage Subsystems Up to this point, the discussions in Part III of this book have been on the disk drive as an individual storage device and how it is directly connected to a host system. This direct attach storage (DAS) paradigm dates back to the early days of mainframe computing, when disk drives were located close to the CPU and cabled directly to the computer system via some control circuits. This simple model of disk drive usage and configuration remained unchanged through the introduction of, first, the mini computers and then the personal computers. Indeed, even today the majority of disk drives shipped in the industry are targeted for systems having such a configuration. However, this simplistic view of the relationship between the disk drive and the host system does not […] with how multiple drives within a subsystem can be organized together, cooperatively, for better reliability and performance. This is discussed in Sections 24.1–24.3. A second aspect deals with how a storage subsystem is connected to its clients and accessed. Some form of networking is usually involved. This is discussed in Sections 24.4–24.6. A storage subsystem can be designed to have any organization and use any of the connection methods discussed in this chapter. Organization details are usually made transparent to user applications by the storage subsystem presenting one or more virtual disk images, which logically look like disk drives to the users. This is easy to do because logically a disk is no more than a drive ID and a logical address space associated with it. -
6Gb/S SATA RAID TB User Manual
6Gb/s SATA RAID TB T12-S6.TB - Desktop RM12-S6.TB - Rackmount User Manual Version: 1.0 Issue Date: October, 2013 ARCHTTP PROXY SERVER INSTALLATION 5.5 For Mac OS 10.X The ArcHttp proxy server is provided on the software CD delivered with the 6Gb/s SATA RAID controller, or can be downloaded from www.areca.com.tw. The firmware-embedded McRAID storage manager can configure and monitor the 6Gb/s SATA RAID controller via the ArcHttp proxy server. For the ArcHttp proxy server on Mac Pro, please refer to Chapter 4.6 "Driver Installation" for Mac 10.X. 5.6 ArcHttp Configuration The ArcHttp proxy server will automatically assign one additional port for setting up its configuration. If you want to change the "archttpsrv.conf" settings of the ArcHttp proxy server configuration, for example General Configuration, Mail Configuration, and SNMP Configuration, please start the Web Browser at http:\\localhost: Cfg Assistant, such as http:\\localhost: 81. The port number for the first controller's McRAID storage manager is the ArcHttp proxy server configuration port number plus 1. • General Configuration: Binding IP: Restrict the ArcHttp proxy server to bind to only a single interface (if there is more than one physical network in the server). HTTP Port#: Value 1~65535. Display HTTP Connection Information To Console: Select “Yes” to show HTTP send bytes and receive bytes information in the console. Scanning PCI Device: Select “Yes” for ARC-1XXX series controller. Scanning RS-232 Device: No. Scanning Inband Device: No. • Mail (alert by Mail) Configuration: To enable the controller to send the email function, you need to configure the SMTP function on the ArcHttp software. -
I/O Workload Outsourcing for Boosting RAID Reconstruction Performance
WorkOut: I/O Workload Outsourcing for Boosting RAID Reconstruction Performance Suzhen Wu1, Hong Jiang2, Dan Feng1∗, Lei Tian12, Bo Mao1 1Key Laboratory of Data Storage Systems, Ministry of Education of China 1School of Computer Science & Technology, Huazhong University of Science & Technology 2Department of Computer Science & Engineering, University of Nebraska-Lincoln ∗Corresponding author: [email protected] {suzhen66, maobo.hust}@gmail.com, {jiang, tian}@cse.unl.edu, [email protected] Abstract User I/O intensity can significantly impact the performance of on-line RAID reconstruction due to contention for the shared disk bandwidth. Based on this observation, this paper proposes a novel scheme, called WorkOut (I/O Workload Outsourcing), to significantly boost RAID reconstruction performance. WorkOut effectively outsources all write requests and popular read requests originally targeted at the degraded RAID set to a surrogate RAID set during reconstruction. Our lightweight prototype implementation of WorkOut and extensive trace-driven and benchmark-driven experiments demonstrate that, compared with existing reconstruction approaches, WorkOut significantly speeds up both the total recon[…] […]ing reconstruction without serving any I/O requests from user applications, and on-line reconstruction, when the RAID continues to service user I/O requests during reconstruction. Off-line reconstruction has the advantage that it’s faster than on-line reconstruction, but it is not practical in environments with high availability requirements, as the entire RAID set needs to be taken off-line during reconstruction. On the other hand, on-line reconstruction allows foreground traffic to continue during reconstruction, but takes longer to complete than off-line reconstruction as the reconstruction process competes with the foreground workload for I/O bandwidth. -
University of California Santa Cruz Incorporating Solid
UNIVERSITY OF CALIFORNIA SANTA CRUZ INCORPORATING SOLID STATE DRIVES INTO DISTRIBUTED STORAGE SYSTEMS A dissertation submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in COMPUTER SCIENCE by Rosie Wacha December 2012 The Dissertation of Rosie Wacha is approved: Professor Scott A. Brandt, Chair; Professor Carlos Maltzahn; Professor Charlie McDowell; Tyrus Miller, Vice Provost and Dean of Graduate Studies. Copyright © by Rosie Wacha 2012. Table of Contents: List of Figures; List of Tables; Abstract; Acknowledgements; 1 Introduction; 2 Background and Related Work; 2.1 Data Layouts for Redundancy and Performance (RAID; Parity striping; Parity declustering; Reconstruction performance improvements; Disk arrays with higher fault tolerance); 2.2 Very Large Storage Arrays (Data placement; Ensuring reliability of data); 2.3 Self-Configuring Disk Arrays (HP AutoRAID; Sparing); 2.4 Solid-State Drives (SSDs); 2.5 Mitigating RAID’s Small Write Problem; 2.6 Low Power Storage Systems; 2.7 Real Systems; 3 RAID4S: Supercharging RAID Small Writes with SSD; 3.1 Improving RAID Small Write Performance; 3.2 Related Work (All-SSD RAID arrays; Hybrid SSD-HDD RAID arrays; Other solid state technology); 3.3 Small Write Performance; 3.4 The RAID4S System; 3.5 The Low Cost of RAID4S; 3.6 Reduced Power Consumption; 3.7 RAID4S Simulation Results (Simulated array performance); 3.8 Experimental Methodology & Results -
Which RAID Level Is Right for Me?
STORAGE SOLUTIONS WHITE PAPER Which RAID Level is Right for Me? Contents: Introduction; RAID Level Descriptions: RAID 0 (Striping), RAID 1 (Mirroring), RAID 1E (Striped Mirror), RAID 5 (Striping with parity), RAID 5EE (Hot Space), RAID 6 (Striping with dual parity), RAID 10 (Striped RAID 1 sets), RAID 50 (Striped RAID 5 sets), RAID 60 (Striped RAID 6 sets); RAID Level Comparison; About Adaptec RAID. Data is the most valuable asset of any business today. Lost data means lost business. Even if you backup regularly, you need a fail-safe way to ensure that your data is protected and can be […] of users. This white paper intends to give an overview on the performance and availability of various RAID levels in general and may not be accurate in all user -
Software-RAID-HOWTO.Pdf
Software-RAID-HOWTO Table of Contents: The Software-RAID HOWTO, Jakob Østergaard [email protected] and Emilio Bueso [email protected]. 1. Introduction; 2. Why RAID?; 3. Devices; 4. Hardware issues; 5. RAID setup; 6. Detecting, querying and testing; 7. Tweaking, tuning and troubleshooting; 8. Reconstruction; 9. Performance -
Of File Systems and Storage Models
Chapter 4 Of File Systems and Storage Models Disks are always full. It is futile to try to get more disk space. Data expands to fill any void. – Parkinson’s Law as applied to disks 4.1 Introduction This chapter deals primarily with how we store data. Virtually all computer systems require some way to store data permanently; even so-called “diskless” systems do require access to certain files in order to boot, run and be useful. Albeit stored remotely (or in memory), these bits reside on some sort of storage system. Most frequently, data is stored on local hard disks, but over the last few years more and more of our files have moved “into the cloud”, where different providers offer easy access to large amounts of storage over the network. We have more and more computers depending on access to remote systems, shifting our traditional view of what constitutes a storage device. As system administrators, we are responsible for all kinds of devices: we build systems running entirely without local storage just as we maintain the massive enterprise storage arrays that enable decentralized data replication and archival. We manage large numbers of computers with their own hard drives, using a variety of technologies to maximize throughput before the data even gets onto a network. In order to be able to optimize our systems on this level, it is important for us to understand the principal concepts of how data is stored, the different storage models and disk interfaces. It is important to be aware of certain physical properties of our storage media, and the impact they, as well as certain historic limitations, have on how we utilize disks. -
A Secure, Reliable and Performance-Enhancing Storage Architecture Integrating Local and Cloud-Based Storage
Brigham Young University BYU ScholarsArchive Theses and Dissertations 2016-12-01 A Secure, Reliable and Performance-Enhancing Storage Architecture Integrating Local and Cloud-Based Storage Christopher Glenn Hansen Brigham Young University Follow this and additional works at: https://scholarsarchive.byu.edu/etd Part of the Electrical and Computer Engineering Commons BYU ScholarsArchive Citation Hansen, Christopher Glenn, "A Secure, Reliable and Performance-Enhancing Storage Architecture Integrating Local and Cloud-Based Storage" (2016). Theses and Dissertations. 6470. https://scholarsarchive.byu.edu/etd/6470 This Thesis is brought to you for free and open access by BYU ScholarsArchive. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of BYU ScholarsArchive. For more information, please contact [email protected], [email protected]. A Secure, Reliable and Performance-Enhancing Storage Architecture Integrating Local and Cloud-Based Storage Christopher Glenn Hansen A thesis submitted to the faculty of Brigham Young University in partial fulfillment of the requirements for the degree of Master of Science James Archibald, Chair Doran Wilde Michael Wirthlin Department of Electrical and Computer Engineering Brigham Young University Copyright © 2016 Christopher Glenn Hansen All Rights Reserved ABSTRACT A Secure, Reliable and Performance-Enhancing Storage Architecture Integrating Local and Cloud-Based Storage Christopher Glenn Hansen Department of Electrical and Computer Engineering, BYU Master of Science The constant evolution of new varieties of computing systems - cloud computing, mobile devices, and Internet of Things, to name a few - have necessitated a growing need for highly reliable, available, secure, and high-performing storage systems. While CPU performance has typically scaled with Moore’s Law, data storage is much less consistent in how quickly performance increases over time.
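Several of the excerpts listed above refer to parity-based protection and to reconstruction, that is, recovering the data lost when a component disk fails. As a rough illustration only, the sketch below shows the XOR arithmetic behind single-parity schemes such as RAID Level 5 for one stripe. It is not taken from any of the listed publications: the function names (`xor_blocks`, `make_stripe`, `reconstruct`) and the one-stripe, in-memory layout are assumptions made for the example, and it ignores parity rotation, declustering, and concurrent user I/O.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Hypothetical one-stripe layout: the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, failed_index):
    """Rebuild the block at failed_index by XOR-ing every surviving block."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


# Three data "disks" plus one parity "disk"; disk 1 is then lost.
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
stripe[1] = None  # simulate the failed disk
rebuilt = reconstruct(stripe, 1)
```

Because XOR is its own inverse, the same routine rebuilds a lost data block or a lost parity block. An on-line reconstruction process would repeat this per stripe while still serving user requests, which is where the bandwidth contention discussed in the excerpts arises.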