Linux MDADM System Test Procedure

1) RAID-1 Test Case

On your VB-Linux system, install mdadm:

    sudo apt-get install mdadm

NOTE: Be very careful not to use your system disk (e.g. /dev/sda) for system tests, since this would likely wipe out your file systems. Use only the RAM disks to be safe, and always think twice before performing any test that writes to system-critical storage block devices.

a) Create a simple mirrored RAID volume using MDADM (Multi-Device Administration), as described in the mdadm manual (also described on Wikipedia and in Linux Magazine):

    mdadm --create /dev/md0 --chunk=4 --level=1 --raid-devices=2 /dev/ram0 /dev/ram1

Test the new device /dev/md0 like any block device by running:

    dd if=/dev/md0 of=/dev/null bs=4k count=1000
    dd if=/dev/zero of=/dev/md0 bs=4k count=1000

Produce a report on the observed read and write performance compared to running the same tests on a simple RAM disk that is not mirrored (e.g. /dev/ram2). Note that “iostat”, a utility from “sysstat”, may be useful for measuring performance (I/Os per second).

b) Build a file system on your mirrored volume and test failure and recovery with MDADM as follows:

    mke2fs -m 0 /dev/md0
    mkdir /mnt/raid1
    mount /dev/md0 /mnt/raid1
    dd if=/dev/zero of=/mnt/raid1/newfile bs=64k count=100
    cat /proc/mdstat
    mdadm --detail /dev/md0

Provide sample output of the status of your /dev/md0 that has your file system using “df”.

Now continue with the following test commands to simulate failure of one of the two member devices in the RAID volume, writes to the volume in degraded mode, and subsequent recovery of the member after the writes:

    mdadm --manage --set-faulty /dev/md0 /dev/ram0
    mdadm --detail /dev/md0
    dd if=/dev/zero of=/mnt/raid1/newfile2 bs=64k count=100
    mdadm /dev/md0 -r /dev/ram0
    mdadm --detail /dev/md0
    mdadm /dev/md0 -a /dev/ram0
    mdadm --detail /dev/md0

Report on what you observe for the above sequence of simulated RAID member failure and recovery commands.

1. Verify that MDADM can handle intermittent loss of communication with a disk drive in the RAID1 volume.
2. Verify that MDADM restores the mirrored data that was written to the file system on the RAID volume to the faulty drive when it is added back to the volume.

To make sure you can re-use the test RAM disks for future test cases (so they are not busy), unmount the file system (the array cannot be stopped while it is mounted), then stop and remove your /dev/md0:

    umount /mnt/raid1
    mdadm --stop /dev/md0
    mdadm --remove /dev/md0

2) RAID-5 Test Case

Repeat the data protection verification system tests from #1.

a) Create a 3+1 RAID5 volume using MDADM (Multi-Device Administration) as follows:

    mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/ram0 /dev/ram1 /dev/ram2 /dev/ram3

Test the new device /dev/md1 like any block device by running:

    dd if=/dev/md1 of=/dev/null bs=4k count=1000
    dd if=/dev/zero of=/dev/md1 bs=4k count=1000

Produce a report on the observed read and write performance compared to running the same tests on a simple RAM disk that is not part of a RAID volume (e.g. /dev/ram4). Note that “iostat”, a utility from “sysstat”, may be useful for measuring performance (I/Os per second). How much does the parity computation slow down read or write operations on a RAM disk? Verify that this is within expectation (which is?).

b) Build a file system on the XOR parity protected volume and test failure and recovery with MDADM as follows (the mount point /mnt/raid1 from test case 1 is re-used, so -p keeps mkdir from failing if it already exists):

    mke2fs -m 0 /dev/md1
    mkdir -p /mnt/raid1
    mount /dev/md1 /mnt/raid1
    dd if=/dev/zero of=/mnt/raid1/newfile bs=64k count=100
    cat /proc/mdstat
    mdadm --detail /dev/md1

Provide sample output of the status of your /dev/md1 that has your file system using “df”.
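When reading /proc/mdstat during these tests, the field to watch is the bracketed member-status string on the line below the array name: one “U” per working member, and “_” for a failed or missing one. The following is a minimal sketch of a helper that extracts that field. The function names md_status and md_is_degraded are hypothetical (they are not part of mdadm), and the optional second argument exists only so the parser can be tried on a saved copy of /proc/mdstat.

```shell
#!/bin/sh
# Sketch only: md_status and md_is_degraded are illustrative helper names,
# not mdadm commands.

# md_status ARRAY [FILE]
# Print the member-status field (e.g. "[UUUU]" or "[_UUU]") for ARRAY,
# e.g. "md1". Reads /proc/mdstat unless another FILE is given.
md_status() {
    awk -v md="$1" '
        # The array header line looks like: "md1 : active raid5 ..."
        $1 == md && $2 == ":" { found = 1; next }
        # The status field is the first bracket group made only of U and _
        found && match($0, /\[[U_]+\]/) {
            print substr($0, RSTART, RLENGTH)
            exit
        }
        # A blank line ends the entry for this array
        found && NF == 0 { exit }
    ' "${2:-/proc/mdstat}"
}

# md_is_degraded ARRAY [FILE]
# Returns 0 if any member is marked "_", 1 if all members are "U",
# and 2 if the array was not found.
md_is_degraded() {
    status=$(md_status "$@")
    [ -n "$status" ] || return 2
    case "$status" in
        *_*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

After sourcing the functions, a check such as `md_is_degraded md1 && echo "md1 is degraded"` can be run before and after each step of the failure test to record when the array leaves and re-enters the fully redundant state.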
Now continue with the following test commands to simulate failure of one of the four member devices in the RAID volume, writes to the volume in degraded mode, and subsequent recovery of the member after the writes:

    mdadm --manage --set-faulty /dev/md1 /dev/ram0
    mdadm --detail /dev/md1
    dd if=/dev/zero of=/mnt/raid1/newfile2 bs=64k count=100
    mdadm /dev/md1 -r /dev/ram0
    mdadm --detail /dev/md1
    mdadm /dev/md1 -a /dev/ram0
    mdadm --detail /dev/md1

Report on what you observe for the above sequence of simulated RAID member failure and recovery commands.

1. Verify that MDADM can handle intermittent loss of communication with a disk drive in the RAID5 volume.
2. Verify that MDADM rebuilds the data that was written to the file system on the RAID volume onto the faulty drive when it is added back to the volume.

To make sure you can re-use the test RAM disks for future test cases (so they are not busy), unmount the file system (the array cannot be stopped while it is mounted), then stop and remove your /dev/md1:

    umount /mnt/raid1
    mdadm --stop /dev/md1
    mdadm --remove /dev/md1

Testing Tips

Here are some tips for working with the Linux platform for this test:

1) You can list driver modules with “lsmod”.
2) Use “dmesg” to see printk and other kernel log messages.
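The failure and recovery sequence is identical in both test cases apart from the array and member device names, so it can be wrapped in one parameterized function to avoid copy-and-paste mistakes between /dev/md0 and /dev/md1. The following is a minimal sketch, not a required part of the procedure. The names run, fail_and_recover and the DRY_RUN variable are hypothetical; only the mdadm and dd commands themselves come from the steps above.

```shell
#!/bin/sh
# Sketch only: run, fail_and_recover and DRY_RUN are illustrative names.

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# fail_and_recover MD_DEVICE MEMBER MOUNT_POINT
# Marks MEMBER faulty, writes to the degraded volume, removes the member,
# adds it back, and prints the array detail after each change.
fail_and_recover() {
    md=$1
    member=$2
    mnt=$3
    run mdadm --manage --set-faulty "$md" "$member"          # simulate failure
    run mdadm --detail "$md"
    run dd if=/dev/zero of="$mnt/newfile2" bs=64k count=100  # degraded write
    run mdadm "$md" -r "$member"                             # remove member
    run mdadm --detail "$md"
    run mdadm "$md" -a "$member"                             # add member back
    run mdadm --detail "$md"
}
```

Setting DRY_RUN=1 before calling `fail_and_recover /dev/md1 /dev/ram0 /mnt/raid1` prints the seven commands without executing them, which is a safe way to confirm that no system disk appears in any argument before running the sequence for real as root.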