
Virginia Commonwealth University VCU Scholars Compass

Theses and Dissertations Graduate School

2012

Improve the Performance and Scalability of RAID-6 Systems Using Erasure Codes

Chentao Wu Virginia Commonwealth University

Follow this and additional works at: https://scholarscompass.vcu.edu/etd

Part of the Engineering Commons

© The Author

Downloaded from https://scholarscompass.vcu.edu/etd/2894

This Dissertation is brought to you for free and open access by the Graduate School at VCU Scholars Compass. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of VCU Scholars Compass. For more information, please contact [email protected].

© Chentao Wu, 2012

All Rights Reserved.

Improve the Performance and Scalability of RAID-6 Systems Using Erasure Codes

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Virginia Commonwealth University.

by

Chentao Wu
B.E., Huazhong University of Science and Technology, Computer Science and Technology
M.E., Huazhong University of Science and Technology, Engineering

Director: Dr. Xubin He, Associate Professor Department of Electrical and Computer Engineering

Virginia Commonwealth University
Richmond, Virginia
November 2012

Acknowledgements

I would like to thank my dissertation advisor, Dr. Xubin He, for his constant support and guidance at VCU over these years. He not only taught me academically and shared his profound knowledge of systems, but also encouraged me in my daily life like an older brother. I would like to thank Dr. Wei Zhang, Dr. Meng Yu, Dr. Robert H. Klenke and Dr. Preetam Ghosh for serving on my advisory committee.

The members of the Storage Technology and Architecture Research (STAR) Lab at VCU have been very supportive over these years; special thanks go to Mr. Guanying Wu, Dr. Shenggang Wan, Dr. Huailiang Tan, Mr. Chao Yu and Mr. Tao Lyu for their continuous help and support.

I would also like to thank Prof. Changsheng Xie, Dr. Qiang Cao, Dr. Jianzhong Huang, Dr. Zhihu Tan, Dr. Jiguang Wan, and others who gave me a lot of support in both academic research and daily life at the Wuhan National Laboratory for Optoelectronics (WNLO), Huazhong University of Science and Technology (HUST), Wuhan, China.

Finally, I would like to thank my parents and my younger brother for their understanding of my research. I would like to thank my wife Yizhe for her constant patience, love, and support in all of my endeavors over the last four years. Words cannot express my appreciation and gratefulness for finding you on this Earth and making the improbable a reality.

Contents

List of Tables
List of Figures
List of Algorithms
List of Symbols
Preface
Abstract

1 Introduction
    1.1 Motivations
    1.2 Problem Statement
    1.3 Background and Related Work
        1.3.1 Definitions
        1.3.2 Existing Erasure Codes
        1.3.3 Research on Performance in RAID Systems
        1.3.4 Research on Scalability in RAID Systems
    1.4 Research Approaches

2 H-Code
    2.1 Introduction
    2.2 H-Code
        2.2.1 Data/Parity Layout and Encoding of H-Code
        2.2.2 Construction Process
        2.2.3 Proof of Correctness
        2.2.4 Reconstruction
    2.3 Property Analysis
        2.3.1 Optimal Property of H-Code
        2.3.2 Partial Stripe Writes to Two Continuous Data Elements in H-Code
        2.3.3 Different Cases of Partial Stripe Writes to w Continuous Data Elements
    2.4 Performance Evaluation
        2.4.1 Evaluation Methodology
        2.4.2 Numerical Results
    2.5 Summary

3 HDP Code
    3.1 Introduction
    3.2 Summary on Load Balancing of Various Parities and Our Motivation
        3.2.1 Horizontal Parity (HP)
        3.2.2 Diagonal/Anti-diagonal Parity (DP/ADP)
        3.2.3 Vertical Parity (VP)
        3.2.4 Motivation
    3.3 HDP Code
        3.3.1 Data/Parity Layout and Encoding of HDP Code
        3.3.2 Construction Process
        3.3.3 Proof of Correctness
        3.3.4 Reconstruction Process
        3.3.5 Property of HDP Code
    3.4 Load Balancing Analysis
        3.4.1 Evaluation Methodology
        3.4.2 Numerical Results
    3.5 Reliability Analysis
        3.5.1 Fast Recovery on Single Disk
        3.5.2 Parallel Recovery on Double Disks
    3.6 Summary

4 Stripe-based Data Migration (SDM) Scheme
    4.1 Introduction
    4.2 Motivation
    4.3 SDM Scheme
        4.3.1 Priority Definition of Data/Parity Movements
        4.3.2 Layout Comparison
        4.3.3 Load Balancing Check in SDM
        4.3.4 Data Migration
        4.3.5 Data Addressing Algorithm
        4.3.6 Property of SDM
    4.4 Scalability Analysis
        4.4.1 Evaluation Methodology
        4.4.2 Numerical Results
        4.4.3 Analysis
    4.5 Summary

5 MDS Coding Scaling Framework (MDS-Frame)
    5.1 Introduction
    5.2 Motivation
    5.3 MDS-Frame
    5.4 Intermediate Code Layer
        5.4.1 Sp-Code
        5.4.2 Sp−1-Code and Sp+1-Code
        5.4.3 Observations
    5.5 Management Layer
        5.5.1 Unified Management User Interface
        5.5.2 Definition of the HE and LE Scaling
        5.5.3 Scaling Algorithms
        5.5.4 Overhead Analysis of Scaling between Intermediate Codes
        5.5.5 Scaling Rules
    5.6 Scalability Analysis
        5.6.1 Evaluation Methodology
        5.6.2 Numerical Results and Analysis
    5.7 Summary

6 Conclusions

Bibliography

Vita

List of Tables

2.1 50 Random Integer Numbers
2.2 Improvement of H-Code over Other Codes in terms of Average Partial Stripe Write Cost (Uniform Access)
2.3 Improvement of H-Code over Other Codes in terms of Average Partial Stripe Write Cost (Random Access)
2.4 Cost of a Partial Stripe Write to w Continuous Data Elements in the Same Row (p = 7, Uniform Access)
3.1 Summary of Different Parities
3.2 50 Random Integer Numbers
3.3 Different L Value in Horizontal Codes
3.4 Reduced I/O of Single Disk Failure in Different Codes
3.5 Recovery Time of Any Two Disk Failures in HDP Code (p = 5)
4.1 Summary on Various Fast Scaling Approaches
4.2 Priorities of Data/Parity Migration in SDM (For a Movement on a Single Data/Parity Element)
4.3 Different Overhead of Extended Disk(s) Labelling in SDM
4.4 Data Distribution of RAID-6 Scaling by Using SDM Scheme (RDP Code, Scaling from 6 to 8 Disks)
4.5 Data Distribution of RAID-6 Scaling by Using SDM Scheme (P-Code, Scaling from 6 to 7 Disks)
4.6 Overhead of RAID-6 Scaling by Using SDM Scheme (RDP Code, Scaling from 6 Disks to 8 Disks)
4.7 Overhead of RAID-6 Scaling by Using SDM Scheme (P-Code, Scaling from 6 Disks to 7 Disks)
4.8 Speedup of SDM Scheme over Other RAID-6 Scaling Schemes in terms of Migration Time
5.1 Overhead of Typical Scaling in MDS-Frame

List of Figures

1.1 EVENODD code for p + 2 disks when p = 5 and n = 7
1.2 Blaum-Roth code for p + 1 disks (p = 5, n = 6)
1.3 Liberation code for p + 2 disks (p = 5, n = 7)
1.4 RDP code for p + 1 disks (p = 5, n = 6)
1.5 X-Code for p disks (p = 5, n = 5)
1.6 P-Code
1.7 HoVer code for 6 disks (n = 6)
1.8 Reduce I/O cost when column 2 fails in RDP code (p = 7, n = 8)
1.9 Reduce I/O cost when column 2 fails in X-Code (p = 7, n = 7)
1.10 RAID-6 scaling in RDP from 6 to 8 disks using RR approach
1.11 RAID-6 scaling in P-Code from 6 to 7 disks using Semi-RR approach
1.12 RAID-5 scaling from 4 to 5 disks using ALV approach (all data blocks need to be migrated)
1.13 RAID-5 scaling from 4 to 5 disks using MDM approach
1.14 RAID-0 scaling from 3 to 5 disks using FastScale approach
2.1 Partial stripe write problem in RDP code for an 8-disk array (p = 7)
2.2 Partial stripe write problem in X-Code for a 7-disk array
2.3 H-Code (p = 7)
2.4 Reconstruction Process of H-Code
2.5 Partial stripe writes to two continuous data elements in H-Code for an 8-disk array
2.6 Average number of I/O operations of a partial stripe write to w continuous data elements of different codes with different values of w when p = 5
2.7 Average number of I/O operations of a partial stripe write to w continuous data elements of different codes with different values of w when p = 7
2.8 Maximum number of I/O operations of a partial stripe write to w continuous data elements of different codes with different values of w when p = 5
2.9 Maximum number of I/O operations of a partial stripe write to w continuous data elements of different codes with different values of w when p = 7
2.10 Average number of I/O operations in column j of a partial stripe write to three continuous data elements of different codes in an 8-disk array (p = 7, w = 3)
3.1 Load balancing problem in RDP code for an 8-disk array
3.2 Load balancing problem in P-Code for a 7-disk array
3.3 HDP Code (p = 7)
3.4 Reconstruction by two recovery chains
3.5 Write request distribution in Exchange trace (most write requests are 4KB and 8KB)
3.6 Metric of load balancing L under various codes (p = 5)
3.7 Metric of load balancing L under various codes (p = 7)
3.8 Metric of load balancing L under uniform SW requests with different p in various horizontal codes
3.9 Metric of load balancing L under uniform CW requests with different p in various horizontal codes
3.10 Single disk recovery in HDP Code when column 0 fails
3.11 Single disk recovery in HDP Code when column 0 fails
4.1 Data movement of Block 0 in Figure 1.10
4.2 RDP code for p + 1 disks (p = 7, n = 8)
4.3 Data/parity migration in Layout Comparison Step (RDP code, scaling from 6 to 8 disks)
4.4 Data/parity migration in Layout Comparison Step (P-Code, scaling from 6 to 7 disks)
4.5 Data/parity migration in load balancing check step (RDP code, scaling from 6 to 8 disks)
4.6 Data/parity migration in load balancing check step (P-Code, scaling from 6 to 7 disks)
4.7 I/O distribution in multiple stripes using SDM scheme
4.8 Comparison on data distribution under various RAID-6 scaling approaches
4.9 Comparison on data migration ratio under various RAID-6 scaling approaches
4.10 Comparison on parity modification ratio under various RAID-6 scaling approaches
4.11 Comparison on computation cost under various RAID-6 scaling approaches (the number of B XORs is normalized to 100%)
4.12 Comparison on total I/Os under various RAID-6 scaling approaches (the number of B I/O operations is normalized to 100%)
4.13 Comparison on migration time under various RAID-6 scaling approaches (the time B ∗ Tb is normalized to 100%)
5.1 MDS-Frame Architecture
5.2 S-Codes (p = 5)
5.3 Reconstruction by two recovery chains in Sp-Code
5.4 Total I/Os of various scaling processes in MDS-Frame vs. Round-Robin when p = 5 and p = 7 (the number of B I/O operations is normalized to 100%)
5.5 Computation cost of various scaling processes in MDS-Frame vs. Round-Robin when p = 5 and p = 7 (the number of B XORs is normalized to 100%)
5.6 Migration time of various scaling processes in MDS-Frame vs. Round-Robin when p = 5 and p = 7 (the time B ∗ Te is normalized to 100%)
5.7 Migration time in scaling HDP→EVENODD under various p (the number of B I/O operations is normalized to 100%)

List of Algorithms

2.1 Reconstruction Algorithm of H-Code
3.1 Reconstruction Algorithm of HDP Code
4.1 Data Addressing Algorithm of RDP Scaling from n to n + m Disks Using SDM Scheme (where n = p1 + 1, n + m = p2 + 1, p1 < p2, p1 and p2 are prime numbers)
4.2 Data Addressing Algorithm of P-Code Scaling from n to n + m Disks Using SDM Scheme (where n = p1 − 1, n + m = p2, p1 ≤ p2, p1 and p2 are prime numbers)
5.1 Add a New Code into MDS-Frame
5.2 Scaling Algorithm (Sp−1-Code→Sp-Code)

List of Symbols

n: number of disks in a disk array (before scaling)
m: scaled number of disk(s) (m is negative when disk(s) are removed)
k: number of data disks in a disk array (before scaling)
k′: number of data disks in a disk array (after scaling)
p: a prime number
i, r: row ID (before scaling)
i′: row ID (after scaling)
j: column ID or disk ID (before scaling)
j′: column ID or disk ID (after scaling)
B: total number of data blocks (data elements)
$C_{i,j}$: element at the ith row and jth column
f1, f2 (f1 < f2): two random failed columns with IDs f1 and f2
S: total number of stripes (before scaling)
S′: total number of stripes (after scaling)
$S_{id}$: stripe ID before scaling
$S'_{id}$: stripe ID after scaling
$\sum$: XOR operations between/among elements (e.g., $\sum_{j=0}^{5} C_{i,j} = C_{i,0} \oplus C_{i,1} \oplus \cdots \oplus C_{i,5}$)
$\langle \, \rangle_p$: modular arithmetic (e.g., $\langle i \rangle_p = i \bmod p$)
w: number of continuous data elements in a partial stripe write
$N_s$: total access frequency of all stripe writes in the ideal sequence
$F(C_{i,j})$: access frequency of a partial stripe write with beginning data element $C_{i,j}$
$P(C_{i,j})$: access probability of a partial stripe write with beginning data element $C_{i,j}$
$S_w(C_{i,j})$: number of I/O operations caused by a partial stripe write to w continuous data elements with beginning element $C_{i,j}$
$S_{max.}(w)$: maximum number of I/O operations of a partial stripe write to w continuous data elements
$S_{avg.}(w)$: average number of I/O operations of a partial stripe write to w continuous data elements
$S_w^j(C_{i,j})$: number of I/O operations in column j of a partial stripe write to w continuous data elements with beginning element $C_{i,j}$
$S_{avg.}^j(w)$: average number of I/O operations in column j of a partial stripe write to w continuous data elements
$O_j(C_{i,j})$: number of I/O operations to column j caused by a request to data element(s) with beginning element $C_{i,j}$
O(j): total number of I/O operations of requests to column j in a stripe
$O_{max}$, $O_{min}$: maximum/minimum number of I/O operations among all columns
L: metric of load balancing in various codes
$R_t$: average time to recover a data/parity element
$N_{avg.}$: average number of elements that can be recovered in a time interval $R_t$
P, Q: parity blocks (before scaling)
P′, Q′: parity blocks (after scaling)
x, x1, x2: three random integers
$n_s$: number of stripes in each stripe set
$n_e$: total number of data elements in each stripe before scaling
$n_{ec}$: total number of data elements in each column per stripe set
$R_d$: data migration ratio
$R_p$: parity modification ratio
$n_{io}$: total number of I/O operations
$T_b$: access time of a read/write request to a block
$T_m$: migration time
$w_s$: word size

A man provided with paper, pencil, and rubber, and subject to strict discipline, is in effect a universal Turing Machine.

Alan Mathison Turing

Preface

This dissertation is mainly based on the following publications, which will be referred to in the text by their Roman numerals.

I. Chentao Wu, Xubin He, Jizhong Han, Huailiang Tan, and Changsheng Xie. SDM: A Stripe-based Data Migration Scheme to Improve the Scalability of RAID-6. In: Proceedings of the International Conference on Cluster Computing 2012 (Cluster 2012), September 24–28, 2012, Beijing, China.

II. Chentao Wu and Xubin He. GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling. In: Proceedings of the 41st International Conference on Parallel Processing (ICPP 2012), September 10–13, 2012, Pittsburgh, USA.

III. Chentao Wu, Shenggang Wan, Xubin He, Guanying Wu, Xiaohua Liu, Qiang Cao, and Changsheng Xie. HDP Code: A Horizontal-Diagonal Parity Code to Optimize I/O Load Balancing in RAID-6. In: Proceedings of the 41st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2011), June 27–30, 2011, Hong Kong, China.

IV. Chentao Wu, Shenggang Wan, Xubin He, Qiang Cao, and Changsheng Xie. H-Code: A Hybrid MDS Array Code to Optimize Partial Stripe Writes in RAID-6. In: Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2011), May 13–16, 2011, Anchorage, USA.

V. Chentao Wu and Xubin He. A Flexible Framework to Enhance RAID-6 Scalability via Exploiting the Similarities among MDS Codes. Under review.

Abstract

IMPROVE THE PERFORMANCE AND SCALABILITY OF RAID-6 SYSTEMS USING ERASURE CODES

By Chentao Wu

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Virginia Commonwealth University.

Virginia Commonwealth University, 2012

Major Director: Dr. Xubin He, Associate Professor, Department of Electrical and Computer Engineering

RAID-6 is widely used to tolerate concurrent failures of any two disks to provide a higher level of reliability with the support of erasure codes. Among many implementations, one class of codes called Maximum Distance Separable (MDS) codes aims to offer data protection against disk failures with optimal storage efficiency. Typical MDS codes contain horizontal and vertical codes.

However, because of the limitations of the horizontal parity or diagonal/anti-diagonal parities used in MDS codes, existing RAID-6 systems suffer from several important problems in performance and scalability, such as low write performance, unbalanced I/O, and high migration cost in the scaling process.

To address these problems, in this dissertation we design techniques for high performance and scalable RAID-6 systems. These include high performance and load balancing erasure codes (H-Code and HDP Code) and a Stripe-based Data Migration (SDM) scheme.

We also propose a flexible MDS Scaling Framework (MDS-Frame), which can integrate H-Code, HDP Code and the SDM scheme together. Detailed evaluation results are also given in this dissertation.

Key Words: RAID-6; MDS Codes; Horizontal Parity; Diagonal/Anti-diagonal Parity; Performance Evaluation; Load Balancing; Scalability

Chapter 1

Introduction

1.1 Motivations

Redundant Arrays of Inexpensive (or Independent) Disks (RAID) [57, 16] is an efficient approach to providing high-reliability and high-performance storage services at acceptable spatial and monetary cost. In recent years, RAID-6 [73] has received much attention because it can tolerate concurrent failures of any two disks. It has been shown to be of increasing importance due to technology trends [18, 63] and the fact that the possibility of concurrent disk failures increases [72, 58] as the system scale grows.

Many implementations of RAID-6 are based on various erasure coding technologies, which aim to offer data protection against disk failures with a given amount of redundancy. They play a significant role in disk arrays, and many companies and major technology corporations are performing active research in this area, such as Allmydata [1], Cleversafe [17, 68], Data Domain [91], Network Appliance [3], Panasas [4, 79], Hewlett Packard [83, 31], IBM [34, 35, 33, 48, 9], Microsoft [41, 42, 38, 39, 54, 49, 40], etc. Several academic projects also focus on this aspect, such as LoCI [7], Oceanstore [50, 69], Pergamum [74], RAIF [47], RAIN [12], and so on.

Maximum Distance Separable (MDS) codes [67, 11, 8, 18, 10, 61, 85, 86, 45, 24, 15] are optimal erasure codes; they are the most popular erasure codes and offer optimal storage efficiency. Typical MDS codes can be further categorized into horizontal codes [67, 11, 8, 18, 10, 61] and vertical codes [85, 86, 45, 15].

However, with users' increasing requirements, RAID-6 faces several significant problems in performance and scalability. These problems include:

1. Poor write performance. Compared to RAID-0 or RAID-5, RAID-6 has much lower write performance, especially for single writes and partial stripe writes [19, 51, 46, 13, 30].

2. Unbalanced I/O distribution. All existing horizontal codes [67, 11, 8, 18, 10, 61] and some vertical codes [45] have dedicated parity or data disks, and suffer unbalanced I/O on these disks. Unbalanced I/O can significantly decrease the performance of a RAID system [26, 71, 44, 5, 36, 6].

3. Difficulty in scaling efficiently. Fast scaling is a significant aspect of cloud computing [2]. Existing scaling approaches [29, 55, 88, 28, 89, 37, 90] are designed for RAID-0 or RAID-5 and cannot adapt to the various MDS coding methods in RAID-6. They cause high migration and computation cost, so the scaling process is very slow.

4. Poor flexibility [59]. Existing MDS codes [67, 11, 8, 18, 10, 61, 15, 86, 45] have no relationship with each other. It is difficult to combine these codes into a general purpose framework.

In summary, existing approaches and solutions are insufficient to support both high performance and high scalability for RAID-6 systems, which motivates us to design new techniques to improve performance and scalability.

1.2 Problem Statement

In this dissertation, we state our research purpose as: improve the performance and scalability of RAID-6 systems by investigating erasure codes. Our designs provide high write performance and I/O load balancing, and achieve high scalability through stripe-based data migration schemes. Finally, we integrate our techniques into a flexible framework to support various RAID-6 codes.

1.3 Background and Related Work

To improve the efficiency, performance, reliability and scalability of RAID-6 storage systems, different MDS coding approaches have been proposed. In this section we discuss existing MDS codes and related work on performance and scalability issues in RAID systems.

1.3.1 Definitions

Before introducing the background and related work, we give the following definitions for RAID-6 systems according to the previous literature [34, 77]:

• Element: The element is the fundamental unit for constructing codes. An element can be a single data block or span up to several data blocks. There are two types of elements: data elements and parity elements.

• Stripe: A stripe is a complete (connected) set of data and parity elements that are dependently related by parity computation relations [34]. Typically, a stripe is a matrix of elements in RAID-6.

• Stripe Size: The total number of data and parity elements in a stripe.

• Parity Chain: A parity chain includes a parity element and all data elements used to encode it.

• Parity Chain Length: The total number of data and parity elements in a parity chain.

• Horizontal Parity Chain (or Row Parity Chain): A parity chain whose elements all share the same row. The corresponding parity element is called a "horizontal parity element".

• Vertical Parity Chain: A parity chain whose elements are distributed across multiple rows. The corresponding parity element is a "vertical parity element".

• Diagonal/Anti-diagonal Parity Chain (two special types of vertical parity chains): A parity chain whose data elements follow a diagonal/anti-diagonal distribution. The corresponding parity element is a "diagonal/anti-diagonal parity element".

• Horizontal Code: An MDS code that includes horizontal parity chains and whose parity elements are all placed in dedicated parity columns.

• Vertical Code: An MDS code which is not a horizontal code.

1.3.2 Existing Erasure Codes

Many RAID-6 implementations have been presented based on various erasure coding technologies, which are divided into two categories: MDS codes and non-MDS codes.

Maximum Distance Separable (MDS) codes in RAID-6 can be further divided into two subclasses: horizontal codes and vertical codes. Horizontal codes include Reed-Solomon codes [67], Cauchy Reed-Solomon codes [11], EVENODD code [8], RDP code [18], Blaum-Roth code [10], Liberation code [61] and Liber8tion code [60]. Vertical codes contain B-Code [85], X-Code [86], P-Code [45] and Cyclic code [15]. Typically, non-MDS codes involve LDPC codes [25, 53, 64], WEAVER code [33], HoVer codes [34], MEL code [83], Pyramid code [38], Flat XOR-Code [31], Code-M [77] and Partial-MDS Codes [9]. Besides these RAID-6 codes, there are other erasure codes, such as RSL-Code [21], RL-Code [22], STAR code [41, 42, 54], HoVer codes [34] and GRID codes [52], which can tolerate concurrent failures of three or more disks. Recently, research on erasure codes has turned to applications in cloud computing [49, 40].

Here we briefly overview several popular MDS and non-MDS codes.

Reed-Solomon codes [67, 32] are based on addition and multiplication operations over Galois Field arithmetic ($GF(2^{w_s})$), where $w_s$ is the word size and $2^{w_s} - 1$ is the maximum number of disks in a disk array. An addition operation is equivalent to an XOR operation, while multiplication is implemented with high complexity and high overhead.

Cauchy Reed-Solomon codes [11], originally described by a Binary Distribution Matrix (BDM) [61], are variants of Reed-Solomon codes that use a Cauchy matrix. They convert the multiplication operations of Reed-Solomon codes into additional XOR operations, which reduces part of the encoding and decoding cost but still requires a large number of XOR operations [65].
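To make the cost gap mentioned above concrete, the minimal Python sketch below (an illustration of standard Galois Field arithmetic, not code from this dissertation) contrasts addition and multiplication over $GF(2^{w_s})$: addition is a single XOR, while multiplication needs a shift-and-XOR reduction loop. The reduction polynomial 0x11d is an assumed example; any irreducible polynomial of degree $w_s$ would do.

```python
def gf_add(a, b):
    # Addition in GF(2^ws) is a bitwise XOR.
    return a ^ b

def gf_mul(a, b, ws=8, poly=0x11d):
    # Multiplication in GF(2^ws) via shift-and-XOR reduction.
    # poly is an irreducible polynomial of degree ws (0x11d is a common
    # choice for ws = 8, used here only as an example).
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << ws):
            a ^= poly
    return result

print(gf_add(0x53, 0xCA))   # one XOR
print(gf_mul(0x53, 0xCA))   # several shifts, tests and XORs per bit of b
```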

EVENODD code [8], shown in Figure 1.1, is a typical horizontal code and consists of horizontal and diagonal parity chains. It has a lower encoding/decoding cost than all variants of Reed-Solomon codes.

(a) Horizontal parity coding. (b) Diagonal parity coding.

Figure 1.1: EVENODD code for p + 2 disks when p = 5 and n = 7.

The special element $S = C_{0,4} \oplus C_{1,3} \oplus C_{2,2} \oplus C_{3,1}$ participates in the encoding of all diagonal parities.

Blaum-Roth, Liberation and Liber8tion codes (shown in Figures 1.2 and 1.3) [10, 61, 60, 62] are three separate "minimum density codes" [62], which provide near-optimal encoding/decoding performance and low overhead for updating a single data element.

(a) Horizontal parity coding. (b) Vertical parity coding.

Figure 1.2: Blaum-Roth code for p + 1 disks (p = 5, n = 6).

(a) Horizontal parity coding. (b) Vertical parity coding.

Figure 1.3: Liberation code for p + 2 disks (p = 5, n = 7).

RDP code [18], shown in Figure 1.4, retains the same horizontal parity chains as EVENODD while changing the diagonal layout. A special feature of RDP's diagonal chains is that most horizontal parities take part in the construction of the diagonal parities, which can decrease the reconstruction overhead.

(a) Horizontal parity coding. (b) Diagonal parity coding.

Figure 1.4: RDP code for p + 1 disks (p = 5, n = 6).

X-Code [86, 75] (shown in Figure 1.5) is a classic vertical code whose layout is made up of diagonal and anti-diagonal parity chains. It has optimal computational complexity and optimal update complexity.

(a) Diagonal parity coding. (b) Anti-diagonal parity coding.

Figure 1.5: X-Code for p disks (p = 5, n = 5).

P-Code [45] (shown in Figure 1.6) is another type of vertical code, like Cyclic code [15]. The stripe size of P-Code is $\frac{p-1}{2} \times n$, where n is the total number of disks in a disk array. P-Code has several optimal properties, such as optimal encoding/decoding computational complexity and optimal update complexity.

(a) Vertical parity coding for p − 1 disks (p = 7, n = 6). (b) Vertical parity coding for p disks (p = 7, n = 7).

Figure 1.6: P-Code.

HoVer codes [34] (shown in Figure 1.7) are a family of XOR-based erasure codes that can tolerate concurrent failures of two or more disks and provide implementation flexibility by adjusting their parameters.

(a) Horizontal parity coding. (b) Diagonal parity coding.

Figure 1.7: HoVer code for 6 disks (n = 6).

Code-M [77] is another non-MDS code; it gives another way to construct a parity chain by crossing multiple stripes, and can sharply reduce the recovery time compared to RDP.

1.3.3 Research on Performance in RAID Systems

Stripe Write Performance

Since the 1990s, several methods [19, 51, 46, 13, 30] have been proposed to improve the stripe write performance of disk arrays. In 1994, David et al. [19] found a way to reduce the penalty of a partial stripe write. Later, other methods [51, 46] appeared to improve the write performance of RAID-5 systems. Recently, high partial stripe write performance has been desired [13, 30] in higher-level software solutions for RAID systems. Typically, these methods are based on RAID-5 and may not have positive effects on the more complex layout of RAID-6.

9 Load Balancing Approaches

Research on dynamic load balancing started in the early 1990s. In 1993, Ganger et al. [26] discussed two load balancing approaches, disk striping and conventional data placement. Later, several dynamic load balancing approaches were proposed [71, 44, 5, 36, 6]. They can adjust the unbalanced I/O among data disks according to the various access patterns of different applications or workloads. Typically, these approaches spend some resources in the disk array to monitor the status of the storage devices and find the disks with high or low workload, and then make dynamic adjustments according to the monitoring results. In a RAID-6 storage system, however, it is hard to transfer the high workload of parity disks to any other disks, because doing so breaks the construction of the erasure code and thus damages the reliability of the whole storage system. On the other hand, typical load balancing approaches bring additional overhead to the disk array.

Industrial products based on RAID-6, such as EMC [20], use static parity placement to provide load balancing across every eight stripes, but this also incurs additional overhead to handle the data placement and still suffers unbalanced I/O within each stripe.

Disk Failure Recovery Performance

In recent years, research on disk failure recovery has become a hot topic. Tian et al. [76] propose a novel dynamic data reconstruction optimization algorithm based on access locality: frequently accessed areas have higher priority to be recovered. It can accelerate the recovery process in terms of user response time and reconstruction time. WorkOut [82] significantly boosts RAID reconstruction performance by outsourcing popular read/write requests originally targeted at the degraded RAID set to a surrogate RAID set during reconstruction. In 2010, Xiang et al. [84] gave a hybrid recovery approach called RDOR, which sharply decreases the recovery I/O cost by using both horizontal and diagonal parities to recover a single disk failure. Zhu et al. [92, 93] inherit the work of RDOR and propose a heterogeneous recovery scheme for various RAID-6 codes. VDF [78] improves the recovery efficiency via an asymmetric policy. Here the RDOR approach is briefly introduced; it will be discussed further in Chapter 3.

RDOR [84] is a hybrid recovery approach that uses both horizontal and diagonal parities to recover a single disk failure in RDP. It can minimize the I/O cost and has well-balanced I/O on the other disks (except the failed disk). For example, as shown in Figure 1.8, data elements A, B and C are recovered by their diagonal parity while D, E and F are recovered by their horizontal parity. With this method, some elements (e.g., $C_{3,0}$) can be shared to recover another parity chain, which reduces the number of read operations. Up to 22.60% of disk access time and 12.60% of recovery time are saved in the simulation [84].

Figure 1.8: Reduce I/O cost when column 2 fails in RDP code (p = 7, n = 8).

However, this approach has less effect on vertical codes like X-Code. As shown in Figure 1.9, data elements A, B, C and F are recovered by their anti-diagonal parity while D, E and G are recovered by their diagonal parity. Although X-Code recovers more elements than RDP in these examples, it shares fewer elements than RDP and is therefore less effective at reducing the I/O cost when a single disk fails.

Figure 1.9: Reduce I/O cost when column 2 fails in X-Code (p = 7, n = 7).

RDOR is an efficient way to recover from a single disk failure, but it cannot reduce the I/O cost when a parity disk fails. For example, as shown in Figure 1.8, when column 7 fails, nearly all data must be read and the total I/O cost cannot be reduced.

1.3.4 Research on Scalability in RAID Systems

Desired Scaling Features in RAID-6

To extend a disk array, some data need to be migrated to the new disk(s) to achieve a balanced data distribution. During data migration, we prefer to keep an evenly distributed workload and minimize the data/parity movement. Considering existing scaling approaches [90] and real cases in RAID-6, the following four features are typically desired:

• Feature 1 (Uniform Data Distribution): Each disk has the same amount of data blocks to maintain an even workload.

• Feature 2 (Minimal Data & Parity Migration): When scaling a RAID-6 system with k data disks storing B data blocks by m disks, the expected total number of data movements is $m \cdot B/(m + k)$ (scale-up, extending disks) or $|m| \cdot B/n$ (scale-down, removing disks), as illustrated by the sketch after this list.

• Feature 3 (Fast Data Addressing): The locations of blocks in the array can be efficiently computed.

• Feature 4 (Minimal Parity Computation & Modification): A movement of a data block can bring modification cost on its corresponding original parities and computation cost on new parities, so movements of data blocks should be limited to the original parity chain so that parity blocks can be retained without any change.
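As a quick illustration of Feature 2, the short Python sketch below (a hypothetical helper, not part of the dissertation's schemes) evaluates the expected number of data block movements from the formulas above; B, k, n and m are the symbols defined in the List of Symbols.

```python
def expected_data_movements(B, k, n, m):
    """Feature 2 (Minimal Data & Parity Migration).
    B: total data blocks; k: data disks before scaling;
    n: total disks before scaling; m: disks added (negative if removed)."""
    if m >= 0:
        return m * B / (m + k)     # scale-up: extending disks
    return abs(m) * B / n          # scale-down: removing disks

# Example: adding m = 2 disks to an array with k = 4 data disks and
# B = 12000 data blocks should move about 2 * 12000 / 6 = 4000 blocks.
print(expected_data_movements(12000, 4, 6, 2))
```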

Existing Fast Scaling Approaches

Existing approaches to improve the scalability of RAID systems include Round-Robin (RR) [29, 55, 88], Semi-RR [28], ALV [89], MDM [37], FastScale [90], etc. To clearly illustrate the various strategies in RAID-6, we use P/Q (e.g., $P_1$ and $Q_1$) to denote parity blocks before scaling and P′/Q′ (e.g., $P_1'$ and $Q_1'$) for parity blocks after scaling. If a parity block is still denoted by P/Q after scaling, it means that the parity is unchanged.

Traditional RR and Semi-RR approaches can be used in RAID-6 under two restrictions.

First, all data blocks are migrated based on round-robin order in the scaling process.

Second, all parity blocks are retained without any movement.

13 For a traditional RR scaling approach (as shown in Figure 1.10), obviously, all data blocks are migrated and all parities need to be modified after data migration. Although RR is a simple approach to implement on RAID-6, it brings high overhead.

Based on the RR approach, Brown [55] designed a reshape toolkit in the kernel (MD-Reshape), which writes mapping metadata within a fixed-size window. Due to the limitation of the RR approach, metadata are frequently updated by calling the MD-Reshape function, which is inefficient.

Figure 1.10: RAID-6 scaling in RDP from 6 to 8 disks using RR approach.

Semi-RR [28] was proposed to decrease the high migration cost of RR scaling, as shown in Figure 1.11. Unfortunately, when multiple disks are added, the data distribution is not uniform after scaling [28].

ALV [89] and MDM [37] are RAID-5 scaling approaches, shown in Figures 1.12 and 1.13, respectively. They cannot be applied to RAID-6. Different from RR-based approaches, ALV changes the movement order of the migrated data and aggregates the small I/Os, which improves the migration efficiency and the updating of metadata. MDM eliminates the parity modification/computation cost and decreases the migration cost.

As shown in Figure 1.14, FastScale [90] takes advantage of both the RR and Semi-RR approaches: it maintains a uniform data distribution, minimal data migration and


Figure 1.11: RAID-6 scaling in P-Code from 6 to 7 disks using Semi-RR approach.


Figure 1.12: RAID-5 scaling from 4 to 5 disks using ALV approach (all data blocks need to be migrated).


Figure 1.13: RAID-5 scaling from 4 to 5 disks using MDM approach.

fast data addressing. However, it only focuses on RAID-0 and doesn't support parity movement.

Figure 1.14: RAID-0 scaling from 3 to 5 disks using FastScale approach.

Besides the above scaling approaches, some RAID-based systems also focus on the scalability issue. In the 1990s, HP AutoRAID [81] permitted online expansion of a disk array. Later, several RAID-based architectures [43, 70] were proposed for large-scale storage systems, in which scalability is one of the most significant considerations. Brinkmann et al. [14] give a mathematical analysis of a storage system when several disks are added. Franklin et al. [23] introduce a good way to support the extension of RAID systems, but it needs an additional disk as spare space. Recently, with the support of different file systems, RAID-Z [13] and HDFS RAID [80] achieve acceptable scalability in distributed storage systems.

1.4 Research Approaches

In this dissertation, we design the following techniques for high performance and scalable

RAID-6 systems:

To improve the partial stripe write performance, we propose a new XOR-based MDS array code, named Hybrid Code (H-Code), which optimizes partial stripe writes for RAID-6 by taking advantage of both horizontal and vertical codes. H-Code is a solution for an array of (p + 1) disks, where p is a prime number. Unlike other codes, the horizontal parity of H-Code ensures that a partial stripe write to continuous data elements in a row shares the same row parity chain, which can achieve optimal partial stripe write performance. Not only within a row but also within a stripe, H-Code offers optimal complexity for partial stripe writes to two continuous data elements and, to the best of our knowledge, the best partial stripe write performance among all MDS codes.

To balance the I/O distribution in RAID-6 systems, we propose a new parity called Horizontal-Diagonal Parity (HDP), which takes advantage of both horizontal and diagonal/anti-diagonal parities. The corresponding MDS code, called HDP Code, distributes parity elements uniformly across the disks to balance the I/O workload. HDP also achieves high reliability by speeding up the recovery from single or double disk failures.

To improve the scalability of RAID-6, we design a novel Stripe-based Data Migration (SDM) scheme for large-scale storage systems based on RAID-6. SDM is a stripe-level scheme whose basic idea is to optimize data movements according to the future parity layout, which minimizes the overhead of data migration and parity modification. The SDM scheme also provides uniform data distribution, fast data addressing and fast migration.

To integrate these approaches into a RAID-6 system, we propose a novel MDS Code Scaling Framework (MDS-Frame), which is a unified management scheme over various MDS codes to achieve high scalability. MDS-Frame bridges various MDS codes to achieve flexible scaling via several intermediate codes.

The rest of the dissertation is organized as follows. We discuss H-Code and HDP Code in Chapters 2 and 3. Chapter 4 describes how to accelerate RAID-6 scaling by using the SDM scheme. We explain MDS-Frame in Chapter 5. Finally, we conclude this dissertation in Chapter 6.

Chapter 2

H-Code

2.1 Introduction

As introduced in Chapter 1, there are many implementations of RAID-6 based on various erasure coding technologies; one class of codes is called Maximum Distance Separable (MDS) codes [67, 11, 8, 18, 10, 61, 15, 86, 45, 24]. They offer optimal full stripe write (a new write or an update) complexity, but their complexity for partial stripe writes and single writes (also called "short writes", partial stripe writes to a single element) does not satisfy storage system designers [19, 51, 46, 13, 30].

A typical RAID-6 storage system (based on horizontal codes) is composed of k + 2 disk drives. The first k disk drives are used to store original data, and the last two, named P and Q, are used as parity disk drives. Due to the P parity, horizontal codes usually incur fewer I/O operations for a partial stripe write in a row, but they suffer from an unbalanced I/O distribution. Let us take the example of the RDP code [18], whose diagonal parity layout is shown in Figure 2.1(a). If there is a partial stripe write to 4 continuous elements ("continuous" means logically continuous among the disk arrays in encoding/decoding) in a row, as shown in Figure 2.1(b), it results in 4 reads and 4 writes to the Q parity disk (disk 7), but one read and one write to each of the other involved disks (disks 0, 1, 2, 3 and 6). As the system scale grows, this problem cannot be resolved by shifting the stripes' parity strips among all the disks as in RAID-5. In addition, horizontal codes are also limited by high single write complexity. Blaum et al. have proved that with an i-row-j-column matrix of data elements, at least (i ∗ j + j − 1) data elements participate in the generation of the Q parities [10]. Thus a single block write in horizontal codes requires more than two additional writes on average, where two additional writes is the lower bound for a theoretically ideal RAID-6 code.

(a) Diagonal parity layout of RDP code with p + 1 disks (p = 7): a diagonal element can be calculated by XOR operations among the corresponding elements, e.g., $C_{0,7} = C_{0,0} \oplus C_{5,2} \oplus C_{4,3} \oplus C_{3,4} \oplus C_{2,5} \oplus C_{1,6}$. (b) A partial stripe write to 4 continuous data elements: A, B, C and D. The I/O operations in disks 0, 1, 2, 3 and 6 are all 1 read and 1 write, while disk 7 takes 4 reads and 4 writes. This shows that the workload on disk 7 is very high, which may lead to a sharp decrease of the reliability and performance of the system.

Figure 2.1: Partial stripe write problem in RDP code for an 8-disk array (p = 7).

Vertical codes, such as X-Code [86], Cyclic code [15], and P-Code [45], typically offer good single write complexity and encoding/decoding computational complexity as well as high storage efficiency. With their special data/parity layouts, they do not adopt row parity. In vertical codes, a partial stripe write to multiple data elements in a row involves the generation of several different parity elements. Let us take the example of X-Code, whose anti-diagonal parity layout is shown in Figure 2.2(a), and consider the scenario described in Figure 2.2(b): a partial stripe write to two data elements in the first row of an X-Code based 7-disk RAID-6 matrix. We notice that the two elements are in four different parity chains. Therefore, this partial stripe write results in (2 + 4) reads and (2 + 4) writes, for a total of 12 I/O operations. If the code adopted row parity, the two elements might be in only three different parity chains. That would result in (2 + 3) reads and (2 + 3) writes, or 10 I/O operations in total. In other words, adopting row parity saves two I/O operations. We evaluate the partial stripe write complexity in general and only focus on short partial stripe writes, which affect no more than (n − 3) data elements for an array of n disks. Consider the following case: a partial stripe write to w continuous data elements in v parity chains. Typically, it results in one read and one write to each data element, and one read and one write to each parity element. Therefore, the number of disk I/O operations (denoted by $S_w$) is

(a) Anti-diagonal parity layout of X-Code with p disks (p = 7): an anti-diagonal element can be calculated by XOR operations among the corresponding data elements, e.g., $C_{5,0} = C_{0,2} \oplus C_{1,3} \oplus C_{2,4} \oplus C_{3,5} \oplus C_{4,6}$. (b) A partial stripe write to two continuous data elements: A and B. The I/O operations are (2 + 4) reads and (2 + 4) writes.

Figure 2.2: Partial stripe write problem in X-Code for a 7-disk array.

$$S_w = 2 \times (w + v) \qquad (2.1)$$
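The two X-Code scenarios above can be checked directly against Equation (2.1); the tiny Python sketch below (illustrative only, not part of the dissertation's evaluation code) simply evaluates $S_w = 2(w + v)$ for them.

```python
def partial_stripe_write_ios(w, v):
    # Equation (2.1): one read and one write for each of the w data elements
    # and for each of the v parity elements touched by the partial stripe write.
    return 2 * (w + v)

# Two data elements in four parity chains (X-Code, no row parity) vs.
# two data elements in three parity chains (with row parity):
print(partial_stripe_write_ios(2, 4))   # 12 I/O operations
print(partial_stripe_write_ios(2, 3))   # 10 I/O operations
```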

When the number of data elements affected by a partial stripe write in a row increases, the disk I/O saved by adopting a row parity also increases. A similar problem exists in Cyclic code [15] and P-Code [45]. In contrast, almost all horizontal codes adopt row parity, which can lower the partial stripe write cost. Jin et al. [45] mentioned a tweaked RDP code, which is a semi-vertical code composed of row parity and diagonal parity. However, because of its dedicated diagonal parity disk, it still suffers from the unbalanced I/O distribution caused by a partial stripe write to multiple data elements in a row, just like the horizontal codes.

To address these problems, we propose a novel XOR-based RAID-6 code, named Hybrid Code (H-Code), which takes advantage of both horizontal and vertical codes. The parities in H-Code are classical row parity and anti-diagonal parity. H-Code does not have a dedicated anti-diagonal parity strip; instead, it distributes the anti-diagonal parity elements among the disks in the array. Its horizontal parity ensures that a partial stripe write to continuous data elements in a row shares the same row parity chain, which achieves optimal partial stripe write performance. Depending on the number of disks in an MDS array, we design H-Code as a solution for n disks (n = p + 1), where p is a prime number.

The rest of the chapter is organized as follows. The design of H-Code is described in detail in Section 2.2. Property analysis and evaluation are given in Sections 2.3 and 2.4. The summary of this chapter is presented in Section 2.5.

2.2 H-Code

To overcome the shortcomings of vertical and horizontal MDS codes, we present a hybrid MDS code scheme, named H-Code, which takes advantage of both vertical and horizontal codes and is a solution for n disks (n = p + 1), where p is a prime number.

2.2.1 Data/Parity Layout and Encoding of H-Code

H-Code is represented by a (p−1)-row-(p+1)-column matrix with a total of (p−1)∗(p+1)

elements. There are three types of elements in the matrix: data elements, horizontal parity

elements, and anti-diagonal parity elements. Assume Ci,j (0 ≤ i ≤ p − 2, 0 ≤ j ≤ p)

represents the element at the ith row and the jth column. The last column (column p) is used for horizontal parity. Excluding the first (column 0) and the last (column p) columns, the remaining matrix is a (p − 1)-row-(p − 1)-column square matrix. H-Code uses the anti-diagonal part of this square matrix to represent anti-diagonal parity.

Horizontal parity and anti-diagonal parity elements of H-Code are constructed based on the following encoding equations.

Horizontal parity:

$$C_{i,p} = \sum_{j=0}^{p-1} C_{i,j} \quad (j \neq i+1) \qquad (2.2)$$

Anti-diagonal parity:

$$C_{i,i+1} = \sum_{j=0}^{p-1} C_{\langle p-2-i+j \rangle_p,\, j} \quad (j \neq i+1) \qquad (2.3)$$

23 Figure 2.3 shows an example of H-Code for an 8-disk array (p = 7). It is a 6-row-

8-column matrix. Column 7 is used for horizontal parity and the anti-diagonal elements

(C0,1, C1,2, C2,3, etc.) are used for anti-diagonal parity.

(a) Horizontal parity coding of H-Code: a horizontal element can be calculated by XOR operations among the corresponding data elements in the same row. For example, C0,7 = C0,0 ⊕ C0,2 ⊕ C0,3 ⊕ C0,4 ⊕ C0,5 ⊕ C0,6.

(b) Anti-diagonal parity coding of H-Code: an anti-diagonal element can be calculated by XOR operations among the corresponding data elements in all columns (except its column and column p). For example, C1,2 = C4,0⊕C5,1⊕C0,3⊕C1,4⊕C2,5⊕ C3,6.

Figure 2.3: H-Code (p = 7).

24 The horizontal parity encoding of H-Code is shown in Figure 2.3(a). We use different

shapes to indicate different sets of horizontal elements and the corresponding data elements.

Based on Equation 2.2, we calculate all horizontal elements. For example, the horizontal

element C0,7 can be calculated by C0,0 ⊕ C0,2 ⊕ C0,3 ⊕ C0,4 ⊕ C0,5 ⊕ C0,6. The element

C0,1 is not involved in this example because of j = i + 1.

The anti-diagonal parity encoding of H-Code is given in Figure 2.3(b). The anti- diagonal elements and their corresponding data elements are also distinguished by various shapes. According to Equation 2.3, the anti-diagonal elements can be calculated through modular arithmetic and XOR operations. For example, to calculate the anti-diagonal

element $C_{1,2}$ ($i = 1$), first we should get the proper data elements ($C_{\langle p-2-i+j \rangle_p,\, j}$). If $j = 0$, by using Equation 2.3, $p - 2 - i + j = 4$ and then $\langle 4 \rangle_p = 4$, so we get the first data element $C_{4,0}$. The following data elements which take part in the XOR operations can be calculated similarly

(the following data elements are C5,1, C0,3, C1,4, C2,5 and C3,6). Second, the corresponding anti-diagonal element (C1,2) is constructed by performing an XOR operation on these data elements, i.e., C1,2 = C4,0 ⊕ C5,1 ⊕ C0,3 ⊕ C1,4 ⊕ C2,5 ⊕ C3,6.
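To make Equations 2.2 and 2.3 concrete, here is a minimal Python sketch of H-Code encoding. It is our own illustration under stated assumptions, not the dissertation's implementation: it assumes each element is a NumPy uint8 array, that `data[i][j]` holds data element $C_{i,j}$, and that the entries of `data` at the anti-diagonal parity positions (j = i + 1) are simply ignored.

```python
import numpy as np

def encode_hcode(data, p):
    """Encode one H-Code stripe: (p-1) rows x (p+1) columns.
    Column p holds horizontal parity (Eq. 2.2) and C_{i,i+1} holds
    anti-diagonal parity (Eq. 2.3); everything else is data."""
    rows = p - 1
    stripe = [[None] * (p + 1) for _ in range(rows)]
    for i in range(rows):
        for j in range(p):
            if j != i + 1:                       # skip the parity slot of row i
                stripe[i][j] = data[i][j]

    for i in range(rows):
        hp = np.zeros_like(data[0][0])           # horizontal parity C_{i,p}
        adp = np.zeros_like(data[0][0])          # anti-diagonal parity C_{i,i+1}
        for j in range(p):
            if j == i + 1:
                continue
            hp ^= stripe[i][j]                               # Equation 2.2
            adp ^= stripe[(p - 2 - i + j) % p][j]            # Equation 2.3
        stripe[i][p] = hp
        stripe[i][i + 1] = adp
    return stripe

# Example for an 8-disk array (p = 7) with 4 KB elements:
p = 7
data = [[np.random.randint(0, 256, 4096, dtype=np.uint8) for _ in range(p)]
        for _ in range(p - 1)]
stripe = encode_hcode(data, p)
```

Note that the modular row index $\langle p-2-i+j \rangle_p$ in Equation 2.3 never points at a parity position when $j \neq i+1$, so the inner loop only ever reads data elements.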

2.2.2 Construction Process

Based on the above data/parity layout and encoding scheme, the construction process of

H-Code is straightforward.

• Label all data elements.

• Calculate both horizontal and anti-diagonal parity elements according to Equations

2.2 and 2.3.

25 2.2.3 Proof of Correctness

To prove that H-Code is correct, we consider one stripe. The reconstruction of multiple

stripes is just a matter of scale and similar to the reconstruction of one stripe. In a stripe,

we have the following lemma and theorem,

Lemma 2.1. We can find a sequence of two-integer tuples $(T_k, T_k')$, where

$$T_k = \left\langle\, p - 1 + \left\lceil \tfrac{k}{2} \right\rceil (f_2 - f_1) \,\right\rangle_p, \qquad
T_k' = \tfrac{1+(-1)^k}{2}\, f_1 + \tfrac{1+(-1)^{k+1}}{2}\, f_2 \qquad (k = 0, 1, \cdots, 2p-1),$$

with a prime number p and $0 < f_2 - f_1 < p$. The endpoints of the sequence are $(p-1, f_1)$ and $(p-1, f_2)$, and all two-integer tuples $(0, f_1), (0, f_2), \cdots, (p-1, f_1), (p-1, f_2)$ occur exactly once in the sequence. Similar proofs of this lemma can be found in many RAID-6 code papers such as [8, 18, 86, 45].
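Because the tuple formula was typeset poorly, the small Python check below (our own sketch, assuming the reconstructed form of the lemma given above) enumerates $(T_k, T_k')$ and verifies the two stated properties: the endpoints are $(p-1, f_1)$ and $(p-1, f_2)$, and every tuple appears exactly once.

```python
def lemma_tuple_sequence(p, f1, f2):
    """Enumerate (T_k, T_k') for k = 0 .. 2p-1 as in Lemma 2.1 (reconstructed form)."""
    seq = []
    for k in range(2 * p):
        col = f1 if k % 2 == 0 else f2            # T_k' alternates between f1 and f2
        row = (p - 1 + ((k + 1) // 2) * (f2 - f1)) % p   # (k + 1) // 2 == ceil(k / 2)
        seq.append((row, col))
    assert seq[0] == (p - 1, f1) and seq[-1] == (p - 1, f2)   # endpoints
    assert len(set(seq)) == 2 * p                             # each tuple occurs once
    return seq

print(lemma_tuple_sequence(5, 1, 3))
```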

Theorem 2.1. A (p − 1)-row-(p + 1)-column stripe constructed according to the formal

description of H-Code can be reconstructed under concurrent failures from any two

columns.

Proof. There are two cases of double failures, depending on whether column 0 fails or not.

Case I: column 0 doesn’t fail.

There are further two subcases, depending on whether the horizontal parity column fails

or not.

Case I-I: Double failures, one is from the horizontal parity column, and the other is

from the anti-diagonal parity.

From the construction of H-Code, no two of the lost data and parity elements are in the same parity chain. Therefore, each of them can be recovered through the anti-

diagonal parity chains. When all lost data elements are recovered, the horizontal parity

elements can be reconstructed using the above Equation 2.2.

Case I-II: Double failures of any two columns other than the horizontal parity of the stripe.

Each column j in the anti-diagonal parity part of the stripe intersects all horizontal parity chains except the horizontal parity chain in the (j − 1)th row. Therefore, each column misses a different horizontal parity chain.

First, we assume that there is a pseudo row under the last row of the H-Code matrix, as shown in Figure 2.4(a). Each element of this additional row is an all-zero-bit element and takes part in the generation of its anti-diagonal parity, so it does not change the original value of the anti-diagonal parity elements. Each additional all-zero-bit element participates only in the generation of the parity element of its own column. We assume that the two failed columns are f1 and f2, where 0 < f1 < f2 < p.

From the construction of H-Code, each horizontal parity chain in the ith row intersects all columns in the anti-diagonal part of the stripe except the (i + 1)th column. No two anti-diagonal parity elements are placed in the same row. For any two concurrently failed columns f1 and f2, the two horizontal parity chains that are not intersected by both columns are in the (f1 − 1)th row and the (f2 − 1)th row. Since each of these horizontal parity chains misses only one data element, the missing element can be reconstructed along that horizontal parity chain with its horizontal parity. Since the horizontal parity column does not fail, all horizontal parity elements are available, and we can start the reconstruction process from the data element on each of the two missing columns using the horizontal parity.

For the failed columns f1 and f2, if a data element $C_{i,f_2}$ in column f2 can be reconstructed from the horizontal parity of the horizontal parity chain in the ith row, we can reconstruct the missing data element $C_{\langle i+f_1-f_2 \rangle_p,\, f_1}$ on the same anti-diagonal parity chain if we have its anti-diagonal parity element. Similarly, if a data element $C_{i,f_1}$ in column f1 can be reconstructed from the horizontal parity of the horizontal parity chain in the ith row, we can reconstruct the missing data element $C_{\langle i+f_2-f_1 \rangle_p,\, f_2}$ on the same anti-diagonal parity chain if we have its anti-diagonal parity element.

From the above discussion, the two missing anti-diagonal parity elements are just the two endpoints mentioned in Lemma 2.1. If there are no missing parity elements, we start the reconstruction process from data element $C_{f_1-1,f_2}$ in the f2th column toward the corresponding endpoint (element $C_{p-1,f_1}$ in the f1th column). In this reconstruction process, all data elements can be reconstructed, and the reconstruction sequence follows the sequence of two-integer tuples in Lemma 2.1. Similarly, with no missing parity elements, we start the reconstruction process from data element $C_{f_2-1,f_1}$ in the f1th column toward the corresponding endpoint (element $C_{p-1,f_2}$ in the f2th column).

However, the two missing anti-diagonal parity elements are not reconstructed in this

case. After we start the reconstruction from the two data elements Cf1−1,f2 and Cf2−1,f1 ,

the missing anti-diagonal parity elements Cp−1,f1 and Cp−1,f2 cannot be recovered, because their corresponding horizontal parity and anti-diagonal parity are both missing. Actually, we do not need to reconstruct these two elements for they are not really in our code matrix.

In summary, all missing data elements are recoverable. After all data elements are recovered, we can reconstruct the two missing anti-diagonal parity elements.

28 Case II: column 0 fails.

This case is similar to Case I. The difference is that in subcase II-II, there is only one reconstruction sequence in the reconstruction process. This sequence starts at $C_{f_2-1,0}$, and all lost data elements can be recovered.

2.2.4 Reconstruction

We first consider how to recover a missing data element since any missing parity element

can be recovered based on Equations 2.2 and 2.3. If we save the horizontal parity element

and the related p − 2 data elements, we can recover the missing data element (assume it’s

Ci,f1 in column f1 and 0 ≤ f1 ≤ p − 1) using the following equation,

$$C_{i,f_1} = \sum_{j=0}^{p} C_{i,j} \quad (j \neq i+1 \ \text{and} \ j \neq f_1) \qquad (2.4)$$

If there exists an anti-diagonal parity element and its p − 2 data elements, to recover the

data element (Ci,f1 ), first we should find the corresponding anti-diagonal parity element.

Assume it is in row r and this anti-diagonal parity element can be represented by Cr,r+1

based on Equation 2.3, we have,

$$r = \langle p - 2 - i + f_1 \rangle_p \qquad (2.5)$$

Then, according to Equation 2.3, the lost data element can be recovered:

$$C_{i,f_1} = C_{r,r+1} \oplus \sum_{j=0}^{p-1} C_{\langle i-f_1+j \rangle_p,\, j} \quad (j \neq f_1 \ \text{and} \ j \neq \langle p-1-i+f_1 \rangle_p) \qquad (2.6)$$
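Equations 2.4–2.6 translate directly into two small recovery helpers. The sketch below is our own illustration (assuming the `stripe` layout and NumPy elements of the earlier encoding sketch, not the dissertation's code): it recovers one lost data element either along its horizontal parity chain or along its anti-diagonal parity chain.

```python
import numpy as np

def recover_via_horizontal(stripe, p, i, f1):
    """Recover data element C_{i,f1} from its horizontal parity chain (Eq. 2.4)."""
    elem = np.zeros_like(stripe[i][p])
    for j in range(p + 1):
        if j != i + 1 and j != f1:
            elem ^= stripe[i][j]
    return elem

def recover_via_antidiagonal(stripe, p, i, f1):
    """Recover data element C_{i,f1} from its anti-diagonal parity chain (Eqs. 2.5, 2.6)."""
    r = (p - 2 - i + f1) % p                       # row of parity element C_{r,r+1} (Eq. 2.5)
    elem = np.copy(stripe[r][r + 1])
    for j in range(p):
        if j != f1 and j != (p - 1 - i + f1) % p:  # the excluded column is r + 1
            elem ^= stripe[(i - f1 + j) % p][j]    # Equation 2.6
    return elem

# e.g., rebuild C_{2,4} of the stripe encoded in the earlier example (p = 7)
# in two independent ways and check they agree:
assert np.array_equal(recover_via_horizontal(stripe, p, 2, 4),
                      recover_via_antidiagonal(stripe, p, 2, 4))
```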

(a) Reconstruction by two recovery chains (there are double failures in columns 3 and 4): First we identify the two starting points of the recovery chains: data elements A and F. Second we reconstruct data elements according to the corresponding recovery chains until they reach the endpoints (data elements E and J). The next anti-diagonal elements after E and J do not exist ($C_{p-1,f_1}$ and $C_{p-1,f_2}$ in the proof of Theorem 2.1; we use two "X"s here), so the recovery chains end. The orders to recover data elements are: one is A→B→C→D→E, the other is F→G→H→I→J. Finally we reconstruct the anti-diagonal parity elements K and L according to Equation 2.3.

(b) Reconstruction by one recovery chain (there are double failures in columns 0 and 1): First we identify the starting point of the recovery chain: data element A. Second we reconstruct data elements according to the corresponding recovery chain until it reaches the endpoint (data element K). The next anti-diagonal element after K does not exist (we use an "X" here), so the recovery chain ends. The order to recover data elements is: A→B→C→D→E→F→G→H→I→J→K. Finally we reconstruct the anti-diagonal parity element L according to Equation 2.3.

Figure 2.4: Reconstruction Process of H-Code.

Based on Equations 2.2 to 2.6, we can easily recover the elements after a single disk failure. If two disks fail (for example, columns f1 and f2, 0 ≤ f1 < f2 ≤ p), then based on Theorem 2.1 we have the reconstruction algorithm of H-Code shown in Algorithm 2.1. As in the proof of Theorem 2.1, there are two cases in the reconstruction algorithm, depending on whether column 0 fails or not. Each has two subcases: subcases I-I and II-I cover the scenario where one failure involves the horizontal parity column, while in subcases I-II and II-II the failures do not involve the horizontal parity. Reconstruction examples for subcases I-II and II-II are shown in Figures 2.4(a) and 2.4(b), which are situations with different numbers of recovery chains.

I-II and II-II, failures don’t involve the horizontal parity. The reconstruction examples of subcases I-II and II-II are shown in Figure 2.4(a) and Figure 2.4(b), which are situations with different numbers of recovery chains.

30 Algorithm 2.1: Reconstruction Algorithm of H-Code

Step 1: Identify the double failure columns: f1 and f2 (f1 < f2). Step 2: Start reconstruction process and recover the lost data and parity elements. switch 0 ≤ f1 < f2 ≤ p do case I: f1 6= 0 (column 0 is saved) case I-I: f2 = p (horizontal parity column is lost) Step 2-I-IA: Recover the lost data elements in column f1. repeat

Compute the lost data elements (Ci,f1 , i 6= f1 − 1) based on Equations 2.5 and 2.6. until all lost data elements are recovered.

Step 2-I-IB: Recover the lost anti-diagonal parity element (Cf1−1,f1 ) based on Equation 2.3. Step 2-I-IC: Recover the lost horizontal parity elements in column f2. repeat

Compute the lost horizontal parity elements (Ci,f2 ) based on Equation 2.2. until all lost horizontal parity elements are recovered.

case I-II: f2 6= p (horizontal parity column is saved) Step 2-I-IIA: Compute two starting points (Cf2−1,f1 and Cf1−1,f2 ) of the recovery chains based on Equation 2.4. Step 2-I-IIB: Recover the lost data elements in the two recovery chains. Two cases start synchronously:

case starting point is Cf2−1,f1 repeat (1) Compute the next lost data element (in column f2) in the recovery chain based on Equations 2.5 and 2.6; (2) Then compute the next lost data element (in column f1) in the recovery chain based on Equation 2.4.

until at the endpoint of the recovery chain.

case starting point is Cf1−1,f2 repeat (1) Compute the next lost data element (in column f1) in the recovery chain based on Equations 2.5 and 2.6; (2) Then compute the next lost data element (in column f2) in the recovery chain based on Equation 2.4.

until at the endpoint of the recovery chain. Step 2-I-IIC: Recover the lost anti-diagonal parity element in column f1 and f2. repeat

Compute the lost anti-diagonal parity elements (Cf1−1,f1 and Cf2−1,f2 ) based on Equation 2.3. until all lost anti-diagonal parity elements are recovered.

case II: f1 = 0 (column 0 is lost) case II-I: f2 = p (horizontal parity column is lost) Step 2-II-IA: Recover the lost data elements in column 0. repeat Compute the lost data elements (Ci,0) based on Equations 2.5 and 2.6. until all lost data elements are recovered. Step 2-II-IB: Recover the lost horizontal parity elements in column f2. repeat

Compute the lost horizontal parity elements (Ci,f2 ) based on Equation 2.2. until all lost horizontal parity elements are recovered.

case II-II: f2 6= p (horizontal parity column is saved) Step 2-II-IIA: Compute the starting point of the recovery chain (Cf2−1,0) based on Equation 4. Step 2-II-IIB: Recover the lost data elements in the recovery chain. repeat (1) Compute the next lost data element in column f2 based on Equations 2.5 and 2.6; (2) Then compute the next lost data element in column 0 based on Equation 2.4. until at the endpoint of the recovery chain. Step 2-II-IIC: Recover the lost diagonal parity element in column f2.

Compute the lost anti-diagonal parity element (Cf2−1,f2 ) based on Equation 2.3.

31 2.3 Property Analysis

In this section, we first prove that H-Code shares some optimal properties as other vertical codes, including: optimal storage efficiency, optimal encoding/decoding computational complexity, optimal single write cost. Then, we prove that H-Code has optimal complexity of partial stripe write to two data elements in the same row. Furthermore, H-Code has optimal complexity of partial stripe write to two data elements in a stripe. Finally, we evaluate partial stripe write cost of different codes in two aspects: in the same row and across two rows.

2.3.1 Optimal Property of H-Code

Vertical codes have the optimal storage efficiency, optimal encoding/decoding computa- tional complexity, optimal single write complexity [86, 15, 45]. We will prove that H-Code shares the optimal property as other vertical codes.

Optimal Storage Efficiency

From the proof of H-Code’s correctness, H-Code is a MDS code. Since all MDS codes have optimal storage efficiency [86, 15, 45], H-Code is storage efficient. Because H-

Code doesn’t use a dedicated anti-diagonal strip, it will not suffer from the unbalanced

I/O distribution as evidenced by our results shown in Figure 2.10, which will be discussed in more detail later. By shifting the stripes’ horizontal parity strips among all the disks just as RAID-5, H-Code does not suffer from the intensive I/O operations on dedicated parity disk caused by random writes among stripes, either.

32 Optimal Encoding/Decoding Computational Complexity

From the construction of H-Code, to generate all the 2∗(p−1) parity elements in a (p−1)- row-(p + 1)-column constructed H-Code, each of the remaining (p − 1) ∗ (p − 1) data elements needs to take part into two XOR operations. Thus, the encoding computational complexity of H-Code is [2 ∗ (p − 1) ∗ (p − 1) − 2 ∗ (p − 1)]/[(p − 1) ∗ (p − 1)] XOR operations per data element on average. To reconstruct 2∗(p−1) failed elements in the case of double disk failures, it need use 2 ∗ (p − 1) parity chains. Every parity chain in H-Code has the same length of (p − 1). Thus, the decoding computational complexity of H-Code is (p − 3) XOR operations per lost element on average. P-Code [45] has already proved that an i-row-j-column constructed code with x data elements, has an optimal encoding computational complexity of (3x − i ∗ j)/x XOR operations per data element on average, and decoding computational complexity of (3x − i ∗ j)/(i ∗ j − x) XOR operations per lost element on average. Therefore, H-Code’s encoding/decoding computational complexity is optimal.

Optimal Single Write Property

From the construction of H-Code, each of the data elements takes part into the generation of two and only two parity elements. Therefore, a single write on one data element in H-

Code only causes one additional write on each of the two parity elements, which has been proved to be optimal in previous research [86, 15, 45].

33 2.3.2 Partial Stripe Writes to Two Continuous Data Elements in H-

Code

Now, we prove that the cost of any partial stripe write to two continuous data elements of

H-Code is optimal among all vertical lowest density MDS codes.

Theorem 2.2. Any two data elements in a stripe of a lowest density vertical MDS code are in at least three parity chains.

Proof. From previous literatures [86, 15, 45], it has been proved that, in a stripe of a lowest density MDS code, any one data element takes part into the generation of two and only two parity elements. Assume there are two data elements in two or less parity chains.

Consider the following case: both data elements are lost when double disk failures occur.

Now, these two lost data elements are unrecoverable, because all parity chains in the code include either messages of both of these two data elements or none of them. Thus, the assumption is invalid. Therefore, in a stripe of a lowest density MDS code, any two data elements should be in at least three parity chains.

From the construction, any two continuous data elements in the same row of H-Code

share a same horizontal parity and must not share the same anti-diagonal parity. In other

words, any two continuous data elements in the same row are in three different parity

chains including a horizontal parity chain and two different anti-diagonal parity chains.

From Equation 2.1, any partial stripe writes to two continuous data elements in a row of

H-Code causes 2 ∗ (2 + 3) = 10 I/O operations.

Furthermore, as shown in Figure 2.5, from the construction of H-Code, any two

continuous data elements across two different rows share the same anti-diagonal parity

34 and must not share a same horizontal parity. From Equation 2.1, in this condition H-Code

also causes 10 I/O operations per partial stripe write.

Figure 2.5: Partial stripe writes to two continuous data elements in H-Code for an 8-disk array. In this figure, a partial stripe write to data elements A and B in two different rows, and the other partial stripe write to data elements C and D in the same row. In these two cases, there are only 3 parity elements modified for each partial stripe write, which shows that our H-Code reduces partial stripe write cost and improves the performance of storage system.

In summary, any partial stripe write to two continuous data elements of H-Code are in

three different parity chains. This is the lowest bound we proved in Theorem 2.2. From

Equation 2.1 and Theorem 2.2, the cost of any partial stripe write to two continuous data

elements locally of H-Code is 10 I/O operations, which is optimal.

2.3.3 Different Cases of Partial Stripe Writes to w Continuous Data

Elements

There are two cases for a partial stripe write to w continuous data elements: one is the w

written continuous data elements in the same row, the other is these data elements across

rows.

35 The case of partial stripe writes to w continuous data elements (2 ≤ w ≤ n − 3) in the

same row is very simple, we only need to calculate the number of parity chains based on

Equation 2.1. For example, when a partial stripe write to w continuous data elements in

H-code, these data elements to be written share one horizontal parity chain, but are in w different anti-diagonal parity chains. According to Equation 2.1, the total I/O operations are (4w + 2).

However, for a partial stripe write to w continuous data elements (2 ≤ w ≤ n − 3),

there are many scenarios where partial stripe writes are crossing different rows (will be

discussed in Section 2.4), which are not as simple as partial stripe writes to two continuous

data elements. For example, as shown in Figure 2.1, if w = 3, the cost of a partial

stripe write to three elements C0,4C0,5C1,0 in RDP is different from a partial stripe write to C1,4C1,5C2,0.

2.4 Performance Evaluation

In this section, we give our evaluations to demonstrate the effectiveness of our H-Code for

partial stripe writes to w continuous data elements (2 ≤ w ≤ n − 3).

2.4.1 Evaluation Methodology

We compare H-Code with following popular codes in typical scenarios when p = 5 and p = 7 (P-Code has two variations, which are denoted by P-Code-1 and P-Code-2):

(1) Codes for p − 1 disks: P-Code-1 [45] and Cyclic code [15];

(2) Codes for p disks: X-Code [86] and P-Code-2 [45];

36 (3) Codes for p + 1 disks: H-Code and RDP code [18];

(4) Codes for p + 2 disks: EVENODD code [8].

To reflect the status of partial stripe writes among different codes, we envision an ideal

sequence to partial stripe writes as follows,

For each data element, it is treated as the beginning written element at least once in a

partial stripe write to w continuous data elements (2 ≤ w ≤ n − 3, including partial stripe

writes in the same row and across two rows). If there is no data element at the end of a

stripe, the data element at the beginning of the stripe will be written.

In our evaluation, a partial stripe write across different stripes is a little different from

the one in our ideal sequence, which need more I/O operations but has little effect on the

evaluation results. When p = 5 and p = 7 in different codes, the number of data elements

in a stripe is less than 7 ∗ 7 = 49.

Based on the description of ideal sequence, for H-Code shown in Figure 2.3 (which

has (p − 1)2 total data elements in a stripe), the ideal sequence to w partial stripe writes

is: C0,0C0,2 ··· , C0,2C0,3 ··· , C0,3C0,4 ··· , and so on, ······ , the last partial stripe write in this sequence is Cp−2,p−2C0,0 ··· .

We use two types of access patterns based on this ideal sequence,

(1) Uniform access. Each partial stripe write occurs only once, so each data element is written w times.

(2) Random access. Since the number of total data elements in a stripe is less than 49 when p = 5 and p = 7, we use 50 random numbers (ranging from 1 to 1000) generated by

a random integer generator [66] as the frequencies of partial stripe writes in the sequence

one after another. These 50 random numbers are shown in Table 3.2. For example, the first

37 Table 2.1: 50 Random Integer Numbers

221 811 706 753 34 862 353 428 99 502 969 800 32 346 889 335 361 209 609 11 18 76 136 303 175 71 427 143 870 855 706 297 50 824 324 212 404 199 11 56 822 301 430 558 954 100 884 410 604 253

number in the table,“221” is used as the frequency of the first stripe writes C0,0C0,2 ··· (for

H-Code).

F (Ci,j) and P (Ci,j) are used to denote the access frequency and the access probability of a partial stripe write to w continuous data elements starting from Ci,j, respectively. If

Ns is the total access frequency of all stripe writes in the ideal sequence, then,

F (Ci,j) X P (Ci,j) = , P (Ci,j) = 1 (2.7) Ns

For example, in uniform access, the access frequency and access probability of any stripe write are,

1 F (Ci,j) = 1,P (Ci,j) = (2.8) Ns

We evaluate H-Code and other codes in terms of the following metrics. The first metric is denoted by Sw(Ci,j), which is the number of I/O operations a partial stripe write to w continuous data elements starting from Ci,j. We define “average number of

I/O operations of a partial stripe write to w continuous data elements (Savg.(w))” to evaluate different codes. The smaller value of Savg.(w) is, the lower cost of partial stripe writes and the higher performance of storage system is. Savg.(w) can be calculated by,

38 X Savg.(w) = Sw(Ci,j) · P (Ci,j) (2.9)

2 For uniform access in H-Code, Ns = (p − 1) . According to Equation 2.8, the average number of I/O operations of the ideal sequence of partial stripe writes is calculated using the following equation,

p−2 "p−1 # P P Sw(Ci,j) i=0 j=0 S (w) = (j 6= i + 1) (2.10) avg. (p − 1)2

As described in Section 2.3.2 and Figure 2.5, it takes 10 I/O operations for any partial

stripe write to two continuous data elements. According to Equation 2.12, we have the

Savg.(w) value of H-Code,

10(p − 1)2 S (w = 2) = = 10 avg. (p − 1)2

According to Equation 2.1, for a random partial stripe write, the numbers of read and write I/O are the same, so we use average number of I/O operations to evaluate different codes.

Our next metric is the “maximum number of I/O operations of a partial stripe write to w continuous data elements (Smax.(w))”. It is the maximum number of I/O operations of all partial stripe writes in the sequence and calculated by,

Smax.(w) = max [Sw(Ci,j)] (2.11)

j To show the I/O distribution among different disks, we use another metric, Sw(Ci,j), to

denote the number of I/O operations in column j of a partial stripe write to w continuous

39 data elements starting from Ci,j. We define “average number of I/O operations in

j column j of a partial stripe write to w continuous data elements (Savg.(w))” as follows,

j X j Savg.(w) = Sw(Ci,j) · P (Ci,j) (2.12)

For H-Code, we have the following equation,

p−2 "p−1 # P P j Sw(Ci,j) i=0 j=0 Sj (w) = (j 6= i + 1) (2.13) avg. (p − 1)2

2.4.2 Numerical Results

In this subsection, we give the numerical results of H-Code compared to other typical

codes using above metrics. In the following figures and tables, due to the constraint of

2 ≤ w ≤ n − 3, the average/maximum number of I/O operations in some codes are not

available.

Average I/O Operations

First we calculate the average I/O operation counts (Savg.(w) values) for different codes

with various w and p shown in Figure 2.6 and Figure 2.7. The results show that H-Code

can reduce the cost of partial stripe writes, and thus improve the performance of storage

system.

We also summarize the costs in terms of I/O operations of our H-Code compared to

other codes, which are shown in Table 2.2 and 2.3. It is obvious that H-Code has the lowest

I/O cost of partial stripe writes. For uniform access, there is a decrease of I/O operations up

to 14.68% and 20.45% compared to RDP and EVENODD codes, respectively. For random

40 Average number of I/O operations Average number of I/O operations 19.2 18.99 20 20 18 18 17.33 17.2 16.28 16 15.88 15.93 16 16 14 14 12 11.73 11.52 11.87 11.80 11.63 12 12 10 10

8 8

4 4

0 0 w=2 w=3 w=4 w=2 w=3 w=4 H-Code EVENODD X-Code RDP H-Code EVENODD X-Code RDP

(a) uniform access (b) random access

Figure 2.6: Average number of I/O operations of a partial stripe write to w continuous data elements of different codes with different value of w when p = 5. In this figure, 5 disks are used for X-Code, 6 disks are used for H-Code and RDP code, and 7 disks are used for EVENODD code.

Table 2.2: Improvement of H-Code over Other Codes in terms of Average Partial Stripe Write Cost (Uniform Access)

w & p EVENODD code X-Code RDP code Cyclic code P-Code-1 P-Code-2 w = 2, p = 5 16.67% 14.75% 14.02% − − − w = 3, p = 5 12.50% 18.60% 11.84% − − − w = 2, p = 7 20.45% 15.04% 14.68% 9.09% 3.19% 3.85% w = 3, p = 7 18.32% 20.18% 12.83% 5.60% 2.30% 1.89% w = 4, p = 7 14.85% 22.21% 11.72% − − − w = 5, p = 7 10.46% 22.83% 10.20% − − − access, compared to RDP and EVENODD codes, H-Code reduces the cost by up to 15.54% and 22.17%, respectively.

Maximum I/O Operations

Next we evaluate the maximum I/O operations shown in Figure 2.8 and 2.9. It clearly shows that H-Code has the lowest maximum number of I/O operations compared to other coding methods in all cases.

41 Average number of I/O operations

30 28.51 24.57 24.5 23.14 24 22 21.14 20.39 17.14 17.54 18 18 16.06 14 14.83 14.33 14.27 12.57 11.77 11.72 12 10 11 10.33 10.4

6

0 w=2 w=3 w=4 w=5

H-Code EVENODD X-Code RDP Cyclic P-Code-1 P-Code-2

(a) uniform access

Average number of I/O operations

30 28.68 24.59 24.76 23.33 22 24 21.34 20.49 17.51 17.70 18 18 16.20 14 14.72 14.32 14.63 12.85 11.88 11.84 11.60 12 10 10.96 10.61

6

0 w=2 w=3 w=4 w=5

H-Code EVENODD X-Code RDP Cyclic P-Code-1 P-Code-2

(b) random access

Figure 2.7: Average number of I/O operations of a partial stripe write to w continuous data elements of different codes with different value of w when p = 7. In this figure, 6 disks are used for P-Code-1 and Cyclic code, 7 disks are used for X- Code and P-Code-2, 8 disks are used for H-Code and RDP code, and 9 disks are used for EVENODD code.

The above evaluations demonstrate that H-Code outperforms other codes in terms of average and maximum I/O operations. The reasons that H-Code has the lowest partial stripe write cost are: First, H-Code has a special anti-diagonal parity (the last data element in a row and the first data element in the next row share the same anti-diagonal parity) , which can decease the partial stripe write cost when continuous data elements are crossing rows like vertical codes. Second, we keep the horizontal parity similar to EVENODD and RDP codes, which is efficient for partial stripe writes to continuous data elements in

42 Table 2.3: Improvement of H-Code over Other Codes in terms of Average Partial Stripe Write Cost (Random Access)

w & p EVENODD code X-Code RDP code Cyclic code P-Code-1 P-Code-2 w = 2, p = 5 13.19% 15.75% 15.25% − − − w = 3, p = 5 12.12% 19.22% 14.00% − − − w = 2, p = 7 22.17% 15.82% 15.54% 8.76% 13.79% 5.75% w = 3, p = 7 20.04% 20.90% 13.58% 4.89% 2.23% 4.31% w = 4, p = 7 15.65% 22.84% 12.15% − − − w = 5, p = 7 10.53% 23.29% 11.15% − − −

Maximum number of I/O operations 20 20 18 18 18 18 16 16 16 14 12 12 10

8

4

0 w=2 w=3 w=4 H-Code EVENODD X-Code RDP

Figure 2.8: Maximum number of I/O operations of a partial stripe write to w continuous data elements of different codes with different value of w when p = 5. In this figure, 5 disks are used for X-Code, 6 disks are used for H-Code and RDP code, and 7 disks are used for EVENODD code. the same row. Therefore, our H-Code takes advantages of both horizontal codes (such as

EVENODD and RDP) and vertical codes (such as X-Code, Cyclic and P-Code).

Partial Stripe Write in the Same Row

Third, we evaluate the cost for a partial stripe write to w continuous data elements in the same row of H-Code and some other typical codes shown in Table 2.4. Compared to EVENODD and X-Code, H-Code reduces I/O cost by up to 19.66% and 16.67%, respectively.

43 Maximum number of I/O operations 30 30 26 26 24 24 24 24 22 22 20 20 18 18 18 16 16 16 16 14 12 12 12 12 12 10

6

0 w=2 w=3 w=4 w=5

H-Code EVENODD X-Code RDP Cyclic P-Code-1 P-Code-2

Figure 2.9: Maximum number of I/O operations of a partial stripe write to w continuous data elements of different codes with different value of w when p = 7. In this figure, 6 disks are used for P-Code-1 and Cyclic code, 7 disks are used for X- Code and P-Code-2, 8 disks are used for H-Code and RDP code, and 9 disks are used for EVENODD code. Table 2.4: Cost Of A Partial Stripe Write to w Continuous Data Elements in the Same Row (p = 7, Uniform Access)

H-Code EVENODD X-Code w = 2 10 12.44 12 w = 3 14 16.8 16 w = 4 18 20.5 20

From Table 2.4, we find that, for a partial stripe write within a row, H-code offers better partial stripe write performance compared to other typical codes. Due to the horizontal parity, H-code performs better than X-Code (e.g., 10 vs. 12 I/O operations when w = 2) in partial stripe write performance in the same row. H-Code has much lower cost than

EVENODD because of different anti-diagonal construction schemes.

I/O Workload Balance

As we mentioned before, H-Code doesn’t suffer from unbalanced I/O distribution which

j is an issue in RDP code. To verify this, we calculate the Savg.(w) values for H-Code and

RDP code as shown in Figure 2.10. It shows that for RDP code, the workload is not balance

(in disk 6 and disk 7 it is very high, especially in disk 7). However, our H-Code balances

44 the workload very well among all disks because of the dispersed anti-diagonal parity in different columns.

Average number of I/O operations

8 7.39

6.4

4.8

3.2 2.67 2.67 1.72 1.72 1.72 1.72 1.72 1.72 1.6 1 111111

0 01234567 Column number

H-Code RDP

Figure 2.10: Average number of I/O operations in column j of a partial stripe write to three continuous data elements of different codes in an 8-disk array (p = 7, w = 3).

2.5 Summary

In this chapter, we propose a Hybrid Code (H-Code), to optimize partial stripe writes for

RAID-6 in addition to take advantages of both horizontal and vertical MDS codes. H-Code is a solution for an array of (p+1) disks, where p is a prime number. The parities in H-Code include horizontal row parity and anti-diagonal parity, where anti-diagonal parity elements are distributed among disks in the array. Its horizontal parity ensures a partial stripe write to continuous data elements in a row share the same row parity chain to achieve optimal partial stripe write performance. Our theoretical analysis shows that H-Code is optimal in terms of storage efficiency, encoding/decoding computational complexity and single write complexity. Furthermore, we find H-Code offers optimal partial stripe write complexity to two continuous data elements and optimal partial stripe write performance among all

45 MDS codes to the best of our knowledge. Our H-code reduces partial stripe write cost by

up to 15.54% and 22.17% compared to RDP and EVENODD codes. In addition, H-Code achieves better I/O workload balance compared to RDP code.

46 Chapter 3

HDP Code

3.1 Introduction

As mentioned in previous chapters, a typical RAID-6 storage system based on horizontal codes is composed of k + 2 disk drives. The first k disk drives are used to store original data, and the last two are used as parity disk drives. Horizontal codes have a common disadvantage that k elements must be read to recover any one other element. Vertical codes have been proposed that disperse the parity across all disk drives, including X-Code [86],

Cyclic code [15], and P-Code [45]. All MDS codes have a common disadvantage that k elements must be read to recover any one other element. This limitation reduces the reconstruction performance during single disk or double disk failures.

However, most MDS codes based RAID-6 systems suffer from unbalanced I/O, especially for write-intensive applications. Typical horizontal codes have dedicated parities, which need to be updated for any write operations, and thus cause higher workload on parity disks. For example, Figure 3.1 shows the load balancing problem in RDP [18] and the horizontal parity layout shown in Figure 3.1(a). Assuming Ci,j delegates the element in

47 ith row and jth column, in a period there are six reads and six writes to a stripe as shown in

Figure 3.1(c). For a single read on a data element, there is just one I/O operation. However, for a single write on a data element, there are at least six I/O operations, one read and one write on data elements, two read and two write on the corresponding parity elements.

Then we can calculate the corresponding I/O distribution and find that it is an extremely unbalanced I/O as shown in Figure 3.1(d). The number of I/O operations in column 6 and 7 are five and nine times higher than column 0, respectively. It may lead to a sharp decrease of reliability and performance of a storage system.

Unbalanced I/O also occurs on some vertical codes like P-Code [45] consisting of a prime number of disks due to unevenly distributed parities in a stripe. For example, Figure

3.2 shows the load balancing problem in P-Code [45]. From the layout of P-Code as shown in Figure 3.2(a), column 6 goes without any parity element compared to the other columns, which leads to unbalanced I/O as shown in Figure 3.2(b) though uniform access happens.

We can see that column 6 has very low workload while column 0’s workload is very high

(six times of column 6). This can decrease the performance and reliability of the storage system.

Even if many dynamic load balancing approaches [26, 71, 44,5, 36,6] are given for disk arrays, it is still difficult to adjust the high workload in parity disks and handle the override on data disks. Although some vertical codes such as X-Code [86] can balance the

I/O but they have high cost to recover single disk failure.

The unbalanced I/O hurts the overall storage system performance and the original single/double disk recovery method has some limitation to improve the reliability. To address this issue, we propose a new parity called Horizontal-Diagonal Parity (HDP)

, which takes advantage of both horizontal parity and diagonal/anti-diagonal parity

48 (a) Horizontal parity layout of RDP code with (b) Diagonal parity layout of RDP code with prime+1 disks (p = 7). prime+1 disks (p = 7).

Number of Data Elements 12

9

6 4 3 3 3 222

0 012345 Column number

(c) Six reads and six writes to various data (d) The corresponding I/O distribution among elements in RDP (Data elements C0,0, C1,1, different columns (e.g., the I/O operations in C3,5, C4,0, C4,3 and C5,3 are read and other column 1 is 2 reads and 1 write, 3 operations in data elements C0,5, C2,1, C2,2, C3,4, C4,2 and total; the I/O operations in disk 6 is 6 reads and 6 C5,4 are written, which is a 50% read 50% write writes, 12 operations in total; the I/O operations mode). in disk 7 is 10 reads and 10 writes, 20 operations in total).

Figure 3.1: Load balancing problem in RDP code for an 8-disk array. In this figure, the I/O operations in columns 0, 1, 2, 3, 4 and 5 are very low, while in columns 6 and 7 are very high. These high workload in parity disks may lead to a sharp decrease of reliability and performance of a storage system. to achieve well balanced I/O. The corresponding code using HDP parities is called

Horizontal-Diagonal Parity Code (HDP Code) and distributes all parities evenly to achieve balanced I/O. Depending on the number of disks in a disk array, HDP Code is a solution for p − 1 disks, where p is a prime number.

49 Number of Data Elements 8

6

4

22222 2 11

0 0123456 Column number

(a) Vertical parity layout of P-Code with prime (b) I/O distribution among different columns disks (p = 7). when six writes to different columns occur (Continuous data elements C0,6, C1,0, C1,1, C1,2, C1,3, C1,4 and C1,5 are written, each column has just one write to data element and it is a uniform write mode to various disks).

Figure 3.2: Load balancing problem in P-Code for a 7-disk array. In this figure, the I/O operations in columns 0 and 5 are very high which also may lead to a sharp decrease of reliability and performance of a storage system.

The rest of this chapter continues as follows: Section 3.2 discusses the motivation. HDP

Code is described in detail in Section 3.3. Load balancing and reliability analysis are given in Section 3.4 and 3.5. Finally we conclude the chapter in Section 3.6.

3.2 Summary on Load Balancing of Various Parities and

Our Motivation

To find out the root of unbalanced I/O in various MDS coding approaches, we analysis the features of different parities, which can be classified into three categories: horizontal parity

(row parity), diagonal/anti-diagonal parity (special vertical parity) and vertical parity.

50 3.2.1 Horizontal Parity (HP)

Horizontal Parity is the most important feature for all horizontal codes, such as EVENODD

[8] and RDP [18], etc. The horizontal parity layout is shown in Figure 3.1(a) and can be

calculated by,

n−3 X Ci,n−2 = Ci,j (3.1) j=0

From Equation 3.1, we can see that typical HP can be calculated by some simple XOR

computations. For a partial stripe write to continuous data elements, it could have lower

cost than other parities due to these elements can share the same HP (e.g., partial write

cost to continuous data elements C2,1 and C2,2 shown in Figure 3.1(c)). Through the layout of HP, horizontal codes can be optimized to reduce the recovery time when data disk fails

[84].

However, there is an obvious disadvantage for HP, which is workload unbalance as introduced in Section 3.1.

3.2.2 Diagonal/Anti-diagonal Parity (DP/ADP)

Diagonal or anti-diagonal parity typically appears in horizontal codes or vertical codes, such as in RDP [18] and X-Code [86], which are shown in Figure 3.1(b) and 1.5 and keep balance well. The anti-diagonal parity of X-Code can be calculated by,

n−3 X C = C n−1,i k,hi−k−2in (3.2) k=0

From Equation 3.2, some modular computation to calculate the corresponding data elements (e.g., hi − k − 2in) are added compared to Equation 3.1.

51 For DP/ADP, it has a little effect to reduce the single disk failure as discussed in Section

1.3.3.

3.2.3 Vertical Parity (VP)

Vertical parity is normally in vertical codes, such as in B-Code [85] and P-Code [45].

Figure 3.2(a) shows the layout of vertical parity. The construction of vertical parity is a little more complex: first some data elements are selected as a set and then do the modular calculation. So the computation cost for a vertical parity is higher than other parities.

Except for some variations like P-Code shown in Figure 3.2(a), most vertical codes keep balance well. Due to the complex layout, vertical codes like P-Code are too hard to reduce the I/O cost of any single disk.

3.2.4 Motivation

Table 3.1 summarizes different parities, which all suffer from unbalanced I/O and the efficiency to reduce the I/O cost of single disk failure. To address these issues, we propose a new parity named HDP to take advantage of both vertical and horizontal parities to offer low computation cost, high reliability and balanced I/O. The layout of HDP is shown in

Figure 3.3(a). In next section we will discuss how to use HDP parity to build HDP code to achieve I/O balance and high reliability.

52 Table 3.1: Summary of Different Parities

Computation Workload Reduce I/O cost of Parities cost balance single disk failure a lot for data disks, HP very low unbalance little for parity disks DP/ADP medium balance some for all disks mostly balance VP high none (unbalance in P-Code)

3.3 HDP Code

To overcome the shortcomings of existing MDS codes, in this section we propose the HDP

Code, which takes the advantage of both both horizontal and diagonal/anti-diagonal parities

in MDS codes. HDP is a solution for p − 1 disks, where p is a prime number.

3.3.1 Data/Parity Layout and Encoding of HDP Code

HDP Code is composed of a p − 1-row-p − 1-column square matrix with a total number of

(p − 1)2 elements. There are three types of elements in the square matrix: data elements, horizontal-diagonal parity elements, and anti-diagonal parity elements. Ci,j (0 ≤ i ≤

p − 2, 0 ≤ j ≤ p − 2) denotes the element at the ith row and the jth column. Two

diagonals of this square matrix are used as horizontal-diagonal parity and anti-diagonal

parity, respectively.

Horizontal-diagonal parity and anti-diagonal parity elements of HDP Code are con-

structed according to the following encoding equations:

Horizontal parity:

p−2 X Ci,i = Ci,j (j 6= i) (3.3) j=0

53 Anti-diagonal parity:

p−2 C = P C i,p−2−i h2i+j+2ip,j j=0 (3.4)

(j 6= p − 2 − i and j 6= hp − 3 − 2iip)

Figure 3.3 shows an example of HDP Code for an 6-disk array (p = 7). It is a 6-row-

6-column square matrix. Two diagonals of this matrix are used for the horizontal-diagonal parity elements (C0,0, C1,1, C2,2, etc.) and the anti-diagonal parity elements (C0,5, C1,4,

C2,3, etc.).

(a) Horizontal-diagonal parity coding of HDP (b) Anti-diagonal parity coding of HDP Code: Code: a horizontal-diagonal parity element can an anti-diagonal parity element can be calculated be calculated by XOR operations among the by XOR operations among the corresponding corresponding data elements in the same row. data elements in anti-diagonal. For example, For example, C0,0 = C0,1 ⊕C0,2 ⊕C0,3 ⊕C0,4 ⊕ C1,4 = C4,0 ⊕ C5,1 ⊕ C0,3 ⊕ C2,5. C0,5.

Figure 3.3: HDP Code (p = 7).

The encoding of the horizontal-diagonal parity of HDP Code is shown in Figure

3.3(a). We use different icon shapes to denote different sets of horizontal-diagonal elements and the corresponding data elements. Based on Equation 3.3, all horizontal

54 elements are calculated. For example, the horizontal element C0,0 can be calculated by

C0,1 ⊕ C0,2 ⊕ C0,3 ⊕ C0,4 ⊕ C0,5.

The encoding of anti-diagonal parity of HDP Code is given in Figure 3.3(b). According

to Equation 3.3, the anti-diagonal elements can be calculated through modular arithmetic

and XOR operations. For example, to calculate the anti-diagonal element C1,4 (i = 1),

C j = 0 2i + j + 2 = 4 first we should fetch the proper data elements ( h2i+j+2ip,j). If , and

h4ip = 4, we get the first data element C4,0. The following data elements, which take part in XOR operations, can be calculated in the same way (the following data elements are

C5,1, C0,3, C2,5). Second, the corresponding anti-diagonal element (C1,4) is constructed by

an XOR operation on these data elements, i.e., C1,4 = C4,0 ⊕ C5,1 ⊕ C0,3 ⊕ C2,5.

3.3.2 Construction Process

According to the above data/parity layout and encoding scheme, the construction process of HDP Code is straightforward:

• Label all data elements.

• Calculate both horizontal and anti-diagonal parity elements according to Equations

3.3 and 3.4.

3.3.3 Proof of Correctness

To prove the correctness of HDP Code, we take the case of one stripe for example here. The reconstruction of multiple stripes is just a matter of scale and similar to the reconstruction of one stripe. In a stripe, we have the following lemma and theorem,

55 0 Lemma 3.1. We can find a sequence of a two-integer tuple (Tk,Tk) where

 1+(−1)k  k+1+ 2 Tk = p − 2 + 2 (f2 − f1) , p−1 0 1+(−1)k 1+(−1)k+1 Tk = 2 f1 + 2 f2 (k = 0, 1, ··· , 2p − 3)

with 0 < f2 − f1 < p − 1, all two-integer tuples (0, f1), (0, f2), ··· , (p − 2, f1), (p − 2, f2)

occur exactly once in the sequence. The similar proof of this lemma can be found in many

papers on RAID-6 codes [8, 18, 86, 45].

Theorem 3.1. A p − 1-row-p − 1-column stripe constructed according to the formal

description of HDP Code can be reconstructed under concurrent failures from any two

columns.

Proof. The two failed columns are denoted as f1 and f2, where 0 < f1 < f2 < p.

In the construction of HDP Code, any two horizontal-diagonal parity elements cannot

be placed in the same row, as well for any two anti-diagonal parity elements. For any two

concurrently failed columns f1 and f2, based on the layout of HDP Code, two data elements

Cp−1−f2+f1,f1 and Cf2−f1−1,f2 can be reconstructed since the corresponding anti-diagonal

parity element does not appear in the other failed column.

For the failed columns f1 and f2, if a data element Ci,f2 on column f2 can be

C reconstructed, we can reconstruct the missing data element hi+f1−f2ip,f1 on the same anti- diagonal parity chain if its corresponding anti-diagonal parity elements exist. Similarly,

a data element Ci,f1 in column f1 can be reconstructed, we can reconstruct the missing

C data element hi+f2−f1ip,f2 on the same anti-diagonal parity chain if its anti-diagonal parity element parity element exist. Let us consider the construction process from data element

Cf2−f1−1,f2 on the f2th column to the corresponding endpoint (element Cf1,f1 on the f1th

56 column). In this reconstruction process, all data elements can be reconstructed and the reconstruction sequence is based on the sequence of the two-integer tuple in Lemma 3.1.

Similarly, without any missing parity elements, we may start the reconstruction process

from data element Cf2−1,f1 on the f1th column to the corresponding endpoint (element

Cf2,f2 on the f2th column).

In conclusion, HDP Code can be reconstructed under concurrent failures from any two columns.

3.3.4 Reconstruction Process

We first consider how to recover a missing data element since any missing parity element can be recovered based on Equations 3.3 and 3.4. If the horizontal-diagonal parity element and the related p−2 data elements exist, we can recover the missing data element (assuming

it’s Ci,f1 in column f1 and 0 ≤ f1 ≤ p − 2) using the following equation,

p−2 X Ci,f1 = Ci,j (j 6= f1) (3.5) j=0

If there exists an anti-diagonal parity element and its p − 3 data elements, to recover the

data element (Ci,f1 ), first we should recover the corresponding anti-diagonal parity element.

Assume it is in row r and this anti-diagonal parity element can be represented by Cr,p−2−r based on Equation 3.4, then we have:

i = h2r + f1 + 2ip (3.6)

So r can be calculated by (k is an arbitrary positive integer):

57   (i − f + p − 2)/2 (i − f = ±2k + 1)  1 1  r = (i − f1 + 2p − 2)/2 (i − f1 = −2k) (3.7)    (i − f1 − 2)/2 (i − f1 = 2k)

According to Equation 3.4, the missing data element can be recovered,

p−2 C = C ⊕ P C i,f1 r,p−2−r h2i+j+2ip,j j=0 (3.8)

(j 6= f1 and j 6= hp − 3 − 2iip)

Figure 3.4: Reconstruction by two recovery chains. In this figure, there are double failures in columns 2 and 3. To recover the lost columns, first we identify the two starting points of recovery chain: data elements A and G. Second we reconstruct data elements according to the corresponding recovery chains until they reach the endpoints (data elements F and L).The orders to recover data elements are: one is A→B→C→D→E→F, the other is G→H→I→J→K→L.

Based on Equations 3.5 to 3.8, we can easily recover the elements upon single disk

failure. If two disks failed (for example, column f1 and column f2, 0 ≤ f1 < f2 ≤ p − 2), based on Theorem 3.1, we have our reconstruction algorithm of HDP Code shown in Algorithm 3.1.

58 Algorithm 3.1: Reconstruction Algorithm of HDP Code

Step 1: Identify the double failure columns: f1 and f2 (f1 < f2). Step 2: Start reconstruction process and recover the lost data and parity elements. switch 0 ≤ f1 < f2 ≤ p − 2 do

Step 2-A: Compute two starting points (Cp−1−f2+f1,f1 and Cf2−f1−1,f2 ) of the recovery chains based on Equations 3.6 to 3.8. Step 2-B: Recover the lost data elements in the two recovery chains. Two cases start synchronously:

case starting point is Cp−1−f2+f1,f1 repeat (1) Compute the next lost data element (in column f2) in the recovery chain based on Equation 3.5; (2) Then compute the next lost data element (in column f1) in the recovery chain based on Equations3.6 to 3.8. until at the endpoint of the recovery chain.

case starting point is Cf2−f1−1,f2 repeat (1) Compute the next lost data element (in column f2) in the recovery chain based on Equation 3.5; (2) Then compute the next lost data element (in column f1) in the recovery chain based on Equations 3.6 to 3.8. until at the endpoint of the recovery chain.

In Algorithm 3.1, we notice that the two recovery chains of HDP Code can be recovered synchronously, and this process will be discussed in detail in Section 3.5.2.

3.3.5 Property of HDP Code

From the proof of HDP Code’s correctness, HDP Code is essentially an MDS code, which has optimal storage efficiency [86, 15, 45].

3.4 Load Balancing Analysis

In this section, we evaluate HDP to demonstrate its effectiveness on load balancing.

59 3.4.1 Evaluation Methodology

We compare HDP Code with following popular codes in typical scenarios (when p = 5 and p = 7):

• Codes for p − 1 disks: HDP Code;

• Codes for p disks: P-Code [45];

• Codes for p + 1 disks: RDP code [18];

• Codes for p + 2 disks: EVENODD code [8].

For each coding method, we analyze a trace as shown in Figure 3.5. We can see that most of the write requests are 4KB and 8 KB in Microsoft Exchange application. Typically a stripe size is 256KB [20] and a data block size is 4KB, so single write and partial stripe write to two continuous data elements are dominant and have significant impacts on the performance of disk array.

Percentage (%)

35 32.57 31.32 28

21 17.78

14 8.46 4.73 7 3.49 0.87 0.78 0 1K 2K 4K 8K 16K 32K 64K 128K Request Size ()

Figure 3.5: Write request distribution in Microsoft Exchange trace (most write requests are 4KB and 8KB).

Based on the analysis of exchange trace, we select various types of read/write requests to evaluate various codes as follows:

60 • Read (R): read only;

• Single Write (SW): only single write request without any read or partial write to

continuous data elements;

• Continuous Write (CW): only write request to two continuous data elements

without any read or single write;

• Mixed Read/Write (MIX): mixed above three types. For example, “RSW” means

mixed read and single write requests, “50R50SW” means 50% read requests and

50% single write requests.

To show the status of I/O distribution among different codes, we envision an ideal sequence to read/write requests in a stripe as follows,

For each data element, it is treated as the beginning read/written element at least once.

If there is no data element at the end of a stripe, the data element at the beginning of the stripe will be written.

Based on the description of ideal sequence, for HDP Code shown in Figure 3.3 (which has (p − 3) ∗ (p − 1) total data elements in a stripe), the ideal sequence of CW requests to

(p − 3) ∗ (p − 1) data elements is: C0,1C0,2, C0,2C0,3, C0,3C0,4, and so on, ······ , the last partial stripe write in this sequence is Cp−2,p−3C0,0.

We also select two access patterns combined with different types of read/write requests:

• Uniform access. Each request occurs only once, so for read requests, each data

element is read once; for write requests, each data element is written once; for

continuous write requests, each data element is written twice.

61 Table 3.2: 50 Random Integer Numbers

221 811 706 753 34 862 353 428 99 502 32 800 969 346 889 335 361 209 609 11 18 76 136 303 175 71 427 143 870 855 706 297 50 824 324 212 404 199 11 56 822 301 430 558 954 100 884 410 604 253

• Random access. Since the number of total data elements in a stripe is less than

49 when p = 5 and p = 7, we use 50 random numbers (ranging from 1 to 1000)

generated by a random integer generator [66] as the frequencies of partial stripe

writes in the sequence one after another. These 50 random numbers are shown in

Table 3.2. For example, in HDP Code, the first number in the table,“221” is used as

the frequency of a CW request to elements C0,1C0,2.

We define Oj(Ci,j) as the number of I/O operations in column j, which is caused by a request to data element(s) with beginning element Ci,j. And we use O(j) to delegate the total number of I/O operations of requests to column j in a stripe, which can be calculated by,

X O(j) = Oj(Ci,j) (3.9)

We use Omax and Omin to delegate maximum/minimum number of I/O operations among all columns, obviously,

Omax = max O(j),Omin = min O(j) (3.10)

We evaluate HDP Code and other codes in terms of the following metrics. We define

“metric of load balancing (L)” to evaluate the ratio between the columns with the highest

62 I/O Omax and the lowest I/O Omin. The smaller value of L is, the better load balancing is.

L can be calculated by,

O L = max (3.11) Omin

For uniform access in HDP Code, Omax is equal to Omin. According to Equation 3.11,

O L = max = 1 (3.12) Omin

3.4.2 Numerical Results

In this subsection, we give the numerical results of HDP Code compared to other typical codes using above metrics.

First, we calculate the metric of load balancing (L values) for different codes with various p shown in Figure 3.6 and Figure 3.7. In the following figures, because there is no

I/O in parity disks, the minimum I/O Omin = 0 thus L = ∞. The results show that HDP

Code has the lowest L value and thus the best balanced I/O compared to the other codes.

L ∞ ∞ ∞ ∞ 9.8 9.6 10 9.1 8 8 7 7.3 6.3 6.5 6.4 5.8 5.3 5.6 6 5.1 4.9 4.2 4 2.6 2.8

2 11111 1.2 1.2 1.3 1.2

0 Uniform-R Uniform-SW Uniform-CW Uniform- Uniform- Random-R Random-SW Random-CW Random- Random- 50R50SW 50R50CW 50R50SW 50R50CW HDP Code EVENODD RDP

Figure 3.6: Metric of load balancing L under various codes (p = 5).

Second, to discover the trend of unbalanced I/O in horizontal codes, especially for uniform write requests, we also calculate the L value in RDP, EVENODD and HDP.

63 L ∞ ∞ ∞ ∞ 25 22.1 20 19.1 14.8 15.1 15 13.6 12 11 10.2 9.9 8.8 9.1 10 8.1 8 7.8 6.8 6.5 5 2.3 2.3 2 2.6 2.9 2.5 2.2 2.2 111111.5 1.8 1.8 1.8 1.2 1.7 1.2 0 Uniform-R Uniform-SW Uniform-CW Uniform-50R50SW Uniform-50R50CW Random-R Random-SW Random-CW Random-50R50SW Random-50R50CW HDP Code EVENODD RDP P-Code

Figure 3.7: Metric of load balancing L under various codes (p = 7).

Table 3.3: Different L Value in Horizontal Codes

Code Uniform SW requests Uniform CW requests 1 3p 5 1 RDP 2p − 4 + p−1 2 − 2 + 2p−2 EVENODD 2p − 2 2p − 3 HDP 1 1

For HDP Code, which is well balanced as calculated in Equation 3.12 for any uniform

request. For RDP and EVENODD codes, based on their layout, the maximum number

of I/O operations appears in their diagonal parity disk and the minimum number of I/O

operations appears in their data disk. By this method, we can get different L according to

various values of p. For example, for uniform SW requests in RDP code,

O 2(p − 1)2 + 2(p − 2)2 1 L = max = = 2p − 4 + (3.13) Omin 2(p − 1) p − 1

We summarize the results in Table 3.3 and give the trend of L in horizontal codes with different p values in Figure 3.8 and 3.9. With the increasing number of disks (p becomes larger), RDP and EVENODD suffer extremely unbalanced I/O while HDP code gets well balanced I/O in the write-intensive environment.

64 L 40

32

24

16

8

0 5 7 11P 13 17 19 RDP EVENODD HDP

Figure 3.8: Metric of load balancing L under uniform SW requests with different p in various horizontal codes.

L 40

32

24

16

8

0 5 7 11P 13 17 19 RDP EVENODD HDP

Figure 3.9: Metric of load balancing L under uniform CW requests with different p in various horizontal codes.

3.5 Reliability Analysis

By using special horizontal-diagonal and anti-diagonal parities, HDP Code also provides high reliability in terms of fast recovery on single disk and parallel recovery on double disks.

65 3.5.1 Fast Recovery on Single Disk

As discussed in Section 1.3.3, some MDS codes like RDP can be optimized to reduce the

recovery time when single disk fails. This approach also can be used for HDP Code to

get higher reliability. For example, as shown in Figure 3.10, when column 0 fails, not all

elements need to be read for recovery. Because there are some elements like C0,4 can be

shared to recover two failed elements in different parities. By this approach, when p = 7,

HDP Code reduces up to 37% read operations and 26% total I/O per stripe to recover single disk, which can decrease the recovery time thus increase the reliability of the disk array.

Figure 3.10: Single disk recovery in HDP Code when column 0 fails. In this figure, the corresponding data or parity elements to recover the failed elements are signed with the same character, e.g., element C0,4 signed with “AD” is used to recover the elements A and D.

We summarize the reduced read I/O and total I/O in different codes in Table 3.4. We can

see that HDP achieves highest gain on reduced I/O compared to RDP and X-Code. Based

on the layout of HDP Code, we also keep well balanced read I/O when one disk fails. For

66 Table 3.4: Reduced I/O of Single Disk failure in Different Codes

Code Reduced Read I/O Reduced Total I/O Optimized RDP (p = 5) 25% 16.7% Optimized X-Code (p = 5) 13% 12% Optimized HDP (p = 5) 33% 20% Optimized RDP (p = 7) 25% 19% Optimized X-Code (p = 7) 19% 14% Optimized HDP (p = 7) 37% 26% example, as shown in Figure 3.11, the remaining disks all 4 read operations, which are well balanced.

Figure 3.11: Single disk recovery in HDP Code when column 0 fails. In this figure, each remain column has 4 read operations to keep load balancing well. Data elements A, C and F are recovered by their horizontal-diagonal parity while B, D and E are recovered by their anti-diagonal parity.

3.5.2 Parallel Recovery on Double Disks

According to Figure 3.4 and Algorithm 3.1, HDP Code can be recovered by two recovery chains all the time. It means that HDP Code can be reconstructed in parallel when any

67 Table 3.5: Recovery Time of Any Two Disk Failures in HDP Code (p = 5)

failed columns total recovered 0, 1 0, 2 0, 3 1, 2 1, 3 2, 3 elements 6Rt 6Rt 4Rt 4Rt 6Rt 6Rt 8*6=48

two disks fail, which can reduce the recovery time of double disks failure thus improve

the reliability of the disk array. To efficiently evaluate the effect of parallel recovery, we

assume that Rt is the average time to recover a data/parity element, then we can get the

parallel recovery time of any two disks as shown in Table 3.5 when p = 5. Suppose Navg.

delegates the average number of elements can be recovered in a time interval Rt, the larger

of Navg. value is, the better parallel recovery on double disks and the lower recovery time

will be. And we can calculate Navg. value by the results shown in Table 3.5,

48 N = = 1.5 (3.14) avg. 6+6+4+4+6+6

We also get the Navg. = 1.43 when p = 7 in our HDP Code. It means that HDP can recover 50% or 43% more elements during the same time interval. It also shows that our

HDP Code reduces 33% or 31% total recovery time compared to most RAID-6 codes with single recovery chain as RDP [18], Cyclic [15] and P-Code [45].

3.6 Summary

In this chapter, we propose the Horizontal-Diagonal Parity Code (HDP Code) to op- timize the I/O load balancing for RAID-6 by taking advantage of both horizontal and diagonal/anti-diagonal parities in MDS codes. HDP Code is a solution for an array of

68 p − 1 disks, where p is a prime number. The parities in HDP Code include horizontal- diagonal parity and anti-diagonal parity, where all parity elements are distributed among disks in the array to achieve well balanced I/O. Our mathematic analysis shows that HDP

Code achieves the best load balancing and high reliability compared to other MDS codes.

69 Chapter 4

Stripe-based Data Migration (SDM)

Scheme

4.1 Introduction

In recent years, with the development of cloud computing, scalability becomes an important issue [2] which is urgently demanded by RAID systems due to following reasons,

1. By extending more disks, a disk array provides higher I/O throughput and larger

capacity [87]. It not only satisfies the sharp increasing on user data in various online

applications [27], but also avoids the extremely high downtime cost [56].

2. RAID-based architectures are widely used for clusters and large scale storage

systems, where scalability plays a significant role [43, 70].

3. The storage efficiency can be improved by adding more disks into an existing disk

array which also decreases the cost of the storage system.

70 However, existing solutions in disk arrays [29, 55, 88, 28, 89, 37, 90] are not suitable for

RAID-6 scaling process to add disks to an existing disk array. Researchers face a challenge to find an effective solution to scale RAID-6 systems based on MDS codes efficiently. First, existing approaches are proposed for general case in RAID-0 or RAID-5 [29, 55, 88, 28,

89, 37, 90], which cannot adopt various coding methods in RAID-6. For example, RDP and P-Code have different layouts of data and parity. Thus the scaling scheme should be designed according to the characteristic of RDP or P-Code, respectively. Second, typical scaling schemes are based on round-robin order [29, 55, 88, 89], which are not suitable for RAID-6 due to high overhead on parity migration, modification and computation. One of the reasons is that the parity layouts of RAID-6 codes are complex. Another reason is that the stripes are dramatically changed after scaling. For example, a movement on any data element may lead to up to eight additional I/O operations on its corresponding parity elements.

To address the above challenge, we propose a new Stripe-based Data Migration (SDM) scheme to accelerate RAID-6 scaling. Different from existing approaches, SDM exploits the relationships between the stripe layouts before and after scaling to make scaling process efficiently.

The rest of this chapter continues as follows: Section 4.2 discusses the motivation.

SDM scheme is described in detail in Section 4.3. Section 4.4 gives the quantitative analysis on scalability. Finally we conclude the chapter in Section 4.5.

71 Table 4.1: Summary on Various Fast Scaling Approaches

Features in Section 1.3.4 Name Used in RAID-6? Feature 1 Feature 2 Feature 3 Feature 4 √ √ RR × × conditionally √ Semi-RR × × × conditionally √ √ ALV × × × √ √ √ MDM × × √ √ √ √ FastScale × √ √ √ √ √ SDM

4.2 Motivation

We summarize the existing fast scaling approaches in Table 4.1. Although these fast scaling

approaches offer some nice features for RAID-0 or RAID-5, it is not suitable for RAID-6

due to the special coding methods, of which MDS codes are popular. Figures 1.4 and 1.6

show the parity layout of RDP [18] and P-Code [45], where there are several problems

regarding scalability. First, existing scaling approaches are difficult to be applied in these

codes. If we extend a RAID-6 disk array based on RDP from 6 to 8 disks, any new data cannot be migrated into the dedicated parity disks (columns 6 and 7). On the other hand, if we select P-Code, we needn’t care about the dedicated parity disks. Second, few strategies on reducing parity modification are involved in existing methods. Actually, for keeping parity consistency, too many parity elements have to be recalculated and modified because of the movements on their corresponding data elements in RAID-6. It causes high computation cost, modification overhead, and an unbalanced migration I/O.

In summary, existing scaling approaches are insufficient to satisfy the desired four features listed in Section 1.3.4, which motivates us to propose a new approach on RAID-6 scaling.

72 4.3 SDM Scheme

In this section, an Stripe-based Data Migration (SDM) scheme is designed to accelerate the

RAID-6 scaling. The purpose of SDM is to minimize the parity migration, modification

and computation according to a global point view on single/multiple stripe(s), not limited

to a migration on single data/parity element as Round-Robin [29, 55, 88].

To clearly illustrate the stripes before/after scaling, we define four types of stripes as

follows,

• Old Used Stripe (OUS): A used stripe before scaling.

• Old Empty Stripe (OES): An empty stripe before scaling.

• New Used Stripe (NUS): A used stripe after scaling.

• New Empty Stripe (NES): An empty stripe after scaling.

In our SDM scheme, an OUS/OES corresponds to a NUS/NES with the same stripe ID, respectively. SDM scheme takes the following steps,

1. Priority Definition: Define the different priorities for the movements on single data/parity.

2. Layout Comparison: Compare the layouts before and after scaling and find a cost- effective way to change the layout of a stripe. The data/parity movements in this step should have the highest priority.

3. Load Balancing Check: According to the data distribution in the stripes (after Step

2), select a small portion of stripes to balance the workload. The data/parity movements in this step can have acceptable overhead.

73 Table 4.2: Priorities of Data/Parity Migration in SDM (For A Movement on Single Data/Parity Element)

Number of modified Overhead of Parity Priority Parities (for Data/Parity Migration Computation parity consistency) & Modification Cost 1 none 2 I/Os none 2 one 4 I/Os 1 XOR 3 two 6 I/Os 2 XORs 4 three or more ≥ 8 I/Os 4 XORs

4. Data Migration: Based on the migration schemes in Steps 2 and 3, start data migration for each stripe.

Typically, without any special instructions, a data/parity block (in logical address view) corresponds to a data/parity element (in parity layout view) in a stripe. Due to the difference between horizontal and vertical codes, we design various scaling schemes and discuss them separately. In this section, we use RDP [18] (a typical horizontal code) and P-Code [45] (a typical vertical code) as examples to show how SDM works in RAID-6, scaling from 6 to

8 disks and from 6 to 7 disks, respectively. Their corresponding parity layouts are shown in Figures 1.4 and 1.6.

4.3.1 Priority Definition of Data/Parity Movements

Due to complex parity layouts in RAID-6, several scenarios on data movements should be considered. Due to this reason, we define the priorities of data/parity migration in SDM according to the number of modified parities as summarized in Table 4.2. Higher priority means lower overhead on total IOs and parity computation.

According to the various priorities in Table 4.2, we can measure whether a movement is efficient or not. For example, as shown in Figure 4.1, Block 0 is migrated from disk 2

74 to disk 0, and other blocks are retained without any movement. From horizontal parity’s

point of view (shown in Figure 4.1(a)), Block 0 also stays in the original horizontal parity chain and the corresponding parity P 0 needn’t be changed. On the other hand, considering the aspect of diagonal parity (shown in Figure 4.1(b)), Block 0 also shares the same parity chain with other blocks (e.g., Blocks 11, 14, P 1 and Q0), and the corresponding parity element could be retained. Therefore, the movement of Block 0 doesn’t change any parity

and has the highest priority (Priority 1) in the scaling process.

(a) Horizontal parity view. (b) Diagonal parity view.

Figure 4.1: Data movement of Block 0 in Figure 1.10.

4.3.2 Layout Comparison

Compared to the current and future parity layouts, we propose different rules for horizontal

and vertical codes,

Rules for Horizontal Codes

• (Parity Disks Labeling) Original parity disks are retained and the extending disk(s)

are used for data disk(s).

75 • (Extended Disk(s) Labeling) If m disk(s) are added into a disk array, the new disk(s)

are labeled as the m middle column(s) (in the middle of all data columns).

• (Rows Labeling) If an Old Used Stripe (OUS) contains nr rows, these rows are

labeled as the first nr rows in the New Used Stripe (NUS).

• (Special parity handling) If horizontal parities take part in the calculation of

diagonal/anti-diagonal parities, the priority of data/parity movement in the horizontal

chain is higher than that in diagonal/anti-diagonal parity chain. Conversely, if

diagonal/anti-diagonal parities take part in the calculation of horizontal parities, the

priority of data/parity movement in the diagonal/anti-diagonal chain is higher than

that in horizontal chain.

• (Data/Parity Migration) Select proper data/parity movements with the highest

priority.

For example, if we want to scale a RAID-6 array using RDP from 6 to 8 disks, compared to the layouts in Figures 1.4 and 4.2, we have the following strategies according to the above rules,

1.(Parity Disks Labeling) Columns 4 and 5 in Figure 1.4 are retained as parity columns, and the column IDs are changed to columns 6 and 7 in Figure 4.2.

2.(Extended Disk(s) Labeling) Two extended disks are regarded as data disks and labeled as columns 2 and 3, causing the lowest overhead as shown in Table 4.3.

3.(Rows Labeling) The row IDs in each Old Used Stripe (OUS) are retained. By extending more disks, the size of stripes becomes larger and contains several new rows, which are known as “phantom” rows [62] in the scaling process.

76 (a) Horizontal parity coding. (b) Diagonal parity coding.

Figure 4.2: RDP code for p + 1 disks (p = 7, n = 8).

Table 4.3: Different overhead of Extended Disk(s) Labelling in SDM

Extended disk(s) Minimal number of Migration labeling moved data/parity Cost columns 0 and 1 10 20 I/Os columns 1 and 2 6 12 I/Os columns 2 and 3 4 8 I/Os columns 3 and 4 4 8 I/Os columns 4 and 5 6 12 I/Os other cases ≥ 6 ≥ 12 I/Os

4.(Special parity handling) The priority of data/parity movement in the horizontal chain is higher than that in diagonal/anti-diagonal parity chain.

5.(Data/Parity Migration) Blocks 2, 6, 3 and 13 are selected to migrate as shown in

Figure 4.3, and all data blocks also share the same parities with the new parity layout.

Therefore, all parity blocks are retained and these movements have the highest priority.

Rules for Vertical Codes

• (Original Disks Labeling) Original disk IDs are retained and the extended disks are

used for data disks.

77 (a) Logical address view.

(b) Horizontal parity view. (c) Diagonal parity view.

Figure 4.3: Data/parity migration in Layout Comparison Step (RDP code, scaling from 6 to 8 disks).

• (Extended Disk(s) Labeling) By extending m disk(s) into a disk array, the new disks

are labeled as the last m columns.

• (Rows Labeling) If an Old Used Stripe (OUS) contains nr data rows, which are

labeled as the same row ID in the New Used Stripe (NUS). The dedicated parity

rows are labeled according to the new layout.

• (Data/Parity Migration) Select proper data/parity movements which have the highest

priority.

78 For example, if we want to scale a RAID-6 array using P-Code from 6 to 7 disks, compared to the layouts in Figure 1.6, we have the following strategies according to the above rules,

1.(Disks Labeling) As shown in Figure 4.4, original column IDs are retained. The new disk is labeled as Column 6.

2.(Rows Labeling) The row IDs in each Old Used Stripes (OUSs) are retained.

3.(Data/Parity Migration) Blocks 0 and 1 are selected to migrate as shown in Figure

4.4, which have the highest priority and three parities (P 3, P 4 and P 5) are modified in each stripe.

(a) Logical address view.

(b) Vertical parity view.

Figure 4.4: Data/parity migration in Layout Comparison Step (P-Code, scaling from 6 to 7 disks).

79 Table 4.4: Data Distribution of RAID-6 Scaling by Using SDM Scheme (RDP code, Scaling from 6 to 8 disks)

Stripe Total Data Column ID ID Elements 0 1 2 3 4 5 0 16 4 3 2 2 2 3 1 16 4 3 2 2 2 3 2 16 0 2 4 4 4 2 stripe set 48 8 8 8 8 8 8 (three stripes)

Table 4.5: Data Distribution of RAID-6 Scaling by Using SDM Scheme (P-Code, Scaling from 6 to 7 disks)

Stripe Total Data Column ID ID Elements 0 1 2 3 4 5 6 0 12 1 1 2 2 2 2 2 1 12 2 2 1 1 2 2 2 2 12 2 2 2 2 1 1 2 3 12 1 1 2 2 2 2 2 4 12 2 2 1 1 2 2 2 5 12 2 2 2 2 1 1 2 6 12 2 2 2 2 2 2 0 stripe set 84 12 12 12 12 12 12 12 (seven stripes)

4.3.3 Load Balancing Check in SDM

In the load balancing checking step, first we get the statistics of data distribution after layout comparison. For example, the data distribution in RDP and P-Code are shown in the second row (Stripe ID is 0) in the Tables 4.4 and 4.5. We notice that it is unbalanced distribution, where different column has various number of data elements.

For horizontal and vertical codes, we use different ways to get a uniform data distribution as follows,

• (Typically for horizontal codes) Most horizontal codes have unsymmetrical data and

parity distribution, a small portion of stripes will be sacrificed with a little higher

80 migration cost (called “sacrificed stripes”). In the load balancing checking step, the

portion of sacrificed stripes will be calculated and the migration scheme in these

stripes will be proposed.

• (Typically for vertical codes) Most vertical codes have symmetrical data and parity

distribution, the data elements in each stripes will be migrated alternately, and a small

portion of stripes will be chosen without any movement (called “retained stripes”).

To calculate the percentage of sacrificed/retained stripes, we define “Stripe Set” which includes ns stripes with uniform data distribution. Suppose nd0 is the total number of data disks after scaling, and ne is the total number of data elements in a stripe before scaling. ns

can be computed by,

lcm {nd0 , ne} ns = (4.1) ne

According to Equation4.1, in a stripe set, the total number of data elements in each

column (denoted by nec) is,

lcm {nd0 , ne} nec = (4.2) nd0

Typically, a small portion of stripes in each stripe set will be selected as sacrificed or

retained stripe and we can calculate the number of data elements in these stripes. For

lcm{16,6} example, as shown in Table 4.4, each stripe set contains three stripes (ns = 16 = 48/16 = 3), where the last one is served as sacrificed stripe. In each stripe set, each

lcm{16,6} column should contain 8 data elements (nec = 6 = 48/6 = 8). So based on the number of data elements in each stripe after layout comparison, we can get the data

distribution in the sacrificed stripe is 0, 2, 4, 4, 4, 2 (e.g., the number of data elements in

81 column 0 is: nec − 2 ∗ 4 = 8 − 8 = 0). Similarly, we also can get the stripe set size

lcm{12,7} for P-Code is 7 (ns = 12 = 84/12 = 7) and each column should contain 12 data

lcm{12,7} elements to maintain a uniform workload (nec = 7 = 84/7 = 12), if data elements are migrated alternately as shown in Table 4.5, the last stripe is used for retained stripe.

Finally, we summarize the migration process in the load balancing checking step based

on the priority level, shown in Figures 4.5 and 4.6. Due to space limit, in Figure 4.6, we

only give the migration process of the second stripe in each stripe set (Stripe 1 in Table

4.5). Obviously, in Tables 4.4 and 4.5, each stripe set has an even data distribution.

(a) Logical address view.

(b) Horizontal parity view. (c) Diagonal parity view.

Figure 4.5: Data/parity migration in load balancing check step (RDP code, scaling from 6 to 8 disks).

82 (a) Logical address view.

(b) Vertical parity view.

Figure 4.6: Data/parity migration in load balancing check step (P-code, scaling from 6 to 7 disks).

4.3.4 Data Migration

After the steps of layout comparison and load balancing check, a comprehensive migration strategy can be derived and then the system can start the data migration process. We notice that Feature 2 can be satisfied and have the following theorem,

Theorem 4.1. For any MDS code scaling from n disks to n+m disks by using SDM scheme, the total number of migrated data blocks is m ∗ B/(m + nd), where nd is the number of

data disks before scaling.

Here we only demonstrate this theorem is correct for RDP scaling from 6 to 8 disks

and P-Code scaling from 6 to 7 disks. The total number of migrated data blocks in the two

examples are B/3 and B/7, respectively.

83 Proof. First we demonstrate the case for RDP scaling from 6 disks to 8 disks. In each

stripe set, two stripes are migrated based on Figure 4.3, the remain one stripe is migrated

according to Figure 4.5, so the total number of data movements in each stripe set is 4 ∗

2 + 8 = 16. The total number of stripe set is B/48 and the total number of migrated data blocks is 16 ∗ B/48 = B/3. In this case, m = 2 and nd = 4, the total number of migrated data blocks is m ∗ B/(m + nd) = 4 ∗ B/(4 + 2) = B/3.

Second consider the other case for P-Code scaling from 6 disks to 7 disks. According to migration methods in Figures 4.4 and 4.6, the total number of data movements in each stripe set is 12. The total number of stripe set is B/84 and the total number of migrated data blocks is 12 ∗ B/84 = B/7. In this case, m = 1 and nd = 6, the total number of

migrated data blocks is also equal to m ∗ B/(m + nd) = 1 ∗ B/(6 + 1) = B/7.

Thus the theorem is correct.

4.3.5 Data Addressing Algorithm

The data addressing algorithms of the two examples on RDP and P-Code are shown in

Algorithms 4.1 and 4.2. SDM scheme satisfies fast addressing feature in Section 1.3.4. For

a disk array scaling from 6 to 8 then to 12 disks by using RDP code, our algorithms can be

used multiple times by saving the initialization information.

4.3.6 Property of SDM

From the above discussion, we can see that SDM satisfies the desired features 1-3 of RAID-

6 scaling defined in Section 1.3.4. SDM also satisfies the Feature 4: minimal modification

and computation cost of the parity elements, which is discussed in detail in Sections 4.4.

84 Algorithm 4.1: Data Addressing Algorithm of RDP Scaling from n to n + m disks Using SDM Scheme (where n = p1 + 1, n + m = p2 + 1, p1 < p2, p1 and p2 are prime numbers) Get or calculate the Sid, i, j, ns and nss value, then label the new disks with column IDs m m from n − 2 − 3 to n + 2 − 4 0 Sid = Sid; /*Stripe ID unchanged*/ k = Sid mod ns; if 0 ≤ k ≤ ns − nss − 1 (migrated stripes in layout comparison step) then if i + j ≤ p1 − 1 then i0 = i, j0 = j; end else i0 = i, j0 = j + m. end end if ns − nss ≤ k ≤ ns − 1 (sacrificed stripes in load balancing checking step) then if (i = 1, 3, 5, ··· , p1 − 1) && (j = 1, 3, 5, ··· , n + m − 3) then i0 = i, j0 = j; end else i0 = i, j0 = j + m. end end

Algorithm 4.2: Data Addressing Algorithm of P-Code Scaling from n to n+m disks Using SDM Scheme (where n = p1 − 1, n + m = p2, p1 ≤ p2, p1 and p2 are prime numbers) Get or calculate the Sid, i, j, ns and nrs value, then label the new disks with column IDs from n to n + m − 1 0 Sid = Sid; /*Stripe ID unchanged*/ k = Sid mod ns; if 0 ≤ k ≤ ns − nrs − 1 (migrated stripes in layout comparison and load balancing checking step) then if migrated data elements then 0 0 0 p1−1 distribute i and j based on round-robin order, where 0 ≤ i ≤ 2 , n ≤ j0 ≤ n + m − 1; end else i0 = i, j0 = j. end end if ns − nrs ≤ k ≤ ns − 1 (retained stripes in load balancing checking step) then i0 = i, j0 = j. end

85 4.4 Scalability Analysis

In this section, we evaluate the scalability of various MDS codes by using different approaches.

4.4.1 Evaluation Methodology

We compare SDM scheme to Round-Robin (RR) [29, 55, 88] and Semi-RR [28] approaches. ALV [89], MDM [37] and FastScale [90] cannot be used in RAID-6, so they are not evaluated.

We also propose an ideal fast scaling method as a baseline. The ideal case is based on the Feature 2 (Section 1.3.4) with minimal data movements to maintain a uniform workload in the enlarged new used stripe. We assume this case doesn’t involve any parity migration, modification and computation as in RAID-0. Because no movement in dedicate parity disks

(e.g., for RDP code), actually the number of ideal movements is m ∗ B/(m + nd), where nd is the number of data disks.

Several popular MDS codes in RAID-6 are selected for comparison,

1) Codes for p − 1 disks: P-Code (two variations are shown in Figure 1.6.[45] and

HDP (Paper III, introduced in Chapter3);

2) Codes for p disks: X-Code [86] and P-Code;

3) Codes for p + 1 disks: RDP code [18] and H-Code (Paper IV, introduced in Chapter

2);

4) Codes for p + 2 disks: EVENODD code [8].

86 Table 4.6: Overhead of RAID-6 Scaling by Using SDM Scheme (RDP code, Scaling from 6 disks to 8 disks)

Number of Number of Number of Stripe ID Data & Parity Modified Total I/Os XOR Movements Parities Calculations 0 4 none 8 I/Os none 1 4 none 8 I/Os none 2 8 16 48 I/Os 32 XORs stripe set 16 16 64 I/Os 32 XORs (three stripes)

Suppose the total number of data blocks in a disk array is B, the total number of stripes in a disk array before scaling is S, we can derive the relationship between these two parameters. For example, for RDP code when p = 5, B = 16S; when p = 7, B = 36S.

We define Data Migration Ratio (Rd) is the ratio between the number of migrated data/parity blocks and the total number of data blocks. Parity Modification Ratio (Rp) delegates the ratio between the number of modified parity blocks and the total number of data blocks, which is caused by the data/parity migration.

For example, if we choose the case RDP code(6, 2) using SDM scheme and according to the results are presented in Table 4.6,

 16S  Rd = = 33.3% 16S∗3 (4.3)  16S  Rp = 16S∗3 = 33.3%

We also can get the results for P-Code(6, 1) based on the digitals in Table 4.7,

 12S  Rd = = 14.3% 12S∗7 (4.4)  24S  Rp = 12S∗7 = 28.6%

87 Table 4.7: Overhead of RAID-6 Scaling by Using SDM Scheme (P-Code, Scaling from 6 disks to 7 disks)

Number of Number of Number of Stripe ID Data & Parity Modified Total I/Os XOR Movements Parities Calculations 0 2 4 12 I/Os 8 XORs 1 2 4 12 I/Os 8 XORs 2 2 4 12 I/Os 8 XORs 3 2 4 12 I/Os 8 XORs 4 2 4 12 I/Os 8 XORs 5 2 4 12 I/Os 8 XORs 6 none none none none stripe set 12 24 72 I/Os 48 XORs (seven stripes)

In RAID-6 scaling, each data or parity migration only cost two I/O operations, and the modification cost of each parity is two I/Os as well. So based on the data migration ratio

(Rd) and parity modification ratio (Rp), the total number of I/O operations is,

nio = 2 ∗ Rd ∗ B + 2 ∗ Rp ∗ B (4.5)

Based on Equation 4.5, by using SDM scheme, the total number of I/O operations for

RDP Code(6, 2) is 2 ∗ B ∗ 33.3% + 2 ∗ B ∗ 33.3% = 1.33B, the total number of I/Os for

P-Code(6, 1) is 2 ∗ B ∗ 14.3% + 2 ∗ B ∗ 28.6% = 0.86B.

If we ignore the computation time and assume the same time on a read or write request to a block (denoted by Tb), and suppose the migration I/O can be processed in parallel on each disk, the migration time Tm for RDP Code(6, 2) using SDM scheme is (I/O distribution is shown in Figure 4.7(a) where column 7 has the highest I/O and longest migration time cost),

Tm = 32STb/3 = 2BTb/3 (4.6)

88 Similarly, based on Figure 4.7(b), the migration time of P-Code(6, 1) using SDM scheme is (column 6 has the highest I/O and longest migration time cost),

Tm = 12STb/7 = BTb/7 (4.7)

Number of I/O operations Number of I/O operations 32 32 16

12 24 12 10 10 10 10 10 10

16 8

888888 8 4

0 0 0 01234567 0123456 Column number Column number

(a) RDP Code (every three stripes, scaling from 6 to (b) P-Code (every seven stripes, scaling from 6 to 7 8 disks). disks).

Figure 4.7: I/O distribution in multiple stripes using SDM scheme.

4.4.2 Numerical Results

In this section, we give the numerical results of scalability using different scaling approaches and various coding methods. In the following Figures 4.8 to 4.13, a two-integer tuple (n, m) denotes the original number of disks and the extended number of disks. For example, RDP (6, 2) means a RAID-6 scaling from 6 to 6 + 2 disks using RDP code.

Data Distribution

Regarding to data distribution, we use the coefficient of variation as a metric to examine whether the distribution is even or not as other approaches [28, 90]. The coefficient of variation delegates the standard deviation (a percentage on average). The small value of

89 the coefficient of variation means highly uniform distribution. From the introduction in

Section 4.2, Semi-RR suffers I/O load balancing problem, RR and SDM keep a uniform distribution.

The results are shown in Figure 4.8. We notice that semi-RR and SDM without load balancing checking causes extremely unbalanced I/O, which cannot satisfy Feature

1 (uniform distribution).

Coefficient Variation (%) 30 30 27 24 25 23 23 22 20 21 20 21 20 19 16 15 15 15 14 15 12 12 12 10 10 11 10 9 5 4 000000000000000000000000 0 RDP(6,2) RDP(8,4) P-Code(6,1) P-Code(6,4) HDP(4,2) HDP(6,4) H-Code(6,2) H-Code(8,4) X-Code(5,2) X-Code(7,4) EVENODD(7,2) EVENODD(9,4)

RR Semi-RR SDM (without load balancing check) SDM

Figure 4.8: Comparison on data distribution under various RAID-6 scaling approaches.

Data Migration Ratio

Second, we calculate the data migration ratio (Rd) among various fast scaling approaches under different cases as shown in Figure 4.9. Our SDM scheme has the approximate migration ratio compared to Semi-RR and the ideal case in RAID-0.

Data Migration Ratio (%) 100 100 100 100 100 100 100 100 100 100 100 100 100

80

60 45 41.7 42.9 41.8 40.9 40 40 40 40 40 40 40 40 37.5 39.2 33.3 33.3 33.3 33.3 33.3 33.3 36.3 36.3 36.3 36.3 36.3 36.3 40 28.6 28.6 28.6 28.6 28.6 28.6 20 14.3 14.3 14.3

0 RDP(6,2) RDP(8,4) P-Code(6,1) P-Code(6,4) HDP(4,2) HDP(6,4) H-Code(6,2) H-Code(8,4) X-Code(5,2) X-Code(7,4) EVENODD(7,2)EVENODD(9,4)

RR Semi-RR SDM Ideal (RAID-0)

Figure 4.9: Comparison on data migration ratio under various RAID-6 scaling approaches.

90 Parity Modification Ratio

Third, parity modification ratio (Rp) among various RAID-6 scaling approaches under different cases is presented in Figure 4.10. Compared to other schemes with the same p and m, SDM sharply decreases the number of modified parities by up to 96.2%.

Parity Modification Ratio (%)

800 743 743 764 764

600600 600 600 600 567 567 580 580

400400 400 400 400400 400 400 400 400 400 400 400

200 80 80 57.1 72.7 72.7 33.3 28.6 50 40.9 28.6 00000000000025 21.4 0 RDP(6,2) RDP(8,4) P-Code(6,1) P-Code(6,4) HDP(4,2) HDP(6,4) H-Code(6,2) H-Code(8,4) X-Code(5,2) X-Code(7,4) EVENODD(7,2)EVENODD(9,4)

RR Semi-RR SDM Ideal (RAID-0)

Figure 4.10: Comparison on parity modification ratio under various RAID-6 scaling approaches.

Computation Cost

We calculate the total number of XOR operations under various cases as shown in Figure

4.11. By using RR-based approaches, various codes have similar computation cost. SDM scheme decreases more than 80% computation cost compared to other approaches.

Computation Cost (%)

1600 14861486 15271527

12001200 1200 1200 1200 11331133 11601160

800800 800 800 800800 800 800 800 800 800 800 800

400 160 160 114 145 145 67 57 100 82 57 00000000000050 43 0 RDP(6,2) RDP(8,4) P-Code(6,1) P-Code(6,4) HDP(4,2) HDP(6,4) H-Code(6,2) H-Code(8,4) X-Code(5,2) X-Code(7,4) EVENODD(7,2)EVENODD(9,4)

RR Semi-RR SDM Ideal (RAID-0)

Figure 4.11: Comparison on computation cost under various RAID-6 scaling approaches (The number of B XORs is normalized to 100%).

91 Total Number of I/O Operations

Next, total number of I/O operations is calculated in these cases. If we use B as the baseline, the results of total I/Os are shown in Figure 4.12. By using SDM scheme,

72.7% − 91.1% I/Os are reduced.

Total Number of I/O Operations (%) 1800 1686 1727 1533 1558 1400 1400 1500 1333 1360 1267 1280 1200 1240 1200 1000 1000 1000 1000 1000 1000 880 873 873 900 828 857 857 600 240 250 229 227 300 167 165 200 133 130 86 118 136 67 80 29 80 67 80 57 73 57 73 57 73 0 RDP(6,2) RDP(8,4) P-Code(6,1) P-Code(6,4) HDP(4,2) HDP(6,4) H-Code(6,2) H-Code(8,4) X-Code(5,2) X-Code(7,4) EVENODD(7,2)EVENODD(9,4)

RR Semi-RR SDM Ideal (RAID-0)

Figure 4.12: Comparison on total I/Os under various RAID-6 scaling approaches (The number of B I/O operations is normalized to 100%).

Migration Time

Migration time is evaluated as shown in Figure 4.13 and summarized in Table 4.8.

Compared to other approaches, SDM performs well in multiple disks extension and decreases the migration time by up to 96.9%, which speeds up the scaling process to a factor of 32.

Migration Time (%)

11371137 1200 10851085

900 800800 800 800 733 733 760 760 600

300 233 233 162 162 140 140 143 143 145 100 100 91 91 67 50 57 17 10 14 14 26 10 3317 33 10 25 14 27 9 43 14 27 9 14 9 0 RDP(6,2) RDP(8,4) P-Code(6,1) P-Code(6,4) HDP(4,2) HDP(6,4) H-Code(6,2) H-Code(8,4) X-Code(5,2) X-Code(7,4) EVENODD(7,2)EVENODD(9,4)

RR Semi-RR SDM Ideal (RAID-0)

Figure 4.13: Comparison on migration time under various RAID-6 scaling approaches (The time B ∗ Tb is normalized to 100%).

92 Table 4.8: Speed Up of SDM Scheme over Other RAID-6 Scaling Schemes in terms of Migration Time

m & p RDP P-Code HDP H-Code X-Code EVENODD m = 2 10.9× − 7.1× 32.0× 3.3× 19.0× p = 5 m = 4 15.2× 3.8× 4.2× 29.6× 3.4× 7.8× p = 7

4.4.3 Analysis

From the results in Section 4.4.2, compared to RR, Semi-RR and ALV, SDM has great advantages. There are several reasons to achieve these gains. First, SDM scheme is a global management on multiple stripes according to the priorities of data movements, which can minimize the parity modification cost, computation cost, total I/Os, etc. Second, compared to other approaches, SDM scheme distributes the migration I/O more evenly among data and parity disks, which can accelerate the scaling process in parallel. That is why SDM has better effects in horizontal codes which suffer unbalanced I/O. Third, although SDM sacrifices a small portion of stripes in each stripe set, it helps SDM to maintains a uniform workload, which creates favorable conditions for the storage system after scaling. SDM also has potential to have positive impacts on migration by aggregating small I/Os as ALV

[89] and FastScale [90].

4.5 Summary

In this chapter, we have proposed a Stripe-based Data Migration (SDM) scheme to achieve high scalability for RAID-6. Our comprehensive mathematic analysis shows that SDM achieves better scalability compared to other approaches in the following aspects: 1) lower computation cost by reducing more than 80% XOR calculations; 2) less I/O operations by

93 72.7%-91.1%; and 3) shorter migration time and faster scaling process by a factor of up to

32.

94 Chapter 5

MDS Coding Scaling Framework

(MDS-Frame)

5.1 Introduction

With the popularity of cloud computing [2], existing RAID-6 systems are difficult to

meet these increasing requirements, where scalability plays a significant role. Fast scaling

process also decreases the downtime cost of computer systems [56, 90]. Previous solutions

[29, 55, 88] in RAID-6 are based on single MDS code, which has a large granularity on the

extending number of disks. For example, it is impossible for scaling an existing disk array

using RDP code from 6 to 7 disks. That’s because RDP code only supports p+1 disks (e.g.,

6 or 8 disks), where p is a prime number. Second, previous scaling approaches are focus on scale-up (adding disks), while scale-down (removing disks) is also important because removing inefficient disks can save energy consumption. Third, in a scaling process, most previous approaches [29, 55, 88, 28, 89, 37, 90] cannot estimate high overhead on disk

95 I/Os, such as parity modification. That’s reasonable because these approaches are focus on

RAID-0 or RAID-5, while RAID-6 has a more complex parity layout.

To address these challenges and to integrate techniques in previous chapters, in this

chapter, we propose an MDS Code Scaling Framework called MDS-Frame, which

establishes the relationships and enables bidirectional scaling among various MDS codes.

It consists of three layers: a management layer, an intermediate code layer and an MDS

code repository.

The rest of this paper continues as follows: Section 5.2 discusses the motivation of our

work. MDS-Frame is described in Section 5.3 to 5.5. Section 5.6 gives the quantitative

analysis on scalability. Finally we conclude this chapter in Section 5.7.

5.2 Motivation

Besides the features discussed in Section 1.3.4, two new features need to be added to

address the challenges in Section 5.1.

• Feature 5 (Bidirectional Scaling): A desired scaling approach should support both

scale-up (adding disks) and scale-down (removing disks).

• Feature 6 (Minimal Scaling Granularity): The minimum number of adding/removing

disk(s) could be one.

Based on SDM scheme (introduced in last chapter), it is possible to scaling from one code to another to avoid phenomena on the scalability problem, which motivates us to investigate the similarities among these codes and thus speed up the scaling process.

96 5.3 MDS-Frame

To overcome the scalability problem of existing MDS codes, we present an MDS Code

Scaling Framework (MDS-Frame), which is a unified management on MDS codes and

provides flexible bidirectional scaling among these codes. We use “→” and “←” to

delegate scale-up (adding disks) and scale-down (removing disks) between any two codes.

For example, “RDP→EVENODD(6,7)” represents scaling from a 6-disk array using RDP

to a 7-disk array using EVENODD. “RDP←EVENODD” is a scale-down process from

EVENODD to RDP, and they have a same p value if disk number is not dedicated.

As shown in Figure 5.1, MDS-Frame has three layers, a management layer, an intermediate code layer and an MDS code repository.

Management Unified Management User Interface Layer (ML) Scaling Algorithms + SDM Scheme

Intermediate prime-1 prime prime+1 prime+2 Code disks disks disks disks

Layer HE HE HE Sp+2-Code Sp-1-Code Sp-Code Sp+1-Code (ICL) Scaling Scaling Scaling (EVENODD)

LE Scaling LE Scaling LE Scaling LE Scaling LE Scaling MDS Blaum-Roth Code HDP H-Code X-Code RDP Repository (w=p-1) (MCR) Other Codes Future new codes

Figure 5.1: MDS-Frame Architecture.

Management Layer (ML) consists of a unified management user interface on MDS codes, scaling algorithms of these codes and SDM scheme. The unified management user interface provides the information of MDS codes and processes adding a new code or deleting an existing code in MDS-Frame. SDM scheme and scaling algorithms are also provided in this layer.

97 Intermediate Code Layer (ICL) includes four Scalable Intermediate Codes (S-Codes),

which are RAID-6 solutions for p − 1, p, p + 1 and p + 2 disks. The corresponding codes are called Sp−1-Code, Sp-Code, Sp+1-Code and Sp+2-Code, respectively. Due to the similarities among S-Codes, a premier advantage of S-Codes is that high flexible bidirectional scaling can be achieved between any two codes by adding or deleting one disk. They act as a scaling highway to connect various MDS codes.

Existing MDS codes are stored in MDS Code Repository (MCR), such as EVENODD

[8], RDP [18], H-Code, HDP code, etc. After a new MDS code is added into the MDS code repository, the links between the new code and intermediate codes will be established.

To clearly illustrate the efficiency of scaling process between/among various coding methods, we define two types of scaling (the quantitative definition will be given in

Section 5.5.2), which are High Efficiency Scaling (HE) and Low Efficiency Scaling (LE).

HE Scaling have low overhead on bidirectional scaling between two MDS codes, which includes data/parity migration, parity modification, computation cost, etc.

Detailed descriptions of ML and ICL are given in next two sections.

5.4 Intermediate Code Layer

In this section, S-Codes are introduced to improve the scalability, which act as intermediate codes in MDS-Frame. They are similar to each other and provide a comprehensive solution for p − 1, p, p + 1 and p + 2 disks as follows (shown in Figure 5.2),

• For p − 1 disks: Sp−1-Code, which is a variant of HDP code (Paper III, introduced

in Chapter3);

• For p disks: Sp-Code, which is a new code;

98 (a) Horizontal parity coding.

(b) Diagonal parity coding.

Figure 5.2: S-Codes (p = 5).

• For p + 1 disks: Sp+1-Code, which is a variant of H-Code(Paper IV, introduced in

Chapter2);

• For p + 2 disks: Sp+2-Code, which is the same as EVENODD code [8].

RDP, EVENODD, H-Code and HDP are presented in previous literatures [18][8] and

Chapters2 and3, respectively. So in this section, we briefly list the encoding/decoding equations of Sp−1-Code, Sp-Code and Sp+1-Code.

5.4.1 Sp-Code

Data/Parity Layout and Encoding

As shown in Figure 5.2, Sp-Code is represented by a (p − 1)-row-p-column matrix with a total of (p − 1) ∗ p elements. There are three types of elements in these matrices: data elements, horizontal parity elements, and diagonal parity elements. The last column of

Sp-Code is used for horizontal parity. Excluding the last column, the remaining matrices

99 are (p − 1)-row-(p − 1)-column square matrices. The diagonal part of these square matrices represents the diagonal parity of Sp-Code.

Assume Ci,j (0 ≤ i ≤ p − 2, 0 ≤ j ≤ p − 1) represents the element at the ith row and the jth column. Horizontal parity and diagonal parity elements of Sp-Code are constructed based on the following encoding equations,

Horizontal parity of Sp-Code:

p−2 X Ci,p−1 = Ci,j (j 6= p − 2 − i) (5.1) j=0

Diagonal parity of Sp-Code:

p−2 X C = C (j 6= p − 2 − i) i,p−2−i hp−3−i−jip,j (5.2) j=0

Figure 5.2 shows an example of Sp-Code for a 5-disk array (p = 5). It is a 4-row-5- column matrix. Column 4 is used for horizontal parity and the diagonal elements (C0,3,

C1,2, C2,1, etc.) are used for diagonal parity.

The horizontal parity encoding of Sp-Code is shown in Figure 5.2. We use different shapes to indicate different horizontal parity chains. Based on Equation 5.1, all horizontal parity elements could be encoded. For example, the horizontal parity element C0,6 can be calculated by C0,0 ⊕ C0,1 ⊕ C0,2. The element C0,3 is not involved in this example because of j = p − 2 − i.

The diagonal parity encoding of Sp-Code is given in Figure 5.2. Different diagonal parity chains are also distinguished by various shapes. According to Equation 5.2, the diagonal parity elements can be got through modular arithmetic and XOR operations. For example, to calculate the diagonal parity element C1,2 (i = 1), first all the data elements

100 C j = 0 in the same diagonal parity chain should be selected ( hp−3−i−jip,j). If , based on

Equation 5.2, p−3−i−j = 1 and h1ip = 1, so the first data element is C1,0. The following data elements which take part in XOR operations can be calculated similarly (the following

data elements are C0,1 and C3,3). Second, the diagonal parity element (C1,4) is constructed

by performing an XOR operation on these data elements, i.e., C1,2 = C1,0 ⊕ C0,1 ⊕ C3,3.

Construction Process

Based on the above data/parity layout and encoding scheme, the construction process of

Sp-Code is straightforward.

• Label all data elements.

• Calculate both horizontal and diagonal parity elements according to the Equations

5.1 and 5.2.

Proof of Correctness

To prove that Sp-Code is correct, we consider one stripe in Sp-Code. The reconstruction of

multiple stripes is just a matter of scale and similar to the reconstruction of one stripe. In a

stripe, we have the following lemma and theorem,

0 Lemma 5.1. We can find a sequence of a two-integer tuple (Tk,Tk) where

 1+(−1)k  k+1+ 2 Tk = p − 2 + 2 (f2 − f1) , p−1 0 1+(−1)k 1+(−1)k+1 Tk = 2 f1 + 2 f2 (k = 0, 1, ··· , 2p − 3)

with 0 < f2 − f1 < p − 1, all two-integer tuples (0, f1), (0, f2), ··· , (p − 2, f1), (p − 2, f2)

occur exactly once in the sequence.

101 Similar proof of this lemma can be found in many literatures in RAID-6 codes such as

[8, 18, 86, 45].

Theorem 5.1. A(p − 1)-row-p-column stripe constructed according to the formal descrip-

tion of Sp-Code can be reconstructed under concurrent failures from any two columns.

Proof. There are two cases, depending on dedicated horizontal parity column fails or not.

Case I: Double failures, one is from the horizontal parity column, the other is from a

data column.

From the construction of Sp-Code, any two of the lost data elements cannot share a

same diagonal parity chain. Therefore, any lost data elements can be recovered through the

diagonal parity chains. After all lost data elements are recovered, the lost parity elements

can be reconstructed using the above Equations 5.1 and 5.2.

Case II: Double failures of any two data columns.

We assume that the two failed columns are f1 and f2, where 0 < f1 < f2 < p − 1.

From the encoding of Sp-Code, any horizontal parity chain (in the ith row) includes all elements in the same row except Ci,p−2−i, which is served as diagonal parity. Thus

two horizontal parity chains only contain one lost data element for each (Cp−2−f2,f1 and

Cp−2−f1,f2 ), which can be recovered by the retained elements.

For the failed columns f1 and f2, if an data element Ci,f2 on column f2 could be

reconstructed by its horizontal parity chain, then we can reconstruct the missing data

C element hi+f2−f1ip,f1 who shares the same diagonal parity chain. Similarly, if a data

element Ci,f1 on column f1 could be recovered from its horizontal parity chain, the missing

C data element hi+f2−f1ip,f2 who shares the same diagonal parity can be recovered. Then the reconstruction continues until all data elements are recovered. In this reconstruction

102 process, the reconstruction sequence is based on the sequence of the two-integer tuple shown in Lemma 3.1. Finally, the missing diagonal parity elements can be reconstructed according to Equation 5.2.

In conclusion, Sp-Code can be reconstructed under concurrent failures of any two columns.

Reconstruction

We first consider how to recover a missing data element Sp-Code since any lost parity element can be recovered based on Equations 5.1 and 5.2. If a horizontal parity chain retains p − 1 elements (including the parity element), the missing data element (assume it’s

Ci,f1 in column f1 and 0 ≤ f1 ≤ p − 1) in this chain can be reconstructed by the following equation, p−1 X Ci,f1 = Ci,j (j 6= p − 2 − i and j 6= f1) (5.3) j=0

If a diagonal parity chain retains p − 2 elements (including the parity element) and has

a lost data element (Ci,f1 ), the diagonal parity element (assume is in row r and represented by Cr,p−2−r based on Equation 5.2) can be expressed as,

r = hp − 3 − i − f1ip (5.4)

Based on Equation 5.2, the lost data element can be recovered by,

p−2 C = C ⊕ P C i,f1 r,p−2−r hi+f1−jip,j j=0 (5.5)

(j 6= f1 and j 6= hi + 1 + f1ip)

103 Based on the Equations 5.1 to 5.5, we can easily recover all lost elements with any single disk failure. If two disks fail, as the proof of Theorem 2.1, there are two cases in our reconstruction process of Sp-Code based on horizontal parity column fails or not. A reconstruction example is given in Figure 5.3, which shows how to recover double columns failures by two recovery chains.

Figure 5.3: Reconstruction by two recovery chains in Sp-Code. In this figure, there are double failures in columns 1 and 2: First we identify the two starting points of recovery chain: data elements A and D. Second we reconstruct data elements according to the corresponding recovery chains until they reach the endpoints (data elements C and F and the chains end by two “Xs” here). The orders to recover data elements are: one is A→B→C, the other is D→E→F. Finally we reconstruct diagonal parity elements G and H according to Equation 5.2.

5.4.2 Sp−1-Code and Sp+1-Code

Here we only list the equations of encoding/decoding of Sp−1-Code and Sp+1-Code, which are similar to Sp-Code.

104 Encoding equations of Sp−1-Code are,

 p−2  P  Ci,i = Ci,j (j 6= i and j 6= p − 2 − i)  j=0 p−2 (5.6)  C = P C (j 6= p − 2 − i)  i,p−2−i hp−3−i−jip,j  j=0

Decoding equations of Sp−1-Code are (assume the lost data element is Ci,f1 and its corresponding diagonal parity element is Cr,p−2−r),

 p−1  C = P C (j 6= f and j 6= p − 2 − i)  i,f1 i,j 1  j=0    r = hp − 3 − i − f1ip p−2 (5.7) P  Ci,f = Cr,p−2−r ⊕ Chi+f −ji ,j  1 1 p  j=0    (j 6= f1 and j 6= hi + f1 + 1ip)

Encoding equations of Sp+1-Code are,

 p−1  P  Ci,p = Ci,j (j 6= p − 1 − i)  j=0 p−1 (5.8)  C = P C (j 6= p − 1 − i)  i,p−1−i hp−2−i−jip,j  j=0

Decoding equations of Sp+1-Code are (assume the lost data element is Ci,f1 and its corresponding diagonal parity element is Cr,p−1−r),

 p  C = P C (j 6= f and j 6= p − 1 − i)  i,f1 i,j 1  j=0    r = hp − 2 − i − f1ip p−1 (5.9) P  Ci,f = Cr,p−1−r ⊕ Chi+f −ji ,j  1 1 p  j=0    (j 6= f1 and j 6= hi + f1 + 1ip)

105 5.4.3 Observations

As shown in Figure 5.2, S-Codes are represented by a (p − 1)-row-(p + x − 2)-column

matrix with a total of (p − 1) ∗ (p + x − 2) elements (x=-1,0,1,2). There are three types

of elements in these matrices: data elements, horizontal parity elements, and diagonal

parity elements. By investigating the similarities among S-Codes, we have the following

observations,

Observation 5.1. (variation of number of elements.) If an intermediate code in MDS-

Frame supports (p + x) disks (x=-1,0,1,2), the number of data elements per stripe is (p −

1) ∗ (p + x − 2), the number of parity elements per stripe is 2 ∗ (p − 1).

Observation 5.2. (variation of parity chain length and parity disks) If an intermediate

code in MDS-Frame supports (p + x) disks (x=-1,0,1,2), the horizontal parity chain length

is equal to p + x − 1 or p + x. With the increasing value of x, the number of parity chain

length and dedicated parity disks will be increased.

Observation 5.3. (variation of parity encoding complexity.) If an intermediate code in

MDS-Frame supports (p+x) disks (x=-1,0,1,2), with the increasing value of |x|, the parity

encoding complexity will be increased.

Observation 5.4. (variation of encoding/decoding performance) For intermediate codes

in MDS-Frame supporting (p + x) disks (x=-1,0,1,2), when p is large enough, they have

the optimal or near-optimal encoding/decoding performance.

106 5.5 Management Layer

In this section, Management Layer (ML) is introduced in detail. Because SDM scheme is presented in last chapter, in this section we mainly discuss the unified management user interface and scaling algorithms of these codes.

5.5.1 Unified Management User Interface

The unified management user interface provides information on MDS codes in MDS-Frame and processes adding or deleting codes. The information of MDS codes involves code name, the number of disks provided, encoding and decoding equations, etc. Adding a new code into MDS-Frame is shown in Algorithm 5.1.

Algorithm 5.1: Add A New Code into MDS-Frame Step 1: Get the information of the new code (assume it is called “Code-X1”, which supports p + x1 disks). Step 2: Establish the relationship between Code-X1 and intermediate codes (S-Codes). if −1 ≤ x1 ≤ 2 then Step 2A: Compare the layouts between Code-X1 and Sp+x-Code (x = x1 − 1, x1 + 1 where −1 ≤ x ≤ 2) sequentially and calculate the cost of scaling process. Step 2B: According to the cost, establish a link and label the scaling link (HE or LE). end

5.5.2 Definition of the HE and LE Scaling

Suppose the total number of data elements in a disk array is B. According to the results in RAID-0 scaling [90], by adding m disks to a disk array with n disks, the optimal data

mB migration under uniform data distribution is n+m . Similarly, deleting m disk from a disk

107 |m|B array with n disks, the optimal data migration is n . In RAID-6, the capacity of two disks are used for parities and the actual data disks is n − 2, so the optimal data and

mB |m|B parity migration for adding/deleting m disks with uniform distribution is n−2+m , n−2 , respectively. Thus the average amount of data and parity migration by scaling one disk

(scale-up and scale-down) is,

1 B B B ( + ) ≈ (5.10) 2 n − 2 + 1 n − 2 n − 2

Based on Equation 5.10, we define HE scaling in terms of the amount of data and parity migration,

Definition 5.1. For a bidirectional scaling between two random codes Code-X1 and Code-

X2 supporting p + x1 and p + x2 disks (−1 ≤ x1 ≤ x2 ≤ 2 and |x1 − x2| ≤ 1),

if the amount of data and parity migration in any one-way scaling (Code-X1→Code-X2

and Code-X1←Code-X2) is no more than B/(p + x1 − 2), these scaling are HE scaling.

Otherwise they are LE scaling.

5.5.3 Scaling Algorithms

Scaling algorithms in MDS-Frame are presented by comparing the layouts of various codes.

The lower overhead of the scaling is, the simpler scaling algorithm will be. In the following

we show an example when a scaling from Sp-Code to Sp+1-Code with a same value of p

(Sp-Code→Sp+1-Code). By comparing the layouts as shown in Figure 5.2, an efficient

scaling algorithm is shown in Algorithm 5.2 (add an empty disk as the first column).

108 Algorithm 5.2: Scaling Algorithm (Sp−1-Code→Sp-Code)

Step 1: Get the information of the Sp−1-Code and Sp-Code. Step 2: Label a new disk as column 4. Step 3: Assume Ci,j is a random element in a stripe based on the layout of Sp−1-Code and the corresponding element after scaling is Ci0,j0 (the same stripe ID in Sp-Code). if 0 ≤ i, j ≤ p − 2 then if i==j (horizontal parity elements) then move this element to column 4; i0 = i, j0 = 4. end else other elements are retained at the original position; i0 = i, j0 = j. end end

Table 5.1: Overhead of Typical scaling in MDS-Frame

Migrated Modified Total Transformations Data/Parities Parities I/Os

Sp−1-Code→Sp-Code B/(p − 3) 0 2B/(p − 3) Sp−1-Code←Sp-Code 0 4B/(p − 2) 10B/(p − 2) Sp-Code→Sp+1-Code 0 0 0 Sp-Code←Sp+1-Code B/(p − 1) 4B/(p − 1) 10B/(p − 1) Sp+1-Code→Sp+2-Code B/(p − 1) B/(p − 1) 4B/(p − 1) Sp+1-Code←Sp+2-Code 0 4B/p 10B/p

5.5.4 Overhead Analysis of Scaling between Intermediate Codes

We use three metrics to measure the overhead of data/parity migration, parity modification and total I/Os: 1) total number of migrated data and parities; 2) total number of modified parities; 3) total number of I/O operations. And we calculate the scaling between adjunct intermediate codes with a same value of p. The results are shown in Table 5.1 and show that scaling satisfies the condition of high efficiency (HE).

109 5.5.5 Scaling Rules

For a random MDS codes (assume it is called “Code-X”, which supports p + x disks) in

MDS-Frame, we have the following rules by scaling m disks,

1) if −1 ≤ x + m ≤ 2, scale m times by adding or removing one disk once. The scaling targets are intermediate codes, which provide HE scaling. The scaling sequence is

Code-X→intermediate Code→intermediate Code→...→intermediate Code.

2) if x + m < −1 or x + m > 2, we need to find a proper p0 which satisfies

−1 ≤ x + m − (p0 − p) ≤ 2. First we scale from p + x disks to p0 + x disks by using Code-X, existing fast scaling approaches [29, 55, 88] can be used to accelerate the process. Second repeat the scaling process as the rule 1. The scaling sequence is Code-X (p + x disks)→Code-X (p0 + x disks)→intermediate Code→intermediate

Code→...→intermediate Code.

For example, if we want to scale from a 6-disk array using HDP code to a RAID-6 array with 9 disks, we can follow the scaling sequence: HDP→Sp-Code→Sp+1-Code→Sp+2-

Code(EVENODD).

5.6 Scalability Analysis

In this section, we evaluate scaling cost between various codes in MDS-Frame to demonstrate its effectiveness on scalability.

110 5.6.1 Evaluation Methodology

We compare the scaling in MDS-Frame to Round-Robin (RR) [29, 55, 88], which provides

RAID-6 bidirectional scaling. Other approaches [28, 89, 37, 90] don’t support bidirectional

scaling in RAID-6.

In our comparison, Round-Robin (RR) approach is optimized to adopt the layout of

MDS codes in RAID-6. All data elements are migrated in round-robin order, and the

parities are updated when a data element is moved into (or out from) the corresponding

parity chains. Different from MDS-Frame, all MDS codes in RR approach are scaled

directly without using any immediate codes.

In addition to the metric of total I/Os (introduced in Section 5.5.4), we also use the

following two metrics,

1) total number of XOR operations: It is used to measure the computation cost in the

scaling process;

2) migration time: If we ignore the remapping time and computation time, then assume

the same time on a read or write request to a data/parity element (denoted by Te). Suppose

the migration I/O can be processed in parallel on each disk. We can calculate the migration

time for each scaling process which reflects the effectiveness of the scalability.

5.6.2 Numerical Results and Analysis

Here we give the numerical results of MDS-Frame compared to RR using above metrics.

1) Total number of I/O operations: First, the total number of I/Os for different scaling are shown in Figure 5.4. We notice that MDS-Frame reduces more than 44.1% and up to

97.4% I/O cost compared to RR.

111 Total number of I/O Operations (%)

1600 1400 1400 1400 1400 1240 1286 1240 1286 1240 1286 1200 1000 1000 1000 1000 783 800 583 510 367 400 300 400 217 200 150 100 143 67 50 33 0 HDP→ Sp+1-Code Sp-1-Code← H-Code HDP→ EVENODD Sp-1-Code← H-Code→ EVENODD Sp+1-Code← RDP→ EVENODD EVENODD EVENODD

RR (p=5) MDS-Frame (p=5) RR (p=7) MDS-Frame (p=7)

Figure 5.4: Total I/Os of various scaling processes in MDS-Frame vs. Round-Robin when p = 5 and p = 7 (The number of B I/O operations is normalized to 100%).

2) Total number of XOR operations: Second, we calculate the total number of XOR operations under various cases as shown in Figure 5.5, where MDS-Frames reduces up to

98.5% computation cost compared to RR.

Total number of XOR Operations (%) 1600

1200 1200 1200 1200 1200 1040 1086 1040 1086 1040 1086 800 800 800 800 800

313 400 233 147 204 100 125 80 50 67 25 17 57 25 17 0 HDP→ Sp+1-Code Sp-1-Code← H-Code HDP→ EVENODD Sp-1-Code← H-Code→ EVENODD Sp+1-Code← RDP→ EVENODD EVENODD EVENODD

RR (p=5) MDS-Frame (p=5) RR (p=7) MDS-Frame (p=7)

Figure 5.5: Computation cost of various scaling processes in MDS-Frame vs. Round-Robin when p = 5 and p = 7 (The number of B XORs is normalized to 100%).

3) Migration time: Next, Migration time is evaluated as shown in Figure 5.6.

Compared to Round-Robin, SDM-Frame performs better with lower the migration time by up to 95.2%, which can speed up the scaling process to a factor of 20.7.

4) Value of p: Finally, we select a scaling HDP→EVENODD and evaluate the migration time under different value of p as shown in Figure 5.7. It is clear that MDS-

Frame keeps a gradual downward trend on overhead with the increasing value of p.

Above results show that compared to RR approach, MDS-Frame presents great advantage on scalability among various MDS codes. There are several reasons to achieve

112 Migration Time (%) 800 686 686 686 640 640 640 600

400 400 400 400 400 350 350 267 233 263 233 200 183 200 140 157 100 100 80 50 33 57 50 33 0 HDP→ Sp+1-Code Sp-1-Code← H-Code HDP→ EVENODD Sp-1-Code← H-Code→ EVENODD Sp+1-Code← RDP→ EVENODD EVENODD EVENODD

RR (p=5) MDS-Frame (p=5) RR (p=7) MDS-Frame (p=7)

Figure 5.6: Migration time of various scaling processes in MDS-Frame vs. Round-Robin when p = 5 and p = 7 (The time B ∗ Te is normalized to 100%).

Migration Time (%) 800

600

400

200

0 5 7 11 13 17 19 P MDS-Frame RR

Figure 5.7: Migration time in scaling HDP→EVENODD under various p (the number of B I/O operations is normalized to 100%). these gains. First, MDS-Frame is a unified management for all MDS codes, which select the optimal scaling path with the lowest cost. Second, the intermediate code provides a scaling highway with low overhead, which connects various horizontal codes. Third, the scaling algorithms in MDS-Frame is based on the minor differences of the layouts among various codes, which guarantees high efficiency scaling processes. All these aspects can minimize the movement of data and parities thus reduces total overhead.

113 5.7 Summary

In this chapter, we propose an MDS Code Scaling Framework (MDS-Frame) to improve the

scalability of RAID-6. Our comprehensive mathematic analysis shows that MDS-Frame

achieves high scalability in the following aspects: 1) reduced I/Os by up to 97.4% due to less migration data and modified parities; 2) lowered computation cost in terms of less

XOR calculations by up to 98.5%; 3) shorter migration time by up to 95.2%, and 4) faster

scaling process by a factor of up to 20.7.

114 Chapter 6

Conclusions

In this dissertation, we make the following contributions to improve the performance and scalability of RAID-6 systems using erasure codes:

1. We propose a novel and efficient XOR-based RAID-6 code (H-Code) to of-

fer optimal partial stripe write performance among all MDS codes, which is

demonstrated by analyzing the overhead of partial stripe write to multiple data

elements with a quantitative approach. We prove that H-Code has not only the

optimal property demonstrated by vertical MDS codes including storage efficiency,

encoding/decoding computational complexity and single write complexity, but also

the optimized partial stripe write complexity to multiple data elements.

2. We propose a novel and efficient XOR-based RAID-6 code (HDP Code) to offer not

only the property provided by typical MDS codes such as optimal storage efficiency,

but also best load balancing and high reliability due to horizontal-diagonal parity.

3. We propose a new data migration scheme (SDM) to address RAID-6 scalability

problem, a significant issue in large scale data storage systems. SDM accelerates

115 RAID-6 scaling process, in terms of the number of modified parities, the number of

XOR calculations, the total number of I/O operations and the migration time. SDM

balances I/O distribution among multiple disks in a disk array, reducing the migration

time indirectly. And SDM provides fast data addressing algorithms.

4. We propose a novel MDS code scaling framework (MDS-Frame), which is a unified

management scheme on MDS codes to allow flexible scaling from one code to

another.

5. We quantitatively analyze the cost of performance and scalability among various

MDS codes, and prove that H-Code, HDP, SDM and MDS-Frame are cost-effective

solutions to improve the performance and scalability of RAID-6 systems.

6. We summarize the pros and cons of several MDS codes and give some insightful

observations, which can help guide future code design.

116 Bibliography

[1] ALLMYDATA. Unlimited online , storage, and sharing. http://

allmydata.com, 2008.1

[2] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee,

D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. Above the clouds: A berkeley view

of cloud computing. Technical Report UCB/EECS-2009-28, UC Berkeley, February

2009.2, 70, 95

[3] B. Nisbet. FAS storage systems: Laying the foundation for application avail-

ability. http://www.netapp.com/us/library/analyst-reports/

ar1056.html, February 2008.1

[4] B. Welch and M. Unangst and Z. Abbasi and G. Gibson and B. Mueller and J.

Small and J. Zelenka and B. Zhou. Scalable Performance of the Panasas Parallel

File System. http://www.panasas.com/sites/default/files/

uploads/docs/panasas_scalable_storage_arch_wp_1066.pdf,

February 2008.1

[5] E. Bachmat and Y. Ofek. Load balancing method for exchanging data in different

physical devices in a disk array stroage device independently of data

117 processing system operation. US Patent No. 6237063B1, May 2001.2, 10, 48

[6] E. Bachmat, Y. Ofek, A. Zakai, M. Schreiber, V. Dubrovsky, T. Lam, and R. Michel.

Load balancing on disk array storage device. US Patent No. 6711649B1, March 2004.

2, 10, 48

[7] M. Beck, D. Arnold, A. Bassi, F. Berman, H. Casanova, J. Dongarra, T. Moore,

G. Obertelli, J. Plank, M. Swany, S. Vadhiyar, and R. Wolski. Logistical computing

and internetworking: Middleware for the use of storage in communication. In Proc.

of the 3rd Annual International Workshop on Active Middleware Services (AMS), San

Francisco, CA, August 2001.2

[8] M. Blaum, J. Brady, J. Bruck, and J. Menon. EVENODD: An efficient scheme

for tolerating double disk failures in RAID architectures. IEEE Transactions on

Computers, 44(2):192–202, February 1995.2,5,6, 19, 26, 37, 51, 56, 60, 86, 98, 99,

102

[9] M. Blaum, J. Hafner, and S. Hetzler. Partial-MDS codes and their application to

RAID type of architectures. CoRR, abs/1205.0997, 2012.1,5

[10] M. Blaum and R. Roth. On lowest density MDS codes. IEEE Transactions on

Information Theory, 45(1):46–59, January 1999.2,5,6, 19, 20

[11] J. Blomer, M. Kalfane, R. Karp, M. Karpinski, M. Luby, and D. Zuckerman. An XOR-

based Erasure-Resilient coding scheme. Technical Report TR-95-048, International

Computer Science Institute, August 1995.2,5, 19

[12] V. Bohossian, C. Fan, P. LeMahieu, M. Riedel, J. Bruck, and L. Xu. Computing in the RAIN: A reliable array of independent nodes. IEEE Transactions on Parallel and Distributed Systems, 12(2):99–114, February 2001.

[13] J. Bonwick. RAID-Z. http://blogs.sun.com/bonwick/entry/raidz, 2010.

[14] A. Brinkmann, K. Salzwedel, and C. Scheideler. Efficient, distributed data placement strategies for storage area networks. In Proc. of the ACM SPAA'00, Bar Harbor, ME, July 2000.

[15] Y. Cassuto and J. Bruck. Cyclic lowest density MDS array codes. IEEE Transactions on Information Theory, 55(4):1721–1729, April 2009.

[16] P. Chen, E. Lee, G. Gibson, R. Katz, and D. Patterson. RAID: High-performance, reliable secondary storage. ACM Computing Surveys, 26(2):145–185, June 1994.

[17] CLEVERSAFE, Inc. Cleversafe dispersed storage. Open source code distribution: http://www.cleversafe.org/downloads, 2008.

[18] P. Corbett, B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar. Row-diagonal parity for double disk failure correction. In Proc. of the USENIX FAST'04, San Francisco, CA, March 2004.

[19] E. David. Method for improving partial stripe write performance in disk array subsystems. US Patent No. 5333305, July 1994.

[20] EMC Corporation. EMC CLARiiON RAID 6 Technology: A Detailed Review. http://www.emc.com/collateral/hardware/white-papers/h2891-clariion-raid-6.pdf, July 2007.

[21] G. Feng, R. Deng, F. Bao, and J. Shen. New efficient MDS array codes for RAID part I: Reed-Solomon-like codes for tolerating three disk failures. IEEE Transactions on Computers, 54(9):1071–1080, September 2005.

[22] G. Feng, R. Deng, F. Bao, and J. Shen. New efficient MDS array codes for RAID part II: Rabin-like codes for tolerating multiple (≥ 4) disk failures. IEEE Transactions on Computers, 54(12):1473–1483, December 2005.

[23] C. Franklin and J. Wong. Expansion of RAID subsystems using spare space with immediate access to new space. US Patent 2003/0115412 A1, June 2003.

[24] H. Fujita and K. Sakaniwa. Modified low-density MDS array codes for tolerating double disk failures in disk arrays. IEEE Transactions on Computers, 56(4):563–566, April 2007.

[25] R. Gallager. Low-density parity-check codes. IRE Transactions on Information Theory, 8(1):21–28, 1962.

[26] G. Ganger, B. Worthington, R. Hou, and Y. Patt. Disk subsystem load balancing: Disk striping vs. conventional data placement. In Proc. of the HICSS'93, Wailea, HI, January 1993.

[27] S. Ghandeharizadeh and D. Kim. On-line reorganization of data in scalable continuous media servers. In Proc. of the DEXA'96, Zurich, Switzerland, September 1996.

[28] A. Goel, C. Shahabi, S. Yao, and R. Zimmermann. SCADDAR: An efficient randomized technique to reorganize continuous media blocks. In Proc. of the ICDE'02, San Jose, CA, February 2002.

[29] J. Gonzalez and T. Cortes. Increasing the capacity of RAID5 by online gradual assimilation. In Proc. of the SNAPI'04, Antibes Juan-les-Pins, France, September 2004.

[30] K. Gopinath, N. Muppalaneni, N. Kumar, and P. Risbood. A 3-tier RAID storage system with RAID1, RAID5 and compressed RAID5 for Linux. In Proc. of the USENIX ATC'00, San Diego, CA, June 2000.

[31] K. Greenan, X. Li, and J. Wylie. Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs. In Proc. of the IEEE MSST'10, Incline Village, NV, May 2010.

[32] H. Anvin. The mathematics of RAID-6. http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf, May 2009.

[33] J. Hafner. WEAVER codes: Highly fault tolerant erasure codes for storage systems. In Proc. of the USENIX FAST'05, San Francisco, CA, December 2005.

[34] J. Hafner. HoVer erasure codes for disk arrays. In Proc. of the IEEE/IFIP DSN'06, Philadelphia, PA, June 2006.

[35] J. Hafner, V. Deenadhayalan, K. Rao, and J. Tomlin. Matrix methods for lost data reconstruction in erasure codes. In Proc. of the USENIX FAST'05, San Francisco, CA, December 2005.

[36] E. Hashemi. Load balancing configuration for storage arrays employing mirroring and striping. US Patent No. 6425052B1, July 2002.

[37] S. Hetzler. Storage array scaling method and system with minimal data movement. US Patent 20080276057, June 2008.

[38] C. Huang, M. Chen, and J. Li. Pyramid Codes: Flexible schemes to trade space for access efficiency in reliable data storage systems. In Proc. of the IEEE NCA'07, Cambridge, MA, July 2007.

[39] C. Huang, J. Li, and M. Chen. On optimizing XOR-based codes for fault-tolerant storage applications. In Proc. of the IEEE Information Theory Workshop (ITW), Lake Tahoe, CA, September 2007.

[40] C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder, P. Gopalan, J. Li, and S. Yekhanin. Erasure coding in Windows Azure Storage. In Proc. of the USENIX ATC'12, Boston, MA, June 2012.

[41] C. Huang and L. Xu. STAR: An efficient coding scheme for correcting triple storage node failures. In Proc. of the USENIX FAST'05, San Francisco, CA, December 2005.

[42] C. Huang and L. Xu. STAR: An efficient coding scheme for correcting triple storage node failures. IEEE Transactions on Computers, 57(7):889–901, July 2008.

[43] K. Hwang, H. Jin, and R. Ho. RAID-x: A new distributed disk array for I/O-centric cluster computing. In Proc. of the HPDC'00, Pittsburgh, PA, August 2000.

[44] R. Jantz. Method for host-based I/O workload balancing on redundant array controllers. US Patent No. 5937428, August 1999.

[45] C. Jin, H. Jiang, D. Feng, and L. Tian. P-Code: A new RAID-6 code with optimal properties. In Proc. of the ACM ICS'09, Yorktown Heights, NY, June 2009.

[46] H. Jin, X. Zhou, D. Feng, and J. Zhang. Improving partial stripe write performance in RAID level 5. In Proc. of the 2nd IEEE International Caracas Conference on Devices, Circuits and Systems, Isla de Margarita, Venezuela, March 1998.

[47] N. Joukov, A. Krishnakumar, C. Patti, A. Rai, S. Satnur, A. Traeger, and E. Zadok. RAIF: Redundant array of independent filesystems. In Proc. of the IEEE MSST'07, San Diego, CA, September 2007.

[48] D. Kenchammana-Hosekote, D. He, and J. Hafner. REO: A generic RAID engine and optimizer. In Proc. of the USENIX FAST'07, San Jose, CA, February 2007.

[49] O. Khan, R. Burns, J. Plank, W. Pierce, and C. Huang. Rethinking erasure codes for cloud file systems: Minimizing I/O for recovery and degraded reads. In Proc. of the USENIX FAST'12, San Jose, CA, February 2012.

[50] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, C. Wells, and B. Zhao. OceanStore: An architecture for global-scale persistent storage. In Proc. of the ASPLOS'00, Cambridge, MA, November 2000.

[51] A. Kuratti. Analytical Evaluation of the RAID 5 Disk Array. PhD thesis, University of Arizona, 1994.

[52] M. Li, J. Shu, and W. Zheng. GRID codes: Strip-based erasure code with high fault tolerance for storage systems. ACM Transactions on Storage, 4(4):Article 15, January 2009.

[53] M. Luby, M. Mitzenmacher, M. Shokrollahi, D. Spielman, and V. Stemann. Practical loss-resilient codes. In Proc. of the ACM STOC'97, El Paso, TX, May 1997.

[54] J. Luo, C. Huang, and L. Xu. Decoding STAR code for tolerating simultaneous disk failure and silent errors. In Proc. of the IEEE/IFIP DSN'10, Chicago, IL, June 2010.

[55] N. Brown. Online RAID-5 Resizing. drivers/md/raid5.c in the source code of Linux Kernel 2.6.18. http://www.kernel.org/, September 2006.

[56] D. Patterson. A simple way to estimate the cost of downtime. In Proc. of the LISA'02, Philadelphia, PA, October 2002.

[57] D. Patterson, G. Gibson, and R. Katz. A case for Redundant Arrays of Inexpensive Disks (RAID). In Proc. of the SIGMOD'88, Chicago, IL, June 1988.

[58] E. Pinheiro, W. Weber, and L. Barroso. Failure trends in a large disk drive population. In Proc. of the USENIX FAST'07, San Jose, CA, February 2007.

[59] J. Plank. Erasure codes for storage applications. Tutorial slides presented at the USENIX FAST'05, San Francisco, CA, December 2005.

[60] J. Plank. A new minimum density RAID-6 code with a word size of eight. In Proc. of the IEEE NCA'08, Cambridge, MA, July 2008.

[61] J. Plank. The RAID-6 liberation codes. In Proc. of the USENIX FAST'08, San Jose, CA, February 2008.

[62] J. Plank, A. Buchsbaum, and B. Zanden. Minimum density RAID-6 codes. ACM Transactions on Storage, 6(4):Article 16, May 2011.

[63] J. Plank, J. Luo, C. Schuman, L. Xu, and Z. Wilcox-O'Hearn. A performance evaluation and examination of open-source erasure coding libraries for storage. In Proc. of the USENIX FAST'09, San Francisco, CA, February 2009.

[64] J. Plank and M. Thomason. A practical analysis of low-density parity-check erasure codes for wide-area storage applications. In Proc. of the IEEE/IFIP DSN'04, Florence, Italy, June 2004.

[65] J. Plank and L. Xu. Optimizing Cauchy Reed-Solomon codes for fault-tolerant network storage applications. In Proc. of the IEEE NCA'06, Cambridge, MA, July 2006.

[66] RANDOM.ORG. Random Integer Generator. http://www.random.org/integers/, 2010.

[67] I. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, pages 300–304, 1960.

[68] J. Resch and J. Plank. AONT-RS: Blending security and performance in dispersed storage systems. In Proc. of the USENIX FAST'11, San Jose, CA, February 2011.

[69] S. Rhea, C. Wells, P. Eaton, D. Geels, B. Zhao, H. Weatherspoon, and J. Kubiatowicz. Maintenance-free global data storage. IEEE Internet Computing, 5(5):40–49, September/October 2001.

[70] Y. Saito, S. Frolund, A. Veitch, A. Merchant, and S. Spence. FAB: Building distributed enterprise disk arrays from commodity components. In Proc. of the ASPLOS'04, Boston, MA, October 2004.

[71] P. Scheuermann, G. Weikum, and P. Zabback. Data partitioning and load balancing in parallel disk systems. The VLDB Journal, 7(1):48–66, February 1998.

[72] B. Schroeder and G. Gibson. Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you? In Proc. of the USENIX FAST'07, San Jose, CA, February 2007.

[73] Storage Networking Industry Association (SNIA). RAID 6/RAID level 6. http://www.snia.org.cn/dic.php?word=r, 2010.

[74] M. Storer, K. Greenan, E. Miller, and K. Voruganti. Pergamum: Replacing tape with energy-efficient, reliable, disk-based archival storage. In Proc. of the USENIX FAST'08, San Jose, CA, February 2008.

[75] A. Thomasian and J. Xu. Cost analysis of the X-code double parity array. In Proc. of the IEEE MSST'07, San Diego, CA, September 2007.

[76] L. Tian, D. Feng, H. Jiang, L. Zeng, J. Chen, Z. Wang, and Z. Song. PRO: A popularity-based multi-threaded reconstruction optimization for RAID-structured storage systems. In Proc. of the USENIX FAST'07, San Jose, CA, February 2007.

[77] S. Wan, Q. Cao, C. Xie, B. Eckart, and X. He. Code-M: A non-MDS erasure code scheme to support fast recovery from up to two-disk failures in storage systems. In Proc. of the IEEE/IFIP DSN'10, Chicago, IL, June 2010.

[78] S. Wan, Q. Cao, J. Huang, S. Li, X. Li, S. Zhan, L. Yu, and C. Xie. Victim disk first: An asymmetric cache to boost the performance of disk arrays under faulty conditions. In Proc. of the USENIX ATC'11, Portland, OR, June 2011.

[79] B. Welch, M. Unangst, Z. Abbasi, G. Gibson, B. Mueller, J. Small, J. Zelenka, and B. Zhou. Scalable performance of the Panasas parallel file system. In Proc. of the USENIX FAST'08, San Jose, CA, February 2008.

[80] Hadoop Wiki. HDFS RAID. http://wiki.apache.org/hadoop/hdfs-raid, 2011.

[81] J. Wilkes, R. Golding, C. Staelin, and T. Sullivan. The HP AutoRAID hierarchical storage system. ACM Transactions on Computer Systems, 14(1):108–136, February 1996.

[82] S. Wu, H. Jiang, D. Feng, L. Tian, and B. Mao. WorkOut: I/O workload outsourcing for boosting RAID reconstruction performance. In Proc. of the USENIX FAST'09, San Francisco, CA, February 2009.

[83] J. Wylie and R. Swaminathan. Determining fault tolerance of XOR-based erasure codes efficiently. In Proc. of the IEEE/IFIP DSN'07, Edinburgh, UK, June 2007.

[84] L. Xiang, Y. Xu, J. Lui, and Q. Chang. Optimal recovery of single disk failure in RDP code storage systems. In Proc. of the ACM SIGMETRICS'10, New York, NY, June 2010.

[85] L. Xu, V. Bohossian, J. Bruck, and D. Wagner. Low-density MDS codes and factors of complete graphs. IEEE Transactions on Information Theory, 45(6):1817–1826, September 1999.

[86] L. Xu and J. Bruck. X-Code: MDS array codes with optimal encoding. IEEE Transactions on Information Theory, 45(1):272–276, January 1999.

[87] X. Yu, B. Gum, Y. Chen, R. Wang, K. Li, A. Krishnamurthy, and T. Anderson. Trading capacity for performance in a disk array. In Proc. of the OSDI'00, San Diego, CA, October 2000.

[88] G. Zhang, J. Shu, W. Xue, and W. Zheng. SLAS: An efficient approach to scaling round-robin striped volumes. ACM Transactions on Storage, 3(1):1–39, March 2007.

[89] G. Zhang, W. Zheng, and J. Shu. ALV: A new data redistribution approach to RAID-5 scaling. IEEE Transactions on Computers, 59(3):345–357, March 2010.

[90] W. Zheng and G. Zhang. FastScale: Accelerate RAID scaling by minimizing data migration. In Proc. of the USENIX FAST'11, San Jose, CA, February 2011.

[91] B. Zhu, K. Li, and H. Patterson. Avoiding the disk bottleneck in the Data Domain deduplication file system. In Proc. of the USENIX FAST'08, San Jose, CA, February 2008.

[92] Y. Zhu, P. Lee, L. Xiang, Y. Xu, and L. Gao. A cost-based heterogeneous recovery scheme for distributed storage systems with RAID-6 codes. In Proc. of the IEEE/IFIP DSN'12, Boston, MA, June 2012.

[93] Y. Zhu, P. Lee, L. Xiang, Y. Xu, and L. Gao. On the speedup of single-disk failure recovery in XOR-coded storage systems: Theory and practice. In Proc. of the IEEE MSST'12, Pacific Grove, CA, April 2012.

Vita

Chentao Wu was born on June 15, 1982, in Xinzhou District, Wuhan City, Hubei Province, P.R. China. He is a Chinese citizen. He graduated from Xinzhou No. 1 Middle School, Wuhan, China, in 2000. He received his Bachelor of Engineering in Computer Science and Technology and his Master of Engineering in Software Engineering, both from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2004 and 2006, respectively. He is the recipient of the 2011–2012 VCU School of Engineering Outstanding Graduate Research Assistant Award.
