
DRESS Code for the Storage Cloud
Distributed Replication-based Exact Simple Storage

Salim El Rouayheb
EECS Department, University of California, Berkeley
Joint work with: Sameer Pawar, Nima Noorshams, Prof. Kannan Ramchandran

Talk Roadmap
I. Background & Motivation: distributed storage, failures & repairs, reliability & security, low complexity, redundancy & codes
II. DRESS Codes: uncoded repair, projective planes, bins & balls
III. Security: malicious attacks, network coding, separation result, secure capacity

Data Everywhere
[Figure: coffee shop, airport, office, home]
• Increasing popularity of data-enabled devices and of hot spots, 4G, WiMAX, Wi-Fi, etc. => "always online" infrastructure
• Truly mobile users
• Want to be able to seamlessly access and share data anytime, anywhere

Off To The Cloud…
• Want our files to "follow" us wherever we go
• Put them in the cloud, get them from the cloud
[Figure: data centers and P2P systems]

Cloud Applications
[Figure: Amazon Web Services]

Wuala: P2P Cloud Storage
• Trade idle local disk space on your computer for online storage
• First 1 GB is free
• Cloud storage = local donated storage × percentage of online time
  Example: 100 GB donated × 70% online = 70 GB of cloud storage

Can We Trust the Cloud?
Main challenge: the cloud is formed of unreliable and untrusted components.

Cloudy with a Chance of Failure
[Figure: distributed storage system with a data center, server, user, and peer; a disk failure and a virus are marked]
Component failures are the norm rather than the exception.
• Nodes are unreliable by nature:
  – Hard disk failure
  – Network failure
  – Peer churning
  – Malicious nodes
• Solution: store data redundantly

Multiple System Considerations
[Figure: Escher "impossible" cube. Labels: coding for cloud storage; decoding complexity; rate; "traditional" coding theory]

Different Codes for Different Folks
A 2 MB file is split into two 1 MB halves, a and b, and stored on 4 nodes.

              Replication       (4,2) erasure code
  n1          a                 a
  n2          a                 b
  n3          b                 a+b
  n4          b                 a+2b
  Tolerates   1 failure         2 failures
  New node    1 MB download     2 MB download

Comparison criteria: reliability, bandwidth, complexity / disk I/O.

Reducing Bandwidth Overhead
• Repair BW can be reduced from 2 MB to 1.5 MB
• Helper nodes send random linear combinations: random linear network coding
• New data is "functionally equivalent" to the lost data
• Repair BW can be reduced further, from 1.5 MB to 1.2 MB, if each node is willing to store 20% more
[Figure: the 2 MB file (a, b) is split into a1, a2, b1, b2; a replacement node contacts d=3 helper nodes and downloads 0.5 MB from each, every helper sending a linear combination of its stored packets. Dimakis, Wu & Ramchandran '07]

Fundamental Storage/Bandwidth Tradeoff
[Figure: storage cost vs. repair BW tradeoff curve (Dimakis et al. '07). Labels: MBR, Min. Bandwidth Regime; MSR, Min. Storage Regime; MDS; Internet, wireless; archival applications]

We Want More: Exact Repair?
• We want exact repair:
  – Maintain a systematic copy of the data
  – Reduce system overhead
  – Security
• Exact repair makes the problem much harder
  – Relation to non-multicast network coding
• Recent important progress: Rashmi et al. (Product-Matrix Codes), Suh & Ramchandran '10, Cadambe et al. '10
[Figure: nodes n1 to n4 storing (a1, a2), (b1, b2), (a1+b1, 2a2+b2), (2a1+b1, a2+b2); a failed node must be rebuilt exactly]

Storage/Bandwidth Tradeoff
[Figure: storage cost vs. repair BW tradeoff curve with the MBR (Min. Bandwidth Regime) and MSR (Min. Storage Regime) points. Labels: "This talk"; interference alignment techniques]
• Recent result (Shah et al. '10): only the MBR and MSR points are achievable under exact repair

Can we also reduce Disk I/O & Energy?
Bandwidth efficiency comes with a price tag:
1) Excessive data reading & computations
2) Read/write bandwidth of a hard disk is the bottleneck (100 MB/s vs. network speed ~1000 MB/s)
[Figure: a new node is rebuilt from linear combinations (e.g. b1+b2) that helper nodes must read and compute from their stored packets]

Can we have exact-repair codes with minimum overheads of:
1) Repair bandwidth
2) Disk reads
3) Computations?

Contributions: Yes We Can
• DRESS codes:
  – Exact repair
  – Minimum disk reads for repair
  – No data processing for repair: uncoded repair
• DRESS = Distributed Replication-based Exact Simple Storage
[Figure: file, then Maximum Distance Separable (MDS) code, then fractional repetition code, then storage cloud]
• Surprise: no loss of bandwidth efficiency
• Bonus: well-suited for security applications
• System price: repair not as flexible w.r.t. who can help

Talk Roadmap
I. Background & Motivation
II. DRESS Codes: uncoded repair, projective planes, Steiner systems, bins & balls
III. Security

DRESS Code Example
• (n,k) = (7,3), n = 7 nodes, d = 3 MB
• Max file size = 6 MB: cut-set bound (Dimakis et al.)
• File of 6 MB (packets 1 2 3 4 5 6), then MDS code with one parity-check packet (packets 1 2 3 4 5 6 7)
• New parameter: repetition factor r = 3
• A user contacting k = 3 nodes gets 6 different packets

  n1: 1 2 3
  n2: 3 4 5
  n3: 1 5 6
  n4: 1 4 7
  n5: 2 5 7
  n6: 3 6 7
  n7: 2 4 6

Repairing DRESS Codes
[Figure: same (n,k) = (7,3) placement; node n1 fails and a new node recovers packets 1 2 3]
• Repetition r = 3 can tolerate up to 2 simultaneous failures
• Table-based repair
• If 3 nodes fail?
  – Download the complete file
  – Reconstruct the lost data
  – Rare bad event
• Min repair bandwidth
• Min disk I/O calls
• Uncoded repair

Why "Traditional" Repetition Won't Work?
• Max file size = 6 MB: cut-set bound (Dimakis et al.)
• (n,k) = (7,3), repetition factor r = 3, same 6 MB file with one parity-check packet

  n1: 1 2 3
  n2: 1 2 3
  n3: 1 2 3
  n4: 4 5 6
  n5: 4 5 6
  n6: 4 5 6
  n7: 7 7 7

• A user contacting n1, n2, n3 gets ONLY 3 MB

Smart Packet Placement
How should we place the packet replicas? Goal: maximize the file size.

Max file size (Dimakis et al.):

  $C = kd - \binom{k}{2}, \qquad \binom{k}{2} = \frac{k(k-1)}{2}$

where k = # nodes contacted by the user, d = # packets/node, and $\binom{k}{2}$ is the penalty term due to intersection.

Rule: make sure that any two nodes have at most one packet in common.

Where Did That Solution Come From?
• Lines in projective planes intersect in exactly one point
• Fano plane: projective plane of order 2
• Points: packets. Lines: storage nodes.
[Figure: Fano plane; its 7 lines are the node contents n1 to n7 of the DRESS code example: 1 2 3, 3 4 5, 1 5 6, 1 4 7, 2 5 7, 3 6 7, 2 4 6]

Higher Order Projective Planes
[Figure: projective planes of order m=2, m=3, m=4]
• Projective planes are guaranteed to exist for any prime power order m.

Codes from Projective Planes
[Figure: projective plane of order m=3 with points 1 to 13; n=13, r=d=4]
• Projective plane: storage system. Lines: storage nodes. Points: packets.
• Can we have constructions spanning more system parameters?

Theorem 1 (R. & Ramchandran, Allerton '10): A projective plane of order m gives DRESS codes with n = m^2 + m + 1 nodes and repetition factor r = repair degree d = m + 1.

Kirkman's Schoolgirls Problem
"Fifteen young ladies in a school walk out three abreast for seven days in succession; it is required to arrange them daily so that no two shall walk twice abreast." Kirkman, 1847
• Fifteen schoolgirls: A, B, …, O
• They always take their daily walks in groups of three

        group 1  group 2  group 3  group 4  group 5
  Sun   ABC      DEF      GHI      JKL      MNO
  Mon   ADH      BEK      MOI      FLN      GJC
  …     ?        ?        ?        ?        ?
  Sat   ?        ?        ?        ?        ?

• Arrange them such that any two schoolgirls walk together exactly once a week

Why Do We Care?
A solution to Kirkman's problem (7 days, 5 groups per day; each group is a storage node):

  Sun   ABC  DEF  GHI  JKL  MNO
  Mon   ADH  BEK  CIO  FLN  GJM
  Tue   AEM  BHN  CGK  DIL  FJO
  Wed   AFI  BLO  CHJ  DKM  EGN
  Thu   AGL  BDJ  CFM  EHO  IKN
  Fri   AJN  BIM  CEL  DOG  FHK
  Sat   AKO  BFG  CDN  EIJ  HLM

• Think of schoolgirls as packets and groups as storage nodes
• Kirkman's solution gives an optimal DRESS code for a system with n = 35 nodes, each storing 3 packets
• What is the repetition factor here? Answer: 7

Not Your Average Puzzle…
• Kirkman's schoolgirls problem initiated a branch of combinatorics now known as Design Theory
• Deep connections to other branches of mathematics: algebra, geometry, matroid theory…
• Applications:
  – Coding theory
  – Cryptography
  – Statistics

Steiner Systems
A Steiner system is a collection of points & lines such that:
1. There is a fixed total number of points
2. Each line contains exactly the same number of points
3. There is exactly one line that contains any given t points
We want t = 2.
Notation: S(t, # pts on a line, total # points). Examples: Fano plane = S(2, 3, 7); S(2, 3, 9).
[Figure: S(2,3,9) drawn on points 1 to 9]

Codes from Steiner Systems
Theorem 2 (R. & Ramchandran, Allerton '10): A Steiner system gives a capacity-achieving DRESS code using the correspondence lines: nodes, points: packets.

DRESS Codes
[Figure: families of DRESS codes: projective planes, Steiner systems, dual codes]
Open problem I: DRESS codes beyond Steiner systems?

Min Reads Capacity
• Min bandwidth capacity: $C = kd - \binom{k}{2}$
• Lookup repair, minimum reads
Open problem II: Capacity for minimum reads (and lookup repair)?

Example (n,k,d) = (7,3,3)
[Figure: packets 1 to 9; node contents shown include 1 2 3, 4 5 6, 7 8 9, 1 4 7, 2 5 8, 3 6 9]
• Users contacting any 3 nodes will observe either 9 or 7 packets
• This code guarantees a "rate" = 7 to any user
• Capacity when repairing from any d nodes = 6

We Want More
• Explicit constructions for Steiner systems exist for small values of r = 3, 4, 5
• Existential guarantees for large number of nodes [Wilson 75]
• Want more flexibility
• Want to "grow"
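
The properties claimed for the (n,k) = (7,3) DRESS code example can be checked mechanically. The following is a minimal sketch, not code from the talk: the placement table is copied from the example slide, while the function names (`capacity`, `max_pairwise_overlap`, `repetition_factors`, `min_distinct_packets`, `repair_plan`) are hypothetical helpers introduced here for illustration. The repair plan simply picks the first surviving node holding each lost packet, which is one possible lookup table among several.

```python
from itertools import combinations
from math import comb

# Placement copied from the (n, k) = (7, 3) example: each node stores
# d = 3 packets; packets 1-6 are file data, packet 7 is the parity check.
NODES = {
    "n1": {1, 2, 3},
    "n2": {3, 4, 5},
    "n3": {1, 5, 6},
    "n4": {1, 4, 7},
    "n5": {2, 5, 7},
    "n6": {3, 6, 7},
    "n7": {2, 4, 6},
}


def capacity(k, d):
    """Max file size C = k*d - binom(k, 2), counted in packets."""
    return k * d - comb(k, 2)


def max_pairwise_overlap(nodes):
    """Largest number of packets shared by any two nodes."""
    return max(len(a & b) for a, b in combinations(nodes.values(), 2))


def repetition_factors(nodes):
    """Map each packet to the number of nodes storing it."""
    counts = {}
    for packets in nodes.values():
        for p in packets:
            counts[p] = counts.get(p, 0) + 1
    return counts


def min_distinct_packets(nodes, k):
    """Fewest distinct packets seen by a user contacting any k nodes."""
    return min(len(set().union(*group))
               for group in combinations(nodes.values(), k))


def repair_plan(nodes, failed):
    """Uncoded repair: for each lost packet, name one surviving node
    (first in name order, an arbitrary choice) that holds a copy."""
    plan = {}
    for packet in sorted(nodes[failed]):
        for name in sorted(nodes):
            if name != failed and packet in nodes[name]:
                plan[packet] = name
                break
    return plan


if __name__ == "__main__":
    print("capacity:", capacity(3, 3))
    print("max overlap:", max_pairwise_overlap(NODES))
    print("min packets from 3 nodes:", min_distinct_packets(NODES, 3))
    print("repair plan for n1:", repair_plan(NODES, "n1"))
```

For this placement the checks agree with the slides: any two nodes share exactly one packet, every packet has 3 replicas, and any 3 nodes hold at least 6 distinct packets, which matches C = 3·3 − 3 = 6. Repairing n1 copies one packet from each of three distinct helpers with no computation.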