SFD15 - WD Tegile
@WesternDigiDC

Narayan Venkat
Phil Bullinger (SVP DC Systems), ex-EMC (Isilon), Oracle, Engenio

Introduction to Data Centre Systems
- 13500+ active patents worldwide
- $26B market capitalisation
- Storage systems, SSDs, HDDs, embedded and removable flash memory
- Roughly half of the world’s data is stored on Western Digital products
- SanDisk, WD, HGST, G-Technology, upthere, Tegile

Strategy and Capabilities
- Ultrastar - Flash and HDD Platform and Integration Services
- ActiveScale - Object Storage, Geo-distributed
- IntelliFlash - Unified Block / File Storage, NVMe, All-Flash and Hybrid
- Density, Capacity, Durability, Integrity, Performance, Manageability

*Narayan Venkat

One Flash Platform, Any workload
- Test/Dev
- Messaging
- Collaboration
- Warehousing
- Analytics
- OLTP

Balanced Performance - Capacity Media (HDD, MLC, TLC)
Extreme Performance - Performance Media (SCM, NVMe, eMLC)
- Hybrid - 10% Flash
- Flash-Hybrid - 30% - 50% Flash
- All-Flash - 100% Flash
- NVMe - 100% NVMe Flash

One OS, Feature Set, User Experience

IntelliFlash Product Portfolio
Hybrid systems are targeted at secondary storage.

IntelliFlash Operating System

Flash Optimized Software Architecture
- Management Flexibility
  - Integration with vSphere, Hyper-V, OpenStack, KVM
  - Automated Call-home, Web UI, RESTful APIs
- Protocol Choice
  - Multiple protocol access
  - FC, iSCSI, NFS, SMB-3, VVOLs
- Data Services
  - Snapshots, Clones, Inline Compression, Inline Deduplication
  - Replication, Disaster Recovery
- Metadata Acceleration
  - Classification, Separation, and Placement of Metadata and Application data
  - Caching, Aggregation, and Scaling for high-speed storage operations
- Media Optimization
  - Delivers media resiliency, flash wear, and data integrity
  - Media-optimized protection, data layout and storage pooling for high performance
- Physical Media
  - Mixture of hard disks, dense flash, performance flash, persistent memory and dynamic RAM

Shailendra Tripathi

Architecture and Design
Run every workload at the speed of persistent memory
- Memory Tier
- Performance Tier
- Capacity Tier

Categorising Physical Devices by Performance into Different Classes
Fully Distributed Storage Architecture
Read and Write I/O Flow Architecture
IntelliFlash Flash Media Optimization
Looking Ahead

*Roger Weeks
Amplidata

ActiveScale
- Archive and Backup
- Active Data for Analytics
- Data Forever

Architecture
- Versioning
- Encryption
- Replication
- Single Pane Management
- S3 Compatible APIs
- Multi-Geo Availability Zones
- Scale Up and Scale Out

ActiveScale Architecture
- Durable
  - BitSpread erasure coding
  - BitDynamics data integrity
- Flexible
  - Single site scale-up and scale-out
  - Two+ sites asynchronous replication
  - Three site availability zones
- Scale-out
  - Metadata and data separate
  - Distributed system nodes store metadata
  - Columns of storage nodes store data

BitSpread - Dynamic Data Placement
- Local - data does not move after ingest
- Performance - predictable across workloads
- Resilient - highly durable data
- ActiveScale EC - http://www.hgst.com/sites/default/files/resources/WP34-ActiveScale-Erasure-Coding-Technology.pdf

BitDynamics - Continuous Data Integrity
- Background - verification process always running
- Performance - not impacted by verification or repair
- Automatic - all repairs happen with no intervention

GeoSpread - Availability Zones
- Single - Distributed erasure coded copy
- Available - Can sustain the loss of an entire site
- Efficient - Better than 2 or 3 copy replication

ActiveScale Replication
Create Regions
- Bucket - asynchronous replication base
- Any-Any - All ActiveScale systems
- Choose - the number of sites you need

ActiveScale Systems
- P100 - starts as low as 720TB, goes to 18PB. 17x 9s data durability, 4.6KVA typical power consumption
- X100 - 5.4PB in a rack, 840TB - 52PB, 17x 9s data durability, 6.5KVA typical power consumption
- Scale out to 9 expansion racks, 52PB scale out per namespace

Use Cases
- M&E
  - Media Archive
  - Tape replacement and augmentation
  - Transcoding
  - Playout
- Life Sciences
  - Bio imaging
  - Genomic Sequencing

Analytics

Mike McWhorter - Senior Technologist, DCS Field Applications Engineering

S3A - S3 Adapter for Hadoop
- HDFS Triple Replication
- Add storage to Hadoop before S3A? Server scale out
- Data locality - data is stored closer to the processor for better performance

The active data is your Working Set (stored on HDFS); everything else can sit on the object store.
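
The capacity argument behind keeping only the Working Set on HDFS can be sketched with some simple arithmetic. This is a rough illustration only: the 8+2 erasure code, the 1000TB data set and the 200TB working set are made-up figures for the example, not ActiveScale's actual BitSpread parameters or anything quoted in the session, and `raw_capacity_tb` and `ErasureCode` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasureCode:
    """Illustrative k+m code (assumed values, not ActiveScale's real settings)."""
    data_fragments: int
    parity_fragments: int

    @property
    def overhead(self) -> float:
        # Raw bytes written per logical byte stored.
        return (self.data_fragments + self.parity_fragments) / self.data_fragments


def raw_capacity_tb(total_tb: float, working_set_tb: float,
                    hdfs_replicas: int = 3,
                    ec: ErasureCode = ErasureCode(8, 2)) -> dict:
    """Compare raw capacity: everything on HDFS vs. working set on HDFS plus object store."""
    if not 0 <= working_set_tb <= total_tb:
        raise ValueError("working set must be between 0 and the total data size")
    cold_tb = total_tb - working_set_tb
    return {
        # Every logical TB is stored hdfs_replicas times.
        "all_hdfs": total_tb * hdfs_replicas,
        # Only the working set pays the replication cost.
        "tiered_hdfs": working_set_tb * hdfs_replicas,
        # The rest pays the (assumed) erasure coding overhead.
        "tiered_object": cold_tb * ec.overhead,
    }


if __name__ == "__main__":
    sizes = raw_capacity_tb(total_tb=1000, working_set_tb=200)
    tiered = sizes["tiered_hdfs"] + sizes["tiered_object"]
    print(f"All on HDFS: {sizes['all_hdfs']:.0f} TB raw")
    print(f"Tiered:      {tiered:.0f} TB raw")
```

With these assumed numbers the tiered layout needs roughly half the raw capacity of the all-HDFS layout. The real ratio depends on the erasure coding policy and on how large the working set actually is.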