DDN’s Vision for the Future of Lustre
LUG2015 Robert Triendl
ddn.com ddn.com 3 Topics
1. The Changing Markets for Lustre
2. A Vision for Lustre that isn’t Exascale
3. Building Lustre for the Future
4. Peak vs. Operational Performance
5. Application Optimized Lustre
6. Why Conventional Storage Still Matters
ddn.com 4 Hyperscale Storage Markets
HPC Cloud Scratch WORM(N) Petabytes Big DataBillions of Files Streaming Write & Data Random Read Large Files Analytics Small Files Infiniband Ethernet Single Location Distributed
ddn.com 5 Lustre Markets Today Archive 12% Work Cloud 31% 2%
Mixed 22%
Data 33%
ddn.com 6 Market Diversification
Cloud Work Data Mixed Use Archive
Weather HPC CAE General HPC Tier 2 Climate Work Chemical Academic Cloud
Genomics Finance Cloud Big Data Energy Security Science
ddn.com 7 Lustre Futures Beyond Exascale
CIFS/NFS Export, AD Integration, RAS Manufacturing Features, Snapshots, Data Management, etc.
Random Performance, Small File & Metadata Genomics Performance, Data Management, Security, etc.
General Broad Application Support, Connectors, User Academic Monitoring, User Access to Snapshot, etc.
Virtualization, Snapshots, Small File Read Cloud Performance, Data Distribution, etc.
Data Management Features, SMR Drive Use, Archive Data Scrubs, Data Distribution, etc.
ddn.com 8 Market Evolution
100%
90%
80%
70% Archive
60% Cloud 50% Mixed 40% Data 30%
20% Work
10%
0% 2012 2013 2014
ddn.com 9 Market Segments
100% 90% 80% 70% 60% Industry 50% Government 40% University 30% 20% 10% 0% 2011 2012 2013 2014
ddn.com 10 Disks: Throughput vs. IOPS
MB/sec IOPS/10 GB/sec
1800 14,000
1600 12,000 1400 10,000 1200
1000 8,000
800 6,000
600 4,000 400 2,000 200
0 0 Lustre 1.8 Lustre 2.4 ExaScaler 2.2 Lustre with BtrFS Next Gen SAS Drives
ddn.com 11 Lustre Development at DDN
▶ Lustre Usability Features
▶ Build-in Reliability and Availability
▶ Lustre Recovery
▶ Features for a Broader Market
▶ Performance for Broad Set of Applications
▶ Application-optimized Lustre
ddn.com 12 Lustre Code Contributions
120
100
80 Other EMC SUSE 60 Bull Cray 40 Seagate DDN
20
0 2.1 2.4.0 2.3.50-2.4.0 2.5.0 2.5.50-2.6.0
ddn.com 13 DDN ExaScaler Software Stack
DDN DDN & Intel Intel HPDD Other
Tape S3 Cloud Object (WOS) Data Fast Data Copy Management ExaScaler Data Management Framework
DDN DirectMon DDN ExaScaler Monitor Monitoring Intel IML & Management DDN ExaScaler
DDN Clients NFS/CIFS/S3 DDN IME Intel Hadoop DDN Lustre Edition
Core FS ldiskfs OpenZFS btrfs Intel DSS
Storage HW DDN Block Storage with SFX Cache Other HW
ddn.com 14 Why BtrFS?
▶ Standard Local Filesystem in RHEL7
▶ Better Throughput Performance than ZFS
▶ Similar Feature Set, but all Linux
▶ No Possible Patent Infringement
▶ Simple Integration and Deployment
ddn.com 15 Application-Optimized Lustre
▶ Lustre for Specific Applications
▶ Workload Profiling
▶ Optimization Across I/O Calls
▶ Optimizing Application Runtime
▶ Working with Customers
ddn.com 16 Genome Pipeline Benchmarks
Lustre 2.5 Client Performance Human Genetics samtools workflow 50 Samtools 20% faster 40 with DDN 30 Lustre 20 Runtime(Hours) optimizations 10
0 2.5.1 DDN Branch
ddn.com 17 SSD Pools and Caching
▶ DSS to Link File Layer to Block Layer
▶ Build into the File System
▶ Better use of SSDs for I/O Optimization
▶ Increased Small File Performance
▶ Increased Random Read to Large Files
▶ Additional specificity with fadvice()
ddn.com 18 ExaScaler Monitoring
• Filesystem, OSS, MDS, OST, MDT, etc. • Lightweight • JOB ID, UID/GID, application stats, etc. • Near real- me • Archive of data by policy • Massive scale
Burst Buffer Monitoring Server collectd
Graphite plugin
UDP(TCP)/IP based small graphite OSS, MDS text message Storage transfer
ddn.com 19 TITECH Examples
64 Billion of Lustre Stats in 15 days!
ddn.com 20 Why Block-Level Raid?
▶ Best Mixed I/O Performance
▶ Consistent Performance
▶ Hardware-optimized Performance
▶ Best Performance During Failure
▶ Integrated Storage Services
ddn.com 21 SFA RAID Stack Performance
▶ Above 1 Million 4K IOPS per 8 CPU Cores
▶ Above 10 GB/sec per 8 CPU Cores
▶ 8 Cores Sufficient for PCI Infrastructure
▶ More Cores for File System Services
▶ Additional Cores for More Functionality
ddn.com 22 SFA Random Read
MB/sec 30
25
20 512K I/O Size 15 1M I/O Size 2M I/O Size 10 4M I/O Size 5
0 1 8 16 24 32 40 48 56
ddn.com 23 Flexible SSU Design
S M L XL
ddn.com 24 SFA14K Performance SSU Monitoring Up to 45 GB/sec Up to 2950 TB 2-4 MDS External MDT MDT Storage 4-6 OSS
OST Storage
ddn.com 25 “Wolfcreek” Hardware
ddn.com 26 “Wolfcreek” Hardware
ddn.com