Intel® Solutions for Lustre* Software SC16 Technical Training
SC’16 Technical Training

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at http://www.intel.com/content/www/us/en/software/intel-solutions-for-lustre-software.html.

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development.
All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

3D XPoint, Intel, the Intel logo, Intel Core, Intel Xeon Phi, Optane and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
* Other names and brands may be claimed as the property of others.
© 2016 Intel Corporation

Introductions
Lustre Overview
Roadmap Deep Dive
10:30 Break
Lustre on ZFS Update
Lunch (provided)
Intel® Omni-Path with Lustre
Knights Landing with Lustre
Introduction to Intel® HPC Orchestrator
Lustre Performance Tuning Review

Intel® Scalable System Framework
A Holistic Solution for All HPC Needs
- Small Clusters Through Supercomputers
- Compute and Data-Centric Computing
- Standards-Based Programmability
- On-Premise and Cloud-Based
Compute: Intel® Xeon® Processors, Intel® Xeon Phi™ Processors, Intel® FPGAs and Server Solutions
Memory / Storage: Intel® Solutions for Lustre*, Intel® Optane™ Technology, 3D XPoint™ Technology, Intel® SSDs
Fabric: Intel® Omni-Path Architecture, Intel® Silicon Photonics, Intel® Ethernet
Software: Intel® HPC Orchestrator, Intel® Software Tools, Intel® Cluster Ready Program, Intel Supported SDVis

Let's go around the room!
9 of Top10 Sites
71% of Top100
- Most Adopted PFS
- Most Scalable PFS
- Open Source GPL v2
- Commercial Packaging
- Vibrant Community
[Pie chart: Lustre 71%, GPFS 18%; the remaining 4% and 7% slices are NFS and Other.]
December 2015: Intel’s Analysis of Top 100 Systems (top100.org)

[Figure: two pie charts, "Commits per organization" and "Lines of code per organization". Intel is labelled 65% in each; the remaining slices (1% to 18%) are split among ORNL*, Seagate*, Cray*, DDN*, Atos*, LLNL*, CEA*, IU and Other.]
Source: Chris Morrone, Lead of OpenSFS Lustre Working Group, April 2016

Bioscience: Genomic data analysis, modeling and simulations
Government research and defense: Government funded research. Surveillance, Signal Processing, encryption etc.
Large-scale manufacturing: Mechanical, computer-aided design & computer-aided engineering systems
Weather and climate: Highly complex CGI rendering
Energy: Seismic processing, reservoir modeling / characterization, sensor data analysis
Finance: Fraud detection, Monte Carlo simulations, risk management analysis

Intel® Scalable System Framework for HPC
Intel® FOUNDATION Edition for Lustre* software: Delivers the latest functions and features, fully supported by Intel. Ideal for organizations that prefer to design and deploy their own open source configurations.
Intel® ENTERPRISE Edition for Lustre* software: Maximum performance with minimal complexity and cost for multi-petabyte file systems. Management with Intel® Manager for Lustre* software.
Intel® CLOUD Edition for Lustre* software: Cost-effective access to parallel storage on Amazon Web Services* (AWS) and Microsoft Azure* to boost cloud-computing

[Screenshot panels: Read/Write Heat Map, OST Balance, Metadata Operations, Read/Write Bandwidth]
[Architecture diagram]
- Management Target (MGT), Metadata Target (MDT), Object Storage Targets (OSTs)
- Metadata Servers (1-10s), Object Storage Servers (10s-1000s)
- Management Network, Intel Manager for Lustre
- High Performance Data Network (InfiniBand*, 10GbE)
- Lustre Clients (1 – 100,000+)
- Native Lustre* Client for Intel® Xeon Phi™ processor
- Intel® Omni-Path Support
- Robin Hood
- OpenZFS, RAIDz
- Hadoop* Adapters
- HSM

Lustre w/ZFS – Unique Features
ZFS System Design
Software Installation
Lustre ZFS HA Overview

Raidz2: Data + 2 parity data protection scheme
Raidz3: Data + 3 parity data protection scheme
Vdev: Collection of devices (e.g. a raidz2 9+2 vdev)
Zpool: Collection of vdevs
- Zpools become Lustre OSTs
- You can have many vdevs in a zpool
L2ARC cache: ZFS read cache

Incredible reliability – Data is always consistent on disk; silent data corruption is detected and corrected; smart rebuild strategy
Compression – Maximize usable capacity for increased ROI
Snapshot – Support built into Lustre; consistent snapshot across all the storage targets without stopping the file system
Hybrid Storage Pool – Data is tiered automatically across DRAM, SSD/NVMe and HDD, accelerating random and small-file read performance
Manageability – Powerful storage pool management makes it easy to assemble and maintain Lustre storage targets from individual devices

Silent data corruption is a real-world issue: “Data ~= Dada”
Causes:
- Interface Design
- Manufacturing Defects
- Cable Defects
- Heat/Power/Vibrations
- Software defects
NetApp study*: 1.5 million drives, 41 months, 400,000 errors
* https://atg.netapp.com/wp-content/uploads/2008/03/corruption-fast08.pdf

On Write: Write data + checksum
On Read: Read data, re-compute the checksum, then compare it to the original
On Error: If running RAIDz, discard the read and reconstruct the data from the vdev; notify the user and continue

Enable more space allocation to users
- Minimizes hardware costs
- More data in the same footprint
Increase the file transfer rate
- Increase throughput by up to 25%
- See Laval University’s presentation from HP CAST 2015: http://www.hp-cast.org/
Compression effects on genomics files
- Text-based output of genomic sequence systems
- Human genome can generate 600GB file size

How Can Lustre* Snapshots Be Used?
- Undo/undelete/recover file(s) from the snapshot
  - A file was removed by mistake, or an application failure left data invalid
- Quickly back up the filesystem before a system upgrade
  - A Lustre/kernel upgrade may hit trouble and need to be rolled back
- Prepare a consistent frozen data view for backup tools
  - Ensure the system is consistent for the whole backup

Intel, the Intel logo, Xeon, and others are trademarks of Intel Corporation in the U.S. and/or other countries.

ZFS-based Lustre* Snapshot Overview
- ZFS snapshot created on each target with a new fsname
- Mount as separate read-only Lustre filesystem on client(s)
Architecture details: http://wiki.lustre.org/Lustre_Snapshots
[Diagram: lctl snapshot commands and lctl API; Lustre control and ZFS control; MGS, MDSs and OSSs, each with the Lustre kernel and a userspace ZFS tools set.]

Global Write Barrier
- “Freeze” the system while the snapshot pieces are created on every target.
- Write barrier on MDTs only
  - No orphans, no dangling references
- New lctl commands for the global write barrier
  - lctl barrier_freeze <fsname> [timeout (seconds)]
  - lctl barrier_thaw <fsname>
  - lctl barrier_stat <fsname>
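The freeze/thaw sequence around a snapshot can be sketched in Python as follows. This is a minimal illustration, not part of the Intel tooling: the helper names `barrier_cmd` and `write_barrier` and the `run` hook are hypothetical, and only the `lctl barrier_freeze`, `lctl barrier_thaw` and `lctl barrier_stat` subcommands and their argument order are taken from the slide. The snapshot step itself is left to the caller.

```python
import subprocess
from contextlib import contextmanager


def barrier_cmd(action, fsname, timeout=None):
    """Build an lctl barrier command as an argument list.

    The subcommand names and argument order follow the slide; this helper
    itself is a hypothetical convenience wrapper.
    """
    if action not in ("freeze", "thaw", "stat"):
        raise ValueError(f"unknown barrier action: {action}")
    cmd = ["lctl", f"barrier_{action}", fsname]
    if timeout is not None:
        if action != "freeze":
            raise ValueError("timeout only applies to barrier_freeze")
        cmd.append(str(int(timeout)))  # timeout is given in seconds
    return cmd


@contextmanager
def write_barrier(fsname, timeout=30, run=None):
    """Hold the global write barrier for the duration of a `with` block.

    `run` is a hypothetical hook that receives each command list; by default
    the command is executed with subprocess and a non-zero exit raises.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd, check=True)
    run(barrier_cmd("freeze", fsname, timeout))
    try:
        yield
    finally:
        # Always thaw, even if the snapshot step inside the block fails.
        run(barrier_cmd("thaw", fsname))
```

A caller would wrap the per-target snapshot step in `with write_barrier("myfs", timeout=30): ...`, where `myfs` is a placeholder file system name. Passing a recording function as `run` gives a dry run that only collects the commands, which is useful for checking the sequence on a machine without Lustre installed.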