IDC HPC User Forum
Intel's Vision: The Scalable System Framework

Bret Costelow

September 2015

Legal Disclaimers

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at https://www-ssl.intel.com/content/www/us/en/high-performance-computing/path-to-aurora.html.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

© 2015 Intel Corporation

Converged Architecture for HPC and Big Data

Applications: FORTRAN / C++ (HPC) | Java* (Big Data)
Programming Model: MPI (HPC) | Hadoop* (Big Data)
Resource Manager: HPC and Big Data-aware resource manager, simple to use, high performance
File System: Lustre* with Hadoop* adapter (remote storage)
Hardware: compute- and Big Data-capable, scalable performance components: server infrastructure, storage (SSDs and burst buffers), Intel® Omni-Path Architecture

*Other names and brands may be claimed as the property of others.
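The HPC column above names MPI as the programming model. As a generic illustration only (not Intel-specific code and not taken from this deck), the short C program below distributes a simple reduction across MPI ranks; the problem size and the quantity being summed are arbitrary.

```c
/* Minimal MPI example in C: each rank computes a partial sum and the
 * results are combined with MPI_Reduce. Illustrative only; the work
 * being distributed is arbitrary. Build: mpicc -o sum sum.c */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long N = 1000000;                 /* arbitrary problem size */
    double local = 0.0;
    for (long i = rank; i < N; i += size)   /* each rank takes a slice */
        local += 1.0 / (double)(i + 1);

    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("partial harmonic sum over %ld terms = %f (%d ranks)\n",
               N, total, size);

    MPI_Finalize();
    return 0;
}
```

Any MPI implementation can build and launch this, e.g. with mpicc and mpirun.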

Intel's Scalable System Framework
A Configurable Design Path Customizable for a Wide Range of HPC & Big Data Workloads

Small Clusters Through Supercomputers
Compute- and Data-Centric Computing
Standards-Based Programmability
On-Premise and Cloud-Based

Compute: Intel® Xeon® Processors, Intel® Xeon Phi™ Coprocessors, Intel® Xeon Phi™ Processors
Memory/Storage: Intel® SSDs, Intel® Lustre*-based Solutions
Fabric: Intel® True Scale Fabric, Intel® Omni-Path Architecture, Intel® Ethernet, Intel® Silicon Photonics Technology
Software: Intel® Software Tools, HPC Scalable Software Stack, Intel® Cluster Ready Program

Another Huge Leap in CPU Performance

Intel's HPC Scalable System Framework: extreme parallel compute performance

Parallel performance
. 72 cores; 2 VPUs per core; 6 DDR4 channels with 384 GB capacity
. >3 teraflop/s per socket¹

Integrated memory
. 16 GB; 5X the bandwidth of DDR4²
. 3 configurable modes (memory, cache, hybrid)

Integrated fabric
. 2 Intel® Omni-Path Fabric ports (more configuration options)

Market adoption
. >50 system providers expected³
. >100 PFLOPS of customer system compute commitments to date³
. Software development kits shipping Q4'15
. Standard IA software programming model and tools
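The integrated memory above can be configured in memory (often called flat), cache, or hybrid mode. As a sketch of how an application might target the in-package memory in flat mode, the C fragment below uses the memkind library's hbwmalloc interface; the array size and the fallback-to-DDR policy are arbitrary choices, and the example is not taken from this deck.

```c
/* Sketch: allocating a hot array from in-package high-bandwidth memory
 * in flat mode via the memkind hbwmalloc interface, falling back to DDR
 * if no high-bandwidth NUMA nodes exist. Build: cc hbw.c -lmemkind */
#include <hbwmalloc.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const size_t n = (size_t)1 << 24;             /* ~16M doubles; arbitrary */
    int use_hbw = (hbw_check_available() == 0);   /* 0 means HBW is present */

    double *a = use_hbw ? hbw_malloc(n * sizeof *a)
                        : malloc(n * sizeof *a);
    if (!a) return 1;

    for (size_t i = 0; i < n; i++)                /* bandwidth-bound touch loop */
        a[i] = (double)i;

    printf("%zu doubles placed in %s memory, a[last]=%.0f\n",
           n, use_hbw ? "high-bandwidth" : "DDR", a[n - 1]);

    if (use_hbw) hbw_free(a);        /* hbw allocations need hbw_free */
    else         free(a);
    return 0;
}
```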

1 Over 3 teraflops of peak theoretical double-precision performance is preliminary and based on current expectations of cores, clock frequency and floating-point operations per cycle. FLOPS = cores x clock frequency x floating-point operations per cycle.
2 Projected result based on internal Intel analysis of the STREAM benchmark using a Knights Landing processor with 16 GB of ultra high-bandwidth memory versus DDR4 memory only with all channels populated.
3 Source: Intel internal information.
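To see how the ">3 teraflop/s per socket" claim follows from the FLOPS formula in footnote 1, the sketch below plugs in the 72 cores and 2 VPUs per core stated on the slide, plus two assumed values that are not in this deck: 32 double-precision FLOPs per core per cycle (two 512-bit FMA units) and an illustrative clock of roughly 1.3 GHz.

```c
/* Back-of-envelope peak FLOPS: FLOPS = cores x clock x FLOPs per cycle.
 * 72 cores and 2 VPUs/core come from the slide; 32 DP FLOPs/cycle
 * (2 VPUs x 8 DP lanes x 2 for FMA) and the 1.3 GHz clock are assumptions. */
#include <stdio.h>

int main(void)
{
    const double cores           = 72.0;
    const double clock_ghz       = 1.3;   /* illustrative, not from the deck */
    const double flops_per_cycle = 32.0;  /* 2 x 8 x 2 (FMA) per core */

    double gflops = cores * clock_ghz * flops_per_cycle;
    printf("peak ~ %.1f TFLOP/s per socket\n", gflops / 1000.0);  /* ~3.0 */
    return 0;
}
```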

Intel's HPC Scalable System Framework

Compute – parallel compute technology: Intel® Xeon®, Intel® Xeon Phi™ (KNL) (eliminates offloading)
Memory/Storage – new memory/storage technologies: 3D XPoint™ Technology, next-generation NVRAM
Fabric – new fabric technology: Intel® Omni-Path Architecture (KNL integration and Intel® Silicon Photonics), Intel® Silicon Photonics
Software – powerful parallel file system: Intel® Enterprise Edition for Lustre* with Xeon Phi™ support today, Hadoop* adapters, Intel® Manager for Lustre*

Multidisciplinary Approach to HPC Performance

Intel moves Lustre forward

• 8 of the top 10 sites
• Most adopted
• Most scalable
• Open source
• Commercial packaging
• 4th largest SW company
• More than 80% of code
• Vibrant community

HPC User Site Census: Middleware – April 2014. Lustre is a registered trademark of Seagate.

*Other names and brands may be claimed as the property of others.

Intel® Solutions for Lustre* Roadmap

Timeline: Q2'15, Q3'15, Q4'15, Q1'16, Q2'16, H2'16, H1'17

Enterprise Edition (HPC, commercial, technical computing, and analytics): releases 2.2, 2.3, 2.4, 3.0, 3.1 and 4.0 across this timeline. Features across these releases include:
• OpenZFS in monitoring mode
• Manager support for multiple MDTs
• Robinhood PE 2.5.5
• FE Lustre 2.7 baseline
• OpenZFS ZIL support
• File-based replication
• RHEL 6.7 server/client support
• Striped MDT directories
• Expanded OpenZFS support
• Layout enhancement
• Hadoop adapter and scheduler connector support in Manager
• RHEL 7 client support
• ZFS performance enhancements
• Progressive file layout
• SLES 12 client support
• SLES 11 SP server/client support
• Support for Omni-Path networking
• Data on MDT
• DKMS support
• Client metadata performance
• DNE II completion
• DSS cache support
• IU shared key network encryption
• IOPS heat maps
• OpenZFS snapshots
• True Scale support
• SELinux and Kerberos
• Directory-based quotas
• Xeon Phi client
• RHEL 7 (server + client)
• RHEL 6.6 support
• SLES 12 (server + client)
• SLES 11 SP3 support

Cloud Edition (HPC, commercial, technical computing, and analytics delivered on cloud services): releases 1.1.1, 1.2 and 2.0. Features across these releases include:
• FE Lustre 2.5.90
• GovCloud support
• Encrypted EBS volumes
• Placement groups added to templates
• Client mount command in console
• IPSec over-the-wire encryption
• Disaster recovery
• Supportability enhancements
• Enhanced DR automation
• Password and security enhancements

Foundation Edition (community Lustre software coupled with Intel support): releases 2.7, 2.8 and 2.9. Features across these releases include:
• Job tracking with changelogs
• LFSCK 3 MDT-MDT consistency verification
• IU UID/GID mapping feature (preview)
• DNE phase IIb development
• Multiple modify RPCs
• NRS delay
• Dynamic LNet config
• lfs setstripe to specify OSTs
• UID/GID mapping

* Some names and brands may be claimed as the property of others.

Key: Shipping | In Development | In Planning. Product placement is not representative of the final launch date within the specified quarter. For more details refer to the ILU and 5-Quarter Roadmap. (The file-layout controls mentioned above, such as lfs setstripe, are illustrated in the sketch below.)
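Several roadmap items above concern file layout (striped MDT directories, progressive file layout, lfs setstripe to specify OSTs). The C sketch below shows what explicit striping control can look like to an application, using the liblustreapi call llapi_file_create(); the mount path, stripe size and stripe count are arbitrary example values, not recommendations from this deck.

```c
/* Sketch: creating a Lustre file with an explicit stripe layout using
 * liblustreapi. The mount path and layout values are arbitrary examples.
 * Build: cc stripe.c -llustreapi */
#include <lustre/lustreapi.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/mnt/lustre/demo/output.dat";  /* hypothetical mount */

    /* 1 MiB stripes, any starting OST (-1), stripe across 4 OSTs,
     * default RAID0 pattern (0). Roughly equivalent to:
     *   lfs setstripe -S 1m -c 4 /mnt/lustre/demo/output.dat */
    int rc = llapi_file_create(path, 1 << 20, -1, 4, 0);
    if (rc < 0) {
        fprintf(stderr, "llapi_file_create: %s\n", strerror(-rc));
        return 1;
    }

    /* The layout is fixed at creation time; open and write as usual. */
    int fd = open(path, O_WRONLY);
    if (fd < 0) return 1;
    const char msg[] = "data striped across 4 OSTs\n";
    write(fd, msg, sizeof msg - 1);
    close(fd);
    return 0;
}
```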

Intel's HPC Scalable System Framework

Compute – parallel compute technology: Intel® Xeon®, Intel® Xeon Phi™ (KNL) (eliminates offloading)
Memory/Storage – new memory/storage technologies: 3D XPoint™ Technology, next-generation NVRAM
Fabric – new fabric technology: Intel® Omni-Path Architecture (KNL integration and Intel® Silicon Photonics), Intel® Silicon Photonics
Software – powerful parallel file system: Intel® Enterprise Edition for Lustre* with Xeon Phi™ support today, Hadoop* adapters, Intel® Manager for Lustre*

A High Performance Fabric
Intel® Omni-Path Fabric: HPC's next-generation fabric (Intel's HPC Scalable System Framework)

Better scaling vs InfiniBand*
. 48-port switch chip – lower switch costs
. 73% higher switch MPI message rate²
. 33% lower switch fabric latency³

Configurable / Resilient / Flexible
. Job prioritization (Traffic Flow Optimization)
. No-compromise resiliency (Packet Integrity Protection, Dynamic Lane Scaling)
. Open-source fabric management suite, OFA-compliant software

Market adoption
. >100 OEM designs¹
. >100K nodes under contract/bid¹

1 Source: Intel internal information. Design win count based on OEM and HPC storage vendors who are planning to offer either Intel-branded or custom switch products, along with the total number of OEM platforms that are currently planned to support custom and/or standard Intel® OPA adapters. Design win count as of July 1, 2015 and subject to change without notice based on vendor product plans.
2 Based on Prairie River switch silicon maximum MPI messaging rate (48-port chip), compared to Mellanox CS7500 Director Switch and Mellanox SB7700/SB7790 Edge switch product briefs (36-port chip) posted on www.mellanox.com as of July 1, 2015.
3 Latency reductions based on Mellanox CS7500 Director Switch and Mellanox SB7700/SB7790 Edge switch product briefs posted on www.mellanox.com as of July 1, 2015, compared to Intel® OPA switch port-to-port latency of 100-110ns, measured data calculated from the difference between a back-to-back osu_latency test and an osu_latency test through one switch hop. 10ns variation due to "near" and "far" ports on an Eldorado Forest switch. All tests performed using Intel® Xeon® E5-2697 v3, Turbo Mode enabled. Up to 60% latency reduction is based on a 1024-node cluster in a full bisectional bandwidth (FBB) Fat-Tree configuration (3-tier, 5 total switch hops), using a 48-port switch for the Intel Omni-Path cluster and a 36-port switch ASIC for either Mellanox or Intel® True Scale clusters. Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.
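The latency figures in footnote 3 come from osu_latency-style measurements (a back-to-back run compared with a run through one switch hop). The C sketch below shows the core of such a ping-pong measurement between two MPI ranks; the message size and iteration count are arbitrary, and it omits the warm-up phase and message-size sweep a real benchmark performs.

```c
/* Minimal MPI ping-pong latency sketch (osu_latency-style core loop).
 * Run with exactly two ranks, e.g.: mpirun -np 2 ./pingpong
 * Message size and iteration count are arbitrary example values. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    char buf[8] = {0};               /* small message to expose latency */
    const int iters = 10000;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) { MPI_Finalize(); return 1; }

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)                   /* one-way latency = half the round trip */
        printf("avg one-way latency: %.2f us\n",
               (t1 - t0) / iters / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}
```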

Intel® Omni-Path Architecture: Changing the Fabric Landscape

CPU-fabric integration increases over time, optimizing performance, density, power and cost:
• Today: Intel® Xeon® processor E5-2600 v3 with an Intel® OPA HFI card (discrete PCIe HFI)
• Multi-chip package integration: Intel® Xeon Phi™ processor (Knights Landing); next Intel® Xeon® processor with a discrete PCIe HFI
• Next generation: next Intel® Xeon Phi™ processor (Knights Hill); next Intel® Xeon® processor

Intel® Omni-Path: Key to Intel's Next-Gen Architecture
Combines with advances in the memory hierarchy to keep data closer to compute, for better data-intensive application performance and energy efficiency.

Today (moving away from the processor, latency grows and bandwidth falls):
• Processor and caches
• Compute node: local memory
• Interconnect fabric
• I/O node: SSD storage
• Interconnect fabric
• Remote storage: parallel file system (hard drive storage)

Future:
• Processor and caches, plus in-package high-bandwidth memory with enough capacity to support local application storage
• Compute node: local memory and local processing; non-volatile memory for temporal storage (latency and capacity)
• Interconnect fabric
• Burst buffer storage: faster checkpointing, quicker recovery, better application performance
• Interconnect fabric
• Remote storage: parallel file system (SSDs and hard drive storage)
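As a sketch of why the burst-buffer tier above speeds up checkpointing: an application writes its state synchronously to fast, nearby storage and relies on an asynchronous drain to the parallel file system. The C fragment below shows only the first step; the burst-buffer path is hypothetical and the drain is assumed to be handled by a separate agent.

```c
/* Sketch: two-tier checkpointing. State is written synchronously to a fast
 * burst-buffer path; draining it to the parallel file system is assumed to
 * happen asynchronously via a separate agent. Paths are hypothetical. */
#include <stdio.h>
#include <stdlib.h>

#define BURST_BUFFER_DIR "/bb/job1234"   /* hypothetical burst-buffer mount */

static int write_checkpoint(const double *state, size_t n, int step)
{
    char path[256];
    snprintf(path, sizeof path, BURST_BUFFER_DIR "/ckpt_%06d.bin", step);

    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(state, sizeof *state, n, f);
    fclose(f);
    return (written == n) ? 0 : -1;
}

int main(void)
{
    const size_t n = (size_t)1 << 20;    /* example state size */
    double *state = calloc(n, sizeof *state);
    if (!state) return 1;

    /* ... the compute loop would update state between checkpoints ... */
    if (write_checkpoint(state, n, 0) != 0)
        fprintf(stderr, "checkpoint to burst buffer failed\n");

    free(state);
    return 0;
}
```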

Aurora | Built on a Powerful Foundation
Breakthrough technologies that deliver massive benefits

Compute: 3rd-generation Intel® Xeon Phi™ (processor code name: Knights Hill) with integrated Intel® Omni-Path Architecture
• >17X performance† (FLOPS per node)
• >12X memory bandwidth†; >30 PB/s aggregate in-package memory bandwidth

Interconnect: 2nd-generation Intel® Omni-Path Architecture
• >20X faster†; >500 TB/s bi-section bandwidth
• >2.5 PB/s aggregate node link bandwidth

File System: Intel® Lustre* Software
• >3X faster†; >1 TB/s file system throughput
• >5X capacity†; >150 PB file system capacity

Source: Argonne National Laboratory and Intel. † Comparison to ANL's largest current system, MIRA.

Backup

Tying It All Together: The Intel® Fabric Product Roadmap Vision

Intel's HPC Scalable System Framework: fabric roadmap (2014, 2015, 2016, future)

HPC fabrics: Intel® True Scale QDR40/80 (40/2x40 Gb) → Intel® Omni-Path Fabric (first generation, Q4'15) → future Intel® Omni-Path Fabric
Enterprise and cloud fabrics: Ethernet 10Gb → 40Gb → 100Gb → ...

Forecast and estimations, in planning and targets.
The FABRIC is fundamental to meeting the growing needs of the datacenter. Intel is extending fabric capabilities, performance and scalability.
Potential future options, subject to change without notice. All timeframes, features, products and dates are preliminary forecasts and subject to change without further notification.

The Interconnect Landscape

Hardware cost estimate¹ comparing FDR InfiniBand (56 Gb/s), EDR InfiniBand (100 Gb/s), Intel® OPA with x8 HFI (58 Gb/s), and Intel® OPA with x16 HFI (100 Gb/s).

The compute/interconnect cost ratio has changed:
• Compute price/performance improvements continue unabated
• Corresponding fabric metrics have been unable to keep pace as a percentage of total cluster costs, which include compute and storage

Challenge: keeping fabric costs in check to free up cluster dollars for increased compute and storage capability. More compute = more FLOPS.

1 Source: Internal analysis based on 256-node to 2048-node clusters configured with Mellanox FDR and EDR InfiniBand products. Mellanox component pricing from www.kernelsoftware.com, prices as of August 22, 2015. Compute node pricing based on a Dell PowerEdge R730 server from www.dell.com, prices as of May 26, 2015. Intel® OPA (x8) utilizes a 2:1 oversubscribed fabric. Analysis limited to fabric sizes of up to 2K nodes.
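To make the compute-versus-fabric cost ratio concrete, the sketch below computes the fabric share of a per-node budget. Every dollar figure is a hypothetical placeholder; none come from this deck or from the sources cited in the footnote above.

```c
/* Sketch: fabric cost as a share of per-node cluster cost.
 * Every price below is a hypothetical placeholder, not a figure from
 * this deck or its cited sources. */
#include <stdio.h>

int main(void)
{
    const double compute_per_node = 7000.0;  /* hypothetical server price   */
    const double hfi_per_node     = 800.0;   /* hypothetical adapter price  */
    const double switch_per_port  = 350.0;   /* hypothetical per-port share */
    const double cable_per_node   = 100.0;   /* hypothetical cable price    */

    double fabric_per_node = hfi_per_node + switch_per_port + cable_per_node;
    double total_per_node  = compute_per_node + fabric_per_node;

    printf("fabric share of per-node cost: %.1f%%\n",
           100.0 * fabric_per_node / total_per_node);
    /* Lowering the fabric share frees budget for more compute and storage. */
    return 0;
}
```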

* Other names and brands may be claimed as the property of others. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. All products, dates, and figures are preliminary and are subject to change without any notice. Copyright © 2015, Intel Corporation.

Intel® Omni-Path Architecture
Evolutionary Approach, Revolutionary Features, End-to-End Solution

High-performance, low-latency fabric designed to scale from entry level to the largest supercomputers

Builds on proven industry technologies
• Innovative new features and capabilities to improve performance, reliability and QoS
• Highly leverages the existing Aries and True Scale fabrics
• Re-use of existing OpenFabrics Alliance* software

Early market adoption
• >100 OEM designs¹
• >100K nodes under contract/bid¹

This device has not been authorized as required by the rules of the Federal Communications Commission. This device is not, and may not be, offered for sale or lease, or sold or leased, until authorization is obtained.

1 Source: Intel internal information. Design win count based on OEM and HPC storage vendors who are planning to offer either Intel-branded or custom switch products, along with the total number of OEM platforms that are currently planned to support custom and/or standard Intel® OPA adapters. Design win count as of July 1, 2015 and subject to change without notice based on vendor product plans.
2 Based on Prairie River switch silicon maximum MPI messaging rate (48-port chip), compared to the Mellanox Switch-IB* product (36-port chip) posted on www.mellanox.com as of August 13, 2015.
3 Latency reductions based on Mellanox SB7700/SB7790 Edge switch product briefs posted on www.mellanox.com as of August 13, 2015 with a stated latency of 90ns, compared to Intel® OPA switch port-to-port latency of 100-110ns, measured data calculated from the difference between a back-to-back osu_latency test and an osu_latency test through one switch hop. 10ns variation due to "near" and "far" ports on an Eldorado Forest switch. All tests performed using Intel® Xeon® E5-2697 v3, Turbo Mode enabled. Up to 33% latency reduction is based on a 1024-node cluster in a full bisectional bandwidth (FBB) Fat-Tree configuration, using a 48-port switch for the Intel Omni-Path cluster (3 switch hops) and a 36-port switch ASIC for either Mellanox or Intel® True Scale clusters (5 switch hops). Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

* Other names and brands may be claimed as the property of others. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. All products, dates, and figures are preliminary and are subject to change without any notice. Copyright © 2015, Intel Corporation.

New Highly Scalable Open Software Stack
Intel-led industry collaboration underway to create a complete and cohesive HPC system stack

Intel's HPC Scalable System Framework

An open community effort
• Broad range of ecosystem partners
• Designing, developing and testing
• Open-source availability enables differentiation

Benefits the entire HPC ecosystem
• Innovative system hardware and software
• Simplify configuration, management and use
• Accelerate application development
• Turnkey to customizable, Intel-supported versions

lustre.intel.com