How Intel® Parallel Studio XE 2015 Helps Make Faster Code Faster for HPC
Total Page:16
File Type:pdf, Size:1020Kb
Intel Software Conference 2015 Игорь Лопатин Руководитель группы по разработке ПО Intel Intel Software and Services Group, Developer Products Division 11 марта, 2015 Intel Software & Services Group Network Infrastructure Cloud API Services Cloud and Data Center Products Client Software Platforms Intel Developer Zone * Other names and brands may be claimed as the property of others. 2 Intel Software & Services – глобальное присутствие ~14,000+ сотрудников, более 35 офисов 3 Intel в России – активный рост Saint Petersburg ~1000 сотрудников в 4 городах Moscow 80% исследования и разработки Nizhniy Novgorod Novosibirsk 4 Intel Software в России Российские разработчики играют ключевую роль • Intel Parallel Studio, Cluster Tools, Compilers, Performance Libraries, Threading Tools, GPAs, Media SDK, и т.д. • Поддержка и разработка ~ 70 продуктов • Около 20 релизов ПО ежемесячно Программы Intel для разработчиков • Intel® Developer Zone www.intel.ru/software • Intel® Software Partner Program www.intel.ru/partner • Intel - INTUIT Academy http://www.intuit.ru/catalog/se/intel/ Центры компетенции в ведущих российских 5 ВУЗах Создавайте инновационные приложения вместе с Intel Intel® Intel® Intel® Media Intel® Integrated Intel® XDK Parallel Studio XE System Studio Server Studio Native Developer Experience Technical Embedded Media Development Native App Web and hybrid Computing Systems/Device Software Development HTML5 App Enterprise, and HPC Development Software Development Software Software Software 6 Optimization Notice Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Technical Computing Intel® Parallel Studio XE Enterprise, and HPC Software Улучшение масштабируемости, устойчивости и производительности приложений Модели Лидирующая параллельного Средства анализа производительно программирован программ сть ия . Продвинутые компиляторы и . Параллелизм по данным и по . Выявление параллелизма библиотеки задачам . Анализ памяти и потоков . Платформы Linux*, Windows*, . Распределенные вычисления . Анализ производительности OS X Создание Построение Проверка Улучшение 7 Optimization Notice Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 Composer Edition Professional Edition Cluster Edition Intel® C++ Compiler Intel® C++ Compiler Intel® C++ Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Integrated Performance Intel® Integrated Performance Intel® Integrated Performance Primitives Primitives Primitives Intel® Math Kernel Library Intel® Math Kernel Library Intel® Math Kernel Library Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® OpenMP* Intel® OpenMP* Intel® OpenMP* Intel® Advisor XE Intel® Advisor XE Intel® Inspector XE Intel® Inspector XE Intel® VTune™ Amplifier XE Intel® VTune™ Amplifier XE Intel® MPI Library Intel® Trace Analyzer and Collector Компиляторы Пакет средств для Пакет средств для систем с и библиотеки систем с общей памятью распределенной памятью 8 Optimization Notice Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. User Upgrade Program Available TODAY Intel® Xeon Phi™ Product Family Industry and User Momentum 1 TFLOPS1 3+ TFLOPS2 Announcing -Bootable Processor 2H’15 -On-Pkg, High BW Memory Knights First Knights -Integrated Fabric Commercial Knights Systems Corner Landing Hill 3rd Generation Intel® Xeon Phi™ Product Family Knights + Landing 2nd Generation Intel Omni-Path systems Architecture many more Intel® Xeon Phi™ providers 3 card-based systems Coprocessor – Applications >50 expected 10nm process and Solutions Catalog technology >100 PFLOPS customer system compute commits to-date3 1 Claim based on calculated theoretical peak double precision performance capability for a single coprocessor. 16 DP FLOPS/clock/core * 61 cores * 1.23GHz = 1.208 TeraFLOPS 2Over 3 Teraflops of peak theoretical double-precision performance is preliminary and based on current expectations of cores, clock frequency and floating point operations per cycle. FLOPS = cores x clock frequency x floating-point operations per second per cycle. 3 Intel internal estimate Announcing Intel® Omni Scale—The Next-Generation Fabric INTEGRATION . Designed for Maximum Scalability Intel® Omni Intel® Omni Scale Fabric Scale Fabric . Rich Set of Programming Models . Flexible Configurations . End-to-End Solution Starting with Future 14nm Knights Landing generation Coming in ‘15 Open Intel® True Scale PCIe Edge Director Intel Silicon Fabric Upgrade Software √ Adapters √ Switches √ Systems √ Photonics √ Program Helps Your Tools* Transition *OpenFabrics Alliance Other brands and names are the property of their respective owners Спасибо! 11 ©2014, Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, Intel Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 12 How Intel® Parallel Studio XE 2015 helps make Faster Code Faster for HPC HPC Cluster Cluster Edition MPI error checking Multi-fabric MPI library and tuning Professional Edition MPI Messages Threading design & Parallel performance Memory & thread prototyping tuning correctness Composer Edition Intel® C++ and Parallel models Vectorized Optimized libraries & Threaded Fortran compilers (e.g., OpenMP*) Node 13 Available Standalone or in Cluster suite Elevate Development Tools to a Comprehensive Shared, Achieve High Performance for MPI Distributed & Hybrid Application Development Suite Cluster Applications with MPI Library Intel® MPI Library+ + Compilers MPI Library - MPICH Based Performance Libs MPI 3.0 Standard Benchmarks & Tuning Analysis Tools Also Available in Intel® Parallel Studio XE 2015 Cluster Edition+ +Available for Windows** and Linux* 14 Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice Intel® C++ and Fortran Compilers 15.0 Productive language-level parallelism models for improved application performance . Common to both . New OpenMP 4.0 vectorization simplifies taking advantage of SIMD instructions for great performance on Intel® Xeon® and Xeon Phi™ processors and coprocessors . Improved compiler optimization reports help quickly identify optimization opportunities. For Windows-based developers, Visual Studio* 2010, 2012 and 2013 integration is included. Linux*, OS X*, Windows*, Android* . Available now in a variety of configurations to suit different development needs. C++ Info Fortran Info . Intel® C++ Compiler . Intel Cilk™ Plus keywords for parallelism simplify implementation of task and data parallelism . Complete C++11 support . Intel® Fortran Compiler . Support for the latest Fortran standards . Rogue Wave* IMSL* Fortran Numerical Libraries: Performance add-on for Intel® Fortran Windows suites 15 Performance without compromise Intel® C++ & Fortran Compilers Vector and parallel programming improvements . Explicit vectorization achieves predictable vectorization . Similar to what OpenMP* does for parallelization AVX-512 . Maps threaded execution to SIMD hardware . Memory alias and alignment analysis improvements up to AVX / AVX2 2x . Expanded alignment and restrict attributes faster Hardware Support SSE / up to SSE2 4x . Skylake support faster . Knights Landing (KNL) support Peak single precision floating point performance . Compute on Intel Graphics Standards: Full C++ 2011, Full Fortran 2003 support, Fortran 2008 BLOCK support, OpenMP* 4.0 16 Intel® Parallel Studio XE 2015 Composer Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 PricingEdition and ConfigurationProfessional Edition Cluster Edition from $699 from $1,699 from $2949 Intel® C++ Compiler Intel® C++ Compiler Intel® C++ Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Integrated Performance Primitives Intel® Integrated Performance Primitives Intel® Integrated Performance Primitives Intel® Math Kernel Library Intel® Math Kernel Library Intel® Math Kernel Library Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® OpenMP* Intel® OpenMP* Intel® OpenMP* Intel® Advisor XE Intel® Advisor XE Intel® Inspector XE Intel® Inspector XE Intel® VTune™ Amplifier XE Intel® VTune™ Amplifier XE Intel® MPI Library Intel® Trace Analyzer and Collector Bundle or Add-on: Add-on: Add-on: Rogue Wave IMSL* Library Rogue Wave IMSL* Library Rogue Wave IMSL* Library Additional configurations including, floating and academic, are available at: http://intel.ly/perf-tools 17 Legal Disclaimer & Optimization Notice INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in