What's Inside Intel® Parallel Studio XE

Accelerate Parallel Code, Transform Enterprise to Cloud & HPC to AI Applications
Klaus-Dieter Oertel
Intel CVCG Developer Products Division
CERN, 14 Nov 2018

What’s Inside Intel® Parallel Studio XE
Comprehensive Software Development Tool Suite

Editions: Composer Edition, Professional Edition, Cluster Edition

BUILD: Compilers & Libraries
- C / C++, Fortran Compilers
- Intel® Math Kernel Library
- Intel® Data Analytics Acceleration Library
- Intel Threading Building Blocks: C++ Threading
- Intel® Integrated Performance Primitives: Image, Signal & Data Processing
- Intel® Distribution for Python*: High Performance Python

ANALYZE: Analysis Tools
- Intel® VTune™ Amplifier: Performance Profiler
- Intel® Inspector: Memory & Thread Debugger
- Intel® Advisor: Vectorization Optimization, Thread Prototyping & Flow Graph Analysis

SCALE: Cluster Tools
- Intel® MPI Library: Message Passing Interface Library
- Intel® Trace Analyzer & Collector: MPI Tuning & Analysis
- Intel® Cluster Checker: Cluster Diagnostic Expert System

Operating System: Windows*, Linux*, MacOS* [1]
Intel® Architecture Platforms
[1] Available only in the Composer Edition.

Optimization Notice
Copyright © 2018, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.

What’s New in the 2019 Version
Intel® Parallel Studio XE: Accelerate Parallel Code, Transform Cloud, HPC & AI
- Improve application performance on Intel® Xeon® Scalable and Core™ processors with new enhancements in compilers, performance libraries and analysis tools:
  - Vectorize and thread your code (using OpenMP*) to take advantage of the latest SIMD-enabled hardware, including Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
  - Accelerate diverse workloads for enterprise, cloud, HPC and AI
- Extend HPC solutions on the path to Exascale: gain greater scalability and reduce latency with the next-generation Intel® MPI Library.
- Use a new, more accessible user interface in Intel® VTune™ Amplifier for a simplified profiling workflow with familiar terminology and logical groupings.
- Preview a new platform profiler for longer, higher-level performance analysis.
- Visualize parallelism with a rapid visual prototyping environment: interactively build, validate, and visualize parallel algorithms with Intel® Advisor’s Flow Graph Analyzer.
- Speed machine learning by enabling new high performance Python* capabilities.
- Supports industry standards and IDEs.

Optimize Efficiently with Valuable Resources

Shortcut Optimization

Intel® Parallel Studio XE
- Overview, features, support, code samples
- Training materials, Tech.Decoded webinars, how-to videos & articles
- Reviews & Case Studies
- More Intel® Software Development Products

Intel Code Modernization Program
- Overview
- Live training
- Remote Access

Sign up now
Attend TEC Webinars!
https://intel.ly/2PdkNhN

Build, Analyze, Scale

Build (included in Composer Edition)
- Intel® C++ Compiler
- Intel® Fortran Compiler
- Intel® Distribution for Python*
- Intel® Math Kernel Library
- Intel® Integrated Performance Primitives
- Intel® Threading Building Blocks
- Intel® Data Analytics Acceleration Library

Analyze (part of the Professional Edition)
- Intel® VTune™ Amplifier
- Intel® Advisor
- Intel® Inspector

Scale (part of the Cluster Edition)
- Intel® MPI Library
- Intel® Trace Analyzer & Collector
- Intel® Cluster Checker

Fast, Scalable, Parallel Code with Intel® Compilers
Deliver Industry-leading C/C++ & Fortran Code Performance, Unleash the Power of the Latest Intel® Processors
- Develop optimized and vectorized code for Intel® architectures, including Intel® Xeon® processors
- Leverage language and OpenMP* standards, and compatibility with leading compilers & IDEs
Learn More: software.intel.com/intel-compilers

What’s New in Intel® Compilers 2019 (19.0)

Updates to All Versions
- Advance Support for Intel® Architecture: use Intel® Compilers to generate optimized code for Intel Atom® processor through Intel® Xeon® Scalable processors.
- Achieve Superior Parallel Performance: vectorize & thread your code (using OpenMP*) to take advantage of the latest SIMD-enabled hardware, including Intel® Advanced Vector Extensions 512 (Intel® AVX-512).

What’s New in C++
Additional C++17 Standard feature support
- Enjoy improvements to lambda & constant expression support
- Improved GNU C++ & Microsoft C++ compiler compatibility
Standards-driven parallelization for C++ developers
- Partial OpenMP* 5 [1] support
- Modernize your code by using the latest parallelization specifications

What’s New in Fortran
Substantial Fortran 2018 support including
- Coarray features: EVENTS & COSHAPE
- IMPORT statement enhancements
- Default module accessibility
Complete OpenMP 4.5 support; user-defined reductions
Check shape option for runtime array conformance checking

[1] OpenMP 5 is currently a draft

Accelerate Python* with Intel® Distribution for Python*
High Performance Python* for Scientific Computing, Data Analytics, Machine & Deep Learning

Faster Performance
Performance Libraries, Parallelism, Multithreading, Language Extensions
- Accelerated NumPy/SciPy/scikit-learn with Intel® MKL [1] & Intel® DAAL [2]
- Data analytics, machine learning & deep learning with scikit-learn, pyDAAL, TensorFlow* & Caffe*
- Scale with Numba* & Cython*
- Includes optimized mpi4py, works with Dask* & PySpark*
- Optimized for latest Intel® architecture

Greater Productivity
Prebuilt & Accelerated Packages
- Prebuilt & optimized packages for numerical computing, machine/deep learning, HPC, & data analytics
- Drop-in replacement for existing Python. No code changes required
- Jupyter* notebooks, Matplotlib included
- Free download & free for all uses including commercial deployment

Ecosystem Compatibility
Supports Python 2.7 & 3.6, Conda & PIP
- Supports Python 2.7 & 3.6, optimizations integrated in Anaconda* Distribution
- Distribution & optimized packages available via Conda, PIP, APT GET, YUM, & DockerHub, numerical performance optimizations integrated in Anaconda Distribution
- Optimizations upstreamed to main Python trunk
- Priority Support with Intel® Parallel Studio XE

Operating System: Windows*, Linux*, MacOS* (MacOS available only in Intel® Parallel Studio Composer Edition)
Intel® Architecture Platforms
[1] Intel® Math Kernel Library
[2] Intel® Data Analytics Acceleration Library
Learn More: software.intel.com/distribution-for-python

Faster Python* with Intel® Distribution for Python*

Advance Performance Closer to Native Code
- Accelerated NumPy, SciPy, Scikit-learn for scientific computing, machine learning & data analytics
- Drop-in replacement for existing Python, no code changes required
- Highly optimized for the latest Intel® processors

What’s New in the 2019 Release
- Faster machine learning with Scikit-learn: Support Vector Machine (SVM) & K-means prediction, accelerated with Intel® Data Analytics Acceleration Library
- Includes machine learning XGBoost library (Linux* only)
- Also available as easy command line standalone install

[Figure: bar chart "Close to Native Code Scikit-learn* Performance with Intel® Distribution for Python* 2019, compared to stock Python packages on Intel® Xeon® processors". Y axis: performance efficiency measured against native code with Intel® DAAL, 0% to 100%. Series: Stock Python; Intel® Distribution for Python* 2019. Benchmarks (problem size): cosine dist (1K x 15K), correlation dist (1K x 15K), kmeans.fit (1M x 50), kmeans.predict (1M x 50), linear_reg.fit (1M x 50), linear_reg.predict (1M x 50), ridge_reg.fit (1M x 50), ridge_reg.predict (1M x 50), svm.fit (binary, 10K x 1K), svm.predict (binary, 10K x 1K).]

Performance results are based on testing as of July 9, 2018 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information, see Performance Benchmark Test Disclosure. Testing by Intel as of July 9, 2018.
Configuration:
- Stock Python: python 3.6.6 hc3d631a_0 installed from conda, NumPy 1.15, numba 0.39.0, llvmlite 0.24.0, scipy 1.1.0, scikit-learn 0.19.2 installed from pip
- Intel Python: Intel® Distribution for Python* 2019 Gold: python 3.6.5 intel_11, NumPy 1.14.3 intel_py36_5, mkl 2019.0 intel_101, mkl_fft 1.0.2 intel_np114py36_6, mkl_random 1.0.1 intel_np114py36_6, numba 0.39.0 intel_np114py36_0, llvmlite 0.24.0 intel_py36_0, scipy 1.1.0 intel_np114py36_6, scikit-learn 0.19.1 intel_np114py36_35
- OS: CentOS Linux 7.3.1611, kernel 3.10.0-514.el7.x86_64
- Hardware:
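
Example: vectorize and thread with OpenMP*

The compiler slides recommend vectorizing and threading code with OpenMP* to exploit SIMD hardware such as Intel® AVX-512. The following is a minimal C++ sketch of that idea, not code from the presentation. The function name `dot` is a hypothetical helper chosen for illustration. It assumes the compiler is invoked with OpenMP enabled (for example `-qopenmp` with the Intel® C++ Compiler or `-fopenmp` with GCC); without such a flag the pragma is ignored and the loop simply runs serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration: dot product of two equal-length vectors.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());  // precondition: same length
    double sum = 0.0;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(a.size());
    // "parallel for" divides the iterations among threads, "simd" asks the
    // compiler to vectorize each thread's share of the loop, and
    // "reduction(+ : sum)" gives each thread and SIMD lane a private partial
    // sum that is combined into `sum` when the loop finishes.
    #pragma omp parallel for simd reduction(+ : sum)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling `dot` on two 1000-element vectors filled with 1.5 and 2.0 returns 3000. Whether the loop is actually vectorized with AVX-512 instructions depends on the target options and the compiler's own analysis, so a vectorization report or a tool such as Intel® Advisor is the way to confirm it.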
