How Intel® Parallel Studio XE 2015 Helps Make Faster Code Faster for HPC

Total Page:16

File Type:pdf, Size:1020Kb

How Intel® Parallel Studio XE 2015 Helps Make Faster Code Faster for HPC Intel Software Conference 2015 Игорь Лопатин Руководитель группы по разработке ПО Intel Intel Software and Services Group, Developer Products Division 11 марта, 2015 Intel Software & Services Group Network Infrastructure Cloud API Services Cloud and Data Center Products Client Software Platforms Intel Developer Zone * Other names and brands may be claimed as the property of others. 2 Intel Software & Services – глобальное присутствие ~14,000+ сотрудников, более 35 офисов 3 Intel в России – активный рост Saint Petersburg ~1000 сотрудников в 4 городах Moscow 80% исследования и разработки Nizhniy Novgorod Novosibirsk 4 Intel Software в России Российские разработчики играют ключевую роль • Intel Parallel Studio, Cluster Tools, Compilers, Performance Libraries, Threading Tools, GPAs, Media SDK, и т.д. • Поддержка и разработка ~ 70 продуктов • Около 20 релизов ПО ежемесячно Программы Intel для разработчиков • Intel® Developer Zone www.intel.ru/software • Intel® Software Partner Program www.intel.ru/partner • Intel - INTUIT Academy http://www.intuit.ru/catalog/se/intel/ Центры компетенции в ведущих российских 5 ВУЗах Создавайте инновационные приложения вместе с Intel Intel® Intel® Intel® Media Intel® Integrated Intel® XDK Parallel Studio XE System Studio Server Studio Native Developer Experience Technical Embedded Media Development Native App Web and hybrid Computing Systems/Device Software Development HTML5 App Enterprise, and HPC Development Software Development Software Software Software 6 Optimization Notice Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Technical Computing Intel® Parallel Studio XE Enterprise, and HPC Software Улучшение масштабируемости, устойчивости и производительности приложений Модели Лидирующая параллельного Средства анализа производительно программирован программ сть ия . Продвинутые компиляторы и . Параллелизм по данным и по . Выявление параллелизма библиотеки задачам . Анализ памяти и потоков . Платформы Linux*, Windows*, . Распределенные вычисления . Анализ производительности OS X Создание Построение Проверка Улучшение 7 Optimization Notice Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 Composer Edition Professional Edition Cluster Edition Intel® C++ Compiler Intel® C++ Compiler Intel® C++ Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Integrated Performance Intel® Integrated Performance Intel® Integrated Performance Primitives Primitives Primitives Intel® Math Kernel Library Intel® Math Kernel Library Intel® Math Kernel Library Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® OpenMP* Intel® OpenMP* Intel® OpenMP* Intel® Advisor XE Intel® Advisor XE Intel® Inspector XE Intel® Inspector XE Intel® VTune™ Amplifier XE Intel® VTune™ Amplifier XE Intel® MPI Library Intel® Trace Analyzer and Collector Компиляторы Пакет средств для Пакет средств для систем с и библиотеки систем с общей памятью распределенной памятью 8 Optimization Notice Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. User Upgrade Program Available TODAY Intel® Xeon Phi™ Product Family Industry and User Momentum 1 TFLOPS1 3+ TFLOPS2 Announcing -Bootable Processor 2H’15 -On-Pkg, High BW Memory Knights First Knights -Integrated Fabric Commercial Knights Systems Corner Landing Hill 3rd Generation Intel® Xeon Phi™ Product Family Knights + Landing 2nd Generation Intel Omni-Path systems Architecture many more Intel® Xeon Phi™ providers 3 card-based systems Coprocessor – Applications >50 expected 10nm process and Solutions Catalog technology >100 PFLOPS customer system compute commits to-date3 1 Claim based on calculated theoretical peak double precision performance capability for a single coprocessor. 16 DP FLOPS/clock/core * 61 cores * 1.23GHz = 1.208 TeraFLOPS 2Over 3 Teraflops of peak theoretical double-precision performance is preliminary and based on current expectations of cores, clock frequency and floating point operations per cycle. FLOPS = cores x clock frequency x floating-point operations per second per cycle. 3 Intel internal estimate Announcing Intel® Omni Scale—The Next-Generation Fabric INTEGRATION . Designed for Maximum Scalability Intel® Omni Intel® Omni Scale Fabric Scale Fabric . Rich Set of Programming Models . Flexible Configurations . End-to-End Solution Starting with Future 14nm Knights Landing generation Coming in ‘15 Open Intel® True Scale PCIe Edge Director Intel Silicon Fabric Upgrade Software √ Adapters √ Switches √ Systems √ Photonics √ Program Helps Your Tools* Transition *OpenFabrics Alliance Other brands and names are the property of their respective owners Спасибо! 11 ©2014, Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, Intel Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 12 How Intel® Parallel Studio XE 2015 helps make Faster Code Faster for HPC HPC Cluster Cluster Edition MPI error checking Multi-fabric MPI library and tuning Professional Edition MPI Messages Threading design & Parallel performance Memory & thread prototyping tuning correctness Composer Edition Intel® C++ and Parallel models Vectorized Optimized libraries & Threaded Fortran compilers (e.g., OpenMP*) Node 13 Available Standalone or in Cluster suite Elevate Development Tools to a Comprehensive Shared, Achieve High Performance for MPI Distributed & Hybrid Application Development Suite Cluster Applications with MPI Library Intel® MPI Library+ + Compilers MPI Library - MPICH Based Performance Libs MPI 3.0 Standard Benchmarks & Tuning Analysis Tools Also Available in Intel® Parallel Studio XE 2015 Cluster Edition+ +Available for Windows** and Linux* 14 Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice Intel® C++ and Fortran Compilers 15.0 Productive language-level parallelism models for improved application performance . Common to both . New OpenMP 4.0 vectorization simplifies taking advantage of SIMD instructions for great performance on Intel® Xeon® and Xeon Phi™ processors and coprocessors . Improved compiler optimization reports help quickly identify optimization opportunities. For Windows-based developers, Visual Studio* 2010, 2012 and 2013 integration is included. Linux*, OS X*, Windows*, Android* . Available now in a variety of configurations to suit different development needs. C++ Info Fortran Info . Intel® C++ Compiler . Intel Cilk™ Plus keywords for parallelism simplify implementation of task and data parallelism . Complete C++11 support . Intel® Fortran Compiler . Support for the latest Fortran standards . Rogue Wave* IMSL* Fortran Numerical Libraries: Performance add-on for Intel® Fortran Windows suites 15 Performance without compromise Intel® C++ & Fortran Compilers Vector and parallel programming improvements . Explicit vectorization achieves predictable vectorization . Similar to what OpenMP* does for parallelization AVX-512 . Maps threaded execution to SIMD hardware . Memory alias and alignment analysis improvements up to AVX / AVX2 2x . Expanded alignment and restrict attributes faster Hardware Support SSE / up to SSE2 4x . Skylake support faster . Knights Landing (KNL) support Peak single precision floating point performance . Compute on Intel Graphics Standards: Full C++ 2011, Full Fortran 2003 support, Fortran 2008 BLOCK support, OpenMP* 4.0 16 Intel® Parallel Studio XE 2015 Composer Intel® Parallel Studio XE 2015 Intel® Parallel Studio XE 2015 PricingEdition and ConfigurationProfessional Edition Cluster Edition from $699 from $1,699 from $2949 Intel® C++ Compiler Intel® C++ Compiler Intel® C++ Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Fortran Compiler Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Threading Building Blocks Intel® Integrated Performance Primitives Intel® Integrated Performance Primitives Intel® Integrated Performance Primitives Intel® Math Kernel Library Intel® Math Kernel Library Intel® Math Kernel Library Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® Cilk™ Plus Intel® OpenMP* Intel® OpenMP* Intel® OpenMP* Intel® Advisor XE Intel® Advisor XE Intel® Inspector XE Intel® Inspector XE Intel® VTune™ Amplifier XE Intel® VTune™ Amplifier XE Intel® MPI Library Intel® Trace Analyzer and Collector Bundle or Add-on: Add-on: Add-on: Rogue Wave IMSL* Library Rogue Wave IMSL* Library Rogue Wave IMSL* Library Additional configurations including, floating and academic, are available at: http://intel.ly/perf-tools 17 Legal Disclaimer & Optimization Notice INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in
Recommended publications
  • Driving Performance with Intel® Advisor's Flow Graph Analyzer
    Software THE PARALLEL UNIVERSE Driving Code Performance with Intel® Advisor’s Flow Graph Analyzer Modernize your Code with Issue Intel® Parallel Studio XE 30 Enabling FPGAs for Software Developers 2017 CONTENTS Letter from the Editor 3 Meet Intel® Parallel Studio XE 2018 by Henry A. Gabb, Senior Principal Engineer, Intel Corporation Driving Performance with Intel® Advisor’s Flow Graph Analyzer 5 by Vasanth Tovinkere, Architect, Intel® Flow Graph Analyzer; Pablo Reble, Software Engineer; Farshad Akhbari, Perceptual Computing Technical Lead; and Palanivel Guruvareddiar, Perceptual Computing Software Architect; Intel Corporation FEATURE Welcome to the Adult World, OpenMP* 19 by Barbara Chapman, Professor, Stony Brook University, and Director of Computer Science and Mathematics, Brookhaven National Laboratory Enabling FPGAs for Software Developers 25 by Bernhard Friebe, Senior Director of FPGA Software Solutions Marketing, Intel Corporation, and James Reinders, HPC Enthusiast Modernize Your Code for Performance, Portability, and Scalability 37 by Jackson Marusarz, Technical Consulting Engineer, Intel Corporation Dealing with Outliers 45 by Oleg Kremnyov, QA Engineer; Mikhail Averbukh, Software Engineer; and Ivan Kuzmin, Software Engineering Manager; Intel Corporation Tuning for Success with the LatestSIMD Extensions 57 and Intel® Advanced Vector Extensions 512 by Xinmin Tian, Senior Principal Engineer; Hideki Saito, Principal Engineer; Sergey Kozhukhov, Senior Staff Engineer; and Nikolay Panchenko, Staff Engineer; Intel Compiler and Language Lab, Intel Corporation Effectively Using Your Whole Cluster 77 by Rama Kishan Malladi, Technical Marketing Engineer, Intel Corporation Is Your Cluster Healthy? 85 by Brock A. Taylor, HPC Solution Architect, Intel Corporation Optimizing HPC Clusters 89 by Michael Hebenstreit, Data Center Engineer, Intel Corporation For more complete information about compiler optimizations, see our Optimization Notice.
    [Show full text]
  • Intel® Parallel Studio XE 2013 SP1 Product Brief
    EFFICIENTLY PRODUCE FAST, SCALABLE, AND RELIABLE APPLICATIONS Intel® Parallel Studio XE 2013 SP1 Product Brief Top Features Leading Development Tools for Top Performance . Industry-leading application Deliver top application performance while minimizing development, tuning and testing time, and performance that scales as effort. Intel® Parallel Studio XE provides C/C++ and Fortran developers cutting edge performing processor core count and vector compilers and libraries, the right parallel programming models, and complementary and width increase compatible analysis tools. It plugs seamlessly into Microsoft Visual Studio* and the GNU tool chain to keep you productive while preserving your development environment investment. Efficiently scale on tomorrow’s Boost performance for your applications as they run on today’s and tomorrow’s IA-compatible hardware while preserving processors and coprocessors, including Intel® Xeon® processors and Intel® Xeon Phi™ investment in existing code coprocessors. Compatible with leading development environments Intel Parallel Studio XE includes the next generation of software development tools: . Intel® C, C++ and Fortran Compilers – Industry-Leading Compilers “If you need performance for your . Intel® Math Kernel Library (Intel® MKL) and Intel® Integrated Performance Primitives C++ and/or Fortran applications, (Intel® IPP) – Performance Libraries you need to try Intel software . Intel® Threading Building Blocks (Intel® TBB) and Intel® Cilk™ Plus – Parallel Programming development products." Models Dr. Artur Guzik, Senior Engineer, Neubrex. Co., . Intel® Advisor XE – Threading Prototyping Tool Ltd. Intel® VTune™ Amplifier XE – Advanced Threading and Performance Profiler . Intel® Inspector XE – Memory and Threading Debugger "The new VTune Amplifier XE Get more performance with less effort. Optimize performance by using the latest versions of brings even more capability to an broadly used Intel® Software Development Products.
    [Show full text]
  • Intel® Technology Journal | Volume 18, Issue 2, 2014
    Intel® Technology Journal | Volume 18, Issue 2, 2014 Publisher Managing Editor Content Architect Cory Cox Stuart Douglas Paul Cohen Program Manager Technical Editor Technical Illustrators Paul Cohen David Clark MPS Limited Technical and Strategic Reviewers Bruce J Beare Tracey Erway Michael Richmond Judi A Goldstein Kira C Boyko Miao Wei Paul De Carvalho Ryan Tabrah Prakash Iyer Premanand Sakarda Ticky Thakkar Shelby E Hedrick Intel® Technology Journal | 1 Intel® Technology Journal | Volume 18, Issue 2, 2014 Intel Technology Journal Copyright © 2014 Intel Corporation. All rights reserved. ISBN 978-1-934053-63-8, ISSN 1535-864X Intel Technology Journal Volume 18, Issue 2 No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4744. Requests to the Publisher for permission should be addressed to the Publisher, Intel Press, Intel Corporation, 2111 NE 25th Avenue, JF3-330, Hillsboro, OR 97124-5961. E-Mail: [email protected]. Th is publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold with the understanding that the publisher is not engaged in professional services. If professional advice or other expert assistance is required, the services of a competent professional person should be sought.
    [Show full text]
  • Intel Compiler
    FZ-Jülich, 26 November 2020 Intel Tuning for Juwels and Jureca Dr. Heinrich Bockhorst - Intel Notices & Disclaimers Intel technologies may require enabled hardware, software or service activation. Learn more at intel.com or from the OEM or retailer. Your costs and results may vary. Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy. Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice Revision #20110804. https://software.intel.com/en-us/articles/optimization-notice Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. See backup for configuration details. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
    [Show full text]
  • Intel® Oneapi Programming Guide
    Intel® oneAPI Programming Guide Intel Corporation www.intel.com Notices and Disclaimers Contents Notices and Disclaimers....................................................................... 5 Chapter 1: Introduction Intel oneAPI Programming Overview ............................................................6 oneAPI Toolkit Distribution..........................................................................7 About This Guide.......................................................................................8 Related Documentation ..............................................................................8 Chapter 2: oneAPI Programming Model Data Parallel C++ (DPC++) ........................................................................9 C/C++ or Fortran with OpenMP* Offload Programming Model ........................ 11 Device Selection...................................................................................... 13 Chapter 3: oneAPI Development Environment Setup Use the setvars Script with Windows* ........................................................ 16 Use a Config file for setvars.bat on Windows ..................................... 18 Automate the setvars.bat Script with Microsoft Visual Studio* ............... 21 Use the setvars Script with Linux* or MacOS* ............................................. 24 Use a Config file for setvars.sh on Linux or macOS ............................. 26 Automate the setvars.sh Script with Eclipse* ...................................... 29 Use Modulefiles with Linux* .....................................................................
    [Show full text]
  • Использование Intel® Distribution for Python* INTRODUCTION Python Is #1 Programming Language in Hiring Demand Followed by Java and C++
    Использование Intel® Distribution for Python* INTRODUCTION Python is #1 programming language in hiring demand followed by Java and C++. Python is among the most popular programming languages And the demand is . Especially in research/data science for prototyping growing . But very limited use in production . Big data real time analytics requires deploying models to High Performance Data Analytics (HPDA) environments . Hire a team of Java/C++ programmers … OR . Ease access to HPDA for data scientist . This talk shows how Intel Distribution for Python may ease access to production environment for researcher/data scientist . Complementary to Intel TAP ATK Problem statement: DATA scientist is not a professional programmer Until 2015 customer demand for using Intel tools with scripting & JIT languages was addressed by publishing Knowledge Base articles on Intel Developer Zone “… I was hoping to take “Any KB articles I found advantage of by on your site that related building my NumPy and to actually using the Scipy installations MKL for compiling using the MKL … After something were overly spending over 40 hours technical. I couldn't trying to figure it out figure out what the myself, I gave up.” – heck some of the Researcher, Arizona things were doing or State University* talking about.” – Anonymous Data Scientist* * PSXE 2015 Beta Customer Survey Programming Languages Productivity LANGUAGE VERBOSITY PROGRAMMING COMPLEXITY (LOC/FEATURE) (HOURS) 4 5 4 3 3 2 2 1 1 0 0 Python Java C++ C Python Java C++ C Prechelt* Berkholz** Prechelt* Berkholz** * L.Prechelt, An empirical comparison of seven programming languages, IEEE Computer, 2000, Vol. 33, Issue 10, pp.
    [Show full text]
  • Intel Tuning for JURECA and JUWELS
    FZJ, May 20th, 2020 Dr. Heinrich Bockhorst, Intel Agenda Introduction Processor Architecture Overview Composer XE – Compiler Intel Python APS – Application Performance Snapshot MPI and ITAC analysis VTune Amplifier XE - analysis Advisor XE - Vectorization Selected Intel® Tools References Refer to software.intel.com/articles/optimization-notice for more information regarding performance & optimization choices in Intel software products. Copyright ©, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. The “Free Lunch” is over, really Processor clock rate growth halted around 2005 Source: © 2014, James Reinders, Intel, used with permission Software must be parallelized to realize all the potential performance Refer to software.intel.com/articles/optimization-notice for more information regarding performance & optimization choices in Intel software products. Copyright ©, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. What platform should I use for code modernization? Juwels The world is going Intel® Xeon® Processor Intel® Xeon Phi™ x100 Intel® Xeon® Processor Intel® Xeon Phi™ x200 parallel – stick E5-2600 v3 Product Product Family E5-2600 v4 Product Product Family with sequential Family formerly formerly codenamed Family codenamed codenamed … code and you will codenamed Knights Broadwell Knights Skylake fall behind. Haswell Corner Landing Cores 18 61 22 72 28 Threads/Core 2 4 2 4 2 Vector Width 256-bit 512-bit 256-bit 512-bit (x2) 512-bit (x2) Peak Memory Bandwidth 68 GB/s 352 GB/s 77 GB/s >500 GB/s 228 GB/s Both Xeon and KNL are suitable platforms; KNL provides higher scale & memory bandwidth.
    [Show full text]
  • Разработка Приложений Для Tizen. Tizen SDK
    Операционная система Tizen для разработчика Ренат Няжеметдинов Старший специалист по новым мобильным платформам, Samsung Russia Содержание • Введение в Tizen • Архитектура Tizen • Разработка приложений для Tizen 2 Введение в Введение в Tizen. О платформе • Операционная система: – открытый исходный код – ядро Linux • Архитектура: x86, ARM • Поддержка: – Linux Foundation – Tizen Association: • OEM: • Операторы: 4 Введение в Tizen. Основные черты Tizen • Web-ориентированная платформа – Высокий уровень поддержки стандартов HTML5 http://html5test.com – WebKit2 – Web-компоненты для интерфейса пользователя – Доступ к функциям устройства через Web API • Нативный API (C/C++) – Glibc, libstdc++, libxml2 – OpenGL® ES, OpenAL и OpenMP® – Высокий уровень совместимости с bada 2.0 API • Встроенный магазин приложений – Бесплатная регистрация – Предустановлен на все устройства Tizen 5 Введение в Tizen. Разновизности Tizen Мобильные Компьютеры Информационно- устройства (Tizen PC) развлекательные системы автомобилей (Tizen IVI) И другие устройства: • Телевизоры • Бытовая техника • Носимая электроника Часы (Gear 2) Камеры (NX300M) • Медицинская техника 6 Введение в Tizen. Profiles • Mobile – мобильные устройства • IVI – развлекательные системы для транспортных средств • PC – desktops, laptops • TV – DTV/STB/IPTV • Wearables – носимые устройства (например, Samsung Gear 2) • … 7 История Tizen 8 История релизов Tizen Mobile Profile Tizen 2.2.1 Tizen 2.2 Minor Update Tizen 2.1 Commercial Ready - Privacy menu in w/ Enhanced UX settings Tizen 2.0 Hybrid Web/Native,
    [Show full text]
  • Intel Xeon-Phi an Overview – MKL
    C-DAC Four Days Technology Workshop ON Hybrid Computing – Coprocessors/Accelerators Power-Aware Computing – Performance of Applications Kernels hyPACK-2013 Mode 3 : Intel Xeon Phi Coprocessors Lecture Topic : Intel Xeon-Phi An Overview – MKL Venue : CMSD, UoHYD ; Date : October 15-18, 2013 C-DAC hyPACK-2013 An Overview of MKL – Intel Xeon-Phi Prog. 1 An Overview of Xeon Phi Coprocessor Lecture Outline Following topics will be discussed Understanding of Xeon –Phi Architectures Programming on Xeon-Phi Architectures –MKL Tuning & Performance – Software Threading C-DAC hyPACK-2013 An Overview of MKL – Intel Xeon-Phi Prog. 2 MIC Architecture, System Overview C-DAC hyPACK-2013 An Overview of MKL – Intel Xeon-Phi Prog. 3 Systems with Accelerators A set (one or more) of very simple execution units that can perform few operations (with respect to standard CPU) with very high efficiency. When combined with full featured CPU (CISC or RISC) can accelerate the “nominal” speed of a system. CPU ACC Single thread perf. throughput CPU ACC. Physical integration CPU : Control Path & Data Path ALU : Arithmetic and logic units CPU & ACC ACC: Accumulator PC : Program counter Architectural integration Micro-Instructions (Register transfer) Source : NVIDIA, AMD, SGI, Intel, IBM Alter, Xilinux References C-DAC hyPACK-2013 An Overview of MKL – Intel Xeon-Phi Prog. 4 Compute modes vision Xeon Centric MIC Centric Xeon Scalar Symmetric Parallel MIC Hosted Co-processing Co-processing Hosted General purpose Codes with Highly-parallel serial and parallel balanced needs codes computing Codes with highly- Highly parallel parallel phases codes with scalar phases Source : References & Intel Xeon-Phi; http://www.intel.com/ C-DAC hyPACK-2013 An Overview of MKL – Intel Xeon-Phi Prog.
    [Show full text]
  • Best Practice Guide Intel Xeon Phi V2.0
    Best Practice Guide Intel Xeon Phi v2.0 Emanouil Atanassov, IICT-BAS, Bulgaria Michaela Barth, KTH, Sweden Mikko Byckling, CSC, Finland Vali Codreanu, SURFsara, Netherlands Nevena Ilieva, NCSA, Bulgaria Tomas Karasek, IT4Innovations, Czech Republic Jorge Rodriguez, BSC, Spain Sami Saarinen, CSC, Finland Ole Widar Saastad, University of Oslo, Norway Michael Schliephake, KTH, Sweden Martin Stachon, IT4Innovations, Czech Republic Janko Strassburg, BSC, Spain Volker Weinberg (Editor), LRZ, Germany 31-01-2017 1 Best Practice Guide Intel Xeon Phi v2.0 Table of Contents 1. Introduction .............................................................................................................................. 5 2. Intel MIC architecture & system overview ..................................................................................... 6 2.1. The Intel MIC architecture ............................................................................................... 6 2.1.1. Intel Xeon Phi coprocessor architecture overview ....................................................... 6 2.1.2. The cache hierarchy .............................................................................................. 8 2.2. Network configuration & system access .............................................................................. 9 3. Programming Models ............................................................................................................... 15 4. Native compilation ..................................................................................................................
    [Show full text]
  • Intel-Xeon-Phi-Coprocessor-Quick-Start
    Intel® Xeon Phi™ Coprocessor DEVELOPER’S QUICK START GUIDE FOR MICROSOFT* WINDOWS HOST White Paper Intel Xeon Phi Coprocessor DEVELOPER’S QUICK START GUIDE FOR MICROSOFT* WINDOWS HOST Version 1.2 Intel® Xeon Phi™ Coprocessor DEVELOPER’S QUICK START GUIDE FOR MICROSOFT* WINDOWS HOST Contents Introduction ........................................................................................................................................................................................................ 4 Goals ............................................................................................................................................................................................................................. 4 This document does: ...................................................................................................................................................................................... 4 This document does not: ............................................................................................................................................................................. 4 Terminology .............................................................................................................................................................................................................. 4 System Configuration ...................................................................................................................................................................................
    [Show full text]
  • Intel® Software Adrenaline 1 Intel-Inspired Ultrabook™ Devices
    ISSUE 2 | WINTER 2013 Perceptual Computing Transforms the Human-Computer Interaction Wind River and Intel Revolutionize Embedded Computing Multi-touch Technology Deepens Gaming Experiences on the Battlefield and Beyond Augmented Reality is Here and Now On Tablets and Ultrabook™ Devices LETTER FROM THE EDITOR riven by the ferocious In this second issue of Intel® the current realities of security and our pace of innovation Software Adrenaline we get up- mobile lifestyles. in both hardware close and personal with touch. We I hope you find the stories as and software explore how the mobile lifestyle and interesting to read as they were for us to applications, our Ultrabook™ device are affecting the write. I’m eager to hear your feedback relationship with world of software and how touch and suggestions for future issues. Please technology—and control is fast becoming ubiquitous. send me a note at [email protected] how we interact We preview what’s to come with Intel® and stay in “touch.” In the meantime, with it—is always growing. Today, processor-based phones, explore touch enjoy the read! Dthis relationship is moving into one in PC gaming, and view the wider of the most significant evolutions in world of perceptual computing and computing history. Mobile device and user-interface revolutions happening computer users have been shackled now that will become tomorrow’s to keyboards and mice for decades norm. This issue also takes an inspiring while promises of machines controlled look at the #CreatorsLive Project, by thought alone languished in the which invites the public to interact pages of science fiction.
    [Show full text]