D5.5 Innovative HPC Trends and the Hidalgo Benchmarks

Total Page:16

File Type:pdf, Size:1020Kb

D5.5 Innovative HPC Trends and the Hidalgo Benchmarks HiDALGO D5.5 Innovative HPC Trends and the HiDALGO Benchmarks Document Identification Status Final Due Date 31/05/2020 Version 1.0 Submission Date 19/06/2020 Related WP WP5 Document Reference D5.5 Related D3.1, D3.2, D5.1, Dissemination Level (*) PU Deliverable(s) D5.2, D5.3 Lead Participant PSNC Lead Author Marcin Lawenda Contributors PSNC Reviewers Konstantinos Nikas USTUTT (ICCS) ECMWF Nabil Ben Said ICCS (MOON) Keywords: New promising technologies, HPC, benchmarks, scalability, efficiency, Exascale, Global Challenges, Global Systems Science This document is issued within the frame and for the purpose of the HiDALGO project. This project has received funding from the European Union’s Horizon2020 Framework Programme under Grant Agreement No. 824115. The opinions expressed and arguments employed herein do not necessarily reflect the official views of the European Commission. The dissemination of this document reflects only the author’s view and the European Commission is not responsible for any use that may be made of the information it contains. This deliverable is subject to final acceptance by the European Commission. This document and its content are the property of the HiDALGO Consortium. The content of all or parts of this document can be used and distributed provided that the HiDALGO project and the document are properly referenced. Each HiDALGO Partner may use this document in conformity with the HiDALGO Consortium Grant Agreement provisions. (*) Dissemination level: PU: Public, fully open, e.g. web; CO: Confidential, restricted under conditions set out in Model Grant Agreement; CI: Classified, Int = Internal Working Document, information as referred to in Commission Decision 2001/844/EC. Document Information List of Contributors Name Partner Dineshkumar Rajagopal USTUTT Sergiy Gogolenko USTUTT Dennis Hoppe USTUTT Natalie Lewandowski USTUTT Krzesimir Samborski PSNC John Hanley ECMWF Milana Vuckovic ECMWF Nikela Papadopoulou ICCS Petros Anastasiadis ICCS Document History Version Date Change editors Changes 0.1 20/05/2020 Marcin Lawenda TOC, Introduction 0.2 04/05/2020 Krzesimir Chapter 1 and 4 Samborski, Marcin Lawenda 0.3 04/05/2020 Nikela Chapter 6 Papadopoulou, Petros Anastasiadis 0.35 05/05/2020 Dineshkumar Chapter 2 and 3 Rajagopal, Sergiy Gogolenko, Dennis Hoppe, Natalie Lewandowski 0.36 05/05/2020 John Hanley, Annex 1 Milana Vuckovic 0.4 20/05/2020 Marcin Lawenda Chapter 2, 3 and 4, conclusion, executive summary, Document name: D5.5 Innovative HPC Trends and the HiDALGO Benchmarks Page: 2 of 66 Reference: D5.5 Dissemination: PU Version: 1.0 Status: Final Document History Version Date Change editors Changes 0.7 30/05/2020 Konstantinos First review Nikas, Nabil Ben Said 0.8 05/06/2020 Marcin Lawenda Addressing project review comments. Final version to be submitted. 0.9 17/06/2020 Marcin Lawenda Review and approve it. 1.0 19/06/2020 F. Javier Nieto Final approval Quality Control Role Who (Partner short name) Approval Date Deliverable Leader Marcin Lawenda (PSNC) 17/06/2020 Quality Manager Marcin Lawenda (PSNC) 19/06/2020 Project Coordinator Francisco Javier Nieto de Santos (ATOS) 19/06/2020 Document name: D5.5 Innovative HPC Trends and the HiDALGO Benchmarks Page: 3 of 66 Reference: D5.5 Dissemination: PU Version: 1.0 Status: Final Table of Contents Document Information .............................................................................................................. 2 Table of Contents ....................................................................................................................... 4 List of Tables .............................................................................................................................. 7 List of Figures ............................................................................................................................. 7 List of Acronyms ......................................................................................................................... 8 Executive Summary .................................................................................................................. 10 1 Introduction....................................................................................................................... 12 1.1 Purpose of the document ........................................................................................ 12 1.2 Relation to other project work ................................................................................ 12 1.3 Structure of the document ...................................................................................... 13 2 System components .......................................................................................................... 14 2.1 General-purpose CPUs ............................................................................................. 14 2.1.1 Intel x86 ............................................................................................................. 14 2.1.2 AMD x86 ............................................................................................................. 17 2.1.3 ARM .................................................................................................................... 20 2.1.4 POWER – OpenPOWER ...................................................................................... 22 2.1.5 SPARC ................................................................................................................. 24 2.2 Accelerators ............................................................................................................. 26 2.2.1 FPGA ................................................................................................................... 26 2.2.2 GPGPU ................................................................................................................ 27 2.2.3 Vector Co-Processor .......................................................................................... 29 2.3 Memory Technologies ............................................................................................. 29 2.3.1 DDR-SDRAM ....................................................................................................... 30 2.3.2 NVRAM ............................................................................................................... 30 2.4 Exascale architectures and technologies ................................................................. 32 2.4.1 Quantum Computing ......................................................................................... 32 2.4.2 Massively Parallel Processor Arrays ................................................................... 33 2.4.3 ARM-based Microservers with UNIMEM ........................................................... 33 Document name: D5.5 Innovative HPC Trends and the HiDALGO Benchmarks Page: 4 of 66 Reference: D5.5 Dissemination: PU Version: 1.0 Status: Final 2.4.4 FPGA based Microservers with UNILOGIC ......................................................... 35 3 Tools and libraries ............................................................................................................. 38 3.1 Performance tools.................................................................................................... 38 3.1.1 Exa-PAPI ............................................................................................................. 38 3.1.2 HPCToolkit .......................................................................................................... 38 3.1.3 TAU ..................................................................................................................... 39 3.1.4 Score-P ............................................................................................................... 39 3.1.5 VampirServer ..................................................................................................... 39 3.1.6 Darshan .............................................................................................................. 40 3.1.7 Relevance to HiDALGO ....................................................................................... 40 3.2 Mathematical libraries ............................................................................................. 40 3.2.1 NumPy ................................................................................................................ 40 3.2.2 SuperLU .............................................................................................................. 41 3.2.3 PetSc ................................................................................................................... 41 3.2.4 SLATE ................................................................................................................ 41 3.2.5 Relevance to HiDALGO ....................................................................................... 42 3.3 Open standards and programming libraries ............................................................ 42 3.3.1 MPI ..................................................................................................................... 42 3.3.2 OpenMP ............................................................................................................. 42 3.3.3 CUDA/OpenCL .................................................................................................... 43 3.3.4 Relevance to HiDALGO ......................................................................................
Recommended publications
  • Wind Rose Data Comes in the Form >200,000 Wind Rose Images
    Making Wind Speed and Direction Maps Rich Stromberg Alaska Energy Authority [email protected]/907-771-3053 6/30/2011 Wind Direction Maps 1 Wind rose data comes in the form of >200,000 wind rose images across Alaska 6/30/2011 Wind Direction Maps 2 Wind rose data is quantified in very large Excel™ spreadsheets for each region of the state • Fields: X Y X_1 Y_1 FILE FREQ1 FREQ2 FREQ3 FREQ4 FREQ5 FREQ6 FREQ7 FREQ8 FREQ9 FREQ10 FREQ11 FREQ12 FREQ13 FREQ14 FREQ15 FREQ16 SPEED1 SPEED2 SPEED3 SPEED4 SPEED5 SPEED6 SPEED7 SPEED8 SPEED9 SPEED10 SPEED11 SPEED12 SPEED13 SPEED14 SPEED15 SPEED16 POWER1 POWER2 POWER3 POWER4 POWER5 POWER6 POWER7 POWER8 POWER9 POWER10 POWER11 POWER12 POWER13 POWER14 POWER15 POWER16 WEIBC1 WEIBC2 WEIBC3 WEIBC4 WEIBC5 WEIBC6 WEIBC7 WEIBC8 WEIBC9 WEIBC10 WEIBC11 WEIBC12 WEIBC13 WEIBC14 WEIBC15 WEIBC16 WEIBK1 WEIBK2 WEIBK3 WEIBK4 WEIBK5 WEIBK6 WEIBK7 WEIBK8 WEIBK9 WEIBK10 WEIBK11 WEIBK12 WEIBK13 WEIBK14 WEIBK15 WEIBK16 6/30/2011 Wind Direction Maps 3 Data set is thinned down to wind power density • Fields: X Y • POWER1 POWER2 POWER3 POWER4 POWER5 POWER6 POWER7 POWER8 POWER9 POWER10 POWER11 POWER12 POWER13 POWER14 POWER15 POWER16 • Power1 is the wind power density coming from the north (0 degrees). Power 2 is wind power from 22.5 deg.,…Power 9 is south (180 deg.), etc… 6/30/2011 Wind Direction Maps 4 Spreadsheet calculations X Y POWER1 POWER2 POWER3 POWER4 POWER5 POWER6 POWER7 POWER8 POWER9 POWER10 POWER11 POWER12 POWER13 POWER14 POWER15 POWER16 Max Wind Dir Prim 2nd Wind Dir Sec -132.7365 54.4833 0.643 0.767 1.911 4.083
    [Show full text]
  • GPU Developments 2018
    GPU Developments 2018 2018 GPU Developments 2018 © Copyright Jon Peddie Research 2019. All rights reserved. Reproduction in whole or in part is prohibited without written permission from Jon Peddie Research. This report is the property of Jon Peddie Research (JPR) and made available to a restricted number of clients only upon these terms and conditions. Agreement not to copy or disclose. This report and all future reports or other materials provided by JPR pursuant to this subscription (collectively, “Reports”) are protected by: (i) federal copyright, pursuant to the Copyright Act of 1976; and (ii) the nondisclosure provisions set forth immediately following. License, exclusive use, and agreement not to disclose. Reports are the trade secret property exclusively of JPR and are made available to a restricted number of clients, for their exclusive use and only upon the following terms and conditions. JPR grants site-wide license to read and utilize the information in the Reports, exclusively to the initial subscriber to the Reports, its subsidiaries, divisions, and employees (collectively, “Subscriber”). The Reports shall, at all times, be treated by Subscriber as proprietary and confidential documents, for internal use only. Subscriber agrees that it will not reproduce for or share any of the material in the Reports (“Material”) with any entity or individual other than Subscriber (“Shared Third Party”) (collectively, “Share” or “Sharing”), without the advance written permission of JPR. Subscriber shall be liable for any breach of this agreement and shall be subject to cancellation of its subscription to Reports. Without limiting this liability, Subscriber shall be liable for any damages suffered by JPR as a result of any Sharing of any Material, without advance written permission of JPR.
    [Show full text]
  • 2Nd Generation Intel Core Processor Family with Intel 6 Series Chipset Development Kit User Guide
    2nd Generation Intel® Core™ Processor Family with Intel® 6 Series Chipset Development Kit User Guide March 2011 Document Number: 325208 About This Document INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights. Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. Intel processor numbers are not a measure of performance.
    [Show full text]
  • The New Intel® Xeon® Processor Scalable Family
    Akhilesh Kumar Intel Corporation, 2017 Authors: Don Soltis, Irma Esmer, Adi Yoaz, Sailesh Kottapalli Notices and Disclaimers This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance. Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
    [Show full text]
  • Future-ALCF-Systems-Parker.Pdf
    Future ALCF Systems Argonne Leadership Computing Facility 2021 Computational Performance Workshop, May 4-6 Scott Parker www.anl.gov Aurora Aurora: A High-level View q Intel-HPE machine arriving at Argonne in 2022 q Sustained Performance ≥ 1Exaflops DP q Compute Nodes q 2 Intel Xeons (Sapphire Rapids) q 6 Intel Xe GPUs (Ponte Vecchio [PVC]) q Node Performance > 130 TFlops q System q HPE Cray XE Platform q Greater than 10 PB of total memory q HPE Slingshot network q Fliesystem q Distributed Asynchronous Object Store (DAOS) q ≥ 230 PB of storage capacity; ≥ 25 TB/s q Lustre q 150PB of storage capacity; ~1 TB/s 3 The Evolution of Intel GPUs 4 The Evolution of Intel GPUs 5 XE Execution Unit qThe EU executes instructions q Register file q Multiple issue ports q Vector pipelines q Float Point q Integer q Extended Math q FP 64 (optional) q Matrix Extension (XMX) (optional) q Thread control q Branch q Send (memory) 6 XE Subslice qA sub-slice contains: q 16 EUs q Thread dispatch q Instruction cache q L1, texture cache, and shared local memory q Load/Store q Fixed Function (optional) q 3D Sampler q Media Sampler q Ray Tracing 7 XE 3D/Compute Slice qA slice contains q Variable number of subslices q 3D Fixed Function (optional) q Geometry q Raster 8 High Level Xe Architecture qXe GPU is composed of q 3D/Compute Slice q Media Slice q Memory Fabric / Cache 9 Aurora Compute Node • 6 Xe Architecture based GPUs PVC PVC (Ponte Vecchio) • All to all connection PVC PVC • Low latency and high bandwidth PVC PVC • 2 Intel Xeon (Sapphire Rapids) processors Slingshot
    [Show full text]
  • Openpower AI CERN V1.Pdf
    Moore’s Law Processor Technology Firmware / OS Linux Accelerator sSoftware OpenStack Storage Network ... Price/Performance POWER8 2000 2020 DRAM Memory Chips Buffer Power8: Up to 12 Cores, up to 96 Threads L1, L2, L3 + L4 Caches Up to 1 TB per socket https://www.ibm.com/blogs/syst Up to 230 GB/s sustained memory ems/power-systems- openpower-enable- bandwidth acceleration/ System System Memory Memory 115 GB/s 115 GB/s POWER8 POWER8 CPU CPU NVLink NVLink 80 GB/s 80 GB/s P100 P100 P100 P100 GPU GPU GPU GPU GPU GPU GPU GPU Memory Memory Memory Memory GPU PCIe CPU 16 GB/s System bottleneck Graphics System Memory Memory IBM aDVantage: data communication and GPU performance POWER8 + 78 ms Tesla P100+NVLink x86 baseD 170 ms GPU system ImageNet / Alexnet: Minibatch size = 128 ADD: Coherent Accelerator Processor Interface (CAPI) FPGA CAPP PCIe POWER8 Processor ...FPGAs, networking, memory... Typical I/O MoDel Flow Copy or Pin MMIO Notify Poll / Int Copy or Unpin Ret. From DD DD Call Acceleration Source Data Accelerator Completion Result Data Completion Flow with a Coherent MoDel ShareD Mem. ShareD Memory Acceleration Notify Accelerator Completion Focus on Enterprise Scale-Up Focus on Scale-Out and Enterprise Future Technology and Performance DriVen Cost and Acceleration DriVen Partner Chip POWER6 Architecture POWER7 Architecture POWER8 Architecture POWER9 Architecture POWER10 POWER8/9 2007 2008 2010 2012 2014 2016 2017 TBD 2018 - 20 2020+ POWER6 POWER6+ POWER7 POWER7+ POWER8 POWER8 P9 SO P9 SU P9 SO 2 cores 2 cores 8 cores 8 cores 12 cores w/ NVLink
    [Show full text]
  • Hardware Developments V E-CAM Deliverable 7.9 Deliverable Type: Report Delivered in July, 2020
    Hardware developments V E-CAM Deliverable 7.9 Deliverable Type: Report Delivered in July, 2020 E-CAM The European Centre of Excellence for Software, Training and Consultancy in Simulation and Modelling Funded by the European Union under grant agreement 676531 E-CAM Deliverable 7.9 Page ii Project and Deliverable Information Project Title E-CAM: An e-infrastructure for software, training and discussion in simulation and modelling Project Ref. Grant Agreement 676531 Project Website https://www.e-cam2020.eu EC Project Officer Juan Pelegrín Deliverable ID D7.9 Deliverable Nature Report Dissemination Level Public Contractual Date of Delivery Project Month 54(31st March, 2020) Actual Date of Delivery 6th July, 2020 Description of Deliverable Update on "Hardware Developments IV" (Deliverable 7.7) which covers: - Report on hardware developments that will affect the scientific areas of inter- est to E-CAM and detailed feedback to the project software developers (STFC); - discussion of project software needs with hardware and software vendors, completion of survey of what is already available for particular hardware plat- forms (FR-IDF); and, - detailed output from direct face-to-face session between the project end- users, developers and hardware vendors (ICHEC). Document Control Information Title: Hardware developments V ID: D7.9 Version: As of July, 2020 Document Status: Accepted by WP leader Available at: https://www.e-cam2020.eu/deliverables Document history: Internal Project Management Link Review Review Status: Reviewed Written by: Alan Ó Cais(JSC) Contributors: Christopher Werner (ICHEC), Simon Wong (ICHEC), Padraig Ó Conbhuí Authorship (ICHEC), Alan Ó Cais (JSC), Jony Castagna (STFC), Godehard Sutmann (JSC) Reviewed by: Luke Drury (NUID UCD) and Jony Castagna (STFC) Approved by: Godehard Sutmann (JSC) Document Keywords Keywords: E-CAM, HPC, Hardware, CECAM, Materials 6th July, 2020 Disclaimer:This deliverable has been prepared by the responsible Work Package of the Project in accordance with the Consortium Agreement and the Grant Agreement.
    [Show full text]
  • Marc and Mentat Release Guide
    Marc and Mentat Release Guide Marc® and Mentat® 2020 Release Guide Corporate Europe, Middle East, Africa MSC Software Corporation MSC Software GmbH 4675 MacArthur Court, Suite 900 Am Moosfeld 13 Newport Beach, CA 92660 81829 Munich, Germany Telephone: (714) 540-8900 Telephone: (49) 89 431 98 70 Toll Free Number: 1 855 672 7638 Email: [email protected] Email: [email protected] Japan Asia-Pacific MSC Software Japan Ltd. MSC Software (S) Pte. Ltd. Shinjuku First West 8F 100 Beach Road 23-7 Nishi Shinjuku #16-05 Shaw Tower 1-Chome, Shinjuku-Ku Singapore 189702 Tokyo 160-0023, JAPAN Telephone: 65-6272-0082 Telephone: (81) (3)-6911-1200 Email: [email protected] Email: [email protected] Worldwide Web www.mscsoftware.com Support http://www.mscsoftware.com/Contents/Services/Technical-Support/Contact-Technical-Support.aspx Disclaimer User Documentation: Copyright 2020 MSC Software Corporation. All Rights Reserved. This document, and the software described in it, are furnished under license and may be used or copied only in accordance with the terms of such license. Any reproduction or distribution of this document, in whole or in part, without the prior written authorization of MSC Software Corporation is strictly prohibited. MSC Software Corporation reserves the right to make changes in specifications and other information contained in this document without prior notice. The concepts, methods, and examples presented in this document are for illustrative and educational purposes only and are not intended to be exhaustive or to apply to any particular engineering problem or design. THIS DOCUMENT IS PROVIDED ON AN “AS-IS” BASIS AND ALL EXPRESS AND IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.
    [Show full text]
  • Im Divar Ip 6000 2U
    DIVAR IP 6000 2U DIP-6080-00N, DIP-6082-8HD, DIP-6083-8HD en Installation Manual DIVAR IP 6000 2U Table of Contents | en 3 Table of contents 1 Safety precautions 5 1.1 General safety precautions 5 1.2 Electrical safety precautions 6 1.3 ESD precautions 7 1.4 Operating precautions 7 1.5 Important notices 8 1.6 FCC and ICES compliance 8 2 System overview 9 2.1 Chassis features 9 2.2 Chassis components 9 2.2.1 Chassis 10 2.2.2 Backplane 10 2.2.3 Fans 10 2.2.4 Mounting rails 10 2.2.5 Power supply 10 2.2.6 Air shroud 10 2.3 System interface 10 2.3.1 Control panel buttons 11 2.3.2 Control panel LEDs 12 2.3.3 Drive carrier LEDs 12 3 Chassis setup and maintenance 13 3.1 Removing the chassis cover 13 3.2 Installing hard drives 14 3.2.1 Removing hard drive trays 14 3.2.2 Installing a hard drive 15 3.3 Installing an optional floppy or fixed hard drive 16 3.4 Installing or replacing a DVD-ROM drive 17 3.5 Replacing the internal transcoder device 17 3.6 Replacing or installing the front port panel 17 3.7 Installing the motherboard 18 3.8 Installing the air shroud 18 3.9 System fans 19 3.10 Power supply 20 3.10.1 Replacing the power supply 21 3.10.2 Replacing the power distributor 22 4 Rack installation 23 4.1 Unpacking the system 23 4.2 Preparing for setup 23 4.2.1 Choosing a setup location 23 4.2.2 Rack precautions 23 4.2.3 General system precautions 24 4.2.4 Rack mounting considerations 24 4.3 Rack mounting instructions 24 4.3.1 Separating the sections of the rack rails 25 4.3.2 Installing the inner rails 26 4.3.3 Installing the outer rails to the
    [Show full text]
  • Desktop 4Th Generation Intel® Core™ Processor Family, Desktop Intel® Pentium® Processor Family, and Desktop Intel® Celeron® Processor Family Datasheet – Volume 1 of 2
    Desktop 4th Generation Intel® Core™ Processor Family, Desktop Intel® Pentium® Processor Family, and Desktop Intel® Celeron® Processor Family Datasheet – Volume 1 of 2 July 2014 Order No.: 328897-008 By using this document, in addition to any agreements you have with Intel, you accept the terms set forth below. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
    [Show full text]
  • Marrying Many-Core Accelerators and Infiniband for a New Commodity Processor
    Marrying Many-core Accelerators and InfiniBand for a New Commodity Processor Konstantin S. Solnushkin* <[email protected]> Yuichi Tsujita <[email protected]> *Your speaker today International Conference on Computational Science (ICCS 2013) Barcelona, Spain June 5 - June 7, 2013 About the speaker • PhD candidate, topic is “Automated Design of Computer Clusters” • Main findings: – Continuity is more important than sporadic improvements – Ease of use is more important than energy efficiency (or any other trendy topic) • Although “ease of use” is harder to measure than “energy efficiency” About the speaker • Main finding of my PhD research, in one sentence: Economics is important About the speaker • Main finding of my PhD research, in one sentence: Economics is important Why is economics important? • New devices bring: benefits: energy efficiency, etc. disadvantages: you need to learn how to use them • New device (GPGPUs, Intel Xeon Phi) need to rewrite code (e.g., learn NVIDIA’s CUDA) • Learning new devices means time, and therefore money • Vendor lock-in also translates into money (your knowledge of CUDA can become “sunk costs”) – And you will need to learn something new when NVIDIA retires CUDA Economics • Mass-produced devices rule the world • We use commodity, off-the-shelf components to build clusters because they are cheap, not because they are the best • Future exascale computers are going to contain O(105) or even O(106) individual CPU chips [1] • 1. J. Dongarra et al., “The international exascale software project roadmap” Economics • 106 CPUs is a game-changer • It becomes economically viable to build your own CPUs • CPU design project costs around $100M [1].
    [Show full text]
  • POWER10 Processor Chip
    POWER10 Processor Chip Technology and Packaging: PowerAXON PowerAXON - 602mm2 7nm Samsung (18B devices) x x 3 SMT8 SMT8 SMT8 SMT8 3 - 18 layer metal stack, enhanced device 2 Core Core Core Core 2 - Single-chip or Dual-chip sockets + 2MB L2 2MB L2 2MB L2 2MB L2 + 4 4 SMP, Memory, Accel, Cluster, PCI Interconnect Cluster,Accel, Memory, SMP, Computational Capabilities: PCI Interconnect Cluster,Accel, Memory, SMP, Local 8MB - Up to 15 SMT8 Cores (2 MB L2 Cache / core) L3 region (Up to 120 simultaneous hardware threads) 64 MB L3 Hemisphere Memory Signaling (8x8 OMI) (8x8 Signaling Memory - Up to 120 MB L3 cache (low latency NUCA mgmt) OMI) (8x8 Signaling Memory - 3x energy efficiency relative to POWER9 SMT8 SMT8 SMT8 SMT8 - Enterprise thread strength optimizations Core Core Core Core - AI and security focused ISA additions 2MB L2 2MB L2 2MB L2 2MB L2 - 2x general, 4x matrix SIMD relative to POWER9 - EA-tagged L1 cache, 4x MMU relative to POWER9 SMT8 SMT8 SMT8 SMT8 Core Core Core Core Open Memory Interface: 2MB L2 2MB L2 2MB L2 2MB L2 - 16 x8 at up to 32 GT/s (1 TB/s) - Technology agnostic support: near/main/storage tiers - Minimal (< 10ns latency) add vs DDR direct attach 64 MB L3 Hemisphere PowerAXON Interface: - 16 x8 at up to 32 GT/s (1 TB/s) SMT8 SMT8 SMT8 SMT8 x Core Core Core Core x - SMP interconnect for up to 16 sockets 3 2MB L2 2MB L2 2MB L2 2MB L2 3 - OpenCAPI attach for memory, accelerators, I/O 2 2 + + - Integrated clustering (memory semantics) 4 4 PCIe Gen 5 PCIe Gen 5 PowerAXON PowerAXON PCIe Gen 5 Interface: Signaling (x16) Signaling
    [Show full text]