Exploiting Compiler-Generated Schedules for Energy Savings in High-Performance Processors
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Validated Products List, 1995 No. 3: Programming Languages, Database
NISTIR 5693 (Supersedes NISTIR 5629) VALIDATED PRODUCTS LIST Volume 1 1995 No. 3 Programming Languages Database Language SQL Graphics POSIX Computer Security Judy B. Kailey Product Data - IGES Editor U.S. DEPARTMENT OF COMMERCE Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 July 1995 QC 100 NIST .056 NO. 5693 1995 NISTIR 5693 (Supersedes NISTIR 5629) VALIDATED PRODUCTS LIST Volume 1 1995 No. 3 Programming Languages Database Language SQL Graphics POSIX Computer Security Judy B. Kailey Product Data - IGES Editor U.S. DEPARTMENT OF COMMERCE Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 July 1995 (Supersedes April 1995 issue) U.S. DEPARTMENT OF COMMERCE Ronald H. Brown, Secretary TECHNOLOGY ADMINISTRATION Mary L. Good, Under Secretary for Technology NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY Arati Prabhakar, Director FOREWORD The Validated Products List (VPL) identifies information technology products that have been tested for conformance to Federal Information Processing Standards (FIPS) in accordance with Computer Systems Laboratory (CSL) conformance testing procedures, and have a current validation certificate or registered test report. The VPL also contains information about the organizations, test methods and procedures that support the validation programs for the FIPS identified in this document. The VPL includes computer language processors for programming languages COBOL, Fortran, Ada, Pascal, C, M[UMPS], and database language SQL; computer graphic implementations for GKS, COM, PHIGS, and Raster Graphics; operating system implementations for POSIX; Open Systems Interconnection implementations; and computer security implementations for DES, MAC and Key Management. -
SPARC64-III User's Guide
SPARC64-III User’s Guide HAL Computer Systems, Inc. Campbell, California May 1998 Copyright © 1998 HAL Computer Systems, Inc. All rights reserved. This product and related documentation are protected by copyright and distributed under licenses restricting their use, copying, distribution, and decompilation. No part of this product or related documentation may be reproduced in any form by any means without prior written authorization of HAL Computer Systems, Inc., and its licensors, if any. Portions of this product may be derived from the UNIX and Berkeley 4.3 BSD Systems, licensed from UNIX System Laboratories, Inc., a wholly owned subsidiary of Novell, Inc., and the University of California, respectively. RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the United States Government is subject to the restrictions set forth in DFARS 252.227-7013 (c)(1)(ii), FAR 52.227-19, and NASA FAR Supplement. The product described in this book may be protected by one or more U.S. patents, foreign patents, or pending applications. TRADEMARKS HAL, the HAL logo, HyperScalar, and OLIAS are registered trademarks and HAL Computer Systems, Inc. HALstation 300, and Ishmail are trademarks of HAL Computer Systems, Inc. SPARC64 and SPARC64/OS are trademarks of SPARC International, Inc., licensed by SPARC International, Inc., to HAL Computer Systems, Inc. Fujitsu and the Fujitsu logo are trademarks of Fujitsu Limited. All SPARC trademarks, including the SCD Compliant Logo, are trademarks or registered trademarks of SPARC International, Inc. SPARCstation, SPARCserver, SPARCengine, SPARCstorage, SPARCware, SPARCcenter, SPARCclassic, SPARCcluster, SPARCdesign, SPARC811 SPARCprinter, UltraSPARC, microSPARC, SPARCworks, and SPARCompiler are licensed exclusively to Sun Microsystems, Inc. -
The Conference Program Booklet
Austin Convention Center Conference Austin, TX Program http://sc15.supercomputing.org/ Conference Dates: Exhibition Dates: The International Conference for High Performance Nov. 15 - 20, 2015 Nov. 16 - 19, 2015 Computing, Networking, Storage and Analysis Sponsors: SC15.supercomputing.org SC15 • Austin, Texas The International Conference for High Performance Computing, Networking, Storage and Analysis Sponsors: 3 Table of Contents Welcome from the Chair ................................. 4 Papers ............................................................... 68 General Information ........................................ 5 Posters Research Posters……………………………………..88 Registration and Conference Store Hours ....... 5 ACM Student Research Competition ........ 114 Exhibit Hall Hours ............................................. 5 Posters SC15 Information Booth/Hours ....................... 5 Scientific Visualization/ .................................... 120 Data Analytics Showcase SC16 Preview Booth/Hours ............................. 5 Student Programs Social Events ..................................................... 5 Experiencing HPC for Undergraduates ...... 122 Registration Pass Access .................................. 7 Mentor-Protégé Program .......................... 123 Student Cluster Competition Kickoff ......... 123 SCinet ............................................................... 8 Student-Postdoc Job & ............................. 123 Convention Center Maps ................................. 12 Opportunity Fair Daily Schedules -
Computer Architectures an Overview
Computer Architectures An Overview PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 25 Feb 2012 22:35:32 UTC Contents Articles Microarchitecture 1 x86 7 PowerPC 23 IBM POWER 33 MIPS architecture 39 SPARC 57 ARM architecture 65 DEC Alpha 80 AlphaStation 92 AlphaServer 95 Very long instruction word 103 Instruction-level parallelism 107 Explicitly parallel instruction computing 108 References Article Sources and Contributors 111 Image Sources, Licenses and Contributors 113 Article Licenses License 114 Microarchitecture 1 Microarchitecture In computer engineering, microarchitecture (sometimes abbreviated to µarch or uarch), also called computer organization, is the way a given instruction set architecture (ISA) is implemented on a processor. A given ISA may be implemented with different microarchitectures.[1] Implementations might vary due to different goals of a given design or due to shifts in technology.[2] Computer architecture is the combination of microarchitecture and instruction set design. Relation to instruction set architecture The ISA is roughly the same as the programming model of a processor as seen by an assembly language programmer or compiler writer. The ISA includes the execution model, processor registers, address and data formats among other things. The Intel Core microarchitecture microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA. The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be everything from single gates and registers, to complete arithmetic logic units (ALU)s and even larger elements. -
Energy-Effective Issue Logic
Energy-Effective Issue Logic Daniele Folegnani and Antonio GonzaJez Departament d'Arquitectura de Computadors Universitat Politecnica de Catalunya Jordi Girona, 1-3 Mbdul D6 08034 Barcelona, Spain [email protected] Abstract require an increasing computing power, which sometimes is achieved through higher clock frequencies and more sophisticated Tile issue logic of a dynamically-scheduled superscalar processor architectural techniques, with an obvious impact on power is a complex mechanism devoted to start the execution of multiple consumption. instructions eveo' cycle. Due to its complexity, it is responsible for a significant percentage of the energy consumed by a With the fast increase in design complexity and reduction in microprocessor. The energy consumption of the issue logic design time, new generation CAD tools for VLSI design like depends on several architectural parameters, the instruction issue PowerMill [13] and QuickPower [12] are crucial to evaluate the queue size being one of the most important. In this paper we energy consumption at different points of the design, helping to present a technique to reduce the energy consumption of the issue make important decisions early in the design process. The logic of a high-performance superscalar processor. The proposed importance of a continuous feedback between the functional technique is based on the observation that the conventional issue system description and the power impact requires better and faster logic wastes a significant amount of energy for useless activio,. In evaluation tools and models to introduce changes rapidly through particular, the wake-up of empty entries and operands that are various design scenarios. Recent research has shown how read)' represents an important source of energy waste. -
Data Locality Optimizations for Multigrid Methods on Structured Grids
Data Locality Optimizations for Multigrid Methods on Structured Grids Christian Weiß Technische Universitat¨ Munchen¨ Institut fur¨ Informatik Lehrstuhl fur¨ Rechnertechnik und Rechnerorganisation Data Locality Optimizations for Multigrid Methods on Structured Grids Christian Weiß Vollst¨andiger Abdruck der von der Fakult¨at f¨ur Informatik der Technischen Universit¨at M¨unchen zur Erlangung des akademischen Grades eines Doktors der Naturwissenschaften (Dr. rer. nat.) genehmigten Dissertation. Vorsitzender: Univ.-Prof. Dr. Hans Michael Gerndt Pr¨ufer der Dissertation: 1. Univ.-Prof. Dr. Arndt Bode 2. Univ.-Prof. Dr. Ulrich R¨ude Friedrich-Alexander-Universit¨at Erlangen-N¨urnberg 3. Univ.-Prof. (komm.) Dr.-Ing. Eike Jessen, em. Die Dissertation wurde am 26. September 2001 bei der Technischen Universit¨at M¨unchen eingereicht und durch die Fakult¨at f¨ur Informatik am 06. Dezember 2001 angenommen. Abstract Beside traditional direct solvers iterative methods offer an efficient alternative for the solution of systems of linear equations which arise in the solution of partial differen- tial equations (PDEs). Among them, multigrid algorithms belong to the most efficient methods based on the number of operations required to achieve a good approximation of the solution. The relevance of the number of arithmetic operations performed by an application as a metric for the complexity of an algorithm wanes since the performance of modern computing systems nowadays is limited by memory latency and bandwidth. Consequently, almost all computer manufacturers nowadays equip their computers with cache–based hierarchical memory systems. Thus, the efficiency of multigrid methods is rather determined by good data locality, i.e. good utilization of data caches, than by the number of arithmetic operations. -
Full Program
The International Conference for High Performance Computing, Networking, Storage and Analysis Conference Program http://sc16.supercomputing.org/ Salt Lake City Convention Center Salt Lake City, Utah Conference: November 13-18, 2016 Exhibition: November 14-17, 2016 Sponsors: SC16.supercomputing.org SC16 • Salt Lake City, Utah The International Conference for High Performance Computing, Networking, Storage and Analysis The International Conference for High Performance Computing, Networking, Storage and Analysis Sponsors: Sponsors 3 Table of Contents Welcome from the Chair ................................. 2 Panels ............................................................... 69 General Information ........................................ 4 Papers ............................................................... 73 Registration and Conference Store Hours ....... 4 Posters Posters ....................................................... 98 Exhibit Hall Hours ............................................. 4 Research Posters……………………………………..105 SC16 Information Booth/Hours ....................... 4 ACM Student Research Competition SC17 Preview Booth/Hours ............................. 4 Scientific Visualization/ Data Analytics Showcase ................................. 135 Social Events ..................................................... 5 Student Programs ............................................ 136 Registration Pass Access .................................. 7 Experiencing HPC for Undergraduates ...... 137 SCinet .............................................................. -
Solbournecomputer ~Solbourne Computer
SolbourneComputer ~Solbourne Computer Series4 and Series5 Theory of Operations Solbourne Confidential· Do Not Reproduce Solboume Computer, Inc. 303-772-3400 1900 Pike Road Longmont, Colorado 80501 For Solbourne support call: 1-800-447-2861 SoIbourne, Series4, SeriesS, Virtual Desktop, Kbus, OSIMP, Object Interface Library (01), and the Solboume logo are trademarks of Solboume Computer, Inc. SPARC, SunOS, and Sun-4 are trademarks of Sun Microsystems. UNIX is a trademark of the AT&T Bell Laboratories. Zilog is a trademark of Zilog Corporation. Exabyte is a trademark of Exabyte Corporation. Ciprico is a trademark of Ciprico Corporation. Xylogics is a trademark of Xylogics Corporation. VMEbus is a trademark of VMEbus Manufacturers' Association. This document contains highly-sensitive confidential information that may only be viewed by employees of Solbourne Computer, Inc. DO NOT COPY OR DISTRIBUTE THIS MANUAL. Part Number: 101250-AB February 1990 Copyright © 1990 by Solboume Computer, Inc. All rights reserved. No part of this publication may be reproduced, stored in any media or in any type of retrieval system, transmitted in any form (e.g., electronic, mechanical, photocopying, recording) or translated into any language or computer language without the prior written permission of Solboume Computer, Inc., 1900 Pike Road, Longmont, Colorado 80501. There is no right to reverse engineer, decompile, or disassemble the information contained herein or in the accompanying software. Solboume Computer, Inc. reserves the right to revise this publication and to make changes from time to time without obligation to notify any person of such revisions or changes. Preface This manual covers the theory of operation of the Solbourne Series4 and SeriesS systems. -
Processor Architecture
Processor Architecture Springer-Verlag Berlin Heidelberg GmbH v Jurij Silc • Borut Robic • Theo Ungerer Processor Architecture From Dataflow to Superscalar and Beyond With 132 Figures and 34 Tables Springer Dr. Jurij Silc Computer Systems Department Jozef Stefan Institute Jamova 39 SI -1001 Ljubljana, Slovenia Assistant Professor Dr. Borut Robic Faculty of Computer and Information Science University of Ljubljana Tdaska cesta 25 SI-1001 Ljubljana, Slovenia Professor Dr. Theo Ungerer Department of Computer Design and Fault Tolerance University of Karlsruhe P.O. Box 6980 D-76128 Karlsruhe, Germany Library of Congress Cataloging-in-Publication Data applied for Die Deutsche Bibliothek - CIP-Einheitsaufnahme SiIc, Jurij: Processor architecture: from dataflow to superscalar and beyond/ Jurij SiIc; Borut Robic; Theo Ungerer. - Berlin; Heidelberg; New York; Barcelona; Hong Kong; London; Milan; Paris; Singapore; Tokyo: Springer, 1999 ISBN 978-3-540-64798-0 ISBN 978-3-642-58589-0 (eBook) DOI 10.1007/978-3-642-58589-0 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concemed, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Dupli cation of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. © Springer-Verlag Berlin Heidelberg 1999 Originally published by Springer-Verlag Berlin Heidelberg New Y ork in 1999 The use of general descriptive names, trademarks, etc. -
Validated Products List, 1995 No. 2
A111D4 NISTIR 5629 (Supersedes NISTIR 5585) VALIDATED PRODUCTS LIST Volume 1 1995 No. 2 Programming Languages Database Language SQL Judy B. Kaiiey Graphics Editor POSIX U.S. DEPARTMENT OF COMMERCE Computer Security Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 April 1995 GC 100 NIST . U56 no. 5629 VI fi0.2 NISTIR 5629 (Supersedes NISTIR 5585) VALIDATED PRODUCTS LIST Volume 1 1995 No. 2 Programming Languages Database Language SQL Judy B. Kailey Graphics Editor POSIX U.S. DEPARTMENT OF COMMERCE Computer Security Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 April 1995 (Supersedes January 1995 issue) U.S. DEPARTMENT OF COMMERCE Ronald H. Brown, Secretary TECHNOLOGY ADMINISTRATION Mary L. Good, Under Secretary for Technology NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY Arab Prabhakar, Director FOREWORD The Validated Products List (VPL) identifies information technology products that have been tested for conformance to Federal Information Processing Standards (FIPS) in accordance with Computer Systems Laboratory (CSL) conformance testing procedures, and have a current validation certificate or registered test report. The VPL also contains information about the organizations, test methods and procedures that support the validation programs for the FIPS identified in this document. The VPL includes computer language processors for programming languages COBOL, Fortran, Ada, Pascal, C, M[UMPS], and database language SQL; computer graphic implementations for GKS, CGM, PHIGS, and Raster Graphics; operating system implementations for POSIX; open systems interconnect implementations for GOSIP; and computer security implementations for DES, MAC and Key Management. -
Design Space Exploration of 64-Bit Arm Compute Nodes for Highly Energy Effcient Exascale Joël Wanza Weloli
Design space exploration of 64-bit arm compute nodes for highly energy effcient Exascale Joël Wanza Weloli To cite this version: Joël Wanza Weloli. Design space exploration of 64-bit arm compute nodes for highly energy effcient Exascale. Electronics. Université Côte d’Azur, 2019. English. NNT : 2019AZUR4042. tel-02530122 HAL Id: tel-02530122 https://tel.archives-ouvertes.fr/tel-02530122 Submitted on 2 Apr 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Ph.D. Thesis Modélisation, simulation de différents types d’architectures de noeuds de calcul basés sur l’architecture ARM et optimisés pour le calcul haute-performance Design Space Exploration of 64-bit ARM Compute Nodes for Highly Energy Efficient Exascale Joël WANZA WELOLI LEAT/Bull A dissertation submitted to the Ecole Members of the Thesis jury: Doctorale Sciences et Technologies de • Mr. Bertrand Granado, Professor, L’Information et de la Communication Sorbonne Université, Examiner (EDSTIC) of the Université Côte d’Azur for • Mr. Frédéric Petrot, Professor, Université Grenoble Alpes, Examiner the degree of Doctor of Electronic • Mr. Dominique Lavenier, CNRS Research Director, IRISA Directed by: Cécile Belleudy and • Mr. Elyes Zekri , Dr. -
Reconfigurable Architectures for General-Purpose Computing
MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY A.I. Technical Report No. 1586 October, 1996 Recon®gurable Architectures for General-Purpose Computing Andre DeHon [email protected] Abstract: General-purpose computing devices allow us to (1) customize computation after fabrication and (2) conserve area by reusing expensive active circuitry for different functions in time. We de®ne RP-space, a restricted domain of the general-purpose architectural space focussed on recon®gurable computing architectures. Two dominant features differentiate recon®gurable from special-purpose architectures and account for most of the area overhead associated with RP devices: (1) instructions which tell the device how to behave, and (2) ¯exible interconnect which supports task dependent data¯ow between operations. We can characterize RP-space by the allocation and structure of these resources and compare the ef®ciencies of architectural points across broad application characteristics. Conventional FPGAs fall at one extreme end of this space and their ef®ciency ranges over two orders of magnitude across the space of application characteristics. Understanding RP-space and its consequences allows us to pick the best architecture for a task and to search for more robust design points in the space. Our DPGA, a ®ne-grained computing device which adds small, on-chip instruction memories to FPGAs is one such design point. For typical logic applications and ®nite-state machines, a DPGA can implement tasks in one-third the area of a traditional FPGA. TSFPGA, a variant of the DPGA which focuses on heavily time-switched interconnect, achieves circuit densities close to the DPGA, while reducing typical physical mapping times from hours to seconds.