MMX™ Technology Architecture Overview
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
New Instruction Set Extensions
New Instruction Set Extensions Instruction Set Innovation in Intels Processor Code Named Haswell [email protected] Agenda • Introduction - Overview of ISA Extensions • Haswell New Instructions • New Instructions Overview • Intel® AVX2 (256-bit Integer Vectors) • Gather • FMA: Fused Multiply-Add • Bit Manipulation Instructions • TSX/HLE/RTM • Tools Support for New Instruction Set Extensions • Summary/References Copyright© 2012, Intel Corporation. All rights reserved. Partially Intel Confidential Information. 2 *Other brands and names are the property of their respective owners. Instruction Set Architecture (ISA) Extensions 199x MMX, CMOV, Multiple new instruction sets added to the initial 32bit instruction PAUSE, set of the Intel® 386 processor XCHG, … 1999 Intel® SSE 70 new instructions for 128-bit single-precision FP support 2001 Intel® SSE2 144 new instructions adding 128-bit integer and double-precision FP support 2004 Intel® SSE3 13 new 128-bit DSP-oriented math instructions and thread synchronization instructions 2006 Intel SSSE3 16 new 128-bit instructions including fixed-point multiply and horizontal instructions 2007 Intel® SSE4.1 47 new instructions improving media, imaging and 3D workloads 2008 Intel® SSE4.2 7 new instructions improving text processing and CRC 2010 Intel® AES-NI 7 new instructions to speedup AES 2011 Intel® AVX 256-bit FP support, non-destructive (3-operand) 2012 Ivy Bridge NI RNG, 16 Bit FP 2013 Haswell NI AVX2, TSX, FMA, Gather, Bit NI A long history of ISA Extensions ! Copyright© 2012, Intel Corporation. All rights reserved. Partially Intel Confidential Information. 3 *Other brands and names are the property of their respective owners. Instruction Set Architecture (ISA) Extensions • Why new instructions? • Higher absolute performance • More energy efficient performance • New application domains • Customer requests • Fill gaps left from earlier extensions • For a historical overview see http://en.wikipedia.org/wiki/X86_instruction_listings Copyright© 2012, Intel Corporation. -
VOLUME V INFORMATIQUE NON AMERICAINE Première Partie Par L' Ingénieur Général De L'armement BOUCHER Henri TABLE
VOLUME V INFORMATIQUE NON AMERICAINE Première partie par l' Ingénieur Général de l'Armement BOUCHER Henri TABLE DES MATIERES INFORMATIQUE NON AMERICAINE Première partie 731 Informatique européenne (statistiques, exemples) 122 700 Histoire de l'Informatique allemande 1 701 Petits constructeurs 5 702 Les facturières de Kienzle Data System 16 703 Les minis de gestion de Nixdorf 18 704 Siemens & Halske AG 23 705 Systèmes informatiques d'origine allemande 38 706 Histoire de l'informatique britannique 40 707 Industriels anglais de l'informatique 42 708 Travaux des Laboratoires d' Etat 60 709 Travaux universitaires 63 710 Les coeurs synthétisables d' ARM 68 711 Computer Technology 70 712 Elliott Brothers et Elliott Automation 71 713 Les machines d' English Electric Company 74 714 Les calculateurs de Ferranti 76 715 Les études de General Electric Company 83 716 La patiente construction de ICL 85 Catalogue informatique – Volume E - Ingénieur Général de l'Armement Henri Boucher Page : 1/333 717 La série 29 d' ICL 89 718 Autres produits d' ICL, et fin 94 719 Marconi Company 101 720 Plessey 103 721 Systèmes en Grande-Bretagne 105 722 Histoire de l'informatique australienne 107 723 Informatique en Autriche 109 724 Informatique belge 110 725 Informatique canadienne 111 726 Informatique chinoise 116 727 Informatique en Corée du Sud 118 728 Informatique à Cuba 119 729 Informatique danoise 119 730 Informatique espagnole 121 732 Informatique finlandaise 128 733 Histoire de l'informatique française 130 734 La période héroïque : la SEA 140 735 La Compagnie -
Hyper-Threading Performance with Intel Cpus for Linux SAP Deployment on Proliant Servers
Hyper-Threading Performance with Intel CPUs for Linux SAP Deployment on ProLiant Servers Session #3798 Hein van den Heuvel Performance Engineer Hewlett-Packard © 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Topics • Hyper-Threading Intro • Implementation details Intel, IBM, Sun • Linux implementation • My own tests • SAP (SD) benchmark • Benchmark Results • Conclusions: (18% improvement for SAP 2-tier) Intel Hyper-Threading Overview “Hyper-Threading Technology is a form of simultaneous multithreading technology (SMT), where multiple threads of software applications can be run simultaneously on one processor. This is achieved by duplicating the architectural state on each processor, while sharing one set of processor execution resources. The architectural state tracks the flow of a program or thread, and the execution resources are the units on the processor that do the work: add, multiply, load, etc. “ http://www.intel.com/business/bss/products/hyperthreading/server/ht_server.pdf http://www.intel.com/technology/hyperthread/ Intel HT in a picture To-be-updated Hyper-Threading Versus Dual Core • HP (PA + ipf) opted for ‘dual core’ technology. − Each processor has full set of resources − Only limitation is shared ‘system’ connection. − Allows for dense (8p – 4u – 4640) − minimally constrained systems • Software licensing impact (Oracle!) • Hyper-Threading technology effectiveness will depend on application IBM P5 SMT Summary Enhanced Simultaneous Multi-Threading features To improve SMT performance for various workload mixes and provide robust quality of service, POWER5 provides two features: • Dynamic resource balancing – The objective of dynamic resource balancing is to ensure that the two threads executing on the same processor flow smoothly through the system. -
SIMD Extensions
SIMD Extensions PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 12 May 2012 17:14:46 UTC Contents Articles SIMD 1 MMX (instruction set) 6 3DNow! 8 Streaming SIMD Extensions 12 SSE2 16 SSE3 18 SSSE3 20 SSE4 22 SSE5 26 Advanced Vector Extensions 28 CVT16 instruction set 31 XOP instruction set 31 References Article Sources and Contributors 33 Image Sources, Licenses and Contributors 34 Article Licenses License 35 SIMD 1 SIMD Single instruction Multiple instruction Single data SISD MISD Multiple data SIMD MIMD Single instruction, multiple data (SIMD), is a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data simultaneously. Thus, such machines exploit data level parallelism. History The first use of SIMD instructions was in vector supercomputers of the early 1970s such as the CDC Star-100 and the Texas Instruments ASC, which could operate on a vector of data with a single instruction. Vector processing was especially popularized by Cray in the 1970s and 1980s. Vector-processing architectures are now considered separate from SIMD machines, based on the fact that vector machines processed the vectors one word at a time through pipelined processors (though still based on a single instruction), whereas modern SIMD machines process all elements of the vector simultaneously.[1] The first era of modern SIMD machines was characterized by massively parallel processing-style supercomputers such as the Thinking Machines CM-1 and CM-2. These machines had many limited-functionality processors that would work in parallel. -
Computer Hardware
Computer Hardware MJ Rutter mjr19@cam Michaelmas 2014 Typeset by FoilTEX c 2014 MJ Rutter Contents History 4 The CPU 10 instructions ....................................... ............................................. 17 pipelines .......................................... ........................................... 18 vectorcomputers.................................... .............................................. 36 performancemeasures . ............................................... 38 Memory 42 DRAM .................................................. .................................... 43 caches............................................. .......................................... 54 Memory Access Patterns in Practice 82 matrixmultiplication. ................................................. 82 matrixtransposition . ................................................107 Memory Management 118 virtualaddressing .................................. ...............................................119 pagingtodisk ....................................... ............................................128 memorysegments ..................................... ............................................137 Compilers & Optimisation 158 optimisation....................................... .............................................159 thepitfallsofF90 ................................... ..............................................183 I/O, Libraries, Disks & Fileservers 196 librariesandkernels . ................................................197 -
Validated Products List, 1993 No. 2
mmmmm NJST PUBLICATIONS NISTIR 5167 (Supersedes NISTIR 5103) Validated Products List 1993 No. 2 Programming Languages Database Language SQL Graphics GOSIP POSIX Judy B. Kailey Computer Security Editor U.S. DEPARTMENT OF COMMERCE Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 April 1993 (Supersedes January 1993 issue) —QC 100 NIST . U56 #5167 1993 NISTIR 5167 (Supersedes NISTIR 5103) Validated Products List 1993 No. 2 Programming Languages Database Language SQL Graphics GOSIP POSIX Judy B. Kailey Computer Security Editor U.S. DEPARTMENT OF COMMERCE Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 April 1993 (Supersedes January 1993 issue) U.S. DEPARTMENT OF COMMERCE Ronald H. Brown, Secretary NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY Raymond Kammer, Acting Director FOREWORD The Validated Products List is a collection of registers describing implementations of Federal Information Processing Standards (FIPS) that have been validated for conformance to FTPS. The Validated Products List also contains information about the organizations, test methods and procedures that support the validation programs for the FIPS identified in this document. The Validated Products List is updated quarterly. iii iv TABLE OF CONTENTS 1. INTRODUCTION 1 1.1 Purpose 1 1.2 Document Organization 2 1.2.1 Programming Languages 2 1.2.2 Database -
Instruction-Level Parallelism in AES Candidates Craig S
Instruction-level Parallelism in AES Candidates Craig S. K. Clapp Instruction-level Parallelism in AES Candidates Craig S.K. Clapp PictureTel Corporation, 100 Minuteman Rd., Andover, MA01810, USA email: [email protected] Abstract. We explore the instruction-level parallelism present in a number of candidates for the Advanced Encryption Standard (AES) and demonstrate how their speed in software varies as a function of the execution resources available in the target CPU. An analysis of the critical paths through the algorithms is used to establish theoretical upper limits on their performance, while performance on finite machines is characterized on a family of hypothetical RISC/VLIW CPUs having from one through eight concurrent instruction-issue slots. The algorithms studied are Crypton, E2, Mars, RC6, Rijndael, Serpent, and Twofish. 1 Introduction Several performance comparisons among AES candidate algorithms have already been published, e.g.[6,7,8,10,16]. However, while such studies do indeed render the valuable service of providing a quantitative rather than qualitative comparison between candidates, and in some cases do so for a number of currently popular processors, they are not necessarily very insightful as to how to expect performance of the algorithms to compare on processors other than those listed, nor in particular on future processors that are likely to replace those in common use today. One of the lessons of DES[11] is that, apart from having too short a key, the original dominant focus on hardware implementation led to a design which over the years has become progressively less able to take full advantage of each new processor generation. -
X86 Assembly Language Reference Manual
x86 Assembly Language Reference Manual Part No: 817–5477–11 March 2010 Copyright ©2010 Oracle and/or its affiliates. All rights reserved. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related software documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable: U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are “commercial computer software” or “commercial technical data” pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation shall be subject to the restrictions and license terms setforth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License (December 2007). Oracle USA, Inc., 500 Oracle Parkway, Redwood City, CA 94065. -
Processor Design:System-On-Chip Computing For
Processor Design Processor Design System-on-Chip Computing for ASICs and FPGAs Edited by Jari Nurmi Tampere University of Technology Finland A C.I.P. Catalogue record for this book is available from the Library of Congress. ISBN 978-1-4020-5529-4 (HB) ISBN 978-1-4020-5530-0 (e-book) Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. www.springer.com Printed on acid-free paper All Rights Reserved © 2007 Springer No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. To Pirjo, Lauri, Eero, and Santeri Preface When I started my computing career by programming a PDP-11 computer as a freshman in the university in early 1980s, I could not have dreamed that one day I’d be able to design a processor. At that time, the freshmen were only allowed to use PDP. Next year I was given the permission to use the famous brand-new VAX-780 computer. Also, my new roommate at the dorm had got one of the first personal computers, a Commodore-64 which we started to explore together. Again, I could not have imagined that hundreds of times the processing power will be available in an everyday embedded device just a quarter of century later. -
Computer Architectures an Overview
Computer Architectures An Overview PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 25 Feb 2012 22:35:32 UTC Contents Articles Microarchitecture 1 x86 7 PowerPC 23 IBM POWER 33 MIPS architecture 39 SPARC 57 ARM architecture 65 DEC Alpha 80 AlphaStation 92 AlphaServer 95 Very long instruction word 103 Instruction-level parallelism 107 Explicitly parallel instruction computing 108 References Article Sources and Contributors 111 Image Sources, Licenses and Contributors 113 Article Licenses License 114 Microarchitecture 1 Microarchitecture In computer engineering, microarchitecture (sometimes abbreviated to µarch or uarch), also called computer organization, is the way a given instruction set architecture (ISA) is implemented on a processor. A given ISA may be implemented with different microarchitectures.[1] Implementations might vary due to different goals of a given design or due to shifts in technology.[2] Computer architecture is the combination of microarchitecture and instruction set design. Relation to instruction set architecture The ISA is roughly the same as the programming model of a processor as seen by an assembly language programmer or compiler writer. The ISA includes the execution model, processor registers, address and data formats among other things. The Intel Core microarchitecture microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA. The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be everything from single gates and registers, to complete arithmetic logic units (ALU)s and even larger elements. -
Dynamicsilicon Gilder Publishing, LLC
Written by Published by Nick Tredennick DynamicSilicon Gilder Publishing, LLC Vol. 2, No. 9 The Investor's Guide to Breakthrough Micro Devices September 2002 Lessons From the PC he worldwide market for personal computers has grown to 135 million units annually. Personal com- puters represent half of the worldwide revenue for semiconductors. In July of this year, PC makers Tshipped their billionth PC. I trace the story of the personal computer (PC) from its beginning. The lessons from the PC apply to contemporary products such as switches, routers, network processors, microprocessors, and cell phones. The story doesn’t repeat exactly because semiconductor-process advances change the rules. PC beginnings Intel introduced the first commercial microprocessor in 1971. The first microprocessors were designed solely as cost-effective substitutes for numerous chips in bills of material. But it wasn’t long before micro- processors became central processing units in small computer systems. The first advertisement for a micro- processor-based computer appeared in March 1974. Soon, companies, such as Scelbi Computer Consulting, MITS, and IMSAI, offered kit computers. Apple Computer incorporated in January 1977 and introduced the Apple II computer in April. The Apple II came fully assembled, which, together with the invention of the spreadsheet, changed the personal computer from a kit hobby to a personal business machine. In 1981, IBM legitimized personal computers by introducing the IBM Personal Computer. Once endorsed by IBM, many businesses bought personal computers. Even though it came out in August, IBM sold 15,000 units that year. Apple had a four-year head start. When IBM debuted its personal computer, the Apple II dom- inated the market. -
Extracting and Mapping Industry 4.0 Technologies Using Wikipedia
Computers in Industry 100 (2018) 244–257 Contents lists available at ScienceDirect Computers in Industry journal homepage: www.elsevier.com/locate/compind Extracting and mapping industry 4.0 technologies using wikipedia T ⁎ Filippo Chiarelloa, , Leonello Trivellib, Andrea Bonaccorsia, Gualtiero Fantonic a Department of Energy, Systems, Territory and Construction Engineering, University of Pisa, Largo Lucio Lazzarino, 2, 56126 Pisa, Italy b Department of Economics and Management, University of Pisa, Via Cosimo Ridolfi, 10, 56124 Pisa, Italy c Department of Mechanical, Nuclear and Production Engineering, University of Pisa, Largo Lucio Lazzarino, 2, 56126 Pisa, Italy ARTICLE INFO ABSTRACT Keywords: The explosion of the interest in the industry 4.0 generated a hype on both academia and business: the former is Industry 4.0 attracted for the opportunities given by the emergence of such a new field, the latter is pulled by incentives and Digital industry national investment plans. The Industry 4.0 technological field is not new but it is highly heterogeneous (actually Industrial IoT it is the aggregation point of more than 30 different fields of the technology). For this reason, many stakeholders Big data feel uncomfortable since they do not master the whole set of technologies, they manifested a lack of knowledge Digital currency and problems of communication with other domains. Programming languages Computing Actually such problem is twofold, on one side a common vocabulary that helps domain experts to have a Embedded systems mutual understanding is missing Riel et al. [1], on the other side, an overall standardization effort would be IoT beneficial to integrate existing terminologies in a reference architecture for the Industry 4.0 paradigm Smit et al.