Processors and Memory Heirac
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Floating-Point Package for Intel 8008 and 8080 Microprocessors
UCRL-51940 FLOATING-POINT PACKAGE FOR INTEL 8008 AND 8080 MICROPROCESSORS Michael D. Maples October 24, 1975 Prepared for U.S. Energy Research& Development Administration under contract No. W-7405-Eng-48 I_AV~=IENCE I_IVEFIMORE I.ABOFIATOFIY University ol Calilomia ~ Livermore ~ NOTICE .sponsored by tht: United $~ates G~ven~menl.Neilhe~ the United States nor the United ~tates I’:n,~rgy of their employees,IIOr lilly of their eorltl’ilctclrs~ warranty~ express t~r implied, or asstltlleS ~t]y legld liability or responsihilit y fnr the accuracy, apparatus, product or ])rc)eess disclosed, represents that its rise would IIt~l illl’r liege privlttely-owned rights." Printed in the United States of America Avai.] able from National Technical. information Service U.S. Department of Commerce 5285 Port Royal Road Springfield, Virginia 22151 Price: Printed Copy $ *; Microfiche $2.25 NTIS ""Pages _Sellin_.g Price 1-50 $4.00 51-150 $5.45 151-325 $7.60 326-500 $10.60 501.-1000 $13.60 DISCI.AlMBR This documeut was prepared as an account of work sponsored by an agency of the United States Gnvernment.Neither the United States Governmentnor the University of California nor any of their employees,makes any warranty, express or implied, or assumesany legal liability or responsibility for the accuracy, complete.aess, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use wouldnot infrioge privately ownedrights. Refarenceherein to any specific commercialproduct, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation,or favoring by the United States Govermnentor the University of California. -
Data-Level Parallelism
Fall 2015 :: CSE 610 – Parallel Computer Architectures Data-Level Parallelism Nima Honarmand Fall 2015 :: CSE 610 – Parallel Computer Architectures Overview • Data Parallelism vs. Control Parallelism – Data Parallelism: parallelism arises from executing essentially the same code on a large number of objects – Control Parallelism: parallelism arises from executing different threads of control concurrently • Hypothesis: applications that use massively parallel machines will mostly exploit data parallelism – Common in the Scientific Computing domain • DLP originally linked with SIMD machines; now SIMT is more common – SIMD: Single Instruction Multiple Data – SIMT: Single Instruction Multiple Threads Fall 2015 :: CSE 610 – Parallel Computer Architectures Overview • Many incarnations of DLP architectures over decades – Old vector processors • Cray processors: Cray-1, Cray-2, …, Cray X1 – SIMD extensions • Intel SSE and AVX units • Alpha Tarantula (didn’t see light of day ) – Old massively parallel computers • Connection Machines • MasPar machines – Modern GPUs • NVIDIA, AMD, Qualcomm, … • Focus of throughput rather than latency Vector Processors 4 SCALAR VECTOR (1 operation) (N operations) r1 r2 v1 v2 + + r3 v3 vector length add r3, r1, r2 vadd.vv v3, v1, v2 Scalar processors operate on single numbers (scalars) Vector processors operate on linear sequences of numbers (vectors) 6.888 Spring 2013 - Sanchez and Emer - L14 What’s in a Vector Processor? 5 A scalar processor (e.g. a MIPS processor) Scalar register file (32 registers) Scalar functional units (arithmetic, load/store, etc) A vector register file (a 2D register array) Each register is an array of elements E.g. 32 registers with 32 64-bit elements per register MVL = maximum vector length = max # of elements per register A set of vector functional units Integer, FP, load/store, etc Some times vector and scalar units are combined (share ALUs) 6.888 Spring 2013 - Sanchez and Emer - L14 Example of Simple Vector Processor 6 6.888 Spring 2013 - Sanchez and Emer - L14 Basic Vector ISA 7 Instr. -
Computer Organization and Architecture Designing for Performance Ninth Edition
COMPUTER ORGANIZATION AND ARCHITECTURE DESIGNING FOR PERFORMANCE NINTH EDITION William Stallings Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montréal Toronto Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo Editorial Director: Marcia Horton Designer: Bruce Kenselaar Executive Editor: Tracy Dunkelberger Manager, Visual Research: Karen Sanatar Associate Editor: Carole Snyder Manager, Rights and Permissions: Mike Joyce Director of Marketing: Patrice Jones Text Permission Coordinator: Jen Roach Marketing Manager: Yez Alayan Cover Art: Charles Bowman/Robert Harding Marketing Coordinator: Kathryn Ferranti Lead Media Project Manager: Daniel Sandin Marketing Assistant: Emma Snider Full-Service Project Management: Shiny Rajesh/ Director of Production: Vince O’Brien Integra Software Services Pvt. Ltd. Managing Editor: Jeff Holcomb Composition: Integra Software Services Pvt. Ltd. Production Project Manager: Kayla Smith-Tarbox Printer/Binder: Edward Brothers Production Editor: Pat Brown Cover Printer: Lehigh-Phoenix Color/Hagerstown Manufacturing Buyer: Pat Brown Text Font: Times Ten-Roman Creative Director: Jayne Conte Credits: Figure 2.14: reprinted with permission from The Computer Language Company, Inc. Figure 17.10: Buyya, Rajkumar, High-Performance Cluster Computing: Architectures and Systems, Vol I, 1st edition, ©1999. Reprinted and Electronically reproduced by permission of Pearson Education, Inc. Upper Saddle River, New Jersey, Figure 17.11: Reprinted with permission from Ethernet Alliance. Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook appear on the appropriate page within text. Copyright © 2013, 2010, 2006 by Pearson Education, Inc., publishing as Prentice Hall. All rights reserved. Manufactured in the United States of America. -
Lecture 14: Gpus
LECTURE 14 GPUS DANIEL SANCHEZ AND JOEL EMER [INCORPORATES MATERIAL FROM KOZYRAKIS (EE382A), NVIDIA KEPLER WHITEPAPER, HENNESY&PATTERSON] 6.888 PARALLEL AND HETEROGENEOUS COMPUTER ARCHITECTURE SPRING 2013 Today’s Menu 2 Review of vector processors Basic GPU architecture Paper discussions 6.888 Spring 2013 - Sanchez and Emer - L14 Vector Processors 3 SCALAR VECTOR (1 operation) (N operations) r1 r2 v1 v2 + + r3 v3 vector length add r3, r1, r2 vadd.vv v3, v1, v2 Scalar processors operate on single numbers (scalars) Vector processors operate on linear sequences of numbers (vectors) 6.888 Spring 2013 - Sanchez and Emer - L14 What’s in a Vector Processor? 4 A scalar processor (e.g. a MIPS processor) Scalar register file (32 registers) Scalar functional units (arithmetic, load/store, etc) A vector register file (a 2D register array) Each register is an array of elements E.g. 32 registers with 32 64-bit elements per register MVL = maximum vector length = max # of elements per register A set of vector functional units Integer, FP, load/store, etc Some times vector and scalar units are combined (share ALUs) 6.888 Spring 2013 - Sanchez and Emer - L14 Example of Simple Vector Processor 5 6.888 Spring 2013 - Sanchez and Emer - L14 Basic Vector ISA 6 Instr. Operands Operation Comment VADD.VV V1,V2,V3 V1=V2+V3 vector + vector VADD.SV V1,R0,V2 V1=R0+V2 scalar + vector VMUL.VV V1,V2,V3 V1=V2*V3 vector x vector VMUL.SV V1,R0,V2 V1=R0*V2 scalar x vector VLD V1,R1 V1=M[R1...R1+63] load, stride=1 VLDS V1,R1,R2 V1=M[R1…R1+63*R2] load, stride=R2 -
Lecture 1: Course Introduction G Course Organization G Historical Overview G Computer Organization G Why the MC68000? G Why Assembly Language?
Lecture 1: Course introduction g Course organization g Historical overview g Computer organization g Why the MC68000? g Why assembly language? Microprocessor-based System Design 1 Ricardo Gutierrez-Osuna Wright State University Course organization g Grading Instructor n Exams Ricardo Gutierrez-Osuna g 1 midterm and 1 final Office: 401 Russ n Homework Tel:775-5120 g 4 problem sets (not graded) [email protected] n Quizzes http://www.cs.wright.edu/~rgutier g Biweekly Office hours: TBA n Laboratories g 5 Labs Teaching Assistant g Grading scheme Mohammed Tabrez Office: 339 Russ [email protected] Weight (%) Office hours: TBA Quizes 20 Laboratory 40 Midterm 20 Final Exam 20 Microprocessor-based System Design 2 Ricardo Gutierrez-Osuna Wright State University Course outline g Module I: Programming (8 lectures) g MC68000 architecture (2) g Assembly language (5) n Instruction and addressing modes (2) n Program control (1) n Subroutines (2) g C language (1) g Module II: Peripherals (9) g Exception processing (1) g Devices (6) n PI/T timer (2) n PI/T parallel port (2) n DUART serial port (1) g Memory and I/O interface (1) g Address decoding (2) Microprocessor-based System Design 3 Ricardo Gutierrez-Osuna Wright State University Brief history of computers GENERATION FEATURES MILESTONES YEAR NOTES Asia Minor, Abacus 3000BC Only replaced by paper and pencil Mech., Blaise Pascal, Pascaline 1642 Decimal addition (8 decimal figs) Early machines Electro- Charles Babbage Differential Engine 1823 Steam powered (3000BC-1945) mech. Herman Hollerith, -
The Birth, Evolution and Future of Microprocessor
The Birth, Evolution and Future of Microprocessor Swetha Kogatam Computer Science Department San Jose State University San Jose, CA 95192 408-924-1000 [email protected] ABSTRACT timed sequence through the bus system to output devices such as The world's first microprocessor, the 4004, was co-developed by CRT Screens, networks, or printers. In some cases, the terms Busicom, a Japanese manufacturer of calculators, and Intel, a U.S. 'CPU' and 'microprocessor' are used interchangeably to denote the manufacturer of semiconductors. The basic architecture of 4004 same device. was developed in August 1969; a concrete plan for the 4004 The different ways in which microprocessors are categorized are: system was finalized in December 1969; and the first microprocessor was successfully developed in March 1971. a) CISC (Complex Instruction Set Computers) Microprocessors, which became the "technology to open up a new b) RISC (Reduced Instruction Set Computers) era," brought two outstanding impacts, "power of intelligence" and "power of computing". First, microprocessors opened up a new a) VLIW(Very Long Instruction Word Computers) "era of programming" through replacing with software, the b) Super scalar processors hardwired logic based on IC's of the former "era of logic". At the same time, microprocessors allowed young engineers access to "power of computing" for the creative development of personal 2. BIRTH OF THE MICROPROCESSOR computers and computer games, which in turn led to growth in the In 1970, Intel introduced the first dynamic RAM, which increased software industry, and paved the way to the development of high- IC memory by a factor of four. -
Professor Won Woo Ro, School of Electrical and Electronic Engineering Yonsei University the Intel® 4004 Microprocessor, Introdu
Professor Won Woo Ro, School of Electrical and Electronic Engineering Yonsei University The 1st Microprocessor The Intel® 4004 microprocessor, introduced in November 1971 An electronics revolution that changed our world. There were no customer‐ programmable microprocessors on the market before the 4004. It propelled software into the limelight as a key player in the world of digital electronics design. 4004 Microprocessor Display at New Intel Museum A Japanese calculator maker (Busicom) asked to design: A set of 12 custom logic chips for a line of programmable calculators. Marcian E. "Ted" Hoff Recognized the integrated circuit technology (of the day) had advanced enough to build a single chip, general purpose computer. Federico Faggin to turn Hoff's vision into a silicon reality. (In less than one year, Faggin and his team delivered the 4004, which was introduced in November, 1971.) The world's first microprocessor application was this Busicom calculator. (sold about 100,000 calculators.) Measuring 1/8 inch wide by 1/6 inch long, consisting of 2,300 transistors, Intel’s 4004 microprocessor had as much computing power as the first electronic computer, ENIAC. 2 inch 4004 and 12 inch Core™2 Duo wafer ENIAC, built in 1946, filled 3000‐cubic‐ feet of space and contained 18,000 vacuum tubes. The 4004 microprocessor could execute 60,000 operations per second Running frequency: 108 KHz Founders wanted to name their new company Moore Noyce. However the name sounds very much similar to “more noise”. "Only the paranoid survive". Moore received a B.S. degree in Chemistry from the University of California, Berkeley in 1950 and a Ph.D. -
Computer Organization EECC 550 • Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Week 1 Notation (RTN)
Computer Organization EECC 550 • Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Week 1 Notation (RTN). [Chapters 1, 2] • Instruction Set Architecture (ISA) Characteristics and Classifications: CISC Vs. RISC. [Chapter 2] Week 2 • MIPS: An Example RISC ISA. Syntax, Instruction Formats, Addressing Modes, Encoding & Examples. [Chapter 2] • Central Processor Unit (CPU) & Computer System Performance Measures. [Chapter 4] Week 3 • CPU Organization: Datapath & Control Unit Design. [Chapter 5] Week 4 – MIPS Single Cycle Datapath & Control Unit Design. – MIPS Multicycle Datapath and Finite State Machine Control Unit Design. Week 5 • Microprogrammed Control Unit Design. [Chapter 5] – Microprogramming Project Week 6 • Midterm Review and Midterm Exam Week 7 • CPU Pipelining. [Chapter 6] • The Memory Hierarchy: Cache Design & Performance. [Chapter 7] Week 8 • The Memory Hierarchy: Main & Virtual Memory. [Chapter 7] Week 9 • Input/Output Organization & System Performance Evaluation. [Chapter 8] Week 10 • Computer Arithmetic & ALU Design. [Chapter 3] If time permits. Week 11 • Final Exam. EECC550 - Shaaban #1 Lec # 1 Winter 2005 11-29-2005 Computing System History/Trends + Instruction Set Architecture (ISA) Fundamentals • Computing Element Choices: – Computing Element Programmability – Spatial vs. Temporal Computing – Main Processor Types/Applications • General Purpose Processor Generations • The Von Neumann Computer Model • CPU Organization (Design) • Recent Trends in Computer Design/performance • Hierarchy -
Microprocessors in the 1970'S
Part II 1970's -- The Altair/Apple Era. 3/1 3/2 Part II 1970’s -- The Altair/Apple era Figure 3.1: A graphical history of personal computers in the 1970’s, the MITS Altair and Apple Computer era. Microprocessors in the 1970’s 3/3 Figure 3.2: Andrew S. Grove, Robert N. Noyce and Gordon E. Moore. Figure 3.3: Marcian E. “Ted” Hoff. Photographs are courtesy of Intel Corporation. 3/4 Part II 1970’s -- The Altair/Apple era Figure 3.4: The Intel MCS-4 (Micro Computer System 4) basic system. Figure 3.5: A photomicrograph of the Intel 4004 microprocessor. Photographs are courtesy of Intel Corporation. Chapter 3 Microprocessors in the 1970's The creation of the transistor in 1947 and the development of the integrated circuit in 1958/59, is the technology that formed the basis for the microprocessor. Initially the technology only enabled a restricted number of components on a single chip. However this changed significantly in the following years. The technology evolved from Small Scale Integration (SSI) in the early 1960's to Medium Scale Integration (MSI) with a few hundred components in the mid 1960's. By the late 1960's LSI (Large Scale Integration) chips with thousands of components had occurred. This rapid increase in the number of components in an integrated circuit led to what became known as Moore’s Law. The concept of this law was described by Gordon Moore in an article entitled “Cramming More Components Onto Integrated Circuits” in the April 1965 issue of Electronics magazine [338]. -
Vector Vs. Scalar Processors: a Performance Comparison Using a Set of Computational Science Benchmarks
Vector vs. Scalar Processors: A Performance Comparison Using a Set of Computational Science Benchmarks Mike Ashworth, Ian J. Bush and Martyn F. Guest, Computational Science & Engineering Department, CCLRC Daresbury Laboratory ABSTRACT: Despite a significant decline in their popularity in the last decade vector processors are still with us, and manufacturers such as Cray and NEC are bringing new products to market. We have carried out a performance comparison of three full-scale applications, the first, SBLI, a Direct Numerical Simulation code from Computational Fluid Dynamics, the second, DL_POLY, a molecular dynamics code and the third, POLCOMS, a coastal-ocean model. Comparing the performance of the Cray X1 vector system with two massively parallel (MPP) micro-processor-based systems we find three rather different results. The SBLI PCHAN benchmark performs excellently on the Cray X1 with no code modification, showing 100% vectorisation and significantly outperforming the MPP systems. The performance of DL_POLY was initially poor, but we were able to make significant improvements through a few simple optimisations. The POLCOMS code has been substantially restructured for cache-based MPP systems and now does not vectorise at all well on the Cray X1 leading to poor performance. We conclude that both vector and MPP systems can deliver high performance levels but that, depending on the algorithm, careful software design may be necessary if the same code is to achieve high performance on different architectures. KEYWORDS: vector processor, scalar processor, benchmarking, parallel computing, CFD, molecular dynamics, coastal ocean modelling All of the key computational science groups in the 1. Introduction UK made use of vector supercomputers during their halcyon days of the 1970s, 1980s and into the early 1990s Vector computers entered the scene at a very early [1]-[3]. -
Computer Organization EECC 550 • Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Week 1 Notation (RTN)
Computer Organization EECC 550 • Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Week 1 Notation (RTN). [Chapters 1, 2] • Instruction Set Architecture (ISA) Characteristics and Classifications: CISC Vs. RISC. [Chapter 2] Week 2 • MIPS: An Example RISC ISA. Syntax, Instruction Formats, Addressing Modes, Encoding & Examples. [Chapter 2] • Central Processor Unit (CPU) & Computer System Performance Measures. [Chapter 1] 3rd Edition Ch. 4 Week 3 • CPU Organization: Datapath & Control Unit Design. [Chapter 4] 3rd Edition Ch. 5 Week 4 – MIPS Single Cycle Datapath & Control Unit Design. – MIPS Multicycle Datapath and Finite State Machine Control Unit Design. 3rd Edition Ch. 5 (not in 4th) Week 5 • Microprogrammed Control Unit Design. 3rd Edition Ch. 5 (not in 4th Edition) – Microprogramming Project Week 6 • Midterm Review and Midterm Exam Week 7 • CPU Pipelining. [Chapter 4] 3rd Edition Ch. 6 • The Memory Hierarchy: Cache Design & Performance. [Chapter 5] Week 8 3rd Edition Ch. 7 • The Memory Hierarchy: Main & Virtual Memory. [Chapter 5] Week 9 • Input/Output Organization & System Performance Evaluation. [Chapter 7] 3rd Edition Ch. 8 Week 10 • Computer Arithmetic & ALU Design. [Chapter 3] If time permits. Week 11 • Final Exam. EECC550 - Shaaban #1 Lec # 1 Winter 2011 11-29-2011 Computing System History/Trends + Instruction Set Architecture (ISA) Fundamentals • Computing Element Choices: – Computing Element Programmability – Spatial vs. Temporal Computing – Main Processor Types/Applications • General -
Design and Implementation of a Multithreaded Associative Simd Processor
DESIGN AND IMPLEMENTATION OF A MULTITHREADED ASSOCIATIVE SIMD PROCESSOR A dissertation submitted to Kent State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy by Kevin Schaffer December, 2011 Dissertation written by Kevin Schaffer B.S., Kent State University, 2001 M.S., Kent State University, 2003 Ph.D., Kent State University, 2011 Approved by Robert A. Walker, Chair, Doctoral Dissertation Committee Johnnie W. Baker, Members, Doctoral Dissertation Committee Kenneth E. Batcher, Eugene C. Gartland, Accepted by John R. D. Stalvey, Administrator, Department of Computer Science Timothy Moerland, Dean, College of Arts and Sciences ii TABLE OF CONTENTS LIST OF FIGURES ......................................................................................................... viii LIST OF TABLES ............................................................................................................. xi CHAPTER 1 INTRODUCTION ........................................................................................ 1 1.1. Architectural Trends .............................................................................................. 1 1.1.1. Wide-Issue Superscalar Processors............................................................... 2 1.1.2. Chip Multiprocessors (CMPs) ...................................................................... 2 1.2. An Alternative Approach: SIMD ........................................................................... 3 1.3. MTASC Processor ................................................................................................