Strategies for the Modelling and Simulation of Asynchronous Computer Architectures

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science and Engineering
September 1995
By Georgios Theodoropoulos
Department of Computer Science

Contents

Abstract
The author
Acknowledgements
1 Introduction
  1.1 Background
  1.2 Motivation and Objectives
  1.3 Structure of the Thesis
    1.3.1 Related Publications
2 The Quest for High Performance
  2.1 Introduction
  2.2 Bit and Instruction Level Parallelism
  2.3 Reduced Instruction Set Computers
  2.4 The Limits of Sequential Computation
  2.5 Parallel Computer Architectures
    2.5.1 SIMD
    2.5.2 MIMD
      2.5.2.1 Shared Memory MIMD Architectures
      2.5.2.2 Distributed Memory MIMD Architectures
    2.5.3 Parallel Programming Models and Languages
      2.5.3.1 Communicating Sequential Processes
  2.6 Occam and the Transputer
    2.6.1 The Occam Programming Language
      2.6.1.1 The SEQ and PAR Constructs
      2.6.1.2 The ALT Construct
      2.6.1.3 Timers
      2.6.1.4 Functions and Procedures
    2.6.2 The Transputer
      2.6.2.1 Configuring Occam Programs
      2.6.2.2 The T9000 Transputer
  2.7 Summary
3 Modelling and Simulation
  3.1 Introduction
  3.2 Discrete Event Simulation Modelling
  3.3 The Need for Parallel Discrete Event Simulation
    3.3.1 Exploiting Parallelism
  3.4 The Logical Process Paradigm
    3.4.1 Timing Issues
  3.5 Synchronous versus Asynchronous Simulation
  3.6 Time Driven Logical Process Simulation
  3.7 Event Driven Logical Process Simulation
    3.7.1 Conservative Techniques
      3.7.1.1 Deadlock Avoidance
      3.7.1.2 Deadlock Detection and Recovery
      3.7.1.3 Characteristics of Conservative Protocols
  3.8 Optimistic Synchronization Protocols
    3.8.1 Time Warp
      3.8.1.1 Global Virtual Time
      3.8.1.2 State Saving and Memory Management
      3.8.1.3 Characteristics of Optimistic Protocols
  3.9 Modelling and Simulation in Computer Architecture Research
    3.9.1 The Need for Improved Digital System Simulation Performance
      3.9.1.1 Parallel Digital System Simulation
  3.10 Summary
4 Asynchronous Systems
  4.1 Introduction
  4.2 Advantages of Asynchronous Systems
    4.2.1 Clock Distribution Problems
    4.2.2 Potential for Low Power
    4.2.3 Potential for High Performance
    4.2.4 Better Technology Migration Potential
  4.3 Basic Characteristics of Asynchronous Systems
    4.3.1 Timing Model
    4.3.2 Signalling Protocols
      4.3.2.1 Two-phase Signalling
      4.3.2.2 Four-phase Signalling
    4.3.3 Data Passing Techniques
      4.3.3.1 The Four-Wire Technique
      4.3.3.2 The Three-Wire Technique
      4.3.3.3 The Two-Plus-Wire Technique
      4.3.3.4 The Bundled Data Technique
  4.4 Micropipelines
    4.4.1 Event Control Elements
    4.4.2 Event Controlled Storage Element
    4.4.3 Micropipelines Without Processing
    4.4.4 Micropipelines With Processing
  4.5 AMULET
  4.6 The AMULET1 Microprocessor
    4.6.1 The AMULET1 Interface
    4.6.2 The AMULET1 Internal Organization
      4.6.2.1 The Address Interface Unit
      4.6.2.2 The Data Interface Unit
      4.6.2.3 The Register Bank Unit
      4.6.2.4 The Execution Unit
      4.6.2.5 The Primary Decode Unit
    4.6.3 AMULET2
  4.7 Summary
5 Modelling Asynchronous Systems
  5.1 Introduction
  5.2 Modelling Techniques
    5.2.1 CSP-based Modelling Approaches
  5.3 Modelling Micropipelined Systems with Occam
    5.3.1 Why Occam
      5.3.1.1 The Deadlock Problem
    5.3.2 The Modelling Philosophy
    5.3.3 Modelling a Pipeline Without Processing
    5.3.4 Modelling a Pipeline With Processing
    5.3.5 Modelling Control Logic
    5.3.6 Timing Issues
      5.3.6.1 Synchronous Merge
      5.3.6.2 Data Dependent Merge
      5.3.6.3 Arbitrated Merge
      5.3.6.4 Delay Independence
  5.4 Summary
6 Occarm: An Occam Model of AMULET1
  6.1 Introduction
  6.2 Occarm General Structure
    6.2.1 Non-Bundled Signals
  6.3 The Address Interface
    6.3.1 The Address Interface Internal Organization
      6.3.1.1 The PC Loop
      6.3.1.2 The PC Pipe
      6.3.1.3 The LSM Loop
    6.3.2 The Address Interface Occam Model
  6.4 The Data Interface
  6.5 Instruction Flow Control
    6.5.1 Condition Code Evaluation
    6.5.2 Branch Execution
    6.5.3 Exception Handling
      6.5.3.1 Software Interrupts
      6.5.3.2 Instruction Prefetch Aborts
      6.5.3.3 Hardware Interrupts
      6.5.3.4 Data Transfer Aborts
  6.6 The Primary Decode
    6.6.1 The Dec1CtrlA Process
      6.6.1.1 Modelling of the Arbitration logic
      6.6.1.2 Detecting Data Aborts
    6.6.2 The Dec1CtrlB Process
  6.7 The Register Bank
    6.7.1 Modelling the Register Bank
  6.8 The Execution Unit Model
    6.8.1 The CPSR Model
    6.8.2 Decode2
    6.8.3 Decode3
  6.9 The Write Bus Control
  6.10 Summary
7 Simulation Issues
  7.1 Introduction
  7.2 The Host Machine: The ParSiFal T-Rack
  7.3 Monitoring
    7.3.1 Monitoring Occarm
      7.3.1.1 Debugging
      7.3.1.2 Performance Evaluation
  7.4 Termination
  7.5 The Simulator Environment
  7.6 Multiprocessor Implementation
    7.6.1 Mapping Occarm onto the T-Rack
      7.6.1.1 Balancing the Workload
      7.6.1.2 Balancing the Communication Load
      7.6.1.3 The Monitoring Path
      7.6.1.4 The Generic Simulator Node
  7.7 Summary
8 Validation of the Occarm Model
  8.1 Introduction
  8.2 Benchmark Programs
  8.3 Accuracy
  8.4 Performance
  8.5 Summary
9 Addressing the Time Modelling Problem
  9.1 Introduction
  9.2 Requirements
  9.3 The Program Driven Synchronization Protocol (PDSP)
    9.3.1 The Basis
    9.3.2 The Rules
    9.3.3 The PDSP Arbiter Process
      9.3.3.1 Improving PDSP Performance
    9.3.4 The Limitations
  9.4 Applying PDSP to Occarm
  9.5 The Address Interface Arbiter
    9.5.1 Providing Instruction Lookahead Information
    9.5.2 The PCch Link
      9.5.2.1 Filling of the Datapath
      9.5.2.2 Register Read Instructions
      9.5.2.3 Instructions Activating the ALUgo Signal
      9.5.2.4 Load/Store Multiple Instructions
      9.5.2.5 The Instruction Lookahead Table
    9.5.3 The Wch Link
      9.5.3.1 Colour Mismatch
      9.5.3.2 Condition Codes Failure
  9.6 The Primary Decode Arbiter
  9.7 The Write Control Arbiter
    9.7.1 The DINch Link
    9.7.2 The DPch Link
  9.8 Performance Evaluation of PDSP
  9.9 Summary
10 Conclusions and Further Work
  10.1 Background
  10.2 Contribution of the Thesis
    10.2.1 Modelling
    10.2.2 Simulation
  10.3 The Program Driven Synchronization Protocol
  10.4 Performance
  10.5 Occam as an Asynchronous Hardware Description Language
  10.6 Further Work
    10.6.1 Modelling and Simulation
    10.6.2 Automatic Synthesis
A The ARM6 Programmer’s Model
  A.1 The Registers
  A.2 The Instruction Set
B Modelling the Control Logic of AMULET1
Bibliography

List of Tables

7.1 Communication Load on Occarm Links
8.1 Timestamp Drift
8.2 Dhrystone Numbers
8.3 AMULET1 Pipeline Occupancy (Dhrystone (1 loop))
8.4 AMULET1 Pipeline Stalls (Dhrystone (1 loop))
8.5 Asim versus Occarm (Single Transputer Implementation)
8.6 Performance of Occarm
9.1 PDSP: Number of Free Stages in the Datapath
9.2 Performance of PDSP (Address Interface)

List of Figures

2.1 The Use of the Occam SEQ and PAR Constructs
2.2 The Occam ALT Construct
2.3 Introducing Delays with Occam Timers
2.4 Programming Timeout Behaviour
2.5 An Example Occam Program
2.6 The Architecture of the T800 Transputer
3.1 A Taxonomy of Models
3.2 Abstraction Levels in Digital Systems
4.1 The Request-Acknowledge Interface
4.2 Two-phase Signalling: Rising and Falling Edges Equivalent
4.3 Two-phase Signalling Protocol
4.4 Four-phase Signalling Protocol
4.5 The Bundled Data Interface
4.6 The Two-phase Bundled Data Protocol
4.7 Event Control Modules
4.8 The Capture-Pass Storage Element
4.9 Micropipeline Without Processing
4.10 Micropipeline With Processing
4.11 The AMULET1 Interface
4.12 The AMULET1 Internal Organization
4.13 The AMULET1 Processor Physical Layout
5.1 Micropipeline Without Processing: The Register Model
5.2 Micropipeline With Processing: A High Level View
5.3 Micropipeline With Processing: The Register Model
5.4 Synchronous Merge