MIPS, HPS, Two-Level Branch Prediction, and Compressed Code RISC Processor

Total Page:16

File Type:pdf, Size:1020Kb

MIPS, HPS, Two-Level Branch Prediction, and Compressed Code RISC Processor Awards ................................................................................................................................................................ Common Bonds: MIPS, HPS, Two-Level Branch Prediction, and Compressed Code RISC Processor ONUR MUTLU ETH Zurich RICH BELGARD ......We are continuing our series of of performance optimization on the com- sions made in the MIPS project that retrospectives for the 10 papers that piler and keep the hardware design sim- passed the test of time. The retrospective received the first set of MICRO Test of ple. The compiler is responsible for touches on the design tradeoffs made to Time (“ToT”) Awards in December generating and scheduling simple instruc- couple the hardware and the software, 2014.1,2 This issue features four retro- tions, which require little translation in the MIPS project’s effect on the later spectives written for six of the award- hardware to generate control signals to development of “fabless” semiconductor winning papers. We briefly introduce control the datapath components, which companies, and the use of benchmarks these papers and retrospectives and in turn keeps the hardware design simple. as a method for evaluating end-to-end hope that you will enjoy reading them as Thus, the instructions and hardware both performance of a system as, among much as we have. If anything ties these remain simple, whereas the compiler others, contributions of the MIPS project works together, it is the innovation they becomes much more important (and that have stood the test of time. delivered by taking a strong position in likely complex) because it must schedule the RISC/CISC debates of their decade. instructions well to ensure correct and High-Performance Systems We hope the IEEE Micro audience, espe- high-performance use of a simple pipe- The second retrospective addresses three cially younger generations, will find the line. Many modern processors continue related papers that received MICRO ToT historical perspective provided by these to take advantage of some of the design Awards in 2014. These papers introduce retrospectives invaluable. principles outlined in this paper, either in the High Performance Substrate (HPS) their ISA (the MIPS ISA still has consider- microarchitecture, examine its critical MIPS able use) or in their internal execution of design issues, and discuss the use of large The first retrospective is for the oldest instructions (many complex instructions hardware-supported atomic units to paper being discussed in this issue: are broken into simple RISC-style microin- enhance performance in HPS-style, dynam- “MIPS: A Microprocessor Architecture.”3 structions in modern processors). The ically scheduled microarchitectures. This work introduced MIPS (Microproces- MIPSISAanddesignisalsousedinmany The first paper, “HPS, A New Micro- sor without Interlocked Pipeline Stages) modern computer architecture courses architecture: Rationale and Introduction,”4 and its design philosophy, principles, hard- because of its simplicity. introduced the HPS microarchitecture, its ware implementation, related systems, Thomas Gross, Norman Jouppi, John design principles, and a prototype imple- and software issues. It is a concise over- Hennessy, Steven Przybylski, and Chris mentation. It also introduced the concept view of the MIPS design, which is one of Rowen provide in their retrospective a of restricted dataflow and its use in imple- the early reduced, or simple, instruction historical perspective on the development menting out-of-order instruction schedul- sets that have significantly impacted the of MIPS, the context in which they arrived ing and execution while maintaining microprocessor industry as well as com- at the design principles, and the develop- sequential execution and precise excep- puter architecture research and education ments that occurred after the paper. They tions from the programmer’s perspective for decades (and probably counting!). The also provide their reflections on what they (that is, without exposing the out-of-order- key design principle is to push the burden see as the design and evaluation deci- ness of dataflow execution to software). ....................................................... 70 Published by the IEEE Computer Society 0272-1732/16/$33.00 c 2016 IEEE This concept of out-of-order execution memory disambiguation problem, has on the realization that adding another level using a restricted dataflow engine that since been the subject of many works. of history to predict a branch’s direction supports precise exceptions has been the The third paper, “Hardware Support can greatly improve branch prediction hallmark of high-performance processor for Large Atomic Units in Dynamically accuracy, in addition to the single level designs since the success of the Intel Scheduled Machines,”7 likely provides that tracks which direction the same Pentium Pro,5 which implemented the the first treatment of large units of work static branch went the last N times it was concept. The paper describes how com- as atomic units that can enable higher executed (which had been established plexinstructionscanbetranslatedinhard- performance in an HPS-like dynamically practice since Jim Smith’s seminal ISCA ware to simple micro-operations and how scheduled processor. It discusses the 1981 paper11). In other words, the direc- dependent micro-operations are linked in benefits of enlarged blocks in an out-of- tion the branch took the last N times the hardware such that they are executed in order microarchitecture and describes same “history of branch directions” was a dataflow manner, in which a micro- how enlarged blocks can be formed at dif- encountered can provide a much better operation “fires” when its input operands ferent layers: architectural, compiler, and prediction for where the branch might go are ready. The mechanisms described in hardware. It introduces the notion of a fill the next time the same “history of branch this work paved the way for many future unit, which is a hardware structure that directions” is encountered. and current processor designs with com- enables the formation of large blocks of This insight formed the basis of many plex instruction sets to execute programs micro-operations that can be executed future branch predictor designs used in using out-of-order execution (and exploit atomically. This work has opened up new high-performance processors, most nota- some of the RISC principles) without dimensions and is considered a direct pre- bly starting with the Pentium Pro. Yeh requiring changes to the instruction set cursor to notions such as the trace cache, and Patt’s MICRO 1991 paper discusses architecture or the software. The retro- which is implemented in the Intel Pen- various design choices for this predictor spective by Yale Patt and his former stu- tium 4,8 and the block-structured ISA.9 and provides a detailed evaluation of their dents Wen-mei Hwu, Steve Melvin, and In their retrospective, Yale Patt and impact on prediction accuracy. For exam- Mike Shebanow beautifully describes the colleagues provide a historical perspec- ple, they examine the effect of keeping six major contributions of the original HPS tive of HPS, the environment and the global or local branch history registers paper. works that influenced the design deci- (many modern hybrid predictors use a The second paper, “Critical Issues sions behind it, its reception by the com- combination) and the effect of how the Regarding HPS, A High Performance munity and the industry, and what pattern history tables are organized. The Microarchitecture,”6 discusses what the happened afterward. We hope you enjoy retrospective provides the historical per- authors deem as critical issues in the reading this retrospective that describes spective as to how two-level branch pre- design of the HPS microarchitecture. The the inspirations and development of a diction came about, along with the paper is forward looking, discussing some research project that has heavily influ- contemporary works that followed it. of the problems that the authors thought enced almost all modern high- “would have to be solved to make HPS performance microprocessor designs, Compressed Code RISC viable,” as the retrospective puts it. The CISC or RISC. The authors also provide a Processor paper can be seen as a “problem book” glimpse of some of the future research The final retrospective in this issue is for that anticipates the many issues to be that has been enabled and influenced by “Executing Compressed Programs on an solved in designing a modern out-of-order the HPS project, of which the next paper Embedded RISC Architecture,”12 one of execution engine that translates a com- we discuss is a prime example. the first papers to tackle the problem of plex instruction set to a simple set of code compression to save physical mem- micro-operations internally. For example, Two-Level Branch Prediction ory space in cost-sensitive embedded it introduces the notion of a node cache, The third retrospective is for “Two-Level processors. It is also one of the first to an early precursor to a modern trace Adaptive Training Branch Prediction”10 by examine cost-related architectural issues cache, and discusses issues related to Tse-Yu Yeh and Yale Patt. This paper in a RISC-based system on chip (SoC), as control flow, design of the dynamic addresses a critical problem in the HPS Andy Wolfe’s retrospective nicely instruction scheduler, the memory units, microarchitecture, and in pipelined and describes. The basic idea is to
Recommended publications
  • 08-02-2021 Agenda Packet.Pdf
    AGENDA DELANO CITY COUNCIL REGULAR MEETING August 2, 2021 DELANO CITY HALL, 1015 – 11th Avenue 5:30 P.M. IN ACCORDANCE WITH THE GOVERNOR NEWSOM’S EXECUTIVE ORDER #N-08-21, THIS MEETING WILL BE CONDUCTED FULLY VIA TELECONFERENCE, DUE TO THE CURRENT RESTRICTIONS BY SAID ORDER AND CENTERS FOR DISEASES CONTROL AND PREVENTION (CDC) GUIDELINES. THE PUBLIC WILL HAVE ACCESS TO CALL IN, LISTEN TO THE MEETING AND PROVIDE PUBLIC COMMENT. IN ACCORDANCE WITH GOVERNOR NEWSOM’S EXECUTIVE ORDER N-08-21, THERE WILL NOT BE A PHYSICAL LOCATION FROM WHICH THE PUBLIC MAY ATTEND. IN ORDER TO CALL INTO TH E MEETING PLEASE SEE THE DIRECTIONS BELOW. CALL TO ORDER INVOCATION FLAG SALUTE ROLL CALL PRESENTATIONS AND AWARDS Featured Pet – Tabitha PUBLIC COMMENT: The public may address the Council on items which do not appear on the agenda. The Council cannot respond nor take action on items that do not appear on the agenda but may refer the item to staff for further study or for placement on a future agenda. Comments are limited to 3 minutes for each person and 15 minutes on each subject. Please state your name and address for the record. CONSENT AGENDA: The Consent Agenda consists of items that in staff’s opinion are routine and non-controversial. These items are approved in one motion unless a Councilmember or member of the public removes a particular item. 1) Authorization to waive the reading of any ordinance in its entirety and consenting to the reading of such ordinances by title only 2) Warrant Register in the amount of $3,044,398.54 3) Minutes of regular City Council Meeting of July 19, 2021 4) Acceptance and Approval of the City of Delano Quarterly Investment Report 5) Resolution adopting the 2020-2021 Kern Multi-Jurisdiction Hazard Mitigation Plan (MJHMP) 6) Approval of agreement with Youth Educational Sports, Inc.
    [Show full text]
  • Mipspro C++ Programmer's Guide
    MIPSproTM C++ Programmer’s Guide 007–0704–150 CONTRIBUTORS Rewritten in 2002 by Jean Wilson with engineering support from John Wilkinson and editing support from Susan Wilkening. COPYRIGHT Copyright © 1995, 1999, 2002 - 2003 Silicon Graphics, Inc. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic documentation in any manner, in whole or in part, without the prior written permission of Silicon Graphics, Inc. LIMITED RIGHTS LEGEND The electronic (software) version of this document was developed at private expense; if acquired under an agreement with the USA government or any contractor thereto, it is acquired as "commercial computer software" subject to the provisions of its applicable license agreement, as specified in (a) 48 CFR 12.212 of the FAR; or, if acquired for Department of Defense units, (b) 48 CFR 227-7202 of the DoD FAR Supplement; or sections succeeding thereto. Contractor/manufacturer is Silicon Graphics, Inc., 1600 Amphitheatre Pkwy 2E, Mountain View, CA 94043-1351. TRADEMARKS AND ATTRIBUTIONS Silicon Graphics, SGI, the SGI logo, IRIX, O2, Octane, and Origin are registered trademarks and OpenMP and ProDev are trademarks of Silicon Graphics, Inc. in the United States and/or other countries worldwide. MIPS, MIPS I, MIPS II, MIPS III, MIPS IV, R2000, R3000, R4000, R4400, R4600, R5000, and R8000 are registered or unregistered trademarks and MIPSpro, R10000, R12000, R1400 are trademarks of MIPS Technologies, Inc., used under license by Silicon Graphics, Inc. Portions of this publication may have been derived from the OpenMP Language Application Program Interface Specification.
    [Show full text]
  • Configurable RISC-V Softcore Processor for FPGA Implementation
    1 Configurable RISC-V softcore processor for FPGA implementation Joao˜ Filipe Monteiro Rodrigues, Instituto Superior Tecnico,´ Universidade de Lisboa Abstract—Over the past years, the processor market has and development of several programming tools. The RISC-V been dominated by proprietary architectures that implement Foundation controls the RISC-V evolution, and its members instruction sets that require licensing and the payment of fees to are responsible for promoting the adoption of RISC-V and receive permission so they can be used. ARM is an example of one of those companies that sell its microarchitectures to participating in the development of the new ISA. In the list of the manufactures so they can implement them into their own members are big companies like Google, NVIDIA, Western products, and it does not allow the use of its instruction set Digital, Samsung, or Qualcomm. (ISA) in other implementations without licensing. The RISC-V The main goal of this work is the development of a RISC- instruction set appeared proposing the hardware and software V softcore processor to be implemented in an FPGA, using development without costs, through the creation of an open- source ISA. This way, it is possible that any project that im- a non-RISC-V core as the base of this architecture. The plements the RISC-V ISA can be made available open-source or proposed solution is focused on solving the problems and even implemented in commercial products. However, the RISC- limitations identified in the other RISC-V cores that were V solutions that have been developed do not present the needed analyzed in this thesis, especially in terms of the adaptability requirements so they can be included in projects, especially the and flexibility, allowing future modifications according to the research projects, because they offer poor documentation, and their performances are not suitable.
    [Show full text]
  • Tom Kearney Object Name: Sun Sparcstation IPX Vintage: C.1991 Synopsis: Sun Sparcstation IPX
    AccessionIndex: TCD-SCSS-T.20121208.075 Accession Date: 8-Dec-2012 Accession By: Tom Kearney Object name: Sun Sparcstation IPX Vintage: c.1991 Synopsis: Sun Sparcstation IPX. S/N: 600-2791-04 213M1236. Description: The Sun Sparcstation IPX is a workstation introduced by Sun Microsystems in 1991. It was designed to be an entry-level networked workstation. It is based on the SUN4C architecture, enclosed in a lunchbox chassis. It uses a Fujitsu MB86903 or Weitek W8701 40 MHz processor. Weitek provided 80MHz after-market "SPARC POWERuP!" (2000A-080 GCD) processors which worked well in an IPX but required a ROM update to v2.9. It has four 72-pin SIMM slots for memory expansion. The memory uses parity Fast Page Memory (FPM) SIMM's with speeds of 50-80ns. Slots can be filled individually giving a maximum of 64MB memory. Paired memory modules decrease access times via "bank interleaving" resulting in faster memory and overall system performance. Additional 32 and 64MB SBUS "Above Board" RAM expanders will fit and work in the IPX using the 8-pin J101 header which contains additional power and clock signals next to the DMA/Cache controller. The Sparcstation IPX also includes an on-board AMD Lance Ethernet chipset providing 10BaseT networking as standard and 10Base2 and 10Base5 via an AUI transceiver. The OpenBoot ROM is able to boot from network, using RARP and TFTP. Like all other SPARCstation systems, it holds system information such as MAC address and serial number in NVRAM. If the battery on this chip dies, then the system will not be able to boot.
    [Show full text]
  • Memory Profiling on Shared-Memory Multiprocessors
    MEMORY PROFILING ON SHARED-MEMORY MULTIPROCESSORS A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Jeffrey S. Gibson June 2004 c Copyright by Jeffrey S. Gibson 2004 All Rights Reserved ii I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. Dr. John Hennessy (Principal Advisor) I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. Dr. Mark Horowitz I certify that I have read this dissertation and that in my opinion it is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. Dr. Mendel Rosenblum Approved for the University Committee on Graduate Studies: iii Abstract Tuning application memory performance can be difficult on any system but is particularly so on distributed shared-memory (DSM) multiprocessors. This is due to the implicit nature of communication, the unforeseen interactions among the processors, and the long remote memory latencies. Tools, called memory profilers, that allow the user to map memory behavior back to application data structures can be invaluable aids to the programmer. Un- fortunately, memory profiling is difficult to implement efficiently since most systems lack the requisite hardware support. This dissertation introduces two techniques for efficient memory profiling, each requiring hardware support on either the processor or the system node controller.
    [Show full text]
  • Microsparc-II-Usersm
    Products Rights Notice: Copyright © 1991-2008 Sun Microsystems, Inc. 4150 Network Circle, Santa Clara, California 95054, U.S.A. All Rights Reserved You understand that these materials were not prepared for public release and you assume all risks in using these materials. These risks include, but are not limited to errors, inaccuracies, incompleteness and the possibility that these materials infringe or misappropriate the intellectual property right of others. You agree to assume all such risks. THESE MATERIALS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND OTHER CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS (INCLUDING ANY OF OWNER'S PARTNERS, VENDORS AND LICENSORS) BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THESE MATERIALS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Sun, Sun Microsystems, the Sun logo, Solaris, OpenSPARC T1, OpenSPARC T2 and UltraSPARC are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon architecture developed by Sun Microsystems, Inc.
    [Show full text]
  • The Supersparc Microprocessor
    The SuperSPARC™ Microprocessor Technical White Paper 2550 Garcia Avenue Mountain View, CA 94043 U.S.A. © 1992 Sun Microsystems, Inc.—Printed in the United States of America. 2550 Garcia Avenue, Mountain View, California 94043-1100 U.S.A All rights reserved. This product and related documentation is protected by copyright and distributed under licenses restricting its use, copying, distribution and decompilation. No part of this product or related documentation may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Portions of this product may be derived from the UNIX® and Berkeley 4.3 BSD systems, licensed from UNIX Systems Laboratories, Inc. and the University of California, respectively. Third party font software in this product is protected by copyright and licensed from Sun’s Font Suppliers. RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 and FAR 52.227-19. The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications. TRADEMARKS Sun, Sun Microsystems, the Sun logo, are trademarks or registered trademarks of Sun Microsystems, Inc. UNIX and OPEN LOOK are registered trademarks of UNIX System Laboratories, Inc. All other product names mentioned herein are the trademarks of their respective owners. All SPARC trademarks, including the SCD Compliant Logo, are trademarks or registered trademarks of SPARC International, Inc. SPARCstation, SPARCserver, SPARCengine, SPARCworks, and SPARCompiler are licensed exclusively to Sun Microsystems, Inc.
    [Show full text]
  • Computer Architecture Research with RISC-‐V
    Computer Architecture Research with RISC-V Krste Asanovic UC Berkeley, RISC-V Foundaon, & SiFive Inc. [email protected] www.riscv.org CARRV, Boston, MA October 14, 2017 Only Two Big Mistakes Possible when Picking Research ISA § Design your own § Use someone else’s Promise of using commercially popular ISAs for research § Ported applicaons/workloads to study § Standard soRware stacks (compilers, OS) § Real commercial hardware to experiment with § Real commercial hardware to validate models with § ExisAng implementaons to study / modify § Industry is more interested in your results 3 Types of projects and standard ISAs used by me or my group in last 30 years § Experiments on real hardware plaorms: - Transputer arrays, SPARC workstaons, MIPS workstaons, POWER workstaons, ARMv7 handhelds, x86 desktops/ servers § Research chips built around modified MIPS ISA: - T0, IRAM, STC1, Scale, Maven § FPGA prototypes/simulaons using various ISAs: - RAMP Blue (modified Microblaze), RAMP Gold/ DIABLO (SPARC v8) § Experiments using soRware architectural simulators: - SimpleScalar (PISA), SMTsim (Alpha), Simics (SPARC,x86), Bochs (x86), MARSS (x86), Gem5(SPARC), PIN (Itanium, x86), … § And of course, other groups used some others too. RealiMes of using standard ISAs § Everything only works if you don’t change anything - Stock binary applicaons - Stock libraries - Stock compiler - Stock OS - Stock hardware implementaon § Add a new instrucAon, get a new non-standard ISA! - Need source code for the apps and recompile - Impossible for most real interesAng applicaons
    [Show full text]
  • MIPS IV Instruction Set
    MIPS IV Instruction Set Revision 3.2 September, 1995 Charles Price MIPS Technologies, Inc. All Right Reserved RESTRICTED RIGHTS LEGEND Use, duplication, or disclosure of the technical data contained in this document by the Government is subject to restrictions as set forth in subdivision (c) (1) (ii) of the Rights in Technical Data and Computer Software clause at DFARS 52.227-7013 and / or in similar or successor clauses in the FAR, or in the DOD or NASA FAR Supplement. Unpublished rights reserved under the Copyright Laws of the United States. Contractor / manufacturer is MIPS Technologies, Inc., 2011 N. Shoreline Blvd., Mountain View, CA 94039-7311. R2000, R3000, R6000, R4000, R4400, R4200, R8000, R4300 and R10000 are trademarks of MIPS Technologies, Inc. MIPS and R3000 are registered trademarks of MIPS Technologies, Inc. The information in this document is preliminary and subject to change without notice. MIPS Technologies, Inc. (MTI) reserves the right to change any portion of the product described herein to improve function or design. MTI does not assume liability arising out of the application or use of any product or circuit described herein. Information on MIPS products is available electronically: (a) Through the World Wide Web. Point your WWW client to: http://www.mips.com (b) Through ftp from the internet site “sgigate.sgi.com”. Login as “ftp” or “anonymous” and then cd to the directory “pub/doc”. (c) Through an automated FAX service: Inside the USA toll free: (800) 446-6477 (800-IGO-MIPS) Outside the USA: (415) 688-4321 (call from a FAX machine) MIPS Technologies, Inc.
    [Show full text]
  • MIPS Architecture • MIPS (Microprocessor Without Interlocked Pipeline Stages) • MIPS Computer Systems Inc
    Spring 2011 Prof. Hyesoon Kim MIPS Architecture • MIPS (Microprocessor without interlocked pipeline stages) • MIPS Computer Systems Inc. • Developed from Stanford • MIPS architecture usages • 1990’s – R2000, R3000, R4000, Motorola 68000 family • Playstation, Playstation 2, Sony PSP handheld, Nintendo 64 console • Android • Shift to SOC http://en.wikipedia.org/wiki/MIPS_architecture • MIPS R4000 CPU core • Floating point and vector floating point co-processors • 3D-CG extended instruction sets • Graphics – 3D curved surface and other 3D functionality – Hardware clipping, compressed texture handling • R4300 (embedded version) – Nintendo-64 http://www.digitaltrends.com/gaming/sony- announces-playstation-portable-specs/ Not Yet out • Google TV: an Android-based software service that lets users switch between their TV content and Web applications such as Netflix and Amazon Video on Demand • GoogleTV : search capabilities. • High stream data? • Internet accesses? • Multi-threading, SMP design • High graphics processors • Several CODEC – Hardware vs. Software • Displaying frame buffer e.g) 1080p resolution: 1920 (H) x 1080 (V) color depth: 4 bytes/pixel 4*1920*1080 ~= 8.3MB 8.3MB * 60Hz=498MB/sec • Started from 32-bit • Later 64-bit • microMIPS: 16-bit compression version (similar to ARM thumb) • SIMD additions-64 bit floating points • User Defined Instructions (UDIs) coprocessors • All self-modified code • Allow unaligned accesses http://www.spiritus-temporis.com/mips-architecture/ • 32 64-bit general purpose registers (GPRs) • A pair of special-purpose registers to hold the results of integer multiply, divide, and multiply-accumulate operations (HI and LO) – HI—Multiply and Divide register higher result – LO—Multiply and Divide register lower result • a special-purpose program counter (PC), • A MIPS64 processor always produces a 64-bit result • 32 floating point registers (FPRs).
    [Show full text]
  • Instruction Level Parallelism -- Hardware Speculation and VLIW (Static Superscalar)
    Lecture 17: Instruction Level Parallelism -- Hardware Speculation and VLIW (Static Superscalar) CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan [email protected] www.secs.oakland.edu/~yan Topics for Instruction Level Parallelism § ILP Introduction, Compiler Techniques and Branch Prediction – 3.1, 3.2, 3.3 § Dynamic Scheduling (OOO) – 3.4, 3.5 and C.5, C.6 and C.7 (FP pipeline and scoreboard) § Hardware Speculation and Static Superscalar/VLIW – 3.6, 3.7 § Dynamic Scheduling, Multiple Issue and Speculation – 3.8, 3.9 § ILP Limitations and SMT – 3.10, 3.11, 3.12 2 REVIEW 3 Not Every Stage Takes only one Cycle § FP EXE Stage – Multi-cycle Add/Mul – Nonpiplined for DIV § MEM Stage 4 Issues of Multi-Cycle in Some Stages § The divide unit is not fully pipelined – structural hazards can occur » need to be detected and stall incurred. § The instructions have varying running times – the number of register writes required in a cycle can be > 1 § Instructions no longer reach WB in order – Write after write (WAW) hazards are possible » Note that write after read (WAR) hazards are not possible, since the register reads always occur in ID. § Instructions can complete in a different order than they were issued (out-of-order complete) – causing problems with exceptions § Longer latency of operations – stalls for RAW hazards will be more frequent. 5 Hazards and Forwarding for Longer- Latency Pipeline § H 6 Problems Arising From Writes § If we issue one instruction per cycle, how can we avoid structural
    [Show full text]
  • · Welter ABACUS 3170 FLOATING·POINT COPROCESSOR for SPARC PRELIMINARY DATA May 1989
    · WElTER ABACUS 3170 FLOATING·POINT COPROCESSOR FOR SPARC PRELIMINARY DATA May 1989 The Abac:us 3170 Is a single-c:hip noating-point c:oproc:essor for the Fujitsu 5-20/5-25 and LSI Logic L64801 implementations of the SPARC architec:ture. It inc:orporates a noating-point datapath and a noating­ point c:ontroUer. The Abac:us 3170 provides direc:t interfac:e to the integer unit and memory. It is available in speed grades of 20 and 25 MHz. Related product The Abacus 3171 single­ chip Ooating-point coproc:eaor forCypress 7C601 implementation of SPARe architecture. ( Contents Features 1 Desc:ription System Considerations 8 Spec:ific:ations 10 Pin Configuration IS Physical Dimensions 16 ( Ordering Information 16 Sales Offices back cover USING THIS DATA SHEET In the writing of this data sheet, it was assumed that the user is familiar with the SPARC architecture, as well as the hardware details of its implementation. This data sheet does not cover details that are explained in the following and other related literature: The SPARe Architecture MtD'IUIll, by Sun Microsystems SPARe MB86901 (S-25) High PerjOl7l'lQTlCe 32-Bit RIse Processor, by Fujitsu Microelectronics, Inc. L64801 Hzgh PerjOl7l'lQTlCe Open Architecture RIse Mu:roprocessor, by LSI Logic Corporation. WEITEK Abacus 3170 Floating-Point Coprocessor for SPARC May, 1989 Copyright © WEITEK Corporation 1989 1060 East Arques Avenue Sunnyvale, California 94086 Telephone (408) 738-8400 All rights reserved . WEITEK is a registered trademark of WEITEK Corporation SPARC is a trademark of Sun Microsystems, Inc. WEITEK reserves the right to make changes to these specifications at any time Printed in the United States of America 9089 65432 1 DOC 89XX ("~.
    [Show full text]