The 32-Bit PA-RISC Run- Time Architecture Document
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The 32-Bit PA-RISC Run-Time Architecture Document, V. 1.0 for HP
The 32-bit PA-RISC Run-time Architecture Document HP-UX 11.0 Version 1.0 (c) Copyright 1997 HEWLETT-PACKARD COMPANY. The information contained in this document is subject to change without notice. HEWLETT-PACKARD MAKES NO WARRANTY OF ANY KIND WITH REGARD TO THIS MATERIAL, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Hewlett-Packard shall not be liable for errors contained herein or for incidental or consequential damages in connection with furnishing, performance, or use of this material. Hewlett-Packard assumes no responsibility for the use or reliability of its software on equipment that is not furnished by Hewlett-Packard. This document contains proprietary information which is protected by copyright. All rights are reserved. No part of this document may be photocopied, reproduced, or translated to another language without the prior written consent of Hewlett-Packard Company. CSO/STG/STD/CLO Hewlett-Packard Company 11000 Wolfe Road Cupertino, California 95014 By The Run-time Architecture Team 32-bit PA-RISC RUN-TIME ARCHITECTURE DOCUMENT 11.0 version 1.0 -1 -2 CHAPTER 1 Introduction 7 1.1 Target Audiences 7 1.2 Overview of the PA-RISC Runtime Architecture Document 8 CHAPTER 2 Common Coding Conventions 9 2.1 Memory Model 9 2.1.1 Text Segment 9 2.1.2 Initialized and Uninitialized Data Segments 9 2.1.3 Shared Memory 10 2.1.4 Subspaces 10 2.2 Register Usage 10 2.2.1 Data Pointer (GR 27) 10 2.2.2 Linkage Table Register (GR 19) 10 2.2.3 Stack Pointer (GR 30) 11 2.2.4 Space -
MIPS Architecture • MIPS (Microprocessor Without Interlocked Pipeline Stages) • MIPS Computer Systems Inc
Spring 2011 Prof. Hyesoon Kim MIPS Architecture • MIPS (Microprocessor without interlocked pipeline stages) • MIPS Computer Systems Inc. • Developed from Stanford • MIPS architecture usages • 1990’s – R2000, R3000, R4000, Motorola 68000 family • Playstation, Playstation 2, Sony PSP handheld, Nintendo 64 console • Android • Shift to SOC http://en.wikipedia.org/wiki/MIPS_architecture • MIPS R4000 CPU core • Floating point and vector floating point co-processors • 3D-CG extended instruction sets • Graphics – 3D curved surface and other 3D functionality – Hardware clipping, compressed texture handling • R4300 (embedded version) – Nintendo-64 http://www.digitaltrends.com/gaming/sony- announces-playstation-portable-specs/ Not Yet out • Google TV: an Android-based software service that lets users switch between their TV content and Web applications such as Netflix and Amazon Video on Demand • GoogleTV : search capabilities. • High stream data? • Internet accesses? • Multi-threading, SMP design • High graphics processors • Several CODEC – Hardware vs. Software • Displaying frame buffer e.g) 1080p resolution: 1920 (H) x 1080 (V) color depth: 4 bytes/pixel 4*1920*1080 ~= 8.3MB 8.3MB * 60Hz=498MB/sec • Started from 32-bit • Later 64-bit • microMIPS: 16-bit compression version (similar to ARM thumb) • SIMD additions-64 bit floating points • User Defined Instructions (UDIs) coprocessors • All self-modified code • Allow unaligned accesses http://www.spiritus-temporis.com/mips-architecture/ • 32 64-bit general purpose registers (GPRs) • A pair of special-purpose registers to hold the results of integer multiply, divide, and multiply-accumulate operations (HI and LO) – HI—Multiply and Divide register higher result – LO—Multiply and Divide register lower result • a special-purpose program counter (PC), • A MIPS64 processor always produces a 64-bit result • 32 floating point registers (FPRs). -
Superh RISC Engine SH-1/SH-2
SuperH RISC Engine SH-1/SH-2 Programming Manual September 3, 1996 Hitachi America Ltd. Notice When using this document, keep the following in mind: 1. This document may, wholly or partially, be subject to change without notice. 2. All rights are reserved: No one is permitted to reproduce or duplicate, in any form, the whole or part of this document without Hitachi’s permission. 3. Hitachi will not be held responsible for any damage to the user that may result from accidents or any other reasons during operation of the user’s unit according to this document. 4. Circuitry and other examples described herein are meant merely to indicate the characteristics and performance of Hitachi’s semiconductor products. Hitachi assumes no responsibility for any intellectual property claims or other problems that may result from applications based on the examples described herein. 5. No license is granted by implication or otherwise under any patents or other rights of any third party or Hitachi, Ltd. 6. MEDICAL APPLICATIONS: Hitachi’s products are not authorized for use in MEDICAL APPLICATIONS without the written consent of the appropriate officer of Hitachi’s sales company. Such use includes, but is not limited to, use in life support systems. Buyers of Hitachi’s products are requested to notify the relevant Hitachi sales offices when planning to use the products in MEDICAL APPLICATIONS. Introduction The SuperH RISC engine family incorporates a RISC (Reduced Instruction Set Computer) type CPU. A basic instruction can be executed in one clock cycle, realizing high performance operation. A built-in multiplier can execute multiplication and addition as quickly as DSP. -
Processor-103
US005634119A United States Patent 19 11 Patent Number: 5,634,119 Emma et al. 45 Date of Patent: May 27, 1997 54) COMPUTER PROCESSING UNIT 4,691.277 9/1987 Kronstadt et al. ................. 395/421.03 EMPLOYING ASEPARATE MDLLCODE 4,763,245 8/1988 Emma et al. ........................... 395/375 BRANCH HISTORY TABLE 4,901.233 2/1990 Liptay ...... ... 395/375 5,185,869 2/1993 Suzuki ................................... 395/375 75) Inventors: Philip G. Emma, Danbury, Conn.; 5,226,164 7/1993 Nadas et al. ............................ 395/800 Joshua W. Knight, III, Mohegan Lake, Primary Examiner-Kenneth S. Kim N.Y.; Thomas R. Puzak. Ridgefield, Attorney, Agent, or Firm-Jay P. Sbrollini Conn.; James R. Robinson, Essex Junction, Vt.; A. James Wan 57 ABSTRACT Norstrand, Jr., Round Rock, Tex. A computer processing system includes a first memory that 73 Assignee: International Business Machines stores instructions belonging to a first instruction set archi Corporation, Armonk, N.Y. tecture and a second memory that stores instructions belong ing to a second instruction set architecture. An instruction (21) Appl. No.: 369,441 buffer is coupled to the first and second memories, for storing instructions that are executed by a processor unit. 22 Fied: Jan. 6, 1995 The system operates in one of two modes. In a first mode, instructions are fetched from the first memory into the 51 Int, C. m. G06F 9/32 instruction buffer according to data stored in a first branch 52 U.S. Cl. ....................... 395/587; 395/376; 395/586; history table. In the second mode, instructions are fetched 395/595; 395/597 from the second memory into the instruction buffer accord 58 Field of Search ........................... -
Design of the RISC-V Instruction Set Architecture
Design of the RISC-V Instruction Set Architecture Andrew Waterman Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2016-1 http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-1.html January 3, 2016 Copyright © 2016, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor David Patterson, Chair Professor Krste Asanovi´c Associate Professor Per-Olof Persson Spring 2016 Design of the RISC-V Instruction Set Architecture Copyright 2016 by Andrew Shell Waterman 1 Abstract Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman Doctor of Philosophy in Computer Science University of California, Berkeley Professor David Patterson, Chair The hardware-software interface, embodied in the instruction set architecture (ISA), is arguably the most important interface in a computer system. Yet, in contrast to nearly all other interfaces in a modern computer system, all commercially popular ISAs are proprietary. -
A Lisp Oriented Architecture by John W.F
A Lisp Oriented Architecture by John W.F. McClain Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degrees of Master of Science in Electrical Engineering and Computer Science and Bachelor of Science in Electrical Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY September 1994 © John W.F. McClain, 1994 The author hereby grants to MIT permission to reproduce and to distribute copies of this thesis document in whole or in part. Signature of Author ...... ;......................... .............. Department of Electrical Engineering and Computer Science August 5th, 1994 Certified by....... ......... ... ...... Th nas F. Knight Jr. Principal Research Scientist 1,,IA £ . Thesis Supervisor Accepted by ....................... 3Frederic R. Morgenthaler Chairman, Depattee, on Graduate Students J 'FROM e ;; "N MfLIT oARIES ..- A Lisp Oriented Architecture by John W.F. McClain Submitted to the Department of Electrical Engineering and Computer Science on August 5th, 1994, in partial fulfillment of the requirements for the degrees of Master of Science in Electrical Engineering and Computer Science and Bachelor of Science in Electrical Engineering Abstract In this thesis I describe LOOP, a new architecture for the efficient execution of pro- grams written in Lisp like languages. LOOP allows Lisp programs to run at high speed without sacrificing safety or ease of programming. LOOP is a 64 bit, long in- struction word architecture with support for generic arithmetic, 64 bit tagged IEEE floats, low cost fine grained read and write barriers, and fast traps. I make estimates for how much these Lisp specific features cost and how much they may speed up the execution of programs written in Lisp. -
PA-RISC 1.1 Architecture and Instruction Set Reference Manual
PA-RISC 1.1 Architecture and Instruction Set Reference Manual HP Part Number: 09740-90039 Printed in U.S.A. February 1994 Third Edition Notice The information contained in this document is subject to change without notice. HEWLETT-PACKARD MAKES NO WARRANTY OF ANY KIND WITH REGARD TO THIS MATERIAL, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Hewlett-Packard shall not be liable for errors contained herein or for incidental or consequential damages in connection with furnishing, performance, or use of this material. Hewlett-Packard assumes no responsibility for the use or reliability of its software on equipment that is not furnished by Hewlett-Packard. This document contains proprietary information which is protected by copyright. All rights are reserved. No part of this document may be photocopied, reproduced, or translated to another language without the prior written consent of Hewlett-Packard Company. Copyright © 1986 – 1994 by HEWLETT-PACKARD COMPANY Printing History The printing date will change when a new edition is printed. The manual part number will change when extensive changes are made. First Edition . November 1990 Second Edition. September 1992 Third Edition . February 1994 Contents Contents . iii Preface. ix 1 Overview . 1-1 Introduction. 1-1 System Features . 1-2 PA-RISC 1.1 Enhancements . 1-2 System Organization . 1-4 2 System Organization . 2-1 Introduction. 2-1 Memory and I/O Addressing . 2-2 Byte Ordering (Big Endian/Little Endian) . 2-3 Levels of PA-RISC. 2-5 Data Types . 2-5 Processing Resources. 2-7 3 Addressing and Access Control. -
PIPELINING Basics
PIPELINING basics • A pipelined architecture for MIPS • Hurdles in pipelining • Simple solutions to pipelining hurdles • Advanced pipelining • Conclusive remarks 1 MIPS pipelined architecture • MIPS simplified architecture can be realized by having each instruction execute in a single clock cycle, approximately as long as the 5 clocks required to complete the 5 phases. Why would this approach be unconvenient? • We already know one reason: the longer cycle would waste time in all instructions that take less to execute (fewer than 5 clocks). 2 MIPS pipelined architecture • There is another relevant reason: • By breaking down into more phases (clock cycles) the execution of each instruction, it is possible to (partially) overlap the execution of more instructions. • At eack clock cycle, while a section of the datapath takes care of an instruction, another section can be used to execute another instruction. 3 MIPS pipelined architecture • If we start a new instruction at each new clock cycle, each of the 5 phases of the multi-cycle MIPS architecture becomes a stage in the pipeline, and the pattern of execution of a sequence of instructions looks like this (Hennessy-Patterson, Fig. A.1): instr. 1 2 3 4 5 6 7 8 9 number instr. i IF ID EX MEM WB instr. i+1 IF ID EX MEM WB instr. i+2 IF ID EX MEM WB instr. i+3 IF ID EX MEM WB instr. i+4 IF ID EX MEM WB 4 MIPS pipelined architecture • Pipelining only works is one does not attempt to execute at the same time two different operations that use the same datapath resource: – as an instance, if the datapath has a single ALU, this cannot compute concurrently the effective address of a load and the subtraction of the operands in another instruction • Using reduced (simple) instructions (namely RISC) makes it fairly easy to determine at each time which datapath resources are free and which are busy. -
Brief History of Microprogramming
Microprogramming History -- Mark Smotherman A Brief History of Microprogramming Mark Smotherman Last updated: October 2012 Summary: Microprogramming is a technique to implement the control logic necessary to execute instructions within a processor. It relies on fetching low-level microinstructions from a control store and deriving the appropriate control signals as well as microprogram sequencing information from each microinstruction. Definitions and Example Although loose usage has sometimes equated the term "microprogramming" with the idea of "programming a microcomputer", this is not the standard definition. Rather, microprogramming is a systematic technique for implementing the control logic of a computer's central processing unit. It is a form of stored-program logic that substitutes for hardwired control circuitry. The central processing unit in a computer system is composed of a data path and a control unit. The data path includes registers, function units such as shifters and ALUs (arithmetic and logic units), internal processor busses and paths, and interface units for main memory and I/O busses. The control unit governs the series of steps taken by the data path during the execution of a user- visible instruction, or macroinstruction (e.g., load, add, store). Each action of the datapath is called a register transfer and involves the transfer of information within the data path, possibly including the transformation of data, address, or instruction bits by the function units. A register transfer is accomplished by gating out (sending) register contents onto internal processor busses, selecting the operation of ALUs, shifters, etc., through which that information might pass, and gating in (receiving) new values for one or more registers. -
Digital Design & Computer Arch. Lecture 17: Branch Prediction II
Digital Design & Computer Arch. Lecture 17: Branch Prediction II Prof. Onur Mutlu ETH Zürich Spring 2020 24 April 2020 Required Readings n This week q Smith and Sohi, “The Microarchitecture of Superscalar Processors,” Proceedings of the IEEE, 1995 q H&H Chapters 7.8 and 7.9 q McFarling, “Combining Branch Predictors,” DEC WRL Technical Report, 1993. 2 Recall: How to Handle Control Dependences n Critical to keep the pipeline full with correct sequence of dynamic instructions. n Potential solutions if the instruction is a control-flow instruction: n Stall the pipeline until we know the next fetch address n Guess the next fetch address (branch prediction) n Employ delayed branching (branch delay slot) n Do something else (fine-grained multithreading) n Eliminate control-flow instructions (predicated execution) n Fetch from both possible paths (if you know the addresses of both possible paths) (multipath execution) 3 Recall: Fetch Stage with BTB and Direction Prediction Direction predictor (taken?) taken? PC + inst size Next Fetch Address Program hit? Counter Address of the current branch target address Cache of Target Addresses (BTB: Branch Target Buffer) 4 Recall: More Sophisticated Direction Prediction n Compile time (static) q Always not taken q Always taken q BTFN (Backward taken, forward not taken) q Profile based (likely direction) q Program analysis based (likely direction) n Run time (dynamic) q Last time prediction (single-bit) q Two-bit counter based prediction q Two-level prediction (global vs. local) q Hybrid q Advanced algorithms -
POWER4 System Microarchitecture
by J. M. Tendler J. S. Dodson POWER4 J. S. Fields, Jr. H. Le system B. Sinharoy microarchitecture The IBM POWER4 is a new microprocessor commonplace. The RS64 and its follow-on organized in a system structure that includes microprocessors, the RS64-II [6], RS64-III [7], and RS64- new technology to form systems. The name IV [8], were optimized for commercial applications. The POWER4 as used in this context refers not RS64 initially appeared in systems operating at 125 MHz. only to a chip, but also to the structure used Most recently, the RS64-IV has been shipping in systems to interconnect chips to form systems. operating at up to 750 MHz. The POWER3 and its follow- In this paper we describe the processor on, the POWER3-II [9], were optimized for technical microarchitecture as well as the interconnection applications. Initially introduced at 200 MHz, most recent architecture employed to form systems up to systems using the POWER3-II have been operating at a 32-way symmetric multiprocessor. 450 MHz. The RS64 microprocessor and its follow-ons were also used in AS/400* systems (the predecessor Introduction to today’s eServer iSeries*). IBM announced the RISC System/6000* (RS/6000*, the POWER4 was designed to address both commercial and predecessor of today’s IBM eServer pSeries*) family of technical requirements. It implements and extends in a processors in 1990. The initial models ranged in frequency compatible manner the 64-bit PowerPC Architecture [10]. from 20 MHz to 30 MHz [1] and achieved performance First used in pSeries systems, it will be staged into the levels exceeding those of many of their contemporaries iSeries at a later date. -
Pipelining: Basic/ Intermediate Concepts and Implementation
Lecture 05: Pipelining: Basic/ Intermediate Concepts and Implementation CSE 564 Computer Architecture Summer 2017 Department of Computer Science and Engineering Yonghong Yan [email protected] www.secs.oakland.edu/~yan 1 Contents 1. Pipelining Introduction 2. The Major Hurdle of Pipelining—Pipeline Hazards 3. RISC-V ISA and its Implementation Reading: u Textbook: Appendix C u RISC-V ISA u Chisel Tutorial Pipelining: Its Natural! © Laundry Example © Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, and fold A B C D u Washer takes 30 minutes u Dryer takes 40 minutes u “Folder” takes 20 minutes © One load: 90 minutes Sequential Laundry 6 PM 7 8 9 10 11 Midnight Time 30 40 20 30 40 20 30 40 20 30 40 20 T a s A k O B r d e r C D © Sequential laundry takes 6 hours for 4 loads © If they learned pipelining, how long would laundry take? Pipelined Laundry Start Work ASAP 6 PM 7 8 9 10 11 Midnight Time 30 40 40 40 40 20 T a Sequential laundry takes 6 hours for 4 loads s A k O r B d e r C D © Pipelined laundry takes 3.5 hours for 4 loads The Basics of a RISC Instruction Set (1/2) © RISC (Reduced Instruction Set Computer) architecture or load-store architecture: u All operations on data apply to data in register and typically change the entire register (32 or 64 bits per register). u The only operations that affect memory are load and store operation.