Processor-103

Total Page:16

File Type:pdf, Size:1020Kb

Processor-103 US005634119A United States Patent 19 11 Patent Number: 5,634,119 Emma et al. 45 Date of Patent: May 27, 1997 54) COMPUTER PROCESSING UNIT 4,691.277 9/1987 Kronstadt et al. ................. 395/421.03 EMPLOYING ASEPARATE MDLLCODE 4,763,245 8/1988 Emma et al. ........................... 395/375 BRANCH HISTORY TABLE 4,901.233 2/1990 Liptay ...... ... 395/375 5,185,869 2/1993 Suzuki ................................... 395/375 75) Inventors: Philip G. Emma, Danbury, Conn.; 5,226,164 7/1993 Nadas et al. ............................ 395/800 Joshua W. Knight, III, Mohegan Lake, Primary Examiner-Kenneth S. Kim N.Y.; Thomas R. Puzak. Ridgefield, Attorney, Agent, or Firm-Jay P. Sbrollini Conn.; James R. Robinson, Essex Junction, Vt.; A. James Wan 57 ABSTRACT Norstrand, Jr., Round Rock, Tex. A computer processing system includes a first memory that 73 Assignee: International Business Machines stores instructions belonging to a first instruction set archi Corporation, Armonk, N.Y. tecture and a second memory that stores instructions belong ing to a second instruction set architecture. An instruction (21) Appl. No.: 369,441 buffer is coupled to the first and second memories, for storing instructions that are executed by a processor unit. 22 Fied: Jan. 6, 1995 The system operates in one of two modes. In a first mode, instructions are fetched from the first memory into the 51 Int, C. m. G06F 9/32 instruction buffer according to data stored in a first branch 52 U.S. Cl. ....................... 395/587; 395/376; 395/586; history table. In the second mode, instructions are fetched 395/595; 395/597 from the second memory into the instruction buffer accord 58 Field of Search ............................ 395/375, 421.03, ing to data stored in a second branch history table. 395/800, 376,586, 587, 595,597 The first instruction set architecture may be system level 56 References Cited instructions and the second instruction set architecture may U.S. PATENT DOCUMENTS be millicodeinstructions that, for example, define a complex system level instruction and/or emulate a third instruction 3,702,007 10/1972 Davis, II ................................. 395/375 set architecture. 4.338,661 7/1982 Tredennicket al. .................... 395/375 4,477,872 10/1984 Losq ........................................ 395/375 4,679,141 7/1987 Pomerene et al. ...................... 395/375 40 Claims, 8 Drawing Sheets MEMORY SYSTEM MILLICODE ARRAY INSTRUCTION 05 BUFFER processor-103 CONDITIONAL BRANCH BRANCH RESOLUTION STORAGE LOGIC U.S. Patent May 27, 1997 Sheet 1 of 8 5,634,119 MEMORY SYSTEM MILLICODE ARRAY INSTRUCTION 05 BUFFER PROCESSOR CONDITIONAL BRANCH BRANCH RESOLUTION STORAGE LOGIC U.S. Patent May 27, 1997 Sheet 2 of 8 5,634,119 FIG. 2 FROM INSTRUCTION O3 BUFFERS 05 --------------- MILLICODE DETECTION LOGIC TO MHT/BHT STATIC ENABLE/ CONDITION DATA DISABLE er's TO KEEE ------ TO BRANCH RESOLUTION 308 LOGIC 2. U.S. Patent May 27, 1997 Sheet 3 of 8 5,634,119 FIG. 3 200 INSTRUCTION SET ARCHITECTURE 208 202 HARDWARE 20 INSTRUCTIONS MILLICODE INSTRUCTIONS 204 214 U.S. Patent May 27, 1997 Sheet 4 of 8 5,634,119 FIG. 4 2 3 4 5 BRANCH TRAP GW INSTR INSTR BS, TARGET re- INSTR LOCATION ADDRESS "it OCATION centee BA v opera oedeo O2E7A BC o2E70 7E LA 182 LTR o2E180 184 BC v O2EDE O2EDE | T | | | | | 02EDo Ee ec voteos oeso cate08 NI o2200 20c w v 20 LH | | | | | 02E210 24 N 28 STH | | | | | ec 20 oeeeeo 22 w | | | | 22e TM 220 BC v 02:223 o2E234 LH | | | | | 02E230 U.S. Patent May 27, 1997 Sheet 5 of 8 5,634,119 FIG. 5 2 3 4 5 INSTR TAKEN BRANCH GW LOCATION eINSTR BRANCHESSTARGET LOCATIONINSTR OED20 c. v OEF7c OEE20 OEF7c B | | | | OEF170 180 x OEF180 182 u 184 to v OEF38c OEF38c OEF380 U.S. Patent 5,634,119 ?I,IX|081+00 M?NIHIIM SNdWH1 U.S. Patent May 27, 1997 Sheet 8 of 8 5,634,119 FIG. 8 26 SM MMMM FIG. 9 SC3 SC2 SC SCO 5,634,119 1. 2 COMPUTER PROCESSING UNIT termined size, typically from 1024 to 4096 entries. Entries EMPLOYNG ASEPARATE MILLCODE are made only for taken branches as they are encountered. BRANCH HISTORY TABLE When the table is full, adding a new entry requires displac ing an older entry. This may be accomplished by a Least BACKGROUND OF THE INVENTION 5 Recently Used (LRU) algorithm as in caches. 1. Technical Field In principle, each branch in the stream of instructions being executed is looked up in the table, by its address, and The invention relates to computer processor units and, if it is found, its target is fetched and becomes the next more particularly, to controlling the fetching of instructions instruction in the stream. If the branch is not in the table, it in computer processing units. 10 is presumed not taken. As the execution of the branch 2. Description of the Prior Art instructions proceeds, the table is updated accordingly. If a Many of today's computer processing units (CPUs) and branch predicted to be taken is not taken, the associated table microprocessors have a Complex Instruction Set Computer entry is deleted. If a branch predicted not to be taken is (CISC) Architecture. For example, the architectures of the taken, a new entry is made for it. If the predicted target IBM System 390 and the Intel X86 and Pentium micropro 15 address is wrong, the corrected address is entered. cessors are classified as CISC Architectures. The instruction However, using the system level branch history table for set of a CISC architecture system include both simple both the system level instructions and the millicode instruc instructions (e.g. Load, or Add) and complex instructions tions is inefficient and may not improve performance due to (e.g. Edit and Load, Start Interpretive Execution or a number of factors. One of these factors, for example, is Diagnose). 20 based on the observation that millicode instructions will not Conventionally, the complex functions of the CISC archi be executed very often. Thus, when millicode instructions tecture are implemented in microcode because building are encountered, most branch entries created in the system hardware execution units to execute the complex functions level branch history table for the millicode instructions will is expensive and error prone. Additionally, implementing have been overwritten. An additional factor is that utilizing complex instructions in microcode provides flexibility to fix 25 the branch history table for millicode branches also dis problems and expandability in that additional functions can places system level branchinstructions. Thus, in some cases, be included later. However, as the number of complex performance may actually be lost by using the conventional instructions implemented in microcode grows larger and system level branch history table to predictmillicode branch larger, the cycle time of the hardware execution units in Outcomes. executing the microcode instructions and the real estate 30 required for the microcode itself becomes a limiting factor SUMMARY OF THE INVENTION in the performance of the processor/CPU. The above-stated problems and related problems of the To solve this problem, designers have implemented mil prior art are solved with the principles of the present licode routines that execute such complex functions. As 35 invention, computer processor unit employing a millicode shown in U.S. Pat. No. 5.226,164 to Nadas et al., the branch history table. The computer processing unit of the millicode routines may be stored in a separate millicode present invention includes a first memory that stores instruc array. The millicode routines may include standard system tions belonging to a first set of instructions and a second architecture instructions and/or additional hardware instruc memory that stores instructions belonging to a second set of tions that are added to improve performance. The set of 40 instructions. An instruction buffer is coupled to the first and additional hardware instructions stored in the millicode second memories, for storing instructions that are executed array may be considered to be an alternate architecture that by a processor unit. The system operates in one of two the computer processing unit can operate. modes. In a first mode, instructions are fetched from the first Additionally, the millicode routines may be used to emu memory into the instruction buffer according to data stored late a second instruction set architecture. In this case, the 45 in a first table. In the second mode, instructions are fetched additional hardware instructions stored in the millicode from the second memory into the instruction buffer accord array emulate instructions from a second instruction set ing to data stored in a second table. architecture. The first and second set of instructions typically include While the above-described system provides increased at least one branch type instruction. In the illustrative flexibility and processing speed, it leaves a number of 50 embodiment, the first table is a branch history table associ problems unsolved. One problem relates to the manipulation ated with the first set of instructions and the second table is of millicode branch instructions. In many instances, it may a branch history table associated with the second set of be desirable to predict the outcome of the millicode branch instructions. The first set of instructions may be system level instructions such that the proper instruction (next sequential instructions and the second set of instructions may be or target) can be fetched before or at least as soon as it is 55 millicode instructions that for example, define a complex needed, so that no delay occurs in executing the millicode system level instruction and/or emulate a second instruction instruction stream. set architecture. An effective strategy for predicting the outcome of branch instructions at the system level is embodied in U.S.
Recommended publications
  • The 32-Bit PA-RISC Run-Time Architecture Document, V. 1.0 for HP
    The 32-bit PA-RISC Run-time Architecture Document HP-UX 11.0 Version 1.0 (c) Copyright 1997 HEWLETT-PACKARD COMPANY. The information contained in this document is subject to change without notice. HEWLETT-PACKARD MAKES NO WARRANTY OF ANY KIND WITH REGARD TO THIS MATERIAL, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Hewlett-Packard shall not be liable for errors contained herein or for incidental or consequential damages in connection with furnishing, performance, or use of this material. Hewlett-Packard assumes no responsibility for the use or reliability of its software on equipment that is not furnished by Hewlett-Packard. This document contains proprietary information which is protected by copyright. All rights are reserved. No part of this document may be photocopied, reproduced, or translated to another language without the prior written consent of Hewlett-Packard Company. CSO/STG/STD/CLO Hewlett-Packard Company 11000 Wolfe Road Cupertino, California 95014 By The Run-time Architecture Team 32-bit PA-RISC RUN-TIME ARCHITECTURE DOCUMENT 11.0 version 1.0 -1 -2 CHAPTER 1 Introduction 7 1.1 Target Audiences 7 1.2 Overview of the PA-RISC Runtime Architecture Document 8 CHAPTER 2 Common Coding Conventions 9 2.1 Memory Model 9 2.1.1 Text Segment 9 2.1.2 Initialized and Uninitialized Data Segments 9 2.1.3 Shared Memory 10 2.1.4 Subspaces 10 2.2 Register Usage 10 2.2.1 Data Pointer (GR 27) 10 2.2.2 Linkage Table Register (GR 19) 10 2.2.3 Stack Pointer (GR 30) 11 2.2.4 Space
    [Show full text]
  • Subchapter 2.4–Hp Server Rp5400 Series
    Chapter 2 hp server rp5400 series Subchapter 2.4—hp server rp5400 series hp server rp5470 Table 2.4.1 HP Server rp5470 Specifications Server model number rp5470 Max. Network Interface Cards (cont.)–see supported I/O table Server product number A6144B ATM 155 Mb/s–MMF 10 Number of Processors 1-4 ATM 155 Mb/s–UTP5 10 Supported Processors ATM 622 Mb/s–MMF 10 PA-RISC PA-8700 Processor @ 650 and 750 MHz 802.5 Token Ring 4/16/100 Mb/s 10 Cache–Instr/data per CPU (KB) 750/1500 Dual port X.25/SDLC/FR 10 Floating Point Coprocessor included Yes Quad port X.25/FR 7 FDDI 10 Max. Additional Interface Cards–see supported I/O table 8 port Terminal Multiplexer 4 64 port Terminal Multiplexer 10 PA-RISC PA-8600 Processor @ 550 MHz Graphics/USB kit 1 kit (2 cards) Cache–Instr/data/CPU (KB) 512/1024 Public Key Cryptography 10 Floating Point Coprocessor included Yes HyperFabric 7 Electrical Characteristics TPM estimate (4 CPUs) 34,500 AC Input power 100-240V 50/60 Hz SPECweb99 (4 CPUs) 3,750 Hotswap Power supplies 2 included, 3rd for N+1 Redundant AC power inputs 2 required, 3rd for N+1 Min. memory 256 MB Current requirements at 200V 6.5 A (shared across inputs) Max. memory capacity 16 GB Typical Power dissipation (watts) 1008 W Internal Disks Maximum Power dissipation (watts) 1 1360 W Max. disk mechanisms 4 Power factor at full load .98 Max. disk capacity 292 GB kW rating for UPS loading1 1.3 Standard Integrated I/O Maximum Heat dissipation (BTUs/hour) 1 4380 - (3000 typical) Ultra2 SCSI–LVD Yes Site Preparation 10/100Base-T (RJ-45 connector) Yes Site planning and installation included No RS-232 serial ports (multiplexed from DB-25 port) 3 Depth (mm/inches) 774 mm/30.5 Web Console (including 10Base-T port) Yes Width (mm/inches) 482 mm/19 I/O buses and slots Rack Height (EIA/mm/inches) 7 EIA/311/12.25 Total PCI Slots (supports 66/33 MHz×64/32 bits) 10 Deskside Height (mm/inches) 368 mm/14.5 2 Hot-Plug Twin-Turbo (500 MB/s) and 6 Hot-Plug Turbo slots (250 MB/s) Weight (kg/lbs) Max.
    [Show full text]
  • Analysis of GPGPU Programs for Data-Race and Barrier Divergence
    Analysis of GPGPU Programs for Data-race and Barrier Divergence Santonu Sarkar1, Prateek Kandelwal2, Soumyadip Bandyopadhyay3 and Holger Giese3 1ABB Corporate Research, India 2MathWorks, India 3Hasso Plattner Institute fur¨ Digital Engineering gGmbH, Germany Keywords: Verification, SMT Solver, CUDA, GPGPU, Data Races, Barrier Divergence. Abstract: Todays business and scientific applications have a high computing demand due to the increasing data size and the demand for responsiveness. Many such applications have a high degree of parallelism and GPGPUs emerge as a fit candidate for the demand. GPGPUs can offer an extremely high degree of data parallelism owing to its architecture that has many computing cores. However, unless the programs written to exploit the architecture are correct, the potential gain in performance cannot be achieved. In this paper, we focus on the two important properties of the programs written for GPGPUs, namely i) the data-race conditions and ii) the barrier divergence. We present a technique to identify the existence of these properties in a CUDA program using a static property verification method. The proposed approach can be utilized in tandem with normal application development process to help the programmer to remove the bugs that can have an impact on the performance and improve the safety of a CUDA program. 1 INTRODUCTION ans that the program will never end-up in an errone- ous state, or will never stop functioning in an arbitrary With the order of magnitude increase in computing manner, is a well-known and critical property that an demand in the business and scientific applications, de- operational system should exhibit (Lamport, 1977).
    [Show full text]
  • Evolving GPU Machine Code
    Journal of Machine Learning Research 16 (2015) 673-712 Submitted 11/12; Revised 7/14; Published 4/15 Evolving GPU Machine Code Cleomar Pereira da Silva [email protected] Department of Electrical Engineering Pontifical Catholic University of Rio de Janeiro (PUC-Rio) Rio de Janeiro, RJ 22451-900, Brazil Department of Education Development Federal Institute of Education, Science and Technology - Catarinense (IFC) Videira, SC 89560-000, Brazil Douglas Mota Dias [email protected] Department of Electrical Engineering Pontifical Catholic University of Rio de Janeiro (PUC-Rio) Rio de Janeiro, RJ 22451-900, Brazil Cristiana Bentes [email protected] Department of Systems Engineering State University of Rio de Janeiro (UERJ) Rio de Janeiro, RJ 20550-013, Brazil Marco Aur´elioCavalcanti Pacheco [email protected] Department of Electrical Engineering Pontifical Catholic University of Rio de Janeiro (PUC-Rio) Rio de Janeiro, RJ 22451-900, Brazil Leandro Fontoura Cupertino [email protected] Toulouse Institute of Computer Science Research (IRIT) University of Toulouse 118 Route de Narbonne F-31062 Toulouse Cedex 9, France Editor: Una-May O'Reilly Abstract Parallel Graphics Processing Unit (GPU) implementations of GP have appeared in the lit- erature using three main methodologies: (i) compilation, which generates the individuals in GPU code and requires compilation; (ii) pseudo-assembly, which generates the individuals in an intermediary assembly code and also requires compilation; and (iii) interpretation, which interprets the codes. This paper proposes a new methodology that uses the concepts of quantum computing and directly handles the GPU machine code instructions. Our methodology utilizes a probabilistic representation of an individual to improve the global search capability.
    [Show full text]
  • Readingsample
    Embedded Robotics Mobile Robot Design and Applications with Embedded Systems Bearbeitet von Thomas Bräunl Neuausgabe 2008. Taschenbuch. xiv, 546 S. Paperback ISBN 978 3 540 70533 8 Format (B x L): 17 x 24,4 cm Gewicht: 1940 g Weitere Fachgebiete > Technik > Elektronik > Robotik Zu Inhaltsverzeichnis schnell und portofrei erhältlich bei Die Online-Fachbuchhandlung beck-shop.de ist spezialisiert auf Fachbücher, insbesondere Recht, Steuern und Wirtschaft. Im Sortiment finden Sie alle Medien (Bücher, Zeitschriften, CDs, eBooks, etc.) aller Verlage. Ergänzt wird das Programm durch Services wie Neuerscheinungsdienst oder Zusammenstellungen von Büchern zu Sonderpreisen. Der Shop führt mehr als 8 Millionen Produkte. CENTRAL PROCESSING UNIT . he CPU (central processing unit) is the heart of every embedded system and every personal computer. It comprises the ALU (arithmetic logic unit), responsible for the number crunching, and the CU (control unit), responsible for instruction sequencing and branching. Modern microprocessors and microcontrollers provide on a single chip the CPU and a varying degree of additional components, such as counters, timing coprocessors, watchdogs, SRAM (static RAM), and Flash-ROM (electrically erasable ROM). Hardware can be described on several different levels, from low-level tran- sistor-level to high-level hardware description languages (HDLs). The so- called register-transfer level is somewhat in-between, describing CPU compo- nents and their interaction on a relatively high level. We will use this level in this chapter to introduce gradually more complex components, which we will then use to construct a complete CPU. With the simulation system Retro [Chansavat Bräunl 1999], [Bräunl 2000], we will be able to actually program, run, and test our CPUs.
    [Show full text]
  • MT-088: Analog Switches and Multiplexers Basics
    MT-088 TUTORIAL Analog Switches and Multiplexers Basics INTRODUCTION Solid-state analog switches and multiplexers have become an essential component in the design of electronic systems which require the ability to control and select a specified transmission path for an analog signal. These devices are used in a wide variety of applications including multi- channel data acquisition systems, process control, instrumentation, video systems, etc. Switches and multiplexers of the late 1960s were designed with discrete MOSFET devices and were manufactured in small PC boards or modules. With the development of CMOS processes (yielding good PMOS and NMOS transistors on the same substrate), switches and multiplexers rapidly gravitated to integrated circuit form in the mid-1970s, with product introductions such as the Analog Devices' popular AD7500-series (introduced in 1973). A dielectrically-isolated family of these parts introduced in 1976 allowed input overvoltages of ± 25 V (beyond the supply rails) and was insensitive to latch-up. These early CMOS switches and multiplexers were typically designed to handle signal levels up to ±10 V while operating on ±15-V supplies. In 1979, Analog Devices introduced the popular ADG200-series of switches and multiplexers, and in 1988 the ADG201-series was introduced which was fabricated on a proprietary linear-compatible CMOS process (LC2MOS). These devices allowed input signals to ±15 V when operating on ±15-V supplies. A large number of switches and multiplexers were introduced in the 1980s and 1990s, with the trend toward lower on-resistance, faster switching, lower supply voltages, lower cost, lower power, and smaller surface-mount packages. Today, analog switches and multiplexers are available in a wide variety of configurations, options, etc., to suit nearly all applications.
    [Show full text]
  • A Remotely Accessible Network Processor-Based Router for Network Experimentation
    A Remotely Accessible Network Processor-Based Router for Network Experimentation Charlie Wiseman, Jonathan Turner, Michela Becchi, Patrick Crowley, John DeHart, Mart Haitjema, Shakir James, Fred Kuhns, Jing Lu, Jyoti Parwatikar, Ritun Patney, Michael Wilson, Ken Wong, David Zar Department of Computer Science and Engineering Washington University in St. Louis fcgw1,jst,mbecchi,crowley,jdd,mah5,scj1,fredk,jl1,jp,ritun,mlw2,kenw,[email protected] ABSTRACT Keywords Over the last decade, programmable Network Processors Programmable routers, network testbeds, network proces- (NPs) have become widely used in Internet routers and other sors network components. NPs enable rapid development of com- plex packet processing functions as well as rapid response to 1. INTRODUCTION changing requirements. In the network research community, the use of NPs has been limited by the challenges associ- Multi-core Network Processors (NPs) have emerged as a ated with learning to program these devices and with using core technology for modern network components. This has them for substantial research projects. This paper reports been driven primarily by the industry's need for more flexi- on an extension to the Open Network Laboratory testbed ble implementation technologies that are capable of support- that seeks to reduce these \barriers to entry" by providing ing complex packet processing functions such as packet clas- a complete and highly configurable NP-based router that sification, deep packet inspection, virtual networking and users can access remotely and use for network experiments. traffic engineering. Equally important is the need for sys- The base router includes support for IP route lookup and tems that can be extended during their deployment cycle to general packet filtering, as well as a flexible queueing sub- meet new service requirements.
    [Show full text]
  • The 32-Bit PA-RISC Run- Time Architecture Document
    The 32-bit PA-RISC Run- time Architecture Document HP-UX 10.20 Version 3.0 (c) Copyright 1985-1997 HEWLETT-PACKARD COMPANY. The information contained in this document is subject to change without notice. HEWLETT-PACKARD MAKES NO WARRANTY OF ANY KIND WITH REGARD TO THIS MATERIAL, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OFMERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Hewlett-Packard shall not be liable for errors contained herein or for incidental or consequential damages in connection with furnishing, performance, or use of this material. Hewlett-Packard assumes no responsibility for the use or reliability of its software on equipment that is not furnished by Hewlett-Packard. This document contains proprietary information which is protected by copyright. All rights are reserved. No part of this document may be photocopied, reproduced, or translated to another language without the prior written consent of Hewlett- Packard Company. CSO/STG/STD/CLO Hewlett-Packard Company 11000 Wolfe Road Cupertino, California 95014 By The Run-time Architecture Team CHAPTER 1 Introduction This document describes the runtime architecture for PA-RISC systems running either the HP-UX or the MPE/iX operating system. Other operating systems running on PA-RISC may also use this runtime architecture or a variant of it. The runtime architecture defines all the conventions and formats necessary to compile, link, and execute a program on one of these operating systems. Its purpose is to ensure that object modules produced by many different compilers can be linked together into a single application, and to specify the interfaces between compilers and linker, and between linker and operating system.
    [Show full text]
  • Design of the RISC-V Instruction Set Architecture
    Design of the RISC-V Instruction Set Architecture Andrew Waterman Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2016-1 http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-1.html January 3, 2016 Copyright © 2016, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor David Patterson, Chair Professor Krste Asanovi´c Associate Professor Per-Olof Persson Spring 2016 Design of the RISC-V Instruction Set Architecture Copyright 2016 by Andrew Shell Waterman 1 Abstract Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman Doctor of Philosophy in Computer Science University of California, Berkeley Professor David Patterson, Chair The hardware-software interface, embodied in the instruction set architecture (ISA), is arguably the most important interface in a computer system. Yet, in contrast to nearly all other interfaces in a modern computer system, all commercially popular ISAs are proprietary.
    [Show full text]
  • A Lisp Oriented Architecture by John W.F
    A Lisp Oriented Architecture by John W.F. McClain Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degrees of Master of Science in Electrical Engineering and Computer Science and Bachelor of Science in Electrical Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY September 1994 © John W.F. McClain, 1994 The author hereby grants to MIT permission to reproduce and to distribute copies of this thesis document in whole or in part. Signature of Author ...... ;......................... .............. Department of Electrical Engineering and Computer Science August 5th, 1994 Certified by....... ......... ... ...... Th nas F. Knight Jr. Principal Research Scientist 1,,IA £ . Thesis Supervisor Accepted by ....................... 3Frederic R. Morgenthaler Chairman, Depattee, on Graduate Students J 'FROM e ;; "N MfLIT oARIES ..- A Lisp Oriented Architecture by John W.F. McClain Submitted to the Department of Electrical Engineering and Computer Science on August 5th, 1994, in partial fulfillment of the requirements for the degrees of Master of Science in Electrical Engineering and Computer Science and Bachelor of Science in Electrical Engineering Abstract In this thesis I describe LOOP, a new architecture for the efficient execution of pro- grams written in Lisp like languages. LOOP allows Lisp programs to run at high speed without sacrificing safety or ease of programming. LOOP is a 64 bit, long in- struction word architecture with support for generic arithmetic, 64 bit tagged IEEE floats, low cost fine grained read and write barriers, and fast traps. I make estimates for how much these Lisp specific features cost and how much they may speed up the execution of programs written in Lisp.
    [Show full text]
  • Abatement of Area and Power in Multiplexer Based on Cordic Using
    INTERNATIONAL JOURNAL OF SCIENTIFIC PROGRESS AND RESEARCH (IJSPR) ISSN: 2349-4689 Volume-10, Number - 02, 2015 Abatement of Area and Power In Multiplexer Based on Cordic Using Cadence Tool Uma P.1, AnuVidhya G.2 VLSI Design, Karpaga Vinayaga College of Engineering and Technolog, G.S.T Road,Chinna Kolambakkam-603308 Abstract – CORDIC is an iterative Algorithm to perform a wide The CORDIC Algorithm is a unified computational scheme range of functions including vector rotations, certain to perform trigonometric, hyperbolic, linear and logarithmic functions. Both non pipelined and 2 level pipelined CORDIC with 8 stages, using 1. Computations of the trigonometric functions: sin, two schemes was performed. First scheme was original unrolled cos and arctan. CORDIC and second scheme was MUX based pipelined unrolled CORDIC. Compared to first scheme, the second scheme is more 2. Computations of the hyperbolic trigonometric reliable, since the second scheme uses multiplexer and registers. functions: sinh, cosh and arctanh. By adding multiplexer the area is reduced comparatively to the 3. It also compute the exponential function, the first architecture, since the first scheme uses only addition, natural logarithm and the square root. subtraction and shifting operation in all the 8 stages.8 iterations 4. Multiplication and division. are performed and it is implemented on QUARTUS II software. For future work, the number of iterations can be increased and CORDIC revolves around the idea of "rotating" the phase of also increase the bit size. This can be implemented in (digital) a complex number, by multiplying it by a succession of CADENCE software. constant values. Keywords: CORDIC, rotation mode, multiplexer, pipelining, QUARTUS.
    [Show full text]
  • Graphical Microcode Simulator with a Reconfigurable Datapath
    Rochester Institute of Technology RIT Scholar Works Theses 12-11-2006 Graphical microcode simulator with a reconfigurable datapath Brian VanBuren Follow this and additional works at: https://scholarworks.rit.edu/theses Recommended Citation VanBuren, Brian, "Graphical microcode simulator with a reconfigurable datapath" (2006). Thesis. Rochester Institute of Technology. Accessed from This Thesis is brought to you for free and open access by RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact [email protected]. Graphical Microcode Simulator with a Reconfigurable Datapath by Brian G VanBuren A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Engineering Supervised by Associate Professor Dr. Muhammad Shaaban Department of Computer Engineering Kate Gleason College of Engineering Rochester Institute of Technology Rochester, New York August 2006 Approved By: Dr. Muhammad Shaaban Associate Professor Primary Adviser Dr. Roy Czernikowski Professor, Department of Computer Engineering Dr. Roy Melton Visiting Assistant Professor, Department of Computer Engineering Thesis Release Permission Form Rochester Institute of Technology Kate Gleason College of Engineering Title: Graphical Microcode Simulator with a Reconfigurable Datapath I, Brian G VanBuren, hereby grant permission to the Wallace Memorial Library repor- duce my thesis in whole or part. Brian G VanBuren Date Dedication To my son. iii Acknowledgments I would like to thank Dr. Shaaban for all his input and desire to have an update microcode simulator. I would like to thank Dr. Czernikowski for his support and methodical approach to everything. I would like to thank Dr.
    [Show full text]