Microarchitecture Choices (Implementation of the VAX)

Yale N. Patt
Electrical Engineering and Computer Science
University of Michigan, Ann Arbor 48109-2122

ABSTRACT

The VAX Architecture provides hardware implementors with an opportunity or a nightmare, depending on your point of view. Such characteristics as 304 opcodes, a large number of addressing modes, a large number of supported data types, and non-regularities in the ISA semantics all provide challenges to the microarchitect. The VAX architecture was introduced in 1977 with its first microarchitecture, the VAX 11/780, a TTL MSI implementation. Since then, there have been several distinct implementations, each reflecting (1) the technology in which it was implemented, (2) the performance/cost tradeoffs it was supposed to consider, and (3) the design methodology of its implementors. This paper is a first attempt at discussing several VAX implementations from the standpoint of the choices made in the microarchitecture as driven by the context of the device technology, the performance/cost tradeoffs, and other considerations.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1990 ACM 089791-3469/90/0002/0213 $1.50

1. Introduction

The VAX Architecture provides hardware implementors with an opportunity or a nightmare, depending on your point of view. Such characteristics as 304 opcodes, a large number of addressing modes, a large number of supported data types, and non-regularities in the ISA semantics all provide challenges to the microarchitect.

The VAX architecture was introduced in 1977 with its first microarchitecture, the VAX 11/780, a TTL MSI implementation. Several features of the 780 clearly come from the fact that (1) it was the first implementation, (2) it was made out of TTL MSI parts, and (3) it was intended to be fast. Since then, there have been several distinct implementations, each reflecting (1) the technology in which it was implemented, (2) the performance/cost tradeoffs it was supposed to consider, and (3) the design methodology of its implementers.

This paper is a first attempt at discussing several VAX implementations from the standpoint of the choices made in the microarchitecture as driven by the context of the device technology, the performance/cost tradeoffs, and other relevant considerations.

The paper is organized in five sections. Section 2 describes the VAX architecture and several common aspects of its implementations. With respect to the architecture, we focus on the "stuff" that every VAX implementation must deal with. Section 3 discusses the implementations of the early machines, the 11/780 and the 11/750. Section 4 discusses the more recent higher performance ECL versions, the 8800 and the 8600. Section 5 offers some concluding remarks.

2. Architecture and Implementation

2.1. The Architecture

The VAX architecture was introduced in 1977 by Digital Equipment Corporation. It clearly had as design priorities extension of the address space to 32 bits, dense coding of the instruction stream, user-friendliness with respect to software developers, support for general purpose data processing, and support for multiprogramming and multiprocessing.

These design objectives resulted in the specification of 244 (now, 304) opcodes, nine (or 13, or 21, or over 40 depending on how one counts) different, and for the most part orthogonal, addressing modes, and more than a dozen different data types, including the doubly-linked list [1]. The semantics of instructions specified that they be variable-length and byte-aligned, each containing the number of operands appropriate to that opcode. Information regarding the number of operands and the access type and data type of each is part of the semantics of the opcode. Addressing modes, which are specified by a variable number of bytes, are for the most part independent of the opcode, introducing further variability in the length of the VAX instruction.

The architecture supports a virtual address space of 2**32 bytes, with mapping indirectly through two page tables. Unaligned memory accesses, which were forbidden on the earlier PDP 11, were allowed on the VAX. Other support for software development included protection against ill-advised uses of operands and addressing modes.

There are several examples of code density in the instruction set architecture. Many opcodes specify the execution of multiple operations, both in the instruction's address evaluation phase and in its execution phase. Many data types have optional storage requirements; for example, integers can be eight, 16, 32, etc. bits long, numeric character strings can be coded in ASCII or in packed decimal, etc. Displacement addressing modes can have byte, word, or longword offsets. Immediate operands can be specified in the straightforward way introduced in the PDP 11, or using fewer bits, as short literals.

Support for general purpose data processing included the FPD bit, a substantial number of instructions specifically targeted to handle the needs of both the COBOL and FORTRAN compilers, and constructs to support both multiprogramming and multiprocessing. The FPD bit allows long iterative commercial instructions to coexist with a short interrupt latency figure of merit, which is required by real-time applications. Support for multiprogramming and multiprocessing includes a number of constructs, among them four levels of privilege, five levels of priority, and the LDPCTX, SVPCTX, PROBE, CHMx, and locked instructions.

2.2. Implementations

The richness of the instruction set architecture, as described above, has provided, as stated in the introduction, an opportunity or a nightmare for hardware developers, depending on your point of view. An architecture, such as the VAX, admits many microarchitectures, depending on the technology available, the cost/performance formula that one is operating under, and the execution model that one adopts.

It is also the case that, given the set of execution models adopted in the past, certain of the characteristics of the VAX almost demand the inclusion of the one appropriate mechanism for handling that characteristic.

All VAXes are microcoded. The richness of the instruction set urges that the flexibility of microcoded control be employed, notwithstanding the conventional mythology that hardwired control is somehow faster than microcode. It is instructive to point out that (1) hardwired control produces higher performance execution only in situations where the critical path is in the microsequencing function, and (2) that this should not occur in VAX implementations if one designs with the well-understood (to microarchitects) technique that the next control store address must be obtained from information available at the start of the current microcycle. A variation of this basic old technique is the recently popularized delayed branch present in many ISA architectures introduced in the last few years.

The orthogonality of the instruction set, resulting in variable length instructions (from one to more than 50 bytes), the semantics of the autoincrement and autodecrement addressing modes, and the use of memory operands in the execution phase of the instruction all contribute to the need for a back-up mechanism in the microarchitecture. The result is a back-up PC and an RLOG stack for undoing autoincrements and autodecrements performed during address evaluation.
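The back-up mechanism just described can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the microcode or hardware of the 780 or of any other VAX: the class name MicroState, its method names, and the (register, delta) format of the log entries are all hypothetical, and the PC is kept in its own field for clarity. The sketch shows only the idea of saving the PC at the start of an instruction, logging each autoincrement or autodecrement, and unwinding the log in reverse order if the instruction must be restarted (for example, after a fault during operand evaluation).

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit values


@dataclass
class MicroState:
    """Toy model of the state touched during operand address evaluation."""
    regs: list = field(default_factory=lambda: [0] * 16)  # general registers
    pc: int = 0          # program counter, tracked separately in this sketch
    backup_pc: int = 0   # PC value saved at the start of the instruction
    rlog: list = field(default_factory=list)  # (register number, signed delta)

    def begin_instruction(self):
        """Save the PC and start with an empty log for the new instruction."""
        self.backup_pc = self.pc
        self.rlog.clear()

    def autoincrement(self, rn, size):
        """(Rn)+ : use the register as the address, then add the operand size."""
        addr = self.regs[rn]
        self.regs[rn] = (addr + size) & MASK32
        self.rlog.append((rn, size))
        return addr

    def autodecrement(self, rn, size):
        """-(Rn) : subtract the operand size, then use the register as the address."""
        self.regs[rn] = (self.regs[rn] - size) & MASK32
        self.rlog.append((rn, -size))
        return self.regs[rn]

    def back_up(self):
        """Undo logged register changes in reverse order and restore the PC."""
        while self.rlog:
            rn, delta = self.rlog.pop()
            self.regs[rn] = (self.regs[rn] - delta) & MASK32
        self.pc = self.backup_pc


def demo():
    """Evaluate two specifiers, then back up as if a fault had occurred."""
    s = MicroState()
    s.regs[1], s.regs[2], s.pc = 0x1000, 0x2000, 0x400
    s.begin_instruction()
    first = s.autoincrement(1, 4)   # longword operand via (R1)+
    second = s.autodecrement(2, 4)  # longword operand via -(R2)
    s.pc += 3                       # hypothetical number of bytes consumed
    s.back_up()                     # restart: registers and PC as before
    return first, second, s.regs[1], s.regs[2], s.pc
```

Logging the signed delta rather than the old register value keeps each entry small, and popping the entries in reverse order handles the case where the same register is modified by more than one specifier of the same instruction.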
The orthogonality of the instruction set and the variable number of operands also demand some mechanism for accessing the VAX instruction stream many times during the execution of each instruction. The result is a decoding structure with multiple entry points to the microcode for each VAX instruction, depending on the operand being processed, its addressing mode, access type, and data type. On the 780, it is called the DECODE ROM, on the 8600, the DRAM.

The use of memory operands during the operate phase of an instruction, the frequency of occurrence of those memory operands, and the inordinate amount of time required to access a PTE (indirectly through two page tables) demand a faster way to perform virtual to physical address translation. The result is the Translation Buffer, a cache of most recently used PTEs. A common implementation technique is to make the Translation Buffer as large as possible.

Beyond these implementation structures, there is room for divergence.

3. The Early Machines

The first implementation of the VAX was the VAX 11/780, made from Schottky TTL MSI parts, and introduced in 1977.

[...] relied on greater use of regular structures such as buses.

The 780 used a smaller register file, using it only for internal temporaries and target machine general purpose registers, preferring to implement separately (and at increased performance) the internal processor registers and the constants needed in the data path. The 750 used a larger register file, preferring to use it for internal processor registers as well.

The 780 included no provision for introducing new constants into the data path. The 750 provided two immediate formats in its microinstruction for allowing nine bit and 32 bit constants to be included within the microcode.

The 780 allowed a microsubroutine RETURN to have a destination which is offset by some Hamming Distance from the control store address which contained the CALL. The 750 allowed this offset to be an arbitrary distance. The 780 mechanism is faster, but less flexible; the 750 mechanism is slower, but more general.

The 780 had a bug in its microinstruction definition whereby the microcode could not in the same microinstruction perform a microsubroutine CALL and use the target machine for determining the next control store address. The 750 corrected the bug by encoding the two microorders in