Tiny Register Machine (TRM)

Reconfigurable Computing Systems (252-2210-00L), Fall 2012
L. Liu, Department of Computer Science, ETH Zürich
Fall semester, 2012

Introduction

Jumping up a few levels of abstraction.
- Architecture: the programmer's view of the computer, defined by instructions (operations) and operand locations
- Microarchitecture: how to implement an architecture in hardware

Levels of abstraction, with typical building blocks at each level:

    Application Software   programs
    Operating Systems      device drivers
    Architecture           instructions, registers
    Microarchitecture      datapaths, controllers
    Logic                  adders, memories
    Digital Circuits       AND gates, NOT gates
    Analog Circuits        amplifiers, filters
    Devices                transistors, diodes
    Physics                electrons

Adapted from Digital Design and Computer Architecture, David Money Harris & Sarah L. Harris, © 2007 Elsevier.

Assembly Language

To command a computer, you must understand its language.
- Instructions: words in a computer's language
- Instruction set: the vocabulary of a computer's language

Instructions indicate the operation to perform and the operands to use.
- Assembly language: human-readable format of instructions
- Machine language: computer-readable format (1's and 0's)
- Assembler: a tool to translate assembly code into machine code

TRM (Tiny Register Machine) architecture:
- Designed by Prof. Niklaus Wirth and implemented on Xilinx Spartan-3, Virtex-5 and Spartan-6 by Ling Liu.
- A RISC processor that contains only the instructions necessary to run an application written in a high-level language.

Once you've learned one architecture, it's easy to learn others.

Architecture Design Principles

Underlying design principles, as articulated by Hennessy and Patterson:
1. Simplicity favors regularity
2. Make the common case fast
3. Smaller is faster
4. Good design demands good compromises

Design Principle 1

Simplicity favors regularity
- Consistent instruction format
- Same number of operands, easier to encode and handle in hardware

Figure: TRM instruction encoding.

Design Principle 2

Make the common case fast
- ALU operations are performed on registers and constants.
- TRM is a reduced instruction set computer (RISC), with a small number of simple instructions.
- The main characteristic of a RISC architecture is that most instructions execute in one clock cycle.

Design Principle 3

Smaller is faster
- TRM includes only a small number of registers.
- Just as retrieving data from a few books on your table is faster than sorting through 1000 books, retrieving data from TRM's 8 registers is faster than retrieving it from 1000 registers or a large memory.

Design Principle 4

Good design demands good compromises
- Multiple instruction formats allow flexibility:
  • ADD, SUB: use 2 register operands
  • LD, ST: use 2 register operands and a constant
- The number of instruction formats is kept small, to adhere to design principles 1 and 3 (simplicity favors regularity and smaller is faster).

TRM Machine Language

- Computers only understand 1's and 0's.
- Machine language: binary representation of instructions
- 18-bit instructions
- Three instruction types:
  • Type a: arithmetical and logical operations
  • Type b: load and store instructions
  • Type c: branch instructions (for jumping)

Type a: Arithmetical and Logical Instructions

1 or 2 register operands:
- Rs: source register
- Rd: destination register
- n: immediate, zero-extended

Other fields:
- op: the operation code or opcode
- regSel: bit 10 (1 means the source operand comes from the Rs register)

The TRM Registers

    Name    Register number  Usage
    R0~R7   0~7              Working registers
    R7      7                Normally used as link register
    R6      6                Normally used as stack pointer
    PC                       Program counter
    C                        Carry flag (1 bit)
    N                        Sign flag (1 bit)
    Z                        Zero flag (1 bit)
    V                        Overflow flag (1 bit)

(can be ignored)

The Power of the Stored Program

- 18-bit instructions and 32-bit data are stored in memory.
- The sequence of instructions is the only difference between two applications (for example, a text editor and a video game).
- To run a new program:
  • No rewiring required
  • Simply store the new program in memory
- The processor hardware executes the program:
  • fetches (reads) the instructions from instruction memory in sequence
  • performs the specified operation
- The program counter (PC) keeps track of the current instruction.
- In TRM, programs start at memory address 0x000.

The Stored Program

    Assembly code        Machine code
    LD  R2, [R0+32]      0x31100
    ADD R1, R2           0x08C02
    SUB R0, 12           0x0C00C
    SUB R0, R5           0x0C405

Stored program (instruction memory, addressed by the PC):

    Address  Instruction
    003      0C405
    002      0C00C
    001      08C02
    000      31100

Type b: Load and Store Instructions

- Memory base address register (Rs)
- 7-bit offset (n), zero-extended
- Bit 10 is always 0 for TRM, 1 for VTRM

    op    instruction    operation                                              code (binary)
    1100  LD Rd, Rs, n   if Rs = R7 then Rd := Mem[n] else Rd := Mem[Rs+n]      1100ddd0nnnnnnnsss
    1101  ST Rd, Rs, n   if Rs = R7 then Mem[n] := Rd else Mem[Rs+n] := Rd      1101ddd0nnnnnnnsss

Type c: Branch Instructions

- Jump conditions (cond)
- 10-bit (Bc) or 14-bit (BL) address offset operand (off)

Branch Instructions

    op    instruction  operation                            code (binary)
    1110  Bc n         PC := PC + 1 + n, on condition c     1110ccccnnnnnnnnnn
    1111  BL n         R7 := PC + 1; PC := PC + 1 + n       1111nnnnnnnnnnnnnn

Branching

Allows a program to execute instructions out of sequence.

Types of branches:
- Conditional branches
  • branch if equal (BEQ)
  • branch if not equal (BNE)
- Unconditional branches
  • BT
  • jump register (BR)
  • jump and link (BL, BLR)

    cond  condition        meaning                                          mnemonic
    0000  Z                Zero / equal                                     BEQ (BZS)
    0001  ~Z               Non-zero / unequal                               BNE (BZC)
    0010  C                Carry / above or equal (unsigned)                BAE (BCS)
    0011  ~C               No carry / below (unsigned)                      BB (BCC)
    0100  N                Negative                                         BN (BNS)
    0101  ~N               Not negative                                     BNN (BNC)
    0110  V                Overflow                                         BO (BVS)
    0111  ~V               No overflow                                      BNO (BVC)
    1000  ~(~C | Z)        Carry and not zero / above (unsigned)            BA
    1001  ~C | Z           No carry or zero / below or equal (unsigned)     BBE
    1010  ~(N ≠ V)         N = V / greater or equal (signed)                BGE
    1011  N ≠ V            N ≠ V / less (signed)                            BLT
    1100  ~((N ≠ V) | Z)   Greater or equal and ~Z / greater (signed)       BGT
    1101  (N ≠ V) | Z      Less or Z / less or equal (signed)               BLE
    1110  TRUE             Always                                           BT (B)
    1111  FALSE            Never                                            BF

    N = bit 31 of result
    Z = all 32 bits are zero
    C = carry
    V = overflow

Conditional Branching

    (* TRM assembly *)
    MOV R0, 4        (* R0 = 4 *)
    MOV R1, 1        (* R1 = 1 *)
    SUB R0, R1       (* R0 = R0 - R1 = 3 *)
    BNE target       (* branch is taken *)
    ADD R1, 3        (* not executed *)
    target:          (* label *)
    ADD R1, R1       (* R1 = 1 + 1 = 2 *)

Labels indicate instruction locations in a program.
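As a rough check of the field layouts above, the following Python sketch decodes 18-bit instruction words and can be tried on the four sample words from the stored program. The type b and type c layouts come from the encoding tables. The type a layout (opcode in bits 17 to 14, Rd in bits 13 to 11, regSel in bit 10, operand in the low bits) and the opcodes 0010 for ADD and 0011 for SUB are inferred from those sample words and are not stated in the slides. The name `decode`, the bracketed display of ST operands, and printing branch offsets as raw unsigned fields are illustrative choices only.

```python
# Sketch of a TRM instruction-word decoder (not the official toolchain).
# Assumption: type a fields are op[17:14], Rd[13:11], regSel[10], n[9:0] / Rs[2:0],
# inferred from the sample words 0x08C02, 0x0C00C and 0x0C405.

# Branch mnemonics in cond order 0000..1111, taken from the condition table.
COND_MNEMONICS = ["BEQ", "BNE", "BAE", "BB", "BN", "BNN", "BO", "BNO",
                  "BA", "BBE", "BGE", "BLT", "BGT", "BLE", "BT", "BF"]

# Only the two type a opcodes that can be read off the sample words.
TYPE_A_OPS = {0b0010: "ADD", 0b0011: "SUB"}


def decode(word: int) -> str:
    """Return an assembly-like string for one 18-bit TRM instruction word."""
    if not 0 <= word < (1 << 18):
        raise ValueError("TRM instructions are 18 bits wide")
    op = (word >> 14) & 0xF
    rd = (word >> 11) & 0x7

    if op in (0b1100, 0b1101):            # type b: 110xddd0nnnnnnnsss
        n = (word >> 3) & 0x7F            # 7-bit offset
        rs = word & 0x7                   # base address register
        name = "LD" if op == 0b1100 else "ST"
        return f"{name} R{rd}, [R{rs}+{n}]"

    if op == 0b1110:                      # type c: 1110ccccnnnnnnnnnn
        cond = (word >> 10) & 0xF
        return f"{COND_MNEMONICS[cond]} {word & 0x3FF}"   # raw 10-bit offset field

    if op == 0b1111:                      # type c: 1111nnnnnnnnnnnnnn
        return f"BL {word & 0x3FFF}"      # raw 14-bit offset field

    # type a: unknown opcodes are shown as OPxxxx rather than guessed
    name = TYPE_A_OPS.get(op, f"OP{op:04b}")
    if (word >> 10) & 1:                  # regSel = 1: operand comes from Rs
        return f"{name} R{rd}, R{word & 0x7}"
    return f"{name} R{rd}, {word & 0x3FF}"   # regSel = 0: immediate operand


if __name__ == "__main__":
    for w in (0x31100, 0x08C02, 0x0C00C, 0x0C405):
        print(f"{w:05X}  {decode(w)}")
```

Running the script prints one decoded line per sample word, which can be compared against the assembly column of the stored program table.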

The Branch Not Taken

    (* TRM assembly *)
    MOV R0, 4        (* R0 = 4 *)
    MOV R1, 1        (* R1 = 1 *)
    SUB R0, R1       (* R0 = R0 - R1 = 3 *)
    BEQ target       (* branch is not taken *)
    ADD R1, 3        (* executed: R1 = 1 + 3 = 4 *)
    target:          (* label *)
    ADD R1, R1       (* R1 = 4 + 4 = 8 *)

Unconditional Branching / Jumping (BT)

    (* TRM assembly *)
    MOV R0, 4        (* R0 = 4 *)
    MOV R1, 1        (* R1 = 1 *)
    BT target        (* jump to target *)
    ROR R1, 2        (* not executed *)
    target:
    ADD R1, R0       (* R1 = 1 + 4 = 5 *)

Review: Instruction Formats (figure)

High-Level Code Constructs

- if statements
- if/else statements
- while loops
- for loops

If Statement

High-level code:

    IF i = j THEN
      f := g + h;
    END;
    f := f - i;

TRM assembly code:

    (* R0 = f, R1 = g, R2 = h, R3 = i, R4 = j *)
          SUB R4, R3
          BNE L1
          ADD R1, R2
          MOV R0, R1
    L1:   SUB R0, R3
    Done: BT Done

Notice that the assembly tests for the opposite case (i # j) of the test in the high-level code (i = j).

If / Else Statement

High-level code:

    IF i = j THEN
      f := g + h;
    ELSE
      f := f - i;
    END

TRM assembly code:

    (* R0 = f, R1 = g, R2 = h, R3 = i, R4 = j *)
          SUB R4, R3
          BNE L1
          ADD R1, R2
          MOV R0, R1
          BT Done
    L1:   SUB R0, R3
    Done: BT Done

While Loops

High-level code:

    (* determines the value of x such that 2^x = 128 *)
    VAR pow, x: INTEGER;
    BEGIN
      pow := 1;
      x := 0;
      WHILE pow # 128 DO
        pow := pow * 2;
        x := x + 1;
      END
    END

TRM assembly code:

    (* R0 = pow, R1 = x *)
           MOV R0, 1
           MOV R1, 0
    while: MOV R2, R0
           SUB R2, 128
           BEQ done
           ADD R0, R0
           ADD R1, 1
           BT while
    done:  BT done

Notice that the assembly tests for the opposite case (pow = 128) of the test in the high-level code (pow # 128).

For Loops

The general form of a for loop is:

    FOR initialization TO condition BY loop operation DO
      loop body
    END

- initialization: executes before the loop begins
- condition: is tested at the beginning of each iteration
- loop operation: executes at the end of each iteration
- loop body: executes each time the condition is met

High-level code:

    (* add the numbers from 0 to 9 *)
    VAR sum, i: INTEGER
    BEGIN
      sum := 0;
      FOR i := 0 TO 9 BY 1 DO
        sum := sum + i;
      END
    END

TRM assembly code:

    (* R0 = i, R1 = sum *)
          MOV R1, 0
          MOV R0, 0
    for:  MOV R2, R0
          SUB R2, 10
          BEQ done
          ADD R1, R0
          ADD R0, 1
          BT for
    done: BT done

Notice that the assembly tests for the exit case (i = 10), the opposite of the loop condition implied by the high-level code (i <= 9, which for a step of 1 is the same as i # 10).
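The loop translations above can be traced with a small interpreter. The Python sketch below handles only the instructions those examples use (MOV, ADD, SUB, BEQ, BNE, BT) and is not a TRM simulator. It rests on several assumptions that the slides do not confirm: ADD and SUB update the Z flag while MOV leaves it unchanged, results wrap to 32 bits, source text carries no `(* ... *)` comments, and a branch to its own address (the `done: BT done` idiom) is treated as a halt. The names `run`, `FOR_LOOP` and `WHILE_LOOP` are hypothetical.

```python
# Minimal interpreter for the TRM assembly subset used in the loop examples.
# Assumptions: only the Z flag is modeled; ADD/SUB set it, MOV does not.

MASK32 = 0xFFFFFFFF

FOR_LOOP = """
      MOV R1, 0
      MOV R0, 0
for:  MOV R2, R0
      SUB R2, 10
      BEQ done
      ADD R1, R0
      ADD R0, 1
      BT for
done: BT done
"""

WHILE_LOOP = """
       MOV R0, 1
       MOV R1, 0
while: MOV R2, R0
       SUB R2, 128
       BEQ done
       ADD R0, R0
       ADD R1, 1
       BT while
done:  BT done
"""


def run(source: str, max_steps: int = 10_000) -> list[int]:
    """Execute the program and return the final values of R0..R7."""
    labels, program = {}, []
    for raw in source.splitlines():
        line = raw.strip()
        if ":" in line:                       # "label: instruction" or "label:"
            label, line = line.split(":", 1)
            labels[label.strip()] = len(program)
            line = line.strip()
        if line:
            mnemonic, _, rest = line.partition(" ")
            args = [a.strip() for a in rest.split(",") if a.strip()]
            program.append((mnemonic.upper(), args))

    regs, zero, pc = [0] * 8, False, 0

    def value(arg: str) -> int:
        """Register contents for 'Rn', otherwise the decimal immediate."""
        return regs[int(arg[1:])] if arg.upper().startswith("R") else int(arg)

    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, args = program[pc]
        if op in ("BT", "BEQ", "BNE"):
            taken = op == "BT" or (op == "BEQ") == zero
            target = labels[args[0]]
            if taken and target == pc:        # branch-to-self: treat as halt
                break
            pc = target if taken else pc + 1
            continue
        rd = int(args[0][1:])
        if op == "MOV":
            regs[rd] = value(args[1]) & MASK32
        elif op in ("ADD", "SUB"):
            operand = value(args[1])
            result = regs[rd] + operand if op == "ADD" else regs[rd] - operand
            regs[rd] = result & MASK32
            zero = regs[rd] == 0              # Z = all 32 result bits are zero
        else:
            raise ValueError(f"unsupported instruction: {op}")
        pc += 1
    return regs


if __name__ == "__main__":
    print(run(FOR_LOOP)[:2])    # i and sum, per the register assignment R0 = i, R1 = sum
    print(run(WHILE_LOOP)[:2])  # pow and x, per R0 = pow, R1 = x
```

For the for loop this yields i = 10 and sum = 45 (0 + 1 + ... + 9); for the while loop it yields pow = 128 and x = 7.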