Fpgas for Custom Computing Machines, Napa Progress Function As the General Purpose Systems

Total Page:16

File Type:pdf, Size:1020Kb

Fpgas for Custom Computing Machines, Napa Progress Function As the General Purpose Systems Comparing the Performance of FPGA-Based Custom Computers with General-purpose Computers for DSP Applications Neil W. Bergmann and J. Craig Mudge CSIRO/Flinders Joint Research Centre in Information Technology Flinders University, GPO Box 2100, Adelaide, SA 5001, AUSTRALIA Tel: 61-8-201-3 109; Fax: 61-8-201-3507 E-mail: [email protected], [email protected] Ab st r ac t conventional desktop workstations, and often signifi- When FPGA logic circuits are incorporated within a cantly greater than the best reported results using stored-program computer, the result is a machine where "conventional" super-computers, especially for those the programmer can design both the sofhvare and the algorithms which can be decomposed into many, simple, hardware that will execute that software. This paper first parallel processing tasks. Many Digital Signal describes some of the mre important custom computers, Processing (DSP) algorithms can be decomposed into and their potential weakness as DSP implementation parallel tasks, but each task often involves relatively platforms. It then describes a new custom computing complex operations such as a multiply-accumulate. DSP architecture which is specifically designed for efficient algorithms are therefore less clearly suitable candidates implementation of DSP algorithms. Finally, it presents a for efficient implementation on a custom computer. simple performance comparison of a number of DSP This paper surveys some of the most important custom implementation alternatives, and concludes that (i) the computers, presents the authors' work on a new custom new custom computing architecture is worthy of further computing architecture specifically designed to support investigation, and (ii) that custom computers based only DSP applications, and analyses the performance of on FPGA execution units show little performance various implementation alternatives for DSP algorithms. improvement over state-of-the art workstations. 2. Previous methods of customising 1. Custom computing computers for DSP Field Programmable Gate Arrays are now a popular There is a constant tension in computer design implementation style for digital logic systems and sub- between being general purpose, i.e., doing a wide range systems [l]. Where the programming configuration is of computational tasks moderately well, and being held in static RAM, the logic function implemented by application specific, i.e., doing a smaller range of those FPGAs can be dynamically reconfigured, in computational tasks much better, usually at the cost of fractions of a second, by rewriting the contents of the either increased system resources or of poorer "general SRAM configuration memory. When such PGA logic purpose" performance. circuits are incorporated within a stored-program There have been many different approaches computer, the result is a machine where the programmer investigated over the years for improving the can design both the software and the hardware that will performance of a general purpose computer for the execute that software. Such a machine, where the implementation of DSP algorithms. Custom computers hardware can be reconfigured and customised on a represent the latest technology which shows some program-by-program basis, is called a custom computer promise for this task. PI. A common approach to improving the performance of Several researchers report algorithm speed-up rates of a general purpose computer for specific applications is hundreds or thousands of times compared to the addition of application specific hardware, such as 164 0-8186-5490-2/94$03.00 0 1994 IEEE graphics accelerators for computer displays, or image and boards, with each board containing 16 Xilinx 4010 video compression chips for multi-media workstations. FPGAs [6], connected as a linear array, plus an extra These can provide excellent speedup for that specific Xilinx 4010 for control, with all of the FFGAs having application, but provide no performance improvement some additional interconnections via a central crossbar when other applications are run. This approach switch. The SPLASH boards can communicate with a represents one extreme of the generality/cost spectrum. SUN Sparcstation host via input and output FIFOs, Another common approach, at the opposite end of the which are on an additional interface board connecting spectrum, is the addition of a general-purpose parallel the SUN and SPLASH array. processing sub-system to a host processor. Such add-on Once configured, data is streamed through the parallel processing boards have seemed an obvious and SPLASH processor using the systolic array programming attractive enhancement to desktop computers for some model, with results streaming back to the host. The time, but they have achieved only limited success in the SPLASH processor is most suited to simple streaming marketplace. It is our conjecture that this is because of operations, and has shown significant speedups over difficulty of programming, and high cost resulting from conventional supercomputers for tasks such as text low sales volumes. Additionally, such parallel searching and genetic database searching. processing systems have commonly been inefficient at SPLASH is usually programmed by specifying the implementing the fine-grain, communications-intensive function of the FPGAs using VHDL, which is then parallel algorithms associated with DSP applications. automatically translated into an FPGA configuration file. Recently, general-purpose, Digital Signal Processor chips Current research [5] is examining the use of other such as the TMS32OC40 [3] have become available. programming languages, such as data parallel C. These combine high speed, floating-point arithmetic performance with high speed interprocessor 3.2 Programmable Active Memory communication channels incorporating individual DMA controllers. Arrays of such chips provide a formidable The PAM (Programmable Active Memory) has been challenge for other DSP implementation altematives. under development at DECs Paris Research Labs for The idea of a computer which can be customised, several years [7], [8]. The latest version, PeRLe-1, under programmer control, on an application-by- consists of a 5x5 array of Xilinx 3090 FTGAs, connected application basis, is not new. A writable control store to local 32-bit wide RAM banks, and also, via a lOOMB/s within a micro-programmed computer allows the TURBOchannel interface, to a DEC desktop workstation. programmer to design application-specific machine The workstation writes data to the RAM banks, which is instructions, which can make better use of the existing processed by the Xilinx array and returned to the RAM functional units within the computer, and hence improve banks, from where the host then retrieves the results. the performance of specific applications, including those Programming the PAM consists of designing software within DSP [4]. Such performance improvement has components for the host, and hardware components for been limited because of the interpretative nature of the PAM array. The latter can be done by writing a microprogramming. program in a conventional programming language (Lisp, A custom computer goes one step further than a C++, and Esterel are used) using a specialised library. writable control store by allowing design of new The program describes logic modules by their bit-level functional units, rather than simply making better use of logic equations, or by using standard library modules existing functional units. such as adders, and registers. Ten applications are described in [8], including long 3. Some Existing Custom Computers multiplication, RSA cryptography, data compression, string matching, heat and Laplace equations, a In this section, we examine three of the best known Boltzmann machine neural network, 3-D graphics custom computers as examples of the current state of the acceleration, and the discrete cosine transform. Results art in this area. are very encouraging; e.g., the PAM implementation of 512-bit RSA cryptography was faster than any other 3.1 SPLASH reported implementation in any technology as of February 1990, and 10 times faster than the next best SPLASH and SPLASH I1 [5] are custom computers reported implementation on a custom VLSI circuit. which have been developed at the Supercomputer Research Corporation, Maryland. The SPLASH I1 pro- cessor consists of an extendable number of processor 165 3.3 The Virtual Computer cessing element. For this reason, we are now exploring a new custom computing architecture explicitly targetted at The Virtual Computer [9], from the Virtual Computer the efficient implementation of digital signal and image Corporation, provides approximately 500,000 processing applications. programmable logic gates, using an array of 52 Xilinx 4010 FPGAs and 24 ICUBE Field Programmable 4.2 Our New Architecture Interconnect Devices, 8 megabytes of SRAM, and 16k x 16-bit 25ns dual-port RAMs. The system also has 3 x Our decision to concentrate our efforts on a custom 64-bit 1/0 ports - one for hardware configuration / read- computing architecture which is specifically designed for back, and two for general purpose 1/0 such as connection the support of DSP algorithms leads us naturally to the to a host workstation. The central processing array of 40 provision of specific hardware support for the arithmetic Xilinx FPGAs, called the Virtual Array, consists of four operations which dominate many DSP algorithms, but Virtual Pipelines, each with 10 FPGAs connected to which are costly
Recommended publications
  • Historical Perspective and Further Reading 162.E1
    2.21 Historical Perspective and Further Reading 162.e1 2.21 Historical Perspective and Further Reading Th is section surveys the history of in struction set architectures over time, and we give a short history of programming languages and compilers. ISAs include accumulator architectures, general-purpose register architectures, stack architectures, and a brief history of ARMv7 and the x86. We also review the controversial subjects of high-level-language computer architectures and reduced instruction set computer architectures. Th e history of programming languages includes Fortran, Lisp, Algol, C, Cobol, Pascal, Simula, Smalltalk, C+ + , and Java, and the history of compilers includes the key milestones and the pioneers who achieved them. Accumulator Architectures Hardware was precious in the earliest stored-program computers. Consequently, computer pioneers could not aff ord the number of registers found in today’s architectures. In fact, these architectures had a single register for arithmetic instructions. Since all operations would accumulate in one register, it was called the accumulator , and this style of instruction set is given the same name. For example, accumulator Archaic EDSAC in 1949 had a single accumulator. term for register. On-line Th e three-operand format of RISC-V suggests that a single register is at least two use of it as a synonym for registers shy of our needs. Having the accumulator as both a source operand and “register” is a fairly reliable indication that the user the destination of the operation fi lls part of the shortfall, but it still leaves us one has been around quite a operand short. Th at fi nal operand is found in memory.
    [Show full text]
  • Computer Organization EECC 550 • Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Week 1 Notation (RTN)
    Computer Organization EECC 550 • Introduction: Modern Computer Design Levels, Components, Technology Trends, Register Transfer Week 1 Notation (RTN). [Chapters 1, 2] • Instruction Set Architecture (ISA) Characteristics and Classifications: CISC Vs. RISC. [Chapter 2] Week 2 • MIPS: An Example RISC ISA. Syntax, Instruction Formats, Addressing Modes, Encoding & Examples. [Chapter 2] • Central Processor Unit (CPU) & Computer System Performance Measures. [Chapter 4] Week 3 • CPU Organization: Datapath & Control Unit Design. [Chapter 5] Week 4 – MIPS Single Cycle Datapath & Control Unit Design. – MIPS Multicycle Datapath and Finite State Machine Control Unit Design. Week 5 • Microprogrammed Control Unit Design. [Chapter 5] – Microprogramming Project Week 6 • Midterm Review and Midterm Exam Week 7 • CPU Pipelining. [Chapter 6] • The Memory Hierarchy: Cache Design & Performance. [Chapter 7] Week 8 • The Memory Hierarchy: Main & Virtual Memory. [Chapter 7] Week 9 • Input/Output Organization & System Performance Evaluation. [Chapter 8] Week 10 • Computer Arithmetic & ALU Design. [Chapter 3] If time permits. Week 11 • Final Exam. EECC550 - Shaaban #1 Lec # 1 Winter 2005 11-29-2005 Computing System History/Trends + Instruction Set Architecture (ISA) Fundamentals • Computing Element Choices: – Computing Element Programmability – Spatial vs. Temporal Computing – Main Processor Types/Applications • General Purpose Processor Generations • The Von Neumann Computer Model • CPU Organization (Design) • Recent Trends in Computer Design/performance • Hierarchy
    [Show full text]
  • Digital Semiconductor Alpha 21064 and Alpha 21064A Microprocessors Hardware Reference Manual
    Digital Semiconductor Alpha 21064 and Alpha 21064A Microprocessors Hardware Reference Manual Order Number: EC–Q9ZUC–TE Abstract: This document contains information about the following Alpha microprocessors: 21064-150, 21064-166, 21064-200, 21064A-200, 21064A-233, 21064A-275, 21064A-275-PC, and 21064A-300. Revision/Update Information: This manual supersedes the Alpha 21064 and Alpha 21064A Microprocessors Hard- ware Reference Manual (EC–Q9ZUB–TE). Digital Equipment Corporation Maynard, Massachusetts June 1996 While Digital believes the information included in this publication is correct as of the date of publication, it is subject to change without notice. Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description. © Digital Equipment Corporation 1996. All rights reserved. Printed in U.S.A. AlphaGeneration, Digital, Digital Semiconductor, OpenVMS, VAX, VAX DOCUMENT, the AlphaGeneration design mark, and the DIGITAL logo are trademarks of Digital Equipment Corporation. Digital Semiconductor is a Digital Equipment Corporation business. GRAFOIL is a registered trademark of Union Carbide Corporation. Windows NT is a trademark of Microsoft Corporation. All other trademarks and registered trademarks are the property of their respective owners. This document was prepared using VAX DOCUMENT Version 2.1. Contents Preface ..................................................... xix 1 Introduction to the 21064/21064A 1.1 Introduction ......................................... 1–1 1.2 The Architecture . ................................... 1–1 1.3 Chip Features ....................................... 1–2 1.4 Backward Compatibility ...............................
    [Show full text]
  • Phaser 600 Color Printer User Manual
    WELCOME! This is the home page of the Phaser 600 Color Printer User Manual. See Tips on using this guide, or go immediately to the Contents. Phaser® 600 Wide-Format Color Printer Tips on using this guide ■ Use the navigation buttons in Acrobat Reader to move through the document: 1 23 4 5 6 1 and 4 are the First Page and Last Page buttons. These buttons move to the first or last page of a document. 2 and 3 are the Previous Page and Next Page buttons. These buttons move the document backward or forward one page at a time. 5 and 6 are the Go Back and Go Forward buttons. These buttons let you retrace your steps through a document, moving to each page or view in the order visited. ■ For best results, use the Adobe Acrobat Reader version 2.1 to read this guide. Version 2.1 of the Acrobat Reader is supplied on your printer’s CD-ROM. ■ Click on the page numbers in the Contents on the following pages to open the topics you want to read. ■ You can click on the page numbers following keywords in the Index to open topics of interest. ■ You can also use the Bookmarks provided by the Acrobat Reader to navigate through this guide. If the Bookmarks are not already displayed at the left of the window, select Bookmarks and Page from the View menu. ■ If you have difficultly reading any small type or seeing the details in any of the illustrations, you can use the Acrobat Reader’s Magnification feature.
    [Show full text]
  • RISC-V Geneology
    RISC-V Geneology Tony Chen David A. Patterson Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2016-6 http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-6.html January 24, 2016 Copyright © 2016, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Introduction RISC-V is an open instruction set designed along RISC principles developed originally at UC Berkeley1 and is now set to become an open industry standard under the governance of the RISC-V Foundation (www.riscv.org). Since the instruction set architecture (ISA) is unrestricted, organizations can share implementations as well as open source compilers and operating systems. Designed for use in custom systems on a chip, RISC-V consists of a base set of instructions called RV32I along with optional extensions for multiply and divide (RV32M), atomic operations (RV32A), single-precision floating point (RV32F), and double-precision floating point (RV32D). The base and these four extensions are collectively called RV32G. This report discusses the historical precedents of RV32G. We look at 18 prior instruction set architectures, chosen primarily from earlier UC Berkeley RISC architectures and major proprietary RISC instruction sets. Among the 122 instructions in RV32G: ● 6 instructions do not have precedents among the selected instruction sets, ● 98 instructions of the 116 with precedents appear in at least three different instruction sets.
    [Show full text]
  • Effectiveness of the MAX-2 Multimedia Extensions for PA-RISC 2.0 Processors
    Effectiveness of the MAX-2 Multimedia Extensions for PA-RISC 2.0 Processors Ruby Lee Hewlett-Packard Company HotChips IX Stanford, CA, August 24-26,1997 Outline Introduction PA-RISC MAX-2 features and examples Mix Permute Multiply with Shift&Add Conditionals with Saturation Arith (e.g., Absolute Values) Performance Comparison with / without MAX-2 General-Purpose Workloads will include Increasing Amounts of Media Processing MM a b a b 2 1 2 1 b c b c functionality 5 2 5 2 A B C D 1 2 22 2 2 33 3 4 55 59 A B C D 1 2 A B C D 22 1 2 22 2 2 2 2 33 33 3 4 55 59 3 4 55 59 Distributed Multimedia Real-time Information Access Communications Tool Tool Computation Tool time 1980 1990 2000 Multimedia Extensions for General-Purpose Processors MAX-1 for HP PA-RISC (product Jan '94) VIS for Sun Sparc (H2 '95) MAX-2 for HP PA-RISC (product Mar '96) MMX for Intel x86 (chips Jan '97) MDMX for SGI MIPS-V (tbd) MVI for DEC Alpha (tbd) Ideally, different media streams map onto both the integer and floating-point datapaths of microprocessors images GR: GR: 32x32 video 32x64 ALU SMU FP: graphics FP:16x64 Mem 32x64 audio FMAC PA-RISC 2.0 Processor Datapath Subword Parallelism in a General-Purpose Processor with Multimedia Extensions General Regs. y5 y6 y7 y8 x5 x6 x7 x8 x1 x2 x3 x4 y1 y2 y3 y4 Partitionable Partitionable 64-bit ALU 64-bit ALU 8 ops / cycle Subword Parallel MAX-2 Instructions in PA-RISC 2.0 Parallel Add (modulo or saturation) Parallel Subtract (modulo or saturation) Parallel Shift Right (1,2 or 3 bits) and Add Parallel Shift Left (1,2 or 3 bits) and Add Parallel Average Parallel Shift Right (n bits) Parallel Shift Left (n bits) Mix Permute MAX-2 Leverages Existing Processing Resources FP: INTEGER FLOAT GR: 16x64 General Regs.
    [Show full text]
  • Alpha 21064A Microprocessors Data Sheet
    Alpha 21064A Microprocessors Data Sheet Order Number: EC–QFGKC–TE This document contains information about the following Alpha micro- processors: 21064A–200, 21064A–233, 21064A–275, 21064A–275–PC, and 21064A–300. Revision/Update Information: This document supersedes the Alpha 21064A–233, –275 Microprocessor Data Sheet, EC–QFGKB–TE. Digital Equipment Corporation Maynard, Massachusetts January 1996 While Digital believes the information included in this publication is correct as of the date of publication, it is subject to change without notice. Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description. © Digital Equipment Corporation 1995, 1996. All rights reserved. Printed in U.S.A. AlphaGeneration, Digital, Digital Semiconductor, OpenVMS, VAX, VAX DOCUMENT, the AlphaGeneration design mark, and the DIGITAL logo are trademarks of Digital Equipment Corporation. Digital Semiconductor is a Digital Equipment Corporation business. GRAFOIL is a registered trademark of Union Carbide Corporation. Windows NT is a trademark of Microsoft Corporation. All other trademarks and registered trademarks are the property of their respective owners. This document was prepared using VAX DOCUMENT Version 2.1. Contents 1 Overview ........................................... 1 2 Signal Names and Functions . ........................... 6 3 Instruction Set ....................................... 17 3.1 Instruction Summary ............................... 17 3.2 IEEE Floating-Point Instructions . ................... 23 3.3 21064A IEEE Floating-Point Conformance .............. 25 3.4 VAX Floating-Point Instructions . ................... 28 3.5 Required PALcode Function Codes .
    [Show full text]
  • Design of the RISC-V Instruction Set Architecture
    Design of the RISC-V Instruction Set Architecture Andrew Waterman Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2016-1 http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-1.html January 3, 2016 Copyright © 2016, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor David Patterson, Chair Professor Krste Asanovi´c Associate Professor Per-Olof Persson Spring 2016 Design of the RISC-V Instruction Set Architecture Copyright 2016 by Andrew Shell Waterman 1 Abstract Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman Doctor of Philosophy in Computer Science University of California, Berkeley Professor David Patterson, Chair The hardware-software interface, embodied in the instruction set architecture (ISA), is arguably the most important interface in a computer system. Yet, in contrast to nearly all other interfaces in a modern computer system, all commercially popular ISAs are proprietary.
    [Show full text]
  • Computer Architectures an Overview
    Computer Architectures An Overview PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 25 Feb 2012 22:35:32 UTC Contents Articles Microarchitecture 1 x86 7 PowerPC 23 IBM POWER 33 MIPS architecture 39 SPARC 57 ARM architecture 65 DEC Alpha 80 AlphaStation 92 AlphaServer 95 Very long instruction word 103 Instruction-level parallelism 107 Explicitly parallel instruction computing 108 References Article Sources and Contributors 111 Image Sources, Licenses and Contributors 113 Article Licenses License 114 Microarchitecture 1 Microarchitecture In computer engineering, microarchitecture (sometimes abbreviated to µarch or uarch), also called computer organization, is the way a given instruction set architecture (ISA) is implemented on a processor. A given ISA may be implemented with different microarchitectures.[1] Implementations might vary due to different goals of a given design or due to shifts in technology.[2] Computer architecture is the combination of microarchitecture and instruction set design. Relation to instruction set architecture The ISA is roughly the same as the programming model of a processor as seen by an assembly language programmer or compiler writer. The ISA includes the execution model, processor registers, address and data formats among other things. The Intel Core microarchitecture microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA. The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be everything from single gates and registers, to complete arithmetic logic units (ALU)s and even larger elements.
    [Show full text]
  • Virtual Memory CS740 October 13, 1998
    Virtual Memory CS740 October 13, 1998 Topics • page tables • TLBs • Alpha 21X64 memory system Levels in a Typical Memory Hierarchy cache virtual memory C 4 Ba 8 B 4 KB CPU c Memory CPU Memory diskdisk regs h regs e register cache memory disk memory reference reference reference reference size: 200 B 32 KB / 4MB 128 MB 20 GB speed: 3 ns 6 ns 100 ns 10 ms $/Mbyte: $256/MB $2/MB $0.10/MB block size: 4 B 8 B 4 KB larger, slower, cheaper – 2 – CS 740 F’98 Virtual Memory Main memory acts as a cache for the secondary storage (disk) Virtual addresses Physical addresses Address translation Disk addresses Increases Program-Accessible Memory • address space of each job larger than physical memory • sum of the memory of many jobs greater than physical memory – 3 – CS 740 F’98 Address Spaces • Virtual and physical address spaces divided into equal-sized blocks – “Pages” (both virtual and physical) • Virtual address space typically larger than physical • Each process has separate virtual address space Physical addresses (PA) Virtual addresses (VA) 0 0 address translation VP 1 PP2 Process 1: VP 2 2n-1 PP7 (Read-only library code) 0 Process 2: VP 1 VP 2 PP10 2n-1 2m-1 – 4 – CS 740 F’98 Other Motivations Simplifies memory management • main reason today • Can have multiple processes resident in physical memory • Their program addresses mapped dynamically – Address 0x100 for process P1 doesn’t collide with address 0x100 for process P2 • Allocate more memory to process as its needs grow Provides Protection • One process can’t interfere with another – Since
    [Show full text]
  • Address Translation & Caches Outline
    Operating Systems & Memory Systems: Address Translation & Caches CPS 220 Professor Alvin R. Lebeck Fall 2001 Outline • Review • TLBs • Page Table Designs • Interaction of VM and Caches Admin • HW #4 Due today • Project Status report due Thursday © Alvin R. Lebeck 2001 CPS 220 2 Page 1 Virtual Memory: Motivation Virtual • Process = Address Space + thread(s) of Physical control • Address space = PA – programmer controls movement from disk – protection? – relocation? • Linear Address space – larger than physical address space » 32, 64 bits v.s. 28-bit physical (256MB) • Automatic management © Alvin R. Lebeck 2001 CPS 220 3 Virtual Memory • Process = virtual address space + thread(s) of control • Translation – VA -> PA – What physical address does virtual address A map to – Is VA in physical memory? • Protection (access control) – Do you have permission to access it? © Alvin R. Lebeck 2001 CPS 220 4 Page 2 Segmented Virtual Memory • Virtual address (232, 264) to Physical Address mapping (230) • Variable size, base + offset, contiguous in both VA and PA Virtual Physical 0x1000 0x0000 0x1000 0x6000 0x2000 0x9000 0x11000 © Alvin R. Lebeck 2001 CPS 220 5 Paged Virtual Memory • Virtual address (232, 264) to Physical Address mapping (228) – virtual page to physical page frame Virtual page number Offset • Fixed Size units for access control & translation Virtual Physical 0x1000 0x0000 0x1000 0x6000 0x2000 0x9000 0x11000 © Alvin R. Lebeck 2001 CPS 220 6 Page 3 Page Table • Kernel data structure (per process) • Page Table Entry (PTE) – VA -> PA translations (if none page fault) – access rights (Read, Write, Execute, User/Kernel, cached/uncached) – reference, dirty bits • Many designs – Linear, Forward mapped, Inverted, Hashed, Clustered • Design Issues – support for aliasing (multiple VA to single PA) – large virtual address space – time to obtain translation © Alvin R.
    [Show full text]
  • Database Integration
    I DATABASE INTEGRATION ALPHA SERVERS & WORKSTATIONS Digital ALPHA 21164 CPU Technical Journal Editorial The Digital TechnicalJournal is a refereed Cyrix is a trademark of Cyrix Corporation. Jane C. Blake, Managing Editor journal published quarterly by Digital dBASE is a trademark and Paradox is Helen L. Patterson, Editor Equipment Corporation, 30 Porter Road a registered trademark of Borland Kathleen M. Stetson, Editor LJ02/D10, Littleton, Massachusetts 01460. International, Inc. Subscriptionsto the Journal are $40.00 Circulation (non-U.S. $60) for four issues and $75.00 EDA/SQL is a trademark of Information Catherine M. Phillips, Administrator (non-U.S. $115) for eight issues and must Builders, Inc. Dorothea B. Cassady, Secretary be prepaid in U.S. funds. University and Encina is a registered trademark of Transarc college professors and Ph.D. students in Corporation. Production the electrical engineering and computer Excel and Microsoft are registered pde- Terri Autieri, Production Editor science fields receive complimentary sub- marks and Windows and Windows NT are Anne S. Katzeff, Typographer scriptions upon request. Orders, inquiries, trademarks of Microsoft Corporation. Joanne Murphy, Typographer and address changes should be sent to the Peter R Woodbury, Illustrator Digital TechnicalJournal at the published- Hewlett-Packard and HP-UX are registered by address. Inquiries can also be sent elec- trademarks of Hewlett-Packard Company. Advisory Board tronically to [email protected]. Single copies INGRES is a registered trademark of Ingres Samuel H. Fuller, Chairman and back issues are available for $16.00 each Corporation. Richard W. Beane by calling DECdirect at 1-800-DIGITAL Donald Z. Harbert (1-800-344-4825).
    [Show full text]