Coherent Network Interfaces for Fine-Grain Communication

To appear in the Proceedings of the 23rd International Symposium on Computer Architecture (ISCA), 1996

Shubhendu S. Mukherjee, Babak Falsafi, Mark D. Hill, and David A. Wood
Computer Sciences Department
University of Wisconsin-Madison
Madison, Wisconsin 53706-1685 USA
{shubu,babak,markhill,david}@cs.wisc.edu

Abstract

Historically, processor accesses to memory-mapped device registers have been marked uncachable to ensure their visibility to the device. The ubiquity of snooping cache coherence, however, makes it possible for processors and devices to interact with cachable, coherent memory operations. Using coherence can improve performance by facilitating burst transfers of whole cache blocks and reducing control overheads (e.g., for polling).

This paper begins an exploration of network interfaces (NIs) that use coherence (coherent network interfaces, or CNIs) to improve communication performance. We restrict this study to NI/CNIs that reside on coherent memory or I/O buses, to NI/CNIs that are much simpler than processors, and to the performance of fine-grain messaging from user process to user process.

Our first contribution is to develop and optimize two mechanisms that CNIs use to communicate with processors. A cachable device register, derived from cachable control registers [39,40], is a coherent, cachable block of memory used to transfer status, control, or data between a device and a processor. Cachable queues generalize cachable device registers from one cachable, coherent memory block to a contiguous region of cachable, coherent blocks managed as a circular queue.

Our second contribution is a taxonomy and comparison of four CNIs with a more conventional NI. Microbenchmark results show that CNIs can improve the round-trip latency and achievable bandwidth of a small 64-byte message by 37% and 125% respectively on the memory bus and 74% and 123% respectively on a coherent I/O bus. Experiments with five macrobenchmarks show that CNIs can improve the performance by 17-53% on the memory bus and 30-88% on the I/O bus.

This work is supported in part by Wright Laboratory Avionics Directorate, Air Force Material Command, USAF, under grant #F33615-94-1-1525 and ARPA order no. B550, NSF PYI Award CCR-9157366, NSF Grant MIP-9225097, an I.B.M. cooperative fellowship, an A.T.&T. graduate fellowship, and donations from A.T.&T. Bell Laboratories, Digital Equipment Corporation, Sun Microsystems, Thinking Machines Corporation, and Xerox Corporation. Our Thinking Machines CM-5 was purchased through NSF Institutional Infrastructure Grant No. CDA-9024618 with matching funding from the University of Wisconsin Graduate School. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Wright Laboratory Avionics Directorate or the U.S. Government.

1 Introduction

Most current computer systems do not efficiently support fine-grain communication. Processors receive data from external devices, such as high-speed networks, through DMA and uncachable device registers. A processor becomes aware of an external event (e.g., a message arrival) via interrupts or by polling uncached status registers. Both notification mechanisms are costly: interrupts have high latency and polling wastes processor cycles and other system resources. A processor sends data with an uncachable store, a mechanism that is rarely given first-class support. Both uncachable loads and stores incur high overhead because they carry small amounts of data (e.g., 4-16 bytes), which fails to use the full transfer bandwidth between a processor and a device. Optimizations such as block copy [42] or special store buffers [42, 23] can help improve the performance of uncachable accesses by transferring data in chunks. However, these optimizations are processor-specific, may require new instructions [42, 23], and may be restricted in their use [42].

Snooping cache coherence mechanisms, on the other hand, are supported by almost all current processors and memory buses. These mechanisms allow a processor to quickly and efficiently obtain a cache block's worth of data (e.g., 32-128 bytes) from another processor or memory.

This paper explores leveraging the first-class support given to snooping cache coherence to improve communication between processors and network interfaces (NIs). NIs need attention, because progress in high-bandwidth, low-latency networks is rapidly making NIs a bottleneck.

Rather than try to explore the entire NI design space here, we focus our efforts in three ways:

• First, we concentrate on NIs that reside on memory or I/O buses. In contrast, other research has examined placing NIs in processor registers [5,15,21], in the level-one cache controller [1], and on the level-two cache bus [10]. Our NIs promise lower cost than the other alternatives, given the economics of current microprocessors and the higher integration level we expect in the future. Nevertheless, closer integration is desirable if it can be made economically viable.

• Second, we limit ourselves to relatively simple NIs, similar in complexity to the Thinking Machines CM-5 NI [29] or a DMA engine. In contrast, other research has examined complex, powerful NIs that integrate an integer processor core [28, 38] to offer higher performance at higher cost. While both simple and complex NIs are interesting, we concentrate on simple NIs where coherence has not yet been fully exploited.

• Third, we focus on program-controlled fine-grain communication between peer user processes, as required by demanding parallel computing applications. This includes notifying the receiving process that data is available without requiring an interrupt. In contrast, DMA devices send larger messages to remote memory, and only optionally notify the receiving process with a relatively heavy-weight interrupt.

We explore a class of coherent network interfaces (CNIs) that reside on a processor node's memory or coherent I/O bus and participate in the cache coherence protocol. CNIs interact with a coherent bus like Stanford DASH's RC/PCPU [30], but support messaging rather than distributed shared memory. CNIs communicate with the processor through two mechanisms: cachable device registers (CDRs) and cachable queues (CQs). A CDR, derived from cachable control registers [39, 40], is a coherent, cachable block of memory used to transfer status, control, or data between a device and a processor. In the common case of unchanging information, e.g., polling, a CDR removes unnecessary bus traffic because repeated accesses hit in the cache. When changes do occur, CDRs use the underlying coherence protocol to transfer messages a full cache block at a time. Cachable queues (CQs) are a new mechanism that generalizes CDRs from one cachable, coherent memory block to a contiguous region of cachable, coherent blocks managed as a circular queue to amortize control overheads. To maximize performance we exploit several critical optimizations: lazy pointers, message valid bits, and sense-reverse. Because CQs look, smell, and act like normal cachable memory, message send and receive overheads are extremely low: a cache miss plus several cache hits. Furthermore, if the system supports prefetching or an update-based coherence protocol, even the cache miss may be eliminated. Because CNIs transfer messages a cache block at a time, the sustainable bandwidth is much greater than conventional program-controlled NIs (such as the CM-5 NI [44]) that rely on slower uncachable loads and stores. For symmetric multiprocessors (SMPs), which are often limited by memory bus bandwidth, the reduced bus occupancy for access- [...]

[...] small 64-byte message by 37% and 125% respectively on a memory bus and 74% and 123% respectively on a coherent I/O bus. Experiments with five macrobenchmarks show that a CNI can improve the performance by 17-53% on the memory bus and 30-88% on the I/O bus.

We see our paper having two main contributions. First, we develop cachable queues, including using lazy pointers, message valid bits, and sense-reverse. Second, we do the first micro- and macro-benchmarks comparison of alternative CNIs (exposed by our taxonomy) with a conventional NI.

A weakness of this paper, however, is that we do not do an in-depth comparison of our proposals with DMA. The magnitude of this deficiency depends on how important one expects DMA to be compared to fine-grain communication in future systems. Some argue that DMA will become more important as techniques like User-level DMA [3] reduce DMA initiation overheads. Others argue DMA will become less important as processors add block copy instructions [42] (making the breakeven size for DMA larger) and as the marginal cost of adding another processor diminishes [48] (making it less expensive to temporarily waste a processor).

The rest of this paper describes CDRs and CQs in detail (Section 2), presents CNI taxonomy and implementations (Section 3), describes evaluation methodology (Section 4), analyzes results (Section 5), reviews related work (Section 6), and concludes (Section 7).
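
The introduction names three cachable-queue optimizations (lazy pointers, message valid bits, and sense-reverse) without defining them; the details are deferred to Section 2. As a rough illustration only, the following Python sketch models one plausible reading: each queue entry carries a valid bit that the receiver polls directly, the meaning of that bit flips on every pass around the circular queue so the receiver never has to write an entry to clear it, and the sender consults a possibly stale private copy of the receiver's progress, re-reading the real value only when the queue appears full. The class name `CachableQueue`, its fields, and the `head_reads` counter are hypothetical names chosen for this sketch; it models the bookkeeping in software and does not model caches, coherence traffic, or the paper's actual hardware.

```python
class CachableQueue:
    """Toy single-sender, single-receiver circular queue (illustrative only)."""

    def __init__(self, num_entries):
        self.n = num_entries
        # Each entry holds (sense bit, payload). All sense bits start at 0,
        # so nothing is valid on the first pass, where "valid" means 1.
        self.entries = [(0, None)] * num_entries
        # Sender-private state.
        self.sent = 0              # total messages enqueued so far
        self.shadow_received = 0   # lazy, possibly stale copy of `received`
        self.head_reads = 0        # how often the sender re-read `received`
        # Receiver-private state.
        self.received = 0          # total messages dequeued so far

    def _valid_sense(self, count):
        # Sense-reverse: the value meaning "valid" alternates 1, 0, 1, ...
        # on successive passes around the queue.
        return 1 - (count // self.n) % 2

    def send(self, payload):
        if self.sent - self.shadow_received == self.n:
            # Lazy pointer: refresh the stale copy only when it says "full".
            self.shadow_received = self.received
            self.head_reads += 1
            if self.sent - self.shadow_received == self.n:
                return False       # really full
        self.entries[self.sent % self.n] = (self._valid_sense(self.sent), payload)
        self.sent += 1
        return True

    def receive(self):
        # Message valid bit: poll the entry itself, not a tail pointer.
        sense, payload = self.entries[self.received % self.n]
        if sense != self._valid_sense(self.received):
            return None            # no new message in this slot yet
        self.received += 1         # the entry itself is never written here
        return payload


if __name__ == "__main__":
    q = CachableQueue(2)
    q.send("a")
    q.send("b")
    print(q.send("c"))    # False: both slots are occupied
    print(q.receive())    # a
    print(q.send("c"))    # True: slot 0 is reused with the opposite sense
```

In this model the receiver's poll of an empty slot reads an entry that has not changed, which corresponds to the cache-hit polling case described above, and the sender touches the receiver's counter only when its stale copy suggests the queue is full.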