Controller for a Synchronous DRAM That Maximizes Throughput by Allowing Memory

Total Page:16

File Type:pdf, Size:1020Kb

Controller for a Synchronous DRAM That Maximizes Throughput by Allowing Memory TIIIIIHINII US005630096A United States Patent [191 [11] Patent Number: 5 ,630,096 Zuravletf et al. [45] Date of Patent: May 13,1997 [54] CONTROLLER FOR A SYNCHRONOUS FOREIGN PATENT DOCUMENTS DRAM THAT MAXIMIZES THROUGHPUT BY ALLOWING MEMORY REQUESTS AND 549139 6/1993 European Pat. Off. COMMANDS TO BE ISSUED OUT OF OTHER PUBLICATIONS ORDER IBM Technical Disclosure Bulletin, vol. 33, No. 6A, pp. [75] Inventors: William K. Zuravlelf, Mountajnview; 269-272, Nov. 1990. Timothy RObillSOll, BOuld6I Creek, IBM Technical Disclosure Bulletin, vol. 33, No. 6A, pp. both of Calif. 265-266, Nov. 1990. _ _ _ _ IBM Technical Disclosure Bulletin, vol. 31, No. 9, pp. [73] Assrgnee: Microunity Systems Engineering, Inc., 351454, Feb 1989 Sunnyvale’ Cahf' IBM Technical Disclosure Bulletin, vol. 33, No. 3A, pp. 441-442, Aug. 1990. [21] APPI' NO‘: 43737 5 Micron Semiconductor, Inc., Spec Sheet. MT48LC2M8S1 [22] Filed: May 10’ 1995 S, cover page and pp. 33 and 37, 1993. [5 1] Int. Cl.6 . .. G06F 13/00 Primary Examiner—Frank J . Asta [52] US. Cl. ................................... .. 395/481; 364/DIG. 1; A?vmey, Agent vrFim1—B11n1S,D0ane,Sw¢cker & Mathis 365/233; 395/432; 395/477; 395/478; 395/485; 395/494; 395/496 [57] ABSTRACT [58] Field of Search ........................... .. 365/233; 395/432, A controller for a synchronous DRAM is provided for 395/477, 478, 481, 485, 494, 496 maximizing throughput of memory requests to the synchro . nous DRAM. The controller maintains the spacing between ~ [56] References cued the cormnands to conform with the speci?cations for the 5TB CU} [E synchronous DRAMs while preventing gaps from occurring U'S' P NT DO NTS in the data slots to the synchronous DRAM. Furthermore, 4,685,088 8/1987 Iannucci . the controller allows memory requests and commands to be 4,691,302 9/ 1937 Mawmusch - issued out of order so that the throughput may be maximized 4,734,833 3/1933 Tide" 3 by overlapping required operations which do not speci?cally 4740924 4/192: 361m ' involve data transfer. To achieve this maximized throughput, Hgsdtemon ‘ memory requests are tagged for indicating a sending order. , , ee et a1. 5,179,667 V1993 Iyer_ Thereafter, the memory requests may be arbrtrated when 5,193,193 3,1993 Iyer _ con?icting memory requests are queued and this arbltratlon 5 253 ,214 10/1993 Henmm . process is then decoded for simultaneously updating sched 5,253,357 10/1993 Allen eta1.. uling constraints. The memory requests may be further 5,276,856 1/1994 Norsworthy et all . quali?ed based on the scheduling constraints and a com 5278967 1/ 1994 . mand stack of memory request is then developed for modi 5233377 2/ 1994 G?smlelet 31- - fying update queues. The controller also functions by receiv gtliiai‘tlfe't a1 ing a controller clock signal and generating an SDRAM 5,311,483 5/1994 Takasugi . ' ' clock s1gnal by d1v1d1ng tl'ns controller clock signal. 5,381,536 l/l995 Phelps et al. ......................... .. 395/375 5,513,148 4/1996 Zagar .................................... .. 365/233 20 Claims, 6 Drawing Sheets -100 f 10 f 2Q 40 / 50 K BANK DATA PATH 111%’ DRIVER sum BANK DATA PATH RETURN DATA OUT <,.._‘—..__ DATA PATH <3: CONTROLLER CONTROL CLOCK BLOCK US. Patent May 13, 1997 Sheet 1 0f 6 5,630,096 [10 [2Q 40 [so [100 BANK DATA PATH | ( m 5%» ' DRIVER SDRAM DATABANK PATH I / OUT : REIURNDATA PATHDATA <:" coumoum , CONTROL CLOCK —" BLOCK W 210 W 220 K 23 O / 240 COMMAND BANK BANK C-SWE CONSTANT — UPDAIE QUAUHCATION ARBITRAHON UPDATE UNIT UNAT UNIT UNAT A '. I '. L -_ ' _ -_ B—STATE(n) ' -_ FIG. 2 US. Patent May 13, 1997 Sheet 2 0f 6 5,630,096 310 314 f 300 16MB sumf f 300 6 4“ 8 SigAM D 3 2/ 16/ A CLK D 3/2 161/ D A‘ m PROCESSORMICRO- ‘X; PROCESSORMICRO- f 316 16 64MB sum 0 A cu< FIG. 3(a) FIG. 3(b) / 300 [- 320 / 300 /- 322 D -E/-X 16MB SDRAM D ?f-X 64MB SDRAM mm / ‘ D A CU‘ MICRO- H“ A CLK PROCESSOR A/@ __—J PROCESSOR A/C _—f cu< CLK FIG. 3(0) FIG. 3(d) U.S. Patent May 13, 1997 Sheet 3 of 6 5,630,096 300 / 300 f 52, 8 0 n mm MICRO PROCESSOR A/C PROCESSOR A/C cu< CLK FIG. 3(e) FIG. / 300 300 354 [-350 16 16MB SDRAM X D a MiCRO D A (1K MICRO PROCESSOR A/c PROCESSOR A/(j 352\Ll CLK 16MB SDRAM D A (1K FIG. 3(9) FIG. 3(h) U.S. Patent May 13, 1997 Sheet 6 0f 6 5,630,096 Q=Zz .rJ.\L.J A_ __z= 53222231g@880;8888 ImILJNNENNNx.x.H..x.E28 xx;OiXXX \Q..w15.c= xxxxxxxxxmfWm@20880 E;2:2;lgzigfoi2%:955Eggnazfi2102 xgwiEggragga:08“82x3::8:g#22iE as;22;@023205222xxx; E8;XXX c_|_ XXXXXXXXZxxxxxxxxixxxxxxé hikigégkg::_C 1.1.? o Emma 25m V620$358 8E581 mam5%.156E200 60.6235w $2,528 GE2.5&8 SE65Sm21:51156528 E56mam21351$50528 mam£55m156E200 82: 5,630,096 1 2 CONTROLLER FOR A SYNCHRONOUS and meeting JEDEC standards for synchronous DRAMs so DRAM THAT MAXIMIZES THROUGHPUT that versatile synchronous DRAMs may be provided and BY ALLOWING MEMORY REQUESTS AND applied in many design applications. COMMANDS TO BE ISSUED OUT OF ORDER SUMMARY BACKGROUND An object of the present invention is to control a syn chronous DRAM by interfacing an external device, such as The present invention is directed to a controller for a microprocessor. for reading and writing to the synchronous maximizing throughput of memory requests from an exter DRAM. nal device to a synchronous DRAM. More particularly, the Another object of the present invention is to provide a present invention is directed to a controller which prioritizes controller for a synchronous DRAM which can buffer and multiple memory requests from the external device and process multiple memory requests so that greater issues reordered memory requests to the synchronous throughput, or an equivalent throughput at less latency, can DRAM so that the throughput from the external device to the be achieved by the synchronous DRAM. synchronous DRAM is maximized. A still further object of the present invention is to provide Synchronous DRAMs are relatively new devices which a controller for is suing and completing requests out of order are similar to conventional DRAMs but the synchronous with respect to the received or issued order so that the DRAMs have some important differences. The architecture throughput of the synchronous DRAM is improved by of the synchronous DRAMs is similar to conventional 20 overlapping required operations, which do not speci?cally DRAMs. For instance, the synchronous DRAMs have mul involve data transfer, with operations involving data trans tiplexed address pins, control pins such as RAS, CAS, CS. fer. WE. and bidirectional data pins. Also, the synchronous DRAMs activate a page as does the conventional DRAM A still further object of the present invention is to provide a controller for a synchronous DRAM that schedules and then subsequent accesses to that page occur faster. Accordingly, a precharge operation must be performed 25 memory request commands as closely together as possible before another page is activated. within the timing constraints of the synchronous DRAM so that the throughput of the memory requests is maximized. One di?erence between synchronous DRAMs and con ventional DRAMs is that all input signals are required to These objects of the present invention are ful?lled by have a set-up and hold time with respect to the clock input providing a controller for a synchronous DRAM comprising in synchronous DRAMs. The hold time is referenced to the a sorting unit for receiving memory requests and sorting said same clock input. The outputs also have a clock to output memory requests based on their addresses and a throughput delay referenced to the same clock. Thereby, the synchro maximizing unit for processing said memory requests to the nous characteristics are provided. Furthermore, the synchro synchronous DRAM in response to scheduling which maxi mizes the use of data slots by the synchronous DRAM. The nous DRAMs are pipelined which means that the latency is 35 generally greater than one clock cycle. As a result, second controller is able to prioritize and issue multiple requests to and third synchronous DRAM commands can be sent before the synchronous DRAM in a different order than was the data from the original write request arrives at the received or issued such that the out of order memory synchronous DRAM. Also, the synchronous DRAMs have requests improve the throughput to the synchronous DRAM. two internal banks of data paths which generally correspond In particular, the controller issues memory requests as to separate memory arrays sharing I/O pins. The two internal closely together as possible while maintaining the timing banks of memory paths are a JEDEC standard for synchro constraints of the synchronous DRAM based on its speci nous DRAMs. An example of a known synchronous DRAM ?cations. is a 2 MEG><8 SDRAM from Micron Semiconductor, Inc._. The objects of the present invention are also ful?lled by providing a method for controlling a synchronous DRAM model no. MT48LC2M8SS1S. 45 In the synchronous DRAMs, almost all I/O timings are comprising the steps of receiving memory requests and referenced to the input clock. Minimum parameters such as sorting said memory requests based on their addresses, and CAS latency remain but are transformed from electrical maximizing throughput of said memory requests to the timing requirements to logical requirements so that they are synchronous DRAM so that use of data slots by the syn chronous DRAM is maximized.
Recommended publications
  • This Thesis Has Been Submitted in Fulfilment of the Requirements for a Postgraduate Degree (E.G
    This thesis has been submitted in fulfilment of the requirements for a postgraduate degree (e.g. PhD, MPhil, DClinPsychol) at the University of Edinburgh. Please note the following terms and conditions of use: This work is protected by copyright and other intellectual property rights, which are retained by the thesis author, unless otherwise stated. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author. The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author. When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given. Towards the Development of Flexible, Reliable, Reconfigurable, and High- Performance Imaging Systems by Jalal Khalifat A thesis submitted in partial fulfilment of the requirements for the degree of DOCTOR OF PHILOSOPHY The University of Edinburgh May 2016 Declaration I hereby declare that this thesis was composed and originated entirely by myself except where explicitly stated in the text, and that this work has not been submitted for any other degree or professional qualifications. Jalal Khalifat May 2016 Edinburgh, U.K. I Acknowledgements All praises are due to Allah, the most gracious, the most merciful, after three and a half years of continuous work, this work comes to end and it is the moment that I have to thank all the people who have supported me throughout my PhD study and who have contributed to the completion of this thesis.
    [Show full text]
  • Performance of Image and Video Processing with General-Purpose
    Performance of Image and Video Processing with General-Purpose Processors and Media ISA Extensions y Parthasarathy Ranganathan , Sarita Adve , and Norman P. Jouppi Electrical and Computer Engineering y Western Research Laboratory Rice University Compaq Computer Corporation g fparthas,sarita @rice.edu [email protected] Abstract Media processing refers to the computing required for the creation, encoding/decoding, processing, display, and com- This paper aims to provide a quantitative understanding munication of digital multimedia information such as im- of the performance of image and video processing applica- ages, audio, video, and graphics. The last few years tions on general-purpose processors, without and with me- have seen significant advances in this area, but the true dia ISA extensions. We use detailed simulation of 12 bench- promise of media processing will be seen only when ap- marks to study the effectiveness of current architectural fea- plications such as collaborative teleconferencing, distance tures and identify future challenges for these workloads. learning, and high-quality media-rich content channels ap- Our results show that conventional techniques in current pear in ubiquitously available commodity systems. Fur- processors to enhance instruction-level parallelism (ILP) ther out, advanced human-computer interfaces, telepres- provide a factor of 2.3X to 4.2X performance improve- ence, and immersive and interactive virtual environments ment. The Sun VIS media ISA extensions provide an ad- hold even greater promise. ditional 1.1X to 4.2X performance improvement. The ILP One obstacle in achieving this promise is the high com- features and media ISA extensions significantly reduce the putational demands imposed by these applications.
    [Show full text]
  • On Implementation of MPEG-2 Like Real-Time Parallel Media Applications on MDSP Soc Cradle Architecture
    On Implementation of MPEG-2 like Real-Time Parallel Media Applications on MDSP SoC Cradle Architecture Ganesh Yadav1, R. K. Singh2, and Vipin Chaudhary1 1 Dept. of Computer Science, Wayne State University fganesh@cs.,[email protected] 2 Cradle Technologies, [email protected] Abstract. In this paper we highlight the suitability of MDSP 3 architec- ture to exploit the data, algorithmic, and pipeline parallelism o®ered by video processing algorithms like the MPEG-2 for real-time performance. Most existing implementations extract either data or pipeline parallelism along with Instruction Level Parallelism (ILP) in their implementations. We discuss the design of MP@ML decoding system on shared memory MDSP platform and give insights on building larger systems like HDTV. We also highlight how the processor scalability is exploited. Software implementation of video decompression algorithms provides flexibility, but at the cost of being CPU intensive. Hardware implementations have a large development cycle and current VLIW dsp architectures are less flexible. MDSP platform o®ered us the flexibilty to design a system which could scale from four MSPs (Media Stream Processor is a logical clus- ter of one RISC and two DSP processors) to eight MSPs and build a single-chip solution including the IO interfaces for video/audio output. The system has been tested on CRA2003 board. Speci¯c contributions include the multiple VLD algorithm and other heuristic approaches like early-termination IDCT for fast video decoding. 1 Introduction Software programmable SoC architectures eliminate the need for designing ded- icated hardware accelerators for each standard we want to work with. With the rapid evolution of standards like MPEG-2, MPEG-4, H.264, etc.
    [Show full text]
  • XOS Advanced Media Processor
    XOS Advanced Media Processor The XOS Advanced Media Processor is a high performance live media processor for broadcast and streaming applications. Key Business Benefits Application versatility The XOS Advanced Media Processor is the latest generation of Harmonic software-based video appliances. XOS can be used for either broadcast or streaming applications, and is adapted to multiple deployment environments. Classic infrastructures are supported with SDI, ASI, and satellite RF interfaces. Full-IP architectures are also supported: XOS handles MPEG compressed formats, as well as the latest SMPTE ST 2022-6 and SMPTE ST 2110 standards. GRAPHICS TRANSCODE STATMUX MULTIPLEXING ENCRYPT SDI TS over IP PACKAGING SAT RECEPTION OTT INGEST DECRYPT 2022-6 BROADCAST 2110 DVB-S2X XOS HLS STREAMING XOS Advanced Media Processor Inputs and Functionality XOS is packed with features to address any kind of video processing application. In addition to its market-leading compression engine, XOS integrates a broad range of audio codecs, including Dolby AC-4, an advanced video pre-processing engine, a broadcast multiplexer with statmux support, and a state-of-the-art packager for streaming applications. From decoding to encoding, from HDR processing to audio loudness management, Harmonic has you covered. Improved cost of ownership XOS Advanced Media Processor’s unparalleled function integration and performance dramatically reduce the number of appliances required for a given application, significantly improving your cost of ownership. As a software solution, XOS
    [Show full text]
  • A Bibliography of Publications in IEEE Micro
    A Bibliography of Publications in IEEE Micro Nelson H. F. Beebe University of Utah Department of Mathematics, 110 LCB 155 S 1400 E RM 233 Salt Lake City, UT 84112-0090 USA Tel: +1 801 581 5254 FAX: +1 801 581 4148 E-mail: [email protected], [email protected], [email protected] (Internet) WWW URL: http://www.math.utah.edu/~beebe/ 16 September 2021 Version 2.108 Title word cross-reference -Core [MAT+18]. -Cubes [YW94]. -D [ASX19, BWMS19, DDG+19, Joh19c, PZB+19, ZSS+19]. -nm [ABG+16, KBN16, TKI+14]. #1 [Kah93i]. 0.18-Micron [HBd+99]. 0.9-micron + [Ano02d]. 000-fps [KII09]. 000-Processor $1 [Ano17-58, Ano17-59]. 12 [MAT 18]. 16 + + [ABG+16]. 2 [DTH+95]. 21=2 [Ste00a]. 28 [BSP 17]. 024-Core [JJK 11]. [KBN16]. 3 [ASX19, Alt14e, Ano96o, + AOYS95, BWMS19, CMAS11, DDG+19, 1 [Ano98s, BH15, Bre10, PFC 02a, Ste02a, + + Ste14a]. 1-GHz [Ano98s]. 1-terabits DFG 13, Joh19c, LXB07, LX10, MKT 13, + MAS+07, PMM15, PZB+19, SYW+14, [MIM 97]. 10 [Loc03]. 10-Gigabit SCSR93, VPV12, WLF+08, ZSS+19]. 60 [Gad07, HcF04]. 100 [TKI+14]. < [BMM15]. > [BMM15]. 2 [Kir84a, Pat84, PSW91, YSMH91, ZACM14]. [WHCK18]. 3 [KBW95]. II [BAH+05]. ∆ 100-Mops [PSW91]. 1000 [ES84]. 11- + [Lyl04]. 11/780 [Abr83]. 115 [JBF94]. [MKG 20]. k [Eng00j]. µ + [AT93, Dia95c, TS95]. N [YW94]. x 11FO4 [ASD 05]. 12 [And82a]. [DTB01, Dur96, SS05]. 12-DSP [Dur96]. 1284 [Dia94b]. 1284-1994 [Dia94b]. 13 * [CCD+82]. [KW02]. 1394 [SB00]. 1394-1955 [Dia96d]. 1 2 14 [WD03]. 15 [FD04]. 15-Billion-Dollar [KR19a].
    [Show full text]
  • The Bus Interface for 32-Bit Microprocessors That
    Freescale Semiconductor, Inc. MPC60XBUSRM 1/2004 Rev. 0.1 . c n I , r o t c u d n o c i PowerPC™ Microprocessor m e S Family: e l a c The Bus Interface for 32-Bit s e Microprocessors e r F For More Information On This Product, Go to: www.freescale.com Freescale Semiconductor, Inc... Freescale Semiconductor,Inc. F o r M o r G e o I n t f o o : r w m w a t w i o . f n r e O e n s c T a h l i e s . c P o r o m d u c t , Freescale Semiconductor, Inc. Overview 1 Signal Descriptions 2 Memory Access Protocol 3 Memory Coherency 4 . System Status Signals 5 . c n I Additional Bus Configurations 6 , r o t Direct-Store Interface 7 c u d n System Considerations 8 o c i m Processor Summary A e S e Processor Clocking Overview B l a c s Processor Upgrade Suggestions C e e r F L2 Considerations for the PowerPC 604 Processor D Coherency Action Tables E Glossary of Terms and Abbreviations GLO Index IND For More Information On This Product, Go to: www.freescale.com Freescale Semiconductor, Inc. 1 Overview 2 Signal Descriptions 3 Memory Access Protocol 4 Memory Coherency . 5 System Status Signals . c n I 6 Additional Bus Configurations , r o t 7 Direct-Store Interface c u d n 8 System Considerations o c i m A Processor Summary e S e B Processor Clocking Overview l a c s C Processor Upgrade Suggestions e e r F D L2 Considerations for the PowerPC 604 Processor E Coherency Action Tables GLO Glossary of Terms and Abbreviations IND Index For More Information On This Product, Go to: www.freescale.com Freescale Semiconductor, Inc.
    [Show full text]
  • Architectural Trade-Offs in a Latency Tolerant Gallium Arsenide Microprocessor
    Architectural Trade-offs in a Latency Tolerant Gallium Arsenide Microprocessor by Michael D. Upton A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical Engineering) in The University of Michigan 1996 Doctoral Committee: Associate Professor Richard B. Brown, CoChairperson Professor Trevor N. Mudge, CoChairperson Associate Professor Myron Campbell Professor Edward S. Davidson Professor Yale N. Patt © Michael D. Upton 1996 All Rights Reserved DEDICATION To Kelly, Without whose support this work may not have been started, would not have been enjoyed, and could not have been completed. Thank you for your continual support and encouragement. ii ACKNOWLEDGEMENTS Many people, both at Michigan and elsewhere, were instrumental in the completion of this work. I would like to thank my co-chairs, Richard Brown and Trevor Mudge, first for attracting me to Michigan, and then for allowing our group the freedom to explore many different ideas in architecture and circuit design. Their guidance and motivation combined to make this a truly memorable experience. I am also grateful to each of my other dissertation committee members: Ed Davidson, Yale Patt, and Myron Campbell. The support and encouragement of the other faculty on the project, Karem Sakallah and Ron Lomax, is also gratefully acknowledged. My friends and former colleagues Mark Rossman, Steve Sugiyama, Ray Farbarik, Tom Rossman and Kendall Russell were always willing to lend their assistance. Richard Oettel continually reminded me of the valuable support of friends and family, and the importance of having fun in your work. Our corporate sponsors: Cascade Design Automation, Chronologic, Cadence, and Metasoft, provided software and support that made this work possible.
    [Show full text]
  • A Media Enhanced Vector Architecture for Embedded Memory Systems
    A Media-Enhanced Vector Architecture for Embedded Memory Systems Christoforos Kozyrakis Report No. UCB/CSD-99-1059 July 1999 Computer Science Division (EECS) University of California Berkeley, California 94720 A Media-Enhanced Vector Architecture for Embedded Memory Systems by Christoforos Kozyrakis Research Project Submitted to the Department of Electrical Engineering and Computer Sciences, Uni- versity of California at Berkeley, in partial satisfaction of the requirements for the degree of Master of Science, Plan II. Approval for the Report and Comprehensive Examination: Committee: Professor David A. Patterson Research Advisor (Date) ******* Professor Katherine Yelick Second Reader (Date) A Media-Enhanced Vector Architecture for Embedded Memory Systems Christoforos Kozyrakis M.S. Report Abstract Next generation portable devices will require processors with both low energy consump- tion and high performance for media functions. At the same time, modern CMOS technology creates the need for highly scalable VLSI architectures. Conventional processor architec- tures fail to meet these requirements. This paper presents the architecture of Vector IRAM (VIRAM), a processor that combines vector processing with embedded DRAM technology. Vector processing achieves high multimedia performance with simple hardware, while em- bedded DRAM provides high memory bandwidth at low energy consumption. VIRAM pro- vides ¯exible support for media data types, short vectors, and DSP features. The vector pipeline is enhanced to hide DRAM latency without using caches. The peak performance is 3.2 GFLOPS (single precision) and maximum memory bandwidth is 25.6 GBytes/s. With a target power consumption of 2 Watts for the vector pipeline and the memory system, VIRAM supports 1.6 GFLOPS/Watt. For a set of representative media kernels, VIRAM sustains on average 88% of its peak performance, outperforming conventional SIMD media extensions and DSP processors by factors of 4.5 to 17.
    [Show full text]
  • Ilore: Discovering a Lineage of Microprocessors
    iLORE: Discovering a Lineage of Microprocessors Samuel Lewis Furman Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Master of Science in Computer Science & Applications Kirk Cameron, Chair Godmar Back Margaret Ellis May 24, 2021 Blacksburg, Virginia Keywords: Computer history, systems, computer architecture, microprocessors Copyright 2021, Samuel Lewis Furman iLORE: Discovering a Lineage of Microprocessors Samuel Lewis Furman (ABSTRACT) Researchers, benchmarking organizations, and hardware manufacturers maintain repositories of computer component and performance information. However, this data is split across many isolated sources and is stored in a form that is not conducive to analysis. A centralized repository of said data would arm stakeholders across industry and academia with a tool to more quantitatively understand the history of computing. We propose iLORE, a data model designed to represent intricate relationships between computer system benchmarks and computer components. We detail the methods we used to implement and populate the iLORE data model using data harvested from publicly available sources. Finally, we demonstrate the validity and utility of our iLORE implementation through an analysis of the characteristics and lineage of commercial microprocessors. We encourage the research community to interact with our data and visualizations at csgenome.org. iLORE: Discovering a Lineage of Microprocessors Samuel Lewis Furman (GENERAL AUDIENCE ABSTRACT) Researchers, benchmarking organizations, and hardware manufacturers maintain repositories of computer component and performance information. However, this data is split across many isolated sources and is stored in a form that is not conducive to analysis. A centralized repository of said data would arm stakeholders across industry and academia with a tool to more quantitatively understand the history of computing.
    [Show full text]
  • Scalable Vector Media-Processors for Embedded Systems
    Scalable Vector Media-processors for Embedded Systems Christoforos Kozyrakis Report No. UCB/CSD-02-1183 May 2002 Computer Science Division (EECS) University of California Berkeley, California 94720 Scalable Vector Media-processors for Embedded Systems by Christoforos Kozyrakis Grad. (University of Crete, Greece) 1996 M.S. (University of California at Berkeley) 1999 A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the GRADUATE DIVISION of the UNIVERSITY of CALIFORNIA at BERKELEY Committee in charge: Professor David A. Patterson, Chair Professor Katherine Yelick Professor John Chuang Spring 2002 Scalable Vector Media-processors for Embedded Systems Copyright Spring 2002 by Christoforos Kozyrakis Abstract Scalable Vector Media-processors for Embedded Systems by Christoforos Kozyrakis Doctor of Philosophy in Computer Science University of California at Berkeley Professor David A. Patterson, Chair Over the past twenty years, processor designers have concentrated on superscalar and VLIW architectures that exploit the instruction-level parallelism (ILP) available in engineering applications for workstation systems. Recently, however, the focus in computing has shifted from engineering to multimedia applications and from workstations to embedded systems. In this new computing environment, the performance, energy consumption, and development cost of ILP processors renders them ineffective despite their theoretical generality. This thesis focuses on the development of efficient architectures for embedded multimedia systems. We argue that it is possible to design processors that deliver high performance, have low energy consumption, and are simple to implement. The basis for the argument is the ability of vector architectures to exploit efficiently the data-level parallelism in multimedia applications.
    [Show full text]
  • Mpact R Media Processor
    Mpact TM R Media Processor by Chromatic Research Preliminary Data Sheet Features Description The new Mpact R media processor with a higher clock An Mpact media processor is a high performance copro- rate and an integrated display RAMDAC, continues to cessor for Windows 95 based PCs. Loaded with Mpact keep the Mpact product line the complete PC multimedia mediaware modules, the media processor adds or en- solution with the highest performance, integration and hances support for all seven multimedia functions: flexibility, and the lowest total cost. • Video: MPEG-1 & -2 decode, MPEG-1 encode • Complete • 2D graphics: Full VGA, SVGA support, acceleration of video • A total hardware solution for all multimedia ports: display playback and GUI through GDI and DirectDraw monitor, audio, telephone, video and joystick/MIDI • 3D graphics: Full 3D acceleration through Direct3D • Integrated software for all major multimedia functions on an • Digital Audio: Industry-standard Sound Card compatibility, x86 Windows 95 PC 3D Audio (SRS) and 3D positional audio effects • High Performance (DirectSound), Dolby AC-3Ô and MPEG-1 decode and • Very Long Instruction Word (VLIW) processor with 3.6 billion Wave Table and Waveguide synthesis operations per second peak in the Mpact R/3600 version • Fax/modem: Data up to 33.6 kb/s, fax up to 14.4 kb/s and • 5 concurrent I/O and memory controllers DSVD (Digital Simultaneous Voice and Data) • 132 MB/s PCI bus mastering interface • Telephony: Full-duplex speakerphone, Voicemail and caller • Concurrent operation with MRK multitasking real-time kernel ID • Dynamically shared processing with x86 host for highest • Videophone: H.324 over POTS and H.320 over ISDN total system efficiency using the MRM resource manager An Mpact media processor uses an optimized real-time • Flexible multitasking kernel to simultaneously execute these • Loadable mediaware modules provide the latest multimedia functionality using evolving APIs and operating system Mpact mediaware modules.
    [Show full text]
  • Organization of the Motorola 88110 Superscalar RISC Microprocessor
    Organization of the Motorola 88110 Superscalar RISC Microprocessor Motorola’s second-generation RISC microprocessor employs advanced techniques for exploit- ing instruction-level parallelism, including superscak instruction issue, out-of-order instruc- tion completion, speculative execution, dynamic hx&ruction rescheduling, and two parallel, high-bandwidth, on-chip caches. Designed to serve as the central processor in low-cost per- sonal computers and workstations, the 88110 supports demanding graphics and digital signal processing applications. Keith Diefendorff otorola designers conceived of the puters and workstation systems. Thus, our de- 88000 RISC (reduced instruction- sign objective was good performance at a given Michael Allen set computer) architecture to cost, rather than ultimate performance at any cost. Dl simplify construction of micropro- We recognized that the personal computer envi- Motorola cessors capable of exploiting high degrees of in- ronment is moving toward highly interactive soft- struction-level parallelism without sacrificing clock ware, user-oriented interfaces, voice and image speed. The designers held architectural complexity processing, and advanced graphics and video, to a minimum to eliminate pipeline bottlenecks all of which would place extremely high demands and remove limitations on concurrent instruction on integer, floating-point, and graphics process- execution. ing capabilities. At the same time, we realized The 88100/200 is the first implementation of the 88110 would have to meet these performance the 88000 architecture. It is a three-chip set, re- demands while operating with the inexpensive quiring one CPU (88100) chip and two (or more> DRAM systems typically found in low-cost per- cache (88200) chips. The CPU’s simple scalar sonal computers.
    [Show full text]