Access Order and Effective Bandwidth for Streams on a Direct Rambus Memory


Sung I. Hong, Sally A. McKee†, Maximo H. Salinas, Robert H. Klenke, James H. Aylor, Wm. A. Wulf
Dept. of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22903
†Dept. of Computer Science, University of Utah, Salt Lake City, Utah 84112

Sung Hong's current address: Lockheed Martin Federal Systems, 9500 Godwin Dr., Manassas, VA 20110, [email protected].

Abstract

Processor speeds are increasing rapidly, and memory speeds are not keeping up. Streaming computations (such as multi-media or scientific applications) are among those whose performance is most limited by the memory bottleneck. Rambus hopes to bridge the processor/memory performance gap with a recently introduced DRAM that can deliver up to 1.6 Gbytes/sec. We analyze the performance of these interesting new memory devices on the inner loops of streaming computations, both for traditional memory controllers that treat all DRAM transactions as random cacheline accesses, and for controllers augmented with streaming hardware. For our benchmarks, we find that accessing unit-stride streams in cacheline bursts in the natural order of the computation exploits from 44-76% of the peak bandwidth of a memory system composed of a single Direct RDRAM device, and that accessing streams via a streaming mechanism with a simple access ordering scheme can improve performance by factors of 1.18 to 2.25. A system with streaming support outperforms a traditional Rambus system by factors of up to 2.25 for stride one and 2.20 for strides bigger than a cacheline.

1. Introduction

As processors continue to become faster and to consume more bandwidth, conventional DRAM memory systems will have more and more difficulty keeping up. The kinds of applications that are particularly affected by the growing processor-memory performance gap include scientific computations, multi-media codecs, encryption, signal processing, and text searching. Although data caches perform well for some access patterns, the vectors used in these streaming computations are normally much too large to cache in their entirety, and each element is typically visited only once during lengthy portions of the computation. This lack of temporal locality of reference makes caching less effective, and performance becomes limited by the speed of the memory system.

The new Direct Rambus DRAMs (RDRAMs) propose to bridge the current performance gap with a pipelined microarchitecture that allows direct control of all DRAM row and column resources concurrently with data transfer operations [21]. The RDRAM memory architecture merits study for several reasons: its interface, architecture, and timing are unique; it advertises a peak bandwidth of 1.6 Gbytes/sec, a significant improvement over that of other currently available memory devices; all of the top thirteen DRAM suppliers are actively developing Direct RDRAMs; and Intel has selected the Direct Rambus technology to become its next PC main-memory standard [7].

Like nearly all modern DRAMs, Direct RDRAMs implement a form of page mode operation. In page mode, memory devices behave as if implemented with a single line, or page, of cache on chip. A memory access falling outside the address range of the current DRAM page forces a new page to be accessed. The overhead time required to do this makes servicing such a request significantly slower than one that hits the current page. The order of requests affects the performance of all such components. Access order also affects bus utilization and how well the available parallelism can be exploited in memories with multiple banks.

These three observations — the inefficiency of traditional, dynamic caching for streaming computations; the high advertised bandwidth of Direct Rambus DRAMs; and the order-sensitive performance of modern DRAMs — motivated our investigation of a hardware streaming mechanism that dynamically reorders memory accesses in a Rambus-based memory system.

This paper explains how the details of the Direct Rambus interface affect sustained, streaming accesses, and presents analytic and simulation results for inner loops of streaming kernels performed on a memory system composed of a single Direct RDRAM device. We evaluate two memory interleaving schemes, deriving bounds on the percentage of available memory bandwidth exploited when accessing streams via (a) cacheline accesses in the natural order of the computation, and (b) streaming hardware that dynamically reorders accesses. We find that the former approach generally fails to exploit much of the potential memory bandwidth. Adding hardware support for streaming improves the performance of these Direct RDRAM systems, allowing computations on streams of a thousand or more elements to utilize nearly all of the available memory bandwidth.
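To make the notion of a streaming computation concrete, the sketch below shows a representative unit-stride kernel; it is our illustration, not a kernel listed in this excerpt. daxpy reads each element of x and y exactly once, so when the vectors are much larger than the cache, caching buys nothing and throughput is set by the memory system.

```c
/* daxpy: y[i] = a*x[i] + y[i], a typical unit-stride streaming kernel.
 * Each element is visited exactly once, so for vectors far larger than
 * the cache there is no temporal locality to exploit; the loop runs at
 * whatever rate memory can stream the two vectors. */
void daxpy(long n, double a, const double *x, double *y)
{
    for (long i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```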
2. Background

To put the analysis and results in Section 5 and Section 6 in perspective, this section describes basic Dynamic Random Access Memory (DRAM) organization and operation, compares and contrasts the timing parameters of several types of current DRAMs, and explains how the new Direct RDRAMs work.

2.1 DRAM basics

DRAM storage cell arrays are typically rectangular, and thus a data access sequence consists of a row access (RAS, or row address strobe signal) followed by one or more column accesses (CAS, or column address strobe signal). During RAS, the row address is presented to the DRAM. In page mode, data in the storage cells of the decoded row are moved into a bank of sense amplifiers (the "sense amps" or page buffer), which serves as a row cache. During CAS, the column address is decoded and the selected data is read from the sense amps. Consecutive accesses to the current row — called page hits — require only a CAS, allowing data to be accessed at the maximum frequency.

The key timing parameters used to analyze DRAM performance include the row-access time (tRAC), column-access time (tCAC), page-mode cycle time (tPC), and random read/write cycle time (tRC). Typical timing parameter values for various common DRAMs are given in Figure 1 [14][21].

                  Fast-Page Mode   EDO   Burst-EDO   SDRAM   Direct RDRAM
  tRAC (ns)             50          50       52        50         50
  tCAC (ns)             13          13       10         9         20
  tRC  (ns)             95          89       90       100         85
  tPC  (ns)             30          20       15        10         10*
  max freq (MHz)        33          50       66       100        400

  *The packet transfer time, since tPC does not apply here.

Figure 1: Typical DRAM timing parameters
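A back-of-the-envelope calculation shows why page hits matter so much. The sketch below is our simplification, not the paper's analytic model: it charges the first access to a row the full random read/write cycle time tRC and each subsequent page hit only tPC, using the fast-page mode column of Figure 1.

```c
/* Average access time for runs of k same-row accesses: the first
 * access in a run costs the full random cycle time t_RC, and the
 * remaining k-1 page hits cost only the page-mode cycle time t_PC.
 * Values are the fast-page mode column of Figure 1. */
#include <stdio.h>

int main(void)
{
    const double t_rc = 95.0; /* random read/write cycle time, ns */
    const double t_pc = 30.0; /* page-mode cycle time, ns */

    for (int k = 1; k <= 64; k *= 4) {
        double avg = (t_rc + (k - 1) * t_pc) / k;
        printf("%2d accesses per row: %5.1f ns average per access\n", k, avg);
    }
    return 0;
}
```

With one access per row, every reference costs 95 ns; at 64 accesses per row, the average approaches the 30 ns page-hit rate. Arranging requests so that long runs hit the same row is exactly the effect an access-ordering mechanism tries to engineer.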
Extended Data Out (EDO) DRAMs are similar to fast-page mode DRAMs, except that data buffers are added between the column decoder and the input/output buffer. These buffers permit data transfer to extend beyond the CAS signal de-assertion, allowing a faster page-cycle time. Burst-EDO DRAMs transfer larger blocks of data by incorporating an internal counter: after the memory controller transfers the initial address, this internal counter generates the subsequent memory addresses in the block. SDRAMs synchronize all inputs and outputs to a system clock, allowing an even faster page-cycle time.

2.2 Rambus DRAMs

Although the memory core — the banks and sense amps — of RDRAMs is similar to that of other DRAMs, the architecture and interface are unique. An RDRAM is actually an interleaved memory system integrated onto a single memory chip. Its pipelined microarchitecture supports up to four outstanding requests. Currently, all 64 Mbit RDRAMs incorporate at least eight independent banks of memory. Some RDRAM cores incorporate 16 banks in a "double bank" architecture, but two adjacent banks cannot be accessed simultaneously, making the total number of independent banks effectively eight [20].

First-generation Base RDRAMs use a 64-bit or 72-bit internal bus and a 64-to-8 or 72-to-9 bit multiplexer to deliver bandwidth of 500 to 600 Mbytes/sec. Second-generation Concurrent RDRAMs deliver the same peak bandwidth, but an improved protocol allows better bandwidth utilization by handling multiple concurrent transactions. Current, third-generation Direct RDRAMs double the external data bus width from 8/9 bits to 16/18 bits and increase the clock frequency from 250/300 MHz to 400 MHz.

A COL packet can also initiate a precharge operation. The smallest addressable data size is 128 bits (two 64-bit stream elements). The full memory bandwidth cannot be utilized unless all words in a DATA packet are used. Note the distinction between the RDRAM transfer rate (800 MHz), the RDRAM interface clock rate (400 MHz), and the packet transfer rate (100 MHz). All references to cycles in the following sections are in terms of the 400 MHz interface clock.

Figure 2 lists the minimum timings for a -50/-800 Direct RDRAM part. These are the key parameters that affect our study; for complete timing parameters and the most recent data sheets, see http://www.rambus.com/.

  Parameter   Definition                                                    Min
  tCYCLE      interface clock cycle time (400 MHz)                          2.5 ns
  tPACK       packet transfer time                                          4 tCYCLE (10 ns)
  tRCD        min interval between ROW & COL packets                        11 tCYCLE (27.5 ns)
  tRP         page precharge time: min interval between ROW precharge
              (PRER) & activate (ACT) packets                               10 tCYCLE (25 ns)
  tCPOL       column/precharge overlap: max overlap between last COL
              packet & start of row PRER                                    1 tCYCLE (2.5 ns max)
  tCAC        page hit latency: delay between start of COL packet &
              valid data                                                    8 tCYCLE (20 ns)
  tRAC        page miss latency: delay between start of ROW ACT request
              & valid data (tRCD + tCAC + 1 extra cycle)                    20 tCYCLE (50 ns)
  tRC         page miss cycle time: min interval between successive ROW
              ACT requests (random read/write cycle time for a single
              device)                                                       34 tCYCLE (85 ns)
  tRR         row/row packet delay: min delay between consecutive ROW
              accesses to the same RDRAM device                             8 tCYCLE (20 ns)
  tRDLY       roundtrip bus delay: added to read page-hit times, since
              the DATA packet travels in the opposite direction of
              commands (no delay for writes)                                2 tCYCLE (5 ns)
  tRW         read/write bus turnaround: interval between writing &
              reading (tPACK + tRDLY)                                       6 tCYCLE (15 ns)

Figure 2: RDRAM timing parameter definitions
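The Figure 2 parameters bound how much of the 1.6 Gbytes/sec peak a stream can sustain. As a rough illustration (our simplification; the paper's analytic bounds in later sections also account for interleaving, bus turnaround, and bank conflicts), the sketch below charges each row activation a tRAC = 50 ns stall and each 16-byte DATA packet tPACK = 10 ns:

```c
/* Rough sustained-bandwidth estimate for a single Direct RDRAM device:
 * each row activation stalls for t_RAC before data flows, then each
 * 16-byte DATA packet takes t_PACK (16 bytes / 10 ns gives the
 * 1.6 Gbytes/sec peak).  k is the number of DATA packets pulled from
 * each open page. */
#include <stdio.h>

int main(void)
{
    const double t_rac  = 50.0;  /* page-miss latency, ns (Figure 2) */
    const double t_pack = 10.0;  /* DATA packet transfer time, ns    */
    const double bytes_per_packet = 16.0;
    const double peak = bytes_per_packet / t_pack;   /* bytes/ns = Gbytes/sec */

    for (int k = 1; k <= 128; k *= 2) {
        double bw = k * bytes_per_packet / (t_rac + k * t_pack);
        printf("%3d packets per row: %.2f Gbytes/sec (%2.0f%% of peak)\n",
               k, bw, 100.0 * bw / peak);
    }
    return 0;
}
```

Under this simple model, a stream that drains many packets per open page approaches the peak, while one that forces a row activation per cacheline stays far below it; that gap is what the streaming hardware studied in this paper exploits.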