Low Power DRAM Evolution
Osamu Nagashima, Executive Professional, Micron Memory Japan


JEDEC Mobile & IOT Forum
Copyright © 2016 Micron Technology, Inc.

How We Got Here
• Low Power DRAM evolved from a lower-voltage, lower-performance version of PC-DRAM designed for mobile packages to become one of the highest bandwidth-per-pin DRAMs available
• High-resolution displays, high-resolution cameras, and 3D-rendered content are the primary drivers for increased bandwidth in mobile devices
[Chart: Mainstream DRAM Data Rate by Type and Year of Introduction; data rate in Mbps (0 to 4500) by year (2006 to 2017), LPDDR vs. PC-DDR]

Evolution of Mainstream DRAM Energy
[Chart: Power Evolution; energy in pJ/bit (0 to 50) vs. data rate in Mbps (500 to 4500) for DDR2, DDR3, DDR4, LPDDR2, LPDDR3, and LPDDR4]

Typical Mobile Device Usage
• The percentage of active usage has greatly increased in recent years, driving an increase in memory bandwidth
• This has shifted limitations from standby battery life to active battery life and thermal limits
[Chart: stacked energy breakdown for a heavy user vs. a light user, split into read, write, activate, standby, refresh, and self-refresh energy]

Near-Term Future
• This evolution of system limitations is driving future LPDRAM architectures, beginning with the evolution of the LPDDR4 standard
• Responding to the need for lower power, JEDEC is developing a reduced-I/O-power version of LPDDR4, called LPDDR4X
• LPDDR4X will reduce the Vddq level from 1.1 V to 0.6 V
• Signaling swing will remain similar to LPDDR4
  – This allows the same receiver designs and specifications to be used for both LPDDR4 and LPDDR4X

LPDDR4X: I/O Energy Reduction
• Reducing Vddq from 1.1 V to 0.6 V produces about 40% I/O energy savings
[Chart: LPDDR4 vs. LPDDR4X I/O Energy, Including Pre-Driver; I/O energy in pJ/bit (0 to 4) vs. data rate in Mbps (1000 to 4500) for 60 Ohm, 120 Ohm, and unterminated cases; LPDDR4X shows a ~40% decrease]

LPDDR4X I/O
• LPDDR4X reduces I/O channel energy to where it can sometimes be lower than pre-driver energy
• With LPDDR4X we've reached the point where further reductions in Vddq offer limited opportunity for energy savings
  – The DRAM Rx limits the minimum signal swing
  – Significant reductions in channel loading would be required to reduce pre-driver energy
  – The DRAM core energy is now much larger than DRAM I/O energy

LPDDR4X: I/O Energy Breakdown
• Notice that at the most-efficient operating points (toward the right end of each line), the pre-driver energy is comparable to the channel energy
  – Reducing Vddq below 0.6 V will have limited impact, and may even increase total I/O energy
[Chart: LPDDR4X I/O Energy, Channel+Termination vs. Pre-Driver; I/O energy in pJ/bit (0 to 1.5) vs. data rate in Mbps (1000 to 4500) for 60 Ohm, 120 Ohm, and unterminated cases]

I/O vs. DRAM Core Energy
• LPDDR4X energy is dominated by the core
• Future energy reductions should focus on core efficiency in order to be significant
[Chart: LPDDR4X Core Energy vs. I/O Energy; energy in pJ/bit (0 to 8) vs. data rate in Mbps (1000 to 4500) for the core and for 48 Ohm, 60 Ohm, 120 Ohm, and unterminated I/O]
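To see why dropping Vddq from 1.1 V to 0.6 V cuts total I/O energy by roughly 40% while the pre-driver term stays put, here is a minimal first-order model in Python. Every constant (termination value, pad and pre-driver capacitance, the ~0.37 V swing) is a round-number assumption chosen only to show the trend; none of it is Micron data.

```python
# Illustrative first-order LPDDR4/LPDDR4X I/O energy model. LPDDR4-style
# signaling (LVSTL) terminates to ground, so while a '1' is driven the
# supply sources I = Vswing / Rtt through the pull-up from the Vddq rail.

def io_energy_pj_per_bit(vddq, vswing, rtt_ohms=60.0, datarate_gbps=3.2,
                         c_pad_pf=1.5, c_pre_pf=0.5, vdd_pre=1.1):
    """Rough per-bit I/O energy, split into channel+termination and pre-driver."""
    ones_fraction = 0.5                    # random data: half the bits are '1'
    bit_time_ns = 1.0 / datarate_gbps      # one unit interval
    # Static termination energy: I * Vddq integrated over the time driving '1'
    e_term = ones_fraction * (vswing / rtt_ohms) * vddq * bit_time_ns * 1e3  # pJ
    # Dynamic pad/channel energy: charge C * Vswing is pulled from the Vddq rail
    e_chan = ones_fraction * c_pad_pf * vswing * vddq
    # The pre-driver runs from the core rail, so it does NOT scale with Vddq
    e_pre = ones_fraction * c_pre_pf * vdd_pre ** 2
    return e_term + e_chan, e_pre

# Signaling swing stays similar between LPDDR4 and LPDDR4X; only Vddq drops.
lp4_chan, lp4_pre = io_energy_pj_per_bit(vddq=1.1, vswing=0.37)
lp4x_chan, lp4x_pre = io_energy_pj_per_bit(vddq=0.6, vswing=0.37)
for name, chan, pre in [("LPDDR4 ", lp4_chan, lp4_pre),
                        ("LPDDR4X", lp4x_chan, lp4x_pre)]:
    print(f"{name}: channel+term {chan:.2f} pJ/bit, "
          f"pre-driver {pre:.2f} pJ/bit, total {chan + pre:.2f} pJ/bit")
saving = 1 - (lp4x_chan + lp4x_pre) / (lp4_chan + lp4_pre)
print(f"Total I/O energy reduction: {saving:.0%}")
```

With these assumed constants the model lands in the same ballpark as the deck's ~40% figure, and it makes the breakdown argument visible: because the pre-driver runs from the core rail, the Vddq reduction leaves its term untouched, which is exactly why pushing Vddq below 0.6 V buys little and why the core now dominates.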
The Future – All About Power Efficiency
• Power efficiency across a range of bandwidths is a more important attribute than peak bandwidth
  – And cost is still very important
• JEDEC is beginning to consider LPDDR5
  – Data rates of 6.4 Gbps or even higher are being considered
• Pushing DRAM performance to extreme speeds has consequences
  – Higher I/O speeds than LPDDR4 will reduce power efficiency at all speeds

Effect of Increasing Speed Capability
• 6.4 Gbps is approaching the limits of the DRAM process
  – Pre-driver fanouts must be reduced
  – As a consequence, deploying a DRAM I/O circuit capable of 6.4 Gbps will cause an increase in pre-driver power at all speeds
  – This will degrade the energy efficiency of the LP5 I/O compared to LPDDR4X at the lower speeds, where it matters most!
[Chart: LPDDR5 vs. LPDDR4X I/O Energy, Including Pre-Driver; I/O energy in pJ/bit (0 to 3) vs. data rate in Mbps (1000 to 7000) for 48 Ohm, 60 Ohm, and unterminated cases; LP5 shows a >50% increase at 1600 Mbps]

Energy Cost for Higher Speeds
• Yes, DRAM can be made to function with very high data rates
  – GDDR5 is a good example
• Adding GDDR5 to our power-evolution chart, we see that energy/bit increases vs. LPDDR4
[Chart: Power Evolution; energy in pJ/bit (0 to 25) vs. data rate in Mbps (500 to 8500) for DDR3, DDR4, LPDDR3, LPDDR4, and GDDR5]

Alternatives to LPDDR5 in Mobile Devices
• The challenge of LPDDR5 is to minimize I/O energy while supporting data rates of 6.4 Gbps or higher
• Alternative solutions include going wider
  – More LPDDR4X I/Os could be a more power-efficient scheme
  – Maintaining LPDDR4X data rates could enable wider PoP solutions that don't require significant changes in packaging
• There is discussion about using LPDDR3 I/O speeds, but this would:
  – Require twice the I/O pins of LP4X for equivalent bandwidth, increasing packaging costs and risks
  – Be unlikely to improve power efficiency
• Another solution that could increase data per pin without a corresponding increase in pin count is multi-level signaling
  – This could be applied to both PoP and xMCP configurations (the wider and multi-level options are compared in the sketch after the next slide)

Attacking the Largest Component of DRAM Power
• The non-I/O portion of DRAM power now dominates
  – DRAM manufacturers have already highly optimized their designs to maximize power efficiency while meeting user requirements for high frequencies and low latencies
  – Significant reduction in the DRAM array voltage is not practical
    • Lower voltage reduces the maximum charge that can be stored
    • The resulting loss in performance and necessary increase in refresh rate would offset any power improvements
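To make the alternatives concrete, the back-of-envelope sketch below compares three hypothetical ways to reach the same package bandwidth. The pin counts and rates are illustrative examples, not JEDEC-defined configurations.

```python
# Three hypothetical routes to one example bandwidth target, illustrating
# the "go wider" and "multi-level signaling" alternatives from the deck.

TARGET_NOTE = "example package bandwidth target: 102.4 Gbit/s"

options = [
    # (label, pins, symbol rate in Gbaud, bits per symbol)
    ("LPDDR5-like, narrow and fast (NRZ)", 16, 6.4, 1),
    ("Wider LPDDR4X-like bus (NRZ)",       32, 3.2, 1),
    ("Multi-level signaling (PAM4)",       16, 3.2, 2),
]

print(TARGET_NOTE)
for label, pins, gbaud, bits_per_symbol in options:
    gbps = pins * gbaud * bits_per_symbol  # aggregate data rate per package
    print(f"{label:38s} {pins:3d} pins x {gbaud} Gbaud x "
          f"{bits_per_symbol} b/sym = {gbps:6.1f} Gbit/s")
```

All three rows deliver the same aggregate bandwidth: the wide NRZ option trades doubled pin count (the packaging cost and risk the deck flags) for a halved per-pin rate, while a multi-level scheme such as PAM4 holds both pin count and symbol rate down at the cost of reduced voltage margin per symbol.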
DRAM DVFS?
• Dynamic Voltage Frequency Scaling (DVFS) has been used by many mobile components for years
  – DVFS presents significant challenges for DRAM
• Building a DRAM that meets all of the demanding performance and reliability requirements across a wider voltage range is unlikely to improve efficiency
• Array core voltage must remain static
  – Only peripheral circuits can operate over a wider range
  – This means the array and periphery voltages must be separated

DRAM DVFS
• Additionally, verification and test of wide voltage ranges for peripheral circuits could be expensive
  – This requirement is driven by the need in mobile applications to quickly change from low-frequency to higher-frequency operation while active
• Expecting the DRAM to continue operating at the lower frequency while the voltage is ramped is required to avoid a 'stall'
  – The DRAM process does not scale to lower voltage well; transistor performance decreases much faster than in today's logic processes
• Timing closure across an increased voltage range for DRAM is a very complex challenge

Two-Step DRAM DVFS
• More palatable to DRAM manufacturers could be a scheme that allows DRAM periphery operation at two discrete voltage levels, while leaving the array at one fixed level (a controller-side sketch appears at the end of this deck)
  – Much of the power savings can be realized
  – Switching between these two discrete voltages must be fast enough that DRAM operation during the voltage ramp can be prohibited

Two-Step DRAM DVFS
• Low-to-high switching could be performed within the LPDDR4 tFC spec, so operation could be disallowed during the switch time
• This means DVFS can be applied automatically when the user switches operation between Frequency Set Points (FSPs)
• Power efficiency can be improved by >30%

Future Challenges
• DRAM scaling challenges will add complexity to future memory systems
  – tWR will increase
  – Native refresh times will shorten
    • Especially if DRAM vendors reduce core voltage
  – Error detection and correction will become a requirement
    • ECC can reduce power and mitigate the performance impact of DRAM scaling challenges

Heterogeneous Memory Space
• Mobile memory density continues to increase; do we really need maximum DRAM performance for the entire memory space?
• Would mobile systems be better served with a smaller, high-speed "Local Main Memory" and a larger, non-volatile memory array?
• A system like the one below could leverage emerging non-volatile memories that promise to be >1000x faster than NAND and much less costly than DRAM
[Diagram: a tiered memory space with a small, fast local main memory serving the OS and apps, backed by a larger non-volatile array holding storage content such as videos and music]

Logic-to-Logic Signaling
• DRAM I/O performance is nearing its limits
• Continuing to push I/O performance will decrease energy efficiency
• Adding a logic die to the memory subsystem could enable higher-speed signaling, and therefore either higher performance or a reduced SoC pincount
• A wide, slower interface to multiple DRAM and/or NVRAM die would be restricted to the inside of the memory package
• High-pincount packaging would not be required

Thank You.
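As the controller-side sketch promised on the two-step DVFS slide: a minimal, hypothetical sequence that ties two discrete periphery voltages to the two LPDDR4 Frequency Set Points. The StubPmic/StubDram interfaces and the T_FC_US constant are invented for illustration; a real memory controller would issue mode-register writes and drive the PMIC in hardware, within the spec-defined tFC window.

```python
import time

T_FC_US = 2.0  # assumed switch window; the real LPDDR4 tFC value is spec-defined

class StubPmic:
    def set_periphery_rail(self, high):
        # Only the periphery rail moves; the array voltage stays fixed.
        print("PMIC: periphery rail ->", "HIGH" if high else "LOW")

class StubDram:
    def block_traffic(self):  print("MC: command bus quiesced")
    def resume_traffic(self): print("MC: traffic resumed")
    def select_fsp(self, n):  print(f"MRW: switch to FSP {n}")

class TwoStepDvfs:
    """Two discrete periphery voltages, one per LPDDR4 Frequency Set Point."""
    def __init__(self, pmic, dram):
        self.pmic, self.dram, self.fast = pmic, dram, False

    def switch(self, to_fast):
        if to_fast == self.fast:
            return
        self.dram.block_traffic()  # no DRAM operation during the voltage ramp
        if to_fast:
            # Raise the rail first, then select the fast set point: the
            # periphery must never run at high frequency on the low rail.
            self.pmic.set_periphery_rail(high=True)
            time.sleep(T_FC_US / 1e6)  # wait out the (assumed) switch window
            self.dram.select_fsp(1)
        else:
            self.dram.select_fsp(0)                    # slow down first...
            self.pmic.set_periphery_rail(high=False)   # ...then drop the rail
            time.sleep(T_FC_US / 1e6)
        self.dram.resume_traffic()
        self.fast = to_fast

dvfs = TwoStepDvfs(StubPmic(), StubDram())
dvfs.switch(True)   # low -> high: ramp the rail, then move to FSP 1
dvfs.switch(False)  # high -> low: drop to FSP 0, then lower the rail
```

The ordering encodes the deck's constraint: because operation is prohibited during the ramp and low-to-high switching fits within the tFC window, the voltage step can ride along automatically with every FSP change.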