Performance Evaluation and Feasibility Study of Near-Data Processing on DRAM Modules (DIMM-NDP) for Scientific Applications Matthias Gries, Pau Cabré, Julio Gago

To cite this version: Matthias Gries, Pau Cabré, Julio Gago. Performance Evaluation and Feasibility Study of Near-data Processing on DRAM Modules (DIMM-NDP) for Scientific Applications. [Technical Report] Huawei Technologies Duesseldorf GmbH, Munich Research Center (MRC). 2019. hal-02100477

HAL Id: hal-02100477
https://hal.archives-ouvertes.fr/hal-02100477
Submitted on 15 Apr 2019

TECHNICAL REPORT MRC-2019-04-15-R1, HUAWEI TECHNOLOGIES, MUNICH RESEARCH CENTER, GERMANY, APRIL 2019

Abstract: As the performance of DRAM devices falls more and more behind computing capabilities, the limitations of the memory and power walls are imminent. We propose a practical Near-Data Processing (NDP) architecture, DIMM-NDP, for mitigating the effects of the memory wall in the nearer term, targeting server applications for scientific computing.
DIMM-NDP exploits existing but unused DRAM bandwidth on memory modules (DIMMs) and takes advantage of a subset of the forthcoming JEDEC NVDIMM-P protocol in order to integrate application-specific, programmable functionality near memory. DIMM-NDP works on shared memory with the host CPU by definition, takes advantage of abundant memory capacity in the main memory subsystem and remains economical by relying on standard, unmodified DRAM devices. We evaluate DIMM-NDP with a range of bandwidth-, latency- and compute-bound workloads from the domains of predictive data analytics and machine learning that depend on dense and sparse linear algebra. Simulation results show up to 6.3x better performance for bandwidth-limited applications, representing 79% of the theoretical peak, and up to 3x improved energy efficiency. We complement the evaluation with feasibility checks for DIMM-like form factors to offer 32GB to 128GB capacity per DIMM, hardware overhead costs (below 20%), and power envelopes for standard (13W) and custom DIMMs (40W). A sensitivity analysis of interface properties for comparison with traditional accelerator coupling over PCIe, as well as a case study on porting software kernels, showing on the order of one month of programming effort per application, outline reasonable operating points for DIMM-NDP.

Index Terms: Processing In-Memory, near-memory accelerator, NUMA, software porting, performance engineering, case study.

CONTENTS

1 Introduction
  1.1 Limited Adoption of NDP so far
  1.2 Motivation for DIMM-NDP
2 Related Work
  2.1 Synchronization between Host and NDP
  2.2 Integration of NDP Technology
  2.3 Variants of NDP Functionality
  2.4 Form Factors
  2.5 Positioning DIMM-NDP
3 DIMM-NDP Architecture
  3.1 Architecture Overview
  3.2 NDP Unit Setup
  3.3 Lateral Transfers between Units
  3.4 ECC Handling
4 Software View of DIMM-NDP
  4.1 Workload Partitioning and Placement
  4.2 Protection/Virtual Address Spaces
  4.3 Synchronization
  4.4 Software Implementation Flow
5 Hardware Setups for DIMM-NDP
  5.1 Power & Area Analysis of Blocks
  5.2 Chip Packages and Form Factors
  5.3 Resulting NDP-enabled Mem Modules
  5.4 Cost Analysis for Memory Module
6 Evaluation Setup and Flow
  6.1 Selected Workloads
  6.2 System Configuration
  6.3 Evaluation Flow
7 Performance Evaluation of DIMM-NDP
  7.1 Performance and Efficiency Results
  7.2 Near-data vs. Loosely Coupled Acc
  7.3 NVDIMM-P Asynchronous Accesses
  7.4 Software Porting Effort
  7.5 Discussion of Further NDP Variants
8 Concluding Remarks
References

• Matthias Gries is with Huawei Technologies Düsseldorf GmbH at the Munich Research Center (MRC), Riesstr. 25, 80992 München, Germany. E-mail: [email protected]
• Pau Cabré and Julio Gago are with Metempsy, Pamplona 88, Barcelona 08018, Spain. E-mail: {pau.cabre|julio.gago}@gmail.com

All product or company names that may be mentioned in this report are tradenames, trademarks or registered trademarks of their respective owners.

Research Report released on April 15th, 2019

1 INTRODUCTION

The implications of the memory and power walls on memory subsystems are imminent [1], [2]. Operations have to move closer to data in the memory hierarchy in order to reduce redundant data movement as one of the major sources of wastage [3], [4], [5]. As a result of decades of scaling DRAM devices for memory capacity, the memory subsystem has been adapted to keep pace in terms of bandwidth, as shown in Fig. 1 for the stream [6] benchmark. “Brute force” measures, such as adding more memory channels, are nearly exhausted and challenged by signal integrity, pin limitations and process technology scaling, as the design of analog frontends for IO interfaces becomes more complex [7]. Recent approaches to mitigate the memory wall follow either in-memory (on the same technology process, such as [8] and references therein) or near-memory (e.g., 3D stacked, as in [9], [10], [11], [12]) ideas. These represent either longer-term research, since economical manufacturing at high density still has to be shown, or relatively costly solutions with stacked DRAM, like High-Bandwidth Memory (HBM) and the Hybrid Memory Cube (HMC), that are limited in memory capacity and more costly to integrate than standard DRAM devices.

However, the effects of dark silicon [13] lead designers to more heterogeneous architectures. Programmers become more experienced with using accelerators and non-uniform memory in return. We can rely on this trend to introduce an acceptable solution by building on the concept of NUMA (non-uniform memory accesses) to associate memory locations with near-data accelerators.

[Figure 1: log-scale plot of stream copy bandwidth [MB/s] over the years 1990 to 2020. Annotated data points: 64b SDRAM, single channel, in-order core; ooo core; dual channel RDRAM; dual channel DDR2-400; dual socket, 3 channels, integrated MC, DDR3-1333; dual socket, 4 channels, integrated MC, DDR4-2133.]
Fig. 1. Reported Stream [6] bandwidth on system-level, annotated with architecture modifications to sustain memory performance.

1.1 Limited Adoption of Near-Data Processing so far

The limited adoption of Near-Data Processing (NDP) in or near memory may be attributed to the following challenges:

• The integration of processing has been tried on the same technology process as the memory, either Non-Volatile Memory (NVM) or DRAM. In the case of DRAM, this leads to designs that are either slow and not scalable, or not very complex, whereas NVM integration and manufacturing is an open field of research.
• ... limited capacity, may depend on 2.5D integration with additional interposers and are subject to further thermal stress and testing challenges.
• Standards for interface protocols and programming have been lacking in the past, such that NDP has been perceived as an application-specific and proprietary solution.

1.2 Motivation for DIMM-NDP

The main features of DIMM-NDP are:

• Building on standard IP: NDP units enhance the Media Controller (MedC) on a memory module. The MedC is a discrete buffer chip positioned side by side with the DRAM devices on the module and needed for forthcoming interface standards like JEDEC NVDIMM-P [14], [15], Gen-Z [16] and CCIX [17].
• DIMM-NDP employs unmodified DRAM chips and exploits unused rank-level bandwidth on the DIMM, such that we follow the economy of scale of manufacturing standard DRAM.
• Memory capacity is not wasted. We focus on NDP in the shared memory subsystem of the host, where both sides (host and NDP) have concurrent access to memory. The memory module appears as a normal Load-Reduced DIMM if NDP is switched off.
• Versatility: We use standard cores with special functional units as programmable NDP units that different application domains can take advantage of, as expected for a server setup.
• We take advantage of recent standard protocols to ease the implementation, namely the forthcoming JEDEC NVDIMM-P [14], [15], which introduces asynchronous transfers as a subset, as well as NUMA and math libraries like BLAS [18] for programming.

By relying on standard DRAM, modular DIMMs and common programming abstractions, we see DIMM-NDP as an enabler for gradually introducing NDP into general-purpose servers for use with many applications that benefit from CPU-centric programming and high memory capacity. We lower the bar for adopting Near-Data Processing (NDP) in the main memory subsystem, since DIMM-NDP does not waste DRAM capacity. Application programmers can thus gradually migrate memory-bound code to DIMM-NDP by working on shared memory.

The contributions of this paper are a) the proposal of the DIMM-NDP architecture and a host-CPU-centric programming view amenable to scientific computing, b) the performance evaluation of DIMM-NDP, c) a case study of the usability looking at the software porting effort, and d) revealing design tradeoffs in terms of form factor, performance, power and costs with feasibility checks.

After surveying related work in the next section, sections 3 and 4 introduce the architecture and software stack for DIMM-NDP. Section 5 shows feasibility checks for form factors, power and overhead costs. We describe the evaluation flow, performance results and the software porting efforts in sections 6 and 7, followed by concluding remarks.
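
As a quick sanity check on the headline figures quoted in the abstract, the short Python sketch below relates the reported 6.3x speedup to the stated 79% of the theoretical peak. It assumes that the 79% refers to the fraction of the peak achievable speedup; the function name `implied_peak_speedup` is a hypothetical helper introduced only for this illustration and is not part of the report.

```python
def implied_peak_speedup(measured_speedup: float, fraction_of_peak: float) -> float:
    """Return the peak speedup implied by a measured speedup and the
    fraction of the peak that it represents (assumption: both refer to
    the same baseline)."""
    if not 0.0 < fraction_of_peak <= 1.0:
        raise ValueError("fraction_of_peak must be in (0, 1]")
    return measured_speedup / fraction_of_peak


# Figures quoted in the abstract for bandwidth-limited applications.
measured = 6.3   # reported best-case speedup
fraction = 0.79  # reported fraction of the theoretical peak

peak = implied_peak_speedup(measured, fraction)
print(f"implied theoretical peak speedup: {peak:.2f}x")
```

Under this reading, the two numbers are consistent with a theoretical peak of roughly 8x over the host baseline; how that peak is defined is a matter for the evaluation sections of the report.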