Techniques for


Design Examples, Signaling and Memory Technologies, Fiber Optics, Modeling and Simulation to Ensure Signal Integrity
Tom Granberg, Ph.D.
Prentice Hall PTR, Upper Saddle River, NJ 07458 (www.phptr.com)

Preface
  How This Book Is Organized
  This Textbook Was Written with Educational Institutions in Mind
  University Courses for Which This Book Is Suitable
  Solutions Manual Is Available
  Cash for Identifying Textbook Errors
  How This Book Was Prepared
  Personal Acknowledgments
  Technical Acknowledgments

Part 1 Introduction

Chapter 1 Trends in High-Speed Design
  1.1 Everything Keeps Getting Faster and Faster!
  1.2 Emerging Technologies and Industry Trends
    1.2.1 Major Drivers of Printed Circuit Board (PCB) Technology
    1.2.2 Drivers of Innovation
    1.2.3 I/O Signaling Standards
    1.2.4 Web Site as Retailer
    1.2.5 Memories
    1.2.6 On-Die Terminations
  1.3 Trends in Bus Architecture
    1.3.1 Moving from Parallel to Serial
    1.3.2 The Power of Tools
    1.3.3 ASSPs and ASMs
  1.4 High-Speed Design as an Offshoot from Microwave Theory
  1.5 Background Disciplines Needed for High-Speed Design
    1.5.1 High-Speed Conferences and Forums
  1.6 Book Organization
  1.7 Exercises

Chapter 2 ASICs, Backplane Configurations, and SerDes Technology
  2.1 Application-Specific Integrated Circuits (ASICs)
  2.2 Bus Configurations
    2.2.1 Single-Termination Multidrop
    2.2.2 Double-Termination Multidrop
    2.2.3 Data Distribution with Point-to-Point Links
    2.2.4 Multipoint
    2.2.5 Switch Matrix Mesh and Fabric Point-to-Point Bus Architectures
  2.3 SerDes Devices
    2.3.1 SerDes Device Fundamentals
    2.3.2 SerDes at 5 Gbps
    2.3.3 SerDes Multibit Signal Encoding
  2.4 Electrical Interconnects vs. Fiber Optics
  2.5 Subtleties of Device Families
    2.5.1 Logic vs. Interface Families
    2.5.2 Murky Device Categories
    2.5.3 Logic Family vs. Signaling Standard
  2.6 EDN Magazine's Microprocessor Directory
  2.7 Exercises

Chapter 3 A Few Basics on Signal Integrity
  3.1 Transmission Lines and Termination
    3.1.1 Transmission Line Equations
    3.1.2 Reflection Coefficients, Lattice Diagrams, and Termination
  3.2 Important High-Speed Concepts
    3.2.1 Rise Time and Edge Rate
    3.2.2 Length of the Rising Edge
    3.2.3 Knee Frequency
    3.2.4 Single-Ended vs. Differential Transmission
    3.2.5 Fast Edge Rate Effects
    3.2.6 Parasitics
  3.3 High-Frequency Effects: Skin Effect, Crowding Effect, Return Path Resistance, and Frequency-Dependent Dielectric Loss
  3.4 Jitter Measurements Using Eye Patterns
  3.5 BER Testing
  3.6 Exercises

Part 2 Signaling Technologies and Devices

Chapter 4 Gunning Transceiver Logic (GTL, GTLP, GTL+, AGTL+)
  4.1 Evolution from Backplane Transceiver Logic (BTL)
  4.2 Gunning Transceiver Logic (GTL)
  4.3 Gunning Transceiver Logic Plus (GTLP)
    4.3.1 GTLP General Description and Applications
    4.3.2 GTLP Throughput and Performance
    4.3.3 GTLP Signaling Levels, Noise Margins, and Current Drive
    4.3.4 GTLP Device Features (Live Insertion and Extraction; Controlled Edge Rates; Bus Hold (A Port))
    4.3.5 GTLP Backplane Design Considerations
    4.3.6 GTLP Power Consumption
  4.4 Intel's AGTL+ and GTL+
  4.5 GTLP/GTL/GTL+/AGTL+ Summary
  4.6 Exercises

Chapter 5 Low Voltage Differential Signaling (LVDS)
  5.1 Introduction to LVDS
    5.1.1 How LVDS Works
    5.1.2 Why Low Swing Differential?
    5.1.3 The LVDS and M-LVDS Standards (The TIA/EIA-644-A Standard)
    5.1.4 Appearance of Laboratory LVDS Waveforms (More Discussion of the Evaluation Board; Common-Mode Noise; Probing of High-Speed LVDS Signals)
    5.1.5 Easy Termination
    5.1.6 Maximum Switching Speed
    5.1.7 Saving Power
    5.1.8 LVDS Configurations
    5.1.9 Low Voltage Differential Signaling (LVDS) Families
    5.1.10 LVDS as a Low-Cost Design Solution
    5.1.11 Example of the Wide Range of LVDS Solutions
  5.2 Comparison of LVDS to Other Signaling Technologies Using Design Examples
    5.2.1 LVDS Drivers and Receivers
    5.2.2 100 Mbps Serial Interconnect
    5.2.3 LVDS Channel Link Serializers
    5.2.4 1 Gbps 16-Bit Interconnect
    5.2.5 1.4 Gbps 56-Bit Backplane
  5.3 Summary of LVDS Features and Applications
  5.4 Exercises

Chapter 6 Bus LVDS (BLVDS), LVDS Multipoint (LVDM), and Multipoint LVDS (M-LVDS)
  6.1 Justification for Enhanced Versions of LVDS
  6.2 Bus LVDS (BLVDS)
    6.2.1 System Benefits of Bus LVDS
    6.2.2 High-Speed Capability
    6.2.3 Low Power
    6.2.4 Low Swing, Low Noise, and Low EMI
    6.2.5 Low System Cost
    6.2.6 Bus Failsafe Biasing
    6.2.7 Hot Plugging (Live Insertion)
  6.3 LVDS Multipoint (LVDM)
  6.4 Multipoint LVDS (M-LVDS)
    6.4.1 The TIA/EIA-899 Standard
  6.5 Selecting BLVDS, LVDM, and M-LVDS Devices
  6.6 Exercises

Chapter 7 High-Speed Transceiver Logic (HSTL) and Stub-Series Terminated Logic (SSTL)
  7.1 High-Speed Transceiver Logic (HSTL)
    7.1.1 The HSTL Standard
    7.1.2 Supply Voltages and Logic Levels
    7.1.3 Classes of HSTL Output Buffers
    7.1.4 FPGAs with HSTL I/Os
    7.1.5 HSTL Summary
  7.2 Stub-Series Terminated Logic (SSTL)
    7.2.1 SSTL-3 (Supply Voltage and Logic Input Levels; SSTL-3 Output Buffers)
    7.2.2 SSTL-2 (SSTL-2 for Single-Ended Inputs and Outputs; SSTL-2 for Differential Inputs and Outputs; Illustration of SSTL-2 Thresholds; Comparison of SSTL-2 with LVTTL; SSTL-2 Design Example - DDR SDRAM Memory Subsystem)
    7.2.3 SSTL-18
    7.2.4 Summary of SSTL
  7.3 Exercises

Chapter 8 Emitter Coupled Logic (ECL, PECL, LVPECL, ECLinPS Lite and Plus, SiGe, ECL Pro, GigaPro and GigaComm)
  8.1 A Fast Technology - Edge Rates of 20 ps at 12 Gbps!
    8.1.1 The ECL Families
    8.1.2 ECL Vendor Products
    8.1.3 Comparison of Several ECL Family Members (Power Consumption of ECL Family Devices)
  8.2 Basic Device Operation
  8.3 The Two Major ECL Standards - 10K and 100K
    8.3.1 ECL Output Load Drive Characteristics
    8.3.2 The "10" and "100" Prefixes - Both Family and Standard
    8.3.3 Five Kinds of ECL Family Outputs
  8.4 Single-Ended and Differential Signaling
    8.4.1 Standard ECL Interface: Differential Driver and Receiver (Advantages and Disadvantages of Single-Ended and Differential Interconnects)
    8.4.2 Single-Ended Interface (VBB Reference; The Voltage Reference Source VBB; Dedicated Single-Ended Input Structure; Single-Ended Interface Between 10 and 100 Standards; Voltage Transfer Curves)
    8.4.3 Differential Interface (VIHCMR; Differential Interface Between 10 and 100 Standards; ECL Noise Margins)
  8.5 Component Nomenclature
  8.6 The ECL Families and Their Characteristics
    8.6.1 A Little MECL History (10K; 10H; Dual Meaning of 10H Prefix; 100K; 100H; 100H Used as Designation for Clock Drivers/Translators; Caution: 10H and 100H Devices with "L" Suffix May Use Other Power Options; Micrel's 10H and 100H; ECL, PECL, Pseudo ECL, NECL, LVECL, LVPECL, and LVNECL; 300 Series ECL; Super-300K ECL; 9300 and 9400 Series ECL/PECL; ON Semiconductor's GigaComm Family (SiGe); Hot Swapping PECL Risk: Powered Driver and Unpowered Receiver; ECLinPS and Low Voltage ECLinPS; ECLinPS Lite, Low Voltage ECLinPS Lite, and ECL Lite; ECLinPS Plus, ECL Pro, ECLinPS Pro, and Low Voltage ECLinPS Plus; Reduced Swing ECL (RSECL, RSPECL, RSNECL) and Variable Outputs; Reduced-Swing ECL vs. Low Voltage ECL)
  8.7 Summary of the ECL Families
  8.8 Exercises

Chapter 9 Current-Mode Logic (CML)
  9.1 CML Overview
  9.2 CML Output Structure
  9.3 CML Input Structure
  9.4 ac- and dc-Coupled CML Circuits
  9.5 XAUI Interface Standard
  9.6 CML Design Considerations
    9.6.1 Pre-Emphasis, De-Emphasis, Transmit Equalization, and Receive Equalization
    9.6.2 ac Coupling Requires 8B/10B Encoding or dc-Balanced Signal
  9.7 How CML and ECL Differ
  9.8 SuperLite CML and GigaPro CML
  9.9 Vendor-Specific CML Examples
    9.9.1 Texas Instruments' SN65CML100
    9.9.2 Texas Instruments' TLK2501 1.5 to 2.5 Gbps Transceiver
    9.9.3 Maxim's MAX3800 3.2 Gbps Adaptive Equalizer and Cable Driver (Adaptive Equalization)
  9.10 Summary of Current-Mode Logic (CML)
  9.11 Exercises

Chapter 10 FPGAs - 3.125 Gbps RocketIOs and HardCopy Devices
  10.1 Industry Trends
  10.2 Altera FPGAs and CPLDs
    10.2.1 Altera FPGAs with Embedded High-Speed Transceivers (Stratix GX FPGAs with up to 20 Channels of 3.1875 Gbps SerDes; Mercury FPGAs with up to 45 Gbps of Bandwidth)
    10.2.2 Altera HardCopy Devices (Elimination of ASIC Risk; HardCopy Devices Designed with Quartus II Software; HardCopy Stratix and APEX Devices)
    10.2.3 High-Density FPGAs (Stratix FPGAs; APEX FPGAs)
    10.2.4 Low-Cost/High-Volume FPGAs (Cyclone FPGAs; ACEX FPGAs)
    10.2.5 Altera FPGAs with Embedded Processors (Excalibur Devices)
    10.2.6 Altera CPLDs (MAX 3000 CPLDs; MAX 7000 CPLDs; MAX 7000AE CPLDs; MAX 7000B CPLDs; MAX 7000S CPLDs)
    10.2.7 Configuration Devices
  10.3 Xilinx FPGAs and CPLDs
    10.3.1 Virtex FPGAs
    10.3.2 Spartan FPGAs
    10.3.3 CPLDs (CoolRunner CPLDs; XC9500)
    10.3.4 More About the Virtex-II Pro FPGA
    10.3.5 Virtex-II Pro RocketIO Multi-Gigabit Transceiver
    10.3.6 The Virtex-II Pro PowerPC 405 Processor Core (PPC405x3 Hardware Organization)
    10.3.7 Applications of the Virtex-II Pro (Data Pipes; Reducing PCB Complexity)
    10.3.8 Support of Communications Standards (System-on-a-Chip (SoC) Designs; Network Processing; Protocol Bridges)
    10.3.9 Other Features of Virtex-II Pro Devices (Global Clock Networks; Single-Ended SelectIO-Ultra Resources; LVDS I/O; LVPECL I/O; Block SelectRAM+ Memory; Distributed SelectRAM Memory; Bitstream Encryption; Loopback; Digital Clock Managers (DCMs); Digitally Controlled Impedance (DCI); Double-Data-Rate (DDR) I/O)
    10.3.10 IBIS and SPICE Models for Xilinx Devices
    10.3.11 Xilinx Intellectual Property (IP) Cores
  10.4 Exercises

Chapter 11 Fiber-Optic Components
  11.1 Getting On Board with Optics
  11.1.
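The signal-integrity basics listed under Chapter 3 (reflection coefficients, lattice diagrams, and termination) reduce to a few lines of arithmetic. The Python sketch below is illustrative only and is not from the book; the driver, line, and load impedance values are invented.

```python
# Illustrative signal-integrity arithmetic (values invented): reflection
# coefficients and the load voltage built up by successive reflections,
# as tracked by a lattice (bounce) diagram on a lossless line.

def reflection_coefficient(z_term: float, z0: float) -> float:
    """Gamma = (Zt - Z0) / (Zt + Z0) for a termination Zt on a Z0 line."""
    return (z_term - z0) / (z_term + z0)

def lattice_load_voltage(vs: float, zs: float, zl: float, z0: float,
                         bounces: int) -> float:
    """Load voltage after a number of round trips.

    vs: step amplitude at the driver; zs: source impedance;
    zl: load impedance; z0: characteristic impedance of the line.
    """
    g_s = reflection_coefficient(zs, z0)
    g_l = reflection_coefficient(zl, z0)
    wave = vs * z0 / (z0 + zs)         # initial wave launched into the line
    v_load = 0.0
    for _ in range(bounces):
        v_load += wave * (1 + g_l)     # incident plus reflected wave at load
        wave *= g_l * g_s              # reflect at load, then at source
    return v_load

# A matched 50-ohm termination kills all reflections after the first arrival:
print(lattice_load_voltage(2.0, 25.0, 50.0, 50.0, bounces=10))   # ~1.333 V
# A near-open load instead rings in steps toward the full 2.0 V source swing:
print(lattice_load_voltage(2.0, 25.0, 1e9, 50.0, bounces=200))
```

With the matched load the line settles in one flight time; with the near-open load the same loop reproduces the staircase that a lattice diagram tabulates bounce by bounce.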
Recommended publications
  • Zynq-7000 All Programmable Soc and 7 Series Devices Memory Interface Solutions User Guide (UG586) [Ref 2]
    Zynq-7000 AP SoC and 7 Series Devices Memory Interface Solutions (v4.1), DS176, April 4, 2018. Advance Product Specification.
    Introduction: The Xilinx® Zynq®-7000 All Programmable SoC and 7 series FPGAs memory interface solutions cores provide high-performance connections to DDR3 and DDR2 SDRAMs, QDR II+ SRAM, RLDRAM II/RLDRAM 3, and LPDDR2 SDRAM.
    DDR3 and DDR2 SDRAMs: This section discusses the features, applications, and functional description of Xilinx 7 series FPGAs memory interface solutions in DDR3 and DDR2 SDRAMs. These solutions are available with an optional AXI4 slave interface.
    DDR3 SDRAM Features:
    - Component support for interface widths up to 72 bits
    - Single and dual rank UDIMM, RDIMM, and SODIMM support
    - DDR3 (1.5V) and DDR3L (1.35V)
    - 1, 2, 4, and 8 Gb density device support
    - 8-bank support
    - x8 and x16 device
    - I/O Power Reduction option reduces average I/O power by automatically disabling DQ/DQS IBUFs and internal terminations during writes and periods of inactivity
    - Internal VREF support
    - Multicontroller support for up to eight controllers
    - Two controller request processing modes (Normal: reorder requests to optimize system throughput and latency; Strict: memory requests are processed in the order received)
    LogiCORE™ IP Facts Table, Core Specifics:
    - Supported Device Family: Zynq®-7000 All Programmable SoC (1), 7 series (2) FPGAs
    - Supported Memory: DDR3 Component and DIMM, DDR2 Component and DIMM, QDR II+, RLDRAM II, RLDRAM 3, and LPDDR2 SDRAM Components
    - Resources: See Table 1.
    - Provided with Core: Documentation (Product Specification)
  • Micron Technology Inc
    MICRON TECHNOLOGY INC, FORM 10-K (Annual Report). Filed 10/26/10 for the period ending 09/02/10.
    Address: 8000 S Federal Way, PO Box 6, Boise, ID 83716-9632. Telephone: 208-368-4000. CIK: 0000723125. Symbol: MU. SIC Code: 3674 - Semiconductors and Related Devices. Industry: Semiconductors. Sector: Technology. Fiscal Year: 03/10.
    http://www.edgar-online.com © Copyright 2010, EDGAR Online, Inc. All Rights Reserved. Distribution and use of this document restricted under EDGAR Online, Inc. Terms of Use.
    UNITED STATES SECURITIES AND EXCHANGE COMMISSION, Washington, D.C. 20549. FORM 10-K. (Mark One) ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934, for the fiscal year ended September 2, 2010, OR TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934, for the transition period from to. Commission file number 1-10658.
    Micron Technology, Inc. (Exact name of registrant as specified in its charter). Delaware (State or other jurisdiction of incorporation or organization); 75-1618004 (IRS Employer Identification No.). 8000 S. Federal Way, Boise, Idaho 83716-9632 (Address of principal executive offices) (Zip Code). Registrant's telephone number, including area code: (208) 368-4000.
    Securities registered pursuant to Section 12(b) of the Act: Common Stock, par value $.10 per share (NASDAQ Global Select Market). Securities registered pursuant to Section 12(g) of the Act: None.
    Indicate by check mark if the registrant is a well-known seasoned issuer, as defined in Rule 405 of the Securities Act.
  • Understanding and Exploiting Design-Induced Latency Variation in Modern DRAM Chips
    Understanding and Exploiting Design-Induced Latency Variation in Modern DRAM Chips
    Donghyuk Lee, Samira Khan, Lavanya Subramanian, Saugata Ghose, Rachata Ausavarungnirun, Gennady Pekhimenko, Vivek Seshadri, Onur Mutlu (Carnegie Mellon University, NVIDIA, University of Virginia, Microsoft Research, ETH Zürich)
    ABSTRACT: Variation has been shown to exist across the cells within a modern DRAM chip. Prior work has studied and exploited several forms of variation, such as manufacturing-process- or temperature-induced variation. We empirically demonstrate a new form of variation that exists within a real DRAM chip, induced by the design and placement of different components in the DRAM chip: different regions in DRAM, based on their relative distances from the peripheral structures, require different minimum access latencies for reliable operation. In particular, we show that in most real DRAM chips, cells closer to the peripheral structures can be accessed much faster than cells that are farther. We call this phenomenon design-induced variation in DRAM. […] 40.0%/60.5%, which translates to an overall system performance improvement of 14.7%/13.7%/13.8% (in 2-/4-/8-core systems) across a variety of workloads, while ensuring reliable operation.
    1 INTRODUCTION: In modern systems, DRAM-based main memory is significantly slower than the processor. Consequently, processors spend a long time waiting to access data from main memory [5, 66], making the long main memory access latency one of the most critical bottlenecks in achieving high performance [48, 64, 67]. Unfortunately, the latency of DRAM has remained almost constant in the past
  • Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access ∗
    Leveraging Heterogeneity in DRAM Main Memories to Accelerate Critical Word Access∗
    Niladrish Chatterjee, Manjunath Shevgoor, Rajeev Balasubramonian, Al Davis (University of Utah); Zhen Fang (Nvidia Corporation); Ramesh Illikkal, Ravi Iyer (Intel Labs)
    {nil,shevgoor,rajeev,ald}@cs.utah.edu [email protected] {ramesh.g.illikkal,ravishankar.iyer}@intel.com
    Abstract: The DRAM main memory system in modern servers is largely homogeneous. In recent years, DRAM manufacturers have produced chips with vastly differing latency and energy characteristics. This provides the opportunity to build a heterogeneous main memory system where different parts of the address space can yield different latencies and energy per access. The limited prior work in this area has explored smart placement of pages with high activities. In this paper, we propose a novel alternative to exploit DRAM heterogeneity. We observe that the critical word in a cache line can be easily recognized beforehand and placed in a low-latency region of the
    […] of DRAM chips to build and exploit a heterogeneous memory system. This paper takes an important step in uncovering the potential of such a heterogeneous DRAM memory system. The DRAM industry already produces chips with varying properties. Micron offers a Reduced Latency DRAM (RLDRAM) product that offers lower latency and lower capacity, and is targeted at high performance routers, switches, and network processing [7]. Micron also offers a Low Power DRAM (LPDRAM) product that offers lower energy and longer latencies and that has typically been employed in the mobile market segment [6]. Our work explores innovations that can exploit a main memory that includes regular DDR chips as well as RLDRAM and LPDRAM chips.
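As a rough illustration of the idea in this abstract (placing the critical word of a cache line in a low-latency DRAM region), the sketch below uses a simple last-value predictor. It is not the authors' mechanism; the latency figures, class name, and prediction scheme are all invented for illustration.

```python
# Toy model (not the paper's implementation): remember which word of each
# cache line was critical last time, and assume that word has been placed in
# a hypothetical fast DRAM region. Latency numbers are invented.

FAST_NS, SLOW_NS = 15.0, 50.0   # assumed RLDRAM-like vs. commodity DDR latency

class CriticalWordPlacer:
    def __init__(self):
        self.placed = {}   # cache-line address -> word offset kept in fast DRAM

    def access(self, line: int, critical_offset: int) -> float:
        """Latency to deliver this access's critical word."""
        latency = FAST_NS if self.placed.get(line) == critical_offset else SLOW_NS
        self.placed[line] = critical_offset   # re-place for the next access
        return latency

placer = CriticalWordPlacer()
trace = [(0x40, 1), (0x40, 1), (0x40, 3), (0x80, 0), (0x40, 3)]
latencies = [placer.access(line, off) for line, off in trace]
print(latencies)   # repeated critical offsets hit the fast region
```

The point the model makes is the one the abstract argues: when the critical word is predictable, only that word needs to live in the scarce low-latency region.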
  • Dynamic RAMs from Asynchronous to DDR4
    Dynamic RAMs: From Asynchronous to DDR4. PDF generated using the open source mwlib toolkit (see http://code.pediapress.com/ for more information). PDF generated at: Sun, 10 Feb 2013 17:59:42 UTC.
    Articles: Dynamic random-access memory; Synchronous dynamic random-access memory; DDR SDRAM; DDR2 SDRAM; DDR3 SDRAM; DDR4 SDRAM; article sources, contributors, image sources, and licenses.
    Dynamic random-access memory (DRAM) is a type of random-access memory that stores each bit of data in a separate capacitor within an integrated circuit. The capacitor can be either charged or discharged; these two states are taken to represent the two values of a bit, conventionally called 0 and 1. Since capacitors leak charge, the information eventually fades unless the capacitor charge is refreshed periodically. Because of this refresh requirement, it is a dynamic memory as opposed to SRAM and other static memory. The main memory (the "RAM") in personal computers is dynamic RAM (DRAM). It is the RAM in laptop and workstation computers as well as some of the RAM of video game consoles. The advantage of DRAM is its structural simplicity: only one transistor and a capacitor are required per bit, compared to four or six transistors in SRAM. This allows DRAM to reach very high densities. Unlike flash memory, DRAM is volatile memory (cf. non-volatile memory), since it loses its data quickly when power is removed. The transistors and capacitors used are extremely small; billions can fit on a single memory chip.
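The leak-and-refresh behavior described in this excerpt can be made concrete with a toy decay model. This is a hypothetical illustration: the leakage time constant and sensing threshold are chosen arbitrarily, not taken from the text or any datasheet.

```python
# Back-of-the-envelope model of the DRAM refresh requirement (all numbers
# invented): cell charge decays exponentially, and the bit becomes unreadable
# once the charge falls below the sense amplifier's assumed threshold.
import math

def charge_after(t_ms: float, tau_ms: float = 1000.0) -> float:
    """Normalized cell charge after t_ms of leakage (exponential decay)."""
    return math.exp(-t_ms / tau_ms)

def refresh_deadline(threshold: float = 0.5, tau_ms: float = 1000.0) -> float:
    """Latest time (ms) by which the cell must be refreshed so its charge
    never drops below the readable threshold."""
    return -tau_ms * math.log(threshold)

print(round(refresh_deadline(), 1))   # ~693.1 ms for these assumed values
```

Real retention times and refresh intervals (e.g., the common 64 ms refresh window) are set by worst-case cells, but the shape of the trade-off is the same: faster leakage or a higher sensing threshold forces more frequent refresh.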
  • Memory Systems Cache, DRAM, Disk
    Memory Systems: Cache, DRAM, Disk
    Bruce Jacob (University of Maryland at College Park), Spencer W. Ng (Hitachi Global Storage Technologies), David T. Wang (MetaRAM), with contributions by Samuel Rodriguez (Advanced Micro Devices)
    Morgan Kaufmann Publishers (an imprint of Elsevier)
    Preface: "It's the Memory, Stupid!"
    Overview: On Memory Systems and Their Design
      Ov.1 Memory Systems
        Ov.1.1 Locality of Reference Breeds the Memory Hierarchy
        Ov.1.2 Important Figures of Merit
        Ov.1.3 The Goal of a Memory Hierarchy
      Ov.2 Four Anecdotes on Modular Design
        Ov.2.1 Anecdote I: Systemic Behaviors Exist
        Ov.2.2 Anecdote II: The DLL in DDR SDRAM
        Ov.2.3 Anecdote III: A Catch-22 in the Search for Bandwidth
        Ov.2.4 Anecdote IV: Proposals to Exploit Variability in Cell Leakage
        Ov.2.5 Perspective
      Ov.3 Cross-Cutting Issues
        Ov.3.1 Cost/Performance Analysis
        Ov.3.2 Power and Energy
        Ov.3.3 Reliability
        Ov.3.4 Virtual Memory
      Ov.4 An Example Holistic Analysis
        Ov.4.1 Fully-Buffered DIMM vs. the Disk Cache
        Ov.4.2 Fully Buffered DIMM: Basics
        Ov.4.3 Disk Caches: Basics
        Ov.4.4 Experimental Results
        Ov.4.5 Conclusions
      Ov.5 What to Expect
    Part I Cache
    Chapter 1 An Overview of Cache Principles
      1.1 Caches, 'Caches,' and "Caches"
      1.2 Locality Principles
        1.2.1 Temporal Locality
        1.2.2 Spatial Locality
        1.2.3 Algorithmic Locality
        1.2.4
  • MOCA: Memory Object Classification and Allocation in Heterogeneous Memory Systems
    MOCA: Memory Object Classification and Allocation in Heterogeneous Memory Systems
    Aditya Narayan, Tiansheng Zhang, and Ayse K. Coskun (Boston University); Shaizeen Aga and Satish Narayanasamy (University of Michigan)
    Abstract: In the era of abundant-data computing, main memory's latency and power significantly impact overall system performance and power. Today's computing systems are typically composed of homogeneous memory modules, which are optimized to provide either low latency, high bandwidth, or low power. Such memory modules do not cater to a wide range of applications with diverse memory access behavior. Thus, heterogeneous memory systems, which include several memory modules with distinct performance and power characteristics, are becoming promising alternatives. In such a system, allocating applications to their best-fitting memory modules improves system performance and energy efficiency. However, such an approach still leaves the full potential of heterogeneous memory systems under-utilized because not only applications, but also the memory objects within that application differ in their memory access behavior.
    Traditionally, the main memory (e.g., DRAM) of a computing system is composed of a set of homogeneous memory modules. The key attributes that determine the efficiency of a memory system are latency, bandwidth, and power. An ideal memory system should provide the highest data bandwidth at the lowest latency with minimum power consumption. However, there is no such perfect memory system as a memory module with high performance generally has a high power density. Hence, memory modules come in different flavors, optimized to either improve system performance or reduce power consumption. Due to diverse memory access behavior across workloads, a
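To make the allocation idea in this abstract concrete, here is a hedged sketch of profile-driven object placement. It is not MOCA's actual algorithm: the profile fields, thresholds, and module class names are invented for illustration.

```python
# Hypothetical sketch of object-to-module allocation in a heterogeneous
# memory system: classify each memory object from profiled access behavior,
# then bind each class to a module type. All numbers/names are invented.

def classify(profile: dict) -> str:
    if profile["accesses"] < 100:
        return "low-power"        # rarely touched: spend no fast-memory budget
    if profile["row_hit_rate"] < 0.3:
        return "low-latency"      # scattered accesses: latency-optimized module
    return "high-bandwidth"       # streaming-friendly: bandwidth-optimized module

def allocate(objects: dict) -> dict:
    """Map each object name to its best-fitting memory module class."""
    return {name: classify(profile) for name, profile in objects.items()}

objects = {
    "lookup_table": {"accesses": 50_000, "row_hit_rate": 0.1},
    "frame_buffer": {"accesses": 80_000, "row_hit_rate": 0.9},
    "config_blob":  {"accesses": 12,     "row_hit_rate": 0.5},
}
print(allocate(objects))
```

The per-object granularity is the abstract's key point: a single application can contain both latency-critical and bandwidth-hungry objects, so per-application placement alone leaves performance on the table.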
  • Zynq-7000 All Programmable Soc and 7 Series Devices Memory Interface Solutions User Guide (UG586) [Ref 2]
    Zynq-7000 AP SoC and 7 Series Devices Memory Interface Solutions (v2.3), DS176, June 24, 2015. Advance Product Specification.
    Introduction: The Xilinx® Zynq®-7000 All Programmable SoC and 7 series FPGAs memory interface solutions cores provide high-performance connections to DDR3 and DDR2 SDRAMs, QDR II+ SRAM, RLDRAM II/RLDRAM 3, and LPDDR2 SDRAM.
    DDR3 and DDR2 SDRAMs: This section discusses the features, applications, and functional description of Xilinx 7 series FPGAs memory interface solutions in DDR3 and DDR2 SDRAMs. These solutions are available with an optional AXI4 slave interface.
    LogiCORE™ IP Facts Table, Core Specifics:
    - Supported Device Family (1): Zynq®-7000 All Programmable SoC, Virtex®-7 (2), Kintex®-7 (2), Artix®-7
    - Supported Memory: DDR3 Component and DIMM, DDR2 Component and DIMM, QDR II+, RLDRAM II, RLDRAM 3, and LPDDR2 SDRAM Components
    - Resources: See Table 1.
    - Provided with Core: Documentation (Product Specification, User Guide); Design Files (Verilog, VHDL, top-level files only); Example Design (Verilog, VHDL, top-level files only); Test Bench (Not Provided); Constraints File (XDC); Supported S/W Driver (N/A)
    - Tested Design Flows (3): Design Entry (Vivado® Design Suite); Simulation (for supported simulators, see the Xilinx Design Tools: Release Notes Guide)
    DDR3 SDRAM Features:
    - Strict controller request processing mode: memory requests are processed in the order received
    - Component support for interface widths up to 72 bits
    - Single and dual rank UDIMM, RDIMM, and SODIMM support
    - DDR3 (1.5V) and DDR3L (1.35V)
    - 1, 2, 4, and 8 Gb density device support
    - 8-bank support
    - x8 and x16 device support
    - 8:1 DQ:DQS ratio support
    - Configurable data bus widths (multiples of 8, up to
  • External Memory Interface Handbook Volume 2: Design Guidelines
    External Memory Interface Handbook Volume 2: Design Guidelines
    Last updated for Altera Complete Design Suite: 15.0. EMI_DG, 2015.05.04. 101 Innovation Drive, San Jose, CA 95134. www.altera.com
    Contents: Selecting Your Memory
    - DDR SDRAM Features
    - DDR2 SDRAM Features
    - DDR3 SDRAM Features
    - QDR, QDR II, and QDR II+ SRAM Features
    - RLDRAM II and RLDRAM 3 Features
    - LPDDR2 Features
    - Memory Selection
    - Example of High-Speed Memory in Embedded Processor
    - Example of High-Speed Memory in Telecom
  • I/O: A Detailed Example
    ECE 485/585 Microprocessor System Design
    Lecture 5: DRAM Basics, DRAM Evolution, SDRAM-Based Memory Systems
    Zeshan Chishti, Electrical and Computer Engineering Dept., Maseeh College of Engineering and Computer Science
    Sources: lecture based on materials provided by Mark F.; Jacob's DRAM Systems article; memory component datasheets
    Outline:
    - Taxonomy of Memories
    - Memory Hierarchy
    - SRAM: basic cell, devices, timing
    - DRAM: basic cell, timing
    - Memory Organization: multiple banks, interleaving
    - DRAM Evolution
    - DDR3 SDRAM
    - DRAM modules
    - Error Correction
    - Memory Controllers
    DRAM Technology: the DRAM cell
    - Write: drive bit line; select desired word line ("row")
    - Read: precharge bit line; select desired word line ("row"); sense charge; write value back (restore)
    - Refresh: periodically read each cell (forcing the write-back)
    - Bit state (1 or 0) stored as charge on a tiny capacitor, with one transistor per cell
    - Read is destructive, so the value must be restored; charge leaks out over time, so cells must be refreshed
    Volatile memory comparison:
    - SRAM cell: larger cell (lower density, higher cost/bit); non-destructive read; no refresh required; simple read, faster access; standard IC process, natural for integration with logic; non-multiplexed address lines
    - DRAM cell: smaller cell (higher density, lower cost/bit); destructive read; needs periodic refresh; complex read, longer access time; special IC process, difficult to integrate with logic; multiplexed address lines
    - Density
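The read/restore/refresh steps on these slides can be sketched as a tiny functional model. This is illustrative only, not cycle-accurate, and the array dimensions are arbitrary.

```python
# Sketch of the slides' DRAM access steps: reading a row destroys the stored
# charge, so the sensed value must be written back, and refresh is simply a
# periodic read that forces that restore. Not a timing model.

class DramArray:
    def __init__(self, rows: int, cols: int):
        self.cells = [[0] * cols for _ in range(rows)]

    def write(self, row: int, col: int, bit: int) -> None:
        self.cells[row][col] = bit            # drive bit line, select row

    def read(self, row: int, col: int) -> int:
        # Precharge bit lines, then activate the word line: the row's charge
        # dumps onto the bit lines and the sense amplifiers latch it...
        sensed = self.cells[row][:]
        self.cells[row] = [0] * len(sensed)   # ...which drains the cells,
        self.cells[row] = sensed              # so the row is written back.
        return sensed[col]

    def refresh(self, row: int) -> None:
        self.read(row, 0)                     # a read forces the write-back

dram = DramArray(rows=4, cols=8)
dram.write(2, 5, 1)
print(dram.read(2, 5), dram.read(2, 5))   # restore keeps the bit: prints "1 1"
```

Dropping the restore step in `read` would model what the slides call the destructive read: the second access would return 0.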
  • CROW: A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability
    CROW: A Low-Cost Substrate for Improving DRAM Performance, Energy Efficiency, and Reliability
    Hasan Hassan, Minesh Patel, Jeremie S. Kim, A. Giray Yaglikci, Nandita Vijaykumar, Nika Mansouri Ghiasi, Saugata Ghose, Onur Mutlu (ETH Zürich, Carnegie Mellon University)
    ABSTRACT: DRAM has been the dominant technology for architecting main memory for decades. Recent trends in multi-core system design and large-dataset applications have amplified the role of DRAM as a critical system bottleneck. We propose Copy-Row DRAM (CROW), a flexible substrate that enables new mechanisms for improving DRAM performance, energy efficiency, and reliability. We use the CROW substrate to implement 1) a low-cost in-DRAM caching mechanism that lowers DRAM activation latency to frequently-accessed rows by 38% and 2) a mechanism that avoids the use of short-retention-time rows to mitigate the performance and energy overhead of DRAM refresh operations. CROW's flexibility allows
    […] instruction and data cache miss rates, and 3) have low memory-level parallelism. While manufacturers offer latency-optimized DRAM modules [72, 98], these modules have significantly lower capacity and higher cost compared to commodity DRAM [8, 53, 58]. Thus, reducing the high DRAM access latency without trading off capacity and cost in commodity DRAM remains an important challenge [17, 18, 58, 78]. Second, the high DRAM refresh overhead is a challenge to improving system performance and energy consumption. A DRAM cell stores data in a capacitor that leaks charge over time. To maintain correctness, every DRAM cell requires periodic refresh operations that restore the charge level in a cell.
  • In-DRAM Cache Management for Low Latency and Low Power 3D-Stacked Drams
    micromachines, Article
    In-DRAM Cache Management for Low Latency and Low Power 3D-Stacked DRAMs
    Ho Hyun Shin 1,2 and Eui-Young Chung 2,*
    1 Samsung Electronics Company, Ltd., Hwasung 18448, Korea; [email protected]
    2 School of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Korea
    * Correspondence: [email protected]; Tel.: +82-2-2123-5866
    Received: 24 December 2018; Accepted: 5 February 2019; Published: 14 February 2019
    Abstract: Recently, 3D-stacked dynamic random access memory (DRAM) has become a promising solution for ultra-high capacity and high-bandwidth memory implementations. However, it also suffers from memory wall problems due to long latency, such as with typical 2D-DRAMs. Although there are various cache management techniques and latency hiding schemes to reduce DRAM access time, in a high-performance system using high-capacity 3D-stacked DRAM, it is ultimately essential to reduce the latency of the DRAM itself. To solve this problem, various asymmetric in-DRAM cache structures have recently been proposed, which are more attractive for high-capacity DRAMs because they can be implemented at a lower cost in 3D-stacked DRAMs. However, most research mainly focuses on the architecture of the in-DRAM cache itself and does not pay much attention to proper management methods. In this paper, we propose two new management algorithms for the in-DRAM caches to achieve a low-latency and low-power 3D-stacked DRAM device. Through the computing system simulation, we demonstrate the improvement of energy delay product up to 67%.
    Keywords: 3D-stacked; DRAM; in-DRAM cache; low-latency; low-power
    1.